Links 25

Title: None

Score: 0.8699225284191421

User feedback: None

Out links: 3268773 Raw text: 3268773

https://cs.stanford.edu/~diyiy/docs/acl21_hiddencut.pdf

HiddenCut: Simple Data Augmentation for Natural Language Understanding with Better Generalization Jiaao Chen, Dinghan Shen1 , Weizhu Chen1 , Diyi Yang Georgia Institute of Technology, 1 Microsoft Dynamics 365 AI {jchen896,dyang888}@gatech.edu {dishen,wzchen}@microsoft.com Abstract Fine-tuning large...

Title: None

Score: 0.8556674907722507

User feedback: None

Out links: 1271410 Raw text: 1271410

https://web.stanford.edu/class/archive/cs/cs224n/cs224n.1234/final-reports/final-report-169506405.pdf

Few-shot Classification of Disaster-related Tweets Stanford CS224N Custom Project Jubayer Ibn Hamid Department of Computer Science Stanford University [email protected] Jitendra Nath Pandey Department of Computer Science Stanford University [email protected] Sheikh Rifayat Daiyan Srijon De...

Title: None

Score: 0.8426350711444989

User feedback: None

Out links: 3885218 Raw text: 3885218

https://www.mit.edu/~gfarina/2023/escher_iclr23/2206.04122.pdf

ESCHER: Eschewing Importance Sampling in Games by Computing a History Value Function to Estimate Regret arXiv:2206.04122v2 [cs.GT] 11 Oct 2022 Stephen McAleer Carnegie Mellon University [email protected] Gabriele Farina Carnegie Mellon University [email protected] Marc Lanctot DeepMi...

Title: None

Score: 0.8270818663345989

User feedback: None

Out links: 2124217 Raw text: 2124217

http://www.cs.toronto.edu/~hinton/absps/googlerectified.pdf

ON RECTIFIED LINEAR UNITS FOR SPEECH PROCESSING M.D. Zeiler1∗ , M. Ranzato2 , R. Monga2 , M. Mao2 , K. Yang2 , Q.V. Le2 , P. Nguyen2 , A. Senior2 , V. Vanhoucke2 , J. Dean2 , G.E. Hinton3 1 New York University, USA 2 Google Inc., USA ABSTRACT Deep neural networks have recently become the gold st...

Title: None

Score: 0.8073852083678074

User feedback: None

Out links: 2124224 Raw text: 2124224

http://www.cs.toronto.edu/~hinton/absps/uai_crbms.pdf

Conditional Restricted Boltzmann Machines for Structured Output Prediction Volodymyr Mnih Department of Computer Science University of Toronto Toronto, Canada Hugo Larochelle ∗ Département d’informatique Université de Sherbrooke Sherbrooke, Canada Abstract Conditional Restricted Boltzmann Machi...

Title: None

Score: 0.8040562140858766

User feedback: None

Out links: 1271380 Raw text: 1271380

https://web.stanford.edu/class/archive/cs/cs224n/cs224n.1234/final-reports/final-report-169513350.pdf

Contrastive Learning for Sentence Embeddings in BERT and its Smaller Variants Stanford CS224N Custom Project Vrishab Krishna Department of Computer Science Stanford University [email protected] Rohan Bansal Department of Computer Science Stanford University [email protected] Abstract Contr...

Title: Modeling Strong and Human-Like Gameplay with KL-Regularized Search

Score: 0.799613010172785

User feedback: None

Out links: 3885237 Raw text: 3885237

https://www.mit.edu/~gfarina/2022/human_like_pikl_icml22/human_like_pikl.icml22.pdf

Modeling Strong and Human-Like Gameplay with KL-Regularized Search Athul Paul Jacob * 1 2 David J. Wu * 1 Gabriele Farina * 3 Adam Lerer 1 Hengyuan Hu 1 Anton Bakhtin 1 Jacob Andreas 2 Noam Brown 1 arXiv:2112.07544v2 [cs.MA] 17 Feb 2022 Abstract We consider the task of building strong but humanli...

Title: None

Score: 0.7986178776498492

User feedback: None

Out links: 3268749 Raw text: 3268749

https://cs.stanford.edu/~diyiy/docs/naacl_treemix.pdf

TreeMix: Compositional Constituency-based Data Augmentation for Natural Language Understanding Le Zhang Fudan University [email protected] Zichao Yang CMU [email protected] Diyi Yang Georgia Tech [email protected] Abstract Data augmentation is an effective approach to tackle over-fittin...

Title: None

Score: 0.7974628078721803

User feedback: None

Out links: 3885228 Raw text: 3885228

https://www.mit.edu/~gfarina/2024/dtp_iclr24/dtp_iclr24.pdf

Published as a conference paper at ICLR 2024 The Update-Equivalence Framework for Decision-Time Planning Samuel Sokota†1 Gabriele Farina†2 David J. Wu† Hengyuan Hu3 Kevin A. Wang†4 J. Zico Kolter1,5 Noam Brown†6 † Work done at Meta AI 1 Carnegie Mellon University 2 Massachusetts Institu...

Title: None

Score: 0.7902377174914134

User feedback: None

Out links: 2124194 Raw text: 2124194

http://www.cs.toronto.edu/~hinton/absps/dropout.pdf

Improving neural networks by preventing co-adaptation of feature detectors G. E. Hinton∗ , N. Srivastava, A. Krizhevsky, I. Sutskever and R. R. Salakhutdinov Department of Computer Science, University of Toronto, 6 King’s College Rd, Toronto, Ontario M5S 3G4, Canada ∗ To whom correspondence should ...

Title: None

Score: 0.7881641899084061

User feedback: None

Out links: 655841 Raw text: 655841

https://www.usenix.org/system/files/atc21-ren-jie.pdf

ZeRO-Offload: Democratizing Billion-Scale Model Training Jie Ren, UC Merced; Samyam Rajbhandari, Reza Yazdani Aminabadi, and Olatunji Ruwase, Microsoft; Shuangyan Yang, UC Merced; Minjia Zhang, Microsoft; Dong Li, UC Merced; Yuxiong He, Microsoft https://www.usenix.org/conference/atc21/presentation/...

Title: None

Score: 0.7812106818675081

User feedback: None

Out links: 1271320 Raw text: 1271320

https://web.stanford.edu/class/archive/cs/cs224n/cs224n.1234/final-reports/final-report-169506299.pdf

Looking Outside the Context Window: In-Context Learning with Up to Hundreds of Examples Stanford CS224N Custom Project Linden Li Department of Computer Science Stanford University [email protected] Varun Shenoy Department of Electrical Engineering Stanford University [email protected] Abs...

Title: None

Score: 0.7811602910197608

User feedback: None

Out links: 3268784 Raw text: 3268784

https://cs.stanford.edu/~diyiy/docs/naacl21_cl.pdf

Continual Learning for Text Classification with Information Disentanglement Based Regularization Yufan Huang∗, Yanzhe Zhang∗ , Jiaao Chen, Xuezhi Wang1 , Diyi Yang Georgia Institute of Technology, 1 Google {yhuang704, jiaaochen, dyang888}@gatech.edu, 1 [email protected] Abstract Continual learning...

Title: None

Score: 0.780504441497556

User feedback: None

Out links: 1271406 Raw text: 1271406

https://web.stanford.edu/class/archive/cs/cs224n/cs224n.1234/final-reports/final-report-169964588.pdf

Finetuning minBERT Model for Multiple Downstream Tasks Stanford CS224N Default Project Yuan Wang Department of Computer Science Stanford University [email protected] Abstract Pre-trained Large Language Models, such as BERT and GPT, contain rich token embeddings that are useful for various downs...

Title: None

Score: 0.7784315761978045

User feedback: None

Out links: 2124221 Raw text: 2124221

http://www.cs.toronto.edu/~hinton/absps/multiframe.pdf

Autoregressive product of multi-frame predictions can improve the accuracy of hybrid models Navdeep Jaitly1 , Vincent Vanhoucke2 , Geoffrey Hinton1,2 1 University of Toronto 2 Google Inc. [email protected], [email protected] Abstract We describe a simple but effective way of using multi...

Title: Draco: Byzantine-resilient Distributed Training via Redundant Gradients

Score: 0.7775682348559958

User feedback: None

Out links: 353141 Raw text: 353141

http://proceedings.mlr.press/v80/chen18l/chen18l.pdf

DRACO: Byzantine-resilient Distributed Training via Redundant Gradients Lingjiao Chen 1 Hongyi Wang 1 Zachary Charles 1 Dimitris Papailiopoulos 1 Abstract Distributed model training is vulnerable to byzantine system failures and adversarial compute nodes, i.e., nodes that use malicious updates to...

Title: doi:10.1016/j.tics.2007.09.004

Score: 0.7759493070114146

User feedback: None

Out links: 2124151 Raw text: 2124151

http://www.cs.toronto.edu/~hinton/absps/tics.pdf

Review TRENDS in Cognitive Sciences Vol.11 No.10 Learning multiple layers of representation Geoffrey E. Hinton Department of Computer Science, University of Toronto, 10 King’s College Road, Toronto, M5S 3G4, Canada To achieve its impressive performance in tasks such as speech perception or objec...

Title: None

Score: 0.7759423578852973

User feedback: None

Out links: 1271352 Raw text: 1271352

https://web.stanford.edu/class/archive/cs/cs224n/cs224n.1234/final-reports/final-report-169472020.pdf

Multi-task Learning with BERT in NLP Stanford CS224N Default Project Fan Wang Department of Computer Science Stanford University [email protected] Abstract In natural language processing, while deep learning techniques have achieved remarkable success in many different problems, these models ar...

Title: None

Score: 0.7701090216789104

User feedback: None

Out links: 1271438 Raw text: 1271438

https://web.stanford.edu/class/archive/cs/cs224n/cs224n.1234/final-reports/final-report-170018026.pdf

Does Learning Syntax Help Models Learn Language? Stanford CS224N Custom Project Lian Wang Department of Computer Science Stanford University [email protected] Abstract Papadimitriou and Jurafsky (2020) showed that LSTMs trained on nonlinguistic structural data performed significantly better th...

Title: None

Score: 0.769124031716929

User feedback: None

Out links: 2124169 Raw text: 2124169

http://www.cs.toronto.edu/~hinton/absps/Outrageously.pdf

Published as a conference paper at ICLR 2017 Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer Noam Shazeer1 , Azalia Mirhoseini∗†1 , Krzysztof Maziarz∗2 , Andy Davis1 , Quoc Le1 , Geoffrey Hinton1 and Jeff Dean1 1 Google Brain, {noam,azalia,andydavis,qv...

Title: None

Score: 0.7672189088172875

User feedback: None

Out links: 2124228 Raw text: 2124228

http://www.cs.toronto.edu/~hinton/absps/DNN-2012-proof.pdf

Geoffrey Hinton, Li Deng, Dong Yu, George Dahl, Abdel-rahman Mohamed, [Navdeep Jaitly, Andrew Senior, Vincent Vanhoucke, Patrick Nguyen, Tara Sainath, and Brian Kingsbury] Deep Neural Networks for Acoustic Modeling in Speech Recognition [Four research groups share their views] <AU:...

Title: None

Score: 0.766606258594767

User feedback: None

Out links: 1271394 Raw text: 1271394

https://web.stanford.edu/class/archive/cs/cs224n/cs224n.1234/final-reports/final-report-169506594.pdf

Exploring Multi-Task Learning for Robust Language Encoding with BERT Stanford CS224N Default Project Alejandro Lozano Department of Biomedical Data Science Stanford University [email protected] Laura Bravo Department of Biomedical Data Science Stanford University [email protected] Abstract ...

Title: None

Score: 0.7652138330119215

User feedback: None

Out links: 2124186 Raw text: 2124186

http://www.cs.toronto.edu/~hinton/absps/mcimage.pdf

Generating more realistic images using gated MRF’s Marc’Aurelio Ranzato Volodymyr Mnih Geoffrey E. Hinton Department of Computer Science University of Toronto {ranzato,vmnih,hinton}@cs.toronto.edu Abstract Probabilistic models of natural images are usually evaluated by measuring performance on rat...

Title: None

Score: 0.7630162712282244

User feedback: None

Out links: 1271416 Raw text: 1271416

https://web.stanford.edu/class/archive/cs/cs224n/cs224n.1234/final-reports/final-report-169729542.pdf

SerBERTus: A SMART Three-Headed BERT Ensemble Stanford CS224N Default Project Matthew Hayes Department of Computer Science Stanford University [email protected] Mentor: Gabriel Poesia No External Collaborators No shared project Abstract We examine different architectures, learning methods, an...

Title: Deep Models of Interactions Across Sets

Score: 0.7604815735654534

User feedback: None

Out links: 353143 Raw text: 353143

http://proceedings.mlr.press/v80/hartford18a/hartford18a-supp.pdf

Deep Models of Interactions Across Sets Jason Hartford * 1 Devon R Graham * 1 Kevin Leyton-Brown 1 Siamak Ravanbakhsh 1 Abstract We use deep learning to model interactions across two or more sets of objects, such as user–movie ratings, protein–drug bindings, or ternary user-item-tag interactions. T...