Links 25
Score: 0.9156188206792982
User feedback: None
Out links: 290926 Raw text: 290926 https://arxiv.org/pdf/2302.13861.pdf
Differentially Private Diffusion Models Generate Useful Synthetic Images Sahra Ghalebikesabi1,+ , Leonard Berrada2 , Sven Gowal2 , Ira Ktena2 , Robert Stanforth2 , Jamie Hayes2 , Soham De2 , Samuel L. Smith2 , Olivia Wiles2 and Borja Balle2 arXiv:2302.13861v1 [cs.LG] 27 Feb 2023 1 University of Ox...
Score: 0.902011208960867
User feedback: None
Out links: 291713 Raw text: 291713 https://arxiv.org/pdf/2404.02258.pdf
Mixture-of-Depths: Dynamically allocating compute in transformer-based language models David Raposo1* , Sam Ritter1 , Blake Richards1,2 , Timothy Lillicrap1 , Peter Conway Humphreys1 and Adam Santoro1* arXiv:2404.02258v1 [cs.LG] 2 Apr 2024 1 Google DeepMind, 2 McGill University & Mila, * Equal Con...
Score: 0.8802603803835644
User feedback: None
Out links: 283102 Raw text: 283102 https://arxiv.org/pdf/2406.16838
Published in Transactions on Machine Learning Research (11/2024) From Decoding to Meta-Generation: Inference-time Algorithms for Large Language Models Sean Welleck [email protected] Carnegie Mellon University Amanda Bertsch∗ [email protected] arXiv:2406.16838v2 [cs.CL] 20 Nov 2024 Carnegie M...
Score: 0.852643768121922
User feedback: None
Out links: 277192 Raw text: 277192 https://arxiv.org/pdf/1511.06295.pdf
Under review as a conference paper at ICLR 2016 POLICY DISTILLATION arXiv:1511.06295v2 [cs.LG] 7 Jan 2016 Andrei A. Rusu, Sergio Gómez Colmenarejo, Çağlar Gülçehre∗, Guillaume Desjardins, James Kirkpatrick, Razvan Pascanu, Volodymyr Mnih, Koray Kavukcuoglu & Raia Hadsell Google DeepMind Lo...
Score: 0.8477027680946693
User feedback: None
Out links: 267898 Raw text: 267898 https://arxiv.org/pdf/1703.03864.pdf
Evolution Strategies as a Scalable Alternative to Reinforcement Learning arXiv:1703.03864v2 [stat.ML] 7 Sep 2017 Tim Salimans Jonathan Ho Xi Chen OpenAI Szymon Sidor Ilya Sutskever Abstract We explore the use of Evolution Strategies (ES), a class of black box optimization algorithms, as an al...
Score: 0.823370986383254
User feedback: None
Out links: 274736 Raw text: 274736 https://arxiv.org/pdf/2102.06701.pdf
Explaining Neural Scaling Laws Yasaman Bahri∗1 , Ethan Dyer*1 , Jared Kaplan*2 , Jaehoon Lee*1 , and Utkarsh Sharma*†2 arXiv:2102.06701v2 [cs.LG] 29 Apr 2024 1 Google DeepMind, Mountain View, CA 2 Department of Physics and Astronomy, Johns Hopkins University [email protected], edyer@google....
Score: 0.8146873173533921
User feedback: None
Out links: 274623 Raw text: 274623 https://arxiv.org/pdf/1909.12673.pdf
Published as a conference paper at ICLR 2020 A CONSTRUCTIVE PREDICTION OF THE GENERALIZATION ERROR ACROSS SCALES Jonathan S. Rosenfeld1 Amir Rosenfeld2 Yonatan Belinkov13 Nir Shavit145 {jonsr,belinkov,shanir}@csail.mit.edu [email protected] 1 arXiv:1909.12673v2 [cs.LG] 20 Dec 2019 4 Massach...
Score: 0.8066509680041216
User feedback: None
Out links: 783579 Raw text: 783579 https://gwern.net/doc/dual-n-back/2012-shipstead.pdf
Psychological Bulletin 2012, Vol. ●●, No. ●, 000 – 000 © 2012 American Psychological Association 0033-2909/12/$12.00 DOI: 10.1037/a0027473 Is Working Memory Training Effective? Zach Shipstead, Thomas S. Redick, and Randall W. Engle Georgia Institute of Technology Working memory (WM) is a cognitive...
Score: 0.8022509535160067
User feedback: None
Out links: 280224 Raw text: 280224 https://arxiv.org/pdf/2405.09673v1
LoRA Learns Less and Forgets Less arXiv:2405.09673v1 [cs.LG] 15 May 2024 Dan Biderman1,2 , Jose Gonzalez Ortiz2 , Jacob Portes2 , Mansheej Paul2 , Philip Greengard1 , Connor Jennings2 , Daniel King2 , Sam Havens2 , Vitaliy Chiley2 , Jonathan Frankle2 , Cody Blakeney2 , John P. Cunningham1 1 Columb...
Score: 0.7923287178801257
User feedback: None
Out links: 267915 Raw text: 267915 https://arxiv.org/pdf/1707.06347.pdf
arXiv:1707.06347v2 [cs.LG] 28 Aug 2017 Proximal Policy Optimization Algorithms John Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford, Oleg Klimov OpenAI {joschu, filip, prafulla, alec, oleg}@openai.com Abstract We propose a new family of policy gradient methods for reinforcement learning, wh...
Score: 0.7922379631781226
User feedback: None
Out links: 269063 Raw text: 269063 https://arxiv.org/pdf/2402.09470
Rolling Diffusion Models David Ruhe 1 2 * Jonathan Heek 1 Tim Salimans 1 Emiel Hoogeboom 1 arXiv:2402.09470v3 [cs.LG] 9 Sep 2024 Abstract 2023). Other impressive results for generating video data have been achieved by, e.g., (Blattmann et al., 2023; Ge et al., 2023; Harvey et al., 2022; Singer e...
Score: 0.7839570108493173
User feedback: None
Out links: 783765 Raw text: 783765 https://gwern.net/doc/www/arxiv.org/183b65de0506f31cdf62cef9a5efbd0fde29afcb.pdf
CPM: A Large-scale Generative Chinese Pre-trained Language Model Zhengyan Zhang, Xu Han, Hao Zhou, Pei Ke, Yuxian Gu, Deming Ye, Yujia Qin, Yusheng Su, Haozhe Ji, Jian Guan, Fanchao Qi, Xiaozhi Wang, Yanan Zheng, Guoyang Zeng, Huanqi Cao, Shengqi Chen, Daixuan Li, Zhenbo Sun, Zhiyuan Liu† , Minlie H...
Score: 0.7827254342133503
User feedback: None
Out links: 783887 Raw text: 783887 https://gwern.net/doc/www/pdfs.semanticscholar.org/c0ac26d3f1e7cd328a547800f46cbd57fdb7087e.pdf
Psychologica Belgica 2010, 50-3&4, 245-276. DOES WORKING MEMORY TRAINING GENERALIZE? Zach SHIPSTEAD, Thomas S. REDICK, & Randall W. ENGLE Georgia Institute of Technology Recently, attempts have been made to alter the capacity of working memory (WMC) through extensive practice on adaptive working me...
Score: 0.7806016275101726
User feedback: None
Out links: 277111 Raw text: 277111 https://arxiv.org/pdf/1802.01561.pdf
IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures arXiv:1802.01561v3 [cs.LG] 28 Jun 2018 Lasse Espeholt * 1 Hubert Soyer * 1 Remi Munos * 1 Karen Simonyan 1 Volodymyr Mnih 1 Tom Ward 1 Yotam Doron 1 Vlad Firoiu 1 Tim Harley 1 Iain Dunning 1 Shane Legg 1 Kora...
Score: 0.7749709837252079
User feedback: None
Out links: 269044 Raw text: 269044 https://arxiv.org/html/2402.09470v3
Title: Rolling Diffusion Models Description: No description Keywords: Machine Learning, ICML Text content: Rolling Diffusion Models 1 Introduction 2 Background: Diffusion Models 2.1 Diffusion 2.2 Diffusion for temporal data 3 Rolling Diffusion Models 3.1 A global perspective ...
Score: 0.7744401079719282
User feedback: None
Out links: 277212 Raw text: 277212 https://arxiv.org/pdf/1902.08234.pdf
An Empirical Study of Large-Batch Stochastic Gradient Descent with Structured Covariance Noise Abstract 1 Introduction From a strictly mathematical perspective, training neural networks is a high-dimensional non-convex optimization problem and the dynamics of the training process is incredi...
Score: 0.7743499715369196
User feedback: None
Out links: 272548 Raw text: 272548 https://arxiv.org/pdf/2210.12496.pdf
Bayesian Optimization with Conformal Prediction Sets arXiv:2210.12496v4 [cs.LG] 12 Dec 2023 Samuel Stanton1,2 Prescient Design, Genentech1 Wesley Maddox2 Abstract Bayesian optimization is a coherent, ubiquitous approach to decision-making under uncertainty, with applications including multi-arm ...
Score: 0.7692018114571237
User feedback: None
Out links: 271781 Raw text: 271781 https://arxiv.org/pdf/1611.02779.pdf
Under review as a conference paper at ICLR 2017 RL²: FAST REINFORCEMENT LEARNING VIA SLOW REINFORCEMENT LEARNING arXiv:1611.02779v2 [cs.AI] 10 Nov 2016 Yan Duan†‡ , John Schulman†‡ , Xi Chen†‡ , Peter L. Bartlett† , Ilya Sutskever‡ , Pieter Abbeel†‡ † UC Berkeley, Department of Electrical E...
Score: 0.7682773816331065
User feedback: None
Out links: 783770 Raw text: 783770 https://gwern.net/doc/dual-n-back/2016-lindelov.pdf
NEUROPSYCHOLOGICAL REHABILITATION, 2016 http://dx.doi.org/10.1080/09602011.2016.1141692 Training and transfer effects of N-back training for brain-injured and healthy subjects Jonas Kristoffer Lindeløv, Jonas Olsen Dall, Casper Daniel Kristensen, Marie Holt Aagesen, Stine Almgren Olsen, Therese Ruud...
Score: 0.7665216611838731
User feedback: None
Out links: 10285363 Raw text: 10285363 https://medium.com/@ifdefelse/the-cost-of-asic-design-a44f9a065b72
Title: The Cost of ASIC Design – IfDefElse – Medium Description: The speculation about hardware design and development costs as it pertains to both ProgPoW and Ethash are usually followed by a statement of authority: trust the author, because they have previous… Keywords: No keywords Text content: T...
Score: 0.7661532920861701
User feedback: None
Out links: 290813 Raw text: 290813 https://arxiv.org/pdf/2308.10888.pdf
Unlocking Accuracy and Fairness in Differentially Private Image Classification Leonard Berrada*,1 , Soham De*,1 , Judy Hanwen Shen*,2,† , Jamie Hayes1 , Robert Stanforth1 , David Stutz1 , Pushmeet Kohli1 , Samuel L. Smith1 and Borja Balle1 * Equal contributions, 1 Google DeepMind, London, UK, 2 Comp...
Score: 0.7627485477364769
User feedback: None
Out links: 268172 Raw text: 268172 https://arxiv.org/pdf/2312.17742
Learning Vision from Models Rivals Learning Vision from Data Yonglong Tian1,† Lijie Fan2,†, * Kaifeng Chen1 Dina Katabi2 Dilip Krishnan1 Phillip Isola2 1 Google Research, 2 MIT CSAIL, † equal contribution arXiv:2312.17742v1 [cs.CV] 28 Dec 2023 Github Repo: https://github.com/google-research/s...
Score: 0.7610700642068765
User feedback: None
Out links: 261879 Raw text: 261879 https://arxiv.org/pdf/2309.10400.pdf
Published as a conference paper at ICLR 2024 PoSE: EFFICIENT CONTEXT WINDOW EXTENSION OF LLMS VIA POSITIONAL SKIP-WISE TRAINING arXiv:2309.10400v3 [cs.CL] 21 Feb 2024 Dawei Zhu ∗ ♡♠ Nan Yang ♢ Liang Wang ♢ Yifan Song ♡♠ Wenhao Wu ♡♠ Furu Wei ♢ Sujian Li ♡♠ ♡ School of Computer Science...
Score: 0.7606806140411423
User feedback: None
Out links: 783322 Raw text: 783322 https://gwern.net/doc/dual-n-back/2010-seidler.pdf
Report No. M-CASTL 2010-01 COGNITIVE TRAINING AS AN INTERVENTION TO IMPROVE DRIVING ABILITY IN THE OLDER ADULT 1,2 Rachael D. Seidler, 1Jessica A. Bernard, 1Martin Buschkuehl, 1Susanne Jaeggi, 1John Jonides, 2Jennifer Humfleet University of Michigan 1 Department of Psychology, 2School of Kinesiol...
Score: 0.7578576183159949
User feedback: None
Out links: 271779 Raw text: 271779 https://arxiv.org/pdf/1907.06077.pdf
arXiv:1907.06077v1 [cs.NE] 13 Jul 2019 Evolvability ES: Scalable and Direct Optimization of Evolvability Alexander Gajewski∗ Jeff Clune Columbia University Uber AI Labs University of Wyoming Kenneth O. Stanley Joel Lehman Uber AI Labs Uber AI Labs ABSTRACT 1 Designing evolutionary algorit...