Reading list

Links 25

Title: None

Score: 0.9156188206792982

User feedback: None

Out links: 290926 Raw text: 290926

https://arxiv.org/pdf/2302.13861.pdf

Differentially Private Diffusion Models Generate Useful Synthetic Images Sahra Ghalebikesabi1,+ , Leonard Berrada2 , Sven Gowal2 , Ira Ktena2 , Robert Stanforth2 , Jamie Hayes2 , Soham De2 , Samuel L. Smith2 , Olivia Wiles2 and Borja Balle2 arXiv:2302.13861v1 [cs.LG] 27 Feb 2023 1 University of Ox...

Title: None

Score: 0.902011208960867

User feedback: None

Out links: 291713 Raw text: 291713

https://arxiv.org/pdf/2404.02258.pdf

Mixture-of-Depths: Dynamically allocating compute in transformer-based language models David Raposo1* , Sam Ritter1 , Blake Richards1,2 , Timothy Lillicrap1 , Peter Conway Humphreys1 and Adam Santoro1* arXiv:2404.02258v1 [cs.LG] 2 Apr 2024 1 Google DeepMind, 2 McGill University & Mila, * Equal Con...

Title: None

Score: 0.8802603803835644

User feedback: None

Out links: 283102 Raw text: 283102

https://arxiv.org/pdf/2406.16838

Published in Transactions on Machine Learning Research (11/2024) From Decoding to Meta-Generation: Inference-time Algorithms for Large Language Models Sean Welleck [email protected] Carnegie Mellon University Amanda Bertsch∗ [email protected] arXiv:2406.16838v2 [cs.CL] 20 Nov 2024 Carnegie M...

Title: None

Score: 0.852643768121922

User feedback: None

Out links: 277192 Raw text: 277192

https://arxiv.org/pdf/1511.06295.pdf

Under review as a conference paper at ICLR 2016 POLICY DISTILLATION arXiv:1511.06295v2 [cs.LG] 7 Jan 2016 Andrei A. Rusu, Sergio Gómez Colmenarejo, Çağlar Gülçehre∗, Guillaume Desjardins, James Kirkpatrick, Razvan Pascanu, Volodymyr Mnih, Koray Kavukcuoglu & Raia Hadsell Google DeepMind Lo...

Title: Evolution Strategies as a Scalable Alternative to Reinforcement Learning

Score: 0.8477027680946693

User feedback: None

Out links: 267898 Raw text: 267898

https://arxiv.org/pdf/1703.03864.pdf

Evolution Strategies as a Scalable Alternative to Reinforcement Learning arXiv:1703.03864v2 [stat.ML] 7 Sep 2017 Tim Salimans Jonathan Ho Xi Chen OpenAI Szymon Sidor Ilya Sutskever Abstract We explore the use of Evolution Strategies (ES), a class of black box optimization algorithms, as an al...

Title: None

Score: 0.823370986383254

User feedback: None

Out links: 274736 Raw text: 274736

https://arxiv.org/pdf/2102.06701.pdf

Explaining Neural Scaling Laws Yasaman Bahri∗1 , Ethan Dyer*1 , Jared Kaplan*2 , Jaehoon Lee*1 , and Utkarsh Sharma*†2 arXiv:2102.06701v2 [cs.LG] 29 Apr 2024 1 Google DeepMind, Mountain View, CA 2 Department of Physics and Astronomy, Johns Hopkins University [email protected], edyer@google....

Title: None

Score: 0.8146873173533921

User feedback: None

Out links: 274623 Raw text: 274623

https://arxiv.org/pdf/1909.12673.pdf

Published as a conference paper at ICLR 2020 A CONSTRUCTIVE PREDICTION OF THE GENERALIZATION ERROR ACROSS SCALES Jonathan S. Rosenfeld1 Amir Rosenfeld2 Yonatan Belinkov13 Nir Shavit145 {jonsr,belinkov,shanir}@csail.mit.edu [email protected] 1 arXiv:1909.12673v2 [cs.LG] 20 Dec 2019 4 Massach...

Title: None

Score: 0.8066509680041216

User feedback: None

Out links: 783579 Raw text: 783579

https://gwern.net/doc/dual-n-back/2012-shipstead.pdf

Psychological Bulletin 2012, Vol. ●●, No. ●, 000 – 000 © 2012 American Psychological Association 0033-2909/12/$12.00 DOI: 10.1037/a0027473 Is Working Memory Training Effective? Zach Shipstead, Thomas S. Redick, and Randall W. Engle Georgia Institute of Technology Working memory (WM) is a cognitive...

Title: None

Score: 0.8022509535160067

User feedback: None

Out links: 280224 Raw text: 280224

https://arxiv.org/pdf/2405.09673v1

LoRA Learns Less and Forgets Less arXiv:2405.09673v1 [cs.LG] 15 May 2024 Dan Biderman1,2 , Jose Gonzalez Ortiz2 , Jacob Portes2 , Mansheej Paul2 , Philip Greengard1 , Connor Jennings2 , Daniel King2 , Sam Havens2 , Vitaliy Chiley2 , Jonathan Frankle2 , Cody Blakeney2 , John P. Cunningham1 1 Columb...

Title: None

Score: 0.7923287178801257

User feedback: None

Out links: 267915 Raw text: 267915

https://arxiv.org/pdf/1707.06347.pdf

arXiv:1707.06347v2 [cs.LG] 28 Aug 2017 Proximal Policy Optimization Algorithms John Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford, Oleg Klimov OpenAI {joschu, filip, prafulla, alec, oleg}@openai.com Abstract We propose a new family of policy gradient methods for reinforcement learning, wh...

Title: Rolling Diffusion Models

Score: 0.7922379631781226

User feedback: None

Out links: 269063 Raw text: 269063

https://arxiv.org/pdf/2402.09470

Rolling Diffusion Models David Ruhe 1 2 * Jonathan Heek 1 Tim Salimans 1 Emiel Hoogeboom 1 arXiv:2402.09470v3 [cs.LG] 9 Sep 2024 Abstract 2023). Other impressive results for generating video data have been achieved by, e.g., (Blattmann et al., 2023; Ge et al., 2023; Harvey et al., 2022; Singer e...

Title: None

Score: 0.7839570108493173

User feedback: None

Out links: 783765 Raw text: 783765

https://gwern.net/doc/www/arxiv.org/183b65de0506f31cdf62cef9a5efbd0fde29afcb.pdf

CPM: A Large-scale Generative Chinese Pre-trained Language Model Zhengyan Zhang, Xu Han, Hao Zhou, Pei Ke, Yuxian Gu, Deming Ye, Yujia Qin, Yusheng Su, Haozhe Ji, Jian Guan, Fanchao Qi, Xiaozhi Wang, Yanan Zheng, Guoyang Zeng, Huanqi Cao, Shengqi Chen, Daixuan Li, Zhenbo Sun, Zhiyuan Liu† , Minlie H...

Title: Untitled

Score: 0.7827254342133503

User feedback: None

Out links: 783887 Raw text: 783887

https://gwern.net/doc/www/pdfs.semanticscholar.org/c0ac26d3f1e7cd328a547800f46cbd57fdb7087e.pdf

Psychologica Belgica 2010, 50-3&4, 245-276. DOES WORKING MEMORY TRAINING GENERALIZE? Zach SHIPSTEAD, Thomas S. REDICK, & Randall W. ENGLE Georgia Institute of Technology Recently, attempts have been made to alter the capacity of working memory (WMC) through extensive practice on adaptive working me...

Title: IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures

Score: 0.7806016275101726

User feedback: None

Out links: 277111 Raw text: 277111

https://arxiv.org/pdf/1802.01561.pdf

IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures arXiv:1802.01561v3 [cs.LG] 28 Jun 2018 Lasse Espeholt * 1 Hubert Soyer * 1 Remi Munos * 1 Karen Simonyan 1 Volodymyr Mnih 1 Tom Ward 1 Yotam Doron 1 Vlad Firoiu 1 Tim Harley 1 Iain Dunning 1 Shane Legg 1 Kora...

Title: Rolling Diffusion Models

Score: 0.7749709837252079

User feedback: None

Out links: 269044 Raw text: 269044

https://arxiv.org/html/2402.09470v3

Title: Rolling Diffusion Models Description: No description Keywords: Machine Learning, ICML Text content: Rolling Diffusion Models 1 Introduction 2 Background: Diffusion Models 2.1 Diffusion 2.2 Diffusion for temporal data 3 Rolling Diffusion Models 3.1 A global perspective ...

Title: None

Score: 0.7744401079719282

User feedback: None

Out links: 277212 Raw text: 277212

https://arxiv.org/pdf/1902.08234.pdf

An Empirical Study of Large-Batch Stochastic Gradient Descent with Structured Covariance Noise Abstract 1 Introduction From a strictly mathematical perspective, training neural networks is a high-dimensional non-convex optimization problem and the dynamics of the training process is incredi...

Title: None

Score: 0.7743499715369196

User feedback: None

Out links: 272548 Raw text: 272548

https://arxiv.org/pdf/2210.12496.pdf

Bayesian Optimization with Conformal Prediction Sets arXiv:2210.12496v4 [cs.LG] 12 Dec 2023 Samuel Stanton1,2 Prescient Design, Genentech1 Wesley Maddox2 Abstract Bayesian optimization is a coherent, ubiquitous approach to decision-making under uncertainty, with applications including multi-arm ...

Title: RL2: Fast Reinforcement Learning via Slow Reinforcement Learning

Score: 0.7692018114571237

User feedback: None

Out links: 271781 Raw text: 271781

https://arxiv.org/pdf/1611.02779.pdf

Under review as a conference paper at ICLR 2017 RL2: FAST REINFORCEMENT LEARNING VIA SLOW REINFORCEMENT LEARNING arXiv:1611.02779v2 [cs.AI] 10 Nov 2016 Yan Duan†‡ , John Schulman†‡ , Xi Chen†‡ , Peter L. Bartlett† , Ilya Sutskever‡ , Pieter Abbeel†‡ † UC Berkeley, Department of Electrical E...

Title: Training and transfer effects of n-back training for brain-injured and healthy subjects

Score: 0.7682773816331065

User feedback: None

Out links: 783770 Raw text: 783770

https://gwern.net/doc/dual-n-back/2016-lindelov.pdf

NEUROPSYCHOLOGICAL REHABILITATION, 2016 http://dx.doi.org/10.1080/09602011.2016.1141692 Training and transfer effects of N-back training for brain-injured and healthy subjects Jonas Kristoffer Lindeløv, Jonas Olsen Dall, Casper Daniel Kristensen, Marie Holt Aagesen, Stine Almgren Olsen, Therese Ruud...

Title: The Cost of ASIC Design – IfDefElse – Medium

Score: 0.7665216611838731

User feedback: None

Out links: 10285363 Raw text: 10285363

https://medium.com/@ifdefelse/the-cost-of-asic-design-a44f9a065b72

Title: The Cost of ASIC Design – IfDefElse – Medium Description: The speculation about hardware design and development costs as it pertains to both ProgPoW and Ethash are usually followed by a statement of authority: trust the author, because they have previous… Keywords: No keywords Text content: T...

Title: None

Score: 0.7661532920861701

User feedback: None

Out links: 290813 Raw text: 290813

https://arxiv.org/pdf/2308.10888.pdf

Unlocking Accuracy and Fairness in Differentially Private Image Classification Leonard Berrada*,1 , Soham De*,1 , Judy Hanwen Shen*,2,† , Jamie Hayes1 , Robert Stanforth1 , David Stutz1 , Pushmeet Kohli1 , Samuel L. Smith1 and Borja Balle1 * Equal contributions, 1 Google DeepMind, London, UK, 2 Comp...

Title: None

Score: 0.7627485477364769

User feedback: None

Out links: 268172 Raw text: 268172

https://arxiv.org/pdf/2312.17742

Learning Vision from Models Rivals Learning Vision from Data Yonglong Tian1,† Lijie Fan2,†, * Kaifeng Chen1 Dina Katabi2 Dilip Krishnan1 Phillip Isola2 1 Google Research, 2 MIT CSAIL, † equal contribution arXiv:2312.17742v1 [cs.CV] 28 Dec 2023 Github Repo: https://github.com/google-research/s...

Title: None

Score: 0.7610700642068765

User feedback: None

Out links: 261879 Raw text: 261879

https://arxiv.org/pdf/2309.10400.pdf

Published as a conference paper at ICLR 2024 PoSE: EFFICIENT CONTEXT WINDOW EXTENSION OF LLMs VIA POSITIONAL SKIP-WISE TRAINING arXiv:2309.10400v3 [cs.CL] 21 Feb 2024 Dawei Zhu ∗ ♡♠ Nan Yang ♢ Liang Wang ♢ Yifan Song ♡♠ Wenhao Wu ♡♠ Furu Wei ♢ Sujian Li ♡♠ ♡ School of Computer Science...

Title: M-CASTL Synthesis Report

Score: 0.7606806140411423

User feedback: None

Out links: 783322 Raw text: 783322

https://gwern.net/doc/dual-n-back/2010-seidler.pdf

Report No. M-CASTL 2010-01 COGNITIVE TRAINING AS AN INTERVENTION TO IMPROVE DRIVING ABILITY IN THE OLDER ADULT 1,2 Rachael D. Seidler, 1Jessica A. Bernard, 1Martin Buschkuehl, 1Susanne Jaeggi, 1John Jonides, 2Jennifer Humfleet University of Michigan 1 Department of Psychology, 2School of Kinesiol...

Title: Evolvability ES: Scalable and Direct Optimization of Evolvability

Score: 0.7578576183159949

User feedback: None

Out links: 271779 Raw text: 271779

https://arxiv.org/pdf/1907.06077.pdf

arXiv:1907.06077v1 [cs.NE] 13 Jul 2019 Evolvability ES: Scalable and Direct Optimization of Evolvability Alexander Gajewski∗ Jeff Clune Columbia University Uber AI Labs University of Wyoming Kenneth O. Stanley Joel Lehman Uber AI Labs Uber AI Labs ABSTRACT 1 Designing evolutionary algorit...