Reading list

Links: 25

Title: Large Scale Distributed Neural Network Training through Online Distillation

Score: 0.904689282392706

User feedback: None

Out links: 2123855
Raw text: 2123855

http://www.cs.toronto.edu/~hinton/absps/OnlineDistillation.pdf

Published as a conference paper at ICLR 2018. Large Scale Distributed Neural Network Training through Online Distillation. Rohan Anil (Google), Robert Ormandi (Google), Gabriel Pereyra (Google DeepMind), George E. Dahl (Google Brain)...
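
The codistillation scheme this paper describes has each worker add a distillation term toward the (possibly stale) predictions of its peers, so large clusters can keep training without exchanging parameters every step. A minimal NumPy sketch of such a combined loss; the function name, the two-model setup, and the alpha weighting are illustrative, not taken from the paper:

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def codistillation_loss(logits_a, stale_logits_b, labels, alpha=0.5):
    """Cross-entropy on true labels plus a term pulling model A's
    predictions toward a stale checkpoint of peer model B."""
    p_a = softmax(logits_a)
    p_b = softmax(stale_logits_b)          # treated as a constant target
    n = len(labels)
    ce = -np.log(p_a[np.arange(n), labels] + 1e-12).mean()
    distill = -(p_b * np.log(p_a + 1e-12)).sum(axis=-1).mean()
    return ce + alpha * distill
```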

Title: OmniNet: Omnidirectional Representations from Transformers

Score: 0.8800227433525747

User feedback: None

Out links: 205126
Raw text: 205126

https://gwern.net/doc/www/arxiv.org/a986ec6fafa88a1a4f52523a902c22652e30d36a.pdf

OmniNet: Omnidirectional Representations from Transformers. Yi Tay, Mostafa Dehghani, Vamsi Aribandi, Jai Gupta, Philip Pham, Zhen Qin, Dara Bahri, Da-Cheng Juan, Donald Metzler (arXiv:2103.01075). Abstract: This paper proposes Omnidirectional Representations from ...

Title: Efficiently Modeling Long Sequences with Structured State Spaces

Score: 0.8785942109407829

User feedback: None

Out links: 205120
Raw text: 205120

https://gwern.net/doc/www/arxiv.org/eeba4103b71baddb951cdde4962993257f5d6f07.pdf

Efficiently Modeling Long Sequences with Structured State Spaces. Albert Gu, Karan Goel, and Christopher Ré, Department of Computer Science, Stanford University (arXiv:2111.00396). Abstract: A central goal of sequence modeling ...
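
The model family in this paper is built on the linear state space recurrence x'(t) = Ax(t) + Bu(t), y(t) = Cx(t), discretized to a step size. A toy NumPy sketch of that discretize-then-scan pipeline; the random A below stands in for S4's structured HiPPO matrix, and the paper additionally computes the scan as a convolution for training speed:

```python
import numpy as np

def discretize(A, B, dt):
    """Bilinear (Tustin) discretization of x' = Ax + Bu."""
    n = A.shape[0]
    I = np.eye(n)
    inv = np.linalg.inv(I - dt / 2 * A)
    Ad = inv @ (I + dt / 2 * A)
    Bd = inv @ (dt * B)
    return Ad, Bd

def ssm_scan(Ad, Bd, C, u):
    """Run the recurrence x_k = Ad x_{k-1} + Bd u_k, y_k = C x_k."""
    x = np.zeros(Ad.shape[0])
    ys = []
    for u_k in u:                       # u: (T,) scalar input sequence
        x = Ad @ x + Bd[:, 0] * u_k
        ys.append(C @ x)
    return np.array(ys)

# toy usage: a random stable-ish A stands in for the HiPPO matrix
rng = np.random.default_rng(0)
N = 8
A = -np.eye(N) + 0.1 * rng.standard_normal((N, N))
B = rng.standard_normal((N, 1))
C = rng.standard_normal(N)
Ad, Bd = discretize(A, B, dt=0.1)
y = ssm_scan(Ad, Bd, C, rng.standard_normal(100))
```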

Title: When Attention Meets Fast Recurrence: Training Language Models with Reduced Compute

Score: 0.8739845625117002

User feedback: None

Out links: 205037
Raw text: 205037

https://gwern.net/doc/www/arxiv.org/d1278072a7a1822674440ddd0c6c820abc5b2e19.pdf

When Attention Meets Fast Recurrence: Training Language Models with Reduced Compute. Tao Lei (ASAPP, Inc.). Abstract: Large language models have become increasingly difficult to train because of the growing computation time and cost. In this work, we present SRU++, a highly-ef...

Title: HiddenCut: Simple Data Augmentation for Natural Language Understanding with Better Generalization

Score: 0.8699225284191421

User feedback: None

Out links: 3268773
Raw text: 3268773

https://cs.stanford.edu/~diyiy/docs/acl21_hiddencut.pdf

HiddenCut: Simple Data Augmentation for Natural Language Understanding with Better Generalization. Jiaao Chen, Dinghan Shen, Weizhu Chen, Diyi Yang (Georgia Institute of Technology; Microsoft Dynamics 365 AI). Abstract: Fine-tuning large...
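
HiddenCut's augmentation is structured dropout: during fine-tuning, a contiguous span of token hidden states is masked inside the network. A minimal sketch of that operation; the paper selects spans more strategically (informed by attention), while this version picks them uniformly at random:

```python
import numpy as np

def hiddencut(hidden, span_frac=0.1, rng=None):
    """Zero out one contiguous span of token positions in a batch of
    hidden states (batch, seq_len, dim), as structured dropout."""
    if rng is None:
        rng = np.random.default_rng()
    h = hidden.copy()
    batch, seq_len, _ = h.shape
    span = max(1, int(seq_len * span_frac))
    for b in range(batch):
        start = rng.integers(0, seq_len - span + 1)
        h[b, start:start + span, :] = 0.0
    return h
```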

Title: Outrageously Fast LLMs: Faster Inference and Fine-Tuning with Moefication and LoRA

Score: 0.8686906632095005

User feedback: None

Out links: 1193705
Raw text: 1193705

https://web.stanford.edu/class/archive/cs/cs224n/cs224n.1244/final-projects/ChiYoTsaiJayMartin.pdf

Outrageously Fast LLMs: Faster Inference and Fine-Tuning with Moefication and LoRA. Stanford CS224N custom project; mentor: Tony Wang. Chi Tsai, Jay Martin (Department of Computer Science, Stanford University)...
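
The LoRA half of this project freezes a pretrained weight matrix and learns only a low-rank additive update, so fine-tuning touches a tiny fraction of the parameters. A small sketch of that idea, assuming the usual W + (alpha/r) * B @ A parameterization with B zero-initialized:

```python
import numpy as np

class LoRALinear:
    """Frozen weight W plus a trainable low-rank update (alpha/r) * B @ A."""
    def __init__(self, W, r=8, alpha=16, rng=None):
        if rng is None:
            rng = np.random.default_rng(0)
        d_out, d_in = W.shape
        self.W = W                                   # frozen
        self.A = rng.standard_normal((r, d_in)) * 0.01
        self.B = np.zeros((d_out, r))                # zero init: no change at start
        self.scale = alpha / r

    def __call__(self, x):                           # x: (..., d_in)
        return x @ self.W.T + self.scale * (x @ self.A.T) @ self.B.T
```

Because B starts at zero, the adapted layer initially computes exactly the frozen model's output, and after training the update can be merged into W so inference costs the same as the original layer.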

Title: Adaptive Multi-Resolution Attention with Linear Complexity

Score: 0.8584212557932718

User feedback: None

Out links: 205013
Raw text: 205013

https://gwern.net/doc/www/arxiv.org/c639528ca3cdba458c1e52f61e42863dce9599d7.pdf

Adaptive Multi-Resolution Attention with Linear Complexity. Yao Zhang, Yunpu Ma, Thomas Seidl, Volker Tresp (Institute of Informatics, LMU Munich; Corporate Technology, Siemens AG) (arXiv:2108.04962)...

Title: Finetuning Pretrained Transformers into RNNs

Score: 0.8480798445298792

User feedback: None

Out links: 205011
Raw text: 205011

https://gwern.net/doc/www/arxiv.org/0701285b128d3a748a3bf37d457c72010d08fe46.pdf

Finetuning Pretrained Transformers into RNNs. Jungo Kasai, Hao Peng, Yizhe Zhang, Dani Yogatama, Gabriel Ilharco, Nikolaos Pappas, Yi Mao, Weizhu Chen, Noah A. Smith (Paul G. Allen School of Computer Science & Engineering, University of Washington; Microsoft; DeepMind; Allen Institute for ...
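
The conversion described here replaces softmax attention with a kernelized attention whose feature map lets the model run as an RNN with constant-size state. A sketch using a fixed elu(x)+1 feature map; the paper instead learns the feature map during a short finetuning phase:

```python
import numpy as np

def phi(x):
    """A simple positive feature map, elu(x) + 1; the paper learns this map."""
    return np.where(x > 0, x + 1.0, np.exp(x))

def linear_attention(Q, K, V):
    """Causal linear attention run as an RNN: constant-size state per step."""
    d_k, d_v = K.shape[1], V.shape[1]
    S = np.zeros((d_k, d_v))   # running sum of phi(k) v^T
    z = np.zeros(d_k)          # running sum of phi(k)
    out = []
    for q, k, v in zip(Q, K, V):
        fq, fk = phi(q), phi(k)
        S += np.outer(fk, v)
        z += fk
        out.append(fq @ S / (fq @ z + 1e-8))
    return np.array(out)
```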

Title: RWKV: Reinventing RNNs for the Transformer Era

Score: 0.8459963594865105

User feedback: None

Out links: 205034
Raw text: 205034

https://gwern.net/doc/www/arxiv.org/86693d8a9469f413a8b2735801feaa1a9d0dc50c.pdf

RWKV: Reinventing RNNs for the Transformer Era. Bo Peng, Eric Alcaide, Quentin Anthony, Alon Albalak, Samuel Arcadinho, Huanqi Cao, Xin Cheng, Michael Chung, Matteo Grella, Kranthi Kiran GV, Xuzheng He, Haowen Hou, Przemysław Kazienko, Jan Kocoń, Jiaming Kong, Bartłomiej Koptyra, H...
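
RWKV's time-mixing replaces attention with a per-channel recurrence: an exponentially decaying weighted average of past values, plus a separate bonus weight for the current token. A naive, numerically unstabilized sketch of that recurrence as I read it; real implementations track a running maximum in log space:

```python
import numpy as np

def wkv(w, u, K, V):
    """Naive RWKV time mixing. w, u: (dim,) with w >= 0 the decay rate;
    K, V: (T, dim). Each channel holds a decayed weighted sum of past
    values; the current token gets an extra bonus weight e^u."""
    T, dim = K.shape
    num = np.zeros(dim)        # sum of e^{-(t-1-i)w + k_i} * v_i over i < t
    den = np.zeros(dim)        # matching normalizer
    decay = np.exp(-w)
    out = np.empty((T, dim))
    for t in range(T):
        cur = np.exp(u + K[t])
        out[t] = (num + cur * V[t]) / (den + cur)
        num = decay * num + np.exp(K[t]) * V[t]
        den = decay * den + np.exp(K[t])
    return out
```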

Title: A Dot Product Attention Free Transformer

Score: 0.8459471281443147

User feedback: None

Out links: 205201
Raw text: 205201

https://gwern.net/doc/www/openreview.net/45f3d6c27e2b3f5b53fe6ecd14b4b122a8470ac6.pdf

Under review as a conference paper at ICLR 2022. A Dot Product Attention Free Transformer. Anonymous authors, paper under double-blind review. Abstract: We introduce Dot Product Attention Free Transformer (DAFT), an efficient variant of Transformers (Vaswani et al., 2017) that eliminates the query...
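
DAFT builds on the Attention Free Transformer, where each output is a sigmoid-gated, key-weighted average over values, computed per channel with no query-key dot products. A sketch of the basic causal AFT operation without the learned position biases:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def aft_simple_causal(Q, K, V):
    """AFT without position biases: out_t = sigmoid(Q_t) * a running
    softmax-over-keys weighted average of values, elementwise per channel."""
    T, d = K.shape
    num = np.zeros(d)   # running sum of exp(K_t') * V_t'
    den = np.zeros(d)   # running sum of exp(K_t')
    out = np.empty_like(V)
    for t in range(T):
        ek = np.exp(K[t])          # naive; real code subtracts a running max
        num += ek * V[t]
        den += ek
        out[t] = sigmoid(Q[t]) * num / den
    return out
```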

Title: Current Limitations of Language Models: What You Need is Retrieval

Score: 0.8456152222090448

User feedback: None

Out links: 205044
Raw text: 205044

https://gwern.net/doc/www/arxiv.org/9aaf30e79a8b51c86a764b0b8eb725004fbddd32.pdf

Current Limitations of Language Models: What You Need is Retrieval. Aran Komatsuzaki (Georgia Institute of Technology; EleutherAI) (arXiv:2009.06857). Abstract: We classify and re-examine some of the current approaches to improve the performance-com...

Title: Luna: Linear Unified Nested Attention

Score: 0.8450740707172638

User feedback: None

Out links: 205046
Raw text: 205046

https://gwern.net/doc/www/arxiv.org/2c84075b5f7b38e98ad6ee0739e9c30f23ab3778.pdf

Luna: Linear Unified Nested Attention. Chunting Zhou (LTI, CMU), Xiang Kong (LTI, CMU), Jonathan May (ISI, USC), Sinong Wang (Facebook AI), Hao Ma and Luke Zettlemoyer (Facebook AI). Abstract: The quadratic computational and...
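
Luna nests two attentions: a short learned sequence of length l first attends over the n inputs ("pack"), then the inputs attend over that packed summary ("unpack"), so cost is O(n*l) rather than O(n^2). A minimal sketch with plain single-head attention; layer norms, residuals, and the multi-head split are omitted:

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def attend(Q, K, V):
    """Plain scaled dot-product attention."""
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    return softmax(scores) @ V

def luna_block(X, P):
    """X: (n, d) tokens; P: (l, d) short learned 'pack' sequence, l << n."""
    P_packed = attend(P, X, X)          # (l, d): compress X into l slots
    Y = attend(X, P_packed, P_packed)   # (n, d): each token reads the summary
    return Y, P_packed                  # P_packed is carried to the next layer
```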

Title: Memory Transformer

Score: 0.8448275767669189

User feedback: None

Out links: 205076
Raw text: 205076

https://gwern.net/doc/www/arxiv.org/6cab03ecf704e10f4f43f732577daf01daa03a1b.pdf

Memory Transformer. Mikhail S. Burtsev, Yuri Kuratov (Neural Networks and Deep Learning Lab, Moscow Institute of Physics and Technology, Dolgoprudny, Russia) (arXiv:2006.11527)...

Title: Semi-Supervised Learning via Compact Latent Space Clustering

Score: 0.8430499062025062

User feedback: None

Out links: 352812
Raw text: 352812

http://proceedings.mlr.press/v80/kamnitsas18a/kamnitsas18a.pdf

Semi-Supervised Learning via Compact Latent Space Clustering. Konstantinos Kamnitsas, Daniel C. Castro, Loic Le Folgoc, Ian Walker, Ryutaro Tanno, Daniel Rueckert, Ben Glocker, Antonio Criminisi, Aditya Nori. Abstract: We present a novel cost function for semi-supervised learning of ne...

Title: ESCHER: Eschewing Importance Sampling in Games by Computing a History Value Function to Estimate Regret

Score: 0.8426350711444989

User feedback: None

Out links: 3885218
Raw text: 3885218

https://www.mit.edu/~gfarina/2023/escher_iclr23/2206.04122.pdf

ESCHER: Eschewing Importance Sampling in Games by Computing a History Value Function to Estimate Regret (arXiv:2206.04122). Stephen McAleer (Carnegie Mellon University), Gabriele Farina (Carnegie Mellon University), Marc Lanctot (DeepMi...

Title: Random Feature Attention

Score: 0.8418470832430927

User feedback: None

Out links: 205114
Raw text: 205114

https://gwern.net/doc/www/openreview.net/e998caa668bfed59cb006c4f3cd8de1b4620cc05.pdf

Published as a conference paper at ICLR 2021. Random Feature Attention. Hao Peng, Nikolaos Pappas, Dani Yogatama, Roy Schwartz, Noah A. Smith, Lingpeng Kong (Paul G. Allen School of Computer Science & Engineering, University of Washington; DeepMind; Allen Institute for Artificial Intelligenc...

Title: Fast, Piecewise Training for Discriminative Finite-state and Parsing Models

Score: 0.8415860748933458

User feedback: None

Out links: 3010548
Raw text: 3010548

https://homepages.inf.ed.ac.uk/csutton/publications/nota-ir403.pdf

Fast, Piecewise Training for Discriminative Finite-state and Parsing Models. Charles Sutton and Andrew McCallum, Department of Computer Science, University of Massachusetts Amherst, Amherst, MA 01003 USA. Abstract: Discriminative models for sequences and trees, such as lin...

Title: Sub-Linear Memory: How to Make Performers SLiM

Score: 0.8366056065500114

User feedback: None

Out links: 205074
Raw text: 205074

https://gwern.net/doc/www/arxiv.org/f63a0b34378396bff253d974efc8664d5620489c.pdf

Sub-Linear Memory: How to Make Performers SLiM. Valerii Likhosherstov, Krzysztof Choromanski, Jared Davis, Xingyou Song, Adrian Weller (arXiv:2012.11346). Abstract: The Transformer architecture has revolutionized deep learning on sequential data, becoming ubiquitous i...
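
Performers make attention linear in sequence length via feature maps; the memory trick here is to process the sequence in chunks and carry only the small prefix statistics between chunks, so activation memory is set by the chunk size rather than the full length. A sketch with a stand-in positive feature map (Performers use random positive features):

```python
import numpy as np

def phi(x):
    """Stand-in positive feature map; Performers use random features."""
    return np.maximum(x, 0.0) + 1e-6

def chunked_linear_attention(Q, K, V, chunk=64):
    """Causal linear attention, chunk by chunk. Only the prefix state
    (S, z) crosses chunk boundaries."""
    T, d_k = K.shape
    d_v = V.shape[1]
    S = np.zeros((d_k, d_v))       # sum of phi(k) v^T over past chunks
    z = np.zeros(d_k)              # sum of phi(k) over past chunks
    out = np.empty((T, d_v))
    for s in range(0, T, chunk):
        q = phi(Q[s:s + chunk])    # feature-mapped queries, (c, d_k)
        k = phi(K[s:s + chunk])    # feature-mapped keys
        v = V[s:s + chunk]
        # within-chunk causal part via cumulative sums
        S_cum = np.cumsum(k[:, :, None] * v[:, None, :], axis=0)
        z_cum = np.cumsum(k, axis=0)
        num = np.einsum('cd,cde->ce', q, S_cum) + q @ S
        den = np.einsum('cd,cd->c', q, z_cum) + q @ z
        out[s:s + chunk] = num / den[:, None]
        S += S_cum[-1]
        z += z_cum[-1]
    return out
```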

Title: Shortformer: Better Language Modeling Using Shorter Inputs

Score: 0.8311770836835274

User feedback: None

Out links: 205184
Raw text: 205184

https://gwern.net/doc/www/arxiv.org/59fea814f374d53b61961507bc80c351f6526a48.pdf

Shortformer: Better Language Modeling Using Shorter Inputs. Ofir Press, Noah A. Smith (Paul G. Allen School of Computer Science & Engineering, University of Washington; Facebook AI Research; Allen Institute for AI) (arXiv:2012.15832). Abstract: Incr...
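
One of Shortformer's ingredients is position-infused attention: position embeddings are added where queries and keys are formed, but not to the values, so cached token representations stay position-free. A single-head sketch of that idea:

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def position_infused_attention(X, pos, Wq, Wk, Wv):
    """X, pos: (n, d). Positions enter only through queries and keys;
    values (and hence the cached outputs) carry no absolute position."""
    Q = (X + pos) @ Wq
    K = (X + pos) @ Wk
    V = X @ Wv                      # values stay position-free
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    mask = np.triu(np.ones(scores.shape, dtype=bool), k=1)
    scores[mask] = -1e9             # causal masking
    return softmax(scores) @ V
```

Keeping values position-free is what lets previously computed representations be cached and reused when the attention window slides forward.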

Title: Piecewise Training with Parameter Independence Diagrams: Comparing Globally- and Locally-trained Linear-chain CRFs

Score: 0.8282240770563003

User feedback: None

Out links: 3010570
Raw text: 3010570

https://homepages.inf.ed.ac.uk/csutton/publications/lcrf.pdf

Center for Intelligent Information Retrieval Technical Report IR-383, presented at the NIPS'04 workshop on Learning with Structured Outputs. Piecewise Training with Parameter Independence Diagrams: Comparing Globally- and Locally-trained Linear-chain CRFs. Andrew McCallum and Charles Sutton, Department of ...

Title: BP-Transformer: Modelling Long-Range Context via Binary Partitioning

Score: 0.8277952379086098

User feedback: None

Out links: 205093
Raw text: 205093

https://gwern.net/doc/www/arxiv.org/4fc812c0cf44dfcb5d667e5729e5db10e3b1da8d.pdf

BP-Transformer: Modelling Long-Range Context via Binary Partitioning. Zihao Ye, Qipeng Guo, Quan Gan, Xipeng Qiu, Zheng Zhang (AWS Shanghai AI Lab; Fudan University; New York University Shanghai) (arXiv:1911.04070)...

Title: On Rectified Linear Units for Speech Processing

Score: 0.8270818663345989

User feedback: None

Out links: 2124217
Raw text: 2124217

http://www.cs.toronto.edu/~hinton/absps/googlerectified.pdf

On Rectified Linear Units for Speech Processing. M.D. Zeiler, M. Ranzato, R. Monga, M. Mao, K. Yang, Q.V. Le, P. Nguyen, A. Senior, V. Vanhoucke, J. Dean, G.E. Hinton (New York University, USA; Google Inc., USA). Abstract: Deep neural networks have recently become the gold st...

Title: Distilling the Knowledge in a Neural Network

Score: 0.8235656491857266

User feedback: None

Out links: 2123874
Raw text: 2123874

http://www.cs.toronto.edu/~hinton/absps/distillation.pdf

Distilling the Knowledge in a Neural Network. Geoffrey Hinton, Oriol Vinyals, Jeff Dean (Google Inc., Mountain View) (arXiv:1503.02531). Abstract: A very simple way to ...
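
The paper's recipe: soften teacher and student outputs with a temperature T, train the student to match the soft targets alongside the true labels, and scale the soft term by T^2 so its gradients stay comparable. A NumPy sketch; the alpha mixing weight is a common convention rather than a fixed value from the paper:

```python
import numpy as np

def softmax(z, T=1.0):
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Soft-target cross-entropy (scaled by T**2) plus hard-label loss."""
    p_teacher = softmax(teacher_logits, T)
    p_student = softmax(student_logits, T)
    soft = -(p_teacher * np.log(p_student + 1e-12)).sum(-1).mean() * T**2
    n = len(labels)
    p_hard = softmax(student_logits)
    hard = -np.log(p_hard[np.arange(n), labels] + 1e-12).mean()
    return alpha * soft + (1 - alpha) * hard
```

Raising T exposes the relative probabilities the teacher assigns to wrong classes, which is where most of the transferred "dark knowledge" lives.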

Title: Natural Language Processing with Deep Learning (CS224N), Lecture 12: Neural Language Generation

Score: 0.822539174104687

User feedback: None

Out links: 1250097
Raw text: 1250097

https://web.stanford.edu/class/archive/cs/cs224n/cs224n.1224/slides/cs224n-2022-lecture12-generation-final.pdf

Natural Language Processing with Deep Learning, CS224N/Ling284. Christopher Manning (based on a lecture by Antoine Bosselut). Lecture 12: Neural Language Generation. Today: a bit more on projects and natural language generation. A few more final project thoughts and tips. 1. What is NLG? 2. The simpl...
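
Lectures on neural language generation typically open with the basic decoding loop; as a concrete reference point, here is a minimal sketch of one sampling step with the two knobs usually introduced first, temperature and top-k truncation (the defaults below are arbitrary):

```python
import numpy as np

def sample_next_token(logits, temperature=0.8, top_k=50, rng=None):
    """One decoding step: scale logits by temperature, keep only the
    top-k candidates, renormalize, and sample."""
    if rng is None:
        rng = np.random.default_rng()
    z = logits / temperature
    if top_k is not None and top_k < len(z):
        cutoff = np.sort(z)[-top_k]
        z = np.where(z >= cutoff, z, -np.inf)   # drop everything else
    z = z - z.max()
    p = np.exp(z)
    p /= p.sum()
    return rng.choice(len(p), p=p)
```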

Title: Generative Adversarial Imitation Learning

Score: 0.8206242059492121

User feedback: None

Out links: 3162533
Raw text: 3162533

https://cs.stanford.edu/~ermon/papers/imitation_nips2016_main.pdf

Generative Adversarial Imitation Learning. Jonathan Ho (OpenAI), Stefano Ermon (Stanford University). Abstract: Consider learning a policy from example expert behavior, without interaction with the expert or access to a reinforcement signal. One approach is to recover...
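
GAIL alternates two updates: a discriminator learns to tell expert (state, action) pairs from the policy's, and the policy is then optimized (with TRPO in the paper) against a reward that is high where the discriminator is fooled. A schematic NumPy sketch of those two pieces, using a plain logistic-regression discriminator over precomputed state-action features; the feature setup, learning rate, and reward convention are illustrative:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def discriminator_step(theta, expert_sa, policy_sa, lr=1e-2):
    """One logistic-regression update: push D toward 1 on expert
    (state, action) features and toward 0 on policy samples.
    expert_sa, policy_sa: (n, f) feature matrices; theta: (f,)."""
    grad = np.zeros_like(theta)
    for x, y in [(expert_sa, 1.0), (policy_sa, 0.0)]:
        d = sigmoid(x @ theta)
        grad += x.T @ (d - y) / len(x)   # gradient of the logistic loss
    return theta - lr * grad

def gail_reward(theta, sa):
    """Surrogate reward for the policy: large where the discriminator
    mistakes policy behavior for expert behavior."""
    return -np.log(1.0 - sigmoid(sa @ theta) + 1e-8)
```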