Title: Train Large, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers

Score: 0.9220361642694509

User feedback: None

Out links: 396378 Raw text: 396378

https://gwern.net/doc/www/arxiv.org/239bfc5a718b2b7e356700f6efdc0b7b3331bab2.pdf

Train Large, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers arXiv:2002.11794v2 [cs.CL] 23 Jun 2020 Zhuohan Li * 1 Eric Wallace * 1 Sheng Shen * 1 Kevin Lin * 1 Kurt Keutzer 1 Dan Klein 1 Joseph E. Gonzalez 1 Abstract Since hardware resources are limited,...

Title: None

Score: 0.8898784677884709

User feedback: None

Out links: 8317449 Raw text: 8317449

https://arxiv.org/pdf/2411.08706

Searching Latent Program Spaces arXiv:2411.08706v1 [cs.LG] 13 Nov 2024 Clément Bonnet Matthew V Macfarlane University of Amsterdam Abstract Program synthesis methods aim to automatically generate programs restricted to a language that can explain ...

Title: None

Score: 0.8248744190885741

User feedback: None

Out links: 12034745 Raw text: 12034745

https://arxiv.org/pdf/2502.04896

Goku: Flow Based Video Generative Foundation Models Shoufa Chen1∗ Chongjian Ge1∗ Yuqi Zhang2 Yida Zhang2 Fengda Zhu2 Hao Yang2 Hongxiang Hao2 Hui Wu2 Zhichao Lai2 Yifei Hu2 Ting-Che Lin2 Shilong Zhang1 Fu Li2 Chuan Li2 Xing Wang2 Yanghua Peng2 Peize Sun1 Ping Luo1 Yi Jiang2 Zehuan Yuan2 Bingyue Peng...

Title: None

Score: 0.822890388529411

User feedback: None

Out links: 9081584 Raw text: 9081584

https://arxiv.org/pdf/2501.00663v1

Titans: Learning to Memorize at Test Time Ali Behrouz†, Peilin Zhong†, and Vahab Mirrokni† †Google Research arXiv:2501.00663v1 [cs.LG] 31 Dec 2024 {alibehrouz, peilinz, mirrokni}@google.com Abstract Over more than a decade there has been an extensive research effort of how effectively u...

Title: Improved Training of Wasserstein GANs

Score: 0.8013133098407761

User feedback: None

Out links: 451505 Raw text: 451505

https://gwern.net/doc/www/arxiv.org/8d7bc4501cf2763e3d533db9b2ae7b6559412d11.pdf

arXiv:1704.00028v3 [cs.LG] 25 Dec 2017 Improved Training of Wasserstein GANs Ishaan Gulrajani1∗, Faruk Ahmed1, Martin Arjovsky2, Vincent Dumoulin1, Aaron Courville1,3 1 Montreal Institute for Learning Algorithms 2 Courant Institute of Mathematical Sciences 3 CIFAR Fellow {faru...

Title: None

Score: 0.7923972430802166

User feedback: None

Out links: 14658400 Raw text: 14658400

https://storage.googleapis.com/deepmind-media/gemma/Gemma3Report.pdf

2025-03-12 Gemma 3 Technical Report Gemma Team, Google DeepMind1 We introduce Gemma 3, a multimodal addition to the Gemma family of lightweight open models, ranging in scale from 1 to 27 billion parameters. This version introduces vision understanding abilities, a wider coverage of languages and l...

Title: None

Score: 0.7920913349885812

User feedback: None

Out links: 9303979 Raw text: 9303979

https://arxiv.org/pdf/2501.09223

arXiv:2501.09223v1 [cs.CL] 16 Jan 2025 Foundations of Large Language Models Tong Xiao and Jingbo Zhu January 17, 2025 NLP Lab, Northeastern University & NiuTrans Research Copyright © 2021-2025 Tong Xiao and Jingbo Zhu NATURAL LANGUAGE PROCESSING LAB, NORTHEASTERN UNIVERSITY & NIUTRANS...

Title: None

Score: 0.7895508106146834

User feedback: None

Out links: 195795 Raw text: 195795

https://arxiv.org/pdf/2401.10020

Self-Rewarding Language Models Weizhe Yuan1,2 Richard Yuanzhe Pang1,2 Kyunghyun Cho2 Xian Li1 Sainbayar Sukhbaatar1 Jing Xu1 Jason Weston1,2 arXiv:2401.10020v2 [cs.CL] 8 Feb 2024 1 Meta 2 NYU Abstract We posit that to achieve superhuman agents, future models require superhuman feedback in ord...

Title: None

Score: 0.7852585325404082

User feedback: None

Out links: 11617095 Raw text: 11617095

https://arxiv.org/pdf/2502.04327

Value-Based Deep RL Scales Predictably Oleh Rybkin1, Michal Nauman1,2, Preston Fu1, Charlie Snell1, Pieter Abbeel1, Sergey Levine1 and Aviral Kumar3 arXiv:2502.04327v1 [cs.LG] 6 Feb 2025 1 University of California, Berkeley, 2 University of Warsaw, 3 Carnegie Mellon ...

Title: None

Score: 0.758706019648826

User feedback: None

Out links: 956469 Raw text: 956469

https://usenix.org/system/files/osdi24-agrawal.pdf

Taming Throughput-Latency Tradeoff in LLM Inference with Sarathi-Serve Amey Agrawal, Georgia Institute of Technology; Nitin Kedia, Ashish Panwar, Jayashree Mohan, Nipun Kwatra, and Bhargav Gulavani, Microsoft Research India; Alexey Tumanov, Georgia Institute of Technology; Ramachandran Ramjee, Micro...

Title: Backpropagation and the brain

Score: 0.7547592173612423

User feedback: None

Out links: 2124257 Raw text: 2124257

http://www.cs.toronto.edu/~hinton/absps/backpropandbrain.pdf

Perspectives Backpropagation and the brain Timothy P. Lillicrap, Adam Santoro, Luke Marris, Colin J. Akerman and Geoffrey Hinton Abstract | During learning, the brain modifies synapses to improve behaviour. In the cortex, synapses are embedded within multilayered networks, making it difficult to d...

Title: HybridFlow: A Flexible and Efficient RLHF Framework

Score: 0.7545709726063655

User feedback: None

Out links: 10477954 Raw text: 10477954

https://arxiv.org/pdf/2409.19256v2

arXiv:2409.19256v2 [cs.LG] 2 Oct 2024 HybridFlow: A Flexible and Efficient RLHF Framework Guangming Sheng (The University of Hong Kong), Chi Zhang (ByteDance), Zilingfeng Ye (ByteDance), Xibin Wu Wang Zhang Ru Zhang By...

Title: Persia: An Open, Hybrid System Scaling Deep Learning-based Recommenders up to 100 Trillion Parameters

Score: 0.7379729379956916

User feedback: None

Out links: 396330 Raw text: 396330

https://gwern.net/doc/www/arxiv.org/bcbed3d35c8ef6f6edb7123a1b4e354fa9a9a28e.pdf

Persia: An Open, Hybrid System Scaling Deep Learning-based Recommenders up to 100 Trillion Parameters Xiangru Lian1 , Binhang Yuan3 , Xuefeng Zhu2 , Yulong Wang2 , Yongjun He3 , Honghuan Wu2 , Lei Sun2 , Haodong Lyu2 , Chengjun Liu2 , Xing Dong2 , Yiqiao Liao2 , Mingnan Luo2 , Congfei Zhang2 , Jingr...

Title: nips_draft3.dvi

Score: 0.6949723950878988

User feedback: None

Out links: 6094537 Raw text: 6094537

http://www.cs.toronto.edu/~rsalakhu/papers/nips07_pmf.pdf

Probabilistic Matrix Factorization Ruslan Salakhutdinov and Andriy Mnih Department of Computer Science, University of Toronto 6 King’s College Rd, M5S 3G4, Canada {rsalakhu,amnih}@cs.toronto.edu Abstract Many existing approaches to collaborative filtering can neither handle very large datasets nor ...

Title: s1: Simple test-time scaling

Score: 0.6889910735387487

User feedback: None

Out links: 11610188 Raw text: 11610188

https://arxiv.org/pdf/2501.19393

s1: Simple test-time scaling Niklas Muennighoff * 1 3 4 Zitong Yang * 1 Weijia Shi * 2 Xiang Lisa Li * 1 Li Fei-Fei 1 Hannaneh Hajishirzi 2 3 Luke Zettlemoyer 2 Percy Liang 1 Emmanuel Candès 1 Tatsunori Hashimoto 1 Mathematical Competition PhD-Level Problem Solving Math Science Questions (MATH500)...

Title: None

Score: 0.6887940375746637

User feedback: None

Out links: 12958666 Raw text: 12958666

https://arxiv.org/pdf/2502.14499

MLGym: A New Framework and Benchmark for Advancing AI Research Agents Deepak Nathani1† , Lovish Madaan2,7 , Nicholas Roberts3† , Nikolay Bashlykov7 , Ajay Menon7 , Vincent Moens5 , Amar Budhiraja7 , Despoina Magka6 , Vladislav Vorotilov7 , Gaurav Chaurasia7 , Dieuwke Hupkes7 , Ricardo Silveira Cabra...

Title: None

Score: 0.6639534686787638

User feedback: None

Out links: 9761393 Raw text: 9761393

https://sites.cs.ucsb.edu/~vigna/publications/2016_NDSS_Driller.pdf

Driller: Augmenting Fuzzing Through Selective Symbolic Execution Nick Stephens, John Grosen, Christopher Salls, Audrey Dutcher, Ruoyu Wang, Jacopo Corbetta, Yan Shoshitaishvili, Christopher Kruegel, Giovanni Vigna UC Santa Barbara {stephens,jmg,salls,dutcher,fish,jacopo,yans,chris,vigna}@cs.ucsb.edu...

Title: None

Score: 0.660376826138596

User feedback: None

Out links: 10187848 Raw text: 10187848

https://arxiv.org/pdf/1806.05101

Order-book modelling and market making strategies Xiaofei Lu∗1 and Frédéric Abergel†1 1 Chaire de finance quantitative, Laboratoire MICS, CentraleSupélec, Université Paris Saclay arXiv:1806.05101v1 [q-fin.TR] 13 Jun 2018 June 14, 2018 Abstract Market making is one of the most important aspec...

Title: None

Score: 0.6519012553366885

User feedback: None

Out links: 13696632 Raw text: 13696632

https://arxiv.org/pdf/1709.00103

Seq2SQL: Generating Structured Queries from Natural Language Using Reinforcement Learning arXiv:1709.00103v7 [cs.CL] 9 Nov 2017 Victor Zhong, Caiming Xiong, Richard Socher Salesforce Research Palo Alto, CA {vzhong,cxiong,rsocher}@salesforce.com Abstract Relational databases store a signi...

Title: None

Score: 0.6445696493191402

User feedback: None

Out links: 3601338 Raw text: 3601338

https://arxiv.org/pdf/2412.06264

Flow Matching Guide and Code Yaron Lipman1 , Marton Havasi1 , Peter Holderrieth2 , Neta Shaul3 , Matt Le1 , Brian Karrer1 , Ricky T. Q. Chen1 , David Lopez-Paz1 , Heli Ben-Hamu3 , Itai Gat1 arXiv:2412.06264v1 [cs.LG] 9 Dec 2024 1 FAIR at Meta, 2 MIT CSAIL, 3 Weizmann Institute of Science Flow Ma...

Title: Building Query Compiler

Score: 0.6429572679875343

User feedback: None

Out links: 6387820 Raw text: 6387820

https://pi3.informatik.uni-mannheim.de/~moer/querycompiler.pdf

Building Query Compilers (Under Construction) [expected time to completion: 5 years] Guido Moerkotte October 31, 2024 Contents I Basics 1 Introduction 1.1 General Remarks 1.2 DBMS Architecture ...

Title: None

Score: 0.6268198558636983

User feedback: None

Out links: 6336646 Raw text: 6336646

https://arxiv.org/pdf/1701.01427

Rational Decision-Making Under Uncertainty: Observed Betting Patterns on a Biased Coin1 Victor Haghani2 Richard Dewey3 Working Draft October 19, 2016 arXiv:1701.01427v1 [q-fin.GN] 4 Jan 2017 1 Introduction You're invited to a talk by a hedge fund manager who was a partner at a fund that famousl...

Title: Smart Contract Fuzzing Towards Profitable Vulnerabilities

Score: 0.6234566466512556

User feedback: None

Out links: 15387886 Raw text: 15387886

https://arxiv.org/pdf/2501.08834

arXiv:2501.08834v2 [cs.CR] 12 Feb 2025 Smart Contract Fuzzing Towards Profitable Vulnerabilities ZIQIAO KONG, Nanyang Technological University, Singapore CEN ZHANG, Nanyang Technological University, Singapore MAOYI XIE, Nanyang Technological University, Singapore MING HU, Singapore Management Unive...

Title: QECO: A QoE-Oriented Computation Offloading Algorithm based on Deep Reinforcement Learning for Mobile Edge Computing

Score: 0.6194587410822758

User feedback: None

Out links: 15465491 Raw text: 15465491

https://arxiv.org/pdf/2311.02525

QECO: A QoE-Oriented Computation Offloading Algorithm based on Deep Reinforcement Learning for Mobile Edge Computing arXiv:2311.02525v2 [cs.NI] 14 Aug 2024 Iman Rahmati, Hamed Shah-Mansouri, and Ali Movaghar Abstract: In the realm of mobile edge computing (MEC), efficient computation...

Title: None

Score: 0.6083059488439466

User feedback: None

Out links: 13032376 Raw text: 13032376

https://www.cs.cmu.edu/~15850/notes/cmu850-f20.pdf

ADVANCED ALGORITHMS notes for cmu 15-850 (fall 2020) lecturer: anupam gupta About this document This document contains the course notes for 15-850: Advanced Algorithms, a graduate-level course taught by Anupam Gupta at Carnegie Mellon University in Fall 2020. Parts of these notes...

Title: None

Score: 0.5910138031374234

User feedback: None

Out links: 7852103 Raw text: 7852103

https://people.csail.mit.edu/nickolai/papers/henzinger-tiptoe.pdf

Private Web Search with Tiptoe Alexandra Henzinger Emma Dauterman MIT UC Berkeley Abstract. Tiptoe is a private web search engine that allows clients to search over hundreds of millions of documents, while revealing no information about their search query to the search engine’s servers. Tiptoe’s pri...

Title: None

Score: 0.5904436722702294

User feedback: None

Out links: 9199607 Raw text: 9199607

https://aada.ms/pdfs/tokenize_acc.pdf

tokenize/acc Austin Adams Whetstone Research Figure 1: A typical Kansas Blue Sky from the Volland Store in Alma, Kansas. In 1911, Alma was the birthplace of the Blue Sky Laws, the precursor to modern securities laws. The Volland Store, where this photo is from, is around an hour...

Title: Matters Computational

Score: 0.588115051501129

User feedback: None

Out links: 7012724 Raw text: 7012724

https://www.jjj.de/fxt/fxtbook.pdf

Matters Computational: Ideas, Algorithms, Source Code Jörg Arndt Contents Preface I Low level algorithms 1 Bit wizardry 1.1 Trivia 1.2 Operations on individual bits ...

Title: ItyFuzz: Snapshot-Based Fuzzer for Smart Contract

Score: 0.5800177166311197

User feedback: None

Out links: 2023600 Raw text: 2023600

https://dl.acm.org/doi/pdf/10.1145/3597926.3598059

ItyFuzz: Snapshot-Based Fuzzer for Smart Contract Chaofan Shou, Shangyin Tan, Koushik Sen (UC Berkeley, Berkeley, CA, USA) ABSTRACT smart contract. Each transaction serves as an in...

Title: Compiling C to Safe Rust, Formalized

Score: 0.5485596078255999

User feedback: None

Out links: 5351856 Raw text: 5351856

https://arxiv.org/pdf/2412.15042

Compiling C to Safe Rust, Formalized arXiv:2412.15042v1 [cs.PL] 19 Dec 2024 AYMERIC FROMHERZ, Inria, France JONATHAN PROTZENKO, Microsoft Azure Research, USA The popularity of the Rust language continues to explode; yet, many critical codebases remain authored in C, and cannot be realistically rew...

Title: None

Score: 0.5364005363841096

User feedback: None

Out links: 7023960 Raw text: 7023960

https://www.usenix.org/system/files/conference/usenixsecurity18/sec18-zhou.pdf

Erays: Reverse Engineering Ethereum’s Opaque Smart Contracts Yi Zhou, Deepak Kumar, Surya Bakshi, Joshua Mason, Andrew Miller, and Michael Bailey, University of Illinois, Urbana-Champaign https://www.usenix.org/conference/usenixsecurity18/presentation/zhou This paper is included in the Proceedings ...

Title: None

Score: 0.5336803940912357

User feedback: None

Out links: 2834074 Raw text: 2834074

https://www.vldb.org/pvldb/vol13/p3411-armbrust.pdf

Delta Lake: High-Performance ACID Table Storage over Cloud Object Stores Michael Armbrust, Tathagata Das, Liwen Sun, Burak Yavuz, Shixiong Zhu, Mukul Murthy, Joseph Torres, Herman van Hovell, Adrian Ionescu, Alicja Łuszczak, Michał Świtakowski, Michał Szafrański, Xiao Li, Takuya Ueshin, Mostafa Mo...

Title: None

Score: 0.5282948069037552

User feedback: None

Out links: 12079635 Raw text: 12079635

https://arxiv.org/pdf/2502.07617

February 2025 Scaling Pre-training to One Hundred Billion Data for Vision Language Models Xiao Wang† , Ibrahim Alabdulmohsin† , Daniel Salz, Zhe Li, Keran Rong and Xiaohua Zhai arXiv:2502.07617v1 [cs.CV] 11 Feb 2025 † Corresponding Authors: {wangxiao, ibomohsin}@google.com We provide an empirica...

Title: None

Score: 0.5259696888416368

User feedback: None

Out links: 17967 Raw text: 17967

https://drwho.virtadpt.net/files/mov.pdf

mov is Turing-complete Stephen Dolan Computer Laboratory, University of Cambridge Abstract registers for now, but later we show how their number can be reduced without losing expressiveness. We have the following instructions (if you like RISC) or addressing modes (if yo...

Title: None

Score: 0.39530180955736244

User feedback: None

Out links: 14661612 Raw text: 14661612

https://www.3jane.xyz/pdf/whitepaper.pdf

3Jane Protocol: The Credit-Based Money Market Feb 25, 2025 Abstract 3Jane is a credit-based money market on Ethereum enabling unsecured lines of credit underwritten against verifiable proofs of crypto & bank assets, future cash flows, and credit scores. This unlocks a three-dimensio...

Title: None

Score: 0.38580663906894275

User feedback: None

Out links: 12066891 Raw text: 12066891

https://arxiv.org/pdf/2502.06807

Competitive Programming with Large Reasoning Models OpenAI∗ arXiv:2502.06807v1 [cs.LG] 3 Feb 2025 Abstract We show that reinforcement learning applied to large language models (LLMs) significantly boosts performance on complex coding and reasoning tasks. Additionally, we compare two general-purpos...

Title: None

Score: 0.2259284891369878

User feedback: None

Out links: 6161583 Raw text: 6161583

https://www.tamuz.caltech.edu/teaching/ps172/lectures.pdf

Title: None

Score: 0.2259284891369878

User feedback: None

Out links: 9199606 Raw text: 9199606

https://raw.githubusercontent.com/whetstoneresearch/docs/refs/heads/main/whitepapers/doppler/Dutch_auction_Dynamic_Bonding_Curves.pdf

Title: None

Score: 0.2259284891369878

User feedback: None

Out links: 9646784 Raw text: 9646784

https://github.com/wtdcode/sand-aflpp/blob/sand/paper.pdf?raw=true

Title: None

Score: 0.19811670196212236

User feedback: None

Out links: 204789 Raw text: 204789

https://www.foundationdb.org/files/fdb-paper.pdf

Title: Just a moment...

Score: 0.17146332519223004

User feedback: None

Out links: 9060494 Raw text: 9060494

https://www.biorxiv.org/content/10.1101/2024.09.27.615483v2.full.pdf

Title: Moonlight/Moonlight.pdf at master · MoonshotAI/Moonlight · GitHub

Score: 0.11929581014392482

User feedback: None

Out links: 13107365 Raw text: 13107365

https://github.com/MoonshotAI/Moonlight/blob/master/Moonlight.pdf

Title: papers-we-love/audio_comp_sci/shazam-audio-search-algorithm.pdf at main · papers-we-love/papers-we-love · GitHub

Score: 0.09861644704910924

User feedback: None

Out links: 13938338 Raw text: 13938338

https://github.com/papers-we-love/papers-we-love/blob/main/audio_comp_sci/shazam-audio-search-algorithm.pdf
