Large Concept Models: Language Modeling in a Sentence Representation Space

December 11, 2024

Abstract

LLMs have revolutionized the field of artificial intelligence and have emerged as the de facto tool for many tasks. The current established technology of LLMs is to process input and generate output at the token level. This is in sharp contrast to humans who operate at multiple levels of abstraction, well beyond single words, to analyze information and to generate creative content. In this paper, we present an attempt at an architecture which operates on an explicit higher-level semantic representation, which we name a “concept”. Concepts are language- and modality-agnostic and represent a higher-level idea or action in a flow. Hence, we build a “Large Concept Model”. In this study, as proof of feasibility, we assume that a concept corresponds to a sentence, and use an existing sentence embedding space, SONAR, which supports up to 200 languages in both text and speech modalities. The Large Concept Model is trained to perform autoregressive sentence prediction in an embedding space. We explore multiple approaches, namely MSE regression, variants of diffusion-based generation, and models operating in a quantized SONAR space. These explorations are performed using 1.6B parameter models and training data on the order of 1.3T tokens. We then scale one architecture to a model size of 7B parameters and training data of about 7.7T tokens.
We perform an experimental evaluation on several generative tasks, namely summarization and a new task of summary expansion. Finally, we show that our model exhibits impressive zero-shot generalization performance to many languages, outperforming existing LLMs of the same size. The training code of our models is freely available.

Authors: The LCM team, Loic Barrault, Paul-Ambroise Duquenne, Maha Elbayad, Artyom Kozhevnikov, Belen Alastruey, Pierre Andrews, Mariano Coria, Guillaume Couairon, Marta R. Costa-jussa, David Dale, Hady Elsahar, Kevin Heffernan, João Maria Janeiro, Tuan Tran, Christophe Ropers, Eduardo Sánchez, Robin San Roman, Alexandre Mourachko, Safiyyah Saleem, Holger Schwenk

Publisher: arXiv

Research Topics: Natural Language Processing (NLP)
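The abstract describes training a model for autoregressive sentence prediction in an embedding space, with MSE regression as one of the explored objectives. As a rough illustration of that idea only, the toy sketch below fits a predictor of the next embedding from the previous one and then rolls it out autoregressively. It is not the LCM architecture or its training code: the 2-dimensional vectors stand in for SONAR sentence embeddings, a single linear map stands in for the model, and every name used here (`predict`, `mse`, `train_step`, `generate`, `TRUE_MAP`) is hypothetical.

```python
# Toy sketch: next-"concept" prediction by MSE regression in an embedding space.
# Assumptions (not from the paper): 2-D vectors stand in for sentence
# embeddings, and a single linear map stands in for the model.


def predict(weights, x):
    """Apply a dim x dim linear map (given as a list of rows) to embedding x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def mse(pred, target):
    """Mean squared error between two equal-length vectors."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def train_step(weights, x, target, lr=0.5):
    """One in-place SGD step on the MSE loss; returns the loss before the update."""
    pred = predict(weights, x)
    dim = len(pred)
    for i in range(dim):
        grad_out = 2.0 * (pred[i] - target[i]) / dim  # d(loss) / d(pred[i])
        for j in range(dim):
            weights[i][j] -= lr * grad_out * x[j]
    return mse(pred, target)


def generate(weights, first, steps):
    """Autoregressive rollout: each predicted embedding is fed back as the next input."""
    seq = [first]
    for _ in range(steps):
        seq.append(predict(weights, seq[-1]))
    return seq


# Hypothetical "document": each sentence embedding is the previous one
# rotated by 90 degrees, so an exact linear rule exists for the toy model to learn.
TRUE_MAP = [[0.0, -1.0], [1.0, 0.0]]
document = generate(TRUE_MAP, [1.0, 0.0], steps=4)

# Train on (previous embedding, next embedding) pairs, starting from zero weights.
weights = [[0.0, 0.0], [0.0, 0.0]]
for _ in range(50):
    for prev, nxt in zip(document, document[1:]):
        loss = train_step(weights, prev, nxt)

# Generate a sequence of embeddings from the first one only.
rollout = generate(weights, document[0], steps=4)
```

In this toy setting `rollout` closely tracks `document` because the data follow an exact linear rule. The sketch covers only the MSE regression objective; the diffusion-based and quantized variants mentioned in the abstract are not represented.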