Good, Jeff. I'm going to start by asking about your experience over the last 24 long years at Google. I'd love for you to speak about all the different roles you've had and what you've been doing for those 24 years. A lot must have changed; 24 years is a long time.

Yeah. Over the course of my career at Google I've bounced around to a bunch of different things. For the first few years I created our first advertising system, and then I spent a number of years working on our core search indexing, crawling, and retrieval systems. That was a very exciting time, because our traffic was growing very fast and we had to wake up every Tuesday and worry we would melt because we had too many queries. So what could we do to make the system more efficient, but also make the index we were serving to users much larger and updated more often? That was exciting. One of our colleagues kept a little crayon chart on the wall of how many queries we handled every day; it was always going up, and he kept having to rescale it because the paper was only so big.

Then I worked for a number of years on what I'll call distributed systems infrastructure: storage systems, ways of expressing computations such as MapReduce, and then higher-level storage systems like Spanner or Bigtable that sit underneath lots of Google products and let them rely on the same abstractions, regardless of whether it's Maps or Search or Ads or Cloud products.

Then, maybe about 12 years ago, I got interested in how we could train very large neural networks. I had actually been exposed to them as an undergrad doing research at the University of Minnesota, because they felt like a good abstraction. There was a wave of excitement about neural nets in the late '80s and early '90s, because they could seemingly solve interesting but small problems in ways that other approaches could not. I got really excited: if we could just make these neural networks 100 times as big, it would be amazing. So as an undergrad I did some research on how to apply parallel computation to training neural networks. It turns out we needed more than the 32 processors in the department machine to make neural nets really powerful; we needed about a million times as much computation. But we started to have that around 2008 or 2009, with the advances in computing, Moore's law, and special-purpose processors like GPUs and other specialized processors. Suddenly you had the ability to build neural networks that could solve problems we didn't know how to solve in other ways, real problems, not just toy-sized ones. So I've been working in that area for a while. Most recently I've been co-leading the Gemini project, which is our latest large-scale multimodal model for all kinds of things. Sorry, that was long-winded.

It wasn't long enough, because for those of you who might be intrigued by what Jeff just said, I'd highly recommend the article in The New Yorker, was it 2018, "The Friendship That Made Google Huge." I remember my daughter read that article.
When I told her I was going to be talking to Jeff (I was interviewing for this role, so Jeff was interviewing me), her respect for me went way up: oh, you're getting to talk to Jeff Dean. So that also played a very positive role in my life.

So, Jeff, back to where we are. There's so much hype around AI, and we've made so much progress since those deep learning advances of slightly over a decade ago. How would you describe this moment in AI?

This moment? Yeah. Really, what you've seen over the last decade is a continued expansion of what we imagine computers can do. Ten years ago we wouldn't have thought you could have computers that can see or understand speech nearly as effectively as the models we have today. And you're starting to see these systems comprehend much more complex structures: not just a sentence, but taking in a whole document and being able to summarize it or extract information from it. These are fields that have been worked on for many years, and we've made some progress in some of them. But now look at the ability of these very large models, trained in a very general way across text and images and video and audio, to accomplish general tasks. Prior to that, if you had a machine learning problem, you would generally train a model to do that very narrow, specific thing, and you might go off and gather very specialized data to do it. Now these general models can accomplish many of the tasks that previously would have required a month-long, special-purpose training run; you can just ask the model to do it, and it can. I think that really unveils a new way of thinking about interacting with computers and working with them to do the things we care about.

In fact, I remember distinctly, just four years ago I think, we had organized a research week with students, and Jeff was asked by one of the students: what's the one thing about today's AI models that you don't find satisfactory? And that was exactly the answer you gave: that for every new problem we end up building a new model from scratch, and that's not how we humans learn. So you predicted that at that stage; of course you were already working on some of those things, but it's great to see it coming to fruition. I want to follow up on the flip side: where do you think this technology is going? What is our goal, and where are we heading? What's next?

Yeah, I think you're seeing a very broad spread of machine learning and AI into many, many fields of endeavor. In nearly every field of science, people are thinking about how a learning-based approach, learning from data, could impact a sub-problem in their field, or how a whole bunch of sub-problems could be solved simultaneously and really transform how you think about the work. You see that in chemistry, materials science, chip design, education, healthcare, and every other field.
I think people are excited about the possibility, but they're also careful and wary, because these technologies are not going to solve these problems right out of the box. It really requires domain expertise: understanding the limitations of what these models can do, but also the possibilities. It's also good to look not just at what they can do today, but at what you think they'll be able to do five years from now, because that's where the field of AI and ML is going, and where all the other fields are going to be going: what possibilities get unlocked over the next 5 to 10 years?

Very nice. One of the things I often think about is that we are living in such amazing times. If I look at the evolution, or maybe revolution, of AI and where it has brought us: on one hand we have these LLMs with amazing capabilities, which have taken the world by storm, and on the other hand there are so many fundamental problems with them, like factuality or hallucinations, alignment with human values, reasoning, and so on, that are still unsolved. How do you look at this duality, and do you have any advice for all of the wonderful AI researchers in the audience?

Yeah. I think it's good not to pretend there are no problems, because there are definitely issues, as you pointed out: things like factuality, or bias, or training a model on one data distribution and then applying it in another setting where things are very different, so the model doesn't perform the way you want it to. It's really important that everyone thinking about applying machine learning, both machine learning people and people in other kinds of domains, be thinking about this and understand the rough edges that can show up, the fallibilities of these models. At the same time, there's been a lot of press on things like factuality, and if you look at the models available today, their factuality is actually significantly better than models from even a year or two ago. So there's definite progress in a lot of these areas, but it is by no means perfect today, and continued research is needed on nearly every one of the areas outlined in our AI principles. Google has a set of seven AI principles, and nearly all of them are active areas of research: how can you improve the factuality of models, how can you lessen bias in these models? So it's important to keep working on those areas, but also to ask, despite the shortcomings, what the models can do, and how you can, as you say, maximize the benefit while also working on the shortcomings.

Thanks. Can you say something about the kinds of architectures for these models going forward? Is the current set of architectures here to stay, or do you expect radically new forms in the next few years?

Yeah, let me describe the current state of the world. Right now, with these foundation models, you tend to collect a bunch of different kinds of data and train a model in a month-long, or few-month-long, process to get it to a state where it can do a bunch of things. And that's a very static thing.
You then take the checkpoint of the model that's been trained, and people can go off and do all kinds of things with it. A bunch of different teams at Google take that checkpoint; they might deploy it directly and use it in their product, or they might collect a bit more data and fine-tune the model on this problem or that problem. But the problem is that if you have a hundred teams doing things with these models, the ways in which they apply it don't make it back into that core model. It's train once, then deploy in a bunch of ways. I think we really need to move much more toward an incremental learning kind of system, where everyone is continuously working to improve a single model, and different groups or different people who care about different kinds of functionality can work on it collectively without getting in the way of others who are improving some other aspect of the model. People working on making the model better for medical applications shouldn't interfere with, but should allow, simultaneous improvement in other areas, like machine learning for code completion. If we could have a system where everyone was collectively improving the same model, you could incrementally extend its capabilities rather than doing these monolithic large training runs that then get propagated downstream; I think we'd be in a much better state. So incremental learning, and ways of training so that one new task doesn't interfere with another and doesn't cause catastrophic forgetting of what the model used to be able to do: all of those are really important research areas, I think.

Okay. So, Jeff, we've obviously seen a lot of progress in this space, but it seems like compute and data have become even more important in the current paradigm than ever before. What do you think are the research questions to address in going beyond scale? I'm sure the academics here in particular are very strongly interested in that.

Yeah. It is definitely the case that training on more data, which requires more compute and also a larger model to absorb the knowledge in that much larger training corpus, has helped. But I think it's a little under-appreciated how important algorithmic advances are in improving the capabilities of these models. We've gotten maybe a factor of 10 improvement over the last few years from more computation: more powerful computation per watt, or per dollar, or whatever. But we've probably gotten a factor of 10 as well from algorithmic improvements: things like model architectures, or better filtering mechanisms to identify which data is going to be most helpful to train on, so you don't spend your compute cycles training on things that are not nearly as useful as others. A lot of those kinds of things can be proven out in ways that don't require large amounts of compute. We actually do very small-scale experiments to evaluate, say, the value of this data set, or of this particular kind of data versus others. That's where I think there's huge value in wide and deep exploration of a whole bunch of different approaches and techniques, and when you demonstrate those ideas at even a small scale, a lot of the time they do carry over.

We do experimentation at a few different scales, though, because what's really important is not whether you're improving at this particular scale, but what the trend line looks like as you ramp up in scale. You might have a teeny model, a slightly bigger model, and a slightly bigger one still. If you're above the baseline you're comparing to but the slope looks a lot worse, that's not as interesting as something that's below the baseline but whose slope looks amazing, because as things scale up it might dramatically overtake the baseline. That's an interesting result, even though it might not seem so, since it's below what you're comparing to.
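To make the trend-line comparison described above concrete, here is a minimal illustrative sketch. The model sizes, losses, and extrapolation target below are invented for illustration and are not from any real evaluation pipeline; the point is only that a recipe that loses at every small scale can still project to a win once you fit and extrapolate the scaling trend.

```python
# Illustrative sketch (hypothetical numbers): fit log-log trend lines of eval
# loss vs. model size for a baseline recipe and a candidate recipe, then
# compare slopes and extrapolate to a much larger scale.
import numpy as np

sizes = np.array([1e7, 1e8, 1e9])              # small "proxy" model sizes (parameters)
baseline_loss = np.array([3.10, 2.65, 2.30])   # better at every measured scale
candidate_loss = np.array([3.80, 3.00, 2.38])  # worse today, but steeper slope

def fit_power_law(sizes, losses):
    """Fit loss ~ a * size**b in log-log space; return (a, b)."""
    b, log_a = np.polyfit(np.log(sizes), np.log(losses), 1)
    return np.exp(log_a), b

target_size = 1e11  # the scale we actually care about
for name, losses in [("baseline", baseline_loss), ("candidate", candidate_loss)]:
    a, b = fit_power_law(sizes, losses)
    projected = a * target_size ** b
    print(f"{name}: slope={b:+.3f}, projected loss at {target_size:.0e} params = {projected:.2f}")
```

With these made-up numbers the candidate is worse at every measured scale but projects to a lower loss at the target size, which is exactly the kind of result described as interesting even though it sits below the baseline at small scale.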
Yeah, so plenty of interesting problems to work on. Great. So, as Chief Scientist of Google, I know that you like to do some of the technical work yourself. I'm interested to hear what you're working on yourself that you're particularly excited about.

Yeah, there are a lot of different areas. I'm particularly excited about these more incremental learning techniques, and about different model architectures that can make them possible. I think models that are much sparser than the ones we have today are probably going to be important, and models that you can incrementally add capacity to are going to be important. Identifying which pieces of data are useful to train on also matters: could you have an automatic assessment of different data sets, or different examples, that estimates how much value you get out of them? That seems like a direction we should go in. Right now it's a bit more ad hoc, and you do some small-scale experiments designed by humans, but if you could have more automated approaches in that space, that would be exciting. And just in general, how do you make the iteration time for research faster? You'd like to not have to wait 3 minutes for your machine learning computation to launch and start giving you results; if you could do that in 20 seconds, it would be better for everyone.
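As a closing aside, a minimal sketch of the kind of automated data-valuation experiment alluded to above might look like the following. The synthetic data, the tiny ridge-regression "proxy model," and the with/without value metric are all stand-ins chosen for brevity, not a description of how this is actually done in practice.

```python
# Hypothetical sketch: estimate the value of a candidate dataset by training a
# small, cheap proxy model with and without it and comparing held-out loss.
import numpy as np

rng = np.random.default_rng(0)

def make_dataset(n, noise):
    """Synthetic regression data y = X @ w_true + noise; a stand-in for a real corpus."""
    X = rng.normal(size=(n, 20))
    y = X @ np.ones(20) + rng.normal(scale=noise, size=n)
    return X, y

def held_out_loss(train_sets, test_set):
    """Train a tiny ridge-regression proxy on the given sets; return test MSE."""
    X = np.vstack([s[0] for s in train_sets])
    y = np.concatenate([s[1] for s in train_sets])
    w = np.linalg.solve(X.T @ X + 1e-3 * np.eye(X.shape[1]), X.T @ y)
    X_test, y_test = test_set
    return float(np.mean((X_test @ w - y_test) ** 2))

base = make_dataset(200, noise=1.0)       # existing training mix
candidate = make_dataset(200, noise=0.2)  # candidate dataset being valued
test = make_dataset(500, noise=0.0)       # held-out evaluation set

value = held_out_loss([base], test) - held_out_loss([base, candidate], test)
print(f"estimated value of candidate data (drop in held-out MSE): {value:.4f}")
```

The same with/without comparison could in principle be automated across many candidate corpora and repeated at a few proxy scales, in the spirit of the small-scale experiments and trend-line checks described earlier.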