Good, Jeff. I'm going to start by asking about your experience over the last 24 long years at Google. I'd love for you to speak about all the different roles you've had and what you've been doing for those 24 years. A lot must have changed; 24 years is a long time.

Yeah. Over the course of my career at Google I've bounced around to a bunch of different things. For the first few years I created our first advertising system, and then I spent a number of years working on our core search indexing, crawling, and retrieval systems. That was a very exciting time, because our traffic was growing very fast and we had to wake up every Tuesday and worry we would melt because we had too many queries. So what could we do to make the system more efficient, but also make the index we were serving to users much larger and updated more often? That was exciting. One of our colleagues kept a little crayon chart on the wall of how many queries we handled every day; it was always going up, and he kept having to rescale it because the paper was only so big.

Then I worked for a number of years on what I'll call distributed systems infrastructure: storage systems, ways of expressing computations such as MapReduce, and then higher-level storage systems like Spanner or Bigtable that sit underneath lots of Google products and let them rely on the same abstractions, regardless of whether it's Maps or Search or Ads or Cloud products.

Then, maybe about 12 years ago, I got interested in how we could train very large neural networks. I had actually been exposed to them as an undergrad doing research at the University of Minnesota, because they felt like a good abstraction. There was a wave of excitement about neural nets in the late '80s and early '90s, because they could seemingly solve interesting but small problems in ways that other approaches could not. I got really excited: if we could just make these neural networks 100 times as big, it would be amazing. So as an undergrad I did some research on how to apply parallel computation to training neural networks. It turns out we needed more than the 32 processors in the department machine to make neural nets really powerful; we needed about a million times as much computation. But we started to have that around 2008 or 2009, with the advances in computing, Moore's law, and special-purpose processors like GPUs and other specialized processors. Suddenly you had the ability to build neural networks that could solve problems we didn't know how to solve in other ways, real problems, not just toy-sized ones. So I've been working in that area for a while. Most recently I've been co-leading the Gemini project, which is our latest large-scale multimodal model for all kinds of things. Sorry, that was long-winded.

It wasn't long enough, because for those of you who might be intrigued by what Jeff just said, I'd highly recommend the article in The New Yorker, was it 2018, "The Friendship That Made Google Huge." I remember my daughter read that article.
When I told her I was going to be talking to Jeff (I was interviewing for this role, so Jeff was interviewing me), her respect for me went way up: oh, you're getting to talk to Jeff Dean. So that also played a very positive role in my life.

So, Jeff, back to where we are. There's so much hype around AI, and we've made so much progress since those deep learning advances of slightly over a decade ago. How would you describe this moment in AI?

This moment? Yeah. Really, what you've seen over the last decade is a continued expansion of what we imagine computers can do. Ten years ago we wouldn't have thought you could have computers that can see or understand speech nearly as effectively as the models we have today. And you're starting to see these systems comprehend much more complex structures: not just a sentence, but taking in a whole document and being able to summarize it or extract information from it. These are fields that have been worked on for many years, and we've made some progress in some of them. But now look at the ability of these very large models, trained in a very general way across text and images and video and audio, to accomplish general tasks. Prior to that, if you had a machine learning problem, you would generally train a model to do that very narrow, specific thing, and you might go off and gather very specialized data to do it. Now these general models can accomplish many of the tasks that previously would have required a month-long, special-purpose training run; you can just ask the model to do it, and it can. I think that really unveils a new way of thinking about interacting with computers and working with them to do the things we care about.

In fact, I remember distinctly, just four years ago I think, we had organized a research week with students, and Jeff was asked by one of the students: what's the one thing about today's AI models that you don't find satisfactory? And that was exactly the answer you gave: that for every new problem we end up building a new model from scratch, and that's not how we humans learn. So you predicted that at that stage; of course you were already working on some of those things, but it's great to see it coming to fruition. I want to follow up on the flip side: where do you think this technology is going? What is our goal, and where are we heading? What's next?

Yeah, I think you're seeing a very broad spread of machine learning and AI into many, many fields of endeavor. In nearly every field of science, people are thinking about how a learning-based approach, learning from data, could impact a sub-problem in their field, or how a whole bunch of sub-problems could be solved simultaneously and really transform how you think about the work. You see that in chemistry, materials science, chip design, education, healthcare, and every other field.
I think people are excited about the possibility, but they're also careful and wary, because these technologies are not going to solve these problems right out of the box. It really requires domain expertise: understanding the limitations of what these models can do, but also the possibilities. It's also good to look not just at what they can do today, but at what you think they'll be able to do five years from now, because that's where the field of AI and ML is going, and where all the other fields are going to be going: what possibilities get unlocked over the next 5 to 10 years?

Very nice. One of the things I often think about is that we are living in such amazing times. If I look at the evolution, or maybe revolution, of AI and where it has brought us: on one hand we have these LLMs with amazing capabilities, which have taken the world by storm, and on the other hand there are so many fundamental problems with them, like factuality or hallucinations, alignment with human values, reasoning, and so on, that are still unsolved. How do you look at this duality, and do you have any advice for all of the wonderful AI researchers in the audience?

Yeah. I think it's good not to pretend there are no problems, because there are definitely issues, as you pointed out: things like factuality, or bias, or training a model on one data distribution and then applying it in another setting where things are very different, so the model doesn't perform the way you want it to. It's really important that everyone thinking about applying machine learning, both machine learning people and people in other kinds of domains, be thinking about this and understand the rough edges that can show up, the fallibilities of these models. At the same time, there's been a lot of press on things like factuality, and if you look at the models available today, their factuality is actually significantly better than models from even a year or two ago. So there's definite progress in a lot of these areas, but it is by no means perfect today, and continued research is needed on nearly every one of the areas outlined in our AI principles. Google has a set of seven AI principles, and nearly all of them are active areas of research: how can you improve the factuality of models, how can you lessen bias in these models? So it's important to keep working on those areas, but also to ask, despite the shortcomings, what the models can do, and how you can, as you say, maximize the benefit while also working on the shortcomings.

Thanks. Can you say something about the kinds of architectures for these models going forward? Is the current set of architectures here to stay, or do you expect radically new forms in the next few years?

Yeah, let me describe the current state of the world. Right now, with these foundation models, you tend to collect a bunch of different kinds of data and train a model in a month-long, or few-month-long, process to get it to a state where it can do a bunch of things. And that's a very static thing.
You then take the checkpoint of the model that's been trained, and people can go off and do all kinds of things with it. A bunch of different teams at Google take that checkpoint; they might deploy it directly and use it in their product, or they might collect a bit more data and fine-tune the model on this problem or that problem. But the problem is that if you have a hundred teams doing things with these models, the ways in which they apply it don't make it back into that core model. It's train once, then deploy in a bunch of ways. I think we really need to move much more toward an incremental learning kind of system, where everyone is continuously working to improve a single model, and different groups or different people who care about different kinds of functionality can work on it collectively without getting in the way of others who are improving some other aspect of the model. People working on making the model better for medical applications shouldn't interfere with, but should allow, simultaneous improvement in other areas, like machine learning for code completion. If we could have a system where everyone was collectively improving the same model, you could incrementally extend its capabilities rather than doing these monolithic large training runs that then get propagated downstream; I think we'd be in a much better state. So incremental learning, and ways of training so that one new task doesn't interfere with another and doesn't cause catastrophic forgetting of what the model used to be able to do: all of those are really important research areas, I think.

Okay. So, Jeff, we've obviously seen a lot of progress in this space, but it seems like compute and data have become even more important in the current paradigm than ever before. What do you think are the research questions to address in going beyond scale? I'm sure the academics here in particular are very strongly interested in that.

Yeah. It is definitely the case that training on more data, which requires more compute and also a larger model to absorb the knowledge in that much larger training corpus, has helped. But I think it's a little under-appreciated how important algorithmic advances are in improving the capabilities of these models. We've gotten maybe a factor of 10 improvement over the last few years from more computation: more powerful computation per watt, or per dollar, or whatever. But we've probably gotten a factor of 10 as well from algorithmic improvements: things like model architectures, or better filtering mechanisms to identify which data is going to be most helpful to train on, so you don't spend your compute cycles training on things that are not nearly as useful as others. A lot of those kinds of things can be proven out in ways that don't require large amounts of compute. We actually do very small-scale experiments to evaluate, say, the value of this data set, or of this particular kind of data versus others. That's where I think there's huge value in wide and deep exploration of a whole bunch of different approaches and techniques, and when you demonstrate those ideas at even a small scale, a lot of the time they do carry over.

We do experimentation at a few different scales, though, because what's really important is not whether you're improving at this particular scale, but what the trend line looks like as you ramp up in scale. You might have a teeny model, a slightly bigger model, and a slightly bigger one still. If you're above the baseline you're comparing to but the slope looks a lot worse, that's not as interesting as something that's below the baseline but whose slope looks amazing, because as things scale up it might dramatically overtake the baseline. That's an interesting result, even though it might not seem so, since it's below what you're comparing to.
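To make the trend-line comparison described above concrete, here is a minimal illustrative sketch. The model sizes, losses, and extrapolation target below are invented for illustration and are not from any real evaluation pipeline; the point is only that a recipe that loses at every small scale can still project to a win once you fit and extrapolate the scaling trend.

```python
# Illustrative sketch (hypothetical numbers): fit log-log trend lines of eval
# loss vs. model size for a baseline recipe and a candidate recipe, then
# compare slopes and extrapolate to a much larger scale.
import numpy as np

sizes = np.array([1e7, 1e8, 1e9])              # small "proxy" model sizes (parameters)
baseline_loss = np.array([3.10, 2.65, 2.30])   # better at every measured scale
candidate_loss = np.array([3.80, 3.00, 2.38])  # worse today, but steeper slope

def fit_power_law(sizes, losses):
    """Fit loss ~ a * size**b in log-log space; return (a, b)."""
    b, log_a = np.polyfit(np.log(sizes), np.log(losses), 1)
    return np.exp(log_a), b

target_size = 1e11  # the scale we actually care about
for name, losses in [("baseline", baseline_loss), ("candidate", candidate_loss)]:
    a, b = fit_power_law(sizes, losses)
    projected = a * target_size ** b
    print(f"{name}: slope={b:+.3f}, projected loss at {target_size:.0e} params = {projected:.2f}")
```

With these made-up numbers the candidate is worse at every measured scale but projects to a lower loss at the target size, which is exactly the kind of result described as interesting even though it sits below the baseline at small scale.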
Yeah, so plenty of interesting problems to work on. Great. So, as Chief Scientist of Google, I know that you like to do some of the technical work yourself. I'm interested to hear what you're working on yourself that you're particularly excited about.

Yeah, there are a lot of different areas. I'm particularly excited about these more incremental learning techniques, and about different model architectures that can make them possible. I think models that are much sparser than the ones we have today are probably going to be important, and models that you can incrementally add capacity to are going to be important. Identifying which pieces of data are useful to train on also matters: could you have an automatic assessment of different data sets, or different examples, that estimates how much value you get out of them? That seems like a direction we should go in. Right now it's a bit more ad hoc, and you do some small-scale experiments designed by humans, but if you could have more automated approaches in that space, that would be exciting. And just in general, how do you make the iteration time for research faster? You'd like to not have to wait 3 minutes for your machine learning computation to launch and start giving you results; if you could do that in 20 seconds, it would be better for everyone.
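As a closing aside, a minimal sketch of the kind of automated data-valuation experiment alluded to above might look like the following. The synthetic data, the tiny ridge-regression "proxy model," and the with/without value metric are all stand-ins chosen for brevity, not a description of how this is actually done in practice.

```python
# Hypothetical sketch: estimate the value of a candidate dataset by training a
# small, cheap proxy model with and without it and comparing held-out loss.
import numpy as np

rng = np.random.default_rng(0)

def make_dataset(n, noise):
    """Synthetic regression data y = X @ w_true + noise; a stand-in for a real corpus."""
    X = rng.normal(size=(n, 20))
    y = X @ np.ones(20) + rng.normal(scale=noise, size=n)
    return X, y

def held_out_loss(train_sets, test_set):
    """Train a tiny ridge-regression proxy on the given sets; return test MSE."""
    X = np.vstack([s[0] for s in train_sets])
    y = np.concatenate([s[1] for s in train_sets])
    w = np.linalg.solve(X.T @ X + 1e-3 * np.eye(X.shape[1]), X.T @ y)
    X_test, y_test = test_set
    return float(np.mean((X_test @ w - y_test) ** 2))

base = make_dataset(200, noise=1.0)       # existing training mix
candidate = make_dataset(200, noise=0.2)  # candidate dataset being valued
test = make_dataset(500, noise=0.0)       # held-out evaluation set

value = held_out_loss([base], test) - held_out_loss([base, candidate], test)
print(f"estimated value of candidate data (drop in held-out MSE): {value:.4f}")
```

The same with/without comparison could in principle be automated across many candidate corpora and repeated at a few proxy scales, in the spirit of the small-scale experiments and trend-line checks described earlier.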