[Applause] Hi listeners, welcome back to No Priors. Today we're hanging out with Andrej Karpathy, who needs no introduction. Andrej is a renowned researcher, a beloved AI educator and YouTuber, an early team member at OpenAI, the lead for Autopilot at Tesla, and now working on AI for education. We'll talk to him about the state of research, his new company, and what we can expect from AI. Thanks for joining us today — it's great to have you here.

Thank you, I'm happy to be here.

You led Autopilot at Tesla, and now we actually have fully self-driving passenger vehicles on the road. How do you read that, in terms of where we are in the capability set and how quickly we should see increased capability or pervasive passenger vehicles?

Yes — so I spent maybe five years in the self-driving space. I think it's a fascinating space, and I draw a lot of analogies to AGI from self-driving. Maybe that's just because I'm familiar with it, but I kind of feel like we've reached AGI a little bit in self-driving, because there are systems today that you, as a paying customer, can be driven around in. Waymo in San Francisco is of course very common — you've probably taken a Waymo. I've taken it a bunch and it's amazing; it can drive you all over the place, and you're paying for it as a product.

What's interesting is that the first time I took a Waymo was almost exactly a decade ago, 2014 or so. A friend of mine worked there and gave me a demo, and it drove me around the block — it was basically a perfect drive, ten years ago. And it took ten years to go from the demo I was given to a product I can pay for that's at city scale and expanding.

How much of that do you think was regulatory versus technology? When do you think the technology was ready?

I think with the technology, you're just not seeing it in a single demo drive of 30
minutes — you're not running into all the stuff they had to deal with for a decade. So between a demo and a product there's a massive gap, and I think a lot of it was also regulatory, etc. But I do think we've sort of achieved AGI in the self-driving space, in that sense, a little bit. And yet what's really fascinating about it is that the globalization hasn't happened at all. You have a demo you can take in one city, but the world hasn't changed yet, and that's going to take a long time. Going from a demo to the actual globalization of it — there's a big gap there. That's how it relates, I would say, to AGI: I suspect it will look similar when we get AGI.

Staying in the self-driving space for a minute: people think Waymo is ahead of Tesla. I personally think Tesla is ahead of Waymo. I know it doesn't look like that, but I'm still very bullish on Tesla and its self-driving program. The way I put it is: Tesla has a software problem, and Waymo has a hardware problem — and I think software problems are much easier. Tesla has a deployment of all these cars on Earth, at scale, and Waymo needs to get there. So the moment Tesla gets to the point where it can actually deploy this and it actually works, it's going to be really incredible. The latest builds — I just drove yesterday, and it's driving me all over the place; they've made really good improvements very recently.

Yeah, I've been using it a lot recently and it actually works quite well. It did some miraculous driving for me yesterday, so I'm very impressed with what the team is doing.

So I still think Tesla mostly has a software problem and Waymo mostly a hardware problem. Waymo looks like it's winning right now, but when we look in 10 years at who's actually at scale and where most of
the revenue is coming from, I still think Tesla is ahead in that sense.

How far away do you think we are from the software problem turning the corner, in terms of getting to some equivalency? Because, to your point, a Waymo car has a lot of very expensive lidar and other sensors built in so it can do what it does — they help support the software system. If you can just use cameras, which is the Tesla approach, then you effectively get rid of enormous cost and complexity, and you can do it in many different types of cars. When do you think that transition happens?

I mean, in the next few years — I'm hoping something like that. But what's really interesting is that I'm not sure people appreciate that Tesla actually does use a lot of expensive sensors — they just do it at training time. There are a bunch of cars that drive around with lidars, that do a bunch of stuff that doesn't scale, that have extra sensors, that do mapping, and so on. You do all that at training time, and then you distill it into a test-time package that is deployed to the cars and is vision-only. It's like an arbitrage on sensors and expense. I think it's actually a brilliant strategy that isn't fully appreciated, and I think it's going to work out well, because the pixels have the information and the network will be capable of doing it. At training time these sensors are really useful, but I don't think they're as useful at test time.

It seems like the other transition that's happened is a move from a lot of edge-case handling and designed heuristics to end-to-end deep learning. Do you want to talk a little bit about that shift?

Yeah — I think that was always the plan.
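The training-time-sensors idea — an expensive, sensor-rich signal supervising a cheap vision-only model — is essentially knowledge distillation. A minimal sketch under toy assumptions (the linear "network," the synthetic camera features, and the lidar-as-label setup are all illustrative, not Tesla's actual pipeline):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy world: each sample has camera features plus a lidar depth reading.
n = 2000
camera = rng.normal(size=(n, 4))                     # vision-only features
true_w = np.array([1.5, -2.0, 0.5, 0.7])
depth = camera @ true_w + 0.01 * rng.normal(size=n)  # lidar: near-ground-truth depth

# Training time: trust the lidar signal as the label, and fit a
# vision-only "student" against it by gradient descent.
w = np.zeros(4)
lr = 0.1
for _ in range(500):
    pred = camera @ w
    grad = camera.T @ (pred - depth) / n             # MSE gradient
    w -= lr * grad

# Test time: the deployed package predicts depth from cameras alone.
test_cam = rng.normal(size=(100, 4))
err = np.mean((test_cam @ w - test_cam @ true_w) ** 2)
print(f"vision-only depth MSE vs lidar-derived target: {err:.4f}")
```

The design point is that the expensive sensors exist only in the training loop; nothing about the test-time path depends on them.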
From the start at Tesla, the idea was that the neural net would eat through the stack. When I joined there was a ton of C++ code, and now there's much, much less C++ code in the test-time package that runs in the car (there's still a ton of stuff in the back end that we're not talking about). The neural net eats through the system: first it just does detection at the image level, then multiple images give you a prediction, then multiple images over time give you a prediction, and you keep discarding C++ code, until eventually you're just emitting steering commands. So Tesla is eating through the stack. My understanding is that current Waymos don't actually do that — they've tried, but ended up not doing it — though I'm not sure, because they don't talk about it. But I do fundamentally believe in this approach. It's the last piece to fall, if you want to think about it that way, and I do suspect that the end-to-end system for Tesla in, say, 10 years is just a neural net: the video streams into a neural net and commands come out.

You have to build up to it incrementally and do it piece by piece, and all the intermediate predictions and everything we did — I don't think they misled the development; I think they were part of it, for a lot of solid reasons. In end-to-end driving, when you're just imitating humans, you have very few bits of supervision to train a massive neural net — too few bits of signal to train so many billions of parameters. So the intermediate representations help you develop the features and detectors for everything, and then it makes the end-to-end part a much easier problem. So I suspect — although I don't know, because I'm not part of the
team — that there's a ton of pre-training happening so you can do the fine-tuning for end-to-end. Basically, I feel it was necessary to eat through it incrementally, and that's what Tesla has done. I think it's the right approach, and it looks like it's working, so I'm really looking forward to it.

And if you had started end-to-end, you wouldn't have had the data anyway.

That makes sense. Yeah. So you worked on the Tesla humanoid robot before you left. I have so many questions, but starting here: what transfers?

Basically everything transfers, and I don't think people appreciate it.

Okay — that's a big claim.

Cars are basically robots when you actually look at it. I don't think Tesla is a car company — I think that's misleading. This is a robotics company — a robotics-at-scale company, because "at scale" is a whole separate variable. They're not building a single thing; they're building the machine that builds the thing, which is a whole separate thing. So "robotics at scale" is what Tesla is. In terms of the transfer from cars to humanoids, it was not that much work at all. In fact, the early versions of Optimus, the robot, thought it was a car — it had the exact same computer and the exact same cameras. It was really funny, because we were running the car networks on the robot while it was walking around the office, trying to recognize drivable space — which was all just walkable space now, I suppose. It actually kind of generalized a little bit; some fine-tuning was necessary, but it thought it was driving when it was actually moving through an environment. So it's a reasonable way to think of it: it's a robot, and many things transfer — you're just missing, for example, actuation and action data.

Yeah, you definitely miss some components.

And the other part I would say is that so much
transfers. The speed with which Optimus got started was, to me, very impressive: the moment Elon said "we're doing this," people just showed up with all the right tools, and all the stuff appeared so quickly — the CAD models, the supply-chain work. I just felt like, wow, there's so much in-house expertise for building robotics at Tesla. It's all the same tools — like Transformers the movie, they're just being reconfigured and reshuffled, but it's the same thing. You need all the same components and have to think about all the same kinds of stuff, on the hardware side, on the scale side, and also on the brains. For the brains there was also a ton of transfer — not just the specific networks, but the whole approach, the labeling team and how it coordinates, the approaches people are taking. There's just a ton of transfer.

What do you think the first application areas for humanoid robots, or human-form factors, will be? A lot of people have this vision of it doing your laundry, etc.

I think that will come late. I don't think B2C is the right starting point, because I don't think we can have a robot crush grandma, roughly speaking — there's too much legal liability. I don't think that's the right start; it's just going to fall over or something, you know. These things are not perfect yet and require some amount of work. So I think the best customer is yourself first, and I think Tesla is probably going to do this — I'm very bullish on Tesla, if people can't tell. The first customer is yourself: you incubate it in the factory, doing maybe a lot of material handling, etc. That way you don't have to create contracts with third parties — that's all really heavy, there are lawyers involved, etc. You incubate it, and then I think you go B2B second, to other
companies that have massive warehouses: we'll do material handling, contracts get drafted, fences get put around things, all that. Then, once you've incubated it in companies, I think that's when you start to go into B2C applications. I do think we'll see B2C robots too — Unitree and others are starting to come out with robots that I really want.

I got one.

You did? Okay.

Yeah, the G1.

So I'll probably buy one of those, and there will probably be an ecosystem of people building on those platforms too. But in terms of what wins at scale, I'd expect that kind of approach: in the beginning a lot of material handling, then going toward more and more specific tasks. One I'm really excited about is the Nat Friedman challenge of the leaf blower: I would love for an Optimus to tiptoe down the street and pick up individual leaves so that we don't need leaf blowers. I think this will work, and it's an amazing task, so I'd hope it's one of the first applications.

Even raking — that should work too. Just very quietly.

Yeah, just quiet raking.

That's cute. I mean, they do actually have a machine that works for that; it's just not a humanoid. Can we talk about the humanoid thesis for a second? The simplest version of it is: the world is built for humans, you build one set of hardware, and the right thing to do is build a model that can do an increasing set of tasks on that hardware. There's another camp that believes humans are not optimal for any given task — you could make them stronger or bigger or smaller or whatever — so why shouldn't we do superhuman things? How do you think about this?

I think people may be under-appreciating the complexity of the fixed cost that goes into any single platform. I think there's a large cost you're
paying for any single platform, so I think it makes a lot of sense to centralize it and have a single platform that can do all the things. The humanoid aspect is also very appealing because people can teleoperate it very easily — it's a data-collection advantage that's extremely helpful and, I think, usually overlooked. There's of course the aspect you mentioned, the world being designed for humans, which I think is also important. We'll have some variations on the humanoid platform, but there is a large fixed cost to any platform.

One last dimension of it: you benefit a ton from the transfer learning between the different tasks. In AI you really want the single neural net that is multitasking across lots of things — that's where you're getting all the intelligence and capability from. That's also why language models are so interesting: you have a single regime, the text domain, multitasking all these different problems, all sharing knowledge with each other, all coupled in a single neural net. You want that kind of platform — you want all the data you collect for leaf-picking to benefit all the other tasks. If you build a special-purpose thing for any one task, you're not going to benefit from all the transfer between the other tasks, if that makes sense.

Yeah. There's one argument, though — the G1 is like 30 grand, right? It seems hard to build a very capable humanoid robot under a certain BOM, and if you wanted to, you could put an arm on wheels that can do things. Maybe there are cheaper approaches to a general platform at the beginning. Does that make sense to you?

Cheaper approaches to a general platform, from a hardware perspective? Yeah, I think that makes
sense — you put wheels on it instead of feet, etc. I do wonder, though, if it takes you down a local minimum a little bit. I just feel like "pick a platform and make it perfect" is the pretty good long-term bet. And the other thing, of course, is that it will be familiar to people, and I think people will understand it — maybe you want to talk to it. I feel the psychological aspect also favors the humanoid platform — unless people are scared of it and would actually prefer a platform that's more abstract. But then, if it's just a monster doing stuff, I don't know if that's more comforting.

It's interesting that the other form factor for the Unitree is a dog, right? It's almost friendlier, more familiar.

Yeah, but then people watch Black Mirror and suddenly the dog flips to a scary thing. So it's hard to think through. I just think psychologically it will be easy for people to understand what's happening with a humanoid.

What do you think is missing, in terms of technological milestones, for substantiating this future for robotics — the humanoid robot or anything else human-form?

I don't know that I have a really good window into it. I do think it's interesting that in the human form factor, for the lower body, for example, I don't know that you want to do imitation learning from demonstration — for the lower body it's a lot of inverted-pendulum control and things like that. It's the upper body that needs a lot of teleoperation, data collection, end-to-end, etc. So everything becomes very hybrid in that sense, and I don't know how those systems interact.

When I talk to people working on this, a lot of what they focus on is actuation and manipulation — dexterous manipulation and things like that.
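The "inverted-pendulum control" mentioned for the lower body is classical feedback control rather than learning from demonstration. A minimal sketch — a linearized pendulum balanced by a hand-tuned PD controller; the gains and physical constants here are illustrative assumptions, not from any real robot:

```python
# Linearized inverted pendulum: theta'' = (g/L) * theta + u,
# where theta is the lean angle (rad) and u is control torque per unit inertia.
g, L, dt = 9.81, 1.0, 0.01

# PD gains chosen by hand (for illustration) to drive theta back to 0.
kp, kd = 40.0, 10.0

theta, omega = 0.2, 0.0   # start leaning 0.2 rad, no angular velocity
for _ in range(1000):     # simulate 10 seconds with Euler integration
    u = -kp * theta - kd * omega    # feedback torque
    alpha = (g / L) * theta + u     # angular acceleration
    omega += alpha * dt
    theta += omega * dt

print(f"lean angle after 10 s: {theta:.6f} rad")
```

The closed loop is a damped oscillator, so the lean angle decays to zero — no data collection needed, which is the contrast Karpathy is drawing with the upper body.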
Yeah. I do expect that in the beginning it's a lot of teleoperation — getting things off the ground, imitating, getting something that works 95% of the time — and then talking about human-to-robot ratios and gradually having people who are supervisors of robots instead of doing the tasks directly. All of this is going to happen over time, pretty gradually. I don't know that there are any individual impediments I'm really familiar with; I just think it's a lot of grunt work. The tools are available: Transformers are this beautiful blob of tissue, you can get it to do arbitrary tasks, and you just need the data — you need to put it in the right form, train it, experiment with it, deploy it, iterate on it. That's just a lot of grunt work. I don't have a single individual thing that's holding us back technically.

Where are we in the state of large-blob research?

Large-blob research? We're in a really good state. I'm not sure it's fully appreciated, but the Transformer is much more amazing than just another neural net — it's an amazing neural net, extremely general. For example, when people talk about scaling laws in neural networks, the scaling laws are to a large extent a property of the Transformer. Before the Transformer, people were playing with LSTMs and stacking them, etc., and you don't actually get clean scaling laws — the thing doesn't actually train, it doesn't actually work. The Transformer was the first thing that just scales: you get scaling laws and everything makes sense. It's a general-purpose trainable computer — I think of it as a computer, but a differentiable one. You can give it inputs and outputs, billions of them, and train it with backpropagation, and it arranges itself into a
thing that does the task. So I think it's actually kind of a magical thing we've stumbled on in algorithm space. There were a few individual innovations that went into it: the residual connections, which already existed; the layer normalizations, which needed to slot in; the attention block; and the absence of saturating nonlinearities like tanh, which kill gradient signals — those are not present in the Transformer. So there were four or five innovations that all existed and were put together into this Transformer — that's what Google did with their paper — and this thing actually trains, and suddenly you get scaling laws, and suddenly you have this piece of tissue that just trains, to a very large extent. It was a major unlock.

You feel like we are not near the limit of that unlock? Because there's a discussion, of course, of the data wall and how expensive another generation of scale would be. How do you think about that?

That's where it gets interesting: I don't think the neural network architecture is fundamentally holding us back anymore. It's not the bottleneck. Before the Transformer it was the bottleneck, but now it's not. Now we're talking much more about what the loss function is and what the dataset is — those have become the bottlenecks. It's not about the general piece of tissue that reconfigures itself based on what you want it to be, so that's where a lot of the activity has moved. That's why a lot of the companies applying this technology aren't thinking about the architecture. Take the Llama release: the Transformer hasn't changed that much — we've added RoPE, the rotary relative positional encodings.
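The handful of pieces named above — pre-norm residual connections, layer normalization, attention with a causal mask, a non-saturating GELU-style MLP, plus the rotary position step — can be sketched in plain NumPy. The shapes, the single-head setup, and the initialization are illustrative assumptions, not any production architecture:

```python
import numpy as np

rng = np.random.default_rng(0)
T, D = 8, 16  # sequence length, model width (toy sizes)

def layer_norm(x):
    # Normalize each token vector to zero mean / unit variance.
    m, v = x.mean(-1, keepdims=True), x.var(-1, keepdims=True)
    return (x - m) / np.sqrt(v + 1e-5)

def rope(x):
    # Rotary positional encoding: rotate feature pairs by a position-
    # dependent angle so attention scores depend on relative position.
    half = x.shape[-1] // 2
    pos = np.arange(x.shape[0])[:, None]
    freq = 1.0 / (10000 ** (np.arange(half) / half))
    ang = pos * freq
    x1, x2 = x[:, :half], x[:, half:]
    return np.concatenate([x1 * np.cos(ang) - x2 * np.sin(ang),
                           x1 * np.sin(ang) + x2 * np.cos(ang)], -1)

def softmax(a):
    a = a - a.max(-1, keepdims=True)
    e = np.exp(a)
    return e / e.sum(-1, keepdims=True)

def block(x, p):
    # Pre-norm attention sublayer with a residual connection.
    h = layer_norm(x)
    q, k, v = rope(h @ p["wq"]), rope(h @ p["wk"]), h @ p["wv"]
    mask = np.triu(np.full((T, T), -1e9), 1)  # causal mask
    x = x + softmax(q @ k.T / np.sqrt(D) + mask) @ v
    # Pre-norm MLP sublayer; GELU is a smooth, non-saturating nonlinearity.
    h = layer_norm(x) @ p["w1"]
    gelu = 0.5 * h * (1 + np.tanh(0.79788456 * (h + 0.044715 * h**3)))
    return x + gelu @ p["w2"]

params = {k: rng.normal(scale=0.02, size=s) for k, s in
          [("wq", (D, D)), ("wk", (D, D)), ("wv", (D, D)),
           ("w1", (D, 4 * D)), ("w2", (4 * D, D))]}
out = block(rng.normal(size=(T, D)), params)
print(out.shape)  # same shape in, same shape out — blocks stack cleanly
```

The residual "x + ..." structure is what lets these blocks be stacked dozens of layers deep without the gradient signal dying — the property the transcript credits for clean scaling.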
That's the major change; everything else doesn't really matter too much — it's plus 3% from a few small things. RoPE is really the only thing that's been slotted in; that's how the Transformer has changed over the last five years or so. There hasn't been that much innovation there. Everyone just takes it for granted — "let's train it" — and then everyone innovates mostly on the dataset and the details of the loss function. That's where all the activity has gone.

Right, but what about the argument that, in that domain, it was easier when we were taking internet data — and we're out of internet data? So the questions are really around synthetic data or more expensive data collection.

I think that's a good point, and that's where a lot of the activity in LLMs is now. Internet data is not the data you want for your Transformer — it's a nearest neighbor to it that actually gets you really far, surprisingly. Internet data is a bunch of web pages; what you want is the inner-thought monologue of your brain.

The trajectories in your brain.

The trajectories in your brain as you're doing problem solving. If we had a billion of those, AGI is here, roughly speaking — to a very large extent. And we just don't have that. So where a lot of the activity is now: the internet data actually gets you really close, because the internet happens to have enough reasoning traces in it, and a bunch of knowledge, and the Transformer makes it work okay. A lot of the activity now is around refactoring the dataset into these inner-monologue formats, and there's a ton of synthetic data generation that's helpful for that. What's also interesting is the extent to which the current models are helping us create the next-generation models — it's kind of a staircase of
improvement.

How far does synthetic data get us? Because to your point, each model helps you train the subsequent model — at least creating tools for it, data labeling, and maybe synthetic data. How important is the synthetic data piece?

Incredibly — I think it's the only way we can make progress; we have to make it work. With synthetic data you just have to be careful, because these models silently collapse — that's one of the major issues. If you go to ChatGPT and ask it for a joke, you'll notice it only knows like three jokes. It gives you one joke most of the time, sometimes three. The models are collapsed, and it's silent: when you look at any single output you're just seeing one example, but when you look at the distribution you notice it's not very diverse — it has silently collapsed. When you're doing synthetic data generation, this is a problem, because you actually really want the entropy — the diversity and richness — in your dataset. Otherwise you get collapsed datasets; you can't see it in any individual example, but the distribution has lost a ton of entropy and richness, so it silently gets worse. That's why you have to be very careful to maintain the entropy in your dataset, and there are a ton of techniques for that.

As an example, someone released the Persona dataset: a dataset of a billion personalities — humans with backgrounds, like "I'm a teacher" or "I'm an artist, I live here, I do this" — little paragraphs of fictitious human background. When you do synthetic data generation, instead of just saying "complete this task in this way," you also say,
"Imagine you're describing it to this person." You put in that information, and now you're forcing the model to explore more of the space, and you're getting some entropy. So you have to be very careful to inject the entropy and maintain the distribution — that's the hard part, and I think people maybe aren't sufficiently appreciating it. So basically: synthetic data is absolutely the future. We're not going to run out of data, is my impression — you just have to be careful.

What do you think we're learning about human cognition from this research? One could argue that figuring out the shape of the reasoning traces we want, for example, is instructive for actually understanding how the brain works.

I would be careful with those analogies — it's a very different kind of thing — but I do think there are some analogies you can draw. As an example, I think Transformers are actually better than the human brain in a bunch of ways; they're actually a much more efficient system, and the reason they don't work as well as the human brain is mostly a data issue, roughly speaking — that's the first-order approximation. For example, a Transformer memorizing sequences is so much better than a human: if you give it a sequence and do a single forward-backward pass on it, then give it the first few elements, it will complete the rest of the sequence — it memorized it, and it's so good at it. If you gave a human a single presentation of a sequence, there's no way they'd remember it. So I do think there's a good chance that gradient-based optimization — the forward-backward update we do all the time to train neural nets — is actually more efficient than the brain in some ways. These models are better; they're just not yet ready to shine.
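The persona-conditioning trick above can be sketched as prompt templating plus an entropy check on the output distribution. The personas, the prompts, and the two stand-in "models" below are all illustrative assumptions — the point is only that conditioning on sampled context spreads out the distribution, which is measurable as entropy:

```python
import random
from collections import Counter
from math import log2

random.seed(0)

# Hypothetical persona snippets, in the spirit of a persona dataset.
personas = ["a teacher", "an artist", "a nurse", "a farmer", "a pilot"]

def collapsed_model(prompt):
    # Stand-in for a collapsed LLM: ignores the prompt and returns
    # one of very few canned outputs (the "three jokes" problem).
    return random.choice(["joke about atoms", "joke about scarecrows"])

def persona_model(prompt):
    # Stand-in for persona-conditioned generation: the sampled persona
    # steers the output, so distinct personas yield distinct samples.
    persona = prompt.split("to ")[-1]
    return f"joke tailored to {persona}"

def entropy(samples):
    # Shannon entropy (bits) of the empirical output distribution.
    counts = Counter(samples)
    n = len(samples)
    return -sum(c / n * log2(c / n) for c in counts.values())

plain = [collapsed_model("tell me a joke") for _ in range(1000)]
varied = [persona_model(f"tell me a joke, describing it to {random.choice(personas)}")
          for _ in range(1000)]

print(f"entropy without personas: {entropy(plain):.2f} bits")
print(f"entropy with personas:    {entropy(varied):.2f} bits")
```

Tracking a statistic like this over a synthetic corpus — rather than eyeballing individual samples — is one way to catch the "silent" part of silent collapse.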
But in a bunch of cognitive aspects, with the right inputs, I think they'll come out better.

That's generically true of computers for all sorts of applications, right — like memory, to your point.

Yeah, exactly. And human brains just have a lot of constraints: the working memory is very small, whereas Transformers have a much bigger working memory, and that will continue to be the case. They're much more efficient learners. The human brain functions under all kinds of constraints — it's not obvious that the human brain is doing backpropagation; it's not obvious how that would even work. It's a very stochastic, dynamic system with all these constraints it works under, ambient conditions, etc. So I do think what we have is actually potentially better than the brain — it's just not there yet.

How do you think about human augmentation with different AI systems over time? Do you think that's a likely direction, or unlikely?

Augmentation of people with AI models? Oh, of course — but in what sense, maybe?

In general, absolutely. There's an abstract version where you use it as a tool — the external version — and then there's the merger scenario a lot of people end up talking about.

Yeah. I mean, we're already kind of merging. There's the I/O bottleneck, but for the most part, these models are at your fingertips.

That's a little bit different, because people have been making that argument for, I think, 40 or 50 years — that technological tools are just extensions of human capabilities.

Right — the computer is the bicycle for the human mind.

Exactly. But there's a subset of the AI community that thinks that, for example, the way we subsume some potential conflict with future AI or something else would be through some form of —

Yeah, like the Neuralink pitch, etc.

Exactly.

Yeah, I don't
know what this merger looks like yet, but I can definitely see that you want to decrease the I/O to tool use. I see this as kind of an exocortex — we're building on top of our neocortex, and it's just the next layer. It happens to be in the cloud, etc., but it is the next layer of the brain.

Yeah — Accelerando, the book from the early 2000s, has a version of this where basically everything is substantiated in a set of computationally attached goggles you wear, and if you lose them you feel like you're losing part of your persona or memory.

I think that's very likely, yeah. Today the phone is already almost that, and I think it's going to get "worse": when you put your tech away from you, you're just a naked human in nature — you lose part of your intelligence, and it's very anxiety-inducing. A very simple example is maps: a lot of people, I've noticed, can't actually navigate their city very well anymore, because they're always using turn-by-turn directions. And if we get, for example, a universal translator — which I don't think is too far away — you lose the ability to speak to people who don't speak English the moment you put your stuff away.

I'm very comfortable repurposing that part of my brain to do further research.

I don't know if you saw the video of the kid with a magazine, trying to swipe on it. What's fascinating to me is that this kid doesn't understand what comes with nature and what's technology on top of nature, because it's been made so transparent. I think this might look similar: people will just start assuming the tools, and when you take them away, people won't know what's technology and what's not. If you're wearing something that's always translating for you, then maybe the basic cognitive abilities underneath may not exist anymore. I think
that's where it goes — by nature, we're going to specialize. You can't understand people who speak Spanish — like, what the hell? Or, when you go to Disney, all the objects are alive, and I think we're potentially going to come to that kind of world: why can't I talk to things? Already today you can talk to Alexa and ask her for things and so on.

Yeah, I've seen some toy companies like that — basically trying to embed an LLM in toys that can interact with a child.

Isn't it strange that when you go to a door you can't just say "open"? Like, what the hell? Another favorite example — I don't know if you saw Demolition Man or I, Robot — people make fun of the idea that you can't just talk to things.

If we're talking about an exocortex, that feels like a pretty fundamentally important thing to democratize access to. How do you think about the current market structure of LLM research — there's a small number of large labs that actually have a shot at the next generation of training — and how that translates to what people have access to in the future?

What you're alluding to is maybe the state of the ecosystem. We have kind of an oligopoly of a few closed platforms, and then we have an open platform that is kind of behind — Meta's Llama, etc. — and this kind of mirrors the open-source ecosystem. I do think that when we start to think of this stuff as an exocortex — there's a saying in crypto, "not your keys, not your tokens" — is it the case that it's "not your weights, not your brain"?

That's interesting, because a company is effectively controlling your exocortex, and part of it starts to feel invasive.

If this is my exocortex, I think people will care much more about ownership. Yes — you realize
you're renting your brain. It seems strange to rent your brain. The thought experiment is: are you willing to give up ownership and control to rent a better brain? Because I am. Yeah, so I think that's the trade-off. We'll see how that works, but maybe it's possible to, by default, use the closed versions because they're amazing, but have a fallback in various scenarios — and I think that's kind of the way things are shaping up even today, right? When APIs go down on some of the closed-source providers, people start to implement fallbacks to the open ecosystems that they fully control, and they feel empowered by that. So maybe that's just what the extension will look like for the brain: you fall back on the open-source stuff should anything happen, but most of the time you actually use the closed versions. So it's quite important that the open-source stuff continues to progress? I think so, 100% — and this is not an obvious point, or something people necessarily agree on right now, but I think so, 100%. I guess one thing I've been wondering about a little bit is: what is the smallest performant model that you can get to, in some sense — in parameter size, or however you want to think about it? I'm a little bit curious about your view, because you've thought a lot about both distillation and small models. I think it can be surprisingly small, and I do think the current models are wasting a ton of capacity remembering stuff that doesn't matter — like, they remember SHA hashes, they remember ancient things — because the dataset is not curated the best. Yeah, exactly. I think this will go away, and we just need to get to the cognitive core. I think the cognitive core can be extremely small — it's just this thing that thinks, and if it needs to look up information, it knows how to use different tools. Is that like three billion parameters? Is that 20 billion parameters? I think even a billion suffices. We'll
probably get to that point, and the models can be very, very small. I think the reason they can be very small is, fundamentally, that distillation works — maybe that's the main thing I would say: distillation works surprisingly well. Distillation is where you get a really big model, or a huge amount of compute or something like that, supervising a very small model, and you can actually stuff a lot of capability into that very small model. Is there some sort of mathematical representation of that, or some information-theoretic formulation? Because it almost feels like you should be able to calculate that. Maybe one way to think about it is to go back to the internet dataset, which is what we're working with: the internet is like 0.001% cognition and 99.99% information, and I think most of it is not useful to the thinking part. I guess maybe another way to frame the question is: is there a mathematical representation of cognitive capability relative to model size? How do you capture cognition in terms of, you know, here's the min or max relative to what you're trying to accomplish? And maybe there's no good way to represent that. So you think maybe a billion parameters gets you a good cognitive core? I think probably, right — I think even one billion is too much. I don't know, we'll see. It's very exciting, if you think about the question of an edge device versus the cloud, and also the raw cost of using the model and everything. Yeah, it's very exciting, right — at less than a billion parameters I have my exocortex on a local device as well. Yeah, and then probably it's not a single model, right? It's interesting to think about how this will actually play out, because I do think you want to benefit from parallelization; you don't want to have a sequential process, you want
to have a parallel process. I think companies, to some extent, are also kind of a parallelization of work — but there's a hierarchy in a company, because that's one way you handle the information processing and the reductions that need to happen within an organization. So I think we'll probably end up with something like companies of LLMs. It's not unlikely to me that you have models of different capabilities specialized to various unique domains — maybe there's a programmer, etc. — and it will actually start to resemble companies to a very large extent: you have the programmer and the program manager and similar kinds of roles of LLMs working in parallel, coming together and orchestrating computation on your behalf. So maybe it's not correct to think of a single model — it's more like a swarm, like an ecosystem, a biological ecosystem where you have specialized roles and niches. I think it'll start to resemble that: you have automatic escalation to other parts of the swarm depending on the difficulty of the problem, and the CEO is a really brilliant cloud model, but the workhorse can be a lot cheaper — maybe even open-source models and whatnot. And my cost function is different from your cost function. Yeah, so that could be interesting. You left OpenAI, you're working on education, and you've always been an educator — why do this? I would start with: I've always been an educator, and I love learning and I love teaching, so it's just a space I've been very passionate about for a long time. The other thing is, one macro picture that's kind of driving me: I think there's a lot of activity in AI, and most of it is to kind of replace or displace people — it's in the vein of sidelining people. But I'm always more interested in anything that empowers people. On a high level, I feel like I'm team human, and I'm interested in
things that AI can do to empower people. I don't want a future where people are kind of left on the side by automation — I want people to be in a very empowered state, and I want them to be amazing, even much more amazing than today. Another aspect I find very interesting is: how far can a person go if they have the perfect tutor for all the subjects? I think people could go really far if they had the perfect curriculum for anything. We see that with, you know, some rich people who have tutors — they do actually go really far. So I think we can approach that with AI, or even go past it. There's very clear literature on that, actually, from the '80s, right, on one-on-one tutoring — people get one standard deviation better, or is it two? Yeah, it's the Bloom stuff. Exactly, there's a lot of really interesting precedent on that. How do you actually see that substantiating through the lens of AI, or what are the first types of products that will really help with that? Because there are books like The Diamond Age, where they talk about the Young Lady's Illustrated Primer and all that kind of stuff. I would say I'm definitely inspired by aspects of it. In practice, what I'm doing currently is trying to build a single course, and I want it to be just the course you would go to if you want to learn AI. The thing is, I've already taught courses — I taught CS231n at Stanford, which was the first deep learning class there and was pretty successful — but the question is, how do you actually really scale these classes? How do you make it so that your target audience is maybe the 8 billion people on Earth, who are also speaking different languages and are at all different capability levels, etc.? A single teacher doesn't scale to that audience, and so the question is, how do you use AI to do the scaling of a
really good teacher? The way I'm thinking about it is that the teacher does a lot of the course creation and the curriculum, because at current AI capability I don't think the models are good enough to create a good course, but I think they're good enough to become the front end to the student and interpret the course to them. So basically the teacher doesn't go to the people — the teacher is not the front end anymore. The teacher is on the back end designing the materials and the course, and the AI is the front end: it can speak all the different languages, and it takes you through the course. Should I think of that as a TA-type experience, or is that not a good analogy here? That is one way I'm thinking about it — it's an AI TA. I'm mostly thinking of it as this front end to the student; it's the thing that's actually interfacing with the student and taking them through the course. I think that's tractable today, it just doesn't exist, and I think it can be made really good. Then over time, as the capability increases, you would potentially refactor the setup in various ways. I like to find things where the AI capability is today, and having a good model of it — I think a lot of companies maybe don't quite understand intuitively where the capability is today, and they end up building things that are too far ahead of what's available, or maybe not ambitious enough. So I do think this is kind of a sweet spot of what's possible, and also really interesting and exciting. I want to go back to something you said that I think is very inspiring, especially coming from your background and understanding of where exactly we are in research, which is essentially that we do not know what the limits of human performance are, from a learning perspective, given much better tooling. I think there's a very easy analogy: we just
had the Olympics like a month ago, right? The very best mile time — or pick any sport — is much better today than it was, say, 10 years ago, putting aside performance-enhancing drugs, just because you start training earlier, you have a very different program, we have much better scientific understanding, we have technique. The fact that you believe we can get much further as humans if we start with the tooling and the curriculum is amazing. Yeah, I think we haven't even scratched what's possible at all. There are basically two dimensions to it: number one is the globalization dimension — I want everyone to have really good education — but the other one is, how far can a single person go? I think both of those are very interesting and exciting. Usually when people talk about one-on-one learning, they talk about the adaptive aspect of it, where you're challenging a person at the level that they're at. Do you think you can do that with AI today, or is that something for the future — is it more that today it's about reach and multiple languages? Some things are low-hanging fruit — for example, different languages are super low-hanging fruit. I think the current models are actually really good at translation, basically, and can take the material and translate it on the spot, so I think a lot of things are low-hanging fruit. This adaptability to a person's background, I think, is not the lowest-hanging fruit, but I don't think it's too high up or too far away. It is something you definitely want, because not everyone is coming in with the same background. Also, what's really helpful is, if you're familiar with some other disciplines from the past, it's really useful to make analogies to the things you know — that's extremely powerful in education. So that's definitely a dimension you want to take advantage of, but I think that starts to get to the point where it's not obvious and needs
some work. I think the easy version of it is not too far off, where you can imagine just prompting the model — like, "oh hey, I know physics," or "I know this" — and you'd probably get something. But what I'm talking about is something that actually works, not something you can demo and that works sometimes. I just mean it actually, really works, in the way a person would. Yeah, and that's the reason I was asking about adaptability, because people also learn at different rates, or they find certain things challenging that others don't, or vice versa, and so it's a little bit of: how do you modulate relative to that context? I guess you could have some reintroduction of what the person is good or bad at into the model over time. That's the thing with AI — I feel like a lot of these capabilities are kind of "a prompt away," so you always get demos, but do you actually get a product, you know what I mean? So in this sense I would say the demo is near, but the product is far. One thing we were talking about earlier, which I think is really interesting, is the sort of lineages that happen in the research community, where you come from certain labs and everybody gossips about being from each other's labs. I think a very high proportion of Nobel laureates actually used to work in a former Nobel laureate's lab, so there's some propagation of — I don't know if it's culture or knowledge or branding or what. In an AI-education-centric world, how do you maintain lineage, or does it not matter? How do you think about those aspects of propagation of network and knowledge? I don't actually want to live in a world where lineage matters too much, right? So I'm hoping that AI can help destroy that structure a little bit. It feels like gatekeeping by some finite, scarce resource — like, oh, there's a finite number of people who have this lineage, etc. — so I'm hoping it can destroy that. It's definitely one piece,
like actual learning; one piece, pedigree, right? Yeah. Well, it's also the aggregation — it's a cluster effect, right? Why is all of, or much of, the AI community in the Bay Area, or why is most of the fintech community in New York? Yeah, and I think a lot of it is also just that you're clustering really smart people with common interests and beliefs, and then they propagate from that common core and share knowledge in an interesting way. Would you agree a lot of that behavior has shifted online, to some extent, particularly for younger people? I think one aspect of it is the educational aspect: if you're part of a community today, you're getting a ton of education and apprenticeship, etc., which is extremely helpful and gets you to an empowered state in that area. The other piece of it is the cultural aspect: what you're motivated by and what you want to work on — what does the culture prize, what do they put on a pedestal, what do they kind of worship, basically. In the academic world, for example, it's the h-index — everyone cares about the h-index, the number of papers you publish, etc. I was part of that community and I saw that, and now I've come to different places, and there are different idols in all the different communities. I think that has a massive impact on what people are motivated by, where they get their social status, and what actually matters to them. I was also part of different communities growing up — Slovakia, a very different environment; Canada, also a very different environment. What mattered there? Hockey. I would say, as an example, in Canada I was at the University of Toronto, and Toronto — I don't think it's a very entrepreneurially-pilled environment. It doesn't even occur to you that you should be starting companies. It's not something people are doing, you don't know friends who are
doing it, you don't know that you're supposed to be looking up to it, people aren't reading books about all the founders and talking about them — it's just not a thing you aspire to or care about. What everyone is talking about is: where are you getting your internship, where are you going to work afterwards? It's just accepted that there's a fixed set of companies that you're supposed to pick from and align yourself with one of them, and that's what you look up to, or something like that. So these cultural aspects are extremely strong, and maybe actually the dominant variable, because I almost feel like today the education aspect is already the easier one — a ton of stuff is already available, etc. So I think mostly it's the cultural aspect, which community you're part of. On this point, one thing you and I were talking about a few weeks ago — and I think you also posted online about this — is that there's a difference between learning and entertainment, and learning is actually supposed to be hard. I think it relates to this question of status — status is a great motivator, like who the idol is. How much do you think you can change in terms of motivation through systems like this, if that's a blocking factor? Are you focused on giving people the resources such that they can get as far as possible for their own capability — further than at any other point in history, which is already inspirational — or do you actually want to change how many people want to learn, or at least bring themselves down that path? "Want" is a loaded word. I would say I want to make it much easier to learn, and then maybe it's possible that people still don't want to learn. I mean, today, for example, people want to learn for practical reasons, right — they want to get a job, etc., which makes total sense. So in a pre-AGI society, education is useful, and I think people will be motivated by that, because
they're climbing up the ladder economically, etc. In a post-AGI society, I think education is entertainment to a much larger extent. Including successful outcomes from education, right — not just letting the content wash over you? Yes, I think so. Outcomes being understanding, learning, being able to contribute new knowledge, or however you define it. I think it's not an accident that if you go back 200, 300 years, the people who were doing science were nobility or people of wealth. We will all be nobility, learning with Andrej. Yeah, I see it very much as equivalent to your quote earlier: learning something is kind of like going to the gym, but for the brain, right? It feels like going to the gym. I mean, going to the gym is fun — people like to lift, etc. Some people don't go to the gym. No, no — some people do, but it takes effort. Yeah, it takes effort, but it's also kind of fun, and you also have a payoff of feeling good about yourself in various ways, right? I think education is basically equivalent to that. So that's what I mean when I say education should not be "fun," etc. — it is kind of fun, but it's a specific kind of fun, I suppose. I do think that maybe in a post-AGI world, what I would hope happens is that people actually do go to the gym a lot, not just physically but also mentally, and it's something we look up to — being highly educated. Can I ask you one last question about Eureka, just because I think it would be interesting to people: who is the audience for the first course? The audience for the course — I'm mostly thinking of this as an undergrad-level course, so if you're doing an undergrad in a technical area, I think that would be kind of the ideal audience. I do think that what we're seeing now is this antiquated concept of education where you go through school, and then you graduate and go
to work, right? Obviously this will totally break down, especially in a society that's turning over so quickly — people are going to come back to school a lot more frequently as the technology changes very quickly. So it is kind of undergrad level, but I would say anyone at that level, at any age, is in scope. I think it will be very diverse in age, as an example, but it is mostly for people who are technical and actually want to understand it to a good amount. When can they take the course? I was hoping it would be late this year. I do have a lot of distractions that are piling on, but I think probably early next year is the timeline. I'm trying to make it very, very good, and it just takes time to get there. I have one last question, actually, that's pseudo-related to that: if you have little kids today, what do you think they should study in order to have a useful future? There's a correct answer in my mind, and the correct answer is mostly, I would say, math, physics, CS kinds of disciplines. The reason I say that is because I think they help with just thinking skills — it's the best thinking-skill core, in my opinion. Of course, I have a specific background, etc., so I would think this, but that's just my view on it. Taking physics classes and all these other classes shaped the way I think, and I think it's very useful for problem solving in general. So if we're in this pre-AGI world, this is going to be useful; post-AGI, you still want empowered humans who can function in any arbitrary capacity. I just think this is the correct answer for people and what they should be doing and taking — it's either useful or it's good — so I just think it's the right answer. I think a lot of the other stuff you can tack on a bit later, but the critical period, where people have a lot
of time and a lot of attention, I think should be mostly spent on these kinds of simple, manipulation-heavy tasks and workloads — not memory-heavy tasks and workloads. I did a math degree, and I felt like there was a new groove being carved into my brain as I was doing it — and it's a harder groove to carve later. I would of course put in a bunch of other stuff as well — I'm not opposed to all the other disciplines, etc., and I think it's actually beautiful to have a large diversity of things — but I do think 80% of it should be something like this. We're not efficient memorizers compared to our tools. Thank you for doing this, so much fun. Great to be here. Find us on Twitter at @NoPriorsPod. Subscribe to our YouTube channel if you want to see our faces. Follow the show on Apple Podcasts, Spotify, or wherever you listen — that way you get a new episode every week. And sign up for emails or find transcripts for every episode at no-priors.com.