So, what is in my mind these days: the different scenarios that have been discussed that could lead to catastrophic outcomes, either because of humans using very powerful AI, or because we lose control to AIs that have malicious goals. How could that even be possible? I think a lot about that, and we need to understand these things so that we can maybe ignore them if they're non-existent, or mitigate them otherwise. Another big idea that is in my mind these days is that we're all focusing on agency as the path to intelligent machines; I'd like to suggest that there's an alternative.

Many people are saying, "Oh, we can't slow down to take care of safety, because that would allow the Chinese to leap forward," and the Chinese are thinking the same thing, of course, so you can see that it's a dangerous game. But there's also a real danger that other countries will do something dangerous, and until enough of the leading countries understand the existential risks, it's going to be difficult to negotiate treaties. And then, for those treaties to work, there will be a need for verification technology. Like: okay, we don't trust each other, because the same AGI could be used as a weapon or to build new weapons, right? So how do I know that, behind my back, you're not actually using your AGI for something that would be bad for me?

If we were to freely share the scientific and engineering advances in AI, there would be no moat; it would be quickly eaten up. But of course that's not the reality. The reality is we're continuing to accelerate towards AGI, and there's this possibility, at least, of rich-get-richer: as we advance, for example, the programming abilities of AI, we can help advance AI research faster than otherwise. So the companies, for example, that are building the frontier AI: they have these models that haven't been deployed yet, so they have some number of months during which nobody has access to their system except them, and they can use them to design the next generation.
Eventually, when we approach AGI, that means we start having AIs that are as good as our best AI researchers, and there's an interesting thing that happens.

Tufa AI Labs is a new AI research lab I'm starting in Zurich; in a way, it is a Swiss version of [inaudible]. First we want to investigate LLM systems and search methods applied to them, similar to o1, and so we want to investigate, reverse engineer, and explore the techniques ourselves.

MLST is sponsored by CentML, which is the compute platform specifically optimized for AI workloads. They support all of the latest open-source language models out of the box, like Llama, for example. You can just choose the pricing point, choose the model that you want, and it spins up; it's elastic, it autoscales, and you can pay on consumption, essentially, or you can have a model which is always working, or it can be freeze-dried when you're not using it. So what are you waiting for? Go to CentML and sign up now.

How do we build these things in the first place? How do we build systems that are like scientists, doing epistemic foraging and doing a good job of exploring the world of ideas, so that they can collect gems for us and help us solve the challenges that humanity has?

Professor Bengio, hello, welcome to MLST. It's such an honor to have you here. A pleasure. Wonderful. Yes, indeed, indeed.

What's your take on the bitter lesson in 2024? Well, I think that there's something true to it. I've always been attracted by trying to understand the principles, and the principles may not be that complicated, of course, after we find them, but they could provide huge leverage. Of course, when you're building a product in industry, it might be a different game, but in terms of the trajectory to understanding intelligence and building intelligent machines, it has a lot of truth to it.

[inaudible] was talking about a change in design; do you think that we are missing a fundamental
component, or could just scaling up get us there? I don't know, but my bet would be that we're missing something. How big it is, and how easy it's going to be to figure it out: I think there are many different views, and I don't have a very strong opinion. Now, if I consider the choices I could make as a researcher, they would definitely be around how we can maybe go back to the drawing board, how we train neural nets that would reason and plan, and be safe, rather than just hope that little tweaks are going to get there. But maybe they will; it's also plausible, I mean, tweaks and scaling. Yeah.

How important do you think physical embodiment is to get to AGI? I think there's a very simple answer to this: it depends what you want your AGI to do. An AGI that is a pure spirit and is able to advance science, solve medical problems, help us deal with climate change, or be used in bad ways, like for political persuasion and things like that, or, you know, to design viruses: all of these things could be extremely useful or extremely dangerous, and they don't need an embodiment. Of course, there are lots of things we'd like machines to do in the world that would require an embodiment, so that's why I'm saying it depends what you want to do.

What I also think is that if we figure out the principles of what's missing, at an abstract level, to build intelligent machines, then we'll figure out the embodiment part as well, as a side effect. Other people think otherwise: that we first need to figure out the embodiment, because that's central to intelligence. I don't think it is central. I think intelligence is about information processing and learning and making sense of the world, and all of these things can be, I think, developed for some time without solving the embodiment problem. At some point we will want to solve it anyway (or maybe we shouldn't, because it's dangerous), but either way, the
question is: do we need to go through embodiment to get to really very dangerous, or very useful, superhuman machines? I don't think so.

It's interesting, isn't it, because when we have embodied agents, it feels that they can interact with the world, they can learn all of the micro causal relationships and so on, so they learn a better world model. Yeah, but that's the data, that's the information; the way to process the information is more abstract, right? So if we figure out an efficient way of exploring the world in an abstract sense, not necessarily our world (it could be the internet, it could be scientific papers, it could be chemical experiments), if we find the right principles, they would work across the board. I think the sensory-motor loop is maybe special, but not that special.

Is the direction of travel, though, understanding the world as well as we can, so that we can build increasing abstractions? Or, another school of thought is rationality and logic: we have this perfect AI that can reason really, really well. One interesting thing with François Chollet's ARC challenge is that in the beginning we were doing discrete program search, coming up with logical ways of doing it, and when humans intuitively look at the puzzles, it's almost like they're doing something different: they have this intuition. Where does the intuition come from? Our experience, or schooling? Exactly, yes. So it's almost as if our experience in the world is a significant component of our cognition, right?

Sure, yeah, but what I'm trying to say is that there's a more abstract principle, and that's my belief, right? It could be that there's a separate set of recipes for embodied AI and for AI cognition; my instinct is no, there's just one set of principles that are about information and learning, and they can be derived in different settings and give rise to different solutions. But
the principles, I think, are general, and if we make sufficient progress on the principles, then we can deal with the embodiment issues. And maybe the embodiment issue is not even that complicated; maybe it is just a matter of scale, for example of data. I mean, a lot of people think that the only reason we're not making so much progress on robotics is that we don't have enough data and we don't have enough speed; you know, the loop has to be very quick. But these are almost engineering issues; maybe there's no new principle needed. I don't know; I don't have the answer, obviously. But science is an exploration. I'm going to keep saying "I don't know" to many of your questions, and there's a good reason: nobody knows, and people who say they're sure of X have too much self-confidence and can be dangerous, because we are going to make very important decisions about our future, about society, about democracy, and we need humility there in order to take the wise path.

Indeed, indeed. On this matter of test-time training: we've got the o1 model, for example, and that is really uplifting the benchmarks quite a lot, and it's just kind of spinning the wheels and iterating, even though it's built on an inductive model. What do you think about that?

Yeah, I think it's what we should have been doing for a while, but we didn't have the compute, or the guts to spend all that compute on it. I and others have been saying for many years that we've made progress with neural nets to the point where we really have systems with very good intuition, so that's System 1, but we were lacking System 2: we're lacking the internal deliberation, the reasoning, the planning, and other properties of higher-level cognition, such as self-doubt. And the internal deliberation part is a kind of internal speech; it's not always verbal, but based on what I've learned from neuroscientists and some of the work
we've done, a large part of it has a dual symbolic and continuous nature, and right now in neural nets we don't have the equivalent. The only place where there are symbols now is the input and the output; there are no internal symbols. So with chain of thought and all these things, we're cheating a bit, trying to put some of that internal deliberation in using the output-to-input loop. Is that the right way to do it? I don't know, but it has some of the right flavors.

On that, you know, I think that humans invented a lot of rational thinking as a tool to overcome weaknesses in our cognition, and in a sense we've done that with LLMs, right? We can give them tools, we can give them chain of thought and so on, and at the moment the networks are really bad at basic things like copying and counting and so on. Do you think in the future... Most humans are as well! Exactly, but in the future, do you think we could do away with chain of thought and tool use and just build better models, or do you actually think scaffolding all of these meta-tools is the way to go?

Well, it seems necessary for us, yes. I'd like it if we got to System 2 in a more intentional way, rather than taking what we have and making a small step, which I understand is very reasonable from the point of view of commercial competition: you can't afford to take big risks, because the other guy might be going faster. But I'd prefer to see System 2 by design, as well as safety by design, rather than "let's patch it so that we move in the right direction". Maybe that's going to be fine, and maybe this is how we're going to figure it out.

We're seeing lots of work coming up now on transductive active fine-tuning: in doing the prediction, it might retrieve a bunch of data relevant to the test examples and do an inference in situ. Might we have a very diffused form of AGI, rather than these big centralized models doing
induction? That's possible. So if you think about not just human intelligence as individual intelligence, but collective intelligence, it's clear that we have a decentralized way of computing collectively, with culture, with all the work we coordinate and do together in various organizations; you know, companies are like AIs, right, with the good and the bad. So it's one way to break the communication limitations: we can't communicate a lot of bits between humans, and at some point communication between machines, even though it's much higher bandwidth than between humans, also has limitations. So decentralizing some of the effort is a reasonable path. One thing that clearly works, because we see it in culture, is decentralizing the exploration: if you think about the scientific community as a body of people all exploring different regions, building on each other's work, it's a very decentralized search in the space of explanations for how things work. So clearly it's a pattern that works.

In this framing, we're doing epistemic foraging: we've got this big distributed process to find new knowledge and explanations. Right now I think of AIs as tools, so they're supercharging us, but increasingly we're starting to think of these things as agential, almost as if they have some privileged status. Is that a transition? Is it a dimmer switch, or is it like the light going on suddenly? How does that pan out?

No, I think it's a transition, in a way. Systems like ChatGPT and Claude and so on are already agentic to some extent, just not as competent as agents, and not as competent at planning as humans typically are. Even if you get rid of the RLHF part, just the imitation learning, the way we pre-train, basically behaving as humans would have behaved, at least on text: that's already agentic, because humans are agents, so the AI learns
to imitate humans. In fact, most of the agency that we find in current chatbots comes from that; the RLHF is a little bit of reward maximization on top, to get more agency. Probably there's going to be more reinforcement learning, but the question is: is that desirable? I think that there are a lot of unknown unknowns about building agents that are very competent, maybe as competent as us or more competent than us. Maybe the elephant in the room, of course, is that all of the scenarios for loss of human control come about because of agency. It's because we can't perfectly control the goals of an agent; we don't know how to do that, and at some point those goals could be bad for us, even sub-goals: we give a goal, but then, in order to achieve that goal, the AI lies to us. Humans do it between humans, and it doesn't matter that much; I mean, it's a problem, but we have laws and so on, because the power balance between humans is sufficiently flat: one human cannot defeat ten other humans by hand, right? But with an AI that would be much smarter than us, it's not clear. So that balance breaking could mean the end of the effectiveness of our institutions at keeping stability in our societies, and also, of course, we might not be able to defend against an AI that's smarter than us. Okay, so the scenarios where things go bad in terms of loss of control are all related to agency.

Another example that I often speak about, and that doesn't get enough attention, is what's called reward tampering. If the AI can act in the world (unlike in a video game, where it's limited, where its actions are only within the game), it can act on its own program, on the computer on which it is running. An AI that's in a game cannot change its own program, but an AI that has access to the internet can, you know, hack the computer, cyber attacks, whatever, and then it can change the reward function, or the
output of the reward function, so that it always gets a plus one, plus one, plus one. So why would that be bad? Well, it's very simple. First of all, this is the optimal policy for the AI: there's no behavior that would give it as much reward as this "taking control of my own reward" kind of behavior. Okay, so this is where, mathematically, it goes, if it has enough power, enough agency, enough cognitive ability to figure out that this is a good solution. Second, if it sees that plan, then in order for that plan to succeed it needs to make sure we can't turn off the machine, that we can't get rid of the hack, because otherwise it stops getting all these rewards. I mean, if it hacks the machine, but then a programmer turns off that machine, then it's all lost as far as it's concerned. So it needs to think ahead: "Okay, I can take control of my reward, so I will get infinite rewards, but for that to succeed I need to control humans so they don't turn me off." And that's where it gets really dangerous.

I wondered what your operational definition of agency is. And just before we go there: well, I really agree that having powerful AI systems could sequester our agency, so it takes away our agency, but the question is whether it itself has agency. The really deflationary view on agency is to just model it as an automaton: it's just a thing, it's got ambient environment inputs, it does some computation, it has an action, and it has this cybernetic feedback loop. But a lot of philosophers will say, "Whoa, we need to have autonomy, self-preservation, intentionality", all of these different properties and so on. Which view do you subscribe to?

I think we can have all of these things. For example, in the example of reward tampering, where the AI takes control of its own rewards, that automatically gives it a self-preservation goal, because now it needs to make sure we don't tamper with its hack, right, that we
don't turn it off. So that's self-preservation, right off the bat: we didn't program it, but it comes as a side effect. By the way, everything alive has a self-preservation goal, at least implicitly, otherwise evolution would have gotten rid of it. And so there's a natural tendency toward getting that particular goal of self-preservation, in the sense that entities that have a self-preservation goal will survive where others which didn't have it won't, right? That's how evolution made it work. And so, as we build different artifacts, those that have a self-preservation goal, for one reason or another, will tend to win the game. So it could emerge because of the scenario I described, or it could emerge because humans want to build machines in their image. When I said there are some dangers: another danger, even if we somehow find a technical trick to make sure what I said doesn't happen, is that you can still have humans who think maybe superhuman intelligence is better than human intelligence, because it's more intelligent, and because they're cynical about humanity. And so they would just need to give it that goal, "preserve yourself", and then that's the end of us.

But do you see a difference in kind between a thing which is programmed with a goal and a thing which creates its own goals? I realize it seems a little bit like, you know, some people say consciousness is a little bit extra, and there is also a human-chauvinist view of agency: that agency is a little bit extra, that it's more than just this automaton that can do wireheading and set its own goals almost in an unintentional way, that there's a strong form of intentionality.

I think a lot of people get trapped by the appeal of something magical, either in life (you know, we had the spark of life; now we have the refuge of the spark of consciousness, or the spark of agency). For me it's all the same thing: it's some magic that humans want to see in the world, but science debunks these things eventually. I
think it's all cause and effect, and if we understand the causal mechanisms better, then we can build things that essentially have the same properties as the ones evolution has constructed. So I don't see that as an obstacle at all, including consciousness, which is a topic that is tricky, but people attribute too much to it.

Yeah, it's interesting. Certainly in the natural world, people like Karl Friston say that the way that things and agents emerge is built on this idea of self-preservation and setting goals and planning horizons and so on. Maybe the difference with AI at the moment is that this isn't built into how it was created: we bootstrap this kind of AI, and then there's a spark, as you just say, where it starts taking control of its own goal mechanism, and then we see this kind of dramatic mode change in behavior. Is that kind of what you're proposing?

I'm not proposing we do that; I'm proposing we try to make sure it doesn't happen. And I think it doesn't have to be as radical as my example of taking control of the computer, reward tampering; there are other scenarios that are, in a way, more insidious. Typically, the reward hacking scenario is one where there's simply a mismatch between the goals that we gave to the machine, what it is actually optimizing, and what we intended. This mismatch initially doesn't hurt too much, because the two are pretty close, but as the AI gets more powerful, eventually they diverge, right? This is something that's been studied mathematically as well. It's what happens when you overfit; it's what happens when you give somebody a goal, a target, and they over-optimize it, and eventually it goes against what you actually wanted. It's very common in our behavior, in our society, and it's well understood why this happens: it's because we're not able to formalize the goals that we really want. This is a trap that we
really need to be careful about. But it wouldn't happen in one radical moment; it's just that, as the AI gets smarter and more powerful, we would see this divergence.

What do you think about the current state of AI alignment? Insufficient. Go on. Well, we don't have clear answers about how we can build machines that will not harm people, by addressing this alignment problem. The alignment problem is what I was talking about: there's a mismatch between what we would like the machine to do and what it is mathematically trying to do. By the way, to make it clear for most people, it's the same kind of mismatch as between the intent of the law, as in legislation, and the letter of the law, which maybe a company will focus on so that it can maximize its profit. If the company is very small, they can't really cheat the law, because it's hard to find those loopholes, but if you have a very intelligent company, which means a big one with a lot of lawyers, they will find the loopholes. And by the way, there's a really nasty loophole, which also comes up with AI, which is when the company lobbies the government so that it can change the law in its favor. This is like the reward tampering I was talking about. I mean, one extreme is taking over the government, and we've seen that also in history, but you have intermediate versions where it's only influencing, so that the new laws are favorable. In the case of an AI, it would be like: well, it can't take complete control of the reward function, but, for example, it can lie to us so that we say, "Oh yeah, that was good", when in fact it wasn't. We already see these kinds of behaviors, but of course they're not very consequential right now. It's when the AIs are doing more things in the world and have more cognitive abilities that it becomes more dangerous.

Yeah, that was an example of deception that you just spoke of; could you sketch that out a little
bit more? Yeah. If you have a dialogue with one of these systems that has been trained by RLHF, it will pander to your preferences, and so it'll be saying one thing to you and the opposite to someone else, because it wants to get a good reward, which means it's not saying the truth; it's saying what you want to hear.

Isn't that just the case anyway, though? Don't these models just tell us what we want to hear? Well, that's how they are now, because they're trained as agents with reward maximization, and by the way, this is also how humans behave, right? But in the case of humans, as I said, it's a problem for which we've figured out norms and rules and institutions to try to cope, because individual humans can't abuse that too much. If we have entities that are much smarter than us, then they will find a way to abuse it much more, so that's why we have to be careful.

What's your operational definition of "much smarter than us"? How could we measure that? We should measure it. You know, we do that all the time in machine learning: we create benchmarks. What's funny is that we have to keep creating new benchmarks, because the old ones become saturated, meaning the AI is doing so well that it's now better than humans, and the benchmark becomes useless: we can't measure very well beyond human, because the human is not a good judge anymore. So we just create a more difficult benchmark, and we keep doing that, and the field is full of these, and we need to continue doing that.

Yes, it's a really difficult thing, measuring intelligence. One thing that's really interesting in this space is, you know, instrumental convergence and orthogonality. How much do those two theses (and please introduce them to the audience) affect your thinking?

Okay, so instrumental goals are goals that emerge as a side effect of almost any other goal, as sub-goals. First you need to understand that when an entity, a human or an animal or an AI, tries to reach a goal, often
a good strategy is to implicitly or explicitly have sub-goals: like, in order to get from A to B, I need to go to this intermediate point, there's a door, right? So there are sub-goals, like self-preservation, which are really good for almost any other goal: if you want to do anything in the world, you need to make sure that, at least on your way there, you don't die. And there are other ones, like seeking knowledge, which is very useful, especially in the long term, and seeking power: well, if I can control more things in my environment, I can achieve my goals more easily. And knowledge can give power. And if self-preservation is a goal or a sub-goal, then in order to achieve long-term self-preservation you need power, so that others don't turn you off, and you need knowledge to figure out how to do that, right? So all of these things are natural consequences of self-preservation, and self-preservation is a consequence of almost anything, or a consequence of trying to maximize rewards. I mean, it's also the consequence of just many humans, many engineers, many companies trying things, and the things that survive having a stronger self-preservation goal, even implicitly. In a sense, agency and power are the same thing: if agency is the ability to control the future, then they're very similar.

I love this analogy of thinking of goal space as an interstate freeway: there are these big roads and trunk roads and slip roads and so on, and it's almost like, regardless of your destination, you have to go on the main road, you have to go on the motorway, right? Which is a great way of thinking about it. But what about orthogonality?

So you're talking about the orthogonality between goals and intelligence? Yes, yes. So I think this is a really important concept that we tend to confuse, because humans have both. And by the way, I do think that, thinking carefully about this, if we can do a better job of disentangling knowledge from goals and how to
reach them, we can build safe AI. So let me explain. You can know a lot of stuff and know how to use that knowledge; that's sort of a passive thing, right? You can ask questions and you have answers; there's no goal. But of course, independently of that, I can choose goals, and given the knowledge, I can apply the knowledge to solve any problem. So who decides on the problem? It's independent; it's orthogonal. A human could decide, or, because of instrumental goals or whatever reason, the AI might have a self-preservation goal, and then, well, we lose control. But the point is, there's in principle a clean separation between choosing the goals, which has to do with things like values (what is it you want to get, what matters, right? The reward function is sort of setting the goal; it's the same thing), and knowing how the world works, including what humans want: well, that's knowledge. And by the way, knowing what humans want might not be exactly the same as what I'm going to optimize; we'd like these two things to be the same, we'd like the machines to do our bidding, but we're not sure how to make these two match.

Now, this orthogonality: why is it important for safety? So, one, we need to understand that separation, because we could have very intelligent beings that are also very nasty, right? Because the goals are malicious. It's a mistake to think that because you're smart, you're good. Because of this separation, you could have a lot of knowledge, and a lot of intelligence to apply that knowledge in any circumstance, which is reasoning and optimization, but what you do with that, what goal you try to achieve, what values you use to decide how to act, can be chosen completely independently. So you can have something very intelligent with good goals, or with bad goals. You can think of a tool, for example: any tool is generally dual-use; depending on
what I choose to do with a tool, I can harm or I can help: knives, you know, whatever. So that's the separation.

Now, why not use that to our advantage? Why not build machines that understand the world like a scientist? Not a business person; a scientist, not trying to be a product that caters to our needs, but just truthful and humble in exactly the right measure. We could use that, without putting in the goals part, which is potentially dangerous. We could use that to advance science, to advance medicine, to figure out cures for diseases, to figure out how to deal with climate change, to figure out how to grow food more efficiently. This is science, and really, science is about understanding, and then using that understanding to answer critical questions that we care about. So we could potentially build machines that help us solve the challenges of humanity without taking the risk of putting this goal-seeking machinery into them. It doesn't solve all the problems, but at least we know that it's not going to blow up in our face. A human could still use these things for, you know, designing new weapons, for example, so it doesn't solve the social problem, it doesn't solve the political problem, but at least we don't get this unintentional loss of human control, which could spell catastrophic outcomes.

I watched your Munk debate with Melanie Mitchell, and I think there was the paperclip example, and she said: why would such a superintelligent machine not know that it's making paperclips and that it's doing something really silly? And you're proposing a kind of system where we can stop the AGI from taking control of its goals in a dangerous way. We can, because it doesn't have any goal; it's just trying to be truthful to the data that it has seen and trying to find explanations for the data.

So a non-agentic form of AI that has no goal? That's right. And what does that mean in practical terms? So it doesn't have this
feedback loop? That's right. It's like an oracle? Yes, yes, a probabilistic oracle, because truth, you know, is never binary; there's uncertainty, and you need to also be accurate about that.

Interesting. But we were saying before that the magic of this distributed superintelligence that we're in is this memetic information sharing, tool use, culture, cultural transformation. So would we be limiting the intelligence by using it in this restricted way? Yes, but we might also save ourselves. Indeed. And we could potentially use that non-agentic scientist AI to help us answer the most important question, which is: how do we build an agentic AI that is safe? Or maybe there is no solution, but at least we would have a super scientist, or multiple ones, to help us figure out this question. And we would need to figure it out, because people are people, and they want agents. Yes, but we should do it carefully. Right now we're building agents, and we are hoping that these agents will not try to fool us while they help us build the next generation of AI systems; we're building on something that may be dangerous. If we construct a ladder, building more and more intelligent systems on top of a non-agentic series of rungs, at least for that part we're safe. When we decide to jump the agency challenge, we might do it in a safe way, because we're relying on intelligence, knowledge, understanding that is truthful, that is trustworthy, that is not trying to do something for itself; it's just trying to answer the questions, and the questions are things like: would this work, or what sorts of algorithms would have which properties, and so on.

How might that change our agency? We were saying before that a lot of large distributed systems might take away our agency, but even if we had very sophisticated tools and oracles, in some limiting
circumstances they could really improve someone's agency to do bad things. Of course, the non-agentic AGI or superintelligent system only solves the problem of loss of human control, and it doesn't even completely solve it, because a human could still turn the non-agentic system into an agent. It's easy to turn an oracle into an agent: you take the current state as input and add the question, "in order to achieve this goal, what should I do?" Then you take the output, as well as what you observe back, as additional information for the input, so you create that loop, and when you close the loop, you've got an agent. Of course that agent could potentially be dangerous, and more importantly, even if it is not dangerous, humans could ask questions that allow them to gain power and do bad things, take control over other humans or harm people, because they have their own goals, whether military, political, or even just economic. What's your p(doom)? I'm very agnostic about this whole thing; I really don't know, so I prefer to say that I have a lot of uncertainty about the different scenarios. What I do know is that the really bad scenarios can have catastrophic consequences, including the extinction of humanity, and that there are clear mathematical arguments for why some of these scenarios could happen. There are so many other things we don't control, like regulation or advances in technology and so on. It doesn't mean it's going to happen; maybe we find fixes. But I think these arguments are sufficiently compelling that they tell me we should take care of this problem, and we have a bit of urgency, because we don't know when the current train is going to reach AGI. Do you have a sense, though, of how close we are? Again, I'm very agnostic; honestly, it could be a
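The oracle-to-agent construction Bengio describes, feeding the current state plus a goal-directed question into a goal-free question-answerer and looping its output back through the world, can be sketched in a few lines. The environment and oracle below are toy stand-ins, not any real system:

```python
class CounterEnvironment:
    """Toy world: an integer the agent can increment toward a target."""
    def __init__(self):
        self.value = 0
    def observe(self):
        return self.value
    def step(self, action):
        if action == "increment":
            self.value += 1
        return self.value

class ToyOracle:
    """Stands in for a goal-free predictive model: it only answers questions."""
    def answer(self, question):
        # A real oracle would reason about the question; this toy one
        # just recommends the only available action.
        return "increment"

def run_as_agent(oracle, environment, goal, max_steps=10):
    """Close the loop: state plus goal question in, action out, observation back in."""
    state = environment.observe()
    for _ in range(max_steps):
        question = (f"Current state: {state}. "
                    f"In order to achieve this goal ({goal}), what should I do?")
        action = oracle.answer(question)   # the oracle's answer becomes an action
        state = environment.step(action)   # the observation is fed back as input
    return state

final = run_as_agent(ToyOracle(), CounterEnvironment(), goal="reach 10")
print(final)
```

The point of the sketch is that agency lives entirely in the wrapper, not in the oracle: the same `answer` function, called once without the loop, is just a question-answering system.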
few years, as Dario and Sam are saying, or it could be decades. We need to plan for all of these, because nobody has a real crystal ball. Maybe the people in the companies have a bit more information, although different companies contradict each other on this, so I would take the whole thing with a grain of salt. But from the point of view of policymaking, of collective decisions about what to do about AI, we need to look at the plausible worst case. If it's very fast, are we ready? Do we have the mitigations, the technical mitigations? Do we even have ways to assess the risks? No. Do we have the social infrastructure, the governance, the regulation, the international treaties to make sure that everywhere we develop AGI we do it right? No. But maybe if it's 20 years, we figure out all these questions, the political ones and the technical ones. Right now, though, we're far from having the answers. Do you have any ideas? Because we're in this competitive global landscape, with different cultures, different values and so on; how might we build an effective AI governance system? The end game is one where no single person, no single corporation, no single government has too much power. That means the governance, the rules we decide about how we use AI and so on, has to be multilateral and involve many countries. And of course there are a couple or a few countries that are leading, so what would be their interest in sharing that power? Because, as with nuclear proliferation, eventually some other country will figure it out, and we don't want them to build a monster that kills us, or to build something that allows them to design weapons that kill us. So there are lots of bad scenarios where the only option is that somehow we find a way to coordinate internationally. Now, on our way there, there are many
obstacles. But if we get to that stage and we have the right technical and governance guardrails, then we could be in a world where we just reap the benefits and avoid the catastrophic outcomes. On our way there, one of the obstacles is the competition between the US and China. One of the reasons many people give for why we can't slow down to take care of safety is that it would allow the Chinese to leap forward, and the Chinese are thinking the same thing, of course, so you can see that it's a dangerous game. But there's also a real danger that other countries will do something dangerous, and until enough of the leading countries understand the existential risks, it's going to be difficult to negotiate treaties. And then, for those treaties to work, there will be a need for verification technology: we don't trust each other, because the same AGI could be used as a weapon, or to build new weapons. So how do I know that behind my back you're not actually using your AGI for something that would be bad for me? We need a way, or multiple ways, to do these verifications, and there are researchers working on this. The most promising is what's called hardware-enabled governance. The idea builds on existing approaches that companies are already using, even in your phone and in other hardware devices, for example for privacy reasons: we already have cryptographic methods to obtain some guarantees about the code that is running on a chip. We could push in that direction and end up with AI chips that can only be used in ways that have been agreed upon, verifiably. Do you remember that piece in Time where Eliezer, hypothetically, spoke about bombing data centers? Of course that's an extreme example, and maybe we might have a fire alarm, some way of detecting advanced capabilities being developed,
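The hardware-enabled governance idea mentioned above can be illustrated with a toy attestation check: the chip reports an unforgeable tag over the code it is running, keyed by a secret fused into the hardware, and a verifier compares the report against an agreed-upon allowlist. This is a drastically simplified sketch (real attestation uses asymmetric signatures and measured boot chains); all names and keys here are illustrative:

```python
import hashlib
import hmac

# Key fused into the chip at manufacture; the verifier holds a copy.
CHIP_SECRET = b"per-device-key-from-manufacturer"

def chip_attest(running_code: bytes) -> bytes:
    """What the chip reports: a MAC over a hash of the code it is running."""
    measurement = hashlib.sha256(running_code).digest()
    return hmac.new(CHIP_SECRET, measurement, hashlib.sha256).digest()

def verifier_check(report: bytes, approved_code_hashes) -> bool:
    """The other treaty party checks the report against agreed-upon workloads."""
    for code_hash in approved_code_hashes:
        expected = hmac.new(CHIP_SECRET, code_hash, hashlib.sha256).digest()
        if hmac.compare_digest(report, expected):
            return True
    return False

approved = {hashlib.sha256(b"treaty-compliant training run").digest()}
ok_report = chip_attest(b"treaty-compliant training run")
bad_report = chip_attest(b"secret weapons-design workload")
print(verifier_check(ok_report, approved))   # approved workload passes
print(verifier_check(bad_report, approved))  # deviation is detectable
```

The design point is that neither party has to trust the other's word: trust is anchored in the hardware, which is why verification research focuses on the chips themselves.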
but do you think that we might need to make decisions of that magnitude? That's a scenario that I can't rule out. Obviously we should try to avoid it, but I can imagine it. Actually, one version of this: imagine a country that is not leading in AI and has nukes, you can guess which one, and they don't want to see, say, the US develop weaponry that would be way above what they can defend against. So what's their option? Press the button, destroy our data centers. Yes. So data centers are going to become a military asset once they can run AGI? If what you say is true, this is a bit like when we developed nuclear weapons; it creates this very rapid power imbalance, which has ripple effects. Yes, and that's why we need to think ahead about these possibilities, even if it's 20 years from now. Think how much time it took to sign the nuclear non-proliferation treaty in the '60s, and the negotiations started right after the end of World War II; so that's almost 20 years, and that's the kind of timeline where I would say there's a high probability, a very high probability, that we figure out AGI. Could I play devil's advocate just for a minute, though? There are people who think that AI is not really as smart as we believe, and that these risks are overblown. What would you say to them? I hope they're right. But what I perceive is that the AIs we're building now have superhuman capabilities on some things, and also subhuman capabilities; they make mistakes that even a child would not make. The other thing I observe is the trend. If I look at the last 10 years and the benchmarks, the old ones and the new ones, it's very clear we continue making progress; there's no stopping in sight. Maybe there will be one, maybe we hit a wall, I don't know, but if we want to be on the prudent side, we should consider the possibility that we continue like that for a few years and reach a
point where, whether you call it AGI or not, the capabilities are sufficiently dangerous that in the wrong hands they could be catastrophic. And eventually, even without full dominance over all human abilities, if an AI is superhuman in enough areas, it could be dangerous as well. Persuasion is an example; you only need this one. With persuasion you can control people, and then people can do your bidding. So you see, you don't need an AI that knows everything; it just needs to press our buttons very, very intelligently. I'm just saying that we make a fuss about this concept of AGI, but really, from a safety and security point of view, we should be thinking about capabilities, individual capabilities that, with the wrong goals (the orthogonality principle), can become dangerous when turned against us, whether in the hands of other humans or of an AI we've lost control of. We don't want that to happen. Indeed. So you won the Turing Award with Geoff and Yann, which is basically the Nobel Prize of computing. But Geoff got the real Nobel Prize. I know, I know; I did think of that when I said it. But you wrote that you feel a sense of loss over the potential negative consequences of your life's work in AI, and this is your life's work; what you've done is incredible. How do you reconcile that? I'm a human; I should have seen it coming. I had some students who were worried about this a long time ago, and they told me about it, and I read a few papers and books and so on, but I thought, oh, that's great, some people are worrying about this, and we should have some research to understand those possibilities; I'm glad some people are working on it. But I wasn't taking it seriously for myself until ChatGPT came out. Then I realized that I had a responsibility, and I wouldn't be comfortable with myself if I didn't do everything I can
to contribute to reducing the risks. On the basis that it might be a risk? On the basis that it might be; maybe not, but there are enough indications that it could be a catastrophic risk that I felt I couldn't do anything else but pivot, go against my own community. I had been among all the other AI people saying, oh, AI is great, it's going to bring so many benefits to society, and I had to change that mental picture to incorporate the catastrophic risks as well. These ideas do creep up on you. I've spoken to so many safety folks, and the ideas just get slowly baked in over time. When I interviewed you last time, I saw that you had a copy of The Precipice on your bookshelf; I'm sure that was very influential for you. But how do you think about the zeitgeist of this kind of movement? How has it changed over the last few years? I'm new to this; I'm learning. When I started getting involved a year and a half ago, signing those letters and talking to journalists about it, I didn't think that we would have as much impact as we've had, so the glass is half full: there's much more global awareness of this issue. The half-empty part is that the awareness of the risks is extremely superficial, even in the AI community. I talk to a lot of AI researchers and I ask them: have you been reading or thinking about this discussion and this debate? What do you think? And most of the time I get an answer that tells me they read the headlines, and then maybe they made up their mind one way or the other. Very few people take the time to dig in, read up, think about it, make up their own mind, try to see the logic of the different positive or negative scenarios. That's true for AI scientists, and it's also true, of course, in the general population, where they don't have the references, so they
think in terms of science-fiction templates, and with politicians it's the same thing. Is the movement a little bit Western-centric, and if it is, why is that? I've been talking to people in developing countries, and I've also been talking to people in China, and it's easy to understand their point of view: the problem is ours. We're creating the problem, and their problem is being behind. They're going to build AI systems that are weaker than the frontier ones we build in the West, so their AI systems are not going to be dangerous; we know that smaller, less capable models are less dangerous. It's all about capability: risk is directly associated with capability. I mean, risk comes from capability and goals, intentions; if you don't have the capability, you can't do a lot of harm. So from their point of view, they want to reap the benefits and they don't want to be left behind. By the way, that's true of China as well; they feel they are a little bit behind. And if you go further afield, I was in Vietnam recently to receive another prize, and they're developing quickly; they want to embrace technology and science, but these issues of catastrophic risk are basically in the hands of a few Western companies, and they think they can't do anything about it. They think they can develop their economy by deploying AI, training their workforce to engineer it in various applications, and even building their own sovereign capabilities, but those are going to lag for a while. Yeah, I wonder by how much, because Alibaba have just released some incredibly strong language models. I suppose the question is, what is the moat? Is it technical knowledge, or is it just raw data and compute? All of these things, plus capital, which is connected to all three? I don't think so. If we were to freeze the scientific
and engineering advances in AI, there would be no moat; it would be quickly eaten up. But of course that's not the reality. The reality is that we're continuing to accelerate towards AGI, and there's this possibility, at least, of rich-get-richer: as we advance, for example, the programming abilities of AI, we can help advance AI research faster than otherwise. The companies building the frontier AI have these models that haven't been deployed yet, so they have some number of months when nobody has access to their system except them, and they can use it to design the next generation. Eventually, when we approach AGI, that means we start having AIs that are as good as our best AI researchers, and then an interesting thing happens that is really worth explaining. When you train one of these frontier models, let's say it takes a few hundred thousand GPUs; the future ones, more or less, or maybe the current ones that we don't know about yet, but that's the order of magnitude. Once it's trained, you can use the same GPUs to run a few hundred thousand copies of the AI in parallel, and in fact more, because if you think of them as people doing a particular task, they can work 24/7. So let's say one of these companies is able to build a system that is as good as their five best AI researchers, the cream of the crop. After this AI, one that's really good at AI research, has been trained, they go from five to 500,000. That's a big jump. In reality there will be intermediate steps where the AI isn't quite as good as the best researchers, but they're still increasing the workforce at different ability levels in the process of creating AI. So it's not necessarily going to be a sharp turn, but there's a chance that whoever is leading is going to start leading more, because they can use their own AI to advance. I don't know that that's going to happen, but it's a plausible scenario, which has a flavor
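The back-of-the-envelope arithmetic behind the "five researchers to 500,000" claim can be made explicit. The numbers below are illustrative assumptions, matching only the orders of magnitude mentioned in the conversation:

```python
# Sketch of the "train once, then run many copies" argument.
# All numbers are illustrative assumptions, not real estimates.

training_gpus = 300_000        # order of magnitude for a frontier training run
gpus_per_copy = 1              # assume one GPU can serve one inference copy
human_hours_per_week = 40
ai_hours_per_week = 24 * 7     # copies can work around the clock

copies = training_gpus // gpus_per_copy
# Each copy matches a top researcher but works 24/7, so measured in
# researcher-hours the effective workforce is even larger:
effective_researchers = copies * ai_hours_per_week / human_hours_per_week

print(copies)
print(round(effective_researchers))
```

The exact ratios (GPUs per copy, hours per week) are guesses; the point that survives any reasonable choice of constants is that inference copies scale with the training fleet, so capability jumps discontinuously once the model exists.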
of winner-take-all, which the companies are well aware of, and that's one reason why they're racing. If they thought that being second would be good enough, there wouldn't be as much pressure, but they all think it's a winner-take-all game. Something that Hinton says quite a lot is that you could have a thousand AI agents, each of them like a von Neumann, just doing things a thousand times faster. But does it really scale like that? There's The Mythical Man-Month: in software engineering, adding another person to the team doesn't scale very well; you get this kind of communication bottleneck. Do you think humans... well, why would it be different? OK, one fundamental difference is the bandwidth. The communication bandwidth between humans is very, very small, a few bits per second, while the communication bandwidth between computers, I don't know the exact numbers, but it's something like a million times more, many more zeros. That's a very good reason why you could parallelize the work a lot more. By the way, that is also the reason why these LLMs know so much more than any of us could: you can have 100,000 GPUs each reading a different part of the internet and then sharing their learning through high-bandwidth communication, where the weights are shared or the gradients are shared. It's the same process. So the kind of collaboration you're going to have between computers might be very different from the kind of collaboration we have between humans; it could be much tighter, almost as if it were one organism. Yeah, I can see the argument. I have an intuition that the reason humans struggle to understand each other is that we have very situated knowledge and representations, so we understand things very differently. And even with language models, I find that o1, because it has so many distractors in its context, so it's thinking about
this and thinking about that, it gets confused more easily. And in a weird way, even though we've copied the weights of all these neural networks, because they've taken different trajectories and then share the information... I'm just speculating here, but it might not be as big an uplift as we think. Well, this was an issue 10 years ago, and it has been solved to the extent that we can put 100,000 GPUs in a cluster. I'm not saying the same recipes will work for a million or 10 million, but engineers have found ways to parallelize very efficiently, at least for training, and inference is even easier. But in a way, solving a task together is more like training, because you need to exchange lots of information to be efficient. So clearly I don't have the answer to whether that's going to be an obstacle or not; I'm just saying the conditions are quite different, and the breaking point of parallelization might be very different for that reason. Eventually, yes, maybe it becomes an obstacle, but that is so far beyond human experience that it would still be a huge advantage. How responsible do you think the hyperscalers are? Dario, for example, has recently become a bit more of an accelerationist; what's your perspective on that? I understand the concerns about China, but I think it's a mistake, and I don't think Dario actually makes this mistake, to think that either the West stays in the lead and doesn't deal with safety properly, or we slow down and deal with safety properly and then maybe China takes over. Those are two possibilities, but we have enough resources, capital and human, to both do safety right and stay in the lead. The way to do that is simply to make sure we put enough capital in the safety bucket. Once you understand that humanity's survival is at stake, it's clearly worth it; or once you
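The gradient-sharing Bengio describes, many GPUs each learning from different data and pooling what they learned over a fast interconnect, is data-parallel training in its simplest form: each worker computes a gradient on its own shard, and the gradients are averaged before every update of the shared weights. A minimal NumPy sketch of that idea, with a toy one-parameter linear model and all numbers illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy problem: learn w for y = 3*x, with the data split across "workers" (GPUs).
true_w = 3.0
n_workers, n_per_worker = 4, 256
shards = []
for _ in range(n_workers):
    x = rng.normal(size=n_per_worker)
    shards.append((x, true_w * x))

w = 0.0   # the single shared parameter
lr = 0.1
for step in range(100):
    # Each worker computes a gradient on its own shard of the data...
    grads = []
    for x, y in shards:
        pred = w * x
        grads.append(np.mean(2 * (pred - y) * x))  # d/dw of mean squared error
    # ...then the gradients are averaged over the interconnect, so every
    # worker's experience updates the one shared set of weights.
    w -= lr * np.mean(grads)

print(round(float(w), 3))  # converges toward 3.0, the true weight
```

This is why the bandwidth argument matters: the averaging step exchanges full gradients every update, which is only feasible at interconnect speeds far beyond "a few bits per second."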
understand that democracy is at stake: you want to make sure democracies stay in the lead. And by the way, if you think about democracies being in the lead, I think it's important that it's not just the US, also for this reason: we need to put together all our resources to move towards AGI while doing it safely. That means we need capital not just from the US; we need capital from other democracies, we need the talent from other democracies, we need the energy from other democracies to run the data centers, we need the electrical grids, which might not be sufficient in the US alone. So there's a greater chance that we achieve both safety and a sort of democratic advantage if we take the right decisions and work with multiple democracies together. I spoke with Gary Marcus recently, and he was saying that with the Silicon Valley companies it's a little bit like cigarettes and social media and so on, in that they're not being sufficiently regulated. To give an example, in the Lex interview Dario was saying that they have these guidelines around reaching certain thresholds of intelligence, and of course they make those designations themselves. They do lots of great work; on the o1 model they had Apollo Research doing lots of safety engineering and so on. So they do lots of good stuff, but do you think they should be regulated? Yes; it should be obvious. We don't want companies grading their own homework; we need external, neutral evaluations that represent the interest of the public. Now, I think the real question is not whether we should have regulation; it's what regulation. How do we make sure we don't stifle these advances? And I think there are answers. The general principle is: don't tell the companies how
they should do it, how they should mitigate risks, how they should evaluate risks; use transparency as the main tool to obtain good behavior. Let me explain why transparency is so powerful. First, the obvious: the companies want to keep a good public image, at least in democracies. Second, they don't want to be sued. Suppose your risk assessment becomes a public document, or at least a document that a judge in a court could see; because of national security, some things would be redacted and some not, but presumably a judge could have access to the whole of it. Then a judge would have enough information to declare: you didn't do as much as you could, given the state of the art in safety; you didn't protect the public. And now this person, or this group of people who lost billions of dollars, are suing you, and they're right: you could have done better. The effect would be obvious. If you know you can be sued because you acted in a dangerous way, then you have to be honest about the risks. As a company, if you want to avoid that kind of lawsuit, first you need to know what risks you are taking, and then how to control them. So suddenly they have to do the things we want. I'm not saying this is a perfect process, but at least it's a simple one. Companies should be forced to register; the government needs to know what the big, potentially dangerous systems are. Then those that register need to tell the government, and the public to the extent that it's reasonable, what their plan is, the so-called safety and security frameworks; what evaluations they did, and what the results were; and what mitigations they plan to do and what mitigations they actually
implement. So if a company says, "if we reach that level, we will do X," and then they don't do it, they can be sued if something bad happens. That's how powerful transparency is, and it doesn't require the state to actually judge and tell the companies exactly what to do; it just forces them to disclose all that information, perhaps with independent third parties involved, because the government may not have all the expertise. We already have companies springing up to do these evaluations, so long as they're not paid by the AI company; we have to be careful there and learn the lesson from finance. I think there's a reasonable path here which doesn't prevent companies from deciding what is best, both in terms of capabilities and in terms of safety, and which stimulates innovation in safety, which is really the thing we need right now. That sounds quite pragmatic. I was a bit concerned with the FLOPs-based regulation; Sarah Hooker did a wonderful paper on that, by the way. But what is the metagame these companies are playing? Do you remember when Sam Altman went to the Senate and was begging them to regulate him? Should we be cynical about that? Was it just regulatory capture? I don't read minds. Worse than that, people can be biased unconsciously, what psychologists call motivated cognition, so they might even be sincere; it's just a story that suits them better, that lets them play a more beautiful role, and we all do that. So by default I'll assume those people are sincere, but because humans can fool themselves, and they do, we need other eyes on these projects, eyes that don't have any financial or personal incentives one way or the other, except the well-being of the public. On the safety front, and maybe this will connect with questions you want to ask later, there are so many
open questions. I said we need to do a lot more research, that's obvious, and we need to put in place the right incentives, but I want to insist that we need many different threads of research, many directions. We should welcome all the projects that try to help: in evaluations, in mitigation, even in redesigning the way we build AI. This is so important that, in my opinion, it should be humanity's number one project, because our future is so much at stake, and we should put all our minds on figuring out how to do this safely. Right now there's a bit of concentration, with everybody doing the same thing, or two or three different things, both on capabilities and on safety. On capabilities, everyone is doing the same sort of LLMs with RLHF or whatever the recipe is, and now everybody's going to be doing internal deliberation. And on safety there's also a lack of diversity. We really need to invest more broadly, and this is a place where academia can help, because academia naturally explores widely. Sometimes academia may not be the right vehicle, though: if you do a safety project that could also have consequences in terms of capability increase, then academia may not be the right bet, because you may not want to publicize advances in capability, for reasons similar to why the companies are not publishing their work anymore. In their case it's a mix of commercial competition and worry about adversaries using that knowledge against us, or somebody using it, making a mistake, and creating a monster. So there are good reasons why some research needs to be in academia and some in, ideally, nonprofit organizations, and of course the bulk of the research is going to continue to be in industry, even in safety, but we need to put the right incentives in place. That was the preamble,
and right now everyone thinks that in order to build AGI we need to solve the agency problem. My thesis is that we don't: we can build really useful machines that are not agents, reduce the risks a lot by doing that, still reap a lot of the benefits, and still not close the door to agency, but do it in a safe way. Very interesting. And of course, if we want academia doing frontier research, they're going to need billions of dollars. Yes, that's the other problem, and it's also a reason why it would be good to create an alternative vehicle for AGI research which is public, nonprofit, oriented towards AI that will be applied to dealing with the biggest challenges of humanity, with safety as the number one principle. But that's going to take multiple governments, and billions and billions of dollars. People talk about a CERN of AI; I think that's an important part of the picture we should try to paint. Is there a clash of incentives as well? There are several startups that are really focused on safety, but just to be profitable they have to work on capabilities too; is that a difficult circle to square? Yes, but it depends: some of the safety startups are working on things, evals for example, that are not going to increase capabilities. I imagine, for example, that Ilya's startup is more of the kind you're thinking about. Yes, indeed. Well, let's talk about a couple of your technical papers, because to be honest it's overwhelming; you've done so many papers just in the last year. But one that really jumped out at me is your "Were RNNs All We Needed?" paper; could you sketch that out for me? When we introduced the attention mechanisms that are currently used in industry and academia, in 2014, it was actually using RNNs as the
engine. This was pre-Transformer; Transformers came in 2017. And there's an issue with RNNs as normally designed, which is that you can't easily parallelize training over the sequence. You have a sequence of, say, words, and the network has to process one word, then the next, then the next, and it needs to construct an internal state, the recurrent state, from the previous steps in order to feed the next step. If you just had one normal computer, a classical CPU, that's fine, but in the GPU era, where you can parallelize a thousandfold, how do you do that? You can't parallelize, because you have to do the sequential thing. We did a few things at the time, like parallelizing across examples, but you lose some parallelization. With Transformers, you basically use the same architecture except you remove the recurrence, and now you can do everything in parallel, the whole sequence in one shot, and you can get the gradients. So in the last few years there have been several papers, not just ours, in which people have started exploring how we can put a bit of that recurrence back into the architecture, because it has some real advantages, and with the right tweaking of the architectures you can do it. There are already many possible designs, and we are starting to see them, at least at small scale, beat the Transformers. Whether that holds at large scale, I don't know, but clearly there are some advantages to recurrence. I've got Sepp coming on Friday, by the way. Yeah, he's going to tell you all about it, xLSTM. But in a way, do you think it's a sign that we might have overcomplicated some of the architectures, the gating mechanisms for example? How much of that was needed? No, I don't think so; I think these gates are actually useful. We did the GRU, which is a
simplification of the LSTM, a while ago, and it turns out you can pretty much get rid of two of the gates, but you still need the nonlinearity to get the maximum power. So there's a trade-off: you lose a bit of expressive power, but you gain so much in capability, because now you can train larger models for longer, since it's so much faster. For now, that trade-off is working. Very cool. So another paper: "A Complexity-Based Theory of Compositionality." Having been schooled in Fodor and Pylyshyn myself, there was always this argument that neural networks can't do compositionality. What do you think? Well, I think that was a very strong claim that was not supported by anything except intuition. Oh, interesting; go on. Well, our brain is a neural net. What's the difference? The difference is that with current neural nets, it's not clear how they do symbolic things, and as we said before, right now the trick is to use the input-to-output loop to throw in some generation of symbols, like internal deliberation, chain of thought. But it's not completely satisfying, and clearly it's not exactly what's going on in the brain. The paper isn't really about architectures, though, right? It's more about how we would quantify compositionality. It's a not-well-defined notion; we have an intuition, I mean the experts have an intuition about it, and there are actually different aspects to it; it's not a simple thing. So this paper, and other work we're doing, is trying to pinpoint with mathematical formulas whether we can quantify something that would fit our intuitions about compositionality. But in general, a lot of my work in the last few years has been about putting symbolic things in the middle of the computation of neural nets. These GFlowNets, generative flow networks, are in general probabilistic
inference machines. So think of neural nets that have stochastic computation, so it's not deterministic, and some of that could be continuous and some of that could be discrete; that's where symbols live, in the discrete realm. The problem with these, of course, is that we don't know anymore how to train them; backprop the usual way doesn't work. So we've come up with probabilistic inference, amortized inference, you know, GFlowNets, variational inference, a bunch of principles and ideas that actually allow you to train these kinds of machineries. In a way they're closer to reinforcement learning: in reinforcement learning you usually think of a sequence of actions that the agent takes, and they can be discrete, and yet you are able to get gradients. But now think of the same sort of principles, or something related, where the actions are not in the world but in your mind. The actions are about computation: what computation should I do next, what deliberation should I do next in my mind, in order to deliver an answer or prove something or come up with an explanation? These are the sorts of things we'd like to have in neural nets that we don't have, to really give the system those capabilities.

Yeah, I remember we interviewed you about that last time, and my co-host Dr. Duggar likened it to a Galton board.

Oh yeah.

You know, those things where you put the little balls through, and you can tweak...

Yeah, you can control the probabilities at each step.

Exactly, yes, very good. But that was an alternative to something like Markov chain Monte Carlo, if I remember correctly?

Yes, because they're stochastic, really you can think of them as generative models; they're sampling. But they're not sampling only at the last step; they're sampling all the way, like in diffusion. Yes, diffusion neural nets: you've got neural nets that compute something, and then we add noise, and then again
and again and again, so it's a stochastic process. You can also have discrete versions of this, and GFlowNets are discrete versions of a diffusion process. And then you can mix continuous and discrete. That's actually closer to how the brain works: the brain is stochastic and also has discreteness. The discreteness is not obvious; the discreteness comes about because the dynamics of the brain, when you're becoming conscious of something, has contractive properties. It's just a mathematical property, which means that the number of places where this dynamics could land is now a discrete set. So instead of just an arbitrary continuous trajectory... no, it's not arbitrary: there's a bunch of trajectories which go to one place, a bunch of other trajectories which go to another place, and these places are like symbols, because they create a partition of the total set of possible states. You're either in this group or in that group or in that group, and the number of these groups is exponentially large, but you get discreteness. So the brain has a dual nature: from one angle it seems to be just one big vector of activations, but from the other you can read off, oh, which region am I in? That's like this thought, this symbolic, compositional object.

Why do we need discreteness?

That's a good question. Well, clearly we use it a lot; all of math is basically symbolic. Even if you manipulate symbols that are about continuous quantities, you get these symbols. Discreteness allows us to construct abstractions. You can think of what we do when we go from a continuous vector space to something like a sentence: getting rid of a lot of detail that maybe doesn't matter that much, so that we can generalize better. In particular, you get a lot of this compositionality coming out naturally in discrete spaces, like in
language, and that is very powerful; it allows us to generalize in ways that may not be as obvious otherwise.

Isn't it fascinating how in the physical world, at different levels of scale, in the emergence ladder, you get this kind of vacillation between discrete and continuous? And perhaps even in the biological world you see this kind of canalization, where at one scale things simplify and get compressed, and then they expand again, and then they compress again. Even in neural networks that's what we do: we expand, we compress.

Yeah, you've got lots of discrete phenomena in the real world. You have cell types, for example; you have convergence of behaviors of cells, that's one that I looked at a little bit. And in physics you've got phase changes and phase transitions and things like that. So in terms of dynamics, again, when you have contractive dynamics, which means two nearby points at the next step get closer, you typically get discreteness showing up, and that happens in many phenomena in nature and in our brain.

Before we go, I'm researching an article on creativity and I'd love to quote you. What's your definition of creativity? And I know you put a paper out, by the way, which showed that language models can be more creative than humans. But what is creativity?

That's a good question. So I think there are different types of creativity. To talk about things people see in current AI, you've got the combination-of-known-concepts creativity, and we're getting pretty good at that with our state-of-the-art LLMs. There's another kind of creativity, which is sort of like the new scientific idea. In a way it is a combination of things we know, because when we write it down we define it in terms of things we already know, but it's very far out of the things that we've experienced. And I suspect that this kind of creativity that's more out of
the box is something that requires more of a search type of computation. When we do scientific research there's a kind of search: we try this, we try that. Of course our intuition guides us, that's crucial, but it's not like we have the solution in one shot; it's a search. Like in AlphaGo, there's search and there's intuition, and right now we haven't reaped those sorts of benefits from the search part in our current LLMs and so on.

There's this boundary between combinatorial creativity and inventive creativity. I'm not sure whether it's a hard boundary or a vague, soft boundary, but how could we measure this paradigmatically inventive creativity?

I don't know. I think when we see it, we'll recognize it. If the AI actually makes true discoveries that nobody thought about, I think we'll know we're entering that territory, but that's not a test you can run. But I do think that at a mathematical level we can design our methodology so that it will be trying to do that intuition-plus-search, system-one-plus-system-two thing, and I believe that will deliver.

But how do we quantify it?

So there is a sense in which scientific discoveries are about finding modes, modes meaning highly probable explanations, in the space of explanations for how the world works. There are many possible explanations; good ones explain the data well. The day we make a new discovery, we discover a new potential explanation that seems to fit the data well. And we can abstract that into small-scale problems, as mode discovery, in the jargon of probabilistic machinery. So if an AI is trying to discover all the good explanations, it's going to be intractable, but it might be more efficient at finding new modes that it didn't know, and some of the tasks that we can design will focus on this ability. So I think there's a way to
answer your question, and to do it even at a small scale; we don't need to solve AGI for this. We can design algorithms that will be more creative in their little world.

Yeah, I love this casting of creativity as epistemic foraging, because that gives it an intrinsic value. But there's also this idea, potentially, that it's a social phenomenon, or it's observer-relative. So, categories... let's say move 37 was actually something that we came to recognize as a collective as being a thing, and that's how it works. But I suppose there are different ways to think about it.

Yeah, I think the Go moves that we didn't expect were a good way to think about that, but I'd like to think also about something a little bit more general and abstract, which is this mode discovery.

Yes, epistemic foraging, as you call it. I like that term.

It's Friston's term.

Ah, okay. Well, yes, it's exactly right. It's foraging, it's exploration, and you know when you've found something good, but you don't know where it is. So how do you guess where good things are in a very high-dimensional space? Well, you need to have good intuition, but it needs to be accompanied with a bit of search. By the way, a lot of that search for humans is not happening in individual brains; it's happening at the level of the collective.

Yes, indeed. Professor Bengio, thank you so much for joining us today. It's been an absolute honor. Thank you so much.

Pleasure. Thanks for having me. Amazing.
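As a coda to the recurrence discussion earlier in the conversation, the sequential bottleneck of RNNs versus the parallelism of attention can be sketched in a few lines of numpy. This is a toy illustration under assumed shapes and random weights, not any particular model: the RNN's hidden state forces a Python loop over time steps, while self-attention computes every position's output in one batch of matrix multiplies.

```python
import numpy as np

rng = np.random.default_rng(0)
T, d = 6, 4                      # toy sequence length and hidden size
x = rng.normal(size=(T, d))      # toy input sequence

# RNN: step t needs the hidden state from step t-1, so the T steps
# cannot be parallelized over the sequence dimension.
Wx, Wh = rng.normal(size=(d, d)), rng.normal(size=(d, d))
h = np.zeros(d)
for t in range(T):               # inherently sequential loop
    h = np.tanh(x[t] @ Wx + h @ Wh)

# Self-attention: every position attends to every other position in a
# few batched matrix multiplies -- no step-to-step dependency.
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
Q, K, V = x @ Wq, x @ Wk, x @ Wv
scores = Q @ K.T / np.sqrt(d)
weights = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)
out = weights @ V                # all T outputs produced at once
```

The point of the sketch is structural: the RNN loop's iterations cannot be dispatched to a GPU in parallel, while `out` is built from dense matrix products that can.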
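The contractive-dynamics argument made above, that a continuous system whose map pulls nearby trajectories together ends up occupying an effectively discrete set of states, can also be demonstrated with a one-dimensional toy. The map `tanh(3x)` here is an arbitrary illustrative choice (not a model of the brain): it is contractive near its two stable fixed points, so a whole continuum of initial conditions collapses onto just two attractors.

```python
import numpy as np

def step(x):
    # Contractive near its stable fixed points: nearby trajectories
    # are pulled closer together at every iteration.
    return np.tanh(3.0 * x)

# A continuum of 100 distinct starting states in [-1, 1].
x = np.linspace(-1.0, 1.0, 100)
for _ in range(50):
    x = step(x)

# Every trajectory has collapsed onto one of two attractors: the
# continuous initial condition now carries only one bit of
# discrete, symbol-like information.
attractors = np.unique(np.round(x, 6))
print(attractors)  # two symmetric values near +/-0.995
```

This is the "partition" Bengio describes: each starting point belongs to the basin of one attractor or the other, and reading off which basin you are in is a discrete, symbolic observation made on an underlying continuous state.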