Hello everyone, welcome to Bold Conjectures with Paras Chopra. Today I'm with Patrick Mineault, an independent scientist working at the intersection of neuroscience and deep learning. He did his PhD in visual neuroscience at McGill University, worked at Google as a scientist, and then worked with Facebook to build brain-machine interfaces. Most recently he helped build Neuromatch Academy, an online summer school on computational neuroscience. I highly recommend you check it out: the content is exhaustive, it covers almost everything there is to cover in computational neuroscience, and it's all free, so it's really amazing content. On his well-known blog, xcorr.net, he writes about the rapidly accelerating merger of techniques in AI and neuroscience. We don't have a settled name for this field yet, but people have started calling it neuro-AI. It aims to study how the brain works by studying artificial intelligence, particularly deep learning techniques and artificial neural networks. AI and neuroscience are two of my favorite topics, so there's lots to unpack in this episode, and I couldn't think of a better person than Patrick to talk about the promise this new field of neuro-AI holds. Welcome, Patrick.

Oh, thank you. It's great to be here.

Great, so let's get right into the definition. How do you define neuro-AI? It's a new field, so what do you club together when you talk about neuro-AI as a term?

Neuro-AI as a field is obviously a very heterogeneous thing, but it consists of the intersection of neuroscience, which is the study of the brain, and artificial intelligence, which is how to create intelligence inside a machine that is not a brain, and of how these two things relate to each other. It's certainly been the goal for a very long time in artificial intelligence to make intelligent machines, and one of the ways that we can make intelligent machines is
to take inspiration from the brain. So a lot of the field of neuro-AI is about what we can learn from the brain that will allow us to make intelligent machines. But there's another part of neuro-AI, which asks: what can we learn from artificial intelligence that tells us more about how we as humans become, or are, intelligent? So there's a bidirectional interaction between these two fields, and it's really been growing exponentially in the last few years as the tools have become better and as deep learning has revolutionized the field.

Now, I should say that neuro-AI has existed for a long time. The premier conference in artificial intelligence is of course NeurIPS, and the "NeurI" in NeurIPS means neural: it's the Neural Information Processing Systems conference. Originally, in the late '80s, it was a joint neuroscience and AI conference, started by Terry Sejnowski, who's at UCSD. People were really looking at the intersection of these two things, but eventually it focused in on artificial intelligence, and that's what it is today, though there is still a small sliver of it that is about neuroscience. We have other conferences as well where you can explore these kinds of things: there's MAIN, there's CCN. We can place some links later if people are interested in learning more about this field and getting a little sampling of all the different approaches people are using to tackle this question.

So before we get into some of the techniques in neuro-AI, I wanted to spend a little bit of time discussing philosophy, in the sense of: what does understanding even mean? When we talk about neuro-AI, one side is the neuroscience of the brain and the other is deep learning networks,
and both are notoriously hard things to understand. What happens when you replace one black box with another black box? In that sense, what does understanding mean? When we talk about understanding the brain using deep learning networks, what exactly are we doing?

Oh yeah, absolutely. I think it's important to recognize that there are a variety of goals when you're modeling something. With respect to the kind of neuro-AI approach I'm particularly interested in and have worked on in the past, maybe the person who has done the best job of laying out a framework for these approaches is Jessica Thompson, who was at Mila and is now, I believe, at Oxford. She created a framework in which there are different goals one can have when evaluating different kinds of neural networks and how they relate to brains.

One thing you can evaluate is whether a given neural network, say one that does object recognition, is like the brain. First of all, can it perform a task that humans can perform? If it can, that's a big checkmark: you can say, okay, we now have a model of object recognition. That's one level of understanding. Another level of understanding is to say we can relate the network's activity to neural activity one to one; we can actually do quantitative measurements. That's another box we can check. An entirely different kind of checkbox is to say the network is biologically plausible: I can say how it relates to parts that are actually implemented in biology, what it looks like in terms of synapses, brain areas, and neurons, meaning actual physical neurons, not ReLU units. And a final thing that
we can do is to say: well, here's a plausible picture for how this network could have evolved through natural evolution, which is yet another level of realism. So when we're talking about understanding the brain through neural networks, there's a whole array of factors, and it's possible to address one without addressing the others. In my mind it's a little like the Aristotelian picture of causes: there are a lot of different ways you can define a cause, and some of them are more useful in some contexts than others.

So you're saying we have a number of constraints, inspired by biology, and if our artificial models satisfy some of those constraints, then we can make a claim that we understand something. Maybe they compute similar functions, or maybe they model neural activity. If those constraints are fulfilled, we can claim to understand something.

Exactly. And how deep you want to go into this rabbit hole really depends, because there are very different levels of biological plausibility. One is that I can identify which area of the brain a part of the network corresponds to, so I have an association; it's biologically plausible in the sense that these things aren't done in a vacuum. Maybe another level would be that I have inhibitory cells and excitatory cells in my network; that's one level of realism. Another is to say my network has biophysically realistic noise; that's another level. Maybe it has some heterogeneity. Maybe the way it learns is similar to a brain, so it's not gradient descent but something like a local learning rule. And then you can go
way down that rabbit hole, as far as you want: you can say each of my units actually has a physical implementation, it has real synapses and dendrites and axons, it looks like this, it has this concentration of this neurotransmitter. You can go deeper and deeper, and what each step really does is take the hypothesis space, which is gigantic, and chop it down one constraint at a time. Depending on which neuroscientists and which artificial intelligence researchers you ask, they're going to care a lot more about some parts of that space than others.

Right. So let's say we have a model that satisfies a number of constraints, and it works reasonably well on those different constraints. What then? What do we do with that model?

That's a great question. I think there's an assumption in the field, often unspoken, that if we could only model the brain really well, something would happen that would allow us to do something else; but that something else is sometimes a little ill-defined. One thing the field often thinks will be achieved is better artificial intelligence models. The canonical example of that happening is convolutional neural networks. There's a long history of convolutional neural networks and how they appeared. The short story is that Hubel and Wiesel, who were two neuroscientists at Harvard, were the first to record from the primary visual cortex, and when they did, they saw two different kinds of cells. One type, the simple cell, was selective for a particular orientation in a particular part of the visual field, but what was particular about it is that it really cared about
the sign of the contrast: for instance, it might respond to a white line but not a black line, or vice versa. The other type of cell, the complex cell, also liked orientation, but in this case did not care about the sign of the contrast. Based on that, they said, here's what's probably happening: the simple cell receives a lot of information from the previous area, essentially weighing the right inputs in the right way and pooling that information. Then several of these cells, one selective for white lines, one for black lines, one for edges that go white to black, another for edges that go black to white, have their outputs added together, and that creates the complex cell. The complex cell is still selective for orientation, but it has lost sensitivity to, or rather is now invariant to, the sign of contrast. People looked at that and said: maybe the same process is happening all over the brain. Maybe we just have a selectivity operation and an invariance operation, and we stack these one on top of the other. The first person to really formalize this was Fukushima, in 1980, with the Neocognitron, and then Yann LeCun picked it up to make the modern CNN in the late '80s and early '90s with LeNet. Eventually that became the first generation of neural nets to have success on ImageNet and on image recognition problems generally, with of course the famous work from Alex Krizhevsky et al. in 2012 that cracked this mysterious problem of how you do object recognition with these machines. That was the first time they were able to beat handcrafted features, and from then onward it's been deep neural nets all the way. But it started with that little nugget of information: that the brain is composed of these operations.
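As a toy illustration of the selectivity-then-invariance motif described here, a rectified linear filter can play the simple cell and a sum over opposite-polarity filters the complex cell. All signals and filter values below are made up for illustration; this is a sketch of the idea, not a model of real cortical responses:

```python
def simple_cell(signal, kernel):
    """Rectified correlation: responds only when the local pattern
    matches the kernel's contrast polarity (selectivity)."""
    n, k = len(signal), len(kernel)
    out = []
    for i in range(n - k + 1):
        s = sum(signal[i + j] * kernel[j] for j in range(k))
        out.append(max(s, 0.0))  # rectification keeps the cell sign-sensitive
    return out

def complex_cell(signal, kernel):
    """Pools two simple cells with opposite polarities, so the response
    no longer depends on the sign of the contrast (invariance)."""
    flipped = [-w for w in kernel]
    pos = simple_cell(signal, kernel)
    neg = simple_cell(signal, flipped)
    return [a + b for a, b in zip(pos, neg)]

edge = [-1.0, 1.0]            # detects dark-to-light transitions
dark_to_light = [0, 0, 1, 1]  # step up
light_to_dark = [1, 1, 0, 0]  # step down

# The simple cell distinguishes the two polarities...
print(max(simple_cell(dark_to_light, edge)))  # responds
print(max(simple_cell(light_to_dark, edge)))  # silent
# ...while the complex cell responds equally to both.
print(max(complex_cell(dark_to_light, edge)))
print(max(complex_cell(light_to_dark, edge)))
```

Stacking these two operations repeatedly is, in caricature, the convolution-then-pooling recipe of a CNN.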
Simple, complex, simple, complex, simple, complex: that was very important in that generation of the deep learning revolution. So I think people still have the hope that there are more principles we can mine out of the brain. The other common example is reinforcement learning. For people interested in this, there's a great paper whose senior author is Demis Hassabis, the CEO of DeepMind, that goes through the CNN and RL examples in particular and lays out what he thinks are the next big things that could be uncovered by studying the brain more deeply: principles that could be helpful for making intelligent machines.

Right. In the last ten years or so, deep learning has become its own field, somewhat detached from neuroscience, and it's seen amazing progress with things like GPT-3. So has there been a push in the reverse direction, with neuroscience importing insights from the independent progress of deep learning and AI, the way convolutional networks were exported to AI?

Oh yeah, absolutely. On one hand, there are very pragmatic, everyday tasks that can now be automated that weren't automatable before, and that's allowed us to gather a lot more information about how humans and animals interact with the world. I think computational ethology is a super interesting field, and it was hampered by exactly this technical limitation. Computational ethology sounds very fancy, but it's really simple. The idea is: what if we placed three cameras in my kitchen and just watched me all day? What do I do, exactly? That can tell you a lot about what it is to be a human, or at least what it is to be a human who plays on his laptop all day and every once in a
while stands up, eats nuts, sits back down, and programs. The problem, of course, is the labeling problem, and in this case it was really solved by advances in labeling bodies automatically, in markerless, tracker-free pose estimation. We've seen an explosion there; there's a particular paper from Mackenzie Mathis that has already been cited thousands of times though it's only three years old. That whole field was completely enabled by the deep learning revolution. That's very much the picks-and-shovels way of thinking about deep learning: I can make new tools, these tools are helpful, and with them I can learn new things about what it is to be a human or an animal. That's certainly been very helpful. Some people might argue that's a little superficial; well, I don't think people actually say that, but maybe it's not the kind of really deep insight we're looking for. I personally think we've made tons of progress in understanding sensory systems because of these advances in artificial intelligence, and we can definitely go into that.

Yeah, my sense was more about, say, looking at how transformers enable translation, or how GPT-3 is able to hold a coherent conversation. In fact, training without labels, GPT-3 is able to infer a lot of context about the English language. So have there been attempts to study how that artificial model works, to infer how the brain might be doing similar things, like picking up language as a child?

So I'm pretty dubious that GPT-3 is a model of the brain in any meaningful sense, although what it does is pretty amazing and you can use it for a lot of very interesting things. The reason I'm so dubious is that it's trained on ten thousand times more data
than a human child would ever see, so how could it ever happen that these two things are equivalent? Also, the setting in which human children learn language is very different from the kind of unsupervised learning people are doing with GPT-3. For those who might not have the context: GPT-3 is trained on an enormous fraction of the text humankind has ever produced, tons and tons of tokens; I want to say 500 billion tokens.

Yeah, I think it's that order, 400 or 500 billion tokens.

It's completely ridiculous. That means every encyclopedia, every book that's ever been written, all of Wikipedia, and basically most of the web, de-duplicated and cleaned up, with the weird parts cut out, which is a lot of the web. That's what it's trained on. Clearly a human child would never have that much information. That would be one smart child; it would be awesome to speak to that child. But unfortunately it is not the case that human children are ever trained with that amount of information.

Then if you look at the task the transformer network actually does, it's a masked prediction task. I forget the details of whether it's a bidirectional or a unidirectional mask, but one version of the task is: I give you the start of a sentence, like "Mary had a little", and I ask you for the probability distribution over what comes next. "Lamb" probably has 0.99 probability, and anything else has very low probability. If you just learn to do this recursively, over and over again, you're going to get pretty good at generating sentences that make sense and have a lot of context.
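The shape of that prediction objective can be sketched with a tiny made-up corpus standing in for web-scale data. Real models like GPT-3 condition on long contexts with a transformer; here a bigram count table is enough to show "predict a distribution over the next token":

```python
# Minimal next-word prediction: estimate P(next word | previous word)
# from counts on a toy corpus. Corpus text is invented for illustration.
from collections import Counter, defaultdict

corpus = ("mary had a little lamb . "
          "mary had a little lamb whose fleece was white . "
          "a little lamb followed mary .").split()

counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def next_word_distribution(word):
    """Normalized counts: maximum-likelihood bigram probabilities."""
    c = counts[word]
    total = sum(c.values())
    return {w: n / total for w, n in c.items()}

print(next_word_distribution("little"))  # "lamb" gets probability 1.0 here
print(next_word_distribution("mary"))    # split between "had" and "."
```

Sampling from such distributions recursively is, in miniature, what "generating sentences" means for these models.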
But it's very different from a human child, because the human child, first of all, goes through different stages of learning. The human child is an agent; let's start with that premise, which is huge. GPT-3 is not an agent in any flavor, whereas a human child who doesn't understand a thing can keep asking questions, which is amazing. You can imagine a two-year-old pointing at a dog that's especially fluffy and thinking it's a cat. "Dog!" "No, no, it's actually a cat," or vice versa. "Oh really? And what about this one, is this a dog?" "Yes, that's a dog; this one just has very long hair. Look at its face." You can have this conversation, so there's a process of iterative refinement. Another thing is that people introduce different curricula to children as they grow older. At first people use highly inflected speech: they speak in a sing-song, baby-oriented manner, say words very slowly, and make sure the child is looking at them. As the child grows older they introduce larger and larger vocabularies, and so on through schooling. So there are all these sub-processes: the fact that the child can interrogate the world and learn about it themselves, the fact that there's a curriculum that goes from very simple to much more complicated, the fact that the child can have conversations and get their own training signal, and the fact that they have ten thousand times less data than GPT-3. I would be curious, hypothetically:
if human children could grow up over 50,000 years instead of five, so that they actually had the time to be exposed to all this data, what would it be like to talk to one of those children? It's a weird philosophy question; a good science fiction plot.

Yes, exactly. So I hear you saying GPT-3 and similar models might not be a good model for the brain because the brain uses a very different way of learning. But a lot of people who've tried to implement brain-like techniques, like predictive coding or agent-based reinforcement learning methodologies, have not matched the performance of straightforward models like GPT-3. That's a dilemma, right? You could have a lot of human-like AI models, but so far, and I might be mistaken, I'm not sure they've performed really well on things that straightforward deep learning models handle.

Yeah, so I think there has been a transition in different subfields, where people first start to work with supervised learning. Supervised learning models generally answer the question "how do I accomplish a task?" with "I accomplish the task by accomplishing the task." It's a very simple principle: how do you do the thing? You do the thing, and then you get good at it. If I want to solve image recognition, I train a network to solve image recognition. Boom, super simple. There's a lot less fiddling and there are a lot fewer free parameters that come with that, and so historically those have been the approaches that worked better and more reliably. There have been a few blips here and there in history where people said: maybe we don't really like supervised learning anymore, maybe we want to do unsupervised learning. Some of the
models of the modern deep learning revolution, in 2005 and 2006, were trained not with variational methods but with energy-based methods, using a Gibbs sampling approach, which was very different from modern supervised learning. They called it pre-training, but these days we don't do that anymore, because we have a bunch of tricks that let us train directly with supervised learning. So yes, supervised learning works great, but it's not a biologically realistic way of learning things; I think that's the consensus. People hate the idea of something that sounds like giving a child a thousand examples of each of a thousand different categories. There's no way that ever happens. What I'm describing, of course, is ImageNet: there's one category in ImageNet that's, say, Rottweilers and another that's Bulldogs. First of all, would a child ever see a thousand instances of Rottweilers versus Bulldogs? There's that problem, but also they probably wouldn't get labels.

So generally people think supervised learning is not biologically realistic, but maybe that doesn't matter, because there's something deeper in the fact that the brain and the network are kind of doing the same thing. It's a teleological idea: the network, or the brain, was built to solve problem X, and by correlating the brain with the network you can infer what that X is. That's super cool, because all we really know for sure is that the brain helps us survive; anything else is fair game at this point. So if you say: out of a billion different networks that I looked at, the ones trained to do object recognition in a supervised manner were most similar to
the brain, compared to networks trained on every other task I could throw at it, whether 3D segmentation, 2D semantic segmentation, figuring out textures, 3D mapping, what have you, then you can say: maybe the brain really does do object recognition. And whoa, that's something. Now you can say this brain area, or this set of brain areas, is really nicely tuned for this problem. What does it mean for us as human agents trying to interact with the world? What does it mean that we naturally evolved this thing that is good at this particular task? Maybe it teaches us about the kinds of environments we grew up in, or about what the human experience is. So there's still something very deep you can do with supervised learning. But you can't then turn around and say: therefore children receive labels, and now we should have a new government program that gives children a photo book with a million labeled images so they can learn visual recognition really well. Supervised learning is definitely not how the brain works.

So in your recent post about how the field is progressing, one thing was clearly evident: you were excited about unsupervised methods reaching similar performance to supervised methods as far as modeling the brain's visual system is concerned. Can you talk a little more about that? Why does it represent a big step up?

Oh yeah, absolutely. As I mentioned, the easiest way to do the thing is by doing the thing; that's the general consensus in deep learning. However, every once in a while people figure out how to do the thing by doing something completely different, which is unsupervised learning. If you want to solve object recognition, why not learn to represent the distribution of textures in images? And maybe
there's a lot more information there, so you don't need to leverage the labels of the objects so much. Labels, of course, are very expensive, whereas the images themselves are super cheap to acquire. That's the general calculus: unsupervised learning is harder because you're doing things in an indirect fashion, but if it works well, it could potentially work a lot better than supervised learning, which is easier because it's more direct, but hits a wall because there's only so much label information to go around.

In order to be able to say we have unsupervised-trained networks that are as good as the brain, the first thing we need is unsupervised-trained networks that are as good as supervised-trained networks. The reason is that the match between supervised-trained networks and the brain, at least for some areas and some visual processes, is already excellent, so if you start with something that's not very good at the task of object recognition, you're not going to get very far. In the last three years or so there's been a veritable revolution in self-supervised and unsupervised learning that has allowed people to create networks that are very good at image recognition but are trained mostly in an unsupervised or self-supervised manner, with just a little sprinkling of supervised training on top. You might know about MoCo, about SimCLR; there are all these acronyms people come up with. The one people may be most generally aware of, because there's been a very nice blog post about it by OpenAI, is CLIP. Between CLIP and DALL-E, there's been a lot of art created with these
networks. CLIP is an example of a network that's trained in a self-supervised, contrastive manner. The way it works is: you do a big crawl of the internet and find images along with text that sits close to those images, and then, instead of trying to predict the text directly, you try to predict a kind of low-dimensional version of the information in the text, a kind of soft label, if you will. By doing that you actually get very good representations which are useful for a lot of different things. It's a multimodal model, because it uses both text and images, and I don't think it's nearly as problematic as a supervised network.

Right. To capture what you're saying: supervised networks match the brain, but of course they require labels. And by matching the brain we mean they do things like recognize objects as well as a brain would, say a primate brain, and they can also be used to predict direct neural recordings.

Precisely. Again, there are different levels of explanation. One is that they do something the brain does. Another is that you can literally predict brain activity: for instance, if I show you an image, I can predict what the brain activity is going to be for that image.

That was very surprising to me when I was reading the papers. Why should that even be possible?

Oh, it's bizarre. I think everybody agrees it's a weird thing that's going on there, but there are a lot of underlying hypotheses for why this is the case. Maybe if you try to solve the same kind of task as the brain solves, you will naturally do it in a way that's equivalent to how the brain does it.
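The quantitative comparison mentioned here, relating network features to recorded activity, can be sketched with entirely synthetic numbers. A made-up "neuron" is driven by a feature from one hypothetical network, a plain correlation stands in for the regularized linear regression used in practice, and the network whose features fit best is taken as the better account of the area:

```python
# Toy model comparison: which network's features best predict a
# recorded response? All data below are synthetic, for illustration.
import random

random.seed(0)

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x) ** 0.5
    vy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (vx * vy)

stimuli = range(50)
# One feature per stimulus from each of two hypothetical networks.
recognition_feat = [float(s % 7) for s in stimuli]
texture_feat = [random.random() for _ in stimuli]
# Pretend the recorded neuron is driven by the recognition feature.
neuron = [f + random.gauss(0, 0.3) for f in recognition_feat]

score_recognition = pearson(recognition_feat, neuron)
score_texture = pearson(texture_feat, neuron)
print(score_recognition, score_texture)  # recognition features win
```

In real studies the features are high-dimensional and the mapping is fit with cross-validated regression, but the logic of the comparison is the same.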
I think people like to think that maybe these networks are like the brain in some ways: they have multiple stages, and they have both invariance and selectivity operations, so they're close enough that they remain close, in idea space or on the manifold, to what you would expect in the brain.

But the learning algorithms seem to be very different. One is backpropagation, and as I understand it, backpropagation might not be possible in the brain.

Yeah. I think people really like to do a separation of concerns when it comes to these things. One paper says: we're going to find a network that is as good as the brain at doing this task. Great. Then another paper tackles: I want to find an algorithm that can learn a task as well as an artificial neural network, but without any sort of gradient descent; it uses some other, trace-based algorithm that is purely local and doesn't have the difficulties that stochastic gradient descent poses for the brain. But there is still the integration work of taking all of these things together to create the ultimate mega-model that has everything. There are a lot of things you would want in that model. It would have to not use stochastic gradient descent. It would have to have excitatory and inhibitory neurons and follow Dale's law, which says that excitatory neurons make excitatory connections and inhibitory neurons make inhibitory connections. It would have to have recurrence, multiple areas, a frontal lobe that talks back to inferotemporal cortex, multiple passes through the network. It would be kind of a mess. So I think people are pretty wisely attacking one of these things at a time, and punting on the integration for now.
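Of the constraints listed here, Dale's law is the easiest to state in code. A minimal sketch with made-up weights: tag each unit excitatory or inhibitory and clamp its outgoing weights to that sign, as one might do after each update step in a sign-constrained network:

```python
# Toy Dale's-law projection: each unit's outgoing weights share one sign.
# Weight values and signs below are invented for illustration.

def apply_dales_law(weights, signs):
    """Clamp row i of `weights` to the sign of unit i: excitatory units
    keep only non-negative outgoing weights, inhibitory units only
    non-positive ones."""
    constrained = []
    for row, s in zip(weights, signs):
        if s > 0:
            constrained.append([max(w, 0.0) for w in row])
        else:
            constrained.append([min(w, 0.0) for w in row])
    return constrained

signs = [+1, -1]              # unit 0 excitatory, unit 1 inhibitory
weights = [[0.5, -0.2],       # unit 0's outgoing weights
           [0.3, -0.7]]       # unit 1's outgoing weights
print(apply_dales_law(weights, signs))
```

Projections like this, applied after every gradient step, are one common way sign constraints are imposed on otherwise unconstrained networks.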
We'll have all of these elements together eventually. My feeling is that if you really put all of these elements together, you're going to have a great brain model that is going to be nearly impossible to even think about. It's going to be just as bad as the real brain.

So coming back to what I was saying: we have supervised models that match the brain quite well, but obviously there's this big dissonance, because the brain doesn't get so many labels, thousands of labels per category. So the big recent progress is matching supervised models' performance without the labels. But when you talked about CLIP, CLIP also uses millions and millions of images, and a child wouldn't get so many images. Isn't there still a gap between what a child gets and what an unsupervised model gets as data?

Yeah. Some of the best work is really tackling this question. When I say people really hate supervised learning as a model of how the brain learns, I really mean it, viscerally: supervised learning wouldn't even work in principle; that's just not how the brain could possibly work. Whereas unsupervised learning can maybe make sense if all the things align; the devil is in the details. Like you mentioned: how many samples do you have? Sure, GPT-3 is trained in an unsupervised fashion, but it uses ten thousand times more tokens than a child gets, so it's not biologically plausible. I think Talia Konkle has a great paper on this, and if people want to read the discussion in particular, it asks: how would you actually implement, for real,
in the brain, this unsupervised method. One of the big conclusions from that paper is that maybe the key is saccades. Remember that unsupervised, or rather self-supervised, methods like SimCLR and MoCo generally function on the principle of transformations of images. One of the big ways they work is: I take an image, let's say an image of a bird, I take a little crop around the bird's head and a little crop around the bird's body, I feed these two things into the network, and I tell the network: make sure these two things, the head of the bird and the body of the bird, end up as close to each other as possible. It leverages the idea that there's generally one object in each image that's the main subject, and if you take different crops you still maintain that identity, so the crops should be close to each other in representation space. And then there's another aspect: make sure that patches taken from another image end up far away from that bird image. If you just keep doing that, eventually you're going to learn a pretty good representation, one that can solve things like object recognition.

So she started with the premise: let's take that seriously. Let's take this cropping idea seriously; what would it even look like in the brain? Maybe the easiest way to create something that looks like cropping is saccades. We move our eyes around the world every 300 milliseconds or so. If you're ever at a library, well, I guess libraries don't really exist anymore because of COVID, but if you ever have the chance of watching somebody read a book, you'll notice that their eyes
dart: they stay in fixed positions and then jump in very fast, ballistic eye movements, which are saccades. The reason people do this is that we have much better resolution at the center of the eye than in the periphery, so if you bring the letters to the center of your eye, where you have immense resolution and a dense packing of cones, you can read very fine print. And saccades could let you do this cropping algorithm in an analog, physical way: you have one snapshot of the world, with the sampling that comes from the retina, then you move your eyes and you get another sample of the world, and yet your brain must assume that when I move my eyes, things stay about the same. I did not radically change the scene or the identity of objects just by moving my eyes.

That's a really interesting connection. Sorry, I was saying, that's a very intriguing connection.

And the paper is really nice. There's also the idea of hippocampal replay as another mechanism: maybe you could store a buffer of the image and replay it, but again with saccades as part of the equation. I really like it, not because it's a fully baked proposal, but because it's a great example of somebody seriously thinking through how you would actually implement this when it gets down to brass tacks. We're not in silico; you don't get to write a for loop to go through the iterations and descend the gradient. That's not how it works. If we actually implement this in wetware, what will it actually look like?

Right. Is anyone trying to go one step beyond that? When we're looking at things, saccades are one thing, but we also physically move our bodies and explore
the environment; we look at objects from multiple angles just by moving through space. Is anyone trying to extend that kind of active sampling in vision to the whole 3D environment?

Oh yeah, absolutely. That's a great segue into some of my own research, which involves exactly this problem. I've been interested for a long time in how the brain perceives motion. A lot of the research on how brains are similar to artificial neural networks has focused on object recognition, because it was mysterious for a very long time, and because there are approaches that can solve it in silico, so there's been a lot of work there. But we know far less about how the brain solves motion problems, and on an evolutionary time scale, motion has always been among the most important things. Not to toot my own horn, or to claim that motion is the most important problem to study, but it is definitely evolutionarily important: even mammals, fish, or invertebrates that barely have a brain still rely on motion to orient themselves. It's something that's been happening for a very long time.

So the question we looked at is: how could you create something that looks like the motion-processing system of humans and primates, and grow it from an objective? Could you find an artificial neural network that does motion in roughly the same way humans do?

I don't know if you've ever had the experience of vertigo. Vertigo is an interesting thing because you really see, or maybe if you've had a few too many drinks, that the entire world spins around you. You have this illusory motion, and yet
it doesn't really happen, and it's also a visceral feeling. It comes from a mismatch between the information from the eye and the information from the vestibular system, the inner-ear organ that tells us our rotational acceleration and the direction of gravity; it's our IMU, if you will. There's one area in the brain called MST that receives inputs from both the vestibular system and the visual motion system, and if you look at the kinds of things these neurons respond to, it's exactly that kind of weird rotation you might see when you have vertigo, or expansion stimuli, or contractions: the kinds of optic flow that happen as you navigate through the world. So there's always been this notion that maybe what this area is doing is helping us navigate. If I intend to go toward one place and I see a lot of motion, say tons of leftward motion, it means I'm not going in the right direction, because the point I'm heading toward is supposed to have no motion. It's like when a sci-fi movie shows warp speed: at the very center, the stars don't become lines. The center doesn't move.

So I trained a deep neural network to solve a subtask of this system, which is essentially: given a little sequence of images, where was I going? How fast is my head rotating, and where am I heading in the world? If you train this in silico with simulation data, in a simulation built in Unreal Engine, where you can add whatever artifacts you want to the environment, then you find that the network looks a heck of a lot like what
you would see in the motion-selective areas of the brain. They do it in the same sequence, and the sequence goes: first, I care about motion in a little patch of the visual world, and specifically for a specific orientation; then I solve motion over larger patches, where I no longer care so much about orientation or pattern; and in the third stage I care about combinations of motions, the combinations that create things like rotation and expansion. So you see this almost perfect recapitulation of what the brain does, and it matches both quantitatively and qualitatively.

I think the next step along this journey is to ask: could you use that for path integration, as you mentioned? Does the brain do SLAM, simultaneous localization and mapping, or something like it? There are certainly people working on the hippocampus, really trying to better understand how those maps get built, and it would be nice to connect the two. The idea is that you do motion, and as you do motion you buffer that information and put it into a map. The hippocampal side of things is pretty well understood, and the motion side is pretty well understood, but what happens in the middle is complex, interesting, and messy.

It seems really a little surreal that you can train an artificial network on a task and have it converge to how the brain behaves. It seems like it should be possible, and yet...

When I saw the initial results, I thought: this is so wild. Scientists are often very even-tempered when they present this stuff to the public and just say, well, these are the
results, because we don't want to over-interpret. But as a scientist it can also happen that you get results and think: I can't believe this actually worked. This is weird. What is happening?

I guess this touches something that is broadly true. Maybe it's capturing the nature of the problem itself, in some sense, that two different systems converge to similar behavior and similar structure. It seems like it should be possible, and yet that's actually what you find, which is crazy.

Great. So Patrick, since we're almost at the end of the conversation, I wanted to talk about the C word: consciousness. I know you have some thoughts, and I know it's also not considered entirely polite in the neuroscience community to talk about what all of this says about consciousness, but do you think this work can lead us to understand consciousness in a much more rigorous, much more scientific way? Do you have hope for that, or do you think it's still very, very far away?

I personally think it's very, very far away, but I do think we'll learn something about what it is to be human in the process of understanding all of these different subsystems, because we'll understand how the senses relate to our internal model of the world, which is how we create behavior and act as humans. Leibniz famously said that there is nothing in the mind that was not first in the senses, except the mind itself, so I do think that by understanding the senses better we will get to this understanding. But as far as consciousness goes, there are two big contender theories that people are really looking at right now. There's integrated information theory, from Tononi, and if you read the recent book by Anil Seth, and I
think you have, because I read your notes, that's the standard theory that's often brought up. And then there's global workspace theory, from Stanislas Dehaene, a very famous French psychologist who has also written a ton of books; you should definitely pick them up, because they're very interesting. So the question is: by understanding these sensory systems, will we learn things we can relate back to these existing theories of consciousness? Could they speak to some sub-aspects of those theories? I do think that's actually possible.

For instance, in global workspace theory you have the notion that there is essentially a stream of consciousness, and a process by which low-level visual or sensory information can pop out and actually take over consciousness. It's a process a bit like a phase transition: if I present you a stimulus for a certain amount of time and then place a distractor or some noise stimulus, maybe you won't see it; but if I show it to you a little bit longer, it has time to grow and be amplified in the brain until it takes over your consciousness, and then you see the stimulus. So I do think that as we create artificial systems, we may be able to get models that tell us which kinds of stimuli will actually be accessible to consciousness.

But, and here you might be dubious, I don't think that's what you, or most people, think about when they think about consciousness. Yes, there are the contents of my consciousness: certain things I can see and certain things I can't, and the things I can see, I'm conscious of, fine. But if I have a model of
what I can see and what I can't see, sure, it will tell me the things I'm conscious of and the things I'm not, but I don't think that's what people are talking about. People are talking about qualia: why does it feel like anything to be anything?

I think both are interesting questions. That would be my contention exactly. How does this whole 3D structure get instantiated in the brain? I think that's just as interesting as why red feels like red, and while the qualia question might turn out to be impossible to answer in the end, at least the formal question we can answer: how is this 3D world generated inside the head? I was also curious: just as people are building networks for object recognition and finding that these artificial networks match how the brain does it, is anyone trying to model consciousness itself in an artificial network, and seeing whether that model ends up matching things we observe in our own experience?

I think people are hopeful that this will happen. I personally haven't seen much work that speaks directly to that question, but it would certainly be super interesting. The closest, I think, is some of the work from Anil Seth, which has looked at altered states of consciousness and whether it's possible to induce the kinds of feelings you get in an altered state through interaction with an artificial neural network. The classic setup there is: if you take psychedelics, whether psilocybin or LSD or DMT or ayahuasca, you might have visual hallucinations, or enhancements of your visual perception, which will make you feel strange, for lack of a better word. He was looking at whether it's possible to recreate those kinds of feelings of strangeness and changes in
visual perception through an artificial-neural-network-assisted method. What he chose to do is take 360-degree footage, a 360-degree movie you can present in VR, and apply Deep Dream to it. You might have seen Deep Dream images already: those funky images where you take a basic image and crank it up to 11 until everything becomes eyes and faces of dogs and so on; you can certainly use a search engine to look at some Deep Dream images right now. It's very similar to that process. The subjects are inside a VR headset, they undergo this, and the researchers looked at whether people had experiences that were, along some axes, similar to the psychedelic experience. The idea is that maybe you can alter people's state of being through the senses.

And there's another sub-question here: the complex visual hallucinations you might see in the psychedelic state, could you explain them as the visual system driven into overdrive, in a very specific sense? Deep Dream is the optimization of an objective: you start with some baseline image and modify it so that the channels of a deep neural network are activated as strongly as possible; it tries to drive all the units toward their maximum. So is this a model of altered visual perception? I think that second claim is probably not as convincing. But could you create the same kind of state in which it feels strange to be in a strange visual world? I think they showed that pretty convincingly, for what it is. These are very small studies, though; it's still very early days, so I wouldn't say it's
entirely convincing, but it's one way people are exploring it.

Right. The way I would look at it: consciousness must have some functional utility for evolution to give us such a rich experience. It would be super interesting to have two versions of a model, one without anything even resembling consciousness, and another with some sort of internal model, what we might call the inner world of that model, and see whether the latter performs better on tasks similar to the survival tasks human bodies perform. That would be pretty convincing to me that having its own internal world matters.

Yeah. One of the nice things there is that according to integrated information theory, if you implement this in silico you can simulate consciousness, but the thing is not actually going to be conscious, so you don't have to worry about that. I think Christof Koch has a whole argument about branching factors, about how many things are connected to how many things, so as long as we have the conventional computers we have today, we don't have to be afraid of accidentally creating a conscious machine.

I guess we could keep going on and on about this, but this has been wonderful, Patrick. We've covered many interesting threads, and I'm really hopeful that in the coming years the model you were talking about, integrating more and more aspects of how the brain works into one mega-model, will come together. Hopefully we'll get there a few years from now.

Absolutely.

Great talking to you. Thank you, and have a great day.
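As a small appendix: the crop-based contrastive objective Patrick describes (push two crops of the same image together, push crops of other images apart) can be sketched in a few lines. This is a minimal NumPy illustration of an InfoNCE-style loss, not the actual SimCLR or MoCo implementation; the function names and the toy random "embeddings" are my own stand-ins, and a real system would produce the embeddings by running a deep encoder over augmented image crops.

```python
import numpy as np

def normalize(x):
    # Project each embedding onto the unit sphere so dot products are cosine similarities
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def infonce_loss(anchors, positives, temperature=0.1):
    """InfoNCE objective: pull each anchor (one crop) toward its matching
    positive (another crop of the same image) and away from the positives
    of every other image in the batch, which serve as negatives."""
    a = normalize(anchors)
    p = normalize(positives)
    logits = a @ p.T / temperature                  # (N, N) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)     # numerical stability
    # Row-wise cross-entropy with the diagonal (the true pairing) as the target
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

# Toy demo: when "two crops of the same image" have matching embeddings,
# the loss is much lower than for randomly paired embeddings.
rng = np.random.default_rng(0)
emb = rng.normal(size=(8, 16))                      # pretend crop embeddings
loss_aligned = infonce_loss(emb, emb)               # crops agree perfectly
loss_random = infonce_loss(emb, rng.normal(size=(8, 16)))
print(loss_aligned, loss_random)
```

In the saccade analogy from the conversation, the two crops would be successive fixations of the same scene, and the objective says exactly what Patrick describes: representations should stay stable across eye movements, since moving your eyes does not change the identity of the objects in front of you.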