Hey everybody, welcome back to the Ava Labs systems seminar. Today we're really excited to have Guillaume Ballet from the go-ethereum team, who has done and led a huge amount of the Verkle tree research over the years. We'll be talking about Verkle trees as well as the migration path to get there. Really excited, and thank you so much for coming. I'll turn it over to you.

Hey, thanks for having me. Unfortunately I need to start by correcting you: my name is Guillaume. Don't worry, I'm not offended; I'm so used to people mispronouncing it that it's not a problem. I've been working at the EF for six years now, and for the last two years I've been working on what we now refer to as the verkle transition, which is the conversion of the current Ethereum state, stored in an MPT (a Merkle Patricia trie), into a Verkle tree.

I don't expect to teach you a lot when it comes to client development, and I assume you're already aware that the state of Ethereum is growing: we expect that in a year there will be about a billion leaves in the tree. This has consequences. One of them is that databases get slower as they get bigger. It also induces a lot of maintenance issues with nodes. First, you need to upgrade your disks: a lot of people bought a one-terabyte disk and we're starting to hit that limit. It also means that if you're trying to sync, it's going to take a long time; we have made a lot of progress on the sync algorithms, even recently, but it's still not satisfying, it's still quite difficult to sync. A lot of things feel like the system is bursting at the seams, and the consequence is that more and more people rely on centralized providers just because they don't want to maintain a node. As a result, Ethereum would no longer be the decentralized ecosystem it strives to be.

Of course, we tried to mitigate a few things. We repriced some opcodes. We also changed some behavior; I'm talking about the gas refund. When you wrote some non-zero data, you paid some gas to write it into the state, but if you later wrote zero instead, you got a refund. The assumption was that this would reduce the size of the state, so people would be incentivized to delete what they create if they can. That backfired really badly, because someone created a contract called GasToken where people could just reserve space and hope to get the refund at a time when gas was more expensive. Very quickly that turned out to be, maybe not the biggest, but one of the biggest contracts in the Ethereum state, so clearly that was not a good idea, and we stopped doing that so that people don't keep adding more state.

Then there's EIP-1559. This was not its initial goal, but it put some pressure on the gas price: if you want to pay a lot of gas to make a very big block, the price goes up.
The hope was that this would also sort of reduce the gas price; there has clearly been no significant result on this front, but it was an interesting aspect of it as well.

Then there are more technical, geth-specific things. We started taking stuff out of the database and putting it in an "ancient store": data that is very unlikely to be accessed in the day-to-day business of client execution gets stored out of the way, which makes the DB a bit smaller. We invented this thing called snap sync, which comes with a snapshot. The snapshot is basically a key-value store, so instead of jumping from node to node in the tree, you go straight to the data. It does add more data on disk, but it improves performance, so it's quite useful. And very recently, Gary from the EF, from the geth team specifically, has worked on path-based storage: instead of storing nodes by hash, we store trie nodes by their path in the trie, so you can go straight to the node you're looking for. That improves reads, and it also avoids storing several versions of the trie on disk: there's always exactly one version of the trie on disk, and all the updates are kept in memory, in RAM, until they can be collapsed and written back to disk. That's a very useful change, and Verkle trees are going to build on top of it.

Still, there's a lot of room for long-term improvement, and we have proposed ideas. Verkle trees and stateless Ethereum are one. History expiry is another: it's about deleting old data, old blocks, old logs, a lot of things that clients are keeping around right now but maybe should not. There are state expiry schemes: there have been a few proposals, none of which has really been worked on for the last three years at least, but it keeps coming back as a topic. And then there are ideas where you would shard the state and distribute it all over the network, which is what a project like the Portal Network is doing.

I'm going to focus on Verkle trees, because that's what I've been working on for the last two years. It's a new tree structure. I don't know how much of the structure of Ethereum you know; I assume quite a lot, but just as a reminder: currently Ethereum has one single account tree, and when you get to an account, it has a field called the storage root, which is the root of another tree where you find all the data for that contract, for that account. All the storage that pertains to that account is stored in that second tree. With Verkle trees we change that structure completely: we use only a single tree, so all accounts, all storage, all code, everything goes inside the same tree, and we use a new key format to find everything; I'll explain that in a minute. We also replace hashes with polynomial commitments. I'll talk a bit about that too; it's a pretty big topic, so I'll just give a very high-level overview, but it's a very interesting one.

So, the new key format. Each item in an account has an offset associated with it: for example the balance, the nonce, any storage slot, any chunk of code.
We chunk the code: we break it into chunks of 31 bytes, plus an indicator byte at the beginning to mark what is push data. Every one of these items has an offset: the balance is at offset 1, the nonce is at offset 2. For storage, say slot number 200, we have what we call the main storage offset, which is a huge number; there's a bit of a complication I'm not going to get into, but there's a calculation to find the offset from a formula: the slot number, minus 64, plus the main storage offset. The code chunks have a start offset of 128 plus the chunk number.

To calculate the key, that is, where we find each item in the tree, we take the address, and we take the offset as a big-endian 32-byte number, of which we keep only the first 31 bytes, and we compute a hash over them. It's not a Keccak; it's a Pedersen hash, a hash based on elliptic curves that has some nice properties, including ZK-friendliness. Out of this hash we take the first 31 bytes, and then we take the last byte of the offset, the one we did not include in the hash calculation, and we append it as the last byte of the key.

How does that work in practice? Take the balance and the nonce: their offsets are 1 and 2, so the first 31 bytes of each offset are all zeros, because those are numbers that fit in the least significant byte. So you get the same hash for both, you take the first 31 bytes of that hash, and you just put the last byte of the offset as the last byte of the key. The effect is that even though the data is spread all over the tree, it's actually grouped in groups of 256, which has some interesting properties and is also going to imply a lot of changes, especially for Solidity. The way Solidity stores its data in a map is just hashing the key and storing the value in the middle of nowhere; that's going to be a bit of a problem, because you would prefer grouping things. So I would say arrays are going to be the preferred data structure in the future, compared to maps.

For a storage slot, the first 31 bytes of the offset are no longer zero, so you will not get the same hash as you did for the previous two items, but you take the offset modulo 256 and you find that at the end of the key. And the same thing goes for the code.
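To make the key derivation concrete, here is a minimal Go sketch of the scheme as just described. The `pedersenHash` stub (backed by sha256 only so the sketch runs), the constant names, and the exact layout are illustrative assumptions, not geth's actual code:

```go
package main

import (
	"crypto/sha256"
	"math/big"
)

// pedersenHash stands in for the real elliptic-curve-based hash used by
// the verkle key derivation; sha256 is used here only so the sketch runs.
func pedersenHash(address, offsetPrefix []byte) [32]byte {
	return sha256.Sum256(append(append([]byte{}, address...), offsetPrefix...))
}

// Offsets for a few account items, as described in the talk.
const (
	balanceLeafKey = 1   // the balance lives at offset 1
	nonceLeafKey   = 2   // the nonce lives at offset 2
	codeOffset     = 128 // code chunk i lives at offset 128 + i
)

func treeKey(address [20]byte, offset *big.Int) [32]byte {
	var off [32]byte
	offset.FillBytes(off[:]) // the offset as a big-endian 32-byte number

	h := pedersenHash(address[:], off[:31]) // hash the address + first 31 offset bytes

	var key [32]byte
	copy(key[:31], h[:31]) // first 31 bytes of the hash: the "stem"
	key[31] = off[31]      // last offset byte: position within the 256-value group
	return key
}
```

With this layout, `treeKey(addr, big.NewInt(1))` (balance) and `treeKey(addr, big.NewInt(2))` (nonce) share the same stem and differ only in the last byte, which is exactly the grouping property described above.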
I might be saying something very basic, but I think it's important to understand the problem with the current structure, and why we switch to polynomial commitments. It's simply that if you want to prove that some item b in a pre-image is present, and at the exact location you claim it is, you have to pass all the other siblings: if you're thinking of a tree node, you have to pass all of its siblings, essentially the entire node data. So to make a proof of the location of something in the tree, for every level you have to pass all the siblings, and that makes for very big proofs.

If you go for polynomial commitments instead (and I'm only going to skim the surface here, because it's a very complex and deep topic): assume you have a vector, and you want to prove that b is at position 1 in it. You create a polynomial in Lagrange form. I don't want to get too much into the math, but those polynomials are known as the Lagrange basis: L1, for example, is 1 at x = 1 and 0 at all the other evaluation points; L2 is 1 at x = 2 and 0 at the others; and so on. The idea is that once you have passed a commitment, you evaluate this polynomial at a random number that the prover does not control. That can be done interactively, or through what's called the Fiat-Shamir technique. Either way, the prover cannot predict beforehand what this challenge s is going to be, so they can't really cheat.

And here's what's interesting: the proof that your vector equals b at its first position is equivalent to saying that there exists a polynomial q. The prover sends you that polynomial evaluated at s and proves to you that it really comes from a polynomial; how it proves that depends on the technique you choose: you can go with KZG, or with what we use, which is IPA. You yourself build a polynomial which is the division of the committed polynomial with the root you want to prove factored out, and this quotient represents the evaluation of that second polynomial at the point you're interested in. If the two agree, there's a lemma, the Schwartz-Zippel lemma, that says that if two polynomials agree at one random point, then with very high probability they are the same polynomial.

Okay, that was a lot of math, and very simplified math at that; if you want more details you can always ask me later. But hopefully it was enough for you to get the point: the advantage is that if you want to prove the position of one value in a vector, and what that value is, you only need a constant amount of data. That means you can make your tree much wider: each of your branching nodes can become much wider, because you don't have to pass the siblings. And because those nodes become much wider, the tree also gets shallower, so there's a kind of virtuous circle where your proofs get smaller and smaller.
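Condensing that into symbols (this is the generic polynomial-commitment opening argument being gestured at here, not a spec of the exact IPA protocol):

$$f(x) = \sum_i v_i\,L_i(x), \qquad L_i(j) = \begin{cases}1 & j = i\\ 0 & j \ne i\end{cases}$$

The claim "$v_1 = b$" is the claim $f(1) = b$, which holds if and only if

$$q(x) = \frac{f(x) - b}{x - 1}$$

is a polynomial, i.e. $(x-1)$ divides $f(x) - b$. The verifier picks a random challenge $s$ (interactively, or via Fiat-Shamir) and checks

$$f(s) - b = q(s)\,(s - 1).$$

By the Schwartz-Zippel lemma, two distinct low-degree polynomials agree at a random point only with negligible probability, so a passing check convinces the verifier with high probability.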
Just to finish with the tree structure: I'm not going to get too much into the details, but the structure is basically divided into three parts. Forget about one of them; it's only used to compute the commitment, and I'm not going to cover it. The idea is that you have a first part which is only branching nodes: you can't have, like you can in the MPT, at least in theory, an extension node stuffed in the middle of all those branching nodes. That is no longer possible. Then you have a kind of extension node, and that corresponds to the stem. The stem, I forgot to say, is the first 31 bytes of that hash of the address and the first 31 bytes of the offset, so it's a 31-byte value, and the last byte selects the position within the group. That's what I was saying: the values are grouped in groups of 256.

Because the tree is fairly shallow, a given group will not share more than maybe four to six prefix bytes with any other group in the tree. So, in order not to have branching nodes with only one child all the way along those 31 bytes that we call the stem, we create this extension node that just contains the stem. And that's quite interesting, because unlike in the MPT that we use in Ethereum today, if for whatever reason you find yourself inserting into the tree and you end up having to insert at an extension node, you will of course have to create intermediate branching nodes, but the commitment (the equivalent of the hash) of the existing node doesn't change. In the MPT you would have to break the extension into two pieces, so the node that used to be there would get its hash updated. Not so here, which is quite interesting, because you don't have to recompute the commitment of the node that gets a new sibling.
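As a rough sketch of the two node kinds just described, loosely patterned on go-verkle's layout, with simplified stand-in types:

```go
// Point is a placeholder for an elliptic-curve commitment.
type Point struct{}

type VerkleNode interface{}

// InternalNode: pure branching, 256-wide.
type InternalNode struct {
	children   [256]VerkleNode // indexed by the next byte of the stem
	commitment Point
}

// LeafNode plays the role of the extension node: it stores the full
// 31-byte stem once, plus the group of up to 256 values under it.
// Inserting a new sibling elsewhere never forces this node's commitment
// to be recomputed, unlike splitting an MPT extension node.
type LeafNode struct {
	stem       [31]byte    // hash(address, offset-prefix)[:31]
	values     [256][]byte // indexed by the last byte of the tree key
	commitment Point
}
```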
Trying not to spend too much time on this either: why do we want Verkle trees? Basically, because the proofs connected to the tree's use get smaller, which makes stateless clients and light clients more usable, or at least more practical. You can also enable other features: I was talking about the Portal Network, and you could imagine getting the data off the network in real time, as you need it, and verifying the proofs. And another reason: this whole, very complicated structure was designed with several intents in mind, and one of them is to be ZK-friendly. All the values are considered to be field elements, and that plays along more nicely than the MPT currently does.

Now let's talk about the conversion and the fork itself. There's an item on the roadmap called the Verge. After the Merge we had, I'm trying to remember the name of the 4844 one, the Surge I think; the next one is called the Verge, and it's about translating an MPT into a Verkle tree. The idea is also to make the proofs for all the data that gets accessed in a block widely available: either we put them inside the block, or we spread them over a separate peer-to-peer network, or we just dump them on some website. Either way, when you get the block, you can download the proof and its witness, and that is all you need to execute the block; you no longer need to sync the network to have the right state to verify whether the block is correct or not.

That leads to other interesting properties, namely that you can validate a block without being synced. And if you need to produce a block, we have this other thing called PBS, proposer-builder separation: when it's your turn to produce a block, you just ask a builder, who has the whole state, to do it for you, and you validate the block yourself. The builder is incentivized to give you the proof anyway, because otherwise you cannot propagate their block; so you verify the proof and propagate the block around the network along with the proof, so that people can attest to it.

The real difficulty here is in the conversion, because, like I said earlier, in one year there will be about one billion leaves in the tree. There's no going back, given the amount of computation and data that is involved, the amount of RAM, disk, everything. You have to get it right the first time; it's not like you can make an emergency fork to fix everything. When we tried to estimate how much data and how many resources we would need to do this, a very fast Xeon machine (from a couple of years ago, but still) took roughly 18 hours; 16 gigabytes of RAM was really the minimal requirement, honestly you need more like 32; and it dumped roughly 150 extra gigabytes. All of that has a cost and, more importantly, is a pain for the node operator to deal with.

So we've been looking at design ideas, and what we looked at were the following criteria: is it easy to sync, how much of a burden is it on the node operator, the risk of bugs, how long the conversion will take, and so on. We came up with five ideas.

The first one is called "rollup appreciation month", and it's a terrible idea, but it has a lot of good properties nonetheless, which is why I mention it. The name itself is a joke, but the idea is that for a month you only produce empty blocks: no transactions get included on mainnet, so if people want to interact with the network, they have to use rollups. Clearly that's not really acceptable, but on the plus side, execution is safe because you don't execute anything, and you don't have to synchronize anything, at least during the conversion. Not only is the UX bad, but on top of that it's likely that a lot of people would say: I'm not going to bother doing that conversion, I'll just wait for others to do it for me, and when they do, I'll download the image.

Which is exactly what the next idea is: some very powerful nodes do the conversion and then share it with everybody else. For example, you have two nodes with their own view of the chain, and one of them is very powerful. At some predetermined block height (it's a consensus value; everybody needs to agree on it), some of those nodes stop following the chain and just start converting, like we did when we wanted to estimate how long the conversion would take.
When they're done, they produce a database that represents the same view of the state, but in the verkle world, and they distribute it around the network. Everybody downloads this state and then replays the blocks as they have them, in a kind of revisionist history where they just pretend: no, actually, we have been in verkle mode since that time. They replay, and hopefully they manage to catch up fast enough that they are on par with the tip of the chain before the fork height, and at that point the revisionist history becomes canonical.

There are a few problems with that. There's a disconnect, clearly. There's replaying past history in verkle mode while the gas schedule, the gas model, changes: something that used to be cheap in the MPT may become very expensive in the verkle tree, so you could potentially create attacks. In general, I've tried this model and it's extremely brittle, so it's really not practical. There's also the question of sync: what happens if you join late, what happens if you have a problem converting, and so on. And on top of that, it uses twice the disk space, because you basically need two clients running side by side.

Okay, I need to go a bit faster on the alternative methods. There's bulk conversion, which is basically the same thing as what I just described, except everybody does their own conversion; it's really the worst of both worlds, even though you don't have to trust anybody with the conversion.

But here comes the method that we really want to cover, which is the overlay method. The idea is that you have the old, frozen MPT: at some fork height you say, I will not write into that MPT anymore. You start with a fresh overlay tree, and every write goes into that tree. When you read from the state, you first read from the verkle tree, which is the overlay tree, and if you don't find the data there, you go to the base tree to see if it's present there. And then, every block, you take n values from the MPTs (plural: the account trie and the storage tries) and move them into the verkle tree. It's a process that starts after the finalization of the fork block, so you can assume, for example, that the snapshot is there, and you can assume it will not reorg, which simplifies things a bit. And once you know that you will not need to reorg through the MPT, you can delete its internal nodes, so you can save some space. And because it's done in consensus, everybody can find out if they have a bug (the chain will split), so you can make an emergency release. It's really nice for that.
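A minimal sketch of the read path and the per-block migration step described here, with hypothetical `Tree` and `Iterator` interfaces rather than geth's actual types; the key translation from MPT keys to verkle tree keys is elided:

```go
type Tree interface {
	Get(key []byte) ([]byte, error)
	Insert(key, value []byte) error
}

type Iterator interface {
	Next() bool
	Key() []byte
	Value() []byte
}

type OverlayState struct {
	overlay Tree // fresh verkle tree: every post-fork write lands here
	base    Tree // frozen MPT: read-only after the fork block
}

func (s *OverlayState) Get(key []byte) ([]byte, error) {
	if v, err := s.overlay.Get(key); err == nil && v != nil {
		return v, nil // written (or already migrated) after the fork
	}
	return s.base.Get(key) // fall back; nil here means "not in the state"
}

// MigrateStep runs once per block: it advances the sweep iterator by n
// leaves and copies them into the verkle tree.
func (s *OverlayState) MigrateStep(sweep Iterator, n int) error {
	for i := 0; i < n && sweep.Next(); i++ {
		k := sweep.Key()
		if v, _ := s.overlay.Get(k); v != nil {
			continue // a post-fork write already clobbered this value
		}
		if err := s.overlay.Insert(k, sweep.Value()); err != nil {
			return err
		}
	}
	return nil
}
```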
Just to show how it plays out, I have a little demo. You have the blockchain, which is going to expand in one direction. The blue squares represent the values that were written to the state before the fork; the purple ones represent the values written after the fork (currently there aren't any); the red squiggles represent the internal nodes of the Merkle Patricia trie; and the green arrow represents the head of the iterator. Like I was saying, each block, n values are moved, or at least copied, from the MPT to the verkle tree, and this iterator head is going to sweep over the state.

When the fork activates, at the first fork block, you start with an empty verkle root, represented by a green triangle. Because you execute the block, you execute the transactions, values get written to the tree; they are values written after the fork, so they're purple. Then the iterator starts sweeping the state and moving n values (in this little demo n is 2), and they get copied into the verkle tree. One thing that is inaccurate on this little diagram: the hashing function is actually different, so those values would actually not be in the same order; for simplicity I keep the order the same even though it won't be.

Now imagine you want to access some data over here. You don't know where it is at first, so you go to the verkle tree first; you don't find it there, so you go to the MPT, and there you find it. If you don't find it there either, it's simply not present in the state. Next block, you keep going: the size of the overlay tree keeps increasing, and the iterator moves by another two values, which get copied. But while you were executing the block, some values got overwritten (clobbered, really, more than overwritten), so the next time you want to read such a value, you will find it in the verkle tree and never hit the Merkle tree. And when the fork block is finalized, you know you will not reorg through the MPT, so you can delete all its internal nodes. Eventually all the data has been copied from the MPT to the verkle tree, and you can delete what's left of the MPT.

I didn't go into too much detail, but for example, because you rehash every key in the tree, you need to download the pre-images; in fact I can't think of any client except Erigon that stores the pre-images. So you will need to download them. We have several options; they're probably just going to be dumped on a CDN, or maybe some people will want to propagate them through a peer-to-peer network, but that seems overkill for this.

One thing this simplified diagram does not show is that we won't actually start to sweep before the verkle fork has finalized, because we don't want to reorg through the transition, through the copy of the MPT. That's a simplification.
Another thing that is not shown: during the transition, we will make proofs and witnesses available right off the bat, right at the transition, but we will not prove what still lives in the MPT; we will only prove what got written into the verkle tree.

So, that conversion is heavy. The question is always: how many nodes will be able to follow the network, how many nodes will we lose? It depends on how many leaves we translate in one go, but we seem to be okay with 10,000. If we look at the logs, we see that just inserting the values into the tree takes about half a second, and preparing all the values takes maybe another 300 milliseconds. So let's say it's one extra second of work, and that work can actually be done in preparation. The way the Ethereum consensus algorithm works, you have more or less four seconds to produce a block, then four seconds for the rest of the network to attest to it, and then there are four seconds where the node is not really doing anything; of course it's doing something, but nothing as critical. In that four-second time frame, you could expect nodes to prepare the conversion for the next block.

Now clearly, it's still quite heavy for some machines. Raspberry Pis are not going to make it, but their beefier cousins, like the Rock 5B, might: we know it can do 5,000, we tested it; we wanted to try 10,000 but it died, and they're extremely hard to get hold of, so we have more testing to do that we haven't been able to do yet. The idea is that if we move 10,000 leaves per block, and everything goes well and none of the validators drop off, we can achieve the conversion in about 15 days, which is acceptable.
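A quick back-of-the-envelope check of that figure, using the roughly one billion leaves and the 12-second slot time mentioned earlier:

$$\frac{10^9 \text{ leaves}}{10^4 \text{ leaves/block}} = 10^5 \text{ blocks}, \qquad 10^5 \times 12\,\text{s} = 1.2\times 10^6\,\text{s} \approx 13.9 \text{ days},$$

which is where the roughly-15-day estimate comes from.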
Now I wanted to say something about the gas costs. Like I said, we have the snapshot, which is a key-value database, and we have the tree; normally the snapshot and the tree are in agreement. From what I've said until now, you could get the impression that you potentially need to read two trees, that every time you access the state you need two tree reads, and so the read and write gas costs should double. But that's not the case. The reason is that you have the snapshot, which is a key-value store, and on top of that the MPT will be frozen, so you can actually delete the MPT and just keep the snapshot. In geth, normal execution reads go straight to the snapshot, so what you could do is modify the structure of the snapshot so that it contains both the MPT values and the verkle tree values, and then you can read everything in one go. If you have a reorg, the snapshot gets invalidated and you would still have to read the verkle tree, but that's more or less what you're expected to do anyway: the gas cost corresponds to a tree read, not a snapshot read, so that's fine. Basically, what I'm trying to say is that a read is just one database read most of the time, so you do not really need to change the gas costs. And for writes, you only write to the verkle tree, one tree, so again the gas costs need not change just for the transition. I mean, the gas costs will change when you switch to Verkle trees, but you do not need to design a special gas schedule for the conversion process.

Just as a summary: the overlay method is a bit complex, but not as complex as, for example, the bulk conversion method. You still have to handle those two trees, especially in one very tricky case: I was talking about GasToken, or CryptoKitties, or any of those very large contracts. You can't translate a whole contract like that in one go, so you have to move some of its leaves at a time, and you might find yourself accessing a contract that has some of its leaves in the verkle tree and some of its leaves in the MPT. That's a bit difficult to handle, and a lot of bugs can come out of it. So far we've been fixing them, and it's been quite stable since we fixed a lot of bugs, so we should be fine; but caveat, that's where a lot of bugs could come from. It's not so bad for the computation and the RAM, and the sync is pretty simple.

I'm going to get into the sync if I still have some time. Can I still take five minutes? I realize it's almost 45 minutes in. (100%, go ahead.) I'm almost done anyway.

So, synchronization. Unfortunately we'll need two different algorithms: one for during the conversion, and one for after the conversion. But I would say it's not so bad if we don't have time to create one for the conversion period, because the conversion is only going to last two weeks to one month; if we have to full sync during that window, it's a bit of a pain, but it's not the end of the world.

During the conversion, the idea is this: we have to download the MPT, at least the snapshot, and we also have to keep up with the verkle tree. But since you have proofs, and the witnesses as well, you can start reconstructing the state as you download blocks and the witnesses that correspond to them. So you can start accumulating the overlay layer, and in the background, all you have to do is download the snapshot. The snapshot doesn't get updated, so you don't really need healing. Of course, if an attacker gives you bad data, that's a problem, but I assume people will very quickly start putting the MPT database directly on websites or distributing it via torrent, so there will be a lot of sources to get it from. Then, when you have downloaded the entire MPT snapshot, all you have to do is merge down and get back to the view you had before. So we have an idea of how to make sync work during the conversion.

Either you do this, or you basically implement snap sync for verkle.
In fact, what I'm saying here is theoretical; we have just designed it, we haven't implemented it yet. But there is an implementation of verkle snap sync by Tanishq from Nethermind that we haven't tested on the testnet yet. That should happen next week, so next week I'll be able to give more information about it.

After the conversion, either we use the snap sync that Nethermind has worked on, or (and I apologize, I realize my picture is actually very hard to understand) you accumulate the data from the proofs and then backfill the data you're missing, either from historical proofs or by downloading the historical data snap-sync-style. The advantage is that you do not really need to heal: every time you get a proof, you get an update to the values of the upper nodes, so you don't have to go back and forth with the healing phase; you can find what is out of date, and in fact you can even update the values directly from the proof. It really simplifies the healing phase.

All right, just trying to wrap up. There was one more method; it's called state expiry. The idea is that we don't copy anything from the MPT to the verkle tree, and in the end people come up with a state expiry scheme. But like I said, no one has worked on state expiry for a while, so it's kicking the can down the road, and in the meantime people have two databases on their disks, which is really not nice. And we don't get all the nice features of Verkle trees, because we still have the MPT: we don't get the ZK-friendliness, we don't get the proofs, unless we want to do MPT proofs, which are huge. So this is the backup solution; we don't want to do it.

And to finish: where are we at? We have acceptable performance: executing blocks is on par with what we get for the MPT on mainnet, which is really nice. We have a working testnet that we haven't really announced yet, because we're still doing some tests, but it's been running for almost a week now. We have a shadow fork that we've started working on, hopefully happening before the end of the year. And we already have two execution clients and two consensus clients supporting verkle. Things we're going to work on next: deciding whether we want to put the proofs in blocks; checking that it's going to work with Cancun; seeing whether L2s are interested in adopting Verkle trees (probably not, but we need to have that conversation); and our proving and verification speed is not practical yet, it's one to two seconds, so we need to improve that. And with that, that's pretty much it. I encourage you to look at verkle.info
and visit the testnet landing page, where you can get all the information you need to join the testnet if you're interested.

Super cool, thank you so much. I know Stephen already had a question like five minutes in, so you still have time for questions. Actually, before questions: "Guillaume", is that right? (Yep, Guillaume.) Okay, cool. All right, Stephen, you're up.

Right, so first question. I want to talk a little bit about the pre-image dissemination. The pre-images are still being updated all the way up to that fork point, and the translation to the verkle tree doesn't start until that fork point is finalized. Is there any concern that block production might stall because you don't actually have the pre-images of the first things you need to translate in those coming blocks? How is that thought through and handled?

Yeah, that is a concern indeed. I kind of skimmed over that, because there was a lot to cram into those 45 minutes, but what we do is more than wait for the fork block to be finalized before we start the conversion: we're going to give it maybe a week. So it will be a week with two trees and no copying happening. The reason is that what I was describing about the offline conversion method, we don't want to do for the state, but we can do it for the pre-images, because you can check the result. Two things are really interesting here. First, the current pre-image database is something like 78 gigabytes, and by the time we get there it's probably going to be 100 gigabytes, but out of those, there are maybe four gigabytes that are really useful. So what we would do is have some nodes stop at the finalization block of the fork block, iterate over the snapshot, resolve every pre-image in their database, and produce a file with the pre-images in exactly that order. The reason the order matters is that currently, at least in geth, the pre-images are stored in hash order, which means you're basically jumping all over the database looking for your pre-images: you're thrashing. The goal is to have one very powerful node do this thrashing for everyone else and then distribute the file. It's a six-gigabyte file, okay, make it ten. People can download it, and since iterating the snapshot takes about two hours, you can find out very quickly whether the file you've been given is correct. And the reason we have this one-week buffer is that if we find out a lot of nodes are reporting "I don't have the right pre-images", "I don't know where to download them", and so on, we can make an emergency release to delay the fork by a week while we figure it out. So it's a bit risky, but compared to doing the offline conversion of the state, distributing 150 gigabytes of data and asking everyone to replay blocks on top of it, it's much safer.

That makes a lot of sense, thanks.
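A sketch of that verification pass in Go. The snapshot iterator interface and the fixed 20-byte address record format of the pre-image file are assumptions made for illustration:

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"io"

	"golang.org/x/crypto/sha3"
)

// KeyIterator is a hypothetical view of the flat snapshot, yielding
// account hashes in the same (hash) order as the downloaded file.
type KeyIterator interface {
	Next() bool
	Key() []byte // 32-byte keccak hash of an address
}

// verifyPreimages walks the snapshot and the pre-image file in lockstep:
// each 20-byte address must hash back to the corresponding snapshot key.
func verifyPreimages(snapshot KeyIterator, preimages io.Reader) error {
	r := bufio.NewReader(preimages)
	for snapshot.Next() {
		addr := make([]byte, 20)
		if _, err := io.ReadFull(r, addr); err != nil {
			return fmt.Errorf("preimage file too short: %w", err)
		}
		h := sha3.NewLegacyKeccak256()
		h.Write(addr)
		if !bytes.Equal(h.Sum(nil), snapshot.Key()) {
			return fmt.Errorf("preimage mismatch at key %x", snapshot.Key())
		}
	}
	return nil
}
```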
Then the second question I had, and then I'll let other people talk, is not even necessarily about verkle per se. Early on, path-based storage was talked about a little bit along with snapshots, and I'm very interested in whether the path-based storage scheme, which I think still somehow fits in with verkle, is actually going to deprecate the snapshots, since with path-based you can do that O(1) lookup directly into the MPT. When path-based first dropped, I was really looking into how it was set up, and it didn't seem like it was directly deprecating snapshots. I don't know if you have any insight into that; I know it's a bit of a deviation from the topic.

No, it's actually a very good question. Just as geth is transitioning from hash-based to path-based storage, I'm also transitioning the whole verkle design from hash-based to path-based. Indeed, a lot of my thinking has been built on the snapshot, but the goal, according to Gary at least, is to replace the snapshot with the path-based approach, because it's the same thing; honestly, a lot of the code is similar: a lot of the storage code, the layering, there are a lot of similarities. It's true I've been mentioning the snapshot a lot, but by the time the fork happens, in a year or so, we will probably be looking at a path-based version of what I just presented. Oh, and to answer your question more directly: are we planning to get rid of the snapshot? Yes, but it's not going to happen immediately.

I was going to ask: you mentioned at the beginning that you're making this change with the offsets where one byte stays the same and is not part of the hash, so that you get a bit more locality when you're looking up keys that are in a similar storage area. Is that going to coincide with a change in gas accounting as well, charging less for accessing, say, all 256 slots in a group? And if so, I assume that would be after completing the conversion?

Yeah, that's also a good question; I didn't cover it in the slides. There is indeed a gas model for this. Because you have these groups, opening a group implies paying some gas, and then accessing an individual leaf implies paying more gas; but once the group has been opened, accessing each individual leaf is much cheaper than opening a new group. So yes, there's a gas model change that corresponds to this new reality. Is it going to be implemented right off the bat at the conversion? Yes, because we will have a verkle tree. And accessing the MPT means accessing, well, I was going to say the snapshot again, so let's say this flat database; so the discrepancy between the two gas models is not going to be too much of a problem. I was thinking of something else too, but I forgot; if you have more questions about it, let me know.
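The shape of that access-cost model, as a Go sketch; the constant values here are placeholders chosen for illustration, not the numbers from the eventual EIP:

```go
// Hypothetical witness-gas constants: opening a group is expensive,
// touching additional leaves inside an already-open group is cheap.
const (
	branchAccessGas uint64 = 1900 // paid once when a 256-value group is first opened
	chunkAccessGas  uint64 = 200  // paid per individual leaf inside the group
)

// accessGas charges for touching `leaves` values that share one stem.
func accessGas(groupAlreadyOpen bool, leaves uint64) uint64 {
	gas := leaves * chunkAccessGas
	if !groupAlreadyOpen {
		gas += branchAccessGas // opening the group dominates the cost
	}
	return gas
}
```

Under a model of this shape, touching all 256 slots of one group costs one branch charge plus 256 chunk charges, far less than 256 separate group openings.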
I have a quick one; Patrick here. My question is related to fee pricing for different types of modifications to the tree. With these tree structures you can modify arbitrary values, be it smart contracts or whatever, and as the tree gets bigger, certain modifications can be more expensive than others depending on the process. I'm wondering whether, through this whole redesign of how you manage the trees, you've discussed trying to couple gas costs to the actual cost of the modification at the time it's made, especially since, for smaller smart contracts, a change to one part may be very different from a change to another.

Sorry, I got a bit confused; there was a slowdown in the audio. I'm happy to answer, but could someone repeat? (I'll try again later; someone else go first.)

I have a question. Part of the benefit of the verkle version is that you can increase the branching factor without increasing the proof size. Why can't you increase it until the tree is completely flat? What ends up being your bottleneck?

That's a good question. Well, if it's completely flat, then, let me think; I thought about this some time ago, I just don't remember what conclusion I came to. There are two things, for sure. The problem is that the size of your branch is also the degree of your polynomial, and this polynomial is encoded against elliptic curve points. Right now you have 256 points that you encode your polynomial against. If the degree of your polynomial is one billion, you need one billion elliptic curve points in RAM or in your database. That's already a lot for something you can optimize by having a tree. And there was something else; if I don't remember it before the end of the talk, I'll follow up, but there was another reason I just can't remember right now.

Let me try again; is this any better, or am I still breaking up? (Much better.) Cool. Thanks again for stopping by to present; I'm excited to watch the recording and hear what I sound like. My question is about whether, as you've been working on new trie or tree structures, any thought has gone into pricing lookups or modifications of state based on the complexity of the tree: based on where you're inserting, how many intermediate nodes there are, what the depth is, or whether that's just not really a concern anymore with verkle.

So, because the branching factor is much larger, the tree is shallower. I'm not the one who works on the gas model, that would be Vitalik and co., I assume, but my understanding is that they were considering an average depth of four, maybe six. So no, I don't think so, because it would even out; we still have the hashing, we still have everything, so it would even out, and it makes little sense to design a very specific gas model around depth. Also, from the user's perspective, it's quite difficult:
you can't expect the user to know beforehand what the tree is going to look like. So I would say no, we haven't really given that much thought.

Makes sense, thanks. I was going to ask one more, and I assume the answer is fairly simple. For the overlay strategy, when you have data that's both in the verkle tree and potentially falling back to the Merkle tree, how do you handle deletes in that case? Do you have some sort of tombstone?

So, we don't delete anymore, even for storage keys: storage keys we overwrite with zeros, and we no longer delete. That's not even a verkle tree thing; it's happening in the next fork, Cancun. Self-destruct is being removed, mostly because it hasn't served its purpose: no one is really self-destructing contracts; they're doing a lot of weird shenanigans where they create and delete a million times, but they don't really delete permanently. And we have a fairly painful history with self-destruct, the DAO hack among other things, so we'd rather get rid of it. It hasn't delivered on its promise, basically.

I'm still bothered, because I know the answer to that earlier question, by Daniel, was it? David, sorry. So, to add to my answer: there's the size of the evaluation domain, and there's also computing the polynomial, because the first time you insert, you need to compute the entire polynomial, and if it's a million leaves, that's a lot. But now we insert differentially, so it's less of a problem; you don't have to recompute the entire polynomial every time. It's basically the same argument as for a Merkle tree: it's a kind of divide and conquer. But there was another very good reason; I'm really annoyed I can't remember it right now, it's going to bother me all night. I'll get there.

I thought that generating the commitment of a node was O(branching factor). Is that not the case?

No. I mean, if it's a new node and you're filling everything in one go, yes. But because it's a polynomial, you can just do a diff: adding to the polynomial is basically just adding a number at the right degree. Actually, we don't work in coefficient form, we work in evaluation form, but the idea is the same; it's just about adding the difference. So you can update the commitment without touching every single item in the polynomial. We call that a diff update: you can update the commitment in basically O(1) time. That's the other advantage of polynomial commitments over hashes.
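The diff update is easiest to see on a Pedersen-style vector commitment (a simplified view of the evaluation-form scheme mentioned here), with fixed curve points $G_i$:

$$C = \sum_{i=0}^{255} v_i\,G_i \quad\Longrightarrow\quad C' = C + (v_j' - v_j)\,G_j,$$

one scalar multiplication and one point addition when value $j$ changes, regardless of the other 255 values.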
What is used as the source of randomness for the Pedersen commitments?

The randomness, oh, you mean for the basis points. So we basically have a string, and we hash it. I'm trying to remember the exact details in the code: you take that number, you add one, and you apply it to, no, sorry, never mind, I don't remember the exact details. But the idea is that it's basically a hash of a string, and because the discrete log problem cannot be solved, you can't calculate the relationship, the common factor, between the resulting points. I haven't touched that code in a while, so I'd need to go back and check exactly how it's done, but it's basically a hash: we have a string that says something like "verkle tree October 21", we hash it, and that hash is our source of randomness.

Cool. Does anybody else have any questions? All right, I'll take that as a no. I'm going to try one more time: "Guillaume"? (Yeah, very good.) Okay, great. Thank you so much for coming, it was really fantastic; we really appreciate you taking the time. And feel free to use the Uber Eats code if you haven't already.

Cool, thank you. And yes, those two open questions are going to bother me; I'll get back to you on Twitter.

Awesome, thank you so much. We'll see you, take care. Bye.