For those who don't know me, my name is Matt Morehouse. For the past year and a half I've been working to improve the security and stability of the Lightning Network, and one of the things I worked on last year, mostly, was fuzz testing the Lightning Network. That's what I'd like to talk about today.

If you don't know what fuzzing is, you should: it's a very valuable testing tool that every software engineer should have in their toolkit. I'm not going to go into details about how it works, but there are plenty of resources out there, so go take a look and start using it.

As with any software, the reason we want to fuzz the Lightning Network is to find bugs, and Lightning Network bugs are especially bad. As with any software, bugs make for a bad user experience, but with lightning specifically there's money at stake: if you can't send payments to buy your groceries or whatever, you're probably not going to want to use lightning anymore. And things can be worse than that. In the traditional payment system, credit cards for example, if there's a bug in the software and you get charged for something you didn't actually buy, you call up customer service, tell them what happened, and they can fix things on the back end. With lightning there's no fixing things: if your node glitches out and sends twice as much as you wanted it to, or burns a whole bunch of money as fees, you're out of luck. So it's extra important that we make sure these bugs don't exist, and that we fix them when we find them.

In addition, lightning has the unfortunate property that nodes need to be online in order to protect your funds. If a node is not online with some amount of regularity, your counterparty can broadcast a revoked commitment, and you won't notice in time before they can permanently claim those funds, so they could potentially steal your entire channel balance. This can also be done using HTLCs: if there are any HTLCs in flight when your node goes offline, then once the timelocks on those HTLCs expire, the upstream node can claim the HTLC via the refund path and the downstream node can claim it via the preimage, so they can steal HTLCs that way if your node is offline for too long. So for lightning, basically any DoS vector becomes a fund-stealing vector, and it's especially important to fix any sort of crash.

That's why fuzzing is so important: we need to find these bugs and fix them. I have a couple of examples of bugs that fuzzing has found for lightning specifically.

First, we have the CLN invoice parsing bugs that existed prior to 23.11. These were all bad things that happened when you tried to pay certain invoices: your node might crash, it might read uninitialized memory, you might have a buffer overflow or other undefined behavior. All of these bugs were found by a single fuzz test that Niklas Gögge wrote and that I helped to improve. It's a great fuzz test and it found a lot of bugs (a rough sketch of the shape of such a harness follows below). This particular set of bugs is not super concerning from a security point of view, because if you pay one of these buggy invoices and your node crashes, then as long as you can restart your node and make sure you don't keep paying invoices from that person, your funds aren't really at risk. In some unusual circumstance you could have funds at risk, say your node is hosted remotely somewhere and you aren't able to get access to restart it, but I think by and large this is more of a bad user experience and a griefing vector than a major security risk.
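To make that concrete, here is roughly the shape such a harness takes. This is a sketch, not any project's actual test, and it's written in Go against lnd's BOLT 11 decoder rather than CLN's C code; the package paths are real, but the target name and seed string are made up:

    package zpay32fuzz

    import (
        "testing"

        "github.com/btcsuite/btcd/chaincfg"
        "github.com/lightningnetwork/lnd/zpay32"
    )

    // FuzzInvoiceDecode throws arbitrary strings at the BOLT 11 decoder.
    // Decoding errors are expected and fine; what the fuzzer is hunting for
    // is crashes, panics, hangs, or (in C code) memory errors flagged by
    // sanitizers.
    func FuzzInvoiceDecode(f *testing.F) {
        // Seed corpus entry; real harnesses seed with valid invoices.
        f.Add("lnbc1notarealinvoice")
        f.Fuzz(func(t *testing.T, invoice string) {
            _, _ = zpay32.Decode(invoice, &chaincfg.MainNetParams)
        })
    }

Running "go test -fuzz=FuzzInvoiceDecode" lets the engine mutate inputs starting from the seeds, and anything that crashes is saved as a reproducer.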
Other bugs are a lot more serious. Yesterday I disclosed the lnd "onion bomb", a vulnerability in lnd prior to 0.17 where any lnd node could be instantly crashed by sending it a malicious onion packet. The source of the attack was trivially concealed, because it took place via onion routing: the attacker didn't need to connect directly to the victim at all, so you'd have no idea where the attack was coming from. As long as there was a path from the attacker to the victim within the lightning network, they could crash your node. And this was discovered by a very simple fuzz test; I think it was less than a minute of actual fuzzing (a toy sketch of this bug class appears a bit further down). These are the sorts of things that are really scary. This bug could have taken out the majority of the Lightning Network and kept it offline indefinitely until everyone upgraded their nodes, and while everyone was offline, funds could have been stolen too. These are the sorts of things we don't want to happen on the Lightning Network, and we want all of these bugs found and fixed as fast as possible.

So with all that said, how are we doing? How is lightning fuzzing going right now? To be honest, I don't think it's going very well. There's a lot that could be improved, there's not a whole lot of buy-in from maintainers, and the past couple of years have been pretty stagnant outside of a couple of contributors. I'll talk a little bit about each implementation specifically.

lnd is the most popular lightning node currently; I've heard 80 to 90% of the network runs it. When I started contributing to lnd a couple of years ago, they had 58 basic fuzz tests that encoded and decoded various lightning messages, and as far as I'm aware those tests had found some important bugs in the past and allowed them to be fixed before they became problems. Unfortunately, when I started looking at lnd, none of the fuzz tests actually worked; they were all broken. What had happened was that lnd was relying on a fuzzing library that was no longer maintained and no longer compatible with newer versions of Go, so none of the tests would run anymore. They also didn't have a public corpus. In fuzzing terms, the corpus is the set of inputs the fuzzer has found that it considers interesting, because they exercised new coverage or found a bug in the past. Not having a public corpus is not the end of the world, but making it public helps outside contributors a lot: instead of starting from scratch, they can grab the best known corpus and start fuzzing from that, which can save them weeks or months of fuzzing time. It's just good practice if you want the best possible fuzzing coverage and if you want to encourage contributions from outside.

So last year I did my best to improve the situation with lnd. I fixed the fuzz tests that were broken, and I set up fuzz regression tests in CI along with a public corpus for the regression tests to run on. This is basically to make sure the fuzz tests don't break again; if they do, the CI turns red and the problem can be fixed quickly.
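Coming back to the onion bomb for a moment, here is a toy illustration of that general bug class. This is hypothetical code, not lnd's actual onion parsing: the pattern is a decoder that trusts an attacker-controlled length field, which a fuzzer trips over almost immediately.

    package onionfuzz

    import (
        "bytes"
        "encoding/binary"
        "testing"
    )

    // decodePayload is a hypothetical stand-in for the kind of packet parsing
    // where this bug class lives: it reads a length prefix and allocates that
    // much memory before validating it against any sane bound.
    func decodePayload(b []byte) ([]byte, error) {
        r := bytes.NewReader(b)
        var length uint64
        if err := binary.Read(r, binary.BigEndian, &length); err != nil {
            return nil, err
        }
        // BUG (deliberate, for illustration): a tiny input can request a huge
        // allocation and take the whole process down with an out-of-memory.
        buf := make([]byte, length)
        _, err := r.Read(buf)
        return buf, err
    }

    // FuzzDecodePayload finds inputs with enormous length prefixes within
    // seconds, consistent with the "under a minute of fuzzing" figure above.
    func FuzzDecodePayload(f *testing.F) {
        f.Fuzz(func(t *testing.T, data []byte) {
            _, _ = decodePayload(data)
        })
    }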
Beyond fixing the broken tests, I also wrote a lot of new fuzz tests; there are almost double the number of fuzz tests today compared to a couple of years ago. But unfortunately no one else has really contributed to lnd fuzzing in a major way, so that's not great.

CLN has been very similar. A couple of years ago they had 11 basic fuzz tests, and four of them were useless, either because they crashed immediately when you tried to run them or because they weren't actually fuzzing what they were supposed to. Nobody had measured the code coverage of those fuzz tests, so they didn't realize they weren't executing the code they thought they were. CLN also didn't have a public corpus. I fixed up those broken fuzz tests, set up CI and a public corpus for CLN too, so hopefully some of those things won't happen again, and wrote a whole bunch of new fuzz tests; there are around 70 today. But again, no one else has really helped much. Niklas Gögge wrote that invoice fuzzing test, and it's a great one, but outside of that I don't really know of any new fuzz tests written by anybody.

CLN has also had additional issues with maintaining the fuzz tests. On one occasion, while changing how CI was configured, they accidentally disabled UBSan on all the fuzz tests. UBSan is the undefined behavior sanitizer; it finds a whole class of bugs specific to C programs, and it's important to run it while fuzzing. There were months where UBSan was disabled and nobody knew, and it actually caused some bugs to be missed: when Niklas submitted his invoice parsing fuzzer, it found several bugs when run with UBSan, which would have been immediately obvious if UBSan had been running in CI, but since it wasn't, those bugs went undetected. On another occasion, one of the fuzz regression tests started to fail and the maintainer got frustrated; instead of trying to fix the test, they decided to disable all the fuzz regression tests in CI. That's not great, because the regression tests are there for a reason: without them, reintroduced bugs aren't found as quickly as they normally would be. I explained this and got it reverted, but for both of these issues I don't know what would have happened if I hadn't been there to notice them and try to fix them. I don't know whether the maintainers would eventually have figured it out on their own, or whether it would have just gone on and the fuzz tests would have bit-rotted again until they no longer worked. So this is a concern I have, not just with CLN but with all the implementations in general: there's not a whole lot of buy-in from the maintainers to really maintain and improve the fuzz tests, and it seems to be driven largely by outside contributions.

Moving on to Eclair: I haven't contributed to Eclair, but I think there's a lot that could be done there, because they really don't have any fuzz tests. They have a few tests they call "fuzzy" tests, which basically have some RNG inserted to randomize some aspect of the test, but they don't have any modern fuzz tests that use a coverage-driven, mutating fuzzing engine. Now, Eclair is written in Scala, which doesn't have the best infrastructure for fuzzing, but there are tools like Jazzer that can be used to fuzz JVM-based languages.
So I think if someone were to look into that, there might be a lot of bugs to be found.

I also haven't contributed to LDK, but LDK has been doing a pretty good job on their own. They have 60 basic fuzz tests, and then they have three state machine fuzz tests that are more advanced than anything any other implementation is really doing. Instead of a normal fuzz test, where you generate a random input, give it to some function, and see what the function does, a state machine fuzz test generates a random sequence of events and then executes those events on the node, putting the node through different states, making sure it doesn't crash, that things remain consistent, and that horrible things like losing your funds don't happen. I've been told these specific fuzz tests have found some important issues in the past, so it would be cool to see more of this with other implementations too (there's a rough sketch of the idea below).

LDK also runs their fuzz tests in CI, which they set up on their own, and they actually have continuous fuzzing being done at Chaincode Labs. Continuous fuzzing means that as the code changes, as commits are made, the latest code is constantly fuzzed in the background. This is really good because it finds new bugs much quicker than only running your fuzz tests intermittently, and some bugs take a lot of CPU time to find, so if the fuzzers are always running in the background you're more likely to find those rare bugs. Of course Chaincode Labs maintains their own private corpora, which you need in order to do continuous fuzzing, and Matt Corallo said he also maintains corpora himself and will run the fuzz tests for an extended period before each release and update his corpus. Again, I'd like to see public corpora, but it's not the worst thing in the world, and hopefully we can work on that in the future. That said, with LDK there also haven't really been any major contributions the past couple of years: just basic maintenance of the existing fuzz tests, no major improvements, no major new fuzz targets added.

So in summary, the state is not that great. There's not a whole lot of investment by node maintainers the past couple of years, some maintainers don't really know much about fuzzing or its best practices, and there are very few outside contributors; it's really just me and Niklas, as far as I've noticed. So we really need help. If someone's interested in this, there's definitely a lot you can do to contribute, and I have a few ideas, some easier than others.

I'd like to see continuous fuzzing for all the implementations. LDK has their own thing, but I'd like to see it for lnd and CLN too. This is just good fuzzing practice: you find bugs quicker, and it ensures your fuzz tests are actually used and constantly running. An important aspect of this is coverage visualization, so that you can tell which parts of the code each fuzz test is executing and make sure things are working as they should, but it also gives you ideas for how to improve a fuzz test to reach new code, or how to write a new fuzz test for a part of the code that isn't being covered. These are all things that are very useful for fuzz testing.
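To sketch what "a random sequence of events" looks like in a fuzz target: this is not LDK's actual harness (theirs is Rust and drives a mostly-real node), just a deliberately tiny Go illustration with a made-up channel struct. The fuzz input is interpreted as a list of operations, and invariants are checked after every step.

    package statefuzz

    import "testing"

    // channel is a toy, hypothetical stand-in for per-channel state; a real
    // harness would drive an actual node implementation instead.
    type channel struct {
        opened    bool
        localMsat uint64
        htlcs     int
    }

    // step applies one event chosen by the fuzz input.
    func (c *channel) step(op byte, amt uint64) {
        switch op % 4 {
        case 0: // open channel
            if !c.opened {
                c.opened = true
                c.localMsat = amt
            }
        case 1: // add HTLC
            if c.opened && amt <= c.localMsat {
                c.localMsat -= amt
                c.htlcs++
            }
        case 2: // settle HTLC
            if c.htlcs > 0 {
                c.htlcs--
            }
        case 3: // close channel
            c.opened = false
            c.htlcs = 0
            c.localMsat = 0
        }
    }

    // FuzzChannelStateMachine decodes the input as a sequence of (op, amount)
    // events, applies them in order, and asserts invariants after each one.
    // The point is to catch weird states, not just crashes in one function.
    func FuzzChannelStateMachine(f *testing.F) {
        f.Fuzz(func(t *testing.T, data []byte) {
            var c channel
            for i := 0; i+8 < len(data); i += 9 {
                var amt uint64
                for j := 0; j < 8; j++ {
                    amt = amt<<8 | uint64(data[i+1+j])
                }
                c.step(data[i], amt%10_000_000)
                if !c.opened && (c.htlcs != 0 || c.localMsat != 0) {
                    t.Fatalf("state left inconsistent: %+v", c)
                }
            }
        })
    }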
On the continuous fuzzing front, if someone's interested, there's probably a lot of work that could be done to set up some sort of continuous fuzzing solution for the different implementations.

Again, Eclair has no fuzz tests at all, so if someone is interested in looking at Jazzer or something like it and setting up the infrastructure needed to start running fuzz tests, I think that could be very fertile ground for finding lots of new bugs in Eclair.

I'd also like to see more differential fuzzing. What that means is you generate an input, give the same input to multiple different implementations, and then check the outputs; if any of the outputs differ, there's probably a bug in one of the implementations. Lightning is very well suited to this sort of fuzzing because there are four different implementations, and things like invoices or commitments are things all the nodes need to agree on; if they disagree, there are problems. So invoice differential fuzzing would look like a fuzz test that generates a random invoice, gives it to all the different implementations to parse, and then compares how they interpret the invoice. If there are differences, there's probably a bug in at least one of the implementations (a sketch of such a harness appears a bit further down). Likewise for commitment transactions: given a set of parameters for a channel and a set of HTLCs added or removed since the previous commitment, every implementation needs to generate exactly the same new commitment, and if they don't, they can no longer interoperate and force closes will happen. Force closes are a major complaint on the lightning network, so anything we can do to reduce them is going to improve usability and make operators a lot happier.

This one is lnd-specific: lnd actually uses a third-party library for all of its elliptic curve operations. It's written for another cryptocurrency, Decred, and I don't really know whether I trust it; I'd never heard of Decred before. I'd like to see differential fuzzing between that library and something like libsecp256k1 that is more widely used and trusted, to see if there are any differences between them, because if there are, that could potentially be a major problem.

More ideas: I'd like to see more state machine fuzzing like LDK has. Lightning has all kinds of state machines that could be fuzzed. When you open a new channel to someone, there's a whole bunch of states involved: you're negotiating parameters for the channel, creating the funding transaction, exchanging commitment signatures, waiting for the funding transaction to be published, and disconnects can happen while all of that is going on, which changes the state, and then there's reconnect logic and so on. These are all things that could be fuzzed, and we want to make sure you can't cause a crash by putting a node into a weird state. And with the new dual funding protocol there's a whole new state machine that's even more complicated and could be fuzzed.

As mentioned before, commitments need to agree between implementations, but there's also a whole bunch of state maintained within each implementation to update commitments as HTLCs are added or removed and as fees are updated, and in the future there's going to be dynamic commitments, where you can actually change the entire commitment transaction format. As all of these things are happening, we need to make sure nothing crashes and that you don't lose funds.
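Here is roughly what that invoice differential harness could look like. Again, a sketch and not an existing test: one side is lnd's real decoder, the other side is a placeholder that would need to be wired up to a second implementation, for example over RPC.

    package diffuzz

    import (
        "errors"
        "testing"

        "github.com/btcsuite/btcd/chaincfg"
        "github.com/lightningnetwork/lnd/zpay32"
    )

    var errNotWired = errors.New("second implementation not wired up")

    // decodeWithOtherImpl is hypothetical: a real harness would hand the same
    // invoice string to CLN, Eclair, or LDK (via RPC, bindings, or a helper
    // binary) and return the amount it parsed.
    func decodeWithOtherImpl(invoice string) (amountMsat uint64, err error) {
        return 0, errNotWired
    }

    // FuzzInvoiceDifferential gives the same random invoice to two parsers and
    // fails if they disagree on validity or on the encoded amount.
    func FuzzInvoiceDifferential(f *testing.F) {
        f.Fuzz(func(t *testing.T, invoice string) {
            inv, lndErr := zpay32.Decode(invoice, &chaincfg.MainNetParams)

            otherAmt, otherErr := decodeWithOtherImpl(invoice)
            if errors.Is(otherErr, errNotWired) {
                t.Skip("differential target not configured")
            }

            if (lndErr == nil) != (otherErr == nil) {
                t.Fatalf("parsers disagree on validity of %q", invoice)
            }
            if lndErr == nil && inv.MilliSat != nil && uint64(*inv.MilliSat) != otherAmt {
                t.Fatalf("parsers disagree on amount for %q", invoice)
            }
        })
    }

The same structure extends to commitment transactions: build the commitment from the same channel parameters with each implementation and compare the resulting transactions byte for byte.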
Splicing also has a fairly complex state machine, because while the splice is in flight, between the point where the splice transaction is broadcast and the point where it has received some number of confirmations, your node actually has to maintain two different states. It has to remember the state before the splice and the state after the splice, and any HTLC that is offered to your node needs to be valid in both of those states, otherwise it has to be rejected. So there's a lot of extra state to keep track of with splicing, and I think there could be bugs hiding in that.

On-chain resolution has all kinds of states as well. It starts when your node detects a force close, a commitment that was broadcast to the mempool. Maybe that commitment confirms, maybe something else confirms; your node doesn't really know. Maybe your node wants to fee-bump that commitment because deadlines for HTLCs are coming up, so it might spend an anchor output on that commitment, and then as the deadlines approach it might want to fee-bump again, and all of these are different states. There are more states after the commitment is actually confirmed, because now there are all these timelocks that need to expire before your node can sweep funds, whether that's second-stage HTLC transactions that need to be broadcast or your local balance that can be swept after a timelock. Some nodes do more complicated things, like trying to batch multiple HTLCs into the same transaction, so those nodes might actually wait a little while after a timelock expires so they can batch multiple HTLCs together. All of these things add complexity that can hide bugs.

There's also state involved with the network graph: as your node receives gossip about new nodes, new channels, and channel updates, all of those things change state on your node. And of course watchtowers: a watchtower needs to stay in sync with your node or it's going to be useless, so as your node is streaming events to that watchtower you don't want it to crash or anything wonky to happen, because then the watchtower won't serve its purpose anymore.

So those are some of the ideas I have for things we could fuzz more in the future, but we really need more contributors, and it would be good to have more buy-in from maintainers; if they could start working on fuzz tests for their own stuff, I think that would be great. But that's kind of the state of where we're at, and I think we have time for questions if you have any.

Fabian: I'm interested in whether you've gotten feedback from the maintainers on why there isn't more buy-in from them, or whether you have ideas, because that seems like the much bigger lever to pull. Outside contributors who don't really know the code base would need quite a lot of startup time, and these lightning implementations are mostly VC-funded companies that could probably fund someone or pay someone to do this. So where do you think the gap is; what's missing, basically?

Matt: It's hard to know exactly. My theory is that for a long time lightning has not been very security focused; it's all about the new features being developed, and so some of this stuff gets pushed to the side. It makes sense to some degree, because you've got four implementations and they're all competing with each other.
CLN has dual funding and splicing now but lnd doesn't, so lnd has to work on that; or LDK is working on the next big thing, blinded paths and BOLT 12. I think there's a bit more emphasis on trying to reach feature parity, or to develop the next new feature that's going to get more usage on your specific implementation. And some maintainers just don't have a lot of knowledge about fuzzing either. I don't know what Eclair has done about fuzzing, but they have no fuzz tests, so maybe they don't know about the potential JVM fuzzing solutions, or maybe they don't know much about fuzzing at all. But I do agree: I think if we can get more buy-in from the maintainers themselves, it could really help improve security here. Hopefully this presentation will be public, so maybe we can get them to watch it and they'll get more motivated.

Fabian: I think education is also what I would have expected. I'm not sure how much of that you've already done; maybe you've already spent twenty hours speaking with all of them and are burned out from it. But educating them, showing them how "easy", in quotes, it could be, or how much they would actually need to invest, seems well worth the effort.

Matt: Hanadi, you're muted.

Hanadi: Sorry. What are the reasons behind maintainers' decisions to keep the fuzzing corpus private rather than public?

Matt: I don't know exactly. LDK is the only one with a private corpus, and I haven't asked them. At least for lnd and CLN, I think they just never thought about making it public; when I suggested it there was no argument from them, I was just able to create the new corpus and there was no pushback at all.

Hanadi: Okay, thanks.

Niklas: Have you talked to any of the implementations about OSS-Fuzz? Are they interested or not?

Matt: I talked with lnd about it a little. The tricky thing is that OSS-Fuzz is best suited to C and C++ code. They claim to have Go support, but it works in a very specific way: it doesn't support the new native Go fuzzing engine, and when I looked at how they have it set up, it relies on a library that isn't very well maintained anymore. So I don't know that OSS-Fuzz is the best fit everywhere. I think CLN could probably use it, at least if OSS-Fuzz would accept a submission from them, but I don't know how it would work for LDK or lnd. That said, LDK, or at least Chaincode Labs, runs their own continuous fuzzing solution for them, so it's certainly possible to do something homegrown, and we could look into that for lnd as well.

Niklas: Do you know what specifically they're running? A fork?

Matt: No, I'm not sure exactly how it works. Matt Corallo just told me Chaincode Labs runs the fuzzers.

Niklas: Okay.

Gloria: Hi, thanks for the presentation. Mostly out of curiosity: how different do you find fuzzing across the different languages and setups of the lightning implementations? For example, whether memory safety is something you're looking for while fuzzing; I imagine that changes from implementation to implementation. How much of it translates?

Matt: Memory safety issues are largely a C and C++ problem. There are classes of bugs that can happen in other languages too.
But I think that's the main reason fuzzing really took off with C and C++ initially, and a lot of the fuzzing engines for other languages are based on libFuzzer. For Rust, if you want to fuzz Rust, you need some sort of wrapper around libFuzzer that works for Rust, and that's how it worked for Go for a long time too, before Go integrated its native fuzzing engine. The Jazzer tool for the JVM is also a wrapper around libFuzzer. So a lot of the tooling is based on the C/C++ fuzzing tools, and you don't find as many memory safety bugs in other languages, but there are still good reasons to fuzz. Like the lnd onion bomb: you can still run out of memory, you can still cause crashes, you can get panics in some languages, things like that, which are equally bad, for lightning at least.

Gloria: Cool, thanks.

Fabian: I think the presentation is a great start, but I had the idea that you could turn this into a website, like the WalletScrutiny project. I'm not sure if you're aware of it, but it basically shows a bunch of checks for each wallet: is it reproducible, is it open source, is it custodial. Your presentation also basically shows a bunch of checks, things that are done and how many tests there are, so maybe that's something you could show continuously, which implementations are making progress and which ones are not, or maybe even going backwards, and point people there on an ongoing basis. This presentation might be outdated quickly if one of them improves, but having a site to point people to, for continuously seeing which lightning implementation does best in this regard, might be interesting.

Matt: That's an interesting idea. I'm not sure exactly how to keep it up to date besides manually, but maybe that would help motivate the different implementations a little.

Fabian: There would be some manual labor involved, but with WalletScrutiny the person who maintains it is funded just for that, basically. So maybe for you, or maybe for somebody else, it's an interesting project to make this more transparent.

Matt: Yeah, I'd certainly support any work in that direction.

Mark: You said that LDK has three tests which are state machine tests. I was curious whether you have a link or some information where I could learn about those: what those tests are and what the difference is between them and a normal harness.

Matt: They have a fuzz test called their full stack fuzz test; that's probably the most interesting one to look at. They set up a node, mock out some aspects of it, and then start doing things like simulating a connection to the node, a disconnect from the node, opening a channel, or sending a payment, and start mixing and matching these events. Niklas has a link to the fuzz test there; that would be the one I'd look at first.

Sebastian: Hello, thanks for the talk. I was wondering about this Decred library, versus libsecp256k1, that you mentioned. Do you know why the Go Bitcoin community reimplements everything rather than using a binding to the official secp256k1 library? I think that would make more sense.
Matt: I'm not the right person to ask; I don't know why. I understand why lnd might use it, because it existed and it was easy at the time, but there are definitely better alternatives, in my opinion.

Sebastian: Okay, thanks.

Audience member: I think I've heard, I don't quite remember, that one reason for not using the binding is that it's slower; apparently there's some overhead with cgo or something. I also don't know the specifics, but that might be why. Actually, I also have another question: since there are no coverage reports, how do you decide what to target?

Matt: I do coverage reports myself. There's no public, automated solution available right now, but I'll run coverage over all the corpora and see what's covered and what's not.

Audience member: Is that mostly the public corpora, or do you have your own that keeps growing?

Matt: Everything that's public is basically the same as what I have locally. I've run the fuzz tests enough that they've mostly stopped finding new inputs, and then I don't really run them continuously anymore. But again, that's something I'd like to see: some continuous fuzzing. Or, for example, Niklas, I know you were running some of the CLN fuzz tests on arm64 and you found a bug doing that. That's something that could be part of continuous fuzzing too, running on different platforms and in different setups.

Mike: For newer contributors, you sort of have a call for contributing and some different ideas. For folks who see some of the bugs that you've published, or that Niklas has published, and think "wow, there's a way to contribute with high impact here, I want to learn about fuzzing": obviously they could Google around and learn high-level fuzzing, but at the point where it becomes Bitcoin- or lightning-specific, where would you point those people to learn more and be able to start contributing?

Matt: Probably the easiest place to start is to get the fuzz tests that already exist running on your machine. Try different platforms; you can easily spin up machines in the cloud that run different architectures or different operating systems. Niklas found a bug recently in CLN just by running an existing fuzz test on a new platform. That gets you some familiarity with the tools, and then you can start tweaking the fuzz tests, start getting familiar with the actual implementation you're looking at, and generate coverage reports and think, hmm, it would be cool to get some coverage of this area of the code, how could I write a fuzz test to cover that? That's kind of the process I go through, and I think it's a good way to learn about this stuff too.
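For anyone doing that with Go's native fuzzing, one detail worth knowing: seed inputs checked in under testdata/fuzz/<TargetName>/ are replayed by an ordinary test run, so producing a coverage report over the corpus is just a normal coverage run. The target below is hypothetical; the go commands in the comment are standard tooling.

    package examplefuzz

    import "testing"

    // decode is a placeholder for whatever parser you're interested in.
    func decode(data []byte) error { return nil }

    // FuzzDecode is a hypothetical target. Files under testdata/fuzz/FuzzDecode/
    // plus any f.Add seeds are replayed by a plain test run, so
    //
    //   go test -run=FuzzDecode -coverprofile=cover.out ./...
    //   go tool cover -html=cover.out -o cover.html
    //
    // produces a per-line view of which code the corpus actually exercises,
    // which is what tells you where a new fuzz target might be needed.
    func FuzzDecode(f *testing.F) {
        f.Fuzz(func(t *testing.T, data []byte) {
            _ = decode(data)
        })
    }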
Mike: What's a good place to reach you? That would be interesting.

Matt: My email, mattmorehouse@gmail.com, is currently probably the easiest way to reach out; I'm not on any social media or anything like that. But I would just say I'd really like to see more buy-in among all the implementations, and I'd like to see more contributors. If you're interested in contributing and you don't know where to start, feel free to reach out and I'm happy to point you in the right direction. It would be great to get more bugs fixed and to help secure the Lightning Network.