Web3 Galaxy Brain 🌌🧠

Decentralized Databases with GUN founder Mark Nadal

2 August 2022

Summary

In this episode I’m joined by GUN founder Mark Nadal to discuss GUN.

GUN, also known as GUNDB, is a real-time, decentralized, offline-first, graph database. In practice, GUN provides a Javascript API that allows webapps to sync data between peers with no centralized server dependency. GUN peers use WebRTC and similar transport protocols to sync data to localStorage and IndexedDB in the client. GUN also provides access controls, which allow developers to store encrypted data on the network. The API can be used in browser tabs, on mobile apps, or run as a relay peer in a NodeJS process on a cloud instance.

In this conversation, GUN’s founding engineer Mark Nadal and I dive into the technical details to explain what makes GUN different from IPFS, WebRTC, and blockchains, and why so many devs are excited about GUN. I hope you enjoy the show.

Topics Discussed

· GUN in 100 seconds YouTube video
· Learn Cryptography in seven 1-minute videos
· NeRFs for photogrammetry
· GUN for Metaverses
· Comparison to IPFS
· WebRTC, Gun Relays
· Every Gun peer runs a WebRTC signalling peer
· A GUN Instagram competitor by an early Bitcoin dev iris.to
· Project built with GUN

Transcript

Nicholas: Welcome to Web3 Galaxy Brain. My name is Nicholas, and each week I have a conversation with some of the brightest people building Web3. In this episode, I'm joined by GUN founder Mark Nadal to discuss GUN. GUN, also known as GUNDB, is a real-time, decentralized, offline-first graph database. In practice, GUN provides a JavaScript API that allows web apps to sync data between peers with no centralized server dependency. GUN peers use WebRTC and similar transport protocols to sync data to localStorage and IndexedDB in the client. GUN also provides access controls which allow developers to store encrypted data on the network. The API can be used in browser tabs, on mobile apps, or run as a relay peer in a Node.js process on a cloud instance. In this conversation, GUN's founding engineer Mark Nadal and I dive into the technical details to explain what makes GUN different from IPFS, WebRTC, and blockchains, and why so many devs are excited about GUN. I hope you enjoy the show. Hi, how are you doing? Good, good, good. How are you?

Mark Nadal: Doing great. Excited for the day.

Nicholas: Yeah, awesome. Have you done the Twitter spaces before?

Mark Nadal: Yeah, I run my own podcast series, so.

Nicholas: Oh yeah, I saw that in your bio. Yeah, that's cool.

Mark Nadal: It's really helpful to have integrated podcasts with Twitter to increase following versus having to go offsite, but I just don't know how to do that.

Nicholas: I think I heard they just changed that, possibly. I was kind of surprised when I found out that they were deleting them because obviously it seems like a podcast competitor is a no-brainer if you have all the original recordings right there. But I think I heard they maybe just changed this and now they keep them forever. I'm not sure.

Mark Nadal: Good, because downloading them is also really painful to get them uploaded.

Nicholas: Your podcast is just the recordings of the Twitter space you post, basically?

Mark Nadal: Yeah, and I download and upload them to YouTube when I have time, but it is so hard to get time. So it's pretty high quality, just engineering discussions and conversation, but not very high quality in terms of trying to get expensive equipment or trying to have a timeline and schedule and stuff. It's just find some really cool people building open source stuff, get them on the podcast and ask them as many questions as possible.

Nicholas: Yeah, I do something pretty similar. I just think sort of technical people often don't talk publicly as much, especially in Crypto Web 3 stuff. People do speak, but it's often not the people we're often listening to or not the people actually building the stuff, but someone in marketing or whatever. So I think it's the most fun to hear from the people who actually know deeply the technical issues they're talking about.

Mark Nadal: So true. I painfully, before COVID, right, got invited to go out to conferences around the world. And then since COVID, I mean, I know conferences are starting back up, but I feel a little bit forgotten right now.

Nicholas: Oh no.

Mark Nadal: I'm just like painfully like, people, there's so much out here to talk about.

Nicholas: Stop talking about blockchain.

Mark Nadal: I mean, no offense to the blockchain community, but a lot of the really, really interesting technical achievements are just all in. And even NeRFs, I'm talking about even maybe beyond decentralization, just NeRFs, neural radiance fields, is like, man, if I had the time to sit down and focus on that stuff.

Nicholas: Tell me about NeRFs. I don't know anything about NeRFs. I think I've seen the acronym maybe, but nothing beyond that.

Mark Nadal: It's definitely outside the scope of decentralization, but I am working on doing decentralized NeRFs. NeRFs are neural radiance fields and they're a new machine learning technique for photogram, I can never say photogram or photogrammetry.

Nicholas: Thank you.

Mark Nadal: So you just take a couple pictures and what it does is it first estimates the depth of a single picture to figure out how far away each pixel is. And I think that's using previously trained models, but it ray casts for each pixel. A ray cast in video games is pretending that you're shooting a bullet from a particular pixel on your screen out into the world to see what that bullet is going to hit. So it ray casts the photo and depending upon what depth the object is at, it then assigns, like, voxel coordinates along that whole ray until it hits an object. And then it assigns a color coordinate at that point. And then it tries to continue on behind that item. It does that for every single pixel. And then you combine that with the other pictures in the scene. And when you train against it, it's able to build a 3D voxel representation that is extraordinarily accurate. Like, it looks like somebody is actually moving their camera around the scene. And it can do it off of just a couple of photos. It's one of the most impressive techniques I've seen. And the reason why I like it so much is because it seems to finally be an algorithm that does what our human brains do with our two eyes. Because each eye is two-dimensional. But when you combine your two eyes together, they're basically trying to match where the two eyes come together to see a particular object, and then using the trigonometric difference between that and your prior learning of objects to create a 3D representation of the world. So I'm, like, super excited about this technique.

Nicholas: It's essentially using some machine learning trained models to do the parallax effect, binocular vision, depth perception in order to create better photogrammetric models than a naive just taking pictures model might do.

Mark Nadal: Yes. However, people keep on getting the amount of AI needed for learning the scene down to be lower and lower and lower. However, one team figured out how to just, like, build an actual voxel scene like Minecraft out of it. And that wound up being accurate enough, and fast enough, that they're able to render it in the browser. So there's this really fun technique from, I don't know how many years ago, where you just take two pictures from, like, two cell phones that are mounted just slightly apart from each other. And you just alternate them in an infinite loop, back and forth between the two GIFs.

Nicholas: Okay.

Mark Nadal: I think it's just called stereoscopic GIFs. If you search, it can be a little bit nauseating. But just from two GIFs alternating back and forth on the internet, your brain starts to simulate as if you're seeing a 3D object. So it's kind of taking that concept to an extreme, but doing actual learning against it. And I'm, like, super excited. This is going to be a technique that will let me bring Nanite-level Unreal Engine 5 graphics into the browser. It's going to be 27 years before it works. I'm going to get it to work. Because I know I can use GUN for doing that raycast, where I can use my database itself for a particular website that you load. I can make a query where I basically ask, what is the data along this line? And that query will go out into the distributed hash table, which is basically a list of computers all around the world that are lit up by... You can almost think of it as, like, a dictionary. If you take a real physical dictionary book and you open it up, every single page will have a list of computers, like a telephone book, but in order of the words. Rather than the words giving the definition of something, the words will give a list of other computers that have the definition. And then you go and ask those computers for the definition of the word and then you get the value back.
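
The dictionary analogy above can be sketched in a few lines of Python. This is only an illustration of the two-step idea (a word maps to a list of peers, and the peers are then asked for the value); the peer names, stored values, and the `lookup` helper are all hypothetical and are not GUN's API or wire protocol.

```python
from typing import Dict, List, Optional

# Hypothetical data: what each peer happens to store locally.
PEER_STORAGE: Dict[str, Dict[str, str]] = {
    "peer-a": {"mark": "a given name"},
    "peer-b": {"open": "not closed"},
}

# The "dictionary page": each word lists the peers that claim to hold its value.
INDEX: Dict[str, List[str]] = {
    "mark": ["peer-a"],
    "open": ["peer-b", "peer-a"],
}


def lookup(word: str) -> Optional[str]:
    """Find the peers listed for `word`, then ask each one for the value."""
    for peer in INDEX.get(word, []):
        value = PEER_STORAGE.get(peer, {}).get(word)
        if value is not None:
            return value
    return None  # nobody listed for this word actually had it
```

Calling `lookup("mark")` consults the index first and only then contacts `peer-a`, which mirrors "the words give a list of other computers that have the definition".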

Nicholas: OK, we should get into what GUN is and all the sort of basics, just for people listening who don't have the overview. But I do want to follow this thought just a little bit first. How does what you just said compare to the way something like IPFS works?

Mark Nadal: So IPFS is using more of a Kademlia-style DHT, which was designed prior to, like, mobile networks. So a lot of it is assuming that there are stable, fixed IP addresses that are not changing, that are pretty much permanently online. As soon as you get mobile, right? Like, sure, Kademlia is great, like, BitTorrent scaled. No offense to the IPFS team, I'm not seeing their system and infrastructure perform very well compared to BitTorrent. 20 years ago, it handled 40% of the world's traffic effortlessly. And then Popcorn Time came around a while ago and kind of did a resurgence. But now, I am sort of a competitor to IPFS, so take my words as a little bit loaded. But IPFS has just consistently been bogged down with performance problems for, since like...

Nicholas: And then the performance problem specifically would be like, I have some NFT whose metadata is hosted on IPFS. And it takes too long to fetch the data because traversing the network to find who actually has the data is too slow. Is that the problem?

Mark Nadal: There's like seven different technical reasons. I'm happy to do a separate podcast explaining. But the main thing is that one of the biggest problems is that content-addressed data still needs an externality in order to network into that data. And so you always have to do a separate query to first find where the content is. Because if you already have the hash of something, it means you've already had it on your computer. But typically, if you're trying to find something, you don't have the video or the image or photo to hash to then look for it. So you first have to figure out what the hash is in the first place. And that requires an externality. And then that leads into some of the other projects that they've made. And then it just becomes a deep rabbit hole of algorithms, and in computer science this is called not being O(1). O(1) is where you can search for something and you get an immediate lookup without having to search or traverse a set of other lists. And if it's O(1), then, I'm going to come back to that dictionary example. Technically, this one is like O(log log n), and I'd never heard of a log log n before. It's so close to being O(1) that it is effectively O(1). It is constant time, but it grows by about one time. I'm getting way into the weeds. Yeah, we did.
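
The "externality" point can be made concrete with a small Python sketch. It assumes a hypothetical in-memory content store keyed by SHA-256 hex digests and a separate, equally hypothetical name index; neither models IPFS or GUN internals. The only claim illustrated is that fetching by a human-meaningful name costs one lookup to learn the hash and a second to fetch the bytes.

```python
import hashlib
from typing import Dict, Optional, Tuple

content_store: Dict[str, bytes] = {}  # hash -> content (content addressing)
name_index: Dict[str, str] = {}       # name -> hash (the external index)


def publish(name: str, data: bytes) -> str:
    """Store data under its SHA-256 digest and record the name separately."""
    digest = hashlib.sha256(data).hexdigest()
    content_store[digest] = data
    name_index[name] = digest
    return digest


def fetch_by_name(name: str) -> Tuple[Optional[bytes], int]:
    """Return (data, number_of_lookups) for a name-based fetch."""
    lookups = 1
    digest = name_index.get(name)      # first query: what is the hash?
    if digest is None:
        return None, lookups
    lookups += 1
    return content_store.get(digest), lookups  # second query: fetch by hash


digest = publish("cat.png", b"pixels")
data, lookups = fetch_by_name("cat.png")
```

Someone who already holds `digest` can skip the first step, which is exactly the situation the speaker describes as already having had the content.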

Nicholas: It's good. I'm mostly following.

Mark Nadal: OK, so if you take a physical dictionary book, the human habit is actually a really great algorithm that you should implement. It's called binary search. If you want to find where a word is, so take my name, Mark. It's roughly in the middle of the dictionary. You take the book and you approximately see how large the dictionary book is and you just slice your hand in the middle of the book and then you open it up and then you check the top two title pages to see what titles they are. So let's say you open it up and "open" happens to be the word that you landed on, the first word of the page that you opened to. And you know that Mark is before open. So you know, OK, I'm going to go to the left of this. Now if you were to start to go page by page to the left, it might take a long time. As humans, we're a little bit smarter. We can approximate that. Maybe we shouldn't go another halfway point. But even if you just have a dumb algorithm that says, OK, now let's go to the next halfway point between the page you currently have open and the very beginning of the book, and you keep on repeating this process, you will be able to get to the exact word out of 10 million words in roughly 23 steps. And now if you imagine 10 million pages in about 23 steps, if you imagine that every page could hold anywhere from as small as, like I say, four kilobytes, which is about a screen size of text, all the way up to two megabytes, which is a decent page size because two megabytes will load fairly fast in first world countries. But you probably don't want to use two megabytes in third world countries. If you take two megabytes and multiply it against 10 million pages, it's a decent amount of storage.
And then if you imagine each word for each page as two megabytes of IP addresses for each value, I mean, you only store about three to six IP addresses per word, and each machine itself, each other machine, holds potentially gigabytes of data, let's say 256 gigabytes of data that it has cached, blockchain machines that need 256 megabytes. A lot of the blockchains need 256 gigabytes. I'm talking about 256 megabytes of cache RAM. Then it turns out you can traverse up to about exabyte scale of data within six hops, and six hops can potentially be completed within 1.2 seconds. It oftentimes takes 1.2 seconds just to get a response for an already cached value in some of these blockchain or IPFS type systems. Now, I haven't tested at exabyte scale, but we do have 50 million monthly active users at peak. So we are actively battle testing the system at pretty insane levels. But before I get into actual production performance and back to GUN itself, to finish this thought of doing metaverse queries of real-time graphics per raycast on your screen. Yeah, you're talking about being able to traverse an insanely large swath of data, and you can do it because of, in computer science notation, O(1) time. You've designed an algorithm that is nearly constant time to get a response from many different machines around the world. And that is what allows you to scale. Because of what I just explained, like, the computer is able to find things fast because it's pre-indexed in a way that's predictable, that is deterministic.
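
The halving procedure described above is ordinary binary search, and the step count is easy to check. The sketch below is plain Python, not GUN code, and the page numbering is a stand-in for sorted words. It counts probes over 10 million sorted keys; the worst case is 24 probes, because 2^23 < 10,000,000 < 2^24, which agrees with "roughly 23 steps". It also works out the quoted index size, assuming 2 MB pages and decimal units.

```python
from typing import Sequence, Tuple


def binary_search_steps(sorted_keys: Sequence[int], target: int) -> Tuple[int, int]:
    """Return (index, probes). index is -1 when the target is absent."""
    lo, hi = 0, len(sorted_keys) - 1
    probes = 0
    while lo <= hi:
        probes += 1
        mid = (lo + hi) // 2          # open the book at the halfway point
        if sorted_keys[mid] == target:
            return mid, probes
        if sorted_keys[mid] < target:
            lo = mid + 1              # target is to the right
        else:
            hi = mid - 1              # target is to the left
    return -1, probes


pages = range(10_000_000)             # stand-in for 10 million sorted pages
index, probes = binary_search_steps(pages, 6_543_210)

# 10 million pages at 2 MB each, in decimal units: 20,000,000 MB = 20 TB.
index_size_tb = 10_000_000 * 2 / 1_000_000
```

A `range` is used so the example needs no large list in memory; any sorted sequence works the same way.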

Nicholas: And what is this called? Is this GUN, or is this a piece of GUN?

Mark Nadal: It's a piece of GUN. There's quite a few different pieces in the GUN stack. In particular, this is a new feature called Book. Book is an in-memory dictionary. It's an in-memory hash table, an in-memory object. So in the JavaScript world it's just called an object. But in, like, Rust, it's called a hash map, and in Python it's called a dictionary. So it's an object. And unfortunately, in JavaScript, Firefox used to organize object keys lexically, alphanumerically. However, Chrome and a few other engines ordered them by insertion time. And there was actually a difference between how the keys would result. But then, because I think Chrome and Internet Explorer at the time were using that technique, Firefox eventually gave up and switched over. So now pretty much all of the engines organize the keys in your JavaScript object by insertion time. That really sucks for a bunch of reasons. So let's just talk about the cryptographic reasons. If you want to sign JSON, you have to convert that object into a string. So that way you can cryptographically sign it. But if each different computer received the properties in a slightly different order, the order of how it turns into text to do a cryptographic signature on would produce a different result. And so that means we have non-deterministic cryptographic signatures that are not consistent, with potentially a different signature. And it'd be nice if those could be the same. And I should make an aside: in cryptography, you actually always do want to use a random seed for signing something. So even if the objects were consistent, for separate cryptographic and security reasons you'd want to have a random seed that would alternate it. However, when you're trying to do the hash of something, that's more what I'm referencing, where you'd actually want the hash of that string to be consistent. And typically, right, people are signing the hash of a text, not the text itself.
There's a really amazing cartoon series we made to explain cryptography. So I'm going to avoid jumping into cryptography any further. They're one-minute animated cartoon explainers. Just Google cartoon cryptography. And if you see a GUN link, that would be it. They're one-minute episodes each. There's only seven of them. And they explain cryptography using cooking analogies. So if you've ever cooked in your life, you will now be able to apply about years' worth of cryptographic knowledge that you'll be able to learn in a few minutes. And that is extremely important because now you can take these cryptographic primitives and understand how to apply them for your applications. There's too many people who just think cryptography is this dark magic and just blindly copy and paste what other people are doing. And unfortunately that gets people really mixed up between signatures and encryption, hashing and proof of work. And all these things are different and they're for different use cases and you don't want to necessarily mix them up. So watch that seven-minute series, one-minute episodes each, and you'll have a very good model, a very accurate model. Cooking is a very accurate model for understanding cryptography, to understand how to apply it to any of your decentralized applications afterwards.
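
The key-order problem is easy to reproduce outside JavaScript. Python dictionaries also preserve insertion order, so the sketch below shows two equal objects serializing to different text, and therefore hashing differently, unless the keys are sorted first. Sorting keys is one common way to get a deterministic hash; it is offered here as an assumption for illustration and is not a description of how GUN's own code solves it.

```python
import hashlib
import json


def insertion_order_hash(obj: dict) -> str:
    """Hash the JSON text exactly as the keys happened to be inserted."""
    text = json.dumps(obj, separators=(",", ":"))
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def canonical_hash(obj: dict) -> str:
    """Hash the JSON text after sorting keys, so order of arrival is irrelevant."""
    text = json.dumps(obj, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Same properties, received in a different order on two machines.
on_machine_a = {"name": "Mark", "project": "GUN"}
on_machine_b = {"project": "GUN", "name": "Mark"}
```

Here `on_machine_a == on_machine_b` is true, yet `insertion_order_hash` gives two different digests while `canonical_hash` gives one.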

Nicholas: Okay. I'm looking forward to that.

Mark Nadal: Going back up the stack. Yeah.

Nicholas: Yeah. So let's pull out and let's just give a, what is GUN overall?

Mark Nadal: GUN is a multiplayer synchronization system. So think about that as being like Dropbox in real time. Or if you've ever used a thing called Firebase, you can think of it as being an open source peer-to-peer Firebase.

Nicholas: Great.

Mark Nadal: GUN fundamentally is a multiplayer database. So if you don't use databases as a developer, think of Dropbox. Dropbox makes sure that the same file is on different computers as you edit them. Now take Dropbox to an extreme. Imagine if you had Dropbox for every tweet that you made, for your profile data, like your avatar or your birth date or your link or your location. We can update your location, let's say the GPS coordinates of your location, hundreds of times per second in GUN, and it will synchronize across all the devices that are curious about your profile.

Nicholas: So this is publicly published data. We're talking about an example where there's no encryption applied to it aside from communicating between the nodes and the network, but the data is not intended to be private. So I'm able to update a piece of data hundreds of times a second, and that's available in real time to anybody else who is curious about that information. Who is, I mean, I'm going to use the wrong term, subscribed to that topic. How does it actually work underneath, like at a high level?

Mark Nadal: Yes. So three things. If you want the data to be encrypted, that's why we have that cartoon series. All we have to do is apply one of our other tools, called SEA (Security, Encryption, Authorization), to encrypt the data before you save it to GUN. So you can do encrypted data in GUN. It's kind of built in by default for the basic to-do tutorial, but you could also, yes, of course do public data with GUN. And I was only saying a hundred updates per second for the GPS coordinates. You might want to double it up to about nine frames per second in a video game. So that way, in case a single frame is skipped, you know, at 60 frames per second, which is about 16 milliseconds per frame, you get a smooth transition. GUN itself on my 2015 MacBook Air is able to synchronize between two different browser tabs about 10,000 chat messages per second. If you're on an M1, OK, we've seen that go all the way up to about 65 or 69,000 messages per second. Again, in between two browser tabs.

Nicholas: So it's WebRTC between these two browsers, communicating information according to GUN's protocol. Whatever the communication method underlying it is, it's a WebRTC connection between the two sessions. Is that right?

Nicholas: Oh, cool. But, and it'll relay for other unrelated data, like say, peer A, peer B, peer C. If peer A and peer C want to communicate, B will act as a relay if it's connected to A and C also?

Mark Nadal: It kind of depends if you want it to. So by default, you're only going to get the data that you query. You're going to be subscribed to the data that you're interested in. But yes, if you accept other browsers connecting to you over WebRTC, they're going to make a query to you and then you're going to be connected to other browsers. So you're going to relay that query to other people and those other people are going to think that you're subscribed to that data. So it's also acting like a Tor onion layer because it doesn't reveal the original person who made the query. At each hop, you just wind up thinking it is the hop before that asked for the data. So you could turn that off. But yeah, if you are WebRTC connected to other browser tabs and they ask questions, you're going to become subscribed to them as a result of you passing them forward in the network as well. But you can disable that if you want to be a not so nice peer.

Nicholas: Got it. Got it. So in this architecture, for someone coming from thinking about IPFS, it's not content-addressed data. It's something else.

Mark Nadal: So this is the really exciting thing about just having generic queries. So text in the computer world is UTF-8, and UTF-8 is built on top of a couple other formats like Unicode and ASCII. But UTF-8 includes all those and pretty much the whole world operates off of UTF-8, and GUN's format for queries is just UTF-8. So you are able to pass in any type of other system. You could pass in a URL, because a URL is just a UTF-8 string. You could pass in a content address hash. You could pass in any type of query. And so there's different plugins in GUN. So for instance, SEA, the Security, Encryption, Authorization system, has a format that looks for a tilde squiggly. And if there's a little tilde squiggly, it'll check to see if this query or if this text is a public key. If it's a public key, then SEA is going to inject itself into GUN and enforce cryptographic verification checks. So we can do the same thing. We do have a built-in one for content addressing and by default we use SHA-256. So if we detect that there is a little hash sign, the number sign, shift-three, the little hashtag sign, if SEA detects that there is a GUN query for a hash sign, it'll then say, oh, we're going to assume that this is a SHA-256 content address lookup. And then if data is responding on it, it's going to enforce that the data is SHA-256. And shout out to Kevin here. Kevin just added a feature, which I still have to look at the PR for, where he wanted to use hex rather than SHA-256. So, well, sorry, I think it is SHA-256 underneath, but it's the hex version of SHA-256, so he just added a feature in SEA, as a plugin to SEA, which is also a plugin to GUN, to check to see if it's a hex version. So anybody else could come along and say, hey, we want to natively support CIDs. So that to me is really powerful, because GUN does not come with a rigid assumption of what type of data is in the system. Well, that's part of the reason why GUN is so performant, because it's generalizable.
And now if you actually want to implement something like IPFS, if you go to our content addressing docs, you can build your own IPFS on top of GUN in about, I think, four lines of code.
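
The hash-sign rule described above (a key that starts with `#` is treated as a SHA-256 content address, and data that does not hash to it is refused) can be sketched as follows. Everything here is a hypothetical illustration in Python: the `Store` class, the `content_key` helper, and the choice of base64 for the digest text are assumptions, and none of it is GUN's actual API or storage format.

```python
import base64
import hashlib
from typing import Dict, Optional


def content_key(data: str) -> str:
    """Build a '#'-prefixed key from the SHA-256 digest of the data (base64 assumed)."""
    digest = hashlib.sha256(data.encode("utf-8")).digest()
    return "#" + base64.b64encode(digest).decode("ascii")


class Store:
    """Hypothetical key-value store with a content-address check on '#' keys."""

    def __init__(self) -> None:
        self._items: Dict[str, str] = {}

    def put(self, key: str, data: str) -> bool:
        # Keys starting with '#' must equal the hash of the data they carry.
        if key.startswith("#") and key != content_key(data):
            return False              # reject data that does not match its address
        self._items[key] = data
        return True

    def get(self, key: str) -> Optional[str]:
        return self._items.get(key)


store = Store()
key = content_key("hello world")
accepted = store.put(key, "hello world")
rejected = store.put(key, "tampered")
```

Keys without the prefix pass through unchecked, which reflects the point that the query format itself is just text and the verification is layered on by a plugin.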

Nicholas: And you're saying it could potentially be significantly more performant than actual IPFS.

Mark Nadal: Not it could be. It is.

Nicholas: It definitely is.

Mark Nadal: And if you want to test that, now, it's not doing the content addressing in this test, but if you do npm install gun and mocha, Mocha is the testing framework in JavaScript, and then you go into the GUN repo, all you have to do is run mocha test/panic/chat.js. And that will run on your machine that 10,000 per second sync. It generates a hundred thousand chat messages and then has another browser query for the whole table and then it measures how fast it. So you, on your computer, in two lines of command line, can check to see how fast that system will run. And the cool thing is that content hashing is a pretty cheap cryptographic operation. So, I don't know, I estimate you'll get probably half of that performance.

Nicholas: Hmm. So how is the connection established between two GUN sessions? Two, two, two, let's say I have two browsers. They each load a page. The page is importing the GUN library. How did, how was the connection actually made between the two sessions that don't otherwise know about each other?

Mark Nadal: Great question. So there's many different layers, and these layers in GUN are called DAM, Daisy Chain Ad Hoc Mesh Network, and AXE. AXE is probably the more interesting one, but I'm going to start with the brute force techniques underneath. So choose your own adventure. Do you want to start with the WebRTC approach or do you want to start with the fallback approach?

Nicholas: Let's start with the WebRTC approach.

Mark Nadal: So for the WebRTC approach, because WebRTC, unfortunately browsers do not allow you to connect to another browser unless you go through what is a signaling system.

Nicholas: Oh, yes. I saw you, I saw a video you posted about this, this, this problem with the standard.

Mark Nadal: Yes. Um, so annoying. And, and to be honest, it's really not the browser's fault. It's this thing called NAT traversal, which is an IPv4 problem. IPv4 didn't have enough namespace. So pretty much every single ISP, internet provider out there winds up at each node before it goes into a neighborhood, kind of reassigning a sub network address. And then your router also creates a sub network for all the computers and devices at your network. And each one of those is a NAT traversal layer. So that way it expands the amount of IPv4 address space. IPv6 doesn't have this problem. So it's technically not a browser problem, but even if I were to give you my IP address, right, my IP address is actually not really my IP address. It's, it's the IP address of some upstream, think of it as being like an upstream cell tower.

Nicholas: So other people on your block would have the same IP address as you potentially.

Mark Nadal: Yeah. So you have to match this up with a port and it's called hole punching. So WebRTC, it's really not the fault of the browsers, just the internet, but you have to have some rendezvous point in order to do this hole punching where you match up the port numbers to make up for IPv4's lack of address space. Now, a lot of people, unfortunately, even in the blockchain and other D-Web communities wind up using like Google's signaling servers to do this. And it turns out there's actually some really creepy, like low level cybersecurity stuff where if you're doing video interaction, Google will actually like create a fake local IP in Chrome and still proxy it over Google because they're worried about like nation state actors from China that would be at our intercept. But that's a whole other thing. So it's actually very difficult in some browsers, even if you have an exposed IPv4 or IPv6 address, a dedicated IP address for that to actually be used because of tons of other reasons. IPv4 spoofing DNS just jumped like that.

Nicholas: So they're routing it through their servers, essentially.

Mark Nadal: They potentially still might be doing that. Yeah. Depending upon whatever conditions in Chrome's code decide whether it's going to bail out or not. So if we're just going to ignore that for a minute, no matter what, you're still having to run through some other common point. The neat thing, though, is that could be copy and paste, right? If you have two browsers open, you can get the WebRTC offer that's trying to be generated. You can actually just copy and paste it into another browser in some demos. And actually copying and pasting between two browsers will connect them. Now, how are you going to copy and paste between two different computers, though? So most of the time people are using some service to do that. I am very wary of that because I don't want that to become centralized. So every single GUN peer, which includes the browser itself, because I write all of my code to run first in the browser and then run it in Node.js as well. We can run that on a Raspberry Pi. So every single GUN peer runs a WebRTC signaling peer. So if you already have Alice, Bob and Carl, Alice and Bob, let's say, already connected over WebRTC, Bob and Carl already connected over WebRTC, Alice to Carl will WebRTC signal over Bob, another browser. So every single GUN peer is also running a WebRTC signaling peer. And how is it doing that? Why is it doing that? Well, it's because of DAM. DAM is just daisy chain ad hoc mesh networking. It's the relaying system inside of GUN. So pretty much anything you do inside of GUN is going to go through this relaying system and it will work in the browser.
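
The Alice, Bob and Carl example can be sketched as a tiny relay mesh. This is a minimal Python model under stated assumptions: the `Peer` class, the message ids, and the de-duplication set are invented for illustration and say nothing about how DAM is actually implemented. It shows two things from the conversation: a message reaches Carl through Bob with no central server, and each receiver only learns the identity of the previous hop.

```python
from typing import List, Set, Tuple


class Peer:
    """Hypothetical relay peer: forwards every new message to its other links."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.links: List["Peer"] = []
        self.seen: Set[str] = set()             # message ids already handled
        self.inbox: List[Tuple[str, str]] = []  # (immediate sender, payload)

    def connect(self, other: "Peer") -> None:
        self.links.append(other)
        other.links.append(self)

    def send(self, msg_id: str, payload: str) -> None:
        """Originate a message and hand it to every directly connected peer."""
        self.seen.add(msg_id)
        for peer in self.links:
            peer.receive(msg_id, payload, sender=self)

    def receive(self, msg_id: str, payload: str, sender: "Peer") -> None:
        if msg_id in self.seen:   # de-duplicate so loops in the mesh terminate
            return
        self.seen.add(msg_id)
        self.inbox.append((sender.name, payload))
        for peer in self.links:
            if peer is not sender:
                peer.receive(msg_id, payload, sender=self)


alice, bob, carl = Peer("alice"), Peer("bob"), Peer("carl")
alice.connect(bob)   # Alice and Bob are already connected
bob.connect(carl)    # Bob and Carl are already connected
alice.send("signal-1", "webrtc-offer")
```

After the call, Carl's inbox holds the offer attributed to `bob`, the hop before him, rather than to Alice.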

Nicholas: So this is to solve the problems introduced by mitigation techniques that the ISPs are executing to get over the lack of namespace in IPv4. They introduced this problem that has as a result that WebRTC sessions, browsers require that WebRTC sessions communicate with a signaling server relay. I forget what it's called. What's the terminology?

Mark Nadal: They call them signaling servers. And because I don't like servers, I'm making them decentralized signaling relays.

Nicholas: This is like how platform is a dirty word in crypto stuff. It's a protocol. Must be protocol. So you're shipping that server, aka relay, to enable websites or just sessions in browsers that have imported GUN to be able to communicate via WebRTC independently without needing any kind of external dependency outside of the GUN that they already have loaded up.

Mark Nadal: Yeah. So then that finally gets us to understanding how we now send out the signaling request. So your browser, with the WebRTC in GUN, generates what's called an SDP, a session description, which is the WebRTC spec for how you're going to connect to another machine in the process of doing what's called hole punching to traverse NAT. So on this computer, in this browser tab, originally it was universal signaling. Because we at peak hit about 50 to 60 million users, the signaling itself started DDoSing the network, even at a much, much lower size. So we had to switch over to a thing called AXE, where AXE will only route any message in GUN, whether it be a WebRTC signal or not, based off of this dictionary, kind of this DHT lookup system. So AXE is built on top of DAM. DAM just brute force relays these things. AXE comes in and says, we can make this more intelligent and do, you know, PubSub subscription modeling. We're only going to route messages to other peers that are subscribed to this data. Now, of course, other peers may be subscribed on behalf of another peer, which is on behalf of another peer, on behalf of another peer. But it works. It's pretty cool.
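
The difference between brute-force relaying and subscription-based routing can be shown with a small Python sketch. The `SubscriptionRouter` class, the peer names, and the key `chat/room-1` are hypothetical; this only models the idea of delivering a message to peers that subscribed to its key, and is not a description of AXE's real routing tables.

```python
from collections import defaultdict
from typing import DefaultDict, List, Set


class SubscriptionRouter:
    """Hypothetical router comparing flooding with pub/sub style delivery."""

    def __init__(self, peers: List[str]) -> None:
        self.peers = list(peers)
        self.subscriptions: DefaultDict[str, Set[str]] = defaultdict(set)

    def subscribe(self, peer: str, key: str) -> None:
        self.subscriptions[key].add(peer)

    def flood(self, sender: str) -> List[str]:
        """Brute-force relay: everyone except the sender gets the message."""
        return sorted(p for p in self.peers if p != sender)

    def route(self, key: str, sender: str) -> List[str]:
        """Subscription routing: only peers subscribed to this key get it."""
        return sorted(p for p in self.subscriptions.get(key, set()) if p != sender)


router = SubscriptionRouter(["alice", "bob", "carl", "dave"])
router.subscribe("carl", "chat/room-1")
flood_targets = router.flood(sender="alice")
routed_targets = router.route("chat/room-1", sender="alice")
```

With four peers the saving is small, but flooding grows with the size of the network while routed delivery grows only with the number of interested peers, which is the scaling problem described above.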

Nicholas: So if I open up a browser tab and import GUNDB, it's a fresh instance on a machine that's never touched the internet before. What is my browser tab going to do first in order to start creating this? Is it a distributed hash table? Is that the DHT?

Mark Nadal: So it's going to use, because the WebRTC is signaling, it's going to use the URL path name, not the host, the URL path name as the first WebRTC common key. So it will send out a signal to any peers it can access at that point. And that's now also a pausing point because how that works is another.

Nicholas: Does it locate them or have I typed in a URL or something? Where am I? What is it attempting to reach out to first?

Mark Nadal: Yeah. GUN apps can be deployed as an HTML file on your computer that you double click and open up. They can be deployed as Electron apps, as React Native apps, to desktop and mobile. And they can also be deployed as a web app. That's where most people deploy them. And typically they deploy them on some static host. That could be GitHub Pages, Netlify, Vercel, even a JS Bin or a CodePen. You can deploy GUN anywhere HTML runs, whether that's a localhost server on a computer or just a file that you're opening up without localhost. Anywhere HTML runs, you can run GUN, which is really cool. So you're correct. Let's set aside for a second which relays it's first going to attempt to contact, because that's a whole series of fallbacks on its own. I actually just tweeted about that, I think it's my most recent tweet, trying to summarize it in a single tweet. But let's just pretend for a second: this WebRTC signal goes out, and it is based off of that URL path name. You, as an application developer, could modify that in case you want a different rendezvous point. But I found that the sensible default is the path name of your application. The file name of your application is typically a good starting point just to get connected with another swarm or cluster of people. So let's say I'm in a Google Docs type app and I want to do a video call between me and my friends. So by the way, there's a GUN video service called Meething.space; they were funded by Mozilla. So you can go do Zoom replacements with Meething.space in the browser. It's end-to-end encrypted, browser to browser, and it uses GUN to connect the two users. It does switch over to a native WebRTC stream underneath, but it's using GUN to coordinate. If you want to try streaming video coordinated over GUN, right now you can go to Meething.space and replace your Zoom calls with that.

Nicholas: Are there two Ts in that or one?

Mark Nadal: I think it's just one. Yeah, one. M-E-E-T-H-I-N-G.

Nicholas: So it's like meeting with an H. Meething. Okay.

Mark Nadal: Meething.

Nicholas: So, okay. So in this example, for instance, I go to meething.com, or whatever the URL is. And is this where we start making connections to other clients, other peers? Not yet.

Mark Nadal: Yeah. So like, you know, give it some slash, you know, your room name or whatever it is. And that will be your room name. And not all apps do this, but a sensible default to just have it work automatically is that you're going to get automatically connected with other peers, you know, other people, that are also on that URL. Or I should say specifically on that path name, even if it's hosted on a different machine. And that's nice because very quickly you can start building apps. Now, that might be different than the data that you're subscribed to. Okay. But it at least prevents our network from getting DDoSed by every single browser trying to connect to every single other browser. At some point later, right, you're automatically connected to other people at that path name, that file name. Okay. How is it even sending a signal out in the first place? Because as soon as we land on a page, we're not already connected to another WebRTC peer, Bob. We have to figure out how to even connect to Bob in the first place.
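The default rendezvous rule Mark describes, using the URL path name rather than the host, can be illustrated with the standard `URL` class. This is a minimal sketch under stated assumptions: the function names and example URLs are hypothetical, and it only shows how the key could be derived, not how GUN itself sends the signal.

```javascript
// Hypothetical helper: derive a rendezvous key from a page address.
// Only the path name is used, so the host (and query or hash) is ignored.
function rendezvousKey(href) {
  return new URL(href).pathname;
}

// Group page addresses by the rendezvous key they would share.
function groupByRendezvous(hrefs) {
  const groups = new Map();
  for (const href of hrefs) {
    const key = rendezvousKey(href);
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(href);
  }
  return groups;
}

// Example addresses (placeholders): two different hosts, same path name.
const pages = [
  'https://app-one.example/room/blue',
  'https://mirror.example/room/blue?invite=1',
  'https://app-one.example/room/red',
];
const groups = groupByRendezvous(pages);
console.log(groups.get('/room/blue').length); // 2
console.log(groups.get('/room/red').length);  // 1
```

Under this rule, two copies of the same app served from different static hosts still land in the same group, which matches the "even if it's hosted on a different machine" behavior described above.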

Nicholas: Right. So, okay. So I go to the website and then I punch in like a room name and now somehow I need to be connected to the other peers who are in that room. Is that right?

Mark Nadal: You'd have to, yeah, you'd have to send a link out to your few of your friends to actually, you want to just send me.

Nicholas: Sure. So the classic WebRTC version of this would be that there would be a centralized server somewhere that does the routing, connecting all of us. So we're having direct WebRTC session connections with each other, but we need to be connected initially by some kind of centralized server. So in this case, GUN is replacing that server.

Mark Nadal: Yeah. So the most reliable way is to have a WebSocket that sends your WebRTC signal out to some known server, which then rebroadcasts that back down to another browser on the same page. And then they do their handshake and then you connect directly over WebRTC. So WebSockets are extraordinarily reliable. They're so reliable that they are the default in GUN. WebRTC is something you have to add the lib/webrtc adapter for, because unfortunately you can't use WebRTC without pretty much having WebSockets in the first place. So the nice thing is, if your app is deployed on your own server, you just do npm install gun, npm start, and that will run a GUN relay on your machine, whether it be a Raspberry Pi, your local computer as localhost, or some machine in the cloud. So Node.js is running the same code as the browser. Okay. So you can WebSocket connect to whatever the IP address is for that WebSocket machine. In fact, what I do all the time for local testing is I just use something like Tailscale or localhost.run or, I think the current one I'm using that I really like is a free open source one. It will create a subdomain that connects to my actual computer. So I'll run GUN locally on my computer and then have some subdomain that points to it to do the proxying. And I can now use my local computer as a GUN relay, which is again, just GUN. So I actually really wanted to pull this up, because I was using a few of these other services, but they all charge a lot of money for something that really is super cheap and simple. It's called SRV.us, serve us, SRV.us. And that's nice if you're a traditional developer running a lot of Node.js or local servers on your computer and want to expose them. Now, ultimately, GUN in the browser does do this for you automatically, but that's in the browser, versus if you're running command line localhost on a computer. 
So as long as you're running any of these anywhere in the world, as long as it's IP addressable, then your front end application, whether it's hosted on your own HTTP server or on a static page, can just list a bunch of these IP addresses of other GUN relays. And there's a whole volunteer list where you can find a bunch of other people who are running these volunteer GUN relays and just copy and paste their IP addresses into your front end React application or just regular JavaScript or whatever your front end application is. When you start GUN, just paste in all those IP addresses and GUN will automatically WebSocket connect to all of those and mesh through them. Now, I thought before that people would like that, because they know the application developer is in control of which relays are in the mesh and the DHT that they're bootstrapping through. As a kid, if I opened something up and it automatically made web requests to some random URL I'd never seen before, I called that spyware. I called that malware. However, you know, blockchain has been around for a long enough time that all the cool kids since 2016, 2017 just want their computers to automatically connect to some random server, some random relay, some random gateway, some random Infura node in the background. And they want to pretend that this Infura node doesn't even exist, right? They want to pretend that this machine running on Amazon or Google Cloud doesn't exist and somehow they're just connecting into the Ethereum network, and that's complete BS. Really, that is just a malware backdoor opportunity. So if you want to strictly control that, that's kind of how it's been the entire time in GUN. You've had to explicitly list which peers, which fallbacks you're going to open up through, but it's gotten me a lot of flak in the past several years. So, and this isn't quite published yet. 
We have like a prototype of it. What we're going to be doing is, if you add one extra module called AXE, AXE will automatically connect through a series of these WebSocket relays. And again, these WebSocket relays are just other GUN peers. It could be another browser, but since a browser is not IP addressable, it typically has to be an IP addressable GUN peer. That could be a Raspberry Pi or your local computer. And I encourage it to be that way, because that way you're not running these things on a server. And I want to again differentiate here: GUN runs on a Raspberry Pi, and on these old computers it can do thousands of operations per second, in contrast to most blockchain and other dweb systems that are going to require gigabytes of RAM just to start, let alone hundreds of gigabytes of RAM to run some sort of publicly hosted system. On a single, like, free Heroku instance, we've seen up to about 10,000 concurrently connected users that are relaying through that. And then after that point, Heroku says, stop using us for free, we're going to crash your computer. And then if you run it on other hardware, yeah, we're seeing even Raspberry Pi level hardware doing some pretty amazing scale.
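The explicit relay list Mark describes, where the application developer pastes in the relays the app should connect to, might look something like the sketch below. The relay addresses are placeholders, not real volunteer relays, and `toPeerUrl` is a hypothetical helper. The snippet only builds the list; the final step, passing it to GUN's `peers` option with the conventional `/gun` path, is shown as a comment because it assumes GUN is already loaded in the page.

```javascript
// Placeholder relay addresses (assumptions for illustration only).
const relayHosts = [
  'relay-one.example',
  '203.0.113.7:8765',
  'https://relay-two.example/gun',
];

// Hypothetical helper: turn a pasted host or URL into a full peer URL.
// Assumes HTTPS when no scheme is given and the "/gun" path when none is given.
function toPeerUrl(host) {
  const withScheme = /^[a-z]+:\/\//i.test(host) ? host : 'https://' + host;
  const url = new URL(withScheme);
  if (url.pathname === '/') url.pathname = '/gun';
  return url.href;
}

const peers = relayHosts.map(toPeerUrl);
console.log(peers);

// In a page or Node.js process that has loaded GUN, the list would then be
// handed over at startup, for example:
//   const gun = Gun({ peers });
```

Keeping the list in application code is what gives the developer the strict control Mark argues for: nothing is contacted unless it appears in `relayHosts`.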

Nicholas: What are the applications? Like, I've seen examples of people doing chat apps. There was a hackathon project in the Juicebox build guild hackathon that I ran. One of the entries was this GUN-based chat app that was using wallets to verify identity, to have sort of provable conversations with the owner of a certain crypto project, a Juicebox project, so you can know, verifiably, that you're talking to the right person. And we've seen this replacement for Zoom that you mentioned, Meething. But could you do something like replace, I don't know, you mentioned Infura. Could you use GUN to start to decentralize and replace this kind of shared infrastructure that currently is very centralized, with a single party doing something like indexing and providing APIs to blockchains or whatever that are easier to use than having to reconstitute that stuff locally? Is that something that GUN could help people do?

Mark Nadal: All indexing, all of that stuff, super easy, because that goes back to that book analogy. You can live update the value for a particular word in a dictionary globally across, so far, 50 million monthly active users. I'm trying to get that up to about a hundred million monthly active users by the end of the year, but I'm desperately working on getting some of these new book features deployed. Yeah. So you can be doing real time updates across all those key values, and all those key values then just point to other GUN nodes or, you know, are just a string that points to some other Infura thing or some blockchain thing. It's very easy to go from GUN into any other system. Once they're kind of in the system, we find a lot of people move everything else over to GUN. GUN's not meant for storing things like photos and videos, but there are quite a few demos of people who are live streaming video over GUN, who are also doing live video playback over GUN, who are hosting photos and files on GUN, because the response times are just a lot faster. Now, you're probably going to hit some hiccups currently as we're working through some improvements on that. You might have to do a couple of little workarounds because GUN's not originally meant for that, but because it's so fast, a lot of the people who start with having GUN as an entry point into other tools that they built eventually migrate those other things into GUN. One thing you're definitely not going to be able to do in GUN, that you absolutely should not do in GUN, is payments, some sort of banking type system. Please stick to something that's meant for that, for a set of other reasons we can discuss in a different podcast since we're getting near the end. That's a whole can of worms on the difference between a highly available, partition-tolerant system and a globally consistent, non-partitioned one.

Nicholas: I'm trying to understand what GUN could be used to replace. Could you replace using Vercel or Netlify or any of these things with having your web content hosted entirely on GUN?

Mark Nadal: Let me first go through a list of things that have been built, right? So there are decent alternatives to Zoom. There's peer to peer Reddit, notabug.io. Careful what you click, because it's peer to peer Reddit.

Nicholas: Okay.

Mark Nadal: Peer to peer version of Instagram, iris.to. So actually, the peer to peer version of Instagram, iris.to, also has an encrypted one-to-one private chat system, kind of like Signal. And it also has a group chat system that's kind of a replacement for Discord. And Iris is built by Martti Malmi, who was Satoshi's first contributor to Bitcoin.

Nicholas: Wow.

Mark Nadal: Yeah. He's a legend. So he's already built like an Instagram, Signal, Discord replacement in one package called Iris. So iris.to. There's also been a ton of these Slack alternatives. Pretty much every day people are building some sort of group chat app, group encrypted chat app on top of GUN. Unstoppable Domains had one. There's like lone wolf recently from a high school student in Africa. That was pretty good. There's NatNeil who came from Ethiopia. Like pretty much every week, the first thing people who join GUN would be doing is these Slack alternatives. There are also some YouTube alternatives. They're actually the ones that have the most hits on them, that drive our traffic, but they bounce between some people doing video over GUN and some people just using GUN for, like, a commenting system. I don't feel like I have a good YouTube one yet, but I do have a demo you can run in the example tutorials. Libidvonto, I don't know how to pronounce this alias. I was retweeting him about a year ago. You might've seen a few of the blockchain people retweeting him. He was building a payment processor on top of GUN, even though I say don't do it. So a Stripe replacement. And he did that by connecting BTCPay Server with GUN. And I thought that was actually a pretty good combination. And then of course you also have Wikipedia replacements on GUN that we worked on with the Internet Archive. So the Internet Archive has a dweb demo that's running on top of GUN. And part of the reason why I discourage IPFS is, if you go to dweb.archive.org, you'll see that IPFS consistently crashes and doesn't work. The Internet Archive demo has labels up top where you can see which systems are running. 
For us, WebTorrent and GUN are operational, so you've got peer to peer versions of Wikipedia, and then you also have peer to peer versions of Facebook Horizon or metaverse stuff. And unfortunately, most of those teams keep on delaying launching them publicly. But if you go to the GitHub repo for GUN, you can click on a couple of the GIFs that show three.js or A-Frame integration where they have these full 3D worlds that you're going through. And then there's a whole set of categories of applications that I don't even know exist, like for 3D CAD, right? So people are also doing like Iron Man level stuff on GUN where they will have their laptop camera running and they will wave, and there's a demo of this on the GitHub, they will move their hand in front of their laptop screen and that will control a 3D CAD model on their phone and on their laptop. So I guess AR/VR is probably what it's called. They will use their hand to grab something and then move it, and it will move on multiple different devices. So Iron Man level, I don't even know what the normal application for that is. And then just a couple more and I'm going to finish up: distributed AI processing on top of GUN. There's also another demo that you can click on. There's an old article written on how they did that, where they'll split the search space for these little sheep things or bugs that are trying to find food. It'll split the actual space that's being traversed across different computers. And then the local maxima of the best food that the bugs can find on each computer will be synced to the other computers, so they all converge to the global maximum from the local maxima. And then there's other crazy stuff, like the Dutch Navy way back in the day was using GUN for IoT stuff, for monitoring the ship's systems. There's an insane amount of use cases. Oh, sorry. I'm going to list two more before I run out of my list: fraud detection by using the graph traversal inside of GUN. 
And that's less of a real time peer to peer sync thing. That's just more like using GUN as a dashboard or analyzing data inside of GUN. I haven't really seen them do real time stuff on that, but they might be, I have no clue. And then of course, what we discussed earlier is like a peer to peer Uber where you get real time GPS coordinates of the driver that you have connected with as they come to you. And then of course, like peer to peer Twitter and junk like that.
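The distributed search demo Mark mentions, where each computer searches its own slice of the space and the synced local maxima converge to the global maximum, can be sketched in a few lines. Everything here is an assumption for illustration: the `food` landscape, the four partitions, and the function names are made up, and the sync step is simulated in memory rather than over GUN.

```javascript
// Made-up food landscape: the single best spot is at x = 37.
function food(x) {
  return 500 - (x - 37) * (x - 37);
}

// Each simulated computer exhaustively searches only its own slice [start, end).
function searchPartition(start, end) {
  let best = { x: start, score: food(start) };
  for (let x = start + 1; x < end; x++) {
    const score = food(x);
    if (score > best.score) best = { x, score };
  }
  return best;
}

// Merging two results keeps the better one, so the order of syncing does not matter.
function merge(a, b) {
  return b.score > a.score ? b : a;
}

const partitions = [[0, 25], [25, 50], [50, 75], [75, 100]];
const localBests = partitions.map(([start, end]) => searchPartition(start, end));

// Simulated sync: every computer hears every local best and folds them together.
const views = localBests.map(() => localBests.reduce(merge));

console.log(localBests.map((b) => b.x)); // local maxima per computer
console.log(views[0]);                   // { x: 37, score: 500 } on every computer
```

Because `merge` always keeps the higher score, each computer ends up with the same answer regardless of the order in which the local results arrive, which is the convergence property the demo relies on.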

Nicholas: So a lot of, a lot of different applications.

Mark Nadal: Yes.

Nicholas: Do you foresee people using it? Like right now, you know, I'm more familiar with the crypto space. So if you want to host a front end to a smart contract, it's common to host it either on Vercel or on Fleek, with IPFS or IPNS. Is that an application that GUN could be used as a superior alternative for?

Mark Nadal: So all of those things are still being hosted off of just a traditional HTTP server. They just have that HTTP server bundled in, and GUN does too. So of course a GUN relay just additionally has, you know, an HTTP entry point. So sure, people use GUN to deploy their apps and they use that as an HTTP entry point, but I don't necessarily know how valuable or superior it's going to be to any other HTTP server that is bundled or not bundled into other systems.

Nicholas: So it's really more appropriate for things where the peers are in real time communication of novel data. It's generated through their live sessions.

Mark Nadal: Yeah. Because hosting a webpage is so trivial. Like, you can do it, but there's no gain over any other system out there. Now, there is a team that is trying to actually start saving the web apps into GUN itself, so that from any other GUN app you've already loaded, you could load those apps. But I specifically tell them not to do this because that is XSS. That is cross-site scripting, right? You don't want to take some other HTML and JavaScript and then inject it some other place. And this started getting so out of hand because people would take their Iris key, because you can log in to any app with your same key pairs. In fact, you don't even necessarily need to remember your key pair, because with our login system you can just type in a username and password and it will log in. But the problem is if you do that on a different application, if you QR scan your keys into another GUN application, or if you type in your username and password to log in to your cryptographic account on a different app. I'm so terrified that just any one rogue application has to be deployed that you didn't go to, and you're so normalized with logging in that you log in, and it can steal your private key at that point. So I'm really strongly advising people against it, despite the fact that we have this login system that looks like just a normal username and password, that can be deployed anywhere and share your account. That will lead to some really dangerous kinds of cross-site scripting and injecting. So this got so out of hand that this last Christmas I started building a solution for that. It's not out yet. It's very much a prototype, called SecureRender. The MetaMask team has been helping me do some security audits of it. Brendan Eich with Brave was sending me recommendations on how to make sure we're doing things as securely as possible. So there's quite a few really neat teams that are helping do security audits for it. 
So you can go play with it as a beta, but don't deploy it yet. What it does is, on any website load, it creates a secure container and it injects the application into this offline security container. And then that security container already has access to all of your portable application data, right? So on any page load, without login or registration, you don't even have to do like a MetaMask click, you can instantly render your entire profile data. And that works because if we can take the application and move it into an offline container, and we go through these security audits, right, we can have your private data instantly display on screen, because no amount of cross-site scripting or JavaScript can bypass the security container. And please challenge us, like go in and try and break the system, because we're actively, like, even IndoJS got broken, I think like eight months ago, but IndoJS does not do process isolation. And pretty much all of the browser teams were saying, do process isolation. So I've always been a little bit wary of IndoJS because it's not doing process isolation, which is one of the very first things you need to tackle to be Spectre safe. SecureRender does do process isolation. And so far we're feeling pretty confident, with all the audits, that we've gotten to a point where yes, actually, in the future we may be able to have arbitrary JavaScript and HTML be cross-site injected, cross-origin injected into any other website and still be secure, and instantly render your data without your data ever leaking out of your browser to some random server.

Nicholas: Fascinating. We've talked about so much. This is incredible. So I guess if people want to get started playing around with Gun, Gun.eco, is that the best place to go?

Mark Nadal: Yeah, Gun.eco.

Nicholas: How did you come up with the name Gun, by the way?

Mark Nadal: As a mathematician, I like to have single letter variables, or as few letters as possible, and you know, with dot coms and the npm package registry, a lot of three letter names are taken. Gun was not taken. So I was very excited that I could get a very short, succinct name. And I also, you know, just played a lot of Halo and Titan. I mean, Titan came out later. I don't actually like programming. I'm more of a mathematician, but I like the idea that Gun is fast, is powerful. And the database is the digital weapon of dangerous tools. FAANG, Google, Apple, Microsoft have taken our data and used it to manipulate things, to destroy things. Now, of course, a real life physical gun is more dangerous, but I wanted to constantly remind an application developer that when they're dealing with user data, they're working with a loaded gun. They have to be very careful and sensitive about the security of things. So I've had to branch out into all this stuff over time, cryptographic explainers using cooking analogies and now SecureRender, because too many people are running amok with my own system, sharing their key pairs between different applications, right? So I take that all very seriously and very sensitively. I didn't expect so many web developers who are fine with playing violent video games to then suddenly be like, oh, I'm not going to use something named Gun. I've been a little bit surprised at how, no offense to people out there, hypocritical a lot of people are, where they'll play violent games with kids, but then they're not going to use a database which is trying to teach a moral message of security. So that's kind of the origin of the name of Gun, but it's kind of evolved over time to two other things, which is governed under none, G-U-N. You are not beholden to anybody else. You're not governed under anybody else. Blockchain is this notion of everybody comes together and is part of a governance system and that's cool, right? 
But like, what if I just want to go off into the wild and have my own utopia system, right? So governed under none is a big part of it as well.

Nicholas: I love it. It's a little bit incendiary. And frankly, I mean, if you care about things like, I don't know, the right to bear arms, surely a decentralized, real time, open source, highly performant database technology is really the kind of arms you need to make a difference and secure freedom, freedom of expression, et cetera, today. So it's very exciting to talk to you about all this. I'm going to try it out. I'm going to play with it. Everybody I know who has played with it so far is extremely enthusiastic and looking for new opportunities to use GUN in their projects. So thanks so much for coming onto the show, Mark.

Mark Nadal: Yeah. Thanks so much.

Nicholas: Have a good one, everyone. This is awesome. Thanks everyone for coming to listen and see you next week. Hey, thanks for listening to this episode of Web3 Galaxy Brain. To keep up with everything Web3, follow me on Twitter at Nicholas with four leading Ns. You can find links to the topics discussed on today's episode in the show notes. Podcast feed links are available at web3galaxybrain.com. Web3 Galaxy Brain airs live most Friday afternoons at 5 PM Eastern time, 2200 UTC on Twitter Spaces. I look forward to seeing you there.
