There are only a few more slices left of the cake that is 2022. But as the saying goes: you can't have your cake and experience linear time.
It's time to start wrapping this year's project up. A quick recap of what's been done so far:
- A ground-up rebuild of Earthstar (this was good, trust me)
- A new, maybe slightly too interactive Earthstar CLI
- A new replica server implementation with support for extensions
- One of those extensions being able to serve the contents of a share to browsers
- A new core API for syncing a share with a filesystem directory
- Support for adding (potentially very big) arbitrary data to shares with attachments.
That last one was one of the 'big fish' of these milestones, and I'm really happy to have it merged (so that I could get to the other big fish).
Earthstar's data model divides universes of data into shares. You could have a share for your family, another for your gaming friends, another for collaborators on your open source project.
Shares are basically a namespace represented by an address. Up to now, a share address would look something like this:
If you know this address, you can create your own replica keyed to this share and start writing documents to it. And then you can sync that replica with other users' replicas of the same share.
Which basically means a share's address grants write access to that share. Which makes it a very sensitive bit of information.
The random suffix of the share address (e.g. .b23xue8orl) was a way to make a share address harder to guess. But this doesn't help much when the address leaks, e.g. through a screenshot, looking over a shoulder, or other means.
This meant I had to spend time doing funny things like hiding the suffix in UIs. But no longer!
The next version of Earthstar will introduce protected shares. Obtaining a share address will only confer the ability to sync with other replicas and read that share's documents. To write to a share, you will also need a share secret.
The share address' suffix will become a public key:
to which you need the corresponding secret to write valid documents to the share.
This opens interesting new use cases for Earthstar shares where a trusted group publishes work to the broader public, like blogs or podcasts.
It also makes it easier for shares to be hosted, as replica servers only need the public address to replicate a share.
There will be no option for secretless shares in the future.
In my May + June update, I was excited about range-based set reconciliation, a new method of identifying differences between two sets and reconciling them, and potentially using that to improve Earthstar's syncing. I talked about it like I was going to get it done over a weekend. Happy days.
These past few weeks I've been working with Aljoscha Meyer to create a new JS implementation of this method, and am about to start integrating this new module with Earthstar. Thanks to Aljoscha's wonderful guidance, I've learned a lot about this method and thought it would be interesting to outline it here!
Imagine you have two sets with only some elements in common.
How do we determine the difference between the two sets while sending as little information as possible between the two peers?
Enter this method's first trick: generating fingerprints.
So for example, if the fingerprints for both sets match, then we know that both sets are extremely likely to hold the same elements. And all we had to exchange was fingerprints!
This is pretty much like us generating a hash for the contents of two files and comparing them.
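Here's a minimal sketch of that idea in TypeScript. To be clear, this is not how the module computes fingerprints: the `hashItem` and `fingerprint` names are made up for illustration, and XOR-ing together 32-bit FNV-1a hashes is just an assumption that keeps the example short. What matters is that the combined value doesn't depend on the order of the items.

```typescript
// FNV-1a over UTF-16 code units: a simple, non-cryptographic 32-bit hash.
// (An assumption for brevity; a real fingerprint needs something stronger.)
function hashItem(item: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < item.length; i++) {
    hash ^= item.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

// XOR is commutative and associative, so the order of items doesn't matter.
function fingerprint(items: readonly string[]): number {
  let combined = 0;
  for (const item of items) {
    combined = (combined ^ hashItem(item)) >>> 0;
  }
  return combined;
}

const alice = ["ape", "bee", "cat"];
const bob = ["cat", "ape", "bee"]; // same items, different order
const carol = ["ape", "bee", "car"]; // one item differs

console.log(fingerprint(alice) === fingerprint(bob)); // true
console.log(fingerprint(alice) === fingerprint(carol)); // false
```

With only 32 bits and a non-cryptographic hash, collisions are far too easy to come by, so treat this purely as an illustration of the shape of the thing.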
But what if the fingerprints don't match?
At this point, we know there's some difference between the two sets. But where?
This is where this method's second trick comes in: we can generate a fingerprint for a specific range within a set.
Using this, we can subdivide the range of a non-matching fingerprint and identify where the difference is:
We can repeat this approach of comparing fingerprints and drilling down further until we reach a certain threshold where we finally send the items themselves to the other peer.
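The drill-down described above can be sketched roughly like this. It's a single-process simulation under some loud assumptions: both 'peers' are just sorted arrays in the same program, a range is always split in two at the middle of the first peer's items, and fingerprints are a toy XOR of FNV-1a hashes. All the names here (`reconcile`, `itemsInRange`, and so on) are hypothetical and not the module's API. In a real exchange, each fingerprint comparison would be a message sent between peers.

```typescript
type Diff = { missingFromA: string[]; missingFromB: string[] };

// Toy 32-bit FNV-1a hash (an assumption, not what a real implementation uses).
function hashItem(item: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < item.length; i++) {
    hash ^= item.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

// Order-independent fingerprint: XOR of the item hashes.
function fingerprint(items: readonly string[]): number {
  return items.reduce((acc, item) => (acc ^ hashItem(item)) >>> 0, 0);
}

// Items x with lower <= x < upper; null means "unbounded" on that side.
function itemsInRange(
  sorted: readonly string[],
  lower: string | null,
  upper: string | null,
): string[] {
  return sorted.filter(
    (x) => (lower === null || x >= lower) && (upper === null || x < upper),
  );
}

// threshold must be at least 1, otherwise the splitting never bottoms out.
function reconcile(
  a: readonly string[],
  b: readonly string[],
  lower: string | null = null,
  upper: string | null = null,
  threshold = 2,
  out: Diff = { missingFromA: [], missingFromB: [] },
): Diff {
  const inA = itemsInRange(a, lower, upper);
  const inB = itemsInRange(b, lower, upper);

  // Trick one: matching fingerprints, so this range is very likely in sync.
  if (fingerprint(inA) === fingerprint(inB)) return out;

  // Small enough range: stop fingerprinting and exchange the items themselves.
  if (inA.length <= threshold || inB.length <= threshold) {
    out.missingFromB.push(...inA.filter((x) => inB.indexOf(x) === -1));
    out.missingFromA.push(...inB.filter((x) => inA.indexOf(x) === -1));
    return out;
  }

  // Trick two: split the range and compare fingerprints for each half.
  const mid = inA[Math.floor(inA.length / 2)];
  reconcile(a, b, lower, mid, threshold, out);
  reconcile(a, b, mid, upper, threshold, out);
  return out;
}

const alicesItems = ["ape", "bat", "cat", "doe", "eel", "fox", "gnu", "hen"];
const bobsItems = ["ape", "bat", "cat", "doe", "eel", "foe", "gnu", "hen"];

const result = reconcile(alicesItems, bobsItems);
console.log(result); // { missingFromA: [ 'foe' ], missingFromB: [ 'fox' ] }
```

In this run the halves `ape` to `doe` and `gnu` to `hen` are dismissed by a single fingerprint comparison each, and items are only exchanged for the small range holding `fox` and `foe`.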
Of course this is just scratching the surface, and I've obscured and fudged some details for the sake of brevity. There are some ways you can deviate from the above (e.g. how ranges are subdivided), and some critical rules I have not mentioned at all (e.g. both sets must have a common total order). There are also many details to making this work quickly which I'm omitting, as well as many fun properties this method has which can be used to make it even faster!
Most importantly, the implementation I've built has been created as a JS/TS module for general usage. I'll be hooking up this module to Earthstar soon, and releasing it as open source shortly for others to use and peruse afterwards. I'm very excited to see how much time and energy will be saved by adding this exciting method to Earthstar, and maybe outside of it too.
Andrew Chou's app workshops
Finally, Andrew Chou has been hosting workshops where he builds little apps using Earthstar for data persistence and transport. In the first two sessions he built a chatroom with display names and online indicators in about 200 lines of code (source here).
And that's without any frameworks! These workshops have also secretly been Web API workshops in which we've been using standard browser APIs to build interactive apps. This has made me especially happy, as I've always tried to design Earthstar to exist as just another API alongside them.
In future workshops we'll be experimenting with Earthstar's new attachment capabilities to build some multimedia apps. If you're interested, they'll be hosted on the Earthstar Discord server.
Until next time
I expect the rate of updates to accelerate in these last few months of the year. I also hope I'll get a chance to do a post-mortem of the year and think about what I got right (and wrong) in this very first year of full-time open source work. But until then, it's back to working on these last few features before Earthstar Squirrel's release. Ciao!