2021-02-03

Distributed Social Media Is HARD

I stumbled upon an article on Mastodon's failings. It's only tangentially relevant to this post, as most of it is about social problems, but it got me thinking about the technical side of things again. Because the technical issues with designing distributed social media are... interesting. I'm failing to find the word for it now, but on the face of it the problem looks deceptively simple even though it's actually really, really hard.

The article that got me started today, because it mentions how horrible ActivityPub is.

Let's Start Simple

I mean. What is social media anyway? Let's take microblogging as our example, because that's just about publishing a few words into the ether every now and then. It can't be hard at all, can it?

Just type a line with a timestamp in a public file that your followers can check.
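A minimal sketch of that idea, assuming a tab-separated "timestamp, tab, text" line (the layout twtxt uses). The file name feed.txt and the function names are made up for illustration.

```python
from datetime import datetime, timezone
from pathlib import Path


def format_post(text: str, when: datetime) -> str:
    """Return one feed line: ISO 8601 timestamp, a tab, then the text."""
    # Newlines and tabs would break the one-post-per-line format,
    # so collapse all whitespace runs into single spaces.
    flat = " ".join(text.split())
    return f"{when.isoformat(timespec='seconds')}\t{flat}\n"


def publish(feed: Path, text: str) -> None:
    """Append a post to the public file that followers fetch."""
    with feed.open("a", encoding="utf-8") as fh:
        fh.write(format_post(text, datetime.now(timezone.utc)))


if __name__ == "__main__":
    # Hypothetical file name; in practice this sits in your web root.
    publish(Path("feed.txt"), "Distributed social media is hard.")
```

Followers then only need to fetch the file and read it line by line, which is exactly why it looks so easy at this stage.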

It's so simple. If you want to use more common standards you can write an Atom feed instead. But we immediately see problems with this. It's an incomplete solution, to say the least. First of all, we're not talking about long-form posting; even prolific bloggers write new articles on a daily basis at most, but on social media a median user may post several times in a day if they get into a discussion. We run into issues of scaling message distribution, as I've mentioned earlier on my gemlog:

2021-01-15 Tricks to Scaling Distributed Social Networks

But it's not even just that. Even if you feel that having a thousand users poll your feed every 30 seconds is perfectly fine...
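To put a rough number on that polling load, here is a back-of-the-envelope calculation. It assumes every follower polls at exactly the same fixed interval and ignores caching entirely; the function name is made up.

```python
def polling_load(followers: int, interval_s: float) -> tuple[float, int]:
    """Return (requests per second, requests per day) for naive polling."""
    per_second = followers / interval_s
    per_day = round(followers * 86_400 / interval_s)
    return per_second, per_day


if __name__ == "__main__":
    per_second, per_day = polling_load(followers=1000, interval_s=30)
    print(f"{per_second:.1f} requests/s, {per_day:,} requests/day")
```

With a thousand followers and a 30 second interval that works out to about 33 requests per second, or 2,880,000 requests per day, for a feed that may not have changed at all.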

It Gets Harder

Apart from scaling we also need to deal with things like discoverability (how do other users find you?), and mentions (how will you know if someone mentions you, unless you follow their feed?).

I mean, twtxt has gone some way to helping out here. You can push your post to a registry, which synchronizes with other registries and takes some polling load off of your server (if users poll the registry instead). Anyone can add their feed to a registry. It just takes a curl call to do so.

So that's solved then? Sure. Let's move on.

It Gets Even Harder

Actually, anyone can add anyone else's feed to a registry too. I don't at a glance see how this can be used for nefarious purposes, but my gut feeling tells me it can. Authorization is one very hard nut to crack in distributed social media systems. Most of your posts will be public, and that means anyone can read them. With twtxt this is quite obvious, but with Mastodon and other ActivityPub implementations this sometimes trips users up. They may have blocked someone, but that doesn't stop that user from reading their public posts. It does mean you don't have to see their shit, though.

That's something a twtxt client can solve, of course: just filter out stuff from particular users from your mentions as you fetch them from a registry.
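A rough sketch of such a client-side filter. It assumes the registry returns one mention per line as tab-separated nick, feed URL, timestamp and text; that layout, the example URLs and all names below are assumptions for illustration, not a documented client API.

```python
from typing import Iterable, Iterator, NamedTuple


class Mention(NamedTuple):
    nick: str
    url: str
    timestamp: str
    text: str


def parse_registry_lines(lines: Iterable[str]) -> Iterator[Mention]:
    """Parse tab-separated registry output, skipping malformed lines."""
    for line in lines:
        parts = line.rstrip("\n").split("\t", 3)
        if len(parts) == 4:
            yield Mention(*parts)


def drop_blocked(mentions: Iterable[Mention], blocked_urls: set[str]) -> list[Mention]:
    """Keep only mentions whose author feed URL is not on the block list."""
    return [m for m in mentions if m.url not in blocked_urls]


if __name__ == "__main__":
    raw = [
        "alice\thttps://alice.example/twtxt.txt\t2021-02-03T10:00:00Z\thi there\n",
        "troll\thttps://troll.example/twtxt.txt\t2021-02-03T10:01:00Z\tspam\n",
    ]
    blocked = {"https://troll.example/twtxt.txt"}
    for mention in drop_blocked(parse_registry_lines(raw), blocked):
        print(mention.nick, mention.text)
```

The sketch blocks by feed URL rather than by nick, since nicks are chosen by the poster and trivially changed. Note that this only hides the posts from you; it does nothing to stop the blocked user from reading your feed.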

But then this is very very basic. We want more from typical social media. We want to post pictures! Have avatars!

I honestly don't know how they do this on top of the tiny twtxt specification, but it doesn't look so simple anymore. They even have markdown support.

This isn't even close to what ActivityPub offers today, and we're barely scratching the surface of the problems we want to solve.

Remember blocking other users? In ActivityPub a Follow is a mutual agreement: A wants to follow B and B wants to be followed by A. Both servers remember this decision, which is important because it allows B to post stuff to followers only (that's a feature you want, right? Semi-public posts. This is only possible if some form of authentication and authorization is in place, by the way). But because we solved scaling of distribution with sharedInbox, we introduced the problem of out-of-sync follower lists. Maybe B removed A as a follower, but A's instance failed to process that request for some reason. Then B posts to @followers, and another follower, C, is on the same instance as A. That instance gets the message and distributes it to everyone it believes follows B, including A...
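The failure mode can be shown with a toy model. This is not ActivityPub's actual delivery algorithm, just a sketch under the assumption that the receiving instance fans a shared delivery out according to its own copy of the follower list. All class and function names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    """A server holding local accounts and its own copy of follower lists."""
    name: str
    local_users: set[str]
    # author -> local users this instance believes follow that author
    followers_of: dict[str, set[str]] = field(default_factory=dict)
    timelines: dict[str, list[str]] = field(default_factory=dict)

    def shared_inbox(self, author: str, post: str) -> None:
        """One delivery per instance; the receiver fans out locally."""
        for user in self.followers_of.get(author, set()) & self.local_users:
            self.timelines.setdefault(user, []).append(f"{author}: {post}")


def simulate() -> dict[str, list[str]]:
    remote = Instance("remote.example", {"A", "C"}, {"B": {"A", "C"}})
    # B removes A as a follower, but the remote instance never processes
    # that request, so its copy of B's follower list still contains A.
    followers_as_seen_by_b = {"C"}
    # B's instance delivers once to every instance hosting a follower.
    if followers_as_seen_by_b & remote.local_users:
        remote.shared_inbox("B", "followers-only post")
    return remote.timelines


if __name__ == "__main__":
    for user, posts in sorted(simulate().items()):
        print(user, posts)
```

In this model both A and C end up with the followers-only post in their timelines, even though B's own instance only ever intended to reach C.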

Sharing is another thing. It's called boosting on Mastodon, and it's distinct from a post you author yourself. It's not a copy-paste, but a forward of a message. That's a simple thing to implement, but it's one more thing on top of all the others.
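One way to picture the difference between a forward and a copy: a boost stores only a reference to the original post. The sketch below is an illustrative data model with made-up names, not Mastodon's internal representation.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Post:
    id: str
    author: str
    text: str


@dataclass(frozen=True)
class Boost:
    id: str
    booster: str
    target_id: str  # points at the original post; no text is copied


def render(item: Union[Post, Boost], posts: dict[str, Post]) -> str:
    """Render a timeline entry, resolving boosts to the original post."""
    if isinstance(item, Boost):
        original = posts.get(item.target_id)
        if original is None:
            # The original was deleted or never reached this server.
            return f"{item.booster} boosted a post that is no longer available"
        return f"{item.booster} boosted {original.author}: {original.text}"
    return f"{item.author}: {item.text}"


if __name__ == "__main__":
    original = Post("1", "bjorn", "Distributed social media is hard")
    store = {original.id: original}
    print(render(Boost("2", "alice", "1"), store))
```

Because only the reference is stored, attribution stays with the original author and a deleted original takes its boosts with it. A copy-paste would lose both properties.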

I haven't even mentioned likes/acks/+1 or whatever you want to call them. But that's also a central thing that we expect from modern social networks. And responses; posts that have a clear relation to other posts.

And if your solution becomes popular you're going to have to deal with spam. Sorry, but that's a fact.

Something ActivityPub hasn't solved yet is nomadic identities (zot6 has this, though I don't know how it works): the ability to keep your online identity even if one of the instances you're on goes down or disappears.

Solutions?

All of these problems are solvable. It's the combination of them all that becomes overwhelming. Suddenly your protocol is so big that the only viable implementation is your own and nobody can reasonably hope to replicate that. It becomes a piece of software instead of a protocol.

If you want to keep it simple you need to go the Gemini way: non-extensible, bounded scope. Define exactly which problems you want to solve, and more importantly which ones you don't intend to solve.

-- CC0 Björn Wärmedal