1. The "multiple domains" feature. I'm running my own Mastodon instance right now purely so I can have my simonwillison.net domain as my identifier there (and protect myself from losing my identifier if the server I am using shuts down). This feels pretty wasteful! I'd much rather be able to point my domain at a Takahē instance shared with some of my friends, each with their own domains for it.
2. It's a Django app that's taking full advantage of the async features that have been added in the most recent releases of that framework. Async is a perfect match for ActivityPub due to the need to send thousands of outbound HTTP requests when publishing a message. And Takahē creator Andrew Godwin is the perfect person to build this because he's been driving the integration of async into Django for the past four years: https://www.aeracode.org/2018/06/04/django-async-roadmap/
3. The way it handles task queueing is super interesting. I've not fully got my head around it yet but it's the part of the codebase called Stator and it's modeled on things like the Kubernetes reconciliation loop - Andrew wrote a bit more about that here: https://www.aeracode.org/2022/11/14/takahe-new-server/ - Stator code is here: https://github.com/jointakahe/takahe/blob/main/stator/runner...
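Roughly, the reconciliation-loop idea is: store each object's state in the database, and have workers repeatedly scan for objects that aren't yet in a terminal state and advance them one transition at a time. A minimal sketch of that shape (names and states invented here, not Stator's actual API):

```python
# Hypothetical sketch of a reconciliation-style task loop (invented names,
# not Takahe's real Stator code): each item records its current state, and
# the loop keeps advancing items one transition until nothing changes.
TRANSITIONS = {"new": "fanned_out", "fanned_out": "done"}

def reconcile(items):
    """Advance every item one state per pass until all reach 'done'."""
    progressed = True
    while progressed:
        progressed = False
        for item in items:
            nxt = TRANSITIONS.get(item["state"])
            if nxt is not None:
                item["state"] = nxt  # in a real system this step has side effects
                progressed = True
    return items
```

The appeal is that a crashed worker loses nothing: state lives in the database, so the next scan simply picks up where things left off.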
Async is good for lots of IO work and managing independent tasks with low coupling.
I am interested in task scheduling and asynchronous code: programming language development, parallelism, concurrency without parallelism, and cooperative and preemptive scheduling.
As an experiment inspired by Protothreads (a C library for implementing cooperative multitasking with a switch statement) I recently implemented async/await in Java as a giant switch statement and a while loop.
Provided that each coroutine runs only one step at a time, the amount of memory used does not grow. The goal is to be stackless.
I also played around with C++ coroutines, but someone told me that the approach I used is not C++20-style.
Code is at https://GitHub.com/samsquire/multiversion-concurrency-contro...
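The pattern can be sketched in Python too (names invented here; the linked repo is Java): each coroutine is a plain function that dispatches on a saved integer state, runs one step, records the next state, and returns, so no call stack is retained between steps:

```python
# Hypothetical sketch of the protothread idea: a "coroutine" is a function
# that resumes from a saved state number, does one step, and returns.
def make_counter():
    ctx = {"state": 0, "i": 0}

    def step():
        if ctx["state"] == 0:      # entry point
            ctx["i"] = 0
            ctx["state"] = 1
        elif ctx["state"] == 1:    # loop body: "yield" after each increment
            ctx["i"] += 1
            if ctx["i"] >= 3:
                ctx["state"] = 2   # finished
        return ctx["state"] != 2   # True while more work remains

    return step, ctx

# A trivial cooperative scheduler: run coroutines round-robin until done.
def run_all(steps):
    pending = list(steps)
    while pending:
        pending = [s for s in pending if s()]
```

Because all per-coroutine data lives in `ctx` rather than on the stack, thousands of these can be kept pending with constant memory per coroutine.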
The reconciliation loop idea sounds interesting.
The problem is that you can't have multiple domains point to a single Mastodon instance. I'd like to share my single instance with friends who can bring their own domain name.
Basically the problem is that current Mastodon only supports a single value for each of LOCAL_DOMAIN and WEB_DOMAIN.
More details on how mine works here: https://til.simonwillison.net/mastodon/custom-domain-mastodo...
I'm running a bit of a proxy at https://simonwillison.net/.well-known/webfinger?resource=acc... but it still needs to point to my own dedicated instance, just because Mastodon can't have multiple domains pointed at a single instance of the software yet.
I'm using this pattern (also shared by Andrew, before he started to spin up Takahē) https://aeracode.org/2022/11/01/fediverse-custom-domains/
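The trick, roughly: your own domain answers /.well-known/webfinger with a document whose links point at the account on the real instance. A minimal sketch of building that document (field values are illustrative, not Simon's actual setup):

```python
# Illustrative webfinger response for the custom-domain pattern: the custom
# domain serves this JSON, delegating the account to the real instance.
def webfinger(user, custom_domain, instance_domain):
    return {
        "subject": f"acct:{user}@{custom_domain}",
        "links": [
            {
                "rel": "self",
                "type": "application/activity+json",
                "href": f"https://{instance_domain}/users/{user}",
            }
        ],
    }
```

Serving that from a static file or a tiny view is enough for lookups of user@custom_domain to resolve, which is why the pattern works without touching Mastodon itself.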
I had been working on an ActivityPub server in Node.js/TypeScript for a while before the Twitter migration. It's got most of the features I'd want in a small server but it's basically bring-your-own-client at the moment.
Finding all the resources to build a complete server that can interact with other instances isn't easy, so maybe this can help someone. The spec is well worded, but the checklist is confusing, the test server is down, Mastodon has its own rules, etc. Plus you have to have at least a cursory knowledge of JSON-LD/RDF.
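For anyone who hasn't seen one, here's an illustrative minimal Create activity wrapping a Note, as a plain dict; the @context line is what makes it JSON-LD (values here are made up; see the ActivityPub/ActivityStreams specs for the full vocabulary):

```python
# Illustrative minimal ActivityPub "Create" activity wrapping a Note.
# Field values are invented; the type/actor/object/to keys and the
# @context URL come from the ActivityStreams vocabulary.
note_create = {
    "@context": "https://www.w3.org/ns/activitystreams",
    "type": "Create",
    "actor": "https://example.com/users/alice",
    "object": {
        "type": "Note",
        "content": "Hello, fediverse!",
        "to": ["https://www.w3.org/ns/activitystreams#Public"],
    },
}
```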
I had the idea of running a single-user server on Cloudflare Workers and using D1 (their SQLite-based DB). A lightweight JS/TS implementation would be perfect. Looks like you have Postgres planned; it would probably be possible to expand from that to SQLite.
It's like Elon unknowingly funded this space!
Now it's mainstream to work on a cool technology that's been around for a while!
Oh and everyone can act like they weren't bad mouthing the tech and saying it wasn't going to work before.
Visit any Mastodon thread here before Elon's Twitter and it's nothing but negativity.
also, a lot of these projects have been on a slow simmer for a long time, and are only just now starting to become complete and interesting.
edit: though it does seem to be true that takahe's initial commit was nov 5 :) and personally i don't consider it complete and interesting yet
Go ahead and look over any Mastodon thread a year ago or before.
Generally it was dismissed with "oh it's too niche" or "moderation will be too difficult".
People ignored the communities already on it and the tech overall.
Only when people got pissy about Elon running Twitter instead of hedge funds did the general sentiment here change.
It wasn't about the tech, and not even about Elon specifically, it was the Twitter safe space got taken away.
But now hopefully the people who want that safe space will isolate themselves in mastodon instances that block all others and we can all live in peace from them.
Or rather, which ones already are.
When you block other instances you realize you are islanding yourself off, right? Not the other way around.
Everyone is federated until you block, so you are isolating yourself from the norm when you block others.
It's not too noticeable as the English-speaking instances are currently small, but those who don't want Wolfballs and friends, or this or that, are more isolated than those who do.
It makes sense that those who want to hide views from others are the outliers. Those who want to be open and allowing of diverse thought are more interoperable.
My argument is that if a server behaves in a way that the majority of large servers block it, then it is islanded, not the others.
I have tried unsuccessfully so far to set up an OAuth provider server along with it, so that you could log in on your phone, etc.
That's exciting! The fediverse is severely lacking algorithmic curation, presumably due to the belief that it's inherently evil (I'd strongly disagree; what's bad is merely that the algorithm isn't user-controllable).
"These posts from this and other servers in the decentralized network are gaining traction on this server right now."
I don't know what the logic is, but on big servers it's listing a lot of content.
It's probably not that difficult to add one based on your feed if there is one globalized already.
So if you want to check out what's the current buzz, you can, but you won't unknowingly be missing posts from those you follow (which seems to be the common complaint on Twitter).
We are talking about heuristics here, not algorithms.
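One plausible heuristic of that kind (not Mastodon's actual formula, just an illustration): score each post by engagement, decayed by age, and sort:

```python
# Invented trending heuristic: weight boosts above favourites, then decay
# the score by age so recent popular posts float to the top.
def trending_score(boosts, favourites, age_hours):
    engagement = boosts * 2 + favourites
    return engagement / (age_hours + 2) ** 1.5

posts = [
    {"id": 1, "boosts": 10, "favourites": 30, "age_hours": 1},
    {"id": 2, "boosts": 50, "favourites": 80, "age_hours": 48},
]
ranked = sorted(
    posts,
    key=lambda p: trending_score(p["boosts"], p["favourites"], p["age_hours"]),
    reverse=True,
)
```

The point is that something this simple, kept in a separate "trending" tab, gives opt-in discovery without ever reordering the home timeline.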
I especially like some of the biological metaphors:
https://en.m.wikipedia.org/wiki/Aggressive_mimicry#Mimesis “cryptic aggressive mimicry is where the predator mimics an organism that its prey is indifferent to” i.e. wolf in sheep’s clothing.
Why does it need a PostgreSQL server? For just a handful of users, isn't SQLite the leaner yet sufficient choice?
How does it compare to GoToSocial, which requires 50-100MB of RAM? They are also in the alpha stage, and I like their approach of keeping the web UI separate.
For reference, when I say "small to medium", in my head that means "up to about 1,000 people right now".
The general sense I have got is that Mastodon - the default software, at least - is extremely resource-heavy for relatively low user counts. My assumption/hope was that this is largely because the server software has never really been under enough pressure to improve, and Takahē seems to indicate that there's at least some room for improvement on the server side (i.e. the performance problems aren't entirely protocol/architecture problems).
GIN indexes sound cool - perhaps you could get away without using them, however, and instead support two DB backends?
If you want to accommodate "small" and not just medium, that would be great! ;-)
Is there any advantage to using a traditional DB as opposed to a graph DB, since JSON-LD is just a text representation of graph nodes?
I was thinking the easiest path would be to have the server deal with all the ActivityPub stuff and expose something like a GraphQL interface for a bring-your-own-client implementation. Of all the stuff they've shoehorned GraphQL into, this seems like a valid fit, like they were made for each other.
Anyhoo, just my random thoughts…
Mostly I was thinking about how one could implement something in the most efficient way, and graph databases/GraphQL were literally designed for this stuff.
Actually that looks more like an interactive client.... https://news.ycombinator.com/item?id=30875837
This doesn't match my experience from the last few years. SQLite in WAL mode is extremely capable.
The only thing I really miss from PostgreSQL is that PostgreSQL has more built-in functions for things like date handling - but SQLite custom functions are very easy to register when you need them.
It also has excellent JSON features - JSON may be stored as text rather than in a binary format like JSONB in PostgreSQL, but the SQLite JSON functions crunch through it at multiple GBs per second, so it doesn't seem to matter.
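A quick demo of both points, using only the Python standard library: SQLite's built-in json_extract, plus a custom scalar function registered from Python (the reverse function here is just an invented example):

```python
# SQLite JSON functions plus a custom Python-registered scalar function.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, data TEXT)")
conn.execute(
    "INSERT INTO posts (data) VALUES (?)",
    ('{"author": "alice", "favourites": 7}',),
)

# json_extract is built in to modern SQLite:
fav = conn.execute(
    "SELECT json_extract(data, '$.favourites') FROM posts"
).fetchone()[0]

# Registering a custom scalar function takes one line:
conn.create_function("reverse", 1, lambda s: s[::-1])
rev = conn.execute(
    "SELECT reverse(json_extract(data, '$.author')) FROM posts"
).fetchone()[0]
```

This is the kind of gap-filling that makes "PostgreSQL has more built-in functions" a fairly small advantage in practice.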
What I find a bit unfortunate about Takahe is the coupling with Docker.
An even leaner ActivityPub implementation seems to be MicroBlogPub. I have not yet managed to set it up though.
Anybody interested in collaborating on a MicroBlogPub install script that turns a fresh Ubuntu installation (or container) into a running MicroBlogPub instance?
When I saw "Prerequisites: Something that can run Docker/OCI images" in the documentation, my interpretation was that containers are needed. It also says "You need to run at least two copies of the Docker image". Maybe you want to change the wording a bit.
I would also collaborate on writing a setup script for Takahe then!
I really like to write a setup script instead of following manual installation guides. So for every software I try, my first step is to write a script that turns a fresh Debian installation into a running instance. (MicroBlogPub needs Python 3.10 which is not in Debian stable, so I would use Ubuntu)
Creating a non-Docker fork would then probably be an uphill battle.
> So, I am deliberately avoiding offering a non-Docker install path that is supported right now as it leads to a lot of support burden with different OS package versions and the like!
That doesn't mean you can't write and share a script for people who want to install it without Docker.
It means that he doesn't want to take responsibility for non-Docker installation scripts as part of the official documentation (yet), because if he did that he'd be on the hook to keep researching and updating those scripts in the future.
While I don't love it, it's very understandable for a single-dev application. Anything else invites blizzards of questions and bugs filed by people using their distro version of Django vs. their downloaded version of Django, and the many versions of distros, and the many conventions for Python environments and...
For instance, Go seems to be around an order of magnitude faster than Ruby, and I think I've seen a Golang implementation of ActivityPub somewhere. https://programming-language-benchmarks.vercel.app/go-vs-rub...
Django ranks 137th out of 142 across numerous web frameworks and languages. It's literally one of the least performant options that exist.
Such framework rankings are also utterly irrelevant when you want something widely used enough to easily find contributors and integrations. That restricts you quite a bit more than “any so called framework that just handles http”.
Did you even look at the top performers on that page? This is number 2: https://github.com/Xudong-Huang/may_minihttp
It depends on how you code. I wrote a user instance in Django and I'm happy with its performance.
I see the sibling comment about obfuscation, but not sure I follow either of you. Is this code not clear?
To me the code reads with humor and creativity, while every bit as self-evident as a Gary Larson Far Side cartoon on second glance. I mean, what else is nomoroboto going to do than what it does?
I've never seen this tone in the wild before, but got a kick out of it, might even find it refreshing maintaining it.
Anyway, you're right, all code should be written in haiku form, to maximize creativity and succinctness, plus keeping methods short! True elite coders ensure variable names are always a prime number of characters
AP is not entirely Twitter-style microblogging. It can be used to exchange photos, video, audio, documents, invitations, and meeting appointments, as data or links. The default privacy assumption for all AP content is that it is basically public.
Matrix is not built on AP. Matrix is a real-time communications protocol suitable for private messages and public chats. Its mission appears to be to bridge every other protocol, so there's at least one Matrix-ActivityPub bridge module, MXToot.
OpenID is not part of these. Some Matrix servers can use OpenID for authentication. As far as I know, no ActivityPub servers currently use OpenID.
The federation was always meant to build communities around shared interests and values.
If I wanted the twitter "experience" I would just use twitter.
If I wanted a "free for all" environment I would be on 4chan.
I was trying to understand how it works in practice. It would be relevant to picking which instance to use.
Or will associating with one likely get my instance blocked from federating with the other?
Yes. Some are over things that should be relatively uncontroversial, like loli-hosting instances. Others are more likely to garner upset on HN.
The same thing has happened in the free software movement in general; some folks have called for copyleft-only, and the rest of the world has largely ignored them and is fine with shipping BSD licensed software along with GPL licensed software.
In reality, the biggest servers still federate with most servers. With the latest changes in Mastodon you can now see whom they don't federate with. It's not a huge list, and if you browse the instances they don't federate with, it's pretty obvious why.
A lot of those defederated servers advertise themselves as some sort of free-speech-absolutist alternatives.
More broadly, I find it sad when the names of natural species and features are adopted in the business and technology world without any deep connection. A canonical case would be Amazon the company, which has prospered and become a household name while the Amazon itself, with its people and ecosystems, has suffered and declined. An egregious case relevant to NZ is Kiwi Farms.
The trend of using species names in technology perhaps started with the O’Reilly books. The argument can be raised that such use raises awareness of endangered species such as the takahē. But perhaps that is best left to other means, for fear that the mauri of a species should be captured and harmed.
Just because you disagree doesn't make it "woke".
Is a woke term. By the way they mention how you can donate to a Takahe recovery programme here https://jointakahe.org/about/
Also see this: