Agreed. The head of line problem is worth solving for certain use cases.
But today, every streaming system (or workaround) offering per-message-key acknowledgements incurs O(n^2) cost in computation, bandwidth, or storage for n messages. This applies to Pulsar, for example, which is often chosen for this feature.
Now, this degenerate time/space complexity might not show up every day, but when it does, you’re toast, and you have to wait it out.
My colleagues and I have studied this problem in depth for years, and our conclusion is that a fundamental architectural change is needed to support scalable per-message-key acknowledgements. Furthermore, the architecture will fundamentally require a sorted index, meaning that any such queuing / streaming system will process n messages in O(n log n).
We’ve wanted to blog about this for a while, but never found the time. I hope this comment helps out if you’re thinking of relying on per message key acknowledgments; you should expect sporadic outages / delays.
It processes unrelated keys in parallel within a partition. It has to track which offsets have been processed between the last committed offset of the partition and the tip (i.e. only what's currently processed out of order). When it commits, it saves this state, highly compressed, in the commit metadata.
Most of the time, it was only processing a small number of records out of order so this bookkeeping was insignificant, but if one key gets stuck, it would scale to at least 100,000 offsets ahead, at which point enough alarms would go off that we would do something. That's definitely a huge improvement to head of line blocking.
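That bookkeeping can be sketched in a few lines of Python (illustrative only — the names and the bitmap encoding are made up, not the parallel consumer's actual API):

```python
# Sketch of out-of-order offset tracking: complete offsets in any order,
# advance a committed watermark over contiguous runs, and encode the sparse
# set of completed-but-uncommittable offsets as a bitmap for commit metadata.

class OffsetTracker:
    def __init__(self):
        self.committed = -1   # highest contiguous offset acknowledged
        self.done = set()     # offsets completed out of order, above watermark

    def complete(self, offset):
        self.done.add(offset)
        # advance the watermark over any contiguous run of completed offsets
        while self.committed + 1 in self.done:
            self.committed += 1
            self.done.remove(self.committed)

    def commit_metadata(self):
        # bit k set means offset (committed + 1 + k) is done; gaps stay 0
        if not self.done:
            return self.committed, b""
        width = max(self.done) - self.committed
        bits = 0
        for off in self.done:
            bits |= 1 << (off - self.committed - 1)
        return self.committed, bits.to_bytes((width + 7) // 8, "little")

tracker = OffsetTracker()
for off in [0, 1, 3, 4]:          # offset 2 is stuck on a slow key
    tracker.complete(off)
print(tracker.commit_metadata())  # (1, b'\x06'): committed up to the gap
tracker.complete(2)               # the stuck record finally finishes
print(tracker.commit_metadata())  # (4, b''): watermark catches up
```

If one key gets stuck, `done` grows without bound — which is exactly the "100,000 offsets ahead" alarm scenario described above.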
galeaspablo 22 hours ago [-]
Disclosure (given this is from Confluent): I'm ex MSK (Managed Streaming for Kafka at AWS) and my current company was competing with Confluent before we pivoted.
Yup, this is one more example, just like Pulsar. There are definitely great optimizations to be made on the average case. In the case of parallel consumer, if you'd like to keep ordering guarantees, you retain O(n^2) processing time in the worst case.
The issues arise when you try to traverse arbitrary dependency topologies in your messages. So you're left with two options:
1. Make damn sure that causal dependencies don't exhibit O(n^2) behavior, which requires formal models to be 100% sure.
2. Give up ordering or make some other nasty tradeoff.
At a high level the problem boils down to traversing a DAG in topological order. From computer science theory, we know that this requires a sorted index. And if you're implementing an index on top of Kafka, you might as well embed your data into and consume directly from the index. Of course, this is easier said than done, and that's why no one has cracked this problem yet. We were going to try, but alas we pivoted :)
Edit: Topological sort does not require a sorted index (or similar) if you don't care about concurrency. But then you've lost the advantages of your queue.
taurath 21 hours ago [-]
> traverse arbitrary dependency topologies
Is there another way to state this? It’s very difficult for me to grok.
> DAG
Directed acyclic graph right?
galeaspablo 20 hours ago [-]
Apologies, we've been so deep into this problem that we take our slang for granted :)
A graphical representation might be worth a thousand words, keeping in mind it's just one example. Imagine you're traversing the following.
A1 -> A2 -> A3...
|
v
B1 -> B2 -> B3...
|
v
C1 -> C2 -> C3...
|
v
D1 -> D2 -> D3...
|
v
E1 -> E2 -> E3...
|
v
F1 -> F2 -> F3...
|
v
...
Efficient concurrent consumption of these messages (while respecting causal dependency) would take O(w + h), where w is the _width_ (left to right) of the longest sequence and h is the _height_ (top to bottom) of the first column.
But Pulsar, Kafka + parallel consumer, et al. would take O(n^2) in either processing time or space complexity. This is because, at a fundamental level, the underlying data storage looks like this:
A1 -> A2 -> A3...
B1 -> B2 -> B3...
C1 -> C2 -> C3...
D1 -> D2 -> D3...
E1 -> E2 -> E3...
F1 -> F2 -> F3...
Notice that the underlying data storage loses information about nodes with multiple children (e.g., A1 previously parented both A2 and B1)
If we want to respect order, the consumer is responsible for declining to process messages that would violate causal order, e.g., attempting to process F1 before E1. Thus we could get into a situation where we try to process F1, then E1, then D1, then C1, then B1, then A1. Now that A1 is processed, Kafka tries again: F1, then E1, then D1, then C1, then B1... and so on and so forth. This is O(n^2) behavior.
Without changing the underlying data storage architecture, you will either:
1. Incur O(n^2) space or time complexity
2. Reimplement the queuing mechanism at the consumer level, but then you might as well not use Kafka (or others) at all. In practice no one has pulled this off.
3. Face other nasty issues (e.g., in Kafka parallel consumer you can run out of memory or your processing time can become O(n^2)).
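A toy model makes the quadratic blow-up concrete (made-up delivery semantics, not real Kafka): store the first column in reverse causal order and count delivery attempts for a consumer that declines out-of-order messages and has them redelivered.

```python
# Toy redelivery model: the log stores F1, E1, D1, C1, B1, A1, but causally
# A1 must run before B1, B1 before C1, and so on. Declined messages are
# redelivered from the front, so attempts grow quadratically.

def consume_with_redelivery(log, parents):
    processed, attempts = set(), 0
    pending = list(log)
    while pending:
        next_round = []
        for msg in pending:
            attempts += 1
            parent = parents.get(msg)
            if parent is None or parent in processed:
                processed.add(msg)
            else:
                next_round.append(msg)  # declined; redelivered next round
        pending = next_round
    return attempts

log = ["F1", "E1", "D1", "C1", "B1", "A1"]  # reverse causal order
parents = {"B1": "A1", "C1": "B1", "D1": "C1", "E1": "D1", "F1": "E1"}
print(consume_with_redelivery(log, parents))  # 21 = 6+5+4+3+2+1 ≈ n²/2
```

Each round unblocks only one message, so n messages cost n + (n-1) + ... + 1 delivery attempts.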
singron 19 hours ago [-]
Do you have an example use case for this? This does seem like something unsuited to kafka, but I'm having a hard time imagining why you would structure something like this.
galeaspablo 14 hours ago [-]
Great follow up question, thank you. I could talk about this "topic" for days, so I appreciate the opportunity to expand. :)
Let's imagine ourselves as a couple of engineers at Acme Foreign Exchange House. We'd like to track Acme's net cash position across multiple currencies, and execute trades accordingly (e.g., hedging). And we'd like to retrospectively analyze our hedges to assess their effectiveness.
Let's say I have this set of transactions (for accounts A, B, C, D, E, F, etc.)
A1 -> A2 -> A3 -> A4
B1 -> B2 -> B3 -> B4
C1 -> C2
D1 -> D2 -> D3 -> D4
E1 -> E2
F1
Let's say that:
- E1 was a deposit made into account E for $2M USD.
- E2 was an outgoing transfer of $2M USD sent to account F (incoming £1.7M GBP at F1).
If we consume our transactions and partition our consumption by account id, we could get into a state where E1 and F1 are reflected in our net position, but E2 isn't. That is, our calculation includes both $2M USD and £1.7M GBP, when in reality we only ever held either $2M USD or £1.7M GBP.
So what could we do?
1. Make sure that we respect causality order. I.e., there's no F1 reflected in our net position if we haven't processed E2.
2. Make sure that pairs of transactions (e.g., E2 and F1) update our net position atomically.
Opinion: the world is causally ordered in arbitrary ways as above. But the tools, frameworks, and infrastructure more readily available to us struggle at modeling arbitrary partially ordered causality graphs. So we shrug our shoulders, and we learn to live with the edge cases. But it doesn't have to be so.
Misdicorl 22 hours ago [-]
I suppose it depends on your message volume. To me, processing 100k messages and then getting a page however long later as the broker (or whatever) falls apart sounds much worse than head of line blocking and seeing the problem directly in my consumer. If I need to not do head of line blocking, I can build whatever failsafe mechanisms I need for the problematic data and defer to some other queueing system (typically, just add an attempt counter and replay the message to the same kafka topic and then if attempts > X, send it off to wherever)
I'd rather debug a worker problem than an infra scaling problem every day of the week and twice on Sundays.
singron 19 hours ago [-]
It's interesting you say that, since this turned an infra scaling problem into a worker problem for us. Previously, we would get terrible head-of-line throughput issues, so we would use an egregious number of partitions to try to alleviate that. Lots of partitions is hard to manage since resizing topics is operationally tedious and it puts a lot of strain on brokers. But no matter how many partitions you have, the head-of-line still blocks. Even cases where certain keys had slightly slower throughput would clog up the whole partition with normal consumers.
The parallel consumer nearly entirely solved this problem. Only the most egregious cases where keys were ~3000 times slower than other keys would cause an issue, and then you could solve it by disabling that key for a while.
Misdicorl 17 hours ago [-]
Yeah I'd say kafka is not a great technology if your median and 99ths (or 999ths if volume is large enough) are wildly different which sounds like your situation. I use kafka in contexts where 99ths going awry usually aren't key dependent so I don't have the issues you see.
I tend to prefer other queueing mechanisms in those cases, although I still work hard to make 99ths and medians align as it can still cause issues (especially for monitoring)
Misdicorl 22 hours ago [-]
Follow on: If you're using kafka to publish messages to multiple consumers, this is even worse as now you're infecting every consumer with data processing issues from every other consumer. Bad juju
cogman10 24 hours ago [-]
> streaming system will process n messages in O (n log n)
I'm guessing this is mostly around how backed up the stream is. n isn't the total number of messages but rather the current number of unacked messages.
Would a radix structure work better here? If you throw something like a UUID7 on the messages and store them in a radix structure you should be able to get O(n) performance here correct? Or am I not understanding the problem well.
bjornsing 23 hours ago [-]
I think the problem is that if you want quick access to all messages with a particular key then you have to maintain some kind of index over all persisted messages. So n would be total number of persisted messages as I read it, which can be quite large. But even storing them in the first place is O(n), so O(n log n) might not be so bad.
galeaspablo 22 hours ago [-]
That's correct. And keep in mind that you might have new consumers starting from the beginning come into play, so you have to permanently retain the indexes.
And yes, O(n log n) is not bad at all. Sorted database indexes (whether SQL, NoSQL, AcmeVendorSQL, etc.) already take O(n log n) to insert or read n elements.
vim-guru 1 days ago [-]
https://nats.io is easier to use than Kafka and already solves several of the points in this post I believe, like removing partitions, supporting key-based streams, and having flexible topic hierarchies.
nchmy 18 hours ago [-]
NATS is also in the process of an open-source license rugpull...
I greatly prefer redis streams. Not all the same features, but if you just need basic streams, redis has the dead simple implementation I always wanted.
Not to mention you then also have a KV store. Most problems can be solved with redis + Postgres
Actually thinking about building something with Redis streams next week. Any particular advice/sharp edges/etc?
immibis 20 hours ago [-]
Don't, under any circumstances, let it come into contact with an untrusted network. Anyone who can connect to your Redis service gets arbitrary remote code execution.
mountainriver 24 hours ago [-]
Honestly no, which was my favorite part. So easy to spin up, "just works" the way you expect. It's night and day from a lot of the other solutions in the ecosystem.
fasteo 1 days ago [-]
And JetStream[1] adds persistence to make it more comparable to Kafka
Wow, what a poor move. Really surprising, having worked with the folks at Synadia directly.
openplatypus 21 hours ago [-]
How to kill a project and blemish your brand?
Ding Ding Ding
That's correct!
wvh 1 days ago [-]
I came here to say just that. Nats solves a lot of those challenges, like different ways to query and preserve messages, hierarchical data, decent authn/authz options for multi-tenancy, much lighter and easier to set up, etc. It has more of a messaging and k/v store feel than the log Kafka is, so while there's some overlap, I don't think they fit the exact same use cases. Nats is fast, but I haven't seen any benchmarks for specifically the bulk write-once append log situation Kafka is usually used for.
Still, if a hypothetical new Kafka would incorporate some of Nats' features, that would be a good thing.
raverbashing 22 hours ago [-]
Honestly, that website has the least information per word of any site I've seen.
I had to really dig (outside of that website) to understand even what NATS is and/or does
It goes too hard on the keyword babbling and too little on the "what does this actually do"
> Services can live anywhere and are easily discoverable - decentralized, zerotrust security
Ok cool, this tells me absolutely nothing. What service? Who to whom? Discovering what?
nchmy 18 hours ago [-]
Agreed. It is amazing though, and actually so simple that you don't need much docs. Their YouTube channel is excellent.
sgarland 13 hours ago [-]
In no way are videos a substitute for docs, WTF.
nchmy 2 hours ago [-]
I literally agreed that the docs were shit. I just happened to say the videos were great.
Ozzie_osman 1 days ago [-]
I feel like everyone's journey with Kafka ends up being pretty similar. Initially, you think "oh, an append-only log that can scale, brilliant and simple," then you try it out and realize it is far, far from simple.
munksbeer 1 days ago [-]
I'm not a fan or an anti-fan of kafka, but I do wonder about the hate it gets.
We use it for streaming tick data, system events, order events, etc, into kdb. We write to kafka and forget. The messages are persisted, and we don't have to worry if kdb has an issue. Out of band consumers read from the topics and persist to kdb.
In several years of doing this we haven't really had any major issues. It does the job we want. Of course, we use the aws managed service, so that simplifies quite a few things.
I read all the hate comments and wonder what we're missing.
benjaminwootton 22 hours ago [-]
That’s my experience too. I’ve deployed it more than ten times as a consultant and never really understood the reputation for complexity. It “just works.”
amazingman 21 hours ago [-]
I've deployed it a bunch of times and, crucially, maintained it thereafter. It's very complex, especially when troubleshooting pathological behavior or recovering from failures, and I don't see why anyone with significant experience with Kafka could reasonably claim otherwise.
Kafka is perhaps the most aptly named software I've ever used.
That said, it's rock solid and I continue to recommend it for cases where it makes sense.
klooney 18 hours ago [-]
I know a team that had their Kafka cluster fall over, and they couldn't get it to stay up until eventually their LOB got shut down for being unreliable. I don't know if they were especially bad at their jobs, or their volumes were unreasonable, or what, but it seemed like a bad time for all.
Bombthecat 9 hours ago [-]
How long is your Kafka down when you cut the cable to Kafka and it needs to fail over?
DrFalkyn 23 hours ago [-]
What happens if the Kafka node fails ?
radiator 22 hours ago [-]
"the" node? Kafka is a cluster of multiple nodes.
nitwit005 18 hours ago [-]
People do lose clusters occasionally.
carlmr 1 days ago [-]
I'm wondering how much of that is bad developer UX and defaults, and how much of that is inherent complexity in the problem space.
Like the article outlines, partitions are not that useful for most people. Instead of removing them, how about putting them behind a feature flag, i.e. off by default? That would ease 99% of users' problems.
The next point in the article which to me resonates is the lack of proper schema support. That's just bad UX again, not inherent complexity of the problem space.
On testing side, why do I need to spin up a Kafka testcontainer, why is there no in-memory kafka server that I can use for simple testing purposes.
gunnarmorling 1 days ago [-]
> why is there no in-memory kafka server that I can use for simple testing purposes.
I think it's just horrible software built on great ideas sold on a false premise (this is a generic message queue and if you don't use this you cannot "scale").
mrkeen 1 days ago [-]
It's not just about the scaling, it's about solving the "doing two things" problem.
If you take action a, then action b, your system will throw 500s fairly regularly between those two steps, leaving your user in an inconsistent state. (a = pay money, b = receive item). Re-ordering the steps will just make it break differently.
If you stick both actions into a single event ({userid} paid {money} for {item}) then "two things" has just become "one thing" in your system. The user either paid money for item, or didn't. Your warehouse team can read this list of events to figure out which items to ship, and your payments team can read this list of events to figure out users' balances and owed taxes.
(You could do the one-thing-instead-of-two-things using a DB instead of Kafka, but then you have to invent some kind of pub-sub so that callers know when to check for new events.)
Also it's silly waiting around to see exceptions build up in your dev logs, or for angry customers to reach out via support tickets. When your implementation depends on publishing literal events of what happened, you can spin up side-cars which verify properties of your system in (soft) real-time. One side-car could just read all the ({userid} paid {money} for {item}) events and ({item} has been shipped) events. It's a few lines of code to match those together and all of a sudden you have a monitor of "Whose items haven't been shipped?". Then you can debug-in-bulk (before the customers get angry and reach out) rather than scour the developer logs for individual userIds to try to piece together what happened.
Also, read this thread https://news.ycombinator.com/item?id=43776967 from a day ago, and compare this approach to what's going on in there, with audit trails, soft-deletes and updated_at fields.
carlmr 1 days ago [-]
I kind of agree on the horrible software bit, but what do you use instead? And can you convince your company to use that, too?
buster 1 days ago [-]
I find that many such systems really just need a scalable messaging system.
Use RabbitMQ, Nats, Pub/Sub, ... There are plenty.
Confluent has rather good marketing and when you need messaging but can also gain a persistent, super scalable data store and more, why not use that instead?
The obvious answer is: Because there is no one-size-fits-all-solution with no drawbacks.
1 days ago [-]
mrweasel 1 days ago [-]
The worst part of Kafka, for me, is managing the cluster. I don't really like the partitioning and the almost hopelessness that ensues when something goes wrong. Recovery is really tricky.
Granted, it doesn't happen often if you plan correctly, but the possibility of the partitioning and replication going wrong makes updates and upgrades nightmare fuel.
hinkley 24 hours ago [-]
There was an old design I encountered in my distributed computing class, and noticed in the world having been primed to look for it, where you break ties in distributed systems with a supervisor whose only purpose is to break ties. In a system that only needs 2 or 4 nodes to satisfy demand, running a 3rd or 5th node only to break ties adds a lot of operational cost. So you created a process that understood the protocol but did not retain the data, whose sole purpose was to break split-brain ties.
Then we settled into an era where server rooms grew and workloads demanded horizontal scaling and for the high profile users running an odd number of processes was a rounding error and we just stopped doing it.
But we also see this issue re-emerge with dev sandboxes. Running three copies of Kafka, Redis, Consul, Mongo, or god forbid all four, is just a lot for one laptop, and 50% more EC2 instances if you spin it up in the Cloud, one cluster per dev.
I don’t know much Kafka, so I’ll stick with Consul as a mental exercise. If you take something like consul, the voting logic should be pretty well contained. It’s the logic for catching up a restarted node and serving the data that’s the complex part.
EdwardDiego 24 hours ago [-]
Have a look at Strimzi, a K8s operator, gives you a mostly-managed Kafka experience.
sgarland 13 hours ago [-]
Now you have two problems.
vkazanov 1 days ago [-]
Yeah...
It took 4 years to properly integrate Kafka into our pipelines. Everything, like everything is complicated with it: cluster management, numerous semi-tested configurations, etc.
My final conclusion with it is that the project just doesn't really know what it wants to be. Instead it tries to provide everything for everybody, and ends up being an unbelievably complicated mess.
You know, there are systems that know what they want to be (Amazon S3, Postres, etc), and then there are systems that try to eat the world (Kafka, k8s, systemd).
EdwardDiego 24 hours ago [-]
> that the project just doesn't really know what it wants to be
It's a distributed log? What else is it trying to do?
zaphirplane 7 hours ago [-]
Calling it a distributed log may just be a reductio ad absurdum.
pphysch 22 hours ago [-]
> You know, there are systems that know what they want to be (Amazon S3, Postres, etc), and then there are systems that try to eat the world (Kafka, k8s, systemd).
I am not sure about this taxonomy. K8s, systemd, and (I would add) the Linux kernel are all taking on the ambitious task of central, automatic orchestration of general purpose computing systems. It's an extremely complex problem and I think all those technologies have done a reasonably good job of choosing the right abstractions to break down that (ever-changing) mess.
People tend to criticize projects with huge scope because they are obviously complex, and complexity is the enemy, but most of the complexity is necessary in these cases.
If Kafka's goal is to be a general purpose "operating system" for generic data systems, then that explains its complexity. But it's less obvious to me that this premise is a good one.
1oooqooq 1 days ago [-]
systemd knows very well what it wants to be, they just don't tell anyone.
Its real goal is to make Linux administration as useless as Windows's so RH can sell certifications.
Tell me the output of systemctl is not as awful as opening the Windows services panel.
hbogert 1 days ago [-]
Tell me systemctl output isn't more beneficial than per distro bash-mess
zaphirplane 7 hours ago [-]
I have no dog in the systemd wars. The bash output is what? Distros use other open source software.
vkazanov 1 days ago [-]
Well, systemd IS useful, the same way Kafka is. I don't want to go back to crappy bash for service management, and Kafka is the de facto standard event streaming solution.
But both are kind of hard to understand end-to-end, especially for an occasional user.
1oooqooq 22 hours ago [-]
Not really. Both require that you know obscure and badly documented stuff.
systemd's whole premise is "people will not read the distro or bash scripting manual"...
Then nobody reads systemd's manual either (and you have even less reason to, since it's badly written, ever changing in conflicting ways, and a single-use tool).
So you went from complaining that your coworkers can't write bash to complaining that they don't know they have to write
ExecStart=
ExecStart=/bin/x
because some settings, without any hint, are lists of commands instead of string values.
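(For context, the directive in question is systemd's ExecStart=, which is list-valued: an empty assignment clears any previously accumulated commands, which is why a drop-in override that replaces the command needs both lines.)

```ini
# Drop-in override replacing a unit's start command: the empty assignment
# resets the accumulated ExecStart list before the new command is added.
[Service]
ExecStart=
ExecStart=/bin/x
```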
tarasglek 8 hours ago [-]
Bash scripts are write-only software.
chupasaurus 1 days ago [-]
There are 2 service panels in Windows since 8 and they are quite different...
1oooqooq 22 hours ago [-]
Both (and systemctl) will show the same useless info two or three times on screen, e.g. "service x - starts service x".
hinkley 1 days ago [-]
Once you pick an “As simple as possible, but no simpler” solution, it triggers Dunning Kruger in a lot of people who think they can one up you.
There was a time in my early to mid career when I had to defend my designs a lot because people thought my solutions were shallower than they were and didn’t understand that the “quirks” were covering unhappy paths. They were often load bearing, 80/20 artifacts.
Hamuko 1 days ago [-]
>Initially, you think "oh, an append-only log that can scale, brilliant and simple"
Really? I got scared by Kafka by just reading through the documentation.
nitwit005 1 days ago [-]
> When producing a record to a topic and then using that record for materializing some derived data view on some downstream data store, there’s no way for the producer to know when it will be able to "see" that downstream update. For certain use cases it would be helpful to be able to guarantee that derived data views have been updated when a produce request gets acknowledged, allowing Kafka to act as a log for a true database with strong read-your-own-writes semantics.
Just don't use Kafka.
Write to the downstream datastore directly. Then you know your data is committed and you have a database to query.
wvh 1 days ago [-]
The problem is that you don't know who's listening. You don't want all possible interested parties to hammer the database. Hence the events in between. Arguably, I'd not use Kafka to store actual data, just to notify in-flight.
mike_hearn 23 hours ago [-]
In some databases that's not a problem. Oracle has a built in horizontally scalable message queue engine that's transactional with the rest of the database. You can register a series of SELECT queries and be notified when the results have (probably) changed, either via direct TCP server push or via a queued message for pickup later. It's not polling based, the transaction engine knows what query predicates to keep an eye out for.
Disclosure: I work part time for Oracle Labs and know about these features because I'm using them in a project at the moment.
dheera 23 hours ago [-]
I know as an Oracle employee you don't want to hear this, but part of the problem is that you are no longer database-agnostic if you do this.
The messaging tech being separate from the database tech means the architects can swap out the database if needed in the future without needing to rewrite the producers and consumers.
mike_hearn 22 hours ago [-]
I don't work on the database itself, so it's neither here nor there to me. Still, the benefits of targeting the LCD must be weighed against the costs. Not having scalable transactions imposes a huge drain on the engineering org that sucks up time and imposes opportunity costs.
immibis 1 days ago [-]
But you do know who's listening, because you were the one who installed all the listeners. ("you" can be plural.)
This reminds me of the OOP vs DOD debate again. OOP adherents say they don't know all the types of data their code operates on; DOD adherents say they actually do, since their program contains a finite number of classes, and a finite subset of those classes can be the ones called in this particular virtual function call.
What you mean is that your system is structured in such a way that it's as if you don't know who's listening. Which is okay, but you should be explicit that it's a design choice, not a law of physics, so when that design choice no longer serves you well, you have the right to change it.
(Sometimes you really don't know, because your code is a library or has a plugin system. In such cases, this doesn't apply.)
> Arguably, I'd not use Kafka to store actual data, just to notify in-flight.
I believe people did this initially and then discovered the non-Kafka copy of the data is redundant, so got rid of it, or relegated it to the status of a cache. This type of design is called Event Sourcing.
wvh 2 hours ago [-]
I have worked for financial institutions where random departments have random interests in certain data transactions. You (as in a dev team in one such department) have no say in who touches the data, from where, and how it's used. Kafka is used as a corporate message bus to let e.g. the accountant department know something happened in another department. Those "listening" departments don't have devs and are not involved in development, they operate more on the MS BI level of PowerPoint.
So yes, in large companies, your development team is just a small cog, you don't set policy for what happens to the data you gather. And in some sectors, like finances, you are an especially small cog with little power, which might sound strange if you only ever worked for a software startup.
Spivak 1 days ago [-]
For read only queries, hammer away I can scale reads nigh infinitely horizontally. There's no secret sauce that makes it so that only kafka can do this.
throwaway7783 19 hours ago [-]
It's not about scaling reads, but about coordinating consumers so that no more than one consumer processes the same messages. That means some kind of locking, which means scaling issues.
rakoo 1 days ago [-]
Alternatively, your write doesn't have to be fire-and-forget: downstream datastores can also write to kafka (this time fire-and-forget) and the initial client can wait for that event to acknowledge the initial write
menzoic 1 days ago [-]
Writing directly to the datastore ignores the need for queuing the writes. How do you solve for that need?
immibis 1 days ago [-]
Why do you need to queue the writes?
vergessenmir 1 days ago [-]
Some writes might fail, you may need to retry, the data store may be temporarily unavailable, etc.
There may be many things that go wrong and how you handle this depends on your data guarantees and consistency requirements.
If you're not queuing what are you doing when a write fails, throwing away the data?
immibis 1 days ago [-]
As bushbaba points out, the same things may happen with Kafka.
The standard regex joke really works for all values of X. Some people, when confronted with a problem, think "I know - I'll use a queue!" Now they have two problems.
Adding a queue to the system does not make it faster or more reliable. It makes it more asynchronous (because of the queue), slower (because the computer has to do more stuff) and less reliable (because there are more "moving parts"). It's possible that a queue is a required component of a design which is more reliable and faster, but this can only be known about the specific design, not the general case.
I'd start by asking why your data store is randomly failing to write data. I've never encountered Postgres randomly failing to write data. There are certainly conditions that can cause Postgres to fail to write data, but they aren't random and most of them are bad enough to require an on-call engineer to come and fix the system anyway - if the problem could be resolved automatically, it would have been.
If you want to be resilient to events like the database disk being full (maybe it's a separate analytics database that's less important from the main transactional database) then adding a queue (on a separate disk) can make sense, so you can continue having accurate analytics after upgrading the analytics database storage. In this case you're using the queue to create a fault isolation boundary. It just kicks the can down the road though, since if the analytics queue storage fills up, you still have to either drop the analytics or fail the client's request. You have the same problem but now for the queue. Again, it could be a reasonable design, but not by default and you'd have to evaluate the whole design to see whether it's reasonable.
bushbaba 1 days ago [-]
Some Kafka writes might fail. Hence the Kafka client having a queue with retries.
smitty1e 1 days ago [-]
A proficient coder can write a program to accomplish a task in the singular.
In the plural, accomplishing that task in a performant way at enterprise scale seems to involve turning every function call into an asynchronous, queued service of some sort.
Which then begets additional deployment and monitoring services.
A queued problem requires a cute solution, bringing acute pain.
salomonk_mur 1 days ago [-]
Yeah... Not happening when you have scores of clients running down your database.
The reason message queue systems exist is scale. Good luck sending a notification at 9am to your 3 million users and keeping your database alive in the sudden influx of activity. You need to queue that load.
nitwit005 21 hours ago [-]
Kafka is itself a database. Sending a message requires what is essentially a database insert. You're still doing a DB commit either way.
briankelly 15 hours ago [-]
It's more of a commit log/write-ahead log/replication stream than a DBMS - consider that DBMSs typically include these in addition to their primary storage.
nitwit005 4 hours ago [-]
It's clearly not a full blown database product, but it's still got the core elements of a database. Your data is getting replicated to multiple instances, written to disk, and an index is created for quick lookup.
It's just that the table is essentially append-only (excepting the cleanup processes it supports).
thom 1 days ago [-]
Of course, if you don't have separate downstream and upstream datastores, you don't have anything to do in the first place.
debadyutirc 21 hours ago [-]
This is a question we asked 6 years ago.
What if we wrote it in Rust? And leveraged WASM?
How would you say your project compares to Arroyo?
peanut-walrus 1 days ago [-]
Object storage for Kafka? Wouldn't this 10x the latency and cost?
I feel like Kafka is a victim of its own success: it's excellent for what it was designed for, but since the design is simple and elegant, people have been using it for all sorts of things for which it was not designed. And well, of course it's not perfect for these use cases.
gunnarmorling 1 days ago [-]
It can increase latency (which can be somewhat mitigated though by having a write buffer e.g. on EBS volumes), but it substantially _reduces_ cost: all cross-AZ traffic (which is $$$) is handled by the object storage layer, where it doesn't get charged. This architecture has been tremendously popular recently, championed by Warpstream and also available by Confluent (Freight clusters), AutoMQ, BufStream, etc. The KIP mentioned in the post aims at bringing this back into the upstream open-source Kafka project.
peanut-walrus 22 hours ago [-]
So it's cheaper *on AWS*. Any cloud provider where cross-AZ traffic is not $$$, I can't imagine this architecture being cheaper.
Engineering solutions which only exist because AWS pricing is whack are...well, certainly a choice.
I can also think of lots of cases where whatever you're running is fine to just run in a single AZ since it's not critical.
nicksnyder 19 hours ago [-]
The other clouds have fees like this too.
Even if this were to change, using object storage results in a lot of operational simplicity as well compared to managing a bunch of disks. You can easily and quickly scale to zero or scale up to handle bursts in traffic.
An architecture like this also makes it possible to achieve a truly active-active multi-region Kafka cluster that has real SLAs.
> people have been using it for all sorts of things for which it was not designed
Kafka is misused for some weird stuff. I've seen it used as a user database, which makes absolutely no sense. I've also seen it used as a "key/value" store, which I can't imagine being efficient, as you'd have to scan the entire log.
Part of it seems to stem from "We need somewhere to store X. We already have Kafka, and requesting a database or key/value store is just a bit too much work, so let's stuff it into Kafka".
I had a client ask for a Kafka cluster, when queried about what they'd need it for we got "We don't know yet". Well that's going to make it a bit hard to dimension and tune it correctly. Everyone else used Kafka, so they wanted to use it too.
openplatypus 23 hours ago [-]
> Object storage for Kafka? Wouldn't this 10x the latency and cost?
It will become slower. It will become costlier (to maintain). And we will end up with local replicas for performance.
If only people looked outside the AWS bubble and realised they are SEVERELY overcharged for storage, this would be a moot point.
I would welcome getting partitions dropped in favour of multi-tenancy ... but for my use cases this is often equivalent.
Storage is not the problem though.
bjornsing 23 hours ago [-]
The weird thing driving this thinking is that cross-AZ network data transfer between EC2 instances on AWS is more expensive than shuffling the same data through S3 (which has free data transfer to/from EC2). It’s just stupid, but that’s how it is.
ako 1 days ago [-]
Warpstream already does Kafka with object storage.
rad_gruchalski 1 days ago [-]
WarpStream has been acquired by Confluent.
biorach 1 days ago [-]
> the design is simple and elegant
Kafka is simple and elegant?
peanut-walrus 22 hours ago [-]
The core design for producer/broker/consumer certainly is. All the logic is on the ends, broker just makes sure your stream of bytes is available to the consumers. Reliable, scales well, can be used for pretty much any data.
EdwardDiego 24 hours ago [-]
The design is.
fintler 1 days ago [-]
Keep an eye out for Northguard. It's the name of LinkedIn's rewrite of Kafka that was announced at a stream processing meetup about a week ago.
selkin 21 hours ago [-]
It solves some issues, and creates some, since Northguard isn’t compatible with the current Kafka ecosystem.
As such, you can no longer use existing software that is built on Kafka as-is. It may not be a grave concern for LinkedIn, but it could be for others that currently benefit from using the existing Kafka ecosystem.
fintler 21 hours ago [-]
Yeah, it's definitely a significant shift. The Xinfra component helps with Kafka compatibility, but that still has quite a bit of complexity to it. Also, it's written in C++, so that requires a different mindset to operate.
selkin 21 hours ago [-]
I understood Xinfra to not use the Kafka protocol as its API.
As such, even with Xinfra deployed, you have to rewrite all the software that connects to Kafka, regardless of programming language.
supermatt 1 days ago [-]
> "Do away with partitions"
> "Key-level streams (... of events)"
When you are leaning on the storage backend for physical partitioning (as per the cloud example, where they would literally partition based on keys), doesn't this effectively just boil down to renaming partitions to keys, and keys to events?
gunnarmorling 1 days ago [-]
That's one way to look at this, yes. The difference being that keys actually have a meaning to clients (as providers of ordering and also a failure domain), whereas partitions in their current form don't.
olavgg 1 days ago [-]
How many of the Apache Kafka issues are addressed by switching to Apache Pulsar?
I skipped learning Kafka, and jumped right into Pulsar. It works great for our use case. No complaints. But I wonder why so few use it?
enether 24 hours ago [-]
There’s inherently a lot of path-dependent network effects in open source software.
Something being 10-30% better in certain cases almost never warrants its adoption, if on the other side you get much less human expertise, documentation/resources, and battle-tested testimonies.
This, imo, is the story of most Kafka competitors
jsumrall 18 hours ago [-]
I've been down this path, and if my experience is more common, then it really boils down to the classic "Nobody gets fired for buying IBM", and here IBM -> Confluent.
StreamNative seems like an excellent team, and I hope they succeed. But as another comment has written, something (Pulsar) being better (than Kafka) has to either be adopted from the start, or be a big enough improvement to justify switching. And as difficult and feature-poor as Kafka is, it still gets the job done.
I can rant longer about this topic, but Pulsar _should_ be more popular; unfortunately, Confluent has dominated here and is rent-seeking this field into the ground.
EdwardDiego 24 hours ago [-]
Some, but then Pulsar brings its own issues.
vermon 1 days ago [-]
Interesting, if partitioning is not a useful concept of Kafka, what are some of the better alternatives for controlling consumer concurrency?
galeaspablo 1 days ago [-]
It is useful, but it is not generally applicable.
Given an arbitrary causality graph between n messages, it would be ideal if you could consume your messages in topological order. And that you could do so in O(n log n).
No queuing system in the world does arbitrary causality graphs without O(n^2) costs. I dream of the day where this changes.
And because of this, we’ve adapted our message causality topologies to cope with the consuming mechanisms of Kafka et al
To make this less abstract, imagine you have two bank accounts, each with a stream. MoneyOut in Bob’s account should come BEFORE MoneyIn when he transfers to Alice’s account, despite each bank account having different partition keys.
dpflan 22 hours ago [-]
Can you elaborate on how you have “adapted…message causality topologies to cope with consuming mechanisms” in relation to the example of a bank account? The causality topology being what here: couldn't one say MoneyIn should come first, else there can be no true MoneyOut?
galeaspablo 21 hours ago [-]
Right on, great question. Some examples:
Example Option 1
You give up on the guarantees across partition keys (bank accounts), and you accept that balances will not reflect a causally consistent state of the past.
E.g., Bob deposits 100, Bob sends 50 to Alice.
Balances:
Bob 0 Alice 50 # the source system was never in this state
Bob 100 Alice 50 # the source system was never in this state
Bob 50 Alice 50 # eventually consistent final state
Example Option 2
You give up on parallelism, and consume in total order (i.e., one single partition / unit of parallelism - e.g., in Kafka set a partitioner that always hashes to the same value).
Example Option 3
In the consumer you "wait" whenever you get a message that violates causal order.
E.g.,
Bob deposits 100
Bob sends 50 to Alice (Bob-MoneyOut 50 -> Alice-MoneyIn 50).
If we attempt to consume Alice-MoneyIn before Bob-MoneyOut, we exponentially back off from the partition containing Alice-MoneyIn.
(Option 3 is terrible because of O(n^2) processing times in the worst case and the possibility for deadlocks (two partitions are waiting for one another))
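As a rough illustration of why Option 3 hurts, here is a minimal single-threaded sketch (the message shape and field names are made up for this example): the consumer spins with exponential backoff whenever a message's causal predecessor hasn't been processed yet.

```python
import time

class CausalBackoffConsumer:
    """Sketch of Option 3: wait when a message arrives before its cause.

    `depends_on` is a hypothetical field naming the message ID that must
    be processed first; a real system would have to encode this somehow.
    """

    def __init__(self, max_retries=10):
        self.processed = set()
        self.max_retries = max_retries

    def consume(self, message):
        dep = message.get("depends_on")
        backoff = 0.01
        for _ in range(self.max_retries):
            if dep is None or dep in self.processed:
                self.processed.add(message["id"])
                return True
            # Causal predecessor not yet seen: back off exponentially.
            # This waiting is where the O(n^2) worst case and the
            # partition-vs-partition deadlocks creep in.
            time.sleep(backoff)
            backoff *= 2
        return False  # gave up; the predecessor is stuck elsewhere

consumer = CausalBackoffConsumer()
consumer.consume({"id": "bob-out-50", "depends_on": None})
consumer.consume({"id": "alice-in-50", "depends_on": "bob-out-50"})
```

Note the failure mode: if `bob-out-50` lives in a partition that is itself blocked waiting on this one, both consumers back off forever.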
dpflan 18 hours ago [-]
Thanks. With these of examples of messages appearance in time and in physical location in Kafka, how have you adapted your consumers? Which scenario / architectural decision (one of the examples?) have you moved forward with and creating support to yield your desired causality handling?
galeaspablo 14 hours ago [-]
Option 1, but after so many years banging our heads against the wall reasoning about this, we hoped someone would eventually give us a queue that supports arbitrary causal dependency graphs.
We thought about building it ourselves, because we know the data structures, high level algorithms, and disk optimizations required. BUT we pivoted our company, so we've postponed this for the foreseeable future. After all, theory is relatively easy, but a true production grade implementation takes years.
pas 15 hours ago [-]
hmmm... could this be solved by "vector clocks"? if producers are emitting something that depends on a previous event they send the id of the previous event. (so like capabilities, you need proof of "data access".)
or the problem is that again this is O(n^2)? (because then the consumers now need to buffer [potentially] n key streams (and then search for them every time - so "n" times)?
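To make the buffering concern concrete, here is a sketch of the dependency-ID idea in its simplest form (single predecessor per message, a simplification of full vector clocks; all names are made up): messages that arrive before their declared cause are parked, and that parking buffer is exactly where the worst-case cost hides.

```python
from collections import defaultdict

def deliver_in_causal_order(messages):
    """Deliver messages respecting declared dependencies.

    Each message may name the ID of the one event it depends on.
    Undeliverable messages are buffered until their dependency
    arrives; in the worst case (a long stuck chain) this buffer
    grows linearly, and repeated lookups against it are the
    O(n^2) risk the parent comments describe.
    """
    delivered, order = set(), []
    waiting = defaultdict(list)  # dep_id -> messages blocked on it

    def release(msg):
        delivered.add(msg["id"])
        order.append(msg["id"])
        # Releasing one message may unblock a chain behind it.
        for blocked in waiting.pop(msg["id"], []):
            release(blocked)

    for msg in messages:
        dep = msg.get("depends_on")
        if dep is None or dep in delivered:
            release(msg)
        else:
            waiting[dep].append(msg)
    return order

# Alice's MoneyIn arrives first but is held until Bob's MoneyOut lands.
print(deliver_in_causal_order([
    {"id": "alice-in", "depends_on": "bob-out"},
    {"id": "bob-out"},
]))  # ['bob-out', 'alice-in']
```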
"Faced with such a marked defensive negative attitude on the part of a biased culture, men who have knowledge of technical objects and appreciate their significance try to justify their judgment by giving to the technical object the only status that today has any stability apart from that granted to aesthetic objects, the status of something sacred. This, of course, gives rise to an intemperate technicism that is nothing other than idolatry of the machine and, through such idolatry, by way of identification, it leads to a technocratic yearning for unconditional power. The desire for power confirms the machine as a way to supremacy and makes of it the modern philtre (love-potion)."
Gilbert Simondon, On the mode of existence of technical objects.
This is exactly what I interpret from these kind of articles: engineering just for the cause of engineering. I am not saying we should not investigate on how to improve our engineered artifacts, or that we should not improve them. But I see a generalized lack of reflection on why we should do it, and I think it is related to a detachment from the domains we create software for.
The article suggests uses of the technology that come from such different ways of using it that it loses coherence as a technical item.
gunnarmorling 1 days ago [-]
For each of the items discussed I explicitly mention why they would be desirable to have. How is this engineering for the sake of engineering?
frklem 1 days ago [-]
True, for each of the points discussed, there is an explicit mention on why it is desirable. But those are technical solutions, to technical problems. There is nothing wrong with that. The issue is, that the whole article is about technicalities because of technicalities, hence the 'engineering for the cause of engineering' (which is different from '.. for the sake of...'). It is at this point that the 'idea of rebuilding Kafka' becomes a pure technical one, detached from the intention of having something like Kafka.
Other commenters in the thread also pointed out to the fact of Kafka not having a clear intention. I agree that a lot of software nowadays suffer from the same problem.
selkin 18 hours ago [-]
This is a useful Gedankenexperiment, but I think the replies suggesting that the conclusion is that we should replace Kafka with something new are quiet about what seems obvious to me:
Kafka's biggest strength is the wide and useful ecosystem built on top of it.
It is also a weakness, as we have to keep some (but not all) of the design decisions we wouldn't have made had we started from scratch today.
Or we could drop backwards compatibility, at the cost of having to recreate the ecosystem we already have.
bjornsing 22 hours ago [-]
> Key-centric access: instead of partition-based access, efficient access and replay of all the messages with one and the same key would be desirable.
I’ve been working on a datastore that’s perfect for this [1], but I’m getting very little traction. Does anyone have any ideas why that is? Is my marketing just bad, or is this feature just not very useful after all?
Some input from previously working on a superset of this problem. And being in a similar position.
Mature projects have too much bureacracy, and even spending time talking to you = opportunity cost. So making a case for why you're going to solve a problem for them is tough.
New projects (whether at big companies or small companies) have 20 other things to worry about, so the problem isn't big enough.
Thanks. Interesting read, and an interesting product / service. Have been thinking about the same approach myself…
notfed 21 hours ago [-]
The website seems very vapid. Extraordinary claims require extraordinary evidence. Personally I see a lack of evidence here (that this vendor-locked product is better than existing freeware) and I'm going to move on.
bjornsing 20 hours ago [-]
Thanks for the honest assessment!
thesimon 20 hours ago [-]
> HaystackDB is accessed through a RESTful HTTPS API. No client library necessary.
That's cool, but I would prefer to not reinvent the wheel. If you have a simple library, that would already be useful.
Some simple code or request examples would be convenient as well. I really don't know how easy or difficult your interface design is. It would be cool to see the API docs.
bjornsing 18 hours ago [-]
Yeah, it’s a bit of a chicken and egg problem. Since I don’t have a way to find potential customers I feel it’s too risky investing in stuff like client libraries and good API docs. But I can definitely understand you’d like to see more.
tinix 21 hours ago [-]
what can you do that redis can't?
I'm also skeptical of the graph on your front page that claims S3 cost as much as DynamoDB.
that alone makes it look like total nonsense.
as someone else said, extraordinary claims require extraordinary evidence.
bjornsing 19 hours ago [-]
> what can you do that redis can't?
Keep the data in S3 for 0.023 USD per GB-month. If you have a billion keys that can be useful.
> I'm also skeptical of the graph on your front page that claims S3 cost as much as DynamoDB.
Good point. Could have put a bit more work into that.
bjornsing 10 hours ago [-]
> Good point. Could have put a bit more work into that.
On second thought, and after looking at my cost estimates, the reason DynamoDB ends up costing about the same as S3 for this kind of use case is storage costs. DynamoDB is a lot cheaper than S3 to write to, but 5-10x more expensive to keep data stored in. So after about 16-32 months you reach break even.
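For illustration, a back-of-the-envelope version of that break-even (the prices are ballpark assumptions for this sketch, not quotes):

```python
# Assumed list prices: S3 Standard storage ~ $0.023/GB-month,
# DynamoDB storage ~ $0.25/GB-month. DynamoDB's per-item writes are
# assumed to be cheaper than S3 PUTs for tiny values, giving a
# hypothetical one-time write-cost saving per GB ingested.
s3_storage, ddb_storage = 0.023, 0.25   # $/GB-month
write_saving_per_gb = 4.0               # assumed one-time DDB saving, $/GB

# The monthly storage premium of DynamoDB eats the write saving:
months_to_break_even = write_saving_per_gb / (ddb_storage - s3_storage)
print(round(months_to_break_even, 1))  # ~17.6 months under these assumptions
```

Vary the assumed write saving and you land anywhere in the 16-32 month window mentioned above.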
elvircrn 1 days ago [-]
Surprised there's no mention of Redpanda here.
axegon_ 1 days ago [-]
Having used both Kafka and Redpanda on several occasions, I'd pick Redpanda any day of the week without a second thought. Easier to set up, easier to maintain, a lot less finicky, and uses a fraction of the resources.
enether 23 hours ago [-]
In what way is it materially easier to maintain and less finicky? I read a lot about this but I haven’t seen a concise bullet point list of why, which leads me to naturally distrust such claims. Ditto for the resources - Kafka is usually bottlenecked on disk/network, and whether it’s c++ or not doesn’t solve that
SkipperCat 18 hours ago [-]
I've been using Redpanda for a few years now and here's what I've noticed.
It's a compiled binary - no JVM to manage. Java apps have always been a headache for me. Plus, no ZooKeeper - one less thing to break.
The biggest benefit I've seen is the performance. Redpanda just outperforms Apache Kafka on similar hardware. It's also Kafka-compliant in every way I've noticed, so all my favorite tools that interact with Kafka work the same with Redpanda.
Redpanda, like Kafka, writes to disk, so you'll always be limited by your hardware no matter what you use (but NVMe's are fast and affordable).
He mentions AutoMQ right in the opener. And if I follow the link, they pitch it in a way that sounds very "too good to be true".
Anyone here have some real world experience with it?
mgaunard 1 days ago [-]
I can't count the number of bad message queues and buses I've seen in my career.
While it would be useful to just blame Kafka for being bad technology it seems many other people get it wrong, too.
mgaunard 22 hours ago [-]
s/useful/easy/
(can't edit)
lewdwig 1 days ago [-]
Ah, the siren call of the ground-up rewrite. I didn't realize how deeply the assumption of hard disks underpinning everything is baked into its design.
But don’t public cloud providers already all have cloud-native event sourcing? If that’s what you need, just use that instead of Kafka.
dangoodmanUT 22 hours ago [-]
I think this is missing a key point about partitions: Write visibility ordering
The problem with guaranteed order is that you have to have some agreed upon counter/clock for ordering, otherwise a slow write from one producer to S3 could result in consumers already passing that offset before it was written, thus the write is effectively lost unless the consumers wind-back.
Having partitions means we can assign a dedicated writer for that partition that guarantees that writes are in order. With s3-direct writing, you lose that benefit, even with a timestamp/offset oracle. You'd need some metadata system that can do serializable isolation to guarantee that segments are written (visible to consumers) in the order of their timestamps. Doing that transactional system directly against S3 would be super slow (and you still need either bounded-error clocks, or a timestamp oracle service).
Spivak 1 days ago [-]
Once you start asking to query the log by keys, multi-tenancy trees of topics, synchronous commits-ish, and schemas aren't we just in normal db territory where the kafka log becomes the query log. I think you need to go backwards and be like what is the feature a rdbms/nosql db can't do and go from there. Because the wishlist is looking like CQRS with the front queue being durable but events removed once persisted in the backing db where the clients query events from the db.
The backing db in this wishlist would be something in the vein of Aurora to achieve the storage compute split.
0x445442 1 days ago [-]
How about logging the logs so I can shell into the server to search the messages.
EdwardDiego 24 hours ago [-]
What messages? They're just blobs of bytes.
jjk7 19 hours ago [-]
You can do this with Confluent
bionhoward 23 hours ago [-]
step 1: don’t use the JVM
openplatypus 21 hours ago [-]
Don't use best platform engineered and perfected over decades?
Why?
Sohcahtoa82 16 hours ago [-]
It's not the 90s anymore. The JVM is very performant.
pas 15 hours ago [-]
Yes, but if you can use compile-time (arena or otherwise) allocation then you can go even faster, no?
nthingtohide 22 hours ago [-]
do you think Grafana made the right choice by using Golang ?
imcritic 1 days ago [-]
Since we are dreaming - add ETL there as well!
derefr 21 hours ago [-]
> You either want to have global ordering of all messages on a given topic, or (more commonly) ordering of all messages with the same key. In contrast, defined ordering of otherwise unrelated messages whose key happens to yield the same partition after hashing isn’t that valuable, so there’s not much point in exposing partitions as a concept to users.
The user gets global ordering when
1. you-the-MQ assign both messages and partitions stable + unique + order + user-exposed identifiers;
2. the user constructs a "globally-collatable ID" from the (perPartitionMsgSequenceNumber, partitionID) tuple;
3. the user does a client-side streaming merge-sort of messages received by the partitions, sorting by this collation ID. (Where even in an ACK-on-receive design, messages don't actually get ACKed until they exit the client-side per-partition sort buffer and enter the linearized stream.)
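A minimal sketch of step 3 (assuming messages already carry the per-partition sequence number and partition ID from step 1; the field names are made up): a streaming k-way merge over the already-ordered partition streams.

```python
import heapq

def linearize(partition_streams):
    """Client-side streaming merge-sort of per-partition streams.

    Each stream is already ordered by `seq` within its partition, and
    the (seq, partition) tuple is the globally-collatable ID described
    above. heapq.merge keeps only one buffered message per stream, so
    the merge is O(total log k) for k partitions.
    """
    return list(heapq.merge(
        *partition_streams,
        key=lambda m: (m["seq"], m["partition"]),
    ))

p0 = [{"seq": 1, "partition": 0, "v": "a"}, {"seq": 3, "partition": 0, "v": "c"}]
p1 = [{"seq": 2, "partition": 1, "v": "b"}]
print([m["v"] for m in linearize([p0, p1])])  # ['a', 'b', 'c']
```

In a real client the streams would be generators over open broker connections, and a message would only be ACKed once it exits this merge, per the parenthetical above.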
The definition of "exposed to users" is a bit interesting here, as you might think you could do this merge-sort on the backend, just exposing a pre-linearized stream to the client.
But one of the key points/benefits of Kafka-like systems, under high throughput load (which is their domain of comparative advantage, and so should be assumed to be the deployed use-case), is that you can parallelize consumption cheaply, by just assigning your consumer-workers partitions of the topic to consume.
And this still works under global ordering, under some provisos:
• your workload can be structured as a map/reduce, and you don't need global ordering for the map step, only the reduce step;
• it's not impractical for you to materialize+embed the original intended input collation-ordering into the transform workers' output (because otherwise it will be lost in all but very specific situations.)
Plenty of systems fit these constraints, and happily rely on doing this kind of post-linearized map/reduce parallelized Kafka partition consumption.
And if you "hide" this on an API level, this parallelization becomes impossible.
Note, however, that "on an API level" bit. This is only a problem insofar as your system design is protocol-centered, with the expectation of "cheap, easy" third-party client implementations.
If your MQ is not just a backend, but also a fat client SDK library — then you can put the partition-collation into the fat client, and it will still end up being "transparent" to the user. (Save for the user possibly wondering why the client library opens O(K) TCP connections to the broker to consume certain topics under certain configurations.)
See also: why Google's Colossus has a fat client SDK library.
eluusive 21 hours ago [-]
This is basically NATS.io
gitroom 18 hours ago [-]
honestly, kafka always felt like way more moving parts than my brain wants to track, but at the same time, its kinda impressive how the ecosystem just keeps growing - you think the reason people stick to it is hype, laziness, or just not enough real pain yet to switch?
hardwaresofton 1 days ago [-]
See also: Warpstream, which was so good it got acquired by Confluent.
Feels like there is another squeeze in that idea if someone “just” took all their docs and replicated the feature set. But maybe that’s what S2 is already aiming at.
Wonder how long warpstream docs, marketing materials and useful blogs will stay up.
dangoodmanUT 22 hours ago [-]
i wouldn't say it was so good it got acquired by them, rather confluent had no s3-backed play and it was easier for them to acquire warpstream than to add it to kafka directly
warpstream has latency issues, which downstream turn into cost issues
hardwaresofton 21 hours ago [-]
That's a good point -- I assumed there were other choices, but now that I look, warpstream may have been the only already-kafka-compatible option.
That said, they were at least "good enough" to make "buy" more appealing than "build"
jjk7 19 hours ago [-]
Confluent has "Freight" which is their integrated s3-backed play.
oulipo 19 hours ago [-]
There are a few interesting projects to replace Kafka: Redpanda / Pulsar / AutoMQ
Do any of you have experience with those and are able to give pros/cons?
SkipperCat 18 hours ago [-]
I've used Redpanda and I like it. It's Kafka-compliant, so all the tools that work with Apache Kafka also work with Redpanda. Plus, no JVM and no ZooKeeper.
Redpanda most importantly is faster than Apache Kafka. We were able to get a lot more throughput. It's also stable, especially compared to dealing with anything that requires a JVM.
tezza 18 hours ago [-]
Now that really would be Kafka-esque
rhet0rica 17 hours ago [-]
One morning, when Hacker News woke from troubled dreams, it found itself transformed in its bed into a horrible bikeshed.
pas 15 hours ago [-]
no no, into a hideous design document.
/s ... seriously
iwontberude 22 hours ago [-]
We could (and will repeatedly) rebuild Kafka from scratch. Solved the question and I didn’t even need to read the article.
YetAnotherNick 1 days ago [-]
I wish there were a global file system with node-local disks and rule-driven affinity of data to nodes. We have two extremes: one like EFS or S3 Express, which has no affinity to the processing system, and the other being what Kafka etc. are doing, where they have tightly integrated logic for this, which makes the systems more complicated.
XorNot 1 days ago [-]
I might be misunderstanding but isn't this like, literally what GlusterFS used to do?
Like, I distinctly recall running it at home as a global unioning filesystem where content had to fully fit on the specific device it was targeted at (rather than being striped or whatever).
Mistletoe 1 days ago [-]
I know it’s not what the article is about but I really wish we could rebuild Franz Kafka and hear what he thought about the tech dystopia we are in.
>I cannot make you understand. I cannot make anyone understand what is happening inside me. I cannot even explain it to myself.
-Franz Kafka, The Metamorphosis
gherkinnn 1 days ago [-]
Both der Proceß and das Schloß apply remarkably well to our times. No extra steps necessary.
tannhaeuser 1 days ago [-]
They named it like that for a reason ;)
ActorNightly 1 days ago [-]
[flagged]
raducu 1 days ago [-]
> Step 1, stop using Java.
I've seen these comments for over 15 years yet for some "unknown", "silly" reason java keeps being used for really,really useful software like kafka.
LunaSea 1 days ago [-]
But it's also the reason for why these Apache projects systematically get displaced by better and faster C, C++, Rust or Go alternatives.
raducu 1 days ago [-]
It would make sense for a highly successful but stable Java project to be replaced like that, but since I'm in the Java world, it's usually replaced with another Java project.
I could provide examples myself, but I'm not convinced it's about Java vs C++ or Go: Hadoop, Cassandra, ZooKeeper.
LunaSea 1 days ago [-]
As an outsider, Java looks like a language that can be very fast but it seems like certain idiomatic practices or patterns lead to over-engineered and thus sometimes also slow projects. The Factory<Factory<x>> joke comes to mind.
EdwardDiego 24 hours ago [-]
2010 called, it wants its Java jokes back.
mrweasel 1 days ago [-]
Why? It's fast, featureful, well maintained and there are tons of people who know the language?
sam_lowry_ 1 days ago [-]
Java APIs are indeed awfully complex.
ldng 1 days ago [-]
More than that. It is a nightmare to make binding for. So if you want to really be extensible through plugins, supporting other languages is very important.
butterlettuce 1 days ago [-]
Step 2, use Python for everything
mirekrusin 1 days ago [-]
Good choice, leaves space to rewrite in rust later, right?
Spivak 1 days ago [-]
The fact that this is so common I think yearns for a language that is basically python and rust smashed together where a project would have some code in the python side and some code in the rust side intermixed fluidly like how you can drop to asm in C. Don't even really try to make the two halves of the language similar.
An embedded interpreter and JIT in rust basically but jostled around a bit to make it more cohesive and the data interop more fluid— PyO3 but backwards.
But I'm doubtful that it's going to make things simpler if one can't even decide on a language.
Spivak 1 days ago [-]
It's not really a "decision" thing, more that the pattern of write most of the code in python, rewrite the performance critical bits in rust as a cpython extension would be nice if it were made first class.
mirekrusin 1 days ago [-]
Maybe something will eventually crystalize out of mojo?
Ygg2 1 days ago [-]
Step 3, Rewrite it in Rust.
Step 4, Rewrite it in Zig.
Step 5, due to security issues rewrite it in Java 63.
rvz 1 days ago [-]
Every time another startup falls for the Java + Kafka arguments, it keeps the AWS consultants happier.
Fast forward into 2025, there are many performant, efficient and less complex alternatives to Kafka that save you money, instead of burning millions in operational costs "to scale".
Unless you are at a hundred-million-dollar-revenue company, choosing Kafka in 2025 doesn't make sense anymore.
mrkeen 1 days ago [-]
When I pitched Kafka to my backend team in 2018, I got pushback from the ops team who wanted me to use something AWS-native instead, i.e. Kinesis. A big part of why I doubled-down on Kafka is because it's vendor neutral and I can just run the actual thing locally.
EdwardDiego 24 hours ago [-]
Wut.
Kafka shouldn't be used for low dataflow systems, true, but you can scale a long way with a simple 3 node cluster.
enether 23 hours ago [-]
Can you share three? Without concrete suggestions, this is just a disparaging narrative (that a lot of vendors use)
ghuntley 1 days ago [-]
I'm going to get downvoted for this, but you can literally rebuild Kafka via AI right now in record time using the steps detailed at https://ghuntley.com/z80.
But today, all streaming systems (or workarounds) with per message key acknowledgements incur O(n^2) costs in either computation, bandwidth, or storage per n messages. This applies to Pulsar for example, which is often used for this feature.
Now, now, this degenerate time/space complexity might not show up every day, but when it does, you’re toast, and you have to wait it out.
My colleagues and I have studied this problem in depth for years, and our conclusion is that a fundamental architectural change is needed to support scalable per message key acknowledgements. Furthermore, the architecture will fundamentally require a sorted index, meaning that any such a queuing / streaming system will process n messages in O (n log n).
We’ve wanted to blog about this for a while, but never found the time. I hope this comment helps out if you’re thinking of relying on per message key acknowledgments; you should expect sporadic outages / delays.
It processes unrelated keys in parallel within a partition. It has to track what offsets have been processed between the last committed offset of the partition and the tip (i.e. only what's currently processed out of order). When it commits, it saves this state in the commit metadata highly compressed.
Most of the time, it was only processing a small number of records out of order so this bookkeeping was insignificant, but if one key gets stuck, it would scale to at least 100,000 offsets ahead, at which point enough alarms would go off that we would do something. That's definitely a huge improvement to head of line blocking.
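The bookkeeping described above can be sketched in a few lines. This is a hypothetical illustration, not the actual parallel-consumer code: track completions past the last committed offset, and advance the commit point over any contiguous run.

```python
class OffsetTracker:
    """Tracks offsets completed out of order past the last committed offset."""

    def __init__(self, committed=-1):
        self.committed = committed  # highest contiguous offset committed
        self.done = set()           # completed offsets beyond `committed`

    def complete(self, offset):
        self.done.add(offset)
        # Advance the commit point across any contiguous run of completions.
        while self.committed + 1 in self.done:
            self.committed += 1
            self.done.remove(self.committed)

    def lag(self):
        # How many offsets are being tracked out of order -- the state
        # that balloons when one key gets stuck.
        return len(self.done)

tracker = OffsetTracker()
for off in [0, 2, 3, 5]:  # offset 1 is stuck on a slow key
    tracker.complete(off)
print(tracker.committed, tracker.lag())  # -> 0 3
```

Once the stuck offset 1 completes, the commit point jumps forward over the whole contiguous run, and the tracked state shrinks again.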
Yup, this is one more example, just like Pulsar. There are definitely great optimizations to be made on the average case. In the case of parallel consumer, if you'd like to keep ordering guarantees, you retain O(n^2) processing time in the worst case.
The issues arise when you try to traverse arbitrary dependency topologies in your messages. So you're left with two options:
1. Make damn sure that causal dependencies don't exhibit O(n^2) behavior, which requires formal models to be 100% sure.

2. Give up ordering or make some other nasty tradeoff.
At a high level the problem boils down to traversing a DAG in topological order. From computer science theory, we know that this requires a sorted index. And if you're implementing an index on top of Kafka, you might as well embed your data into and consume directly from the index. Of course, this is easier said than done, and that's why no one has cracked this problem yet. We were going to try, but alas we pivoted :)
Edit: Topological sort does not require a sorted index (or similar) if you don't care about concurrency. But then you've lost the advantages of your queue.
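For the curious, here is a minimal sketch of what "topological order via a sorted index" means in practice: Kahn's algorithm with a heap as the index (all names here are made up for illustration).

```python
import heapq

def consume_in_causal_order(deps):
    """Kahn's algorithm with a heap as the 'sorted index'.
    deps: {msg: set of msgs that must be processed first}.
    Runs in O(n log n) for n messages (plus edges)."""
    indegree = {m: len(d) for m, d in deps.items()}
    children = {m: [] for m in deps}
    for m, d in deps.items():
        for parent in d:
            children[parent].append(m)
    ready = [m for m, k in indegree.items() if k == 0]
    heapq.heapify(ready)  # the sorted index of currently-consumable messages
    order = []
    while ready:
        m = heapq.heappop(ready)
        order.append(m)
        for c in children[m]:
            indegree[c] -= 1
            if indegree[c] == 0:
                heapq.heappush(ready, c)
    return order

deps = {"A1": set(), "A2": {"A1"}, "B1": {"A1"}, "B2": {"B1"}}
print(consume_in_causal_order(deps))  # ['A1', 'A2', 'B1', 'B2']
```

The heap is what makes concurrent-safe ordering O(n log n); drop it (plain FIFO) and you keep correctness but lose the ability to always hand out the "smallest" ready message.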
Is there another way to state this? It’s very difficult for me to grok.
> DAG
Directed acyclic graph right?
A graphical representation might be worth a thousand words, keeping in mind it's just one example. Imagine you're traversing the following.
A1 -> A2 -> A3...
|
v
B1 -> B2 -> B3...
|
v
C1 -> C2 -> C3...
|
v
D1 -> D2 -> D3...
|
v
E1 -> E2 -> E3...
|
v
F1 -> F2 -> F3...
|
v
...
Efficient concurrent consumption of these messages (while respecting causal dependency) would take O(w + h), where w is the _width_ (left to right) of the longest sequence and h is the _height_ (top to bottom) of the first column.
But Pulsar, Kafka + parallel consumer, et al. would take O(n^2) either in processing time or in space complexity. This is because, at a fundamental level, the underlying data storage looks like this:
A1 -> A2 -> A3...
B1 -> B2 -> B3...
C1 -> C2 -> C3...
D1 -> D2 -> D3...
E1 -> E2 -> E3...
F1 -> F2 -> F3...
Notice that the underlying data storage loses information about nodes with multiple children (e.g., A1 previously parented both A2 and B1)
If we want to respect order, the consumer is responsible for declining to process messages that don't respect causal order, e.g. attempting to process F1 before E1. Thus we could get into a situation where we try to process F1, then E1, then D1, then C1, then B1, then A1. Now that A1 is processed, Kafka tries again, but it tries F1, then E1, then D1, then C1, then B1... and so on and so forth. This is O(n^2) behavior.
Without changing the underlying data storage architecture, you will either:
1. Incur O(n^2) space or time complexity
2. Reimplement the queuing mechanism at the consumer level, but then you might as well not use Kafka (or the others) at all. In practice no one does this (my evidence being that no one has pulled it off).
3. Face other nasty issues (e.g., in Kafka parallel consumer you can run out of memory or your processing time can become O(n^2)).
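To make the O(n^2) failure mode concrete, here's a toy simulation (hypothetical, not any real broker's redelivery logic) of a consumer that scans the backlog newest-first and can only unblock one message per pass, as in the F1-before-E1 scenario above:

```python
def attempts_with_blind_redelivery(chain):
    """Simulate a consumer that is redelivered the whole backlog each
    time it unblocks one message. chain[i] depends on chain[i-1]."""
    processed = set()
    attempts = 0
    while len(processed) < len(chain):
        # Broker redelivers remaining messages newest-first (worst case).
        for msg in reversed(chain):
            if msg in processed:
                continue
            attempts += 1
            i = chain.index(msg)
            if i == 0 or chain[i - 1] in processed:
                processed.add(msg)
                break  # commit, then the next pass redelivers the rest
    return attempts

n = 6
print(attempts_with_blind_redelivery(list(range(n))))  # 21 == n*(n+1)//2
```

Each pass does one unit of useful work but touches the entire remaining backlog, so n messages cost n + (n-1) + ... + 1 = n(n+1)/2 delivery attempts.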
Let's imagine ourselves as a couple of engineers at Acme Foreign Exchange House. We'd like to track Acme's net cash position across multiple currencies, and execute trades accordingly (e.g., hedging). And we'd like to retrospectively analyze our hedges, to assess their effectiveness.
Let's say I have this set of transactions (for accounts A, B, C, D, E, F, etc.)
A1 -> A2 -> A3 -> A4
B1 -> B2 -> B3 -> B4
C1 -> C2
D1 -> D2 -> D3 -> D4
E1 -> E2
F1
Let's say that:
- E1 was a deposit made into account E for $2M USD.
- E2 was an outgoing transfer of $2M USD sent to account F (incoming £1.7M GBP at F1).
If we consume our transactions and partition our consumption by account id, we could get into a state where E1 and F1 are reflected in our net position, but E2 isn't. That is, our calculation has both $2M USD and £1.7M GBP, when in reality we only ever held either $2M USD or £1.7M GBP.
So what could we do?
1. Make sure that we respect causality order. I.e., there's no F1 reflected in our net position if we haven't processed E2.
2. Make sure that pairs of transactions (e.g., E2 and F1) update our net position atomically.
This is otherwise known as a "consistent cut" (see slide 25 here https://www.cs.cornell.edu/courses/cs6410/2011fa/lectures/19...).
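A consistent cut is easy to state in code. This toy predicate (illustrative only, names made up) checks that a snapshot never includes an effect without its cause:

```python
def is_consistent_cut(cut, causes):
    """A cut (set of applied events) is consistent iff for every event
    in the cut, all of its causal predecessors are also in the cut."""
    return all(c in cut for e in cut for c in causes.get(e, ()))

# E2 (transfer out of E) causally precedes F1 (arrival at F),
# and E1 (the deposit) precedes E2.
causes = {"E2": {"E1"}, "F1": {"E2"}}
print(is_consistent_cut({"E1", "F1"}, causes))        # False: F1 without E2
print(is_consistent_cut({"E1", "E2", "F1"}, causes))  # True
```

The broken state in the example above ({E1, F1} applied but not E2) is exactly a cut that fails this predicate.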
Opinion: the world is causally ordered in arbitrary ways as above. But the tools, frameworks, and infrastructure more readily available to us struggle at modeling arbitrary partially ordered causality graphs. So we shrug our shoulders, and we learn to live with the edge cases. But it doesn't have to be so.
I'd rather debug a worker problem than an infra scaling problem every day of the week and twice on Sundays.
The parallel consumer nearly entirely solved this problem. Only the most egregious cases where keys were ~3000 times slower than other keys would cause an issue, and then you could solve it by disabling that key for a while.
I tend to prefer other queueing mechanisms in those cases, although I still work hard to make 99ths and medians align as it can still cause issues (especially for monitoring)
I'm guessing this is mostly around how backed up the stream is. n isn't the total number of messages but rather the current number of unacked messages.
Would a radix structure work better here? If you throw something like a UUID7 on the messages and store them in a radix structure you should be able to get O(n) performance here correct? Or am I not understanding the problem well.
And yes, O(n log n) is not bad at all. Sorted database indexes (whether SQL, NoSQL, AcmeVendorSQL, etc.) already take O(n log n) to insert n elements into data storage or to read n elements from data storage.
https://news.ycombinator.com/item?id=43783452
Not to mention you then also have a KV store. Most problems can be solved with redis + Postgres
https://docs.nats.io/nats-concepts/jetstream/key-value-store
[1] https://docs.nats.io/nats-concepts/jetstream
Ding Ding Ding
That's correct!
Still, if a hypothetical new Kafka would incorporate some of Nats' features, that would be a good thing.
I had to really dig (outside of that website) to understand even what NATS is and/or does
It goes too hard on the keyword babbling and too little on the "what does this actually do"
> Services can live anywhere and are easily discoverable - decentralized, zerotrust security
Ok cool, this tells me absolutely nothing. What service? Who to whom? Discovering what?
We use it for streaming tick data, system events, order events, etc, into kdb. We write to kafka and forget. The messages are persisted, and we don't have to worry if kdb has an issue. Out of band consumers read from the topics and persist to kdb.
In several years of doing this we haven't really had any major issues. It does the job we want. Of course, we use the aws managed service, so that simplifies quite a few things.
I read all the hate comments and wonder what we're missing.
Kafka is perhaps the most aptly named software I've ever used.
That said, it's rock solid and I continue to recommend it for cases where it makes sense.
Like the article outlines, partitions are not that useful for most people. Instead of removing them, how about putting them behind a feature flag, i.e. not on by default? That would ease 99% of users' problems.
The next point in the article which to me resonates is the lack of proper schema support. That's just bad UX again, not inherent complexity of the problem space.
On the testing side, why do I need to spin up a Kafka testcontainer? Why is there no in-memory Kafka server that I can use for simple testing purposes?
Take a look at Debezium's KafkaCluster, which is exactly that: https://github.com/debezium/debezium/blob/main/debezium-core....
It's used within Debezium's test suite. Check out the test for this class itself to see how it's being used: https://github.com/debezium/debezium/blob/main/debezium-core...
If you take action a, then action b, your system will throw 500s fairly regularly between those two steps, leaving your user in an inconsistent state. (a = pay money, b = receive item). Re-ordering the steps will just make it break differently.
If you stick both actions into a single event ({userid} paid {money} for {item}) then "two things" has just become "one thing" in your system. The user either paid money for item, or didn't. Your warehouse team can read this list of events to figure out which items to ship, and your payments team can read this list of events to figure out users' balances and owed taxes.
(You could do the one-thing-instead-of-two-things using a DB instead of Kafka, but then you have to invent some kind of pub-sub so that callers know when to check for new events.)
Also it's silly waiting around to see exceptions build up in your dev logs, or for angry customers to reach out via support tickets. When your implementation depends on publishing literal events of what happened, you can spin up side-cars which verify properties of your system in (soft) real-time. One side-car could just read all the ({userid} paid {money} for {item}) events and ({item} has been shipped) events. It's a few lines of code to match those together and all of a sudden you have a monitor of "Whose items haven't been shipped?". Then you can debug-in-bulk (before the customers get angry and reach out) rather than scour the developer logs for individual userIds to try to piece together what happened.
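The "few lines of code" to match those events together could look something like this (event shapes are made up for illustration):

```python
def unshipped(events):
    """Match 'paid' events against 'shipped' events to find items
    that were paid for but not yet shipped."""
    paid, shipped = set(), set()
    for ev in events:
        if ev["type"] == "paid":
            paid.add((ev["user"], ev["item"]))
        elif ev["type"] == "shipped":
            shipped.add((ev["user"], ev["item"]))
    return paid - shipped

events = [
    {"type": "paid", "user": "u1", "item": "book"},
    {"type": "paid", "user": "u2", "item": "lamp"},
    {"type": "shipped", "user": "u1", "item": "book"},
]
print(unshipped(events))  # {('u2', 'lamp')}
```

A side-car consuming both event streams and running this continuously gives you the "whose items haven't been shipped?" monitor without trawling per-user logs.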
Also, read this thread https://news.ycombinator.com/item?id=43776967 from a day ago, and compare this approach to what's going on in there, with audit trails, soft-deletes and updated_at fields.
Confluent has rather good marketing and when you need messaging but can also gain a persistent, super scalable data store and more, why not use that instead? The obvious answer is: Because there is no one-size-fits-all-solution with no drawbacks.
Granted it doesn't happen often, if you plan correctly, but the possibility of going wrong in the partitioning and replication makes updates and upgrades nightmare fuel.
Then we settled into an era where server rooms grew and workloads demanded horizontal scaling and for the high profile users running an odd number of processes was a rounding error and we just stopped doing it.
But we also see this issue re-emerge with dev sandboxes. Running three copies of Kafka, Redis, Consul, Mongo, or god forbid all four, is just a lot for one laptop, and 50% more EC2 instances if you spin it up in the Cloud, one cluster per dev.
I don’t know much Kafka, so I’ll stick with Consul as a mental exercise. If you take something like consul, the voting logic should be pretty well contained. It’s the logic for catching up a restarted node and serving the data that’s the complex part.
It took 4 years to properly integrate Kafka into our pipelines. Everything, like everything is complicated with it: cluster management, numerous semi-tested configurations, etc.
My final conclusion with it is that the project just doesn't really know what it wants to be. Instead it tries to provide everything for everybody, and ends up being an unbelievably complicated mess.
You know, there are systems that know what they want to be (Amazon S3, Postres, etc), and then there are systems that try to eat the world (Kafka, k8s, systemd).
It's a distributed log? What else is it trying to do?
I am not sure about this taxonomy. K8s, systemd, and (I would add) the Linux kernel are all taking on the ambitious task of central, automatic orchestration of general purpose computing systems. It's an extremely complex problem and I think all those technologies have done a reasonably good job of choosing the right abstractions to break down that (ever-changing) mess.
People tend to criticize projects with huge scope because they are obviously complex, and complexity is the enemy, but most of the complexity is necessary in these cases.
If Kafka's goal is to be a general purpose "operating system" for generic data systems, then that explains its complexity. But it's less obvious to me that this premise is a good one.
its real goal is to make Linux administration as useless as Windows so RH can sell certifications.
Tell me the output of systemctl is not as awful as opening the Windows service panel.
But both are kind of hard to understand end-to-end, especially for an occasional user.
systemd whole premise is "people will not read the distro or bash scripting manual"...
but then nobody reads systemd's (and you have even less reason to, since it's badly written, ever changing in conflicting ways, and a single-use tool)
so you went from complaining that your coworkers can't write bash to complaining that they don't know they have to write EXEC= and then EXEC=/bin/x,
because random values, without any hint, are lists of commands instead of string values.
There was a time in my early to mid career when I had to defend my designs a lot because people thought my solutions were shallower than they were and didn’t understand that the “quirks” were covering unhappy paths. They were often load bearing, 80/20 artifacts.
Really? I got scared by Kafka by just reading through the documentation.
Just don't use Kafka.
Write to the downstream datastore directly. Then you know your data is committed and you have a database to query.
Disclosure: I work part time for Oracle Labs and know about these features because I'm using them in a project at the moment.
The messaging tech being separate from the database tech means the architects can swap out the database if needed in the future without needing to rewrite the producers and consumers.
This reminds me of the OOP vs DOD debate again. OOP adherents say they don't know all the types of data their code operates on; DOD adherents say they actually do, since their program contains a finite number of classes, and a finite subset of those classes can be the ones called in this particular virtual function call.
What you mean is that your system is structured in such a way that it's as if you don't know who's listening. Which is okay, but you should be explicit that it's a design choice, not a law of physics, so when that design choice no longer serves you well, you have the right to change it.
(Sometimes you really don't know, because your code is a library or has a plugin system. In such cases, this doesn't apply.)
> Arguably, I'd not use Kafka to store actual data, just to notify in-flight.
I believe people did this initially and then discovered the non-Kafka copy of the data is redundant, so got rid of it, or relegated it to the status of a cache. This type of design is called Event Sourcing.
So yes, in large companies, your development team is just a small cog, you don't set policy for what happens to the data you gather. And in some sectors, like finances, you are an especially small cog with little power, which might sound strange if you only ever worked for a software startup.
There may be many things that go wrong and how you handle this depends on your data guarantees and consistency requirements.
If you're not queuing what are you doing when a write fails, throwing away the data?
The standard regex joke really works for all values of X. Some people, when confronted with a problem, think "I know - I'll use a queue!" Now they have two problems.
Adding a queue to the system does not make it faster or more reliable. It makes it more asynchronous (because of the queue), slower (because the computer has to do more stuff) and less reliable (because there are more "moving parts"). It's possible that a queue is a required component of a design which is more reliable and faster, but this can only be known about the specific design, not the general case.
I'd start by asking why your data store is randomly failing to write data. I've never encountered Postgres randomly failing to write data. There are certainly conditions that can cause Postgres to fail to write data, but they aren't random and most of them are bad enough to require an on-call engineer to come and fix the system anyway - if the problem could be resolved automatically, it would have been.
If you want to be resilient to events like the database disk being full (maybe it's a separate analytics database that's less important from the main transactional database) then adding a queue (on a separate disk) can make sense, so you can continue having accurate analytics after upgrading the analytics database storage. In this case you're using the queue to create a fault isolation boundary. It just kicks the can down the road though, since if the analytics queue storage fills up, you still have to either drop the analytics or fail the client's request. You have the same problem but now for the queue. Again, it could be a reasonable design, but not by default and you'd have to evaluate the whole design to see whether it's reasonable.
In the plural, accomplishing that task in a performant way at enterprise scale seems to involve turning every function call into an asynchronous, queued service of some sort.
Which then begets additional deployment and monitoring services.
A queued problem requires a cute solution, bringing acute pain.
The reason message queue systems exist is scale. Good luck sending a notification at 9am to your 3 million users and keeping your database alive in the sudden influx of activity. You need to queue that load.
It's just that the table is essentially append-only (excepting the cleanup processes it supports).
What if we wrote it in Rust? And leveraged WASM?
We have been at it for the past 6 years. https://github.com/infinyon/fluvio
For the past 2 years we have also been building a Flink-style engine using Rust and WASM. https://github.com/infinyon/stateful-dataflow-examples/
Any chance you’re going to be reviving support for the Kafka wire protocol?
https://github.com/infinyon/fluvio/issues/4259
How would you say your project compares to Arroyo?
I feel like Kafka is a victim of its own success: it's excellent for what it was designed for, but since the design is simple and elegant, people have been using it for all sorts of things for which it was not designed. And well, of course it's not perfect for those use cases.
Engineering solutions which only exist because AWS pricing is whack are...well, certainly a choice.
I can also think of lots of cases where whatever you're running is fine to just run in a single AZ since it's not critical.
Even if this were to change, using object storage results in a lot of operational simplicity as well compared to managing a bunch of disks. You can easily and quickly scale to zero or scale up to handle bursts in traffic.
An architecture like this also makes it possible to achieve a truly active-active multi-region Kafka cluster that has real SLAs.
See: https://buf.build/blog/bufstream-multi-region
(disclosure: I work at Buf)
Kafka is misused for some weird stuff. I've seen it used as a user database, which makes absolutely no sense. I've also seen it used as a "key/value" store, which I can't imagine being efficient as you'd have to scan the entire log.
Part of it seems to stem from "We need somewhere to store X. We already have Kafka, and requesting a database or key/value store is just a bit to much work, so let's stuff it into Kafka".
I had a client ask for a Kafka cluster, when queried about what they'd need it for we got "We don't know yet". Well that's going to make it a bit hard to dimension and tune it correctly. Everyone else used Kafka, so they wanted to use it too.
It will become slower. It will become costlier (to maintain). And we will end up with local replicas for performance.
If only people looked outside the AWS bubble and realised they are SEVERELY overcharged for storage, this would be a moot point.
I would welcome getting partition dropped in favour of multi-tenancy ... but for my use cases this is often equivalent.
Storage is not the problem though.
Kafka is simple and elegant?
As such, you can no longer use existing software that is built on Kafka as-is. It may not be a grave concern for LinkedIn, but it could be for others that currently benefit from using the existing Kafka ecosystem.
As such, even with Xinfra deployed, you have to rewrite all the software that connects to Kafka, regardless of programming language.
> "Key-level streams (... of events)"
When you are leaning on the storage backend for physical partitioning (as per the cloud example, where they would literally partition based on keys), doesn't this effectively just boil down to renaming partitions to keys, and keys to events?
I skipped learning Kafka, and jumped right into Pulsar. It works great for our use case. No complaints. But I wonder why so few use it?
Just because something is 10-30% better in certain cases almost never warrants its adoption, if on the other side you get much less human expertise, documentation/resources and battle tested testimonies.
This, imo, is the story of most Kafka competitors
StreamNative seems like an excellent team, and I hope they succeed. But as another comment has written, something (puslar) being better (than kafka) has to either be adopted from the start, or be a big enough improvement to change— and as difficult and feature-poor that Kafka is, it still gets the job done.
I can rant longer about this topic but Pulsar _should_ be more popular, but unfortunately Confluent has dominated here and rent-seeking this field into the ground.
Given an arbitrary causality graph between n messages, it would be ideal if you could consume your messages in topological order. And that you could do so in O(n log n).
No queuing system in the world does arbitrary causality graphs without O(n^2) costs. I dream of the day where this changes.
And because of this, we’ve adapted our message causality topologies to cope with the consuming mechanisms of Kafka et al
To make this less abstract, imagine you have two bank accounts, each with a stream. MoneyOut in Bob’s account should come BEFORE MoneyIn when he transfers to Alice’s account, despite each bank account having different partition keys.
Example Option 1
You give up on the guarantees across partition keys (bank accounts), and you accept that balances will not reflect a causally consistent state of the past.
E.g., Bob deposits 100, Bob sends 50 to Alice.
Balances:
Bob 0, Alice 50     # the source system was never in this state
Bob 100, Alice 50   # the source system was never in this state
Bob 50, Alice 50    # eventually consistent final state
Example Option 2
You give up on parallelism, and consume in total order (i.e., one single partition / unit of parallelism - e.g., in Kafka set a partitioner that always hashes to the same value).
Example Option 3
In the consumer you "wait" whenever you get a message that violates causal order.
E.g., Bob deposits 100, then Bob sends 50 to Alice (Bob-MoneyOut 50 -> Alice-MoneyIn 50).
If we attempt to consume Alice-MoneyIn before Bob-MoneyOut, we exponentially back off from the partition containing Alice-MoneyIn.
(Option 3 is terrible because of O(n^2) processing times in the worst case and the possibility for deadlocks (two partitions are waiting for one another))
We thought about building it ourselves, because we know the data structures, high level algorithms, and disk optimizations required. BUT we pivoted our company, so we've postponed this for the foreseeable future. After all, theory is relatively easy, but a true production grade implementation takes years.
Or is the problem that, again, this is O(n^2)? (Because the consumers now need to buffer [potentially] n key streams, and then search through them every time, so "n" times?)
This is exactly what I interpret from these kinds of articles: engineering just for the sake of engineering. I am not saying we should not investigate how to improve our engineered artifacts, or that we should not improve them. But I see a generalized lack of reflection on why we should do it, and I think it is related to a detachment from the domains we create software for. The article suggests uses of the technology that come from such different ways of using it that it loses coherence as a technical item.
Kafka's biggest strength is the wide and useful ecosystem built on top of it.
It is also a weaknesses, as we have to keep some (but not of all) the design decisions we wouldn't have made had we started from scratch today. Or we could drop backwards compatibility, at the cost of having to recreate the ecosystem we already have.
I’ve been working on a datastore that’s perfect for this [1], but I’m getting very little traction. Does anyone have any ideas why that is? Is my marketing just bad, or is this feature just not very useful after all?
1. https://www.haystackdb.dev/
Mature projects have too much bureaucracy, and even spending time talking to you is an opportunity cost. So making a case for why you're going to solve a problem for them is tough.
New projects (whether at big companies or small companies) have 20 other things to worry about, so the problem isn't big enough.
I wrote about this in our blog if you're curious: https://ambar.cloud/blog/a-new-path-for-ambar
That's cool, but I would prefer not to reinvent the wheel. If you have a simple library, that would already be useful.
Some simple code or request examples would be convenient as well. I really don't know how easy or difficult your interface design is. It would be cool to see the API docs.
I'm also skeptical of the graph on your front page that claims S3 cost as much as DynamoDB.
that alone makes it look like total nonsense.
as someone else said, extraordinary claims require extraordinary evidence.
Keep the data in S3 for 0.023 USD per GB-month. If you have a billion keys that can be useful.
> I'm also skeptical of the graph on your front page that claims S3 cost as much as DynamoDB.
Good point. Could have put a bit more work into that.
On second thought, and after looking at my cost estimates, the reason DynamoDB ends up costing about the same as S3 for this kind of use case is storage costs. DynamoDB is a lot cheaper than S3 to write to, but 5-10x more expensive to keep data stored in. So after about 16-32 months you reach break even.
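Rough arithmetic behind that break-even claim, with illustrative prices only (ballpark figures, not authoritative AWS pricing; real prices vary by region and table class):

```python
# Illustrative prices (assumptions, not quoted AWS rates): USD per GB-month.
s3_storage, ddb_storage = 0.023, 0.25

# Suppose writing 1 GB of tiny records costs w_s3 on S3 (per-request PUT
# pricing) and w_ddb on DynamoDB, with w_ddb < w_s3. Cumulative cost is
# equal at month m when:
#   w_s3 + m * s3_storage == w_ddb + m * ddb_storage
def break_even_months(w_s3, w_ddb):
    return (w_s3 - w_ddb) / (ddb_storage - s3_storage)

# If S3 writes cost ~$5/GB more than DynamoDB writes for small objects:
print(round(break_even_months(5.0, 0.0), 1))  # ~22 months
```

With the write-cost gap in the $4-7/GB range, the break-even lands in roughly the 16-32 month window mentioned above; after that, S3's cheaper storage wins.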
It's a compiled binary - no JVM to manage. Java apps have always been a headache for me. Plus, no ZooKeeper - one less thing to break.
The biggest benefit I've seen is the performance. Redpanda just outperforms Apache Kafka on similar hardware. It's also Kafka compliant in every way I've noticed, so all my favorite tools that interact with Kafka work the same with Redpanda.
Redpanda, like Kafka, writes to disk, so you'll always be limited by your hardware no matter what you use (but NVMe's are fast and affordable).
YMMV, but it's been a good experience for me.
IANAL, but it looks pretty open source to me.
Anyone here have some real world experience with it?
While it would be useful to just blame Kafka for being bad technology it seems many other people get it wrong, too.
(can't edit)
But don’t public cloud providers already all have cloud-native event sourcing? If that’s what you need, just use that instead of Kafka.
The problem with guaranteed order is that you have to have some agreed upon counter/clock for ordering, otherwise a slow write from one producer to S3 could result in consumers already passing that offset before it was written, thus the write is effectively lost unless the consumers wind-back.
Having partitions means we can assign a dedicated writer for that partition that guarantees that writes are in order. With s3-direct writing, you lose that benefit, even with a timestamp/offset oracle. You'd need some metadata system that can do serializable isolation to guarantee that segments are written (visible to consumers) in the order of their timestamps. Doing that transactional system directly against S3 would be super slow (and you still need either bounded-error clocks, or a timestamp oracle service).
The backing db in this wishlist would be something in the vein of Aurora to achieve the storage compute split.
Why?
The user gets global ordering when
1. you-the-MQ assign both messages and partitions stable + unique + order + user-exposed identifiers;
2. the user constructs a "globally-collatable ID" from the (perPartitionMsgSequenceNumber, partitionID) tuple;
3. the user does a client-side streaming merge-sort of messages received by the partitions, sorting by this collation ID. (Where even in an ACK-on-receive design, messages don't actually get ACKed until they exit the client-side per-partition sort buffer and enter the linearized stream.)
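Step 3 is essentially a k-way merge. A minimal sketch (illustrative, assuming each partition's stream arrives already sorted by sequence number):

```python
import heapq

def linearize(partitions):
    """Client-side streaming merge of per-partition message streams.
    Each message is (seq, partition_id, payload); each partition's
    stream is already sorted by seq, so a k-way heap merge yields
    the globally collated order."""
    return [payload for seq, pid, payload in heapq.merge(*partitions)]

p0 = [(1, 0, "a"), (4, 0, "d")]
p1 = [(2, 1, "b"), (3, 1, "c")]
print(linearize([p0, p1]))  # ['a', 'b', 'c', 'd']
```

`heapq.merge` is lazy, so in a real client this would run as a streaming generator over the open partition connections rather than materializing a list.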
The definition of "exposed to users" is a bit interesting here, as you might think you could do this merge-sort on the backend, just exposing a pre-linearized stream to the client.
But one of the key points/benefits of Kafka-like systems, under high throughput load (which is their domain of comparative advantage, and so should be assumed to be the deployed use-case), is that you can parallelize consumption cheaply, by just assigning your consumer-workers partitions of the topic to consume.
And this still works under global ordering, under some provisos:
• your workload can be structured as a map/reduce, and you don't need global ordering for the map step, only the reduce step;
• it's not impractical for you to materialize+embed the original intended input collation-ordering into the transform workers' output (because otherwise it will be lost in all but very specific situations.)
Plenty of systems fit these constraints, and happily rely on doing this kind of post-linearized map/reduce parallelized Kafka partition consumption.
And if you "hide" this on an API level, this parallelization becomes impossible.
Note, however, that "on an API level" bit. This is only a problem insofar as your system design is protocol-centered, with the expectation of "cheap, easy" third-party client implementations.
If your MQ is not just a backend, but also a fat client SDK library — then you can put the partition-collation into the fat client, and it will still end up being "transparent" to the user. (Save for the user possibly wondering why the client library opens O(K) TCP connections to the broker to consume certain topics under certain configurations.)
See also: why Google's Colossus has a fat client SDK library.
Feels like there is another squeeze in that idea if someone “just” took all their docs and replicated the feature set. But maybe that’s what S2 is already aiming at.
Wonder how long warpstream docs, marketing materials and useful blogs will stay up.
warpstream has latency issues, which downstream turn into cost issues
That said, they were at least "good enough" to make "buy" more appealing than "build"
have some of you some experience with those and able to give pros/cons?
Redpanda, most importantly, is faster than Apache Kafka. We were able to get a lot more throughput. It's also stable, especially compared to dealing with anything that requires a JVM.
/s ... seriously
I distinctly recall running it at home as a global unioning filesystem where content had to fully fit on the specific device it was targeted at (rather than being striped or whatever).
>I cannot make you understand. I cannot make anyone understand what is happening inside me. I cannot even explain it to myself. -Franz Kafka, The Metamorphosis
I've seen these comments for over 15 years, yet for some "unknown", "silly" reason Java keeps being used for really, really useful software like Kafka.
I could provide examples myself, but I'm not convinced it's about Java vs C++ or Go: Hadoop, Cassandra, ZooKeeper.
An embedded interpreter and JIT in rust basically but jostled around a bit to make it more cohesive and the data interop more fluid— PyO3 but backwards.
But I'm doubtful that it's going to make things simpler if one can't even decide on a language.
Step 4, Rewrite it in Zig.
Step 5, due to security issues rewrite it in Java 63.
I'm currently building a full workload scheduler/orchestrator. I'm sick of Kubernetes. The world needs better -> https://x.com/GeoffreyHuntley/status/1915677858867105862
Go on then. Post the repo when you have :)