What I find particularly ironic is that the title makes it feel like Rust gives a 5x performance improvement when it actually slows things down.
The problem: they have software written in Rust, and they need to use the libpg_query library, which is written in C. Because they can't use the C library directly, they had to use a Rust-to-C binding library that uses Protobuf for portability reasons. The problem is that it is slow.
So what they did is write their own non-portable but much more optimized Rust-to-C bindings, with the help of an LLM.
But had they written their software in C, they wouldn't have needed to do any conversion at all. It means they could have titled the article "How we lowered the performance penalty of using Rust".
I don't know much about Rust or libpg_query, but they probably could have gone even faster by getting rid of the conversion entirely. It would most likely have involved major adaptations and some unsafe Rust though. Writing a converter has many advantages: portability, convenience, security, etc... but it has a cost, and ultimately, I think it is a big reason why computers are so fast and apps are so slow. Our machines keep copying, converting, serializing and deserializing things.
Note: I have nothing against what they did, quite the opposite, I always appreciate those who care about performance, and what they did is reasonable and effective, good job!
Aurornis 3 hours ago [-]
> What I find particularly ironic is that the title makes it feel like Rust gives a 5x performance improvement when it actually slows things down.
Rust didn't slow them down. The inefficient design of the external library did.
Calling into C libraries from Rust is extremely easy. It takes some work to create a safer wrapper around C libraries, but it's been done for many popular libraries.
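To give a sense of the "thin unsafe layer plus a safe wrapper" pattern, here is a minimal sketch. The C function names (c_normalize_query, c_free_result) are made up for illustration; they are not libpg_query's actual API.

    use std::ffi::{CStr, CString};
    use std::os::raw::c_char;

    // Raw declarations for a hypothetical C library (illustrative names only).
    extern "C" {
        fn c_normalize_query(input: *const c_char) -> *mut c_char;
        fn c_free_result(result: *mut c_char);
    }

    // Safe wrapper: owns the string conversions and guarantees the C result is freed.
    pub fn normalize_query(input: &str) -> Option<String> {
        let c_input = CString::new(input).ok()?;
        unsafe {
            let raw = c_normalize_query(c_input.as_ptr());
            if raw.is_null() {
                return None;
            }
            let out = CStr::from_ptr(raw).to_string_lossy().into_owned();
            c_free_result(raw);
            Some(out)
        }
    }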
This is the first and only time I've seen an external library connected via a Rube Goldberg like contraption with protobufs in the middle. That's the problem.
Sadly they went with the "rewrite to Rust" meme in the headline for more clickability.
GuB-42 2 hours ago [-]
> Calling into C libraries from Rust is extremely easy
Calling the C function is not the problem here. It is dealing with the big data structure this function returns in a Rust-friendly manner.
This is something Protobuf does very well, at the cost of performance.
wizzwizz4 17 minutes ago [-]
Writing Rust bindings for arbitrary C data structures is not hard. You just need to make sure every part of your safe Rust API code upholds the necessary invariants. (Sometimes that's non-trivial, but a little thinking will always yield a solution: if C code can do it, then it can be done, and if it can be done, then it can be done in Rust.)
driftwood4537 1 hours ago [-]
[dead]
phkahler 6 hours ago [-]
>> But had they written their software in C, they wouldn't have needed to do any conversion at all. It means they could have titled the article "How we lowered the performance penalty of using Rust".
That's not really fair. The library was doing serialization/deserialization, which was a poor design choice from a performance perspective. They just made a more sane API that doesn't do all that extra work. It might best be titled "replacing protobuf with a normal API to go 5 times faster."
BTW what makes you think writing their end in C would yield even higher performance?
GuB-42 4 hours ago [-]
> BTW what makes you think writing their end in C would yield even higher performance?
C is not inherently faster, you are right about that.
But what I understand is that the library they use works with data structures that are designed to be used in a C-like language, and are presumably full of raw pointers. These are not ideal for working with in Rust; instead, presumably, they wrote their own data model in Rust fashion, which means they now need to do a conversion, which is obviously slower than doing nothing.
They probably could have worked with the C structures directly, resulting in code that could be as fast as C, but that wouldn't make for great Rust code. In the end, they chose the compromise of speeding up conversion.
Also, the use of Protobuf may be a poor choice from a performance perspective, but it is a good choice for portability, it allows them to support plenty of languages for cheaper, and Rust was just one among others. The PgDog team gave Rust and their specific application special treatment.
hn_go_brrrrr 4 hours ago [-]
Because then they never would have needed the poorly-designed intermediate library.
the__alchemist 6 hours ago [-]
I wonder why they didn't immediately FFI it: C is the easiest language to write Rust bindings for. It can get tedious if you use many parts of a large API, but otherwise it's straightforward.
I write most of my applications and libraries in Rust, and lament that most of the libraries I wish I would FFI are in C++ or Python, which are more difficult.
Protobuf sounds like the wrong tool. It has applications for wire serialization and similar, but is still kind of a mess there. I would not apply it to something that stays in memory.
vlovich123 5 hours ago [-]
It’s trivial to expose the raw C bindings (eg a -sys crate) because you just run bindgen on the header. The difficult part can be creating safe, high-performance abstractions.
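For reference, the -sys part really is mostly mechanical. A typical build.rs (assuming the bindgen crate as a build dependency and a wrapper.h that includes the C library's headers) looks roughly like this:

    // build.rs — generate raw bindings at compile time.
    fn main() {
        let bindings = bindgen::Builder::default()
            .header("wrapper.h")
            .generate()
            .expect("failed to generate bindings");

        let out_dir = std::path::PathBuf::from(std::env::var("OUT_DIR").unwrap());
        bindings
            .write_to_file(out_dir.join("bindings.rs"))
            .expect("failed to write bindings");
    }

The crate then pulls the generated file in with include!(concat!(env!("OUT_DIR"), "/bindings.rs")); the hard part, as noted, is the safe layer on top.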
kleton 6 hours ago [-]
>Protobuf sounds like the wrong tool
This sort of use for proto is quite common at Google.
kccqzy 6 hours ago [-]
No it’s not common for two pieces of code within a single process to communicate by serializing the protobuf into the wire format and deserializing it.
It’s however somewhat common to pass in-memory protobuf objects between code, because the author didn’t want to define a custom struct but preferred to use an existing protobuf definition.
hn_go_brrrrr 4 hours ago [-]
I agree it's not super common, but Boq's in-process RPC feature encourages this pattern.
1718627440 3 hours ago [-]
Except it is not a remote procedure call.
dchuk 5 hours ago [-]
Given they heavily used LLMs for this optimization, it makes you wonder why they didn't use them to just port the C library to Rust entirely. I think the volume of library ports to more languages/the most performant languages is going to explode, especially given it's a relatively deterministic effort so long as you have good tests and API contracts, etc.
cfors 4 hours ago [-]
The underlying C library interacts directly with the postgres query parser (therefore, Postgres source). So unless you rewrite postgres in Rust, you wouldn't be able to do that.
vineyardmike 2 hours ago [-]
Well then why didn’t they just get the LLM to rewrite all of Postgres too /s
I agree that LLMs will make clients/interfaces in every language combination much more common, but I wonder what impact it'll have on these big software projects if more people stop learning C.
logicchains 7 hours ago [-]
> they had to use a Rust-to-C binding library, that uses Protobuf for portability reasons.
That sounds like a performance nightmare, putting Protobuf of all things between the language and Postgres, I'm surprised such a library ever got popular.
formerly_proven 5 hours ago [-]
> I'm surprised such a library ever got popular.
Because it is not popular.
pg_query (TFA) has ~1 million downloads, the postgres crate has 11 million downloads and the related tokio-postgres crate has over 33 million downloads. The two postgres crates currently see around 50x as much traffic as the (special-purpose) crate from the article.
edit: There is also pq-sys with over 12 million downloads, used by diesel, and sqlx-postgres with over 16 million downloads, used by sqlx.
cranx 9 hours ago [-]
I find the title a bit misleading. I think it should be titled It’s Faster to Copy Memory Directly than Send a Protobuf. Which then seems rather obvious that removing a serialization and deserialization step reduces runtime.
bluGill 7 hours ago [-]
Protobuf does something important that copying memory cannot do: a protocol that can be changed separately on either end and things can still work. You have to build for "my client doesn't send some new data" (make a good default), or "I got extra data I don't understand" (ignore it). However the ability to upgrade part of the system is critical when the system is large and complex since you can't fix everything to understand your new feature without making the new feature take ages to roll out.
Protobuf also handles a bunch of languages for you. The other team wants to write in a "stupid language" - you don't have to have a political fight to prove your preferred is best for everything. You just let that team do what they want and they can learn the hard way it was a bad language. Either it isn't really that bad and so the fight was pointless, or it was but management can find other metrics to prove it and it becomes their problem to decide if it is bad enough to be worth fixing.
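A minimal sketch of that compatibility point (old and new readers/writers coexisting), assuming the prost crate and two hand-written versions of a made-up message (normally these would be generated from a .proto file):

    use prost::Message;

    // Old schema: just the query text.
    #[derive(Clone, PartialEq, ::prost::Message)]
    struct QueryV1 {
        #[prost(string, tag = "1")]
        sql: String,
    }

    // New schema: a field was added under a new tag.
    #[derive(Clone, PartialEq, ::prost::Message)]
    struct QueryV2 {
        #[prost(string, tag = "1")]
        sql: String,
        #[prost(uint32, tag = "2")]
        timeout_ms: u32,
    }

    fn main() {
        // Old reader, new writer: the unknown field (tag 2) is skipped.
        let new_bytes = QueryV2 { sql: "SELECT 1".into(), timeout_ms: 500 }.encode_to_vec();
        let old_view = QueryV1::decode(new_bytes.as_slice()).unwrap();
        assert_eq!(old_view.sql, "SELECT 1");

        // New reader, old writer: the missing field falls back to its default.
        let old_bytes = QueryV1 { sql: "SELECT 2".into() }.encode_to_vec();
        let new_view = QueryV2::decode(old_bytes.as_slice()).unwrap();
        assert_eq!(new_view.timeout_ms, 0);
    }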
vlovich123 5 hours ago [-]
But something more modern that doesn’t have the encoding/decoding penalty of Protobuf would be better (eg cap’n’proto but there’s a bunch now in this space).
bluGill 4 hours ago [-]
Not that you are wrong, but in the real world this is not significant for most uses. If it is significant you are doing too much IPC. Or maybe using protobuf where you should be making a direct function call. Fix the architecture either way. (similar to how I can make bubble sort faster with careful machine code optimization, but it is hard to make modern tim sort slower in the real world no matter how bad the implementation is)
MrDarcy 8 hours ago [-]
TIL serializing a protobuf is only 5 times slower than copying memory, which is way faster than I thought it’d be. Impressive given all the other nice things protobuf offers to development teams.
dietr1ch 6 hours ago [-]
I guess that number is as good or as bad as you want with the right nesting.
Protobuf is likely really close to optimally fast for what it is designed to be, and the flaws and performance losses left are most likely all in the design space, which is why alternatives are a dime a dozen.
jeffbee 2 hours ago [-]
Serializing a protobuf can be significantly faster than memcpy, depending. If you have a giant vector of small numbers represented with wide types (4-8 bytes in the machine) then the cost of copying them as variable-length symbols can be less.
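A rough sketch of that effect, assuming the prost crate and a made-up message holding a packed repeated field of small values stored as u64 in memory:

    use prost::Message;

    #[derive(Clone, PartialEq, ::prost::Message)]
    struct Numbers {
        #[prost(uint64, repeated, packed = "true", tag = "1")]
        values: Vec<u64>,
    }

    fn main() {
        // 1,000 small numbers, each 8 bytes wide in memory.
        let msg = Numbers { values: (0..1_000u64).map(|i| i % 100).collect() };
        let encoded = msg.encode_to_vec();
        println!("in-memory payload: {} bytes", msg.values.len() * 8); // 8000
        println!("varint encoding:   {} bytes", encoded.len());        // roughly 1000
    }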
nicman23 7 hours ago [-]
that's actually crazy fast
cmrdporcupine 6 hours ago [-]
I wouldn't hold onto that number as any kind of fixed usable constant since the reality will depend entirely on things like cache locality and concurrency, and the memory bandwidth of the machine you're running on.
Going around doing this kind of pointless thing because "it's only 5x slower" is a bad assumption to make.
miroljub 8 hours ago [-]
Yep.
Just doing memcpy or mmap would be even faster. But the same Rust advocates bragging about Rust speed frown upon such insecure practices in C/C++.
infogulch 6 hours ago [-]
Why don't we use standardized zero-copy data formats for this kind of thing? A standardized layout like Arrow means that the data is not tied to the layout/padding of a particular language, potential security problems like bounds checks are automatically handled by the tooling, and it works well over multiple communication channels.
mrlongroots 4 hours ago [-]
While Arrow is amazing, it is only the C Data Interface that can be FFI'ed, which is pretty low level. If you have something higher-level like a table or a vector of recordbatches, you have to write quite a bit of FFI glue yourself. It is still performant because it's a tiny amount of metadata, but it can still be a bit tedious.
And the reason is ABI compatibility. Reasoning about ABI compatibility across different C++ versions and optimization levels and architectures can be a nightmare, let alone different programming languages.
The reason it works at all for Arrow is that the leaf levels of the data model are large contiguous columnar arrays, so reconstructing the higher layers still gets you a lot of value. The other domains where it works are tensors/DLPack and scientific arrays (Zarr etc). For arbitrary struct layouts across languages/architectures/versions, serdes is way more reliable than a universal ABI.
lenkite 6 hours ago [-]
How we used Claude and bindgen to make Rust catch up with C's 5x performance.
jpalepu33 34 minutes ago [-]
The title is misleading but the actual work is impressive - they optimized their Protobuf usage, not replaced it entirely.
This is a common pattern: "We switched to X and got 5x faster" often really means "We fixed our terrible implementation and happened to rewrite it in X."
Key lessons from this:
1. Serialization/deserialization is often a hidden bottleneck, especially in microservices where you're doing it constantly
2. The default implementation of any library is rarely optimal for your specific use case
3. Benchmarking before optimization is critical - they identified the actual bottleneck instead of guessing
For anyone dealing with Protobuf performance issues, before rewriting:
- Use arena allocation to reduce memory allocations
- Pool your message objects
- Consider if you actually need all the fields you're serializing
- Profile the actual hot path
Rust FFI has overhead too. The real win here was probably rethinking their data flow and doing the optimization work, not just the language choice.
lfittl 59 minutes ago [-]
Since there seems to be some confusion in the comments about why pg_query chose Protobufs in the first place, let me add some context as the original author of pg_query (but not involved with PgDog, though Lev has shared this work by email beforehand).
The initial motivation for developing pg_query was for pganalyze, where we use it to parse queries extracted from Postgres, to find the referenced tables, and these days also rewrite and format queries. That use case runs in the background, and as such is much less performance critical.
pg_query actually initially used a JSON format for the parse output (AST), but we changed that to Protobuf a few major releases ago, because Protobuf makes it easy to have typed bindings in the different languages we support (Ruby, Go, Rust, Python, etc). Alternatives (e.g. using FFI directly) make sense for Rust, but would require a lot of maintained glue code for other languages.
All that said, I'm supportive of Lev's effort here, and we'll add some additional functions (see [0]) in the libpg_query library to make using it directly (i.e. via FFI) easier. But I don't see Protobuf going away, because in non-performance critical cases, it is more ergonomic across the different bindings.
[0]: https://github.com/pganalyze/libpg_query/pull/321
Are they sure it's because of Rust? Perhaps if they rewrote Protobuf in Rust it would be as slow as the current implementation.
They changed the persistence system completely. Looks like from a generic solution to something specific to what they're carrying across the wire.
They could have done it in Lua and it would have been 3x faster.
consp 11 hours ago [-]
If they had made the headline something along the lines of "replacing protobuf with a native, optimized implementation", it would not have gotten the same attention as putting Rust in the title to attract the everything-in-rust-is-better crowd.
desiderantes 10 hours ago [-]
That never happens. Instead, it always attracts the opposite group, the Rust complainers, where they go and complain about how "the everything-in-rust-is-better crowd created yet another fake headline to pretend that Rust is the panacea". Which results in a lot of engagement. Old ragebait trick.
hu3 8 hours ago [-]
At the very least it gets more upvotes.
timeon 7 hours ago [-]
Well it is keyword for RSS feeds.
izacus 8 hours ago [-]
"never" huh?
DangitBobby 43 minutes ago [-]
Pretty much. The tide of rust evangelism has been turned in favor of complainers for a while now. Nothing compared to JS and React hate, but still.
embedding-shape 11 hours ago [-]
It's devbait, not many of us can resist bikeshedding about the title which obviously doesn't accurately reflect the article contents. And the article contents are self-aware enough to admit this to itself too, yet the title remains.
alias_neo 11 hours ago [-]
I was equally confused by the headline.
I wonder if it's just poorly worded and they meant to say something like "Replacing Protobuf with some native calls [in Rust]".
misja111 11 hours ago [-]
Correct, this has very little to do with Rust. But it wouldn't have made the front page without it.
mkoubaa 6 hours ago [-]
Bingo
win311fwg 10 hours ago [-]
The title would suggest that it was already written in Rust, and that it was the rewrite in Go that made it five times faster.
locknitpicker 11 hours ago [-]
Yes you are absolutely right. The article even outright admits that Rust had nothing to do with it. From the article:
> Protobuf is fast, but not using Protobuf is faster.
The blog post reads like an unserious attempt to repeat a Rust meme.
rozenmd 10 hours ago [-]
"5 times faster" reminds me of Cap'n Proto's claim: in benchmarks, Cap’n Proto is INFINITY TIMES faster than Protocol Buffers: https://capnproto.org/
7777332215 10 hours ago [-]
In my experience capn proto is much less ergonomic.
IshKebab 6 hours ago [-]
I agree. It might be faster if you don't actually deserialise the data into native structs but then your codebase will be filled with fairly horrific CapnProto C++ code.
gf000 9 hours ago [-]
I mean, cap'n'proto is written by the same person who created protobuf, so they are legit (and that somewhat jokey claim simply means that it requires no parsing).
Sesse__ 9 hours ago [-]
> I mean, cap'n'proto is written by the same person who created protobuf
Notably, Protobuf 2, a rewrite of Protobuf 1. Protobuf 1 was created by Sanjay Ghemawat, I believe.
7e 8 hours ago [-]
Google loves to reinvent shit because they didn't understand it. And to get promo. In this case, ASN.1. And protobufs are so inefficient that they drive up latency and datacenter costs, so they were a step backwards. Good job, Sanjay.
notyourwork 5 hours ago [-]
Really dismissive and ignorant take from a bystander. Back it up with your delivery that does better instead of shouting with a pitchfork for no reason.
nemothekid 2 hours ago [-]
Can someone explain how protobuf ended up in the middle here? I'm just totally confused; the C ABI exists in almost every language, why did they need protobuf here?
eliasdejong 4 hours ago [-]
Performance of Protobuf is a joke. Why not use a zero copy format so that serialization is free? For example, my format Lite³ which outperforms Google Flatbuffers by 242x: https://github.com/fastserial/lite3
tucnak 3 hours ago [-]
Mmmm, I don't know, maybe because your library DIDN'T EXIST before November 2025? Or perhaps for any of a million other reasons why people use Protobuf and don't use Cap'n'proto and other zero-serialization libraries, like requiring a schema, established tooling for the language of their choice, etc.?
ajross 13 minutes ago [-]
Seems like this has nothing to do with Rust or protobufs. The underlying PostgreSQL abstraction engine they'd picked had a wasteful serialization implementation (that happens to have been using protobuf). So pgdog dropped it and open-coded a serialization-free transfer using the C API.
Well, yeah. If there's a feature you don't need, you'll see value by coding around it. Some features turn out not to be needed by anyone, maybe this is one. But some people need serialization, and that's what protobufs are for[1]. Those people are very (!) poorly served by headlines telling them to use Rust (!!) instead of serialization.
[1] Though as always the standard litany applies: you actually want JSON, and not protobufs or ASN.1 or anything else. If you like some other technology better, you're wrong and you actually want JSON. If you think you need something faster, you probably don't and JSON would suit your needs better. If you really, 100%, know for sure that you need it faster than JSON, then you're probably isomorphic to the folks in the linked article, shouldn't have been serializing at all, and should get to work open coding your own hooks on the raw backend.
yodacola 11 hours ago [-]
FlatBuffers are already faster than that. But that's not why we choose Protobuf. It's because a megacorp maintains it.
rurban 1 hours ago [-]
I also thought I could trust the megacorp. That's why I put all my code on their platform, code.google.com, and not on this obscure platform without any business model, github.
Well, that sucked. And why should I use protobuf, when I just need to share structs and arrays in memory (aka zero copy) with a version field? Like everyone else does for decades?
nindalf 10 hours ago [-]
You're saying we choose Protobufs [1] because Google maintains it but not FlatBuffers [2]?
[1] - https://github.com/protocolbuffers/protobuf: Google's data interchange format
[2] - https://github.com/google/flatbuffers: Also maintained by Google
I get the OP is off base with his remark - but at the same time maintained by Google means shit in practice.
AFAIK they have a bunch of production infra on protobuf/gRPC - not so sure about FlatBuffers, which came out of the game dev side - that's the difference maker to me: where the project is actually rooted.
dmoy 3 hours ago [-]
> AFAIK they have a bunch of production infra on protobuff/gRPC
Stubby, not gRPC. Stubby is used for almost everything internally. gRPC is a similar-ish looking thing that is open sourced, but not used nearly as much as stubby internally.
Stubby predates gRPC by like 15 years or something.
> not so sure about flatbufferrs which came out of the game dev side
I wouldn't know. I'll be honest, I always forget that Google made flatbuffers. I guess if you're doing a lot of IPC?
dewey 10 hours ago [-]
> but at the same time maintained by Google means shit in practice.
If you worked on Go projects that import Google protobuf / grpc / Kubernetes client libraries you are often reminded of that fact.
whoevercares 8 hours ago [-]
Flatbuffers are fine - I think it is used in many places that needs zero-copy. Also outside google, it powers the Arrow format which is the foundation of modern analytics
cmrdporcupine 3 hours ago [-]
I know it's confusing, but things being under the 'google' namespace on GitHub doesn't mean they're maintained by Google. At least not as an official project.
It just means a person working at Google used that avenue to open source them.
Google offers a few legal avenues to allow you to open source your stuff while working there, but one of the easiest is just to assign copyright to Google and shove it under their GitHub.
It just means a Googler published it, not that Google itself is maintaining it.
I don't know what the status of flatbuffers is specifically, but I can say I never encountered it in use in the 10 years I worked there. (I use it a lot now on my own things post-Google)
secondcoming 10 hours ago [-]
Yet they've yet to release their internal optimisation that allows zero-copying string-type fields.
suriya-ganesh 5 hours ago [-]
This is an unfair comparison.
They were using a transport serialization/deserialization protocol for IPC. It is obvious why there was overhead: it was an architectural decision to manage the communication.
I guess the old adage is true here: if something gets 20% faster, something was improved; if it gets 10x faster, it was just built wrong.
maherbeg 4 hours ago [-]
Gotta say, I love using PGDog. It has some fantastic built in features, and I'm looking forward to testing out the improved query parser. Lev and the team are heroes.
At the scale we were using PGDog, enabling the previous form of the query parser was extremely expensive (we would have had to 16x our pgdog fleet size).
levkk 4 hours ago [-]
That's the experimental feature I was talking about! :)
Thank you so much for the kind words!
rgovostes 22 minutes ago [-]
What the hell happened to Protobuf anyway? Go look at their repo; it’s positively byzantine. There are two or three different Python backends.
t-writescode 10 hours ago [-]
Just for fun, how often do regular-sized companies that deal in regular-sized traffic need Protobuf to accomplish their goals in the first place, compared to JSON or even XML with basic string marshalling?
izacus 8 hours ago [-]
I dunno, are you sure you can manually write correct de/serialization for JSON and XML so that strings, floats and integer formats get parsed correctly between JavaScript, Java, Python, Go, Rust, C++ and any other languages?
Do you want to maintain that and debug that? Do you want to do all of that without help of a compiler enforcing the schema and failing compiles/CI when someone accidentally changes the schema?
Because you get all of that with protobuf if you use them appropriately.
You can of course build all of this yourself... and maybe it'll even be as efficient, performant and supported. Maybe.
t-writescode 3 hours ago [-]
I mean, the entire internet has been doing that for decades and there’s a lot of tooling, libraries and generators that already do that, so … sure?
And it works in a browser, too!
nicman23 7 hours ago [-]
I mean, you can always go mono- or duo-language and then it is really not that much of an issue.
eklavya 6 hours ago [-]
That would make sense if protobuf was complex, bloated, slow. But it's not, so the question should be why not use it, unless you are doing browser stuff.
9rx 6 hours ago [-]
If you are going to use it elsewhere, why not use it for browser stuff too?
eklavya 4 hours ago [-]
I would advise against it. Too much friction, try it, maybe you will have a different experience than mine.
9rx 4 hours ago [-]
I am curious about what kind of friction you encountered. Were you generating ad-hoc protobuf messages?
Assuming you were using Protobufs as they are usually used, meaning under generated code, I saw no difference between using it in Javascript and any other language in my experience. The wire format is beyond your concern. At least it is no more of your concern than it is in any other environment.
There are a number of different generator implementations for Javascript/Typescript. Some of them have some peculiar design choices. Is that where you found issue? I would certainly agree with that, but others aren't so bad. That doesn't really have anything to do with the browser, though. You'd have the same problem using protobufs under Node.js.
tcfhgj 10 hours ago [-]
Well, protobuf allows you to generate easy-to-use code for parsing defined data and service stubs for many languages, and it is one of the faster and less bandwidth-wasting options.
tuetuopay 9 hours ago [-]
Type safety. The contract is the law instead of a suggestion like JSON.
Having a way to describe your whole API and generate bindings is a godsend. Yes, it can be done with JSON and OpenApi, yet it’s not mandatory.
9rx 5 hours ago [-]
> Yes, it can be done with JSON and OpenApi, yet it’s not mandatory.
It is not mandatory for Protobuf either. You can construct a protobuf message with an implied structure just as you can with JSON. It does not violate the spec.
Protobuf ultimately gets the nod because it has better tooling (which isn't to be taken as praise towards Protobuf's tooling, but OpenAPI is worse).
vouwfietsman 8 hours ago [-]
Besides the other comments already here about code gen & contracts, a bigger one for me to step away from json/xml is binary serialization.
It sounds weird, and it's totally dependent on your use case, but binary serialization can make a giant difference.
For me, I work with 3D data which is primarily (but not only) tightly packed arrays of floats & ints. I have a bunch of options available:
1. JSON/XML, readable, easy to work with, relatively bulky (but not as bad as people think if you compress) but no random access, and slow floating point parsing, great extensibility.
2. JSON/XML + base64, OK to work with, quite bulky, no random access, faster parsing, but no structure, extensible.
3. Manual binary serialization: hard to work with, OK size (esp compressed), random access if you put in the effort, optimal parsing, not extensible unless you put in a lot of effort.
4. Flatbuffers/protobuf/capn-proto/etc: easy to work with, great size (esp compressed), random access, close-to-optimal parsing, extensible.
Basically if you care about performance, you would really like to just have control of the binary layout of your data, but you generally don't want to design extensibility and random access yourself, so you end up sacrificing explicit layout (and so some performance) by choosing a convenient lib.
We are a very regularly sized company, but our 3D data spans hundreds of terabytes.
(also, no, there is no general purpose 3D format available to do this work, gltf and friends are great but have a small range of usecases)
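As a concrete illustration of option 3 above, here is a minimal sketch (my own, not the poster's code) of packing a tightly packed array of f32 positions with an explicit little-endian layout, using only the standard library:

    // Pack [x, y, z] positions into a flat little-endian byte buffer (12 bytes each).
    fn pack_positions(positions: &[[f32; 3]]) -> Vec<u8> {
        let mut out = Vec::with_capacity(positions.len() * 12);
        for p in positions {
            for coord in p {
                out.extend_from_slice(&coord.to_le_bytes());
            }
        }
        out
    }

    // Random access is cheap because every record has a fixed size.
    fn position_at(bytes: &[u8], index: usize) -> [f32; 3] {
        let rec = &bytes[index * 12..index * 12 + 12];
        [
            f32::from_le_bytes(rec[0..4].try_into().unwrap()),
            f32::from_le_bytes(rec[4..8].try_into().unwrap()),
            f32::from_le_bytes(rec[8..12].try_into().unwrap()),
        ]
    }

    fn main() {
        let buf = pack_positions(&[[0.0, 1.0, 2.0], [3.0, 4.0, 5.0]]);
        assert_eq!(position_at(&buf, 1), [3.0, 4.0, 5.0]);
    }

Extensibility and versioning are exactly the parts this leaves out, which is the tradeoff described above.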
t-writescode 3 hours ago [-]
This use case totally makes sense of course. I’m thinking about why people use Protobuf for their string, uuid and int powered CRUD app.
tucnak 3 hours ago [-]
You're making assumptions about what kind of software people write. For a Hacker News degenerate, everything in the world revolves around bean-counting B2B SaaS CRUD crap, but it doesn't mean it's all there is to the world, right? You would be shocked how much networked computer software (not everything is a website) exists that is NOT a CRUD "app."
t-writescode 2 hours ago [-]
Woah buddy, no need for the hostility there.
Statistically, a lot of people who post on HN and cling to new or advanced tech *do* just write CRUD apps with a little special sauce, it’s part of what makes vibe coding and many of the frameworks we use so appealing.
I’m not ignoring that other things exist and are even very common; and I was agreeing with the person that’s a useful case.
I’ve also worked for various companies where protobuf has been suggested as a way to solve a political/organizational issue, not a code or platform issue.
physicsguy 8 hours ago [-]
This was the norm many years ago. I worked on a simulation software which existed long before Protobuf was even a gleam in its author's eye. The whole thing was on a server architecture with a Java (later ported to Qt) GUI and a C++ core. The solver periodically sent data in a custom binary format over TCP for vector fields and things.
bluGill 10 hours ago [-]
In most languages protobuf is easier because it generates the boilerplate. And protobuf is cross-language, so even if you are working in JavaScript, where JSON is native, protobuf is still faster because the other side can be whatever and you are not spending their time parsing.
t-writescode 3 hours ago [-]
In most languages I’ve worked in, there is no boiler plate for json either, and barely any for XML. You make a data class of some sort and it “just works”.
Not having that functionality is a weakness of a language or its support tools at this point, to me.
Chiron1991 9 hours ago [-]
It's not just about traffic. IoT devices (or any other low-powered devices for that matter) also like protobuf because of its comparatively high efficiency.
pjmlp 8 hours ago [-]
I never used it, coding since 1986.
jonathanstrange 9 hours ago [-]
Protobuf is fantastic because it separates the definition from the language. When you make changes, you recompile your definitions to native code and you can be sure it will stay compatible with other languages and implementations.
speed_spread 8 hours ago [-]
You mean like WSDL, OpenAPI and every other schema definition format?
Well I agree. Contract-first is great. You provide your clients with the specs and let them generate their own bindings. And as a client they're great too because I can also easily generate a mock server implementation that I can use in tests.
0x457 3 hours ago [-]
Now and then I find a wild place people shove protobuf into. It's like zero consideration was given sometimes beyond "multiple languages from the same IDL", as if it were some magical zero-overhead abstraction over bytes on a wire.
lowdownbutter 10 hours ago [-]
Don't read clickbaity headlines and scan hacker news five times faster.
chuckadams 7 hours ago [-]
Become a 5X Hacker News reader with this One Weird Trick.
chuckhend 4 hours ago [-]
Great work Lev!
levkk 4 hours ago [-]
Thank you!
linuxftw 8 hours ago [-]
Many people are exclaiming that the title is baity, but I disagree. It seems like a perfectly fine title in the context of this blog, which is about a specific product. It's unlikely they wrote the blog with a HN submission in mind. They're not a news publication, either.
spwa4 8 hours ago [-]
You should be terrified of the instability you're introducing to achieve this. Memory sharing between processes is very difficult to keep stable, it is half the reason kernels exist.
levkk 4 hours ago [-]
I was terrified until it worked. The Postgres "ABI" is relatively stable - the parser only really changes between major versions and we bake the whole code into the same executable - largely thanks to the work done by the team behind pg_query!
The output is machine-verifiable, which makes this uniquely possible in today's vibe-coded world!
sylware 9 hours ago [-]
I don't understand. I used protobuf for map data, and it is a hardcore simple format; that is the whole purpose of it.
I wrote assembly, memory mapping oriented protobuf software... in assembly, then what? I am allowed to say I am going 1000 times faster than rust now???
IshKebab 11 hours ago [-]
I vaguely recall that there's a Rust macro to automatically convert recursive functions to iterative.
But I would just increase the stack size limit if it ever becomes a problem. As far as I know the only reason it is so small is because of address space exhaustion which only affects 32-bit systems.
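For what it's worth, the usual way to do that without touching process-wide limits is to run the recursive work on a thread with an explicit stack size. A minimal sketch:

    use std::thread;

    fn depth(n: u64) -> u64 {
        if n == 0 { 0 } else { 1 + depth(n - 1) }
    }

    fn main() {
        // Recursion deep enough to be risky on a default stack in a debug build.
        let handle = thread::Builder::new()
            .stack_size(64 * 1024 * 1024) // 64 MiB
            .spawn(|| depth(200_000))
            .unwrap();
        println!("{}", handle.join().unwrap());
    }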
jeroenhd 10 hours ago [-]
Explicit tail call optimization is in the works but I don't think it's available in stable just yet.
The `become` keyword has already been reserved and work continues to happen (https://github.com/rust-lang/rust/issues/112788). If you enable #![feature(explicit_tail_calls)] you can already use the feature in the nightly compiler: https://play.rust-lang.org/?version=nightly&mode=debug&editi...
(Note that enabling release mode on that link will have the compiler pre-calculate the result, so you need to put it in debug mode if you want to see the assembly this generates.)
embedding-shape 11 hours ago [-]
> I vaguely recall that there's a Rust macro to automatically convert recursive functions to iterative.
Isn't that just TCO or similar? Usually a part of the compiler/core of the language itself, AFAIK.
koverstreet 10 hours ago [-]
I haven't been following become/TCO in Rust - but what I've usually seen is TCO getting flipped off because it interferes with backtraces and debugging.
So I think there's value in providing it as an explicit opt-in; that way when you're reading the code, you know to account for it when you're looking at backtraces.
Additionally, if you're relying on TCO it might be a major bug if the compiler isn't able to apply it - and optimizations that aren't applied are normally invisible. This might mean you could get an error if you're expecting TCO and you or the compiler screwed something up.
tialaramex 9 hours ago [-]
In a language like Rust where local variables are explicitly destroyed when scope ends a naive TCO is very annoying and `become` also helps fix that.
Suppose I have a recursive function f(n: u8) where f(0) is 0 and otherwise f(n) is n * bar(n) + f(n-1)
I might well write that with a local temporary to calculate bar(n) and then we do the sum, but this would inhibit TCO because that temporary should exist after we did the recursive calculation, even though it doesn't matter in practice.
A compiler could try to cleverly figure out whether it matters and destroy that local temporary earlier then apply TCO, but now your TCO is fragile because a seemingly minor code change might fool that "clever" logic, by ensuring it isn't correct to make this change and breaking your optimisation.
The `become` keyword is a claim by the programmer that we can drop all these locals and do TCO. So because the programmer claimed this should work they're giving the compiler permission to attempt the early drop and if it doesn't work and can't be TCO then complain that the program is wrong.
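For the curious, here is roughly what an accumulator-style version of that f(n) could look like with the unstable feature enabled. This is a sketch only: it needs a nightly compiler, bar() is a stand-in for whatever that helper computes, and the exact syntax may still change while the feature is unstable.

    #![feature(explicit_tail_calls)] // nightly-only, unstable

    fn bar(n: u64) -> u64 {
        n + 1 // placeholder for the comment's bar()
    }

    // Accumulator form: the recursive call is the very last thing the function does,
    // and the local temporary is dead before it, so the stack frame can be reused.
    fn f(n: u64, acc: u64) -> u64 {
        if n == 0 {
            return acc;
        }
        let term = n * bar(n);
        become f(n - 1, acc + term);
    }

    fn main() {
        println!("{}", f(10, 0));
    }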
unnouinceput 6 hours ago [-]
Quote: "We forked pg_query.rs and replaced Protobuf with direct C-to-Rust (and back to C) bindings, ...."
So it's C actually, not Rust. But Hey! we used Rust somewhere, so let's post it on HN and farm internet points.
steeve 10 hours ago [-]
tldr: they replaced using protobuf as the type system across language boundaries for FFI with true FFI
ahartmetz 8 hours ago [-]
Title is as nonsensical as "We replaced Windows with ARM CPUs"
Terretta 3 hours ago [-]
We replaced the periodic table with elements for five times the reaction.
Xunjin 10 hours ago [-]
I loved it; every clickbait title should come with a tldr just like this one.
xxs 8 hours ago [-]
If I see an order-of-magnitude difference and a language involved in the title, it's something I refuse to read (unless it's an obvious choice - interpreted vs compiled/JIT).