Show HN: A bi-directional, persisted KV store that is faster than Redis (hpkv.io)

79 points by mehrant 98 days ago | 75 comments

dangoodmanUT 98 days ago [-]

What disks give 600ns persistence _with fsync/fdatasync_? Never heard of anything under 50us p50.

mehrant 98 days ago [-]

the 600ns figure represents our optimized write path and not a full fsync operation. we achieve it -among other things- through:

1- as mentioned, we are not using any traditional filesystem and we're bypassing several VFS layers.

2- free space management is a combination of two RB trees, providing O(log n) for slice and O(log n + k) - k being the number of adjacent free spaces for merge.

3- majority of the write path employs a lock free design and where needed we're using per cpu write buffers

the transactional guarantees we provide are via:

1- atomic individual operations with retries

2- various conflict resolution strategies (timestamp, etc.)

3- durability through controlled persistence cycles with configurable commit intervals

depending on the plan, we provide a persistence guarantee of between 30 seconds and 5 minutes
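To make point 2 above concrete, here is a minimal sketch of free-space management that merges adjacent free extents when space is returned. It is not HPKV's implementation: a sorted Python list stands in for the RB trees mentioned in the comment, allocation is a simple first-fit scan rather than an O(log n) lookup, and the names `FreeSpaceMap`, `free`, and `allocate` are hypothetical.

```python
import bisect


class FreeSpaceMap:
    """Free extents sorted by offset. A sorted list stands in for a balanced tree (assumption)."""

    def __init__(self):
        self.offsets = []   # sorted start offsets of free extents
        self.lengths = {}   # start offset -> extent length

    def free(self, offset, length):
        """Return an extent to the map, merging it with adjacent free extents."""
        i = bisect.bisect_left(self.offsets, offset)
        # Merge with the left neighbour if it ends exactly where this extent starts.
        if i > 0:
            left = self.offsets[i - 1]
            if left + self.lengths[left] == offset:
                offset, length = left, self.lengths.pop(left) + length
                del self.offsets[i - 1]
                i -= 1
        # Merge with the right neighbour if it starts exactly where this extent ends.
        if i < len(self.offsets) and self.offsets[i] == offset + length:
            right = self.offsets.pop(i)
            length += self.lengths.pop(right)
        self.offsets.insert(i, offset)
        self.lengths[offset] = length

    def allocate(self, length):
        """First fit: slice `length` units off the first extent that is large enough."""
        for i, start in enumerate(self.offsets):
            size = self.lengths[start]
            if size >= length:
                del self.offsets[i]
                del self.lengths[start]
                if size > length:
                    # Keep the unused tail of the extent as free space.
                    self.offsets.insert(i, start + length)
                    self.lengths[start + length] = size - length
                return start
        return None  # no extent is large enough


fm = FreeSpaceMap()
fm.free(0, 100)
fm.free(200, 50)
fm.free(100, 100)          # bridges the two extents above into one
first = fm.allocate(30)
```

A second index keyed by extent size would be needed to make allocation logarithmic; that is presumably the role of the second tree, but the thread does not say.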

dangoodmanUT 98 days ago [-]

I didn't necessarily mean exactly fsync. I guess I'll ask: Is it actually flushed to persistent disk in 600ns such that if the node crashes, the data can always be read again? Or does that not fully flush?

mehrant 98 days ago [-]

yes, in that case data can potentially be lost: up to 30 sec in the worst-case scenario without HA.
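A small sketch of what that loss window means in practice. It assumes commits happen instantaneously at fixed multiples of the commit interval, which is a simplification; the function name `lost_writes` and the sample timestamps are made up for illustration.

```python
def lost_writes(write_times_s, commit_interval_s, crash_time_s):
    """Return the acknowledged writes that were not yet committed when the node crashed.

    Assumption: a commit completes instantly at every multiple of commit_interval_s.
    """
    last_commit = (crash_time_s // commit_interval_s) * commit_interval_s
    return [t for t in write_times_s if last_commit < t <= crash_time_s]


# Writes acknowledged at these times (seconds), 30 s commit interval, crash at t = 59.5 s.
lost = lost_writes([1, 29, 31, 59], 30, 59.5)
print(lost)
```

Every write in the returned list was reported as done to the client but would not survive the crash.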

dangoodmanUT 98 days ago [-]

So it's not actually persistence then.

That's extremely deceptive, and (IANAL) I think false advertisement. I'd clarify it.

That's also not HA, that's durability. Concerning.

pclmulqdq 97 days ago [-]

I have a product to sell you with a postgres interface but p99 write latency of 100 nanoseconds. It's postgres but our driver says "write done" before a write completes. It's revolutionary!

gkbrk 98 days ago [-]

There's a tiny 50,000,000x difference between the now admitted 30 seconds and the previously claimed 600 nanoseconds.
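The ratio is easy to check with integer arithmetic (both figures are taken from the comments above):

```python
claimed_write_ns = 600                    # claimed p50 write latency
loss_window_ns = 30 * 1_000_000_000       # 30-second worst-case durability window

ratio = loss_window_ns // claimed_write_ns
print(f"{ratio:,}x")
```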

dangoodmanUT 98 days ago [-]

And hold on, 600ns can't possibly be right...

A memory copy plus updating whatever internal memory structures you have is definitely going to be over 1us. Even a non-fsync NVMe write is still >=1us, so this is grossly misleading.

mehrant 98 days ago [-]

our p50 is indeed 600ns for write, the way I explained it. I understand that at this point, this can be read as a "trust me bro" kind of statement, but I can offer you something. we can have a quick call and I provide you access to a temp server with HPKV installed on it, with access to our test suite and you'll have a chance to run your own tests.

this can be a good learning opportunity for both of us (potentially more for us) :)

if you're interested, please send us an email to support@hpkv.io and we can arrange that

mehrant 98 days ago [-]

for the time being, have a look at this please: http://hpkv.io/videos/performance_local.webm

this is 1M records, 3M operations on a single node, single thread, recorded in real time (1x).

I understand that without access to the source of the test program it's hard to trust, but we can arrange that if you decide to take that call :)

pclmulqdq 97 days ago [-]

The question from most of us isn't "did you get that number," it's "what does that number actually mean?" Writes don't need to return any data, so you can sort of set that latency number arbitrarily by changing the meaning of "write done." I can make "redis with 0 write latency" by returning a "write done" immediately after the packet lands, but then the meaning of "write done" is effectively nil.

In every persistent database, that number indicates that an entry was written to a persistent write-ahead log and that the written value will stay around if the machine crashes immediately after the write. Clearly you don't do this because it's impossible to do in 600 ns. For most of the non-persistent databases (eg redis, memcached), write latency is about how long it takes for something to enter the main data structure and become globally readable. Usually, "write done" also means that the key is globally readable with no extra performance cost (ie it was not just dumped into a write-ahead log in memory and then returned).
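The difference between the two meanings of "write done" can be measured directly. The sketch below appends 100-byte records to a temporary file and reports the median time either before or after `os.fsync` returns. The helper names are hypothetical and the absolute numbers depend entirely on the machine and filesystem, so no expected output is given.

```python
import os
import tempfile
import time


def timed_append(fd, payload, durable):
    """Append payload and return elapsed nanoseconds until "write done" is reported.

    durable=True waits for os.fsync; durable=False returns once the kernel has the bytes.
    """
    start = time.perf_counter_ns()
    os.write(fd, payload)
    if durable:
        os.fsync(fd)
    return time.perf_counter_ns() - start


def median_latency_ns(durable, n=200):
    """Median append latency over n writes to a fresh temporary file."""
    fd, path = tempfile.mkstemp()
    try:
        samples = sorted(timed_append(fd, b"x" * 100, durable) for _ in range(n))
        return samples[n // 2]
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    print("ack before fsync:", median_latency_ns(False), "ns")
    print("ack after fsync: ", median_latency_ns(True), "ns")
```

Only the second number says anything about what survives a crash immediately after the acknowledgement.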

In a world where you spoke about the product more credibly or where the code was open-source, I might accept that this was the case. As it stands, it looks like:

1. This was your "marketing gimmick" number that you are trying to sell (every database that isn't postgres has one).

2. You got it primarily by compromising on the meaning of "write done," and not on the basis of good engineering.

mehrant 97 days ago [-]

Thank you for your thoughtful critique.

To clarify what our numbers actually mean and address your main question of "what does that number actually mean":

1- The 600ns figure represents precisely what you described - an in-memory "write done" where memory structures are updated and the data becomes globally readable to all processes. This is indeed comparable to what Redis without persistence or memcached provides. Even at this comparable measurement basis (which isn't our marketing gimmick, but the same standard used by in-memory stores), we're still 2-6x faster than Redis depending on access patterns.

For full persistence guarantees, our mean latency increases to 2582ns per record (600ns in-memory operation + 1982ns disk commit) for our benchmark scenario with 1M records and 100-byte values. This represents the complete durability cycle. This needs to be compared with for example Redis with AOF enabled.

2- I agree that the meaning of "write done" requires clear context. We've been focusing on the in-memory performance advantages in our communications without adequately distinguishing between in-memory and persistence guarantees.

We weren't trying to hide the disk persistence number; we simply used "write done" because in our comparison we compared with Redis without persistence. But mentioning persistence alongside it understandably caused confusion. That was a mistake on our part.

Based on your feedback, we'll update our documentation to provide more precise metrics that clearly separate these operational phases and their respective guarantees.

UPDATE:

clarification on mean disk write measurement:

the mean value is calculated from the total time of flushing the whole write buffer (parallel processing depending on the number of available cpu cores) divided by the number of records. so the total time for processing and writing 1M records as described above was 1982ms which makes the mean write time for each record 1982ns.

pclmulqdq 97 days ago [-]

> For full persistence guarantees, our mean latency increases to 2582ns per record (600ns in-memory operation + 1982ns disk commit)

By the way, this set of numbers also makes you look stupid, and you should consider redoing those measurements. No disk out there has less than 10 microseconds of write latency, and the ones in the cloud are closer to 50 us. Citing 2 micros here makes your 600 ns number also look 10x too optimistic.

I would suggest taking this whole thread as less of an opportunity to do marketing "damage control" and more of an opportunity to get honest feedback about your engineering and measurement practices. From the outside, they don't look good.

pclmulqdq 97 days ago [-]

I also see the update in response to this comment, and it puts everything into perspective. You haven't changed the meaning of "write done," you have just been comparing your reciprocal throughput against Redis's latency, and I think you have been confusing those two.

"600 ns" then really means "1.6M QPS of throughput," which is a good number but is well within the capabilities of many similar offerings (including several databases that are truly persistent). It also says nothing about your latency. If you want to say you are 2-6x faster than Redis, you are going to have to compare that number to Redis's throughput.
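The arithmetic behind this distinction, using only figures quoted in the thread (the helper names are hypothetical):

```python
def reciprocal_throughput_ns(total_time_ns, operations):
    """Average time per operation: total elapsed time divided by operation count."""
    return total_time_ns / operations


def ops_per_second(ns_per_op):
    """Throughput implied by a given time-per-operation figure."""
    return 1e9 / ns_per_op


flush_total_ns = 1982 * 1_000_000   # 1982 ms to flush the whole write buffer
records = 1_000_000

per_record_ns = reciprocal_throughput_ns(flush_total_ns, records)
implied_qps = ops_per_second(600)

print(per_record_ns)       # time per record averaged over the batch
print(round(implied_qps))  # operations per second implied by "600 ns"
```

The 1982 ns figure is an average over the batch. It does not say how long any single record waited to reach disk, which can be as long as the full 1982 ms flush.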

mehrant 94 days ago [-]

Reading your comment about comparing the throughput to Redis, it seems to me that you haven't read the benchmark article really. In there, we're in fact comparing the "throughput" and not the latency. allow me to quote some of the throughput numbers from the article mentioned above:

Single Operation Performance

Redis Single Operations

SET: 273,672.69 requests per second (p50=0.095 ms)

GET: 278,164.12 requests per second (p50=0.095 ms)

HPKV Single Operations

INSERT: 1,082,578.12 operations per second

GET: 1,728,939.43 operations per second

DELETE: 935,846.09 operations per second

Batch Operation Performance

Redis Batch Operations

SET: 2,439,024.50 requests per second (p50=0.263 ms)

GET: 2,932,551.50 requests per second (p50=0.223 ms)

HPKV Batch Operations

INSERT: 6,125,538.03 operations per second

GET: 8,273,300.27 operations per second

DELETE: 5,705,816.00 operations per second

The latency of 600ns as I mentioned is a local vectored interface call and not over the network. this is not how we compared the system with Redis. the above numbers are using our RIOC API over the network, in which HPKV behaves like a server similar to a Redis server.

The numbers above are compared with Redis in-memory and HPKV is still 2-6x faster. even if you assume HPKV as just an in-memory KV store with no persistence.
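For reference, the ratios implied by the quoted figures can be computed directly. Pairing HPKV INSERT with Redis SET is an assumption made here for the comparison; DELETE has no Redis counterpart in the quoted numbers and is left out.

```python
# Operations per second, copied from the figures quoted above.
redis = {
    ("single", "write"): 273_672.69,    # SET
    ("single", "read"): 278_164.12,     # GET
    ("batch", "write"): 2_439_024.50,   # SET
    ("batch", "read"): 2_932_551.50,    # GET
}
hpkv = {
    ("single", "write"): 1_082_578.12,  # INSERT
    ("single", "read"): 1_728_939.43,   # GET
    ("batch", "write"): 6_125_538.03,   # INSERT
    ("batch", "read"): 8_273_300.27,    # GET
}

ratios = {key: hpkv[key] / redis[key] for key in redis}

for (mode, op), ratio in ratios.items():
    print(f"{mode:6s} {op:5s} {ratio:.2f}x")
```

The single-operation ratios come out at roughly 4x and 6x and the batch ratios at roughly 2.5x and 2.8x, which is where the "2-6x" range comes from.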

pclmulqdq 97 days ago [-]

Wait, "depending on the plan"?

You're already monetizing your non-persistent non-database?

alex_smart 98 days ago [-]

I don’t get it. How could you be fsyncing the WAL in 600ns? What are the transactional guarantees that you are offering?

mehrant 98 days ago [-]

that's a great question. the 600ns figure represents our optimized write path and not a full fsync operation. we achieve it -among other things- through:

1- as mentioned, we are not using any traditional filesystem and we're bypassing several VFS layers.

2- free space management is a combination of two RB trees, providing O(log n) for slice and O(log n + k) - k being the number of adjacent free spaces for merge.

3- majority of the write path employs a lock free design and where needed we're using per cpu write buffers

the transactional guarantees we provide are via:

1- atomic individual operations with retries

2- various conflict resolution strategies (timestamp, etc.)

3- durability through controlled persistence cycles with configurable commit intervals

depending on the plan, we provide a persistence guarantee of between 30 seconds and 5 minutes

buenzlikoder 98 days ago [-]

What storage backend are you using?

A write operation on an SSD takes tens of microseconds, even without any VFS layers

mehrant 98 days ago [-]

sorry for not being clear again. by saying this number does not represent a full fsync operation, I meant it doesn't include the SSD write time. this is the time to update the KV's internal memory structures + adding to the write buffers.

this is fair because we provide transactional guarantee and immediate consistency, regardless of the state of the append-only write buffer entry. during that speed, for a given key, the value might change and a new write buffer entry might be added for the said key before the write buffer had the chance to complete (as you mentioned the actual write on disk is slower) but the conflict resolution still ensures the write of the last valid entry and skips the rest. before this operation HPKV is acting like an in-memory KV store.
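One way to read that description is as a last-write-wins buffer: reads are served from memory immediately, and at flush time only the newest pending entry per key is written. The sketch below illustrates that reading only. It is single-threaded, the class and method names are hypothetical, and a sequence number stands in for whatever timestamp-based conflict resolution HPKV actually uses.

```python
class CoalescingWriteBuffer:
    """In-memory table plus an append-only buffer of pending writes (illustrative only)."""

    def __init__(self):
        self.memtable = {}   # key -> value, readable as soon as put() returns
        self.pending = []    # append-only list of (seq, key, value)
        self.seq = 0

    def put(self, key, value):
        """Update memory and queue the write; nothing is on disk yet."""
        self.seq += 1
        self.memtable[key] = value
        self.pending.append((self.seq, key, value))

    def get(self, key):
        return self.memtable.get(key)

    def flush(self, persist):
        """Persist only the newest pending entry per key; return how many entries were skipped."""
        latest = {}
        for seq, key, value in self.pending:
            latest[key] = (seq, value)   # entries are in seq order, so later ones overwrite
        for key, (_seq, value) in latest.items():
            persist(key, value)
        skipped = len(self.pending) - len(latest)
        self.pending.clear()
        return skipped


buf = CoalescingWriteBuffer()
buf.put("a", 1)
buf.put("a", 2)
buf.put("b", 3)

disk = {}
skipped = buf.flush(disk.__setitem__)
```

Anything still in `pending` when the process dies is gone, which is the gap the rest of the thread is arguing about.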

addaon 97 days ago [-]

You’re getting a lot of crap (rightly) for your lack of clarity and fuzzy language use on this point…

But that also points out the demand for the seemingly-unachievable promises you’re making. I wonder if it’s worth stirring up some out-of-production DIMM-connected Optane and using that as a basis for a truly fast-persisted append-only log. If that gives you the ability to achieve something that’s really in demand, you can go from there to a production basis, even if it’s just a stack of MRAM on a PCI-e card or something until the tech (re-) arises.

UltraSane 96 days ago [-]

you can just use NVDIMMs, which are generally 8, 16, or 32GB DIMM modules that have enough flash and backup power to copy all data to the flash storage if power is lost on the host.

https://www.micron.com/content/dam/micron/global/public/prod...

mrbluecoat 98 days ago [-]

Something must be in the water.. this is the third similar tool in three days on HN

https://news.ycombinator.com/item?id=43379262

https://news.ycombinator.com/item?id=43371097

linotype 97 days ago [-]

Yeah, Redis fucked around and found out.

https://redis.io/blog/redis-adopts-dual-source-available-lic...

rendaw 97 days ago [-]

That was a year ago.

conception 97 days ago [-]

I guess we know how long it takes to make a redis clone.

linotype 97 days ago [-]

Correct.

LarsenCC 96 days ago [-]

lol, I guess so haha

Snawoot 98 days ago [-]

> 2-6x faster than Redis (benchmark link below) yet disk persisted!

That's a false contradistinction: Redis is also disk persisted.

The benchmark you did mentions the Redis benchmarking guide, and this guide has the following paragraph:

> Redis is, mostly, a single-threaded server from the POV of commands execution (actually modern versions of Redis use threads for different things). It is not designed to benefit from multiple CPU cores. People are supposed to launch several Redis instances to scale out on several cores if needed. It is not really fair to compare one single Redis instance to a multi-threaded data store.

Did you just benchmark against only a single Redis instance and claim a performance win? Even if so, how do the benchmarks compare against the source-available competitor DragonflyDB?

Finally, the documentation doesn't mention how exactly persistence works. What durability guarantees should we expect?

mehrant 98 days ago [-]

thanks for taking the time to write feedback :)

> That's a false contradistinction: Redis is also disk persisted.

The performance gain mentioned was vs. Redis in memory. so we weren't claiming that Redis can't be persisted (which of course it can), but we were saying that Redis without persistence (which performs faster than with persistence) was still this much slower than HPKV with persistence. But you're correct that we probably should have been more clear in explaining this :)

>Did you just benchmark against only a single Redis instance and claim a performance win?

Single node of Redis vs. single node of HPKV. so it's an apples to apples comparison

>Even if so, how do benchmarks compare against source-available competitor DragonflyDB?

Benchmark with DragonFly coming soon :)

sorry about the lack of that information in the documentation, we'll update that. for now, the durability guarantee on Pro is 30 seconds. on Business with HA it is 5 minutes.

Xelynega 96 days ago [-]

They asked about instances and you responded with nodes.

From the redis comment it sounds like the way to scale a redis node is to increase the size and run multiple instances in parallel.

Saying it's "apples to apples" would be like setting a competitor's thread limit to 1, then saying it's a fair benchmark.

edoceo 98 days ago [-]

Benchmark to KeyVal too please

mehrant 98 days ago [-]

sure! :)

ForTheKidz 98 days ago [-]

> That's a false contradistinction: Redis is also disk persisted.

This feels wildly disingenuous.

bjornsing 98 days ago [-]

Interesting. I did some work on a related but different product idea (https://www.haystackdb.dev/) a few years back. Gave up though as it seemed hard to get traction / find customers. What’s your thinking on that? How are you going to reach your initial customers?

Would love to have a chat about possible collaboration or if I could help out in some way. Nice to see foundational tech coming out of the EU!

mehrant 98 days ago [-]

thank you :) it would be interesting to have a chat for sure. would you mind dropping a note to the email address I mentioned in the OP? I'll reach out to you.

edoceo 98 days ago [-]

How will it be faster than my Redis or KeyVal which is very close if your servers are far away? Network time matters here, right?

mehrant 98 days ago [-]

of course. speeds down to 15us can be achieved over the network using our custom protocol within the same region. for sub-microsecond latency, you need to have HPKV running on the same machine as yours :)

avinassh 97 days ago [-]

If it is based on some research papers, could you link them please

mehrant 97 days ago [-]

One thing we'd like to know your opinion on, is our key monitoring via WebSocket (pub-sub) feature. You can read more about it in our documentation under WebSocket.

Is it something that you think is useful and might have a use case for, or can't you see any value in it? In other words, is it something that might make you consider using HPKV?

kshmir 98 days ago [-]

Why pay what you're asking instead of using dragonfly or something like that and just putting a beefier node?

ehsanaslani 98 days ago [-]

Well that's a technical choice depending on the context, but I can list some of the advantages of HPKV:

-Persistent by default without any performance penalties

-The pub/sub feature which is unique to HPKV and allows for a bi-directional websocket connection from clients to database

-Lower cost as we need less expensive infrastructure to provide the same service

-Simple API to use

quibono 98 days ago [-]

Is this 2-6x faster because of multi threading/core? Or is this actually 2-6x faster on a single core machine?

mehrant 98 days ago [-]

the test was done on a single node and a single thread. on multi thread and batch operations, HPKV was still faster on the same machine

alexpadula 96 days ago [-]

Why no open source :<

cess11 98 days ago [-]

Does it have ACID guarantees?

mehrant 98 days ago [-]

We provide some elements of ACID guarantees, but not full ACID compliance as traditionally defined in database systems:

Atomicity: Yes, for individual operations. Each key-value operation is atomic (it either completes fully or not at all).

Consistency: Partial. We ensure data validity through our conflict resolution strategies, but we don't support multi-key constraints or referential integrity.

Isolation: Limited. Operations on individual keys are isolated, but we don't provide transaction isolation levels across multiple keys.

Durability: Yes. Our persistence model allows for tunable durability guarantees with corresponding performance trade-offs.

So while we provide strong guarantees for individual operations, HPKV is not a full ACID-compliant database system. We've optimized for high-performance key-value operations with practical durability assurances rather than complete ACID semantics.

gcbirzan 98 days ago [-]

> Consistency: Partial. We ensure data validity through our conflict resolution strategies, but we don't support multi-key constraints or referential integrity.

That's not what consistency means in ACID.

> Durability: Yes. Our persistence model allows for tunable durability guarantees with corresponding performance trade-offs.

> ~600ns p50 for writes with disk persistence

I'm pretty sure there's no durability there. That statement is pretty disingenuous in itself, but it'd be nice to see a number for durability (which, granted, is not something you advertise the product for).

My main concern is that all these speed benefits are going to be eclipsed by the 0.5ms of network latency.
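A back-of-the-envelope check of that concern. It assumes a 0.5 ms network round trip, 600 ns of server-side work for the fast store, and six times that for the slow one (the upper end of the claimed "2-6x"); all three are inputs taken from the thread, not measurements.

```python
network_ns = 500_000          # 0.5 ms of network latency
fast_server_ns = 600          # claimed in-memory write time
slow_server_ns = 6 * 600      # a store that is 6x slower on the server side

total_fast = network_ns + fast_server_ns
total_slow = network_ns + slow_server_ns

end_to_end_speedup = total_slow / total_fast
server_share = fast_server_ns / total_fast

print(f"end-to-end speedup: {end_to_end_speedup:.4f}x")
print(f"server share of total latency: {server_share:.2%}")
```

Under these assumptions a 6x server-side advantage shrinks to well under 1% end to end once a single request has to cross the network.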

cess11 97 days ago [-]

OK, thanks. Those tradeoffs aren't suitable for my purposes.

tobyhinloopen 98 days ago [-]

Not open source, not interested.

It looks neat though, but I won't burn myself on anything closed source if there's open source and/or self-hosted alternatives.

Aurornis 98 days ago [-]

The comparison to Redis felt misleading when I realized it was a commercial product.

I'm not sure I like this recent trend of people registering new accounts on HN to use "Show HN" to pitch commercial products.

Show HN is fun to see projects people have been working on and want to share for the good of the community. It's less fun when it's being used as a simple advertising mechanism to drive purchases.

gkbrk 98 days ago [-]

It's a suitable comparison I think. Redis is also not open-source.

https://github.com/redis/redis/blob/unstable/LICENSE.txt

old_bayes 98 days ago [-]

As of 12 months ago, sure, but before that Redis was open source for 15 years.

pvg 98 days ago [-]

Show HN is not limited to open source projects.

Aurornis 98 days ago [-]

I never said it was. Seeing green accounts (just registered) doing nothing other than promote products doesn't feel consistent with the spirit of Show HN.

pvg 98 days ago [-]

There's nothing wrong with green accounts doing Show HNs, really.

h1fra 98 days ago [-]

agree, especially on something as common as Redis. Being locked in is not worth the better perf on something already super-fast.

But then I wonder what the business model is here? Even without being open-source, I'm constantly asking myself who pays for DBs that are not in a major cloud

tschellenbach 98 days ago [-]

Doesn't always help. Price hikes on CockroachDB for instance have been crazy.

mehrant 98 days ago [-]

thanks for taking time and commenting :) we'd still be happy if you decided to use it and give us your thoughts. as I mentioned in one of the comments below, we're hoping to go open source in future :)

verdverm 98 days ago [-]

> we're hoping to go open source in future :)

That's just lip service, either you intend to or you don't, it's not up to hope

We hope you will see the dev tooling space is based around open source and you will alienate many potential users by not being open source

tobyhinloopen 98 days ago [-]

Let me know when it's open source and I'd be happy to give it a try! It doesn't even have to be free - I'm fine with Unreal's model where you get access to the source even though commercial use requires a paid license.

I want something running locally on my machine that doesn't rely on calling home.

mehrant 98 days ago [-]

your concern is understandable. we'll be in touch :)

hdjjhhvvhga 98 days ago [-]

Give me one reason to use a closed-source Redis alternative rather than one of many open ones, starting with KeyDB. If I wanted a closed clone, I'd probably go with DragonflyDB (whose license is "feel free to run it in production unless you offer it as a managed redis service").

ranger_danger 98 days ago [-]

Does not appear to be open-source.

mehrant 98 days ago [-]

correct. however we're actually planning to make the system open source in future; we can't set an exact date as it depends on various factors, but hopefully not too far out. :)

notpushkin 98 days ago [-]

Will be looking forward to that!

However, it feels a bit weird: at this level of performance going SaaS only kinda defeats the purpose, no?

mehrant 98 days ago [-]

our approach is actually hybrid. on the other side of the performance coin, we have resource efficiency. that resource efficiency lets us provide a performant and low latency managed KV store at lower cost, so the economics of it make sense. the idea is that not everyone requires sub-microsecond latency, and for that group the value proposition is a low latency kv store which is feature rich with a novel bi-directional ws api. for people who need sub-microsecond latency, we're planning a custom setup that allows them to make a local vectored interface call to get the sub-microsecond speeds. in between, we have the business plan that provides a custom binary protocol, which is the one used in the benchmark :)

huhtenberg 98 days ago [-]

In that case you need to provide an SLA for your speed claims. Otherwise the claims are basically moot.

mehrant 98 days ago [-]

that's a fair point and you're correct. we will have the SLAs for latency documented and provided soon. in the meantime, please try it out and give us your feedback :)

huhtenberg 98 days ago [-]

The site is very snappy, which matches well your pitch.

However your principal selling point - the nanosecond-level speed - falls flat because it's a property important in self-hosted scenarios. Once you put your super speedy stuff behind a web-based API, that selling point becomes completely meaningless. The fact that once our data hits your servers it is handled really quickly doesn't mean much. I am sure you are perfectly aware of that.

That is, your pitch is disconnected from your actual offering. If you are selling speed, it needs to be a product, not a service. It doesn't need to be open source though; just look at something like kdb+.

mehrant 98 days ago [-]

thanks for the feedback :)

our main targets for the "performance" value proposition are companies and businesses which will set up HPKV either locally (Enterprise plan) for nanosecond performance or in the cloud provider of their choosing, working via the RIOC API (Business Plan) and getting the ~15 microsecond range over the network. however you're totally right, that doesn't really matter much if you're using it over REST or WebSocket. for the Pro tier, our value proposition is still the fastest managed KV store (you still get <80 ms for writes with a ~30ms ping to our servers) and features such as bi-directional WS, atomic operations and range scans on top of basic operations.

but given your comment, I think we should perhaps rethink how we're presenting the product. thanks for the feedback again :)

4m1rk 98 days ago [-]

What's the tech stack? If you can share.

theonlyvasudev 98 days ago [-]

Amazing!

mehrant 98 days ago [-]

thank you :)

CyberDildonics 97 days ago [-]

That person only has one comment but they also made a database 8 months ago. Crazy coincidence, you could get together and compare notes.
