Next.js App Router + React Server Components Demo

new
past
show
ask
show
jobs
submit

▲Matching Regexps 200 Times Faster (eregon.me)

28 points by todsacerdoti 98 days ago | 12 comments

broken_broken_ 97 days ago [-]

I love reading optimization stories so thank you!

So essentially the code got faster by switching to smarter regexp engine that can transform (hopefully) the regexp into a FSM. Cool, but what about replacing the regexp with straightforward parsing code written manually? That way it is statically known that this function is fast all of the time.

At least that’s what I would do in a natively compiled language with access to simd, not sure if ruby has access to it and can detect the best variant at runtime.

And I’d argue plain code is simpler to understand, debug, troubleshoot than a complex regexp.

The regexp engine part reminds me of writing scalar code, relying on the compiler doing autovectorization to make it fast. And what very often happens is that the new compiler version changed their heuristics and now there is no autovectorization that kicks in and the performance falls off a cliff . It’s a risky bet.

eregon 96 days ago [-]

Thanks for the comment.

> Cool, but what about replacing the regexp with straightforward parsing code written manually?

If you take a look at the linked snippets of C code, I think it's clear it's all but straightforward. The regexps OTOH are really short and expressive.

Ruby has no access to SIMD. And writing SIMD is basically writing inline assembly, so it's really tedious and messy.

> relying on the compiler doing autovectorization to make it fast.

I can relate to that, but Regexps are a much smaller domain, and there it's clear SIMD is always a win, so if the regexp engine uses SIMD it's very unlikely it would ever stop using it (unless something faster comes up).

SR2Z 93 days ago [-]

> Cool, but what about replacing the regexp with straightforward parsing code written manually?

It would not be faster - an FSM can trivially be created as a few jump tables which fit in cache. This is one of the rare cases where a compiler can be provably optimal.

Hell, a regex compiler could even transform the FSM to read a 4-character sequence at a time for SIMD if it made sense.

Validark 94 days ago [-]

47 bytes per nanosecond? That's 47 GB/s. My machine couldn't stream data into the CPU that fast. Though if you're exclusively operating within cache then I suppose that's possible.

Honestly, I've never really understood why people care so much about portable binaries unless they're running their OS off a thumb drive on multiple different computers. My computer doesn't change instruction sets. And if I did put a new CPU in my computer, I'd want to get new binaries compiled for a more powerful instruction set.

wtallis 93 days ago [-]

That's not really an extreme amount of bandwidth. DDR4-2933 was mainstream five years ago, and at that speed the two channels consumer CPUs are equipped with adds up to 47 GB/s.

Validark 93 days ago [-]

Sure, if you are streaming data to ALL cores simultaneously you can get that kind of number. But on a single-thread, I don't think so.

rurban 95 days ago [-]

Using a pcre2 gem would be even faster. Like about 1000x

Validark 94 days ago [-]

Well, the article reports a speed of 47 GB/s. There is no way to be 1000x faster than that.

eregon 93 days ago [-]

No it wouldn't.

JonChesterfield 94 days ago [-]

No backreference support in the state machine, disappointing https://github.com/oracle/graal/blob/master/regex/README.md

conradludgate 93 days ago [-]

Backreferences are not part of the original regular expression theories. Regular expressions are supposed to be equivalent to DFAs, but backrefs upgrade their status to full Turing machines.

Algorithms for running regexes as non-deterministic finite automata just happen to be easy to add backrefs too, but it completely breaks the DFA principle.

Interesting video on "How regexes got catastrophic": https://youtu.be/gITmP0IWff0

HackerThemAll 93 days ago [-]

and lack of backreferences makes them fast, but crippled.

Rendered at 15:02:40 GMT+0000 (Coordinated Universal Time) with Vercel.