My AI-built PHP engine in Rust passes 17% of PHP-src tests, renders WordPress (ekinertac.com)
sshine 1 hours ago [-]
My boss asked me to set up a WordPress for a product landing page.

I naturally won't do this; it's no more than a couple of weeks ago that some SQL injection landed in the search query function of this monstrosity.

WordPress always was and always will be terrible.

So I set up the landing page with a Hugo static site, and I've been vibe-coding a WordPress-like dashboard that operates on git repositories containing Hugo sites.

I call it WorbPress (not released yet), and I'm sure that's what my boss told me to install, or I might've misheard.

And yes, it's written in Rust (with Axum and Alpine.js), because why not?

kmoser 7 minutes ago [-]
Just to clarify: you think your vibecoded dashboard is more secure than WordPress? Not saying you're wrong, just wondering why you think you're right. Are you auditing the generated code, or is it a giant yolo?
sureglymop 1 hours ago [-]
I feel like not choosing WordPress was a great choice but I'm not sure about the rest of the comment. A simple html file might make for a good landing page though.
brailsafe 10 minutes ago [-]
> because why not?

I'm not certain, but it seems like you're not being entirely serious here, however..

If you aren't joking, or for other people in this position, I'd first wonder if the landing page required a search function that would hypothetically be subject to the vulnerability, then I'd wonder about what the normal nature of your business is and how much latitude you personally have in the allocation of billable hours to arbitrary technology choices and whether those do actually align with the deliverable, then if I was the boss I might wonder why you created a bunch of (potentially) out-of-scope random liability using unusual lesser-known tools based on a personal vendetta against WordPress.

I've been in this position, conceptually if not literally, and I've probably been (in a way, rightfully) fired for it, but my country's labor protections are likely not quite as good as Denmark's.

If there's a question about why money was spent on implementing a bunch of stuff nobody knows for a reason nobody cares about, especially for a very short-lived thing like a landing page, then it's a sticky situation if the answer is basically novelty. Something like this, if it does serve a purpose, should be planned for and a case made for it, but that also doesn't really seem like agency work.

If I was asked for WordPress, which I have been, and I delivered Rust, I don't think I'd keep that job, but mileage may vary.

Most work is about solving problems as they are, not what we wish them to be, and if a 5 min job becomes a month long job that the customer didn't ask for, it's an extreme case of yak-shaving.

is_true 1 hours ago [-]
Why not use headless WordPress?
lawrenceduk 2 hours ago [-]
Is it astonishing you got to 17% with some vibe code? Sure.

But most of the stuff I’ve vibe coded this year has been astonishing by 2025’s standards.

If you got 100% I’d be genuinely blown away.

sdesol 1 hours ago [-]
The article doesn't go into how they managed the AI context when implementing things, but I would not be surprised if, done in a methodical way, 80%-90% of the tests could have passed.
pylua 1 hours ago [-]
Does anyone know why we write code anymore? Why not pass through to an llm that generates the page on the fly (ssr)?

Is it cost?

tmh88j 12 minutes ago [-]
> Does anyone know why we write code anymore?

Write or review?

> Why not pass through to an llm that generates the page on the fly (ssr)?

For one, LLMs aren't deterministic. Ignoring that, PHP and every other mainstream programming language are lightning quick and require a fraction of the resources to render a response compared to an LLM. PHP is SSR at its core and was designed as an HTML templating engine over 30 years ago, back when a web server might have 16 MB of RAM. You would need to spend tens of thousands of dollars on an LLM to serve a small number of people at the performance that a few hundred dollars' worth of PHP server delivers to thousands of people.

Jabrov 1 hours ago [-]
Yes: cost, speed, and reliability.

But all of those things are improving at shocking speeds, so I think we’re on a path where code is losing value quickly.

pylua 1 hours ago [-]
Yeah, I agree. It will be like serverless but for code: codeless.

It’s a disconcerting future.

block_dagger 16 minutes ago [-]
Why disconcerting?
general_reveal 1 hours ago [-]
Standards vary.
mgaunard 48 minutes ago [-]
Why is the AI only able to reach 17%?

Surely it can just keep iterating until it implements the full test suite?

hoppp 45 minutes ago [-]
Money probably. This is a cash burn project.
fuckinpuppers 2 hours ago [-]
Use AI to make WordPress secure and not suck as much
lioeters 1 hours ago [-]
Even an AGI can't accomplish the impossible.
AmazingEveryDay 3 hours ago [-]
Interesting read. Given what the process is producing it's probably quite cost-effective?
wsor4035 1 hours ago [-]
I'll preface my comment by saying: this might not be the best solution given the goal of your project, which is to iteratively loop through and improve on the tests each round; using deps would make that process longer and more complicated, since you would potentially have to work with another project.

.....however.....

mago, a static analyzer for PHP, is written in Rust and might be useful for gaining some "free" performance uplift: https://github.com/carthage-software/mago. iirc it splits out a fair bit of its internals so they can be used by other projects (citation needed)

gamblor956 2 hours ago [-]
Maybe the takeaway is that 20% is about all the LLM can muster.
malisper 1 hours ago [-]
> Maybe the takeaway is that 20% is about all the LLM can muster

At this point there's a long list of projects that have used LLMs to rewrite a system in Rust including:

  - Bun (https://github.com/oven-sh/bun/pull/30412)
  - Valkey (https://github.com/ianm199/valdr)
  - Git (https://github.com/gitbutlerapp/grit)
  - Postgres (https://github.com/malisper/pgrust)
With the exception of Bun, these projects were done pre-Fable too, so I bet Fable will make these types of rewrites even easier.
verandaguy 1 hours ago [-]
I'm not sure about the other three, but Bun's rewrite from Zig to Rust was a bit of a joke. `unsafe`s in the thousands, a quarter-million lines of diff, and merged inside a week with no significant public discourse (at least, not much that was responded to by the author).
solid_fuel 18 minutes ago [-]
Still waiting on that blog post from Jarred that will supposedly answer all the questions and concerns about the Rust port.
ekinertac 4 hours ago [-]
Author here.

To be upfront about what this is: I'm not a Rust developer or a PHP internals person. This is an experiment in whether the "point the AI at the original project's test suite" methodology (the way Bun was driven against real-world suites) holds up when the human can't review the code. The oracle is php-src's own .phpt corpus, ~22k tests I didn't write. Current honest score: 3,844 passing (17.4%), with a realistic ceiling around 40-45% since the rest tests C extensions (GD, curl, intl, etc.) that are out of scope.

"Renders WordPress" means: fresh install completes into SQLite, the front page renders with real posts, a real theme and /wp-admin/ renders without issues. The REST API is untested, and it's currently ~55x slower than PHP on the front page (a bytecode VM is in progress, micro-benchmarks are already at 1-3x of PHP 8.5).

The scoreboard auto-generates into the repo after every run, whether the number went up or down.

Happy to answer anything.
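
For readers unfamiliar with the oracle described above, here is a minimal sketch of how a .phpt-driven score could be computed. It is not the project's actual harness. It assumes only the basic `--TEST--` / `--FILE--` / `--EXPECT--` sections of the .phpt format; `parse_phpt`, `passes`, `implied_total`, and `fake_engine` are hypothetical names introduced for illustration, and the exact-match-after-trimming comparison is a simplification of this sketch.

```python
import re

# A section header is a line such as "--TEST--" or "--EXPECT--".
SECTION_RE = re.compile(r"^--([A-Z_]+)--\s*$")


def parse_phpt(text):
    """Split the text of a .phpt file into {section_name: body}."""
    sections = {}
    current = None
    for line in text.splitlines():
        match = SECTION_RE.match(line)
        if match:
            current = match.group(1)
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return {name: "\n".join(body) for name, body in sections.items()}


def passes(sections, run_engine):
    """Return True when the engine's output equals --EXPECT-- after trimming.

    Tests that use other expectation sections (for example --EXPECTF--)
    are counted as failures here; that is a limitation of this sketch.
    """
    if "FILE" not in sections or "EXPECT" not in sections:
        return False
    return run_engine(sections["FILE"]).strip() == sections["EXPECT"].strip()


def implied_total(passed, rate_percent):
    """Back out the corpus size implied by a pass count and a percentage."""
    return round(passed / (rate_percent / 100.0))


SAMPLE = """--TEST--
echo prints a string
--FILE--
<?php echo "hi"; ?>
--EXPECT--
hi
"""


def fake_engine(source):
    # Hypothetical stand-in for the engine under test; a real harness
    # would execute the PHP source and capture its output.
    return "hi" if 'echo "hi"' in source else ""


if __name__ == "__main__":
    sections = parse_phpt(SAMPLE)
    print(sections["TEST"], "->", "PASS" if passes(sections, fake_engine) else "FAIL")
    # 3,844 passing at 17.4% implies a corpus in line with the "~22k" figure.
    print("implied corpus size:", implied_total(3844, 17.4))
```

The last line only checks the arithmetic in the comment: 3,844 passes at 17.4% implies a corpus of roughly 22,000 tests, consistent with the "~22k" stated above.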

adamtaylor_13 2 hours ago [-]
This is a pretty cool experiment. Thanks for sharing!
pluc 1 hours ago [-]
Compare with FrankenPHP?
bbg2401 2 hours ago [-]
Will you answer questions yourself, or will you simply pass on what your LLM of choice writes for you?

Edit: On further inspection, the blog design, the blog build, the blog articles and even the anecdotes used in the articles are entirely Claude generated.

Stop being so lazy. Get Claude to do something interesting and use your own intellect to assess and challenge the work in your write-up. Or the other way around. Inject some amount of human work, at least. Otherwise, what's the point in sharing?

ShinyLeftPad 2 hours ago [-]
But it will be at least 17% correct!
Ozzie-D 34 minutes ago [-]
[flagged]
keepupnow 2 hours ago [-]
Why stop at 17%? Come back when you are at 100%; otherwise it's just another project.