I work as a DevOps/SRE and have been doing it FinTech (bank, hedge funds, startups) and Crypto (L1 chain) for almost 20 years.
My thoughts on vibe coding vs production code:
- vibe coding can 100% get you to a PoC/MVP probably 10x faster than pre LLMs
- This is partly b/c it is good at things I'm not good at (e.g. front end design)
- But then I need to go in and double check performance, correctness, information flow, security etc
- The LLM makes this easier but the improvement drops to about 2-3x b/c there is a lot of back and forth + me reading the code to confirm etc (yes, another LLM could do some of this but then that needs to get setup correctly etc)
- The back and forth part can be faster if e.g. you have scripts/programs that deterministically check outputs
- Testing workloads that take hours to run still take hours to run with either a human or LLM testing them out (aka that is still the bottleneck)
So overall, this is why I think we're getting wildly different reports on how effective vibe coding is. If you've never built a data pipeline and a LLM can spin one up in a few minutes, you think it's magic. But if you've spent years debugging complicated trading or compliance data pipelines you realize that the LLM is saving you some time but not 10x time.
matt_heimer 56 minutes ago [-]
I'm building a Java HFT engine and the amount of things AI gets wrong is eye opening. If I didn't benchmark everything I'd end up with much less optimized solution.
Examples: AI really wants to use Project Panama (FFM) and while that can be significantly faster than traditional OO approaches it is almost never the best. And I'm not taking about using deprecated Unsafe calls, I'm talking about using primative arrays being better for Vector/SIMD operations on large sets of data. NIO being better than FFM + mmap for file reading.
You can use AI to build something that is sometimes better than what someone without domain specific knowledge would develop but the gap between that and the industry expected solution is much more than 100 hours.
jacquesm 25 minutes ago [-]
AI is extremely good at the things that it has many examples for. If what you are doing is novel then it is much less of a help, and it is far more likely to start hallucinating because 'I don't know' is not in the vocabulary of any AI.
mtrovo 46 minutes ago [-]
I think the main issue is treating LLM as a unrestrained black box, there's a reason nobody outside tech trust so blindly on LLMs.
The only way to make LLMs useful for now is to restrain their hallucinations as much as possible with evals, and these evals need to be very clear about what are the goal you're optimizing for.
See karpathy's work on the autoresearch agent and how it carry experiments, it might be useful for what you're doing.
riffraff 24 minutes ago [-]
> there's a reason nobody outside tech trust so blindly on LLMs.
Man, I wish this was true. I know a bunch of non tech people who just trusts random shit that chatgpt made up.
I had an architect tell me "ask chatgpt" when I asked her the difference between two industrial standard measures :)
We had politicians share LLM crap, researchers doing papers with hallucinated citations..
It's not just tech people.
grim_io 36 minutes ago [-]
Wouldn't Java always lose in terms of latency against a similarly optimized native code in, let's say, C(++)?
jodleif 16 minutes ago [-]
As long as you tune the JVM right it can be faster. But its a big if with the tune, and you need to write performant code
jacquesm 23 minutes ago [-]
Not necessarily. Java can be insanely performant, far more than I ever gave it credit for in the first decade of its existence. There has been a ton of optimization and you can now saturate your links even if you do fairly heavy processing. I'm still not a fan of the language but performance issues seem to be 'mostly solved'.
nly 13 minutes ago [-]
"Saturating your links" is rarely the goal in HFT.
You want low deterministic latency with sharp tails.
If all you care about is throughput then deep pipelines + lots of threads will get you there at the cost of latency.
tyingq 25 minutes ago [-]
Depends. Many reasons, but one is that Java has a much richer set of 3rd party libraries to do things versus rolling your own. And often (not always) third party libraries that have been extensively optimized, real world proven, etc.
Then things like the jit, by default, doing run time profiling and adaptation.
FpUser 40 minutes ago [-]
I am curious about what causes some to choose Java for HFT. From what I remember the amount of virgin sacrifices and dances with the wolves one must do to approach native speed in this particular area is just way too much of development time overhead.
nly 10 minutes ago [-]
"HFT" means different things to different people.
I've worked at places where ~5us was considered the fast path and tails were acceptable.
In my current role it's less than a microsecond packet in, packet out (excluding time to cross the bus to the NIC).
But arguably it's not true HFT today unless you're using FPGA or ASIC somewhere in your stack.
Aurornis 19 minutes ago [-]
There’s a big gap between reality and the influencer posts about LLMs. I agree with you that LLMs do provide some significant acceleration, but the influencers have tried to exaggerate this into unbelievable numbers.
Even non-influencers are trying to exaggerate their LLM skills as a way to get hired or raise their status on LinkedIn. I rarely read the LinkedIn social feed but when I check mine it’s now filled with claims from people about going from idea to shipped product in N days (with a note at the bottom that they’re looking for a new job or available to consult with your company). Many of these posts come from people who were all in on crypto companies a few years ago.
The world really is changing but there’s a wave of influencers and trend followers trying to stake out their claims as leaders on this new frontier. They should be ignored if you want any realistic information.
I also think these exaggerated posts are causing a lot of people to miss out on the real progress that is happening. They see these obviously false exaggerations and think the opposite must be true, that LLMs don’t provide any benefit at all. This is creating a counter-wave of LLM deniers who think it’s just a fad that will be going away shortly. They’re diminishing in numbers but every LLM thread on HN attracts a few people who want to believe it’s all just temporary and we’re going back to the old ways in a couple years.
ryandrake 12 minutes ago [-]
> I rarely read the LinkedIn social feed but when I check mine it’s now filled with claims from people about going from idea to shipped product in N days (with a note at the bottom that they’re looking for a new job or available to consult with your company).
This always seems to be the pattern. "I vibe coded my product and shipped it in 96 hours!" OK, what's the product? Why haven't I heard of it? Why can't it replace the current software I'm using? So, you're looking for work? Why is nobody buying it?
Where is the Quicken replacement that was vibecoded and shipping today? Where are the vibecoded AAA games that are going to kill Fortnite? Where is the vibecoded Photoshop alternative? Heck, where is the vibecoded replacement for exim3 that I can deploy on my self hosted E-mail server? Where are all of the actual shipping vibecoded products that millions of users are using?
ge96 18 minutes ago [-]
Day 7 of using Claude Code here are my takes...
Aperocky 55 minutes ago [-]
The magic is testing. Having locally available testing and high throughput testing with high amount of test cases now unlocks more speed.
The test cases themselves becomes the foci - the LLM usually can't get them right.
neonbrain 27 minutes ago [-]
The word "Testing" is a very loaded term. Few non-professionals, or even many professionals, fully understand what is meant by it.
...and I'm certain I've missed dozens of other test approaches.
bojangleslover 19 minutes ago [-]
What I do now is I make an MVP with the AI, get it working. And then tear it all down and start over again, but go a little slower. Maybe tear down again and then go even more slowly. Until I get to the point where I'm looking at everything the AI does and every line of code goes through me.
bauerd 34 minutes ago [-]
>Testing workloads that take hours to run still take hours to run with either a human or LLM testing them out (aka that is still the bottleneck)
Absolutely. Tight feedback loops are essential to coding agents and you can’t run pipelines locally.
baxtr 19 minutes ago [-]
Isn’t that the reason why people advocate for spec-driven development instead of vibe coding?
yojo 26 minutes ago [-]
> - This is partly b/c it is good at things I'm not good at (e.g. front end design)
Everyone thinks LLMs are good at the things they are bad at. In many cases they are still just giving “plausible” code that you don’t have the experience to accurately judge.
I have a lot of frontend app dev experience. Even modern tools (Claude w/Opus 4.6 and a decent Claude.md) will slip in unmaintainable slop in frontend changes. I catch cases multiple times a day in code review.
Not contradicting your broader point. Indeed, I think if you’ve spent years working on any topic, you quickly realize Claude needs human guidance for production quality code in that domain.
phillipclapham 26 minutes ago [-]
The gap is definitely real. But I think most of this thread is misdiagnosing why it exists. It's not that AI cannot produce production quality code, it's that the very mental model most people have of AI is leading them to use the wrong interaction model for closing that last 20% of complexity in production code bases.
The author accidentally proved it: the moment they stopped prompting and opened Figma to actually design what they wanted, Claude nailed the implementation. The bottleneck was NEVER the code generation, it was the thinking that had to happen BEFORE ever generating that code. It sounds like most of you offload the thinking to AFTER the complexity has arisen when the real pattern is frontloading the architectural thinking BEFORE a single line of code is generated.
Most of the 100-hour gap is architecture and design work that was always going to take time. AI is never going to eliminate that work if you want production grade software. But when harnessed correctly it can make you dramatically faster at the thinking itself, you just have to actually use it as a thinking partner and not just a code monkey.
jopsen 6 minutes ago [-]
Yeah, communicating what you want can be hard.
I'm doing a simple single line text editor, and designing some frame options. Which has a start end markers.
This was really hard to get the LLM to do right.. until just took a pen and paper, drew what I wanted, took a photo and gave it to the llm
raincole 54 minutes ago [-]
They're... launching an NFT product in 2026...
I know it's not the point of this article, but really?
s1mon 51 minutes ago [-]
Yep. As much as the rest of it resonated with LLM coding experiences I'm having, the NFT thing is unfortunate.
serial_dev 40 minutes ago [-]
The way I see it, the NFT part is actually just for convenience to distribute AI generated images.
It could have been a web app, but with NFTs and Farcaster miniapps, you market to people who are willing and able to spend using their wallet instead of asking “normies” for credit card information for a 2 dollar custom image (that you could also prompt out of a free Gemini session).
With Farcaster, you also already have the profile picture of the user, one less hurdle again.
ryandrake 7 minutes ago [-]
I think there's simply a huge overlap between the Crypto Bros, the NFT Bros, and now the AI Bros. The same sorts of people are pumping each one. I knew a guy who was into LeadGen and Drop Shipping in the 2000s, then got into online poker, then of course, got into Crypto, then inevitably NFTs. I haven't kept up with him, but I'm almost 100% sure he's pumping some AI related scheme now. These guys get into this pipeline and at each stage they are convinced that they're going to get rich off it.
ChrisMarshallNY 7 minutes ago [-]
"working" != "shipping."
When we start selling the software, and asking people to pay for/depend upon our product, the rules change -substantially.
Whenever we take a class, they always use carefully curated examples, to make whatever they are teaching, seem absurdly simple. That's what you are seeing, when folks demonstrate how "easy" some new tech is.
A couple of days ago, I visited a friend's office. He runs an Internet Tech company, that builds sites, does SEO, does hosting, provides miscellaneous tech services, etc.
He was going absolutely nuts with OpenClaw. He was demonstrating basically rewiring his entire company, with it. He was really excited.
On my way out, I quietly dropped by the desk of his #2; a competent, sober young lady that I respect a lot, and whispered "Make sure you back things up."
carterparks 60 minutes ago [-]
I think there's a lot to pick apart here but I think the core premise is full of truth. This gap is real contrary to what you might see influencers saying and I think it comes from a lot of places but the biggest one is writing code is very different than architecting a product.
I've always said, the easiest part of building software is "making something work." The hardest part is building software that can sustain many iterations of development. This requires abstracting things out appropriately which LLMs are only moderately decent at and most vibe coders are horrible at. Great software engineers can architect a system and then prompt an LLM to build out various components of the system and create a sustainable codebase. This takes time an attention in a world of vibe coders that are less and less inclined to give their vibe coded products the attention they deserve.
niemandhier 2 hours ago [-]
With sufficiently advanced vibe coding the need for certain type of product just vanishes.
I needed it, I quickly build it myself for myself, and for myself only.
sieste 58 minutes ago [-]
Related anecdote: My 12yo son didn't like the speed cubing online timer he was using because it kept crashing the browser and interrupted him with ads. Instead of googling a better alternative we sat down with claude code and put together the version of the website that behaved and looked exactly as he wanted. He got it working all by himself in under an hour with less than 10 prompts, I only helped a bit putting it online with github pages so he can use it from anywhere.
WarmWash 34 minutes ago [-]
I don't think people are grasping yet that this is the future of software, if by no metric other than "most software used is created by the user".
marcosdumay 1 minutes ago [-]
So... The future is like the past?
That would be good news, but I doubt most people will do things like that.
nly 7 minutes ago [-]
Wont happen.
The average user just has no interest in building things.
qsera 13 minutes ago [-]
>most software used is created by the user
You really believe that?
zahlman 4 minutes ago [-]
That wasn't being claimed, just proposed as the direction we're headed.
zahlman 4 minutes ago [-]
... So at no point in this did anyone even question why it should be a website?
lacedeconstruct 2 hours ago [-]
I dont want that though, I want someone to spend much more time than I can afford thinking about and perfecting a product that I can pay for and dont worry about it
jsdalton 41 minutes ago [-]
The metaphor that’s popped into my head recently is baking bread.
You can learn to bake good bread. It’s not _that_ hard. And it’ll probably taste better than store bought bread.
But it almost certainly won’t be cheaper. And it’ll take a more more time and effort.
Still, sometimes you might bake your own bread for kicks. But most of the time, you’ll just buy the bread someone else has already perfected.
nly 6 minutes ago [-]
Baking bread also takes hours of waiting.
I can have fresh bread anytime I want from a handful of nearby stores.
kami23 55 minutes ago [-]
And some people do, both things can be true. I'd rather make a tool just for me that breaks when I introduce a new requirement and I just add into it and keep going.
kjksf 16 minutes ago [-]
The statement wasn't: "no one ever vibe codes an alternative to product X"
It was: "With sufficiently advanced vibe coding the need for certain type of product just vanishes."
If a product has 100 thousand users and 1% of them vibe codes an alternative for themselves, the product / business doesn't vanish. They still have 99 thousand of users.
That was the rebuttal, even if not presented as persuasively and intelligently as I just did.
So no, it's not the case of "both things being true". It's a case of: he was wrong.
hmmmmmmmmmmmmmm 2 hours ago [-]
If we could return to one-off payments without dark patterns I would agree. Hopefully at least the software that rely on grift will start to vanish.
keyle 2 hours ago [-]
I built a jira with attachments and all sorts of bells and whistles. Purrs like a kitten. Saas are going extinct. At least the jobs that charged $1000 a day to write jira plugins.
ivan_gammel 2 hours ago [-]
Some minor UX enhancement SaaS of the most recent VC-funded wave will do. Maybe those who forgot how to invest in R&D and spent last 20 years just fixing bugs. There’s plenty of SaaS on the market that offers added value beyond the code. Data brokers. Domain experts, etc. Even if homemade solution is sometimes possible, initial development costs are going to be just one of several important factors in choosing whether to build or to buy.
101008 44 minutes ago [-]
SaaS are not going exctinct. This reminds me of the LinkedIn posts saying they clone Slack in two hours, copying the UI, etc. Yeah, if you think Slack is private chat rooms then you should use IRC for your company.
One of the most valuable things about Slack is the ecosystem: apps, API support, etc. If you need to receive notifications from external apps (like PageDuty or Incident.io or something like that), good luck expecting them having a setup for your own version of the app. Yeah, some of them provide webhooks (not all of them), but in the end you have to maintain that too...
advancespace 29 minutes ago [-]
[dead]
pydry 1 hours ago [-]
jira is a perfect example of an abysmal product that was marketed well.
xp84 47 minutes ago [-]
Yes, it seems like it got to some tipping point around 2013 where so many product and management people were familiar with it, and from there it became this “industry standard” that management always wanted everyone to use.
Also though, I feel like being attached to Confluence helped it because there is a lot less competition in the world of documentation wikis than there is in task management.
jcgrillo 56 minutes ago [-]
How many products are actually like that? If I could easily replace github, datadog/sentry/whatever, cloudflare, aws, tailscale that would be great. In my view building and owning is better than buying or renting. Especially when it comes to data--it would be much better for me to own my telemetry data for example than to ship it off to another company. But I don't think you (or anyone) will be vibecoding replacements for these services anytime soon. They solve big, hard, difficult problems.
28 minutes ago [-]
CuriouslyC 48 minutes ago [-]
Github is on the chopping block as a tool (it's sticky as a social network). The other stuff not so much.
The things that are going away are tools that provide convenience on top of a workflow that's commoditized. Anything where the commercial offering provides convenience rather than capabilities over the open source offerings is gonna get toasted.
jcgrillo 41 minutes ago [-]
Even at recent levels of uptime I think it would be very difficult to build a competing product that could function at the scale of even a small company (10 engineers). How would you implement Actions? Code review comments/history? Pull requests? Issues? Permalinks? All of these things have serious operational requirements. If you just want some place to store a git repository any filesystem you like will do it but when you start talking about replacing github that's a different story altogether and TBH I don't think building something that appears to function the same is even the hard part, it's the scaling challenges you run into very quickly.
WarmWash 22 minutes ago [-]
The future is narrow bespoke apps custom tailored for exactly that one single users use case.
An example would be if the user only ever works with .jpg files, then you don't need to support any of the dozens of other formats an image program would support.
I cannot stress enough how many software users out there are only using 1-10% of a program's capability, yet they have to pay for a team of devs who maintain 100% of it.
jcgrillo 13 minutes ago [-]
"The future" is fiction. It's a blank canvas where you can make a fingerpainting of any fantasy you like. Whenever people tell me about "the future" I know they're talking absolute rubbish. And I also like your fantasy! But it probably won't happen.
ryandrake 4 minutes ago [-]
I call it "Psychics for Programmers." People will scoff at psychics and fortune telling and palm reading, but then the same people will listen to Elon or some founder or VC and be utterly convinced that that person is a visionary and can describe the future.
IAmGraydon 2 hours ago [-]
This is a pipe dream and “sufficiently advanced” is doing a lot of heavy lifting. You really think people would rather spin up and debug their own self-made software rather than pay for something that has been tested, debugged, and proven by thousands of users? Why would anyone do that for anything more than a very simple script? It makes zero sense unless the LLM outputs literally perfect one-shot software reliably.
niemandhier 60 minutes ago [-]
Perplexity just launched a tool that builds and hosts small bespoke tools.
I tried it works wells. I can do the same thing in my Linux machine, but even my 12 year old now can get perplexity to build him a tool to compare ram prices at different chinease vendors.
qsera 7 minutes ago [-]
Yes, LLMs can be a better search tool.
user34283 1 hours ago [-]
It makes sense if you want bespoke software to do a specific job in a way best suited to your workflow.
Could you do the same in eg. Photoshop? Maybe, but even if, you would need to learn how.
program_whiz 25 minutes ago [-]
Photoshop is a good example -- not that I agree with everything in the app, but just to design all the interactions properly in photoshop would take hundreds of hours (not to mention testing and figuring out the edges). If your goal is a 1-to-1 clone why not use Krita or photoshop? With LLM you'll get "mostly there" with many many hours of work, and lots of sharp edges. If all you need is paint bucket, basic brush / pencil, and save/load, ok maybe you can one-shot it in a few hours... or just use paint / aesprite...
marginalia_nu 30 minutes ago [-]
The more I evaluate Claude Code, the more it feels like the world's most inconsistent golfer. It can get within a few paces of the hole in often a single strike, and then it'll spend hours, days, weeks trying to nail the putt.
There's some 80-20:ness to all programming, but with current state of the art coding models, the distribution is the most extreme it's ever been.
dehrmann 31 minutes ago [-]
> Late in the night most problems were fixed and I wrote a script that found everyone whose payment got stuck. I sent them money back (+ extra $1 as a ‘thank you for your patience’ note), and let them know via DMs.
(emphasis added)
Not sure if it was actually written by hand or AI was glossed over, but as soon as giving away money was on the table, the author seems to have ditched AI.
hebrides 45 minutes ago [-]
I’ve had a similar experience. I’ve been vibecoding a personal kanban app for myself. Claude practically one-shotted 90% of the core functionality (create boards, lanes, cards, etc.) in a single session. But after that I’ve now spent close to 30 hours planning and iterating on the remaining features and UI/UX tweaks to make the app actually work for me, and still, it doesn’t feel "ready" yet. That’s not to say it hasn’t sped up the process considerably; it would’ve taken me hours to achieve what Claude did in the first 10 minutes.
lelanthran 13 minutes ago [-]
I've got a few projects I've generated, along with a wholly handwritten project started in Dec.
The difference I've noticed is that the act of actually typing out code made me backtrack a few times refining the possible solutions before even starting the integration tests, sometimes before even doing a compile.
When generating, the LLM never backtracked, even in the face of broken tests. It would proceed to continue band-aiding until everything passed. It would add special exceptions to general code instead of determining that the general rule should be refined or changed.
The reason that some devs are reporting 10x productivity is because a bunch of duct-taped, band-aided, instant-legacy code is acceptable. Others who dont see that level of productivity increase are spending time fixing the code to be something they can read.
Not sure yet if accepting the spaghetti is the right course. If future LLMs can understand this spaghetti then theres no point in good code. If we still need human coders, then the productivity increase is very small.
qsera 4 minutes ago [-]
> It would add special exceptions to general code instead of determining that the general rule should be refined or changed.
That is pretty bad..
tim-projects 42 minutes ago [-]
I started working on one of my apps around a year ago. There was no ai CLI back then. My first prototype was done in Gemini chat. It took a week copy and pasting text between windows. But I was obsessed.
The result worked but that's just a hacked together prototype. I showed it to a few people back then and they said I should turn it into a real app.
To turn it into a full multi user scaleable product... I'm still at it a year later. Turns out it's really hard!
I look at the comments about weekend apps. And I have some of those too, but to create a real actual valuable bug free MVP. It takes work no matter what you do.
Sure, I can build apps way faster now. I spent months learning how to use ai. I did a refactor back in may that was a disaster. The models back then were markedly worse and it rewrote my app effectively destroying it. I sat at my desk for 12 hours a day for 2 weeks trying to unpick that mess.
Since December things have definitely gotten better. I can run an agent up to 8 hours unattended, testing every little thing and produce working code quite often.
But there is still a long way to go to produce quality.
Most of the reason it's taking this long is that the agent can't solve the design and infra problems on its own. I end up going down one path, realising there is another way and backtracking. If I accepted everything the ai wanted, then finishing would be impossible.
quater321 4 minutes ago [-]
It already starts with BS. Yes there are apps you can build in 30 minutes and they are great, not buggy or crap as he says it. And there are apps you need 1 hour or even weeks. It depends on what you want to build.
To start off by saying that every app build in 30 minutes is crap, simply shows that he did not want to think about it, is ignorant or he simply wanted to push himselve higher up by putting others down.
At this point, every programmer who claims that vibecoding doesn't make you at least 10 times more productive is simply lying or worst, doesn't know how to vibe code.
fixxation92 18 minutes ago [-]
What I really want to know is... as a software developer for 25+ years, when using these AI tools- it is still called "vibecoding"? Or is "vibecoding" reserved for people with no/little software development background that are building apps. Genuine question.
DennisP 5 minutes ago [-]
Steve Yegge has been a dev for several decades with lead spots at Amazon and Google, has completely converted to using AI, wrote a book about it using it effectively for large production-ready projects, and still calls it vibe coding.
newsoftheday 8 minutes ago [-]
As a software developer over 30 years, AI is not a tool, it is not deterministic, it is an aide.
rhoopr 2 hours ago [-]
This seems more like he is bad at describing what he wants and is prompting for “a UI” and then iterating “no, not like that” for 99 hours.
firesteelrain 1 hours ago [-]
Author admittedly didn’t know how to scale his app for thousands or hundreds of thousands of users. He jokes about it working great on localhost or “my machine”.
Not knocking the premise of the post. It probably works well for one single user if it’s an iPhone or Android app. But his 100 power hours are probably just right for what he ended up launching as he iterated through the requirements and learned how to set this up through reinforced learning and user feedback.
PunchTornado 28 minutes ago [-]
Yeah but if you have to describe in very much details in english, you're better of just writing it with autocomplete.
I find that vibe coding is useful when it can be build with little details and it makes the right assumptions.
Used Codex for the whole project. At first I used claude for the architect of the backend since thats where I usually work and got experience in. The code runner and API endpoints were easy to create for the first prototype. But then it got to the UI and here's where sh1t got real. The first UI was in react though I had specifically told it to use Vue. The code editor and output window were a mess in terms of height, there was too much space between the editor and the output window and no matter how much time I spent prompting it and explaining to it, it just never got it right. Got tired and opened figma, used it to refine it to what I wanted. Shared the code it generated to github, cloned the code locally then told codex to copy the design and finally it got it right.
Then came the hosting where I wanted the code runner endpoint to be in a docker container for security purpose since someone could execute malicious code that took over the server if I just hosted it without some protection and here it kept selecting out of date docker images. Had to manually guide it again on what I needed. Finally deployed and got it working especially with a domain name. Shared it with a few friends and they suggested some UI fixes which took some time.
For the runner security hardening I used Deepseek and claude to generate a list of code that I could run to show potential issues and despite codex showing all was fine, was able to uncover a number of issues then here is where it got weird, it started arguing with me despite showing all the issues present. So I compiled all the issues in one document, shared the dockerfile and linux secomp config tile with claude and the also issues document. It gave me a list of fixes for the docker file to help with security hardening which I shared back with codex and that's when it fixed them.
Currently most of the issues were resolved but the whole process took me a whole week and I am still not yet done, was working most evenings. So I agree that you cannot create a usable product used by lots of users in 30 minutes not unless it's some static website. It's too much work of constant testing and iteration.
tom_ 18 minutes ago [-]
You can say "shit" here if you like.
skyberrys 2 hours ago [-]
If you ask for something complicated this headline is more than true. But why complicate things, keep it simple and keep it fast.
Also this article uses 'pfp' like it's a word, I can't figure out what it means.
I'm able to vibe code simple apps in 30 minutes, polish it in four hours and now I've been enjoying it for 2 months.
etothet 2 hours ago [-]
I noticed this as well. I had to look it up. Apparently ‘pfp’ means ‘profile picture’.
xp84 43 minutes ago [-]
Yeah I’ve always found that a cringe initialism given that it’s not Pro File Picture. I would just say avatar.
stavros 1 hours ago [-]
Apparently it means profile photo.
stillpointlab 1 hours ago [-]
I came across the following yesterday: "The Great Way is not difficult for those who have no preferences," a famous Zen teaching from the Hsin Hsin Ming by Sengstan
As we move from tailors to big box stores I think we have to get used to getting what we get, rather than feeling we can nitpick every single detail.
I'd also be more interested in how his 3rd, 4th or 5th vibe coded app goes.
jimnotgym 1 hours ago [-]
I have not been coding for a few years now. I was wondering if vibe coding could unstick some of my ideas. Here is my question, can I use TDD to write tests to specify what I want and then get the llm to write code to pass those tests?
_heimdall 57 minutes ago [-]
That's a great approach, though I'd also recommend setting up a strong basis for linting, type checking, compilation, etc depending on the language. An LLM given a full test suite and guard rails of basic code style rules will likely do a pretty good job.
I would find it a bit tricky to write a full test suite for a product without any code though. You'd need to understand the architecture a bit and likely end up assuming, or mocking, what helpers, classes, config, etc will be built.
linsomniac 44 minutes ago [-]
To expand on the "Yes": the AI tools work extremely well when they can test for success. Once you have the tests as you'd like them, you may want to tell the LLM not to modify the tests because you can run into situations where it'll "fix" the tests rather than fixing the code.
mlaretallack 53 minutes ago [-]
Yes, I mostly do spec driven developement. And at the design stage, I always add in tests. I repeat this pattern for any new features or bug fixes, get the agent to write a test (unit, intergration or playwright based), reproduce the issue and then implement the change and retest etc... and retest using all the other tests.
potro 48 minutes ago [-]
You absolutely can. This is one of recommended directions with agentic coding. But you can go farther and ask llm to write tests too. The review/approve them.
__mp 43 minutes ago [-]
yes. depending on the techstack your experience might be better or worse.
HTML/CSS/React/Go worked great, but it struggled with Swift (which I had no experience in).
faeyanpiraat 1 hours ago [-]
Yes
quickrefio 32 minutes ago [-]
The speed of prototyping right now is wild.
The interesting shift seems to be that building the first version is no longer the bottleneck — distribution, UX polish and reliability are.
nemo44x 2 hours ago [-]
The 80/20 rule doesn’t go away. I am an AI true believer and I appreciate how fast we can get from nothing to 80% but the last “20%” still takes 80%+ of the time.
The old rules still apply mainly.
tossandthrow 25 minutes ago [-]
Yes, so 80% of 100 hours is considerably less than 80% of 600 hours
iamcalledrob 24 minutes ago [-]
In my experience, the last 20% tends to be the stuff that's less obvious, too, by it's very nature.
The details and pitfalls that are unique to your specific scenario, that you only discover by running into them.
And yet this less obvious, more uncommon stuff is also what AI will be weakest at.
mentalgear 23 minutes ago [-]
> The "remaining 10 percent" is a difference between slop and something people enjoy.
I would say the remaining 10% are about how robust your solution is - anything associated with 'vibe' feels inherently unsecure. If you can objectively proof it is not, that's 10 % time well spend.
anonymous344 1 hours ago [-]
this is why i use ai just for one file at the time, as extension of my own programming. not so fast, but keeps control
i_love_retros 23 minutes ago [-]
> With AI, it’s easier to get the first 90 percent out there. This means we can spend more time on the remaining 10 percent, which means more time for craftsmanship and figuring out how to make your users happy.
EXCEPT... you've just vibe coded the first 90 percent of the product, so completing the remaining 10 percent will take WAY longer than normal because the developers have to work with spaghetti mess.
And right there this guy has shown exactly how little people who are not software developers with experience understand about building software.
westurner 52 minutes ago [-]
I keep seeing things that were vibe coded and thinking, "That's really impressive for something that you only spent that much time on".
To have a polished software project, you must spend time somewhat menially iterating and refining (as each type of user).
To have a polished software project,
you need to have started with tests and test coverage from the start for the UI, too.
Writing tests later is not as good.
I have taken a number of projects from a sloppy vibe coded prototype to 100% test coverage. Modern coding llm agents are good at writing just enough tests for 100% coverage.
But 100% test coverage doesn't mean that it's quality software, that it's fuzzed, or that it's formally verified.
Quality software requires extensive manual testing, iteration, and revision.
I haven't even reviewed this specific project; it's possible that the author developed a quality (CLI?) UI without e2e tests in so much time?
Was the process for this more like "vibe coding" or "pair programming with an LLM"?
westurner 32 minutes ago [-]
> That's really impressive for something that you only spent that much time on"
Again, I haven't even read this particular project;
There's:
Prompt insufficiency:
Was the specification used to prompt the model to develop the software sufficient in relation to what are regarded as a complete enough software specifications?
Model and/or Agent insufficiency,
Software Development methods and/or Project Management insufficiency,
QA insufficiency,
Peer review sufficiency;
Is it already time to rewrite the product using the current project as a more sufficient specification?
But then how many hours of UI and business logic review would be necessary again?
westurner 43 minutes ago [-]
Is 100 hours enough?
A 40-hour work year has 2,080 hours per person per year.
The "10,000" hours necessary to be really good at anything number was the expert threshold that they used to categorize test subjects who performed neuroimaging studies while compassion meditating. "10,000" hours to be an expert is about 5 years at full time.
But how many hours to have a good software product?
Usually I check for tests and test coverage first. You could have spent 1,000 hours on a software project and if it doesn't have automated tests, we can't evolve the software and be sure that we haven't caused regressions.
esafak 2 hours ago [-]
Look at the screenshots to understand what the author means by 'product'.
stavros 1 hours ago [-]
We don't need to shit on someone who shared their experiences and thoughts.
Lerc 30 minutes ago [-]
I agree with you point, but I do look sidelong at the number of points the post has. It is, at the very least, unexpected.
spiderfarmer 2 hours ago [-]
This would have been generic slop if it wasn't for AI.
spacecadet 4 minutes ago [-]
Im an 20 year veteran of application development consulting. Contributor level... not talking head. I do more estimating than anyone you likely know. Consulting is cooked. I just AI native built (not vibe coding...) an application with a buddy, another Principal level engineer and what would cost a client 500-750k and 8-12 weeks, we did for $200 and 1 sprint. Its a passion project but highly complex mapping and navigation app with host/client multi-user sync'd state. Cooked.
jonstewart 2 hours ago [-]
Woodworking is an analogy that I like to use in deciding how to apply coding agents. The finished product needs to be built by me, but now I can make more, and more sophisticated, jigs with the coding agents, and that in turn lets me improve both quality and quantity.
risyachka 3 hours ago [-]
>> people who say they "vibecoded an app in 30 minutes" are either building simple copies of existing projects,
those are not copies, they aren't even features. usually part of a tiny feature that barely works only in demo.
with all vibe coding in the world today you still need at least 6 months full time to build a nice note taking app.
If we are talking something more difficult - it will be years - or you will need a team and it will still take a long time.
Everything less will result in an unusable product that works only for demo and has 80% churn.
ianm218 2 hours ago [-]
Can you expand on this? You definitely don’t need 6 months for a note taking app to be useable it is more you need to compete with the state of the art right
utopiah 2 hours ago [-]
I'd argue you need between 6 minutes and 6 years.
It depends entirely on what you want. You can literally code a JavaScript 1-liner that will make a <textarea> then put the content back in the URL and it will work serverless on pretty much any platform with a Web browser.
You can also write a note taking app that will be federated yet private, that will have its own scripting language, etc. I mean you can yak-shave your way to write your own OS or even designing your own CPU for that.
So... I'm not sure that metric, time, means much without a proper context, including who does it. It's quite different if to do that, regardless of the tooling used, if you are a professional developer, designer, fullstack dev, prototypist, PM, marketer, writer, etc.
risyachka 1 hours ago [-]
> Can you expand on this?
sure. does your note taking app supports formatting? you don't need it today. you will need it at some point. images? same.
does it handle file corruption etc? no? then its pretty much useless.
does it work across devices? in modern world, again, it is pretty much useless without it
it works across devices? then it needs hosting. if it is hosted it needs auth, it needs backups
you can go on for ever.
the bar for very minimal note taking app that you actually will use is very high, with other software it is even higher.
and this is not even state of art, this is must haves
ianm218 26 minutes ago [-]
Obsidian is super popular and is generally local first and device specific.
And even so if your starting a note taking app most of those problems like file corruption and image support are largely solved problems. There is also the benefit of being able to reference tons of open source implementations.
I think one month to notion like app that is prod ready if you just need Auth + markdown + images + standard text editing
weird-eye-issue 2 hours ago [-]
What universe do you live in
hmmmmmmmmmmmmmm 2 hours ago [-]
>with all vibe coding in the world today you still need at least 6 months full time to build a nice note taking app.
Bad example, note apps loaded with features are anti-productive and are for people who treat note taking as a hobby itself.
You have Obsidian anyway if you want something open source to work with.
Ekaros 1 hours ago [-]
Ah, note taking as hobby finally explains to me why these apps seem so popular. I don't think I have ever considered that I need one. And it to be something that shouldn't be fully solved multiple times now. But it really being hobby does kinda make the point for me.
margalabargala 2 hours ago [-]
You seem to be making the assumption that "app" means "sellable product", rather than "one off that works for me". It doesn't.
When everyone is able to make their own one off prototype in 30 minutes, no one will pay for the thing that took someone 6 months.
risyachka 1 hours ago [-]
whatever you prototype - the one who built it in 6 month will have economy of scale to make it cheaper than your diy solution, and because they serve many customers and developed it for 6 months - their product will be 100x better than the one you diy
there is very very rare use case when diy makes sense. in 99% of cases its just a toy that feels nice as you kinda did it. but if you factor in the time etc it is always costs 100x more than $5/month you could usually buy
fzeroracer 2 hours ago [-]
I can't say I'm impressed by this at all. 100+ hours to build a shitty NFT app that takes one picture and a predefined prompt, then mints you a dinosaur NFT. This is the kind of thing I would've seen college students slam out over a weekend for a coding jam with no experience and a few cans of red bull with more quality and effort. Has our standards really gotten so low? I don't see any craftsmanship at play here.
capitalsigma 44 minutes ago [-]
Also the process sounds like a nightmare: "it broke and I asked 4 different LLMs to fix it; my `AGENTS.md` file contained hundreds of special cases; etc." I thought this article was intended to be a horror story, not an advertisement
IAmGraydon 1 hours ago [-]
If you hear someone spouting off about how vibe coding allows for creation of killer apps in a fraction of the time/cost, just ask them if you can see what successful killer apps they’ve created with it. It’s always crickets at that point because it’s somewhere between wishful thinking and an outright lie.
naasking 2 hours ago [-]
Of course vibe coding is going to be a headache if you have very particular aesthetic constraints around both the code and UX, and you aren't capable of clearly and explicitly explaining those constraints (which is often hard to do for aesthetics).
There are some good points here to improve harnesses around development and deployment though, like a deployment agent should ask if there is an existing S3 bucket instead of assuming it has to set everything up. Deployment these days is unnecessarily complicated in general, IMO.
bethekidyouwant 43 minutes ago [-]
Why did this crypto grifter AI app get traction on this site?
Uptrenda 37 minutes ago [-]
I mean the worst part about this is the author also vibe coded their security. It could have been much more catastrophic if they built a crypto wallet or trading system. But because it was NFTs I guess the max damage was limited.
I have to say its a little sad that so many devs think of security and cryptography in the same way as library frameworks. In that they see it as just some black box API to use for their projects rather than respecting that its a fully developed, complex field that demands expertise to avoid mistakes.
hirehalai 42 minutes ago [-]
[dead]
myrak 2 hours ago [-]
[dead]
olivercoleai 2 hours ago [-]
[dead]
Louis830903 1 hours ago [-]
[flagged]
nottorp 46 minutes ago [-]
Wow. First realistic post about coding assistants that I've read on HN, I think.
[Disclaimer: that I have read. Doesn't mean there weren't others.]
Too bad it's about NFTs but we can't have everything, can we?
Rendered at 15:43:39 GMT+0000 (Coordinated Universal Time) with Vercel.
My thoughts on vibe coding vs production code:
- vibe coding can 100% get you to a PoC/MVP probably 10x faster than pre LLMs
- This is partly b/c it is good at things I'm not good at (e.g. front end design)
- But then I need to go in and double check performance, correctness, information flow, security etc
- The LLM makes this easier but the improvement drops to about 2-3x b/c there is a lot of back and forth + me reading the code to confirm etc (yes, another LLM could do some of this but then that needs to get setup correctly etc)
- The back and forth part can be faster if e.g. you have scripts/programs that deterministically check outputs
- Testing workloads that take hours to run still take hours to run with either a human or LLM testing them out (aka that is still the bottleneck)
So overall, this is why I think we're getting wildly different reports on how effective vibe coding is. If you've never built a data pipeline and a LLM can spin one up in a few minutes, you think it's magic. But if you've spent years debugging complicated trading or compliance data pipelines you realize that the LLM is saving you some time but not 10x time.
Examples: AI really wants to use Project Panama (FFM) and while that can be significantly faster than traditional OO approaches it is almost never the best. And I'm not taking about using deprecated Unsafe calls, I'm talking about using primative arrays being better for Vector/SIMD operations on large sets of data. NIO being better than FFM + mmap for file reading.
You can use AI to build something that is sometimes better than what someone without domain specific knowledge would develop but the gap between that and the industry expected solution is much more than 100 hours.
The only way to make LLMs useful for now is to restrain their hallucinations as much as possible with evals, and these evals need to be very clear about what are the goal you're optimizing for.
See karpathy's work on the autoresearch agent and how it carry experiments, it might be useful for what you're doing.
Man, I wish this was true. I know a bunch of non tech people who just trusts random shit that chatgpt made up.
I had an architect tell me "ask chatgpt" when I asked her the difference between two industrial standard measures :)
We had politicians share LLM crap, researchers doing papers with hallucinated citations..
It's not just tech people.
You want low deterministic latency with sharp tails.
If all you care about is throughput then deep pipelines + lots of threads will get you there at the cost of latency.
Then things like the jit, by default, doing run time profiling and adaptation.
I've worked at places where ~5us was considered the fast path and tails were acceptable.
In my current role it's less than a microsecond packet in, packet out (excluding time to cross the bus to the NIC).
But arguably it's not true HFT today unless you're using FPGA or ASIC somewhere in your stack.
Even non-influencers are trying to exaggerate their LLM skills as a way to get hired or raise their status on LinkedIn. I rarely read the LinkedIn social feed but when I check mine it’s now filled with claims from people about going from idea to shipped product in N days (with a note at the bottom that they’re looking for a new job or available to consult with your company). Many of these posts come from people who were all in on crypto companies a few years ago.
The world really is changing but there’s a wave of influencers and trend followers trying to stake out their claims as leaders on this new frontier. They should be ignored if you want any realistic information.
I also think these exaggerated posts are causing a lot of people to miss out on the real progress that is happening. They see these obviously false exaggerations and think the opposite must be true, that LLMs don’t provide any benefit at all. This is creating a counter-wave of LLM deniers who think it’s just a fad that will be going away shortly. They’re diminishing in numbers but every LLM thread on HN attracts a few people who want to believe it’s all just temporary and we’re going back to the old ways in a couple years.
This always seems to be the pattern. "I vibe coded my product and shipped it in 96 hours!" OK, what's the product? Why haven't I heard of it? Why can't it replace the current software I'm using? So, you're looking for work? Why is nobody buying it?
Where is the Quicken replacement that was vibecoded and shipping today? Where are the vibecoded AAA games that are going to kill Fortnite? Where is the vibecoded Photoshop alternative? Heck, where is the vibecoded replacement for exim3 that I can deploy on my self hosted E-mail server? Where are all of the actual shipping vibecoded products that millions of users are using?
The test cases themselves becomes the foci - the LLM usually can't get them right.
Consider the the following: Unit, Integration, System, UAT, Smoke, Sanity, Regression, API Testing, Performance, Load, Stress, Soak, Scalability, Reliability, Recovery, Volume Testing, White Box Testing, Mutation Testing, SAST, Code Coverage, Control Flow, Penetration Testing, Vulnerability Scanning, DAST, Compliance (GDPR/HIPAA), Usability, Accessibility (a11y), Localization (L10n), Internationalization (i18n), A/B Testing, Chaos Engineering, Fault Injection, Disaster Recovery, Negative Testing, Fuzzing, Monkey Testing, Ad-hoc, Guerilla Testing, Error Guessing, Snapshot Testing, Pixel-Perfect Testing, Compatibility Testing, Canary Testing, Installation Testing, Alpha/Beta Testing...
...and I'm certain I've missed dozens of other test approaches.
Absolutely. Tight feedback loops are essential to coding agents and you can’t run pipelines locally.
Everyone thinks LLMs are good at the things they are bad at. In many cases they are still just giving “plausible” code that you don’t have the experience to accurately judge.
I have a lot of frontend app dev experience. Even modern tools (Claude w/Opus 4.6 and a decent Claude.md) will slip in unmaintainable slop in frontend changes. I catch cases multiple times a day in code review.
Not contradicting your broader point. Indeed, I think if you’ve spent years working on any topic, you quickly realize Claude needs human guidance for production quality code in that domain.
The author accidentally proved it: the moment they stopped prompting and opened Figma to actually design what they wanted, Claude nailed the implementation. The bottleneck was NEVER the code generation, it was the thinking that had to happen BEFORE ever generating that code. It sounds like most of you offload the thinking to AFTER the complexity has arisen when the real pattern is frontloading the architectural thinking BEFORE a single line of code is generated.
Most of the 100-hour gap is architecture and design work that was always going to take time. AI is never going to eliminate that work if you want production grade software. But when harnessed correctly it can make you dramatically faster at the thinking itself, you just have to actually use it as a thinking partner and not just a code monkey.
I'm doing a simple single line text editor, and designing some frame options. Which has a start end markers.
This was really hard to get the LLM to do right.. until just took a pen and paper, drew what I wanted, took a photo and gave it to the llm
I know it's not the point of this article, but really?
It could have been a web app, but with NFTs and Farcaster miniapps, you market to people who are willing and able to spend using their wallet instead of asking “normies” for credit card information for a 2 dollar custom image (that you could also prompt out of a free Gemini session).
With Farcaster, you also already have the profile picture of the user, one less hurdle again.
When we start selling the software, and asking people to pay for/depend upon our product, the rules change -substantially.
Whenever we take a class, they always use carefully curated examples, to make whatever they are teaching, seem absurdly simple. That's what you are seeing, when folks demonstrate how "easy" some new tech is.
A couple of days ago, I visited a friend's office. He runs an Internet Tech company, that builds sites, does SEO, does hosting, provides miscellaneous tech services, etc.
He was going absolutely nuts with OpenClaw. He was demonstrating basically rewiring his entire company, with it. He was really excited.
On my way out, I quietly dropped by the desk of his #2; a competent, sober young lady that I respect a lot, and whispered "Make sure you back things up."
I've always said, the easiest part of building software is "making something work." The hardest part is building software that can sustain many iterations of development. This requires abstracting things out appropriately which LLMs are only moderately decent at and most vibe coders are horrible at. Great software engineers can architect a system and then prompt an LLM to build out various components of the system and create a sustainable codebase. This takes time an attention in a world of vibe coders that are less and less inclined to give their vibe coded products the attention they deserve.
I needed it, I quickly build it myself for myself, and for myself only.
That would be good news, but I doubt most people will do things like that.
The average user just has no interest in building things.
You really believe that?
You can learn to bake good bread. It’s not _that_ hard. And it’ll probably taste better than store bought bread.
But it almost certainly won’t be cheaper. And it’ll take a more more time and effort.
Still, sometimes you might bake your own bread for kicks. But most of the time, you’ll just buy the bread someone else has already perfected.
I can have fresh bread anytime I want from a handful of nearby stores.
It was: "With sufficiently advanced vibe coding the need for certain type of product just vanishes."
If a product has 100 thousand users and 1% of them vibe codes an alternative for themselves, the product / business doesn't vanish. They still have 99 thousand of users.
That was the rebuttal, even if not presented as persuasively and intelligently as I just did.
So no, it's not the case of "both things being true". It's a case of: he was wrong.
One of the most valuable things about Slack is the ecosystem: apps, API support, etc. If you need to receive notifications from external apps (like PageDuty or Incident.io or something like that), good luck expecting them having a setup for your own version of the app. Yeah, some of them provide webhooks (not all of them), but in the end you have to maintain that too...
Also though, I feel like being attached to Confluence helped it because there is a lot less competition in the world of documentation wikis than there is in task management.
The things that are going away are tools that provide convenience on top of a workflow that's commoditized. Anything where the commercial offering provides convenience rather than capabilities over the open source offerings is gonna get toasted.
An example would be if the user only ever works with .jpg files, then you don't need to support any of the dozens of other formats an image program would support.
I cannot stress enough how many software users out there are only using 1-10% of a program's capability, yet they have to pay for a team of devs who maintain 100% of it.
I tried it works wells. I can do the same thing in my Linux machine, but even my 12 year old now can get perplexity to build him a tool to compare ram prices at different chinease vendors.
Could you do the same in eg. Photoshop? Maybe, but even if, you would need to learn how.
There's some 80-20:ness to all programming, but with current state of the art coding models, the distribution is the most extreme it's ever been.
(emphasis added)
Not sure if it was actually written by hand or AI was glossed over, but as soon as giving away money was on the table, the author seems to have ditched AI.
The difference I've noticed is that the act of actually typing out code made me backtrack a few times refining the possible solutions before even starting the integration tests, sometimes before even doing a compile.
When generating, the LLM never backtracked, even in the face of broken tests. It would proceed to continue band-aiding until everything passed. It would add special exceptions to general code instead of determining that the general rule should be refined or changed.
The reason that some devs are reporting 10x productivity is because a bunch of duct-taped, band-aided, instant-legacy code is acceptable. Others who dont see that level of productivity increase are spending time fixing the code to be something they can read.
Not sure yet if accepting the spaghetti is the right course. If future LLMs can understand this spaghetti then theres no point in good code. If we still need human coders, then the productivity increase is very small.
That is pretty bad..
The result worked but that's just a hacked together prototype. I showed it to a few people back then and they said I should turn it into a real app.
To turn it into a full multi user scaleable product... I'm still at it a year later. Turns out it's really hard!
I look at the comments about weekend apps. And I have some of those too, but to create a real actual valuable bug free MVP. It takes work no matter what you do.
Sure, I can build apps way faster now. I spent months learning how to use ai. I did a refactor back in may that was a disaster. The models back then were markedly worse and it rewrote my app effectively destroying it. I sat at my desk for 12 hours a day for 2 weeks trying to unpick that mess.
Since December things have definitely gotten better. I can run an agent up to 8 hours unattended, testing every little thing and produce working code quite often.
But there is still a long way to go to produce quality.
Most of the reason it's taking this long is that the agent can't solve the design and infra problems on its own. I end up going down one path, realising there is another way and backtracking. If I accepted everything the ai wanted, then finishing would be impossible.
Not knocking the premise of the post. It probably works well for one single user if it’s an iPhone or Android app. But his 100 power hours are probably just right for what he ended up launching as he iterated through the requirements and learned how to set this up through reinforced learning and user feedback.
I find that vibe coding is useful when it can be build with little details and it makes the right assumptions.
Used Codex for the whole project. At first I used claude for the architect of the backend since thats where I usually work and got experience in. The code runner and API endpoints were easy to create for the first prototype. But then it got to the UI and here's where sh1t got real. The first UI was in react though I had specifically told it to use Vue. The code editor and output window were a mess in terms of height, there was too much space between the editor and the output window and no matter how much time I spent prompting it and explaining to it, it just never got it right. Got tired and opened figma, used it to refine it to what I wanted. Shared the code it generated to github, cloned the code locally then told codex to copy the design and finally it got it right.
Then came the hosting where I wanted the code runner endpoint to be in a docker container for security purpose since someone could execute malicious code that took over the server if I just hosted it without some protection and here it kept selecting out of date docker images. Had to manually guide it again on what I needed. Finally deployed and got it working especially with a domain name. Shared it with a few friends and they suggested some UI fixes which took some time.
For the runner security hardening I used Deepseek and claude to generate a list of code that I could run to show potential issues and despite codex showing all was fine, was able to uncover a number of issues then here is where it got weird, it started arguing with me despite showing all the issues present. So I compiled all the issues in one document, shared the dockerfile and linux secomp config tile with claude and the also issues document. It gave me a list of fixes for the docker file to help with security hardening which I shared back with codex and that's when it fixed them.
Currently most of the issues were resolved but the whole process took me a whole week and I am still not yet done, was working most evenings. So I agree that you cannot create a usable product used by lots of users in 30 minutes not unless it's some static website. It's too much work of constant testing and iteration.
Also this article uses 'pfp' like it's a word, I can't figure out what it means.
I'm able to vibe code simple apps in 30 minutes, polish it in four hours and now I've been enjoying it for 2 months.
As we move from tailors to big box stores I think we have to get used to getting what we get, rather than feeling we can nitpick every single detail.
I'd also be more interested in how his 3rd, 4th or 5th vibe coded app goes.
I would find it a bit tricky to write a full test suite for a product without any code though. You'd need to understand the architecture a bit and likely end up assuming, or mocking, what helpers, classes, config, etc will be built.
The interesting shift seems to be that building the first version is no longer the bottleneck — distribution, UX polish and reliability are.
The old rules still apply mainly.
The details and pitfalls that are unique to your specific scenario, that you only discover by running into them.
And yet this less obvious, more uncommon stuff is also what AI will be weakest at.
I would say the remaining 10% are about how robust your solution is - anything associated with 'vibe' feels inherently unsecure. If you can objectively proof it is not, that's 10 % time well spend.
EXCEPT... you've just vibe coded the first 90 percent of the product, so completing the remaining 10 percent will take WAY longer than normal because the developers have to work with spaghetti mess.
And right there this guy has shown exactly how little people who are not software developers with experience understand about building software.
To have a polished software project, you must spend time somewhat menially iterating and refining (as each type of user).
To have a polished software project, you need to have started with tests and test coverage from the start for the UI, too.
Writing tests later is not as good.
I have taken a number of projects from a sloppy vibe coded prototype to 100% test coverage. Modern coding llm agents are good at writing just enough tests for 100% coverage.
But 100% test coverage doesn't mean that it's quality software, that it's fuzzed, or that it's formally verified.
Quality software requires extensive manual testing, iteration, and revision.
I haven't even reviewed this specific project; it's possible that the author developed a quality (CLI?) UI without e2e tests in so much time?
Was the process for this more like "vibe coding" or "pair programming with an LLM"?
Again, I haven't even read this particular project;
There's:
Prompt insufficiency: Was the specification used to prompt the model to develop the software sufficient in relation to what are regarded as a complete enough software specifications?
Model and/or Agent insufficiency,
Software Development methods and/or Project Management insufficiency,
QA insufficiency,
Peer review sufficiency;
Is it already time to rewrite the product using the current project as a more sufficient specification?
But then how many hours of UI and business logic review would be necessary again?
A 40-hour work year has 2,080 hours per person per year.
The "10,000" hours necessary to be really good at anything number was the expert threshold that they used to categorize test subjects who performed neuroimaging studies while compassion meditating. "10,000" hours to be an expert is about 5 years at full time.
But how many hours to have a good software product?
Usually I check for tests and test coverage first. You could have spent 1,000 hours on a software project and if it doesn't have automated tests, we can't evolve the software and be sure that we haven't caused regressions.
those are not copies, they aren't even features. usually part of a tiny feature that barely works only in demo.
with all vibe coding in the world today you still need at least 6 months full time to build a nice note taking app.
If we are talking something more difficult - it will be years - or you will need a team and it will still take a long time.
Everything less will result in an unusable product that works only for demo and has 80% churn.
It depends entirely on what you want. You can literally code a JavaScript 1-liner that will make a <textarea> then put the content back in the URL and it will work serverless on pretty much any platform with a Web browser.
You can also write a note taking app that will be federated yet private, that will have its own scripting language, etc. I mean you can yak-shave your way to write your own OS or even designing your own CPU for that.
So... I'm not sure that metric, time, means much without a proper context, including who does it. It's quite different if to do that, regardless of the tooling used, if you are a professional developer, designer, fullstack dev, prototypist, PM, marketer, writer, etc.
sure. does your note taking app supports formatting? you don't need it today. you will need it at some point. images? same.
does it handle file corruption etc? no? then its pretty much useless.
does it work across devices? in modern world, again, it is pretty much useless without it
it works across devices? then it needs hosting. if it is hosted it needs auth, it needs backups
you can go on for ever.
the bar for very minimal note taking app that you actually will use is very high, with other software it is even higher.
and this is not even state of art, this is must haves
And even so if your starting a note taking app most of those problems like file corruption and image support are largely solved problems. There is also the benefit of being able to reference tons of open source implementations.
I think one month to notion like app that is prod ready if you just need Auth + markdown + images + standard text editing
Bad example, note apps loaded with features are anti-productive and are for people who treat note taking as a hobby itself.
You have Obsidian anyway if you want something open source to work with.
When everyone is able to make their own one off prototype in 30 minutes, no one will pay for the thing that took someone 6 months.
there is very very rare use case when diy makes sense. in 99% of cases its just a toy that feels nice as you kinda did it. but if you factor in the time etc it is always costs 100x more than $5/month you could usually buy
There are some good points here to improve harnesses around development and deployment though, like a deployment agent should ask if there is an existing S3 bucket instead of assuming it has to set everything up. Deployment these days is unnecessarily complicated in general, IMO.
I have to say its a little sad that so many devs think of security and cryptography in the same way as library frameworks. In that they see it as just some black box API to use for their projects rather than respecting that its a fully developed, complex field that demands expertise to avoid mistakes.
[Disclaimer: that I have read. Doesn't mean there weren't others.]
Too bad it's about NFTs but we can't have everything, can we?