In case the topic of memory safety is interesting to anyone I've been experimenting with using AI agents to port common web infra projects to safe/ performant Rust. Somewhat inspired by the Bun port - was thinking that at some point memory safety might be such a big deal that people just need drop in replacements.
- Valkey/ Redis port here https://github.com/ianm199/valdr (passes ~99% of single node test suite, real prod features like replication/ clustering/ HA early or not implemented)
- Further along port of Lua 5.1-5.5 https://github.com/ianm199/lua-rs-port/tree/main
- I have a less developed nginx version that would be the north star
- These projects are very alpha at the moment
If anyone is interested in getting involved in this or has done similar experiments I'd love to collaborate! There is so much variation in how you can run these large scale agent fleets I don't think anyone has a perfect system yet.
julianlam 45 minutes ago [-]
Respectfully, as an OSS maintainer (not to the scale of nginx or valkey, of course)... if a third-party used an AI agent to rewrite my software in a different language, that gives me absolutely no reason to support that new project.
It is in all respects foreign code in a language I may or may not be familiar with, and worse yet, if I were to take over, I'd be responsible for maintaining the whole black box forever more?
Thank you but no thanks.
baq 38 minutes ago [-]
I don’t think anyone expects you to TBH. If you show interest, great. If not, the robot will translate your work into a different form of expression anyway. If you’re releasing open source software under BSD-like licenses, it’s still better than some company taking your work and selling it with zero value contributed back.
rxhampton 28 minutes ago [-]
This kind of nihilistic argument worked in 2000 when open source software was a counter movement.
patates 25 minutes ago [-]
I don't understand how it's nihilistic and how it doesn't work in 2026.
ianm218 37 minutes ago [-]
Yes I hope this can be separated from people who are inundating OSS maintainers with slop PRs - these are fully separate projects with zero expectation of involvement from maintainers. Valkey itself is forked off the original Redis.
There might be a world where people soon just find unsafe C code exposed to the web (i.e. nginx) an untenable situation and I hope it can be a helpful resource.
Anyway, I see open source code as positive sum. Maybe in the end only a small community that cares about cross compilation finds this helpful, and that's a win!
dyauspitr 38 minutes ago [-]
Why would you have to take it over? Wouldn’t it just be a fork/different project entirely.
notnullorvoid 41 minutes ago [-]
I love Rust, but porting others software to Rust (or any language) is a mixed bag. I'm a strong believer that good software requires deep domain knowledge to build and maintain. Porting code you don't understand by hand already risks still not understanding it afterwards, doing it in an automated fashion all but guarantees it.
All that to say I think these automated ports are interesting experiments. However if you want to build something people can trust, the people need to be able to trust that you fully understand what is built, and why it's built the way it is.
dingnuts 1 hours ago [-]
[dead]
rxhampton 1 hours ago [-]
If someone is porting such disparate projects as Valkey and Lua it is just for show and will be pre-alpha forever.
No one wants Bun in Rust, no one wants the rsync vibe code additions. This is just the only pro-AI comment, so the AI people voted it to the top.
ianm218 47 minutes ago [-]
Hmm my reasoning was that in order to have Nginx work in Rust you want to expose scripting so having a decent Lua in Rust is key to not call out to C for that. And then Valkey/ Redis is a lot simpler than nginx so it was a good way to learn how some of this works.
And I'd disagree on no one wants - Lua is quite helpful since it is easily used in WASM. There has been some interest from people in the Bevy community - a game engine in Rust - since you can't have Lua scripting in browser games easily with the C version.
I don’t care if bun is written in zig, rust, go, f# or sql. If it works, it works.
I also don’t care if it’s written by humans or LLMs or robot overlords from Alpha Centauri. Again, if it works, it works.
The operative word here is ‘works’. Code is now cheap, QA still isn’t. Since people don’t really like doing the same thing twice, specs for working code have never been written. Nowadays there is no reason to not create a spec detailed enough for robots to make no mistakes (pun intended) when filling in the gaps when converting from spec space to code space. As long as this remains true, I don’t care who or what does the boring parts.
mekpro 3 hours ago [-]
It’s clear that Anthropic has run out of the compute capacity needed to serve Mythos publicly.
They’re using security concerns to mask their inability to deliver the model at scale, while still trying to maintain their lead over OpenAI. As a result, they’ve chosen to release it privately under the banner of an “ethical” rollout.
NiloCK 1 hours ago [-]
I find this line of reasoning highly dubious.
Yes, Anthropic is compute constrained, even after the SpaceX Colossus deal.
But supply constraints are the normal operating mode of any market. Anthropic could choose to serve whatever models it pleases at whatever price points it chooses and let the market decide where the value is.
If Mythos at $X overwhelms their capacity, they could just charge $X+1. If still overwhelmed, there are larger prices as well.
mattnewton 1 minutes ago [-]
No insider info, but just wanted to mention that pricing signals things too. If Mythos is only servable at $X*Y dollars and isn’t Y times better than $X of compute at another provider, it’s quite possible that affects the IPO price negatively versus the halo of having the worlds most expensive model that is “too powerful to release” unpriced and unbenchmarked.
deaton 10 minutes ago [-]
The question is, will anyone pay enough for Mythos to offset the opportunity cost of offering that much Opus? You don't want to end up in a spot where you don't have enough compute and your service's reliability degrades to an unusable state like xAI.
malfist 1 hours ago [-]
And then the bubble would collapse. Corps are already putting limits on token usage across the board because of costs. Increasing costs would significantly contract the hype bubble.
tiahura 57 minutes ago [-]
Sort of, but valuation models depend on X being in a certain range. If it > this range, revenue and therefore valuation are impacted.
jb_briant 3 hours ago [-]
It is not "clear", as your comment suggests, it's hidden. Which is semantically the opposite of clear. Regarding your theory, might be true, might be false. But it's highly speculative.
Forgeties79 1 hours ago [-]
All of us, including you, know that he is not saying "they are being transparent." When someone says "it's clear that..." in this way they're saying "It's clear to us what is really happening here."
jb_briant 1 hours ago [-]
It's not clear, there is no tangible proof that Mythos is not released because they don't have compute power to serve it. Saying that would imply that the "too dangerous" is a lie. Nobody has proof. It can feel "clear" for you, but it's not. Hence, I correct it.
jb_briant 54 minutes ago [-]
Yes I got how they used the phrase. And it was wrong, so I wanted to react. Thanks for your addition, it dissipates any doubt on the intention of OP: he thinks Anthropic is hiding the lack of power by pretending it's too dangerous. But he is wrong to assume that without proof, hence my reaction.
Forgeties79 59 minutes ago [-]
Agreed, but I'm talking about how they are, very clearly, using the phrase.
WhitneyLand 1 hours ago [-]
The not clear comment is valid by either interpretation.
To a lot of us it’s not clear that’s what’s happening. It’s speculation and one possibility.
It may also be a secondary consideration and not the primary gating factor.
Anthropic has had their missteps but it’s still plausible to take what they say at face value.
jb_briant 58 minutes ago [-]
I agree, saying "it's clear" when at best, "it's plausible" doesn't let the conversation happen.
And pretending to know what is going on behind the scene, anon on HN is not credible
baq 33 minutes ago [-]
I had to patch my Linux boxes daily at some point in the past couple months. I don’t want Mythos to be publicly released for as long as it is economically feasible for Anthropic. I hope they have a gentleman’s agreement with OpenAI and DeepMind about this, too.
Chinese labs will force their hands, until then let’s hope maximum number of projects get patched at a reasonable pace.
atleastoptimal 18 minutes ago [-]
Why do you think that? All these rumors about compute constraint just seem like speculation and not based on any data or information. All they would need to do is increase their prices to free up compute capacity.
simonw 3 hours ago [-]
They started Glasswing before they struck that $1.25B/month deal with xAI/SpaceX for their (notoriously dirty) Memphis data centers.
So they have a whole lot more compute now than they did last month.
mekpro 3 hours ago [-]
Yes, 300 MW from SpaceX helps a lot, but I think that’s mainly to support Opus demand, which has grown faster than expected. If Mythos is roughly 5× more expensive to serve than Opus, as the pricing suggests, then 300 MW is nowhere near enough to enable large-scale deployment of Mythos.
As an ordinary developer who relies on a $20–$200/month subscription, I feel disappointed by the release of a paper describing a model that I can’t actually use.
aspenmartin 2 hours ago [-]
Ok but they can easily upsell this to enterprise customers at a market price reflective of their capacity constraints. Big corps would pay it, this is clearly a major update.
nickthegreek 3 hours ago [-]
But that compute might not be available to them long term. Hard to make big moves with a contract like that.
simonw 2 hours ago [-]
I don't know if any of the big AI labs have confidence in planning for the long term.
For all they know they'll find a new optimization that lets them serve Opus class models for half the computing cost next month. Or someone will invent the next OpenClaw and demand will 10x over night.
cobolcomesback 3 hours ago [-]
So why is OpenAI also releasing 5.5-Cyber in a private manner? Are they also out of compute?
LiamPowell 1 hours ago [-]
OpenAI has been pulling this marketing trick for years. Remember how GPT-3 was too dangerous to release? It's also probably bad PR if script kiddies have access to a GPT model with no guardrails, even if it doesn't enable any significant attacks.
signatoremo 57 minutes ago [-]
I suppose you meant GPT-2, but for years? Did they say the same about subsequent models?
LiamPowell 45 minutes ago [-]
They did it for 2 and 3, however it looks like they didn't for 4 and 5.
I bet Huawei and co would be happy to sell them some cheapo chips for inference!
notahacker 3 hours ago [-]
The security concerns argument would have worked better if a forum full of people hadn't promptly obtained access by the extremely sophisticated tactic of guessing its URL...
cute_boi 1 hours ago [-]
Also, they just want to jack up the price by creating sensation.
lossolo 3 hours ago [-]
Probably. This is an 8-12 trillion-parameter model, which is why it costs so much, that is also a major reason, besides RL and synthetic data, why it suddenly gained these new capabilities. They claim it was not fine-tuned or trained specifically for cybersecurity, but is instead a general purpose model.
benashford 3 hours ago [-]
[dead]
mentalgear 4 hours ago [-]
Here's my big fear: Even IF (and that's a BIG if) we get all critical vulnerabilities fixed in tech (before adversarial/state-actors turn up with open attack models) - we still have (in at least a year) models that will be so good in social engineering that they can still (given enough tokens) gain access to whatever system they want.
If society can't trust banks and other institutions to safely control their data, what follows ?
Do we collectively switch off the internet?
colechristensen 4 hours ago [-]
Social engineering as a problem goes away when anybody can get a model to do it for them for $5. It stops being possible, it's really the bank's problem when they can't have a minimum wage call center or a robot responsible for people's data.
p-e-w 4 hours ago [-]
Yes. There will be a few high-profile incidents, and then institutions will be forced to stop performing administrative actions based on people’s word.
applfanboysbgon 4 hours ago [-]
This outcome is massively detrimental to humanity at large. By eliminating the human factor from support, you make it impossible to get support in edge cases that fall outside of the pre-planned bureaucratic process. Everyone already hates that Google can arbitrarily ban anybody they please with no way to get in contact with a human, and you want to extend that to banks in control of people's life savings?
hallway_monitor 4 hours ago [-]
I don't think anyone is saying that. You will just need to be authenticated before giving any commands to the bank. Maybe some type of TOTP that you can use over the phone or in person.
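For readers unfamiliar with TOTP, here is a minimal sketch of how a bank could check a code read out over the phone. It follows the public RFC 4226 (HOTP) and RFC 6238 (TOTP) algorithms using only the Python standard library. The function names, the 30-second step, and the one-step drift window are illustrative assumptions, not any bank's actual scheme.

```python
import hashlib
import hmac
import struct


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """RFC 4226 HOTP: HMAC-SHA1 over an 8-byte big-endian counter."""
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation offset
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, unix_time: int, step: int = 30, digits: int = 6) -> str:
    """RFC 6238 TOTP: HOTP where the counter is the current time step."""
    return hotp(secret, unix_time // step, digits)


def verify(secret: bytes, submitted: str, unix_time: int,
           step: int = 30, window: int = 1) -> bool:
    """Accept codes from neighbouring time steps to tolerate clock drift.

    The window size is an assumption chosen for illustration.
    """
    for drift in range(-window, window + 1):
        t = unix_time + drift * step
        if t < 0:
            continue  # no time step exists before the epoch
        if hmac.compare_digest(totp(secret, t, step), submitted):
            return True
    return False
```

The shared secret lives on the customer's authenticator and on the bank's server, so a caller who can only talk convincingly has nothing to present. It does not address the lost-device case raised in the replies.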
applfanboysbgon 3 hours ago [-]
That is the exact problem. You have identification tied to your device. Your device is lost or stolen. Now you can't access your bank account. Human support can help you out by finding flexible ways to ascertain your identity. This is the angle social engineers exploit, tricking employees trying to be helpful to abuse that area of flexibility. You can take away human judgment and all flexibility in the system, and that will make the system more secure, but it also results in a deeply uncaring system that makes life harder for people. Rigid bureaucracy doesn't do a good job of accounting for a house fire destroying everything you own or your e-mail provider shutting down; these are fringe cases but they do happen and there are positive resolutions available as long as human discretion is involved.
DANmode 2 hours ago [-]
No.
You don’t tie it to “your device”.
You tie it to your security key.
Which is treated like a credit card.
and your extended family, friends, or volunteers can act as social proof to allow you back into your accounts,
if your key burns up, it breaks and you were too cool to provision a backup, etc.
pesus 1 hours ago [-]
Credit cards are lost and stolen all the time, and it isn't really a big deal when it happens, since charges can usually be easily reversed. This does not sound like the same scenario. It also doesn't account for people who lack friends/family nearby or at all.
> it breaks and you were too cool to provision a backup
If we're relying on the average person to back things up properly, this idea is doomed from the start.
DANmode 41 minutes ago [-]
> If we're relying on the average person to back things up properly, this idea is doomed from the start.
The average person is relying on the average person, for everything, and I agree, they are doomed from the start.
Tech-related items inclusive.
DANmode 42 minutes ago [-]
Yes, you wouldn’t offer your private key to a random food truck.
Just new banks.
Same as people being unafraid of their car key being cloned - because they don’t hand it around the general public.
repeekad 4 hours ago [-]
> Everyone already hates that Google can arbitrarily ban people
Yet they’re still the predominant search engine. Sadly, the concerns of the few don’t interest monopolistic profit seekers without forced regulation. Think of how airlines are legally required to give refunds for delayed flights; there’s a reason it required legislation.
lern_too_spel 3 hours ago [-]
The government should be in charge of ID Provider infrastructure and has local offices (postal) that can establish physical identity (and already do for people who need to travel abroad), but the religiously affiliated NWO conspiracy theorists have made this politically infeasible in the US, so we have unsavory private sector providers like World ID stepping in.
827a 4 hours ago [-]
GPT-5.5-Cyber has already at least hit if not surpassed Mythos capability in cyber tasks. The only reason they're holding back is that once it's out everyone would realize that its capabilities were a step change in March but are not anymore, yet it costs significantly more and is much slower.
john_strinlai 4 hours ago [-]
how did you go about assessing this?
jansan 4 hours ago [-]
So you believe one marketing department more than the other?
I believe the correct way to interpret AISI’s findings is that both Mythos and 5.5-Cyber are capable of solving their full benchmark (the only two models that can); Mythos does it with fewer tokens and more consistently.
Two things of note: 5.5-Cyber is likely to be substantially cheaper than Mythos, given it is priced around Opus. Additionally: AISI has never tested OpenAI’s best public model and actual Mythos competitor: 5.5-Pro.
aspectop 4 hours ago [-]
i think anthropic is being performative here, creating a hype for mythos and not releasing. i guess this is all a marketing thing to sell a security specialized AI to enterprise and startups at a way larger cost coz security market is deep in money.
skybrian 2 hours ago [-]
This is just cover for being sore that you don’t have access yet <- see what I did there?
People and organizations can have mixed motivations. It’s often not “just” one thing.
This feels more and more like a marketing/scarcity play for the largest global corps.
Will likely give them time to expand capacity as well. And make them harder to dislodge in these orgs.
aspenmartin 3 hours ago [-]
To me this makes little sense. I can’t imagine the orgs they have limited this rollout to don’t already have Claude subscriptions and integrations. And sure, this may play nicely into branding and build a mystique around the model, but ultimately they are missing out on a ton of revenue and risking being totally front-run now that model performance parameters are out and people have firsthand experience. Feels more like a fairly genuine attempt to be responsible. They could have easily rolled out an update and done some PR to absolve themselves of responsibility.
jb_briant 3 hours ago [-]
Urgency x scarcity, unbeatable marketing move.
bushido 3 hours ago [-]
It is really good. Will also cut through the common procurement, legal and change management processes seen at these orgs.
jb_briant 3 hours ago [-]
Genius^2
yanis_t 4 hours ago [-]
Is there any evidence Mythos is qualitatively better than the Opus 4.x?
I'm afraid that the usual mantra that "we just need more scale", which worked well for attracting investment, is not working anymore: bigger models provide marginal improvements while naturally getting much more expensive to run.
Is this why both Anthropic and OpenAI are rushing for IPOs this year?
rfgplk 7 minutes ago [-]
It probably isn't, at least in terms of security or memory safety. The current models can already sniff out all memory vulnerabilities with relative ease, you can't really beat that.
alasano 4 hours ago [-]
From what I've read so far it's less about Mythos being much better at tasks in isolation.
Security wise, it's about being able to find and chain multiple vulnerabilities to actually create viable exploits.
So I would imagine that if you were using it for regular software development you may not feel that it's that different unless used in a particular way?
testfrequency 23 minutes ago [-]
Mythos gives BIG Tesla FSD energy, I’m over it
atleastoptimal 19 minutes ago [-]
What does that even mean?
aplthrowaway67 4 hours ago [-]
How "altruistic" of them. If only Anthropic extended this level of care to the environment or the economy.
iamniels 3 hours ago [-]
What's currently the open source project that comes closest to Mythos capabilities?
adrian_b 3 hours ago [-]
No single open weights model comes close to either Mythos or GPT 5.5.
Nonetheless, running many of the open weights models over a codebase, with an appropriate harness, can provide about the same vulnerability coverage (i.e. each of the open weights models would find a subset of what Mythos or GPT 5.5 could find, but the subsets are not the same).
Despite needing more runs and more time, this may be significantly cheaper, especially if the models are self hosted.
Based on what Anthropic said about Mythos, they also use a quite elaborate harness for finding bugs and vulnerabilities, i.e. not a simple prompt like "find the bugs".
They run Mythos repeatedly on each file of the codebase, many times. They start with more generic prompts, used to determine whether a more thorough analysis of that file is worthwhile. Then they use more specific prompts, to detect various classes of bugs. After it becomes probable that a certain bug exists, they do a final run where the prompt requests a confirmation of the already known bug, perhaps together with a proposed patch or a PoC exploit.
Therefore the efficiency of finding vulnerabilities depends a lot on the harness, not only on the LLM. Also, searching vulnerabilities in a big codebase when paying per token is very expensive, because it requires many runs of the LLM.
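The staged harness and the "union of subsets" idea described above can be sketched roughly as follows. This is an illustration only, not Anthropic's actual harness: `Finding`, `scan_file`, `scan_codebase`, `make_stub`, the prompt wording, and the bug-class list are all hypothetical, and a "model" here is simply any callable that takes a prompt and a file's source text and returns an answer string.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Iterable

# Hypothetical model interface: (prompt, source) -> answer text.
Model = Callable[[str, str], str]

# Illustrative bug classes; a real harness would choose its own.
BUG_CLASSES = ("buffer-overflow", "use-after-free", "integer-overflow")


@dataclass(frozen=True)
class Finding:
    path: str
    bug_class: str


def scan_file(model: Model, path: str, source: str) -> set[Finding]:
    # Stage 1: generic triage prompt decides whether to look deeper.
    triage = model("Is this file worth a deeper security review? Answer yes or no.", source)
    if "yes" not in triage.lower():
        return set()
    findings: set[Finding] = set()
    for bug_class in BUG_CLASSES:
        # Stage 2: specific prompt per bug class.
        suspect = model(f"Does this file contain a {bug_class}? Answer yes or no.", source)
        if "yes" not in suspect.lower():
            continue
        # Stage 3: confirmation run for the now-probable bug.
        verdict = model(
            f"Confirm the suspected {bug_class} and propose a patch. "
            "Start your answer with CONFIRMED or REJECTED.",
            source,
        )
        if verdict.strip().upper().startswith("CONFIRMED"):
            findings.add(Finding(path, bug_class))
    return findings


def scan_codebase(models: Iterable[Model], files: dict[str, str], runs: int = 1) -> set[Finding]:
    # Merge findings across models and repeated runs: each model may
    # find a different subset, so the union gives broader coverage.
    merged: set[Finding] = set()
    for model in models:
        for _ in range(runs):
            for path, source in files.items():
                merged |= scan_file(model, path, source)
    return merged


def make_stub(known: set[str]) -> Model:
    """Stand-in for a real LLM call; it only 'sees' the bug classes in `known`."""
    def model(prompt: str, source: str) -> str:
        if prompt.startswith("Is this file"):
            return "yes"
        hit = any(k in prompt for k in known)
        if prompt.startswith("Confirm"):
            return "CONFIRMED: patch omitted" if hit else "REJECTED"
        return "yes" if hit else "no"
    return model
```

With two stubs that each recognise a different bug class, `scan_codebase([stub_a, stub_b], files)` returns both findings while either stub alone returns one. The cost point also shows up directly: every file costs at least one model call per model per run, and up to seven with the three bug classes above.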
catigula 42 minutes ago [-]
I still find it funny that GPT-5.5 is just as good as Mythos and yet Anthropic likes to make things worse than they actually are.
3sk_ask8 2 hours ago [-]
Anthropic has the marketing of a weight loss product.
- They still claim 10000 issues, but they found only one in curl.
- They did not find rsync issues but Claude rather introduced rsync issues.
- Facebook is a member of this cult program but Mythos did not find the account takeover flaw.
- Mythos did not find the issues in Anthropic's own Bun rewrite.
They will not release Mythos because it would be exposed as a fraud before the IPO.
rfgplk 4 minutes ago [-]
It's just pure marketing, and most people are falling for it. The primary issue stems from their definition of "vulnerability". Most C code will be _swimming_ in vulnerabilities depending on how you analyze it (ie function that accepts a pointer but doesn't validate -> potential vulnerability right there). The only thing that matters is if it's de facto exploitable or not.
andrewjneumann 4 hours ago [-]
They keep writing like they stand to profit from this or something. Too many “coulds” in there for me too, this could be an amazing advancement and it could be nothing… normally we look at data and last headline I saw was 25 “high” vulnerabilities at the cost of $1 million in tokens.
No comparison to human teams, and I’m sure that $1 million in tokens was used by humans, in a team. So like most AI, they’ve developed a tool that capable people can use to be better, but unlike most tools, they’re claiming this to be outright magic. The magic is the hype train.
jb_briant 4 hours ago [-]
Step 1: claim you created a tool so dangerous you can't release it
Step 2: offer to test it, but only for the biggest companies in the world
Step 3: onboard those big players on your tooling and product
Step 4: profit
This is genius.
estearum 3 hours ago [-]
And all you have to do is demonstrate unique value during the pilot phase!
Err... wait... that was already the hard part... hmm
jb_briant 3 hours ago [-]
Genius marketing move doesn't mean there is no value.
It means that even if the value you offer is similar to your competitors', you are the one conquering the market.
That's the only way to avoid becoming a commodity.
skybrian 2 hours ago [-]
It’s true that providing security services to so many organizations will likely put them in a position to earn lots of money. It makes them an essential service, sort of like what happened with Cloudflare and denial-of-service attacks. (There are competitors, but they’re the first company people think of.)
But I think that downplays the importance of having a good product. If the product didn’t work, this would be a good way to lose trust with a lot of organizations in a hurry.
sandeepkd 2 hours ago [-]
This is a circular economy that makes everyone look good. Almost all of these enterprise companies are sitting on top of so much tech debt that in any realistic scenario they can't really patch vulnerabilities, even if there are only a dozen or so. A lot of these companies would not even let their most valuable code be ingested by LLMs.
At this phase no company would risk their brand by calling the product as ineffective. The big players are in it together and small ones have no option but to play along.
Nevertheless collecting the historical wisdom and running it at machine scale does have a lot of benefits for sure. The only question is the signal to noise ratio, machine is doing what humans did, just at a multiplier speed and with a lot more context than what a normal human can hold.
jb_briant 1 hours ago [-]
Yeah, and apparently Mythos is pretty effective at finding critical issues. So it seems to be a good product served with a genius offer. Anthropic founding engineers are already comfortable, they will end up rich.
They did produce great value, claude code and opus 4.5 are a singularity in software engineering.
The job we practiced for decades simply doesn't exist anymore.
geodel 3 hours ago [-]
With a trillion dollars at stake they can hire the best of the best in sales and marketing. Unlike some hardcore hackers, whose ethics do not always point in the direction of more money, sales and marketing people are highly motivated by opportunities to make more money.
jb_briant 3 hours ago [-]
Our game is to craft shit, their game is to sell shit. You gotta respect the different tastes in the nature!
geodel 3 hours ago [-]
Yeah, Companies to buy shit and their employees to eat shit. Lion king would say it is great circle of life.
jb_briant 3 hours ago [-]
Here spoke the wise man!
cyanydeez 2 hours ago [-]
<stop hiring people>
Don't you understand, if they really did do the <ai magic> they don't need to hire anyone, IT SELLS ITSELF
aspenmartin 3 hours ago [-]
These companies are surely already onboarded…? They claim like 10k verified and high severity CVEs. Would you have preferred they just rolled it out like another opus update? You wouldn’t be insinuating in that situation that they were careless and reckless? They risk missing a boatload of revenue if openAI front runs them for a public launch. In what world is this some sort of scam??
jb_briant 3 hours ago [-]
Where did I use the word scam?
Marketing move doesn't mean scam. It describes the ability to sell people on a narrative and surpass your competitors in market share. And that's exactly what is happening.
My post is a "tribute" to the efficiency of Anthropic's communication.
I never complained about anything, nor calling it a scam, nor saying they should have released mythos to the public instead of rolling it out to a selected cohort.
You tried to expand my words to make me say something I didn't, because my post wasn't giving you a clear conclusion of my opinion regarding their private release.
aspenmartin 3 hours ago [-]
Ok you’re totally right, I read this as a cynical “this is all marketing” post ==> a scammy connotation. Without that read, your points are fairly valid, but are you still implying this is all a pure marketing tactic? If so I would still argue against that as a necessity but surely marketing could be heavily involved. But still: this could easily be a footgun. OpenAI will easily release the same model and now that Anthropic has taken the initiative to do a slower more contained rollout they wouldn’t need to do any of that. So from a business perspective I would still argue this whole glasswing initiative would make their sales and marketing department pretty nervous. I mean in a second-order branding sense sure this plays into the “we are ethical” ethos but it hardly seems worth the risk
jb_briant 1 hours ago [-]
I don't have enough elements to conclude if the world would collapse if Mythos was released publicly without Glasswing.
Nor publicly or in my internal reasoning. I rarely conclude without proof or very intense and clear intuition.
From a strategic PoV it makes sense to check if their model is dangerous, I wouldn't want to have my brand name associated with "NK hacker team find zero day in all linux servers of the web and ..."
baggachipz 1 hours ago [-]
Seems like they're not even close to step 4.
wslh 2 hours ago [-]
And put Chris Olah, Anthropic co-founder, sitting next to Pope Leo XIV presenting his first encyclical, Magnifica Humanitas, at the Vatican.
cyanydeez 2 hours ago [-]
>can't release it
can't release it to the plebs
jb_briant 1 hours ago [-]
Unsure about that, opus was already insanely good and we got it for a fractional cost via subscriptions.
They want the plebs, they want the mass.
fontain 4 hours ago [-]
“Mythos Preview continues a long-term trend that we’ve been warning about for some time: within 6 to 12 months […]”
The only trend Mythos continues is Anthropic’s trend of warning that disaster is always 6 to 12 months away.
jofzar 4 hours ago [-]
> The organizations in this new group are based in more than 15 countries
I mean most Nasdaq tech companies would be in 13+ countries. Why are they writing this like it's a big number when it's hilariously small?
newtonsmethod 4 hours ago [-]
I assume they're using a more candid definition where they're not counting all the countries a company may be based, but rather the primary country they're based in.
I don't think they're trying to flex this as a large number. They don't want to give an exact number, as that may change etc / is fuzzy, but also want to give you an idea of the scale.
They say "In the future, we intend to expand our geographical reach much further". I imagine this commentary is somewhat related to the concerns that AI will create an even worse "global underclass". AI developments are first accessible to Americans, then allies, and then later the whole world.
SpicyLemonZest 3 hours ago [-]
They're writing it in contrast to the previous scope, which doesn't seem to have been available to any organizations based outside the US. (There was news a few weeks ago about how Japanese banks were going to gain access, but based on the timing I think this announcement is that access.)
cmxch 4 hours ago [-]
That’s fine as long as I can identify and reject any Mythos derived patch as being irreproducible.
IanCal 4 hours ago [-]
Why would it not be reproducible?
astrange 2 hours ago [-]
How can a patch be "reproducible"? The testcases are reproducible.
philipwhiuk 4 hours ago [-]
It would have been nice to have a list of the 150, but I guess it would make them a hacking target?
cyanydeez 2 hours ago [-]
Expanding Project Glasswing (IPO)
andai 17 minutes ago [-]
[dead]
frays 4 hours ago [-]
[flagged]
Jtarii 4 hours ago [-]
Thanks for your input Claude.
devmor 4 hours ago [-]
I see that as not just a spam post, but a generated addition to the dead internet - a real win for us algorithms.
jwpapi 4 hours ago [-]
Ragebait god
mrbonner 3 hours ago [-]
Maybe it is just me: I feel Anthropic's most recent product announcements resemble more and more what IBM's tactics were at its height. For instance, the Watson AI hype after it defeated Kasparov. The difference is IBM actually wanted and let businesses buy and use Watson, as opposed to the time-released approach Anthropic takes to boost the hype even higher.
3sk_ask8 2 hours ago [-]
Deep Blue defeated Kasparov. The Watson hype was about winning Jeopardy, which is still kind of the only use case for current AI.
Rendered at 18:10:24 GMT+0000 (Coordinated Universal Time) with Vercel.
There might be a world where people soon find unsafe C code exposed to the web (e.g. nginx) an untenable situation, and I hope this can be a helpful resource.
Anyway, I see open source code as positive sum. Maybe in the end only a small community that cares about cross compilation finds this helpful, and that's a win!
All that to say I think these automated ports are interesting experiments. However if you want to build something people can trust, the people need to be able to trust that you fully understand what is built, and why it's built the way it is.
No one wants Bun in Rust, no one wants the rsync vibe code additions. This is just the only pro-AI comment, so the AI people voted it to the top.
And I'd disagree on no one wants - Lua is quite helpful since it is easily used in WASM. There has been some interest from people in the Bevy community - a game engine in Rust - since you can't have Lua scripting in browser games easily with the C version.
But anyway, whether people want it or not, memory safety might become much more important, so I think it is a good area to explore. Some people think large C codebases are inherently unsecurable https://alexgaynor.net/2020/may/27/science-on-memory-unsafet...
I also don’t care if it’s written by humans or LLMs or robot overlords from Alpha Centauri. Again, if it works, it works.
The operative word here is ‘works’. Code is now cheap, QA still isn’t. Since people don’t really like doing the same thing twice, specs for working code have never been written. Nowadays there is no reason to not create a spec detailed enough for robots to make no mistakes (pun intended) when filling in the gaps when converting from spec space to code space. As long as this remains true, I don’t care who or what does the boring parts.
They’re using security concerns to mask their inability to deliver the model at scale, while still trying to maintain their lead over OpenAI. As a result, they’ve chosen to release it privately under the banner of an “ethical” rollout.
Yes, Anthropic is compute constrained, even after the SpaceX Colossus deal.
But supply constraints are the normal operating mode of any market. Anthropic could choose to serve whatever models it pleases at whatever price points it chooses and let the market decide where the value is.
If Mythos at $X overwhelms their capacity, they could just charge $X+1. If still overwhelmed, there are larger prices as well.
To a lot of us it’s not clear that’s what’s happening. It’s speculation and one possibility.
It may also be a secondary consideration and not the primary gating factor.
Anthropic has had their missteps but it’s still plausible to take what they say at face value.
Chinese labs will force their hands; until then, let’s hope the maximum number of projects get patched at a reasonable pace.
So they have a whole lot more compute now than they did last month.
As an ordinary developer who relies on a $20–$200/month subscription, I feel disappointed by the release of a paper describing a model that I can’t actually use.
For all they know, they'll find a new optimization that lets them serve Opus class models for half the computing cost next month. Or someone will invent the next OpenClaw and demand will 10x overnight.
GPT-2: https://slate.com/technology/2019/02/openai-gpt2-text-genera...
GPT-3: https://www.itpro.com/technology/artificial-intelligence-ai/...
If society can't trust banks and other institutions to safely control their data, what follows?
Do we collectively switch off the internet?
You don’t tie it to “your device”.
You tie it to your security key.
Which is treated like a credit card.
and your extended family, friends, or volunteers can act as social proof to allow you back into your accounts,
if your key burns up, it breaks and you were too cool to provision a backup, etc.
> it breaks and you were too cool to provision a backup
If we're relying on the average person to back things up properly, this idea is doomed from the start.
The average person is relying on the average person, for everything, and I agree, they are doomed from the start.
Tech-related items inclusive.
Just new banks.
Same as people being unafraid of their car key being cloned - because they don’t hand it around the general public.
Yet they’re still the predominant search engine. Sadly, the concerns of the few don’t interest monopolistic profit seekers without forced regulation. Think of how airlines are legally required to give refunds for delayed flights; there’s a reason it required legislation.
They seem pretty close, in both average and "best run" scores. And, in a highly verifiable domain, "best run" or pass@n is what you're looking for.
Two things of note: 5.5-Cyber is likely to be substantially cheaper than Mythos, given it is priced around Opus. Additionally: AISI has never tested OpenAI’s best public model and actual Mythos competitor: 5.5-Pro.
People and organizations can have mixed motivations. It’s often not “just” one thing.
https://www.0xsid.com/blog/meta-account-takeover-fiasco
Will likely give them time to expand capacity as well. And make them harder to dislodge in these orgs.
I'm afraid that the usual mantra that "we just need more scale", which worked well for attracting investments, is not working anymore: bigger models provide marginal improvements while naturally getting much more expensive to run.
Is this why both Anthropic and OpenAI are rushing for IPOs this year?
Security wise, it's about being able to find and chain multiple vulnerabilities to actually create viable exploits.
So I would imagine that if you were using it for regular software development you may not feel that it's that different unless used in a particular way?
Nonetheless, running many of the open weights models over a codebase, with an appropriate harness, can provide about the same vulnerability coverage (i.e. each of the open weights models would find a subset of what Mythos or GPT 5.5 could find, but the subsets are not the same).
Despite needing more runs and more time, this may be significantly cheaper, especially if the models are self hosted.
Based on what Anthropic said about Mythos, they also use a quite elaborate harness for finding bugs and vulnerabilities, i.e. not a simple prompt like "find the bugs".
They repeatedly run Mythos on each file of the codebase. They start with more generic prompts, used to determine whether a more thorough analysis of that file is worthwhile. Then they use more specific prompts, to detect various classes of bugs. After it becomes probable that a certain bug exists, they do a final run where the prompt requests a confirmation of the already known bug, perhaps together with a proposed patch or a PoC exploit.
Therefore the efficiency of finding vulnerabilities depends a lot on the harness, not only on the LLM. Also, searching for vulnerabilities in a big codebase when paying per token is very expensive, because it requires many runs of the LLM.
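To make the staged flow described above concrete, here is a minimal sketch of that kind of harness. It is not Anthropic's actual harness: the `ask` callable, the `BUG_CLASSES` list, the prompt wording, and the function names `scan_file` and `scan_codebase` are all hypothetical, chosen only to show the triage, targeted, and confirmation passes.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical model interface: takes a prompt, returns the model's text reply.
AskModel = Callable[[str], str]

# Illustrative bug classes only; a real harness would use its own list.
BUG_CLASSES = ["memory safety", "injection", "authentication bypass"]


@dataclass
class Finding:
    path: str
    bug_class: str
    report: str


def scan_file(path: str, source: str, ask: AskModel, triage_runs: int = 3) -> List[Finding]:
    # Stage 1: generic triage, repeated several times.
    # Assumed policy: a single YES vote is enough to escalate the file.
    votes = sum(
        ask(f"Is {path} worth a deeper security review? Answer YES or NO.\n\n{source}")
        .strip().upper().startswith("YES")
        for _ in range(triage_runs)
    )
    if votes == 0:
        return []

    findings: List[Finding] = []
    for bug_class in BUG_CLASSES:
        # Stage 2: one targeted prompt per bug class.
        suspicion = ask(
            f"List any suspected {bug_class} bugs in {path}, or reply NONE.\n\n{source}"
        )
        if suspicion.strip().upper() == "NONE":
            continue
        # Stage 3: a final run that asks only for confirmation of the suspected bug.
        verdict = ask(
            f"Confirm or refute this suspected bug in {path}. "
            f"Start with CONFIRMED or REFUTED.\n\nSuspicion: {suspicion}\n\n{source}"
        )
        if verdict.strip().upper().startswith("CONFIRMED"):
            findings.append(Finding(path, bug_class, verdict))
    return findings


def scan_codebase(files: Dict[str, str], ask: AskModel) -> List[Finding]:
    # Every file gets its own multi-stage scan, which is why per-token cost adds up.
    return [f for path, src in files.items() for f in scan_file(path, src, ask)]
```

Even in this toy version, a file that passes triage costs `triage_runs` calls plus up to two calls per bug class, which illustrates why the number of model runs, and therefore the bill, grows quickly on a large codebase.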
- They still claim 10000 issues, but they found only one in curl.
- They did not find rsync issues but Claude rather introduced rsync issues.
- Facebook is a member of this cult program but Mythos did not find the account takeover flaw.
- Mythos did not find the issues in Anthropic's own Bun rewrite.
They will not release Mythos because it would be exposed as a fraud before the IPO.
No comparison to human teams, and I’m sure that $1 million in tokens was used by humans, in a team. So like most AI, they’ve developed a tool that capable people can use to be better, but unlike most tools, they’re claiming this to be outright magic. The magic is the hype train.
Step 2: offer to test it, but only for the biggest companies in the world
Step 3: onboard those big players on your tooling and product
Step 4: profit
This is genius.
Err... wait... that was already the hard part... hmm
It means that even if the value you offer is similar to your competitors', you are the one conquering the market.
That's the only way to avoid becoming a commodity.
But I think that downplays the importance of having a good product. If the product didn’t work, this would be a good way to lose trust with a lot of organizations in a hurry.
At this phase no company would risk their brand by calling the product ineffective. The big players are in it together and small ones have no option but to play along.
Nevertheless, collecting the historical wisdom and running it at machine scale does have a lot of benefits for sure. The only question is the signal to noise ratio: the machine is doing what humans did, just at a multiplied speed and with a lot more context than a normal human can hold.
They did produce great value, claude code and opus 4.5 are a singularity in software engineering.
The job we practiced for decades simply doesn't exist anymore.
Don't you understand, if they really did do the <ai magic> they don't need to hire anyone, IT SELLS ITSELF
A marketing move doesn't mean a scam. It describes the ability to sell people on a narrative and surpass your competitors in market share. And that's exactly what is happening.
My post is a "tribute" to the efficiency of Anthropic's communication. I never complained about anything, nor called it a scam, nor said they should have released Mythos to the public instead of rolling it out to a selected cohort.
You tried to expand my words to make me say something I didn't, because my post wasn't giving you a clear conclusion of my opinion regarding their private release.
Nor publicly or in my internal reasoning. I rarely conclude without proof or very intense and clear intuition.
From a strategic PoV it makes sense to check if their model is dangerous, I wouldn't want to have my brand name associated with "NK hacker team find zero day in all linux servers of the web and ..."
can't release it to the plebs
They want the plebs, they want the mass.
The only trend Mythos continues is Anthropic’s trend of warning that disaster is always 6 to 12 months away.
I mean most NASDAQ tech companies would be in 13+ countries. Why are they writing this like it's a big number when it's hilariously small?
I don't think they're trying to flex this as a large number. They don't want to give an exact number, as that may change etc / is fuzzy, but also want to give you an idea of the scale.
They say "In the future, we intend to expand our geographical reach much further". I imagine this commentary is somewhat related to the concerns that AI will create an even worse "global underclass". AI developments are first accessible to Americans, then allies, and then later the whole world.