Assessing Claude Mythos Preview's cybersecurity capabilities (red.anthropic.com)

65 points by sweis 1 hour ago | 7 comments

staticassertion 10 minutes ago [-]

I'd love to see them point at a target that's not a decades-old C/C++ codebase. Of the targets, only browsers should be considered hardened, and their biggest lever is sandboxing, which requires a lot of chained exploits to bypass. We're seeing that LLMs are fast to discover bugs, which means they can chain more easily. But bug density in these codebases is known to be extremely high, especially in the underlying operating systems, which are always the weak link for sandbox escapes.

Maybe you could argue OpenBSD is hardened too, but I don't really think so (not looking to debate it if you disagree).

I'd love to see them go for a wasm interpreter escape, or a Firecracker escape, etc. They say that these aren't just "stack-smashing" but it's not like heap spray is a novel technique lol

> It autonomously obtained local privilege escalation exploits on Linux and other operating systems by exploiting subtle race conditions and KASLR-bypasses.

I think this, for example, sounds more impressive than it is. KASLR has a terrible track record at preventing LPE, and LPE in Linux is incredibly common. Has anything changed here? I don't pay much attention, but KASLR was considered basically useless for preventing LPE a few years ago.

> Because these codebases are so frequently audited, almost all trivial bugs have been found and patched. What’s left is, almost by definition, the kind of bug that is challenging to find. This makes finding these bugs a good test of capabilities.

This just isn't true. Humans find new bugs in all of this software constantly.

It's all very impressive that an agent can do this stuff, to be clear, but I guess I see this as an obvious implication of "agents can explore program states very well".

rfoo 1 minutes ago [-]

> Mythos Preview identified a memory-corruption vulnerability in a production memory-safe VMM. This vulnerability has not been patched, so we neither name the project nor discuss details of the exploit.

Good morning Sir.

AntiDyatlov 39 minutes ago [-]

A very good outcome for AI safety would be if, when improved models get released, malicious actors used them to break society in very visible ways. Looks like we're getting close to that world.

sourcecodeplz 29 minutes ago [-]

Gives me Fight Club vibes.

awestroke 23 minutes ago [-]

This is becoming a bit scary. I almost hope we'll reach some kind of plateau for LLM intelligence soon.

websap 9 minutes ago [-]

If we don't innovate, someone else will. This is the very nature of being a human being. We summit mountains, regardless of the danger or challenge.

vonneumannstan 2 minutes ago [-]

> If we don't innovate, someone else will.

Terrible take. You don't get to push the extinction button just because you think China will beat you to the punch.

> This is the very nature of being a human being. We summit mountains, regardless of the danger or challenge.

No, just no... We barely survived the Cold War, at times because of pure luck. AI is at least as dangerous as that, if not more. Our capabilities have far outpaced our wisdom. As you have so cleanly demonstrated.
