I don't get the comments trashing this. If it slightly beats or even matches Opus 4.6, that means Meta is capable of building a model competitive with the leading AI company. Sure, they spent a lot of money and will have ongoing costs. But how much more work would it take to turn that into a coding agent people are willing to try (and pay for) alongside their usual collection of agents (Claude, Codex, etc.)?
It also means Meta doesn't have to pay another company to use a SOTA model across all their products (including IG, WhatsApp, and VR), which will matter to their balance sheet long term (despite the constant R&D spend).
eranation 24 seconds ago [-]
So this is why Anthropic rushed that weird "pre-responsible-disclosure-totally-not-for-marketing" announcement yesterday? To make sure Spark doesn't steal their thunder? (Spark beats Opus 4.6 on some benchmarks...) Or have I become a bitter, cynical old man?
gallerdude 48 minutes ago [-]
This would have been an amazing release 6 months ago. But the industry moves so fast that this is a forgettable release. Maybe it's best for Meta to sell their superintelligence division. I don’t think Zuck’s vision is particularly compelling.
dgellow 40 minutes ago [-]
I never understood why Meta decided to join the race. They don’t sell compute like Google or Microsoft. Why not let others do the hard work and integrate their LLMs into your systems if needed?
I assume it’s because they have Instagram, Facebook, WhatsApp, and Threads data and feel they should be the ones using it for training, but it’s really not obvious how having a frontier AI lab benefits their business.
observationist 16 minutes ago [-]
Adtech Money. They've got GPUs, they've got the infrastructure, and they've got the advertisement platform, and the point is getting AI that can exploit the adtech and create a flywheel effect, maximizing return from the data they collect from Insta, WhatsApp, Facebook, etc.
It's not just about LLMs; it's about being able to model consumers and markets and psychology and so on. Meta is also big on the manipulation side of things: any sort of cynical, technically-legal technological exploitation of humans you can imagine, they're doing it for profit.
eldenring 4 minutes ago [-]
Because there's a realistic chance this is the only important software technology going forward, and it commoditizes Meta's entire business, which is software.
bee_rider 9 minutes ago [-]
I think they just want to be a winner in the “next thing.” They hit social networking, but missed mobile operating systems and didn’t compellingly win at social media. Eventually an ambitious person with a bazillion dollars wants a clear win, right?
prodigycorp 5 minutes ago [-]
Zuck talked about how they had spare compute from the models they initially trained to chase after TikTok's AI efforts, and they repurposed it for training LLMs. They got carried away after seeing how viral ChatGPT went.
xnx 22 minutes ago [-]
Zuck is trying to convince himself he's good, and not just lucky.
gallerdude 39 minutes ago [-]
I’m sure there’s more to it than this, but it feels like Zuck has pet interests like VR and now AI.
alex1138 23 minutes ago [-]
But no account support, that's boring
Or any quality control (people missing posts)
Or banning the people who should be banned while leaving everyone else alone
You don't understand why Zuck, who paid $1B for Instagram when they had no revenue and 13 employees because he is paranoid about platform shifts, decided to join the race for what increasingly looks like the biggest platform shift in human history?
oceansky 8 minutes ago [-]
He also tried and failed to buy Snapchat, and then copied its features across all their big products: Instagram, Facebook, and even WhatsApp.
prodigycorp 5 minutes ago [-]
The way you put it, I understand it less. lol
zeroonetwothree 33 minutes ago [-]
But then how will Zuck win the billionaire dick measuring contest?
awestroke 35 minutes ago [-]
Because Zuck has chronic FOMO, he's said as much himself
gordonhart 45 minutes ago [-]
A new model comparable (ish) to the Claude/Gemini/GPT flagships is a big deal for the industry and for Meta even if it doesn't set the new frontier.
blahblaher 26 minutes ago [-]
Why would you use this instead of other, more proven models, unless it's significantly cheaper? The general population mostly wants it free, and the more professional users are willing to pay for better responses.
NitpickLawyer 4 minutes ago [-]
You wouldn't use this as an API. You would "use" it inside the Meta properties. Have a shop on FB Marketplace? Now you have copy, images, support, chat, translations, ERP, ESP, FPS, and all the other acronyms :) and so on for your mom-and-pop shop at $200/mo. Probably worse than, say, Claude/Gemini, but it's right there, one button away. "Click here to upgrade to AI++" or something.
gallerdude 40 minutes ago [-]
I’m not sure. If it was open source, certainly. But 4th place doesn’t really matter if you have nothing different to add.
lairv 14 minutes ago [-]
If the model is truly on par with Opus 4.6/Gemini 3.1/GPT 5.4 (beyond benchmarks) this still puts MSL in the frontier lab category, which is no small feat given that they pretty much rebooted last year
Many labs aren't able to keep up with the frontier: xAI, Mistral, etc.
datadrivenangel 23 minutes ago [-]
Fourth place means you're not reliant on any of the external providers for internal AI use, which is important for organizational health and negotiating with those other providers.
zozbot234 40 minutes ago [-]
Their new Contemplating mode gives this model a Deep Research ability (akin to existing models from GPT and Gemini) that might make it quite comparable to the just-announced Mythos.
solenoid0937 27 minutes ago [-]
Mythos is a much bigger pretrain; Contemplating is not the same thing.
zozbot234 23 minutes ago [-]
> Mythos is a much bigger pretrain
Do we have data to substantiate that claim?
solenoid0937 16 minutes ago [-]
It's pretty common knowledge. Spud is the only other PT comparable with Mythos.
Both Spud and Mythos can also scale via inference time compute.
Meta simply did not have enough compute online, long enough ago, to have a similar PT.
throwaw12 33 minutes ago [-]
> I don’t think Zuck’s vision is particularly compelling.
But he has to do it anyway; otherwise Meta can be disrupted easily.
Google and Apple have hardware and distribution channels for their products
Amazon has the marketplace and cloud
Microsoft has enterprise and cloud
Meta is always looking for ways to stay afloat
xnx 23 minutes ago [-]
Meta has 3.5 billion daily active users
throwaw12 13 minutes ago [-]
and has competitors like TikTok, Snapchat, YouTube, Netflix, X, HBO, and Amazon Prime, all fighting for that attention time.
They are worried something like Sora could disrupt them quickly.
throwaw12 35 minutes ago [-]
How is it that Meta spent so much money on talent and hardware, but the model barely matches Opus 4.6?
Especially looking at these numbers after Claude Mythos, it feels like either Anthropic has some secret sauce, or everyone else is dumber than the talent Anthropic has.
strulovich 29 minutes ago [-]
Meta made a bunch of mistakes, and it looks like Zuckerberg spent a lot of money on talent and made big swings to change course (that happened about a year ago).
I think it’s unrealistic to expect them to come back from that pit to the top in one year, but I wouldn’t rule them out getting there with more time. That’s a possible future. They have the money and Zuckerberg’s drive at the helm. It can go a long way.
coffeebeqn 16 minutes ago [-]
Matching Opus 4.6 would be pretty good? It’s the SOTA among actually-available models.
impulser_ 28 minutes ago [-]
It's not even on par with Sonnet. It's on par with open-source models, yet it's not even open source and sits behind a private preview API.
Might as well not release anything.
solenoid0937 29 minutes ago [-]
It's benchmaxxed.
If they actually matched Opus 4.6 on such a short timeline, it would have been mighty impressive. (Keep in mind this is a new lab and they are prohibited from doing distills.)
throwaw12 28 minutes ago [-]
how do you know it's benchmaxxed?
solenoid0937 15 minutes ago [-]
Friends at Meta with access to the model + personal experience at Meta.
Meta's performance process is essentially "show good numbers or you're out." So guess what people do when they don't have good numbers? They fudge them. Happens all across the company.
prodigycorp 10 minutes ago [-]
Meta's benchmaxxing tendencies are well known. Llama 4 was mega benchmaxxed, and nothing suggests to me that Meta's culture has changed.
username223 20 minutes ago [-]
Facebook is working with the talent that can’t find a job at some other company. It doesn’t surprise me they ship mediocrity.
zozbot234 30 minutes ago [-]
> has some secret sauce
Yup, it's called test-time compute. Mythos is reportedly much slower than Opus, enough to seriously annoy users trying to use it for quick-feedback-loop agentic work. It is most properly compared with GPT Pro, Gemini DeepThink, or this latest model's "Contemplating" mode. Otherwise you're just not comparing like for like.
throwaw12 26 minutes ago [-]
> it's called test-time compute.
Why can't others easily replicate it?
coder68 16 minutes ago [-]
I have not delved into the theory yet, but it seems that the smaller open-source models already do this to an extent. They have fewer parameters, but spend much more time/tokens reasoning as a way to close the performance gap. If you look at "tokens per problem" on https://swe-rebench.com/ it seems to be the case, at least.
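To make that trade-off concrete, here's a toy back-of-the-envelope sketch (every number is made up for illustration, and the assumption that per-problem cost scales roughly with active parameter count times tokens generated is a simplification): a model with 10x fewer active parameters can spend 4x more reasoning tokens per problem and still come out well ahead on cost.

```python
# Toy cost model: relative cost per problem ~ active params (in billions)
# times tokens generated. All figures below are hypothetical, chosen only
# to illustrate the shape of the small-model/more-thinking trade-off.

def cost_per_problem(active_params_b: float, tokens_per_problem: int) -> float:
    """Relative cost: billions of active parameters times tokens generated."""
    return active_params_b * tokens_per_problem

# Hypothetical frontier model: large, answers concisely.
big = cost_per_problem(active_params_b=300, tokens_per_problem=5_000)

# Hypothetical small open model: 10x fewer active params,
# but "thinks" 4x longer to close the accuracy gap.
small = cost_per_problem(active_params_b=30, tokens_per_problem=20_000)

print(big, small, big / small)  # → 1500000 600000 2.5
```

Under these made-up numbers the smaller model is still 2.5x cheaper per problem despite burning 4x the tokens, which is consistent with the pattern the swe-rebench numbers hint at.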
creddit 28 minutes ago [-]
Ran some of my internal benchmarks against this and I'm very unimpressed. I don't think this moves them into the OAI v Anthropic v Gemini conversation at all.
Major analytical errors in its responses to several of my technical questions.
creddit 17 minutes ago [-]
Playing with this some more, and it's actively not good. Basic mathematical errors riddle its responses. I did some basic adversarial testing where its responses are analyzed by Gemini, and Gemini finds basic math errors in every relatively simple ask I make (relative to what Opus, Gemini, or GPT can handle). Yikes.
1970-01-01 5 minutes ago [-]
I can remember when AOL was an unstoppable giant. Except it wasn't. People eventually realized they could get a better, cheaper, faster experience with ISPs and search engines. The same path is unfolding before Meta. People have much better options, and a plethora of Meta users will slowly leave until the big moat is drained. Zuck, go retire to your NZ bunker before Meta is forced to merge with another media company.
toddmorey 39 minutes ago [-]
Question: since they've rebooted their approach to AI... have they given up on open models? There's no mention of open source or open weights or access to the models beyond their hosted services.
thegeomaster 36 minutes ago [-]
Alexandr Wang on Twitter [0] mentioned open source plans:
"this is step one. bigger models are already in development with infrastructure scaling to match. private api preview open to select partners today, with plans to open-source future versions. incredibly proud of the MSL team. excited for what’s to come!"
So the answer is: no. lol. Remember Llama 4 Behemoth, and how we were supposed to get more great models from it?
wmf 28 minutes ago [-]
This may be too large to run locally anyway. Maybe they will distill down some smaller open versions later.
sidcool 44 minutes ago [-]
Will experiment with the model. But I am scared of sharing any information with the Zuck ecosystem.
zurfer 51 minutes ago [-]
> Muse Spark is available today at meta.ai and the Meta AI app. We’re opening a private API preview to select users.
m4r1k 45 minutes ago [-]
So no open weights... why would one choose Muse Spark instead of Anthropic, OpenAI, or Google models, all featuring good-to-amazing harnesses?
visioninmyblood 36 minutes ago [-]
https://meta.ai/ is where you can try it; it seems the API is not publicly accessible yet. I feel they are very late to the game and don't show value to customers over other models.
p_stuart82 18 minutes ago [-]
Late isn't the problem. A private preview API and no reason to switch: that's just another hosted model.
ehutch79 18 minutes ago [-]
How's the metaverse doing? It was the next big thing and how we're all going to be working inside it in... was it like 3 months ago?
Maybe they need to mine more Libra coin first? Or is it Diem now? Is that even still part of Meta?
I'm sure this new AI is super intelligent and super awesome and will be writing all the code, making all the blog posts, and generating all our YouTube Shorts in 6 months.
serf 10 minutes ago [-]
what's with the negativity?
Yeah, the metaverse got abandoned. Also: Meta was the only one to actually try the concept over the past umpteen years, even though everyone in the industry goes ga-ga over virtual worlds and workplaces at every opportunity. It's literally just Meta and Linden Lab (which has been on life support for 10+ years).
The alternative is : no one does it and nothing gets abandoned, which the industry has shown itself to be exceedingly good at w.r.t VR for the past 40+ years.
To be clear: I have no faith in Meta as a company; my problem lies in kicking an entity because it attempted something different. I don't think that's productive, and it produces things like the past AI winters, because groups become afraid of ever touching experimental concepts again lest they incur the wrath of the shareholder.
sva_ 8 minutes ago [-]
> Please don't post shallow dismissals, especially of other people's work. A good critical comment teaches us something.
Oh good, since they built a lab, I'm sure they took the time to precisely define what they mean by superintelligence? Right? …
52-6F-62 22 minutes ago [-]
If this is superintelligence, then it follows we must all be super-duper intelligences.
Artgor 49 minutes ago [-]
I'm cautiously waiting for the feedback from the first users.
Meta has produced a lot of great models (Llama); maybe this is a comeback... but I'm cautious, as the jump in quality is almost too big.
Also, I think people aren't used to the fact that using these models requires meta.ai or the Meta AI app.
solenoid0937 42 minutes ago [-]
My Meta friends say it's benchmaxxed af
loeg 9 minutes ago [-]
We used to call this "overfitting," but I suppose everything has to be maxxed now. Fitmaxxed?
conradkay 39 minutes ago [-]
It doesn't seem benchmaxxed; the ARC-AGI-2 score is quite bad (42.5%, vs. GPT 5.4's 76.1%) and coding is okay. But maybe this is the best Meta can do even with benchmaxxing.
The impressive part is the multimodality, which is plausible since other labs (especially Anthropic) focus less there.
oliver236 28 minutes ago [-]
So glad it's beating all the others on bioweapons refusal. This is what I most wanted out of the latest SOTA model.
wmf 28 minutes ago [-]
Zuck has a lot more experience being summoned before Congress than you.
santiagobasulto 38 minutes ago [-]
This looks like a very interesting and promising model, especially after Llama lost so much ground recently. I hope they release the weights.
jansport123 7 minutes ago [-]
Did they just copy the ChatGPT UI?
chrsw 45 minutes ago [-]
So Meta is not releasing open source models anymore?
Until you actually try the model itself, treat any benchmark presented to you as part of the model's marketing material; it is not independently verified and is completely biased.
The same is true with any other model, unless otherwise stated.
In the next few days, we'll see who Meta has paid to promote this model on social media.
This is Zuck: https://news.ycombinator.com/item?id=4151433 or https://news.ycombinator.com/item?id=10791198
https://x.com/alexandr_wang/status/2041909388852748717
https://news.ycombinator.com/newsguidelines.html
https://en.wikipedia.org/wiki/Diem_(digital_currency)
Edit: nvm I can't read, regular benchmarks against SOTA are there