End of Transformer Era Approaches (manifestai.com)
alyxya 1 day ago
> The training budget for this model was $4,000, trained 60 hours on a cluster of 32 H100s. (For comparison, training an LLM of this scale from scratch typically costs ~$200k.)

What they did is closer to fine-tuning, so the comparison isn't helpful. The article is at least transparent about this, but listing the cost and performance seems disingenuous when they're mostly piggybacking off an existing model. Until they train an equivalently sized model from scratch and demonstrate a notable benefit, this looks like, at best, a sidegrade to transformers.
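A quick back-of-the-envelope check of the quoted figures (a sketch in Python; the per-GPU-hour rate and the compute ratio are derived from the quoted numbers, not stated in the article):

    # Sanity-check the quoted training budget: implied cost per GPU-hour
    # and how the ~$200k from-scratch figure compares in raw budget.
    num_gpus = 32          # H100s in the cluster (quoted)
    hours = 60             # wall-clock training time (quoted)
    budget_usd = 4_000     # reported training budget (quoted)
    scratch_usd = 200_000  # quoted cost to train a comparable LLM from scratch

    gpu_hours = num_gpus * hours              # 1,920 GPU-hours
    rate = budget_usd / gpu_hours             # ~$2.08 per H100-hour (derived)
    budget_ratio = scratch_usd / budget_usd   # ~50x the reported budget (derived)

    print(f"{gpu_hours} GPU-hours at ~${rate:.2f}/H100-hour")
    print(f"From-scratch estimate is ~{budget_ratio:.0f}x the reported budget")

The implied ~$2/H100-hour rate is in the range of current cloud pricing, so the $4,000 figure is internally consistent; the point above stands that it buys roughly 50x less compute than the from-scratch comparison.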
