This shows the downside of using AI to write up your project. I see the eloquent sentences, but don't get the message.
> This works, but the actual execution happened outside the model. The model specified the computation, then waited for an external system to carry it out.
> Our transformer also emits a program, but instead of pausing for an external tool, it executes that program itself, step by step, within the same transformer.
What's the benefit? Is it speed? Where are the benchmarks? Is it that you can backprop through this computation? Do you do so?
Why is it good that it's "inside" the model? Just making it more elegant and nice? The tool was already "inside" the overall hybrid system. What's the actual problem?
famouswaffles 37 minutes ago [-]
>This shows the downside of using AI to write up your project. I see the eloquent sentences, but don't get the message.
Not really sure what this obsession with calling things you don't like AI-generated is, but it's poor form. If you have something to say about the text, then say it. Otherwise, leave baseless accusations out of it.
>What's the benefit? Is it speed? Where are the benchmarks? Is it that you can backprop through this computation? Do you do so?....
It's pretty clearly an ideological thing. Some people are firmly in the 'some sort of symbolic logic is necessary' camp. From the article: 'A system that cannot compute cannot truly internalize what computation is.'
Some things are just interesting for the sake of it. This is one of those things. I don't agree with the authors on the above and I'm still glad they shared. It's a very interesting read regardless.
entropi 22 minutes ago [-]
I got the same impression as the parent post. Even if it's not AI-generated, the text reads like a politician's speech in a lot of places. Talks a lot, says little.
The idea itself was very cool, so I endured it. But it was not a pleasant read.
andy12_ 32 minutes ago [-]
Honestly, the most interesting thing here is definitely that just 2D heads are enough to do useful computation (at least enough to simulate an interpreter), and that there is an O(log n) algorithm to compute argmax attention with 2D heads. It seems you could make an efficient pseudosymbolic LLM with some frozen layers that perform certain deterministic operations, alongside other layers that are learned.
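(To illustrate the O(log n) claim: my own guess, not the authors' attention construction, is that it works like a tournament reduction, where each round compares pairs of candidates and keeps the larger, halving the field each time. A minimal sketch in plain Python:)

```python
# Hypothetical illustration of an O(log n)-round argmax via pairwise
# tournament reduction. NOT the paper's attention-head construction;
# it only shows why log n rounds of parallel comparisons suffice.
def tournament_argmax(scores):
    # candidates holds (index, score) pairs still in the running
    candidates = list(enumerate(scores))
    while len(candidates) > 1:
        survivors = []
        # compare adjacent pairs; the larger score advances
        for i in range(0, len(candidates) - 1, 2):
            a, b = candidates[i], candidates[i + 1]
            survivors.append(a if a[1] >= b[1] else b)
        # an odd candidate out advances unopposed
        if len(candidates) % 2 == 1:
            survivors.append(candidates[-1])
        candidates = survivors  # each round halves the field
    return candidates[0][0]

print(tournament_argmax([3.0, 9.5, 1.2, 7.7]))  # → 1
```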
MattPalmer1086 51 minutes ago [-]
Interesting... But why? What is the benefit, other than increasing our understanding of model architectures?
Our brains can also simulate Turing machines, slowly. We automated that with computers that are faster and more reliable. So why not let a model use external tools that are much faster and more reliable, just as we do?
deviation 14 minutes ago [-]
I really liked the article, but food for thought: is a transformer that offloads computation to Python really that different from Python code being read and then executed by an interpreter?
Both examples are of a system we created to abstract most of the hard work.
I think a more important concept here is that the term "AI" carries a lot of built-in assumptions, one of which is that it is (or will be) superintelligent, so folks like the author here think (correctly) that it's important for the AI to actually be doing the work itself.
andy12_ 21 hours ago [-]
This seems a really interesting path for interpretability, especially if a big chunk of a model's behavior occurs pseudo-symbolically. This is an idea I had thought about, integrating tools into a model's main computation path, but I never imagined it could be done efficiently with just a vanilla transformer.
Truly, attention is all you need (I guess).
mirekrusin 1 hour ago [-]
This is brilliant, game changing level.
Hey, also give it access to a dump of its weights and a way to propose updates, so it can see and tinker with its brain directly.
pennomi 19 hours ago [-]
It makes sense that a next token predictor could execute assembly code. This is fascinating work, especially with the memory implementation.
behehebd 1 hours ago [-]
Is this genius? Or just a new binary executable format? Can't tell.
koolala 2 hours ago [-]
I'd like to see this combined with reinforcement learning to optimize models to think computationally: generating ideas with hypothetical results and then running them within the same thought. Their solution sounded like a lot of tokens, though.
galsapir 19 hours ago [-]
One of the most interesting pieces I've read recently. Not sure I agree with all the statements there (e.g. that without execution the system has no comprehension), but extremely cool.
ndxone 52 minutes ago [-]
The big question is how efficient this is compared to executing assembly on a CPU.
ThouYS 57 minutes ago [-]
what!