Leslie Lamport: "I am not smart. I have the gift of abstraction."
Real mathematics isn't about details. It's about concepts and abstractions and how we compose them (LLMs are good at those aspects).
iammjm 4 hours ago [-]
Why doesn’t it just call tools such as Mathematica for such operations?
ACCount37 34 minutes ago [-]
For the same reason you don't run "4+6" on a calculator.
An external tool call has overhead. It requires a round trip to the external tool, and it requires the LLM to run in agentic autoregression, so it can't be used in prefill.
Which means that having native arithmetic capabilities is useful. Forward-pass arithmetic is an LLM version of quick mental math.
An LLM can read "#define SILLY_TIME_CONST (3*20*60*60*1000)" and have "SILLY_TIME_CONST is 60 h expressed as 216000000 ms" already cached by the end of the line, before it even emits its first token.
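To check the arithmetic in that macro by hand, here is a small sketch in Python. The constant mirrors the C `#define` quoted above; `MS_PER_HOUR` and `hours` are names introduced only for this illustration.

```python
# Mirror of the C macro quoted in the comment:
#   #define SILLY_TIME_CONST (3*20*60*60*1000)
SILLY_TIME_CONST = 3 * 20 * 60 * 60 * 1000  # value in milliseconds

# Helper name assumed for this sketch: milliseconds in one hour.
MS_PER_HOUR = 60 * 60 * 1000

# Convert the constant back to whole hours.
hours = SILLY_TIME_CONST // MS_PER_HOUR

print(SILLY_TIME_CONST, "ms =", hours, "h")  # prints: 216000000 ms = 60 h
```

The leading 3*20 is the 60 hours; the remaining 60*60*1000 is the number of milliseconds per hour.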
11 minutes ago [-]
defrost 4 hours ago [-]
This is more how an LLM thinks about math internally - an LLM version of drilled tables being used for mental arithmetic "as humans do".
When humans stall on these tasks, they reach for pen and paper, a slide rule, a calculator, etc.
Mathematica is overkill for arithmetic; in addition, it's licenced and can cost a bit extra.
If an LLM were to reach for a light, cheap arithmetic tool, something like bc would be a good first stop: a CLI tool with a language that supports arbitrary-precision numbers and interactive execution of statements.
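As a rough sketch of what "reaching for bc" could look like from a host program, the snippet below pipes one expression into bc when it is installed. The helper name `bc_eval` is hypothetical, and nothing here is claimed to be how any real LLM product wires up its tools. Python's own integers are also arbitrary precision, so the native result is shown next to it for comparison.

```python
import shutil
import subprocess
from typing import Optional


def bc_eval(expression: str) -> Optional[str]:
    """Pipe one expression into bc and return its output as text.

    Returns None when bc is not installed or exits with an error.
    (Hypothetical helper for illustration only.)
    """
    if shutil.which("bc") is None:
        return None
    result = subprocess.run(
        ["bc", "-l"],              # -l loads bc's standard math library
        input=expression + "\n",   # bc reads statements from stdin
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return None
    return result.stdout.strip()


# Python integers do not overflow, so this is exact as well.
native = 2 ** 100

print(native)            # 1267650600228229401496703205376
print(bc_eval("2^100"))  # same digits if bc is available, else None
```

Note that bc wraps very long results across lines, so a real wrapper would need to rejoin them; the 31-digit value used here is short enough not to wrap.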
They do. I asked ChatGPT for 327 x 48 and it used the "ChatGPT Instruments" calculator.
Previously it used to run Python scripts, and may still do for more complex calculations.
euroderf 4 hours ago [-]
The spirit of Rube Goldberg is alive and well.
old_sound 2 days ago [-]
What happens inside an LLM when it tries to calculate with nothing but matrices.
silvestrov 4 hours ago [-]
This is a very nice and fresh page layout.
andrewstuart 2 hours ago [-]
I assumed it wrote Python or some sort of other code.
mavhc 1 hour ago [-]
Writing and calling an entire Python setup seems like massive overkill; surely just having an internal way of calling a simple calculator function would be millions of times faster.
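For a sense of how small such a "simple calculator function" could be, here is a minimal sketch. The name `calculate` and the set of supported operators are assumptions made for this example, not any vendor's actual tool interface. It parses the expression with Python's `ast` module and only evaluates numeric literals and basic arithmetic, so arbitrary code is rejected instead of executed.

```python
import ast
import operator

# Whitelisted operators; anything else is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def _evaluate(node):
    """Recursively evaluate a parsed arithmetic expression node."""
    if isinstance(node, ast.Constant):
        # Accept plain numbers only (bool is a subclass of int, so exclude it).
        if isinstance(node.value, (int, float)) and not isinstance(node.value, bool):
            return node.value
    elif isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
        return _BINARY_OPS[type(node.op)](_evaluate(node.left), _evaluate(node.right))
    elif isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
        return _UNARY_OPS[type(node.op)](_evaluate(node.operand))
    raise ValueError("unsupported expression")


def calculate(expression: str):
    """Evaluate +, -, *, / on numeric literals (hypothetical tool function)."""
    tree = ast.parse(expression, mode="eval")
    return _evaluate(tree.body)


print(calculate("327 * 48"))          # 15696
print(calculate("3*20*60*60*1000"))   # 216000000
```

Whether this would actually be faster than a Python sandbox depends on how the tool call is dispatched; the sketch only shows that the evaluator itself can be tiny.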
https://www.youtube.com/watch?v=U719vQz-WFs
https://en.wikipedia.org/wiki/Bc_(programming_language)