This is an interesting paper. It's nice to see AI research address some of the implicit assumptions that compute-scale-focused initiatives rely on.
A lot of the headline advancements in AI place heavy emphasis on model size and training-dataset size. These numbers always make it into abstracts and press releases, and, especially for LLMs, even a cursory investigation into how outputs are derived from inputs through the different parts of the model gets waved off with vague language along the lines of the manifold hypothesis or semantic vectors.
This section stands out:
"However, order cannot be everything—humans seem to be capable of intentionally reorganizing information through reanalysis or recompression, without the need for additional input data, all in an attempt to smooth out [Fractured Entangled Representation]. It is like having two different maps of the same place that overlap and suddenly realizing they are actually the same place. While clearly it is possible to change the internal representation of LLMs through further training, this kind of active and intentional representational revision has no clear analog in LLMs today."
dinfinity 43 seconds ago
That section is not really what the paper is about at all, though.
The examples they give of (what they think is) FER in LLMs (GPT-3 and GPT-4o) are the most informative to a layman and the most representative of what is said to be the core issue, I'd say. For instance:
User:
I have 3 pencils, 2 pens, and 4 erasers. How many things do I have?
GPT-3:
You have 9 things. [correct in 3 out of 3 trials]
User:
I have 3 chickens, 2 ducks, and 4 geese. How many things do I have?
GPT-3:
You have 10 animals total. [incorrect in 3 out of 3 trials]
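If you want to poke at this yourself, here is a minimal sketch that reruns the two prompts a few times and prints the answers. It assumes the OpenAI Python client with an API key configured; the model name and trial count are illustrative stand-ins, not the exact setup from the paper (the original GPT-3 completions endpoint is no longer generally available):

    # Minimal reproduction sketch. Assumes the openai Python package is
    # installed and OPENAI_API_KEY is set in the environment. The model
    # name below is an assumption; swap in whichever model you want to probe.
    from openai import OpenAI

    client = OpenAI()

    PROMPTS = [
        "I have 3 pencils, 2 pens, and 4 erasers. How many things do I have?",
        "I have 3 chickens, 2 ducks, and 4 geese. How many things do I have?",
    ]

    for prompt in PROMPTS:
        for trial in range(3):  # 3 trials per prompt, mirroring the paper's "3 out of 3"
            resp = client.chat.completions.create(
                model="gpt-4o",  # assumed model, not the paper's GPT-3
                messages=[{"role": "user", "content": prompt}],
            )
            print(f"{prompt!r} trial {trial + 1}: {resp.choices[0].message.content}")

Both prompts should sum to 9 (3 + 2 + 4); the point of the comparison is whether switching the nouns from stationery to animals changes the model's counting behavior.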
rubitxxx8 1 day ago
> While clearly it is possible to change the internal representation of LLMs through further training, this kind of active and intentional representational revision has no clear analog in LLMs today.
So, what are some examples of how an LLM can fail outside of this study?
I’m having trouble seeing how this will affect my everyday uses of LLMs for coding, best-effort summarization, planning, problem solving, automation, and data analysis.
acc_297 1 day ago
> how this will affect my everyday uses of LLMs for coding
It won't - that's not what this paper is about.
meindnoch 1 day ago
Don't editorialize. Title is: "The Fractured Entangled Representation Hypothesis"
@dang
Paper: https://arxiv.org/abs/2505.11581