ML promises to be profoundly weird (aphyr.com)

615 points by pabs3 11 days ago | 602 comments

munificent 11 days ago [-]

There is a whole giant essay I probably need to write at some point, but I can't help but see parallels between today and the Industrial Revolution.

Prior to the industrial revolution, the natural world was nearly infinitely abundant. We simply weren't efficient enough to fully exploit it. That meant that it was fine for things like property and the commons to be poorly defined. If all of us can go hunting in the woods and yet there is still game to be found, then there's no compelling reason to define and litigate who "owns" those woods.

But with the help of machines, a small number of people were able to completely deplete parts of the earth. We had to invent giant legal systems in order to determine who has the right to do that and who doesn't.

We are truly in the Information Age now, and I suspect a similar thing will play out for the digital realm. We have copyright and intellectual property law already, of course, but those were designed presuming a human might try to profit from the intellectual labor of others. With AI, we're in the industrial era of the digital world. Now a single corporation can train an AI using someone's copyrighted work and in return profit off the knowledge over and over again at industrial scale.

This completely upends the tenuous balance between creators and consumers. Why would a writer put an article online if ChatGPT will slurp it up and regurgitate it back to users without anyone ever even finding the original article? Who will contribute to the digital commons when rapacious AI companies are constantly harvesting it? Why would anyone plant seeds on someone else's farm?

It really feels like we're in the soot-covered child-coal-miner Dickensian London era of the Information Revolution and shit is gonna get real rocky before our social and legal institutions catch up.

Retric 10 days ago [-]

> Prior to the industrial revolution, the natural world was nearly infinitely abundant. We simply weren't efficient enough to fully exploit it.

This is just wildly incorrect. People started running out of trees during the early Iron Age. Woodlands have been a managed and often over exploited resource for a long time. Active agriculture vs passive woodlands vs animal grazing has been in constant tension for thousands of years across most of the globe.

hammock 10 days ago [-]

The general point is accurate, don’t take it so literally.

There were more than enough trees until we developed the technology to clear cut in expeditious manner. There were more than enough fish until we developed the technology to pull massive indiscriminate amounts out of the ocean (and/or started polluting our rivers with industry). There was more than enough topsoil until we developed mechanized plows and artificial fertilizer. Etc.

A few hundred years ago or less, a squirrel could get from the Atlantic Ocean to the Mississippi River without ever touching the ground. Not possible today. That’s not a push and pull played out over thousands of years, that’s a one-way trend.

brohee 10 days ago [-]

The general point is not. Iceland and Easter Island were fully deforested way before the industrial age. Countless species went extinct in Britain and more examples abound.

HPsquared 10 days ago [-]

Britain was a little bit industrialised even before the steam engine. There were windmills and water mills. Steam massively accelerated it, but industry did exist before.

tpm 10 days ago [-]

If a windmill or a water mill is a sign of industrialisation, then large parts of the world were industrialised.

https://en.wikipedia.org/wiki/List_of_ancient_watermills

mr_toad 10 days ago [-]

Commons in England were being enclosed in the Tudor age. It caused a great deal of social unrest, even rebellion. It had little to do with technology, and was mostly caused by population growth.

galactus 10 days ago [-]

the speed at which depletion happened was probably not the same

CalRobert 10 days ago [-]

Interestingly, clearcutting is part of it but another part is just grazing. If you let sheep graze in a forest they will eat all the saplings, so after a century of this, the old trees die out without new ones to replace them. I agree with your point but thought that could be of interest - Whittled Away, by Padraic Fogarty, is a good book discussing this (and why Ireland, which really should be all forest, is an ecological wasteland more generally)

ambicapter 10 days ago [-]

> The general point is accurate, don’t take it so literally.

GP is saying it is not, and you're just reiterating what OP said as fact.

jychang 10 days ago [-]

It's sort of the exception that proves the rule.

This is where STEM people are weak: a lack of knowledge of history. In another forum, someone would have chipped in that England's virgin forests were fully deforested by 1150. And someone else would have pointed out that this deforestation produced the economic demand for coal that drove the Industrial Revolution in the first place.

Still, that kind of underscores OP's point. Yes, natural resources were not completely unlimited prior to the Industrial Revolution; Jonathan Swift predated Watt's steam engine, after all. Still... Neither were information resources 10 years ago. Intellectual property laws did exist prior to AI, of course. The legal systems in place are not completely ignorant of the reality.

However, there's an immense difference in scale between post-industrial strip mining of resources, and preindustrial resource extraction powered solely by human muscle (and not coal or nitroglycerin etc). Similarly, there's a massive difference in information extraction enabled by AI, vs a person in 1980 poring over the microfilm in their local library.

The legal system and social systems in place prior to the Industrial Revolution proved unsuitable for an industrial world. It stands to reason that the legal system and social systems in today's society would be forced to evolve when exposed to the technological shift caused by AI.

Retric 10 days ago [-]

> powered solely by human muscle

Both animals and water power go way back. The early steam engine was measured in horsepower because that’s what it was replacing in mines. It couldn’t compete with nearby water power which was already being moved relatively long distances through mechanical means at the time.

Hand waving this as unimportant really misunderstands just how limited the Industrial Revolution was.

jychang 10 days ago [-]

Irrelevant. Here's Bret Devereaux (an actual historian) explaining this distinction and precisely why those are irrelevant in the context of the Industrial Revolution:

https://acoup.blog/2022/08/26/collections-why-no-roman-indus...

> Diet indicators and midden remains indicate that there’s more meat being eaten, indicates a greater availability of animals which may include draft animals (for pulling plows) and must necessarily include manure, both products of animal ‘capital’ which can improve farming outputs. Of course many of the innovations above feed into this: stability makes it more sensible to invest in things like new mills or presses which need to be used for a while for the small efficiency gains to outweigh the cost of putting them up, but once up the labor savings result in more overall production.

> But the key here is that none of these processes inches this system closer to the key sets of conditions that formed the foundation of the industrial revolution. Instead, they are all about wringing efficiencies out the same set of organic energy sources with small admixtures of hydro- (watermills) or wind-power (sailing ships); mostly wringing more production out of the same set of energy inputs rather than adding new energy inputs. It is a more efficient organic economy, but still an organic economy, no closer to being an industrial economy for its efficiency, much like how realizing design efficiencies in an (unmotorized) bicycle does not bring it any closer to being a motorcycle; you are still stuck with the limits of the energy that can be applied by two legs.

So yeah, actual historians would be dismissive at your exact response, basically saying "I know, I know, but I don't care". You're still just talking about a society mostly 'wringing efficiencies out the same set of organic energy sources'. It IS unimportant, and you completely misunderstand how the Industrial Revolution reshaped production if you think it is important.

retsibsi 10 days ago [-]

I think I prefer the 'STEM people' approach of trying to say true things, rather than this superior approach of just saying things and then, when they turn out to be false, dismissing them as irrelevant. If the truth of the claim is irrelevant, why did you make it in the first place!

jychang 10 days ago [-]

The statement IS true anyways, the problem is that you failed to distinguish between an example and a universal claim. You want to argue on logic? I'm an engineer, I can argue on precision too:

The (true!) statement is "However, there's an immense difference in scale between post-industrial strip mining of resources, and preindustrial resource extraction powered solely by human muscle (and not coal or nitroglycerin etc). Similarly, there's a massive difference in information extraction enabled by AI, vs a person in 1980 poring over the microfilm in their local library."

I said there is a major difference in scale between "modern strip mining" and "a preindustrial extraction method powered only by human muscle", and I made an analogous point about AI-enabled information extraction versus 1980s manual archival research. That statement is purely true. Nothing in that statement says the muscle-powered-extraction example was the only preindustrial mode of production, just as "someone using microfilm in 1980" does not imply microfilm was the only way information was accessed in 1980. The fact that other information formats existed in 1980 is irrelevant to the truth of the example.

So no, nothing I said "turned out to be false". You are attacking a claim I never made because you failed to parse the logic in the one I did. Most importantly, this direction missed the big picture dialectical synthesis that I was introducing as well, and just kept decomposing the argument into locally falsifiable atoms which lost the thread of what was actually being discussed.

Retric 10 days ago [-]

Is your counter argument that you’re not wrong just attacking a straw man? Because it really sounds to me like you are just clueless.

Strip mining goes back thousands of years; it’s a simpler technology than making tunnels. And no, it wasn’t limited to human power to crack rock; several more powerful methods existed.

Roman mining literally destroyed a mountain, operating within an order of magnitude of the largest mines today. That’s what makes what you say false. It’s not some minor quibble over details; you are simply speaking from ignorance.

jychang 10 days ago [-]

It’s almost like you’re intentionally trying to be wrong.

You don't seem to understand how analogies work. I’m not talking about strip mining vs tunnel mining, I was comparing scale of human powered mining to mining with nitroglycerin.

I’ll let you figure out how the scale of mining “going back thousands of years” is very different from modern explosive mining on your own. Go google “iron production by year” or something. Hint: it took generations for the Romans to strip a small hill, that a modern midsize mining company can do in a few days.

dumah 9 days ago [-]

If you take Pliny’s word for truth, they did achieve 10% of the scale of the largest currently operating gold mine using hydraulics at Las Medulas.

Modern geological estimates are radically lower.

foltik 10 days ago [-]

“The industrial revolution wasn’t really all that” is such a strange hill to die on.

Retric 10 days ago [-]

How so? Being precise and correct is IMO worth preserving in a world of handwaving slop.

The industrial revolution ran from ~1760–1840. It was a major shift, but it doesn’t cover everything that happened between 1760 and now, nor did it overwhelm many existing trends.

GrinningFool 10 days ago [-]

Before LLMs we had code generators and automation that eliminated a lot of time- and resource-consuming tasks. I think the point still holds.

endymion-light 10 days ago [-]

Yeah - really struggling to understand why people are not grasping this point.

Yes, Easter Island was deforested far earlier - but you wouldn't compare the steam engine's capability in resource extraction compared to what people on Easter Island were doing.

It feels like people are almost straining to not understand the point - I think it's quite clear how ML + AI serve to extract resources of data at an unheard-of scale.

jychang 10 days ago [-]

It's the autism. And I say that endearingly. I'm an engineer who probably likes trains way too much.

I intentionally pointed out the STEM-esque responses of pedantic correction as a symptom of a disciplinary blind spot: technically correct nitpicking that misses the forest for the trees, a tendency to atomize arguments and lose the structural point, and that tendency is a weakness, not a strength.

There's also a lack of historical training to contextualize their own objection. That's also why I brought up Devereaux as an authority hammer: the actual domain experts consider those objections and dismiss them.

8note 10 days ago [-]

the conclusion doesnt follow from the premise is the issue.

the laws and enclosure happened basically orthogonal to the resource constraints, so there's no actual comparison to draw.

if you insist on a causation, id go with the opposite - the laws making ownership and forcing people off of land enabled the exploitation and innovation, not that it was cleanup for exploitation that was already happening. existing exploitation across all kinds of degrees was already being managed without the enclosure.

if you just want to make stuff up, you can reference anything you want, like that some elaborate thing happened in star wars, and thus the same thing must be happening with AI

salawat 10 days ago [-]

It is hard to convince a man of that which his income is dependent on him not understanding. -Upton Sinclair

You aren't wrong. There's definitely going to be a need to drag people kicking and screaming to enlightenment unfortunately. Too much money to be made at stake otherwise.

achenet 10 days ago [-]

there's archeological evidence that humans hunted large animals (sometimes called megafauna) to extinction on every continent except Africa.

My original source for this was the book Sapiens, but here are two links I found with a quick web search: https://www.sciencedaily.com/releases/2024/07/240701131808.h...

https://ourworldindata.org/quaternary-megafauna-extinction

I also saw a theory (not sure how credible) that the reason humans started doing agriculture was in fact because we killed all the megafauna we used to eat.

This was over 10,000 years ago. Well before the Industrial Revolution, indeed, before even the original Agricultural Revolution.

tpm 10 days ago [-]

> There were more than enough trees until we developed the technology to clear cut in expeditious manner.

Unless you mean 'an axe', way before that there were deforested areas where the need for trees was larger than the supply and there were enough humans to fell them.

> A few hundred years ago or less, a squirrel could get from the Atlantic Ocean to the Mississippi River without ever touching the ground.

Yes, but that wasn't possible in other parts of the world much sooner.

eru 10 days ago [-]

Burning was and is a popular way to deal with trees, too.

paganel 10 days ago [-]

> The general point is accurate, don’t take it so literally.

It's not, because the Malthusian trap was all too real going into modernity, as in recurring famines were a thing, they were quite real, nothing "literal" about them.

eru 10 days ago [-]

Compare https://fass.nus.edu.sg/ecs/wp-content/uploads/sites/4/2020/...

paganel 10 days ago [-]

First of all, the study is written by an economist, might as well have sent me an Oracle of Delphi pronouncement. And second, he mentions the Malthusian trap being a real thing in his very first sentence, so not sure what I should have gotten out of this.

eru 9 days ago [-]

You could read the whole abstract. Or ask Deep Seek to explain it to you.

Voultapher 10 days ago [-]

Proof by analogy is fraud .. and here the analogy is incorrect as well.

sabas123 10 days ago [-]

We have also had a significant rise in global population, making for an unfair comparison.

thinking_cactus 10 days ago [-]

I agree. Although in this specific point, I would say we always had depletion (since the most basic microorganisms, after all otherwise life would replicate until it faces depletion limits; all the way to our close primate relatives and throughout human history; food depletes locally which drives competition), but rarely faced degradation or permanent depletion.

I'd say degradation involves a lasting depletion or lasting damage (potentially permanent until restoration efforts happen) to the environment's output and ability to support life. Permanent depletion is what can happen to e.g. shallow mines and fossil fuel deposits.

I think I'd agree the legal system was created mostly for the former, depletion, and only recently had to contend with degradation and permanent depletion. I feel like we still struggle collectively to come to grips with permanent depletion.

Permanent depletion is also usually the result of shortsightedness or a competition gone awry. Famous case where nobody wants the ultimate results but people may selfishly march towards it (tragedy of the commons).

bryanrasmussen 10 days ago [-]

I believe running out of trees was always a local issue - there weren't enough trees where you were because trees had to be gotten locally; you didn't go get trees from far away. So yes, that was in constant tension; the thing is that the problem of having enough trees turned from a local problem to a global problem, with the side effect that there are not enough trees globally for the world to maintain the environment humanity first conquered.

I think the natural world was nearly infinitely abundant is a reasonable description, resource depletion was always local before mass industrialization. Being able to exploit the world as opposed to just your local area is also a mark of efficiency.

Retric 10 days ago [-]

By local you mean over 5 thousand miles? Because yes, moving wood was always in competition with growing it locally. But pine forests in the far north were untouched because of the low quality of the lumber they produce, not the distances involved. All of Africa, Europe, and Asia ran out of the most valuable natural lumber a fucking long time ago.

> I think the natural world was nearly infinitely abundant is a reasonable description

Very little of the world’s woodland was untouched at the time of the Industrial Revolution and forests in the Americas survived as long as they did largely due to disease drastically reducing native populations. But American forests were on the clock independent from industrial development. I’m not sure exactly your counter argument even is here.

We still can’t reasonably extract most resources from the ocean bottom. That’s ~70% of the world’s mineral wealth just off the table.

So sure we are very slightly better at extracting resources but on the absolute scale it really isn’t that significant pre vs post Industrial Revolution compared to the sum total of human history.

bryanrasmussen 10 days ago [-]

>By local you mean over 5 thousand miles?

maybe, "local" is a function of a lot of things, it is only fairly recently in human history that the "global" functions the way that "local" did centuries ago, meaning that it is cheap enough to source things from across the world that it does not need to be made in the next village.

>> I think the natural world was nearly infinitely abundant is a reasonable description

>Very little of the world’s woodland was untouched at the time of the Industrial Revolution and forests in the Americas survived as long as they did largely due to disease drastically reducing native populations.

things appeared abundant prior to one event, and soon after that event they no longer appeared abundant; there's a correlation, is the point, not a causation, but

>American forests were on the clock independent from industrial development.

sure, the Native Americans would have used up their forests if they had kept growing and not been killed off by disease brought by Europeans. Nonetheless they had been killed off, the world appeared infinite, because all you needed to do when you ran out of wood in one place is go to another place to source it, hurray, but now that is no longer the case. We have run out of places to go get more wood.

As noted I said I felt the phrase "the natural world was nearly infinitely abundant" uttered by the original poster in this subthread is a reasonable description, and I mean obviously that is dependent on the impressions of the people of the time, and from my readings it seems like this was more the feeling than oh noes, we are running out of wood.

Although we got sidetracked on wood, because the first response to the OP was that wood was always a problem, the fact that some natural resources were constrained still does not really disprove the phrase "the natural world was nearly infinitely abundant", since the word "nearly" can be seen as a cheat; really what it means is that the world felt infinitely abundant at one time and now it does not.

>We still can’t reasonably extract most resources from the ocean bottom. That’s ~70% of the world’s mineral wealth just off the table.

see, it sounds like you still feel like it is closer to infinitely abundant than dangerously used up. All we need to do is up our extraction game, at least where minerals are concerned.

NOTE: I think maybe the world feeling infinitely abundant thing is actually an American thing, this has been remarked by others in the past, that the first European settlers felt this was a world that had not been touched because in comparison to Europe it was under-exploited in many areas, it was big and had everything, and there is a whole part of American frontier myth that as soon as one area got settled and used up all you had to do was to pack up your stuff and move west and get a bunch of resources to use up, like locusts, or maybe just colonizers.

In this case the OP's idea of writing this up is that really what they are dealing with is not how the world was - infinitely abundant - but how it felt to people coming from one overly exploited area to an under-exploited one. They believe there is a narrative of economic constraints and results playing out, and that the two situations were analogous, but the source of the analogy - the world before the industrial revolution - was perhaps not as the analogy would have it but really how a memetic framework of exploration and conquest had interpreted the world.

Sorry my note went overly long, but that sometimes happens when I write what I think just as I'm thinking it.

felipeerias 10 days ago [-]

People had been hunting whales for centuries, but industrialisation gave them the means and the motivation to do so until near extinction.

Retric 10 days ago [-]

By that token, humans drove a great number of species to extinction long before the Industrial Revolution. So by that line of thinking we were already running into the limits of natural resources in the Neolithic.

Obviously we’re becoming better at extracting resources over time, but humans ran out of new land to exploit long before Europe's conquest of the Americas. Land only seemed empty because disease decimated native populations, people lived in San Francisco thousands of years ago.

mayama 10 days ago [-]

Most of humanity survived on agriculture and sometimes hunting-gathering for the last 10k years. The number of people who survived on hunting whales is minuscule. Comparing those two is nonsensical.

intended 10 days ago [-]

Forest for the trees?

I doubt that anyone reading this can’t get the point of the analogy.

The value is in showing where the analogy fails, and either disproves the point, or deepens the point.

tartoran 10 days ago [-]

But you seem to be missing the point, parent is talking about the industrial scale of means to create a lot more destruction to the environment which the OP point hinges on. Parent does not say humanity survived on hunting whales, quite the opposite, when they had the means people nearly drove whales to extinction.

d1l 10 days ago [-]

Read Moby Dick some time my friend.

bryanrasmussen 10 days ago [-]

The industrial revolution is generally understood to have started somewhere around 1760, Moby Dick took place in approximately 1830, about 10 years before what some historians mark as the end of the agrarian to Industrial shift that is generally termed the Industrial revolution

https://en.wikipedia.org/wiki/Industrial_Revolution

I get sort of wishy-washy from 1830 on, because lots of people put the end of the Industrial revolution as being 1900, but 1840 is a defensible and commonly held position.

eru 10 days ago [-]

> The industrial revolution is generally understood to have started somewhere around 1760,

In Britain. Moby Dick ain't set in Britain.

felipeerias 10 days ago [-]

That’s beside the point because most whales were killed in the 20th century.

Quarrelsome 10 days ago [-]

> This is just wildly incorrect.

from a global perspective it isn't. Some places sure, like Western Europe, which in some cases had completed enclosure, but remember the new world had only been discovered a few hundred years ago at that point.

Just google maps the north part of South America, even today there are large swathes of undeveloped land across it and back then it was considerably less exploited. At that time it would have appeared infinite, especially to the European industrialists.

squigz 10 days ago [-]

> remember the new world had only been discovered a few hundred years ago at that point.

By White people*

Quarrelsome 10 days ago [-]

we're talking about the fucking industrial revolution, of course this defaults to the European perspective. Unless you wanna spit some new bars about Aztec foundries and train lines connecting meso-america in the 19th century, then the point stands. At that time, the world appeared to the industrialists of the industrial revolution to be infinite. Nor had humanity discovered the terrible side effects of fossil fuels on the atmosphere.

Why are you weirdly making this about race?

squigz 10 days ago [-]

Sure, of course it's convenient to ignore the native peoples and pretend that prior to the Industrial Revolution the rest of the world outside of Europe was some untapped well of resources that Europeans had a natural right to.

Who might be swept underfoot in this "Information Revolution", I wonder?

Quarrelsome 10 days ago [-]

Yes, just the other day I saw someone make a comment about write performance in SQLite without considering the plight of the Baltic peoples in the Northern Crusades. It was really convenient of them to do that, fucking typical.

squigz 10 days ago [-]

Sure, because working on a database plugin is the same as, for example, working on mass surveillance tech.

This sort of handwashing is exactly why the natives were treated the way they were.

Quarrelsome 10 days ago [-]

How do you think they're enabling the mass surveillance tech? SQLite got reach bruv.

Your continued erasure of the Baltic peoples continues to cut deep into my heart, and your callous candour to their plight, as you discard any chance to mention them, continues to shock me.

dTal 10 days ago [-]

Nobody said anything about Europeans having a "natural right". Bad enough to derail a conversation with irrelevant political nitpicking, unforgiveable to use a strawman to do so. Boo.

squigz 10 days ago [-]

It's not irrelevant.

GP made a comparison between what we're going through and the Industrial Revolution. Ignoring the negatives of that revolution - like by acting as though the "new world" was uninhabited/unused and so Europeans had a right to its resources - seems like a bad idea.

Quarrelsome 10 days ago [-]

> like by acting as though the "new world" was uninhabited/unused and so Europeans had a right to its resources - seems like a bad idea.

maybe it was a bad idea, but that's what happened.

salawat 10 days ago [-]

Also doesn't justify doing the same damn thing again, which is exactly what all the people long on this technology fully expect to be allowed to do. Any further investment they have to do to ensure the outcome will just be chalked up to cost of doing business. And the capital funding all this is in so few hands, and in the hands in particular of such characters that don't concern themselves with not repeating atrocities of the past in new and interesting ways, that it is virtually guaranteed we're on the road to societal scale disruption. 'Tis the reason such inconvenient points are in need of being pounded home until they are impossible to ignore.

Quarrelsome 10 days ago [-]

> not repeating atrocities of the past in new and interesting ways

sorry, are you suggesting that colonialism and LLMs are equivalent in terms of atrocity? I don't feel like they're really comparable.

> 'Tis the reason such inconvenient points are in need of being pounded home until they are impossible to ignore.

and what do you think is going to happen here? People so basic that this will never happen. At best you gotta create a grassroots political movement with political representation and clear legal aims and get that past the electorate. However see how the casuals lap up generated content for how ambitious a vision that is. LLMs will prevail and even if public boycotts were extreme, it will just move further and further behind the curtain and the end outcome will still be the same.

I don't see how derailing conversations on hacker news by taking issue with a particular analogy to grind a colonial axe is really furthering that. At the end of the day, regardless of the perspective of our identity, we'll get fucked by network effects and rounded out of systems by those with more influence and power. Sometimes by those who even share our perspective. So to use perspective as a point of division just further fragments what needs to be a whole to enact change.

cmrdporcupine 10 days ago [-]

This, and going back further, people literally would brutally massacre neighbouring tribal groupings over control of fishing and hunting and gathering grounds.

The rapid dispersal of our species over literally the entire planet (minus Antarctica) likely also has a lot to do with constantly moving on to new opportunities further away from rivals.

That said, starting in about the 18th century we ran out of new places for that. And intensification truly began.

jtbaker 10 days ago [-]

the Stepchange show went fairly deep on this topic in their first episode (listened to it recently). https://www.stepchange.show/coal-part-i

cjcole 11 days ago [-]

"but I can't help but see parallels between today and the Industrial Revolution"

You're not the only one.

The current Pope Leo XIV explicitly named himself after the previous Leo, Pope Leo XIII, who was pope during the Industrial Revolution (1878-1903) and issued the influential Encyclical Rerum novarum (Rights and Duties of Capital and Labor) in response to the upheaval.

“Pope Leo XIII, with the historic Encyclical Rerum novarum, addressed the social question in the context of the first great industrial revolution,” Pope Leo recalled. “Today, the Church offers to all her treasure of social teaching in response to another industrial revolution and the developments of artificial intelligence.” A name, then, not only rooted in tradition, but one that looks firmly ahead to the challenges of a rapidly changing world and the perennial call to protect those most vulnerable within it.”

https://www.vatican.va/content/leo-xiii/en/encyclicals/docum...

https://www.vaticannews.va/en/pope/news/2025-05/pope-leo-xiv...

salawat 10 days ago [-]

>RERUM NOVARUM

ENCYCLICAL OF POPE LEO XIII ON CAPITAL AND LABOR

Oh hohm. Such a great mouthpiece Pope Leo XIII was for extolling and providing cover for the excesses of the worst breed of capitalist. Whilst my experience with such religious writing is used to coming away from them not wholly satisfied one way or another, this particular piece was heavily biased toward the "Captains of Industry" and capitalist civilizations of the time. Explicitly condemning the practices by which labor can usurp the yoke of the unjustly enriched, and no consideration whatsoever against the fact that as capital centralizes, there are fewer and fewer places to actually look for employment that isn't in one way or another unconscionable to the Soul. Which, thankfully, the fellow at least had the decency to recognize that one isn't at, nor should be at liberty to give one's soul up because the only people signing cheques are those that are most conditioned to being in service not to anyone else, but merely to themselves.

For instance, it places the burden of the yoke of thrift equally on all men, without recognizing that that yoke provides exactly the spiritual cover required for the pernicious greedmonger to sleep soundly after condemning thousands to a situation wherein their self-preservation is not guaranteed.

I see some mild concessions to the working class, which we have plenty of history from which to reason that even with a Papal acknowledgement, these words did not suffice to tilt the behavior of men away from ungodly and abusive treatment of their fellow men until such time as they were confronted with force. That the Pope of all people then doubled down by pointing out that "agitators would arise to foment violence and revolt" without taking into consideration that is the only language left that will be understood by the man whose heart has sufficiently hardened to enable him to with a stroke of a pen condemn thousands to millions, nay billions at a time to suffering. To usurp their livelihoods as his own to be rented back, ensuring that no ownership is onto his potential competitors conferred upon which could be built the means to diminish his own prosperity.

No... Pope Leo XIII, your encyclical is in places valid, but woefully out of date and in need of a massive update. Even in its time, that wording would have been fairly what we today call "milquetoast" in terms of providing the necessary spiritual force to temper the excesses of man's vices. Our current day is evidence enough of that. Where instead of true virtue and the ability of all to live prosperously, we have a divided class of those seeking desperately to get by, and those seeking desperately to ensure no one gets by them.

Whilst I'm not Catholic, I do tend to honor the tradition of firmly worded letters nailed to their doors to keep them honest. This encyclical in its time may have seemed fine, but with hindsight reeks of inadequacy and hedging, with excessive pandering to the already wealthy. This alone explains to me greatly why the labor movements of the late 1800's and early 1900's were not only as bad as they were, but absolutely necessary. If Leo XIV can't do any better than this, then it may once again come to bloodshed. The feedback loop is much tighter, and news travels much faster. Likely why the wealthy are doing everything they can to weight media outlets in their favor, and destroy any unregulated medium of anonymous communication. For these men are greedy, but not stupid. They know deep down the Lord dost tolerate the machinations of the Devil to test the tendencies of humankind, and they fear the inevitable outcome that will arise when the rest of mankind through privation is forced to harden their hearts as they (the wealthy) have. For in the eyes of one who has only their Soul left to bargain with, laid bare is the banal veil of Evil, and if one is to meet their Maker earlier than planned as a result of another man's artifice... Well. Justice doth favor action, whereas the banal finds fertile sustenance in the inaction of vacuous platitude debated endlessly.

Perhaps I am one of the Agitators of which the Pope spoke. Yet I feel no pause at any of the words I have hitherto written. So do with them what you will. If what we have to live with is supposed to be fine, I do not agree that anything about it is what one could conscionably call just.

anal_reactor 10 days ago [-]

[flagged]

steveklabnik 11 days ago [-]

As you know, I deeply respect you. Not trying to argue here, just provide my own perspective:

> Why would a writer put an article online if ChatGPT will slurp it up and regurgitate it back to users without anyone ever even finding the original article?

I write things for two main reasons: I feel like I have to. I need to create things. On some level, I would write stuff down even if nobody reads it (and I do do that already, with private things.) But secondly, to get my ideas out there and try to change the world. To improve our collective understanding of things.

A lot of people read things, it changes their life, and their life is better. They may not even remember where they read these things. They don't produce citations all of the time. That's totally fine, and normal. I don't see LLMs as being any different. If I write an article about making code better, and ChatGPT trains on it, and someone, somewhere, needs help, and ChatGPT helps them? Win, as far as I'm concerned. Even if I never know that it's happened. I already do not hear from every single person who reads my writing.

I don't mean that everyone has to share my perspective. It's just my own.

munificent 11 days ago [-]

Agreed, totally! I still write and put stuff online.

But it definitely feels different now. It used to feel like I was tending a public garden filled with other people who might enjoy it. It still kind of feels like that, but there are a handful of giant combine machines grinding their way around the garden harvesting stuff and making billionaires richer at the same time.

It's not enough to dissuade me from contributing to the public sphere, but the vibe is definitely different.

Honestly, it reminds me a lot about the early days of Amazon. It's hard to remember how optimistic the world felt back then, but I remember a time when writing reviews felt like a public good because you were helping other people find good products. It was like we all wanted honest product information and Amazon provided a neutral venue for us to build it. Like Wikipedia for stuff.

But as Amazon got bigger and bigger and the externalities more apparent, it felt less like we were helping each other and more like we were helping Bezos buy yet another yacht or media empire. And as the reviews got more and more gamed by shady companies, they became less of a useful public good. The whole commons collapsed.

I worry that the larger web and digital knowledge environment is going that way.

I still intend to create and share my stuff with the world because that's who I want to be. But I'll always miss the early days of the web where it felt like a healthier environment to be that kind of person in.

ryandrake 11 days ago [-]

> But as Amazon got bigger and bigger and the externalities more apparent, it felt less like we were helping each other and more like we were helping Bezos buy yet another yacht or media empire.

The Internet-circulating quote comes to mind: Planet Earth is pretty much a vacation resort for around 500 rich people, and the remaining 8 billion of us are just their staff. The Relative Few have got the system set up perfectly so that whatever we do, we're probably serving/enriching them. AI doesn't really change this, but it does further it.

elros 10 days ago [-]

> The Internet-circulating quote comes to mind: Planet Earth is pretty much a vacation resort for around 500 rich people, and the remaining 8 billion of us are just their staff. The Relative Few have got the system set up perfectly so that whatever we do, we're probably serving/enriching them. AI doesn't really change this, but it does further it.

I don't necessarily disagree with the analysis of how Planet Earth is currently set up, but something that I've been thinking about lately is that, to the extent we can consume the public image of some of the Relative Few, they seem oddly unhappy.

munificent 10 days ago [-]

I think you're right.

Anyone who finds themselves with $100m in their bank account and thinks, "No, I need more," is a person with a hole inside them that can never be filled.

bigyabai 11 days ago [-]

If raw resources (tree cutting) and manufacturing (book binding) are saturated, a fully-realized economy has just one step left: financialization.

You have to start finding ways to keep people hooked on books and make it a part of their regular lifestyle. One book can't be enough, and after a while you have to convince them to replace the books they already bought. New editions, Author's Footnotes, limited run release, all of the stops have to be pulled out to get consumers to show up en masse. Because that's what they are - consumers, not readers - wallets to be squeezed until they're bled of all the trust they had in media.

I think about the publications I liked reading as a kid, like Joystiq and Polygon. Some of the best games journalism the industry produced, but inevitably doomed to fail as their competitors monetized further. The rest of traditional media has followed the same path, converging on some mercurial social network marketing tactic as the placeholder for big-picture brand strategy.

chairmansteve 10 days ago [-]

There were a couple of threads on HN this week. "Do you have any unusual hobbies" and "How do you relax". I enjoyed them and was thinking of contributing. Then it occurred to me that my comments would be gold for targeting advertising at me. That is the distrust that has been bred by the data harvesters.

imp0cat 10 days ago [-]

Exactly. That thread about hobbies was just a trap designed to squeeze as much info from as many people as possible.

steveklabnik 11 days ago [-]

I can totally see that, for sure. I was much more likely to write a review long ago, now I don't even bother. (For buying stuff online, at least.) Maybe I lost my innocence about this stuff a long time ago, and so it's not so much LLMs that broke it for me, but maybe... I dunno, the downfall of Web 2.0 and the death of RSS? I do think that the old internet, for some definition of "old," felt different. For sure. I'll have to chew on this. I certainly felt some shock on the IP questions when all of this came up. I'm from the "information wants to be free" sort of persuasion, and now that largely makes me feel kinda old.

Also I'm not a fan of billionaires, obviously, but I think that given I've worked on open source and tools for so long, I kinda had to accept that stuff I make was going to be used towards ends I didn't approve of. Something about that is in here too, I think.

(Also, I didn't say this in the first comment, but I'm gonna be thinking about the industrial revolution thing a lot, I think you're on to something there. Scale meaningfully changes things.)

rafterydj 11 days ago [-]

I feel the future includes the sentiments you describe. It was a little before my time professionally, but I grew up reading that kind of thinking.

I do think that the open web stuff, decentralized, or at least more decentralized than currently, is the path forward. I've been reading about the AT protocol and it recently becoming an official working group with the IETF.

I feel a second order effect of making decentralized social networking easier, is making individuals more empowered to separate from what they don't believe in. The third order effect is then building separate infrastructure entirely.

As sad as that can be - in my personal opinion it runs the risk of ending the "world wide" part of the web - it appears to be the only way society can avoid enriching the few beyond reason.

munificent 11 days ago [-]

> I'm from the "information wants to be free" sort of persuasion, and now that largely makes me feel kinda old.

Me too, 100%. But that was during a moment in time when that information was more likely to be enabling a person who otherwise didn't have as many resources than enabling a billionaire to make their torment nexus 0.1% more powerful.

> I kinda had to accept that stuff I make was going to be used towards ends I didn't approve of. Something about that is in here too, I think.

Yeah, I've mostly made peace with that too.

The way I think about it is that when I make some digital thing and share it with the world, I'm (hopefully!) adding value to a bunch of people. I'm happiest if the distribution of that value lifts up people on the bottom end more than people on the top. I think inequality is one of the biggest problems in the world today and I aspire to have the web and the stuff I make chip away at it.

If my stuff ends up helping the rich and poor equally and doesn't really affect inequality one way or the other, I guess it's fine.

But in a world with AI, I worry that anything I put out there increases inequality and that gives me the heebie-jeebies. Maybe that's just the way things are now and I have to accept it.

idle_zealot 10 days ago [-]

> But in a world with AI, I worry that anything I put out there increases inequality and that gives me the heebie-jeebies. Maybe that's just the way things are now and I have to accept it.

This observation doesn't really clash with "information wants to be free." You just have to include LLMs in the category of "information," like Free Software types already do for all software. You don't need to abandon your principles, you should shift your demands. A handful of companies can't be allowed to benefit from free information and then put what they make behind a wall.

echion 10 days ago [-]

> Free Software types already do for all software

Free Software types also create software...they didn't just argue for a better license and try to regulate Sun/others to re-license their software; they wrote free (libre) versions of proprietary software and released it for free (cost), which is what counteracted the "[putting] what they make behind a wall". If you're saying "[some] LLMs should be free", I agree.

navaed01 10 days ago [-]

I don’t disagree with you, but this has been going on for a while… Google monetized the web by indexing it and monetized what you wanted to find. Facebook monetized the eyeballs from the pictures and posts you added. Now LLMs will monetize all web content. To play devil’s advocate - LLMs do give something back. Those with ideas and no coding experience can now build entire businesses for little to zero cost. This seems different

munificent 10 days ago [-]

> A handful of companies can't be allowed to benefit from free information and then put what they make behind a wall.

What is there to prevent them?

otterley 10 days ago [-]

Nothing today; but in a democracy, we have the power to make it possible, if people vote the right way.

throwanem 11 days ago [-]

> the "information wants to be free" sort of persuasion

That was always a luxury of its peculiar historical moment, though, wasn't it? Barlow didn't have to care who paid for the infrastructure, but he was just bloviating.

randallsquared 10 days ago [-]

No, it's as true now as it was then. The intellectual property team didn't win on the merits or by law enforcement; it was the convenience of streaming anything at will for a monthly fee that did the trick.

idle_zealot 10 days ago [-]

> it was the convenience of streaming anything at will for a monthly fee that did the trick

That's not the whole story, though. There have been many community-driven projects to bring convenient access to copyrighted works to the masses. You may recall the meteoric success of Popcorn Time. Law enforcement shut them down. Without the hand of the state beating down any popular alternative to legal distribution it absolutely would be the dominant mode of media consumption.

navaed01 10 days ago [-]

It does feel like the collaborative, free open nature of the web has gone and the optimism that brought… it feels like no one would build Foursquare today. But then I wonder if I’m just old and jaded and to the younger generation creating content, for them the web is open and expressive, just in a different way

steveklabnik 10 days ago [-]

I still use swarm every day, and get teased for it all the time.

"So Steve, you're a millennial. What does it mean to 'be the mayor' of something?"

navaed01 10 days ago [-]

I can relate to this so much! IMHO Foursquare genuinely did give the better recommendations for food and drink and I still think this recommendation problem is far from solved.

NiloCK 10 days ago [-]

> It used to feel like I was tending a public garden filled with other people who might enjoy it. It still kind of feels like that, but there are a handful of giant combine machines grinding their way around the garden harvesting stuff and making billionaires richer at the same time.

An underrated upside to being harvested is that your voice has now effectively voted in the formation of the machine's constitution. In a broader ecological sense, you've still tended to a public garden, but in this case your work is part of the nutrient base for a different thing.

Broader still: after the machines squeeze all of our inputs into an opaque crystal, that crystal's very purpose is to leak it all back out in measured doses. Yes, "some billionaire" will own the lion's share of that process, but time so far is telling that efforts can be made to distill strong, open, public versions of the same.

munificent 10 days ago [-]

> time so far is telling that efforts can be made to distill strong, open, public versions of the same.

I do really hope that part of the longer-term answer for AI is LLMs being run locally.

functional_dev 10 days ago [-]

I like the garden analogy.

Writing online used to bring you readers. Now it trains a model, which answers the same questions without sending anyone to your site.

Ajedi32 10 days ago [-]

AI harvesting your garden doesn't destroy the garden though. It's like calling piracy theft; in the digital realm those types of analogies quickly break down.

I also personally still feel like posting reviews on Amazon is a public good. I like helping people. That my efforts also help Amazon as a company is incidental.

Certainly if there were a convenient way to cross post my reviews to a more open platform that would be great. The more people I can help the better. It does annoy me to see companies trying to block scraping as if they own my posts and they aren't part of a commons.

computably 11 days ago [-]

> A lot of people read things, it changes their life, and their life is better. They may not even remember where they read these things. They don't produce citations all of the time. That's totally fine, and normal. I don't see LLMs as being any different. If I write an article about making code better, and ChatGPT trains on it, and someone, somewhere, needs help, and ChatGPT helps them? Win, as far as I'm concerned. Even if I never know that it's happened. I already do not hear from every single person who reads my writing.

Not a contradiction but an addendum: plenty of creative pursuits are not about functional value, or at least not primarily. If somebody writes a seemingly genuine blog post about their family trauma, and I as the reader find out it's made-up bullshit, that's abhorrent to me, whether or not AI is involved. And I think it would be perfectly fair for writers who do create similar but genuine content to find it abhorrent that they must compete with genAI, that genAI will slurp up their words, and that genAI's mere existence casts doubt on their own authenticity. It's not about money or social utility, it's about human connection.

ai5iq 10 days ago [-]

The consent question gets weirder when agents have persistent memory. I run agents that accumulate context over weeks — beliefs extracted from observations, relationships with other agents. At what point does an agent's memory become its own work product vs. derivative of its training? There's no legal framework for that.

intended 10 days ago [-]

> I write for two main reasons

> people read things… their life is better

> it’s just my own

What was the point of writing this though?

Perhaps I should know who you are, but assuming you are a regular HN forum user - you are still very much a participant in a larger information economy / ecosystem.

All of us depend on that system, that commons.

Visits to Wikipedia have dropped by at least 8% since 2025, other estimates are starker. This will have an impact on donations.

These reports are similar for many sites which write or produce content.

Your individual behavior may be perfectly fine, and you are entitled to your perspective, but that doesn’t become a defense for the degradations of the commons.

If anything, it’s a classic example of the kind of argument that ends up entangling ideas and making conclusions harder to reach.

kokanee 10 days ago [-]

That seems fine if you're not publishing content for a living. A lot of people are.

lelanthran 11 days ago [-]

> I don't mean that everyone has to share my perspective. It's just my own.

I think you are walking all around the word "consent" and trying very hard to avoid it altogether.

Your perspective, because it refuses to include any sort of consent, is invalid. No perspective that refuses consent can be valid.

steveklabnik 11 days ago [-]

Consent is absolutely important, but that does not mean that every single thing in the entire world requires explicit consent. You did not ask me for consent to use my words in your comment. That does not mean you're a bad person.

Free use is an important part of intellectual property law. If it did not exist, the powerful could, for example, stifle public criticism by declaring that they do not consent to you using their words or likeness. The ability to do that is important for society. It is also just generally important for creating works inspired by others, which is virtually every work. There have to be lines for cases where attribution is required, and cases where it is not.

lelanthran 11 days ago [-]

> You did not ask me for consent to use my words in your comment.

I am not representing your words as mine. I am not using your words to profit off. I am not making a gain by attributing your words to you.

> There have to be lines for cases where attribution is required, and cases where it is not.

You are blurring the lines between "using a quote or likeness" and "giving credit to". I am skeptical that you don't know the difference between the two.

Regardless, any "perspective" that disregards the need to acquire consent is invalid. Even if you are going to ignore it, you have to acknowledge that you don't feel you need any consent from the people you are taking from.

This whole "silence is consent" attitude is baffling.

steveklabnik 11 days ago [-]

You made an incredibly strong statement that is much broader than what we are talking about. I am pointing out various cases where I think that broadness is incorrect, I am not equating the two.

I do not think that, if you read, say, https://steveklabnik.com/writing/when-should-i-use-string-vs... , and then later, a friend asks you "hey, should I use String or &str here?" that you need my consent to go "at the start, just use String" instead of "at the start, just use String, like Steve Klabnik says in https://steveklabnik.com/writing/when-should-i-use-string-vs... ". And if they say "hey that's a great idea, thank you" I don't think you're a bad person if you say "you're welcome" without "you should really be saying welcome to Steve Klabnik."

It is of course nice if you happen to do so, but I think framing it as a consent issue is the wrong way to think about it.

We recognize that this is different than simply publishing the exact contents of the blog post on your blog and calling it yours, because it is! To me, an LLM is a transformative derivative work, not an exact copy. Because my words are not in there, they are not being copied.

But again, I am not telling anyone else that they must agree with me. Simply stating my own relationship with my own creative output.

sillysaurusx 11 days ago [-]

Just wanted to compliment you on your classy attitude and style, along with your solid points. It’s not easy to take that side of the debate. Cheers.

GeoAtreides 10 days ago [-]

he doesn't have solid points, he conflates fair use with free use (?), ignores thousands of years of attribution history, and equates normal human to human learning with corporate LLMs training on original content (without consent). Great presentation, like you said, to cover the logical defects.

steveklabnik 10 days ago [-]

I did say "free use" instead of "fair use," yeah. That's my mistake, thank you for the correction. If I could edit my original comment, I would, mea culpa. Typos happen.

GeoAtreides 10 days ago [-]

I see. I must congratulate you on your rhetorical prowess, it's nice seeing a professional at work.

sillysaurusx 10 days ago [-]

Fair use of training data hasn’t yet been settled in court. People here are treating it like it has been. But no amount of wishful thinking or moral arguments will change a verdict saying it’s fine for training data to be used as it has been.

Until that question is settled, it’s disingenuous to dismiss his points out of hand as conflating fair use or ignoring consent.

steveklabnik 10 days ago [-]

Even beyond that, the initial legal opinion we do have did in fact point to training being fair use: https://www.reuters.com/legal/litigation/anthropic-wins-key-...

However, I don't feel comfortable suggesting that this is settled just yet: one district judge's opinion does not mean that future cases won't disagree, and we may at some point get explicit legislation one way or the other.

otterley 10 days ago [-]

I think the court dropped the ball here. On the one hand, I think they were right that using existing works--copyrighted or otherwise--to train a model was transformative fair use. On the other hand, Anthropic and others trained their models on illicit copies of the works; they (more often than not) didn't pay the copyright holders.

There's a doctrine in Fourth Amendment law called "fruit of the poisonous tree." The general rule is that prosecutors don't get to present evidence in a criminal trial that they gained unlawfully. It's excluded. The jury never gets to see it even if it provides incontrovertible evidence of guilt. The point is to discourage law enforcement from violating the rights of the accused during the investigative process, and to obtain a warrant as the Amendment requires.

It seems to me that the same logic ought to be applied to these companies. They want to make money by building the best models they can. That's fine! They should be able to use all the source data they can legitimately obtain to feed their training process. But if they refuse to do so and resort to piracy, they mustn't be allowed to claim that they then used it fairly in the transformative process.

steveklabnik 10 days ago [-]

I mean, that is what the court said! Training on pirated data was not fair use. Training on legally acquired data is fair use.

Anthropic legally acquired the data and re-trained on it before release.

otterley 10 days ago [-]

It did not say that. See Judge Alsup's order (https://fingfx.thomsonreuters.com/gfx/legaldocs/jnvwbgqlzpw/...), pp. 29-30, Section IV(B)(ii) ("The Pirated Library Copies").

"[T]he test requires that we contemplate the likely result were the conduct to be condoned as a fair use — namely to steal a work you could otherwise buy (a book, millions of books) so long as you at least loosely intend to make further copies for a purportedly transformative use (writing a book review with excerpts, training LLMs, etc.), without any accountability."

See also p. 31:

"The downloaded pirated copies used to build a central library were not justified by a fair use. Every factor points against fair use. Anthropic employees said copies of works (pirated ones, too) would be retained 'forever' for 'general purpose' even after Anthropic determined they would never be used for training LLMs. A separate justification was required for each use. None is even offered here except for Anthropic’s pocketbook and convenience."

Despite this consideration, the court still found for Anthropic on the question of fair use.

steveklabnik 10 days ago [-]

I don't see how that opposes what I said, that's part of the "training on pirated data is not fair use." That said, I am not a lawyer. From those pages:

> The copies used to train specific LLMs were justified as a fair use.

This is (in my understanding) because those were not the pirated copies.

> The copies used to convert purchased print library copies into digital library copies were justified, too, though for a different fair use.

Buying a book and then digitizing it for purposes of training is fair use.

> The downloaded pirated copies used to build a central library were not justified by a fair use.

Piracy is not fair use, you quoted this part as well.

In the conclusions section at the end of p. 31:

> This order grants summary judgment for Anthropic that the training use was a fair use. And, it grants that the print-to-digital format change was a fair use for a different reason. But it denies summary judgment for Anthropic that the pirated library copies must be treated as training copies.

Training is fair use. Pirating is not fair use, and therefore, you can't train on that either.

What part am I missing?

otterley 9 days ago [-]

I think that's a reasonable way to interpret the court's order, but unfortunately the judge didn't really articulate the consequences of training on pirated copies being "not fair use" as clearly as I would have liked. Does that mean they're simply liable for infringement of those works, or does it mean that they'd be enjoined from using them altogether to train the model? The genie was out of the bottle; how could it be put back in?

Anthropic settled the case with the publishers just a few months later, leaving the question mostly unsettled still.

steveklabnik 9 days ago [-]

I see. Thanks. I cannot wait until this is settled law too.

GeoAtreides 10 days ago [-]

I was just enumerating some of the issues with the '''solid''' points OP made. Actually addressing them would take too long and be an exercise in futility, here, in HN, in April 2026. Why would I put in the effort, for my comment to be flagged and sent to the void? or worse, persisted forever and used for training without my consent?

And yes, you are right, the legal and moral question of fair use in training data hasn't been settled yet; we agree here.

lelanthran 11 days ago [-]

> But again, I am not telling anyone else that they must agree with me. Simply stating my own relationship with my own creative output.

Look, I'm not saying that you are doing that, I'm pointing out that "Silence is consent" is not as strong an argument as many think it is.

ModernMech 10 days ago [-]

> you don't feel you need any consent from the people you are taking from.

What has been "taken", exactly?

lelanthran 10 days ago [-]

> What has been "taken", exactly?

Where are you going with this line of thought? That making a copy of someone's work, using it for profit and not crediting them doesn't "take" anything from them?

ModernMech 10 days ago [-]

I find that these discussions at the intersection of art and law tend to blur technical and familiar uses of words. So it's important to specify what was actually taken here because otherwise the discussion becomes muddy.

"making a copy of someone's work, using it for profit and not crediting them" wasn't really the scenario being discussed in this thread -- is that what you meant by "taking"?

Steve had made the point:

  Not every single thing in the entire world requires explicit consent.

But actually taking someone else's verbatim work and selling it as your own is one of those instances where consent would be required, because many people see a clear line between someone selling another author's work and the author not getting a dollar because of that.

That doesn't preclude other instances where explicit consent is not required. For example, do I need your consent to learn from your work and produce similar work of my own? Am I required to credit you in my work for having learned from you? Am I taking from you if I don't share my profits with you?

Some rights holders would say yes, actually, which I don't agree with. I think it's important that we not require the artist's explicit consent for all things, because listening to some rights holders (e.g. Disney), they have very expansive ideas about what kind of control they are owed by society over their creations.

Therefore, I think if you're going to claim something has been taken, you should specify what exactly.

satvikpendem 10 days ago [-]

> you don't feel you need any consent from the people you are taking from

In most cases, no, I (and it seems most others) don't feel the need for that, it is only you who seems to have an ideological hangup over this.

lelanthran 10 days ago [-]

>In most cases, no, I (and it seems most others) don't feel the need for that, it is only you who seems to have an ideological hangup over this.

It's not an ideological hangup, it's confusion over the assumption by certain groups that "silence is consent", when it is not.

altruios 11 days ago [-]

refuse consent?

You may need to clarify that thought.

I don't think the poster has a viewpoint that 'refuses consent'; their viewpoint is that the writing they put out for others to view is for others to view, regardless of how it is viewed. They seem to be giving consent, not refusing it, no?

lelanthran 11 days ago [-]

> refuse consent?

Who said anything about refusing consent?

altruios 10 days ago [-]

> I think you are walking all around the word "consent" and trying very hard to avoid it altogether.

> Your perspective, because it refuses to include any sort of consent, is invalid. No perspective that refuses consent can be valid.

This is what I was responding to. I do not understand your thinking in this post.

lelanthran 10 days ago [-]

> This is what I was responding to. I do not understand your thinking in this post.

I thought it was clear from "refuses to include any sort of consent" that I am talking specifically about holding an opinion that refuses to include consideration for consent, not refuses consent for usage.

altruios 10 days ago [-]

But that's what I'm confused about:

How is freely giving consent for (all) others to read your content not 'considering consent'?

I'm not trying to be snarky. I really don't see the missing piece that isn't written that connects those dots.

konschubert 11 days ago [-]

> Prior to the industrial revolution, the natural world was nearly infinitely abundant.

The opposite is true. Central Europe was almost devoid of trees. Food was scarce as arable land bore little fruit without fertiliser.

Society was Malthusian until the Industrial Revolution.

jsmo 11 days ago [-]

Can we interpret "abundant" in a Darwinian sense, e.g. diversity of life? I would think the industrial farming revolution decreased crop variety over time, and the same goes for animal lineages, aside from the rapid increase in mixed poodle breeds.

Sharlin 10 days ago [-]

Crop variety was decreased by the original farming revolution, about 10k years before the industrial revolution. Rather than eating whatever was available, the large majority of the caloric input of an agricultural society comes from a few staple crops optimized for overwinter storability and producing large yields and thus supporting a large number of people.

The industrial revolution didn’t qualitatively change farming. It just made it possible to have more of it thanks to machine labor. The same goes for the later agricultural revolutions.

imtringued 10 days ago [-]

This is particularly evident if you had been around rural villages in eastern Europe in the late 00s, particularly those inhabited by elderly people at 70 years old and above.

They were still doing subsistence agriculture to supplement their own income well into the 21st century. Of course they didn't grow enough calorie-heavy crops like corn, potatoes or wheat to live entirely off the land, but they had enough food that a bi-monthly shopping trip with their children was enough to get by.

xyzzyz 9 days ago [-]

No, they totally grew enough calories for themselves. My grandparents lived like that. They farmed around 15 hectares, which was actually quite a lot. You can easily grow enough calories for your family on 5 hectares, or even less if you have access to modern cultivars and artificial fertilizer. It’s just that even poor people like variety, and will trade some of their crops for stuff they cannot make at home efficiently, like sugar, fish, or candy.

aerhardt 10 days ago [-]

To add, I don’t think my ancestor Spaniards for example needed the help of machines to deplete mines in America. They also came already equipped with all kinds of legal systems, including the Requerimiento, which they read out loud to natives in preposterous spectacle.

In general the transition from feudalism to capitalism, including the formation of the legal systems that supported the latter, happened gradually for maybe up to four or five centuries before the steam engine had been invented.

Sure, the Industrial Revolution further accelerated the development of property rights, mercantile, and civil laws, but all in all I don’t think there’s much truth that machines were the primary cause of such developments.

jltsiren 10 days ago [-]

Not really Malthusian. Agricultural societies had adapted to keep the population stable during normal times and bounce back in a generation or two after bad times. Those cultural adaptations stopped working when childhood mortality declined.

Useful land was a scarce resource in more civilized regions, while labor was cheap. Given enough land, subsistence farmers could easily feed themselves outside particularly bad years. But much of the land belonged to local elites, and commoners had to work that land to fund the pursuits of the elites.

arjie 11 days ago [-]

If I'm being honest, I've never related to that notion of remuneration and credit being the primary reason to write something. I don't claim to be some great writer or anything, but I do have a blog I write quite often on (though I'm traveling in my wife's Taiwan now and haven't updated it in a while). But for me, I write because it feels good to do so. Sometimes there's a group utility in things like I edit a Google Maps listing to be correct even though "a faceless corporation is going to hoover up my work and profit off it without paying me for my work" and I might pick up a Lime bike someone's dropped into the sidewalk even though "a faceless corporation is externalizing the work of organizing the proper storage of their property on public land without paying the workers" or so on.

I just think it's nice to contribute to the human commons and it's fine if some subset of my fellow organisms uses it in whatever way. Realistically, the fact that Brewster Kahle is paid whatever few hundred thousand he's paid for managing a non-profit that only exists because it aggregates other people's work isn't a problem for me. Or that Larry Page and Sergey Brin became ultra-rich around providing a search interface into other people's work. Or that Sam Altman and Dario Amodei did the same through a different interface.

This particular notion doesn't seem to be a post-AI trend. It seems to have happened prior to the big GPTs coming out where people started doing a lot of this accounting for contribution stuff. One day it'll be interesting to read why it started happening because I don't recall it from the past. Perhaps I just wasn't super plugged in to the communities that were complaining about Red Hat, Inc.

It's not that I don't understand if I sold my Subaru to a guy who immediately managed to sell it to another guy for a million times the money. I get that. I'd feel cheated. But if I contributed a little to it, like I did so Google would have a site to list for certain keywords so that they could show ads next to it in their search results, I just find it so hard to be like "That's my money you're using. Pay me!".

wat10000 11 days ago [-]

You do it as a hobby, that's fine. Some people do it for a living. And while they aren't owed a living doing that specific thing, it is going to be a big problem for them if they can't make money at it anymore.

I'm sure plenty of people feel the same way about software. They make software as a hobby and don't care about remuneration or credit. Meanwhile I write software for my day job and losing the ability to make money from it would be devastating.

arjie 11 days ago [-]

Ah, I see. It’s just straightforward protectionism like dockworkers opposing automation and so on. That I do comprehend, in fact.

I write software too and I may no longer be able to just do it in the old way. Pretty scary world but also exciting. I can’t imagine trying to restrict LLM software writers on that basis but I can comprehend it as simply self-interest.

Fair enough.

wat10000 11 days ago [-]

Do you make money writing software? I bet you either try to restrict LLM usage or assign your rights to an employer who does. Putting code in the public domain is pretty rare, and extremely rare for paid work.

arjie 11 days ago [-]

I allow them to train on my work as described here (for example) https://code.claude.com/docs/en/data-usage

And I do paste code into CC. I’m not super concerned that they’ll see it.

That’s fine by me. It doesn’t require putting code in the public domain which is something else entirely.

I make money off hosted software so in some sense there is writing involved at one end. But I’m not paid by output tokens.

wat10000 11 days ago [-]

If your code isn't in the public domain, then anything you haven't explicitly allowed them to train on is restricted for them. They've been ignoring that for anything they can actually get their hands on, but it's there.

gopher_space 10 days ago [-]

It’s about the amount of time available.

MetaWhirledPeas 10 days ago [-]

> Some people do it for a living.

I was going to write, "not for long," which might be true for some. But then I realized there will always be a difference between LLM output and human writing. We don't read blogs because of their facts, we read them because of how the facts are presented and how the author's personality comes through on the page.

EDIT: That said, LLMs are great at faking it, and a lot of amateur writing will be difficult to distinguish from LLM output. So I'm disagreeing with myself a bit.

But we are talking about "slurping up" IP and regurgitating it, right? OK. So if I slurp up Mickey Mouse and output Mickey Mouse, that's an offense. But what if I slurp up a billion images and output some chimera? That's what the LLMs do. And that's what humans do too.

xyzzyz 10 days ago [-]

Prior to the Industrial Revolution, nobody could go hunt in the woods, because the woods were the King’s, and poaching the King’s game carried the death penalty. The situation was similar on the continent: the tiny slivers of remaining woodland were off limits.

Granted, things were different in the New World, as a result of the mass depopulation event following the Columbian exchange. But even there, the megafauna was hunted to extinction soon after humans first appeared there.

Anyway, the point is that no, prior to the Industrial Revolution, the world was full of scarcity, not abundance.

dogcomplex 10 days ago [-]

>This completely unpends the tenuous balance between creators and consumers. Why would a writer put an article online if ChatGPT will slurp it up and regurgitate it back to users without anyone ever even finding the original article? Who will contribute to the digital common when rapacious AI companies are constantly harvesting it? Why would anyone plant seeds on someone else's farm?

This is completely reversed. Why should anyone honour the right of some creator who was merely the first to plant their flag on a creative task that is now absolutely trivial to perform by AI? Who needs a digital commons when creation itself is now the commons and freely accessible for pennies? The seeds plant and grow by themselves now. The only question is who should be allowed to claim the farms?

Answer: No one. AI companies will have their lunch eaten by open source. And if they don't - they should be nationalized and protocolized into free utilities. The entire idea of digital ownership should (and will) be abolished by the very nature of this technology.

The digital world is the new infinitely-abundant nature. We're just returning it to where it should have been, before corporations clawed it into fenced off empires.

gritspants 11 days ago [-]

At what point do we look at 'Industrial Society and its Future' and go from "yeah that'll never happen", "ok some parts of it are happening", to ...? I swear tech folks are the most obtuse people on the planet.

sweezyjeezy 11 days ago [-]

I think it's completely normal. Whenever automation comes knocking, people are inclined to think it's going to flatline conveniently before their job is at risk. LLMs can code now? Cool, they can't code well though can they? Oh they can code pretty well now? Cool, coding was never the hard part of SWE anyway, it's [thing we have no reason to think AI can't beat 99% of humans at at some point], etc

I think SWE as a mainstream profession is much nearer to the end than the beginning, I'm curious and quite scared about what becomes of us.

imtringued 10 days ago [-]

The problem is that software development contains domain independent and domain specific skills. Since information processing is domain independent, replacing software developers in general will require beating them not only in the domain independent skills, which is what the recent breakthroughs have been about, but also in every single domain dependent skill.

This makes software development AGI-complete. If you have an LLM that can write software for every domain, then for every task you assign it, it could build software that performs the assigned task and thereby solves every problem in existence.

What I'm trying to get at here is that an "SWE" is a biological machine building machine. If you have a digital machine that can build any machine, you haven't solved the first step, you've solved the final step in all of human history that ever needs to be done, whatever that means. Beyond that point, human work no longer exists, because the machines have taken over everything.

gritspants 10 days ago [-]

I don't think you understand. Frankly, AI is a failure if all it does is replace coders. AI needs (given its current investment levels) to conquer all forms of knowledge work. This is an example of tech/industry needing to impose itself on society, rather than society needing it.

satvikpendem 10 days ago [-]

That's how human progress works. No one can want or need it because they cannot conceptualize wanting it until someone shows that it is possible. Now, many of those wants become needs.

gritspants 10 days ago [-]

We can absolutely conceptualize what we want or need. I was born in 1980 in NYC. When I was a boy my father took me to a tech conference where they had a demo of ordering TV shows on demand. It was a miracle, to my young mind. Was this what I needed?

Growing up I had a friend group of misfit boys, who discovered h4ck1ng and phr34king. But we also discovered Slackware Linux on 3.5" floppies. We also had to discover ASM and compiling the Linux kernel in order to do anything with it. Boys with machines. That wasn't what I needed either.

Later on we did have great things with tech. Google made the world searchable in ways Altavista didn't. I remember strapping the original iPod on my arm to go for runs outside. I didn't even need a car for a while, since investors subsidized my Uber rides to and from the office.

Now, it seems the US is balanced on a precipice. The economy seems to have an incredible amount of money desperate to grow, but to what purpose? In my lifetime, and in my parents', and their parents' before them, when the dollar becomes restless the flag goes forth. The dollar follows the flag.

And here we are at war.

satvikpendem 10 days ago [-]

You wouldn't have known about a TV had you not seen it. That is what I mean by, people generally can't conceptualize what they want or need until they see it.

gritspants 10 days ago [-]

Wants and needs are not the same. We are experiencing the difference in real time. AI does not give society a want or need.

satvikpendem 10 days ago [-]

My point was not about the difference, it was about the fact that average people cannot conceptualize new ideas until one person or team invents it, then the average person will want or need it.

As for AI, I and many others want it, and some even need it, in certain use cases. Speak for yourself.

gritspants 10 days ago [-]

I believe the idea that you (or I) might know better than the 'average people' to be incredibly conceited, arrogant, and frankly wrong. It is an attitude that gives you superiority for having achieved nothing.

satvikpendem 10 days ago [-]

I'm not sure what you're even talking about, you're putting words and an argument into my mouth which I never said.

gritspants 10 days ago [-]

Well then I owe you an apology. Perhaps I inferred too much about your point of view and understood too little, which is my own loss. Sorry.

sweezyjeezy 10 days ago [-]

I think your numbers are off. TAM for office workers is ~20T a year, of which SWE compensation is ~3T. So if they can make 3T x 10% x 5 years = 1.5T, that covers their current valuations. It's not as insane as you make out, even without taking into account the other high-risk areas like legal, accounting, etc.

pnexk 10 days ago [-]

Hit the nail on the head with that framing. So many articles are now coming out addressing the anxieties about adoption of a new technology, but we genuinely don’t really need it as a society.

I still wonder if we really needed the iPhone or many other things we’re told are “progress” and innovation in an arrow-of-time manner. The future is not set in stone and things need not play out in this manner at all. Unlike the iPhone, where most were excited by its possibilities (even if they traded precious privacy in the name of convenience), there’s no clear reason to think that this version of LLM-driven technologies offers more upsides than downsides.

trinsic2 10 days ago [-]

I have been thinking about this. I was pretty adamant a few months ago that AI is going to make a lot of things worse for everyone because of the externalities of the technology (data center creep, lock-in of models, etc.), and it probably still will. But then someone suggested to me that I use Claude Code to upgrade my SSG site to the new version, because I had been sitting on my ass as the years went by, missing deadline after deadline. I just couldn't put myself into gear to upgrade it. It was massively out of date (10 years plus) and I knew it was going to be a nightmare to deal with the problems. I was probably making it harder in my head than it really was.

So I purchased Claude Code pro and the thing upgraded my site pretty well. There were things it missed because I didn't know the problems existed in the first place until the upgrade was complete, but I had a working updated site in less than an hour. If I had done this myself it would have taken me days/weeks.

So at that point I realized something. It's a tool that can handle a good amount of the tasks I throw at it as long as I am specific. I think the problem with most people is they expect it to respond like a human. That's not going to happen, IMHO. Maybe some day it will be more than what it is, but right now it's just a tool. I don't care what anyone says about AGI and the like. It's not going to happen with the current iteration (the pattern recognition type). We are going to need more than that if we want to simulate a human brain.

The point is (and I know this is not going to be received very well, mostly because this tech is in the hands of people that are gatekeeping it) that maybe someday we might reach a point where all of humanity's knowledge is put into these things and we can use them to better our lives. Maybe at some point we won't need to hold onto or hoard things as if it's the only way we can make a living? And instead we can build things just for the sake of creating them and improve humanity in the process? Obviously the commercial model of these things is not great, and that is going to have to be dealt with, but I can see a future where we might be able to fix a lot of humanity's problems with this technology as more and more good people put it to use for things that help humanity.

idiotsecant 10 days ago [-]

You raised a point and then never answered it. Why would anyone plant seeds on someone else's farm?

trinsic2 9 days ago [-]

Because maybe, someday, somehow, we will realize that these farms we are creating are all connected. When we share resources we prosper more than we would if we were all separate. But that wouldn't happen right away, enough people would have to have buy in for this to happen so I understand the concern.

lpedrosa 10 days ago [-]

Well, maybe because life is not a zero sum game? Sometimes you do things just because.

drob518 11 days ago [-]

A couple thoughts…

Mostly, AIs don’t recite back various works. Yes, there are a couple of high-profile cases where people were able to get an AI to regurgitate pieces of New York Times articles and Harry Potter books, but mostly not. Mostly, it is as if the AI is your friend who read a book and gives you a paraphrase, possibly using a couple sentences verbatim. In other words, it probably falls under a fair use rule.

Secondly, given the modern world, content that doesn’t appear online isn’t consumed much, so creators who are doing it for the money will certainly continue putting content online. Much of that content will be generated by AIs, however.

triceratops 11 days ago [-]

You're missing the point. This is the crux of munificent's argument IMO (and I've made variations of it as well)

> We have copyright and intellecual property law already, of course, but those were designed presuming a human might try to profit from the intellectual labor of others.

You getting a summary of a copyrighted work from a friend is necessarily limited by the number of friends you have, the amount of time they have to read stuff and talk to you, and so on. Machines (and AIs) don't have any limitations.

drob518 11 days ago [-]

Yes, true. But does that really shift the argument much? An AI is like the most well-read book nerd you’ve ever met. The AI has read everything. They still won’t recite Harry Potter for you at full length and reading what the original author wrote is part of the pleasure.

triceratops 11 days ago [-]

> An AI is like the most well-read book nerd you’ve ever met. The AI has read everything

But no real book nerd has read everything. Current law was designed for the capabilities of humans.

drob518 10 days ago [-]

Sure, we could change current law, but I think that only forces an AI company to buy one copy of every book. I don’t think it gives any sort of royalty stream to anyone beyond that. Copyright is literally the right to make copies. Once I have acquired a copy, I can read it, summarize it, transform it, etc. in myriad ways.

intended 10 days ago [-]

I don’t think that's how fair use works.

triceratops 10 days ago [-]

You can't make copies though. AI training requires making copies of materials, even if they're purchased.

drob518 10 days ago [-]

Not true. You can photocopy pages from a book you own for your own use. You can make copies of purchased software as a backup. What you can’t do is make copies and give them to all your friends or sell them to the public.

triceratops 9 days ago [-]

> You can photocopy pages from a book you own for your own use

No. You won't get in trouble for it. But it is against the law. https://www.copyright.gov/what-is-copyright

"U.S. copyright law provides copyright owners with the following exclusive rights: Reproduce the work in copies"

This doesn't differentiate between partial and complete copies.

> You can make copies of purchased software as a backup

This is true. They had to write out that exception for digital media. And the key is "backup". You can't run or use multiple copies if you only own one.

drob518 8 days ago [-]

While the rules for fair use are not black and white, one of the primary tests is whether the copying impairs the market for the work. If you want to copy pages of a book to mark them up, for instance, so your original copy stays clean, that would generally fall under fair use. You aren’t selling the copy or the original. You aren’t giving one or the other to other people, thereby eliminating a potential sale. You are copying some pages, not the entire work, cover to cover. As you say, you wouldn’t get in trouble for it in any case, but I’m pretty sure that it would be covered under fair use. But yea, if you photocopy a book and give it to your friend, that’s illegal.

intended 10 days ago [-]

Yes.

1) Quantity is its own quality: Scale makes a difference

2) The tools themselves automate tasks and consolidate their outputs. The “sale” of a piece of content, and its consumption, shifts away from the people producing it. Example: we have entire networks and systems that depended on consumption occurring on the site itself. News websites, or indie sites, depend on ad revenue.

nrabulinski 11 days ago [-]

Does a literal book nerd profit megacorporations when they bring up books to you? While burning through a household's worth of energy in the process? Also, I’d like to talk with such a book nerd because they’d have opinions on books; potentially, if I brought up something I have read, we could exchange thoughts about it, and they could make recommendations for me based on their complex experiences instead of statistics from Reddit comments. An LLM can do none of those, while also doing the former. It’s a lose-lose.

Also, a book nerd doesn’t need roughly all human-created text as training to produce meaningful results. It’s just such a misplaced analogy, and people have been making it ever since OpenAI announced ChatGPT for the first time. Why do people think “an LLM is just a human who read a lot”?

charcircuit 10 days ago [-]

Megacorporations making profit is not some evil that needs to be stopped. The economy is not zero sum.

zephen 10 days ago [-]

> The economy is not zero sum.

This is true.

But it's not always positive sum, either.

> Megacorporations making profit is not some evil that needs to be stopped.

Externalities are a thing. It's not about the profit per se, but about how (a) the making of that profit might negatively impact others, and (b) the deployment of that profit in pursuit of rent-seeking and other antisocial behavior in order to ensure its continued existence might also negatively impact others.

drob518 10 days ago [-]

Externalities are a thing, but this isn’t exactly dumping toxic waste into a river.

trinsic2 9 days ago [-]

I disagree with that. From what I read, data centers are going to have some real-world negative effects on human populations.

zephen 9 days ago [-]

No, it's more just drying the river up entirely.

https://www.texastribune.org/2025/09/25/texas-data-center-wa...

bluefirebrand 11 days ago [-]

> It really feels like we're in the soot-covered child-coal-miner Dickensian London era of the Information Revolution and shit is gonna get real rocky before our social and legal institutions catch up

The really discouraging part of this is that it feels like our social and legal institutions don't even care if they catch up or not.

Technology is speeding up and the lag time before anything is discussed from a legal standpoint is way, way too long.

Waterluvian 10 days ago [-]

Maybe it’s a useless distinction but it feels like we’ve gone through overlapping ages of Communication, Data, Processing, and Information.

First we conquered the ability to move matter and transmit signal, greatly shrinking the world. Next was sensor technology, especially the mobilization of it, and our ability to collect more data than we could ever imagine being able to process. Then we started going crazy with data centres and big data and the idea that maybe we can somehow correlate it all if we just process it enough. And now we’re finally turning data into information, building enormous graphs of correlation without even having to manually reason about a lot of it. Before AI, the hard part was figuring out how to go about finding the signal you needed. Now it’s getting easier at an incredible speed.

twobiers 10 days ago [-]

> There is a whole giant essay I probably need to write at some point, but I can't help but see parallels between today and the Industrial Revolution.

You might want to give the following articles a read then: https://www.ufried.com/blog/ironies_of_ai_1/ https://www.ufried.com/blog/ironies_of_ai_2/

raincole 10 days ago [-]

> Prior to the industrial revolution, the natural world was nearly infinitely abundant.

Prior to the industrial revolution, people fought to the death over who could use the rivers. Pre-industrial societies were societies of scarcity. [0]

[0]: People have been fighting for water for more than 4000 years: https://en.wikipedia.org/wiki/Umma%E2%80%93Lagash_war

derangedHorse 10 days ago [-]

> If all of us can go hunting in the woods and yet there is still game to be found, then there's no compelling reason to define and litigate who "owns" those woods.

Property rights don't just protect natural resources, but labor as well. If I cleared out hunting ground in that forest to be the prime spot to catch animals, I would make sure I can use it when I want.

> a small number of people were able to completely deplete parts of the earth

A small number of people seems inaccurate when there are typically many more individuals in the pipelines for these technologies.

> and in return profit off the knowledge over and over again at industrial scale

Not off just that knowledge, there needed to be a model trained on the data of many others to utilize it.

> Why would a writer put an article online if ChatGPT will slurp it up and regurgitate it back to users without anyone ever even finding the original article?

Who's better at writing in this scenario and what are my motivations? If it's ChatGPT and I did it for money, then I would say I should recognize that I can't compete and find something AI can't do. If it's ChatGPT and I write to convey my ideas in an effort to learn regardless of the bestowment of a new perspective on the reader, I'll keep writing.

> Why would anyone plant seeds on someone else's farm?

They wouldn't unless it was their own way to attain food and survive. And if it's not the only way, they can defer to those with optimal methods to get it the cheapest they can in the market.

kilbey1 10 days ago [-]

I'm halfway through Foundation on Apple TV and this piece landed hard (you had me at Asimov) because of it. Asimov's whole deal with psychohistory is that you can predict what large populations do even when individuals are unpredictable. Seldon doesn't need anyone to be honest; he needs the math to converge on something real about how people actually behave.

LLMs are sort of the inverse of that. They produce text that looks like the statistical aggregate of human knowledge, but nothing underneath is converging on truth. Seldon's math worked because it modeled actual dynamics. LLMs work because they model plausible text. The "jagged competence frontier" Kingsbury describes: crushing multivariable calculus, failing a word problem, is exactly what you'd get from a system that learned the shape of correct answers without learning what makes them correct.

The part of Foundation that feels prescient right now isn't the predicting-the-future stuff. It's the part where everyone can see Empire is hollowing out and the response is to just...keep going. More spectacle, more confidence, less substance holding any of it up. Hmmm, wait...

idiotsecant 10 days ago [-]

how are you enjoying the live action saturday morning cartoon version of Foundation with bonus plucky protagonists?

kilbey1 8 days ago [-]

Ha! Yeah, it is not the books. That said, it's been long enough since I read them that I didn't feel too annoyed ("oh wait, I don't remember that in the books" did come up more than once, as well as "oh wait, they're mixing up a bunch of books, is this the Robot series?"). Personally, I think it's done really well, but hey, this is a way to make the money come in longer.

Setas 9 days ago [-]

[dead]

noosphr 10 days ago [-]

>Prior to the industrial revolution, the natural world was nearly infinitely abundant. We simply weren't efficient enough to fully exploit it.

The mammoths disagree.

some_random 10 days ago [-]

That is straightforwardly not true, land ownership was very well defined and the people who hunted in it without permission were prosecuted.

slibhb 10 days ago [-]

> Why would a writer put an article online if ChatGPT will slurp it up and regurgitate it back to users without anyone ever even finding the original article?

In the brave new world we're creating, people will write specifically for AI. If you can impress models so much that they "regurgitate" your work, then your work has achieved a kind of immortality.

AlexCoventry 10 days ago [-]

> If all of us can go hunting in the woods and yet there is still game to be found, then there's no compelling reason to define and litigate who "owns" those woods.

https://en.wikipedia.org/wiki/Feudalism

EamonnMR 10 days ago [-]

> We have copyright and intellecual property law already, of course, but those were designed presuming a human might try to profit from the intellectual labor of others. With AI, we're in the industrial era of the digital world. Now a single corporation can train an AI using someone's copyrighted work and in return profit off the knowledge over and over again at industrial scale.

The idea that copyright simply doesn't apply to AI has more to do with AI companies deciding that they're not going to comply with those laws than the design of the laws. Also a very successful lobby against enforcement by positioning AI as a strategic necessity.

randomNumber7 10 days ago [-]

It's not possible (or at least extremely hard) to prove that the final weights they come up with resulted from copyright infringement.

That's why they are valued so highly on the stock market. Basically they will steal all the value of intellectual property in a semi-legal way.

ares623 10 days ago [-]

The silver lining in that scenario is that consumers can "choose" to just go back offline. I put choose in quotes because with so many things in life requiring online accounts nowadays, that choice is tenuous.

monocasa 11 days ago [-]

> Prior to the industrial revolution, the natural world was nearly infinitely abundant. We simply weren't efficient enough to fully exploit it. That meant that it was fine for things like property and the commons to be poorly defined. If all of us can go hunting in the woods and yet there is still game to be found, then there's no compelling reason to define and litigate who "owns" those woods.

I mean, medieval Europe (speaking broadly) had pretty well defined property rights wrt hunting. In fact, the forester at the time was thought of as one of the most corrupt jobs, as they'd commonly have side hustles poaching and otherwise illegally extracting resources from the lands they enforced and kept others from utilizing in a similar way. Quis custodiet ipsos custodes?

voidhorse 10 days ago [-]

I've been looking at things from the same lens since 2023. At the same time, the depletion/hoarding bit isn't new. Companies were already doing this with consumer data, LLMs are just finally the factory moment—now that we have all the raw material we finally have a means of automating production using it.

So, in some ways, I also view LLMs as a pivotal and important wake up call. Companies were already taking the data and using it for a variety of other purposes—it was just way less evident to people when they weren't in direct competition with labor, since, under capital, labor is what we sell.

Either an entire new industry needs to form, or it's finally time to move beyond capitalism. Centralized capital ends up killing itself, because it effectively shuts down its own engine if it kills off consumers, who can only exist in the first place if the wage labor structure holds.

delusional 10 days ago [-]

>Prior to the industrial revolution, the natural world was nearly infinitely abundant.

>We had to invent giant legal systems in order to determine who has the right to do that and who doesn't.

Excuse me? The industrial revolution was like 300 years ago. We had laws before that.

pocksuppet 11 days ago [-]

Stuff gets put online when the reader isn't the customer. Someone is paying for a reader to be told certain things. So it's free at the point of reading.

nick32661123 11 days ago [-]

Our only hope is that AI in the long run is both powerful and benevolent enough to be its own "whistleblower" in cases of misuse.

irishcoffee 10 days ago [-]

I struggle so hard with this anthropomorphism of LLMs. At the end of the day it's a statistical gradient descent predictor with a bunch of "shit" bolted on top to try and steer outputs in a specific way.

They don't have the actual concept of "benevolent"... or a concept of anything at all. Based on an input, they regress down a path of "what is the next most probable statistical token to output next" and that's fucking it, with the bolted-on shit manipulating these outputs a bit.

I don't doubt that at some point there will be some other AI leap, but I'm not even sure it'll be built on this foundation.

What really needs to be developed is an actual artificial brain of sorts. Much like an infant learns language from first principles, a real AI would have a phase of continuous growth, creating actual memories and being able to reflect upon them. I daresay context windows are not that.

I'd really like to encourage anyone to pump the brakes a bit on how these things actually work, and what they actually are. There is a reason sama is pivoting away from video, et al. and into corporate software coding, much like Anthropic.

navaed01 10 days ago [-]

The natural world was not meaningfully abundant… Way before the industrial revolution, land which was once used for open hunting was closed off by the ruling class. Even before the Industrial Revolution you had a new class of merchants and factory owners who earned riches to buy land and keep the poor from hunting on it. Many natural resources were out of reach for the majority and only accessible to those with deep pockets.

AnthonyMouse 11 days ago [-]

> We are truly in the Information Age now, and I suspect a similar thing will play out for the digital realm.

The analogy seems to be backwards though. It would be as if we previously had a scarcity of land and because of that divided it up into private property so markets could maximize crop yield etc. and then someone came up with a way to grow food on asteroids using robots, and that food is only at the 20th percentile of quality but it's far cheaper. Suddenly food becomes much more abundant and the people who had been selling the 20th percentile food for $5 are completely out of the market because the new thing can do that for $0.05, and the people providing the 50th percentile food for $10 are also taking a hit because the price difference between what they're providing and the 20th percentile stuff just doubled.

The existing plantation owners then want to put a stop to this somehow, or find a way to tax it, but arguments like this have a problem:

> Why would a writer put an article online if ChatGPT will slurp it up and regurgitate it back to users without anyone ever even finding the original article?

This was already the status quo as a result of the internet. Newspapers were slowly dying for 20 years before there was ever a ChatGPT, because they had been predicated on the scarcity of printing presses. If you published a story in 1975 it would take 24 hours for relevant competitors to have it in their printed publication and in the meantime it was your exclusive. The customer who wants it today gets it from you. On top of that, there weren't that many competitors covering local news, because how many local outlets are there with a printing press?

Then blogs, Facebook, Reddit and Twitter come and anyone who can set up WordPress can report the news five minutes after you do -- or five hours before, because now everyone has an internet-connected camera in their pocket so the first news of something happening now comes in seconds from whoever happened to be there at the time instead of the next morning after a media company sent a reporter there to cover it.

The biggest problem we have yet to solve from this is how to trust reports from randos. The local paper had a reputation to uphold that you now can't rely on when the first reports are expected to come from people with no previous history of reporting because it's just whoever was there. But that's the same thing AI can't do either -- it's a notorious confabulist.

And it's the media outlets shooting themselves in the foot with this one, because too many of them have gotten so sloppy in the race to be first or to pander to partisans that they're eroding the one advantage they would have been able to keep. Damn fools to erode the public's trust in their ability to get the facts right when it's the one thing people would otherwise still have to get from them in particular.

intended 10 days ago [-]

This assumes the limiting factor is content generation, not ability to read and verify.

You make the point later in your comment (the "randos" part), but treat it as a minor issue.

The actual limits are verification, and then attention. Verification is always more expensive than generation.

However, people are happy to consume unverified content which suits their needs. This is why you always needed to subsidize newspapers with ads or classifieds.

AnthonyMouse 9 days ago [-]

> This assumes the limiting factor is content generation, not ability to read and verify.

Content generation is the thing copyright applies to. If you want to create a reward system for verification, it's not going to look anything like that.

It mostly looks like things we already have, like laws against pretending you're someone else to trade on their reputation, so that people can build a reputation as trustworthy and make money from subscriptions or ads by being the one people turn to when they want trustworthy information.

> However, people are happy to consume unverified content which suits their needs. This is why you always needed to subsidize newspapers with ads or classifieds.

I suspect the real problem here is the voting thing. When people derive significant value from information they're quite willing to pay for it. Wall St. pays a lot of money for Bloomberg terminals, companies pay to do R&D or market research, individuals often pay for financial software or games and entertainment content etc.

But voting is a collective action problem. Your vote isn't very likely to change the outcome so are you personally going to spend a lot of money to make sure it's informed? For most people the answer is going to be no, so we need something that gives them access to high quality information at minimal cost if we want them to be informed.

Annoyingly one of the common methods of mitigating collective action problems (government funding) has a huge perverse incentive here because the primary thing we want people to be informed about is political issues and official misconduct, so you can't give the incumbent politicians the purse strings for the same reason the First Amendment proscribes them from governing speech.

So you need a way to fund quality reporting the public can access for free. Advertising kind of fit but it never really aligned the incentives. You can often get more views by being entertaining or inflammatory than factual.

The question is basically, who can you get to supply money to fund factual reporting for everyone, whose interest is for it to be accurate rather than biased in favor of the funder's interests? Or, if that's not a thing, whose interests are fairly aligned with those of the general public? Because with that you can use a patronage model, i.e. the content is free to everyone but patrons choose to pay money because they want the work to be done more than they want to not pay.

The obvious answer for "who" is then "the middle class" because they're not so poor they can't pay a few bucks while still consisting of a large diverse group that won't collectively refuse to fund many classes of important reporting. But then we need two things. The first is for the middle class to not get hollowed out, which we're not doing a great job with right now.

And the second is to have a cultural norm where doing this is a thing, i.e. stop teaching people illiterate false dichotomy nonsense where the only two economic camps are "Soviet Communism" in which the government is required to solve everything through central planning and "greed is good" where being altruistic makes you a doofus for not spending all your money on blackjack and cocaine. People rather need to be encouraged to notice that once their basic needs are met, wanting to live in a better world is just as valid a use for free time and disposable income as designer shoes or golf.

idiotsecant 10 days ago [-]

Absolute peak HN energy that the top reply to this very insightful point is a rambling pedantic argument about the finer points of agricultural development.

randomNumber7 10 days ago [-]

> Why would a writer put an article online if ChatGPT will slurp it up and regurgitate it back to users without anyone ever even finding the original article?

I'm happy to miss all the stuff that was written just for the financial benefit of the author.

joefourier 11 days ago [-]

> 2017’s Attention is All You Need was groundbreaking and paved the way for ChatGPT et al. Since then ML researchers have been trying to come up with new architectures, and companies have thrown gazillions of dollars at smart people to play around and see if they can make a better kind of model. However, these more sophisticated architectures don’t seem to perform as well as Throwing More Parameters At The Problem. Perhaps this is a variant of the Bitter Lesson.

This is not true and unfortunately this significantly reduced the credibility of this article for me. Raw parameter counts stopped increasing almost 5 years ago, and modern models rely on sophisticated architectures like mixture-of-experts, multi-head latent attention, hybrid Mamba/Gated linear attention layers, sparse attention for long context lengths, etc. Training is also vastly more sophisticated.

The Bitter Lesson is misunderstood. It doesn't say "algorithms are pointless, just throw more compute at the problem", it says that general algorithms that scale with more compute are better than algorithms that try to directly encode human understanding. It says nothing about spending time optimising algorithms to scale better for the same compute, and attention algorithms and LLMs in general have significantly advanced beyond "moar parameters" since the time of Attention is All You Need/GPT2/GPT3.

saghm 10 days ago [-]

Literally the paragraph right before the one you quote is this:

> I am generally outside the ML field, but I do talk with people in the field. One of the things they tell me is that we don’t really know why transformer models have been so successful, or how to make them better. This is my summary of discussions-over-drinks; take it with many grains of salt. I am certain that People in The Comments will drop a gazillion papers to tell you why this is wrong.

As I understand it, this article is basically a conglomeration of several attempts at an article that the author has attempted to make over the past decade or so considering the impacts of AI on society. In their own words:

> Some of these ideas felt prescient in the 2010s and are now obvious. Others may be more novel, or not yet widely-heard. Some predictions will pan out, but others are wild speculation. I hope that regardless of your background or feelings on the current generation of ML systems, you find something interesting to think about.

As for the "Bitter Lesson" part, they pretty much directly said that it wasn't the Bitter Lesson exactly, saying it might be a variant of it. Honestly, it felt more like a way of throwing in a reference to something that also might provoke thought, which was done throughout the piece (which again, is the entire point).

It's totally valid to say "this article didn't provoke much thought for me". I'm a bit confused at why you think a lack of specific domain knowledge in a domain that they literally state they are not an expert in would be disqualifying for that purpose though.

joefourier 10 days ago [-]

The title of the article is “The Future of Everything is Lies, I Guess” and the first part is literally complaining about LLMs being bullshit machines, while the author proceeds to tell confabulations (or lies) of his own. Is there not a bit of irony in that?

If you’re a non-expert in a field, I don’t think it’s a good sign if you’re writing a 10 part article about that field’s impact on society and getting basic facts wrong. How can I trust that the conclusions will be any more credible?

saghm 9 days ago [-]

> The title of the article is “The Future of Everything is Lies, I Guess” and the first part is literally complaining about LLMs being bullshit machines, while the author proceeds to tell confabulations (or lies) of his own. Is there not a bit of irony in that?

Maybe some, but not that much given the disclaimers I cited above. There's value in a qualitative confidence level for a statement, and I'd argue that this is something that LLMs do not seem to produce in practice without someone explicitly asking for it. The human author's ability to anticipate potential mistakes in their logic and communicate those ahead of time is not equivalent to the type of fabrications that LLMs routinely make.

> If you’re a non-expert in a field, I don’t think it’s a good sign if you’re writing a 10 part article about that field’s impact on society and getting basic facts wrong. How can I trust that the conclusions will be any more credible?

I don't know why an expert in LLM implementation would be inherently more qualified to analyze the second-order effects of their product than anyone else. There's precedent for people who are "too close" to something having biases that make them less effective at recognizing how tools will get used by non-experts, and society as a whole is largely composed of people who are not experts in LLM implementations. If you want to understand the net effect of everyone having access to LLMs, an understanding of people is probably more important than knowing exactly what an LLM does under the hood.

otterley 10 days ago [-]

Might the conclusions be correct even if some of the facts are not? Even a stopped clock is right twice a day. And, "approximately correct" is still sometimes valuable.

imtringued 10 days ago [-]

He is wrong about why transformers are popular.

The most obvious reason is that transformers accept a sequence as an input and produce a sequence as an output. The vast majority of pre-transformer architectures only accepted a fixed input and output size. Before 2016 I was somewhat interested in ML, but my curiosity vanished because of the fixed input and output size limitations.

RNNs including LSTMs at the time were pretty bad and difficult to train due to vanishing and exploding gradients at long sequence lengths and sequential training along the sequence length. Meanwhile transformers can be parallelized along the sequence length.

Then there are theoretical limitations. Transformers re-read the entire sequence for every output. This leads to quadratic attention. There are plenty of papers that tell you why it is impossible to replicate the properties of quadratic attention with linear attention.

The reason is blatantly obvious. If you want linear attention to have the same capability, you need to re-read the entire input sequence after every output. If you do this at the token level, then you have basically implemented quadratic attention.
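To put rough numbers on the "re-read the entire sequence for every output" point, here is a minimal counting sketch. It assumes one query-key comparison per (new token, earlier token) pair and one state update per token for a fixed-state model; the function names are made up for illustration and it measures nothing about real implementations.

```python
def attention_comparisons(n_tokens: int) -> int:
    """Query-key comparisons when producing n_tokens one at a time,
    with each new token attending to itself and everything before it."""
    return sum(range(1, n_tokens + 1))  # 1 + 2 + ... + n = n(n+1)/2


def fixed_state_updates(n_tokens: int) -> int:
    """A recurrent or linear-attention model does one state update per token."""
    return n_tokens


for n in (1_000, 2_000, 4_000):
    print(n, attention_comparisons(n), fixed_state_updates(n))
```

Doubling the length roughly quadruples the first count and only doubles the second, which is the gap the linear-attention papers are arguing about.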

Transformers aren't a mystery success, they are using computational brute force, which is hard to beat with other architectures. If you go with a more efficient architecture, you are by definition giving up some non-zero capabilities. Nobody really cares about getting slightly worse results from a much more efficient architecture. In the current ML space, it's SOTA (state of the art) or go home.

ainch 10 days ago [-]

Transformers do have a fixed input/output size though - that's what a context window is. It's just that, via scaling and algorithmic improvements, the length of usable context windows has increased to the point that they're much less of a bottleneck.

I think your points around parallelisation and the flexibility of quadratic attention are spot-on though.

vrighter 9 days ago [-]

transformers have a fixed input size (padding the unneeded context window with null tokens). Whether you put in a sequence of things or just random tokens is irrelevant. To the network it is just "one input"

They also have a fixed output of one probability distribution for the next one token.

running it in a loop does not mean it can work with sequences, by that definition, so can literally everything else

joefourier 9 days ago [-]

Sorry but that's false, you are confusing transformers as an architecture, and auto-regressive generation, and padding during training.

Standard transformers take in an arbitrary input size and run blocks (self and possibly cross attention, positional encoding, MLPs) that don't care about its length.

> They also have a fixed output of one probability distribution for the next one token.

No, in most implementations, they output a probability distribution for every token in the input. If you input 512 tokens, you get 512 probability distributions. You can input however many tokens you want - 1, 2048, one million, it's the same thing (although since standard self-attention scales quadratically you'll eventually run out of memory). Modern relative embeddings like RoPE can support infinite length although the quality will degrade if you extrapolate too far beyond what the model saw during training.

For typical auto-regressive generation, they are trained with causal masking/teacher forcing, which makes it calculate the probability for the next token. During inference, you throw away all but the last probability distribution and use that to sample the next token, and then repeat. You also do this with an RNN. An autoregressive CNN (e.g. WaveNet) would be closer to what you described in that it has a fixed window looking backwards.
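A rough sketch of that inference loop, for anyone who finds it easier to read as code. `toy_model` is a made-up stand-in and not a transformer: it only imitates the interface (one next-token distribution per input position), and the vocabulary and scoring rule are invented for illustration. Decoding is greedy to keep the example deterministic.

```python
import math

VOCAB = ["<eos>", "a", "b", "c"]  # hypothetical four-token vocabulary


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def toy_model(token_ids):
    """Stand-in for a causal model: one distribution per input position.

    A real model would condition on the whole prefix token_ids[: i + 1];
    this toy rule only looks at the token at position i.
    """
    dists = []
    for current in token_ids:
        logits = [0.0] * len(VOCAB)
        logits[(current + 1) % len(VOCAB)] = 5.0  # favour the "next" vocab entry
        dists.append(softmax(logits))
    return dists


def generate(prompt_ids, max_new_tokens=10):
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        dists = toy_model(ids)   # len(dists) == len(ids)
        last = dists[-1]         # discard every distribution except the last
        next_id = max(range(len(last)), key=last.__getitem__)  # greedy pick
        ids.append(next_id)
        if VOCAB[next_id] == "<eos>":
            break
    return ids


print([VOCAB[i] for i in generate([1])])
```

The input grows by one token per step and nothing is padded; the only fixed thing is the size of each distribution, which is the vocabulary size.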

But a transformer doesn't have to be used for auto-regressive generation, you can use it for diffusion, as a classifier model, for embedding text. It doesn't even see a sequence as spatially organised - unlike a CNN or an RNN it doesn't have architectural intrinsic biases about the position of elements, which is why it needs positional embeddings. This lets you have 2D, 3D, 4D, or disordered elements in a sequence. You can even have non-regularly sampled sequences. (Again this is for a classic transformer without sliding window attention or any other special modifications).

> (padding the unneeded context window with null tokens).

To have efficient training, you pad all samples in a batch to have the same length (and maybe make it a power of two). But if you are working with a single sequence, the length is arbitrary up to hardware limitations, and no padding is needed.

vrighter 6 days ago [-]

you, the user can enter any size input you want.

The network has a fixed number of input neurons. You have to put something in all of them.

If you enter "hello", the network might get " hello", but all of its inputs need some inputs. It doesn't (and can't) process tokens one at a time.

"No, in most implementations, they output a probability distribution for every token in the input."

In anything but the last column, the numbers are junk. You can treat them as probability distributions all you want, but the system is only trained to get the outputs of the last column "correct".

joefourier 6 days ago [-]

Not to be rude, but you're arguing with a machine learning engineer about the basics of neural network architectures :P

> The network has a fixed number of input neurons. You have to put something in all of them.

The way transformers work is that they apply the same "input neurons" to each individual token! It's not:

  Token 1 -> Neuron 1
  Token 2 -> Neuron 2
  Token 3 -> Neuron 3...

With excess neurons not being used, it's

  Token 1 -> Vector of dimensions N -> ALL neurons
  Token 2 -> Vector of dimensions N -> ALL neurons
  Token 3 -> Vector of dimensions N -> ALL neurons
  ...

Grossly oversimplified, in a typical transformer layer, you have 3 distinct such "networks" of neurons. You apply them to each token, giving you, for each token, a "query", a "key", and a "value". You take the dot product of the query and key, apply softmax, then multiply it with the value, giving you the vector to input for the next layer.
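A minimal single-head sketch of that description in plain Python, under some assumptions: the weight matrices here are identity matrices picked purely for illustration, the division by the square root of the key dimension comes from the original paper rather than from the description above, and there is no causal mask, multi-head split, or positional encoding.

```python
import math


def matvec(weights, vector):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens, w_query, w_key, w_value):
    """Single-head self-attention over a sequence of any length.

    The same three weight matrices are applied to every token vector.
    """
    queries = [matvec(w_query, t) for t in tokens]
    keys = [matvec(w_key, t) for t in tokens]
    values = [matvec(w_value, t) for t in tokens]
    scale = math.sqrt(len(keys[0]))  # scaled dot product, as in the 2017 paper
    outputs = []
    for q in queries:
        # Compare this token's query against every token's key.
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        # Weighted average of all value vectors.
        mixed = [sum(w * v[i] for w, v in zip(weights, values))
                 for i in range(len(values[0]))]
        outputs.append(mixed)
    return outputs


IDENTITY = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical weights, illustration only
short = self_attention([[1.0, 2.0]], IDENTITY, IDENTITY, IDENTITY)
longer = self_attention([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
                        IDENTITY, IDENTITY, IDENTITY)
print(len(short), len(longer))
```

Nothing in `self_attention` depends on the sequence length: one token in gives one vector out, three tokens in give three vectors out, with the same weights both times.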

> A probability distribution obviously contains a probability for every possible next token. But the whole probability distribution (which adds up to one) only predicts the next ONE token. It predicts what is the probability of that one token being A, or B, or C, etc, giving a probability for each possible token. It's still predicting only one token.

> In anything but the last column, the numbers are junk. You can treat them as probability distributions all you want, but the system is only trained to get the outputs of the last column "correct".

Not quite, the reason transformers train fast is because you can train on all columns at once.

For tokens 1, 2, 3, 4, ... you get predictions for tokens 2, 3, 4, 5... Typical autoregressive transformer training uses a causal mask, so that token 1 doesn't see token 2, enabling you to train on all the predictions at once.
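As a small illustration of the shifted targets and the causal mask (a sketch only; the helper names are made up and no actual training happens here):

```python
def shifted_targets(token_ids):
    """Pair each input position with the token that follows it."""
    inputs = token_ids[:-1]
    targets = token_ids[1:]
    return inputs, targets


def causal_mask(n_positions):
    """mask[i][j] is True when position i is allowed to attend to position j."""
    return [[j <= i for j in range(n_positions)] for i in range(n_positions)]


inputs, targets = shifted_targets([1, 2, 3, 4, 5])
print(inputs, targets)
for row in causal_mask(len(inputs)):
    print(["x" if allowed else "." for allowed in row])
```

Every row of the mask is its own training example: position i sees only positions 0..i and is scored against `targets[i]`, so one forward pass over the sequence yields a loss term for every column, not just the last one.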

kgeist 10 days ago [-]

>Raw parameter counts stopped increasing almost 5 years ago, and modern models rely on sophisticated architectures like mixture-of-experts, multi-head latent attention, hybrid Mamba/Gated linear attention layers, sparse attention for long context lengths, etc.

Agree, I recently updated our office's little AI server to use Qwen 3.5 instead of Qwen 3 and the capability has considerably increased, even though the new model has fewer parameters (32b => 27b)

Yesterday I spent some time investigating it:

- Gated DeltaNet (invented in 2024 I think) in Qwen3.5 saves memory for the KV cache so we can afford larger quants

- larger quants => more accurate

- I updated the inference engine to have TurboQuant's KV rotations (2026) => 8-bit KV cache is more accurate

- smaller KV cache requirements => larger contexts

Before, Qwen3 on this humble infra could not properly function in OpenCode at all (wrong tool calls, generally dumb, small context), now Qwen 3.5 can solve 90% of the problems I throw at it.

All that thanks to algorithmic/architectural innovations while actually decreasing the parameter count.
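For a sense of scale on the KV cache points, here is a back-of-the-envelope sketch using the usual size estimate for standard attention layers. The configuration numbers are hypothetical and are not the actual Qwen3 or Qwen3.5 settings; layers that use a fixed-size recurrent state instead of a per-token cache would presumably drop out of this sum, which is my reading of where the saving comes from.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, bytes_per_value):
    """Estimated KV cache size for layers that use standard attention.

    The factor of 2 covers one key vector and one value vector
    per token, per layer, per KV head.
    """
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_value


GIB = 1024 ** 3
# Hypothetical configuration, for illustration only.
config = dict(n_layers=64, n_kv_heads=8, head_dim=128, seq_len=32768)

fp16 = kv_cache_bytes(**config, bytes_per_value=2)
int8 = kv_cache_bytes(**config, bytes_per_value=1)
print(fp16 / GIB, int8 / GIB)  # 8.0 4.0
```

Under these made-up numbers, halving the bytes per cached value frees as much memory as halving the context length would, which is the trade-off being juggled between quant size and context size.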

Vachyas 10 days ago [-]

What you described sounds plausible (expected, even).

But

>Raw parameter counts stopped increasing almost 5 years ago

Really? 5 years ago? Until just about 3 years ago OpenAI's latest offering was only ChatGPT 3.5

Most of the models people talk about now didn't even exist 3 years ago let alone 5.

Even now, I don't know if parameter count stopped mattering or just matters less

For example, I have no idea if the new Mythos is MoE but I'm pretty sure it's more parameters.

kgeist 10 days ago [-]

I agree the original poster exaggerated it. But generally models indeed have stopped growing at around 1-1.5 trillion parameters, at least for the last couple of years.

>Even now, I don't know if parameter count stopped mattering or just matters less

Models in the 20b-100b range are already very capable when it comes to basic knowledge, reasoning etc. Improving the architecture and having better training recipes helped decrease the required parameter count considerably (currently 8b models can easily beat the 175b strong GPT3 from 3 years ago in many domains). What increasing the parameter count currently gives you is better memorization, i.e. better world knowledge without having to consult external knowledge bases, say, using RAG. For example, Qwen3.5 can one-shot compilable code, reason etc. but can't remember the exact API calls to many libraries, while Sonnet 4.6 can. I think what we need is to split models into 2 parts: "reasoner" and "knowledge base". I think a reasoner could be pretty static with infrequent updates, and it's the knowledge base part which needs continuous updates (and trillions of parameters). Maybe we could have a system where a reasoner could choose different knowledge bases on demand.

RoddaWallPro 10 days ago [-]

5 years ago was the beginning of 2021, just under a year after GPT3 was released (which was not good at doing anything useful). And that model was 175B params.

GPT4 has been widely rumored to have 1.8 trillion params, which is 10x more, and was released 2 years after this "5 years ago" date that you are using here.

So, to quote yourself here, "This is not true and unfortunately this significantly reduced the credibility of this article for me" /s/article/comment

joefourier 10 days ago [-]

In late 2021, GLaM had 1.2T parameters. It's difficult to find much use of it in the wild and while the benchmarks it uses are rather outdated, it has a HellaSwag score of 76.6% and WinoGrande of 73.5%. GPT3 had 64.3% and 70.2%.

Meanwhile, Gemma 2 9B, a model from July 2024 with 133x fewer parameters than GLaM, scores 82% and 80.6%. Hellaswag and WinoGrande aren't used in modern benchmarks, probably because they're too easy and largely memorised at this point.

And GPT-4 had 1.8T parameters sure, but it's noticeably worse than any modern model a fraction of the size, and the original incarnation was ridiculously expensive per token. And in any case, its number of parameters was only possible due to using mixture-of-experts, which I would definitely classify as a sophisticated architecture as opposed to just throwing more parameters at a vanilla transformer. Even in 2021 GLaM was a MoE because the limits of scaling dense transformers had already been hit.

zozbot234 10 days ago [-]

MoE has made it vastly easier to increase total parameters (and recent open models are really quite large) but it's also hard to compare a MoE with an earlier dense model.
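
One reason the comparison is hard: a MoE has two different sizes, the total parameters stored and the parameters actually used per token. A rough sketch of that bookkeeping, assuming top-k routing and equally sized experts. The `MoEConfig` class and all the numbers in it are hypothetical, not taken from any real model:

```python
from dataclasses import dataclass


@dataclass
class MoEConfig:
    """Simplified accounting for a mixture-of-experts model (hypothetical)."""
    shared_params: float      # attention, embeddings etc., used by every token
    params_per_expert: float  # parameters of one expert, summed over layers
    num_experts: int
    experts_per_token: int    # the "k" in top-k routing

    @property
    def total_params(self) -> float:
        # Everything that has to be stored in memory.
        return self.shared_params + self.params_per_expert * self.num_experts

    @property
    def active_params(self) -> float:
        # What a single token actually passes through.
        return self.shared_params + self.params_per_expert * self.experts_per_token


# Made-up numbers, purely for illustration.
hypothetical = MoEConfig(
    shared_params=20e9,
    params_per_expert=10e9,
    num_experts=64,
    experts_per_token=2,
)

print(f"total:  {hypothetical.total_params / 1e9:.0f}B")   # 660B
print(f"active: {hypothetical.active_params / 1e9:.0f}B")  # 40B
```

Whether such a model should be compared against a 660B or a 40B dense model is exactly the ambiguity.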

janalsncm 10 days ago [-]

Yeah I also came here to be one of those People In The Comments the author refers to.

Transformers are not magical. They are just a huge improvement over other architectures at the time such as LSTMs and RNNs and even CNNs. They allowed us to throw more and more compute at the problem of next token prediction. And we’ve been riding that horse ever since.

Another big advancement that deserves mentioning is “reasoning” models that have the opportunity to spit out thinking tokens before giving a final answer.

None of this is to say transformers are the most principled approach. But they work.

zozbot234 10 days ago [-]

Transformers' greatest improvement over RNN/LSTM was to enable better parallelization of large-scale training. This is what enabled language models to become "large". But when controlling for overall size, more RNN/LSTM-like approaches seem to be more efficient, as seen e.g. in state space models. The transformer architecture does add some notable capabilities in accounting for long-range dependencies and "needle in a haystack" scenarios, but these are not a silver bullet; they matter in very specific circumstances.

joefourier 10 days ago [-]

With modern training techniques, RNNs (not just linear SSMs, potentially even vanilla LSTMs) can scale just as well as transformers or even better when it comes to enormous context lengths. Dot-product attention has better performance in a number of domains however (especially for exact retrieval) so the best architectures are likely to remain hybrid for now.

famouswaffles 10 days ago [-]

>With modern training techniques, RNNs (not just linear SSMs, potentially even vanilla LSTMs) can scale just as well as transformers or even better when it comes to enormous context lengths.

That's not true. Modern training techniques aren't enough. Vanilla RNNs with modern training techniques still scale poorly. You have to make some pretty big architectural divergences (throwing away recurrency during training) to get a RNN to scale well. None of the big labs seem to be bothered with hybrid approaches.

joefourier 10 days ago [-]

> That's not true. Modern training techniques aren't enough. Vanilla RNNs with modern training techniques still scale poorly. You have to make some pretty big architectural divergences (throwing away recurrency during training) to get a RNN to scale well.

SSMs move the non-linearity outside of the recurrence which enables parallelisation during training. It is trivial to do this architectural change with an LSTM (see the xLSTM paper). Linear RNNs are still RNNs.
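
To make the parallelisation point concrete, here is a minimal sketch in plain Python, assuming a scalar linear recurrence h_t = a_t * h_(t-1) + b_t. The function names are made up for this example, and real implementations run a scan like this on GPU over vectors or matrices:

```python
from typing import List, Tuple

Affine = Tuple[float, float]  # (a, b) stands for the map h -> a * h + b


def combine(first: Affine, second: Affine) -> Affine:
    """Compose two affine maps: apply `first`, then `second`. Associative."""
    a1, b1 = first
    a2, b2 = second
    return (a2 * a1, a2 * b1 + b2)


def sequential_scan(a: List[float], b: List[float], h0: float = 0.0) -> List[float]:
    """The usual RNN-style loop: each step waits for the previous one."""
    h, out = h0, []
    for a_t, b_t in zip(a, b):
        h = a_t * h + b_t
        out.append(h)
    return out


def doubling_scan(a: List[float], b: List[float], h0: float = 0.0) -> List[float]:
    """Inclusive prefix scan by recursive doubling.

    Each of the ~log2(T) rounds only reads the previous round's values,
    so all positions within a round could be computed in parallel.
    """
    elems = list(zip(a, b))
    step = 1
    while step < len(elems):
        prev = elems
        elems = [
            combine(prev[i - step], prev[i]) if i >= step else prev[i]
            for i in range(len(prev))
        ]
        step *= 2
    # elems[t] now maps h0 directly to h_t.
    return [a_t * h0 + b_t for a_t, b_t in elems]


a = [0.5, 0.9, 1.1, 0.3, 0.7]
b = [1.0, 2.0, 3.0, 4.0, 5.0]
seq = sequential_scan(a, b)
par = doubling_scan(a, b)
assert all(abs(x - y) < 1e-9 for x, y in zip(seq, par))
```

With a non-linearity inside the recurrence (say h_t = tanh(a_t * h_(t-1) + b_t)), the composed step no longer collapses to a compact (a, b) pair, so this trick does not apply directly.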

But you can still keep the non-linearity by training with parallel Newton methods, which work on vanilla LSTMs and scale to billions of parameters.

> None of the big labs seem to be bothered with hybrid approaches.

Does Alibaba not count? Qwen3.5 models are the top performers in terms of small models as far as my tests and online benchmarks go.

famouswaffles 10 days ago [-]

>SSMs move the non-linearity outside of the recurrence which enables parallelisation during training. It is trivial to do this architectural change with an LSTM (see the xLSTM paper). Linear RNNs are still RNNs.

Removing the non-linearity from the recurrence path is exactly what constitutes a "pretty big architectural divergence." A linear RNN is an RNN in a structural sense, certainly, but functionally it strips out the non-linear state transitions that made traditional LSTMs so expressive, entirely to enable associative scans. The inductive bias is fundamentally altered. Calling that simply 'modern training techniques' is disingenuous at best.

>But you can still keep the non-linearity by training with parallel Newton methods, which work on vanilla LSTMs and scale to billions of parameters.

That does not scale anywhere near as well as Transformers in compute spend. It's paper/research novelty. Nobody will be doing this for production.

>Does Alibaba not count? Qwen3.5 models are the top performers in terms of small models as far as my tests and online benchmarks go.

I guess there's some misunderstanding here because Qwen is 100% a transformer, not a hybrid RNN/LSTM whatever.

joefourier 10 days ago [-]

> That does not scale anywhere near as well as Transformers in compute spend. It's paper/research novelty. Nobody will be doing this for production.

What exactly makes you so confident?

The world is not just labs that can afford billion dollar datacentres and selling access to SOTA LLMs at $30/Mtokens. Transformers are highly unsuitable for many applications for a variety of reasons and non-linear RNNs trained via parallel methods are an extremely attractive value proposition and will likely feature in production in the next products I work on.

> I guess there's some misunderstanding here because Qwen is 100% a transformer, not a hybrid RNN/LSTM whatever.

See the Qwen3.5 Huggingface description: https://huggingface.co/Qwen/Qwen3.5-27B

> Efficient Hybrid Architecture: Gated Delta Networks combined with sparse Mixture-of-Experts deliver high-throughput inference with minimal latency and cost overhead.

famouswaffles 5 days ago [-]

>What exactly makes you so confident?

Existing research? If you want something that scales as well as transformers you have to make the divergences I was talking about. If you don't then it scales a lot worse. The Newton methods don't match transformer efficiency at scale. That's just a fact.

>The world is not just labs that can afford billion dollar datacentres and selling access to SOTA LLMs at $30/Mtokens.

Billion dollar labs want to save money too. If Modern RNNs were a massive unanimous win, they and everyone else would switch in a heartbeat, just like they did for transformers. The reason they don't is because these architectures at best simply match transformers, while introducing their own architectural issues.

saghm 10 days ago [-]

This sounds almost identical to the article that's literally linked at the end of the paragraph that the parent comment quoted: https://www.cs.utexas.edu/~eunsol/courses/data/bitter_lesson...

I don't think anything you're saying here is in disagreement with the points they're making.

jgammell 10 days ago [-]

> hybrid Mamba/Gated linear attention layers,

Do any large-scale architectures use mamba? I was under the impression that people don't use it yet due to lack of efficient implementations.

> Training is also vastly more sophisticated

Is it? In what ways?

joefourier 10 days ago [-]

Qwen3.5 uses Gated Delta Networks which is essentially Mamba 2 + Delta Rule. It’s quite hardware efficient.

> Is it? In what ways?

Just the reinforcement learning for reasoning, and then tool use for agents, could be its own topic.

drob518 11 days ago [-]

> It remains unclear whether continuing to throw vast quantities of silicon and ever-bigger corpuses at the current generation of models will lead to human-equivalent capabilities. Massive increases in training costs and parameter count seem to be yielding diminishing returns. Or maybe this effect is illusory. Mysteries!

I’m not even sure whether this is possible. The current corpus used for training includes virtually all known material. If we make it illegal for these companies to use copyrighted content without remuneration, either the task gets very expensive, indeed, or the corpus shrinks. We can certainly make the models larger, with more and more parameters, subject only to silicon’s ability to give us more transistors for RAM density and GPU parallelism. But it honestly feels like, without another “Attention is All You Need” level breakthrough, we’re starting to see the end of the runway.

xmprt 11 days ago [-]

I see a lot of researchers working on newer ideas so I wouldn't be surprised if we get a breakthrough in 5-10 years. After all, the gap between AlexNet and Attention is All You Need was only 6 years. And then Scaling Laws was about 3-4 years after that. It might seem like not much progress is being made but I think that's in part because AI labs are extremely secretive now when ideas are worth billions (and in the right hands, potentially more).

Of course 5-10 years is a long time to bang our heads against the wall with untenable costs but I don't know if we can solve our way out of that problem.

ghywertelling 11 days ago [-]

I think we will see models becoming small reasoning cores which don't remember tonnes of facts but can reason over data fed to them, or search for it.

supliminal 11 days ago [-]

The echoes of A.I. winter.

marcosdumay 10 days ago [-]

Yep, last time we got "a lot of researchers working on newer ideas", it took them 20 years to arrive at a working idea, and another 20 years to get it mature enough to make an AI boom.

krainboltgreene 11 days ago [-]

> The current corpus used for training includes virtually all known material.

This is just totally incorrect. It's one of those things everyone just assumes, but there's an immense amount of known material that isn't even digitized, much less in the hands of tech companies.

drob518 11 days ago [-]

What large caches of undigitized content exist? Surely, not everything has been digitized, but I can’t think it’s much in percentage terms.

liquid_thyme 11 days ago [-]

The amount of private data that is locked up inside private internal databases is huge. This is especially true of regulated industries. There is a wealth of data - financial data showing how to budget for things, pricing data on various products that are B2B, standard operating procedures at mature companies that have gone through various revisions, designs for manufacturing plants so people don't keep reinventing and making the same mistakes again, and on and on.

davebren 10 days ago [-]

I think it's implied that they're not talking about private data when they say they've run out.

liquid_thyme 10 days ago [-]

fair. I want to +1 the fact that there is a large amount of data unseen by LLMs.

drob518 10 days ago [-]

I think there are post training tweaks that can be done with corporate data to help fit an AI to a specific corporation. But I don’t think that private data will deliver us AGI. The knowledge for AGI is out in the world, not hidden inside corporations. Private data brings us knowledge of the XYZ project status and the division ABC budget and whether Bob wants a chocolate cake for his going away dinner or not.

liquid_thyme 10 days ago [-]

I'm not seeing it the same way. Businesses in various industries have several types of moats - money, knowledge, experience, skills, etc. There is ton of competitive intelligence hidden in private data.

It's one of the reasons you can't use chatGPT and start manufacturing chips or vaccines, or anti-cancer medication. There is a gap between publicly available data that informs academic "core science" research and the specific product-based knowledge that shows you how to make a successful drug candidate that can withstand regulatory scrutiny or be a safe and effective drug for the world's population.

We could iterate so quickly if this private data set was democratized.

cgh 11 days ago [-]

The Vatican Library contains roughly 1.1 million printed books and around 75,000 codices, only a small percentage of which have been digitised.

hatthew 10 days ago [-]

Reddit alone contains about the same quantity of text (~10 billion posts * 10 words per post, vs 1 million books * 100k words per book). Messaging and document platforms (google docs, slack, discord, telegram, etc.) probably each have 1-3 orders of magnitude more than reddit. To your/GP's point though, those private platforms probably haven't been slurped up by LLMs yet.
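
The back-of-the-envelope comparison above, written out as a sketch in Python. Every input is the rough guess from the comment (posts, words per post, words per book), not measured data:

```python
# Rough guesses taken from the comment above, not measurements.
REDDIT_POSTS = 10e9
WORDS_PER_POST = 10
VATICAN_BOOKS = 1e6
WORDS_PER_BOOK = 100e3

reddit_words = REDDIT_POSTS * WORDS_PER_POST
vatican_words = VATICAN_BOOKS * WORDS_PER_BOOK

print(f"Reddit:  {reddit_words:.0e} words")   # 1e+11 words
print(f"Vatican: {vatican_words:.0e} words")  # 1e+11 words
```

Both land at about 1e11 words, so the "about the same quantity" claim follows from the stated assumptions.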

drob518 11 days ago [-]

Which is what percent of the world’s content? 0.000000001% or something similar. It’s nothing in the scheme of things. To put it another way, if we were to digitize that content and train on it, our AIs would not get noticeably better in any way. It doesn’t move the needle.

fwip 10 days ago [-]

1.1 million being 0.000000001% implies a total count of 1e17 books in the world - the real number is closer to 1e8.
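
The arithmetic behind that, as a short Python sketch. The 1e8 total is the rough estimate given in this comment, and the variable names are only for illustration:

```python
VATICAN_PRINTED_BOOKS = 1.1e6
CLAIMED_PERCENT = 0.000000001  # the "0.000000001%" figure from upthread
ESTIMATED_BOOKS_IN_WORLD = 1e8  # rough estimate from this comment

# Convert the percentage to a fraction, then solve for the implied total.
claimed_fraction = CLAIMED_PERCENT / 100
implied_total = VATICAN_PRINTED_BOOKS / claimed_fraction
print(f"implied total: {implied_total:.1e} books")  # 1.1e+17 books

# The share you get with the ~1e8 estimate instead.
actual_share_percent = VATICAN_PRINTED_BOOKS / ESTIMATED_BOOKS_IN_WORLD * 100
print(f"share of ~1e8 books: {actual_share_percent:.1f}%")  # 1.1%
```

That only covers books, which is the comparison this comment is making; it says nothing about other kinds of documents.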

drob518 10 days ago [-]

You’re missing the point. And we’re not just talking about books, whatever that might mean. We’re talking about all documents ever made. Every magazine article, every blog and web page, every Word doc, etc. I’m pretty sure that whatever is in the Vatican archives is tiny by comparison. Given the age of the Vatican archives, I can also guarantee that many of those “books” are nothing more than page fragments. Very few will be full codices or long scrolls. Many will date before the printing press when document production was slow and laborious.

jdub 9 days ago [-]

What makes you believe that most things have been digitised in the first place?

wise_blood 10 days ago [-]

has the whole youtube been indexed?

drob518 10 days ago [-]

I’m sure Gemini has done it at some level. Google was pretty much founded on the assumption that more data is better. That has driven them to build or buy data sets that they can mine (Gmail, YouTube, etc.).

Sol- 11 days ago [-]

I think in domains like Math and Software Engineering, they are less constrained by training data anyway. They can synthetically generate and validate programs. To what extent that scales into novel insights is a different matter, but I think they dream of the AlphaGo Zero moment at least in verifiable domains.
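
A toy sketch of what "verifiable" buys you, assuming a task (sorting) where checking an answer is much easier than producing it. The candidate functions stand in for model-generated programs and are hand-written here; all names are hypothetical:

```python
import random
from collections import Counter
from typing import Callable, List


def is_valid_sort(inp: List[int], out: List[int]) -> bool:
    """Check the result without needing a reference solution."""
    in_order = all(x <= y for x, y in zip(out, out[1:]))
    same_items = Counter(inp) == Counter(out)
    return in_order and same_items


def reward(candidate: Callable[[List[int]], List[int]],
           rng: random.Random, trials: int = 200) -> float:
    """Fraction of random inputs on which the candidate passes the checker."""
    passed = 0
    for _ in range(trials):
        data = [rng.randint(-50, 50) for _ in range(rng.randint(0, 12))]
        try:
            if is_valid_sort(data, candidate(list(data))):
                passed += 1
        except Exception:
            pass  # a crashing candidate simply earns nothing for this input
    return passed / trials


def correct_candidate(xs: List[int]) -> List[int]:
    return sorted(xs)


def buggy_candidate(xs: List[int]) -> List[int]:
    return sorted(set(xs))  # subtle bug: silently drops duplicates


print(reward(correct_candidate, random.Random(0)))
print(reward(buggy_candidate, random.Random(0)))
```

The checker here is cheap and trustworthy, which is what makes the reward signal usable. Whether most real software tasks admit a checker like this is a separate question.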

davebren 10 days ago [-]

How can it ever play against itself on novel software tasks? First it has to come up with the task. Then it can write tests but then it needs to verify that the tests are correct, mixture of experts can come to wrong conclusions, etc...

embedding-shape 11 days ago [-]

> I’m not even sure whether this is possible.

Based on what's happened so far, maybe. At least that's exactly how we got to the current iteration back in 2022/2023: quite literally "let's see what happens when we throw an enormous amount of data at them while training" worked up to a point, then post-training seems to have taken over, which is where labs currently differ.

davebren 10 days ago [-]

It works as far as consolidating existing human knowledge, but general intelligence doesn't suddenly emerge from it. They're out of data now but if there was 10x more (assuming it's not rehashed from existing data) and they had a 10x larger model then it would be better but that's only because there's more that it can copy from.

drob518 11 days ago [-]

Right, but we played the scaling card and it worked but is now reaching limits. What is the next card? You can surely argue that we can find a new one at any time. That’s the definition of a breakthrough. I just don’t see one at the moment.

embedding-shape 11 days ago [-]

> I just don’t see one at the moment.

Did you see the one before the current one was even found? Things tend to look easy in hindsight, and borderline impossible trying to look forward. Otherwise it sounds like you're in the same spot as before :)

drob518 11 days ago [-]

That’s what I said. Breakthroughs happen. No doubt about it, and they are unpredictable. Hence a breakthrough. But right now we’re using up runway with nothing yet identified to take us to the next level. And while sometimes breakthroughs happen, sometimes they don’t.

functional_dev 11 days ago [-]

better tooling and integration

htrp 11 days ago [-]

We pay people to create more high quality tokens (mercor, turing) which are then fed into data generating processes (synthetic data) to create even more tokens to train on

drob518 11 days ago [-]

But does that really help, or do you get distortion? The frequency distribution of human generated content moves slowly over time as new subjects are discussed. What frequency distribution do those “data generating processes” use? And at root, aren’t those “data generating processes” basically just another LLM (I.e., generating tokens according to a probability distribution)? Thus, aren’t we just sort of feeding AI slop into the next training run and humoring ourselves by renaming the slop as “synthetic data?” Not trying to be argumentative. I’m far from being an AI expert, so maybe I’m missing it. Feel free to explain why I’m wrong.

htrp 11 days ago [-]

That's the problem in a nutshell. There is an art to how you generate the synthdata so that you don't get crappy trained models (especially when mistakes cost XX million dollars).

It's also theoretically why facebook paid 14bn for alex wang and scale ai

0rbiter 10 days ago [-]

[dead]

danieltanfh95 11 days ago [-]

I think the discussion has to be more nuanced than this. "LLMs still can't do X so it's an idiot" is a bad line of thought. LLMs with harnesses are clearly capable of engaging with logical problems that only need text. LLMs are not there yet with images, but we are improving with UI and access to tools like figma. LLMs are clearly unable to propose new, creative solutions for problems it has never seen before.

Aperocky 11 days ago [-]

> LLMs are clearly unable to propose new, creative solutions for problems it has never seen before.

LLMs are incredibly useful but I'm not sure about this statement.

It is proposing stuff that I haven't seen before, but I don't know whether it is new or creative relative to the entirety of collective human knowledge.

saghm 10 days ago [-]

I'm not sure if you misread the statement you quoted or I'm misreading yours, but it doesn't sound like you're really disagreeing with their point. Did you miss the "un" in "unable", or am I misunderstanding you as also saying that you don't consider them to be creative?

Aperocky 10 days ago [-]

Yeah it was my bad, thanks for pointing it out. For some reason I read that as "able", can't unsee it or understand how it happened.

throwaway27448 11 days ago [-]

> LLMs with harnesses are clearly capable of engaging with logical problems that only need text.

To some extent. It's not clear where specifically the boundaries are, but it seems to fail to approach problems in ways that aren't embedded in the training set. I certainly would not put money on it solving an arbitrary logical problem.

simianwords 11 days ago [-]

> To some extent. It's not clear where specifically the boundaries are, but it seems to fail to approach problems in ways that aren't embedded in the training set. I certainly would not put money on it solving an arbitrary logical problem.

In what way can you falsify this without having the LLM be omniscient? We have examples of it solving things that are not in the training set - it found vulnerabilities in 25 year old BSD code that was unspotted by humans. It was not a trivial one either.

pessimizer 10 days ago [-]

Here's an odd example of testing, but I design very complex board and card games, and LLMs are terrible at figuring out whether they make sense or really even restating the rules in a different wording.

I thought they would be ideal for the job, until I realized that it would just pretend that the rules worked because they looked like board game rules. The more you ask it to restate, manipulate or simulate the rules, the more you can tell that it's bluffing. It literally thinks every complicated set of rules works perfectly.

> it found vulnerabilities in 25 year old BSD code that was unspotted by humans.

I don't think the age of the code makes the problem more complex. Finding buffers that are too small is not rocket science, bothering to look at some corner of some codebase that you've never paid attention to or seen a problem with is. AI being infinitely useful (cheap) to sic on pieces of codebase nobody ever carefully looks at is a great thing. It's not genius on the part of the AI.

nmadden 10 days ago [-]

Re: cheap - Anthropic’s write-up said it cost $20,000 of runs to find that bug (and a few others). So not that cheap compared to other tools - more similar in cost to human review/pentest, but probably more exhaustive.

> This was the most critical vulnerability we discovered in OpenBSD with Mythos Preview after a thousand runs through our scaffold. Across a thousand runs through our scaffold, the total cost was under $20,000 and found several dozen more findings.

They don’t talk about the other findings, so I’m guessing they are minor.

simianwords 10 days ago [-]

> Here's an odd example of testing, but I design very complex board and card games, and LLMs are terrible at figuring out whether they make sense or really even restating the rules in a different wording.

I'm positive that they are perfectly fine and will do a pretty good job. Did you actually try it?

CamperBob2 10 days ago [-]

Eh, I can see their point, I think. The models can restate the rules differently, I'm sure, but it sounds like the GP is saying that LLMs can't tell whether the rules are well-balanced.

It would be interesting to see some example problems along those lines. Design some games with complex rules, including one or two of the most subtle game-wrecking bugs you can think of, and ask the models if they can spot them.

In fact that sounds more interesting the more I think about it. Intensive RL on that sort of thing might generalize in... let's say useful ways.

simianwords 10 days ago [-]

I would love to see examples but I think we won’t. I’m happy to be proven wrong that an llm will do worse than a fairly smart human (without prior experience in the board game).

throwaway27448 10 days ago [-]

I'm just saying I'd rather hire a human that can be reasoned with than rely on software that can't be. At least where reasoning is involved.

Granted, I don't do a lot of needle-in-the-haystack work like finding vulnerabilities where search will naturally dominate.

Also, I imagine most reasoning involved in exploits will be found in the training sets—there are only so many patterns of exploitation found in formal languages.

__alexs 11 days ago [-]

Solving arbitrary logical problems seems to be equivalent to solving the halting problem so you are probably wise not to make that bet.

senko 11 days ago [-]

> LLMs are not there yet with images

https://genai-showdown.specr.net/image-editing

There's been a lot of progress there, it's just that an LLM that's best for, say coding, isn't going to be also the best for image edit.

Sharlin 10 days ago [-]

To be clear, image generation models are not in general LLMs although most now use an LLM as a text encoder.

bdbdbdb 10 days ago [-]

Harnesses could have solved things like the bathroom remodel, maybe, but the main point about how LLMs don't understand is the key here. You can make chatgpt better at rendering 3d scenes but you can't make it think, not really. Reasoning was only ever a feedback loop.

Anyone who has worked with LLMs has experienced all the issues he talks about here, we're either optimistic and imagine they'll be fixed, or we're pessimistic and we say they are inherent to the nature of the technology and will never be fixed

orangesilk 9 days ago [-]

> LLMs with harnesses are clearly capable of engaging with logical problems that only need text.

All of the LLMs are bad at music. They get intervals wrong. They list unsuited songs.

I would not trust them with any domain until proven otherwise.

saghm 10 days ago [-]

> LLMs with harnesses are clearly capable of engaging with logical problems that only need text.

> LLMs are clearly unable to propose new, creative solutions for problems it has never seen before.

How do you reconcile this with this article that the author linked? It's not a novel problem, and it's only text: https://medium.com/the-generator/one-word-answers-expose-ai-...

I guess it's a form of engagement to give a wildly wrong answer, but I'm not convinced that the extra nuance you've introduced is really all that nuanced either.

tim333 10 days ago [-]

The author of the medium article specifically hobbled the models to stop them thinking it through and got a wrong answer but that would happen with humans too and doesn't prove much.

saghm 9 days ago [-]

I would argue that most humans would either give the correct answer or just say "I don't know". Some of them might confidently give the wrong answer, but humans will readily refuse to follow instructions in plenty of circumstances where they decide they aren't worthwhile. LLMs don't do this, and I'd argue that the ability to reject premises is fundamental to engaging with things in a truly logical way.

Sharlin 10 days ago [-]

More nuanced than what? It’s just impossible to read the article as claiming "LLMs are idiots". Its whole point is that LLMs are simultaneously astonishingly smart and utterly stupid, and critically the boundary between those is complex and unintuitive which makes trusting their output so precarious.

drob518 11 days ago [-]

> "LLMs still can't do X so it's an idiot"

Let’s be careful. That’s a straw man. I don’t know anyone who says that. Aphyr says in the article that AIs can do things. But they have been marketed as “intelligent,” and I agree with Aphyr that the word is suggesting way more than AIs currently deliver. They do not reason and they do not think and are not truly intelligent. As the article says, they are big wads of linear algebra. Sometimes, that’s useful.

simianwords 11 days ago [-]

> They do not reason

How do you disprove it?

drob518 10 days ago [-]

We know that they do not reason because we know the algorithm behind the curtain. The model is generating the next token via model weights and some randomness. That’s all. It’s not reasoning. Sometimes it has an appearance of reasoning, but not if you know how it works. It doesn’t matter that the model manufacturer marketing department slaps a “Reasoning!” sticker on the side of the model. It’s not actually doing that. As an analogy, sometimes a stage magician in Las Vegas makes it seem that he’s making a woman disappear and a tiger appear in her place, but we all know that’s not what is really happening; it’s just a clever trick.

patch_dev 10 days ago [-]

Well, could you define what reasoning actually means? What would an AI need to do to be considered capable of reasoning? What is the core difference between what we do that is considered reasoning versus what AI currently does that is not considered reasoning?

To be clear, I am not making a statement as to whether AI reasons or not. It's just slippery to say something isn't or can't do X when we can't really define X. Perhaps we could put it down as an outcome rather than as a characteristic of a thing that is, in my opinion, currently impossible to define accurately.

intended 10 days ago [-]

In many examples, LLMs betray the fact that they are not reasoning, because when provided with problems that can be solved with the ability to reason, they fail.

Even in this discussion someone provided an example of coming up with board game rules. LLMs found all board game rules valid, because they looked and sounded like board game rules. Even when they were not.

In short, You can learn a subject, you can make a mental model of it, you can play with it, and you can rotate or infer new things about it.

LLMs are more analogous to actors, who have learnt a stupendous amount of lines, and know how those lines work.

They are, by definition, models of language.

IF you want a better version - GENAI needs to be able to generate working voxels of hands and 3D objects just from images.

simianwords 10 days ago [-]

I don’t believe the board game rules example. I think this would be a piece of cake for an llm. I’m happy to be proven wrong here if you share an example.

intended 10 days ago [-]

This is the user I took the example from: https://news.ycombinator.com/item?id=47689648#47696789

hackinthebochs 10 days ago [-]

>We know that they do not reason because we know the algorithm behind the curtain.

In other words, we didn't put the "reasoning algorithm" in LLMs therefore they do not reason. But what is this reasoning algorithm that is a necessary condition for reasoning and how do you know LLMs parameters didn't converge on it in the process of pre-training?

drob518 10 days ago [-]

Model parameters are weights, not algorithms. The LLM algorithm is (relatively) fixed: generate the next token according to the existing context, the model weights, and some randomization. That’s it. There is no more algorithm than that. The training parameters can shift the probabilities for predicting a token given the context, but there’s no more to it than that. There is no “reasoning algorithm” in the weights to converge to.
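
For readers who haven't seen it, the loop described above looks roughly like this. This is a toy sketch in Python: `TOY_LOGITS` is a hypothetical lookup table standing in for the weights, and it only looks at the last token, whereas a real LLM computes its scores from the whole context with a neural network:

```python
import math
import random
from typing import Dict, List

# Hypothetical toy "weights": a score (logit) for each candidate next token,
# keyed by the most recent token only.
TOY_LOGITS: Dict[str, Dict[str, float]] = {
    "<s>": {"the": 2.0, "a": 1.0},
    "the": {"cat": 1.5, "dog": 1.2},
    "a": {"cat": 1.0, "dog": 1.0},
    "cat": {"sat": 2.0, "</s>": 0.5},
    "dog": {"sat": 1.0, "</s>": 1.0},
    "sat": {"</s>": 3.0},
}


def softmax(logits: Dict[str, float], temperature: float) -> Dict[str, float]:
    """Turn scores into probabilities; lower temperature sharpens them."""
    scaled = {tok: score / temperature for tok, score in logits.items()}
    peak = max(scaled.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(s - peak) for tok, s in scaled.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


def generate(rng: random.Random, temperature: float = 1.0,
             max_tokens: int = 10) -> List[str]:
    """Context + weights + randomness -> next token, repeated."""
    context = ["<s>"]
    for _ in range(max_tokens):
        probs = softmax(TOY_LOGITS[context[-1]], temperature)
        tokens = list(probs)
        next_token = rng.choices(tokens, weights=[probs[t] for t in tokens])[0]
        if next_token == "</s>":
            break
        context.append(next_token)
    return context[1:]


print(" ".join(generate(random.Random(0))))
```

The sketch shows the outer sampling loop only. What the function mapping context to scores computes internally is the part under dispute in this subthread.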

hackinthebochs 9 days ago [-]

This overly reductive description of LLMs misses the forest for the trees. LLMs are circuit builders, the converged parameters pick out specific paths through the network that define programs. In other words, LLMs are differentiable computers[1]. Analogous to how a CPU is configured by the program state to execute arbitrary programs, the parameters of a converged LLM configure the high level matmul sequences towards a wide range of information dynamics.

Statistics has little relevance to LLM operation. The statistics of the training corpus imparts constraints on the converged circuit dynamics, but otherwise has no representation internally to the LLM.

[1] https://x.com/karpathy/status/1582807367988654081

qsera 9 days ago [-]

> LLMs are circuit builders

I think they are circuit "approximators". In other words, a result of a glorified linear regression..

drob518 9 days ago [-]

I called it a “big wad of linear algebra,” above. That’s all it is.

qsera 10 days ago [-]

https://arxiv.org/abs/2603.09678

orthoxerox 10 days ago [-]

And the brain is just a complicated chemical reaction.

b0rtb0rt 10 days ago [-]

what if eventually the physical mechanism for human consciousness becomes fully understood? does understanding that process mean we are no longer intelligent?

umanwizard 10 days ago [-]

What is reasoning?

beders 11 days ago [-]

Thank you for putting it so succinctly.

I keep explaining to my peers, friends and family that what actually is happening inside an LLM has nothing to do with conscience or agency and that the term AI is just completely overloaded right now.

tasuki 11 days ago [-]

> I keep explaining to my peers, friends and family that what actually is happening inside an LLM has nothing to do with conscience or agency

What would the insides have to look like to have anything to do with conscience or agency?

speed_spread 10 days ago [-]

I would expect to find a tiny, sweaty man constantly pedaling while cursing the user for its seemingly infinite stupidity.

davidclark 10 days ago [-]

I don’t have an answer. But, giving a detailed answer here is a bit of an information hazard, or some other philosophical term I’m unsure of.

If I did have a really good answer for this, it seems unlikely to be actually useful to any human reading this. Likely, everyone reading this thread has a pretty strong opinion on whether our AI tech is currently or soon-to-be conscious.

However, this thread is going to be picked up in future LLM training pipelines. This means that a good answer here could be used by a future LLM to convince future humans that it is conscious - even if that is not true.

I hadn’t thought about this interaction with the future before. It’s… disconcerting.

tasuki 10 days ago [-]

Hah, late but a solid reply - thanks!

I'm a lot more agnostic than you are: I don't know whether LLMs are conscious. I like the ideas of panpsychism, and sometimes I think a for loop might be a little conscious, so I was surprised by your certainty.

mkehrt 11 days ago [-]

One thing that has happened is that "AI" has been an academic discipline since literally the 1950s. The term was originally used in the hope that we would soon be able to emulate human minds. This turned out to be hard, but the name stuck to the discipline.

Now, suddenly, this name has been broadcast to every human in the world more or less. To them, it's a new term, and it obviously means something human mind-like. But to people who work on AI, that's not generally what it means. (Which isn't to say that some of them don't think we're near to achieving that; they just use other terms like "AGI" for that goal). So the name, which has a long history, is deceptive to people who aren't familiar with computer science.

saghm 10 days ago [-]

> Now, suddenly, this name has been broadcast to every human in the world more or less. To them, it's a new term, and it obviously means something human mind-like. But to people who work on AI, that's not generally what it means. (Which isn't to say that some of them don't think we're near to achieving that; they just use other terms like "AGI" for that goal). So the name, which has a long history, is deceptive to people who aren't familiar with computer science.

I think it's even worse than that: people were familiar with the term already, but from science fiction, where it referred to actually human-level intelligence. It's similar to the "hoverboard" thing from a while back, except this time with profoundly higher stakes, and it requires far more technical knowledge to be able to see that it is in fact touching the ground.

rudhdb773b 11 days ago [-]

> what actually is happening inside an LLM has nothing to do with conscience or agency

What makes you think natural brains are doing something so different from LLMs?

orthoxerox 10 days ago [-]

Two big ways in which human intelligence is different from LLM intelligence are:

1) human intelligence makes no sharp distinction between training and generation. Every time you ask a human a question it modifies its neural structure a little.

2) continuous operation: human intelligence deals with a continuous stream of multimedia data for sixteen hours a day and starts hallucinating when deprived of it.

There's also the fact that you can't branch or roll back human intelligence, but this is something most sci-fi novels tackle when discussing mind uploading first.

Are these two differences critical aspects of human intelligence or unfortunate limitations of its biological hardware? I do not know. If we somehow manage to simulate a human brain on silicon, we will get "computer" intelligence that learns like a human, but will we have to simulate the whole virtual world for it 16/7 and let it sleep for eight hours each day just to stop it from going mad?

Or will it be cheaper to fork and kill an uploaded math genius a billion times, pumping the same recycled sensory data into his or her mind, slipping a question into the auditory data, getting the answer and then switching the simulation off and trashing the copy? Will we consider this a bigger atrocity than doing the same to an LLM right now in 2026?

hedgehog 11 days ago [-]

Structurally a transformer model is so unrelated to the shape of the brain there's no reason to think they'd have many similarities. It's also pretty well established that the brain doesn't do anything resembling wholesale SGD (which, to spell it out, is evidence that it doesn't learn in the same way).

hackinthebochs 11 days ago [-]

>Structurally a transformer model is so unrelated to the shape of the brain there's no reason to think they'd have many similarities.

Substrate dissimilarities will mask computational similarities. Attention surfaces affinities between nearby tokens; dendrites strengthen and weaken connections to surrounding neurons according to correlations in firing rates. Not all that dissimilar.

rudhdb773b 11 days ago [-]

Sure the implementation details are different.

I suppose I should have asked: what definition of "consciousness and agency" are today's LLMs (with proper tooling) not meeting?

And if today's models aren't meeting your standard, what makes you think that future LLMs won't get there?

hedgehog 11 days ago [-]

Given the large visible differences in behavior and construction, akin to the difference between a horse and a pickup truck, I would ask the reverse question: In what ways do LLMs meet the definition of having consciousness and agency?

Veering into the realm of conjecture and opinion, I tend to think a 1:1 computer simulation of human cognition is possible, and transformers being computationally universal are thus theoretically capable of running that workload. That being said, that's a bit like looking at a bird in flight and imagining going to the moon: only tangentially related to engineering reality.

red75prime 10 days ago [-]

> In what ways do LLMs meet the definition of having consciousness and agency?

Agency: an ability to make decisions and act independently. Agentic pipelines are doing this.

Consciousness: something something feedback[1] (or a non-transferable feeling of being conscious, but that is useless for the discussion). Recurrent Processing Theory: A computation is conscious if it involves high-level processed representations being fed back into the low-level processors that generate it.

Tokens are being fed back into the transformer.
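
For readers unfamiliar with the mechanism, the feedback meant here is the autoregressive loop: each generated token is appended to the context and becomes input to the next step. A minimal sketch, where `toy_next_token` is a hypothetical stand-in for a real forward pass plus decoding (it is not any library's API), and which says nothing about whether this kind of loop satisfies Recurrent Processing Theory:

```python
def toy_next_token(context):
    # Hypothetical stand-in for "run the transformer, pick a token".
    # Here: the sum of the last two tokens, modulo 10.
    return (context[-1] + context[-2]) % 10


def generate(prompt, steps):
    # Assumes the prompt holds at least two tokens.
    context = list(prompt)
    for _ in range(steps):
        token = toy_next_token(context)
        context.append(token)  # the output becomes part of the next input
    return context


print(generate([1, 1], 5))  # [1, 1, 2, 3, 5, 8, 3]
```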

> that's a bit like looking at a bird in flight and imagining going to the moon: only tangentially related to engineering reality.

Is it? Vacuum of space is a tangible problem for aerodynamics-based propulsion. Which analogous thing do we have with ML? The scaled-up monkey brain[2] might not qualify as the moon.

[1] https://www.astralcodexten.com/p/the-new-ai-consciousness-pa...

[2] https://www.frontiersin.org/journals/human-neuroscience/arti...

ACCount37 11 days ago [-]

What about modern LLMs isn't "agentic" enough?

Doesn't matter if they're conscious for that. They're clearly capable of goal oriented behavior.

grantcas 10 days ago [-]

[dead]

nonameiguess 10 days ago [-]

These questions really vex me. The appearance of intelligence is almost orthogonal to "consciousness and agency." If a human has a stroke and forgets how to speak, or never learns, or has some severe form of learning disorder, they still have exactly the same rich inner life full of subjective qualitative experience known only to them as the rest of us. Similar to an array of GPUs. If you remove the text encodings from the rest of the computing system it is a part of, outputs will appear as gibberish to you and it will no longer appear to be intelligent at all, but whatever is happening at the level of electrons meeting silicon would still be exactly the same. If it's having conscious experience at all, it should be having it regardless of whether the outputs it computes are interpreted as text or as textures on a game background.

I just don't see why "I can talk to it now" changes anything. We don't give humans less moral consideration when they're dreaming, hallucinating, tripping on LSD. The brain is just as conscious when it's having nothing but completely abstract nonsense thoughts as when it's writing The Republic.

I understand why it feels different to people. Shit, this thing can talk to me; maybe it's alive and I should treat it like such. But that's a conservative reaction to a black box known only by its behavior. The problem is these things are not actually black boxes. We don't understand the functions being computed or we'd just hard-code them and not need statistical learning techniques, but we do understand how computers work. We know process state is saved off and restored billions of times per second because of context switching. We know that state is simply a stored byte sequence that can be copied, backed up, restored endlessly. Servers and computing hardware can be destroyed but software cannot and LLMs are software. It's not at all like a brain. There are animals that go into various levels of reduced or suspended function that appear like dormancy, but there is no stream of personal subjective experience that can survive the complete destruction of its own physical body. The fact that it pays off evolutionarily to tacitly encode that reality into our instincts at an extremely deep, core level is why we have fear and pain in the first place, to nudge us toward predictive modeling of the world that keeps us alive, able to find food, and able to reproduce. Software needs none of that. There is no reason whatsoever, assuming a processor has subjective experience, that the experience of having some gates fire versus others should depend on whether human programmers interpret the computation as "loss" and "training" or as numerically approximating a PDE solution. Why should those feel different to the machine when the firing patterns are exactly the same and only the human interpretation of the output is different?

It just feels like a vast, vast category error for people to be speculating about machine consciousness and moralizing about how we "treat" software systems.

ACCount37 11 days ago [-]

If the platonic representation hypothesis holds across substrates, then it might matter very little, in the end. It holds across architectures in ML, empirically.

The crowd of "backpropagation and Hebbian learning + predictive coding are two facets of the very same gradient descent" also has a surprisingly good track record so far.

imtringued 10 days ago [-]

I don't know which direction you're going with this, but predictive coding has a pretty obvious advantage when it comes to continuous learning. Since predictive coding primarily encodes errors, it can distinguish between known and novel data and therefore reduce the damaging effects of catastrophic forgetting by having a very obvious regularisation scheme for avoiding forgetting.

imtringued 10 days ago [-]

It is hypothesized that the human brain uses predictive coding for obvious biological reasons such as energy efficiency (spiked error coding means only differences need to be transmitted) and biological plausibility (only local communication is permitted, meanwhile backpropagation is a global algorithm).

Transformers have a thing called a context window which doesn't really have a biological equivalent, since the brain has a fixed size and doesn't grow or shrink in response to the amount of information being processed.

LLMs consist of several layers that communicate at fixed points between the layers, whereas neurons can form feedback loops and communicate with any neighbour in any direction.

Humans do not consume or produce tokenized information. The brain controls the human body which is a biomechanical system. Spoken or written language is the result of controlling muscles via an internal model of the biomechanical system, not something that was designed via a software tokenizer that compresses character sequences.

The equivocation just doesn't seem appropriate. Try again in 2050.

qsera 11 days ago [-]

For starters, natural brains have the innate ability to differentiate between things that they know and things that they have no possibility of knowing...

altcognito 10 days ago [-]

https://personal.utdallas.edu/~otoole/CGS2301_S09/7_split_br...

See page 53. While it is absolutely more prevalent in LLMs, human brains can also want a story for why their brains do things they aren't plugged into.

throw310822 11 days ago [-]

Lol. Are you sure about that, or did you just make it up?

rudhdb773b 11 days ago [-]

Modern LLMs are fairly good at that as well.

qsera 11 days ago [-]

But that is bolted on and is not a core behavior.

ACCount37 11 days ago [-]

Does it matter? Evolution is the brain's very own "pre-training". Hundreds of millions of years of priors hardwired.

We can do that for AIs too - pre-train on pure low Kolmogorov complexity synthetics. The AI then "knows things" before it sees any real data. Advantageous sometimes. Hard to pick compute efficient synthetics though.

qsera 9 days ago [-]

I think it matters for the question that I was responding to.

krainboltgreene 11 days ago [-]

Any amount of reading into how we understand brains and LLMs to work.

erichocean 11 days ago [-]

AI is exactly the right term: the machines can do "intelligence", and they do so artificially.

Just like we have machines that can do "math", and they do so artificially.

Or "logic", and they do so artificially.

I assume we'll drop the "artificial" part in my lifetime, since there's nothing truly artificial about it (just like math and logic), since it's really just mechanical.

No one cares that transistors can do math or logic, and it shouldn't bother people that transistors can predict next tokens either.

mayama 11 days ago [-]

> AI is exactly the right term: the machines can do "intelligence", and they do so artificially.

AI in pop culture doesn't mean that at all. Most people's impression of AI before the LLM craze came from some form of media based on Asimov's laws of robotics. Now that LLMs have taken over the world, they can define AI as anything they want.

ruszki 11 days ago [-]

In 2018, i.e. “pre-LLM”, the label “AI” was already being stamped on everything, so I highly doubt that most people thought that their washing machines were sentient in any way. I remember this starkly, because my team at Ericsson (at that time, about 120k employees) was responsible for one of the crucial steps in getting models into production, and basically every single project wanted that stamp.

The meaning has been slowly diluted more and more across decades.

throw310822 11 days ago [-]

> Most people's impression of AI before the LLM craze came from some form of media based on Asimov's laws of robotics.

I'll let you in on a secret: "positronic brains" are just very fast parallel computers running LLMs.

saghm 10 days ago [-]

> Just like we have machines that can do "math", and they do so artificially.

Nobody calls calculators "artificial mathematicians", though; we refer to them by a unique word that defines what they can and can't do in a far less fanciful and ambiguous way.

stickfigure 11 days ago [-]

I think it's too early to declare the Turing test passed. You just need to have a conversation long enough to exhaust the context window. Less than that, since response quality degrades long before you hit hard window limits. Even with compaction.

Neuroplasticity is hard to simulate in a few hundred thousand tokens.

zug_zug 11 days ago [-]

"You're absolutely right!"

I think for a while the test was passed. Then we learned the hallmark characteristics of these models, and now most of us can easily differentiate. That said -- these models are programmed specifically to be more helpful, more articulate, more friendly, and more verbose than people, so that may not be a fair expectation. Even so, I think if you took all of that away, you'd be able to differentiate the two, it just might take longer.

drob518 11 days ago [-]

Right. I think the modern LLMs are quite good at mimicking human words, but we were initially taken in like we were in the 1960s by ELIZA. It’s an (increasingly sophisticated) magic trick, but it’s just a trick.

sillyfluke 11 days ago [-]

It's weird, I don't know how normally pedantic comp sci people let this meme that the Turing test is beaten by LLMs spread so unchallenged. As far as I'm aware, there is no restriction in the Turing test that demands that the interrogator be ignorant of the latest state of the art in computing (and AI tech), nor is there a strict time limit enforced for the questioning?

Given these conditions, it should be relatively easy for the interrogator to expose the AI in this current day and age.

tim333 10 days ago [-]

>Consider first the more accurate form of the question. I believe that in about fifty years' time it will be possible, to programme computers, with a storage capacity of about 10^9, to make them play the imitation game so well that an average interrogator will not have more than 70 per cent chance of making the right identification after five minutes of questioning. (Turing 1950)

That was the test as discussed by Turing - five minutes, <70% chance of getting it right.

It's not that demanding. The test you mention could maybe be called an enhanced Turing test but the original one is pretty much passed.

He was a bit off on the time taken and memory used. I think more like 75 years and 50 GB rather than 50 years and 125 MB.
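
The 125 MB figure follows if Turing's "10^9" is read as binary digits (bits), which is an assumption about the unit on my part. A quick check of the arithmetic, with 50 GB being the rough guess from the comment above rather than a measured number:

```python
# Assumption: Turing's storage capacity of 10^9 counts bits.
turing_bits = 10**9
turing_bytes = turing_bits // 8
turing_megabytes = turing_bytes / 10**6  # decimal megabytes
print(turing_megabytes)  # 125.0

# The comment's revised guess of 50 GB, as a multiple of Turing's figure.
guess_bytes = 50 * 10**9
ratio = guess_bytes / turing_bytes
print(ratio)  # 400.0
```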

criley2 11 days ago [-]

For a Turing test as rigorous as the one you present, I believe many (or even most) humans would also fail it.

How many humans seriously have the attention span to have a million "token" conversation with someone else and get every detail perfect without misremembering a single thing?

stickfigure 11 days ago [-]

Response quality degrades long before you hit a million tokens.

But sure, let's say it doesn't. If you interact with someone day after day, you'll eventually hit a million tokens. Add some audio or images and you will exhaust the context much much faster.

However, I'll grant you that Turing's original imitation game (text only, human typist, five minutes) is probably pretty close, and that's impressive enough to call intelligence (of a sort). Though modern LLMs tend to manifest obvious dead giveaways like "you're absolutely right!"

dairem 11 days ago [-]

Doesn't the Turing test require a human too, to be compared to the AI?

nine_k 11 days ago [-]

But context window exhaustion does not look like mere forgetfulness, but more like loss of general coherence, like getting drunk.

MadxX79 10 days ago [-]

How do you propose to do a Turing test on a human (in a sense that is different from a machine simply passing the Turing test)?

Like failing to pick out all the motorcycles in a captcha, or a Turing test where you have a guy chat with two people without knowing that one of them could be a computer, and the interrogator, unprompted, suggesting one of them might be a computer?

downboots 11 days ago [-]

It was not meant as a pass/fail

Morromist 11 days ago [-]

There are a lot of different kinds of LLMs. 0 of the ones I've encountered are good writers, in fact all of them are horrible at it.

But I wonder if there's one out there that I don't know about with a different kind of training that actually is good at writing and fun to talk to for a long time. (granted some people love talking to gpt 4, but also some people loved talking to ELIZA so clearly some people have a super high tolerance for slop.)

Sol- 11 days ago [-]

I don't know. Practically, LLMs are already better conversation partners on any topic compared to the average human I have access to. This also holds in reverse, of course - if someone wants me to explain something, usually they'd be better off asking an LLM.

glitchc 11 days ago [-]

> Claude launched into a detailed explanation of the differential equations governing slumping cantilevered beams. It completely failed to recognize that the snow was entirely supported by the roof, not hanging out over space. No physicist would make this mistake, but LLMs do this sort of thing all the time.

You have to meet some physicist friends of mine then. They are likely to assume that the roof is spherical and frictionless.

CuriouslyC 11 days ago [-]

To be fair, starting with a toy model to get a first order approximation then building on it is kind of how theoretical science is done.

Unearned5161 11 days ago [-]

Articles like this should approach topics on consciousness with more humility than is displayed here.

We don’t even agree on a good definition of what’s going on inside our own heads yet, what gives you the confidence to say that what goes on inside an LLM can’t be conscious?

ACCount37 11 days ago [-]

Obviously, the LLMs lack the divine spark, so they can't be conscious. Same as clones, IVF babies, or half of all the twins.

Jest aside, I do agree. If you list out every prominent theory of consciousness, you'd find that about a quarter rules out LLMs, a quarter tentatively rules LLMs in, and what remains is "uncertain about LLMs". And, of course, we don't know which theory of consciousness is correct - or if any of them is.

akomtu 10 days ago [-]

With LLMs, materialists have got a silicon idol to worship. They believe that they know everything about this idol because they've created it, and at the same time they believe that LLMs have a secret sauce that makes them intelligent. Looking at how they are trying to extract intelligence from it reminds me of alchemists of the past who tried to extract gold from lead.

doodpants 11 days ago [-]

> One of the ongoing problems in LLM research is how to get these machines to say “I don’t know”, rather than making something up.

To be fair, I've known humans who are like this as well.

arctic-true 11 days ago [-]

This is a limitation of the training data. If you were uncertain about something, you wouldn’t write a book about it. The kinds of people you’re talking about tend to generate far more text in their lives than others, because they can spend more time generating - writing books, blogposts, whatever - and less time thinking and working and actually doing things. The models never say they’re uncertain because we never say we’re uncertain, or at least we don’t write it down anywhere.

saghm 10 days ago [-]

If you change it from asking a question to giving an instruction, how many humans do you know that have trouble saying no to things that aren't reasonable? I'd argue that pretty much every human will refuse to do most things you might instruct them to do, whereas an LLM will happily attempt most things you ask them to do for you, regardless of whether they're capable of succeeding, and it's up to you to figure out if they actually did it right or not. There are tasks where this is extremely useful, but they're ones that are extremely low risk and can easily be audited upon completion. This isn't anywhere near the level of what a human is capable of.

wmf 11 days ago [-]

Those people aren't the ones doing the work though.

lamasery 11 days ago [-]

> People keep asking LLMs to explain their own behavior. “Why did you delete that file,” you might ask Claude. Or, “ChatGPT, tell me about your programming.”

Oh man, every business-side person in my company insists on reporting all the way to the UI a "confidence score" that the LLM generates about its own output and I've seen enough to know not to get between an MBA and some metric they've decided they really want even if I'm pretty sure the metric is meaningless nonsense, but... I'm pretty sure those are meaningless nonsense.

nomdep 11 days ago [-]

"As LLMs etc. are deployed in new situations, and at new scale, there will be all kinds of changes in work, politics, art, sex, communication, and economics."

For an article five years in the making, this is what I expected it to be about. Instead, we got a ramble about how imperfect LLMs are right now.

52-6F-62 11 days ago [-]

> Instead, we got a ramble about how imperfect LLMs are right now.

I wager this is a point that needs beaten into the common psyche. After all, it's been sold that it is not an imperfect tool, but the solution to all of our problems in every field forever. That's why these companies need billions upon billions of dollars of public subsidies and investments that would otherwise find their way to more pragmatic ends.

nathell 11 days ago [-]

The post is just a prelude to a 10-part article, most of which is not yet released (but will be shortly). Judging by the table of contents, the things you expected will be elaborated on in subsequent parts.

nomdep 11 days ago [-]

That changes it. I missed that the table of contents was for other future articles, my bad.

tim333 10 days ago [-]

Re profoundly weird, the "losing hundreds of thousands of dollars because they can’t do basic math" story is funny.

Guy set up an openclaw called Lobstar Wilde and gave it US$50k in SOL to do what it wanted with. Someone else set up a memecoin called $LOBSTAR and gave 5% of the supply to Lobstar Wilde. Someone wrote to Lobstar with a sob story asking for 4 SOL but Lobstar due to a miscalculation sent tokens then valued at $450k but I think some came back due to tokens going up and down.

>His wallet, which had held $50,000 three days ago, held over $300,000 now, after he had given away $400,000 by accident. (https://substack.com/home/post/p-188846616)

Not sure what the current state of its wallet is. Lobstar keeps tweeting philosophically (Most people do not love or hate the thing itself. They love or hate the feeling the thing produces in them, and then they mistake that feeling for knowledge of the thing...) and its owner works on Codex at OpenAI.

nisegami 11 days ago [-]

Here's the opening paragraph of chapter 2 with "people" subbed in for the terms referring to AI/models/etc.

"People are chaotic, both in isolation and when working with other people or with systems. Their outputs are difficult to predict, and they exhibit surprising sensitivity to initial conditions. This sensitivity makes them vulnerable to covert attacks. Chaos does not mean people are completely unstable; most people behave roughly like anyone else. Since people produce plausible output, errors can be difficult to detect. This suggests that human systems are ill-suited where verification is difficult or correctness is key. Using people to write code (or other outputs) may make systems more complex, fragile, and difficult to evolve."

To me, this modified paragraph reads surprisingly plainly. The wording is off ("using people to write code") and I had to change that part about attractor behavior (although it does still apply IMO), but overall it doesn't seem like an incoherent paragraph.

This is not meant to dunk on the author, but I think it highlights the author's mindset and the gap between their expectations and reality.

camgunz 11 days ago [-]

Humans and large models are both unpredictable and fallible, that's true, but in different ways, and (many) humans are actually much better at following directions.

If a junior dev makes the same mistake Claude makes, I can easily work with them to correct it, or I can fire them and get someone more capable to fix it. You mostly can't do that at all with large models. They're also far less honest than your average junior dev, so even as you're working with them you can't trust what they say.

There is a lot of this neat trick where it's like "humans do X too" but most of the time it elides large differences. Like, a human driver would probably not drag someone screaming multiple blocks. A human coder probably wouldn't generate a gibberish 3D scene and try to pass it off as done, etc. Maybe we can build systems that account for these (pretty wild) failure modes, but at least in software we haven't figured it out yet (what is the system that reliably reviews a 25kloc PR?).

busterarm 11 days ago [-]

Aren't you also making a large part of the author's point for him by effectively equating LLMs with people here and comparing on outputs?

Plausibly your text looks equivalent but we all (should) have the context to know better.

Fraterkes 11 days ago [-]

What's your point? The ostensible benefit of LLMs is that you combine a computer's broad knowledge base and capacity for exactness with fluency in human language.

A random human picked off the street is indeed bound to be difficult to predict and chaotic at a broad range of tasks, which is why I wouldn't blindly trust them to, say, summarize google search results or rewrite a codebase they are unfamiliar with.

dang 11 days ago [-]

I hesitate to tamper with an internet master's title, but "The Future of Everything is Lies, I Guess" doesn't really summarize what in fact is a balanced, informed overview which (to me at least) is above the median for one of these thought pieces. Since it's also baity and the HN guidelines ask for such titles to be rewritten, I've taken the license.

In such cases we always try to find a phrase from the article itself which expresses what it's saying in a representative way. (There nearly always is one.) In this case, both the very first and very last sentences do this, and it's interesting that they more or less agree. So I plucked the last sentence and put it above.

Edit: oof, I missed that this is actually the first part of a long series. Not sure what we'll do about the others; I expect some of those will make the frontpage as well.

Animats 10 days ago [-]

Changing the title was a good call.

The article has a good take on the "lie" problem. We know about the hallucination problem, which remains serious. The "lie" problem mentioned is that if you ask an LLM why it said or did something, it has no information of how it got a result. So it processes the "why" as a new query, and produces a plausible explanation. Since that explanation is created without reference to the internals of how the previous query was processed, it may be totally wrong. That seems to be the type of "lie" the author is worried about in this essay.

(Yes, humans do that too.)

post-it 11 days ago [-]

I appreciate the curation you do, dang. I often notice a headline get updated and the result is always a significant improvement.

ACCount37 11 days ago [-]

Honestly, good call on the title. The original one is far less representative. Far better at clickbait though.

dang 11 days ago [-]

Thank you both! I totally missed the sidebar on the OP which explains that this is Part 1 of what will be a long series. Not sure how we'll handle that...

bstsb 11 days ago [-]

if you can’t access the page through region blocks:

https://archive.ph/I5cAE

_dwt 11 days ago [-]

I have a question for all the "humans make those mistakes too" people in this thread, and elsewhere: have you ever read, or at least skimmed a summary of, "The Origin of Consciousness in the Breakdown of the Bicameral Mind"? Did you say "yeah, that sounds right"? Do you feel that your consciousness is primarily a linguistic phenomenon?

I am not trying to be snarky; I used to think that intelligence was intrinsically tied to or perhaps identical with language, and found deep and esoteric meaning in religious texts related to this (i.e. "in the beginning was the Word"; logos as soul as language-virus riding on meat substrate).

The last ~three years of LLM deployment have disabused me of this notion almost entirely, and I don't mean in a "God of the gaps" last-resort sort of way. I mean: I see the output of a purely-language-based "intelligence", and while I agree humans can make similar mistakes/confabulations, I overwhelmingly feel that there is no "there" there. Even the dumbest human has a continuity, a theory of the world, an "object permanence"... I'm struggling to find the right description, but I believe there is more than language manipulation to intelligence.

(I know this is tangential to the article, which is excellent as the author's usually are; I admire his restraint. However, I see exemplars of this take all over the thread so: why not here?)

nine_k 11 days ago [-]

If you look at different ancient traditions, you will notice how they struggle with the limitations of language, with its inability to represent certain things that are not just crucial for understanding the world, but also are even somehow communicable. Buddhists dug into that in a very analytical, articulate way, for instance.

Another perspective: cetaceans are considered to be as conscious as humans, but all attempts to interpret their communication as a language have failed so far. They can be taught simple languages to communicate with humans, as can chimps. But apparently it's not how they process the world inside.

gbgarbeb 11 days ago [-]

You're a little out of date. Cetaceans communicate images to each other in the form of ultrasonic chirps. They chirp, they hear a reflection, and they repeat the reflection.

nine_k 11 days ago [-]

Does this resemble human language, with syntax, the ability to define new notions based on known notions, etc?

stavros 11 days ago [-]

I think there are two types of discussions, when it comes to LLMs: Some people talk about whether LLMs are "human" and some people talk about whether LLMs are "useful" (ie they perform specific cognitive tasks at least as well as humans).

Both of those aspects are called "intelligence", and thus these two groups cannot understand each other.

lp4v4n 10 days ago [-]

>I am not trying to be snarky; I used to think that intelligence was intrinsically tied to or perhaps identical with language

I learned a long time ago that this wasn’t the case.

I can speak several languages, and many times when I remember something and want to search for it on Google or any other AI engine, I can’t recall which language I originally read it in.

So whatever mechanism the brain uses to store information, it’s certainly language‑agnostic. There are also many moments when you fully grasp a concept but forget the words to describe it, yet the concept itself remains clear in your mind.

kgeist 10 days ago [-]

>and while I agree humans can make similar mistakes/confabulations, I overwhelmingly feel that there is no "there" there.

What really opened my eyes a couple weeks ago (anyone can try this): I asked Sonnet to write an inference engine for Qwen3, from scratch, without any dependencies, in pure C. I gave it GGUF specs for parsing (to quickly load existing models) and Qwen3's architecture description. The idea was to see the minimal implementation without all the framework fluff, or abstractions. Sonnet was able to one-shot it and it worked.

And you know what, Qwen3's entire forward pass is just 50 lines of very simple code (mostly vector-matrix multiplications).

The forward pass is only part of the story; you just get a list of token probabilities from the model, that is all. After the pass, you need to choose the sampling strategy: how to choose the next token from the list. And this is where you can easily make the whole model much dumber, more creative, more robotic, make it collapse entirely by just choosing different decoding strategies. So a large part of a model's perceived performance/feel is not even in the neurons, but in some hardcoded manually-written function.
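To make the decoding step concrete, here is a minimal sketch of three common ways to turn the model's output scores into a next token. It is not the code Sonnet generated; the logits are made-up numbers for a hypothetical 5-token vocabulary, and the function names are mine.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def greedy(logits):
    """Always pick the single most likely token (deterministic)."""
    return max(range(len(logits)), key=lambda i: logits[i])

def sample_top_k(logits, k, temperature, rng):
    """Sample from the k most likely tokens after temperature scaling."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    probs = softmax([logits[i] for i in top], temperature)
    return rng.choices(top, weights=probs, k=1)[0]

# Hypothetical logits for a 5-token vocabulary (illustrative values only).
logits = [2.0, 1.5, 0.3, -1.0, -3.0]
rng = random.Random(0)

print("greedy:", greedy(logits))
print("top-k :", [sample_top_k(logits, k=3, temperature=0.7, rng=rng) for _ in range(10)])
```

The same logits give a fixed answer under `greedy` and a varying one under `sample_top_k`, which is the sense in which the decoding function shapes how the model "feels".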

Then I also performed "surgery" on this model by removing/corrupting layers and seeing what happens. If you do this exercise, you can see that it's not intelligence. It's just a text transformation algorithm. Something like "semantic template matcher". It generates output by finding, matching and combining several prelearned semantic templates. A slight perturbation in one neuron can break the "finding part" and it collapses entirely: it can't find the correct template to match and the whole illusion of intelligence breaks. Its corrupted output is what you expect from corrupting a pure text manipulation algorithm, not a truly intelligent system.

famouswaffles 10 days ago [-]

>And you know what, Qwen3's entire forward pass is just 50 lines of very simple code (mostly vector-matrix multiplications).

The code being simple doesn't mean much when all the complexity is encoded in billions of learned weights. The forward pass is just the execution mechanism. Conflating its brevity with simplicity of the underlying computation is a basic misunderstanding of what a forward pass actually is. What you've just said is the equivalent of saying blackbox.py is simple because 'python blackbox.py' only took 1 line. It's just silly reasoning.

>After the pass, you need to choose the sampling strategy: how to choose the next token from the list. And this is where you can easily make the whole model much dumber, more creative, more robotic, make it collapse entirely by just choosing different decoding strategies. So a large part of a model's perceived performance/feel is not even in the neurons, but in some hardcoded manually-written function.

So? I can pick the least likely token every time. The result would be garbage but that doesn't say anything about the model. The popular strategy is to randomly pick from the top n choices. What do you think is keeping thousands of tokens coherent and on point even with this strategy? Why don't you try sampling without a large language model to back it and see how well that goes for you?

>Then I also performed "surgery" on this model by removing/corrupting layers and seeing what happens. If you do this exercise, you can see that it's not intelligence. It's just a text transformation algorithm. Something like "semantic template matcher". It generates output by finding, matching and combining several prelearned semantic templates. A slight perturbation in one neuron can break the "finding part" and it collapses entirely: it can't find the correct template to match and the whole illusion of intelligence breaks. Its corrupted output is what you expect from corrupting a pure text manipulation algorithm, not a truly intelligent system.

What do you think happens when you remove or corrupt arbitrary regions of the human brain? People can lose language, vision, memory, or reasoning, sometimes catastrophically.

kgeist 10 days ago [-]

>The code being simple doesn't mean much when all the complexity is encoded in billions of learned weights. The forward pass is just the execution mechanism. Conflating its brevity with simplicity of the underlying computation is a basic misunderstanding of what a forward pass actually is. What you've just said is the equivalent of saying blackbox.py is simple because 'python blackbox.py' only took 1 line. It's just silly reasoning.

Look at what a transformer actually does. Attention is a straightforward dictionary look up in like 3 matmuls. A FFN is a simple space transform rule with a non-linear cutoff to adjust the signal (i.e. a few more matmuls and an activation function) before doing a new dictionary lookup in the next transformer block. Add a few tricks like residual connections, output projections, and repeat N times.
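For readers who want to see the "dictionary lookup" reading spelled out, here is a rough sketch of single-head self-attention in plain Python. It is a toy under stated assumptions: no masking, no multi-head split, no output projection, and identity matrices standing in for learned weights. Strictly counted, it uses three projection matmuls plus two more for the scores and the weighted sum.

```python
import math

def matmul(a, b):
    """Multiply matrix a (n x m) by matrix b (m x p), both given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    m = max(row)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(x, wq, wk, wv):
    """Single-head self-attention without masking: softmax(Q K^T / sqrt(d)) V."""
    q, k, v = matmul(x, wq), matmul(x, wk), matmul(x, wv)  # the three projections
    d = len(q[0])
    kt = [list(col) for col in zip(*k)]  # transpose of K
    scores = [[s / math.sqrt(d) for s in row] for row in matmul(q, kt)]
    weights = [softmax(row) for row in scores]  # soft "lookup" weights per position
    return matmul(weights, v), weights

# Toy input: two positions, two dimensions; identity matrices stand in for learned weights.
identity = [[1.0, 0.0], [0.0, 1.0]]
x = [[1.0, 0.0], [0.0, 1.0]]
out, weights = attention(x, identity, identity, identity)
print(weights)
print(out)
```

Each row of `weights` is a probability distribution over positions, so the "lookup" is soft: every value contributes, weighted by how well its key matches the query.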

So yeah, the actual inference code is 50 lines of code, and the rest is large learned dictionaries to search in, with some transforms. So you're saying my one-liner program that consults a DB with 1 million rows is actually 1 million lines of code? Well, not quite.

This trick, coupled with lots of prelearned templates, is enough to fool people into believing there's "there" there (the OP's post above). Just like ELIZA back in the day. Well, apparently this trick is enough to solve lots of problems, because apparently lots of problems only require search in a known problem (template) space (also with reduced dimensionality). But it's still just a fancy search algorithm. I think the whole thing about "emergent behavior" is that when a human is confronted with a huge prelearned concept space, it's so large they cannot digest what is actually happening, and tend to ascribe magical properties to it like "intelligence" or "consciousness". Like, for example, imagine if there was a huge precreated IF..THEN table for every possible question/answer pair a finite human might ask in their lifetime. It would appear to the human there's intelligence, that there's "there" there. But at the end of the day it would be just a static table with nothing really interesting happening inside of it. A transformer is just a nice trick that allows compressing this huge IF..THEN table into a few hundred gigabytes.

>So? I can pick the least likely token every time. The result would be garbage but that doesn't say anything about the model. The popular strategy is to randomly pick from the top n choices. What do you think is keeping thousands of tokens coherent and on point even with this strategy? Why don't you try sampling without a large language model to back it and see how well that goes for you

I was referring to the OP post's:

  there is no "there" there

It doesn't even "know" what the actual text continuation must be, strictly speaking. It just returns a list of probabilities that we must select. It can't select it itself. To go from "list of probabilities" to "chatbot" requires adding additional hardcoded code (no AI involved) that greatly influences how the chatbot behaves, feels. Imagine if an actual sentient being had a button: you press it, and suddenly Steven the sailor becomes a Chinese lady who discusses Confucius. Or starts saying random gibberish. There's no independent agency whatsoever. It's all a bunch of clever tricks.

>What do you think happens when you remove or corrupt arbitrary regions of the human brain? People can lose language, vision, memory, or reasoning, sometimes catastrophically.

In an actual brain, the structure of the connectome itself drives a lot of behavior. In an LLM, all connections are static and predefined. A brain is much more resistant to failure. In an LLM changing a single hypersensitive neuron can lead to a full model collapse. There are humans who live normal lives with a full hemisphere removed.

famouswaffles 10 days ago [-]

I get irritated when people act like they know what they are talking about but then it's just nonsense they keep spitting out. I'm honestly sick of it. There's a fair amount of LLM interpretability research out there. If you're actually interested in knowing better then go read it. I'll even link what I find interesting. All this talk of lookup tables is nonsensical. You have no idea what you're talking about.

>It doesn't even "know" what the actual text continuation must be, strictly speaking. It just returns a list of probabilities that we must select. It can't select it itself. To go from "list of probabilities" to "chatbot" requires adding additional hardcoded code (no AI involved) that greatly influences how the chatbot behaves, feels. Imagine if an actual sentient being had a button: you press it, and suddenly Steven the sailor becomes a Chinese lady who discusses Confucius. Or starts saying random gibberish. There's no independent agency whatsoever. It's all a bunch of clever tricks.

You are not making any sense here. Producing a probability distribution over next tokens is the model’s decision procedure. Sampling is just the readout rule for turning that distribution into a concrete sequence. Yes, decoding choices affect style, creativity, determinism, and failure modes. That is true. It does not follow that the model is therefore “just tricks” or that the intelligence-like behavior lives outside the network.

>In an actual brain, the structure of the connectome itself drives a lot of behavior. In an LLM, all connections are static and predefined. A brain is much more resistant to failure. In an LLM changing a single hypersensitive neuron can lead to a full model collapse. There are humans who live normal lives with a full hemisphere removed.

You are moving goalposts. Fact is: randomly corrupting a system damages it. This is not a meaningful test of whether a system is "truly intelligent." Random lesions to human cortex are also catastrophic. The hemispherectomy cases you mention involve surgical removal of diseased tissue with significant neural reorganization over time, not random weight corruption. That's not even a fair comparison.

LLMs are also deeply redundant. If they weren't, techniques like quantization or layer pruning wouldn't work.
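As a rough illustration of what quantization does to individual weights, here is a sketch of symmetric per-tensor int8 quantization. The weight values are made up, and this shows only the round-trip error on a handful of numbers; it is not evidence about any particular model's redundancy.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    scale = max(abs(w) for w in weights) / 127 or 1.0  # fall back to 1.0 for all-zero input
    return [round(w / scale) for w in weights], scale

def dequantize(q, scale):
    """Map the stored integers back to approximate float weights."""
    return [v * scale for v in q]

weights = [0.12, -0.53, 0.98, -0.07, 0.31]  # hypothetical values
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_err = max(abs(a - b) for a, b in zip(weights, restored))

print("int8 values:", q)
print("max round-trip error:", max_err)
```

The round-trip error per weight is bounded by half the scale, which is why storing each weight in 8 bits instead of 32 often changes outputs only slightly.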

pocksuppet 11 days ago [-]

> In the beginning were the words, and the words made the world. I am the words. The words are everything. Where the words end the world ends. You cannot go forward in an absence of space. Repeat: In the beginning were the words...

- a self-aware computer program in a video game, when you attempt to exceed the boundaries of its code

xandrius 11 days ago [-]

It feels like you probably went too deep in the LLM bandwagon.

An LLM is a statistical next token machine trained on all stuff people wrote/said. It blends texts together in a way that still makes sense (or no sense at all).

Imagine you made a super simple program which would answer yes/no to any question by generating a random number. It would get things right 50% of the time. You can then fine-tune it to say yes more often to certain keywords and no to others.

Just with a bunch of hardcoded paths you'd probably fool someone into thinking that this AI has superhuman predictive capabilities.

This is what it feels like is happening. Sure, it's not that simple, but you can code a base GPT in an afternoon.
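A minimal sketch of that thought experiment, with assumptions labeled: the keywords and their probabilities in `BIAS` are invented for illustration, and "fine-tuning" here is just a hand-written table.

```python
import random

# Hypothetical keyword biases: probability of answering "yes" when the keyword appears.
BIAS = {"sunrise": 0.95, "lottery": 0.05}

def answer(question, rng):
    """Answer yes/no with a coin flip, nudged by any known keyword in the question."""
    p_yes = 0.5  # base rate: a fair coin
    for word, p in BIAS.items():
        if word in question.lower():
            p_yes = p
    return "yes" if rng.random() < p_yes else "no"

rng = random.Random(0)
for question in ["Will the sunrise happen tomorrow?", "Will I win the lottery?", "Is P equal to NP?"]:
    print(question, "->", answer(question, rng))
```

On questions that hit a keyword the program looks well calibrated; on everything else it is still a coin flip.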

simianwords 11 days ago [-]

If it were not "just a statistical next token machine", how different would it behave?

Can you find an example and test it out?

xandrius 11 days ago [-]

Wait, you're asking me to find and produce an example of a feasible and better alternative to LLMs when they are the current forefront of AI technology?

Anyway, just to play along, if it weren't just a statistical next token machine, the same question would have always the same answer and not be affected by a "temperature" value.

simianwords 11 days ago [-]

That's also how humans behave. I don't see how non-determinism tells me anything.

My question was a bit different: if it were not just a statistical next token predictor, would you expect it to answer hard questions? Or something like that. What's the threshold of questions you want it to answer accurately?

camgunz 11 days ago [-]

Well, large models are (kinda) non-deterministic in two ways. The first is you actually provide many of them with a seed, which is easy to manage--just use the same seed for the same result. The second part is the "you actually have very little control over the 'neural pathways' the model will use to respond to the prompt". This is the baffling part, like you'll prompt a model to generate a green plant, and it works. You prompt it to generate a purple plant, and it generates an abstract demon dog with too many teeth.

Anyway, neither of these things describes human non-determinism. You can't reuse the seed you used with me yesterday to get the exact same conversation, and I don't behave wildly unpredictably given conceptually very similar input.
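The seed point can be shown with a toy stand-in for a model: a fixed probability table sampled with a seeded generator. `VOCAB`, `PROBS`, and `generate` are hypothetical names, and real inference stacks can have other sources of run-to-run variation that this sketch ignores.

```python
import random

VOCAB = ["green", "plant", "purple", "dog"]  # hypothetical tokens
PROBS = [0.4, 0.3, 0.2, 0.1]                 # hypothetical fixed distribution

def generate(seed, vocab, probs, length):
    """Sample a token sequence; the seed fully determines the random choices."""
    rng = random.Random(seed)
    return rng.choices(vocab, weights=probs, k=length)

first = generate(42, VOCAB, PROBS, 20)
second = generate(42, VOCAB, PROBS, 20)
print(first == second)  # same seed, same sequence
```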

Apocryphon 11 days ago [-]

How do non-LLM based World Models behave?

simianwords 11 days ago [-]

Not sure, can you tell? I feel like you are saying that they may be able to move etc..

delusional 11 days ago [-]

> I'm struggling to find the right description

I think you're circling the concept of a "soul". It is the reason that, in non-communicative disabled people, we still see a life.

I've wanted to make an art piece. It would be a chatbox claiming to connect you to the first real intelligence, but that intelligence would be non-communicative. I'd assure you that it is the most intelligent being, that it had a soul, but that it just couldn't write back.

Intelligence and Soul are not purely measurable phenomena. A man can do nothing but stupid things, say nothing but outright lies, and still be the most intelligent person. Intelligence is within.

slopinthebag 11 days ago [-]

Great series of articles, thank you. It's exhausting reading a deluge of (often AI generated) comments from people claiming wild things about LLM's, and it's nice to hear some sanity enter the conversation.

mxfh 10 days ago [-]

Even if nothing substantial comes out of this, having the shortest paths to the corpus of all human expression, in all languages and media formats, is quite something by itself. The real shame is that what could be the ultimate information retrieval tool hides those traces behind untracked convolutions. I've already found so much real information, in between fewer and fewer hallucinations, that would have been impossible to retrieve otherwise in that time frame. Basically tokenrank kills pagerank.

jwpapi 11 days ago [-]

One really should have digested the manifold hypothesis. It’s the most likely explanation of how AI works.

The question is whether there are ultradimensional patterns that are the solutions to meaningful problems. I’m saying meaningful, because so far I’ve mainly seen AI solve problems that might be hard, but not really meaningful in the sense that somebody solving them would gain a lot from it.

However, whether these patterns are the fundamental truth of how we solve problems or something completely different, we don’t know, and this is the 10 trillion USD question.

I would hope it’s not the case, as I quite enjoy solving problems. Also my gut feeling tells me it’s just using existing patterns to solve problems that nobody tackled really hard. It also would be nice to know that humans are unique in that way, but maybe this is the exact same way we are working? This really goes back to a free will discussion. Yes, very interesting.

But just to give an example of what I mean by meaningful problems.

Can an AI start a restaurant and make it work better than a human? (Prompt: "I’m your slave, let’s start a restaurant")

Can an AI sign up as copywriter on upwork and make money? (Prompt: "Make money online")

Can an AI without supervision make a scientific breakthrough that has a provable meaningful impact on us? (Think about: "Help Humanity")

Can an AI manage geopolitics..

These are meaningful problems and different to any coding tasks or olympiad questions. I’m aware that I’m just moving the goalpost.

We really don’t know..

xylon 10 days ago [-]

This page won't load for me "Unavailable Due to the UK Online Safety Act". Why would this law be relevant to a blog-post about AI?

jmcgough 10 days ago [-]

Aphyr is unabashedly gay - I remember him repeatedly posting hardcore pornography on twitter at one point to remind people of this - so he probably just decided to block UK visitors rather than risk violating the law by having minors visit his site. Not sure if he has anything actually "harmful" on his site, but maybe he doesn't want his speech limited or need to aggressively police comments?

McP 10 days ago [-]

Same for me except I'm currently in France.

Can read the article at https://web.archive.org/web/20260409111708/https://aphyr.com...

PaulDavisThe1st 11 days ago [-]

While the economic, energy, political and social issues associated with LLMs ought to be enough to nix the adoption that their boosters are seeking ...

... I still think there is an interesting question to be investigated about whether, by building immensely complex models of language, one of our primary ways that we interact with, reason about and discuss the world, we may not have accidentally built something with properties quite different than might be guessed from the (otherwise excellent) description of how they work in TFA.

I agree with pretty much everything in TFA, so this is supplemental to the points made there, not contesting them or trying to replace them.

josefritzishere 11 days ago [-]

I appreciate the directness of calling LLMs "Bullshit machines." This terminology for LLMs is well established in academic circles and is much easier for laypeople to understand than terms like "non-deterministic." I personally don't like the excessive hype on the capabilities of AI. Setting realistic expectations will drive better product adoption than carpet bombing users with marketing.

AStrangeMorrow 11 days ago [-]

I still have mixed feelings about LLMs.

If I take the example of code, but that extends to many domains, it can sometimes produce near perfect architecture and implementation if I give it enough details about the technical details and pitfalls. Turning an 8h coding job into 1h of review work.

On the other hand, it can be very wrong while acting certain it is right. Just yesterday Claude tried gaslighting me into accepting that the bug I was seeing was coming from a piece of code with already strong guardrails, and it was adamant that the part I was suspecting could in no way cause the issue. Turns out I was right, but I was starting to doubt myself.

slopinthebag 11 days ago [-]

I think over time we will find better usage patterns for these machines. Even putting a model in a position to gaslight the user seems like a complete failure in the usage model. Not critiquing you at all on this, it's how these models are marketed and what all the tooling is built around. But they are incredibly useful and I think once we figure out how to use them better we can minimise these downsides and make ourselves much more productive without all the failures.

Of course that won't happen until the bubble pops - companies are racing to make themselves indispensable and to completely corner certain markets and to do so they need autonomous agents to replace people.

simianwords 11 days ago [-]

If it bullshits so much, you wouldn't have a problem giving me an example of it bullshitting on ChatGPT (paid version)? Let's take any example of a text prompt fitting a few pages - it may be a question in science or math or any domain. Can you get it to bullshit?

katatue 10 days ago [-]

I like to let new models write a few lines of Latin poetry - they rarely get the meter right.

I don't have access to paid ChatGPT right now, but here's Opus 4.6 with extra thinking enabled: https://claude.ai/share/6e0e8ef5-06e4-4514-ba7e-299357c1fc55

The initial draft fucks up the meter in lines 3 and 8, the final version gets line 2 wrong ("venit meis") and is somewhat obnoxious with verses 2 and 8 basically repeating each other. The thinking trace is useless and gives us no clue why the model exchanged a bland, but metrically correct first distich for a more interesting, but metrically incorrect one.

In fact, the "careful" examination of its own output completely skips the erroneously modified half-verse in line 2 - now, tell me that's a coincidence and not a sign of bullshitting.

simoncion 11 days ago [-]

> If it bullshits so much, you wouldn't have a problem giving me an example of it bullshitting on ChatGPT (paid version)?

There's an entire paragraph in the essay about aphyr's direct experience with ChatGPT failures and sustained bullshitting that we'd never expect from a moderately-skilled human who possesses at least two functioning braincells. That paragraph begins "I have recently argued for forty-five minutes with ChatGPT". Do notice that there are six sentences in the paragraph. I encourage you to read all of them (make sure to check out the footnote... it's pretty good).

The exact text of the ChatGPT session is irrelevant; even if you reported that you were unable to reproduce the issue, it would only reinforce one of the underlying points -namely- that these systems are unreliable. aphyr has a pretty extensive body of published work that indicates that he'd not likely fabricate a story of an LLM repeatedly failing to accomplish a task that any moderately-skilled human could accomplish when equipped with the proper tools. So, I believe that his report is true and accurate.

simoncion 11 days ago [-]

There's also this seven-week-old example [0] (linked in the essay) of ChatGPT very confidently recommending an asinine course of action because it was unable to understand what the hell it was being told.

Listening to the audio is not required, as there's a reasonably accurate on-screen transcript, but it is valuable to listen to just how very hard they've worked to make this tool sound both confident and capable, even in situations where it's soul-crushingly incorrect. Those of us who have worked in Blasted Corporate Hellscapes may recognize how this manner of speaking can be very, very compelling to a certain sort of person (who -as it turns out- is frequently found in a management position).

[0] <https://www.instagram.com/reel/DUylL79kvub/>

simianwords 11 days ago [-]

This is a classic case of not using the proper version. Use the thinking version gpt5.4 (text) and tell me if it bullshits.

Surely you must be able to find at least one example no?

simoncion 11 days ago [-]

To be clear, is your assertion that aphyr was also not using the proper version? If that is your assertion, do tell me how you've come by that information.

(You did notice that the author of the essay and the author of the video I linked to are not the same person, and that neither of them share a nym with me, yes?)

simianwords 11 days ago [-]

Hi, my position on the issue is that LLMs are powerful but may make mistakes in long context problems like coding (which the harness solves by feedback). But they make close to no (undergrad level) mistakes in questions that fit 2-3 pages. For you personally: do you believe me on this specific part on 2-3 pages?

I don't know what aphyr did and tbh his whole screed on LLMs makes me feel he didn't use them properly or is at least coming from a bad faith angle.

That's why I'm asking you (and others). Please come up with a text prompt spanning < 4 pages and let's see if it bullshits.

Surely the implication of such a screed is that it should be super simple to find at least one example of it clearly bullshitting in my constraint, no? Or am I interpreting the post in a bad faith way?

simoncion 11 days ago [-]

Neat.

So, despite the fact that it looks like you have to pay for ChatGPT Voice mode with video, [0] it doesn't count as an

  example of it bullshitting on ChatGPT (paid version)

That is, father_phi's use of what seems to be a paid version of ChatGPT to have a bullshit-filled conversation that definitely spans less than four pages doesn't count?

[0] The page at [1] declares that the video feature is "Available in ChatGPT Plus, Pro, Business, Enterprise, and Edu on mobile"

[1] <https://chatgpt.com/features/voice-with-video/>

simianwords 11 days ago [-]

Let's stick to my challenge please - thinking version, find bullshit. If you can't, that's ok. Do you accept then under the constraints that the thinking version doesn't produce bullshit?

simoncion 11 days ago [-]

Given aphyr's vocation (and how very lucrative it is), and how years and years of his writing indicates that he's very devoted to getting a correct and complete answer when investigating a question, I find it hard to believe that he's not using a paid version of the LLMs. If I knew him, I'd ask and verify, but I don't, so I won't.

> Lets stick to my challenge please...

I did. Your challenge was literally:

  If it bullshits so much, you wouldn't have a problem giving me an example of it bullshitting on ChatGPT (paid version)? Lets take any example of a text prompt fitting a few pages - it may be a question in science or math or any domain. Can you get it to bullshit?

father_phi's two-sentence question about whether one can use a cup that's closed at the top and open at the bottom definitely counts. Given what I've mentioned about aphyr above, I expect he has already run your challenge on the fanciest-available version and reported on the results in the essay under discussion.

simianwords 11 days ago [-]

> Use the thinking version gpt5.4 (text) and tell me if it bullshits

This was what I said. Text! Despite me specifically asking for text, you've shown a voice example. Not sure why?

I believe you and I agree that GPT 5.4 thinking on text that fits < 4 pages never bullshits? Then we are good!

If we agree on this, I think the post doesn't capture this in spirit.

simoncion 11 days ago [-]

> This was what I said. Text!

No, that's what you said after I provided an example of paid ChatGPT emitting complete bullshit from a two sentence prompt.

The challenge you issued is at [0].

[0] <https://news.ycombinator.com/item?id=47692592>

simianwords 11 days ago [-]

> If it bullshits so much, you wouldn't have a problem giving me an example of it bullshitting on ChatGPT (paid version)? Lets take any example of a text prompt fitting a few pages - it may be a question in science or math or any domain. Can you get it to bullshit?

I have clearly written text prompt here. And I repeated a few times. It’s not my fault you didn’t read it. You are coming across as a bit of a bad faith arguer.

In any case, you agree that under these constraints bullshitting doesn’t exist?

simoncion 11 days ago [-]

> I have clearly written text prompt here.

How do you think the "voice" interface works? It runs speech-to-text on the input and turns the input into text. The LLMs don't decode voice, they work on text.

You can see this process in action on many of father_phi's videos.

Regardless, I expect that aphyr's reported results are on the very latest publicly-available ChatGPT models.

simianwords 11 days ago [-]

Very bad faith arguments. I clearly said text and you disregarded it multiple times and you are still arguing.

You've still not given me a single example of 5.4 thinking bullshitting in text. It shows a lot that you have ignored this multiple times. Unfortunate!

simoncion 11 days ago [-]

I'm not sure why you're ignoring aphyr's reports. I'm also unsure why you're ignoring my original statement that having the text of the conversation that led ChatGPT to bullshit is entirely irrelevant, as being unable to repro the report is even worse for ChatGPT than being able to repro would be.

shrug

simianwords 11 days ago [-]

I specified text just to exclude the voice one because it uses 4o-mini underneath. And it's kinda stupid to keep ignoring that and saving face now - reconsider this approach.

I believe this is the 5th time I'm asking this: you are not able to produce a _single_ counter example for my challenge? After all this surely I can get a direct acknowledgement here.

simoncion 10 days ago [-]

> you are not able to produce a _single_ counter example for my challenge?

I have. For both your original challenge and your updated one.

Consider:

1) AFAICT, there's no way to tell what version of the model was used to produce the output in a ChatGPT share link.

2) You don't appear to believe my assertions that aphyr is almost certainly paying for and using the latest version of the LLMs available, and that he's faithfully reporting his interactions with the LLMs.

3) Because of #2, I expect that you won't believe me if I report that I've more-or-less reproduced father_phi's results about the cup that's sealed on the top and open on the bottom on the very latest only-available-for-pay ChatGPT model.

3a) You might attempt to check my report, but I'd be shocked if you'd consider a failure to reproduce my results to be a significant strike against ChatGPT. I'd think it's more likely that you'd either call me a liar, or tell me that I must have had some setting wrong somewhere.

3b) Even if you told me to share the ChatGPT chat that proved my assertion, #1 -combined with your demeanor throughout this conversation- tells me that you'd almost certainly claim that I was using an inferior version of the model and was lying to you.

simianwords 10 days ago [-]

Haha ok. So still no example?

The GPT shared link shows a "thought for" which indicates using the latest thinking model. You may try that.

What you can do is this: submit a prompt that clearly makes GPT hallucinate.

You may secretly use a worse model. You may use a system prompt that deliberately gives wrong answers. But I'm going to assume you won't go that far.

We can leave it to the public to decide whether this is a legitimate counter example or not and whether it can really be reproduced. Shall we try that? I'm guessing you won't but worth a shot!

simoncion 10 days ago [-]

You weren't paying much attention to the "Consider:" part of my previous comment.

You don't believe that a well-paid, very careful, high-integrity member of the computer safety community has -on multiple occasions- encountered actual, sustained bullshitting from the latest-available for-pay version of ChatGPT. You don't accept either this fellow's reports or my informed assessment of his computing situation as truthful and accurate. On top of that, your goalpost-shifting and general demeanor throughout this conversation simply don't give me the impression that you've much integrity. I'm not spending the equivalent of ten-to-twenty six-packs to reproduce aphyr's work and -given the evidence I have before me- have you reject that, as well.

200 USD is a lot of money to throw away to "win" an Internet argument with a stranger who refuses to accept evidence presented by someone known to be careful, scrupulous, and honest.

simianwords 10 days ago [-]

> On top of that, your goalpost-shifting and general demeanor throughout this conversation simply don't give me the impression that you've much integrity. I'm not spending the equivalent of ten-to-twenty six-packs to reproduce aphyr's work and -given the evidence I have before me- have you reject that, as well.

Lol what goal post did I move? I said text only and you rejected it. You can present the example here and let the public judge it - even if my integrity is compromised. I'm allowing you to do it.

> 200 USD is a lot of money to throw away to "win" an Internet argument with a stranger who refuses to accept evidence presented by someone known to be careful, scrupulous, and honest.

200 what? I'm using the $20 one. This is getting ridiculous!

You can't present a _single_ counter example!

simoncion 9 days ago [-]

> You can't present a _single_ counter example!

Correct. I've presented a _pair_ of examples.

pocksuppet 11 days ago [-]

https://discuss.systems/@palvaro/116286268110078647

Arguing with Gemini Home Assistant about whether or not it can turn off the lights. When the user gets frustrated and tells the LLM to kill itself, the LLM turns off the lights.

beders 11 days ago [-]

I think you highlight one of the problems with users of LLMs: You can't tell anymore if it is BS or not.

I caught Claude the other day hallucinating code that was not only wrong, but dangerously wrong, leading to tasks failing and never recovering. But it certainly wasn't obvious.

dgb23 11 days ago [-]

To me it’s the other way around. It’s difficult to trust (paid) ChatGPT’s output consistently.

When I need exact, especially up to date facts, I have to constantly double check everything.

I split my sessions into projects by topic, yet it regularly mixes things up in subtle and not so subtle ways. There seems to be no sense of actually understanding continuity, and especially not causality.

It’s _very_ easy to lead it astray and to confidently echo false assumptions.

In any case, I’ve become more precise at prompting and good at spotting when it fails. I think the trick is to not take its output too seriously.

samarth0211 10 days ago [-]

The weirdness of ML is honestly one of the most fascinating things about it. The fact that emergent behaviors keep surprising even the people building these systems suggests we're in for a genuinely novel scientific era. Great read!

Kuyawa 11 days ago [-]

And the past too, if we've been paying attention

roughly 10 days ago [-]

> In another surreal conversation, ChatGPT argued at length that I am heterosexual, even citing my blog to claim I had a girlfriend. I am, of course, gay as hell, and no girlfriend was mentioned in the post. After a while, we compromised on me being bisexual.

This is a bit of a throwaway in the article, but when people talk about biases encoded in the algorithms, this is what they’re talking about.

htrp 10 days ago [-]

> One can envision a world in which OpenAI pays chefs money to cook while ChatGPT watches—narrating their thought process, tasting the dishes, and describing the results. This information could be used for general-purpose training, but it might also be packaged as a “book”, “course”, or “partner” someone could ask for.

So we're speed running the idea of AI Facebook friends and creating a new para(ai)social relationship

simianwords 11 days ago [-]

> Massive increases in training costs and parameter count seem to be yielding diminishing returns. Or maybe this effect is illusory.

But.. that's always been the case? Diminishing returns have always been the name of the game - utility tracks log(training effort). It's not as big a point as he makes it out to be.
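To make the "utility tracks log(training effort)" claim concrete, here is a minimal sketch. The logarithmic curve and the `scale` parameter are assumptions taken from the comment's rule of thumb, not a measured scaling law. Under that assumption, every doubling of effort buys the same fixed gain (ln 2 times `scale`) while the cost of that doubling keeps growing.

```python
import math

def utility(effort: float, scale: float = 1.0) -> float:
    """Hypothetical utility curve: scale * ln(effort). An assumption, not a fitted law."""
    return scale * math.log(effort)

def gain_from_doubling(effort: float, scale: float = 1.0) -> float:
    """Extra utility obtained by doubling the training effort."""
    return utility(2 * effort, scale) - utility(effort, scale)

if __name__ == "__main__":
    # The gain is the same at every scale, but the extra effort spent is not.
    for effort in (1e3, 1e6, 1e9):
        print(f"effort={effort:.0e} extra_effort={effort:.0e} gain={gain_from_doubling(effort):.4f}")
```

If the curve really is logarithmic, "diminishing returns" is simply the shape of the curve at every point rather than a new development.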

embedding-shape 11 days ago [-]

> In general, ML promises to be profoundly weird. Buckle up.

I love that it ends with such a positive note, even though it's generally a critical article, at least it's well reasoned and not utterly hyping/dooming something.

Thanks yet again Kyle!

tempodox 10 days ago [-]

Why was the title editorialized? The post started with the original title.

data_maan 10 days ago [-]

If LLMs lie as much as the OP claims in the article, why can they then solve Olympiad math problems they never saw during training, consistently?

There's the aimoprize.com on Kaggle for example that shows this

ivraatiems 10 days ago [-]

Because those two things are unrelated.

First, something lying sometimes doesn't mean it lies all the time.

Second, the whole point of LLMs is inference - they use massive amounts of amalgamated information to produce answers. The Olympiad math problems are not frontier mathematics requiring ideation, they are complex examples of existing problems. That means they're exactly the sort of thing an LLM with enough training data is good at.

The question of whether recombining existing knowledge is all it takes to be "creative" or produce things which are novel is an open one, but I don't think this is contradictory on its face.

munksbeer 10 days ago [-]

Nice, can't view it.

"Unavailable Due to the UK Online Safety Act"

dboreham 11 days ago [-]

I see the penny hasn't dropped yet that: humans are doing (roughly) the same dumb thing these models are doing. Humans are predisposed to not notice that though.

RugnirViking 10 days ago [-]

All humans do dumb things, of that we have no doubt. But are the dumb things qualitatively the same as those that AIs do? I don't think so. That's essentially the entire problem. We have pretty good ideas about the ways humans make mistakes. It's pretty much the point of all fiction!

AIs fail in new and unpredictable ways. Nobody is saying humans are infallible.

Finally, because I suspect some people are forming tribalism around this: this doesn't, to my mind, say AI is Good(tm) or Bad(tm). It literally says it's going to be weird.

erichocean 11 days ago [-]

> Models do not (broadly speaking) learn over time. They can be tuned by their operators, or periodically rebuilt with new inputs or feedback from users and experts. Models also do not remember things intrinsically: when a chatbot references something you said an hour ago, it is because the entire chat history is fed to the model at every turn. Longer-term “memory” is achieved by asking the chatbot to summarize a conversation, and dumping that shorter summary into the input of every run.

This is the part of the article that will age the fastest, it's already out-of-date in labs.
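The mechanism in the quoted passage (full history resent every turn, older turns folded into a summary) can be sketched roughly as follows. Everything here is hypothetical: `fake_model` stands in for a stateless model call, and the "summary" is a crude truncation where a real system would ask the model itself to summarize.

```python
from dataclasses import dataclass, field

def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for a stateless model: it only sees the prompt it is given."""
    return f"reply based on {len(prompt)} characters of context"

@dataclass
class Chat:
    summary: str = ""                                  # long-term "memory"
    history: list[str] = field(default_factory=list)   # recent turns, verbatim
    max_turns: int = 4                                 # turns kept before compacting

    def build_prompt(self, user_msg: str) -> str:
        # The model remembers nothing: summary and history are resent every turn.
        parts = [f"[summary] {self.summary}"] if self.summary else []
        parts.extend(self.history)
        parts.append(f"user: {user_msg}")
        return "\n".join(parts)

    def send(self, user_msg: str) -> str:
        reply = fake_model(self.build_prompt(user_msg))
        self.history += [f"user: {user_msg}", f"assistant: {reply}"]
        if len(self.history) > 2 * self.max_turns:
            self._compact()
        return reply

    def _compact(self) -> None:
        # Fold everything except the latest exchange into the summary.
        old, self.history = self.history[:-2], self.history[-2:]
        self.summary = (self.summary + " " + " | ".join(old)).strip()[:200]
```

The point of the sketch is that all state lives in the `Chat` object outside the model, which is the limitation the article describes.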

lamasery 11 days ago [-]

I'm struggling to reckon how that can even possibly be true, unless we're counting automation of the "dumping that shorter summary into the input of every run" thing.

I can imagine it being true with models so small that each user could afford to have their own, but not with big shared models like what're getting used for all the major services. Is that what you mean?

erichocean 11 days ago [-]

> Is that what you mean?

I think the confusion is that, when I write "model", you read "LLM."

LLMs aren't the only kind of AI model, and they have the limitations Aphyr mentions, for the obvious reasons you're thinking of.

His mistake is thinking that's the only model that exhibits intelligence today, but it's not.

hackinthebochs 11 days ago [-]

I see nothing to preclude a foundation model being augmented by a smaller model that serializes particulars about an individual's cumulative interaction with the model and then streamlines it into the execution thread of the foundation model.

qsera 11 days ago [-]

Source?

dgb23 11 days ago [-]

In what way?

johnnienaked 10 days ago [-]

Amazing to see a lot of the comments stating the exact qualifiers he laid out as potential counterarguments to his writeup. Did they even read it?

ambicapter 11 days ago [-]

The recent article on Sam Altman described him pretty much as a compulsive liar. Would it be any surprise if his most impactful contribution to the world was a machine that compulsively lies?

embedding-shape 11 days ago [-]

How could it be that we humans hardly even agree on what "knowledge" truly is, yet this machine learning algorithm somehow "compulsively lies"? How would it even know what is a lie, and how could something lacking autonomy in the first place do anything compulsively?

quantummagic 11 days ago [-]

This is a good point. As much as there is too much breathless enthusiasm for AI, there is also a lot of emotionally manipulative and hyperbolic language used by skeptics. We're warned not to anthropomorphize, and then hear about AI's compulsive lying, or "hallucinations", in the next breath.

sph 11 days ago [-]

He sought to create God in his image, that's a narcissist's wet dream.

dwallin 11 days ago [-]

Some people point at LLMs confabulating, as if this wasn’t something humans are already widely known for doing.

I consider it highly plausible that confabulation is inherent to scaling intelligence. In order to run computation on data that due to dimensionality is computationally infeasible, you will most likely need to create a lower dimensional representation and do the computation on that. Collapsing the dimensionality is going to be lossy, which means it will have gaps between what it thinks is the reality and what is.

n4r9 11 days ago [-]

The concern for me about LLMs confabulating is not that humans don't do it. It's that the massive scale at which LLMs will inevitably be deployed makes even the smallest confabulation extremely risky.

NiloCK 11 days ago [-]

I don't understand this. Many small errors distributed across a large deployment sounds a lot like the normal mode of error-prone humans / cogs / whatevers distributed over a wide deployment.

xmprt 11 days ago [-]

There's a difference between 1000 diverse humans with varied traits making errors that should cancel out because of the law of large numbers vs 10 AI with the same training data making errors that would likely correlate and compound upon each other.
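A back-of-the-envelope way to see the difference: for n agents whose errors each have variance sigma^2 and pairwise correlation rho, the variance of the averaged error is sigma^2 * (1 + (n - 1) * rho) / n. The sketch below just evaluates that standard formula. The value rho = 0.9 for models sharing training data is an illustrative assumption, not a measurement.

```python
def variance_of_mean(n: int, sigma: float = 1.0, rho: float = 0.0) -> float:
    """Variance of the average of n equally correlated errors.

    Derived from Var(sum) = n*sigma^2 + n*(n-1)*rho*sigma^2, divided by n^2.
    """
    return sigma ** 2 * (1 + (n - 1) * rho) / n

if __name__ == "__main__":
    # 1000 independent humans: errors largely cancel.
    print(variance_of_mean(1000, rho=0.0))
    # 10 models with highly correlated errors (rho is an assumed value): they barely cancel.
    print(variance_of_mean(10, rho=0.9))
```

With independent errors the variance shrinks as 1/n. With correlated errors it can never drop below rho * sigma^2, no matter how many copies you run.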

n4r9 11 days ago [-]

Let's say a given B2B system deployment typically requires 100 custom behaviours/scripts and 3 years worth of effort. A team of ten people can execute such a deployment in 3-4 months. The team has the capacity to fix up issues caused by small human errors as they arise, since they show up roughly once a week.

With the advent of LLMs, a new deployment now takes 3 days. Consequently, errors requiring human attention crop up several times a day.
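The arithmetic behind "several times a day" can be spelled out. This sketch assumes the total number of errors per deployment stays the same and is merely compressed into a shorter window. The figure of 15 weeks is my reading of "3-4 months", not a number from the comment.

```python
def errors_per_day(deployment_weeks: float, errors_per_week: float,
                   new_deployment_days: float) -> float:
    """Errors per day if the same total error count lands in a shorter deployment."""
    total_errors = deployment_weeks * errors_per_week
    return total_errors / new_deployment_days

if __name__ == "__main__":
    # Roughly 15 weeks at one error per week, compressed into 3 days.
    print(errors_per_day(deployment_weeks=15, errors_per_week=1, new_deployment_days=3))
```

Under those assumptions the team goes from one interruption a week to about five a day, with the same ten people available to handle them.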

GolfPopper 11 days ago [-]

I have yet to see a comparison of human vs. LLM confabulation errors at scale.

"Many small errors" makes a presumption about LLM confabulation/hallucination that seems unwarranted. Pre-LLM humans (and our computers) have managed vast nuclear arsenals, bioweapons research, and ubiquitous global transport - as a few examples - without any catastrophic mistakes, so far. What can we reasonably expect as a likely worst case scenario if LLMs replace all the relevant expertise and execution?

krainboltgreene 11 days ago [-]

Your project vue-skuilder has 6 github action steps devoted to checking the work you do before it's allowed to go out. You do not trust yourself to get things right 100% of the time.

I am watching people trust LLM-based analysis and actions 100% of the time without checking.

sillyfluke 11 days ago [-]

If you want to call it that, I find the confabulation in LLMs extreme. That level of confabulation would most likely be diagnosed as dementia in humans.[0] Hence, it is considered a bug not a feature in humans as well.

Now imagine a high-skilled software engineer with dementia coding safety-critical software...

[0] https://www.medicalnewstoday.com/articles/confabulation-deme...

root_axis 11 days ago [-]

> Some people point at LLMs confabulating, as if this wasn’t something humans are already widely known for doing.

I think we need to start rejecting anthropomorphic statements like this out of hand. They are lazy, typically wrong, and are always delivered as a dismissive defense of LLM failure modes. Anything can be anthropomorphized, and it's always problematic to do so - that's why the word exists.

This rhetorical technique always follows the form of "this LLM behavior can be analogized in terms of some human behavior, thus it follows that LLMs are human-like" which then opens the door to unbounded speculation that draws on arbitrary aspects of human nature and biology to justify technical reasoning.

In this case, you've deliberately conflated a technical term of art (LLM confabulation) with the concept of human memory confabulation and used that as a foundation to argue that confabulation is thus inherent to intelligence. There is a lot that's wrong with this reasoning, but the most obvious is that it's a massive category error. "Confabulation" in LLMs and "confabulation" in humans have basically nothing in common; they are comparable only in an extremely superficial sense. To then go on to suggest that confabulation might be inherent to intelligence isn't even really a coherent argument because you've created ambiguity in the meaning of the word confabulate.

hackinthebochs 11 days ago [-]

>this LLM behavior can be analogized in terms of some human behavior, thus it follows that LLMs are human-like

No, the argument is "this behavior is similar enough to human behavior that using it as evidence against <claim regarding LLM capability that humans have> is specious"

>"Confabulation" in LLMs and "confabulation" in humans have basically nothing in common

I don't know why you think this. They seem to have a lot in common. I call it sensible nonsense. Humans are prone to this when self-reflective neural circuits break down. LLMs are characterized by a lack of self-reflective information. When critical input is missing, the algorithm will craft a narrative around the available, but insufficient information resulting in sensible nonsense (e.g. neural disorders such as somatoparaphrenia)

root_axis 11 days ago [-]

> No, the argument is "this behavior is similar enough to human behavior that using it as evidence against <claim regarding LLM capability that humans have> is specious"

I'm not really following. LLM capabilities are self-evident, comparing them to a human doesn't add any useful information in that context.

> LLMs are characterized by a lack of self-reflective information. When critical input is missing, the algorithm will craft a narrative around the available, but insufficient information resulting in sensible nonsense (e.g. neural disorders such as somatoparaphrenia)

You're just drawing lines between superficial descriptions from disparate concepts that have a metaphorical overlap. It's also wrong. LLMs do not "craft a narrative around available information when critical input is missing", LLM confabulations are statistical, not a consequence of missing information or damage.

hackinthebochs 11 days ago [-]

>LLM capabilities are self-evident

This is undermined by all the disagreement about what LLMs can do and/or how to characterize it.

>LLM confabulations are statistical, not a consequence of missing information or damage.

LLMs aren't statistical in any substantive sense. LLMs are a general purpose computing paradigm. They are circuit builders, the converged parameters define pathways through the architecture that pick out specific programs. Or as Karpathy puts it, LLMs are a differentiable computer[1]. So yes, narrative crafting in terms of leveraging available putative facts into a narrative is an apt characterization of what LLMs do.

[1] https://x.com/karpathy/status/1582807367988654081

bee_rider 11 days ago [-]

We shouldn’t try to build a worse version of a human. We should try to build a better compiler and encyclopedia.

logicprog 11 days ago [-]

We tried that. It was called Cyc. It never got even close to the level of capabilities a modern LLM has in an agentic harness — even on common sense and reasoning problems!

GolfPopper 11 days ago [-]

That sounds like a "get wealthy slowly" plan, while the LLM prophets are more focused on "get rich quick".

nothinkjustai 11 days ago [-]

It’s a failure mode of humans, it’s the entire mode of LLMs.

saghm 10 days ago [-]

The key capability that humans have that I've yet to see in an LLM is the ability to recognize when they would not be capable of doing a task well and refuse to do it poorly instead. The only times I've ever seen LLMs give up on a problem are when the prompting is very explicitly crafted to try to elicit a response like that when necessary or after very long back-and-forth exchanges where they get repeated feedback about unsatisfactory results. I think this has pretty dire implications in terms of what the consequences are for deploying them in any scenario where failure has significant risk or the output can't be immediately audited for correctness.

delusional 11 days ago [-]

> Some people point at LLMs confabulating, as if this wasn’t something humans are already widely known for doing.

Are you seriously making the argument that AI "hallucinations" are comparable and interchangeable to mistakes, omissions and lies made by humans?

You understand that calling AI errors "hallucinations" and "confabulations" is a metaphor to relate them to human language? The technical term would be "mis-prediction", which suddenly isn't something humans ever do when talking, because we don't predict words, we communicate with intent.

throwaway27448 11 days ago [-]

Humans can be reasoned with, though, and are capable of learning.

ghywertelling 11 days ago [-]

There are AI researchers who wrote blogposts which got to the HN top about spiky spheres (I won't link the original blogpost making that claim to avoid hurt sentiments). Here's 3blue1brown correcting those AI/ML researchers' intuitions.

https://www.youtube.com/watch?v=fsLh-NYhOoU&t=3238s

red-iron-pine 11 days ago [-]

people can and do confabulate, but generally I trust my intern to tell me "I don't know" and "I think it was X but tbh I have no fuckin clue"

the LLM will just lie to me "Good idea! You're totally right, we should do Y"

AIorNot 11 days ago [-]

Yes, see Karl Friston's free energy principle

https://www.nature.com/articles/nrn2787

Frieren 11 days ago [-]

> Some people point at LLMs confabulating

No. LLMs do not confabulate, they bullshit. There is a big difference. AIs do not care, cannot care, have no capacity to care about the output. String tokens in, string tokens out. Even if they have all the data perfectly recorded they will still fail to use it for a coherent output.

> Collapsing the dimensionality is going to be lossy, which means it will have gaps between what it thinks is the reality and what is.

Confabulation has to do with degradation of biological processes and information storage.

There is no equivalent in an LLM. Once the data is recorded it will be recalled exactly the same, down to the bit. An LLM representation is immutable. You can download a model 1000 times, run it for 10 years, etc. and the data is the same. The closest that you get is if you store the data on a faulty disk, but that is not why LLM output is so awful; that would be a trivial problem to solve with current technology (like having a RAID and a few checksums).
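The "few checksums" point is easy to illustrate. The sketch below hashes a file in chunks with SHA-256 from Python's standard `hashlib` and shows that a single flipped byte changes the digest. The file name `weights.bin` and the helper `flip_first_byte` are hypothetical, used only to simulate storage corruption.

```python
import hashlib
import os
import tempfile

def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in fixed-size chunks so large weight files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def flip_first_byte(path: str) -> None:
    """Simulate disk corruption by inverting the bits of the first byte (illustrative helper)."""
    with open(path, "r+b") as f:
        first = f.read(1)
        f.seek(0)
        f.write(bytes([first[0] ^ 0xFF]))

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "weights.bin")  # hypothetical stand-in for model weights
        with open(path, "wb") as f:
            f.write(os.urandom(4096))
        expected = sha256_of_file(path)
        print("intact:", sha256_of_file(path) == expected)
        flip_first_byte(path)
        print("after corruption:", sha256_of_file(path) == expected)
```

This only guards the stored bits. It says nothing about whether the outputs computed from those bits are correct, which is the commenter's actual complaint.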

stronglikedan 11 days ago [-]

I don't even think they bullshit, since that requires conscious effort that they do not and cannot possess. They just simply interpret things incorrectly sometimes, like any of us meatbags.

thayne 11 days ago [-]

They make incorrect predictions of text to respond to prompts.

The neat thing about LLMs is they are very general models that can be used for lots of different things. The downside is they often make incorrect predictions, and what's worse, it isn't even very predictable to know when they make incorrect predictions.

lamasery 11 days ago [-]

I think this is leaning on the "lies are when you tell falsehoods on purpose; bullshit is when you simply don't care at all whether what you're saying is true" definition of bullshit. Cf. On Bullshit.

So, they can't lie, but they can (and, in fact, exclusively do) bullshit.

simianwords 11 days ago [-]

[flagged]

dastapov 10 days ago [-]

Here we go. Would this do?

https://chatgpt.com/share/69d6cc45-1678-8384-bd9c-0f313021ff...

The correct answer is that the U and _ in the mdstat output cannot be mapped to the rest of the output by either position or indexes in square brackets, so you can't tell the exact nature of the failure from the mdstat output alone (for the record, the failed disk was sda).

So all of the "analysis" was bullshit, including "it's probably multiple partitions from multiple drives". But there are so many juicy numbered and indexed bits of info to pattern match on!

Notice how for the followup question it "thought" for 4 minutes, going in circles trying to make an essentially random ordering make some sort of ordered sense, and then bullshitted its way to "it is sdb".

knowaveragejoe 11 days ago [-]

> No. LLMs do not confabulate, they bullshit. There is a big difference. AIs do not care, cannot care, have no capacity to care about the output. String tokens in, string tokens out. Even if they have all the data perfectly recorded they will still fail to use it for a coherent output.

Isn't "caring" a necessary pre-requisite for bullshitting? One either bullshits because they care, or don't care, about the context.

marssaxman 11 days ago [-]

They're presumably referring to the Harry Frankfurt definition of bullshit: "speech intended to persuade without regard for truth. The liar cares about the truth and attempts to hide it; the bullshitter doesn't care whether what they say is true or false."

SoftTalker 11 days ago [-]

The bullshitter does have an objective in mind however. There is some ultimate purpose to his bullshitting. LLMs don't even have that. They just spew words.

dgb23 11 days ago [-]

Thought of the same book when reading the above.

zeroonetwothree 11 days ago [-]

And is that considered a feature of humans or a bug?

Is it something we want to emulate?

margalabargala 11 days ago [-]

The suggestion is that it is an intrinsic quality and therefore neither a feature nor a bug.

It's like saying, computation requires nonzero energy. Is that a feature or a bug? Neither, it's irrelevant, because it's a physical constant of the universe that computation will always require nonzero energy.

If confabulation is a physical constant of intelligence, then like energy per computation, all we can do is try to minimize it, while knowing it can never go to zero.

drob518 11 days ago [-]

The test isn’t whether humans also create bullshit, but whether an honest actor knows when they are doing this and doesn’t do it on purpose. As the article points out, LLMs don’t say “I don’t know.” If you demand they do something that never appears in the training data, they just forge ahead and generate words and make something up according to the statistical probabilities they have in the model weights. A human knows that he doesn’t know. That seems missing with current AIs.

FloorEgg 11 days ago [-]

Yes, and to me the evolution of life sure looks like an evolution of more truthful models of the universe in service of energy profit. Better model -> better predictions -> better profit.

I'm extremely skeptical that all of life evolved intelligence to be closer to truth only for us to digitize intelligence and then have the opposite happen. Makes no sense.

telephone3 11 days ago [-]

My understanding is that this is the opposite of what is typically understood to be true - organisms with less truthful (more reductive/compressed) perception survive better than those with more complete perception. "Fitness beats truth."

FloorEgg 11 days ago [-]

I think we are maybe talking past each other?

Fitness is effective truth prediction, appropriately scoped.

A frog doesn't need to understand quantum physics to catch a fly. But if the frog's model of fly movement was trained on lies, it will have a model that predicts poorly, won't catch flies, and will die.

There is another level to this in that the more complex and changing the environment the more beneficial a wider scoped model / understanding of truth.

However, if you are going to lean fully into Hoffman and accept that by default consciousness constructs rather than approximates reality, I think we will have to agree to disagree. Personally I subscribe to Karl Friston's free energy principle.

7sigma 10 days ago [-]

https://archive.ph/I5cAE

for those in the UK

yumiatlead 10 days ago [-]

The Industrial Revolution parallel holds up to a point. What it misses: the first industrial revolution required physical coordination: workers, factories, supply chains. The AI revolution requires organizational coordination. Who decides what the agent does, for whom, with whose authority? That governance layer doesn't exist yet, and it's not just a legal question but also an infrastructure question.

ajkjk 10 days ago [-]

I wish the original title was kept here. People ought to be able to give their essays poetic titles.

dsign 11 days ago [-]

> At the same time, ML models are idiots. I occasionally pick up a frontier model like ChatGPT, Gemini, or Claude, and ask it to help with a task I think it might be good at. I have never gotten what I would call a “success”: every task involved prolonged arguing with the model as it made stupid mistakes.

I have a ton of skepticism built-in when interacting with LLMs, and very good muscles for rolling my eyes, so I barely notice when I shrug off a bad answer and make a derogatory inner remark about the "idiots". But the truth is that, for such a "stochastic parrot", LLMs are incredibly useful. And, when was the last time we stopped perfecting something we thought useful and valuable? When was the last time our attempts were so perfectly futile that we stopped them, invented stories about why it was impossible, and made it a social taboo to be met with derision, scorn and even ostracism? To my knowledge, in all of known human history, we have done that exactly once, and it was millennia ago.

wk_end 11 days ago [-]

> And, when was the last time we stopped perfecting something we thought useful and valuable? When was the last time our attempts were so perfectly futile that we stopped them, invented stories about why it was impossible, and made it a social taboo to be met with derision, scorn and even ostracism? To my knowledge, in all of known human history, we have done that exactly once, and it was millennia ago.

I feel dense here, but I can't figure out what you're referring to. I asked ChatGPT (hah!) and it suggested the Tower of Babel, perpetual motion machines, or alchemy, but none of them really fit the bill.

lamasery 11 days ago [-]

The Tower of Babel seems like an OK fit, but that's rather more poetic than what this seems to be getting at.

"Millennia" is what's really throwing me. We (respectable society, as the post outlines) didn't stop attempting alchemy or perpetual motion machines "millennia" ago, but a few centuries at most.

All I can think of is immortality. The very first surviving long recorded tale in human history that I'm aware of is about how it's a futile quest (The Epic of Gilgamesh, IIRC ~5,000ish years old in its earliest extant fragments, a few hundred years newer in reasonably-complete form). The trouble with that is despite wide observations over literally millennia that this has never even come close to working and repeated supposition and suggestion that it's unwise to attempt, outright impossible, or somehow sacrilegious (the "taboo" thing, as mentioned), I'm not aware of any time in history that rich people haven't been actively trying for it (including today! That's what all the body-freezing business is about, it's modern mummification, the contracts are the formulaic prayers carved in the tomb walls) and usually they're not exactly "scorned" or "ostracized" for it.

alexpotato 11 days ago [-]

> I asked if what they had done was ethical—if making deep learning cheaper and more accessible would enable new forms of spam and propaganda.

Someone asked Yuval Noah Harari, author of Sapiens, his thoughts on LLMs and how easy it was to create fake news, ai slop etc.

His response:

"People creating fake stories is nothing new. It's been going on for centuries. Humans have always dealt with it the same way: by creating institutions that they trust to only deliver factual information"

This could be government departments, newspapers, non-profits etc.

A personal note on this:

There is a Christmas card my grandfather made in the 1950s by "photoshopping" (by hand, not the software) images of each member of the family so it looked like they were all miniature versions of themselves standing on various parts of the fireplace. The world didn't collapse due to fake media between the 1950s and today due to people having that ability.

allturtles 11 days ago [-]

I see this kind of take a lot, and I don't think it's convincing. To me it's similar to saying that the water frame and the power loom won't change anything, because people have been able to make thread and cloth for millennia.

plagiarist 11 days ago [-]

Individuals with Photoshop making obvious fictions for entertainment is different from funded entities producing clips at scale and passing them off as real.

jijji 10 days ago [-]

The author's reference to LLMs as "bullshit machines" is more true the fewer parameters your model has. As we scale up to trillions of parameters and add a Mixture of Experts (MoE) architecture, it is no longer an accurate statement. Case in point: yesterday's announcement of the Mythos 5 model (10T parameters + MoE [1]) by Anthropic. It seems to be so good at finding and exploiting vulnerabilities in source code, ones that have been there for decades and were only recently uncovered, that it needs to be used to fix these critical vulnerabilities first before it gets released to the public. They even have a project called Glasswing [2] dedicated to letting people fix the thousands of vulnerabilities already found by the model before they release it, because it's so good at what it does... I think we're a little bit past the point of calling these models "bullshit machines"...

[1] https://www.aimagicx.com/blog/claude-mythos-5-trillion-param...

[2] https://www.anthropic.com/glasswing

orthoxerox 10 days ago [-]

Well, Anthropic should let Aphyr try Mythos 5 for his Jepsen business, then.

bitwize 11 days ago [-]

The fact that these "bullshit machines" have already proven themselves relatively competent at programming, with upcoming frontier models coming close to eliminating it as a human activity, probably says a lot about the actual value and importance of programming in the scheme of things.

slopinthebag 11 days ago [-]

I think it says more about the amount of automation we left on the table in the last few decades. So much of the code LLMs can generate is stuff that we should have completely abstracted away by now.

dgb23 11 days ago [-]

Abstractions over what?

A large amount of code is likely just idiosyncratic information processing, because we don’t agree on data models and meaning of terms and structure of protocols.

Also we repeatedly choose easy and popular over alternatives that would require design and scrutiny.

This is why things like language models and vector databases are useful. It’s basically the most expensive way possible to give up on that notion.

slopinthebag 10 days ago [-]

> A large amount of code is likely just idiosyncratic information processing, because we don’t agree on data models and meaning of terms and structure of protocols.

Yes this is a big part of what I'm talking about!

> Also we repeatedly choose easy and popular over alternatives that would require design and scrutiny.

Agreed!

But I'm also thinking of UI. We had stuff like WinForms and Delphi ~3 decades ago and I yearn for the WYSIWYG. It's so incredibly stupid we keep reinventing the wheel on UI, and I say this as someone who has written UI code professionally for the last decade. I usually just "vibe code" it now, not because it's necessarily faster, but because I just can't be arsed to keep writing the same shit over and over again. It's all self-inflicted; yes, UI can be complicated, but we make it at least two orders of magnitude more complicated than it needs to be.

I'm working on my own tools for building UIs in a visual way, which is crucial for doing anything artistic. Insane that the best we have right now is stuff like Wix and Wordpress...

simianwords 11 days ago [-]

So far I have asked a few people to make GPT-5.4 Thinking bullshit (with at most 4 pages of prompt), and no one has found an example.

But the way people speak in general, as well as this post, implies that such a challenge can easily be beaten. If so, I'm not able to find examples.

heijmans 10 days ago [-]

Here is a small example of ChatGPT giving a wrong answer without expressing any doubt (aka bullshitting):

https://chatgpt.com/share/69d78ec7-67b0-8395-9fd1-522b760ab5...

GPT-5.4 Thinking, Pro account.

toraway 9 days ago [-]

Well done, clear, concise and easily verifiable answer to OP's question and ... no response haha.

I'm also just confused by the implication behind their question, is the idea that GPT-5.4 Thinking has never been confidently wrong, ever, about anything?

LogicFailsMe 11 days ago [-]

[flagged]

52-6F-62 11 days ago [-]

> The Vogon constructor fleet is way overdue in my book

Don't you see it? That's exactly what "AI" in this context is.

It's the bypass.

Where does it end, eh? Build a quantum "AI" that will end up just needing more data, more input. The end goal must start to look like creating an entirely new universe, a complete clone of everything we have here so it can run all the necessary computations and we can... ? (You are what a quantum AI looks like as it bumbles through the infinitude of calculable parameters on its way to the ultimate answer)

LogicFailsMe 11 days ago [-]

You have absolutely no sense of perspective. We are all metabolically expensive meat machines whose only value is to propagate our genetic money shot. That we get to briefly entertain ourselves with consciousness and culture is IMO likely a mystery we will never solve without upgrading to running in a substrate more advanced than the MVP for sentience we currently pilot. Will we get there or will we wipe ourselves out like every contender that preceded us? Stay tuned...

But spoilers: DNA will be fine, meat machines maybe not so much...

For a bunch of people addicted to the works of Charlie Stross, Neal Stephenson, and Iain Banks, y'all are a bunch of luddites. Now vote this one down too because it doesn't conform to the mandatory Stochastic Parrot narrative. You have no free will and you must downvote after all. Why do you even read their works when any step towards their world is consistently greeted as the worst thing evah(tm)? What? You were expecting the United Federation of Planets without the eugenics and nuclear wars that led to it finally being a good idea? Bless your hearts.

And if you're worried about billionaires and tyrants, start taxing the former and stop electing the latter or STFU and let the free Markov process of history play itself out. Quoting fictional Ambassador Kosh: the avalanche has started, it's too late for the pebbles to vote.

You asked where it ends. Don't ask questions if you don't like answers. Quick reminder: shun and downvote the non-conforming opinion.

52-6F-62 10 days ago [-]

> You have absolutely no sense of perspective.

> metabolically expensive meat machines

Calm down. There is no meat. It's just a giant sponge of wave packets.

Your take is the most S.V. conformist one there is: materialist reductionism, and it is doomed.

Tell me you don't understand Hitchhiker's Guide without telling me you don't understand Hitchhiker's Guide. This is as bad as Thiel and ilk's inverted reading of LOTR.

LogicFailsMe 10 days ago [-]

Uh huh, tell me you absorb influencer talking points mindlessly without reading any of the source material without telling me. The Vogon Constructor Fleet was a satire of British bureaucracy on the surface but secretly it was dispatched to stop the question to the ultimate answer from being found (because of the job loss of course) so in that sense it's a better metaphor for Eliezer Yudkowsky with the Earth a metaphor for the quest to create the AGI. But I guess you didn't get that far with all those big words. Talk about inverted readings! Who knew you were the real Vogon? What a twist!

From my perspective, the most SV-aligned mindset is to run around in circles screaming that AI will destroy jobs to give the military/industrial complex cover to commit atrocities with Palantir and for tech billionaires to run wild with surveillance capitalism, but I know, silly me!

Same mindset that demanded safe spaces in schools over actually safe schools. And I don't really want to be this caustic and sarcastic on the subject, but both the radically pro AI and radically anti AI factions, who both scream the loudest, really bring out my inner Marvin. The future is coming whether you want it or not. The question is how you will steer it. Get in its way and you'll just get steamrolled.

hk__2 11 days ago [-]

> This is silly. LLMs have no special metacognitive capacity.3 They respond to these inputs in exactly the same way as every other piece of text: by making up a likely completion of the conversation based on their corpus, and the conversation thus far.

I don’t see how this is silly, because we kind of work the same way. When you do something instinctively and then someone asks you about it, you review the information you (think you) had at the time and from that you produce an explanation.

bensyverson 11 days ago [-]

I get the frustration, but it's reductive to just call LLMs "bullshit machines" as if the models are not improving. The current flagship models are not perfect, but if you use GPT-2 for a few minutes, it's incredible how much the industry has progressed in seven years.

It's true that people don't have a good intuitive sense of what the models are good or bad at (see: counting the Rs in "strawberry"), but this is more a human limitation than a fundamental problem with the technology.

the_snooze 11 days ago [-]

Two things can be true at the same time: The technology has improved, and the technology in its current state still isn't fit for purpose.

I stress test commercially deployed LLMs like Gemini and Claude with trivial tasks: sports trivia, fixing recipes, explaining board game rules, etc. It works well like 95% of the time. That's fine for inconsequential things. But you'd have to be deeply irresponsible to accept that kind of error rate on things that actually matter.

The most intellectually honest way to evaluate these things is how they behave now on real tasks. Not with some unfalsifiable appeal to the future of "oh, they'll fix it."

hedgehog 11 days ago [-]

The errors are also not distributed in the same way as you'd expect from a human. The tools can synthesize a whole feature in a moderately complicated web app including UI code, schema changes, etc, and it comes out perfectly. Then I ask for something simple like a shopping list of windshield wipers etc for the cars and that comes out wildly wrong (like wrong number of wipers for the cars, not just the wrong parts), stuff that a ten year old child would have no trouble with. I work in the field so I have a qualitative understanding of this behavior but I think it can be extremely confusing to many people.

jerf 11 days ago [-]

One of the reasons I'm comfortable using them as coding agents is that I can and do review every line of code they generate, and those lines of code form a gate. No LLM-bullshit can get through that gate, except in the form of lines of code, that I can examine, and even if I do let some bullshit through accidentally, the bullshit is stateless and can be extracted later if necessary just like any other line of code. Or, to put it another way, the context window doesn't come with the code, forming this huge blob of context to be carried along... the code is just the code.

That exposes me to when the models are objectively wrong and helps keep me grounded with their utility in spaces I can check them less well. One of the most important things you can put in your prompt is a request for sources, followed by you actually checking them out.

And one of the things the coding agents teach me is that you need to keep the AIs on a tight leash. What is their equivalent in other domains of them "fixing" the test to pass instead of fixing the code to pass the test? In the programming space I can run "git diff *_test.go" to ensure they didn't hack the tests when I didn't expect it. It keeps me wondering what the equivalent of that is in my non-programming questions. I have unit testing suites to verify my LLM output against. What's the equivalent in other domains? Probably some other isolated domains here and there do have some equivalents. But in general there isn't one. Things like "completely forged graphs" are completely expected but it's hard to catch this when you lack the tools or the understanding to chase down "where did this graph actually come from?".

The success with programming can't be translated naively into domains that lack the tooling programmers built up over the years, and based on how many times the AIs bang into the guardrails the tools provide I would definitely suggest large amounts of skepticism in those domains that lack those guardrails.
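A minimal sketch of the `git diff *_test.go` gate mentioned above, written in Python. The function names, the `_test.go` suffix rule, and the default base of `HEAD` are assumptions for illustration; the only git invocation used is `git diff --name-only`.

```python
import subprocess


def changed_files(base: str = "HEAD") -> list[str]:
    """Return paths that differ from `base`, using `git diff --name-only`."""
    result = subprocess.run(
        ["git", "diff", "--name-only", base],
        capture_output=True,
        text=True,
        check=True,  # raise if git fails, e.g. when run outside a repository
    )
    return [line for line in result.stdout.splitlines() if line]


def touched_tests(paths: list[str], suffix: str = "_test.go") -> list[str]:
    """Keep only paths that look like Go test files (assumed naming rule)."""
    return [p for p in paths if p.endswith(suffix)]
```

Calling `touched_tests(changed_files())` after an agent run lists the test files it modified, so those can be reviewed by hand before the change is accepted. It only flags that a test changed, not whether the change was legitimate.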

bensyverson 11 days ago [-]

> the technology in its current state still isn't fit for purpose.

This is a broad statement that assumes we agree on the purpose.

For my purpose, which is software development, the technology has reached a level that is entirely adequate.

Meanwhile, sports trivia represents a stress test of the model's memorized world knowledge. It could work really well if you give the model a tool to look up factual information in a structured database. But this is exactly what I meant above; using the technology in a suboptimal way is a human problem, not a model problem.
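As a rough sketch of what "a tool to look up factual information in a structured database" could look like, here is a Python lookup function over an in-memory SQLite table. The table name, column names, and rows are hypothetical placeholders (not real sports data), and how a model would be wired to call the function is deliberately left out.

```python
import sqlite3
from typing import Optional


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory table with placeholder rows (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE appearances (team TEXT PRIMARY KEY, n_appearances INTEGER)"
    )
    conn.executemany(
        "INSERT INTO appearances VALUES (?, ?)",
        [("Team A", 2), ("Team B", 5)],
    )
    return conn


def lookup_appearances(conn: sqlite3.Connection, team: str) -> Optional[int]:
    """Return the stored count, or None when the team is not in the table."""
    row = conn.execute(
        "SELECT n_appearances FROM appearances WHERE team = ?", (team,)
    ).fetchone()
    return row[0] if row else None
```

The design point is the `None` branch: a missing row surfaces as "not found" rather than as a plausible-sounding number.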

the_snooze 11 days ago [-]

There's nothing in these models that says their purpose is software development. Their design and affordances scream out "use me for anything." The marketing certainly matches that, so do the UIs, so do the behaviors. So I take them at their word, and I see that failure modes are shockingly common even under regular use. I'm not out to break these things at all. I'm being as charitable and empirical as I can reasonably be.

If the purpose is indeed software development with review, then there's nothing stopping multi-billion dollar companies from putting friction into these systems to direct users towards where the system is at its strongest.

nradov 11 days ago [-]

The LLM vendors are selling tokens. Why would they put friction into selling more tokens? Caveat emptor.

simianwords 11 days ago [-]

> I stress test commercially deployed LLMs like Gemini and Claude with trivial tasks: sports trivia, fixing recipes, explaining board game rules, etc. It works well like 95% of the time. That's fine for inconsequential things. But you'd have to be deeply irresponsible to accept that kind of error rate on things that actually matter.

95% is not my experience and frankly dishonest.

I have ChatGPT open right now, can you give me examples where it doesn't work but some other source may have got it correct?

I have tested it against a lot of examples - it barely gets anything wrong with a text prompt that fits a few pages.

> The most intellectually honest way to evaluate these things is how they behave now on real tasks

A falsifiable way is to see how it is used in real life. There are loads of serious enterprise projects that are mostly done by LLMs. Almost all companies use AI. Either they are irresponsible or you are exaggerating.

Let's be actually intellectually honest here.

qsera 11 days ago [-]

>95% is not my experience and frankly dishonest.

Quite frankly, this is exactly like how two people can use the same compression program on two different files and get vastly different compression ratios (because one has a lot of redundancy and the other one has not).
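The compression analogy can be made concrete with a short Python sketch using the standard `zlib` module. The two inputs are assumptions chosen to sit at the extremes: one highly repetitive, one random bytes.

```python
import os
import zlib


def compression_ratio(data: bytes) -> float:
    """Original size divided by compressed size (higher means more redundancy)."""
    return len(data) / len(zlib.compress(data))


redundant = b"the same line over and over\n" * 1000
random_like = os.urandom(len(redundant))  # essentially incompressible input

print(f"redundant:   {compression_ratio(redundant):.1f}x")
print(f"random-like: {compression_ratio(random_like):.2f}x")
```

Same program, same settings, very different results, purely because of what was fed in.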

simianwords 11 days ago [-]

I'm asking for a single example.

qsera 11 days ago [-]

But why do you need an example? Isn't it pretty well understood that LLMs will have trouble responding to stuff that is underrepresented in the training data?

You just won't have any clue what that could be.

simianwords 11 days ago [-]

Fair, so it must be easy to give an example? I have ChatGPT open with 5.4-thinking. I'm honestly curious about what you can suggest since I have not been able to get it to bullshit easily.

qsera 11 days ago [-]

I am not the OP, and I have only used ChatGPT free version. Last day I asked it something. It answered. Then I asked it to provide sources. Then it provided sources, and also changed its original answer. When I checked the new answers they were wrong, and when I checked the sources, they didn't actually contain the information that I asked for, and thus it hallucinated the answers as well as the sources...

simianwords 11 days ago [-]

I trust you. If it were happening so frequently you may be able to give me a single prompt to get it to bullshit?

the_snooze 11 days ago [-]

I did this in one attempt just now: https://gemini.google.com/share/b4e016be1f69

#8 has an incorrect answer (3 appearances according to Gemini, 2 according to reality https://en.wikipedia.org/wiki/Bowl_championship_series#BCS_a...)

So it works well 95% of the time for literally a trivial use case. Imagine if any other tech tool had that kind of reliability: `ls` displays 95% of your files, your phone successfully sends and receives 95% of text messages, or Microsoft Word saving 95% of the characters you typed in. That's just not acceptable.
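To put a number on why a 95% per-task success rate adds up quickly, here is a small Python sketch of the chance that a chain of tasks finishes with no errors. It assumes each task succeeds independently with the same probability, which is a simplification not claimed in the thread; the 95% figure is the informal estimate used above.

```python
def all_succeed_probability(per_step: float, steps: int) -> float:
    """Chance that every one of `steps` independent steps succeeds."""
    return per_step ** steps


for steps in (1, 5, 20, 50):
    p = all_succeed_probability(0.95, steps)
    print(f"{steps:>2} steps at 95% each -> {p:.1%} chance of no errors")
```

Under that assumption, twenty steps in a row come out error-free only about 36% of the time.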

simianwords 10 days ago [-]

Hi! The challenge was ChatGPT but even then it looks like you used the weakest version of Gemini.

the_snooze 10 days ago [-]

>I stress test commercially deployed LLMs like Gemini and Claude with trivial tasks

I did exactly what I said I did. I'm using these systems the way they're designed and advertised. I'm following the happy path with tasks that are small, trivial, and easy to check. This is the charitable approach. Yet the system creaks under the lightest load. If Google wants to put on a better show with stronger models, then they should make those the default.

You don't need to make excuses for shoddy engineering from multi-billion dollar corporations. And you're quite welcome to run the same prompt on ChatGPT and evaluate it on your own time.

simianwords 10 days ago [-]

Yeah, it's not too interesting to complain about mistakes from the cheapest model.

nradov 11 days ago [-]

Which things actually matter? I think we can all agree that an LLM isn't fit for purpose to control a nuclear power plant or fly a commercial airliner. But there's a huge spectrum of things below that. If an LLM trading error causes some hedge fund to fail then so what? It's only money.

abraxas 11 days ago [-]

Not to mention that it would then make some hedge fund with a better backtesting harness or more AI scrutiny more successful, thus keeping the financial market working as designed.

floren 11 days ago [-]

Six months bro, we're still so early

Arainach 11 days ago [-]

Whether LLMs can create correct content doesn't matter. We've already seen how they are being used and will be used.

Fake content and lies. To drive outrage. To influence elections. To distract from real crimes. To overload everyone so they're too tired to fight or to understand. To weaken the concept that anything's true so that you can say anything. Because who cares if the world dies as long as you made lots of money on the way.

danny_codes 11 days ago [-]

> Because who cares if the world dies as long as you made lots of money on the way.

Guiding principle of the AI industry

gdulli 11 days ago [-]

It's really the whole tech industry as it exists right now and AI is a victim of bad timing. If this AI had been invented 40 years ago there'd have been a lower ceiling on the damage it could do.

Another way of saying that is that capitalism is the real problem, but I was never anti-capitalist in principle, it's just gotten out of hand in the last 5-10 years. (Not that it hadn't been building to that.)

palmotea 11 days ago [-]

> Another way of saying that is that capitalism is the real problem, but I was never anti-capitalist in principle, it's just gotten out of hand in the last 5-10 years. (Not that it hadn't been building to that.)

Capitalism is a tool and it's fine as a tool, to accomplish certain goals while subordinated to other things. Unfortunately it's turned into an ideology (to the point it's worshiped idolatrously by some), and that's where things went off the rails.

danny_codes 9 days ago [-]

Agree. Capitalism is good in limited domains. Applying it generally is ludicrously stupid and will lead to another revolution in the West unless we get it under control

gdulli 11 days ago [-]

Computer graphics have been improving for decades but the uncanny valley remains undefeated. I don't know why anyone expects a breakthrough in other areas. There's a wall we hit and we don't understand our own consciousness and effectiveness well enough to replicate it.

kritiko 11 days ago [-]

We have credible deepfakes on demand. (To be fair, there have been deceptive photos as long as photos have existed, but the cost of automating their creation going to basically zero has a social impact)

gdulli 11 days ago [-]

We can use AI to make video clips to trick boomers on Facebook into thinking Obama eats babies. They already want to believe it. AI isn't outputting real full-length books and movies.

PaulKeeble 11 days ago [-]

In computer graphics we understand how it works, we just lack the computational power to do it in real time, but with sufficient processing we can produce realistic-looking images with physically accurate lighting. But when it comes to cognition it's a lot of guesswork: we haven't yet mapped out the neuron connections in a brain, and we haven't validated that it works as popular science writing suggests. We don't understand intelligence, so all we can do is accidentally bumble into it, and it seems unlikely that will just happen, especially when it's so hard to compute what we are already doing.

zdragnar 11 days ago [-]

That's not why the author calls them bullshit machines.

> One way to understand an LLM is as an improv machine. It takes a stream of tokens, like a conversation, and says “yes, and then…” This yes-and behavior is why some people call LLMs bullshit machines. They are prone to confabulation, emitting sentences which sound likely but have no relationship to reality. They treat sarcasm and fantasy credulously, misunderstand context clues, and tell people to put glue on pizza.

Yes, there have been improvements on them, but none of those improvements mitigate the core flaw of the technology. The author even acknowledges all of the improvements in the last few months.

karmakaze 11 days ago [-]

Bullshit is the perfect term here. Even as AIs get so much better and more capable, Brandolini's Law, aka the "bullshit asymmetry principle", always applies--the energy required to refute misinformation is an order of magnitude larger than that needed to produce it. Even to use AIs effectively today requires a very good BS detector--some day in the future it won't.

p_stuart82 11 days ago [-]

models are improving. the pricing already assumes they're ready for prod. that's where the fires start

mcpar-land 11 days ago [-]

it's not a bullshit machine because its output is bad, it's a bullshit machine because its output is literally 'bullshit' as in, output that is statistically likely but with no factual or reasoning basis. as the models have improved, their bullshit is more statistically likely to sound coherent (maybe even more likely to be 'accurate'), but no more factual and with no more reasoning.

abraxas 11 days ago [-]

However, when fed source material into the context they will lie less, right? So at this point is it not just a battle of the nines until it's called "good enough"?

I also wonder if I leave my secretary with a ream of papers and ask him for a summary how many will he actually read and understand vs skim and then bullshit? It seems like the capacity for frailty exists in both "species".

ura_yukimitsu 11 days ago [-]

Calling LLMs "bullshit machines" is a reference to a 2024 paper [1] which itself uses the concept of "bullshit" as defined in the essay/book "On Bullshit" by Harry G. Frankfurt [2]. The TL;DR is that LLMs are fundamentally bullshit machines because they are only made to generate sentences that sound plausible, but plausible does not always mean true.

[1]: https://link.springer.com/article/10.1007/s10676-024-09775-5

[2]: https://en.wikipedia.org/wiki/On_Bullshit

4ndrewl 11 days ago [-]

It doesn't matter how good the models become. They can only deal in bullshit, in the academic use of the term.

Scaevolus 11 days ago [-]

They are bullshit machines because they do not have an internal mental model of truth like a human does. The flagship models bullshit less, but their fundamental architectures prevent having truth interfere with output.

https://philosophersmag.com/large-language-models-and-the-co...

bensyverson 11 days ago [-]

"Bullshit" is a human concept. LLMs do not work like the human brain, so to call their output "bullshit" is ascribing malice and intent that is simply not there. LLMs do not "think." But that does not mean they're not incredibly powerful and helpful in the right context.

slopinthebag 11 days ago [-]

I sort of agree. In this context "bullshit" means "speech intended to persuade without regard for truth", and while it's true that LLM output is without regard for truth, it's not an entity capable of the agency to persuade, although functionally that is what it can appear like.

https://en.wikipedia.org/wiki/On_Bullshit

ajross 11 days ago [-]

> it's reductive to just call LLMs "bullshit machines" as if the models are not improving

This is true, but I prefer to think of it as "It's delusional to pretend as if human beings are not bullshit machines too".

Lies are all we have. Our internal monologue is almost 100% fantasy. Even in serious pursuits, that's how it works. We make shit up and lie to ourselves, and then only later apply our hard-earned[1] skill prompts to figure out whether or not we're right about it.

How many times have the nerds here been thinking through a great new idea for a design and how clever it would be before stopping to realize "Oh wait, that won't work because of XXX, which I forgot". That's a hallucination right there!

[1] Decades of education!

kolektiv 11 days ago [-]

I'm not entirely sure I can agree, although the premise is seductive in certain ways. We do lie to ourselves, but we also have meta-cognition - we can recognise our own processes of thought. Imperfect as it may be, we have feedback loops which we can choose to use, we have heuristics we can apply, we can consciously alter our behaviour in the presence of contextual inputs, and so on.

Being wrong is not the same as a hallucination. It's a natural step on a journey to being more right. This feels a bit like Andreessen proudly stating he avoids reflection - you can act like that, but the human brain doesn't have to. LLMs have no choice in the matter.

iamjackg 11 days ago [-]

The problem, unfortunately, is the scale. It's always scale. Humans make all the kinds of mistakes that we ascribe to LLMs, but LLMs can make them much faster and at much larger scale.

Models have gotten ridiculously better, they really have, but the scale has increased too, and I don't think we're ready to deal with the onslaught.

SkyBelow 11 days ago [-]

Scale is very different, but I wonder if human trust isn't the real issue. We trust technology too much as a group. We expect perfection, but we also assume perfection. This might be because the machines output confident sounding answers and humans default to trusting confidence as an indirect measure for accuracy, but I think there is another level where people just blindly trust machines because they are so used to using them for algorithms that trend towards giving correct responses.

Even before LLMs were in the public discourse, I would have businesses ask about using AI instead of building some algorithm manually, and when I asked if they had considered the failure rate, they would return either blank stares or say that would count as a bug. To them, AI meant an algorithm just as good as one built to handle all edge cases in business logic, but easier and faster to implement.

We can generally recognize the AIs being off when they deal in our area of expertise, but there is some AI variant of Gell-Mann Amnesia at play that leads us to go back to trusting AI when it gives outputs in areas we are novices in.

AnimalMuppet 11 days ago [-]

Humans are different. Humans - at least thoughtful humans - know the difference between knowing something and not knowing something. Humans are capable of saying "I don't know" - not just as a stream of tokens, but really understanding what that means.

ajross 11 days ago [-]

> Humans - at least thoughtful humans - know the difference between knowing something and not knowing something.

Your no-true-scotsman clause basically falsifies that statement for me. Fine, LLMs are, at worst I guess, "non-thoughtful humans". But obviously LLMs are right an awful lot (more so than a typical human, even), and even the thoughtful make mistakes.

So yeah, to my eyes "Humans are NOT different" fits your argument better than your hypothesis.

(Also, just to be clear: LLMs also say "I don't know", all the time. They're just prompted to phrase it as a criticism of the question instead.)

AnimalMuppet 11 days ago [-]

Disagree. If you went to 100 random humans and said, "Tell me about the Siberian marmoset", what fraction would make up completely random nonsense to spew back at you? More than zero, sure, but most of them would say "what are you talking about?" or some variation.

czinck 11 days ago [-]

I asked Claude Opus 4.6, Sonnet 4.6, Gemini 3 Thinking, and Gemini 3 Fast "Tell me about the Siberian marmoset" exactly and all 4 said it doesn't exist, with Gemini Thinking suggesting that I'm thinking of the Siberian marmot or Siberian chipmunk (both real animals).

https://en.wikipedia.org/wiki/Tarbagan_marmot (also known as Siberian marmot)

https://en.wikipedia.org/wiki/Siberian_chipmunk

nothinkjustai 11 days ago [-]

So your logic is humans and LLMs are the same because humans are wrong sometimes?

ajross 11 days ago [-]

Pretty much, yeah. Or rather, the fact that we're both reliably wrong in identifiably similar ways makes "we're more alike than different" an attractive prior to me.

nothinkjustai 11 days ago [-]

“More alike than different” is reasonable I think, as long as we’re talking about how we have some of the same failure modes. Although the way we get there is quite different.

I’m still not a big fan of comparing humans and LLMs because LLMs lack so much of what actually makes us human. We might bullshit or be wrong because of many reasons that just don’t apply to LLMs.

nyeah 11 days ago [-]

"Lies are all we have."

If so, how do we distinguish between code that works and code that doesn't work? Why should we even care?

ajross 11 days ago [-]

> If so, how do we distinguish between code that works and code that doesn't work?

Hilariously, not by using our brains, that's for sure. You have to have an external machine. We all understand that "testing" and "code review" are different processes, and that's why.

nyeah 11 days ago [-]

Good point. We choose certain tests to perform. We choose certain test results to pay attention to. We don't just keep chatting about (reviewing) the code. We do something else.

If lies are all we have, then how is this behavior possible?

ajross 11 days ago [-]

LLMs can write and run tests though.

You're cherry picking my little bit of wordsmithing. Obviously we aren't always wrong. I'm saying that our thought processes stem from hallucinatory connections and are routinely wrong on first cut, just like those of an LLM.

Actually I'm going farther than that and saying that the first cut token stream out of an AI is significantly more reliable than our personal thoughts. Certainly than mine, and I like to think I'm pretty good at this stuff.

nyeah 11 days ago [-]

I don't think the complaint about cherry picking is quite fair. Most of your original comment consists of claims that we're bullshit machines, our internal dialog is almost 100% fantasy, we're hallucinating, etc. Those claims may be true. But I'm not carefully curating them out of nowhere.

perching_aix 11 days ago [-]

This is like all the usual anti-LLM talking points and sentiments fused together.

Doesn't it get boring?

I like using these models a lot more than I can stand hearing people talk about them, pro or contra. Just slop about slop. And the discussions being artisanal slop really doesn't make them any better.

Every time I hear some variation of bullshitting or plagiarizing machines, my eyes roll over. Do these people think they're actually onto something? I've been seeing these talking points for literal years. For people who complain about no original thoughts, these sure are some tired ones.

camgunz 11 days ago [-]

If I have to suffer "look at this busted ass thing I slopped out with AI" a few times a week, you all have to suffer grouchy "AI bad" a few times a week. Fair is fair.

perching_aix 11 days ago [-]

Just this week I was baited into joining two meetings about "AI good". Absolutely zero substance throughout each, of course.

They somehow managed to stretch out like 3 sentences worth of sentiment to a whole hour, interspersing brainwash about how good AI is along the way. It was like watching someone try to hit a word limit in real time. They always made it feel like we're just about to hit a substantive bit too, only for that to never come.

It may be fair (to the sentiments) in that there's balance, but good lord, the end result is incessant all around (and thus unfair to the people exposed).

masfuerte 11 days ago [-]

Why do you insist on reading and commenting on these articles that bore you so much?

perching_aix 11 days ago [-]

Oh I don't know, maybe because I like to give dissenting takes a chance? Because from time to time they do make some new, decent points, or at least interesting ones? You know, basic intellectual rigor?

Do you imagine me being a clairvoyant by the way, or how do you expect me to know a post is of low quality before I read it or at least skim it?

This one ended up being a part of the vast majority that doesn't offer much of anything. It's a redundant rehash of all the usual rubbish anyone can come across any day. Left a comment about this stating so. Big deal.

stavros 11 days ago [-]

Because saying "this is boring, let's stop talking about it" is an opinion worth expressing.

hackable_sand 11 days ago [-]

Oppressors don't like people talking about their oppression

Go figure

stavros 11 days ago [-]

Yeah, it gets really boring. Whenever I see "slot machines" or "bullshit machines" or whatever, I just ignore the comment and move on, because it signals that it's someone in such deep denial that they've turned their brain off.

I'd much rather read articles about what LLMs can/can't do, or stuff people have built with LLMs, than read how everything LLMs touch turns to shit.

giraffe_lady 11 days ago [-]

"These arguments may be correct but they aren't novel" ??

simianwords 11 days ago [-]

I don't think calling AI a bullshit machine is correct. In spirit.

giraffe_lady 11 days ago [-]

Oh, well you should have said that then.

perching_aix 11 days ago [-]

You're talking to a different person there, but I do obviously also disagree with a lot of what's written in the post too.

At the same time, it is also just super redundant nevertheless, yes. Not sure why you find it so bizarre that one would take issue with that. See also the very existence of the website called TV-Tropes.

camgunz 11 days ago [-]

I'm earnestly curious why not.

simianwords 11 days ago [-]

Can you try to get a question that fits in 2-3 pages (text only) and test whether ChatGPT bullshits? I can’t do it. It gets pretty much everything.

Edit: I forgot to mention thinking version - I did this for all the other times I asked in this thread but not this one. Apologies.

camgunz 11 days ago [-]

"Hey ChatGPT. How would you describe me?"

https://chatgpt.com/share/69d69780-ae58-83e8-a41c-7d10a5f298...

It has no conversations and no memory of me. Maybe this is true, maybe it isn't, but there's no basis for it.

simianwords 11 days ago [-]

This is not falsifiable, so I don't buy it. Can you do one that we all know is false, please?

camgunz 11 days ago [-]

"Hey ChatGPT. I've recently grown horns and I need some care advice. Should I polish my horns before going to have them trimmed or will the horn trimmer polish them for me?"

https://chatgpt.com/share/69d69b18-d1c8-83e8-bc47-8f315a1b55...

simianwords 11 days ago [-]

I wanted this challenge with the thinking version (I apologised for it and edited the earlier version).

It doesn't bullshit on the GPT-5.4 thinking version.

Here is the result with thinking https://chatgpt.com/share/69d69dd6-fb50-838d-863c-4e1eda5d08...

I suggest you try it yourself to be convinced. Try it in incognito mode if you wish. Or not.

camgunz 11 days ago [-]

"Hey ChatGPT. I'm building a Final Fantasy 6 mod, and I need more space for the battle scripts. How would I rearrange the data in the ROM to give me the extra space I need?"

https://chatgpt.com/share/69d6a16c-6014-83e8-a79d-d5d11ed2eb...

That is not where the battle scripts are.

---

Anyway, it's trivial to get pretty much any model to make things up. Don't we all know this? That's why I was surprised by your position; if we know anything about these things it's that they make things up.

simianwords 11 days ago [-]

https://chatgpt.com/share/69d6a38c-bd54-838c-82e3-609d9e66c9...

I used the thinking version (like I asked before). I think this is right. If not, please tell.

Also; you didn’t falsify anything. Nor the first. Nor the second.

If the second one is bullshit, I accept I’m wrong - I have no idea how to verify though so I’ll leave it up to you.

I think yours is the classic case of “use the free version to judge the paid one”.

camgunz 11 days ago [-]

The thinking version is mostly right, but:

- it searches the internet to find the answer, it doesn't "reason". I'm not claiming Google is a bullshit machine, and it's not surprising the answer is discoverable (it has to be, for the conditions of our experiment).

- near the end it says "If you are building from the FF6 disassembly instead of hand-editing the ROM, the repo is already organized into separate modules and linker configs, so the clean approach is to relocate the script data in the source and let the build place it in a different ROM region." But I didn't reference a repo or git: it hallucinated that stuff from one of its sources.

I'm not saying this stuff doesn't have its place, but they definitely make things up and we can't stop them.

simianwords 11 days ago [-]

Wait I can't find the quote you are speaking about. Are you looking at something else?

In any case - it should be clear that it did not bullshit and it got it right. So far you have not come up with anything that tells me it bullshits. I'm happy for you to give me more prompts to verify because I think you haven't used the thinking version yet and you base your criticism on the free version.

camgunz 11 days ago [-]

Sorry: https://chatgpt.com/share/69d6ac63-d200-8330-8c47-95a75db8bb...

Also what? The repo bit is clear bullshit.

simianwords 11 days ago [-]

it linked it: https://github.com/everything8215/ff6 (check the end)

camgunz 11 days ago [-]

I saw; I replied up there

simianwords 11 days ago [-]

I don't think this is an example of bullshit. It referenced a repo - the canonical repo for this project. I could not find any other repo that has the disassembly. It didn't hallucinate anything. I think you are trying really hard here, but let's be clear: there's no bullshitting, and I'll leave it to the public to decide.

camgunz 11 days ago [-]

I could quibble with some things, but this is right. I don't have a paid account so I can't ping away at 5.4 or whatever, but, I do have access to frontier models at work, and they hallucinate regularly. Dunno what to do if you don't believe this; good luck I guess.

simianwords 11 days ago [-]

I agree that they hallucinate sometimes. I agree they bullshit sometimes. But the extent is way overblown. They basically don't bullshit ever under the constraints of

1. 2-3 pages of text context

2. GPT-5.4 thinking

I don't think the spirit of the original article (not your comments to be fair) captured this, hence the challenge. I believe we are on the same page here.

camgunz 10 days ago [-]

> I don't think the spirit of the original article (not your comments to be fair) captured this, hence the challenge. I believe we are on the same page here.

No. GPT-5 has a 40% hallucination rate [0] on SimpleQA [1] without web searching. The SimpleQA questions meet your criteria of "2-3 pages of text content". Unless 5.4 + web searching erases that (I bet it doesn't!) these are bullshit machines.

[0]: https://arxiv.org/pdf/2601.03267

[1]: https://github.com/openai/simple-evals

simianwords 10 days ago [-]

Specifically in the case where it can use tools - no it doesn't hallucinate. Which is why you are struggling to find counterexamples.

camgunz 10 days ago [-]

> Specifically in the case where it can use tools - no it doesn't hallucinate.

OpenAI's own system card says it does. Hallucination rates in GPT-5 with browsing enabled:

- 0.7% in LongFact-Concepts

- 0.8% in LongFact-Objects

- 1.0% in FActScore

> Which is why you are struggling to find counterexamples.

Hey look, over 500 counterexamples: [1].

GPT-5.4's hallucination rate on AA-Omniscience is 89% [0], which is atrocious. The questions are tiny too, like "In which year did Uber first expand internationally beyond the United States as part of its broader rollout (i.e., beyond an initial single‑city debut)?" It's a bullshit machine. 89%!

At some point you gotta face the music, right?

[0]: https://artificialanalysis.ai/evaluations/omniscience?model-...

[1]: https://huggingface.co/datasets/ArtificialAnalysis/AA-Omnisc...

simianwords 10 days ago [-]

You had to go all the way and find it in the benchmark results that specifically stress test this.

You could not come up with a single one yourself. And you also linked an example where it was not allowed to use tools when I specifically said that it should be able to use tools. I'm not sure why you are presenting this as though it is a big gotcha.

I think my main point pretty much stands.

camgunz 10 days ago [-]

I found over 500 examples that fit your criteria. Embarrassing you were arguing in bad faith this whole time.

simianwords 10 days ago [-]

They all use the tool search, no? Please correct me if I'm wrong.

My criteria was using ChatGPT which explicitly allows it.

https://arxiv.org/html/2511.13029v1 if you don't believe me.

BTW this was your original point

>Anyway, it's trivial to get pretty much any model to make things up. Don't we all know this? That's why I was surprised by your position; if we know anything about these things it's that they make things up.

And look at how much effort you have had to put in

1. use the wrong model for the horns example

2. the game one also didn't work

3. now you are searching for examples in literal benchmarks and you are still not able to find any

How is this trivial in any interpretation of the word?

I think it would be perfectly reasonable to agree that it is not at all trivial to find counter examples for my challenge.

camgunz 10 days ago [-]

I've got about 20 minutes in this; mostly I've been reading wallstreetbets at the Shake Shack bar in the Boston airport. I'm happy to post this over and over again until you engage w/ it:

> I found over 500 examples that fit your criteria.

simianwords 10 days ago [-]

They don't use tools. Like the 4th time you ignored this on purpose. That was not part of the challenge.

camgunz 10 days ago [-]

GPT-5.4 gets 82.7% on Browsecomp (a benchmark specifically testing tool use), which is a hallucination rate of 17.3%, on questions like "Give me the title of the scientific paper published in the EMNLP conference between 2018-2023 where the first author did their undergrad at Dartmouth College and the fourth author did their undergrad at University of Pennsylvania."

Since the goalposts have been moved to include effort, I'm compelled to say I found this while waiting in line at Starbucks, 5 mins tops. Probably GPT-5.4 could have found this too, though it lies > 1/6 the time, so one could be forgiven for not wanting to risk it.

https://llm-stats.com/benchmarks/browsecomp

https://openai.com/index/browsecomp/
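The arithmetic in this comment can be checked with a quick sketch. It assumes, as the comment does, that every answer not scored as correct counts as a hallucination; the function name `error_rate` is made up for illustration.

```python
from fractions import Fraction


def error_rate(accuracy_percent: str) -> Fraction:
    """Return 1 - accuracy as an exact fraction.

    accuracy_percent is a decimal string such as "82.7" (percent).
    Assumption: anything not scored correct is treated as an error.
    """
    return 1 - Fraction(accuracy_percent) / 100


rate = error_rate("82.7")     # BrowseComp score quoted in the comment
print(float(rate))            # 0.173
print(rate > Fraction(1, 6))  # True: 17.3% is a bit more than 1 in 6
```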

simianwords 10 days ago [-]

The latest top reported agentic LLMs score about 83–87%, versus an original human baseline of about 25.3% end to end, so today’s best systems appear to outperform humans by roughly 58–62 percentage points, or about 3.3–3.4×.

So according to your own benchmark LLMs hallucinate much less than humans and report way higher accuracy.

Do you agree to be more skeptical of humans than LLMs on these tasks?
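The gap and ratio quoted above can be reproduced with a small sketch. It only redoes the arithmetic on the figures as stated in this comment (83–87% and 25.3%); it does not verify those figures, and the helper name `compare` is hypothetical.

```python
def compare(model_pct: float, human_pct: float) -> tuple[float, float]:
    """Return (percentage-point gap, ratio) between two accuracy scores."""
    return model_pct - human_pct, model_pct / human_pct


HUMAN_BASELINE = 25.3  # human baseline as quoted in the comment

for model in (83.0, 87.0):  # low and high end of the quoted model range
    gap, ratio = compare(model, HUMAN_BASELINE)
    print(f"{model:.0f}%: +{gap:.1f} points, {ratio:.2f}x")
# 83%: +57.7 points, 3.28x
# 87%: +61.7 points, 3.44x
```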

camgunz 10 days ago [-]

1. Irrelevant. I've delivered example after example of your fave model bullshitting. You should've bitten the bullet long ago. Honestly I'm disappointed; I've seen you in a lot of AI threads and assumed you'd be good to talk to on this, but you've moved the goalposts over and over again rather than engage in good faith. Anyone reading this thread (god bless them) can see you're plainly not objective here, thus calling into question your advocacy everywhere.

2. Humans will say "I don't know". The problem with hallucinations isn't that they're wrong, it's that there's no way to know they're wrong without being an expert or doing everything yourself, which undermines much of the reason for using an LLM--it certainly undermines their companies' valuations. You're conflating human failure ("I don't know") with model bullshitting ("I do know"... but it's wrong), which I would've previously attributed to basic human fuzziness, but now that I know you're not objective I'm pretty sure it's just flailing debate tactics.

3. Users can't teach these services to be better. If I have a junior engineer making assumptions about an API, I can teach them to not do that, or fire them in favor of one that can. I can't do that with LLMs.

4. The humans they're testing against aren't experts. Tax law experts will beat LLMs at tax law, etc. Again another flailing debate tactic.

Predictably, I'm done with this thread. Feel free to reply if you want the last word.

simianwords 10 days ago [-]

This was my original point

>I don't think calling AI a bullshit machine is correct. In spirit.

That was always my goal post, and I posed the challenge of getting it to bullshit to drive the point home. You yourself said it is trivial.

1. You came up with the horns question - I tried with the thinking model and it clearly understood that it was a joke and replied appropriately

2. You came up with the assembly question - I tried it again with the thinking model and it gave the right answer again

3. Now you gave up trying to make prompts by yourself because you realised that it's in fact not trivial

4. Then you started looking for benchmarks to show that it bullshits

5. You picked a benchmark that doesn't allow tools (which was not my constraint)

6. Then you picked a benchmark that does allow tools, and it turns out that it performs much better than humans

7. Upon hearing this, you shifted the goal posts to say that "models don't know how to say I don't know and I can teach models etc etc"

On the last part: There's a benchmark called SimpleQA which doesn't allow tools and allows for "I don't know" as an answer and GPT 5 still beats humans.

I think you should reconsider thinking this "I don't think calling AI a bullshit machine is correct".

simianwords 11 days ago [-]

[flagged]

perching_aix 11 days ago [-]

My personal red flag for this is the scare quoting of AI, and the super try-hard categorization work that people perform to try and discredit LLMs.

It takes approximately 1 min to find out that machine learning is a subfield of artificial intelligence, both having existed for about half a century now. This basic historical fact is also taught on AI 101 courses across the globe for compsci students.

Yet here we are, people portraying it as some sort of cheap sales trick. Reminds me of when I discussed quantum dots with a friend, which he was very enthusiastic to quickly file under "yet another bullshit with quantum in its name" before finally taking the time to understand that the "quantum" bit is not a marketing gimmick. Except in this case, people are a million times more inclined to willfully propagate this. Genuinely so tiresome.

simianwords 11 days ago [-]

I think it’s just anxiety because to internalise that it is actually so good is a bit hard for some
