NHacker Next

Rendered at 14:10:40 GMT+0000 (Coordinated Universal Time) with Vercel.

Either a theoretical definition of C should be used (where integers can be arbitrarily large), or a practical definition of Turing-completeness should be used (where some implementation of a finite computational device is possible), such that it's the same theoretical or practical consideration on both sides of the comparison.

I suppose it depends on whether you consider those "part of the language". Which seems like a pretty arbitrary question, but then, we're making a pretty arbitrary point. We already know that real machines always have some finite limitation on memory and thus cannot be Turing machines.

Well, maybe the definition can be used for some Hilbert-hotel kind of stuff, but I would any day prefer a definition that distinguishes some software constructs as Turing-complete and others as not, with C definitely in the "yes" camp.

addressable storage -- objects have addresses, and those addresses must be a finite length (the length of a pointer), which is implementation-dependent but must be finite so that sizeof(pointer) returns a valid result. Even if you could have an infinite amount of RAM, no standard C implementation could address it.

The accepted answer tries to go further with register variables (which officially don't have addresses) and recursion (whose depth is not limited by the standard), but founders on the limitation that you can't make an array out of register variables. Functions can only have a finite number of arguments and return a finite-sized structure.

Another answer from a couple years later tries to make a non-addressable array using va_list and va_copy(). I don't know enough about the quirks of varargs to tell whether this would work, although nobody seems to have an unanswered objection so far.

I think in part it's because storage limits are platform-specific and context-dependent. Sure, your laptop has a hard drive with limited storage capacity, but how much free space do you have? Running out of disk space isn't an algorithmic limit, it's a limit of the environment, which is constantly changing.

So this is a way of splitting concerns. When we're analyzing algorithms, storage capacity is someone else's problem. It's not part of the function's API. (Though handling running out of storage gracefully might be part of the API, depending on the language.)

The idea is that if I know algorithm A is O(N) and algorithm B is O(N) whereas algorithm C is O(log N), then I know it's stupid to use algorithms A and B together to avoid C, because the complexity of C is actually lower.

In practice there's a big problem: big-O isn't good enough for real-world engineering decisions. So we need to measure, and if we're measuring, why bother defining this in the STL?

There's also a small problem, this forbids, at least notionally, implementations from choosing better options which strictly don't meet the documented criteria. But that problem counts as small because in practice we'll just pretend not to notice and leave the resulting bug reports from pedants in the fix-never pile.

Do you have any examples of “better” algorithms that the STL is not able to use? Another idea in STL is to provide many algorithms, which is why there are 3 sorts.

Where do you see three sorts with different behaviour? I see two, an unstable sort just named "sort" and a stable sort.

sort performs O(N·log(N)) comparisons and so does the stable sort, although to be fair stable_sort also promises to work (more slowly, O(N·log²(N)) instead) if there's no scratch space.

We don't get to ask for the no-scratch variant, so we can't count that separately.

And these big-O values are totally uninteresting because they're typical for every modern sort; when people announce a shiny new general-purpose sort it has the same complexity numbers. Does your STL have it? Who knows, the standard doesn't specify.

And that gets back to my earlier point: Introsort, the typical sort you'd find in a C++ program today, is relatively modern, and C++ was standardised in 1998. So actually quicksort was often in a "complying" C++ stdlib because hey, that's good right? Well, no, quicksort's worst case is O(n²), so it's actually forbidden by the STL's documentation, but nevertheless it was shipped because for many real-world uses it's fast.

stable_sort, sort, and sort_heap

The creator Alex Stepanov also included insertion_sort (used in sort) but it was rejected by the standards committee.

That suggests the idea that complexity guarantees exist primarily for comparison is wrong, because then why would anyone pick insertion_sort? Of course there are real-world cases where it is the right choice. But if you need the complexity guarantee, then it's not.

> shiny new general purpose sort

I don't want it in the standard until it's proven useful and has some longevity, not just a micro benchmark. Shiny and new are exactly the wrong adjectives.

Why can't you include a header?

> quicksort

Introsort is simply a variant of quick sort with heap sort as a fallback to avoid the worst case, and insertion sort at the lowest level.

Anyone who tried to pass off naive quicksort simply wasn't educated about the topic, so it's good that they were not standard-compliant.

I'm torn about whether this constitutes a sort though. Like, yeah, in some sense this is the meat of a heap sort, but, it's the caller's job to provide the data as a heap first and of course this is C++ so if we screw that up it's Undefined Behaviour.

> I don't want it in the standard until it's proven useful and has some longevity

I wasn't advocating for such things to be in the standard but to be in the implementation. One of the worst sins of WG21 has been standardising implementations, as they did with std::unordered_map and continued to do in newer C++ versions.

And you're wrong, these people knew exactly what they were doing. You've probably used a stdlib which did this (such as libc++); the reason was very simple: despite the big-O complexity, in practice quicksort was fast while heap sort, though compliant, is usually slower. Benchmarks matter. Big-O not so much.

Like I said, Introsort was new, and as you've explained you want to wait until things have "some longevity" before standardising them, so the stdlibs didn't take Introsort at first. LLVM took Introsort for libc++ in 2021, Microsoft landed a compliant sort some years earlier. Neither complied with C++ 98 initially.

I mean the big joke for LLVM is that the bug ticket about their sort complexity, which sat in a pile unresolved until after Joe Biden became president, was by Orson Peters. It's like realising the guy who pointed out that your physics textbook has the formula for mass-energy equivalence wrong is Albert Einstein. Gee, I wonder if Orson has any idea for how to fix this libc++ bug...

They feel like a 1980s feature; if there was a hash table in C's stdlib, this is the hash table you'd expect. But it's not in C and it wasn't from the 1980s, it's from the same year Osama Bin Laden was killed by US special forces.

Rust existed at the same time this data structure was standardised. Not Rust 1.0, but that's insane; it's like when you realise millions of older Spanish people were alive before their country was a democracy. Many are old enough to remember the prior regime.

The C++ 11 standard was not required to standardize broken garbage from many years ago. WG21 pulled C++ 0x Concepts (a much more powerful feature than the "Concepts Lite" which landed in C++ 20) because they argued it might take too long to implement, ridiculous as that claim is, so there's no reason to keep something terrible just because it's in a draft paper.

It's hard to even believe this was trying to be general. Notice that not only the hash tables themselves, but the infrastructure is garbage too. In 1998, the C++ STL was one of the earliest generic programming APIs; nobody knows how you should implement the generality of something like hashing, fine.

But by 2011 people had tried several things, and this (the std::hash templated callable which returns a machine-word integer) is not even a reasonable attempt at a known good design.

Abstract concepts become more useful, not less, as they become more general. Even elementary ideas like counting and arithmetic don't include practical limits. My favorite example is the concept of periodicity, which is usually defined something like x(t) = x(t + T) for all t. This implies that all periodic signals are eternal, which is never the case in the real world! But we talk about "periodic" signals all the time because the concept is still useful.

If you demand that you can always computably do arithmetic on size_t's, allow storing arbitrary arithmetic or even logical expressions in your bignum integers, and call those bignums as well. Define "less than" on infinite integers based on "alphabetical" order. Then the only thing that is non-computable is (in)equality between two expressions for which non-finite-ness is unprovable under first-order logic. Given Gödel's Completeness (note, not Incompleteness) Theorem, that should probably meet the definition of a C implementation, though I haven't read the standard.

This doesn’t seem like a helpful argument.

If, say, my laptop, with its 16GiB of RAM (meaning finite number of possible states), was actually Turing complete, that would have profound implications for mathematics. That would mean that there was an upper bound for the memory required for computation. But that's obviously not the case, because my laptop isn't Turing complete.

Does this affect your daily life as a programmer? Well, no. But does the distinction between "Turing machine" and "Turing-machine-like system but with finite memory" matter? In some contexts, yes. There are people whose profession it is to think about computation with more rigour than us programmers.

(FWIW, in most languages, while implementations aren't Turing complete, the language spec is. C is the odd one out by requiring implementations to impose an upper bound on the amount of memory which can be addressed.)

It's pretty easy to see that any finite machine isn't Turing complete, because you just ask whether it can compute a function that doesn't fit in its memory. So, for your laptop: define a function that's true on some number larger than would fit in 16GiB (fiddling the definitions as necessary depending on exactly how you define input/output etc.)

As wikipedia says:

> No physical system can have infinite memory, but if the limitation of finite memory is ignored, most programming languages are otherwise Turing-complete.

The convention is to ignore the infinite case when talking about real systems, because a) most things we want answers to are not large enough for this to make a difference and b) otherwise no real system is Turing-complete, and that's not a helpful definition.

For example, pointers could theoretically use a variable-length scheme (kinda like UTF-8), as long as the underlying hardware supported it, or supported being able to pass only the next n bits of the pointer in something like a chained syscall.

Of course that isn’t in the specification, but the transformation seems theoretically implementable without needing e.g. infinite length pointers for every access.

In contrast, there is no way to coerce a DFA into becoming a Turing Machine (that is theoretically implementable on the DFA).

So the proof is not necessarily wrong, but it might not be the right kind of proof to be useful.

It's just that Turing machines are a useful model for some actual computers.

The limit of "Must run infinitely" will basically wreck any computer that will fit in this universe, won't it?

EDIT: Thanks for correction to rule 110

https://ogiekako.vercel.app/blog/find_mkdir_tc

https://news.ycombinator.com/item?id=41115941

They also updated their proof (not saying whether it's valid or not, just updated):

https://news.ycombinator.com/item?id=41127041

https://ogiekako.vercel.app/blog/find_mkdir_tc

On a given implementation of C, a pointer has some specific size in bits (let's call it S). This means that we can address 2^S bytes of memory. Each byte has CHAR_BIT bits, so it can be in 2^CHAR_BIT states, meaning memory as a whole can be in (2^CHAR_BIT)^(2^S) = 2^(CHAR_BIT·2^S) possible states. Since there's a finite number of states, it's not a Turing machine.

> fpos_t is a complete object type other than an array type capable of recording all the information needed to specify uniquely every position within a file.

[7.23.1p3 of N3054 working draft of C2x, as that's what I have open right now.]

Your position in a file has to be uniquely specifiable.

`socket()` has entered the conversation.

There isn't really, at least not in principle. You could use a FILE* as a tape and restrict yourself to relative fseeks. But this discussion doesn't concern itself with IO; obviously anything can in principle be made into a Turing machine by strapping enough IO to it.

In the context of IRL things, the question of Turing completeness is normally not "if it can be used to simulate any Turing machine" (Wikipedia) but "if it can be used to simulate any Turing machine bound to some realistic size limit".

For a “useful” proof, I think you have to do a bit of reading between the lines of separating the “essence” from the specification even if the language you end up with is not exactly as specified but can e.g. be implemented in the language given the right primitives. I don’t think a low-level language specification can be Turing complete.

Maybe a LISP would be, or pretty much anything that doesn't specify pointer sizes (as in this post).

A useful perspective to contrast with this idea that it's not a "real" Turing machine if you don't have infinite tape (and thus, no such machines can ever exist) is the Busy Beaver (perpetual favourite of Hacker News; just use the search at the bottom to find such topics). Some very small beavers are already completely beyond our ability to reason about, because they express unsolved problems of mathematics.

The aforementioned comment from TFA:

> C99's addition of va_copy to the variadic argument API may give us a back door to Turing-completeness. Since it becomes possible to iterate through a variadic argument list more than once in a function other than the one that originally received the arguments, va_args can be used to implement a pointerless pointer. Of course, a real implementation of the variadic argument API is probably going to have a pointer somewhere, but in our abstract machine it can be implemented using magic instead.

[1] Whether that copy is meaningfully usable is a different matter, which is why va_copy exists.

One could imagine a memory API with infinite storage, perhaps some sort of null-terminated addresses with no defined upper bound, but it'd be different than that.

You'd need to store it, and our universe is finite, so there isn't enough room.

Much worse: the universe, despite being finite, is already so enormous and growing that you cannot cross it, so it would be impossible to actually perform computation, as even if the data exists it can't necessarily ever be moved from where it is to the site of computation; it may get further away instead despite travelling as fast as possible.

Fit on a normal machine C99 runs on.

The second you say "most computations" you stopped talking about Turing Completeness.

Wait until you find out all memory lookups are O(n^0.5) too (assuming a holographic universe)

Only physical "Turing complete" systems. JavaScript, as specified by ECMA, is properly Turing complete; you can express a program which will allocate an arbitrary amount of objects. In the "JavaScript abstract machine", there is no upper limit.

In C, there is necessarily a finite upper limit to the amount of objects allocated, since every object needs a unique address and addresses are represented by finite-bit pointers. That means that the "C abstract machine" as defined by ISO can not even in principle allocate arbitrarily many objects.

(And yes, this all means that any Turing-machine-like systems built in the real world aren't proper Turing machines, since strictly speaking, a Turing machine is a theoretical construct with an infinitely long tape.)

However, in most languages (JavaScript, Python, Java, ...) there's no concept of an object's location in memory as expressed by a finite number of bits. In those languages, you can express programs which will create an ever-increasing number of objects. Trying to run that program will eventually cause a crash as you run out of space, but the source code itself encodes a program which would allocate objects with no upper bound.

That's not the case for C. In C, every object has a unique address. That address is encoded in a fixed number of bits. As such, you can't even write source code which allocates objects with no upper bound; the upper bound will always be 2 to the power of the number of bits in a pointer.

You have made an arbitrary causal connection between the idea of Rule 110 and its implementation in C++. It's arbitrary, and minds much greater than yours can enjoy and understand this distinction, as do Stephen and I. See pages 770ff.

You can compile Rule 110 into a Mathematica notebook, or you can output it in x86 assembler. Since they are the same, i.e. they produce the exact same outputs, and can determine computability

"Turing completeness is a term in computer science that describes the ability of a system to compute any possible calculation or program, and can be used to describe modern programming languages (Python, C++, etc.). Turing complete describes a programmable system that can solve any computational problem."

Now, how about a little bit of reading?

https://philarchive.org/archive/CASOTC-3

"Virtually all programming languages today are Turing-complete."

https://en.wikipedia.org/wiki/Turing_completeness

It is a finite number and it's a big number, but still finite (and proven).

https://en.wikipedia.org/wiki/Bekenstein_bound

Also the playlist Understanding the Holographic Universe from PBS Space Time https://www.youtube.com/watch?v=qPKj0YnKANw&list=PLsPUh22kYm... and Entropy Explained! https://www.youtube.com/watch?v=nhy4Z_32kQo&list=PLsPUh22kYm...

But the gist is that Turing completeness, even limited by finite universes, is still Turing completeness in the most interesting interpretation.

A Turing machine that you implement in C as "the tape is a single array in address space" would be, but that's not what is required for Turing completeness.

The entire argument about pointer size is facetious: programs have operated over data larger than their address space forever, and it is not hard to manage.

But let’s just go nuts: you can make a Turing machine in C that implements the tape as a stream api that wraps all the cloud storage providers and just pauses until more drives are installed whenever necessary, and you have just as much of a Turing machine as any physical Turing machine could possibly be.

If the real argument is "to be Turing complete means that a Turing machine must be able to have infinite state", then the argument is only technically true, but that argument applies equally to every Turing complete system.

> But let's just go nuts: you can make a Turing machine in C that implements the tape as a stream api that wraps all the cloud storage providers and just pauses until more drives are installed whenever necessary, and you have just as much of a Turing machine as any physical Turing machine could possibly be.

It isn't stupid. The post is not about "real world" implementations. The question was about C99's abstract semantics. As the first answer points out:

> (Of course you could make the program store the tape content externally, through file input/output functions. But then you wouldn't be asking whether C is Turing-complete, but whether C plus an infinite storage system is Turing-complete, to which the answer is a boring “yes”. You might as well define the storage to be a Turing oracle — call fopen("oracle", "r+"), fwrite the initial tape content to it and fread back the final tape content.)

But file input/output functions are in the language, so there is our infinite storage that does not depend on limits to pointer size.