NHacker Next
  • new
  • past
  • show
  • ask
  • show
  • jobs
  • submit
Hcreate(3) (linux.die.net)
pdw 15 days ago [-]
Don't use die.net! The manpages there were last updated 10+ years ago. It's very unfortunate that that site does so well on Google.
alurm 7 days ago [-]
In this case it doesn't matter much I think. I suppose the api existed for ages.

But I prefer openbsd or man7 pages, sure.

nvy 15 days ago [-]
Is there a better man page site to use instead? I abhor reading in the terminal.
layer8 14 days ago [-]
creatonez 14 days ago [-]
Archlinux also now has a manpage viewer

https://man.archlinux.org/

Useful if you need to see the manpage of a very recent version of a package.

o11c 14 days ago [-]
man7.org is "official" (linked from kernel.org)

This page there: https://man7.org/linux/man-pages/man3/hsearch.3.html

nvy 14 days ago [-]
Never knew this about man7, thank you.
self_awareness 14 days ago [-]
What's the difference between reading letters and digits on a webpage, and reading them in a terminal?
nvy 14 days ago [-]
Scroll behavior, for starters.
PhilipRoman 13 days ago [-]
Obviously it is none of my business how people read their manpages, but on the off chance this may be useful to you:

    export LESS='--mouse --wheel-lines=3'
self_awareness 14 days ago [-]
You mean that you can't scroll in the terminal, or?
PhilipRoman 13 days ago [-]
Mankier is nice if you like fancy formatting

https://www.mankier.com/1/qemu

alan-crowe 14 days ago [-]
The documentation pages for FreeBSD include manual pages, for example

https://man.freebsd.org/cgi/man.cgi?query=hcreate&manpath=Fr...

ivanjermakov 15 days ago [-]
The only other one I'm aware of is Ubuntu manuals: https://manpages.ubuntu.com/manpages/bionic/en/
chrsw 14 days ago [-]
Not only that, but I've been having lots of problems with Cloudflare lately. How much of the internet do they provide service for? Good grief...
fanf2 14 days ago [-]
This API is laughably bad, almost completely useless. But it is in POSIX https://pubs.opengroup.org/onlinepubs/9699919799/functions/h... so it is ubiquitous as well as useless.
ropejumper 15 days ago [-]
Does someone know why these are not reentrant without the GNU extension? It seems like a really weird choice to make a library for hash tables and then make the state global like that.
layer8 15 days ago [-]
You can look them up here in System V Interface Definition Issue 2 from 1986: http://www.bitsavers.org/pdf/att/unix/SVID/System_V_Interfac.... It has a note “Future Directions: The restriction on only having one hash search table active at any given time will be removed.” Which of course couldn’t be done without breaking compatibility.

Most programs at the time probably didn’t need more than one hash table, and it made for a simpler interface. (A program could in principle also work around the limitation by storing multiple values per hash table entry.) Most Unix libraries worked with global state at the time, since multi-threading also didn’t exist yet. I remember running into the limitation of only being able to have one lexer/parser grammar (via lex/yacc) per program.

loeg 15 days ago [-]
They're really, really old. Dating to System V Unix. At the time it was pretty common to have stateful libc functions.
kelnos 15 days ago [-]
Back when these functions were created, it was normal and expected for libc to store global state for a variety of things. Reentrancy wasn't really an issue that was considered then.

(Though this particular API has an even weirder limitation: you can only create a single hash table at a time in your entire program!)

kragen 15 days ago [-]
probably they were added in 2.9bsd or something when programs couldn't be more than about 5000 lines of code before you ran out of the 16-bit address space

hmm, no, it doesn't seem to be in https://www.tuhs.org/Archive/Distributions/UCB/2.9BSD/ or even in https://www.tuhs.org/Archive/Distributions/UCB/4.2BSD/ or https://www.tuhs.org/Archive/Distributions/UCB/4.3BSD/. and https://man.freebsd.org/cgi/man.cgi?query=hsearch&apropos=0&... says explicitly:

> The hcreate(), hdestroy(), and hsearch() functions first appeared in AT&T System V UNIX.

which... does seem a bit late for thinking it was a good idea to only support one hash table per process? that was in 01983, 6 years after the vax was introduced and 4 years after the 68000. i mean awk supported arbitrarily many hash tables in a single process in 7th edition unix, even before it had subroutines

Someone 15 days ago [-]
> i mean awk supported arbitrarily many hash tables in a single process

This API supports that, too, as long as you don’t call one of the functions while another call is in progress. You get that for free if your process is single threaded.

https://en.wikipedia.org/wiki/Thread_(computing)#Threading_m... says

“History of threading models in Unix systems

SunOS 4.x implemented light-weight processes or LWPs. NetBSD 2.x+, and DragonFly BSD implement LWPs as kernel threads (1:1 model)”

According to Wikipedia, SunOS 4.0 is from 1988, NetBSD 2.0 from 2004, so I guess chances are good that AT&T System V UNIX didn’t support threads in user space programs.

kragen 15 days ago [-]
you seem to be discussing what does or doesn't support multiple threads, but what we were talking about is what does or doesn't support multiple simultaneously existing hash tables, which hcreate() doesn't. i've edited my comment upthread to clarify

(it's also true, as kimixa points out, that a single-threaded process with signal handlers isn't really single-threaded, but i don't think that's really the important consideration in this context)

for what it's worth, it's not very difficult to use alarm() to invoke a context-switching subroutine to get preemptive user-space multithreading on unix without any kernel support. in theory you have to write the context-switching subroutine in assembly language, which isn't difficult at all; here's one i wrote in arm assembly for the arm eabi, for cooperative multithreading:

            .thumb_func
            .globl einyield
    einyield:
            push {r0, r4-r11, lr}   @ save all callee-saved regs plus r0
            ldr r0, =current_task_pointer
            ldr r1, [r0]
            str sp, [r1]            @ save stack pointer in current eintask
            ldr r1, [r1, #4]        @ load pointer to next eintask
            str r1, [r0]
            ldr sp, [r1]            @ switch to next eintask's stack
            pop {r0, r4-r11, pc}    @ return into einyielded context there
(from http://canonical.org/~kragen/sw/dev3/einkornix.S)

but in practice you can usually just use longjmp()

preemptive multithreading requires you to save all the registers, not just the callee-saved registers, but so does signal handling, so you can usually just use the signal-handling machinery and use the above code (or its equivalent on your vax or 68000 or whatever) to context-switch between signal-handler stack frames

kimixa 15 days ago [-]
Or a signal handler? Even if there's a single kernel thread, if it can be interrupted while inside one of these functions you can hit the same sort of problem.

Reentrancy was an issue on consumer devices well before multithreading was commonly supported there, after all

rdtsc 13 days ago [-]
Good point. There are multiple levels of "safety". I think glibc API does a decent job marking and indicating which is which:

https://sourceware.org/glibc/manual/2.40/html_node/POSIX-Saf...

  Thread-Safe 
  Async-Signal-Safe
  Async-Cancel-Safe (mostly limited to pthread_*cancel*)
There is a host of other "safety" remarks also useful to keep in mind: https://sourceware.org/glibc/manual/2.40/html_node/Other-Saf...
kragen 13 days ago [-]
this isn't about safety, though; hcreate() and hsearch() don't support having two hash tables in the same process at the same time, not even in an unsafe way. that also makes them non-reentrant, but that isn't the main problem!
p_l 15 days ago [-]
Adding threading support is why there was a bunch of interest in using Mach microkernel in Unix systems.
clbrmbr 13 days ago [-]
Don’t key prefixes take care of the need for an arbitrary number of tables?
tedunangst 15 days ago [-]
Probably started as a function in one program. Somebody copied it into a second program. Somebody wanted it in a third program and copied it into a library.
metadat 15 days ago [-]
Is there a command-line interface for this, or is it strictly library-only?

Would be cool if there were a way to point to a memmapped region under /dev/shm or equivalent.

sph 15 days ago [-]
Type `man redis` or `man sqlite3` for memmapped hash tables with a lot of bells and whistles.

I can't see what would you do with a command line hash table CLI. That's basically a poor man's database.

metadat 15 days ago [-]
Redis is way too much, but you've got a really good point with SQLite. Thanks!
14 days ago [-]
Neywiny 15 days ago [-]
I usually turn to C++ when I need such things. Hopefully I remember this exists next time.
nextaccountic 14 days ago [-]
Don't use this, it's worse than useless
self_awareness 14 days ago [-]
I did a quick search on GitHub for today's code that uses it, and actually found out that people do use it today, but generally I agree, this is a legacy API, legacy implementation, and the best thing we can do for it is to forget about it.

The `hcreate(3)` API is interesting only because it has a very limited use, yet it still exists in libc. So it's kind of interesting in a circus way; we want to see it, but then we want to get out of it, and we want to continue our rational lives without clowns and bearded ladies.

There are probably many thirdparty hashtables for C which work much better than this.

13 days ago [-]
Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact
Rendered at 11:04:07 GMT+0000 (Coordinated Universal Time) with Vercel.