Emacs internals: Tagged pointers vs. C++ std::variant and LLVM (Part 3) (thecloudlet.github.io)
tialaramex 2 hours ago [-]
It's not clear to me what the correct way to spell this kind of trick is in C++ (and since it's an unsafe language, your compiler won't call it out if you do something illegal).

I had thought you need the pointer-sized integer types and mustn't do this directly to an actual pointer, but maybe I was wrong. (That's in theory; obviously practice doesn't follow, but relying on that is a dangerous game.)

thecloudlet 2 hours ago [-]
Doing bitwise operations directly on raw pointers is a fast track to Undefined Behavior in standard C/C++. Emacs gets away with it largely due to its age, its heavy reliance on specific GCC behaviors/extensions, and how its build system configures compiler optimizations.

In modern C++, the technically "correct" and safe way to spell this trick is exactly as you suggested: using uintptr_t (or intptr_t).

trws 1 hour ago [-]
There’s a paper in flight to add a stdlib type that handles pointer tagging while preserving pointer provenance and so forth. It’s currently best to use the intptr types, but the goal is to let an implementation provide specializations based on which bits of a pointer are insignificant, or even ignored, on a given target, without user code having to be specialized. I'm not sure where it has landed since the discussion in SG1, but it seemed like a good idea.
tialaramex 36 minutes ago [-]
Given you aren't sure since SG1 this might be useless, but... do you have a paper number? Or, more likely, know an author's name?
trws 4 minutes ago [-]
It’s Hana Dusikova’s paper IIRC.
legobmw99 8 minutes ago [-]
Seems like it's P3125R0.
shadowgovt 1 hour ago [-]
Is there a similar solution to doing this in Rust? I suppose inside `unsafe` you can do basically anything.
trws 2 minutes ago [-]
Everything else in the sibling comments is true, but remember that the language and std types in Rust all do this already. Most of the time it’s better to use a native enum or Option/Result, because they do this in the compiler/lib. It’s only really worth doing by hand if you need more than a few types, or need precise control of the representation for C interop or something.
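
A minimal sketch of the compiler doing this for you, assuming only the null-pointer case that the standard library documents for `Option<Box<T>>`, `Option<&T>`, and `Option<NonNull<T>>`. This is narrower than Emacs-style tagging: `None` is encoded as the otherwise invalid null value, not in spare alignment bits. The helper name `option_is_pointer_sized` is hypothetical.

```rust
use std::mem::size_of;
use std::ptr::NonNull;

/// Hypothetical helper: returns true when wrapping these pointer-like
/// types in `Option` costs no extra space, because `None` is stored
/// as the null value that a valid Box, reference, or NonNull never holds.
fn option_is_pointer_sized() -> bool {
    let ptr_size = size_of::<*const u64>();
    size_of::<Option<Box<u64>>>() == ptr_size
        && size_of::<Option<&u64>>() == ptr_size
        && size_of::<Option<NonNull<u64>>>() == ptr_size
}
```

So an `Option<Box<T>>` field is the same size as a bare pointer, with no hand-written bit twiddling.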
tialaramex 46 minutes ago [-]
Unlike C++, all of Rust's primitive types get the same first-class treatment as your user-defined types, so the appropriate API is provided as methods on the pointer types. For this you want ptr::map_addr, which takes a callable (such as your own function for this mapping, or a closure) to fiddle with the pointer's address.

https://doc.rust-lang.org/std/primitive.pointer.html#method....

Rust's Miri is able to run code which uses this (a strict provenance API) because, although Miri's pointers are some mysterious internal type, it can track that we mapped them to hide our tags, then later mapped back from the tagged pointer to recover our "real" pointer, and see that's fine.

This isn't an unsafe operation. Dereferencing a pointer is unsafe, but twiddling the bits is fine. It just means whoever writes the unsafe dereferencing part of your codebase needs to be very careful about these pointers, e.g. making sure the ones you've smuggled a tag into aren't dereferenced, because that's Undefined Behaviour.
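
A rough sketch of that split, assuming Rust 1.84 or later so that `addr` and `map_addr` are stable. The names `TAG_MASK`, `tag_ptr`, `tag_of`, `untag`, and `read_tagged` are made up for illustration; only `addr` and `map_addr` are std methods. Tagging and untagging are safe functions, and only the final dereference needs `unsafe`.

```rust
use std::mem::align_of;

// Hypothetical constant: the low bits that are always zero in an aligned
// `*const u32` (two bits where u32 is 4-byte aligned).
const TAG_MASK: usize = align_of::<u32>() - 1;

/// Hide a small tag in the low bits of an aligned pointer.
/// Safe: nothing is dereferenced, and provenance is carried along.
fn tag_ptr(p: *const u32, tag: usize) -> *const u32 {
    assert!(tag <= TAG_MASK, "tag does not fit in the spare bits");
    assert_eq!(p.addr() & TAG_MASK, 0, "pointer is not aligned");
    p.map_addr(|a| a | tag)
}

/// Read the tag back out of the low bits.
fn tag_of(p: *const u32) -> usize {
    p.addr() & TAG_MASK
}

/// Clear the tag bits to recover the original pointer.
fn untag(p: *const u32) -> *const u32 {
    p.map_addr(|a| a & !TAG_MASK)
}

/// Read the pointee and the tag of a tagged pointer.
///
/// # Safety
/// `p` must have been produced by `tag_ptr` from a pointer that is
/// still valid for reads.
unsafe fn read_tagged(p: *const u32) -> (u32, usize) {
    let tag = tag_of(p);
    // SAFETY: the tag bits are stripped before the dereference, and the
    // caller guarantees the untagged pointer is valid for reads.
    let value = unsafe { *untag(p) };
    (value, tag)
}
```

Dereferencing the result of `tag_ptr` directly, without going through `untag`, is the mistake the unsafe code has to guard against.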

It's clear to me how this works in Rust; it's just still unclear in C++.

simonask 51 minutes ago [-]
Rust is basically in the same place as C++, i.e. provenance rules are currently ad-hoc/conventional, meaning that pointer tagging is a grey area.
tialaramex 44 minutes ago [-]
Nope. Rust stabilized strict provenance over a year ago. Some details about aliasing aren't tied down, but so long as you can obey the strict provenance rules you're golden today in Rust to hide flags in pointers etc.

https://blog.rust-lang.org/2025/01/09/Rust-1.84.0/#strict-pr...

thecloudlet 1 hour ago [-]
Waiting for Rust experts.
db48x 2 hours ago [-]
Do it the way LLVM does it.
thecloudlet 4 hours ago [-]
Emacs internals part 2 HN link:

https://news.ycombinator.com/item?id=47259961
