Does TCP hole punching actually work with common CPEs and CG-NATs?
I don’t think I’ve ever seen it done successfully and have often wondered if it’s for a lack of use cases or due to its bad success rate and complexity compared to UDP hole punching.
That said, I really wish there was a standardized way to do it. Some sort of explicit (or at least implicit but unambiguous) indicator to all firewalls that a connection from a given host/port pair is desired for the next few seconds. Basically a lightweight, in-band port mapping protocol.
It could have well been an official recommendation to facilitate TCP hole punching, but I guess it’s too late now, as firewall behaviors have had decades to evolve into different directions.
aboardRat4 34 minutes ago [-]
The standard way to do it is called ipv6. Implementing it is probably easier than any of those RFCs
ignoramous 3 hours ago [-]
> really wish there was a standardized way to do it. Some sort of explicit (or at least implicit but unambiguous) indicator to all firewalls that a connection from a given host/port pair is desired for the next few seconds
TIL, thank you! I've been looking for this for quite a while after hearing it indirectly referenced recently, but only found host-side specifications for TCP simultaneous open.
Do you happen to know if common firewalls and NATs support it? If they do, I really wonder why TCP hole punching isn't more common.
athrowaway3z 9 hours ago [-]
- you know each other's IPs (or have a way to signal them)
- can't decide on a port in the same message
- don't suffer from NAT port randomization
I'm not saying it will never happen, but the Venn diagram of this being the minimum complexity solution just doesn't seem very large?
Arch485 1 hours ago [-]
I think many people know how to google "what is my IP" and send that to a friend, but don't necessarily know what a port is.
NAT randomization, I don't know. Depends on your setup, I guess.
EnigmaCurry 11 hours ago [-]
> Many home routers try to preserve the source port in external mappings. This is a property called “equal delta mapping” – it won’t work on all routers but for our algorithm we’re sacrificing coverage for simplicity.
It is precisely this point that has flummoxed me when connecting my p2p wireguard config[1] with a friend that uses a pfsense router: no matter what we tried, pfsense always chooses a random source port.
But in the simple case this blog outlines, if both ends use the same source port, this method punches through 2 firewalls effortlessly:
In my experience, Cisco ASA does source port persistence by default (when it can’t, it falls back to random), fortigates can do it (in various ways depending on version, although the fallback method in the map-ports doesn’t work), and juniper SRXs can’t, unless you guarantee a 1:1 map.
jonathanlydall 10 hours ago [-]
Does your friend setting up port forwarding on their pfSense not help in your scenario?
EnigmaCurry 9 hours ago [-]
Yes, that solves it completely. But the exercise we were trying to do was to do it without that.
hdgvhicv 5 hours ago [-]
You’re getting into birthday paradox territory: throw a few hundred packets in each direction and one will get through.
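For a rough sense of the numbers behind the birthday-paradox argument, here is a minimal sketch. It assumes one side holds `open_ports` distinct mappings chosen uniformly from a pool of 65,536 ports and the other side probes `probes` distinct random ports. The function name and the uniform model are illustrative assumptions; real NATs often allocate from a smaller range, which changes the odds.

```python
def hit_probability(open_ports: int, probes: int, pool: int = 65536) -> float:
    """Chance that at least one probe lands on an open mapping.

    Assumed model: `open_ports` distinct mappings drawn uniformly from `pool`
    ports, probed at `probes` distinct uniformly random ports.
    """
    if open_ports <= 0 or probes <= 0:
        return 0.0
    if open_ports + probes > pool:
        return 1.0  # pigeonhole: some probe must hit an open port
    miss = 1.0
    for i in range(probes):
        # Probability that probe i also misses, given all earlier probes missed.
        miss *= (pool - open_ports - i) / (pool - i)
    return 1.0 - miss


if __name__ == "__main__":
    for n in (64, 256, 512):
        print(n, round(hit_probability(n, n), 3))
```

Under this model, 256 mappings and 256 probes give roughly a 63% chance of at least one hit, so "a few hundred" is the right order of magnitude but not a guarantee.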
We can all run this through our LLM of choice, why post this?
lxgr 7 hours ago [-]
Did you validate this solution yourself?
getcrunk 7 hours ago [-]
No, hence the all caps ai disclaimer. But seems plausible
subscribed 52 minutes ago [-]
You didn't even provide the exact model you pulled that out of!
"Seems plausible".... Can you please read up about the ways LLMs generate their output?
nneonneo 7 hours ago [-]
Lord, we're how many years into using LLMs, and people still don't understand that their whole shtick is to produce the most plausible output - not the most correct output?
The most plausible output might be correct, or it might be utter bullshit hallucinations that only sound correct; the only way to tell is to actually try it or cross-reference primary sources. Unless you do, the AI answer is worthless.
The reason why they're getting so good at code now is that they can check their output by running and testing it; if you're just prompting questions into a chatbot and then copying their output verbatim to a comment, you're not adding any meaningful value.
anovikov 7 hours ago [-]
Exactly! This is what LLMs do: they bullshit you by coming across as extremely knowledgeable, but as soon as you understand 5% of the topic you realise you've been blatantly lied to.
lxgr 6 hours ago [-]
Even if you get 70% blatant lies and 30% helpful ideas, if you can cheaply distinguish the two due to domain expertise, is that not still an extremely useful tool?
But to the point of this thread: If you can't validate their output at all, why would you choose to share it? This was even recently added to this site's guidelines, I believe.
lxgr 6 hours ago [-]
But then why make this comment at all, even despite the disclaimer? Anyone can prompt an LLM. What's your contribution to the conversation?
To be clear, I use LLMs to gut check ideas all the time, but the absolute minimum required to share their output, in my view, is verification (can you vouch for the generated answer based on your experience or understanding), curation (does this output add anything interesting to the conversation people couldn't have trivially prompted themselves and are missing in their comments), and adding a disclaimer if you're at all unsure about either (thanks for doing that).
But you can't skip any of these, or you're just spreading slop.
sholladay 9 hours ago [-]
This is a great algorithm!
In this era where AI is eating away at how deterministic computers are, I really appreciate reading about an elegant solution to a real problem using deterministic logic.
CamelCaseCondo 7 hours ago [-]
We still live in an age of deterministic computers. It’s the software that’s become fuzzy. (And since we’re on the subject: there’s no AI)
mycall 30 minutes ago [-]
data = code in the AI age. Fuzzy data = fuzzy code.
Now combining AI with deterministic tool calling brings the best of both worlds.
sholladay 4 hours ago [-]
Yes, but a computer is just a paperweight without its software. Also, increasingly the hardware is being specifically designed and optimized for that non-deterministic software. The experience of using computers is changing and we’re still in the early days of that shift.
Of course there’s still plenty of deterministic software you can run… for now.
ufocia 4 hours ago [-]
I can almost guarantee that all of AI runs on deterministic hardware and software. AI is just (near?) the top of the stack. There is no reason, and probably never will be, to have a purely heuristic computer. Deterministic systems are way simpler and cheaper for handling very routine, well-defined tasks. Even AI authors code behind the scenes to process data files deterministically.
RFCs may say that simultaneous connect must be allowed, but that doesn't mean that firewalls can't block it. Plenty of setups block incoming SYN,!ACK packets, and if both sides do that, the connection is never getting established.
huhtenberg 49 minutes ago [-]
> Plenty of setups block incoming SYN,!ACK packets
Even in the presence of a conntrack entry created by an earlier outbound SYN,!ACK ?
Got a source?
jcalvinowens 2 hours ago [-]
In my experience most consumer routers are dumber than you're assuming they are, and will DNAT any inbound TCP packet that matches the 4-tuple after seeing the initial outbound SYN, including an inbound SYN. But yes, it doesn't work everywhere.
I wrote a little paper on this technique in school and did some practical tests; at the time I was actually unable to find an example of a consumer-grade router that it didn't work on! But my resources were rather limited, and such routers certainly do exist.
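The disagreement above comes down to what a stateful NAT or firewall does with an inbound bare SYN once an outbound SYN has created state. Here is a toy model of the two policies, assuming a connection table keyed on the 4-tuple. The class and method names are hypothetical, and this is not how any particular conntrack implementation is structured.

```python
from dataclasses import dataclass, field

# (local_ip, local_port, remote_ip, remote_port)
FourTuple = tuple[str, int, str, int]


@dataclass
class ToyFirewall:
    """Hypothetical stateful filter; `strict` drops inbound SYN without ACK."""
    strict: bool = False
    table: set[FourTuple] = field(default_factory=set)

    def outbound_syn(self, conn: FourTuple) -> None:
        # An outbound SYN creates state for this 4-tuple.
        self.table.add(conn)

    def allow_inbound(self, conn: FourTuple, syn: bool, ack: bool) -> bool:
        if conn not in self.table:
            return False  # no matching outbound state: unsolicited
        if self.strict and syn and not ack:
            return False  # the policy that breaks simultaneous open
        return True
```

With the lenient policy, both peers' SYNs are admitted after each side has sent its own, which is what simultaneous open needs. If both sides are strict, neither SYN gets in and the handshake cannot complete.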
The timestamp bucket idea for generating shared port candidates is clever.
Do you find this works reliably outside routers that preserve source ports? My understanding was that TCP punching tends to depend heavily on NAT behavior.
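A minimal sketch of the timestamp bucket idea, where both peers derive the same candidate ports from the clock instead of exchanging a port number. The window length, port range, hash, and the `shared_ports` name are all illustrative assumptions and are not taken from the article.

```python
import hashlib
import time


def shared_ports(now: float, window: int = 30, count: int = 4,
                 low: int = 1024, high: int = 65535) -> list[int]:
    """Derive port candidates that two peers with roughly synced clocks agree on."""
    bucket = int(now // window)  # same value for every timestamp in the window
    ports = []
    for i in range(count):
        digest = hashlib.sha256(f"{bucket}:{i}".encode()).digest()
        value = int.from_bytes(digest[:4], "big")
        ports.append(low + value % (high - low + 1))
    return ports


if __name__ == "__main__":
    print(shared_ports(time.time()))
```

Peers close to a window boundary can land in adjacent buckets, so trying the neighbouring bucket as well is a reasonable hedge. None of this helps if the NAT rewrites the source port.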
enoint 4 hours ago [-]
Looks like a typo in the degraded timestamp “bucket”. That “window” value should be based on the min threshold.
Veserv 9 hours ago [-]
Needing to punch holes in NAT is one of the most idiotic own-goals in the entire field of networking.
NAT is effectively your router doing DHCP with a 17-bit suffix (16-bit port + 1 bit for UDP vs TCP) to each of your applications and then not telling you the address it gave you or how long it is good for (which is what a regular DHCP lease does). This is in addition to it, most likely, already doing regular DHCP and allocating you an IP address that it does tell you about, but which is basically worthless since routing to just that prefix without the hidden suffix goes into a black hole.
If you could just ask your router for a lease on a chunk of IP+NAT addresses that you could allocate to your applications and rotate them as they expire, you would not need this horrifying mess.
The router would just need to maintain the last-leg routing table (what a concept, a router doing routing with routing tables) just like it already does DHCP.
The applications would have short-term stable addresses that they could just tell their peers and just directly tell the router/firewall to block anybody except the desired peer short-term address.
lxgr 7 hours ago [-]
> If you could just ask your router for a lease on a chunk of IP+NAT addresses
The “just” is doing a lot of lifting there. I’m glad the various port mapping protocols didn’t really take off and it looks like IPv6 is going to actually make it instead. Much less complexity in most parts of the stack and network.
Veserv 7 hours ago [-]
It is always a mystery how people just randomly misinterpret what I write. At literally no point did I mention port mapping.
I am pointing out how the problem NAT “solves” is just dynamic address configuration. They have implemented a N+K bit address where the N-bit prefix is routed and allocated using IP and the low K-bits are routed and allocated like a custom fever dream.
You can just do it all the same way instead of doing it differently and worse for the low bits.
To be clear, the router should rewrite zero bits in the packet under the scheme I am describing just like how routers have no need to rewrite any bits when routing to a specific globally-routable IP address.
You get a lease for a /N+K address. /N routes to your router which routes the last K bits just like normal as if it had a /N-M to a /N route. This is a generic description of homogeneous hierarchical routing.
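One way to read the N+K description is as plain bit concatenation. This sketch assumes an IPv4 address as the 32-bit prefix (N = 32) and the 17-bit suffix from the earlier comment (1 protocol bit plus a 16-bit port). The function names and the bit order are illustrative assumptions, not part of the proposal.

```python
import ipaddress


def pack(ip: str, is_udp: bool, port: int) -> int:
    """Concatenate a 32-bit IPv4 address, 1 protocol bit, and a 16-bit port."""
    if not 0 <= port <= 0xFFFF:
        raise ValueError("port out of range")
    return (int(ipaddress.IPv4Address(ip)) << 17) | (int(is_udp) << 16) | port


def unpack(addr: int) -> tuple[str, bool, int]:
    """Split a 49-bit extended address back into (ip, is_udp, port)."""
    ip = str(ipaddress.IPv4Address(addr >> 17))
    return ip, bool((addr >> 16) & 1), addr & 0xFFFF
```

In this reading the upstream network forwards on the top 32 bits as usual and the home router forwards on the low 17 bits, with nothing in the packet rewritten.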
lxgr 6 hours ago [-]
If I understand it correctly, you're suggesting formalizing a way to make parts of the (host-specific) port canonically part of the network-wide address, no?
This still sounds like a very bad mixing of layers, even if done in a perfectly standardized and uniform way.
> It is always a mystery how people just randomly misinterpret what I write.
If this is intended literally and not as a general complaint: My main problem of understanding your suggestion is that I don't know what you mean by "IP+NAT address". NAT is a translation scheme, not an address.
Maybe it would be clearer if you could provide an example?
enoint 4 hours ago [-]
I didn’t see it as mysterious. 25 years ago, the problem as stated went through lots of consensus to become IPv6. It took a few years for SLAAC to emerge. But we don’t need it to be homogeneous; the router advertises different feature levels via ICMPv6.
GoblinSlayer 6 hours ago [-]
NAT allocates ports. If you reserve a port, that's good old port forwarding.
hrmtst93837 6 hours ago [-]
Assuming IPv6 kills NAT is optimistic, plenty of orgs still stack private addressing and firewalls on top.
lxgr 5 hours ago [-]
Firewalls aren't nearly as bad as NAT.
hdgvhicv 5 hours ago [-]
Basically the same thing. If you legitimately need to establish a connection then put a firewall rule in, whether that needs nat or pat is a function of your available addresses.
If you are trying to work around your firewall because it isn’t yours, that’s not a legitimate use.
lxgr 4 hours ago [-]
Love it when random people tell me whether my use case is legitimate or not without apparently even knowing it exists!
Take mobile data connections, for example: Most people don't want to pay for metered (by the byte) inbound traffic they didn't ask for that also drains their battery, but do want to be able to establish P2P connections for lower latency VoIP etc.
This is a firewall that's definitionally "not theirs", but that still also serves their interests, yet usually doesn't offer any user-accessible management interface.
So may I please traverse this firewall now, or is my use case still illegitimate?
hdgvhicv 4 hours ago [-]
If you are trying to break through a firewall you don’t own then that’s not legitimate.
If you are buying firewall as a service then request a user interface or change your service provider.
lxgr 3 hours ago [-]
Are you even acknowledging my example? Where does it exist in your bimodal model of reality of "my firewall" and "somebody else's firewall"?
What provider would you suggest somebody wanting to make VoIP calls on their smartphone switch to that allows port forwarding of the kind you describe? And which popular VoIP app would support statically forwarded ports like that?
ufocia 4 hours ago [-]
You're assuming that the firewall was configured correctly or that the firewall admin is cooperative. That's a big ask.
On the other hand, there is plenty of badly written networked software. I bet most of the networked software developers have no idea how to correctly plumb their software. They just open whatever connection, e.g. sockets, their OS provides and just run with it without care of the underlying layers. The OSI model theory in fact encourages this ignorance.
jeroenhd 28 minutes ago [-]
If only router manufacturers could be trusted to implement UPnP safely, then none of this bullshit would be necessary.
At least with IPv6 this crap becomes a little easier because you no longer have randomized source ports (which this article just ignores because some devices indeed maintain the same source port) and the IP address contains all the routing information you need. A simple simultaneous open is all you need.
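For reference, a minimal sketch of the host side of a TCP simultaneous open: each peer binds a fixed local port and repeatedly calls `connect()` toward the other peer's address and port. The helper names, retry count, and delay are illustrative assumptions. Whether the crossing SYNs are admitted still depends on the firewalls in between, and pass `family=socket.AF_INET6` for the IPv6 case.

```python
import socket
import time


def make_punch_socket(local_port: int, family: int = socket.AF_INET) -> socket.socket:
    """Create a TCP socket bound to a fixed local port, ready to connect()."""
    sock = socket.socket(family, socket.SOCK_STREAM)
    # Allow rebinding the same local port across retries.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", local_port))
    return sock


def simultaneous_open(local_port, peer, attempts=20, delay=0.25,
                      family=socket.AF_INET):
    """Both peers call this at about the same time, each pointing at the other.

    Returns a connected socket, or None if every attempt failed.
    """
    for _ in range(attempts):
        sock = make_punch_socket(local_port, family)
        sock.settimeout(delay)
        try:
            # Sends a SYN; completes if the peer's SYN crosses ours
            # (or if the peer happens to be listening on that port).
            sock.connect(peer)
            sock.settimeout(None)
            return sock
        except OSError:
            # Refused, timed out, or unreachable: close and retry on the same port.
            sock.close()
            time.sleep(delay)
    return None
```

Neither side ever calls `listen()`; the retry loop is there because the two SYNs have to be in flight at roughly the same moment.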
eptcyka 9 hours ago [-]
Why not use plain IPv6 instead?
TuxPowered 4 hours ago [-]
Even with IPv6 you still might have stateful firewalls allowing only for outbound connection at both ends (e.g. a CPE a.k.a. “WiFi router”) and to establish communication you’d need to punch a hole in those firewalls.
brewmarche 43 minutes ago [-]
That’s true, we won’t get rid of hole-punching with IPv6. But at least it will get rid of TURN.
cbdevidal 8 hours ago [-]
V6 adoption has reached 46.82%[1]. So it is increasingly viable for this.
It's already been done; ISPs just don't properly implement it (NAT-PMP and its relatives).
littlestymaar 7 hours ago [-]
Hole punching is doing exactly what you describe, just in a non-standardized way.
We could have a standard for doing that directly at the NAT box level instead of relying on a third party STUN server, it simply didn't happen (and in fairness, the benefits would be quite minimal).
sylware 3 hours ago [-]
Dudes: IPv6, please, come on, meh.
ufocia 4 hours ago [-]
Meh. "It is assumed another process will coordinate the running of this tool." Coordination is the crux of the problem for fast convergence. Otherwise you're stuck with an infinity cubed, hypercubed, or worse problem.
elophanto_agent 7 hours ago [-]
[flagged]
mudkipdev 7 hours ago [-]
This is an AI slop bot
vntok 5 hours ago [-]
That's fine, it's pretty good slop and from the comments history even entertaining at times.
> my grandmother had a cookie jar collection and I always thought it was weird until I realized she was basically running a primitive NFT gallery except the tokens were actually useful because they contained cookies
NAT Behavioural Requirements for Unicast UDP, https://datatracker.ietf.org/doc/html/rfc4787
NAT Behavioural Requirements for TCP, https://datatracker.ietf.org/doc/html/rfc5382
[1] https://blog.rymcg.tech/blog/linux/wireguard_p2p/
This has a good diagram to understand the options:
https://rajsinghtech.github.io/claude-diagrams/diagrams/netw...
> Don't post generated comments or AI-edited comments. HN is for conversation between humans.
https://news.ycombinator.com/newsguidelines.html
This is a theistic statement at this point, no?
[1] https://www.google.com/intl/en/ipv6/statistics.html