This looks AI-generated and very misleading... it definitely decreases my trust in the linked library, which is unfortunate given that the overall approach seems novel and interesting. The intro starts off by saying SDB is better than rbspy because it doesn't have errors related to data races. But then the body of the article says "Data races may occur if the Ruby VM updates the stack while SDB is reading it", and claims that's fine because similar issues occur in other profilers. That sort of trivial contradiction (along with the vague language, overly verbose / repetitive intro and summary, and rando citations (an MIT course lecture??)) feels like the hallmark of a predictive language model with no actual understanding of the code it's explaining.
IainIreland 4 days ago [-]
This doesn't read as AI-generated to me at all.
The prose isn't polished enough to be AI. AI generation is unlikely to produce missing spaces like "...which are not readable to humans.SDB uses eBPF ...", or grammatical inaccuracies like "Ensuring Fully Correctness".
As for the data race thing, it seems to me that there's a pretty clear distinction between rbspy's approach (as described in reference 1) and this blog post. rbspy is walking the native stack, which occasionally fails. SDB seems to be looking at Ruby's internals instead, and has some sort of generation-number design to identify cases where there was a data race.
Beyond that, this post just absolutely sounds like what somebody would write if they were trying to describe in prose why they think their multi-threaded code is correct, especially the "Scanning Stacks without the GVL" section.
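For readers unfamiliar with the generation-number idea mentioned above, here is a minimal Rust sketch of a seqlock-style check. It is not SDB's code: `VersionedSample`, its fields, and the single-writer assumption are all hypothetical, chosen only to show how a reader can notice that a write overlapped its read and throw the sample away.

```rust
use std::sync::atomic::{fence, AtomicU64, Ordering};

/// Hypothetical stand-in for a piece of VM state that a profiler samples.
/// `generation` is odd while the (single) writer is in the middle of an update.
pub struct VersionedSample {
    generation: AtomicU64,
    depth: AtomicU64,
    top_frame: AtomicU64,
}

impl VersionedSample {
    pub fn new(depth: u64, top_frame: u64) -> Self {
        Self {
            generation: AtomicU64::new(0),
            depth: AtomicU64::new(depth),
            top_frame: AtomicU64::new(top_frame),
        }
    }

    /// Writer side. Assumes exactly one writer thread.
    pub fn write(&self, depth: u64, top_frame: u64) {
        let g = self.generation.load(Ordering::Relaxed);
        // Mark "update in progress" (odd generation) before touching the data.
        self.generation.store(g.wrapping_add(1), Ordering::Relaxed);
        fence(Ordering::Release);
        self.depth.store(depth, Ordering::Relaxed);
        self.top_frame.store(top_frame, Ordering::Relaxed);
        // Publish the new data and return to an even generation.
        self.generation.store(g.wrapping_add(2), Ordering::Release);
    }

    /// Reader side. Returns `None` when the two fields may belong to
    /// different updates, so the caller can drop or retry the sample.
    pub fn try_read(&self) -> Option<(u64, u64)> {
        let before = self.generation.load(Ordering::Acquire);
        if before % 2 == 1 {
            return None; // a write is in progress
        }
        let depth = self.depth.load(Ordering::Relaxed);
        let top_frame = self.top_frame.load(Ordering::Relaxed);
        fence(Ordering::Acquire);
        let after = self.generation.load(Ordering::Relaxed);
        if before == after {
            Some((depth, top_frame))
        } else {
            None // the writer ran while we were reading
        }
    }
}
```

The reader never blocks the writer; it only detects interference after the fact, which is the property a sampling profiler would want if it reads VM state without holding the GVL.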
yfractal 2 days ago [-]
I am the author of this article. Sorry for the misleading article.
First, I admit that I didn't describe the problem clearly, so let me explain.
By "data race" I mean reading incorrect data, not an invalid-address error.
As for the citations, there are two: one is the Ruby memory model (the third reference), and the other is an MIT course. I cited the course because it assumes that aligned 64-bit memory reads are atomic, and I couldn't find another source for that. If you read the MIT course reference, you will see it is about RCU, and RCU is valid only when 64-bit memory operations are atomic.
Yes, a compiler may sometimes compile a 64-bit memory access into two instructions (Rust's, for example), but that is not the case for Ruby, so it should be fine not to consider this.
nightpool 2 days ago [-]
Thanks for the reply! I'm sorry for coming off as too harsh.
> Yes, a compiler may sometimes compile a 64-bit memory access into two instructions (Rust's, for example), but that is not the case for Ruby, so it should be fine not to consider this.
This sounds like it probably depends on the compiler or toolchain used. So Ruby compiled with LLVM might have issues with this approach, while Ruby compiled with GCC might not. It would also be interesting to see whether YJIT affects this, since it has hand-tuned assembly for 64-bit memory access.
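To make the atomicity question in this exchange concrete, here is a small Rust sketch. It says nothing about how Ruby or SDB actually access memory; `count_torn_reads` and the two bit patterns are made up for illustration. It shows the one case where the language itself guarantees a whole-word read regardless of compiler: an explicit `AtomicU64` load, which is only available on targets with 64-bit atomics. A plain, non-atomic read that races with a write has no such guarantee in Rust (it is undefined behavior), which is the kind of toolchain dependence being discussed.

```rust
use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

// Two patterns whose halves differ, so a torn read would produce
// 0x0 or 0xFFFF_FFFF_FFFF_FFFF instead of either pattern.
const LOW_HALF: u64 = 0x0000_0000_FFFF_FFFF;
const HIGH_HALF: u64 = 0xFFFF_FFFF_0000_0000;

/// Samples a shared 64-bit slot `samples` times while another thread keeps
/// flipping it between the two patterns, and counts values that are neither.
pub fn count_torn_reads(samples: usize) -> usize {
    let slot = Arc::new(AtomicU64::new(LOW_HALF));
    let stop = Arc::new(AtomicBool::new(false));

    let writer = {
        let (slot, stop) = (Arc::clone(&slot), Arc::clone(&stop));
        thread::spawn(move || {
            while !stop.load(Ordering::Relaxed) {
                slot.store(HIGH_HALF, Ordering::Relaxed);
                slot.store(LOW_HALF, Ordering::Relaxed);
            }
        })
    };

    // Relaxed is enough here: we only need each load to be indivisible,
    // not ordered with respect to any other memory.
    let torn = (0..samples)
        .filter(|_| {
            let v = slot.load(Ordering::Relaxed);
            v != LOW_HALF && v != HIGH_HALF
        })
        .count();

    stop.store(true, Ordering::Relaxed);
    writer.join().unwrap();
    torn
}
```

With atomic loads and stores the count is always zero. An external profiler reading another process's memory has no way to make the target's stores atomic, so it has to rely on what the target's compiler and CPU happen to do for aligned 64-bit accesses.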
meisel 4 days ago [-]
This title should have “How” prepended to it
yfractal 2 days ago [-]
It actually has this; you can check the link.
1123581321 4 days ago [-]
HN automatically removes those.
In the intro, the issue with rbspy is that it reads invalid addresses, see this https://github.com/rbspy/rbspy/blob/8d501946f75335154c493473....