I looked into this because part of our pipeline is forced to be chunked. Most advice I've seen boils down to "more contiguity = better", but without numbers, or at least none that generalize.
My concrete tasks already reach peak performance before 128 kB, and I couldn't find pure processing workloads that benefit significantly beyond a 1 MB chunk size. Code is linked in the post; it would be nice to see results on more systems.
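For anyone who wants to try this on their own machine without digging through the linked repo, here's a minimal sketch of the kind of measurement I mean (this is not the linked code; the workload, a trivial checksum over NumPy byte buffers, is a stand-in assumption for "pure processing"):

```python
# Sketch: time a simple streaming transform at several chunk sizes
# to find where throughput plateaus.
import time
import numpy as np

def process(buf: np.ndarray) -> int:
    # Trivial per-byte work; return a value so the loop can't be
    # optimized away entirely.
    return int(buf.sum()) & 0xFF

def bench(total_mb: int = 64) -> dict:
    # One big random buffer, processed in chunks of varying size.
    data = np.random.default_rng(0).integers(
        0, 256, total_mb * 2**20, dtype=np.uint8
    )
    results = {}
    for chunk_kb in (4, 64, 128, 1024, 4096):
        chunk = chunk_kb * 1024
        acc = 0
        start = time.perf_counter()
        for off in range(0, len(data), chunk):
            acc ^= process(data[off:off + chunk])
        elapsed = time.perf_counter() - start
        results[chunk_kb] = total_mb / elapsed  # throughput in MB/s
    return results

if __name__ == "__main__":
    for kb, mbps in bench().items():
        print(f"{kb:>5} kB: {mbps:8.1f} MB/s")
```

On my hardware the curve flattens well before the largest sizes, but that's exactly the kind of thing that varies per system, hence the request for more results.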
twoodfin 1 hour ago [-]
Your results match similar analyses of database systems I’ve seen.
64KB-128KB seems like the sweet spot.