This is really cool! It looks interesting for making errors in complex type systems easier to debug, but the quadratic complexity in the title sounds a bit worrying for production compiler use, and imo the benchmarks don't really mean much without a baseline from a traditional unification implementation to compare against.
If this system only provides benefits in the type-error path of the compiler, I wonder if a traditional single-pass unification could be used for speed on the common path where code compiles without type errors; when unification fails, this slower multi-pass approach could then be run on demand to give better error reporting. This would lazily avoid the cost of the approach in most cases, and the cases where it would be used are less latency-critical anyway.
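To make the idea concrete, here's a minimal sketch of that fallback strategy in Python. All names (`unify_fast`, `infer`, `explain_with_multipass`) are invented for illustration, and the multi-pass algorithm itself is stubbed out:

```python
def unify_fast(a, b, subst):
    """Cheap single-pass structural unification; returns an updated
    substitution, or None on mismatch (no diagnostics on this path).
    Type variables are strings starting with a tick, e.g. "'1"."""
    a, b = subst.get(a, a), subst.get(b, b)
    if a == b:
        return subst
    if isinstance(a, str) and a.startswith("'"):
        return {**subst, a: b}
    if isinstance(b, str) and b.startswith("'"):
        return {**subst, b: a}
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        for x, y in zip(a, b):
            subst = unify_fast(x, y, subst)
            if subst is None:
                return None
        return subst
    return None

def explain_with_multipass(constraints):
    # Placeholder for the article's slower multi-pass algorithm;
    # only the error path ever pays its quadratic cost.
    return {"error": "type mismatch", "constraints": constraints}

def infer(constraints):
    subst = {}
    for a, b in constraints:
        subst = unify_fast(a, b, subst)
        if subst is None:
            # Fast path failed: re-run for better error reporting.
            return explain_with_multipass(constraints)
    return subst
```

For example, `infer([(("'2", "'2"), ("string", "string"))])` solves on the fast path, while `infer([("int", "string")])` falls through to the diagnostic path.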
Also, I think there is a typo in one of the code blocks: '2 should be unified into (string, string), not just string, afaict.
simvux 2 days ago [-]
Working around the performance concerns by only running this algorithm after a faster algorithm found an error is an interesting idea. I think it could work, as things that produce errors would, in a real compiler, be marked as "poisoned" and ignored in further analysis and IR lowering. Thus the two algorithms disagreeing on a type wouldn't cause a noticeable difference in the end result.
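A toy illustration of what "poisoned" means here (not from the article's implementation; names are made up): once a node fails to type-check it gets a sentinel type that unifies with anything, so downstream checks involving it succeed quietly instead of cascading the same error.

```python
# Sentinel type assigned to nodes that already produced an error.
POISONED = object()

def unify(a, b):
    # A poisoned type unifies with anything, silently: the error was
    # already reported, so further analyses should not re-trip on it.
    if a is POISONED or b is POISONED:
        return True
    return a == b

# The failing node gets POISONED; later checks that touch it pass,
# while genuinely fresh mismatches still get reported.
assert unify(POISONED, "int")
assert not unify("int", "string")
```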
Comparative benchmarks are tricky. I considered making a simpler single-pass inference branch based around the same data structures to create a more one-to-one comparison, but this algorithm is rather different, so porting it wasn't very straightforward. I'm currently integrating this into my real compiler, so from there it'll be easier to estimate the real-world performance impact of this system.
Typo has been fixed, thanks!
edmundgoodman 2 days ago [-]
Couldn’t the comparison (and also the fast path I guess) just be putting all the inference passes into a single big pass, which then avoids the quadratic number of re-applications of passes? Looks like unification is otherwise the same?