Hamming Distance for Hybrid Search in SQLite (notnotp.com)
jonatron 32 minutes ago
You could first calculate the distance over just the first n bits (e.g. 64, a single popcountll) as a first pass, then calculate the full distance only for the candidates that pass a threshold in that first pass. That makes the search approximate, but depending on the application it can be worth it.
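A minimal Python sketch of that two-pass idea, not taken from the linked article. The function name `two_pass_search` and the parameters `prefix_bytes`, `max_prefix_dist`, and `k` are hypothetical, and the cutoff of 16 bits is an arbitrary illustrative value. Signatures are assumed to be equal-length `bytes` objects.

```python
def popcount(x: int) -> int:
    """Count the set bits of a non-negative integer."""
    return bin(x).count("1")


def two_pass_search(query, docs, prefix_bytes=8, max_prefix_dist=16, k=3):
    """Approximate nearest neighbours by Hamming distance.

    Pass 1 compares only the first `prefix_bytes` bytes (64 bits by default)
    and drops documents whose prefix distance exceeds `max_prefix_dist`.
    Pass 2 computes the full distance for the survivors only.
    Returns up to `k` (distance, index) pairs, closest first.
    """
    q_prefix = int.from_bytes(query[:prefix_bytes], "big")
    q_full = int.from_bytes(query, "big")

    # Pass 1: cheap filter on the prefix. This is where the approximation
    # comes from: a true neighbour with an unlucky prefix is lost here.
    survivors = [
        i for i, doc in enumerate(docs)
        if popcount(q_prefix ^ int.from_bytes(doc[:prefix_bytes], "big")) <= max_prefix_dist
    ]

    # Pass 2: full Hamming distance, only for the survivors.
    scored = [(popcount(q_full ^ int.from_bytes(docs[i], "big")), i) for i in survivors]
    scored.sort()
    return scored[:k]
```

A looser `max_prefix_dist` keeps more candidates and recovers more true neighbours at the cost of more full comparisons, so the cutoff would need tuning per dataset.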
mbreese 2 minutes ago
I was thinking of something similar. Instead of just two passes, couldn’t you also store several differently quantized values? If you have thousands of documents, you could narrow them down to a handful with a few bit-wise Hamming comparisons before doing a full cosine similarity on just the rest. If you had more than one bitmap stored, you’d have fewer comparisons at each step too.

Would this work?
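One way to picture the cascade, as a rough Python sketch under stated assumptions: each "bitmap" is a sign-bit quantization of the first n dimensions of the embedding, and each stage keeps a fixed number of candidates. The names `sign_bits`, `cascade_search`, and the `stages` parameter are hypothetical. A real setup would precompute and store the signatures rather than deriving them at query time as this does.

```python
import math


def sign_bits(vec, n_bits):
    """Quantize the first n_bits dimensions to one bit each (1 if positive)."""
    bits = 0
    for v in vec[:n_bits]:
        bits = (bits << 1) | (1 if v > 0 else 0)
    return bits


def hamming(a, b):
    """Hamming distance between two integer bit signatures."""
    return bin(a ^ b).count("1")


def cosine(a, b):
    """Cosine similarity of two float vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def cascade_search(query, docs, stages=((8, 4), (32, 2)), k=1):
    """Coarse-to-fine search.

    `stages` is a sequence of (n_bits, keep) pairs, coarsest first: each
    stage ranks the remaining candidates by Hamming distance on an
    n_bits-wide signature and keeps the best `keep`. Only the final
    survivors get a full cosine similarity. Returns up to k doc indices.
    """
    candidates = list(range(len(docs)))
    for n_bits, keep in stages:
        q_sig = sign_bits(query, n_bits)
        candidates.sort(key=lambda i: hamming(q_sig, sign_bits(docs[i], n_bits)))
        candidates = candidates[:keep]
    ranked = sorted(candidates, key=lambda i: cosine(query, docs[i]), reverse=True)
    return ranked[:k]
```

Each stage only touches what the previous one kept, which is where the "fewer comparisons at each step" would come from. Like the two-pass version, it is approximate: a document dropped by a coarse stage never reaches the cosine step.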

stephenheron 14 minutes ago
I've had good success with https://github.com/sqliteai/sqlite-vector, if you are using SQLite and want something a bit more "off the shelf".