I've found Gemini 2.5 Flash is the best model in terms of speed/cost/quality. Pro is great as well, but probably not necessary for most chat-with-paper functionality.
I'll add too that building an AI layer on top of arXiv is a deep, deep rabbit hole depending on how far you want to take the project. Drop me a note if you want to chat more about my experience with it.
Regardless, thanks for sharing this!
beng-nl 140 days ago [-]
I’m amazed, the interface is pretty complete and slick for a days work - then again I’m not a WebDev so I’d do it a dumb way.. Curious how this was made..
matt1 140 days ago [-]
I think you might be getting the two projects mixed up.
Emergent Mind, my tool, has been in the works for over two years. If that's the interface you're referring to, thank you.
Asxiv, what this post is about, was built in a day by the OP.
I hadn't, but at a glance it seems more of a community that an AI tool?
APNPucky 141 days ago [-]
Very interesting, I still need to test it more, but it seems like it parses only the arxiv PDF data. For getting more accurate equations it might be a good idea to download the original tex source and let it parse that (maybe even both).
EDIT: Another thought: maybe the output could also support markdown/latex like chatgpt.
anonfunction 137 days ago [-]
Thank you, that's probably correct. I think the gemini api might turn the pages into images and use those. Sending the original tex source was something I thought of but not all papers have those submitted.
As for markdown / latex output that could be done, especially for equations! I'll have to look into the best way to render that.
bArray 141 days ago [-]
It's a nice project, but the LLM itself seems to struggle with actually comprehending the subjects. It can point me very well to parts of the paper, but it could not explain parts of the equations based on other knowledge outside of the paper.
anonfunction 137 days ago [-]
I've had better results running it with the Gemini 2.5 Pro model but it's much more expensive. The website is using the very cheap 2.5 flash lite model.
You can run it yourself using the better model to see if you get better results as well:
I've built a similar platform with searching access to arXiv and Semantic Scholar; the only difference is that our agents can highlight text in the paper down to the line level. In our testing, Gemini struggles compared to Sonnet four or Opus 4. We found that without agentic highlighting, there wasn't much difference in output quality or utility (meaning, references saved, generating with citation, or even quote gathering is still hard without actual PDF interactivity). I'd love your feedback on https://www.ubik.studio (use academic search)
anonfunction 137 days ago [-]
Very cool! I chose gemini for two reasons; the first being it supported pdf input and the second the flash lite model being very cheap so I could comfortably make it public and free.
phamtrongthang 141 days ago [-]
Hi. Cool project! But I wonder what is the different between this and alphaxiv.org?
anonfunction 137 days ago [-]
That seems like a social network? This is just a tool to have an AI model answer questions without needing to do anything other than replace the domain.
ks2048 141 days ago [-]
Nice that it will link to specific pages. I wonder if it could be made to highlight specific parts of a page (i.e. highlight the exact thing I am looking for)?
anonfunction 137 days ago [-]
Possibly. I had to kind of be very specific to get the model to link to the pages in a specific format my frontend can parse and then interact with the pdf viewer[1]. It seems very full featured so it may have a way to highlight portions of the page, but my experience with PDFs leads me to believe it would be tricky.
that is cool and all, but don't forget that some researchers were caught putting hidden messages (https://arxiv.org/pdf/2507.06185) instructing LLMs to praise the paper.
it would be good if you made some sort of protection against these techniques. I think feeding images of pages instead of the page code itself would be beneficial.
anonfunction 137 days ago [-]
Wow I had not known of that! This is mostly just a quick tool I wanted but something to think about if anything further were to come from it.
sreenathmenon 140 days ago [-]
Cool
148 days ago [-]
tentacle256 141 days ago [-]
[dead]
Rendered at 02:28:56 GMT+0000 (Coordinated Universal Time) with Vercel.
My site, https://www.emergentmind.com, is similar, though I'm two years in :)
I've found Gemini 2.5 Flash is the best model in terms of speed/cost/quality. Pro is great as well, but probably not necessary for most chat-with-paper functionality.
I'll add too that building an AI layer on top of arXiv is a deep, deep rabbit hole depending on how far you want to take the project. Drop me a note if you want to chat more about my experience with it.
Regardless, thanks for sharing this!
Emergent Mind, my tool, has been in the works for over two years. If that's the interface you're referring to, thank you.
Asxiv, what this post is about, was built in a day by the OP.
EDIT: Another thought: maybe the output could also support markdown/latex like chatgpt.
As for markdown / latex output that could be done, especially for equations! I'll have to look into the best way to render that.
You can run it yourself using the better model to see if you get better results as well:
https://github.com/montanaflynn/asxiv
1. https://mozilla.github.io/pdf.js/
https://asxiv.org/pdf/1706.03762
it would be good if you made some sort of protection against these techniques. I think feeding images of pages instead of the page code itself would be beneficial.