Show HN: Spegel, a Terminal Browser That Uses LLMs to Rewrite Webpages (simedw.com)

426 points by simedw 221 days ago | 180 comments

qsort 221 days ago [-]

This is actually very cool. Not really replacing a browser, but it could enable an alternative way of browsing the web with a combination of deterministic search and prompts. It would probably work even better as a command line tool.

A natural next step could be doing things with multiple "tabs" at once, e.g: tab 1 contains news outlet A's coverage of a story, tab 2 has outlet B's coverage, tab 3 has Wikipedia; summarize and provide references. I guess the problem at that point is whether the underlying model can support this type of workflow, which doesn't really seem to be the case even with SOTA models.

hliyan 220 days ago [-]

For me, a natural next step would be to turn this into a service -- rather than doing it in the browser, this acts as a proxy, strips away all the crud and serves your browser clean text. No need to install a new browser, just point the browser to the URL via the service.

But if we do it, we have to admit something hilarious: we will soon be using AI to convert text provided by the website creator into elaborate web experiences, which end users will strip away before consuming it in a form very close to what the creator wrote down in the first place (this is already happening with beautifully worded emails that start with "I hope this email finds you well").

npmipg 219 days ago [-]

working on this as we speak!

TeMPOraL 221 days ago [-]

> tab 1 contains news outlet A's coverage of a story, tab 2 has outlet B's coverage, tab 3 has Wikipedia; summarize and provide references.

I think this is basically what https://ground.news/ does.

(I'm not affiliated with them; just saw them in the sponsorship section of a Kurzgesagt video the other day and figured they're doing the thing you described +/- UI differences.)

doctoboggan 221 days ago [-]

I am a Ground News subscriber (joined with a Kurzgesagt ref link) and it does work that way (minus the Wikipedia summary). It's pretty good and I particularly like their "blindspot" section showing news that is generally missing from a specific partisan news bubble.

simedw 221 days ago [-]

Thank you.

I was thinking of showing multiple tabs/views at the same time, but only from the same source.

Maybe we could have one tab with the original content optimised for cli viewing, and another tab just doing fact checking (can ground it with google search or brave). Would be a fun experiment.

myfonj 221 days ago [-]

Interestingly, the original idea of what we call a "browser" nowadays – the "user agent" – was built on the premise that each user has specific needs and preferences. The user agent was designed to act on their behalf, negotiating data transfers and resolving conflicts between content author and user (content consumer) preferences according to "strengths" and various reconciliation mechanisms.

(The fact that browsers nowadays are usually expected to represent something "pixel-perfect" to everyone with similar devices is utterly against the original intention.)

Yet the original idea was (due to the state of technical possibilities) primarily about design and interactivity. The fact that we now have tools to extend this concept to core language and content processing is… huge.

It seems we're approaching the moment when our individual personal agent, when asked about a new page, will tell us:

    Well, there's nothing new of interest for you, frankly:
    All information presented there was present on pages visited recently.
    -- or --
    You've already learned everything mentioned there. (*)
    Here's a brief summary: …
    (Do you want to dig deeper, see the content verbatim, or anything else?)

Because its "browsing history" will also contain a notion of what we "know" from chats or what we had previously marked as "known".

idiotsecant 221 days ago [-]

I can definitely see a future in which we each have our own personal memetic firewall, keeping us safe and cozy in our personal little worldview bubbles.

aspenmayer 221 days ago [-]

Some people think the sunglasses in They Live let you see through the propaganda, others think that the sunglasses themselves are just a different kind of psyop.

So, you gonna “put on those sunglasses, or start chewing on that trashcan?” It’s a distinction without a difference!

https://www.youtube.com/watch?v=1Rr4mQiwxpA

bee_rider 221 days ago [-]

It would have to have a pretty good model of my brain to help me make these decisions. Just as a random example, it will have to understand that an equation is a sort of thing that I’m likely to look up even if I understand the meaning of it, just to double check and get the particulars right. That’s an obvious example, I think there must be other examples that are less obvious.

Or that I’m looking up a data point that I already actually know, just because I want to provide a citation.

But, it could be interesting.

dotancohen 220 days ago [-]

  > Or that I’m looking up a data point that I already actually know, just because I want to provide a citation.

Or what we know has changed.

When I was a child we knew that the North Star consisted of five suns. Now we know that it is only three suns, and through them we can see another two background stars that are not gravitationally bound to the three suns of the Polaris system.

Maybe in my grandchildren's lifetimes we'll know something else about the system.

myfonj 221 days ago [-]

Well, we should first establish some sort of contract for how to convey "I feel that I actually understand this particular piece of information, so when confronted with it in the future, you can mark it as such". My lines of thought were more about a tutorial page that would present the same techniques as a course you finished a week prior, or a news page reporting on an event you just read about on a different news site a minute before … stuff like this … so you would potentially save the time spent skimming/reading/understanding only to realise there was no added value for you in that particular moment. Or, while scrolling through a comment section, hide comment parts repeating the same remark or joke.

Or (and this is actually doable absolutely without any "AI" at all):

    What the bloody hell actually newly appeared on this particular URL since my last visit?

(There is one page nearby that would be quite unusable for me, had I not a crude userscript aid for this particular purpose. But I can imagine having a digest about "What's new here?" / "Noteworthy responses?" would be way better.)
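The "what newly appeared since my last visit" idea can be sketched without any model at all. The following is a minimal illustration, assuming the page text from the previous visit has been saved somewhere; `new_lines_since` and the sample snapshots are hypothetical names for this sketch, not part of any existing tool.

```python
import difflib


def new_lines_since(previous: str, current: str) -> list[str]:
    """Return lines present in `current` but not in `previous`, in page order."""
    diff = difflib.ndiff(previous.splitlines(), current.splitlines())
    # ndiff marks added lines with "+ "; removed lines and "? " hint lines are skipped.
    return [line[2:] for line in diff if line.startswith("+ ")]


# Hypothetical snapshots of a comment page taken on two visits.
old_snapshot = "alice: first comment\nbob: a reply"
new_snapshot = "alice: first comment\nbob: a reply\ncarol: a new reply"

print(new_lines_since(old_snapshot, new_snapshot))  # ['carol: a new reply']
```

A line-based diff is crude (an edited comment shows up as new), but it is deterministic and cheap, which is the point being made above.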

For the "I need to cite this source", naturally, you would want the "verbatim" view without any amendments anyway. Also probably before sharing / directing someone to the resource, looking at the "true form" would be still pretty necessary.

ffsm8 221 days ago [-]

> Well, there's nothing new of interest for you, frankly

For this to work like a user would want, the model would have to be sentient.

But you could try to get there with current models, it'd just be very untrustworthy to the point of being pointless beyond a novelty

myfonj 221 days ago [-]

Not any more "sentient" than existing LLMs even in the limited chat context span are already.

Naturally, »nothing new of interest for you« here is indeed just a proxy for »does not involve any significant concept that you haven't previously expressed knowledge about« (or however to put it), which seems pretty doable, provided that the contract of "expressing knowledge about something" had been made beforehand.

Let's say that all pages you have ever bookmarked you have really grokked (yes, a stretch, no "read it later" here) - then your personal model would be able to (again, figuratively) "make qualified guess" about your knowledge. Or some kind of tag that you could add to any browsing history entry, or fragment, indicating "I understand this". Or set the agent up to quiz you when leaving a page (that would be brutal). Or … I think you got the gist now.

nextaccountic 221 days ago [-]

In your cleanup step, after cleaning obvious junk, I think you should do whatever Firefox's reader mode does to further clean up, and if that fails bail out to the current output. That should reduce the number of tokens you send to the LLM even more

You should also have some way for the LLM to indicate there is no useful output because perhaps the page is supposed to be a SPA. This would force you to execute Javascript to render that particular page though

simedw 221 days ago [-]

Just had a look and there is quite a lot going into Firefox's reader mode.

https://github.com/mozilla/readability

dotancohen 220 days ago [-]

For the vast majority of pages you'd actually want to read, isProbablyReaderable() will quickly return a fair bool guess whether the page can be parsed or not.
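The "try reader mode first, otherwise fall back to the current output" flow suggested in this subthread could look roughly like the sketch below. It is a loose standard-library stand-in, not Readability's actual algorithm: `looks_readable` only counts text inside `<p>` tags, the 500-character threshold is an arbitrary assumption, and `reader_extract` and `basic_clean` are hypothetical callables supplied by the caller.

```python
from html.parser import HTMLParser


class ParagraphText(HTMLParser):
    """Collect the text found inside <p> elements and ignore everything else."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "p" and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth > 0:
            self.chunks.append(data)


def looks_readable(html: str, min_chars: int = 500) -> bool:
    """Crude guess: does the page carry enough paragraph text to be an article?"""
    parser = ParagraphText()
    parser.feed(html)
    parser.close()
    text = " ".join(" ".join(parser.chunks).split())
    return len(text) >= min_chars


def clean_for_llm(html: str, reader_extract, basic_clean) -> str:
    """Prefer the reader-style extraction; bail out to the basic cleanup otherwise."""
    if looks_readable(html):
        try:
            extracted = reader_extract(html)
            if extracted:
                return extracted
        except Exception:
            pass  # any extractor failure falls through to the basic cleanup
    return basic_clean(html)
```

A page that fails the check and also yields almost no text after basic cleanup is a hint that it may be a JavaScript-rendered SPA, which is the other case raised above.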

phatskat 221 days ago [-]

> I was thinking of showing multiple tabs/views at the same time, but only from the same source.

I think the primary reason I use multiple tabs but _especially_ multiple splits is to show content from various sources. Obviously this is different than a terminal context, as I usually have figma or api docs in one split and the dev server on the other.

Still, being able to have textual content from multiple sources visible or quickly accessible would probably be helpful for a number of users

wrsh07 221 days ago [-]

Would really love to see more functionality built into this. Handling post requests, enabling scripting, etc could all be super powerful

baq 221 days ago [-]

wonder if you can work on the DOM instead of HTML...

almost unrelated, but you can also compare spegel to https://www.brow.sh/

andrepd 221 days ago [-]

LLMs to generate SEO slop of the most utterly piss-poor quality, then another LLM to lossily "summarise" it back. Brave new world?

bubblyworld 221 days ago [-]

Classic that the first example is for parsing the goddamn recipe from the goddamn recipe site. Instant thumbs up from me haha, looks like a neat little project.

andrepd 221 days ago [-]

Which it apparently does by completely changing the recipe in random places including ingredients and amounts thereof. It is _indeed_ a very good microcosm of what LLMs are, just not in the way these comments think.

simedw 221 days ago [-]

It was actually a bit worse than that: the LLM never got the full recipe due to some truncation logic I had added. So it regurgitated the recipe from training, and apparently it couldn't do both that and convert units at the same time with the lite model (it worked for just flash).

I should have caught that, and there are probably other bugs too waiting to be found. That said, it's still a great recipe.

andrepd 221 days ago [-]

[flagged]

0x696C6961 221 days ago [-]

What is the point?

andrepd 220 days ago [-]

The point is LLMs are fundamentally unreliable algorithms for generating plausible text, and as such entirely unsuitable for this task. "But the recipe is probably delicious anyway" is beside the point, when it completely corrupted the meaning of the original. Which is annoying when it's a recipe but potentially very damaging when it's something else.

Techies seem to pretend this doesn't happen, and the general public who doesn't understand will trust the aforementioned techies. So what we see is these tools being used en masse and uncritically for purposes to which they are unsuited. I don't think this is good.

plonq 221 days ago [-]

I’m someone else but for me the point is a serious bug resulted in _incorrect data_, making it impossible to trust the output.

bubblyworld 220 days ago [-]

Assuming you are responding in good faith - the author politely acknowledged the bug (despite the snark in the comment they responded to), explained what happened and fixed it. I'm not sure what more I could expect here? Bugs are inevitable, I think it's how they are handled that drives trust for me.

throwawayoldie 221 days ago [-]

The output was then posted to the Internet for everyone to see, without the minimal amount of proofreading that would be necessary to catch that, which gives us a good microcosm of how LLMs are used.

On a more pleasant topic the original recipe sounds delicious, I may give it a try when the weather cools off a little.

bubblyworld 221 days ago [-]

What do you mean? The recipes in the screenshot look more or less the same, the formatting has just changed in the Spegel one (which is what was asked for, so no surprises there).

Edit: just saw the author's comment, I think I'm looking at the fixed page

lpribis 221 days ago [-]

Another great example of LLM hype train re-inventing something that already existed [1] (and was actually thought out) but making it worse and non-deterministic in the worst ways possible.

[1] https://schema.org/Recipe

bubblyworld 220 days ago [-]

Can we stop with the unprovoked dissing of anyone using LLMs for anything? Or at least start your own thread for it. It's an unpleasant, incredibly boring/predictable standard for discourse (more so than the LLMs themselves lol).

alt187 220 days ago [-]

It's in fact very provoked. The LLM just changes the instructions of the recipe and creates new ones. That's an unpleasant standard of user experience.

bubblyworld 220 days ago [-]

That is a terrible reason to be a dick to someone. Especially someone who has created free software that you have no obligation to use.

throwaway290 220 days ago [-]

When they stop training these LLMs on stolen content?

bubblyworld 220 days ago [-]

Then go join a thread where someone is actually talking about that issue, which is very valid, and make a meaningful contribution to the conversation. Jumping on people for no reason other than they have touched a language model is just rude.

throwaway290 220 days ago [-]

No you go. It is allowed to diss a thing in the thread about that thing if we think that thing is bad. People here dissed Dropbox when it was launched... and this is no Dropbox.

There are things that are not allowed. But here someone made a good point without any personal attacks. You silencing people is probably the least appropriate thing in this thread.

> make a meaningful contribution

What was not a meaningful contribution, mentioning the relevant schema? Saying using the LLM is bad for example because it is trained on our content without permission and payment? or that this steals from people who provide you content for free and make money from ads?

bubblyworld 220 days ago [-]

I'm not interested in continuing this conversation.

throwaway290 220 days ago [-]

I wish you didn't start it... I see it's you who posted the top level comment but it doesn't mean you "own" the thread.

komali2 220 days ago [-]

That's a cool schema, but the LLM solution is necessary because recipe website makers will never use the schema because they want you to have to read through garbage, with some misguided belief that this helps their SEO or something. Or maybe they get more money if you scroll through more ads?

throwaway290 220 days ago [-]

How do they make money then? Or do you think they are doing some sort of public service and you are entitled?

anExcitedBeast 220 days ago [-]

They make long articles to maximize ad exposure and SEO. It's good faith --they're doing what they have to to make money with the underlying tech ecosystem-- but it's not a good outcome.

LLMs are shifting that ecosystem (at least temporarily) and new revenue models will emerge. It'll take time to figure out. But we shouldn't artificially support a bad system just because it's the existing system.

Transitions are always awkward. In the meantime, I'm inclined to give people rope to experiment.

throwaway290 219 days ago [-]

It seems that your answer is a long euphemism for "I don't give a damn how they make money, they better find a new way I guess".

bubblyworld 220 days ago [-]

I'm genuinely a bit confused by the recipe blog business model. Like there's got to be one, right? People don't usually spew the same story about their grandma hundreds of times on a real blog.

Just hitting keywords for search? Many of them don't even have ads so I feel like that can't be it. Maybe referrals?

Revisional_Sin 220 days ago [-]

SEO. Longer articles get ranked higher.

bubblyworld 220 days ago [-]

Makes sense, thanks, but how do you actually make money from that without tons of ads? I realise this is a super naive question haha

gpm 220 days ago [-]

> without tons of ads

This is a requirement? I literally only browse the web with an ad blocker but I always assumed those sites had tons of ads.

bubblyworld 220 days ago [-]

Lol, that's funny - good point, I completely forgot I had an ad blocker running 24/7. I don't think I've browsed the raw internet in more than a decade...

RobertBobert 220 days ago [-]

[dead]

soap- 220 days ago [-]

And that would be great, if anyone used it.

LLMs are specifically good at a task like this because they can extract content from any webpage, regardless of whether it supports whatever standard that no one implements.

VMG 220 days ago [-]

The LLM thing actually works. Who cares if it's deterministic. Maybe the same people who come up with arcane schemas that nobody ever uses?

IncreasePosts 221 days ago [-]

There are extensions that do that for you, in a deterministic way and not relying on LLMs. For example, Recipe Filter for chrome. It just shows a pop up over the page when it loads if it detects a recipe

bubblyworld 221 days ago [-]

Thanks, I already use that plugin, actually, I just found the problem amusingly familiar. Recipe sites are the original AI slop =P

mromanuk 221 days ago [-]

I definitely like the LLM in the middle, it’s a nice way to circumvent the SEO machine and how Google has optimized writing in recent years. Removing all the cruft from a recipe is a brilliant case for an LLM. And I suspect more of this is coming: LLMs to filter. I mean, it would be nice to just read the recipe from HTML, but SEO has turned everything into an arms race.

tines 221 days ago [-]

> Removing all the cruft from a recipe is a brilliant case for an LLM

Is it though, when the LLM might mutate the recipe unpredictably? I can't believe people trust probabilistic software for cases that cannot tolerate error.

kccqzy 221 days ago [-]

I agree with you in general, but recipes are not a case where precision matters. I sometimes ask LLMs to give me a recipe and if it hallucinates something it will simply taste bad. Not much different from a human-written recipe where the human has drastically different tastes than I do. Also you basically never apply the recipe blindly; you have intuition from years of cooking to know you need to adjust recipes to taste.

Uehreka 221 days ago [-]

Hard disagree. I don’t have “years of cooking” experience to draw from necessarily. If I’m looking up a recipe it’s because I’m out of my comfort zone, and if the LLM version of the recipe says to add 1/2 cup of paprika I’m not gonna intuitively know that the right amount was actually 1 teaspoon. Well, at least until I eat the dish and realize it’s total garbage.

Also like, forget amounts, cook times are super important and not always intuitive. If you screw them up you have to throw out all your work and order take out.

kccqzy 221 days ago [-]

All I'm arguing is that you should have the intuition to know the difference between 1/2 cup of paprika and a teaspoon. Okay maybe if you just graduated from college and haven't cooked much you could make such a mistake but realistically outside the tech bubble of HN you won't find people confusing 1/2 cup with a teaspoon. It's just intuitively wrong. An entire bottle of paprika I recently bought has only 60 grams.

And yes cook times are important but no, even for a human-written recipe you need the intuition to apply adjustments. A recipe might be written presuming a powerful gas burner but you have a cheap underpowered electric. Or the recipe asks for a convection oven but your oven doesn't have the feature. Or the recipe presumes a 1100W microwave but you have a 1600W one. You stand by the food while it cooks. You use a food thermometer if needed.

tines 221 days ago [-]

Huh? You don't care if an LLM switches pounds to kilograms because... recipes might taste bad anyway????

kccqzy 221 days ago [-]

Switching pounds with kilograms is off by a factor of two. Most people capable of cooking should have the intuition to know something is awfully wrong if you are off by a factor of two, especially since pounds and kilograms are fairly large units when it comes to cooking.

whatevertrevor 221 days ago [-]

Not really an apt comparison.

For one an AI generated recipe could be something that no human could possibly like, whereas the human recipe comes with at least one recommendation (assuming good faith on the source, which you're doing anyway LLM or not).

Also an LLM may generate things that are downright inedible or even toxic, though the latter is probably unlikely even if possible.

I personally would never want to spend roughly an hour or so making bad food from a hallucinated recipe wasting my ingredients in the process, when I could have spent at most 2 extra minutes scrolling down to find the recommended recipe to avoid those issues. But to each their own I guess.

joshvm 221 days ago [-]

There is a well-defined solution to this. Provide your recipes as a Recipe schema: https://schema.org/Recipe

Seems like most of the usual food blog plugins use it, because it allows search engines to report calories and star ratings without having to rely on a fuzzy parser. So while the experience sucks for users, search engines use the structured data to show carousels with overviews, calorie totals and stuff like that.

https://recipecard.io/blog/how-to-add-recipe-structured-data...

https://developers.google.com/search/docs/guides/intro-struc...

EDIT: Sure enough, if you look at the OP's recipe example, the schema is in the source. So for certain examples, you would probably be better off having the LLM identify that it's a recipe website (or other semantic content), extract the schema from the header and then parse/render it deterministically. This seems like one of those context-dependent things: getting an LLM to turn a bunch of JSON into markdown is fairly reliable. Getting it to extract that from an entire HTML page potentially clutters the context, but you could separate the two and have one agent summarise any of the steps in the blog that might be pertinent.

    {"@context":"https://schema.org/","@type":"Recipe","name":"Slowly Braised Lamb Ragu ...
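The deterministic half of that idea can be sketched with the standard library alone: collect `application/ld+json` script blocks, keep the nodes typed `Recipe`, and render them. This is only an illustration under assumptions: `find_recipes` and `to_markdown` are hypothetical helper names, the sample page is made up, and only the `name` and `recipeIngredient` properties from schema.org are rendered.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collect the raw contents of <script type="application/ld+json"> blocks."""

    def __init__(self):
        super().__init__()
        self.in_jsonld = False
        self.buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self.in_jsonld = True
            self.buffer = []

    def handle_data(self, data):
        if self.in_jsonld:
            self.buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self.in_jsonld:
            self.in_jsonld = False
            self.blocks.append("".join(self.buffer))


def find_recipes(html: str) -> list[dict]:
    """Return every JSON-LD node in the page whose @type includes "Recipe"."""
    collector = JsonLdCollector()
    collector.feed(html)
    collector.close()
    recipes = []
    for raw in collector.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip malformed blocks instead of guessing
        # JSON-LD may be a single object, a list, or wrapped in "@graph".
        if isinstance(data, dict):
            nodes = data.get("@graph", [data])
        elif isinstance(data, list):
            nodes = data
        else:
            continue
        for node in nodes:
            if not isinstance(node, dict):
                continue
            types = node.get("@type", [])
            if isinstance(types, str):
                types = [types]
            if "Recipe" in types:
                recipes.append(node)
    return recipes


def to_markdown(recipe: dict) -> str:
    """Render the recipe name and ingredient list without any model involved."""
    lines = [f"# {recipe.get('name', 'Untitled recipe')}", "", "## Ingredients"]
    lines += [f"- {item}" for item in recipe.get("recipeIngredient", [])]
    return "\n".join(lines)


# Made-up page used only to exercise the sketch.
page = (
    '<html><head><script type="application/ld+json">'
    '{"@context":"https://schema.org/","@type":"Recipe",'
    '"name":"Test Ragu","recipeIngredient":["1.5 pounds lamb","6 cloves garlic"]}'
    "</script></head><body>Long life story...</body></html>"
)
for recipe in find_recipes(page):
    print(to_markdown(recipe))
```

Because the quantities are copied verbatim from the structured data, this path cannot silently change an ingredient; a model would only be needed for pages that ship no schema.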

visarga 221 days ago [-]

I foresaw this a couple of years ago. We already have web search tools in LLMs, and they are amazing when they chain multiple searches. But Spegel is a completely different take.

I think the ad blocker of the future will be a local LLM, small and efficient. Want to sort your timeline chronologically? Or want a different UI? Want some things removed, and others promoted? Hide low quality comments in a thread? All are possible with LLM in the middle, in either agent or proxy mode.

I bet this will be unpleasant for advertisers.

yellow_lead 221 days ago [-]

LLM adds cruft, LLM removes cruft, never a miscommunication

hirako2000 221 days ago [-]

Do you also like what it costs you to browse the web via an LLM potentially swallowing millions of tokens per minute?

prophesi 221 days ago [-]

This seems like a suitable job for a small language model. Bit biased since I just read this paper[0]

[0] https://research.nvidia.com/labs/lpr/slm-agents/

treyd 221 days ago [-]

I wonder if you could use a less sophisticated model (maybe even something based on LSTMs) to walk over the DOM and extract just the chunks that should be emitted and collected into the browsable data structure, but doing it all locally. I feel like it'd be straightforward to generate training data for this, using an LLM-based toolchain like what the author wrote to be used directly.

askonomm 221 days ago [-]

Unfortunately in the modern web simply walking the DOM doesn't cut it if the website's content loads in with JS. You could only walk the DOM once the JS has loaded, and all the requests it makes have finished, and at that point you're already using a whole browser renderer anyway.

kccqzy 221 days ago [-]

Yeah but this project doesn't use JS anyway.

leroman 221 days ago [-]

Cool idea! but kind of wasteful.. I just feel wrong if I waste energy.. At least you could first turn it into markdown with a library that preserves semantic web structures (I authored this- https://github.com/romansky/dom-to-semantic-markdown) saving many tokens = much less energy used..

otabdeveloper4 220 days ago [-]

This is exactly the sort of thing that should be running on a local LLM.

Using a big cloud provider for this is madness.

clbrmbr 221 days ago [-]

Suggestion: add a -p option:

    spegel -p "extract only the product reviews" > REVIEWS.md
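A flag like this is straightforward to wire up with `argparse`. The sketch below is purely illustrative and does not reflect Spegel's actual command-line interface: the program name `spegel-sketch`, the positional `url` argument, and `DEFAULT_PROMPT` are all assumptions made for the example.

```python
import argparse

# Hypothetical default used when no -p/--prompt is given.
DEFAULT_PROMPT = "Rewrite this page as clean markdown."


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        prog="spegel-sketch",
        description="Illustrative CLI: rewrite a page with a one-off prompt.",
    )
    parser.add_argument("url", nargs="?", help="page to fetch and rewrite")
    parser.add_argument(
        "-p", "--prompt", default=None,
        help="one-off prompt; the result would be written to stdout",
    )
    return parser


def build_request(argv: list[str]) -> dict:
    """Turn command-line arguments into the URL and prompt to send to the model."""
    args = build_parser().parse_args(argv)
    return {"url": args.url, "prompt": args.prompt or DEFAULT_PROMPT}


print(build_request(["-p", "extract only the product reviews", "https://example.com"]))
```

Writing the result to stdout instead of opening the TUI is what would make the `> REVIEWS.md` redirection above work.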

__MatrixMan__ 221 days ago [-]

It would be cool if it were smart enough to figure out whether it was necessary to rewrite the page on every visit. There's a large chunk of the web where one of us could visit once, rewrite to markdown, and then serve the cleaned up version to each other without requiring a distinct rebuild on each visit.

myfonj 221 days ago [-]

Each user has distinct needs and distinct prior knowledge about the topic, so even the "raw" super clean source form will probably be eventually adjusted differently for most users.

But yes, having some global shared redundant P2P cache (of the "raw" data), like IPFS (?) could possibly help and save some processing power and help with availability and data preservation.

__MatrixMan__ 221 days ago [-]

I imagine it sort of like a microscope. For any chunk of data that people bothered to annotate with prompts re: how it should be rendered you'd end up with two or three "lenses" that you could toggle between. Or, if the existing lenses don't do the trick, you could publish your own and, if your immediate peers find them useful, maybe your transitive peers will end up knowing about them as well.

markstos 221 days ago [-]

Cache headers exist for servers to communicate to clients how long it is safe to cache things for. The client could be updated to add a cache layer that respects cache headers.
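As a rough illustration of respecting cache headers, the sketch below reads a `Cache-Control` value and returns how long a rewritten page could be reused. It is deliberately simplified and conservative: it only looks at `max-age`, `no-store`, and `no-cache`, and treats `no-cache` as "do not reuse" even though HTTP allows storing and revalidating. `cache_lifetime_seconds` is a hypothetical helper name.

```python
def cache_lifetime_seconds(cache_control: str) -> int:
    """Return how long a response may be reused, in seconds (0 means do not reuse)."""
    directives = {}
    for part in cache_control.split(","):
        name, _, value = part.strip().partition("=")
        if name:
            directives[name.lower()] = value.strip('"')
    # Conservative choice for this sketch: both directives disable reuse.
    if "no-store" in directives or "no-cache" in directives:
        return 0
    try:
        return max(0, int(directives.get("max-age", "0")))
    except ValueError:
        return 0  # malformed max-age is treated as not cacheable


print(cache_lifetime_seconds("public, max-age=3600"))  # 3600
print(cache_lifetime_seconds("no-store"))              # 0
```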

simedw 221 days ago [-]

If the goal is to have a more consistent layout on each visit, I think we could save the last page's markdown and send it to the model as a one-shot example...

pmxi 221 days ago [-]

The author says this is for “personalized views using your own prompts.” Though, I suppose it’s still useful to cache the outputs for the default prompt.

__MatrixMan__ 221 days ago [-]

Or to cache the output for whatever prompt your peers think is most appropriate for that particular site.
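Caching per prompt, as discussed in this subthread, mostly comes down to the cache key: the same URL rendered with a different prompt is a different entry. A minimal in-memory sketch, where `cache_key`, `RewriteCache`, and the `render` callable are hypothetical names and the model call is replaced by a stub:

```python
import hashlib


def cache_key(url: str, prompt: str) -> str:
    """Derive a stable key from the page URL and the prompt used to rewrite it."""
    payload = url.strip() + "\n" + prompt.strip()
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class RewriteCache:
    """In-memory cache of rewritten pages, keyed by (URL, prompt)."""

    def __init__(self):
        self._store = {}

    def get_or_render(self, url: str, prompt: str, render) -> str:
        key = cache_key(url, prompt)
        if key not in self._store:
            # `render` stands in for the expensive model call.
            self._store[key] = render(url, prompt)
        return self._store[key]


calls = []


def fake_render(url: str, prompt: str) -> str:
    calls.append((url, prompt))
    return f"markdown for {url} using '{prompt}'"


cache = RewriteCache()
cache.get_or_render("https://example.com", "default", fake_render)
cache.get_or_render("https://example.com", "default", fake_render)
cache.get_or_render("https://example.com", "reviews only", fake_render)
print(len(calls))  # 2: the repeated (URL, prompt) pair was served from cache
```

Sharing such entries between peers would additionally need some way to trust the cached output, which this sketch does not address.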

kelsey98765431 221 days ago [-]

People here are not realizing that html is just the start. If you can render a webpage into a view, you can render any input the model accepts. PDF to this view. Zip file of images to this view. Giant json file into this view. Whatever. The view is the product here, not the html input.

IncreasePosts 221 days ago [-]

I did something similar, but with a chrome extension. Basically, for every web page, I feed the HTML to a local LLM (well, on a server in my basement). I ask it to consider if the content is likely clickbait or can be summarized without losing too many interesting details, and if so, it adds a little floating icon to the top of the page that I can click on to see the summary instead.

My next plan is to rewrite hyperlinks to provide a summary of the page on hover, or possibly to rewrite the hyperlinks to be more indicative of the content at the end of it(no more complaining about the titles of HN posts...). But, my machine isn't too beefy and I'm not sure how well that will work, or how to prioritize links on the page.

robbles 221 days ago [-]

I'm curious whether anyone has run into hallucinations with this kind of use of an LLM.

They are pretty great at converting data between formats, but I always worry there's a small chance it changes the actual data in the output in some small but misleading way.

throwawayoldie 220 days ago [-]

Guess you didn't see the old version of the screenshots on the page, which showed things like 1.5 pounds of lamb being converted into 1.5 kg, that is, more than doubling it.
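The arithmetic behind "more than doubling it" is easy to check. A small worked example, using the exact definition of the international pound (0.45359237 kg); the helper name is made up for illustration:

```python
KG_PER_POUND = 0.45359237  # exact by definition of the international pound


def pounds_to_kg(pounds: float) -> float:
    return pounds * KG_PER_POUND


original_lb = 1.5
correct_kg = pounds_to_kg(original_lb)

print(f"{original_lb} lb = {correct_kg:.2f} kg")         # 1.5 lb = 0.68 kg
print(f"1.5 kg is {1.5 / correct_kg:.2f}x that amount")  # 1.5 kg is 2.20x that amount
```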

ohadron 221 days ago [-]

This is a terrific idea and could also have a lot of value with regards to accessibility.

taco_emoji 221 days ago [-]

The problem, as always, is that LLMs are not deterministic. Accessibility needs to be reliable and predictable above all else.

Jotalea 220 days ago [-]

Insanely resource expensive, but still a very interesting "why not?" idea. I think a fitting use case would be adapting newer websites for them to work on older hardware. That is, assuming the new technologies used are not vital to the functionality of the website (ex. Spotify, YouTube, WhatsApp) and can be adapted to older technologies (ex. Google Search, from all the styles that it has, to a simple input and a button).

In theory this could be used for ad blocking; it would be more expensive and less efficient, but the idea is there.

So, it is a very curious idea, but we still have to find an appropriate use case.

hyperific 221 days ago [-]

Why not use pandoc to convert html to markdown and have the LLM condense from there?

cheevly 221 days ago [-]

Very cool! My retired AI agent transformed live webpage content, here's an old video clip of transforming HN to My Little Pony (with some annoying sounds): https://www.youtube.com/watch?v=1_j6cYeByOU. Skip to ~37 seconds for the outcome. I made an open-source standalone Chrome extension as well, it should probably still work for anyone curious: https://github.com/joshgriffith/ChromeGPT

js2 220 days ago [-]

I was unfamiliar with Textual which seems more interesting than Slowly Braised Lamb Ragu:

https://textual.textualize.io/

mossTechnician 221 days ago [-]

Changes Spegel made to the linked recipe's ingredients:

Pounds of lamb become kilograms (more than doubling the quantity of meat), a medium onion turns large, one celery stalk becomes two, six cloves of garlic turn into four, tomato paste vanishes, we lose nearly half a cup of wine, beef stock gets an extra ¾ cup, rosemary is replaced with oregano.

simedw 221 days ago [-]

Fantastic catch! It led me down a rabbit hole, and I finally found the root cause.

The recipe site was so long that it got truncated before being sent to the LLM. Then, based on the first 8000 characters, Gemini hallucinated the rest of the recipe; it was definitely in its training set.

I have fixed it and pushed a new version of the project. Thanks again, it really highlights how we can never fully trust models.
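One way to avoid this class of bug is to never drop input silently: split long pages into chunks instead of cutting them off. The sketch below is an assumption about how that could be done, not the fix that was actually pushed; `split_for_model` is a hypothetical helper and the 8000-character limit is taken from the comment above.

```python
def split_for_model(text: str, limit: int = 8000) -> list[str]:
    """Split text into chunks of at most `limit` characters, preferring line breaks."""
    chunks = []
    current = ""
    for line in text.splitlines(keepends=True):
        # A single line longer than the limit is cut hard so nothing is dropped.
        while len(line) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(line[:limit])
            line = line[limit:]
        if len(current) + len(line) > limit:
            chunks.append(current)
            current = ""
        current += line
    if current:
        chunks.append(current)
    return chunks


page_text = ("x" * 50 + "\n") * 400  # 20,400 characters, well over the limit
chunks = split_for_model(page_text)

# Nothing is lost: the chunks reassemble into the original text.
assert "".join(chunks) == page_text
print(len(chunks), max(len(c) for c in chunks))
```

Each chunk can then be sent separately, or the caller can at least tell the model (and the user) that the input was incomplete.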

jugglinmike 221 days ago [-]

Great catch. I was getting ready to mention the theoretical risk of asking an LLM to be your arbiter of truth; it didn't even occur to me to check the chosen example for correctness. In a way, this blog post is a useful illustration not just of the hazards of LLMs, but also of our collective tendency to eschew verity for novelty.

andrepd 221 days ago [-]

> Great catch. I was getting ready to mention the theoretical risk of asking an LLM to be your arbiter of truth; it didn't even occur to me to check the chosen example for correctness.

It's beyond parody at this point. Shit just doesn't work, but this fundamental flaw of LLMs is just waved away or simply not acknowledged at all!

You have an algorithm that rewrites textA to textB (so nice), where textB potentially has no relation to textA (oh no). Were it anything else this would mean "you don't have an algorithm to rewrite textA to textB", but for gen ai? Apparently this is not a fatal flaw, it's not even a flaw at all!

I should also note that there is no indication that this fundamental flaw can be corrected.

throwawayoldie 220 days ago [-]

> the theoretical risk of asking an LLM to be your arbiter of truth

"Theoretical"? I think you misspelled "ubiquitous".

orliesaurus 221 days ago [-]

oh damn...

achierius 221 days ago [-]

Did you actually observe this, or is it just meant to be illustrative of what could happen?

mossTechnician 221 days ago [-]

This is what actually happened in the linked article. The recipe is around the text that says

> Sometimes you don't want to read through someone's life story just to get to a recipe... That said, this is a great recipe

I compared the list of ingredients to the screenshot, did a couple unit conversions, and these are the discrepancies I saw.

hambes 220 days ago [-]

I've thought about getting a web browser to work on the terminal for a while now. This is an idea that hadn't occurred to me yet and I'm intrigued.

But I feel it doesn't solve the main issue of terminal-based web browsing. Displaying HTML in the terminal is often kind of ugly and css-based fanciness does not work at all, but that can usually just be ignored. The main problem is javascript and dynamic content, which this approach just ignores.

So no real step forward for cli web browsing, imo.

deadbabe 221 days ago [-]

I would like to see a version of this where an LLM just takes the highlights of various social media content from your feed and just gives you the stuff worth watching. This also means excluding crap you had no interest in and was simply inserted into your feed. Fight algorithms with algorithms. Eliminate doom scrolling.

adrianpike 221 days ago [-]

Super neat - I did something similar on a lark to enable useful "web browsing" over 1200 baud packet - I have Starlink back at my camp but might be a few miles away, so as long as I can get line of sight I can Google up stuff, albeit slow. Worked well but I never really productionalized it beyond some weekend tinkering.

coder543 221 days ago [-]

Just a typo note: the flow diagram in the article says "Gemini 2.5 Pro Lite", but there is no such thing.

simedw 221 days ago [-]

You are right, it's Gemini 2.5 Flash Lite

deepdarkforest 221 days ago [-]

The main problem with these approaches is that most sites now are useless without JS or access to the accessibility tree. Projects like browser-use or other DOM-based approaches at least see the DOM (and screenshots).

I wonder if you could turn this into a chrome extension that at least filters and parses the DOM

jadbox 221 days ago [-]

I actually made a CLI tool recently that uses Puppeteer to render the page including JS, summarizes key info and actions, and enables simple form filling all from a CLI menu. I built it for my own use-cases (checking and paying power bills from CLI), but I'd love to get feedback on the core concept: https://github.com/jadbox/solomonagent

andoando 221 days ago [-]

Dude I love this. I've been thinking of doing exactly this, but as a screen reader for accessibility reasons.

jadbox 221 days ago [-]

Thanks, it's alpha at the moment; next up are complex forms and fixing broken actions (downloading). Do give it a spin! You're welcome to contribute or drop feedback on the repo :)

willsmith72 221 days ago [-]

True for stuff requiring interaction, but to help their LCP/SEO lots of sites these days render plain html first. It's not "usable" but for viewing it's pretty good

ghm2180 220 days ago [-]

This is great! Another useful addition that would make me use it: add a Chrome browser tool to allow access to pages that need authn and then scrape them for you.

My #1 usecase is fetching wikis on my hard drive and letting a local coding agent use it for creating plans.

pepperonipboy 221 days ago [-]

Could work great with emacs' eww!

sammy0910 221 days ago [-]

I built a project that basically does this for emacs

https://github.com/sstraust/simpleweb

thephotonsphere 221 days ago [-]

also with lynx because it can browse from stdin

neocodesoftware 221 days ago [-]

Does it fail cloudflare captcha?

ospider 221 days ago [-]

I think it will, it uses requests, and cloudflare blocks traffic from non-browser, e.g. python http clients. It would be better to use something like curl_cffi.

stared 221 days ago [-]

Any chance it would work for pages like Facebook or LinkedIn? I would love to have a distraction-free way of searching information there.

Obviously, against wishes of these social networks, which want us to be addicted... I mean, engaged.

aydyn 221 days ago [-]

Does anyone really get addicted to LinkedIn? It's so sanitized and clinical. Nobody acts real on there or even pretends to.

encom 221 days ago [-]

The worst[1] part about losing my job last month was having to take LinkedIn seriously, and the best[2] part about now having found a new job is logging off LinkedIn, for a very long time hopefully. The self-aggrandising, pretentious, occasionally virtue-signalling performance-posting makes me want to throw up. It takes a considerable amount of effort on my part to not make sarcastic shitposts, but in the interest of self-preservation, I restrain myself. My header picture, however, is my extremely messy desk, full of electronics, tools, test equipment, drawings, computers and coffee cups. Because that's just how I work when I'm in the zone, and it serves as a quiet counterpoint to the polished self-promotion people do.

And I didn't even get the new job through LinkedIn, though it did yield one interview.

[1] Not the actual worst.

[2] Not the actual best.

simedw 221 days ago [-]

We’ll probably have to add some custom code to log in, get an auth token, and then browse with it. Not sure if LinkedIn would like that, but I certainly would.

Buttons840 221 days ago [-]

A step towards the future of ad-blocking maybe? Just rewrite every page?

conradkay 221 days ago [-]

Something tells me we'll see more ad-inserting

Modified3019 221 days ago [-]

>Companies burning energy with llms to dynamically hide ads and bullshit on every pageload

>Individuals burning energy using personal llm internet condoms to strip ads and bullshit from every pageload

Eventually there will be a project where volunteers use llms to harvest the real internet and “launder” both the copyright and content into some kind of pre-processed distributed shadow internet where things are actually usable, while being just as wrong as the real internet.

What a future.

userbinator 221 days ago [-]

Many people were doing that at the turn of the century(!) with filtering proxies, more deterministically and with far less computing power. Some still do today.

cyrillite 221 days ago [-]

I have been thinking of a project extremely similar to this for a totally different purpose. It’s lovely to see something like this. Thank you for sharing it, inspiring

amelius 221 days ago [-]

Curious about that different purpose ...

anonu 221 days ago [-]

Don't you need javascript to make most webpages useful?

inetknght 221 days ago [-]

Good sir, no.

The web has existed for long before javascript was around.

The web was useful for long before javascript was around.

I literally hate javascript -- not the language itself but the way it is used. It has enabled some pretty cool things, yes. But javascript is not required to make useful webpages.

pmxi 221 days ago [-]

I think you misunderstood him. Yes, it’s possible to CREATE a useful webpage without JavaScript, but many EXISTING webpages rely on JavaScript to be functional.

jazzyjackson 221 days ago [-]

If Amazon.com can work with JavaScript disabled, any site could be rewritten to do without. But I think to even get to the content on a lot of SPAs this would need to be running a headless browser to render the page, before extracting the static content unfortunately

IncreasePosts 221 days ago [-]

No - an experiment: try disabling javascript in your browser settings, and then whenever you see a webpage that isn't working, enable javascript for that domain. You'd be surprised how fast 90% of the web feels with JS disabled.

nashashmi 221 days ago [-]

You should call this software a lens and filter instead of a mirror. It takes the essential information and transforms it into another medium.

barrenko 220 days ago [-]

I need this, but for the new forum formats such as Discourse or Discuss or whatever it's called. An eyesore and a brainsore.

crest 221 days ago [-]

A cool hack, but also impressive to come up with a CLI "browser" that's even more expensive to run than Chromium.

WD-42 221 days ago [-]

Does anyone know why LLMs love emojis so much?

userbinator 221 days ago [-]

Likely because it was trained on such material... which is just as authentic and vapid.

eevmanu 221 days ago [-]

great POC

looks very similar to a chrome extension i use for a similar goal: reader view - https://chromewebstore.google.com/detail/ecabifbgmdmgdllomnf...

web3aj 221 days ago [-]

Very cool. I’ve been interested in browsing the web directly from my terminal; this feels accessible.

eniac111 221 days ago [-]

Cool! It would be even better if it was able to create simple web pages for vintage browsers.

stronglikedan 221 days ago [-]

That would violate the do-one-thing-and-do-it-well principle for no apparent benefit. There are plenty of tools to convert markdown to basic HTML already.

benrutter 221 days ago [-]

Welcome to 2025 where it's more reasonable to filter all content through an LLM than to expect web developers to make use of the semantic web that's existed for more than a decade. . .

Seriously though, looks like a novel fix for the problem that most terminal browsers face. Namely that terminals are text based, but the web, whilst it contains text, is often subdivided in a way that only really makes sense graphically.

I wonder if a similar type of thing might work for screen readers or other accessibility features

insane_dreamer 221 days ago [-]

Interesting, but why round-trip through an LLM just to convert HTML to Markdown?

markstos 221 days ago [-]

Because the modern web isn't reliably HTML, it's "web apps" with heavy use of JavaScript and API calls. To first display the HTML that you see in your browser, you need a user agent that runs JavaScript and makes all the backend calls that Chrome would make to put together some HTML.

Some websites may still return some static HTML upfront that could be usefully understood without JavaScript processing, but a lot don't.

That's not to say you need an LLM, there are projects like Puppeteer that are like headless browsers that can return the rendered HTML, which can then be sent through an HTML to Markdown filter. That would be less computationally intensive.
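
To illustrate the deterministic "HTML to Markdown filter" step, here is a minimal sketch using only Python's standard `html.parser`. It is an assumption-laden toy, not any particular project's converter: the `MarkdownExtractor` class and `html_to_markdown` function are hypothetical names, only a few tags are handled, ordered lists are rendered as bullets, and whitespace-only text between inline tags is dropped.

```python
from html.parser import HTMLParser


class MarkdownExtractor(HTMLParser):
    """Tiny HTML-to-Markdown converter: headings, paragraphs, lists, links."""

    SKIP = {"script", "style", "noscript"}
    HEADINGS = {"h1": "# ", "h2": "## ", "h3": "### "}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []        # collected Markdown fragments
        self.skip_depth = 0  # > 0 while inside a skipped element
        self.href = None     # target of the link currently open

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1
            return
        if self.skip_depth:
            return
        if tag in self.HEADINGS:
            self.out.append("\n\n" + self.HEADINGS[tag])
        elif tag == "p":
            self.out.append("\n\n")
        elif tag in ("ul", "ol"):
            self.out.append("\n")  # blank line before the list
        elif tag == "li":
            self.out.append("\n- ")
        elif tag == "a":
            self.href = dict(attrs).get("href")
            self.out.append("[")

    def handle_endtag(self, tag):
        if tag in self.SKIP:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif tag == "a" and not self.skip_depth:
            self.out.append(f"]({self.href or ''})")
            self.href = None

    def handle_data(self, data):
        if self.skip_depth:
            return
        # Collapse internal whitespace, keeping one space at either edge
        # so words next to inline tags do not run together.
        text = " ".join(data.split())
        if not text:
            return
        if data[0].isspace():
            text = " " + text
        if data[-1].isspace():
            text += " "
        self.out.append(text)


def html_to_markdown(html: str) -> str:
    """Convert an HTML string to a rough Markdown rendering."""
    parser = MarkdownExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.out).strip()
```

A filter like this never invents content, which is its appeal over an LLM, but it also cannot decide what is "the recipe" and what is clutter; that judgment is the part the LLM adds.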

insane_dreamer 221 days ago [-]

> That's not to say you need an LLM, ... then be sent through an HTML to Markdown filter. That would be less computationally intensive.

which was exactly my point

crent 221 days ago [-]

Because this isn't just converting HTML to markdown. I'd recommend taking another look at the website and particularly read the recipe example as it demonstrates the goal of the project pretty well.

gvison 220 days ago [-]

Great project, much less memory than opening a web page in a browser.

tartoran 221 days ago [-]

Loving the text only browsing. Is this as fast as in the preview?

cout 221 days ago [-]

This is a neat idea!

I wonder if it could be adapted to render as gopher pages.

fzaninotto 221 days ago [-]

Congrats! Now you need an entire datacenter to visualize a web page.

juujian 221 days ago [-]

Couldn't this run reasonably well on a local machine if you have some kind of neural processing chip and enough RAM? Conversion to MD shouldn't require a huge model.

busssard 221 days ago [-]

only if you use an API and not a dedicated distill/tune for html to MD conversion.

But the question of Javascript remains

098799 221 days ago [-]

You could also use headless Selenium under the hood and pipe the entire DOM of the document to the model after the JavaScript has loaded. Of course it would make it much slower, but it would also address the main worry people have, which is that many websites will flat out not show anything in the initial GET request.

busssard 221 days ago [-]

can you flesh this out a tiny bit? because for indy-crawlers the javascript rendering is the main problem.

098799 221 days ago [-]

Here's a sketch: https://chatgpt.com/share/68640b97-9a48-8007-a27c-fdf85ff412... -- selenium drives your actual browser under the hood.
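
In case the link is unavailable, the idea can be sketched roughly as follows. This is not the linked code; it assumes the `selenium` package and a local Chrome install, and the `rendered_html` and `make_headless_chrome` names are hypothetical. The driver factory is a parameter so the function can be exercised without a real browser.

```python
from typing import Callable


def make_headless_chrome():
    """Create a headless Chrome driver (assumes selenium and Chrome are installed)."""
    from selenium import webdriver  # imported lazily so this module loads without it

    options = webdriver.ChromeOptions()
    options.add_argument("--headless=new")
    return webdriver.Chrome(options=options)


def rendered_html(url: str, driver_factory: Callable = make_headless_chrome) -> str:
    """Load `url` in a browser and return the DOM as it stands after page load."""
    driver = driver_factory()
    try:
        driver.get(url)
        # page_source reflects the current DOM, not the initial GET response.
        return driver.page_source
    finally:
        # Always release the browser process, even if loading fails.
        driver.quit()
```

Pages that fetch content well after the load event would likely need an explicit wait before reading `page_source`; the returned HTML can then be handed to the model in place of the raw response.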

amelius 221 days ago [-]

Can it strip ads?

tossandthrow 221 days ago [-]

It can inject its own!

amelius 221 days ago [-]

You have a point as it uses Gemini under the hood. However, the moment Google introduces ads in the model users will run away. So Google really has no opportunity here to inject ads.

And wouldn't it be ironic if Gemini was used to strip ads from webpages?

tossandthrow 221 days ago [-]

The field of "seo for Ai", ie, seeking to have your company featured in LLMs, is already established.

In the rare cases where the model would jam on its own, this will likely already happen.

herval 221 days ago [-]

We’re back to the BBS days, 30 years later!

nicklo 221 days ago [-]

Have you considered making an MCP for this? Would be great for use in vibe-coding

remram 221 days ago [-]

Not to be confused with Kubernetes' Spegel: https://spegel.dev/ https://github.com/spegel-org/spegel

Klaster_1 221 days ago [-]

Now that's a user agent!

CaptainFever 221 days ago [-]

Finally, web browsers work for the user, not the website owners!

revskill 221 days ago [-]

Use uv instead of pip

b0a04gl 221 days ago [-]

this is another layer of abstraction on top of an already broken system. you're running html through an llm to get markdown that gets rendered in a terminal browser. that's like... three format conversions just to read text. the original web had simple html that was readable in any terminal browser already. now pages aren't designed as documents anymore; they're designed as applications that happen to deliver some content as a side effect

MangoToupe 221 days ago [-]

That's the world we live in. You can either not have access to content or you must accept abstractions to remove all the bad decisions browser vendors have forced on us the last 30 years to support ad-browsing.

worldsayshi 221 days ago [-]

If the web site is a SPA that is hydrated using an API it would be conceivable that the LLM can build a reusable interface around the API while taking inspiration from the original page. That interface can then be stored in some cache.

I'm not saying it's necessarily a good idea but perhaps a bad/fun idea that can inspire good ideas?

jrm4 221 days ago [-]

I 100% agree -- but still I find this a feature and not a bug. It's always an arms race, and I like this shot fired.

_joel 221 days ago [-]

> this is another layer of abstraction on top of an already broken system

pretty much like all modern computing then, hey.

nashashmi 221 days ago [-]

Think of it as a secretary that is transforming and formatting information. You may wish the original medium were closer to what you want, but you don’t get that, so you can get a cheap, dumber secretary instead.

amelius 221 days ago [-]

I take it you never use "Reader mode" in your browser?

ktpsns 221 days ago [-]

Reminds me of https://www.brow.sh/ which is not AI related at all but just a very powerful terminal browser which in fact supports JS, even videos.

nartho 221 days ago [-]

I think the project itself is really cool, though I really don't like the trend of having LLMs regurgitate content back to us. That said, this kinda makes me think of Browsh, which took the opposite approach and tries to render the HTML in the terminal (without LLMs as far as I know)

https://github.com/browsh-org/browsh https://www.youtube.com/watch?v=HZq86XfBoRo

hirako2000 221 days ago [-]

That would also keep your wallet or GPU rig cooler

sammy0910 221 days ago [-]

I built something that did this a bit ago

https://github.com/sstraust/simpleweb

sammy0910 221 days ago [-]

something I found challenging when I was building was -- how do you make the speed fast enough so that it still creates a smooth browsing experience?

I'm curious how you tackled that problem

4b11b4 221 days ago [-]

https://github.com/sstraust/simpleweb/blob/79294b461b2e67a24...

Not the answer to your question but here's the prompt

simedw 221 days ago [-]

That's a cool project.

I think most of it comes down to Flash-Lite being really fast, and the fact that I'm only outputting markdown, which is fairly easy and streams well.

busssard 221 days ago [-]

what does it do about javascript?

ghaering 221 days ago [-]

[dead]

jannniii 221 days ago [-]

[dead]

willm 221 days ago [-]

Why not just use ncurses?

jannniii 221 days ago [-]

Gopher is back!

Bluestein 221 days ago [-]

Gosh. Lovely project and cool, and - likewise - a bit scary: This is where the "bubble" seals itself "from the inside" and custom (or cloud, biased) LLMs sear the "bubble" in.-

The ultimate rose (or red, or blue or black ...) coloured glasses.-

wayeq 221 days ago [-]

... what?

rrnechmech 219 days ago [-]

I think OP means that the "filtering" this on-the-fly conversion does can amplify the content bubbles we live in
