So once all that is done, the user needs to click away the cookie consent banner, the newsletter sign-up, and the continue reading button. Only then can we stop the clock on "time to read content".
This is a big reason why I read comments first. I click and get straight to content.
EDIT: I just realized that the "time to ..."-moniker works really badly in my phrasing here. Maybe "time to start reading" would have been better.
Doesn't help with load times though; it would be an interesting exercise for such a plugin to only fetch the content, a la Lynx back in the day.
IIRC there was some kind of grid format used that was entirely opaque to me; yet the developer must have thought it was an improvement over the previous "allow X.domain.com temporarily" vs "allow X.domain.com permanently" that I was used to.
I should give it another try, maybe it's not as obscure as it once was.
EDIT: I've just reinstalled it, and it is much more intuitive than it was before. Thank you for reminding me about it.
For me, everything has continued to get faster for the past 30 years. Better modems, service, DSL, cable. 15 years ago, I installed a button on my browser at work to turn off scripting (all or nothing). I used it until NoScript came out.
Most news sites - NYTimes, WPost, CNN, even HN - you don't even need scripting to read the text, and it loads like lightning. I don't even bother with an ad blocker these days. Want to see pics? Tweak the pic link yourself, or google the article title. Worth it, because it's so much faster.
On HN, I finally got a login and enabled scripting for ycombinator.com - it's almost always fast anyway.
Example: to follow the war in Ukraine at
- I just enable "scribblemaps.com". Don't need "azure.com", "google-analytics.com", or "visualstudio.com".
Most sites I leave default, no scripting at all. If I really want to see missing content, I might try enabling some, or even all, if I trust the site. Most of the time I can read everything I want without wasting the time.
Even if the whole text is loaded with the initial page, you'll see a request to somewhere to record that you clicked. Your engagement has been measured. This can be helpful for the site directly (which articles do people actually care about after the first paragraph) and that people are engaged enough to click for more is something they can “sell” to advertisers. A better designed site will have the “read more” button be an actual link so if you have JS disabled (or it fails to load) instead of the content reveal simply failing it falls back to a full-page round-trip so you are counted that way.
This could be done by picking up on scroll events or running visibility tests on lower parts of the article, instead of asking the user to click, but those methods are less accurate for a number of reasons (people with big screens, people with JS disabled or JS that failed to load, …)
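For what it's worth, the visibility-test variant usually looks something like this. Just a sketch, not any particular site's code; the /metrics/engaged endpoint and the #article-second-half marker are made up:

  // Fire a single beacon once the reader actually reaches the lower half
  // of the article. IntersectionObserver and sendBeacon are standard APIs;
  // the endpoint and the marker element are placeholders.
  const marker = document.querySelector("#article-second-half");
  if (marker && "IntersectionObserver" in window) {
    const observer = new IntersectionObserver((entries) => {
      for (const entry of entries) {
        if (entry.isIntersecting) {
          // sendBeacon survives page unload and doesn't block rendering.
          navigator.sendBeacon("/metrics/engaged", JSON.stringify({
            article: location.pathname,
            at: Date.now(),
          }));
          observer.disconnect(); // only count the first time
        }
      }
    }, { threshold: 0.5 });
    observer.observe(marker);
  }
  // If JS is disabled or failed to load, none of this runs, which is
  // exactly the accuracy problem mentioned above.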
It is to try to prove they have engaged users who are looking at the screen rather than, for instance, opening it in a tab and then closing it later without ever reading any of it.
You might link that to what adverts were shown, but only if they know that information (they may not if the advertising space is farmed out to 3rd parties which is very common).
Maybe, maybe not. It could all be a huge cycle of cons with each part of the advertising and stalking business making crap up about what is effective and convincing the next people in the cycle. But certainly some think it is effective, or at least are hoping that it will be.
It could be like junk email: there are companies and individuals who make money from sending it. It doesn't actually matter to them if it is effective, as long as they can convince others buying their services that it is, or likely might be, effective, and they get their money whether it actually is or very much is not.
In practice if editors don’t write excerpts, this is the first paragraph, and if there are no other stories, well it’s a measurement of engagement at that point.
You forgot the paywall that you will see at this point.
Maybe there is also a customer service bot saying: "Hey! Ask me about our special offer on a 12-month subscription!"
this "felt" too much work so i thought of redesigning the workflow so we ended up with
<captcha send>username<wait><tab>password><wait><tab><show captcha getting banner><wait><captcha paste><tab><enter>
we made the entire process take time to accommodate the ~6 seconds of captcha which currently does not "feel" as taking that time because people are now not happy with loading and wait spinnners generally or stuff taking time, they want everything to be instant. this is just gaming the system so that we can work around technical limitations
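Concretely, the trick is just to kick off the slow captcha request as soon as the user touches the form, so its ~6 seconds overlap with typing. Something like this sketch (the selectors, the /api/captcha endpoint and the hidden captcha field are all made up):

  let captchaPromise: Promise<string> | null = null;
  const username = document.querySelector<HTMLInputElement>("#username");
  const form = document.querySelector<HTMLFormElement>("#login");

  // Start fetching the captcha the moment the username field gets focus.
  username?.addEventListener("focus", () => {
    captchaPromise ??= fetch("/api/captcha").then((r) => r.text());
  }, { once: true });

  form?.addEventListener("submit", async (event) => {
    event.preventDefault();
    // By the time both fields are typed this is usually already resolved,
    // so the "getting captcha" banner barely flashes.
    const captcha =
      await (captchaPromise ?? fetch("/api/captcha").then((r) => r.text()));
    // Assumes a hidden <input name="captcha"> in the form.
    (form.elements.namedItem("captcha") as HTMLInputElement).value = captcha;
    form.submit();
  });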
Instead of clicking around and filling out forms and waiting for loading spinners all the time, we'll just tell a large language model what we want to do in English, and it will go off and screen-scrape a bunch of apps and websites, do all the clicking for us, and summarize the results in a much simpler UI designed to actually be fast and useful, vs. designed to optimize the business metrics of some company as interpreted by a gaggle of product managers.
This isn't unprecedented. Plaid screen scrapes terrible bank websites and turns them into APIs, though without AI. Google Duplex uses AI to turn restaurant phone numbers into an API for making reservations. DeepMind's Sparrow, just announced today, answers factual questions posed in plain English by performing Google searches and summarizing the results. But it's going to be a revolution when it becomes much more general and able to take actions rather than just summarize information. It isn't far off! https://adept.ai is pretty much exactly what I'm talking about, and I expect there are a lot more people working on similar things that are still in stealth mode.
And Google Search is so utterly weak. Not just because of the neutering of search options or the weird priority conflict inside Google, but because even with literally all the data in the world about a specific user, it doesn't seem like it can wrangle what a request actually means.
It can tell me the time in Chicago, but not what computer would actually be the best for my work. That search will only be spam, irrelevant popular results and paid reviews.
Same if I asked for a _good_ pizza recipe, it would probably not understand what that actually means for me.
The whole model of "throwing a request in the box and expecting a result" seems broken to me. I mean, even between humans it doesn't work that way, so why would it work with an advanced AI?
PS: even with more back and forth, I'm imagining what we have now with customer support over chat, and while more efficient than by phone, it's definitely not the interface I want by default
At the same time, many primarily text-based sites, such as news and social media, struggle to load in 15 seconds.
My connection measures about 920 Mbps down, so 15 seconds is really ridiculously slow. This is certainly not a technology problem, as evidenced by my own app that does so much more in much less time.
This site, for instance.
Then, maybe LLMs will get more intelligent and it will be less comical, and more like having an actual slave. An agent intelligent enough to do all these things is probably intelligent enough to make me queasy at the thought of being its master.
Language is an imprecise tool, it has inherent ambiguity. Language is also laborious and one dimensional (stream of bits over temporal dimension). A tool such as the one you describe would be extremely frustrating to use.
But I wonder if there will come a point where captchas and "Not a Robot" checkboxes no longer work.
It’s understandable to get frustrated by this, but at some point you realize it’s pointless.
This is true in many, many facets of life. Household possessions tend to expand to fill the available square footage. Cities sprawl haphazardly until commute times become unbearable. Irrigation expands until the rivers are depleted. Life expands to the limit, always.
Even in a modern city this is rarely the case.
So I'm afraid the real answer is that webdev is just not mature yet.
It keeps getting worse.
When I do web app security assessments, I end up with a logfile of all requests/responses made during browsing a site.
The sizes of these logfiles have ballooned over the past few years, even controlling for site complexity.
Many megabytes of JS shit, images, etc. being loaded, and often without being cached properly (so they get reloaded every time).
A lot of it is first-party framework bloat (webdev active choices), but a lot is third-party bloat - all the adtech and other garbage that gets loaded every time (also without caching) for analytics and tracking.
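For the curious, that tally is easy to reproduce from such a log, assuming it can be exported as a standard HAR file. Rough sketch (field names follow the HAR spec; example.com is a placeholder for the site under assessment):

  import { readFileSync } from "node:fs";

  const har = JSON.parse(readFileSync("session.har", "utf8"));
  const firstParty = "example.com"; // the site actually being assessed

  let own = 0, third = 0;
  for (const entry of har.log.entries) {
    const size = Math.max(entry.response.bodySize, 0); // -1 means unknown
    const host = new URL(entry.request.url).hostname;
    if (host.endsWith(firstParty)) own += size;
    else third += size; // adtech, analytics, tag managers, ...
  }
  console.log(`first-party bytes: ${own}`);
  console.log(`third-party bytes: ${third}`);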
Economists and lawmakers have determined that the economic benefits of personalization accrue to the ad middlemen like Google, not to the publishers who have to encumber their sites with all the surveillance beacons, but the reality of the market is that publishers have no leverage. That said, most of those beacons are set with the async/defer attribute and should not have a measurable effect on page load speed.
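For reference, an async beacon is typically injected roughly like this, which is why it shouldn't block first paint (the analytics URL is a placeholder):

  // Dynamically injected scripts don't block the HTML parser; async makes
  // the intent explicit. Whether they still cost bandwidth is another matter.
  const beacon = document.createElement("script");
  beacon.src = "https://analytics.example.com/collect.js";
  beacon.async = true;
  document.head.appendChild(beacon);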
This is also amusing from a change management perspective at large organizations: want to tweak an Apache setting? Spend a month getting CAB approval and wait for a deployment window.
Here are two sources which I found useful at that time
Still, it's important that latency is top of mind for the designers of 5G, as it gets short shrift far too often.
From their FAQ/changelog:
> 19 Mar 2013: The default connection speed was increased from DSL (1.5 mbps) to Cable (5.0 mbps). This only affects IE (not iPhone).
There was another popular article on HN a while ago, claiming mobile websites had gotten slower since 2011. But actually HTTP Archive just started using a slower mobile connection in 2013. I wrote more about that issue with the HTTP Archive data at the time.
The Google data uses Largest Contentful Paint instead of Speed Index, but the two metrics ultimately try to measure the same thing. Both have pros and cons. Speed Index goes up if there are ongoing animations (e.g. sliders). LCP only looks at the single largest content element.
When looking at the real-user LCP data over time, keep in mind that changes are often due to changes in the LCP definition (e.g. opacity-0 elements used to count but don't any more). https://chromium.googlesource.com/chromium/src/+/master/docs...
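For reference, real-user LCP is typically collected in the browser with the standard PerformanceObserver API, roughly like this (the reporting endpoint is made up):

  new PerformanceObserver((list) => {
    const entries = list.getEntries();
    const last = entries[entries.length - 1]; // the latest candidate wins
    if (last) {
      navigator.sendBeacon("/rum/lcp", JSON.stringify({
        lcpMs: last.startTime,
        page: location.pathname,
      }));
    }
  }).observe({ type: "largest-contentful-paint", buffered: true });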
Streamlit is the framework I used to build the app at the bottom of the article. It does unfortunately load a decent amount of JS. However, it should be non-blocking, which means it won't interfere with how quickly you can see or use the page.
It pings back to Streamlit to keep your session state alive as it's running a whole Python interpreter on the backend for each session.
The speed index for this page hovers between 1-2 seconds when I test it.
I wouldn't do it like that, but to call that wasteful when many websites need 5MB of ad code before they let you see anything is a bit over the top!
The source for that is some stats from speedtest.net, which I assume is calculated from the users who used their speed test? So it's probably heavily skewed towards power users who have a fast connection and want to check if they are really getting what they are paying for. Most "casual" users with shitty DSL connections are happy if "the internet" works at all and are pretty unlikely to ever use this service...
What's baffling to me is how people love to spend seemingly infinite time playing with tech stacks and what not but then pay very little attention to basic details like what to load and how many resources do they really need.
But I see the image inside the network tab so bandwidth is getting wasted for no reason.
Reminds me of that time when we switched from writing by hand (and sometimes typing on a typewriter) all kinds of forms and reports to composing them on a computer and then printing it: initial time savings were pretty huge and so, naturally, the powers that be said "well, guess We can make you fill much more paperwork than you currently are filling" and did so. In the end, the amount of paperwork increased slightly out of proportion and we're now spending slightly more time on it than we used to. A sort of a law of conservation of effort, if you will.
1. Power on BBC
2. Type *EDIT[ENTER]
Though obviously not as fully featured as a modern word processor, or even some editors of the time.
We haven't progressed in speed, but versatility is through the roof compared to 10-20 years ago.
Or just stream it as video on YouTube and let Google pay for the storage :)
Do you think this is including ad loads? Ad networks run a real-time auction. It takes some time to collect bids so the highest can be chosen.
Developers will therefore always maximize resource usage as more becomes available.
FF always asks for an internet connection even though all the needed files are in the cache.
I'm on an internet diet (WiFi disabled for hours at a time), so bad behaviors are apparent.
Those with always-on 24/7 fiber may not notice.
In the comments I mostly see "because of js and adtech", but is there a factual analysis somewhere? How much is due to the recent massive deployment of https and related latency? Is it a problem of latency or bandwidth? What type of content is causing most of the waiting time? Images, CSS, JS? Just wondering.
I'm not sure what sort of analysis you're looking for, but different sites have different problems. I'm not convinced that averaging them makes sense, and it's very difficult to create a reasonable metric. As soon as you start measuring some specific metric, people will optimize for it whilst still making the site unusable. Unfortunately "usable state" is too hard to quantify without it being game-able.
> How much is due to the recent massive deployment of https and related latency
Very little, you can check here and see for yourself. Obviously it depends on your distance to the server you are reaching, but the added latency of HTTPS isn't really a factor when you're looking at 10s for a page to load.
> What type of content is causing most of the waiting time? Images, CSS, JS?
Let's look at CNN (only because I happen to remember that a lite version exists, thanks to someone on HN). The lite version loads entirely in 350ms for me. The normal version with adblock on finishes loading everything after 1.43s. The normal version with adblock off finishes loading after 20s and reflows a bunch of times as ads get loaded.
So I agree with the rest of the comments, it's "because of js and adtech".
Disclaimer: I'm in South Africa at the moment on 4g, my internet isn't the best :)
I don't know about a real analysis, but it's pretty easy to see the effect in your own browser. Disable JS and see how much faster the page loads.
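If you want numbers rather than a feeling, the comparison is easy to automate. A rough sketch with Puppeteer (assumes `npm install puppeteer`; pick whatever URL you like):

  import puppeteer from "puppeteer";

  async function timeLoad(url: string, jsEnabled: boolean): Promise<number> {
    const browser = await puppeteer.launch();
    const page = await browser.newPage();
    await page.setJavaScriptEnabled(jsEnabled);
    const start = Date.now();
    await page.goto(url, { waitUntil: "load" });
    const elapsed = Date.now() - start;
    await browser.close();
    return elapsed;
  }

  const url = "https://edition.cnn.com/";
  console.log("JS on: ", await timeLoad(url, true), "ms");
  console.log("JS off:", await timeLoad(url, false), "ms");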
Turns out it was hauling in 10MB of analytics scripts.
Oh to be a mere average computer user. I work with files that can still take some non-instant time after hitting save to complete. Conversely, it still takes some non-instant time to open said file. As long as there's such a thing as progress bars, count downs, spinning wheels, beach balls, etc, there is always room to make things faster.
And yet when you ask customers whether they want more features, or a faster program it's invariably features. Fix a bug or add a feature? Add a feature. Improve perf or add a feature? Add a feature.
My own tolerance for delays is tiny, but my "average users" seem to not suffer from it at all. I guess the reason is this: they know how much time something took before. They know that if this takes 10 seconds and it took them an hour to do on paper, that's quick. Meanwhile for me I'm in the IDE having a 200ms keystroke delay and I'm almost having a heart attack.
Of course this is what they want. To the user, the damn thing should have been working when you released it. To the user, you shouldn't have to ask for something that's broken, but is supposed to be a feature, to be fixed. If it's not working, why is it there in the first place? Fix it. Duh. But also add new features before your competitor does and takes your users. This is such a weird excuse in discussions about a progress bar.
25 years later, progress bars are no better: they steadily go to 15%, then stop there for a while, suddenly zoom to 80%, then slowly progress to 99%, then stop there for a long time.
But if you are making something that has multiple phases and you want to use a single progress bar, then that isn't going to progress nicely from 0 to 100; it's just impossible.
You'll either have to make an overall progress bar for the different phases, or just indicate that this is phase 4/5 and the progress of that phase is 82%. This solves the jumpy progress bar issue, but doesn't solve the underlying issue of actually showing an estimate of time remaining. You have no idea whether phase 5/5 is going to take 1 second or 1 minute.
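The "overall progress bar for the different phases" option is simple enough to sketch: give each phase a rough weight and map per-phase progress onto a single 0-100% bar. The weights below are pure guesses, which is exactly why the bar still jumps when the guesses are wrong:

  const phaseWeights = [0.05, 0.30, 0.40, 0.20, 0.05]; // must sum to 1

  function overallProgress(phase: number, phaseFraction: number): number {
    // Completed phases count fully, the current one proportionally.
    const done = phaseWeights.slice(0, phase).reduce((a, b) => a + b, 0);
    return 100 * (done + phaseWeights[phase] * phaseFraction);
  }

  // Phase 4 of 5 (index 3) at 82% of that phase:
  console.log(overallProgress(3, 0.82).toFixed(1) + "%"); // ≈ 91.4%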
When a progress bar stops at 99% it's just because some final task that was expected to not take very long, did. E.g. it could be deleting a file at the end of some install process, and that for some reason blocks. There is obviously no way of knowing the actual time that will take without trying it, at which point it's too late for the progress bar to adjust of course.
So: we'll have general artificial intelligence on smartwatches, before we have smooth multi-phase progress indication (because that's never).
So then why try it?
I would much rather see some sort of update like when you run "yum update" and it spits out 3/15, 11/15, etc. You don't have to show all of the lines scrolling; you could do something similar with a "\r" instead of a "\n" in the progress window, with an update that is very intuitive. You know how many steps you must take to accomplish the user's request. I don't need to see how long each step is taking per se, but if you showed me the total number of steps and each one ticking off, it would be much more useful than the typical blue bar race.
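Something like this, say in Node (the step names and the per-step delay are placeholders):

  // "\r" rewrites the same console line instead of scrolling.
  const steps = ["download", "verify", "unpack", "install", "clean up"];
  const doStep = (_name: string) => new Promise((r) => setTimeout(r, 500));

  for (let i = 0; i < steps.length; i++) {
    process.stdout.write(`\r[${i + 1}/${steps.length}] ${steps[i].padEnd(12)}`);
    await doStep(steps[i]);
  }
  process.stdout.write(`\rall ${steps.length} steps done.      \n`);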
During startup, the console would display the modules after it loaded them. People attributed failure to load the next module as a failure of the last one visible.
This ended with mup.sys being blamed for every failure.
Then why didn't they print a message before loading a module? This is really simple stuff, and would immediately show the problem. It's pretty normal practice when printing to logfiles in software: print messages both before and after an action is taken, so if it gets stuck in the action, you see that it never succeeded even though it was started.
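i.e. the boring wrap-everything pattern, sketched here (loadModule is hypothetical):

  // Log before and after, so a hang shows up as a "starting" line with
  // no matching "finished" line.
  async function logged<T>(label: string, action: () => Promise<T>): Promise<T> {
    console.log(`starting: ${label}`);
    const result = await action();
    console.log(`finished: ${label}`);
    return result;
  }

  // e.g. await logged("loading mup.sys", () => loadModule("mup.sys"));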
How is that obvious? Deleting a file doesn't just block for no reason. The OS (usually) or disk hardware (rarely) are the ones that decide to block it, and they have the information needed to know how long it will take.
What actually happens is that the advantages of showing a correct progress bar simply don't justify the complexity needed for it, and a decision to show a crappy incorrect progress bar is made.
Another example (from the real world this time): I make a progress bar for an iterative optimization of a design in a program. The number of iterations until the design completes can't be reliably estimated. It's not completely non-deterministic of course, but the complexity of making that estimation isn't just high, it's likely to be so high that it approaches the complexity of the optimization itself.
But the first example, git housekeeping, depends only on information readily available locally. git could just give this information to any calling program that needs it, but my guess is that nobody thinks that progress bars are worth it.
Actually, in today's world, I'd be happy if progress bars (or even indefinite loading indicators) would at least show me when the process has crashed and will never finish; instead they happily show "progress" that will never complete.
I like how you assume I can't recite that movie from memory.
But I'm using uBlock origin …
You aren't loading all the adtech webshit that makes up the majority of a page load.
Given that network-wide adblock/tracker blocking saves upwards of 60% of bandwidth (for web traffic) on the average network, it's pretty obvious where the problem is.
Latency: Part of the page loads, and then the page asks for more data. In some cases (such as loading a page hosted on another continent), this is bound by the speed of light.
Poor data access (database) code: Sometimes this is due to lazy or incompetent programmers, other times it's due to the fact that "not instant" is "good enough."
Writing a web page to load everything very quickly in a single request is surprisingly hard, and will often break modern and easily understood design patterns.
Served locally it takes about 20ms to load that React app and draw something on my screen. An HTML copy of Frankenstein (463kB) loaded from the same server loaded in 5ms. A plain HTML document three times the size of the toy React "app" loaded in a quarter of the time!
> Writing a web page to load everything very quickly in a single request is surprisingly hard, and will often break modern and easily understood design patterns.
I find this to be an odd take. This nearly 4,000 word CNN article including graphics weighs in at 70kB. The HTML alone including a bunch of inline styling (tables, font tags, etc) weighs only 55kB. With no graphics loaded it is still a perfectly readable article. Loading the HTML of that page in my above test it still loads in 5ms and is 100% readable.
Being able to render an enhanced markdown or JSON or YAML page in the browser without any generators would be phenomenal and resolve a lot of long-standing issues with structured data.
This is the same as saying "Despite larger pipes every year, water still doesn't reach your house any faster".
If a website transfers more data to render content, or worse, before starting to display any content, then bandwidth matters, as a fatter pipe moves that data more quickly.
If a website requires multiple round trips to complete a given request, then latency will also matter, and sets an absolute minimum floor to time-to-display (TTD) regardless of bandwidth. The higher your latency, the slower that process.
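Back of the envelope (numbers purely illustrative): the floor is just the number of sequential round trips times the RTT, no matter the bandwidth.

  function minTimeToDisplayMs(rttMs: number, sequentialRoundTrips: number): number {
    return rttMs * sequentialRoundTrips;
  }

  // DNS + TCP + TLS + HTML + a render-blocking CSS/JS fetch discovered late:
  console.log(minTimeToDisplayMs(100, 5)); // 500 ms floor on a 100 ms RTT link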
In theory it's possible for an SPA (single-page application) to be more responsive despite an overall larger page weight as it can incrementally request and present additional content. In practice such pages often perform worse on time to display due to both increased total data transfer and round-trip request requirements. A lightweight HTTP/2 HTML+CSS only site can be far more performant if it's based on static pages and request/load dynamics.
This is not optimized for "engagement" or whatever proxy for "money in my pocket", however, so it's pretty rare.
> Contrary to popular belief, the average car is not in fact that much more fuel-efficient than older cars. Still, to this day, the average vehicle has a range of between 20 and 30 miles per gallon; a stat which was very similar in the 1920s. But, why is this? Well, cars are a whole lot bigger.
More sites taking advantage of them at the same pace of technical development, cancelling them out.
Otherwise, a website from 10 years ago would simply load faster on today's connections.
Usage experience with typical specs of 2022 (a 500 GB SSD and 16 GB of RAM) is as good or as bad as with specs from 20 years back (say, a 20 GB HDD and 128 MB of RAM).
A startup could probably nuke reddit in a month if they just concentrate on performance and usability.
I would rephrase it:
Because of faster broadband every year, web pages don't load any faster