Source: Frederic Filloux
When reading this 800-word Guardian story, about half a page of text long, your web browser loads the equivalent of 55 pages of HTML code, almost half a million characters. To be precise: an article of 757 words (4667 characters and spaces), requires 485,527 characters of code ... “useful” text (the human-readable article) weighs less than one percent (0.96%) of the underlying browser code. The rest consists of links (more than 600) and scripts of all types (120 references), related to trackers, advertising objects, analytics, etc.
But he ends on a somewhat less despairing note. Follow me below the fold for a faint ray of hope.
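The percentage Filloux quotes is easy to check. The sketch below only redoes his arithmetic from the two character counts given above; the function and variable names are mine, not his.

```python
# Character counts quoted from Filloux's measurement of the Guardian story.
ARTICLE_CHARS = 4_667    # human-readable text, characters and spaces
PAGE_CHARS = 485_527     # total characters of code delivered to the browser


def usefulness_rate(text_chars: int, code_chars: int) -> float:
    """Fraction of the delivered characters that are readable text."""
    return text_chars / code_chars


rate = usefulness_rate(ARTICLE_CHARS, PAGE_CHARS)
# Characters of everything else delivered per character of article text.
overhead_per_char = (PAGE_CHARS - ARTICLE_CHARS) / ARTICLE_CHARS

print(f"usefulness rate: {rate:.2%}")                         # 0.96%
print(f"overhead per text char: {overhead_per_char:.0f}")     # 103
```

Put the other way round, the browser fetches roughly 103 characters of links, scripts and markup for every character of the article.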
In due fairness, this cataract of code loads very fast on a normal connection.
His "normal" connection must be much faster than my home's 3Mbit/s DSL. But then the hope kicks in:
some HTML tags are replaced with AMP-specific tags (see also HTML Tags in the AMP spec). These custom elements, called AMP HTML components, make common patterns easy to implement in a performant way.
Finally, Google supports the use of AMP with a proxy cache that:
fetches AMP HTML pages, caches them, and improves page performance automatically. When using the Google AMP Cache, the document, all JS files and all images load from the same origin that is using HTTP 2.0 for maximum efficiency.
The cache also validates the pages it caches, confirming that:
the page is guaranteed to work, and that it doesn't depend on external resources. The validation system runs a series of assertions confirming the page’s markup meets the AMP HTML specification.
Source: Frederic Filloux
As an admittedly biased reference point, I took one of the first texts, World Wide Web Summary, written in HTML by its inventor Tim Berners-Lee. Published in 1991, it probably is one of the purest, most barebones forms of hypertext markup language: less than 4200 characters of readable text for less than 4600 characters of code. That’s a 90% usefulness rate as shown in the table below (you can also refer to my original Google Sheet here, to get precise numbers, story URLs and formulae).
The table (click on the image above) is interesting for the wide range of "usefulness rate", from 91% to 1%:
The big surprise (at least for me) comes from the Progressive Web App implemented by the Washington Post. The Plain HTML page offers roughly the same content as the PWA version, but with a huge gain in HTML size.
The Washington Post PWA page uses less than one-tenth as many bytes to deliver equivalent content. That's double the improvement The Guardian got with AMP. Progressive Web Apps are a technique, created by Google about a year ago, for building Web pages that use local storage, service workers and asynchronous behavior to provide app-like user experiences:
Google is just starting to promote the PWA on a large scale and the tools are already available. ... Because it supports Push notifications and other features until now reserved to native apps, PWA has great potential for publishers.
It is clearly capable of impressive performance gains, with only about 1 byte of crud for each byte of content. Filloux is equivocal about the prospects for AMP and PWA. Although Google has ways of punishing sites that don't get with the program, I'm more pessimistic. The tools people use to generate their pages emit HTML that is just brain-dead (e.g. the same enormous <div> specification on adjacent phrases). Only people who simply don't care could put out stuff like this.
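For anyone who wants to try this on their own pages, here is a minimal sketch of how a "usefulness rate" might be approximated with Python's standard html.parser module. It is not Filloux's method: he counted only the article's own text, whereas this version counts every character of text outside script, style, noscript and template elements (navigation, captions and link text included), so it will overstate the rate. The names VisibleText and measure_usefulness are hypothetical.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collects text that is not inside script/style-like elements."""

    # Assumption: these are the elements whose contents are never shown.
    SKIP = {"script", "style", "noscript", "template"}

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.parts: list[str] = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are outside every skipped element.
        if self._skip_depth == 0:
            self.parts.append(data)


def measure_usefulness(html: str) -> float:
    """Visible-text characters divided by total characters of markup."""
    if not html:
        return 0.0
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace so indentation is not counted as text.
    text = " ".join("".join(parser.parts).split())
    return len(text) / len(html)


sample = (
    "<html><head><script>var x = 1;</script></head>"
    "<body><p>Hello world</p></body></html>"
)
print(f"{measure_usefulness(sample):.1%}")
```

Feeding it the saved source of a page gives a quick upper bound on the rate; isolating the article body itself would need site-specific selectors.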
[Updated to correct ungrammatical sentence]