Browser Architecture

Kay's basic argument is that designers of user interface infrastructure typically cannot predict all the requirements that will be placed upon it. At some point, as we see with the Web, programmability will evolve, so it had better be designed in from the start. His main point is summarized at the end of the post:
Key Point: “sending a program, not a data structure” is a very big idea (and also scales really well if some thought is put into just how the program is set up).

The starting point that leads him to this conclusion is a contrast between his recommendations at the inception of the Web:
I made several recommendations — especially to Apple where I and my research group had been for a number of years — and generally to the field. These were partially based on the scope and scalings that the Internet was starting to expand into.

And where the Web has ended up:
- Apple’s Hypercard was a terrific and highly successful end-user authoring system whose media was scripted, WYSIWYG, and “symmetric” (in the sense that the “reader” could turn around and “author” in the same high-level terms and forms). It should be the start of — and the guide for — the “User Experience” of encountering and dealing with web content.
- The underlying system for a browser should not be that of an “app” but of an Operating System whose job would be to protectively and safely run encapsulated systems (i.e. “real objects”) gotten from the web. It should be the way that web content could be open-ended, and not tied to functional subsets in the browser.
One way to look at where things are today is that the circumstances of the Internet forced the web browsers to be more and more like operating systems, but without the design and the look-aheads that are needed.

Kay sees this as a failure of imagination:
This was all done after — sometimes considerably after — much better conceptions of what the web experience and powers should be like. It looks like “a hack that grew”, in part because most users and developers were happy with what it did do, and had no idea of what else it *should do* (and especially the larger destinies of computer media on world-wide networks).
- There is now a huge range of conventions both internally and externally, and some of them require and do use a dynamic language. However, neither the architecture of this nor the form of the language, or the forms of how one gets to the language, etc. are remotely organized for the end-users. The thresholds are ridiculous when compared to both the needs and the possibilities.
- There is now something like a terribly designed OS that is the organizer and provider of “features” for the non-encapsulated web content. This is a disaster of lock-in, and with actually very little bang for the buck.
let me use “Licklider’s Vision” from the early 60s: “the destiny of computing is to become interactive intellectual amplifiers for all humanity pervasively networked worldwide”.

He uses the example of the genesis of PostScript to make the point about programmability:
This doesn’t work if you only try to imitate old media, and especially the difficult to compose and edit properties of old media. You have to include *all media* that computers can give rise to, and you have to do it in a form that allows both “reading” and “writing” and the “equivalent of literature” for all users.
Examples of how to do some of this existed before the web and the web browser, so what has happened is that a critically weak subset has managed to dominate the imaginations of most people — including computer people — to the point that what is possible and what is needed has for all intents and purposes disappeared.
Several of the best graphics people at Parc created an excellent “printing standard” for how a document was to be sent to the printer. This data structure was parsed at the printer side and followed to set up printing.

I have worked on both a window system that sent "a data structure" and one that sent "a program" and, while I agree with Kay that in many cases sending a program is a very good idea, the picture is more complicated than he acknowledges.
But just a few weeks after this, more document requirements surfaced and with them additional printing requirements.
This led to a “sad realization” that sending a data structure to a server is a terrible idea if the degrees of freedom needed on the sending side are large.
And eventually, this led to a “happy realization”, that sending a program to a server is a very good idea if the degrees of freedom needed on the sending side are large.
John Warnock and Martin Newell were experimenting with a simple flexible language that could express arbitrary resolution independent images — called “JAM” (for “John And Martin”) — and it was realized that sending JAM programs — i.e. “real objects” — to the printer was a much better idea than sending a data structure.
This is because a universal interpreter can both be quite small and also can have more degrees of freedom than any data structure (that is not a program). The program has to be run in a protected address space in the printer computer, but it can be granted access to a bit-buffer, and whatever it does to it can then be printed out “blindly”.
This provides a much better match up between a desktop publishing system (which will want to print on any of the printers available, and shouldn’t have to know about their resolutions and other properties), and a printer (which shouldn’t have to know anything about the app that made the document).
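Kay's point about interpreters and degrees of freedom can be sketched in a few lines. This is a hypothetical toy, not JAM or PostScript: the "printer" knows nothing about documents, only how to run a small stack-based program against a bit-buffer.

```python
# A toy illustration (not actual JAM/PostScript) of "sending a program":
# the receiver is a tiny stack-based interpreter that marks a bit-buffer.

def run(program, width, height):
    """Interpret a postfix drawing language into a width-by-height bit-buffer."""
    buf = [[0] * width for _ in range(height)]
    stack = []
    for token in program.split():
        if token == "setpixel":            # usage: x y setpixel
            y, x = stack.pop(), stack.pop()
            buf[y][x] = 1
        elif token == "hline":             # usage: x y len hline
            n, y, x = stack.pop(), stack.pop(), stack.pop()
            for i in range(n):
                buf[y][x + i] = 1
        else:                              # anything else is a number
            stack.append(int(token))
    return buf

# The sender composes whatever it likes as a program; the receiver need not
# know about fonts, resolutions, or document formats -- it just interprets.
page = run("1 1 setpixel 0 2 4 hline", width=8, height=4)
```

Because the receiver is an interpreter, new kinds of marks require no change on the printer side; the sender just writes a different program.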
NeWS & Pie Menus
I liked NEWS as far as it went. I don’t know why it was so cobbled together — Sun could have done a lot more. For example, the scalable pixel-independent Postscript imaging model, geometry and rendering was a good thing to try to use (it had been used in the Andrew system by Gosling at CMU) and Sun had the resources to optimize both HW and SW for this.

Kay is wrong that the Andrew window system that Gosling and I built at C-MU used PostScript. At that time the only implementation of PostScript that we had access to was in the Apple LaserWriter. It used the same Motorola 68K technology as the Sun workstations on our desks, and its rendering speed was far too slow to be usable as a graphical user interface. Gosling announced that he was going to Sun and told me he planned to build a PostScript-based window system. I thought it was a great idea in theory but was skeptical that it would be fast enough. It wasn't until Gosling showed me PostScript being rendered on a Sun/1's screen at lightning speed by an early version of SunDew that I followed him to Sun to work on what became NeWS.
But Postscript was not well set up to be a general programming language, especially for making windows oriented frameworks or OS extensions. And Sun was very intertwined with both “university Unix” and C — so not enough was done to make the high-level part of NEWS either high-level enough or comprehensive enough.
A really good thing they should have tried is to make a Smalltalk from the “Blue Book” and use the Postscript imaging model as a next step for Bitblt.
Also, Hypercard was very much in evidence for a goodly portion of the NEWS era — somehow Sun missed its significance.
It is true that vanilla PostScript wasn't a great choice for a general programming language. But Owen Densmore (with my help) leveraged PostScript's fine-grained control over name resolution to build an object-oriented programming environment for NeWS that was, in effect, a Smalltalk-like operating system, with threads and garbage collection. It is described in Densmore's 1986 Object-Oriented Programming in NeWS, and Densmore's and my 1987 A User‐Interface Toolkit in Object‐Oriented PostScript.
As regards Hypercard, Don Hopkins pointed out that:
Outside of Sun, at the Turing Institute in Glasgow, Arthur van Hoff developed a NeWS based reimagination of HyperCard in PostScript, first called GoodNeWS, then HyperNeWS, and finally HyperLook. It used PostScript for code, graphics, and data (the axis of eval).
Like HyperCard, when a user clicked on a button, the Click message could delegate from the button, to the card, to the background, then to the stack. Any of them could have a script that handled the Click message, or it could bubble up the chain. But HyperLook extended that chain over the network by then delegating to the NeWS client, sending Postscript data over a socket, so you could use HyperLook stacks as front-ends for networked applications and games, like SimCity, a cellular automata machine simulator, a Lisp or Prolog interpreter, etc. SimCity, Cellular Automata, and Happy Tool for HyperLook (nee HyperNeWS (nee GoodNeWS))
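Hopkins' delegation chain is easy to sketch. The names below are illustrative rather than HyperLook's actual API; the point is that a message bubbles from button to card to background to stack until some part handles it.

```python
# A sketch (hypothetical names, not HyperLook's API) of HyperCard-style
# message delegation: the first part in the chain with a handler wins.

class Part:
    def __init__(self, name, parent=None, handlers=None):
        self.name = name
        self.parent = parent              # next link in the delegation chain
        self.handlers = handlers or {}    # message name -> handler function

    def send(self, message):
        part = self
        while part is not None:
            handler = part.handlers.get(message)
            if handler:
                return handler(part)
            part = part.parent            # "bubble up" the chain
        return None                       # message went unhandled

stack = Part("stack", handlers={"Click": lambda p: f"{p.name} handled Click"})
background = Part("background", parent=stack)
card = Part("card", parent=background)
button = Part("button", parent=card)

result = button.send("Click")  # delegates all the way up to the stack
```

HyperLook's extension was that the last link in such a chain could sit on the other side of a network socket.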
Hopkins also pointed Kay to Densmore's Object-Oriented Programming in NeWS. Kay was impressed:
This work is so good — for any time — and especially for its time — that I don’t want to sully it with any criticisms in the same reply that contains this praise.

I was impressed at the time by the simplicity and power of the programming environment that the combination of threads, messages and control over name resolution enabled. Densmore and I realized that the Unix shell's PATH variable could provide exactly the same control over name resolution that PostScript's dictionaries did, so, in the space (as I recall) of an afternoon, we ported the PostScript object mechanism to the shell to provide a fully object-oriented shell programming environment.
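The trick Densmore exploited can be illustrated in miniature. PostScript resolves every name by searching a stack of dictionaries from the top down, so placing an instance dictionary above a class dictionary turns plain name lookup into method dispatch. This sketch uses hypothetical names, not Densmore's actual mechanism:

```python
# A miniature of the idea behind Object-Oriented PostScript: method lookup
# is just name resolution through a stack of dictionaries, searched from
# the most recently pushed (the instance) down to the class.

def resolve(name, dict_stack):
    """Search the dictionary stack top-down, as PostScript name lookup does."""
    for d in reversed(dict_stack):
        if name in d:
            return d[name]
    raise NameError(name)

class_dict = {"greet": lambda self: f"hello from {self['name']}"}
instance = {"name": "window-1"}

# "Sending a message" = resolving the name with the instance's dictionaries
# pushed above the class's, so instances can shadow class methods.
dict_stack = [class_dict, instance]
greeting = resolve("greet", dict_stack)(instance)
```

The shell port rested on the analogous fact that PATH lookup searches directories in order, much as PostScript searches its dictionary stack.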
I will confess to not knowing about most of this work until your comments here — and this lack of knowledge was a minus in a number of ways wrt some of the work that we did at Viewpoints since ca 2000.
My Two Cents Worth

As you can see, NeWS essentially implemented the whole of Kay's recommendations, up to and including HyperCard. And yet it failed in the marketplace, whereas the X Window System has been an enduring success over the last three decades. Political and marketing factors undoubtedly contributed to this. But as someone who worked on both systems, I now believe that, even in the absence of these factors, X would still have won out for the following technical reasons:
One of the great realizations of the early Unix was that the *kernel* of an OS — and essentially the only part that should be in “supervisor mode” — would only manage time (quanta for interleaved computations) and space (memory allocation and levels) and encapsulation (processes) — everything else should be expressible in the general vanilla processes of the system.

Fundamentally, X only managed time (by interleaving rendering operations on multiple windows), space (by virtualizing the framebuffer) and encapsulation (by managing the overlapping of multiple virtualized framebuffers or windows, and managing inter-window communication such as cut-and-paste).
It is important to note the caveat in Kay's assertion that:
sending a data structure to a server is a terrible idea if the degrees of freedom needed on the sending side are large.

The reason X was successful despite "sending a data structure" was that the framebuffer abstraction meant that the "degrees of freedom needed on the sending side" weren't large. BitBlt and, later, an alpha channel allowed everything else to be "expressible in the general vanilla processes of the system". Thus X can be viewed as conforming to Kay's recommendations just as NeWS did.
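To make concrete how small those degrees of freedom were: BitBlt is essentially a rectangle copy (optionally combined with a raster operation) between pixel arrays. A minimal sketch, with only the plain copy operation and none of X's actual API:

```python
# A minimal BitBlt sketch (plain copy only; real BitBlt also supports
# raster operations like AND/OR/XOR). The server's vocabulary is small:
# it moves rectangles of pixels, and clients decide what the pixels mean.

def bitblt(src, dst, sx, sy, dx, dy, w, h):
    """Copy a w-by-h rectangle of pixels from src to dst (2D lists)."""
    for row in range(h):
        for col in range(w):
            dst[dy + row][dx + col] = src[sy + row][sx + col]

src = [[1, 2],
       [3, 4]]
dst = [[0] * 4 for _ in range(4)]
bitblt(src, dst, sx=0, sy=0, dx=1, dy=1, w=2, h=2)
# dst now holds the 2x2 source block at offset (1, 1)
```

In the framing above, everything richer than this could be computed into pixels elsewhere, keeping the server's vocabulary small.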
- The PostScript rendering model is designed for an environment with enough dots-per-inch that the graphic designer can ignore the granularity of the display. In the 80s, and still today to a lesser extent, this isn't the case for dynamic displays. Graphic design for displays sometimes requires the control over individual pixels that PostScript obscures. Display PostScript, which was effectively the NeWS rendering model without the NeWS operating system, also failed in the marketplace partly for this reason.
- With the CPU power available in the mid 80s, rendering PostScript even at display resolutions fast enough to be usable interactively required Gosling-level programming skills from the implementer. It was necessary to count clock cycles for each instruction in the inner loops, and to understand the effects of the different latencies of main memory and the framebuffer. Porting NeWS was much harder than porting X, which only required implementing BitBlt. Of course, this too rewarded programming skill, but it was also amenable to hardware implementation in a way that PostScript wasn't in those days. So X had a much easier deployment.
- The lack of CPU power in those days also meant there was deep skepticism about the performance of interpreters in general, and in the user interface in particular. Mention "interpreted language" and what sprang to mind was BASIC or UCSD Pascal, neither regarded as fast.
- Similarly, X applications were written in single-threaded C using a conventional library of routines. This was familiar territory for most programmers of the day. Not so the object-oriented, massively multi-threaded NeWS environment with its strange "reverse-polish" syntax. At the time these were the preserve of truly geeky programmers.
This is all at the window system level, but I believe similar arguments apply at the level Kay is discussing, the Web. For now, I'll leave filling out the details as an exercise for the reader.
Early Window Systems

Hopkins also pointed Kay to Methodology of Window Management, the record of a workshop in April 1985, because it contains A Window Manager for Bitmapped Displays and Unix, a paper by Gosling and me about the Andrew window system, and SunDew - A Distributed and Extensible Window System, Gosling's paper about SunDew.
The workshop also featured Warren Teitelman's Ten Years of Window Systems - A Retrospective View. Later, I summarized my view of the early history in this comment to /.:
There were several streams of development which naturally influenced each other. Broadly:
- Libraries supporting multiple windows from one or more threads in a single address space, starting from Smalltalk and leading to the Mac and Windows environments.
- Kernel window systems supporting access to multiple windows from multiple address spaces on a single machine, starting with the Blit and leading to SunWindows and a system for the Whitechapel MG-1.
- Network window systems supporting access to multiple windows from multiple address spaces on multiple machines via a network, starting from work at PARC by Bob Sproull & (if memory serves) Elaine Sonderegger, leading to Andrew, SunDew which became NeWS, and W which became X.
Windows didn't start with Smalltalk. The first *real* windowing system I know of was ca 1962, in Ivan Sutherland's Sketchpad (as with so many other firsts). The logical "paper" was about 1/3 mile on a side and the system clipped, zoomed, and panned in real time. Almost the same year -- and using much of the same code -- "Sketchpad III" had 4 windows showing front, side, top, and 3D view of the object being made. These two systems set up the way of thinking about windows in the ARPA research community. One of the big goals from the start was to include the ability to do multiple views of the same objects, and to edit them from any view, etc.

The paper to which Kay refers is A clipping divider by Bob Sproull & Ivan Sutherland.
When Ivan went ca 1967 to Harvard to start on the first VR system, he and Bob Sproull wrote a paper about the general uses of windows for most things, including 3D. This paper included Danny Cohen's "mid-point algorithm" for fast clipping of vectors. The scheme in the paper had much of what later was called "Models-Views-and-Controllers" in my group at Parc. A view in the Sutherland-Sproull scheme had two ends (like a telescope). One end looked at the virtual world, and the other end was mapped to the screen. It is fun to note that the rectangle on the screen was called a "viewport" and the other end in the virtual world was called "the window". (This got changed at Parc, via some confusions demoing to Xerox execs).
In 1967, Ed Cheadle and I were doing "The Flex Machine", a desktop personal computer that also had multiple windows (and Cheadle independently developed the mid-point algorithm for this) -- our viewing scheme was a bit simpler.