Tuesday, February 27, 2018

"Nobody cared about security"

There's a common meme that ascribes the parlous state of security on the Internet to the fact that in the ARPAnet days "nobody cared about security". It is true that in the early days of the ARPAnet security wasn't an important issue; everybody involved knew everybody else face-to-face. But it isn't true that the decisions taken in those early days hampered the deployment of security as the Internet took the shape we know today in the late 80s and early 90s. In fact the design decisions taken in the ARPAnet days made the deployment of security easier. The main reason for today's security nightmares is quite different.

I know because I was there, and to a small extent involved. Follow me below the fold for the explanation.

Thursday, February 22, 2018

Brief Talk at Video Game Preservation Workshop

I was asked to give a brief talk to the Video Game Preservation Workshop: Setting the Stage for Multi-Partner Projects at the Stanford Library, discussing the technical and legal aspects of cooperation on preserving software via emulation. Below the fold is an edited text of the talk with links to the sources.

Tuesday, February 20, 2018

Notes from FAST18

I attended the technical sessions of Usenix's File And Storage Technology conference this week. Below the fold, notes on the papers that caught my attention.

Thursday, February 15, 2018

Do You Need A Blockchain?

David Gerard's Do you need a Blockchain? Probably less than Wüst and Gervais think you do reviews an interesting paper, Do you need a Blockchain? by Karl Wüst and Arthur Gervais of ETH Zurich. Their abstract says:
In this article we critically analyze whether a blockchain is indeed the appropriate technical solution for a particular application scenario. We differentiate between permissionless (e.g., Bitcoin/Ethereum) and permissioned (e.g. Hyperledger/Corda) blockchains and contrast their properties to those of a centrally managed database.
Gerard is, for him, pretty enthusiastic about the paper:
This paper is worth your time. They explain the jargon at length, and discuss many commonly-advocated blockchain use cases — it’s a useful survey of the area — even as the authors are huge Bitcoin and blockchain advocates, and somewhat more optimistic for applying blockchains than is really warranted.
Below the fold, I look at both the paper and Gerard's review.

Wednesday, February 14, 2018

Preserving Government Information

Spotlight on Digital Government Information Preservation: Examining the Context, Outcomes, Limitations, and Successes of the DataRefuge Movement  by and examines the issues around preserving access to government information through the lens of the DataRefuge movement. Below the fold, some commentary.

Tuesday, February 13, 2018

Correlated Cryptojacking

On February 11 at least 4,275 Web sites were found to have been simultaneously cryptojacked:
they include The City University of New York (cuny.edu), Uncle Sam's court information portal (uscourts.gov), Lund University (lu.se), the UK's Student Loans Company (slc.co.uk), privacy watchdog The Information Commissioner's Office (ico.org.uk) and the Financial Ombudsman Service (financial-ombudsman.org.uk), plus a shedload of other .gov.uk and .gov.au sites, UK NHS services, and other organizations across the globe.

Manchester.gov.uk, NHSinform.scot, agriculture.gov.ie, Croydon.gov.uk, ouh.nhs.uk, legislation.qld.gov.au, the list goes on.
They were all running Coinhive's Monero miner in visitors' browsers. How and why did this happen and what should these sites have been doing to prevent it? Follow me below the fold.

Monday, February 12, 2018

Lessons From Arquivo.pt

Daniel Gomes' video
I'd like to draw your attention to Daniel Gomes excellent video entitled Improving the robustness of the Arquivo.pt web archive.

Arquivo.pt is the Portuguese Web Archive. It got started in 2007, and in 2010 was an early archive to support full-text search. In 2013 it suffered a hardware malfunction that took the service down and lost 17% of its content. This led to a complete re-think of the system architecture, implementation, and operations. Daniel describes this process and the encouraging results in detail. It is well worth the 20 minutes to watch it.

Daniel divides the re-think into 5 major sections:
  1. Hardware and software architecture shifted to shared-nothing
  2. Reinforced replication policies
  3. Monitor the service
  4. Quality assurance for software development
  5. Document and test procedures
I'd agree with all these points. Many of the details correspond to things the LOCKSS Program focused on during preparation for the TRAC audit of the CLOCKSS Archive in 2014. This is especially the case for the last of Daniel's sections; the audit forced us to document our processes, which forced us to think about whether they were actually achieving their goals, which led to the discovery that in a number of cases they weren't.

Thursday, February 8, 2018

Meta: Blog Switched To HTTPS (Updated)

Because From July, Chrome will name and shame insecure HTTP websites I followed the instructions Hamad Ansari provides in Blogger Released Free SSL (HTTPS) For Custom Domains and enabled both "connections over HTTPS" and "HTTPS redirect", so that:
gets redirected to:
Everything I've tried so far works. Please comment on this post if you find things that don't work.

Update: Scott Helme points out that I'm just part of an encouraging trend. The graph shows the top million sites from Alexa in groups of 4,000. For each group, it shows the number of sites that are HTTPS (only, I believe). It shows that the pace of sites going HTTPS-only is increasing. The effect of Chrome's naming and shaming will presumably increase the rate of adoption further in July.

Tuesday, February 6, 2018

DNA's Niche in the Storage Market

I've been writing about storing data in DNA for the last five years, both enthusiastically about DNA's long-term prospects as a technology for storage, and pessimistically about its medium-term prospects. This time, I'd like to look at DNA storage systems as a product, and ask where their attributes might provide a fit in the storage marketplace.

As far as I know no-one has ever built a storage system using DNA as a medium, let alone sold one. Indeed, the only work I know on what such a system would actually look like is by the team from Microsoft Research and the University of Washington. Everything below the fold is somewhat informed speculation. If I've got something wrong, I hope the experts will correct me.