Tuesday, April 5, 2016

The Curious Case of the Outsourced CA

I took part in the Digital Preservation of Federal Information Summit, a pre-meeting of the CNI Spring Membership Meeting. Preservation of government information is a topic that the LOCKSS Program has been concerned with for a long time; my first post on the topic was nine years ago. In the second part of the discussion I had to retract a proposal I made in the first part that had seemed obvious. The reasons why the obvious was in fact wrong are interesting. The explanation is below the fold.

One major difficulty in preserving the Federal Web presence is finding it. The idea that all Federal websites live under .gov or .mil, or even that somewhere in the Federal government there is a complete, accurate and up-to-date list of them, is just wrong. How does a Web crawler know that some random site in .com, .org or .us is actually part of the Federal Web presence?

Connecting to GeoTrust
In a praiseworthy attempt to protect Federal websites from the all-powerful Chinese hackers and the dreaded terrorists, a decree has gone forth that, come December 31st this year, all of them must be HTTPS-only. An HTTPS website must have a certificate carrying a chain of signatures terminating in one from a root Certificate Authority (CA) that browsers trust. Here, for example, is the website of GeoTrust, a commercial CA that browsers trust. The certificate chain is:
  • www.geotrust.com is certified by:
  • GeoTrust Extended Validation SHA256 SSL CA, which is certified by:
  • GeoTrust Primary Certification Authority G3, which is a CA browsers trust.
The list of CAs that my Firefox trusts is here, all 188 of them. Note that because GeoTrust is in the list, it can certify its own website. I wrote about the issues around CAs back in 2013, notably that:
  • The browser trusts all of them equally.
  • The browser trusts CAs that the CAs on the list delegate trust to. Back in 2010, the EFF found more than 650 organizations that Internet Explorer and Firefox trusted.
  • Commercial CAs on the list, and CAs they delegate to, have regularly been found to be issuing false or insecure certificates.
Among the CAs on the list are agencies of many governments, such as the Dutch, Chinese, Hong Kong, and Japanese governments.
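
To make the acceptance rule concrete, here is a toy model in Python of the GeoTrust chain listed earlier. It is only a sketch: it compares names rather than verifying signatures, and the Cert type, the chain_is_trusted function and the one-entry trusted set are all hypothetical, introduced purely for illustration. A real browser's validation also checks signatures, expiry, revocation and name constraints.

```python
from typing import AbstractSet, NamedTuple, Sequence


class Cert(NamedTuple):
    """Toy certificate: just who it names and who signed it."""
    subject: str
    issuer: str


def chain_is_trusted(chain: Sequence[Cert], trusted_roots: AbstractSet[str]) -> bool:
    """Name-only check: each cert is issued by the next, and the last
    issuer is in the trusted list. No cryptography is performed."""
    if not chain:
        return False
    for cert, parent in zip(chain, chain[1:]):
        if cert.issuer != parent.subject:
            return False
    # Any root in the list is as good as any other: all are trusted equally.
    return chain[-1].issuer in trusted_roots


# The chain shown for www.geotrust.com, leaf first.
GEOTRUST_CHAIN = [
    Cert("www.geotrust.com", "GeoTrust Extended Validation SHA256 SSL CA"),
    Cert("GeoTrust Extended Validation SHA256 SSL CA",
         "GeoTrust Primary Certification Authority G3"),
]

# Stand-in for the browser's list; the real one has 188 entries.
TRUSTED = {"GeoTrust Primary Certification Authority G3"}
```

Because the check only asks whether the final issuer is somewhere in the trusted set, adding one misbehaving root to that set is enough to make chains ending there acceptable for any site.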

I assumed that the US government would be on the list too. My obvious idea was that government websites outside .gov and .mil could be found by crawling other domains looking for HTTPS sites whose certificate's signature chain ended at the US government root CA. This would solve a big problem for collecting and preserving Federal government information. Alas, I under-estimated the mania for outsourcing government functions to for-profit companies.
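
A minimal sketch of what such a crawler check might have looked like, using Python's standard ssl module. Several things here are assumptions: Python's getpeercert() exposes only the site's own (leaf) certificate, so the sketch looks at the leaf's issuer organization as a stand-in for walking the chain to its root; the GOV_ISSUER_ORGS set and its "U.S. Government" entry are hypothetical placeholders; and the function names are invented for illustration.

```python
import socket
import ssl


def name_field(rdns, key):
    """Return the first value of `key` in a getpeercert()-style name, or None.

    Names are tuples of RDNs, each a tuple of (key, value) pairs.
    """
    for rdn in rdns:
        for k, v in rdn:
            if k == key:
                return v
    return None


def issuer_org(cert):
    """Organization name of whoever signed this certificate, or None."""
    return name_field(cert.get("issuer", ()), "organizationName")


def fetch_leaf_cert(host, port=443, timeout=10.0):
    """Connect over TLS and return the verified leaf certificate as a dict."""
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()


# Hypothetical: issuer organizations a crawler would treat as "Federal".
GOV_ISSUER_ORGS = {"U.S. Government"}


def looks_federal(cert, gov_orgs=GOV_ISSUER_ORGS):
    """True if the certificate's issuer organization is in `gov_orgs`."""
    return issuer_org(cert) in gov_orgs
```

A crawler would call looks_federal(fetch_leaf_cert(host)) for each candidate host. The check is only as useful as the set of issuer names it is given, which is exactly where the idea runs into trouble.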

Connecting to the LoC website
As an example, visit http://www.loc.gov. Your browser will be redirected to https://www.loc.gov, the home page of the Library of Congress website. It will display a green padlock icon, showing that the connection is secure and the browser has verified the certificate upon which the connection's security depends. So far so good. Now click on the green padlock and reveal the details of this verification. The certificate chain looks like:
  • *.loc.gov is certified by:
  • Entrust Certification Authority - L1K, which is certified by:
  • Entrust Root Certification Authority G2, which is on the browser's trusted CA list.
What this means is that the Library of Congress is paying a commercial CA to reassure citizens that their website is what it claims to be, and is secure.

Connecting to the DHS website
If you visit http://www.dhs.gov/ you will be redirected to https://www.dhs.gov/ but, unlike at the Library of Congress site, you won't get the reassuring green padlock. There are two reasons:
  • The images in the page are delivered via HTTP, so:
    Your connection to this site is private, but someone on the network might be able to change the look of the page.
  • The browser doesn't like the HTTPS connection because:
    Your connection to www.dhs.gov is encrypted using an obsolete cipher suite.
But the browser has verified the signature chain. It looks like:
  • www.dhs.gov is certified by:
  • GeoTrust SSL CA - G3, which is certified by:
  • GeoTrust Global CA, which is on the browser's trusted CA list.
So the Department of Homeland Security is paying a different commercial CA to reassure citizens that their website is what it claims to be, and that it is not very secure.

Why is this? Is it because the Library of Congress believes that Entrust is more trustworthy than the US Government? I hope not; Entrust is one of the CAs whose delegated CAs have been caught issuing bogus certificates. It is because, as far as I can tell, the list of 188 CAs that browsers trust contains no CA controlled by the US Government.

So, your browser trusts the government of the People's Republic of China but not the government of the United States of America!

It isn't that the Federal government doesn't trust itself to run a secure root CA. There is a Federal root CA, the Common Policy Root CA, which is clearly regarded as secure since it is used to control access to Federal systems. But it isn't in the browser's list of trusted CAs, so it isn't any use for outward-facing services such as websites. If it were, Federal websites could be Federally certified, just as GeoTrust's websites are GeoTrust certified.

In what world does this make sense? One in which there's money to be made selling services to the Federal government. By failing to follow the example of other governments and put a root CA that it controls into the list, the government arranges for funds to flow to for-profit companies, who can protect the cash flow by lobbying and by arranging a warm welcome on the other side of the revolving door for the decision makers. All the for-profit CAs need to do is to make sure they stay in GSA's list of approved vendors, like DigiCert.

So, apart from the waste of taxpayer money, and the failure of my idea for finding government websites, what is the downside of this situation? CAs sometimes misbehave, as DigiNotar and StartSSL did. The result is a dispute between the guilty CAs and the browser vendors, resolved by removing them from the list, as StartSSL was, or by explicitly distrusting their root certificates, as DigiNotar's were. If this happened to one of the CAs that Federal websites use, a dispute to which the Feds were not a party would leave the websites using the guilty CA unavailable until the affected certificates could be replaced with new ones from a different CA in GSA's list. The browser vendors control the trusted CA list, so among other things they control citizens' access to government information. Since they're all based in the US, there would be good reasons why they'd be reluctant to remove a US government CA from the list.

I'm naturally reluctant to trust the Federal government, but I'm a whole lot more reluctant to trust for-profit CAs. It looks like I'm out of luck; the policy about public access to the Federal root CA is up on the Web:
Does the US government operate a publicly trusted certificate authority?

No, not as of early 2016, and this is unlikely to change in the near future.

The Federal PKI root is trusted by some browsers and operating systems, but is not contained in the Mozilla Trusted Root Program. The Mozilla Trusted Root Program is used by Firefox, as well as a wide variety of devices and operating systems. This means that the Federal PKI is not able to issue certificates for use in TLS/HTTPS that are trusted widely enough to secure a web service used by the general public.

The Federal PKI has an open application to the Mozilla Trusted Root Program. However, even if the Federal PKI’s application is accepted, it will take a significant amount of time for the Federal PKI’s root certificate to actually be shipped onto devices and propagate widely around the world.