Many web sites have explicit terms of service. For example, here are the terms of service that "govern your use of certain New York Times digital products". They start with this clause:
1.1 If you choose to use NYTimes.com (the “Site”), NYT’s mobile sites and applications, any of the features of this site, including but not limited to RSS, API, software and other downloads (collectively, the "Services"), you will be agreeing to abide by all of the terms and conditions of these Terms of Service between you and The New York Times Company ("NYT", “us” or “we”).So, just by using the services of nytimes.com, the New York Times claims that I have agreed to a whole lot of legal terms and conditions. I didn't have to click a check-box agreeing to them, or do anything explicit. The terms and conditions are not on the front page itself, they're just linked from it. The link is hard to find, in faint type at the very bottom of the page, wedged blandly between "Privacy" and the eye-glazing "Terms of Sale."
Among the terms that I'm deemed to have agreed to are:
2.3 You may download or copy the Content and other downloadable items displayed on the Services for personal use only, ... Copying or storing of any Content for other than personal use is expressly prohibited ...So, if the Terms of Service apply, Web archives are clearly violating the terms of service. Interestingly, there is an exception:
5.2 ... THE SERVICES AND ALL DOWNLOADABLE SOFTWARE ARE DISTRIBUTED ON AN "AS IS" BASIS WITHOUT WARRANTIES OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, WITHOUT LIMITATION, WARRANTIES OF TITLE OR IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. YOU HEREBY ACKNOWLEDGE THAT USE OF THE SERVICES IS AT YOUR SOLE RISK.The New York Times claims not to be liable. Even if you thought "arguing with a man who buys ink by the barrel" was a good idea:
11.1 These Terms of Service have been made in and shall be construed and enforced in accordance with New York law. Any action to enforce these Terms of Service shall be brought in the federal or state courts located in New York City.Good luck with that.
So the interesting question is whether, in the absence of any explicit action on my (or an archive's crawler's) part, the terms of service bind me (or the archive)? Now, IANAL, and even actual lawyers appear to believe the answer isn't obvious. But writing on the Technology and Law blog a year ago, Venkat Balasubramani suggests that unless there is an explicit action indicating assent, the terms are unlikely to apply:
In place of the flawed browsewrap/clickwrap typology, we can use a simple non-overlapping typology for web interfaces: Category A is a click-through presentation where a user clicks while knowing that the click signals assent to the applicable terms; and Category B is everything else, which is not a contract.Let us assume for the moment that Balasubramani is correct and if there was no click-through the terms are not binding. In the good old days of Web archiving, this would mean there was no problem because the crawler would not have clicked the "I agree" box. But in today's Web, browser-based crawlers are clicking on things. Lots of things. In fact, they're clicking on everything they can find. Which might well be an "I agree" box. Lawyers will be able to argue whether the crawler clicked on it "knowing that the click signals assent to the applicable terms".
Making this assumption, Jefferson and I argued as follows. Suppose my, or the archive's, browser were configured to include in the HTTP request to nytimes.com, a Link header with "rel=license" pointing to the Terms of Service that apply to the services available from the requesting browser. The New York Times would have been notified of these terms far more directly than I had been of their terms by the faint type link at the bottom of the page that few have ever consciously clicked on. Thus, using exactly the same argument that the New York Times used to bind me to their terms, they would have been bound to my terms.
What's sauce for the goose is sauce for the gander. If an explicit action is required, archive crawlers that don't click on an "I agree" box are not bound by the terms. If no explicit action is required, only some form of notification, browsers and browser-based crawlers can bind websites to their terms by providing a suitable notification.
What Terms of Service would be appropriate for using my browser? Based on the New York Times' terms, perhaps they should include:
1.2 We may change, add or remove portions of these Terms of Service at any time, which shall become effective immediately upon posting. It is your responsibility to review these Terms of Service prior to each use of the Browser and by continuing to use this Browser, you agree to any changes.
1.4 We may change, suspend or discontinue any aspect of the Services at any time, including the availability of any Services feature, database, or content. We may also impose limits on certain features and services or restrict your access to parts or all of the Services without notice or liability.and:
4.1 You may not access or use, or attempt to access or use, the Services to take any action that could harm us or a third party. You may not access parts of the Services to which you are not authorized. You may not attempt to circumvent any restriction or condition imposed on your use or access, or do anything that could disable or damage the functioning or appearance of the Services,I.e. you and your advertising networks better not send us any malware. And, of course, we need the perennial favorite:
5.2 ... THE SERVICES AND ALL INFORMATION THEY CONTAIN ARE DISTRIBUTED ON AN "AS IS" BASIS WITHOUT WARRANTIES OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, WITHOUT LIMITATION, WARRANTIES OF TITLE OR IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. YOU HEREBY ACKNOWLEDGE THAT USE OF THE SERVICES IS AT YOUR SOLE RISK.A reverse EULA. Wouldn't you like to be able to do this?
So far, this may sound like a parody or a paranoid fantasy. But many online media companies have begun to target client-side browser information to police content delivery. Sites like Forbes, Wired, and maybe even (gasp) The New York Times are now disallowing access to their sites for those with ad-blocking browser add-ons:
We noticed you still have ad blocker enabled. By turning it off or whitelisting Forbes.com, you can continue to our site and receive the Forbes ad-light experience.It turns out that the "Forbes ad-light experience" includes free bonus malware!
that using an ad-blocker detector script is basically doing the same sort of thing as a cookie in terms of spying on client-side information within one's web browser, and a letter he received from the EU Commission apparently confirms his assertion.Thus running a script that collects information from an EU citizen's browser (which is what the vast majority do) apparently requires explicit permission. If Hanff's efforts succeed, anticipate European Web publishers going non-linear.
As the web has grown into a processing environment, it presumes a reciprocal interactivity, the parameters of which are still shifting and unbalanced. In the end the terms of this overall interplay of information exchange and license seem, as they so often do, inequitable. The future is here, it's just not evenly licensed. On one end, media and other corporate content sites target user browsers, inject (accidentally or via 3rd parties) potentially malicious scripts, monitor for plug-in screeners, install browsing trackers, analyze cookies and add all sorts of profiling and monitoring scripts, all generally without any explicit agreement on our part. On the other hand, we, simple users, often are presumed to agree to prolix legalese and verbose, obscure license agreements, all simply so we can read about people doing yoga with their dogs.