Tuesday, January 31, 2012

The 5 Stars of Online Journal Articles

David Shotton, another participant in last summer's Dagstuhl workshop on Future of Research Communications, has an important article in D-Lib entitled The Five Stars of Online Journal Articles — a Framework for Article Evaluation. By analogy with Tim Berners-Lee's Five Stars of Linked Open Data, David suggests assessing online articles against five criteria:
  • peer review
  • open access
  • enriched content
  • available datasets
  • machine-readable metadata
For each criterion, he provides a five-point scale. For example, the open access scale goes from 0 for no open access to 4 for Creative Commons licensing. The full article is well worth a read, especially for David's careful explanation of how each point on each criterion's scale affects the usefulness of the content.

The article concludes by applying the evaluation to a number of articles (including itself). In this spirit, here is my evaluation of our SOSP '03 paper:
  • peer review: 2 - Responsive peer review
  • open access: 1 - Self-archiving green/gratis open access
  • enriched content: 1 - Active Web links
  • available datasets: 1 - Supplementary information files available
  • machine-readable metadata: 1 - Structural markup available
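
An evaluation like this is easy to record as data. The sketch below is only an illustration: the criterion names and the 0 to 4 range come from David's article as described above, but the `validate` and `total_score` helpers are hypothetical, and summing the five scores is my own convenience, not something the framework prescribes.

```python
# Hypothetical helpers for recording a five-star evaluation.
# Criteria and the 0..4 range follow the post; everything else is illustrative.
CRITERIA = (
    "peer review",
    "open access",
    "enriched content",
    "available datasets",
    "machine-readable metadata",
)
MAX_SCORE = 4  # each criterion is scored on a five-point scale, 0 through 4


def validate(scores):
    """Raise ValueError unless every criterion has an integer score in 0..4."""
    for name in CRITERIA:
        if name not in scores:
            raise ValueError(f"missing criterion: {name}")
        value = scores[name]
        if not isinstance(value, int) or not 0 <= value <= MAX_SCORE:
            raise ValueError(f"score for {name} must be an integer in 0..{MAX_SCORE}")


def total_score(scores):
    """Sum the five scores (an assumption; the framework reports them separately)."""
    validate(scores)
    return sum(scores[name] for name in CRITERIA)


# The evaluation of the SOSP '03 paper given in the list above.
sosp03 = {
    "peer review": 2,
    "open access": 1,
    "enriched content": 1,
    "available datasets": 1,
    "machine-readable metadata": 1,
}

if __name__ == "__main__":
    print(total_score(sosp03), "out of", MAX_SCORE * len(CRITERIA))
```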

Friday, January 27, 2012

Friday, January 20, 2012

Mass-Market Scholarly Communication Revisited

The very first post to this blog in 2007 was entitled "Mass-Market Scholarly Communication". Its main point was:
Blogs are bringing the tools of scholarly communication to the mass market, and with the leverage the mass market gives the technology, may well overwhelm the traditional forms.
Now, "Annotum: An open-source authoring and publishing platform based on WordPress" is proving me a prophet.

It was developed based on experience with PLOS Currents, a rapid publishing journal hosted at Google. After a detailed review of the alternatives, the developers decided to implement Annotum as a WordPress theme providing the capabilities needed for journal publishing, such as multiple authors, strict adherence to JATS (the successor to the NLM DTD), tables, figures, equations, references and review. The leverage of mass-market publishing technology is considerable. The paper describing Annotum is well worth a read.

Wednesday, December 28, 2011

Adding cloud storage to the economic model

The next stage in building the economic model of long-term storage is to add the ability to model cloud storage, and to use it to investigate the circumstances under which it is cheaper than local storage. The obvious first step is to collect historical data on cloud storage prices, to compare how rapidly they are decreasing against the Kryder's Law decrease in disk cost. The somewhat surprising results from looking at Amazon S3's price history are below the fold. I'd be grateful if anyone could save me the trouble of getting equivalent price histories for other cloud storage providers.
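
To compare two price histories on the same footing, one option is to reduce each to a compound annual rate of decrease. The sketch below assumes only a start price, an end price and the elapsed time; the function name `annual_decrease` is hypothetical and the prices in the usage line are made-up placeholders, not Amazon's or any disk vendor's actual figures.

```python
def annual_decrease(start_price, end_price, years):
    """Compound annual rate of price decrease, as a fraction.

    Solves end_price = start_price * (1 - rate) ** years for rate.
    A negative result means the price rose over the period.
    """
    if start_price <= 0 or end_price < 0 or years <= 0:
        raise ValueError("need start_price > 0, end_price >= 0 and years > 0")
    return 1.0 - (end_price / start_price) ** (1.0 / years)


if __name__ == "__main__":
    # Placeholder numbers purely for illustration: a price that halves in 5 years.
    rate = annual_decrease(start_price=0.20, end_price=0.10, years=5)
    print(f"annual decrease: {rate:.1%}")
```

Running the same function over a cloud price history and a disk price history for the same period gives two rates that can be compared directly.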

Tuesday, December 13, 2011

CNI Talk on the Economic Model

I gave a talk at the Fall CNI meeting on the work I've been doing on economic models of long-term storage. CNI recorded the talk and I'm expecting them to post the video and the slides. Much of the talk expanded on the talk I gave at the Library of Congress Storage Workshop. The new part was that I managed to remove the assumption that storage prices could never go up, so I was able to model the effect of spikes in storage costs, such as those caused by the floods in Thailand. Below the fold is the graph.

Thursday, November 17, 2011

Progress on the Economic Model of Storage

I've been working more on the economic model of long-term storage. As an exercise, I tried to model the effect of the current floods in Thailand on the long-term cost of storage on disk. The more I work on this model, the more complex the whole problem of predicting the cost of long-term storage becomes. This time, what emerged is that, despite my skepticism about Kryder's Law, in a totally non-obvious way I had wired into the model the assumption that disk prices could never rise! So when I tried to model the current rise in disk prices, things went very wrong. So, until I get this fixed, the best I can do is to model a pause of a varying number of years before disk prices resume their Kryder's Law decrease.

For this simulation, I assume that interest rates reflect the history of the last 20 years, that the service life of disks is 4 years, that the planning horizon is 7 years, that the disk cost is 2/3 of the 3-year cost of ownership, and that the initial cost of the unit of storage is $100. The graph plots the endowment required to have a 98% probability of surviving 100 years (z-axis) against the length of the initial pause in disk cost decrease in years (y-axis), and the percentage annual decrease in disk cost thereafter (x-axis).

As expected, the faster the disk price drops and the shorter the pause before it does, the lower the endowment needed. In this simulation the endowment needed ranges from 4.2 to 17.6 times the initial cost of storage, but these numbers should be taken with a grain of salt. It is early days and the model has many known deficiencies.
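
The shape of this result can be illustrated with a much cruder calculation. The sketch below is not the model described above: it ignores running costs and the planning horizon, counts only disk purchases every 4 years, and draws each year's interest rate uniformly between 0% and 6%, which is an arbitrary stand-in for "the history of the last 20 years". All function names are hypothetical. The endowment it reports is the amount that covers the purchases in about 98% of the simulated interest-rate paths, so its numbers should not be expected to match the range quoted above.

```python
import random


def disk_price(year, initial_cost, pause_years, annual_decrease):
    """Price of a unit of storage: flat during the pause, then falling geometrically."""
    falling_years = max(0, year - pause_years)
    return initial_cost * (1.0 - annual_decrease) ** falling_years


def path_cost(rates, initial_cost, pause_years, annual_decrease, service_life):
    """Present value of all disk purchases along one path of yearly interest rates."""
    discount = 1.0  # value today of $1 spent in the current year
    total = 0.0
    for year, rate in enumerate(rates):
        if year % service_life == 0:  # time to buy (or replace) the disks
            total += discount * disk_price(
                year, initial_cost, pause_years, annual_decrease
            )
        discount /= 1.0 + rate
    return total


def endowment(pause_years, annual_decrease, initial_cost=100.0, service_life=4,
              years=100, survival=0.98, trials=2000, seed=1):
    """Endowment that pays for every purchase in roughly `survival` of the trials."""
    rng = random.Random(seed)  # fixed seed so different settings see the same paths
    costs = sorted(
        path_cost(
            [rng.uniform(0.0, 0.06) for _ in range(years)],  # assumed rate range
            initial_cost, pause_years, annual_decrease, service_life,
        )
        for _ in range(trials)
    )
    index = min(trials - 1, int(survival * trials))
    return costs[index]


if __name__ == "__main__":
    for pause in (0, 2, 5):
        for decrease in (0.1, 0.2, 0.3):
            multiple = endowment(pause, decrease) / 100.0
            print(f"pause {pause}y, decrease {decrease:.0%}: {multiple:.2f}x initial cost")
```

Because an endowment survives a given path exactly when it is at least that path's present-value cost, taking a high percentile of the per-path costs is enough here; a fuller model would also need running costs and a realistic interest-rate process.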

Monday, October 31, 2011

PLoS Is Not As Lucrative As Elsevier

David Crotty of Oxford University Press made the headline-grabbing charge that PLoS will this year be more profitable than Elsevier. I responded skeptically in comments, and Kent Anderson, a society publisher, joined in to support David. Comments appear to have closed on this post, but I have more to say. Below the fold I present a more complete version of my analysis and respond to David's objections.