Tuesday, August 21, 2018

Optical media durability

At last I have started clearing out the garage laundry room cupboards, where, amongst much other stuff, the optical media backups I take every week have been accumulating for many years. They have been stored in a fairly warm, shirt-sleeve environment with no special precautions. To get some idea of the durability of writable optical media, I've been pulling groups of backups out of the stacks somewhat at random and re-verifying their MD5 checksums, all of which were verified immediately after writing.

TL;DR: Surprisingly, I'm getting good data from CD-Rs more than 14 years old, and from DVD-Rs nearly 12 years old. Your mileage may vary. Below the fold, my results.

Month   Media   Good   Bad   Vendor
01/04   CD-R    5      0     GQ
05/04   CD-R    5      0     Memorex
02/06   CD-R    5      0     GQ
11/06   DVD-R   5      0     GQ
12/06   DVD-R   1      0     GQ
01/07   DVD-R   4      0     GQ
04/07   DVD-R   3      0     GQ
05/07   DVD-R   2      0     GQ
07/11   DVD-R   4      0     Verbatim
08/11   DVD-R   1      0     Verbatim
04/13   DVD+R   2      0     Optimum
05/13   DVD+R   3      0     Optimum

The fields in the table are as follows:
  • Month: The date marked on the media in Sharpie, and verified via the on-disk metadata.
  • Media: The type of media.
  • Good: The number of media with this type and date for which all MD5 checksums were correctly verified.
  • Bad: The number of media with this type and date for which any file failed MD5 verification.
  • Vendor: The vendor name on the media.
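
The Good/Bad rule in the list above (a disk is Good only if every file verifies) can be sketched in Python as follows. This is a minimal illustration, not the procedure actually used for these backups: the manifest name `MD5SUMS`, its `md5sum`-style line format, and the function names are all assumptions made for the example.

```python
import hashlib
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_disk(mount_point, manifest_name="MD5SUMS"):
    """Return the files that fail verification; an empty list means Good.

    Assumes a hypothetical manifest at the root of the disk with one
    "<hex digest>  <relative path>" entry per line, as md5sum writes.
    """
    root = Path(mount_point)
    failed = []
    for line in (root / manifest_name).read_text().splitlines():
        if not line.strip():
            continue
        expected, name = line.split(maxsplit=1)
        name = name.lstrip("*")  # md5sum marks binary mode with a leading '*'
        try:
            if md5_of(root / name) != expected.lower():
                failed.append(name)
        except OSError:
            # A file that cannot be read (e.g. a medium error) counts as failed.
            failed.append(name)
    return failed


def classify(mount_point):
    """A disk is Bad if any file fails verification, otherwise Good."""
    return "Bad" if verify_disk(mount_point) else "Good"
```

For example, `classify("/mnt/cdrom")` would return "Good" or "Bad" for a disk mounted at that (hypothetical) path.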
The media are generic, low-cost CD-Rs, DVD-Rs and DVD+Rs, purchased in 50s and 100s from Fry's in Palo Alto. Several of the CD-Rs caused kernel errors like these during the initial mount, but nevertheless went on to verify all their MD5s:
sr 6:0:0:0: [sr0] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
sr 6:0:0:0: [sr0] tag#0 Sense Key : Medium Error [current]
sr 6:0:0:0: [sr0] tag#0 Add. Sense: L-EC uncorrectable error
sr 6:0:0:0: [sr0] tag#0 CDB: Read(10) 28 00 00 05 64 30 00 00 02 00 00 00
blk_update_request: critical medium error, dev sr0, sector 1413312
sr 6:0:0:0: [sr0] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
sr 6:0:0:0: [sr0] tag#0 Sense Key : Medium Error [current] 
sr 6:0:0:0: [sr0] tag#0 Add. Sense: L-EC uncorrectable error
sr 6:0:0:0: [sr0] tag#0 CDB: Read(10) 28 00 00 05 64 30 00 00 02 00 00 00 
blk_update_request: critical medium error, dev sr0, sector 1413312
Aug 12 14:34:37 nuc7 kernel: [194688.719850] Buffer I/O error on dev sr0,
                             logical block 176664, async page read
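
The numbers in the log above are mutually consistent, which can be checked by decoding the Read(10) command bytes. The sketch below assumes the standard READ(10) layout (logical block address in bytes 2-5, transfer length in bytes 7-8) and 2048-byte data blocks on the disc; the function name is made up for the example. That the "logical block" figure is in 4096-byte units is an inference from the arithmetic, not something the log states.

```python
def decode_read10(cdb_hex):
    """Decode the LBA and transfer length from a SCSI READ(10) CDB hex dump."""
    cdb = bytes.fromhex(cdb_hex)
    if cdb[0] != 0x28:
        raise ValueError("not a READ(10) command")
    lba = int.from_bytes(cdb[2:6], "big")     # bytes 2-5: logical block address
    length = int.from_bytes(cdb[7:9], "big")  # bytes 7-8: blocks to transfer
    return lba, length


lba, length = decode_read10("28 00 00 05 64 30 00 00 02 00 00 00")

MEDIA_BLOCK = 2048    # assumed bytes per data block on the disc
KERNEL_SECTOR = 512   # unit of "sector" in the blk_update_request message
PAGE = 4096           # inferred unit of "logical block" in the buffer I/O message

sector = lba * MEDIA_BLOCK // KERNEL_SECTOR
page_block = lba * MEDIA_BLOCK // PAGE

print(lba, length)   # 353328 2
print(sector)        # 1413312, the sector in the blk_update_request line
print(page_block)    # 176664, the logical block in the buffer I/O error line
```

So all three messages describe the same failed two-block read, starting at disc block 353328.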
I never intended these weekly backups to be a long-term archive. The intended use was disaster recovery; until now I was just too lazy to dispose of the back catalog. I've retained the sample of disks for future re-analysis. But the remaining disks older than 3 years, approximately 1200 of them, will be recycled by the CD Recycling Center of America, once I figure out how to ship 45 lbs of optical disks!

I'm sorry that the sample isn't bigger, but it was time-consuming feeding 40 disks into the reader, and I need the space in the cupboards now.

2 comments:

Ian F. Adams said...


Professor Wildani and I often had discussions about what (if anything) could be labeled as "Archive by accident", and whether there's value in it or whether we should care. The net result of the discussions was a resounding "Who knows", and back to the usual problems around identifying high-value data without an oracle, lest we become packrats and data hoarders, with all the problems that entails.

I remember at a Dagstuhl workshop a few years back (you were there, I believe), talking with folks about doing crude automatic triage, tossing near-duplicates, flagging things for a human to pick at, etc., under the assumption that we will inevitably toss things that may be valuable, but we may still wind up with a greater corpus of "useful" stuff with a reduced workload. Potentially an intractable problem, but we can dream :)

Regardless, interesting stuff! Thanks for sharing!

David. said...

The 2019 update of this post is here.