TL;DR: Surprisingly, I'm getting good data from CD-Rs more than 14 years old, and from DVD-Rs nearly 12 years old. Your mileage may vary. Below the fold, my results.
| Month | Media | Good | Bad | Vendor |
|-------|-------|------|-----|--------|
| 01/04 | CD-R  | 5 | 0 | GQ |
| 05/04 | CD-R  | 5 | 0 | Memorex |
| 02/06 | CD-R  | 5 | 0 | GQ |
| 11/06 | DVD-R | 5 | 0 | GQ |
| 12/06 | DVD-R | 1 | 0 | GQ |
| 01/07 | DVD-R | 4 | 0 | GQ |
| 04/07 | DVD-R | 3 | 0 | GQ |
| 05/07 | DVD-R | 2 | 0 | GQ |
| 07/11 | DVD-R | 4 | 0 | Verbatim |
| 08/11 | DVD-R | 1 | 0 | Verbatim |
| 04/13 | DVD+R | 2 | 0 | Optimum |
| 05/13 | DVD+R | 3 | 0 | Optimum |
- Month: The date marked on the media in Sharpie, and verified via the on-disk metadata.
- Media: The type of media.
- Good: The number of media of this type and date for which every file's MD5 checksum verified correctly (a sketch of this kind of check follows the list).
- Bad: The number of media of this type and date for which any file failed MD5 verification.
- Vendor: The vendor name on the media.
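For concreteness, here is a minimal Python sketch of the kind of check the Good and Bad columns describe: re-hash every file on a mounted disc and compare against an md5sum-style manifest. The manifest name (`md5sums.txt`), the mount-point argument, and the two-space manifest format are assumptions for illustration, not details taken from the actual backup scripts.

```python
# Hypothetical sketch: verify the files on a mounted disc against an
# md5sum-style manifest ("<md5hex>  <relative path>" per line).
import hashlib
import sys
from pathlib import Path

def md5_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file through MD5 so large files don't need to fit in RAM."""
    digest = hashlib.md5()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_disc(mount_point: str, manifest_name: str = "md5sums.txt") -> bool:
    """Return True only if every file listed in the manifest hashes correctly."""
    root = Path(mount_point)
    ok = True
    for line in (root / manifest_name).read_text().splitlines():
        if not line.strip():
            continue
        expected, _, rel_path = line.partition("  ")
        try:
            actual = md5_of(root / rel_path)
        except OSError as e:            # unreadable sector, missing file, etc.
            print(f"READ ERROR  {rel_path}: {e}")
            ok = False
            continue
        if actual != expected:
            print(f"MISMATCH    {rel_path}")
            ok = False
    return ok

if __name__ == "__main__":
    # e.g. python verify_disc.py /media/cdrom
    sys.exit(0 if verify_disc(sys.argv[1]) else 1)
```

Under this definition a disc counts as Good only if `verify_disc()` reports no mismatches and no read errors; a single bad file would move it to the Bad column.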
A kernel log excerpt from one of the reads:

```
sr 6:0:0:0: [sr0] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
sr 6:0:0:0: [sr0] tag#0 Sense Key : Medium Error [current]
sr 6:0:0:0: [sr0] tag#0 Add. Sense: L-EC uncorrectable error
sr 6:0:0:0: [sr0] tag#0 CDB: Read(10) 28 00 00 05 64 30 00 00 02 00 00 00
blk_update_request: critical medium error, dev sr0, sector 1413312
sr 6:0:0:0: [sr0] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
sr 6:0:0:0: [sr0] tag#0 Sense Key : Medium Error [current]
sr 6:0:0:0: [sr0] tag#0 Add. Sense: L-EC uncorrectable error
sr 6:0:0:0: [sr0] tag#0 CDB: Read(10) 28 00 00 05 64 30 00 00 02 00 00 00
blk_update_request: critical medium error, dev sr0, sector 1413312
Aug 12 14:34:37 nuc7 kernel: [194688.719850] Buffer I/O error on dev sr0, logical block 176664, async page read
```

I never intended these weekly backups to be a long-term archive. The intended use was disaster recovery; until now I was just too lazy to dispose of the back catalog. I've retained the sample of disks for future re-analysis, but the remaining roughly 1,200 disks older than three years will be recycled by the CD Recycling Center of America, once I figure out how to ship 45 lbs of optical disks!
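To connect the kernel log above to something concrete: an unreadable sector reaches user space as an I/O error, which is what the kernel reports as a "critical medium error". Below is a short, hypothetical Python sketch that reads an optical device end to end in 2 KB sectors and records the sectors that fail. The device path (`/dev/sr0`) and the one-read-per-sector approach are assumptions made for clarity; this is not how the original verification was done, and a dedicated tool such as GNU ddrescue does this job far better.

```python
# Hypothetical sketch: scan an optical disc device for unreadable sectors.
import os

SECTOR = 2048          # data sector size for CD/DVD data tracks
DEVICE = "/dev/sr0"    # typical Linux optical-drive device node

def scan(device: str = DEVICE) -> list:
    """Return the sector numbers that could not be read."""
    bad = []
    fd = os.open(device, os.O_RDONLY)
    try:
        size = os.lseek(fd, 0, os.SEEK_END)        # device size in bytes
        for sector in range(size // SECTOR):
            os.lseek(fd, sector * SECTOR, os.SEEK_SET)
            try:
                os.read(fd, SECTOR)
            except OSError:                         # EIO on an unreadable sector
                bad.append(sector)
    finally:
        os.close(fd)
    return bad

if __name__ == "__main__":
    unreadable = scan()
    print(f"{len(unreadable)} unreadable sectors: {unreadable[:10]}")
```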
I'm sorry the sample isn't bigger, but feeding 40 disks into the reader was time-consuming, and I need the cupboard space now.
2 comments:
Professor Wildani and I often discussed what (if anything) could be labeled "Archive by accident", whether there's value in it, and whether we should care. The net result of those discussions was a resounding "Who knows", and back to the usual problems of identifying high-value data without an oracle, lest we become packrats and data hoarders with all the problems that entails.
I remember at a Dagstuhl workshop a few years back (you were there, I believe) talking with folks about doing crude automatic triage: tossing near-duplicates, flagging things for a human to pick at, etc., under the assumption that we will inevitably toss things that may be valuable, but may still wind up with a greater corpus of "useful" stuff with a reduced workload. Potentially an intractable problem, but we can dream :)
Regardless, interesting stuff! Thanks for sharing!
The 2019 update of this post is here.