How many of us have an old hard drive hanging around? I am talking about the one you were told was unfixable. The one that has 3 bad sectors. The one they replaced and handed to you in one of those distinctive anti-static bags. You know the ones I mean – the steely grey translucent plastic ones that look like they should contain space food.
I have more than one ‘dead’ hard drive. I can’t quite bring myself to throw them out – but I have no immediate plans to try and reclaim their files.
I know that there are services and techniques for pulling data off otherwise inaccessible hard drives. You hear about it in court cases and see it on TV shows. A quick Google search on hard drive rescue turns up businesses like Disk Data Recovery.
Do archivists already make it a policy to hunt not just for computers, but for discarded and broken hard drives lurking in filing cabinets and desk drawers? Compare this to a carton of documents that needs special treatment to permit access to the records it contains and yet is appraised as valuable. If the treatment required were within budgetary and time constraints – it would be performed. Mold, bugs, rusty staples, photos that are stuck together… archivists generally know where to get the answers they need to tackle these sorts of problems. I suspect that a hard drive advertised or discovered to be broken would be treated more like an empty box than a moldy box.
For now I would stack this challenge near the bottom of the list below archiving digital records that we can access easily but that run on old hardware or software, but I can imagine a time when standard hard drive rescue techniques will need to be a tool for the average archivist.
Jeanne, empty box vs. moldy box is an excellent analogy. Nice and sticky and easy to remember. Much like a moldy box, a “dead” disk might be tossed long before it comes under archival control.
As for a solution, I’m hoping that our future digital repository tools will be able to suck out any useful information automatically off any hard drive. Or at least in a way that any relatively tech savvy person can handle without getting the IT department involved.
Or maybe we’ll have digital conservators.
Have you seen Cornell’s Format & Media Migration research? Interesting stuff.
I am both optimistic and pessimistic about this issue. There are quite a large number of directions you can come at this issue from, so I am struggling to pick the ones that make for a balanced argument, but here goes:
Where the moldy vs empty analogy (which I really do like, to a point) breaks down is the huge amount of complexity that goes into making a hard-drive work and, thus, the ways it can break.
On the simple level: reformatting, bad sectors, the drive still “works.” You can still read most of the data that is there using a variety of accessibly priced tools. I have used them more times than I would have liked. We can already do this and with relatively low amounts of training. (Maybe someone ought to create an SAA one day or half day training workshop for this. Yes? No? Any takers?) I completely agree that “a hard drive advertised or discovered to be broken” should be treated like a “moldy box” rather than as an “empty” one unless the donor knows the data was successfully moved to another drive. An average archivist only needs a quick inspection (can I attach it to a computer and run the software over it?) before declaring its life or death/emptiness. (There are more caveats to this, but I will move on.)
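To make the "quick inspection" idea a little more concrete, here is a minimal sketch in Python of what recovery tools do at the simple level: read the drive block by block, keep whatever reads cleanly, and note the spots that fail. It is an illustration only, not how any particular product works. The names salvage_copy and reader_for are made up for this example, and the 4096-byte block size is an assumption.

```python
import io
from typing import BinaryIO, Callable, List


def salvage_copy(
    read_block: Callable[[int, int], bytes],
    total_size: int,
    out: BinaryIO,
    block_size: int = 4096,  # assumed block size, not tied to any real drive
) -> List[int]:
    """Copy total_size bytes to out, zero-filling blocks that cannot be read.

    read_block(offset, length) should return the bytes at that position and
    may raise OSError when the underlying media fails. Returns the offsets of
    blocks that failed or came back short.
    """
    bad_offsets: List[int] = []
    for offset in range(0, total_size, block_size):
        length = min(block_size, total_size - offset)
        try:
            data = read_block(offset, length)
        except OSError:
            data = b""  # treat a failed read as an empty block
        if len(data) < length:
            bad_offsets.append(offset)
            # Pad with zeros so later blocks stay at their original positions.
            data = data.ljust(length, b"\x00")
        out.write(data)
    return bad_offsets


def reader_for(source: BinaryIO) -> Callable[[int, int], bytes]:
    """Hypothetical helper: wrap an open binary file as a read_block function."""
    def read_block(offset: int, length: int) -> bytes:
        source.seek(offset)
        return source.read(length)
    return read_block
```

The list that comes back gives a rough sense of how damaged the drive is before anyone decides whether it deserves more attention. Real tools add retries, smaller re-reads around bad spots, and logging, none of which is attempted here.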
After this point optimism deflates. Any fault in a lower level of drive functionality, from the hardware performing I/O to the disk heads and platters, and you are going to require expensive recovery services. This is, and I expect will remain, beyond the “average” archivist. I am talking about the professionals who take the disk into a clean room, open it up, remove the platters (which are extremely delicate from what I understand), and place them in a new set of hardware. I am no data recovery professional but I am sure that the process has even more variables to consider. This job may fall to dedicated digital curators, as Sally suggested, in the future but there are so many variables in disk design and implementation that it is likely to remain a very specialized and expensive (finding replacement hardware) activity.
Then there is the consideration of whose records are being collected. If the archives is serving a large institution (think universities, government, many businesses) that has an IT department, the hard-drives that run through that organization are most likely tied to the computer assigned to the user. This computer most likely has a replacement schedule (and users often complain that the replacements don’t come often enough). The new machine is brought in; the old one is removed, the hard-drive securely wiped (especially now with the many reports of data-theft), and the computer surplussed. We are going to encounter fewer and fewer loose hard-drives unless we are collecting from those not tied to an IT department.
Finally, for hard-drives from personal computers, the ones likely to be found in a corner, they are being relied on less and less for important data. Instead, important files are being kept on a server (maintained by some IT personnel) be it local or over the internet. Hard-drives, instead, are only temporary stores, the scratch pads, and places to keep personal data (audio, pictures, application configurations). How long will they continue to retain much of anything important enough to justify the expense?
Of course it is (most often) possible to do systematic attempts at data recovery. It is not difficult to run a drive through data recovery software. After this point however it will be a serious consideration of cost/benefit. Will it be worth the archives’ limited time and money to do it?
“Transportable media” (floppies, optical disks, etc.) may not have the same problems as hard-drives. Instead they have their own set of troubles, primarily the direct destruction of the layer carrying the physical indicators of, symbolically speaking, the 1’s and 0’s.
The road from perfect media (paper, film, disks) to complete information loss is far shorter for electronic media than for the corresponding “moldy box,” and recovery along that road is much more expensive.
Still reading this comment? I am impressed/flattered. On the whole, when it comes to data recovery and accessioning, I am far more optimistic about the future, the records being created now that we haven’t started collecting. Users (corporate and consumer) are being more careful about data safety. We just need to make sure we understand what the bits we receive actually mean…. but that is another subject.
“After this point however it will be a serious consideration of cost/benefit. Will it be worth the archives’ limited time and money to do it?”
This is particularly dicey when you have NO IDEA what’s there to be recovered in the first place. Hard to devote resources to something that might have no real value.
This is an issue not just for repositories but for family collections as well.
My blog is aimed at genealogists and family archivists. Folks who are scanning historical photos at an impressive rate. I remind them as often as I can (without being too annoying) that “recoverable data” is essentially the same as “lost data” if their descendants don’t want to spend the $$ to find out what’s on grandma’s old disks.
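When a basic recovery pass has already dumped files into a folder, one cheap way to get a first idea of what is there is to count the files by type before anyone opens them one by one. The following is a rough sketch in Python, assuming the recovered files sit in an ordinary directory tree. The function name summarize_recovered is made up for this example, and file extensions are only a hint, since recovery software often renames files.

```python
from collections import Counter
from pathlib import Path
from typing import Dict, Tuple


def summarize_recovered(root: str) -> Dict[str, Tuple[int, int]]:
    """Return {extension: (file count, total bytes)} for every file under root.

    Extensions are lower-cased; files without one are grouped as "(none)".
    """
    counts: Counter = Counter()
    sizes: Counter = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            ext = path.suffix.lower() or "(none)"
            counts[ext] += 1
            sizes[ext] += path.stat().st_size
    return {ext: (counts[ext], sizes[ext]) for ext in counts}
```

A folder that turns out to be mostly .jpg and .doc files makes a very different case for further spending than one full of system files.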
Thank you both for the great comments. I think it most likely that drives that are not easy to access will only get attention if the archivist is hunting for records that cannot be located elsewhere. I am so curious to know what sort of best practices evolve on this front over the next five to ten years.
I had personal external HDs fail on me in recent weeks and I tried everything I could to extract what I could before they totally seized on me. I used the Linux CS live CD. Although it wasn’t able to get 100% of my files, I was able to retrieve the most essential ones and junk the rest. When it comes to my business HDs, I don’t mess around. I needed 100% recovery, so I went to a service that specializes in it within a half hour of my locale. They wound up retrieving every last kilobit, thankfully. Now I use an offsite data backup service (the same people that got my business information) and all is well. My suggestion: always have a back-up of a back-up. As for the old HDs: I would keep them myself. Who’s to say that down the line there won’t be a revolutionary tool that will give one the ability to extract information like opening the wrapper of a lollipop!