Cataloguing data from a tape archive…

Many businesses maintain an archive of old backups. Often, though, there is no catalogue or index of the files, and no way of reading the media anyway. When a major bank approached us, this is exactly the problem they faced. They were searching for client data as part of an e-disclosure process, but had no idea where the data was. The problem was exacerbated by their huge archive of legacy tape media, comprising over 300,000 data tapes!

Thankfully the tapes had been indexed by date, so as part of the e-disclosure process we were able to narrow down the field to 2,000 tapes. At first the client was reluctant to release the tapes on security grounds, but our strict security protocols and information management procedures satisfied their requirements. With our laboratory accredited to the ISO 27001 (Information Security Management) standard, client data is as safe as possible.

Before work could start in earnest, we had to assess a wide variety of media to establish the backup software, version, format and file types. Much of the data was proprietary, which complicated matters further. The data sets we were sent to analyse for data recovery comprised DDS, DLT, SDLT, LTO and even some old Exabyte and Travan tapes.

With our unique handlers we were able to bypass the backup software and read the raw data. Over a number of weeks a catalogue was compiled and error checked. Thereafter, sample files were recovered and sent to the client for user acceptance testing. Only once the client had confirmed they could read all the data were we able to set up the process.
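A catalogue of this kind is, at its simplest, a list of every file with enough metadata to find and verify it later. As an illustration only (the function name and CSV layout here are hypothetical, assuming the tape contents have already been extracted to disk), a per-file record of path, size and checksum might be compiled like this:

```python
import csv
import hashlib
import os

def catalogue_directory(root, out_csv):
    """Walk extracted tape contents and record the path, size and
    SHA-256 digest of every file, so the catalogue can be
    error-checked against the media later."""
    with open(out_csv, "w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(["path", "bytes", "sha256"])
        for dirpath, _dirs, files in os.walk(root):
            for name in files:
                path = os.path.join(dirpath, name)
                digest = hashlib.sha256()
                with open(path, "rb") as f:
                    # Hash in 1 MiB chunks so large files do not
                    # need to fit in memory.
                    for chunk in iter(lambda: f.read(1 << 20), b""):
                        digest.update(chunk)
                writer.writerow(
                    [path, os.path.getsize(path), digest.hexdigest()]
                )
```

Recording a digest alongside each entry is what makes the later error checking possible: re-reading a file and re-hashing it confirms the catalogue still matches the media.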

Built into the specification were steps to reconcile the migrated data at predetermined points during the migration. Successive batches of media were processed, extracting the data bit for bit and retrieving the corresponding metadata. Pre-defined quality assurance checks verified that no data had been lost or gained during each step, so any anomalies were identified with minimal loss of time. Scripts were run throughout the migration to identify and report on referential integrity (checking values against target master and parameter files), data constraints and duplicate data. Furthermore, there was a manual test for every data cartridge, comparing the legacy and target applications for consistency.
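The two scripted checks named above are both simple in principle: a referential-integrity check flags records whose key value does not appear in a master file, and a duplicate check groups files by content hash. The sketch below is a minimal illustration under those assumptions (the function names and record shapes are hypothetical, not the scripts actually used on the project):

```python
import hashlib
from collections import defaultdict

def check_referential_integrity(records, key, master_values):
    """Return the records whose `key` field is missing from the
    master file's value set -- the orphans a reconciliation
    report would flag."""
    return [r for r in records if r.get(key) not in master_values]

def find_duplicates(files):
    """Group file paths by SHA-256 of their content; any group
    with more than one path is a set of byte-identical
    duplicates."""
    by_hash = defaultdict(list)
    for path, data in files.items():
        by_hash[hashlib.sha256(data).hexdigest()].append(path)
    return {h: paths for h, paths in by_hash.items() if len(paths) > 1}
```

Running checks like these after every batch, rather than once at the end, is what keeps an anomaly confined to the batch that produced it.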

Finally, catalogues were created for each data cartridge and the relevant data was recovered. The project was completed within 13 weeks and, once the data had been verified by the client, all data cartridges were returned and our copies securely erased.