As is often the case, the press have not quite got this story right, but they are pretty close.
Although it may not match the perception of “the man on the street”, OS is a data business, and in the 5 years I have been working here the volumes of data we deal with have increased massively.
The growth comes not only from new sources of digital imagery but also from the increasing number of geospatial feature databases used in product development.
Data volumes today are over 500 TB – that’s around 25,000 iPods!!
As a “National Mapping Agency” and as part of government, we have additional responsibilities to maintain an archive of the data throughout its lifetime, hence the need to develop strategies for archiving large amounts of data.
We have chosen to adopt UDO media: very high-density optical media that can store 30 GB per disk and are far more resistant to environmental conditions than traditional magnetic media.
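As a rough back-of-the-envelope check on those figures, here is a minimal sketch. It assumes decimal units (1 TB = 1,000 GB) and a single copy of the data with no allowance for redundancy or filesystem overhead; the variable names are purely illustrative.

```python
import math

GB_PER_TB = 1000      # assumption: decimal units, 1 TB = 1,000 GB

archive_tb = 500      # "over 500 TB", taken here as exactly 500
ipod_count = 25_000   # the iPod comparison quoted in the post
udo_disk_gb = 30      # UDO capacity per disk quoted in the post

archive_gb = archive_tb * GB_PER_TB

# Capacity per iPod implied by the "25,000 iPods" comparison.
implied_ipod_gb = archive_gb / ipod_count

# Disks needed for one copy of the archive, rounded up to whole disks.
disks_needed = math.ceil(archive_gb / udo_disk_gb)

print(f"Implied iPod capacity: {implied_ipod_gb:.0f} GB")
print(f"UDO disks for one copy: {disks_needed:,}")
```

Under those assumptions the comparison implies a 20 GB iPod, and a single copy of the archive would need somewhere in the region of 16,700 disks.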
The bigger issue for us, however, is making sure that the data can still be used in potentially 50 years’ time, which is the guaranteed life of the media. Will we be able to read the data formats used (TIFF, SQL load files, CSV) in 2056? We have tried to select formats that are as open and generic as possible, but we also need to document how the data is accessed, as in the future we may need to emulate today’s environments on some future computing platform.
There is an interesting precedent. The BBC’s Domesday Project of 1986, based on a BBC Model B micro and LV disc, was rescued from its unreadable state by the National Archives a couple of years ago.
Written and submitted from the Holiday Inn Express Southampton, using my Vodafone 3G network card.
4 replies on “Geospatial archiving – or how to backup 25,000 iPods”
Ed,
You may be interested in a workshop we’re holding on the 27th of October in Edinburgh called Maintaining Long-term Access to Geospatial Data (http://www.nesc.ac.uk/esi/events/697/), where we will be looking at some of the issues you raise, particularly how to ensure that geospatial data is accessible and usable, both now and in the future.
[…] In a recent post about data archival, Ed Parsons linked to an interesting piece about how the 1986 Domesday Project was rescued. […]
October 18
Ed:
Thanks to Elizabeth Seaman for drawing my attention to your blog on geospatial archiving. As you probably know, we have been actively working on this for over three and a half years now as part of the InterPARES 2 project. The Cybercartographic Atlas of Antarctica is a case study for the project, and from the outset we have considered the challenges of how to archive a complex, interactive, multimedia map-based database. Some of the solutions we have arrived at are interesting, but we certainly have not solved all of the problems. Just a comment: backup and preservation are not the same as archiving, and many people assume that when they back up they have essentially archived. This is far from the case. We are deeply involved in various efforts in Canada to effectively archive geospatial data, and there are several upcoming meetings on this topic. We are also publishing fairly widely on this topic now. Some of this material is still in press. Most of it has been presented at various conferences, including PV2005 in Edinburgh. I look forward to seeing you in Cambridge, if not before.
Fraser
Fraser,
Yes, your point is quite right: archiving is not the same as back-up, but it makes a good blog title (am I turning into a journalist?). Look forward to catching up with you again.