Under the Desktop: Backup and Archive Down
Data archive and backup were on my mind at this month’s Seybold San Francisco 2003 conference and expo. That was natural, since I was part of a panel on the topic, but I was surprised to find that other content creation professionals were also dealing with the issues.
Unhappily, my session, Archives for the Ages: Preserving Your Images and Data, was canceled due to a bomb scare at the Oracle World conference in the hall across the street. Go figure.
Before and after the unscheduled interruption, I talked shop with a number of imaging consultants. Many observed that their clients were very concerned about creating an archive of their images — in digital format, not in hardcopy. Some of these content creators were digital photographers and others were from prepress companies. They understood and had experience with archiving analog media, mostly film, but they were at loose ends when it came to the proper practices for digital storage.
As I ran down some of the considerations of archival storage (and pitched the session), it became clear that these image-conscious photographers and designers were focused totally on the long-term health of their digital images. Perhaps this obsession with archive came at the expense of their here-and-now workflow and business, for I also discovered they had no backup plan in place.
How could this be after all the reminders, even nagging, we’ve received on the subject of backup? While this lack certainly shows a fundamental gap in the education of digital professionals, the continuing refusal to pay the least attention to backup may also show some weariness over the message.
So, instead of administering another guilt trip, perhaps it’s worth taking the time for a short refresher course on the conceptual differences between backup and archive and why everyone needs to add both into their digital workflow. Think of it as the left-brain approach.
“A” is for Archive
Uncertainty around archive and backup storage is understandable, since both activities look similar on the surface. They both involve making copies of files, using some kind of removable storage and then putting them aside. And most storage vendors have aimed various storage technologies (tape, optical cartridges, and removable hard disks) at both processes.
As its name suggests, archive deals with the long-term storage of documents. It seeks the preservation of your content for years, hopefully outlasting its creator. (This is a hot-button issue, as I discovered with my recent column on problems with CD-R and CD-RW discs.)
However, archival storage also mandates the movement of files from their placement on one of your primary drives to their hopefully eternal resting place. In this case, “move” means that the file is deleted from its original location after its relocation (see Figure 1). This is a critical difference between archive and backup.
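The copy-versus-move distinction can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not how any particular backup or archive product works: the function names (`backup_file`, `archive_file`, `file_digest`) are hypothetical, and the checksum comparison before deletion is a precaution I have added rather than something the definition of archive requires.

```python
import hashlib
import shutil
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        # Read in 1 MB chunks so large image files need not fit in memory.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def backup_file(src: Path, dest_dir: Path) -> Path:
    """Backup: copy the file; the original stays where it is."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / src.name
    shutil.copy2(src, dest)  # copy2 also preserves timestamps
    return dest


def archive_file(src: Path, dest_dir: Path) -> Path:
    """Archive: copy, verify the copy, then delete the original."""
    dest = backup_file(src, dest_dir)
    if file_digest(src) != file_digest(dest):
        dest.unlink()  # discard the bad copy and keep the original
        raise IOError(f"verification failed for {src}")
    src.unlink()  # the "move": the file leaves the primary drive
    return dest
```

After `backup_file` runs, two copies exist; after `archive_file` runs, only the archived copy remains, which is why an archive is no substitute for a backup.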
Figure 1: A Google image search on the word “archive” offered up this image of the International Organization for Migration’s archive as the first selection. Or we can hope that it isn’t the archive. Perhaps this pile of stuff is waiting to go into the actual archive. But it’s also ironic, since data archive deals with the “migration” of files, right?
In the wider computer industry, this migration to archival storage is driven primarily by the cost of maintaining different types of data. And this same consideration can hold for content creators, as well.
In the enterprise, important data that’s accessed and changed frequently is often placed on expensive storage-area-network systems, which can cost hundreds of thousands of dollars. That means that older files, which won’t be changed again, may be taking up room that could be better used by more dynamic data. It’s a matter of cost: networked storage is very expensive real estate for older files. So, the companies find it worthwhile to shift older data to less-expensive storage and eventually off to an archive. Some pay big bucks for software to do this process automatically.
Smaller prepress shops and individual creators have many of the same concerns about cost, although on a different scale. For example, you may have an investment in a fast hard drive or a small RAID box for your Photoshop scratch disk. Older projects still stored on the drive can slow down performance as well as take up space needed for more current projects. It’s best to move these files to another less-expensive hard disk or removable archive format to preserve your investment.
Of course, selecting an archival medium is a difficult choice. There are many considerations and these issues will be the grist for forthcoming articles. Suffice it to say that professional archivists, in the government and in large collections, have concerns about the data integrity of any current media format beyond 5 to 10 years.
“B” is for Backup
Unlike the long-term storage goals of archive, backup deals with your near-term work. It’s about the restoration of files and returning your current workflow to operation after some unfortunate event corrupts your data and system documents. (In a recent column, I discussed the problems that can occur with irregular power supplies.)
There are two parts of the backup process: first is the backup of your data, and then its restoration. Most people think these are the same thing (or two sides of the same coin), but they can be two discrete processes. Backing up data can be more than a simple duplication of your files and system setup; instead, the backup set can target certain files for special attention.
Figure 2: A Google image search on the word “backup” offered this image from an IT company in the Philippines. It’s on a page discussing Computer Associates International’s BrightStor ARC Backup software. Now, do you think that users of a $695 backup application are storing files onto floppy diskettes anymore?
The most essential part of a backup strategy — aside from actually doing it (nag, nag) — comes with the analysis of your everyday workflow. Some data is bound to be more important than others. In the storage business this backup analysis is broken into two concepts: Recovery Time Objective and Recovery Point Objective.
Recovery Time Objective is the time that you will need to get back to work, whether that means recovering a single file or restoring an entire hard drive. That time frame has obvious meaning to your business. If you’re on deadline and your entire hard drive must be restored before you can continue working, you may have trouble finishing the job on time.
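A back-of-the-envelope calculation shows why the size of the restore matters. The sketch below assumes restore time is roughly data size divided by sustained throughput, plus a fixed overhead for locating and mounting media. The function names and the figures in the usage note are hypothetical, and real restores are rarely this linear.

```python
def estimated_restore_hours(data_gb: float,
                            throughput_mb_per_s: float,
                            overhead_hours: float = 0.0) -> float:
    """Rough restore time in hours: size / throughput + fixed overhead.

    Uses 1 GB = 1000 MB for simplicity.
    """
    if throughput_mb_per_s <= 0:
        raise ValueError("throughput must be positive")
    seconds = (data_gb * 1000) / throughput_mb_per_s
    return seconds / 3600 + overhead_hours


def meets_rto(data_gb: float,
              throughput_mb_per_s: float,
              rto_hours: float,
              overhead_hours: float = 0.0) -> bool:
    """True if the estimated restore fits inside the Recovery Time Objective."""
    estimate = estimated_restore_hours(data_gb, throughput_mb_per_s,
                                       overhead_hours)
    return estimate <= rto_hours
```

For instance, with an assumed 10 MB per second of sustained throughput, `meets_rto(80, 10, rto_hours=2)` is false for a full 80 GB drive (about 2.2 hours), while `meets_rto(0.2, 10, rto_hours=2)` is true for a single 200 MB image file.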
Recovery Point Objective is concerned with the actual data you will recover. Data isn’t usually static; as you work on an image, it changes from minute to minute. Or, as you receive e-mail messages, the database that is your mail folder changes.
For example, it’s common practice to back up daily or weekly. But is that really enough for your needs? Consider the e-mail messages you receive throughout the course of a week. If you lost several days’ worth of messages, or even a week’s worth, what would that mean for your projects, record keeping and client communication?
What about the creative effort you put into an image or layout? Could you recreate a project if you had to go back days or a week or longer? That’s also a critical situation for your clients and for your work.
Some files could do with more frequent backups during the course of a workday. Many backup programs can take snapshots of data or even of the entire volume. By creating more recovery points for your files, you can work your way back as close as possible to the point when a file was corrupted. This practice would put less of your data at risk.
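To make the idea of recovery points concrete, here is a small Python sketch. It assumes you have a list of times at which backups or snapshots were taken and know when the file went bad; it then finds the last good recovery point and how much work falls in the gap. The function names and the schedule in the usage note are hypothetical and not drawn from any particular backup program.

```python
from bisect import bisect_left
from datetime import datetime, timedelta
from typing import Optional, Sequence


def latest_recovery_point(points: Sequence[datetime],
                          corrupted_at: datetime) -> Optional[datetime]:
    """Return the most recent recovery point strictly before the corruption.

    Returns None if no recovery point predates the corruption.
    """
    ordered = sorted(points)
    i = bisect_left(ordered, corrupted_at)  # count of points before corrupted_at
    return ordered[i - 1] if i else None


def work_at_risk(points: Sequence[datetime],
                 corrupted_at: datetime) -> Optional[timedelta]:
    """Return the span of work lost by restoring to the last good point."""
    point = latest_recovery_point(points, corrupted_at)
    return None if point is None else corrupted_at - point
```

Suppose a file is corrupted at 3:40 p.m. With only a nightly backup taken at 6 p.m. the previous evening, `work_at_risk` reports 21 hours and 40 minutes; with hourly snapshots from 9 a.m. to 3 p.m., it reports 40 minutes.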
I’ve refrained from harping on the usual simplistic backup refrain. I hope that this respite, with a look at the concepts behind archive and backup, may get you to think about creating a backup strategy. And then, more importantly, to do it.
As the Talmud says: “Once an error creeps in, it stays.” But that doesn’t have to hold true if you’ve set enough recovery points to accommodate the occasional error.
This article was last modified on January 6, 2023
This article was first published on September 25, 2003
