When you first start tinkering with digital imaging, you do things by the seat of the pants, and after a while you realize you need a more disciplined approach to have a manageable setup. The result is called a workflow.

Workflow phases

Each person’s workflow is slightly different, but the following rough steps are common to everyone:

  1. Acquisition—getting the pictures in, whether from a flatbed scanner, a slide/negative scanner, PhotoCD or digital cameras. This also encompasses automated primary cleanup done from within a scanner driver, e.g. Digital ICE3.
  2. Reviewing—deleting dud pictures, and if you have duplicates, selecting only the best one.
  3. Asset management—cataloging your pictures in a database, with categories, captions and all. Professional organizations like photo agencies go to a very high level of detail as this is the key to their business, but this is also essential for anyone contemplating building an image collection of more than 1000 pictures or so.
  4. Editing—you can go hog-wild with Photoshop or the GIMP, although since this is a very labor-intensive process, it is usually done to a small minority of pictures.
  5. Output—getting prints made, but also publishing to the Web.
  6. Backup—backing up in case of hardware failure or catastrophe.

Acquisition

What hardware you use for acquisition controls the final quality of your results, so:

  • Don’t skimp on the scanner, and use a slide or negative scanner rather than flatbed scans of prints.
  • Use digital cameras like the Canon D60 or Nikon D100 that have larger sensors with less thermal noise rather than point and shoots.

Using a slide/negative scanner is a very slow and laborious process, and a preferable option in many cases is to have scans made by a photo lab. Avoid the low-quality Kodak PictureCD and opt instead for PhotoCD Master, which offers higher resolution and more carefully made scans.

Reviewing (editing)

Getting rid of the chaff early is a major step in improving your productivity, but it is difficult to be objective about one’s own photos. This process is sometimes also known as editing, although this term lends itself to confusion with digital imaging. Here is a good introductory article on the subject: Give That Cat The Boot: Editing 101.

Asset management

For most of the other phases, the choice of software does not matter very much and will indeed change over time. It is essential to get asset management right up-front, however. The solution you use must be

  1. scalable, to accommodate an expanding collection of photographs
  2. open: you don’t want to be locked into a proprietary database format, and at the very least you should be able to export the database to some kind of text format
  3. flexible, allowing you to enter as much or as little metadata as you require for any given photo
  4. powerful at retrieval: you should be able to run queries like “find all the photos of me and my grandma in front of the Golden Gate bridge”, or full-text caption search (if you use captions, which is not very common because of the amount of work involved)
  5. standards-compliant, the key standards being EXIF (picture metadata like aperture and exposure) and IPTC (the press photographers’ standard for captions)

The best program I’ve found so far is IMatch (Windows only, I’m afraid), mostly because of its incredibly flexible category system, which works like set theory, with multiple inclusion relationships and boolean operators. I have posted a more detailed overview of how to use IMatch for image category management. As I have switched to the Mac, my current asset management program is Kavasoft Shoebox, which has the same power as IMatch and a much better user interface to boot, but is not scriptable.
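To make the set-theory idea concrete, here is a minimal sketch in Python. It is not how IMatch or Shoebox are implemented; the category names, file names and the members() helper are all made up for illustration. Each category is a set of photos, a category can include another one, and boolean queries become set operations.

```python
# Hypothetical catalog: each category maps to the set of photos assigned to it.
categories = {
    "me": {"img001.jpg", "img002.jpg", "img005.jpg"},
    "grandma": {"img002.jpg", "img003.jpg", "img005.jpg"},
    "golden_gate": {"img002.jpg", "img004.jpg"},
    "san_francisco": set(),  # nothing assigned directly, only via children
}

# Hypothetical inclusion relationships: child category -> parent category.
parents = {"golden_gate": "san_francisco"}


def members(name):
    """Return the photos in a category, including those of its child categories."""
    result = set(categories[name])
    for child, parent in parents.items():
        if parent == name:
            result |= members(child)
    return result


# "me AND grandma AND golden_gate" is a set intersection.
both_at_bridge = members("me") & members("grandma") & members("golden_gate")

# "me OR grandma" is a set union.
either = members("me") | members("grandma")

# "san_francisco AND NOT grandma" is a set difference.
sf_without_grandma = members("san_francisco") - members("grandma")

print(sorted(both_at_bridge))
print(sorted(sf_without_grandma))
```

The point of the inclusion relationship is that a photo filed under the Golden Gate category automatically shows up in a query on the broader San Francisco category, without being tagged twice.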

Editing

The most comprehensive description of a Photoshop editing workflow is available here on Michael Reichmann’s Luminous Landscape site.

Output

As I’ve mentioned elsewhere, my preferred output method used to be prints made on a Fuji Frontier digital minilab system. Unfortunately, most labs are clueless about color management and cropping, and I now use an Epson R1800 archival pigment ink printer. People who want to print digital black & white prints may opt for the R2400 instead.

Backup

This is essential if you do not want all your hard work above to go up in smoke in case of a hard drive failure.

Media failures are not the only kind of disaster that can destroy your digital images: fire, theft, flooding and earthquakes are also a consideration, depending on where you live. Most companies have a disaster recovery plan (at least on paper), and most individuals should have a simplified one for their personal effects as well. I am not just talking about photos: scanning property titles, diplomas and other vital documents is an inexpensive precaution.

Sticking to a diet is hard. So is sticking to a backup plan, for human factors and process-related reasons, not technical ones. If your chosen backup method is so cumbersome you don’t apply it regularly, it is not going to do you much good. You should focus on developing a process that fits your risk sensitivity as well as your time and budget, and if your current approach is not sustainable, reexamine your backup requirements to fit within what you can do on a regular basis. A weekly or monthly backup schedule should not be too onerous for most people.

The backup process should also involve periodic verification of the backups, so that media failure can be detected and corrected immediately. This implies redundancy in the backup, as well as diversification (use media of different types, or different manufacturers, to avoid simultaneous failure from systemic causes). If you wait 5 years until you actually need the backup, Murphy’s law will inevitably strike.
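One way to verify a backup that is stored as plain files is to compare checksums between the originals and the copies. The following is only a sketch under that assumption; the function names and the choice of SHA-256 are mine, and it reports differences without trying to repair anything.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_backup(source_dir, backup_dir):
    """Return the relative paths of files that are missing or differ in the backup."""
    source_dir, backup_dir = Path(source_dir), Path(backup_dir)
    problems = []
    for src in sorted(source_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source_dir)
        dst = backup_dir / rel
        # A file is a problem if the backup lacks it or its contents differ.
        if not dst.is_file() or file_digest(src) != file_digest(dst):
            problems.append(rel.as_posix())
    return problems
```

Calling verify_backup() with the photo folder and the backup folder returns an empty list when every original has an identical copy. Note that this only compares the backup against the current originals; to detect silent corruption of the originals themselves, you would also need to keep a list of digests recorded when the files were first stored.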

CD-R and DVD-R media are the cheapest per megabyte, but I am not convinced of their archival characteristics (some published tests have shown CD-Rs can become unreadable in as little as 2 years). 70 or 80GB DLT tape cartridges (and other tape technologies like DAT DDS, 8mm, VXA or LTO) offer high capacity and are durable, but tape drives are very expensive, unreliable and usually available only in SCSI.

Just as the watched pot does not boil over, online data like that stored on external hard drives is harder to misplace than removable media. The solution I use is to make two backups onto two external 250GB FireWire hard drives (a little under $1 a gigabyte as of July 2005). I rotate them weekly between home and my office, so even if my apartment burns down, I will have lost at most a week’s worth of pictures.

If you prefer CD-R or DVD-R, be sure to use reliable brands like Mitsui Gold and follow the NIST guidelines for their care and handling (here is the PDF one-page summary).

For the backup software, I do not trust proprietary indexing formats and use a regular filesystem with incremental disk-to-disk copies using XXCOPY on Windows, LaCie’s free SilverKeeper utility on Mac OS X, and rsync on UNIX.
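For readers curious about what an incremental disk-to-disk copy does, here is a simplified sketch in Python. It is not how XXCOPY, SilverKeeper or rsync work internally; the incremental_copy() name and the "same size and not newer" rule are assumptions chosen to keep the example short. It copies only files that are missing from the backup or appear to have changed, and leaves a plain directory tree that any file browser can read.

```python
import shutil
from pathlib import Path


def incremental_copy(source_dir, backup_dir):
    """Copy new or changed files into backup_dir; return the relative paths copied."""
    source_dir, backup_dir = Path(source_dir), Path(backup_dir)
    copied = []
    for src in sorted(source_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source_dir)
        dst = backup_dir / rel
        if dst.exists():
            s, d = src.stat(), dst.stat()
            # Assumed rule: same size and source not newer means "unchanged".
            if s.st_size == d.st_size and s.st_mtime <= d.st_mtime:
                continue
        dst.parent.mkdir(parents=True, exist_ok=True)
        # copy2 preserves the modification time, which the check above relies on.
        shutil.copy2(src, dst)
        copied.append(rel.as_posix())
    return copied
```

This sketch never deletes anything from the backup, so files removed from the source stay on the backup drive. On filesystems with coarse timestamps the time comparison may cause some files to be copied again unnecessarily, which is harmless but slow.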

Format obsolescence is a factor, although the magnitude of the risk is often overblown. While JPEG and TIFF are likely to be supported well into the future, manufacturers’ proprietary RAW image formats (for digital cameras) are less likely to be. When a format becomes obsolete, it should be converted to a more durable one, obviously before the OS and drivers for it have become nonfunctional.

Finally, we are all mortal. If you were to disappear tomorrow, would your loved ones know how to retrieve your photos? Making prints of the best ones is a low-tech but robust way of ensuring their passage over time, possibly even skipping generations.