A few weeks ago Indicommons featured an excellent blog post by Deborah Wythe, Head of Digital Collections and Services at the Brooklyn Museum. She poses the question many of us frequently hear: Why isn’t everything digitized yet? She then proceeds with a nicely articulated description of some of the challenges, then quantifies them:
“The scale of “getting everything digitized” is just mind boggling. In our small archives at the Brooklyn Museum, we have about 1,600 feet of documents, photographs, negatives, ledger books — just about any analog format you can imagine, covering the Museum’s history from 1823 to the present. Here’s the math: at an estimated 3,000 documents per foot, that’s 4.8 million items. Even if you could scan, describe and process 30 per hour (highly unlikely), that’s 160,000 hours of work, or 20,000 eight-hour workdays.
If you saved a 20 Mb master file (or even half that size) for each of those 4.8 million documents, that’s serious storage and backup, not to speak of long-term management and preservation. And that’s just for a really, really small repository!”
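Wythe's arithmetic is easy to reproduce, and to re-run with figures from another repository. The following is a minimal sketch in Python, not anything from her post: the function and variable names are made up for illustration, "20 Mb" is assumed to mean 20 megabytes, and storage is converted using decimal units (1 TB = 1,000,000 MB).

```python
# Reproduces the back-of-envelope estimate quoted above.
# All names are illustrative; the input figures come from Wythe's quote.

def digitization_estimate(feet, docs_per_foot, items_per_hour,
                          hours_per_workday, master_file_mb):
    """Return item count, labor, and storage for a digitization project."""
    items = feet * docs_per_foot
    hours = items / items_per_hour
    workdays = hours / hours_per_workday
    # Assumption: decimal units, 1 TB = 1,000,000 MB.
    storage_tb = items * master_file_mb / 1_000_000
    return {
        "items": items,
        "hours": hours,
        "workdays": workdays,
        "storage_tb": storage_tb,
    }


brooklyn = digitization_estimate(
    feet=1_600,            # linear feet of archives
    docs_per_foot=3_000,   # estimated documents per foot
    items_per_hour=30,     # optimistic scan/describe/process rate
    hours_per_workday=8,
    master_file_mb=20,     # one master file per document
)

for name, value in brooklyn.items():
    print(f"{name}: {value:,.0f}")
```

Under those assumptions the master files alone come to roughly 96 TB, before any backup or redundant copies; at half the file size, about 48 TB.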
Put another way: digitization done right will expand access and hopefully provide a seamless and improved experience for the user. But what happens on the back end is anything but easy – it’s complex, labor-intensive, not well supported by market solutions (as Wythe notes), and an ongoing problem-solving endeavor. Wythe is speaking primarily of textual or static image media; in general, collections of moving images multiply all of these issues by several orders of magnitude.
At WITNESS our archive has been building its own asset management system, designed to support ongoing video production and multiple distribution streams, as well as long-term preservation and access. We began regular digitization of incoming media just about a year ago. Our organization is small but our access and distribution needs are extensive.
Take one WITNESS project: In 2006 we co-produced Bound by Promises, a video on slave labor in Brazil, with partners Comissão Pastoral da Terra and the Center for Intl Justice & Law. The video was produced in four languages (Portuguese, Spanish, French and English) and three different lengths (for different audiences or purposes). Each version (that’s 12!) is archived in multiple formats: a tape master, a digital master, a viewing copy with time-code, an MPEG-2 with associated elements for DVD creation (PAL & NTSC), Flash for web; other iterations are created on an as-needed basis. There are approximately 150 camera original source tapes, each requiring at minimum a digital master and access copy; dozens of stills; Photoshop and Quark files for packaging; Final Cut project backups and XMLs; audio and graphics files.
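To make the multiplication concrete, here is a rough tally of the minimum number of items this one project generates. It is only a sketch: the labels for the three lengths are placeholders (the actual cuts are not named here), the PAL and NTSC DVD files are assumed to count as two separate items, and stills, packaging, project and audio/graphics files are left out because no counts are given for them.

```python
from itertools import product

LANGUAGES = ["Portuguese", "Spanish", "French", "English"]
# Placeholder labels; the three actual running lengths are not specified.
LENGTHS = ["length A", "length B", "length C"]

# Formats kept for every finished version.
# Assumption: PAL and NTSC DVD files are counted separately.
FORMATS_PER_VERSION = [
    "tape master",
    "digital master",
    "viewing copy with time-code",
    "MPEG-2 for DVD (PAL)",
    "MPEG-2 for DVD (NTSC)",
    "Flash for web",
]

SOURCE_TAPES = 150  # approximate figure from the text
COPIES_PER_SOURCE_TAPE = ["digital master", "access copy"]  # stated minimum

# Every language/length combination is a separate finished version.
versions = list(product(LANGUAGES, LENGTHS))

# One archived item per version per format.
version_items = [
    (language, length, fmt)
    for (language, length) in versions
    for fmt in FORMATS_PER_VERSION
]

source_items = SOURCE_TAPES * len(COPIES_PER_SOURCE_TAPE)
minimum_total = len(version_items) + source_items

print("finished versions:", len(versions))
print("archived items for finished versions:", len(version_items))
print("files derived from source tapes:", source_items)
print("minimum items to track:", minimum_total)
```

Even with the uncounted material excluded, that is a few hundred items to store, describe and track for a single title.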
Metadata includes the technical information about the files (or tapes): audio & video codecs, bitrates, frame size, duration, hardware and format source.
Descriptive and discovery metadata includes shotlists, transcripts and translations of transcripts, indexing, and summaries that describe, contextualize and make clear the significance of the content.
And of course, any use is contingent upon rights: detailed information on security and restrictions, ownership, and consents from subjects.
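One way to picture these three layers of metadata is as a single record attached to each asset. The sketch below is purely illustrative and is not the schema of the WITNESS system: every class name, field name and type is an assumption, and the `cleared_for_use` rule is a deliberately simplistic placeholder for what is, in practice, a case-by-case judgment.

```python
from dataclasses import dataclass, field


@dataclass
class TechnicalMetadata:
    """Technical facts about a file or tape."""
    video_codec: str
    audio_codec: str
    bitrate_kbps: int
    frame_size: tuple[int, int]   # (width, height) in pixels
    duration_seconds: float
    source_format: str            # e.g. the tape format it came from
    source_hardware: str          # e.g. the camera or deck used


@dataclass
class DescriptiveMetadata:
    """What the content is and why it matters."""
    summary: str
    shotlist: list[str] = field(default_factory=list)
    transcripts: dict[str, str] = field(default_factory=dict)  # language -> text
    index_terms: list[str] = field(default_factory=list)


@dataclass
class RightsMetadata:
    """Who owns the material and under what conditions it may be used."""
    owner: str
    restrictions: list[str] = field(default_factory=list)
    security_notes: str = ""
    subject_consents: list[str] = field(default_factory=list)


@dataclass
class AssetRecord:
    identifier: str
    technical: TechnicalMetadata
    descriptive: DescriptiveMetadata
    rights: RightsMetadata

    def cleared_for_use(self) -> bool:
        # Placeholder rule (an assumption): a known owner, at least one
        # recorded consent, and no outstanding restrictions.
        rights = self.rights
        return (bool(rights.owner)
                and bool(rights.subject_consents)
                and not rights.restrictions)
```

Keeping rights alongside the technical and descriptive fields means a request to reuse footage can be checked against the same record that describes it.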
While storage costs continue to drop, digital storage that is reliable and provides redundancy or backup is a considerable financial investment.
And then there’s migration: the lifecycle of media keeps getting shorter; paper is still a much more robust archival medium than files or magnetic media. At MIT6 a few weeks ago, Richard Wright of the BBC (podcast here) offered up a schematic comparing the storage media through the ages, from stone to digital. In his estimation, each leap forward offers about 10 times more capacity while lasting one-tenth as long! And metadata becomes increasingly key and difficult, because – for one thing – there is so much more content to manage.
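Wright's rule of thumb can be turned into a toy model to show why migration dominates. This is a sketch in relative units only, with no real capacities or lifespans: generation 0 is an arbitrary baseline medium, and the function names are invented for illustration.

```python
from fractions import Fraction


def media_generation(generation):
    """Relative capacity and lifespan of a medium, per the rule of thumb.

    Generation 0 is an arbitrary baseline with capacity 1 and lifespan 1.
    Each later generation holds 10x more and lasts 1/10th as long.
    """
    capacity = 10 ** generation
    lifespan = Fraction(1, 10 ** generation)
    return capacity, lifespan


def media_lifetimes_spanned(generation, retention=Fraction(1)):
    """How many lifetimes of this medium fit into the retention period.

    `retention` is measured in baseline lifespans; each lifetime that
    elapses implies another migration to fresh media.
    """
    _, lifespan = media_generation(generation)
    return retention / lifespan


for generation in range(4):
    capacity, lifespan = media_generation(generation)
    print(f"generation {generation}: capacity x{capacity}, "
          f"lifespan {lifespan}, "
          f"lifetimes per baseline lifespan: "
          f"{media_lifetimes_spanned(generation)}")
```

In this model, keeping content alive for as long as the baseline medium would have lasted takes ten times as many media lifetimes with every leap forward, and each of those migrations has to carry the metadata along with the content.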
So, Wythe concludes, we all need to make choices. Agreed. We also need to promote more understanding of the complexities of our task and the risk of profound loss if we fail at it.