Digitizing the Long Tail

According to Chris Anderson, proponent of the idea of the “long tail,” the Internet has changed the market by decreasing the expense companies encounter in maintaining large inventories and increasing the size of potential customer pools. Companies like Netflix and Amazon have made millions by providing access to strange, obscure, and outdated material to customers hungering for products outside the mainstream. Clear parallels are being drawn between business models that work within the long tail and the future of libraries, which are full of “deep and rich collections…[and] a very long tail of scholarly and cultural materials.” For Lorcan Dempsey, vice president and chief strategist of OCLC, it is not enough that libraries are full of obscure and unique items. Libraries must make those items readily accessible to a large number of users in order to see any benefit from their long tails.

For Dempsey, this means redefining the way that libraries organize, describe, and share materials with the public. His recommendations seem to be in step with innovations in the library world as a whole. For example, Dempsey argues that discovery needs to be simpler; Netflix works because people can easily find the DVDs they’re interested in and a hundred other DVDs they didn’t even know they were interested in. Library OPACs do not work in this same intuitive and responsive way, but as Kristin Antelman reports, libraries like those at North Carolina State University are working to make OPACs easier to use, connecting users with the long tail quickly and painlessly. OPACs are only one part of the problem, however, and professionals are also working to improve metadata and the inter-institutional sharing of standards and information in order to make discovery easier.

In addition to improving discovery, libraries must improve the ways in which content is delivered. For many, this means that materials ought to be digitized and delivered via the web. Large-scale digitization projects have sprung up at major research and public libraries across the country. These include both publicly and privately funded efforts, among them the equally-loved-and-reviled Google Book Project. A simple scan of the library blogosphere provides exciting glimpses into all the work that’s going on around user experience and interaction design in both discovery and delivery, all centered on connecting more people with more “stuff” in the long tail.

Digitizing the long tail has many perceived benefits. Librarians and archivists have struggled for years with how to maximize access to materials while also ensuring long-term preservation. In the eyes of conservators, allowing random users to thumb through a collection of brittle manuscripts from the 18th century approaches heresy. By the same token, restricting access to a select group of highly specialized researchers goes against core tenets of the library and archives profession. This tension between preservation and access has resulted in debates and conflicts. Consequently, professional guidelines like the Code of Ethics of the Society of American Archivists dictate that professionals must strike a balance between preserving the record or object and providing access to it.

Digitization would appear to be the perfect solution to the problem: creating digital copies of fragile and unique materials allows for long-term preservation while also opening up access to a record’s content in a way that in no way threatens the material itself. But is preserving content enough? What role does the medium in which the content is presented play in its interpretation and the making of meaning? Is the original medium important? What effect does migrating content from one format to another have on its meaning? Marshall McLuhan, everyone’s favorite philosopher of media, would argue of course that medium and message are inextricably linked. Many librarians, archivists, and bibliophiles would use his ideas to argue the value of the original form and dissuade us from considering digitized forms to be authoritative. This may be the case, but I would argue that the current trend in digitization does not call for the destruction of originals, as was the case during the microfilming craze of recent decades, and instead places renewed value on the original form by advocating the long-term preservation of both book and scan.

What interests me about digitization is the new value that can be drawn from old content when it is presented in a new form. Of course new interpretations will be drawn from an old book when it is digitized and accessed on the web, but isn’t that the point? Getting old stuff in front of new eyes in order to facilitate the making of new meanings? How can new media forms be used to present content in new ways, rather than just emulating old forms in a fresh copy? N. Katherine Hayles, author of Writing Machines, and other new media scholars are interested in just these questions. In the way that television, when it was first introduced, emulated radio, Web 1.0 emulated books, dividing content into chapters and having the user drill down through a hierarchy of ideas. Web 2.0 is deconstructing that hierarchy and pushing the boundaries of media, creating new definitions of how we think, see, and react. Is it so bad that people are reading Macbeth online for free rather than dropping $100 a seat to see it at BAM? Is the performance happening tonight any closer to Shakespeare’s original vision than a hypertext annotation on Google Books might be?

Philosophical considerations aside, the main question with digitization seems to be whether it really addresses the problem of preservation. Michael Day presents the grim realities of digital preservation. Unfortunately, it turns out that digital forms of materials are just as fragile as physical forms and require just as much care. This includes materials that are “born digital” and have never existed in physical form, such as websites. At first this seems counter-intuitive, but Day discusses how changes in technology can render a digital copy obsolete and unreadable much more quickly than time would render a book or a moving image unusable. Digital preservationists struggle with whether to migrate an obsolete file type to a new file type every few years or to build emulators on new platforms capable of reading old file types.

Deanna Marcum and Amy Friedlander of the Council on Library and Information Resources argue that the history of preservation in libraries can teach us much about how to move forward with digital preservation. Just as libraries have learned from mistakes like printing on cheap paper or cutting up and destroying books for microfilming, they will continue to learn as they struggle to deal with the realities of life in a digital age. According to Marcum and Friedlander, a library’s mission will always be, “…to preserve the resources on which research, teaching, and learning so heavily depend,” and, I would argue, to provide access to those resources. Marcum and Friedlander note that today’s libraries will have to consider their history as well as their users as they try to keep meeting that mission.

Clearly, the future of libraries is full of uncertainty. What is most encouraging to me as I enter the profession is that all of this debate and discussion is going on in earnest all over the globe and in a variety of settings, including libraries, archives, museums, and even corporations. Digital preservation is a problem of interest to so many different groups of stakeholders, and there are so many smart people working on it, that it seems impossible that no good solution will be found. I’m excited to be a part of the generation of librarians who get to tackle this problem.


