Driving Us Crazy
Ruminations on the aging of information in the information age
by Christopher Dow
When I was a teenager, I owned a somewhat mysterious electrical device. It was a metal box, olive green and seven or eight inches square, with a power cord, a small speaker grill, and a little door that opened to reveal primitive electronics and a mechanism whose function was obscure. The only clue to the device’s identity was an embossed metal plate stating that the box was a wire recorder manufactured in the 1930s.
Although I’d never heard of wire recorders, I had a tape recorder and could extrapolate that wire recorders must have been the first devices used to imprint electromagnetic signals onto a metallic medium. How cool, I thought, and I immediately desired a recordable wire to see if the thing still worked.
Unfortunately, this was 30 years after the recorder’s manufacture, and the company that made it was no longer in business. My dream of testing the device waned, and I soon went off to college and lost track of it. Maybe my brother dismantled it, or maybe my mom tossed it out. Whatever happened, it vanished from my life, and I quite forgot about it until a newspaper article 15 years later brought it to mind.
The article was about two university researchers who had run across a carton filled with mysterious spools of wire. After determining that the spools were wire recordings, they wondered what was recorded on them, but they were unable to find a wire recorder to play them.
I should have taken it as a warning—the canary in the information age coal mine—but it would be another 10 years before the truth really hit home. By then, I was watching in dismay as my vinyl LPs, Betamax tapes, and other media in which I had invested considerable time and money grew obsolete almost overnight. Even worse was what was happening to my digital data. In the early 1980s, I was the proud owner of one of the first personal computers—a Kaypro that used the CP/M operating system and 5.25” floppy disks—and when it finally crashed, the manufacturer was out of business, CP/M had evolved into DOS, and 3.5” floppies were the medium of exchange. All that data on those old 5.25” disks was essentially irretrievable. Sure, there were technicians who could convert the data, but they wanted an arm and a leg to do the work. The boxes of backup disks that I had assiduously recorded to preserve my data in case of a disk malfunction sat there, mute and mocking me.
The shift to digital media isn’t the first transformation in media that people have experienced. “Plato can be very well explained as being located at the interface of oral and written traditions,” says Werner Kelber, the Isla Carroll and Percy E. Turner Professor of Religious Studies, who is interested in the cultural impact of media. “Medieval culture steadily moved toward a developed manuscript culture that eventually mutated into print culture—the high tech of the 15th and 16th centuries.”
Apparently such shifts have never been easy. We take writing for granted as the basis of our civilization, but writing was once extremely revolutionary—and controversial. “Plato had deep reservations about writing because a whole civilization that used to memorize Homer would be destroyed,” says Kelber. The late medieval church didn’t look favorably on print, either. “It rightly sensed that the new medium was going to undermine its authority,” Kelber continues. “People would read a printed Bible and form their own opinions. In addition, printing the Bible in vernacular languages gave momentum to national languages and identities but also helped draw new lines of religious and national division. It fragmented the unity of the Holy Roman Empire, compounding Catholic–Protestant polarities, which would trigger religious warfare culminating in the Thirty Years’ War, devastating Central Europe.”
Whatever their philosophical or ideological views, those worried about the effects of writing and printing made one prognosis that certainly has materialized. “I can quote you voices from the Hellenistic Age, and even in the Bible,” says Kelber, “where people complained that of making books there is no end.”
There are several ironies of the information age, not the least of which is that there are such massive quantities of information that it is impossible to be familiar with more than a small fraction of it. Book production is booming, and there are magazines and journals for every conceivable topic and purpose. And that just touches on print. Mass media developed in the 20th century—radio, film, audio recordings, television, and video—provide a steady onslaught compounded by computers and the Internet. “I ran across a history of the Web not long ago,” says Shisha van Horn, manager of technology services for Rice’s Office of Educational Technology. “In 1994, I think, there were something like 3,800 websites.” Less than a decade later, the Web has exploded into. . . . Well, can anyone really count all the sites out there? Google claims to search three billion, and I’m sure it’s missed a few.
Consider also a study by researchers at the University of California at Berkeley’s School of Information Management that found that last year’s production of print, film, optical, and magnetic content totaled roughly 23 exabytes. That’s 23 million million megabytes. The researchers estimate that new information is increasing by 30 percent per year. The thought is numbing even to those of us acclimated to dealing with large amounts of information—Plato probably would be driven mad.
The supreme irony of the information age, however, is not that there is more information than we can process but that it seems as difficult as ever to preserve.
One early promise of the digital age was that information, once digitized, remains both permanent and accessible, but just two decades later, we already are experiencing problems.
First of all, wholesale preservation of information is not possible, if for no other reason than the fact that digital media are far more delicate than most of us realize or are willing to admit. The frailty of media is nothing new. Ancient hieroglyphs have been sandblasted by the winds, whole histories have succumbed to fire both accidental and intentional, and books, newspapers, and magazines published on pulp paper are crumbling to the acidic dust of anonymity. Electronic media, like paper, are subject to physical destruction, and they have additional vulnerabilities as well. “On a computer,” Kelber points out, “retaining knowledge and deleting knowledge are only one click apart.”
Electronic data also is at the mercy of other forces. Although hard drives are relatively stable, they can—and do—crash, making data recovery extremely costly—or impossible.
Heat and dirt can damage equipment, and proximity to electromagnetic fields can corrupt or erase electronic files. “I’d try occasionally to restore stuff off floppy disks, and sometimes it would work and other times it wouldn’t,” says Hubert Daugherty, a Rice educational technology specialist. “The reliability of these transportable media is only mediocre. The CD-ROM is a pretty robust storage medium—it’s not magnetic but a format in which pits stamped in the plastic are read by a laser.” But, he says, CD-ROMs have an estimated shelf-life of only 20 to 30 years.
If that seems short, CD-Rs and CD-RWs have a shelf-life that makes pulp paper look positively eternal. What many people don’t realize is that those formats use a different technology than the stamped CD-ROMs that you buy, for example, from the music store. “The newer recordable CDs are fairly reliable for writing a disc, taking it somewhere else, and reading it,” explains Daugherty, “but they’re based on organic dyes. Instead of creating physical pits, the CD-R drive mimics the pits by burning spots on the dye. The drive can read the spots just as it reads the pits, but the discs don’t last as long because the dye is photo-sensitive. If you leave them in sunlight, they can degrade over just a week or two. Even conventional room lighting can make them degrade in just a few years.”
"If you're dealing with something that's preserved electronically, you also need the appropriate device to read the media."
You feel that your scholarly indentity is damaged because you can't connect with your own scholarly past. I desperately try to find ways of making material usable once again, but too often it's a waste
"A lot of webpages have no identifying information on them—they don't say who created the page, when it was created, or who to contact if there are problems."
—Shisha Van Horn
"There's not much economic incentive to make easy conversion paths between old and new media. In any case, how far back will such paths go?"
"I'd try occasionally to restore stuff off floppy disks, and sometimes it would work and other times it wouldn't. The reliability of these transportable media is only mediocre."
So while we have this idea that information in the digital realm is permanent, all the digital storage media are highly susceptible to natural phenomena—light, electromagnetic fields, heat, dirt—as well as to mechanical failure. Even worse, electronic media have a whole range of barriers that can render them useless even if they are intact and sound. Those barriers exist because electronic media are created by specific devices and software whose incredible proliferation and mutation make data management and retention difficult or practically impossible. “When you’re working with a book, you don’t need anything except the object itself,” explains Lisa Spiro, director of the Electronic Text Center in Fondren Library. “But if you’re dealing with something that’s preserved electronically, you also need the appropriate device to read the media.”
This means that a data disk can all too quickly turn into an information black hole. “In the mid-1990s,” says Spiro, “librarians became increasingly conscious that formats change so rapidly that today’s standard formats might be something entirely different in two years. If people used, say, WordStar as their word processing application in the late 1980s, how can they open up those WordStar files now? Some word processors may have that backwards compatibility, but it can be a real headache.”
Transition paths are set up by companies, points out Tony Gorry, the Friedkin Professor of Management, professor of computer science, and director of Rice’s Center for Technology in Teaching and Learning. “You have the 5.25" floppy, and they want to make a path for you to get to the 3.5" disk.” But he doesn’t see it as an issue that engages a lot of business interests. “There is not much economic incentive to make easy conversion paths between old and new media,” he says. In any case, how far back will such paths go? “Apple, for example, has announced that it will no longer support Operating System 9,” Gorry says. “My guess is that a bunch of people will lose things unless they keep their old Macintoshes around.”
Obviously, the leap from analog to digital is the single greatest hurdle. After all, even if digital formats aren’t directly compatible, software can be written to facilitate conversion from one format to another. But this isn’t necessarily possible for the average person facing a whole new round of conversions every time hardware or software changes. And things are certain to get more complex as formats continue to proliferate. “The problem with speeding up the process of communication and changing the machines and formats is not just that things become rapidly outdated,” says Kelber. “Information is invariably lost in the process.”
In addition to generating a plethora of media formats that are fragile, vulnerable, and prone to obsolescence, the information age has produced a whole new category of data sources that are marked, at least as things stand, more by their unreliability and ephemerality than by any other characteristics. These are, of course, websites.
Quality always has been a problem for researchers citing sources. It used to be that researchers went to a library not just because information sources were housed there but because those sources had been professionally assessed and deemed acceptable in quality. Now, people often turn to the Internet, and while there is quantity galore, the quality is much in debate.
“One effect of this barrage of information,” Gorry says, “is that it’s giving people an opportunity to be indifferent to proof. You get information from all sorts of sources, and everybody’s opinion is as good as everybody else’s. There’s this cute cartoon of two dogs sitting in front of computers, and one dog says to the other, ‘You know, the great thing about the Internet is that, when you’re on it, nobody knows you’re a dog.’”
Van Horn believes that the Internet has made us sloppy about research. “Faculty members want their students to be very cognizant of sources of information and to evaluate it and its appropriateness,” she says, “and yet a lot of webpages have no identifying information on them—they don’t say who created the page, when it was created, or who to contact if there are problems. How can you evaluate if it’s useful or not if you can’t tell whether its creator is a knowledgeable specialist or a teenager putting things on a personal webpage? Also, even though only a fraction of the past has been digitized, we have this bizarre sense that everything is on the Web, so if you do research there, you have everything you need.”
Aside from the difficulties in dealing with the sheer quantity of information on the Internet and assessing quality, there is the problem of permanence.
Unlike libraries, where researchers could be reasonably sure that others might have access to the same materials in perpetuity, the Internet is constantly in flux. “There’s always the frustration of that dreaded 404 error message that says the webpage you’re looking for is gone,” says Spiro.
Van Horn agrees that, regarding research, the Internet’s impermanence is a huge concern. “How do you even reference something that’s online,” she says, “when, in six months, it may not be there at all? Is it enough to note the URL and date?”
“The transitory nature of the Internet is a very serious problem,” Kelber says. “One may argue that the past is not important and that it’s not a major issue if the information flow changes. But I believe that an understanding of the past is an essential part of our civilization and identity, and if that process is increasingly undermined, I have worries about a civilization that lives, as some people do, only on the surface of the present.”
There are some individual and organized efforts to preserve what’s on the Internet. “For instance, the Internet Archive [http://archive.org] is trying to capture snapshots of the Web,” says Spiro, “in part to facilitate historical research and in part to address this problem of not being able to find webpages that were there six months ago. And other efforts are under way, both technical and organizational, to provide greater permanence and institutional resources, to help produce more universal names that are more stable than URLs, and to improve institutional commitments to preserve webpages.”
In reality, though, if the information contained in the vast majority of websites is archived at all, it is archived only by the site’s creators. As far as most of us are concerned, if a URL displays a 404 error, the site and its contents have vanished as irrevocably as an old magazine printed on pulp paper.
I gained a little insight into how people react psychologically to the constant mutation in media formats and impermanence of information while I was talking to Werner Kelber. We were interrupted by a phone call from the department coordinator, who asked Kelber what type of disks he had that he needed converted. Kelber has several boxes of old floppy disks that contain lectures, notes, talks, bibliographies, and research, but his current computer doesn’t have a drive that can read them. As he answered the coordinator, the coincidence struck him, and after he hung up, he noted how perfectly the call illustrated the pervasiveness of this issue. “These are backups,” he said, waving over the boxes of disks. “But what’s the meaning of backup? What’s out in print is fine, but to suddenly be confronted with the fact that all this is now obsolete is an idea I find impossible to live with and reconcile with my ethos as an academician. You feel that your scholarly identity is damaged because you can’t connect with your own scholarly past. I desperately try to find ways of making this material usable once again, but too often it’s a waste of time.”
“I think it’s difficult for most people,” van Horn says simply, and she is echoed by Spiro, who says the typical reaction is confusion and befuddlement. But Spiro also agrees with Gorry, who sees an individual’s reaction as a matter of temperament. “The same pace of technology that is giving us new forms of media also is shaping the way people think about preservation of the past,” Gorry says. “In some respect, throwing away stuff seems to be part of digital culture. On the one hand, you have people who have built up collections, but there also are people who are, in some sense, exhilarated by the fact that they have to throw everything out. They wouldn’t have much regret and would deal with it as another chance to go get a bunch of new things.”
Not everyone is so comfortable facing the prospect of losing information, however. “I really worry that it’s going to be Newspeak over and over again,” says Daugherty. “All I can hope is that institutional mechanisms, such as universities, maintain the continuity.”
Change is the only constant, goes an old saying that has special currency today when you consider the 23 exabytes of new yearly data estimated by the UC–Berkeley researchers. It is true, however, that a lot of that data is not necessarily worth preserving. “People generally seem to have relatively little mourning giving up stuff that is not that memorable,” says Gorry. “After listening to a Britney Spears album for a few years, who cares about listening to it one more time? If she has to go, she has to go.”
Granted, some of us may not mind losing some of the “information” out there, but there is a lot of stuff that should be preserved, and that challenges archivists and collectors especially, though they now have better tools for preservation than ever in the form of the Internet, powerful databases, and more capacious hard drives. “Some of the things we’re not able to preserve because we cannot transfer them,” says Kinga Perzynska, director of the Woodson Research Center in Fondren Library. “But even when we can, there is the problem of how long the new media are going to last to preserve the information we have. That makes archivists stop and wonder what media they should use to preserve information.”
Libraries have faced a similar problem before in preserving paper that has acid in it, but the library community is struggling mightily with digital proliferation and extinction.
“There’s a lot of awareness that it’s a really significant problem that stands in the way of ambitious and important digitization projects,” says Spiro. “The Library of Congress recently received an allocation of, I think, $100 million for the National Digital Information Infrastructure and Preservation Program to study the problem and develop strategies for long-term preservation of digital content. And there are efforts under way in library schools and departments of computer science.”
“We all should understand the obstacles,” says Perzynska, “not just those of us who sit here trying to make sense of all this huge influx of information. Electronic media developers should work closely with the community and help us create resources for archiving instead of just producing and jumping from one technology to another, because that’s what’s scary—people want to use all these different technologies in producing and searching for information, but at the same time they do not consider guidelines designed to help us preserve information.”
For now, proliferation of media formats is the order of the day. The good news is that media devices will cost less and store more. The bad news is that breaks in continuity from format to format, with the attendant loss of information, will persist.
“The problem will be simplified, somewhat,” says van Horn, “because optical storage media—CDs and DVDs—will continue. They won’t replace tapes completely, but the things that evolve the slowest stay around the longest. So tape backup will probably be around for a long time, and so will books. The ultimate format will probably be something like the crystals in the Superman movies that contain the whole world of knowledge.”
The truth is that it’s very hard to know in advance what will work and what won’t because of the complex interaction between technology and the uses to which people put it. “The marketplace drives innovation,” says Gorry. “I read papers describing storing of information on molecules in various ways and storing information in other forms, but I don’t really know what’s going to happen. The one thing that we do know with some certainty is that storage capacity is a factor. The government will step in to try to make these things backward compatible, but that almost never works.”
Networking might provide a pragmatic approach to universal digitization, but networking poses challenges of accessibility, even aside from incompatibilities in software and between different versions of browsers. “There’s so much information going online that trying to find what you need is becoming harder and harder,” says van Horn. “Also, networking may help with transferring data between hard drives, but it doesn’t solve the problem of storage outside of drives.”
Kelber says that loss of information is a dilemma that needs closer attention. “Unfortunately, people who work with media are not sufficiently aware of it because many tend to think only technologically—they think that if we get this system hooked up with that system, we’ve done our jobs. But the cultural implications make it far too important to remain just a technological issue.”
Perhaps we eventually will reach a barrier beyond which we cannot make data devices faster, smaller, and more efficient—something akin to an information speed of light. Until then, of course, we’re unlikely to stop creating ever-new formats to supplant older ones. And in the process, information will be subject to another law of nature—the law of natural selection. Information that is robust and useful will find a home somewhere in cyberspace, and that which is no longer viable will die out and vanish, as extinct as sandblasted hieroglyphs and the Library at Alexandria.
This article originally appeared in the Winter 2004 issue of Sallyport: The Magazine of Rice University.