Note: This article contained pictures that were copyrighted and could not be published on this Web page. Captions for those pictures appear in italics.
Scholars of antiquity and the Middle Ages often complain of insufficient information with which to piece together the historical record. Chroniclers of our own age may soon complain of the opposite problem: an avalanche of paper and electronic data that could make their task as difficult, in its own way, as writing about the worlds of Plato or Charlemagne. The space now needed to house the records of the Supreme Court for a single term equals that used for the court’s records from 1789 to 1837, and the Archives are running out of space and money. Much of the government’s decision-making is now conducted through electronic mail, but most of it is routinely erased. Numerous federal reports have warned that the United States is in danger of losing its memory by “erasing tomorrow’s records today,” yet the government is only now waking up to the extent of the problem.
|Stille01.jpg||One single government report, the Clinton administration’s ill-fated health reform plan, generated more than 250 boxes of documents to be catalogued and stored at the National Archives and Records Administration warehouse in Adelphi, Maryland.|
The National Archives and Records Administration (NARA) was created during the 1930’s on the optimistic premise that the government could keep all of its most vital records indefinitely, acting as our nation’s collective memory. Ironically, as we enter the so-called “Information Age” – with its apparent promise of unlimited information and instant access – that optimistic vision is in crisis. Drowning in data and choking on paper, the Archives are facing the stark realization that they may not be able to preserve what they already have, let alone keep up with the seemingly limitless flow of information coming their way.
The numbers are so huge as to be almost comical. The Archives are currently custodian to four billion pieces of paper; seven million photographs; 312,396 films and videos; 2,172,047 maps and charts; 2,079,380 architectural and engineering plans; and 8,995,819 aerial photographs. This prodigious volume of material already takes up some 20 billion cubic feet of warehouse space, and storage is eating nearly half of the Archives’ budget. Ironically, the more information the Archives keep, the less money they have to make it available to the public. The Archives, for example, are considering shutting down their record center in New York, forcing scholars and citizens to travel to Philadelphia to consult government records.
“We don’t have adequate funds to preserve all the records we already have, and for want of funds we have already reduced our reference services,” says John W. Carlin, the former governor of Kansas whom President Clinton appointed chief archivist. “Unless millions of dollars magically appear, we will continue this downward spiral, with less and less to spend on services and employees.”
The Archives recently completed a gigantic new warehouse at their headquarters in Adelphi, Md., outside of Washington, but, at current rates of accumulation (half a billion cubic feet of records a year), it will be full in twenty years. “Continuing as we are now simply won’t work,” Carlin said in a recent speech. “The status quo is not an option.”
What is particularly frightening about the Archives’ problem is that most government agencies don’t deposit their records until they are thirty years old, which means the Archives have not yet had to deal with the real explosion of information that has occurred in the last generation.
Computer technology was supposed to solve all of the Archives’ problems, but, so far, it has only compounded them. In 1989, a public interest group – the National Security Archive – successfully sued the White House to prevent it from destroying any electronic records, in an effort to obtain information on the Iran-Contra scandal. The result is that all branches of the federal government are now required to preserve all of their computer files and electronic mail. Because government offices use different kinds of computers, software programs and formats, just recovering this material has proved to be a logistical nightmare. It took the National Archives two and a half years (and its entire electronic records staff) just to make a secure copy of all the electronic records of the Reagan White House. It will take months longer to make most of them intelligible. “They are gibberish as they currently stand,” says Fynette Eaton of the Archives’ electronic records center.
Dealing with the White House records case has meant that the electronic records division of the Archives has fallen far behind in all other areas. “In 1993, we had plans to have 1,500 files available on-line within the next twelve months. Instead, three years later, we have zero,” says Eaton. “In the mid-1980’s, we had a study done that stated that we had a brief window of opportunity [to get on top of the information revolution] but we missed that window of opportunity because of the lawsuit.”
One of the many illusions of the digital revolution is that once something has been recorded on computer it has been preserved indefinitely. This is far from being the case. “If we have a piece of paper that is two hundred years old we will almost certainly be able to read it, but if you take a computer tape from thirty years ago, chances are you won’t be able to read it,” says Eaton. “The life of magnetic computer tape is generally about ten years. We have a regular program of copying our electronic data.”
|Stille02.jpg||John W. Carlin, the former governor of Kansas, is now chief archivist of the U.S. He is alarmed by the dramatic surge in records and the inadequate funds available to preserve them.|
But because of the astonishingly rapid changes in the computer industry, the software and hardware used to present old data may quickly become obsolete, leaving the records inaccessible. “On a purely physical level, we are just preserving ones and zeros,” Eaton says. “But you need to be able to read the records and present them in a way that is interpretable. If I have a word processing file from ten years ago, I can always show you the text. But if I’ve got the data base from the 1970’s, where they are displaying data on a map, I may not be able to make the data look like a map. It may depend on software or hardware that is no longer available. In the computer industry there is enormous pressure to come up with something new for people to buy and to make those things unlike other things that are already available.”
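Eaton’s distinction between preserving bits and preserving interpretable records can be made concrete with a small modern-day sketch (a hypothetical illustration, not anything the Archives actually use): the very same four preserved bytes become a different “record” under each format one assumes for them.

```python
import struct

# Four faithfully preserved bytes -- the "ones and zeros" -- whose
# original format has been lost (a hypothetical example).
raw = bytes([0x42, 0x28, 0x00, 0x00])

# The same bits yield a different record under each assumed format:
as_int = struct.unpack(">i", raw)[0]    # big-endian 32-bit integer
as_float = struct.unpack(">f", raw)[0]  # big-endian IEEE 754 float
as_text = raw.decode("latin-1")         # character data

print(as_int)         # 1109917696
print(as_float)       # 42.0
print(repr(as_text))  # 'B(\x00\x00'
```

Without a record of which interpretation the original system intended, all three readings are equally “correct” – which is exactly why copied tapes can remain gibberish.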
The potential losses from the first decades of the computer revolution could be considerable. For several years a disturbing rumor has circulated that the Archives had lost the data from the United States Census of 1960. According to the story, the information lies locked on obsolete thirty-six-year-old computer tapes that can no longer be read by today’s machines. The Archives continue to reassure the public that the material has been safely copied onto more modern media, but because census data must be kept private until seventy-two years after its collection, the rumor will probably persist until independent researchers can view the material for themselves in the year 2032. Nonetheless, even if apocryphal, the story contains an important element of truth: there are in fact only two computers in the world, one in the Smithsonian museum and the other in Japan, that can still read the original tapes.
The federal government, with its multitude of departments, agencies and offices, is a veritable Tower of Babel of different and incompatible computer languages and formats – many of them old and obsolete. Many of the records of the National Military Command Center are stored in a data base management system (known as NIPS) that IBM no longer supports and which the National Archives have difficulty translating into readable form. The Agent Orange Task Force has been unable to use herbicide records written in the NIPS format.
Although the data from the 1960 Census may be safe, the same may not be true of later surveys. “Bureau of Census files prior to 1989 threaten to eclipse the NIPS problem,” the Archives reported to Congress a few years ago. “The Bureau reported to us that they have over 4,000 reels of tape, containing permanently valuable data, which are difficult, if not impossible, to use because they are in CENIO (Census Input/Output) format or because the files have been compressed on an ad hoc basis.” Each computer tape can store 75,000 pages of information, so that, if the data cannot be recovered, the Census Bureau might lose up to 300 million pages of data.
The data base for the United States Railway Authority until 1981 was created through an information system known as BASIS. But the newer versions of the system cannot read the older format, making the information inaccessible. “Our inability to read this data is especially frustrating because it constitutes the only finding aid to a large volume of paper records in the National Archives,” NARA reported.
In 1989, NARA located several hundred reels of tape from the Department of Health and Human Services but has been unable to find any technical documentation that would make them usable. For similar reasons, the Archives have had to reject data files from the National Commission on Marijuana and Drug Abuse, the Public Land Law Review Commission, the Commission on School Finance, and the National Commission on Consumer Finance.
In 1990, an alarmed Congressional committee reported that the National Archives’ “current policies are inadequate to assure the long-term preservation of electronic records,” and most experts in the field see little reason to alter that gloomy assessment.
It is still hard to tell how much information will be permanently lost, but most experts believe it will be considerable. David Bearman, a systems expert at the University of Pittsburgh, believes that the government should give up on most computer data from recent decades and start over. “The amount of time, energy and money it would take to recover that data is ridiculously disproportionate to its value,” he says. “They ought to look to the future.”
In order to be sure of being able to preserve electronic information, one must have what is called “meta-data” – information about how the data is structured and formatted so that it can be reproduced in the future. “You want to set up records so that you have structural information about them that will make them usable in the future,” says Bearman. But the United States, unlike other countries such as Canada, has so far failed to do that.
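Bearman’s notion of “meta-data” can be sketched with a small, hypothetical example: a raw fixed-width data file is unreadable decades later unless a structural description – field layout, sizes, byte order – survives alongside it. The record layout, field names, and figures below are invented purely for illustration.

```python
import struct

# Hypothetical structural description -- the "meta-data" that an
# archive would need in order to re-read the raw file decades later.
metadata = {
    "description": "hypothetical population counts by region",
    "record_format": ">10sI",   # 10-byte region name + big-endian 32-bit count
    "record_size": 14,
}

# Writing the raw data file:
records = [(b"Northeast ", 51000000), (b"Midwest   ", 66000000)]
raw = b"".join(struct.pack(metadata["record_format"], name, n)
               for name, n in records)

# Years later, the bytes are intelligible only because the structural
# description survived alongside them:
fmt, size = metadata["record_format"], metadata["record_size"]
for offset in range(0, len(raw), size):
    name, count = struct.unpack(fmt, raw[offset:offset + size])
    print(name.decode().strip(), count)
```

Strip away the `metadata` dictionary and the file reverts to an opaque run of bytes – precisely the condition of the undocumented HHS tapes described above.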
Scott Armstrong, the journalist who brought the initial e-mail suit, insists that the National Archives has only itself to blame for missing the information revolution. “We had to bring them kicking and screaming into the 20th century,” he says, “and they are doing everything to slide backwards.”
Armstrong believes that the Archives have deliberately resisted computerization for political reasons. “If the information is in electronic form, it’s much easier to access. It’s because they are trying to avoid capturing the evidentiary history of the government. Presidents want to control what records form their historical record.” People in the intelligence and national security community, Armstrong says, want to limit access to electronic records because it will make the U.S. government too transparent. “They call it the ‘mosaic theory,’ whereby no single piece of information is by itself dangerous, but if you piece it all together you’ll be able to figure out what the government is up to.”
The archivists, Armstrong says, have been political appointees close to the White House, who have seen their duty as protecting the president rather than serving the public. “The previous archivist, Don Wilson, was there at the White House at the midnight hour as Bush was getting ready to leave office, trying to get the Bush records.”
The current archivist, Carlin, is the former Democratic governor of Kansas and close to President Clinton. He has an ambitious strategic plan to modernize the Archives, but Armstrong sees a continuing resistance to open access. While declassifying large amounts of data and adopting a public posture of openness, the Clinton administration has also moved to shut down access to certain kinds of information. While agreeing to comply with the court order to preserve electronic records, the administration now claims that the National Security Council is not an agency of the federal government, thus trying to avoid preserving the same kind of sensitive data that the Reagan and Bush administrations were forced to hand over.
The National Archives still seems to favor paper records over electronic ones, allowing government agencies to destroy electronic records if they have already been printed out. “It makes no sense,” says Armstrong. “If your basement were flooded, the first thing you would try to do is turn off the flow of water, and then start worrying about mopping up. The Archives are doing the exact opposite. They have been avoiding collecting electronic records. By now virtually all government records are on computer, but they are still telling people to print out their records onto paper. If the government had dedicated the energy it has spent fighting the e-mail lawsuits into modernizing its record-keeping operations, it would have gone a long way to solving its problems.”
|Stille03.jpg||Researchers like Kathryn Serkes, of Seattle, WA., who works for the Association of American Physicians and Surgeons, depend on the National Archives for information on how government works. She examined documents pertaining to the Clinton administration’s health care plan. Government documents are being lost to research because of processing backlogs, inadequate record keeping, and more recently, outdated computer programs.|
The strategic plan that Carlin has just issued promises to bring the government into the electronic era. But some of the chief experts in the field remain skeptical. “Most decisions are spent determining what should be destroyed, rather than what should be kept,” says Bearman. “In the mid-1980’s, the Joint Chiefs of Staff decided to destroy the minutes of all their meetings. Now, if someone had sat down and made a list of the most important records to keep for the post-World War II period, one of the first things they would have come up with is the minutes of the Joint Chiefs of Staff.”
The most important thing that the Archives could do to preserve electronic data, archivists seem to agree, is to impose uniform record-keeping standards on the various departments of the U.S. government. But, as a recent report of the Archives reveals, most government agencies have rejected this suggestion as too costly and too onerous.
©1997 Alexander Stille
Alexander Stille is a freelance writer from Atlanta, GA who is researching the future of the past.