Wednesday, July 26, 2006

Archiving the Web

I was once told (though this has the aura of an urban legend and might not be true) that in earlier years, when Microsoft published a new suite of software, some people would scour the printed manuals looking for errors--omitted copyright acknowledgments, improper use of trademarked names, etc.--and would use these errors to bring lawsuits against the company. The manuals were costly to recall and reprint, and Microsoft's exposure lasted for as long as the flawed copies remained in circulation.

Now much of this content is publicly available on the Web. The benefits for the company include, among other things, lower production costs than printed manuals and the ability to see what kind of information people are seeking and adjust as demand dictates. And, if an error is discovered, the content can be replaced immediately. While public exposure to incorrect content is wider (more people can potentially see it), the duration of exposure is much shorter and the content is far less costly to replace.

But if the content is archived, the company's exposure remains, and the chance to replace the content in any manner, costly or not, is gone. I’m not a lawyer, so this example may be bogus, but in this case the practice of archiving the Web negates, to some extent, the benefits of the Web's changeable nature.

If the alternative is not archiving at all, however, we lose a valuable piece of our history. The Web's changeable nature would then mean that one of the most extensive records (though perhaps not the most accurate) of our current worldwide society, culture, and existence is completely fleeting.

More likely, the real alternative is something else entirely. In any event, the outcome of the suit against archive.org may begin to lay the foundation for future regulation, along with other efforts to regulate this relatively new information medium.