:: The Impersistent Web

There’s a thread discussing whether or not the internet is a valid source to use for reference.

Now, i’m not talking about finding a cookie recipe (use turbinado sugar for both white and brown and baking powder instead of baking soda) or how many Shih Tzu would be required to keep your yak herd safe from Yeti (30, by the way), but more about writing a paper or dissertation for posterity.

The problem is that the web generally isn’t a long-term storage mechanism. Granted, this probably sounds really weird considering what i do for a living, but i don’t think that the web really is built for doing hard reference works. Books and journals don’t generally disappear because someone stopped paying their bills. Likewise, indexes don’t suddenly break years later because someone realized they suck at site design. i’m also fairly certain that there’s no equivalent of a 404 in the Dewey decimal system.

Thing is, more and more folks are using the web for hard research. i’ve seen plenty of examples where folks cite sites where once they cited more durable things like magazines.

Sites like the Internet Archive do a fair job of trying to preserve information, but even they have issues, and not everyone wants to have their information permanently archived.

Thing is, what does this say about future research? Years from now, will there be the equivalent knowledge loss of the burning of the Alexandrian Library every year as sites come and go? Will the collective mindshare of the internet lose virtual braincells faster than a freshman at his first frat party as information sources become increasingly disconnected? Will information be forced to live in protected environments where accessibility, bandwidth and storage are guaranteed, for a price? How does one determine what bits of information are worth “archiving” and which aren’t?

If i set aside a fund that guarantees my site will survive far longer than i will, does that make my crap somehow more worthy than the work of someone who doesn’t, or never gets the chance to, preserve theirs?

Thing is, i don’t know if there is an answer. This is a new medium and a weird one, but then it’s also built off of the same sort of issues that hit other media. In defense of the web, i’d actually recommend that if someone were to use the web as a research source, they include the research materials along with the finished work. So instead of handing in a printed document with a set of back references, i’d recommend handing in a CD (or DVD) containing copies of the content. Fortunately, in most academic situations, fair use covers the copies you’re making, but naturally DRM comes into play with other content.
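For the curious, here is a rough sketch of what "hand in copies of the content" might look like in practice. It is only an illustration, not a recommendation of any particular tool: the function names (`fetch`, `archive_sources`), the `manifest.json` file, and the file naming scheme are all made up for this example. It saves the raw bytes of each cited URL, along with the retrieval time and a checksum, into a folder that could then be burned to a disc.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path
from urllib.request import urlopen


def fetch(url):
    """Download the raw bytes of a page using only the standard library."""
    with urlopen(url, timeout=30) as response:
        return response.read()


def archive_sources(urls, out_dir, fetcher=fetch):
    """Save a copy of each cited URL plus a manifest describing the copies.

    `fetcher` is a hypothetical hook so the download step can be swapped out.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    manifest = []
    for url in urls:
        entry = {
            "url": url,
            "retrieved": datetime.now(timezone.utc).isoformat(),
        }
        try:
            body = fetcher(url)
        except Exception as err:
            # Dead link, timeout, and so on: note the failure and keep going.
            entry["error"] = str(err)
        else:
            digest = hashlib.sha256(body).hexdigest()
            name = digest[:16] + ".bin"  # illustrative naming scheme
            (out / name).write_bytes(body)
            entry["file"] = name
            entry["sha256"] = digest
        manifest.append(entry)
    # The manifest ties each saved file back to the URL it came from.
    (out / "manifest.json").write_text(json.dumps(manifest, indent=2))
    return manifest
```

Calling `archive_sources(list_of_cited_urls, "paper_sources")` would leave a folder holding the saved pages and a manifest. This only grabs the single file behind each URL, not the images or stylesheets a page pulls in, so treat it as a starting point at best.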

Granted, like early radio and TV, there is the ever-present issue of “where does this go” for more permanent storage. Probably much like radio and TV, i’m willing to bet “shoe boxes” will be the predominant repository.

But ultimately, i don’t have an answer to how, or even if, this will be solved. Doesn’t mean i can’t think about it though…

