2013-05-25: Game Walkthroughs As A Metaphor for Web Preservation

Do you remember playing the Atari 400/800 game "Star Raiders"?  Probably not, but for me it pretty much defined my existence in middle school: the obvious Star Wars inspiration, the stereo sound, the (for the time) complex game play, the 3D(-ish) first-person orientation -- this was all ground-breaking stuff for 1979.  It, along with games like "Eastern Front (1941)", inspired me at a young age to become a video game developer; an inspiration which did not survive my undergraduate graphics course.

I could encourage you to (re)experience the game by pointing you to the ROM image for the game, as well as an appropriate emulator (I used "Atari800MacX"), but without the venerable Atari joystick (the same one used in the more famous 2600 system), it just doesn't feel the same to me.  And although the original instructions have been scanned, the game play is complex enough that, unlike most games of the era, you can't immediately understand what to do.  So although emulation is possible, probably the best way to "share" my middle school experience with you is through one of the many game walkthroughs that exist on YouTube.



Game walkthroughs are quite popular for a variety of purposes: advertising the game, demonstrating a gamer's proficiency (e.g., speedruns), illustrating short cuts and cheats, even as a new form of cinema (e.g., "Diary of A Camper").  Walkthroughs are fascinating to me because they capture the essence of the game (from the point of view of a particular player) in what can be thought of as migration: recording and uploading what was originally an ephemeral experience.  Obviously the game play is canned and not interactive, but in some sense the expertly played Star Raiders session linked above does a better job of conveying the essence of 1981 than emulation, at least with respect to the 10 minute investment that the YouTube video represents.  (And yes, I realize the video was probably generated from an emulator.)  But let's put aside the emulation vs. migration debate for the moment (see David Rosenthal's "Rothenberg Still Wrong" if you'd like to read more about it).

I think game walkthroughs can provide us with an interesting metaphor for web archiving, not simply walkthroughs of web instead of game sessions (though that is possible), but in the sense of capturing a series of snapshots of dynamic services and archiving them.  Given "enough" snapshots, we might be able to reconstruct the output of a black box.

Consider Google Maps: a useful service so completely at odds with our current web archiving capabilities that "archiving Google Maps" isn't even a defined concept (see David's IIPC 2012 and 2013 blog posts for background on "archiving the future web").  The Internet Archive's Wayback Machine claims to have 11,000+ mementos (archived web pages) for maps.google.com:

http://web.archive.org/web/*/maps.google.com/

But only the first page is archived, clearly not the entire service.  If you start interacting with the mementos in the Wayback Machine, you'll find they're actually reaching out to the live web (see Justin's "Zombies in the Archives" for more discussion on this topic).  But Google Maps is sharable at each state of a user's interaction.  For example, here is the URL of the ODU Computer Science building in Google Maps:

https://maps.google.com/maps?q=4700+Elkhorn+Avenue,+Norfolk,+VA&hl=en&sll=36.885425,-76.306227&sspn=0.021179,0.035148&oq=4700+e,+Norfolk,+VA&t=h&hnear=4700+Elkhorn+Ave,+Norfolk,+Virginia+23508&z=16

It shortens to:

http://goo.gl/maps/TRldz

for easier sharing.  The shortened URIs of two zoom operations are:

http://goo.gl/maps/pTJJM
http://goo.gl/maps/hP0CY

I sent all three URIs to WebCite for archiving and they are accessible at, respectively:

http://www.webcitation.org/6GssbIkmD
http://www.webcitation.org/6GssvijCW
http://www.webcitation.org/6GstCs86N

Looking at a screen shot in WebCite, it appears that at least that view is archived:



But looking at the activity log shows that nearly all the connections are actually going to various google.com machines instead of archived versions at webcitation.org (2017-10-13 update: the map tiles no longer load from google.com, so the webcitation.org links above are more obviously broken):
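The same check can be made without reading the activity log by eye. The following is only a minimal sketch: it assumes you have already exported the list of URIs the browser requested while rendering the memento, and the function name and the `archive_domain` parameter are made up for illustration.

```python
from urllib.parse import urlparse


def live_web_leaks(request_uris, archive_domain="webcitation.org"):
    """Return the requested URIs that were NOT served from the archive.

    archive_domain is an assumption: any host equal to it, or a subdomain
    of it, is treated as "inside the archive"; everything else is a leak.
    """
    leaks = []
    for uri in request_uris:
        host = urlparse(uri).hostname or ""
        inside = host == archive_domain or host.endswith("." + archive_domain)
        if not inside:
            leaks.append(uri)
    return leaks


# Two URIs taken from this post: one archived copy, one live-web request.
requests_seen = [
    "http://www.webcitation.org/6GssbIkmD",
    "https://maps.google.com/maps?q=4700+Elkhorn+Avenue,+Norfolk,+VA",
]
leaked = live_web_leaks(requests_seen)
```

Any non-empty result means the "archived" page is still depending on the live web.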


Assuming we solve the problem of archiving all the requests and not reaching out to the live web (e.g., client-side transactional archiving), the next problem would be determining that these two Google Map URIs:

https://maps.google.com/maps?q=4700+Elkhorn+Avenue,+Norfolk,+VA&hl=en&ll=36.886692,-76.308128&spn=0.003744,0.004393&sll=36.885425,-76.306227&sspn=0.021179,0.035148&oq=4700+e,+Norfolk,+VA&t=h&hnear=4700+Elkhorn+Ave,+Norfolk,+Virginia+23508&z=18

https://maps.google.com/maps?q=4700+Elkhorn+Avenue,+Norfolk,+VA&hl=en&ll=36.887636,-76.306465&spn=0.003556,0.00537&sll=36.885425,-76.306227&sspn=0.021179,0.035148&oq=4700+e,+Norfolk,+VA&t=h&hnear=4700+Elkhorn+Ave,+Norfolk,+Virginia+23508&z=18

are "similar enough" that they can be substituted for each other in the playback of an archived session.  For example, if an archive has a memento of the first URI but the client is requesting a memento of the second URI, rather than return a 404 the first URI can probably be substituted in most cases.  Of course, this notion of similarity will be both a function of the URIs being archived (e.g., exploiting the fact that the above URIs are about geospatial data) as well as the client accessing the mementos (different sessions may have different thresholds for similarity).
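To make "similar enough" concrete, here is one possible sketch for the two URIs above. It assumes that the `ll` parameter is the map center (latitude,longitude) and `z` the zoom level, and it uses an arbitrary, caller-chosen distance threshold; the function names and the 250 m default are illustrative, not part of any archive's API.

```python
import math
from urllib.parse import urlparse, parse_qs

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres

URI_A = (
    "https://maps.google.com/maps?q=4700+Elkhorn+Avenue,+Norfolk,+VA&hl=en"
    "&ll=36.886692,-76.308128&spn=0.003744,0.004393"
    "&sll=36.885425,-76.306227&sspn=0.021179,0.035148"
    "&oq=4700+e,+Norfolk,+VA&t=h"
    "&hnear=4700+Elkhorn+Ave,+Norfolk,+Virginia+23508&z=18"
)
URI_B = (
    "https://maps.google.com/maps?q=4700+Elkhorn+Avenue,+Norfolk,+VA&hl=en"
    "&ll=36.887636,-76.306465&spn=0.003556,0.00537"
    "&sll=36.885425,-76.306227&sspn=0.021179,0.035148"
    "&oq=4700+e,+Norfolk,+VA&t=h"
    "&hnear=4700+Elkhorn+Ave,+Norfolk,+Virginia+23508&z=18"
)


def map_view(uri):
    """Return (lat, lon, zoom) parsed from the URI, or None if ll/z is missing."""
    params = parse_qs(urlparse(uri).query)
    if "ll" not in params or "z" not in params:
        return None
    lat, lon = (float(x) for x in params["ll"][0].split(","))
    return lat, lon, int(params["z"][0])


def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance between two points, in metres."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def similar_enough(uri_a, uri_b, max_meters=250.0):
    """True if both views share a zoom level and their centers are close."""
    a, b = map_view(uri_a), map_view(uri_b)
    if a is None or b is None:
        return False  # no center/zoom to compare, so do not substitute
    same_zoom = a[2] == b[2]
    return same_zoom and distance_m(a[0], a[1], b[0], b[1]) <= max_meters
```

With the default threshold, `similar_enough(URI_A, URI_B)` treats the two views as interchangeable; a stricter session could pass a smaller `max_meters` and get the 404 instead.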

For example, suppose you wanted to see the state of the ODU campus ca. 2013: from the ODU CS building, to the parking garage on 43rd & Elkhorn, to the football stadium off Hampton Boulevard.  Taking the three maps.google.com URIs above plus seven more, I uploaded an image slideshow to YouTube:



The video certainly preserves the experience to an extent (university campuses are constantly changing and growing and these aerial views will soon be "archival" instead of "current").  But imagine instead of a series of PNGs strung together into a video, there were 10 different HTML pages in an archive (along with the associated images).  You could still "scroll", but the transition from one memento to the next would be discrete (i.e., jerky instead of smooth like the live version of Google Maps).  To stretch the walkthrough metaphor further, it would be helpful if a memento like http://www.webcitation.org/6GssvijCW was navigable not only by the links that appear in the page, but also in the context of the archived URIs that precede and follow it; not unlike TAMU's Walden's Paths, but with archived content and the path information as a global property of the memento itself.
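One way to picture path information attached to mementos is an ordered list of steps that can answer "what comes before and after this memento?". The sketch below is purely illustrative: the `Step` and `Walkthrough` classes are hypothetical, and the notes assume the three WebCite URIs above are the initial view followed by the two zooms, in that order.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Step:
    memento_uri: str  # URI of the archived copy
    note: str = ""    # free-text description of this point in the path


class Walkthrough:
    """An ordered path through a set of mementos."""

    def __init__(self, steps: List[Step]):
        self.steps = list(steps)

    def neighbors(self, memento_uri: str) -> Tuple[Optional[Step], Optional[Step]]:
        """Return (previous, next) steps for a memento; None at either end."""
        for i, step in enumerate(self.steps):
            if step.memento_uri == memento_uri:
                prev_step = self.steps[i - 1] if i > 0 else None
                next_step = self.steps[i + 1] if i + 1 < len(self.steps) else None
                return prev_step, next_step
        raise KeyError(memento_uri)


odu_walk = Walkthrough([
    Step("http://www.webcitation.org/6GssbIkmD", "initial view"),
    Step("http://www.webcitation.org/6GssvijCW", "first zoom"),
    Step("http://www.webcitation.org/6GstCs86N", "second zoom"),
])
before, after = odu_walk.neighbors("http://www.webcitation.org/6GssvijCW")
```

A playback tool could use `before` and `after` to offer "previous" and "next" controls alongside the links already in the page.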

Over time there might be enough paths through Google Maps that we might be able to say we've preserved some usable percentage of it -- or at least the ODU CS building, parking garage, and football stadium ca. 2013.  There are a number of issues to be researched to make this easy enough for people to do (many of which our group is investigating), but the popularity of game walkthroughs and their preservation side-effects suggests to me that the web archiving community should be informed by them.  And if so, perhaps that will assuage my middle school dream of being a video game developer. 

--Michael
