This week both Dan Cohen’s blog and ArchivesNext posted about the new Archives Wiki sponsored by the American Historical Association (AHA). The AHA blog summarizes the goals of this wiki as:
…we hope that by harnessing this (relatively) new technology for collaboration on the web, we can draw on the collective interests of thousands of researchers and archivists to develop a rich resource for anyone venturing into new archives for the first time.
The AHA post goes on to express the hope that the wiki “will provide a deeper level of information than the rather general information on most archival web sites”. Setting aside the question of whether this wiki will reach critical mass with regard to contributions, the idea of collecting lots of information about archives and their collections got me thinking again about Freebase.com.
My earlier post, Metadata World Building: Freebase.com and OpenLibrary.org, considers the potential of using Freebase to build a set of structured data about archival institutions. I believe in the spirit behind the Archives Wiki, but I wish that the rich set of information that is going to be captured were being stored in separate attributes rather than free-form wiki text. I know that they have contributor guidelines, but that isn’t enough for me.
Why Structured Data Is So Cool
Why am I so hung up about structured data? This is the sort of thing that needs a good example – and thanks to Level 1 Librarian’s post OCLC maps the world, I found my way to the amazing OCLC WorldMap project. The WorldMap itself is a Flash-based application that lets you explore data about both WorldCat holdings and other related statistics for countries around the world.
Once you get into the application, click on any two countries (I chose Russia and Australia) and then click on the ‘Compare’ button. For those especially interested in Archives, click on the ‘Cultural Institutions’ button (3rd from the bottom). If you move your mouse over each of the bars on the bar graphs you can see the actual numbers driving them. For example, on my Cultural Institutions comparison chart for Russia vs Australia I can see that Russia has 112 Archives while Australia has 42. The data source for both of these numbers is listed as the International Directory of Archives. To see the sources for the data, click on one of the tabs labeled with a country name and then click on any number to see its source. If my instructions are lacking, take a look at the beautiful and thorough Key to the WorldMap.
If people are going to go to all this effort entering data about archives and their collections, I wish it were being collected in such a way that we could then build new and more fabulous tools for accessing, manipulating and exploring the information.
As an example – if we collected the hours of each archives in a structured way we could figure out how many hours a week the Illinois State Archives is open (Monday–Friday, 8:00 a.m.–4:30 p.m.; Saturday, 8:00 a.m.–3:30 p.m. = 50 hours) and contrast that with the weekly hours of the Missouri State Archives (8 a.m. to 5 p.m. Monday through Wednesday and Friday; 8 a.m. to 9 p.m. on Thursday; and 8:30 a.m. to 3:30 p.m. on Saturday = 56 hours). We could figure out how many state archives have evening hours or weekend hours. How about a map that showed the historian visiting a new city which archives were open late on the one night he has off from his conference? This is just a tiny example – but I hope it lights a spark for people about the promise of collecting this simple kind of data in a structured way.
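To make the hours example concrete, here is a minimal Python sketch. The record layout (day, opening time, closing time) and the function names are my own invention for illustration, not anything the Archives Wiki or Freebase actually provides. The hours themselves are the ones quoted above.

```python
from datetime import datetime

# Hypothetical structured records: one (day, open, close) row per open day,
# with times written as zero-padded 24-hour "HH:MM" strings.
HOURS = {
    "Illinois State Archives": [
        ("Mon", "08:00", "16:30"),
        ("Tue", "08:00", "16:30"),
        ("Wed", "08:00", "16:30"),
        ("Thu", "08:00", "16:30"),
        ("Fri", "08:00", "16:30"),
        ("Sat", "08:00", "15:30"),
    ],
    "Missouri State Archives": [
        ("Mon", "08:00", "17:00"),
        ("Tue", "08:00", "17:00"),
        ("Wed", "08:00", "17:00"),
        ("Thu", "08:00", "21:00"),
        ("Fri", "08:00", "17:00"),
        ("Sat", "08:30", "15:30"),
    ],
}


def span_hours(open_time, close_time):
    """Length of one opening period in hours."""
    fmt = "%H:%M"
    delta = datetime.strptime(close_time, fmt) - datetime.strptime(open_time, fmt)
    return delta.total_seconds() / 3600


def weekly_hours(name):
    """Total hours per week that the named archives is open."""
    return sum(span_hours(o, c) for _day, o, c in HOURS[name])


def open_late(day, after="18:00"):
    """Names of archives that close later than `after` on the given day."""
    # Zero-padded "HH:MM" strings sort in time order, so a plain
    # string comparison is enough here.
    return [
        name
        for name, rows in HOURS.items()
        if any(d == day and close > after for d, _open, close in rows)
    ]


def open_on(day):
    """Names of archives with any hours at all on the given day."""
    return [name for name, rows in HOURS.items() if any(d == day for d, _o, _c in rows)]
```

With these records, `weekly_hours("Illinois State Archives")` gives 50.0 and `weekly_hours("Missouri State Archives")` gives 56.0, matching the totals worked out by hand above, while `open_late("Thu")` returns only the Missouri State Archives. None of these questions can be answered reliably when the hours live in a free-form sentence.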
Freebase’s whole point is to build data-sets that can drive interesting applications – like WorldMap. This just makes me want to race back to Freebase and figure out how to capture what I wish were being captured by the ArchivesWiki within the confines of Freebase’s model.
OCLC’s WikiD
I was just about to end this post when I realized I ought to check to see if someone has already tackled this problem of adding structured data to a wiki. I found my way to OCLC’s WikiD (Wiki/Data) project. The project’s home page states: “WikiD (Wiki/Data) extends the Wiki model to support multiple WikiCollections containing arbitrary schemas of XML records with minimal additional complexity.” From a brief look around, I am not clear how this would integrate with the more traditional wiki style of the Archives Wiki – nor am I convinced that this project is still moving forward (the most recent dates I see on presentations are from 2006) – but the idea of a wiki that includes structured data is definitely there. Anyone out there have any more information about WikiD or any other tool that combines wiki-style ease with the ability to structure data?
Final Thoughts
Again, I love the vision inherent in the Archives Wiki. I know that even getting a project like this off the ground is a big deal. I found this old AHA blog post from October of 2006 that discusses the original proposal for it and why it should be done. All the reasons are sound. But (you knew there was a but) the database geek in me just goes nuts when I see structured data being typed in free-form.
I’m the author of the WikiD project. Thanks for the plug.
WikiD was a research prototype and has since been re-engineered from the ground up. It is currently being used by OCLC in several production applications such as our institution registry (http://worldcat.org/registry/Institutions) and the GDFR project (http://hul.harvard.edu/gdfr/).
I haven’t had time to put together an open-source distribution of the new version. That has been our plan all along, but there hasn’t been enough time. In the meantime, I will be happy to answer questions, though.
Jeff
Jeff,
Thank you so much for the WikiD update! It is exciting to know that it is still alive and moving forward.
Jeanne
There is the Semantic MediaWiki, also: http://ontoworld.org/wiki/Semantic_MediaWiki
Sam,
Thank you so much for the link. That looks fabulous – and like it has the best of both worlds.
Jeanne