
Category: access

Political Campaign Ads from the NBC News Archives Find New Audience on Hulu.com

Thinking about politics, but waxing nostalgic for the good old days of movie stars and snappy jingles? Surf over to Hulu.com’s new gallery of Historic Campaign Ads. These are from iCue, which bills itself as “A fun, innovative learning environment built around the video from the NBC News Archives”.

And what would a political video blog post be without a political video? If you don’t see the video below, you can click through to view the I Like Ike ad from 1952 I chose for your viewing pleasure.

This is a great example of finding new audiences for material from archives. In this case, I had to dig for a while to discover that these were from the NBC News Archives. The Hulu iCue network/studio home page doesn’t really tell me anything – but you can imagine using a page like this to supply more information if you wanted to stress the archival origin of a set of videos.

Dipity: Easy Hosted Timelines

I discovered Dipity via the Reuters article An open-source timeline of the virtual world. The article discusses the creation of a Virtual Worlds Timeline on the Dipity website. Dipity lets anyone create an account and start building timelines. In the case of the Virtual Worlds Timeline, the creator chose to permit others to collaborate on the timeline. Dipity also provides four ways of viewing any timeline: a classic left-to-right scrolling view, a flipbook, a list and a map.

I chose to experiment by creating a timeline for Spellbound Blog. Dipity made this very easy – I just selected WordPress and provided my blog’s URL. This was supposed to grab my 20 most recent posts – but it seems to have taken 10 instead. I tried to provide a username/password so that Dipity could pull ‘more’ of my posts (they didn’t say how many – maybe all of them?). I couldn’t get it to work as of this writing – but if I figure it out you will see many more than 10 posts.

I particularly like the way they use the images I include in my posts in the various views. I also appreciate that you can read the full posts in-place without leaving the timeline interface. I assume this is because I publish my full articles to my RSS feed. It was also interesting to note that posts that mentioned a specific location put a marker on a map – both within the single post ‘event’ as well as the full map view.

Dipity also supports the streamlined addition of many other sources such as Flickr, Picasa, YouTube, Vimeo, Blogger, Tumblr, Pandora, Twitter and any RSS feed. They have also created some neat mashups. TimeTube uses your supplied phrase to query YouTube and generates a timeline based on the video creation dates. Tickr lets you generate an interactive timeline based on a keyword or user search of Flickr.

Why should archivists care? I always perk up anytime a new web service appears that makes it easy to present time and location sensitive information. I wrote a while ago about MIT’s SIMILE project and I like their Timeline software, but in some ways hosted services like Dipity throw the net wider. I particularly appreciate the opportunity for virtual collaboration that Dipity provides. Imagine if every online archives exhibit included a Dipity timeline? Dipity provides embed code for all the timelines. This means that it should be easy to both feature the timeline within an online exhibit and use the timeline as a way to attract a broader audience to your website.

There has been discussion in the past about creating custom GoogleMaps to show off archival records in a new and different way. During THATCamp there was a lot of enthusiasm for timelines and maps as being two of the most accessible types of visualizations. Anchoring information in time and/or location gives people a predictable way to approach new information.

Most of my initial thoughts about how archives could use Dipity related to individual collections and exhibits – but what if an archives created one of these timelines and added an entry for every one of their collections? The map could be used if individual collections were from a single location. The timeline could let users see at a glance what time periods were the focus of collections within that archives. A link could be provided in each entry pointing to the online finding aid for each collection or record group.

Dipity is still working out the kinks of some of their services, but if this sounds at all interesting I encourage you to go take a look at a few fun examples:

And finally I have embedded the Internet Memes timeline below to give you a feel of what this looks like. Try clicking on any of the events that include a little film icon at the bottom edge and see how you can view the video right in place:

Image Credit:  I found and ‘borrowed’ the Dipity image above from Dipity’s About page.

Flickr Terms of Service, Unwritten Guidelines and Safety Levels

As more cultural heritage institutions add photos to Flickr, such as these sets added by the Smithsonian, an AP article discussing freedom of expression in online public spaces identifies some issues that deserve attention. In ‘Public’ online spaces don’t carry speech, rights, Anick Jesdanun highlights a number of scenarios in which service providers (such as the Yahoo!-owned Flickr) clash with their users, including this one (italics my own):

Dutch photographer Maarten Dors met the limits of free speech at Yahoo Inc.’s photo-sharing service, Flickr, when he posted an image of an early-adolescent boy with disheveled hair and a ragged T-shirt, staring blankly with a lit cigarette in his mouth.

Without prior notice, Yahoo deleted the photo on grounds it violated an unwritten ban on depicting children smoking. Dors eventually convinced a Yahoo manager that – far from promoting smoking – the photo had value as a statement on poverty and street life in Romania. Yet another employee deleted it again a few months later.

This image on Flickr gives more details about the photo being removed – and this is the reinstated photo in question. The article points out “Service providers write their own rules for users worldwide and set foreign policy when they cooperate with regimes like China. They serve as prosecutor, judge and jury in handling disputes behind closed doors.” It makes me wonder if the ‘unwritten guidelines’ are applied evenly across Flickr. With the creation of The Commons area, it would be easy to create two standards – one for the general public and another for ‘blessed’ institutions. Images that are acceptable from the Brooklyn Museum (consider this set of Behind The Scenes photos of the Ron Mueck exhibition) might not be accepted from the average person. In my research I discovered a set of Public Domain photos from the National Archives. Some of the photos included in this set are historically valuable images that I would not necessarily want a child to see. Does this mean they shouldn’t be on Flickr? I don’t think so, but that certainly isn’t up to me.

Here are the relevant passages of the Yahoo! Terms of Service:

You agree to not use the Service to:

  1. upload, post, email, transmit or otherwise make available any Content that is unlawful, harmful, threatening, abusive, harassing, tortious, defamatory, vulgar, obscene, libelous, invasive of another’s privacy, hateful, or racially, ethnically or otherwise objectionable;
  2. harm minors in any way;

You acknowledge that Yahoo! may or may not pre-screen Content, but that Yahoo! and its designees shall have the right (but not the obligation) in their sole discretion to pre-screen, refuse, or remove any Content that is available via the Service. Without limiting the foregoing, Yahoo! and its designees shall have the right to remove any Content that violates the TOS or is otherwise objectionable.

That bit about ‘otherwise objectionable’ could be used to cover removal of anything. Being subject to the terms of service of Internet service providers is nothing new, but as archives, libraries and other cultural heritage institutions look for ways to increase their revenue streams and explore innovative ways to bring more eyes to their materials, it will become more important to understand these guidelines.

I understand (as the author of the article that inspired this post also points out) that Yahoo! is a business. Their priorities are not always going to be the same as those of the National Archives or the Brooklyn Museum. There are definitely images from history and the world of art that are only appropriate for adults, but isn’t that what Flickr’s content filter feature, named SafeSearch, is all about? These are the three ‘safety levels’ available on Flickr:

  • Safe – Content suitable for a global, public audience
  • Moderate – If you’re not sure whether your content is suitable for a global, public audience but you think that it doesn’t need to be restricted per se, this category is for you
  • Restricted – This is content you probably wouldn’t show to your mum, and definitely shouldn’t be seen by kids

It is interesting that Flickr has its own separate list of Community Guidelines, independent of Yahoo!’s terms of service. This is the passage from these guidelines about filtering content:

Take the opportunity to filter your content responsibly. If you would hesitate to show your photos or videos to a child, your mum, or Uncle Bob, that means it needs to be filtered. So, ask yourself that question as you upload your content and moderate accordingly. If you don’t, it’s likely that one of two things will happen. Your account will be reviewed then either moderated or terminated by Flickr staff.

I am still not sure what safety level I would use for a photo showing rows of dead in a concentration camp. I guess given the choices, ‘restricted’ is the best option – but that still doesn’t sit right with me somehow. I did an advanced Flickr search for ‘concentration camp’ with SafeSearch on – and those photos are not currently being marked as restricted. Who is it that we expect to be protecting using SafeSearch? From Flickr’s definition above it is supposed to at least be kids (and maybe your mom and Uncle Bob).

I think the question of the moment is how to know which images are appropriate to upload if some of the guidelines are unwritten. Flickr is a community and understanding the community is essential to success within that community. Once you believe your images are appropriate to include, then you must decide the right ‘safety level’. It is not clear to me how to tell the difference between an image that is not appropriate to be uploaded to Flickr and an image that is okay but needs to be marked with a safety level of ‘restricted’. I am very interested to see how this category of ‘appropriate but restricted’ evolves. For now, I am going to keep a watch on how the Flickr Commons grows and what range of content is included. The final answer for some of these images may be to only provide them via the institutions’ web sites rather than via service providers such as Flickr.

Image credit: Free Click by fikra (Sami Ben Gharbia) via Flickr

Clustering Data: Generating Organization from the Ground Up

My trip to the 2008 Information Architecture Summit (IA Summit) down in Miami has me thinking a lot about helping people find information. In this post I am going to examine clustering data.

Flickr Tag Clusters
Tag clusters are not new on Flickr – they were announced way back in August of 2005. The best way to understand tag clusters is to look at a few. Some of my favorites are the water clusters (shown in the image above). From this page you can view the reflection/nature/green cluster, the sky/lake/river cluster, the blue/beach/sun cluster or the sea/sand/waves cluster.

So what is going on here? Basically Flickr is analyzing groupings of tags assigned to Flickr images and identifying common clusters of tags. In our water example above – they found four different sets of tags that occurred together and distinctly apart from other sets of tags. The proof is in the pudding – the groupings make sense. They get at very subtle differences even though the mass of data being analyzed is from many different individuals with many different perspectives.
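To make the idea concrete, here is a minimal sketch of one way tags that "occur together" could be grouped. Flickr has not published its method in anything cited here, so this is only an illustration: the function name `tag_clusters`, the `min_pairs` threshold and the sample photos are all made up, and the technique (count co-occurring tag pairs, then take connected groups) is a deliberately simple stand-in.

```python
from collections import Counter, defaultdict
from itertools import combinations


def tag_clusters(photos, seed, min_pairs=2):
    """Group the tags that co-occur with `seed` into clusters.

    Illustrative only: two tags are linked when they appear together on at
    least `min_pairs` photos that also carry `seed`; clusters are the
    connected groups of linked tags.
    """
    pair_counts = Counter()
    for tags in photos:
        if seed not in tags:
            continue
        others = sorted(set(tags) - {seed})
        pair_counts.update(combinations(others, 2))

    # Build an undirected graph of tags that co-occur often enough.
    neighbours = defaultdict(set)
    for (a, b), count in pair_counts.items():
        if count >= min_pairs:
            neighbours[a].add(b)
            neighbours[b].add(a)

    # Walk the graph to collect each connected group of tags.
    clusters, seen = [], set()
    for start in sorted(neighbours):
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            tag = stack.pop()
            if tag in group:
                continue
            group.add(tag)
            stack.extend(neighbours[tag] - group)
        seen |= group
        clusters.append(sorted(group))
    return clusters


# Made-up sample data: each set is the tags on one photo.
photos = [
    {"water", "sea", "sand", "waves"},
    {"water", "sea", "waves"},
    {"water", "sand", "sea"},
    {"water", "sand", "waves"},
    {"water", "lake", "sky", "river"},
    {"water", "lake", "sky"},
    {"water", "river", "sky"},
    {"water", "lake", "river"},
    {"water", "sea", "sky"},  # a one-off pairing, below the threshold
]

water_clusters = tag_clusters(photos, "water")
print(water_clusters)
```

With this toy data the single photo tagged both sea and sky is not enough to merge the two groups, which is roughly the "distinctly apart" behavior described above.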

Tag clusters are very powerful and quite different from tag clouds. Tag clouds, by their nature, are a blunt instrument. They only show you the most popular tags. Take a look at the tag cloud for the Library of Congress photostream on Flickr. I do learn something from this. I get a sense of the broad brush topics, time periods and locations. But if you look at the full list of Library of Congress Flickr tags you see what a small percentage the top 150 really are (and yes.. that page does take a while to load). Who else is now itching to ask Flickr to generate clusters within the LOC tag set?

Steve.Museum
Another example of cultural heritage images being tagged is the Steve Museum Art Museum Social Tagging Project which lets individuals tag objects from museums via Steve Tagger. It resembles the Library of Congress on Flickr project in that it includes existing metadata with each image and permits users to add any tags they deem appropriate. I think it would be fascinating to contrast the traffic of image taggers on Steve.Museum vs Flickr for a common set of images. Is it better to build a custom interface that users must seek out but where you have complete control over the user experience and collected data? Or is it better to put images in the already existing path of users familiar with tagging images? I have no answers of course. All I know is I wish I could see the tag clusters one could generate off the Steve.Museum tag database. Perhaps someday we will!

Del.icio.us Tags
Del.icio.us, a web service for storing and tagging your bookmarks online, supports what they call ‘related tags’ and ‘tag bundles’. If you view the page for the tag ‘archives’ – you will see to the far right a list of related tags like those shown in the image here. What is interesting is that if I look at my own personal tag page for archives I see a much longer list of related tags (big surprise that I have a lot of links tagged archives!) and I am given the option of selecting additional tags to filter my list of links via a combination of tags.

Del.icio.us’s ‘tag bundles’ let me create my own named groupings of tags – but I must assemble these groups manually rather than have them generated or suggested. On the plus side, Del.icio.us is very open about publishing its data via APIs and therefore supporting third party tools. I think my favorite off that list for now has to be MySQLicious which mirrors your del.icio.us bookmarks into a MySQL database. Once those tags are in a database, all you need are the right queries to generate the clusters I want to see.
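As a rough idea of what "the right queries" might look like, here is a small sketch. It makes several assumptions: the table `bookmark_tags(url, tag)` is a hypothetical layout rather than the schema MySQLicious actually creates, the rows are invented, and Python's built-in SQLite stands in for MySQL so the example runs on its own. The query is a plain self-join that counts how often other tags share a bookmark with a given tag.

```python
import sqlite3

# Hypothetical one-row-per-(bookmark, tag) table; not MySQLicious's real schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bookmark_tags (url TEXT, tag TEXT)")
conn.executemany(
    "INSERT INTO bookmark_tags VALUES (?, ?)",
    [
        ("http://example.org/a", "archives"),
        ("http://example.org/a", "digitization"),
        ("http://example.org/a", "metadata"),
        ("http://example.org/b", "archives"),
        ("http://example.org/b", "metadata"),
        ("http://example.org/c", "archives"),
        ("http://example.org/c", "flickr"),
        ("http://example.org/d", "flickr"),
        ("http://example.org/d", "photos"),
    ],
)


def related_tags(conn, tag):
    """Return (other_tag, shared_bookmark_count) pairs, most shared first."""
    query = """
        SELECT b.tag, COUNT(*) AS shared
        FROM bookmark_tags AS a
        JOIN bookmark_tags AS b
          ON a.url = b.url AND b.tag <> a.tag
        WHERE a.tag = ?
        GROUP BY b.tag
        ORDER BY shared DESC, b.tag ASC
    """
    return conn.execute(query, (tag,)).fetchall()


archives_related = related_tags(conn, "archives")
print(archives_related)
```

Pair counts like these are only the raw material; grouping them into clusters would still need a second step on top of the query.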

Clusty: Clustered Search Results
An example of what this might look like for search results can be seen via the search engine Clusty.com from the folks over at Vivisimo. For example – try a search on the term archives. This is one of those search terms for which general web searching is usually just infuriating. Clusty starts us with the same top 2 results as a search for archives on Google does, but it also gives us a list of clusters on the left sidebar. You can click on any of those clusters to filter the search results.

Those groups don’t look good to you? Click the ‘remix’ link in the upper right hand corner of the cluster list and you get a new list of clusters. In a blog post titled Introducing Clustering 2.0 Vivisimo CEO Raul Valdes-Perez explains what happens when you click remix:

With a single click, remix clustering answers the question: What other, subtler topics are there? It works by clustering again the same search results, but with an added input: ignore the topics that the user just saw. Typically, the user will then see new major topics that didn’t quite make the final cut at the last round, but may still be interesting.

I played for a while.. clicking remix over and over. It was as if it was slicing and dicing the facets for me – picking new common threads to highlight. I liked that I wasn’t stuck with what someone else thought was the right way to group things. It gave me the control to explore other groupings.
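The remix behavior quoted above (cluster the same results again, but ignore the topics just shown) can be imitated with a toy sketch. This is not Vivisimo's algorithm, which is not described here beyond that quote; `top_topics`, its parameters and the sample results are invented, and "topics" are reduced to the most frequent terms across the results.

```python
from collections import Counter


def top_topics(results, ignore=frozenset(), k=2):
    """Pick the k most common terms across results, skipping ignored ones.

    Toy stand-in for clustering: each result is a set of terms, and a
    "topic" is just a term shared by many results.
    """
    counts = Counter()
    for terms in results:
        counts.update(set(terms) - set(ignore))
    # Most frequent first; ties broken alphabetically for a stable order.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:k]]


# Made-up search results, each reduced to a handful of terms.
results = [
    {"national", "records"},
    {"national", "government"},
    {"national", "records", "genealogy"},
    {"web", "internet"},
    {"web", "genealogy"},
    {"records", "web"},
]

first = top_topics(results)
remixed = top_topics(results, ignore=first)  # same results, earlier topics excluded
print(first, remixed)
```

Each remix surfaces the terms that just missed the cut the previous time, which matches the feeling of slicing the same results along new threads.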

Ontology is Overrated
Clay Shirky’s talk Ontology is Overrated: Categories, Links and Tags from the spring of 2005 ties a lot of these ideas together in a way that makes a lot of sense to me. I highly recommend you go read it through – but I am going to give away the conclusion here:

It’s all dependent on human context. This is what we’re starting to see with del.icio.us, with Flickr, with systems that are allowing for and aggregating tags. The signal benefit of these systems is that they don’t recreate the structured, hierarchical categorization so often forced onto us by our physical systems. Instead, we’re dealing with a significant break — by letting users tag URLs and then aggregating those tags, we’re going to be able to build alternate organizational systems, systems that, like the Web itself, do a better job of letting individuals create value for one another, often without realizing it.

I currently spend my days working with controlled vocabularies for websites, so please don’t think I am suggesting we throw it all away. And yes, you do need a lot of information to reach the critical mass needed to support the generation of useful clusters. But there is something here that can have a real and positive impact on users of cultural heritage materials actually finding and exploring information. We can’t know how everyone will approach our records. We can’t know what aspects of them they will find interesting.

There Is No Box
Archivists already know that much of the value of records is in the picture they paint as a group. A group of records shares a context that gives the individual records meaning. Librarians and catalogers have long lived in a world of shelves. A book must be assigned a single physical location. Much has been made (both in the Clay Shirky talk and elsewhere) of the fact that on the web there is no shelf.

What if we take the analogy a step further and say that for an online archives there is no box? Of course, just as with books, we still need our metadata telling us who created this record originally (and when and why and which record comes before it and after it) – but picture a world where a single record can be virtually grouped many times over. Computer programs are only going to get better at generating clusters, be they of user assigned tags or search results or other metadata. From where I sit, the opportunity for leveraging clustering to do interesting things with archival records seems very high indeed.

Of Pirates, Treasure Chests and Keys: Improving Access to Digitized Materials

Dan Cohen posted yesterday about what he calls The Pirate Problem. Basically the Pirate Problem can be summed up as “there are ways of acting and thinking that we can’t understand or anticipate.” Why is that a ‘Pirate Problem’? Because a pirate pub opened near his home and rather than folding shortly thereafter due to lack of interest from the ‘very serious professionals’ who populate DC suburbs – the pub was a rousing success due to the pirate aficionados who came out of the woodwork to sing sea shanties and drink grog. This surprising turn of events highlighted for him the fact that there are many ways of acting and thinking (some people even know all the words to sea shanties without needing sheet music).

Dan recently delivered the keynote speech at a workshop at the University of North Carolina at Chapel Hill. The workshop brought together dozens of historians to talk about how the 16 million archival documents of the Southern Historical Collection (SHC) should be put online. He devoted his keynote “to prodding the attendees into recognizing that the future of archives and research might not be like the past” and goes on in his post to explain:

The most memorable response from the audience was from an award-winning historian I know from my graduate school years, who said that during my talk she felt like “a crab being lowered into the warm water of the pot.” Behind the humor was the difficult fact that I was saying that her way of approaching an archive and understanding the past was about to be replaced by techniques that were new, unknown, and slightly scary.

This resistance to thinking in new ways about digital archives and research was reflected in the pre-workshop survey of historians. Extremely tellingly, the historians surveyed wanted the online version of the SHC to be simply a digital reproduction of the physical SHC.

Much of the stress of Dan’s article is on fear of new techniques of analysis. The choppy waters of text mining and pattern recognition threaten to wash away traditional methods of actually reading individual pages and “most historians just want to do their research they way they’ve always done it, by taking one letter out of the box at a time”.

I certainly like the idea of new technologically based ways of analyzing large sets of cultural heritage materials, but I also believe that reading individual letters will always be important. The trick is finding the right letter!

And of course – we still need the context. It isn’t as if when we digitize major collections like the SHC that we are going to scan and OCR each page without regard to which box it came out of. We can’t slice and dice archival records and manuscripts into their component parts to feed into text analysis with no way back to the originals.

I like to imagine the combination of all the new technology (be it digitization, cross collection searching, text mining or pattern recognition) as creating keys to different treasure chests. Humanities scholars are treasure hunters. Some will find their gems through careful reading of individual passages. Others will discover patterns spread across materials now co-existing virtually that before digitization would have been widely separated by space and time. Both methods will benefit from the digitization of materials and the creation of innovative search and text analysis tools. Both still require an understanding of a material’s origin. The importance of context isn’t going anywhere – we still need to know which box the letter came from (and in a perfect world, which page came before and which came after). I want scholars to still be able to read one page from the box – I just want them to be able to do it from home in the middle of the night if they are so inclined with their travel budget no worse for wear.

Dan ties his post together by pointing out that:

… in Chapel Hill I was the pirate with the strange garb and ways of behaving, and this is a good lesson for all boosters of digital methods within the humanities. We need to recognize that the digital humanities represent a scary, rule-breaking, swashbuckling movement for many historians and other scholars.

In my opinion, the core message should be that we just found more locked treasure chests – and for those who are interested, we have some new keys that just might open those locks. I enjoyed the Pirate metaphor (obviously) and I appreciate that there are real issues here relating to strong discomfort with the fast changing landscape of technology, but I have to believe that if we do something that prevents historians from being able to read one letter at a time we are abandoning the treasure chests that are already open for the new ones for which we haven’t yet found the right keys. I am greedy. I want all the treasure!

Image credit: key to anything by Stoker Studios via flickr

SAA2008: PDFs of Conference Presentations

I found another reason recently to be excited about the progress of SAA’s online presence. Buried in the ARCHIVES 2008: Archival R/Evolution & Identities Checklist for Presenters are the first tidbits of a plan to provide access to PDF versions of conference presentations on the SAA website.

Send an Electronic Copy of Your Presentation to SAA. The conference organizers would like to offer meeting attendees the opportunity to view presentations after the conference on the SAA 2008 Annual Meeting website (www.archivists.org). If you’ll supply a copy of your presentation, we’ll convert it to a PDF and post it. Please note that by sending SAA a copy of your presentation in electronic format, you grant permission for your presentation to be viewed by all SAA 2008 Annual Meeting attendees.

I am so pleased! I have always wanted access to the presentations – both for those sessions I attend and those I cannot. I have often been that person hovering at the edge of the stage after a panel, waiting to request a soft copy of the presentation.

I do wonder what they mean when they say that the presentations will be “viewable by meeting attendees”. In my heart of hearts I hope they go a step further and let the speakers sign off on these presentations being shared with the world (or at least with all of SAA). I haven’t gone through every Session Page on the SAA 2007 Un-Official Wiki, but I believe that not very many presenters took the opportunity to provide links to soft copies of their presentations. I hope that SAA is more successful on this front.

No matter the choices made relating to immediate access – I see this as a big step forward in the commitment to using technology. I think one of the best ways to learn is through getting your hands dirty. Technology is listed as one of SAA’s strategic priorities. Every choice that SAA makes that encourages their membership to become more tech-savvy is a step towards supporting that priority.

Big Digital Step For SAA: American Archivist Online

The Society of American Archivists has officially launched American Archivist Online (also available via the Members Only page once you log in to archivists.org).

Here are a few key points that caught my eye from the FAQ:

  • Content is available as PDF files with embedded searchable text (one file per article or section of the journal)
  • It is hosted by MetaPress
  • The online version will be produced in parallel with the print version

What issues are online?

Fall/Winter 2000 (Volume 63 – Number 2) through the most recent issue – Fall/Winter 2007. The FAQ reports that additional back issues will be digitized over time.

How is it structured?

Each journal article is a separate PDF file. Talk about a boon to graduate students and archives professors everywhere! Even the front matter is there separated out – perfect for printing and attaching to your article printouts for future reference. Of course, if you are feeling green (and better at reading on screen than I am) you can bookmark them or save them locally for future reference.

Who can access it?

Officially, only members of SAA and individual or institutional subscribers to the journal can access all available issues. In reality, it appears most of the issues are available to everyone. Currently only the Fall/Winter issues of 2005, 2006 & 2007 restrict access to all the content. Even for these issues there is access to some of the articles – such as the Book Reviews section in both the 2005 and 2007 Fall/Winter issues.

The FAQ claims that non-subscribers must pay a fee to print an article – but I don’t see how they will enforce that. When viewing a PDF of an article from the most recent issue I was able to save it to my local desktop and print it without a problem. Not sure if that is a bug or how it will remain – or if maybe they are talking about official reprints that are sent through the mail?

Other features

  • Try the handy Article Category search links – like this one that shows all the Presidential Addresses.
  • Mark or save articles to your own private lists (if you are logged in)
  • Search the full text – either across the journal or within an individual issue.
  • Subscribe to the RSS feed (I spotted it on the All Issues page). The feed includes the article abstract, category, author and source issue information. Be the first archivist on your block to know the instant the new issue is published online!

Final Thoughts

I think that everyone who heard President Adkins announce at SAA in Chicago that the American Archivist was going online was excited (well.. there was lots of clapping – that is for sure). That announcement was a strong indication to me of SAA’s commitment to improving their online offerings.

Finally seeing it available online is even better – actions speak louder than words.

Image Credit: SAA Logo from http://archivists.org/