A Semantic Makeover for CMS Data

  • Bill Levay, @wjlevay, wjlevay@gmail.com, Linked Jazz Project, Code4Lib first-timer

How can we take semi-structured but messy metadata from a repository like CONTENTdm and transform it into rich linked data? Working with metadata from Tulane’s Hogan Jazz Archive Photography Collection, the Linked Jazz Project used OpenRefine and Python scripts to tease out proper names, match them with name authority URIs, and specify FOAF relationships between musicians who appear together in photographs. We created additional RDF triples for any dates associated with the photos, and used GeoNames URIs for images that carry place information. Historical images and data that were once siloed can now interact with other datasets, like Linked Jazz’s rich set of names and personal relationships, and can be visualized (see prototype visualization) or otherwise presented on the web in any number of ways.
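As a rough illustration of the co-occurrence step, here is a minimal Python sketch that turns "these people appear in the same photo" into FOAF triples, serialized as N-Triples. It is not the project's actual script: the function name `cooccurrence_triples`, the photo identifiers, and the `example.org` URIs are all hypothetical placeholders, and the choice of `foaf:knows` as the predicate is an assumption (the abstract only says "FOAF relationships").

```python
from itertools import combinations

# Assumption: foaf:knows is used here for illustration; the project may use a different predicate.
FOAF_KNOWS = "http://xmlns.com/foaf/0.1/knows"


def cooccurrence_triples(photo_people):
    """Yield one N-Triples line per ordered pair of people named in the same photo.

    photo_people maps a photo identifier to a list of name authority URIs.
    Pairs that recur across photos are emitted only once.
    """
    seen = set()
    for uris in photo_people.values():
        # De-duplicate names within a photo, then take every unordered pair.
        for a, b in combinations(sorted(set(uris)), 2):
            # Emit the relationship in both directions.
            for subj, obj in ((a, b), (b, a)):
                if (subj, obj) not in seen:
                    seen.add((subj, obj))
                    yield f"<{subj}> <{FOAF_KNOWS}> <{obj}> ."


# Hypothetical input: placeholder URIs standing in for reconciled name authority URIs.
photos = {
    "photo-001": [
        "http://example.org/name/musician-a",
        "http://example.org/name/musician-b",
    ],
    "photo-002": [
        "http://example.org/name/musician-a",
        "http://example.org/name/musician-b",
        "http://example.org/name/musician-c",
    ],
    "photo-003": ["http://example.org/name/musician-c"],
}

triples = list(cooccurrence_triples(photos))
for line in triples:
    print(line)
```

A photo with a single identified person (like `photo-003` above) produces no triples, since there is no one to pair with. Date and place triples could be generated the same way, with the photo as the subject and a date literal or GeoNames URI as the object.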

GitHub: https://github.com/wjlevay/tulane-jazz-data

Presentation Slides: download via Dropbox