Code4Lib 2010 Schedule

The schedule for the 2010 Code4Lib Conference in Asheville, NC.

Monday, February 22 -- Pre-Conferences

08:00-09:00 - Registration / coffee
09:00-12:00 - Morning Sessions
12:00-13:30 - Lunch (on your own)
13:30-16:30 - Afternoon Sessions

Tuesday, February 23

08:00-09:00 - Registration / Breakfast
09:00-09:15 - Welcome / Orientation / Housekeeping
09:15-10:00 - Keynote #1: Cathy Marshall [Video] [Page]
10:00-10:20 - Cloud4Lib - Jeremy Frumkin and Terry Reese [Video] [Page]
Major library vendors are creating proprietary platforms for libraries. We will propose that the code4lib community pursue the cloud4lib, an open digital library platform based on open source software and open services. This platform would not only provide common service layers for libraries via code, but also allow libraries to easily utilize tools and systems through cloud services. Instead of a variety of competing cloud services and proprietary platforms, cloud4lib will attempt to be a unifying force that will allow libraries to be consumers of the services built on top of it as well as allow developers / researchers / code4lib'ers to hack, extend, and enhance the platform as it matures.
10:20-10:40 - Break
10:40-11:00 - The Linked Library Data Cloud: Stop talking and start doing - Ross Singer [Video] [Page]
A year later, how far has Linked Library Data come? With the emergence of large, centralized sources (id.loc.gov/authorities/, viaf.org, among others), entry to the Linked Data cloud might be easier than you think. This presentation will describe various projects out in the wild that can bridge the gap between our legacy data and the semantic web, incremental steps we can take in modeling our data, and why linked data matters, and will demonstrate how small template changes can contribute to the Linked Data cloud.
11:00-11:20 - Do It Yourself Cloud Computing with Apache and R - Harrison Dekker [Video] [Page]
R is a popular, powerful, and extensible open source statistical analysis application. rApache, software developed at Vanderbilt University, allows web developers to leverage the data analysis and visualization capabilities of R in real time through simple Apache server requests. This presentation will provide an overview of both R and rApache and will explore how these tools might be used to develop applications for the library community.
11:20-11:40 - Public Datasets in the Cloud - Rosalyn Metz and Michael B. Klein [Video] [Page]
When most people think about cloud computing (if they think about it at all), it usually takes one of two forms: Infrastructure Services, such as Amazon EC2 and GoGrid, which provide raw, elastic computing capacity in the form of virtual servers, and Platform Services, such as Google App Engine and Heroku, which provide preconfigured application stacks and specialized deployment tools. Several providers, however, offer access to large public datasets that would be impractical for most organizations to download and work with locally. From a 67-gigabyte dump of DBpedia's structured information store to the 180-gigabyte snapshot of astronomical data from the Sloan Digital Sky Survey, and from chemistry and biology to economic and geographic data, these datasets are available instantly and backed by enough pay-as-you-go server capacity to make good use of them. We will present an overview of currently available datasets, describe what it takes to create and use snapshots of the data, and explore how the library community might push some of its own large stores of data and metadata into the cloud.
11:40-12:00 - 7 Ways to Enhance Library Interfaces with OCLC Web Services - Karen A. Coombs [Video] [Page]
OCLC Web Services such as xISSN, WorldCat Search API, WorldCat Identities, and the WorldCat Registry provide a variety of data which can be used to enhance and improve current library interfaces. This talk will discuss several simple ideas to improve current user interfaces using data from these services. JavaScript and PHP code to add journal table of contents information, peer-reviewed journal designation, links to other libraries in the area with a book, also available ..., and info about this author will be discussed.
12:00-13:00 - Lunch (provided)
13:00-13:20 - Taking Control of Library Metadata and Websites Using the eXtensible Catalog - Jennifer Bowen [Video] [Page]
The eXtensible Catalog Project has developed four open-source software toolkits that enable libraries to build and share their own web- and metadata-focused applications on top of a service-oriented architecture that incorporates Solr in Drupal, a robust metadata management platform, and OAI-PMH and NCIP-compatible tools that interact with legacy library systems in real-time. This presentation will showcase XC's metadata processing services, the metadata "navigator" and the Drupal user interface platform. The presentation will also describe how libraries and their developers can get started using and contributing to the XC code.
13:20-13:40 - Matching Dirty Data – Yet Another Wheel - Anjanette Young and Jeff Sherwood [Video] [Page]
This talk demonstrates one method of matching sets of MARC records that lack common unique identifiers and might contain slight differences in the matching fields. It will cover basic usage of several Python tools. No large stack traces, just the comfort of pure Python and basic computational algorithms in a step-by-step presentation on dealing with an old library task: matching dirty data. While much literature exists on matching/merging duplicate bibliographic records, most of this literature does not specify how to accomplish the task; it only reports on the efficiency of the tools used to accomplish it, often within a larger system such as an ILS.
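As a rough illustration of the kind of matching described above, here is a minimal sketch in pure Python. It is not the presenters' method: it assumes the matching field (here, a title string) has already been extracted from each MARC record, and it uses the standard library's difflib as a stand-in similarity measure. The names normalize, similarity, and best_matches, and the 0.9 threshold, are hypothetical choices made for this example.

```python
from difflib import SequenceMatcher
import re
import unicodedata


def normalize(value):
    """Lowercase, strip accents and punctuation, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", value)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", without_accents.lower())
    return " ".join(cleaned.split())


def similarity(a, b):
    """Return a 0.0-1.0 similarity ratio between two normalized strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def best_matches(left, right, threshold=0.9):
    """Pair each record in `left` with its best-scoring record in `right`.

    `left` and `right` map record ids to title strings (an assumed,
    simplified stand-in for fields pulled from MARC records). Only pairs
    scoring at or above `threshold` are returned.
    """
    matches = []
    for left_id, left_title in left.items():
        best_id, best_score = None, 0.0
        for right_id, right_title in right.items():
            score = similarity(left_title, right_title)
            if score > best_score:
                best_id, best_score = right_id, score
        if best_id is not None and best_score >= threshold:
            matches.append((left_id, best_id, best_score))
    return matches
```

This compares every record against every other, which is fine for small sets but grows quadratically; a real workflow would likely group candidates first (for example by publication year) before scoring.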
13:40-14:00 - HIVE: A New Tool for Working With Vocabularies - Ryan Scherle and Jose Aguera [Video] [Page]
HIVE is a toolkit that assists users in selecting vocabulary and ontology terms to annotate digital content. HIVE combines the ease of folksonomies with the rigor of traditional vocabularies. By combining semantic web standards with text mining techniques, HIVE will improve the effectiveness of subject metadata generation, allowing users to search and browse terms from a variety of vocabularies and ontologies. Documents can be submitted to HIVE to automatically generate suggested vocabulary terms. Your system can interact with common vocabularies such as LCSH and MeSH via the central HIVE server, or you can install a local copy of HIVE with your own custom set of vocabularies. This talk will give an overview of the current features of HIVE and describe how to build tools that use the HIVE services.
14:00-14:20 - Metadata Editing – A Truly Extensible Solution - David Kennedy and David Chandek-Stark [Video] [Page]
We set out in the Trident project to create a metadata tool that scales. In doing so we have conceived of the metadata application profile, a profile which provides instructions for software on how to edit metadata. We have built a set of web services and some web-based tools for editing metadata. The metadata application profile allows these tools to extend across different metadata schemes, and allows for different rules to be established for editing items of different collections. Some features of the tools include integration with authority lists, auto-complete fields, validation and clean integration of batch editing with Excel. I know, I know, Excel, but in the right hands, this is a powerful tool for cleanup and batch editing. In this talk, we want to introduce the concepts of the metadata application profile, and gather feedback on its merits, as well as demonstrate some of the tools we have developed and how they work together to manage the metadata in our Fedora repository.
14:20-14:40 - Break
14:40-15:50 - Lightning Talks 1
15:50-17:00 - Breakout Sessions 1
17:00-17:15 - Daily Wrap Up (include breakout reports?)

Wednesday, February 24

08:00-09:00 - Breakfast
09:00-09:15 - Housekeeping, Intros
09:15-09:35 - Iterative Development Done Simply - Emily Lynema [Video] [Page]
With a small IT unit and a wide array of projects to support, requests for development from business stakeholders in the library can quickly spiral out of control. To help make sense of the chaos, increase the transparency of the IT "black box," and shorten the time lag between requirements definition and functional releases, we have implemented a modified Agile/SCRUM methodology within the development group in the IT department at NCSU Libraries. This presentation will provide a brief overview of the Agile methodology as an introduction to our simplified approach to iteratively handling multiple projects across a small team. This iterative approach allows us to regularly re-evaluate requested enhancements against institutional priorities and more accurately estimate timelines for specific units of functionality. The presentation will highlight how we approach each development cycle (from planning to estimating to re-aligning) as well as some of the actual tools and techniques we use to manage work (like JIRA and Greenhopper). It will identify some challenges faced in applying an established development methodology to a small team of multi-tasking developers, the outcomes we've seen, and the areas we'd like to continue improving. These types of iterative planning/development techniques could be adapted by even a single developer to help manage a chaotic workplace.
09:35-09:55 - Vampires vs. Werewolves: Ending the War Between Developers and Sysadmins with Puppet - Bess Sadler [Video] [Page]
Developers need to be able to write software and deploy it, and often require cutting edge software tools and system libraries. Sysadmins are charged with maintaining stability in the production environment, and so are often resistant to rapid upgrade cycles. This has traditionally pitted us against each other, but it doesn't have to be that way. Using tools like Puppet for maintaining and testing server configuration, Nagios for monitoring, and Hudson for continuous code integration, UVA has brokered a peace that has given us the ability to maintain a stable production environment with a rapid upgrade cycle. I'll discuss the individual tools, our server configuration, and the social engineering that got us here.
09:55-10:15 - I Am Not Your Mother: Write Your Test Code - Naomi Dushay, Willy Mene, and Jessie Keck [Video] [Page]
Is it worth slowing down your code development to write tests? Won't it take you a long time to learn how to write tests? Won't it take longer if you have to write tests AND develop new features and fix bugs? Isn't it hard to write test code? To maintain test code? We will address these questions as we talk about how test code is crucial for our software. By way of illustration, we will show how it has played a vital role in making Blacklight a true community collaboration, as well as how it has positively impacted coding projects in the Stanford Libraries.
10:15-10:35 - Break
10:35-10:55 - Media, Blacklight, and Viewers Like You (pdf, 2.61MB) - Chris Beer [Video] [Page]
There are many shared problems (and solutions) for libraries and archives in the interest of helping the user. There are also many "new" developments in the archives world that the library communities have been working on for ages, including item-level cataloging, metadata standards, and asset management. Even with these similarities, media archives have additional issues that are less relevant to libraries: the choice of video players, large file sizes, proprietary file formats, challenges of time-based media, etc. In developing a web presence, many archives, including the WGBH Media Library and Archives, have created custom digital library applications to expose material online. In 2008, we began a prototyping phase for developing scholarly interfaces by creating a custom-written PHP front-end to our Fedora repository. In late 2009, we finally saw the (black)light, and after some initial experimentation, decided to build a new, public website to support our IMLS-funded /Vietnam: A Television History/ archive (as well as existing legacy content). In this session, we will share our experience of and challenges with customizing Blacklight as an archival interface, including work in rights management, how we integrated existing Ruby on Rails user-generated content plugins, and the development of media components to support a rich user experience.
10:55-11:15 - Becoming Truly Innovative: Migrating from Millennium to Koha - Ian Walls [Video] [Page]
On Sept. 1st, 2009, the NYU Health Sciences Libraries made the unprecedented move from their Millennium ILS to Koha. The migration was done over the course of 3 months, without assistance from either Innovative Interfaces, Inc. or any Koha vendor. The in-house script, written in Perl and XSLT, can be used with any Millennium installation, regardless of which modules have been purchased, and can be adapted to work for migration to systems other than Koha. Helper scripts were also developed to capture the current circulation state (checkouts, holds and fines), and do minor data cleanup. This presentation will cover the planning and scheduling of the migration, as well as an overview of the code that was written for it. Opportunities for systems integration and development made newly available by having an open source platform will also be discussed.
11:15-12:00 - Ask Anything! – Facilitated by Dan Chudnov [Video] [Page]
a.k.a. "Human Search Engine". A chance for you to ask a roomful of code4libbers anything that's on your mind: questions seeking answers (short or long), requests for things (hardware, software, skills, or help), or offers of things. We'll keep the pace fast, and the answers faster. Come with questions and line up at the start of the session and we'll go through as many as we can; sometimes we'll stop at finding the right person or people to answer a query and it'll be up to you to find each other after the session. First time at code4libcon! (Thanks to Ka-Ping Yee for the inspiration/explanation, reused here in part.)
12:00-13:00 - Lunch (provided)
13:00-13:20 - A Better Advanced Search - Naomi Dushay and Jessie Keck [Video] [Page]
Even though we'd love to get basic searches working so well that advanced search wouldn't be necessary, there will always be a small set of users that want it, and there will always be some library searching needs that basic searching can't serve. Our user interface designer was dissatisfied with many aspects of advanced search as currently available in most library discovery software; the form she designed was excellent but challenging to implement (see http://searchworks.stanford.edu/advanced). We'll share details of how we implemented Advanced Search in Blacklight.
13:20-13:40 - Drupal 7: A more powerful platform for building library applications - Cary Gordon, The Cherry Hill Company [Video] [Page]
The release of Drupal 7 brings with it a big increase in utility for this already very useful and well-accepted content management framework. Specifically, the addition of fields in core, the inclusion of RDFa, the use of the PHP_db abstraction layer, and the promotion of files to first class objects facilitate the development of richer applications directly in Drupal without the need to integrate external products.
13:40-14:00 - Enhancing Discoverability With Virtual Shelf Browse (3.65 MB ppt) - Andreas Orphanides, Cory Lown, and Emily Lynema [Video] [Page]
With collections turning digital, and libraries transforming into collaborative spaces, the physical shelf is disappearing. NCSU Libraries has implemented a virtual shelf browse tool, re-creating the benefits of physical browsing in an online environment and enabling users to explore digital and physical materials side by side. We hope that this is a first step towards enabling patrons familiar with Amazon and Netflix recommendations to "find more" in the library. We will provide an overview of the architecture of the front-end application, which uses Syndetics cover images to provide a "cover flow" view and allows the entire "shelf" to be browsed dynamically. We will describe what we learned while wrangling multiple jQuery plugins, manipulating an ever-growing (and ever-slower) DOM, and dealing with unpredictable response times of third-party services. The front-end application is supported by a web service that provides access to a shelf-ordered index of our catalog. We will discuss our strategy for extracting data from the catalog, processing it, and storing it to create a queryable shelf order index.
14:00-14:20 - How to Implement A Virtual Bookshelf With Solr - Naomi Dushay and Jessie Keck [Video] [Page]
Browsing bookshelves has long been a useful research technique as well as an activity many users enjoy. As larger and larger portions of our physical library materials migrate to offsite storage, having a browse-able virtual shelf organized by call number is a much-desired feature. I will talk about how we implemented nearby-on-shelf in Blacklight at Stanford, using Solr and SolrMarc.
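The idea of a "nearby on shelf" lookup can be sketched without Solr. The example below is not the Stanford implementation, which the abstract says uses Solr and SolrMarc; it swaps in a plain in-memory sorted list and Python's bisect module. It assumes each item already has a sortable shelf key derived from its call number, and the function name nearby_on_shelf and its parameters are hypothetical.

```python
from bisect import bisect_left


def nearby_on_shelf(shelf, shelf_key, before=2, after=2):
    """Return the items surrounding `shelf_key` in shelf order.

    `shelf` is a list of (shelf_key, title) tuples that is already sorted
    by shelf_key. The keys are assumed to be pre-normalized so that plain
    string comparison matches shelf order.
    """
    keys = [key for key, _ in shelf]
    # Position of the key, or where it would be inserted if absent.
    position = bisect_left(keys, shelf_key)
    start = max(0, position - before)
    end = min(len(shelf), position + after + 1)
    return shelf[start:end]
```

Because the lookup works on the insertion point, it also returns neighbors for a call number that is not itself in the collection, which is one way a browse view could start from an arbitrary position on the shelf.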
14:20-14:40 - Break
14:40-15:50 - Lightning Talks 2
15:50-17:00 - Breakout Sessions 2 - Sign up on the wiki
17:00-17:15 - Daily Wrap Up (include breakout reports?)

Thursday, February 25

08:00-09:00 - Breakfast
09:00-09:15 - Housekeeping
09:15-10:00 - Keynote #2: catfish, cthulhu, code, clouds and Levenshtein distance - Paul Jones [Video] [Page]
10:00-10:15 - Break
10:15-11:00 - Lightning Talks 3
11:00-11:20 - You Either Surf or You Fight: Integrating Library Services With Google Wave - Sean Hannan [Page]
So Google Wave is a new shiny web toy, but did you know that it's also a great platform for collaboration and research? (I bet you did.) ...And what platform for collaboration and research would be complete without some library tools to aid and abet that process? I will talk about how to take your library web services and integrate them with Google Wave to create bots that users can interact with to get at your resources as part of their social and collaborative work.
11:20-11:40 - library/mobile: Developing a Mobile Catalog - Kim Griggs [Video] [Page]
The increased use of mobile devices provides an untapped resource for delivering library resources to patrons. The mobile catalog is the next step for libraries in providing universal access to resources and information. This talk will share Oregon State University (OSU) Libraries' experience creating a custom mobile catalog. The discussion will first make the case for mobile catalogs, discuss the context of mobile search, and give an overview of vendor and custom mobile catalogs. The second half of the talk will look under the hood of OSU Libraries' custom mobile catalog to provide implementation strategies and discuss tools, techniques, requirements, and guidelines for creating an optimal mobile catalog experience that offers services that support time-critical and location-sensitive activities.
11:40-12:00 - Mobile Web App Design: Getting Started (8.5 MB ppt) - Michael Doran [Video] [Page]
Creating or adapting library web applications for mobile devices such as the iPhone, Android, and Palm Pre is not hard, but it does require learning some new tools, new techniques, and new approaches. From the Tao of mobile web app design to using mobile device SDKs for their emulators, this presentation will give you a jump-start on mobile cross-platform design, development, and testing. And all illustrated with a real-world mobile library web application.
12:00-12:15 - Wrap-Up
