
Under the Hood of Hadoop Processing at OCLC Research

Roy Tennant, OCLC Research

Apache Hadoop is widely used by Yahoo!, Google, and many others to process massive amounts of data quickly. OCLC Research uses a 40-node compute cluster with Hadoop and HBase to process the 300 million MARC records of WorldCat in various ways. This presentation will explain how Hadoop MapReduce works and illustrate it with specific examples and code. The role of the JobTracker in both monitoring and reporting on processes will be explained. String searching of WorldCat will also be demonstrated live.
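
For readers new to the model, here is a minimal sketch of the map, shuffle, and reduce phases in plain Python. It makes no Hadoop calls and is not the code from the talk. The record layout (a dict with a `language` key) and the function names are assumptions made up for illustration. On a real cluster the map and reduce steps would run in parallel across nodes.

```python
from collections import defaultdict


def map_record(record):
    """Map step: emit one (key, value) pair for each input record."""
    # Hypothetical record layout: a dict with a "language" key.
    yield record["language"], 1


def shuffle(pairs):
    """Shuffle step: group all emitted values under their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(key, values):
    """Reduce step: collapse the values for one key into a single result."""
    return key, sum(values)


def run_job(records):
    """Run the three phases in sequence on an in-memory list of records."""
    pairs = (pair for record in records for pair in map_record(record))
    grouped = shuffle(pairs)
    return dict(reduce_counts(key, values) for key, values in grouped.items())


sample = [{"language": "eng"}, {"language": "fre"}, {"language": "eng"}]
counts = run_job(sample)
print(counts)
```

The design point is that `map_record` looks at one record at a time and `reduce_counts` looks at one key at a time, which is what allows a framework to spread the work over many machines.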

Lucene's Latest (for Libraries)

Erik Hatcher, LucidWorks

Lucene powers the search capabilities of practically all library discovery platforms, by way of Solr, etc. The Lucene project evolves rapidly, and it's a full-time job to keep up with the ever-improving features and scalability. This talk will distill and showcase the most relevant(!) advancements to date.

All Tiled Up

Mike Graves, MIT Libraries

You've got maps. You even scanned and georeferenced them. Now what? Running a full GIS stack can be expensive, and overkill in some cases. The good news is that you have a lot more options now than you did just a few years ago. I'd like to present some lighter-weight solutions for making georeferenced images available on the Web.

This talk will provide an introduction to MBTiles. I'll go over what they are, how you create them, how you use them and why you would use them.
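
As a rough illustration of what an MBTiles file is: it is a SQLite database with a `metadata` table and a `tiles` table, and tile rows are stored in the TMS scheme, so the y coordinate is flipped relative to the XYZ scheme most web maps use. The sketch below uses only Python's standard `sqlite3` module. The helper names, the map name, and the placeholder tile bytes are assumptions for illustration, not material from the talk, and a real tileset would need more metadata than shown here.

```python
import sqlite3


def create_mbtiles(conn, name):
    """Create the two core MBTiles tables and a little metadata."""
    conn.execute("CREATE TABLE metadata (name TEXT, value TEXT)")
    conn.execute(
        "CREATE TABLE tiles (zoom_level INTEGER, tile_column INTEGER, "
        "tile_row INTEGER, tile_data BLOB)"
    )
    conn.execute(
        "CREATE UNIQUE INDEX tile_index "
        "ON tiles (zoom_level, tile_column, tile_row)"
    )
    conn.executemany(
        "INSERT INTO metadata VALUES (?, ?)",
        [("name", name), ("format", "png")],
    )


def xyz_to_tms_row(zoom, y):
    """Flip an XYZ row number into the TMS row number MBTiles stores."""
    return (2 ** zoom) - 1 - y


def put_tile(conn, zoom, x, y, data):
    """Store one tile, addressed by XYZ coordinates."""
    conn.execute(
        "INSERT INTO tiles VALUES (?, ?, ?, ?)",
        (zoom, x, xyz_to_tms_row(zoom, y), data),
    )


def get_tile(conn, zoom, x, y):
    """Fetch one tile by XYZ coordinates, or None if it is absent."""
    row = conn.execute(
        "SELECT tile_data FROM tiles "
        "WHERE zoom_level = ? AND tile_column = ? AND tile_row = ?",
        (zoom, x, xyz_to_tms_row(zoom, y)),
    ).fetchone()
    return row[0] if row else None


# An in-memory database stands in for a .mbtiles file on disk.
conn = sqlite3.connect(":memory:")
create_mbtiles(conn, "scanned-map")  # hypothetical tileset name
put_tile(conn, 2, 1, 0, b"fake-png-bytes")  # placeholder, not a real PNG
tile = get_tile(conn, 2, 1, 0)
```

Because the whole tileset is one file, serving it can be as simple as a small web handler that calls something like `get_tile` and returns the bytes, with no GIS server involved.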

2014 Conference Schedule

Schedule for the 2014 Code4Lib Conference in Raleigh, NC.