Feed aggregator

FOSS4Lib Recent Releases: ArchivesSpace - 1.2.0

planet code4lib - Mon, 2015-03-30 20:43

Last updated March 30, 2015. Created by Peter Murray on March 30, 2015.

Package: ArchivesSpace
Release Date: Monday, March 30, 2015

Nicole Engard: Bookmarks for March 30, 2015

planet code4lib - Mon, 2015-03-30 20:30

Today I found the following resources and bookmarked them on Delicious.

The post Bookmarks for March 30, 2015 appeared first on What I Learned Today....

Related posts:

  1. Encyclopaedia Britannica Goes — Gasp! — Wiki
  2. Can you say Kebberfegg 3 times fast
  3. Are you backing up?

John Miedema: Tags are the evil sisters of Categories. Surprising views, sour fast. Lila offers a different approach.

planet code4lib - Mon, 2015-03-30 20:26

I’m a classification nut, as I told you. In the last post I told you about the way I organize files and emails into folders. Scintillating stuff, I know. But let’s go a level deeper toward Lila by talking about tagging. Tags are the evil sisters of categories. Categories are top-down classification: someone on high has an idealized model of how everything fits into nice neat buckets. Tags are situational and bottom-up. In the heat of the moment, you decide that this file or that email is about some subject. Tags don’t conform to a model; you make them up on the fly. You add many tags, as many as you like. Mayhem! I’ve tried ‘em, I don’t like ‘em.

Tags do one thing very well, they let you create surprising views on your content. Categories suffer from the fact that they only provide one view, a hierarchical structured tree. Tags let you see the same content in many different ways. Oh! Look. There’s that short story I wrote tagged with “epic.” And there’s those awesome vacation pics tagged with the same. Hey, I could put those photos on that story and make it so much better. But the juice you get out of tags sours fast. The fact that they are situational and bottom-up causes their meaning to change. “Bad” and “sick” used to mean negative things. As soon as people get about a hundred tags they start refactoring them, merging and splitting them, using punctuation like underscores to give certain tags special meanings. Pretty soon they dump the whole lot of them and start over. Tags fail. What people really want is, yup, categories.

Lila is a new way to get the juice out of tags without going sour. Lila works collaboratively with the author to organize writing. Lila will let writers assign categories and tags, but treat them as mere suggestions. The human is smart, Lila knows, and needs his or her help, so it will use the author’s suggestions to come up with its own set of categories and tags. Lila’s technique will be based on natural language processing. Best part, the tags can also be regenerated at the click of a button, so that the tags never sour. You get the surprising views and the tags maintain their freshness. Sweet.

I’ve been pretty down on tags in this post, so I will say there is one more thing that tags do quite well. They connect people, like hashtags on Twitter. They form loose groupings of content so that disparate folks can find each other. It doesn’t apply so much to a solitary writing process, but it might fit a social writing process. I will think on that.

FOSS4Lib Recent Releases: Sufia - 6.0.0

planet code4lib - Mon, 2015-03-30 20:03

Last updated March 30, 2015. Created by Peter Murray on March 30, 2015.

Package: Sufia
Release Date: Friday, March 27, 2015

Roy Tennant: Want To See More Women in Tech? Mentor Someone

planet code4lib - Mon, 2015-03-30 16:50

I was not much more than a newly-minted librarian when my greatest professional mentor gave me a chance at something that would launch my career beyond the confines of my institution onto an international stage. It was in the early 90s, when the Internet was just beginning to take off at large research libraries around the United States. If you can, and I know it’s difficult, imagine libraries without the Internet. Imagine society without the Internet.

Anne Lipow was entrusted with developing and delivering technology and bibliographic classes to both staff and faculty at UC Berkeley, and she took the responsibility very seriously. She would prowl the halls of Doe Library looking for young turks like myself, to pull us in to developing and delivering courses on how to connect to the newly-online library catalog or how to use this new thing called Gopher. I almost started ducking into doorways when I spotted her coming down the hall. And now I’m really glad I was more stupid than cowardly. Because I could never have predicted what would come next.

Anne retired from Berkeley and started her own consultancy: Library Solutions Institute. She began planning her very first event — an all-day hands-on workshop on how to use the Internet timed to coincide with the ALA Annual Conference to be held in San Francisco in June 1992. She signed me up to help, as well as John Ober. Clifford Lynch agreed to be the ending keynote of one group and the beginning keynote of another, thereby allowing us to sign up two cohorts over two days.

We began work on a set of handouts that soon led to a binder to hold them all and the dawning realization that we had a book on our hands. Anne changed the name of her business to Library Solutions Institute and Press and we were off to the races. Crossing the Internet Threshold: An Instructional Handbook was published later that year and it took off, and my speaking career took off with it. Before long I was traveling to foreign countries such as Romania and Hungary, giving workshops based on that text. Between the royalties and speaking fees, my wife and I were able to financially weather the impact of twins born in February 1993. Without it, I shudder to think.

So you will not find a stronger advocate for mentorship than me. That’s why I have tried to focus on finding young female professionals interested in library technology to mentor, so as a profession we can increase the number of women in tech librarianship. I know that a diversity of perspectives, skills, and abilities is by its very nature a good thing. And the more of us out there increasing diversity of all kinds in tech librarianship, the better off the entire profession will be.

Anne, I miss you. But your example and inspiration are alive and well.


Islandora: Islandora Foundation: Meet the Partners

planet code4lib - Mon, 2015-03-30 14:45

The Islandora Foundation has been very fortunate to welcome six new Partner-level members in the past few months, due in large part to enthusiasm over our ongoing Fedora 4/Islandora 7.x upgration project. I'd like to take some time in this week's blog to highlight those new members, and all of the Foundation supporters who have helped us to get where we are today. Which is a pretty good spot for a non-profit of less than two years: we are on the verge of our third community-led release, upgrading to support the latest in Fedora, holding Camps all over the world, and planning our first conference.

So let's take a look at the Partners who are helping the Islandora Foundation to thrive:

When we launched in July 2013, it was with the backing of two initial partners who have always been a part of Islandora's story: UPEI and discoverygarden, Inc. Islandora was born at UPEI under the guidance of University Librarian Mark Leggott, who continues on as the current Chairman of our Board of Directors.

Discoverygarden, in this context, is sort of like the Foundation's older sibling who went to work in the private sector. By providing services to install and customize an open source software platform and donating many of their developments right back for public use, it developed alongside Islandora while making huge contributions to the codebase, and developers at dgi continue to produce and refine a lot of the core functions that make Islandora work.

The next institution to step up to the plate as a Partner is LYRASIS. One of the Foundation's first Collaborator members, LYRASIS is a non-profit membership organization committed to the success of libraries and cultural heritage organizations. It partners with members to create, access, and manage information with an emphasis on digital content. In the Islandora community, LYRASIS has had an active presence on the Roadmap Committee, the Board of Directors, a number of Interest Groups, and most Islandora Camps (to the point where we feel a little bereft when there's no one from LYRASIS in attendance).

When they renewed their membership this year, LYRASIS decided to bump up to Partner to help support the upgration. This is a common theme with our new Partners. Fedora 4 is a great step forward and it is awesome how many in the community have committed to seeing it happen.

Their Assistant Director for Digital Technology Services, Peter Murray, was already a member of the Islandora Foundation Board at our request, and will be continuing in this role.

The University of Manitoba is a public university in Winnipeg, Manitoba. A founding Member in the Islandora Foundation, they also bumped up to Partner to help the Fedora 4 upgration. Their Web Application Developer, Jared Whiklo, has been an active participant on the front lines of the project, working with Nick and Danny to get the prototype off the ground. The library's Head of Discovery & Delivery Services, Lisa O'Hara, is joining the Foundation's Board of Directors.

McMaster University is another long-time Collaborator in the Foundation. Like LYRASIS and the University of Manitoba, their new Partnership helps to support the future of Islandora with Fedora 4. McMaster is already a big help to the community through its members' participation in Interest Groups and the Roadmap Committee, and we are looking forward to having it represented on our Board of Directors by Dale Askey.

York University makes the move to Partner from Collaborator through their very generous in-kind donation of a resource that has proven absolutely vital to the Fedora 4 upgration: Nick Ruest's time. They are also piloting a migration from Fedora 3 to 4 that will most likely serve as the framework on which the entire community can base such migrations in the future. 

Adam Taves, Acting Associate University Librarian for Collections & Research, will be joining the Board of Directors on York's behalf.


Simon Fraser University is one of two new Partners who are joining the Foundation for the first time this year - although it has long been an active contributor to the Islandora community through the efforts of members like Mark Jordan and Alex Garnett. Indeed, Mark Jordan was already a member of the Islandora Foundation Board of Directors and will be staying on with us.

We were very fortunate to be able to first announce SFU's Partnership right on their downtown campus at Islandora Camp BC last February.


Our very newest Partner is not quite our first European member (that distinction belongs to the digiBESS group in Italy), but it is our first European Partner. The University of Limerick joins us in part to support the Fedora 4 upgration and will be represented on the Board by Caleb Derven. And maybe, if we are very lucky, we'll get to invite you all to an Islandora Camp in their fair city. Because this:

Photo by: William Murphy

ACRL TechConnect: A Video on Browser Extensions

planet code4lib - Mon, 2015-03-30 13:00

I thought we’d try something new on ACRL TechConnect, so I recorded a fifteen-minute video discussing general use cases for browser extensions and some specifics of Google Chrome extensions.

The video mentions my WikipeDPLA post on this blog and walks through some slides I presented at a Code4Lib Northern California event.

If you’re looking for another good extension example in libraryland, Stephen Schor of New York Public Library recently wrote extensions for Chrome and Firefox that improve the appearance and utility of the Library of Congress’ EAD documentation. The Chrome extension uses the same content script approach as my silly example in the video. It’s a good demonstration of how you can customize a site you don’t control using the power of browser add-ons.

Have you found a use case for browser add-ons at your library? Let us know in the comments!

Mark E. Phillips: Metadata Edit Events: Part 3 – What

planet code4lib - Mon, 2015-03-30 02:16

This is the third post in a series related to metadata event data that we collected from January 1, 2014 to December 31, 2014 for the UNT Libraries Digital Collections.  We collected 94,222 metadata editing events during this time.

The first post was about the when of the events: when they occurred, on what day of the week, and at what time of day.

The second post touched on the who of the events: who the main metadata editors were, how edits were distributed among the different users, and how the number of users per month, day, and hour was distributed.

This post will look at the what of the events data.  What were the records that were touched,  what collections or partners did they belong to and so on.

Of the total 94,222 edit events there were 68,758 unique metadata records edited.

By using the helpful st program we can quickly get the statistics for these 68,758 unique metadata records.  By choosing the “complete” stats we get the following data.

N       min  q1  median  q3  max  sum     mean     stddev    stderr
68,758  1    1   1       1   45   94,222  1.37034  0.913541  0.0034839

With this we can see that there is a mean of 1.37 edits per record over the entire dataset with the maximum number of edits for a record being 45.

The total distribution of the number of edits per record is presented in the table below.

Number of Edits  Instances
1                53,213
2                9,937
3                3,519
4                1,089
5                489
6                257
7                111
8                60
9                30
10               13
11               14
12               7
13               5
14               5
15               1
16               2
17               1
19               1
21               1
26               1
30               1
45               1

Of the 68,758 records edited, 53,213 (77%) were edited only once; records with two and three edits account for 9,937 (14%) and 3,519 (5%) respectively. From there things level out very quickly to under 1% of the records.
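As a rough cross-check, the record count, event total, mean and percentages quoted above can be recomputed from the edits-per-record table. This is only a minimal Python sketch, not the st program used for the post; the variable and function names are illustrative.

```python
# Edits-per-record distribution copied from the table above:
# {number of edits: number of records that received that many edits}
distribution = {
    1: 53213, 2: 9937, 3: 3519, 4: 1089, 5: 489, 6: 257, 7: 111, 8: 60,
    9: 30, 10: 13, 11: 14, 12: 7, 13: 5, 14: 5, 15: 1, 16: 2, 17: 1,
    19: 1, 21: 1, 26: 1, 30: 1, 45: 1,
}

# Unique records edited and total edit events implied by the table.
records = sum(distribution.values())
events = sum(edits * count for edits, count in distribution.items())
mean_edits = events / records


def share(edits):
    """Percentage of edited records that received exactly `edits` edits."""
    return 100 * distribution[edits] / records


print(records, events, round(mean_edits, 5))
for edits in (1, 2, 3):
    print(f"{edits} edit(s): {share(edits):.0f}% of records")
```

Working from the distribution rather than the raw events keeps the check small; the same totals should fall out of the full event log.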

When indexing these edit events in Solr I also merged the events with additional metadata from the records.  By doing so we have a few more facets to take a look at, specifically how the edit events are distributed over partner, collection, resource type and format.
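For readers who have not used Solr facets, the kind of request involved might look like the sketch below. The core URL and the field names are assumptions made for illustration (the post does not give the real schema); only the standard Solr query parameters (q, rows, wt, facet, facet.field, facet.mincount) are taken as given.

```python
from urllib.parse import urlencode

# Hypothetical Solr core URL and field names; the real index layout is not
# described in the post.
SOLR_SELECT_URL = "http://localhost:8983/solr/edit_events/select"
FACET_FIELDS = ["partner", "collection", "resource_type", "format"]


def facet_query_url(base_url=SOLR_SELECT_URL, fields=FACET_FIELDS):
    """Build a URL that asks Solr for facet counts only (no documents)."""
    params = [
        ("q", "*:*"),             # match every indexed edit event
        ("rows", "0"),            # return counts only, not the documents
        ("wt", "json"),
        ("facet", "true"),
        ("facet.mincount", "1"),  # omit facet values with no events
    ]
    # One facet.field parameter per field we want counts for.
    params += [("facet.field", field) for field in fields]
    return base_url + "?" + urlencode(params)


print(facet_query_url())
```

A single request of this shape returns the event counts per partner, collection, resource type and format in one response.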


There are 167 partner institutions represented in the edit event dataset.

The top ten partners by the number of edit events are presented in the table below.

Partner Code  Partner Name                                                    Edit Count  Unique Records Edited  Unique Collections
UNTGD         UNT Libraries Gov Docs Department                               21,932      14,096                 27
OKHS          Oklahoma Historical Society                                     10,377      8,801                  34
UNTA          UNT Libraries Special Collections                               9,481       6,027                  25
UNT           UNT Libraries                                                   7,102       5,274                  27
PCJB          Private Collection of Jim Bell                                  5,504       5,322                  1
HMRC          Houston Metropolitan Research Center at Houston Public Library  5,396       2,125                  5
HPUL          Howard Payne University Library                                 4,531       4,518                  4
UNTCVA        UNT College of Visual Arts and Design                           4,296       3,464                  5
HSUL          Hardin-Simmons University Library                               2,765       2,593                  6
HIGPL         Higgins Public Library                                          1,935       1,130                  3

In addition to the number of edit events,  I have added a column for the number of unique records for each of the institutions.  The same data is presented in the graph below.

Graph showing the edit event count and unique record count for each of the institutions with the most edit events

The larger the difference between the Edit Count and the Unique Records Edited, the more repeated edits of the same records were made by that partner.
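One way to make that difference comparable across partners of very different sizes is to divide edit events by unique records. The sketch below does this for the ten partners in the table above; the dictionary layout and names are just an illustration, not code from the original analysis.

```python
# (edit count, unique records edited) per partner code, from the table above.
partners = {
    "UNTGD": (21932, 14096),
    "OKHS": (10377, 8801),
    "UNTA": (9481, 6027),
    "UNT": (7102, 5274),
    "PCJB": (5504, 5322),
    "HMRC": (5396, 2125),
    "HPUL": (4531, 4518),
    "UNTCVA": (4296, 3464),
    "HSUL": (2765, 2593),
    "HIGPL": (1935, 1130),
}

# Average number of edit events per edited record for each partner.
ratios = {code: edits / records for code, (edits, records) in partners.items()}

# List partners from most to least repeated editing.
for code, ratio in sorted(ratios.items(), key=lambda item: item[1], reverse=True):
    print(f"{code:7s} {ratio:.2f}")
```

A ratio near 1 means records were typically touched once; higher values point to records that were revisited.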

The final column in the table above shows the number of different collections that were edited that belong to each specific partner. Taking UNTGD as an example, there are 27 different collections that held records that were edited during the year.

Collection Code  Collection Name                                         Edit Events  Records Edited
TLRA             Texas Laws and Resolutions Archive                      8,629        5,187
TXPT             Texas Patents                                           7,394        4,636
TXSAOR           Texas State Auditor’s Office: Reports                   2,724        1,223
USCMC            United States Census Map Collection                     1,779        1,695
USTOPO           USGS Topographic Map Collection                         490          458
TRAIL            Technical Report Archive and Image Library              287          279
CRSR             Congressional Research Service Reports                  271          270
FCCRD            Federal Communications Commission Record                211          208
NACA             National Advisory Committee for Aeronautics Collection  62           62
WWPC             World War Poster Collection                             49           49
WWI              World War One Collection                                41           41
USDAFB           USDA Farmers’ Bulletins                                 21           19
ATOZ             Government Documents A to Z Digitization Project        19           18
WWII             World War Two Collection                                19           19
ACIR             Advisory Commission on Intergovernmental Relations      14           13
NMAP             World War Two Newsmaps                                  12           12
TR               Texas Register                                          12           8
TXPUB            Texas State Publications                                12           12
GAORT            Government Accountability Office Reports                10           10
BRAC             Defense Base Closure and Realignment Commission         4            4
OTA              Office of Technology Assessment                         4            4
GDCC             CyberCemetery                                           2            2
FEDER            Federal Communications Commission Record                1            1
GSLTX            General and Special Laws of Texas                       1            1
TXHRJ            Texas House of Representatives Journals                 1            1
TXSS             Texas Soil Surveys                                      1            1
UNTGOV           Government Documents General Collection                 1            1

This is a set of data that is a bit easier to see with a simple graph. I’ve plotted the ratio of edit events to records for each collection as a simple line graph.

UNT Government Documents Edits to Record Ratios for each collection.

You can look at the graph above and quickly see which of the collections have had a higher edit-to-record ratio, with the Texas State Auditor’s Office: Reports having the most edits per record, at a ratio of over 2 edits per record for that collection. Many of the other collections are much closer to 1, where there would be one edit per record.


The edit events occur in 266 different collections in the UNT Libraries’ Digital Collections.  As with the 167 partners above,  that is too many to stick into a table so I’m going to just list the top ten of them for us in the table below.

Collection Code  Collection Name                                     Edit Events  Unique Records
TLRA             Texas Laws and Resolutions Archive                  8,629        5,187
ABCM             Abilene Library Consortium                          8,481        8,060
TDNP             Texas Digital Newspaper Program                     7,618        6,305
TXPT             Texas Patents                                       7,394        4,636
OKPCP            Oklahoma Publishing Company Photography Collection  5,799        4,729
JBPC             Jim Bell Texas Architecture Photograph Collection   5,504        5,322
TCO              Texas Cultures Online                               5,490        2,208
JJHP             John J. Herrera Papers                              5,194        1,996
UNTETD           UNT Theses and Dissertations                        4,981        3,704
UNTPC            University Photography Collection                   4,509        3,232

Again plotting the ratio of edit events to the number of unique records gives us the graph below.

Edit Events to Record Ratio grouped by Collection

You can quickly see the two collections that averaged over two edit events for each of the records that were edited during the last year,  meaning if a record was edited,  most likely it was edited at least two times.  Other collections like the Jim Bell Photography Collection or the Abilene Library Consortium Collection appear to have only been edited one time per record on average,  so when the edit was complete, it wasn’t revisited for additional editing.
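The "two collections" observation can be checked directly from the top-ten table rather than read off the graph. The snippet below is a small illustrative sketch; the threshold constant and variable names are my own.

```python
# (edit events, unique records) per collection code, from the top-ten table.
collections = {
    "TLRA": (8629, 5187),
    "ABCM": (8481, 8060),
    "TDNP": (7618, 6305),
    "TXPT": (7394, 4636),
    "OKPCP": (5799, 4729),
    "JBPC": (5504, 5322),
    "TCO": (5490, 2208),
    "JJHP": (5194, 1996),
    "UNTETD": (4981, 3704),
    "UNTPC": (4509, 3232),
}

THRESHOLD = 2.0  # average edit events per edited record

# Collections whose edited records were revisited more than twice on average.
heavily_revisited = sorted(
    code
    for code, (events, records) in collections.items()
    if events / records > THRESHOLD
)

print(heavily_revisited)
```

With these figures the filter picks out Texas Cultures Online (TCO) and the John J. Herrera Papers (JJHP), while JBPC and ABCM sit close to one edit per record.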

Resource Type

The UNT Libraries makes use of a locally controlled vocabulary for its resource types. You can view all of the available resource types here.

If you group the edit events and associated records by the resource type you will get the following table.

Resource Type       Edit Events  Unique Records
image_photo         31,702       24,384
text_newspaper      11,598       10,176
text_leg            8,633        5,191
text_patent         7,480        4,667
physical-object     5,591        4,921
text_etd            4,986        3,709
text                4,311        2,511
text_letter         4,276        2,136
image_map           3,542        3,160
text_report         3,375        1,822
image_artwork       1,217        1,042
text_article        1,060        758
video               931          461
sound               719          694
text_legal          687          341
text_journal        549          288
text_book           476          422
image_presentation  430          313
image_postcard      429          180
image_poster        427          321
text_paper          423          312
text_pamphlet       303          199
text_clipping       275          149
text_yearbook       91           66
dataset             54           19
image_score         49           37
collection          41           34
image               34           20
website             22           20
text_chapter        17           14
text_review         13           11
text_poem           3            1
specimen            1            1

By calculating the edit-event-to-record ratio and plotting that you get the following graph.

Edit Events to Record Ratio grouped by Resource Type.

In the graph above I presented the data in the same order as it appears in the table just above the chart.  You can see that the highest ratio is for our text_poem record that was edited three different times.  Other notably high ratios are for postcards and datasets though there are several others that are at or close to 2 to 1 ratio of edits to records.


The final way we are going to look at the “what” data is by Format. Again the UNT Libraries uses a controlled vocabulary for the format which you can look at here. I’ve once again faceted on the format field and presented the total number of edit events and then unique records for each of the five format types that we have in the system.

Format   Edit Events  Unique Records
text     48,580       32,770
image    43,477       34,436
video    931          461
audio    720          695
website  22           20

Converting the ratio of events-to-records into a bar graph results in the graph below.

Edit Events to Record Ratio grouped by Format

It looks like we edit video files more times per record than any of the other types with text and then image coming in behind.


There are almost endless combinations of collections, partners, resource types, and formats that can be put together, and the data deserves some further analysis to see if there are patterns present that we should pay attention to. But that’s more for another day.

This is the third in a series of posts related to metadata edit events in the UNT Libraries’ Digital Collections. Check back for the next installment.

As always feel free to contact me via Twitter if you have questions or comments.

DuraSpace News: TOMORROW: Washington D.C. Fedora User Group Meeting, March 31 - April 1

planet code4lib - Mon, 2015-03-30 00:00

Washington, DC  The Washington D.C. Fedora User Group Meeting will get underway tomorrow, Mar. 31 at the USDA National Agricultural Library. Day one presentations include updates on DuraSpace and Fedora 4, Fedora at the National Agricultural Library, Fedora at the University of Maryland Libraries, an Islandora Update and Specifying the Fedora API, and Short Presentations and a Project Roundtable. View the agenda here.

Mita Williams: The Setup

planet code4lib - Sun, 2015-03-29 21:33
For this post, I’m going to pretend that the editors of the blog, The Setup (“a collection of nerdy interviews asking people from all walks of life what they use to get the job done”) asked me for a contribution. But in reality, I’m just following Bill Denton’s lead.

It feels a little self-indulgent to write about one’s technology purchases so before I describe my set up, let me explain why I’m sharing this information.

Some time back, in preparation for a session I was giving on Zotero for my university’s annual technology conference, I realized that before going into how to use Zotero, I had to address the reasons why. I recognized that I was asking students and faculty who were likely already time-strapped and overburdened to abandon long-standing practices that were already successfully working for them if they were going to switch to Zotero for their research work.

Before my presentation, I asked on Twitter when and why faculty would change their research practices.  Most of the answers were on the cynical side but there were some that gave me some room to maneuver, namely this one: “when I start a new project.”  And there’s a certain logic to this approach. If you were starting graduate school and know that you have to prepare for comps and generate a thesis at the end of the process, wouldn’t you want to conscientiously design your workflow at the start to capture what you learn in such a way that it’s searchable and reusable?

My own sabbatical is over and oddly enough, it is now, at the end of my sabbatical, that I feel the most like I’m starting all over again in my professional work. So I’m using that New Project feeling to fuel some self-reflection in my own research process, bring some mindfulness to my online habits, and bring deliberate design into My Setup.

There’s another reason why I’m thinking about the deliberate design of research practice. As libraries start venturing into the space of research service consultation, I believe that librarians need to follow best practices for ourselves if we hope to develop expertise in this area.

As well, I think we need to be more conscious of how and when our practices are not in line with our values. It’s simply not possible to live completely without hypocrisy in this complicated world but that doesn’t mean we can’t strive for praxis. It’s difficult for me to take seriously accusations that hackerspaces are neoliberal when it’s being stated by a person cradling a Macbook or iPhone. That being said, I greatly rely on products from Microsoft, Amazon, and Google so I'm in no position to cast stones.

I just want to care about the infrastructures we’re building….

And with that, here’s my setup!


There are three computers that I spend my time on: the family computer in the kitchen (a Dell desktop running Windows 7), my work computer (another Dell desktop running Windows 7), and my Thinkpad X1 Carbon laptop which I got earlier this year.  Grub turned my laptop into a dual boot machine that I can switch between Ubuntu and Windows 7. I feel I need a Windows environment so I can run any ESRI products and all those other Mac/Windows only products if need be.

I have a Nexus 4 Android phone made by LG and a Kindle DX as my ebook reader. I don’t own a tablet or an mp3 player.

World Backup Day is March 31st. I need to get myself an external drive for backups (Todo1).


After getting my laptop, the first thing I did was investigate password managers to find which one would work best for me. I ended up choosing LastPass and I felt the benefits immediately. Using a password manager has saved me so much pain and aggravation, and my passwords are now (almost) all unique. Next, I need to set up two-factor authentication for the services that I haven’t gotten around to yet (Todo2).

With work being done on three computers, it’s not surprising that I have a tendency to work online. My browser of choice is Mozilla but I will flip to Chrome from time to time. I use the sync functionality on both so my bookmarks are automatically updated and the same across devices. I use SublimeText for my text editor for code, GIMP as my graphics editor, and QGIS for my geospatial needs.

This draft, along with much of my other writing and presentations, is on Google Drive. I spend much of my time in Gmail and Google Calendar. While years ago I downloaded all my email using Mozilla Thunderbird, I have not set up a regular backup strategy for these documents (Todo3). I’ve toyed with using Dropbox to back up Drive but think I’m better off with an external drive. I have a Dropbox account because people occasionally share documents with me through it but at the moment, I only use it to back up my kids’ Minecraft games.

From 2007 to 2013, I used delicious to capture and share the things I read online. Then delicious tried to be the new Pinterest and made itself unusable (although it has since reverted back to close to its original form) and so I switched to Evernote (somewhat reluctantly because I missed the public aspect of sharing bookmarks).   I’ve grown to be quite dependent on Evernote to save my outboard brain. I use IFTTT to post the links from my Twitter faves to delicious which are then imported automatically into Evernote.  I also use IFTTT to automatically backup my Tumblr posts to Evernote, my Foursquare check-ins saved to Evernote (and Google Calendar) and my Feedly saved posts to Evernote. Have I established a system to back up my Evernote notes on a regular basis? No, no I have not (Todo4).

The overarching idea that I have come up with is that the things I write are backed up on my Google Drive account and the library of things that I have read or saved for future reading (ha!) are saved on Evernote. To this end, I use IFTTT to save my Tweets to a Google Spreadsheet and my Blogger and WordPress posts are automatically saved to Google Drive (still a work in progress. Todo 5). My ISP is Dreamhost but I am tempted to jump ship to Digital Ocean.

My goal is to have at least one backup for the things I’ve created. So I use IFTTT to save my Instagram posts to Flickr. My Flickr posts are just a small subset of all the photos that are automatically captured and saved on Google Photos.  No, I have not backed up these photos  (Todo 6) but I have, since 2005, printed the best of my photos on an annual basis into beautiful softcover books using QOOP and then later, through Blurb.  My Facebook photos and status updates from 2006 to 2013 have been printed in a lovely hardcover book using MySocialBook.  One day I would like to print a book of the best of my blogged writings using Blurb, if just as a personal artifact.

Speaking of books, because I’m one of the proud and the few to own a KindleDX, I use it to read PDFs and most of my non-fiction reading. When I stumble upon a longread on the web, I use Readability’s Send to Kindle function so I can read it later without eyestrain. I’m inclined to buy the books that I used in my writing and research as Kindle ebooks because I can easily attach highlighted passages from these books to my Zotero account. My ebooks are backed up in my calibre library. I also use Goodreads to keep track of my reading because I love knowing what my friends are into.

I subscribe to Rdio and for those times that I actually spend money on owning music, I try to use Bandcamp. I’m an avid listener of podcasts and for this purpose use BeyondPod. Our Sonos system allows us to play music from all these services, as well as TuneIn, in the living room. The music that I used to listen to on CD is now sitting on an unused computer running Windows XP and I know if I don’t get my act together and transfer those files to an external drive soon those files will be gone for good... if they haven’t already become inaccessible (*gulp*) (Todo 8).

For my “Todo list” I use Google Keep, which also captures my stray thoughts when I’m away from paper or my computer. Google Keep has an awesome feature that will trigger reminders based on your location.

So that’s My Setup. Let me know if you have any suggestions or can see some weaknesses in my workflow. Also, I’d love to learn from your Setup.

And please please please call me out if I don’t have a sequel to this post called The Backup by the time of next year's World Backup Day.

Nicole Engard: Bookmarks for March 29, 2015

planet code4lib - Sun, 2015-03-29 20:30

Today I found the following resources and bookmarked them on Delicious.

The post Bookmarks for March 29, 2015 appeared first on What I Learned Today....

Related posts:

  1. No more Delicious?
  2. Can you say Kebberfegg 3 times fast
  3. Are you backing up?

John Miedema: I’m a bit of a classification nut. It comes from my Dutch heritage. How do you organize files and emails into folders?

planet code4lib - Sat, 2015-03-28 18:05

I’m a bit of a classification nut. It comes from my Dutch heritage — those Dutchies are always trying to be efficient with their tiny bits of land. It’s why I’m drawn to library science too. I think a lot about the way I organize computer files and emails into folders. It provides insight into the way all classification works, and of course ties into my Lila project. I’d really like to hear about your own practices. Here’s mine:

  1. Start with a root folder. When an activity starts, I put a bunch of files into a root folder (e.g., a Windows directory or a Gmail label).
  2. Sort files by subject or date. As the files start to pile up in a folder, I find stuff by sorting files by subject or date using application sorting functions (e.g., Windows Explorer).
  3. Group files into folders by subject. When there are a lot of files in a folder, I group files into different folders. The subject classification is low level, e.g., Activity 1, Activity 2. Activities that have expired are usually grouped together into an ‘archive’ folder.
  4. Develop a model. Over time the folder and file structure can get complex, making it hard to find stuff. I often resort to search tools. What helps is developing a model that reflects my work, e.g., Client 1, Client 2. Different levels correspond to my workflow, e.g., 1. Discovery, 2. Scoping, 3. Estimation, etc. The model is really a taxonomy, an information architecture. I can use the same pattern for each new activity.
  5. Classification always requires tinkering. I’ve been slowly improving the way I organize files into folders for as long as I’ve been working. Some patterns get reused over time, others get improved. Tinkering never ends.
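The "develop a model" step above can be sketched as a small script that stamps out the same folder pattern for each new activity. This is only an illustration under stated assumptions: the client and stage names are the examples from the list, and the function name, the per-client ‘archive’ folder, and the use of a temporary directory are hypothetical choices, not my actual setup.

```python
from pathlib import Path
import tempfile

# Example names taken from the list above; replace with your own taxonomy.
CLIENTS = ["Client 1", "Client 2"]
WORKFLOW_STAGES = ["1. Discovery", "2. Scoping", "3. Estimation"]


def build_folder_model(root: Path) -> list[Path]:
    """Create root/<client>/<stage> folders, plus an 'archive' folder per client.

    Putting 'archive' under each client is an assumption made for this sketch.
    """
    created = []
    for client in CLIENTS:
        for stage in WORKFLOW_STAGES + ["archive"]:
            folder = root / client / stage
            # exist_ok=True makes the script safe to re-run while tinkering.
            folder.mkdir(parents=True, exist_ok=True)
            created.append(folder)
    return created


if __name__ == "__main__":
    # Build the tree in a throwaway directory and list what was created.
    with tempfile.TemporaryDirectory() as tmp:
        for folder in build_folder_model(Path(tmp)):
            print(folder.relative_to(tmp))
```

Because the model lives in two short lists, tinkering with the taxonomy means editing the lists and re-running, rather than renaming folders by hand.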

(I will discuss the use of tagging later. Frankly, I find manual tagging hopeless.)

Mark E. Phillips: Metadata Edit Events: Part 2 – Who

planet code4lib - Sat, 2015-03-28 15:53

In the previous post I started to explore the metadata edit events dataset generated from 94,222 edit events from 2014 for the UNT Libraries’ Digital Collections.  I focused on some of the information about when these edits were performed.

This post focuses on the “who” of the dataset.

Altogether, 193 unique users edited metadata in one of the systems that comprise the UNT Libraries’ Digital Collections. This includes The Portal to Texas History, UNT Digital Library, and the Gateway to Oklahoma History.

The top ten most frequent editors of metadata in the system are responsible for 57% of the overall edits.

Username       Edit Events
htarver        15,451
aseitsinger    10,105
twarner         4,655
mjohnston       4,143
atraxinger      3,905
cwilliams       3,490
sfisher         3,466
thuang          3,327
mphillips       2,669
sdillard        2,518
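The 57% figure can be checked directly from the counts above. The following is a minimal sketch in Python, assuming only the ten counts listed and the 94,222 total for the dataset; the variable names are illustrative and not taken from the original analysis.

```python
# Edit counts for the ten most frequent editors, copied from the table above.
top_ten_edits = {
    "htarver": 15451,
    "aseitsinger": 10105,
    "twarner": 4655,
    "mjohnston": 4143,
    "atraxinger": 3905,
    "cwilliams": 3490,
    "sfisher": 3466,
    "thuang": 3327,
    "mphillips": 2669,
    "sdillard": 2518,
}

TOTAL_EDIT_EVENTS = 94222  # all edit events in the 2014 dataset

# Share of all edits made by the top ten editors.
top_ten_total = sum(top_ten_edits.values())
share = top_ten_total / TOTAL_EDIT_EVENTS

print(f"Top ten editors: {top_ten_total:,} edits ({share:.0%} of all edits)")
```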

The overall distribution of edits per user looks like this.

Distribution of edits per user for the Edit Event Dataset

As you can see it shows the primary users of the system and then very quickly tapers down to the “long tail” of users who have a lower number of edit events.

Here is a quick look at the total number of unique users active on each day of the week across the entire dataset.

Sun   Mon   Tue   Wed   Thu   Fri   Sat
40    95    122   122   123   97    39

There is a swell for Tue, Wed, and Thu in the table above. The counts fall into three consistent bands: 39-40 unique users on Sat and Sun, 95-97 on Mon and Fri, and 122-123 on Tue through Thu.

Looking at how unique users were spread across the year, grouped into months, we got the following table and graph.

Month        Unique Users
January      54
February     73
March        64
April        61
May          44
June         40
July         48
August       50
September    50
October      84
November     49
December     36

Unique Editors Per Month

There were some spikes throughout the year, most likely related to a metadata class in the UNT College of Information that uses the Edit system as part of its teaching. This explains the October and February spikes in the number of unique users. Other than that, we consistently have 40 or more unique users per month, with a small dip for the December holiday season when school is not in session.

In the previous post we had a heatmap with the number of edit events distributed over the hours of the day and the days of the week.  I’ve included that graph below.

94,222 edit events plotted to the time and day they were performed

I was curious to see how the unique number of editors mapped to this same type of graph,  so that is included below.

Unique editors distribution across day of the week and hour of the day.

User Status

Of the 193 unique metadata editors in the dataset, 135 (70%) of the users were classified as Non-UNT-Employee and  58 (30%) were classified as UNT-Employee. For the edit events themselves, 75,968 (81%) were completed by users classified with a status of UNT-Employee  and 18,254 (19%) by users classified with the status of Non-UNT-Employee.

User Rank

Rank         Edit Events   % of Total Edits (n=94,222)   Unique Users   % of Total Users (n=193)
Librarian    22,466        24%                           16             8%
Staff        12,837        14%                           13             7%
Student      41,800        44%                           92             48%
Unknown      17,119        18%                           72             37%

You can see that 44% of all of the edits in the dataset were completed by users who were students. Librarians and Staff members together accounted for roughly 37% of the edits (35,303 of 94,222), or 38% if the rounded per-rank percentages are simply added.
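For anyone who wants to re-derive the percentages in the User Rank table, here is a small Python sketch. It assumes only the edit and user counts from that table; the dictionary layout and variable names are illustrative. It also computes the combined Librarian and Staff share from the raw counts, which comes out slightly below the sum of the two rounded percentages.

```python
# (edit events, unique users) per rank, copied from the User Rank table.
ranks = {
    "Librarian": (22466, 16),
    "Staff": (12837, 13),
    "Student": (41800, 92),
    "Unknown": (17119, 72),
}

total_edits = sum(edits for edits, _ in ranks.values())
total_users = sum(users for _, users in ranks.values())

# Per-rank share of edits and of users, rounded to whole percentages.
for rank, (edits, users) in ranks.items():
    print(f"{rank:<10} {edits / total_edits:>4.0%} of edits, "
          f"{users / total_users:>4.0%} of users")

# Combined share for librarians and staff, computed from raw counts
# rather than by adding the already-rounded percentages.
librarian_staff = ranks["Librarian"][0] + ranks["Staff"][0]
print(f"Librarian + Staff: {librarian_staff / total_edits:.1%} of edits")
```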

This is the second in a series of posts related to metadata edit events in the UNT Libraries’ Digital Collections. Check back for the next installment.

As always feel free to contact me via Twitter if you have questions or comments.

Ed Summers: The Adventure of Experiment

planet code4lib - Sat, 2015-03-28 11:50

Love of certainty is a demand for guarantees in advance of action. Ignoring the fact that truth can be bought only by the adventure of experiment, dogmatism turns truth into an insurance company. Fixed ends upon one side and fixed “principles” — that is authoritative rules — on the other, are props for a feeling of safety, the refuge of the timid, and the means by which the bold prey upon the timid.

John Dewey in Human Nature and Conduct (p. 237)

Nicole Engard: Bookmarks for March 27, 2015

planet code4lib - Fri, 2015-03-27 20:30

Today I found the following resources and bookmarked them on Delicious.


The post Bookmarks for March 27, 2015 appeared first on What I Learned Today....

Related posts:

  1. Herding Cattle
  2. Google Floor Plans
  3. Planning to Travel?

DPLA: DPLAfest in Light of SEA 101

planet code4lib - Fri, 2015-03-27 19:18

In my social media feeds yesterday, I saw some friends and acquaintances say that they were reconsidering their attendance at DPLAfest, scheduled to be held in Indianapolis, IN, April 17-18, in light of the recent signing of SEA 101, or the “Religious Freedom Restoration Act,” into law by Governor Pence of Indiana.  I must admit that as an openly gay employee at DPLA, I had an immediate and strong negative reaction.  I was unhappy about my organization spending money in a place that would allow businesses not to serve me simply because I am gay.

However, after more thought and a night of sleep, I have come to a different conclusion.  The passing of this law should make us all want to attend DPLAfest even more than we might have before.  We should want to support our hosts and the businesses in Indianapolis who are standing up against this law, and we should make it clear that our money will only be spent in places that welcome all.

At DPLA, we have already begun working diligently to ensure that all the venues we are supporting welcome all of the DPLA staff and community.  Messages like these have already helped put our minds at ease about a number of our scheduled activities:

Stickers like the one below are going to help us know which businesses to support while we are in Indianapolis:

At DPLAfest, we will also have visible ways to show that we are against this kind of discrimination, including enshrining our values in our Code of Conduct.  We encourage you to use this as an opportunity to let your voice and your dollars speak.  Let’s use this as a time to support those businesses and venues that support true freedom, all while enjoying each other’s company and a great conference lineup!


Emily Gore

DPLA Director for Content

HangingTogether: Round of 16: The plot thickens … and so do the books

planet code4lib - Fri, 2015-03-27 17:29

OCLC Research Collective Collections Tournament


Our second round of competition is complete, and only eight conferences remain standing! And yes, our tournament Cinderella, Big South, is still with us! Details below, but here are the Round of 16 results:


Competition in this round was on book length: which conference has the thickest books?* Big South, continuing its magical tournament run, ended up with the thickest books of all the conferences, averaging about 292 pages and ousting the powerful Big Ten from the tournament! West Coast also continues on to the next round, with a convincing victory over the Ivy Leaguers! Summit League, Ohio Valley, Atlantic 10, Missouri Valley, and Big Sky will also move on to the Round of 8. Conference USA and American Athletic had the tightest battle, with Conference USA coming out on top by fewer than 10 pages!

While Big South had the thickest books of all the conferences competing in this round (averaging about 292 pages), the Ivy League had the thinnest books, averaging about 225 pages. Does this surprise you? It turns out that the larger the size of the collective collection, the thinner the books. Take a look at this:


Big South had the smallest collective collection among the conferences competing in this round; the Ivy League had the largest. As the chart shows, there is a pretty strong correlation between collection size and the percentage of the collection accounted for by books with fewer than 100 pages. Got any ideas why? Put them in the comments!

By the way, in case you were wondering, the average length of a print book in WorldCat is about 255 pages.

Bracket competition participants: Remember, if the conference you chose has been ousted from the tournament, do not despair! If no one picked the tournament Champion, all entrants will be part of a random drawing for the big prize!

The Round of 8 is next, where the tournament field will be reduced to just four conferences! Results will be posted March 31.


*Average number of pages per print book in conference collective collection. Data is current as of January 2015.


More information:

Introducing the 2015 OCLC Research Collective Collections Tournament! Madness!

OCLC Research Collective Collections Tournament: Round of 32 Bracket Revealed!

Round of 32: Blow-outs, buzzer-beaters, and upsets!

About Brian Lavoie

Brian Lavoie is a Research Scientist in OCLC Research. Brian's research interests include collective collections, the system-wide organization of library resources, and digital preservation.


Sean Chen: Waving a Dead Fish

planet code4lib - Fri, 2015-03-27 16:05

I’ve been using Vagrant and VirtualBox for development on my OS X machines for my solo projects. But in an effort to get an intern started on developing a front end to a project I began a while ago, I ran into a really strange problem getting Vagrant working on Windows.

So here it is, as a cautionary tale for whatever robot wants to pick up this bleg.

The setup: a Boot Camp partition on a Mid-2010 MacBook Pro, running a dormant OS X and a full Windows 7. Windows 7 is the main environment.

Use the Git Bash shell, since it includes SSH, to stand up the boxes with vagrant init and vagrant up.

And then it got stuck (similar to Vagrant stuck connection timeout retrying):

    ==> default: Clearing any previously set network interfaces...
    ==> default: Preparing network interfaces based on configuration...
        default: Adapter 1: nat
        default: Adapter 2: hostonly
    ==> default: Forwarding ports...
        default: 22 => 2222 (adapter 1)
    ==> default: Booting VM...
    ==> default: Waiting for machine to boot. This may take a few minutes...
        default: SSH address:
        default: SSH username: vagrant
        default: SSH auth method: private key
        default: Error: Connection timeout. Retrying...
        default: Error: Connection timeout. Retrying...
        default: Error: Connection timeout. Retrying...
        default: Error: Connection timeout. Retrying...
        default: Error: Connection timeout. Retrying...
        default: Error: Connection timeout. Retrying...
        default: Error: Connection timeout. Retrying...
        default: Error: Connection timeout. Retrying...
        default: Error: Connection timeout. Retrying...
        default: Error: Connection timeout. Retrying...
        default: Error: Connection timeout. Retrying...
        default: Error: Connection timeout. Retrying...
        default: Error: Connection timeout. Retrying...

Well, we booted the VM with a head, and it looked like the boot was interrupted by some sort of kernel panic due to:

Spurious ACK on isa0060/serio0. Some program might be trying to access hardware directly.

Ok makes sense…the machine isn’t booting up and there has to be a reason why.

Long story short: the Windows 7 partition didn’t have virtualization enabled, and there is no BIOS setting or switch anywhere to turn it on. So what do you do?

How to enable hardware virtualization on a MacBook?

Like waving a dead fish in front of your computer.

  • Boot into OS X.
  • Open System Preferences and select the Startup Disk preference pane
  • Select the Boot Camp partition with Windows
  • Restart into the Boot Camp partition
  • Magic

Go figure

FOSS4Lib Recent Releases: Goobi - 1.11.0

planet code4lib - Fri, 2015-03-27 14:10

Last updated March 27, 2015. Created by Peter Murray on March 27, 2015.

Package: Goobi
Release Date: Wednesday, March 25, 2015

