Feed aggregator

Peter Murray: DLTJ in a #NEWWWYEAR

planet code4lib - Sun, 2017-12-31 22:36

I came across Jen Simmons's call for a #newwwyear by way of Eric Meyer:

Ok, here’s the deal. Tweet your personal website plan with the hashtag #newwwyear (thanks @jamiemchale!):
1) When will you start?
2) What will you try to accomplish?
3) When is your deadline?

Improve an existing site. Start a new one. Burn one down & start over. It’s up to you.

— Jen Simmons (@jensimmons) December 20, 2017

Eric's goal:

I plan to participate in #newwwyear (see My plan:

1) I’ll start December 27th.
2) I’ll redesign for the first time in a dozen years, and I’ll do it live on the production site.
3) My deadline is January 3rd, so I’ll have a week.

— Eric Meyer (@meyerweb) December 22, 2017

I was definitely inspired by Eric's courage to do it live on his personal website (see this YouTube playlist). Watch him work without a net (and without a CSS compiler!). I also know that the layout for this blog is based on the WordPress Twenty-eleven theme with lots of bits and pieces added by me throughout the years. It is time for a #newwwyear refresh. Here is my plan.

1) I'll start January 1.
2) I'll convert '' from WordPress to a static site generator (probably Jekyll) with a new theme.
3) My deadline is January 13 (with a conference presentation in between the start and the end dates).

Along the way, I'll probably package up a stack of Amazon Web Services components that mimic the behavior of GitHub Pages with some added bonuses (standing temporary versions of the site based on branches, supporting HTTPS for custom domains, integration with GitHub pull request statuses). I'll probably even learn some Ruby in the process.

Eric Hellman: 2017: Not So Prime

planet code4lib - Sat, 2017-12-30 17:21
Mathematicians call 2017 a prime year because 2017 has no positive divisors other than 1 and 2017. Those crazy number theorists.

I try to write at least one post here per month. I managed two in January. One of them raged at a Trump executive order that compelled federal libraries to rat on their users. Update: Trump is still president.  The second pointed out that Google had implemented cookie-like user tracking on previously un-tracked static resources like Google Fonts, jQuery, and Angular. Update: Google is still user-tracking these resources.

For me, the highlight of January was marching in Atlanta's March for Social Justice and Women with a group of librarians.  Our chant: "Read, resist, librarians are pissed!"

In February, I wrote about how to minimize the privacy impact of using Google Analytics. Update: Many libraries and publishers use Google Analytics without minimizing privacy impact.

In March, I bemoaned the intense user tracking that scholarly journals force on their readers. Update: Some journals have switched to HTTPS (good) but still let advertisers track every click their readers make.

I ran my first-ever half-marathon!

In April, I invented CC-licensed "clickstream poetry" to battle the practice of ISPs selling my clickstream.  Update: I sold an individual license to my poem!

I dressed up as the "Trump Resistor" for the Science March in New York City. For a brief moment I trended on Twitter. As a character in Times Square, I was more popular than the Naked Cowboy!

In May, I tried to explain Readium's "lightweight DRM". Update: No one really cares - DRM is a fig leaf anyway.

In June, I wrote about digital advertising and how it has eviscerated privacy in digital libraries.  Update: No one really cares - as long as PII is not involved.

I took on the administration of the free-programming-books repo on GitHub.  At almost 100,000 stars, it's the 2nd most popular repo on all of GitHub, and it amazes me. If you can get 1,000 contributors working together towards a common goal, you can accomplish almost anything!

In July, I wrote that works "ascend" into the public domain. Update: I'm told that Saint Peter has been reading the ascending-next-monday-but-not-in-the-US "Every Man Dies Alone".

I went to Sweden, hiked up a mountain in Lappland, and saw many reindeer.

In August, I described how the National Library of Medicine lets Google connect Pubmed usage to Doubleclick advertising profiles. Update: The National Library of Medicine still lets Google connect Pubmed usage to Doubleclick advertising profiles.

In September, I described how user interface changes in Chrome would force many publishers to switch to HTTPS to avoid shame and embarrassment. Update: Publishers such as Elsevier, Springer and Proquest switched services to HTTPS, avoiding some shame and embarrassment.

I began to mentor two groups of computer-science seniors from Stevens Institute of Technology, working on projects for and Gitenberg. They are a breath of fresh air!

In October, I wrote about new ideas for improving user experience in ebook reading systems. Update: Not all book startups have died.

In November, I wrote about how the Supreme Court might squash out an improvement to the patent system. Update: no ruling yet.

I ran a second half marathon!

In December, I'm writing this summary. Update: I've finished writing it.

On the bright side, we won't have another prime year until 2027. 2018 is twice a prime year. That hasn't happened since 1994, the year Yahoo was launched and the year I made my first web page!
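The number claims in this post are easy to check by brute force. The following is a minimal sketch using plain trial division; the helper names (`is_prime`, `is_twice_prime`, `next_year`, `previous_year`) are made up for this illustration.

```python
def is_prime(n: int) -> bool:
    """Trial division, which is plenty for four-digit years."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def is_twice_prime(n: int) -> bool:
    """True when n equals 2 times a prime."""
    return n % 2 == 0 and is_prime(n // 2)


def next_year(after: int, predicate) -> int:
    """First year strictly after `after` that satisfies the predicate."""
    year = after + 1
    while not predicate(year):
        year += 1
    return year


def previous_year(before: int, predicate) -> int:
    """Last year strictly before `before` that satisfies the predicate."""
    year = before - 1
    while not predicate(year):
        year -= 1
    return year


if __name__ == "__main__":
    print(is_prime(2017))                        # 2017 is a prime year
    print(next_year(2017, is_prime))             # the next prime year
    print(is_twice_prime(2018), 2018 // 2)       # 2018 = 2 * 1009
    print(previous_year(2018, is_twice_prime))   # the last twice-a-prime year
```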

Open Knowledge Foundation: Data aggregators: a solution to open data issues

planet code4lib - Thu, 2017-12-28 13:40

This is a guest opinion piece written by Giuseppe Maio and Jedrzej Czarnota PhD. Their biographies can be found below this post.

Open Knowledge International’s report on the state of open data identifies the main problems affecting open government data initiatives. These are: the very low discoverability of open data sources, which were rightly described as being “hard or impossible to find”; the lack of interoperability of open data sources, which are often very difficult to use; and the lack of a standardised open license, which represents a legal obstacle to data sharing. These problems harm the very essence of the open data movement, which advocates data that is easy to find, free to access and easy to reuse.

In this post, we will argue that data aggregators are a potential solution to the problems mentioned above. Data aggregators are online platforms which store data of various kinds at one central location to be utilised for different purposes. We will argue that data aggregators are, to date, one of the most powerful and useful tools to handle open data and resolve the issues affecting it.

We will provide evidence in favour of this argument by observing how the FAIR principles, namely Findability, Accessibility, Interoperability and Reusability, are put into practice by four different data aggregators engineered in Indonesia, the Czech Republic, the US and the EU. FAIR principles are commonly utilised as a benchmark to assess the quality of open data initiatives, and good FAIR practices are promoted by policymakers.

Image: SangyaPundir (Wikimedia Commons)

We will also assess the quality of aggregators’ data provision tools. Aggregators’ good overall performance on the FAIR indicators and the good quality of their data provision tools will demonstrate their importance. In this post, we will first define data aggregators and present the four data aggregators previously mentioned. Subsequently, we will discuss the performance of the aggregators on the FAIR indicators and the quality of their data provision.

Data aggregators

Data aggregators perform two main functions: data aggregation and integration. Aggregation consists of creating hubs where multiple data sources can be accessed for various purposes. Integration refers to linked data, namely data to which a semantic label (a name describing a variable) is attached in order to allow for the integration and amalgamation of different data sources (Mazzetti et al 2015, Hosen and Alfina 2016, Qanbari et al 2015, Knap et al 2012).
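To make the two functions concrete, here is a minimal sketch in Python. It is not taken from any of the platforms discussed in this post: the source names, field names and the `SEMANTIC_LABELS` mapping are all hypothetical, and a real aggregator would typically rely on RDF vocabularies rather than a hand-written dictionary.

```python
from typing import Dict, List

# Hypothetical mapping from each source's local field names to shared
# semantic labels. None of these names come from the platforms discussed.
SEMANTIC_LABELS: Dict[str, Dict[str, str]] = {
    "source_a": {"kota": "city", "lowongan": "vacancies"},
    "source_b": {"town": "city", "open_jobs": "vacancies"},
}


def integrate(source: str, record: Dict[str, object]) -> Dict[str, object]:
    """Integration: rename a record's fields to the shared semantic labels."""
    labels = SEMANTIC_LABELS[source]
    unified = {labels[field]: value
               for field, value in record.items() if field in labels}
    unified["provenance"] = source  # remember where the record came from
    return unified


def aggregate(sources: Dict[str, List[Dict[str, object]]]) -> List[Dict[str, object]]:
    """Aggregation: collect the records of every source into one hub."""
    hub: List[Dict[str, object]] = []
    for name, records in sources.items():
        hub.extend(integrate(name, record) for record in records)
    return hub


if __name__ == "__main__":
    hub = aggregate({
        "source_a": [{"kota": "Jakarta", "lowongan": 120}],
        "source_b": [{"town": "Prague", "open_jobs": 45}],
    })
    for row in hub:
        print(row)
```

Because both sources end up using the same labels, records that were originally described with different variable names can be queried side by side.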

Following on from that, two strengths characterise data aggregators. Firstly, aggregators implement the so-called “separation of concerns”: this means that each actor is responsible for one functionality. Separation of concerns spurs accountability and improves data services. Secondly, aggregators host added-value services such as semantics, data transformation and data visualisation (Mazzetti et al 2015). However, aggregators face a major challenge as they represent a “single point of failure”: when an aggregator breaks down, the whole system (including data providers and users) is put in jeopardy.

In this post we investigate the Indonesian Active Hiring website, the Czech ODCleanStore, the US-based and the EU-funded ENERGIC-OD.  

  1. The Active Hiring website is a portal that monitors job hiring trends by sector, geographical area and job type. The platform utilises open and linked data (Hosen and Alfina 2016).
  2. ODCleanStore is a project that enables automated data aggregation, simplifying previous aggregation processes; the website provides provenance metadata (metadata showing the origin of the data) and information on data trustworthiness (Knap et al 2012).
  3. is a platform that catalogues raw data, providing open APIs to government data. This portal is part of the Gov 2.0 movement.
  4. ENERGIC-OD (European Network for Redistributing Geospatial Information to user Community – Open Data) is a European Commission-funded project which aims to facilitate access to the Geographic Information System (GIS) open data.  The project built a pan-European Virtual Hub (pEVH), a new technology brokering together diverse GIS open data sources.

FAIR indicators and quality of data provision to evaluate data aggregators

FAIR principles and the quality of data provision are the criteria for the assessment of open data aggregators.

Findability. Data aggregators by default increase the discoverability of open data, as they assemble data in just one site. Aggregators, however, do not fully resolve the problem of lack of discoverability: they merely change its nature. Previously, findability was associated with technical problems (data was available, but technical skills were needed to extract it from various original locations); now it is intertwined with marketing ones (data is in one place, but it may be that no one is aware of it). Aggregators thus address the findability issues but do not fully resolve them.

Accessibility. Aggregators perform well on the Accessibility indicator. As an example, ENERGIC-OD makes data very accessible through the use of a single API. ‘s proposed new unit, Data Compute Unit (DCU), provides APIs to render data accessible and usable. ODCleanStore converts data into RDF format, which makes it more accessible. Finally, the Active Hiring website will provide data as CSV through API(s). Aggregators thus show improved data accessibility.

Interoperability. All platforms produce metadata (ENERGIC-OD, and linked data (Active Hiring Website and ODCleanStore) which make data interoperable, allowing it to be integrated, thus contributing to the resolution of the non-interoperability issue.

Reusability. ENERGIC-OD’s freemium model promotes reusability. data can be easily downloaded and reutilised as well. ODCleanStore guarantees re-use, as data is licensed with Apache 2.0, while Active Hiring allows visualisation only. Thus, three out of four aggregators enhance the reusability of the data, showing a good performance on the reusability indicator.

Quality of data provision. A web crawler is used in the ENERGIC-OD and Active Hiring websites. This is a programme which sifts the web searching for data in an automated and methodical way. ODCleanStore acquires data in the following ways: 1) through a “data acquisition module” which collects government data from a great many different sources in various formats and converts it into RDF (Knap et al 2012); 2) through the usage of a web service for publishers; 3) directly, as data can be sent as RDF. In the case of, the government sends data directly to the portal. Three out of four aggregators show automated or semi-automated ways of acquiring data, rendering this process smoother.


This post analysed the performance of four data aggregators on the FAIR principles. The overall good performance of the aggregators demonstrates how they render the process of data provision smoother and more automated, improving open data practices. We believe that aggregators are among the most useful and powerful tools available today to handle open data.

  • Hosen, A. and Alfina, I. (2016). Aggregation of Open Data Information using Linked Data: Case Study Education and Job Vacancy Data in Jakarta. IEEE, pp.579-584.
  • Knap, T., Michelfeit, J. and Necasky, M. (2012). Linked Open Data Aggregation: Conflict Resolution and Aggregate Quality. IEEE 36th International Conference on Computer Software and Applications Workshops, pp.106-111.
  • Mazzetti, P., Latre, M., Bauer, M., Brumana, R., Brauman, S. and Nativi, S. (2015). ENERGIC-OD Virtual Hubs: a brokered architecture for facilitating Open Data sharing and use. IEEE eChallenges e-2015 Conference Proceedings, pp.1-11.
  • Qanbari, S., Rekabsaz, N. and Dustdar, S. (2015). Open Government Data as a Service (GoDaaS): Big Data Platform for Mobile App Developers. IEEE 3rd International Conference on Future Internet of Things and Cloud, pp.398-403.


Giuseppe Maio is a research assistant working on innovation at Trilateral Research. You can contact him at His twitter handle is @pepmaio. Jedrzej Czarnota is a Research Analyst at Trilateral Research. He specialises in innovation management and technology development. You can contact Jedrzej at and his Twitter is @jedczar.

District Dispatch: Keeping public information public: Records scheduling, retention and destruction

planet code4lib - Thu, 2017-12-28 13:32

This is the third post in a three-part series on how federal records laws ensure government information is appropriately managed, preserved and made available to the public (see also part one and part two).

The National Archives and Records Administration, which holds nearly 5 million cubic feet of archival records, helps federal agencies determine which records are deemed to be “of permanent value” and which can be disposed of, and when and how. Photo by Wknight94

Earlier posts in this series discussed the need for the federal government to have a system for determining how long to keep records and when to dispose of them. Just as a library deaccessions materials when they are no longer needed for that library’s purposes, the federal government also must dispose of records that it no longer needs. Without disposal, the haystack of records would grow ever larger, making it that much harder to find the needle.

Only 1-3% of records created by the federal government are deemed to be of permanent value. Agencies must transfer those records to the National Archives and Records Administration (NARA) and dispose of the rest after an appropriate period of time. Currently, NARA holds nearly 5 million cubic feet of archival records – and that number continues to grow. You can see why there has to be a way to separate the wheat from the chaff.

What are agency responsibilities to retain federal records?

The Federal Records Act (FRA) is the law that sets those standards. In turn, NARA provides guidance to federal agencies on their responsibilities under the FRA. The FRA requires agencies to safeguard federal records so that they are not lost or destroyed without authorization.

Agencies must maintain federal records as long as they have “sufficient administrative, legal, research, or other value to warrant their further preservation by the Government.” In order to do so, agencies must determine the “value” of a particular record.

How do agencies determine the expected value of a record?

Given the huge volume of records created by the federal government, it would be impractical to review every individual record in order to determine its value. Instead, agencies predict the value of records in advance through the use of categories. The agency categorizes the types of records it produces, and then, for each category, predicts how far in the future those records will be useful.

For instance, consider official agency documents explaining the rationale behind a new regulation. We can imagine that, thirty years later, people might want to know why the agency issued that regulation. But there would likely not be as much interest in the agency’s cafeteria menu after that length of time. This process is known as “records scheduling,” because the agency is creating a schedule for how long to retain each type of record.

However, the agency does not make scheduling decisions on its own. Agencies submit proposed schedules to NARA for its appraisal. If NARA agrees with the agency’s proposal, then it publishes a notice in the Federal Register describing the records and inviting comments from the public. After reviewing any comments, NARA may approve the schedule, and the agency may then begin disposing of records according to that schedule. Approved schedules are posted online to help the public identify how long particular records will be retained.
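As a rough illustration of how a category-based schedule works, the sketch below models a schedule as a lookup table. Everything in it is hypothetical: the two categories echo the examples above, but the retention periods are invented, and the NARA appraisal and public comment steps are left out.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class ScheduleEntry:
    category: str
    retention_years: Optional[int]  # None stands for "of permanent value"


# Hypothetical schedule. The retention periods are invented for illustration
# and do not come from any real agency schedule.
SCHEDULE = {
    "regulation_rationale": ScheduleEntry("regulation_rationale", None),
    "cafeteria_menu": ScheduleEntry("cafeteria_menu", 1),
}


def disposition(category: str, created: date, today: date) -> str:
    """Decide what to do with a record based only on its category and age."""
    entry = SCHEDULE[category]
    if entry.retention_years is None:
        return "transfer to NARA"
    # Compare (year, month, day) tuples so a Feb 29 creation date is safe.
    eligible_from = (created.year + entry.retention_years, created.month, created.day)
    if (today.year, today.month, today.day) >= eligible_from:
        return "eligible for disposal"
    return "retain"


if __name__ == "__main__":
    today = date(2017, 12, 28)
    print(disposition("cafeteria_menu", date(2016, 1, 15), today))
    print(disposition("cafeteria_menu", date(2017, 6, 1), today))
    print(disposition("regulation_rationale", date(1987, 3, 2), today))
```

The point of the design is that no individual record is ever reviewed: the decision follows from the category alone, which is what makes scheduling workable at federal scale.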

How can the records scheduling process be improved?

Typically, there is not much controversy about records schedules; rightly or wrongly, they’re generally seen as pretty mundane. But that’s not always the case.

For instance, in July, NARA published a notice of proposed records schedules from Immigration and Customs Enforcement (ICE) for several categories of records related to detainees. After reviewing the proposed schedules, the ACLU and other organizations filed comments arguing that the proposed retention schedule “does not account for the needs of the public, impacted individuals and government officials to conduct necessary, and in some cases required, evaluation of ICE detention operations.” In response, NARA has not yet approved the ICE proposal, and changes are expected before it is approved.

ALA believes that the records scheduling process can be improved to better ensure that important records are retained for an adequate period of time. In November, ALA sent a letter recommending that NARA increase the transparency of proposed records schedules and review its appraisal policy to give greater consideration to the potential value of records. ALA will continue to work with NARA and agencies to advocate for the proper preservation of government information and its availability to the public.

The post Keeping public information public: Records scheduling, retention and destruction appeared first on District Dispatch.

Terry Reese: MarcEdit MacOS 3 Design notes

planet code4lib - Thu, 2017-12-28 00:23

** Updated 12/28 **

Ok, so I’m elbow deep in putting some of the final touches on the MacOS version of MarcEdit. Most of the changes to be completed are adding new functionality (introduced in MarcEdit 7 in recent updates), implementing the new task browser, updating the terminal mode, and completing some of the UI touches. My plan has been to target Jan. 1 as the release date for the next MacOS version of MarcEdit, but at this point, I’m thinking this version will release sometime between Jan. 1 and Jan. 7. Hopefully, folks will be OK if I need a little bit of extra time.

As I’m getting closer to completing this work, I wanted to talk about some of the ways that I’m thinking about how the MacOS version of MarcEdit is being redesigned.

  1. As much as possible, I’m syncing the interfaces so that all versions of MarcEdit will fundamentally look the same. In some places, this means significantly redoing parts of the UI. In others, it’s mostly cosmetic (colors). While the design does have to stay within Apple’s UI best practices (as much as possible), I am trying to make sure that the interfaces will be very similar. Probably the biggest differences will be in the menuing. To do this, I’ve had to improve the thread queuing system that I’ve developed in the MacOS version of MarcEdit to make up for some of the weaknesses in how default UI threads work.
  2. One of the challenges I’ve been having is related to some of the layout changes in the UI. To simplify this process, the initial version of the Mac update won’t allow a lot of the forms to be resized. The form itself will resize automatically based on the font and font sizes selected by the user – but resizing and scaling all the items on the window and views automatically and in spatial context is proving to be a real pain. Looking at a large number of Mac apps, window resizing doesn’t always appear to be available though (unless the window is more editor based) – so maybe this won’t be a big issue.
  3. Like MarcEdit 7, I’m working on integrations: enhancing the OCLC integrations, updating the ILS integrations, integrating a lot of help into MarcEdit MacOS, adding new wizards, integrating plugins. With the MacOS update, I’m trying to make sure that I fully embrace one of the key MarcEdit 7 design rules: that MarcEdit should integrate with or simplify the moving of data between systems (because you are probably using more programs than MarcEdit).
  4. MarcEdit MacOS 3 should be functionally equivalent to MarcEdit 7, with the following exceptions:
    1. There is no COM functionality (this is Windows only)
    2. Initially, there will be no language switching (the way controls are named is very different than on Windows – I haven’t figured out how to connect the differences)
  5. Like MarcEdit 7, I’m targeting this build for newer versions of MacOS. In this case, I’ll be targeting 10.10+. This means that users will need to be running Yosemite (released in 2014) or later. Let me know if this is problematic. I can push this down one, maybe two versions – but I’m just not sure how common older OS X versions are in the wild.


Here’s a working wireframe of the new main MacOS MarcEdit update.

Anyway – these are the general goals. All the functional code written for MarcEdit 7 has been able to be reused in MarcEdit MacOS 3, so like the MarcEdit 7 update, this will be a big release.

As I’ve noted before, I’m not a Mac user. I’ve spent more time in the ecosystem to get a better idea of how programs handle (or don’t handle) resizing windows, etc. – but fundamentally, working with MacOS feels like working with a broken version of Linux. So, with that in mind – I’m developing MarcEdit MacOS 3 using the concepts above. Once completed, I’d be happy to hear from primarily MacOS users and talk about how some of the UI design decisions might be updated to make the program easier for them.



District Dispatch: April CopyTalk on the calendar: fair use confidence

planet code4lib - Wed, 2017-12-27 22:40

Originally scheduled for January 4, 2018, this CopyTalk is being postponed to April 5. 

Please join us for “Assessing Librarians Confidence and Comprehension in Explaining Fair Use Following an Expert Workshop” presented by Sara Benson, Copyright Librarian at the University of Illinois.

Sara will discuss her study that evaluated librarian confidence and comprehension following an expert-led fair use training session. The results, though limited in scope, provide encouraging evidence that librarians can tackle the concept of fair use when provided with appropriate training. Both the level of confidence and the level of comprehension rose after the librarian participants were provided with training, indicating that the training did have an impact. Further evidence of impact was revealed by the survey distributed two weeks after the training, in which some librarians noted that they had had the opportunity to use the skills learned in the training workshop during their daily work. The findings could contribute to the enhancement of copyright training and workshops in libraries.

Sara Benson was a practicing lawyer before teaching at the University of Illinois College of Law. Now she is also a librarian with an MLIS degree from the iSchool at Illinois.

Mark your calendars and set aside some time for our hour-long free webinar on Thursday, April 5, at 2 p.m. Go to and sign in as a guest.

This program is brought to you by the Copyright Education Subcommittee.

Did you miss a CopyTalk? Check out our CopyTalk webinar archive!

The post April CopyTalk on the calendar: fair use confidence appeared first on District Dispatch.

Meredith Farkas: My year in books 2017

planet code4lib - Wed, 2017-12-27 19:50

Reading this year has been so many things for me. An escape. A way to educate myself. A way to see my own struggles in a different way through another’s story. A way to understand the struggles of others. A way to better understand where I came from. This year I think I’ve read more than I had in any other year since college. I read 16 of the 22 books I’d hoped to read this year, which feels like an accomplishment. Books with asterisks are ones that I didn’t finish (because I couldn’t get into them) or have not yet read in full (as is the case of the books by Nguyen, Clinton, and Clements). I struggled this year to think of what my favorite book was, but The Underground Railroad, Little Fires Everywhere, The Hate U Give, and Bad Feminist were definitely highlights.

This year, I also listed the children’s books I’ve either read with Reed or on my own. Reed is part of an Oregon Battle of the Books team this year and I’m their coach, so I’ve been slowly reading the 8 books he was assigned to read so I can help quiz him. He’s finally gotten into reading, which I could not be more thrilled about. I remember having the same experience the summer before I started third grade: one moment, I hated reading; the next, I was in love.

Adult and Young Adult Fiction

  • Willful Disregard: A Novel About Love by Lena Andersson
  • All Grown Up by Jami Attenberg
  • The Water Knife by Paolo Bacigalupi
  • The Noise of Time by Julian Barnes*
  • The Idiot by Elif Batuman
  • The Sellout by Paul Beatty
  • Outline by Rachel Cusk*
  • The Arrangement by Sarah Dunn
  • Eleven Hours by Pamela Erens
  • Turtles All the Way Down by John Green
  • Since We Fell by Dennis Lehane*
  • It Can’t Happen Here by Sinclair Lewis
  • The Association of Small Bombs by Karan Mahajan
  • Behold the Dreamers by Imbolo Mbue
  • Norwegian by Night by Derek Miller
  • The Bluest Eye by Toni Morrison
  • I’ll Give You the Sun by Jandy Nelson
  • Little Fires Everywhere by Celeste Ng
  • The Sympathizer by Viet Thanh Nguyen*
  • Commonwealth by Ann Patchett
  • Eleanor and Park by Rainbow Rowell
  • The Hate U Give by Angie Thomas
  • Chemistry by Weike Wang
  • The Underground Railroad by Colson Whitehead
  • The Sun is Also a Star by Nicola Yoon


  • Voices from Chernobyl by Svetlana Alexievich
  • My Name is Freida Sima: The American-Jewish Women’s Immigrant Experience Through the Eyes of a Young Girl from the Bukovina by Judith Tydor Baumel-Schwartz (which is actually about my relatives!)
  • What Happened by Hillary Clinton*
  • Evicted: Poverty and Profit in the American City by Matthew Desmond
  • Love Warrior by Glennon Doyle*
  • Abandon Me: Memoirs by Melissa Febos*
  • Bad Feminist by Roxane Gay
  • The Morning They Came for Us: Dispatches from Syria by Janine di Giovanni
  • Lab Girl by Hope Jahren*
  • I Love Dick by Chris Kraus
  • Furiously Happy by Jenny Lawson
  • The Arm: Inside the Billion-Dollar Mystery of the Most Valuable Commodity in Sports by Jeff Passan
  • Becoming Habsburg: The Jews of Habsburg Bukovina, 1774-1918 by David Rechter*
  • Men Explain Things to Me by Rebecca Solnit
  • The Mother of All Questions by Rebecca Solnit

Children’s Fiction

  • Alien in My Pocket by Nate Ball
  • The Case of the Case of Mistaken Identity by Mac Barnett
  • Wild Life by Cynthia deFelice
  • The Terrible Two by Mac Barnett and Jory John
  • The Terrible Two Get Worse by Mac Barnett and Jory John
  • Keepers of the School Book 1: We the Children by Andrew Clements
  • Keepers of the School Book 2: Fear Itself by Andrew Clements*
  • The Westing Game by Ellen Raskin
  • Harry Potter and the Half-Blood Prince by J. K. Rowling
  • Harry Potter and the Deathly Hallows by J. K. Rowling
  • Holes by Louis Sachar
  • I Survived the Eruption of Mount St. Helens, 1980 by Lauren Tarshis

Here are some of the books I hope to read in 2018, though as always, I know that serendipity and the vagaries of Overdrive hold lists will impact my decision-making. If any of you have thoughts on these or alternative suggestions, let me know!

  • Beartown by Fredrik Backman
  • We Were Eight Years in Power by Ta-Nehisi Coates
  • Why I Am Not a Feminist: A Feminist Manifesto by Jessa Crispin
  • Manhattan Beach by Jennifer Egan
  • Fresh Complaint by Jeffrey Eugenides
  • My Brilliant Friend by Elena Ferrante
  • Difficult Women by Roxane Gay
  • Class Mom by Laurie Gelman
  • Homegoing by Yaa Gyasi
  • Exit West by Mohsin Hamid
  • Uncommon Type by Tom Hanks
  • Before the Fall by Noah Hawley
  • This Will Be My Undoing: Living at the Intersection of Black, Female, and Feminist in (White) America by Morgan Jerkins
  • Her Body and Other Parties: Stories by Carmen Maria Machado
  • Dark Money: The Hidden History of the Billionaires Behind the Rise of the Radical Right by Jane Mayer
  • Homesick for Another World by Ottessa Moshfegh
  • Nasty Women: Feminism, Resistance, and Revolution in Trump’s America edited by Samhita Mukhopadhyay and Kate Harding
  • So You Want to Talk About Race by Ijeoma Oluo
  • The Bright Hour by Nina Riggs
  • On Tyranny: Twenty Lessons from the Twentieth Century by Timothy Snyder
  • Sing, Unburied, Sing by Jesmyn Ward
  • The Best Kind of People: A Novel by Zoe Whittall

Terry Reese: MarcEdit 7: Holiday Edition

planet code4lib - Wed, 2017-12-27 17:43

I hope that this note finds everyone in good spirits. We are in the midst of the holiday season and I hope that everyone this reaches has had a happy one. If you are like me, the past couple of days have been spent cleaning up. There are boxes to put away, trees to un-trim, decorations to store away for another year. But one thing has been missing, and that has been my annual Christmas Eve update. Hopefully, folks won’t mind it being a little belated this year.

The update includes a number of updates – I posted about the most interesting (I think) here:, but the full changelog is below:

  • Enhancement: Clustering Tools: Added the ability to extract records via the clustering tooling
  • Enhancement: Clustering Tools: Added the ability to search within clusters
  • Enhancement: Linux Build created
  • Bug Fix: Clustering Tools: Numbering at the top wasn’t always correct
  • Bug Fix: Task Manager: Processing number count wouldn’t reset when run
  • Enhancement: Task Broker: Various updates to improve performance and address some outlier formats
  • Bug Fix: Find/Replace Task Processing: Task Editor was incorrectly always checking the conditional option. This shouldn’t affect the run, but it was messy.
  • Enhancement: Copy Field: Added a new field feature
  • Enhancement: Startup Wizard — added tools to simplify migration of data from MarcEdit 6 to MarcEdit 7

One thing I specifically want to highlight is the presence of a Linux build. I’ve posted a quick video documenting the installation process at: The MarcEdit 7 Linux build is much more self-contained than previous versions, something I’m hoping to do with the MacOS build as well. I’ll tell folks upfront, there are some UI issues with the Linux version – but I’ll keep working to resolve them. However, I’ve had a few folks asking about the tool, so I wanted to make it ready and available.

Throughout this week, I’ll be working on updating the MacOS build. I’ve fallen a little behind, so this build may take an extra week to complete (I was targeting Jan. 1; it might slip a few days past). In terms of functionality, I think folks will be happy, as it fills in a number of gaps while still integrating the new MarcEdit 7 functionality (including the new clustering tools).

As always, if you have questions, please let me know. Otherwise, I’d like to wish everyone a Happy New Year, filled with joy, love, friendship, and success.



Open Library: A Holiday Gift from Open Library: Introducing the Reading Log

planet code4lib - Wed, 2017-12-27 00:23

For years readers have been asking us for a convenient way to keep track of the books they’re reading.

As we prepare to step through the threshold into 2018, we’re happy to announce the release of a brand new Reading Log feature which lets you indicate all the books you’d like to read, books you’ve already read, and books you’re currently reading.

You can now mark a book as “Want to Read”, “Currently Reading”, or “Already Read”, in addition to adding it to one of your themed reading lists.

Here’s how it works!

Any time you go to a book page on Open Library, you will see a new green button with the text “Want to Read”. By clicking on this button, you can mark this book as a work you’d like to read. By clicking on the white dropdown arrow on the right side of the button, you can select from other options too, like “Currently Reading” or “Already Read”. Once you click one of these options, the green button will appear gray with a green check, indicating your selection has been saved.

Where can I review my Reading Log?

You can review the books in your Reading Log by clicking the “My Books” menu and selecting the “My Reading Log” option in the dropdown.

You can find a link to your Reading Log page under the “My Books” menu

From this page, you can manage the status of the books you’re reading and easily find them in the future.

A preview of the Reading Log page

Who can see my Reading Log selections?

Books you mark as “Want to Read”, “Currently Reading”, or “Already Read” are made private by default. We know some people want to share what books they’re reading. In the future, we hope to offer an option for readers to make their Reading Log public.

Can I Still Use Lists?

You can still use your existing Lists and even create new ones! In addition to giving you a convenient way to log your reading progress, the green dropdown menu also lets you add a book to one of your custom themed Lists.

Send Us Your Feedback!

We hope you love this new feature as much as we do and we’d love to hear your thoughts! Tweet us at @openlibrary. Is the Reading Log feature not working as you expect? Please tell us about any issues you experience here.

David Rosenthal: Updating Flash vs. Hard Disk

planet code4lib - Tue, 2017-12-26 16:00
Chris Mellor at The Register has a useful update on the evolution of the storage market based on analysis from Aaron Rakers. Below the fold, I have some comments on it. In order to understand them you will need to have read my post The Medium-Term Prospects for Long-Term Storage Systems from a year ago.

The first thing to note is that the transition to 3D is proceeding rapidly:
3D NAND bits surpassed 50 per cent of total NAND Flash bits supplied in 2017's third quarter, and are estimated to reach 85 per cent by the end of 2018,
Despite this increase in capacity, price per bit has increased recently. Rakers' sources all predict that prices will resume their decrease shortly. They didn't predict the increase, so some skepticism is in order. They differ about the rate of the decrease:
IDC thinks there will be a $/GB decline of 36 per cent year-on-year 2018. TrendForce (DRAMeXchange) recently forecast that 2018 NAND Flash price declines would be in the 10 to 20 per cent year-on-year range. Western Digital concurs with that from a 3D NAND viewpoint, and has reported having seen 3D NAND price declines in the 15 - 25 per cent per annum range.
So Rakers' graph is on the optimistic side, and TrendForce's estimate agrees with my projection. Rakers projects that SSD's gradual erosion of the hard disk market share will continue:
He looked at SSD ships related to disk drive ships on a capacity basis, seeing the flash percentage share rising to 19.3 per cent in 2021 from 8.4 per cent this year:
So, as I projected, Rakers agrees that in 2021 bulk data will still reside overwhelmingly on hard disk, with flash restricted to premium markets that can justify its higher cost.
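
As a rough check on the figures quoted above, here is a minimal Python sketch of the arithmetic. It is only a back-of-the-envelope illustration: it assumes that "this year" means 2017 and that growth and price declines compound at a constant annual rate, neither of which is stated in the quoted analysis. The function names are made up for this example.

```python
# Back-of-the-envelope arithmetic on the figures quoted above.
# Assumptions (not from the source): "this year" is 2017, and rates
# compound annually at a constant value.

def implied_annual_growth(start, end, years):
    """Constant annual growth rate that takes `start` to `end` in `years`."""
    return (end / start) ** (1 / years) - 1


def relative_price_after(annual_decline, years=1):
    """Price relative to today's after `years` of a constant annual decline."""
    return (1 - annual_decline) ** years


# Flash share of capacity shipped: 8.4% (assumed 2017) to 19.3% (2021).
share_growth = implied_annual_growth(8.4, 19.3, 2021 - 2017)

# Quoted $/GB declines, as (low, high) fractions per year.
forecasts = {
    "IDC": (0.36, 0.36),
    "TrendForce": (0.10, 0.20),
    "Western Digital (observed 3D NAND)": (0.15, 0.25),
}

if __name__ == "__main__":
    print(f"Implied annual growth in flash share: {share_growth:.1%}")
    for source, (low, high) in forecasts.items():
        best = relative_price_after(high)   # larger decline, lower price
        worst = relative_price_after(low)   # smaller decline, higher price
        print(f"{source}: price at {best:.0%} to {worst:.0%} of today's after one year")
```

Under these assumptions the flash share would have to grow by roughly 23 per cent a year to reach Rakers' 2021 figure, and even then about four-fifths of the capacity shipped would still be hard disk.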

Mellor ends on a cautionary note, with which I concur:
It's still not clear if QLC (4bits/cell) flash will actually be an enterprise-class technology. Flash capacity increases beyond that might stall because there is nothing beyond QLC, such as a theoretical PLC (penta level cell - 5bits/cell) technology, or layering beyond 96 x 3D NAND layers might hit a roadblock.

Cynthia Ng: Making the Choice: Personal over Professional

planet code4lib - Tue, 2017-12-26 10:33
This is going to be a relatively short post, but as many people have been asking me why I am leaving my current position, I thought I might as well do a brief post about the whole thing. TL;DR version: We’re moving and since my current position is not a remote/telecommute job, I am resigning. … Continue reading "Making the Choice: Personal over Professional"

Ed Summers: Dossier

planet code4lib - Tue, 2017-12-26 05:00

This is a snippet from Mayer-Schönberger (2011) that I happened to read soon after the Digital Blackness Symposium in Ferguson, Missouri. In one panel presentation we heard from a group of activists who spoke in part about how they wanted the records of protest in Ferguson to reflect how the activists changed as individuals. This was actually a theme that continued on from the first meeting a year earlier, where one of the main takeaways was that the archive needs to reflect voices in time–voices that are in the process of becoming. In Delete: The Virtue of Forgetting in the Digital Age, Mayer-Schönberger draws on similar ideas and specifically focuses on how digital media’s ability to collapse time can actually work to prevent change from happening:

But there is one further dimension of how digital remembering negates time. Earlier I mentioned digital dossiers and results of person queries in search engines. They are odd because they are limited to information available in digital format, and thus omit (possibly important) facts not stored in digital memory. Equally odd is how in such dossiers time is collapsed, and years–perhaps decades of life–are thrown together: that a person took out a loan a decade ago may be found next to the link to an academic paper she published six years later, as well as a picture posted on Flickr on last week. The resulting collage with all the missing (as well as possibly misleading) bits is no accurate reflection of the person themselves. Using an ever more comprehensive set of digital sources does not help much either. The resulting digital collage of facts would still be problematic. It would be like taking a box of unsorted photos of yourself, throwing on a table and thinking that by just looking hard enough at them you might gain a comprehensive accurate sense of who the person in the photos actually is today (if you think this example if far-fetched, just look at Flickr, and how it makes accessible photos taken over long periods of time). These digital collages combine in-numerous bits of information about us, each one (at best) having been valid at a certain point in our past. But as it is presented to us, it is information from which time has been eliminated; a collage in which change is visible only as tension between two contradicting facts, not as an evolutionary process, taking place over time.

Some worry that digital collages resemble momentary comprehensive snapshots of us frozen in time, like a photograph, but accessible to the world. Actually, digital collages are much more disquieting than that. They are not like one, but hundreds, perhaps thousands of snapshots taken over our lifetime superimposed over each other, but without the perspective of time. How can we grasp a sense of a person that way? How can we hope to understand how a person evolved over the years, adjusting his views, adapting his (changing) environment? How can we pretend to know who that person is today, and how his values, his thinking, his character have evolved, when all that we are shown is timeless collage of personal facts thrown together? Perhaps, the advocates of digital memory would retort that we could do so with appropriate digital filters, where through visual presentation time could be reintroduced, like color coding years, or sorting events. I am afraid it still might not fully address the problem. Because I fear the real bottleneck of conflating history and collapsing time is not digital memory, but human comprehension. Even if we were presented with a dossier of facts, neatly sorted by date, in our mind we would still have difficulties putting things in the right temporal perspective, valuing facts over time … From the perspective of the person remembering, digital memory impeded judgment. From the perspective of the person remembered, however, it denies development, and refuses to acknowledge that all humans change all the time. By recalling forever each of our errors and transgressions digital memory rejects our human capacity to learn from them, to grow and evolve.

(p. 123-124)

Part of me wants to say that this was the case with physical folders full of paper documents, and photographs too. It’s interesting to consider how digital media functions differently because of the way it implies completeness and obscures its gaps and inconsistencies. Indeed, Mayer-Schönberger points out that digital media actually is much more malleable on scales that are inconceivable with paper documents.


Mayer-Schönberger, V. (2011). Delete: The virtue of forgetting in the digital age. Princeton University Press.

Hugh Rundle: Dreaming bigger

planet code4lib - Sat, 2017-12-23 02:01

I missed the deadline for the GLAMBlogClub in November, ironically because I was going through a month that was particularly lacking in 'balance'. At my place of work, D-Day arrived on Melbourne Cup Day: we migrated from Amlib - a system we'd been using for 20 years - to Koha ILS. The migration itself went fairly smoothly, but inevitably there was a lot of work cleaning things up afterwards. Twenty years of custom, practice, and not-particularly-controlled data entry will mess with even the most comprehensive data migration plan. I put in some very long hours in November, but that's not sustainable for very long: I had to remind myself in the weeks that followed that I don't have to get everything done and everything perfect in the next week, or even the next month.

I've long been somewhat sceptical of the idea of 'work-life balance'. Working and 'life' are not two opposites to be balanced. In professions like librarianship this can be especially so: many librarians proudly identify as librarians primarily. This is not unproblematic, but it does highlight that separating 'life' from 'work' is quite artificial. The idea of 'balance' needs to also be interrogated. Working hard or long hours is not, in itself, necessarily unbalanced. The idea of flow or being 'in the zone' can often lead to long periods of productive and highly enjoyable work. Flow is pretty hard to achieve in a busy open plan office, but I've had periods of getting enjoyably lost in the Koha weeds over the last few months, piecing together how we might reconceptualise our collection management, or take advantage of a particular feature. My loss of balance was, rather, driven by a self-imposed requirement that our migration must go flawlessly, or I would somehow have let down the entire Koha and library community. In hindsight this was a stupid amount of pressure to put on myself, but it happened. At the very moment I should have been feeling excited and triumphant at what turned out to be a reasonably successful migration (system down for only one day, which was a public holiday anyway), I was instead exhausted, stressed and not much fun to be around. I'm about to start six weeks of leave, so no need to feel any kind of sympathy! I think the lesson here is to always maintain a sense of perspective, and have people around who can tell you when you're starting to get out of control.


Koha ILS is the first open source library management system in the world and I still can't quite believe I managed to jump through all the hoops to move my five-branch library service onto it. There are things about Koha that are magical because they just work the way people expect them to work, but naturally there are also parts of the system that are less well developed, or appear to have been poorly thought through, or simply work in a particular way that is fine for many libraries but not so great for ours. Such is the nature of software generally, but the beauty of an open source system is that if you have the money or skills, you can fix, improve or extend it. My argument for moving to Koha has, from the beginning, been about flexibility and local control of our tools, rather than about saving money. Of course, the way this all works is that open source software is collaborative. It's not just that individual developers or organisations can use the code, but rather that the system is built by the Koha community as a community. Bugs are listed, patches are tested and signed off, proposals are made, questions are asked and answered. It's not perfect. I'd like a lot more documentation, and exactly how the whole system works is pretty mysterious for newbies, but everyone involved recognises that: they're just all really busy. The people I've spoken to about this have all responded by encouraging me to help fix it, rather than responding in any defensive way: the way to make things better is to just make things better.

I've been a fan of open source - and Koha - for a long time, but this year as I've worked on Koha implementation I've been surprised how much moving to a completely open source architecture has changed the way I approach the options available to us for future projects. We're now running an open source web CMS (Joomla!), room and equipment booking system (Brushtail), and library management system (Koha). With our core system now open source, I've noticed that my colleagues and I are already starting to think much more in terms of "we could approach problem X with solution Y" rather than "I wish system A had feature B". It can be a subtle difference, but the very fact that we could adjust how our system works without asking anyone's permission has changed the way we think about what's possible at all. In reality, 'asking for permission' still happens with Koha core development, because patches are discussed, tweaked and signed-off. But this is collaboration with peers, not begging a vendor for a feature that's "not on their roadmap". For the ambitious solutions we'll need funding and fee-for-service development. But unlike the black box of proprietary software licensing, we'll have a reasonably clear idea what we're going to get for our money.

Seeing my own and others' thinking expand like this in such a small period of time, I've been thinking about the hidden benefits of open source. I've always thought about it as a collaborative way to make tools, and I've long thought that librarians need to make our own tools if we are to fulfil the promises our profession makes. But I'd not really consciously considered that open-source could also expand our collective horizons because of its collaborative nature. Institutions - especially those that are part of or funded by governments - naturally want expenditure to fall into neat boxes. This is the reason, I think, so many libraries buy off-the-shelf software licenses and subscription packages: it's easy to explain to Procurement, and means you can shift the responsibility onto the vendor. But with the responsibility goes the control. When we lose control, our mindsets change. We either become compliant, or we become demanding, but either way the scope becomes smaller. When you're spending all your energy begging for an extra option in that drop-down menu, you don't have the mental bandwidth to be dreaming up whole new use-cases for your software generally. Proprietary software has 'user groups' of course, but this is not collaboration - it's more akin to a club or a union. You can collectively bargain or beg, but you can't really collectively build and make. It's the expanded scale of what's possible now that changes things in a way I hadn't deeply thought about before. So I'm going to enjoy my summer break, but I've never been more excited about my work. Because while we still need to burn it all down, we've got to build something out of the ashes.

We'll dream bigger when we build it together.

Information Technology and Libraries: Everyone’s Invited: A Website Usability Study Involving Multiple Library Stakeholders

planet code4lib - Fri, 2017-12-22 16:40

This article describes a usability study of the University of Southern Mississippi Libraries’ website conducted in early 2016. The study involved six participants from each of four key user groups – undergraduate students, graduate students, faculty, and library employees – and consisted of six typical library search tasks such as finding a book and an article on a topic, locating a journal by title, and looking up hours of operation. Library employees and graduate students completed the study’s tasks most successfully, whereas undergraduate students performed fairly simple searches and relied on the Libraries’ discovery tool, Primo. The study’s results identified several problematic features that impacted each user group, including library employees. This increased internal buy-in for usability-related changes in a later website redesign. 

Information Technology and Libraries: Mobile Website Use and Advanced Researchers: Understanding Library Users at a University Marine Sciences Branch Campus

planet code4lib - Fri, 2017-12-22 16:40
This exploratory study examined the use of the Oregon State University Libraries website via mobile devices by advanced researchers at an off-campus branch location. Branch campus–affiliated faculty, staff, and graduate students were invited to participate in a survey to determine what their research behaviors are via mobile devices, including frequency of their mobile library website use and the tasks they were attempting to complete. Findings showed that while these advanced researchers do periodically use the library website via mobile devices, mobile devices are not the primary mode of searching for articles and books or for reading scholarly sources. Mobile devices are most frequently used for viewing the library website when these advanced researchers are at home or in transit. Results of this survey will be used to address knowledge gaps around library resources and research tools and to generate more ways to study advanced researchers’ use of library services via mobile devices.

Information Technology and Libraries: Editorial Board Thoughts: Reinvesting in Our Traditional Personnel Through Knowledge Sharing and Training

planet code4lib - Fri, 2017-12-22 16:40
Editorial Board Thoughts: Reinvesting in Our Traditional Personnel Through Knowledge Sharing and Training

Information Technology and Libraries: Metadata Provenance and Vulnerability

planet code4lib - Fri, 2017-12-22 16:40
The preservation of digital objects has become an urgent task in recent years as it has been realised that digital media have a short life span. The pace of technological change makes accessing these media more and more difficult. Digital preservation is accomplished by two main methods, migration and emulation. Migration has been proven to be a lossy method for many types of digital objects. Emulation is much more complex; however, it allows preserved digital objects to be rendered in their original format, which is especially important for complex types such as those made up of multiple dynamic files. Both methods rely on good metadata in order to maintain change history or construct an accurate representation of the required system environment. In this paper, we present our findings that show the vulnerability of metadata and how easily they can be lost and corrupted by everyday use. Furthermore, this paper aspires to raise awareness and to emphasise the necessity of caution and expertise when handling digital data by highlighting the importance of provenance metadata.

Information Technology and Libraries: Letter from the Editor

planet code4lib - Fri, 2017-12-22 16:40
Letter from the Editor.

District Dispatch: Senators introduce bipartisan Museum and Library Services Act of 2017

planet code4lib - Fri, 2017-12-22 14:06

Acknowledging the critical role of libraries as anchor institutions in communities across the nation, a group of senators under the leadership of Jack Reed (D-RI), Susan Collins (R-ME), Thad Cochran (R-MS), Kirsten Gillibrand (D-NY), and Lisa Murkowski (R-AK) introduced the bipartisan Museum and Library Services Act of 2017 (S. 2271).

The 2017 MLSA reauthorizes the Institute of Museum and Library Services (IMLS), showing congressional support for the federal agency. IMLS administers funding through the Library Services and Technology Act (LSTA), the only federal program that exclusively covers services and funding for libraries. The LSTA provides more than $183 million for libraries through the Grants to States program, the National Leadership Grants for Libraries, the Laura Bush 21st Century Librarian Program, and Native American Library Services.

To be clear, S. 2271 would not ensure full funding* for the programs libraries depend on. Reauthorization of the MLSA is not necessary for IMLS to receive funding: the last MLSA expired in 2016. Rather, S. 2271 would authorize IMLS to continue to exist and give direction about how the agency should operate. Passage of this reauthorization bill would signal that Congress values libraries and supports the mission of IMLS. As ALA President Jim Neal expressed it,

“Today’s introduction of the bipartisan MLSA reauthorization is the first critical step toward ensuring federal support for our nation’s nearly 120,000 libraries. LSTA grants enable libraries in every state to innovate and meet the growing demand for services that meet the needs of our communities.”

The 2017 MLSA continues to support the stated mission of IMLS to inspire libraries to “advance innovation, lifelong learning, and cultural and civic engagement.” It largely mirrors the previous authorization, with some improvements. After considerable input from library professionals across the country, ALA’s Washington Office worked closely with the bill’s lead cosponsors to include numerous recommendations in the legislation such as:

  • explicit allowance for grant funds to be used to help libraries prepare for and provide services after a disaster or emergency;
  • greater use of data-driven tools to measure the impact and maximize the effectiveness of library services; and
  • additional provisions to enable more Native American tribes to participate in IMLS grant programs.

Today’s introduction of the MLSA gives a clear and timely opportunity for each one of our elected federal leaders to show unequivocally their support for libraries.

ALA’s Washington Office encourages you to use the action center to contact your senators and ask them to cosponsor S. 2271. In your emails and calls to senators, tell them how LSTA funds enable your library to offer valuable services to your community. Invite them to visit your library to see for themselves the difference you are making in people’s lives. Ultimately, it is your story and your voice that will persuade your elected leaders to show their support for libraries and cosponsor the MLSA of 2017.

* ALA members have defended funding for IMLS at every turn throughout the appropriations process in 2017, beginning with the administration’s March budget recommendation to effectively eliminate IMLS. That proposal was rejected by House and Senate Appropriators, with both chambers recommending robust funding for IMLS (although final funding bills have not passed Congress). We will aggressively continue our advocacy to fund libraries in the new year. In the meantime, our strategy is to gain cosponsors for MLSA in the Senate and work with representatives to introduce companion legislation in the House.

The post Senators introduce bipartisan Museum and Library Services Act of 2017 appeared first on District Dispatch.

