planet code4lib

Planet Code4Lib -
Updated: 11 hours 16 min ago

LITA: Jobs in Information Technology: August 9, 2017

Wed, 2017-08-09 19:36

New vacancy listings are posted weekly on Wednesday at approximately 12 noon Central Time. They appear under New This Week and under the appropriate regional listing. Postings remain on the LITA Job Site for a minimum of four weeks.

New This Week

Oregon State University Libraries and Press, Library Technician 3, Corvallis, OR

New York University Division of Libraries, Supervisor, Metadata Production & Management, New York, NY

Visit the LITA Job Site for more available jobs and for information on submitting a job posting.

Lucidworks: Customizing Ranking Models in Solr to Improve Relevance for Enterprise Search

Wed, 2017-08-09 17:38

As we count down to the annual Lucene/Solr Revolution conference in Las Vegas next month, we’re highlighting talks and sessions from past conferences. Today, we’re highlighting Salesforce’s Ammar Haris & Joe Zeimen’s talk, “Customizing Ranking Models in Solr to Improve Relevance for Enterprise Search”.

Solr provides a suite of built-in capabilities that offer a wide variety of relevance-related parameter tuning. Index and/or query time boosts along with function queries can provide a great way to tweak various relevance-related parameters to help improve the search results ranking. In the enterprise space, however, given the diversity of customers and documents, there is a much greater need to have more control over the ranking models and to be able to run multiple custom ranking models.

This talk discusses the motivation behind creating an L2 ranker and the use of Solr Search Component for running different types of ranking models at Salesforce.
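As a rough illustration of what a second-level (L2) ranker does in general, the sketch below re-scores only the top hits returned by a first-pass query with a pluggable model. This is not the Salesforce implementation and it does not use any Solr API; the names `rerank_top_n` and `recency_model`, the feature names, and the weights are all hypothetical.

```python
from typing import Callable, Dict, List

Doc = Dict[str, float]  # hypothetical: a hit is a dict of numeric features


def rerank_top_n(
    first_pass: List[Doc],
    model: Callable[[Doc], float],
    top_n: int = 100,
) -> List[Doc]:
    """Re-score only the first top_n hits with a second-level model.

    first_pass is assumed to be sorted by the engine's own (L1) score.
    Hits beyond top_n keep their original order below the re-ranked window.
    """
    window, rest = first_pass[:top_n], first_pass[top_n:]
    reranked = sorted(window, key=model, reverse=True)
    return reranked + rest


def recency_model(doc: Doc) -> float:
    # Hypothetical model: blend the engine score with a freshness feature.
    return 0.7 * doc["l1_score"] + 0.3 * doc["freshness"]


# Made-up hits, already ordered by the first-pass score.
hits = [
    {"id": 1, "l1_score": 0.9, "freshness": 0.1},
    {"id": 2, "l1_score": 0.8, "freshness": 0.9},
    {"id": 3, "l1_score": 0.7, "freshness": 0.2},
]

reranked_ids = [d["id"] for d in rerank_top_n(hits, recency_model, top_n=2)]
```

Because the model is passed in as a function, a different model could be chosen per customer or per query type, which is one plausible way to "run multiple custom ranking models" over the same first-pass results.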

Join us at Lucene/Solr Revolution 2017, the biggest open source conference dedicated to Apache Lucene/Solr on September 12-15, 2017 in Las Vegas, Nevada. Come meet and network with the thought leaders building and deploying Lucene/Solr open source search technology. Full details and registration…

The post Customizing Ranking Models in Solr to Improve Relevance for Enterprise Search appeared first on Lucidworks.

HangingTogether: The Transformation of Academic Library Collecting

Wed, 2017-08-09 15:12

The Transformation of Academic Library Collecting

In October 2016, I was privileged to attend a seminal event, The Transformation of Academic Library Collecting: A Symposium Inspired by Dan C. Hazen, along with colleagues Lorcan Dempsey and Constance Malpas who were speaking. This occasion brought together a group of eminent library leaders, research collections specialists and scholars at Norton’s Woods Conference Center in Cambridge, MA, to commemorate the career of Dan Hazen (1947–2015) and reflect upon the transformation of academic library collections. Hazen was a towering figure in the world of research collections management and was personally known to many attendees; his impact on the profession of academic librarianship and the shape of research collections is widely recognized and continues to shape practice and policy in major research libraries.

Sarah Thomas (Vice President for the Harvard Library and University Librarian & Roy E. Larsen Librarian for the Faculty of Arts and Sciences) and other colleagues had done a remarkable job not only selecting speakers but designing an event that allowed for discussion and reflection. We felt that the event needed to be documented in some way, and were pleased that Sarah endorsed this idea. The resulting publication, The Transformation of Academic Library Collecting: A Synthesis of the Harvard Library’s Hazen Memorial Symposium, is now freely available from our website.

Drawing from presentations and audience discussions at the symposium, this publication examines some central themes important to a broader conversation about the future of academic library collections, in particular, collective collections and the reimagination of what have traditionally been called “special” and archival collections (now referred to as unique and distinctive collections). The publication also includes a foreword about Dan Hazen and his work by Sarah Thomas.

The Transformation of Academic Library Collecting: A Synthesis of the Harvard Library’s Hazen Memorial Symposium is not only a tribute to Hazen’s impact on the academic library community, but also a primer on where academic library collections could be headed in the future. We hope you will read, share, and use this as a basis for continuing an important conversation.

FOSS4Lib Upcoming Events: VIVO Camp 2017, Duke Univ

Wed, 2017-08-09 15:02
Date: Thursday, November 9, 2017 - 08:30 to Saturday, November 11, 2017 - 12:00
Supports: Vivo

Last updated August 9, 2017. Created by Peter Murray on August 9, 2017.

VIVO Camp registration information

District Dispatch: IMLS Leadership grants & Laura Bush grants available

Wed, 2017-08-09 15:00

The Institute of Museum and Library Services (IMLS) recently announced the availability of two grant opportunities for libraries through the National Leadership Grants for Libraries (NLG) and the Laura Bush 21st Century Librarian (LB21) programs. The deadline to submit grant proposals is September 1, 2017, and awards will be announced in January 2018. NLG and LB21 programs are funded through the Library Services and Technology Act (LSTA) administered by IMLS.

Libraries are encouraged to apply for these funding opportunities. An increase in applications for these programs would send a signal to Congressional appropriators, and the Administration, that these grants are needed in communities across the country. Earlier this year, the President proposed eliminating both grant programs for FY2018, cutting $13.4 million for NLG and $10.0 million for LB21. The House Appropriations Committee rejected the President’s request and in July provided funding for both programs at their FY2017 levels. The full House is expected to vote on the funding bill that includes these programs in September, as is the key Senate Subcommittee and Committee with jurisdiction over both.

The NLG program invests in projects that address challenges and opportunities faced by libraries. Work funded often produces creative and valuable new tools, research findings and models that can be widely used and have national impact. The LB21 program supports “human capital projects” for libraries and librarians. It is intended to help produce a diverse workforce of librarians to better meet the changing learning and information needs of the American public.

IMLS has announced that the next round of NLG and LB21 grants will support three kinds of projects:

  • Community Anchors – projects that advance the role of libraries (and library professionals) as community anchors that foster community partnerships to encourage civic and cultural engagement, community dialogue, lifelong learning, promote digital inclusion and support local economies;
  • National Digital Platform – projects or professionals that create, develop, and expand digital content and services in communities; and
  • Curating Collections – projects or professionals that further preservation and the management of digital library collections.

For more information about the grant guidelines, as well as examples of previously awarded grants, visit IMLS’ NLG or the LB21 pages. IMLS also has posted informational webinars to answer potential applicants’ questions.

Grant requests will be peer-reviewed and must be submitted online, with all required documents, by September 1, 2017. In FY2017, approximately 25% of grant requests were funded. The next grant cycle for NLG and LB21 will be announced in December.

The post IMLS Leadership grants & Laura Bush grants available appeared first on District Dispatch.

Eric Lease Morgan: Stories: Interesting projects I worked on this past year

Wed, 2017-08-09 14:59

This is a short list of “stories” outlining some of the more interesting projects I worked on this past year:

  • Ask Putin – A faculty member from the College of Arts & Letters acquired the 950-page Cyrillic transcript of a television show called “Ask Putin”. The faculty member had marked up the transcription by hand in order to analyze the themes conveyed therein. They then visited the Center for Digital Scholarship, and we implemented a database version of the corpus. By counting & tabulating the roots of each of the words for each of the sixteen years of the show, we were able to quickly & easily confirm many of the observations they had generated by hand. Moreover, the faculty member was able to explore additional themes which they had not previously coded.
  • Who’s related to whom – A visiting scholar from the Kroc Center asked the Center for Digital Scholarship to extract all of the “named entities” (names, places, & things) from a set of Spanish language newspaper articles. Based on the strength of the relationships between the entities, the scholar wanted a visualization to be created illustrating who was related to whom in the corpus. When we asked more about the articles and their content, we learned we had been asked to map the Colombian drug cartel. While incomplete, the framework of this effort will possibly be used by a South American government.
  • Counting 250,000,000 words – Working with Northwestern University, and Washington University in St. Louis, the Center for Digital Scholarship is improving access & services against the set of literature called “Early English Books”. This corpus spans 1460 to 1699 and is very representative of English literature of that time. We have been creating more accurate transcriptions of the texts, digitizing original items, and implementing ways to do “scalable reading” against the whole. After all, it is difficult to read 60,000 books. Through this process each & every word from the transcriptions has been saved in a database for future analysis. To date the database includes a quarter of a billion (250,000,000) rows. See:
  • Convocate – In conjunction with the Center for Civil and Human Rights, the Hesburgh Libraries created an online tool for comparing & contrasting human rights policy written by the Vatican and various non-governmental agencies. As a part of this project, the Center for Digital Scholarship wrote an application that read each & every paragraph from the thousands of pages of text. The application then classified each & every paragraph with one or more keyword terms for the purposes of more accurate & thorough discovery across the corpus. The results of this application enable the researcher to find items of similar interest even if they employ sets of dispersed terminology. For more detail, see:
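The "counting & tabulating the roots of each of the words" per year described in the Ask Putin story can be sketched roughly as follows. This is only an illustration, not the code used at the Center for Digital Scholarship: `crude_root` is a hypothetical stand-in that strips a few English suffixes, whereas a real project on a Cyrillic transcript would need a proper Russian stemmer or lemmatizer, and the sample records are invented.

```python
import re
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def crude_root(word: str) -> str:
    """Stand-in for a real stemmer: lowercase and strip a few English suffixes."""
    word = word.lower()
    for suffix in ("ing", "ed", "es", "s"):
        # Only strip when at least three letters remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def tabulate_roots(records: Iterable[Tuple[int, str]]) -> Dict[int, Counter]:
    """Count word roots per year from (year, text) pairs."""
    counts: Dict[int, Counter] = defaultdict(Counter)
    for year, text in records:
        # [^\W\d_]+ matches runs of letters in any script, including Cyrillic.
        for token in re.findall(r"[^\W\d_]+", text):
            counts[year][crude_root(token)] += 1
    return counts


# Invented sample records, one tuple per transcript passage.
sample = [
    (2001, "Pensions and pension reform"),
    (2001, "Asked about pensions"),
    (2002, "asking"),
]
by_year = tabulate_roots(sample)
```

With the counts keyed by year, comparing how often a root appears across the sixteen years of the show becomes a simple lookup such as `by_year[2001]["pension"]`.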

DuraSpace News: Announcing VIVO Camp

Wed, 2017-08-09 00:00

VIVO Camp is a multi-day training event designed specifically for new and prospective users. Camp will be held November 9-11, 2017 on the campus of Duke University in Durham, NC. Over two and a half days, VIVO Camp will start with an introduction to VIVO leading to a comprehensive overview by exploring these topics:

  • VIVO features

  • Examples and demos of VIVO including customizations

  • Representing scholarship

  • Loading, displaying and using VIVO data

  • Introduction to the ontologies

Code4Lib: Code4Lib 2018

Tue, 2017-08-08 22:07
Topic: conferences

Code4Lib 2018 will be held from February 13-16, 2018, in Washington, D.C.

More information is available on the conference website at:

Karen Coyle: On reading Library Journal, September, 1877

Tue, 2017-08-08 19:54
Among the many advantages of retirement is the particular one of idle time. And I will say that as a librarian one could do no better than to spend some of that time communing with the history of the profession. The difficulty is that it is so rich, so familiar in many ways that it is hard to move through it quickly. Here is just a fraction of the potential value to be found in the September issue of volume two of Library Journal.* Admittedly this is a particularly interesting number because it reports on the second meeting of the American Library Association.

For any student of library history it is especially interesting to encounter certain names as living, working members of the profession.

Other names reflect works that continued on, some until today, such as Poole and Bowker, both names associated with long-running periodical indexes.

What is particularly striking, though, is how many of the topics of today were already being discussed then, although obviously in a different context. The association was formed, at least in part, to help librarianship achieve the status of a profession. Discussed were the educating of the public on the role of libraries and librarians as well as providing education so that there could be a group of professionals to take the jobs that needed that professional knowledge. There was work to be done to convince state legislatures to support state and local libraries.

One of the first acts of the American Library Association when it was founded in 1876 (as reported in the first issue of Library Journal) was to create a Committee on Cooperation. This is the seed for today's cooperative cataloging efforts as well as other forms of sharing among libraries. In 1877, undoubtedly encouraged by the participation of some members of the publishing community in ALA, there was hope that libraries and publishers would work together to create catalog entries for in-print works.
This is one hope of the early participants that we are still working on, especially the desire that such catalog copy would be "uniform." Note that there were also discussions about having librarians contribute to the periodical indexes of R. R. Bowker and Poole, so the cooperation would flow in both directions.

The physical organization of libraries also was of interest, and a detailed plan for a round (actually octagonal) library design was presented:
His conclusion, however, shows a difference in our concepts of user privacy.
Especially interesting to me are the discussions of library technology. I was unaware of some of the emerging technologies for reproduction such as the papyrograph and the electric pen. In 1877, the big question, though, was whether to employ the new (but as yet un-perfected) technology of the typewriter in library practice.

There was some pooh-poohing of this new technology, but some members felt it may be reaching a state of usefulness.

"The President" in this case is Justin Winsor, Superintendent of the Boston Library, then president of the American Library Association. Substituting more modern technologies, I suspect we have all taken part in this discussion during our careers.

Reading through the Journal evokes a strong sense of "le plus ça change..." but I admit that I find it all rather reassuring. The historical beginnings give me a sense of why we are who we are today, and what factors are behind some of our embedded thinking on topics.

* Many of the early volumes are available from HathiTrust, if you have access. Although the texts themselves are public domain, these are Google-digitized books and are not available without a login. (Don't get me started!) If you do not have access to those, most of the volumes are available through the Internet Archive. Select "text" and search on "library journal". As someone without HathiTrust institutional access I have found most numbers in the range 1-39, but am missing (hint, hint): 5/1880; 8-9/1887-88; 17/1892; 19/1894; 28-30/1903-1905; 34-37/1909-1912. If I can complete the run I think it would be good to create a compressed archive of the whole and make that available via the Internet Archive to save others the time of acquiring them one at a time. If I can find the remainder that are pre-1923 I will add those in.

Open Knowledge Foundation: New research: Understanding the drivers of license proliferation

Tue, 2017-08-08 15:51

Open licensing is still a major challenge for open data publication. In a recent blog post on the state of open licensing in 2017 Open Knowledge International identified that governments often decide to create custom licenses instead of using standard open licenses such as Creative Commons Attribution 4.0.

This so-called license proliferation is problematic for a variety of reasons. Custom licenses necessitate that data users know all legal arrangements of these licenses – a problem that standard licenses are intended to avoid by clearly and easily stating use rights. Custom licenses can also exacerbate legal compatibility issues across licenses, which makes it hard (or impossible) to combine and distribute data coming from different sources. Because of legal uncertainties and compatibility issues, license proliferation can have chilling effects on the reuse of data and in the worst case prevent data reuse entirely.

When investigating this topic further we noticed a dearth of knowledge about the drivers of license proliferation: neither academia nor grey literature seems to give systematic answers, but there are some great first analyses, as well as explanations of why license proliferation is bad. Why do governments create custom licenses? Who within government decides that standard licenses are not the best solution to make data and content legally open? How do governments organise the licensing process and how can license recommendations be applied across government agencies?


Exploring the drivers of license proliferation

In order to address these questions Open Knowledge International started a research project into license proliferation. Using the findings of the Global Open Data Index (GODI) 2016/17 as a starting point, we first mapped out how many different licenses are used in a selection of 20 countries. This includes the following countries, which either rank high in GODI or have an active Open Knowledge community:

Taiwan, Australia, Great Britain, France, Finland, Canada, Norway, New Zealand, Brazil, Denmark, Colombia, Mexico, Japan, Argentina, Belgium, Germany, Netherlands, Greece, Nepal, Singapore.

Now we want to explore how governments decide what kind of license to use for data publication. We intend to publish the results in a narrative report to help the open data community understand the licensing process better, to inform license stewardship, and to advocate for the use of standard licenses.


Get in touch!

We are planning to run interviews with government officials who are involved in licensing. Please don’t hesitate to get in touch with us by email. Feedback from government officials working on licensing is much appreciated. Also do reach out if you have background knowledge about the licensing situation in the countries listed above, or if you have contacts in government.

We hope to hear from you soon!  

David Rosenthal: Approaching The Physical Limits

Tue, 2017-08-08 15:00
As storage media technology gets closer and closer to the physical limits, progress on reducing the $/GB number slows down. Below the fold, a recap of some of these issues for both disk and flash.

The current examples of this are Heat Assisted Magnetic Recording (HAMR) and its planned successor Bit Patterned Media (BPM). As I wrote last December:

Seagate 2008 roadmap
Here is a Seagate roadmap slide from 2008 predicting that the then (and still) current technology, perpendicular magnetic recording (PMR), would be replaced in 2009 by heat-assisted magnetic recording (HAMR), which would in turn be replaced in 2013 by bit-patterned media (BPM).

ASTC 2016 roadmap
Here is a recent roadmap from ASTC showing HAMR starting in 2017 and BPM in 2021. So in 8 years HAMR has gone from next year to next year, and BPM has gone from 5 years out to 5 years out. The reason for this real-time schedule slip is that as technologies get closer and closer to the physical limits, the difficulty and above all cost of getting from lab demonstration to shipping in volume increases exponentially.

HAMR is still slipping in real time. About the same time I was writing, Seagate was telling the trade press that:

It is targeting 2018 for HAMR drive deliveries, with a 16TB 3.5-inch drive planned, featuring 8 platters and 16 heads.

It is tempting to imagine that this slippage gives flash the opportunity to kill off hard disk. As I, among others such as Google's Eric Brewer, and IBM's Robert Fontana have pointed out, this scenario is economically implausible:

NAND vs. HDD capex/TB
The argument is that flash, despite its many advantages, is and will remain too expensive for the bulk storage layer. The graph of the ratio of capital expenditure per TB of flash and hard disk shows that each exabyte of flash contains about 50 times as much capital as an exabyte of disk. Fontana estimates that last year flash shipped 83EB and hard disk shipped 565EB. For flash to displace hard disk immediately would need 32 new state-of-the-art fabs at around $9B each or nearly $300B in total investment.

But there's also a technological reason why the scenario is implausible. Flash already hit one physical limit:

when cell lithography reached the 15-16nm area ... NAND cells smaller than that weren’t reliable data stores; there were too few electrons to provide a stable and recognisable charge level.

Despite this, flash is currently reducing in cost quite fast, thanks to two technological changes:
  • 3D, which stacks up to 96 layers of cells on top of each other.
  • Quad-Level Cell (QLC), which uses 16 voltage levels per cell to store 4 bits per cell.
Going from 2D to 3D is a one-time gain because, unfortunately, there are unsolved technical problems in reducing cost further by going from 3D to 4D. QLC requires more electrons per cell, so requires bigger cells to hold them. Reducing cost again by going to 32 voltage levels would need bigger cells again, so won't be easy or cost-effective. Thus the current rate of $/GB decrease is unlikely to be sustained.
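The arithmetic behind these claims is easy to check. The sketch below uses only the figures quoted above (83EB and 565EB shipped, 32 fabs at around $9B each, 16 and 32 voltage levels); the function names are my own, and the calculation is a back-of-the-envelope illustration rather than anything from Fontana's or Brewer's analyses.

```python
def bits_per_cell(voltage_levels: int) -> int:
    """Bits stored by a cell that must distinguish this many voltage levels."""
    if voltage_levels < 2 or voltage_levels & (voltage_levels - 1):
        raise ValueError("voltage_levels must be a power of two, at least 2")
    return voltage_levels.bit_length() - 1


def fab_investment_billion(fabs: int, cost_per_fab_billion: float) -> float:
    """Total capital needed for the given number of fabs, in $B."""
    return fabs * cost_per_fab_billion


# Doubling the levels from 16 (QLC) to 32 adds only one bit per cell.
qlc_bits = bits_per_cell(16)              # 4 bits per cell
next_bits = bits_per_cell(32)             # 5 bits per cell
capacity_gain = next_bits / qlc_bits - 1  # 0.25, i.e. 25% more bits per cell

# Flash's share of the bytes shipped last year, from the quoted estimates.
flash_eb, disk_eb = 83, 565
flash_share = flash_eb / (flash_eb + disk_eb)  # roughly 0.13

# 32 fabs at about $9B each: the "nearly $300B" in the text.
total_capex = fab_investment_billion(32, 9)    # 288
```

The first result is the important one: each extra bit per cell requires twice as many distinguishable voltage levels, so the capacity gain per step shrinks (from 4 to 5 bits is 25%) while the cell has to hold more electrons.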

At The Register, Chris Mellor has a clear, simple overview of the prospect for flash technology entitled Flash fryers have burger problems: You can't keep adding layers:

The flash foundry folk took on 3D NAND because it provided an escape hatch from the NAND scaling trap of ever-decreasing cell sizes eventually to non-functioning flash.

But 3D NAND, the layering of many 2D planar NAND chip structures, will run into its own problems.

The piece is quite short and easy to understand; it is well worth a read.

John Miedema: Finger-Free Options for Taking a Note

Tue, 2017-08-08 01:30

The origin of the word, digital, is late 15th century, from Latin digitalis, finger or toe. Digital technology depends on our fingers but sometimes I want to perform tasks finger-free. For example, I want to speak a note, convert it to text, and send it to my Evernote inbox for later follow-up. This is handy when my fingers are already too busy on other tasks. It is also useful when I drive alone, since I don’t want to text and drive. There are some “post-digital” options:


1. OK Google function on my Android phone. I speak a note into my phone, “OK Google,” “Take Note,” “Lorem Ipsum.” The voice note is converted to text and sent to my Evernote inbox. Google instructions, Evernote instructions. OK Google is helpful but not when driving. OK Google will not respond until I unlock my phone, which requires my fingers. Even if I turn off device security for the trip I have to use my finger on the power button to wake up the device. I don’t want to touch my device. Period.

2. Amazon Alexa and IFTTT. The Amazon Echo Dot’s Alexa app is always listening for voice commands. No finger action is required to unlock or wake up the device. IFTTT has an applet, Add your Alexa To-Dos to Evernote. As long as I am in voice range of the Echo Dot I say, “Alexa To Do.” Alexa asks, “What can I add for you?” I say, “Lorem Ipsum.” The voice note is converted to text and sent to my Evernote inbox. The Amazon Echo Dot costs $50 USD but thumbs up for working indoors. The limitation is device portability. It is possible to take the Echo Dot in the car, but it requires a phone’s internet connection and a power source. It gets complicated.

3. Android Watch. Raise the watch up to get the voice prompt without a finger. Install Evernote for Android Wear and you are good to go. It appears to be the best option, but I do not own an Android Watch because I am too cheap to shell out hundreds of dollars.

Update. On further experimentation I have observed a real problem with OK Google and Alexa. I begin a note, “OK Google Take Note” or “Alexa To Do.” I begin the note, “First … remember to ….” The note gets saved as “First” after the initial pause. Um. I need to find a way to save a longer note that gets expressed with pauses. I have not tested Android Watch but since it is a Google technology it probably has the same limitation.

Lucidworks: The Path to Universal Search at Allstate

Mon, 2017-08-07 22:32

As we count down to the annual Lucene/Solr Revolution conference in Las Vegas next month, we’re highlighting talks and sessions from past conferences. Today, we’re highlighting Allstate Insurance Company’s Nery Encarnacion and Sean Rasmussen’s talk, “The Path to Universal Search”.

Too many search boxes? Can’t remember which one to use? You know the document you are looking for exists, but you just can’t find it. This talk provides an understanding of how Lucidworks Fusion helped Allstate traverse disparate data sources and consolidate scattered search boxes to create a better user experience.

Join us at Lucene/Solr Revolution 2017, the biggest open source conference dedicated to Apache Lucene/Solr on September 12-15, 2017 in Las Vegas, Nevada. Come meet and network with the thought leaders building and deploying Lucene/Solr open source search technology. Full details and registration…

The post The Path to Universal Search at Allstate appeared first on Lucidworks.

LITA: Call for Proposals, Deadline Extended, LITA @ ALA Annual 2018

Mon, 2017-08-07 18:21

The program submission deadline has been extended to:

Tuesday September 5, 2017

Submit Your Program ideas for the 2018 ALA Annual Conference 

New Orleans LA, June 21-26, 2018

The LITA Program Planning Committee (PPC) is now encouraging the submission of innovative and creative proposals for the 2018 Annual American Library Association Conference. We’re looking for 60-minute conference presentations. The focus should be on technology in libraries, whether that’s use of, new ideas for, trends in, or interesting/innovative projects being explored – it’s all for you to propose. Programs should be of interest to all library/information agency types, inspire technological change and adoption, and/or generally go above and beyond the everyday.

  • Submission Deadline: September 5, 2017
  • Final Decisions: September 29, 2017
  • Schedule of Sessions Announced: November 8, 2017

For the first time, proposals will be accepted via one submission site for all ALA Divisions, Round Tables, Committees and Offices. This link to the submission site will redirect to the ALA log-in page. All submitters are required to have an ALA profile, but are not required to be ALA members.

Help and details on making a successful submission are on the LITA Forms web site.

We regularly receive many more proposals than we can program into the slots available to LITA at the ALA Annual Conference. These great ideas and programs all come from contributions like yours. Submissions are open to anyone, regardless of ALA membership status. We welcome proposals from anyone who feels they have something to offer regarding library technology. We look forward to hearing the great ideas you will share with us this year.

Questions or Comments?

Contact LITA at (312) 280-4268 or Mark Beatty,

LITA: Lost Art of Creativity

Mon, 2017-08-07 15:55

The Lost Art series examines tech tools that encourage communication between libraries and their users. The Lost Art of Conversation looked at ways that podcasts can connect with the community, as well as the technology required to create a professional podcast.

This month is all about the 3-D printer, a tool that creates three dimensional objects based on a digital design.  A brief history of this technology: the first patent was issued in the 1980s and today these printers can create anything from a kidney to a car.

A 2016 Pew Research Study found that 50% of those polled think 3-D printers are a good investment for libraries (up 5% from 2015) and this number goes up when people are broken out by race: “69% of blacks and 63% of Hispanics say libraries should definitely buy 3-D printers and other high-tech tools, compared with 44% of whites.”

Some people might wonder why libraries should invest in such an expensive technology. There are both symbolic and practical reasons for the investment. Open access is a tenet of librarianship, ever since the beginning of U.S. public libraries when books were difficult to get if you were not wealthy or a member of the church. The current Digital Divide is real and libraries continue to provide access to technology that people otherwise couldn’t afford. The 3-D printer is just another example of leveling the playing field. Using this tool is not just for show; there are many practical applications.

One of the hands created at the Mastics-Moriches-Shirley Community Library in Suffolk County, NY

Earlier this month, a library in Suffolk County, New York printed 15 prosthetic hands to donate to disabled children around the world. Custom hands cost as much as $10,000, while the library is able to create a hand using $48 in materials. Makerspaces, gathering places for users to share ideas and tools, offer 3-D printing along with Legos, sewing machines, and other tools to encourage creativity. Some additional uses can be found in a 2016 article published in School Library Journal: “My Love/Hate Relationship with 3-D Printers in Libraries.”

ALA published a practical guide, Progress in the Making, for those new to the 3-D printing world or those considering the purchase.  Below are some highlights:

  1. Cost- range from $200-$2,000 but there is no need to spend over $1,500. Not sure which printer is best for your library? Check with colleagues or product reviews, like this one from LibraryJournal.
  2. Supplies- ALA recommends having 2-3 rolls of material in stock, these cost around $25.
  3. Space- most printers are the size of a desktop computer, so allocate the same desk size as a computer plus storage for supplies and prototypes!
  4. Software- ALA recommends Tinkercad, a free computer-aided design (CAD) software that runs in a browser. Some printers, like the LulzBot Mini, offer free, open-source software.
  5. Time- this depends on what is being created. A Rubik’s Cube will take around 5 hours to complete, whereas a larger or more intricate design will take longer.
  6. Issues- many printers have warranties and customer service reps that can help troubleshoot by phone or email.

How is your library using 3-D printers? Any creative or interesting designs to share?

HangingTogether: A research and learning agenda for special collections and archives

Mon, 2017-08-07 15:38

As we previously shared with you, we have been hard at work developing a research and learning agenda for special collections and archives. Here’s what has happened since Chela’s last post in May…

Chela continued conversations with the advisory group and also with many others in the field. The goal has been to develop a practitioner-based view of challenges and opportunities — and to include many voices in that process.

Workshop in Iowa City

We held a workshop at the RBMS Conference in June with an invited group of special collections and other library leaders to help refine an early draft of our agenda. That group was very generous with their time and helped improve the agenda considerably. Thanks to the Iowa City Public Library for being generous hosts!

Following the useful input and critique we gathered in Iowa City, we revised the document and released it for broader comment. We also held a larger workshop focused on the current draft in July at Archives 2017, the annual meeting of the Society of American Archivists.

Workshop at Archives 2017

In developing this agenda, which we see as important not only for OCLC Research but also for other organizations and stakeholders, we are taking a transparent, iterative approach. We are seeking substantial input from the OCLC Research Library Partnership, as well as the broader archives and special collections community.

We are inviting you today to play a role in the next steps of shaping the agenda, and asking for your feedback on the current draft of the agenda by August 28th. We are happy to hear thoughts on any element of the draft agenda, but in particular, are interested in hearing comments on the following questions:

  1. Proposed Research Activities: do you have ideas for activities in areas that are left blank in the current draft? Are there other research activities or questions you would like to see addressed within each of the outlined topical areas of investigation?
  2. Relevant Existing Work in the Community: Is there current or early-stage work going on that addresses any of the topical areas of investigation and that we should be aware of?
  3. Priorities for OCLC: OCLC Research will be able to address only a small portion of the issues and activities outlined in the agenda, and wants to put its resources and expertise to best use. Which of the topical areas of investigation and proposed research activities would you most like to see OCLC take on, and where do you think they can make most impact?

Please find the draft agenda either as a Google Doc or as a PDF. You are welcome to add comments in the Google Doc itself, or submit comments via email. We welcome feedback and comments through August 28th.

District Dispatch: A visit to Senator Tester’s field office

Mon, 2017-08-07 15:30

I moved to Montana three years ago when I accepted a position as director of Montana State University’s School Library Media preparation program. Like any good librarian, the very first thing I did when I moved to Bozeman was obtain my library card. And like any good library advocate, the second thing I did was learn about Montana politics. Montana is an interesting place. It’s incredibly rural (our largest city is Billings, population 110,000). Just over one million people live in the Treasure State, and it takes about ten hours to travel across the state east to west. Accordingly, Montana is represented by our two Senators, Steve Daines and Jon Tester, and one at-large Representative, Greg Gianforte.

Senator Tester
Source: Thom Bridge

Senator Tester is the only working farmer in Congress. He lives in Big Sandy, population 598, where he produces organic wheat, barley, lentils, peas, millet, buckwheat and alfalfa. He butchers his own meat and brings it to Washington in an extra carry-on bag. A former teacher and school board member, he is a staunch advocate for public education. I looked at his background and priorities and found that Senator Tester has a good track record of supporting some of ALA’s key issues, such as open access to government information and the Library Services and Technology Act.

I’ve participated in ALA’s National Library Legislative Day as part of the Montana delegation annually since 2015, so I was familiar with Senator Tester’s Washington, DC-based staff. This summer, with the Senate’s August recess looming, I saw another opportunity to connect with the Senator’s field staff. In Bozeman and the surrounding area, the Senator’s staff regularly schedules outreach and listening sessions in public libraries. On July 27, I attended one of these listening sessions at the Bozeman Public Library. I came prepared with a short list of items that I wanted to cover. Because there were about eight people in the listening session, I wasn’t able to get specific about my issues, so I scheduled a one-on-one appointment the following week with the field office staff in Downtown Bozeman.

I met with Jenna Rhoads, who is a new field officer and a recent graduate of MSU’s political science program. We chatted briefly about people we knew in common and I congratulated her on her new position and recent graduation. I then spoke about several issues, keeping it short, to the point, and being very specific about my “asks.” These issues included:

  1. Congratulating Senator Tester for receiving the Madison Award from the American Library Association and thanking him for his support of the Library Services and Technology Act by signing the Dear Appropriator letter for the FY18 appropriations cycle. I asked that next year, the Senator please consider signing the Dear Appropriator letter on the Innovative Approaches to Literacy program as well.
  2. Thanking the field office for holding listening sessions in local public libraries and encouraging this partnership to continue.
  3. Asking that Senator Tester use his position on the Interior Appropriations subcommittee to assure continued funding for the U.S. Geological Survey when the Interior Appropriations bill is voted on after Labor Day. I provided Jenna with a copy of ALA’s related letter and asked that she pass it along to the appropriate Washington staffer.
  4. Inviting the Senator to continue to work in the long term on school library issues, particularly in rural and tribal schools, which Senator Tester already cares deeply about.

The meeting lasted about 30 minutes. Later that day I followed up with a thank you email, reiterating my issues and “asks.”

As the Senate goes into its traditional August recess, this is a very good time to schedule a meeting with your senator’s field office staff in your local area and perhaps even meet with your senator. I hope that you will take the opportunity to engage with your senators and their field office staff to advocate for important library issues. There are many resources on District Dispatch, the ALA Washington Office blog, that can help you home in on the issues that are important to your senator. Additionally, the ALA Washington Office’s Office of Government Relations staff are always willing to help you craft your message and give you valuable information about where your senator stands on library issues so you can make your case in the most effective manner.

I chose to take the time to meet with my senator’s field office staff because I believe in the power of civic engagement – and because I know that libraries change lives. I hope that you will take some time to connect with your senator’s field office this August.

The post A visit to Senator Tester’s field office appeared first on District Dispatch.

Terry Reese: MarcEdit 7 Alpha: the XML/JSON Profiler

Sun, 2017-08-06 19:06

Metadata transformations can be really difficult.  While I try to make them easier in MarcEdit, the reality is, the program really has functioned for a long time as a facilitator of the process; handling the binary data processing and character set conversions that may be necessary.  But the heavy lifting, that’s all been on the user.  And if you think about it, there is a lot of expertise tied up in even the simplest transformation.  Say your library gets an XML file full of records from a vendor.  As a technical services librarian, I’d have to go through the following steps to remap that data into MARC (or something else):

  1. Evaluate the vended data file
  2. Create a metadata dictionary for the new XML file (so I know what each data element represents)
  3. Create a mapping between the data dictionary for the vended file and MARC
  4. Create the XSLT crosswalk that contains all the logic for turning this data into MARCXML
  5. Set up the process to move data between XML=>MARC
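To make steps 2 through 4 concrete, here is a minimal Python sketch of what such a crosswalk boils down to. It is not how MarcEdit does it and it is not XSLT; it only illustrates the idea of a data dictionary driving a vendor-XML-to-MARCXML mapping. The vendor element names (`title`, `author`), the sample data, and the `to_marcxml` helper are all invented for illustration, and the leader and control fields are left out.

```python
import xml.etree.ElementTree as ET

MARC_NS = "http://www.loc.gov/MARC21/slim"

# Hypothetical vendor sample; element names are invented for illustration.
VENDOR_XML = """<records>
  <record>
    <title>Example Title</title>
    <author>Example, Author</author>
  </record>
</records>"""

# Steps 2-3: the data dictionary and mapping, collapsed into one table.
# vendor element -> (MARC tag, indicator 1, indicator 2, subfield code)
MAPPING = {
    "title": ("245", "1", "0", "a"),
    "author": ("100", "1", " ", "a"),
}


def to_marcxml(vendor_xml, mapping=MAPPING, record_tag="record"):
    """Step 4: turn each vendor record into a MARCXML record (data fields only)."""
    ET.register_namespace("marc", MARC_NS)
    source = ET.fromstring(vendor_xml)
    collection = ET.Element(f"{{{MARC_NS}}}collection")
    for rec in source.iter(record_tag):
        marc_rec = ET.SubElement(collection, f"{{{MARC_NS}}}record")
        fields = []
        for child in rec:
            text = (child.text or "").strip()
            if child.tag in mapping and text:
                fields.append(mapping[child.tag] + (text,))
        # Emit fields in tag order; the sort is stable, so repeated
        # fields keep the order they had in the source.
        for tag, ind1, ind2, code, value in sorted(fields, key=lambda f: f[0]):
            datafield = ET.SubElement(
                marc_rec,
                f"{{{MARC_NS}}}datafield",
                {"tag": tag, "ind1": ind1, "ind2": ind2},
            )
            subfield = ET.SubElement(
                datafield, f"{{{MARC_NS}}}subfield", {"code": code}
            )
            subfield.text = value
    return ET.tostring(collection, encoding="unicode")


if __name__ == "__main__":
    print(to_marcxml(VENDOR_XML))
```

Even in this toy form, most of the effort sits in the `MAPPING` table, which is the part that needs knowledge of both the vendor data and MARC.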


All of these steps are really time consuming, but the development of the XSLT/XQuery to actually translate the data is the one that stops most people.  While there are many folks in the library technology space (and technical services spaces) that would argue that the ability to create XSLT is a vital job skill, let’s be honest, people are busy.  Additionally, there is a big difference between knowing how to create an XSLT and writing a metadata translation.  These things get really complicated, and change all the time (XSLT is up to version 3), meaning that even if you’ve learned how to do this years ago, the skills may be stale or not translate into the current XSLT version.

Additionally, in MarcEdit, I’ve tried really hard to make the XSLT process as simple and straightforward as possible.  But, the reality is, I’ve only been able to work on the edges of this goal.  The tool handles the transformation of binary and character encoding data (since the XSLT engines cannot do that), and it uses a smart processing algorithm to try to improve speed and memory handling while still enabling users to work with either DOM or SAX processing techniques.  And I’ve tried to introduce a paradigm that enables reuse and flexibility when creating transformations.  Folks that have heard me speak have likely heard me talk about this model as a wheel and spoke:

The idea behind this model is that as long as users create translations that map to and from MARCXML, the tool can automatically enable transformations to any of the known metadata formats registered with MarcEdit.  There are definitely tradeoffs to this approach (for sure, doing a 1-to-1, direct translation would produce the best translation, but it also requires more work and users to be experts in the source and final metadata formats), but the benefit from my perspective is that I don’t have to be the bottleneck in the process.  Were I to hard-code or create 1-to-1 conversions, any deviation or local use within a spec, would render the process unusable…and that was something that I really tried to avoid.  I’d like to think that this approach has been successful, and has enabled technical services folks to make better use of the marked up metadata that they are provided.
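The wheel-and-spoke idea can be sketched in a few lines of Python. This is only an illustration of the pattern, not MarcEdit's implementation: the `register` and `convert` functions, the format names, and the dictionary-shaped "records" are all hypothetical stand-ins. Each format supplies one converter to the hub (MARCXML) and one from it, and any format-to-format conversion is composed from those two.

```python
HUB = "marcxml"

# One converter into the hub and one out of it per registered format.
_to_hub = {}
_from_hub = {}


def register(fmt, to_hub, from_hub):
    """Register a spoke: how to reach the hub format and how to leave it."""
    _to_hub[fmt] = to_hub
    _from_hub[fmt] = from_hub


def convert(record, source, target):
    """Convert between any two registered formats by passing through the hub."""
    hub_record = record if source == HUB else _to_hub[source](record)
    return hub_record if target == HUB else _from_hub[target](hub_record)


# Toy spokes. The hub record here is a dict keyed by MARC tag + subfield,
# standing in for real MARCXML; the field names are invented.
register(
    "dc",
    to_hub=lambda r: {"245a": r["title"], "100a": r["creator"]},
    from_hub=lambda h: {"title": h["245a"], "creator": h["100a"]},
)
register(
    "vendor",
    to_hub=lambda r: {"245a": r["ItemName"], "100a": r["Writer"]},
    from_hub=lambda h: {"ItemName": h["245a"], "Writer": h["100a"]},
)

if __name__ == "__main__":
    vendor_record = {"ItemName": "Example Title", "Writer": "Example, Author"}
    print(convert(vendor_record, "vendor", "dc"))
```

The tradeoff described above shows up directly: nobody wrote a vendor-to-dc converter, yet the conversion works, at the price of everything being squeezed through whatever the hub can represent.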

The problem is that as content providers have moved more of their metadata operations online, a large number have shifted away from standards-based metadata to locally defined metadata profiles.  This is challenging because these are one-off formats that really are only applicable for a publisher’s particular customers.  As a result, it’s really hard to find conversions for these formats.  The result of this, for me, is large numbers of catalogers/MarcEdit users asking for help creating these one-off transformations…work that I simply don’t have time to do.  And that can surprise folks.  I try hard to make myself available to answer questions.  If you find yourself on the MarcEdit listserv, you’ll likely notice that I answer a lot of the questions…I enjoy working with the community.  And I’m pretty much always ready to give folks feedback and toss around ideas when folks are working on projects.  But there is only so much time in the day, and only so much that I can do when folks ask for this type of help.

So, transformations are an area where I get a lot of questions.  Users faced with these publisher specific metadata formats often reach out for advice or to see if I’ve worked with a vendor in the past.  And for years, I’ve been wanting to do more for this group.  While many metadata librarians would consider XSLT or XQuery as required skills, these are not always in high demand when faced with a mountain of content moving through an organization.  So, I’ve been collecting user stories and outlining a process that I think could help: an XML/JSON Profiler.

So, it’s with a lot of excitement that I can write that MarcEdit 7 will include this tool.  As I say, it’s been a long time coming, and the goal is to reduce the technical requirements needed to process XML or JSON metadata.

XML/JSON Profiler

To create this tool, I had to decide how users would define their data for mapping.  Given that MarcEdit has a Delimited Text Translator for converting Excel data to MARC, I decided to work from this model.  The code produced does a couple of things:

  1. It validates the XML format to be profiled.  Mostly, this means that the tool is making sure that schemas are followed, namespaces are defined and discoverable, etc.
  2. It outputs data in MARC, MARCXML, or another XML format.
  3. It shifts mapping of data from an XML file to a delimited text file (though it’s not actually creating a delimited text file).
  4. Since the data is in XML, there is a general assumption that data should be in UTF-8.
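One way to picture the third point, treating XML as if it were a delimited file, is the rough Python sketch below. It is an assumption about the general technique, not the profiler's actual code: the `flatten` and `profile` helpers and the sample element names are hypothetical. Each record becomes a row, and each distinct element path becomes a "column" that could then be mapped to a MARC field the same way a spreadsheet column would be. Namespaces and attributes are ignored here.

```python
import xml.etree.ElementTree as ET

# Hypothetical vendor sample; element names are invented for illustration.
SAMPLE = """<export>
  <item>
    <name>Example Title</name>
    <contributors><writer>Example, Author</writer></contributors>
  </item>
  <item>
    <name>Second Title</name>
    <isbn>0000000000</isbn>
  </item>
</export>"""


def flatten(element, prefix=""):
    """Yield (path, text) pairs for every non-empty leaf under `element`."""
    for child in element:
        path = f"{prefix}/{child.tag}" if prefix else child.tag
        if len(child):  # has children, so descend
            yield from flatten(child, path)
        elif child.text and child.text.strip():
            yield path, child.text.strip()


def profile(xml_text, record_tag):
    """Return (columns, rows): the distinct leaf paths and one dict per record."""
    root = ET.fromstring(xml_text)
    rows = []
    for rec in root.iter(record_tag):
        row = {}
        for path, text in flatten(rec):
            # Repeated elements are kept as a list under the same column.
            row.setdefault(path, []).append(text)
        rows.append(row)
    columns = sorted({path for row in rows for path in row})
    return columns, rows


if __name__ == "__main__":
    columns, rows = profile(SAMPLE, "item")
    print(columns)
    for row in rows:
        print(row)
```

Once the data is in this shape, mapping "contributors/writer" to a MARC field is the same kind of decision as mapping column C of a spreadsheet, which is the point of borrowing the delimited text model.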


Users can access the Wizard through the updated XML Functions Editor.  Users open MARC Tools and select Edit XML function list, and you see the following:

I highlighted the XML Function Wizard.  I may also make this tool available from the main window.  Once selected, the program walks users through a basic reference interview:

Page 1:


From here, users just need to follow the interview questions.  Users will need a sample XML file that contains at least one record in order to create the mappings against.  As users walk through the interview, they are asked to identify the record element in the XML file, as well as map XML tags to MARC tags, using the same interface and tools as found in the delimited text translator.  Users also have the option to map data directly to a new metadata format by creating an XML mapping file – or a representation of the XML output, which MarcEdit will then use to generate new records.

Once a new mapping has been created, the function will then be registered into MarcEdit, and be available like any other translation.  Whether this process simplifies the conversion of XML and JSON data for librarians, I don’t know.  But I’m super excited to find out.  This creates a significant shift in how users can interact with marked up metadata, and I think will remove many of the technical barriers that exist for users today…at least, for those users working with MarcEdit.

To give a better idea of what is actually happening, I created a demonstration video of the early version of this tool in action.  You can find it here:  This provides an early look at the functionality, and hopefully helps provide some context around the above discussion.  If you are interested in seeing how the process works, I’ve posted the code for the parser on my GitHub page here:

Do you have questions, concerns?  Let me know.



John Miedema: Evernote Random. Get a Daily Email to a Random Note.

Sun, 2017-08-06 15:25

I write in bits and pieces. I expect most writers do. I think of things at the oddest moments. I surf the web and find a document that fits into a writing project. I have an email dialog and know it belongs with my essay. It is almost never a good time to write so I file everything. Evernote is an excellent tool for aggregating all of the bits in notebooks. I have every intention of getting back to them. Unfortunately, once the content is filed, it usually stays buried and forgotten.

I need a way to keep my content alive. The solution is a daily email, a link to a random Evernote note. I can read the note to keep it fresh in memory. I can edit the note, even just one change to keep it growing.

I looked around for a service but could not find one. I did find an IFTTT recipe for emailing a daily link to a random Wikipedia page. IFTTT sends the daily link to a Wikipedia page that automatically generates a random entry. In the end, I had to build an Evernote page to do a similar thing.

You can set up Evernote Random too, but you need a few things:

  • An Evernote account, obviously.
  • A web host that supports PHP.
  • A bit of technical skill. I have already written the Evernote Random script that generates the random link. But you have to walk through some technical Evernote setup steps, like generating keys and testing your script in their sandbox.
  • The Evernote Random script from my GitHub Gist site. It has all the instructions.
  • An IFTTT recipe. That’s the easy part.
  • Take the script. Use it. Improve it. I would enjoy hearing from you.
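For readers who want the gist of the logic before setting anything up, here is a small Python sketch of the core idea: pick one note at random and build a link to it. This is not the PHP script mentioned above and it does not call the Evernote API. The `note_link` and `random_note_link` helpers are hypothetical, the list of note GUIDs is assumed to have been fetched already, and the link pattern (shard, user ID, note GUID) is my understanding of Evernote's note link format, so treat it as an assumption to check against their documentation.

```python
import random


def note_link(shard_id, user_id, note_guid, host="www.evernote.com"):
    """Build a web link to a note (assumed Evernote note-link pattern)."""
    return f"https://{host}/shard/{shard_id}/nl/{user_id}/{note_guid}/"


def random_note_link(note_guids, shard_id, user_id, rng=random):
    """Pick one note GUID at random and return a link to it."""
    guids = list(note_guids)
    if not guids:
        raise ValueError("no notes to choose from")
    return note_link(shard_id, user_id, rng.choice(guids))


if __name__ == "__main__":
    # Placeholder values; real GUIDs, shard, and user ID come from your account.
    guids = ["guid-a", "guid-b", "guid-c"]
    print(random_note_link(guids, shard_id="s1", user_id=42))
```

Hosted on a web server, a page that redirects to the returned link is all the daily IFTTT email needs to point at.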

Originally published at this website on April 1, 2015.