Planet Code4Lib

David Rosenthal: Abby Smith Rumsey's "When We Are No More"

Back in March I attended the launch of Abby Smith Rumsey's book When We Are No More. I finally found time to read it from cover to cover, and can recommend it. Below the fold are some notes.

There are four main areas where I have comments on Rumsey's text. On page 144, in the midst of a paragraph about the risks to our personal digital information she writes:
The documents on our hard disks will be indecipherable in a decade.
The word "indecipherable" implies not data loss but format obsolescence. As I've written many times, Jeff Rothenberg was correct to identify format obsolescence as a major problem for documents published before the advent of the Web in the mid-90s. But the Web caused documents to evolve from being the private property of a particular application to being published. On the Web, published documents don't know what application will render them, and are thus largely immune to format obsolescence.

It is true that we're currently facing a future in which most current browsers will not render preserved Flash, not because they don't know how to but because it isn't safe to do so. But the technological fix for this problem is already in place. Format obsolescence, were it to occur, would be hard for individuals to mitigate. Since it isn't likely to happen, it isn't helpful to lump it in with threats they can do something about, for example by keeping local copies of their cloud data.

On page 148 Rumsey discusses the problem of the scale of the preservation effort needed and the resulting cost:
We need to keep as much as we can as cheaply as possible. ... we will have to invent ways to essentially freeze-dry data, to store data at some inexpensive low level of curation, and at some unknown time in the future be able to restore it. ... Until such a long-term strategy is worked out, preservation experts focus on keeping digital files readable by migrating data to new hardware and software systems periodically. Even though this looks like a short-term strategy, it has been working well ... for three decades and more.
Yes, it has been working well and will continue to do so provided the low level of curation manages to find enough money to keep the bits safe. Emulation will ensure that if the bits survive we will be able to render them, and it does not impose significant curation costs along the way.

The aggressive (and therefore necessarily lossy) compression Rumsey envisages would reduce storage costs, and I've been warning for some time that Storage Will Be Much Less Free Than It Used To Be. But it is important not to lose sight of the fact that ingest, not storage, is the major cost in digital preservation. We can't keep it all; deciding what to keep and putting it someplace safe is the most expensive part of the process.

On page 163 Rumsey switches to ignoring the cost and assuming that, magically, storage supply will expand to meet the demand:
Our appetite for more and more data is like a child's appetite for chocolate milk: ... So rather than less, we are certain to collect more. The more we create, paradoxically, the less we can afford to lose.
Alas, we can't store everything we create now, and the situation isn't going to get better.

On page 166 Rumsey writes:
Other than the fact that preservation yields long-term rewards, and most technology funding goes to creating applications that yield short-term rewards, it is hard to see why there is so little investment, either public or private, in preserving data. The culprit is our myopic focus on short-term rewards, abetted by financial incentives that reward short-term thinking. Financial incentives are matters of public policy, and can be changed to encourage more investment in digital infrastructure.
I completely agree that the culprit is short-term thinking, but the idea that "incentives ... can be changed" is highly optimistic. The work of, among others, Andrew Haldane at the Bank of England shows that short-termism is a fundamental problem in our global society. Inadequate investment in infrastructure, both physical and digital, is just a symptom, and is far less of a problem than society's inability to curb carbon emissions.

Finally, some nits to pick. On page 7 Rumsey writes of the Square Kilometer Array:
up to one exabyte (10^18 bytes) of data per day
I've already had to debunk another "exabyte a day" claim. It may be true that the SKA generates an exabyte a day, but it could not store that much data. An exabyte a day is most of the world's production of storage. Like the Large Hadron Collider, which throws away all but one byte in a million before it is stored, the SKA actually stores only(!) a petabyte a day (according to Ian Emsley, who is responsible for planning its storage). A book about preserving information for the long term should be careful to maintain the distinction between the amounts of data generated and stored. Only the stored data is relevant.
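The generated-versus-stored distinction is easy to check with back-of-the-envelope arithmetic. This sketch only restates the figures quoted above (an exabyte generated, a petabyte stored); the annualized total is my own extrapolation for illustration:

```python
# Back-of-the-envelope check of the SKA figures quoted above.
EB = 10**18  # bytes in an exabyte
PB = 10**15  # bytes in a petabyte

generated_per_day = 1 * EB  # what the SKA reportedly generates daily
stored_per_day = 1 * PB     # what it actually plans to store daily

# Fraction of generated data that is kept: one part in a thousand.
kept_fraction = stored_per_day / generated_per_day
print(kept_fraction)  # 0.001

# Even the stored stream adds up: petabytes per year.
print(stored_per_day * 365 // PB)  # 365
```

A thousand-fold reduction before storage is why only the stored petabyte, not the generated exabyte, matters for preservation planning.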

On page 46 Rumsey writes:
our recording medium of choice, the silicon chip, is vulnerable to decay, accidental deletion and overwriting
Our recording medium of choice is not, and in the foreseeable future will not be, the silicon chip. It will be the hard disk, which is of course equally vulnerable, as any read-write digital medium would be. Write-once media would be somewhat less vulnerable, and they definitely have a role to play, but they don't change the argument.

FOSS4Lib Recent Releases: Evergreen - 2.10.4


Last updated May 25, 2016. Created by gmcharlt on May 25, 2016.

Package: Evergreen
Release Date: Wednesday, May 25, 2016

William Denton: CC-BY


I’ve changed the license on my content to CC-BY: Creative Commons Attribution 4.0.

UPDATE 25 May 2016: The feed metadata is now updated too. “We copy documents based on metadata.”

Evergreen ILS: Evergreen 2.10.4 released

Wed, 2016-05-25 21:55

We are pleased to announce the release of Evergreen 2.10.4, a bug fix release.

Evergreen 2.10.4 fixes the following issues:

  • Fixes the responsive view of the My Account Items Out screen so that Title and
    Author are now in separate columns.
  • Fixes an incorrect link for the MVF field definition and adds a new link to
    BRE in fm_IDL.xml.
  • Fixes a bug where the MARC stream authority cleanup deleted a bib
    record instead of an authority record from the authority queue.
  • Fixes a bug where Action Triggers could select an inactive event
    definition when running.
  • Eliminates the output of a null byte after a spool file is processed
    in the MARC stream importer.
  • Fixes an issue where previously-checked-out items did not display in
    metarecord searches when the Tag Circulated Items Library Setting is
    enabled.
  • Fixes an issue in the 0951 upgrade script where the script was not
    inserting the version into config.upgrade_log because the line to do so
    was still commented out.

Please visit the downloads page to retrieve the server software and staff clients.

Library of Congress: The Signal: The Radcliffe Workshop on Technology & Archival Processing

Wed, 2016-05-25 19:18

This is a guest post from Julia Kim, archivist in the American Folklife Center at the Library of Congress.

Professor Matthew Connelly delivering the keynote. Photo by Radcliffe Workshop on Technology and Archival Processing.

The annual meeting of the Radcliffe Technology Workshop (April 4th – April 5th, #radtech16) brought together historians, (digital) humanists and archivists for an intensive discussion of the “digital turn” and its effect on our work. The result was a focused and highly participatory meeting among professionals working across disciplinary lines with regard to our respective methodologies and codes of conduct. The talks and panels served as springboards for rich conversations addressing many of the big-picture questions in our fields. Added to this was the use of round-table small group discussions after panel presentations, something I wish were more of a norm at professional events. This post covers only a small portion of the two days.

Matthew Connelly (Columbia University) asked “Will the coming of Big Data mean the end of history as we know it?” The answer was a resounding “yes.” Based on his years as a researcher at the National Archives and Records Administration (NARA), Connelly surveyed the history of government secrecy, its inefficiencies, the minuscule sample rate determining record retention, and the resultant losses to the historical record of major world events. Part of his work as a researcher involved building on these efforts to initiate the largest searchable collection of now-declassified government records with “The Declassification Engine” and the History Lab. In amassing and analyzing the largest data collection of declassified and unredacted records, their work can uncover secrets through, for example, systematic omissions in the record. (Read more at Wired magazine.)

The next panel, “Connections and Context: A Moderated Conversation about Archival Processing for the Digital Humanities Generation,” was organized around archival processing challenges and included Meredith Evans (Jimmy Carter Presidential Library and Museum), Cristina Pattuelli (Pratt Institute), and Dorothy Waugh (Emory University).

  • Meredith Evans (Jimmy Carter Presidential Library and Museum) of “Documenting Ferguson,” discussed her work “Documenting the Now” and her efforts to push archivists outside of their comfort zone and into the community to collect documentation as events unfolded.
  • Cristina Pattuelli (Pratt Institute) presented on the Linked Jazz linked data pilot project, which pulls together tools into a single platform to create connections with jazz-musician data. The initial data, digitized oral history transcripts, is further enriched and mashed with other types of data sets, like discography information from Carnegie Hall. (Read the overview published on EDUCAUSE.)
  • Dorothy Waugh (Emory University) spoke to the researcher aspect — or more aptly, the lack of researchers — of born-digital collections. (I wrote a related story titled “Researcher Interactions with Born-Digital”.) Her work underlines the need to cultivate not only donors but also the researchers we hope will one day want to investigate time-date stamps and disk images, for example. While few collections are available for research, the lack of researchers using born-digital collections is also a problem. Researchers are unaware of collections and do not, in a sense, know how to approach using these collections. She is in the process of developing a pilot project with undergraduate students to remedy this.
  • Benjamin Moser, the authorized biographer of Susan Sontag, spoke of his own discomfort, at times, with a researcher’s abilities to exploit privileged knowledge in email. To Moser, email increased the responsibilities of both the archive and the researcher to work in a manner that is “tasteful” and underlined the need to define and educate others in what that may mean. (Read his story published in The New Yorker.)

Mary O’Connell Murphy introducing “Collections and Context” panel. Photo by Radcliffe Workshop on Technology and Archival Processing.

There were a number of questions and concerns that we discussed, such as: What course of action is necessary or right when community activists feel discomfort with their submissions? How can we make sure that these collections aren’t misused? How can we protect individuals from legal prosecution? What are our duties to donors, to the law, and to our professions, and how do individuals navigate the conflicts among their competing claims? How can we, across disciplines, develop a way of discussing these issues? If the archives are defined as an associated set of values and practices, how can we address the lack of consensus on how to (re)interpret them, in light of the challenges of digital collections?

Claire Potter (the New School) delivered a keynote entitled “Fibber McGee’s Closet: How Digital Research Transformed the Archive – But Not the History Department,” which underlined these new challenges and the need for history methodologies to shift alongside shifts in archival methodologies. “The Archive, of course, has always represented systems of cognition,” as Potter put it, “but when either the nature of the archive or the way the archive is used changes, we must agree to change with it.” Historians must learn to triage in the face of the increased volume, despite the slow pace at which educational and research models have moved. Potter called for archivists and historians to work together to support our complementary roles in deriving meaning and use from collections. “The long game will be, historians, I hope, will begin to see archives and information technology as an intellectual and scholarly choice.” The Archives can be a teaching space and research space. (Read the text of her full talk.)

“Why Can’t We Stand Archival Practice on Its Head?” included three case studies experimenting with forms of “digitization as processing”: Larisa Miller (Hoover Institution, Stanford University), Jamie Roth and Erica Boudreau (John F. Kennedy Presidential Library and Museum), and Elizabeth Kelly (Loyola University, New Orleans).

  • Larisa Miller (Hoover Institution, Stanford University) reviewed the evolution of optical character recognition (OCR) and its use as a processing substitute. In comparing finding aids to these capabilities, she noted that “any access method will produce some winners and some losers.” Miller underscored the resource decisions that every archive must account for: Is this about finding aids or the best way to provide access? By eliminating archival processing, many more materials are digitized and made available to users. Ultimately, what methods maximize resources to get the most materials out to end users? In addition to functional reasons, Miller was critical of some core processing tasks: “The more arrangement we do, the more we violate original order.” (Read her related article published in The American Archivist.)
  • Jamie Roth and Erica Boudreau (John F. Kennedy Presidential Library and Museum) implemented multiple modes to test against one another: systematic digitization, digitization “on-demand” and simultaneous digitization while processing. Their talks emphasized impediments to digitization for access, such as their need to comply with legal requirements for restricted material and the lack of reliability of OCR. Roth emphasized that poor description still leads to lack of access or “access in name only.” They also cited researchers’ strong preferences for the analog original, even when given the option to use the digitized version.
  • Elizabeth Kelly (Loyola University, New Orleans) also experimented with folder-level metadata in digitizing university photographs. The scanning resulted in significant resource savings but surveyed users found the experimentally scanned collection “difficult to search and browse, but acceptable to some degree.” (Her slides are on Figshare.)

A great point from some audience members was that these item-level online displays are not viable for data researchers. Item-level organization seems to be a carryover from the analog world that, once again, serves some users and not others.

“Going Beyond the Click: A Moderated Conversation on the Future of Archival Description” included Jarrett Drake (Princeton), Ann Wooton (PopUp Archive) and Kari Smith (Massachusetts Institute of Technology), but I’ll focus on Drake’s work. Drake, Smith, and Wooton all addressed the major insufficiencies in existing descriptive and access practices in different ways. Smith will publish a blog post with more information on MIT’s Engineering the Future of the Past this Friday, May 27.

  • Jarrett Drake (Princeton) spoke from his experiences at Princeton, as well as with “A People’s Archive for Police Violence in Cleveland.” He delivered an impassioned attack on foundational principles — such as provenance, appraisal and respect des fonds — as not only technically insufficient in a landscape of corporatized ownership in the cloud, university ownership of academic work and collaborative work, but also as unethical carryovers of our colonialist and imperialistic past. With this technological shift, however, he emphasized the greater possibility for change: “First, we occupy a moment in history in which the largest percentage of the world’s population ever possesses the power and potential to author and create documentation about their lived experiences.” (Read the full text of his talk.)

While I haven’t done justice to the talks and the ensuing conversation and debate, the Radcliffe Technology Workshop helped me to expand my own thinking by framing problems to include invested practitioners and theorists outside of the digital preservation sphere. To my knowledge it is also the only event of its kind.

LITA: Jobs in Information Technology: May 25, 2016

Wed, 2016-05-25 18:36

New vacancy listings are posted weekly on Wednesday at approximately 12 noon Central Time. They appear under New This Week and under the appropriate regional listing. Postings remain on the LITA Job Site for a minimum of four weeks.

New This Week

Pacific States University (PSU), Librarian, Los Angeles, CA

Visit the LITA Job Site for more available jobs and for information on submitting a job posting.

John Mark Ockerbloom: Sharing journals freely online

Wed, 2016-05-25 16:07

What are all the research journals that anyone can read freely online?  The answer is harder to determine than you might think.  Most research library catalogs can be searched for online serials (here’s what Penn Libraries gives access to, for instance), but it’s often hard for unaffiliated readers to determine what they can get access to, and what will throw up a paywall when they try following a link.

Current research

The best-known listing of current free research journals has been the Directory of Open Access Journals (DOAJ), a comprehensive listing of free-to-read research journals in all areas of scholarship. Given the ease with which anyone can throw up a web site and call it a “journal” regardless of its quality or its viability, some have worried that the directory might be a little too comprehensive to be useful.  A couple of years ago, though, DOAJ instituted more stringent criteria for what it accepts, and it recently weeded its listings of journals that did not reapply under its new criteria, or did not meet its requirements.   This week I am pleased to welcome over 8,000 of its journals to the extended-shelves listings of The Online Books Page.  The catalog entries are automatically derived from the data DOAJ provides; I’m also happy to create curated entries with more detailed cataloging on readers’ request.

Historic research

Scholarly journals go back centuries.  Many of these journals (and other periodicals) remain of interest to current scholars, whether they’re interested in the history of science and culture, the state of the natural world prior to recent environmental changes, or analyses and source documents that remain directly relevant to current scholarship.  Many older serials are also included in The Online Books Page’s extended shelves courtesy of HathiTrust, which currently offers over 130,000 serial records with at least some free-to-read content.  Many of these records are not for research journals, of course, and those that are can sometimes be fragmentary or hard to navigate.  I’m also happy to create organized, curated records for journals offered by HathiTrust and others at readers’ request.

It’s important work to organize and publicize these records, because many of these long-running journals don’t make their content freely available in the first place one might look. Recently I indexed five journals founded over a century ago that are still used enough to be included in Harvard’s 250 most popular works: Isis, The Journal of Comparative Neurology, The Journal of Infectious Diseases, The Journal of Roman Studies, and The Philosophical Review. All five had public domain content that sat behind paywalls (with fees for access ranging from $10 to $42 per article) at their official journal sites or JSTOR, but was available for free elsewhere online. I’d much rather have readers find the free content than be stymied by a paywall. So I’m compiling free links for these and other journals with public domain runs, whether they can be found at HathiTrust, JSTOR (which does make some early journal content, including from some of these journals, freely available), or other sites.

For many of these journals, the public domain extends as late as the 1960s due to non-renewal of copyright, so I’m also tracking when copyright renewals actually start for these journals.  I’ve done a complete inventory of serials published until 1950 that renewed their own copyrights up to 1977.  Some scholarly journals are in this list, but most are not, and many that are did not renew copyrights for many years beyond 1922.  (For the five journals mentioned above, for instance, the first copyright-renewed issues were published in 1941, 1964, 1959, 1964, and 1964 respectively– 1964 being the first year for which renewals were automatic.)

Even so, major projects like HathiTrust and JSTOR have generally stopped opening journal content at 1922, partly out of a concern for the complexity of serial copyright research.  In particular, contributions to serials could have their own copyright renewals separate from renewals for the serials themselves.  Could this keep some unrenewed serials out of the public domain?  To answer this question, I’ve also started surveying information on contribution renewals, and adding information on those renewals to my inventory.  Having recently completed this survey for all 1920s serials, I can report that so far individual contributions to scholarly journals were almost never copyright-renewed on their own.  (Individual short stories, and articles for general-interest popular magazines, often were, but not articles intended for scientific or scholarly audiences.)  I’ll post an update if the situation changes in the 1930s or later. So far, though, it’s looking like, at least for research journals, serial digitization projects can start opening issues past 1922 with little risk.  There are some review requirements, but they’re comparable in complexity to the Copyright Review Management System that HathiTrust has used to successfully open access to hundreds of thousands of post-1922 public domain book volumes.

Recent research

Let’s not forget that a lot more recent research is also available freely online, often from journal publishers themselves.  DOAJ only tracks journals that make their content open access immediately, but there are also many journals that make their content freely readable online a few months or years after initial publication.  This content can then be found in repositories like PubMedCentral (see the journals noted as “Full” in the “participation” column), publishing platforms like Highwire Press (see the journals with entries in the “free back issues” column), or individual publishers’ programs such as Elsevier’s Open Archives.

Why are publishers leaving money on the table by making old but copyrighted content freely available instead of charging for it? Often it’s because it’s what makes their supporters – scholars and their funders – happy. NIH, which runs PubMedCentral, already mandates open access to research it funds, and many of the journals that fully participate in PubMedCentral’s free issue program are largely filled with NIH-backed research. Similarly, I suspect that the high proportion of math journals in Elsevier’s Open Archives selection has something to do with the high proportion of mathematicians in the Cost of Knowledge protest against Elsevier. When researchers, and their affiliated organizations, make their voices heard, publishers listen.

I’m happy to include listings for significant free runs of significant research journals on The Online Books Page as well, whether they’re open access from the get-go or after a delay. I won’t list journals that only make the occasional paid-for article available through a “hybrid” program, or those that only have sporadic “free sample” issues. But if a journal you value has at least a year’s worth of full-sized, complete issues permanently freely available, please let me know about it and I’ll be glad to check it out.

Sharing journal information

I’m not simply trying to build up my own website, though – I want to spread this information around, so that people can easily find free research journal content wherever they go. Right now, I have a Dublin Core OAI feed for all curated Online Books Page listings as well as a monthly dump of my raw data file, both CC0-licensed. But I think I could do more to get free journal information to libraries and other interested parties. I don’t have MARC records for my listings at the moment, but I suspect that holdings information – what issues of which journals are freely available, and from whom – is more useful for me to provide than bibliographic descriptions of the journals (which can already be obtained from various other sources). Would a KBART file, published online or made available to initiatives like the Global Open Knowledgebase, be useful? Or would something else work better to get this free journal information more widely known and used?
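For readers unfamiliar with the format: a KBART title list is essentially a tab-separated file with a standard header row, one journal per line, giving its identifiers and online coverage dates. A minimal sketch of emitting one follows; the journal shown is hypothetical, and only a handful of KBART's recommended columns are included:

```python
import csv
import io

# A few of the standard KBART column names (the full recommended set is larger).
FIELDS = [
    "publication_title", "print_identifier", "online_identifier",
    "date_first_issue_online", "date_last_issue_online", "title_url",
]

# Hypothetical free-run holdings, for illustration only.
holdings = [
    {
        "publication_title": "Example Journal of Studies",
        "print_identifier": "0000-0000",
        "online_identifier": "0000-0000",
        "date_first_issue_online": "1913",
        "date_last_issue_online": "1940",
        "title_url": "https://example.org/ejs",
    },
]

# KBART files are plain tab-delimited text with a header row.
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=FIELDS, delimiter="\t")
writer.writeheader()
writer.writerows(holdings)
print(buf.getvalue())
```

Because the format is this simple, a knowledge base can ingest such a file directly, which is what makes KBART attractive for advertising which issues of which journals are freely available.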

Issues and volumes vs. articles

Of course, many articles are made available online individually as well, as many journal publishers allow. I don’t have the resources at this point to track articles at an individual level, but there are a growing number of other efforts that do, whether they’re proprietary but comprehensive search platforms like Google Scholar and Web of Science, disciplinary repositories like arXiv and SSRN, institutional repositories and their aggregators like SHARE and BASE, or outright bootleg sites like Sci-Hub. We know from them that it’s possible to index and provide access to scholarly knowledge exchange at a global scale, but doing it accurately, openly, comprehensively, sustainably, and ethically is a bigger challenge. I think it’s a challenge that the academic community can solve if we make it a priority. We created the research; let’s also make it easy for the world to access it, learn from it, and put it to work. Let’s make open access to research articles the norm, not the exception.

And as part of that, if you’d like to help me highlight and share information on free, authorized sources for online journal content, please alert me to relevant journals, make suggestions in the comments here, or get in touch with me offline.

District Dispatch: Presidential campaigns weigh in on education & libraries

Wed, 2016-05-25 15:22

Representatives from all three major Presidential campaigns are expected to participate in this week’s CEF Presidential Forum to be held May 26 in Washington. ALA will be participating in the half-day forum and encourages members to view and participate online.


ALA members are invited to follow the Forum online as the event will be live streamed starting at 10:00 AM and running through 12:00 PM EST. ALA has submitted library-themed questions for the Presidential representatives, but you can participate in the event by submitting your questions or tweeting them using #CEFpresForum.

The Committee for Education Funding (CEF) is hosting the 2016 Presidential Forum, which will emphasize education as a critical domestic policy and the need for continuing investments in education. At the forum, the high-level surrogates will discuss in depth the education policy agendas of the remaining candidates. A second panel of education experts from think tanks will discuss the educational landscape that awaits the next administration.  CEF has hosted Presidential Forums during previous elections.

Candy Crowley, award-winning journalist and former Chief Political Correspondent for CNN, will moderate both panels.

The post Presidential campaigns weigh in on education & libraries appeared first on District Dispatch.

David Rosenthal: Randall Munroe on Digital Preservation

Wed, 2016-05-25 15:00
Randall Munroe succinctly illustrates a point I made at length in my report on emulation:
And here, for comparison, is one of the Internet Archive's captures of the XKCD post. Check the mouse-over text.

Open Knowledge Foundation: Introducing: MyData

Wed, 2016-05-25 14:01

this post was written by the OK Finland team

What is MyData?

MyData is both an alternative vision and guiding technical principles for how we, as individuals, can have more control over the data trails we leave behind us in our everyday actions.

The core idea is that we, you and I, should have an easy way to see where data about us goes, specify who can use it, and alter these decisions over time. To do this, we are developing a standardized, open, and mediated approach to personal data management by creating “MyData operators.”

Standardised operator model

A MyData operator account would act like an email account for your different data streams. As with email, different parties can host an operator account, with different sets of functionalities. For example, some MyData operators could also provide personal data storage solutions; others could perform data analytics or work as an identity provider. The one requirement for a MyData operator is that it lets individuals receive and send data streams according to one interoperable set of standards.
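As a very rough sketch of the operator idea (the class and field names here are hypothetical illustrations, not part of any MyData specification), an operator account could track per-stream consent records that the individual can inspect, grant, and revoke over time:

```python
from dataclasses import dataclass, field

@dataclass
class Consent:
    """One decision about who may receive one of my data streams."""
    data_source: str   # e.g. a loyalty-card purchase stream
    recipient: str     # the third party allowed to receive it
    active: bool = True

@dataclass
class OperatorAccount:
    """Hypothetical MyData operator account for one individual."""
    owner: str
    consents: list = field(default_factory=list)

    def grant(self, data_source: str, recipient: str) -> None:
        self.consents.append(Consent(data_source, recipient))

    def revoke(self, recipient: str) -> None:
        # Altering decisions over time: deactivate matching consents.
        for c in self.consents:
            if c.recipient == recipient:
                c.active = False

    def allowed_recipients(self) -> list:
        # "See where data about us goes": only active consents count.
        return [c.recipient for c in self.consents if c.active]

acct = OperatorAccount("alice")
acct.grant("grocery-purchases", "budgeting-app")
acct.grant("grocery-purchases", "ad-network")
acct.revoke("ad-network")
print(acct.allowed_recipients())  # ['budgeting-app']
```

The point of the sketch is the shape of the data, not the code: each consent is an explicit, revocable record rather than a one-time checkbox buried in terms of service.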

What can “MyData” do?

The “MyData” model does a few things that the current data ecosystem does not.

It will let you re-use your data with a third party – For example, you could take data collected about your purchasing habits from a loyalty card of your favourite grocery store and re-use it in a financing application to see how you are spending your money on groceries.

It will let you see and change how you consent to your data use – Currently, different service providers and applications use complicated terms of service where most users just check ‘yes’ or ‘no’ once, without being entirely sure what they agree to.

It will let you change services – With MyData you will be able to take your data from one operator to another if you decide to change services.

Make it happen, make it right

The MyData 2016 conference will be held August 31st – September 2nd at the Helsinki Hall of Culture.

Right now, the technical solutions for managing your data according to the MyData approach exist. There are many initiatives, emerging out of both the public and private sectors around the world, paving the way for human-centered personal data management. We believe strongly in the need to collaborate with other initiatives to develop an infrastructure in a way that works with all the complicated systems at work in the current data landscape. Buy your tickets before May 31st for the early bird discount.

Follow MyData on social media for updates:

Twitter Facebook

DuraSpace News: Luso-Brazilian Digital Library Launched

Wed, 2016-05-25 00:00

From Tiago Ferreira, Neki IT


District Dispatch: Last week in appropriations

Tue, 2016-05-24 19:41

The Appropriations process in Congress is a year-long cycle with fits and starts, and includes plenty of lobbying, grassroots appeals, lobby days, speeches, hearings and markups, and even creative promotions designed to draw attention to the importance of one program or another. ALA members and the Office of Government Relations continue to play a significant role in this process. Recently, for example, we’ve worked to support funding for major library programs like LSTA and IAL, as well as to address policy issues that arise in Congressional deliberations. Your grassroots voice helps amplify my message in meetings with Congressional staff.

The House and Senate Appropriations Committees have begun to move their FY2017 funding bills through the subcommittee and full committee process that sends the various spending measures to the Floor and then to the President’s desk. Last week was a big week for appropriations on Capitol Hill and I was back-and-forth to various Congressional hearings, meetings, and events. Here are a few of last week’s highlights:

Source: csp_iqoncept

Tuesday – There’s another word for that    

The full House Appropriations Committee convened (in a type of meeting called a “markup”) to discuss, amend and vote on two spending bills: those for the Department of Defense and the Legislative Branch. A recently proposed change to Library of Congress (LC) cataloging terminology, having nothing to do with funding at all, was the focus of action on the Legislative Branch bill. Earlier in April, Subcommittee Chair Tom Graves (R-GA14) successfully included instructions to the Library in a report accompanying the bill that would prohibit the LC from implementing changes to modernize the outdated, and derogatory, terms “illegal aliens” and “aliens.”

An amendment was offered during Tuesday’s full Committee meeting by Congresswoman Debbie Wasserman Schultz (D-FL23) that would have removed this language from the report (a position strongly and actively supported by ALA and highlighted during National Library Legislative Day). The amendment generated extensive discussion, including vague references by one Republican to “outside groups” (presumably ALA) that were attempting to influence the process (influence the process? in Washington? shocking!).

The final roll call vote turned out to be a nail-biter, as ultimately four Committee Republicans broke with the Subcommittee Chairman to support the amendment. Many in the room, myself included, thought the amendment might have passed, and an audible gasp from the audience was heard upon the announcement that it had failed by just one vote (24 – 25). Unfortunately, two Committee Democrats whose votes could have carried the amendment were not able to attend. The Legislative Branch spending bill now heads to the Floor and another possible attempt to pass the Wasserman Schultz amendment … or potentially an effort to keep the bill from coming up at all.

Wednesday – Can you hear me now? Good.

In Congress, sometimes the action occurs outside the Committee rooms. It’s not uncommon, therefore, for advocates and their congressional supporters to mount a public event to ratchet up the pressure on the House and Senate. ALA has been an active partner in a coalition seeking full funding for Title IV, Part A of the Every Student Succeeds Act. On Wednesday, I participated in one such creative endeavor: a rally on the lawn of the US Capitol complete with high school choir, comments from supportive Members of Congress, and “testimonials” from individuals benefited by Title IV funding.

This program gives school districts the flexibility to invest in student health and safety, academic enrichment, and education technology programs. With intimate knowledge of the entire school campus, libraries are uniquely positioned to assist in determining local needs for block grants, and for identifying needs within departments, grade levels, and divisions within a school or district. Congress authorized Title IV in the ESSA at $1.65 billion for FY17, however the President’s budget requests only about one third of that necessary level.

The cloudy weather threatened — but happily did not deliver — rain and the event came off successfully. Did Congress hear us? Well, our permit allowed the use of amplified speakers, so I’d say definitely yes!

Thursday – A quick vote before lunch

On Thursday, just two days after House Appropriators’ nail biter of a vote over Legislative Branch Appropriations, the full Senate Appropriations Committee took up their version of that spending bill in addition to Agriculture Appropriations. For a Washington wonk, a Senate Appropriations Committee hearing is a relatively epic thing to behold. Each Senator enters the room trailed by two to four staffers carrying reams of paper. Throughout the hearing, staffers busily whisper amongst each other, and into the ears of their Senators (late breaking news that will net an extra $10 million for some pet project, perhaps?)

While a repeat of Tuesday’s House fracas wasn’t at all anticipated (ALA had worked ahead of time to blunt any effort to adopt the House’s controversial Library of Congress provision in the Senate), I did wonder whether there had been a last minute script change when the Chairman took up the Agriculture bill first and out of order based on the printed agenda for the meeting. After listening to numerous amendments addressing such important issues as Alaska salmon, horse slaughter for human consumption (yuck?), and medicine measurement, I was definitely ready for the Legislative Branch Appropriations bill to make its appearance. As I intently scanned the room for any telltale signs of soon-to-be-volcanic controversy, the Committee Chairman brought up the bill, quickly determined that no Senator had any amendment to offer, said a few congratulatory words, successfully called for a voice vote and gaveled the bill closed.

Elapsed time, about 3 minutes! I was unexpectedly free for lunch…and, for some reason, craving Alaska salmon.

Epilogue – The train keeps a rollin’

This week’s activity by the Appropriations Committees of both chambers demonstrates that the leaders of Congress’ Republican majority are deliberately moving the Appropriations process forward. Indeed, in the House and Senate they have promised to bring all twelve funding bills to the floor of both chambers on time…something not done since 1994. Sadly, however, staffers on both sides of the aisle tell me that they expect the process to stall at some point. If that happens, once again Congress will need to pass one or more “Continuing Resolutions” (or CRs) after October 1 to keep the government operating. One thing is certain: there is lots of work to be done this summer to defend library funding and policies.

The post Last week in appropriations appeared first on District Dispatch.

District Dispatch: Judiciary Committee Senators face historic “E-Privacy” protection vote

Tue, 2016-05-24 17:55

More good news could be in the offing for reform of ECPA, the Electronic Communications Privacy Act. Senate Judiciary Committee Chairman Charles Grassley (R-IA) recently (and pleasantly) surprised reform proponents by calendaring a Committee vote on the issue now likely to take place this coming Thursday morning, May 26th.  The Committee, it is hoped, will take up and pass H.R. 699, the Email Privacy Act, which was unanimously approved by the House of Representatives, as reported in District Dispatch, barely three weeks ago.  (A similar but not identical Senate bill co-authored by Judiciary Committee Ranking Member Patrick Leahy [D-VT], S. 356, also could be called up and acted upon.)


Either bill finally would update ECPA in the way most glaringly needed: to virtually always require the government to get a standard, judicially-approved search warrant based upon probable cause to acquire the full content of an individual’s emails, texts, tweets, cloud-based files or other electronic communications. No matter which is considered, however, there remains a significant risk that, on Thursday, the bill’s opponents will try to dramatically weaken that core reform by exempting certain agencies (like the IRS and SEC) from the new warrant requirement, and/or by providing dangerous exceptions to law enforcement and security agencies acting in overbroadly defined “emergency” circumstances.

Earlier today, ALA joined a new joint letter signed by nearly 65 of its public and private sector coalition partners calling on Senators Grassley and Leahy to take up and pass H.R. 699 as approved by the House: in other words, “without any [such] amendments that would weaken the protections afforded by the bill” ultimately approved by 419 of the 435 House Members.

Now is the time to tell the Members of the Senate Judiciary Committee that almost 30 years has been much too long to wait for real ECPA reform. Please go to ALA’s Legislative Action Center to email your Senators on the Judiciary Committee now!

The post Judiciary Committee Senators face historic “E-Privacy” protection vote appeared first on District Dispatch.

SearchHub: Welcome Jeff Depa!

Tue, 2016-05-24 17:30

We’re happy to announce another new addition to the Lucidworks team! Please welcome Jeff Depa, our new Senior Vice President of Worldwide Field Operations, announced in May 2015 (full press release: Lucidworks Appoints Search Veterans to Senior Team).

Jeff will lead the company’s day-to-day field operations, including its rapidly growing sales, alliances and channels, systems engineering and professional services business. Prior to joining Lucidworks, Jeff spent over 17 years in leadership positions across sales, consulting, and systems engineering at companies such as Oracle, Sun, and most recently DataStax.

Jeff earned a B.S. in Biomedical Engineering from Case Western Reserve University and also holds a Master’s in Management. Aside from a passion for enabling clients to unleash the power of their data, Jeff is an avid pilot and enjoys spending time with his family in Austin, TX.

We sat down with Jeff to learn more about his passion for search:

What attracted you to Lucidworks?

Lucidworks is at the forefront of unleashing the value hidden in the massive amount of data companies have collected across disparate systems. They have done a phenomenal job in driving the adoption of Apache Solr, but more importantly, in building a platform in Fusion that allows enterprises from high-volume ecommerce shops to healthcare to easily adopt and deploy a search solution that goes beyond the industry standard, and really focuses on providing the right information at the right time with unique relevancy and machine learning technologies.

What will you be working on at Lucidworks?

I’ll be focused on building on top of a solid foundation as we continue to drive the adoption of Fusion in the market and expand our team to capture the market opportunity with our customers and partners. I’m excited to be part of this journey.

Where do you think the greatest opportunities lie for companies like Lucidworks?

In today’s economy, value is driven from creating a unique, personalized and real-time experience for customers and employees. Lucidworks sits squarely in the middle of an enterprise’s disparate and rapidly evolving data sources and enables the transformation of data to information that can be used to improve the user experience. The ability to tie that information to a high-impact customer result is a huge opportunity for Lucidworks.

Welcome to the team Jeff!

The post Welcome Jeff Depa! appeared first on

LITA: Mindful Tech, a 2 part webinar series with David Levy

Tue, 2016-05-24 15:09

Mindful Tech: Establishing a Healthier and More Effective Relationship with Our Digital Devices and Apps
Tuesdays, June 7 and 14, 2016, 1:00 – 2:30 pm Central Time
David Levy, Information School, University of Washington

Register Now for this 2 part webinar

“There is a long history of people worrying and complaining about new technologies and also putting them up on a pedestal as the answer. When the telegraph and telephone came along you had people arguing both sides—that’s not new. And you had people worrying about the explosion of books after the rise of the printing press.

What is different is for the last 100-plus years the industrialization of Western society has been devoted to a more, faster, better philosophy that has accelerated our entire economic system and squeezed out anything that is not essential.

As a society, I think we’re beginning to recognize this imbalance, and we’re in a position to ask questions like “How do we live a more balanced life in the fast world? How do we achieve adequate forms of slow practice?”

David Levy – See more at:

Don’t miss the opportunity to participate in this well-known program by David Levy, based on his recent, widely reviewed and well-regarded book “Mindful Tech.” The popular interactive program, including exercises and participation, has been re-packaged into a 2-part webinar format. Both parts will be fully recorded so participants can return to the material or accommodate varying schedules.

Register Now for the 2 part Mindful Tech webinar series

This two-part webinar series (90 minutes each session) will introduce participants to some of the central insights of the work Levy has been doing over the past decade and more. By learning to pay attention to their immediate experience (what’s going on in their minds and bodies) while they’re online, people are able to see more clearly what’s working well for them and what isn’t, and based on these observations to develop personal guidelines that allow them to operate more effectively and healthfully. Levy will demonstrate this work by giving participants exercises they can do both during the online program and between the sessions.


David Levy

David M. Levy is a professor at the Information School of the University of Washington. For more than a decade, he has been exploring, via research and teaching, how we can establish a more balanced relationship with our digital devices and apps. He has given many lectures and workshops on this topic, and in January 2016 published a book on the subject, “Mindful Tech: How to Bring Balance to Our Digital Lives” (Yale). Levy is also the author of “Scrolling Forward: Making Sense of Documents in the Digital Age” (rev. ed. 2016).

Additional information is available on his website at:

Then register for the webinar and get full details.

Can’t make the dates but still want to join in? Registered participants will have access to both parts of the recorded webinars.


  • LITA Member: $68
  • Non-Member: $155
  • Group: $300

Registration Information

Register Online page arranged by session date (login required)
Mail or fax form to ALA Registration
Call 1-800-545-2433 and press 5

Questions or Comments?

For all other questions or comments related to the webinar series, contact LITA at (312) 280-4269 or Mark Beatty,

Islandora: iCampBC - Instructors Announced!

Tue, 2016-05-24 13:59

Islandora Camp is going back to Vancouver from July 18 - 20, courtesy of our wonderful hosts at the British Columbia Electronic Library Network. Camp will (as usual) consist of three days: One day of sessions taking a big-picture view of the project and where it's headed, one day of hands-on workshops for developers and front-end administrators, and one day of community presentations and deeper dives into Islandora tools and sites. The instructors for that second day have been selected and we are pleased to introduce them:


Mark Jordan has taught at two other Islandora Camps and at the Islandora Conference. He is the developer of Islandora Context, Islandora Themekey, Islandora Datastream CRUD, and the XML Solution Pack, and is one of the co-developers of the Move to Islandora Kit. He is also an Islandora committer and is currently serving as Chair of the Islandora Foundation Board. His day job is as Head of Library Systems at Simon Fraser University.

Rosie Le Faive started with Islandora in 2012 while creating a trilingual digital library for the Commission for Environmental Cooperation. With experience and - dare she say - wisdom gained from creating highly customized sites, she's now interested in improving the core Islandora code so that everyone can use it. Her interests are in mapping relationships between objects, and intuitive UI design. She is the Digital Infrastructure and Discovery librarian at UPEI, and develops for Agile Humanities.  


Melissa Anez has been working with Islandora since 2012 and has been the Community and Project Manager of the Islandora Foundation since it was founded in 2013. She has been a frequent instructor in the Admin Track and developed much of the curriculum, refining it with each new Camp.

Janice Banser is the Systems Librarian at Simon Fraser University.  She has been working with Islandora, specifically the admin interface, for over a year now. She is a member of the Islandora Documentation Interest Group and has contributed to the last two Islandora releases. She has been working with Drupal for about 6 years and has been a librarian since 2005.

Patrick Hochstenbach: Crosshatching with my fountain pen

Tue, 2016-05-24 04:34
Filed under: portaits Tagged: crosshatch, fountain pen, ink, paper, portrait, sktchy, twsbi

Terry Reese: MarcEdit Update

Tue, 2016-05-24 01:39

Yesterday, I posted a significant update to the Windows/Linux builds and a maintenance update to the Mac build that includes a lot of prep work to get it ready to roll in a number of changes that I’ll hopefully complete this week.  Unfortunately, I’ve been doing a lot of travelling, which means that my access to my Mac setup has been pretty limited, and I didn’t want to take another week getting everything synched together. 

So what are the specific changes:

ILS Integrations
I’ve been spending a lot of time over the past three weeks heads-down working on ILS integrations.  Right now, I’m managing two ILS integration scenarios – one is with Alma and their API.  I’m probably 80% finished with that work.  Right now, all the code is written; I’m just not getting back the expected responses from their bibliographic update API.  Once I sort out that issue, I’ll be integrating this change into MarcEdit and will provide a YouTube video demonstrating the functionality. 

The other ILS integration that I’ve been accommodating is working with MarcEdit’s MARC SQL Explorer and the internal database structure.  This work builds on some work being done with the Validate Headings tool to close the authority control loop.  I’ll likely be posting more about that later this week, as I currently have a couple of libraries testing this functionality to make sure I’ve not missed anything.  Once they give me the thumbs up, this will make its way into the MarcEditor as well. 

But as part of this work, I needed to create a way for users to edit and search the local database structure in a more friendly way.  So, leveraging the ILS platform, I’ve included the ability for users to work with the local database format directly within the MarcEditor.  You can see how this works here: Integrating the MarcEditor with a local SQL store.  I’m not sure what the ideal use case is for this functionality – but over the past couple of weeks it has been requested by a couple of power users currently using the MARC SQL Explorer for some data edits, but hoping for an easier-to-use interface.  This work will be integrated into the Mac MarcEdit version at the end of this week.  All the prep work (window/control development) has been completed.  At this point, it’s just migrating the code so that it works within the Mac’s Objective-C codebase.

Edit Shortcuts
I created two new edit shortcuts in the MarcEditor.  The first, Find Records With Duplicate Tags, was created to help users look for records that have multiple instances of a tag or a tag/subfield combination within a set of records.  This is work that can be done in the Extract Selected Records tool, but it requires a bit of trickery and knowledge of how MarcEdit formats data. 

How does this work?  Say you wanted to know which records have multiple call number (050) fields.  You would select this option, enter 050 at the prompt, and the tool would create a jump list showing all the records that met your criteria. 
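For readers curious how such a check might work outside MarcEdit, here is a minimal Python sketch of the same idea, scanning MarcEdit mnemonic (.mrk) style text for records that contain a repeated tag. The function name and sample data are illustrative, not part of MarcEdit:

```python
# Sketch: find records containing a repeated tag in MarcEdit mnemonic
# (.mrk) style text. Records are separated by blank lines and each
# field begins with "=TAG  ". Names here are illustrative only.
def records_with_duplicate_tag(mrk_text, tag):
    """Return indices of records holding more than one =TAG field."""
    hits = []
    for i, record in enumerate(mrk_text.strip().split("\n\n")):
        count = sum(1 for line in record.splitlines()
                    if line.startswith("=" + tag))
        if count > 1:
            hits.append(i)
    return hits

sample = (
    "=LDR  00000nam\n=050  00$aQA76.9$bW43\n=050  00$aZ678.9\n"
    "\n"
    "=LDR  00000nam\n=050  00$aPN1997\n"
)
print(records_with_duplicate_tag(sample, "050"))  # → [0]
```

MarcEdit builds the equivalent result into a clickable jump list rather than returning indices, but the underlying check is the same counting pass.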

Convert To Decimal Degrees
The second Edit Shortcut function is the first Math function (I’ll be adding two more, specifically around finding records with dates greater than or less than a specific value), targeting the conversion of Degrees/Minutes/Seconds to decimal degrees.  The process has been created to be MARC agnostic, so users can specify the field and subfields to process.  To run this function, select it from the Edit Shortcuts menu as demonstrated in the screenshot below:

When selected, you will get the following prompt:

This documents the format for defining the field/subfields to be processed.  Please note, it is important to define all four potential values for conversion – even if they are not used within the record set. 

Using this function, you can now convert a value like:
=034  1\$aa$b1450000$dW1250000$eW1163500$fN0461500$gN0420000
into:
=034  1\$aa$b1450000$d+125.0000$e+116.5833$f+046.2500$g+042.0000

This function should allow users to transition their cartographic data to a format that is much more friendly to geographic interpretation if desired.
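Under the hood, the conversion is straightforward arithmetic: degrees plus minutes/60 plus seconds/3600. A minimal Python sketch follows; the function name is illustrative, and the sign convention (south/west negative) is the common cartographic one rather than MarcEdit's documented output format, which as shown above may keep explicit plus signs:

```python
import re

# Sketch of a Degrees/Minutes/Seconds conversion for values like
# "W1250000" (hemisphere letter + DDDMMSS). Illustrative only; the
# exact formatting MarcEdit emits may differ.
def dms_to_decimal(value):
    m = re.fullmatch(r"([NSEW])(\d{2,3})(\d{2})(\d{2})", value)
    if not m:
        raise ValueError("unrecognized coordinate: " + value)
    hemi, deg, mins, secs = m.groups()
    decimal = int(deg) + int(mins) / 60 + int(secs) / 3600
    if hemi in ("S", "W"):  # conventionally, south and west are negative
        decimal = -decimal
    return round(decimal, 4)

print(dms_to_decimal("N0461500"))  # → 46.25
print(dms_to_decimal("W1250000"))  # → -125.0
```

Note that 116°35′00″ works out to 116.5833 decimal degrees, since 35 arcminutes is 35/60 ≈ 0.5833 of a degree.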

Bug Fixes:
This update also addressed a bug in the Build New Field parser.  If you have multiple arguments side-by-side within the same field grouping (e.g., {100$a}{100$b}{100$c}), the parser could become confused.  This has been corrected.

I also included an update to the linked data rules file, updating the 7xx fields to include the $t in the processing, and updated the UNIMARC translation to include a 1:1 translation for 9xx data.

Over the next week, I hope to complete the Alma integration, and will focus my free development time on getting the Mac version synched with these changes.


DuraSpace News: Sandy Payette to Speak at 2016 VIVO Conference

Tue, 2016-05-24 00:00

From the VIVO 2016 Planning Committee

Register today to attend the 2016 VIVO conference and hear from leading experts within our community.