Feed aggregator

District Dispatch: 2017 WHCLIST award winner announced

planet code4lib - Thu, 2017-04-06 17:05

This week, the American Library Association (ALA) Washington Office announced that Lori Rivas of Newhall, CA is the winner of the 2017 White House Conference on Library and Information Services (WHCLIST) Award. Given to a non-librarian participant attending National Library Legislative Day, the award covers hotel fees and includes a $300 stipend to defray the cost of attending the event.

2017 WHCLIST winner Lori Rivas.

A lifelong library user, Rivas has spent the last seven years advocating for her library. This passion for libraries grew out of many years of library use:

“For 20 years, I homeschooled my children, depending on public library resources and programming. In 2010, our city, Santa Clarita, CA, proposed contracting with a private company for the management of our public libraries. All my momma bear energy and activism juices were galvanized into the fight to keep our local public libraries truly public.”

In response to this proposed contract, Rivas helped to organize a campaign that gained national attention. She secured a private audience with Los Angeles County Supervisor Michael Antonovich and CA State Senator Bob Huff (R-29th District), and was interviewed by the Washington Post. Later advocacy efforts led her to testify before the California State Governance and Finance Committee for the passage of AB438. She has campaigned for library services during local elections, written to local media, and undertaken many other tasks in support of libraries. Her advocacy work eventually led her to become a library consultant for the Southern California Library Cooperative (SCLC), where among other projects she helped convert LSTA grants into work plans and timetables, ensuring the successful implementation of LSTA grants and giving her insight into the importance of LSTA funding.

Looking toward the future, Rivas hopes to pursue her MLIS, to better combine her interests: “education, activism, public service, government work, writing, and advocating for the disenfranchised.”

The White House Conference on Library and Information Services—an effective force for library advocacy nationally, statewide and locally—transferred its assets to the ALA Washington Office in 1991 after the last White House conference. These funds allow ALA to participate in fostering a spirit of committed, passionate library support in a new generation of library advocates. Leading up to National Library Legislative Day each year, the ALA seeks nominations for the award. Representatives of WHCLIST and the ALA Washington office choose the recipient.

The post 2017 WHCLIST award winner announced appeared first on District Dispatch.

Karen Coyle: Precipitating Forward

planet code4lib - Thu, 2017-04-06 16:38
Our Legacy, Our Mistake
If you follow the effort taking place around the proposed new bibliographic data standard, BIBFRAME, you may have noticed that much of what is being done with BIBFRAME today begins with our current data in MARC format and converts it to BIBFRAME. While this is a function that will be needed should libraries move to a new data format, basing our development on how our legacy data converts is not the best way to move forward. In fact, it doesn't really tell us what "forward" might look like if we give it a chance.

We cannot define our future by looking only at our past. There are some particular aspects of our legacy data that make this especially true.            

I have said before (video, article) that we made a mistake when we went from printing cards using data encoded in MARC, to using MARC in online catalogs. The mistake was that we continued to use the same data that had been well-adapted to card catalogs without making the changes that would have made it well-adapted to computer catalogs. We never developed data that would be efficient in a database design or compatible with database technology. We never really moved from textual description to machine-actionable data points. Note especially that computer catalogs fail to make use of assigned headings as they are intended, yet catalogers continue to assign them at significant cost.

One of the big problems in our legacy data that makes it hard to take advantage of computing technology is that the data tends to be quirky. Technology developers complain that the data is full of errors (as do catalogers), but in fact it is very hard to define, algorithmically, what is an error in our data.  The fact is that the creation of the data is not governed by machine rules; instead, decisions are made by humans with a large degree of freedom. Some fields are even defined as being either this or that, something that is never the case in a data design. A few fields are considered required, although we've all seen records that don't have those required fields. Many fields are repeatable and the order of fields and subfields is left to the cataloger, and can vary.

The cataloger view is of a record of marked-up text. Computer systems can do little with text other than submit it for keyword indexing and display it on the screen. Technical designers look to the fixed fields for precise data points that they can operate on, but these are poorly supported and are often not included in the records since they don't look like "cataloging" as it is defined in libraries. These coded data elements are not defined by the cataloging code, either, and can be seen as mere "add-ons" that come with the MARC record format. The worst of it is that they are almost uniformly redundant with the textual data yet must be filled in separately, an extra step in the cataloging process that some cannot afford.

The upshot of this is that it is very hard to operate over library catalog data algorithmically. It is also very difficult to do any efficient machine validation to enforce consistency in the data. If we carry that same data and those same practices over to a different metadata schema, it will still be very hard to operate over algorithmically, and it will still be hard to do quality control as a function of data creation.
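As a minimal sketch of what machine validation at the point of data creation could look like, the Python below checks a deliberately simplified record against two rules. The record structure (a list of tag/value pairs), the rule sets, and the `validate` function are all hypothetical illustrations; real MARC records also carry indicators and subfields, which are omitted here.

```python
# Hypothetical, simplified stand-in for a bibliographic record:
# a list of (tag, value) pairs. Indicators and subfields are omitted.
REQUIRED_TAGS = {"245"}        # assumed rule: a title field must be present
NON_REPEATABLE_TAGS = {"245"}  # assumed rule: at most one title field


def validate(record):
    """Return a list of rule violations for a simplified record."""
    problems = []
    tags = [tag for tag, _ in record]
    for tag in sorted(REQUIRED_TAGS):
        if tag not in tags:
            problems.append(f"missing required field {tag}")
    for tag in sorted(NON_REPEATABLE_TAGS):
        if tags.count(tag) > 1:
            problems.append(f"field {tag} repeated")
    return problems


good = [("100", "Melville, Herman"), ("245", "Moby Dick"), ("650", "Whaling")]
bad = [("650", "Whaling"), ("650", "Sea stories")]

print(validate(good))  # no violations
print(validate(bad))   # the title field is missing
```

Rules of this kind are only enforceable when the data design states them unambiguously, which is the point made above about fields that are "either this or that".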

The counter argument to this is that cataloging is not a rote exercise - that catalogers must make complex decisions that could not be done by machines. If cataloging were subject to the kinds of data entry rules that are used in banking and medical and other modern systems, then the creativity of the cataloger's work would be lost, and the skill level of cataloging would drop to mere data entry.

This is the same argument you could use for any artisanal activity. If we industrialize the act of making shoes, the skills of the master shoe-maker are lost. However, if we do not industrialize shoe production, only a very small number of people will be able to afford to wear shoes.

This decision is a hard one, and I sympathize with the catalogers who are very proud of their understanding of the complexity of the bibliographic world. We need people who understand that complexity. Yet increasingly we are not able to afford to support the kind of cataloging practices of which we are proud. Ideally, we would find a way to channel those skills into a more efficient workflow.

There is a story that I tell often: In the very early days of the MARC record, around the mid-1970's, many librarians thought that we could never have a "computer catalog" because most of our cataloging existed only on cards, and we could NEVER go back and convert the card catalogs, retype every card into MARC. At that same time, large libraries in the University of California system were running over 100,000-150,000 cards behind in their filing. For those of you who never filed cards... it was horribly labor intensive. Falling 150,000 cards behind meant that a book was on the shelf THREE MONTHS before the cards were in the catalog. Some of this was the "fault" of OCLC which was making it almost too easy to create those cards. Another factor was a great increase in publishing that was itself facilitated by word processing and computer-driven typography. Within less than a decade it became more economical to go through the process of conversion from printed cards to online catalogs than to continue to maintain enormous card catalogs. And the rest is history. MARC, via OCLC, created a filing crisis, and in a sense it was the cost of filing that killed the card catalog, not the thrill of the modern online catalog.

The terrible mistake that we made back then was that we did not think about what was different between the card catalog and the online catalog, and we did not adjust our data creation accordingly. We carried the legacy data into the new format which was a disservice to both catalogers and catalog users. We missed an opportunity to provide new discovery options and more efficient data creation.

We mustn't make this same mistake again.

The Precipitant
Above I said that libraries made the move into computer-based catalogs because it was uneconomical to maintain the card catalog. I don't know what the precipitant will be for our current catalog model, but there are some rather obvious places to look to for that straw that will break the MARC/ILS back. These problems will probably manifest themselves as costs that require the library to find a more efficient and less costly solution. Here are some of the problems that I see today that might be factors that require change:

  • The output rate of intellectual and cultural products is increasing. Libraries have already responded to this through shared cataloging and purchase of cataloging from product vendors. However, the records produced in this way are then loaded into thousands of individual catalogs in the MARC-using community.
  • Those records are often edited for correctness and enhanced. Thus they are costing individual libraries a large amount of money, potentially as much or more than libraries save by receiving the catalog copy.
  • Each library must pay for a vendor system that can ingest MARC records, facilitate cataloging, and provide full catalog user (patron) support for searching and display.
  • "Sharing" in today's environment means exporting data and sending it as a file. Since MARC records can only be shared as whole records, updates and changes generally are done as a "full record replace" which requires a fair amount of cycles. 
  • The "raw" MARC record as such is not database friendly, so records must be greatly massaged in order to store them in databases and provide indexing and displays. (Another way to say this is that there are no database technologies that know about the MARC record format. There are database technologies that natively accept and manage other data formats, such as key-value pairs.)
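
To illustrate the "massaging" mentioned in the last point, here is a rough Python sketch that groups a simplified record into key-value form and serializes it as JSON, a shape that many document and key-value stores accept directly. The record layout and the `to_key_value` helper are hypothetical; a real conversion would also need to handle indicators, subfields, and field order.

```python
import json
from collections import defaultdict


def to_key_value(record):
    """Group a simplified (tag, value) record into a tag -> list-of-values mapping."""
    grouped = defaultdict(list)
    for tag, value in record:
        grouped[tag].append(value)  # repeatable fields keep every occurrence
    return dict(grouped)


# Hypothetical simplified record with one repeated field.
record = [("245", "Moby Dick"), ("650", "Whaling"), ("650", "Sea stories")]
doc = to_key_value(record)

print(json.dumps(doc, sort_keys=True))
```

Once the data is in this shape, a single value can in principle be updated without replacing the whole record.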

There are some current technologies that might provide solutions:

  • Open source. There is already use of open source technology in some library projects. Moving more toward open source would be facilitated by moving away from a library-centric data standard and using at least a data structure that is commonly deployed in the information technology world. Some of this advantage has already been obtained with using MARCXML.
  • The cloud. The repeated storing of the same data in thousands of catalogs means not being able to take advantage of true sharing. In a cloud solution, records would be stored once (or in a small number of mirrors), and a record enhancement would enhance the data for each participant without being downloaded to a separate system. This is similar to what is being proposed by OCLC's WorldShare and Ex Libris' Alma, although presumably those are "starter" applications. Use of the cloud for storage might also mean less churning of data in local databases; it could mean that systems could be smaller and more agile.
  • NoSQL databases and triple stores. The current batch of databases are open source, fast, and can natively process data in a variety of formats (although not MARC). Data does not have to be "pre-massaged" in order to be stored in a database or retrieved and the database technology and the data technology are in sync. This makes deployment of systems easier and faster. There are NoSQL database technologies for RDF. Another data format that has dedicated database technology is XML, although that ship may have sailed by now.
  • The web. The web itself is a powerful technology that retrieves distributed data at astonishing rates. There are potential cost/time savings on any function that can be pushed out to the web to make use of its infrastructure.

The change from MARC to ?? will come and it will be forced upon us through technology and economics. We can jump to a new technology blindly, in a panic, or we can plan ahead. Duh.

Evergreen ILS: Thursday Sponsors

planet code4lib - Thu, 2017-04-06 16:04

Every year the Evergreen conference can be as exhausting as it is fun. Remember to stay hydrated and keep your energy up! This year we’ve placed breaks and the reception in the vendor area so that you can see some of the great services these organizations offer to the community.

We’d like to highlight the sponsors this year that are making these opportunities possible:

Consortium of Ohio Libraries (COOL) and Kenton County Public Library sponsored our breakfast.

The Sage Library Consortium sponsored our afternoon breaks.

MARCive is sponsoring the afternoon break.

Equinox Open Library Initiative is sponsoring the reception.

Please take a moment to thank each of these for their support of the conference and community involvement.

DPLA: Welcome to the Windy City!

planet code4lib - Thu, 2017-04-06 15:13

DPLAfest 2017 is around the corner and we are thrilled to be heading to Chicago, one of the nation’s most dynamic cities, to learn, converse, and explore with you, both during and beyond the fest. Whether you are a Chicago newcomer or a native, check out highlights below for local institutions and events to explore while in town for the fest. For those participating in the fest both near and far, we are also excited to take this opportunity to showcase the digital diversity of Chicago’s rich cultural heritage community.

Chicago Skyline postcard, ca. 1930s, from the collection of Boston Public Library via Digital Commonwealth.


First and foremost, Meet our Hosts

Chicago Public Library
You will get to know Chicago Public Library’s Harold Washington Library Center well during DPLAfest. Named for Chicago’s first African American mayor, it serves as the main library for the 80 CPL locations across the city and is the largest public library building in the world. Look for these exhibitions on display throughout the library during the fest.

Black Metropolis Research Consortium
Based at the University of Chicago, the Black Metropolis Research Consortium (BMRC) is a Chicago-based membership association of libraries, universities, and other archival institutions. BMRC provides extensive access to materials on African American and African diasporic culture, history, and politics, with a specific focus on materials relating to Chicago.

Chicago Collections
Chicago Collections is a consortium of 25 libraries, museums, and archives that collaborate to preserve and share the history and culture of the Chicago region through their digital collections. Together, these organizations offer a diverse and dynamic representation of Chicago’s history and its people.

Reaching Across Illinois Library System (RAILS)
RAILS provides continuing education, consulting, e-book projects, delivery, shared online catalogs, talking book services, and other innovative services to all types of libraries (academic, public, school, and special) in northern/western Illinois and beyond. See all the details at

Digital Diversity

Whether you will be in town or following the festivities from home, there will be many opportunities for you to explore the digital diversity of Chicago and the broad range of available cultural institutions and collections.

Highlights include:

A photograph of African American teens practicing dance steps, 1955, from the collection of the University of Illinois at Chicago via Illinois Digital Heritage Hub.

Explore Chicago Collections
Investigate the collections of 25 partner institutions across the city of Chicago through this digital consortium.

Leather Archives and Museum
A RAILS member institution, the Leather Archives and Museum compiles, preserves, and maintains history and memorabilia documenting leather and related lifestyles, including but not limited to the Gay and Lesbian communities, for historical, educational and research purposes.

The Sterling Morton Library at the Morton Arboretum
A RAILS member institution and a unique resource in the Chicago area, the Morton Library is devoted mainly to the literature of botany and horticulture, especially as it relates to trees and shrubs that can be grown in northern Illinois.

BMRC Photo Gallery
Explore BMRC’s digitized photo collection documenting African American and African diasporic culture, history, and politics in Chicago.

Illinois Digital Heritage Hub
A collaboration between lead partners Chicago Public Library, Illinois State Library, University of Illinois-Urbana Champaign, and the Consortium of Academic and Research Libraries in Illinois (CARLI), the Illinois Digital Heritage Hub is one of the newest DPLA hubs, with a broad range of materials on the history and culture of Chicago, with even more to come in the future!

Beyond the Internet: Exploring Chicago in Real Life

Chicago for the Tourist book cover, 1912, from the collection of the University of Illinois at Urbana-Champaign Library.

Explore Chicago outside the fest and beyond the internet at some of the nation’s premier arts and culture institutions. Below are a few highlights. Feel free to offer other fest-goers your personal recommendations on Twitter using #DPLAfest.

Museums and Libraries

The Art Institute of Chicago

DPLAfest Presenter: Hear from Art Institute of Chicago staff members in Digital Publishing at the Art Institute of Chicago: Discoverability, Adaptability, and Access.

The Field Museum, including special exhibition Tattoo, on view now.

DPLAfest Presenter: Hear from The Field Museum’s Alaka Wali in The Walking Art Collection: How Tattoos Tell Stories of Identity.

The DuSable Museum of African American History

The Newberry Library

Chicago History Museum

National Museum of Mexican Art

Tour Wrigley Field, home of the 2016 World Series champion Chicago Cubs

Arts and Culture

Chicago Latino Film Festival and poster exhibit at Chicago Public Library, opens April 20 – Hosted by the International Latino Cultural Center of Chicago, the two-week festival will feature over 100 feature and short films from across Latin America.

Hamilton, An American Musical at PrivateBank Theater – Do not throw away your shot to (maybe, hopefully) get tickets to this Broadway smash hit!

Chicago International Music and Movies Festival (CIMMfest) Spring Fling Thing, April 21-23 – a mini-film festival showcasing movies, music, and more.

March for Science, April 22 – Join Chicago’s March for Science on Saturday following the fest and march alongside thousands of people across the country to protect the future of scientific research and promote the importance of scientific facts.

An illustrated map of Chicago, 1931, from the collection of the University of Illinois at Urbana-Champaign Library.


Chicago is home to the first Ferris wheel, skyscrapers, and deep dish pizza – can you blame us for our excitement for you to visit? See you in two weeks!

Library of Congress: The Signal: New Home and Features for Sustainability of Digital Formats Site

planet code4lib - Thu, 2017-04-06 13:50

This is a guest post by Kate Murray, IT Specialist in the Library of Congress’s Digital Collections and Management Services.

The Library of Congress’ Sustainability of Digital Formats Web site (informally just known as “Formats”) details and analyzes the technical aspects of digital formats with a focus towards strategic planning regarding formats for digital content, especially collection policies.  Launched in 2004, Formats provides in-depth descriptions of over 400 formats sorted into content categories: still image, sound, textual, moving image, Web archive, datasets, geospatial and generic formats with more to come. There are other publicly available format assessment tools in the community at large including the British Library Format Assessments (via DPC wiki) and Harvard Library’s Digital Preservation Format Assessments just to name a few (see the iPRES 2016 workshop on Sharing, Using and Re-using Format Assessments for more examples) but in part, what makes the LC Formats resource unique is the fact that we document relationships between formats (subtypes and the like), especially the way wrappers and encodings interact when used together – what we call a “combo pack.”

Caption: Example of combo pack FDD that describes Apple ProRes 422 within the QuickTime wrapper.

Formats is also well-known for what we consider when evaluating formats including the seven sustainability factors and the quality and functionality factors which vary depending on the content category.

What’s New

Not ones to rest on our laurels, we are excited to announce recent updates and improvements for Formats. First, it has moved to a new URL. Each page has a page-level redirect to bring users to the correct site. Content at the old URL is no longer revised, so be sure to update your bookmarks to get the most current information.

One of the new additions to Formats is the inclusion of the PRONOM Persistent Unique Identifier (PUID) and Wikidata Title ID information in order to help establish the correct relationships to these other format assessment resources. One example is the open source format identification tool Siegfried, which includes both LC’s format document descriptions and PRONOM information in its results. It’s important to recognize that there’s not always a perfect match across resources for a variety of reasons – maybe the versions aren’t consistently described, for example – but when there is a good match, we’ll include it. It’s more complicated than just looking for matching format extensions like .tif or .wav. There’s an intellectual research component to correctly pair like with like, so it takes a bit of time. We’re working our way through the list of format document descriptions and adding as we can – it’s an ongoing project.

Caption: PUID and Wikidata Title ID links along with other file signifiers for the WAVE Audio File Format.

In addition, we’ve also added links to formats listed in the Recommended Formats Statement to better connect these related resources.

Caption: RFS link for TIFF, Revision 6.0.

A reminder that all format document descriptions are available for download in XML, either as individual pages or as the entire set in a zip file.

Formats continues to evolve to meet the Library’s and the digital preservation community’s changing needs. Stay tuned for announcements about the posting of new format descriptions – we have much more to come.

Evergreen ILS: We Thank Equinox!

planet code4lib - Thu, 2017-04-06 13:00

Every year the community comes together for the annual Evergreen International Conference, this year in Covington, Kentucky! One full day in and it’s already been a great conference with lots of great interaction and education. And while we can’t do the conference without the great presenters we also need support from our sponsors and today we are highlighting the Equinox Open Library Initiative.

Twitter: @EquinoxOLI

Equinox has been a long-term member of the community, in fact preceding it. Equinox includes some of the original developers of the Evergreen ILS and continues to provide support services for Evergreen as well as hosting, development and consulting.

Open Knowledge Foundation: Frictionless Data Case Study: John Snow Labs

planet code4lib - Thu, 2017-04-06 12:26

Open Knowledge International is working on the Frictionless Data project to remove the friction in working with data. We are doing this by developing a set of tools, standards, and best practices for publishing data. The heart of Frictionless Data is the Data Package standard, a containerization format for any kind of data based on existing practices for publishing open-source software.

We’re curious to learn about some of the common issues users face when working with data. In our Case Study series, we are highlighting projects and organisations who are working with the Frictionless Data specifications and tooling in interesting and innovative ways. For this case study, we interviewed Ida Lucente of John Snow Labs. More case studies can be found at

What does John Snow Labs do?

John Snow Labs accelerates data science and analytics teams, by providing clean, rich and current data sets for analysis. Our customers typically license between 50 and 500 data sets for a given project, so providing both data and metadata in a simple, standard format that is easily usable with a wide range of tools is important.

What are the challenges you face working with data?

Each data set we license is curated by a domain expert and then goes through both an automated DataOps platform and a manual review process. This is done in order to deal with a string of data challenges. First, it’s often hard to find the right data sets for a given problem. Second, data files come in different formats, and include dirty and missing data. Data types are inconsistent across different files, making it hard to join multiple data sets in one analysis. Null values, dates, currencies, units and identifiers are represented differently. Datasets aren’t updated on a standard or public schedule, which often requires manual labor to know when they’ve been updated. And then, data sets from different sources have different licenses – we use over 100 data sources, which means well over 100 different data licenses that we help our clients be compliant with.
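
One of the challenges named above, inconsistent representation of null values, can be sketched in a few lines of Python. This is only an illustration: the set of tokens treated as null and the `normalise_nulls` helper are assumptions, not a description of the John Snow Labs pipeline.

```python
# Assumed set of strings that different sources use to mean "no value".
NULL_TOKENS = {"", "na", "n/a", "null", "none", "-"}


def normalise_nulls(row):
    """Replace any cell matching a known null token with None; trim the rest."""
    return [
        None if cell.strip().lower() in NULL_TOKENS else cell.strip()
        for cell in row
    ]


row = normalise_nulls(["42", "N/A", " ", "Paris"])
print(row)
```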

How are you working with the specs?

The most popular data format in which we deliver data is the Data Package. Each of our datasets is available, among other formats, as a pair of data.csv and datapackage.json files, complying with the Data Package specs. We currently provide over 900 data sets that leverage the Frictionless Data specs.
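
For readers unfamiliar with the pairing of data.csv and datapackage.json, the following Python sketch writes a tiny example of each using only the standard library. The dataset name, column names, and the `write_package` helper are hypothetical; the descriptor keys used here (`name`, `resources`, `path`, `schema`, `fields`, `type`) follow the Data Package and Table Schema specifications, but this is an illustration rather than a complete or validated package.

```python
import csv
import json
import tempfile
from pathlib import Path


def write_package(directory, rows):
    """Write data.csv plus a minimal datapackage.json describing it."""
    directory = Path(directory)
    with open(directory / "data.csv", "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["id", "name", "population"])  # hypothetical columns
        writer.writerows(rows)

    descriptor = {
        "name": "example-dataset",  # hypothetical dataset name
        "resources": [
            {
                "name": "data",
                "path": "data.csv",
                "schema": {
                    "fields": [
                        {"name": "id", "type": "integer"},
                        {"name": "name", "type": "string"},
                        {"name": "population", "type": "integer"},
                    ]
                },
            }
        ],
    }
    (directory / "datapackage.json").write_text(
        json.dumps(descriptor, indent=2), encoding="utf-8"
    )
    return descriptor


with tempfile.TemporaryDirectory() as tmp:
    descriptor = write_package(tmp, [(1, "Springfield", 30000)])
    files = sorted(p.name for p in Path(tmp).iterdir())

print(files)
```

The metadata travels alongside the data as a plain JSON file, so any tool that can read CSV and JSON can consume the pair.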

How did you hear about Frictionless Data?

Two years ago, when we were defining the product requirements and architecture, we researched six different standards for metadata definition over a few months. We found Frictionless Data as part of that research, and after careful consideration have decided to adopt it for all the datasets we curate. The Frictionless Data specifications were the simplest to implement, the simplest to explain to our customers, and enable immediate loading of data into the widest variety of analytical tools.

What else would you like to see developed?

Our data curation guidelines have added more specific requirements that are underspecified in the standard. For example, there are guidelines for dataset naming, keywords, length of the description, field naming, identifier field naming and types, and some of the properties supported for each field. Adding these to the Frictionless Data standard would make it harder to comply with the standard, but would also raise the quality bar of standard datasets; so it may be best to add them as recommendations.

Another area where the standard is worth expanding is more explicit definition of the properties of each data type – in particular geospatial data, timestamp data, identifiers, currencies and units. We have found a need to extend the type system and properties for each field’s type, in order to enable consistent mapping of schemas to different analytics tools that our customers use (Hadoop, Spark, MySQL, ElasticSearch, etc). We recommend adding these to the standard.
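
The kind of schema mapping described here can be sketched as a lookup from field types to a target tool's column types. In this Python illustration the type mapping, the fallback to `TEXT`, the table and field names, and the `create_table_sql` helper are all assumptions made for the example; an actual mapping would depend on the target system and on the extended type properties discussed above.

```python
# Assumed mapping from schema field types to SQL column types.
TYPE_MAP = {
    "string": "TEXT",
    "integer": "BIGINT",
    "number": "DOUBLE",
    "boolean": "BOOLEAN",
    "date": "DATE",
    "datetime": "DATETIME",
}


def create_table_sql(table, fields):
    """Build a CREATE TABLE statement from a list of {name, type} fields."""
    columns = []
    for field in fields:
        sql_type = TYPE_MAP.get(field["type"], "TEXT")  # unknown types fall back to TEXT
        columns.append(f"{field['name']} {sql_type}")
    return f"CREATE TABLE {table} ({', '.join(columns)});"


# Hypothetical schema fields for illustration.
fields = [
    {"name": "id", "type": "integer"},
    {"name": "recorded_on", "type": "date"},
    {"name": "amount", "type": "number"},
]
sql = create_table_sql("measurements", fields)
print(sql)
```

A field such as a currency amount shows the gap: without extra properties for unit or currency, the mapping can only say "number".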

What are the next things you are going to be working on yourself?

We are working with Open Knowledge International on open sourcing some of the libraries and tools we’re building. Internally, we are adding more automated validations, additional output file formats, and automated pipelines to load data into ElasticSearch and Kibana, to enable interactive data discovery & visualization.

What do you think are some other potential use cases?

The core use case is making data ready for analytics. There is a lot of Open Data out there, but a lot of effort is still required to make it usable. This single use case expands into as many variations as there are BI & data management tools, so we have many years of work ahead of us to address this one core use case.

Open Knowledge Foundation: OpenCon 2017 Srinagar celebrating International Open Data Day

planet code4lib - Thu, 2017-04-06 09:08

This blog is part of the event report series on International Open Data Day 2017. On Saturday 4 March, groups from around the world organised over 300 events to celebrate, promote and spread the use of open data. 44 events received additional support through the Open Knowledge International mini-grants scheme, funded by SPARC, the Open Contracting Program of Hivos, Article 19, Hewlett Foundation and the UK Foreign & Commonwealth Office. This event was supported through the mini-grants scheme under the Open Research theme.

The International Open Data Day at the University of Kashmir, India, was celebrated as a satellite event titled “OpenCon 2017 Srinagar: Celebrating International Open Data Day” on 4th March 2017.


The event was organised for the first time in the valley with the aim of introducing scholars, researchers, students and the teaching community to the availability and benefits of Open Research Data. The concept of open data is not very common among the research community, although the university promotes and stands for open access. Therefore the organisers emphasised the concept and importance of open data, especially for research and allied areas.

The overwhelming participation of more than 150 researchers, scholars and faculty members revealed their keen interest in the theme and their curiosity about the availability and use of open data sets in different settings.

The full-day event was divided into different sessions and started with the inaugural session, where Dr. Ajaz H. Wani (Scientist-D, Department of Biotechnology) introduced the concept of open data by showcasing some examples of everyday data sets generated and populated in different sectors, such as Google Maps, and from the field of Biotechnology. Another address by Mr Ajaz ul Haq (Producer, Electronic Multimedia Research Centre) laid emphasis on the various dimensions of openness and highlighted the importance of understanding the difference between the terms open and free.

In the next session, “OpenCon Webcast: OpenData 101” by Ross Mounce was screened. It educated participants on the basics of open data, the legal and technical aspects of open data and the issue of privacy – why not all data should be opened. The participants and experts present networked during lunch, exchanging ideas and sharing experiences and concerns.

Dr Zahid Ashraf Wani (Assistant Professor, Department of Library & Information Science) gave a presentation on the Registry of Research Data Repositories and highlighted the availability and usefulness of research data repositories in different subject areas.

Nadim Akhtar Khan (Assistant Professor, Department of Library & Information Science) introduced the projects available through  Open Knowledge Labs. These include: CKAN, Frictionless Data, FutureTDM, Open Data for Development (OD4D), Open Budgets EU, Open Data Handbook, OpenSpending, OpenTrials, School of Data etc. The participants were also encouraged to use for understanding, creating and sharing data sets in real environments.

After the presentations, there was a panel discussion where panelists from different subject areas shared their experiences regarding the availability of research datasets in their respective domains and their observations regarding open data. Dr Abdul Majid Baba (University Librarian & Head, DLIS) emphasised the importance of open access in the present research environment. Professor Bashir Ahmad Joo (Department of Management Studies) highlighted the importance of open data in business and finance and presented some good examples of open data in the banking sector for ready reference, utilisation and drawing inferences.

Dr Mohammad Tariq Banday (Head of the Department of Electronics and Instrumentation Centre) highlighted the importance of open data for researchers in the sciences. He emphasised that using and testing open data in local research environments will be more beneficial for quality research. He also talked about the importance of making more data open in subject areas like electronics and how that will go a long way in strengthening the research domain. Dr Masood Rizvi (Assistant Professor, Department of Chemistry) shared his experience of using research tools such as ResearchGate to share research datasets openly, and described how this helps establish the quality and impact of individual research efforts at the global level.

Dr Sumeer Gul (Assistant Professor, Department of Library & Information Science) emphasised the basic concepts underpinning open access and highlighted the importance of open access publishing and open archives/repositories for the teaching and research community, while Mrs Rosy Jan (Assistant Professor, Department of Library & Information Science) deliberated upon the role of libraries and information centres in promoting open access and an open environment for research.

Dr Zahid Ashraf Wani also talked about the importance of open data for economically poor nations and its implications for building more vibrant research communities at the global level. The sharing of research data from poorer regions like ours can be a boon in terms of potential global collaboration and can help the region make the most of the infrastructure facilities available in the developed world.

In his concluding remarks after the panel discussion, Nadim Akhtar Khan called upon all the participants to use the Open Data Day deliberations as a basis for giving serious consideration to understanding, using and sharing open data. The participants were asked to make use of Open Knowledge Labs to further strengthen their understanding of open data and its use.

The feedback from participants made our tiring day worthwhile, because most of them became confident about experimenting with open data and about creating small local groups to discuss and share experiences and issues. We are confident that Open Data Day 2017 is a first step towards embracing openness and will lead to a vibrant research culture with more transparency and more options for reusing existing datasets. The most amazing part of the event was that the final-semester MLIS students showed keen interest in the deliberations and were actively involved in the discussions.

Although the preparations for the ODD celebrations started late, the teamwork and tireless efforts of the teaching and non-teaching members of the Department of Library & Information Science, University of Kashmir made the event possible. Our special thanks go to the Honourable Vice-Chancellor and the Registrar for approving the event at such short notice. Our gratitude also goes to the Director of EMMRC for providing the auditorium and video coverage of the event.

We would be failing in our duty if we did not thank Lorraine Chuen of SPARC OPEN for providing us with a very vibrant OpenCon platform for organising and scheduling this event using Sched (an event management tool), which saved us a lot of effort and time. We are immensely thankful to SPARC and Open Knowledge Foundation for the mini-grant, which was used to meet the various expenses of holding the event successfully.

For the detailed schedule you can visit: or

HangingTogether: Best Practices for Web Archiving Metadata: Watch This Space!

planet code4lib - Wed, 2017-04-05 17:11

Some of you may recall that back in 2015 we surveyed our OCLC Research Library Partners to determine their top challenges with web archiving, and the need for guidance on metadata practices emerged as #1. In response, early in 2016 we established a Web Archiving Metadata Working Group (WAM) to develop best practices for metadata. The group did extensive background research over the past year, and we’re now on a fast track to publish three reports in the next several months. In the meantime, you can read a substantial overview of the project in this article published last Friday in the online Journal of Western Archives.

The first two reports will underpin the best practices: one on tools available for capture of websites, with a focus on their metadata extraction capabilities; and a review of the literature on metadata needs of web archives users.

The best practice guidelines will be in the third report. In addition to defining and interpreting a set of data elements, the report will articulate differences between bibliographic and archival standards; contrast approaches to description of individual websites and collections; and include both a literature review focused on metadata issues and crosswalks to related standards.

WAM established several principles to underpin the best practices. They are intended to …

  • … address the needs of users of archived websites as determined by our literature review
  • … be community-neutral, standards-neutral, and output-neutral; in other words, applicable to any context in which metadata for archived websites is needed
  • … consist of a relatively lean set of data elements, with the scope of each defined (i.e., a data dictionary)
  • … interpret each element for description of archived websites, which, unlike books or serials or published audiovisual media, have no conventions for representing elements such as creators, dates, or extent
  • … be upward-compatible with standards that have far deeper data element sets, including RDA, MARC, DACS, EAD, and MODS

We are in the process of finalizing the set of data elements and have adopted the following so far:

  • title
  • creator
  • contributor
  • date
  • description
  • extent
  • identifier
  • language
  • subject
  • genre

These may seem both obvious and straightforward, but most need definition and interpretation for the web context. One example: what types of date are both feasible to determine and important to include, and how can their meaning be made clear? Additional elements under consideration include geographic coverage, publisher, rights, access, source of description, URL, and collector (or should the latter be owner? or repository? or location?). We’ve eliminated from consideration several that don’t have specific applicability to websites, including audience and statement of responsibility.
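As a rough illustration only, the ten elements adopted so far could be held in a simple record like the Python sketch below. The class name, the field types, the helper function and the sample values are all assumptions made for this example; they are not part of the WAM draft, which has yet to define and interpret each element.

```python
from dataclasses import dataclass, field, asdict
from typing import List, Optional


@dataclass
class ArchivedWebsiteRecord:
    """Hypothetical record holding the ten elements adopted so far."""
    title: str
    creator: List[str] = field(default_factory=list)
    contributor: List[str] = field(default_factory=list)
    date: Optional[str] = None          # which kind of date is still an open question
    description: Optional[str] = None
    extent: Optional[str] = None
    identifier: Optional[str] = None
    language: Optional[str] = None
    subject: List[str] = field(default_factory=list)
    genre: Optional[str] = None


def missing_elements(record: ArchivedWebsiteRecord) -> List[str]:
    """Return the names of elements that are empty in this record."""
    return [name for name, value in asdict(record).items() if not value]


# Invented sample values, for illustration only.
record = ArchivedWebsiteRecord(
    title="Example archived website",
    creator=["Example Organization"],
    language="en",
)
print(missing_elements(record))
```

A helper like `missing_elements` is one way a cataloger might see at a glance which elements of a lean record still need attention.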

We’ll be circulating the draft best practices widely across the library and archives community and are hoping to hear from many who are struggling to describe websites and collections. Our aim is to promulgate best practices that will encourage use of metadata that is both meaningful and useful to users of these resources.

Stay tuned!


About Jackie Dooley

Jackie Dooley leads OCLC Research projects to inform and improve archives and special collections practice.


Evergreen ILS: Thank you PAILS/SPARK

planet code4lib - Wed, 2017-04-05 16:09

We want to thank PAILS/SPARK for being kind enough to sponsor our luncheon today during the pre-conference! We could not make the conference work without community members like PAILS and their generosity is appreciated!

Open Knowledge Foundation: Youth Association for Development (YAD) Pakistan celebrates ODD17 in Pakistan

planet code4lib - Wed, 2017-04-05 13:00

This blog is part of the event report series on International Open Data Day 2017. On Saturday 4 March, groups from around the world organised over 300 events to celebrate, promote and spread the use of open data. 44 events received additional support through the Open Knowledge International mini-grants scheme, funded by SPARC, the Open Contracting Program of Hivos, Article 19, Hewlett Foundation and the UK Foreign & Commonwealth Office. This event was supported through the mini-grants scheme under the Open contracting and tracking public money flows theme.

It is important to acquire skills to intervene effectively in democratic processes. The Youth Association for Development (YAD)-Pakistan celebrated the annual Open Data Day on March 4th, 2017 in Quetta, Pakistan. The themes of the day were “Our Money Our Responsibility” and “Open Data is Oil, Open Data is Soil”. We looked at the demand for an open budget, open tenders, open bids, open jobs, open procurement and open recruitment to make public institutions accountable to citizens, and at the need to keep all public data open and accessible online to every citizen to ensure transparency and accountability.

During the day, different speakers and facilitators delivered speeches on open data and informed the participants that the government is responsible for prescribing regulations and procedures for public procurement and must make sure these are followed to the letter. This is important to improve governance, management, transparency, accountability and the quality of public acquisition of goods, works and services.

They also mentioned the importance of monitoring the procurement of public sector agencies/organisations and of keeping a record that is open to everyone, as is done in some developed nations that have revolutionised their economies through e-procurement systems. For some time now, there has been a need within the government to develop an e-procurement system for Pakistan. The growth of the public and private sectors is equally important for a stable economy. However, both sectors need to be on the same page to revitalise the essence of an Electronic Government Procurement (e-GP) system as an integral part of an overall strategic procurement plan. The plan should include, but not be limited to, strategic sourcing or supplier rationalisation, automation of the manual procurement system, and participation in one or more marketplaces.

To support the development, implementation and operation of e-procurement systems, governments should take several different business approaches; the choice of business model depends on the amount of risk and cost a government is willing to undertake when implementing its e-GP system. The systems should also contain valuable information for monitoring and auditing government purchases, confidential information of vendors, procurement initiatives, responses to bids and payment information. It is important to note that e-procurement systems fundamentally provide a service to support the exchange of information and therefore remain independent of the procurement process itself. A certain level of security is required to ensure the integrity, transparency and privacy of procurement processes, and also to enable marginalised citizens to become agents of social action and social change through open data.

Our speakers mentioned that the public sector, and especially the health department in Baluchistan [one of Pakistan’s four provinces], is facing several issues, with no clear mechanism for data and open data, because the health sector is managed quite poorly in Baluchistan. While the national doctor-to-patient ratio (1:1000) and nurse-to-patient ratio (1:50) in Pakistan are already poor, they are even worse in Baluchistan. For example, 11 million kids died before reaching the age of five in Baluchistan. The maternal mortality rate is alarming at 785 per 100,000 live births, while the infant mortality rate is 97 per 1,000 live births.

In rural areas of Baluchistan, health services are at a low ebb, and the state of hospitals, RHCs and BHUs is also not encouraging. This is backed by the Health Management Information System. However, due to the unavailability of open data on the situation, the health sector is facing complaints from stakeholders, including donor agencies, that some EPI managers and DHOs were found compiling and submitting fake agenda on health conditions in their respective districts.

Also, because data on the health sector of Baluchistan is unavailable, the province is facing several problems in monitoring, tracking, evaluation, policy, planning, and disease control, prevention and care. Most of the data is not open: it exists only in hard copy and is not available or accessible to the public. There is a need to build the capacities of the health department so that it can develop critical provincial databases that cover all health sectors, health providers and health facilities with their identification codes, as well as routine operations and budget planning, and so practise in an evidence-based manner.

We also discussed the need for the government to take the necessary steps to introduce openness, transparency and accountability across all its departments, and to engage in practices that support citizen participation as well as responsive and accountable states. This will eventually lead to greater access to state services and resources, greater realisation of rights, and enhanced state responsiveness and accountability.

The event concluded with thanks to the Chief Executive Officer of YAD, Atta ul Haq Khaderza.

In the Library, With the Lead Pipe: Spring Reading

planet code4lib - Wed, 2017-04-05 13:00

It’s time for spring cleaning, and your editors here at In the Library with the Lead Pipe are cleaning out our bookmarks, bedside reading piles, and saved articles folders. We’re revisiting some great recent reads in the process. Here’s a selection of things we’ve been reading and that we think you might enjoy, too. Feel free to add your own spring reading recommendations in the comments.


Annie recommends:

“On ‘Diversity’ as Anti-Racism in Library and Information Studies: A Critique” by David James Hudson

I feel like this is the article on “diversity” that everyone should read. Hudson looks at the literature that talks about diversity as the primary discursive mode of anti-racism in LIS and pushes us to think deeper on these issues. His article is also published in the very first issue of the Journal of Critical Library and Information Studies, which he states is a “potential site of critical exchange from which to articulate a sustained critique of race in and through our field.”

“Considerations on Mainstreaming Intersectionality” by Rita Kaur Dhamoon*

I read this article in preparation for a presentation that I did at ACRL, and I feel it gives a really good overview of how theorists have applied and built upon the theory of intersectionality. Dhamoon goes on to detail five considerations “when operationalizing an intersectional-type research paradigm.” As Lead Pipe publishes more articles that discuss identity, structural inequities, and relationships of power and difference, I find that taking an intersectional approach would help people understand the complexities of these relationships.

*I realize that this is a paywalled article, but if you want to read more of her writing that is published in an OA journal, check out “A Feminist Approach to Decolonizing Anti-Racism: Rethinking Transnationalism, Intersectionality, and Settler Colonialism.”


Bethany recommends:

Democracy & Education: An Introduction to the Philosophy of Education by John Dewey

Right now I’m reading up on cultivating shared physical spaces because I am at a campus where four academic institutions reside collectively. I haven’t read the entire book, but I’ve found some great nuggets of insight so far. My favorite quote to date is: “Whether we permit chance environments to do the work, or whether we design environments for the purpose makes a great difference” (p. 22).

Learning Spaces: Creating Opportunities for Knowledge Creation in Academic Life by Maggi Savin-Baden

This book gets to the heart of what I believe academics work to nurture in higher education. I’ve been reading through it to develop a solid literature review for an upcoming article I’m planning to write. I appreciate the correlation the author makes between learning spaces and their ability to transform individuals’ perspectives. According to Savin-Baden, “Learning spaces are often places of transition, and sometimes transformation, where the individual experiences some kind of shift or reorientation in their life world” (p. 8).


Amy recommends:

“Gendered Labor and Library Instruction Coordinators: The Undervaluing of Feminized Work” by Veronica Arellano Douglas and Joanna Gadsby

This paper from a presentation at ACRL 2017 explores the ways in which the feminization of librarianship has influenced institutional organization structures, resulting in the proliferation of coordinator positions with responsibilities that tend more toward administrative and relational work than do other higher-level roles like managers and supervisors. This paper may be speaking specifically about instruction coordinators in academic libraries, but I see a lot here that speaks to my position in a public library, where a large part of my job is program coordination. The authors have me reflecting in particular on the amount of relationship maintenance I do in my work.

“Why ‘Rock Star Librarian’ is an Oxymoron” by Allie Jane Bruce

The team of contributors at Reading While White always gets me thinking about a perspective I’ve thus far missed in my own reading and critical evaluation, but this piece particularly resonates—especially as invitations to publisher events at ALA Annual begin to trickle in. This editorial is in response to the Wall Street Journal’s March 5 article about “rock star” librarians, and more specifically in response to the outcry against that article by youth-focused librarians. Critics of the original WSJ article claimed youth librarianship is grossly misrepresented by the article’s inclusion of only white men as their “rock stars,” but Bruce’s response digs deeper to explore ways that those very same youth-focused librarians may be contributing to and shoring up a system that continually underrepresents us.


Sofia recommends:

“Critical Directions for Archival Approaches to Social Justice” by Ricardo L. Punzalan and Michelle Caswell

A colleague of mine, Michelle Baildon, recommended this really thoughtful article to me. Punzalan and Caswell examine the relationship between archives and social justice and suggest further explorations to move the archival field forward. They also make an argument for the fact that social justice has long been a part of archival work and that it is clear that social justice should be a central tenet of the archival field. I read it recently to help prepare for a workshop I’ll be co-teaching using materials from our archives. The students will also be reading it to provide a framing for the workshop, which is for a class on activism. I’m excited for our class discussion on this article!

Living a Feminist Life by Sara Ahmed

I love Sara Ahmed’s work, particularly On Being Included: Racism and Diversity in Institutional Life, so when I saw that she just came out with a new book, I had to get it. If you aren’t familiar with her work, check out her blog Feminist Killjoys, which she wrote in conjunction with Living a Feminist Life. This book really resonated with a lot of things I’ve been thinking about and struggling with, as her work always does for me. She’s already struggled through the same issues and is generously sharing her wisdom and hope. If that’s not enough for you, there’s a quote from bell hooks saying “everyone should read this book.”


Ryan recommends:

“Yes, Digital Literacy. But Which One?” by Mike Caulfield

This article has been passed around quite a bit by some librarians and other instructors I follow, and with good reason. Caulfield makes a persuasive case that, while useful starting points, information literacy acronyms like CRAAP and RADCAB are insufficient. Then he demonstrates a few types of domain knowledge and technical skills that could add up to a robust digital literacy. He argues that in addition to emphasizing the abstract values reinforced by various acronyms, instructors would better serve students by explaining the various ideologies one might encounter in the world of research, then giving them models, processes, and specific tools that help people act on abstract values. Not only did I learn some specific skills and tricks—I knew of Wolfram Alpha but never thought to use it how Caulfield suggests—the challenges he lays out have remained on my mind as I’ve been thinking about how to work with faculty on information literacy. His post provides an abstract appeal and some workable models, a combination I can’t help but appreciate.    

Rhetorical Listening: Identification, Gender, Whiteness by Krista Ratcliffe

We read this rather quickly in a pedagogy class I took last semester, and it’s a layered enough book that I’m still re-reading it as I can make the time. Ratcliffe’s arguments persuasively situate listening within rhetorical traditions, feminist theory, and critical race theory. What I keep coming back to is her emphasis on an ethics of accountability and the need to listen both to the claims people make and the cultural logics within which people make those claims. Her chapters focus on different tactics for listening, including for public debates, scholarly debates, and within classrooms—all places potentially quite relevant for librarians. Instead of trying to recapitulate her book in a paragraph, I’ll quote her motivation for writing the book: she was struck by the complications of her “own standpoint as a white feminist who had an abhorrence of racism and who had considered how racism works in the lives of non-white people but who had never really been taught nor had taken it upon herself to learn how racism functions in relation to whiteness and/or white people beyond the narrative that begins, ‘Once upon a time, white people were racists’” (p. 35). If that starting point resonates with you, this book will amply reward your reading with both specific tactics for listening and the inviting introductions she provides to a host of other thinkers.


Ian recommends:

@jacobsberg Twitter feed by Jacob Berg

My reading routine for the past few months has been especially bifurcated. Mornings are dominated by a wide variety of political news, much of which comes through my Twitter feed, especially @jacobsberg’s seemingly endless stream of links to articles and posts mostly having to do with the perpetual catastrophe of U.S. national politics. This daily breakfast diet usually does not put me in an optimal mood for a day’s librarianship (or maybe it does—thanks, Jake!).

“Things Left Unsaid” by Veronica Arellano Douglas & “Seeking a diverse candidate pool” by Angela Pashia

Speaking of librarianship, my mood has been lifted recently by these two posts about the academic library hiring process (which I’ve shared with colleagues working on this issue at MPOW) [both of these posts cite Lead Pipe articles, but that’s honestly not the reason I’m plugging them!]. Both Douglas and Pashia describe the institutionalized oppressions that continue to structure and define much of our profession, but they also offer determination, hope, and advice about how to create, in Douglas’s words, “a feminist, inclusive practice of librarianship.”

Vinyl Records and Analog Culture in the Digital Age: Pressing Matters by Paul E. Winters & Vinyl: The Analogue Record in the Digital Age by Dominik Bartmanski and Ian Woodward

Over those same last few months my days usually conclude with readings in the history of music recording and record collecting. These two books in particular have impressed me. The studies share a fascination with (and participation in) the recent “vinyl revival,” of which I’ve been a half-hearted participant. Although the books provide very thoughtful and theoretically informed discussions of the place of vinyl records in their cultural and social contexts, the cumulative effect of their analyses was (for me) to demystify the vinyl record and reduce its fetish value. I’m more fascinated by the practice of collecting (regardless of format) and the ways that collecting creates communities of knowledge and structures knowledge, and these studies shed light on this phenomenon as well.


Evergreen ILS: Evergreen Conference Thanks Emerald Data

planet code4lib - Wed, 2017-04-05 12:22

Today the pre-conference begins for the International Evergreen Conference 2017 in Covington, Kentucky. Every year it feels like the pre-conference gets just a bit bigger, and this year is no exception. In the early years of the conference we didn’t even worry about registration for it, as only a small handful of developers showed up for a hackfest. Over the years we have added more and more, including community meetings and now two different tracks of three-hour workshops and presentations. Along with all of this comes a need for community support to make it happen. So, we want to thank both our wonderful volunteers and a sponsor that helped make this happen this year – Emerald Data. Emerald Data provides support services for Evergreen and has been a longtime community member, so we appreciate their support and involvement in our community.

Emerald’s Website:
Emerald’s Twitter: @emeralddata

#evgils17 #evgils

Open Knowledge Foundation: Open Data Day 2017: round table on transparency and corruption in Albania

planet code4lib - Wed, 2017-04-05 10:00

This blog is part of the event report series on International Open Data Day 2017. On Saturday 4 March, groups from around the world organised over 300 events to celebrate, promote and spread the use of open data. 44 events received additional support through the Open Knowledge International mini-grants scheme, funded by SPARC, the Open Contracting Program of Hivos, Article 19, Hewlett Foundation and the UK Foreign & Commonwealth Office. This event was supported through the mini-grants scheme under the Open contracting and tracking public money flows theme.

This blog has been reposted from

March 4th, 2017 was International Open Data Day. Actors engaged in opening data in different countries in the world developed events and presented initiatives promoting open data, mainly for the public sector. The Albanian Institute of Science (AIS), an organisation promoting open data for Albania, invited journalists, who cover justice-related issues, to attend a roundtable on transparency and corruption.

The event opened with a presentation of the RedFlag Index initiative. This initiative uses open data to fight against inequality, misuse and corruption in municipal processes. Following two years of monitoring the tender procedures and contracting of 61 municipalities and the establishment of a database covering every procurement procedure, the index automatically flags every procedure conducted without competition or without appropriate deadlines for bidders to prepare their bids.

The justice journalists at the meeting raised their concern about Tirana Judicial District Court no longer publishing its criminal cases and the related decisions on its website. After 12 years of transparent practice, the court suddenly decided to stop giving the public access to its decisions. This measure was taken following complaints about the publication of decisions of a private and family character. On this occasion, the Court decided to stop publishing even decisions of a public and criminal nature, along with information necessary for transparency.

The journalists have already reported on this development, and expect the Court to issue a media release in the near future.


DPLA: Introduction to Upcoming DPLA + Ebooks Work

planet code4lib - Tue, 2017-04-04 15:45

By DPLA Ebook Consultant Micah May and Ebook Program Manager Michelle Bickert

As part of its core mission of maximizing access to our shared culture, the Digital Public Library of America (DPLA) is working to expand the discoverability, accessibility, and availability of ebooks for the general public. At DPLAfest 2015, many of you joined us as we began a deep exploration of the ebook space. Two years later, and with additional support from the Alfred P. Sloan Foundation, we are taking elements of that work forward.

We are exploring how DPLA may be able to broaden access for users by helping libraries move to an open service architecture. What does maximizing access to ebooks look like? Facilitating discovery of free, open content; unlocking previously gated content through new licensing and/or access models; and facilitating better purchasing options for libraries.

Our vision is to

  • Help libraries find and serve more open content, including open textbooks and other open educational resources (OER).
  • Merge content from multiple paid sources on a single platform and consolidated user interface.
  • Curate content to drive discovery and use of more of libraries’ existing collections.
  • Experiment with new types and sources of content including local publishing.
  • Empower DPLA to work directly with publishers to secure new and better terms for libraries that will allow them to provide more access at a better value.

While we explore innovative methods to advance the library ebook ecosystem, we’re also making familiar content new again. We are developing a substantial, free, and open collection of widely-read and widely-held ebooks, with a goal of improving discoverability through metadata and curation. Interested in helping? Check out our survey on open content, and watch for a later post for more.

These efforts complement DPLA’s ongoing work in the ebook space as a partner on the Open eBooks initiative. During its first year, K-12 children in need across the United States and its territories downloaded over one million popular and award-winning ebooks for free, without holds.

In the coming weeks we will be sharing more about this ongoing exploration. If you’re joining us in Chicago for DPLAfest 2017, we have two full days of ebook discussions. We invite you to join the conversation. Stay tuned for more updates on DPLA + Ebooks.

HangingTogether: The Realities of Research Data Management: Part One Now Available!

planet code4lib - Tue, 2017-04-04 14:54

Check out the new OCLC Research report A Tour of the Research Data Management (RDM) Service Space, the first in a four-part series exploring the realities of research data management. This report provides background on RDM’s emergence as an important new service area supporting 21st century scholarship, and offers a high-level description of the RDM service space as it stands today.

The Realities of Research Data Management, an OCLC Research project, explores the context and choices research universities face in building or acquiring RDM capacity. Findings are derived from detailed case studies of four research universities: University of Edinburgh, University of Illinois at Urbana-Champaign, Monash University, and Wageningen University and Research. Future reports will focus on scoping local RDM service offerings, the incentives for acquiring RDM capacity, and sourcing and scaling RDM services.

A Tour of the Research Data Management (RDM) Service Space begins the report series by providing background on the emergence of RDM as a key research support service on campus. The report goes on to describe the broad contours of the current RDM service space, identifying three major service categories: Education, Expertise, and Curation. This high-level view of the RDM service space will help organize subsequent reports’ discussion of the RDM service offerings at the four case study institutions.

RDM is both an opportunity and a challenge for many research universities. But research data management is not a discrete, well-defined service, and RDM solutions are not of the one-size-fits-all variety. Moving beyond recognition of RDM’s importance requires facing the realities of research data management. Each institution must shape its local RDM service offering by navigating several key inflection points: deciding to act, deciding what to do, and deciding how to do it. Future reports in this series will examine these decisions in the context of the choices made by the case study partners.

Stay tuned for more updates and outputs from The Realities of Research Data Management project!


About Brian Lavoie

Brian Lavoie is a Research Scientist in OCLC Research. He has worked on projects in many areas, such as digital preservation, cooperative print management, and data-mining of bibliographic resources. He was a co-founder of the working group that developed the PREMIS Data Dictionary for preservation metadata, and served as co-chair of a US National Science Foundation blue-ribbon task force on economically sustainable digital preservation. Brian's academic background is in economics; he has a Ph.D. in agricultural economics. Brian's current research interests include stewardship of the evolving scholarly record, analysis of collective collections, and the system-wide organization of library resources.


