Jason Ronallo: Client-side Video Tricks for IIIF

planet code4lib - Sat, 2017-06-10 17:29

I wanted to push out these examples before the IIIF Hague working group meetings and I’m doing that at the 11th hour. This post could use some more editing and refinement of the examples, but I hope it still communicates well enough to see what’s possible with video in the browser.

IIIF solved a lot of the issues with working with large images on the Web. None of the image or Web standards were developed with very high-resolution images in mind: there's no built-in way to request just a portion of an image, so usually you'd have to download the whole image to see it at its highest resolution. Image tiling works around this limitation by downloading only the portion of the image that is in the viewport, at the desired resolution. IIIF has standardized how to request these tiles, and image servers have implemented that standard. Dealing with high-resolution images in this way is one of the fundamental problems IIIF has helped to solve.

This differs significantly from the state of video on the web. Video came to the web more recently; previously, Flash was the predominant way to deliver video within HTML pages. Because there was already so much experience with video on the web before HTML5 video was specified, it was probably much clearer what was needed and how video ought to be integrated from the beginning. Video formats also provide much of the functionality that was missing from still images, so when video came to HTML it included many more features right from the start.

As we’re beginning to consider what features we want in a video API for IIIF, I wanted to take a moment to show what’s possible in the browser with native video. I hope this helps us to make choices based on what’s really necessary to be done on the server and what we can decide is a client-side concern.

Crop a video on the spatial dimension (x,y,w,h)

It is possible to crop a video in the browser. There's no built-in way to do this, but with how video is integrated into HTML and all the other APIs that are available, cropping can be done. You can see one example below where the image of the running video is snipped and added to a canvas of the desired dimensions. In this case I display both the original video and the canvas version. We do not even need to have the video embedded on the page to play it and copy its images over to the canvas; the full video could have been completely hidden and this still would have worked. While no browser implements it, a spatial media fragment could let a client know what's desired.

Also, in this case I’m only listening for the timeupdate event on the video and copying over the portion of the video image then. That event only triggers so many times a second (depending on the browser), so the cropped video does not display as many frames as it could. I’m sure this could be improved upon with a simple timer or a loop that requests an animation frame.
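The loop-with-requestAnimationFrame idea can be sketched like this. The element ids and the crop rectangle here are made-up assumptions for illustration, not from the original demo:

```javascript
// Sketch: copy a region of a playing <video> onto a <canvas>,
// redrawing once per animation frame instead of on 'timeupdate'.
const crop = { x: 120, y: 60, w: 320, h: 240 }; // source region, in video pixels

function startCroppedPlayback(video, canvas) {
  const ctx = canvas.getContext('2d');
  canvas.width = crop.w;
  canvas.height = crop.h;
  function draw() {
    // drawImage with 9 arguments: source rectangle -> destination rectangle
    ctx.drawImage(video, crop.x, crop.y, crop.w, crop.h, 0, 0, crop.w, crop.h);
    if (!video.paused && !video.ended) requestAnimationFrame(draw);
  }
  video.addEventListener('play', () => requestAnimationFrame(draw));
}

// Wire up only in a browser context (ids are hypothetical).
if (typeof document !== 'undefined') {
  startCroppedPlayback(
    document.getElementById('source-video'),
    document.getElementById('cropped-canvas')
  );
}
```

Because the loop stops when the video pauses or ends, it costs nothing while playback is idle.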

Watch the video and cropped video

And something similar could be done solely by creating a wrapper div around a video. The div has the desired dimensions with overflow hidden, and the video is positioned relative to the div to give the desired crop.
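The wrapper-div approach can be sketched by applying the styles from JavaScript. The crop values, and the assumption that the page has a single video element, are illustrative only:

```javascript
// Sketch: crop by wrapping the video in a fixed-size div with
// overflow hidden, then offsetting the video inside it.
function cropStyles(x, y, w, h) {
  return {
    wrapper: { width: w + 'px', height: h + 'px', overflow: 'hidden', position: 'relative' },
    video:   { position: 'absolute', left: -x + 'px', top: -y + 'px' },
  };
}

if (typeof document !== 'undefined') {
  const video = document.querySelector('video');
  const wrapper = document.createElement('div');
  const styles = cropStyles(120, 60, 320, 240); // hypothetical crop
  Object.assign(wrapper.style, styles.wrapper);
  video.parentNode.insertBefore(wrapper, video);
  wrapper.appendChild(video);
  Object.assign(video.style, styles.video);
}
```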

This is probably the hardest one of these to accomplish with video, but both of these approaches could probably be refined and developed into something workable.

Truncate a video on the temporal dimension (start,end)

This is easily accomplished with a Media Fragment added to the end of the video URL. In this case it looks like this: #t=6,10. The video will begin at the 6 second mark and stop playing at the 10 second mark. Nothing here prevents you from playing the whole video or any part of it, but what the browser does by default could be good enough in lots of cases. If this needs to be a hard constraint, it ought to be pretty easy to enforce with JavaScript: the user could still download the whole video, but any particular player could maintain the constraint on time. What's nice with video on the web is that the browser can seek to a particular time without downloading the whole video, since it can make byte-range requests. And the server-side piece can just be a standard web server (Apache, nginx) with some simple configuration. This kind of "seeking" isn't possible with images without a smarter server.
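Both halves can be sketched as follows, assuming a hypothetical movie.mp4: building the temporal fragment URL, and optionally enforcing the time range in the player with JavaScript:

```javascript
// Sketch: append a temporal Media Fragment (#t=start,end) to a URL.
function withTimeFragment(url, start, end) {
  return url + '#t=' + start + ',' + end;
}

// Sketch: make the range a hard constraint in one particular player.
// 'timeupdate' fires only a few times per second, so enforcement is
// approximate rather than frame-accurate.
function enforceRange(video, start, end) {
  video.addEventListener('timeupdate', () => {
    if (video.currentTime < start) video.currentTime = start;
    if (video.currentTime >= end) video.pause();
  });
}

if (typeof document !== 'undefined') {
  const video = document.querySelector('video');
  video.src = withTimeFragment('movie.mp4', 6, 10); // hypothetical file
  enforceRange(video, 6, 10);
}
```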

Scale the video on the temporal dimension (play at 1.5x speed)

HTML5 video provides a JavaScript API for manipulating the playback rate. This means that this functionality could be included in any player the user interacts with. There are some limitations on how fast or slow the audio and video can play together, but there's a larger range at which just the images of the video can play. This will also differ based on browser and computer specifications.
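A sketch of the playbackRate API. The clamping bounds below are arbitrary assumptions, since the exact supported range (and where audio gets muted) varies by browser:

```javascript
// Sketch: set playback speed via HTMLMediaElement.playbackRate,
// clamped to a conservative range (bounds chosen for illustration).
function clampRate(rate, min = 0.25, max = 4) {
  return Math.min(max, Math.max(min, rate));
}

if (typeof document !== 'undefined') {
  const video = document.querySelector('video');
  video.playbackRate = clampRate(3);    // 3x speed
  // video.playbackRate = clampRate(0.5); // half speed
}
```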

This video plays back at 3 times the normal speed:

This video plays back at half the normal speed:

Change the resolution (w,h)

If you need to fit a video within a particular space on the page, a video can easily be scaled up and down on the spatial dimension. While this isn’t always very bandwidth friendly, it is possible to scale a video up and down and even do arbitrary scaling right in the browser. A video can be scaled with or without maintaining its aspect ratio. It just takes some CSS (or applying styles via JavaScript).
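Both kinds of scaling can be sketched with styles applied from JavaScript. The 400x300 target box is an assumption for illustration:

```javascript
// Sketch: compute dimensions that fit a source into a box while
// preserving its aspect ratio.
function scaleToFit(srcW, srcH, maxW, maxH) {
  const scale = Math.min(maxW / srcW, maxH / srcH);
  return { width: Math.round(srcW * scale), height: Math.round(srcH * scale) };
}

if (typeof document !== 'undefined') {
  const video = document.querySelector('video');
  video.addEventListener('loadedmetadata', () => {
    // videoWidth/videoHeight are only known once metadata has loaded.
    const fit = scaleToFit(video.videoWidth, video.videoHeight, 400, 300);
    video.style.width = fit.width + 'px';
    video.style.height = fit.height + 'px';
  });
  // Or stretch arbitrarily, ignoring the aspect ratio:
  // video.style.width = '400px'; video.style.height = '100px';
}
```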

Play stretched video

Rotate the video

I’m not sure what the use case within IIIF is for rotating video, but you can do it rather easily. (I previously posted an example which might be more appropriate for the Hague meeting.)
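Rotation is one line of CSS; a sketch applying it from JavaScript, with the 90-degree angle chosen just as an example:

```javascript
// Sketch: rotate a video with a CSS transform.
function rotationTransform(degrees) {
  return 'rotate(' + degrees + 'deg)';
}

if (typeof document !== 'undefined') {
  const video = document.querySelector('video');
  video.style.transform = rotationTransform(90);
  video.style.transformOrigin = 'center center';
}
```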

Play rotating video

Use CSS and JavaScript safely, OK?


Two of the questions I’ll have about any feature being considered for IIIF A/V APIs are:

  1. What’s the use case?
  2. Can it be done in the browser?

I’m not certain what the use case for some of these transformations of video would be, but would like to be presented with them. But even if there are use cases, what are the reasons why they need to be implemented via the server rather than client-side? Are there feasibility issues that still need to be explored?

If there are use cases for some of these and the decision is made that they are a client-side concern, I am interested in the ways in which the Presentation API and Web Annotations can support them. How would you let a client know that a particular video ought to be played at 1.2x the default playback rate? Or that the video (for some reason I have yet to understand!) needs to be rotated when it is placed on the canvas? In any case, I wonder to what extent deciding that something is a client concern might affect the Presentation API.


Evergreen ILS: Evergreen 3.0 development update #9

planet code4lib - Fri, 2017-06-09 21:38

Duck (Bronze statuette, Roman artwork) at the Musée des Beaux-Arts in Lyon, France © Marie-Lan Nguyen / Wikimedia Commons (CC-BY)

Since the previous update, another 18 patches have been committed to the master branch.

With this week’s work, we can strike out one more line on the 3.0 road map: canceling or aborting a transit will no longer delete the record of the transit outright. Instead, the record of the transit will get updated with its cancellation time, but remain available for reporting and spelunking. The gremlins that hide books that get lost in transit between libraries will presumably have to satisfy themselves with hoarding unpaired socks. Credit goes to Chris Sharp for the initial patches and to Bill Erickson for reviewing them and writing follow-ups.

Another set of patches to note are the ones for bug 1681943, which improve how the public catalog “my lists” page displays on mobile devices. This is a good example of how a patchset can evolve as it gets feedback; in particular, I’d like to note what may be a shift in how we do responsive design, with less hiding of content in the mobile view in favor of rearranging everything to fit better. Credit goes to Terran McCanna with contributions from Ben Shum and me and feedback from Dan Scott.

We also had a development meeting on 7 June. The minutes and log are available. The next meeting is scheduled for 15:00 EDT / 20:00 UTC on 5 July.

After some discussion on the mailing lists, Terran McCanna has set the next block of Bug Squashing Days for 17 to 21 July. On a personal note, 19 July happens to be my birthday; the best present that anybody active in Evergreen development can give me is to write, test, sign off on, and commit patches and documentation that week.

Duck trivia

I think somebody really, really wants a duck to visit Chicago. Can we make it happen by ALA Annual? Then again, better luck might be had visiting the Chicago Botanic Garden.


Updates on the progress to Evergreen 3.0 will be published every Friday until general release of 3.0.0. If you have material to contribute to the updates, please get them to Galen Charlton by Thursday morning.

Harvard Library Innovation Lab: LIL Talks: The 180-Degree Rule in Film

planet code4lib - Fri, 2017-06-09 19:37

This week, Jack Cushman illustrated how hard it is to make a film, or rather, how easy it is to make a bad film. With the assistance of LIL staff and interns, he directed a tiny film of four lines in about four minutes, then used it as a counter-example. Any mistake can break the suspension of disbelief, and amateurs are likely to make many mistakes. Infelicities in shot selection, lighting, sound, wardrobe and makeup, set design, editing, color, and so on destroy the viewer’s immersion in the film.

An example is the 180-degree rule: in alternating shots over the shoulders of two actors facing each other, the cameras must remain on the same side of the imaginary line joining the two actors. Breaking this rule produces cuts where the spatial relationship of the two actors appears to change from shot to shot.

After some discussion of the differences between our tiny film and Star Wars, Jack gauged his crew’s enthusiasm, and directed another attempt, taking only slightly longer to shoot than the first try. Here are some stills from the set.

Open Knowledge Foundation: What data do we need? The story of the Cadasta GODI fellowship

planet code4lib - Fri, 2017-06-09 18:35

This blogpost was written by Lindsay Ferris and Mor Rubinstein


There is a lot of data out there, but which data do users need to solve their issues? How can we, as an external body, know which data is vital so we can measure it? Moreover, what do we do when data is published at so many levels – local, regional and federal – that it is hard to find?

Every year we think about these questions in order to improve the Global Open Data Index (GODI) and make it more relevant to civil society. Having the right data characteristics is crucial for data use, since without specific data it is hard to analyse and learn.

After the publication of the GODI 2015, Cadasta Foundation approached us to discuss the results of GODI in the land ownership category. Throughout this initial, lively discussion, we noticed that a systematic understanding of land data in general, and land ownership data in particular, was missing. An idea emerged: we decided to bridge these gaps and build a systematic understanding of land ownership data for the 2016 GODI.

And so the idea of the GODI fellowship came to life. It was simple: Cadasta would host a fellow for a period of 6 months to explore the publication of data relevant to land ownership issues. The fellowship would be funded by Cadasta and the fellow would be an integral part of the team, while OKI would give in-kind support in the form of guidance and research. The fellowship's goals were:

  • Global policy analysis of open data in the field of land and resource rights
  • Better definition for the land ownership dataset in the Global Open Data Index for 2016;
  • Mapping stakeholders and partners for the Global Open Data Index (for submissions);
  • Recommendations for a thematic Index;
  • A working paper or a series of blog posts about open data in land and resource ownership.

Throughout the fellowship, Lindsay conducted interviews with land experts, NGOs and government officials, as well as ongoing desk research on land data publication practices across different contexts. She produced four key outputs:

  1. Outlining the challenges of opening land ownership data. Blog post here.
  2. Mapping the different types of land data and their availability. Overview here.
  3. Assessing the privacy and security risks of opening certain types of land data. See our work here.
  4. Identifying user needs and creating user personas for open land data. User personas here.

Throughout the GODI process, our aim is to advocate for datasets that different stakeholders actually need and that make sense within the context in which they are published. For example, one of the main challenges with land ownership is that data is not always recorded or gathered at the federal level, but is instead collected by cities and regions. Also, among the primary users of land ownership data are other government agencies. Having a grasp of this type of knowledge helped us better define the land ownership dataset for the GODI. Ultimately, we developed a thoughtful definition based on these reflections and recommendations.

For us at OKI, having a dedicated person inside an organisation that is expert in a data category was immensely helpful. It makes the Index categories more relevant for real-life use and helps us measure the categories better. It also helps us make sure our assumptions and the foundations of the research are sound. For Cadasta, having a person dedicated to open data helped create a knowledge base and resources that help them look at open data better. It was a win-win for both sides.

In fact, the work Lindsay was doing was so valuable to Cadasta that her time there was extended, and she went on to write a case study about open data and land in Sao Paulo, the Land Debate final report, and a paper on Open Data in Land Governance for the 2017 World Bank Land and Poverty Conference.

Going forward, we believe that having this kind of expert input in the design of the survey is crucial. Looking only through an open data lens can lead to bias and wrong measurements. In our vision, the GODI tool is a community-owned assessment that can help all fields promote, find and use the data that is relevant to them. Interested in thinking about the future of your field through open data? Write to us on the forum.

LITA: LITA events @ ALA Annual 2017

planet code4lib - Fri, 2017-06-09 16:38
Don’t miss a single one of the great LITA events at ALA Annual 2017 in Chicago, June 23-26, 2017

There’s so much good stuff, we’re listing just the highlights in chronological order. Be sure to check for more programs and details:

LITA at ALA Annual web page, and the
ALA Scheduler site, filtered for LITA

LITA AvramCamp preconference, an AdaCamp inspired event
Friday, June 23, 2017, 9:00 am – 4:00 pm

Facilitators: Margaret Heller, Digital Services Librarian, Loyola University Chicago; and Evviva Weinraub, Associate University Librarian for Digital Strategies, Northwestern University.

This one-day LITA preconference during ALA Annual in Chicago will allow female-identifying individuals employed in various technological industries an opportunity to network with others in the field and to collectively examine common barriers faced.

LITA Conference Kickoff
Friday June 23, 2017, 3:00 pm – 4:00 pm

Join current and prospective LITA members for an overview and informal conversation at the Library Information Technology Association (LITA) Conference Kickoff.

Executive Perspectives: A Conversation on the Future of the Library Technology Industry
Saturday June 24, 2017, 10:30 am – 11:30am

Marshall Breeding welcomes this panel of CEOs and other senior executives representing organizations that produce software or services for libraries.

  • Sarah Pritchard, Dean of Libraries and the Charles Deering McCormick University Librarian, Northwestern University
  • Bill Davison, President and Chief Executive Officer, SirsiDynix
  • Kurt Sanford, Chief Executive Officer, ProQuest
  • George Coe, Chief Operating Officer for Follett Corporation, Baker and Taylor Follett

LITA Imagineering: Generation Gap: Science Fiction and Fantasy Authors Look at Youth and Technology
Saturday June 24, 2017, 1:00 pm – 2:30 pm

Join LITA, the Imagineering Interest Group, and Tor Books as this panel of Science Fiction and Fantasy authors discuss how their work can help explain and bridge the interests of generational gaps.

  • Cory Doctorow
  • Annalee Newitz
  • V.E. Schwab
  • Susan Dennard

LITA Guide Book Signing: Melody Condron, author of
Managing the Digital You: Where and How to Keep and Organize Your Digital Life
Rowman and Littlefield Booth #2515
Saturday June 24, 2017, 3:00 pm – 3:30 pm

Don’t miss your chance to meet Melody, purchase the book at a large discount, and have your copy signed. This book is a much-needed guide for those struggling with how to manage and preserve their digital items.

Top Technology Trends
Sunday, June 25, 2017, 1:00 pm – 2:30 pm

LITA’s premier program on changes and advances in technology. This year’s conference panelists are:

  • Margaret Heller, Session Moderator, Digital Services Librarian, Loyola University Chicago
  • Emily Almond, Director of IT, Georgia Public Library Service
  • Marshall Breeding, Independent Consultant and Founder, Library Technology Guides
  • Vanessa Hannesschläger, Researcher, Austrian Centre for Digital Humanities/Austrian Academy of Sciences
  • Veronda Pitchford, Director of Membership and Resource Sharing, Reaching Across Illinois Library System (RAILS)
  • Tara Radniecki, Engineering Librarian, University of Nevada, Reno

LITA President’s Program with Kameron Hurley: We Are the Sum of Our Stories
Sunday June 25, 2017 from 3:00 pm – 4:00 pm

LITA President Aimee Fifarek welcomes Kameron Hurley, author of the essay collection The Geek Feminist Revolution, as well as the award-winning God’s War Trilogy and The Worldbreaker Saga.

LITA Happy Hour
Sunday, June 25, 2017, 6:00 pm – 8:00 pm

This year the LITA Happy Hour continues the year long celebration of LITA’s 50th anniversary. Expect anniversary fun and games. There will be lively conversation and excellent drinks; cash bar.

See you in Chicago!


OCLC Dev Network: DEVCONNECT 2017: An Inaugural Success!

planet code4lib - Fri, 2017-06-09 13:00

It was a great pleasure hosting OCLC’s inaugural DEVCONNECT conference. (ICYMI, DEVCONNECT was held May 8 – 9, 2017, at the OCLC Conference Center in Dublin, Ohio.)

Open Knowledge Foundation: OK Greece signs an MoU with the Hellenic Institute of Transport (HIT)

planet code4lib - Fri, 2017-06-09 10:00

On Friday, June 2, Open Knowledge Greece (OK Greece) signed a Memorandum of Understanding (MoU) with the Hellenic Institute of Transport (HIT), regarding the sharing and the analysis of transport data in the city of Thessaloniki, with the aim to predict traffic and improve mobility in the street.

From left to right: Dr Jose Salanova – HIT’s Research Associate; Dr Charalampos Bratsas – OK Greece President; Dr Evangelos Bekiaris – HIT’s Director and Dr Georgia Aifadopoulou- HIT’s Deputy Director-Research Director

HIT’s Director Dr Evangelos Bekiaris and Dr Georgia Aifadopoulou, HIT’s Deputy Director and Research Director, welcomed OK Greece President Dr Charalampos Bratsas at the Institute offices. Following the signing of the agreement, Mr Bratsas stressed:

Today, we made another step towards the efficient use and management of data in the interests of citizens. We are very happy about this cooperation and we hope for its long-term growth.

When asked about the aim of the agreement and HIT’s benefit from its cooperation with OK Greece, Mr Bekiaris said that HIT wants to take advantage of OK Greece’s know-how in the field of data analysis, in order to highlight its own data for the common good of both Thessaloniki and the rest of Greece, in a reliable and secure way that will open the data to the largest possible public.

Among others, Mr Bekiaris also mentioned the benefit of this effort to the end-user. More specifically, he said:

This MoU gives us the opportunity to operate data platforms, through which businesses will be able to derive the data they need, for free, in order to take initiatives and develop new services. There has not been such a thing in Greece yet, as there is in Finland, for example, but, along with OK Greece, we can develop something similar in the transport sector to allow Greek SMEs to use data and create new services, helping the public and the economy as a whole.

Dr Georgia Aifadopoulou described the agreement as the beginning of a pioneering innovation for Thessaloniki, which will also have a multiplier effect for the rest of Greece. According to her, the HIT has been running a living lab on smart mobility in Thessaloniki for years, noting that it has been recognised at the European level by introducing Thessaloniki to the official list of the EU smart cities.

The lab gathers real-time information, through an ecosystem, built by a joint venture of local institutions, such as the Municipality of Thessaloniki, the Region of Central Macedonia, as well as the TAXI and commercial fleet operators. This ecosystem processes the available information, providing citizens with plenty of services, regarding traffic, the road conditions etc.

She also stated that through this collaboration with OK Greece, “we manage to open our data, also persuading other bodies to follow our example. Our goal is to expand the existing ecosystem and promote the exchange of know-how on open data.” The choice of Thessaloniki, through a relevant competition, as the first city at the European level to pilot the Big Data Europe – Empowering Communities with Data Technologies project in the field of mobility and transport also constitutes a great opportunity in this direction.

Dr Aifadopoulou further stressed that the innovative nature of the MoU lies in the cooperation of bodies coming from different scientific fields. According to her, data analysis is a big issue, and the extraction of knowledge from data is another question. This is why both institutions are needed in the venture: on the one hand HIT, which knows the field of mobility, and on the other OK Greece, which knows how to do the data analysis, offering the needed interpretations and explanations. Via this convergence, new knowledge can be created for institutions and citizens, also improving the management of the transport system in Thessaloniki.

To read more about Open Knowledge Greece visit their website. You can also follow them on Twitter: @okfngr

Access Conference: Access 2017 Schedule Posted

planet code4lib - Thu, 2017-06-08 19:16

The program schedule for Access 2017 is available now. Check back here and follow us on Twitter  (@accesslibcon) for more announcements about our two keynote speakers, the social events, and some new ideas for the hackfest.

Looking forward to seeing you in Saskatoon, Sept 27-29th.

Early-bird registration rates close July 1st!

David Rosenthal: Public Resource Audits Scholarly Literature

planet code4lib - Thu, 2017-06-08 15:00
I (from personal experience), and others, have commented previously on the way journals paywall articles based on spurious claims that they own the copyright, even when there is clear evidence that they know that these claims are false. This is copyfraud, but:

    While falsely claiming copyright is technically a criminal offense under the Act, prosecutions are extremely rare. These circumstances have produced fraud on an untold scale, with millions of works in the public domain deemed copyrighted, and countless dollars paid out every year in licensing fees to make copies that could be made for free.

The clearest case of journal copyfraud is when journals claim copyright on articles authored by US federal employees:

    Work by officers and employees of the government as part of their official duties is "a work of the United States government" and, as such, is not entitled to domestic copyright protection under U.S. law.

So, inside the US there is no copyright to transfer, and outside the US the copyright is owned by the US government, not by the employee. It is easy to find papers that apparently violate this, such as James Hansen et al's Global Temperature Change. It carries the statement "© 2006 by The National Academy of Sciences of the USA" and states Hansen's affiliation as "National Aeronautics and Space Administration Goddard Institute for Space Studies". Perhaps the most compelling instance is the AMA falsely claiming to own the copyright on United States Health Care Reform: Progress to Date and Next Steps by one Barack Obama.

Now, Carl Malamud tweets:

    Public Resource has been conducting an intensive audit of the scholarly literature. We have focused on works of the U.S. government. Our audit has determined that 1,264,429 journal articles authored by federal employees or officers are potentially void of copyright.

They extracted metadata from Sci-Hub and found:

    Of the 1,264,429 government journal articles I have metadata for, I am now able to access 1,141,505 files (90.2%) for potential release.

This is already extremely valuable work. But in addition:

    2,031,359 of the articles in my possession are dated 1923 or earlier. These 2 categories represent 4.92% of scihub. Additional categories to examine include lapsed copyright registrations, open access that is not, and author-retained copyrights.

It is long past time for action against the rampant copyfraud by academic journals.

Tip of the hat to James R. Jacobs.

District Dispatch: Washington Office at Annual 2017: “catalyzing” change in communities and Congress

planet code4lib - Thu, 2017-06-08 13:43

ALA’s Office of Government Relations (OGR) in Washington is pleased to present at ALA’s upcoming Annual Conference in Chicago two important, but very different, perspectives on how libraries and librarians can succeed in producing positive change. Both sessions will take place in the Conference Center on the morning of Saturday, June 24.

The first, from 8:30 – 10:00am, is titled “Be A Catalyst: Your Portfolio of Resources to Create Catalytic Change in Communities” (Room MCP W176a). It will feature Institute of Museum and Library Services Director Dr. Kathryn (“Kit”) Matthew, Kresge Foundation President Rip Rapson and Barbara Bartle, President of the Lincoln Community Foundation in discussion of how institutions can best leverage outside investments and their own assets through collaboration to be “enablers of community vitality and co-creators of positive community change.” The panel will explore IMLS’ new Community Catalyst initiative, take a deeper dive into the agency’s late 2016 “Strengthening Networks, Sparking Change” report, and profile a recent IMLS funding pilot (now closed) to further library “catalyzing” projects.

The second program, “Make Some Noise! A How-To Guide to Effective Federal Advocacy in Challenging Times”, will run from 10:30 to 11:30am (Room MCP W178b). Sponsored by ALA’s Committee on Legislation (COL), the session will be a fast-paced, practical discussion of what works in motivating members of Congress (and, for that matter, all elected policy makers) to support libraries and the issues we champion in Washington and outside the Beltway alike.

Incoming COL Chair (and Director of Library Journal’s 2017 Library of the Year in Nashville, TN) Kent Oliver will be joined by Georgia State Librarian (and COSLA Legislative Co-Chair) Julie Walker and Virginia Library Association Executive Director Lisa Varga. They’ll share their recent experiences in the ongoing Fight for Libraries! campaign for federal funding, and from decades of successful library advocacy in the “real world.” The program – which also will offer lots of audience “Q&A” opportunities – will be moderated by OGR Managing Director Adam Eisgrau.


Open Knowledge Foundation: The state of open licensing in 2017

planet code4lib - Thu, 2017-06-08 13:35

This blog post is part of our Global Open Data Index (GODI) blog series. It first discusses what open licensing is and why it is crucial for opening up data, then outlines the most urgent issues around open licensing as identified in the latest edition of the Global Open Data Index, and concludes with ten recommendations for how open data advocates can unlock this data. The blog post was jointly written by Danny Lämmerhirt and Freyja van den Boom.

Open data must be reusable by anyone and users need the right to access and use data freely, for any purpose. But legal conditions often block the effective use of data.

Whoever wants to use existing data needs to know whether they have the right to do so. Researchers cannot use others’ data if they are unsure whether they would be violating intellectual property rights. For example, a developer wanting to locate multinational companies in different countries and visualize their paid taxes can’t do so unless they can find out how this business information is licensed. Clear and open licenses attached to the data, allowing use with the fewest restrictions possible, are necessary to make this happen.

Yet, open licenses still have a long way to go. The Global Open Data Index (GODI) 2016/17 shows that only a small portion of government data can be used without legal restrictions. This blog post discusses the status of ‘legal’ openness. We start by explaining what open licenses are and discussing GODI’s most recent findings around open licensing. And we conclude by offering policy- and decision-makers practical recommendations to improve open licensing.

What is an open license?

As the Open Definition states, data is legally open “if the legal conditions under which data is provided allow for free use”.  For a license to be an open license it must comply with the conditions set out under the  Open Definition 2.1.  These legal conditions include specific requirements on use, non-discrimination, redistribution, modification, and no charge.

Why do we need open licenses?

Data may fall under copyright protection.

Copyright grants the author of an original work exclusive rights over that work. If you want to use a work under copyright protection you need to have permission.

There are exceptions and limitations to copyright where permission is not needed: for example, when the data is in the ‘public domain’ (not, or no longer, protected by copyright), or when your use is permitted under an exception.

Be aware that some countries also provide legal protection for databases, which limits what use can be made of the data and the database. It is important to check the national requirements, as they may differ.

Because some types of data (papers, images) can fall under the scope of copyright protection, we need data licensing. Data licensing helps solve practical problems, including not knowing whether the data is in fact copyright protected and how to get permission. Governments should therefore clearly state whether their data is in the public domain or, when the data falls under the scope of copyright protection, what the license is.

  • When data is in the public domain, it is recommended to use the CC0 Public Domain Dedication for clarity.
  • When the data falls under the scope of copyright, it is recommended to use an existing open license such as CC BY to improve interoperability.

Using Creative Commons or Open Data Commons licenses is best practice. Many governments already apply one of the Creative Commons licenses (see this wiki). Some governments have chosen, however, to write their own licenses or to formulate ‘terms of use’ which grant use rights similar to widely acknowledged open licenses. This is problematic for users because of interoperability: the proliferation of ever more open government licenses has been criticized for a long time. By creating their own versions, governments may add unnecessary information for users, cause incompatibility, and significantly reduce the reusability of data. Creative Commons licenses are designed to reduce these problems by clearly communicating use rights, making the sharing and reuse of works possible.

The state of open licensing in 2017

Initial results from the GODI 2016/17 show that roughly only 38 percent of the eligible datasets were openly licensed (this figure may change slightly after the final publication on June 15).

The other licenses impose many restrictions, including limitations to non-commercial use and restrictions on reuse and/or modification of the data.

Where data is openly licensed, best practices are hardly ever followed

In the majority of cases, our research team found that governments apply general terms of use instead of specific licenses for the data. Open government licenses and Creative Commons licenses were seldom used. As outlined above, this is problematic. Customized licenses or terms of use may impose additional requirements, such as:

  • requiring specific attribution statements desired by the publisher
  • adding clauses that make it unclear how data can be reused and modified
  • adapting licenses to local legislation

Throughout our assessment, we encountered unnecessary or ambivalent clauses, which in turn may cause legal concerns, especially when people consider using data commercially. Sometimes we came across redundant clauses that cause more confusion than clarity. For example, clauses may forbid using data in an unlawful way (see also the discussion here).

Standard open licenses are intended to reduce legal ambiguity and enable everyone to understand use rights. Yet many licenses and terms contain unclear clauses or do not make obvious what data they refer to. This can, for instance, mean that governments restrict the use of substantial parts of a database (and only allow the use of insignificant parts of it). We recommend that governments give clear examples of which use cases are acceptable and which ones are not.

Licenses often do not make clear enough what data they apply to. Data should include a link to the license, but this is not commonly done. For instance, in Mexico we found that procurement information available via Compranet, the procurement platform for the Federal Government, was openly licensed, but the website does not state this clearly. Mexico hosts the same procurement data on another site and applies an open license to it there. As a government official told us, the procurement data is therefore openly licensed regardless of where it is hosted. But again, this is not clear to a user who may find the data on a different website. We therefore recommend always accompanying the data with a link to the license. We also recommend attaching a license notice to, or 'in', the data itself, and keeping the links updated to avoid 'link rot'.

The absence of links between data and legal terms makes an assessment of open licenses impossible

Users may need to consult legal texts to see if the rights granted comply with the Open Definition. Problems arise if there is no clear explanation or translation available of what specific licenses entail for the end user. One problem is that users need to translate the text, and when the text is not in a machine-readable format they cannot use translation services. Our experience shows that this was a significant source of error in our assessment. If open data experts struggle to assess public domain status, the problem is even more acute for open data users. Assessing public domain status requires substantial knowledge of copyright – something the use of open licenses is explicitly meant to avoid.

Copyright notices on websites can confuse users. In several cases, submitters and reviewers were unable to find any terms or conditions. In the absence of any other legal terms, submitters sometimes referred to copyright notices that they found in website footers. These copyright details, however, do not necessarily refer to the actual data. Often they are simply a standard copyright notice referring to the website.

Recommendations for data publishers

Based on our findings, we prepared 10 recommendations that policymakers and other government officials should take into account:

  1. Does the data and/or dataset fall under the scope of IP protection? Often government data does not fall under copyright protection and should not be presented as such. Governments should be aware and clear about the scope of intellectual property (IP) protection.
  2. Use standardized open licenses. Open licenses are easily understandable and should be the first choice. The Open Definition provides conformant licenses that are interoperable with one another.
  3. In some cases, governments might want to use a customized open government license. These should be as open as possible, with the fewest restrictions necessary, and compatible with standard open licenses (see point 2). To guarantee that a license is compatible, the best practice is to submit it for approval under the Open Definition.
  4. Exactly pinpoint within the license what data it refers to, and provide a timestamp for when the data was provided.
  5. Clearly publish open licensing details next to the data. The license should be clearly attached to the data and be both human- and machine-readable. It also helps to have a license notice 'in' the data.
  6. Maintain the links to licenses so that users can access license terms at all times.
  7. Highlight the license version and provide context for how the data can be used.
  8. Whenever possible, avoid restrictive clauses that are not included in standard licenses.
  9. Re-evaluate the web design and avoid confusing and contradictory copyright notices in website footers, as well as disclaimers and terms of use.
  10. When government data is in the public domain by default, make clear to end users what that means for them.
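To make recommendations 4 and 5 concrete, the sketch below builds a minimal metadata record that pairs a dataset with a direct, machine-readable link to its license and a timestamp. The field names loosely echo DCAT vocabulary but are illustrative only, not a validated DCAT serialization; the dataset title and download URL are invented for the example, while the license URLs are the standard Creative Commons ones.

```python
import json

# Canonical, machine-readable license URLs (recommendation 2)
CC_BY_4 = "https://creativecommons.org/licenses/by/4.0/"
CC0 = "https://creativecommons.org/publicdomain/zero/1.0/"

def dataset_record(title, download_url, license_url, issued):
    """Return a minimal metadata record that links the data to its
    license (recommendations 4 and 5): a direct license URL rather
    than prose 'terms of use', plus a timestamp for the data."""
    return {
        "title": title,
        "downloadURL": download_url,   # where the data itself lives
        "license": license_url,        # machine-readable license link
        "issued": issued,              # when the data was provided
    }

# Hypothetical dataset, openly licensed under CC-BY 4.0
record = dataset_record(
    "National procurement awards 2016",
    "https://data.example.gov/procurement-2016.csv",
    CC_BY_4,
    "2017-06-15",
)
print(json.dumps(record, indent=2))
```

Publishing a record like this alongside the data (and keeping the license link from rotting) lets both humans and harvesting tools determine use rights without consulting legal texts.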


DuraSpace News: POSTERS Wanted for VIVO 2017 Conference Until June 12

planet code4lib - Thu, 2017-06-08 00:00

There's still time to submit a poster or abstract! You have until Monday, June 12, so shuffle your weekend plans and give a boost to your career. Here's why it's worth it:

Islandora: New York Academy of Medicine Library Launches New Digital Collections

planet code4lib - Wed, 2017-06-07 18:05


New York (June 6, 2017) - The New York Academy of Medicine Library announced today the launch of its new digital collections and exhibits website, hosted on the open-source framework Islandora and accessible at The new site makes it easy for the public to access and explore highlights of the Library's world-class historical collections in the history of medicine and public health.

"The Academy is committed to enhancing access to our Library's world-class collections through digitization," said Academy President Jo Ivey Boufford, MD.

"With the launch of our new digital collections and exhibits website, users across the globe will have access to an ever-growing number of important resources in the history of medicine and public health."

The website includes a glimpse into the Library's rare and historical collections. In one day, high-end photographer Ardon Bar-Hama, courtesy of George Blumenthal, photographed a subset of the Library's treasures, all of which are accessible via the new website. Visitors interested in cookery can page through the Library's Apicius manuscript, with 500 Greek and Roman recipes from the 4th and 5th centuries. Other highlights include beautiful anatomical images from Andreas Vesalius's De humani corporis fabrica and striking botanicals like a hand-colored skunk cabbage (Symplocarpus foetida) plate from William P. C. Barton's Vegetable Materia Medica.

Also featured is The William H. Helfand Collection of Pharmaceutical Trade Cards, which contains approximately 300 colorful pharmaceutical trade cards produced in the U.S. and France between 1875 and 1895 that were used to advertise a wide range of goods in the nineteenth century. Such cards are now regarded as some of the best source material for the study of advertising, technology and trade in the post-Civil War period.

"It is gratifying to digitize our materials and see them come to life with the launch," said Robin Naughton, PhD, Head of Digital for the Library. "Our digital collections and exhibits website represent a bridge between the Academy Library's collections and the world as it intersects with the humanities and technology."

The Library will continue to launch new digital collections and exhibits, including "How to Pass Your O.W.L.'s at Hogwarts: A Prep Course," which celebrates the 20th anniversary of the publication of Harry Potter and the Philosopher's Stone and will be launched on June 26. Two other upcoming digital projects focus on the history of the book: "Facendo Il Libro/Making the Book," funded by the Gladys Krieble Delmas Foundation, and "Biography of a Book," funded by a National Endowment for the Humanities Digital Projects for the Public grant.

About The New York Academy of Medicine Library The Academy is home to one of the most significant historical libraries in medicine and public health in the world, safeguarding the heritage of medicine to inform the future of health. The Library is dedicated to building bridges among an interdisciplinary community of scholars, educators, clinicians, and the general public, and fills a unique role in the cultural and scholarly landscape of New York City. Serving a diverse group of patrons-from historians and researchers to documentary filmmakers to medical students and elementary school students-the Academy collections serve to inform and inspire a variety of audiences from the academic to the public at large.


Kiri Oliver 212.822.7278 |

District Dispatch: Washington Office at Annual 2017: Libraries #ReadytoCode

planet code4lib - Wed, 2017-06-07 14:28

Are you tracking what’s going on with coding in libraries? OITP’s Libraries #ReadytoCode initiative is in full swing and if you haven’t heard, you can find out more in Chicago.

The Ready to Code team is building off our Phase I project report to address some of the recommendations on support, resources and capacity school and public libraries need to get their libraries Ready to Code.

Get Your Library #ReadytoCode (Sunday, June 25, 1-2:30 p.m.)
Get a taste of what we heard from the field and hear from librarians who have youth coding programs up and running in their libraries. Join us on Sunday, June 25 from 1 to 2:30 p.m. Play "Around the World" and talk with library staff from different backgrounds and experiences who will share the ups and downs and ins and outs of designing coding activities for youth. Table experts will cover topics like community and family engagement, analog coding, serving diverse youth, evaluating your coding programs and more!

Learn how to get started. Hear about favorite resources. Build computational thinking facilitation skills. Discuss issues of diversity and inclusion. Visit each table and get your #ReadytoCode passport stamped with one-of-a-kind stamps. Share your own examples for a bonus stamp.

Start your library’s coding club with Google’s CS First and other free resources (Saturday, June 24, 1 – 2:30 p.m.)
Interested in offering a computer science program at your library? Join a team from Google to learn about free resources to support librarians in facilitating activities for youth, including how to set up and run free CS First clubs, designed to help youth learn coding in fun and engaging ways through interest-based modules like story-telling, design, animation and more. Speakers include Hai Hong, program manager of CS Education; Nicky Rigg, program manager of CS Education; and Chris Busselle, program manager of CS First.

Libraries as change agents in reducing implicit bias: Partnering with Google to support 21st Century skills for all youth (Saturday, June 24, 3 – 4 p.m.)
As our economy shifts, digital skills, computer science and computational thinking are becoming increasingly essential for economic and social mobility, yet access to these skills is not equitable. Join a team from Google, including Hai Hong and Nicky Rigg, to learn about recent research on addressing implicit biases in education, and be ready to work as we discuss how libraries and Google can partner to increase the diversity of youth who are prepared to participate in the digital future.

Tech education in libraries: Google’s support for digital literacy and inclusion (Sunday, June 25, 10:30 – 11:30 a.m.)
How can we better support our youth to participate in and benefit from the digital future? Join Google’s Connor Regan, associate product manager of Be Internet Awesome, and others from Google to learn about the range of free resources available to help librarians, families and communities to promote digital literacy and the safe use of the internet.


Want to know more? Follow the Libraries #ReadytoCode conference track on the Conference Scheduler and stock up on ideas to design awesome coding programs when you get back home!

The post Washington Office at Annual 2017: Libraries #ReadytoCode appeared first on District Dispatch.

HangingTogether: Institutional researchers and librarians unite!

planet code4lib - Tue, 2017-06-06 19:51

Institutional research information management requires the engagement and partnership of numerous stakeholders within the university or research institution. A critical stakeholder group on any campus are institutional researchers, and I encourage greater collaboration between university libraries and institutional research professionals to support research information management.

Last week I had the opportunity to present a poster at the annual meeting of the Association for Institutional Research (AIR), the primary professional organization for US institutional research (or IR) professionals. The IR professionals I spoke with expressed frustration with the difficulty of collecting high-quality, reliable information about research productivity at their institutions. They require this information for many different reasons:

  • They are increasingly being asked to report on faculty research activities as a component of institutional decision support and strategic planning.
  • They support institutional and disciplinary accreditation activities which require extensive accounting of research activities.
  • They are asked to support cyclical internal reviews of undergraduate and graduate degree programs (typically called program review). While program review emphasizes student academic activities and outcomes, quantitative and qualitative information about faculty research is needed.
  • They aggregate information that may support institutional competitiveness in national and international rankings and conduct benchmarking against peer institutions.
  • They may be asked to support or lead annual academic progress review workflows (called faculty activity reporting or FAR in the US), in which faculty self-report research, teaching, and service activities to support promotion and tenure evaluation as well as annual reviews.

IR professionals, who usually report directly to senior academic leadership, are keen to discover better ways to collect and interpret campus research productivity. While European institutions have been collecting and managing research information for some time, as demonstrated through the maturity of international organizations like EuroCRIS and the maintenance of database models like CERIF to support Current Research Information Systems (CRIS), this is still fairly new in the United States, and US research information management practices are developing quite differently from European CRIS models. As Amy Brand articulated in her excellent 2015 blog post, US RIM adoption lags in great part because no single campus unit "owns" interoperability; instead, system development takes place in a decentralized and uncoordinated way. This could be seen within the AIR community, where conversations about collecting program review and benchmarking data were usually separate from faculty activity reporting (FAR) workflows. Completely absent from the conversation were the RIM components that libraries are usually keen to address, including public researcher profiles to support expertise discovery, linkages to open access content and repositories, and reuse in faculty web pages, CVs, and biosketches. Different components of the RIM landscape are being developed and supported in siloed communities. This isn't good for anyone.

I see complementary goals and potential alliances between libraries and institutional research professionals. Collecting and managing the scholarly record of an institution is a challenging endeavor requiring enterprise collaboration. By working together and with other institutional stakeholders, I believe institutional researchers and librarians can collect and preserve quality metadata about the institutional scholarly record, and they can support a variety of activities, including public researcher profiles, faculty activity review workflows, linkages to open access content, and reporting and assessment activities–all parts of a rich, complex research information management infrastructure. By working together to enter once and reuse often, the researchers also win, as improved systems can save them time by reducing multiple requests for the same information, accelerate CV and biosketch creation, and automatically update other systems and web pages through APIs and plug-ins.

IR professionals are obviously data savvy, but publications metadata is usually outside of their experience. They understand that there are significant challenges to collecting the publications and scholarly record of their institutions, but they are largely unfamiliar with specific challenges of person, object, and institutional name ambiguity in bibliographic records or why sources or coverage may vary by discipline. Because they may have not previously collaborated with libraries, it’s easy for institutional researchers to miss the knowledge and expertise the library may offer in addressing these challenges. Libraries can offer institutional researchers this expertise, as well as knowledge about bibliographic standards, identifiers, and vocabularies.

Libraries have complementary perspectives on research and researcher information to offer cross-campus institutional research colleagues. For example, while libraries – and OCLC Research – are paying close attention to the evolving scholarly record and the growing importance of research data sets, grey literature, and preprints, this is largely unfamiliar and unimportant to institutional researchers. For the immediate future, publications remain the primary, measurable intellectual currency for benchmarking and reporting at US universities, as are traditional, article-level citation metrics. And unlike the library community, institutional reporting offices have little interest or experience in supporting open access, discoverability, expertise identification, and content preservation.

I think it’s equally important for libraries to ask what they can learn from the institutional research community. IR professionals are the experts about institutional data, and they hold the keys to demystifying campus information, including institutional hierarchies and affiliations that need to be addressed in any RIM implementation. They provide leadership and support for departmental, disciplinary, and institutional data aggregation efforts like accreditation, which provides them with a unique and powerful view of challenges and opportunities–for improving data, systems, workflows, and collaborations. They are familiar with institutional and national policies, like FERPA, that ensure personal privacy, and they also have well-established communities of practice to support data sharing, such as the Association of American Universities Data Exchange (AAUDE).

OCLC Research and working group members from OCLC Research Library Partnership institutions are working together to understand rapid changes in institutional research information management and the role of the library within it. Stay tuned for upcoming research reports this fall, as well as conference presentations this summer at the LIBER Annual Conference in Patras, Greece and the 8th annual VIVO conference in New York City.

About Rebecca Bryant

Rebecca Bryant is Senior Program Officer at OCLC where she leads research and initiatives related to research information management in research universities.


Archival Connections: Installing Social Feed Manager Locally

planet code4lib - Tue, 2017-06-06 18:12
The easiest way to get started with Social Feed Manager is to install Docker on a local machine, such as a laptop or (preferably) desktop computer with a persistent internet connection. Running SFM locally for anything other than testing purposes is NOT recommended. It will not be sufficient for a long-term documentation project and would […]

DPLA: DPLA Board Call: June 15, 2017, 3:00 PM Eastern

planet code4lib - Tue, 2017-06-06 17:50

The next DPLA Board of Directors call is scheduled for Thursday, June 15 at 3:00 PM Eastern. Agenda and dial-in information is included below. This call is open to the public, except where noted.


[Public] Welcome and Introduction of Board Members and Michele Kimpton, Interim Executive Director – Board Chair, Amy Ryan
[Public] Updates on Executive Director Search – Amy Ryan
[Public] DPLA Update – Michele Kimpton
[Public] Questions/comments from the public
Executive Session to follow public portion of call


Join from PC, Mac, Linux, iOS or Android:

Or Telephone: +1 408-638-0968 (US Toll)

Meeting ID: 173 812 951

Islandora: Islandora CLAW Install: Call for Stakeholders

planet code4lib - Tue, 2017-06-06 16:47

Have you ever installed Islandora yourself? Do you think it could be a better experience? Would you like to spare yourself, the community, and all the potential adopters out there the difficulties of installing an entire repository solution from scratch? Then the Islandora Foundation is calling on you to help make that possible.

Now that the core of CLAW is shaping up, we plan on holding a series of sprints to implement and document a modular installation process that will benefit everyone. We know that there is a deep well of knowledge and experience out there in the community, and we're hoping motivated individuals and organizations will step forward and commit to being part of this process. Identifying as a stakeholder will give you influence over the direction that this effort takes if you're willing to put in the time to make it happen.

Work will commence in July, but it will be in a different format than before. Before any programming or documenting gets started, we're asking stakeholders to be involved in a series of planning meetings to identify the scope of work to be done for each sprint.

So if you're interested in being involved with the creation of what will be one of the greatest features of CLAW, please respond to this doodle poll for an initial informative meeting about being a stakeholder. At the meeting, we will be discussing the new sprint format in detail, what it means to be a stakeholder, as well as prior efforts to give people the context they need to decide if they want to be involved. So if you're curious, please feel free to stop by. And if you don't feel like participating in conversation but just want to listen in, that's okay. As always, lurkers are welcome.

