Feed aggregator

Galen Charlton: IMLS support for free and open source software

planet code4lib - Sat, 2017-03-18 19:26

The Institute of Museum and Library Services is the U.S. government’s primary vehicle for direct federal support of libraries, museums, and archives across the entire country. It should come as no surprise that the Trump administration’s “budget blueprint” proposes to wipe it out, along with the NEA, NEH, Meals on Wheels, and dozens of other programs.

While there is reason for hope that Congress will ignore at least some of the cuts that Trump proposes, the IMLS in particular has been in the sights of House Speaker Paul Ryan before. We cannot afford to be complacent.

Loss of the IMLS and the funding it delivers would be a disaster for many reasons, but I’ll focus on just one: the IMLS has played a significant role in funding the creation and use of free and open source software for libraries, museums, and archives. Besides the direct benefit to the institutions that were awarded grants to build or use F/LOSS, such grants are a smart investment on the part of the IMLS: a dollar spent on producing software that anybody can freely use can redound to the benefit of many more libraries.

For example, here is a list of some of the software projects whose creation or enhancement was funded by an IMLS grant:

This is only a partial list; it does not include LSTA funding that libraries may have used to either implement or enhance F/LOSS systems or money that libraries contributed to F/LOSS development as part of a broader grant project.

IMLS has also funded some open source projects that ultimately… went nowhere. But that’s OK; IMLS funding is one way that libraries can afford to experiment.

Do you or your institution use any of this software? Would you miss it if it were gone — or never existed — or was only available in some proprietary form? If so… write your congressional legislators today.

District Dispatch: Inspired by music: a copyright history

planet code4lib - Fri, 2017-03-17 22:07

I started to work for ALA as a copyright specialist during the Eldred v. Ashcroft public domain court battle that ultimately went to the Supreme Court. The question was whether the recent extension of the copyright term under the Sonny Bono Copyright Term Extension Act of 1998 from life plus 50 years to life plus 70 years was constitutional. In a 7-2 ruling, the Court said that the term was constitutional and that Congress could determine any term of copyright as long as it was not forever. Even one day less than forever met the definition of “limited times” in the Copyright Clause. I was shattered because I was sure we were going to win. Naïve me.

ALA was one of the amici that supported Eric Eldred, an Internet publisher who relied on public domain materials for his business. A lot can be said about the case and a lot has been written. I have argued that the silver lining of the disastrous ruling was the formation of the Duke Center for the Study of the Public Domain, Creative Commons and other open licensing movements. The ruling also led to the publication of a comic book called Bound by Law? Tales from the Public Domain by James Boyle and Keith Aoki. It is a great book that should be in the collection of every library.

This year, there is another book by Boyle, Aoki and Jennifer Jenkins that should be in the collection of every library. It’s called Theft: A History of Music. It examines the certainty that music could not be written without relying on music that was created before: the “standing on the shoulders of giants” idea. There’s a great documentary called John Lennon’s Jukebox that illustrates how music that Lennon loved (rock n’ roll records from the United States) ended up in his music. This music inspired him to be a musician. Its creativity planted the seeds for his own creativity. You can hear a riff on the intro of Richie Barrett’s “Some Other Guy” on “Instant Karma.” That’s cool. (Meanwhile, we see court cases like Blurred Lines and Stairway to Heaven.)

Theft: A History of Music is a labor of love as well as a primer on copyright overall. If you are teaching copyright to librarians or students, this might be the only required text that you assign.

Available online under a Creative Commons license and in print. Here’s a video teaser.

The post Inspired by music: a copyright history appeared first on District Dispatch.

District Dispatch: Look Back, Move Forward: Freedom of Information Day

planet code4lib - Fri, 2017-03-17 19:22

Senator Tester accepting the James Madison Award at the Newseum in Washington, D.C. The award is given to those who have worked to protect public access to government information.

At the tail end of this year’s #SunshineWeek, let’s take a quick moment to #FlashbackFriday (or should we say #FOIAFriday?) to 29 years ago yesterday, when the American Library Association began celebrating Freedom of Information Day. In honor of the day this year, ALA presented U.S. Senator Jon Tester of Montana with the 2017 James Madison Award for his advocacy for public access to government information. Upon accepting the award, Senator Tester gave a short speech, which you can watch here.

“It is a true honor to receive this award. Throughout my time in the U.S. Senate, I have made it a priority to bring more transparency and accountability to Washington. By shedding more light across the federal government and holding officials more accountable, we can eliminate waste and ensure that folks in Washington, D.C. are working more efficiently on behalf of all Americans.”

At the ceremony, Senator Tester affirmed his longstanding commitment to increasing public access to information by formally announcing the launch of the Senate Transparency Caucus, which aims to shed more light on federal agencies and hold the federal government more accountable to taxpayers.

Earlier this week, Senator Tester also reintroduced the Public Online Information Act, which aims to make all public records from the Executive Branch permanently available on the Internet in a searchable database at no cost to constituents. In other words, this bill (if enacted) would cement the simple concept we know to be true: in the 21st century, public means online.

In honor of Senator Tester, here is a look back at the origins of ALA’s Freedom of Information Day: a 1988 resolution signed by Council to honor the memory of James Madison.

1988 Resolution on Freedom of Information Day.


The post Look Back, Move Forward: Freedom of Information Day appeared first on District Dispatch.

David Rosenthal: The Amnesiac Civilization: Part 4

planet code4lib - Fri, 2017-03-17 15:00
Part 2 and Part 3 of this series covered the unsatisfactory current state of Web archiving. Part 1 of this series briefly outlined the way the W3C's Encrypted Media Extensions (EME) threaten to make this state far worse. Below the fold I expand on the details of this threat.

The W3C's abstract describes EME thus:
This proposal extends HTMLMediaElement [HTML5] providing APIs to control playback of encrypted content.

The API supports use cases ranging from simple clear key decryption to high value video (given an appropriate user agent implementation). License/key exchange is controlled by the application, facilitating the development of robust playback applications supporting a range of content decryption and protection technologies.

The next paragraph is misleading; EME not merely enables DRM, it mandates at least an (insecure) baseline implementation of encrypted content:
This specification does not define a content protection or Digital Rights Management system. Rather, it defines a common API that may be used to discover, select and interact with such systems as well as with simpler content encryption systems. Implementation of Digital Rights Management is not required for compliance with this specification: only the Clear Key system is required to be implemented as a common baseline.

The Clear Key system requires that content be encrypted, but the keys to decrypt it are passed in cleartext. I will return to the implications of this requirement.
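
To make "keys passed in cleartext" concrete, here is a minimal Python sketch of a Clear Key style license. It assumes the JSON Web Key layout (a `keys` array of `kty`/`kid`/`k` objects holding unpadded base64url values) described in the Clear Key section of the EME specification; the helper names and the key bytes are hypothetical, chosen only for illustration.

```python
import base64
import json


def b64url(data: bytes) -> str:
    """Base64url-encode without padding (the encoding assumed for JWK fields)."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    """Restore the stripped padding, then decode."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def build_clear_key_license(keys: dict[bytes, bytes]) -> str:
    """Build a JSON license that maps each key ID to its content key."""
    return json.dumps({
        "keys": [
            {"kty": "oct", "kid": b64url(kid), "k": b64url(key)}
            for kid, key in keys.items()
        ],
        "type": "temporary",
    })


def read_keys(license_json: str) -> dict[bytes, bytes]:
    """Recover the raw content keys; no secret is needed to do so."""
    parsed = json.loads(license_json)
    return {
        b64url_decode(entry["kid"]): b64url_decode(entry["k"])
        for entry in parsed["keys"]
    }


if __name__ == "__main__":
    # Hypothetical 16-byte key ID and content key.
    key_id = bytes(range(16))
    content_key = bytes(range(16, 32))
    license_json = build_clear_key_license({key_id: content_key})
    print(license_json)
    # Anyone who can see the license can read the key back out.
    print(read_keys(license_json)[key_id].hex())
```

The point of the sketch is that `read_keys` needs nothing but the license itself: the "protection" is an encoding, not a secret.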

EME data flows

The W3C's diagram of the EME stack shows an example of how it works. An application, i.e. a Web page, requests the browser to render some encrypted content. It is delivered, in this case from a Content Distribution Network (CDN), to the browser. The browser needs a license to decrypt it, which it obtains from the application via the EME API by creating an appropriate session then using it to request the license. It hands the content and the license to a Content Decryption Module (CDM), which can decrypt the content using a key in the license and render it.
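
The flow just described can be sketched as a toy Python model. This is not the EME API: every class and method name here is hypothetical, the XOR routine merely stands in for real media encryption, and in a real DRM system the CDM would not hand decrypted content back to the browser as this simplification does.

```python
from dataclasses import dataclass
from itertools import cycle


def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Toy symmetric cipher standing in for real media encryption."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


@dataclass
class License:
    key_id: str
    key: bytes


class Application:
    """The Web page: it controls the license/key exchange."""

    def __init__(self, keys: dict[str, bytes]) -> None:
        self._keys = keys

    def request_license(self, key_id: str) -> License:
        return License(key_id, self._keys[key_id])


class ContentDecryptionModule:
    """Decrypts the content using the key in the license and renders it."""

    def decrypt_and_render(self, content: bytes, lic: License) -> str:
        return xor_cipher(content, lic.key).decode("utf-8")


class Browser:
    """Mediates between the application and the CDM."""

    def __init__(self, app: Application, cdm: ContentDecryptionModule) -> None:
        self.app = app
        self.cdm = cdm

    def play(self, key_id: str, encrypted: bytes) -> str:
        # Step 1: create a session and ask the application for a license.
        lic = self.app.request_license(key_id)
        # Step 2: hand the encrypted content and the license to the CDM.
        return self.cdm.decrypt_and_render(encrypted, lic)


if __name__ == "__main__":
    key = b"secret"
    # Encrypted content as it might arrive from a CDN.
    encrypted = xor_cipher("frame data".encode("utf-8"), key)
    browser = Browser(Application({"kid-1": key}), ContentDecryptionModule())
    print(browser.play("kid-1", encrypted))
```

The division of labour is the part that matters: the application decides whether a license is issued, and only the CDM ever applies the key.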

What is DRM trying to achieve? Ostensibly, it is trying to ensure that each time DRM-ed content is rendered, specific permission is obtained from the content owner. In order to ensure that, the CDM cannot trust the browser it is running in. For example, it must be sure that the browser can see neither the decrypted content nor the key. If it could see, and save for future use, either it would defeat the purpose of DRM.

The CDM is running in an environment controlled by the user, so the mechanisms a DRM implementation uses to obscure the decrypted content and the key from the environment are relatively easy to subvert. This is why in practice most DRM technologies are "cracked" fairly quickly after deployment. As Bunnie Huang's amazing book about cracking the DRM of the original Xbox shows, it is very hard to defeat a determined reverse engineer.

Content owners are not stupid. They realized early on that the search for uncrackable DRM was a fool's errand. So, to deter reverse engineering, they arranged for the 1998 Digital Millennium Copyright Act (DMCA) to make any attempt to circumvent protections on digital content a criminal offense. Cory Doctorow explains what this strategy achieves:
So if DRM isn't anti-piracy, what is it? DRM isn't really a technology at all, it's a law. Specifically, it's section 1201 of the US DMCA (and its international equivalents). Under this law, breaking DRM is a crime with serious consequences (5 years in prison and a $500,000 fine for a first offense), even if you're doing something that would otherwise be legal. This lets companies treat their commercial strategies as legal obligations: Netflix doesn't have the legal right to stop you from recording a show to watch later, but they can add DRM that makes it impossible to do so without falling afoul of DMCA.

This is the key: DRM makes it possible for companies to ban all unauthorized conduct, even when we're talking about using your own property in legal ways. This intrudes on your life in three ways:
  1. It lets companies sue and threaten security researchers who find defects in products
  2. It lets companies sue and threaten accessibility workers who adapt technology for use by disabled people
  3. It lets companies sue and threaten competitors who want to let you do more with your property -- get it repaired by independent technicians, buy third-party parts and consumables, or use it in ways that the manufacturer just doesn't like.
Of course, among the "ways that the manufacturer just doesn't like" can be archiving.

IANAL, but I do not believe that it is a defense under the DMCA that the "protections" in question are made of tissue paper. Thus, for example, it is likely that even an attempt to reverse-engineer an implementation of EME's Clear Key system in order to preserve the plaintext of some encrypted content would risk severe criminal penalties. Would an open source implementation of Clear Key be legal?

It is this interaction between even purely nominal DRM mechanisms and the DMCA that has roused opposition to EME. J. M. Porup's A battle rages for the future of the Web is an excellent overview of the opposition and its calls on Tim Berners-Lee to decry EME. Once he had endorsed it, Glyn Moody wrote a blistering takedown of his reasoning in Tim Berners-Lee Endorses DRM In HTML5, Offers Depressingly Weak Defense Of His Decision. He points to the most serious problem EME causes:
Also deeply disappointing is Berners-Lee's failure to recognize the seriousness of the threat that EME represents to security researchers. The problem is that once DRM enters the equation, the DMCA comes into play, with heavy penalties for those who dare to reveal flaws, as the EFF explained two years ago.

How do we know that this is the most serious problem? Because, like all the other code running in your browser, the DRM implementations have flaws and vulnerabilities. For example:
Google's CDM is Widevine, a technology it acquired in 2010. David Livshits, a security researcher at Ben-Gurion University, and Alexandra Mikityuk from Berlin's Telekom Innovation Laboratories, discovered a vulnerability in the path from the CDM to the browser, which allows them to capture and save videos after they've been decrypted. They've reported this bug to Google, and have revealed some proof-of-concept materials now showing how it worked (they've withheld some information while they wait for Google to issue a fix).

Widevine is also used by Opera and Firefox (Firefox also uses a CDM from Adobe).

Under German law -- derived from Article 6 of the EUCD -- Mikityuk could face criminal and civil liability for revealing this defect, as it gives assistance to people wishing to circumvent Widevine. Livshits has less risk, as Israel is one of the few major US trading partners that has not implemented an "anti-circumvention" law, modelled on the US DMCA and spread by the US Trade Representative to most of the world.

Note that we (and Google) only know about this flaw because one researcher was foolhardy and another was from Israel. Many other flaws remain unrevealed:
The researchers who revealed the Widevine/Chrome defect say that it was likely present in the browser for more than five years, but are nevertheless the first people to come forward with information about its flaws. As many esteemed security researchers from industry and academe told the Copyright Office last summer, they routinely discover bugs like this, but don't come forward, because of the potential liability from anti-circumvention law.

Glyn Moody again:
The EFF came up with a simple solution that would at least have limited the damage the DMCA inflicts here:
a binding promise that W3C members would have to sign as a condition of continuing the DRM work at the W3C, and once they do, they would not be able to use the DMCA or laws like it to threaten security researchers.

Alas, Cory Doctorow again:
How do we know that companies only want DRM because they want to abuse this law, and not because they want to fight piracy? Because they told us so. At the W3C, we proposed a compromise: companies who participate at W3C would be allowed to use it to make DRM, but would have to promise not to invoke the DMCA in these ways that have nothing to do with piracy. So far, nearly 50 W3C members -- everyone from Ethereum to Brave to the Royal National Institute for Blind People to Lawrence Berkeley National Labs -- have endorsed this, and all the DRM-supporting members have rejected it.

In effect, these members are saying, "We understand that DRM isn't very useful for stopping piracy, but that law that lets us sue people who aren't breaking copyright law? Don't take that away!"

It's not as though, as an educated Web user, you can decide that you don't want to take the risks inherent in using a browser that doesn't trust you, or the security researchers you depend upon. In theory Web DRM is optional, but in practice it isn't. Lucian Armasu at Tom's Hardware explains:
The next stable version of Chrome (Chrome 57) will not allow users to disable the Widevine DRM plugin anymore, therefore making it an always-on, permanent feature of Chrome. The new version of Chrome will also eliminate the “chrome://plugins” internal URL, which means if you want to disable Flash, you’ll have to do it from the Settings page.

You definitely want to disable Flash. To further "optimize the user experience":
So far only the Flash plugin can be disabled in the Chrome Settings page, but there is no setting to disable the Widevine DRM plugin, nor the PDF viewer and the Native Client plugins. PDF readers, including the ones that are built into browsers, are major targets for malicious hackers. PDF is a “powerful” file format that’s used by many, and it allows hackers to do all sorts of things given the right vulnerability.

People who prefer to open their PDF files in a better sandboxed environment or with a more secure PDF reader, rather than in Chrome, will not be able to do that anymore. All PDF files will always open in Chrome’s PDF viewer, starting with Chrome 57.

But that's not what I came to tell you about. Came to talk about archiving.

I fully appreciate the seriousness of the security threat posed by EME, but it tends to overwhelm discussion of EME's other impacts. I have long been concerned about the impact of Digital Rights Management on archiving. I first wrote about the way HTML5 theoretically enabled DRM for the Web in 2011's Moonalice plays Palo Alto:
Another way of expressing the same thought is that HTML5 allows content owners to implement a semi-effective form of DRM for the Web.

That was then, but now theory is practice. Once again, Glyn Moody is right on target:
One of the biggest problems with the defense of his position is that Berners-Lee acknowledges only in passing one of the most serious threats that DRM in HTML5 represents to the open Web. Talking about concerns that DRM for videos could spread to text, he writes:
For books, yes this could be a problem, because there have been a large number of closed non-web devices which people are used to, and for which the publishers are used to using DRM. For many the physical devices have been replaced by apps, including DRM, on general purpose devices like closed phones or open computers. We can hope that the industry, in moving to a web model, will also give up DRM, but it isn't clear.
So he admits that EME may well be used for locking down e-book texts online. But there is no difference between an e-book text and a Web page, so Berners-Lee is tacitly admitting that DRM could be applied to basic Web pages. An EFF post spelt out what that would mean in practice:
A Web where you cannot cut and paste text; where your browser can't "Save As..." an image; where the "allowed" uses of saved files are monitored beyond the browser; where JavaScript is sealed away in opaque tombs; and maybe even where we can no longer effectively "View Source" on some sites, is a very different Web from the one we have today.
It's also totally different from the Web that Berners-Lee invented in 1989, and then generously gave away for the world to enjoy and develop. It's truly sad to see him acquiescing in a move that could destroy the very thing that made the Web such a wonderfully rich and universal medium -- its openness.

The EFF's post (from 2013) had several examples of EME "mission creep" beyond satisfying Netflix:
Just five years ago, font companies tried to demand DRM-like standards for embedded Web fonts. These Web typography wars fizzled out without the adoption of these restrictions, but now that such technical restrictions are clearly "in scope," why wouldn't typographers come back with an argument for new limits on what browsers can do?

Indeed, within a few weeks of EME hitting the headlines, a community group within W3C formed around the idea of locking away Web code, so that Web applications could only be executed but not examined online. Static image creators such as photographers are eager for the W3C to help lock down embedded images. Shortly after our Tokyo discussions, another group proposed their new W3C use-case: "protecting" content that had been saved locally from a Web page from being accessed without further restrictions. Meanwhile, publishers have advocated that HTML textual content should have DRM features for many years.

Web archiving consists of:
content ... saved locally from a Web page ... being accessed without further restrictions.

It appears that the W3C's EME will become, in effect, a mandatory feature of the Web. Obviously, the first effect is that much Web video will be DRM-ed, making it impossible to collect in replayable form and thus preserve. Google's making Chrome's video DRM impossible to disable suggests that YouTube video will be DRM-ed. Even a decade ago, to study US elections you needed YouTube video.

But that's not the big impact that EME will have on society's memory. It will spread to other forms of content. The business models for Web content are of two kinds, and both are struggling:
  • Paywalled content. It turns out that, apart from movies and academic publishing, only a very few premium brands such as The Economist, the Wall Street Journal and the New York Times have viable subscription business models based on (mostly) paywalled content. Even excellent journalism such as The Guardian is reduced to free access, advertising and voluntary donations. Part of the reason is that Googling the headline of paywalled news stories often finds open access versions of the content. Clearly, newspapers and academic publishers would love to use Web DRM to ensure that their content could be accessed only from their site, not via Google or Sci-Hub.
  • Advertising-supported content. The market for Web advertising is so competitive and fraud-ridden that Web sites have been forced to let advertisers run ads so obnoxious, and indeed so riddled with malware, and to load up their sites with so many trackers, that many users have rebelled and use ad-blockers. These days it is pretty much essential to do so, to keep yourself safe and to reduce bandwidth consumption. Sites are very worried about the loss of income from blocked ads. Some, such as Forbes, refuse to supply content to browsers that block ads (which, in Forbes' case, turned out to be a public service; the ads carried malware). DRM-ing a site's content will prevent ads being blocked. Thus ad space on DRM-ed sites will be more profitable, and sell for higher prices, than space on sites where ads can be blocked. The pressure on advertising-supported sites, which include both free and subscription news sites, to DRM their content will be intense.
Thus the advertising-supported bulk of what we think of as the Web, and the paywalled resources such as news sites that future scholars will need, will become un-archivable. Kalev Leetaru will need to add a fourth, even more outraged, item to his list of complaints about Web archives.

The prospect for academic journals is somewhat less dire. Because the profit margins of the big publishers are so outrageous, and because charging extortionate subscriptions for access to the fruits of publicly and charitably-funded research is so hard to justify, they are willing to acquiesce in the archiving of their content provided it doesn't threaten their bottom line. The big publishers typically supply archives such as Portico and CLOCKSS with content through non-Web channels. CLOCKSS is a dark archive, so is no threat to the bottom line. Portico's post-cancellation and audit facilities can potentially leak content, so Portico will come under pressure to DRM content supplied to its subscribers.

Almost all the world's Web archiving technology is based on Linux or other Open Source operating systems. There is a good reason for this, as I wrote back in 2014:
One thing it should be easy to agree on about digital preservation is that you have to do it with open-source software; closed-source preservation has the same fatal "just trust me" aspect that closed-source encryption (and cloud storage) suffer from.

Lucian Armasu at Tom's Hardware understands the issue:
there may also be an oligopoly issue, because the content market will depend on four, and perhaps soon only three, major DRM services players: Google, Microsoft, and Apple. All of these companies have their own operating systems, so there is also less incentive for them to support other platforms in their DRM solutions.

What that means in practice is that if you choose to use a certain Linux distribution or some completely new operating system, you may not be able to play protected content, unless Google, Microsoft, or Apple decide to make their DRM work on that platform, too.

So it may not even be possible for Web archives to render the content even if the owner wished to give them permission.

Open Knowledge Foundation: Creating awareness about Open Data in Kyambogo University, Uganda

planet code4lib - Fri, 2017-03-17 14:00

This blog is part of the event report series on International Open Data Day 2017. On Saturday 4 March, groups from around the world organised over 300 events to celebrate, promote and spread the use of open data. 44 events received additional support through the Open Knowledge International mini-grants scheme, funded by SPARC, the Open Contracting Program of Hivos, Article 19, Hewlett Foundation and the UK Foreign & Commonwealth Office. This event was supported through the mini-grants scheme under the Open Research theme.

Kyambogo University’s Open Data Day event was about creating awareness of open data within the University community.

Held in the Library’s Computer lab on 4th March 2017, the event included presentations on open data and open access resources; an exhibition of open access library resources and a bonus event – tour of the library service centres. It was attended by librarians, academic staff, and students drawn from different faculties of the University.

Participants registering for the open data day event

The event kicked off with a presentation about open data by Mr Wangwe Isimail, a computer technician in Kyambogo University Library. He covered the following topics: what open data is; kinds of open data; who can open data; why open data; key features of openness; how to open data; and the top 21 data sources.

He briefed the participants on an open access workshop that was organised by the Kyambogo University Library in June 2016 which was attended by librarians, deans of faculties, lecturers, researchers and graduate students. The open access workshop was facilitated by Mr David Ball, a SPARC Europe Project Officer for PASTEUR4OA and FOSTER [European Union projects]. Mr Wangwe, in his presentation, emphasised the importance of open data as another element of open science in addition to open access and open source. Hopefully, in the future, the library will organise a workshop on open source too.

Mr Wangwe Isimail delivering a presentation on open data

At the end of the presentation, participants were asked to work in groups of five to discuss what Kyambogo University can contribute towards open access. Participants demonstrated an understanding of initiatives to promote open data. They suggested:

  • Increase participation in World Open Data Day celebrations so as to raise awareness among a wider audience
  • Set up a data repository (Kyambogo University Library is already in the process of setting up an institutional repository). It was exciting to hear the participants asking for sensitisation for the university management so they will deposit data into the institutional repository to increase transparency in the university.
  • Carry out sensitisation workshops in Kyambogo University to encourage people to open up their research data

Participants in a group discussion

The second presentation was about open access resources by Mary Acanit, an Assistant Librarian and Head of ICT Services in Kyambogo University Library. The presentation covered: the meaning of open access; open access resources available at Kyambogo University; comparison between open access resources and subscribed resources; how to access open access resources; and information searching techniques.

Ms. Mary Acanit delivering a presentation on open access resources

The presentation further looked at the benefits of open access and open access publishing models. In addition, participants went through a hands-on training on how to search for open access resources and each was asked to select any of the open access resource databases and download an article of choice on any topic. 

After the presentation, participants were given a tour of the library services. In the interest of time, participants were asked to visit a library service centre of their choice and were guided by librarians on duty.

There are four service centres, located in different parts of the university campus. Barclays Library is located in the East End of the campus with subject strengths in humanities, social sciences, and business and management. Barclays Library mainly serves the Faculty of Arts and Social Sciences, the School of Management and Entrepreneurship and the Faculty of Vocational Studies. West End Library has subject strengths in science, technology and engineering and mainly serves the Faculty of Science and the Faculty of Engineering. The Faculty of Education Library is a faculty library with subject strengths in education. The Faculty of Special Needs and Rehabilitation Library (FSN&R) is also a faculty library, in the North End of the campus, with subject strengths in special needs studies. Each of the service centres has a wireless internet connection to facilitate access to online library resources, including open access resources.

Some learnings from our Open Data Day event

I am glad to be part of a community that organised the open data day event at my institution and added a voice to promoting access to research data.

I was overwhelmed by the support I received from my library. I shared the idea about the open data day event with my colleagues and they were willing to offer a hand: making presentations, guiding participants during the library tour, identifying logistics, distributing invitations, etc. I learnt that we can make greater strides if we work as a team. My advice to people planning to organise similar events is to identify people who are passionate about the same cause as you and start your local open data community.



District Dispatch: #TTW17 recap, upcoming conferences and more

planet code4lib - Fri, 2017-03-17 13:10

Hi all. Well, I have accumulated a number of items and so figured it was time for a little update. I would like to begin with YALSA’s Teen Tech Week, which just concluded. We had a big push on coding-related activities and a highlight was a segment on the news in Detroit on WXYZ-TV (the ABC affiliate). It is great; check it out — only 90 seconds.

Next, I will be out in a couple of venues to talk about information policy. I will be at the ACRL conference in Baltimore on March 23-24 and have identified two specific times to meet up with folks to provide an update from the Washington swamp and answer questions: Thursday 3 to 4 p.m. and Friday 11:30 a.m. to 1 p.m. If interested, please email me for details.

I also will be a presenter at the upcoming Coalition for Networked Information Task Force meeting in Albuquerque, New Mexico on April 3-4. In the session, “Direct from the Swamp: Developments of the 45th President and 115th Congress,” I will be presenting with Krista Cox of the Association of Research Libraries. If you will be there and want to talk, please contact me and we can set up a time, or just see each other on-site.

Mr. Nick Minchin, Australian Consul-General, speaking at a recent Library For All reception at the Australian Consulate-General. LFA was co-founded by Australian citizen Rebecca MacDonald.

As I reported elsewhere, I accompanied ALA leadership to New York City to meet with publishing and library organizations. In addition, I had other meetings and I would like to highlight one of them. I am on the advisory board of the non-profit Library For All (LFA). By a happy coincidence of scheduling, I was able to attend a reception at the Australian Consulate-General to honor LFA, which was co-founded by Australian citizen Rebecca MacDonald.

LFA has built a digital library to deliver quality educational materials in developing countries. LFA’s mission is to make knowledge accessible to all, equally. Initially focused on obtaining published works to make them available to youth in developing countries (e.g., Haiti, Rwanda, Cambodia), new directions include creating original works such as building a Girls’ Collection that will inspire, empower and educate girls across the world.

“The Girls’ Collection will contain books with strong female characters and stories that show female readers that they are powerful and equal members of society. Through the Collection, girls accessing the digital library will be able to see their identities reflected in the characters of the stories they read, an essential reminder of how important their lives are and a powerful message for them to continue their education, pursue careers, and stand up for their rights as contributing members of the global economy.”

Finally, I would like to mention some noteworthy meetings on the Hill. Larra Clark and I, with Kevin Maher and Adam Eisgrau of the Office of Government Relations and counsel Norm Lent at Arent Fox, had a dozen meetings with staffers (for both the majority and minority) on the Congressional Committees on Small Business and Veterans Affairs or whose Members are on one of these committees. We made good use of our policy advocacy videos and the briefs on small business and veterans. There was considerable interest and goodwill and multiple opportunities for next steps. We are now contemplating our follow-ups for the coming months.

The post #TTW17 recap, upcoming conferences and more appeared first on District Dispatch.

District Dispatch: ALA urges action to increase Lifeline broadband options for low-income Americans

planet code4lib - Fri, 2017-03-17 12:54

Yesterday, the American Library Association joined digital inclusion allies in a letter to Federal Communications Commission Chairman Ajit Pai urging the FCC to act quickly on enabling Lifeline Broadband Providers to serve low-income Americans through the federal program. Developed by the Leadership Conference on Civil and Human Rights, the joint letter comes in the wake of the FCC revoking LBP designations in early February.

“The Wireline Competition Bureau Lifeline Broadband Provider (LBP) revocation order delays an array of innovative and high quality Lifeline broadband offerings and has a chilling effect on other potential Lifeline broadband entrants. The new LBP designation process is critical for increasing competition and facilitating competition and innovation in the Lifeline broadband program, and we urge the Federal Communications Commission to resume the designation process immediately.

We urge the Commission to act quickly on this matter as uncertainty regarding the process for broadband providers to participate in the Lifeline program delays access to affordable broadband to low-income households.”

The post ALA urges action to increase Lifeline broadband options for low-income Americans appeared first on District Dispatch.

FOSS4Lib Recent Releases: Archivematica - 1.6

planet code4lib - Thu, 2017-03-16 23:59

Last updated March 16, 2017. Created by Peter Murray on March 16, 2017.

Package: Archivematica
Release Date: Thursday, March 16, 2017

District Dispatch: President’s budget proposal to eliminate federal library funding

planet code4lib - Thu, 2017-03-16 22:09

This morning, President Trump released his budget proposal for FY2018. The Institute of Museum and Library Services (IMLS) is included in the list of independent agencies whose budgets the proposal recommends eliminating. Library funding that comes through other sources such as the Department of Education, the Department of Labor and the National Endowment for the Humanities is also affected. Just how deeply overall federal library funding is impacted is unclear at this point. The Washington Office is working closely with our contacts in the federal government to gather detailed information. We will provide the analysis of the total impact when it is complete and as quickly as possible.

One thing we all know for certain: Real people will be impacted if these budget proposals are carried through.

While we are deeply concerned about the president’s budget proposal, it is not a done deal. As I said in a statement issued this morning,

“The American Library Association will mobilize its members, congressional library champions and the millions upon millions of people we serve in every zip code to keep those ill-advised proposed cuts from becoming a congressional reality.”

There are several actions we can take right now:

  1. Call your Members of Congress  – ask them to publicly oppose wiping out IMLS, and ask them to commit to fighting for federal library funding. (You can find talking points and an email template on the Action Center.)
  2. Share your library’s IMLS story using the #SaveIMLS tag – tell us how IMLS funding supports your local community. If you aren’t sure which IMLS grants your library has received, you can check the searchable database available on the IMLS website.
  3. Sign up to receive our action alerts – we will let you know when and how to take action, and send you talking points and background information.
  4. Register to participate in National Library Legislative Day on May 1-2, either in Washington, D.C., or online.

Timing is key to the Federal budget/appropriations process. More information – along with talking points and scripts – will be forthcoming from the ALA Washington Office, particularly as it pertains to the upcoming advocacy campaign around “Dear Appropriator” letters. Meanwhile, please take the time to subscribe to action alerts and District Dispatch to ensure you receive the latest updates on the budget process.

The president’s budget has made clear that his funding agenda is not ours. It’s time for library professionals and supporters to make our priorities clear to Congress.

The post President’s budget proposal to eliminate federal library funding appeared first on District Dispatch.

FOSS4Lib Recent Releases: JHOVE - 1.6

planet code4lib - Thu, 2017-03-16 17:37

Last updated March 16, 2017. Created by Peter Murray on March 16, 2017.

Package: JHOVE
Release Date: Thursday, March 16, 2017

Open Knowledge Foundation: Transparency and Accountability in the management of DRC’s Extractive Sector: the role of Open Data

planet code4lib - Thu, 2017-03-16 14:00

This blog is part of the event report series on International Open Data Day 2017. On Saturday 4 March, groups from around the world organised over 300 events to celebrate, promote and spread the use of open data. 44 events received additional support through the Open Knowledge International mini-grants scheme, funded by SPARC, the Open Contracting Program of Hivos, Article 19, Hewlett Foundation and the UK Foreign & Commonwealth Office. This event was supported through the mini-grants scheme under the Open contracting and tracking public money flows theme.

The Open Data Initiative of the Democratic Republic of Congo joined the world to celebrate its first ever Open Data Day on 4th March in Kinshasa.

In a conference style, the event brought together more than 50 participants with varying backgrounds who are interested in open data and the impact it can have on the development of the Democratic Republic of Congo (DRC). The participants who came to the event included representatives from government, Parliamentarians, researchers from universities, students, entrepreneurs, innovators, etc.  

DRCongo Open Data initiative team welcome participants for the March 4th, 2017 Open data day in Kinshasa

The event centred on discussions on how open data can contribute to achieving the Sustainable Development Goals by 2030 in the DRC.

There were also discussions on how open data can contribute to enhancing transparency and accountability in the extractive industries in the DRC. For example, the discussion on open data and the extractive sector focused on the availability and state of data on revenue from mining, petroleum, etc. as well as the amount of tax extractive companies pay to the government and how the money is spent.

Participants at Open Data Day in the Democratic Republic of Congo, March 4, 2017

The event was led by 15 speakers at various sessions of the conference who made visual and oral presentations on open data to help participants understand open data and how open data can contribute to the development of the DRC. Through questions and responses, participants have built skills and knowledge on open data. Several participants commented on the Open Data Day in the DRC:

The Open Data Day event helped me have a good understanding of open data and its significance to enhancing transparency and accountability in different fields of activities, especially in the extractive sectors – Mr Mayambo; student, University of Kinshasa

The Open Data Initiative team received several recommendations from the participants and plans to call on the DRC government with those recommendations so as to promote open data in the country. Thanks to the grant from Hivos [facilitated by Open Knowledge International], the Open Data Day celebrated in the Democratic Republic of Congo was successful and helped to promote the importance of open data. Our analysis of the event has led us to conclude that many people in the country do not yet know about open data and its importance.

Therefore, we will work to strengthen the promotion of open data through strong marketing campaigns on TV, radio, newspapers, face-to-face campaigns, social media such as Facebook, Youtube, Twitter and improve our participation in major conferences on IT and data at national level, particularly in the provinces. The DRCongo Open Data Initiative has decided to organise the next Open Data Day on March 4, 2018, in Lubumbashi.

Open Knowledge Foundation: Open Data Day 2017: Throwback on the event organized in Cotonou (Benin)

planet code4lib - Thu, 2017-03-16 11:00

This blog is part of the event report series on International Open Data Day 2017. On Saturday 4 March, groups from around the world organised over 300 events to celebrate, promote and spread the use of open data. 44 events received additional support through the Open Knowledge International mini-grants scheme, funded by SPARC, the Open Contracting Program of Hivos, Article 19, Hewlett Foundation and the UK Foreign & Commonwealth Office. This event was supported through the mini-grants scheme under the Open contracting and tracking public money flows theme.

On Saturday, March 4, 2017, on the premises of the fablab, Blolab hosted the Open Data Day in Cotonou in Benin. Like the 300 other events held around the world on the occasion of Open Data Day, the initiative attracted dozens of people with different profiles to learn, exchange and share their experiences around open data. In Cotonou, the Bloggers Association of Benin organised the event under the theme « Learning and understanding the interest of open data as a lever for transparency ».

The idea was to explain to the public the importance of open data in the perspective of fighting corruption to promote good governance and citizen participation and citizen control of public action. Innovation and improved services to citizens/users are other open data issues addressed during the meeting.

Four communications on open data

The first presentation was entitled « Open Data, from the origins to the present » which was delivered by myself, Maurice Thantan. As head of the Association of Bloggers of Benin, the main organiser of the meeting, I first recalled the history, the objectives, the principles but also the advantages of open data. This presentation was punctuated by examples of citizen projects carried out using open data. Tools and resources needed to deepen the discussion were also shared.

The second presentation focused on the relationship between open data and transparency. This lively presentation was led by Franck Kouyami. As an open data specialist and an adept of the free movement in general, he had already organised and co-organized two open data days in Cotonou in the past.

The third communication was different in kind: it was a sharing of experience. Malick Tapsoba, head of Open Data Burkina Faso, was the special guest of the event. He shared the experience of Burkina Faso in the implementation of open data policy. In Francophone Africa, Burkina Faso is cited as a model for opening up public data. According to the Guardian, the country is one of the most promising and one to follow closely this year.

The fourth and final communication of the day was the work of Shadai Ali. Partner at Open SI (one of the sponsors of this Open Data Day), Shadaï explained in his presentation how open data can accelerate the digital transformation of companies.

In #Cotonou #OpenDataDay gathered this Saturday high-level participants including the Director of State Information Systems and Services

— Maurice Thantan (@seigla) March 5, 2017

Enthusiastic personalities and participants

These four presentations were of interest to the participants because of the quality of the speakers. The public was also composed of personalities. The Director of the Digital Agency and Director of State of Information Services and Systems at the Presidency of the Republic was present. The general manager of the media was also there. Researchers, civil society actors, journalists and academics also made the trip.

The event ended on a note of satisfaction. Some participants who had just been introduced to the Open Data theme affirmed their commitment to advocate for Open Data in Benin. A Storify thread of the day’s event in tweets and pictures is available here.

#OpenDataDay in #Cotonou: participants talk about the organisation by @ab_benin. Here Jean Jaures Tingbo gives his view of #ODD17. Video made by @jnoumonvi

— Maurice Thantan (@seigla) March 6, 2017

#OpenDataDay in #Cotonou: participants talk about the organisation by @ab_benin. Here Malick T. @mbakatre of @OpenDataBurkina gives his view of #ODD17 @jnoumonvi

— Maurice Thantan (@seigla) March 6, 2017



DuraSpace News: Recordings Available: DSpace 7 Webinar Series

planet code4lib - Thu, 2017-03-16 00:00

Austin, TX  DuraSpace concluded its latest Hot Topics Webinar Series, “Introducing DSpace 7: Next Generation UI.”  Curated by Claire Knowles, Library Digital Development Manager, The University of Edinburgh, this three-part series began by showcasing the efforts of the team behind the next generation UI for DSpace, highlighting aspects of DSpace 7 development.  The second webinar demonstrated how DSpace can be used for research data, taking a closer look at DSpace in action at the University of Edinburgh and the Dryad Digital Repository.

FOSS4Lib Upcoming Events: Archon Day

planet code4lib - Wed, 2017-03-15 20:03
Date: Monday, May 22, 2017 - 08:00 to 16:00
Supports: Archon

Last updated March 15, 2017. Created by Peter Murray on March 15, 2017.

For more details see Facebook.

Open Knowledge Foundation: Daystar University student journos learn about tracking public money through Open Data

planet code4lib - Wed, 2017-03-15 15:00

This blog is part of the event report series on International Open Data Day 2017. On Saturday 4 March, groups from around the world organised over 300 events to celebrate, promote and spread the use of open data. 44 events received additional support through the Open Knowledge International mini-grants scheme, funded by SPARC, the Open Contracting Program of Hivos, Article 19, Hewlett Foundation and the UK Foreign & Commonwealth Office. This event was supported through the mini-grants scheme under the Open Contracting and tracking public money flows theme.

Before March 4, 2017, I had grand plans for the Open Data Day event. The idea was to bring together journalists from our student paper Involvement, introduce them to the concept of open data, then have them look at data surrounding the use of public monies. From there, we’d see what stories could emerge. 

Then Stephen Abbott Pugh from Open Knowledge International linked me up with Catherine Gicheru, the lead for Code for Kenya, which is affiliated with the open data and civic technology organisation Code for Africa. The event took a wonderfully new turn.

Prestone Adie, data analyst at ICT Authority, started us off with an explanation of open data and gave us interesting links to sites such as Kenya National Bureau of Statistics, Kenya Open Data Portal, and Kenya’s Ethics and Anti-Corruption Commission. He then took it a step further by showing us some interesting blogs. There was one from a data analyst who uses his knowledge and expertise to post about mall tickets and fashion vloggers among other varied topics. There was also another that crowdsources information about bad roads in Kenya.

It was a prime teachable moment, and I jumped in to emphasise how good writing is not restricted to journalism students. Data scientists and self-confessed nerds are in on the game too, and doing some pretty provocative storytelling in the process.

We took a refreshments break, where Catherine and Florence Sipalla surprised us with delicious branded muffins, giving all participants a sugar rush that sustained us for the second session.

Catherine and Florence, who works as a communication consultant and trainer, walked us through what Code for Kenya is doing, using massive amounts of data to tell stories that keep our public officials accountable. Among the tools they’ve developed is PesaCheck, which enables citizens to verify the numbers that our leaders provide.

We then planned to have our students meet to come up with story ideas using these tools. I’m looking forward to what they will produce.

Here is what one of them said about the event:

As a journalist, the life blood of information today is data. The more you have it in your story the more credible and evidence based your story will look. Such a conference will inspire young journalists to rethink of how they write their stories. Data to me is inescapable  -Abubaker Abdullahi.

Prestone Adie’s list of Open Data sources
  1. Health facilities datasets
  2. Water and sanitation datasets
  3. Humanitarian datasets
  4. Africa open datasets
  5. Kenya Tender awards
  6. Data on parliament activities
  7. Laws made by the Kenyan parliament
  8. / / for agricultural datasets
  9. Commission on Revenue Allocation datasets
  10. Environmental data
  11. Historical photos

David Rosenthal: SHA1 is dead

planet code4lib - Wed, 2017-03-15 15:00
On February 23rd a team from CWI Amsterdam (where I worked in 1982) and Google Research published The first collision for full SHA-1, marking the "death of SHA-1". Using about 6500 CPU-years and 110 GPU-years, they created two different PDF files with the same SHA-1 hash. SHA-1 is widely used in digital preservation, among many other areas, despite having been deprecated by NIST through a process starting in 2005 and becoming official by 2012.

There is an accessible report on this paper by Dan Goodin at Ars Technica. These collisions have already caused trouble for systems in the field, for example for Webkit's Subversion repository. Subversion and other systems use SHA-1 to deduplicate content; files with the same SHA-1 are assumed to be identical. Below the fold, I look at the implications for digital preservation.

SHA-1 Collision (source)

Technically, what the team achieved is a collision via an identical-prefix attack. Two different files generating the same SHA-1 hash is a collision. In their identical-prefix attack, they carefully designed the start to a PDF file. This prefix contained space for a JPEG image. They created two files, each of which started with the prefix, but in each case was followed by different PDF text. For each file, they computed a JPEG image that, when inserted into the prefix, caused the two files to collide.

Conducted using Amazon's "spot pricing" of otherwise idle machines, the team's attack would cost about $110K. This attack is less powerful than a chosen-prefix attack, in which a second colliding file is created for an arbitrary first file. 2012's Flame malware used a chosen-prefix attack on MD5 to hijack the Windows update mechanism.

As we have been saying for more than a decade, the design of digital preservation systems must start from a threat model. Clearly, there are few digital preservation systems whose threat model currently includes external or internal evil-doers willing to spend $110K on an identical-prefix attack. Such an attack would, in any case, require persuading the preservation system to ingest a file of the attacker's devising with the appropriate prefix.

But the attack, and the inevitability of better, cheaper techniques leading to chosen-prefix attacks in the future illustrate the risks involved in systems that use stored hashes to verify integrity. These systems are vulnerable to chosen-prefix attacks, because they allow content to be changed without causing hash mis-matches.

The LOCKSS technology is an exception. LOCKSS boxes do not depend on stored hashes and are thus not vulnerable to identical-prefix or even chosen-prefix attacks on their content. The system does store hashes (currently SHA-1), but uses them only as hints to raise the priority of polls on content if the hashes don't match. The polling system is currently configured to use SHA-1, but each time it prepends different random nonces to the content that is hashed, mitigating these attacks.

If an attacker replaced a file in a LOCKSS box with a SHA-1 colliding file, the next poll would not be given higher priority because the hashes would match. But when the poll took place the random nonces would ensure that the SHA-1 computed would not be the collision hash. The damage would be detected and repaired. The hashes computed during each poll are not stored, they are of no value after the time for the poll expires. For details of the polling mechanism, see our 2003 SOSP paper.
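The effect of the nonce can be illustrated with a minimal Python sketch. This is not the LOCKSS polling protocol itself (see the SOSP paper for that); the function names `nonce_hash` and `poll_agrees` are hypothetical, and the sketch assumes only that each poll hashes a fresh random nonce followed by the content.

```python
import hashlib
import os


def nonce_hash(nonce: bytes, content: bytes) -> str:
    """Return the SHA-1 of a poll-specific nonce followed by the content."""
    return hashlib.sha1(nonce + content).hexdigest()


def poll_agrees(nonce: bytes, local_content: bytes, peer_content: bytes) -> bool:
    """Two replicas agree in a poll only if their nonce-prefixed hashes match."""
    return nonce_hash(nonce, local_content) == nonce_hash(nonce, peer_content)


# Each poll draws a fresh nonce, so hashes from one poll are useless in the next.
first_nonce = os.urandom(20)
second_nonce = os.urandom(20)
content = b"preserved content"

same_poll_agrees = poll_agrees(first_nonce, content, content)
tampered_detected = not poll_agrees(first_nonce, content, b"tampered content")
hash_differs_between_polls = (
    nonce_hash(first_nonce, content) != nonce_hash(second_nonce, content)
)
```

Because the nonce is hashed before the content, a pair of files constructed to collide under plain SHA-1 starts from a different internal hash state in each poll, so the precomputed collision should no longer apply.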

There are a number of problems with stored hashes as an integrity preservation technique. In Rick Whitt on Digital Preservation I wrote:
The long timescale of digital preservation poses another problem for digital signatures; they fade over time. Like other integrity check mechanisms, the signature attests not to the digital object, but to the hash of the digital object. The goal of hash algorithm design is to make it extremely difficult with foreseeable technology to create a different digital object with the same hash, a collision. They cannot be designed to make this impossible, merely very difficult. So, as technology advances with time, it becomes easier and easier for an attacker to substitute a different object without invalidating the signature. Because over time hash algorithms become vulnerable and obsolete, preservation systems depending for integrity on preserving digital signatures, or even just hashes, must routinely re-sign, or re-hash, with a more up-to-date algorithm.

When should preservation systems re-hash their content? The obvious answer is "before anyone can create collisions", which raises the question of how the preservation system can know the capabilities of the attackers identified by the system's threat model. Archives whose threat model leaves out nation-state adversaries are probably safe if they re-hash and re-sign as soon as the open literature shows progress toward a partial break, as it did for SHA-1 in 2005.

The use of stored hashes for integrity checking has another, related problem. There are two possible results from re-computing the hash of the content and comparing it with the stored hash:
  • The two hashes match, in which case either:
    • The hash and the content are unchanged, or
    • An attacker has changed both the content and the hash, or
    • An attacker has replaced the content with a collision, leaving the hash unchanged.
  • The two hashes differ, in which case:
    • The content has changed and the hash has not, or
    • The hash has changed and the content has not, or
    • Both content and hash have changed.
The stored hashes are made of exactly the same kind of bits as the content whose integrity they are to protect. The hash bits are subject to all the same threats as the content bits. In effect, the use of stored hashes has reduced the problem of detecting change in a string of bits to the previously unsolved problem of detecting change in a (shorter) string of bits.
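The ambiguity in the cases listed above can be seen in a small Python sketch. It is only an illustration: the helper names `digest` and `check` are hypothetical, and SHA-256 is an arbitrary choice of algorithm.

```python
import hashlib


def digest(data: bytes) -> str:
    """Hex SHA-256 digest of the data."""
    return hashlib.sha256(data).hexdigest()


def check(content: bytes, stored_hash: str) -> bool:
    """A stored-hash integrity check: recompute and compare."""
    return digest(content) == stored_hash


original = b"archived report, version 1"
stored = digest(original)

# Hash and content both unchanged: the check passes, as it should.
unchanged_ok = check(original, stored)

# An attacker with write access replaces both content and hash:
# the check still passes, although the content is no longer the original.
forged = b"archived report, forged"
forged_ok = check(forged, digest(forged))

# The stored hash is damaged while the content is intact:
# the check fails, although the content is fine.
false_alarm = not check(original, "0" * 64)
```

A passing check therefore says only that the content and the hash are consistent with each other, not that either is what was originally ingested.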

Traditionally, this problem has been attacked by the use of Merkle trees, trees in which each parent node contains the hash of its child nodes. Notice that this technique does not remove the need to detect change in a string of bits, but by hashing hashes it can reduce the size of the bit string.

The possibility that the hash algorithm used for the Merkle tree could become vulnerable to collisions is again problematic. If a bogus sub-tree could be synthesized that had the same hash at its root as a victim sub-tree, the entire content of the sub-tree could be replaced undetectably.
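As a rough illustration of the Merkle-tree idea, the following Python sketch computes a root hash over a list of content blocks. It is a minimal sketch under stated assumptions: the names `node_hash` and `merkle_root` are hypothetical, SHA-256 is an arbitrary choice, and duplicating the last node on odd-sized levels is just one common convention. Real systems typically also distinguish leaf hashes from interior hashes.

```python
import hashlib


def node_hash(data: bytes) -> bytes:
    """Hash used for both leaves and interior nodes in this sketch."""
    return hashlib.sha256(data).digest()


def merkle_root(blocks: list[bytes]) -> bytes:
    """Return the root hash of a binary Merkle tree over the blocks."""
    if not blocks:
        raise ValueError("at least one block is required")
    level = [node_hash(block) for block in blocks]
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Assumed convention: pair an unmatched last node with itself.
            level.append(level[-1])
        # Each parent is the hash of its two children's hashes.
        level = [
            node_hash(level[i] + level[i + 1])
            for i in range(0, len(level), 2)
        ]
    return level[0]


blocks = [b"block-0", b"block-1", b"block-2"]
root = merkle_root(blocks)

# Changing any block changes the root, so only the root needs protecting.
altered_root = merkle_root([b"block-0", b"block-X", b"block-2"])
```

Only the 32-byte root then has to be protected against change, but the scheme is exactly as strong as the hash function used at every level.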

One piece of advice for preservation systems using stored hashes is, when re-hashing with algorithm B because previous algorithm A has been deprecated, to both keep the A hashes and verify them during succeeding integrity checks. It is much more difficult for an attacker to create files that collide for two different algorithms. For example, using the team's colliding PDF files:

$ sha1sum *.pdf
38762cf7f55934b34d179ae6a4c80cadccbb7f0a shattered-1.pdf
38762cf7f55934b34d179ae6a4c80cadccbb7f0a shattered-2.pdf
$ md5sum *.pdf
ee4aa52b139d925f8d8884402b0a750c shattered-1.pdf
5bd9d8cabc46041579a311230539b8d1 shattered-2.pdf

Clearly, keeping the A hashes is pointless unless they are also verified. The attacker will have ensured that the B hashes match, but will probably not have expended the vastly greater effort to ensure that the A hashes also match.
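A minimal Python sketch of this advice follows. The function names `compute_digests` and `verify_all` are hypothetical, and the choice of SHA-1 as algorithm A and SHA-256 as algorithm B is an assumption for illustration.

```python
import hashlib


def compute_digests(data: bytes, algorithms=("sha1", "sha256")) -> dict[str, str]:
    """Compute one hex digest per algorithm, keyed by algorithm name."""
    return {name: hashlib.new(name, data).hexdigest() for name in algorithms}


def verify_all(data: bytes, stored: dict[str, str]) -> bool:
    """Pass only if every stored digest, old and new, still matches."""
    return all(
        hashlib.new(name, data).hexdigest() == expected
        for name, expected in stored.items()
    )


content = b"content ingested into the archive"
stored = compute_digests(content)  # keep both the A and the B hashes

intact = verify_all(content, stored)
replaced = verify_all(b"attacker's replacement", stored)

# A replacement that matched only one algorithm would still be caught,
# because the other stored digest would differ.
one_match_only = dict(stored, sha256=hashlib.sha256(b"other").hexdigest())
partial_match_rejected = not verify_all(content, one_match_only)
```

The check is an `all` over the stored digests on purpose: a file that collides under one algorithm is rejected unless it also collides under the other.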

Stored hashes from an obsolete algorithm are in practice adequate to detect random bit-rot, but as we see they can remain useful even against evil-doers. Archives whose threat model does not include a significant level of evil-doing are unlikely to survive long in today's Internet.

D-Lib: Open Access to Scientific Information in Emerging Countries

planet code4lib - Wed, 2017-03-15 12:13
Article by Joachim Schöpfel, University of Lille, GERiiCO Laboratory

D-Lib: Broken-World Vocabularies

planet code4lib - Wed, 2017-03-15 12:13
Article by Daniel Lovins, New York University (NYU) Division of Libraries; Diane Hillmann, Metadata Management Associates LLC

D-Lib: The Landscape of Research Data Repositories in 2015: A re3data Analysis

planet code4lib - Wed, 2017-03-15 12:13
Article by Maxi Kindling, Stephanie van de Sandt, Jessika Rücknagel and Peter Schirmbacher, Humboldt-Universität zu Berlin, Berlin School of Library and Information Science (BSLIS), Germany; Heinz Pampel, Paul Vierkant and Roland Bertelmann, GFZ German Research Centre for Geosciences, Section 7.4 Library and Information Services (LIS), Germany; Gabriele Kloska and Frank Scholze, Karlsruhe Institute of Technology (KIT), KIT Library, Germany; Michael Witt, Purdue University Libraries, West Lafayette, Indiana, USA

