planet code4lib


DuraSpace News: DuraCloud Selected as a Featured Open Source Project for Mozilla Global Sprint

Wed, 2018-04-18 00:00

DuraSpace is excited to announce that its project, Open Sourcing DuraCloud: Beyond the License, has been selected as a featured project for the Mozilla Foundation Global Sprint, to be held May 10th-11th, 2018.

HangingTogether: What’s changed in linked data implementations in the last three years?

Tue, 2018-04-17 20:42
Linking Open Data cloud diagram 2017, by Andrejs Abele, John P. McCrae, Paul Buitelaar, Anja Jentzsch and Richard Cyganiak. Licensed under CC-BY-SA.

OCLC Research conducted an “International Linked Data Survey for Implementers” in 2014 and 2015, attracting responses from a total of 90 institutions in 20 countries.  In the 2015 survey, 168 linked data projects or services were reported, of which 112 were described; 61% of them had been in production for over two years. This represented a doubling of the number of relatively “mature” linked data implementations compared to the 2014 results.

We are curious – what might have changed in the last three years? OCLC Research has decided to repeat its survey to learn details of new projects or services that format metadata as linked data and/or make subsequent uses of it that have launched since the last survey. And we are interested in what might have changed in linked data implementations or plans reported in the previous surveys.

The questions are mostly the same so we can more easily compare results. The target audience is staff who have implemented or are implementing linked data projects or services, either by publishing data as linked data, by consuming linked data resources into their own data or applications, or both.

So if you have implemented or are implementing a linked data project or service, please take the 2018 survey! The link:

We are asking that responses be completed by 25 May 2018. As with the previous surveys, we will share the examples collected for the benefit of others wanting to undertake similar efforts and add the responses to those from the 2014 and 2015 surveys (without contact information) available in this Excel workbook.

What do you think has changed in the last three years?

Code4Lib: Code4Lib Journal Issue 41 Call for Papers

Tue, 2018-04-17 14:05

Call for Papers (and apologies for cross-posting):

The Code4Lib Journal (C4LJ) exists to foster community and share information among those interested in the intersection of libraries, technology, and the future.

We are now accepting proposals for publication in our 41st issue. Don't miss out on this opportunity to share your ideas and experiences. To be included in the 41st issue, which is scheduled for publication in August 2018, please submit articles, abstracts, or proposals at or to by Friday, May 11, 2018. When submitting, please include the title or subject of the proposal in the subject line of the email message.

C4LJ encourages creativity and flexibility, and the editors welcome submissions across a broad variety of topics that support the mission of the journal. Possible topics include, but are not limited to:

* Practical applications of library technology (both actual and hypothetical)
* Technology projects (failed, successful, or proposed), including how they were done and challenges faced
* Case studies
* Best practices
* Reviews
* Comparisons of third party software or libraries
* Analyses of library metadata for use with technology
* Project management and communication within the library environment
* Assessment and user studies

C4LJ strives to promote professional communication by minimizing the barriers to publication. While articles should be of a high quality, they need not follow any formal structure. Writers should aim for the middle ground between blog posts and articles in traditional refereed journals. Where appropriate, we encourage authors to submit code samples, algorithms, and pseudo-code. For more information, visit C4LJ's Article Guidelines or browse articles from the first 40 issues published on our website:

Remember, for consideration for the 41st issue, please send proposals, abstracts, or draft articles to no later than Friday, May 11, 2018.

Send in a submission. Your peers would like to hear what you are doing.

Code4Lib Journal Editorial Committee

District Dispatch: Ready to Code school library brings music, coding to students with learning disabilities

Tue, 2018-04-17 12:54

In this 2.5-minute video, see how Heritage High School (Newport News, Va.) librarian Melanie Toran and the students she works with are combining music and coding to gain computational thinking literacies.

This post is the second in a series by Libraries Ready to Code cohort participants, who will release their beta toolkit at ALA’s 2018 Annual Conference.


The post Ready to Code school library brings music, coding to students with learning disabilities appeared first on District Dispatch.

Open Knowledge Foundation: Apply Now! School of Data’s 2018 Fellowship Programme

Tue, 2018-04-17 10:05

This blog has been reposted from the School of Data blog

School of Data is inviting journalists, data scientists, civil society advocates and anyone interested in advancing data literacy to apply for its 2018 Fellowship Programme, which will run from May 2018 to January 2019. Eight positions are open, one in each of the following countries: Bolivia, Guatemala, Ghana, Indonesia, Kenya, Malawi, Tanzania, The Philippines. The application deadline is Sunday, May 6th, 2018. If you would like to sponsor a fellowship, please get in touch with School of Data.

Apply for the Fellowship Programme

The Fellowship

School of Data works to empower civil society organisations, journalists and citizens with the skills they need to use data effectively in their efforts to create more equitable and effective societies. Fellowships are nine-month placements with School of Data for data-literacy practitioners or enthusiasts. During this time, Fellows work alongside School of Data to build an individual programme that will make use of both the collective experience of School of Data’s network to help Fellows gain new skills, and the knowledge that Fellows bring along with them, be it about a topic, a community or specific data literacy challenges.

As in previous years, our aim with the Fellowship programme is to increase awareness of data literacy and build communities who, together, can use data literacy skills to make the change they want to see in the world.

The 2018 Fellowship will continue the work in the thematic approach pioneered by the 2016 class. As a result, we will be prioritising candidates who:

  • possess experience in, and enthusiasm for, a specific area of data literacy training
  • can demonstrate links with an organisation practising in this defined area and/or links with an established network operating in the field

We are looking for engaged individuals who already have in-depth knowledge of a given sector or specific skillsets that can be applied to this year’s focus topics. This will help Fellows get off to a running start and achieve the most during their time with School of Data: nine months fly by!

Read More about the Fellowship Programme

The areas of focus in 2018

We have partnered with Hivos and NRGI to work on the following themes: Procurement and data in the extractives industry (oil, mining, gas). These amazing partner organisations will provide Fellows with guidance, mentorship and expertise in their respective domains.

2018 Fellowship Positions

Bolivia


The Fellowship in Bolivia will be focused on public procurement data through the Open Contracting Programme. For this position, School of Data is looking for someone with: experience with and interest in community building, experience with the implementation of civic projects with a data or technical component, storytelling skills, experience with promoting data or technical stories to a wide audience, and a basic understanding of the public procurement process.

Guatemala


The Fellowship in Guatemala will be focused on public procurement data through the Open Contracting Programme. For this position, School of Data is looking for someone with: experience in the planning, coordination and implementation of projects with civil society organisations, the ability to advise and train organisations on working with data and delivering technical projects, and a basic understanding of the public procurement process.

Ghana


The Fellowship in Ghana will be focused on extractives data through the Media Development Programme at NRGI. For this position, School of Data is looking for someone with: an interest in supporting or working within the civil society sector, experience working with financial (or related) data for analysis, experience as a trainer and/or community builder, interest and/or experience in the extractives sector, and demonstrated skills as a data storyteller or journalist.

Malawi


The Fellowship in Malawi will be focused on public procurement data through the Open Contracting Programme. For this position, School of Data is looking for someone with: experience with delivering technical and data-driven projects, experience with facilitating training activities, experience with data collection projects, basic understanding of the public procurement process

Indonesia


The Fellowship in Indonesia will be focused on public procurement data through the Open Contracting Programme. For this position, School of Data is looking for someone with: experience with delivering technical and data-driven projects, experience with facilitating training activities, experience with working with government systems or data. Candidates with the following optional interests and experience will be appreciated: experience with explaining complex topics to varied audiences, experience with user design methodologies, experience with community development

The Philippines

The Fellowship in The Philippines will be focused on public procurement data through the Open Contracting Programme. For this position, School of Data is looking for someone with: experience with user-centric research and design methodologies, experience with community-building activities, experience with data storytelling. Candidates with the following optional interests and experience will be appreciated: graphic design skills, experience with delivering trainings

Kenya


The Fellowship in Kenya will be focused on public procurement data through the Open Contracting Programme. For this position, School of Data is looking for someone with: experience with delivering data-driven projects, experience with user research and data storytelling, experience with explaining complex topics to varied audiences. Candidates with the following optional interests and experience will be appreciated: interest in or experience with supporting civic projects and civil society organisations, experience with facilitating training activities.

Tanzania


The Fellowship in Tanzania will be focused on public procurement data through the Open Contracting Programme. For this position, School of Data is looking for someone with: experience with delivering data-driven projects, experience with facilitating training activities, experience with explaining complex topics to varied audiences. Candidates with the following optional interests and experience will be appreciated: experience working with journalists or as a journalist, interest in or experience with supporting civic projects and civil society organisations, experience with writing pedagogical content

9 months to make an impact

The programme will run from May 2018 to January 2019 and entail up to 10 days of work a month. Fellows will receive a stipend of $1,000 USD a month for their work.

What are you waiting for?

Read more about School of Data’s Fellowship or Apply now

Key Information: Fellowship
  • Available positions: up to 8 fellows, 1 in each of the following countries: Bolivia, Guatemala, Ghana, Indonesia, Kenya, Malawi, Tanzania, The Philippines
  • Application deadline: May 6th, 2018, midnight GMT+0
  • Duration: From May 14th, 2018 to January 31st, 2019
  • Level of activity: 10 days per month
  • Stipend: $1000 USD per month

About diversity and inclusivity

School of Data is committed to being inclusive in its recruitment practices. Inclusiveness means excluding no one because of race, age, religion, cultural appearance, sexual orientation, ethnicity or gender. We proactively seek to recruit individuals who differ from one another in these characteristics, in the belief that diversity enriches all that we do.

DuraSpace News: VIVO Updates April 8 -- camp, conference, new sites, action planning

Tue, 2018-04-17 00:00

From Mike Conlon, VIVO Product Director

District Dispatch: 2018 WHCLIST award winner announced

Mon, 2018-04-16 16:13

This week, the American Library Association’s (ALA) Washington Office announced that Yolanda Peña-Mendrek of Oakley, California is the winner of the 2018 White House Conference on Library and Information Services (WHCLIST) Award. Given to a non-librarian participant attending National Library Legislative Day, the award covers hotel fees and includes a $300 stipend to defray the cost of attending the event.

An active library advocate and a member of the Friends of the Oakley Library, Peña-Mendrek was appointed as the Contra Costa County Library Commissioner in 2017. In her first year as Library Commissioner, she helped raise funds to support five branch libraries serving fast growing parts of the county.

A retired teacher, Peña-Mendrek is a firm believer in the importance of a good education and access to information. After she became a teacher, she got to see librarians working firsthand with the students at her school, as well as through the local library. She sees libraries as a place where people from all walks of life have the opportunity to expand their knowledge, and strongly believes that elected officials need to hear about the services libraries provide for their communities. Upon learning that she would be the recipient of the 2018 WHCLIST award, she had this to say:

I feel humbled and extremely honored to receive this scholarship to be able to represent my community, and to be their voice on this National Library Legislative Day 2018.

Beyond her involvement with the local libraries, Peña-Mendrek has also served a number of other organizations in her community, including the National Association of Latino Elected and Appointed Officials, the California School Boards Association, and the American Council on the Teaching of Foreign Languages.

Now that she is retired, Peña-Mendrek wants to put her energy to use by continuing to support her community’s access to libraries. We look forward to having her attend National Library Legislative Day 2018, where she will join other attendees from California to advocate on behalf of libraries.

The White House Conference on Library and Information Services—an effective force for library advocacy nationally, statewide and locally—transferred its assets to the ALA Washington Office in 1991 after the last White House conference. These funds allow ALA to participate in fostering a spirit of committed, passionate library support in a new generation of library advocates. Leading up to National Library Legislative Day each year, the ALA seeks nominations for the award. Representatives of WHCLIST choose the recipient.

The post 2018 WHCLIST award winner announced appeared first on District Dispatch.

LITA: Don’t miss Schema.org and JSON-LD – the repeat popular LITA webinar

Mon, 2018-04-16 15:54

It’s not too late to Sign up Now for the popular

Introduction to Schema.org and JSON-LD
Instructor: Jacob Shelby, Metadata Technologies Librarian, North Carolina State University (NCSU) Libraries
Wednesday April 18, 2018, Noon – 1:30 pm Central time

Web search engines such as Google, Bing, and Yahoo are integral to making information more discoverable on the open web. How can you expose data about your organization, its services, people, collections, and other information in a way that is meaningful to these search engines? This session will provide an introduction to both Schema.org and the JSON-LD data format.
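As a rough illustration of the kind of markup such a session covers, the sketch below builds a JSON-LD description of a library in Python. It assumes the Schema.org vocabulary (its `Library` and `PostalAddress` types); the helper name `library_jsonld` and the library name, URL, and city are made up for the example and do not come from the webinar.

```python
import json


def library_jsonld(name: str, url: str, city: str) -> str:
    """Return a JSON-LD string describing a library (hypothetical helper)."""
    record = {
        "@context": "https://schema.org",  # assumed vocabulary
        "@type": "Library",
        "name": name,
        "url": url,
        "address": {
            "@type": "PostalAddress",
            "addressLocality": city,
        },
    }
    return json.dumps(record, indent=2)


# Illustrative values only.
markup = library_jsonld(
    "Example Public Library", "https://library.example.org", "Springfield"
)

# JSON-LD is commonly embedded in a web page inside a script element.
print(f'<script type="application/ld+json">\n{markup}\n</script>')
```

Search engines that read structured data can pick up a block like this from the page source without any change to the visible page.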

View details and Register here.

Discover upcoming LITA webinars and web courses

The Privacy in Libraries, LITA webinar series continues with
Adopting Encryption Technologies
Wednesday April 25, 2018, Noon – 1:30 pm Central Time
Presenter: Matt Beckstrom

Register now to get the full series discounts, including recordings of the previous webinars or for any single series webinar.

Stay Safe From Ransomware, Hackers & Snoops by working on your IT Security

IT Security and Privacy in Libraries
Presenter: Blake Carver
Tuesday, May 1, 2018, 2:00 – 3:30 pm Central Time

Discover additional upcoming LITA webinars and web courses

Questions or Comments?

For all other questions or comments related to the course, contact LITA at (312) 280-4268 or Mark Beatty,

Library of Congress: The Signal: Control Issues: A Report of SXSW ’18

Mon, 2018-04-16 15:47

We went to the SXSW Conference this year to reach an audience of tech developers with our session Hacking the Library of Congress. As you may expect from an emerging technology conference, sessions on virtual reality (VR) (48 sessions) and blockchain (29 sessions) dominated the week.  At the Virtual Cinema, attendees demoed a variety of VR and augmented reality (AR) experiences– some of the most compelling of which were full-sensory mixed reality (MR) works such as Meow Wolf’s The Atrium. Panelists painted visions of the future that were as hopeful (blockchain democratizing the web) as they were scary (AI taking over humanity). Attendees took breaks at rest stations with VR vacation experiences and baby goats.

The hundreds of panels, meet-up sessions and networking events created a bubble in the city of more than 50,000 attendees, a bubble that was not burst by scary local events or realities.

Labs was interested in seeing past the hype of emerging technologies like AI (artificial intelligence) and blockchain to how they can be applied responsibly in cultural heritage spaces. Conference speakers from industries that are leading the way in adopting AI and machine learning revealed issues at the conference around transparency and control that are especially important for us to consider. How do we get our authoritative holdings to compete with invisible Google algorithms? How do we keep up with the demands of digital preservation in a market that doesn’t value it? How do we push back on our dependency on electronic journal vendors with ever-increasing subscription prices? We are interested in exploring the hidden controls of new technologies and how they could affect our profession.

In the session AR/VR Evolution or Revolution?, Tony Parisi of Unity Technologies spoke of the inevitability that all computing platforms will eventually move to the VR/AR space. The more important question is for consumers to decide what their “Personal Reality” will be. Given that attendees experienced a different reality than the rest of Austin that week, it made us wonder about this trajectory. A critical component of our work in cultural heritage is to present people with things that may not be pleasant in order to teach important lessons about history (the US Holocaust Memorial Museum is a great example). Doing this type of work in a world dominated by “Personal Realities” will be more critical than ever.

Case studies from the pharmaceutical industry using VR for empathetic education really resonated with us. Companies are designing experiences to train doctors on what it’s like to tour a smoker’s lung or experience symptoms of chronic illness such as migraines. The collections libraries serve create empathetic connections all the time, and we see a huge potential to amplify these moments of transformation between a patron and an object through contextualized virtual and mixed reality. Great examples of this potential include the Smithsonian American Art Museum’s “SAAM VR” product created with Intel and NASA’s “Beethoven’s 5th” VR film created in collaboration with Google.

Dell Intel provided relaxing virtual reality beach breaks for SXSW attendees.

Artificial Intelligence, the glue that makes these technologies possible, was discussed in the session “Letting Go: Designing for an AI you Can’t Control” by UX Designers representing Bonsai, Facebook, Singularity University, and As AI takes away the dependency for humans to wield the technology, designers are now freed to point AI to problems and dictate what positive outcomes should be. Instead of comparing users to some type of average, for example, the technology can adapt to who users are in the moment and how they change their behavior through time (referred to as “behavior systems”). This shift in the traditional power paradigm places importance on AI products to be transparent – showing a breadcrumb trail of decisions to users, and giving them multiple opportunities to say no as they engage with the product. Microsoft’s chatbot “Tay”, which was shut down only 16 hours after it was launched in 2016 after it began tweeting racist and sexual statements, was cited as an example of how very real an AI system failure could be – and what populations it could hurt.

The human role, more than ever, is to bring an ethical lens and critical questions to the design process. We must understand where the data is coming from, bring diverse perspectives into the process before the system is formed, and understand how a failure can affect certain populations. Here we see many parallels to the great initiatives around data already in place in the cultural heritage community by groups such as Always Already Computational, Frictionless Data, DocNow, and WikiWomen in Red.

Of all of the technology mentioned, blockchain proved to be the most elusive and misunderstood by attendees. Kim Jackson of Singular TV and the documentary filmmaker Alex Winter predicted that the hype of blockchain as an easy money scheme will soon fade, and that the technology, with its many useful applications, will one day be as invisible to users as – to paraphrase Winter – JavaScript is to the web. The speakers discussed possible outcomes of a decentralized web, such as users being able to conduct monetary transactions directly (eliminating the need for banks), and users controlling their own personal information, as opposed to being forced to submit PII repeatedly with every new online service they subscribe to. We think blockchain could be a huge game-changer for cultural heritage institutions. A digital ledger system could help authenticate the digital items we serve, ensuring the integrity of our primary resources when cited by third parties. Blockchain technology could also improve our understanding of how the items we make available are used. Imagine the implications for measuring impact and fundraising if we could see a history of an item by every person that had cited it!

In closing, we appreciated the opportunity to parse through the hype, fear, and promise of emerging technologies at SXSW as some of the only cultural heritage representatives in attendance. Labs looks forward to piloting these technologies (except the goats) at the Library of Congress on our experiments page and using this blog to offer thoughtful critique about our experiences.

ACRL TechConnect: Introducing Omeka S

Mon, 2018-04-16 15:00

My library has used Omeka as part of our suite of platforms for creating digital collections and exhibits for many years now. It’s easy to administer and use, and many of our students, particularly in history or digital humanities, learn how to create exhibits with it in class or have experience with it from other institutions, which makes it a good solution for student projects. This creates challenges, however, since it’s been difficult to have multiple sites or distributed administration. A common scenario is that we have a volunteer student, often in history, working on a digital exhibit as part of a practicum, and we want the donor to review the exhibit before it goes live. We had to create administrative accounts for both the student and the donor, which required a lot of explanations about how to get in to just the one part of the system they were supposed to be in (it’s possible to create a special account to view collections that aren’t public, but not exhibits). Even though the admin accounts can’t do everything (there’s a super admin level for that), it’s a bit alarming to hand out administrative accounts to people I barely know.

This problem goes away with Omeka S, which is the new and completely rebuilt Omeka. It supports having multiple sites (which is the new name for exhibits) and distributed administration by site. Along with this, there are sophisticated metadata templates that you can assign to sites or users, which takes away the need for lots of documentation on what metadata to use for which item type. When I showed a member of my library’s technical services department the metadata templates in Omeka S, she gasped with excitement. This should indicate that, at least for those of us working on the back end, this is a fun system to use.

Trying it Out For Yourself

I have included some screenshots below, but you might want to use the Omeka S Sandbox to follow along. You can experiment with anything, and the data is reset every Monday, Wednesday, Friday, and Sunday. The sandbox includes a variety of sample exhibits; one is “A Battered Tin Dispatch Box”, from which I include some screenshots below.

A Quick Tour Through Omeka S

This is what the Omeka Classic administrative dashboard looks like for a super administrator.

And this is the dashboard for Omeka S. It’s not all that different functionally, but definitely a different aesthetic experience.

Most things in Omeka S work analogously to Omeka Classic, but some things have been renamed or moved around. The documentation walks through everything in order, so it’s a great place to start learning. Overall, my feeling about Omeka S is that it’s much easier to tap into the powerful features with less of a learning curve. I first learned Omeka S at the DLF Forum conference in fall 2017 directly from Patrick Murray-John, the Omeka Development Team Manager, and some of what is below is from his description.

Sites


Omeka S has the very useful concept of Sites, which again function like exhibits in classic Omeka. Each site has its own set of administrative functions and user permissions, which allow for viewer, editor, or admin by site. I really appreciate this, since it allowed me to give student volunteers access to just the site they needed, and when we need to give other people access to view the site before it’s published we can do that. It’s easier to add outside or supplementary materials to the exhibit navigation. On the individual pages there are a variety of blocks available, and the layout is easier for people without a lot of HTML skills to set up.

Resource Templates

These existed in Omeka Classic, but were less straightforward. Now you can set a resource template with properties from multiple vocabularies and build the documentation right into the template. The data type can be text or URI, or draw from vocabularies with autosuggest. For example, you can set the Rights field to draw from Rights Statement options.

Items


Items work in a similar fashion to Omeka Classic. Items exist at the installation level, so can be reused across multiple sites. What’s great is that the nature of an item can be much more flexible. They can include URIs, maps, and multiple types of media such as a URL, HTML, IIIF image, oEmbed, or YouTube. This reflects the actual way that we were using Omeka Classic, but without the technical overhead to make it all work. This will make it easier for more people to create much more interactive and web-integrated exhibits.

Item Sets

Item Sets are the new name given to Collections and, like Items, they can have metadata from multiple vocabularies. Unlike Collections, though, an item can belong to multiple Item Sets, and Item Sets can be associated with sites to limit what people see. The tools for batch adding and editing are similar, but more powerful because you can actually remove or edit metadata in bulk.

Themes


Themes in Omeka S have changed quite a bit, and as Murray-John explained, it is more complicated to do theming than in the past. Rather than calling local functions, Omeka S uses patterns from Zend Framework 3, and so the process of theming will require more careful thought and planning. That said, the base themes provided are a great starting point, and thanks to the multiple options for layouts in sites, it’s less critical to be able to create custom themes for certain exhibits. I wrote about how to create themes in Omeka in 2013, and while some of that still holds true, you would want to consult the updated documentation to see how to do this in Omeka S.

Mapping


One of my favorite things in Omeka S is the Mapping module, which allows you to add geolocation metadata to items, and create a map on site pages. Here’s an example from the Omeka S Sandbox with locations related to Scotland Yard mapped for an item in the Battered Tin Dispatch Box exhibit.

This can then turn into an interactive map on the front end.

For the vast majority of mapping projects that our students want to do, this works in a very straightforward manner. Neatline is a plugin for Omeka Classic that allows much more sophisticated mapping and timelines–while it should be ported over to Omeka S, it currently is not listed as a module. In my experience, however, Neatline is more powerful than what many people are trying to do, and that added complexity can be a challenge. So I think the Mapping module looks like a great compromise.

Possible Approaches to Migration

Migration between Omeka Classic and Omeka S works well for items. For that, there’s the Omeka2 Importer module. Because exhibits work differently, they would have to be recreated. Omeka.net, the hosted version of Omeka, will stay on Omeka Classic for the foreseeable future, so there’s no concern that it will stop being supported any time soon, according to Patrick Murray-John.


We are still working on setting up Omeka S. My personal approach is that as new ideas for exhibits come up, we will start them first in Omeka S. As we have time and interest, we may start to migrate older exhibits if they need continual management. Some of our older exhibits rely on Omeka Classic plugins, but we are planning to mostly create new exhibits that don’t rely on Omeka Classic plugins. I am excited to pair this with our other digital collection platforms to build exhibits that use content across our platforms and extend into the wider web.


Open Knowledge Foundation: Open mapping in Côte d’Ivoire, Mongolia and the USA

Mon, 2018-04-16 14:39

Authors: Delia Walker-Jones (OSM-Colorado) and Kanigui Nara (SCODA Côte d’Ivoire)

This blog is part of the event report series on International Open Data Day 2018. On Saturday 3 March, groups from around the world organised over 400 events to celebrate, promote and spread the use of open data. 45 events received additional support through the Open Knowledge International mini-grants scheme, funded by Hivos, SPARC, Mapbox, the Hewlett Foundation and the UK Foreign & Commonwealth Office. The events in this blog were supported through the mini-grants scheme under the Open Mapping theme.

School of Data (SCODA) Côte d’Ivoire

During Open Data Day in Abidjan (Côte d’Ivoire), we gathered 13 activists working on extractive industries. First, we presented the 2015 EITI (Extractive Industries Transparency Initiative) report for Côte d’Ivoire. This report mainly covers the payments made by extractive industries to the government of Côte d’Ivoire.

The 2015 EITI report also publishes the geographical coordinates of operating licenses in the country. We started by showing participants where to find these data in the report. The first task was to show how the data are organised and what they mean.

We explained that for each operating license, the report gives the geographical coordinates of the points that delimit the operating field. We also discussed the definitions of longitude and latitude and the encoding system (degrees, minutes, seconds) used in the report. After that, participants were divided into groups of two, and we asked each group to use Tabula to extract the geographical coordinates of the operating license of Societe des Mines d’Ity. This firm operates in the western part of the country.

One of the main challenges of the day was cleaning up the extracted data. We had already prepared a step-by-step cleaning spreadsheet. We started by introducing the functions used for cleaning: “LENGTH”, “FIND & REPLACE”, “MID” and “SUBSTITUTE” were presented before going through the spreadsheet.

Once the data were cleaned up and organised by firm name, delimitation point, longitude and latitude, we converted longitude and latitude into decimal degrees. Then we introduced Umap, and each group created a map project and started adding the delimiting points of the operating license of Societe des Mines d’Ity.
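The conversion from degrees, minutes and seconds to decimal degrees can also be scripted. The following is a minimal sketch rather than the workshop's actual spreadsheet: the function name `dms_to_decimal`, the input format and the sample coordinates are all assumptions made for illustration and are not taken from the EITI report.

```python
import re

# Assumed input format: degrees, minutes, seconds and a hemisphere letter,
# e.g. 6°52'30"N. Real extracted data may need extra cleaning first.
DMS_PATTERN = re.compile(
    r"""(\d+)\s*°\s*(\d+)\s*['′]\s*(\d+(?:\.\d+)?)\s*["″]\s*([NSEW])"""
)


def dms_to_decimal(text):
    """Convert a degrees/minutes/seconds string to signed decimal degrees."""
    match = DMS_PATTERN.search(text.strip().upper())
    if match is None:
        raise ValueError(f"Not a recognised DMS coordinate: {text!r}")
    degrees, minutes, seconds, hemisphere = match.groups()
    value = float(degrees) + float(minutes) / 60 + float(seconds) / 3600
    # Southern latitudes and western longitudes are negative by convention.
    return -value if hemisphere in "SW" else value


# Hypothetical coordinates, for illustration only.
latitude = dms_to_decimal("6°52'30\"N")    # about 6.875
longitude = dms_to_decimal("8° 06' 15\" W")  # about -8.104
```

The resulting signed values are the form that web mapping tools generally expect when points are entered by latitude and longitude.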

This event was an opportunity for participants to understand geographical coordinates and to strengthen their data extraction and data cleaning skills.

We recommend making sure that participants have a clear understanding of geographical coordinates before starting a mapping event. The next step for us is to design specific training in mapping and to organise mapathon events using OSM.

Open Street Maps (OSM) Colorado: Ger Community Mapping Center mapathon

In Denver, Colorado during Open Data Day, with the assistance of a grant from Mapbox, Open Street Maps Colorado hosted a mapathon for the Ger Community Mapping Center, a non-profit based in Ulaanbaatar, Mongolia. The weather outside was warm and sunny, but the mapathon nonetheless lured a number of GIS and geography professionals and students into a local university conference room for an afternoon spent on Open Street Maps, digitizing aerial imagery from Mongolia.

We opened the event with a couple of presentations about Open Data Day and about the region of Mongolia the Ger Community Mapping Center elected to map. The Arkhangai province, the selected region, is a mostly rural province about 300 miles west of the capital Ulaanbaatar.

We saw from the aerial imagery in Open Street Maps the incredibly varied geography of the Arkhangai province, from tiny, barely visible track roads and vast forests in some areas to densely populated residential neighborhoods filled with dozens of gers (yurts) in other areas. As the participants slowly digitized the many features, this varied geography sparked conversations about how to classify smaller roads barely visible in the grass, and where to delineate residential areas in a consistent manner.

Conversations moved towards the topic of open data, as well. Questions about how to determine standards for open data, and the ethical ramifications of privacy and open spatial data through aerial imagery came to light. In the case of this mapathon, we discussed gers (yurts) and the importance of including gers in spatial data. While in many Western contexts buildings like gers would not be included, and, in fact, have not warranted a separate OSM tag, gers seemed necessary to incorporate within the cultural context of Mongolia: even inside the capital city of Ulaanbaatar, many Mongolians still live in gers. Gers, therefore, are not only a feature that belongs on a map of Mongolia, but are also an essential feature for assessing population and the movements of the estimated 30% of Mongolians who are still nomadic or semi-nomadic.

By discussing topics like this, we hoped to bring to light a part of the world not many people living in Denver, Colorado know about, and to provide a substantial amount of new shapefiles and data for the Ger Community Mapping Center to use in future projects.


District Dispatch: Record number of signatures in the Senate

Mon, 2018-04-16 12:40

Thanks to the efforts of ALA advocates across the country, this year’s Dear Appropriator Campaign proved a success in the Senate with an increased number of signatures on the Library Services and Technology Act (LSTA) letter and sustained support for the Innovative Approaches to Literacy (IAL) letter.

The bipartisan LSTA letter was led by Senators Jack Reed (D-RI) and Susan Collins (R-ME) and called for at least $189 million in funding for LSTA. Forty-six Senators signed the letter this year, one more than last year and the highest number of signatures ever generated for LSTA in the Senate! Every Senator who signed last year returned to sign again and newly sworn-in Senator Doug Jones (D-AL) signed for the first time. The IAL letter was led by Senators Jack Reed (D-RI) and Debbie Stabenow (D-MI) and called for level funding for IAL at $27 million. This year, 35 Senators lent their support, one short of last year’s high-water mark as Republicans steered clear of funding letters this year.

This follows the successful House campaign, which saw a near-record number of Representatives signing the LSTA letter and a strong showing on the IAL letter. The House LSTA letter was signed by 136 representatives (the second most ever), a solid result for only 10 days of campaigning; normally, there is a three- to four-week window to gather signatures. Four members submitted their own individual LSTA letter: Bustos (D-IL-17), Cardenas (D-CA-29), Lance (R-NJ-7) and Jenkins (R-KS-2). Several new members added signatures to the LSTA letter, and the number of Republicans on the letter rose from three to four. Taking into account two resignations, one death and committee shifts in the House, the number of LSTA signatories is the same as last year.

The FY 2019 IAL Dear Appropriator Letter was again led by four representatives: Rep. Eddie Bernice Johnson (D-TX), Rep. Don Young (R-AK), Rep. Jim McGovern (D-MA) and Rep. Tom MacArthur (R-NJ). The final count for the IAL letter was lower than last year, with 98 signatures compared to 146. ALA advocated for IAL alongside a large coalition of education partners, so the drop is not due to inactivity on the part of ALA advocates or allies. There is some sense that the focus of school programs and school libraries has shifted to the Title IV program, authorized under the 2015 Every Student Succeeds Act. The $700 million increase in Title IV funding in FY 2018 may be evidence of this shift. WO staff, alongside colleagues at AASL, will continue to monitor the program and investigate potential changes in our policy advocacy.

Finally, we are very proud to report that Washington Office staff worked closely with state chapters and associations this year, generating over 75 letters to Members of Congress from 20 state chapters, including Alaska, Connecticut, Florida, Hawaii, Illinois, Indiana, Iowa, Maine, Minnesota, Missouri, Montana, New Hampshire, North Carolina, Pennsylvania, South Carolina, Texas, Utah, Virginia, Washington, West Virginia and the District of Columbia. We also generated letters from COSLA, AILA and ATALM and worked with United as well.

While the letter campaigns are behind us, we must continue to stay engaged. To maintain our momentum, we will be encouraging ALA advocates to call and thank the Senators and Representatives who signed these letters. In D.C., we are also gearing up for the 475 library workers and advocates who are flying in for National Library Legislative Day on May 7 and 8.

As the budget process progresses, we will keep you all apprised. We expect to have future actions soon. Once again, thank you all for your continued support and advocacy!

The post Record number of signatures in the Senate appeared first on District Dispatch.

HangingTogether: What metadata managers expect from and value about the Research Library Partnership

Mon, 2018-04-16 12:30
Geographic spread of the OCLC Research Library Partnership

That was the topic discussed recently by OCLC Research Library Partners metadata managers, initiated by Roxanne Missingham of Australian National University, John Riemer of University of California, Los Angeles, and Melanie Wacker of Columbia University. All metadata managers at Research Library Partner institutions currently rely on OCLC services and applications developed by OCLC Research. Metadata managers discussed what is working well in their current operations and shared ideas for short-term and long-term goals. In particular, metadata managers were encouraged to share their thoughts on how they envision the OCLC Research Library Partnership in the future.

Most reported that they use (or were interested in) OCLC Research’s FAST (Faceted Application of Subject Terminology) and VIAF (Virtual International Authority File) applications, but not everyone knew about all OCLC Research applications. Metadata managers pointed to conflicting priorities in their desires for future OCLC Research goals: expediting their daily operations vs. being the “change agent” to lead the way to the needs of future metadata management, such as linked data. Some of the OCLC Research activities and initiatives in this area that Partners valued include:

Metadata managers noted that metadata cleanup is a prerequisite to identifying entities. This dependency blurs the lines between OCLC services and OCLC Research. A theme of the discussions was the call for closer collaboration between OCLC Research, OCLC products and services, and the OCLC Research Library Partnership. Staff from OCLC services contributed to these discussions, and noted that OCLC Research staff are working closely with product managers to move FAST and VIAF into production. Metadata managers also would like to see OCLC Work Identifiers moved into production. The new 758 MARC field recently approved, defined as a Resource Identifier in the MARC 21 Bibliographic Format, should help with sharing URIs for works represented in WorldCat once the requirements for the field are specified.

The aspects of their institutions’ affiliation with the OCLC Research Library Partnership that metadata managers found most beneficial included:

  • Facilitated information exchanges with their peers
  • Participation in working groups
  • Works in Progress webinars showcasing developments at other OCLC Research Library Partner institutions
  • Participation in the SHARES program, the resource sharing arm of the OCLC Research Library Partnership
  • Access to OCLC Research staff for consultation
  • Invitations to participate in pilot projects
  • Introducing innovative approaches to common issues. “It’s more effective to do research collaboratively than if we do it on our own.”
  • OCLC Research reports on trends in the field, especially when the reports include recommendations that could be implemented locally.

For future areas of work, metadata managers wanted more “practical applications.” Suggestions included:

  • Offering consultants and practical follow up steps that help with prototyping tools that can take advantage of shared experimentation.
  • Moving OCLC Work Identifiers into production.
  • More and closer coordination with other groups, such as the Program for Cooperative Cataloging and the Linked Data for Production partners
  • Guidance on working in a hybrid MARC-linked data environment
  • A linked data “sand box” where Partners could experiment with others
  • Entity reconciliation and tools for identity management
  • More, better APIs
  • Increasing visibility of primary sources

This feedback will help guide future activities.

Hugh Rundle: Building our own house

Mon, 2018-04-16 10:57

From the beginning, newCardigan wanted to do things differently. We're all volunteers, and need newCardigan administration to be relatively painless. But we also value our members' privacy and security, and wanted to do the right thing by them. With this month's GLAM Blog Club theme being 'control', I thought it might be interesting to run through the newCardigan tech stack to explain how we've set things up, and why.

newCardigan uses software for two basic functions: promotion, and event management. I'm excluding @ausglamblogs here, because that's its own separate thing. Initially, we had a website that was literally one html file and a css file on a webserver. Push communications were achieved via a monthly newsletter using TinyLetter, and event bookings were managed by asking people to email us if they were intending to come to the next cardiParty. Obviously, this wasn't going to be sustainable in the long term, but it was low-maintenance, and relatively easy to manage as we were getting started.

It's safe to say that newCardigan quickly became rather bigger, rather faster, than the founders anticipated. Our string-and-chewing-gum arrangement quickly became unwieldy. The first thing to fix was the website. I set up a Ghost site, the same software I use for this blog. This enabled us to have multiple authors, and a proper publishing platform. We used Universe to manage cardiParty bookings for a while, but then moved to Eventzilla because it looked like it would allow us to use waiting lists. We also wanted a way for the newCardigan community to talk to each other in 'our' space, so we launched a Discourse instance and plugged it into the Ghost site to allow for comments directly in the same page as our cardiParty posts. When we launched our cardiCast podcast, we used Soundcloud to host and publish. Internally, we experimented with Trello to plan things, and Loomio and then Slack to help the 'cardiCore' get organised and stay in touch. We also moved from TinyLetter to Mailchimp to enable automatic posting of event emails, by integrating Mailchimp with the website RSS feed.

On the one hand, we were being a bit agile by trying things out. On the other hand, it started to feel like we were churning through a lot of systems that weren't working for us. The main problems were a lack of flexibility, particularly in the freemium platforms like Slack and Eventzilla, as well as looming subscription costs for Slack and nervousness about the future of Soundcloud. But we also realised it was time to formalise newCardigan, which meant we needed a proper membership database and more sophisticated way to manage records. Through all of this, we were ultimately looking for systems that were easy to use (for us and our community members), but also privacy-friendly, controlled by us, and able to be customised to our needs. Inevitably, we ended up with an open source stack.

The core of our system is CiviCRM, a constituent relationship management system specifically designed for non-profits. This allowed us to get rid of Eventzilla and Mailchimp, and manage contacts in a more sophisticated way. CiviCRM can be used by itself, but mostly works as a plugin for Drupal, Joomla! or WordPress. Since WordPress is what we were all most familiar with, we decided to migrate the website from Ghost to WordPress with a CiviCRM integration. As part of the move, we ditched comments completely, and shut down the Discourse server: nobody had really used it, with comments and chat basically happening on Facebook and Twitter. Moving from Soundcloud to a self-hosted podcasting solution turned out to be a bit easier than I anticipated, though it was fiddly to set up initially. Conveniently for us, Digital Ocean launched their Spaces product just before we migrated everything. Spaces works basically like AWS's S3 product: in fact, you can use the S3 API to interact with Spaces. We use Cyberduck to upload files, but the compatibility of everything allowed us to use the Blubrry WordPress plugin to host cardiCast using Digital Ocean Spaces, and publish it to all the usual podcast platforms using WordPress and good old RSS.
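For readers wondering what "good old RSS" means for a podcast: an episode is an ordinary RSS item carrying an `enclosure` element that points at the audio file, wherever it is hosted. The sketch below builds such a feed with Python's standard library. It is only an illustration of the feed structure, not how the Blubrry plugin works, and the function names, titles and example.com URLs are placeholders I have assumed.

```python
import xml.etree.ElementTree as ET


def podcast_item(title, audio_url, size_bytes, mime="audio/mpeg"):
    """Build one RSS <item> whose <enclosure> points at a hosted audio file."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = title
    # RSS 2.0 enclosures carry three attributes: url, length (bytes), type.
    ET.SubElement(
        item, "enclosure", url=audio_url, length=str(size_bytes), type=mime
    )
    return item


def build_feed(channel_title, site_url, description, items):
    """Wrap items in a minimal RSS 2.0 channel and return it as a string."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = channel_title
    ET.SubElement(channel, "link").text = site_url
    ET.SubElement(channel, "description").text = description
    channel.extend(items)
    return ET.tostring(rss, encoding="unicode")


# Placeholder values: the URLs below are hypothetical, not real endpoints.
episode = podcast_item(
    "Episode 1",
    "https://files.example.com/podcast/episode-1.mp3",
    12345678,
)
feed_xml = build_feed(
    "Example podcast", "https://example.com", "A demonstration feed", [episode]
)
```

Because the enclosure is just a URL, the audio can live on any object store or web server, which is what makes swapping hosting providers relatively painless.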

We added Stripe integration for taking payments (thanks donors!), and then to complete the move to self-hosting, we migrated our internal discussions from Slack to RocketChat. RocketChat was a pleasant surprise for me as the system administrator of all this. Whereas I had a bit of an argument with CiviCRM, taking a couple of weeks to iron out all the weird, confusingly-documented intricacies and google some PHP snippets, when I installed RocketChat it was so easy I didn't quite believe there was nothing else to do. RocketChat utilises Caddy for webserving and automatic HTTPS, and the newish Linux snaps system, so it's more or less set-and-forget.

The last thing we haven't quite managed to move onto our own stack is Google Drive. We used Google Hangouts for meetings a few times, but RocketChat's videochat functionality (built on Jitsi) is actually pretty good, and easier to jump in and out of, so the last Google thing we use is Drive for scheduling and document sharing. Ideally we'll move off this soon, too: possibly to an ownCloud instance, or maybe just using Digital Ocean Spaces.

So there you go. It's not the simplest thing I've ever done, but now that everything is set up, it's fairly straightforward to manage. Importantly, we've reduced the number of external services that can see any data from our members and others who interact with our platforms. Whilst we are still using Google Drive, there isn't any user data there: it has a bit of documentation for the Committee to manage things, and our scheduling documents, and that's basically it. We're not quite running our own servers in the spare room, but I'm pretty happy with how far we've managed to move towards running our own systems so we don't force members and participants to hand over data to third parties just so they can socialise with other GLAM people. As much as possible, it's newCardigan members, or at worst, newCardigan as an organisation, in control.

DuraSpace News: Announcing the first DSpace 7 (Virtual) Development Sprint May 7 - 18

Mon, 2018-04-16 00:00

From Tim Donohue, DSpace Tech Lead

District Dispatch: House committee approves FDLP Modernization Act

Fri, 2018-04-13 14:21

The Committee on House Administration approved the bipartisan Federal Depository Library Program (FDLP) Modernization Act of 2018 (H.R. 5305). The bill would modernize the FDLP and related programs that provide public access to government information.

In a statement commenting on the committee’s action, ALA President Jim Neal said:

“Through their decades-long collaboration with the FDLP, libraries help the public find, use and understand government information. The FDLP Modernization Act will bolster that critical partnership and secure the public’s right to know.”

The bill was introduced on March 15 following months of effort by the Committee on House Administration, which included public hearings with testimony from librarians. The bill is sponsored by Committee Chairman Gregg Harper (R-MS), Ranking Member Bob Brady (D-PA), and Committee members Rodney Davis (R-IL), Barbara Comstock (R-VA), Mark Walker (R-NC), Adrian Smith (R-NE), Barry Loudermilk (R-GA), Zoe Lofgren (D-CA) and Jamie Raskin (D-MD).

The FDLP Modernization Act would provide greater flexibility, facilitate collaboration, streamline program requirements, and allow more libraries to participate in the FDLP, making the program’s services more widely available to the public. In addition, the bill would improve public access to electronic government information, strengthen the preservation of government information, and increase transparency and program oversight.

ALA appreciates the leadership of Reps. Harper, Brady, Davis, Comstock, Walker, Smith, Loudermilk, Lofgren and Raskin in introducing and approving the FDLP Modernization Act. We hope the House will move promptly to consider the legislation.

The post House committee approves FDLP Modernization Act appeared first on District Dispatch.

Open Knowledge Foundation: Open Data Day in Tanzania and Serbia: using open data to educate, inform and create stories

Fri, 2018-04-13 08:00

Authors: Rehema Mtandika (She Codes for change) and Katarina Kosmina (SEE ICT) – their biographies can be found below this post.

This blog is part of the event report series on International Open Data Day 2018. On Saturday 3 March, groups from around the world organised over 400 events to celebrate, promote and spread the use of open data. 45 events received additional support through the Open Knowledge International mini-grants scheme, funded by Hivos, SPARC, Mapbox, the Hewlett Foundation and the UK Foreign & Commonwealth Office. The events in this blog were supported through the mini-grants scheme under the Equal Development and Open Mapping themes.

How we approached data

She Codes for Change trained 27 young girls aged 15-19 from Secondary Schools in Dar es Salaam, Tanzania on the basic concepts of data visualization, Scratch and photography. We guided them to work in groups to identify social challenges and then use open data to create data-driven animated video stories to educate society about each challenge. Our aim was to inspire young girls to understand the concepts of open data and innovation, and how to apply them to transform their imaginations into visual products as a mechanism for solving societal problems. In the end, each group of 5 members was guided to create its own dataset and worked on the social challenge that interested it. The issues worked upon were violence against children, early marriages, gender based violence, school dropout and HIV/AIDS among adolescents. The final products were presented, and then uploaded on the She Codes for Change YouTube channel.

She Codes for Change Team with participants during Open Data Day

SEEICT/Startit is an NGO which has eight Startit centers across Serbia, with the aim of educating, empowering and connecting youth and the tech community in the country. Our plan to organize open mapping events in two smaller towns in Serbia got hindered by a lack of demand and local capacity for this type of activities. Instead, with the help of UNDP in Serbia, we managed to organize a Datathon in Serbia’s capital, Belgrade, where teams worked with four mentors on data visualization projects using open datasets. The winning team mapped all elementary and high schools across Belgrade using a dataset from the Ministry of Education. They then scraped data about the locations of betting shops, given that Serbian law forbids betting shops to be closer than 200 meters from schools. This project resulted in a map of Belgrade showing over 70 betting shops which are breaking the law. Additionally, the other three teams also created visualizations which involved: optimizing the placement of police patrols and emergency vehicles for better response to car accidents, mapping bad driving habits across time and municipalities of Serbia, and showing the connectedness of public transportation in Belgrade.
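The kind of check behind the winning project can be sketched in a few lines: compute the great-circle distance between each betting shop and each school, and flag any shop that falls inside the 200 meter limit. This is an illustrative sketch under assumed inputs, not the team's actual code. The function names and the sample coordinates are made up, and the haversine formula is simply one common way to estimate such distances.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def shops_too_close(shops, schools, limit_m=200):
    """Return names of shops closer than limit_m to at least one school.

    Both inputs are lists of (name, latitude, longitude) tuples.
    """
    flagged = []
    for name, lat, lon in shops:
        if any(haversine_m(lat, lon, s_lat, s_lon) < limit_m
               for _, s_lat, s_lon in schools):
            flagged.append(name)
    return flagged


# Made-up sample points, for illustration only.
schools = [("School 1", 44.8125, 20.4612)]
shops = [
    ("Shop A", 44.8130, 20.4615),  # a few dozen meters from the school
    ("Shop B", 44.8200, 20.4700),  # well over 200 meters away
]
too_close = shops_too_close(shops, schools)
```

Comparing every shop with every school is fine for a city-sized dataset; a spatial index would only become worthwhile with far more points.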

Overcoming obstacles

Since the She Codes for Change proposal was not selected in the first round by the Open Data Team, our team had to work on last-minute preparations to get the logistics in place, including sending invitations to schools and following up with their administrations to secure timely permission for students to attend.

Given that it was Startit’s first time organizing a Datathon and that we decided to make it a 12 hour challenge focused on visualization, we had no idea what could come out of it. In fact, we doubted if we would end up with even 1-2 working visualizations. Given the pilot/experimental nature of this event, plus the short time frame we had to plan and execute it, we struggled with social media promotion, using personal contacts and finding other ways to animate the Serbian IT community to join this endeavour. In addition, we knew that the datasets published by the government are often messy, incomplete, and inconsistent. Hence, there was a legitimate fear that the teams would end up spending most of those 12 hours cleaning data instead of analyzing and visualizing it. Fortunately, we had four fantastic mentors and the teams chose their datasets wisely, with only one team extensively struggling with their chosen datasets.

What did we learn?

She Codes for Change’s major lesson is that finding and visualizing data is not complex if taught at an early stage. Since students are not taught much about data in school, many students in the training first thought that data is complicated and not important. However, after understanding the basic concepts and working together to design a visualization product, they realized that data can help them and their communities address challenges and make informed decisions.

Similar to the experience of She Codes for Change, as the Startit team, we realized how empowering creating data-based visualizations can be for teams participating in the Datathon – whether they’re high schoolers, students, or IT professionals. An even more striking realisation is the fact that messy government datasets can become stories which are able to inform the participants, reveal illegal activities or public policy options, and inspire new ideas.

How can we make data storytelling in Tanzania and Serbia more sustainable?

The She Codes for Change team launched weekly Scratch trainings in mid-March. The trainings incorporate open data to help our beneficiaries identify challenges and use the available data and information to design and produce products that meet market needs. They are held every Tuesday and Thursday.

Startit’s blog team is currently in the process of writing blog posts about each of the Datathon participating team projects. We hope these stories will not only motivate the wider public to use open datasets, but also think beyond their messiness and incompleteness, as well as combine them with other data in innovative ways. Additionally, we hope future Datathons will continue to inspire data scientists and enthusiasts to use data visualization for storytelling.

Winning project in Startit’s Datathon – Realistic and abstract map of illegally placed betting shops in Belgrade

Data for stories, maps and education

These two initiatives in. Their outputs may have been different as She Codes for Change resulted in data driven animations, while Startit’s Datathon created data visualizations which sought to reveal illegalities, optimize policies or inform a wider audience.

She Codes for Change’s goal was achieved: as a result of the training, participants created five data-driven animation videos that inform viewers about gender, education and health matters. The Open Data Day training has also enabled us to create a platform for motivated young girls to develop innovative solutions to community challenges, giving them an opportunity to raise their voices.

As the number of open datasets available to the public in Serbia increases, Startit plans to enable teams of young data scientists to use the power of data storytelling to continue informing and educating the wider public on the relevance and impact of data.

Author bios

Rehema Mtandika is a Director of Innovation at She Codes for Change. For over three years she has been working with youths and women in areas of gender empowerment through ICT and innovation, youth engagement in the social-economic development, access to quality education, access to data and information, good governance and peace and security.

Katarina Kosmina is the Programme Coordinator at SEE ICT, in charge of developing and organizing programs for 8 Startit Centers across Serbia. These programmes range from programming robots for girls or IoT workshops for high schoolers, thematic hackathons, meetups and workshops for individuals in the IT sector, as well as acceleration programs and data or IP clinics for startups. Our goal is to bring quality and free informal education, as well as inspire and empower Serbian youth to enter the IT sector and continue expanding their knowledge and skills. Katarina’s passion for open data and data driven decision-making has led to an increased number of programs which aim at raising the level of data literacy in Serbia.

Eric Lease Morgan: An introduction to the NLTK: A Jupyter Notebook

Fri, 2018-04-13 03:31

The attached file introduces the reader to the Python Natural Language Toolkit (NLTK).

The Python NLTK is a set of modules and corpora enabling the reader to do natural language processing against corpora of one or more texts. It goes beyond text mining and provides tools to do machine learning, but this Notebook barely scratches that surface.

This is my first Python Jupyter Notebook. As such I’m sure there will be errors in implementation, style, and functionality. For example, the Notebook may fail because the value of FILE is too operating system dependent, or the given file does not exist. Other failures may/will include the lack of additional modules. In these cases, simply read the error messages and follow the instructions. “Your mileage may vary.”

That said, through the use of this Notebook, the reader ought to be able to get a flavor for what the Toolkit can do without the need to completely understand the Python language.

DuraSpace News: Join Fedora at OR2018

Fri, 2018-04-13 00:00

If you will be traveling to Bozeman, Montana for OR2018 June 4-7 please join David Wilcox and Daniel Bernstein at Montana State University for a full-day workshop on June 4 that will provide an overview of Fedora–the flexible, extensible, open source repository platform for managing, preserving, and providing access to digital content. The latest version of Fedora provides native linked data capabilities and a modular architecture based on well-documented APIs and ease of integration with existing applications.