Feed aggregator

D-Lib: Data as "First-class Citizens"

planet code4lib - Thu, 2015-01-15 13:43
Guest Editorial by Lukasz Bolikowski, ICM, University of Warsaw, Poland; Nikos Houssos, National Documentation Centre / National Hellenic Research Foundation, Greece; Paolo Manghi, Istituto di Scienza e Tecnologie dell'Informazione, Consiglio Nazionale delle Ricerche, Italy and Jochen Schirrwagen, Bielefeld University Library, Germany

D-Lib: A-posteriori Provenance-enabled Linking of Publications and Datasets via Crowdsourcing

planet code4lib - Thu, 2015-01-15 13:43
Article by Laura Dragan, Markus Luczak-Roesch, Elena Simperl, Heather Packer and Luc Moreau, University of Southampton, UK; Bettina Berendt, KU Leuven, Belgium

D-Lib: Data without Peer: Examples of Data Peer Review in the Earth Sciences

planet code4lib - Thu, 2015-01-15 13:43
Article by Sarah Callaghan, British Atmospheric Data Centre, UK

D-Lib: Semantic Enrichment and Search: A Case Study on Environmental Science Literature

planet code4lib - Thu, 2015-01-15 13:43
Article by Kalina Bontcheva, University of Sheffield, UK; Johanna Kieniewicz and Stephen Andrews, British Library, UK; Michael Wallis, HR Wallingford, UK

D-Lib: A Framework Supporting the Shift from Traditional Digital Publications to Enhanced Publications

planet code4lib - Thu, 2015-01-15 13:43
Article by Alessia Bardi and Paolo Manghi, Istituto di Scienza e Tecnologie dell'Informazione, Consiglio Nazionale delle Ricerche, Italy

D-Lib: Science 2.0 Repositories: Time for a Change in Scholarly Communication

planet code4lib - Thu, 2015-01-15 13:43
Article by Massimiliano Assante, Leonardo Candela, Donatella Castelli, Paolo Manghi and Pasquale Pagano, Istituto di Scienza e Tecnologie dell'Informazione, Consiglio Nazionale delle Ricerche, Italy

DuraSpace News: Announcing Registration for 2015 DLF E-Research Network

planet code4lib - Thu, 2015-01-15 00:00
From the Council on Library and Information Resources   The Council on Library and Information Resources (CLIR) and the Digital Library Federation (DLF) are pleased to announce a new opportunity in support of e-research and research data management: the 2015 offering of the DLF E-Research Network (formerly E-Research Peer Network and Mentoring Group).  

FOSS4Lib Upcoming Events: Islandora Camp EU2

planet code4lib - Wed, 2015-01-14 21:50
Date: Wednesday, May 27, 2015 - 08:00 to Friday, May 29, 2015 - 17:00
Supports: Islandora, Fedora Repository

Last updated January 14, 2015. Created by Peter Murray on January 14, 2015.

Islandora is going back to the European Union May 27 - 29, 2015, this time in beautiful Madrid, Spain. Our thanks to sponsor and host Fundación Juan March.

Nick Ruest: #JeSuisCharlie images

planet code4lib - Wed, 2015-01-14 21:19

Using the #JeSuisCharlie data set from January 11, 2015 (Warning! Will turn your browser into a potato for a few seconds), these are the image urls that have more than 1000 occurrences in the data set.

How to create (requires unshrtn):

% --query "#JeSuisCharlie"
% ~/git/twarc/utils/ JeSuisCharlie-tweets.json > JeSuisCharlie-tweets-deduped.json
% cat JeSuisCharlie-tweets-deduped.json | utils/ > JeSuisCharlie-tweets-deduped-ushortened.json
% ~/git/twarc/utils/ JeSuisCharlie-tweets-deduped-ushortened.json >| JeSuisCharlie-20150115-image-urls.txt
% cat JeSuisCharlie-20150115-image-urls.txt | sort | uniq -c | sort -rn > JeSuisCharlie-20150115-image-urls-ranked.txt

The ranked url data set can be found here.
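The final ranking step (sort | uniq -c | sort -rn) can also be sketched in Python. This is only an illustration and not part of the original workflow: the function name rank_urls, its min_count parameter, and the sample URLs are assumptions made for the example, and ties are broken alphabetically, which may differ from the shell pipeline's ordering.

```python
from collections import Counter


def rank_urls(urls, min_count=1):
    """Count each URL and return (count, url) pairs, most frequent first.

    Hypothetical helper: mirrors `sort | uniq -c | sort -rn`, with an
    optional threshold to keep only URLs seen at least `min_count` times.
    """
    # Strip whitespace and skip blank lines, as when reading a text file.
    counts = Counter(u.strip() for u in urls if u.strip())
    # Sort by descending count; break ties alphabetically (an assumption).
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(n, url) for url, n in ranked if n >= min_count]


if __name__ == "__main__":
    # Made-up sample data; the real input would be the image-urls text file.
    sample = [
        "http://example.org/a.jpg",
        "http://example.org/b.jpg",
        "http://example.org/a.jpg",
    ]
    for n, url in rank_urls(sample):
        print(f"{n:7d} {url}")
```

To keep only images with more than 1000 occurrences, as in the list below, the call would be rank_urls(lines, min_count=1001).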

11657 Occurrences

4764 Occurrences

3014 Occurrences

2977 Occurrences

2840 Occurrences

2363 Occurrences

2190 Occurrences

2015 Occurrences

1921 Occurrences

1906 Occurrences

1832 Occurrences

1512 Occurrences

1409 Occurrences

1348 Occurrences

1261 Occurrences

1207 Occurrences

1152 Occurrences

1114 Occurrences

1065 Occurrences

1055 Occurrences

1047 Occurrences

tags: twarc, #JeSuisCharlie, #JeSuisAhmed, #JeSuisJuif, #CharlieHebdo

CrossRef: CrossRef Indicators

planet code4lib - Wed, 2015-01-14 20:59

CROSSREF INDICATORS (January 12, 2015)

Total no. participating publishers & societies 5710
Total no. voting members 2996
% of non-profit publishers 57%
Total no. participating libraries 1926
No. journals covered 37,358
No. DOIs registered to date 71,670,035
No. DOIs deposited in previous month 648,271
No. DOIs retrieved (matched references) in previous month 46,260,320
DOI resolutions (end-user clicks) in previous month 134,057,984

CrossRef: New CrossRef Members

planet code4lib - Wed, 2015-01-14 20:56

Updated January 12, 2015

Voting Members
Association of Basic Medical Sciences of FBIH
Emergent Publications
Kinga - Service Agency Ltd.
Participatory Educational Research (Per)
Robotics: Science and Systems Foundation
University of Lincoln, School of Film and Media and Changer Agency
Uniwersytet Przyrodniczy w Poznaniu (Poznan University of Life Sciences)
Voronezh State University
Wyzsza Szkola Logistyki (Poznan School of Logistics)

Represented Members

Journal of the Faculty of Engineering and Architecture of Gazi University
Korean Insurance Academic Society
Korean Neurological Association
Medical Journal of Suleyman Demirel University

Last updated January 5, 2015

Voting Members

Arc Medieval Press, Medieval Institute Publications (Western Michigan University)
Asociacion Argentina de Ortopedia y Traumatologia
Harrington Park Press
Institute of Metals and Technology
Science International
Wroclaw Medical University

Represented Members

Institute for Far Eastern Studies, Kyungnam University
Institute of Public Policy and Administration
Istanbul Gelisim University Journal of Social Sciences
Journal of Humanities, Seoul National University
Journal of Pediatric Sciences
Korea Environment Institute
Korean Society for Simulation in Nursing
Sakarya Universitesi Ilahiyat Fakultesi Dergisi
Sogang Journal of Philosophy
The Association for Korea Culture Studies
The International Fiscal Association of Korea
The Korean Association for Cultural Sociology
The Korean Association of Human Development
The Korean Society of Vision Science

District Dispatch: New Year, New E-rate: E-rate implementation begins

planet code4lib - Wed, 2015-01-14 20:13

Photo by Michael Casey

I doubt the Federal Communications Commission (FCC) timed its second E-rate Modernization Order to correspond with the holiday season, but the timing has provided a variety of opportunities for analogies for those of us writing about the program. So while we at the ALA Washington Office are transitioning from 2014 to 2015, we are getting to work on our E-rate resolution for 2015 and transitioning from “pre-order” advocacy to “post-order” implementation. Because, while the FCC is finished with the orders and the rule changes that were adopted in both July (pdf) and December (pdf), as FCC Chairman Wheeler noted in his statement during the December Commission meeting, “Today is just the end of the beginning of our effort to get true high-speed broadband to all of the nation’s schools and libraries. In the months ahead, there will be a lot of heavy lifting to implement these changes by Commission staff, by our friends at USAC, education and library organizations, and by schools and libraries across the country.”

Our first order of business is getting the word out to libraries about the changes and opportunities they provide libraries. We had a good start to the year with a lively webinar in collaboration with the Public Library Association (PLA). The webinar gave an overview of the major changes (including greater funding certainty thanks to $1.5 billion in new funding) to the E-rate program and provided an opportunity for questions ranging from the technical to the big picture. The archive of the webinar is available below.

The new funding and major changes in the program will only help our nation’s libraries and communities, though, if we apply. The application window for the 2015 funding year opens Wednesday, and ALA will continue to build a strategy for supporting the library community in the coming weeks and months, and into the next E-rate funding year. Along with PLA, ALA’s  E-rate Task Force, and our other partners, we aim to provide tools that will help libraries navigate the changes to the program and be successful E-rate participants.

Look for more information after the Midwinter Conference and follow the conversation using #libraryerate. We welcome your questions and input as we move forward in our work to navigate the coming transition period to a “New E-rate.” As we said in one of our numerous draft filings during the public comment period, the order cannot be merely words on a page. We agree with Chairman Wheeler that “ultimately we will be judged by the tangible results delivered to students, teachers, librarians, and library patrons… Now it’s time to roll up our sleeves and complete the job.”

Our sleeves are rolled up. Are yours?


DPLA: Tracking DPLA’s growth in 2014

planet code4lib - Wed, 2015-01-14 19:40

Last week, Dan Cohen shared the DPLA’s 2015-2017 strategic plan including DPLA’s goals for community-driven growth and collaborative expansion. Now seems like a good time, too, to look back at the work that our partners and we have done over the past year to increase the numbers and diversity of formats, topics, institutions, and collections that are offered through DPLA.

Institution type

We ended 2013 with just over 1,200 contributors (institutions, organizational units, and even some private collections) from all over the United States, represented by 20 Hubs. A year later, our 23 Hubs represent nearly 1,400 contributors. That’s an increase of about 17% in contributors in the last year alone! And 2015 portends even greater growth.

We have nearly as many public library partners as those in universities, with good representation from archives and museums, as well. The graph below provides a picture of the variety of types of institutions that contribute to DPLA. Hidden within this data are the eight presidential libraries that help to make up the “federal libraries” category, and the local scouting archives, theological societies and church archives, and historical commissions, national parks, and other institutions that make up DPLA’s rich contributor base.

To see how much these contributors’ collections have grown over the past year, check out the following graph. All sectors have seen growth, but we’ve seen the numbers of community colleges (43), K-12 schools (20), and publishers (13) more than double since 2013.

Geographic distribution

In addition to the types of institutions represented in DPLA, we also keep an eye on their geographic distribution. Keep in mind that we DO have records about every state in the U.S., and beyond. This map doesn’t show you that, though.  Instead, it demonstrates where our current Hubs are located.


In January 2014, we shared progress on a research project about diversity. Through that project, we tried to look at both content and partnerships to discover how often DPLA was representing some of America’s underrepresented groups. Our working definition identifies these groups as:

  • historically non-white racial and ethnic groups
  • cultural/religious minority groups (Jews, Muslims, Hindus, etc.)
  • women
  • LGBTQ communities
  • disabled communities (including the physically, sensorily, and developmentally disabled)
  • rural communities
  • populations with lower socioeconomic status (focusing on poverty, working class issues, labor issues)
  • elderly populations

Former slave of Eng Bunker,” ca. 1880-1890. University of North Carolina at Chapel Hill Library’s Southern Historical Collection via North Carolina Digital Heritage Center.

In the 2013 research, we highlighted 50 of our contributing institutions as diverse according to their institutional mission using this definition. By the end of 2014, that number had increased to 86, a 72% jump. The number of Historically Black Colleges and Universities has increased by 50%, and the number of Hubs with “diverse” partners has increased from 10 to 14. That means three-quarters of our Hubs that have contributing partners have at least one partner that falls into the “diverse” category. We are eager to track and support partner diversity to see who is doing the collecting and representing (in addition to who is being represented).

For 2013, we had some challenges measuring diversity in DPLA items. We initially attempted to look at metadata using subject terms but found that these results were not useful. At the end of 2014, we were able to explore most of DPLA’s 8.4 million records at the collection level using collection descriptions from the Hubs and supplemental research. Using the diversity measurement, we looked through 3,937 collections and identified 430 as organized around the history or culture of one of our diversity groups–11% of the overall collections.

Collections are a way that institutions organize content topically and they can vary quite widely in size from a single item to thousands of items. For a more meaningful way to represent the impact of diverse content, we looked at the number of items in these diverse collections by group (or diversity category). It is important to note that these numbers of items are collection-based and do not account for items that aren’t affiliated with collections.

While we are pleased to see that some of these groups are relatively well represented by collections, there are a number of groups that need better representation in DPLA. For example, Asian American and Latino collections in particular should have stronger numbers and more variety in national origin. LGBT and disabled communities, as well as Arab Americans and Muslims, need more collections and content. Other categories, like Women, likely have a much broader representation in DPLA but are not as often the topic of collections.

Recruiting diverse content is a priority for most of the Hubs in our network and their work has a major impact on the diversity of DPLA’s collections. Through digitization funding and other means, DPLA has supported efforts to reformat diverse collections at the local level. As seen in the graph below, many of the Hubs make strong contributions to the diverse content accessible through DPLA.

Growth by records

While we believe that our growth should be measured by the number and variety of our partners and the diversity of their collections, record growth inevitably draws attention. Since 2013, our collections have grown by nearly three million records. That’s a 50% growth in one year’s time!

Each of our Hubs has a different collection policy, partner base, and growth plan. So their growth patterns tend to be slightly different. Still, it is interesting to see how the individual Hubs have grown, and how that compares to the others since it is their combined efforts that help DPLA grow as a whole.

Let’s first consider this by looking at straight record numbers. The clear winner here is The New York Public Library (NYPL), which went from a collection of 14,000 records at the start of 2014 to nearly 1.2 million by year’s end. But the growth of the other nine Hubs in the top ten can’t be overlooked. Some of the smaller collections—typically our Service Hubs who aggregate metadata from many smaller institutions with fewer record counts—saw major growth, as well. These include The Portal to Texas History, North Carolina Digital Heritage Center, Mountain West Digital Library, Digital Library of Georgia, and Digital Commonwealth. Considering that these Hubs are sometimes working with partners who count their collections in the hundreds of records, a growth of 78,000 records (North Carolina’s numbers this year) can be a massive undertaking.

Top ten Hubs by overall record growth

   1. The New York Public Library              1,154,051
   2. HathiTrust                                 504,816
   3. Smithsonian Institution                    184,702
   4. The Portal to Texas History                163,810
   5. Internet Archive                           119,582
   6. North Carolina Digital Heritage Center      77,635
   7. Mountain West Digital Library               66,699
   8. Digital Library of Georgia                  65,074
   9. Digital Commonwealth                        47,843
  10. ARTstor                                     46,158


It’s also helpful to look at growth a second way—by percentage increase—to see where some of our smaller Hubs have had significant growth in comparison to their size. While NYPL still tops the second list, small (but mighty!) South Carolina joins this top ten list with a 33% record increase since 2013. And, Digital Commonwealth moves up five places in this list with a 62% growth rate to beat out all other Service Hubs.

Top ten Hubs by percentage growth

   1. The New York Public Library              8051%
   2. ARTstor                                   453%
   3. Internet Archive                           99%
   4. Digital Commonwealth                       62%
   5. The Portal to Texas History                47%
   6. North Carolina Digital Heritage Center     42%
   7. Digital Library of Georgia                 33%
   8. South Carolina Digital Library             33%
   9. HathiTrust                                 29%
  10. Smithsonian Institution                    26%
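As a rough cross-check of how the two rankings relate, percentage growth is the record growth divided by the starting record count. The sketch below is illustrative only: the function names are made up for this example, and the starting count is inferred from the published figures rather than taken from DPLA's data.

```python
def percent_growth(start, end):
    """Percentage increase from a starting count to an ending count."""
    return (end - start) / start * 100


def implied_start(growth, pct):
    """Starting count implied by an absolute growth and a percentage growth."""
    return growth / (pct / 100)


if __name__ == "__main__":
    # NYPL figures from the two lists: 1,154,051 new records, 8051% growth.
    nypl_start = implied_start(1_154_051, 8051)
    print(round(nypl_start))

    # Going the other way: growth from that inferred start to the year-end total.
    start = round(nypl_start)
    print(round(percent_growth(start, start + 1_154_051)))
```

The inferred starting count for NYPL comes out a little above 14,000, which is consistent with the "14,000 records at the start of 2014" quoted earlier.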


Item Formats

So, what types of records are we getting in our collections from these nearly 1,400 partners and 23 Hubs, and how has that changed over the past year?

First, let’s look at the formats of the digital objects described by the metadata records in DPLA. This chart provides a comparison of the percentage in 2013 and in 2014.

Media type       % of collection 2013   % of collection 2014
Text             67.64%                 51%
Images           32.08%                 48%
Moving images    0.12%                  0.27%
Sound            0.11%                  0.11%
3D               0.03%                  0.13%


You’ll see right away that this year there are nearly as many Images as there are Texts, reducing that difference from about 36 percentage points to 3. While not surprising, the difference in size between Texts and Images compared to the rest of the collection is significant. The reality is that it is far more expensive and time consuming to digitize and create metadata for moving images, sound, and 3D objects, and this is especially a challenge for our small and mid-sized (and often under-resourced) partners. Still, it’s an area that is poised for more growth in 2015. We’re on our way already with 3D and Moving image collections, though, both of which have more than doubled in size since 2013.
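The comparison between formats can be recomputed from the media type figures above. This is a minimal sketch, with the dictionary layout and function names chosen for illustration; note that it compares shares of the collection, not absolute record counts.

```python
# Share of the collection by media type: (2013 %, 2014 %), from the chart above.
shares = {
    "Text": (67.64, 51.0),
    "Images": (32.08, 48.0),
    "Moving images": (0.12, 0.27),
    "Sound": (0.11, 0.11),
    "3D": (0.03, 0.13),
}


def gap(a, b, year_index):
    """Gap in percentage points between two media types (0 = 2013, 1 = 2014)."""
    return round(abs(shares[a][year_index] - shares[b][year_index]), 2)


def share_ratio(media_type):
    """How many times larger the 2014 share is than the 2013 share."""
    before, after = shares[media_type]
    return after / before


if __name__ == "__main__":
    print("Text vs Images, 2013:", gap("Text", "Images", 0), "points")
    print("Text vs Images, 2014:", gap("Text", "Images", 1), "points")
    for media_type in ("Moving images", "3D"):
        print(media_type, "share ratio:", round(share_ratio(media_type), 2))
```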

Item Languages

Finally, a comparison shows the growth in the diversity of languages represented in DPLA has also been substantial. In 2013, we reported that there were about 400 languages represented in the Texts in DPLA. Today that number is nearly 500. As one might expect, the majority of what you’ll read in DPLA is in English, but there’s also a fair number of German, French, Spanish, and Latin texts. Here’s a list of the 25 most prevalent languages found in DPLA:

Top 25 in 2013                         %
   1. English                          72.68%
   2. German                           8.39%
   3. French                           6.94%
   4. Spanish                          2.86%
   5. Latin                            2.44%
   6. Italian                          1.65%
   7. Russian                          0.76%
   8. Dutch                            0.50%
   9. Chinese                          0.37%
  10. Portuguese                       0.32%
  11. Arabic                           0.28%
  12. Swedish                          0.28%
  13. Danish                           0.28%
  14. Ancient Greek (to 1453)          0.21%
  15. Japanese                         0.21%
  16. Hebrew                           0.18%
  17. Polish                           0.16%
  18. Modern Greek (1453-)             0.13%
  19. Norwegian                        0.13%
  20. Czech                            0.12%
  21. Hungarian                        0.12%
  22. Ottoman Turkish (1500-1928)      0.12%
  23. Persian                          0.08%
  24. Armenian                         0.07%
  25. Croatian                         0.04%


Top 25 in 2014                         %
   1. English                          74%
   2. German                           7.78%
   3. French                           6.76%
   4. Spanish                          2.51%
   5. Latin                            2.09%
   6. Italian                          1.88%
   7. Japanese                         0.91%
   8. Russian                          0.85%
   9. Dutch                            0.52%
  10. Chinese                          0.40%
  11. Danish                           0.26%
  12. Swedish                          0.27%
  13. Portuguese                       0.35%
  14. Arabic                           0.21%
  15. Hebrew                           0.22%
  16. Ancient Greek (to 1453)          0.18%
  17. Czech                            0.17%
  18. Hungarian                        0.15%
  19. Polish                           0.16%
  20. Ottoman Turkish (1500-1928)      0.10%
  21. Norwegian                        0.12%
  22. Modern Greek (1453-)             0.12%
  23. Persian                          0.06%
  24. Armenian                         0.06%
  25. Icelandic                        0.05%


Note the significant jump in Japanese texts over the past year (number 15 in 2013, number 7 in 2014).

It’s great to look back on a year that’s seen so much success in our attempts (and our partners’!) to grow our collections in ways that better represent the variety of stories there are to tell in American society. But we also know how important it is to continue the momentum to complete the partner map so that institutions across the United States and the stories they tell have a place in DPLA. Let’s all meet up here again in 2016 and see how we’ve done.

Upwards and onwards!


Featured image credit: Detail from “Ojibwe women holding sticks to play Double Ball, Grand Portage, Minnesota,” ca. 1885. From University of Minnesota Duluth, Kathryn A. Martin Library, Northeast Minnesota Historical Center Collections via Minnesota Digital Library.

All written content on this blog is made available under a Creative Commons Attribution 4.0 International License. All images found on this blog are available under the specific license(s) attributed to them, unless otherwise noted.

FOSS4Lib Recent Releases: ColorSharp - 0.9.1

planet code4lib - Wed, 2015-01-14 11:49
Package: ColorSharp
Release Date: Wednesday, January 14, 2015

Last updated January 14, 2015. Created by castarco on January 14, 2015.

* Added CIE's 1960 UVW color space
* Less destructive conversions (now the data is better preserved)
* Now ToSRGB, ToCIExyY and ToCIEUVW are virtual methods, not abstract.
* Updated MathNet.Numerics dependency.

