Feed aggregator

DuraSpace News: Telling VIVO Stories at The Marine Biological Laboratory Woods Hole Oceanographic Institution (MBLWHOI) with John Furfey

planet code4lib - Thu, 2018-01-18 00:00

VIVO is member-supported, open source software and ontology for representing scholarship.

“Telling VIVO Stories” is a community-led initiative aimed at introducing project leaders and their ideas to one another while providing VIVO implementation details for the VIVO community and beyond. The following interview includes personal observations that may not represent the opinions and views of the Marine Biological Laboratory / Woods Hole Oceanographic Institution Library (MBLWHOI) or the VIVO Project.

HangingTogether: NEW: The Realities of Research Data Management: Part Three Now Available!

planet code4lib - Wed, 2018-01-17 21:43

A new year heralds a new RDM report! Check out Incentives for Building University RDM Services, the third report in OCLC Research’s four-part series exploring the realities of research data management. Our new report explores the range of incentives catalyzing university deployment of RDM services. Our findings in brief: RDM is not a fad, but instead a rational response by universities to powerful incentives originating from both internal and external sources.

The Realities of Research Data Management, an OCLC Research project, explores the context and choices research universities face in building or acquiring RDM capacity. Findings are derived from detailed case studies of four research universities: University of Edinburgh, University of Illinois at Urbana-Champaign, Monash University, and Wageningen University and Research. Previous reports examined the RDM service space, and the scope of the RDM services deployed by our case study partners. Our final report will address sourcing and scaling choices in acquiring RDM capacity.

Incentives for Building University RDM Services continues the report series by examining the factors which motivated our four case study universities to supply RDM services and infrastructure to their affiliated researchers. We identify four categories of incentives of particular importance to RDM decision-making: compliance with external data mandates; evolving scholarly norms around data management; institutional strategies related to researcher support; and researcher demand for data management support. Our case studies suggest that the mix of incentives motivating universities to act in regard to RDM differ from university to university. Incentives, ultimately, are local.

RDM is both an opportunity and a challenge for many research universities. Moving beyond the recognition of RDM’s importance requires facing the realities of research data management. Each institution must shape its local RDM service offering by navigating several key inflection points: deciding to act, deciding what to do, and deciding how to do it. Our Realities of RDM report series examines these decisions in the context of the choices made by the case study partners.

Visit the Realities of Research Data Management website to access all the reports, as well as other project outputs.



LITA: Jobs in Information Technology: January 17, 2018

planet code4lib - Wed, 2018-01-17 19:25

New vacancy listings are posted weekly on Wednesday at approximately 12 noon Central Time. They appear under New This Week and under the appropriate regional listing. Postings remain on the LITA Job Site for a minimum of four weeks.

New This Week

City of El Segundo, Library Services Director, El Segundo, CA

New York University, Division of Libraries, Metadata Librarian, New York, NY

Visit the LITA Job Site for more available jobs and for information on submitting a job posting.

Library of Congress: The Signal: Digital Scholarship Resource Guide: So now you have digital data… (part 3 of 7)

planet code4lib - Wed, 2018-01-17 17:46

This is part three of our Digital Scholarship Research Guide created by Samantha Herron. See parts one about digital scholarship projects and two about how to create digital documents.

So now you have digital data…

Great! But what to do?

Regardless of what your data are (sometimes it’s just pictures and documents and notes, sometimes it’s numbers and metadata), storage, organization, and management can get complicated.

Here is an excellent resource list from the CUNY Digital Humanities Resource Guide that covers cloud storage, password management, note storage, calendar/contacts, task/to-do lists, citation/reference management, document annotation, backup, conferencing & recording, screencasts, posts, etc.

From the above, I will highlight:

  • Cloud-based secure file storage and sharing services like Google Drive and Dropbox. Both services offer some storage space free, but increased storage costs a monthly fee. With Dropbox, users can save a file to a folder on their computer, and access it on their phone or online. Dropbox folders can be collaborative, shared and synced. Google Drive is a web-based service, available to anyone with a Google account; any file can be uploaded, stored, and shared with others through Drive. Drive will also store Google Documents and Sheets that can be written in browser, and collaborated on in real time.
  • Zotero, a citation management service. Zotero allows users to create and organize citations using collections and tags. Zotero can sense bibliographic information in the web browser, and add it to a library with the click of a button. It can generate citations, footnotes, endnotes, and in-text citations in any style, and can integrate with Microsoft Word.

If you have a dataset:

Here are some online courses from School for Data about how to extract, clean, and explore data.

OpenRefine is a popular tool for working with and organizing data. It’s like a very fancy Excel sheet.

It looks like this:

Screenshot of the OpenRefine tool.

Here is an introduction to OpenRefine from Owen Stephens on behalf of the British Library, 2014. Programming Historian also has a tutorial for cleaning data with OpenRefine.
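The kinds of cleanup OpenRefine automates (trimming whitespace, normalizing case, clustering near-duplicate values) can be sketched in a few lines of Python; the sample column values below are invented for illustration.

```python
# A minimal sketch of OpenRefine-style cleanup in plain Python.
# The sample values are invented for illustration.

def clean(value):
    """Trim whitespace and collapse internal runs of spaces."""
    return " ".join(value.split())

def fingerprint(value):
    """An OpenRefine-like 'fingerprint' key: lowercase, unique sorted tokens."""
    return " ".join(sorted(set(clean(value).lower().split())))

names = ["  New York Public Library", "new york PUBLIC library",
         "Library of Congress "]

# Group near-duplicate values under a shared fingerprint key,
# the same idea behind OpenRefine's cluster-and-edit feature.
clusters = {}
for name in names:
    clusters.setdefault(fingerprint(name), []).append(name)

for key, members in clusters.items():
    print(key, "->", members)
```

Here the first two spellings of the same library name collapse into one cluster, which you could then merge into a single canonical value.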

Some computer-y basics

A sophisticated text editor is good to have. Unlike a word processor like Microsoft Word, text editors are used to edit plain text: text without other formatting like font, size, page breaks, etc. Text editors are important for writing code and manipulating text. Your computer probably has one preloaded (e.g. Notepad on Windows computers), but there are more robust ones that can be downloaded for free, like Notepad++ for Windows, TextWrangler for Mac OS X, or Atom for either.

The command line is a way of interacting with a computer program through text instructions (commands), instead of point-and-click GUIs (graphical user interfaces). For example, instead of clicking on your Documents folder and scrolling through to find a file, you can type text commands into a command prompt to do the same thing. Knowing the basics of the command line helps you understand how a computer thinks, and can be a good introduction to code-ish things for those who have little experience. This Command Line Crash Course from Learn Python the Hard Way gives a quick tutorial on how to use the command line to move through your computer’s file structure.
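To make the comparison concrete, here is a small Python sketch (standard library only) that performs the same file-system navigation the command line offers; the equivalent shell commands are noted in the comments.

```python
# Navigating the file system programmatically -- the same tasks the
# command line does with pwd, ls, and cd. Standard library only.
from pathlib import Path

here = Path.cwd()                     # like `pwd`: where am I?
print("Current folder:", here)

for entry in sorted(here.iterdir()):  # like `ls`: list what is here
    kind = "dir " if entry.is_dir() else "file"
    print(kind, entry.name)

docs = here / "Documents"             # like `cd Documents`: a subfolder
print("Documents exists here?", docs.exists())
```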

Codecademy has free, interactive lessons in many different coding languages.

Python seems to be the programming language of choice for digital scholars (and a lot of other people). It’s intuitive to learn and can be used to build a wide variety of programs.
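As a small taste of that intuitiveness, the complete program below counts the most frequent words in a sentence, a routine task in digital scholarship; the sample text is invented.

```python
# Count word frequencies in a text -- a first taste of Python for
# text analysis. The sample passage is invented for illustration.
from collections import Counter

text = "the library of congress is the largest library in the world"
counts = Counter(text.split())

for word, n in counts.most_common(3):
    print(word, n)
```

The same three lines of counting logic scale from one sentence to a whole corpus of digitized documents.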

Screenshot of a command line interface.

Next week we will dive into Text Analysis. See you then!

LITA: #LITAchat – LITA at ALA Midwinter 2018

planet code4lib - Wed, 2018-01-17 17:37

Attending the 2018 ALA Midwinter conference? Curious about what LITA is up to?

Join us on Friday, January 26, 1:00-2:00pm EST on Twitter to discuss and ask questions about the LITA events, activities, and more happening at this year’s 2018 ALA Midwinter Meeting in Denver, CO, February 9-13.

To participate, launch your favorite Twitter mobile app or web browser, search for the #LITAchat hashtag, and select “Latest” to follow along and reply to questions asked by the moderator or other participants. When replying to the discussion or asking questions, add or incorporate the hashtags #alamw18 and #litachat.

See you there!

District Dispatch: Bridging the Spectrum symposium at CUA/LIS highlights public policy directions in Washington

planet code4lib - Wed, 2018-01-17 16:57

On Friday, February 2, The Catholic University of America (CUA) Library and Information Sciences Department will host its Tenth Annual Bridging the Spectrum: A Symposium on Scholarship and Practice in Library and Information Science (LIS). A one-day event, Bridging the Spectrum provides attendees with a knowledge-sharing forum and meeting place for practitioners, students, and faculty in Library and Information Sciences and Services to share work and to foster unexpected connections across the spectrum of the information professions.

Dr. Alan Inouye will be the keynote speaker at CUA’s 10th annual Bridging the Spectrum symposium on February 2, 2018.

The keynote address this year will be given by American Library Association Washington Office Director Dr. Alan Inouye. In Making Sense of the Headlines: Advancing Public Policy for the LIS Community, Dr. Inouye looks at the interplay between forming national policy on LIS issues such as net neutrality, federal funding for libraries and education policy with larger trends in government, technology, commerce and society, asking, “What is the more fundamental change taking place? What is really happening beneath the surface and over time—policy-wise? And how can the library and information science community best influence policy and move our interests higher on the political agenda—or at least defend ourselves as much as possible?”

This year, Bridging the Spectrum continues this tradition with a varied program that covers and discusses a diverse set of trends and challenges faced within the LIS fields. Both the morning and afternoon sessions feature presentations and speakers focusing on topics from the impact of digitization and establishing credible news sources, to conducting outreach to minority groups and reinventing programming for the digital natives of the Millennial Generation and Generation Z. Beyond this, the symposium also features a poster lightning round, with posters discussing emerging trends and pedagogy in archival and librarian services.

“Since 2009, Catholic University of America has been proud to have established a community of learning and knowledge-sharing through our annual Bridging the Spectrum: Symposium on Scholarship and Practice,” says Dr. Renate Chancellor, Associate Professor and University Pre-Law Advisor. Chancellor, who serves on the Symposium Committee, went on to say that the impetus for the symposium was to create an opportunity for practitioners, students and faculty in LIS to come together to showcase the wide range of research taking place throughout DC/VA/MD region. “It’s exciting to know that we are celebrating our 10th anniversary and all of the wonderful speakers, panels, and poster sessions we have seen over the years,” says Chancellor, “and to know that we have been instrumental in fostering a forum for dialogue on the important issues relevant to the LIS community.”

Bridging the Spectrum: A Symposium on Scholarship and Practice in Library and Information Science is open to the public and will be held in the Great Room of the Pryzbala Student Center on CUA’s campus. For more information about how to register to attend, please visit the symposium website. We look forward to seeing you there!

This guest post was contributed by Babak Zarin, an LIS candidate at CUA and research assistant for Dr. Renate Chancellor.

The post Bridging the Spectrum symposium at CUA/LIS highlights public policy directions in Washington appeared first on District Dispatch.

Open Knowledge Foundation: Educators ask for a better copyright

planet code4lib - Wed, 2018-01-17 11:28

This blog has been reposted from the Open Education Working Group page.


Today we, the Open Education Working Group, publish a joint letter, initiated by the Communia Association for the Public Domain, that urgently requests improvements to the education exception in the proposal for a Directive on Copyright in the Digital Single Market (DSM Directive). The letter is supported by 35 organisations representing schools, libraries and non-formal education, as well as individual educators and information specialists.


In September 2016 the European Commission published its proposal for a DSM Directive that included an education exception aimed at improving the legal landscape. The digital age has created new possibilities for educational practices, and we need copyright law that enables teachers to provide the best education they are capable of and that fits the needs of teachers in the 21st century. The Directive is an opportunity to improve copyright.

However, the proposal does not live up to the needs of education. In the letter we explain the changes needed to facilitate the use of copyrighted works in support of education. Education communities need an exception that covers all relevant providers, and which permits a diversity of educational uses of copyrighted content. We listed four main problems with the Commission’s proposal:

#1:  A limited exception instead of a mandatory one

The European Commission proposed a mandatory exception, but one that can be overridden by licences. As a consequence, the educational exception will still differ in each Member State. Moreover, educators will need help from a lawyer to understand what they are allowed to do.

#2: Remuneration should not be mandatory

Currently, most Member States have exceptions for educational purposes that are completely or largely unremunerated. Mandatory payments will change the situation of those educators (and their institutions), who will have to start paying for materials they currently use for free.

#3: Excluding experts

The European Commission’s proposal does not include all important providers of education, as only formal educational establishments are covered by the exception. We note that the European lifelong-learning model underlines the value of informal and non-formal education conducted in the workplace. All of these are excluded from the education exception.

#4: Closed-door policy

The European Commission’s proposal limits digital uses to secure institutional networks and to the premises of an educational establishment. As a consequence, educators will not be able to develop and conduct educational activities in other facilities, such as libraries and museums, nor use modern means of communication such as email and the cloud.

To endorse the letter, contact Communia by email. If you want to receive updates on the developments around copyright and education, sign up for Communia’s newsletter, Copyright Untangled.

You can read the full letter in this blog on the Open Education website or download the PDF.

DuraSpace News: Registration Open for Fedora Camp at NASA

planet code4lib - Wed, 2018-01-17 00:00
Fedora is the robust, modular, open source repository platform for the management and dissemination of digital content. Fedora 4, the latest production version of Fedora, features vast improvements in scalability, linked data capabilities, research data support, modularity, ease of use and more. Fedora Camp offers everyone a chance to dive in and learn all about Fedora. The Fedora team will offer a camp from Wednesday, May 16 to Friday, May 18, 2018 at the NASA Goddard Space Flight Center in Greenbelt, Maryland, outside of Washington, D.C.

Library of Congress: The Signal: From Code to Colors: Working with the JSON API

planet code4lib - Tue, 2018-01-16 21:26

The following is a guest post by Laura Wrubel, software development librarian with George Washington University Libraries, who has joined the Library of Congress Labs team during her research leave.

The Library of Congress website has an API (“application programming interface”) which delivers the content for each web page. What’s kind of exciting is that in addition to providing HTML for the website, all of that data – including the digitized collections – is available publicly in JSON format, a structured format that you can parse with code or transform into other formats. With an API, you can do things like:

  • build a dataset for analysis, visualization, or mapping
  • dynamically include content from a website in your own website
  • query for data to feed a Twitter bot

This opens up the possibility for a person to write code that sends queries to the API in the form of URLs or “requests,” just like your browser makes. The API returns a “response” in the form of structured data, which a person can parse with code. Of course, if there were already a dataset available to download that would be ideal. David Brunton explains how bulk data is particularly useful in his talk “Using Data from Historical Newspapers.” Check out LC for Robots for a growing list of bulk data currently available for download.
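As a sketch of what such a request looks like: the loc.gov site returns JSON when a `fo=json` parameter is added to an ordinary search URL. The endpoint format below follows the unofficial API this post describes, but since the API is a work in progress, treat the exact URL shape as an assumption rather than a stable contract.

```python
# Build a loc.gov JSON API query URL. Adding fo=json to an ordinary
# search URL asks the site for structured data instead of HTML.
# The endpoint format is based on the unofficial API described in this
# post and may change -- treat it as an assumption.
from urllib.parse import urlencode

def build_query(endpoint, q, page=1):
    """Return a loc.gov search URL that requests a JSON response."""
    params = urlencode({"q": q, "sp": page, "fo": "json"})
    return f"https://www.loc.gov/{endpoint}/?{params}"

url = build_query("photos", "baseball")
print(url)

# Fetching and parsing the response would then look like (needs network):
#   import json, urllib.request
#   with urllib.request.urlopen(url) as resp:
#       data = json.load(resp)
#   for item in data.get("results", []):
#       print(item.get("title"))
```

From here, paging through `sp` values is how you would build up a dataset for analysis or feed a bot.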

I’ve spent some of my time while on research leave creating documentation for the JSON API. It’s worth keeping in mind that the JSON API is a work in progress and subject to change. But even though it’s unofficial, it can be a useful access point for researchers. I had a few aims in this documentation project: make more people aware of the API and the data available from it, remove some of the barriers to using it by providing examples of queries and code, and demonstrate some ways to use it for analysis. I approached this task keeping in mind a talk I heard at PyCon 2017, Daniele Procida’s “How documentation works, and how to make it work for your project” (also available as a blog post), which classifies documentation into four categories: reference, tutorials, how-to, and explanation. This framing can be useful in making sure your documentation is best achieving its purpose. The JSON API documentation is reference documentation, and points to Jupyter notebooks for Python tutorials and how-to code. If you have ideas about what additional “how-to” guides and tutorials would be useful, I’d be interested to hear them!

At the same time that I was digging into the API, I was working on some Jupyter notebooks with Python code for creating image datasets, for both internal and public use. I became intrigued by the possibilities of programmatic access to thumbnail images from the Library’s digitized collections. I’ve had color on my mind as an entry point to collections since I saw Chad Nelson’s DPLA Color Browse project at DPLAfest in 2015.

So as an experiment, I created Library of Congress Colors.

View of colors derived from the Library of Congress Baseball Cards digital collection

The app displays six color swatches, based on cluster analysis, from each of the images in selected collections. Most of the collections have thousands of images, so it’s striking to see the patterns that emerge as you scroll through the color swatches (see Baseball Cards, for example). It also reveals how characteristics of the images can affect programmatic analysis. For example, many of the digitized images in the Cartoons and Drawings collection include a color target, which was a standard practice when creating color transparencies. Those transparencies were later scanned for display online. While useful for assessing color accuracy, the presence of the target interferes with color analysis of the cartoon, so you’ll see colors from that target pop up in the color swatches for images in that collection. Similarly, mattes, frames, and other borders in the image can skew the analysis. As an example, click through the color bar below to see the colors in the original cartoon by F. Fallon in the Prints and Photographs Division.

A color swatch impacted by the presence of the color bar photographed near the cartoon  in Prints and Photographs collection
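For readers curious how swatches like these can be computed, here is a pure-Python k-means sketch over RGB pixel values. It is a simplified stand-in, not the app’s actual implementation, and the pixel data below is invented.

```python
# A minimal k-means sketch for finding dominant colors in an image,
# in pure Python. This is a simplified stand-in for the app's actual
# analysis; the pixel data below is invented.

def mean(points):
    """Average an iterable of RGB tuples component-wise."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))

def dist2(a, b):
    """Squared Euclidean distance between two colors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(pixels, k, iterations=10):
    centroids = list(pixels[:k])  # deterministic seed: first k pixels
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in pixels:
            nearest = min(range(k), key=lambda i: dist2(p, centroids[i]))
            groups[nearest].append(p)
        centroids = [mean(g) if g else centroids[i]
                     for i, g in enumerate(groups)]
    return centroids

# Invented pixels: mostly reds and blues, as if sampled from a thumbnail.
pixels = [(250, 10, 10), (240, 20, 15), (245, 5, 20),
          (10, 10, 250), (20, 15, 240), (5, 25, 245)]

for c in kmeans(pixels, k=2):
    print(tuple(round(v) for v in c))
```

Run on a real thumbnail, the returned centroids are the dominant colors; the app’s six swatches per image correspond to running this kind of analysis with k=6.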

This project was a fun way to visualize the collection while testing the API, and I’ve benefited from working with the National Digital Initiatives team as I developed the project. They and their colleagues have been a source of ideas for how to improve the visualization, connected me with people who understand the image formats, and provided LC Labs Amazon Web Services storage for making the underlying data sets downloadable by others. We’ve speculated about the patterns that emerge in the colors and have dozens more questions about the collections from exploring the results.

View of colors derived from the Library of Congress Works Progress Administration (WPA) poster digital collection

There’s something about color that is delightful and inspiring. Since I’ve put the app out there, I’ve heard ideas from people about using the colors to inspire embroidery, select paint colors, or think about color in design languages. I’ve also heard from people excited to see Python used to explore library collections and view an example of using a public API. I, myself, am curious to see what people may find as they explore Library of Congress collection as data and use the JSON API or one of the many other APIs to create their own data sets. What could LC Labs do to help with this? What would you like to see?

District Dispatch: UPDATE: 50 Senators support CRA to restore Net Neutrality

planet code4lib - Tue, 2018-01-16 17:59

Senate legislation to restore 2015’s strong, enforceable net neutrality rules now has the bipartisan support from 50 of 100 senators and would be assured of passage if just one more Republican backs the effort. The bill is a Congressional Review Act (CRA) resolution from Sen. Ed Markey (D-MA), which would block the Federal Communications Commission’s (FCC) December repeal of net neutrality rules.

The measure is backed by all 49 members of the Senate Democratic caucus, including 47 Democrats and two independents who caucus with Democrats. Sen. Susan Collins (R-ME) is the only Republican to support the bill so far, and supporters are trying to secure one more Republican vote. A successful CRA vote, in this case, would invalidate the FCC’s net neutrality repeal and prevent the FCC from issuing a similar repeal in the future. But the Senate action needs a counterpart in the House, and this Congressional action would be subject to Presidential approval.

ALA is working with allies to encourage Congress to overturn the FCC’s egregious action. Email your members of Congress today and ask them to use a Joint Resolution of Disapproval under the CRA to repeal the December 2017 FCC action and restore the 2015 Open Internet Order protections.

We will continue to update you on the activities above and other developments as we continue to work to preserve a neutral internet.

The post UPDATE: 50 Senators support CRA to restore Net Neutrality appeared first on District Dispatch.

pinboard: Availability Calendar - Kalorama Guest House

planet code4lib - Tue, 2018-01-16 17:55

David Rosenthal: Not Really Decentralized After All

planet code4lib - Tue, 2018-01-16 16:00
Here are two more examples of the phenomenon that I've been writing about ever since Economies of Scale in Peer-to-Peer Networks more than three years ago, centralized systems built on decentralized infrastructure in ways that nullify the advantages of decentralization:

Open Knowledge Foundation: A lookback on 2017 with OK Brazil

planet code4lib - Tue, 2018-01-16 09:30

This blog has been written by Natalia Mazotte and Ariel Kogan, co-directors of Open Knowledge Brazil (OKBR). It has been translated from the original Portuguese version by Juliana Watanabe, a volunteer at OKBR.

For us at Open Knowledge Brazil (OKBR), the year 2017 was filled with partnerships, support for and participation in events, and projects and campaigns for mobilisation. In this blog we have selected some of the highlights. There was also news for the team: the journalist Natália Mazotte, who was already leading Escola de Dados (School of Data) in Brazil, became co-director alongside Ariel Kogan (executive director since July 2016).

Foto: Engin_Akyurt / Creative Commons CC0


At the beginning of the year, OKBR and several other organisations introduced the Manifesto for Digital Identification in Brazil. The purpose of the Manifesto is to be a tool for society to take a stand on the privacy and security of citizens’ personal data and to turn digital identification into a safe, fair and transparent process.

We monitored one of the main challenges in the city of São Paulo and contributed to the mobilisation around it. Along with other civil society organisations, we urged the City Hall of São Paulo to be transparent about mobility. The reason: on 25 January 2017, the first day of the new, higher speed limits on the Marginais Pinheiros and Tietê, we noticed that several news items about the decrease in traffic accidents linked to the earlier policy of reducing speed in certain parts of the city had become unavailable on the site of the Traffic Engineering Company (CET).

For a few months, we conducted a series of webinars, the OKBR Webinar Series, about open knowledge around the world. We had the participation of the following experts: Bart Van Leeuwen, entrepreneur; Paola Villareal, Fellow at the Berkman Klein Center and designer/data scientist; Fernanda Campagnucci, journalist and public policy analyst; and Rufus Pollock, founder of Open Knowledge International.

We took part in a major victory for society! Along with the Movimento pela Transparência Partidária (Movement for Partisan Transparency), we mobilised against the proposal by the rapporteur of the political reform, congressman Vicente Cândido (PT-SP), to allow hidden campaign contributions, and the result was very positive. Besides us, a variety of organisations and movements took part in this initiative against hidden donations: we published and distributed a public statement. The impact was huge: as a consequence, the rapporteur announced the withdrawal of secret donations.

We also participated in #NãoValeTudo, a collective effort to discuss the proper use of technology for electoral purposes, along with AppCívico, Instituto Update and Instituto Tecnologia e Equidade.


We ran two cycles of OpenSpending. The first cycle began in January and involved 150 municipalities; in July, we published the report on cycle 1. In August, we started the second cycle of the game with something new: Guaxi, a robot that served as a digital assistant to competitors. It is an expert bot developed with chatbot technology that simulates human interaction with users, which made navigating the OpenSpending page on Facebook easier. The report on the second cycle is available here.

Together with the public policy analysis department of FGV (FGV/DAPP), we released the Brazilian edition of the Open Data Index (ODI). In total, we conducted three surveys: ODI Brazil at the national level, and ODI São Paulo and ODI Rio de Janeiro at the municipal level. Months later, we closed the survey “Do you want to build the Open Data Index of your city?” and the result was very positive: 216 people had shown an interest in conducting the survey voluntarily in their towns!

In this first cycle of decentralisation and expansion of the ODI to Brazilian municipalities, we conducted an experiment with a first group: Arapiraca/AL, Belo Horizonte/MG, Bonfim/RR, Brasília/DF, Natal/RN, Porto Alegre/RS, Salvador/BA, Teresina/PI, Uberlândia/MG and Vitória/ES. We offered training to the local leaders, provided by the Open Data Index staff (FGV/DAPP – OKBR), so that they could complete the survey required to develop the index. In 2018, we will show the results and present the reports, with concrete opportunities for the towns to move forward on the agenda of transparency and open data.

We launched LIBRE, a microfinance project for journalism, in a partnership between Open Knowledge Brazil and Flux Studio, with involvement from AppCívico as well. It is a content microfinance tool that aims to give the public a digital way to value and sustain journalism and quality content. A first group of portals is currently testing the platform in a pilot phase.


We supported the events of Open Data Day in many Brazilian cities, as well as the Hackathon da Saúde (Health Hackathon), an action of the São Paulo City Hall in partnership with SENAI and AppCívico, and participated in the Hack In Sampa event at the City Council of São Paulo.

Natália Mazotte, co-director of OKBR, participated in AbreLatam and ConDatos, annual events which have become the main meeting point for open data in Latin America and the Caribbean, and a time to discuss the status and impact of open data across the region. We also participated in the 7th edition of the Web Forum in Brazil with the workshop “Open standards and access to information: prospects and challenges of government open data”. Along with other organisations, we organised the Brazilian Open Government meeting.

The School of Data, in partnership with Google News Lab, organised the second edition of the Brazilian Conference of Data Journalism and Digital Methods (Coda.Br). We were one of the partner organisations for the first Open Government Course for leaders in Climate, Forest and Agriculture, initiated by Imaflora and supported by the Climate and Land Use Alliance (CLUA).

We were the focal point for the research project “Foundations of open code as social innovators in emerging economies: a case study in Brazil”, by Clément Bert-Erboul, a specialist in economic sociology, and professor Nicholas Vonortas.

And more to come in 2018

We would like to thank you for following and taking part in OKBR in 2017; we are counting on you in 2018. Beyond our plans for next year, we have the challenge and the responsibility of contributing during the election period so that Brazil advances on the agendas of transparency, open public information, democratic participation, integrity and the fight against corruption.

If you want to stay updated on our news and the progress of our projects, you can follow us on our Blog, Twitter and Facebook.

A wonderful 2018 for all of us!

The Open Knowledge Brazil team.

Ed Summers: Programmed Visions

planet code4lib - Tue, 2018-01-16 05:00

I’ve been meaning to read Wendy Hui Kyong Chun for some time now. Updating to Remain the Same is on my to-read list, but I recently ran across a reference to Programmed Visions: Software and Memory in Rogers (2017), which I wrote about previously, and thought I would give it a quick read beforehand.

Programmed Visions is a unique mix of computing history, media studies and philosophy that analyzes the ways in which software has been reified or made into a thing. I’ve begun thinking about using software studies as a framework for researching the construction and operation of web archives, and Chun lays a useful theoretical foundation that could be useful for critiquing the very idea of software, and investigating its performative nature.

Programmed Visions contains a set of historical case studies that it draws on as sites for understanding computing. She looks at early modes of computing involving human computers (ENIAC) which served as a prototype for what she calls “bureaucracies of computing” and the psychology of command and control that is built into the performance of computing. Other case studies involving the Memex, the Mother of All Demos, and John von Neumann’s use of biological models of memory as metaphors for computer memory in the EDVAC are described in great detail, and connected together in quite a compelling way. The book is grounded in history but often has a poetic quality that is difficult to summarize. On the meta level Chun’s use of historical texts is quite thorough and its a nice example of how research can be conducted in this area.

There are two primary things I will take away from Programmed Visions. The first is how software, the very idea of source code, is itself achieved through metaphor, where computing is a metaphor for metaphor itself. Using higher level computer programming languages gives software the appearance of commanding the computer, however the source code is deeply entangled with the hardware itself: the source code is interpreted and compiled by yet more software, which is ultimately reduced to fluctuations of voltage in circuitry. The source code and software cannot be extracted from this performance of computing. This separation of software from hardware is an illusion that was achieved in the early days of computing. Any analysis of software must include the computing infrastructures that make the metaphor possible. Chun chooses an interesting passage from Dijkstra (1970) to highlight the role that source code plays:

In the remaining part of this section I shall restrict myself to programs written for a sequential machine and I shall explore some of the consequences of our duty to use our understanding of a program to make assertions about the ensuing computations. It is my (unproven) claim that the ease and reliability with which we can do this depends critically upon the simplicity of the relation between the two, in particular upon the nature of sequencing control. In vague terms we may state the desirability that the structure of the program text reflects the structure of the computation. Or, in other terms, “What can we do to shorten the conceptual gap between the static program text (spread out in “text space”) and the corresponding computations (evolving in time)?” (p. 21)

Here Dijkstra is talking about the relationship between text (source code) and a performance in time by the computing machinery. It is interesting to think not only about how the gap can be reduced, but also how the text and the performance can fall out of alignment. Of course bugs are the obvious way that things can get misaligned: I instructed the computer to do X but it did Y. But as readers of source code we have expectations about what code is doing, and then there is the resulting complex computational performance. The two are one, and it’s only our mental models of computing that allow us to see a thing called software. Programmed Visions explores the genealogy of those models.
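This gap between the static text and the ensuing computation can be made concrete even before we reach the hardware: the interpreter translates a line of source into another, lower-level text of instructions that it then performs in time. A minimal sketch using Python’s standard `dis` module (the example program is my own, chosen only for illustration):

```python
import dis

# A tiny "static program text": one line a human reads as a single command.
source = "total = sum(n * n for n in range(4))"

# Compiling produces a code object -- already one translation away from
# the text Dijkstra calls "spread out in text space".
code = compile(source, "<example>", "exec")

# dis renders the bytecode instructions the interpreter actually performs,
# the "computations evolving in time" -- and still far from voltages.
dis.dis(code)
```

The disassembly makes visible how much machinery sits between the text we read and the performance we attribute to it, which is Chun’s point about the layered metaphor of software.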

The other striking thing about Programmed Visions is what Chun says about memory. Von Neumann popularizes the idea of computer memory using work by McCulloch that relates the nervous system to voltages through the analogy of neural nets. On a practical level, what this metaphor allowed was for instructions that were previously on cards, or in the movements of computer programmers wiring circuits, to be moved into the machine itself. The key point Chun makes here is that von Neumann’s use of biological metaphors for computing allows him to conflate memory and storage. It is important that this biological metaphor, the memory organ, was science fiction – there was no known memory organ at the time.

The discussion is interesting because it connects with ideas about memory going back to Hume and forward to Bowker (2005). Memories can be used to make predictions, but cannot be used to fully reconstruct the past. Memory is a process of deletion, but always creates the need for more:

If our machines’ memories are more permanent, if they enable a permanence that we seem to lack, it is because they are constantly refreshed–rewritten–so that their ephemerality endures, so that they may “store” the programs that seem to drive them … This is to say that if memory is to approximate something so long lasting as storage, it can do so only through constant repetition, a repetition that, as Jacques Derrida notes, is indissociable from destruction (or in Bush’s terminology, forgetting). (p. 170)

In the elided section above Chun references Kirschenbaum (2008) to stress that she does not mean to imply that software is immaterial. Instead Chun describes computer memory as undead, neither alive nor dead but somewhere in between. The circuits need to be continually electrically performed for the memory to be sustained and alive. The requirement to keep the bits moving reminds me of Kevin Kelly’s idea of movage, and anticipates (I think?) Chun (2016). This (somewhat humorous) description of the computer memory as undead reminded me of the state that archived web content is in. For example when viewing content in the Wayback machine it’s not uncommon to run across some links failing, missing resources, lack of interactivity (search) that was once there. Also, it’s possible to slip around in time as pages are traversed that have been stored at different times. How is this the same and different from traditional archives of paper, where context is lost as well?

So I was surprised in the concluding chapter when Chun actually talks about the Internet Archive’s Wayback Machine (IWM) on pp 170-171. I guess I shouldn’t have been surprised, but the leap from Von Neumann’s first articulation of modern computer architecture forwards to a world with a massively distributed Internet and World Wide Web was a surprise:

The IWM is necessary because the Internet, which is in so many ways about memory, has, as Ernst (2013) argues, no memory–at least not without the intervention of something like the IWM. Other media do not have a memory, but they do age and their degeneration is not linked to their regeneration. As well, this crisis is brought about because of this blinding belief in digital media as cultural memory. This belief, paradoxically, threatens to spread this lack of memory everywhere and plunge us negatively into a way-wayback machine: the so-called “digital dark age.” The IWM thus fixes the Internet by offering us a “machine” that lets us control our movement between past and future by regenerating the Internet at a grand scale. The Internet Wayback Machine is appropriate in more ways than one: because web pages link to, rather than embed, images, which can be located anywhere, and because link locations always change, the IWM preserves only a skeleton of a page, filled with broken–rendered–links and images. The IWM, that is, only backs up certain data types. These “saved” are not quite dead, but not quite alive either, for their proper commemoration requires greater effort. These gaps not only visualize the fact that our constant regenerations affect what is regenerated, but also the fact that these gaps–the irreversibility of this causal programmable logic– are what open the World Wide Web as archive to a future that is not simply stored upgrades of the past. (p. 171-172)

I think some things have improved somewhat since Chun wrote those words, but her essential observation remains true: the technology that furnishes the Wayback Machine is oriented around a document based web, where representations of web resources are stored at particular points in time and played back at other points in time. The software infrastructures that generated those web representations are not part of the archive, and so the archive is essentially in an undead state–seemingly alive, but undynamic and inert. It’s interesting to think about how traditional archives have similar characteristics though: the paper documents that lack adequate provenance, or media artifacts that can be digitized but no longer played. We live with the undead in other forms of media as well.
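The document orientation Chun describes is visible in the data the Wayback Machine exposes: a snapshot is a stored representation pinned to a capture timestamp, not a living resource. A minimal sketch of parsing the kind of JSON the public Wayback availability API (`archive.org/wayback/available`) returns – the response below is a constructed example rather than a live API call:

```python
import json
from datetime import datetime

# A constructed example of the JSON shape the Wayback Machine's availability
# API returns for a queried URL -- illustrative only, not fetched live.
response = json.loads("""
{
  "url": "example.com",
  "archived_snapshots": {
    "closest": {
      "available": true,
      "url": "http://web.archive.org/web/20120101000000/http://example.com/",
      "timestamp": "20120101000000",
      "status": "200"
    }
  }
}
""")

# The "closest" snapshot is a representation captured at one moment in time;
# replaying it later is exactly the undead playback discussed above.
closest = response["archived_snapshots"]["closest"]
captured = datetime.strptime(closest["timestamp"], "%Y%m%d%H%M%S")
print(f"Replaying a representation captured {captured:%Y-%m-%d}, status {closest['status']}")
```

The timestamp is the only trace of the original performance; the software infrastructure that generated the page is absent from the record, which is the gap Chun identifies.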

One of my committee members recently asked for my opinion on why people often take the position that since content is digital we can now keep it all. The presumption being that we keep all data online or in near or offline storage and then rely on some kind of search to find it. I think Chun hits on part of the reason this might be when she highlights how memory has been conflated with storage. For some, the idea that data is stored is equivalent to its having been remembered as well. But it’s actually in the exercise of the data, its use, or being accessed that memory is activated. This position that everything can be remembered because it is digital has its economic problems, but it is an interesting little philosophical conundrum, that will be important to keep in the back of my mind as I continue to read about memory and archives.


Bowker, G. C. (2005). Memory practices in the sciences (Vol. 205). Cambridge, MA: MIT Press.

Chun, W. H. K. (2016). Updating to remain the same: Habitual new media. MIT Press.

Dijkstra, E. W. (1970). Notes on structured programming. Technological University, Department of Mathematics.

Ernst, W. (2013). Digital memory and the archive (J. Parikka, Ed., pp. 113–140). University of Minnesota Press.

Kirschenbaum, M. G. (2008). Mechanisms: New media and the forensic imagination. MIT Press.

Rogers, R. (2017). Doing web history with the internet archive: Screencast documentaries. Internet Histories, 1–13.

David Rosenthal: The Internet Society Takes On Digital Preservation

planet code4lib - Mon, 2018-01-15 16:01
Another worthwhile initiative comes from The Internet Society, through its New York chapter. They are starting an effort to draw attention to the issues around digital preservation. Shuli Hallack has an introductory blog post entitled Preserving Our Future, One Bit at a Time. They kicked off with a meeting at Google's DC office labeled as being about "The Policy Perspective". It was keynoted by Vint Cerf with respondents Kate Zwaard and Michelle Wu. I watched the livestream. Overall, I thought that the speakers did a good job despite wandering a long way from policies, mostly in response to audience questions.

Vint will also keynote the next event, at Google's NYC office February 5th, 2018, 5:30PM – 7:30PM. It is labeled as being about "Business Models and Financial Motives" and, if that's what it ends up being about, it should be very interesting and potentially useful. I hope to catch the livestream.

District Dispatch: Tax season is here: How libraries can help communities prepare

planet code4lib - Fri, 2018-01-12 14:52
This blog post, written by Lori Baux of the Computer & Communications Industry Association, is one in a series of occasional posts contributed by leaders from coalition partners and other public interest groups that ALA’s Washington Office works closely with. Whatever the policy – copyright, education, technology, to name just a few – we depend on relationships with other organizations to influence legislation, policy and regulatory issues of importance to the library field and the public.

It’s hard to believe, but as the holiday season comes to an end, tax season is about to begin.

For decades, public libraries have served as unparalleled resources in their communities, far beyond their traditional, literary role. Libraries assist those who need it most by providing free Internet access, offering financial literacy classes, job training, employment assistance and more. And for decades, libraries have served as a critical resource during tax season.

Each year, more and more Americans feel as though they lack the necessary resources to confidently and correctly file their taxes on time. This is particularly true for moderate and lower-income individuals and families who are forced to work multiple jobs just to make ends meet. The question is “where is help available?”

Libraries across the country are stepping up their efforts to assist local taxpayers in filing their taxes for free. Many libraries offer in-person help, often serving as a Volunteer Income Tax Assistance (VITA) location or AARP Tax-Aide site. But appointments often fill up quickly, and many communities are without much, if any, free in-person tax assistance.

There is an option for free tax prep that libraries can provide—and with little required from already busy library staff. The next time that a local individual or family comes looking for a helping hand with tax preparation, libraries can guide them to a free online tax preparation resource—IRS Free File:

  • Through the Free File Program, those who earned $66,000 or less last year—over 70 percent of all American taxpayers—are eligible to use at least one of 12 brand-name tax preparation software products to file their Federal (and in many cases, state) taxes completely free of charge. Free File starts on January 12, 2018.
  • Free File complements local VITA programs, where people can get in-person help from IRS-certified volunteers. There are over 12,000 VITA programs across the country to help people in your community maximize their refund and claim all the credits that they deserve, including the Earned Income Tax Credit (EITC). Any individual making under $54,000 annually may qualify. More information about VITA sites and AARP Tax-Aide is available online.

With help from libraries and volunteers across the nation, we can work together to ensure that as many taxpayers as possible have access to the resources and assistance that they need to file their returns.

The Computer & Communications Industry Association (CCIA) hosts a website that provides resources to inform and assist eligible taxpayers with filing their taxes, including fact sheets, flyers and traditional and social media outreach tools. CCIA also encourages folks to download the IRS2Go app on their mobile phone.

Thanks to help from libraries just like yours, we can help eligible taxpayers prepare and file their tax returns on time and free of charge.

Lori Baux is Senior Manager for Grassroots Programs, directing public education and outreach projects on behalf of the Computer & Communications Industry Association (CCIA), an international not-for-profit membership organization dedicated to innovation and enhancing society’s access to information and communications.

The post Tax season is here: How libraries can help communities prepare appeared first on District Dispatch.

Open Knowledge Foundation: New edition of Data Journalism Handbook to explore journalistic interventions in the data society

planet code4lib - Fri, 2018-01-12 09:48

This blog post has been reposted from its original source.

The first edition of The Data Journalism Handbook has been widely used and widely cited by students, practitioners and researchers alike, serving as both textbook and sourcebook for an emerging field. It has been translated into over 12 languages – including Arabic, Chinese, Czech, French, Georgian, Greek, Italian, Macedonian, Portuguese, Russian, Spanish and Ukrainian – and is used for teaching at many leading universities, as well as teaching and training centres around the world.

A huge amount has happened in the field since the first edition in 2012. The Panama Papers project undertook an unprecedented international collaboration around a major database of leaked information about tax havens and offshore financial activity. Projects such as The Migrants Files, The Guardian’s The Counted and ProPublica’s Electionland have shown how journalists are not just using and presenting data, but also creating and assembling it themselves in order to improve data journalistic coverage of issues they are reporting on.

The Migrants’ Files saw journalists in 15 countries work together to create a database of people who died in their attempt to reach or stay in Europe.

Changes in digital technologies have enabled the development of formats for storytelling, interactivity and engagement with the assistance of drones, crowdsourcing tools, satellite data, social media data and bespoke software tools for data collection, analysis, visualisation and exploration.

Data journalists are not simply using data as a source, they are also increasingly investigating, interrogating and intervening around the practices, platforms, algorithms and devices through which it is created, circulated and put to work in the world. They are creatively developing techniques and approaches which are adapted to very different kinds of social, cultural, economic, technological and political settings and challenges.

Five years after its publication, we are developing a revised second edition, which will be published as an open access book with an innovative academic press. The new edition will be significantly overhauled to reflect these developments. It will complement the first edition with an examination of the current state of data journalism which is at once practical and reflective, profiling emerging practices and projects as well as their broader consequences.

“The Infinite Campaign” by Sam Lavigne (New Inquiry) repurposes ad creation data in order to explore “the bizarre rubrics Twitter uses to render its users legible”.

Contributors to the first edition include representatives from some of the world’s best-known newsrooms and data journalism organisations, including the Australian Broadcasting Corporation, the BBC, the Chicago Tribune, Deutsche Welle, The Guardian, the Financial Times, Helsingin Sanomat, La Nacion, the New York Times, ProPublica, the Washington Post, the Texas Tribune, Verdens Gang, Wales Online, Zeit Online and many others. The new edition will include contributions from both leading practitioners and leading researchers of data journalism, exploring a diverse constellation of projects, methods and techniques in this field from voices and initiatives around the world. We are working hard to ensure a good balance of gender, geography and themes.

Our approach in the new edition draws on the notion of “critical technical practice” from Philip Agre, which he formulates as an attempt to have “one foot planted in the craft work of design and the other foot planted in the reflexive work of critique” (1997). Similarly, we wish to provide an introduction to a major new area of journalism practice which is at once critically reflective and practical. The book will offer reflection from leading practitioners on their experiments and experiences, as well as fresh perspectives on the practical considerations of research on the field from leading scholars.

The structure of the book reflects different ways of seeing and understanding contemporary data journalism practices and projects. The introduction highlights the renewed relevance of a book on data journalism in the current so-called “post-truth” moment, examining the resurgence of interest in data journalism, fact-checking and strengthening the capacities of “facty” publics in response to fears about “alternative facts” and the speculation about a breakdown of trust in experts and institutions of science, policy, law, media and democracy. As well as reviewing a variety of critical responses to data journalism and associated forms of datafication, it looks at how this field may nevertheless constitute an interesting site of progressive social experimentation, participation and intervention.

The first section on “data journalism in context” will review histories, geographies, economics and politics of data journalism – drawing on leading studies in these areas. The second section on “data journalism practices” will look at a variety of practices for assembling data, working with data, making sense with data and organising data journalism from around the world. This includes a wide variety of case studies – including the use of social media data, investigations into algorithms and fake news, the use of networks, open source coding practices and emerging forms of storytelling through news apps and data animations. Other chapters look at infrastructures for collaboration, as well as creative responses to disappearing data and limited connectivity. The third and final section on “what does data journalism do?”, examines the social life of data journalism projects, including everyday encounters with visualisations, organising collaborations across fields, the impacts of data projects in various settings, and how data journalism can constitute a form of “data activism”.

As well as providing a rich account of the state of the field, the book is also intended to inspire and inform “experiments in participation” between journalists, researchers, civil society groups and their various publics. This aspiration is partly informed by approaches to participatory design and research from both science and technology studies as well as more recent digital methods research. Through the book we thus aim to explore not only what data journalism initiatives do, but how they might be done differently in order to facilitate vital public debates about both the future of the data society as well as the significant global challenges that we currently face.

LITA: This is Jeopardy! Or, How Do People Actually Get On That Show?

planet code4lib - Thu, 2018-01-11 20:55

This past November, American Libraries published a delightful article on librarians that have appeared on the iconic game show Jeopardy! It turns out one of our active LITA members also recently appeared on the show. Here’s her story…

On Wednesday, October 18th, one of my lifelong dreams will come true: I’ll be a contestant on Jeopardy!

It takes several steps to get onto the show: first, you must pass an online exam, but you don’t really learn the results unless you make it to the next stage: the invitation to audition. This step is completed in person, comprising a timed, written test, playing a mock game with other aspiring players in front of a few dozen other auditionees, and chatting amiably in a brief interview, all while being filmed. If you make it through this gauntlet, you go into “the pool”, where you remain eligible for a call to be on the show for up to 18 months. Over the course of one year of testing and eligibility, around 30,000 people take the first test, around 1500 to 1600 people audition in person, and around 400 make it onto the show each season.

For me, the timeline was relatively quick. I tested online in October 2016, auditioned in January 2017, and thanks to my SoCal address, I ended up as a local alternate in February. Through luck of the draw, I was the leftover contestant that day. I didn’t tape then, but was asked back directly to the show for the August 3rd recording session, which airs from October 16th to October 20th.

The call is early – 7:30am – and the day’s twelve potential contestants take turns with makeup artists while the production team covers paperwork, runs through those interview stories one-on-one, and pumps up the contestants to have a good time. Once you’re in, you’re sequestered. There’s no visiting with family or friends who accompanied you to the taping and no cellphones or internet access allowed. You do have time to chat with your fellow contestants, who are all whip smart, funny, and generally just as excited as you are to get to be on this show. There’s also no time to be nervous or worried: you roll through the briefing onto the stage for a quick run-down on how the podiums work (watch your elbows for the automated dividers that come up for Final Jeopardy!), how to buzz in properly (there’s a light around the big game board that you don’t see at home that tells you when you can ring in safely), and under no circumstances are you to write on the screen with ANYTHING but that stylus!

Next, it’s time for your Hometown Howdy, the commercial blurb that airs on the local TV station for your home media market. Since I’d done it before when I almost-but-not-quite made it on the air in February, I knew they were looking for maximum cheese. My friends and family tell me that I definitely delivered.

Immediately before they let in the live studio audience for seating, contestants run through two quick dress rehearsal games to get out any final nerves, test the equipment for the stage crew, and practice standing on the risers behind the podiums without falling off.

Then it’s back to the dressing room, where the first group is drawn. They get a touch-up on makeup, the rest of the contestant group sits down in a special section of the audience, and it’s off to the races! There are three games filmed before the lunch break, then the final two are filmed. The contestants have the option to stay and watch the rest of the day if they’re defeated, but most choose to leave if it’s later on in the filming cycle. The adrenaline crash is pretty huge, and some people may need the space to let out their mixed feelings. If you win, you are whisked back to the dressing room for a quick change, a touch-up again, and back out to the champion’s podium to play again.

You may be asking, when do contestants meet Alex? Well, it happens exactly twice, and both times, the interactions are entirely on film and broadcast in (nearly) their entirety within the show. To put all of those collusion rumors around the recent streak of Austin Rogers to rest, the interview halfway through the first round and the hand-shaking at the end of the game are the only times that Alex and the contestants meet or speak with one another; there is no “backstage” where the answer-giver and the question-providers could possibly mingle. Nor do the contestants ever get to do more than wave “hello” to the writers for the show. Jeopardy! is very careful to keep its two halves very separated. The energy and enthusiasm of the contestant team – Glenn, Maggie, Corina, Lori, and Ryan – is genuine, and when your appearance is complete, you feel as though you have joined a very special family of Jeopardy! alumni.

Once you’ve been a contestant on Jeopardy!, you can never be on the show again. The only exception is if you do well enough to be asked back to the Tournament of Champions. While gag rules prohibit me from saying more about how I did, I can say that the entire experience lived up to the hype I had built around it since I was a child, playing along in my living room and dreaming of the chance to respond in the form of a question.

Islandora: iCamp EU - Call for Proposals

planet code4lib - Thu, 2018-01-11 18:42

Doing something great with Islandora that you want to share with the community? Have a recent project that the world just needs to know about? Send us your proposals to present at iCampEU in Limerick! Presentations should be roughly 20-25 minutes in length (with time after for questions) and deal with Islandora in some way. Want more time or to do a different format? Let us know in your proposal and we'll see what we can do.

You can see examples of previous Islandora camp sessions on our YouTube channel.

The Call for Proposals for iCampEU in Limerick will be open until March 1st.


Islandora: Islandora Camp EU 2018 - Registration

planet code4lib - Thu, 2018-01-11 18:41

Islandora Camp is heading to Ireland June 20 - 22, 2018, hosted by the University of Limerick. Early Bird rates are available until March 1st, 2018, after which the rate will increase to €399,00.

The early bird rate is €360,00. Attendees choose one of two tracks:
Admin: For repository and collection managers, librarians, archivists, and anyone else who deals primarily with the front-end experience of Islandora and would like to learn how to get the most out of it, or developers who would like to learn more about the front-end experience.
Developer: For developers, systems people, and anyone dealing with Islandora at the code level, or any front-end Islandora users who are interested in learning more about the developer side.

