
Feed aggregator

Open Knowledge Foundation: International Data Week: From Big Data to Open Data

planet code4lib - Tue, 2016-10-11 09:00

Report from International Data Week: Research needs to be reproducible, data needs to be reusable and Data Packages are here to help.

International Data Week has come and gone. The theme this year was ‘From Big Data to Open Data: Mobilising the Data Revolution’. Weeks later, I am still digesting all the conversations and presentations (not to mention, bagels) I consumed over its course. For a non-researcher like me, it proved to be one of the most enjoyable conferences I’ve attended with an exciting diversity of ideas on display. In this post, I will reflect on our motivations for attending, what we did, what we saw, and what we took back home.

Three conferences on research data

International Data Week (11-17 September) took place in Denver, Colorado and consisted of three co-located events: SciDataCon, International Data Forum, and the Research Data Alliance (RDA) 8th Plenary. Our main motivation for attending these events was to talk directly with researchers about Frictionless Data, our project oriented around tooling for working with Data “Packages”, an open specification for bundling related data together using a standardized JSON-based description format.
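To make the idea of a JSON-based description a little more concrete, here is a minimal sketch of what such a descriptor might look like, built with Python's standard library. The dataset name, file path, and field names are hypothetical, invented purely for illustration. The `name`, `resources`, `path`, and `schema` keys follow the Data Package and Table Schema specifications as I understand them, so please check the current specs before relying on the details.

```python
import json


def build_descriptor(name, csv_path, fields):
    """Return a minimal Data Package-style descriptor as a dict.

    `fields` is a list of (field_name, field_type) pairs describing
    the columns of the CSV file at `csv_path`.
    """
    return {
        "name": name,
        "resources": [
            {
                "path": csv_path,
                "schema": {
                    "fields": [{"name": n, "type": t} for n, t in fields]
                },
            }
        ],
    }


# Hypothetical dataset, for illustration only.
descriptor = build_descriptor(
    "river-temperatures",
    "data/temperatures.csv",
    [("station", "string"), ("date", "date"), ("temp_c", "number")],
)

# By convention the descriptor is saved as datapackage.json next to the data.
descriptor_json = json.dumps(descriptor, indent=2)
print(descriptor_json)
```

The point of the sketch is only that the description is plain JSON sitting alongside ordinary data files, which is what keeps the format lightweight.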

The concepts behind Frictionless Data were developed through efforts at improving workflows for publishing open government data via CKAN. Thanks to a generous grant from the Sloan Foundation, we now have the ability to take what we’ve learned in civic tech and pilot this approach within various research communities. International Data Week provided one of the best chances we’ve had so far to meet researchers attempting to answer today’s most significant challenges in managing research data. It was time well spent: over the week I absorbed interesting user stories, heard clearly defined needs, and made connections which will help drive the work we do in the months to come.

What are the barriers to sharing research data?

While our aim is to reshape how researchers share data through better tooling and specifications, we first needed to understand what non-technical factors might impede that sharing. On Monday, I had the honor to chair the second half of a session co-organized by Peter Fitch, Massimo Craglia, and Simon Cox entitled Getting the incentives right: removing social, institutional and economic barriers to data sharing. During this second part, Wouter Haak, Heidi Laine, Fiona Murphy, and Jens Klump brought their own experiences to bear on the subject of what gets in the way of data sharing in research.

Mr. Klump considered various models that could explain why and under what circumstances researchers might be keen to share their data (including research being a “gift culture” where materials like data are “precious gifts” to be paid back in kind), while Ms. Laine presented a case study directly addressing a key disincentive for sharing data: fears of being “scooped” by rival researchers. One common theme that emerged across talks was the idea that making it easier to credit researchers for their data via an enabling environment for data citation might be a key factor in increasing data sharing. An emerging infrastructure for citing datasets via DOIs (Digital Object Identifiers) might be part of this. More on this later.

“…making it easier to credit researchers for their data via an enabling environment for data citation might be a key factor in increasing data sharing”

What are the existing standards for research data?

For the rest of the week, I dove into the data details as I presented at sessions on topics like “semantic enrichment, metadata and data packaging”, “Data Type Registries”, and the “Research data needs of the Photon and Neutron Science community”. These sessions proved invaluable as they put me in direct contact with actual researchers where I learned about the existence (or in some cases, non-existence) of community standards for working with data as well as some of the persistent challenges. For example, the Photon and Neutron Science community has a well established standard in NeXus for storing data, however some researchers highlighted an unmet need for a lightweight solution for packaging CSVs in a standard way.

Other researchers pointed out the frustrating inability of common statistical software packages like SPSS to export data into a high quality (e.g. with all relevant metadata) non-proprietary format as encouraged by most data management plans. And, of course, a common complaint throughout was the amount of valuable research data locked away in Excel spreadsheets with no easy way to package and publish them. These are key areas we are addressing now and in the coming months with Data Packages.

Themes and take-home messages

The motivating factor behind much of the infrastructure and standardization work presented was the growing awareness of the need to make scientific research more reproducible, with the implicit requirement that research data itself be more reusable. Fields as diverse as psychology and archaeology have been experiencing a so-called “crisis” of reproducibility. For a variety of reasons, researchers are failing to reproduce findings from their own or others’ experiments. In an effort to resolve this, concepts like persistent identifiers, controlled vocabularies, and automation played a large role in much of the current conversation I heard.

“…the growing awareness of the need to make scientific research more reproducible, with the implicit requirement that research data itself be more reusable”

Persistent Identifiers

Broadly speaking, persistent identifiers (PIDs) are an approach to creating a reference to a digital “object” that (a) stays valid over long periods of time and (b) is “actionable”, that is, machine-readable. DOIs, mentioned above and introduced in 2000, are a familiar approach to persistently identifying and citing research articles, but there is increasing interest in applying this approach at all levels of the research process from researchers themselves (through ORCID) to research artifacts and protocols, to (relevant to our interests) datasets.

We are aware of the need to address this use case and, in coordination with our new Frictionless Data specs working group, we are working on an approach to identifiers on Data Packages.

Controlled Vocabularies

Throughout the conference, there was an emphasis on ensuring that records in published data incorporate some idea of semantic meaning, that is, making sure that two datasets that use the same term or measurement actually refer to the same thing by enforcing the use of a shared vocabulary. Medical Subject Headings (MeSH) from the United States National Library of Medicine is a good example of a standard vocabulary that many datasets use to consistently describe biomedical information.

While Data Packages currently do not support specifying this type of semantic information in a dataset, the specification is not incompatible with this approach. As an intentionally lightweight publishing format, our aim is to keep the core of the specification as simple as possible while allowing for specialized profiles that could support semantics.


Automation

There was a lot of talk about increasing automation around data publishing workflows. For instance, there are efforts to create “actionable” Data Management Plans that help researchers walk through describing, publishing and archiving their data.

A core aim of the Frictionless Data tooling is to automate as many elements of the data management process as possible. We are looking to develop simple tools and documentation for preparing datasets and defining schemas for different types of data so that the data can, for instance, be automatically validated according to defined schemas.
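As a rough illustration of what validating data against a defined schema can mean in practice, here is a minimal sketch using only Python's standard library. It is not the Frictionless Data tooling: the `validate_csv` function, the small set of supported types, and the sample data are all assumptions made for this example.

```python
import csv
import io
from datetime import date

# Casting functions for a small, assumed subset of schema field types.
CASTERS = {
    "string": str,
    "integer": int,
    "number": float,
    "date": date.fromisoformat,
}


def validate_csv(text, schema):
    """Check CSV text against a schema; return (row, field, message) errors."""
    errors = []
    reader = csv.DictReader(io.StringIO(text))
    for row_number, row in enumerate(reader, start=2):  # the header is row 1
        for field in schema["fields"]:
            name = field["name"]
            cast = CASTERS[field["type"]]
            value = row.get(name)
            if value is None:
                errors.append((row_number, name, "missing value"))
                continue
            try:
                cast(value)
            except ValueError:
                message = f"not a valid {field['type']}: {value!r}"
                errors.append((row_number, name, message))
    return errors


# Hypothetical schema and data, for illustration only.
schema = {
    "fields": [
        {"name": "station", "type": "string"},
        {"name": "date", "type": "date"},
        {"name": "temp_c", "type": "number"},
    ]
}
sample = "station,date,temp_c\nA1,2016-09-12,11.5\nA2,2016-09-13,warm\n"

errors = validate_csv(sample, schema)
for error in errors:
    print(error)
```

Because the schema is just data, the same check can run unattended every time a dataset is prepared for publication, which is the kind of automation described above.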

Making Connections

Of course, one of the major benefits of attending any conference is the chance to meet and interact with other research projects. For instance, we had really great conversations with the Mackenzie DataStream project, an amazing project for sharing and exploring water data in the Mackenzie River Basin in Canada. The technology behind this project already uses the Data Packages specifications, so look for a case study on the work done here on the Frictionless Data site soon.

There is never enough time in one conference to meet all the interesting people and explore all the potential opportunities for collaboration. If you are interested in learning more about our Frictionless Data project or would like to get involved, check out the links below. We’re always looking for new opportunities to pilot our approach. Together, hopefully, we can reduce the friction in managing research data.

Galen Charlton: Visualizing the global distribution of Evergreen installations from tarballs

planet code4lib - Tue, 2016-10-11 02:04

In August I made a map of Koha installations based on geolocation of the IP addresses that retrieved the Koha Debian package. Here’s an equivalent map for Evergreen:


As with the Koha map, this is based on the last 52 weeks of Apache logs as of the date of this post. I included only complete downloads of Evergreen ILS tarballs and excluded downloads done by web crawlers.  A total of 1,317 downloads from 838 distinct IP addresses met these criteria.
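For anyone who wants to try something similar, the filtering step can be sketched roughly as follows. This is not the script I used: it assumes the Apache "combined" log format, treats a "complete" download as a GET with status 200 and at least `min_bytes` bytes sent, guesses at the tarball file naming, and uses a crude user-agent pattern to spot crawlers. All of those are assumptions to adjust for your own logs.

```python
import re

# Apache "combined" format:
# ip ident user [time] "request" status bytes "referer" "user-agent"
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)
# Assumed tarball naming; adjust to match the real file names.
TARBALL_RE = re.compile(r"/Evergreen-ILS-[\w.\-]+\.tar\.gz$")
# Crude crawler heuristic based on the user-agent string.
BOT_RE = re.compile(r"bot|crawl|spider|slurp", re.IGNORECASE)


def count_downloads(lines, min_bytes=1):
    """Return (number of downloads, number of distinct IP addresses)."""
    downloads = 0
    ips = set()
    for line in lines:
        m = LINE_RE.match(line)
        if m is None:
            continue  # not a combined-format line
        if m["method"] != "GET" or m["status"] != "200":
            continue  # skips partial (206) and failed requests
        if not TARBALL_RE.search(m["path"]):
            continue
        if BOT_RE.search(m["agent"]):
            continue
        size = 0 if m["size"] == "-" else int(m["size"])
        if size < min_bytes:
            continue
        downloads += 1
        ips.add(m["ip"])
    return downloads, len(ips)


# Made-up log lines using documentation-only IP ranges.
sample_log = [
    '203.0.113.5 - - [01/Oct/2016:10:00:00 +0000] "GET /downloads/Evergreen-ILS-2.10.7.tar.gz HTTP/1.1" 200 5000000 "-" "Wget/1.17"',
    '203.0.113.5 - - [02/Oct/2016:10:00:00 +0000] "GET /downloads/Evergreen-ILS-2.11.0.tar.gz HTTP/1.1" 200 5100000 "-" "Mozilla/5.0"',
    '198.51.100.7 - - [02/Oct/2016:11:00:00 +0000] "GET /downloads/Evergreen-ILS-2.11.0.tar.gz HTTP/1.1" 200 5100000 "-" "Googlebot/2.1"',
    '198.51.100.9 - - [02/Oct/2016:12:00:00 +0000] "GET /downloads/Evergreen-ILS-2.11.0.tar.gz HTTP/1.1" 206 1024 "-" "curl/7.47"',
]

result = count_downloads(sample_log, min_bytes=1_000_000)
print(result)
```

The distinct IP addresses that survive the filter are what would then be passed to a geolocation lookup to place points on the map.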

The interactive version can be found on Plotly.

pinboard: Free Programming Ebooks - O'Reilly Media

planet code4lib - Mon, 2016-10-10 15:35
If you're looking for some free tech ebooks, O'Reilly has you covered: cc: #code4lib #libtech #libtechwomen #mashcat

Villanova Library Technology Blog: Dig Deeper: VuFind Summit 2016

planet code4lib - Mon, 2016-10-10 14:00

Did you know that an innovative search engine used by libraries in numerous countries for browsing catalogs was developed right here at Villanova University? It’s called VuFind, and its open source coding allows for continued innovation by institutions and individuals throughout the world.

Some of the most important contributors are coming to Falvey Memorial Library this week, on Oct. 10 and 11, for VuFind Summit 2016. Registration has closed, but you can still attend the conference virtually on the Remote Participation section of the VuFind Summit webpage. Speaking of remote participation, this year’s VuFind Summit will feature a video conference with another VuFind Summit occurring concurrently in Germany.

The VuFind Summit 2015 group.

This year’s conference includes speakers such as Andrew Nagy, Leila Gonzales and Bob Haschart, among others. Nagy, one of the developers involved in starting the VuFind project here at Villanova, will be giving a talk on his new FOLIO project. FOLIO is another open source project that will integrate VuFind as it attempts to help libraries work together in new ways.

Gonzales has devised a method for using VuFind for geographical data. On her map interface, a user can draw a shape and pull full records from the designated space. Her talk features a brainstorming session for thinking up new features and applications for her software. Haschart will discuss his new SolrMarc software, which includes “extended syntax, faster indexing with multi-threading, easier customization of Java indexing code” (from Summit website above).

VuFind Summit could not be promoted, nor indeed occur, without speaking of Demian Katz. He is the VuFind Project Manager who has worked here at the Falvey Memorial Library since 2009. Demian brings the conference together each year and has even published scholarly articles on the topic of VuFind. Anyone who has spoken to him, or heard him lecture, can easily detect his passion for innovative technologies and how the user engages with them. His talk will focus on the innovations made since last year’s VuFind Summit, and he will participate heavily in mapping out the next year’s innovations.

Demian Katz lectures at VuFind Summit 2015.

I know, on a personal level, that if you aren’t a coder, then this event might not seem pertinent to you. I encourage you, however, to check out the live stream or the YouTube videos that will be posted subsequently. Not many universities can list “developed an internationally renowned search engine” on their curriculum vitae. VuFind is part of what makes Villanova University a top 50 college in the country; VuFind is part of your daily research experience here at Villanova. It’s certainly worthwhile to give attention to those specialists who make VuFind a reality.

Article by William Repetto, a graduate assistant on the Communications and Marketing Team at the Falvey Memorial Library. He is currently pursuing an MA in English at Villanova University.




Open Knowledge Foundation: OpenTrials launches beta version today at the World Health Summit

planet code4lib - Mon, 2016-10-10 12:18

For immediate release

Open Knowledge International is delighted to announce the launch of the public preview beta version of OpenTrials at a panel session on ‘Fostering Open Science in Global Health’ at the World Health Summit, the world’s foremost forum for strategic questions of Global Health, today, 10 October 2016. OpenTrials is an open, online database of information about the world’s clinical trials, funded by the Laura and John Arnold Foundation through the Center for Open Science. The project, which is designed to increase transparency and improve access to research, is directed by Dr. Ben Goldacre, an internationally known leader on clinical transparency, and is being built by Open Knowledge International.

OpenTrials works like a search engine, with advanced search options for filtering results by criteria such as drug and disease area. All data and documents for each trial included are “threaded” together and presented alongside each other. At the World Health Summit, the team will be demonstrating how the OpenTrials interface works, including how to explore trials and filter results by criteria such as drug and disease area.  They will also demonstrate the power of linking clinical trial information together, showing how it can be used to highlight important discrepancies in the data.

Explore the database at

We want the information provided on OpenTrials to inform decision-making and lead to better medical services worldwide. We expect a range of potential uses for the platform:

  • A public health researcher could find out more about the range of trials on a drug, searching by various criteria to match a specific population.
  • A doctor interested in critical appraisal of research papers could see if sources of bias for specific trials have already been assessed by experts.
  • A researcher could see if the same trial reports somewhat different methods or results in different places.
  • A patient interested in participating in a trial for their condition could identify trials in their geographical area which are enrolling.

A crowdsourcing functionality allows users to contribute data and documents and to provide feedback on the accuracy of trial information.

OpenTrials currently extracts and displays data from, EU CTR, HRA, WHO ICTRP, and PubMed, and risk of bias assessments from the Cochrane Schizophrenia group. After the beta launch, we plan to integrate systematic review data from Epistemonikos and other sources. There are seven additional sources of data that have been extracted, but can’t currently be displayed because of licensing issues – we are working with these sources of data to get permission to publish. We’ll keep updating the OpenTrials blog as they become available.

“This project aims to draw together everything that is known around each clinical trial. The end product will provide valuable information for patients, doctors, researchers, and policymakers…” – Dr. Ben Goldacre

“There have been numerous positive statements about the need for greater transparency on information about clinical trials, over many years, but it has been almost impossible to track and audit exactly what is missing, or easily identify discrepancies in information about trials” explained Dr. Goldacre, the project’s Chief Investigator and a Senior Clinical Research Fellow in the Centre for Evidence Based Medicine at the University of Oxford. “This project aims to draw together everything that is known around each clinical trial. The end product will provide valuable information for patients, doctors, researchers, and policymakers—not just on individual trials, but also on how whole sectors, researchers, companies, and funders are performing. It has the potential to show who is failing to share information appropriately, who is doing well, and how standards can be improved.”

“OpenTrials is an important step towards ensuring researchers, journalists, and patient groups have access to the medical information they need,” said Pavel Richter, CEO of Open Knowledge International. “Through the OpenTrials platform, researchers can advance science more quickly, doctors can easily find the latest evidence to improve services, and patients can locate information about pressing public health issues. OpenTrials is a great example of the work we are doing at Open Knowledge International to equip civil society organisations with the tools and information they need to address social problems and improve people’s lives.”

“OpenTrials is an important step towards ensuring researchers, journalists, and patient groups have access to the medical information they need.” – Pavel Richter, CEO of Open Knowledge International

The first phase of the Open Trials project is scheduled for completion in March 2017. For project updates, please follow @opentrials on twitter or get in touch with us at  A Hack Day (a World Health Summit Satellite event) took place on 8 October in Berlin. For more details, see here:

Further information on speakers and topics of the World Health Summit 2016:

The World Health Summit is open to media representatives:

Editor’s notes:

Ben Goldacre

Ben is a doctor, academic, writer, and broadcaster, and currently a Senior Clinical Research Fellow in the Centre for Evidence Based Medicine at the University of Oxford. His blog is at and he is @bengoldacre on twitter. Read more here. His academic and policy work is in epidemiology and evidence based medicine, where he works on various problems including variation in care, better uses of routinely collected electronic health data, access to clinical trial data, efficient trial design, and retracted papers. In policy work, he co-authored this influential Cabinet Office paper, advocating for randomised trials in government, and setting out mechanisms to drive this forwards. He is the co-founder of the AllTrials campaign. He engages with policy makers. Alongside this he also works in public engagement, writing and broadcasting for a general audience on problems in evidence based medicine. His books have sold over 600,000 copies.

Open Knowledge International

Open Knowledge International is a global non-profit organisation focussing on realising open data’s value to society by helping civil society groups access and use data to take action on social problems. Open Knowledge International addresses this in three steps: 1) we show the value of open data for the work of civil society organizations; 2) we provide organisations with the tools and skills to effectively use open data; and 3) we make government information systems responsive to civil society.

The Laura and John Arnold Foundation

LJAF is a private foundation committed to producing substantial, widespread, and lasting reforms that will maximize opportunities and minimize injustice in our society. Its strategic investments are currently focused on criminal justice, education, public accountability, evidence-based policy, and research integrity. LJAF has offices in Houston, New York City and Washington D.C.


Center for Open Science

COS is a non-profit technology company providing free, open source software and services to increase inclusivity and transparency of research. COS supports shifting incentives and practices to align more closely with scientific values. COS develops the Open Science Framework as an infrastructure to enable a more open and transparent research workflow across all of the sciences.

World Health Summit

Under the high patronage of German Chancellor Angela Merkel, French President François Hollande and European Commission President Jean-Claude Juncker, the WHS attracts about 1,800 participants from more than 80 countries. It is the premier international platform for exploring strategic developments and decisions in the area of healthcare.

Terry Reese: MarcEdit Updates

planet code4lib - Mon, 2016-10-10 03:02

This round of MarcEdit updates focused on the Task Manager/Task Editing.  After talking to some folks, I really tried to do some work to make it easier for folks when sharing network tasks.  Change logs:


  • Enhancement: Task List will preserve a backup task list before save, and will restore it if the original task list is deleted or zero bytes.
  • Enhancement: Task List: Added .lock files to prevent multiple users from editing files on the network at the same time.
  • Enhancement: Task List: Updated Task Manager/process to remove all file paths.  Please see:
  • Enhancement: COM Object additions.


  • Bug Fix: Task Manager: when cloning a task that had been edited, a leak occurred which could also corrupt the task list
  • Enhancement: Task List will preserve a backup task list before save, and will restore it if the original task list is deleted or zero bytes.
  • Enhancement: Task List: Added .lock files to prevent multiple users from editing files on the network at the same time.
  • Enhancement: Task List: Updated Task Manager/process to remove all file paths.  Please see:


Downloads can be found at:


FOSS4Lib Recent Releases: YAZ - 5.17.0

planet code4lib - Sun, 2016-10-09 16:45

Last updated October 9, 2016. Created by Peter Murray on October 9, 2016.

Package: YAZ
Release Date: Tuesday, October 4, 2016

Equinox Software: New Additions to SPARK/PaILS

planet code4lib - Fri, 2016-10-07 15:35


Duluth, Georgia–October 7, 2016

Equinox is pleased to announce two new additions to the SPARK/PaILS Consortium. Claysburg Area Public Library, Hollidaysburg Area Public Library, Martinsburg Community Library, Roaring Spring Community Library, Tyrone-Snyder Public Library, and Williamsburg PA Public Library (affectionately known as The Blair County 6, or BC6 for short) joined Altoona and Bellwood-Antis within SPARK. Bellwood-Antis Public Library migrated two weeks before The BC6.

The BC6 previously shared a server to host their ILS.  They will begin physical resource sharing within SPARK very soon.  Bellwood-Antis Public Library is also in Blair County.  Along with previously migrated Altoona, Blair County libraries are now unified.

Erica Rohlfs, Equinox Project Manager, had this to say about the migrations:  

“Being a part of the Blair Libraries’ migrations was exciting and challenging. There were 3 total migrations, Altoona, Bellwood-Antis, and BC6.  It’s almost ineffable to describe the joy that came with knowing how large this migration was for both the Blair Library System and the community they serve. The librarians put an incredible amount of time and thought into their migrations, not only in the complex policy decisions but also in details, such as streamlining the BCLS branding. I just can’t say enough about how hard they worked to unify their catalog, merge patrons, and streamline resource sharing. Also, their new children’s library cards are awesome!”

Scott Thomas, SPARK/PaILS Executive Director, added: “One of the most exciting things about bringing new library systems onto SPARK is to see how the change directly improves services to library patrons. SPARK will enable the citizens of Blair County to use multiple libraries with ease and to benefit from the enhanced access to many collections county-wide.”


About Equinox Software, Inc.
Equinox was founded by the original developers and designers of the Evergreen ILS. We are wholly devoted to the support and development of open source software in libraries, focusing on Evergreen, Koha, and the FulfILLment ILL system. We wrote over 80% of the Evergreen code base and continue to contribute more new features, bug fixes, and documentation than any other organization. Our team is fanatical about providing exceptional technical support. Over 98% of our support ticket responses are graded as “Excellent” by our customers. At Equinox, we are proud to be librarians. In fact, half of us have our ML(I)S. We understand you because we *are* you. We are Equinox, and we’d like to be awesome for you. For more information on Equinox, please visit

About Pennsylvania Integrated Library System
PaILS is the Pennsylvania Integrated Library System (ILS), a non-profit corporation that oversees SPARK, the open source ILS developed using Evergreen Open Source ILS.  PaILS is governed by a 9-member Board of Directors. The SPARK User Group members make recommendations and inform the Board of Directors.  A growing number of libraries large and small are PaILS members.
For more information about PaILS and SPARK, please visit

About Evergreen
Evergreen is an award-winning ILS developed with the intent of providing an open source product able to meet the diverse needs of consortia and high transaction public libraries. However, it has proven to be equally successful in smaller installations including special and academic libraries. Today, over 1400 libraries across the US and Canada are using Evergreen including NC Cardinal, SC Lends, and B.C. Sitka.
For more information about Evergreen, including a list of all known Evergreen installations, see

Tim Ribaric: Presentation Material from Access 2016 Ignite Talks

planet code4lib - Fri, 2016-10-07 13:26

In my quest to continue flogging the only good idea I've ever had, I got to present at Access about my Bot.


Ed Summers: Appraisal in Web Archives: A Schema

planet code4lib - Fri, 2016-10-07 04:00

So, as you can tell if you’ve been following along with my reading of Nicolini’s Practice theory, work, and organization, I’ve been trying to survey the field that is practice theory. The goal, however, isn’t just to increase my knowledge, but to apply it, and hopefully learn if it can be useful for my PhD work and beyond.

As I described in my proposal for the independent study, I’m interested to see if practice theory offers a useful conceptual lens for studying the ways in which archivists do appraisal on the Web. Earlier this year I interviewed 30 archivists about their appraisal work in web archives. I recorded the interviews but never had time to transcribe them all. I did use my interview notes and summaries as data for a paper, Bots, Seeds and People, that I will be presenting with my co-author (and advisor) Ricky Punzalan at the next CSCW. Fortunately Ricky had some extra money to get the interviews transcribed, which sets the stage for my independent study this semester.

My goal is to take another pass through the interviews using what I learned from my previous analysis and also some ideas from practice theory. Andrea Wiggins (who is kind enough to be guiding me through this independent study) advised me to write down initial ideas about the schema because they are likely to change as the schema is tested, and it can be difficult to remember why initial decisions were made. Being able to contrast what was learned through coding, with what was expected/assumed at the beginning can provide valuable insights.

So here are the high level themes, and some of the parts that emerged in my previous analysis of appraisal in web archives:

Crawl Modalities

  • domain
  • website
  • individual document

Information Structures

  • directories
  • social networks (social filtering)
  • streams / feeds
  • url patterns

Time/Money (Resources)

  • grants
  • testing
  • storage (sampling / fidelity)
  • quotas
  • vendors


  • volunteers
  • technicians
  • archivists
  • software developers
  • organizational groups
  • (actual) social networks
  • partnerships
  • competitions
  • environmental scanning


  • seed lists
  • crawlers
  • indexers
  • viewers
  • nomination tools
  • spreadsheets
  • email
  • social bookmarking


  • storage
  • dynamic web content
  • staffing
  • timeliness
  • coordination

Design Components

  • explicit
  • implicit

To these I think I will add some ideas from practice theory. I will start out by trying to code for them, but it may be that they are concepts that help me interpret the data later instead. It would be useful to be able to coordinate these concepts with the ones I uncovered in the previous study to see if practice theory is useful here.

  • goals (purpose of practices)
  • effects (who/what is affected by practices)
  • history (how did we get here)
  • training/mentorship
  • artifacts/objects (not just tools)
  • work (specific actions & activities)
  • products (work outcomes)
  • professions/communities (groups of practitioners)
  • rules (norms, behaviors)
  • temporal features
  • spatial features

I think Nicolini’s metaphor of zooming in and zooming out could also provide a useful method for me as I take another look at my interviews. He recommends cycling between close attention to practices and then zooming out to identify relationships between practices. As I examine appraisal in web archives as a practice I think it will be useful to consider it in relation to other practices–perhaps appraisal in other domains. Also I suspect that appraisal itself will contain multiple practices within it.

The attention to breakdown as a cross-cutting concern could also be really useful in helping make some of these practices visible. Breakdown itself figures directly into practice theory, because of the connection to Heidegger. So maybe it is conceptual glue for connecting my previous sand is a conceptual

My goal for the next two weeks is to try out these codes on 2 or 3 transcripts and see how well they work. This will involve getting the transcripts loaded up (probably in MAXQDA) and entering in my initial schema. Given the results of that test I’ll adjust the schema as needed and be set up to take a full pass through the transcripts.

I’m going to be deviating from my original reading list to follow up on a few leads from my reading of Nicolini (2012):

Kuutti, K. (1996). Activity theory as a potential framework for human-computer interaction research. Context and consciousness: Activity theory and human-computer interaction, pages 17–44.

Engeström, Y., Engeström, R., and Vähäaho, T. (1999). Activity Theory and Social Practice: Cultural-Historical Approaches, chapter When the center does not hold: The importance of knotworking. Aarhus University Press Aarhus, Denmark.

Miettinen, R. and Virkkunen, J. (2005). Epistemic objects, artefacts and organizational change. Organization, 12(3):437–456.

Miettinen, R. (2006). Epistemology of transformative material activity: John Dewey’s pragmatism and cultural-historical activity theory. Journal for the Theory of Social Behaviour, 36(4):389–408.

A few more related things that came up during my meeting this week:

  • thinking about effects of distributed/virtual organizations (Winter, Berente, Howison, & Butler, 2014) and units/layers of organizational analysis (Scott, Davis, & others, 2015)
  • infrastructural inversion: a name for the technique of looking at breakdowns for insight into infrastructure
  • critical incident technique as a research method
  • Stigmergy - social filtering as appraisal technique?
  • pay attention to things vs qualities (nouns vs adverbs) when coding

Nicolini, D. (2012). Practice theory, work, and organization: An introduction. Oxford University Press.

Scott, W. R., Davis, G. F., & others. (2015). Organizations and organizing: Rational, natural and open systems perspectives. Routledge.

Winter, S., Berente, N., Howison, J., & Butler, B. (2014). Beyond the organizational ‘container’: Conceptualizing 21st century sociotechnical work. Information and Organization, 24(4), 250–269.

Ed Summers: Nicolini (9)

planet code4lib - Fri, 2016-10-07 04:00

In the final chapter of Practice theory, work, and organization Nicolini concludes by presenting his own approach to practice theory, which he calls a toolkit approach:

I am not interested in proposing a new theory of practice. Instead, I will embrace a different strategy that can be described as a form of programmatic eclecticism or, more simply, a toolkit approach. My main tenet is that to study practice empirically we are better served by a strategy based on deliberately switching between theoretical sensitivities. (p. 213)

In short, the toolkit approach that I advocate here responds to the principle that the aim of social science is to provide a richer and more nuanced understanding of the world, and not to offer simplified answers to complex questions. More clearly, good social science makes the world more complex, not simpler. Thicker, not thinner, descriptions are the aim of good social science. And so it should be in the attempt to understand practices. (p. 215)

This toolkit approach is less about applying existing theories to new phenomena, and more about increasing articulation, or useful knowledge about the world (Stengers, 1997). He also invokes Barad (2003) and pragmatists to say that theories are actually attempts to reconfigure the world in useful and productive ways. These attempts at doing things in the world sometimes work but often the world bites back in useful and interesting ways. Theories and methods are conceived of as packages. Strangely this is a topic that has come up in my Ethnography class this semester where the professor has stressed how deeply interrelated theory and methodology are.

This package of theory and method is needed for:

  • describing the world in terms of practices (not systems, actors, classes)
  • textual representation: being able to capture, communicate and experiment
  • establishing an infra-language for generating new stories and theories

He uses the metaphor of zooming in on practices in a particular place and time and then zooming out to where practices can be compared with other practices. These are alternated and repeated, as different theoretical lenses. Ethno-methodology is an example of zoomed in attention: micro-ethnography, organizational ethnography, shadowing, conversational analysis, and attention to sequence are all useful. But if analysis is limited to these techniques the view can become locked in and extremely formalized.

Another technique of zooming in focuses attention on the body: how are practices achieved with and through the body. Also of interest is how the body itself is shaped by practice. It can also be useful to focus on artifacts in relation to the body – the materiality of practice.

Also of interest is auto-poiesis, or creativity – the application of practices to particular times and places to suit the contingencies of circumstance. Noting how practices are adapted and made unique can help identify them. In addition, practices are characterized by their durability or persistence over time. What are the mechanics of persistence that allow practices to perpetuate?

The zooming out process is mostly a process of taking an identified practice and situating it with other practices.

In a sense, then, all practices are involved in a variety of relationships and associations that extend in both space and time, and form a gigantic, intricate, and evolving texture of dependencies and references. Paraphrasing Latour (2005, p. 44; see also Schatzki, 2002), we can state that practice is always a node, a knot, and a conglomerate of many types of material and human agencies that have to be patiently untangled.

The metaphor of the knot is one that’s come up a few times in Nicolini’s book … and I ran across it recently in Jackson, Gillespie, & Payette (2014). I notice from the notes and citations I’ve accumulated during my reading that the knot also appears in Engeström (1999). I don’t know if it’s just the presentation by Nicolini, but Engeström seems to keep popping up in interesting ways – so I’d like to follow up my reading of practice theory by digging into some of his work. Another word that gets used a lot when describing the process of zooming out to look at relationships between practices is assemblage, which isn’t really referenced at all but seems to be drawn from Deleuze (1998). Just glancing at the Wikipedia page about Assemblage I can see that constellation is also used in assemblage theory. Somewhere along the line I picked this word up as well, but I don’t remember where. I think I’ve looked at (???) before and been intimidated. It might be interesting to read some of the theory/criticism around Deleuze at some point.

Nicolini offers up a few things to focus on to achieve this effect of zooming out from individual practices:

  • compare here/now of a practice with the there/then of another
  • how do practices “conjure” or establish social arrangements?
  • what are the interests/projects/hopes that led to the current state of affairs?

He also suggests that shadowing, the sociology of translation and Actor Network Theory are useful ways of zooming out the theoretical perspective (Czarniawska-Joerges, 2007; Latour, 2005; Law, 2009). ANT’s idea of following the actors can be a useful technique for discovering connections between practices. The actors can be people, artifacts and inscriptions. The sociology of translation (an idea from ANT) can help examine how relations/associations are kept in place (Callon, 1984). I think I’ve read this piece by Callon before (at least it’s in my BibDesk database already), but I should move it up in the queue to read. I read Reassembling the Social a few years ago, probably before I had enough context to understand it. Czarniawska is new to me, so that might also merit some reading up on.

Again the word “knotted” is used to describe how practices are connected together:

The idea of translation invites us to appreciate that associations need to be “knotted” and kept actively in place through the coordination of humans and non-human mediators such as forms, software, policy documents, plans, contracts, and spaces. Only when all these resources are aligned to form chains of translation in such a way that the results of an activity are stabilized and turned into a more or less solid black box, can effecting the activity of another practice be accomplished. (p. 231-232)

Nicolini points out that ANT and translation don’t really explain why things get knotted together, other than through somewhat bleak implications of power. Practices, on the other hand, provide a pro-social, teleo-affective (Schatzki) grounding to work from–where goals matter, and aren’t reduced to notions of Power.

The second class of techniques for zooming out is what Nicolini calls practice networks or action nets (Czarniawska, 2004). The idea here is to ask where the effects of the practice under consideration are being felt. How are the results of the practice used in other contexts?

I’m actually really glad that the book closes by connecting the dots between practice theories and Latour, since this is something that had been at the back of my mind while I was working through the chapters. Latour seemed to be conspicuously missing in the review of the literature of practice, but Nicolini was actually saving him for the conclusion. I also like the formulation that practice provides an almost ethical dimension to ANT that isn’t simply political–not that politics are ever really simple. It’s just the level of focus isn’t at the macro level necessarily, but in the hands of people achieving things in their local environments, for their comfort and survival perhaps–not for domination.

Nicolini also brings back the idea of Cultural and Historical Activity Theory to stress the importance of analyzing how we got here. Historical analysis is key to understanding power relations in the current state of affairs and how they are inscribed in practice. Zooming out on the temporal dimension provides this historical view.

The process of zooming in and zooming out can be repeated, but it also can be achieved by having multiple research projects open at the same time. Each project provides a zoomed in perspective on a particular phenomenon, but the connections between projects offer an opportunity to reflect from a zoomed out perspective. This idea appeals to my own habit of keeping multiple plates spinning at once. I suspect many people work this way. He offers the rhizome as a metaphor or talisman of this sort of work–rather than a linear process. Nicolini does say that the idea of zooming in and out suggests that the world is organized into micro, meso and macro levels – which is something he does not want to suggest. Instead he thinks it’s more a question of refocusing on specific circumstances, and then relations between those sites: moving around above practices and then hovering above particular practices.

The book concludes by saying that none of the chapters are meant to be formulas but just suggestions for ways of working to be tried.

So my last words are: give it a go and enjoy responsibly!

I feel like lots and lots of rhizomatic exploration awaits from this very useful book. Thanks Nicolini :) The only problem is that I think I may need to adjust my reading for my independent study based on what I’ve learned. But that’s what independent studies are for, right?


Barad, K. (2003). Posthumanist performativity: Toward an understanding of how matter comes to matter. Signs, 28(3), 801–831.

Callon, M. (1984). Some elements of a sociology of translation: Domestication of the scallops and the fishermen of St Brieuc Bay. The Sociological Review, 32(S1), 196–233.

Czarniawska, B. (2004). On time, space, and action nets. Organization, 11(6), 773–791.

Czarniawska-Joerges, B. (2007). Shadowing: And other techniques for doing fieldwork in modern societies. Copenhagen Business School Press.

Jackson, S. J., Gillespie, T., & Payette, S. (2014). The policy knot: Re-integrating policy, practice and design in CSCW studies of social computing. In Proceedings of the 17th ACM Conference on Computer Supported Cooperative Work & Social Computing (pp. 588–602). Association for Computing Machinery.

Latour, B. (2005). Reassembling the social: An introduction to actor-network-theory. Oxford University Press.

Law, J. (2009). Actor network theory and material semiotics. In B. S. Turner (Ed.), The new Blackwell companion to social theory (pp. 141–158). Oxford: Wiley-Blackwell.

Schatzki, T. (2002). The site of the social: A philosophical exploration of the constitution of social life and change. University Park: Pennsylvania State University Press.

Stengers, I. (1997). Power and invention: Situating science. Minneapolis: University of Minnesota Press.

Jez Cope: Software Carpentry: SC Test; does your software do what you meant?

planet code4lib - Thu, 2016-10-06 17:51

“The single most important rule of testing is to do it.”
Brian Kernighan and Rob Pike, The Practice of Programming (quote taken from SC Test page)

One of the trickiest aspects of developing software is making sure that it actually does what it’s supposed to. Sometimes failures are obvious: you get completely unreasonable output or even (shock!) a comprehensible error message.

But failures are often more subtle. Would you notice if your result was out by a few percent, or consistently ignored the first row of your input data?

The solution to this is testing: take some simple example input with a known output, run the code and compare the actual output with the expected one. Implement a new feature, test and repeat. Sounds easy, doesn’t it?
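As a minimal sketch of that idea in Python: the `column_mean` function and its sample data below are hypothetical, invented purely for illustration and not taken from any SC Test entry. The test feeds in a small input whose answer can be worked out by hand and compares the actual result with the expected one.

```python
import math


def column_mean(rows):
    """Return the mean of the first column of a list of rows.

    Hypothetical example function, used only to have something to test.
    """
    values = [row[0] for row in rows]
    return sum(values) / len(values)


def test_column_mean_uses_every_row():
    # Known input: the mean of 1, 2 and 6 is 3, worked out by hand.
    rows = [(1.0,), (2.0,), (6.0,)]
    # If the first row were silently skipped, the result would be 4.0,
    # so this comparison would catch that kind of subtle failure.
    assert math.isclose(column_mean(rows), 3.0)


test_column_mean_uses_every_row()
```

Comparing with `math.isclose` rather than `==` is a deliberate choice here, since floating-point results are rarely exactly equal to a hand-computed value.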

But then you implement a new bit of code. You test it and everything seems to work fine, except that your new feature required changes to existing code and those changes broke something else. So in fact you need to test everything, and do it every time you make a change. Further than that, you probably want to test that all your separate bits of code work together properly (integration testing) as well as testing the individual bits separately (unit testing). In fact, splitting your tests up like that is a good way of holding on to your sanity.
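To illustrate the distinction, here is a small sketch using Python's standard `unittest` module. The two functions, `parse_line` and `row_totals`, are hypothetical stand-ins for "separate bits of code"; one test checks a single function in isolation and the other checks that the two work together.

```python
import unittest


def parse_line(line):
    """Split one comma-separated line into a list of floats."""
    return [float(field) for field in line.split(",")]


def row_totals(text):
    """Parse every non-empty line of text and return the sum of each row."""
    return [sum(parse_line(line)) for line in text.splitlines() if line.strip()]


class UnitTests(unittest.TestCase):
    # Unit test: exercises one function on its own.
    def test_parse_line(self):
        self.assertEqual(parse_line("1,2.5,3"), [1.0, 2.5, 3.0])


class IntegrationTests(unittest.TestCase):
    # Integration test: checks that parsing and summing work together.
    def test_row_totals(self):
        self.assertEqual(row_totals("1,2\n3,4\n"), [3.0, 7.0])
```

Assuming the code is saved in a module, running `python -m unittest` against that module executes every test each time, so a change that breaks existing behaviour shows up immediately.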

This is actually a lot less scary than it sounds, because there are plenty of tools now to automate that testing: you just type a simple test command and everything is verified. There are even tools that enable you to have tests run automatically when you check the code into version control, and even automatically deploy code that passes the tests, a process known as continuous integration or CI.

The big problems with testing are that it’s tedious, your code seems to work without it and no-one tells you off for not doing it.

At the time when the Software Carpentry competition was being run, the idea of testing wasn’t new, but the tools to help were in their infancy.

“Existing tools are obscure, hard to use, expensive, don’t actually provide much help, or all three.”

The SC Test category asked entrants “to design a tool, or set of tools, which will help programmers construct and maintain black box and glass box tests of software components at all levels, including functions, modules, and classes, and whole programs.”

The SC Test category is interesting in that the competition administrators clearly found it difficult to specify what they wanted to see in an entry. In fact, the whole category was reopened with a refined set of rules and expectations.

Ultimately, it’s difficult to tell whether this category made a significant difference. Where the tools to write tests used to be sparse and difficult to use, they are now plentiful, and several options exist for most programming languages. With this proliferation, several tried-and-tested methodologies have emerged which are consistent across many different tools, so while things still aren’t perfect they are much better.

In recent years there has been a culture shift in the wider software development community towards both testing in general and test-first development, where the tests for a new feature are written first, and then the implementation is coded incrementally until all tests pass.
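A rough sketch of what test-first development can look like in Python follows. The `slugify` function is a hypothetical feature chosen only for illustration: the test is written first to pin down the desired behaviour, and the implementation is then filled in until the test passes.

```python
# Step 1: write the test first, describing the behaviour we want.
def test_slugify():
    assert slugify("Hello World") == "hello-world"
    # Extra whitespace should collapse to single hyphens.
    assert slugify("  Open   Data  ") == "open-data"


# Step 2: write just enough implementation to make the test pass.
def slugify(title):
    """Lower-case a title and join its words with hyphens."""
    return "-".join(title.lower().split())


# Step 3: run the test; repeat steps 1-3 for the next piece of behaviour.
test_slugify()
```

Before step 2 is written, running the test fails with a `NameError`, which is the expected starting point in this style of working.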

The current challenge is to transfer this culture shift to the academic research community!

SearchHub: Tick Tock: Time Is Running out to Stump The Chump

planet code4lib - Thu, 2016-10-06 17:00

Time Is Running Out!

There are only a few days left to submit your questions for Stump The Chump at Lucene/Solr Revolution 2016.

If the panel of judges decides your question did the best job of stumping me, you could win some great prizes — so what are you waiting for?

Even if you can’t make it to the Revolution, you can still submit questions to try and stump me. Information on how to submit questions can be found on the session agenda page, and follow the Chump tag here on this blog to find out who won after the conference.

(And if you do plan to attend, don’t forget to register for the conference ASAP!)


LITA: The 2016 LITA Forum includes 3 amazing Keynotes

planet code4lib - Thu, 2016-10-06 16:57

Fort Worth, TX
November 17-20, 2016

Join your LITA and library technology colleagues for the 2016 LITA Forum

  • LITA Forum early bird rates end October 14, 2016
  • The guaranteed discount hotel rate at the Omni Fort Worth Hotel ends Wednesday October 21st, 2016, and as available thereafter.
  • Online registration closes Sunday November 13th, 2016

Register Now!

This year’s Forum has three amazing keynotes you won’t want to miss:


Cecily Walker is the Systems Project Librarian at Vancouver Public Library, where she focuses on user experience, community digital projects, digital collections, and the intersection of social justice, technology, and public librarianship. It was her frustration with the way that software was designed to meet the needs of highly technical users rather than the general public that led her to user experience, but it was her love of information, intellectual freedom, and commitment to social justice that led her back to librarianship. Cecily can be found on Twitter (@skeskali) where she frequently holds court on any number of subjects, but especially lipstick.


Waldo Jaquith is the director of U.S. Open Data, an organization that works with government and the private sector to advance the cause of open data. He previously worked in open data with the White House Office of Science and Technology Policy. No stranger to libraries, Jaquith used to work with digital assets at the Scripps Library at the Miller Center at the University of Virginia, and served on the Board of Trustees at his regional library. He lives near Charlottesville, Virginia with his wife and son. Waldo can also be found on Twitter (@waldojaquith).


Tara Robertson is the Systems Librarian and Accessibility Advocate at CAPER-BC. “I’m a librarian who doesn’t work in a library. I like figuring out how things work, why they break, and how to make them work better. I’m passionate about universal design, accessibility, open source software, intellectual freedom, feminism and Fluevog shoes.”

Twitter (@tararobertson)

Don’t forget the Preconference Workshops

Come to the 2016 LITA Forum a day early and choose to participate in one of two outstanding preconferences.

Librarians can code! A “hands-on” computer programming workshop just for librarians
With presenter: Kelly Smith, founder of Prenda – a learning technology company with the vision of millions of kids learning to code at libraries all over the country.

Letting the Collections Tell Their Story: Using Tableau for Collection Evaluation
With presenters: Karen Harker, Collection Assessment Librarian University of North Texas Libraries; Janette Klein, Interdisciplinary Information Science PhD student University of North Texas; Priya Parwani, Graduate Research Assistant University of North Texas Libraries.

Full Details

Join us in Fort Worth, Texas, at the Omni Fort Worth Hotel located in Downtown Fort Worth, for the 2016 LITA Forum, a three-day education and networking event featuring 2 preconferences, 3 keynote sessions, more than 55 concurrent sessions and 25 poster presentations. It’s the 19th annual gathering of the highly regarded LITA Forum for technology-minded information professionals. Meet with your colleagues involved in new and leading edge technologies in the library and information technology field. Registration is limited in order to preserve the important networking advantages of a smaller conference. Attendees take advantage of the informal Friday evening reception, networking dinners and other social opportunities to get to know colleagues and speakers.

Get all the details, register and book a hotel room at the 2016 Forum Web site.

Forum Sponsors:

OCLC, Yewno, EBSCO, BiblioCommons


See you in Fort Worth.

David Rosenthal: Software Heritage Foundation

planet code4lib - Thu, 2016-10-06 15:00
Back in 2009 I wrote:
who is to say that the corpus of open source is a less important cultural and historical artifact than, say, romance novels.
Back in 2013 I wrote:
Software, and in particular open source software is just as much a cultural production as books, music, movies, plays, TV, newspapers, maps and everything else that research libraries, and in particular the Library of Congress, collect and preserve so that future scholars can understand our society.
There are no legal obstacles to collecting and preserving open source code. Technically, doing so is much easier than general Web archiving. It seemed to me like a no-brainer, especially because almost all other digital preservation efforts depended upon the open source code no-one was preserving! I urged many national libraries to take this work on. They all thought someone else should do it, but none of the someones agreed.

Finally, a team under Roberto di Cosmo with initial support from INRIA has stepped into the breach. As you can see at their website they are already collecting a vast amount of code from open source repositories around the Internet. They are in the process of setting up a foundation to support this work. Everyone should support this important, essential work.

Evergreen ILS: Evergreen 2.11.0 released

planet code4lib - Wed, 2016-10-05 20:21

On behalf of the build-master team and myself, I am pleased to announce the release of Evergreen 2.11.0. Included in Evergreen 2.11.0 are the following new features:

  • Add Date Header to Action Trigger Email/SMS Templates
  • Support for Ubuntu 16.04
  • Purge User Activity
  • Authority Record Import Updates Editor, Edit Date.
  • Authority Propagation Updates Bib Editor, Edit Date
  • Bibliographic Record Source Now Copied to 901$s
  • Option to Update Bib Source and Edit Details on Record Import
  • Staff Client Honors Aged Circulations
  • “Canceled Transit” Item Status
  • Copy Status “Is Available” Flag
  • Email Checkout Receipts
  • Set Per-OU Limits on Allowed Payment Amounts
  • Additional Fields Available for Display in Some Interfaces
  • Merge Notification Preferences Tables in TPAC
  • Improved Holds Screens in My Account
  • Popularity Boost for Ranking Search Results
  • Badge Configuration
  • Removal of Advanced Hold Options link when part holds are expected
  • SIP Renewals
  • Treat SIP Location Field as Login Workstation

These, along with dozens of bug fixes and updates to documentation, are the result of work by more than thirty individuals at over 15 organizations.

To download Evergreen 2.11.0 and to read the full release notes, please visit the downloads page.

LITA: Jobs in Information Technology: October 5, 2016

planet code4lib - Wed, 2016-10-05 19:40

New vacancy listings are posted weekly on Wednesday at approximately 12 noon Central Time. They appear under New This Week and under the appropriate regional listing. Postings remain on the LITA Job Site for a minimum of four weeks.

New This Week

MIT Libraries, Web Developer, Cambridge, MA

Boston College, Digital Library Applications Developer, Chestnut Hill, MA

Visit the LITA Job Site for more available jobs and for information on submitting a job posting.

FOSS4Lib Upcoming Events: JHOVE Online Hack Day

planet code4lib - Wed, 2016-10-05 15:46
Date: Tuesday, October 11, 2016 - 09:00 to 17:00
Supports: JHOVE

Last updated October 5, 2016. Created by Peter Murray on October 5, 2016.


JHOVE is a widely-used open source digital preservation tool, used for validating content, such as PDFs. However, some of the validation output can be difficult to understand. The aim of this online hack day is to enhance our knowledge about JHOVE errors – to create descriptions of errors and to identify example files, as well as to start to understand their preservation impact and what can possibly be done about them.

David Rosenthal: Another Vint Cerf Column

planet code4lib - Wed, 2016-10-05 15:00
Vint Cerf has another column on the problem of digital preservation. He concludes:
These thoughts immediately raise the question of financial support for such work. In the past, there were patrons and the religious orders of the Catholic Church as well as the centers of Islamic science and learning that underwrote the cost of such preservation. It seems inescapable that our society will need to find its own formula for underwriting the cost of preserving knowledge in media that will have some permanence. That many of the digital objects to be preserved will require executable software for their rendering is also inescapable. Unless we face this challenge in a direct way, the truly impressive knowledge we have collectively produced in the past 100 years or so may simply evaporate with time.
Vint is right about the fundamental problem but wrong about how to solve it. He is right that the problem isn't not knowing how to make digital information persistent, it is not knowing how to pay to make digital information persistent. Yearning for quasi-immortal media makes the problem of paying for it worse not better, because quasi-immortal media such as DNA are both more expensive and their higher cost is front-loaded. Copyability is inherent in on-line information; that's how you know it is on-line. Work with this grain of the medium, don't fight it.

LITA: Social Media For My Institution – a LITA web course

planet code4lib - Wed, 2016-10-05 14:59

Don’t miss out on this informative LITA web course starting soon.

Social Media For My Institution: from “mine” to “ours”

Instructor: Dr. Plamen Miltenoff
Wednesdays, 10/19/2016 – 11/9/2016
Blended format web course

Register Online, page arranged by session date (login required)

A course for librarians who want to explore the institutional application of social media, based on an established academic course at St. Cloud State University, “Social Media in Global Context”. The course will critically examine the institutional need for social media (SM) and juxtapose it with its private use; discuss the mechanics of choice for recent and future SM tools; present a theoretical introduction to the subculture of social media; and show how to align library SM policies with the goals and mission of the institution. There will be hands-on exercises on the creation and dissemination of textual and multimedia content and on patron engagement. The course will also include brainstorming on strategies suitable for the institution regarding resources (human and technological), workload sharing, storytelling, and branding, and on related issues such as privacy and security.

This is a blended format web course:

The course will be delivered as 4 separate live webinar lectures, one per week on Wednesdays, October 19, 26, November 2, and 9 at 2pm Central. The webinars will also be recorded and distributed through the web course platform, Moodle, for asynchronous participation.

Details here and Registration here

Dr. Plamen Miltenoff is an information specialist and Professor at St. Cloud State University. His education includes several graduate degrees in history and Library and Information Science and in education. His professional interests encompass social Web development and design, gaming and gamification environments. For more information see

And don’t miss other upcoming LITA fall continuing education offerings:

Beyond Usage Statistics: How to use Google Analytics to Improve your Repository
Presenter: Hui Zhang
Tuesday, October 11, 2016
11:00 am – 12:30 pm Central Time
Register Online, page arranged by session date (login required)

Online Productivity Tools: Smart Shortcuts and Clever Tricks
Presenter: Jaclyn McKewan
Tuesday November 8, 2016
11:00 am – 12:30 pm Central Time
Register Online, page arranged by session date (login required)

Questions or Comments?

For questions or comments, contact LITA at (312) 280-4268 or Mark Beatty,

