
Open Knowledge Foundation: How to advance open data research: Renewing our focus on the demand of open data, user needs and data for society.

planet code4lib - Tue, 2016-09-20 10:15

Ahead of this year’s International Open Data Conference #iodc16, Danny Lämmerhirt and Stefaan Verhulst provide information on the Measuring and Increasing Impact Action Session, which will be held on Friday October 7, 2016 at IODC in Room E. Further information on the session can be found here.

Lord Kelvin’s famous quote “If you can not measure it, you can not improve it” equally applies to open data. Without more evidence of how open data contributes to meeting users’ needs and addressing societal challenges, efforts and policies toward releasing and using more data may be misinformed and based upon untested assumptions.

When done well, assessments, metrics, and audits can guide both (local) data providers and users to understand, reflect upon, and change how open data is designed. What we measure and how we measure is therefore decisive to advance open data.

Back in 2014, the Web Foundation and the GovLab at NYU brought together open data assessment experts from Open Knowledge International, the Organisation for Economic Co-operation and Development, the United Nations, Canada’s International Development Research Centre, and elsewhere to explore the development of common methods and frameworks for the study of open data. It resulted in a draft template, or framework, for measuring open data. Despite increased awareness of the need for more evidence-based open data approaches, assessment methods have advanced only slowly since 2014. At the same time, governments publish more of their data openly, and more civil society groups, civil servants, and entrepreneurs employ open data to manifold ends: the broader public may detect environmental issues and advocate for policy changes, neighbourhood projects employ data to enable marginalized communities to participate in urban planning, public institutions may enhance their information exchange, and entrepreneurs embed open data in new business models.

In 2015, the International Open Data Conference roadmap made the following recommendations on how to improve the way we assess and measure open data.

  1. Reviewing and refining the Common Assessment Methods for Open Data framework. This framework lays out four areas of inquiry: the context of open data, the data published, use practices and users, as well as the impact of opening data.
  2. Developing a catalogue of assessment methods to monitor progress against the International Open Data Charter (based on the Common Assessment Methods for Open Data).
  3. Networking researchers to exchange common methods and metrics. This helps to build methodologies that are reproducible and to increase the credibility and impact of research.
  4. Developing sectoral assessments.

In short, the IODC called for refining our assessment criteria and metrics by connecting researchers, and applying the assessments to specific areas. It is hard to tell how much progress has been made in answering these recommendations, but there is a sense among researchers and practitioners that the first two goals are yet to be fully addressed.

“…there seems to be a disconnect between top-level frameworks and on-the-ground research”

Instead we have seen various disparate, yet well-meaning, efforts to enhance the understanding of the release and impact of open data. A working group was created to measure progress on the International Open Data Charter, which provides governments with principles for implementing open data policies. While this working group compiled a list of studies and their methodologies, it has not (yet) deepened the common framework of definitions and criteria to assess and measure the implementation of the Charter. In addition, there is an increase in sector- and case-specific studies that are often more descriptive and context-specific in nature, yet they do help meet the need for examples that illustrate the value proposition for open data.

As such, there seems to be a disconnect between top-level frameworks and on-the-ground research, preventing the sharing of common methods and distilling replicable experiences about what works and what does not. How to proceed and what to prioritize will be the core focus of the “Action Track: Measurement” at IODC 2016. The role of research for (scaling) open data practice and policy and how to develop a common open data research infrastructure will also be discussed at various workshops during the Open Data Research Summit, and the findings will be shared during the Action Track.

In particular, the Action Track will seek to focus on:

  • Demand and use: Specifically, whether and how to study the demand for and use of open data—including user needs and data life cycle analysis (as opposed to being mainly focused on the data supply or capturing evidence of impact), given the nascent nature of many initiatives around the world. And how to identify how various variables, including local context, data supply, types of users, and impact, relate to each other, instead of regarding them as separate. To be more deductive and explanatory, and to generate insights that are operational (for instance, with regard to what data sets to release), there may be a need to expand the area of demand and use case studies (such as org).
  • Informing supply and infrastructure: How to develop deeper collaboration between researchers and domain experts to help identify “key data” and inform the government data infrastructure needed to provide them. Principle 1 of the International Open Data Charter states that governments should provide key data open by default, yet the question remains how to identify “key” data (e.g., would that mean data relevant to society at large?). Which governments (and other public institutions) should be expected to provide key data, and which information do we need to better understand government’s role in providing key data? How can we evaluate progress around publishing these data coherently if countries organize the capture, collection, and publication of this data differently?
  • Networking research and researchers: How to develop more and better exchange among the research community to identify gaps in knowledge, to develop common research methods and frameworks, and to learn from each other? Possible topics to consider and evaluate include collaborative platforms to share findings (such as the Open Governance Research Exchange – OGRX), expert networks, governance for collaboration, dedicated funding, research symposia (more below on ODRS), and interdisciplinary research projects.

Make the most of this Action Track: Your input is needed

To maximize outcomes, the Measurement Action Area will catalyze input from conversations prior to the IODC. Researchers who want to shape the future agenda of open data research are highly encouraged to participate and discuss in the following channels:

1) The Measurement and Increasing Impact Action Session, which will take place on Friday October 7, 2016 at IODC in Room E (more details here).

2) The Open Data Research Symposium, which is further outlined below. You can follow this event on Twitter with the hashtag #ODRS16.


The Open Data Research Symposium

The Measurement and Increasing Impact Action Session will be complemented by the second Open Data Research Symposium (#ODRS16), held prior to the International Open Data Conference on October 5, 2016 from 9:00am to 5:00pm (CEST) in Madrid, Spain (view map here for exact location). Researchers interested in the Measurement and Increasing Impact Action Session are encouraged to participate in the Open Data Research Symposium.

The symposium offers open data researchers an opportunity to reflect critically on the findings of their completed research and to formulate the open data research agenda.

Special attention is paid to the question of how we can increase our understanding of open data’s use and impacts. View the list of selected papers here and the tentative conference program here.

Interested researchers may register here. Please note that registration is mandatory for participation.

This piece originally appeared on the IODC blog and is reposted with permission.

Ed Summers: nicolini-5

planet code4lib - Tue, 2016-09-20 09:55

In Chapter 5 Nicolini takes a look at how practice theories have been informed by activity theory. Activity theory was pioneered by the psychologist Lev Vygotsky in the 1920s and 1930s. Since Vygotsky, activity theory has grown and evolved in a variety of directions that are all characterized by an attention to the role of objects and to the role of conflict or dialectic in human activity. Nicolini focuses specifically on cultural and historical activity theory, which focuses on practice and has been picked up by organization and management studies.

Things start off by talking about Marx again, specifically the description of work in Das Kapital, where work is broken up into a set of interdependent components:

  1. the worker
  2. the material upon which the worker works
  3. the instruments used to carry out the work
  4. the actions of the worker
  5. the goal towards which the worker works
  6. the product of the work

The identity of the worker is a net effect of this process. Vygotsky and other activity theorists took these rough categories and refined them. Vygotsky in particular focused attention on mediation, or how we as humans typically interact with our environments using cultural artifacts (things designed by people) and that language itself was an example of such an artifact. These artifacts transform the person using them, and the environment: workers are transformed by their tools.

Instead of focusing on individual behavior, activity theorists often examine how actions are materially situated at various levels: actions, activities and operations, which are a function of thinking about the collective effort involved. This idea was introduced by Leont’ev (1978). Kuutti & Bannon (2014) is cited a few times, which is interesting because that paper is how I found out about Nicolini in the first place (small world). To illustrate the various levels, Leont’ev has an example of using the gears in a car with manual transmission, and how a person starts out performing the individual actions of shifting gears as they learn, but eventually they become automatic operations that are performed without much thinking during other activities such as speeding up, stopping, going up hills, etc. The operations can also be dismantled, reassembled and recomposed to create new actions. I’m reminded of push starting my parent’s VW Bug when the battery was dead. The example of manual transmission is particularly poignant because of the prevalence of automatic cars today, where those shifting actions have been subsumed or embodied in the automatic transmission. The actions can no longer be decomposed, at least not by most of us non-mechanics. It makes me wonder briefly about what power dynamics are embodied in that change.

It wasn’t until Engeström (1987) that the focus came explicitly to bear on the social. Yrjö Engeström (who is referenced and linked in Wikipedia but there is not an article for him yet) is credited for starting the influential Scandinavian activity theory strand of work, and helping bring it to the West. The connection to Scandinavia makes me think about participatory design which came from that region, and what connections there are between it and activity theory. Also action research seems similarly inflected, but perhaps it’s more of a western rebranding? At any rate Engeström got people thinking about an activity system which Nicolini describes as a “collective, systemic, object-oriented formation”, which is summarized with this diagram:

Activity System

This makes me wonder if there might be something in this conceptual diagram from Engeström for me to use in analyzing my interviews with web archivists. It’s kind of strange to run across this idea of object-oriented again outside of the computer science context. I can’t help but wonder how much cross-talk there was between psychology/sociology and computer science. The phrase is also being deployed in humanistic circles with the focus on object oriented ontology. It’s kind of ironic given how object-oriented programming has fallen out of favor a bit in software development, with a resurgence of interest in functional programming.

Kuutti, K., & Bannon, L. J. (2014). The turn to practice in HCI: Towards a research agenda. In Proceedings of the 32nd annual ACM Conference on Human Factors in Computing Systems (pp. 3543–3552). Association for Computing Machinery.

Leont’ev, A. N. (1978). Activity, consciousness, personality. Prentice Hall.

District Dispatch: Copyright Clearance Center charges a mark-up

planet code4lib - Mon, 2016-09-19 22:46

It all started when I polled some librarians about recent permission fees paid for journal articles, just to have more background on the current state of interlibrary loan. If permission fees were unreasonably high, it might be a data point to share if the House Judiciary Committee on the Courts, Intellectual Property, and the Internet considers the U.S. Copyright Office’s senseless proposal to rewrite Section 108. I expected to be shocked by high permission fees—and I was—but I also discovered something else that I just had to share.

I received a few examples from librarians regarding a particular journal. One in particular struck me. “I received a request today for a five page article from The Journal of Nanoscience and Nanotechnology and while processing it through ILLiad, the Copyright Clearance Center (CCC) indicated a fee of $503.50. So that would be a $100 a page — call me crazy, but something doesn’t seem right to me with that fee. I went to the publisher’s website and the article is available for $113, just over $20 a page.”

The CCC is making a lot of money collecting permission fees, even on public domain materials and disreputable journal publications.

I then asked CCC to clarify why an article from CCC was five times the cost of the very same article direct from the publisher. I received a quick response from CCC that said “Unfortunately, the prices that appear in our system are subject to change at the publishers’ discretion. CCC only processes the fees that the publisher provides us.”

I discovered that the publisher—who allegedly sets the price of the permission fee—also used Ingenta document delivery as an additional online permissions service. Just as the librarian said, Ingenta only charged $113 (which is still a big number for a five page article). I contacted the journal editor and asked about the difference, and he responded immediately via email, “You are right that article is available for $113 from Ingenta. Just download from the Ingenta website.”

The difference in price can only be explained as a huge markup by CCC. Surely processing a 5-page article request cannot cost CCC an additional $400. Think about it. CCC is giving the rights holder $113 and taking the other $390.50. Deep pockets, right?
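For concreteness, the markup arithmetic above can be sketched in a few lines of Python (the figures are the ones quoted in this post):

```python
# Fee quoted by CCC vs. the price for the same article via Ingenta
ccc_fee = 503.50
ingenta_fee = 113.00
pages = 5

markup = ccc_fee - ingenta_fee   # the portion CCC keeps beyond the publisher's price
print(markup)                            # 390.5
print(round(ccc_fee / pages, 2))         # 100.7 per page via CCC
print(round(ingenta_fee / pages, 2))     # 22.6 per page via Ingenta
print(round(ccc_fee / ingenta_fee, 1))   # 4.5, i.e. roughly five times the price
```

Even granting the publisher the full Ingenta price, about $390 of the $503.50 fee is left unexplained.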

But wait, there’s more. I discovered that the publisher of the journal is American Scientific Publishers, a publisher on the predatory journal blacklist. (Holy cow!) Predatory journals are bogus journals that charge publication fees to gullible scholars and researchers to publish in a journal essentially posing as a reputable publication. With no editorial board and no peer review, academics are duped into publishing with a journal they believe to be trustworthy.

Here’s where we are at. CCC is collecting permission fees five times the amount of other permission services for journal articles from likely bogus publications. Are they sending any of the permission fees collected to the predatory journal publishers? And if they are, isn’t this a way to help predatory journals stay in business? Trustworthy publishers surely would not like that. In any case, with predatory journals numbering in the thousands, CCC has discovered a very large cash cow.

For years, the CCC masqueraded as a non-profit organization until the Commissioner of Internal Revenue caught up with them in 1982, in Copyright Clearance Center, Inc. v. Commissioner of Internal Revenue. Now that CCC is a privately held, for-profit company, we have limited information on its financials, but we do know that in 2011 (according to a CCC press release), they distributed over 188 million dollars to rights holders. That’s a big number from five years ago. How much money they pocketed for themselves is unknown, but I think we can rest assured that it was more than enough to jointly fund (with the Association of American Publishers) Cambridge University Press et al v. Patton et al, a four-year litigation against Georgia State University’s e-reserve service. (They lost, but are requesting an appeal).

CCC is making a lot of money collecting permission fees, even on public domain materials and disreputable journal publications. Their profit margin could be as high as Elsevier’s! Academics are duped by predatory journals that are apparently doing fairly well financially. Libraries are paying the CCC high permission fees unless they know to pay the predatory journal directly, keeping the predatory journal people in the black. As if the traditional scholarly communication cycle could get any more absurd!

The post Copyright Clearance Center charges a mark-up appeared first on District Dispatch.

SearchHub: News Search at Bloomberg

planet code4lib - Mon, 2016-09-19 22:04

As we countdown to the annual Lucene/Solr Revolution conference in Boston this October, we’re highlighting talks and sessions from past conferences. Today, we’re highlighting Solr Committer Ramkumar Aiyengar’s talk, “Building the News Search Engine”.

Meet the backend which drives News Search at Bloomberg LP. In this session, Ramkumar Aiyengar talks about how he and his colleagues have successfully pushed Solr into uncharted territory over the last three years, delivering a real-time search engine critical to the workflow of hundreds of thousands of customers worldwide.

Ramkumar Aiyengar leads the News Search backend team at the Bloomberg R&D office in London. He joined Bloomberg from his university in India and has been with the News R&D team for nine years. He started working with Apache Solr/Lucene four years ago, and is now a committer to the project. Ramkumar is especially curious about Solr’s search distribution, architecture, and cloud functionality. He considers himself a Linux evangelist, and is one of those weird geeky creatures who considers Lisp beautiful and believes that Emacs is an operating system.

Building a Real-Time News Search Engine: Presented by Ramkumar Aiyengar, Bloomberg LP from Lucidworks

Join us at Lucene/Solr Revolution 2016, the biggest open source conference dedicated to Apache Lucene/Solr on October 11-14, 2016 in Boston, Massachusetts. Come meet and network with the thought leaders building and deploying Lucene/Solr open source search technology. Full details and registration…

The post News Search at Bloomberg appeared first on

LITA: Using Google Statistics for your Repository – a new LITA webinar

planet code4lib - Mon, 2016-09-19 20:09

Beyond Usage Statistics: How to use Google Analytics to Improve your Repository

Presenter: Hui Zhang
Tuesday, October 11, 2016
11:00 am – 12:30 pm Central Time

Register Online, page arranged by session date (login required)

Librarians and repository managers are increasingly asked to take a data-centric approach for content management and impact measurement. Usage statistics, such as page views and downloads, have been widely used for demonstrating repository impacts. However, usage statistics limit your ability to identify user trends and patterns, such as how many visits are contributed by crawlers, originate from a mobile device, or are redirected by a search engine. Knowing these figures will help librarians optimize digital contents for better usability and discoverability. This 90-minute webinar will teach you the concepts of metrics and dimensions, along with hands-on activities showing how to use Google Analytics (GA) on library data from an institutional repository. Be sure to check the details page for takeaways and prerequisites.
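The metrics-and-dimensions idea can be illustrated with a small, hypothetical sketch (plain Python over made-up visit records, not the actual Google Analytics API): a dimension such as device or traffic source segments the visits, and a metric such as pageviews is summed within each segment.

```python
from collections import Counter

# Hypothetical visit records: each carries dimension values and one metric (pageviews)
visits = [
    {"device": "mobile",  "source": "google",  "agent": "browser", "pageviews": 3},
    {"device": "desktop", "source": "direct",  "agent": "crawler", "pageviews": 40},
    {"device": "desktop", "source": "google",  "agent": "browser", "pageviews": 2},
    {"device": "mobile",  "source": "twitter", "agent": "browser", "pageviews": 1},
]

def pageviews_by(dimension):
    """Sum the pageviews metric, grouped by one dimension."""
    totals = Counter()
    for v in visits:
        totals[v[dimension]] += v["pageviews"]
    return dict(totals)

print(pageviews_by("device"))  # {'mobile': 4, 'desktop': 42}
print(pageviews_by("agent"))   # {'browser': 6, 'crawler': 40}
```

Tools like GA do this segmentation for you; the point is that a raw pageview total hides, for instance, how much of it is crawler traffic.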

Details here and Registration here

Hui Zhang is the Digital Application Librarian at Oregon State University Libraries and Press. He has years of experience in generating impact reports with major platforms such as DSpace and Hydra Sufia using Google Analytics or a local statistics index. Other than repository development, his interests include altmetrics, data visualization, and linked data.

And don’t miss other upcoming LITA fall continuing education offerings:

Social Media For My Institution; from “mine” to “ours”
Instructor: Plamen Miltenoff
Starting Wednesday October 19, 2016, running for 4 weeks
Register Online, page arranged by session date (login required)

Online Productivity Tools: Smart Shortcuts and Clever Tricks
Presenter: Jaclyn McKewan
Tuesday November 8, 2016
11:00 am – 12:30 pm Central Time
Register Online, page arranged by session date (login required)

Questions or Comments?

For questions or comments, contact LITA at (312) 280-4268 or Mark Beatty,

LibUX: Carousels Are Okay

planet code4lib - Mon, 2016-09-19 14:23

I recorded this episode at 2 a.m. this morning, because I’ve been feeling pretty good about the consistency of this podcast lately and by gosh I am not going to ruin it over a little something like sleep. No fooling, I am pretty entertained. This one’s a shorty, in which I make some enemies and defend the use of carousels on behalf of actually good user experiences – maybe.

Also, thank you for your kind reviews! Your brief reviews wherever you listen to LibUX make it easier to discover it.

Listen and please subscribe!

If you like, you can download the MP3 or subscribe to LibUX on Stitcher, iTunes, YouTube, Soundcloud, Google Play Music, or just plug our feed straight into your podcatcher of choice.

Ed Summers: Old Town, College Park (Observation 1)

planet code4lib - Sun, 2016-09-18 17:33


6 pm rode bike to Regents Drive and Rte 1
Noting that the campus is very traditional red brick
Heavy traffic north and south
Lots of students walking around
Grabbing food
Jogging
People waiting for bus going south
People going from one store to another: Target to bagel place
Mostly young people / late teens early 20s
People walking in pairs, in groups and alone
Lots of mobile devices out and headphones
Walking with pizza takeout
16 people went into Target in 10 minutes at 6:20
Two women having 15 min conversation in front of Target
Inside girls talking about cooking dinner
Fresh meat and veg
People having convos in the produce aisle
Walking with baskets
Frozen food
Cooking equipment, towels, blankets
Phones & accessories
Pharmacy

Left down College Ave
Sushi place that is open late
Beauty shop
Pizza
Cigarettes
Thai restaurant
Place for lease

Walking a dog - leads to convo
Surveillance
Out behind lots of parking
Lots of takeout cars
Fraternity
Japan center
Person with shopping bag
Feeling old
Ledo restaurant on Knox fairly busy
Lots of parking decks above 12
Sorority sisters all dressed the same, blue jeans and black, outside their house
Red brick sororities seem like the university bldgs
Princeton Ave gives way to what looks like more residential
Meor Maryland house
Playing basketball hoop in parking lot
Running with takeout, wait for me, putting sports equipment in car
3 20-somethings
More fraternity mixed with residential
People walking away from campus
Newly paved road
Vacant lot being turned into housing near where the notice was; looks like they are building
Girls driving and singing loudly with music
Sound of highway and trains at Norwich
Bungalow
First political sign Trump
More mopeds than usual
Police auxiliary 6
Small apt complex grey units
NY NJ PA plates
Big square with Greek orgs along perimeter
Guys playing frisbee
UMD police station nearby with parked police cars
Zipcar pickup
Residential parking looks full
Gym inside parking lot
The building has apartments
Back where I started
Noticing it is one contiguous new building, must have been built at the same time
What students get in here? What is the process?
Weird to have place for lease across the road
Why is Landmark written on the front?
Only saw one family out with a stroller
Zags tee bikes
Traffic north slower after 7
Nando's is packed
Parking lot full in Chipotle shopping area
20 secs to cross Rte 1 after waiting like 4 mins
2 empty stores next to 7-11 prune real estate
South Campus Commons newer red brick
Music and grilling
People wandering walking
Walking with takeout
Emptying trash in recycling
Busy bus stop 115 bus
2 women “I picked up hitchhikers in Iceland”
4 Chinese girls speaking in Chinese
Everyone getting in the 117 - like 20-30 people
Cookie store delivers until 3am very busy at 7

John Miedema: Bookface

planet code4lib - Sat, 2016-09-17 16:09

Ed Summers: Nicolini (4)

planet code4lib - Sat, 2016-09-17 04:00

Chapter 4 focuses on the idea of practice as something that is tied to tradition and community, which is something Nicolini sees Giddens and Bourdieu departing from. Nicolini is presenting this chapter mostly in order to critique the idea, because its focus on people transmitting ideas to each other, when left unexamined, tends to give solidity to social actors and groups:

I will argue that while a coherent theory of learning and transmission is a requisite element of any theory of practice, there is a fine balance to be struck between recognizing that all practices need to be recognized by a group of practitioners, and the reification of such a collective into a social body that exists independently of the practice. (p. 78)

Socialization (family and schooling) is important to the work of Durkheim, who influenced Giddens. Apprenticeship is another concept that has been used to explain how practices are transmitted–but it requires the master/pupil power dynamic, and hence the acceptance of inequality of social positions. It is also more limited in that it is focused primarily on learned skills of craftsmen or artists.

Legitimate Peripheral Participation (LPP) is a term introduced by Lave & Wenger (1991) that attempts to take apprenticeship out of the particular historical environments (the craftsman’s shop) and explain apprenticeship as a learning process. They do this by making it essential that the learner take responsibility for the thing they are doing – this is what makes it a practice. Nicolini cites Foucault in pointing out that this acceptance of responsibility also means an acceptance of the social order and power dynamics present in it. It’s interesting that the term community of practice was first introduced in Lave & Wenger (1991) as well. Well, at least for me since I find myself using that phrase quite a bit. The idea of apprenticeship is decentered, as not only happening between master and apprentice, but includes advanced novices, other apprentices, other master craftsmen, and the material artifacts used. So practice becomes socially situated.

Apparently Lave & Wenger (1991) gave rise to many ethnographic studies of situated learning that looked at learning as a social phenomenon rather than something that happens inside someone’s head. Nicolini sees two drawbacks to LPP. The first is similar to his criticism of Bourdieu’s idea of habitus: it fails to account for non-incremental change in a convincing way. And the second is that it doesn’t take into account the wider socio-historical context, and specifically the role that power, ideology and domination play in practice. This criticism can be found in Contu & Willmott (2000).

It is clear that Nicolini doesn’t particularly like the term community since he launches into a critique of its fuzziness, morality and the way that it is used ideologically to define groups of people in order to obscure power, conflict and differences. He references Foucault (1966) by calling community a discursive formation that controls what can and cannot be talked about. He sees the use of the term community with practice as problematic, because one obscures what the other is attempting to make clear. It might be interesting to look closer at this criticism, especially since I have used the term community of practice myself so often. Nicolini says that Handley, Sturdy, Fincham, & Clark (2006) has a good review of the debate.

With these criticisms in mind it does still seem like Wenger (1998) has some useful concepts for the study of practice in the idea of situated learning, which involves:

  • mutual engagement
  • communal negotiation
  • shared repertoire
  • shared history
  • boundaries (Star & Griesemer, 1989)

Nicolini makes a case for dropping the use of community and instead simply talking about practice, because of the way community obscures processual, social, temporary and conflictual properties. He seems to be saying that communities do exist, but they are an effect of practices in operation. Making communities the unit of analysis obscures the way that practices create communities. But then he goes on to say that it’s not practical to remove it because it is such a useful term in management circles. More importantly it does highlight the importance of shared practices, that things don’t just happen in our heads–they are social.

Nicolini cites Barley & Orr (1997) to explain how the phrase “community of practice” can in fact be a way for “semi-professions” to legitimate themselves–which is kind of an interesting idea. In fact Barley & Orr (1997) looks like it could be a very useful example of an ethnographic study of technical work, that could possibly be a useful model for my own examination of web archiving work. Here’s the summary from Amazon:

Between Craft and Science brings together leading scholars from sociology, anthropology, industrial relations, management, and engineering to consider issues surrounding technical work, the most rapidly expanding sector of the labor force. Part craft and part science, part blue-collar and part white-collar, technical work demands skill and knowledge but is rarely rewarded with commensurate status or salary. The book first considers the anomalous nature of technical work and the difficulty of locating it in any conventional theoretical framework. Only an ethnographic approach, studying the actual doing of the work, will make sense of the subject, the authors conclude. The studies that follow report daily practice filled with disjunctures and ironies that mirror the ambiguities of technical work’s place in the larger culture. On the basis of those studies, the authors probe questions of policy, management, and education. Between Craft and Science considers the cultural difficulties in understanding technical work and advances coherent, practice-oriented insights into this anomalous phenomenon.

Now I’m kind of wondering if I need to adjust what I read next this semester…


Barley, S. R., & Orr, J. E. (1997). Between craft and science: Technical work in US settings. Cornell University Press.

Contu, A., & Willmott, H. (2000). Knowing in practice: A “delicate flower” in the organizational learning field. Organization, 7(2).

Foucault, M. (1966). The order of things: An archaeology of the human sciences. Pantheon.

Handley, K., Sturdy, A., Fincham, R., & Clark, T. (2006). Within and beyond communities of practice: Making sense of learning through participation, identity and practice. Journal of Management Studies, 43(3), 641–653.

Lave, J., & Wenger, E. (1991). Situated learning: Legitimate peripheral participation. Cambridge University Press.

Star, S. L., & Griesemer, J. R. (1989). Institutional ecology, ’translations’ and boundary objects: Amateurs and professionals in Berkeley’s Museum of Vertebrate Zoology, 1907-39. Social Studies of Science, 19(3), 387–420.

Wenger, E. (1998). Communities of practice: Learning, meaning, and identity. Cambridge University Press.

John Miedema: The fire is a thorough & voracious reader.

planet code4lib - Fri, 2016-09-16 19:19

The fire is a thorough & voracious reader.
Page by page my old manuscript turns gray & brittle
& when the mist thickens into rain,
the smoking pile emits a long thin sigh.

Dave Bonta, Book-burning

District Dispatch: CopyTalk SSRN: Another enclosure of the commons?

planet code4lib - Fri, 2016-09-16 19:09

Mike Wolfe discusses Elsevier, SSRN and open access to scholarly publications for monthly CopyTalk webinar

What happens when Elsevier – one of the most profitable publishers of scholarly journals and research materials – buys an open access working paper platform like the Social Science Research Network (SSRN)? Michael Wolfe from Authors Alliance will explore this topic at our next CopyTalk on October 6th, 2016.

After being acquired by Elsevier, SSRN has made headlines following the discovery that the popular pre-print and working paper service had started pulling user-posted works following its own, internal copyright review process. Authors Alliance has been among those to condemn the actions, and to question SSRN’s continuing reliability as a provider of important scholarly infrastructure. In this webinar, Authors Alliance executive director Mike Wolfe will discuss the controversy, Authors Alliance’s response, and what we can learn from the experience about copyright and Digital Millennium Copyright Act best practices for hosts of user-submitted scholarship.

Mike Wolfe is the executive director of Authors Alliance and a copyright research fellow at the University of California, Berkeley, School of Law. Mike has a B.A. from Harvard and J.D. from Duke, and is licensed to practice law in California.

Date/Time: Thursday, October 6, 2:00 PM (ET) 

Sign in here as a guest. You’re in.

CopyTalk is a free monthly webinar brought to you by the copyright education subcommittee of ALA’s Office for Information Technology Policy.

The post CopyTalk SSRN: Another enclosure of the commons? appeared first on District Dispatch.

DPLA: New Year, New HQ for DPLA

planet code4lib - Fri, 2016-09-16 15:00

It’s a new academic year and we are excited to be kicking it off with great educational projects underway, new additions to our team, and a brand new office suite at Boston Public Library (BPL) to officially serve as DPLA’s national headquarters.

We are thrilled to announce that DPLA now calls BPL’s Digital Partners suite home. The Digital Partners space, which we share with two of our hubs, Digital Commonwealth and Internet Archive’s Boston Scanning Center, as well as BPL’s Digital Services team, was completely redesigned as part of Boston Public Library’s $78 million renovation of the historic Central Branch in Copley Square unveiled earlier this summer.  At 6,000 square feet, the Digital Partners space represents BPL’s continued commitment to digitization, digital services, and digital collaboration with state-of-the-art facilities, technology, and room to expand.

For DPLA, our new office space represents a huge step forward and is designed to meet the needs of our growing team.  We now have access to a conference room wired with a large screen and a webcam to facilitate easy collaboration between our staff in Boston and staff working across the country.  Two smaller private conference rooms allow for breakout meetings and small team video calls.  When cross team collaboration is in order, our open floor plan now provides a great layout for both conversation and individualized work space.

This week, DPLA staff gathered from around the country to map out plans for the coming months and, thanks to our new conference room, we were able to watch together as Carla Hayden was sworn in as the next Librarian of Congress.  

To see more great photos and new features unveiled as part of Boston Public Library’s renovation, check out this feature in Boston Magazine. We would also like to send a special shout out and sincere thank you to Boston Public Library for being such a generous host and collaborator to DPLA!

Ed Summers: Nicolini (3)

planet code4lib - Fri, 2016-09-16 04:00

Chapter 3 takes a look at the work of two twentieth century thinkers who are critical in understanding the turn to practice: Anthony Giddens and Pierre Bourdieu.


Giddens is reportedly one of the most influential sociologists of the 20th century. His idea of structuration draws on the work of Marx, Weber and Durkheim. Originally I was going to focus specifically on structuration in my independent study, but I decided against it because of the breadth of Giddens’ influence, and the idea that it might be more useful to focus on the practice theory angle, which conceptually ties together Giddens’ work with the work of other folks in the field of IS and ICT. Also, I’ll admit, once I discovered he served as an advisor to Tony Blair my interest waned a little bit.

Giddens uses the idea of structuration to resolve dualist tensions in social theory related to subjectivity and objectivity. Structuration is a recursive model of society defined by practices that are composed of actors, rules and resources.

  • actors: the producers of activity, who draw on rules and resources
  • rules: generalized procedures for action, not to be confused with instructions or prohibitions (Wittgenstein)
  • resources: the ways in which power, or the ability to mobilize people, is manifested (Marx)

Nicolini uses language as an example. Spoken language and the rules of language mutually constitute themselves. Spoken language is based on rules of language, but the rules of language would not exist if they were not enacted and reinvented in spoken language. So there’s the recursion.

Actors are required to be knowledgeable and reflexive in structuration theory. However their knowledge and abilities are finite which is how change and mutation can get in. Giddens also emphasizes that activity is always situated in time and place, which shows his connection to Marx’s historical materialism. And finally practices are related to each other–they form interdependencies and accrete which manifests as structures and systems. Sometimes practices may result in structures that contravene each other which can result in reorderings and revolutions in practice.

Another interesting concept Giddens introduces is the distinction between practical and discursive consciousness. Practical consciousness is “saturated with taken for grantedness” and has a lot of parallels to the idea of tacit knowledge that we saw earlier in Heidegger’s idea of ready-to-hand.

Could Giddens’ rules be comparable to algorithms? Who follows the rules in either case? Could people using technologies that embody algorithmic rules be thought of as following the rules? Or does the level of indirection break that? When the algorithms break, they become visible, kind of like infrastructure. I wonder if the controversy involving adaptive structuration theory (DeSanctis & Poole, 1994) is centered around whether the rules can be written down. I also wonder if focusing on the site of practice provides a way out of some of this controversy about the prescriptive application of structuration theory.

According to Nicolini, uptake of Giddens was low because of the rise of postmodernism at the same time, which eschewed the kind of theory building that Giddens was doing, and because researchers were wary of the conservative implications of his system (p. 50). His work was also highly theoretical, and difficult to put into practice. He actively dissuaded people from using his concepts in their own research! They were to be used only as sensitizing principles. It sounds like I could read Giddens (1991) for more about this. Giddens resisted the idea that material artifacts could be structural resources. This seems rather odd, and perhaps at odds with ANT. Orlikowski (1992) introduced the use of structuration theory into organizational studies and ICT. But in Orlikowski (2000) she moved away from it, towards practice theory. These might be a useful transition to focus on later in the semester.

Giddens appeared too busy developing a theory of society and individuals which put everything in the right place, portrayed people as reflexive and rational, and allowed almost no room for pathos, emotions, disorder, conflict, and violence. Moreover, Giddens’ structurationism failed to inspire a community that had been held to ransom for decades by the boxes, arrows, and loops of system theory. In spite of its innovative, strong, processual character, Giddens’ system theory looked suspiciously like more of the same. Finally, critical authors were somewhat unhappy with Giddens’ flat and a-conflictual view of the social, and were weary of the potentially deeply conservative implications of structurationism.


According to Nicolini, Bourdieu’s core point is that representing practice, or praxeology as he called it, is not enough (anthropology)–practice needs to be explained (sociology). I interpret this as saying that descriptions of practice must reflect on the ways in which description is being performed: what is being made visible, and what is being made invisible. These are important things for Bourdieu. I find this explanation much more compelling than the strong/weak distinction that Nicolini makes in the introduction.

Habitus is a key concept or theme throughout all of Bourdieu’s work. It helps get around the problems of objectivism and subjectivism. I feel like understanding objectivism as Nicolini describes it would involve more reading, especially Levi-Strauss and the Structuralists. Habitus isn’t a way of understanding the world – it’s more a way of being in the world. Habitus relates to the body, in ways that are similar to Merleau-Ponty’s ideas of schema and habit as well as Polanyi’s idea of personal tacit knowledge. Schema and habit in particular really remind me a bit of Dewey’s ideas about norms. Schema is compared to the feeling of driving a car where the car is an extension of the body’s corporeal schema. It’s only when that meshing breaks down that the schema is noticed. Again breakdown plays an important role. It seems like this meshing is the content domain of HCI.

Tacit knowledge was used by Polanyi to explain how scientists work. Explicit knowledge is traditional scientific knowledge exemplified by the scientific method. But tacit knowledge is an awareness of knowing how to do something that defies analytical description. “We know much more than we know we know.”

Bourdieu summarizes his idea of practice using the following formula that is described in Bourdieu (1984), p. 101.

(habitus * capital) + field = practice

Capital is anything rare and worthy of being sought after. It can be material and symbolic. Symbolic capital in particular sustains domination, because it includes the power to name, and renders the entire process invisible. Fields are domains or structured spaces in which the distribution of capital is disputed.

Habitus is a group phenomenon.

Lau (2004) is cited quite a bit for distinguishing and criticizing these ideas – which might be useful to read.

Ways of studying practice (or rather, what not to do):

  • need to participate in “daily endeavors”. You need to live, not represent. You also need to dismantle or sidestep the power relation of the Academy over the practitioner.

  • Simply providing a description of practice is not enough. You need to describe how the practices are propagated and work together.

Reflexivity is important to Bourdieu – since he saw how his own work was itself problematic in the way that it theorized capital in metaphysical terms. Michel de Certeau criticizes Bourdieu’s split theoretical personality, and points to levels of practices: dominant ones that are organized by institutions and many minor ones that operate as micro-tactics of resistance, local deformations, and reinvention. I almost put Certeau (2011) on the reading list for this semester after reading about him in an essay by Alan Liu. It’s just a matter of time since de Certeau’s approach, much like Latour, seems to be a great bridging work between the humanities and social sciences, which is kinda where I live.

Nicolini sees Bourdieu’s idea of habitus as not accounting for practices, and suggests that perhaps the very idea of trying to theorize practices is at the heart of the problem. The solution is the problem. Habitus is self contradictory: it says that practices are historically and socially contingent, but operates at a theoretical level that is outside place and time. Bourdieu fails to account for change (only reproduction), mediation (technology), and reflexivity as part of practice.

So, I’m left feeling Bourdieu has some quite subtle theoretical ideas, almost too subtle – but from this brief introduction I feel much more aligned with his politics than with Giddens. Bourdieu’s attention to everyday life is attractive:

Bourdieu directs our attention to the fact that practice is the locus of the social reproduction of everyday life and symbolic orders, of the taken-for-grantedness of the experienced world and the power structure that such a condition both carries and conceals. (p. 69)


Bourdieu, P. (1984). Distinction: A social critique of the judgement of taste. Harvard University Press.

Certeau, M. de. (2011). The practice of everyday life (3rd ed.). University of California Press.

DeSanctis, G., & Poole, M. S. (1994). Capturing the complexity in advanced technology use: Adaptive structuration theory. Organization Science, 5(2), 121–147.

Giddens, A. (1991). Modernity and self-identity. self and society in the late modern age. Polity Press.

Lau, R. W. (2004). Habitus and the practical logic of practice: An interpretation. Sociology, 38(2), 369–387.

Orlikowski, W. J. (1992). The duality of technology: Rethinking the concept of technology in organizations. Organization Science, 3(3), 398–427.

Orlikowski, W. J. (2000). Using technology and constituting structures: A practice lens for studying technology in organizations. Organization Science, 11(4), 404–428.

District Dispatch: The sound of history

planet code4lib - Thu, 2016-09-15 20:05

Photo credit: AP Photo/Pablo Martinez Monsivais

History is rarely made silently, and yesterday’s investiture of Dr. Carla Hayden as the nation’s 14th Librarian of Congress was anything but an exception. Seated among notables from every branch of government, the media and – of course – ALA and the library profession, the most memorable part of the often solemn forty-five minute ceremony may well prove to be its soundtrack:

. . . The palpable hush that overtook the packed room as Speaker of the House Paul Ryan, Chief Justice John Roberts and Dr. Hayden herself strode purposefully to the low dais.
. . . The precision click of military heels on the venerated marble floors of the Jefferson Building’s resplendent Grand Hall as an honor guard presented the colors.
. . . The last soaring strains of the national anthem rendered in a fine Irish tenor echoing from the frescoed ceiling’s perfect vaults.
. . . Glowing adjectives, too many to count, offered by one Congressional luminary after another in praise of the Library of Congress, its past and potential, and Dr. Hayden’s unique qualifications.
. . . And, finally, the quiet dignity of Dr. Hayden’s own strong sure voice unveiling her vision of a Library of Congress for the new century whose treasures will – more than ever before – be accessible to all Americans and the world.

But — for all of their solemnity, power and beauty — these sounds surely will fade from the memories of those privileged to attend Dr. Hayden’s swearing in yesterday.  The one sound, however, that won’t easily be forgotten is the thunder. Yes, thunder.

Two stories above the comfortably seated dignitaries, the balustrades of the Great Hall’s massive staircase were crowded four deep and standing-room-only by the staff of the Library of Congress. To the initial astonishment and dawning delight of those below, four times over the course of yesterday’s finely tuned and meticulously executed program this gallery of passionate professionals erupted into exultant, sustained cheers – thunderous, joyful, hopeful huzzahs – for the new Librarian and her vision of the institution that they clearly so deeply love.

Lasting minutes on end, ebbing only to irrepressibly swell again with optimism and pride, the spontaneous unbridled cheers of her new staff were incredibly and indelibly moving. They will forever be part of the Great Hall of the Library of Congress and now, with Dr. Hayden herself, of history.

The post The sound of history appeared first on District Dispatch.

Jez Cope: Tools for collaborative markdown editing

planet code4lib - Thu, 2016-09-15 19:52

Photo by Alan Cleaver

I really love Markdown [1]. I love its simplicity; its readability; its plain-text nature. I love that it can be written and read with nothing more complicated than a text-editor. I love how nicely it plays with version control systems. I love how easy it is to convert to different formats with Pandoc and how it’s become effectively the native text format for a wide range of blogging platforms.

One frustration I’ve had recently, then, is that it’s surprisingly difficult to collaborate on a Markdown document. There are various solutions that almost work but at best feel somehow inelegant, especially when compared with rock solid products like Google Docs. Finally, though, we’re starting to see some real possibilities. Here are some of the things I’ve tried, but I’d be keen to hear about other options.

1. Just suck it up

To be honest, Google Docs isn’t that bad. In fact it works really well, and has almost no learning curve for anyone who’s ever used Word (i.e. practically anyone who’s used a computer since the 90s). When I’m working with non-technical colleagues there’s nothing I’d rather use.

It still feels a bit uncomfortable though, especially the vendor lock-in. You can export a Google Doc to Word, ODT or PDF, but you need to use Google Docs to do that. Plus as soon as I start working in a word processor I get tempted to muck around with formatting.

2. Git(hub)

The obvious solution to most techies is to set up a GitHub repo, commit the document and go from there. This works very well for bigger documents written over a longer time, but seems a bit heavyweight for a simple one-page proposal, especially over short timescales.

Who wants to muck around with pull requests and merging changes for a document that’s going to take 2 days to write tops? This type of project doesn’t need a bug tracker or a wiki or a public homepage anyway. Even without GitHub in the equation, using git for such a trivial use case seems clunky.

3. Markdown in Etherpad/Google Docs

Etherpad is a great tool for collaborative editing, but suffers from two key problems: no syntax highlighting or preview for markdown (it’s just treated as simple text); and you need to find a server to host it or do it yourself.

However, there’s nothing to stop you editing markdown with it. You can do the same thing in Google Docs, in fact, and I have. Editing a fundamentally plain-text format in a word processor just feels weird though.

4. Overleaf/Authorea

Overleaf and Authorea are two products developed to support academic editing. Authorea has built-in markdown support but lacks proper simultaneous editing. Overleaf has great simultaneous editing but only supports markdown by wrapping a bunch of LaTeX boilerplate around it. Both OK but unsatisfactory.

5. StackEdit

Now we’re starting to get somewhere. StackEdit has both Markdown syntax highlighting and near-realtime preview, as well as integrating with Google Drive and Dropbox for file synchronisation.

6. HackMD

HackMD is one that I only came across recently, but it looks like it does exactly what I’m after: a simple markdown-aware editor with live preview that also permits simultaneous editing. I’m a little circumspect simply because I know simultaneous editing is difficult to get right, but it certainly shows promise.

7. Classeur

I discovered Classeur literally today: it’s developed by the same team as StackEdit (which is now apparently no longer in development), and is currently in beta, but it looks to offer two killer features: real-time collaboration, including commenting, and pandoc-powered export to loads of different formats.

Anything else?

Those are the options I’ve come up with so far, but they can’t be the only ones. Is there anything I’ve missed?

  1. Other plain-text formats are available. I’m also a big fan of org-mode.

District Dispatch: Ellen Satterwhite joins ALA telecom policy team

planet code4lib - Thu, 2016-09-15 17:00

I’m pleased to announce that ALA has bolstered its telecommunications policy team with the addition of Ellen Satterwhite. As a new Fellow of ALA’s Office for Information Technology Policy (OITP), Ellen will provide leadership, counsel and representation on the full array of telecommunications issues that affect libraries and the general public, as well as those that intersect with information policy more broadly.

Ellen Satterwhite is a new OITP Fellow

Ellen is a Director at the policy communications firm Glen Echo Group, where she helps clients formulate policy positions and tell their stories within the rubric of information policy. WifiForward is one of several coalitions managed by Ellen and Glen Echo, and ALA was a founding member of the group, which advocates for abundant Wi-fi and balanced spectrum policy.

As a co-author of the Federal Communications Commission’s (FCC) National Broadband Plan, Consumer Policy Advisor to the FCC and freelance consultant, Ellen’s work has been written about in the Huffington Post, AllThingsD, CNet, Geekwire, GigaOm, and CivSource. Previously, Ellen also served as Program Director for Gig.U, supporting communities seeking gigabit speeds. She earned a master’s degree in Public Affairs from University of Texas at Austin and completed her undergraduate degree at Grinnell College.

OITP Deputy Director Larra Clark will continue to contribute to our telecommunications policy work, with OITP Associate Director Marijke Visser, OITP Senior Fellow Robert Bocher, and me, working in coordination on legislative matters with Kevin Maher of ALA’s Office of Government Relations, and telecommunications counsel John Windhausen.

Please welcome Ellen in her new role, as you see her inside the beltway or in libraryland.

The post Ellen Satterwhite joins ALA telecom policy team appeared first on District Dispatch.

Islandora: Islandoracon 2017 Call for Proposals now open

planet code4lib - Thu, 2016-09-15 16:53

The Islandoracon Planning Committee invites you to submit your proposals to present at the second Islandoracon, May 15 - 19, 2017 in Hamilton, Ontario, Canada.

This year’s conference theme is Beyond the Island. Since its creation at the University of Prince Edward Island in 2006, Islandora has spread around the world. It has grown to include diverse institutions, collections, and strategies for digital repository management that add to the richness of the Islandora community. The 2017 Islandoracon will celebrate these multifaceted visions of Islandora that are continually emerging, inspiring constant revision in the concept of a digital repository.

Regular Sessions:
A 20-minute talk (plus 10 minutes for questions) with accompanying A/V.

Poster/Island Getaways:
A conference poster to be displayed during a poster session, accompanied by a five-minute ‘lightning talk’ style presentation of that poster, which will be known as "Island Getaways." Conference attendees will be invited to vote on their favourite "Island Getaway" to award prizes for the most inspirational posters and talks.

Post-Conference Sessions:
A free-form day for Islandora gatherings. The last day of Islandoracon is open for sessions, workshops, working groups, training, meetings, or whatever events you care to propose. Please let us know what kind of event you’d like to hold, the expected audience, and the space/time needed.

Speakers will have access to a discounted Speakers Rate for registration. Submissions will be accepted until December 16th, 2016.

Submission Form

DPLA: Job Opportunity: Director of Technology

planet code4lib - Thu, 2016-09-15 16:30

The DPLA has an opening for the position of Director of Technology.

The Digital Public Library of America seeks a Director of Technology to lead its staff of developers and technologists, and to further DPLA’s mission to bring together the riches of America’s libraries, archives, and museums, and make them freely available to all. A belief in this mission and the drive to accomplish it over time in a collaborative spirit both within and beyond the organization is essential.

The Director of Technology will be responsible for the overall technology vision for the DPLA and, in consultation with the DPLA senior staff, will develop new initiatives and cross-institutional partnerships. The Director of Technology will report to the Executive Director.

The Director of Technology will:

  • Orchestrate the design, implementation, and improvement of DPLA’s core infrastructure, user-facing applications, and back-end systems.
  • Oversee and manage a team of four developers, and work with external contractors as necessary.
  • Act as the primary technical contact for outside organizations, partners, and developers.
  • Cultivate and develop the culture and values of the DPLA Technology Team and the larger organization.
  • Support the philosophy of open source, shared, and community-built software, frameworks, and technologies.
  • Be conversant and comfortable with a broad range of technologies used by the cultural heritage sector, and engage with cultural heritage communities and consortial efforts.


  • Experience with managing developers and technologists in the cultural heritage or non-profit sectors, with a demonstrated capacity of building an environment for success for technical teams.
  • Demonstrated experience working effectively in a team environment and the ability to interact effectively with stakeholders.
  • Demonstrated experience contributing to or managing collaborative open source software projects.
  • Excellent written and verbal communication skills.
  • Excellent analytical and organizational skills.
  • A high degree of emotional intelligence and empathy.


  • Experience with architectures, standards, and protocols to support interoperability and reuse within and beyond the cultural heritage sector.
  • Demonstrated experience with technical project management, grant writing and administration, and/or development and oversight of RFP processes.
  • Demonstrated desire to learn new technologies or programming languages.

This position is full-time. DPLA is a geographically distributed organization, with headquarters in Boston, Massachusetts. Ideally, this position would be situated in the Northeast Corridor between Washington and Boston (with a preference for Boston), but remote work based in other locations will also be considered. While not a requirement, a history of working effectively in a distributed organization is helpful.

Like its collection, DPLA is strongly committed to diversity in all of its forms. We provide a full set of benefits, including health care, life and disability insurance, and a retirement plan. Starting salary is commensurate with experience.

About DPLA

The Digital Public Library of America strives to contain the full breadth of human expression, from the written word, to works of art and culture, to records of America’s heritage, to the efforts and data of science. Since launching in April 2013, it has aggregated more than 14 million items from over 2,000 institutions. DPLA is a registered 501(c)(3) non-profit.

To apply, send a letter of interest detailing your qualifications, resume and a list of 3 references in a single PDF to First preference will be given to applications received by September 30, 2016, and review will continue until the position is filled.

David Rosenthal: Nature's DNA storage clickbait

planet code4lib - Thu, 2016-09-15 15:00
Andy Extance at Nature has a news article that illustrates rather nicely the downside of Marcia McNutt's (editor-in-chief of Science) claim that one reason to pay the subscription to top journals is that:
Our news reporters are constantly searching the globe for issues and events of interest to the research and nonscience communities.

Follow me below the fold for an analysis of why no-one should be paying Nature to publish this kind of stuff.

Extance's article is entitled How DNA could store all the world's data, and starts with this scary thought:
The latest experiment signals that interest in using DNA as a storage medium is surging far beyond genomics: the whole world is facing a data crunch. Counting everything from astronomical images and journal articles to YouTube videos, the global digital archive will hit an estimated 44 trillion gigabytes (GB) by 2020, a tenfold increase over 2013. By 2040, if everything were stored for instant access in, say, the flash memory chips used in memory sticks, the archive would consume 10–100 times the expected supply of microchip-grade silicon [3].

He then claims a solution to this problem is at hand:
If information could be packaged as densely as it is in the genes of the bacterium Escherichia coli, the world's storage needs could be met by about a kilogram of DNA.

The article is based on research at Microsoft that involved storing 151 KB in DNA. The research is technically interesting, starting to look at fundamental DNA storage system design issues. But it concludes (my emphasis):
DNA-based storage has the potential to be the ultimate archival storage solution: it is extremely dense and durable. While this is not practical yet due to the current state of DNA synthesis and sequencing, both technologies are improving at an exponential rate with advances in the biotechnology industry [4].

The paper doesn't claim that the solution is at hand any time soon. Reference 4 is a two-year-old post to Rob Carlson's blog. A more recent post to the same blog puts the claim that:
both technologies are improving at an exponential rate

in a somewhat less optimistic light. It is (or may be; Carlson believes the last two data points are not representative) true that DNA sequencing is getting cheaper very rapidly. But already the cost of sequencing (read) was insignificant in the total cost of DNA storage. What matters is the synthesis (write) cost. Lower down the article Extance writes:
A closely related factor is the cost of synthesizing DNA. It accounted for 98% of the expense of the $12,660 EBI experiment. Sequencing accounted for only 2%, thanks to a two-millionfold cost reduction since the completion of the Human Genome Project in 2003.

The rapid decrease in the read cost is irrelevant to the economics of DNA storage; if it were free it would make no difference. Carlson's graph shows that the write cost, the short DNA synthesis cost (red line), is falling more slowly than the gene synthesis cost (yellow line). He notes:
But the price of genes is now falling by 15% every 3-4 years (or only about 5% annually).

A little reference checking, that should have been well within the capability of one of Nature's expert news reporters, reveals that the Microsoft paper's claim that:
both technologies are improving at an exponential rate

while strictly true is deeply misleading. The relevant technology is currently getting cheaper more slowly than hard disk or flash memory! And since this has been true for around two decades, making the necessary 3-4 fold improvement just to keep up with the competition is going to be hard.
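The density half of the pitch, at least, survives a sanity check. Here is a back-of-envelope calculation (my own figures, not from the article or the paper), assuming a theoretical density of 2 bits per base pair and an average base-pair mass of roughly 650 daltons:

```python
# Back-of-envelope check of the "kilogram of DNA" density claim.
# Assumptions (mine): 2 bits per base pair, ~650 g/mol per base pair.
AVOGADRO = 6.022e23        # base pairs per mole
BP_MOLAR_MASS = 650.0      # grams per mole of base pairs (approximate)
BITS_PER_BP = 2.0          # one of four bases encodes two bits

def dna_mass_grams(num_bytes):
    """Grams of DNA needed to encode num_bytes at theoretical density."""
    base_pairs = num_bytes * 8 / BITS_PER_BP
    return base_pairs * BP_MOLAR_MASS / AVOGADRO

# Extance's 2020 estimate of the global digital archive: 44 trillion GB.
archive_bytes = 44e12 * 1e9
print(round(dna_mass_grams(archive_bytes)))  # ~190 grams
```

A couple of hundred grams at theoretical density is consistent with "about a kilogram" once real-world encoding overhead and redundancy are added. The density isn't the problem; the write cost is.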

I actually believe that, decades from now, DNA will be an important archival medium. But I've been criticizing the level of hype around the cost of DNA storage for years. Extance's article admits that cost is a big problem, yet it finishes by quoting Goldman, lead author of a 2013 paper in Nature whose massively over-optimistic cost projections I debunked here. Goldman's quote is possibly true but again definitely deeply misleading:
"Our estimate is that we need 100,000-fold improvements to make the technology sing, and we think that's very credible," he says. "While past performance is no guarantee, there are new reading technologies coming onstream every year or two. Six orders of magnitude is no big deal in genomics. You just wait a bit."

Yet again the DNA enthusiasts are waving the irrelevant absolute cost decrease in reading to divert attention from the relevant lack of relative cost decrease in writing. They need an improvement in relative write cost of at least 6 orders of magnitude. To do that in a decade means cutting the relative cost by a factor of about four every year (4^10 ≈ 10^6), not increasing the relative cost by 10-15% every year.
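The gap between Goldman's optimism and Carlson's data comes down to compound-rate arithmetic. This back-of-the-envelope sketch (my own, not from either source) shows the annual cost-reduction factor six orders of magnitude in a decade would require, versus what a ~5%-per-year decline actually delivers:

```python
# How fast must a cost fall each year to improve by a given total
# factor over a given number of years? (Constant compound rate.)
def required_annual_factor(total_improvement: float, years: int) -> float:
    """Yearly cost-reduction factor that compounds to total_improvement."""
    return total_improvement ** (1 / years)

# Six orders of magnitude in a decade:
print(required_annual_factor(1e6, 10))  # ~3.98x cheaper every year

# By contrast, Carlson's gene-synthesis figure of ~5% per year
# compounds over a decade to only:
print(1 / (0.95 ** 10))                 # ~1.67x
```

A factor of roughly four every year versus a factor of 1.67 per decade is the scale of the shortfall the enthusiasts are waving away.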

Extance's article doesn't simply regurgitate the hype in the paper he's reporting on by failing to scrutinize its claims; he amplifies it by headlining claims the paper is careful not to make, and by giving it prominence in Nature's news section. This kind of clickbaiting is a classic example of problem #6 in The 7 biggest problems facing science, according to 270 scientists by Julia Belluz, Brad Plumer, and Brian Resnick. I blogged about their article here:
Science journalism is often full of exaggerated, conflicting, or outright misleading claims. If you ever want to see a perfect example of this, check out "Kill or Cure," a site where Paul Battley meticulously documents all the times the Daily Mail reported that various items — from antacids to yogurt — either cause cancer, prevent cancer, or sometimes do both.

My problem with the oligopoly of academic publishers isn't that they are incredibly expensive, but that they are incredibly poor value for money, as shown by the fact that it took me about an hour to show how misleading Extance's article is.

Open Knowledge Foundation: Why civil society organisations are using OpenSpending to share fiscal data with the public

planet code4lib - Thu, 2016-09-15 14:00

OpenSpending is one of Open Knowledge International’s current projects. It is a free and open platform for citizens looking to track and analyse public fiscal information globally.

While the OpenSpending team was busy revamping the platform over the last year, we have been fortunate to have a community of users actively involved in testing the new tools. Here we highlight the experiences of three partner civil society organisations collecting and structuring budget and spending data and using OpenSpending tools to present this data to the public. This post also gives an insight into the challenges these organisations faced in data collection and the solutions they employed to reduce data barriers.

Public Domain icons by David Merfield

Sinar Project in Malaysia: Open Spending Data in Constrained Environments

Sinar Project is an initiative that uses open source technology and applications to make important information accessible to the Malaysian people. Sinar has been working to engage disenfranchised communities in the budget process, in order to hold the government accountable for budgets that respond to the needs of citizens.

Over the course of 2016, the team at Sinar has been working to obtain and prepare over 100 datasets for upload to OpenSpending. So far, they have uploaded over 40 datasets to the platform. Among others, the team published the 2014 allocated budgets for public housing maintenance in Kota Damansara township. Data uploaded and visualized on OpenSpending was shared with the community's leaders for review. This gave the community the opportunity to compare how the planned budget allocation matched up with how funds were actually spent. The community leaders identified potential misuse of funds in some budget lines and are continuing to conduct investigations and collect evidence to expose poor management of public finances in Kota Damansara. Data and visualizations are available on OpenSpending Viewer.

It wasn't easy for the team to obtain this data. First, they had to file a Freedom of Information (FOI) request to the state-owned Selangor Housing and Property Agency. They also held meetings with the authorities to gain an in-depth understanding of the data. Sinar continuously faces challenges in collecting budget data at all levels of government. For example, federal government budgets for previous years are not publicly available, and there is no FOI law applicable to the federal government. There are roadblocks in data collection for state governments and city councils as well.

“…to engage disenfranchised communities in the budget process…”

In spite of the roadblocks and reluctance of authorities to collaborate, the team at Sinar have filed FOI requests to the Selangor state government and Petaling Jaya city council to get access to fiscal budgets. They have also filed FOI requests to the management company responsible for the Kota Damansara public housing, obtaining access to data on how MYR 5 million (USD 1.2 million) were allocated to repair railings for all housing blocks and data on allocated budgets for public housing maintenance in 2014 and 2015.

Moving forward, Sinar Project is planning to continue using OpenSpending to:

  1. Address budget priorities at all levels of government
  2. Visualize allocated budgets and compare to official government policies and implementation of government programmes
  3. Make use of evidence based budget data and various survey results to hold the decision makers at all levels accountable
  4. Advocate for transparency in open data, promote better access to government budgets data, and push for better open data policies.
Metamorphosis Project in Macedonia: Revamping the Follow the Money website

Metamorphosis Foundation is a civil society organization from Macedonia that has been active for more than 15 years. Several years ago they started collaborating with Open Knowledge International on the "Open Data Civil Society network" project, with the aim of improving the capacity of civil society organizations in the country. They also established School of Data Macedonia to promote an open agenda.

In 2012, Metamorphosis developed their Follow the Money website to familiarise citizens with the fiscal policies of local authorities. However, while budget information was presented on the site, over time it lost its popularity. In 2015, a School of Data fellow conducted in-depth user research to better understand why the site wasn't being used and how it could be improved to better serve its potential user communities. Ultimately, the team at Metamorphosis decided to revamp the website.

“…improving the capacity of civil society organizations in the country.”

They focused on collecting, cleaning and preparing budget data from all 80 municipalities as well as the country’s central budget. Take a look at the planned Central Budget for 2016 made available on OpenSpending:

For the above visualization, click this link. Explore years 2010 to 2016 at this link.

As with the Sinar project, data collection was incredibly challenging. Budget data for most municipalities was "locked" in PDFs or not published at all. Instead of trying to get the data from the source, Metamorphosis partnered with other CSOs in the country that work closely with the municipalities and were willing to share the data they had already collected.

Another issue they face is the lack of granularity of the published data, with official institutions unwilling to provide more detailed data. Finally, while the central government budget was made available in machine-readable format, it only included the economic budget classification, which identifies the type of expenditure incurred: for example, salaries, goods and services, transfers and interest payments, or capital spending. Since the team needed the functional classification (expenditure according to the purposes and objectives for which it is intended) for the website, they had to scrape it from the website of the Ministry of Finance. The website displays data by functional classification because the team found this the most useful way to present data to users.
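The difference between the two classification schemes is easy to see with a small example. The line items and figures below are entirely hypothetical (not actual Macedonian budget data); they only illustrate how the same expenditures aggregate differently under each scheme:

```python
# Hypothetical budget line items. Economic classification records
# *what kind* of expenditure it is; functional classification records
# *what purpose* it serves. All categories and amounts are made up.
line_items = [
    {"amount": 120_000, "economic": "salaries",           "functional": "education"},
    {"amount":  45_000, "economic": "goods and services", "functional": "education"},
    {"amount": 200_000, "economic": "capital spending",   "functional": "health"},
]

def totals_by(key: str, items: list) -> dict:
    """Aggregate amounts under the chosen classification scheme."""
    out: dict = {}
    for item in items:
        out[item[key]] = out.get(item[key], 0) + item["amount"]
    return out

print(totals_by("economic", line_items))    # breakdown by type of spend
print(totals_by("functional", line_items))  # breakdown by purpose
```

A citizen-facing site like Follow the Money needs the functional view ("how much goes to education?"), which is why a machine-readable budget carrying only the economic classification was not enough.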

Over the next few months, the team will be working to identify funds to launch the revamped version of the Macedonian Follow the Money website, with embedded visualizations created on OpenSpending, and to continue updating their data on the platform.

AfroLeadership in Cameroon: Open Local Budgets

AfroLeadership is a civil society organization in Cameroon, founded in 2007 and committed to the promotion of open data and civic technologies for governance, transparency and citizen participation. For several years, AfroLeadership has been promoting the use of a financial management information system in local governments in order to improve budget transparency, accountability and public participation in budgeting. The adoption of the Financial Management Information System by several councils aims at improving budget reliability, budget execution and the rate at which budget reports reach supreme audit institutions (SAIs).

“…to bring budget information to citizens and CSOs in an accessible and open way…”

The Cameroon Open Local Budgets (COLB) project, launched in 2016, seeks to fight corruption, improve local accountability and ensure effective service delivery by collecting and publishing on OpenSpending the approved budgets and accounts of all 374 councils in Cameroon. The project is a continuation of the organisation's effort to bring budget information to citizens and CSOs in an accessible and open way, and to engage them in public and local affairs.

The goal of the current OpenSpending Cameroon pilot phase is to upload 50 datasets of 2015 budget reports. For example, the uploaded data on Cameroon's Dschang council distinguishes functional expenses from investment expenses, while a drill-down into these categories lets users explore the expenses for each budget category.

The AfroLeadership team also faces challenges in data collection. Although the deadline for producing 2015 budget reports and accounts was the end of May this year, collecting these accounts has been more difficult than expected. The Audit Bench of the Supreme Court of Cameroon has stressed that fewer than 10% of budget reports reach its desk each year.

To address data collection challenges, AfroLeadership has organized information workshops to present to diverse stakeholders (mayors, supreme audit institutions, civil society organisations, journalists, etc.) the necessity of involving citizens in the budget cycle. AfroLeadership has also invited its institutional partner on this project, the national community-driven development program (PNDP), to help collect approved 2015 budget reports and accounts. AfroLeadership is currently in touch with the Ministry of Finance to explore opportunities for improving budget report collection in local governments.

All these organizations have taken part in upload training sessions on OpenSpending, and now that the platform is available in alpha, they are working to publish their data to the larger public through OpenSpending.

To browse existing datasets and to upload your data, visit OpenSpending. For questions, the OpenSpending team is available via the OpenSpending discussion forum, in the OpenSpending chat room, or on the OpenSpending issue tracker.


Subscribe to code4lib aggregator