Feed aggregator

David Rosenthal: The Box Conspiracy

planet code4lib - Tue, 2018-01-02 16:00
Growing up in London left me with a life-long interest in the theatre (note the spelling).  Although I greatly appreciate polished productions of classics, such as the Royal National Theatre's 2014 King Lear, my particular interests are:
I've been writing recently about Web advertising, reading Tim Wu's book The Attention Merchants: The Epic Scramble to Get Inside Our Heads, and especially watching Dude, You Broke The Future, Charlie Stross' keynote for the 34th Chaos Communication Congress. As I do so, I can't help remembering a show I saw nearly a quarter of a century ago that fit the last of those categories. Below the fold I pay tribute to the prophetic vision of an under-appreciated show and its author.

George Coates Performance Works' Box Conspiracy: An Interactive Sho had a November 1993 run in San Francisco and was featured the next year in the Spoleto Festival. Variety panned it:
The purely visceral aspects of “A Box Conspiracy” may delight Coates neophytes. No doubt they’ll satisfy fans ever-ready to lay thought aside for 2 1/4 hours. But in moving closer to theatrical conventions, Coates underlines his need for collaborators with a real understanding of how those conventions work.
Many of Variety's criticisms were just:
Coates’ sloppy script screams for dramaturgical assistance. The odd witty line aside, dialogue consists mostly of lame puns and filler. The actors are clearly at the timing mercy of visual cues. They fumble through their paces as if barely off-book.
But in hindsight Variety completely missed the point of the show because they had no idea of what, in a couple of years, was about to happen to the media industry as the Web took off.

The show mixed live actors with 3D projections - you had to wear 3D glasses. Coates was working closely with techies in Silicon Valley and, I believe, it was Silicon Graphics equipment that drove the projections. Technically, the show was at the very outside edge of the possible, so rough edges were forgivable. But the technical stuff wasn't the point either.

There's very little about Coates and Box Conspiracy on the Web apart from the review and a post about Coates by Steve Mobia, who writes:
It's ironic that there is so little about this pioneering theater group on the web since many Coates productions were directly commenting on this changing technology and its political/social effect. George Coates was, for two decades, the most celebrated director of experimental theater in San Francisco. Each show, despite occasional critical reservation, was given major attention by the city's newspapers. This you would never guess when researching Coates on the web today.
Tim Wiggins as Derek Hornsby
Mobia quotes Tim Wiggins, who starred in the show:
"Box Conspiracy" was created during the birth of the internet's transition to public domain, ... long before images and graphics, long before most people understood what a URL was (including me and most of the cast). ... So while "Box" was ostensibly about interactive television, it was actually about the promises and pitfalls of the coming internet. ... Aside from the 'point and click' purchasing offered by Tom Testa, all the Hornsby family's actions were monitored ... In the end, Derek's repeated ordering of triple-sausage pizza resulted in a massive increase in his health insurance premiums.
In 1993 the Web was four years old, so Coates couldn't assume that the audience knew anything about it. Instead, the show was about the future 5000-channel "interactive TV", a popular meme at the time. Almost a quarter-century later Charlie Stross is talking about the Web, but both Coates and Stross are really talking about advertising-driven media corporations' desperate need to consume more and more of society's attention, and to accumulate personal information to better do so. As Cory Doctorow writes:
Stross says we should be especially worried about machines designed to command ever-larger slices of our attention, without regard to whether we're made happier through this process (after all, you can make someone pay attention to you by driving them nuts, something that's often easier than pleasing them).

He traces the original sin of attention-optimizing autonomous artificial life-forms to the advertising-driven web,
What Coates was showing was that, from the channel's point of view, the worst thing that could happen was for the viewer to change channels, and from the system's point of view, the worst thing that could happen was for the viewer to go outside. So the content was designed to prevent these disasters by targeting channels ("My Favorite Mayhem", "The Beer Channel") narrowly to the viewer, and by portraying outside the house as dangerous ("Live Crime In Your Neighborhood"). With point-and-click purchasing and almost instant delivery there was no need to go outside. Jeff Bezos wouldn't start until the following year.
Still no triple-sausage pizza at!
Apparently, the New York Public Library has a videotape of Box Conspiracy! I hope they can digitize it and put it up on the Web. It was quite extraordinarily prophetic, and deserves a place in media history.

Mark Matienzo: Iterative Intentions for 2018

planet code4lib - Mon, 2018-01-01 19:43

While I enjoy seeing what my friends are setting their intentions towards in the new year, I don’t really believe in new year’s resolutions for myself. They tend to wear on me heavily whenever I’ve proclaimed a long list of things I’m hoping to get better at. Instead, this year, I’m starting with a very short list. My hope is that I can commit to a small number of good habits at a time, which I can then build on iteratively. I want to have the windows of reinforcement stay small at first (maybe a week or two), and once I feel satisfied about whichever habits I’ve committed to, I can add more.

I’m starting with three items:

  • Rebuilding this website: simplified tooling; new layout/style; using and publishing more structured data, and a partial implementation of a stack following Indieweb and Solid principles. The last part is intentionally slippery, but I mostly really care about sending and receiving notifications at this point. I’m giving myself about a week to get this done.
  • Eating better breakfasts. I started 2018 with overnight oats, which happened to be mildly successful. I have a lot to master in terms of proportions and taste, to say the least.
  • Budgeting and financial tracking to better understand my ongoing expenses. This is something I’m undertaking with my partner, and we have actionable (but private) goals for this.

Wish me luck.

John Mark Ockerbloom: Public Domain Day 2018: The 20-year alarm clock

planet code4lib - Mon, 2018-01-01 05:00

In Washington Irving’s classic story “Rip van Winkle”, the title character follows an archaically-dressed stranger into a mountain hideaway, falls asleep, and wakes up to find the world has moved on 20 years without him.  He’s alarmed at first, but eventually figures out what has happened, adapts, and settles into the social fabric of his much-changed town.

We in the US are experiencing a similar phenomenon this Public Domain Day.  It’s now been 20 years since a full year’s worth of published content entered the public domain, when 1922 copyrights expired at the start of 1998.  Later that year, Congress passed the Sonny Bono Copyright Term Extension Act, and the advance of the public domain has been stuck in the Warren G. Harding administration ever since.

Other countries are doing somewhat better.  Most European countries have by now regained the 20 years of public domain many of them lost when they complied with the EU Copyright Duration Directive, and they now get to freely share the works of creators who died in 1947– people like Baroness Orczy, Alfred North Whitehead, Pierre Bonnard, and Willa Cather.  (We in the US get any unpublished works by people in this group, but that’s all that’s entering the public domain here today.)  Countries like Canada that kept to the Berne Convention’s “life plus 50 years” terms are doing substantially better– today they get the works of creators who died in 1967, including Carson McCullers, Arthur Ransome, Woody Guthrie, René Magritte, and Dorothy Parker.  (The Public Domain Review’s Class of 2018 article has writeups of some of these authors, and others.)

This time next year, though, the US may well get to join the party in a bigger way, and have all copyrighted 1923 publications finally enter the public domain.  It’s not a sure thing, though– lobbyists for the entertainment industry have long pressed Congress for longer copyright terms, and while there’s no bill I’m aware of that’s been introduced to do it next year, that doesn’t mean that one couldn’t be quickly rammed through.  So it’s a good time for you to let your elected representatives know that you value a robust public domain, and want no further copyright extensions.  (And if you’re hoping to elect someone new in 2018, let your candidates know this matters to you as well.)

There are other ways that those of us in the US can prepare for a Public Domain Wake-Up party next year.  Some of us are continuing to bring the “hidden public domain” to light over the next year.  Researchers with HathiTrust’s Copyright Review Management System have by now found over 300,000 books and other monographs that are non-obviously in the public domain.  (You may be able to help them!)  The project I’m leading to help identify public domain serial content past 1922 is also underway, and should be complete later this year.  (See an earlier post for some ways you can help with that if you like.)

Other folks are working on another set of works that libraries in the US can now share– certain works in the last 20 years of copyright, which as of today span a full 20 years of publication, from 1923 to 1942.  A provision in the 1998 Copyright Term Extension Act, codified in Section 108(h) of US copyright law, allows non-profit libraries and archives to digitally display and distribute such works when they are not being commercially exploited, and copies are not available at a reasonable price.  Libraries have not made much explicit use of this provision until recently, partly due to uncertainty about when and how it applies.  A preprint article by Elizabeth Townsend Gard aims to clear up this uncertainty and spur libraries to make these works more widely available.  I hope that this work is further developed and applied in the coming year.  (And I’m happy to consider works in these last 20 years of copyright for inclusion in The Online Books Page’s listings, when the libraries that have digital copies make it clear that they’re following the section 108(h) guidelines.)

I also hope some folks take some time this year to take a good look at what we’ve digitized from 1922, and compare it to what was copyrighted that year in the US.  What portion of 1922’s creative output have we brought to light online?   What sorts of works, and what people, have we tended to miss?  Knowing where the gaps are can tell us what we might want to focus on bringing to light from 1923.  Some of that material can be posted online now; hopefully the rest can be posted starting next January.

Finally, it’s worth remembering that you don’t have to wait for copyrights to expire on their own to share work whose copyright you control.  You can open-license them any time you like (this post, along with many of my other works, is CC-BY licensed, for instance).  And you can let them go entirely if you like.  I tend to put my works into the public domain 14 years after publication, following the original copyright term of the US, if I see no need to extend it further.  Today I do that again, and with this post you can consider anything I published in 2003, whose copyright I still control, to have a CC0 dedication to the public domain.

Happy Public Domain Day to everyone who’s seeing new works in the public domain where they are.  And here’s to waking up to lots of new additions to the public domain in the US a year from now.

Hugh Rundle: </learning 2017> <learning 2018>

planet code4lib - Mon, 2018-01-01 02:51

I learned a hell of a lot last year.

1. Tech stuff

Koha

This year at MPOW we migrated our library management system to Koha and I had to simultaneously learn how Koha works and teach other people. It was a pretty crazy time, but it's been an amazing experience and I just want to keep learning more.

MARC

An unexpected part of learning how Koha works was that I had to learn a lot more about MARC. All of our cataloguing workflows were written for Amlib, and whilst Koha's cataloguing tools are extremely flexible and pretty sophisticated, the flipside is that with great power comes great responsibility. You can create complex MARC modification rules for importing files, but to do that you have to know what to modify, how, and why. I probably have more of an appreciation of just how much information is packed into MARC records now, though my fundamental position is still that it's an outdated standard that needs to be replaced a lot faster than we all seem to be moving.

SQL

I've been meaning to learn how to write SQL queries for years now, but somehow I always got away with not knowing how. Moving to Koha gave me the final push to learn, and it turns out it's not as complicated as I expected.

CiviCRM

It's bothered me for quite a while that we've been effectively forcing newCardigan mailing list recipients and cardiParty attendees to sign up to third party platforms to participate. In December we finally completed our migration to CiviCRM on WordPress, bringing our mailing list, events, podcasting, website management and potential fundraising and membership 'in-house' into one system managed by us. Setting this up meant having to learn how CiviCRM works, and to be honest I'm pretty impressed by it. This is the same CRM used by EFA and The Greens and it's both powerful and reasonably easy to set up.

WordPress theme development

I didn't really want to move the newCardigan website to WordPress - everyone had learned how to use Ghost and I find it much easier to manage. Unfortunately, moving to CiviCRM required that we use one of the big three PHP-based platforms: WordPress, Joomla! or Drupal. Setting up a Drupal site seemed like it might be overkill, and WordPress is familiar to everyone on the newCardigan committee, so we went with that. Unfortunately that meant making our own theme in WordPress and I discovered to my surprise that the official WordPress docs are terrible. For a system that powers somewhere between 20% and 25% of the web, this was a bit of a shock, and it made me more appreciative of what the Ghost team have done with docs and guides, given how much smaller and less mature the project is. Anyway, I managed to work things out eventually and whilst I definitely wouldn't say I'm now a "WordPress theme developer" I learned how to hack something together.

CSS grid

I'd seen Jen Simmons and Rachel Andrew talking excitedly about CSS Grid on Twitter over the course of 2017. I didn't quite understand why it was so exciting, but it sounded interesting, so I watched some of Andrew's Grid by Example videos and spent quite a few minutes staring at the screen saying "wow". Grid suddenly makes everything so much easier. A few lines and you can make a responsive, sensible and more interesting site, and spend time making things interesting and pretty instead of spending time just making sure it all lines up properly. With grid and flex, even n00bs like me should be able to quickly knock up something that looks sensible in any modern browser on any screen.

Using package.json and npm install for nodejs

Part of the move from Ghost to WordPress was exporting our old website posts from the Ghost site and importing them into WordPress. The Ghost team created a great tool for moving from WordPress to Ghost, but I couldn't find anything that worked moving in the other direction. So I wrote a nodejs script to do the job. When I was looking at how to make it simple to use with any filename, I finally realised why package.json is useful, and how npm install and npm start work. I'll definitely be using all three again.

Docker

I'm not sure I'll ever really understand Docker, but I learned a lot more this year. As part of some server housekeeping, I moved all my projects (bots, meteor apps, etc) to Docker. This makes them all a lot easier to maintain and, if necessary, move again in future.

When there's a problem with something you've installed on a web server, it's probably a permissions error.

I learned this the hard way.

2. I can make things happen

In December 2016 I launched GLAM Blog Club without really knowing what would happen. Twelve months later and it seems to have mostly been a success. We'll publish some stats soon, but generally speaking we've maintained a regular group of GLAM bloggers, and an ok level of diversity, and I think it's reasonable to claim that more Australian GLAM professionals have blogged more regularly than would be the case if GLAM Blog Club didn't exist.

In July VALA Tech Camp happened. The genesis of Tech Camp was long and complicated, but I had a leading role in finally making it happen. I'm proud of what we accomplished, and if nothing else it proves that there is a mostly untapped market in Australia for reasonably priced GLAM conferences that don't involve vendor halls, keynote speakers or overpriced dinners.

Finally, in November, MPOW migrated to Koha ILS after pretty much two and a half years of work to get us to that point. To be honest, I never thought I would actually be able to get a large Australian public library service to move to an open source library management system, so the whole experience has been a little surreal. It's not perfect, but I'm now working with a system that we can improve either directly (writing the code ourselves) or indirectly (paying for development) as part of a collaborative international project. I'm excited about the possibilities for the (near) future.

3. Nobody knows what the hell they're doing

I already knew this but I had to learn it again. Hearing people I respect deeply say "I don't know", or "I'm not sure what to do" is slightly disconcerting, but in some ways a relief. We're all just humans doing the best we can, and it's good to remember that every now and again.

4. We have to smash the patriarchy

I already knew this one too but last year made it unbearably, unavoidably, shouting-in-my-face clear. Toxic masculinity is ruining everything for everyone and men need to take the lead to kill it. Which will often mean ...not taking the lead. Inevitably I'll keep failing at this, but I'll try to notice more when things are not ok, and actually do something about it.

5. I actually quite like my job

I sure didn't think so for a lot of this year. In hindsight, I think this was mostly a reaction to the many times I found myself uncertain about what I should be doing, and fearing the consequences if I messed things up. Looking at it more rationally, I am incredibly lucky to be in the job I have. I love that I oversee both IT systems and the general collections. I work with great people, my boss trusts me, and the community we serve is diverse, interesting and appreciative. I also got the chance to be Acting Manager for five weeks which was great because I definitely know that I never want to do that job permanently...

What I want to learn this year
  1. Koha templating and themes - to get the most out of the system.
  2. More MySQL - to get some nifty collection management reports out.
  3. Perl - this might sound weird, but Koha is written in Perl.
  4. Unit testing - I know why it's important but I still don't really understand how it works.
  5. How to cook bagels - because they are delicious.
  6. To read long texts immersively again - this will probably require less Twitter.
  7. Remembering names better - because I'm really, really, embarrassingly bad at this.

Thanks for reading my overshare. Happy New Learning Year!

Peter Murray: DLTJ in a #NEWWWYEAR

planet code4lib - Sun, 2017-12-31 22:36

I came across Jen Simmons's call for a #newwwyear by way of Eric Meyer:

Ok, here’s the deal. Tweet your personal website plan with the hashtag #newwwyear (thanks @jamiemchale!):
1) When will you start?
2) What will you try to accomplish?
3) When is your deadline?

Improve an existing site. Start a new one. Burn one down & start over. It’s up to you.

— Jen Simmons (@jensimmons) December 20, 2017

Eric's goal:

I plan to participate in #newwwyear. My plan:

1) I’ll start December 27th.
2) I’ll redesign for the first time in a dozen years, and I’ll do it live on the production site.
3) My deadline is January 3rd, so I’ll have a week.

— Eric Meyer (@meyerweb) December 22, 2017

I was definitely inspired by Eric's courage to do it live on his personal website (see this YouTube playlist). Watch him work without a net (and without a CSS compiler!). I also know that the layout for this blog is based on the WordPress Twenty-eleven theme with lots of bits and pieces added by me throughout the years. It is time for a #newwwyear refresh. Here is my plan.

1) I'll start January 1.
2) I'll convert '' from WordPress to a static site generator (probably Jekyll) with a new theme.
3) My deadline is January 13 (with a conference presentation in between the start and the end dates).

Along the way, I'll probably package up a stack of Amazon Web Services components that mimic the behavior of GitHub pages with some added bonuses (standing temporary versions of the site based on branches, supporting HTTPS for custom domains, integration with GitHub pull request statuses). I'll probably even learn some Ruby along the way.

Eric Hellman: 2017: Not So Prime

planet code4lib - Sat, 2017-12-30 17:21
Mathematicians call 2017 a prime year because 2017 has no factors other than 1 and 2017. Those crazy number theorists.

I try to write at least one post here per month. I managed two in January. One of them raged at a Trump executive order that compelled federal libraries to rat on their users. Update: Trump is still president.  The second pointed out that Google had implemented cookie-like user tracking on previously un-tracked static resources like Google Fonts, jQuery, and Angular. Update: Google is still user-tracking these resources.

For me, the highlight of January was marching in Atlanta's March for Social Justice and Women with a group of librarians.  Our chant: "Read, resist, librarians are pissed!"

In February, I wrote about how to minimize the privacy impact of using Google Analytics. Update: Many libraries and publishers use Google Analytics without minimizing privacy impact.

In March, I bemoaned the intense user tracking that scholarly journals force on their readers. Update: Some journals have switched to HTTPS (good) but still let advertisers track every click their readers make.

I ran my first-ever half-marathon!

In April, I invented CC-licensed "clickstream poetry" to battle the practice of ISPs selling my clickstream.  Update: I sold an individual license to my poem!

I dressed up as the "Trump Resistor" for the Science March in New York City. For a brief moment I trended on Twitter. As a character in Times Square, I was more popular than the Naked Cowboy!

In May, I tried to explain Readium's "lightweight DRM". Update: No one really cares - DRM is a fig-leaf anyway.

In June, I wrote about digital advertising and how it has eviscerated privacy in digital libraries.  Update: No one really cares - as long as PII is not involved.

I took on the administration of the free-programming-books repo on GitHub.  At almost 100,000 stars, it's the 2nd most popular repo on all of GitHub, and it amazes me. If you can get 1,000 contributors working together towards a common goal, you can accomplish almost anything!

In July, I wrote that works "ascend" into the public domain. Update: I'm told that Saint Peter has been reading the ascending-next-monday-but-not-in-the-US "Every Man Dies Alone".

I went to Sweden, hiked up a mountain in Lappland, and saw many reindeer.

In August, I described how the National Library of Medicine lets Google connect Pubmed usage to Doubleclick advertising profiles. Update: the National Library of Medicine still lets Google connect Pubmed usage to Doubleclick advertising profiles.

In September, I described how user interface changes in Chrome would force many publishers to switch to HTTPS to avoid shame and embarrassment.  Update: Publishers such as Elsevier, Springer and Proquest switched services to HTTPS, avoiding some shame and embarrassment.

I began to mentor two groups of computer-science seniors from Stevens Institute of Technology, working on projects for and Gitenberg. They are a breath of fresh air!

In October, I wrote about new ideas for improving user experience in ebook reading systems. Update: Not all book startups have died.

In November, I wrote about how the Supreme Court might squash out an improvement to the patent system. Update: no ruling yet.

I ran a second half marathon!

In December, I'm writing this summary. Update: I've finished writing it.

On the bright side, we won't have another prime year until 2027. 2018 is twice a prime year. That hasn't happened since 1994, the year Yahoo was launched and the year I made my first web page!
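The arithmetic here is easy to check. A minimal Python sketch, using plain trial division (the function names are made up for illustration):

```python
def is_prime(n: int) -> bool:
    """Trial division; plenty fast for four-digit years."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def is_twice_prime(n: int) -> bool:
    """True when n is exactly 2 times a prime, like 2018 = 2 * 1009."""
    return n % 2 == 0 and is_prime(n // 2)


# First prime year after 2017, and the last "twice a prime" year before 2018.
next_prime_year = next(y for y in range(2018, 2100) if is_prime(y))
last_twice_prime = next(y for y in range(2017, 1900, -1) if is_twice_prime(y))

print(next_prime_year, last_twice_prime)  # prints "2027 1994"
```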

Open Knowledge Foundation: Data aggregators: a solution to open data issues

planet code4lib - Thu, 2017-12-28 13:40

This is a guest opinion piece written by Giuseppe Maio and Jedrzej Czarnota PhD. Their biographies can be found below this post.

Open Knowledge International’s report on the state of open data identifies the main problems affecting open government data initiatives. These are: the very low discoverability of open data sources, which were rightfully defined as being “hard or impossible to find”; the lack of interoperability of open data sources, which are often very difficult to use; and the lack of a standardised open license, representing a legal obstacle to data sharing. These problems harm the very essence of the open data movement, which advocates for data that is easy to find, free to access and reusable.

In this post, we will argue that data aggregators are a potential solution to the problems mentioned above.  Data aggregators are online platforms which store data of various kinds at one central location to be utilised for different purposes. We will argue that data aggregators are, to date, one of the most powerful and useful tools to handle open data and resolve the issues affecting it.

We will provide the evidence in favour of this argument by observing how FAIR principles, namely Findability, Accessibility, Interoperability and Reusability, are put into practice by four different data aggregators engineered in Indonesia, Czech Republic, the US and the EU. FAIR principles are commonly utilised as a benchmark to assess the quality of open data initiatives and good FAIR practices are promoted by policymakers.

Image: SangyaPundir (Wikimedia Commons)

We will also assess the quality of aggregators’ data provision tools. Aggregators’ good overall performance on the FAIR indicators and the good quality of their data provision tools will demonstrate their importance. In this post, we will first define data aggregators and present the four aggregators mentioned above. Subsequently, we will discuss the performance of the aggregators on the FAIR indicators and the quality of their data provision.

Data aggregators

Data aggregators perform two main functions: data aggregation and integration. Aggregation consists of creating hubs where multiple data sources can be accessed for various purposes. Integration refers to linked data, namely data to which a semantic label (a name describing a variable) is attached in order to allow for the integration and amalgamation of different data sources (Mazzetti et al 2015, Hosen and Alfina 2016, Qanbari et al 2015, Knap et al 2012).
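To make the two functions concrete, here is a minimal Python sketch. It is not taken from any of the four aggregators discussed in this post: the source names, column names, label mapping and figures are all invented for illustration. Each source's local column names are mapped to shared semantic labels, and records from different sources that share a `city` value are then merged into one record.

```python
from typing import Dict, List

# Hypothetical mapping from each source's local column names to shared
# semantic labels. Every name here is invented for the example.
LABELS: Dict[str, Dict[str, str]] = {
    "jobs_portal": {"kota": "city", "lowongan": "vacancies"},
    "education_office": {"city_name": "city", "grads": "graduates"},
}


def relabel(source: str, rows: List[dict]) -> List[dict]:
    """Attach semantic labels by renaming the columns the mapping knows about."""
    mapping = LABELS[source]
    return [{mapping[k]: v for k, v in row.items() if k in mapping} for row in rows]


def integrate(sources: Dict[str, List[dict]], key: str = "city") -> Dict[str, dict]:
    """Aggregate all sources in one place and merge records sharing `key`."""
    merged: Dict[str, dict] = {}
    for name, rows in sources.items():
        for row in relabel(name, rows):
            merged.setdefault(row[key], {}).update(row)
    return merged


# Invented sample data from two hypothetical providers.
sources = {
    "jobs_portal": [{"kota": "Jakarta", "lowongan": 120}],
    "education_office": [{"city_name": "Jakarta", "grads": 300}],
}
hub = integrate(sources)
print(hub["Jakarta"])
```

Without the shared labels, the two sources could sit side by side in the same hub (aggregation) but could not be combined into a single record (integration).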

Following on from that, two strengths characterise data aggregators. Firstly, aggregators implement the so-called “separation of concerns”: this means that each actor is responsible for one functionality. Separation of concerns spurs accountability and improves data services. Secondly, aggregators host added-value services, e.g. semantics, data transformation and data visuals (Mazzetti et al 2015). However, aggregators face a major challenge as they represent a “single point of failure”: when aggregators break down, the whole system (including data providers and users) is put in jeopardy.

In this post we investigate the Indonesian Active Hiring website, the Czech ODCleanStore, the US-based and the EU-funded ENERGIC-OD.  

  1. The Active Hiring website is a portal that monitors job hiring trends by sector, geographical area and job type. The platform utilises open and linked data (Hosen and Alfina 2016).
  2. ODCleanStore is a project that enables automated data aggregation, simplifying previous aggregation processes; the website provides provenance metadata (metadata showing the origin of the data) and information on data trustworthiness (Knap et al 2012).
  3. is a platform that catalogues raw data, providing open APIs to government data. This portal is part of the Gov 2.0 movement.
  4. ENERGIC-OD (European Network for Redistributing Geospatial Information to user Community – Open Data) is a European Commission-funded project which aims to facilitate access to Geographic Information System (GIS) open data.  The project built a pan-European Virtual Hub (pEVH), a new technology brokering together diverse GIS open data sources.

FAIR indicators and quality of data provision to evaluate data aggregators

FAIR principles and the quality of data provision are the criteria for the assessment of open data aggregators.

Findability. Data aggregators by default increase the discoverability of open data, as they assemble data in just one site, rendering it more discoverable. Aggregators, however, do not fully resolve the problem of lack of discoverability: they merely change the nature of it. While before, findability was associated with technical problems (data was available but technical skills were needed to extract it from various original locations), now it is intertwined with marketing ones (data is in one place, but it may be that no one is aware of it). Aggregators thus address the findability issues but do not fully resolve them.

Accessibility. Aggregators perform well on the Accessibility indicator. As an example, ENERGIC-OD makes data very accessible through the use of a single API. ‘s proposed new unit, Data Compute Unit (DCU), provides APIs to render data accessible and usable. ODCleanStore converts data into RDF format, which makes it more accessible. Finally, the Active Hiring website will provide data as CSV through API(s). Aggregators show improved data accessibility.

Interoperability. All platforms produce metadata (ENERGIC-OD, and linked data (Active Hiring Website and ODCleanStore) which make data interoperable, allowing it to be integrated, thus contributing to the resolution of the non-interoperability issue.

Reusability. ENERGIC-OD’s freemium model promotes reusability. data can be easily downloaded and reutilised as well. ODCleanStore guarantees re-use, as data is licensed with Apache 2.0, while Active Hiring allows visualisation only. Thus, three out of four aggregators enhance the reusability of the data, showing a good performance on the reusability indicator.

Quality of data provision. A web crawler is used by ENERGIC-OD and the Active Hiring website. This is a programme which sifts the web searching for data in an automated and methodical way. ODCleanStore acquires data in the following ways: 1) through a “data acquisition module” which collects government data from many different sources in various formats and converts it into RDF (Knap et al 2012); 2) through a web service for publishers; or 3) as data sent directly in RDF. In the case of, the government sends data directly to the portal. Three out of four aggregators show automated or semi-automated ways of acquiring data, rendering this process smoother.
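As a rough illustration of what converting a record into RDF involves, the Python sketch below turns one flat record into N-Triples lines. It is not ODCleanStore's acquisition module: the `to_ntriples` helper, the `example.org` URIs and the sample record are assumptions made for the example, and every value is emitted as a plain string literal.

```python
def to_ntriples(subject_uri: str, record: dict,
                vocab: str = "http://example.org/vocab/") -> list:
    """Emit one N-Triples line per field of a flat record.

    Simplified on purpose: values become plain string literals, and only
    backslashes, double quotes and line breaks are escaped.
    """
    lines = []
    for field, value in record.items():
        literal = (str(value)
                   .replace("\\", "\\\\")
                   .replace('"', '\\"')
                   .replace("\n", "\\n")
                   .replace("\r", "\\r"))
        lines.append(f'<{subject_uri}> <{vocab}{field}> "{literal}" .')
    return lines


# Hypothetical record, as it might arrive from a CSV export.
triples = to_ntriples("http://example.org/job/1",
                      {"title": "Data analyst", "city": "Jakarta"})
for line in triples:
    print(line)
```

Once records from different publishers are expressed as triples over a shared vocabulary, they can be loaded into the same store and queried together.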


This post analysed the performance of four data aggregators on the FAIR principles. The overall good performance of the aggregators demonstrates how they render the process of data provision smoother and more automated, improving open data practices. We believe that aggregators are among the most useful and powerful tools available today to handle open data.

  • Hosen, A. and Alfina, I. (2016). Aggregation of Open Data Information using Linked Data: Case Study Education and Job Vacancy Data in Jakarta. IEEE, pp.579-584.
  • Knap, T., Michelfeit, J. and Necasky, M. (2012). Linked Open Data Aggregation: Conflict Resolution and Aggregate Quality. IEEE 36th International Conference on Computer Software and Applications Workshops, pp.106-111.
  • Mazzetti, P., Latre, M., Bauer, M., Brumana, R., Brauman, S. and Nativi, S. (2015). ENERGIC-OD Virtual Hubs: a brokered architecture for facilitating Open Data sharing and use. IEEE eChallenges e-2015 Conference Proceedings, pp.1-11.
  • Qanbari, S., Rekabsaz, N. and Dustdar, S. (2017). Open Government Data as a Service (GoDaaS): Big Data Platform for Mobile App Developers. IEEE 3rd International Conference on Future Internet of Things and Cloud, pp.398-403.


Giuseppe Maio is a research assistant working on innovation at Trilateral Research. His Twitter handle is @pepmaio. Jedrzej Czarnota is a Research Analyst at Trilateral Research. He specialises in innovation management and technology development. His Twitter handle is @jedczar.

District Dispatch: Keeping public information public: Records scheduling, retention and destruction

planet code4lib - Thu, 2017-12-28 13:32

This is the third post in a three-part series on how federal records laws ensure government information is appropriately managed, preserved and made available to the public (see also part one and part two).

The National Archives and Records Administration, which holds nearly 5 million cubic feet of archival records, helps federal agencies determine which records are deemed to be “of permanent value” and which can be disposed of, and when and how. Photo by Wknight94

Earlier posts in this series discussed the need for the federal government to have a system for determining how long to keep records and when to dispose of them. Just as a library deaccessions materials when they are no longer needed for that library’s purposes, the federal government also must dispose of records that it no longer needs. Without disposal, the haystack of records would grow ever larger, making it that much harder to find the needle.

Only 1-3% of records created by the federal government are deemed to be of permanent value. Agencies must transfer those records to the National Archives and Records Administration (NARA) and dispose of the rest after an appropriate period of time. Currently, NARA holds nearly 5 million cubic feet of archival records – and that number continues to grow. You can see why there has to be a way to separate the wheat from the chaff.

What are agency responsibilities to retain federal records?

The Federal Records Act (FRA) is the law that sets those standards. In turn, NARA provides guidance to federal agencies on their responsibilities under the FRA. The FRA requires agencies to safeguard federal records so that they are not lost or destroyed without authorization.

Agencies must maintain federal records as long as they have “sufficient administrative, legal, research, or other value to warrant their further preservation by the Government.” In order to do so, agencies must determine the “value” of a particular record.

How do agencies determine the expected value of a record?

Given the huge volume of records created by the federal government, it would be impractical to review every individual record in order to determine its value. Instead, agencies predict the value of records in advance through the use of categories. The agency categorizes the types of records it produces, and then, for each category, predicts how far in the future those records will be useful.

For instance, consider official agency documents explaining the rationale behind a new regulation. We can imagine that, thirty years later, people might want to know why the agency issued that regulation. But there would likely not be as much interest in the agency’s cafeteria menu after that length of time. This process is known as “records scheduling,” because the agency is creating a schedule for how long to retain each type of record.

However, the agency does not make scheduling decisions on its own. Agencies submit proposed schedules to NARA for its appraisal. If NARA agrees with the agency’s proposal, then it publishes a notice in the Federal Register describing the records and inviting comments from the public. After reviewing any comments, NARA may approve the schedule, and the agency may then begin disposing of records according to that schedule. Approved schedules are posted online to help the public identify how long particular records will be retained.
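As a rough illustration of what a records schedule amounts to in practice, the sketch below models one as a lookup table from record category to retention period. Everything in it is hypothetical: the category names, the retention periods and the `disposition` helper are invented for this example and are not taken from any NARA-approved schedule.

```python
# Hypothetical schedule. Category names and retention periods are invented
# for illustration; real schedules are proposed by agencies and approved by NARA.
PERMANENT = "permanent"

SCHEDULE = {
    "regulation_rationale": PERMANENT,  # assumed to be of permanent value
    "routine_correspondence": 7,        # years to retain after creation
    "cafeteria_menu": 1,
}


def disposition(category, created_year):
    """Return what the hypothetical schedule says to do with a record."""
    if category not in SCHEDULE:
        # Records may not be destroyed without authorization, so anything
        # that has no approved schedule is simply kept.
        return "retain (no approved schedule)"
    retention = SCHEDULE[category]
    if retention == PERMANENT:
        return "transfer to NARA"
    return f"eligible for disposal after {created_year + retention}"


print(disposition("cafeteria_menu", 2017))
print(disposition("regulation_rationale", 2017))
print(disposition("detainee_records", 2017))
```

The point of the table is that the decision is made once per category, in advance, rather than once per record.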

How can the records scheduling process be improved?

Typically, there is not much controversy about records schedules; rightly or wrongly, they’re generally seen as pretty mundane. But that’s not always the case.

For instance, in July, NARA published a notice of proposed records schedules from Immigration and Customs Enforcement (ICE) for several categories of records related to detainees. After reviewing the proposed schedules, the ACLU and other organizations filed comments arguing that the proposed retention schedule “does not account for the needs of the public, impacted individuals and government officials to conduct necessary, and in some cases required, evaluation of ICE detention operations.” In response, NARA has not yet approved the ICE proposal, and changes are expected before it is approved.

ALA believes that the records scheduling process can be improved to better ensure that important records are retained for an adequate period of time. In November, ALA sent a letter recommending that NARA increase the transparency of proposed records schedules and review its appraisal policy to give greater consideration to the potential value of records. ALA will continue to work with NARA and agencies to advocate for the proper preservation of government information and its availability to the public.

The post Keeping public information public: Records scheduling, retention and destruction appeared first on District Dispatch.

Terry Reese: MarcEdit MacOS 3 Design notes

planet code4lib - Thu, 2017-12-28 00:23

** Updated 12/28 **

Ok, so I’m elbow deep putting some of the final touches on the MacOS version of MarcEdit. Most of the changes to be completed are adding new functionality (introduced in MarcEdit 7 in recent updates), implementing the new task browser, updating the terminal mode, and completing some of the UI touches. My plan has been to target Jan. 1 as the release date for the next MacOS version of MarcEdit, but at this point, I’m thinking this version will release sometime between Jan. 1 and Jan. 7. Hopefully, folks will be OK if I need a little bit of extra time.

As I’m getting closer to completing this work, I wanted to talk about some of the ways that I’m thinking about how the MacOS version of MarcEdit is being redesigned.

  1. As much as possible, I’m syncing the interfaces so that all versions of MarcEdit will fundamentally look the same. In some places, this means significantly redoing parts of the UI. In others, it’s mostly cosmetic (colors). While the design does have to stay within Apple’s UI best practices (as much as possible), I am trying to make sure that the interfaces will be very similar. Probably the biggest differences will be in the menuing. To do this, I’ve had to improve the thread queuing system that I’ve developed in the MacOS version of MarcEdit to make up for some of the weaknesses in how default UI threads work.
  2. One of the challenges I’ve been having is related to some of the layout changes in the UI. To simplify this process, the initial version of the Mac update won’t allow a lot of the forms to be resized. The form itself will resize automatically based on the font and font sizes selected by the user – but resizing and scaling all the items on the window and views automatically and in spatial context is proving to be a real pain. Looking at a large number of Mac apps, window resizing doesn’t always appear to be available though (unless the window is more editor based) – so maybe this won’t be a big issue.
  3. Like MarcEdit 7, I’m working on integrations. Enhancing the OCLC integrations, updating the ILS integrations, integrating a lot of help into MarcEdit MacOS, adding new wizards, integrating plugins – I’m trying to make sure that, with the MacOS update, I fully embrace one of the key MarcEdit 7 design rules – that MarcEdit should integrate with other systems or simplify the moving of data between them (because you are probably using more programs than MarcEdit).
  4. MarcEdit MacOS 3 should be functionally equivalent to MarcEdit 7, with the following exceptions:
    1. There is no COM functionality (this is Windows only)
    2. Initially, there will be no language switching (the way controls are named is very different than on Windows – I haven’t figured out how to connect the differences)
  5. Like MarcEdit 7, I’m targeting this build for newer versions of MacOS. In this case, I’ll be targeting 10.10+. This means that users will need to be running Yosemite (released in 2014) or greater. Let me know if this is problematic. I can push this down one, maybe two versions – but I’m just not sure how common older OS X versions are in the wild.


Here’s a working wireframe of the new main MacOS MarcEdit update.

Anyway – these are the general goals. All the functional code written for MarcEdit 7 has been able to be reused in MarcEdit MacOS 3, so like the MarcEdit 7 update, this will be a big release.

As I’ve noted before, I’m not a Mac user. I’ve spent more time in the eco-system to get a better idea of how programs handle (or don’t) resizing windows, etc. – but fundamentally, working with MacOS feels like working with a broken version of Linux. So, with that in mind – I’m developing MarcEdit MacOS 3 using the concepts above. Once completed, I’d be happy to talk with (or hear from) primarily MacOS users about how some of the UI design decisions might be updated to make the program easier for MacOS users.



District Dispatch: April CopyTalk on the calendar: fair use confidence

planet code4lib - Wed, 2017-12-27 22:40

Originally scheduled for January 4, 2018, this CopyTalk is being postponed to April 5. 

Please join us for “Assessing Librarians Confidence and Comprehension in Explaining Fair Use Following an Expert Workshop” presented by Sara Benson, Copyright Librarian at the University of Illinois.

Sara will discuss her study that evaluated librarian confidence and comprehension following an expert-led fair use training session. The results, though limited in scope, provide encouraging evidence that librarians can tackle the concept of fair use when provided with appropriate training. Both the level of confidence and the level of comprehension rose after the librarian participants were provided with training, indicating that the training did have an impact. Further evidence of impact was revealed by the survey distributed two weeks after the training wherein some librarians noted that they had had the opportunity to use the skills learned in the training workshop during their daily work. The findings could contribute to the enhancement of copyright in libraries training and workshops.

Sara Benson was a practicing lawyer before teaching at the University of Illinois College of Law. Now she is also a librarian with an MLIS degree from the iSchool at Illinois.

Mark your calendars and set aside some time for our hour-long free webinar: Thursday, April 5, at 2 p.m. Go to and sign in as a guest.

This program is brought to you by the Copyright Education Subcommittee.

Did you miss a CopyTalk? Check out our CopyTalk webinar archive!

The post April CopyTalk on the calendar: fair use confidence appeared first on District Dispatch.

Meredith Farkas: My year in books 2017

planet code4lib - Wed, 2017-12-27 19:50

Reading this year has been so many things for me. An escape. A way to educate myself. A way to see my own struggles in a different way through another’s story. A way to understand the struggles of others. A way to better understand where I came from. This year I think I’ve read more than I had in any other year since college. I read 16 of the 22 books I’d hoped to read this year, which feels like an accomplishment. Books with asterisks are ones that I didn’t (because I couldn’t get into them) or have not yet read in full (as is the case of the books by Nguyen, Clinton, and Clements). I struggled this year to think of what my favorite book was, but The Underground Railroad, Little Fires Everywhere, The Hate U Give, and Bad Feminist were definitely highlights.

This year, I also listed the children’s books I’ve either read with Reed or on my own. Reed is part of an Oregon Battle of the Books team this year and I’m their coach, so I’ve been slowly reading the 8 books he was assigned to read so I can help quiz him. He’s finally gotten into reading, which I could not be more thrilled about. I remember having the same experience the summer before I started third grade: one moment, I hated reading; the next, I was in love.

Adult and Young Adult Fiction

  • Willful Disregard: A Novel About Love by Lena Andersson
  • All Grown Up by Jami Attenberg
  • The Water Knife by Paolo Bacigalupi
  • The Noise of Time by Julian Barnes*
  • The Idiot by Elif Batuman
  • The Sellout by Paul Beatty
  • Outline by Rachel Cusk*
  • The Arrangement by Sarah Dunn
  • Eleven Hours by Pamela Erens
  • Turtles All the Way Down by John Green
  • Since We Fell by Dennis Lehane*
  • It Can’t Happen Here by Sinclair Lewis
  • The Association of Small Bombs by Karan Mahajan
  • Behold the Dreamers by Imbolo Mbue
  • Norwegian by Night by Derek Miller
  • The Bluest Eye by Toni Morrison
  • I’ll Give You the Sun by Jandy Nelson
  • Little Fires Everywhere by Celeste Ng
  • The Sympathizer by Viet Thanh Nguyen*
  • Commonwealth by Ann Patchett
  • Eleanor and Park by Rainbow Rowell
  • The Hate U Give by Angie Thomas
  • Chemistry by Weike Wang
  • The Underground Railroad by Colson Whitehead
  • The Sun is Also a Star by Nicola Yoon

Nonfiction

  • Voices from Chernobyl by Svetlana Alexievich
  • My Name is Freida Sima: The American-Jewish Women’s Immigrant Experience Through the Eyes of a Young Girl from the Bukovina by Judith Tydor Baumel-Schwartz (which is actually about my relatives!)
  • What Happened by Hillary Clinton*
  • Evicted: Poverty and Profit in the American City by Matthew Desmond
  • Love Warrior by Glennon Doyle*
  • Abandon Me: Memoirs by Melissa Febos*
  • Bad Feminist by Roxane Gay
  • The Morning They Came for Us: Dispatches from Syria by Janine di Giovanni
  • Lab Girl by Hope Jahren*
  • I Love Dick by Chris Kraus
  • Furiously Happy by Jenny Lawson
  • The Arm: Inside the Billion-Dollar Mystery of the Most Valuable Commodity in Sports by Jeff Passan
  • Becoming Habsburg: The Jews of Habsburg Bukovina, 1774-1918 by David Rechter*
  • Men Explain Things to Me by Rebecca Solnit
  • The Mother of All Questions by Rebecca Solnit

Children’s Fiction

  • Alien in My Pocket by Nate Ball
  • The Case of the Case of Mistaken Identity by Mac Barnett
  • Wild Life by Cynthia DeFelice
  • The Terrible Two by Mac Barnett and Jory John
  • The Terrible Two Get Worse by Mac Barnett and Jory John
  • Keepers of the School Book 1: We the Children by Andrew Clements
  • Keepers of the School Book 2: Fear Itself by Andrew Clements*
  • The Westing Game by Ellen Raskin
  • Harry Potter and the Half-Blood Prince by J. K. Rowling
  • Harry Potter and the Deathly Hallows by J. K. Rowling
  • Holes by Louis Sachar
  • I Survived the Eruption of Mount St. Helens, 1980 by Lauren Tarshis

Here are some of the books I hope to read in 2018, though as always, I know that serendipity and the vagaries of Overdrive hold lists will impact my decision-making. If any of you have thoughts on these or alternative suggestions, let me know!

  • Beartown by Fredrik Backman
  • We Were Eight Years in Power by Ta-Nehisi Coates
  • Why I Am Not a Feminist: A Feminist Manifesto by Jessa Crispin
  • Manhattan Beach by Jennifer Egan
  • Fresh Complaint by Jeffrey Eugenides
  • My Brilliant Friend by Elena Ferrante
  • Difficult Women by Roxane Gay
  • Class Mom by Laurie Gelman
  • Homegoing by Yaa Gyasi
  • Exit West by Mohsin Hamid
  • Uncommon Type by Tom Hanks
  • Before the Fall by Noah Hawley
  • This Will Be My Undoing: Living at the Intersection of Black, Female, and Feminist in (White) America by Morgan Jerkins
  • Her Body and Other Parties: Stories by Carmen Maria Machado
  • Dark Money: The Hidden History of the Billionaires Behind the Rise of the Radical Right by Jane Mayer
  • Homesick for Another World by Ottessa Moshfegh
  • Nasty Women: Feminism, Resistance, and Revolution in Trump’s America edited by Samhita Mukhopadhyay and Kate Harding
  • So You Want to Talk About Race by Ijeoma Oluo
  • The Bright Hour by Nina Riggs
  • On Tyranny: Twenty Lessons from the Twentieth Century by Timothy Snyder
  • Sing, Unburied, Sing by Jesmyn Ward
  • The Best Kind of People: A Novel by Zoe Whittall

Terry Reese: MarcEdit 7: Holiday Edition

planet code4lib - Wed, 2017-12-27 17:43

I hope that this note finds everyone in good spirits. We are in the midst of the holiday season and I hope that everyone that this reaches has had a happy one. If you are like me, the past couple of days have been spent cleaning up. There are boxes to put away, trees to un-trim, decorations to store away for another year. But one thing has been missing, and that has been my annual Christmas Eve update. Hopefully, folks won’t mind it being a little belated this year.

The update includes a number of updates – I posted about the most interesting (I think) here:, but the full changelog is below:

  • Enhancement: Clustering Tools: Added the ability to extract records via the clustering tooling
  • Enhancement: Clustering Tools: Added the ability to search within clusters
  • Enhancement: Linux Build created
  • Bug Fix: Clustering Tools: Numbering at the top wasn’t always correct
  • Bug Fix: Task Manager: Processing number count wouldn’t reset when run
  • Enhancement: Task Broker: Various updates to improve performance and address some outlier formats
  • Bug Fix: Find/Replace Task Processing: Task Editor was incorrectly always checking the conditional option. This shouldn’t affect run, but it was messy.
  • Enhancement: Copy Field: Added a new field feature
  • Enhancement: Startup Wizard — added tools to simplify migration of data from MarcEdit 6 to MarcEdit 7

One thing I specifically want to highlight is the presence of a Linux build. I’ve posted a quick video documenting the installation process at: The MarcEdit 7 Linux build is much more self-contained than previous versions, something I’m hoping to do with the MacOS build as well. I’ll tell folks upfront, there are some UI issues with the Linux version – but I’ll keep working to resolve them. However, I’ve had a few folks asking about the tool, so I wanted to make it ready and available.

Throughout this week, I’ll be working on updating the MacOS build. I’ve fallen a little behind, so this build may take an extra week to complete (I was targeting Jan. 1; it might slip a few days past). Functionally, I think folks will be happy, as it fills in a number of gaps while still integrating the new MarcEdit 7 functionality (including the new clustering tools).

As always, if you have questions, please let me know. Otherwise, I’d like to wish everyone a Happy New Year, filled with joy, love, friendship, and success.



Open Library: A Holiday Gift from Open Library: Introducing the Reading Log

planet code4lib - Wed, 2017-12-27 00:23

For years readers have been asking us for a convenient way to keep track of the books they’re reading.

As we prepare to step through the threshold into 2018, we’re happy to announce the release of a brand new Reading Log feature which lets you indicate all the books you’d like to read, books you’ve already read, and books you’re currently reading.

You can now mark a book as “Want to Read”, “Currently Reading”, or “Already Read”, in addition to adding it to one of your themed reading lists.

Here’s how it works!

Any time you go to a book page on Open Library, you will see a new green button with the text “Want to Read”. By clicking on this button, you can mark this book as a work you’d like to read. By clicking on the white dropdown arrow on the right side of the button, you can select from other options too, like “Currently Reading” or “Already Read”. Once you click one of these options, the green button will appear gray with a green check, indicating your selection has been saved.

Where can I review my Reading Log?

You can review the books in your Reading Log by clicking the “My Books” menu and selecting the “My Reading Log” option in the dropdown.

You can find a link to your Reading Log page under the “My Books” menu

From this page, you can manage the status of the books you’re reading and easily find them in the future.

A preview of the Reading Log page

Who can see my Reading Log selections?

Books you mark as “Want to Read”, “Currently Reading”, or “Already Read” are made private by default. We know some people want to share what books they’re reading. In the future, we hope to offer an option for readers to make their Reading Log public.

Can I Still Use Lists?

You can still use your existing Lists and even create new ones! In addition to logging your reading progress, you can use the green dropdown menu to add a book to one of your custom themed Lists.

Send Us Your Feedback!

We hope you love this new feature as much as we do and we’d love to hear your thoughts! Tweet us at @openlibrary. Is the Reading Log feature not working as you expect? Please tell us about any issues you experience here.

David Rosenthal: Updating Flash vs. Hard Disk

planet code4lib - Tue, 2017-12-26 16:00
Chris Mellor at The Register has a useful update on the evolution of the storage market based on analysis from Aaron Rakers. Below the fold, I have some comments on it. In order to understand them you will need to have read my post The Medium-Term Prospects for Long-Term Storage Systems from a year ago.

The first thing to note is that the transition to 3D is proceeding rapidly:
3D NAND bits surpassed 50 per cent of total NAND Flash bits supplied in 2017's third quarter, and are estimated to reach 85 per cent by the end of 2018,
Despite this increase in capacity, price per bit has increased recently. Rakers' sources all predict that prices will resume their decrease shortly. They didn't predict the increase, so some skepticism is in order. They differ about the rate of the decrease:
IDC thinks there will be a $/GB decline of 36 per cent year-on-year 2018. TrendForce (DRAMeXchange) recently forecast that 2018 NAND Flash price declines would be in the 10 to 20 per cent year-on-year range. Western Digital concurs with that from a 3D NAND viewpoint, and has reported having seen 3D NAND price declines in the 15 - 25 per cent per annum range.
So Rakers' graph is on the optimistic side, and TrendForce's estimate agrees with my projection. Rakers projects that SSD's gradual erosion of the hard disk market share will continue:
He looked at SSD ships related to disk drive ships on a capacity basis, seeing the flash percentage share rising to 19.3 per cent in 2021 from 8.4 per cent this year:
So, as I projected, Rakers agrees that in 2021 bulk data will still reside overwhelmingly on hard disk, with flash restricted to premium markets that can justify its higher cost.
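As a rough check on the figures quoted above, the arithmetic can be sketched in a few lines of Python. The function names are mine, "this year" is assumed to mean 2017, and carrying a single year's forecast decline forward for four years is an illustrative assumption, not something any of the quoted analysts claimed.

```python
def implied_annual_growth(start_value, end_value, years):
    """Compound annual growth rate that takes start_value to end_value."""
    return (end_value / start_value) ** (1 / years) - 1


def relative_price(annual_decline, years):
    """Fraction of today's price left after `years` of a constant decline."""
    return (1 - annual_decline) ** years


# Flash share of shipped capacity: 8.4 per cent (assumed 2017) to 19.3 per cent (2021).
share_growth = implied_annual_growth(8.4, 19.3, 2021 - 2017)
print(f"implied growth in flash share: {share_growth:.1%} per year")

# What the quoted decline rates would mean if sustained for four years.
for decline in (0.10, 0.20, 0.36):
    left = relative_price(decline, 4)
    print(f"{decline:.0%} annual decline -> {left:.0%} of today's $/GB after 4 years")
```

Under these assumptions the flash share grows by a little over a fifth each year, and the gap between a 10 per cent and a 36 per cent annual decline compounds into a large difference in $/GB by 2021, which is why the disagreement among the forecasts matters.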

Mellor ends on a cautionary note, with which I concur:
It's still not clear if QLC (4bits/cell) flash will actually be an enterprise-class technology. Flash capacity increases beyond that might stall because there is nothing beyond QLC, such as a theoretical PLC (penta level cell - 5bits/cell) technology, or layering beyond 96 x 3D NAND layers might hit a roadblock.

Cynthia Ng: Making the Choice: Personal over Professional

planet code4lib - Tue, 2017-12-26 10:33
This is going to be a relatively short post, but as many people have been asking me why I am leaving my current position, I thought I might as well do a brief post about the whole thing. TL;DR version: We’re moving and since my current position is not a remote/telecommute job, I am resigning. … Continue reading "Making the Choice: Personal over Professional"

Ed Summers: Dossier

planet code4lib - Tue, 2017-12-26 05:00

This is a snippet from Mayer-Schönberger (2011) that I happened to read soon after the Digital Blackness Symposium in Ferguson, Missouri. In one panel presentation we heard from a group of activists who spoke in part about how they wanted the records of protest in Ferguson to reflect how the activists changed as individuals. This was actually a theme that continued on from the first meeting a year earlier, where one of the main takeaways was that the archive needs to reflect voices in time–voices that are in the process of becoming. In Delete: The Virtue of Forgetting in the Digital Age, Mayer-Schönberger draws on similar ideas and specifically focuses on how digital media’s ability to collapse time can actually work to prevent change from happening:

But there is one further dimension of how digital remembering negates time. Earlier I mentioned digital dossiers and results of person queries in search engines. They are odd because they are limited to information available in digital format, and thus omit (possibly important) facts not stored in digital memory. Equally odd is how in such dossiers time is collapsed, and years–perhaps decades of life–are thrown together: that a person took out a loan a decade ago may be found next to the link to an academic paper she published six years later, as well as a picture posted on Flickr on last week. The resulting collage with all the missing (as well as possibly misleading) bits is no accurate reflection of the person themselves. Using an ever more comprehensive set of digital sources does not help much either. The resulting digital collage of facts would still be problematic. It would be like taking a box of unsorted photos of yourself, throwing on a table and thinking that by just looking hard enough at them you might gain a comprehensive accurate sense of who the person in the photos actually is today (if you think this example if far-fetched, just look at Flickr, and how it makes accessible photos taken over long periods of time). These digital collages combine in-numerous bits of information about us, each one (at best) having been valid at a certain point in our past. But as it is presented to us, it is information from which time has been eliminated; a collage in which change is visible only as tension between two contradicting facts, not as an evolutionary process, taking place over time.

Some worry that digital collages resemble momentary comprehensive snapshots of us frozen in time, like a photograph, but accessible to the world. Actually, digital collages are much more disquieting than that. They are not like one, but hundreds, perhaps thousands of snapshots taken over our lifetime superimposed over each other, but without the perspective of time. How can we grasp a sense of a person that way? How can we hope to understand how a person evolved over the years, adjusting his views, adapting his (changing) environment? How can we pretend to know who that person is today, and how his values, his thinking, his character have evolved, when all that we are shown is timeless collage of personal facts thrown together? Perhaps, the advocates of digital memory would retort that we could do so with appropriate digital filters, where through visual presentation time could be reintroduced, like color coding years, or sorting events. I am afraid it still might not fully address the problem. Because I fear the real bottleneck of conflating history and collapsing time is not digital memory, but human comprehension. Even if we were presented with a dossier of facts, neatly sorted by date, in our mind we would still have difficulties putting things in the right temporal perspective, valuing facts over time … From the perspective of the person remembering, digital memory impeded judgment. From the perspective of the person remembered, however, it denies development, and refuses to acknowledge that all humans change all the time. By recalling forever each of our errors and transgressions digital memory rejects our human capacity to learn from them, to grow and evolve.

(p. 123-124)

Part of me wants to say that this was the case with physical folders full of paper documents, and photographs too. It’s interesting to consider how digital media functions differently because of the way it implies completeness and obscures its gaps and inconsistencies. Indeed, Mayer-Schönberger points out that digital media actually is much more malleable on scales that are inconceivable with paper documents.


Mayer-Schönberger, V. (2011). Delete: The virtue of forgetting in the digital age. Princeton University Press.

Hugh Rundle: Dreaming bigger

planet code4lib - Sat, 2017-12-23 02:01

I missed the deadline for the GLAMBlogClub in November, ironically because I was going through a month that was particularly lacking in 'balance'. At my place of work, D-Day arrived on Melbourne Cup Day: we migrated from Amlib - a system we'd been using for 20 years - to Koha ILS. The migration itself went fairly smoothly, but inevitably there was a lot of work cleaning things up afterwards. Twenty years of custom, practice, and not-particularly-controlled data entry will mess with even the most comprehensive data migration plan. I put in some very long hours in November, but that's not sustainable for very long: I had to remind myself in the weeks that followed that I don't have to get everything done and everything perfect in the next week, or even the next month.

I've long been somewhat sceptical of the idea of 'work-life balance'. Working and 'life' are not two opposites to be balanced. In professions like librarianship this can be especially so: many librarians proudly identify as librarians primarily. This is not unproblematic, but it does highlight that separating 'life' from 'work' is quite artificial. The idea of 'balance' needs to also be interrogated. Working hard or long hours is not, in itself, necessarily unbalanced. The idea of flow or being 'in the zone' can often lead to long periods of productive and highly enjoyable work. Flow is pretty hard to achieve in a busy open plan office, but I've had periods of getting enjoyably lost in the Koha weeds over the last few months, piecing together how we might reconceptualise our collection management, or take advantage of a particular feature. My loss of balance was, rather, driven by a self-imposed requirement that our migration must go flawlessly, or I would somehow have let down the entire Koha and library community. In hindsight this was a stupid amount of pressure to put on myself, but it happened. At the very moment I should have been feeling excited and triumphant at what turned out to be a reasonably successful migration (system down for only one day, which was a public holiday anyway), I was instead exhausted, stressed and not much fun to be around. I'm about to start six weeks of leave, so no need to feel any kind of sympathy! I think the lesson here is to always maintain a sense of perspective, and have people around who can tell you when you're starting to get out of control.


Koha ILS is the first open source library management system in the world and I still can't quite believe I managed to jump through all the hoops to move my five-branch library service onto it. There are things about Koha that are magical because they just work the way people expect them to work, but naturally there are also parts of the system that are less well developed, or appear to have been poorly thought through, or simply work in a particular way that is fine for many libraries but not so great for ours. Such is the nature of software generally, but the beauty of an open source system is that if you have the money or skills, you can fix, improve or extend it. My argument for moving to Koha has, from the beginning, been about flexibility and local control of our tools, rather than about saving money. Of course, the way this all works is that open source software is collaborative. It's not just that individual developers or organisations can use the code, but rather that the system is built by the Koha community as a community. Bugs are listed, patches are tested and signed off, proposals are made, questions are asked and answered. It's not perfect. I'd like a lot more documentation, and exactly how the whole system works is pretty mysterious for newbies, but everyone involved recognises that: they're just all really busy. The people I've spoken to about this have all responded by encouraging me to help fix it, rather than responding in any defensive way: the way to make things better is to just make things better.

I've been a fan of open source - and Koha - for a long time, but this year as I've worked on Koha implementation I've been surprised how much moving to a completely open source architecture has changed the way I approach the options available to us for future projects. We're now running an open source web CMS (Joomla!), room and equipment booking system (Brushtail), and library management system (Koha). With our core system now open source, I've noticed that my colleagues and I are already starting to think much more in terms of "we could approach problem X with solution Y" rather than "I wish system A had feature B". It can be a subtle difference, but the very fact that we could adjust how our system works without asking anyone's permission has changed the way we think about what's possible at all. In reality, 'asking for permission' still happens with Koha core development, because patches are discussed, tweaked and signed off. But this is collaboration with peers, not begging a vendor for a feature that's "not on their roadmap". For the ambitious solutions we'll need funding and fee-for-service development. But unlike the black box of proprietary software licensing, we'll have a reasonably clear idea what we're going to get for our money.

Seeing my own and others' thinking expand like this in such a short period of time, I've been thinking about the hidden benefits of open source. I've always thought about it as a collaborative way to make tools, and I've long thought that librarians need to make our own tools if we are to fulfil the promises our profession makes. But I'd not really consciously considered that open source could also expand our collective horizons because of its collaborative nature. Institutions - especially those that are part of or funded by governments - naturally want expenditure to fall into neat boxes. This is the reason, I think, so many libraries buy off-the-shelf software licenses and subscription packages: it's easy to explain to Procurement, and means you can shift the responsibility onto the vendor. But with the responsibility goes the control. When we lose control, our mindsets change. We either become compliant, or we become demanding, but either way the scope becomes smaller. When you're spending all your energy begging for an extra option in that drop-down menu, you don't have the mental bandwidth to be dreaming up whole new use-cases for your software generally. Proprietary software has 'user groups' of course, but this is not collaboration - it's more akin to a club or a union. You can collectively bargain or beg, but you can't really collectively build and make. It's the expanded scale of what's possible now that changes things in a way I hadn't deeply thought about before. So I'm going to enjoy my summer break, but I've never been more excited about my work. Because while we still need to burn it all down, we've got to build something out of the ashes.

We'll dream bigger when we build it together.

Information Technology and Libraries: Everyone’s Invited: A Website Usability Study Involving Multiple Library Stakeholders

planet code4lib - Fri, 2017-12-22 16:40

This article describes a usability study of the University of Southern Mississippi Libraries’ website conducted in early 2016. The study involved six participants from each of four key user groups – undergraduate students, graduate students, faculty, and library employees – and consisted of six typical library search tasks such as finding a book and an article on a topic, locating a journal by title, and looking up hours of operation. Library employees and graduate students completed the study’s tasks most successfully, whereas undergraduate students performed fairly simple searches and relied on the Libraries’ discovery tool, Primo. The study’s results identified several problematic features that impacted each user group, including library employees. This increased internal buy-in for usability-related changes in a later website redesign. 

Information Technology and Libraries: Mobile Website Use and Advanced Researchers: Understanding Library Users at a University Marine Sciences Branch Campus

planet code4lib - Fri, 2017-12-22 16:40
This exploratory study examined the use of the Oregon State University Libraries website via mobile devices by advanced researchers at an off-campus branch location. Branch campus–affiliated faculty, staff, and graduate students were invited to participate in a survey to determine what their research behaviors are via mobile devices, including frequency of their mobile library website use and the tasks they were attempting to complete. Findings showed that while these advanced researchers do periodically use the library website via mobile devices, mobile devices are not the primary mode of searching for articles and books or for reading scholarly sources. Mobile devices are most frequently used for viewing the library website when these advanced researchers are at home or in transit. Results of this survey will be used to address knowledge gaps around library resources and research tools and to generate more ways to study advanced researchers’ use of library services via mobile devices.

Information Technology and Libraries: Editorial Board Thoughts: Reinvesting in Our Traditional Personnel Through Knowledge Sharing and Training

planet code4lib - Fri, 2017-12-22 16:40

