Code4Lib Journal: Python, Google Sheets, and the Thesaurus for Graphic Materials for Efficient Metadata Project Workflows
I participated in the “#1Lib1Ref” campaign again this year, recording my experience and talking through why I think it’s important.
I. We provide the highest level of service to all library users… ALA Code of Ethics
That’s what public libraries do, right? Provide service to everyone, respectfully and professionally — and without conditioning that respect on checking your papers. If you walk through those doors, you’re welcome here.
When you’re standing in the international arrivals area at Logan, you’re in a waiting area between a pair of large double doors, exiting from Customs, and then the doors to the outside world. We stood in a crowd of hundreds, chanting “Let Them In!” Sometimes, some mysterious number of minutes after a flight arrival, the doors would open, and tired people and their luggage pour through, from Zurich, Port-au-Prince, Heathrow, anywhere.
And the Code of Ethics ran through my head because that’s what we were chanting, wasn’t it? That anyone who walks through those doors is welcome here. Let them in.
Library values are American values. And if you have a stake in America, don’t let anyone build an America that’s less than what we as a profession stand for.
Apologies, but after our announcement, just before Christmas, of dates for Hydra Connect 2017 it became apparent that they clashed with a PASIG conference which, at that point, had not been widely advertised. This would have represented a conflict of interest for a significant number of our Hydra community.
Accordingly, the dates for Hydra Connect 2017 have been changed. It will still be hosted by Northwestern University but the dates are now Monday November 6th – Thursday November 9th, 2017. This year we have made the decision to use a conference hotel and the event will take place at the Hilton Orrington near the University. Please update your calendars!
Further information via emails and the Hydra wiki in due course!
Open Knowledge Foundation: Brazil’s Public Spending project is looking for leaders in various regions of Brazil to increase participation in the budgeting process.
The website is part of a wider campaign to search, recruit and support new leaders that wish to work with transparency, mainly public spending, in Brazilian municipalities and is using OKI’s OpenSpending technical architecture. The support will be provided to mentors specializing in law, transparency, technology and open data. The goal here is to increase the transparency in budget execution, bidding process and contractual management of cities.
So that leaders can achieve concrete results, the OK Brazil team will develop a schedule with each and every one of them, using the existing legal framework, the support of mentors and digital tools to increase transparency and participation in the budgeting process. “The new website demonstrates how to organize the missions and actions of the new leaders, empower the civilian society so that they may be able to monitor public spending and give access to both academics and journalists to budgeting data of cities”, says Lucas Ansei, developer and one of the mentors of the new website.
According to Thiago Rondon, coordinator of the OK Brazil team, the mentors will have a fundamental role to the formation of the leaders. “They’re specialists with experience on the matter at hand and will support the leaders with online conferences that will offer directions so that the impact of the actions of these new leaders is meaningful.”
Another goal of this new phase of the project is to reach out to city mayors all over the country with the intention to get them to both sign the Public Spending Brazil Commitment Letter and realize the concrete actions foreseen in the letter.
Be a leader of the Open Spending project in 2017
According to Thiago, there will be an initial agenda of action that functions like a step-by-step manual so that anyone can help to increase the transparency in the city where they reside. “We want to empower the people so that they may do that on their own. To potentialize the divulgation, we will have local leaders in pilot cities that will have a direct support from the OK Brazil.”
Those who want to participate as a local leader of the Public Spending project can do so on the website. During this first phase, the OK Brazil team will select 15 local leaders based on their answers on the registration form.
Users have high expectations these days. The hours spent in elegant web apps like Netflix and Spotify seem to be sharpening the collective sense of design. What was once the pinnacle is now the convention, and as Don Norman said, “Conventions are slow to be adopted and, once adopted, slow to go away.” So we thought it would be fun to emulate some of our favorite sites in a lightweight concept discovery layer we call Libre. Below are some of the expectations we prioritized in the design. 1
#1: Things worth doing also look cool
First, we wanted to elevate books to the same “cool status” of other media. Thanks to Netflix and Spotify, that meant choosing a dark theme with white lettering and neon trim. Because of the ready association with the national library symbol, we chose blue for the secondary color.
#2: The most useful things are also the most visible
The intent in a known-item search (33-60% of all queries 2) is rapid visual confirmation, so we highlighted title, author, and cover image. In more serendipitous browsing, the intent is evaluation, so average rating and a synopsis are prioritized second. 3
#3: All the answers are here
Several friends of mine have revealed, at one point or another, that they didn’t know the library was free. While this can seem shocking, it’s bad design to assume that the user knows everything they need: immigrants may never have had access to a public library before, and the less tech-savvy might need to know that borrowing ebooks is legal. Hence, we avoided jargon like “Place Hold,” listed requirements, and explained the basic premise of a library in fine print beneath the main call-to-action.
#4: Browsing is always assisted
Other sites deliver personalized recommendations by capturing reams of personal data. Content-based recommendations like “Nebula Award Winners” or “NYT Bestsellers – Fiction” assist users in a similar way, though. Offering a compelling alternative is more important at the library than anywhere else online, since the title a user came looking for could be out on loan already. We wanted to keep our users from leaving in frustration if they encountered an unavailable title.
#5: I can bring friends
A site without sharing is a city without roads. Even if the features aren’t used too often, we decided that it was important to offer up multiple options for users to save, share, and otherwise show off their discoveries. We distinguish subtly between casual users, who might know to post or tweet, and the power user, who may want to embed a free link on his book review blog, for instance.
1: Our work in this article focuses on a popular reading use case, and will therefore seem more applicable to public libraries. Still, we hope our friends in academia get something out of it too.
2: EBSCO and Ex Libris are at odds over this figure. EBSCO says “Just under 30” and Ex Libris “over 50.” Both of them exclude author searches from their definition of “known-item” entirely, which seems to me a mistake. Often an author search is an easier route to a known item: for instance, when the title is so long as to be annoying to type or so short as to be ambiguous. Therefore, I inflate their estimates by about 5%.
3: Notably absent are Format and Availability. These are currently displayed after the user clicks “See at the Library.” A more robust implementation might have them both appear on the page.
It can be difficult to have a conversation in Twitter but people somehow seem to manage. You can reply to someone’s tweet, and other people can reply to your replies, which forms a conversation thread of sorts. But the display of the thread is difficult to interpret.
What’s worse is that there is no Twitter API call to get the replies to a given tweet. If you have the JSON for a tweet in hand you can use the in_reply_to_status_id property to fetch the tweet that it is responding to. But the converse is not true: there is no straightforward way to get the tweets that are in response to a given tweet. If I’m wrong about that please let me know. For a much more thorough discussion and analysis of these constraints see Alexander Nwala’s Tweet Visibility Dynamics in a Tweet Conversation Graph.
It’s a bit of a hack but you can use Twitter’s Search API to programmatically scan through tweets directed at a given user (e.g. to:barackobama), and inspect them to see if any are in response to a given tweet. You can also stop scanning when you arrive at tweets that are older than the tweet you are looking for responses to, since to my knowledge it’s impossible to reply to a tweet from the future. Yeah, that was my dry attempt at a joke. The big caveat here is that Twitter’s Search API only allows you to retrieve tweets from the last week. So this technique will only work for fetching conversation threads from the last week.
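The scanning heuristic described above can be sketched in a few lines of Python. This is only an illustration, not twarc's actual code: the function name `find_replies` is made up, the search results are assumed to be already-fetched tweet dictionaries ordered newest first, and it leans on the assumption that tweet IDs increase over time, so a smaller ID means an older tweet.

```python
def find_replies(target_id, search_results):
    """Return the tweets in search_results that reply directly to target_id.

    search_results is assumed to be an iterable of tweet dicts, newest first,
    each carrying the 'id' and 'in_reply_to_status_id' keys found in tweet JSON.
    """
    replies = []
    for tweet in search_results:
        # Assumption: IDs grow over time, so once we reach the target's ID
        # (or anything smaller) nothing further can be a reply to it.
        if tweet["id"] <= target_id:
            break
        if tweet.get("in_reply_to_status_id") == target_id:
            replies.append(tweet)
    return replies
```

In practice the `search_results` would come from a search for tweets directed at the author of the target tweet (the to:barackobama style query mentioned above), so the one-week limit of the Search API still applies.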
In the Documenting the Now project we are building tools to help researchers study Twitter. We’ve added a command to twarc that performs this heuristic to rebuild a given reply thread for a given tweet identifier. So to get the replies to this tweet:
“let’s make this shit huge https://t.co/iP8IOY3CqB” (laura olin, January 25, 2017)
you can run this command:
% twarc replies 82407791092769177 > replies.json
This will only get the initial set of replies to the tweet. If you want to get the entire conversation thread you can use the --recursive option:
% twarc replies 82407791092769177 --recursive > replies.json
That will get the replies to the replies, and will also walk up the conversation chain if the supplied tweet identifier is itself a reply to another tweet. In addition it will follow tweets that are quotes.
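To make the recursive behaviour concrete, here is a rough sketch of the traversal over tweets that have already been collected. It is not how twarc is implemented: `collect_thread` and `tweets_by_id` are hypothetical names, and the only tweet fields it relies on are `id`, `in_reply_to_status_id` and `quoted_status_id`.

```python
from collections import deque


def collect_thread(start_id, tweets_by_id):
    """Return the IDs in the conversation containing start_id, root first.

    tweets_by_id is a hypothetical in-memory dict mapping tweet ID -> tweet dict.
    """
    # Walk up the reply chain until we reach a tweet with no known parent.
    root = start_id
    visited = {root}
    while True:
        parent = tweets_by_id[root].get("in_reply_to_status_id")
        if parent is None or parent not in tweets_by_id or parent in visited:
            break
        root = parent
        visited.add(root)

    # Index every tweet under the tweet it replies to or quotes.
    children = {}
    for tweet in tweets_by_id.values():
        for key in ("in_reply_to_status_id", "quoted_status_id"):
            parent = tweet.get(key)
            if parent is not None:
                children.setdefault(parent, []).append(tweet["id"])

    # Walk down breadth-first, collecting replies to replies and quotes.
    thread, queue, seen = [], deque([root]), {root}
    while queue:
        tweet_id = queue.popleft()
        thread.append(tweet_id)
        for child in children.get(tweet_id, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return thread
```

The `seen` set guards against visiting a tweet twice, which could otherwise happen when a tweet both replies to and quotes the same parent.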
To demonstrate that it’s working we’ve added a little utility called network.py that will read a set of tweets and write out the network of conversation as a GEXF for loading into Gephi, or DOT for use with Graphviz or as a standalone HTML file that uses D3 to visualize the conversation in your browser. Here’s how you run it:
% ./network.py replies.json replies.html
and here’s what the D3 visualization looks like for that tweet above. Try clicking on the nodes in the graph to see the tweets that the node represents. You can see the quote is colored yellow, and the original tweet (the one with no parent) is colored red.
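For readers curious how a set of tweets becomes a graph, the sketch below writes a small DOT description using the same coloring idea (red for a tweet with no parent, yellow for quotes). It is an illustration only and not the code in network.py: the function name `to_dot` and the gray fill for ordinary replies are assumptions.

```python
def to_dot(tweets):
    """Build a Graphviz DOT string from a list of tweet dicts."""
    lines = ["digraph conversation {"]
    for tweet in tweets:
        reply_to = tweet.get("in_reply_to_status_id")
        quoted = tweet.get("quoted_status_id")
        if quoted is not None:
            color = "yellow"      # quote tweets
        elif reply_to is None:
            color = "red"         # original tweet: no parent
        else:
            color = "lightgray"   # ordinary replies (color is an assumption)
        lines.append(f'  "{tweet["id"]}" [style=filled, fillcolor={color}];')
        # Draw an edge from each parent to this tweet.
        for parent in (reply_to, quoted):
            if parent is not None:
                lines.append(f'  "{parent}" -> "{tweet["id"]}";')
    lines.append("}")
    return "\n".join(lines)
```

The resulting string can be saved to a file and rendered with Graphviz.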
Paul Butler also recently added the ability to drag and drop a file of tweets generated with the twarc replies command into his Treeverse (https://paulgb.github.io/Treeverse/). Treeverse is a Chrome plugin which provides a much more usable display of a conversation thread. Here’s a screenshot of looking at that same set of replies.
The nice thing about the D3 visualization is that it’s possible to restyle the presentation using CSS. You can also use it to visualize the network of tweets that were not acquired using the replies command. For example here is a visualization that was generated from a search for the #datarefuge hashtag a few days ago. I recorded it as a video on a large screen because there were so many nodes.
If you get a chance to try any of this or have any thoughts about it I’d love to hear from you.
After four days of productive committee meetings and sessions at ALA’s Midwinter Meeting in Atlanta, the opportunity to see real libraries in action was a welcome change of scenery. Monday afternoon Office for Information Technology Policy (OITP) staff (plus me) were given a tour of several libraries in the Cobb County Public Library System (CCPLS) hosted by a CCPLS branch manager (and longtime OITP Advisory Committee member) Pat Ball and CCPLS Director Helen Poyer.
The 17 branches of the CCPLS offer a gamut of services. Staff at the county’s main location, Switzer Library, enthusiastically described programs ranging from job skills training (in partnership with the local Jewish Family Services) to virtual reality technology (thanks to an IMLS grant) to falls prevention workshops for older adults (in collaboration with Wellstar Health System). A month of daily blog posts wouldn’t suffice to recount the many ways that CCPLS successfully engages other organizations to serve the needs of people in their communities. But the CCPLS program that captured my attention the most was their “Girls Who Code” club.
Over the past year OITP has been working to promote coding and other programs designed to foster computational thinking in youth, particularly through the Libraries Ready to Code project. More than half of OITP’s sessions at ALA Midwinter were related to coding. Lucky for us, CCPLS’s “Girls Who Code” meet every Monday evening, so we had a chance to meet some of them as they worked on their original project. As Stratton Library volunteer Ambrey McWilliams explained to us, the girls started by brainstorming issues of concern in their community and then came up with a way to build awareness of one issue through a coding project.
The issue they chose: texting while driving. The tool: a game hosted on an original website that requires players to resist various distractions while “driving.”
As we chatted with the girls, several aspects of the project struck me:
- The girls involved range in age from 12 to 17 and come from public, private and home schools. One girl – a home-schooler – traveled an hour each way to be part of this diverse group because it was the coding club closest to her home. Through coding (and eating pizza) together, a sense of community is forming amongst these girls leaning over each other’s computer screens.
- Their project emerged from a genuine conversation among this diverse group of girls about the needs they identified in the wider community. Theirs is a mission-driven endeavor. (In addition to the issue of texting while driving, they had considered problems like bus safety and animal treatment.)
- The many phases of the project (building the website, creating a PSA, designing characters, writing distractors) require teamwork and scaffolding, so having a committed volunteer to help guide the project is key. Stratton’s “Girls Who Code” are fortunate to have a volunteer that codes professionally and also has the skills to break down the task into manageable parts and facilitate the group’s discovering solutions to challenges of completing the task – which is a key element of computational thinking.
Thanks to the dedicated professionals at Stratton Library and the county that funds their innovative programs, these girls are learning skills that will serve them well in their future careers – not only in tech, but in any profession. As a recent OITP report states, as libraries get ready to code, “communities will see young people who are ready to take on their futures, who have robust career options, and who guarantee the economic and social vitality of the cities, towns and reservations in which they live.”
Yesterday, President Donald Trump issued an executive order to enhance "Public Safety in the Interior of the United States".
Of interest here is section 14:
Note that this executive order does not apply to the Library of Congress, an organ of the legislative branch of the US government. Nevertheless, it demonstrates the vulnerability of policy-based privacy. Who's to say that Congress won't enact the same restrictions for the legislative branch? Who's to say that Congress won't enact the same restrictions on any website, library or information system that operates in multiple states?
Lawyering privacy won't work any more. Librarianing privacy won't work any more. We need to rely on engineers to build privacy into our websites, libraries and information systems. This is possible. Engineers have tools such as strong cryptography that allow privacy to be built into systems without compromising functionality. It's not that engineers are immune from privacy-breaking mandates, but it's orders of magnitude more difficult to outlaw privacy engineering than it is to invalidate privacy policies. A system that doesn't record what a user does can't produce user activity records. Some facts are not alternativable. Math trumps Trump.