planet code4lib


Ed Summers: Twitter and Tear Gas

Thu, 2017-10-12 04:00

Twitter and Tear Gas: The Power and Fragility of Networked Protest by Zeynep Tufekci. New Haven, Yale University Press, 2017, xxxi + 326 pp. (hardcover), 978-0-300-21512-0.

Originally published in Internet Histories.

In August 2014 I took part in a panel conversation at the Society of American Archivists meeting in Washington DC that focused on the imperative for archivists to interrogate the role of power, ethics and regulation in information systems. The conference itself stands out in my memory, because it began on 10 August, the day after Michael Brown was killed by police officer Darren Wilson in Ferguson, Missouri. I distinctly remember the hand that shot up immediately during the Q&A period to ask what, if anything, we will remember of the voices from Ferguson in social media, that raised awareness of the injustice that had occurred there. Before anyone had much of a chance to respond another voice asked whether anyone had seen the blog post about how radically different Twitter and Facebook’s presentations of Ferguson were. The topics of power, ethics and regulation were not simply academic subjects for discussion; they were demands for understanding from information professionals actively engaged in the work of historical production.

The blog post mentioned that day was What Happens to #Ferguson Affects Ferguson by Zeynep Tufekci. It was published on the social media platform Medium, as the sustained protests in Ferguson began to propel the hashtag #BlackLivesMatter into many Twitter timelines and newsrooms around the world. Like so much of her work, Tufekci’s post asked her readers to think critically about the algorithmic shift we have been witnessing in our media and culture since the advent of the web and the rise of social media. Tufekci is a consummate public scholar, who uses online spaces like her blog, Twitter, Medium, TED talks and editorials in The Atlantic and The New York Times to advance a crucial discussion of how the affordances of information technology are both shaped, and being shaped, by social movements and political infrastructures. It is a pivotal time for scholars to step out from the pages of academic journals and into the World Wide Web spaces that are grappling with the impact of post-truth politics and fake news. It is into this time and place that Tufekci’s first book Twitter and Tear Gas: The Power and Fragility of Networked Protest is launched.

Tufekci’s book is divided into three main parts: 1) Making a Movement, 2) A Protester’s Tools, and 3) After the Protests. While these suggest a chronological ordering to the discussion, the different parts, and the ten chapters found within them, reflect a shifting attention to the specifics of networked social movements. Part 1 provides the reader with a general discussion of how the networked public sphere operates with respect to social movements. This is followed by Part 2, which takes a deeper dive into the specific affordances and sociotechnical logics of social media platforms such as Twitter, Facebook and Google. And finally, Part 3 integrates the previous discussion by articulating a theory for how social movements function in, and through, online spaces.

Throughout the book Tufekci focuses on the specifics of protest and counter-protest, while stressing that social media spaces are not disembodied and virtual phenomena, but are actual, contingent configurations of people, technology and power. In teasing out the dimensions of the networked public sphere Tufekci reminds me of Kelty’s concept of a recursive public in which the public’s participants are actively engaged in the maintenance, modification and design of the technical and material means that sustain the public itself (Kelty, 2008). In many ways Twitter and Tear Gas hacks the sociopolitical systems that it describes. It’s no mistake that the book is licensed with the Creative Commons and is freely downloadable from its companion website.

2013 Taksim Square by Fleshstorm

Prior to her academic career, Tufekci worked as a software developer at IBM where she first encountered the information infrastructure we call the Internet. You can sense this training and engagement with practice in her work which always seems to be pushing up against, but not overstepping, the art of what is possible. As a sociologist she brings the eye of an ethnographer to her study of protest. Tufekci is not a distant observer, but a participant, with actual stakes in the political outcomes she describes. She is pictured on the dust jacket wearing a helmet to protect her from tear gas canisters that were shot into the crowd that she was a part of in the Gezi Park protests. The book sits on the solid foundations of her own experience as well as the experiences of activists and organisers that she interviews. But Twitter and Tear Gas also significantly engages with sociological theories to bring clarity and understanding to how social media and social movements are co-produced.

In the pages of Twitter and Tear Gas you will find scenes of protests from around the world that are put into conversation with each other: from Zapatista solidarity networks, to the disruption of the World Trade Organization in Seattle, the global anti-war protests after 9/11, to Occupy in Zuccotti Park, the Egyptian Revolution in Tahrir Square, the Gezi Park protests in Istanbul, the Indignados in the Plaza del Sol, the Umbrella Movement in Hong Kong, and BlackLivesMatter in Ferguson, Missouri. Twitter and Tear Gas functions as a historical document that describes how individuals engaged in political action were empowered and inextricably bound up with social media platforms. While it provides a useful map of the terrain for those of us in the present, I suspect that Twitter and Tear Gas will also be an essential text for future historians who are trying to reconstruct how these historical movements were entangled with information technology, when the applications, data sources and infrastructures no longer exist, or have been transformed by neglect, mergers and acquisitions, or the demands for something new, into something completely different. Even if we have web archives that preserve some “sliver of a sliver” of the past web (Harris, 2002) we still need to remember the stories of interaction and the technosocial contingencies that these dynamic platforms provided. Despite all the advances we have seen in information technology a book is still a useful way to do this.

One of the primary theoretical contributions of this text is the concept of capacity, or a social movement’s ability to marshal and effect narrative, electoral and disruptive change. Tufekci outlines how the affordances of social media platforms make possible the leaderless adhocracy of just-in-time protests, and how these compare to our historical understanding of the African-American Civil Rights Movement in the United States. The use of hashtags in Twitter allows protesters to communicate at a great speed and distance to mobilise direct action in near real time. Planning the Civil Rights Movement took over a decade, and involved the development of complex communication networks to support long term strategic planning.

Being able to skip this capacity building phase allows networked social movements to respond more quickly and in a more decentralised fashion. This gives movements currency and can make them difficult for those in power to control. But doing so can often land these agile protests in what Tufekci calls a tactical freeze, where, after an initial successful march, the movement is unable to make further collective decisions that will advance their cause. In some ways this argument recalls Gladwell (2010) who uses the notion of weak ties (Granovetter, 1973) to contend that social media driven protests are fundamentally unable to produce significant activism on par with what was achieved during the civil rights era. But Tufekci is making a more nuanced point that draws upon a separate literature to make her argument, notably the capacity in development work of Sen (1993) and the capability theory of justice of Nussbaum (2003). Tufekci’s application of these concepts to social movements, and her categories of capacity, combined with the mechanics of signaling by which capacities are observed and responded to, operate as a framework for understanding why we cannot use simple outcome measures, such as numbers of people who attend a protest, when trying to understand the impact of networked social movements. For those who are listening she is also pointing to an area that is much in need of innovation, experimentation and study: tools and practices for collective decision-making that will allow people to thaw these tactical freezes.

Another significant theoretical thread running through Twitter and Tear Gas concerns the important role that attention plays in understanding the dynamics of networked protest. Social media is well understood as an attention economy, where users work for likes and retweets to get eyes on their content. Social and financial rewards can follow from this attention. While networked protest operates in a similar fashion, the dynamics of attention can often work against its participants, as they criticise each other in order to distinguish themselves. Tufekci also relates how the affordances of advertising platforms such as Google and Facebook made it profitable for Macedonian teenagers to craft and spread fake news stories that would draw attention away from traditional news sources, generate clicks and ad revenue, and as a side effect, profoundly disrupt political discourse.

Perhaps most significant is the new role that attention denial plays in online spaces, as a tactic employed by the state and other actors seeking to shape public opinion. Tufekci calls this the Reverse-Streisand Effect, since it uses the Internet to funnel attention to topics other than a particular topic at hand. She highlights the work of King, Pan, & Roberts (2013) that analysed how China’s so-called 50 Cent Army of web commenters shapes public opinion not simply by censoring material on the web, but by drawing attention elsewhere at key moments. Social media platforms are geo-political arenas, where bot armies are deployed to drown out hashtags and thwart communication, or attack individuals with threats and volumes of traffic that severely disrupt the target’s use of the platform. When people’s eyes can be guided, or pushed away, censorship is no longer needed. It is truly chilling to consider the lengths that those in power, or seeking power, might stoop to, in order to provide these events when needed.

Friday, Day 14 of Occupy Wall Street by David Shankbone

As significant as these theoretical contributions are, it is Tufekci’s personal voice combined with flashes of insight that I remember most from Twitter and Tear Gas. One such detail, the use of Occupy’s human microphone to amplify speakers’ voices and shape speech, is a poignant metaphor for Twitter’s capacity for amplifying short message bursts that cascade through the network as retweets. In another Tufekci considers why so many protest camps set up libraries, and connects the work being done in social media to the work of pamphleteers throughout history. She describes the surreal experience of watching pastel hearts float across Periscope videos from Turkish Parliamentarians that were preparing to be bombed during an attempted coup. Near the end of the book she draws an analogy between the rise of fake news fueled by social media, and the ways in which Gutenberg’s printing press escalated the Catholic Church’s distribution of indulgences, opening itself up to the criticism found in Luther’s 95 theses–which were also printed. The stories’ work is generative of a humanistic outlook that does not deny or celebrate big data:

There is no perfect, ideal platform for social movements. There is no neutrality or impartiality–ethics, norms, identities, and compromise permeate all discussions and choices of design, affordances, policies, and algorithms on online platforms. And yet given the role of these platforms in governance and expression, acknowledging and exploring these ramifications and dimensions seems more important than ever. (p. 185)

In fact, saying that Tufekci’s book has an explicit narrative arc is an oversimplification. It functions more like a fabric that weaves theory, observation and story, as topics are introduced and returned to later; there is no set chronology or teleology that is being pursued. On finishing the book it is clear how the concepts of attention and capacity are present throughout. But Tufekci makes these theoretical connections not with over-abstraction and heavy citation, but by presenting scenes of protest where these concepts are being enacted. While there are certainly references to the supporting literature, the text is not densely packed with them. Finer theoretical manoeuvres are reserved for the endnotes, and do not overwhelm the reader as they move through the text. If you are teaching a course that surveys communications, sociology or the politics of social media platforms and information infrastructures more generally, Twitter and Tear Gas belongs on your syllabus. Your students will thank you: they can download the book for free, they can follow Tufekci on Twitter and Facebook, and her book speaks directly to the socio-political moment we are all living in.


Gladwell, M. (2010). Small change: Why the revolution will not be tweeted. The New Yorker. Retrieved from

Granovetter, M. S. (1973). The strength of weak ties. American Journal of Sociology, 78(6), 1360–1380.

Harris, V. (2002). The archival sliver: Power, memory, and archives in South Africa. Archival Science, 2(1-2), 63–86.

Kelty, C. M. (2008). Two bits: The cultural significance of free software. Duke University Press. Retrieved from

King, G., Pan, J., & Roberts, M. E. (2013). How censorship in China allows government criticism but silences collective expression. American Political Science Review, 107(2), 326–343.

Nussbaum, M. (2003). Capabilities as fundamental entitlements: Sen and social justice. Feminist Economics, 9(2-3), 33–59.

Sen, A. (1993). The quality of life. In M. Nussbaum & A. Sen (Eds.), The quality of life. Oxford: Clarendon Press.

District Dispatch: Higher education reauthorization on Congressional fall agenda

Wed, 2017-10-11 20:25

On the fall agenda for Congress is the reauthorization of the Higher Education Act (HEA). HEA was originally enacted in 1965 during the Lyndon Johnson Administration and was last reauthorized in 2008. While HEA has received significant interest in the past few sessions of Congress, its passage has stalled under partisan rancor.

U.S. President Lyndon B. Johnson signs the Higher Education Act Nov. 8, 1965. (AP Photo)

Even though HEA has been operating without being reauthorized, its reauthorization is important because it sends a message to the Appropriators that the program is a priority for Congress. Under the powers of Congress, a program or agency is “authorized” to operate and exist. Most program authorizations are designed to expire every few years and must be reauthorized. The process to reauthorize a program allows Congress an opportunity to examine if the program needs to be changed, modernized or possibly sunset (e.g., the Board of Tea Appeals, Board of Economic Warfare, etc.).

Authorizations can also include long-term spending plans for a program. However, an authorization is not necessary for a program to receive federal funding, nor does it guarantee a level of funding. Appropriations bills determine independently the level of funding a program is to receive in a given year. An unauthorized program may continue to receive funding, but some “fiscal hawks” in Congress are increasingly threatening to sunset unauthorized programs.

Many of the provisions, or Titles, of HEA will have minimal direct impact on libraries, but a few key areas warrant attention from the library community. How Congress views these programs may impact libraries at colleges and universities, particularly in two areas:

Title IV of HEA authorizes a broad array of aid programs to assist students in financing a higher education. The programs authorized under this title are the primary sources of federal aid to support higher education. Students who work in libraries or are enrolled in degree programs such as Master of Library and Information Science (MLIS) programs may qualify for loan relief. The two most impactful HEA authorized programs for libraries are the Public Service Loan Forgiveness (allows debt forgiveness for borrowers working in public service careers for 10 years, including libraries), and the Perkins Loan Cancellation (allows loan forgiveness for qualifying borrowers who work in school or tribal libraries or other educational settings).

Titles III and V authorize grants to higher education institutions that serve a high number of low-income and minority students (including Historically Black Colleges, Tribal Colleges and Universities, and Hispanic-Serving Institutions). These schools can utilize federal grants to meet a range of needs, including the purchase of library books and materials and the construction, renovation and improvement of classrooms and libraries. ALA opposes any efforts to reduce support for underserved students.

Programs to support higher education libraries and MLIS students are valuable assets at colleges and universities and support the mission of ALA. HEA reauthorization is likely to consume much of the higher education agenda for months, and the ALA Washington Office will keep you informed as these issues develop.


The post Higher education reauthorization on Congressional fall agenda appeared first on District Dispatch.

Meredith Farkas: The ballad of the sad instruction librarian

Wed, 2017-10-11 19:22

It’s been a busy Fall term so far and I haven’t had much time to spend on Twitter, but I usually check it first thing every morning. When I did one day last week, this thread caught my eye:

Sitting in a FB thread of professors complaining (nicely) about unqualified librarians doing shitty instruction sessions. They’re not wrong.

— Archivist Wasp (@nnschiller) October 5, 2017

Of course I want to “NOT ALL LIBRARIANS!” but defensiveness never won me an argument. Plus, they’re not wrong.

— Archivist Wasp (@nnschiller) October 5, 2017

So I apologize & say most of us have developed reflective & skilled pedagogical practices, but I’m full of shit, aren’t I?

— Archivist Wasp (@nnschiller) October 5, 2017

This just made me feel really sad, particularly that Nick felt he had to apologize for us and that he has so little confidence in librarians’ ability to teach. I’m certainly not going to deny that there are bad library instructors, but I think it’s a lot more complicated than that. I also find it funny that when people talk about the quality of library instruction, they always assume that they are the good ones (not just you Nick, but all of us). How do we really know that? I have never assumed that I’m great at teaching. I know that I’ve improved, based on assessments I’ve done and how students and faculty respond to my teaching, but I want to keep improving. If you think you’re a great instructor already and don’t need to improve, maybe you’re the problem.

And even great instruction librarians have awful sessions. This happens to disciplinary faculty too; I’ve had conversations with friends who teach outside of libraries and we all have horror stories. It sucks that one bad session can sour a disciplinary faculty member on library instruction entirely, especially when they should recognize that they’ve probably had bad one-off teaching experiences too. We’re all human.

But, still, I agree that there are librarians who are bad at teaching and bad at engaging students. There are also plenty of librarians who never wanted to teach in the first place. At my first job, everyone taught, from the Head of Tech Services to the ILL Librarian to the Systems Librarian. There are a lot of libraries like that. But I also think that library schools don’t make it clear that teaching is part of being a librarian in so many library jobs, especially in academia. And in this job market, people will take jobs that include things they really don’t want to do so that they’re employed. If a librarian doesn’t want to teach, how motivated will they be on their own to try to improve?

Looking at my alma mater, Florida State, here is the recommended coursework if you’re going to focus on academic librarianship:

LIS 5603 Introduction to Information Services
LIS 5511 Management of Information Collections
LIS 5442 Information Leadership
LIS 5602 Marketing of Library and Information Services
LIS 5603 Introduction to Information Services
LIS 5485 Introduction to Information Technologies
LIS 5105 Communities of Practice (variable content areas)
LIS 5203 Assessing Information Needs
LIS 5241 International & Comparative Information Service
LIS 5260 Information Science
LIS 5263 Theory of Information Retrieval
LIS 5270 Evaluating Networked Information Services & Systems
LIS 5271 Research in Information Studies
LIS 5442 Information Leadership
LIS 5417 Introduction to Legal Resources
LIS 5474 Business Information Needs and Sources
LIS 5590 Museum Informatics
LIS 5602 Marketing Library and Information Services
LIS 5661 Government Information
LIS 5736 Indexing and Abstracting
LIS 5787 Fundamentals of Metadata Theory and Practice

Their only instruction-focused class, LIS 5524 Instructional Role of the Informational Specialist, is recommended for people focusing on “Reference” and “Youth Services,” not academic librarianship (yet somehow we all need Museum Informatics??? WTF FSU?). When I was at FSU, the class was 100% geared toward students planning to become Library Media Specialists so I didn’t take it. Based on the courses offered at FSU, I had NO IDEA instruction was a huge part of library work. I’m tremendously disappointed to see that they STILL aren’t doing more to promote courses on instruction and instructional design. Talk about out of touch!

So I think about the people who want to improve, but don’t have the time within their work day to develop professionally and improve or just don’t know where to start. Not everyone has the luxury of time and money to support their professional development. If you’re doing so much teaching and working at the reference desk that you don’t even have time to reflect on how classes went, how are you going to get better? And the fault for that does not lie with the librarian, but with the institution that doesn’t support their improvement.

In response to what Nick Schiller tweeted, my collaborator and friend Lisa Hinchliffe wrote:

I wish librs would stop hiring ppl to teach it aren't good at teaching. Hurts lib reputation+traumas librns. Align hiring w duties!

— Lisa Hinchliffe (@lisalibrarian) October 5, 2017

Here was my response to that:

There’s not being good and there’s being green (which oft. looks the same). Most libraries throw ppl into the deep end w/o training/support.

— Meredith Farkas (@librarianmer) October 5, 2017

I did a little informal survey on Twitter to get a sense of how many librarians were prepared in any way — either by their LIS programs or by their workplaces — to teach information literacy.

Did you receive training on effective #infolit instruction before you were expected to start teaching?

— Meredith Farkas (@librarianmer) October 5, 2017

That is tremendously depressing. I have worked at three different academic libraries and at none of them did I receive any training in how to teach. I could understand that more in my second and third jobs, because they had some expectation that I knew how to teach (though I really had to relearn how to teach when I came to PCC and started working with community college classes). In my first job, I was thrown into the deep end with zero support and am sure I did a crappy job early on, especially since all of my classes in college had been lecture-focused so I didn’t have any models for active learning-style classes. Over time, I read books and articles and tried to learn as much as I could about how to be an effective instructor. I started to incorporate more activities into my teaching so students were actually doing (and sometimes teaching!) instead of me being a sage on the stage all the time. But I got no help from my colleagues because, though they had more experience, they had not been taught how to teach effectively either. We were all just fumbling around.

When you think about how few workplaces actually prepare librarians to teach, it makes me wonder whether those places think teaching is something anyone can do or if they just don’t value instruction. Reference and instruction positions are usually seen as entry-level, which is ironic, since they have the most contact with our students and faculty. They, to a large extent, determine how the library is viewed by faculty, which is hugely important! Administrators who don’t have a formal training program for library instruction, do you think this work is something anyone off the street can do? Or do you not value it? If neither of those things is true, then why are you not setting your library staff/faculty up to succeed?

I think having a formal training program around information literacy instruction for all librarians who teach when they are new to an institution is critically important and I urge every library director, dean, and AUL to consider why they don’t have something to on-board librarians for teaching at their institution. If it’s for all new hires who teach, it then becomes something that is supportive and not accusatory. Even experienced librarians have something to learn and instruction looks different at different institutions with different goals and different student populations.

As a former head of instruction at two institutions, I know how much ego and defensiveness can crop up around efforts to support instruction librarians with their teaching. It can feel like a threat to some, like an accusation that they are doing a shitty job. I’ve written about my own efforts as an instruction coordinator to support instructional improvement and there are a lot of ways to approach this. But, really, we’re no different than disciplinary faculty who are often equally uncomfortable being observed and/or critiqued. The difference is that we always have an instructor watching us when we teach, while they don’t.

Sometimes it’s less about the quality of the instructor and more about the approach the instructor takes. Every librarian has their own style; their own way of teaching certain concepts that may be more or less embraced by the people for whom we are teaching. My colleagues are all great teachers, but we all have widely varying approaches. At my library, each of us has instructors who request us specifically. I’ve been warned about instructors I loved teaching for and I’ve had classes go badly with instructors other colleagues love working with. I know instructors are sometimes frustrated that they will get a different approach to the outcomes depending on who is assigned to the class, but, again, we’re no different than they are. They don’t all teach the same either.

Any librarian who teaches information literacy also knows that there are things completely out of their control that impact how the class goes. Sometimes it’s the culture of the class. I remember once working with three sections of a criminal justice class in a row with the same instructor. Two of them went really well and one just was flat. The students were really low-energy and didn’t want to participate in activities. The instructor told me that class was like that with her as well. For some classes, I get to sit in on part of their class before I provide instruction, which gives me an interesting little window into the culture of the class. I’ve seen instructors who keep their students in rapt attention and instructors whose students look comatose. Not surprisingly, the students in classes where they are more engaged by their instructor are also usually more engaged when I’m teaching them. The instructor can really set the tone. Of course, we can still screw it up and I have, but how the instructor manages their own classroom makes a big difference. I’ve sometimes felt like a rock star leaving a classroom when, really, so much of the credit for how it went should have gone to their regular instructor.

We often walk into classes with incorrect or incomplete information about what the students are working on, where they are, and what they struggle with because their instructor doesn’t communicate the information to us. We walk into classes where students know nothing about the assignment even though the instructor told us they’d have selected topics by then. Sometimes they are doing their assignment later in the term, but the instructor requested that day because they couldn’t be in class. Sometimes in response to asking about their goals, instructors just tell us to do the “usual library spiel” or the “usual library tour” as if such a thing existed. Some instructors make it really difficult for us to create a tailored lesson plan for their class and sometimes we end up having to throw that plan out the window because we were misinformed. I recently wrote numerous times to an instructor who’d requested instruction to find out what they were working on and never received a response to any of my inquiries!

Our time and expertise are sometimes disrespected. We get instructors who request instruction because they’re going to be out that day. Often, we don’t find that out until the last minute when the instructor doesn’t show up. We have instructors who sit in the back and check email instead of participating in the class or even just being present. We get instructors who have noisy one-on-one conferences with students in the classroom while we are teaching (which isn’t at all distracting, right?). We get instructors who don’t give us enough time to cover the outcomes they want us to focus on or who give us time but then take the first 20 minutes to cover class stuff without warning us in advance. I’ve had instructors show up on the wrong day with their classes. One got angry at me about it, even after I showed them the confirmation email I’d sent. I ended up teaching the class (totally unprepared) and she never requested instruction again. Back in my first job, I was just starting a jigsaw activity with an English 101 class when the instructor said “I don’t want them doing that. That doesn’t sound useful.” Can you imagine how demoralizing it is to be contradicted in that way when you are teaching?

These stories do not represent the majority of the classes I work with. I also work with plenty of fantastic instructors who I love to work with year after year. I have instructors who really collaborate with me around determining the shape of library support for their classes. I have instructors who are totally game to try new things, even if they don’t always go well (and they’re kind enough to sympathize when things don’t go well). I have instructors who adequately prepare the students for what I’m going to cover in the information literacy session — they set the table for me. I have instructors who are active participants in my information literacy sessions. I have instructors who show they appreciate what I do. And those classes, not surprisingly, tend to go better than the ones where the instructors are checked out, disrespectful, or dismiss the work we put into tailoring a session to their students.

As I’ve mentioned before, I teach a class for San Jose State University’s iSchool on library embedment, which is mainly focused on embedding information literacy instruction and support into the curriculum and beyond the curriculum. A lot of what we read early on is focused on librarian-faculty collaboration and students always notice that there is often a lot of misunderstanding and also ego on both sides (am I the only person who now hears Donald Trump every time I hear or write “on both sides?” — barf). Librarians often assume that instructors are not teaching information literacy themselves and if they are, they’re certainly not doing it well. Instructors often underestimate librarians, seeing them more as service providers who demo databases rather than as instructors, experts, or collaborators. You can see it in the language both groups use. I witness that disconnect every time I see someone requesting a “library tour” when they don’t mean “walking students around the library” but actually mean information literacy instruction.

I think both librarians and disciplinary faculty should try to better understand and respect what the other does. I think we should cut each other some slack when it doesn’t always go well and also be willing to offer feedback, which I know is difficult (both for librarians and disciplinary faculty), but makes things better. I have saved many students from bad and unclear assignments by gently questioning the instructor about them and I would love to know what I can do to make their class’ experience better.

The problem of people who are poor instructors or lack motivation can only be solved by the Library. More resources should go toward training and on-boarding librarians to teach. The library should be set up to support the continuing development of veteran instruction librarians too; we all have more to learn. This won’t fix everything — there will always be people who just don’t care and aren’t motivated to improve — but everyone I have worked with earnestly wants to teach well and really cares about students. If we all had better support, the vast majority of us would be better instructors; and that includes our disciplinary colleagues.

Image credit: UIUC admissions blog

LITA: Jobs in Information Technology: October 11, 2017

Wed, 2017-10-11 18:53

New vacancy listings are posted weekly on Wednesday at approximately 12 noon Central Time. They appear under New This Week and under the appropriate regional listing. Postings remain on the LITA Job Site for a minimum of four weeks.

New This Week

University of North Florida – Thomas G. Carpenter Library, Systems Librarian, Jacksonville, FL

Dayton Metro Library, Technology Development Manager, Dayton, OH

Metropolitan State University, Electronic Resources and Discovery Librarian, St Paul, MN

Visit the LITA Job Site for more available jobs and for information on submitting a job posting.

Terry Reese: MarcEdit Delete Field by Position documentation

Wed, 2017-10-11 16:13

I was working through the code and found an option that quite honestly, I didn’t even know existed.  Since I’m creating new documentation for MarcEdit 7, I wanted to pin this somewhere so I wouldn’t forget again.

A number of times on the list, folks will ask if they can delete say the second field in a field group.  Apparently, you can.  In the MarcEditor, select the Add/Delete field tool.  To delete by position, you would enter {#} to denote the position to delete in the find.

Obviously, this is pretty obscure – so in MarcEdit 7, this function is exposed as an option

To delete multiple field positions, you just add a comma.  So, say I wanted to delete fields 2-5, I would enter: 2,3,4,5 into the Field Data box and check this option.  One enhancement that I would anticipate a request for is the ability to delete just the last occurrence – this is actually harder than you’d think – in part, because it means I can’t process data as it comes in, but have to buffer it first, then process, and there are some reasons why this complicates things due to the structure of the function.  So for now, it’s by direct position.  I’ll look at what it might take to allow for more abstract options (like last).
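For anyone who wants to reason about what delete-by-position means, here is a minimal Python sketch. It is not MarcEdit's code: the record is assumed to be a simple list of (tag, value) pairs, and `parse_positions` and `delete_by_position` are hypothetical helper names. It also shows why "last" is harder: a streaming pass can count occurrences as they arrive, but it cannot know which occurrence is the final one until the whole record has been read.

```python
def parse_positions(text):
    """Turn a comma-separated entry such as '2,3,4,5' into a set of ints."""
    return {int(part) for part in text.split(",") if part.strip()}


def delete_by_position(fields, tag, positions):
    """Yield fields, skipping the Nth (1-based) occurrences of `tag`.

    `fields` is any iterable of (tag, value) pairs, so the record can be
    processed as it comes in without buffering.
    """
    seen = 0
    for field_tag, value in fields:
        if field_tag == tag:
            seen += 1
            if seen in positions:
                continue  # this occurrence is one we were asked to drop
        yield (field_tag, value)


record = [
    ("245", "Title"),
    ("650", "first"), ("650", "second"), ("650", "third"),
    ("650", "fourth"), ("650", "fifth"),
    ("700", "Name"),
]
# Delete the second through fifth 650 fields, keeping only the first.
result = list(delete_by_position(record, "650", parse_positions("2,3,4,5")))
```

Supporting "last" would mean collecting every field first, counting the occurrences of the tag, and only then deciding what to drop, which is the buffering step described above.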


Dan Cohen: Roy’s World

Wed, 2017-10-11 15:03

In one of his characteristically humorous and self-effacing autobiographical stories, Roy Rosenzweig recounted the uneasy feeling he had when he was working on an interactive CD-ROM about American history in the 1990s. The medium was brand new, and to many in academia, superficial and cartoonish compared to a serious scholarly monograph.

Roy worried about how his colleagues and others in the profession would view the shiny disc on the social history of the U.S., and his role in creating it. After a hard day at work on this earliest of digital histories, he went to the gym, and above his treadmill was a television tuned to Entertainment Tonight. Mary Hart was interviewing Fabio, fresh off the great success of his “I Can’t Believe It’s Not Butter” ad campaign. “What’s next for Fabio?” Hart asked him. He replied: “Well, Mary, I’m working on an interactive CD-ROM.”

Ten years ago today Roy Rosenzweig passed away. Somehow it has now been longer since he died than the period of time I was fortunate enough to know him. It feels like the opposite, given the way the mind sustains so powerfully the memory of those who have had a big impact on you.

The field that Roy founded, digital history, has also aged. So many more historians now use digital media and technology to advance their discipline that it no longer seems new or odd like an interactive CD-ROM.

But what hasn’t changed is Roy’s more profound vision for digital history. If anything, more than ever we live in Roy’s imagined world. Roy’s passion for open access to historical documents has come to fruition in countless online archives and the Digital Public Library of America. His drive to democratize not only access to history but also the historical record itself—especially its inclusion of marginalized voices—can be seen in the recent emphasis on community archive-building. His belief that history should be a broad-based shared enterprise, rather than the province of the ivory tower, can be found in crowdsourcing efforts and tools that allow for widespread community curation, digital preservation, and self-documentation.

It still hurts that Roy is no longer with us. Thankfully his mission and ideas and sensibilities are as vibrant as ever.

Open Knowledge Foundation: Remix public domain artworks: join the GIF IT UP 2017 competition

Wed, 2017-10-11 13:00

This blogpost has been adapted from the press release by Europeana.

Open Knowledge International has for many years advocated for the importance of open cultural data, which enables citizens from across the world to enjoy this material, understand their cultural heritage and re-use this material to produce new works of art. Some examples of this work include the OpenGLAM initiative that promotes free and open access to digital cultural heritage held by Galleries, Libraries, Archives and Museums, and The Public Domain Review, an online journal and not-for-profit project dedicated to promoting and celebrating the public domain in all its abundance and variety. Another great initiative encouraging the reuse of openly licensed cultural data is the GIF IT UP competition, which is open for contributions this month.

From 1-31 October, all GIF-makers, history nuts, cultural heritage enthusiasts and lovers of the internet are invited to take part in the fourth annual GIF IT UP competition. The competition encourages people to create new, fun and unique artworks from digitized cultural heritage material. A GIF is an image, video or text that has been digitally manipulated to become animated. Throughout the month, they can create and submit their own, using copyright-free digital video, images or text from Europeana Collections, Digital Public Library of America (DPLA), Trove, or DigitalNZ.

All entries help promote public domain and openly licensed collections to a wider audience, and increase the reuse of material from these four international digital libraries, including Europeana Collections. The contest is supported by GIPHY, the world’s largest library of animated GIFs.

The 2017 competition will have a special focus on first-time GIF-makers and introduce them to openly licensed content. A GIF-making workshop, providing tools and tutorials to help visitors create their first artworks, will be held on 14-15 October in cooperation with THE ARTS+, the creative business festival at the Frankfurt Book Fair.

One of this year’s contributions, via GIPHY

The jury, made up of representatives from GIPHY, DailyArt and Public Domain Review, will be awarding one grand prize winner with an Electric Object – a digital photo frame especially for GIFs – sponsored by GIPHY. Prizes of online gift cards will go to three runners-up as well as winners in a first-time GIF-makers category. Special prizes will be allocated in thematic categories: transport, holidays, animals and Christmas cards.

People are also invited to take part in the People’s Choice Award and vote on the competition website for their favourite GIF, which will receive a Giphoscope. All eligible entries will be showcased on the GIPHY channel dedicated to the competition, and promoted on social media with the hashtag #GIFITUP2017.

GIF IT UP started in 2014 as an initiative by the Digital Public Library of America (DPLA) and DigitalNZ, and has since become a cultural highlight. 368 entries from 33 countries are featured on the GIF IT UP Tumblr. In 2016, the grand prize was awarded to ‘The State Caterpillar’, created by Kristen Carter and Jeff Gill from Los Angeles, California, using source material from the National Library of France via Europeana. Nono Burling, who got awarded the 2016 People’s Choice Award for ‘Butterflies’, said: “I adore animated GIFs made from historic materials and have for many years. The first contest in 2014 inspired me to make them myself, and every year I try to improve my skills.”

Results of the 2017 competition will be announced in November on the GIF IT UP website and related social media.

DuraSpace News: Reminder: Board-Member-at-Large Nominations Accepted through Friday!

Wed, 2017-10-11 00:00
DuraSpace invites the Community to nominate the next Board-Member-at-Large!  Nominations accepted through Friday, October 13th.

Karen Coyle: Google Books and Mein Kampf

Tue, 2017-10-10 17:43
I hadn't looked at Google Books in a while, or at least not carefully, so I was surprised to find that Google had added blurbs to most of the books. Even more surprising (although perhaps I should say "troubling") is that no source is given for the book blurbs. Some at least come from publisher sites, which means that they are promotional in nature. For example, here's a mildly promotional text about a literary work, from a literary publisher:

This gives a synopsis of the book, starting with:

"Throughout a single day in 1892, John Shawnessy recalls the great moments of his life..." 
It ends by letting the reader know that this was a bestseller when published in 1948, and calls it a "powerful novel."

The blurb on a 1909 version of Darwin's The Origin of Species is mysterious because the book isn't a recent publication with an online site providing the text. I do not know where this description comes from, but because the entire thrust of this blurb is about the controversy of evolution versus the Bible (even though Darwin did not press this point himself) I'm guessing that the blurb post-dates this particular publication.

"First published in 1859, this landmark book on evolutionary biology was not the first to deal with the subject, but it went on to become a sensation -- and a controversial one for many religious people who could not reconcile Darwin's science with their faith."
That's a reasonable view to take of Darwin's "landmark" book but it isn't what I would consider to be faithful to the full import of this tome.

The blurb on Hitler's Mein Kampf is particularly troubling. If you look at different versions of the book you get both pro- and anti- Nazi sentiments, neither of which really belong  on a site that claims to be a catalog of books. Also note that because each book entry has only one blurb, the tone changes considerably depending on which publication you happen to pick from the list.

First on the list:
"Settling Accounts became Mein Kampf, an unparalleled example of muddled economics and history, appalling bigotry, and an intense self-glorification of Adolf Hitler as the true founder and builder of the National Socialist movement. It was written in hate and it contained a blueprint for violent bloodshed."
Second on the list:
"This book has set a path toward a much higher understanding of the self and of our magnificent destiny as living beings part of this Race on our planet. It shows us that we must not look at nature in terms of good or bad, but in an unfiltered manner. It describes what we must do if we want to survive as a people and as a Race."
That's horrifying. Note that both books are self-published, and the blurbs are the ones that I find on those books in Amazon, perhaps indicating that Google is sucking up books from the Amazon site. There is, or at least at one point there once was, a difference between Amazon and Google Books. Google, after all, scanned books in libraries and presented itself as a search engine for published texts; Amazon will sell you Trump's tweets on toilet paper. The only text on the Google Books page still claims that Google Books is about search: "Search the world's most comprehensive index of full-text books." Libraries partnered with Google with lofty promises of gains in scholarship:
"Our participation in the Google Books Library Project will add significantly to the extensive digital resources the Libraries already deliver. It will enable the Libraries to make available more significant portions of its extraordinary archival and special collections to scholars and researchers worldwide in ways that will ultimately change the nature of scholarship." Jim Neal, Columbia University
I don't know how these folks now feel about having their texts intermingled with publications they would never buy and described by texts that may come from shady and unreliable sources.

Even leaving aside the grossest aspects of the blurbs and Google's hypocrisy about its commercialization of its books project, adding blurbs to the book entries with no attribution and clearly not vetting the sources is extremely irresponsible. It's also very Google to create sloppy algorithms that illustrate their basic ignorance of the content they are working with -- in this case, the world's books.

David Rosenthal: IPRES 2017

Tue, 2017-10-10 15:00
Kyoto Railway Museum
Much as I love Kyoto, now that I'm retired with daily grandparent duties (and no-one to subsidize my travel) I couldn't attend iPRES 2017.

I have now managed to scan both the papers, and the very useful "collaborative notes" compiled by Micky Lindlar, Joshua Ng, William Kilbride, Euan Cochrane, Jaye Weatherburn and Rachel Tropea (thanks!). Below the fold I have some notes on the papers that caught my eye.

I have appreciated the Dutch approach to addressing problems ever since the late 70s, when I worked with Paul ten Hagen and Rens Kessner on the Graphical Kernel System standard. This approach featured in two of the papers:
  • How the Dutch prepared for certification by Barbara Sierman and Kees Waterman describes how six large cultural heritage organizations worked together to ease each of their paths up the hierarchy of repository certification from DSA to Nestor. The group added two preparatory stages before DSA (Initial Self-Assessment, and Exploratory Phase), comprising activities that I definitely recommend as a starting point. They also translated the DSA and Nestor standards into Dutch, enhanced some of the available tools, and conducted surveys and awareness-raising.
  • A Dutch approach in constructing a network of nationwide facilities for digital preservation together by Joost van der Nat and Marcel Ras reported that:
    In November 2016, the NCDD research on the construction of a cross-domain network of facilities for long-term access to digital Cultural Heritage in the Netherlands was rewarded the Digital Preservation Award 2016 in the category Research and Innovation. According to the judges the research report presents an outstanding model to help memory institutes to share facilities and create a distributed, nationwide infrastructure network for Digital Preservation.
    The NCDD didn't go all-out for either centralization or distribution, but set out to find the optimum balance for infrastructure spanning diverse institutions:
    Under the motto “Joining forces for our digital memory”, a research project was started in 2014 ... This project had the purpose to find out what level of differentiation between the domains offers the best balance for efficiency. Without collaboration, inefficiencies loom, while individual institutes continue to expand their digital archives and may be reinventing the same wheel over and over again. The project’s objective was and is to avoid duplication of work, and to avoid wasting time, money, and energy. Economies of scale make it easier for the many smaller Dutch institutes to profit from available facilities, services, and expertise as well. Policy makers can now ponder the question “The same for less money, or more for the same money?”.
I've blogged before about the important work of the Software Heritage Foundation. Software Heritage: Why and How to Preserve Software Source Code by Roberto Di Cosmo and Stefano Zacchiroli provides a comprehensive overview of their efforts. I'm happy to see them making two justifications for preserving open-source software that I've been harping on for years:
Source code is clearly starting to be recognized as a first class citizen in the area of cultural heritage, as it is a noble form of human production that needs to be preserved, studied, curated, and shared. Source code preservation is also an essential component of a strategy to defend against digital dark age scenarii in which one might lose track of how to make sense of digital data created by software currently in production.
But they also provide other important justifications, such as these two:
First, Software Heritage intrinsic identifiers can precisely pinpoint specific software versions, independently of the original vendor or intermediate distributor. This de facto provides the equivalent of “part numbers” for FOSS components that can be referenced in quality processes and verified for correctness ....

Second, Software Heritage will provide an open provenance knowledge base, keeping track of which software component - at various granularities: from project releases down to individual source files — has been found where on the Internet and when. Such a base can be referenced and augmented with other software-related facts, such as license information, and used by software build tools and processes to cope with current development challenges.
Considering Software Heritage's relatively short history the coverage statistics in Section 9 of the paper are very impressive, illustrating the archive-friendly nature of open-source code repositories.
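To make the idea of an "intrinsic identifier" concrete, here is a minimal Python sketch. It assumes the git-style blob hashing that Software Heritage's published identifier scheme uses for file contents. The function names are mine, and the `swh:1:cnt:` prefix is shown only for illustration rather than taken from the paper.

```python
import hashlib


def git_blob_sha1(content: bytes) -> str:
    """Hash file content the way git hashes a blob: header, NUL, then bytes."""
    header = b"blob %d\0" % len(content)
    return hashlib.sha1(header + content).hexdigest()


def content_identifier(content: bytes) -> str:
    """Build an identifier that depends only on the bytes of the file."""
    return "swh:1:cnt:" + git_blob_sha1(content)


# The same bytes always yield the same identifier, wherever they were found.
copy_from_vendor = b"int main(void) { return 0; }\n"
copy_from_mirror = b"int main(void) { return 0; }\n"
same = content_identifier(copy_from_vendor) == content_identifier(copy_from_mirror)
```

Because the identifier is computed from the content rather than assigned by a registry, anyone holding a copy can recompute it and check it. That is what makes it usable as a "part number" independent of the vendor or distributor.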

Emulation featured in two papers:
  • Adding Emulation Functionality to Existing Digital Preservation Infrastructure by Euan Cochrane, Jonathan Tilbury and Oleg Stobbe is a short paper describing how Yale University Library (YUL) interfaced bwFLA, Freiburg's emulation-as-a-service infrastructure to their Preservica digital preservation system. The goal is to implement their policy:
    YUL will ensure access to hardware and software dependencies of digital objects and emulation or virtualization tools by [...] Preserving, or providing access to preserved software (applications and operating systems), and pre-configured software environments, for use in interacting with digital content that depends on them.
    Yale is doing important work making Freiburg's emulation infrastructure easy-to-use in libraries.
  • Trustworthy and Portable Emulation Platform for Digital Preservation by Zahra Tarkhani, Geoffrey Brown and Steven Myers:
    provides a technological solution to a fundamental problem faced by libraries and archives with respect to digital preservation — how to allow patrons remote access to digital materials while limiting the risk of unauthorized copying. The solution we present allows patrons to execute trusted software on an untrusted platform; the example we explore is a game emulator which provides a convenient prototype to consider many fundamental issues.
    Their solution depends on Intel's SGX instruction set extensions, meaning it will work only on Skylake and future processors. I would expect it to be obsoleted by the processor-independent, if perhaps slightly less bullet-proof, W3C Encrypted Media Extensions (EME) available in all major browsers. Of course, if SGX is available, implementations of EME could use it to render the user even more helpless.
Always on the Move: Transient Software and Data Migrations by David Wilcox is a short paper describing the import/export utility developed to ease the data migration between versions 3 and 4 of Fedora. This has similarities with the IMLS-funded WASAPI web archive interoperability work with which the LOCKSS Program is involved.

Although they caught my eye, I have omitted here two papers on identifiers. I plan a future post about identifiers into which I expect they will fit:

HangingTogether: Beyond the Authorized Access Point?

Tue, 2017-10-10 13:00

That was the topic discussed recently by OCLC Research Library Partners metadata managers, initiated by Stephen Hearn of the University of Minnesota. Can we still insist on using the authorized access point as the primary identifier? It is scary to imagine that we have to build authorized access points for titles in a “work” focused environment. Other communities are putting together separate pieces of information to help select the correct name or title. Dates are not always the most informative choice for the user. Libraries receive an influx of records where we have no control over the authorized form of the name anyway. Other environments make use of identifiers. Wikipedia, IMDb and MusicBrainz differentiate entities and then prompt you for more information. We have an opportunity to work with a larger community.

Do we still need an authorized access point as a “primary identifier”?  Let’s distinguish identifiers from their associated labels. Access points rely on unique text strings to distinguish them from other access points. A unique identifier could be associated with an aggregate of other attributes that would enable users to distinguish one entity from another. Ideally, we could take advantage of the identifiers and attributes from other, non-library sources. Wikidata, for example, aggregates a variety of identifiers as well as labels in different languages, as pictured above.
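The distinction between identifiers and labels can be sketched as a small data structure. This is only an illustration in Python: the `Entity` class, its field names, and the identifier values are hypothetical placeholders, not records from Wikidata, VIAF, or any other source.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    # scheme -> identifier value; the values used below are placeholders
    identifiers: dict = field(default_factory=dict)
    # language or community code -> display label
    labels: dict = field(default_factory=dict)
    # other attributes that help a user tell entities apart
    attributes: dict = field(default_factory=dict)

    def label_for(self, community: str, default: str = "en") -> str:
        """Pick the label a given community prefers, without ranking forms."""
        return self.labels.get(community, self.labels.get(default, ""))

    def matches(self, scheme: str, value: str) -> bool:
        """Identity is decided by identifier, never by comparing label text."""
        return self.identifiers.get(scheme) == value


author = Entity(
    identifiers={"wikidata": "Q-PLACEHOLDER", "viaf": "VIAF-PLACEHOLDER"},
    labels={"en": "Tolstoy, Leo", "ru": "Толстой, Лев Николаевич"},
    attributes={"occupation": "novelist"},
)
```

Here no label is "authorized": each community sees the form it expects, while matching and linking rely on the identifiers alone.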

The library community has started to move towards the use of identifiers by adding identifiers in the $0 of heading fields. OCLC algorithmically added FAST (Faceted Application of Subject Terminology) headings, with their identifiers, to all WorldCat records that had an LC subject heading.  Other communities have started including VIAF (Virtual International Authority File) cluster identifiers in their entity descriptions. Providing contextual information is more important than providing one unique label. Labels could differ depending on communities—such as various spellings of names and terms, different languages and writing systems, and different disciplines—without requiring that one form be preferred over another.

Catalogers have long added value by supplying information about relationships. RDA attributes have spurred libraries to move toward contextualization. We now have ways of making that information more understandable to users. As those capabilities continue to evolve, the need for unique strings could diminish.

NACO is a valuable program but not everyone is able to contribute. Even in institutions that are NACO contributors, only staff who have received the requisite training can create LC/NACO authority records. The volume of names without authority control is increasing, especially as academic institutions commit to providing a comprehensive overview of their researchers’ output, often stored in separate local databases or scholar profile systems. NACO-level work isn’t sustainable beyond MARC records.

Could Wikidata be an alternative to contribute information about entities?  Adding names or information about entities into Wikidata could be a very low barrier way for non-NACO staff to supplement NACO contributions. For example, the University of Miami’s RAMP (Remixing Archival Metadata Project) generates Wikipedia pages out of archival descriptions (discussed in the 2014 OCLC Research Webinar,  Beyond EAD). Encouraging contributions to Wikidata could also tap the expertise within our communities.

Envisioning the future:  The authorized access point was designed for a closed, MARC-based environment. Its time has come and gone.  We already see examples of “identifier hubs” that aggregate multiple identifiers referring to the same entity. More work is needed to establish “same as” relationships among different identifiers and to add identifiers to our large legacy databases that can point to one or more of these “identifier hubs.” We need technology that can integrate the metadata from all the sources that generated the identifiers, filtered according to the context. We could start by focusing on identifiers rather than labels as a means to concatenate result sets. Greater functionality for identifiers would drive the value proposition for datasets that merge them and provide correlations among the various sources.
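One way to picture an "identifier hub" built from "same as" assertions is a union-find structure, in which any two identifiers asserted to be the same end up in one cluster. The Python sketch below is an illustration of that general technique, not a description of any existing hub. The class name and the identifier strings are hypothetical.

```python
class IdentifierHub:
    """Cluster identifiers that have been asserted to denote the same entity."""

    def __init__(self):
        self._parent = {}

    def _find(self, ident):
        # Register unseen identifiers as their own cluster root.
        self._parent.setdefault(ident, ident)
        while self._parent[ident] != ident:
            # Path halving keeps later lookups short.
            self._parent[ident] = self._parent[self._parent[ident]]
            ident = self._parent[ident]
        return ident

    def same_as(self, a, b):
        """Record that identifiers a and b refer to the same entity."""
        root_a, root_b = self._find(a), self._find(b)
        if root_a != root_b:
            self._parent[root_b] = root_a

    def cluster(self, ident):
        """Return every identifier known to be equivalent to ident."""
        root = self._find(ident)
        return sorted(i for i in list(self._parent) if self._find(i) == root)


hub = IdentifierHub()
hub.same_as("viaf:EXAMPLE-1", "wikidata:EXAMPLE-Q1")
hub.same_as("wikidata:EXAMPLE-Q1", "lcnaf:EXAMPLE-N1")
hub.same_as("viaf:EXAMPLE-2", "isni:EXAMPLE-2")
```

With the assertions above, looking up the `lcnaf` identifier returns all three linked identifiers, so a legacy record carrying only one of them could be joined to result sets keyed on any of the others.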

FOSS4Lib Recent Releases: Avalon Media System - 6.2.0

Tue, 2017-10-10 12:36

Last updated October 10, 2017. Created by Peter Murray on October 10, 2017.

Package: Avalon Media System
Release Date: Wednesday, October 4, 2017

LITA: Explore Reproducibility & Open Scholarship in a new LITA web course

Mon, 2017-10-09 19:28

Sign up today for 

Building Services Around Reproducibility & Open Scholarship

Instructor: Vicky Steeves, Librarian for Research Data Management and Reproducibility, a dual appointment between New York University Division of Libraries and NYU Center for Data Science
November 1 – November 22, 2017
Please note this course start date has been re-scheduled from a previous date.

As research across domains of study has become increasingly reliant on digital tools, the challenges in achieving reproducibility have grown. Alongside this reproducibility challenge are the demands for open scholarship, such as releasing code, data, and articles under an open license. Openness is an important step towards reproducibility, but it cannot be the end of the road. This class will focus on open scholarship and reproducibility as two distinct but connected topics.

Register here, courses are listed by date

This course will examine, cover and discuss:

  • The discourse around open scholarship
  • Best practices around use of open source tools, creating an open web presence, preparing research output for publication, and linking those outputs to more traditional publications.
  • The tools that both researchers and librarians are using to engage in open work.

View details and Register here.

Discover upcoming LITA webinars

Diversity and Inclusion in Library Makerspace
Offered: October 24, 2017

Taking Altmetrics to the Next Level in Your Library’s Systems and Services
Offered: October 31, 2017

Digital Life Decoded: A user-centered approach to cyber-security and privacy
Offered: November 7, 2017

Introduction to and JSON-LD
Offered: November 15, 2017

Questions or Comments?

For all other questions or comments related to the course, contact LITA at (312) 280-4268 or Mark Beatty,

District Dispatch: How fast is fast enough? Comments on FCC broadband deployment report

Mon, 2017-10-09 18:01

Since 1996, the Federal Communications Commission (FCC) has been required by Section 706 of the Telecommunications Act to periodically release a report assessing the country’s state of advanced telecommunications capability and to adopt measures to advance broadband deployment. Last Friday, we submitted comments to the FCC raising two issues particularly relevant to libraries and their public missions: first, the criteria and standards for broadband deployment to public institutions like libraries and schools; second, the role of mobile internet access in connecting consumers to information.

In our comments, we asked the FCC to maintain the benchmarks for broadband to libraries set in 2014 as part of the modernization of the E-rate program: for libraries serving fewer than 50,000 people the FCC recommended a minimum broadband speed of 100 Mbps; for libraries serving more than 50,000 people it recommended a speed of at least 1 Gbps. We also hope the FCC will work with us to find other metrics that might help our shared policy goals of ensuring well-connected anchor institutions.

In addition, we used our comments to share our view that mobile and fixed broadband access serve complementary purposes for people, but are not the same. Given our interest in ensuring peoples’ access to information, libraries have a vested interest in the quality of broadband access people have at home. Over 90 percent of public libraries offer their patrons access to commercial reference and periodical databases from thousands of sources, most offering that access to consumers at home. Increasingly, the content is multimedia, with a heavy reliance on streaming video.

In our view, the possibility that the FCC may consider mobile internet access as part of the universal deployment of advanced telecommunications capability is troubling because the capabilities of mobile service do not yet meet those of wired broadband access. Further, many services are subject to data caps, which will disproportionately hurt consumers with lower incomes.

Why does all this matter? FCC Chairman Ajit Pai recently suggested that the commission’s current standard for home broadband of 25 Mbps down and 3 Mbps up, defined under a previous FCC chairman, was perhaps unnecessarily high. Pai has proposed a mobile broadband standard of 10 Mbps down and 1 Mbps up. Rather than ensuring commercial ISPs are meeting consumers’ needs and holding carriers accountable to existing standards, the FCC may be choosing to “make the test easier” for those providers.

We are not alone in our concerns: there is a letter to Chairman Pai, supported by eight senators and 29 members of Congress, opposing his efforts to lower broadband Internet standards for millions of Americans. The full letter can be found here.

The FCC is likely to issue a report with its findings in the first part of next year.

The post How fast is fast enough? Comments on FCC broadband deployment report appeared first on District Dispatch.

David Rosenthal: OAIS & Distributed Digital Preservation

Mon, 2017-10-09 00:15
One of the lessons from the TRAC audit of the CLOCKSS Archive was the mis-match between the OAIS model and distributed digital preservation:
CLOCKSS has a centralized organization but a distributed implementation. Efforts are under way to reconcile the completely centralized OAIS model with the reality of distributed digital preservation, as for example in collaborations such as the MetaArchive and between the Royal and University Library in Copenhagen and the library of the University of Aarhus. Although the organization of the CLOCKSS Archive is centralized, serious digital archives like CLOCKSS require a distributed implementation, if only to achieve geographic redundancy. The OAIS model fails to deal with distribution even at the implementation level, let alone at the organizational level.
It is appropriate on the 19th anniversary of the LOCKSS Program to point to a 38-minute video about this issue, posted last month. In it Eld Zierau lays out the Outer OAIS - Inner OAIS model that she and Nancy McGovern have developed to resolve the mis-match, and published at iPRES 2014.

They apply OAIS hierarchically, first to the distributed preservation network as a whole (outer), and then to each node in the network (inner). This can be useful in delineating the functions of nodes as opposed to the network as a whole, and in identifying the single points of failure created by centralized functions of the network as a whole.

While I'm promoting videos, I should also point to an excellent video for a general audience about the importance of Web archiving, with subtitles in English.

DuraSpace News: Fedora Camp Texas: Last Chance to Register!

Mon, 2017-10-09 00:00

Join experienced trainers and Fedora gurus at Fedora Camp, co-hosted by University of Texas Libraries and Texas Digital Library, to be held October 16-18 at the Perry-Castañeda Library at the University of Texas, Austin.

Terry Reese: Saxon.NET and local file paths with special characters and spaces

Sun, 2017-10-08 04:27

I thought I’d post this here in case it can help other folks.  One of the parsers that I like to use is Saxon.Net, but within the .NET platform at least, it has problems doing XSLT or XQuery transformations when the files in question have paths with special characters or spaces (or if they reference files via xsl:include statements that live inside paths with special characters or spaces).  The question comes up a lot on the Saxon support site, and it sounds like Saxon is actually processing the data correctly: Saxon expects valid URIs, and a URI can’t contain spaces.  Internally, the URI is escaped, but when you resolve those escaped paths against a local file system, accessing the file fails.  So, what do I mean?  Here are two different types of paths that cause problems:

  • Path 1: c:\myfile\C#\folder1\test.xsl
  • Path 2: c:\myfile\C#\folder 1\test.xsl
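To make the escaping concrete, here is a minimal sketch that uses only System.Uri, with no Saxon involved. The class and method names (PathUriDemo, ToEscapedUri, Unescape) are made up for illustration. It prints the escaped URI that .NET builds for each of the two paths, and shows that unescaping turns a percent-escape like %20 back into the character that actually exists on disk.

```csharp
using System;

public static class PathUriDemo
{
    // Escaped URI form that System.Uri builds for an absolute local path.
    public static string ToEscapedUri(string localPath)
    {
        return new Uri(localPath, UriKind.Absolute).AbsoluteUri;
    }

    // Turns percent-escapes such as %20 back into literal characters.
    public static string Unescape(string escaped)
    {
        return Uri.UnescapeDataString(escaped);
    }

    public static void Main()
    {
        string[] paths =
        {
            @"c:\myfile\C#\folder1\test.xsl",
            @"c:\myfile\C#\folder 1\test.xsl"
        };

        foreach (string path in paths)
        {
            // Print the original path next to its escaped URI form.
            Console.WriteLine(path + " -> " + ToEscapedUri(path));
        }

        // No directory named "folder%201" exists on disk; the real name
        // only comes back after unescaping.
        Console.WriteLine(Unescape("folder%201"));
    }
}
```

If the escaped form is handed to the file system as-is, the lookup is for a name that does not exist, which matches the failures described here.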

When setting up a transformation using Saxon, you set up an XSLT transformer.  You can set this up using either a stream, like an XmlReader, or a URI.  But here’s the problem.  If you create the statement like this:

System.Xml.XmlReader xstream = System.Xml.XmlReader.Create(filepath);
transformer = xsltCompiler.Compile(xstream).Load();

The program can read Path 1, but will always fail on Path 2, and will fail on Path 1 if it includes secondary data (such as files referenced via xsl:include).  If, rather than using a stream, I use a Uri instance like:

transformer = xsltCompiler.Compile(new Uri(sXSLT, UriKind.Absolute)).Load();

Both paths will break.  On the Saxon list, there was a suggestion to create a sealed class, and to wrap the URI in that class.  So, you’d end up with code that looked more like:

transformer = xsltCompiler.Compile(new SaxonUri(new Uri(sXSLT, UriKind.Absolute))).Load();

public sealed class SaxonUri : Uri
{
    public SaxonUri(Uri wrappedUri)
        : base(GetUriString(wrappedUri), GetUriKind(wrappedUri))
    {
    }

    private static string GetUriString(Uri wrappedUri, bool localuri = false)
    {
        if (wrappedUri == null)
            throw new ArgumentNullException("wrappedUri", "wrappedUri is null.");
        if (wrappedUri.IsAbsoluteUri)
            return wrappedUri.AbsoluteUri;
        return wrappedUri.OriginalString;
    }

    private static UriKind GetUriKind(Uri wrappedUri)
    {
        if (wrappedUri == null)
            throw new ArgumentNullException("wrappedUri", "wrappedUri is null.");
        if (wrappedUri.IsAbsoluteUri)
            return UriKind.Absolute;
        return UriKind.Relative;
    }

    public override string ToString()
    {
        if (IsWellFormedOriginalString())
            return OriginalString;
        else if (IsAbsoluteUri)
            return AbsoluteUri;
        return base.ToString();
    }
}

And this gets closer.  Using this syntax, Path 1 doesn’t work, but Path 2 will.  So, you could use an if…then statement to look for spaces in the XSLT file path: if there are no spaces, open the stream, and if there are, wrap the URI.  Unfortunately, that doesn’t work either, because if you include a reference (like xsl:include) in your XSLT, both Path 1 and Path 2 fail: internally, the BaseURI is set to an escaped version of the URI, and Windows will fail to locate the file.  At this point you may feel pretty much screwed, but there are still other options; they just take more work.  In my case, the solution that I adopted was to create a custom XmlResolver.  This allows me to handle all the URI processing myself, and in the case of the two paths above, I’m interested in handling all local file URIs.  So how does that work?

xsltCompiler.XmlResolver = new CustomeResolver();
transformer = xsltCompiler.Compile(new Uri(sXSLT, UriKind.Absolute)).Load();

internal class CustomeResolver : XmlUrlResolver
{
    public override object GetEntity(Uri absoluteUri, string role, Type ofObjectToReturn)
    {
        if (absoluteUri.IsFile)
        {
            string filename = absoluteUri.LocalPath;
            if (System.IO.File.Exists(filename) == false)
            {
                // The path as given does not exist; try the unescaped form.
                filename = Uri.UnescapeDataString(filename);
                if (System.IO.File.Exists(filename) == false)
                {
                    // Still not found: fall back to the default resolver.
                    return (System.IO.Stream)base.GetEntity(absoluteUri, role, ofObjectToReturn);
                }
                else
                {
                    System.IO.Stream myStream = new System.IO.FileStream(filename, System.IO.FileMode.Open);
                    return myStream;
                }
            }
            else
            {
                // The file exists as given; the default resolver can open it.
                return (System.IO.Stream)base.GetEntity(absoluteUri, role, ofObjectToReturn);
            }
        }
        else
        {
            // Non-file URIs are left to the default resolver.
            return (System.IO.Stream)base.GetEntity(absoluteUri, role, ofObjectToReturn);
        }
    }
}

By creating your own XmlResolver, you can fix the URI problems and allow Saxon to process both use cases above.
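The lookup order inside the resolver can be pulled out into a small helper that is easy to test without Saxon or real files. This is only a sketch under stated assumptions: LocalPathFallback and FirstExisting are hypothetical names, and the existence check is passed in as a delegate so the logic can be exercised with made-up paths.

```csharp
using System;

public static class LocalPathFallback
{
    // Tries the path as given, then its unescaped form.
    // Returns the first candidate that exists, or null if neither does.
    public static string FirstExisting(string localPath, Func<string, bool> exists)
    {
        if (exists(localPath))
            return localPath;

        string unescaped = Uri.UnescapeDataString(localPath);
        if (exists(unescaped))
            return unescaped;

        return null;
    }

    public static void Main()
    {
        // With System.IO.File.Exists this checks the real file system.
        string found = FirstExisting(@"c:\myfile\folder%201\test.xsl", System.IO.File.Exists);
        Console.WriteLine(found ?? "not found; defer to the default resolver");
    }
}
```

A null result corresponds to the branches above that hand the request back to base.GetEntity.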


District Dispatch: Section 702: Advocates brace for surveillance reform fight

Fri, 2017-10-06 17:29

With fewer than 90 calendar days remaining before the expiration of section 702 of the Foreign Intelligence Surveillance Act (FISA), this week ALA joined the ACLU and a host of other major national privacy advocates in calling on the leaders of the House Judiciary Committee not to reauthorize the program without adding substantial new privacy protections necessary to make the program constitutional. The fundamental flaw in 702, as reported in Politico’s coverage of the letter from almost 60 organizations, is that it “authorizes surveillance of people who are not ‘U.S. persons’ reasonably believed to be outside U.S. borders – but it vacuums up an unknown amount of data on Americans in the process.” 

Closing this so-called “backdoor search loophole” is the single most important reform backed by ALA and its coalition partners. For their part, leaders of the intelligence community recently wrote to congressional leaders in both chambers urging that the program be reauthorized without any changes and made permanent rather than subject to a new “sunset” date. (It’s just such a deadline, however, that’s forcing scrutiny of the program and the present reform debate.)

If passed, a June 2017 bill (S. 1297) by Senator Tom Cotton (R-AR) and backed by 13 other Senate Republicans, would do exactly as they wish. As widely reported, however, many other members of Congress – including the bipartisan leadership of both chambers’ Judiciary Committees with jurisdiction over section 702 – oppose clean reauthorization of the law without additional safeguards for civil liberties.

The House Judiciary Committee’s initial legislative proposals were just unveiled yesterday as the “USA Liberty Act” and are expected to change before being debated and voted on by the Committee, possibly as early as the week of October 23. As summarized by the Committee, the bill does constructively limit the use of information collected without a warrant for domestic criminal prosecutions, as well as sweeping so-called “about” searches of collected data. Initial reactions by major civil liberties organizations, while appreciative of those proposed changes, have been qualified and uniformly call for far greater reforms to section 702 than those detailed in the new bill.

The final shape of 702 reform legislation is still unclear, but the looming deadline to complete a bill by year’s end is now in legislators’ sharp focus. What is certain is that legislative debate now has begun in earnest and it will be fast, furious and high-stakes. All members of Congress will need to hear from their constituents at home that reforms are essential when the time is right.

That time will be very soon and ALA’s Washington Office will, as always, provide timely alerts of how and when to take action. The law is complicated, but the main message to Congress won’t be: “please don’t reauthorize section 702 without closing the backdoor search loophole now.”

Additional General Resources:

Warrantless Surveillance Under Section 702 of FISA
American Civil Liberties Union (2017)

Reforming Section 702: We Can Protect Americans’ Privacy and Protect Against Foreign Threats
Brennan Center for Justice (August 2017)

What is 702 and Why Should I Care? (infographic)
Arab American Institute (September 2017)

A History of FISA Section 702 Compliance Violations (interactive timeline and chart)
New America Foundation Open Technology Institute (September 2017)

Section 702 of the Foreign Intelligence Surveillance Act (FISA): Its Illegal and Unconstitutional Use
Electronic Frontier Foundation (undated)

The post Section 702: Advocates brace for surveillance reform fight appeared first on District Dispatch.

LITA: Spotlight Series: Brittney Buckland

Thu, 2017-10-05 17:28

Have you ever seen a really interesting library position and wondered how the person got there? This series will interview tech librarians to learn more about their journey, how they stay informed about emerging technologies, and tools they can’t live without.

Allow me to introduce Brittney Buckland, Head of Technical Services at Merrimack Public Library in New Hampshire. Below is an excerpt of our interview; the full transcript can be accessed here.

Brittney Buckland, Head of Technical Services at Merrimack Public Library

What is your background?

“My undergraduate degree is a BA in the Arts: Art History from the University of New Hampshire (UNH). I finished my MSLIS with a specialization in Library and Information Services in 2013. I completed my Master’s completely online through Drexel University. I recently finished a secondary M.Ed. in Educational Studies through UNH, also completely online.”

What were some of your early library jobs and how did they prepare you for your current position?

Brittney’s first library job was as a Student Library Assistant in an academic setting, where she learned the basics of page duties, as well as performing minor reference work.

“In my current position I still work on the floor so the skills I learned as a Student Library Assistant are still useful to me. I currently spend one night a week and am part of a weekend rotation to work on the Reference desk. I believe those public desk skills come in handy when working with patrons. They also help when I am collaborating with my colleagues when making decisions about the collection and new services we want to provide to our patrons.”

Tell me about the Merrimack Public Library.

“Merrimack Public Library is a mid-sized public library located in New Hampshire. Directly we serve a population of just over 25,000 Merrimack residents, but serve many more as we are part of a twelve library consortium that includes public and academic libraries in the Greater Manchester area of New Hampshire.”

Along with the usual library offerings and New Hampshire local history, “there are a few really cool collections in the works that aren’t quite ready to circulate yet. These include kits for children and teens like Magformers, Snap Circuits, and Ozobots that our teens put together as part of their Summer Reading Program. We’ll also be circulating sandwich board signs to local groups, and are currently working with a local Girl Scout to set up a fun shape baking pan collection.

Starting soon we will be partnering with our local Meals on Wheels group to deliver homebound library services for those that cannot make it to the building.”

What are some of the main responsibilities in your current role? 

“Most of my main responsibilities revolve around cataloging and acquisitions. I do all of the original cataloging for the library. I work with the other Department Heads of the library to come up with practical circulating and material processing procedures. I am in charge of the collection development for our audiobook, CD, and DVD/Blu-Ray collections, which includes buying as well as withdrawing the items from the collection when necessary.”

Tell me about libraries 10 years from now- what do they look like and what services do they offer?

“I think libraries will be very interesting places 10 years from now. I love how libraries are evolving from circulating only books to circulating all sorts of special collections. I think it speaks more to serving the library’s community and demonstrates that librarians are in touch with what the community wants and how it’s developing. I think public libraries are already on the way to doing this, but I believe it will be more developed in the future and libraries will look more like community centers.”

What advice would you give a recent MLIS grad or current information professional looking to change careers?

“I think it is extremely important for MLIS recent grads, those considering an MLIS, and those looking to make the jump to switching careers into librarianship to realize that the job market is extremely saturated and competitive. I think graduates should also try to get an understanding of what is required to be in each position before they graduate so that they can start with the right foot forward. I wanted to be an academic librarian, but didn’t know that to be tenured I would need a secondary Master’s degree or Ph.D. I also believe you shouldn’t limit yourself. There are all sorts of different libraries out there–public and academic, but also law libraries, medical libraries and so many more.”

How do you stay current on new technology?

“Staying current takes a lot of work. There are journals to read, workshops/webinars to attend, and meetings to attend. One also needs to pay attention to what teachers are interested in so that libraries can follow those educational trends like programming. I try to keep an eye out for webinars that are relevant to my library’s strategic plan and mission and then see how those things could be implemented. Luckily, as part of the consortium my library is a part of there is a committee for technology so we can bounce ideas off each other and show each other the cool new things we’ve found or tried.”

Share technology that you can’t live or do your job without.

Along with OCLC Connexion, LC Authority Files and MARC Standards: “On Facebook I follow the cataloging group Troublesome Catalogers and Magical Metadata Fairies for both professional posts and some humor. I use Amazon a lot for media purchases. One of my current favorites is the app Workflow on iOS. It lets you program your phone to automate some tasks. It comes with some preset ideas like sending your last photo to your Instagram account, but one of the Technology Librarians in my consortium wrote a program to have this app search our shared catalog from scanning the ISBN on the back of any book!”

Brittney also gives great advice for those frustrated information specialists looking for the right job: “Get involved, stay current, and keep your mind open–you really never know when an opportunity will drop into your lap!”

Thanks to Brittney for kicking off the series and be sure to check out next month’s conversation with Rebecca McGuire, Tech Specialist at Mortenson Center for International Library Programs.



HangingTogether: Vocabulary control data in discovery environments

Thu, 2017-10-05 17:07

That was the topic discussed recently by OCLC Research Library Partners metadata managers, initiated by Steven Folsom (then at Harvard), Stephen Hearn of the University of Minnesota, and Melanie Wacker of Columbia. Traditional authority control models have relied on left-anchored browsing of alphabetically ordered lists of terms, a model that interposes the controlled terms, preferred, variant, and related, between the searcher and search results. The new world of authority sources in which libraries operate includes ORCID and other international registries. Vocabularies designed for left-anchored browsing are a poor fit for current discovery environments oriented toward keyword search and facet term sets pulled directly from displayed search results.
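The difference between the two access models can be sketched in a few lines. This is a toy illustration only: the headings are invented for the example, and BrowseVersusKeyword, LeftAnchoredBrowse and KeywordSearch are hypothetical names, not part of any library system.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BrowseVersusKeyword
{
    // Hypothetical headings, invented for illustration only.
    public static readonly string[] SampleHeadings =
    {
        "Women",
        "Women in science",
        "Women--Employment",
        "African American women",
        "Working class women"
    };

    // Left-anchored browse: only headings that BEGIN with the typed string,
    // returned in alphabetical order.
    public static List<string> LeftAnchoredBrowse(IEnumerable<string> headings, string start)
    {
        return headings
            .Where(h => h.StartsWith(start, StringComparison.OrdinalIgnoreCase))
            .OrderBy(h => h, StringComparer.OrdinalIgnoreCase)
            .ToList();
    }

    // Keyword search: headings that contain the string anywhere.
    public static List<string> KeywordSearch(IEnumerable<string> headings, string word)
    {
        return headings
            .Where(h => h.IndexOf(word, StringComparison.OrdinalIgnoreCase) >= 0)
            .ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join("; ", LeftAnchoredBrowse(SampleHeadings, "Women")));
        Console.WriteLine(string.Join("; ", KeywordSearch(SampleHeadings, "women")));
    }
}
```

With this sample data the browse returns three headings and the keyword search returns all five, since two of the headings only mention the term at the end.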

Some new discovery layers are starting to take advantage of variant and related terms. Although traditional web OPACs based on the underlying local library system take advantage of variant terms, related terms, and scope notes, most institutions now also have a discovery layer that aggregates descriptive information from different sources in different formats. Initially none took advantage of any information in authority files. A few have now added non-preferred terms to bibliographic data and indexed them, or show related terms and references as well as links to Wikipedia entries for persons and corporate bodies. Cornell takes advantage of the contextual information recorded in RDA authority records. Ideally, we would want to pull in information from different sources to provide context (multiple authority files, thesauri, VIAF, Wikipedia, etc.) and pull in the information most relevant.

Metadata managers reported that browsing generally has low usage. Those who measured usage of their browse data reported very low usage (under 2%). Search interfaces that do still support a browse search hide it away in the advanced search. The primary users are librarians and faculty—expert users. In several instances browse searching had been removed but then brought back. It appears that, while usage is low, there are several use cases that cannot be met without browse searching. The “power users” who use browse are “really vocal” so it’s important to learn why they use browsing and to think creatively about other ways to meet their needs. Database managers also find browsing useful to identify anomalies in the data to be fixed. What tools could be used instead to uncover these anomalies?

“Browsing” just means looking through the contents of a list of some kind, not just left-anchored browsing of headings used in our traditional systems. In discovery environments that bring together disparate sources conforming to different standards and using different vocabularies, are there ways we can enable browsing of entities in a way that also provides contextual information? For example, the linked data prototype catalogs of the national libraries of France and Spain provide different views of objects, persons and subjects that focus on entity relationships rather than text strings.

Lessons from non-library systems include upfront disambiguation, showing relationships between entities, and providing contextual information. Metadata managers would like their systems to be smarter about using the data they have to pull concepts together rather than just words. Examples of providing such context included the University of Pennsylvania’s Online Books such as this search result on Women; Cornell’s experimenting with making knowledge card-type displays where related information is displayed in a separate window, such as this display for Women and this one for Malcolm X. Others are learning what vocabularies to use and bringing in data from different sources with linked data. Such experimentation offers opportunities for more collaboration.

Few are addressing multiple overlapping and sometimes conflicting vocabularies. Even MARC records may have different but overlapping vocabularies, for example, records that include both FAST and LC subject headings. To users, they look redundant. In New Zealand, Maori subject headings are added to the same records as LC subject headings; Australia adds terms authorized for indigenous peoples. But a growing percentage of data in institutions’ discovery layers come from non-MARC, non-library sources. Metadata describing universities’ research data and materials in Institutional Repositories is usually treated completely differently, and separately. How to provide normalization and access to the entities described so users don’t experience the “collision of name spaces” and ambiguous terms (or terms meaning different things depending on the source)? Synaptica solutions is working on a tool kit on crosswalks among different vocabularies and languages that sounds promising.

As different vocabularies were designed for different contexts, we need middleware that can help normalize or at least identify the differences among them. Perhaps we could use linked data to concatenate designated bits of data and then display the appropriate labels depending on context? We need to envision new ways and models of managing vocabulary data beyond left-anchored browsing for our discovery environments and providing users the context they need. The experimentation OCLC Research Library Partners are undertaking to help us all shift from strings to an entity-based environment is very encouraging.

Graphic: under CC BY-ND 3.0