Feed aggregator

Patrick Hochstenbach: It Is Raining Cat Doodles

planet code4lib - Sat, 2014-12-20 15:48
Filed under: Doodles Tagged: cat, Cats, doodle, elag

District Dispatch: Sony leak reveals efforts to revive SOPA

planet code4lib - Sat, 2014-12-20 15:17

Working in Washington, D.C. tends to make one a bit jaded: the revolving door, the bipartisan attacks, not enough funding for libraries — the list goes on. So, yes, I am D.C.-weary and growing more cynical. Now I have another reason to be fed up.

The Sony Pictures Entertainment data breach has uncovered documents that show that the Motion Picture Association of America (MPAA) has been trying to pull a fast one — reviving the ill-conceived Stop Online Piracy Act (SOPA) legislation (which failed spectacularly in 2012) by apparently working in tandem with state Attorneys General. Documents show that MPAA has been devising a scheme to get the result they could not get with SOPA — shutting down web sites and, along with them, freedom of expression and access to information.

Sony Pictures studio.

The details have been covered by a number of media outlets, including The New York Times. The MPAA seems to think that the best solution to shutting down piracy is to “make invisible” the web sites of suspected culprits. You may think that libraries have little to worry about; after all, we aren’t pirates. But the good guys will be yanked offline as well as the alleged bad guys. Our provision of public access to the internet would then be in jeopardy because a few library users allegedly posted protected content on, for example, Pinterest or YouTube. Our protection from liability for the activities of library patrons using public computers could be thrown out the window along with internet access. This makes no sense.

SOPA, touted initially by Congress as a solution to online piracy, also made no sense from the start because it was too broad. If passed, it would have required that libraries police the internet and block web sites whenever asked by law enforcement officials. Technical experts confirmed that the implementation of SOPA could threaten cybersecurity and undermine the Domain Name System (DNS), also known as the very “backbone of the Internet.”

After historically overwhelming public outcry, the content community and internet supporters were encouraged to work together on a compromise; parties promised to collaborate, and some work was actually accomplished. Now it seems that, as far as MPAA was concerned, collaboration was just hype. They were, the leaked documents show, planning all along to get SOPA one way or another.

The library community opposes piracy. But we also oppose throwing the baby out with the bath water.

Update: The Verge has reported that Mississippi Attorney General Hood did indeed launch his promised attack on behalf of the MPAA by serving Google with a 79-page subpoena charging that Google engaged in “deceptive” or “unfair” trade practices under the Mississippi Consumer Protection Act. Google has filed a brief asking the federal court to set aside the subpoena, noting that Mississippi (or any state for that matter) has no jurisdiction over these matters.

For more on efforts to revive SOPA, see this post as well.

The post Sony leak reveals efforts to revive SOPA appeared first on District Dispatch.

Terry Reese: Working with SPARQL in MarcEdit

planet code4lib - Sat, 2014-12-20 06:06

Over the past couple of weeks, I’ve been working on expanding the linking services that MarcEdit can work with in order to create identifiers for controlled terms and headings. One of the services that I’ve been experimenting with is NLM’s beta SPARQL endpoint for MeSH headings. MeSH has always been something that is a bit foreign to me. While I had been a cataloger in my past, my primary area of expertise was with geographic materials (analog and digital), as well as traditional monographic data. While MeSH looks like LCSH, it’s quite different as well. So, I’ve been spending some time trying to learn a little more about it, while working on a process to consistently query the endpoint to retrieve the identifier for a preferred term. It’s been a process that’s been enlightening, but also one that has led me to think about how I might create a process that could be used beyond this simple use case, and potentially provide MarcEdit with an RDF engine that could be utilized down the road to make it easier to query, create, and update graphs.

Since MarcEdit is written in .NET, this meant looking to see what components currently exist that provide the type of RDF functionality I may need down the road. Fortunately, a number of components exist; the one I’m utilizing in MarcEdit is dotNetRDF (https://bitbucket.org/dotnetrdf/dotnetrdf/wiki/browse/). The component provides a robust set of functionality that supports everything I want to do now, and should want to do later.

With a toolkit found, I spent some time integrating it into MarcEdit, which is never a small task. However, the outcome will be a couple of new features to start testing out the toolkit and to start providing users with the ability to become more familiar with a key semantic web technology, SPARQL. The first new feature will be the integration of MeSH as a known vocabulary that will now be queried and controlled when run through the linked data tool. The second new feature is a SPARQL Browser. The idea here is to give folks a tool to explore SPARQL endpoints and retrieve the data in different formats. The proof of concept supports XML, RDF/XML, HTML, CSV, Turtle, N-Triples, and JSON as output formats. This means that users can query any SPARQL endpoint and retrieve data back. In the current proof of concept, I haven’t added the ability to save the output — but I likely will prior to releasing the Christmas MarcEdit update.

Proof of Concept

While this is still somewhat conceptual, the current SPARQL Browser looks like the following:

At present, the Browser assumes that data resides at a remote endpoint, but I’ll likely include the ability to load local RDF, JSON, or Turtle data and provide the ability to query that data as a local endpoint. Anyway, right now, the Browser takes a URL to the SPARQL endpoint, and then the query. The user can then select the format in which the result set should be output.

Using NLM as an example, say a user wanted to query for the specific term: Congenital Abnormalities – utilizing the current proof of concept, the user would enter the following data:

SPARQL Endpoint: http://id.nlm.nih.gov/mesh/sparql

SPARQL Query:

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX meshv: <http://id.nlm.nih.gov/mesh/vocab#>
PREFIX mesh: <http://id.nlm.nih.gov/mesh/>

SELECT distinct ?d ?dLabel
FROM <http://id.nlm.nih.gov/mesh2014>
WHERE {
  ?d meshv:preferredConcept ?q .
  ?q rdfs:label 'Congenital Abnormalities' .
  ?d rdfs:label ?dLabel .
}
ORDER BY ?dLabel

Running this query within the SPARQL Browser produces a resultset that is formatted internally into a Graph for output purposes.

The images snapshot a couple of the different output formats.  For example, the full JSON output is the following:

{ "head": { "vars": [ "d", "dLabel" ] }, "results": { "bindings": [ { "d": { "type": "uri", "value": "http://id.nlm.nih.gov/mesh/D000013" }, "dLabel": { "type": "literal", "value": "Congenital Abnormalities" } } ] } }

The idea behind creating this as a general-purpose tool is that, in theory, it should work for any SPARQL endpoint, for example the Project Gutenberg Metadata endpoint. The same type of exploration can be done there, utilizing the Browser.
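For anyone who wants to script this same kind of exploration outside of MarcEdit, here is a minimal sketch in R (not MarcEdit code, and trimmed to the two prefixes the query actually uses). It assumes the endpoint follows the standard SPARQL protocol, i.e. that it accepts the query in a query parameter and honors an Accept header of application/sparql-results+json:

library(httr)
library(jsonlite)

endpoint <- "http://id.nlm.nih.gov/mesh/sparql"

sparql <- "
PREFIX meshv: <http://id.nlm.nih.gov/mesh/vocab#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT DISTINCT ?d ?dLabel
FROM <http://id.nlm.nih.gov/mesh2014>
WHERE {
  ?d meshv:preferredConcept ?q .
  ?q rdfs:label 'Congenital Abnormalities' .
  ?d rdfs:label ?dLabel .
}
ORDER BY ?dLabel"

# Send the query via the standard SPARQL protocol and ask for JSON results
resp <- GET(endpoint,
            query = list(query = sparql),
            accept("application/sparql-results+json"))
stop_for_status(resp)

results <- fromJSON(content(resp, as = "text", encoding = "UTF-8"))

# Each binding pairs the MeSH descriptor URI with its preferred label
results$results$bindings

Where the endpoint supports it, swapping the Accept header for something like text/csv mirrors the other output formats the Browser exposes.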

Future Work

At this point, the SPARQL Browser represents a proof of concept tool, but one that I will make available as part of the MARCNext research toolset in the next update.

Going forward, I will likely refine the Browser based on feedback, but more importantly, start looking at how the new RDF toolkit might allow for the development of dynamic form generation for editing RDF/BibFrame data…at least somewhere down the road.

–TR

[1] SPARQL (W3C): http://www.w3.org/TR/rdf-sparql-query/
[2] SPARQL (Wikipedia): http://en.wikipedia.org/wiki/SPARQL
[3] SPARQL Endpoints: http://www.w3.org/wiki/SparqlEndpoints
[4] MarcEdit: http://marcedit.reeset.net
[5] MARCNext: http://blog.reeset.net/archives/1359

William Denton: Intersecting circles

planet code4lib - Sat, 2014-12-20 03:21

A couple of months ago I was chatting about Venn diagrams with a nine-year-old (almost ten-year-old) friend named N. We learned something interesting about intersecting circles, and along the way I made some drawings and wrote a little code.

We started with two sets, but here let’s start with one. We’ll represent it as a circle on the plane. Call this circle c1.

Everything is either in the circle or outside it. It divides the plane into two regions. We’ll label the region inside the circle 1 and the region outside (the rest of the plane) x.

Now let’s look at two sets, which is probably the default Venn diagram everyone thinks of. Here we have two intersecting circles, c1 and c2.

We need to consider both circles when labelling the regions now. For everything inside c1 but not inside c2, use 1x; for the intersection use 12; for what’s in c2 but not c1 use x2; and for what’s outside both circles use xx.

We can put this in a table:

1  2
1  x
1  2
x  2
x  x

This looks like part of a truth table, which of course is what it is. We can use true and false instead of the numbers:

1  2
T  F
T  T
F  T
F  F

It takes less space to just list it like this, though: 1x, 12, x2, xx.

It’s redundant to use the numbers, but it’s clearer, and in the elementary school math class they were using them, so I’ll keep with that.

Three circles gives eight regions: 1xx, 12x, 1x3, 123, x2x, xx3, x23, xxx.

Four intersecting circles gets busier and gives 14 regions: 1xxx, 12xx, 123x, 12x4, 1234, 1xx4, 1x34, x2xx, x23x, x234, xx3x, xx34, xxx4, xxxx.

Here N and I stopped and made a list of circles and regions:

Circles  Regions
1        2
2        4
3        8
4        14

When N saw this he wondered how much it was growing by each time, because he wanted to know the pattern. He does a lot of that in school. We subtracted each row from the previous to find how much it grew:

Circles  Regions  Difference
1        2
2        4        2
3        8        4
4        14       6

Aha, that’s looking interesting. What’s the difference of the differences?

Circles  Regions  Difference  DiffDiff
1        2
2        4        2
3        8        4           2
4        14       6           2

Nine-year-old (almost ten-year-old) N saw this was important. I forget how he put it, but he knew that if the second-level difference is constant then that’s the key to the pattern.

I don’t know what triggered the memory, but I was pretty sure it had something to do with squares. There must be a proper way to deduce the formula from the numbers above, but all I could do was fool around a little bit. We’re adding a new 2 each time, so what if we take it away and see what that gives us? Let’s take the number of circles as n and the result as f(n) for some unknown function f.

n  f(n)
1  0
2  2
3  6
4  12

I think I saw that 3 x 2 = 6 and 4 x 3 = 12, so n x (n - 1) seems to be the pattern, and indeed 2 x 1 = 2 and 1 x 0 = 0, so there we have it.

Adding the 2 back we have:

Given n intersecting circles, the number of regions formed = n x (n - 1) + 2
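The proper way to deduce it, it turns out, is a counting argument (assuming every pair of circles crosses in two points and no three circles pass through the same point): each new circle crosses every circle already drawn in two points, those crossing points cut the new circle into arcs, and each arc splits one existing region in two. So circle n adds 2(n - 1) regions:

\[
R(1) = 2, \qquad R(n) = R(n-1) + 2(n-1)
\]
\[
R(n) = 2 + \sum_{k=2}^{n} 2(k-1) = 2 + 2 \cdot \frac{(n-1)n}{2} = n(n-1) + 2
\]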

Therefore we can predict that for 5 circles there will be 5 x 4 + 2 = 22 regions.

I think that here I drew five intersecting circles and counted up the regions and almost got 22, but there were some squidgy bits where the lines were too close together so we couldn’t quite see them all, but it seemed like we’d solved the problem for now. We were pretty chuffed.

When I got home I got to wondering about it more and wrote a bit of R.

I made three functions; the third uses the first two:

  • circle(x,y): draw a circle at (x,y), default radius 1.1
  • roots(n): return the n nth roots of unity (when using complex numbers, x^n = 1 has n solutions)
  • drawcircles(n): draw circles of radius 1.1 around each of those n roots
circle <- function(x, y, rad = 1.1, vertices = 500, ...) {
  rads <- seq(0, 2*pi, length.out = vertices)
  xcoords <- cos(rads) * rad + x
  ycoords <- sin(rads) * rad + y
  polygon(xcoords, ycoords, ...)
}

roots <- function(n) {
  lapply(
    seq(0, n - 1, 1),
    function(x) c(round(cos(2*x*pi/n), 4), round(sin(2*x*pi/n), 4))
  )
}

drawcircles <- function(n) {
  centres <- roots(n)
  plot(-2:2, type = "n", xlim = c(-2, 2), ylim = c(-2, 2), asp = 1,
       xlab = "", ylab = "", axes = FALSE)
  lapply(centres, function(c) circle(c[[1]], c[[2]]))
}

drawcircles(2) does what I did by hand above (without the annotations):

drawcircles(5) shows clearly what I drew badly by hand:

Pushing on, 12 intersecting circles:

There are 12 x 11 + 2 = 134 regions there.

And 60! This has 60 x 59 + 2 = 3542 regions, though at this resolution most can’t be seen. Now we’re getting a bit op art.

This is covered in Wolfram MathWorld as Plane Division by Circles, and (2, 4, 8, 14, 22, …) is A014206 in the On-Line Encyclopedia of Integer Sequences: “Draw n+1 circles in the plane; sequence gives maximal number of regions into which the plane is divided.”

Somewhere along the way while looking into all this I realized I’d missed something right in front of my eyes: the intersecting circles stopped being Venn diagrams after 3!

A Venn diagram represents “all possible logical relations between a finite collection of different sets” (says Venn diagram on Wikipedia today). With n sets there are 2^n possible relations. Three intersecting circles divide the plane into 3 x (3 - 1) + 2 = 8 = 2^3 regions, but with four circles we have 14 regions, not 16! 1x3x and x2x4 are missing: there is nowhere where only c1 and c3 or c2 and c4 intersect without the other two. With five intersecting circles we have 22 regions, but logically there are 2^5 = 32 possible combinations. (What’s an easy way to calculate which are missing?)
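One brute-force way to work that out, sticking with R: sample a fine grid of points, record which circles each point falls inside, and compare the signatures that actually turn up against all 2^n possibilities. This is only a sketch along the lines of the functions above (same radius-1.1 circles centred on the nth roots of unity), and the grid has to be fine enough to catch the squidgy thin regions:

# Which of the 2^n region signatures actually occur for n circles of radius
# 1.1 centred on the nth roots of unity? Sample a grid of points, label each
# point with the circles it falls inside, and collect the distinct labels.
signatures_found <- function(n, rad = 1.1, step = 0.005) {
  centres <- t(sapply(0:(n - 1), function(k) c(cos(2*k*pi/n), sin(2*k*pi/n))))
  grid <- expand.grid(x = seq(-2.5, 2.5, by = step), y = seq(-2.5, 2.5, by = step))
  inside <- sapply(seq_len(n), function(i)
    (grid$x - centres[i, 1])^2 + (grid$y - centres[i, 2])^2 <= rad^2)
  # element-wise paste of one column per circle: "1" or "x", then "2" or "x", ...
  sigs <- do.call(paste0, lapply(seq_len(n), function(i) ifelse(inside[, i], i, "x")))
  unique(sigs)
}

# All 2^n conceivable signatures, written in the same notation
all_signatures <- function(n) {
  combos <- expand.grid(rep(list(c(TRUE, FALSE)), n))
  apply(combos, 1, function(b) paste(ifelse(b, seq_len(n), "x"), collapse = ""))
}

missing_signatures <- function(n) setdiff(all_signatures(n), signatures_found(n))

missing_signatures(4)  # expect just "1x3x" and "x2x4"
missing_signatures(5)  # the ten combinations five circles can't make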

It turns out there are various ways to draw four-set (or more) Venn diagrams, as shown on Wikipedia, like this two-dimensional oddity (which I can’t imagine librarians ever using when teaching search strategies):

You never know where a bit of conversation about Venn diagrams is going to lead!

Library Hackers Unite blog: NTP and git client vulnerabilities

planet code4lib - Fri, 2014-12-19 21:00

Git client vulnerabilities on case-insensitive filesystems:
https://github.com/blog/1938-vulnerability-announced-update-your-git-clients

NTPd vulnerabilities announced:
http://www.kb.cert.org/vuls/id/852879

OS X and Windows users, start by updating your GitHub apps and plugins and then your regular command-line git client. NTP fixes are still pending for most platforms.

Library of Congress: The Signal: Dodging the Memory Hole: Collaborations to Save the News

planet code4lib - Fri, 2014-12-19 17:55

The news is often called the “first draft of history” and preserved newspapers are some of the most used collections in libraries. The Internet and other digital technologies have altered the news landscape. There have been numerous stories about the demise of the newspaper and disruption at traditional media outlets. We’ve seen more than a few newspapers shutter their operations or move to strictly digital publishing. At the same time, niche news blogs, citizen-captured video, hyper-local new sites, news aggregators and social media have all emerged to provide a dynamic and constantly changing news environment that is sometimes confusing to consume and definitely complex to encapsulate.

With these issues in mind and with the goal to create a network to preserve born-digital journalism, the Reynolds Journalism Institute at the University of Missouri sponsored part one of the meeting Dodging the Memory Hole as part of the Journalism Digital News Archive 2014 forum, an initiative at the Reynolds Institute. Edward McCain (the focus of a recent Content Matters interview on The Signal) has a unique joint appointment at the Institute and the University of Missouri Library as the Digital Curator of Journalism. He and Katherine Skinner, Executive Director of the Educopia Institute (which will host part two of the meeting in May 2015 in Charlotte, N.C.), developed the two-day program, which attracted journalists, news librarians, technologists, academics and administrators.

Cliff Lynch, Director of the Coalition for Networked Information, opened the meeting with a thoughtful assessment of the state of digital news production and preservation. An in-depth case study followed, recounting the history of the Rocky Mountain News, its connection to the Denver, CO community, its eventual demise as an actively published newspaper and, ultimately, the transfer of its assets to the Denver Public Library, where the content and archives of the Rocky Mountain News remain accessible.

This is the first known arrangement of its kind, and DPL has made its donation agreement with E.W. Scripps Company openly accessible so it can serve as a model for other newspapers and libraries or archives. A roundtable discussion of news executives also revealed opportunities to engage in new types of relationships with the creators of news. Particularly, opening a dialog with the maintainers of content management systems that are used in newsrooms could make the transfer of content out of those systems more predictable and archivable.

Ben Welsh, a database producer at the Los Angeles Times, next debuted his tool Storytracker, which is based on PastPages, a tool he developed to capture screenshots of newspaper websites.  Storytracker allows for the capture of screenshots and the extraction of URLs and their associated text so links and particular stories or other content elements from a news webpage can be tracked over time and analyzed. Storytracker is free and available for download and Welsh is looking for feedback on how the tool could be more useful to the web archiving community. Tools like these have the potential to aid in the selection, capture and analysis of web based content and further the goal of preserving born-digital news.

Katherine Skinner closed the meeting with an assessment of the challenges ahead for the community, including: unclear definitions and language around preservation; the copyright status of contemporary news content; the technical complexity of capturing and preserving born-digital news; ignorance of emerging types of content; and the lack of relationships between new content creators and stewardship organizations.

In an attempt to meet some of these challenges, three action areas were defined: awareness, standards and practices and legal framework. Participants volunteered to work toward progress in advocacy messaging, exploring public-private partnerships, preserving pre-print newspaper PDFs, preserving web-based news content and exploring metadata and news content management systems. Groups will attempt to demonstrate some progress in these areas over the next six months and share results at the next Dodging the Memory Hole meeting in Charlotte. If you have ideas or want to participate in any of the action areas let us know in the comments below and we will be in touch.

Casey Bisson: Parable of the Polygons is the future of journalism

planet code4lib - Fri, 2014-12-19 17:35

Okay, so I’m probably both taking that too far and ignoring the fact that interactive media have been a reality for a long time. So let me say what I really mean: media organizations that aren’t planning out how to tell stories with games and simulators will miss out.

Here’s my example: Vi Hart and Nicky Case’s Parable of the Polygons shows us how bias, even small bias, can affect diversity. It shows us this problem using interactive simulators, rather than telling us about it in text or video. We participate by moving shapes around and pulling the levers of change on bias.

This nuclear power plant simulator offers some insight into the complexity that contributed to Fukushima, and I can’t help thinking the whole net neutrality argument would be better explained with a simulator.

Manage Metadata (Diane Hillmann and Jon Phipps): The Jane-athon Prototype in Hawaii

planet code4lib - Fri, 2014-12-19 15:22

The planning for the Midwinter Jane-athon pre-conference has been taking up a lot of my attention lately. It’s a really cool idea (credit to Deborah Fritz) to address the desire we’ve been hearing for some time for a participatory, hands-on session on RDA. And let’s be clear, we’re not talking about the RDA instructions–this is about the RDA data model, vocabularies, and RDA’s availability for linked data. We’ll be using RIMMF (RDA in Many Metadata Formats) as our visualization and data creation tool, setting up small teams with leaders who’ve been prepared to support the teams and a wandering phalanx of coaches to give help on the fly.

Part of the planning has to do with building a set of RIMMF ‘records’ to start with, for participants to add on their own resources and explore the rich relationships in RDA. We’re calling these ‘r-balls’ (a cross between RIMMF and tarballs). These zipped-up r-balls will be available for others to use for their own homegrown sessions, along with instructions for using RIMMF and setting up a Jane-athon (or other themed -athon), and also how to contribute their own r-balls for the use of others. In case you’ve not picked it up, this is a radically different training model, and we’d like to make it possible for others to play, too.

That’s the plan for the morning. After lunch we’ll take a look at what we’ve done, and prise out the issues we’ve encountered, and others we know about. The hope is that the participants will walk out the door with both an understanding of what RDA is (more than the instructions) and how it fits into the emerging linked data world.

I recently returned from a trip to Honolulu, where I did a prototype Jane-athon workshop for the Hawaii Library Association. I have to admit that I didn’t give much thought to how difficult it would be to do solo, but I did have the presence of mind to give the organizer of the workshop some preliminary setup instructions (based on what we’ll be doing in Chicago) to ensure that there would be access to laptops with software and records pre-loaded, and a small cadre of folks who had been working with RIMMF to help out with data creation on the day.

The original plan included a day before the workshop with a general presentation on linked data and some smaller meetings with administrators and others in specialized areas. It’s a format I’ve used before and the smaller meetings after the presentation generally bring out questions that are unlikely to be asked in a larger group.

What I didn’t plan for was that I wouldn’t be able to get out of Ithaca on the appointed day (the day before the presentation) thanks not to bad weather, but instead to a non-functioning plane which couldn’t be repaired. So after a phone discussion with Hawaii, I tried again the next day, and everything went smoothly. On the receiving end there was lots of effort expended to make it all work in the time available, with some meetings dribbling into the next day. But we did it, thanks to organizer Nancy Sack’s prodigious skills and the flexibility of all concerned.

Nancy asked the Jane-athon participants to fill out an evaluation, and sent me the anonymized results. I really appreciated that the respondents added many useful (and frank) comments to the usual range of questions. Those comments in particular were very helpful to me, and were passed on to the other MW Jane-athon organizers. One of the goals of the workshop was to help participants visualize, using RIMMF, how familiar MARC records could be automatically mapped into the FRBR structure of RDA, and how that process might begin to address concerns about future workflow and reuse of MARC records. Another goal was to illustrate how RDA’s relationships enhanced the value of the data, particularly for users. For the most part, it looked as if most of the participants understood the goals of the workshop and felt they had gotten value from it.

But there were those who provided frank criticism of the workshop goals and organization (as well as the presenter, of course!). Part of these criticisms involved the limitations of the workshop, wanting more information on how they could put their new knowledge to work, right now. The clearest expression of this desire came in as follows:

“I sort of expected to be given the whole road map for how to take a set of data and use LOD to make it available to users via the web. In rereading the flyer I see that this was not something the presenter wanted to cover. But I think it was apparent in the afternoon discussion that we wanted more information in the big picture … I feel like I have an understanding of what LOD is, but I have no idea how to use it in a meaningful way.”

Aside from the time constraints–which everyone understood–there’s a problem inherent in the fact that very few active LOD projects have moved beyond publishing their data (a good thing, no doubt about it) to using the data published by others. So it wasn’t so much that I didn’t ‘want’ to present more about the ‘bigger picture’, there wasn’t really anything to say aside from the fact that the answer to that question is still unclear (and I probably wasn’t all that clear about it either). If I had a ‘road map’ to talk about and point them to, I certainly would have shared it, but sadly I have nothing to share at this stage.

But I continue to believe that just as progress in this realm is iterative, it is hugely important that we not wait for the final answers before we talk about the issues. Our learning needs to be iterative too, to move along the path from the abstract to the concrete along with the technical developments. So for MidWinter, we’ll need to be crystal clear about what we’re doing (and why), as well as why there are blank areas in the road-map.

Thanks again to the Hawaii participants, and especially Nancy Sack, for their efforts to make the workshop happen, and the questions and comments that will improve the Jane-athon in Chicago!

For additional information, including a link to register, look here. Although I haven’t seen the latest registration figures, we’re expecting to fill up, so don’t delay!

[these are the workshop slides]

[these are the general presentation slides]

Chris Prom: Configuring WordPress Multisite as a Content Management System

planet code4lib - Fri, 2014-12-19 14:56

In summer/fall 2012, I posted a series regarding the implementation of WordPress as a content management system. Time prevented me from describing how we decided to configure WordPress for use in the University of Illinois Archives. In my next two posts, I’d like to rectify that, first by describing our basic implementation, then by noting (in the second post) some WordPress configuration steps that proved particularly handy. It’s an opportune time to do this because our Library is engaged in a project to examine options for a new CMS, and WordPress is one option.

When we went live with the main University Archives site in August 2012, one goal was to manage related sites (the American Library Association Archives, the Sousa Archives and Center for American Music, and the Student Life and Culture Archives) in one technology, but to allow a certain amount of local flexibility in the implementation. Doing this, I felt, would minimize development and maintenance costs while making it easier for staff to add and edit content. We had a strong desire to avoid staff training sessions and sought to help our many web writers and editors become self sufficient, without letting them wander too far afield from an overall design aesthetic (even if my own design sense was horrible, managing everything in one system would make it easier to apply a better design at a later date).

I began by setting up a WordPress multisite installation and by selecting the Thematic theme framework. In retrospect, these decisions have proven to be good ones, allowing us to achieve the goals described above.

Child Theme Development

Thematic is a theme framework, and is not suitable for those who don’t like editing CSS or delving into code (i.e. for people who want to set colors and do extensive styling in the admin interface). That said, its layout and div organization are easy to understand, and it is well documented. It includes a particularly strong set of widget areas, which is a huge plus. It is developer friendly, since it is easy to do site customizations in the child theme without affecting the parent Thematic style or the WordPress core.

Its best feature: You can spin off child themes, while reusing the same content blocks and staying in sync with WordPress best practices.  Even those with limited CSS and/or php skills can quickly develop attractive designs simply by editing the styles and including a few hooks to load images (in the functions file).  In addition to me, two staff members (Denise Rayman and Angela Jordan) have done this for the ALA Archives and SLC Archives.

Another plus: The Automattic “Theme division” developed and supports Thematic, which means that it benefits from close alignment with WP’s core developer group. Our site has never broken on upgrade when using my thematic child themes; at most we have done a few minutes of work to correct minor problems.

In the end, the decision to use Thematic required more upfront work, but it forced me to learn about theme development and to begin grappling with the WordPress API (e.g. hooks and filters), while setting in place a method for other staff to develop spin-off sites. More on that in my next post.

Plugin Selection

Once WordPress multisite was running, we spent time selecting and installing plug-ins that could be used on the main site and that would help us achieve desired effects.  The following proved to be particularly valuable and have proven to have good forward compatibility (i.e. not breaking the site when we upgraded WordPress):

  • WPTouch Mobile
  • WP Table Reloaded (adds table editor)
  • wp-jquery Lightbox (image modal windows)
  • WordPress SEO
  • Simple Section Navigation Widget (builds local navigation menus from page order)
  • Search and Replace (admin tool for bulk updating paths, etc.)
  • List Pages Shortcode
  • Jetpack by WordPress.com
  • Metaslider (image carousel)
  • Ensemble Video  Shortcodes (allows embedding AV access copies in campus streaming service)
  • Google Analytics by Yoast
  • Formidable (form builder)
  • CMS Page Order (drag and drop menu for arranging overall site structure)
  • Disqus Comment System

Again, I’ll write more about how we are using these, in my next post.

 

William Denton: The best paper I read this year: Polster, Reconfiguring the Academic Dance

planet code4lib - Fri, 2014-12-19 01:58

The best paper I read this year is Reconfiguring the Academic Dance: A Critique of Faculty’s Responses to Administrative Practices in Canadian Universities by Claire Polster, a sociologist at the University of Regina, in Topia 28 (Fall 2012). It’s aimed at professors but public and academic librarians should read it.

Unfortunately, it’s not gold open access. There’s a two year rolling wall and it’s not out of it yet (but I will ask—it should have expired by now). If you don’t have access to it, try asking a friend or following the usual channels. Or wait. Or pay six bucks. (Six bucks? What good does that do, I wonder.)

Here’s the abstract:

This article explores and critiques Canadian academics’ responses to new administrative practices in a variety of areas, including resource allocation, performance assessment and the regulation of academic work. The main argument is that, for the most part, faculty are responding to what administrative practices appear to be, rather than to what they do or accomplish institutionally. That is, academics are seeing and responding to these practices as isolated developments that interfere with or add to their work, rather than as reorganizers of social relations that fundamentally transform what academics do and are. As a result, their responses often serve to entrench and advance these practices’ harmful effects. This problem can be remedied by attending to how new administrative practices reconfigure institutional relations in ways that erode the academic mission, and by establishing new relations that better serve academics’—and the public’s—interests and needs. Drawing on the work of various academic and other activists, this article offers a broad range of possible strategies to achieve the latter goal. These include creating faculty-run “banks” to transform the allocation of institutional resources, producing new means and processes to assess—and support—academic performance, and establishing alternative policy-making bodies that operate outside of, and variously interrupt, traditional policy-making channels.

This is the dance metaphor:

To offer a simplified analogy, if we imagine the university as a dance floor, academics tend to view new administrative practices as burdensome weights or shackles that are placed upon them, impeding their ability to perform. In contrast, I propose we see these practices as obstacles that are placed on the dance floor and reconfigure the dance itself by reorganizing the patterns of activity in and through which it is constituted. I further argue that because most academics do not see how administrative practices reorganize the social relations within which they themselves are implicated, their reactions to these practices help to perpetuate and intensify these transformations and the difficulties they produce. Put differently, most faculty do not realize that they can and should resist how the academic dance is changing, but instead concentrate on ways and means to keep on dancing as best they can.

A Dance to the Music of Time, by Nicolas Poussin (from Wikipedia)

About the constant struggle for resources:

Instead of asking administrators for the resources they need and explaining why they need them, faculty are acting more as entrepreneurs, trying to convince administrators to invest resources in them and not others. One means to this end is by publicizing and promoting ways they comply with administrators’ desires in an ever growing number of newsletters, blogs, magazines and the like. Academics are also developing and trying to “sell” to administrators new ideas that meet their needs (or make them aware of needs they didn’t realize they had), often with the assistance of expensive external consultants. Ironically, these efforts to protect or acquire resources often consume substantial resources, intensifying the very shortages they are designed to alleviate. More importantly, these responses further transform institutional relations, fundamentally altering, not merely adding to, what academics do and what they are.

About performance assessment:

Another academic strategy is to respect one’s public-serving priorities but to translate accomplishments into terms that satisfy administrators. Accordingly, one might reframe work for a local organization as “research” rather than community service, or submit a private note of appreciation from a student as evidence of high-quality teaching. This approach extends and normalizes the adoption of a performative calculus. It also feeds the compulsion to prove one’s value to superiors, rather than to engage freely in activities one values.

Later, when she covers the many ways people try to deal with or work around the problems on their own:

There are few institutional inducements for faculty to think and act as compliant workers rather than autonomous professionals. However, the greater ease that comes from not struggling against a growing number of rules, and perhaps the additional time and resources that are freed up, may indirectly encourage compliance.

Back to the dance metaphor:

If we return to the analogy provided earlier, we may envision academics as dancers who are continually confronted with new obstacles on the floor where they move. As they come up to each obstacle, they react—dodging around it, leaping over it, moving under it—all the while trying to keep pace, appear graceful and avoid bumping into others doing the same. It would be more effective for them to collectively pause, step off the floor, observe the new terrain and decide how to resist changes in the dance, but their furtive engagement with each obstacle keeps them too distracted to contemplate this option. And so they keep on moving, employing their energies and creativity in ways that further entangle them in an increasingly difficult and frustrating dance, rather than trying to move in ways that better serve their own—and others’ —needs.

Dance II, by Henri Matisse (from Wikipedia)

She ends with a number of useful suggestions about how to change things, and introduces them by saying:

Because so many academic articles are long on critique but short on solutions, I present a wide range of options, based on the reflections and actions of many academic activists both in the past and in the present, which can challenge and transform university relations in positive ways.

Every paragraph hit home. At York University, where I work, we’re going through a prioritization process using the method set out by Robert Dickeson. It’s being used at many universities, and everything about it is covered by Polster’s article. Every reaction she lists, we’ve had. Also, the university is moving to activity-based costing, a sort of internal market system, where some units (faculties) bring in money (from tuition) and all the other units (including the libraries) don’t, and so are cost centres. Cost centres! This has got people in the libraries thinking about how we can generate revenue. Becoming a profit centre! A university library! If thinking like that gets set in us deep the effects will be very damaging.

Library of Congress: The Signal: NDSR Applications Open, Projects Announced!

planet code4lib - Thu, 2014-12-18 18:32

The Library of Congress, Office of Strategic Initiatives and the Institute of Museum and Library Services are pleased to announce the official open call for applications for the 2015 National Digital Stewardship Residency, to be held in the Washington, DC area.  The application period is from December 17, 2014 through January 30, 2015. To apply, go to the official USAJobs page link.

Looking down Pennsylvania Avenue. Photo by Susan Manus

To qualify, applicants must have a master’s degree or higher, graduating between spring 2013 and spring 2015, with a strong interest in digital stewardship. Currently enrolled doctoral students are also encouraged to apply. Application requirements include a detailed resume and cover letter, undergraduate and graduate transcripts, two letters of recommendation and a creative video that defines an applicant’s interest in the program.  (Visit the NDSR application webpage for more application information.)

For the 2015-16 class, five residents will be chosen for a 12-month residency at a prominent institution in the Washington, D.C. area.  The residency will begin in June, 2015, with an intensive week-long digital stewardship workshop at the Library of Congress. Thereafter, each resident will move to their designated host institution to work on a significant digital stewardship project. These projects will allow them to acquire hands-on knowledge and skills involving the collection, selection, management, long-term preservation and accessibility of digital assets.

We are also pleased to announce the five institutions, along with their projects, that have been chosen as residency hosts for this class of the NDSR. Listed below are the hosts and projects, chosen after a very competitive round of applications:

  • District of Columbia Public Library: Personal Digital Preservation Access and Education through the Public Library.
  • Government Publishing Office: Preparation for Audit and Certification of GPO’s FDsys as a Trustworthy Digital Repository.
  • American Institute of Architects: Building Curation into Records Creation: Developing a Digital Repository Program at the American Institute of Architects.
  • U.S. Senate, Historical Office: Improving Digital Stewardship in the U.S. Senate.
  • National Library of Medicine: NLM-Developed Software as Cultural Heritage.

The inaugural class of the NDSR was also held in Washington, DC in 2013-14. Host institutions for that class included the Association of Research Libraries, Dumbarton Oaks Research Library, Folger Shakespeare Library, Library of Congress, University of Maryland, National Library of Medicine, National Security Archive, Public Broadcasting Service, Smithsonian Institution Archives and the World Bank.

George Coulbourne, Supervisory Program Specialist at the Library of Congress, explains the benefits of the program: “We are excited to be collaborating with such dynamic host institutions for the second NDSR residency class in Washington, DC. In collaboration with the hosts, we look forward to developing the most engaging experience possible for our residents.  Last year’s residents all found employment in fields related to digital stewardship or went on to pursue higher degrees.  We hope to replicate that outcome with this class of residents as well as build bridges between the host institutions and the Library of Congress to advance digital stewardship.”

The residents chosen for NDSR 2015 will be announced by early April 2015. Keep an eye on The Signal for that announcement. For additional information and updates regarding the National Digital Stewardship Residency, please see our website.

See the Library’s official press release here.

OCLC Dev Network: Now Playing: Coding for Users

planet code4lib - Thu, 2014-12-18 17:30

If you missed our November webinar on Coding for Users you can now view the full recording.

District Dispatch: New CopyTalk webinar archive available

planet code4lib - Thu, 2014-12-18 17:00

Photo by Barry Dahl

An archive of the CopyTalk webinar “Introducing the Statement of Best Practices in Fair Use of Collections Containing Orphan Works of Libraries, Archives and Other Memory Institutions” is now available. The webinar was hosted in December 2014 by the ALA and was presented by speakers Dave Hansen (UC Berkeley and UNC Chapel Hill) and Peter Jaszi (American University).

In this webinar, the speakers introduce the “Statement of Best Practices in Fair Use of Collections Containing Orphan Works for Libraries, Archives, and Other Memory Institutions.” This Statement, the most recent community-developed best practices in fair use, is the result of intense discussion group meetings with over 150 librarians, archivists, and other memory institution professionals from around the United States to document and express their ideas about how to apply fair use to collections that contain orphan works, especially as memory institutions seek to digitize those collections and make them available online. The Statement outlines the fair use rationale for use of collections containing orphan works by memory institutions and identifies best practices for making assertions of fair use in preservation and access to those collections.

Watch the webinar

CopyTalks are scheduled for the first Thursday of even numbered months.

Archives of two earlier webinars are also available:

International copyright, with Janice Pilch (Rutgers University Library)

Open licensing and the public domain: tools and policies to support libraries, scholars and the public, with Tom Vollmer (Creative Commons)

The post New CopyTalk webinar archive available appeared first on District Dispatch.

LITA: Getting Started with GIS

planet code4lib - Thu, 2014-12-18 16:52

Coming for the New Year: Learning Opportunities with LITA

LITA will have multiple learning opportunities available over the upcoming year, including hot topics to keep your brain warm over the winter. Starting off with:

Getting Started with GIS (Geographic Information Systems)

Instructor: Eva Dodsworth, University of Waterloo

Offered: January 12 – February 9, 2015, with asynchronous weekly lectures, tutorials, assignments, and group discussion. There will be one 80 minute lecture to view each week, along with two tutorials and one assignment that will take 1-3 hours to complete, depending on the student. Moodle login info will be sent to registrants the week prior to the start date.

WebCourse Costs: LITA Member: $135; ALA Member: $195; Non-member: $260

Register Online, page arranged by session date (login required)

Here’s the Course Page

Getting Started with GIS is a three week course modeled on Eva Dodsworth’s LITA Guide of the same name. The course provides an introduction to Geographic Information Systems (GIS) in libraries. Through hands on exercises, discussions and recorded lectures, students will acquire skills in using GIS software programs, social mapping tools, map making, digitizing, and researching for geospatial data. This three week course provides introductory GIS skills that will prove beneficial in any library or information resource position.

No previous mapping or GIS experience is necessary. Some of the mapping applications covered include:

  • Introduction to Cartography and Map Making
  • Online Maps
  • Google Earth
  • KML and GIS files
  • ArcGIS Online and Story Mapping
  • Brief introduction to desktop GIS software

Participants will gain the following GIS skills:

  • Knowledge of popular online mapping resources
  • ability to create an online map
  • an introduction to GIS, GIS software and GIS data
  • an awareness of how other libraries are incorporating GIS technology into their library services and projects

Instructor: Eva Dodsworth is the Geospatial Data Services Librarian at the University of Waterloo Library where she is responsible for the provision of leadership and expertise in developing, delivering, and assessing geospatial data services and programs offered to members of the University of Waterloo community. Eva is also an online part-time GIS instructor at a number of Library School programs in North America.

Register Online, page arranged by session date (login required)

Re-Drawing the Map Series

Don’t forget the final session in the series is coming up January 6, 2015. You can attend this final single session or register for the series and get the recordings of the previous two sessions on Web Mapping and OpenStreetMaps. Join LITA instructor Cecily Walker for:

Coding maps with Leaflet.js

Tuesday January 6, 2015, 1:00 pm – 2:00 pm Central Time
Instructor: Cecily Walker

Ready to make your own maps and go beyond a directory of locations? Add photos and text to your maps with Cecily as you learn to use the Leaflet JavaScript library.

Register Online, page arranged by session date (login required)

Webinar Costs: LITA Member $39 for the single session and $99 for the series.

Check out the series web page for all cost options.

Questions or Comments?

For all other questions or comments related to the course, contact LITA at (312) 280-4268 or Mark Beatty, mbeatty@ala.org.

 

District Dispatch: Another round of foolishness with the DMCA

planet code4lib - Thu, 2014-12-18 16:23

Photo by hobvias sudoneighm

It’s that time again when the U.S. Copyright Office accepts proposals for exemptions to the anti-circumvention provision of the Digital Millennium Copyright Act (DMCA).

Huh?

The DMCA (which added chaff to the Copyright Act of 1976) includes a new Chapter 12 regarding “technological protection measures” which is another name for digital rights management (DRM). The law says that it is a violation to circumvent (=hack) DRM that has been used by the rights holder to protect access to digital content. One cannot break a passcode that protects access to an online newspaper without being a subscriber, for example.

Here’s the problem: Sometimes DRM gets in the way of actions that are not infringements of copyright. Let’s say you have lawful access to an e-book (you bought the book, fair and square), but you are a person with a print disability, and you need to circumvent to enable text-to-speech (TTS) functionality which has been disabled by DRM. This is a violation of the circumvention provision. One would think that this kind of circumvention is reasonable, because it simply entails making a book accessible to the person that purchased it. Reading isn’t illegal (in the United States).

Because Congress thought lawful uses of protected content may be blocked by technology, it included in the DMCA a process to determine when circumvention should be allowed: the 1201 rulemaking. Every three years, the Copyright Office accepts comments from people who want to circumvent technology for lawful purposes. These people must submit a legal analysis of why an exemption should be allowed, and provide evidence that a technological impediment exists. The Copyright Office reviews the requests, considers if any requests bear scrutiny, holds public hearings, reads reply comments, writes a report, and makes a recommendation to the Librarian of Congress, who then determines if any of the proposals are warranted. (The whole rigmarole takes 5-6 months.) An exemption allows people with print disabilities to circumvent DRM to enable TTS for 3 years. After that length of time, the exemption expires, and the entire process starts over again. It is time consuming and costly, and requires the collection of evidence and legal counsel. The several days of public hearings are surreal. Attendees shake their heads in disbelief. Everyone moans and groans, including the Copyright Office staff. I am not exaggerating.

Ridiculous? Undoubtedly.

One would think that rights holders would just say “sure, go ahead and circumvent e-books for TTS, we don’t care.” But they do care. Some rights holders think allowing TTS will cut into their audiobook market. Some rights holders think that TTS is an unauthorized public performance and therefore an infringement of copyright. Some authors do not want their books read aloud by a computer, feeling it degrades their creative work. This madness can be stopped if Congress eliminates, or at least amends, this DMCA provision. Why not make exemptions permanent?

In the meantime…

The Library Copyright Alliance (LCA), of which ALA is a member, participates in the triennial rulemaking. Call us crazy. We ask, “What DRM needs to be circumvented this time around?” This question is hard to answer because it is difficult to know what library users can’t do that is a lawful act because DRM is blocking something. We solicit feedback from the library community, but response is usually meager because the question requires proving a negative.

For the last couple of rulemaking cycles, LCA focused on an exemption for educators (and students in media arts programs) that must circumvent DRM on DVDs in order to extract film clips for teaching, research and close study. To be successful, we need many examples of faculty and teachers who circumvent DRM to meet pedagogical goals or for research purposes. Right now, this circumvention allows educators to exercise fair use. BUT this fair use will no longer be possible if we cannot prove it is necessary.

For those librarians and staff who work with faculty, we ask for examples! We want to extend the exemption to K-12 teachers, so school librarians: we need to hear from you as well. Heed this call! Take a moment to help us survive this miserable experience on behalf of educators and learners.

NOTE: Ideally, we would like examples on or before January 15th, 2015, but we will accept examples through January 28th, 2015.

 

Contact Carrie Russell at ALA’s Office for Information Technology Policy at crussell@alawash.org. Or call 800.941.8478.

The post Another round of foolishness with the DMCA appeared first on District Dispatch.

David Rosenthal: Economic Failures of HTTPS

planet code4lib - Thu, 2014-12-18 16:00
Bruce Schneier points me to Assessing legal and technical solutions to secure HTTPS, a fascinating, must-read analysis of the (lack of) security on the Web from an economic rather than a technical perspective by Axel Arnbak and co-authors from Amsterdam and Delft universities. Do read the whole paper, but below the fold I provide some choice snippets.

Arnbak et al point out that users are forced to trust all Certificate Authorities (CAs):

A crucial technical property of the HTTPS authentication model is that any CA can sign certificates for any domain name. In other words, literally anyone can request a certificate for a Google domain at any CA anywhere in the world, even when Google itself has contracted one particular CA to sign its certificate.

Many CAs are untrustworthy on their face:

What’s particularly troubling is that a number of the trusted CAs are run by authoritarian governments, among other less trustworthy institutions. Their CAs can issue a certificate for any Web site in the world, which will be accepted as trustworthy by browsers of all Internet users.

The security practices of even leading CAs have proven to be inadequate:

three of the four market leaders got hacked in recent years and that some of the “security” features of these services do not really provide actual security.

Customers can't actually buy security, only the appearance of security:

Information asymmetry prevents buyers from knowing what CAs are really doing. Buyers are paying for the perception of security, a liability shield, and trust signals to third parties. None of these correlates verifiably with actual security. Given that CA security is largely unobservable, buyers’ demands for security do not necessarily translate into strong security incentives for CAs.

There's little incentive for CAs to invest in better security:

Negative externalities of the weakest-link security of the system exacerbate these incentive problems. The failure of a single CA impacts the whole ecosystem, not just that CA’s customers. All other things being equal, these interdependencies undermine the incentives of CAs to invest, as the security of their customers depends on the efforts of all other CAs.

They conclude:

Regardless of major cybersecurity incidents such as CA breaches, and even the Snowden revelations, a sense of urgency to secure HTTPS seems nonexistent. As it stands, major CAs continue business as usual. For the foreseeable future, a fundamentally flawed authentication model underlies an absolutely critical technology used every second of every day by every Internet user. On both sides of the Atlantic, one wonders what cybersecurity governance really is about.
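The weakest-link point is easy to put numbers on. If each of N equally trusted CAs independently has some small probability p of being compromised in a year, the chance that at least one of them is compromised, and with it every domain on the Web, is 1 - (1 - p)^N, which climbs toward certainty as N grows no matter how careful any individual CA is. The figures in this sketch are purely illustrative assumptions, not numbers from the paper:

# Purely illustrative: probability that at least one of N equally trusted
# CAs is compromised in a year, if each fails independently with probability p
p <- 0.01
N <- c(10, 100, 650)  # ~650 trusted CA organizations is the oft-cited
                      # EFF SSL Observatory figure; treat it as a rough scale
data.frame(N = N, p_at_least_one_compromise = 1 - (1 - p)^N)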

LITA: Are QR Codes Dead Yet?

planet code4lib - Thu, 2014-12-18 12:32

It’s a meme!

Flipping through a recent issue of a tech-centric trade publication that shall not be named, I was startled to see that ads on the inside flap and the back cover both featured big QR codes. Why was I startled? Because techies, including many librarians, have been proclaiming the death of the QR code for years. Yet QR codes cling to life, insinuating themselves even into magazines on information technology. In short, QR codes are not dead. But they probably ought to be.

Not everywhere or all at once, no. I did once see this one librarian at this one conference poster session use his smartphone to scan a giant QR code. That was the only time in five years I have ever seen anyone take advantage of a QR code.

When reading a print magazine, I just want to roll with the print experience. I don’t want to grab my phone, type the 4-digit passcode, pull up the app, and hold the camera steady. I want to read.

I’d rather snap a photo of the page in question. That way, I can experience the ad holistically. I also can explore the website at leisure rather than being whisked to a non-mobile optimized web page where I must fill out 11 fields of an online registration form to which UX need not apply.

So . . . Should I Use A QR Code?

Best. Flowchart. EVER! #ias13 pic.twitter.com/nuk68H2DJp

— Jonathon Colman (@jcolman) April 7, 2013
