
LibUX: Meaningfully Judging Performance in Terms of User Experience

planet code4lib - Fri, 2016-08-26 16:00

Much about user experience design is concerned with subjective improvements to language, structure, style, and tone. The bulk of our quantitative data is used toward these purposes — and, of course, being user-centric is precisely what that data is for. The role of the user experience designer connotes a ton about the sorts of improvements at the surface of our websites, at the obvious touchpoints between patron and library. Unfortunately, this approach can neglect deep systemic or technical pain points to which “design” is wrongfully oblivious but which are fundamental to good user experience.

Speed is a major example. Website performance is crucial enough that, when it is poor, the potential for even the best designs to convert is diminished. The most “usable” website can have no effect if it fails to load when and in the way users expect it to.

One thing we can be thankful for when improving the performance of a website is that while “more speed” definitely has a strong impact on the user experience, it is also easy to measure. Look and feel, the “oomph” of meaningful, quality content, navigability, and usability each have their own quantitative metrics, like conversion or bounce rate, time watched, and so on. But at best these aspects of web design are objective-ish: the numbers hint at a possible truth, and those measurements only weather scrutiny when derived from real, very human, users.

A fast site won’t make up for other serious usability concerns, but since simple performance optimization doesn’t necessarily require any actual users, it lends itself to projects constrained by time or budget, or those otherwise lacking the human resources needed to observe usage, gather feedback, and iterate. The ideal cycle of “tweak, test, rinse, and repeat” is in some cases not possible. Few user experience projects return as much bang for the buck as site optimization, and it can be baked into the design and development process early and with known—not guessed-at, nor situational—results.

The signals

When it comes to site optimization, there is no shortage of signals to watch. There is a glut of data right in the browser about the number of bytes in, script or style file size, network status codes, drop-shadow rendering, frames per second, and so on. Tim Kadlec, author of Implementing Responsive Design, broke a lot of these down into meaningful measurements in a series of articles over the last couple of years oriented around the “performance budget.”

A performance budget is just what it sounds like: you set a “budget” on your page and do not allow the page to exceed that. This may be a specific load time, but it is usually an easier conversation to have when you break the budget down into the number of requests or size of the page.
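
To make that concrete, a budget can be written down as a few hard limits and checked against measured page statistics. The sketch below is only an illustration of the idea; the budget numbers, field names, and measured values are hypothetical, not recommendations.

```python
# Minimal sketch of a performance budget check (hypothetical numbers).

BUDGET = {
    "total_requests": 75,       # maximum number of HTTP requests
    "total_kilobytes": 800,     # maximum page weight in KB
    "load_time_seconds": 3.0,   # maximum time to the load event
}

def check_budget(measured, budget=BUDGET):
    """Return a list of human-readable budget violations."""
    violations = []
    for metric, limit in budget.items():
        value = measured.get(metric)
        if value is not None and value > limit:
            violations.append(f"{metric}: {value} exceeds the budget of {limit}")
    return violations

# Stats as they might be gathered from WebPagetest or the browser's network panel.
measured_stats = {"total_requests": 92, "total_kilobytes": 1450, "load_time_seconds": 4.2}

for problem in check_budget(measured_stats):
    print(problem)
```

A check like this can run as part of a build so that a page that blows its budget fails loudly instead of quietly shipping.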

Such a strategy really took root in the #perfmatters movement, spurred by folks repulsed by just how fast the web was getting slower. Their observation was that because the responsive web was becoming increasingly capable and high pixel density screens were the new norm, developers making cool stuff sent larger and larger file sizes through the pipes. While by definition responsive websites can scale for any screen, they were becoming cumbersome herky-jerky mothras for which data was beginning to show negative impacts.

In his talk in 2013, “Breaking the 1000ms Time to Glass Mobile Barrier” — and, later, his book High Performance Browser Networking — Ilya Grigorik demonstrated users’ reactions to even milliseconds-long delays:

Delay and typical user reaction:
  • 0 – 100ms: Instant
  • 100 – 300ms: Feels sluggish
  • 300 – 1000ms: Machine is working …
  • 1s+: Mental context switch
  • 10s+: I’ll come back later …

Since then, the average page weight has grown 134 percent (and 186 percent since 2010). Poor performance is such a drag on what might otherwise be a positive user experience—encapsulated by a July 2015 article in The Verge, “The Mobile Web Sucks”—that the biggest players in the web game (Facebook and Google) have dramatically reacted by either enforcing design restrictions on the SEO-sensitive developer or removing the dev’s influence entirely.

Comparison of average bytes per content type in November 2010 (left) and November 2015 (right).

Self-imposed performance budgets are increasingly considered best practice, and—as mentioned—there are different ways to measure success against them. In his write-up on the subject, Tim Kadlec identifies four major categories:

  • Milestone timings
  • Rule based metrics
  • Quantity based metrics
  • Speed index

Milestone Timings

A milestone in this context is a number like the time in seconds until the browser reaches the load event for the main document, or, for instance, the time until the page is visually complete. Milestones are easy to track, but there are arguments against their usefulness. Pat Meenan writes in the WebPagetest documentation that a milestone “isn’t a very good indicator of the actual end-user experience.”

As pages grow and load a lot of content that is not visible to the user or off the screen (below the fold) the time to reach the load event is extended even if the user-visible content has long-since rendered… [Milestones] are all fundamentally flawed in that they measure a single point and do not convey the actual user experience.

Rule Based and Quantity Based Metrics

Rule based metrics check a page or site against an existing checklist, using a tool like YSlow or Google PageSpeed to grade it. Quantity based metrics, on the other hand, include a lot of the data reported by outlets like the HTTP Archive: total number of requests, overall page weight, and even the size of the CSS file. Not all of these metrics indicate poor performance, but they are useful for conceptualizing the makeup of a page and where efforts at optimization can be targeted. If the bulk of the page weight is chalked up to heavy image use, then perhaps there are image-specific techniques you can use for stepping up the pace.
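
One low-effort way to see where page weight goes is to export a HAR file from the browser's network panel and total the response sizes per content type. The sketch below assumes a standard HAR export and a hypothetical file name; treat it as an illustration rather than a finished tool.

```python
import json
from collections import defaultdict

def weight_by_content_type(har_path):
    """Sum response body sizes per MIME type from a HAR export."""
    with open(har_path, encoding="utf-8") as har_file:
        har = json.load(har_file)

    totals = defaultdict(int)
    for entry in har["log"]["entries"]:
        response = entry["response"]
        mime = response["content"].get("mimeType", "unknown").split(";")[0]
        size = response.get("bodySize", 0)
        if size > 0:
            totals[mime] += size
    return totals

# Hypothetical export of a library homepage:
breakdown = weight_by_content_type("homepage.har")
for mime, total in sorted(breakdown.items(), key=lambda item: item[1], reverse=True):
    print(f"{mime}: {total / 1024:.1f} KB")
```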

Example of a library web page graded by YSlow.

Speed Index

Speed Index sets itself apart by attempting to measure the experience Pat Meenan referred to (there is an algorithm): it determines how much above-the-fold content is visually complete over time, then assigns a score. This is not a timing metric; Meenan explains:

the ‘area above the curve’ calculated in ms and using 0.0–1.0 for the range of visually complete. The calculation looks at each 0.1s interval and calculates IntervalScore = Interval * ( 1.0 – (Completeness/100)) where Completeness is the percent visually complete for that frame and Interval is the elapsed time for that video frame in ms… The overall score is just a sum of the individual intervals.

Basically, the faster the website loads above the fold, the faster the user can start to interact with the content. A lower score is better, and the score is read as milliseconds: a score of “1000” roughly means that a user can start to use the website after just one second. So if other metrics measure the Time To Load (TTL), then Speed Index measures Time To Interact (TTI), which may be a more meaningful signal.
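
To make Meenan's description concrete, here is a minimal sketch of that calculation. The visual-completeness samples are invented for illustration; in practice they come from WebPagetest's frame-by-frame analysis of the captured video.

```python
def speed_index(samples):
    """
    Approximate Speed Index from (elapsed_ms, percent_visually_complete) samples,
    summing Interval * (1.0 - Completeness / 100) over each interval as Meenan describes.
    """
    score = 0.0
    previous_ms = 0
    for elapsed_ms, completeness in samples:
        interval = elapsed_ms - previous_ms
        score += interval * (1.0 - completeness / 100.0)
        previous_ms = elapsed_ms
    return score

# Hypothetical frames sampled every 100 ms: (time in ms, percent visually complete)
frames = [(100, 0), (200, 10), (300, 40), (400, 70), (500, 90), (600, 100)]
print(round(speed_index(frames)))  # 290 here; lower is better
```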

TTI encapsulates an important observation, even among quantitative-data nerds, that web performance is just as much tied to the psychology of time and the perception of speed as it is to the speed of the network. If we look at page speed as a period of waiting, then how the user waits plays a role in how that wait is experienced. As Denys Mishunov writes in an article about “Why Performance Matters,” the wait is either active or passive:

The period in which the user has no choice or control over the waiting time, such as standing in line or waiting for a loved one who is late for the date, is called a passive phase, or passive wait. People tend to estimate passive waiting as a longer period of time than active, even if the time intervals are objectively equal.

For example, during my recent involvement with an academic library homepage redesign, our intention was that it would serve as thin a buffer as possible between the students or faculty and their research. This not only involved bringing search tools and content from deeper in the website to the forefront, but also reducing any barrier or “ugh” factor when engaging with them—such as time. Speed Index has a user-centric bias in that its measurement approximates the time the user can interact with—thus experience—the site. And it is for this reason we adopted it as a focal metric for our redesign project.

A report from Google Pagespeed.

Quick tangent tutorial: measuring Speed Index with WebPagetest

Google develops and supports WebPagetest, the online open-source web performance diagnostic tool at WebPagetest.org, which uses virtual machines to simulate websites loading on various devices and with various browsers, throttling the network to demonstrate load times over slower or faster connections, and much more. Its convenience and ease of use make it an attractive tool. Generating a report requires neither browser extensions nor prior experience with in-browser developer tools. WebPagetest, like alternatives, incorporates rule-based grading and quantity metrics, but it was also the first to introduce Speed Index, which can be measured by telling it to “Capture Video.”

WebPagetest returns a straightforward report card summarizing the performance results of its tests, including a table of milestones alongside speed indices. The tool provides results for “First View” and “Repeat View,” which demonstrates the role of the browser cache. These tests are remarkably thorough in other ways as well, including screen captures, videos, waterfall charts, content breakdowns, and optimization checklists.
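
For repeated measurements, the same test can be kicked off through WebPagetest's public API rather than the web form. The sketch below is a rough Python example; the parameter and result-field names reflect my reading of the API, and the API key and test URL are placeholders, so expect to adjust the details.

```python
import time
import requests

WPT_HOST = "https://www.webpagetest.org"
API_KEY = "YOUR_API_KEY"  # placeholder; WebPagetest issues keys for its public instance

def speed_index_for(url):
    """Submit a test with video capture (needed for Speed Index) and poll for the result."""
    params = {"url": url, "k": API_KEY, "f": "json", "video": 1}
    submission = requests.get(f"{WPT_HOST}/runtest.php", params=params).json()
    result_url = submission["data"]["jsonUrl"]

    while True:
        result = requests.get(result_url).json()
        if result.get("statusCode") == 200:       # test complete
            return result["data"]["runs"]["1"]["firstView"]["SpeedIndex"]
        time.sleep(10)                            # still queued or running

print(speed_index_for("https://library.example.edu/"))
```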

It’s worth noting that these kinds of diagnostics can be run by other tools on either end of development. Google PageSpeed Insights can be generated in the same way: type a URL and run the report. But folks can also install PageSpeed’s Apache and Nginx modules to optimize pages automatically, or otherwise integrate PageSpeed—or YSlow—into the build process with grunt tasks. The bottom line is that these kinds of performance diagnostics can be run wherever it is most convenient, at different depths, whether you prefer to approach it as a developer or not. They can be integrated into the build or used after the fact, as needed.

The order in which the elements load matters

Of course, the user’s experience of load times is not only about how long it takes any interactive elements of the page to load but how long it takes certain elements to load. Radware’s recent report “Speed vs. Fluency in Website Loading: What Drives User Engagement” shows that “simply loading a page faster doesn’t necessarily improve users’ emotional response to the page.” They outfitted participants with neuroimaging systems and eye-trackers (mounted on monitors) in an attempt to objectively measure things like cognitive load and motivation. In the study, the same web page was loaded using three different techniques:

  1. the original, unaltered loading sequence,
  2. the fastest option, where the techniques used provided the most demonstrably fast load times regardless of rendering sequence,
  3. a version where the parts of the page most important to what the user wanted to accomplish were loaded first.

Results of Radware’s study on how users process web pages during rendering

For six out of the ten pages tested, the sequence in which elements loaded based on their importance to the primary user task affected overall user engagement, as measured by total fixation time.

While not overwhelming, the results suggest that depending on the type of website, rendering sequence can play an important role in the “emotional and cognitive response and at which order [users] will look at different items.” Radware makes no suggestions about which rendering sequences work for which websites.

Still, the idea that cherry-picking the order in which things load on the page might decrease cognitive load (especially on an academic library homepage where the primary user task is search) is intriguing.

Earmark a Performance Budget

Anyway, this is getting a little long in the tooth. All this is to say that there are all sorts of improvements that can be made to library websites that add value to the user experience. Prioritizing between these involves any number of considerations. But while it may take a little extra care to optimize performance, it’s worth the time for one simple reason: your users expect your site to load the moment they want it.

This sets the tone for the entire experience.

Copyrights. So, this article originally appeared in Weave: Journal of Library User Experience, in an issue alongside people I really respect writing about anticipatory design and performance. It’s licensed under a Creative Commons Attribution 3.0 license. I made some changes up there and embedded some links, but for the most part the article is in its original form.

District Dispatch: New season, new entrepreneurship opportunities

planet code4lib - Fri, 2016-08-26 13:55

A young girl works in the “Fab Lab” at Orange County Library System’s Dorothy Lumley Melrose Center. Photo credit: Orange County Library System.

This is a strange time of year. The days are still long and hot – at least here in D.C. – but the Labor Day promos and pre-season football games signal the start of a new season. It’s around this time that I usually reflect on the waning summer. Having just gotten back from a long vacation at the beach, I’ve had plenty of time for reflection on the past year. Professionally, I’ve focused heavily on a single topic these past few months: entrepreneurship.

In late June, months of research, outreach, and writing culminated in OITP’s release of a white paper on the library community’s impact on the entrepreneurship ecosystem. The paper brought together data and cases from across the country to outline the bevy of services academic and public libraries offer entrepreneurs. We called the paper “The People’s Incubator.” You don’t have to read the text to recognize the accuracy of this metaphor for describing the role the library community plays in helping people bring innovative ideas to life. Libraries are, and have always been, creative spaces for everyone. Since the analog era, library programs and services have encouraged all people to convert notions into innovations.

But, the more time that passes since the paper’s release, the more I feel the “People’s Incubator” moniker isn’t quite adequate to describe what the modern library community does in today’s entrepreneurship space. It does justice to the creative power of library resources, but it doesn’t convey the steadiness of the support the library community offers entrepreneurs at every turn. At each stage of launching and running a business – planning, fundraising, market analysis and more – libraries are equipped to offer assistance. Business plan competitions, courses on raising capital, research databases, census records, prototyping and digital production equipment, business counseling and intellectual property information all combine to round out the picture of the entrepreneurship services available at the modern library.

A facility offering these services is not just an incubator – it’s a constant companion; a hand to hold while navigating a competitive and often unforgiving ecosystem. And the more I read about library entrepreneurship activities, the more convinced I become that influencers across all sectors should leverage the robust resources libraries provide entrepreneurs to encourage innovation across the country. In just the few months since we published the paper, I have found one after another example of libraries’ commitment to developing a more democratic and strong entrepreneurship ecosystem. In addition to the examples described in the paper, recent library partnerships illustrate the entrepreneurship synergies the library community can help create.

The New York Public Library (NYPL) recently partnered with the 3D printing service bureau Shapeways to develop curricula for teaching the entrepreneurial applications of 3D printing. The curricula will be piloted in a series of NYPL courses in the fall of 2016, and then publicly released under an open license. Continued partnerships between libraries and tech companies like this one will advance the capacity of libraries to build key skills for the innovation economy.

For over a year, the Memphis Public Library has been a key partner in a citywide effort to boost start-up activity. Working with colleges, universities and foundations, the library’s resources and programming have helped the Memphis entrepreneurship ecosystem create hundreds of jobs. Libraries can and should continue to be a major part of these sorts of collaborations.

With support from the Kendrick B. Melrose Family Foundation, the Orange County Library System in Orlando opened the Dorothy Lumley Melrose Center in 2014. The Center offers video and audio production equipment, 3D printers, Arduino and other electronics, and a host of tech classes – all of which individuals can use to launch new innovations and build key skills for the modern economy.

Through a partnership between the Montgomery County Public Library and the Food and Drug Administration (FDA), 80 teens had the opportunity to work in teams this summer to design their own mobile medical apps. The teens recently “pitched” their apps to a panel of judges at the FDA’s main campus in Silver Spring, Maryland. They’ve also gotten the chance to visit the White House.

Beyond partnerships between libraries, private firms, government agencies, academic institutions and foundations, library collaborations with Small Business Development Centers – federally supported entrepreneurship assistance facilities – continue to be publicly highlighted.

So, if I’ve learned anything from my summer of entrepreneurship, it’s this: libraries, as constant companions for entrepreneurs, are natural partners for the many public, private, non-profit and academic actors that work to advance the innovation economy. We will trumpet this important message in the coming weeks and months, as we work to alert policymakers to the important work of libraries ahead of the November elections. To do that, we need good examples of library efforts to advance start-up activities. Share yours in the comments section!

The post New season, new entrepreneurship opportunities appeared first on District Dispatch.

Equinox Software: Evergreen 2012: ownership and interdependence

planet code4lib - Fri, 2016-08-26 13:53

“Cats that Webchick is herding” by Kathleen Murtagh on Flickr (CC-BY)

A challenge common to any large project is, of course, herding the cats. The Evergreen project has pulled off a number of multi-year projects, including completely replacing the public catalog interface, creating acquisitions and serials modules from scratch, creating a kid’s catalog, writing Evergreen’s manual, and instituting a unit and regression testing regime. As we speak, we’re in the middle of a project to replace the staff client with a web-based staff interface.

All of this happened — and continues to happen — in a community where there’s little room for anybody to dictate to another community member to do anything in particular. We have no dictator, benevolent or otherwise; no user enhancement committee; no permanent staff employed by the Evergreen Project.

How does anything get done? By the power of Voltron… that is, interdependence.

In 2011, Evergreen became a member project of the Software Freedom Conservancy, representing a culmination of the efforts started in 2010 (as Grace mentioned).

As a member project of Conservancy, Evergreen receives several benefits: Conservancy holds the project’s money, negotiates venue contracts for the annual conference and hack-a-way, and holds the project’s trademark. However, Conservancy does not run the project — nor do they want to.

As part of joining Conservancy, the Evergreen Project established an Oversight Board, and in 2012, I had the privilege of beginning a term as chair of the EOB. The EOB is Conservancy’s interface with the Evergreen Project, and the EOB is the group that is ultimately responsible for making financial decisions.

Aha! You might think to yourself: “So, if the Evergreen Project doesn’t have a dictator in the mold of Linus Torvalds, it has elected oligarchs in the form of the Oversight Board!”

And you would be wrong. The Evergreen Oversight Board does not run the project either. The EOB does not appoint the release managers; it does not dictate who is part of the Documentation Interest Group; it does not mandate any particular sort of QA.

What does the EOB do? In part, it does help establish policies for the entire project; for example, Evergreen’s decision to adopt a code of conduct in 2014 arose from the suggestions and actions of EOB members, including Kathy Lussier and Amy Terlaga. It also, in conjunction with Conservancy, helps to protect the trademark.

The trademark matters. It represents a key piece of collective ownership, ownership that is in the hands of the community via a nonprofit, disinterested organization. Evergreen is valuable, not just as a tool that libraries can use to help patrons get access to library resources, but in part as something that various institutions have built successful services (commercial or otherwise) on.  If you take nothing else away from this post, take this: if you plan to launch an open source project for the benefit of libraries, give a thought to how the trademark should be owned and managed.  The consequences of not doing so can end up creating a huge distraction from shipping excellent software… or worse.

But back to the question of governance: how does the day to day work of writing documentation, slinging code, updating websites, training new users, seeking additional contributors, unruffling feathers, and so forth get done? By constant negotiation in a sea of interdependence. This is complicated, but not chaotic. There are plenty of contracts helping protect the interests of folks contributing to and using Evergreen: contracts with non-profit and for-profit service providers like Equinox; contracts to join consortia; contracts to pool money together for a specific project. There are also webs of trust and obligation: a developer can become a committer by showing that they are committed to improving Evergreen and have a track record of doing so successfully.

Governance is inescapable in any project that has more than one person; it is particularly important in community-based open source projects. Evergreen has benefited from a lot of careful thought about formal and informal rules and lines of communication…. and will continue to do so.

— Galen Charlton, Added Services and Infrastructure Manager

This is the seventh in our series of posts leading up to Evergreen’s Tenth birthday.

OCLC Dev Network: Leveraging Client-Side API and Linked Data Support

planet code4lib - Fri, 2016-08-26 13:00

See an example of how client-side support in APIs and Linked Data create opportunities for innovative applications.

FOSS4Lib Recent Releases: DIVA.js - 5.0

planet code4lib - Fri, 2016-08-26 12:55

Last updated August 26, 2016. Created by Peter Murray on August 26, 2016.

Package: DIVA.js
Release Date: Thursday, August 25, 2016

LibUX: Improve your UX with Google Analytics

planet code4lib - Fri, 2016-08-26 11:45

Michael Beasley — author of Practical Web Analytics for User Experience — shares really quite useful tips for using Google Analytics to infer intent.

Evergreen ILS: Evergreen 2.9.7 and 2.10.6 released

planet code4lib - Fri, 2016-08-26 00:59

We are pleased to announce the release of Evergreen 2.9.7 and 2.10.6, both bugfix releases.

Evergreen 2.9.7 fixes the following issues:

  • The claims never checked out counter on the patron record is now incremented correctly when marking a lost loan as claims-never-checked-out.
  • When a transit is canceled, the copy’s status is changed only if its status was previously “In Transit”.
  • Retrieving records with embedded holdings via SRU and Z39.50 is now faster.
  • The hold status message in the public catalog now uses better grammar.
  • The error message displayed when a patron attempts to place a hold but is prevented from doing so due to policy reasons is now more likely to be useful.
  • The public catalog now draws the edition statement only from the 250 field; it no longer tries to check the 534 and 775 fields.
  • Embedded schema.org microdata now uses “offeredBy” rather than “seller”.
  • The ContentCafe added content plugin now handles the “fake” ISBNs that Baker and Taylor assigns to media items.
  • Attempting to renew a rental or deposit item in the public catalog no longer causes an internal server error.
  • Various format icons now have transparent backgrounds (as opposed to white).
  • The staff client will no longer wait indefinitely for Novelist to supply added content, improving its responsiveness.
  • A few additional strings are now marked as translatable.

Evergreen 2.10.6 fixes the same issues fixed in 2.9.7, and also fixes the following:

  • Those stock Action Trigger event definitions that send email will now include a Date header.
  • Prorating invoice charges now works again.
  • A performance issue with sorting entries on the public catalog circulation history page is fixed.
  • Various style and responsive design improvements are made to the circulation and holds history pages in the public catalog.
  • The public catalog holds history page now indicates if a hold had been fulfilled.

Evergreen 2.10.6 also includes updated translations. In particular, Spanish has received a huge update with over 9,000 new translations, Czech has received a sizable update of over 800 translations, and additional smaller updates have been added for Arabic, French (Canada), and Armenian.

Please visit the downloads page to retrieve the server software and staff clients.

David Rosenthal: Evanescent Web Archives

planet code4lib - Thu, 2016-08-25 18:00
Below the fold, discussion of two articles from last week about archived Web content that vanished.

At Urban Milwaukee Michail Takach reports that Journal Sentinel Archive Disappears:
Google News Archive launched [in 2008] with ambitious plans to scan, archive and release the world’s newspapers in a single public access database. ... When the project abruptly ended three years later, the project had scanned over a million pages of news from over 2,000 newspapers. Although nobody is entirely sure why the project ended, Google News Archive delivered an incredible gift to Milwaukee: free digital access to more than a century’s worth of local newspapers.

But now:
on Tuesday, August 16, the Milwaukee Journal, Milwaukee Sentinel, and Milwaukee Journal Sentinel listings vanished from the Google News Archive home page. This change came without any advance warning and still has no official explanation. The result for Takach is:
For years, I’ve bookmarked thousands of articles and images for further exploration at a later date. In one lightning bolt moment, all of my Google News Archive bookmarks went from treasure to trash. To be fair, this doesn't appear to be another case of Google abruptly canceling a service:
“Google News Archive no longer has permission to display this content.” According to the Milwaukee Journal Sentinel:
“We have contracted with a new vendor (Newsbank.) It is unclear when or if the public will have access to the full inventory that was formerly available on Google News Archive.” The owner of the content arbitrarily decided to vanish it.

At U.S. News & World Report Steven Nelson's Wayback Machine Won’t Censor Archive for Taste, Director Says After Olympics Article Scrubbed is an excellent, detailed and even-handed look at the issues raised for the Internet Archive when the Daily Beast's:
straight reporter created a gay dating profile and reported the weights, athletic events and nationalities of Olympians who contacted him, including those from "notoriously homophobic" countries. As furor spread last week, the Daily Beast revised and then retracted the article, sending latecomers to the controversy to the Wayback Machine. The Internet Archive has routine processes that make content they have collected inaccessible, for example in response to DMCA takedown notices. It isn't clear exactly what happened in this case. Mark Graham is quoted:
“The page we’re talking about here was removed from the Wayback Machine out of a concern for safety and that’s it.”... Graham was not immediately able to think of a similar safety-motivated removal and declined to say if the Internet Archive retains a non-public copy. In fact, he says he has no proof, just circumstantial evidence, the article ever was in the Wayback Machine.

I would endorse Chris Bourg's stance on this issue:
Chris Bourg, director of libraries at the Massachusetts Institute of Technology, says the matter is "a tricky situation where librarian/archivists values of privacy and openness come in to conflict" and says in an email the article simply could be stored in non-public form for as long as necessary.

"My personal opinion is that we should always look for answers that cause the least harm, which in this case would be to dark archive the article; and keep it archived for as long as needed to best protect the gay men who might otherwise be outed," she says. "That’s a difficult thing to do, and is no guarantee that the info won’t be released and available from other sources; but I think archivists/librarians have special responsibilities to the subjects in our collections to 'do no harm'."These two stories bring up four points to consider:
  • The Internet Archive is the most-used, but only one among a number of Web archives which will naturally have different policies. Portals to the archived Web that use Memento to aggregate their content, such as oldweb.today, could well find content the Wayback machine had suppressed in other archives.
  • Copyright enables censorship. Anything on the public Web, or in public Web archives, can be rendered inaccessible without notice by the use or abuse of copyright processes, such as the DMCA takedown process.
  • Just because archived Web resources are in the custody of a major company, such as Google, or even what we may now thankfully call a major institution, the Internet Archive, does not guarantee them permanence.
  • Thus, scholars such as Takach are faced with a hard choice: either risk losing access without notice to the resources on which their work is based, or ignore the law and maintain, on their own equipment, a personal archive of all those resources.
While not specifically about Web archives, emptywheel's account of the removal of the Shadow Brokers files from GitHub, Reddit and Tumblr, and Roxane Gay's The Blog That Disappeared about Google's termination of Dennis Cooper's account, show that one cannot depend on what services such as these say in their Terms of Service.

LITA: New Titles in the LITA Guide Series

planet code4lib - Thu, 2016-08-25 17:55

A new relationship between LITA and Rowman and Littlefield publishers kicks off with the announcement of 7 recent and upcoming exciting titles on library technology. The LITA Guide Series books from Rowman and Littlefield contain practical, up-to-date, how-to information and are usually under 100 pages. Proposals for new titles can be submitted to the Acquisitions editor using this link.

LITA members receive a 20% discount on all the titles. To get that discount, use promotion code RLLITA20 when ordering from the Rowman and Littlefield LITA Guide Series web site.

      

Here are the current new LITA Guide Series titles:

Integrating LibGuides into Library Websites
Edited by Aaron W. Dobbs and Ryan L. Sittler (October 2016)

Innovative LibGuides Application: Real World Examples
Edited by Aaron W. Dobbs and Ryan L. Sittler (October 2016)

Data Visualization: A Guide to Visual Storytelling for Libraries
Edited by Lauren Magnuson (September 2016)

Mobile Technologies in Libraries
Ben Rawlins (September 2016)

Library Service Design: A LITA Guide to Holistic Assessment, Insight, and Improvement
Joe J. Marquez and Annie Downey (July 2016)

The Librarian’s Introduction to Programming Languages
Edited by Beth Thomsett-Scott (June 2016)

Digitizing Flat Media: Principles and Practices
Joy M. Perrin (December 2015)

LITA publications help to fulfill its mission to educate, serve and reach out to its members, other ALA members and divisions, and the entire library and information community through its publications, programs and other activities designed to promote, develop, and aid in the implementation of library and information technology.

Open Knowledge Foundation: OpenTrials launch date + Hack Day

planet code4lib - Thu, 2016-08-25 13:26

Exciting news! OpenTrials, a project in which Open Knowledge is developing an open, online database of information about the world’s clinical research trials, will officially launch its beta on Monday 10th October 2016 at the World Health Summit in Berlin. After months of work behind-the-scenes meeting, planning, and developing, we’re all really excited about demoing OpenTrials to the world and announcing how to access and use the site!

The launch will take place at the ‘Fostering Open Science in Global Health’ workshop, with OpenTrials being represented by our Community Manager, Ben Meghreblian. The workshop will be a great opportunity to talk about the role of open data, open science, and generally how being open can bring improvements in medicine and beyond!

As the workshop’s theme is public health emergencies, we’ll also be demoing Ebola Trials Tracker, another OpenTrials project showing how long it takes for the results of Ebola trials to be made available.

If you’ll be attending the conference or the workshop, we’d love to meet you – please do get in touch and let us know.

Hack Day

If that wasn’t enough, we also have a confirmed date and location for the OpenTrials Hack Day – it will take place on Saturday 8th October at the German office of Wikimedia in Berlin.

We’re inviting people from a range of backgrounds. So, if you’re a developer, data scientist, health technologist, open data advocate, or otherwise interested in health, medicine, and clinical trials, come along and learn more about the data that powers OpenTrials, how it’s structured, and how to use our API to search the OpenTrials database or build applications using the data.

On the day our technical lead and a domain expert will be on hand to explain the data and facilitate the day – we’re really looking forward to seeing what clever hacks and mini-projects you’ll create.

For those of you who have already asked, we’ll be releasing documentation on the OpenTrials API and database soon, but meanwhile if you’re interested in the event you’ll find more details on the OpenTrials Eventbrite page, or you can register quickly below.

OpenTrials is funded by The Laura and John Arnold Foundation and directed by Dr. Ben Goldacre, an internationally known leader on clinical transparency.

Contact: opentrials@okfn.org
Twitter: @opentrials


Karen Coyle: Catalogs and Content: an Interlude

planet code4lib - Thu, 2016-08-25 03:02
This entire series is available as a single file on my web site.

"Editor's note. Providing subject access to information is one of the most important professional services of librarians; yet, it has been overshadowed in recent years by AACR2, MARC, and other developments in the bibliographic organization of information resources. Subject access deserves more attention, especially now that results are pouring in from studies of online catalog use in libraries."
American Libraries, Vol. 15, No. 2 (Feb., 1984), pp. 80-83

Having thought and written about the transition from card catalogs to online catalogs, I began to do some digging in the library literature, and struck gold. In 1984, Pauline Atherton Cochrane, one of the great thinkers in library land, organized a six-part "continuing education" to bring librarians up to date on the thinking regarding the transition to new technology. (Dear ALA - please put these together into a downloadable PDF for open access. It could make a difference.) What is revealed here is both stunning and disheartening, as the quote above shows; in terms of catalog models, very little progress has been made, and we are still spending more time organizing atomistic bibliographic data while ignoring subject access.

The articles are primarily made up of statements by key library thinkers of the time, many of whom you will recognize. Some responses contradict each other; others fall into familiar grooves. The Library of Congress is criticized for not moving faster into the future, much as it is today, and yet respondents admit that the general dependency on LC makes any kind of fast turn-around of changes difficult. Some of the desiderata have been achieved, but not the overhaul of subject access in the library catalog.

The Background

If you think that libraries moved from card catalogs to online catalogs in order to serve users better, think again. Like other organizations that had a data management function, libraries in the late 20th century were reaching the limits of what could be done with analog technology. In fact, as Cochrane points out, by the mid-point of that century libraries had given up on the basic catalog function of providing cross references from unused to used terminology, as well as from broader and narrower terms in the subject thesaurus. It simply wasn't possible to keep up with these, not to mention that although the Library of Congress and service organizations like OCLC provided ready-printed cards for bibliographic entries, they did not provide the related reference cards. What libraries did (and I remember this from my undergraduate years) is they placed near the card catalog copies of the "Red Book". This was the printed Library of Congress Subject Heading list, which by my time was in two huge volumes, and, yes, was bound in red. Note that this was the volume that was intended for cataloging librarians who were formulating subject headings for their collections. It was never intended for the end-users of the catalog. The notation ("x", "xx", "sa") was far from intuitive. In addition, for those users who managed to follow the references, it pointed them to the appropriate place in LCSH, but not necessarily in the catalog of the library in which they were searching. Thus a user could be sent to an entry that simply did not exist.

The "RedBook" todayFrom my own experience, when we brought up the online catalog at the University of California, the larger libraries had for years had difficulty keeping the card catalog up to date. The main library at the University of California at Berkeley regularly ran from 100,000 to 150,000 cards behind in filing into the catalog, which filled two enormous halls. That meant that a book would be represented in the catalog about three months after it had been cataloged and shelved. For a research library, this was a disaster. And Berkeley was not unusual in this respect.

Computerization of the catalog was both a necessary practical solution and a kind of holy grail. At the time that these articles were written, only a few large libraries had an online catalog, and that catalog represented only a recent portion of the library's holdings. (Retrospective conversion of the older physical card catalog to machine-readable form came later, culminating in the 1990's.) Abstracting and indexing databases (DIALOG, PRECIS, and others) had preceded libraries in automating, and these gave librarians their first experience in searching computerized bibliographic data.

This was the state of things when Cochrane presented her 6-part "continuing education" series in American Libraries.

Subject Access

The series of articles was stimulated by an astonishingly prescient article by Marcia Bates in 1977. In that article she articulates both concerns and possibilities that, quite frankly, we should all take to heart today. In Lesson 3 of Cochrane's articles, Bates is quoted from 1977 saying:
"...with automation, we have the opportunity to introduce many access points to a given book. We can now use a subject approach... that allows the naive user, unconscious of and uninterested in the complexities of synonymy and vocabulary control, to blunder on to desired subjects, to be guided, without realizing it, by a redundant but carefully controlled subject access system." and
"And now is the time to change -- indeed, with MARC already so highly developed, past time. If we simply transfer the austerity-based LC subject heading approach to expensive computer systems, then we have used our computers merely to embalm the constraints that were imposed on library systems back before typewriters came into use!"
This emphasis on subject access was one of the stimuli for the AL lessons. In the early 1980's, studies done at OCLC and elsewhere showed that over 50% of the searches being done in the online catalogs of that day were subject searches, even those going against title indexes or mixed indexes. (See footnotes to Lesson 3.) Known item searching was assumed to be under control, but subject searching posed significant problems. Comments in the article include:
"...we have not yet built into our online systems much of the structure for subject access that is already present in subject cataloging. That structure is internal and known by the person analyzing the work; it needs to be external and known by the person seeking the work."
"Why should a user ever enter a search term that does not provide a link to the syndetic apparatus and a suggestion about how to proceed?"Interestingly, I don't see that any of these problems has been solved into today's systems.

As a quick review, here are some of the problems, some proposed solutions, and some hope for future technologies that are presented by the thinkers that contributed to the lessons.

Problems noted

Many problems were surfaced, some with fairly simple solutions, others that we still struggle with.
  • LCSH is awkward, if not nearly unusable, both for its vocabulary and for the lack of a true hierarchical organization
  • Online catalogs' use of LCSH lacks syndetic structure (see, see also, BT, NT). This is true not only for display but also in retrieval: a search on a broader term does not retrieve items with a narrower term (which would be logical to at least some users)
  • Libraries assign too few subject headings
  • For the first time, some users are not in the library while searching, so there are no intermediaries (e.g. reference librarians) available. (One of the flow diagrams has a failed search pointing to a box called "see librarian," something we would not think to include today.)
  • Lack of a professional theory of information seeking behavior that would inform systems design. ("Without a blueprint of how most people want to search, we will continue to force them to search the way we want to search." Lesson 5)
  • Information overload, aka overly large results, as well as too few results on specific searches

Proposed solutions

Some proposed solutions were mundane (add more subject headings to records) while others would require great disruption to the library environment.
  • Add more subject headings to MARC records
  • Use keyword searching, including keywords anywhere in the record.
  • Add uncontrolled keywords to the records.
  • Make the subject authority file machine-readable and integrate it into online catalogs.
  • Forget LCSH, instead use non-library bibliographic files for subject searching, such as A&I databases.
  • Add subject terms from non-library sources to the library catalog, and/or do (what today we call) federated searching
  • LCSH must provide headings that are more specific as file sizes and retrieved sets grow (in the document, a retrieved set of 904 items was noted with an exclamation point)

Future thinking

As is so often the case when looking to the future, some potential technologies were seen as solutions. Some of these are still seen as solutions today (c.f. artificial intelligence), while others have been achieved (storage of full text).
  • Full text searching, natural language searches, and artificial intelligence will make subject headings and classification unnecessary
  • We will have access to back-of-the-book indexes and tables of contents for searching, as well as citation indexing
  • Multi-level systems will provide different interfaces for experts and novices
  • Systems will be available 24x7, and there will be a terminal in every dorm room
  • Systems will no longer need to use stopwords
  • Storage of entire documents will become possible
End of Interlude

Although systems have allowed us to store and search full text, to combine bibliographic data from different sources, and to deliver world-wide, 24x7, we have made almost no progress in the area of subject access. There is much more to be learned from these articles, and it would be instructive to do an in-depth comparison of them to where we are today. I greatly recommend reading them; each is only a few pages long.

----- The Lessons -----

*Modern Subject Access in the Online Age: Lesson 1
by Pauline Atherton Cochrane
Source: American Libraries, Vol. 15, No. 2 (Feb., 1984), pp. 80-83
Stable URL: http://www.jstor.org/stable/25626614

*Modern Subject Access in the Online Age: Lesson 2
Pauline A. Cochrane
Source: American Libraries, Vol. 15, No. 3 (Mar., 1984), pp. 145-148, 150
Stable URL: http://www.jstor.org/stable/25626647

*Modern Subject Access in the Online Age: Lesson 3
Author(s): Pauline A. Cochrane, Marcia J. Bates, Margaret Beckman, Hans H. Wellisch, Sanford Berman, Toni Petersen, and Stephen E. Wiberley, Jr.
Source: American Libraries, Vol. 15, No. 4 (Apr., 1984), pp. 250-252, 254-255
Stable URL: http://www.jstor.org/stable/25626708

*Modern Subject Access in the Online Age: Lesson 4
Author(s): Pauline A. Cochrane, Carol Mandel, William Mischo, Shirley Harper, Michael Buckland, Mary K. D. Pietris, Lucia J. Rather and Fred E. Croxton
Source: American Libraries, Vol. 15, No. 5 (May, 1984), pp. 336-339
Stable URL: http://www.jstor.org/stable/25626747

*Modern Subject Access in the Online Age: Lesson 5
Author(s): Pauline A. Cochrane, Charles Bourne, Tamas Doczkocs, Jeffrey C. Griffith, F. Wilfrid Lancaster, William R. Nugent and Barbara M. Preschel
Source: American Libraries, Vol. 15, No. 6 (Jun., 1984), pp. 438-441, 443
Stable URL: http://www.jstor.org/stable/25629231

*Modern Subject Access In the Online Age: Lesson 6
Author(s): Pauline A. Cochrane, Brian Aveney and Charles Hildreth
Source: American Libraries, Vol. 15, No. 7 (Jul. - Aug., 1984), pp. 527-529
Stable URL: http://www.jstor.org/stable/25629275

DuraSpace News: KnowledgeArc Launches the NHS and FFI on DSpace

planet code4lib - Thu, 2016-08-25 00:00

By Michael Guthrie

We are pleased to welcome the NHS Derby Teaching Hospitals and the Norwegian Defence Research Establishment (FFI) to the KnowledgeArc managed, hosted DSpace platform.

Eric Lease Morgan: Limit to full text in VuFind

planet code4lib - Wed, 2016-08-24 20:16

This posting outlines how a “limit to full text” functionality was implemented in the “Catholic Portal’s” version of VuFind.

While there are many dimensions of the Catholic Portal, one of its primary components is a sort of union catalog of rare and infrequently held materials of a Catholic nature. This union catalog is comprised of metadata from MARC records, EAD files, and OAI-PMH data repositories. Some of the MARC records include URLs in 856$u fields. These URLs point to PDF files that have been processed with OCR. The Portal’s indexer has been configured to harvest the PDF documents when it comes across them. Once harvested, the OCR is extracted from the PDF file, and the resulting text is added to the underlying Solr index. The values of the URLs are saved to the Solr index as well. Almost by definition, all of the OAI-PMH content indexed by the Portal is full text; almost all of the OAI-PMH content includes pointers to images or PDF documents.
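
As a rough illustration of that kind of pipeline (not the Portal's actual indexer), the steps might look something like the sketch below: fetch the PDF named in the 856$u field, extract its OCR text with the pdftotext utility, and send the text and URL to Solr. The field names, the Solr URL, and the reliance on poppler's pdftotext are all assumptions made for the example.

```python
import subprocess
import tempfile
import requests

SOLR_UPDATE = "http://localhost:8080/solr/biblio/update?commit=true"  # placeholder core and host

def index_pdf(record_id, pdf_url):
    """Fetch a PDF, pull out its OCR text, and add both text and URL to the Solr index."""
    pdf_bytes = requests.get(pdf_url).content
    with tempfile.NamedTemporaryFile(suffix=".pdf") as tmp:
        tmp.write(pdf_bytes)
        tmp.flush()
        # pdftotext (from poppler-utils) writes the extracted text to stdout when given "-"
        text = subprocess.run(["pdftotext", tmp.name, "-"],
                              capture_output=True, text=True, check=True).stdout

    doc = {"id": record_id, "url": pdf_url, "fulltext": text}
    requests.post(SOLR_UPDATE, json=[doc]).raise_for_status()

index_pdf("example-record-001", "http://example.org/scans/pamphlet.pdf")
```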

Consequently, if a reader wanted to find only full text content, then it would be nice to: 1) do a search, and 2) limit to full text. And this is exactly what was implemented. The first step was to edit Solr’s definition of the url field. Specifically, its “indexed” attribute was changed from false to true. Trivial. Solr was then restarted.

The second step was to re-index the MARC content. Once this was complete, the reader was able to search the index for URL content — “url:*”. In other words, find all records whose URL equals anything.

The third step was to understand that all of the local VuFind OAI-PMH identifiers have the same shape. Specifically, they all include the string “oai”. Consequently, the very astute reader could find all OAI-PMH content with the following query: “id:*oai*”.

The fourth step was to turn on a VuFind checkbox option found in facets.ini. Specifically, the “[CheckboxFacets]” section was augmented to include the following line:

id:*oai* OR url:* = “Limit to full text”

When this was done a new facet appeared in the VuFind interface.

Finally, the whole thing comes to fruition when a person does an initial search. The results are displayed, and the facets include a limit option. Upon selection, VuFind searches again, but limits the query by “id:*oai* OR url:*” — only items that have URLs or come from OAI-PMH repositories. Pretty cool.
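
For the curious, the same limit can be exercised directly against the Solr index as a filter query, outside of VuFind. The snippet below is only a generic sketch; the host, core name, and field names are placeholders rather than the Portal's actual configuration.

```python
import requests

SOLR_SELECT = "http://localhost:8080/solr/biblio/select"  # placeholder host and core

def full_text_search(query):
    """Run a search but keep only records that have a URL or came from an OAI-PMH repository."""
    params = {
        "q": query,
        "fq": "id:*oai* OR url:*",   # the same limit used by the checkbox facet
        "wt": "json",
        "rows": 20,
    }
    response = requests.get(SOLR_SELECT, params=params).json()
    return response["response"]["docs"]

for doc in full_text_search("catholic missions"):
    print(doc.get("id"), doc.get("title"))
```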

Kudos go to Demian Katz for outlining this process. Very nice. Thank you!

LITA: Jobs in Information Technology: August 24, 2016

planet code4lib - Wed, 2016-08-24 18:51

New vacancy listings are posted weekly on Wednesday at approximately 12 noon Central Time. They appear under New This Week and under the appropriate regional listing. Postings remain on the LITA Job Site for a minimum of four weeks.

New This Week

American Institute for Radiologic Pathology, Medical Archivist / Case manager, Silver Spring, MD

Visit the LITA Job Site for more available jobs and for information on submitting a job posting.

Andromeda Yelton: An open letter to Heather Bresch

planet code4lib - Wed, 2016-08-24 13:49

Dear Heather Bresch,

You lived in Morgantown. I did, too: born and raised. My parents are retired from the university you attended. My elementary school took field trips to Mylan labs. They were shining, optimistic.

You’re from West Virginia. I am, too. This means we both know something of the coal industry that has both sustained and destroyed our home. You know, as I do, how many miners have been killed in explosions: trapped underground when a pocket of methane ignites. We both know that miners long carried safety lamps: carefully shielded but raw flames that would go out when the oxygen went too low, a warning to get away — if they had not first exploded, as open flames around methane do. Perhaps you know, as I only recently learned, that miners were once required to buy their own safety lamps: so when safer ones came out, ones that would only warn without killing you first, miners did not carry them. They couldn’t afford to. They set probability against their lives, went without the right equipment, and sometimes lost, and died.

I’m a mother. You are, too. I don’t know if your children carry medication for life-threatening illnesses; I hope you have not had to face that. I have. In our case it’s asthma, not allergies, and an inhaler, not an Epi-Pen. It’s a $20 copay with our insurance and lasts for dozens of doses. It doesn’t stop asthma attacks once they start — my daughter’s asthma is too severe for that — but sometimes it prevents them. And when it does not, it still helps: we spend two days in the hospital instead of five; we don’t go to the ICU. (Have you ever been with your child in a pediatric ICU? It is the most miraculous, and the worst, place on earth.)

Most families can find their way to twenty dollars. Many cannot find six hundred. They’ll go without, and set probability against their children’s lives. Rich children will live; poor children will sometimes lose, and die.

I ask you to reconsider.

Sincerely,

Andromeda Yelton


Equinox Software: Year 2010 : Sine Qua Non

planet code4lib - Wed, 2016-08-24 13:43

This is the fifth in our series of posts leading up to Evergreen’s Tenth birthday.  

I often tell people I hire that when you start a new job the first month is the honeymoon period. At month three you are panicking and possibly wondering why you thought you could do this. At six months you realize you’ve actually got the answers and at twelve months it’s like you never worked anywhere else. For me, 2010 represented months six through eighteen of my employment with Equinox and it was one of the most difficult, rewarding, and transformative years of my career. Coincidentally, it was also an incredibly transforming year for Evergreen.

In early 2010, Evergreen 1.6 was planned and released on schedule thanks to contributing efforts from the usual suspects back at that time. Bug fixes and new development were being funded or contributed by PINES, Conifer, Mohawk College, Evergreen Indiana, Calvin College, SAGE, and many others in the community. Somewhere in the midst of the ferocious adoption rate and evolution of 2010, Evergreen quietly and without fanfare faced (and passed) its crucible. Instead of being thrown off stride, this amazingly determined community not only met the challenge, but deftly handled the inevitable friction that was bound to arise as the community grew.

In late August of 2010 KCLS went live on a beta version of Evergreen 2.0 after just over a year of intense and exhilarating development. It marked the beginning of another major growth spurt for Evergreen, including full support for Acquisitions and Serials, as well as the introduction of the template toolkit OPAC (or TPAC). I have nothing but positive things to say about the teams that worked to make that go-live a reality. KCLS and Equinox did amazing things together and, while not everything we did was as successful as we had envisioned, we were able to move Evergreen forward in a huge leap. More importantly, everyone involved learned a lot about ourselves and our organizations – including the community itself.

The community learned that we were moving from a small group of “insiders” and enthusiasts into a more robust and diverse community of users. This is, of course, natural and desirable for an open source project but the thing that sticks out in my mind is how quickly and easily the community adapted to rapid change. At the Evergreen Conference in 2010 a dedicated group met and began the process of creating an official governance structure for the Evergreen project. This meeting led to the eventual formation of the Evergreen Oversight Board and our current status as a member project of the Software Freedom Conservancy.

In the day-to-day of the Evergreen project I witnessed how the core principles of open source projects could shape a community of librarians. And I was proud to see how this community of librarians could contribute their core principles to strengthen the project and its broader community. We complement one another even as we share the most basic truths:
*The celebration of community
*The merit of the individual
*The empowerment of collaboration
*The belief that information should be free

Evergreen is special. More importantly, our community is special. And it’s special because behind each line of code there are dozens of people who contributed their time to create it. Each of those people brought with them their passion, their counter-argument, their insight, their thoughtfulness, and their sheer determination. And together, this community created something amazing. They made great things. They made mistakes. They learned. They adapted. They persevered. And those people behind those lines of code? They’re not abstractions. They are people I know and respect; people who have made indelible marks on our community. It’s Mike, Jason, Elizabeth, Galen, Kathy, Bill, Amy, Dan, Angela, Matt, Elaine, Ben, Tim, Sharon, Lise, Jane, Lebbeous, Rose, Karen, Lew, Joan, and too many others to name. They’re my community and when I think back on how much amazing transformation we’ve achieved in just one year, or ten years, I can’t wait to see what we do in the next ten.

– Grace Dunbar, Vice President

Open Knowledge Foundation: Open Knowledge Switzerland Summer 2016 Update

planet code4lib - Wed, 2016-08-24 10:46

The first half of 2016 was a very busy one for the Open Knowledge Swiss chapter, Opendata.ch. Between April and June alone, the chapter held 3 hackathons, 15 talks, 3 meetups, and 10 workshops. In this blog post we highlight some of these activities to update the Open Knowledge community about our chapter’s work.


Main projects

Our directors worked on relaunching the federal Open Government Data portal and its new online handbook. We gathered and published datasets and ran workshops in support of various hackdays – and we migrated and improved our web infrastructure with better support for the open Transport API (handling up to 1.7 million requests per day!).


Main events

We held our annual conference in June, ran energy-themed hackdays in April, and held an OpenGLAM hackathon in July. Additionally, we supported two smaller regional hackathons in the spring and a meetup on the occasion of Open Data Day.


Challenges

Like other organisations in this space, we face the challenge of redefining our manifesto and restructuring our operations to become a smoother-running chapter that is more responsive to the needs of our members and community. This restructuring continues to be a challenge that we are learning from – and need to learn more about.


Successes

Our media presence and public identity continue to be stronger than ever. We are involved in a wide range of political and inter-organizational activities in support of diverse areas of openness, and in general we are finding that our collective voice is stronger and our messages better received everywhere we go.


Governance

We have had several retreats with the board to discuss changes in governance and to welcome new directors: Catherine Pugin (ta-swiss.ch, datastory.ch), Martin Grandjean (martingrandjean.ch), and Alexandre Cotting (hevs.ch).

We are primarily working on a better overall organizational structure to support our community and working groups: starting and igniting new initiatives will be the next step. Among them will be the launch of a business-oriented advocacy group called the “Swiss Data Alliance”.


Looking ahead

We will soon announce a national program on food data, which includes hackdays and a funded follow-up/incubation phase for the prototypes produced. We are also busy setting up a hackathon with international scope and support at the end of September, called Hack for Ageing Well. Follow #H4AW for more info.

We are excited about upcoming cross-border events like #H4AW and Jugend Hackt, opening doors to development and research collaborations. Reach out through the Open Knowledge forums and we’ll do our best to connect you with the Swiss community!

LibUX: Helping users easily access content on mobile

planet code4lib - Wed, 2016-08-24 04:55

 Pages that show intrusive interstitials provide a poorer experience to users than other pages where content is immediately accessible. This can be problematic on mobile devices where screens are often smaller. To improve the mobile search experience, after January 10, 2017, pages where content is not easily accessible to a user on the transition from the mobile search results may not rank as highly.

I wonder, by their description, whether this covers the exit-intent pop-ups that OptinMonster made popular.

 Showing a popup that covers the main content, either immediately after the user navigates to a page from the search results, or while they are looking through the page.

One can hope.

Helping users easily access content on mobile

LibUX: A few things Brodie Austin learned doing usability tests on library websites

planet code4lib - Wed, 2016-08-24 04:43

Preach.

 My #1 rule when it came to thinking about website usability was that no one was allowed to claim to know what “normal people” would think or do until we actually sat down with normal(ish) people.

So, you want to do usability testing on your library website

Galen Charlton: Visualizing the global distribution of Koha installations from Debian packages

planet code4lib - Wed, 2016-08-24 04:15

A picture is worth a thousand words:

[Figure: bubble map showing the geographic distribution of Koha Debian package downloads]

This represents the approximate geographic distribution of downloads of the Koha Debian packages over the past year. Data was taken from the Apache logs from debian.koha-community.org, which MPOW hosts. I counted only completed downloads of the koha-common package, of which there were over 25,000.
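
The log crunching itself isn’t shown here. A minimal sketch of that counting step, assuming an Apache combined-format access log and treating an HTTP 200 response for the koha-common .deb as a completed download, might look like the following; the log file name and the URL pattern are placeholders for illustration, not details from the original.

from __future__ import print_function
import re
from collections import Counter

# one Apache combined-format log line: capture client IP, request path,
# status code, and response size
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "GET (\S+) [^"]*" (\d{3}) (\S+)')

downloads = Counter()
with open('access.log') as log:  # hypothetical log file name
    for line in log:
        m = LOG_LINE.match(line)
        if not m:
            continue
        ip, path, status, size = m.groups()
        # count a 200 response for the koha-common .deb as a completed download
        if 'koha-common' in path and path.endswith('.deb') and status == '200':
            downloads[ip] += 1

print('distinct IPs:', len(downloads))
print('total downloads:', sum(downloads.values()))

Counting per IP address up front is what makes it possible to exclude outliers later and to geolocate each distinct address only once.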

Making the map turned out to be an opportunity for me to learn some Python. I first adapted a Python script I found on Stack Overflow to query freegeoip.net and get the latitude and longitude corresponding to each of the 9,432 distinct IP addresses that had downloaded the package.
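
That geolocation script isn’t reproduced in the post either. A rough sketch of the lookup step, written in Python 2 style to match the plotting script below and assuming freegeoip.net’s JSON endpoint (since retired) returned latitude and longitude fields, might have looked like this; the input and output file layouts are invented for illustration.

import csv
import json
import urllib2  # Python 2, to match the era of the plotting script below

def geolocate(ip):
    # freegeoip.net's JSON endpoint (the service has since been retired)
    resp = urllib2.urlopen('http://freegeoip.net/json/' + ip)
    data = json.load(resp)
    return data['latitude'], data['longitude']

# 'ip-counts.csv' (one "ip,count" row per distinct address) stands in for
# the output of the log counting; the output columns match what the
# plotting script reads
with open('ip-counts.csv') as src, open('koha-with-loc.csv', 'wb') as out:
    writer = csv.writer(out)
    writer.writerow(['lat', 'lon', 'value'])
    for ip, count in csv.reader(src):
        try:
            lat, lon = geolocate(ip)
            writer.writerow([lat, lon, count])
        except Exception:
            continue  # skip addresses the service could not resolve

Looking each address up individually is slow but simple; caching results or batching requests would be the obvious next step for a larger dataset.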

I then fed the results to OpenHeatMap. While that service is easy to use and is written with GPL3 code, I didn’t quite like the fact that the result is delivered via an Adobe Flash embed.  Consequently, I turned my attention to Plotly, and after some work, was able to write a Python script that does the following:

  1. Fetch the CSV file containing the coordinates and number of downloads.
  2. Exclude as outliers rows where a given IP address made more than 100 downloads of the package during the past year — there were seven of these.
  3. Truncate the latitude and longitude to one decimal place — we need not pester corn farmers in Kansas for bugfixes.
  4. Submit the dataset to Plotly with which to generate a bubble map.

Here’s the code:

#!/usr/bin/python
# adapted from example found at https://plot.ly/python/bubble-maps/
import plotly.plotly as py
import pandas as pd

df = pd.read_csv('http://example.org/koha-with-loc.csv')
df.head()

# scale factor for the size of the bubbles
scale = 3

# filter out rows where an IP address did more than
# one hundred downloads
df = df[df['value'] <= 100]

# truncate latitude and longitude to one decimal place
df['lat'] = df['lat'].map('{0:.1f}'.format)
df['lon'] = df['lon'].map('{0:.1f}'.format)

# sum up the 'value' column as 'total_downloads'
aggregation = {
    'value': {'total_downloads': 'sum'}
}

# create a DataFrame grouping by the truncated coordinates
df_sub = df.groupby(['lat', 'lon']).agg(aggregation).reset_index()

coords = []
pt = dict(
    type = 'scattergeo',
    lon = df_sub['lon'],
    lat = df_sub['lat'],
    # convert the summed counts to strings for use as hover text
    text = 'Downloads: ' + df_sub['value']['total_downloads'].astype(str),
    marker = dict(
        size = df_sub['value']['total_downloads'] * scale,
        color = 'rgb(91,173,63)',  # Koha green
        line = dict(width=0.5, color='rgb(40,40,40)'),
        sizemode = 'area'
    ),
    name = ''
)
coords.append(pt)

layout = dict(
    title = 'Koha Debian package downloads',
    showlegend = True,
    geo = dict(
        scope = 'world',
        projection = dict(type='eckert4'),
        showland = True,
        landcolor = 'rgb(217, 217, 217)',
        subunitwidth = 1,
        countrywidth = 1,
        subunitcolor = "rgb(255, 255, 255)",
        countrycolor = "rgb(255, 255, 255)"
    ),
)

fig = dict(data=coords, layout=layout)
py.iplot(fig, validate=False, filename='koha-debian-downloads')

An interactive version of the bubble map is also available on Plotly.
