
Feed aggregator

Equinox Software: Evergreen 2014

planet code4lib - Tue, 2016-08-30 11:00

This past weekend I visited a farm in Central Washington and was able to see the full life cycle of crop production.  In one area, closed and light controlled, seeds germinate into small seedlings.  When large enough, the seedlings are tempered and prepared for movement out to the greenhouse.  In the greenhouse, the plants are carefully monitored and cultivated as they grow.  The last phase is moving the plants, now hardy and larger, out into the open air where, under the sun, they grow and fully develop for harvest.  My visit to the farm came at just the right time—there were fully grown plants ready for harvesting within the next few weeks and new seedlings, which will become next year’s crop, were just starting to grow.  While taking in this cyclical process of growth and harvest I couldn’t help but think about the growth of Evergreen over the years.

2014 is the year that saw the first seeds planted for the next generation of Evergreen.  While we all know and love the XUL staff client, the power and flexibility of newer technologies, such as AngularJS, was leading the Evergreen community to explore new options for a web-based staff interface.  In January 2014, a prototype for the web client was released to the Evergreen community, thanks to the research and work of Bill Erickson and the rest of the Equinox development team.  Bill planted those first seeds and gave the community something to cultivate.  After evaluating the prototype, the community came together to move forward with the project. With the support of many development partners (BC Libraries Cooperative, Bibliomation, C/W MARS, GPLS, Grand Rapids Public Library, Howe Library, Kenton County Public Library, MassLNC, NC Cardinal, OhioNET, PaILS, Pioneer Library System, and SC LENDS), the web client project became a reality.  And with that, the project moved into the greenhouse, where real growth and careful cultivation could occur.

Like staging crops on the farm, development of the web client was broken up into sprints, tackling the modules individually so that each could get the time it needed to grow and develop.  Since 2014, Equinox has continued steady development on the web client sprints.  The goal of the web client was to maintain feature parity with the old client by porting over the newer HTML interfaces and re-writing the older XUL interfaces.  Happily, and with much input from users, many usability improvements have been incorporated throughout the process.  To allow the web client to grow, the community decided to stop accepting new features into the XUL client, but development did not cease.  New features have been developed alongside the web client, so implementers will receive features such as customizable copy alerts and statistical popularity badges along with the new browser-based interface.

The web client is currently in the last stages of the greenhouse phase of development.  Sprints 1, 2, and 3, Circulation, Cataloging, and Administration/Reports, respectively, are complete.  Sprint 4, Acquisitions and Serials, is currently in development and will be completed this fall. Sprints 5 (Booking, Offline Mode, etc.) and 6 (Bug Fixing) will round out the development phase and, upon completion, the Evergreen web client will move out of the greenhouse and into the community for use where it will continue to grow organically to meet the needs of the community.

As a trainer, I introduce new libraries to Evergreen and the Evergreen community and help translate their workflows to a new interface and ILS.  Evergreen feels like home to me and I hope that I have been able to help other libraries feel at home with Evergreen as well.  Through community support and development, Evergreen has undergone tremendous growth in the past 10 years.  It is constantly evolving and becoming a stronger ILS that meets the needs of its users.  The web client is the next phase of this evolution and it is a big step forward.  I’m looking forward to getting to know “Webby” and seeing what the harvest will bring in the next 10 years.  

–Angela Kilsdonk, Education Manager

This is the ninth in our series of posts leading up to Evergreen’s Tenth birthday.

Conal Tuohy: Linked Open Data Visualisation at #GLAMVR16

planet code4lib - Tue, 2016-08-30 02:02

On Thursday last week I flew to Perth, in Western Australia, to speak at an event at Curtin University on visualisation of cultural heritage. Erik Champion, Professor of Cultural Visualisation, who organised the event, had asked me to talk about digital heritage collections and Linked Open Data (“LOD”).

The one-day event was entitled “GLAM VR: talks on Digital heritage, scholarly making & experiential media”, and combined presentations and workshops on cultural heritage data (GLAM = Galleries, Libraries, Archives, and Museums) with advanced visualisation technology (VR = Virtual Reality).

The venue was the Curtin HIVE (Hub for Immersive Visualisation and eResearch); a really impressive visualisation facility at Curtin University, with huge screens and panoramic and 3D displays.

There were about 50 people in attendance, and there would have been over a dozen different presenters, covering a lot of different topics, though with common threads linking them together. I really enjoyed the experience, and learned a lot. I won’t go into the details of the other presentations here, but quite a few people were live-tweeting, and I’ve collected most of the Twitter stream from the day into a Storify story, which is well worth a read and following up.

My presentation

For my part, I had 40 minutes to cover my topic. I’d been a bit concerned that my talk was more data-focused and contained nothing specifically about VR, but I think on the day the relevance was actually apparent.

The presentation slides are available here as a PDF: Linked Open Data Visualisation

My aims were:

  • At a tactical level, to explain the basics of Linked Data from a technical point of view (i.e. to answer the question “what is it?”); to show that it’s not as hard as it’s usually made out to be; and to inspire people to get started with generating it, consuming it, and visualising it.
  • At a strategic level, to make the case for using Linked Data as a basis for visualisation; that the discipline of adopting Linked Data technology is not at all a distraction from visualisation, but rather a powerful generic framework on top of which visualisations of various kinds can be more easily constructed, and given the kind of robustness that real scholarly work deserves.
Linked Data basics

I spent the first part of my talk explaining what Linked Open Data means; starting with “what is a graph?” and introducing RDF triples and Linked Data. Finally I showed a few simple SPARQL queries, without explaining SPARQL in any detail, but just to show the kinds of questions you can ask with a few lines of SPARQL code.
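The graph-of-triples idea can be made concrete with a toy sketch in Python. The data and predicate names below are invented for illustration; a real Linked Data store would use full URIs and a SPARQL engine, but the pattern-matching here mirrors what a basic SPARQL graph pattern does:

```python
# A toy triple store: each fact is a (subject, predicate, object) tuple.
triples = [
    ("ex:starry_night", "ex:creator", "ex:van_gogh"),
    ("ex:starry_night", "ex:title",   "The Starry Night"),
    ("ex:mona_lisa",    "ex:creator", "ex:da_vinci"),
    ("ex:mona_lisa",    "ex:title",   "Mona Lisa"),
]

def match(s=None, p=None, o=None):
    """Return triples matching a pattern; None acts as a wildcard,
    much like a variable in a SPARQL basic graph pattern."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]

# "What are the titles of works created by van Gogh?"
# Find the works, then look up each work's title.
works = [s for s, _, _ in match(p="ex:creator", o="ex:van_gogh")]
titles = [o for w in works for _, _, o in match(s=w, p="ex:title")]
print(titles)  # ['The Starry Night']
```

The equivalent SPARQL would bind `?work` and `?title` in a single `SELECT`, but the principle is the same: questions are patterns over triples.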

What is an RDF graph?

While I explained about graph data models, I saw attendees nodding, which I took as a sign of understanding and not that they were nodding off to sleep; it was still pretty early in the day for that.

One thing I hoped to get across in this part of the presentation was just that Linked Data is not all that hard to get into. Sure, it’s not a trivial technology, but barriers to entry are not that high; the basics of it are quite basic, so you can make a start and do plenty of useful things without having to know all the advanced stuff. For instance, there are a whole bunch of RDF serializations, but in fact you can get by with knowing only one. There are a zillion different ontologies, but again you only need to know the ontology you want to use, and you can do plenty of things without worrying about a formal ontology at all. I’d make the case for university eResearch agencies, software carpentry, and similar efforts, to be offering classes and basic support in this technology, especially in library and information science, and the humanities generally.

Linked Data as architecture

People often use the analogy of building, when talking about making software. We talk about a “build process”, “platforms”, and “architecture”, and so on. It’s not an exact analogy, but it is useful. Using that analogy, Linked Data provides a foundation that you can build a solid edifice on top of. If you skimp on the foundation, you may get started more quickly, but you will encounter problems later. If your project is small, and if it’s a temporary structure (a shack or bivouac), then architecture is not so important, and you can get away with skimping on foundations (and you probably should!), but the larger the project is (an office building), and the longer you want it to persist (a cathedral), the more valuable a good architecture will be. In the case of digital scholarly works, the common situation in academia is that weakly-architected works are being cranked out and published, but being hard to maintain, they tend to crumble away within a few years.

Crucially, a Linked Data dataset can capture the essence of what needs to be visualised, without being inextricably bound up with any particular genre of visualisation, or any particular visualisation software tool. This relative independence from specific tools is important because a dataset which is tied to a particular software platform needs to rely on the continued existence of that software, and experience shows that individual software packages come and go depressingly quickly. Often only a few years are enough for a software program to be “orphaned”, unavailable, obsolete, incompatible with the current software environment (e.g. requires Windows 95 or IE6), or even, in the case of software available online as a service, for it to completely disappear into thin air, if the service provider goes bust or shuts down the service for reasons of their own. In these cases you can suddenly realise you’ve been building your “scholarly output” on sand.

By contrast, a Linked Data dataset is standardised, and it’s readable with a variety of tools that support that standard. That provides you with a lot of options for how you could go on to visualise the data; that generic foundation gives you the possibility of building (and rebuilding) all kinds of different things on top of it.

Because of its generic nature and its openness to the Web, Linked Data technology has become a broad software ecosystem which already has a lot of people’s data riding on it; that kind of mass investment (a “bandwagon”, if you like) is insurance against it being wiped out by the whims or vicissitudes of individual businesses. That’s the major reason why a Linked Data dataset can be archived and stored long term with confidence.

Linked Open Data is about sharing your data for reuse

Finally, by publishing your dataset as Linked Open Data (independently of any visualisations you may have made of it), you are opening it up to reuse not only by yourself, but by others.

The graph model allows you to describe the meaning of the terms you’ve used (i.e. the analytical categories used in your data can themselves be described and categorised, because everything is a node in a graph). This means that other people can work out what your dataset actually means.

The use of URIs for identifiers means that others can easily cite your work and effectively contribute to your work by creating their own annotations on it. They don’t need to impinge on your work; their annotations can live somewhere else altogether and merely refer to nodes in your graph by those nodes’ identifiers (URIs). They can comment; they can add cross-references; they can assert equivalences to nodes in other graphs, elsewhere. Your scholarly work can break out of its box, to become part of an open web of knowledge that grows and ramifies and enriches us all.

Equinox Software: Statistical Popularity Badges

planet code4lib - Tue, 2016-08-30 01:55

Statistical Popularity Badges allow libraries to set popularity parameters that define popularity badges, which bibliographic records can earn if they meet the set criteria.  Popularity badges can be based on factors such as circulation and hold activity, bibliographic record age, or material type.  The popularity badges that a record earns are used to adjust catalog search results to display more popular titles (as defined by the badges) first.  Within the OPAC there is a new sort option called “Sort by Popularity” which will allow users to sort records based on the popularity assigned by the popularity badges.

Popularity Rating and Calculation

Popularity badge parameters define the criteria a bibliographic record must meet to earn the badge, as well as which bibliographic records are eligible to earn the badge.  For example, the popularity parameter “Circulations Over Time” can be configured to create a badge that is applied to bibliographic records for DVDs.  The badge can be configured to look at circulations within the last 2 years, but assign more weight or popularity to circulations from the last 6 months.

Multiple popularity badges may be applied to a bibliographic record.  For each applicable popularity badge, the record will be rated on a scale of 1-5, where a 5 indicates the most popularity.  Evergreen will then assign an overall popularity rating to each bibliographic record by averaging all of the popularity badge points earned by the record.  The popularity rating is stored with the record and will be used to rank the record within search results when the popularity badge is within the scope of the search.  The popularity badges are recalculated on a regular and configurable basis by a cron job.  Popularity badges can also be recalculated by an administrator directly on the server.
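As a sketch of the averaging step described above (illustrative only; Evergreen's server-side calculation also applies badge weights and runs via a cron job):

```python
def overall_popularity(badge_ratings):
    """Average the per-badge ratings (each on a 1-5 scale) into one
    overall popularity rating for a bibliographic record."""
    if not badge_ratings:
        return 0.0  # a record with no badges has no popularity boost
    return sum(badge_ratings) / len(badge_ratings)

# A record that earned three badges, rated 5, 4, and 3:
print(overall_popularity([5, 4, 3]))  # 4.0
```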

Creating Popularity Badges

There are two main types of popularity badges:  point-in-time popularity (PIT), which looks at the popularity of a record at a specific point in time—such as the number of current circulations or the number of open hold requests; and temporal popularity (TP), which looks at the popularity of a record over a period of time—such as the number of circulations in the past year or the number of hold requests placed in the last six months.

The following popularity badge parameters are available for configuration:

  • Holds Filled Over Time (TP)
  • Holds Requested Over Time (TP)
  • Current Hold Count (PIT)
  • Circulations Over Time (TP)
  • Current Circulation Count (PIT)
  • Out/Total Ratio (PIT)
  • Holds/Total Ratio (PIT)
  • Holds/Holdable Ratio (PIT)
  • Percent of Time Circulating (Takes into account all circulations, not specific period of time)
  • Bibliographic Record Age (days, newer is better) (TP)
  • Publication Age (days, newer is better) (TP)
  • On-line Bib has attributes (PIT)
  • Bib has attributes and copies (PIT)
  • Bib has attributes and copies or URIs (PIT)
  • Bib has attributes (PIT)

To create a new Statistical Popularity Badge:

  1. Go to Administration>Local Administration>Statistical Popularity Badges.
  2. Click on Actions>Add badge.
  3. Fill out the following fields as needed to create the badge:

(Note: only Name, Scope, Weight, Recalculation Interval, Importance Interval, and Discard Value Count are required)

  • Name: Library assigned name for badge.  Each name must be unique.  The name will show up in the OPAC record display.  For example: Most Requested Holds for Books-Last 6 Months.  Required field.
  • Description: Further information to provide context to staff about the badge.
  • Scope: Defines the owning organization unit of the badge.  Badges will be applied to search result sorting when the Scope is equal to, or an ancestor of, the search location.  For example, a branch-specific search will include badges where the Scope is the branch, the system, or the consortium.  A consortium-level search will include only badges where the Scope is set to the consortium.  Item-specific badges will apply only to records that have items owned at or below the Scope.  Required field.
  • Weight:  Can be used to indicate that a particular badge is more important than the other badges that the record might earn.  The weight value serves as a multiplier of the badge rating.  Required field with a default value of 1.
  • Age Horizon:  Indicates the time frame during which events should be included for calculating the badge.  For example, a popularity badge for Most Circulated Items in the Past Two Years would have an Age Horizon of ‘2 years’.   The Age Horizon should be entered as a number followed by ‘day(s)’, ‘month(s)’, ‘year(s)’, such as ‘6 months’ or ‘2 years’.  Use with temporal popularity (TP) badges only.
  • Importance Horizon: Used in conjunction with Age Horizon, this allows more recent events to be considered more important than older events.  A value of zero means that all events included by the Age Horizon will be considered of equal importance.  With an Age Horizon of 2 years, an Importance Horizon of ‘6 months’ means that events, such as checkouts, that occurred within the past 6 months will be considered more important than the circulations that occurred earlier within the Age Horizon.
  • Importance Interval:  Can be used to further divide up the timeframe defined by the Importance Horizon.  For example, if the Importance Interval is ‘1 month’, Evergreen will combine all of the events within that month for adjustment by the Importance Scale (see below).  The Importance Interval should be entered as a number followed by ‘day(s)’, ‘week(s)’, ‘month(s)’, or ‘year(s)’, such as ‘6 months’ or ‘2 years’.  Required field.
  • Importance Scale: The Importance Scale can be used to assign additional importance to events that occurred within the most recent Importance Interval.  For example, if the Importance Horizon is ‘6 months’ and the Importance Interval is ‘1 month’, the Importance Scale can be set to ‘6’ to indicate that events that happened within the last month will count 6 times, events that happened 2 months ago will count 5 times, etc. The Importance Scale should be entered as a number, such as ‘6’.
  • Percentile:  Can be used to assign a badge to only the records that score above a certain percentile.  For example, to assign the badge only to records in the top 5% of results, set the field to ‘95’.  To optimize the popularity badges, the percentile should be set between 95 and 99 to assign a badge to the top 5%-1% of records.
  • Attribute Filter:  Can be used to assign a badge to records that contain a specific Record Attribute.  Currently this field can be configured by running a report (see note below) to obtain the JSON data that identifies the Record Attribute.  The JSON data from the report output can be copied and pasted into this field.   A new interface for creating Composite Record Attributes will be implemented with future development of the web client.
    • To run a report to obtain JSON data for the Attribute Filter, use SVF Record Attribute Coded Value Map as the template Source.  For Displayed Fields, add Code, ID, and/or Description from the Source; also display the Definition field from the Composite Definition linked table.  This field will display the JSON data in the report output.  Filter on the Definition from the Composite Definition linked table and set the Operator to ‘Is not NULL’.
  • Circ Mod Filter: Apply the badge only to items with a specific circulation modifier.  Applies only to item related badges as opposed to “bib record age” badges, for example.
  • Bib Source Filter:  Apply the badge only to bibliographic records with a specific source.
  • Location Group Filter:  Apply the badge only to items that are part of the specified Copy Location Group.  Applies only to item related badges.
  • Recalculation Interval: Indicates how often the popularity value of the badge should be recalculated for bibliographic records that have earned the badge.  Recalculation is controlled by a cron job.  Required field with a default value of 1 month.
  • Fixed Rating: Can be used to set a fixed popularity value for all records that earn the badge.  For example, the Fixed Rating can be set to 5 to indicate that records earning the badge should always be considered extremely popular.
  • Discard Value Count:  Can be used to prevent certain records from earning the badge to make Percentile more accurate by discarding titles that are below the value indicated.   For example, if the badge looks at the circulation count over the past 6 months, Discard Value Count can be used to eliminate records that had too few circulations to be considered “popular”.  If you want to discard records that only had 1-3 circulations over the past 6 months, the Discard Value Count can be set to ‘3’.  Required field with a default value of 0.
  • Last Refresh Time: Displays the last time the badge was recalculated based on the Recalculation Interval.
  • Popularity Parameter: Types of TP and PIT factors described above that can be used to create badges to assign popularity to bibliographic records.
  4. Click OK to save the badge.
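One plausible reading of how the Age Horizon, Importance Horizon, Importance Interval, and Importance Scale settings interact can be sketched as follows. This is an assumed model for illustration, not Evergreen's actual code: it uses an Importance Horizon of 6 months, an Importance Interval of 1 month, and an Importance Scale of 6, matching the worked example above.

```python
def weighted_event_count(monthly_counts, importance_scale=6):
    """Weight recent events more heavily (assumed model, not
    Evergreen's exact implementation): the newest month's events
    count `importance_scale` times, the next month's count one less,
    and so on, down to 1x for events older than the Importance
    Horizon but still inside the Age Horizon.

    monthly_counts[0] is the most recent month."""
    total = 0
    for months_ago, count in enumerate(monthly_counts):
        weight = max(importance_scale - months_ago, 1)
        total += count * weight
    return total

# Circulations per month, newest first; the 7th month falls outside
# the 6-month Importance Horizon, so it counts at 1x.
print(weighted_event_count([10, 8, 6, 4, 2, 1, 5]))
# 10*6 + 8*5 + 6*4 + 4*3 + 2*2 + 1*1 + 5*1 = 146
```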

New Global Flags

OPAC Default Sort:  can be used to set a default sort option for the catalog.  Users can always override the default by manually selecting a different sort option while searching.

Maximum Popularity Importance Multiplier:  used with the Popularity Adjusted Relevance sort option in the OPAC.  Provides a scaled adjustment to relevance score based on the popularity rating earned by bibliographic records.  See below for more information on how this flag is used.

Sorting by Popularity in the OPAC

Within the stock OPAC template there is a new option for sorting search results called “Most Popular”.  Selecting “Most Popular” will first sort the search results based on the popularity rating determined by the popularity badges and will then apply the default “Sort by Relevance”.  This option will maximize the popularity badges and ensure that the most popular titles appear higher up in the search results.

There is a second new sort option called “Popularity Adjusted Relevance” that can be turned on by editing the ctx.popularity_sort setting in the OPAC template configuration.  The “Popularity Adjusted Relevance” sort option can be used to find a balance between popularity and relevance in search results.  For example, it can help ensure that records that are popular, but not necessarily relevant to the search, do not supersede records that are both popular and relevant in the search results.  It does this by sorting search results using an adjusted version of Relevance sorting.  When sorting by relevance, each bibliographic record is assigned a baseline relevance score between 0 and 1, with 0 being not relevant to the search query and 1 being a perfect match.  With “Popularity Adjusted Relevance” the baseline relevance is adjusted by a scaled version of the popularity rating assigned to the bibliographic record.  The scaled adjustment is controlled by a Global Flag called “Maximum Popularity Importance Multiplier” (MPIM).  The MPIM takes the average popularity rating of a bibliographic record (1-5) and creates a scaled adjustment that is applied to the baseline relevance for the record.  The adjustment can be between 1.0 and the value set for the MPIM.  For example, if the MPIM is set to 1.2, a record with an average popularity badge score of 5 (maximum popularity) would have its relevance multiplied by 1.2—in effect giving it the maximum increase of 20% in relevance.  If a record has an average popularity badge score of 2.5, the baseline relevance of the record would be multiplied by 1.1 (due to the popularity score scaling the adjustment to half way between 1.0 and the MPIM of 1.2) and the record would receive a 10% increase in relevance.  A record with a popularity badge score of 0 would be multiplied by 1.0 (due to the popularity score being 0) and would not receive a boost in relevance.
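The arithmetic described above can be sketched as a short function. The function name and structure are illustrative rather than Evergreen's actual code, but the numbers follow the worked examples: a rating of 5 applies the full MPIM, a rating of 0 applies a multiplier of 1.0.

```python
def popularity_adjusted_relevance(relevance, popularity, mpim=1.2):
    """Scale a baseline relevance score (0-1) by a record's average
    popularity rating (0-5).  The multiplier is linearly interpolated
    between 1.0 (popularity 0) and the MPIM (popularity 5)."""
    multiplier = 1.0 + (popularity / 5.0) * (mpim - 1.0)
    return relevance * multiplier

# With MPIM = 1.2 and a baseline relevance of 0.8:
print(round(popularity_adjusted_relevance(0.8, 5), 4))    # 0.96 (20% boost)
print(round(popularity_adjusted_relevance(0.8, 2.5), 4))  # 0.88 (10% boost)
print(round(popularity_adjusted_relevance(0.8, 0), 4))    # 0.8  (no boost)
```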

Popularity Badge Example

A popularity badge called “Long Term Holds Requested” has been created which has the following parameters:

Popularity Parameter:  Holds Requested Over Time

Scope: CONS

Weight: 1 (default)

Age Horizon: 5 years

Percentile: 99

Recalculation Interval: 1 month (default)

Discard Value Count: 0 (default)

This popularity badge will rate bibliographic records based on the number of holds that have been placed on them over the past 5 years and will only apply the badge to the top 1% of records (99th percentile).

If a keyword search for harry potter is conducted and the sort option “Most Popular” is selected, Evergreen will apply the popularity rankings earned from badges to the search results.

Title search: harry potter. Sort by: Most Popular.

The popularity badge also appears in the bibliographic record display in the catalog. The name of the badge earned by the record and the popularity rating are displayed in the Record Details.

A popularity badge of 5.0/5.0 has been applied to the most popular bibliographic records where the search term “harry potter” is found in the title. In the image above, the popularity badge has identified records from the Harry Potter series by J.K. Rowling as the most popular titles matching the search and has listed them first in the search results.

Equinox Software: Copy Alerts

planet code4lib - Tue, 2016-08-30 01:38

The Copy Alerts feature allows library staff to add customized alert messages to copies. The copy alerts will appear when a specific event takes place, such as when the copy is checked in, checked out, or renewed. Alerts can be temporary or persistent: temporary alerts will be disabled after the initial alert and acknowledgement from staff, while persistent alerts will display each time the alert event takes place. Copy Alerts can be configured to display at the circulating or owning library only or, alternatively, when the library at which the alert event takes place is not the circulating or owning library. Copy Alerts at check in can also be configured to provide options for the next copy status that should be applied to an item. Library administrators will have the ability to create and customize Copy Alert Types and to suppress copy alerts at specific org units.

Adding a Copy Alert

Copy Alerts can be added to new copies or existing copies using the Volume/Copy Editor. They can also be added directly to items through the Check In, Check Out, Renew, and Item Status screens.

To add a Copy Alert in the Volume/Copy Editor:

1. Within the Volume/Copy Editor, scroll to the bottom of the screen and click on Copy Alerts.

2. A New Copy Alert window will pop up.

3. Select an alert Type and enter an additional alert message if needed. Check the box next to Temporary if this alert should not appear after the initial alert is acknowledged. Leaving the Temporary box unchecked will create a persistent alert that will appear each time the action to trigger the alert occurs, such as check in or check out.

4. Click OK to save the new Copy Alert. After a Copy Alert has been added, clicking on the Copy Alerts button in the Volume/Copy Editor will allow you to add another Copy Alert and to view and edit existing Copy Alerts.

5. Make any additional changes to the item record and click Store Selected to store these changes and the new copy alert(s) to the Completed Copies tab. If you are done modifying the copy, click Save & Exit to finalize the changes.

To add a Copy Alert from the Check In, Check Out, or Renewal screens:

1. Navigate to the appropriate screen, for example to Circulation>Check In.
2. Scan in the item barcode.
3. Select the item row and go to Actions>Add Copy Alerts or right click on the item row and select Add Copy Alerts.

4. The Add Copy Alert window will pop up. Select the alert Type, add an additional alert message if needed, and click OK to save. This alert will be added to the copy.

To add a Copy Alert from the Item Status screen:

1. Go to the Detail View of the Item Status screen.
2. In the bottom left-hand corner of the item record there is a Copy Alerts option. Click Add to create a new copy alert.

3. The Add Copy Alert window will pop up. Select the alert Type, add an additional alert message if needed, and click OK to save. This alert will be added to the copy.

Triggering a Copy Alert

The Copy Alert will appear when the action required to trigger the alert occurs. For example, the Normal Checkin Alert will appear when the item is checked in:

If Next Status options have been configured for the Checkin Alert, staff will see a drop-down menu that allows them to select the next Status for the copy:

Managing Copy Alerts

Copy Alerts can be managed from the Item Status screen. Within the Quick Summary tab of the Detailed View of an item, click on Manage to view and Remove copy alerts.

Administration of Copy Alerts

Copy Alert Types

Copy Alert Types are created and managed in Administration>Local Administration>Copy Alert Types. Copy Alert Types define the action and behavior of an alert message type. The Alert Types included in a stock installation of Evergreen are:

• Normal checkout
• Normal checkin
• Checkin of missing copy
• Checkin of lost-and-paid copy
• Checkin of damaged copy
• Checkin of claims-returned copy
• Checkin of long overdue copy
• Checkin of claims-never-checked-out copy
• Checkin of lost copy

To create a new Copy Alert Type:

1. Go to Administration>Local Administration>Copy Alert Types.
2. Click on Create and fill out the following fields as needed:
Name: name of the Copy Alert Type.
Active: indicates if the alert type is currently in use (Yes) or not (No).
State: indicates the Copy Status of the item at the time of the event.
Event: the action that takes place in the ILS to trigger the alert.
Scope Org Unit: indicates which org unit(s) the alert type will apply to.
Next Status: can be used with temporary Checkin Alerts only. If a next status is configured, staff will be presented with a list of statuses to choose from when the item is checked in. Next statuses should be configured by using the Copy Status ID # surrounded by curly brackets. For example {7, 11}.
Renewing?: indicates if the alert should appear during a renewal.
Invert location?: if set to yes, this setting will invert the following two settings. For example, if an alert is set to appear at the Circulating Library only, inverting the location will cause the alert to appear at all libraries except the Circulating Library.
At Circulation Library?: indicates if the alert should appear at the circulation library only.
At Owning Library?: indicates if the alert should appear at the owning library only.
3. Click Save.

To edit an existing Copy Alert Type:

1. Go to Administration>Local Administration>Copy Alert Types.
2. Click on the type and go to Actions>Edit or right-click and select Edit.
3. Make changes to the existing configuration and click Save.

Copy Alert Suppression

The Copy Alert Suppression interface can be used to suppress alert types at a specific org unit. Suppression of alerts will adhere to the organization unit hierarchy. For example, if an alert is suppressed at the System level, it will be suppressed for all descendent branches.

To suppress an alert type:

1. Go to Administration>Local Administration>Copy Alert Suppression.
2. Click Create and select the Alert Type that you want to suppress from the drop down menu.
3. Next, select the Org Unit at which the alert should be suppressed.
4. Click Save.

DuraSpace News: NEW RELEASE: Message-based Integrations for Fedora

planet code4lib - Tue, 2016-08-30 00:00

From Aaron Coburn, Programmer and Systems Administrator, Amherst College

Amherst, MA  I would like to announce the immediate availability of version 4.6.0 of the Fedora Messaging Toolbox.

The messaging toolbox is designed to support a variety of asynchronous integrations with external tools and services, such as a Solr search engine or an external Triplestore. Version 4.6.0 of the messaging toolbox is compatible with both the forthcoming 4.6.0 release of the Fedora Commons server and previous releases of Fedora.

DuraSpace News: Learn More About Scholars@Duke

planet code4lib - Tue, 2016-08-30 00:00

From Julia Trimmer, Manager, Faculty Data Systems & Analysis, Office of the Provost, Duke University

Durham, NC  Will you be attending the Symplectic User Conference at Duke University on September 13 and 14?  If you would like to get together around that event to learn more about VIVO at Duke University, members of the Scholars@Duke team are available to meet before or after the event.

DuraSpace News: NEW Fedora Repository Web Site

planet code4lib - Tue, 2016-08-30 00:00

Austin, TX  DuraSpace is pleased to announce that the Fedora team recently completed a redesign of the Fedora website. The site was designed in consultation with members of the Fedora Leadership Group and reflects a modern, mobile-friendly approach that makes it easy to find key items first.

Eric Lease Morgan: Blueprint for a system surrounding Catholic social thought & human rights

planet code4lib - Mon, 2016-08-29 20:32

This posting elaborates upon one possible blueprint for comparing & contrasting various positions in the realm of Catholic social thought and human rights.

We here in the Center For Digital Scholarship have been presented with a corpus of documents which can be broadly described as position papers on Catholic social thought and human rights. Some of these documents come from the Vatican, and some of these documents come from various governmental agencies. There is a desire by researchers & scholars to compare & contrast these documents on the paragraph level. The blueprint presented below illustrates one way — a system/flowchart — this desire may be addressed:

The following list enumerates the flow of the system:

  1. Corpus creation – The system begins on the right with sets of documents from the Vatican as well as the various governmental agencies. The system also begins with a hierarchical “controlled vocabulary” outlined by researchers & scholars in the field and designed to denote the “aboutness” of individual paragraphs in the corpus.
  2. Manual classification – Reading from left to right, the blueprint next illustrates how subsets of document paragraphs will be manually assigned one or more controlled vocabulary terms. This work will be done by people familiar with the subject area as well as the documents themselves. Success in this regard is directly proportional to the volume & accuracy of the classified documents. At the very least, a few hundred paragraphs need to be consistently classified for each of the controlled vocabulary terms in order for the next step to be successful.
  3. Computer “training” – Because the number of paragraphs from the corpus is too large for manual classification, a process known as “machine learning” will be employed to “train” a computer program to do the work automatically. If it is assumed the paragraphs from Step #2 have been classified consistently, then it can also be assumed that each set of similarly classified documents will have identifiable characteristics. For example, documents classified with the term “business” may often include the word “money”. Documents classified as “government” may often include “law”, and documents classified as “family” may often include the words “mother”, “father”, or “children”. By counting & tabulating the existence & frequency of individual words (or phrases) in each of the sets of manually classified documents, it is possible to create computer “models” representing each set. The models will statistically describe the probabilities of the existence & frequency of words in a given classification. Thus, the output of this step will be two sets of models, one for the Vatican documents and another for the governmental documents.
  4. Automated classification – Using the full text of the given corpus as well as the output of Step #3, a computer program will then be used to assign one or more controlled vocabulary terms to each paragraph in the corpus. In other words, the corpus will be divided into individual paragraphs, each paragraph will be compared to a model and assigned one or more classification terms, and the paragraph/term combinations will be passed on to a database for storage and ultimately an indexer to support search.
  5. Indexing – A database will store each paragraph from the corpus alongside metadata describing the paragraph. This metadata will include titles, authors, dates, publishers, as well as the controlled vocabulary terms. An indexer (a sort of database specifically designed for the purposes of search) will make the content of the database searchable, but the index will also be supplemented with a thesaurus. Because human language is ambiguous, words often have many and subtle differences in meaning. For example, when talking about “dogs”, a person may also be alluding to “hounds”, “canines”, or even “beagles”. Given the set of controlled vocabulary terms, a thesaurus will be created so when researchers & scholars search for “children” the indexer may also return documents containing the phrase “sons & daughters of parents”, or, as another example, when a search is done for “war”, documents (paragraphs) also containing the words “battle” or “insurgent” may be found.
  6. Searching & browsing – Finally, a Web-based interface will be created enabling readers to find items of interest, compare & contrast these items, identify patterns & anomalies between these items, and ultimately make judgments of understanding. For example, the reader will be presented with a graphical representation of controlled vocabulary. By selecting terms from the vocabulary, the index will be queried, and the reader will be presented with sortable and groupable lists of paragraphs classified with the given term. (This process is called “browsing”.) Alternatively, researchers & scholars will be able to enter simple (or complex) queries into an online form, the queries will be applied to the indexer, and again, paragraphs matching the queries will be returned. (This process is called “searching”.) Either way, the researchers & scholars will be empowered to explore the corpus in many and varied ways, and none of these ways will be limited to any individuals’ specific topic of interest.
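Steps #2 through #4 above describe a standard text-classification workflow. As a rough illustration only (the corpus, labels, and tokenization below are toy assumptions, not the Center's actual system), a naive Bayes classifier of the kind described can be sketched in plain Python:

```python
import math
from collections import Counter, defaultdict

def train(labeled_paragraphs):
    """Step 3: build per-class word-frequency models from manually classified text."""
    word_counts = defaultdict(Counter)  # class -> word frequencies
    class_counts = Counter()            # class -> number of training paragraphs
    vocab = set()
    for text, label in labeled_paragraphs:
        words = text.lower().split()
        word_counts[label].update(words)
        class_counts[label] += 1
        vocab.update(words)
    return word_counts, class_counts, vocab

def classify(text, model):
    """Step 4: assign the most probable class to a new paragraph."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label in class_counts:
        # log prior plus log likelihood, with add-one (Laplace) smoothing
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            score += math.log((word_counts[label][word] + 1)
                              / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy training data standing in for the manually classified paragraphs
corpus = [
    ("mother father children family home", "family"),
    ("law government policy state", "government"),
    ("children and parents in the family", "family"),
    ("the law and the courts of government", "government"),
]
model = train(corpus)
print(classify("the mother and father of the children", model))  # -> family
```

A production system would use a vetted library and far more training data, but the principle is the same: counting and tabulating word frequencies per class, then scoring new paragraphs against those models.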

The text above outlines only one possible “blueprint” for comparing & contrasting a corpus of Catholic social thought and human rights. Moreover, there are at least two other ways of addressing the issue. For example, it is entirely possible to “simply” read each & every document. After all, that is the way things have been done for millennia. Another possible solution is to apply natural language processing techniques to the corpus as a whole. For example, one could automatically count & tabulate the most frequently used words & phrases to identify themes. One could compare the rise & fall of these themes over time, geographic location, author, or publisher. The same thing can be done in a more refined way using parts-of-speech analysis. Along these same lines there are well-understood relevancy ranking algorithms (such as term frequency / inverse document frequency) allowing a computer to output the more statistically significant themes. Finally, documents could be compared & contrasted automatically through a sort of geometric analysis in an abstract and multi-dimensional “space”. These additional techniques are considerations for a phase two of the project, if it ever comes to pass.
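The term frequency / inverse document frequency weighting mentioned above can also be sketched briefly; this is a toy illustration of the scheme, not a proposal for the actual system:

```python
import math
from collections import Counter

def tf_idf(docs):
    """Weight each word in each document by term frequency times inverse document frequency."""
    doc_freq = Counter()  # number of documents containing each word
    for doc in docs:
        doc_freq.update(set(doc.lower().split()))
    scores = []
    for doc in docs:
        counts = Counter(doc.lower().split())
        n = sum(counts.values())
        scores.append({
            word: (count / n) * math.log(len(docs) / doc_freq[word])
            for word, count in counts.items()
        })
    return scores
```

Words common to every document (like “the”) score near zero, while words frequent in one document but rare elsewhere rise to the top as statistically significant themes.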

Equinox Software: Evergreen 2013: Linus’s Law

planet code4lib - Mon, 2016-08-29 17:21

By 2013 Evergreen was, to coin a phrase, “nominally complete.”  It had gained the features needed to check off most of the right RFP boxes, and so be considered alongside other ILSs with a significantly older code base.  Acquisitions and serials, along with circulation, cataloging, authority control, and the (underrated, in my opinion) booking functionality were all in place.  By this point it had a modern, pluggable OPAC infrastructure, integration with many 3rd party products to expand its functionality, and was attracting attention via non-traditional use cases such as publishing house backend systems.  So, we developers were done, right?

Not at all.

In years past, the development team working on Evergreen had been small, and grew slowly.  In important ways, though, that began to change around 2013.  Previously, having more than twelve distinct contributors in a month submitting code for inclusion in the master repository was quite rare, and usually happened right around the time when a new release was being polished.  But from late 2012 through all of 2013, 15-25 contributors became the rule and fewer than that was the exception.  That is a solid 20-30% increase, and is significant for any project.

At the software level this was a period of filing down rough edges and broadening the talent pool.  There were few truly massive technological advances but there were many, and varied, minor improvements made by a growing group of individuals taking time to dive deeper into a large and complex codebase.  Importantly, this included ongoing contributions from a Koha developer on a now-shared bit of infrastructure, the code we both use to parse searches against our respective catalogs.

In short, 2013 is the year that we began to truly realize one of the promises of Open Source, something that is attributed to Linus Torvalds of Linux fame: that given enough eyeballs, all bugs are shallow.  What this means is that as your project adds users, testers, and developers, it becomes increasingly likely that bugs will be discovered early, classified quickly, and that the solution will be obvious to someone.

In some ways this can be a critical test for an Open Source project.  Many projects do not survive contact with an influx of new development talent.  For some projects, that is political.  For others, it is a consequence of early design decisions.  Fortunately, Evergreen passed that test, and that is in large part a credit to its community.  After seven years and significant scrutiny, Evergreen continued to improve and its community continued to grow.

— Mike Rylander, President

This is the eighth in our series of posts leading up to Evergreen’s Tenth birthday.

LITA: Transmission #8 – Return to Regularly Scheduled Programming

planet code4lib - Mon, 2016-08-29 15:00

Thank you to everyone who participated in my feedback survey! I have parsed the results (a little less than 100 responses) and I’m currently thinking through format changes.

I’ll give a full update on the changes to come and more after we conclude our initial ten interviews in October. Stay tuned, faithful viewers.

In today’s webisode, I am joined by one of my personal all-time favorite librarians and colleagues, Michael Rodriguez. Michael is Electronic Resources Librarian at the University of Connecticut. Enjoy his perspectives on one of my favorite topics: librarianship at the intersection of collections, technology, and discovery.

Begin Transmission will return September 12th.

Jonathan Rochkind: bittorrent for sharing enormous research datasets

planet code4lib - Mon, 2016-08-29 14:24

The project's site says:

We’ve designed a distributed system for sharing enormous datasets – for researchers, by researchers. The result is a scalable, secure, and fault-tolerant repository for data, with blazing fast download speeds.

There are data sets listed from researchers at several respected universities, including the University of Michigan and Stanford.

Filed under: General

Jonathan Rochkind: technical debt/technical weight

planet code4lib - Mon, 2016-08-29 14:22

Bart Wronski writes a blog post about “technical weight”, a concept related to but distinct from “technical debt.”  I can associate some of what he’s talking about with some library-centered open source projects I’ve worked on.

Technical debt… or technical weight?

…What most post don’t cover is that recently huge amount of technical debt in many codebases comes from shifting to naïve implementations of agile methodologies like Scrum, working sprint to sprint. It’s very hard to do any proper architectural work in such environment and short time and POs usually don’t care about it (it’s not a feature visible to customer / upper management)…


…I think of it as a property of every single technical decision you make – from huge architectural decisions through models of medium-sized systems to finally way you write every single line of code. Technical weight is a property that makes your code, systems, decisions in general more “complex”, difficult to debug, difficult to understand, difficult to change, difficult to change active developer.…


…To put it all together – if we invested lots of thought, work and effort into something and want to believe it’s good, we will ignore all problems, pretend they don’t exist and decline to admit (often blaming others and random circumstances) and will tend to see benefits. The more investment you have and heavier is the solution – the more you will try to stay with it, making other decisions or changes very difficult even if it would be the best option for your project.…

Filed under: General

Islandora: iCampMO Instructors Announced

planet code4lib - Mon, 2016-08-29 13:34

Islandora Camp is heading down to Kansas City, courtesy of our hosts at the University of Missouri Kansas City. Camp will consist of three days: one day of sessions taking a big-picture view of the project and where it's headed (including big updates about Islandora CLAW), one day of hands-on workshops for developers and front-end administrators, and one day of community presentations and deeper dives into Islandora tools and sites. The instructors for that second day have been selected and we are pleased to introduce them:


Rosie Le Faive started with Islandora in 2012 while creating a trilingual digital library for the Commission for Environmental Cooperation. With experience and - dare she say - wisdom gained from creating highly customized sites, she's now interested in improving the core Islandora code so that everyone can use it. Her interests are in mapping relationships between objects, and intuitive UI design. She is the Digital Infrastructure and Discovery librarian at UPEI, and develops for Agile Humanities.  This is her second Islandora Camp as an instructor.

Jared Whiklo began working with Islandora in 2012. After stumbling and learning for a year, he began to give back to the community in late 2013. He has since assisted in both Islandora and Fedora releases and (to his own disbelief) has become an Islandora 7.x-1.x, Islandora CLAW, and Fedora committer. His day job is Developer with Digital Initiatives at the University of Manitoba Libraries. His night job is at the Kwik-E-Mart.


Melissa Anez has been working with Islandora since 2012 and has been the Community and Project Manager of the Islandora Foundation since it was founded in 2013. She has been a frequent instructor in the Admin Track and developed much of the curriculum, refining it with each new Camp.

Sandy Rodriguez is the Digital Special Collections Coordinator at the University of Missouri—Kansas City.  She has been working with Islandora for almost three years and currently serves as a member of the Islandora Metadata Interest Group and the Metadata Tools Subgroup.

LibUX: How to Write a User Experience Audit

planet code4lib - Mon, 2016-08-29 11:00

A User Experience Audit, or UX Audit for short, is something that should be conducted in the very beginning steps of a website, web application, dedicated app, or similar redesign project. Sometimes referred to as a deck or part of a design brief, UX Audits are typically done before user interface (UI) design occurs, and primarily consists of data intake, data compiling, research, and data visualization through presentation.

A UX Audit is simultaneously in-depth design research and a cursory presentation of data. The UX Audit doesn’t jump to conclusions or propose final UI and UX mechanics; rather, it evaluates the project’s current state in order to:

  • compile qualitative data
  • conduct peer evaluation
  • discover interactive pain points
  • evaluate quantitative data
  • identify accessibility errors
  • survey information architecture
  • point out any branding violations
  • propose additional UX testing

Ultimately the UX Audit should serve as a compilation of the previously mentioned research, identify what data is missing or would need to be captured going forward, and function as a point of departure for the next steps – which commonly would be sketching, wireframing, interactive wireframing, or prototyping, depending on your development process.

The UX Audit is not a wireframe, it isn’t a design or user interface proposal, and it typically doesn’t highlight project requirements from stakeholders (although this is not uncommon). Additionally, a UX Audit’s summary does propose and provide recommendations based on the data compiled, but doesn’t do so in a way that graphically exceeds what is needed to facilitate understanding (so no high-fidelity solutions or graphics). As such, a UX Audit acts as a timestamp or bookmark in a project’s history, and serves as documentation for improvement. UX Auditing is also the preferred mode for project development, as opposed to simply giving a project a ‘face lift’ without concern or regard for a project’s history or incremental improvement.

Once completed, the UX Audit and its recommendations are given to a UI designer, front-end developer, web designer, or similar position, who would begin designing, iterating, or wireframing (preferably in a medium as close as possible to the final deliverable). The UI professional is then in a better position going forward to create wireframes (for example), aware of the target audience, previous errors, and which data is present and which is missing.

Possible Parts of a UX Audit

Depending on the project – be it a web application, website, or native app redesign – and on what data is available, each UX Audit is going to be different. Whenever possible, take in as much data as you can: this data is a UX professional’s bread and butter, and a decent amount of time should be spent collecting, collating, filtering, interpreting, and visualizing it for stakeholders, developers, and UI professionals.

Although most UX Audits are essentially made from the same data and parts, they can follow any format or order. The following sections are some suggestions as to what can be included in a UX Audit:


Introduction

This is sometimes called an Executive Summary, a Project Overview, or even an Introduction to the Problem. Whatever it's called, the Introduction serves the function of briefly and succinctly introducing the intent of the redesign, which people and departments are involved in the project, and the scope of the UX Audit. Also accompanying, or contained in, the Introduction are the Research Objectives and a Table of Contents.

Research Objectives

The Research Objectives section highlights and presents the hard deliverables of the UX Audit, and sets up the reader's expectations as to what research will be presented.

Competitor Analysis

The UX Audit’s competitor analysis section is usually derived from data about a parent organization or competitors. A parent organization could be a policy commission, an accrediting body, location peers, etc. Competitors can be identified by comparable sales, customers, consumers, goods, or services – all of which are examined for usage, with an eye toward increasing conversions.

The Competitor Analysis section can include a list of these peers, with hyperlinks to each similar project’s website, web application, or native app. It also contains a comparison of features, functionality, and interactive elements, expressed as percentages and usually presented through data visualization. This enables readers to see what percentage of peers have a responsive website, a homepage newsletter sign-up, sticky navigation, or other such features (which establishes baseline/industry UI standards).

Quantitative Data

Quantitative data refers to ‘the numbers’, or project traffic, usage, device/browser statistics, referrals, add-on blockers, social media sharing, and pathway mapping. All of this is quantitative data, is part of design research, and hopefully has already been set up on the project you are redesigning. Adobe Analytics and Google Analytics offer a lot of different solutions, but require a lot of customization, learning, or a significant financial investment. The following is a list of common industry software for this type of quantitative data:

Qualitative Data

Qualitative data usually refers to the customer experience (CX) side of data, and can contain customer behavior, demographics, feedback, and search terms. It is usually derived from surveying mechanisms like Qualtrics or SurveyMonkey; embedded feedback tools like Adobe Experience Manager; search engine optimization (SEO) information from titles, ad spend, and metadata tracking over time; and Google Trends and Google Insights.


Accessibility

The Accessibility portion of a UX Audit should contain both WCAG 2.0 AA and AAA errors, color contrast checking for fonts and UI mechanisms against their backgrounds, and even JavaScript errors that appear in the console log. Software that can help with this portion of the UX Audit includes the WebAIM Color Contrast Checker, the WAVE Web Accessibility Evaluation Tool, and any web browser's developer tools JavaScript console.
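The color contrast checking mentioned here follows a published formula; as a sketch, the WCAG 2.0 contrast ratio can be computed directly (the function names are mine, but the luminance constants come from the specification):

```python
def _channel(c: int) -> float:
    # Linearize one sRGB channel per the WCAG 2.0 relative-luminance definition
    c = c / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

def relative_luminance(rgb) -> float:
    r, g, b = (_channel(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(fg, bg) -> float:
    # WCAG 2.0 AA requires at least 4.5:1 for normal body text
    lighter, darker = sorted((relative_luminance(fg), relative_luminance(bg)),
                             reverse=True)
    return (lighter + 0.05) / (darker + 0.05)

print(round(contrast_ratio((0, 0, 0), (255, 255, 255)), 1))  # -> 21.0
```

Tools like the WebAIM checker perform this same calculation, so a quick script can batch-audit an entire palette instead of checking colors one pair at a time.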

Interactive Pain Points

Interactive Pain Points refers to egregious UI and UX errors, non-typical animations or interactions, and unexpected functionality. This can range from forms and buttons being too small to tap on a touch-based or mobile device, to dysfunctional carousel buttons, jerky hover navigations, and counter-intuitive forms wherein the labels cover up the input fields. This is usually best presented in the UX Audit through screenshots or videos, with annotations about what is occurring in contrast to users' expectations.

Brad Frost has an excellent Interface Inventory Checklist available on Google Docs; this is a great place to start to know what all to look for, and what interactions to examine for improvement. A checklist like the one he shared is very helpful, but the most important thing is to demonstrate things like inconsistent button sizing, or if interaction elements are confusing/not functioning.

Information Architecture

Information Architecture (IA) is the structural design of information, interaction, and UX with the goals of making things both findable and discoverable. This part of the UX Audit focuses on the findability and discoverability of navigation items, general content strategy deficiencies, reading levels, and label auditing.

For example, analyzing the IA of a project could potentially identify that label auditing for primary and secondary navigation items, quick links, buttons, and calls-to-action is necessary. Analyzing IA could also demonstrate that the project's content – as measured by its Flesch readability score, an equation based on sentence length and the number of syllables per word – isn't written at an 8th-grade level (or that content with specific instructions requires a 6th-grade reading level). For more information, the Nielsen Norman Group has a great article about legibility, readability, and comprehension, and anyone can use Microsoft Word to analyze content's Flesch readability scores.
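The Flesch equation itself is simple enough to sketch. The syllable counter below is a naive vowel-group heuristic (real tools such as Microsoft Word use more sophisticated counting), so treat the output as approximate:

```python
import re

def count_syllables(word: str) -> int:
    # Naive heuristic: each run of consecutive vowels counts as one syllable
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_reading_ease(text: str) -> float:
    # Flesch reading ease:
    #   206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (206.835
            - 1.015 * (len(words) / len(sentences))
            - 84.6 * (syllables / len(words)))
```

Higher scores mean easier reading; a score of roughly 60-70 corresponds to the 8th-9th grade level mentioned above.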

Branding Violations

This mainly depends on an organization or company’s established style guides and pattern library. If the project being redesigned is particularly old and in need of a UX Audit, there may be a lot of color, font family, interaction elements, and UX patterns that are out of sync. If a company or organization doesn’t have a set style guide or pattern library, maybe that’s the best place to start before a UX Audit. The following are some really great style guides and pattern libraries from companies, entities, and organizations you might know already:


Performance

If the project being redesigned is a website, a web application, or dynamically pulls information through JavaScript data binding, performance should be factored into a UX Audit. Michael Schofield has a great article on LibUX about users having different connection speeds – millennials all the way to broadband users – and leading figures in the field of UX speak about the importance of performance all of the time.

“Over the past few months, conversations about Responsive Web design have shifted from issues of layout to performance. That is, how can responsive sites load quickly -even on constrained mobile networks.” Luke Wroblewski
Product Director at Google

When conducting a UX Audit, the Chrome web browser’s DevTools has a ‘Network’ and ‘Timeline’ view that analyzes and displays the loading of assets – images, scripts, code libraries, external resources, etc. – in real time. This information can and should be included in a UX Audit to document project load times, emulate different network conditions, verify any load time issues, and ultimately point out potential pain points for users.


Data from Google Insights or even PageFair is desirable here. This is the place in the UX Audit where a UX professional really gets to shine: they have already demonstrated their data collection and presentation skills, and now they get to advise stakeholders, UI designers, and developers on what steps and UI decisions should be taken going forward.

How can UX Audits be used?

UX Audits can and should be incorporated as an unbiased and essential part of any redesign project. Many times a UX professional is also a librarian, a UI designer, or even a front-end developer – so with limited staff and short deadlines, it's easy to skip this important step of the redesign process.

However, performing a UX Audit will enable you to slow down, focus primarily on UX for a change, and perform an audit that will provide a lot of valuable information for stakeholders, designers, and developers. This will ultimately make everyone’s job easier, and what’s wrong with working smarter rather than harder?

LibUX: Crafting Websites with Design Triggers

planet code4lib - Mon, 2016-08-29 07:00

A design trigger is a pattern meant to appeal to behavior and cognitive biases observed in users. Big data and the user experience boom have provided a lot of information about how people actually use the web, which designs work, and – although creepy – how it is possible to cobble together an effective site designed to social-engineer users.

This episode is an introduction from a longer talk in which I introduce design triggers as a concept and their reason for being.

Help us out and say something nice. Your sharing and positive reviews are the best marketing we could ask for.

If you like, you can download the MP3 or subscribe to LibUX on Stitcher, iTunes, YouTube, Soundcloud, Google Play Music, or just plug our feed straight into your podcatcher of choice.



Terry Reese: MarcEdit Mac Update–Inclusion of Console Mode

planet code4lib - Mon, 2016-08-29 03:50

One of the gaps in the Mac version of MarcEdit has been the lack of a console mode.  This update should correct that.  However, a couple things about how this works…

1) Mac applications are bundles, so in order to run the console program you need to run against the application bundle.  What does this look like?   From the terminal, one would run
>>/Applications/ -console

The -console flag initializes the terminal application and prompts for file names.  You can pass the filenames (these must be fully qualified paths at this point) via command-line arguments rather than running in an interactive mode.  For example:
>>/Applications/ -s /users/me/Desktop/source.mrc -d /users/me/Desktop/output.mrk -break

The above would break a MARC file into the mnemonic format.  For a full list of console commands, enter:
>>/Applications/ -help

In the future, the MarcEdit installer will set an environment variable ($MARCEDIT_PATH) during installation.  For now, I recommend opening your .bash_profile and adding the following line:
export MARCEDIT_PATH=/Applications/

You can get this download from: 


Library Tech Talk (U of Michigan): HTTPS (Almost) Everywhere

planet code4lib - Mon, 2016-08-29 00:00

The University of Michigan Library pledges to update its major websites to use secure (HTTPS) connections between the servers and web browsers by December 2016.

FOSS4Lib Recent Releases: Metaproxy - 1.11.5

planet code4lib - Fri, 2016-08-26 20:55

Last updated August 26, 2016. Created by Peter Murray on August 26, 2016.

Package: Metaproxy
Release Date: Friday, August 26, 2016

District Dispatch: Google Policy Fellow: my OITP summer

planet code4lib - Fri, 2016-08-26 20:42

guest post by Nick Gross, OITP’s 2016 Google Policy Fellow

This summer I worked as a Google Policy Fellow at the American Library Association’s Office for Information Technology Policy (OITP) in Washington, D.C. The Google Policy fellowship gives undergraduate, graduate, and law students the opportunity to spend the summer working at public interest groups engaged in Internet and technology policy issues.

As a fellow, my primary role at OITP was to prepare tech policy memos to submit to the incoming presidential administration. The goal is to inform policymakers about ALA’s public policy concerns, including why, and to what extent, ALA has an interest in specific tech issues and what the next policies should look like. With balanced, future-looking information and tech policies, libraries can continue to enable Education, Employment, Entrepreneurship, Empowerment, and Engagement for their patrons: the E’s of Libraries. To that end, I drafted a brief on telecommunications issues and one on copyright issues.

The telecommunications brief addresses the importance of broadband Internet to libraries. In particular, a robust broadband infrastructure ensures that libraries can continue to provide their communities with equitable access to information and telecommunications services, as well as serve residents with digital services and content via “virtual branches.” Through the Federal Communications Commission’s Universal Service Fund (USF), which includes the E-Rate program, the Lifeline program, and the Connect America Fund, libraries and underserved or unserved communities are better able to enjoy access to affordable high-capacity broadband. And greater broadband competition and local choice increase broadband deployment, affordability, and adoption for libraries and their communities, while opening up more unlicensed spectrum for Wi-Fi expands broadband capacity so libraries can better serve their communities. Moreover, libraries sometimes provide the only Internet access points for some communities and they play an important role in digital inclusion efforts. Finally, because libraries use the Internet to research, educate, and create and disseminate content, as well as provide no-fee public access to it, they highly value the FCC’s 2015 Open Internet Order which helps guarantee intellectual freedom and free expression, thereby promoting innovation and the creation and exchange of ideas and content.

As copyright lies at the core of library operations, OITP advocates for law that fulfills the constitutional purpose of copyright—namely, a utilitarian system that grants “limited” copyright protection in order to “promote the progress of science and useful arts.” The copyright brief calls for a balanced copyright system in the digital age that realizes democratic values and serves the public interest. The first sale doctrine enables libraries to lend books and other materials. The fair use doctrine is critical to libraries’ missions, as it enables the “free flow of information,” fostering freedom of inquiry and expression; for instance, it enables libraries to use so-called “orphan works” without fear of infringement liability. Moreover, libraries are at the forefront of archiving and preservation, using copyright law’s exceptions to make reproductions and replacements of works that have little to no commercial market or that represent culturally valuable content in the public domain. Libraries also enjoy protections against liability under the Section 512 Safe Harbors in the Digital Millennium Copyright Act (DMCA).

My brief on copyright issues also highlights specific challenges that threaten libraries’ mission to provide the public with access to knowledge and upset the careful balance between copyright holders and users. For instance, e-licensing and digital rights management (DRM) under section 1201 of the DMCA, as well as the section 1201 rulemaking process, limit libraries’ ability to take full advantage of copyright exceptions, from fair use to first sale to preservation and archiving. ALA also advocates for the ratification and implementation of the World Intellectual Property Organization’s “Marrakesh Treaty” to facilitate access to published works for persons who are blind, visually impaired, or otherwise print disabled.

In addition to my policy work, Google’s bi-weekly meetings at its D.C. headquarters shed light on the public policy process. At each event, Google assembled a panel of experts composed of its own policy-oriented employees and other experts from public interest groups in D.C. Topics ranged from copyright law to broadband deployment and adoption to Net Neutrality. During the meetings, I also enjoyed the opportunity to meet the other Google fellows and learn about their work.

My experience as a Google Policy Fellow at OITP taught me a great deal about how public interest groups operate and advocate effectively. For instance, I learned how public interest groups collaborate and form partnerships to effect policy change. Indeed, ALA works, or has worked, with groups like the Center for Democracy & Technology to advocate for Net Neutrality, while advancing public access to information as a member of the Re:Create Coalition and the Library Copyright Alliance. As a founding member of the Schools, Health & Libraries Broadband Coalition and WifiForward, ALA promotes Internet policies, such as the modernization of the USF. Not only did I gain a deeper insight into telecommunications law and copyright law, I also developed an appreciation of how such laws can profoundly impact the public interest. I’d highly recommend the Google Policy Fellowship to any student interested in learning more about D.C. policymaking in the tech ecosystem.

The post Google Policy Fellow: my OITP summer appeared first on District Dispatch.

Jez Cope: Software Carpentry: SC Config; write once, compile anywhere

planet code4lib - Fri, 2016-08-26 18:47

Nine years ago, when I first released Python to the world, I distributed it with a Makefile for BSD Unix. The most frequent questions and suggestions I received in response to these early distributions were about building it on different Unix platforms. Someone pointed me to autoconf, which allowed me to create a configure script that figured out platform idiosyncrasies. Unfortunately, autoconf is painful to use – its grouping, quoting and commenting conventions don’t match those of the target language, which makes scripts hard to write and even harder to debug. I hope that this competition comes up with a better solution — it would make porting Python to new platforms a lot easier!
Guido van Rossum, Technical Director, Python Consortium (quote taken from SC Config page)

On to the next Software Carpentry competition category, then. One of the challenges of writing open source software is that you have to make it run on a wide range of systems over which you have no control. You don’t know what operating system any given user might be using or what libraries they have installed, or even what versions of those libraries.

This means that whatever build system you use, you can’t just send the Makefile (or whatever) to someone else and expect everything to go off without a hitch. For a very long time, it’s been common practice for source packages to include a configure script that, when executed, runs a bunch of tests to see what it has to work with and sets up the Makefile accordingly. Writing these scripts by hand is a nightmare, so tools like autoconf and automake evolved to make things a little easier.
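To make the idea concrete, here’s a toy sketch of what a hand-written configure script does under the hood — compile a tiny test program to probe the system, then record the result where the Makefile can see it. (This is an illustrative example, not autoconf output; the zlib check is just a stand-in for any feature test.)

```shell
#!/bin/sh
# Toy "configure": probe whether the C compiler can find zlib.h,
# then write the answer into a Makefile fragment.
CC=${CC:-cc}

# Write a minimal test program that only succeeds if the header exists.
cat > conftest.c <<'EOF'
#include <zlib.h>
int main(void) { return 0; }
EOF

# Try to compile it; the exit status is the test result.
if "$CC" -c conftest.c -o conftest.o 2>/dev/null; then
    have_zlib=1
else
    have_zlib=0
fi
rm -f conftest.c conftest.o

# Record the result for the build system to use.
echo "HAVE_ZLIB = $have_zlib" > config.mk
echo "checking for zlib.h... $have_zlib"
```

A real configure script runs dozens or hundreds of probes like this one, which is exactly why nobody wants to maintain them by hand.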

They did, and if the tests you want to use are already implemented they work very well indeed. Unfortunately they’re built on an unholy combination of shell scripting and the archaic Gnu M4 macro language. That means if you want to write new tests you need to understand both of these as well as the architecture of the tools themselves — not an easy task for the average self-taught research programmer.

SC Config, then, called for a re-engineering of the autoconf concept, to make it easier for researchers to make their code available in a portable, platform-independent format. The second round configuration tool winner was SapCat, “a tool to help make software portable”. Unfortunately, this one seems not to have gone anywhere, and I could only find the original proposal on the Internet Archive.

There were a lot of good ideas in this category about making catalogues and databases of system quirks to avoid having to rerun the same expensive tests again the way a standard ./configure script does. I think one reason none of these ideas survived is that they were overly ambitious, imagining a grand architecture where their tool would provide some overarching source of truth. This is in stark contrast to the way most Unix-like systems work, where each tool does one very specific job well and tools are easy to combine in various ways.

In the end though, I think Moore’s Law won out here, making it easier to do the brute-force checks each time than to try anything clever to save time — a good example of avoiding unnecessary optimisation. Add to that the evolution of the generic pkg-config tool from earlier package-specific tools like gtk-config, and it’s now much easier to check for particular versions and features of common packages.
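For comparison, here’s roughly what the pkg-config workflow looks like today (guarded so it degrades gracefully on systems where pkg-config or the example package isn’t installed):

```shell
#!/bin/sh
# Ask pkg-config for build flags instead of probing the system ourselves.
# The flags come from the package's installed .pc metadata file.
if command -v pkg-config >/dev/null 2>&1; then
    if pkg-config --exists zlib 2>/dev/null; then
        echo "zlib version: $(pkg-config --modversion zlib)"
        # Compiler and linker flags, ready to splice into a build command:
        pkg-config --cflags --libs zlib
    else
        echo "zlib.pc not found"
    fi
else
    echo "pkg-config not installed"
fi
```

One query replaces a whole family of hand-rolled version and feature checks, which is a big part of why the old package-specific `gtk-config`-style scripts died out.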

On top of that, much of the day-to-day coding of a modern researcher happens in interpreted languages like Python and R, which give you a fully-functioning pre-configured environment with a lot less compiling to do.

As a side note, Tom Tromey, another of the shortlisted entrants in this category, is still a major contributor to the open source world. He still seems to be involved in the automake project, contributes a lot of code to the Emacs community too, and blogs sporadically at The Cliffs of Inanity.
