Feed aggregator

Ed Summers: Discourse and Pragmatics

planet code4lib - Wed, 2017-02-08 05:00

In keeping with past semesters I’m going to try posting my written notes for class here. This is mostly peer pressure on myself to think about how I’m writing a bit more publicly. Although the reality is it’s mostly going to be lost on the Web.

I’m taking two classes this semester: Discourse Analysis (edci788) and Documentation, Collection and Appraisal (lbsc785). The latter I’m technically co-teaching with my advisor Ricky Punzalan, but the reality is I’m learning lots about appraisal practices from him as well as the practitioners who are in the class. The Discourse Analysis class requires a written summary each week, and this is the first of those.

It’s always a bit weird and maybe risky learning in public, but you only live once, right? Ahem. I’d love to hear from you in comments/annotations here or on Twitter or email if any of this gives you any ideas or prompts any questions.

The readings this week focused on discourse and pragmatics. Paltridge (2012) defines pragmatics as the study of how the meaning of spoken and written discourse is related to the context in which that speech and writing occurs. Context here is taken to be the particular social situation that the discourse takes place in, the other text or speech it is situated with, and any background knowledge that it relies upon.

One of the foundational concepts in pragmatics is speech act theory, which is the idea that words do things in the world. Words have a literal meaning that can be analyzed for its truth or falsehood. But words also can be used to effect change in the world, to perform actions. Searle distinguished between these two types of acts as locutionary and illocutionary acts. And the actual action that is caused by the words is the perlocutionary act.

One practical example of this is the act of saying “I do” in a marriage ceremony. The words have a literal meaning, and perform the action of becoming legally married. They are also tied to the social situation in which they occur: the marriage ceremony, their partner’s speech and the speech of the marriage official. This example also highlights how various conditions can influence whether a specific speech act works or not. Austin called these felicity conditions, which Searle interpreted somewhat rigidly as rules.

Pragmatics is also specifically concerned with the cooperative principle: the theoretical perspective that discourse is a function of participants having a shared interest or purpose, which gives the discourse a unifying shape and prevents it from being just a series of random and disconnected topics. Grice (1975) introduced this idea and provides four categories, or maxims, that help identify the operation of the cooperative principle in discourse:

  • quantity: make contribution informative, but not more informative than needed
  • quality: try to make a contribution that is true (not false, or lacking in evidence)
  • relation: moves in topic need to fit certain parameters
  • manner: how something is said (not what)

Grice uses these maxims in order to show how speech and language do not simply fit into either a formal (scientific) or informal (humanistic) analysis. To do this he introduces the idea of the implicature which is a meaning that is not explicitly provided in the literal analysis of the words in discourse, but can be ascertained by looking at how speech interacts with the four maxims in various ways:

  • when a maxim is quietly violated
  • when a participant explicitly opts-out from a maxim
  • when the fulfillment of one maxim is in contradiction, tension with another maxim: a clash
  • when a maxim is openly disregarded or flouted

Grice uses very short snippets of conversation, mostly just paired statements: A says this, B says this in response. He uses these snippets to illustrate the fulfillment of the four maxims, and how this can give rise to implicatures, or meanings that are not explicitly provided in the literal text.

In contrast, Kasper (2006) also looks at pragmatics but uses much longer sequences of conversation. This makes sense because Kasper uses the lens of Conversation Analysis (CA) to examine pragmatics, or meaning making. CA requires looking at more than just pairs of utterances: it looks at whole conversations. Kasper critiques the rationalist foundations of speech act theory by questioning the idea that the meaning of an utterance is related to the internal state of the speaker, and that in turn, the listener receives and internalizes that meaning. This telementation model, where meaning is transmitted from speaker to listener, does not, in Kasper’s eyes, sufficiently describe the way that meaning is arrived at or generated. For Kasper meaning is co-constructed by participants, and rather than being transmitted it is emergent and highly contextual. Conversation Analysis’ attention to the specific details of full conversations allows meaning and context to be understood in their specificity as collaborative ventures, where the whole can be larger than the sum of its parts.

Taguchi (2015) provides an example of using cross-cultural speech act theory to look at competencies of language learners. Culture is an important dimension to understanding speech acts because the mechanics of speech, and the significance of particular word choices, are not necessarily portable across cultures. Taguchi is specifically interested in how spending a year abroad can change the learners’ cultural awareness and their ability to generate speech acts, or their language competency. The specific research question was to see if cultural adjustment is correlated with language skill.

To achieve this Taguchi measures intercultural competence and pragmatic competence in a group of 20 Japanese language learners before and after their semester abroad. Intercultural competence is measured using a tool called the Cross-Cultural Adaptability Inventory, which is essentially a survey of 50 questions that measures several factors using a Likert scale. Pragmatic competence is measured using an oral discourse completion test (DCT). This test collects what language learners think they would say in a particular situation, and the responses were then evaluated by Japanese speakers with respect to the speech style and speech act using a six point scale. The results were then analyzed statistically using the t-test to see if there was any correlation between changes in cultural adaptability and language use. They found that intercultural competence was correlated with appropriate speech acts, but not with speech style. The authors conjectured that this could be a failing of the DCT, or perhaps a result of their relatively small sample size.

The readings this week provided lots of different views on the idea of speech acts and discourse pragmatics. It was clear to me on reading them that this is a very deep area of research, where there is a great deal of theoretical work to draw on. I haven’t completely decided yet what I am going to be studying as part of my research project. I’m specifically interested in looking at how archivists decide what is valuable when collecting material from the Web, and I have three different data sources in mind:

  • a set of interviews I conducted with web archivists about their appraisal process
  • online conversations in Internet Relay Chat between volunteer archivists in the ArchiveTeam community
  • written collection development policies from different institutions

I think that discourse pragmatics could be used in all three, but probably would work best in the first two because of their conversational aspect. The idea of value in appraisal work is a slippery concept, and I think Grice’s idea of implicatures could be very useful in reading between the lines of how archivists ascribe value to material. Also, looking at the discussion through a cooperative lens could be useful since archivists do tend to look at what they are doing as a cooperative enterprise: a community of practice that is centered on preserving material for use by records creators and researchers. I also think Kasper’s use of conversational analysis could uncover emergent meanings in the interviews or transcripts to help uncover new understandings about this community of practice, its cooperative ideas and activities. I’m not particularly keen on making statistical claims like Taguchi, mostly because I don’t think questions of value lend themselves to statistical analyses so much as they do qualitative measures. But I’d like to be proven wrong if there are good tools for achieving that.


Grice, H. P. (1975). Syntax and semantics: Speech acts. In (Vol. 3). New York: Academic Press.

Kasper, G. (2006). Pragmatics & language learning. In K. Bardovi-Harlig, J. C. Félix-Brasdefer, & A. S. Omar (Eds.), (pp. 281–314). Natl Foreign Lg Resource Ctr.

Paltridge, B. (2012). Discourse analysis: An introduction. Bloomsbury Publishing.

Taguchi, N. (2015). Cross-cultural adaptability and development of speech act production in study abroad. International Journal of Applied Linguistics, 25(3), 343–365.

DuraSpace News: COAR Seeks Comments on Vision for Next Generation Repositories

planet code4lib - Wed, 2017-02-08 00:00

From Kathleen Shearer, Executive Director, Confederation of Open Access Repositories (COAR)

Göttingen, Germany  COAR is pleased to announce the publication of the initial outcomes of the COAR Next Generation Repositories Working Group for public comment. 

Jonathan Rochkind: ruby VCR, easy trick for easy re-record

planet code4lib - Tue, 2017-02-07 22:15

I do a lot of work with external HTTP APIs, and I love the vcr gem for use in writing tests/specs involving these. It records the interaction, so most of the time the tests are running based on a recorded interaction, not actually going out to the remote HTTP server.

This makes the tests run faster, it makes more sense on a CI server like Travis, it lets tests run automatically without having to hard-code credentials in for authenticated services (make sure to use VCR’s filter_sensitive_data feature; figuring out a convenient way to do that with real world use cases is a different discussion), and it even lets people run the tests without having credentials themselves at all to make minor PRs and such.

But in actual local dev, I sometimes want to run my tests against live data for sure, often as the exact HTTP requests change as I edit my code. Sometimes I need to do this over and over again in a cycle. Previously, I was doing things like manually deleting the relevant VCR cassette files, to ensure I was running with live data, or to avoid VCR “hey, this is a new request buddy” errors.

Why did I never think of using the tools VCR already gives us to make it a lot easier on myself?

Normally it works as always, but I just gotta VCR=all ./bin/rspec to do that run with freshly recorded cassettes. Or VCR=all ./bin/rspec some_specific_spec.rb to re-record only that spec, or only the specs I’m working on, etc.
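A minimal sketch of how a VCR=all switch like this could be wired up, assuming the record mode is read from an environment variable named VCR and falls back to :once when it is unset. The helper name vcr_record_mode and the fallback are assumptions for illustration; only VCR.configure, default_cassette_options and the record modes themselves come from the gem.

```ruby
# Sketch only: map the VCR environment variable onto a VCR record mode.
# The helper name and the :once fallback are assumptions, not gem features.
VCR_RECORD_MODES = %i[once all new_episodes none].freeze

def vcr_record_mode(env = ENV)
  value = env["VCR"]
  # No variable set: behave as usual and replay existing cassettes.
  return :once if value.nil? || value.strip.empty?

  mode = value.strip.to_sym
  unless VCR_RECORD_MODES.include?(mode)
    raise ArgumentError, "unknown VCR record mode: #{value.inspect}"
  end
  mode
end

# Only touch VCR when the gem has actually been loaded,
# for example from spec_helper.rb or rails_helper.rb.
if defined?(VCR)
  VCR.configure do |config|
    config.default_cassette_options = { record: vcr_record_mode }
  end
end
```

Rejecting unknown values up front means a typo such as VCR=al fails loudly instead of silently replaying old cassettes.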

Geez, I should have figured that out years ago. So I’m sharing with you.

Just don’t ask me if it makes more sense to put VCR configuration in spec_helper.rb or rails_helper.rb. I still haven’t figured out what that split is supposed to be about honestly. I mean, I do sometimes VCR specs of service objects that do not have Rails dependencies…. but I usually just drop it (and all my other config) in rails_helper.rb and ignore the fact that rspec these days is trying to force us to make a choice I don’t really understand the implications or utility of and don’t want to think about.

District Dispatch: Progress! Email Privacy Act clears House

planet code4lib - Tue, 2017-02-07 17:02

Congratulations library advocates! For the second time in just over 9 months, the US House of Representatives last night passed the Email Privacy Act (H.R. 387 in this Congress) by voice vote.  Propelled by more than 1,100 library supporters, the bill now moves to the Senate where the timing of its consideration – and ultimate fate – are not yet clear.


As previously discussed in DD, the bill’s primary purpose and benefit is to finally update the anachronistic Electronic Communications Privacy Act (ECPA). This would require law enforcement authorities to obtain a judicial search warrant based on probable cause in order to obtain the actual content of an individual’s email, texts, tweets, cloud-stored files and photos or other electronic information. Under ECPA as still written, no such warrant typically is required for electronic communications older than six months. (This ACLU infographic lays out the problem well.)

Next month will mark the 6th anniversary of ALA’s charter membership in the Digital Due Process coalition, formed to harness the grassroots and Washington muscle of many organizations and companies in the service of ECPA reform.  With just one Senate vote between us and that goal, we’re not about to let up now.  Please stay tuned for yet another action alert, this time focused on the Senate, once we and our partners know more about when that will have the best chance of putting the Email Privacy Act on the President’s desk.

The post Progress! Email Privacy Act clears House appeared first on District Dispatch.

David Rosenthal: Coronal Mass Ejections (again)

planet code4lib - Tue, 2017-02-07 16:00
Back in 2014 I blogged about one of digital preservation's less well-known risks, coronal mass ejections (CME).  Additional information accumulated in the comments. Last October:
"President Barack Obama .. issued an Executive Order that defines what the nation’s response should be to a catastrophic space weather event that takes out large portions of the electrical power grid, resulting in cascading failures that would affect key services such as water supply, healthcare, and transportation."

Two recent studies brought the risk back into focus and convinced me that my 2014 post was too optimistic. Below the fold, more gloom and doom.

Mark Gilbert's How Space Could Trigger a Future Economic Crisis reports on a new paper in Space Weather:
In four scenarios envisaging the economic impact of a solar storm, the mildest triggers a daily loss to the U.S. economy of $6.2 billion, or 15 percent of daily output; the worst case sees a cost of $41.5 billion, wiping out every dollar the world’s largest economy generates each day.

and:
A study published last month by the Cambridge Centre for Risk Studies estimates that a solar storm would have the potential to wipe between $140 billion to $613 billion off the global economy in a five-year time span, depending on the severity of the impact.

According to a NASA blog post, the probability is 12% per decade:
In February 2014, physicist Pete Riley of Predictive Science Inc. published a paper in Space Weather entitled "On the probability of occurrence of extreme space weather events." In it, he analyzed records of solar storms going back 50+ years. By extrapolating the frequency of ordinary storms to the extreme, he calculated the odds that a Carrington-class storm would hit Earth in the next ten years.

The answer: 12%.

Macroeconomic impact

So there is about a 1-in-8 chance that in the next decade we will face one of the Cambridge scenarios. They divide the economic impact of the severe scenario's CME impacting the US into four areas:
  • Direct Impacts. It takes 5 months to restore power to 95% of the US population.
  • Indirect Supply Chain Impacts. The impact of power outages on international supply chains is bigger than their direct impact.
  • Macroeconomic Impacts. There is a large initial hit to US domestic product, but a fairly rapid recovery as government spends on recovery.
  • Insurance Impacts. For various reasons, insurance companies bear only about 14% of the economic loss, but this still amounts to about 4 times the total catastrophe losses they bear in a normal year.
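To put the 12%-per-decade figure in perspective, it can be converted to other time horizons. The sketch below assumes, purely for illustration, that each year is independent and equally risky; that assumption is not taken from the quoted studies, and the function names are made up.

```ruby
# Sketch: convert a probability over a span of years to other horizons,
# assuming (for illustration only) independent, equally risky years.

# Per-year probability implied by a probability over `years` years.
def annual_probability(prob, years)
  1.0 - (1.0 - prob)**(1.0 / years)
end

# Probability of at least one event over `years` years.
def probability_over(annual, years)
  1.0 - (1.0 - annual)**years
end

per_year = annual_probability(0.12, 10)
puts format("per year: %.2f%%", per_year * 100)
puts format("over 30 years: %.1f%%", probability_over(per_year, 30) * 100)
```

Under that assumption the per-year risk is a little over 1%, and the risk over a 30-year preservation horizon is roughly one in three.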
I've always said that the chief threat to digital preservation is economic; digital information being very vulnerable to interruptions in the money supply. In the context of economic losses of the magnitude envisaged by the Cambridge report, digital preservation systems would be very low on the priority list for recovery funds.

The risk of CMEs is one reason Facebook has advanced for their investment in optical storage for cold data. A CME could destroy the electronics in the racks, but it would not destroy the data on the DVDs. Actually, a CME is equally unlikely to destroy the data on hard disk platters, but destroying the drive electronics makes that data very expensive to recover.

DPLA: Color Our Collections 2017

planet code4lib - Tue, 2017-02-07 15:00

That’s right, folks — #ColorOurCollections is back for kids and grown-ups alike! On your next lunch break, free evening, or Saturday afternoon, try your hand at coloring cultural heritage collections from institutions across the country.

This year’s selection from DPLA includes an array of art, posters, inventions, landscapes and animals. For even more choices, last year’s images are still fair game too!

Color your favorites and share them with us all week at @DPLA or on Facebook using #ColorOurCollections.

Download all DPLA #ColorOurCollections coloring pages


To learn more about the campaign and find other participating institutions, visit

LITA: Call for LITA Guides

planet code4lib - Tue, 2017-02-07 14:56

LITA is looking to expand its popular LITA Guide series. Topics for consideration include:

  • Tools for big data
  • Developing in-house technology expertise
  • Budgeting for technology
  • Writing a technology plan
  • K-12 technology
  • Applications of agile development for libraries
  • Grant writing for library technology
  • Security for library systems

Do you have expertise in any of these areas? Reach out to Marta Deyrup, Acquisitions Editor.



State Library of Denmark: juxta – image collage with metadata

planet code4lib - Tue, 2017-02-07 14:38

Creating large collages of images to give a bird’s eye view of a collection seems to be gaining traction. Two recent initiatives:

Combining those two ideas seemed like a logical next step and juxta was born: A fairly small bash-script for creating million-scale collages of images, with no special server side.  There’s a small (just 1000 images) demo at SBLabs.

Presentation principle

The goal is to provide a seamless transition from the full collection to individual items, making it possible to compare nearby items with each other and locate interesting ones. Contextual metadata should be provided for general information and provenance.

Concretely, the user is presented with all images at once and can zoom in to individual images in full size. Beyond a given threshold, metadata are shown for the image currently under the cursor, or finger if a mobile device is used. An image description is displayed just below the focused image, to avoid disturbing the view. A link to the source of the image is provided on top.

Overview of historical maps

Meta-data for a specific map

Technical notes, mostly on scaling

On the display side, OpenSeadragon takes care of the nice zooming. When the user moves the focus, a tiny bit of JavaScript spatial math resolves image identity and visual boundaries.

OpenSeadragon uses pyramid tiles for display and supports the Deep Zoom protocol, which can be implemented using only static files. The image to display is made up of tiles of (typically) 256×256 pixels. When the view is fully zoomed in, only the tiles within the viewport are requested. When the user zooms out, the tiles from the level above are used. The level above is half the width and half the height and is thus represented by ¼ the number of tiles. And so forth.
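The halving of levels can be illustrated with a short sketch that lists every level of a tile pyramid for a given image size. It is not OpenSeadragon or juxta code; the function name is made up, and rounding odd dimensions up when halving is an assumption.

```ruby
# Sketch: list the levels of a tile pyramid, from full size down to 1x1 pixel.
# Rounding up when halving odd dimensions is an assumption for illustration.
def pyramid_levels(width, height, tile_size: 256)
  levels = []
  loop do
    # Number of tiles needed to cover the level, rounded up.
    cols = (width + tile_size - 1) / tile_size
    rows = (height + tile_size - 1) / tile_size
    levels << { width: width, height: height, cols: cols, rows: rows, tiles: cols * rows }
    break if width == 1 && height == 1

    # The level above is half the width and half the height.
    width = (width + 1) / 2
    height = (height + 1) / 2
  end
  levels
end

pyramid_levels(1024, 768).each do |level|
  puts "#{level[:width]}x#{level[:height]}: #{level[:cols]}x#{level[:rows]} = #{level[:tiles]} tile(s)"
end
```

Because partial tiles are rounded up, the ¼ ratio between levels is exact only when both dimensions are multiples of twice the tile size.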

Generating tiles is heavy

A direct way of creating the tiles is

  1. Create one large image of the full collage (ImageMagick’s montage is good for this)
  2. Generate tiles for the image
  3. Scale the image down to 50%×50%
  4. If the image is larger than 1×1 pixel then goto 2

Unfortunately this does not scale particularly well. Depending on size and tools, it can take up terabytes of temporary disk space to create the full collage image.

By introducing a size constraint, juxta removes this step: All individual source images are scaled & padded to have the exact same size. The width and height of the images are exact multiples of 256. Then the tiles can be created by

  1. For each individual source image, scale, pad and split the image directly into tiles
  2. Create the tiles at the level above individually by joining the corresponding 4 tiles below and scale to 50%×50% size
  3. If there is more than 1 tile or that tile is larger than 1×1 pixel then goto 2

As the tiles are generated directly from either source images or other tiles, there is no temporary storage overhead. As each source image and each tile are processed individually, it is simple to do parallel processing.
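The coordinate bookkeeping behind this bottom-up approach can be sketched as follows. This is an illustration rather than juxta's actual bash implementation: the row-major placement of images in the collage and all function names are assumptions.

```ruby
# Sketch: derive tile coordinates from an image's position in the collage,
# so no full collage image is ever built. Row-major placement is assumed.

# Tiles (as [x, y]) covered at full zoom by the image at `index`.
def image_tiles(index, images_per_row:, tiles_wide:, tiles_high:)
  col = index % images_per_row
  row = index / images_per_row
  (0...tiles_high).flat_map do |dy|
    (0...tiles_wide).map { |dx| [col * tiles_wide + dx, row * tiles_high + dy] }
  end
end

# The tile one level up that a given tile is merged into.
def parent_tile(x, y)
  [x / 2, y / 2]
end

# The (up to) four tiles one level down that are joined and scaled to 50%.
def child_tiles(x, y)
  [[2 * x, 2 * y], [2 * x + 1, 2 * y], [2 * x, 2 * y + 1], [2 * x + 1, 2 * y + 1]]
end

p image_tiles(11, images_per_row: 10, tiles_wide: 2, tiles_high: 2)
p child_tiles(1, 1)
```

Each call depends only on its own arguments, which is what makes it straightforward to process source images and tiles in parallel.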

Metadata takes up space too

Displaying image-specific metadata is simple when there are just a few thousand images: Use an in-memory array of Strings to hold the metadata and fetch it directly from there. But when the number of images goes into the millions, this quickly becomes unwieldy.

juxta groups the images spatially in buckets of 50×50 images. The metadata for all the images in a bucket are stored in the same file. When the user moves the focus to a new image, the relevant bucket is fetched from the server and the metadata are extracted. A bucket cache is used to minimize repeat calls.
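A rough sketch of the bucket lookup and cache described above. The class name, the loader block (a stand-in for fetching a bucket file from the server) and the in-bucket layout are all hypothetical; juxta itself does this in JavaScript on the client.

```ruby
# Sketch: group image metadata in buckets of 50x50 images and cache buckets.
# The loader block stands in for the server fetch; all names are assumptions.
class MetadataBuckets
  def initialize(bucket_edge: 50, &loader)
    @bucket_edge = bucket_edge
    @loader = loader
    @cache = {}
  end

  # Bucket coordinates for the image at grid position (col, row).
  def bucket_for(col, row)
    [col / @bucket_edge, row / @bucket_edge]
  end

  # Metadata for one image; a bucket is loaded once, then served from cache.
  def metadata(col, row)
    key = bucket_for(col, row)
    bucket = (@cache[key] ||= @loader.call(*key))
    bucket[[col, row]]
  end
end

# Usage with a fake loader that fabricates placeholder metadata per bucket.
buckets = MetadataBuckets.new do |bx, by|
  Hash.new { |_, (col, row)| "image #{col},#{row} from bucket #{bx},#{by}" }
end
puts buckets.metadata(120, 75)
```

The cache in this sketch is unbounded; a real viewer would probably want to evict old buckets once enough of them have been loaded.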

Most file systems don’t like to hold a lot of files in the same folder

While the limits differ, common file systems such as ext, hfs & ntfs all experience performance degradation with high numbers of files in the same folder.

The Deep Zoom protocol in conjunction with file-based tiles means that the number of files at the deepest zoom level is linear in the number of source images. If there are 1 million source images, with full-zoom size 512×512 pixels (2×2 tiles), the number of files in a single folder will be 2*2*1M = 4 million. That is far beyond the comfort zone of the mentioned file systems (see the juxta readme for tests of performance degradation).

juxta mitigates this by bucketing tiles in sub-folders. This ensures linear scaling of build time at least up to 5-10 million images. 100 million+ images would likely deteriorate build performance markedly, but at that point we are also entering “is there enough free inodes on the file system?” territory.
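The file-count arithmetic and the idea of spreading tiles over sub-folders can be sketched like this. The folder layout and the folder_edge value are made up for illustration and are not the layout juxta uses.

```ruby
# Sketch: count full-zoom tile files and spread them over sub-folders.
# The path layout below is a made-up example, not juxta's actual layout.
def full_zoom_tile_count(images:, tiles_wide:, tiles_high:)
  images * tiles_wide * tiles_high
end

# Put tile (x, y) in a sub-folder shared by a folder_edge x folder_edge block
# of tiles, so no folder holds more than folder_edge**2 files.
def bucketed_tile_path(level, x, y, folder_edge: 100)
  format("%d/%d_%d/%d_%d.jpg", level, x / folder_edge, y / folder_edge, x, y)
end

puts full_zoom_tile_count(images: 1_000_000, tiles_wide: 2, tiles_high: 2)
puts bucketed_tile_path(12, 1234, 567)
```

With an edge of 100 tiles, each folder in this sketch holds at most 10,000 files regardless of how large the collage grows.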

Unfortunately the bucketing of the tile files is not in the Deep Zoom standard. With OpenSeadragon, it is very easy to change the mapping, but it might be more difficult for other Deep Zoom-expecting tools.

Some numbers

Using a fairly modern i5 desktop and 3 threads, generating a collage of 280 5MPixel images, scaled down to 1024×768 pixels (4×3 tiles) took 88 seconds or about 3 images/second. Repeating the experiment with a down-scale to 256×256 pixels (smallest possible size) raised the speed to about 7½ images/second.

juxta comes with a scale-testing script that generates sample images that are close (but not equal) to the wanted size and repeats them for the collage. With this near-ideal match, processing speed was 5½ images/second for 4×3 tiles and 33 images/second for 1×1 tiles.

The scale-test script has been used up to 5 million images, with processing time practically linear to the number of images. At 33 images/second that is 42 hours.
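The figures above can be checked with a few lines of arithmetic. The sketch assumes throughput stays constant for the whole run, which matches the "practically linear" observation but is still an approximation; the function name is made up.

```ruby
# Sketch: estimate build time from a measured throughput,
# assuming the rate stays constant for the whole run.
def hours_for(images, images_per_second)
  images / images_per_second.to_f / 3600
end

puts format("%.1f images/second", 280 / 88.0)
puts format("%.0f hours", hours_for(5_000_000, 33))
```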

Open Knowledge Foundation: Open Data by default: Lorca City Council is using OpenSpending to increase transparency and promote urban mobility.

planet code4lib - Tue, 2017-02-07 10:00

Castillo de Lorca. Torre Alfonsina (Public Domain)

Lorca, a city located in the South of Spain with currently 92,000 inhabitants, launched its open data initiative on January 9th 2014. Initially it offered 23 datasets containing transport, mobility, statistical and economic information. From the very beginning, OpenSpending was the tool selected by Lorca City Council because of its capabilities and incredible visualization abilities.

The first upload of datasets was done in 2013, on the previous version of OpenSpending. With the OpenSpending relaunch last year, Lorca City Council continued to make use of the OpenSpending datastore, while the TreeMap view of the expenditure budget was embedded on the council’s open data website.

In December 2016, the council’s open data website was redesigned, including budget datasets built with the new version of OpenSpending. The accounting management software of Lorca allows the automatic conversion of data files to CSV format, so these datasets are compatible with the formats required by OpenSpending.

Towards more transparency and becoming a smart city

In 2015, when the City of Lorca transparency website was launched, the council decided to continue with the same strategy focused on visualization tools to engage citizens with an intuitive approach to the budget data.

Lorca is a pioneering city in the Region of Murcia in terms of open data and transparency. So far, 125 datasets have been released and much information is available along with the raw data.

It is worth highlighting the pilot projects to bring open data to schools, which were carried out during the past year. In 2017, we will resume teaching the culture of open data to school children, with the main goal of demonstrating how to work with data by using open data.

In the near future the council plans to open more data directly from the sources, i.e. achieve a policy of open data by default.

And of course Lorca intends to continue exploring other possibilities that OpenSpending offers us to provide all this data to the citizenry. In addition, Lorca is working to become a smart city (article in Spanish only) – open data is a key element in this goal. Therefore, Lorca’s open data initiative will be a part of the Smart Social City strategy from the very beginning.

DuraSpace News: Telling Fedora 4 Stories at the University of Alberta with Geoff Harder, Peter Binkley, and Leah Vanderjagt

planet code4lib - Tue, 2017-02-07 00:00

“Telling Fedora 4 Stories” is an initiative aimed at introducing project leaders and their ideas to one another while providing details about Fedora 4 implementations for the community and beyond.

District Dispatch: Archived webinar on Sci-Hub and resource sharing now available

planet code4lib - Mon, 2017-02-06 21:56

Plan ahead! One hour CopyTalk webinars occur on the first Thursday of every month at 11 a.m. Pacific / 2 p.m. Eastern.

An archived copy of the CopyTalk webinar “Open Access ‘Pirates:’ Sci-Hub and #icanhazpdf as Resource Sharing” is now available. Originally webcast on February 2, 2017, by the Office for Information Technology Policy’s Copyright Education subcommittee, this webinar was one of our most popular CopyTalks of all time.

Presenters were Carolyn Caffrey Gardner from California State University Dominguez Hills and Gabriel J. Gardner from California State University Long Beach. They showed their latest research on who uses Sci-Hub or other guerrilla fulfillment sites and why. In addition, they described the various ways people use and build guerrilla sites, both centralized (active and planned deployment) and decentralized (crowd-sourcing). Is this just a supply and demand issue, or is something else afoot?

You can watch the full CopyTalk and view the slides on the Office for Information Technology Policy’s website here.

The post Archived webinar on Sci-Hub and resource sharing now available appeared first on District Dispatch.

LITA: What’s so super about supercomputing? A joint LITA and ACRL webinar

planet code4lib - Mon, 2017-02-06 17:44

What’s so super about supercomputing? A very basic introduction to high performance computing

Presenters: Jamene Brooks-Kieffer and Mark J. Laufersweiler
Tuesday February 28, 2017
2:00 pm – 3:30 pm Central Time

Register Online, page arranged by session date (login required)

This 90 minute webinar provides a bare-bones introduction to high-performance computing, also known as HPC, supercomputing, and under many other monikers. This program is a unique attempt to connect the academic library to introductory information about HPC. Librarians who are learning about researchers’ data-intensive work should consider familiarizing themselves with the computing environment often used to conduct that work.

Academic librarians, particularly, face a landscape in which many of their users conduct part or all of their research using computation. Bibliometric analysis, quantitative statistical analysis, and geographic data visualizations are just a few examples of computationally-intensive work underway in humanities, social science, and science fields.

Covered topics will include:

  • Why librarians should care about HPC
  • HPC terminology and working environment
  • Examples of problems appropriate for HPC
  • HPC resources at institutions and nation-wide
  • Low-cost entry-level programs for learning distributed computing

The webinar slide set and a handout that includes a glossary of basic HPC terminology as well as HPC resources will be made available.

Details here and Registration here

Webinar takeaways will include:

  • Attendees will learn the basic terminology of high performance computing.
  • Attendees will be introduced to the working environment commonly used for high performance computing.
  • Attendees will gain information on institutional and national high performance computing resources available to researchers.

Jamene Brooks-Kieffer brings a background in electronic resources to her work as Data Services Librarian at the University of Kansas. She regularly teaches on data management practices to audiences of faculty, graduate students, and undergraduates. She has engaged library professionals in many in-person and virtual programs at venues including Electronic Resources & Libraries, Coalition for Networked Information, and a Great Plains Network / Greater Western Library Association webinar series.

Dr. Mark Laufersweiler has, since the Fall of 2013, served as the Research Data Specialist for the University of Oklahoma Libraries. He is currently assisting the educational mission of the Libraries by developing and offering workshops, seminars and short courses, helping to inform the university community on best practices for data management and data management planning. He is the university’s representative as a member of the Software Carpentry Foundation and is an active instructor as well. He is a strong advocate of open source software and open access to data.

Look here for current and past LITA continuing education offerings

Questions or Comments?

contact LITA at (312) 280-4268 or Mark Beatty,
contact ACRL at (312) 280-2522 or Margot Conahan,

District Dispatch: Alarming new FCC moves

planet code4lib - Mon, 2017-02-06 16:45

ALA is concerned about announcements made in last week’s “Friday media dump”

Last Friday, Federal Communications Commission Chairman Ajit Pai rescinded close to a dozen policies of the FCC, including rulemakings on expanding the program providing Internet service to low income households, rulings on several TV stations’ violations of political file rules and further restricting TV shared services and joint sales agreements. Chairman Pai also announced the end of the Commission’s probe into the controversial wireless “zero rating” data plans.

The American Library Association has been a proud partner in initiatives to support broadband opportunity and access to information, including the expansion of the Lifeline program. We have also supported many of the policies improving equity and access to information that the Chairman unilaterally rescinded on Friday. We believe these moves will make the digital divide wider and are troubled by the direction this Chairman appears to be heading with “Friday news dumps” that give little to no time for discussion or dissent. Please see below for a statement from ALA President Julie Todaro on Friday’s alarming moves by the FCC:

On February 3, 2017, the Federal Communications Commission (FCC) revoked all of the designations of Lifeline Broadband Providers and ordered the retraction of multiple reports, including the “E-rate Modernization Progress Report” and “Improving the Nation’s Digital Infrastructure.”

The American Library Association (ALA) is dismayed by these actions to reduce digital opportunity and revise the public record. ALA President Julie Todaro released the following statement.

“The American Library Association (ALA) strenuously objects to recent actions by the Federal Communications Commission (FCC). First, the ALA is alarmed by the sudden revocation of the nine Lifeline Broadband Provider designations. Reducing options for Lifeline broadband services is a step back in efforts to close the homework gap and digital divide, and is at odds with Chairman Pai’s stated desire to advance digital empowerment. The 2016 Lifeline modernization order represented a critical milestone in our national commitment to connect low-income Americans to the broadband that powers educational and economic opportunity. ALA and our nation’s 120,000 libraries are committed to advancing digital opportunity for all, and we urge the FCC to increase the number of broadband options available for Lifeline customers.

“The ALA also calls for the FCC to maintain an accurate and complete historical record. While new FCC leadership may have new policy directions, the public record should not be permanently altered. Governmental agencies must be accountable in this regard. We urge the reversal of the retraction decisions and an agreement that the FCC will not order the removal of any other documents from the public record. Such actions undermine the credibility of the FCC and Chairman Pai’s recent move to increase transparency of the Commission’s rulemaking.

“Full and public debate with the accompanying historical record preserved on these foundational internet issues that affect every person in this country should be the standard we expect and demand.”

The post Alarming new FCC moves appeared first on District Dispatch.

Islandora: Islandora Foundation: New Members

planet code4lib - Mon, 2017-02-06 16:12

The Islandora Foundation is funded entirely by support from our member organizations, so we are very grateful to announce that we are welcoming two new members: the University of Texas at Austin and Digital Echidna.

UT Austin has long been a major implementer of Islandora and engaged with the community. They join the Islandora Foundation as a Collaborator.

Digital Echidna is newer to the scene, but has already made a mark with the contribution of several modules to the Islandora community and sponsorship of Islandoracon. They join as Members.

These members, plus renewed commitments from our existing members, bring our Lobstometre up another few notches:

If your institution would also like to support Islandora and become more engaged with the community, please consider membership. You can also support Islandora as an Individual Member with a donation of your choosing.

OCLC Dev Network: Calling CABs: Obtaining 3,000 required and recommended readings each semester

planet code4lib - Mon, 2017-02-06 14:00

As part of the process of optimizing the alignment of our book collection with the teaching and learning needs of the colleges, the Claremont Colleges Library launched a service designed to provide students with improved access to approximately 3,000 required and recommended readings each semester known as Course Adopted Books (CABs).

Library of Congress: The Signal: FADGI’s 10th Anniversary: Adapting to Meet the Community’s Needs

planet code4lib - Mon, 2017-02-06 13:53

This is a guest post by Kate Murray, IT Specialist in the Library of Congress’s Digital Collections and Management Services.

Note the dot over the “i” in Guidelines connects with the “g” in Agencies, which reflects FADGI’s collaborative ethos of working together and that guidelines should always intersect with agency needs.

Started in 2007 as a collaborative effort by federal agencies, FADGI has many accomplishments under its belt, including the widely implemented Technical Guidelines for Digitizing Cultural Heritage Materials (newly updated in 2016); open source software, including OpenDICE and AutoSFR and BWF MetaEdit; file format comparison projects; standards work, including the MXF AS-07 Application Specification and Sample Files; projects related to scanning motion picture film; embedded metadata in Broadcast Wave, DPX and TIFF (PDF) files and many more. Check out the handy summary chart (PDF) of our accomplishments, impacts and benefits to date.

Our 10th anniversary is 2017, so it’s a good time to think about a bit of an update as we head into our second decade.

First let’s talk about our name. “FADGI” (fah – jee), we readily admit, does not exactly roll off the tongue. But we’re a well-established brand name now so “FADGI” we stay but with an update. Up until now, the FADGI acronym stood for the Federal Agencies Digitization Guidelines Initiative because we’ve mainly been focused on developing technical guidelines, methods and practices for the digitization of historical content in a sustainable manner. In recent years however, the FADGI Still Image and Audio-Visual working groups have expanded their projects to include selected aspects of born-digital content alongside content reformatted through digitization.

FADGI 2.0 is now reborn as the Federal Agencies Digital Guidelines Initiative. Same acronym that we’ve grown to love, same great people (now up to 20 federal agencies), now with a new logo, updated website and expanded scope. FADGI will still focus on determining performance measures for digitization and developing methods for validation, recommending methods for digitization and exploring sustainable digital formats for still image and audiovisual material. But we’ll add some new ingredients to the mix, including recommending methods for creating and maintaining sustainable born-digital material. One example of this revised scope is the Creating and Archiving Born Digital Video project, which includes high-level recommended practices (PDF) for file creators.

More good news on the FADGI front is that our published guidelines will now carry the CC0 1.0 Universal license to declare unambiguously that the work is available for worldwide use and reuse. Because FADGI work is the product of US federal government personnel in the scope of their employment and therefore is not subject to copyright in the United States (17 U.S.C. §105), FADGI’s work products have always been in the public domain. The inclusion of the CC0 1.0 Universal license clarifies these statements for both US and international users of the FADGI guidelines.

All United States federal agencies and institutions involved in the creation or collection of digitized or born-digital content of a cultural, historical or archival nature are welcome to participate in FADGI. Please join us as we look forward to our next chapter and our next 10 years!

Terry Reese: MarcEdit MacOS Updates

planet code4lib - Mon, 2017-02-06 06:15

This past weekend, I spent a good deal of time getting the MacOS version of MarcEdit synchronized with the Windows and Linux builds.  In addition to the updates, there is a significant change to the program that needs to be noted as well. 

First, let’s start with the changelog.  The following changes were made in this version:

** 2.2.30
* Bug Fix: Delimited Text Translator — when receiving Unix formatted files on Windows, the program may struggle with determining new line data.  This has been corrected.
* Bug Fix: RDA Helper — when processing copyright information, there are occasions where the output can create double brackets ($c[[) — this should be corrected.
* Behavior Change: Delimited Text Translator — I’ve changed the default value from on to off as it applies to ignoring header rows. 
* Enhancement: System Info (main window) — I’ve added information related to referenced libraries to help with debugging questions.
* Bug fix/Behavior Change: Export Tab Delimited Records: Second delimiter insertion should be standardized with all regressions removed.
* New Feature: Linked Data Tools: Service Status options have been included so users can check the status of the currently profiled linked data services.
* New Feature: Preferences/Networked Tasks: MarcEdit uses a short timeout (0.03 seconds) when determining if a network is available.  I’ve had reports of folks using MarcEdit and having their network dropped from MarcEdit.  This is likely because their network has more latency.  In the preferences, you can modify this value.  I would never set it above 500 milliseconds (0.5 seconds) because it will cause MarcEdit to freeze when off network, but this will give users more control over their network interactions.
* Bug Fix: Swap Field Function: The new enhancement in the swap field function added with the last update didn’t work in all cases.  This should close that gap.
* Enhancement: Export Tab Delimited Records: Added Configurable third delimiter.
* Enhancement: MarcEditor: Improvements in the Page Counting to better support invalid formatted data.
* Enhancement: Extract/Delete MARC Records: Added file open button to make it easier to select file for batch search
* Bug Fix: Log File locking and inaccessible till closed in very specific instances.
* Enhancement: Compiling changes…For the first time, I’ve been able to compile as 64-bit, which has reduced download size.
* Bug Fix: Deduplicate Records: The program would throw an error if the dedup save file was left blank.

Application Architecture Changes

The first thing that I wanted to highlight is that the program is being built as a 64-bit application.  This is a significant change to the program.  Since the program was ported to MacOS, the program has been compiled as a 32-bit application.  This has been necessary due to some of the requirements found in the mono stack.  However, over the past year, Microsoft has become very involved in this space (primarily to make it easier to develop iOS applications on Windows via an emulator), and that has led to the ability to compile MarcEdit as a 64-bit application. 

So why do this if the 32-bit version worked?  Well, what spurred this on was a conversation that I had with the Homebrew maintainers.  It appears that they are removing the universal compilation options, which will break Z39.50 support in MarcEdit.  They suggested making my own tap (which I will likely pursue), but it got me spending time seeing what dependencies were keeping me from compiling directly to 64-bit.  It took some doing, but I believe that I’ve gotten all code that necessitated building as 32-bit out of the application, and the build is passing and working. 

I’m pointing this out because I could have missed something.  My tools for automated testing for the MacOS build are pretty non-existent.  So, if you run into a problem, please let me know.  Also, as a consequence of compiling only to 64-bit, I’ve been able to reduce the size of the download significantly because I am able to reduce the number of dependencies that I needed to link to.  This download should be roughly 38 MB smaller than previous versions.

Downloading the Update

You can download the update using the automated download prompt in MarcEdit or by going to the downloads page at:


Terry Reese: MarcEdit Windows/Linux Updates

planet code4lib - Mon, 2017-02-06 06:02

This weekend, I worked on a couple of updates related to MarcEdit.  The updates applicable to the Windows and Linux builds are the following:

* Enhancement: Export Tab Delimited Records: Added Configurable third delimiter.
* Enhancement: MarcEditor: Improvements in the Page Counting to better support invalid formatted data.
* Enhancement: Extract/Delete MARC Records: Added file open button to make it easier to select file for batch search
* Update: Field Count: The record count of the field count can be off if formatting is wrong.  I’ve made this better.
* Update: Extract Selected Records: Added an option to sort checked items to the top.
* Bug Fix: Log File locking and inaccessible till closed in very specific instances.

The downloads can be picked up via the automatic downloader or via the downloads page at:


Jason Ronallo: Choosing a Path Forward for IIIF Audio and Video

planet code4lib - Sun, 2017-02-05 02:47

IIIF is working to bring AV resources into IIIF. I have been thinking about how to bring to AV resources the same benefits we have enjoyed for the IIIF Image and Presentation APIs. The initial intention of IIIF, especially with the IIIF Image API, was to meet a few different goals to fill gaps in what the web already provided for images. I want to consider how video works on the web and what gaps still need to be filled for audio and video.

This is a draft and as I consider the issues more I will make changes to better reflect my current thinking.

See updates at the end of this post.


When images were specified for the web the image formats were not chosen, created, or modified with the intention of displaying and exploring huge multi-gigabit images. Yet we have high resolution images that users would find useful to have in all their detail. So the first goal was to improve performance of delivering high resolution images. The optimization that would work for viewing large high resolution images was already available; it was just done in multiple different ways. Tiling large images is the workaround that has been developed to improve the performance of accessing large high resolution images. If image formats and/or the web had already provided a solution for this challenge, tiling would not have been necessary. When IIIF was being developed there were already tiling image servers available. The need remained to create standardized access to the tiles to aid in interoperability. IIIF accomplished standardizing the performance optimization of tiling image servers. The same functionality that enables tiling can also be used to get regions of an image and manipulate them for other purposes. In order to improve performance, smaller derivatives can be delivered for use as thumbnails on a search results page.

The other goal for the IIIF Image API was to improve the sharing of image resources across institutions. The situation before was both too disjointed for consumers of images and too complex for those implementing image servers. IIIF smoothed the path for both. Before IIIF there was not just one way of creating and delivering tiles, and so trying to retrieve image tiles from multiple different institutions could require making requests to multiple different kinds of APIs. IIIF solves this issue by providing access to technical information about an image through an info.json document. That information can then be used in a standardized way to extract regions from an image and manipulate them. The information document delivers the technical properties necessary for a client to create the URLs needed to request the given sizes of whole images and tiles from parts of an image. Having this standard accepted by many image servers has meant that institutions can have their choice of image servers based on local needs and infrastructure while continuing to interoperate for various image viewers.
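To make the role of the information document concrete, here is a minimal sketch of how a client might turn the width, height and tile size reported by an info.json into tile request URLs. It assumes the Image API 2.x URL pattern of identifier/region/size/rotation/quality.format; the server URL, the image dimensions and the `tile_urls` helper are hypothetical and only for illustration.

```python
import math

def tile_urls(base_url, width, height, tile_size, scale_factor):
    """Yield Image API 2.x style URLs for every tile at one scale factor."""
    # Number of full-resolution pixels covered by one tile at this scale.
    step = tile_size * scale_factor
    for y in range(0, height, step):
        for x in range(0, width, step):
            # Clip the region at the right and bottom edges of the image.
            region_w = min(step, width - x)
            region_h = min(step, height - y)
            # Width of the delivered tile after downscaling.
            out_w = math.ceil(region_w / scale_factor)
            region = f"{x},{y},{region_w},{region_h}"
            yield f"{base_url}/{region}/{out_w},/0/default.jpg"

# Hypothetical values of the kind an info.json document reports.
info = {
    "@id": "https://example.org/iiif/page1",
    "width": 2048,
    "height": 1536,
    "tiles": [{"width": 1024, "scaleFactors": [1, 2]}],
}

urls = list(tile_urls(info["@id"], info["width"], info["height"],
                      info["tiles"][0]["width"], 1))
for url in urls:
    print(url)
```

Because every compliant server accepts the same URL shape, a viewer written this way does not need to know which image server an institution runs.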

So it seems as if the main challenges the IIIF Image API was trying to solve were about performance and sharing. The web platform had not already provided solutions so they needed to be developed. IIIF standardized the pre-existing performance optimization pattern of image tiling. Through publishing information about available images in a standardized way it also improved the ability to share images across institutions.

What other general challenges was the IIIF Image API trying to solve?

Video and Audio

The challenges of performance and sharing are the ones I will take up below with regard to AV resources. How do audio and video currently work on the web? What are the gaps that still need to be filled? Are there performance problems that need to be solved? Are there challenges to sharing audio and video that could be addressed?

AV Performance

The web did not gain native support for audio and video until later in its history. For a long time the primary ways to deliver audio and video on the web used Flash. By the time video and audio did become native to the web many of the performance considerations of media formats already had standard solutions. Video formats have such advanced lossy compression that they can sometimes even be smaller than an image of the same content. (Here is an example of a screenshot as a lossless PNG being much larger than a video of the same page including additional content.) Tweaks to the frequency of full frames in the stream and the bitrate for the video and audio can further help improve performance. A lot of thought has been put into creating AV formats with an eye towards improving file size while maintaining quality. Video publishers also have multiple options for how they encode AV in order to strike the right balance for their content between compression and quality.

Progressive Download

In addition video and audio formats are designed to allow for progressive download. The whole media file does not need to be downloaded before part of the media can begin playing. Only the beginning of the media file needs to be downloaded before a client can get the necessary metadata to begin playing the video in small chunks. The client can also quickly seek into the media to play from any arbitrary point in time without downloading the portions of the video that have come before or after. Segments of the media can be buffered to allow for smooth playback. Requests for these chunks of media can be done with a regular HTTP web server like Apache or Nginx using byte range requests. The web server just needs minimal configuration to allow for byte range requests that can deliver just the partial chunk of bytes within the requested range. Progressive download means that a media file does not have to be pre-segmented–it can remain a single whole file–and yet it can behave as if it has been segmented in advance. Progressive download effectively solves many of the issues with the performance of the delivery of very long media files that might be quite large in size. Media files are already structured in such a way that this functionality of progressive download is available for the web. Progressive download is a performance optimization similar to image tiling. Since these media formats and HTTP already effectively solve the issue of quick playback of media without downloading the whole media file, there is no need for IIIF to look for further optimizations for these media types. Additionally there is no need for special media servers to get the benefits of the improved performance.
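As a rough illustration of what a byte range request amounts to, the sketch below simulates in plain Python what a range-aware web server does with a `Range` header. It is not how Apache or Nginx are implemented; `serve_range` is a hypothetical helper, and it only handles the `bytes=start-end` and `bytes=start-` forms, not suffix ranges or multiple ranges.

```python
import re

def serve_range(data: bytes, range_header: str):
    """Return (status, headers, body) roughly as a range-aware server would."""
    match = re.fullmatch(r"bytes=(\d+)-(\d*)", range_header.strip())
    if not match:
        # Unsupported or malformed header: fall back to the whole file.
        return 200, {}, data
    start = int(match.group(1))
    if match.group(2) and int(match.group(2)) < start:
        # An inverted range is invalid, so it is ignored.
        return 200, {}, data
    end = int(match.group(2)) if match.group(2) else len(data) - 1
    end = min(end, len(data) - 1)
    if start > end:
        # The requested range starts beyond the end of the file.
        return 416, {"Content-Range": f"bytes */{len(data)}"}, b""
    headers = {"Content-Range": f"bytes {start}-{end}/{len(data)}"}
    return 206, headers, data[start:end + 1]

# A stand-in for a media file: 100 bytes with values 0..99.
media = bytes(range(100))
status, headers, body = serve_range(media, "bytes=10-19")
print(status, headers["Content-Range"], len(body))
```

A player seeking into the middle of a long file issues requests of this kind, which is why the file can stay whole on disk and still behave as if it had been segmented.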

Quality of Service

While progressive download solves many of the issues with delivery of AV on the web based on how the media files are constructed, it is a partial solution. The internet does not provide assurances on quality of service. A mobile device at the edge of the range of a tower will have more latency in requesting each chunk of content than a wired connection at a large research university. Even over the same stable network the time it takes for a segment of media to be returned can fluctuate based on network conditions. This variability can lead to media playback stuttering or stalling while retrieving the next segment or taking too much time to buffer enough content to achieve smooth playback. There are a couple different solutions to this that have been developed.

With only progressive download at your disposal one solution is to allow the user to manually select a rendition to play back. The same media content is delivered as several separate files at different resolutions and/or bitrates. Lower resolutions and bitrates mean that the segments will be smaller in size and faster to deliver. The media player is given a list of these different renditions with labels and then provides a control for the user to choose the version they prefer. The user can then select whether they want to watch a repeatedly stalling, but high quality, video or would rather watch a lower resolution video playing back smoothly. Many sites implement this pattern as a relatively simple way to take into account that different users will have different network qualities. The problem I have found with this solution for progressive download video is that I am often not the best judge of network conditions. I have to fiddle with the setting until I get it right if I ever do. I can set it higher than it can play back smoothly or select a much lower quality than what my current network could actually handle. I have also found sites that set my initial quality level much lower than my network connection can handle which results in a lesser experience until I make the change to a higher resolution version. That it takes me doing the switching is annoying and distracting from the content.

Adaptive Bitrate Formats

To improve the quality of the experience while providing the highest quality rendition of the media content that the network can handle, other delivery mechanisms were developed. I will cover in general terms a couple I am familiar with, that have the largest market share, and that were designed for delivery over HTTP. For these formats the client measures network conditions and delivers the highest quality version that will lead to smooth playback. The client monitors how long it takes to download each segment as well as the duration of the current buffer. (Sometimes the client also measures the size of the video player in order to select an appropriate resolution rendition.) The client can then adapt on the fly to network conditions to play the video back smoothly without user intervention. This is why it is called “smooth streaming” in some products.
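The selection logic a client runs can be sketched in a few lines. This is a simplified illustration, not the algorithm of any particular player: the rendition ladder, the `safety` margin of 0.8 and both function names are assumptions made for the example, and real players also take the buffer duration into account.

```python
def throughput_bps(segment_bytes, download_seconds):
    """Estimate network throughput in bits per second from one segment download."""
    return segment_bytes * 8 / download_seconds

def pick_rendition(renditions, measured_bps, safety=0.8):
    """Pick the highest-bandwidth rendition that fits within the measured throughput."""
    # Leave headroom so small fluctuations do not immediately cause stalls.
    budget = measured_bps * safety
    affordable = [r for r in renditions if r["bandwidth"] <= budget]
    if affordable:
        return max(affordable, key=lambda r: r["bandwidth"])
    # Nothing fits: fall back to the lowest rendition rather than stalling.
    return min(renditions, key=lambda r: r["bandwidth"])

# Hypothetical ladder of renditions with their declared bandwidth in bits per second.
ladder = [
    {"name": "360p", "bandwidth": 800_000},
    {"name": "720p", "bandwidth": 2_500_000},
    {"name": "1080p", "bandwidth": 5_000_000},
]

# The last segment was 1,000,000 bytes and took 2 seconds to download.
measured = throughput_bps(1_000_000, 2.0)
choice = pick_rendition(ladder, measured)
print(choice["name"])
```

Running this check after every segment is what lets the client move up or down the ladder without the user touching a quality control.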

For adaptive bitrate formats like HLS and MPEG-DASH what gets initially delivered is a manifest of the available renditions/adaptations of the media. These manifests contain pointers for where (which URL) to find the media. These could be whole media files for byte range requests, media file segments as separate files, or even in the case of HLS a further manifest/playlist file for each rendition/stream. While the media is often referred to in a manifest with relative URLs, it is possible to serve the manifest from one server and the media files (or further manifests) from a different server like a CDN.

How the media files are encoded is important for the success of this approach. For these formats the different representations can be pre-segmented into the same duration lengths for each segment across all representations. In a similar way they can also be carefully generated single files that have full frames relatively close together within a file and all have these full frames synchronized between all the renditions of the media. For instance all segments could be six seconds with an I-frame every 2 seconds. This careful alignment of segments allows for switching between representations without having glitchy moments where the video stalls, without the video replaying or skipping ahead a moment, and with the audio staying synchronized with the video.
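The alignment requirement can be checked mechanically. The sketch below, under the simplifying assumption that full frames fall on whole seconds, lists the segment boundaries at which every rendition has a full frame and a switch is therefore clean. The function names and the 18 second example are hypothetical.

```python
def keyframe_times(total_seconds, interval):
    """Timestamps (whole seconds) at which a rendition has a full frame."""
    return set(range(0, total_seconds, interval))

def safe_switch_points(renditions, segment_seconds, total_seconds):
    """Segment boundaries where every rendition starts with a full frame."""
    boundaries = range(0, total_seconds, segment_seconds)
    return [b for b in boundaries
            if all(b in keyframes for keyframes in renditions.values())]

# Both renditions have a full frame every 2 seconds: every 6 second boundary works.
aligned = {"360p": keyframe_times(18, 2), "720p": keyframe_times(18, 2)}
# One rendition was encoded with a full frame every 4 seconds instead.
misaligned = {"360p": keyframe_times(18, 2), "720p": keyframe_times(18, 4)}

print(safe_switch_points(aligned, 6, 18))
print(safe_switch_points(misaligned, 6, 18))
```

In the misaligned case the boundary at 6 seconds drops out, which is the kind of gap that shows up to a viewer as a stall or a skipped moment when the client switches.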

It is also possible in the case of video to have one or more audio streams separate from the video streams. Separate audio streams aligned with the video representations will have small download sizes for each segment which can allow a client to decide to continue to play the audio smoothly even if the video is temporarily stalled or reduced in quality. One use case for this audio stream performance optimization is the delivery of alternative language tracks as separate audio streams. The video and audio bitrates can be controlled by the client independently.

In order for adaptive formats like this to work all of the representations need to have the next required segment ready on the server in case the client decides to switch up or down bitrates. While cultural heritage use cases that IIIF considers do not include live streaming broadcasts, the number of representations that all need to be encoded and available at the same time affects the “live edge”–how close to real-time the stream can get. If segments are available in only one high bitrate rendition then the client may not be able to keep up with a live broadcast. If all the segments are not available for immediate delivery then it can lead to playback issues.

The manifests for adaptive bitrate formats also include other helpful technical information about the media. (For HLS the manifest is called a master playlist and for MPEG-DASH a Media Presentation Description.) Included in these manifests can be the duration of the media, the maximum/minimum height and width of the representations, the mimetype and codecs (including MP4 level) of the video and audio, the framerate or sampling rate, and lots more. Most importantly for quality of experience switching, each representation includes a number for its bandwidth. There are cases where content providers will deliver two video representations with the same height and width and different bitrates to switch between. In these cases it is a better experience for the user to maintain the resolution and switch down a bandwidth than to switch both resolution and bandwidth. The number of representations–the ladder of different bandwidth encodes–can be quite extensive for advanced cases like Netflix over-the-top (OTT aka internet) content delivery. These adaptive bitrate solutions are meant to scale for high demand use cases. The manifests can even include information about sidecar or segmented subtitles and closed captions. (One issue with adaptive formats is that they may not play back across all devices, so many implementations will still provide progressive download versions as a fallback.) Manifests for adaptive formats include the kind of technical information that is useful for clients.
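To show the kind of technical information a client can read from such a manifest, here is a minimal sketch that parses the variant entries of an HLS master playlist. The playlist text is a made-up sample, `parse_master_playlist` is a hypothetical helper, and only the `BANDWIDTH`, `RESOLUTION` and `CODECS` attributes of `#EXT-X-STREAM-INF` are read; a real client would use a full HLS parser.

```python
import re

# Attribute values are either quoted strings (which may contain commas) or bare tokens.
ATTR = re.compile(r'([A-Z0-9-]+)=("[^"]*"|[^,]*)')

def parse_master_playlist(text):
    """Return one dict per variant stream listed in an HLS master playlist."""
    variants = []
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    for i, line in enumerate(lines):
        if line.startswith("#EXT-X-STREAM-INF:"):
            attrs = {key: value.strip('"')
                     for key, value in ATTR.findall(line.split(":", 1)[1])}
            variants.append({
                "bandwidth": int(attrs["BANDWIDTH"]),
                "resolution": attrs.get("RESOLUTION"),
                "codecs": attrs.get("CODECS"),
                # The URI of the rendition is on the line after the tag.
                "uri": lines[i + 1],
            })
    return variants

# Made-up sample playlist with two renditions.
SAMPLE = """#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=800000,RESOLUTION=640x360,CODECS="avc1.4d401e,mp4a.40.2"
360p/index.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=2500000,RESOLUTION=1280x720,CODECS="avc1.4d401f,mp4a.40.2"
720p/index.m3u8
"""

variants = parse_master_playlist(SAMPLE)
for variant in variants:
    print(variant["bandwidth"], variant["resolution"], variant["uri"])
```

The bandwidth number on each entry is what the client compares against its measured throughput when deciding which rendition to request next.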

Because there are existing standards for the adaptive bitrate pattern that have broad industry and client support, there is no need to attempt to recreate these formats.

AV Performance Solved

All except the most advanced video on demand challenges have current solutions through ubiquitous video formats and adaptive bitrate streaming. As new formats like VP9 increase in adoption the situation for performance will improve even further. These formats have bitrate savings through more advanced encoding that greatly reduces file sizes while maintaining quality. This will mean that adaptive bitrate formats are likely to require fewer renditions than are typically published currently. Note though that in some cases smaller file sizes and faster decoding come at the expense of much slower encoding when trying to keep a good quality level.

There is no need for the cultural heritage community to try to solve performance challenges when the expert AV community and industry have developed advanced solutions.

Parameterized URLs and Performance

One of the proposals for providing a IIIF AV API alongside the Image API involves mirroring the existing Image API by providing parameters for segmenting and transforming of media. I will call this the “parameterized approach.” One way of representing this approach is this URL:


You can see more about this type of proposal here and here. The parameters after the identifier and before the quality would all be used to transform the media.

For the Image API the parameterized approach for retrieving tiles and other derivatives of an image works as an effective performance optimization for delivery. In the case of AV having these parameters does not improve performance. It is already possible to seek into progressive download and adaptive bitrate formats. There is not the same need to tile or zoom into a video as there is for a high definition image. A good consumer monitor will show you as full a resolution as you can get out of most video.

And these parameters do not actually solve the most pressing media delivery performance problems. The parameterized approach probably is not optimizing for bitrate which is one of the most important settings to improve performance. Having a bitrate parameter within a URL would be difficult to implement well. Bitrate could significantly increase the size of the media or increase visible artifacts in the video or audio beyond usability. Would the audio and video bitrates be controlled separately in the parameterized approach? Bitrate is a crucially important parameter for performance and not one I think you would put into the hands of consumers. It will be especially difficult as bitrate optimization for video on demand is slow and getting more complicated. In order to optimize variable bitrate encoding 2-pass encoding is used and slower encoding settings can further improve quality. With new formats with better performance for delivery, bitrate is reduced for the same quality while encoding is much slower. Advanced encoding pipelines have been developed that perform metrics on perceptual difference so that each video or even section of a video can be encoded at the lowest bitrate that still maintains the desired quality level. Bitrate is where performance gains can be made.

The only functionality proposed for IIIF AV that I have seen that might be helped by the parameterized approach is download of a time segment of the video. This is specific to download of just that time segment. Is this use case big enough to be seriously considered for the amount of complexity it adds? Why is download of a time segment crucial? Why would most cases not be met with just skipping to that section to play? Or can the need be met with downloading the whole video in those cases where download is really necessary? If needed any kind of time segment download use case could live as a separate non-IIIF service. Then it would not have any expectation of being real-time. I doubt most would really see the need to implement a download service like this if the need can be met some other way. In those cases where real-time performance to a user does not matter those video manipulations could be done outside of IIIF. For any workflow that needs to use just a portion of a video the manipulation could be a pre-processing step. In any case if there is really the desire for a video transformation service it does not have to be the IIIF AV API but could be a separate service for those who need it.

Most of the performance challenges with AV have already been solved via progressive download formats and adaptive bitrate streaming. Remaining challenges not fully solved with progressive download and adaptive bitrate formats include live video, server-side control of quality of service adaptations, and greater compression in new codecs. None of these are the types of performance issues the cultural heritage sector ought to try to take on, and the parameterized approach does not contribute solutions to these remaining issues. Beyond these rather advanced issues, performance is a solved problem that has had a lot of eyes on it.

If the parameterized approach is not meant to help with optimizing performance, what problem is it trying to solve? The community would be better off steering clear of this trap of trying to optimize for performance and instead focusing on problems that still need to be solved. The parameterized approach is sticking with a performance optimization pattern that does not add anything for AV. It has a detrimental fixation on the bitstream that does not work for AV, especially where segmented adaptive bitrate formats are concerned. It appears motivated by some kind of purity of approach rather than by taking into account the unique attributes of AV and solving these particular challenges well.

AV Sharing

The other challenge a standard can help with is sharing of AV across institutions. If the parameterized approach does not solve a performance problem, then what about sharing? If we want to optimize for sharing and have the greatest number of institutions sharing their AV resources, then there is still no clear benefit for the parameterized approach. What about this parameterized approach aids in sharing? It seems to optimize for performance, which as we have seen above is not needed, at the expense of the real need to improve and simplify sharing. There are many unique challenges for sharing video across institutions on the web that ought to be considered before settling on a solution.

One of the big barriers to sharing is the complexity of AV. Compared to delivery of still images video is much more complicated. I have talked to a few institutions that have digitized video and have none of it online yet because of the hurdles. Some of the complication is technical, and because of this institutions are quicker to use easily available systems just to get something done. As a result many fewer institutions will have as much control over AV as they have over images. It will be much more difficult to gain that kind of control. For instance with some media servers they may not have a lot of control over how the video is served or the URL for a media file.

Video is expensive. Even large libraries often make choices about technology and hosting for video based on campus providing the storage for it. Organizations should be able to make the choices that work for their budget while still being able to share in as much as they desire and is possible.

One argument made is that many institutions had images they were delivering in a variety of formats before the IIIF Image API, so asking for similar changes to how AV is delivered should not be a barrier to pursuing a particular technical direction. The difficulty institutions have in dealing with AV cannot be minimized in this way, as any kind of change will be much greater and will ask much more of them. The complexity and costs of AV, and the choices they force, should be taken into consideration.

An important question to ask is who you want to help by standardizing an API for sharing? Is it only for the well-resourced institutions who self-host video and have the technical expertise? If it is required that resources live in a particular location and only certain formats be used it will lead to fewer institutions gaining the sharing benefits of the API because of the significant barriers to entry. If the desire is to enable wide sharing of AV resources across as many institutions as possible, then that ought to lead to a different consideration of the issues of complexity and cost.

One issue that has plagued HTML5 video from the beginning is the inability of the browser vendors to agree on formats and codecs. Early on open formats like WebM with VP8 were not adopted by some browsers in favor of MP4 with H.264. It became common practice out of necessity to encode each video in a variety of formats in order to reach a broad audience. Each source would be listed on the page (on a source element within a video element) and the browser picks which it can play. HTML5 media was standardized to use a pattern to accommodate the situation where it was not possible to deliver a single format that could be played across all browsers. It is only recently that MP4 with H.264 has been able to be played across all current browsers. Only after Cisco open sourced its licensed version of H.264 was this possible. Note while the licensing situation for playback has been improved there are still patent/licensing issues which mean that some institutions still will not create or deliver any MP4 with H.264.

But now even as H.264 can be played across all current browsers, there are still changes coming that mean a variety of formats will be present in the wild. New codecs like VP9 that provide much better compression are taking off and have been adopted by most, but not all, modern browsers. The advantages of VP9 are that it reduces file size such that storage and bandwidth costs can be reduced significantly. Encoding time is increased while performance is improved. And still other new, open formats like AV1 using the latest technologies are being developed. Even audio is seeing some change as Firefox and Chrome are implementing FLAC which will make it an option to use a lossless codec for audio delivery.

As the landscape for codecs continues to change the decision on which formats to provide should be given to each institution. Some will want to continue to use a familiar H.264 encoding pipeline. Others will want to take advantage of the cost savings of new formats and migrate. There ought to be allowance for each institution to pick which formats best meet their needs. Since sources in HTML5 media can be listed in order of preference, in as much as is possible a standard ought to support the ability of a client to respect the preferences of the institution for these reasons. So if WebM VP9 is the first source and the browser can play that format it should play it even if an MP4 H.264 is available which it can also play. The institution may make decisions around the quality to provide for each format to optimize for their particular content and intended uses.
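
As a rough sketch of what respecting the institution's preference could mean for a client, the Python below walks the sources in their published order and takes the first one the client reports it can play. The source list, the `pick_source` function, and the two capability checks are all hypothetical stand-ins; in a browser the check would be something like the media element's own playability test.

```python
from typing import Callable, Mapping, Optional, Sequence

# Hypothetical source list, in the institution's preferred order.
SOURCES = [
    {"format": "webm", "type": 'video/webm; codecs="vp9,opus"'},
    {"format": "mp4", "type": 'video/mp4; codecs="avc1.42E01E,mp4a.40.2"'},
]


def pick_source(
    sources: Sequence[Mapping[str, str]],
    can_play: Callable[[str], bool],
) -> Optional[Mapping[str, str]]:
    """Return the first playable source, honoring the published order."""
    for source in sources:
        if can_play(source["type"]):
            return source
    return None


def plays_everything(mime: str) -> bool:
    # Stand-in for a browser that supports both formats.
    return True


def h264_only(mime: str) -> bool:
    # Stand-in for a browser that only supports MP4 with H.264.
    return mime.startswith("video/mp4")


chosen_modern = pick_source(SOURCES, plays_everything)
chosen_legacy = pick_source(SOURCES, h264_only)
print(chosen_modern["format"], chosen_legacy["format"])
```

The point of the sketch is only that order carries meaning: a client that can play both formats still ends up with the institution's first choice.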

Then there is the choice to implement adaptive bitrate streaming. Again institutions could decide to implement these formats for a variety of reasons. Delivering the appropriate adaptation for the situation has benefits beyond just enabling smooth playback. By delivering only the segment size a client can use based on network conditions and sometimes player size, the segments can be much smaller lowering bandwidth costs. The institution can make a decision depending on their implementation and use patterns whether their costs are more with storage or bandwidth and use the formats that work best for them. It can also be a courtesy to mobile users to deliver smaller segment sizes. Then there are delivery platforms where an adaptive bitrate format is required. Apple requires iOS applications to deliver HLS for any video over ten minutes long. Any of these types of considerations might nudge an AV provider to use ABR formats. They add complexity but also come with attractive performance benefits.
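
To make the idea of picking an adaptation concrete, here is a deliberately simplified sketch. The rendition ladder, the `choose_rendition` function, and the 0.8 safety factor are assumptions for illustration; real players use more elaborate heuristics (buffer levels, throughput history) than this.

```python
# Hypothetical rendition ladder; bitrates are in bits per second.
RENDITIONS = [
    {"height": 240, "bitrate": 400_000},
    {"height": 480, "bitrate": 1_000_000},
    {"height": 720, "bitrate": 2_500_000},
    {"height": 1080, "bitrate": 5_000_000},
]


def choose_rendition(renditions, bandwidth_bps, player_height, safety=0.8):
    """Pick the highest-bitrate rendition that fits the bandwidth budget
    and is no taller than the player; fall back to the lowest bitrate."""
    budget = bandwidth_bps * safety
    ordered = sorted(renditions, key=lambda r: r["bitrate"])
    fitting = [
        r for r in ordered
        if r["bitrate"] <= budget and r["height"] <= player_height
    ]
    return fitting[-1] if fitting else ordered[0]


on_wifi = choose_rendition(RENDITIONS, 3_000_000, 1080)
small_player = choose_rendition(RENDITIONS, 10_000_000, 480)
poor_network = choose_rendition(RENDITIONS, 100_000, 1080)
print(on_wifi["height"], small_player["height"], poor_network["height"])
```

Capping by player height is what keeps a small embedded player from pulling the largest segments even on a fast connection, which is where the bandwidth savings come from.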

Any solution for an API for AV media should not try to pick winners among codecs or formats. The choice should be left to the institution while still allowing them to share the media in these formats with other institutions. It should allow for sharing AV in whatever formats an institution chooses. An approach which restricts which codecs and formats can be shared does harm and closes off important considerations for publishers. Asking them to deliver too many duplicate versions will also mean forcing certain costs. Will this variety of codecs allow for complete interoperability from every institution to every other institution and user? Probably not, but the tendency will be for institutions to do what is needed to support a broad range of browsers while optimizing for their particular needs. Guidelines and evolving best practices can also be part of any community built around the API. A standard for AV sharing should not shut off options while allowing for a community of practice to develop.

Simple API

If an institution is able to deliver any of their video on the web, then that is an accomplishment. What could be provided to allow them to most easily share their video with other institutions? One simple approach would be for them to create a URL where they can publish information about the video. Some JSON with just enough technical information could map to the properties an HTML5 video player uses. Since it is still the case that many institutions are publishing multiple versions of each video in order to cover the variety of new and old browsers and mobile devices, it could include a list of these different video sources in a preferred order. Preference could be given to an adaptive bitrate format or newer, more efficient codec like VP9 with an MP4 fallback further down the list. Since each video source listed includes a URL to the media, the media file(s) could live anywhere. Hybrid delivery mechanisms are even possible where different servers are used for different formats or the media are hosted on different domains or use CDNs.

This ability to just list a URL to the media would mean that as institutions move to cloud hosting or migrate to a new video server, they only need to change a little bit of information in a JSON file. This greatly simplifies the kind of technical infrastructure that is needed to support the basics of video sharing. The JSON information file could be a static file. No need even for redirects for the video files since they can live wherever and change location over time.
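
A minimal sketch of how small that change could be, assuming each source carries its media URL in an "id" field as in the sources example in this post. The hostnames and the `move_media` helper are hypothetical.

```python
import json
from urllib.parse import urlsplit

# Hypothetical info document before a migration to a new host.
INFO = {
    "sources": [
        {"id": "https://old-server.example.edu/media/interview.webm", "format": "webm"},
        {"id": "https://old-server.example.edu/media/interview.mp4", "format": "mp4"},
    ]
}


def move_media(info, new_host):
    """Return a copy of the info document with every source URL pointed at new_host."""
    moved = []
    for source in info["sources"]:
        parts = urlsplit(source["id"])
        moved.append({**source, "id": parts._replace(netloc=new_host).geturl()})
    return {**info, "sources": moved}


migrated = move_media(INFO, "cdn.example.net")
# The result can be written back out as the static JSON file.
print(json.dumps(migrated, indent=2))
```

Nothing about the player or the consuming institution has to change; they keep reading the same JSON URL and simply find new media locations in it.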

Here is an example of what part of a typical response might look like where a WebM and an MP4 are published:

{
  "sources": [
    {
      "id": "",
      "format": "webm",
      "height": 480,
      "width": 720,
      "size": "3360808",
      "duration": "35.627000",
      "type": "video/webm; codecs=\"vp8,vorbis\""
    },
    {
      "id": "",
      "format": "mp4",
      "frames": "1067",
      "height": 480,
      "width": 720,
      "size": "2924836",
      "duration": "35.627000",
      "type": "video/mp4; codecs=\"avc1.42E01E,mp4a.40.2\""
    }
  ]
}

You can see an example of this “sources” approach here.
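
To show how directly a response like this could map onto an HTML5 player, here is a small sketch that turns a list of sources into a video element with one source child per entry, in the published order. The media URLs are placeholders and the `video_element` function is hypothetical; it assumes each source has "id", "type", "width", and "height" fields.

```python
from html import escape

# Hypothetical info document; the media URLs are placeholders.
INFO = {
    "sources": [
        {
            "id": "https://media.example.org/interview.webm",
            "type": 'video/webm; codecs="vp8,vorbis"',
            "width": 720,
            "height": 480,
        },
        {
            "id": "https://cdn.example.net/interview.mp4",
            "type": 'video/mp4; codecs="avc1.42E01E,mp4a.40.2"',
            "width": 720,
            "height": 480,
        },
    ]
}


def video_element(info):
    """Render a <video> element with one <source> child per listed source."""
    sources = info["sources"]
    width = int(sources[0]["width"])
    height = int(sources[0]["height"])
    lines = [f'<video controls width="{width}" height="{height}">']
    for source in sources:
        # Escape attribute values; the codecs parameter contains double quotes.
        src = escape(source["id"], quote=True)
        mime = escape(source["type"], quote=True)
        lines.append(f'  <source src="{src}" type="{mime}">')
    lines.append("</video>")
    return "\n".join(lines)


markup = video_element(INFO)
print(markup)
```

Note that the two files in the sketch sit on different hosts, which is the hybrid delivery case: the consumer never needs to know or care where each file lives.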

An approach that simply lists the sources an institution makes available for delivery ought to be easier for more institutions to adopt than other options for sharing AV. It would allow them to effectively share the whole range of audio and video they already have, no matter what technologies they are currently using. In the simplest cases there would be no need even for redirects. If you are optimizing for the widest possible sharing from the most institutions, then an approach along these lines ought to be considered.

Straight to AV in the Presentation API?

One interesting option has been proposed for IIIF to move forward with supporting AV resources. This approach is presented in What are Audio and Video Content APIs?. The mechanism is to list out media sources similar to the above approach but on a canvas within a Presentation API manifest. The pattern appears clear for how to provide a list of resources in a manifest in this way. It would not require a specific AV API that tries to optimize for the wrong concerns. The approach still has some issues that may impede sharing.

Requiring an institution to go straight to implementing the Presentation API means that nothing is provided for sharing AV resources outside of a manifest, or outside of a canvas that can be referenced separately from a Presentation manifest. Not every case of sharing and reuse requires the complexity of a Presentation manifest in order to just play back a video. There are many use cases that do not need a sequence with a canvas with media with an annotation with a body with a list of items, a whole highly nested structure, just to get to the AV sources needed to play back some media. This breaks the pattern from the Image API where it is easy and common to view an image without implementing Presentation at all. Only providing access to AV through a Presentation manifest lacks the simplicity that would allow an institution to level up over time. What is the path for an institution to incrementally adopt IIIF standards? Even if a canvas could be used as the AV API as a simplification over a manifest, requiring a dereferenceable canvas would further complicate what it takes to implement IIIF. Even some institutions that have implemented IIIF and see the value of a dereferenceable canvas have not gotten that far yet in their implementations.
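
The difference in effort for a client can be sketched roughly as follows. The key names in the nested document are hypothetical and follow only the nesting described in this post (sequence, canvas, media annotation, body, items); they are not taken from the actual Presentation API schema, and the URLs are placeholders.

```python
MEDIA = [
    {"id": "https://media.example.org/interview.webm"},
    {"id": "https://media.example.org/interview.mp4"},
]

# Hypothetical nested document: key names are illustrative only.
MANIFEST = {
    "sequences": [{
        "canvases": [{
            "media": [{
                "type": "Annotation",
                "body": {"items": MEDIA},
            }],
        }],
    }],
}

# Flat document in the style of the "sources" approach.
FLAT = {"sources": MEDIA}


def sources_from_manifest(manifest):
    """Walk every level of the nested structure to collect the media items."""
    found = []
    for sequence in manifest.get("sequences", []):
        for canvas in sequence.get("canvases", []):
            for annotation in canvas.get("media", []):
                found.extend(annotation.get("body", {}).get("items", []))
    return found


def sources_from_flat(info):
    """The flat document needs a single lookup."""
    return info["sources"]


nested_result = sources_from_manifest(MANIFEST)
flat_result = sources_from_flat(FLAT)
print(nested_result == flat_result)
```

Both functions end up with the same list; the nested version simply requires the client to understand several more layers before it can play anything.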

One of the benefits I have found with the Image API is the ability to view images without needing to have the resource described and published to the public. This allows me to check on the health of images, do cache warming to optimize delivery, and use the resources in other pre-publication workflows. I have only implemented manifests and canvases within my public interface once a resource has been published, so if AV were only accessible through a manifest I would effectively be forced to publish the resource prematurely or otherwise change the workflow. I am guessing that others have also implemented manifests in a way that is tied to their public interfaces.

Coupling of media access with a manifest has some other smaller implications. Requiring a manifest or canvas leads to unnecessary boilerplate when an institution does not have the information yet and still needs access to the resources to prepare the resource for publication. For instance a manifest and a canvas MUST have a label. Should they use “Unlabeled” in cases where this information is not available yet?

In my own case sharing with the world is often the happy result rather than the initial intention of implementing something. For instance there is value in an API that supports different kinds of internal sharing. Easy internal sharing enables us to do new things with our resources more easily regardless of whether the API is shared publicly. That internal sharing ought to be recognized as an important motivator for adopting IIIF and other standards. IIIF thus far has enabled us to more quickly develop new applications and functionality that reuse special collections image resources. Not every internal use will need or want the features found in a manifest; some just need to get the audio or video sources to play them.

If there is no IIIF AV API that optimizes for the sharing of a range of different AV formats and instead relies on manifests or canvases, then there is still a gap that could be filled. For at least local use I would want some kind of AV API in order to get the technical information I would need to embed in a manifest or canvas. This seems like it could be a common desire to decouple technical information about video resources from the fuller information needed for a manifest including attributes like labels needed for presentation with context to the public. Coupling AV access too tightly to Presentation does not help to solve the desire to decouple these technical aspects. It is a reasonable choice to consider this technical information a separate concern. And if I am already going through the work to create such an internal AV API, I would like to be able to make this API available to share my AV resources outside of a manifest or canvas.

Then there is also the issue of AV players. In the case of images many pan zoom image viewers were modified to work with the Image API. One of the attractions of delivering images via IIIF or adopting a IIIF image server is that there is choice in viewers. Is the expectation that any AV players would need to read in a Presentation manifest or canvas in order to support IIIF and play media? The complexity of the manifest and canvas documents may hinder adoption of IIIF in media players. These are rather complicated documents that take some time to understand. A simpler API than Presentation may have a better chance of being widely adopted by players and may be easier to maintain. We only have the choice of a couple featureful client side applications for presenting manifests (UniversalViewer and Mirador), but we already have many basic viewers for the Image API. Even though not all of those basic viewers are used within the likes of UniversalViewer and Mirador, the simpler viewers have still been of value for other use cases. For instance a simple image viewer can be used in a metadata management interface where UniversalViewer features like the metadata panel and download buttons are unnecessary or distracting. Would the burden of maintaining plugins and shims for various AV players to understand a manifest or canvas rest with the relatively small IIIF community rather than with the larger group of maintainers of AV players? Certainly having choice is part of the benefit of having the Image API supported in many different image viewers. Would IIIF still have the goal of being supported by a wide range of video players? This ability to have broad support within some of the foundational pieces like media players allows for better experimentation on top of it.

My own implementation of the Image API has shown how having a choice of viewers can be of great benefit. When I was implementing the IIIF APIs I wanted to improve the viewing experience for users by using a more powerful viewer. I chose UniversalViewer even though it did not have a very good mobile experience at the time. We did not want to give up the decent mobile experience we had previously developed. Moving to only using UV would have meant giving up on mobile use. So that we could still have a good mobile interface while UV was in the middle of improving its mobile view, we also implemented a Leaflet-based viewer alongside UV. We toggled each viewer on/off with CSS media queries. This level of interoperability at this lower level in the viewer allowed us to take advantage of multiple viewers while providing a better experience for our users. You can read more about this in Simple Interoperability Wins with IIIF. As AV players are uneven in their support of different features this kind of ability to swap out one player for another, say based on video source type, browser version, or other features, may be particularly useful. We have also seen new tools for tasks like cropping grow up around the Image API and it would be good to have a similar situation for AV players.

So while listing out sources within a manifest or canvas would allow for institutions with heterogeneous formats to share their distributed AV content, the lack of an API that covers these formats results in some complication, open questions, and less utility.


IIIF ought to focus on solving the right challenges for audio and video. There is no sense in trying to solve the performance challenges of AV delivery. That work has been well done already by the larger AV community and industry. The parameterized approach to an AV API does not bring significant delivery performance gains though that is the only conceivable benefit to the approach. The parameterized approach does not sufficiently help make it easier for smaller institutions to share their video. It does not provide any help at all to institutions that are trying to use current best practices like adaptive bitrate formats.

Instead IIIF should focus on achieving ubiquitous sharing of media across many types of institutions. The focus on solving the challenges with sharing media and the complexity and costs with delivering AV resources leads to meeting institutions more where they are at. A simple approach to an AV API that lists out the sources would more readily solve the challenges institutions will face with sharing.

Optimizing for sharing leads to different conclusions than optimizing for performance.


Since writing this post I’ve reconsidered some questions and modified my conclusions.

Update 2017-02-04: Canvas Revisited

Since I wrote this post I got some feedback on it, and I was convinced to try the canvas approach. I experimented with creating a canvas, and it looks more complex and nested than I would like, but it isn’t terrible to understand and create. I have a few questions I’m not sure how I’d resolve, and there are some places where there could be less ambiguity.

You can see one example in this gist.

I’d eventually like to have an image service that can return frames from the video, but for now I’ve just included a single static poster image as a thumbnail. I’m not sure how I’d provide a service like that yet, though I had prototyped something in my image server. One way to start would be to create an image service that just provides full images at the sizes offered by the various adaptations. Or could a list of poster image choices with width & height just be provided somehow? I’m not sure what an info.json would look like for non-tiled images. Are there any Image API examples out in the wild that only provide a few static images?

I’ve included a width and height for the adaptive bitrate formats, but what I really mean is the maximum height and width that’s provided for those formats. It might be useful to have those values available.

I haven’t included duration for each format, though there would be slight variations. I don’t know how the duration of the canvas would be reconciled with the duration of each individual item. Might just be close enough to not matter.

How would I also include an audio file alongside a video? Are all the items expected to be a video and the same content? Would it be alright to also add an audio file or two to the items? My use case is that I have a lot of video oral histories. Since they’re mostly talking heads some may prefer to just listen to the audio rather than play the video. How would I say that this is the audio content for the video?

I’m uncertain how with the seeAlso WebVTT captions I could say that they are captions rather than subtitles, descriptions, or chapters. Would it be possible to add a “kind” field that maps directly to an HTML5 track element attribute? Otherwise it could be ambiguous what the proper use for any particular WebVTT (or other captions format) file is.
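
As a sketch of what such a mapping might look like on the client side, the code below takes a WebVTT URL plus a “kind” value and renders the corresponding HTML5 track element. The set of kinds is the one the HTML5 track element defines; the `track_element` function, the idea of a “kind” field on the canvas, and the URL are assumptions for illustration.

```python
from html import escape

# The kind values defined for the HTML5 track element.
TRACK_KINDS = {"subtitles", "captions", "descriptions", "chapters", "metadata"}


def track_element(url, kind, srclang="en", label=None):
    """Render a <track> element, rejecting kinds HTML5 does not define."""
    if kind not in TRACK_KINDS:
        raise ValueError(f"unknown track kind: {kind!r}")
    attrs = [("kind", kind), ("src", url), ("srclang", srclang)]
    if label:
        attrs.append(("label", label))
    rendered = " ".join(
        f'{name}="{escape(value, quote=True)}"' for name, value in attrs
    )
    return f"<track {rendered}>"


captions = track_element(
    "https://media.example.org/oral-history.vtt",
    "captions",
    label="English captions",
)
print(captions)
```

With an explicit kind, a player would not have to guess whether a given WebVTT file is meant as captions, chapters, or something else.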

Several video players allow for preview thumbnails over the time rail via a metadata WebVTT file that references thumbnail sprites with media fragments. Is there any way to expose this kind of metadata file on a canvas to where it is clear what the intended use of the metadata file is? Is this a service?
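
For readers unfamiliar with this kind of metadata file, each cue's payload is typically a sprite image URL with a spatial media fragment such as `#xywh=160,0,160,90`. The sketch below parses one such payload into the sprite URL and the pixel region; the `parse_sprite_reference` function and the file names are hypothetical, and it handles only the plain pixel form of the fragment.

```python
from urllib.parse import urlsplit


def parse_sprite_reference(payload):
    """Split a cue payload like 'sprites.jpg#xywh=160,0,160,90' into
    the sprite URL and an (x, y, width, height) tuple in pixels."""
    parts = urlsplit(payload.strip())
    prefix = "xywh="
    if not parts.fragment.startswith(prefix):
        raise ValueError("payload has no xywh media fragment")
    x, y, w, h = (int(n) for n in parts.fragment[len(prefix):].split(","))
    return parts._replace(fragment="").geturl(), (x, y, w, h)


relative = parse_sprite_reference("sprites.jpg#xywh=160,0,160,90")
absolute = parse_sprite_reference(
    "https://media.example.org/sprites.jpg#xywh=0,90,160,90"
)
print(relative, absolute)
```

A player uses the region to crop the sprite sheet and show the matching thumbnail as the user hovers over the time rail, which is why the intended use of the file needs to be unambiguous.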

