Planet Code4Lib

LITA: Call for Proposals, LITA @ ALA Annual 2018

Tue, 2017-07-18 15:54

Submit Your Program ideas for the 2018 ALA Annual Conference 

New Orleans LA, June 21-26, 2018

The LITA Program Planning Committee (PPC) is now encouraging the submission of innovative and creative proposals for the 2018 American Library Association Annual Conference. We’re looking for 60-minute conference presentations. The focus should be on technology in libraries, whether that’s use of, new ideas for, trends in, or interesting/innovative projects being explored – it’s all for you to propose. Programs should be of interest to all library/information agency types, inspire technological change and adoption, and/or generally go above and beyond the everyday.

  • Submission Deadline: August 25, 2017
  • Final Decisions: September 29, 2017
  • Schedule of Sessions Announced: November 8, 2017

For the first time, proposals will be accepted via one submission site for all ALA Divisions, RoundTables, Committees and Offices. This link to the submission site will redirect to the ALA log-in page. All submitters are required to have an ALA profile, but are not required to be ALA members.

Help and details on making a successful submission are on the LITA Forms web site.

We regularly receive many more proposals than we can program into the slots available to LITA at the ALA Annual Conference. These great ideas and programs all come from contributions like yours. Submissions are open to anyone, regardless of ALA membership status. We welcome proposals from anyone who feels they have something to offer regarding library technology. We look forward to hearing the great ideas you will share with us this year.

Questions or Comments?

Contact LITA at (312) 280-4268 or Mark Beatty.





Terry Reese: MarcEdit 7: Preferences Wireframes and Ease of Use Features

Tue, 2017-07-18 06:16

This post relates to the previous posts:

  1. MarcEdit 7 visual styles: High Contrast
  2. MarcEdit 7: Accessibility Options Part 2

I’m continuing to flesh out new wireframes, and one of the areas where I’ll be consolidating some options is in the preferences window.  I’ve decided to reorganize the menu and some of the settings.  Additionally, I’m adding a new setting: Ease of Access. 

Here are the initial wireframes demonstrating the new menu layout:

Ease of Use:

This is a new section developed to support Accessibility options.  At this point, these are the options that I’m working on:

While MarcEdit will respect the operating system’s accessibility settings (e.g., if you’ve scaled fonts), these settings directly affect the MarcEdit application.  In this section, you’ll find the themes (and I’m working out a way to provide a wizardry way to create themes and find ones that have been created), feedback options (right now, if this is selected, you’ll get audible clicks letting you know that an action has occurred), and keyboard options.  I’m spending a lot of time mapping the current keyboard options, with the intention of mapping all actions to some keyboard combination.  These settings tell MarcEdit whether this information should show up in the tooltips, as well as in rich descriptions about an operation.  The last thing that I’ll likely add is a set of links to topics for users looking for accessibility-friendly fonts, etc. 

I think that the reorganization should help to provide some clarity in the settings and will help me in thinking about the first run wizard – and hopefully the currently planned accessibility options will provide users with a wider range of options. 

Questions or comments? Let me know.


Library of Congress: The Signal: Collections as Data: IMPACT

Mon, 2017-07-17 21:00

If you are in the Washington, DC area next week (or can be), please be our guest at a very special day-long event hosted by The Library of Congress National Digital Initiatives. “Collections as Data: Impact” will be held 9:30 a.m. to 5 p.m. on Tuesday, July 25, in the Coolidge Auditorium on the first floor of the Thomas Jefferson Building.

The event is free, but tickets are required to attend in person.  The event will also be livestreamed on the Library’s Facebook page and on its YouTube site (with captions).

We will be recording the talks and creating stand-alone videos that we hope are shared widely and help to explain what we mean when we talk about the transformational opportunities of using library collections as data.

“The Library of Congress and other libraries have been serving digital collections online for over a decade,” said NDI’s chief Kate Zwaard. “With modern computing power and the emergence of data-analysis tools, our collections can be explored more deeply and reveal more connections. By unleashing computation on the world’s biggest digital library, the knowledge and creativity contained in libraries become even more relevant. At this event we’re showcasing true leaders in the field of using digital collections and technology to advance collective understanding. We’re so excited to hear their stories and share them with our community.”

Ed Ayers

Among the symposium’s keynote speakers is Edward Ayers, the University of Richmond’s President Emeritus and Tucker-Boatwright Professor of the Humanities. President Barack Obama awarded him the National Humanities Medal in 2013 for his dedication to public history. He is a pioneer in digital scholarship and is currently co-host of the BackStory podcast. His talk is titled “History Between the Lines: Thinking about Collections as Data.”

Paul Ford

Another featured speaker is Paul Ford, a journalist, programmer and co-founder of Postlight, a digital product studio in New York City. He is the author of a breakthrough piece, “What is Code,” revealing how computers, applications and software work. He will discuss “Unscroll: An Approach to Making New Things From Old Things.”

Other speakers include:

  • Tahir Hemphill, media strategist and artist, manager of the Rap Research Lab
  • Sarah Hatton, contemporary Canadian artist, creator of Detachment
  • Stephen Robertson, director of the Roy Rosenzweig Center for History and New Media and professor at George Mason University
  • Patrick Cronin and Thomas Neville, co-directors of THATCLASS
  • Jessie Daniels, professor at Hunter College and the Graduate Center, CUNY
  • Geoff Haines-Stiles, producer of “The Crowd and the Cloud” television series
  • Nicholas Adams, sociologist and research fellow at the Berkeley Institute for Data Science
  • Rachel Shorey of The New York Times’ Interactive News Department
  • Stephanie Stillo, curator of the Lessing J. Rosenwald Collection in the Library of Congress Rare Book and Special Collections Division

This is the second in the “Collections as Data” event series hosted by the Library of Congress. Last year’s event in the Coolidge Auditorium attracted a sold-out crowd and has been viewed more than 8,000 times on the Library’s YouTube channel. That event introduced the topic of collections as data and explored ethical issues around building and using digital collections. This year’s meeting will focus on stories of impact this work has on the public.

We hope you can join us next week either in-person or virtually. Everyone can follow along and join the conversation via the #AsData hashtag.

District Dispatch: ALA comments filed at the FCC

Mon, 2017-07-17 18:50


Today, ALA continues the fight for an open internet for all. In comments filed at the Federal Communications Commission (FCC), ALA questions the need to review current net neutrality rules and urges regulators to maintain the strong, enforceable rules already in place.

“Network neutrality is all about equity of access to information, and thus of fundamental interest to libraries,” said ALA President Jim Neal. “The 2015 Open Internet Order is the right reading of the law, and we do not see any reason for the FCC to arbitrarily return to this issue now. Without strong, enforceable rules protecting the open internet—like those outlined in the FCC’s 2015 Order—libraries cannot fulfill their missions, serve their patrons or support America’s communities.”

The ALA comments, filed with the American Association of Law Libraries (AALL) and the Chief Officers of State Library Agencies (COSLA), make clear that our nation’s 120,000 libraries—and their patrons—depend on fair access to broadband networks for basic services they provide in communities like connecting people to unbiased research, job searches, e-government services, health information and economic opportunity.

Moreover, as people increasingly turn from being solely content consumers to content producers, access to the internet and other library resources empowers all to participate fully in today’s vibrant digital economy. And, the comments note, the library community has always had the professional and philosophical mission of preserving the unimpeded flow of information and intellectual freedom. Libraries believe ensuring equitable access for all people and institutions is critical to our nation’s social, cultural, educational and economic well-being, and the existing net neutrality rules protect that access.

Absent strong, enforceable rules, commercial ISPs have financial incentives to interfere with the openness of the internet in ways that are likely to be harmful to people who use the internet content and services provided by libraries. Being able to prioritize their own content over anything else available online would allow ISPs to reap huge dividends at internet users’ expense. Pointing to increasing consolidation in the fixed and mobile broadband markets, the comments argue that these rules are becoming more necessary, not less.

The organizations filing comments today have a long history of advocating for the open internet, most recently sending letters to the FCC and Congressional leaders articulating Net Neutrality Principles that should form the basis of any review of the FCC’s 2015 Open Internet Order.

This post is from an ALA press release issued today.

The post ALA comments filed at the FCC appeared first on District Dispatch.

Islandora: Islandora Altmetrics is now Islandora Badges

Mon, 2017-07-17 14:50
Thanks to the efforts of newly minted Islandora 7.x Committer Brandon Weigel, from the British Columbia Electronic Library Network, the Islandora Altmetrics module has received a major overhaul, moving beyond support for Altmetrics to become a more generalized tool for adding various metrics. With this change in function comes a name change: Islandora Badges. Available badges include:
  • Altmetrics: display social media interactions
  • Scopus: Citation counts via the Scopus database
  • Web of Science: Citation counts via Web of Science
  • oaDOI: Provides a link to a fulltext document for objects without a PDF datastream, via the oaDOI API
The switch from Altmetrics to Badges will take place with the Islandora 7.x-1.10 release (or now, if you're running on HEAD and want the new update).  Making this change took six months of testing and committing, with a total of 161 commits. Special thanks to Jared Whikloj, Jordan Dukart, Diego Pino,  Dan Aitken, and Nick Ruest for their part in helping the process along, and to Will Panting and Donald Moses, who created the original Islandora Altmetrics during the 2015 Islandoracon Hackfest.

Islandora: Islandora Foundation Annual General Meeting - Friday, July 14th

Mon, 2017-07-17 12:40
The Islandora Foundation held its Annual General Meeting on Friday. The agenda and meeting minutes are here.   We had a relatively brief AGM, as much of the business of maintaining and improving Islandora is handled by community groups such as our Committers, Interest Groups, Roadmap Committee, and Board of Directors, and in discussions on our listserv, but some highlights from Friday's meeting include:  
  • The announcement of the re-election of Mark Jordan as the Chair of Islandora Foundation Board of Directors.
  • A Treasurer's report showing that the Islandora Foundation's financial status is stable, owing to the support of our members and revenue from our events.
  • Reports highlighting the work of the Board of Directors and Roadmap Committee in 2016/2017
  • Updates on the status of Islandora CLAW and the Fedora Specification, and how the two are related.
  Thank you to everyone who attended.

Terry Reese: MarcEdit 7 visual styles: High Contrast

Fri, 2017-07-14 15:50

An interesting request made while reviewing the wireframes was whether MarcEdit 7 could support a kind of high contrast, or “Dark”, theme mode.  An example of this would be Office:

Some people find this interface easier on the eyes, especially if you are working on a screen all day. 

Since MarcEdit utilizes its own GUI engine to handle font sizing, scaling, and styling, this seemed like a pretty easy request.  So, I did some experimentation.  Here’s MarcEdit 7 using the conventional UI:

And here it is under the “high contrast” theme:

Since theming falls into general accessibility options, I’ve put this in the language section of the options:

However, I should point out that in MarcEdit 7, I will be changing this layout to include a dedicated setting area for Accessibility options, and this will likely move into that area.

I’m not sure this is an option that I’d personally use as the “Dark” theme or High Contrast isn’t my cup of tea, but with the new GUI engine added to MarcEdit 7 with the removal of XP support – supporting this option really took about 5 minutes to turn on.

Questions, comments?


D-Lib: RARD: The Related-Article Recommendation Dataset

Fri, 2017-07-14 11:42
Article by Joeran Beel, Trinity College Dublin, Department of Computer Science, ADAPT Centre, Ireland; Zeljko Carevic and Johann Schaible, GESIS - Leibniz Institute for the Social Sciences, Germany; Gabor Neusch, Corvinus University of Budapest, Department of Information Systems, Hungary

D-Lib: Massive Newspaper Migration - Moving 22 Million Records from CONTENTdm to Solphal

Fri, 2017-07-14 11:42
Article by Alan Witkowski, Anna Neatrour, Jeremy Myntti and Brian McBride, J. Willard Marriott Library, University of Utah

D-Lib: The Best Tool for the Job: Revisiting an Open Source Library Project

Fri, 2017-07-14 11:42
Article by David J. Williams and Kushal Hada, Queens College Libraries, CUNY, Queens, New York

D-Lib: Ensuring and Improving Information Quality for Earth Science Data and Products

Fri, 2017-07-14 11:42
Article by Hampapuram Ramapriyan, Science Systems and Applications, Inc. and NASA Goddard Space Flight Center; Ge Peng, Cooperative Institute for Climate and Satellites-North Carolina, North Carolina State University and NOAA's National Centers for Environmental Information; David Moroni, Jet Propulsion Laboratory, California Institute of Technology; Chung-Lin Shie, NASA Goddard Space Flight Center and University of Maryland, Baltimore County

D-Lib: Trends in Digital Preservation Capacity and Practice: Results from the 2nd Bi-annual National Digital Stewardship Alliance Storage Survey

Fri, 2017-07-14 11:42
Article by Michelle Gallinger, Gallinger Consulting; Jefferson Bailey, Internet Archive; Karen Cariani, WGBH Media Library and Archives; Trevor Owens, Institute of Museum and Library Services; Micah Altman, MIT Libraries

D-Lib: The End of an Era

Fri, 2017-07-14 11:42
Editorial by Laurence Lannom, CNRI

D-Lib: Explorations of a Very-large-screen Digital Library Interface

Fri, 2017-07-14 11:42
Article by Alex Dolski, Independent Consultant; Cory Lampert and Kee Choi, University of Nevada, Las Vegas Libraries

District Dispatch: Library funding bill passes Labor HHS

Thu, 2017-07-13 23:08

In response to today’s House subcommittee vote, ALA President Jim Neal sent ALA members the following update:



I am pleased to report that, this evening, the House Appropriations subcommittee that deals with library funding (Labor, Health & Human Services, Education and Related Agencies) voted to recommend level funding in FY2018 for the Institute of Museum and Library Services (IMLS, $231 million), likely including $183 million for the Library Services and Technology Act, as well as $27 million for the Innovative Approaches to Literacy program.

Four months ago, President Trump announced that he wanted to eliminate IMLS and federal funding for libraries. Since then, all of us have been communicating with our members of Congress about the value of libraries. This evening’s Subcommittee vote, one important step in the lengthy congressional appropriations process, shows that our elected officials are listening to us and recognize libraries’ importance in the communities they represent. We are grateful to the leaders of the Subcommittee, Chairman Tom Cole (R-OK-4) and Ranking Member Rosa DeLauro (D-CT-3), and all Subcommittee members, for their support.

We have not saved FY18 federal library funding yet. Hurdles can arise at each stage of the appropriations process, which will continue into the fall. But the fact that federal library funding was not cut at this particular stage shows what can be accomplished when ALA members work together. We expect the full House Appropriations Committee to vote on the subcommittee bills as early as next Wednesday, July 19. I will send an update as soon as we have the results of the full committee’s actions.

In the meantime, I encourage you to stay informed and stay involved. Libraries and the millions of people we serve are in a better position today because of your advocacy.

Thank you,

Jim Neal

The post Library funding bill passes Labor HHS appeared first on District Dispatch.

Jonathan Rochkind: on hooking into sufia/hyrax after file has been uploaded

Thu, 2017-07-13 15:53


Our app (not yet publicly accessible) is still running on sufia 7.3. (A digital repository framework based on Rails, also known in other versions or other drawings of lines as hydra, samvera, and hyrax).

I had a need to hook into the point after a file has been added to fedora, to do some post-processing at that point.

(Specifically, we are trying to run a riiif instance on another server, without a shared file system (shared FS are expensive and/or tricky on AWS). So, the riiif server needs to copy the original image asset down from fedora. Since our original images are uncompressed TIFFs that average around 100MB, this is somewhat slow, and we want to have the riiif server “pre-load” at least the originals, if not the derivatives it will create. So after a new image is uploaded, we want to ‘ping’ the riiif server with an info request, causing it to download the original, so it’s there waiting for conversion requests, and at least it won’t have to do that. But it can’t pull down the file until it’s in fedora, so we need to wait until after fedora has it to ping. phew.)
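That pre-load “ping” could be sketched roughly like this (a minimal plain-Ruby sketch; the riiif base URL, the identifier scheme, and the method names are illustrative assumptions, not our app’s actual code):

```ruby
require "net/http"
require "uri"
require "cgi"

# Build the IIIF Image API info.json URL for a given image identifier.
# Base URL and identifier scheme are assumptions for illustration.
def iiif_info_url(riiif_base, file_set_id)
  "#{riiif_base}/#{CGI.escape(file_set_id)}/info.json"
end

# "Ping" the riiif server so it pulls the original down from fedora now,
# rather than on the first real conversion request. Failures are swallowed:
# this is a cache warm-up, not a critical step.
def ping_riiif(riiif_base, file_set_id)
  uri = URI(iiif_info_url(riiif_base, file_set_id))
  Net::HTTP.get_response(uri)
rescue StandardError => e
  Rails.logger.warn("riiif pre-load ping failed: #{e.message}") if defined?(Rails)
  nil
end
```

An info.json request is enough to make the server fetch the source image, without asking it to generate any derivative.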

Here are the cooperating objects in Sufia 7.3 that lead to actual ingest in Fedora. As far as I can tell. Much thanks to @jcoyne for giving me some pointers as to where to look to start figuring this out.

Keep in mind that I believe “actor” is just hydra/samvera’s name for a service object involved in handling ‘update a persisted thing’. Don’t get it confused with the concurrency notion of an ‘actor’, it’s just an ordinary fairly simple ruby object (although it can and often does queue up an ActiveJob for further processing).

The sufia default actor stack at ActorFactory includes the Sufia::CreateWithFilesActor.


  • AttachFilesToWork job does some stuff, but then calls out to CurationConcerns::Actors::FileSetActor#create_content. (We are using curation_concerns 1.7.7 with sufia 7.3.) — At least if it was a direct file upload (which I think is what this means). If the file was a `CarrierWave::Storage::Fog::File` (not totally sure in what circumstances it would be), it instead kicks off an ImportUrlJob.  But we’ll ignore that for now; I think the FileSetActor is the one my codepath is following. 





  • We are using hydra-works 0.16.0. AddFileToFileSet I believe actually finishes things off synchronously without calling out to anything else related to ‘get this thing into fedora’. Although I don’t really totally understand what the code does, honestly.
    • It does call out to Hydra::PCDM::AddTypeToFile, which is confusingly defined in a file called add_type.rb, not add_type_to_file.rb. (I’m curious how that doesn’t break things terribly, but didn’t look into it).


So in summary, we have six cooperating objects involved in following the code path of “how does a file actually get added to fedora”.  They go across 3-4 different gems (sufia, curation_concerns, hydra-works, and maybe hydra-pcdm, although that one might not be relevant here). Some of the classes involved inherit from, mix in, or have references to classes from other gems. The path involves at least two (sometimes more in some paths?) bg jobs — a bg job that queues up another bg job (and maybe more).

That’s just trying to follow the path involved in “get this uploaded file into fedora”, some  of those cooperating objects also call out to other cooperating objects (and maybe queue bg jobs?) to do other things, involving a half dozenish additional cooperating objects and maybe one or two more gem dependencies, but I didn’t trace those, this was enough!

I’m not certain how much this changed in hyrax (1.0 or 2.0), at the very least there’d be one fewer gem dependency involved (since Sufia and CurationConcerns were combined into Hyrax). But I kind of ran out of steam for compare and contrast here, although it would be good to prepare for the future with whatever I do.

Oh yeah, what was I trying to do again?

Hook into the point “after the thing has been successfully ingested in fedora” and put some custom code there.

So… I guess…  that would be hooking into the ::IngestFileJob (located in CurationConcerns), and doing something after it’s completed. It might be nice to use the ActiveJob#after_perform hook for this.  I actually hadn’t known about that callback, haven’t used it before — we’d need to get at least the file_set arg passed into it, which the docs say you can maybe get from the passed-in job.arguments.  That’s a weird way to do things in ruby (why aren’t ActiveJob’s instances with their state as ordinary state? I dunno), but okay! Or, of course, we could just monkey-patch override-and-call-super on perform to get a hook.
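For the monkey-patch override-and-call-super route, here’s a minimal plain-Ruby sketch. IngestFileJob is a stand-in for the real CurationConcerns job, and the riiif ping is stubbed; using Module#prepend (rather than reopening the class) keeps `super` pointing at the original method:

```ruby
# Stand-in for CurationConcerns' real job; the real #perform ingests
# the file into fedora.
class IngestFileJob
  def perform(file_set_id)
    "ingested #{file_set_id}"
  end
end

# Prepending a module puts it ahead of the class in the method lookup
# chain, so our #perform runs first and `super` calls the original.
module RiiifPreloadHook
  def perform(file_set_id)
    result = super
    ping_riiif(file_set_id) # post-ingest hook: fedora now has the file
    result
  end

  private

  # In the real app this would issue an IIIF info request to the riiif
  # server; here it just records that the hook fired.
  def ping_riiif(file_set_id)
    @pinged = file_set_id
  end
end

IngestFileJob.prepend(RiiifPreloadHook)
```

Because `super` is called first, the hook only fires after the original perform has returned without raising, which is the “after successful ingest” semantics we want.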

Or we could maybe hook into Hydra::Works::AddFileToFileSet instead, I think that does the actual work. There’s no callbacks there, so that’d just be monkey-patch-and-call-super on #call, I guess.

This definitely seems a little bit risky, for a couple different reasons.

  • There’s at least one place where a potentially different path is followed, if you’re uploading a file that ends up as a CarrierWave::Storage::Fog::File instead of a CarrierWave::SanitizedFile.  Maybe there are more I missed? So configuration or behavior changes in the app might cause my hook to be ignored, at least in some cases.


  • Forward-compatibility seems unreliable. Will this complicated graph of cooperating instances get refactored?  Has it already in future versions of Hyrax? If it gets refactored, will it mean the object I hook into no longer exists (not even with a different namespace/name), or exists but isn’t called in the same way?  In some of those failure modes, it might be an entirely silent failure where no error is ever raised, my code I’m trying to insert just never gets called. Which is sad. (Sure, one could try to write a spec for this to warn you…  think about how you’d do that. I still am.)  Between IngestFileJob and AddFileToFileSet, is one ‘safer’ to hook into than the other? Hard to say. If I did research in hyrax master branch, it might give me some clues.
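One rough way to write such a spec is a “canary” assertion that the class and method being hooked still exist with the expected shape, so a refactor in a future gem version fails loudly in CI instead of silently skipping the hook. The class and signature below are illustrative stand-ins, not the real CurationConcerns definitions:

```ruby
# Stand-in for the job we monkey-patch; the argument list is hypothetical.
class IngestFileJob
  def perform(file_set, filepath, mime_type, user); end
end

# Canary check: raises if the hooked class was renamed away, the method
# was removed, or its arity changed — all the silent-failure modes.
def assert_hook_target!(klass_name, method_name, arity)
  klass = Object.const_get(klass_name)       # NameError if the class is gone
  meth  = klass.instance_method(method_name) # NameError if the method is gone
  unless meth.arity == arity
    raise "#{klass_name}##{method_name} arity changed: " \
          "expected #{arity}, got #{meth.arity}"
  end
  true
end

assert_hook_target!("IngestFileJob", :perform, 4)
```

It can’t prove the method is still *called* in the same circumstances, but it at least turns “the thing I patched disappeared” into a test failure.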

I guess I’ll probably still do one of these things, or find another way around it. (A colleague suggested there might be an entirely different place to hook into instead, not the ‘actor stack’, but maybe in other code around the controller’s update action).

What are the lessons?

I don’t mean to cast any aspersions on the people who put in a lot of work, very well-intentioned work, conscientious work, to get hydra/samvera/sufia/hyrax where it is, being used by lots of institutions. I don’t mean to say that I could or would have done differently if I had been there when this code was written — I don’t know that I could or would have.

And, unfortunately, I’m not saying I have much idea of what changes to make to this architecture now, in the present environment, with regards to backwards compat, with regards to the fact that I’m still on code one or two major versions (and a name change) behind current development (which makes the local benefit from any work I put into careful upstream PR’s a lot more delayed, for a lot more work; I’m not alone here, there’s a lot of dispersion in what versions of these shared dependencies people are using, which adds a lot of cost to our shared development).  I don’t really! My brain is pretty tired after investigating what it’s already doing. Trying to make a shared architecture which is easily customizable like this is hard, no ways around it.  (ActiveSupport::Callbacks are trying to do something pretty analogous to the ‘actor stack’, and are one of the most maligned parts of Rails).

But I don’t think that should stop us from some evaluation.  Going forward making architecture that works well for us is aided immensely by understanding what has worked out how in what we’ve done before.

If the point of the “Actor stack” was to make it easy/easier to customize code in a safe/reliable way (meaning reasonably forward-compatible)–and I believe it was–I’m not sure it can be considered a success. We gotta start with acknowledging that.

Is it better than what it replaced?  I’m not sure, I wasn’t there for what it replaced. It’s probably harder to modify in the shared codebase going forward than the presumably simpler thing it replaced though… I can say I’d personally much rather have just one or two methods, or one ActiveJob, that I just hackily monkey-patch to do what I want, that if it breaks in a future version will break in a simple way, or one that takes less time and brain to figure out what’s going on anyway. That wouldn’t be a great architecture, but I’d prefer it to what’s there now, I think.  Of course, it’s a pendulum, and the grass is always greener; if I had that, I’d probably be wanting something cleaner, and maybe arrive at something like the ‘actor stack’ — but we’re all here now with what we’ve got, so we can at least consider that this may have gone in some unuseful directions.

What are those unuseful directions?  I think, not just in the actor stack, but in many parts of hydra, there’s an ethos that breaking things into many very small single-purpose classes/instances is the way to go, then wiring them all together.  Ideally with lots of dependency injection so you can switch in em and out.  This reminds me of what people often satirize and demonize in stereotypical maligned Java community architecture, and there’s a reason it’s satirized and demonized. It doesn’t… quite work out.

To pull this off well — especially in shared library/gem codebase, which I think has different considerations than a local bespoke codebase, mainly that API stability is more important because you can’t just search-and-replace all consumers in one codebase when API changes — you’ve got to have fairly stable APIs, which are also consistent and easily comprehensible and semantically reasonable.   So you can replace or modify one part, and have some confidence you know what it’s doing, when it will be called, and that it will keep doing this for at least a few months of future versions. To have fairly stable and comfortable APIs, you need to actually design them carefully, and think about developer use cases. How are developers intended to intervene in here to customize? And you’ve got to document those. And those use cases also give you something to evaluate later — did it work for those use cases?

It’s just not borne out by experience that if you make everything into as small single-purpose classes as possible and throw them all together, you’ll get an architecture which is infinitely customizable. You’ve got to think about the big picture. Simplicity matters, but simplicity of the architecture may be more important than simplicity of the individual classes. Simplicity of the API is definitely more important than simplicity of internal non-public implementation. 

When in doubt if you’re not sure you’ve got a solid stable comfortable API,  fewer cooperating classes with clearly defined interfaces may be preferable to  more classes that each only have a few lines. In this regard, rubocop-based development may steer us wrong, too much to the micro-, not enough to the forest.

To do this, you’ve got to be careful, and intentional, and think things through, and consider developer use cases, and maybe go slower and support fewer use cases.  Or you wind up with an architecture that not only does not easily support customization, but is very hard to change or improve. Cause there are so many interrelated coupled cooperating parts, and changing any of them requires changes to lots of them, and breaks lots of dependent code in local apps in the process. You can actually make forwards-compatible-safe code harder, not easier.

And this gets even worse when the cooperating objects in a data flow are spread across multiple gem dependencies, as they often are in the hydra/samvera stack. If a change in one requires a change in another, now you’ve got dependency compatibility nightmares to deal with too. This makes it even harder (rather than easier, as was the original goal) for existing users to upgrade to new versions of dependencies, as well as harder to maintain all these dependencies.  It’s a nice idea, small dependencies which can work together — but again, it only works if they have very stable and comfortable APIs.  Which again requires care and consideration of developer use cases. (Just as the Java community gives us a familiar cautionary lesson about over-architecture, I think the Javascript community gives us a familiar cautionary lesson about ‘dependency hell’. The path to code hell is often paved with good intentions).

The ‘actor stack’ is not the only place in hydra/samvera that suffers from some of these challenges, as I think most developers in the stack know.  It’s been suggested to me that one reason there’s been a lack of careful, considered, intentional architecture in the stack is because of pressure from institutions and managers to get things done, why are you spending so much time without new features?  (I know from personal experience this pressure, despite the best intentions, can be even stronger when working as a project-based contractor, and much of the stack was written by those in that circumstance).

If that’s true, that may be something that has to change. Either a change to those pressures — or resisting them by not doing rearchitectures under those conditions. If you don’t have time to do it carefully, it may be better not to commit the architectural change and new API at all.  Hack in what you need in your local app with monkey-patches or other local code instead. Counter-intuitively, this may not actually increase your maintenance burden or decrease your forward-compatibility!  Because the wrong architecture or the wrong abstractions can be much more costly than a simple hack, especially when put in a shared codebase. Once a few people have hacked it locally and seen how well it works for their use cases, you have a lot more evidence to abstract the right architecture from.

But it’s still hard!  Making a shared codebase that does powerful things, that works out of the box for basic use cases but is still customizable for common use cases, is hard. It’s not just us. I worked last year with spree/solidus, which has an analogous architectural position to hydra/samvera, also based on Rails, but in ecommerce instead of digital repositories. And it suffers from many of the same sorts of problems, even leading to the spree/solidus fork, where the solidus team thought they could do better… and they have… maybe… a little.  Heck, the challenges and setbacks of Rails itself can be considered similarly.

Taking account of this challenge may mean scaling back our aspirations a bit, and going slower.   It may not be realistic to think you can be all things to all people. It may not be realistic to think you can make something that can be customized safely by experienced developers and by non-developers just writing config files (that last one is a lot harder).  Every use case a participant or would-be participant has may not be able to be officially or comfortably supported by the codebase. Use cases and goals have to be identified, lines have to drawn. Which means there has to be a decision-making process for who and how they are drawn, re-drawn, and adjudicated too, whether that’s a single “benevolent dictator” person or institution like many open source projects have (for good or ill), or something else. (And it’s still hard to do that, it’s just that there’s no way around it).

And finally, a particularly touchy evaluation of all for the hydra/samvera project; but the hydra project is 5-7 years old, long enough to evaluate some basic premises. I’m talking about the twin closely related requirements which have been more or less assumed by the community for most of the project’s history:

1) That the stack has to be based on fedora/fcrepo, and

2) that the stack has to be based on native RDF/linked data, or even coupled to RDF/linked data at all.

I believe these were uncontroversial assumptions rather than entirely conscious decisions (edit 13 July, this may not be accurate and is a controversial thing to suggest among some who were around then. See also @barmintor’s response.), but I think it’s time to look back and wonder how well they’ve served us, and I’m not sure it’s well.  A flexible powerful out-of-the-box-app shared codebase is hard no matter what, and the RDF/fedora assumptions/requirements have made it a lot harder, with a lot more uncharted territory to traverse, best practices to be invented with little experience to go on, more challenging abstractions, less mature/reliable/performant components to work with.

I think a lot of the challenges and breakdowns of the stack are attributable to those basic requirements — I’m again really not blaming a lack of skill or competence of the developers (and certainly not to lack of good intentions!). Looking at the ‘actor stack’ in particular, it would need to do much simpler things if it was an ordinary ActiveRecord app with paperclip (or better yet shrine), it would be able to lean harder on mature shared-upstream paperclip/shrine to do common file handling operations, it would have a lot less code in it, and less code is always easier to architect and improve than more code. And meanwhile, the actually realized business/institutional/user benefits of these commitments — now after several+ years of work put into it — are still unclear to me.  If this is true or becomes consensual, and an evaluation of the fedora/rdf commitments and foundation does not look kindly upon them… where does that leave us, with what options?

Filed under: General