The user experience audit is the core deliverable from the UX bandwagon if you don’t code or draw. It has real measurable value, but it also represents the lowest barrier to entry for aspirants. Code and visual design work have baked-in quality indicators. Good code works, and you just know good design when you see it, in the same way Justice Stewart was able to gauge obscenity:
I shall not today attempt further to define the kinds of material I understand to be embraced within that shorthand description [“hard-core pornography”], and perhaps I could never succeed in intelligibly doing so. But I know it when I see it, and the motion picture involved in this case is not that.
Audits, though, aren’t so privileged. Look for audit templates or how-to’s — we even have one here on LibUX — and you’ll find the practice is inconsistent across the discipline.
In part, they suffer from the same flaw inherent to user experience design in general in that nobody can quite agree on just what user experience audits do.
It’s an ambiguity that extends across the table.
As a term, the “user experience audit” fails to describe its value to client stakeholders. There is no clear return in paying for an “audit,” only the promise of red flags under scrutiny. And precisely because the value of performing an audit requires explanation, scoring the opportunity now relies on the art of the pitch rather than the expertise of the service you provide.
It boils down to a semantic problem.
That’s all preamble for this: this weekend, my partnership with a library association came to an end – capped by the delivery of a benchmarks and heuristics report, which was a service I was able to up-sell in addition to my original scope of involvement. I don’t think I could have sold a “user experience audit.”
Instead, I offered to report on the accessibility, performance, and usability of their service in order to establish benchmarks on which to build moving forward. This creates an objective-ish reference that they or future consultants can use in future decision-making. Incremental improvements in any of these areas have an all-ships-rising-with-the-tide effect, but with this report, I say, we will be able to identify which opportunities have the most bang for their buck.
So, okay. It’s semantics. But this little wordsmithy makes an important improvement: “benchmarks and heuristics” actually describe the content of the audit. This makes it easier to convince stakeholders it’s no report card – but a decision-making tool that empowers the organization.
My template
I use a simple template. I tweak, add, and remove sections depending on the scope of the project, but I think the arrangement holds up. There is a short cover letter followed by an overview summarizing the whole shebang. I make it conversational, and try to answer the question stakeholders paid me to answer: how do we stand, and what should we do next? The rest of the report is evidence to support my advice.
Benchmarks are quantitative scores informed by data or programmatic audits that show the organization where they stand in relation to law or best practice or competition. You can run objective-ish numbers from user research as long as they adhere to some system — like net promoter scores or system usability scales — but in my experience the report is best organized from inarguable to old-fashioned gut feelings.
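For the survey-based scores mentioned above, here is a minimal sketch of how the two numbers are conventionally computed. It assumes the standard public scoring rules (net promoter score as percent promoters minus percent detractors on a 0-10 scale; system usability scale as ten 1-5 items rescaled to 0-100). The function names are illustrative, not part of any report template.

```python
def net_promoter_score(ratings):
    """NPS from 0-10 "would you recommend us" ratings.

    Promoters rate 9-10, detractors rate 0-6; the score is the
    percentage of promoters minus the percentage of detractors.
    """
    if not ratings:
        raise ValueError("need at least one rating")
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100 * (promoters - detractors) / len(ratings)


def sus_score(responses):
    """System Usability Scale score (0-100) for one respondent.

    `responses` holds the ten answers in questionnaire order, each 1-5.
    Odd-numbered items contribute (answer - 1), even-numbered items
    contribute (5 - answer); the sum is multiplied by 2.5.
    """
    if len(responses) != 10:
        raise ValueError("SUS has exactly ten items")
    total = 0
    for position, answer in enumerate(responses, start=1):
        total += (answer - 1) if position % 2 == 1 else (5 - answer)
    return total * 2.5
```

Averaging `sus_score` across respondents gives a single benchmark figure that can be tracked from one report to the next.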
Programmatic audits border on the “inarguable” here. You’re either Section 508 compliant or you’re not. These are validation scans for accessibility, performance, or security, which can, when there’s something wrong, identify the greatest opportunities for improvement. I attach the full results of each audit in the appendix and explain my method. Then, I devote the white-space to describing the findings like you would over coffee.
Anticipate and work the answers to these questions into your writeup:
- Is this going to cost me [the stakeholder] money, business, credibility, or otherwise hurt me sometime down the road if I don’t fix?
- What kind of involvement, cost, consideration, and time does it take to address?
- What would you [the expert] recommend if you had your druthers?
- What is the least I could do or spend to assuage the worst of it?
I follow benchmarks with liberally linked-up heuristics and other research findings as they veer further into opinion, and the more opinionated each section becomes, the more I put into their presentation: embed gifs or link out to unlisted YouTube videos of the site in action, use screenshots, pop-in analytics charts or snippets from external spreadsheets like their content audit — or even audio clips from a user chatting about a feature.
Wait, audio? I’m not really carrying podcast equipment everywhere. Sometimes, I’ll put the website or prototype up on www.usertesting.com and ask five to ten people to perform a short task, then I’ll use the video or audio to — let’s say — prove a point about the navigation.
The more qualitative data I can use to support a best practice or opinion, the better I feel. I don’t actually believe that folks who reach out to me for this kind of stuff are looking for excuses to pshaw my work, but I’m a little insecure about it.
Anyway, your mileage may vary, but I thought I’d show you the basic benchmarks and heuristics report template I fork and start with each time. It might help if you don’t know where to start.
The Internet Archive had this to say earlier today:
— Internet Archive (@internetarchive) February 15, 2017
This was in response to the MacArthur Foundation announcing that the IA is a semifinalist for a USD $100 million grant; they propose to digitize 4 million books and make them freely available.
Well and good, if they can pull it off — though I would love to see the detailed proposal — and the assurance that this whole endeavor is not tied to the fortunes of a single entity, no matter how large.
But for now, I want to focus on the rather big bus that the IA is throwing “physical libraries” under. On the one hand, their statement is true: access to libraries is neither completely universal nor completely equitable. Academic libraries are, for obvious reasons, focused on the needs of their host schools; the independent researcher or simply the citizen who wishes to be better informed will always be a second-class user. Public libraries are not evenly distributed nor evenly funded. Both public and academic libraries struggle with increasing demands on their budgets, particularly with respect to digital collections. Despite the best efforts of librarians, underserved populations abound.
Increasing access to digital books will help — no question about it.
But it won’t fundamentally solve the problem of universal and equitable service. What use is the Open Library to somebody who has no computer, no decent smartphone, an inadequate data plan, or uncertain knowledge of how to use the technology? (Of course, a lot of physical libraries offer technology training.)
I will answer the IA’s overreach into technical messianism with another bit of technical lore: TIMTOWTDI.
There Is More Than One Way To Do It.
I program in Perl, and I happen to like TIMTOWTDI—but as a principle guiding the design of programming languages, it’s a matter of taste and debate: sometimes there can be too many options.
However, I think TIMTOWTDI can be applied as a rule of thumb in increasing social justice:
There Is More Than One Way To Do It… and we need to try all of them.
Local communities have local needs. Place matters. Physical libraries matter—both in themselves and as a way of reinforcing technological efforts.
Technology is not universally available. It is not available equitably. The Internet can route around certain kinds of damage… but big, centralized projects are still vulnerable. Libraries can help mitigate some of those risks.
I hope the Internet Archive realizes that they are better off working with libraries — and not just acting as a bestower of technological solutions that may help, but will not by themselves solve the problem of universal, equitable access to information and entertainment.
I’ve been talking about linked data for so long that I can’t remember when I first began. I was actually a skeptic at first, as I was struggling to see the benefit from all the work required to move our data from where it is now into that brave new world.
But then I started to understand what a transformational change we were contemplating, and the many benefits that could accrue. Let me spell it out for you.
MARC, our foundational metadata standard, is fundamentally built for description. As a library cataloger, you have an object in hand, and your task is to describe that item to the best of your abilities so that a library user can distinguish it from other, similar items. Sure, your task is also to assign some subject headings so it can be discovered by a subject search, but the essential bit is to describe the thing with enough specificity so that someone else (perhaps another cataloger) can determine whether the item they hold in their hand is the same thing.
I humbly submit that this has been the mission of cataloging for the last X number of decades. And now, I also submit, we are about to turn the tables. Rather than focusing our efforts on description, we will be focusing more of our efforts on discovery. What does this mean?
It means lashing up our assertions about an item (e.g., this person wrote this work) with canonical identifiers that can be resolved and can lead to additional information about that assertion. This of course assumes the web as the foundational infrastructure that makes linked data possible.
But it’s also more than this. It is also about using linked data techniques to associate related works. Using the concepts laid out by the Functional Requirements for Bibliographic Records (FRBR) to bring together all of the various manifestations of a work. This can support interfaces that make it easier to navigate search results and find the version of a work that you need. Linked data techniques are also making it easier for us to link translations to the original works and vice versa.
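To make the idea concrete, here is a minimal sketch in plain Python of assertions expressed as subject-predicate-object triples, then grouped to pull together the manifestations and translations of a work. Every URI is a hypothetical placeholder under example.org, the predicate names are invented for illustration, and linking manifestations straight to a work is a simplification of the FRBR model.

```python
from collections import defaultdict

# Hypothetical identifiers; real data would use resolvable, canonical URIs.
VOCAB = "http://example.org/vocab/"
CREATOR = VOCAB + "creator"
MANIFESTATION_OF = VOCAB + "manifestationOf"
TRANSLATION_OF = VOCAB + "translationOf"

WORK = "http://example.org/work/1"
PERSON = "http://example.org/person/42"

# Each assertion is a (subject, predicate, object) triple of identifiers.
triples = [
    (WORK, CREATOR, PERSON),  # "this person wrote this work"
    ("http://example.org/manifestation/1a", MANIFESTATION_OF, WORK),
    ("http://example.org/manifestation/1b", MANIFESTATION_OF, WORK),
    ("http://example.org/work/2", TRANSLATION_OF, WORK),
]


def related(triples, predicate):
    """Map each object to the list of subjects linked to it by `predicate`."""
    index = defaultdict(list)
    for subject, pred, obj in triples:
        if pred == predicate:
            index[obj].append(subject)
    return dict(index)


manifestations = related(triples, MANIFESTATION_OF)
translations = related(triples, TRANSLATION_OF)
```

An interface could use `manifestations[WORK]` to collapse several search hits into one entry, and `translations[WORK]` to offer the same work in other languages.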
All of these advancements are making discovery easier and more effective, which is really what we should be all about, don’t you think?
About Roy Tennant
Roy Tennant works on projects related to improving the technological infrastructure of libraries, museums, and archives.
New vacancy listings are posted weekly on Wednesday at approximately 12 noon Central Time. They appear under New This Week and under the appropriate regional listing. Postings remain on the LITA Job Site for a minimum of four weeks.
New This Week
Visit the LITA Job Site for more available jobs and for information on submitting a job posting.
Our Collections as Data event in September 2016 on exploring the computational use of library collections was a success on several levels, including helping steer our team at National Digital Initiatives in our path of action.
We are pleased to release the following summary report which includes an executive summary of the event, the outline of our work in this area over the past five months, and the work of our colleagues Oliver Baez Bendorf, Dan Chudnov, Michelle Gallinger, and Thomas Padilla. If you are interested in what we mean when we talk about collections as data or the infrastructure necessary to support this work, this is for you.
The format of this summary report is itself an experiment. We contracted authors and artists to comment on this important topic from their diverse perspectives in order to create a holistic resource reflective of what made the symposium so great. You will read a reflection on the event from keynote speaker Thomas Padilla, a recommendation on how to implement a computational environment for scholars by Dan Chudnov and Michelle Gallinger, as well as a series of collages by the artist Oliver Baez Bendorf representing key themes from the day.
Mark your calendars for the next #AsData event on July 24th-25th at the Library of Congress. By featuring stories from humanities researchers, journalists, students, social scientists, artists, and story tellers who have used library collections computationally, we hope to communicate the possibilities of this approach to a broad general audience.
ON A COLLECTIONS AS DATA IMPERATIVE | THOMAS PADILLA
LIBRARY OF CONGRESS DIGITAL SCHOLARS PILOT PROJECT REPORT | DAN CHUDNOV AND MICHELLE GALLINGER
DOWNLOAD THE #ASDATA POSTER SERIES | OLIVER BAEZ BENDORF
ALL SPEAKER PRESENTATIONS AVAILABLE FOR STREAMING ON THE LIBRARY OF CONGRESS YOUTUBE PAGE
The Islandoracon Planning Committee is very pleased to unveil the logo that will grace our conference t-shirts in Hamilton, Ontario this May:
With all due credit to both the remarkable musical that inspired the image, and to the entirely different man named Hamilton who actually founded the city, the concept for this image comes from Bryan Brown at FSU. It also brings back the now-ubiquitous Islandora lobster (also known as the CLAWbster), who was created for the first Islandoracon and has gone on to dominate Islandora CLAW repositories in many different guises.
At DPLA, it is very important to us that DPLAfest bring together a broad array of professionals and advocates who care about access to culture to discuss everything from technology and open access to copyright, public engagement, and education. We celebrate the diversity of our DPLA community of partners and users and want to ensure that these perspectives are represented at DPLAfest, which is why we are thrilled to announce three fully funded travel awards to attend DPLAfest 2017.
Our goal is to use this funding opportunity to promote the widest possible range of views represented at DPLAfest. We require that applicants represent one or more of the following:
- Professionals from diverse ethnic and/or cultural backgrounds representing one or more of the following groups: American Indian/Alaska Native, Asian, Black/African American, Hispanic/Latino or Native Hawaiian/Other Pacific Islander
- Professionals whose work or institutions primarily serve and/or represent historically underserved populations including, but not limited to, LGBTQ communities, incarcerated people and ex-offenders, people of color, people with disabilities, or Native American and tribal communities
- Individuals who would not otherwise have the financial capacity to attend DPLAfest 2017
- Graduate students and/or early career professionals
- Students or professionals who live and/or work in the Greater Chicago metro area
- Visit the DPLAfest Scholarships page and complete the application form by March 1, 2017.
- Award recipients must attend the entire DPLAfest, taking place on Thursday, April 20 between 9:00am and 5:45pm and Friday, April 21 between 9:00am and 3:30pm.
- Award recipients agree to write a blog post about their experience at the event to be published on DPLA’s blog within two weeks following the end of the event.
Please note that this funding opportunity will provide for airfare and/or other required travel to Chicago if needed, lodging for two nights in one of the event hotels, and complimentary registration for DPLAfest. The award will not provide for meals or incidental expenses such as cab fares or local public transportation. All applicants will be notified regarding their application status during the week of March 13.
We appreciate your help sharing this opportunity with interested individuals in your personal and professional networks.
While we anticipated the Federal Communications Commission (FCC) would take a look at its Universal Service Fund (USF) programs once Chairman Pai was in place, we did not anticipate the speed at which moves to review and evaluate previous actions would occur. After the Commission retracted the “E-rate Modernization Report,” our E-rate ears have been itching with concern that our bread and butter USF program would attract undue attention. We did not have long to wait.
Last week, FCC Commissioner Michael O’Rielly sent a letter (pdf) to the Universal Service Administrative Company (USAC) seeking detailed information on libraries and schools that applied in 2016 for E-rate funding for dark fiber and self-provisioned fiber. Our main concern is that the tenor of the Commissioner’s inquiries calls into question the need for these fiber applications. The FCC’s December 2014 E-rate Modernization Order allowed libraries and schools to apply for E-rate on self-construction costs for dark fiber and applicant-owned fiber. Allowing E-rate eligibility of self-construction costs “levels the playing field” with the more typical leased fiber service offered by a third party, like a local telecommunications carrier. Because we know from our members that availability of high-capacity broadband at reasonable costs continues to be a significant barrier for libraries that want to increase their broadband capacity, ALA advocated for this change in several filings with the FCC.
We find Commissioner O’Rielly’s concern about overbuilding to be misplaced. The real issue is getting the best broadband service at the lowest cost, thus ensuring the most prudent use of limited E-rate and local funds. As we explained in our September 2013 comments (pdf) filed in response to then Acting Chair Mignon Clyburn’s opening of the E-rate modernization proceeding, “It is not a good stewardship of E-rate funds (or local library funds) to pay more for leasing a circuit when ownership is less expensive.”
To help ensure that applicants get the lowest cost for their fiber service, the FCC already has in place detailed E-rate bidding regulations that require cost to be the most important factor when evaluating bids from providers. As the Commission stated in its December 2014 E-rate Modernization Order (pdf), incumbent providers “Are free to offer dark-fiber service themselves, or to price their lit-fiber service at competitive rates to keep or win business – but if they choose not to do so, it is market forces and their own decisions, not the E-rate rules” that preclude their ability to compete with a self-construction option. The Commission’s reforms to allow self-construction costs for dark fiber and applicant-owned fiber were correct in 2014 and remain so. In addition, applicants will evaluate and select the best, most cost-effective fiber option for their library or school.
If the last few weeks are any indication of activity at the FCC, we’re in for a busy spring.
The post Concerns about FCC E-rate letter on fiber broadband deployment appeared first on District Dispatch.
There are many scenarios in which users must be able to prove the existence of data at a specific point in time and be able to demonstrate the integrity of data since that time, even when the duration from time of existence to time of demonstration spans a large period of time. Additionally, users must be able to verify signatures on digitally signed data many years after the generation of the signature. This document describes a class of long-term archive services to support such scenarios and the technical requirements for interacting with such services.
Below the fold, a look at how it has stood the test of time.
The RFC's overview of the problem a long-term archive (LTA) must solve is still exemplary, especially in its stress on the limited lifetime of cryptographic techniques (Section 1):
Digital data durability is undermined by continual progress and change on a number of fronts. The useful lifetime of data may exceed the life span of formats and mechanisms used to store the data. The lifetime of digitally signed data may exceed the validity periods of public-key certificates used to verify signatures or the cryptanalysis period of the cryptographic algorithms used to generate the signatures, i.e., the time after which an algorithm no longer provides the intended security properties. Technical and operational means are required to mitigate these issues.
But note the vagueness of the very next sentence:
A solution must address issues such as storage media lifetime, disaster planning, advances in cryptanalysis or computational capabilities, changes in software technology, and legal issues.
There is no one-size-fits-all affordable digital preservation technology, something the RFC implicitly acknowledges. But it does not even mention the importance of basing decisions on an explicit threat model when selecting or designing an appropriate technology. More than 18 months before the RFC was published, the LOCKSS team made this point in Requirements for Digital Preservation Systems: A Bottom-Up Approach. Our explicit threat model was very useful in the documentation needed for the CLOCKSS Archive's TRAC certification.
How to mitigate the threats? Again, Section 1 is on point:
A long-term archive service aids in the preservation of data over long periods of time through a regimen of technical and procedural mechanisms designed to support claims regarding a data object. For example, it might periodically perform activities to preserve data integrity and the non-repudiability of data existence by a particular point in time or take actions to ensure the availability of data. Examples of periodic activities include refreshing time stamps or transferring data to a new storage medium.
Section 4.1.1 specifies a requirement that is still not implemented in any ingest pipeline I've encountered:
The LTA must provide an acknowledgement of the deposit that permits the submitter to confirm the correct data was accepted by the LTA.
It is normal for Submission Information Packages (SIPs) to include checksums of their components; BagIt is typical in this respect. The checksums allow the archive increased confidence that the submission was not corrupted in transit. But they don't do anything to satisfy RFC 4810's requirement that the submitter be reassured that the archive got the right data. Even if the archive reported the checksums to the submitter, this doesn't tell the submitter anything useful. The archive could simply have copied the checksums from the submission without validating them.
SIPs should include a nonce. The archive should prepend the nonce to each checksummed item, and report the resulting checksum back to the submitter, who can validate them, thus mitigating among others the threat that the SIP might have been tampered with in transit. This is equivalent to the first iteration of Shah et al's audit technology.
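The nonce exchange described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: SHA-256 as the digest, a 16-byte random nonce, and an in-memory dictionary standing in for the SIP. The function names and sample file names are hypothetical, and no existing ingest pipeline or BagIt tool is being described.

```python
import hashlib
import secrets


def nonce_checksum(nonce: bytes, payload: bytes) -> str:
    """Return the SHA-256 hex digest of the nonce prepended to the payload."""
    return hashlib.sha256(nonce + payload).hexdigest()


def archive_receipt(nonce: bytes, received_items: dict) -> dict:
    """Archive side: hash nonce + each item actually received."""
    return {name: nonce_checksum(nonce, data) for name, data in received_items.items()}


def verify_receipt(nonce: bytes, local_items: dict, receipt: dict) -> bool:
    """Submitter side: recompute from local copies and compare with the receipt."""
    return all(
        receipt.get(name) == nonce_checksum(nonce, data)
        for name, data in local_items.items()
    )


# Submitter picks a fresh nonce and ships it inside the SIP (hypothetical contents).
nonce = secrets.token_bytes(16)
sip = {"report.pdf": b"example file contents", "data.csv": b"a,b\n1,2\n"}

# Archive computes the receipt from what it received and reports it back.
receipt = archive_receipt(nonce, sip)
```

Because the nonce is unknown until submission, the archive cannot produce a valid receipt by copying the plain checksums shipped in the SIP; it has to hash the bytes it actually holds.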
Note also how the RFC follows (without citing) the OAIS Reference Model in assuming a "push" model of ingest.
The RFC correctly points out that an LTA will rely on, and trust, services that must not be provided by the LTA itself, for example (Section 4.2.1):
Supporting non-repudiation of data existence, integrity, and origin is a primary purpose of a long-term archive service. Evidence may be generated, or otherwise obtained, by the service providing the evidence to a retriever. A long-term archive service need not be capable of providing all evidence necessary to produce a non-repudiation proof, and in some cases, should not be trusted to provide all necessary information. For example, trust anchors [RFC3280] and algorithm security policies should be provided by other services. An LTA that is trusted to provide trust anchors could forge an evidence record verified by using those trust anchors.
and (Section 2):
Time Stamp: An attestation generated by a Time Stamping Authority (TSA) that a data item existed at a certain time. For example, [RFC3161] specifies a structure for signed time stamp tokens as part of a protocol for communicating with a TSA.
But the RFC doesn't explore the problems that this reliance causes. Among these are:
- Recursion. These services, which depend on encryption technologies that decay over time, must themselves rely on long-term archiving services to maintain, for example, a time-stamped history of public key validity. The RFC does not cite Petros Maniatis' 2003 Ph.D. thesis Historic integrity in distributed systems on precisely this problem.
- Secrets. The encryption technologies depend on the ability to keep secrets for extended periods even if not, as the RFC explains, for the entire archival period. Keeping secrets is difficult and it is more difficult to know whether, or when, they leaked. The damage to archival integrity which the leak of secrets enables may only be detected after the fact, when recovery may not be possible. Or it may not be detected, because the point at which the secret leaked may be assumed to be later than it actually was.
Encryption poses many problems for digital preservation. Section 4.5.1 identifies another:
A long-term archive service must provide means to ensure confidentiality of archived data objects, including confidentiality between the submitter and the long-term archive service. An LTA must provide a means for accepting encrypted data such that future preservation activities apply to the original, unencrypted data. Encryption, or other methods of providing confidentiality, must not pose a risk to the associated evidence record.
Easier said than done. If the LTA accepts encrypted data without the decryption key, the best it can do is bit-level preservation. Future recovery of the data depends on the availability of the key which, being digital information, will need itself to have been stored in an LTA. Another instance of the recursive nature of long-term archiving.
"Mere bit preservation" is often unjustly denigrated as a "solved problem". The important archival functions are said to be "active", modifying the preserved data and therefore requiring access to the plaintext. Thus, on the other hand, the archive might have the key and, in effect, store the plaintext. The confidentiality of the archived data then depends on the archive's security remaining impenetrable over the entire archival period, something about which one might reasonably be skeptical.
Section 4.2.2 admits that "mere bit preservation" is the sine qua non of long-term archiving:
Demonstration that data has not been altered while in the care of a long-term archive service is a first step towards supporting non-repudiation of data.
and goes on to note that "active preservation" requires another service:
Certification services support cases in which data must be modified, e.g., translation or format migration. An LTA may provide certification services.
It isn't clear why the RFC thinks it is appropriate for an LTA to certify the success of its own operations. A third-party certification service would also need access to pre- and post-modification plaintext, increasing the plaintext's attack surface and adding another instance of the problems caused by LTAs relying on external services discussed above.
Overall, the RFC's authors did a pretty good job. Time has not revealed significant inadequacies beyond those knowable at the time of publication.
Some of you may be aware that the Hydra Project has been attempting to trademark its “product” in the US and in Europe. During this process we became aware of MPDV, a German company that has a wide-ranging trademark on the use of ‘Hydra’ for computer software and whose claim to the word considerably predates ours. Following discussions with their lawyers, our attorney advised that we should agree to MPDV’s demand that we cease use of the name “Hydra” and, having sought a second opinion, we have agreed that we will do so. Accordingly, we need to embark on a program to rebrand ourselves. MPDV have given us six months to do this, which our lawyer deems “generous”.
The Steering Group, in consultation with the Hydra Partners, has already started mapping out a process to follow over the coming months but will welcome input from the Hydra Community – particularly help in identifying a new name, a matter of some urgency. We will be especially interested in hearing from anyone with prior success in any naming and (re-)branding initiatives! Rather than seeing this as a setback we are looking at the process as a way to refocus and re-invigorate the project ahead of new, exciting developments such as cloud-hosted delivery.
Please share your ideas via any of Hydra’s mailing lists. If you use Slack you may like to look at a new Hydra channel called #branding where some interesting ideas are being discussed.
All versions have been updated. For specific information about workstream work, please see: MarcEdit Workstreams: MacOS and Windows/Linux
MarcEdit Mac Changelog:
* Bug Fix: Delimited Text Translator: The 3rd delimiter wasn’t being set reliably. This should be corrected.
* Enhancement: Accessibility: Users can now change the font and font sizes in the application.
* Enhancement: Delimited Text Translator: Users can enter position and length on all fields.
MarcEdit Windows/Linux Changelog:
* Enhancement: Plugin management: automated updates, support for 3rd party plugins, and better plugin management have been added.
* Bug Fix: Delimited Text Translator: The 3rd delimiter wasn’t being set reliably. This should be corrected.
* Update: Field Count: Field count has been updated to improve counting when dealing with formatting issues.
* Enhancement: Delimited Text Translator: Users can enter position and length on all fields.
Downloads are available via the automated updating tool or via the Downloads (http://marcedit.reeset.net/downloads) page.
Library Tech Talk (U of Michigan): The Joy of Insights: How to harness qualitative data in your work
Quantitative data gives you the hard numbers: what, how many times, when, generally who, and where. Quantitative data also leaves out the biggest and possibly most important factor: why.
Austin, TX Mark your calendars to submit abstracts for presentations, workshops, posters and demos for the 8th Annual VIVO Conference by March 26, 2017!
Austin, TX As the Hydra-in-a-Box project prepares for major developments in 2017 – release of the Hyku repository minimum viable product, a HykuDirect hosted service pilot program, and a higher-performing aggregation system at DPLA – we welcome three stars who recently joined the project team. Please join us in welcoming Michael Della Bitta, Heather Greer Klein, and Kelcy Shepherd.
DuraSpace News: Announcing: DuraSpace Hot Topics Webinar Series, "Introducing DSpace 7: Next Generation UI"
DuraSpace is pleased to announce its latest Hot Topics Webinar Series, "Introducing DSpace 7: Next Generation UI"
Curated by Claire Knowles, Library Digital Development Manager, The University of Edinburgh.
Austin, TX A cornerstone of the DuraSpace mission is focused around expanding collaborations with academic, scientific, cultural, technology, and research communities in support of projects and services to help ensure that current and future generations will have access to our collective digital heritage. The DuraSpace organization seeks a Business Development Manager to cultivate and deepen those relationships and partnerships particularly with international organizations and consortia to elevate the organization’s profile and to expand the services it offers.
This week’s episode of Metric: A User Experience Podcast with Amanda L. Goodman (@godaisies) gives you a peek into the work of the LITA Persona Task Force, who are charged with defining and developing personas that are to be used in growing membership in the Library and Information Technology Association.
bento_search is the gem for making embedding of external searches in Rails a breeze, focusing on search targets and use cases involving ‘scholarly’ or bibliographic citation results.
Bento_search isn’t dead, it just didn’t need much updating. But thanks to some work for a client using it, I had the opportunity to do some updates.
Bento_search 1.7.0 includes testing under Rails 5 (the earlier versions probably would have worked fine in Rails 5 already), some additional configuration options, a lot more fleshing out of the EDS adapter, and a new ConcurrentSearcher demonstrating proper use of the new Rails 5 concurrency API (the older BentoSearch::MultiSearcher is now deprecated).
See the CHANGES file for full list.
As with all releases of bento_search to date, it should be strictly backwards compatible and an easy upgrade. (Although if you are using Rails earlier than 4.2, I’m not completely confident, as we aren’t currently doing automated testing of those versions.)
In this video, the speaker shows that by searching on "black on white violence" in Google the top items are all from racist sites. Each of these links only to other racist sites. The speaker claims that Google's algorithms will favor similar sites to ones that a user has visited from a Google search, and that eventually, in this case, the user's online searching will be skewed toward sites that are racist in nature. The claim is that this is what happened to Dylann Roof, the man who killed 9 people at an historic African-American church: he entered a closed information system that consisted only of racist sites. It ends by saying: "It's a fundamental problem that Google must address if it is truly going to be the world's library."
I'm not going to defend or deny the claims of the video, and you should watch it yourself because I'm not giving a full exposition of its premise here (and it is short and very interesting). But I do want to question whether Google is or could be "the world's library", and also whether libraries do a sufficient job of presenting users with a well-rounded information space.
It's fairly easy to dismiss the first premise - that Google is or should be seen as a library. Google is operating in a significantly different information ecosystem from libraries. While there is some overlap between Google and library collections, primarily because Google now partners with publishers to index some books, there is much that is on the Internet that is not in libraries, and a significant amount that is in libraries but not available online. Libraries pride themselves on providing quality information, but we can't really take the lion's share of the credit for that; the primary gatekeepers are the publishers from whom we purchase the items in our collections. In terms of content, most libraries are pretty staid, collecting only from mainstream publishers.
I decided to test this out and went looking for works promoting Holocaust denial or Creationism in a non-random group of libraries. I was able to find numerous books about deniers and denial, but only research libraries seem to carry the books by the deniers themselves. None of these come from mainstream publishing houses. I note that the subject heading, Holocaust denial literature, is applied to both those items written from the denial point of view, as well as ones analyzing or debating that view.
Creationism gets a bit more visibility; I was able to find some creationist works in public libraries in the Bible Belt. Again, there is a single subject heading, Creationism, that covers both the pro- and the con-. Finding pro- works in WorldCat is a kind of "needle in a haystack" exercise.
Don't dwell too much on my findings - this is purely anecdotal, although a true study would be fascinating. We know that libraries to some extent reflect their local cultures, such as the presence of the Gay and Lesbian Archives at the San Francisco Public Library. But you often hear that libraries "cover all points of view," which is not really true.
The common statement about libraries is that we gather materials on all sides of an issue. Another statement is that users will discover them because they will reside near each other on the library shelves. Is this true? Is this adequate? Does this guarantee that library users will encounter a full range of thoughts and facts on an issue?
First, just because the library has more than one book on a topic does not guarantee that a user will choose to engage with multiple sources. There are people who seek out everything they can find on a topic, but as we know from the general statistics on reading habits, many people will not read voraciously on a topic. So the fact that the library has multiple items with different points of view doesn't mean that the user reads all of those points of view.
Second, there can be a big difference between what the library holds and what a user finds on the shelf. Many public libraries have a high rate of circulation of a large part of their collection, and some books have such long holds lists that they may not hit the shelf for months or longer. I have no way to predict what a user would find on the shelf in a library that had an equal number of books expounding the science of evolution vs those promoting the biblical concept of creation, but it is frightening to think that what a person learns will be the result of some random library bookshelf.
But the third point is really the key one: libraries do not cover all points of view, if by points of view you include the kind of misinformation that is described in the SPLC video. There are many points of view that are not available from mainstream publishers, and there are many points of view that are not considered appropriate for anything but serious study. A researcher looking into race relations in the United States today would find that the sites that attracted Roof provide important insights, as SPLC did, but you will not find that same information in a "reading" library.
Libraries have an idea of "appropriate" that they share with the publishing community. We are both scientific and moral gatekeepers, whether we want to admit it or not. Google is an algorithm functioning over an uncontrolled and uncontrollable number of conversations. Although Google pretends that its algorithm is neutral, we know that it is not. On Amazon, which does accept self-published and alternative press books, certain content like pornography is consciously kept away from promotions and best seller lists. Google has "tweaked" its algorithms to remove Holocaust denial literature from view in some European countries that forbid the topic. The video essentially says that Google should make wide-ranging cultural, scientific and moral judgments about the content it indexes.
I am of two minds about the idea of letting Google or Amazon be a gatekeeper. On the one hand, immersing a Dylann Roof in an online racist community is a terrible thing, and we see the result (although the cause and effect may be hard to prove as strongly as the video shows). On the other hand, letting Google and Amazon decide what is and what is not appropriate does not sit well at all. As I've said before, having gatekeepers whose motivations are trade secrets that cannot be discussed is quite dangerous.
There has been a lot of discussion lately about libraries and their supposed neutrality. I am very glad that we can have that discussion. With all of the current hoopla about fake news, Russian hackers, and the use of social media to target and change opinion, we should embrace the fact of our collection policies, and admit widely that we and others have thought carefully about the content of the library. It won't be the most radical in many cases, but we care about veracity, and that's something that Google cannot say.