Feed aggregator

District Dispatch: ALA to Congress in 2018: Continue to #FundLibraries

planet code4lib - Thu, 2018-01-11 15:10

2017 was an extraordinary year for America’s libraries. When faced with serious threats to federal library funding, ALA members and library advocates rallied in unprecedented numbers to voice their support for libraries at strategic points throughout the year*. Tens of thousands of phone calls and emails to Congress were registered through ALA’s legislative action center. ALA members visited Congress in Washington and back home to demonstrate the importance of federal funding.

The challenge to #FundLibraries in 2018 is great: not only is Congress late in passing an FY 2018 budget, it’s time to start working on the FY 2019 budget.

ALA members have a lot to be proud of. Thanks to library advocates, Congress did not follow the administration’s lead in March 2017, when the president made a bold move to eliminate the Institute of Museum and Library Services (IMLS) and virtually all federal library funding. In every single state and congressional district, ALA members spoke up in support for federal library funding. We reminded our senators and representatives how indispensable libraries are for the communities they represent. And our elected leaders listened. By the time FY 2018 officially began in October 2017, the Appropriations Committees from both chambers of Congress had passed bills that maintained (and in the Senate, increased by $4 million) funding for libraries.

Despite our strong advocacy, we have not yet saved library funding for FY 2018. We’re more than three months into the fiscal year, and the U.S. government still does not have an FY 2018 budget. Because the House and Senate have not reconciled their FY 2018 spending bills, the government is operating under a “continuing resolution” (CR) of the FY 2017 budget. What happens when that CR expires on January 19, 2018 is a matter of intense speculation; options include a bi-partisan budget deal, another CR or a possible government shutdown.

While government may seem to be paralyzed, this is no time for library advocates to take a break. The challenge in 2018 is even greater than 2017: not only is Congress late in passing an FY 2018 budget, it’s time to start working on the FY 2019 budget. The president is expected to release his FY 2019 budget proposal in February, and we have no reason to believe that libraries have moved up on the list of priorities for the administration.

2018 is a time for all of us to take our advocacy up a notch. Over the coming weeks, ALA’s Washington Office will roll out resources to help you tell your library story and urge your members of Congress to #FundLibraries. In the meantime, here’s what you can do:

Stay informed. The U.S. budget and appropriations process is more dynamic than ever this year. There is a strong chance that we will be advocating for library funding for FY 2018 and FY 2019 at the same time. Regularly visit DistrictDispatch.org, the Washington Office blog, where we will post the latest information on ALA’s #FundLibraries campaign and sign up for ALA’s Legislative Action Center.

Stay involved. What you show your decision-makers at home is an important part of our year-round advocacy program because it helps supplement the messages that your ALA Washington team is sharing with legislators and their staff on the Hill. Keep showing them how your library – and IMLS funding – is transforming your community. Plan to attend National Library Legislative Day 2018 in Washington (May 7-8) or participate virtually from home.

Stay proud of your influence. Every day you prove that libraries are places of innovation, opportunity and learning – that libraries are a smart, high-return investment for our nation. When librarians speak, decision-makers listen!

*2017: Federal appropriations and library advocacy timeline

March: The president announced in his first budget proposal that he wanted to eliminate IMLS and virtually all federal funding for libraries.

April: ALA members asked their representatives to sign two Dear Appropriator letters sent from library champions in the House to the Chair and Ranking Members of the House Appropriations Subcommittee that deals with library funding (Labor, Health & Human Services, Education and Related Agencies, or “Labor-HHS”). One letter was in support of the Library Services and Technology Act (LSTA), and one letter was for the Innovative Approaches to Literacy program (IAL).

House Results: One-third of the entire House of Representatives, from both parties, signed each Dear Appropriator letter, and nearly 170 Members signed at least one.

May: More than 500 ALA members came to Washington, D.C. to meet their members of Congress for ALA’s 2017 National Library Legislative Day. Nearly identical Dear Appropriator letters were sent to Senate Labor-HHS Appropriations Subcommittee leaders.

Senate Results: 45 Senators signed the LSTA letter, and 37 signed the IAL letter.

July: The House Labor-HHS Subcommittee and then the full Committee passed their appropriations bill, which included funding for IMLS, LSTA and IAL at 2017 levels.

September: The House passed an omnibus spending package, which included 12 appropriations bills. The Senate Labor-HHS Subcommittee and then the full Committee passed their appropriations bill, which included a $4 million increase for LSTA above the 2017 level. Unable to pass FY 2018 funding measures, Congress passed a continuing resolution, averting a government shutdown.

December: Congress passed two additional CRs, which run through January 19, 2018.

The post ALA to Congress in 2018: Continue to #FundLibraries appeared first on District Dispatch.

Open Knowledge Foundation: 2017: A Year to Remember for OK Nepal

planet code4lib - Thu, 2018-01-11 09:24

This blog has been cross-posted from the OK Nepal blog as part of our blog series of Open Knowledge Network updates.

Best wishes for 2018 from OK Nepal to all of the Open Knowledge family and friends!!

The year 2017 was one of the best years for Open Knowledge Nepal. We started our journey by registering Open Knowledge Nepal as a non-profit organization under the Nepal Government, and as we reflect on 2017, it has been “A Year to Remember”. We were able to achieve many things, and we promise to continue our hard work to improve the State of Open Data in South Asia in 2018 as well.

Some of the key highlights of 2017 are:

  1. Organizing Open Data Day 2017

For the 5th time in a row, the Open Knowledge Nepal team led the effort of organizing International Open Data Day, held in Pokhara, Nepal. This year it was a collaborative effort of Kathmandu Living Labs and Open Knowledge Nepal. It was also the first official event of Open Knowledge Nepal held outside the Kathmandu Valley.

  2. Launching Election Nepal Portal

On 13th April 2017 (31st Chaitra 2073), a day before Nepalese New Year 2074, we officially released the Election Nepal Portal in collaboration with Code for Nepal and made it open for contribution. Election Nepal is a crowdsourced citizen engagement portal that includes the Local Elections data. The portal will have three major focus areas: visualizations, datasets, and Twitter feeds.

  3. Contributing to Global Open Data Index

On May 2nd, 2017, Open Knowledge International launched the 4th edition of the Global Open Data Index (GODI), a global assessment of open government data publication. Nepal has been part of this global assessment for four consecutive years, with lots of ups and downs, and we have been leading it since the very beginning. With 20% openness, Nepal was ranked 69th in the 2016 Global Open Data Index. This year we also helped Open Knowledge International by coordinating the South Asia region, and for the first time we were able to get contributions from Bhutan and Afghanistan.

  4. Launching Local Boundaries

To help journalists and researchers visualize the geographical data of Nepal on a map, we built Local Boundaries, where we share shapefiles of Nepal’s federal structure and others. Local Boundaries brings the detailed geodata of administrative units, or maps of all administrative boundaries defined by the Nepal Government, in an open and reusable format, free of cost. The local boundaries are available in two formats (TopoJSON and GeoJSON) and can easily be reused to map local authority data in OpenStreetMap, Google Maps, Leaflet or MapBox interactively.

  5. Launching Open Data Handbook Nepali Version

After a year of work, followed by a series of discussions and consultations, on 7 August 2017 Open Knowledge Nepal launched the first version of the Nepali Open Data Handbook – an introductory guidebook used by governments and civil society organizations around the world as an introduction and blueprint for open data projects. The handbook was translated through the collaborative effort of volunteers and contributors. The Nepali Handbook is now available at http://handbook.oknp.org

  6. Developing Open Data Curriculum and Open Data Manual

To organize the open data awareness program in a structured format and to generate resources which can be further used by civil society and institutions, Open Knowledge Nepal prepared an Open Data Curriculum and an Open Data Manual. They cover basic aspects of open data, such as its introduction, importance, principles, and application areas, as well as technical aspects like the extraction, cleaning, analysis, and visualization of data. They work as a reference and a recommended guide for university students, the private sector, and civil society.

  7. Running Open Data Awareness Program

The Open Data Awareness Program, the first of its kind conducted in Nepal, was held in 11 colleges and 2 youth organizations, reaching more than 335 youths. Representatives of Open Knowledge Nepal visited 7 districts of Nepal with the Open Data Curriculum and the Open Data Manual to train youths on the importance and use of open data.

  8. Organizing Open Data Hackathon

The Open Data Hackathon was organized with the theme “Use data to solve local problems faced by Nepali citizens” at Yalamaya Kendra (Dhokaima Cafe), Patan Dhoka on November 25th, 2017. In this hackathon, we brought students and youths from different backgrounds under the same roof to work collaboratively on different aspects of open data.

  9. Co-organizing Wiki Data-a-thon

On 30th November 2017, we co-organized a Wiki Data-a-thon with Wikimedians of Nepal at Nepal Connection, Thamel, on the occasion of Global Legislative Openness Week (GLOW). During the event, we scraped the data of the last CA election and pushed it to Wikidata.

  10. Supporting Asian Regional Meeting

On 2nd and 3rd December 2017, we supported Open Access Nepal in organizing the Asian Regional Meeting on Open Access, Open Education and Open Data with the theme “Open in Action: Bridging the Information Divide”. Delegates came from different countries, including the USA, China, South Africa, India, Bangladesh, and Nepal. We managed the Nepali delegates and participants.

2018 Planning

We are looking forward to a prosperous 2018, in which we plan to reach out across the countries of South Asia to improve the state of open data in the region through focused open data training, research, and projects. For this, we will be collaborating with all possible CSOs working in Asia and will serve as an intermediary for different international organizations who want to promote or increase their activities in Asian countries. This will help the Open Knowledge Network in the long run, and we will also get opportunities to learn from each other’s successes and failures, promote each other’s activities, brainstorm collaborative projects and make the relationships between countries stronger.

Besides this, we will also continue our data literacy work, such as the Open Data Awareness Program, to make Nepalese citizens more data-demanding and data-savvy, and launch a couple of new projects to help people understand the available data.

To stay updated about our activities, please follow us on our various media channels:

 

Terry Reese: MarcEdit Updates (All versions)

planet code4lib - Thu, 2018-01-11 05:35

I’ve posted updates for all versions of MarcEdit, including MarcEdit MacOS 3.

MarcEdit 7 (Windows/Linux) changelog:
  • Bug Fix: Export Settings: Export was capturing both MarcEdit 6.x and MarcEdit 7.x data.
  • Enhancement: Task Management: added some continued refinements to improve speed and processing
  • Bug Fix: OCLC Integration: Corrected an issue occurring when trying to post bib records using previous profiles.
  • Enhancement: Linked Data XML Rules File Editor completed
  • Enhancement: Linked Data Framework: Formal support for local linked data triple stores for resolution

One of the largest enhancements is the updated editor to the Linked Data Rules File and the Linked Data Framework. You can hear more about these updates here:

MarcEdit MacOS 3:

Today also marks the availability of MarcEdit MacOS 3. You can read about the update here: MarcEdit MacOS 3 has Arrived!

If you have questions, please let me know.

–tr

Terry Reese: MarcEdit MacOS 3 has Arrived!

planet code4lib - Thu, 2018-01-11 05:01

MarcEdit MacOS 3 is the latest branch of the MarcEdit 7 family. MarcEdit MacOS 3 represents the next generational update for MarcEdit on the Mac and is functionally equivalent to MarcEdit 7. MarcEdit MacOS 3 introduces the following features:

  1. Startup Wizard
  2. Clustering Tools
  3. New Linked Data Framework
  4. New Task Management and Task Processing
  5. Task Broker
  6. OCLC Integration with OCLC Profiles
  7. OCLC Integration and search in the MarcEditor
  8. New Global Editing Tools
  9. Updated UI
  10. More

 

There are also a couple things that are currently missing that I’ll be filling in over the next couple of weeks. Presently, the following elements are missing in the MacOS version:

  1. OCLC Downloader
  2. OCLC Bib Uploader (local and non-local)
  3. OCLC Holdings update (update for profiles)
  4. Task Processing Updates
  5. Need to update Editor Functions
    1. Dedup tool – Add/Delete Function
    2. Move tool — Copy Field Function
    3. RDA Helper — 040 $b language
    4. Edit Shortcuts — generate paired ISBN-13
    5. Replace Function — Exact word match
    6. Extract/Delete Selected Records — Exact word match
  6. Connect the search dropdown
    1. Add to the MARC Tools Window
    2. Add to the MarcEditor Window
    3. Connect to the Main Window
  7. Update Configuration information
  8. XML Profiler
  9. Linked Data File Editor
  10. Startup Wizard

Rather than hold the update until these elements are completed, I’m making the MarcEdit MacOS version available now so that users can begin testing and interacting with the tooling while I finish adding these remaining elements to the application. Once completed, all versions of MarcEdit will share the same functionality, save for elements that rely on technology or practices tied to a specific operating system.

Updated UI

MarcEdit MacOS 3 introduces a new UI. While the UI is still reflective of MacOS best practices, it also shares many of the design elements developed as part of MarcEdit 7. This includes new elements like the StartUp wizard with the Fluffy Install agent:

 

The Setup Wizard provides users the ability to customize various application settings, as well as import previous settings from earlier versions of MarcEdit.

 

Updates to the UI

New Clustering tools

MarcEdit MacOS 3 provides MacOS users more tools, more help, more speed…it gives you more, so you can do more.
Downloading:

Download the latest version of MarcEdit MacOS 3 from the downloads page at: http://marcedit.reeset.net/downloads

-tr

Library of Congress: The Signal: Digital Scholarship Resource Guide: Making Digital Resources, Part 2 of 7

planet code4lib - Wed, 2018-01-10 22:25

This is part two in a seven-part resource guide for digital scholarship by Samantha Herron, our 2017 Junior Fellow. Part one is available here, and the full guide is available as a PDF download.

Creating Digital Documents

Internet Archive staff members such as Fran Akers, above, scan books from the Library’s General Collections that were printed before 1923. The high-resolution digital books are made available online at www.archive.org within 72 hours of scanning.

The first step in creating an electronic copy of an analog (non-digital) document is usually scanning it to create a digitized image (for example, a .pdf or a .jpg). Scanning a document is like taking an electronic photograph of it–now it’s in a file format that can be saved to a computer, uploaded to the Internet, or shared in an e-mail. In some cases, such as when you are digitizing a film photograph, a high-quality digital image is all you need. But in the case of textual documents, a digital image is often insufficient, or at least inconvenient. In this stage, we only have an image of the text; the text isn’t yet in a format that can be searched or manipulated by the computer (think: trying to copy & paste text from a picture you took on your camera–it’s not possible).

Optical Character Recognition (OCR) is an automated process that extracts text from a digital image of a document to make it readable by a computer. The computer scans through an image of text, attempts to identify the characters (letters, numbers, symbols), and stores them as a separate “layer” of text on the image.

Example: Here is a digitized copy of Alice in Wonderland in the Internet Archive. Notice that though this ebook is made up of scanned images of a physical copy, you can search the full text contents in the search bar. The OCRed text is “under” this image, and can be accessed if you select “FULL TEXT” from the Download Options menu. Notice that you can also download a .pdf, .epub, or many other formats of the digitized book.

Though the success of OCR depends on the quality of the software and the quality of the photograph–even sophisticated OCR has trouble navigating images with stray ink blots or faded type–these programs are what allow digital archives users to not only search through catalog metadata, but through the full contents of scanned newspapers (as in Chronicling America) and books (as in most digitized books available from libraries and archives).
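To make the extraction step concrete, here is a minimal sketch of running OCR programmatically. It assumes the open-source tesseract.js library, used purely for illustration (it is not the software behind the examples above), and the filename is a placeholder; the exact API varies between versions of the library.

// A rough sketch of OCR with tesseract.js: read a scanned page image and
// print the extracted text layer. 'scanned-page.jpg' is a placeholder path.
const Tesseract = require('tesseract.js')

Tesseract.recognize('scanned-page.jpg', 'eng')
  .then(({ data: { text } }) => {
    // 'text' holds the recognized characters, ready to be searched or indexed
    console.log(text)
  })
  .catch(err => console.error(err))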

ABBYY FineReader, an OCR software package.

As noted, the automated OCR text often needs to be “cleaned” by a human reader. Especially with older, typeset texts that have faded or mildewed or are otherwise irregular, the software may mistake characters or character combinations for others (e.g. the computer might take “rn” to be “m” or “cat” to be “cot” and so on). Though OCR is often left “dirty,” uncorrected text prevents comprehensive searches: if one were searching a set of OCRed texts for every instance of the word “happy,” the computer would not return any of the instances where “happy” had been read as “hoppy” or “hoopy” (and conversely, would inaccurately return places where the computer had read “hoppy” as “happy”). Humans can clean OCR by hand and thereby “train” the computer to interpret characters more accurately (see: machine learning).

In this image of some OCR output, we can see some of the errors: the “E”s in the title were interpreted as “Q”s, and in the third line a “t” was interpreted by the computer as an “f”.

Example of raw OCR text.

Even with imperfect OCR, digital text is helpful for both close reading and distant reading. In addition to more complex computational tasks, digital text allows users to, for instance, find the page number of a quote they remember, or find out if a text ever mentions Christopher Columbus. Text search, enabled by digital text, has changed the way that researchers use databases and read documents.

Metadata + Text Encoding

Bibliographic search–locating items in a collection–is one of the foundational tasks of libraries. Computer-searchable library catalogs have revolutionized this task for patrons and staff, enabling users to find more relevant materials more quickly.

Metadata is “data about data”. Bibliographic metadata is what makes up catalog records, from the time of card catalogs to our present-day electronic databases. Every item in a library’s holdings has a bibliographic record made up of this metadata–key descriptors of an item that help users find it when they need it. For example, metadata about a book might include its title, author, publishing date, ISBN, shelf location, and so on. In an electronic catalog search, this metadata is what allows users to increasingly narrow their results to materials targeted to their needs: rich, accurate metadata, produced by human catalogers, allows users to find in a library’s holdings, for example, (1) any text material, (2) written in Spanish, (3) about Jorge Luis Borges, (4) published between 1990 and 2000.

Washington, D.C. Jewal Mazique [i.e. Jewel] cataloging in the Library of Congress. Photo by John Collier, Winter 1942. //hdl.loc.gov/loc.pnp/fsa.8d02860

Metadata needs to be in a particular format to be read by the computer. A markup language is a system for annotating text to give the computer instructions about what each piece of information is. XML (eXtensible Markup Language) is one of the most common ways of structuring catalog metadata, because it is legible to both humans and machines.

XML uses tags to label data items. Tags can be embedded inside each other as well. In the example below, <recipe> is the first tag. All of the tags between <recipe> and its end tag </recipe> (<title>, <ingredient list>, and <preparation>) are components of <recipe>. Further, <ingredient> is a component of <ingredient list>.

MARC (MAchine Readable Cataloging), a set of standards developed in the 1960s by Henriette Avram at the Library of Congress, is the international standard data format for the description of items held by libraries. Here are the MARC tags for one of the hits from our Jorge Luis Borges search above:

https://catalog.loc.gov/vwebv/staffView?searchId=9361&recPointer=0&recCount=25&bibId=11763921

The three numbers in the left column are “datafields” and the letters are “subfields”. Each field-subfield combination refers to a piece of metadata. For example, 245$a is the title, 245$b is the subtitle, 260$a is the place of publication, and so on. The rest of the fields can be found here.

Here is some example XML.
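As a sketch of the recipe structure described earlier, the markup might look something like this (the tag written with a space above is given an underscore here so the XML is well-formed, and the recipe content is placeholder text):

<recipe>
  <title>Vegetable soup</title>
  <ingredient_list>
    <ingredient>2 carrots</ingredient>
    <ingredient>1 onion</ingredient>
  </ingredient_list>
  <preparation>Chop the vegetables and simmer for an hour.</preparation>
</recipe>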

MARCXML is one way of reading and parsing MARC information, popular because it’s an XML schema (and therefore readable by both human and computer). For example, here is the MARCXML file for the same book from above: https://lccn.loc.gov/99228548/marcxml

The datafields and subfields are now XML tags, acting as ‘signposts’ for the computer about what each piece of information means. MARCXML files can be read by humans (provided they know what each datafield means) as well as computers.
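An abbreviated, illustrative MARCXML fragment (not the full record linked above, and with placeholder values) shows how datafields and subfields become nested tags:

<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">Main title of the book :</subfield>
    <subfield code="b">a subtitle</subfield>
  </datafield>
  <datafield tag="260" ind1=" " ind2=" ">
    <subfield code="a">Place of publication :</subfield>
    <subfield code="b">Publisher</subfield>
  </datafield>
</record>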

The Library of Congress has made available their 2014 Retrospective MARC files for public use: http://www.loc.gov/cds/products/marcDist.php

Examples: The Library of Congress’s MARC data could be used for cool visualizations like Ben Schmidt’s visual history of MARC cataloging at the Library of Congress. Matt Miller used the Library’s MARC data to make a dizzying list of every cataloged book in the Library of Congress.

An example of the uses of MARC metadata for non-text materials is Yale University’s Photogrammar, which uses the location information from the Library of Congress’ archive of US Farm Security Administration photos to create an interactive map.

TEI (Text Encoding Initiative) is another important example of XML-style markup. In addition to capturing metadata, TEI guidelines standardize the markup of a text’s contents. Text encoding tells the computer who is speaking, where a stanza begins and ends, and which parts of the text are stage directions in a play, for example.

Example: Here is a TEI file of Shakespeare’s Macbeth from the Folger Shakespeare Library. Different tags and attributes (the further specifiers within the tags) describe the speaker, what word they are saying, in what scene, what part of speech the word is, etc. An encoded text like this can easily be manipulated to tell you which character says the most words in the play, which adjective is used most often across all of Shakespeare’s works, and so on. If you were interested in the use of the word ‘lady’ in Macbeth, an un-encoded plaintext version would not allow you to distinguish between references to “Lady” Macbeth and places where a character says the word “lady”. TEI versions allow you to do powerful explorations of texts–though good TEI copies take a lot of time to create.
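A heavily simplified sketch of this kind of drama markup (the Folger encoding itself is far richer, down to word-level tags) might look like:

<sp who="#Macbeth">
  <speaker>MACBETH</speaker>
  <l>Is this a dagger which I see before me,</l>
  <l>The handle toward my hand? Come, let me clutch thee.</l>
</sp>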

Understanding the various formats in which data is entered and stored allows us to imagine what kinds of digital scholarship are possible with library data.

Example: The Women Writers Project encodes texts by early modern women writers in TEI and includes some text analysis tools.

Next week’s installment in the Digital Scholarship Resource Guide will show you what you can do with digital data now that you’ve created it. Stay tuned!

LITA: Jobs in Information Technology: January 10, 2018

planet code4lib - Wed, 2018-01-10 20:09

New vacancy listings are posted weekly on Wednesday at approximately 12 noon Central Time. They appear under New This Week and under the appropriate regional listing. Postings remain on the LITA Job Site for a minimum of four weeks.

New This Week

University of Arkansas, Assistant Head of Special Collections, Fayetteville, AR

West Chester University, Electronic Resources Librarian, West Chester, PA

Miami University Libraries, Web Services Librarian, Oxford, OH

Visit the LITA Job Site for more available jobs and for information on submitting a job posting.

Peter Murray: Anxious Anger – or: why does my profession want to become a closed club

planet code4lib - Wed, 2018-01-10 18:38

I’m in the Austin, Texas, airport – having just left the closing session of the Re-Think It conference – and I’m wondering what the heck is happening to my chosen profession. When did we turn into an exclusive members-only club with unrealistic demands on professionalism and a secret handshake?

The closing keynote featured current president of the American Library Association (ALA) Jim Neal and past president Julie Todaro on the topic Library Leadership in a Period of Transformation. The pair were to address questions like “What trends are provoking new thinking about the 21st century library?” and “Do 20th century visions and skills still matter?” I expected to be uplifted and inspired. Instead, I came away feeling anxious and angry about their view of the library profession and the premier library association, ALA.

To start with a bit of imposter syndrome exposure: I’ve been working in and around libraries for 25 years, but I don’t follow the internal workings and the politics of the principal librarian professional organization(s) in the United States. I read about the profession — enough to know that primary school librarians are under constant threat of elimination in many school districts and that usage of public libraries, particularly public libraries that are taking an expansive view of their role in the community, is through the roof. I hear the grumbles about how library schools are not preparing graduates of masters programs for “real world” librarianship, but in my own personal experience, I am indebted to the faculty at Simmons College for the education I received there. The pay inequity sucks. The appointment of a professional African American librarian to head the Library of Congress is to be celebrated, and the general lack of diversity in the professional ranks is a point to be worked on. My impression of ALA is of an unnecessarily large and bureaucratic organization with some seriously important bright spots (the ALA Washington Office for example), and that the governance of ALA is plodding and cliquish, but for which some close colleagues find professional satisfaction for their extra energies. I’m pretty much hands off ALA, particularly in the last 15 years, and view it (in the words of Douglas Adams) as Mostly Harmless.

So anxious and angry are unexpected feelings for this closing keynote. I don’t think there is a recording of Jim’s and Julie’s remarks, so here in the airport, the only thing I have to go on are my notes. I started taking notes at the beginning of their talks expecting there would be uplifting ideas and quotes that I could attribute to them as I talk with others about the aspirations of the FOLIO project (a crucial part of my day job). Instead, Julie kicked things off by saying the key task that she works on at her day job is maintaining faculty status for librarians. She emphasized the importance of credentialing and of using the usefulness of skills to a library’s broader organization as a measure of value. Jim spoke of the role of library schools and library education in defining classes of people: librarians, paraprofessionals, students, and the like, and said that the ALA should be at the heart of minting credentials to be used (I think) as gatekeepers into “professional” jobs.

Hogwash. If I were to identify my school of thought, I’d say I come from the big melting pot of professional librarianship. I started in libraries just out of college with a degree in systems analysis, working for my alma mater as they were bringing up their first automation system. In the first decade of my career, I worked in three academic libraries — each of which did a fantastic job of instilling in me the raw knowledge and the embedded ethos of the library profession — before choosing to get a library degree. Some of the best librarians I know are not classically trained librarians, and in quiet voices will timidly offer that they do not have a degree. I met one such person during the Re-Think It conference, in fact, that I hope becomes a close colleague. They are drawn to the profession from other disciplines and bring a wealth of skills and insights that make the library profession stronger. I’ve hired too many people using the phrase “or equivalent experience” to know that a library degree is not the only gateway to a successful team member. The value of the people I hired came from the skills they earned through experience and their outlook to grow as the library itself wanted to grow.

Julie closed her initial remarks by saying that “success — world domination — begins with attention to detail.” Jim spoke wistfully at the lack of an uber-OCLC that would be at the heart of all library technical services work. Such statements make me think that raving megalomania is a prerequisite for ALA president. I’m not sure this is the profession I want to be in.

Julie and Jim both had statements that I wholeheartedly agree with…at least in content if not delivery. As a profession “we’re going to be more unseen as things go digital” (Julie) and that is a challenge to take on. “Stop strategic planning; it is a waste of time” (Jim) and that our organizations need to be a loosely coupled structure of maverick units to move at a pace demanded by our users. Cooperation between libraries and removing duplicate effort is a key sustainability strategy and one that I take to heart in the FOLIO project. (I’m just not convinced that a national strategy of technical centers is desired, if even possible.)

I had no idea that such a panel could stir up such feelings, but there you go. In many ways, I hope that I misinterpreted the intent of Jim’s and Julie’s remarks, but the forcefulness with which they spoke them and the bodily reaction I had to hearing them leaves little room.

District Dispatch: 2018 WHCLIST award accepting nominations

planet code4lib - Wed, 2018-01-10 14:30

Those interested in participating in National Library Legislative Day 2018 take note – nominations are now being accepted for the 2018 WHCLIST award. The award, sponsored by the White House Conference on Library and Information Services Taskforce (WHCLIST) and the ALA Washington Office, is open to non-librarian, first-time participants of National Library Legislative Day (NLLD). WHCLIST winners receive a stipend of $300 and two free nights at the Liaison Hotel, where NLLD 2018 will be hosted.

WHCLIST 2017 winner Lori Rivas and Past President Julie Todaro.

Over the years, WHCLIST has been an effective force in library advocacy on the national stage, as well as statewide and locally. To transmit its spirit of dedicated, passionate library support to a new generation of advocates, WHCLIST provided its assets to the ALA Washington Office to fund this award. Both ALA and WHCLIST are committed to ensuring the American people get the best library services possible.

To apply for the WHCLIST Award, nominees must meet the following criteria:

  • The recipient should be a library supporter (trustee, friend, general advocate, etc.) and not a professional librarian (this includes anyone currently employed by a library).
  • The recipient should be a first-time attendee of NLLD.
  • The recipient should have a history of supporting librarians and library work in their community.

Representatives of WHCLIST and the ALA Washington Office will choose the recipient. The winner of the WHCLIST Award will be announced at National Library Legislative Day by the President of the American Library Association.

The deadline for applications is April 2, 2018.

To apply for the WHCLIST award, please submit a completed NLLD registration form; a letter explaining why you should receive the award; and a letter of reference from a library director, school librarian, library board chair, Friend’s group chair, or other library representative to:

Lisa Lindle
Manager, Advocacy and Grassroots Outreach
American Library Association
1615 New Hampshire Ave., NW
First Floor
Washington, DC 20009
llindle@alawash.org

Note: Applicants must register for NLLD and pay all associated costs. Applicants must make their own travel arrangements. The winner will be reimbursed for two free nights in the NLLD hotel in D.C. and will receive the $300 stipend to defray the costs of attending the event.

The post 2018 WHCLIST award accepting nominations appeared first on District Dispatch.

Ed Summers: Static React

planet code4lib - Wed, 2018-01-10 05:00

This post contains some brief notes about building offline, static web sites using React, in order to further the objectives of minimal computing. But before I go there, first let me give you a little background…

The Lakeland Community Heritage Project is an effort to collect, preserve, and interpret the heritage and history of those African Americans who have lived in the Lakeland community of Prince George’s County, Maryland from the late 19th century to the present. This effort has been led by members of the Lakeland community, with help from students from the University of Maryland working with Professor Mary Sies to collect photographs, maps, deeds, and oral histories and publish them in an Omeka instance at lakeland.umd.edu. As Mary nears retirement she has become increasingly interested in making these resources available and useful to the community of Lakeland, rather than embedded in a software application that is running on servers owned by UMD.

Recently MITH has been in conversation with LCHP to help explore ways that this data stored in Omeka could be meaningfully transferred to the Lakeland community. This has involved first getting the Omeka site back online, since it partially fell offline as the result of some infrastructure migrations at UMD. We also have been collecting and inventorying disk drives of content and transfer devices used by the students as they have collected material over the years.

One relatively small experiment I tried recently was to extract all the images and their metadata from Omeka to create a very simple visual display of the images that could run in a browser without an Internet connection. The point was to provide a generous interface from which community members attending a meeting could browse content quickly and potentially take it away with them. Since we were going to be doing this in an environment where there wasn’t stable network access, it was important for the content to be browsable without an Internet connection. We wanted to be able to put the application on a thumb drive, and move it around as a zip file, which could also ultimately allow us to make it available to community members independent of the files needing to be kept online on the Internet.

The first step was getting all the data out of Omeka. This was a simple matter with Omeka’s very clean, straightforward and well documented REST API. Unfortunately, LCHP was running an older version of Omeka (v1.3.1) that needed to be upgraded to 2.x before the API was available. The upgrade process itself leapfrogged a bunch of versions so I wasn’t surprised to run into a small snag, which I was fortunately able to fix myself (go team open source).

I wrote a small utility named nyaraka that talks to Omeka and downloads all the items (metadata and files) as well as the collections they are a part of, and places them on the filesystem. This was a fairly straightforward process because Omeka’s database ensures the one-to-many relationships between a site and its collections, items, and files, which means they can be written to the filesystem in a structured way:

omeka.example.org
omeka.example.org/site.json
omeka.example.org/collections
omeka.example.org/collections/1
omeka.example.org/collections/1/collection.json
omeka.example.org/collections/1/items
omeka.example.org/collections/1/items/1
omeka.example.org/collections/1/items/1/item.json
omeka.example.org/collections/1/items/1/files
omeka.example.org/collections/1/items/1/files/1
omeka.example.org/collections/1/items/1/files/1/fullsize.jpg
omeka.example.org/collections/1/items/1/files/1/original.jpg
omeka.example.org/collections/1/items/1/files/1/file.json
omeka.example.org/collections/1/items/1/files/1/thumbnail.jpg
omeka.example.org/collections/1/items/1/files/1/square_thumbnail.jpg

This post was really meant to be about building a static site with React, and not about extracting data from Omeka. But this filesystem data is kinda like a static site, right? It was really just building the foundation for the next step of building the static site application, since I didn’t really want to keep downloading content from the API as I was developing my application. Having all the content local made it easier to introspect with command line tools like grep, find and jq as I was building the static site.

Before I get into a few of the details here’s a short video that shows what the finished static site looked like:

Lakeland Static Site Demo from Ed Summers on Vimeo.

You can see that content is loaded dynamically as the user scrolls down the page. Lots of content is presented at once in random orderings each time to encourage serendipitous connections between items. Items can also be filtered based on type (buildings, people and documents). If you want to check it out for yourself download this zip file and open up the index.html in the root of your home directory. Go ahead and turn off your wi-fi connection so you can see it working without an Internet connection.

When building static sites in the past I’ve often reached for Jekyll but this time I was interested in putting together a small client side application that could be run offline. This shouldn’t be seen as an either/or situation: it would be quite natural to create a static site using Jekyll that embeds a React application within it. But for the sake of experimentation I wanted to see how far I could go just using React.

Ever since I first saw Twitter’s personal archive download (aka Grailbird) I’ve been thinking about the potential of offline web applications to function as little time capsules for web content that can live independently of the Internet. Grailbird lets you view your Twitter content offline in a dynamic web application where you can view your tweets over time. Over the past few years the minimal computing movement has been gaining traction in the digital humanities community, as a way to ethically and sustainably deliver web content without necessarily needing to make promises of keeping it online forever.

React seemed like a natural fit because I’ve been using it for the past year on another project. React offers a rich ecosystem of tools, plugins and libraries like Redux for building complex client side apps. The downside of using React is that it is not as easy for people to set up out of the box, or to change over time, if you aren’t an experienced software developer. With Jekyll it’s not simple, but at least it’s relatively easy to dive in and edit HTML and CSS. But on the plus side for React, if you really want to deliver an unchanging, finished (static) artifact, then maybe these things don’t really matter so much?

At any rate it seemed like a worthwhile experiment. So here are a few tidbits I learned when bending React to the purposes of minimal computing:

The first is to build a static representation of your data. Many React applications rely on an external REST API being available. This type of dependency is an obvious no-no for minimal computing applications, because an Internet connection is needed, and someone needs to keep the REST API service up and running constantly, which is infrastructure and costs money.

One way of getting around this is to take all the structured data your application needs and bundle it up as a single file. You can see the one I created for my application here. As you can see it contains metadata for all the photographs expressed as JSON. But the JSON itself is part of a global JavaScript variable declaration which allows it to be loaded by the browser without relying on an asynchronous HTTP call. Browsers need to limit the ability of JavaScript to fetch files from the filesystem for security reasons. This JavaScript file is loaded immediately by your web browser when it loads the index.html, and the app can access it globally as window.DATA. Think of it like a static, read-only, in-memory database for your application. The wrapping HTML will look as simple as something like this:

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1">
    <title>Lakeland Community Heritage Project</title>
    <script src="static/data.js"></script>
  </head>
  <body>
    <div id="app"></div>
    <script type="text/javascript" src="bundle.js"></script>
  </body>
</html>
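For reference, the static/data.js file described above might look roughly like the sketch below; the field names are illustrative assumptions rather than the actual structure exported from Omeka.

// static/data.js -- an illustrative sketch; the real file is generated from
// the Omeka export and its fields may differ.
window.DATA = {
  items: [
    {
      id: 123,                  // Omeka item id, also used as the image directory name
      title: 'Example item title',
      type: 'buildings'         // used for filtering: buildings, people, documents
    }
    // ...one object per item
  ]
}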

Similarly, the image files need to be available locally. I took all the images and saved them into a directory I named static, and named the file using a unique item id (from Omeka) which allowed the metadata and data to be conceptually linked:

lakeland-images/static/{omeka-id}/fullsize.jpg

My React application has an Image component that simply renders the image along with a caption using the <figure>, <img>, and <figcaption> elements.

class Image extends Component {
  render() {
    return (
      <Link to={'/item/' + this.props.item.id + '/'}>
        <figure className={style.Image}>
          <img src={'static/' + this.props.item.id + '/fullsize.jpg'} />
          <figcaption>
            {this.props.item.title}
          </figcaption>
        </figure>
      </Link>
    )
  }
}

It’s pretty common to use webpack to build React applications, and the copy-webpack-plugin will handle copying the files from the static directory into the distribution directory during the build.

You may have noticed that in both cases the data.js and images are being loaded using a relative URL (without a leading slash, or a protocol/hostname). This is a small but important detail that allows the application to be moved around from zip file, to thumb drive to disk drive, without needing paths to be rewritten. The images and data are loaded relative to where the index.html was initially loaded from.

In addition many React applications these days use the new History API in modern browsers. This lets your application have what appear to be normal URLs structured with slashes which you can manage with react-router. However slash URLs are problematic in an offline static site for a couple of reasons. The first is that there is no server, so you can’t tweak it to respond to any request with the HTML file I included above that will bootstrap your application. This means that if you reload a page you will get a 404 not found.

The other problem is that while the History API works fine for an offline application, the relative links to bundle.js, data.js and the images will break because they will be relative to the new URL.

Fortunately there is a simple solution to this: manage the URLs the way we did before the History API, using hash fragments. So instead of:

file:///lakeland-images/index.html/items/123

you’ll have:

file:///lakeland-images/index.html#/items/123

This way the browser will look to load static/data.js from file:///lakeland-images/ instead of file://lakeland-images/index.html/items/. Luckily react-router lets you simply import and use createHashHistory in your application initialization and it will write these URLs for you.

It’s important to reiterate that this was an experiment. We don’t know if the LCHP is interested in us developing this approach further. But regardless I thought it was worth just jotting down these notes for others considering similar approaches with React and minimal computing applications.

I’ll just close by saying in some ways it seems counter-intuitive to refer to a React application as an example of minimal computing. After working with React off and on for a couple years it still seems quite complicated when you throw Redux into the mix. Assembling the boilerplate needed to get started is still tedious, unless you use create-react-app which is a smart way to start. It’s much easier to get Jekyll out of the box and start using it.

But static sites ultimately rely on a web browser, which is an insanely complicated piece of code. With a few exceptions (e.g. Flash) browsers have been pretty good at maintaining backwards compatibility as they’ve evolved along with the web. JavaScript is so central to a functioning web it’s difficult to imagine it going away. So really this approach is a bet on the browser and the web remaining viable. Whatever happens to the web and the Internet we can probably rely on some form of browser continuing to exist as functioning software, either natively, or in some sort of emulator, for a good time to come…or at least longer than the typical website will be kept online.

Ed Summers: Offline Sites with React

planet code4lib - Wed, 2018-01-10 05:00

This post contains some brief notes about building offline, static web sites using React, in order to further the objectives of minimal computing. But before I go there, first let me give you a little background…

The Lakeland Community Heritage Project is an effort to collect, preserve, and interpret the heritage and history of African Americans who have lived in the Lakeland community of Prince George’s County, Maryland since the late 19th century. This effort has been led by members of the Lakeland community, with help from students from the University of Maryland working with Professor Mary Sies. As part of the work they’ve collected photographs, maps, deeds, and oral histories and published them in an Omeka instance at lakeland.umd.edu. As Mary is wrapping up the UMD side of the project she has become increasingly interested in making these resources available and useful to the community of Lakeland, rather than leaving them embedded in a software application that is running on servers owned by UMD.

Sneakernet

Recently MITH has been in conversation with LCHP to help explore ways that this data stored in Omeka could be meaningfully transferred to the Lakeland community. This has involved first getting the Omeka site back online, since it partially fell offline as the result of some infrastructure migrations at UMD. We also have been collecting and inventorying disk drives of content and transfer devices used by the students as they have collected material over the years.

One relatively small experiment I tried recently was to extract all the images and their metadata from Omeka to create a very simple visual display of the images that could run in a browser without an Internet connection. The point was to provide a generous interface from which community members attending a meeting could browse content quickly and potentially take it away with them. Since this meeting was in an environment where there wasn’t stable network access, it was important for the content to be browsable without an Internet connection. We also wanted to be able to put the application on a thumb drive, and move it around as a zip file, which could also ultimately allow us to make it available to community members independent of the files needing to be kept online on the Internet at a particular location. Basically we wanted the site to be on the Sneakernet instead of the Internet.

Static Data

The first step was getting all the data out of Omeka. This was a simple matter with Omeka’s very clean, straightforward and well documented REST API. Unfortunately, LCHP was running an older version of Omeka (v1.3.1) that needed to be upgraded to 2.x before the API was available. The upgrade process itself leapfrogged a bunch of versions so I wasn’t surprised to run into a small snag, which I was fortunately able to fix myself (go team open source).

I wrote a small utility named nyaraka that talks to Omeka and downloads all the items (metadata and files) as well as the collections they are a part of, and places them on the filesystem. This was a fairly straightforward process because Omeka’s database ensures the one-to-many-relationships between a site and its collections, items, and files which means they can be written to the filesystem in a structured way:

lakeland.umd.edu
lakeland.umd.edu/site.json
lakeland.umd.edu/collections
lakeland.umd.edu/collections/1
lakeland.umd.edu/collections/1/collection.json
lakeland.umd.edu/collections/1/items
lakeland.umd.edu/collections/1/items/1
lakeland.umd.edu/collections/1/items/1/item.json
lakeland.umd.edu/collections/1/items/1/files
lakeland.umd.edu/collections/1/items/1/files/1
lakeland.umd.edu/collections/1/items/1/files/1/fullsize.jpg
lakeland.umd.edu/collections/1/items/1/files/1/original.jpg
lakeland.umd.edu/collections/1/items/1/files/1/file.json
lakeland.umd.edu/collections/1/items/1/files/1/thumbnail.jpg
lakeland.umd.edu/collections/1/items/1/files/1/square_thumbnail.jpg

This post was really meant to be about building a static site with React, and not about extracting data from Omeka. But this filesystem data is kinda like a static site, right? It was really just laying the foundation for the next step of building the static site application, since I didn’t really want to keep downloading content from the API as I was developing the application. Having all the content local made it easier to introspect with command line tools like grep, find and jq as I was building the static site.

React

Before I get into a few of the details here’s a short video that shows what the finished static site looked like:

Lakeland Static Site Demo from Ed Summers on Vimeo.

You can see that content is loaded dynamically as the user scrolls down the page. Lots of content is presented at once in random orderings each time to encourage serendipitous connections between items. Items can also be filtered based on type (buildings, people and documents). If you want to check it out for yourself download and unzip this zip file and open up the index.html in the directory that is created. Go ahead and turn off your wi-fi connection so you can see it working without an Internet connection.

When building static sites in the past I’ve often reached for Jekyll but this time I was interested in putting together a small client side application that could be run offline. This shouldn’t be seen as an either/or situation: it would be quite natural to create a static site using Jekyll that embeds a React application within it. But for the sake of experimentation I wanted to see how far I could go just using React.

Ever since I first saw Twitter’s personal archive download (aka Grailbird) I’ve been thinking about the potential of offline web applications to function as little time capsules for web content that can live independently of the Internet. Grailbird lets you view your Twitter content offline in a dynamic web application where you can view your tweets over time. Over the past few years the minimal computing movement has been gaining traction in the digital humanities community, as a way to ethically and sustainably deliver web content without needing to make promises about keeping it online forever, or 25 years (whichever comes first).

React seemed like a natural fit because I’ve been using it for the past year on another project. React offers a rich ecosystem of tools, plugins and libraries like Redux for building complex client side apps. The downside of using React is that it is not as easy for people to set up out of the box, or to change over time, if you aren’t an experienced software developer. With Jekyll it’s not simple, but at least it’s relatively easy to dive in and edit HTML and CSS. But on the plus side for React, if you really want to deliver an unchanging, finished (static) artifact, then maybe these things don’t really matter so much?

At any rate it seemed like a worthwhile experiment. So here are a few tidbits I learned when bending React to the purposes of minimal computing:

Static Database

The first is to build a static representation of your data. Many React applications rely on an external REST API being available. This type of dependency is an obvious no-no for minimal computing applications, because an Internet connection is needed, and someone needs to keep the REST API service up and running constantly, which is infrastructure and costs money.

One way of getting around this is to take all the structured data your application needs and bundle it up as a single file. You can see the one I created for my application here. As you can see it contains metadata for all the photographs expressed as JSON. But the JSON itself is part of a global JavaScript variable declaration which allows it to be loaded by the browser without relying on an asynchronous HTTP call. Browsers need to limit the ability of JavaScript to fetch files from the filesystem for security reasons. This JavaScript file is loaded immediately by your web browser when it loads the index.html, and the app can access it globally as window.DATA. Think of it like a static, read-only, in-memory database for your application. The wrapping HTML will look as simple as something like this:

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1">
    <title>Lakeland Community Heritage Project</title>
    <script src="static/data.js"></script>
  </head>
  <body>
    <div id="app"></div>
    <script type="text/javascript" src="bundle.js"></script>
  </body>
</html>

Update: Another, more scalable approach, suggested by Alex Gil after this post went live, is to use an in-browser database like PouchDB. When combined with Lunr for search this could make for quite a rich and extensible data layer for minimal computing browser apps.
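A rough sketch of what that could look like, assuming PouchDB for storage and Lunr for indexing (this is illustrative only and not part of the Lakeland prototype; it also assumes window.DATA holds an items array with id and title fields):

import PouchDB from 'pouchdb'
import lunr from 'lunr'

async function setupAndSearch (query) {
  // Load the items from the static data file into a local, in-browser database.
  const items = window.DATA.items
  const db = new PouchDB('lakeland')
  await db.bulkDocs(items.map(item => ({ _id: String(item.id), ...item })))

  // Build a Lunr index over the item titles for client-side full-text search.
  const index = lunr(function () {
    this.ref('id')
    this.field('title')
    items.forEach(item => this.add(item))
  })

  // Search the index, then fetch the matching records from PouchDB.
  return Promise.all(index.search(query).map(hit => db.get(String(hit.ref))))
}

setupAndSearch('school').then(results => console.log(results))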

Static Images

Similarly, the image files need to be available locally. I took all the images and saved them into a directory I named static, and named the file using a unique item id (from Omeka) which allowed the metadata and data to be conceptually linked:

lakeland-images/static/{omeka-id}/fullsize.jpg

My React application has an Image component that simply renders the image along with a caption using the <figure>, <img>, and <figcaption> elements.

class Image extends Component {
  render() {
    return (
      <Link to={'/item/' + this.props.item.id + '/'}>
        <figure className={style.Image}>
          <img src={'static/' + this.props.item.id + '/fullsize.jpg'} />
          <figcaption>
            {this.props.item.title}
          </figcaption>
        </figure>
      </Link>
    )
  }
}

It’s pretty common to use webpack to build React applications, and the copy-webpack-plugin will handle copying the files from the static directory into the distribution directory during the build.
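As a sketch, the relevant webpack configuration might look like this (assuming the plugin’s 2017/2018-era API, which took an array of patterns; newer releases use a patterns option instead):

// webpack.config.js (fragment)
const CopyWebpackPlugin = require('copy-webpack-plugin')

module.exports = {
  // ...entry, output and loader configuration elided...
  plugins: [
    // Copy the static/ directory (data.js and the images) into the build
    // output unchanged, so the relative URLs keep working offline.
    new CopyWebpackPlugin([{ from: 'static', to: 'static' }])
  ]
}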

URLs

You may have noticed that in both cases the data.js and images are being loaded using a relative URL (without a leading slash, or a protocol/hostname). This is a small but important detail that allows the application to be moved around from zip file, to thumb drive to disk drive, without needing paths to be rewritten. The images and data are loaded relative to where the index.html was initially loaded from.

In addition, many React applications these days use the History API in modern browsers. This lets your application have what appear to be normal URLs structured with slashes, which you can manage with react-router. However, slash URLs are problematic in an offline static site for a couple of reasons. The first is that there is no server you can configure to respond to any request with the bootstrap HTML file I included above. This means that if you reload a page you will get a 404 Not Found.

The other problem is that while the History API works fine for an offline application, the relative links to bundle.js, data.js and the images will break because they will be relative to the new URL.

Fortunately there is a simple solution to this: manage the URLs the way we did before the History API, using hash fragments. So instead of:

file:///lakeland-images/index.html/items/123

you’ll have:

file:///lakeland-images/index.html#/items/123

This way the browser will look to load static/data.js relative to file:///lakeland-images/ instead of file:///lakeland-images/index.html/items/. Luckily react-router lets you simply import and use createHashHistory in your application initialization, and it will write these URLs for you. A sketch of that setup follows.
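Here is a minimal sketch of that wiring, assuming react-router v4 with the companion history package; the route paths and the Home and Item components are illustrative, not taken from the project:

// src/index.js -- a sketch assuming react-router v4 and the history package.
import React from 'react'
import ReactDOM from 'react-dom'
import { Router, Route } from 'react-router-dom'
import createHashHistory from 'history/createHashHistory'

import Home from './Home'   // illustrative components, not from the project
import Item from './Item'

const history = createHashHistory()

ReactDOM.render(
  <Router history={history}>
    <div>
      <Route exact path="/" component={Home} />
      <Route path="/item/:id" component={Item} />
    </div>
  </Router>,
  document.getElementById('app')
)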

Minimal?

It’s important to reiterate that this was an experiment. We don’t know if the LCHP is interested in us developing this approach further. But regardless I thought it was worth just jotting down these notes for others considering similar approaches with React and minimal computing applications.

I’ll just close by saying in some ways it seems counter-intuitive to refer to a React application as an example of minimal computing. As Alex Gil says:

In general we can say that minimal computing is the application of minimalist principles to computing. In reality, though, minimal computing is in the eye of the beholder.

After working with React off and on for a couple of years, it still seems quite complicated, especially when you throw Redux into the mix. Assembling the boilerplate needed to get started is still tedious, unless you use create-react-app, which is a smart way to start. By comparison, it's much easier to get Jekyll out of the box and start using it. But if the goal is truly to deliver something static and unchanging, then perhaps this up-front investment of time is not so significant.

Static sites, thus conceived, ultimately rely on web browsers, which are insanely complicated pieces of code. With a few exceptions (e.g. Flash), browsers have been pretty good at maintaining backwards compatibility as they've evolved along with the web. JavaScript is so central to a functioning web that it's difficult to imagine it going away. So really this approach is a bet on the browser and the web remaining viable. Whatever happens to the web and the Internet, we can probably rely on some form of browser continuing to exist as functioning software, either natively or in some sort of emulator, for a good time to come…or at least longer than the typical website is kept online.

Many thanks to Raff Viglianti, Trevor Muñoz and Stephanie Sapienza who helped frame and explore many of the ideas expressed in this post.

Lucidworks: How to Handle Meltdown and Spectre for Solr

planet code4lib - Tue, 2018-01-09 21:23

Recent news reports have revealed that most Intel processors are vulnerable to security flaws that allow a process to read the memory of other processes running on the same CPU. At this time some of the flaws appear to affect AMD CPUs as well, but the more serious, performance-impacting ones do not. Because cloud providers use Intel CPUs and virtualization to support multiple clients on the same physical hardware, this is especially troubling for multi-tenant hosting environments such as Amazon Web Services. However, Google has stated that it believes it has successfully mitigated the flaws in Google Cloud Platform, although some user patches are required.

It is important to understand the risk of this bug, but not to overestimate it. To operate, the exploit needs to be already running inside of software in your computer. It does not allow anyone on the internet to take control of your server over http, for instance. If there is an existing vulnerability, it does make it worse as the vulnerable process might be used to read memory from other processes.

Operating system patches are already available for these bugs. Unfortunately, the OS-level patch requires creating a software isolation layer that can have a significant impact on performance, with estimates ranging from 5 to 30 percent. Every piece of software running in application space may be affected. The impact will vary, and each application will need to be performance and load tested.

Some customers running on their own internal hardware may decide that, given the vector of the exploit and the performance cost of the fix, it is worth delaying applying it. Other customers running in more vulnerable environments or with more specific security concerns may need to apply it and deal with the performance implications.

Fortunately for Lucidworks customers, Fusion and its open source Solr core are especially adept at scale. For high-capacity systems, the most cost-effective solution may be to add additional nodes to allow for the increased overhead of the operating system. Additionally, by tuning the Fusion pipeline it may be possible to reduce the number of calls necessary to perform queries, or to parallelize some calls, thus compensating for the loss of performance through optimization in other areas.
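As one illustration of adding capacity, SolrCloud lets you place a replica on a newly provisioned node via the Collections API. The host, collection, shard, and node names below are placeholders, not part of any Fusion-specific workflow:

// A sketch of the SolrCloud Collections API call for placing a replica on a
// newly added node; host, collection, shard, and node names are placeholders.
const { URLSearchParams } = require('url')

const params = new URLSearchParams({
  action: 'ADDREPLICA',
  collection: 'products',
  shard: 'shard1',
  node: 'newnode:8983_solr'   // the freshly provisioned node
})

// Issue this URL with curl or any HTTP client against a node in the cluster.
console.log('http://existing-node:8983/solr/admin/collections?' + params.toString())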

In either case Lucidworks is here for our customers. If you’re considering applying the fix, please reach out to your account manager to understand ways that we can help mitigate any issues you may have. If you do not currently have or know your account manager, please file a support request or use the Lucidworks contact us page.

The post How to Handle Meltdown and Spectre for Solr appeared first on Lucidworks.

District Dispatch: Improving Digital Equity: The civil rights priority libraries and school technology leaders share

planet code4lib - Mon, 2018-01-08 18:46

This blog post, written by Consortium for School Networking (CoSN) CEO Keith Krueger, is first in a series of occasional posts contributed by leaders from coalition partners and other public interest groups that ALA’s Washington Office works closely with. Whatever the policy – copyright, education, technology, to name just a few – we depend on relationships with other organizations to influence legislation, policy and regulatory issues of importance to the library field and the public.

Learning has gone digital. Students access information, complete their homework, take online courses and communicate with technology and the internet.

The Consortium for School Networking is a longtime ally of the American Library Association on issues related to education and telecommunications, especially in advocating for a robust federal E-rate program.

Digital equity is one of today’s most pressing civil rights issues. Robust broadband and Wi-Fi, both at school and at home, are essential learning tools. Addressing digital equity – sometimes called the “homework gap” – is core to CoSN’s vision, and a shared value with our colleagues at ALA.

That is why the E-rate program has been so important for the past 20 years, connecting classrooms and libraries to the internet. Two years ago the Federal Communications Commission (FCC) modernized E-rate by increasing funding by 60 percent and focusing it on broadband and Wi-Fi. This action made a difference. CoSN’s 2017 Infrastructure Survey found that the majority of U.S. school districts (85 percent) are fully meeting the FCC’s short-term goal for broadband connectivity of 100 Mbps per 1,000 students.

While this is tremendous progress, we have not completed the job. Recurring costs remain the most significant barrier for schools in their efforts to increase connectivity. More than half of school districts reported that none of their schools met the FCC’s long-term broadband connectivity goal of 1 Gbps per 1,000 students. The situation is more critical in rural areas where nearly 60 percent of all districts receive one or no bids for broadband services. This lack of competition remains a significant burden for rural schools.

And learning doesn’t stop at the school door. CoSN has demonstrated how school systems can work with mayors, libraries, the business community and other local partners to address digital equity. In CoSN’s Digital Equity Action Toolkit, we show how communities are putting Wi-Fi on school buses, mapping out free Wi-Fi homework access from area businesses, loaning Wi-Fi hotspots to low-income families and working to ensure that broadband offerings are not redlining low-income neighborhoods. A great example is the innovative partnership that Charlotte Mecklenburg Schools has established with the Mecklenburg Library System in North Carolina. CoSN also partners with ALA to fight the FCC’s misguided plans to roll back the Lifeline broadband offerings.

Of course, the most serious digital gap is ensuring that all students, regardless of their family or zip code, have the skills to use these new tools effectively. We know that digital literacy and citizenship are essential skills for a civil society and safer world. Librarians have always been on the vanguard of that work, and our education technology leaders are their natural allies. Learn about these efforts and what more we can do by attending CoSN/UNESCO’s Global Symposium on Educating for Digital Citizenship in Washington, DC on March 12, 2018.

As we start 2018, I am often asked to predict the future. What technologies or trends are most important in schools? CoSN annually co-produces the Horizon K-12 report, and I strongly encourage you to read the 2017 Horizon K-12 Report to see how emerging technologies are impacting learning in the near horizons.

However, my top recommendation is that education and library leaders focus on “inventing” the future. Working together, let’s focus on enabling learning where each student can personalize their education – and where digital technologies close gaps rather than make them larger.

Keith R. Krueger, CAE, has been CEO of the Consortium for School Networking for the past twenty-three years. He has a strong background in working with libraries, including serving as the first Executive Director of the Friends of the National Library of Medicine at NIH.

The post Improving Digital Equity: The civil rights priority libraries and school technology leaders share appeared first on District Dispatch.

Lucidworks: Looking Back at Search in 2017

planet code4lib - Mon, 2018-01-08 18:22

2017 was a big year in search technology. As we chronicled last month in our rundown of trends for 2018, search technology has moved far beyond just keywords, faceting, and scale. But let’s take a look back  at the trends that have continued through the past year.

Continued Industry Consolidation

We’ve continued to see consolidation with the exit of the Google Search Appliance from the market. Organizations are now re-evaluating technologies like Endeca that have been acquired by vendors, and products like FAST that have been embedded in other products. Ecommerce companies that have traditionally thought of search as a primary part of what they do have already migrated to newer systems. In 2017, IT departments stuck maintaining technology not intended for today’s scale began moving away from those legacy systems in earnest.

Meanwhile other vendors have been downsizing staff but continuing to support the locked-in long tail installation base. You can figure out which ones by looking at current vs past employees on LinkedIn. In 2017, customers started to get wise. No one wants to be the last one on a sinking ship.

In this same time period, I’m proud to say Lucidworks continued to grow in terms of code written, revenue, employees, and even acquisitions.

Technology and Data Consolidation

Not long ago, larger companies tended to have more than one IT department, and each of those departments had its own search solution. So there would be a search application for sales deployed by the sales IT group, another search app for HR deployed by its IT group, and probably yet another search solution for the product teams built by their IT group. With IT consolidation, an ever-increasing mountain of data, and new integrated business practices, there is a greater need than ever to consolidate search technology. There are still single-source solutions (especially in sales), but last year IT departments continued to push to centralize on one search technology.

Meanwhile there are more data sources than ever. There are still traditional sources like Oracle RDBMS, Sharepoint, and file shares. However, there are newer data sources to contend with including NoSQL databases, Slack, and SaaS solutions. With the push towards digital business, and turning information into answers, it is critical to build a common search core to pull data from multiple sources. In 2017, we saw continued movement in this direction.

Scale Out

Virtualization replaced bare metal for most companies years ago. The trend in 2017 was the joining of private and public clouds. This move continued against a business backdrop of continued globalization and a technology backdrop of continued mobilization. In 2017, modern companies often conducted business all over the world from palm-sized devices, tablets, and laptops.

Meanwhile there are new forms of data emerging. Customers now generate telemetry from their mobile devices. Buildings can now generate everything from presence data to environmental and security information. Factories and brick-and-mortar storefronts now generate data forming the so-called Internet of Things. With machine learning and search technology, companies are now starting to make better use of this sort of data. These trends were nascent in 2017, but still observable.

In a virtualized, cloud-based global world where data is generated from everything everywhere all of the time, companies need search technology that can handle the load whenever, wherever, and however it comes. Old client-server technology was no longer enough to handle these demands. In 2017, horizontal scale was no longer a luxury, but a necessity.

Personalization and Targeting

2017 saw simple search start to abate. While AI and machine learning technologies are relatively new to the search market, some of the more mature tools saw widespread deployment. Many organizations deployed search technology that could capture clicks, queries, and purchases. Modern search technology uses this information to provide better, more personalized results.

Collaborative filtering (boosting the top-clicked item for a given query) is the most common optimization, followed by similarity (MoreLikeThis), but we also saw companies start to deploy machine-learning-powered recommendations, especially in digital commerce. These recommendations use information about what a user or similar users have done to suggest choices.
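As a point of reference, similarity recommendations of the MoreLikeThis variety can be requested directly from Solr. The sketch below builds such a query; the collection, field, and document names are placeholders for illustration only:

// A sketch of a Solr MoreLikeThis request; collection, field, and id
// values are placeholders.
const { URLSearchParams } = require('url')

const params = new URLSearchParams({
  q: 'id:SKU-123',               // the seed document
  mlt: 'true',                   // enable the MoreLikeThis component
  'mlt.fl': 'title,description', // fields used to judge similarity
  'mlt.mintf': '1',              // minimum term frequency in the seed doc
  'mlt.mindf': '2',              // minimum document frequency in the index
  wt: 'json'
})

// Similar documents appear under the "moreLikeThis" key of the response.
console.log('http://localhost:8983/solr/products/select?' + params.toString())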

Mainly Custom Apps, but The Rise of Twigkit

In 2017 most companies were still writing their own custom search apps. Unlike previous years, these apps are very AJAX/JavaScript-heavy and dynamic. Frameworks like Angular ruled the search application market. At the same time, savvy organizations realized that writing yet another search box with typeahead was a waste of time, and they started using pre-built components. One of the best toolboxes of pre-tested, pre-built components was Twigkit.

Twigkit had been around since 2009 and was a widely respected force in the search industry, with relationships with all of the major vendors and customers all over the world. Lucidworks had been recommending it to our customers and even using it in some deployments, so we decided to acquire the company and accelerate the technology. The future of Twigkit was announced at our annual conference last September, with the technology becoming part of Lucidworks App Studio.

Happy New Year

Goodbye to 2017 but hello to 2018. It was a great year for search, but not as good as what is coming. If you want to see what’s on the way in 2018, here’s my take on what to watch for in the coming year.

If you find yourself behind the curve, Lucidworks Fusion and Lucidworks App Studio are a great way to acquire the technologies you need to catch up. You might also sign up for Fusion Cloud.

The post Looking Back at Search in 2017 appeared first on Lucidworks.

David Rosenthal: The $2B Joke

planet code4lib - Mon, 2018-01-08 18:00
Everything you need to know about cryptocurrency is in Timothy B. Lee's Remember Dogecoin? The joke currency soared to $2 billion this weekend:

"Nobody was supposed to take Dogecoin seriously. Back in 2013, a couple of guys created a new cryptocurrency inspired by the "doge" meme, which features a Shiba Inu dog making excited but ungrammatical declarations. ... At the start of 2017, the value of all Dogecoins in circulation was around $20 million. ... Then on Saturday the value hit $2 billion. ... "It says a lot about the state of the cryptocurrency space in general that a currency with a dog on it which hasn't released a software update in over 2 years has a $1B+ market cap," [cofounder] Palmer told Coindesk last week."

So blockchain, such bubble. Up 100x in a year. Are you HODL-ing or getting your money out?

District Dispatch: Full Text of FCC’s order rolling back Net Neutrality released

planet code4lib - Mon, 2018-01-08 17:14

At the end of last week, the FCC released the final order to roll back 2015’s Net Neutrality rules. The 539-page order has few changes from the draft first circulated in November and voted on along party lines by the Republican-controlled commission on December 14. ALA is working with allies to encourage Congress to overturn the FCC’s egregious action.

Procedurally, we are still waiting for the order to appear in the Federal Register and to be delivered to Congress. These actions will kick off the timing for members of Congress to have their shot at stopping the FCC. Right after the vote, members of Congress announced their intent to attempt to nullify the FCC’s actions. The Congressional Review Act (CRA) gives Congress the ability and authority to do this; the CRA allows Congress to review a new agency regulation (in this case, Pai’s “Restoring Internet Freedom” order) and pass a Joint Resolution of Disapproval to overrule it. This would repeal last week’s FCC order, restoring the 2015 Open Internet Order, keeping net neutrality protections in place, and keeping the internet working the way it does now. This Congressional action would be subject to Presidential approval.

Senator Ed Markey (D-MA) is leading the charge and has announced his intention to introduce a resolution to overturn the FCC’s decision using the authority granted by the CRA. Democratic leadership in both Houses have urged their colleagues to support it, and Sen. Claire McCaskill (D-MO) has just tweeted that she will be the 30th Senator to sign on to the effort.

We will continue to update you on the activities and other developments as we continue to work to preserve a neutral internet. For now, you can email your members of Congress today and ask them to support the CRA to repeal the recent FCC action and restore the 2015 Open Internet Order protections.

The post Full Text of FCC’s order rolling back Net Neutrality released appeared first on District Dispatch.

David Rosenthal: Digital Preservation Declaration of Shared Values

planet code4lib - Mon, 2018-01-08 16:00
I'd like to draw your attention to the effort underway by a number of organizations active in digital preservation to agree on a Digital Preservation Declaration of Shared Values:
The digital preservation landscape is one of a multitude of choices that vary widely in terms of purpose, scale, cost, and complexity. Over the past year a group of collaborating organizations united in the commitment to digital preservation have come together to explore how we can better communicate with each other and assist members of the wider community as they negotiate this complicated landscape.

As an initial effort, the group drafted a Digital Preservation Declaration of Shared Values that is now being released for community comment. The document is available here: https://docs.google.com/document/d/1cL-g_X42J4p7d8H7O9YiuDD4-KCnRUllTC2s...

The comment period will be open until March 1st, 2018. In addition, we welcome suggestions from the community for next steps that would be beneficial as we work together.

The list of shared values (Collaboration, Affordability, Availability, Inclusiveness, Diversity, Portability/Interoperability, Transparency/information sharing, Accountability, Stewardship Continuity, Advocacy, Empowerment) includes several to which adherence in the past hasn't been great.

There are already good comments on the draft. Having more input, and input from a broader range of institutions, would help this potentially important initiative.

HangingTogether: The OCLC Research Library Partnership: the challenges of developing scholarly services in a decentralized landscape

planet code4lib - Sun, 2018-01-07 14:00

On November 1st, North American members of the OCLC Research Library Partnership came together in Baltimore to engage in a day-long discussion on three concurrent topics:

My colleagues Karen Smith-Yoshimura and Merrilee Proffitt have previously written on the first two discussions.

Attendees at the North American meeting of OCLC Research Library Partners, 1 Nov 2017

Scholarly communications in a complex institutional environment

While recognizing that the landscape of evolving scholarly services and workflows is extremely broad, my colleague Roy Tennant and I chose to frame our conversation around three distinct areas of research university and library engagement:

  • research data management (RDM)
  • research information management (RIM)
  • institutional repositories (IR)

We discussed each focus area for libraries singly but also explored how these services are increasingly intersecting, driven by needs for improved workflows for researchers—and institutions. These service areas may also intersect with non-library services such as annual academic review workflows and institutional reporting, as well as with the growing numbers of resources researchers are using to independently manage their citations, lab notebooks, and more.

We engaged in an in-depth discussion utilizing two recent OCLC Research reports:

and also looked to the recent report of a CNI Executive Roundtable, Rethinking Institutional Repository Strategies, as well as a short 2008 blog post by Lorcan Dempsey on “Stitching Costs”, or the costs of integrating services and workflows.

While the topic of our conversation was scholarly communications workflows, services, and interoperability, the theme was the challenge of libraries responding to enterprise-wide needs. Participating institutions represented a variety of states of exploration and implementation, particularly in relation to emerging areas like RDM and RIM, and librarians described their own experiences and challenges, as they seek to build relationships with other institutional stakeholders that have different goals, priorities, and practices. Every institution is decentralized, and each institution is unique in its organization. Relationship building and communications across silos is time consuming and often political, and efforts to develop meaningful collaborations can be stymied by individual personalities or limited knowledge of another unit’s priorities.

The theme of our conversation was the challenge of libraries responding to enterprise-wide needs

This is particularly true for RIM and RDM services, as there are a great many institutional stakeholders, including academic colleges and departments, the research office (usually led by a VP of research), institutional research, graduate school, registrar, human resources,  tech transfer, and public affairs.

Libraries must work collaboratively with other stakeholders across the institution

Roger Schonfeld describes this situation well in the recent Ithaka S+R issue brief, Big Deal: Should Universities Outsource More Core Research Infrastructure?, which explores the rapid development of research workflow tools being adopted by researchers,

“Today almost no university is positioned to address its core interests here in any truly coherent way. The reason is essentially structural. There is no individual or organization within any university . . . that is responsible for the full suite of research workflow services. . . . No campus office or organization has responsibility for anything other than a subset of the system.”

What can you do to learn more?

Libraries are important stakeholders in these conversations but will be ineffectual if they try to act alone. It is increasingly important for librarians to understand the goals and activities of other university stakeholders as well as to succinctly and persuasively communicate their own value proposition. I want to encourage readers to explore a couple of OCLC Research publications that address these challenges:

Join us

Our interactions with OCLC Research Library Partners help inform our future research plans, as we learn more about the challenges, pain points, and ambitions of our partner libraries. We will be continuing this conversation with our UK and European Research Library Partnership members on February 19 at the University of Edinburgh. We also want to encourage ALL research institutions to share their practices and collaborations by participating in our Survey of Research Information Management Practices, conducted in collaboration with euroCRIS, which remains open through 31 January 2018.

Mark Matienzo: Notes on ITLP Workshop 1 readings

planet code4lib - Sun, 2018-01-07 05:43

I completed my reading and viewing assignments for my cohort’s IT Leadership Program Workshop 1 (January 9-January 11 at UC Berkeley.) This is a brief set of notes for my own use about how all of them tie together.

  • Leaders are made not born and leadership skills don’t always transfer across contexts.
  • Leadership should be reflected in the culture of the organization; developing leaders, even among medium- and lower-level employees, is a key part of that. Encourage them to take this on, and protect them when they step up. Leading up (i.e., leading your boss) should be expected, too.
  • Be aware of where you are looking to anticipate change.
  • Don’t keep your head down; you need to retain focus on both the work around you and the broader context. You are responsible for framing problems, not solving them exclusively.
  • Great leaders have diverse networks and the ability to develop relationships with people different from them.
  • Self-mastery is the key to leadership. Great leaders model behavior (poise; emotional capacity) and define direction. Retaining empathy, humanity, dignity, passion, connection to other people in environment of transactional interaction are all hard.
  • Conflict and feeling pressure is necessary. Don’t smooth over either too much; instead, regulate it.
  • Be willing to look at taking large leaps, but take the time to understand them. At the same time, don’t wed yourself to long-term strategic planning processes that might be blocks.
  • Inspire people to move beyond their own perceived limitations and encourage others to break with convention when necessary.

And the readings and videos:

David Rosenthal: Meltdown & Spectre

planet code4lib - Fri, 2018-01-05 18:00
This hasn't been a good few months for Intel. I wrote in November about the vulnerabilities in their Management Engine. Now they, and other CPU manufacturers, are facing Meltdown and Spectre, three major vulnerabilities caused by side-effects of speculative execution. The release of details about these vulnerabilities was rushed, and the initial reaction was less than adequate.

The three vulnerabilities are very serious, but mitigations are in place and appear to be less costly than reports focused on the worst case would lead you to believe. Below the fold, I look at the reaction, explain what speculative execution means, and point to the best explanation I've found of where the vulnerabilities come from and what the mitigations do.

Although CPUs from AMD and ARM are also affected, Intel's initial response was pathetic, as Peter Bright reports at Ars Technica:
The company's initial statement, produced on Wednesday, was a masterpiece of obfuscation. It contains many statements that are technically true—for example, "these exploits do not have the potential to corrupt, modify, or delete data"—but utterly beside the point. Nobody claimed otherwise! The statement doesn't distinguish between Meltdown—a flaw that Intel's biggest competitor, AMD, appears to have dodged—and Spectre and, hence, fails to demonstrate the unequal impact on the different company's products.

In addition, Intel's CEO is suspected of insider trading on information about these vulnerabilities:
Brian Krzanich, chief executive officer of Intel, sold millions of dollars' worth of Intel stock—all he could part with under corporate bylaws—after Intel learned of Meltdown and Spectre, two related families of security flaws in Intel processors.

Not a good look for Intel. Nor for AMD:
AMD's response has a lot less detail. AMD's chips aren't believed susceptible to the Meltdown flaw at all. The company also says (vaguely) that it should be less susceptible to the branch prediction attack.

The array bounds problem has, however, been demonstrated on AMD systems, and for that, AMD is suggesting a very different solution from that of Intel: specifically, operating system patches. It's not clear what these might be—while Intel released awful PR, it also produced a good whitepaper, whereas AMD so far has only offered PR—and the fact that it contradicts both Intel (and, as we'll see later, ARM's) response is very peculiar.

The public release of details about Meltdown and Spectre was rushed, as developers not read-in to the problem started figuring out what was going on. This may have been due to an AMD engineer's comment:
Just after Christmas, an AMD developer contributed a Linux patch that excluded AMD chips from the Meltdown mitigation. In the note with that patch, the developer wrote, "The AMD microarchitecture does not allow memory references, including speculative references, that access higher privileged data when running in a lesser privileged mode when that access would result in a page fault."

What is speculative execution? Some things a CPU does, such as fetching a cache miss from main memory, take hundreds of clock cycles. It is a waste to stop the CPU while it waits for these operations to complete. So the CPU continues to execute "speculatively". For example, it can guess which way it is likely to go at a branch, and head off down that path ("branch prediction"). If it is right, it has saved a lot of time. If it is wrong the processor state accumulated during the speculative execution has to be hidden from the real program.

Modern processors have lots of hardware supporting speculative execution. Meltdown and Spectre are both due to cases where the side-effects of speculative execution on this hardware are not completely hidden. They can be revealed, for example, by carefully timing operations of the real CPU, which the leftover speculative state can cause to take longer or shorter than normal.
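To make the idea concrete, here is a minimal JavaScript sketch of the code pattern behind Spectre variant 1 ("bounds check bypass"). It illustrates the vulnerable pattern only, not a working exploit, and the array names and sizes are invented:

// Conceptual sketch of the Spectre variant 1 pattern; not a working exploit.
// Array names and sizes are illustrative only.
const array1 = new Uint8Array(16);           // the array the bounds check protects
const array2 = new Uint8Array(256 * 4096);   // probe array: one cache line per byte value

function victim(x) {
  if (x < array1.length) {      // the CPU may speculatively take this branch
    const v = array1[x];        // ...performing an out-of-bounds read of a secret byte
    return array2[v * 4096];    // ...and leaving a cache footprint that depends on v
  }
  return 0;
}

// An attacker first "trains" the branch predictor with in-bounds calls, then calls
// victim() with an out-of-bounds x, and finally times reads of array2 to discover
// which cache line was loaded, revealing the speculatively read value.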

The clearest explanation of the three vulnerabilities I've seen is from Matt Linton and Pat Parseghian on Google's Security blog:
Project Zero discussed three variants of speculative execution attack. There is no single fix for all three attack variants; each requires protection independently.


  • Variant 1 (CVE-2017-5753), “bounds check bypass.” This vulnerability affects specific sequences within compiled applications, which must be addressed on a per-binary basis.
  • Variant 2 (CVE-2017-5715), “branch target injection”. This variant may either be fixed by a CPU microcode update from the CPU vendor, or by applying a software mitigation technique called “Retpoline” to binaries where concern about information leakage is present. This mitigation may be applied to the operating system kernel, system programs and libraries, and individual software programs, as needed.
  • Variant 3 (CVE-2017-5754), “rogue data cache load.” This may require patching the system’s operating system. For Linux there is a patchset called KPTI (Kernel Page Table Isolation) that helps mitigate Variant 3. Other operating systems may implement similar protections - check with your vendor for specifics.
The detailed description in the table below the section I quoted is clear and comprehensive.

These vulnerabilities can be exploited only by running code on the local system. Alas, these days JavaScript means anyone can do that, so ensuring that your browser is up to date is very important. As I write, I believe that up-to-date Linux and Windows systems that have been rebooted should be protected against both Meltdown and the known exploits for Spectre, and up-to-date Apple systems should have partial protection.

There will be some cases where these fixes will degrade performance significantly, but Google and others report that they aren't common in practice.

It is somewhat worrisome that some of the mitigations depend on ensuring that user binaries do not contain specific code sequences, since there are likely ways for malware to introduce such sequences.

District Dispatch: ALA and NCTET celebrate the 20th anniversary of E-rate

planet code4lib - Fri, 2018-01-05 17:19
Sen. Ed Markey (D-MA) delivering opening remarks in May 2017 at the E-Rate briefing in the Russell Senate Office building, organized by the Education and Libraries Network Coalition and National Coalition for Technology in Education & Training.

This month, ALA is teaming up with the National Coalition for Technology in Education and Training (NCTET) to celebrate the 20th Anniversary of E-rate! Join us on Wednesday, January 24, along with advocates and beneficiaries, to discuss E-rate successes and potential at our E-rate Summit, scheduled from 3 to 5:00 p.m. in the Capitol Visitor Center, room 202/3. Anyone with an interest in learning more about E-rate is welcome to join.

The Summit will begin with a welcome from NCTET president Amanda Karhuse of the National Association of Secondary School Principals, followed by remarks delivered by Senator Ed Markey (D-MA). Evan Marwell, CEO of EducationSuperHighway, will open a panel session titled “E-Rate Past, Present & Future.”

The panel will be moderated by Caitlin Emma, education reporter for Politico. Ms. Emma will engage panelists from the library and K-12 school communities, covering the gamut of services these beneficiaries are able to provide because of the E-rate program. In addition to hearing from direct beneficiaries, a spokesperson for Maryland Governor Larry Hogan will address the overall impact to states. The afternoon will conclude with remarks from FCC Commissioner Jessica Rosenworcel.

The E-rate program makes it possible for libraries to offer critical and innovative community support across the nation—from urban and suburban centers to remote rural and tribal communities. ALA staff and panelists also look forward to discussing—alongside the significant strides made with E-rate modernization—areas where we can grow and improve when it comes to streamlining the program’s administration and meeting the needs and challenges of library applicants of diverse sizes and capacities.

ALA has been advocating for E-rate since 1996, most recently filing comments with the FCC this past October. This fall, over 140 librarians and libraries around the country shared moving stories about the profound impact of E-rate with the FCC.

You can read more about ALA’s work with E-rate or join us for an afternoon of telecommunications fun on Wednesday, January 24 from 3 to 5:00 p.m. in the Capitol Visitor Center.

The post ALA and NCTET celebrate the 20th anniversary of E-rate appeared first on District Dispatch.
