IAL Grant Applications Due by May 9
The American Library Association filed comments last week with the House and Senate Appropriations Committees in support of funding for the Library Services and Technology Act (LSTA) and Innovative Approaches to Literacy (IAL).
As the Appropriations Committees begin their consideration of 12 appropriations bills, ALA is urging the Committees to fund LSTA at $186.6 million and IAL at $27 million for FY 2017. Both programs received increases in last year’s FY 2016 funding bills and were included in the President’s February budget request to Congress.
“Without LSTA funding, these and many other specialized programs targeted to the needs of their communities across the country likely will be entirely eliminated, not merely scaled back. In most instances, LSTA funding (and its required but smaller state match) allows libraries to create new programs for their patrons,” noted Emily Sheketoff in comments to both Committees.
The $186.6 million funding level for LSTA mirrors last year’s request to Congress from the President and is also supported by “Dear Appropriator” letters recently circulated in the Senate and House for Members’ signatures. LSTA was funded at $155.8 million for FY 2016 and ALA expressed concern that the President is requesting only $154.8 million for FY 2017.
In supporting $27 million in IAL funding for school libraries, ALA commented that “studies show that strong literacy skills and year-round access to books is a critical first-step towards literacy and life-long learning. For American families living in poverty, access to reading materials is severely limited. These children have fewer books in their homes than their peers, which hinders their ability to prepare for school and to stay on track.”
Congress provided $27 million in FY 2016 IAL funding and the President requested the same level for FY 2017. IAL, which dedicates half of its resources for school libraries, was authorized in last year’s Every Student Succeeds Act. “Dear Appropriator” letters circulated in the Senate and House on its behalf called for $27 million in FY 2017 funding.
ALA reminds its members that the Department of Education recently announced that it has opened its FY 2016 window for new IAL grant applications. The DOE’s announcement with full application filing details is available online. Grant applications must be submitted by May 9, 2016.
In additional support for library funding, LSTA and IAL were highlighted in the annual Committee for Education Funding (CEF) Budget Response to Congress: Education Matters: Investing in America’s Future. The CEF budget response, which reserves two chapters for the LSTA and IAL programs, provides an explanation of the programs, examples of how funds have been used, and a justification for the funding levels sought.
The post ALA urges House and Senate approps subcommittees to support LSTA, IAL appeared first on District Dispatch.
Metadata for research data was the topic discussed recently by OCLC Research Library Partners metadata managers, in a discussion initiated by John Riemer of UCLA. With increasing expectations that research data created with grant funding will be archived and made available to others, many institutions are becoming aware of the need to collect and curate this new scholarly resource. To maximize the chances that metadata for research data are shareable (that is, sufficiently comparable) and helpful to those considering re-using the data, our communities would benefit from sharing ideas and discussing plans to meet emerging discovery needs. OCLC Research Scientist Ixchel Faniel’s two-part blog entry “Data Management and Curation in 21st Century Archives” (September 2015) provided useful background to this discussion.
The discussions revealed a wide range of experiences, from those just encountering researchers who come to them with requests to archive and preserve their research data to those who have been handling research data for some years. National contexts differ. For example, our Australian colleagues can take advantage of Australia’s National Computational Infrastructure for big data and the Australian Data Archive for the social sciences. Canada is developing a national network called Portage for the “shared stewardship of research data”.
The US-based metadata managers were split about whether to have a single repository for all data or a separate repository for research data, although there seems to be a movement to separate data that is to be re-used (providing some capacity for computing on it) from data that is only to be stored. A number of fields have a discipline-based repository, or researchers take advantage of a third-party service such as DataCite, also used for discovery. The library can fill the gap for research data that has no better home.
The recently published Building Blocks: Laying the Foundation for a Research Data Management Program includes a section on metadata:
Datasets are useful only when they can be understood. Encourage researchers to provide structured information about their data, providing context and meaning and allowing others to find, use and properly cite the data. At minimum, advise researchers to clearly tell the story of how they gathered and used the data and for what purpose. This information is best placed in a readme.txt file that includes project information and project-level metadata, as well as metadata about the data itself (e.g., file names, file formats and software used, title, author, date, funder, copyright holder, description, keywords, observation unit, kind of data, type of data and language).
A number of institutions have developed templates to capture metadata in a structured form. Some metadata managers noted the need to keep such forms as simple as possible as it can be difficult to get researchers to fill them in. All agreed data creators needed to be the main source of metadata. But how to inspire data creators to produce quality metadata? New ways of training and outreach are needed.
We also had general agreement on the data elements required to support re-use by others: licenses, processing steps, tools, data documentation, data definitions, data steward, grant numbers and geospatial and temporal data (where relevant). Metadata schema used include Dublin Core, MODS (Metadata Object Description Schema) and DDI (Data Documentation Initiative’s metadata standard). The Digital Curation Centre in the UK provides a linked catalog of metadata standards. The Research Data Alliance’s Metadata Standards Directory Working Group has set up a community-maintained directory of metadata standards for different disciplines.
The importance of identifiers for both the research data and the creator has become more widely acknowledged. DOIs, Handles and ARKs (Archival Resource Key) have been used to provide persistent access. Identifiers are available at the full data set level and for component parts, and they can be used to track downloads and potentially help measure impact. Both ORCID (Open Researcher and Contributor ID) and ISNI (International Standard Name Identifier) are in use to identify data creators uniquely.
Some have started to analyze the metadata requirements for the research data life cycle, not just the final product. Who are the collaborators? How do various projects use different data files? What kind of analysis tools do they use? What are the relationships of data files across a project, between related projects, and to other scholarly output such as related journal articles? The University of Michigan’s Research Data Services is designed to assist researchers during all phases of the research data life cycle.
Curation of research data as part of the evolving scholarly record requires new skill sets, including deeper domain knowledge, data modeling, and ontology development. Libraries are investing more effort in becoming part of their faculty’s research process and offering services that help ensure that their research data will be accessible if not also preserved. Good metadata will help guide other researchers to the research data they need for their own projects—and the data creators will have the satisfaction of knowing that their data has benefitted others.
About Karen Smith-Yoshimura
Karen Smith-Yoshimura, senior program officer, works on topics related to creating and managing metadata with a focus on large research libraries and multilingual requirements.
This is a guest post by Carmel Curtis.
Over the past eight months I have been working as the National Digital Stewardship Resident at the Brooklyn Academy of Music. BAM is the oldest continuously operating performing arts center in the country and is home to a range of artistic expression in dance, theater, film, and beyond. More than 150 years old, BAM has a rich history.
I have been working on a records management project at BAM. My mentor, processing archivist Evelyn Shunaman, and I have conducted 41 hour-long interviews with all divisions, departments and sub-departments to get a sense of what and how many electronic records are being created, saved and accessed. Then we created or revised departmental Record Retention Schedules to ensure they reflect BAM’s current workflows and practices.
Here are some of the basics of records retention, along with tips on creating a Records Retention Schedule.
A Records Retention Schedule is an institutional or organizational policy that defines the information that is being created and identifies retention requirements based on legal, business and preservation requirements. An RRS can take many forms. Example 1 shows our RRS spreadsheet.
The spreadsheet pairs each column heading with an explanation:

- Record Series Title: the category of record
- Description: an explanation of the record category
- Retention Period: the time period records are retained
- Transfer to Archives: whether or not records are sent to the Archives

One sample entry covers results from a survey of BAM audience demographics conducted every 3 years, marked "yes" under Transfer to Archives.

Example 1. BAM RRS spreadsheet.
An RRS is a way for an institution to:
- Be accountable to any legal requirements – An RRS is a policy that ensures records are retained in accordance with state or federal legal requirements. It provides an outline for the minimum legal requirements related to the retention and destruction of records.
- Identify archivally significant materials – Appraisal and selection are not dead. While storage may be increasing in capacity and decreasing in cost, there is still considerable need for decisions to be made around what comes into the Archive and what does not. An RRS can help provide a framework for this decision making process.
- Identify when things can be deleted – People want permission to delete their digital content. As with paper and other physical records, there is little incentive to get rid of things until one runs out of space; with electronic records, it is not uncommon to purchase more storage instead of deleting unnecessary files. However, digital clutter is a real thing that can induce stress and anxiety as well as make retrieval challenging. Having an RRS can help reduce digital clutter by identifying what records can be deleted and when.
- Assist the Archive in preservation planning – Once an RRS has been created, it can be a helpful tool in planning for the specific preservation needs of the categories of records coming into the Archive. With the assistance of an RRS, you can think through file-format identification and decisions around normalization, requirements around minimum associated metadata, and estimates of how much information will need to be transferred into the Archive and thus how much space will be required.
Records management may be different than archives management but when there is no Records Manager, the responsibility often falls on the Archivist. While records management is concerned with all information created, not exclusively information that has archival significance, it can be useful for the Archive to have a comprehensive picture of work that is being done across the institution. Having a wide-ranging understanding of workflows will only strengthen decisions around selection of what needs to come into the Archive.
So how do you begin? Here are some tips on developing an RRS, based on my experience at BAM.
- Work with IT. While the creation of an RRS does not necessarily require technical expertise or someone with an information technology background, the eventual transfer of materials into the Archive and the management of an electronic repository will take some technical know-how. Collaborating with IT at an early stage will only improve relations down the road. If you don’t have an IT Department, that is okay! The Archivist often wears many hats.
- Talk to as many staff members as possible! Those who create records are the experts in the records they are creating. Trust their words and do not aim to alter their workflows. Work with them! Conduct an interview with a general framework, not a strict roadmap. Give people space to speak and guide them when necessary. Consider this interview outline:
- Walk through the general responsibilities of your department, with an emphasis on what kinds of records or information are being created.
- Who creates record(s)?
- How is it created? Specific software?
- What format is it?
- How is it identified (filename/folder)? Standard naming conventions?
- Are there multiple copies? Multiple versions? How are finals identified?
- Where is it stored?
- How long is it used/accessed/relevant to your department?
- What is the historical significance/long-term research value in information created by your department?
- Make people feel comfortable and not embarrassed. The Archive asking about records can have an intimidating feel. Few people are as organized as they would like to be. These interviews should not be about shaming people but are an opportunity to listen and identify issues across your institution.
- To record or not to record? To transcribe or not to transcribe? Think carefully about the decision to audio or video record these interviews. You want your interviewee to feel comfortable and you also want to be able to refer back to things you may have missed. Transcribing interviews can be helpful but it takes a considerable effort. Consider the amount of time and resources that are available to you.
- Determine a format for your RRS. Consider making a spreadsheet with the column headings from Example 1.
- Develop Record Series Titles based on workflows present within the department. To encourage compliance with an RRS, it is recommended to make the categories as reflective of workflows within your institution as possible. If you think of it as a map or a crosswalk, developing an RRS to mirror the record types and folder structures currently in use will only make things easier. Directly referencing language used by departments within the Record Series Title or Description will facilitate compliance.
- Determine retention periods and whether or not records should be transferred to the Archive. Use this decision tree to help establish appropriate time periods.
- Get legal advice. For record series with legal considerations, consult your legal department. If there is no legal department, look at existing records retention schedules and at your local legal requirements. Here are some useful resources:
- New York State Archives Retention and Disposition Schedule for Government Based Records – Includes useful justifications of all retention categories.
- IRS – How Long Should I Keep Records? – Guidance on financial records.
- Society for Human Resource Management’s Federal Records Retention Requirements – Legal guidance on retention periods for HR records.
It is always best to look up the underlying laws cited in example RRSs to confirm applicable interpretation.
- To help mitigate duplication, consider limiting records transferred to the Archive exclusively to the creating department. In other words, for information shared across departments or created collaboratively across departments, consider getting the department that holds the final version to transfer the record to the Archive, as opposed to all departments that have a copy.
- Make a note of information that is required to be transferred to the Archive but is stored in databases or other systems used by your institution. If any information that is required to be transferred into the Archive is stored on removable media or third party proprietary systems, make sure these are flagged and a specific archival ingest process is developed for these records.
- Appoint a departmental records coordinator and require yearly approval. Designating responsibility to a specific person will discourage finger-pointing. If every department has a specific records retention coordinator, there will be a person with whom the Archives can communicate, improving the likelihood of compliance. It is important to make sure that the RRS is reviewed annually to ensure that it continues to reflect current workflows and practices.
Writing an RRS is a big step; however, it is only the beginning. At BAM, now that we have completed revisions to our RRS, we are working on developing workflows for transferring materials into the Archive.
Using TreeSize Pro, we have scanned the network storage systems of all departments and have estimated the amount of data that will need to be brought into the Archives based on the RRS.
We are now working to establish timelines and requirements for when and how departments should transfer materials to the Archive. Presently, we are testing AVPS’s Exactly file delivery tool as a way to receive files and require minimum metadata associated with deposits. Follow the NDSR-NY blog for updates on this phase of the project as it continues to unfold.
This year, we’re teaming up with the Harry Potter Alliance (HPA) to help expand our efforts and take advantage of the momentum started by the nearly 400 library supporters who plan to attend NLLD. Their Chapters’ members are pledging their time to make calls, send emails, tweet, and otherwise raise awareness about library issues during the week of May 2nd. To date, the HPA has received pledges for nearly 500 actions from their members!
We think we can do our wizarding friends one better.
Over the next few weeks, please take a second to register and then ask everyone in your circles — members, followers, patrons, fellow library staffers, and listservs — to join us! We’ll follow up by sending out talking points and other handy resources you can use to advocate easily and effectively. We’ll also be including a link to a webstream of the National Library Legislative Day program, live from Washington, on the morning of May 2nd. You’ll get to hear our keynote speaker, former Congressman Rush Holt, and listen in on this year’s issue briefing.
There’s also a handy resource toolkit, put together by the Harry Potter Alliance, for librarians who may want to get younger advocates involved. You can also find out more by visiting the United for Libraries and the Harry Potter Alliance webpages, or by subscribing to the Action Center.
Please feel free to contact Lisa Lindle, Grassroots Communications Specialist for ALA Washington, if you have any questions.
From Mike Conlon, VIVO Project Director
Did we launch OpenVIVO? Yes, we did. See http://openvivo.org. Have an ORCID? Sign on. Don't have an ORCID? Get an ORCID at http://orcid.org and sign on. It's that easy. If you follow VIVO on Twitter (@vivocollab) you'll see good people saying nice things about OpenVIVO. It would be great if you did that too!
Austin, TX – The LYRASIS and DuraSpace Boards announced an "Intent to Merge" the two organizations in January. Join us for the second session of the CEO Town Hall Meeting series with Robert Miller, CEO of LYRASIS, and Debra Hanken Kurtz, CEO of DuraSpace. Robert and Debra will review how the organizations came together to investigate a merger that would build a more robust, inclusive, and truly global community with multiple benefits for members and users. They will also unveil a draft mission statement for the merged organization.
From Tim Donohue, DSpace Tech Lead, on behalf of the DSpace Committers
I'm pleased to announce that the DSpace 6.0 codebase is now ready for Testathon! Please help ensure the success of DSpace 6.0 by helping us test it during the two-week 6.0 Testathon, running from Monday, April 25 through Friday, May 6.
Today, the Schools and Libraries Division of the Universal Service Administrative Company (USAC), which administers the E-rate program, announced that it will extend the current Form 471 filing window through May 26. A second window will then open for libraries and consortia, extending the filing deadline for those two groups until July 21. This additional window is in recognition of the difficulty libraries and larger consortia have had in completing the application process. The new and final day to file the Form 470 will be June 23.
With the new online filing system, there have been numerous issues preventing libraries from moving forward on their applications. While USAC has continually made efforts to provide updated information and fixes to the EPC system, it has proved challenging for many libraries to accurately finish and file an application. ALA requested that the Federal Communications Commission (FCC) work with USAC to extend the filing window so libraries that have been struggling would still have an opportunity to apply, especially given the funding available for Category 2 services, which most libraries have not been able to touch for most of the life of the E-rate program. We know that USAC and the FCC both heard the concerns of the library community and responded so that libraries are not inadvertently disadvantaged in this year’s filing window.
Regardless of the window extension, libraries will still need to work through the application process and solve any continuing kinks in their EPC accounts. There’s help out there! If you haven’t yet, connect with your state E-rate coordinator, who is there to help you. Check out Libraryerate, a new peer-to-peer portal for E-rate resources, and be sure to sign up for the News Brief from USAC for the latest information. As we learn more, we will be sure to make the library community aware of it.
As of this posting you have 41 days, 6 hours, 14 minutes, and 32 seconds until the new May 26 deadline. Someone else who is better at math than I am can figure out the remaining hours before July 21. Or better still, allow yourself a little breathing room on a Friday afternoon and be ready to start fresh and take full advantage of the extra time.
The Access 2016 Program Committee invites proposals for participation in this year’s Access Conference, which will be held on the beautiful campus of the University of New Brunswick in the hip city of Fredericton, New Brunswick from 4-7 October.
There’s no special theme to this year’s conference, but — in case you didn’t know — Access is Canada’s annual library technology conference, so … we’re looking for presentations about cutting-edge library technologies that would appeal to librarians, technicians, developers, programmers, and managers.
Access is a single-stream conference that will feature:
• 45-minute sessions,
• lightning talks (try an Ignite-style talk: five minutes to talk while slides—20 in total—automatically advance every 15 seconds),
• a half-day workshop on the last day of the conference; or
• something different entirely (a panel, a puppet show, etc.): dazzle us with a bright idea. We’d love to hear it!
To submit your proposal, please fill out the form. The deadline has been extended from 15 April to 22 April!
Please take a look at the Code of Conduct too.
We’re looking forward to hearing from you!
My first library “job” was as a “volunteen” for the summer reading program at the Garden Grove Chapman Branch of the Orange County Public Library. I did this during the summers in junior high school and into my freshman year of high school. I spent my time helping smaller kids tally up the number of books they had read, doling out prizes, and making suggestions for books they might enjoy. We helped out with decorations, crafts, story time and puppet shows. We auditioned for and rehearsed for the peak moment of the summer reading program series, the annual teen melodrama. I also did “other duties as assigned” — pasting, cutting, sorting books in preparation for shelving (I had a very tenuous grasp of the Dewey Decimal System), “repairing” cheap paperback books that were near the end of their life, and running small errands. I have always liked to stay busy, so I’m sure I drove the librarians crazy with requests for more tasks. I’m also amazed by the relative autonomy I had. Never overlook the power of the 7th and 8th grade work force!
For National Library Week I helped to pull together an OCLC Next series focusing on OCLC staff “first library job” experiences. I’ve always been impressed with the depth and breadth of my OCLC colleagues’ experience working in libraries before coming to OCLC, and the commitment that they continue to show in working with libraries at OCLC. In reading their responses to our questions I’m struck by how many of my colleagues started working in libraries at a very young age. This should be instructive to all of us who work with young people — they may well stick around!
I note that this year the theme of National Library Week is “libraries transform” — the blog posts help to underscore that theme of transformation, both for the libraries we have worked for and with, and also for the careers we have had. There is a lot of wisdom in these posts, so I hope you read and enjoy them.
- My first library job
- The road to librarianship
- Advice to my younger self
- How my first job prepared me for today’s challenges
- The future of libraries
Thanks to all who participated in the series — we had a great response and more content than we could possibly use. Special thanks to Brad Gauder, who did much of the heavy lifting in helping out with this series.
You can share your own “first library story” in the comments below, or on Twitter (use #NLW16 and #OCLCnext).
With the Code4Lib and PLA Conferences behind us, we’re now looking ahead to the Evergreen International Conference coming up on April 20. As you could probably guess, this is our favorite conference of the year! We love Evergreen and we love sharing things we’ve learned. Plus it gathers some of our favorite Evergreen aficionados in one place!
For 2016, Equinox is proud to be a Platinum Sponsor. We’re also sponsoring the Development Hackfest. In addition to our sponsorship roles, the Equinox team is participating in a combined nineteen presentations out of the forty scheduled for the conference. Here’s a sneak peek into those presentations:
- SQL for Humans (Rogan Hamby, Data and Project Analyst, Equinox Software, Inc.)
- Mashcat in Evergreen (Galen Charlton, Infrastructure and Added Services Manager, Equinox Software, Inc.)
- Introduction to the Evergreen Community (Ruth Frasur, Hagerstown-Jefferson Township Library; Kathy Lussier, MassLNC; Shae Tetterton, Equinox Software, Inc.)
- Digging Deeper: Acquisition Reports in Evergreen (Angela Kilsdonk, Equinox Software, Inc.)
- Staging Migrations and Data Updates for Success (Jason Etheridge, Equinox Software, Inc.)
- A Tale of Two Consortiums (Rogan Hamby, Equinox Software, Inc.)
- A More Practical Serials Walkthrough (Erica Rohlfs, Equinox Software, Inc.)
- SQL for Dummies (John Yorio and Dale Rigney, Equinox Software, Inc.)
- It Turns Out That This is a Popularity Contest, After All (Mike Rylander, Equinox Software, Inc.)
- We Are Family: Working Together to Make Consortial Policy Decisions (Shae Tetterton, Equinox Software, Inc.)
- Encouraging Participation in Evergreen II: Tools and Resources (and badges!) (Grace Dunbar, Equinox Software, Inc.)
- Metadata Abattoir: Prime Cuts of MARC (Mike Rylander, Equinox Software, Inc.)
- Serials Roundtable (Erica Rohlfs, Equinox Software, Inc.)
- Back to the Future: The Historical Evolution of Evergreen’s Code and Infrastructure (Mike Rylander and Jason Etheridge, Equinox Software, Inc; Bill Erickson, King County Library System)
- Fund Fun: How to Set Up and Manage Funds in Acquisitions (Angela Kilsdonk, Equinox Software, Inc.)
- Not Your High School Geometry Class: How to Develop for the Browser Client with AngularJS (Galen Charlton and Mike Rylander, Equinox Software, Inc; Bill Erickson, King County Library System)
- To Dream the Impossible Dream: Collaborating to Achieve Shared Vision (Grace Dunbar and Mike Rylander, Equinox Software, Inc.)
- The Catalog Forester: Managing Authority Records in Evergreen, Singly and In Batch (Galen Charlton and Mary Jinglewski, Equinox Software, Inc; Chad Cluff, Backstage Library Works)
- Many Trees, Each Different: Sprucing Up Your Evergreen TPAC (James Keenan, C/W MARS; John Yorio and Dale Rigney, Equinox Software, Inc.)
The Equinox Team will be loading up in the Party Bus (No, we’re not kidding) on Tuesday, April 19. Hope to see you all in Raleigh! Follow along at home using the #evgils16 hashtag on Twitter/Facebook.
Our staff were recently asked to check thousands of ISBNs to find out if we already have the corresponding books in our catalogue. They in turn asked me if I could run a script that would check it for them. It makes me happy to work with people who believe in better living through automation (and saving their time to focus on tasks that only humans can really achieve).
Rather than taking the approach that I normally would, which would be to load the ISBNs into a table in our Evergreen database and then run some queries to take care of the task as a one-off, I opted to try an approach that would enable others to run these sorts of ad hoc reports themselves. As with most libraries, I suspect, we work with spreadsheets a lot, and as our university has adopted Google Apps for Education, we are slowly using Google Sheets more to enable collaboration. So I was interested in figuring out how to build a custom function that would look for the ISBN and then return a simple "Yes" or "No" value according to what it finds.
Evergreen has a robust SRU interface, which makes it easy to run complex queries and get predictable output back, and it normalizes ISBNs in the index so that a search for a 10-digit ISBN will return results for the corresponding 13-digit ISBN. That made figuring out the lookup part of the job easy; after that, I just needed to figure out how to create a custom function in Google Sheets.
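The function boils down to fetching an SRU search for the ISBN and checking whether the hit count is non-zero. The sketch below is illustrative rather than the exact production code: the SRU base URL and the index name in the query are placeholders you would adjust for your own Evergreen server, and the response parsing is a deliberately simple regex on the numberOfRecords element.

```javascript
// Placeholder SRU base URL; substitute your own Evergreen server's endpoint.
var SRU_BASE = 'https://catalogue.example.edu/opac/extras/sru';

// Pure helper: pull the hit count out of an SRU XML response. Working on the
// raw string keeps it testable without a live server.
function parseNumberOfRecords(xml) {
  var m = xml.match(/numberOfRecords>\s*(\d+)\s*</);
  return m ? parseInt(m[1], 10) : 0;
}

/**
 * Custom function for Google Sheets: =CheckForISBN(C2)
 * Returns "Yes" if the catalogue reports at least one match, "No" otherwise.
 */
function CheckForISBN(isbn) {
  var url = SRU_BASE +
    '?version=1.1&operation=searchRetrieve&maximumRecords=0' +
    '&query=' + encodeURIComponent('identifier|isbn:' + isbn);
  var xml = UrlFetchApp.fetch(url).getContentText(); // Apps Script HTTP fetch
  return parseNumberOfRecords(xml) > 0 ? 'Yes' : 'No';
}
```

Paste something along these lines into the sheet's script editor (Tools > Script editor) and save; the function then becomes available in formulas like any built-in.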
Then I just add a column beside the column with ISBN values and invoke the function as (for example) =CheckForISBN(C2).
Given a bit more time, it would be easy to tweak the function to make it more robust, offer variant search types, and contribute it as a module to the Chrome Web Store "Sheet Add-ons" section, but for now I thought you might be interested in it.
Caveats: With thousands of ISBNs to check, occasionally you'll get an HTTP response error ("#ERROR") in the column. You can just paste the formula back in again and it will resubmit the query. The sheet also seems to resubmit the request on a periodic basis, so some of your "Yes" or "No" values might change to "#ERROR" as a result.
I feel that this series is becoming a little long in the tooth. As such, this will be my last post in the series. This series will be aggregated under the following tag: linked data journey.
After spending a good amount of time playing with RDF technologies, reading authoritative literature, and engaging with other linked data professionals and enthusiasts, I have come to the conclusion that linked data, as with any other technology, isn’t perfect. The honeymoon phase is over! In this post I hope to present a high-level, pragmatic assessment of linked data. I will begin by detailing the main strengths of RDF technologies. Next I will note some of the primary challenges that come with RDF. Finally, I will give my thoughts on how the Library/Archives/Museum (LAM) community should move forward to make Linked Open Data a reality in our environment.

Strengths
Modularity. Modularity is a huge advantage of RDF modeling over modeling in other technologies such as XML or relational databases. First, you’re not bound to a single vocabulary, such as Dublin Core; you can describe a single resource using multiple descriptive standards (Dublin Core, MODS, BIBFRAME). Second, you can extend existing vocabularies. Maybe Dublin Core is perfect for your needs, except that you need a more specific “date”. Well, you can create a more specific “date” term and assign it as a sub-property of DC:date. Third, you can say anything about anything: RDF is self-describing. This means you can describe not only resources but also existing and new vocabularies, and even create complex versioning data for vocabularies and controlled terms (see this ASIST webinar). Finally, with SPARQL and reasoning, you can perform metadata cross-walking from one vocabulary to another without the need for technologies such as XSLT. Of course, this approach has its limits (e.g. you can’t cross-walk a broader term to a more specific one).
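To illustrate that last point, here is a hedged sketch of a SPARQL CONSTRUCT query that rewrites one vocabulary's terms as another's. The prefixes are real namespaces, but the specific mapping shown is only an example, not a standard crosswalk.

```sparql
PREFIX dc:     <http://purl.org/dc/terms/>
PREFIX schema: <http://schema.org/>

# Rewrite schema.org description terms as rough Dublin Core equivalents.
# This works term by term; as noted above, a broader term cannot be
# crosswalked into a more specific one.
CONSTRUCT {
  ?resource dc:title   ?name .
  ?resource dc:creator ?creator .
}
WHERE {
  ?resource schema:name ?name .
  OPTIONAL { ?resource schema:creator ?creator . }
}
```

Any triple whose variables are left unbound (here, a resource with no schema:creator) is simply omitted from the constructed graph.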
Linking. Linking data is the biggest selling point of RDF. It is especially valuable for the LAM community because it lets our respective institutions connect their data without the need for cross-referencing. Eventually, when there is enough linked data in the LAM community, we will be able to join our data across institutions, forming a web of knowledge.

Challenges
Identifiers. Uniform Resource Identifiers (URIs) are a double-edged sword when it comes to RDF. URIs let us uniquely identify every resource we describe, making it possible to link resources together. They also make it much less complicated to aggregate data from multiple data providers. However, creating a URI for every resource and maintaining stable URIs (which I think will be a requirement if we’re going to pull this off) can be cumbersome for a data provider, as well as rather costly.
Duplication. I have been dreaming of the day when we can simply link our data together across repositories, so that we no longer need to ingest external data into our local repositories. That would relieve the duplication challenges we currently face. Well, we’re going to have to wait a little longer. While there are mechanisms that could tackle the problem of data duplication, they are unreliable. For example, SPARQL supports what is called a “federated query,” which queries multiple SPARQL endpoints and thus offers the potential to de-duplicate data by accessing it at its original source. However, I’ve been told by linked data practitioners that public SPARQL endpoints are delicate and can crash when placed under too much stress. Public SPARQL endpoints and federated querying are great for individuals doing research and small-scale querying, but not so much for robust, large-scale data access. For now, best practice is still to ingest external data into local repositories.

Moving forward
Over the past few years I have dedicated a fair amount of research time to developing my knowledge of linked data. During that time I have formed some thoughts on moving forward with linked data in the LAM community. These thoughts are my own and should be weighed against others’ opinions and recommendations.
Consortia-level data models. Being able to fuse vocabularies together for resource description is powerful, but it brings a new level of complexity to data sharing. One institution might use DC:title, DC:date, and schema:creator. Another might use schema:name (the DC:title equivalent), DC:date, and DC:creator. Even though both institutions are pulling from the same vocabularies, they are using different terms, which poses a problem when trying to aggregate data from both. I still see consortia such as the Open Archives Initiative forming their own requirements for data sharing; this can be seen now in the Digital Public Library of America (DPLA) and Europeana data models (here and here, respectively).
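To make the aggregation problem concrete, here is a hypothetical Turtle sketch (the resource URIs are invented) of the same book described by the two institutions above:

```turtle
@prefix dc:     <http://purl.org/dc/terms/> .
@prefix schema: <http://schema.org/> .

# Institution A
<http://example.org/a/book1>
    dc:title       "Moby-Dick" ;
    dc:date        "1851" ;
    schema:creator "Herman Melville" .

# Institution B
<http://example.org/b/book1>
    schema:name "Moby-Dick" ;
    dc:date     "1851" ;
    dc:creator  "Herman Melville" .
```

Both descriptions are valid RDF, but an aggregator has to know that schema:name and DC:title carry the same meaning here before it can merge the two, which is exactly what a consortium-level data model pins down.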
LD best practices. Linked data in the LAM community is in the “wild west” stages of development. We’re experimenting, researching, presenting primers to RDF, etc. However, RDF and linked data have been around for a while (a public draft of RDF was presented in 1997, seen here). As such, the larger linked data and semantic web community has established best practices for creating RDF data models and linked data. To integrate seamlessly into that larger community, we will need to adopt and adhere to these best practices.
Linked Open Data. Linked data is not inherently “open”; data providers have to make the effort to put the “open” in Linked Open Data. To get the most out of linked data, and to follow the “open” movement in libraries, I feel there needs to be an emphasis on data providers publishing completely open and accessible data, regardless of format and publishing strategy.

Conclusion
Linked data is the future of data in the LAM community. It’s not perfect, but it is an upgrade to existing technologies and will help the LAM community promote open and shared data.
I hope you enjoyed this series. I encourage you to venture forward and start experimenting with linked data if you haven’t already; there are plenty of resources out there on the topic. As always, I’d like to hear your thoughts, so please feel free to reach out in the comments below or through Twitter. Until next time.
This year's Drupal in Libraries Birds of a Feather session will be on Wednesday, May 11th from 3:45 to 4:45 in the Cherry Hill BoF Room (291) at the Morial Convention Center.
There is no agenda, so please bring your questions and stories. We would all love to see what you have been up to.
Among the things that we are interested in are the upcoming version of Islandora and summer reading programs.