
Feed aggregator

John Miedema: Slow reading six years later. Digital technology has evolved, and so have I. There is a trade-off.

planet code4lib - Tue, 2014-09-09 12:59

I was recently interviewed by The Wall Street Journal about slow reading. It has been a few years since I did one of these interviews. I wrote Slow Reading in 2008, six years ago. At the time, the Kindle had just been released and there was a surge of discussion about reading practices, to which I attribute the interest in my little book of research. The request for an interview suggests an ongoing interest in slow reading. So what do I have to say about the subject now?

I used to slow-read often. I would write book reviews, thinking myself progressive in a digital sense for blogging reviews in just four paragraphs. A shift began. My ongoing use of digital technology to read, write and think forced that shift along. I tried to write about that shift in a new online book project (I, Reader) but I failed. The shift was still in progress. I hit a wall at one point. I thought for a time I had reached the end of reading. In 2013, I stopped reading and writing. A year later I started again. I have a good perspective on the shift, but I have no immediate plans to resume writing about it.

So what did I tell the interviewer about slow reading? I confessed that I slow-read print books less often. I re-asserted that “Slow reading is a form of resistance,  challenging a hectic culture that requires speed reading of volumes of information fragments.” I admitted that my resistance is waning. Digital technology has evolved to allow for reading, not just for scanning of information fragments, but also for comprehension of complex and rich material. I was surprised and pleased to discover how digital technology has re-programmed my reading and writing skills to process information more quickly and deeply. I am smarter than I used to be.

I have resumed my writing of book reviews. I restored a selection of book reviews from the past, ones relevant to my current blogging purposes. I will be writing new reviews, probably less often. I will be writing them differently. Currently I am reading Book Was There: Reading in Electronic Times by Andrew Piper. I no longer take notes on paper as I read. I have been tweeting notes. I like the way it is evolving. I use a hashtag for the title and author, and sometimes a reader joins in. When I am done, I will write a very short review, two paragraphs tops, and post it here.

That’s not all I said to the interviewer. I said there has been a trade-off because of digital technology. There is always a trade-off. We just have to decide whether the gains are more than the losses. What have we lost? I lingered on this question because the loss is less than I anticipated. We still read. We still read rich and complex material. Students still prefer print books for serious reading but I expect they are going through the same transition as I did. What is lost, I assert, is long-form writing. Books born print can be scanned and put online, but books born digital are getting shorter all the time. It is no coincidence that my book, Slow Reading, was short. I was already a reader in transition. Digital technology prefers shortness. It is one reason that many kinds of poetry will survive and thrive on the web. Things should be as short and simple as possible (but not simpler, per the quote attributed to Einstein). Long-form novels and textbooks will be lost in time. It is a loss. Is it worth it?

Jakob Voss: Abbreviated URIs with rdfns

planet code4lib - Tue, 2014-09-09 09:26

Working with RDF and URIs can be annoying because URIs such as “http://purl.org/dc/elements/1.1/title” are long and difficult to remember and type. Most RDF serializations make use of namespace prefixes to abbreviate URIs, for instance “dc” is frequently used to abbreviate “http://purl.org/dc/elements/1.1/” so “http://purl.org/dc/elements/1.1/title” can be written as qualified name “dc:title”. This simplifies working with URIs, but someone still has to remember mappings between prefixes and namespaces. Luckily there is a registry of common mappings at prefix.cc.

A few years ago I created the simple command line tool rdfns and a Perl library to look up URI namespace/prefix mappings. The program is now also available as the Debian and Ubuntu package librdf-ns-perl. The newest version (not included in Debian yet) also supports reverse lookup to abbreviate a URI to a qualified name. Features of rdfns include:

look up namespaces (as RDF/Turtle, RDF/XML, SPARQL…)

$ rdfns foaf.ttl foaf.xmlns dbpedia.sparql foaf.json
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
xmlns:foaf="http://xmlns.com/foaf/0.1/"
PREFIX dbpedia: <http://dbpedia.org/resource/>
"foaf": "http://xmlns.com/foaf/0.1/"

expand a qualified name

$ rdfns dc:title
http://purl.org/dc/elements/1.1/title

look up a preferred prefix

$ rdfns http://www.w3.org/2003/01/geo/wgs84_pos#
geo

create a short qualified name of a URL

$ rdfns http://purl.org/dc/elements/1.1/title
dc:title

I use RDF-NS for all RDF processing to improve readability and to avoid typing long URIs. For instance Catmandu::RDF can be used to parse RDF into a very concise data structure:

$ catmandu convert RDF --file rdfdata.ttl to YAML
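The expansion and reverse lookup described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how rdfns or the Perl library is implemented: the `NAMESPACES` table is a small hard-coded subset chosen for the example (nothing is fetched from prefix.cc), and the function names `expand` and `abbreviate` are hypothetical.

```python
# Hypothetical, hard-coded subset of prefix/namespace mappings.
# A real tool would load these from a registry such as prefix.cc.
NAMESPACES = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "geo": "http://www.w3.org/2003/01/geo/wgs84_pos#",
}


def expand(qname, namespaces=NAMESPACES):
    """Expand a qualified name such as 'dc:title' to a full URI."""
    prefix, sep, local = qname.partition(":")
    if not sep or prefix not in namespaces:
        raise KeyError(f"unknown prefix in {qname!r}")
    return namespaces[prefix] + local


def abbreviate(uri, namespaces=NAMESPACES):
    """Abbreviate a URI to a qualified name, or return None if no namespace matches."""
    best = None
    for prefix, namespace in namespaces.items():
        # Prefer the longest matching namespace so the local name stays short.
        if uri.startswith(namespace):
            if best is None or len(namespace) > len(namespaces[best]):
                best = prefix
    if best is None:
        return None
    return f"{best}:{uri[len(namespaces[best]):]}"


print(expand("dc:title"))
print(abbreviate("http://purl.org/dc/elements/1.1/title"))
```

Preferring the longest matching namespace is a design choice of this sketch; it matters only when one registered namespace is a prefix of another.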

Jonathan Rochkind: Cardo is a really nice free webfont

planet code4lib - Tue, 2014-09-09 04:39

Some of the fonts on google web fonts aren’t that great. And I’m not that good at picking the good ones from the not-so-good ones on first glance either.

Cardo is a really nice old-style serif font that I originally found recommended on some list of “the best of google fonts”.

It’s got a pretty good character repertoire for Latin text (and I think Greek). The Google Fonts version doesn’t seem to include Hebrew, even though some other versions might? For library applications, the more characters the better, and it should have enough to deal stylishly with whatever letters and diacritics you throw at it in Latin/Germanic languages, and all the usual symbols (currency, punctuation, etc.).

I’ve used it in a project that my eyeballs have spent a lot of time looking at (not quite done yet), and been increasingly pleased by it; it’s nice to look at and to read, especially on a ‘retina’ display. (I wouldn’t use it for headlines, though.)



DPLA: DPLA & Imgur’s Summer of Archives Comes to a Close

planet code4lib - Tue, 2014-09-09 00:58

Back in June, we announced our collaboration with the Digital Public Library of America (DPLA) for the Summer of Archives–an experimental gallery endeavor that brought tons of historical OC gems to User Submitted. From perfectly looping space GIFs, to famous cats of history, to beautiful book covers, to celestial maps, we’re happy to call this experiment a huge and awesome success.

The very last Summer of Archives post is live in User Submitted right now. We’re going out the same way we came in–with historical GIFs!

Huge thanks to the DPLA for sharing this special content with Imgur all summer long. Be sure to check the DPLA Imgur account to revisit all of the submissions. If your thirst for history cannot be quenched, head over to the DPLA website for a vast array of great content.

This blogpost was originally published on the Imgur.com blog (view on Imgur.com).

William Denton: Augustus

planet code4lib - Tue, 2014-09-09 00:32

A few people recommended Stoner by John Williams to me, and they were right. It’s a gem.

I was in Book City tonight and the clerk was selling a customer on Stoner for a book club. Browsing the new release tables with Williams on my mind I saw a similar new edition from New York Review Books of Augustus, which is about that Augustus.

The first line is a doozy:

… I was with him at Actium, when the sword struck fire from metal, and the blood of soldiers was awash on deck and stained the blue Ionian Sea, and the javelin whistled in the air, and the burning hulls hissed upon the water, and the day was loud with the screams of men whose flesh roasted in the armor they could not fling off; and earlier I was with him at Mutina, where that same Marcus Antonius overran our camp and the sword was thrust into the empty bed where Caesar Augustus had lain, and where we persevered and earned the first power that was to give us the world; and at Philippi, where he traveled so ill he could not stand and yet made himself to be carried among his troops in a litter, and came near death again by the murderer of his father, and where he fought until the murderers of the mortal Julius, who became a god, were destroyed by their own hands.

DuraSpace News: Update 5: Beta Pilot Projects Set to Kick-Off

planet code4lib - Tue, 2014-09-09 00:00

From David Wilcox, Fedora Product Manager

Winchester, MA  This is the fifth in a series of updates on the status of Fedora 4.0 as we move from the Beta [1] to the Production Release. The updates are structured around the goals and activities outlined in the July-December 2014 Planning document [2], and will serve to both demonstrate progress and call for action as needed. New information since the last status update is highlighted in bold text.

Library of Congress: The Signal: Hybrid Born-Digital and Analog Special Collecting: Megan Halsband on the SPX Comics Collection

planet code4lib - Mon, 2014-09-08 17:29

Megan Halsband, Reference Librarian with the Library of Congress Serial and Government Publications Division.

Every year, the Small Press Expo in Bethesda, Md., brings together a community of alternative comic creators and independent publishers. Because the Library of Congress has a significant history of collecting comics, it made sense for its Serial and Government Publications Division and Prints & Photographs Division to partner with SPX to build a collection documenting alternative comics and comics culture. In the last three years, this collection has been developing and growing.

While the collection itself is quite fun (what’s not to like about comics?), it is also a compelling example of the way that web archiving can complement and fit into work developing a special collection. To that end, I am excited to talk with Megan Halsband, Reference Librarian with the Library of Congress Serial and Government Publications Division and one of the key staff working on this collection, as part of our Content Matters interview series.

Trevor: First off, when people think Library of Congress I doubt “comics” is one of the first things that comes to mind. Could you tell us a bit about the history of the Library’s comics collection, the extent of the collections and what parts of the Library of Congress are involved in working with comics?

Megan: I think you’re right – the comics collection is not necessarily one of the things that people associate with the Library of Congress – but hopefully we’re working on changing that! The Library’s primary comics collections are two-fold – first there are the published comics held by the Serial & Government Publications Division, which appeared in newspapers/periodicals and later in comic books, as well as the original art, which is held by the Prints & Photographs Division.

Example of one of the many comics available through The Library of Congress National Digital Newspaper Program. The End of a Perfect Day. Mohave County miner and our mineral wealth (Kingman, Ariz.) October 14, 1921, p.2.

The Comic Book Collection here in Serials is probably the largest publicly available collection in the country, with over 7,000 titles and more than 125,000 issues. People wonder why our section at the Library is responsible for the comic books – and it’s because most comic books are published serially. Housing the comic collection in Serials also makes sense, as we are also responsible for the newspaper collections (which include comics). The majority of our comic books come through the US Copyright Office via copyright deposit, and we’ve been receiving comic books this way since the 1930s/1940s.

The Library tries to have complete sets of all the issues of major comic titles but we don’t necessarily have every issue of every comic ever published (I know what you’re thinking and no, we don’t have an original Action Comics No. 1 – maybe someday someone will donate it to us!). The other main section of the Library that works with comic materials is Prints & Photographs – though Rare Book & Special Collections and the area studies reading rooms probably also have materials that would be considered ‘comics.’

Trevor: How did the idea for the SPX collection come about? What was important about going out to this event as a place to build out part of the collection? Further, in scoping the project, what about it suggested that it would also be useful/necessary to use web archiving to complement the collection?

Megan: The executive director of SPX, Warren Bernard, has been working in the Prints & Photographs Division as a volunteer for a long time, and the collection was established in 2011 after a Memorandum of Understanding was signed between the Library and SPX. I think Warren really was a major driving force behind this agreement, but the curators in both Serials and Prints & Photographs realized that our collections didn’t include materials from this particular community of creators and publishers in the way that they should.

Small Press Expo floor in 2013

Given that SPX is a local event with an international reputation and awards program (SPX awards the Ignatz) and the fact that we know staff at SPX, I think it made sense for the Library to have an ‘official’ agreement that serves as an acquisition tool for material that we probably wouldn’t otherwise obtain. Actually going to SPX every year gives us the opportunity to meet with the artists, see what they’re working on and pick up material that is often only available at the show – in particular mini-comics or other free things.

Something important to note is that the SPX Collection – the published works, the original art, everything – is all donated to the Library. This is huge for us – we wouldn’t be able to collect the depth and breadth of material (or possibly any material at all) from SPX otherwise.  As far as including online content for the collection, the Library’s Comics and Cartoons Collection Policy Statement (PDF) specifically states that the Library will collect online/webcomics, as well as award-winning comics. The SPX Collection, with its web archiving component,  specifically supports both of these goals.

Trevor:  What kinds of sites were selected for the web archive portion of the collection? In this case, I would be interested in hearing a bit about the criteria in general and also about some specific examples. What is it about these sites that is significant? What kinds of documentation might we lose if we didn’t have these materials in the collection?

Archived web page from the American Elf web comic.

Megan: Initially the SPX webarchive (as I refer to it – though its official name is Small Press Expo and Comic Art Collection) was extremely selective – only the SPX website itself and the annual winner of the Ignatz Award for Outstanding Online Comic were captured. The staff wanted to see how hard it would be to capture websites with lots of image files (of various types). Turns out it works just fine (as long as no paywall or subscriber login credentials are required) – so we expanded the collection to include all the Ignatz nominees in the Outstanding Online Comic category as well.

Some of these sites, such as Perry Bible Fellowship and American Elf, are long-running online comics whose creators have been awarded Eisner, Harvey and Ignatz awards. There’s a great deal of content on these websites that isn’t published or available elsewhere – and I think that this is one of the major reasons for collecting this type of material. Sometimes the website might have initial drafts or ideas that later are published, sometimes the online content is not directly related to published materials, but for in-depth research on an artist or publication, often this type of related content is extremely useful.

Trevor: You have been working with SPX to build this collection for a few years now. Could you give us an overview of what the collection consists of at this point? Further, I would be curious to know a bit about how the idea of the collection is playing out in practice. Are you getting the kinds of materials you expected? Are there any valuable lessons learned along the way that you could share? If anyone wants access to the collection how would they go about that?

Megan: At this moment in time, the SPX Collection materials that are here in Serials include acquisitions from 2011-2013, plus two special collections that were donated to us, the Dean Haspiel Mini-Comics Collection and the Heidi MacDonald Mini-Comics Collection.  I would say that the collection has close to 2,000 items (we don’t have an exact count since we’re still cataloging everything) as well as twelve websites in the web archive. We have a wonderful volunteer who has been working on cataloging items from the collection, and so far there are over 550 records available in the Library’s online catalog.

Mini comics from the SPX collection

Personally, I didn’t have any real expectations of what kinds of materials we would be getting – I think that definitely we are getting a good selection of mini-comics, but it seems like there are more graphic novels than I anticipated. One of the fun things about this collection is the new and exciting things that you end up finding at the show – like an unexpected tiny comic that comes with its own magnifying glass or an oversize newsprint series.

The process of collecting has definitely gotten easier over the years. For example, the Head of the Newspaper Section, Georgia Higley, and I just received the items that were submitted in consideration for the 2014 Ignatz Awards. We’ll be able to prep permission forms/paperwork in advance of the show for the materials we’re keeping from this material, and it will help us cut down on potential duplication. This is definitely a valuable lesson learned! We’ve also come up with a strategy for visiting the tables at the show – there are 287 tables this year – so we divide up the ballroom between four of us (Georgia and I, as well as two curators from Prints & Photographs – Sara Duke and Martha Kennedy) to make it manageable.

We also try to identify items that we know we want to ask for in advance of the show – such as ongoing serial titles or debut items listed on the SPX website – to maximize our time when we’re actually there. Someone wanting to access the collection would come to the Newspaper & Current Periodical Reading Room to request the comic books and mini-comics. Any original art or posters from the show would be served in the Prints & Photographs Reading Room. As I mentioned – there is still a portion of this collection that is unprocessed – and may not be immediately accessible.

Trevor: Stepping back from the specifics of the collection, what about this do you think stands for a general example of how web archiving can complement the development of special collections?

Megan: One of the true strengths of the Library of Congress is that our collections often include not only the published version, but also the ephemeral material related to the published item/creator, all in one place. From my point of view, collecting webcomics gives the Library the opportunity to collect some of this ‘ephemera’ related to comics collections and only serves to enhance what we are preserving for future research. And as I mentioned earlier, some of the content on the websites provides context, as well as material for comparison, to the physical collection materials that we have, which is ideal from a research perspective.

Trevor: Is there anything else with web archiving and comics on the horizon for your team? Given that web comics are such a significant part of digital culture I’m curious to know if this is something you are exploring. If so, is there anything you can tell us about that?

Megan: We recently began another web archive collection to collect additional webcomics beyond those nominated for Ignatz Awards – think Dinosaur Comics and XKCD. It’s very new (and obviously not available for research use yet) – but I am really excited about adding materials to this collection. There are a lot of webcomics out there – and I’m glad that the Library will now be able to say we have a selection of this type of content in our collection! I’m also thinking about proposing another archive to capture comics literature and criticism on the web – stay tuned!

HangingTogether: Innovative solutions for dealing with born-digital content in obsolete forms – Part 1

planet code4lib - Mon, 2014-09-08 17:00

[Tweet] AB Schmuland: Obsolete media brings them in at 8 am EDT on a Saturday! #saa14 #s601 http://t.co/9BaDz0IhOs

I chaired a lightning talk session at SAA 2014 in Washington DC on August 16. The premise was that many archives have received materials in forms that they cannot even read. Archives are acquiring born-digital content at increasing rates and it’s hard enough to keep up with current formats. It makes sense to reach out to the community for help with more obscure media. I found ten speakers who had confronted this problem and figured out innovative solutions to getting material into a form that could be more easily managed.

[Tweet] Jennifer Schaffner: “my name is ___ and I have born-digital on crazy old media that I can barely identify that I have no idea what to do with” #saa14 #s601

The speakers’ stories were so encouraging to others in similar situations that I wanted to share them further.

This is the first of three posts. We start with a talk about the array of media an archives might confront, followed by a talk about an effort to test how much can be done in house.

Lynda Schmitz Fuhrig, the Electronic Records Archivist at the Smithsonian Institution Archives, urged archivists to ingest materials off removable media as soon as possible, if possible. She itemized some of the more typical physical media the SI Archives has and the workstations they maintain to access them. Then she told of some successes they’d had getting content off less typical forms, like Digital Audio Tapes, data tapes, interactive compact discs, and digital videocassettes.

[Tweet] Kevin Schlottmann: National air and space website from 1994 recovered from tape in 2012 #s601 #saa14

Finally she cautioned about some of the media we may be overly confident about: CDs and DVDs – not just that drives to read them are no longer standard issue, but that their life spans can vary dramatically.

She suggested looking to schools, eBay, craigslist, and listservs to obtain out of date equipment and considering whether another archives could help with your format. For formats that simply cannot be read, she raised the possibility of waiting until a researcher wants it and seeing if the researcher is willing to pay to have a vendor transfer the data.

Moryma Aydelott, Special Assistant to the Director of Preservation at the Library of Congress, described developing cross-division in-house workflows for processing 3 ½” and 5 ¼” floppy disks.

The goal was to get a backup copy of the items stored on long term storage, while encouraging standard practices and increasing staff digital competencies. She described the software used (xcopy and FTK Imager) to get complete and unchanged copies of the content. Tabs that make the floppies read-only were used to prevent disks being accidentally overwritten during copying. After reading data off the disks, the workflow included steps to create checksums and other files using the BagIt specification, and for items to be inventoried as they’re saved to tape-based long term storage. The workflows were documented, staff was trained, and processes were customized to particular situations.
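The checksum step in a workflow like this can be sketched as follows. This is a minimal illustration and not the Library of Congress workflow itself, which the talk describes only at a high level: it assumes the copied files already sit in a `data/` directory inside a bag folder, uses SHA-256, and writes only the two files a BagIt bag minimally needs. The function names and the chosen version string are assumptions made for the example.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_bag_manifest(bag_dir):
    """Write bagit.txt and manifest-sha256.txt for the files under <bag_dir>/data.

    Returns the manifest lines so a caller can log or inventory them.
    """
    bag = Path(bag_dir)
    payload = bag / "data"  # assumed layout: payload files live under data/
    lines = []
    for path in sorted(p for p in payload.rglob("*") if p.is_file()):
        relative = path.relative_to(bag).as_posix()
        lines.append(f"{sha256_of(path)}  {relative}")
    # Bag declaration; the version string here is an assumption of this sketch.
    (bag / "bagit.txt").write_text(
        "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )
    (bag / "manifest-sha256.txt").write_text(
        "".join(line + "\n" for line in lines), encoding="utf-8"
    )
    return lines
```

Recomputing the digests later and comparing them with the manifest is what lets staff confirm that a copy on long-term storage is still complete and unchanged.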

[Tweet] Sasha Griffin: Balance outsourcing with developing staff competences in-house #s601 #saa14

Curatorial divisions had been contemplating transferring data off of these media but were unsure how to start, and this project gave them some help and confidence to get going. Now the Preservation Reformatting Division is assembling a lab with scanners, portable drives, and a FRED machine. It will be available to staff in all LC curatorial divisions and those staff are helping to determine other hardware and software the lab should include. A committee has formed to develop scalable ways of processing materials that can’t be processed in house.

Next up: Part 2 will continue with four speakers talking about solutions to particularly challenging formats.

About Ricky Erway

Ricky Erway, Senior Program Officer at OCLC Research, works with staff from the OCLC Research Library Partnership on projects ranging from managing born digital archives to research data curation.


OCLC Dev Network: Enhancements Planned for September 14

planet code4lib - Mon, 2014-09-08 14:00

In addition to the upcoming VIAF changes we shared last week (currently planned for September 16), a separate  release on September 14 will bring enhancements to a couple of our WorldShare Web services.

Hydra Project: Hydra Connect #2 is a sell-out!

planet code4lib - Mon, 2014-09-08 09:39

We’re more than pleased to tell you that Hydra’s second Connect meeting, to be held in Cleveland 30 September – 3rd October, is a sell-out!  Not only have we sold all the tickets, we have a waiting list of people hoping we might manage to find a little more space.  We’re looking forward to seeing 160 faces, friends old and new, at Case Western Reserve University in three weeks.

HangingTogether: Linked Data Survey results 6 – Advice from the implementers

planet code4lib - Mon, 2014-09-08 08:00

OCLC Research conducted an international linked data survey for implementers between 7 July and 15 August 2014. This is the sixth--and last--post in the series reporting the results.  

An objective in conducting this survey was to learn from the experiences of those who had implemented or were implementing linked data projects/services.  We appreciate that so many gave advice. About a third of those who have implemented or are implementing a linked data project are planning to implement another within the next two years; another third are not sure.

Asked what they would do differently if they were starting their project again, respondents answered with issues clustered around organizational support and staffing, vocabularies, and technology. One noted that legal issues seriously delayed the release of their linked data service and that legal aspects need to be addressed early.

Organizational support and staffing:

  • Have a clear mandate for the project. Our issues have stemmed from our organization, not the technology or concept.
  • It would have been useful to have a greater in-house technical input.
  • With hindsight we have more realistic expectations. if funding would allow I would hire a programmer to the project.
  • Attempt to garner wider organisational support and resources before embarking on what was in essence a very personal project.
  • We also would have preferred to have done this as an official project, with staff resources allocated, rather than as an ad-hoc, project that we’ve crammed into our already full schedules.
  • Have dedicated technical project manager – or at least a bigger chunk of time.
  • Have more time planned and allocated for both myself and team members.

Vocabularies

  • Build an ontology and formal data model from the ground up.
  • Align concepts we are publishing with other authorities, most of which didn’t exist at the time.
  • Vocabulary selection, avoid some of the churn related to that process.
  • Make more accurate and detailed records so that it is easier for people using the data to clear up ambiguity of similar names.
  • I might seek a larger number of partners to contribute their controlled vocabularies or thesauri in advance.

Technology

  • We would immediately use Open Refine to extract and clean up data after the first export from Access
  • We would provide a SPARQL endpoint for the data if we had the opportunity.
  • We would give more thought to service resilience from the perspective of potential denial of service attacks.
  •  Well define the schema first before we generated the records. Use the schema to validate all of the records before we stored them in the system’s database.
  • It is still a pity that the Linked Data Pilot is not more integrated to the production system. It would have been easier if the LOD principles had been included in this production system from the beginning.
  • We might have done more to help our vendor understand the complexity of the LCNAF data service as well as the complexity of the MARC authority format.
  • Better user experience; we chose to focus on data mining vs data use.
  • Transforming the source data into semantic form, before attempting process (clustering, clean up, matching).
  • A stable infrastructure is vital for the scalability of the project.

General advice

Much of the advice for both those considering projects to consume linked data and those considering projects to publish linked data clusters around preparation and project management:

  • Ask what benefit doing linked data at all will really have.
  • There is more literature and online information relating to consuming linked data than there was when we started so our advice would be to read as widely as possible and consult with experts in the community.
  • Get a semantic web expert in the team
  • The same as any other project: have a detailed programme.
  • Have a focus. Do your research. Overestimate time spent.
  • Take a Linked Data class
  • Estimate the time required for the project and then double it.  The time to explain details of MARC, EAD, and local practices and standards to the vendor, to test functionality of the application, and to test navigational design elements of the application all require dedicated blocks of time.
  • Bone up on your tech skills.  It’s not magic; there is no wand you can wave.
  • Basic project management, basic data management, basic project planning are really important at the onset.
  • Having a detailed program before starting. Get institutional commitment.  Unless the plan is to do the smallest thing… the investment is great enough to warrant some kind of administrative blessing, at the minimum.
  • Take advantage of the many great (and free) resources for learning about RDF and linked data.
  • Start with a small project and then apply the knowledge gained and the tools built to larger scale projects.
  • Find people at other institutions who are doing linked data so you can bounce ideas off of each other.
  • Plan, plan, plan! Do research. Understand all that there is going on and what challenges you will have before you reach them.
  • Automate, automate, automate

Advice for those considering a project to consume linked data

  • Linking to external datasets is very important but also very difficult.
  • Find authorities for your specific domain from the outset, and if they don’t exist don’t be afraid to define and publish your own concepts.
  • Firm understanding of ontologies
  • Use CIDOC CRM / FRBRoo for cultural heritage sources. It will be far more cost-effective and provide the highest quality of data that can be integrated preserving the variability and language of your data.
  • Pick a problem you can solve. Start with schema.org as core vocabulary. Lean toward JSON-LD instead of rdfxml. Like agile fail quick and often. Store the index in a triplestore.
  • Make a decision what kind of granularity of data you want to make available as linked data – no semantics for now. We cannot make our data to transform as linked data as one to one relationship – there should be a data that will not be available in linked data. If you want to make your data discoverable, then schema.org semantic will work the best.
  • Sometimes the data available just won’t work with your project. Keep in mind that something may look like a match at first but the devil is in the details. 
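
The suggestion above to start with schema.org and lean toward JSON-LD can be pictured with a minimal sketch. It is an illustration only, not something taken from the survey responses: the function name, the example.org identifiers, the title, and the author are all hypothetical placeholders.

```python
import json


def book_as_jsonld(uri, title, author_name, author_uri=None):
    """Describe a book as a schema.org JSON-LD document (a plain dict)."""
    author = {"@type": "Person", "name": author_name}
    if author_uri:
        # Point at an authority URI rather than relying on a bare name string.
        author["@id"] = author_uri
    return {
        "@context": "https://schema.org",
        "@id": uri,
        "@type": "Book",
        "name": title,
        "author": author,
    }


doc = book_as_jsonld(
    "http://example.org/book/1",  # hypothetical identifier
    "An Example Title",  # hypothetical title
    "Jane Example",  # hypothetical author
    author_uri="http://example.org/person/42",  # hypothetical authority URI
)
serialized = json.dumps(doc, indent=2)
print(serialized)
```

Because JSON-LD is ordinary JSON, the standard library is enough to produce it; a triplestore or RDF toolkit only becomes necessary once the data has to be queried or merged with other graphs.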

Advice for those considering a project to publish linked data

General advice: “Try to consume it first!”

Project management

  • It’s possible to participate in linked data projects even by producing data and leaving the work of linking to others.
  • Managing expectations of content creators is tough – people often have expectations of linked data that aren’t possible. The promise of being able to share and link things up can efface the work required to prepare materials for publication.
  • Always look at what others have done before you. Build a good relationship with the researcher with whom you are working; leverage the knowledge and experience of that person or persons. Carefully plan your project ahead of time, in particular the metadata.
  • Look at the larger surrounding issues.  It is not enough to just dump your data out there.  Be prepared to perform some sort of analytics to capture information as to uses of the data.  Also include a mechanism for feedback about the data and requested improvements/enhancements.  The social contract of linked data is just as important as the technical aspects of transforming and publishing the data.
  • Just do it, but consider if you’re just adding more bad data to the web — dumping a set of library records to RDF is pointless. Consider the value of publishing data. Reusing data is probably more interesting.
  • The assumption that the data needs to be there in order to be used is, I think, wrong. The usefulness of data is in its use; create a service one uses oneself and it is valuable and useful. Whether others actually use it is irrelevant.
  • Pay attention to reusing existing ontologies in order to improve interoperability and user comprehension of your published data.

Technical advice

  • Publish the highest quality possible that will also achieve semantic and contextual harmonisation. Otherwise you will end up doing it again, so this is far more cost-effective and gets the best results.
  • Don’t use fixed field/value data models. For cultural heritage data use CIDOC CRM / FRBRoo.
  • Offer a SPARQL endpoint to your data.
  • Use JSON-LD.
  • Museums need to take a good look at their data and make sure that they create granular data, i.e. each concept (actors, keywords, terms, objects, events, …) needs to have a unique ID, which in turn will be referenced in URIs. Publishing linked data also means embracing a graph data structure, which is a total departure from the traditional relational data structure: linked data forces you to make explicit what is only implicit in the database. Modeling data for events is challenging but rewarding. Define what data entities your museum is responsible for… Being able to define URIs for entities means being able to give them unique identifiers, and there are many data issues that need to be taken care of within an institution. It is also very important that producing LOD requires the data manager to think differently about data, and not about information. LOD requires that you make explicit knowledge that is only implicit in a traditional relational database.
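
The point about unique IDs, URIs, and making implicit relationships explicit can be illustrated with a small sketch. Everything in it is assumed for illustration: the example.org namespace, the `ex:` predicate names, and the sample row are hypothetical, and a real project would use an established ontology (the respondents above suggest CIDOC CRM / FRBRoo) rather than made-up predicates.

```python
BASE = "http://example.org/"  # hypothetical namespace for minted URIs


def mint_uri(kind, local_id):
    """Build a stable URI from an entity type and its unique local ID."""
    return f"{BASE}{kind}/{local_id}"


def row_to_triples(row):
    """Turn one relational row into explicit subject-predicate-object triples.

    In the table, the link between object and maker is only implied by the
    maker_id column; in the graph it becomes a statement of its own.
    """
    obj = mint_uri("object", row["object_id"])
    actor = mint_uri("actor", row["maker_id"])
    return [
        (obj, "rdf:type", "ex:Object"),
        (obj, "ex:title", row["title"]),
        (obj, "ex:madeBy", actor),  # the formerly implicit relationship
        (actor, "rdf:type", "ex:Actor"),
        (actor, "ex:name", row["maker_name"]),
    ]


# Hypothetical row as it might come out of a collections database.
row = {
    "object_id": 7,
    "title": "Example Vase",
    "maker_id": 3,
    "maker_name": "Example Maker",
}
triples = row_to_triples(row)
for triple in triples:
    print(triple)
```

The maker becomes an entity with its own URI rather than a text value in a column, so other datasets can refer to the same actor.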

Recommended Resources

This is a compilation of resources (conferences, linked data projects, listservs, websites) that respondents found particularly valuable in learning more about linked data.

Conferences valuable in learning more about linked data: American Medical Informatics Association meetings,  Computer Applications in Archaeology, Code4Lib conferences, Digital Library Federation’s forums, Dublin Core Metadata Initiative, European Library Automation Group, European Semantic Web Conferences, International Digital Curation Conference, International Semantic Web Conference, Library and Information Technology Association’s national forums, Metadata and Digital Object Roundtable (in association with the Society of American Archivists), Scholarly Publishing and Academic Resources Coalition conferences, Semantic Web in Libraries, Theory and Practice of Digital Libraries

Linked data projects implementers track:

  • 270a Linked Dataspaces
  • AMSL, an electronic management system based on linked data technologies
  • Library of Congress’ BIBFRAME (included in the survey responses)
  • Bibliothèque Nationale de France’s Linked Open Data project
  • Bibliothèque Nationale de France’s OpenCat: Interesting data model – lightweight FRBR model together with reuse of commonly used web ontologies (DC; FOAF, etc.); scalable open source platform (cubicweb). Opencat aims to demonstrate that data published on data.bnf.fr can be re-used by other libraries, in particular public libraries.
  • COMSODE (Components Supporting the Open Data Exploitation)
  • Deutsche National Bibliothek’s Linked Data Service
  • Yale Digital Collections Center’s Digitally Enabled Scholarship with Medieval Manuscripts, linked data-based.
  • ESTC (English Short-Title Catalogue): Moving to a linked data model; tracked because one of the aims is to build communities of interest among researchers.
  • Libhub: Of interest because it has the potential to assess the utility of BIBFRAME as a successor to MARC21.
  • LIBRIS, the Swedish National Bibliography
  • Linked Data 4 Libraries (LD4L): “The use cases they created are valuable for communicating the possible uses of linked data to those less familiar with linked data and it will be interesting to see the tools that are developed as a result of the projects.” (Included in the survey responses)
  • Linked Jazz: Reveals relationships of the jazz community, something similar to what a survey respondent wants to accomplish.
  • North Carolina State University’s Organization Name Linked Data: Of interest because it demonstrates concepts in practice (included in the survey responses).
  • Oslo Public Library’s Linked Data Cataloguing: “It is attempting to look at implementing linked data from the point of view of actual need… of a real library for implementation. Cataloguing and all aspects of the system will be designed around linked data.” (Included in the survey responses)
  • Pelagios: Uses linked data principles to increase the discoverability of ancient data through place associations; a major spur for a respondent’s project.
  • PeriodO: A gazetteer of scholarly assertions about the spatial and temporal extents of historical and archaeological periods; addresses spatial-temporal definitions.
  • Spanish Subject Headings for Public Libraries Published as Linked Data (Lista de Encabezamientos de Materia para las Bibliotecas Públicas en SKOS)
  • OCLC’s WorldCat Works (included in the survey responses)

Listservs: bibframe@listserv.loc.gov (Bibliographic Framework Transition Initiative Forum), Code4lib@listserv.nd.edu, DCMI (Dublin Core Metadata Initiative) listservs, data-ac-uk@jiscmail.ac.uk,  dlf-announce@lists.clir.org (Digital Library Federation), lod-lam@googlegroups.com, public-ldp@w3.org (linked data platform working group), semantic-web@w3.org

Websites:

Analyze the responses yourself!

If you’d like to apply your own filters to the responses, or look at them more closely, the spreadsheet compiling all survey responses (minus the contact information which we promised we’d keep confidential) is available at: http://www.oclc.org/content/dam/research/activities/linkeddata/oclc-research-linked-data-implementers-survey-2014.xlsx

 

 

 

About Karen Smith-Yoshimura

Karen Smith-Yoshimura, program officer, works on topics related to renovating descriptive and organizing practices with a focus on large research libraries and area studies requirements.


Patrick Hochstenbach: Creating Cat Bookmarks

planet code4lib - Sun, 2014-09-07 08:45
Filed under: Comics Tagged: bookmark, books, cartoon, cat, Cats, comic, Illustrator, literature, Photoshop, reading

Patrick Hochstenbach: Trying out caricature styles

planet code4lib - Sat, 2014-09-06 15:10
Filed under: Doodles Tagged: belgië, belgium, caricature, karikatuur, kris peeters, politics, politiek

Cynthia Ng: Google Spreadsheets Tip: Show Data from All Sheets in One

planet code4lib - Sat, 2014-09-06 05:13
I’ve been working with spreadsheets a lot lately, and while anything Excel-related is well documented and more familiar to me, Google Spreadsheets does things differently. Today’s post is just a quick tip really, but I thought I’d document it because it took me a long time to find the solution, plus I had to play around […]

Ed Summers: Agile in Academia

planet code4lib - Sat, 2014-09-06 01:44

I’m just finishing up my first week at MITH. What a smart, friendly group to be a part of, and with such exciting prospects for future work. Riding along the Sligo Creek and Northwest Branch trails to and from campus certainly helps. Let’s just say I couldn’t be happier with my decision to join MITH, and will be writing more about the work as I learn more, and get to work.

But I already have a question, that I’m hoping you can help me with.

I’ve been out of academia for over ten years. In my time away I’ve focused on my role as an agile software developer — increasingly with a lower case “a”. Working directly with the users of software (stakeholders, customers, etc), and getting the software into their hands as early as possible to inform the next iteration of work has been very rewarding. I’ve seen it work again, and again, and I suspect you have too on your own projects.

What I’m wondering is if you know of any tips, books, articles, etc. on how to apply these agile practices in the context of grant-funded projects. I’m still reacquainting myself with how grants are tracked and reported, but it seems to me that they often encourage fairly detailed schedules of work, and cost estimates based on time spent on particular tasks, which (from 10,000 ft) reminds me a bit of the waterfall.

Who usually acts as the product owner in grant-driven software development projects? How easy is it to adapt schedules and plans based on what you have learned in a past iteration? How do you get working software into the hands of its potential users as soon as possible? How often do you meet, and what is the focus of discussion? Are there particular funding bodies that appreciate agile software development? Are grants normally focused on publishing research and data instead of software products?

Any links, references, citations, tips or advice you could send my way here, @edsu, or email would be greatly appreciated. I’ve already got Bethany Nowviskie’s Lazy Consensus bookmarked for re-reading.

CrossRef: CrossRef Indicators

planet code4lib - Fri, 2014-09-05 19:14

Updated July 25, 2014

Total no. participating publishers & societies 5100
Total no. voting members 2433
% of non-profit publishers 57%
Total no. participating libraries 1885
No. journals covered 35,406
No. DOIs registered to date 68,416,081
No. DOIs deposited in previous month 552,871
No. DOIs retrieved (matched references) in previous month 34,385,296
DOI resolutions (end-user clicks) in previous month 98,365,532

CrossRef: New CrossRef Members

planet code4lib - Fri, 2014-09-05 19:11

Updated September 3, 2014

Voting Members
Annex Publishers, LLC
Association for Medical Education in Europe (AMEE)
Breakthrough Institute, Rockefeller Philanthropy Advisors
COMESA - Leather and Leather Products Institute (COMESA/LLPI)
Hebrew Union College Press
Incessant Nature Science Publishers Pvt Ltd.
Instituto Brasileiro de Avaliacao Psicologica (IBAP)
Instituto Nanocell
Scandinavian Psychologist
Servicios Ecologicos y Cientificos SA de CV
Shared Science Publishers OG
Society of Biblical Literature/SBL Press
Universidad Adolfo Ibanez
Vanderbilt University Library
Visio Mundi Academic Media Group

Represented Members
Global E-Business Association
Institute of Philosophy
Korea Consumer Agency
Korea Distribution Science Association
Korea Employment Agency for the Disabled/Employment Development Institute
Korea Society for Hermeneutics
Korean Association for Japanese Culture
Korean Society for Curriculum Studies
Korean Society for Drama
Korean Society for Parenteral and Enteral Nutrition
Korean Society of Biology Education
Korean Society on Communication in Healthcare
Korean Speech-Language and Hearing
Soonchunhyang Medical Research Institute
The Discourse and Cognitive Linguistics Society of Korea
The Society of Korean Dances Studies

Last updated August 26, 2014

Voting Members
A Fundacao para o Desenvolvimento de Bauru (FunDeB)
Associacao de Estudos E Pesquisas Em Politicas E Practicas Curriculares
Association For Child and Adolescent Mental Health (ACAMH)
International Journal of Advanced Information Science and Technology
School of Electrical Engineering and Informatics (STEI) ITB
Universidad Autonoma del Caribe

Sponsored Members
Journal of Nursing and Socioenvironmental Health
California Digital Library

Represented Members
Association of East Asian Studies
Korea Computer Graphics Society
Korea Institute for Health and Social Affairs
Korea Service Management Society
Korean Association of Multimedia-Assisted Learning
Korean Association for Government Accounting
Korean Counseling Association
The Korean Poetics Studies Society
The Society of Korean Language and Literature


CrossRef: CrossRef Indicators

planet code4lib - Fri, 2014-09-05 19:07

Updated September 3, 2014

Total no. participating publishers & societies 5339
Total no. voting members 2548
% of non-profit publishers 57%
Total no. participating libraries 1898
No. journals covered 35,763
No. DOIs registered to date 69,191,919
No. DOIs deposited in previous month 582,561
No. DOIs retrieved (matched references) in previous month 35,125,120
DOI resolutions (end-user clicks) in previous month N/A

Harvard Library Innovation Lab: Link roundup September 5, 2014

planet code4lib - Fri, 2014-09-05 17:36

This is the good stuff.

Photogrammar

So nice, could even be taken further. I’d imagine they’ve got a lot of ideas in the works.

Our Cyborg Future: Law and Policy Implications | Brookings Institution

Whoa, weird. Our devices and us.

Evolution of the desk

The desk becomes clear of its tools as those tools centralize in the digital space.

Mass Consensual Hallucinations with William Gibson

Technology trumps ideology.

Awesomeness: Millions Of Public Domain Images Being Put Online

Mining the archive for ignored treasure.
