PeerLibrary’s groups and collections functionality is especially suited to educators running classes that involve reading and discussing academic publications. This week we would like to highlight one such collection, created for a graduate-level computer science class taught by Professor John Kubiatowicz at UC Berkeley. The course, Advanced Topics in Computer Systems, requires weekly readings, which are handily stored on the PeerLibrary platform for students to read, discuss, and collaborate on outside of the typical classroom setting. Articles within the collection come from a variety of sources, such as the publicly available “Key Range Locking Strategies” and the closed access “ARIES: A Transaction Recovery Method”. Even closed access articles, which hide the article text from unauthorized users, allow users to view the comments and annotations!
Gates Foundation to require immediate free access for journal articles
By Jocelyn Kaiser 21 November 2014 1:30 pm
Breaking new ground for the open-access movement, the Bill & Melinda Gates Foundation, a major funder of global health research, plans to require that the researchers it funds publish only in immediate open-access journals.
The policy doesn’t kick in until January 2017; until then, grantees can publish in subscription-based journals as long as their paper is freely available within 12 months. But after that, the journal must be open access, meaning papers are free for anyone to read immediately upon publication. Articles must also be published with a license that allows anyone to freely reuse and distribute the material. And the underlying data must be freely available.
Is this going to work? Will researchers be able to comply with these requirements without harm to their careers? Does the Gates Foundation fund enough research that new open access venues will open up to publish this research (and if so how will their operation be funded?), or do sufficient venues already exist? Will Gates Foundation grants include funding for “gold” open access fees?
I am interested to find out. I hope this article is accurate about what they’re doing, and if so, I am glad they are doing it.
I note that the policy mentions “including any underlying data sets.” Do they really mean that underlying data sets used for all publications “funded, in whole or in part, by the foundation” must be published? I hope so. Requiring “underlying data sets” to be available at all is in some ways as big as, or bigger than, requiring them to be available open access.
Last updated November 24, 2014. Created by Peter Murray on November 24, 2014.
Join BitCurator users from around the globe for a hands-on day focused on current use and future development of the BitCurator digital software environment. Hosted by the BitCurator Consortium (BCC), this event will be grounded in the practical, boots-on-the-ground experiences of digital archivists and curators. Come wrestle with current challenges—engage in disc image format debates, investigate emerging BitCurator integrations and workflows, and discuss the “now what” of handling your digital forensics outputs.
Slate recently published a series of maps illustrating the languages other than English spoken in each of the fifty US states. In nearly every state, the most commonly spoken non-English language was Spanish. But when Spanish is excluded as well as English, a much more diverse – and sometimes surprising – landscape of languages is revealed, including Tagalog in California, Vietnamese in Oklahoma, and Portuguese in Massachusetts.
Public library collections often reflect the attributes and interests of the communities in which they are embedded. So we might expect that public library collections in a given state will include relatively high quantities of materials published in the languages most commonly spoken by residents of the state. We can put this hypothesis to the test by examining data from WorldCat, the world’s largest bibliographic database.
WorldCat contains bibliographic data on more than 300 million titles held by thousands of libraries worldwide. For our purposes, we can filter WorldCat down to the materials held by US public libraries, which can then be divided into fifty “buckets” representing the materials held by public libraries in each state. By examining the contents of each bucket, we can determine the most common language other than English found within the collections of public libraries in each state:
As with the Slate findings regarding spoken languages, we find that in nearly every state, the most common non-English language in public library collections is Spanish. There are exceptions: French is the most common non-English language in public library collections in Massachusetts, Maine, Rhode Island, and Vermont, while German prevails in Ohio. The results for Maine and Vermont complement Slate’s finding that French is the most commonly spoken non-English language in those states – probably a consequence of Maine and Vermont’s shared borders with French-speaking Canada. The prominence of German-language materials in Ohio public libraries correlates with the fact that Ohio’s largest ancestry group is German, accounting for more than a quarter of the state’s population.
Following Slate’s example, we can look for more diverse language patterns by identifying the most common language other than English and Spanish in each state’s public library collections:
Excluding both English- and Spanish-language materials reveals a more diverse distribution of languages across the states. But only a bit more diverse: French now predominates, representing the most common language other than English and Spanish in public library collections in 32 of the 50 states. Moreover, we find only limited correlation with Slate’s findings regarding spoken languages. In some states, the most common non-English, non-Spanish spoken language does match the most common non-English, non-Spanish language in public library collections – for example, Polish in Illinois, Chinese in New York, and German in Wisconsin. But only about a quarter of the states (12) match in this way; the majority do not. Why is this so? Perhaps materials published in certain languages have low availability in the US, are costly to acquire, or both. Maybe other priorities drive collecting activity in non-English materials – for example, a need to collect materials in languages that are commonly taught in primary, secondary, and post-secondary education, such as French, Spanish, or German.
Or perhaps a ranking of languages by simple counts of materials is not the right metric. Another way to assess if a state’s public libraries tailor their collections to the languages commonly spoken by state residents is to compare collections across states. If a language is commonly spoken among residents of a particular state, we might expect that public libraries in that state will collect more materials in that language compared to other states, even if the sum total of that collecting activity is not sufficient to rank the language among the state’s most commonly collected languages (for reasons such as those mentioned above). And indeed, for a handful of states, this metric works well: for example, the most commonly spoken language in Florida after English and Spanish is French Creole, which ranks as the 38th most common language collected by public libraries in the state. But Florida ranks first among all states in the total number of French Creole-language materials held by public libraries.
But here we run into another problem: the great disparity in size, population, and ultimately, number of public libraries, across the states. While a state’s public libraries may collect heavily in a particular language relative to other languages, this may not be enough to earn a high national ranking in terms of the raw number of materials collected in that language. A large, populous state, by sheer weight of numbers, may eclipse a small state’s collecting activity in a particular language, even if the large state’s holdings in the language are proportionately less compared to the smaller state. For example, California – the largest state in the US by population – ranks first in total public library holdings of Tagalog-language materials; Tagalog is California’s most commonly spoken language after English and Spanish. But surveying the languages appearing in Map 2 (that is, those that are the most commonly spoken language other than English and Spanish in at least one state), it turns out that California also ranks first in total public library holdings for Arabic, Chinese, Dakota, French, Italian, Korean, Portuguese, Russian, and Vietnamese.
To control for this “large state problem”, we can abandon absolute totals as a benchmark, and instead compare the ranking of a particular language in the collections of a state’s public libraries to the average ranking for that language across all states (more specifically, those states that have public library holdings in that language). We would expect that states with a significant population speaking the language in question would have a state-wide ranking for that language that exceeds the national average. For example, Vietnamese is the most commonly spoken language in Texas other than English and Spanish. Vietnamese ranks fourth (by total number of materials) among all languages appearing in Texas public library collections; the average ranking for Vietnamese across all states that have collected materials in that language is thirteen. As we noted above, California has the most Vietnamese-language materials in its public library collections, but Vietnamese ranks only eighth in that state.
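The ranking comparison described above can be sketched in a few lines of code. This is a minimal illustration using invented toy counts, not WorldCat’s actual data or schema: rank a language by item count within one state’s holdings, then compare that rank to the language’s average rank across all states that hold it.

```python
# Toy sketch of the "state rank vs. national average rank" comparison.
# The holdings counts below are hypothetical, for illustration only.
holdings = {  # state -> {language code: number of items}
    "TX": {"eng": 9000, "spa": 2000, "fre": 500, "vie": 450, "ger": 400},
    "CA": {"eng": 20000, "spa": 6000, "fre": 900, "ger": 700, "vie": 300},
    "OH": {"eng": 8000, "spa": 1000, "ger": 600, "fre": 300},
}

def rank_in_state(state, lang):
    """1-based rank of lang by item count within a state's collections."""
    counts = holdings[state]
    if lang not in counts:
        return None
    ordered = sorted(counts, key=counts.get, reverse=True)
    return ordered.index(lang) + 1

def average_rank(lang):
    """Average rank of lang across states holding any materials in it."""
    ranks = [rank_in_state(s, lang) for s in holdings if lang in holdings[s]]
    return sum(ranks) / len(ranks)

# In this toy data Vietnamese ranks 4th in Texas, above the two-state
# average rank of 4.5 -- i.e., Texas collects Vietnamese relatively heavily.
print(rank_in_state("TX", "vie"), average_rank("vie"))  # → 4 4.5
```

A rank *above* the national average (a smaller number) signals that a state’s libraries collect a language more heavily than the nation at large, which is the comparison Map 3 visualizes.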
Map 3 shows the comparison of the state-wide ranking with the national average for the most commonly spoken language other than English and Spanish in each state:
Now it appears we have stronger evidence that public libraries tend to collect heavily in languages commonly spoken by state residents. In thirty-eight states (colored green), the state-wide ranking of the most commonly spoken language other than English and Spanish in public library collections exceeds – often substantially – the average ranking for that language across all states. For example, the most commonly spoken non-English, non-Spanish language in Alaska – Yupik – is only the 10th most common language found in the collections of Alaska’s public libraries. However, this ranking is well above the national average for Yupik (182nd). In other words, Yupik is considerably more prominent in the materials held by Alaskan public libraries than in the nation at large – in the same way that Yupik is relatively more common as a spoken language in Alaska than elsewhere.
As Map 3 shows, six states (colored orange) exhibit a ranking equal to the national average; in all of these cases the language in question is French or German, languages that tend to be highly collected everywhere (the average ranking for French is four, and for German, five). Five states (colored red) exhibit a ranking that is below the national average; in four of the five cases, the state ranking is only one notch below the national average.
The high correlation between languages commonly spoken in a state and the languages commonly found within that state’s public library collections suggests that public libraries are not homogeneous, but in many ways reflect the characteristics and interests of local communities. It also highlights the important service public libraries provide in facilitating information access for community members who may not speak or read English fluently. Finally, public libraries’ collecting activity across a wide range of non-English language materials suggests the importance of these collections in the context of the broader system-wide library resource. Some non-English language materials in public library collections – perhaps the French Creole-language materials in Florida’s public libraries, or the Yupik-language materials in Alaska’s public libraries – could be rare and potentially valuable items that are not readily available in other parts of the country.
Visit your local public library … you may find some unexpected languages on the shelf.
Acknowledgement: Thanks to OCLC Research colleague JD Shipengrover for creating the maps.
Note on data: Data used in this analysis represent public library collections as they are cataloged in WorldCat. Data is current as of July 2013. Reported results may be impacted by WorldCat’s coverage of public libraries in a particular state.
About Brian Lavoie
Brian Lavoie is a Research Scientist in OCLC Research. Brian's research interests include collective collections, the system-wide organization of library resources, and digital preservation.
by Tom Baker, Karen Coyle, Sean Petiya
Published in: Library Hi Tech, v. 32, n. 4, 2014, pp. 562-582. DOI: 10.1108/LHT-08-2014-0081
Open Access Preprint
The above article was just published in Library Hi Tech. However, because the article is a bit dense, as journal articles tend to be, here is a short description of the topic it covers, plus a chance to respond to the article.
We now have a number of multi-level views of bibliographic data. There is the traditional "unit card" view, reflected in MARC, that treats all bibliographic data as a single unit. There is the FRBR four-level model that describes a single "real" item, and three levels of abstraction: manifestation, expression, and work. This is also the view taken by RDA, although employing a different set of properties to define instances of the FRBR classes. Then there is the BIBFRAME model, which has two bibliographic levels, work and instance, with the physical item as an annotation on the instance.
In support of these views we have three RDF-based vocabularies:
FRBRer (using OWL)
RDA (using RDFS)
BIBFRAME (using RDFS)
The vocabularies use varying degrees of specification. FRBRer is the most detailed and strict, using OWL to define cardinality, domains and ranges, and disjointness between classes and between properties. There are, however, no sub-classes or sub-properties. BIBFRAME properties are all defined in terms of domains (classes), and there are some sub-class and sub-property relationships. RDA has a single set of classes derived from the FRBR entities, and each property has the domain of a single class. RDA also has a parallel vocabulary that defines no class relationships; thus, no properties in that vocabulary result in a class entailment.
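The notion of "class entailment" mentioned above can be shown with a toy sketch. This is plain Python (no real RDF library), with hypothetical property names in the spirit of RDA and BIBFRAME: a property declared with a domain entails that any resource using it is an instance of that class, while a domain-free property (like RDA's parallel vocabulary) entails nothing.

```python
# Toy illustration of the RDFS domain-entailment rule:
# (s P o) + (P rdfs:domain C) => (s rdf:type C).
# Property and class names here are hypothetical, for illustration only.
DOMAINS = {
    "rda:titleOfWork": "rda:Work",
    "bf:instanceTitle": "bf:Instance",
}

def entail_classes(triples):
    """Infer rdf:type triples for subjects of properties with a domain."""
    inferred = set()
    for s, p, o in triples:
        if p in DOMAINS:
            inferred.add((s, "rdf:type", DOMAINS[p]))
    return inferred

data = [
    ("ex:b1", "rda:titleOfWork", "Moby Dick"),        # domain-constrained
    ("ex:b2", "ex:unconstrainedTitle", "Moby Dick"),  # domain-free
]

# ex:b1 is entailed to be an rda:Work; ex:b2, using a property with no
# declared domain, triggers no class entailment -- which is the point of
# RDA's parallel, class-free vocabulary.
print(entail_classes(data))
```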
As I talked about in the previous blog post on classes, the meaning of classes in RDF is often misunderstood, and that is just the beginning of the confusion that surrounds these new technologies. Recently, Bernard Vatant, a creator of the Linked Open Vocabularies (LOV) site, which provides a statistical analysis of the existing linked open data vocabularies and how they relate to each other, said this on the LOV Google+ group:
"...it seems that many vocabularies in LOV are either built or used (or both) as constraint and validation vocabularies in closed worlds. Which means often in radical contradiction with their declared semantics."
What Vatant is saying here is that many of the vocabularies he observes use RDF in the "wrong way." One of the common "wrong ways" is to interpret the axioms you can define in RDFS or OWL the same way you would interpret them in, say, XSD, or in a relational database design. In fact, the action of the OWL rules (originally called "constraints," which seems to have contributed to the confusion, now called "axioms") can be entirely counter-intuitive to anyone whose view of data is not formed by something called "description logic" (DL).
A simple demonstration of this, which we use in the article, is the OWL axiom for "maximum cardinality." In a non-DL programming world, you often state that a certain element in your data is limited in the number of times it can be used, such as saying that in a MARC record you can have only one 100 (main author) field. The maximum cardinality of that field is therefore "1". In your non-DL environment, a data creation application will not let you create more than one 100 field; if an application receiving data encounters a record with more than one 100 field, it will signal an error.
The semantic web, in its DL mode, draws an entirely different conclusion. The semantic web has two key principles: the open world assumption and the non-unique name assumption. Open world means that whatever the state of the data on the web today, it may be incomplete; there can be unknowns. Therefore, you may say that you MUST have a title for every book, but if a look at your data reveals a book without a title, then your book still has a title; it is just an unknown title. That's pretty startling, but what about that 100 field? You've said that there can be only one, so what happens if there are 2 or 3 or more of them for a book? That's no problem, says OWL: the rule is that there is only one, but the non-unique name assumption says that for any "thing" there can be more than one name. So when an OWL program encounters multiple author 100 fields, it concludes that these are all different names for the same one thing, as defined by the combination of the non-unique name assumption and the maximum cardinality rule: "There can only be one, so these three must really be different names for that one." It's a bit like Alice in Wonderland, but there's science behind it.
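The maximum-cardinality inference just described can be sketched as code. This is a toy in plain Python, not a real OWL reasoner, and the property name is hypothetical: given a property declared with maxCardinality 1, multiple stated values are not flagged as an error; instead, the values are pairwise inferred to be owl:sameAs one another.

```python
from collections import defaultdict

# Hypothetical property declared with owl:maxCardinality 1.
MAX_ONE = {"ex:mainAuthor"}

def infer_same_as(triples):
    """Toy open-world reading of maxCardinality 1: all values of such a
    property on one subject are inferred to name the same individual."""
    values = defaultdict(list)
    for s, p, o in triples:
        if p in MAX_ONE:
            values[(s, p)].append(o)
    same = set()
    for vs in values.values():
        for a in vs:
            for b in vs:
                if a != b:
                    same.add((a, "owl:sameAs", b))
    return same

record = [
    ("ex:book1", "ex:mainAuthor", "Twain, Mark"),
    ("ex:book1", "ex:mainAuthor", "Samuel Clemens"),
]

# A closed-world validator would reject this record as having too many
# authors; the DL reading instead concludes the two names denote one person.
print(sorted(infer_same_as(record)))
```

The contrast with the closed-world behavior in the MARC example above is exactly the point: the same declared "rule" produces a validation error in one logic and a new inference in the other.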
What you have in your database today is a closed world, where you define what is right and wrong; where you can enforce the rule that required elements absolutely HAVE TO be there; where the forbidden is not allowed to happen. The semantic web standards are designed for the open world of the web where no one has that kind of control. Think of it this way: what if you put a document onto the open web for anyone to read, but wanted to prevent anyone from linking to it? You can't. The links that others create are beyond your control. The semantic web was developed around the idea of a web (aka a giant graph) of data. You can put your data up there or not, but once it's there it is subject to the open functionality of the web. And the standards of RDFS and OWL, which are the current standards that one uses to define semantic web data, are designed specifically for that rather chaotic information ecosystem, where, as the third main principle of the semantic web states, "anyone can say anything about anything."
I have a lot of thoughts about this conflict between the open world of the semantic web and the needs for closed world controls over data; in particular whether it really makes sense to use the same technology for both, since there is such a strong incompatibility in underlying logic of these two premises. As Vatant implies, many people creating RDF data are doing so with their minds firmly set in closed world rules, such that the actual result of applying the axioms of OWL and RDF on this data on the open web will not yield the expected closed world results.
This is what Baker, Petiya and I address in our paper, as we create examples from FRBRer, RDA in RDF, and BIBFRAME. Some of the results there will probably surprise you. If you doubt our conclusions, visit the site http://lod-lam.slis.kent.edu/wemi-rdf/ that gives more information about the tests, the data and the test results.
"No class entailment" means that the property does not carry any "classness" with it, so its use does not indicate that the resource is an instance of any class.
Programs that interpret the OWL axioms are called "reasoners". There are a number of different reasoner programs available that you can call from your software, such as Pellet and HermiT, and others built into software packages like TopBraid.
What technology are you watching on the horizon? Have you seen brilliant ideas that need exposing? Do you really like sharing with your LITA colleagues?
The LITA Top Tech Trends Committee is trying a new process this year and issuing a Call for Panelists. Answer the short questionnaire by 12/10 to be considered. Fresh faces and diverse panelists are especially encouraged to respond. Past presentations can be viewed at http://www.ala.org/lita/ttt.
If you have additional questions check with Emily Morton-Owens, Chair of the Top Tech Trends committee: firstname.lastname@example.org
Help preserve our shared heritage, increase funding for conservation, and strengthen collections care by completing the Heritage Health Information (HHI) 2014 National Collections Care Survey. The HHI 2014 is a national survey on the condition of collections held by archives, libraries, historical societies, museums, scientific research collections, and archaeological repositories. It is the only comprehensive survey to collect data on the condition and preservation needs of our nation’s collections.
The deadline for the Heritage Health Information 2014: A National Collections Care Survey is December 19, 2014. In October, the Heritage Health Information sent invitations to the directors of over 14,000 collecting institutions across the country to participate in the survey. These invitations included personalized login information, which may be entered at hhi2014.com.
Questions about the survey may be directed to hhi2014survey [at] heritagepreservation [dot] org or 202-233-0824.
The post Opportunity knocks: Take the HHI 2014 National Collections Care Survey appeared first on District Dispatch.
An archive of the free webinar “Lib2Gov.org: Connecting Patrons with Legal Information” is now available. Hosted jointly by the American Library Association (ALA) and iPAC, the webinar was designed to help library reference staff build confidence in responding to legal inquiries. Watch the webinar
The session offers information on laws, legal resources and legal reference practices. Participants will learn how to handle a law reference interview, including where to draw the line between information and advice, key legal vocabulary and citation formats. During the webinar, leaders offer tips on how to assess and choose legal resources for patrons.
Catherine McGuire is the head of Reference and Outreach at the Maryland State Law Library. McGuire currently plans and presents educational programs to Judiciary staff, local attorneys, public library staff and members of the public on subjects related to legal research and reference. She serves as Vice Chair of the Conference of Maryland Court Law Library Directors and the co-chair of the Education Committee of the Legal Information Services to the Public Special Interest Section (LISP-SIS) of the American Association of Law Libraries (AALL).
A couple of weeks ago we kicked off Islandora Show and Tell by looking at a newly launched site: Barnard Digital Collection. This week, we're going to take a look at a long-standing Islandora site that has been one of our standard answers when someone asks "What's a great Islandora site?" - Fundación Juan March, which will, to our great fortune, be the host of the next European Islandora Camp, set for May 27 - 29, 2015.
It was a foregone conclusion that once we launched this series, we would be featuring FJM sooner rather than later, but it happens that we're visiting them just as they have launched a new collection: La saga Fernández-Shaw y el teatro lírico, containing three archives of a family of Spanish playwrights. This collection is also a great example of why we love this site: innovative browsing tools such as a timeline viewer, carefully curated collections spanning a wide variety of object types living side-by-side (the Knowledge Portal approach really makes this work), and seamless multi-language support.
FJM was also highlighted by D-LIB Magazine this month, as their Featured Digital Collection, a well -deserved honour that explores their collections and past projects in greater depth.
But are there cats? There are. Of course, when running my standard generic Islandora repo search term, it helps to acknowledge that this is a collection of Spanish cultural works and go looking for gatos instead, which leads to Venta de los gatos (Sale of Cats), Orientação dos gatos (Orientation of Cats), and Todos los gatos son pardos (All Cats Are Grey).
Curious about the code behind this repo? FJM has been kind enough to share the details of a number of their initial collections on GitHub. Since they take the approach of using .NET for the web interface instead of using Drupal, the FJM .Net Library may also prove useful to anyone exploring alternate front-ends for their own collections.
Our Show and Tell interview was completed by Luis Martínez Uribe, who will be joining us at Islandora Camp in Madrid as an instructor in the Admin Track in May 2015.
What is the primary purpose of your repository? Who is the intended audience?
We have always said that more than a technical system, the FJM digital repository tries to bring in a new working culture. Since the Islandora deployment, the repository has been instrumental in transforming the way in which data is generated and looked after across the organization. Thus the main purpose behind our repository philosophy is to take an active approach to ensure that our organizational data is managed using appropriate standards, made available via knowledge portals and preserved for future access.
The contents are highly heterogeneous, with materials from the departments of Art, Music, and Conferences, a Library of Spanish Music and Theatre, as well as various outputs from scientific centres and scholarships. Therefore the audience ranges from members of the general public interested in particular art exhibitions, concerts, or lectures to highly specialised researchers in fields such as theatre, sociology, or biology.
Why did you choose Islandora?
Back in 2010 the FJM was looking for a robust and flexible repository framework to manage an increasing volume of interrelated digital materials. With preservation in mind, the other most important aspect was the capacity to create complex models to accommodate relations between diverse types of content from multiple sources such as databases, the library catalogue, etc. Islandora provided the flexibility of Fedora plus easy customization powered by Drupal. Furthermore, discoverygarden could kick start us with their services and having Mark Leggott leading the project provided us with the confidence that our library needs and setting would be well understood.
Which modules or solution packs are most important to your repository?
In our latest collections we mostly use Drupal for prototyping. For this reason modules such as the Islandora Solr Client, the PDF Solution Pack or the Book Module are rather useful components to help us test and correct our collections once ingested and before the web layer is deployed.
What feature of your repository are you most proud of?
We like to be able to present the information through easy-to-grasp visualizations and have used timelines and maps in the past. In addition to this, we have started exploring the use of recommendation systems that, once an object is selected, suggest other materials of interest. This has been used in production in “All our art catalogues since 1973”.
Who built/developed/designed your repository (i.e., who was on the team)?
Driven by the FJM Library, Islandora was initially setup at FJM with help from discoverygarden and the first four collections (CLAMOR, CEACS IR, Archive of Joaquín Turina, Archive of Antonia Mercé) were developed in the first year.
After that, the Library and IT Services undertook the development of a small and simple collection of essays to then move into a more complex product like the Personal Library of Cortazar that required more advanced work from web programmers and designers.
In the last year, we have developed a .NET library that allows us to interact with the Islandora components such as Fedora, Solr, and RISearch. Since then we have undertaken more complex interdepartmental ventures like the collection “All our art catalogues since 1973”, where the Library, IT, and the web team have worked with colleagues in other departments such as digitisation, art, and design.
In addition to this we have also kept working on Library collections with help from IT like Sim Sala Bim Library of Illusionism or our latest collection “La Saga de los Fernández Shaw” which merges three different archives with information managed in Archivist Toolkit.
Do you have plans to expand your site in the future?
The knowledge portals developed using Islandora have been well received both internally and externally with many visitors. We plan to expand the collections with many more materials as well as using the repository to host the authority index and the thesaurus collections for the FJM. This will continue our work to ensure that the FJM digital materials are managed, connected and preserved.
What is your favourite object in your collection to show off?
This is a hard one, but if we have to choose our favourite object we would probably choose a resource like The Avant-Garde Applied (1890-1950) art catalogue. The catalogue is presented with different photos of the spine and back cover, alongside other editions and related catalogues, in a responsive web design with a multi-device progressive loading viewer.
Our thanks to Luis and to FJM for agreeing to this feature. To learn more about their approach to Islandora, you can query the source by attending Islandora Camp EU2.
In honor of Thanksgiving, I’d like to give thanks for 5 tech tools that make life as a librarian much easier.
On any given day I work on at least 6 different computers and tablets. That means I need instant access to my documents wherever I go and without cloud storage I’d be lost. While there are plenty of other free file hosting services, I like Drive the most because it offers 15GB of free storage and it’s incredibly easy to use. When I’m working with patrons who already have a Gmail account, setting up Drive is just a click away.
I dabbled in Goodreads for a bit, but I must say, Libib has won me over. Libib lets you catalog your personal library and share your favorite media with others. While it doesn’t handle images quite as well as Goodreads, I much prefer Libib’s sleek and modern interface. Instead of cataloging books that I own, I’m currently using Libib to create a list of my favorite children’s books to recommend to patrons.
Hopscotch is my favorite iOS app right now. With Hopscotch, you can learn the fundamentals of coding through play. The app is marketed towards kids, but I think the bubbly characters and lighthearted nature appeals to adults too. I’m using Hopscotch in an upcoming adult program at the library to show that coding can be quirky and fun. If you want to use Hopscotch at your library, check out their resources for teachers. They’ve got fantastic ready made lesson plans for the taking.
My love affair with Photoshop started many years ago, but as I’ve gotten older, Illustrator and I have become a much better match. I use Illustrator to create flyers, posters, and templates for computer class handouts. The best thing about Illustrator is that it’s designed for working with vector graphics. That means I can easily translate a design for a 6-inch bookmark into a 6-foot poster without losing image quality.
Twitter is hands-down my social network of choice. My account is purely for library-related stuff and I know I can count on Twitter to pick me up and get me inspired when I’m running out of steam. Thanks to all the libraries and librarians who keep me going!
What tech tools are you thankful for? Please share in the comments!
When Boston Public Library first designed its statewide digitization service plan as an LSTA-funded grant project in 2010, we offered free imaging to any institution that agreed to make their digitized collections available through the Digital Commonwealth repository and portal system. We hoped and suggested that money not spent by our partners on scanning might then be invested in the other side of any good digital object – descriptive metadata. We envisioned a resurgence of special collections cataloging in libraries, archives, and historical societies across Massachusetts.
After a couple of years, reality set in. Most of our partners did not have the resources to generate good descriptive records structured well enough to fit into our MODS application profile without major oversight and intervention on our part. What we did find, however, were some very dedicated and knowledgeable local historians, librarians, and archivists who maintained a variety of documentation that could be best described as “pre-metadata.” Their local landscapes included inventories, spreadsheets, caption files, finding aids, catalog cards, sleeve inscriptions, dusty three-ring binders – the rich soil from which good metadata grows.
We understood it was now our job to cultivate and harvest metadata from these local sources. And thus the “Metadata Mob” was born. It is a fun and creative type of mob — less roughneck and more spontaneous dance routine. Except, instead of wildly cavorting to Do-Re-Mi in train stations, we cut-and-paste, we transcribe, we script, we spell check, we authorize, we regularize, we refine, we edit, and we enhance. It is a highly customized, hands-on process that differs slightly (or significantly) from collection to collection, institution to institution.
In many ways, the work Boston Public Library does has come to resemble the locally-sourced food movement in that we focus on how each community understands and represents their collections in their own unique way. Free-range metadata, so to speak, that we unearth after plowing through the annals of our partners.
We don’t impose our structures or processes on anyone beyond offering advice on some standard information science principles – the three major “food groups” of metadata as it were – well defined schema, authority control, and content standard compliance. We encourage our partners to maintain their local practices.
We then carefully nurture their information into healthy, juicy, and delicious metadata records that we can ingest into the Digital Commonwealth repository. We have all encountered online resources with weak and frail frames — malnourished with a few inconsistently used Dublin Core fields and factory-farmed values imported blindly from collection records or poorly conceived legacy projects. Our mob members eschew this technique. They are craftsmen, artisans, information viticulturists. If digital library systems are nourished by the metadata they ingest, then ours will be kept vigorous and healthy with the rich diet they have produced.
Thanks to SEMAP for use of their logo in the header image. Check out SEMAP’s very informative website at semaponline.org. Buy Fresh, Buy Local! Photo credit: Lori De Santis.
All written content on this blog is made available under a Creative Commons Attribution 4.0 International License. All images found on this blog are available under the specific license(s) attributed to them, unless otherwise noted.
From Bram Luyten, @mire
With the DSpace 5 release coming up, we wanted to make it easier for aspiring developers to get up and running with DSpace development. In our experience, starting off on the right foot with a proven set of tools and practices can reduce someone’s learning curve and help in quickly getting to initial results. IntelliJ IDEA 13, the integrated development environment from JetBrains, can make a developer’s life a lot easier thanks to a truckload of features that are not included in your run-of-the-mill text editor.
By Michele Mennielli, International Relations, Cineca
Bologna, Italy. During the recent euroCRIS Strategic Membership Meeting, held in Amsterdam November 11-13, Cineca had the opportunity to present a new version of DSpace-CRIS based on DSpace 4.2. This version of DSpace-CRIS will be released in the next few days.
From James Evans, Product Manager, Open Repository
As previously reported in The Digital Reader, the bill passed in September by wide margins in both houses of the New Jersey State Legislature and would have codified the right to read ebooks without the government and everybody else knowing about it.
I wrote about some problems I saw with the bill. Based on a California law focused on law enforcement, the proposed NJ law imposed civil penalties on booksellers who disclosed the personal information of users without a court order. As I understood it, the bill could have prevented online booksellers from participating in ad networks (they all do!).
Governor Christie's veto statement pointed out more problems. The proposed law didn't explicitly prevent the government from asking for personal reading data, it just made it against the law for a bookseller to comply. So, for example, a local sheriff could still ask Amazon for a list of people in his town reading an incriminating book. If Amazon answered, somehow the reader would have to:
- find out that Amazon had provided the information
- sue Amazon for $500.
In New Jersey, a governor can issue a "Conditional Veto". In doing so, the governor outlines changes in a bill that would allow it to become law. Christie's revisions to the Reader Privacy Act make the following changes:
- The civil penalties are stripped out of the bill. This allows Gov. Christie to position himself and NJ as "business-friendly".
- A requirement is added preventing the government from asking for reader information without a court order or subpoena. Christie gets to be on the side of liberty. Yay!
- It's made clear that the law applies only to government snooping, and not to promiscuous data sharing with ad networks. Christie avoids the ire of rich ad network moguls.
- Child porn is carved out of the definition of "books". Being tough on child pornography is one of those politically courageous positions that all politicians love.
I'm not a fan of his by any means, but Chris Christie's version of the Reader Privacy Act is a solid step in the right direction and would be an excellent model for other states. We could use a law like it on the national level as well.
(Guest posted at The Digital Reader)
As some of you already know, Marlene and I are moving from Seattle to Atlanta in December. We’ve moved many (too many?) times before, so we’ve got most of the logistics down pat. Movers: hired! New house: rented! Mail forwarding: set up! Physical books: still too dang many!
We could do it in our sleep! (And the scary thing is, perhaps we have in the past.)
One thing that is different this time is that we’ll be driving across the country, visiting friends along the way. 3,650 miles, one car, two drivers, one Keurig, two suitcases, two sets of electronic paraphernalia, and three cats.
Who wants to lay odds on how many miles it will take each day for the cats to lose their voices?
Fortunately Sophia is already testing the cats’ accommodations:
I will miss the friends we made in Seattle, the summer weather, the great restaurants, being able to walk down to the water, and decent public transportation. I will also miss the drives up to Vancouver for conferences with a great bunch of librarians; I’m looking forward to attending Code4Lib BC next week, but I’m sorry that our personal tradition of American Thanksgiving in British Columbia is coming to an end.
As far as Atlanta is concerned, I am looking forward to being back in MPOW’s office, having better access to a variety of good barbecue, the winter weather, and living in an area with less de facto segregation.
It’s been a good two years in the Pacific Northwest, but much to my surprise, I’ve found that the prospect of moving back to Atlanta feels a bit like a homecoming. So, onward!
PeerLibrary participated at OpenCon 2014, the student and early career researcher conference on Open Access, Open Education, and Open Data.
Today I found the following resources and bookmarked them:
- FnordMetric | Framework for building beautiful real-time dashboards FnordMetric allows you to write SQL queries that return SVG charts rather than tables. Turning a query result into a chart is literally one line of code.
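To give a flavor of how that works, here is a rough sketch in FnordMetric's ChartSQL dialect. The table name, file path, and column names are hypothetical, and the exact syntax may vary between FnordMetric versions; the point is that a single `DRAW` statement is what turns the query result into a chart:

```sql
-- Load a hypothetical CSV of time-series measurements
-- (path and headers are illustrative only).
IMPORT TABLE measurements
    FROM 'csv:examples/measurements.csv?headers=true';

-- This one line is what makes the query below render as an
-- SVG line chart instead of a result table.
DRAW LINECHART AXIS BOTTOM AXIS LEFT;

-- An ordinary SQL query supplying the chart's series and axes.
SELECT series AS series, time AS x, value AS y
    FROM measurements;
```

Without the `DRAW` statement, the same `SELECT` would simply return rows, which is what the "one line of code" claim refers to.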
Digest powered by RSS Digest
As the Ebola outbreak continues, the public must sort through all of the information being disseminated via the news media and social media. In this rapidly evolving environment, librarians are providing valuable services to their communities as they assist their users in finding credible information sources on Ebola, as well as other infectious diseases.
On December 12, 2014, library leaders from the U.S. National Library of Medicine will host the free webinar “Ebola and Other Infectious Diseases: The Latest Information from the National Library of Medicine.” As a follow-up to the webinar they presented in October, librarians from the U.S. National Library of Medicine will discuss how to provide effective services in this environment, as well as provide an update on information sources that can be of assistance to librarians.
Speakers
- Siobhan Champ-Blackwell is a librarian with the U.S. National Library of Medicine Disaster Information Management Research Center. Champ-Blackwell selects material to be added to the NLM disaster medicine grey literature database and is responsible for the Center’s social media efforts. Champ-Blackwell has over 10 years of experience in providing training on NLM products and resources.
- Elizabeth Norton is a librarian with the U.S. National Library of Medicine Disaster Information Management Research Center where she has been working to improve online access to disaster health information for the disaster medicine and public health workforce. Norton has presented on this topic at national and international association meetings and has provided training on disaster health information resources to first responders, educators, and librarians working with the disaster response and public health preparedness communities.
Date: December 12, 2014
Time: 2:00 PM–3:00 PM Eastern
Register for the free event
If you cannot attend this live session, a recorded archive will be available to view at your convenience. To view past webinars presented in collaboration with iPAC, please visit Lib2Gov.org.