Feed aggregator

DPLA: DPLAfest 2016: Post-Fest Wrap Up

planet code4lib - Thu, 2016-04-28 15:00

First of all, THANK YOU to all of the attendees, presenters, sponsors, and host institutions that helped make the third annual DPLAfest a great success!  With so many great sessions, conversations, workshops (and sightseeing!) taking place at once, we wanted to be sure to share a one-stop recap of the highlights.  Whether you missed the fest, participated from afar, or are just hoping to revisit some of the great ideas shared during the conference, look no further!  This post is your guide to the news, notes, media, and other materials associated with the DPLAfest 2016.

Announcements & Milestones

DPLAfest 2016 Opening Session at the Library of Congress

  • A growing network: DPLA now has over 13 million items from 1,900 contributing institutions
  • Debut of RightsStatements.org, a collaborative approach to rights statements that can be used to communicate the copyright status of cultural objects
  • 100 Primary Source Sets now published for educators and students
  • Open eBooks launched this spring to a great reception with over 1.4 million access codes distributed to date
  • DPLA looks forward to partnering with the Library of Congress
  • And…we’re on Instagram!

For more details about each of these announcements and milestones, check out our DPLAfest press release and our announcement launching RightsStatements.org.

Slides and notes

To find presentation slides and notes from DPLAfest 2016 sessions, visit the online agenda (click on each session to find attached slides and links to notes, where available).

Recorded Sessions

The DPLAfest Opening Plenary session is now available on the DPLAfest 2016 videos page.  We are currently processing recordings of additional sessions, which will be available in the coming months. Stay tuned for more video content.

Tweets

The first official DPLA selfie. Follow us on Instagram @digpublib.

If you weren’t able to make it to the fest (or if you just want to re-live it), check out the fantastic online conversation on Twitter using the conference hashtag, or read our selection of posts on Storify.

Special thanks to the many DPLAfest attendees who helped capture each session on social media!

Instagram

We were excited to see great content contributed by fest participants on our newest social media platform – check out photos from our attendees.

We would like to give a special shout-out to Richard Naples (@drastician), the winner of the 2016 DPLAfest Instagram Challenge!

Sponsors

The Digital Public Library of America wishes to thank its generous DPLAfest Sponsors:

  • TextHelp
  • Digital Transitions, Division of Cultural Heritage
  • CLIR Digital Library Federation

DPLA also wishes to thank its gracious hosts:

  • Library of Congress
  • US National Archives and Records Administration
  • Smithsonian Institution

Photos

Storify

Already excited for next year? We are too! Apply to host DPLAfest 2017

DPLAfest host organizations are essential contributors to one of the most prominent gatherings in the country involving librarians, archivists, and museum professionals, developers and technologists, publishers and authors, teachers and students, and many others who work together to further the mission of providing maximal access to our shared cultural heritage.

  • For colleges and universities, DPLAfest is the perfect opportunity to directly engage your students, educators, archivists, librarians and other information professionals in the work of a diverse national community of information and technology leaders.
  • For public libraries, hosting DPLAfest brings the excitement and enthusiasm of our community right to your hometown, enriching your patrons’ understanding of library services through free and open workshops, conversations, and more.
  • For museums, archives, and other cultural heritage institutions, DPLAfest is a great way to promote your collections and spotlight innovative work taking place at your organization.

It’s also a chance to promote your institution nationally and internationally, given the widespread media coverage of DPLAfest and the energy around the event.  Look for our formal call for proposals very soon!

Library of Congress: The Signal: The Astrophysics Source Code Library

planet code4lib - Thu, 2016-04-28 13:10

This is a guest post by Nicole Contaxis.

Example submissions for the Astrophysics Source Code Library: http://ascl.net/code/all

On April 12, 2016, Alice Allen, editor of the Astrophysics Source Code Library (ASCL), came to the National Library of Medicine to speak with National Digital Stewardship Residency participants, mentors and visitors about the importance of software as a research object and about why the ASCL is a necessary and effective resource for the astronomy and astrophysics academic communities.

Astrophysicists and astronomers frequently write their own code to do their research, and this code helps them interpret and manipulate large data sets. This code, as an integral part of the research process, is important to share for two reasons: (1) sharing increases the efficiency of work by allowing code to be re-used, and (2) it helps ensure the transparency of scientific research.

Yet, difficulties persist when it comes to encouraging researchers to share source code, regardless of the benefits. Allen talked about how researchers are reluctant to share code that may be “messy” and how creating this source code library requires community engagement and change management. She spoke about studying the impact of non-traditional scholarly outputs, like code, and the issues of scholarly publishing. Allen showed how the ASCL has made it easier for journal authors to cite code, which had previously been far more difficult. The ASCL assigns Digital Object Identifiers — persistent and unique identifiers — to source code in its library, which means that future academics can cite that code, even if that code is not featured in a journal article or a more traditional academic resource.

Image from the Visible Human Project at the National Library of Medicine.

The discussion turned to the difficulties of grant-based funding. The ASCL is basically unfunded, and all labor, including Allen’s, is voluntary. While discussing other code libraries that have lost funding and closed, Allen explained how grant funding, which runs on two- to five-year cycles, does not provide enough time to fully engage a community with a resource, regardless of how well that resource is designed, implemented and managed. Funding, as a universal source of concern, was a common point of interest, even for attendees without experience working with software or code.

The session included a tour of the Visible Human Project, an NLM project that collects extensive data on a male and a female cadaver, allowing artists and researchers to visualize that data in new and exciting ways.

District Dispatch: ALA selects Nick Gross as Google Policy Fellow

planet code4lib - Thu, 2016-04-28 06:24

Today, the American Library Association (ALA) announced that Nick Gross will serve as its 2016 Google Policy Fellow. As part of his summer fellowship, Gross will spend ten weeks in Washington, D.C. working on technology and Internet policy issues. As a Google Policy Fellow, Gross will explore diverse areas of information policy, such as copyright law, e-book licenses and access, information access for underserved populations, telecommunications policy, digital literacy, online privacy, the future of libraries, and others. Google, Inc. pays the summer stipends for the fellows and the respective host organizations determine the fellows’ work agendas.

Nick Gross

Gross will work for the American Library Association’s Office for Information Technology Policy (OITP), a unit of the association that works to ensure the library voice in information policy debates and promote full and equitable intellectual participation by the public. Gross is a Ph.D. candidate at the University of North Carolina, Chapel Hill, specializing in media law and policy. He completed a J.D. at the University of Miami School of Law and is a graduate of the University of California, Davis with an undergraduate degree in international relations. Gross was a staff attorney for the U.S. Court of Appeals for the Eleventh Circuit and is a member of the California Bar.

“ALA is pleased to participate once again in the Google Policy Fellowship program as it has from its inception,” said Alan S. Inouye, director of the ALA Office for Information Technology Policy. “We look forward to working with Nick Gross on information policy topics that leverage his strong background and advance library interests as we prepare for the next presidential Administration.”

Find more information about the Google Policy Fellowship Program.

The post ALA selects Nick Gross as Google Policy Fellow appeared first on District Dispatch.

DuraSpace News: Find Out Why DuraCloud Gets High Hosted Service Marks–in 3 Minutes

planet code4lib - Thu, 2016-04-28 00:00

Austin, TX  If you need a flexible service that allows you to easily access and manage actively-used digital content that also requires long-term preservation, then DuraCloud is your solution. Learn more about DuraCloud and DuraCloud Vault, and the differences between these two types of hosted digital preservation services, in this three-minute Quickbyte broadcast from DuraSpace: https://youtu.be/lSvfxrnF7z0

DuraSpace News: SoCal Fedora Camp at Caltech

planet code4lib - Thu, 2016-04-28 00:00

Austin, TX  The most recent Fedora camp in Pasadena, California was hosted by the Caltech Library at the California Institute of Technology's Keck Institute for Space Studies.

M. Ryan Hess: Google Analytics and Privacy

planet code4lib - Wed, 2016-04-27 21:53

Collecting web usage data through services like Google Analytics is a top priority for any library. But what about user privacy?

Most libraries (and websites for that matter) lean on Google Analytics to measure website usage and learn about how people access their online content. It’s a great tool. You can learn about where people are coming from (the geolocation of their IP addresses anyway), what devices, browsers and operating systems they are using. You can learn about how big their screen is. You can identify your top pages and much much more.

Google Analytics is really indispensable for any organization with an online presence.

But then there’s the privacy issue.

Is Google Analytics a Privacy Concern?

The question is often asked: what personal information is Google Analytics actually collecting? And then, how does this data collection jibe with our organization’s privacy policies?

It turns out, as a user of Google Analytics, you’ve already agreed to publish a privacy document on your site outlining the why and what of your analytics program. So if you haven’t done so, you probably should if only for the sake of transparency.

Personally Identifiable Data

Fact is, if someone really wanted to learn about a particular person, it’s not entirely outside the realm of possibility that they could glean a limited set of personal attributes from the generally anonymized data Google Analytics collects. IP addresses can be loosely linked to people. If you wanted to, you could set up filters in Google Analytics that look at a single IP.

Of course, on the Google side, any user who is logged into their Gmail, YouTube or other Google account is already being tracked and identified by Google. This is a broadly underappreciated fact. And it’s a critical one when it comes to how we approach the question of dealing with the privacy issue.

Whether it’s the data your organization collects with Google Analytics or the data all those web trackers, including Google’s, collect, the onus falls entirely on the user.

The Internet is Public

Over the years, the Internet has become a public space, and users of the Web should understand it as such. Everything you do is recorded and seen. Companies like Google, Facebook, Microsoft, Yahoo! and many, many others are all in the data mining business. Carriers and Internet Service Providers are also in this game. They deploy technologies on websites that identify you and then sell your interests, shopping habits, web searches and other activities to companies interested in selling to you. They’ve made billions on selling your data.

Ever done a search on Google and then seen ads all over the Web trying to sell you that thing you searched last week? That’s the tracking at work.

Only You Can Prevent Data Fires

The good news is that with little effort, individuals can stop most (but not all) of the data collection. Browsers like Chrome and Firefox have plugins like Ghostery, Avast and many others that will block trackers.

Google Analytics can be stopped cold by these plugins. But that won’t solve all the problems. Users also need to set up their browsers to delete the cookies that websites save. And moving off of accounts provided “for free” by data mining companies, like Facebook accounts, Gmail and Google.com, can also help.

But you’ll never be completely anonymous. Super cookies are a thing and are very difficult to stop without breaking websites. And some trackers are required in order to load content. So sometimes you need to pay with your data to play.

Policies for Privacy Conscious Libraries

All of this means that libraries wishing to be transparent and honest about their data collection need to also contextualize the information in the broader data mining debate.

First and foremost, we need to educate our users on what it means to go online. We need to let them know it’s their responsibility alone to control their own data. And we need to provide instructions on doing so.

Unfortunately, this isn’t an opt-in model. That’s too bad. It actually would be great if the world worked that way. But don’t expect the moneyed interests involved in data mining to allow the US Congress to pass anything that cuts into their bottom line. This ain’t Germany, after all.

There are ways, with a little JavaScript, to create a temporary opt-in/opt-out feature on your site. This will toggle tags added by Google Tag Manager on and off with a single click. But let’s be honest. Most people will ignore it. And if they do opt out, it will be very easy for them to overlook it every time without a much more robust opt-in/opt-out functionality baked into your site. But for most sites and users, this is asking a lot. Meanwhile, it diverts attention from the real solution: users concerned about privacy need to protect themselves and not take a given website’s word for it.

We actually do our users a service by going with the opt-out model. This underlines the larger privacy problems on the Wild Wild Web, which our sites are a part of.


DuraSpace News: VIVO 2016 Conference Announces 2 Invited Speakers

planet code4lib - Wed, 2016-04-27 00:00

From the VIVO 2016 Conference organizers, to be held In Denver, CO August 17-19

The VIVO 2016 Planning Committee is excited to announce two invited speakers! We’re looking forward to their talks, and we’re thrilled that, in addition to their invited sessions, both Dr. Ruben Verborgh and Dr. Pedro Szekely will be hosting half-day Workshops on August 17th.

District Dispatch: School librarian’s workshop: federal government resources for K-12

planet code4lib - Tue, 2016-04-26 18:22

The School Librarian’s Workshop will provide useful information for grades K-12, including Ben’s Guide to the U.S. Government and Kids.gov.

From the Federal Depository Library Program (FDLP):

A live training webinar, “School Librarian’s Workshop: Federal Government Resources for K-12 / Taller para maestros de español: Recursos de gobierno federal para niveles K-12,” will be presented on Tuesday, May 31, 2016.

Click here to register!

  • Start time: 2:00 p.m. (Eastern)
  • Duration: 60 minutes
  • Speaker: Jane Canfield, Coordinator of Federal Documents, Pontifical Catholic University of Puerto Rico
  • Learning outcomes: Are you a school librarian? Do you work with school librarians or children? The School Librarian’s Workshop will provide useful information for grades K-12, including Ben’s Guide to the U.S. Government and Kids.gov. The webinar will explore specific agency sites which provide information, in English and Spanish, appropriate for elementary and secondary school students. Teachers and school librarians will discover information on Federal laws and regulations and learn about resources for best practices in the classroom.
  • Expected level of knowledge for participants: No prerequisite knowledge required.

Closed captioning will be available for this webinar.

The webinar is free; however, registration is required. Upon registering, a confirmation email will be sent to you. This registration confirmation email includes the instructions for joining the webinar.

Registration confirmations will be sent from sqldba[at]icohere.com. To ensure delivery of registration confirmations, registrants should configure junk mail or spam filter(s) to permit messages from that email address. If you do not receive the confirmation, please notify GPO.

GPO’s eLearning platform presents webinars using WebEx. In order to attend or present at a GPO-hosted webinar, a WebEx plug-in must be installed in your internet browser(s). Download instructions.

Visit FDLP Academy for access to FDLP educational and training resources. All are encouraged to share and re-post information about this free training opportunity.

The post School librarian’s workshop: federal government resources for K-12 appeared first on District Dispatch.

David Rosenthal: My Web Browser's Terms of Service

planet code4lib - Tue, 2016-04-26 15:00
This post was co-authored with Jefferson Bailey. NB - neither of us is a lawyer. Follow us below the fold to find out why this disclaimer is necessary.

Many web sites have explicit terms of service. For example, here are the terms of service that "govern your use of certain New York Times digital products". They start with this clause:
1.1 If you choose to use NYTimes.com (the “Site”), NYT’s mobile sites and applications, any of the features of this site, including but not limited to RSS, API, software and other downloads (collectively, the "Services"), you will be agreeing to abide by all of the terms and conditions of these Terms of Service between you and The New York Times Company ("NYT", “us” or “we”).

So, just by using the services of nytimes.com, the New York Times claims that I have agreed to a whole lot of legal terms and conditions. I didn't have to click a check-box agreeing to them, or do anything explicit. The terms and conditions are not on the front page itself, they're just linked from it. The link is hard to find, in faint type at the very bottom of the page, wedged blandly between "Privacy" and the eye-glazing "Terms of Sale."

Among the terms that I'm deemed to have agreed to are:
2.3 You may download or copy the Content and other downloadable items displayed on the Services for personal use only, ... Copying or storing of any Content for other than personal use is expressly prohibited ...

So, if the Terms of Service apply, Web archives are clearly violating the terms of service. Interestingly, there is an exception:

9.1 You shall have no rights to the proprietary software and related documentation, or any enhancements or modifications thereto, provided to you in order to access the Services ("Software"). ... You may make one copy of such software for archival purposes only. ...

The software they are talking about must be the JavaScript they deliver to my browser. Recently there have been many instances of advertising networks serving JavaScript malware to visiting browsers, but if any of the ad networks the New York Times uses does this to you:

5.2 ... THE SERVICES AND ALL DOWNLOADABLE SOFTWARE ARE DISTRIBUTED ON AN "AS IS" BASIS WITHOUT WARRANTIES OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, WITHOUT LIMITATION, WARRANTIES OF TITLE OR IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. YOU HEREBY ACKNOWLEDGE THAT USE OF THE SERVICES IS AT YOUR SOLE RISK.

The New York Times claims not to be liable. Even if you thought "arguing with a man who buys ink by the barrel" was a good idea:

11.1 These Terms of Service have been made in and shall be construed and enforced in accordance with New York law. Any action to enforce these Terms of Service shall be brought in the federal or state courts located in New York City.

Good luck with that.

So the interesting question is whether, in the absence of any explicit action on my (or an archive's crawler's) part, the terms of service bind me (or the archive). Now, IANAL, and even actual lawyers appear to believe the answer isn't obvious. But writing on the Technology and Law blog a year ago, Venkat Balasubramani suggests that unless there is an explicit action indicating assent, the terms are unlikely to apply:
In place of the flawed browsewrap/clickwrap typology, we can use a simple non-overlapping typology for web interfaces: Category A is a click-through presentation where a user clicks while knowing that the click signals assent to the applicable terms; and Category B is everything else, which is not a contract.

Let us assume for the moment that Balasubramani is correct and if there was no click-through the terms are not binding. In the good old days of Web archiving, this would mean there was no problem because the crawler would not have clicked the "I agree" box. But in today's Web, browser-based crawlers are clicking on things. Lots of things. In fact, they're clicking on everything they can find. Which might well be an "I agree" box. Lawyers will be able to argue whether the crawler clicked on it "knowing that the click signals assent to the applicable terms".

Let us instead assume the contrary, that despite the lack of an explicit action conveying agreement, the terms are binding. In the good old days of the Web, my browser requested service from nytimes.com and as a result I agreed to their terms of service. But the Web's model has evolved from linked documents to communicating programming environments. Now, my browser requests service from nytimes.com, and in return nytimes.com requests the service of running JavaScript code from my browser.

Making this assumption, Jefferson and I argued as follows. Suppose my, or the archive's, browser were configured to include in the HTTP request to nytimes.com, a Link header with "rel=license" pointing to the Terms of Service that apply to the services available from the requesting browser. The New York Times would have been notified of these terms far more directly than I had been of their terms by the faint type link at the bottom of the page that few have ever consciously clicked on. Thus, using exactly the same argument that the New York Times used to bind me to their terms, they would have been bound to my terms.
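As a rough sketch of what such a request could look like, here is a minimal Python example using the requests library; the terms URL is hypothetical, and the Link header follows RFC 5988 web linking with the "license" relation described above:

    import requests

    # Hypothetical URL for the Terms of Service governing use of this browser.
    BROWSER_TERMS = "https://example.org/my-browser-terms-of-service.html"

    headers = {
        # Advertise the license/terms that govern the services (JavaScript
        # execution and so on) this client makes available to the site.
        "Link": '<{}>; rel="license"'.format(BROWSER_TERMS),
    }

    response = requests.get("https://www.nytimes.com/", headers=headers)
    print(response.status_code)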

What's sauce for the goose is sauce for the gander. If an explicit action is required, archive crawlers that don't click on an "I agree" box are not bound by the terms. If no explicit action is required, only some form of notification, browsers and browser-based crawlers can bind websites to their terms by providing a suitable notification.

What Terms of Service would be appropriate for using my browser? Based on the New York Times' terms, perhaps they should include:
1.1 If you choose to use any of the features of this Browser, including but not limited to the ability to run JavaScript and WebAssembly (collectively, the "Services"), you will be agreeing to abide by all of the terms and conditions of these Terms of Service between you and [insert name] ("us" or "we").
1.2 We may change, add or remove portions of these Terms of Service at any time, which shall become effective immediately upon posting. It is your responsibility to review these Terms of Service prior to each use of the Browser and by continuing to use this Browser, you agree to any changes.

and:

1.4 We may change, suspend or discontinue any aspect of the Services at any time, including the availability of any Services feature, database, or content. We may also impose limits on certain features and services or restrict your access to parts or all of the Services without notice or liability.

and:

4.1 You may not access or use, or attempt to access or use, the Services to take any action that could harm us or a third party. You may not access parts of the Services to which you are not authorized. You may not attempt to circumvent any restriction or condition imposed on your use or access, or do anything that could disable or damage the functioning or appearance of the Services,

I.e. you and your advertising networks better not send us any malware. And, of course, we need the perennial favorite:

5.2 ... THE SERVICES AND ALL INFORMATION THEY CONTAIN ARE DISTRIBUTED ON AN "AS IS" BASIS WITHOUT WARRANTIES OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, WITHOUT LIMITATION, WARRANTIES OF TITLE OR IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. YOU HEREBY ACKNOWLEDGE THAT USE OF THE SERVICES IS AT YOUR SOLE RISK.

A reverse EULA. Wouldn't you like to be able to do this?

So far, this may sound like a parody or a paranoid fantasy. But many online media companies have begun to target client-side browser information to police content delivery. Sites like Forbes, Wired, and maybe even (gasp) The New York Times are now disallowing access to their sites for those with ad-blocking browser add-ons:
We noticed you still have ad blocker enabled. By turning it off or whitelisting Forbes.com, you can continue to our site and receive the Forbes ad-light experience.

It turns out that the "Forbes ad-light experience" includes free bonus malware!

You have probably noticed that European websites are now subject to the Cookie Law, requiring your click to explicitly assent to the web site's use of cookies, because the use of cookies implicates the EU's directive on privacy. Alexander Hanff argued:
that using an ad-blocker detector script is basically doing the same sort of thing as a cookie in terms of spying on client-side information within one's web browser, and a letter he received from the EU Commission apparently confirms his assertion.

Thus running a script that collects information from an EU citizen's browser (which is what the vast majority do) apparently requires explicit permission. If Hanff's efforts succeed, anticipate European Web publishers going non-linear.

As the web has grown into a processing environment, it presumes a reciprocal interactivity, the parameters of which are still shifting and unbalanced. In the end the terms of this overall interplay of information exchange and license seem, as they so often do, inequitable. The future is here, it's just not evenly licensed. On one end, media and other corporate content sites target user browsers, inject (accidentally or via 3rd parties) potentially malicious scripts, monitor for plug-in screeners, install browsing trackers, analyze cookies and add all sorts of profiling and monitoring scripts, all generally without any explicit agreement on our part. On the other hand, we, simple users, often are presumed to agree to prolix legalese and verbose, obscure license agreements, all simply so we can read about people doing yoga with their dogs.

Code4Lib: Code4Lib Journal #32

planet code4lib - Tue, 2016-04-26 14:48
Topic: journal

Issue #32 of the Code4Lib Journal is now available.

Library of Congress: The Signal: Bagger’s Enhancements for Digital Accessions

planet code4lib - Tue, 2016-04-26 13:55

This is a guest post by John Scancella, Information Technology Specialist with the Library of Congress, and Tibaut Houzanme, Digital Archivist with the Indiana Archives and Records Administration. BagIt is an internationally accepted method of transferring files via digital containers. If you are new to BagIt, please watch our introductory video.

John Scancella. Photo by Mike Ashenfelder.

Bagger is a digital records packaging and validation tool based on the BagIt Specification. This BagIt-compliant software allows creators and recipients of BagIt packages to verify that the files in the bag are complete and valid. This is done by creating manifests of the files that exist in the bag and their corresponding checksum values.
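Bagger does this through its graphical interface; as a rough illustration of the same mechanism, here is a minimal sketch using the Python bagit library (one of the BagIt implementations mentioned near the end of this post). The directory name and metadata are placeholders:

    import bagit

    # Turn an existing directory of files into a bag in place: the payload is
    # moved into a data/ subdirectory, and manifest files listing each file
    # with its checksum are written alongside bagit.txt and bag-info.txt.
    bag = bagit.make_bag(
        "accession-2016-001",                         # placeholder directory
        {"Source-Organization": "Example Archives"},  # placeholder metadata
    )

    # A recipient validates the bag by re-computing every checksum and
    # comparing the results against the manifests.
    print(bagit.Bag("accession-2016-001").is_valid())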

Bagger, built in Java, works in a variety of computing environments such as Windows, Linux and Mac. As a graphic user interface application, Bagger is a simpler tool for the average computer user than the text-only command-line interface implementation of BagIt.

Many improvements were made to Bagger recently:

  • Added more profiles to give the user and archival communities more options. Users can select from various profiles and fields to decide on their own requirements.
  • Bagger’s build system was switched to Gradle. Gradle is quickly becoming the standard build system for Java applications, and its use contributes to future-proofing Bagger’s improvements by giving Bagger the advantage of having a domain-specific language that leads to concise, maintainable and comprehensible builds.
  • The lowest compatible version of Java that Bagger can run with now is 1.7. Running Bagger with at least Java 1.7 helps with security and brings a host of new programming language features that allow for easier maintenance and performance improvement.
  • General code cleanup was performed for easier maintenance.
  • Long standing bugs and issues were fixed.

The Indiana Archives and Records Administration prepared a relatively detailed accession profile that is included with Bagger 2.5.0. A generic version of this profile is also available, where metadata fields are all optional.

These profiles were designed to help facilitate the accessioning of digital records, with preservation actions and management in mind. Overall, intellectual and physical components of digital records’ metadata were targeted. The justifications behind the metadata fields in these new profiles are:

  1. Consistent metadata fields with simple descriptors. The metadata field names use clear and simple terms. The consistency in the order of the fields on the display screen and in the metadata text file (part of the recent improvements) is also a benefit to data entry and review. The profiles use pre-identified values in drop-down menus that will help reduce typing mistakes and enforce cleaner metadata collection. The Indiana profile also uses pre-populated field entries, such as names and addresses, which help reduce repetitive data entry and save time during accessioning.
  2. Adaptable to various institutional contexts and practices. IARA requires the collection of metadata that it deems essential for digital records; these are represented in its profile. To make the profile adaptable across institutions, the generic version uses optional fields only. Individual users can edit the metadata fields, delete them or change their optional/required status. Switching a field from "Required: false" to "Required: true" in the local JSON file is sufficient to achieve the level of enforcement appropriate for each institution (see the illustrative excerpt after this list). Additional fields that draw from the BagIt specification can be added from the main menu. Also, custom metadata fields can be created or added on the fly.
  3. Collection of data points that matter for preservation decisions and actions. Some of the metadata fields added to standard accession fields help to identify records that are available only in digital formats so they can be treated accordingly; others assist with being able to locate records in proprietary digital formats that need migration to open standards formats. Information about sensitive records can also be captured to assist with prioritization and access management.
  4. Make automation possible through fields mapping. By using consistent and orderly metadata fields in a profile, you will create bags with a well-structured and predictable metadata sequence and value. This makes it easier to map the bag’s fields, values or collected information to a preservation system’s database fields. Investing in this automation opportunity will likely reduce the data entry time when importing bags into a preservation system. This assumes that the preservation system is either BagIt-compliant already (interoperability benefit) or will be made to effectively know what to do with each part of the bag, each metadata field and the captured values (to be achieved through integration).
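As a purely illustrative sketch of the idea in point 2 above (the field names and exact layout of Bagger's bundled profile files may differ), a profile's JSON might mark one field as required and another as optional like this:

    {
      "Transfer-Agency-Name": {
        "Required": true,
        "defaultValue": "Indiana Archives and Records Administration"
      },
      "Records-Medium-Carrier-1": {
        "Required": false,
        "defaultValue": "???"
      }
    }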

Following are two screenshots of Bagger with the full list of metadata fields for a sample accession:
Figure 1: IARA Profile with Sample Accession, Screen 1 of 2

Figure 2: IARA Profile with Sample Accession

In both screenshots, the letter “R” next to a metadata field means that you must enter or select a value, or the right value, before the bag can be finalized. The drop-down selection marked with “???” indicates that a value can be selected through clicks. Question marks “???”, or a different value in their place, can also be used as a placeholder to be found and replaced later with the correct value. In IARA’s experience, a single accession may come on multiple storage media/carriers. For that reason, the “records/medium carrier” field has been repeated five times (arbitrarily) to allow for multiple choices and entries; it can be further expanded. The number of media received, when entered with consistency, can help with easier media counts and inventories.

Once completed, Bagger also adds, in the “bag-info.txt” metadata file, the size of the bag in bytes and in megabytes. When all the required metadata is entered and the files added, the bag can be completed. A successful bagging session ends with this message displayed: “Bag Saved Successfully.”
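The BagIt specification's reserved fields for this size information are Bag-Size (human readable) and Payload-Oxum (payload octets and file count); a fragment of a generated bag-info.txt, with made-up values, might look like:

    Bag-Size: 25.5 MB
    Bagging-Date: 2016-04-26
    Payload-Oxum: 26738688.12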

The metadata values in the first two figures are fictitious and for demonstration only; the figure below shows them as recorded in the bag, along with additional metadata such as hash values and file size:

Figure 3: Metadata Fields and Values in the bag-info.txt File after Bag Creation

This test accession used random files freely accessible from the Digital Corpora and Open Preservation websites.

IARA’s accession profile, the generic version or any profile available in Bagger can be used as is if it meets the user’s requirements. Or they can be customized to fit institutional needs, such as enforcing certain metadata, field-name modifications, additional fields or drop-down values, and to support other document forms (e.g. audiovisual metadata fields such as linear duration of content). As Bagger’s metadata remain extensible, a profile can be created to fit almost any project. And the more profiles that are available directly in Bagger, the better for the archival community, which will have more choices.

To use IARA’s profile, its generic version or any other profile in Bagger, download the latest version (as of this writing, 2.5.0). To start an accession, select the appropriate profile from the drop-down list. This will populate the screen with profile-specific metadata fields. Select files or folders, enter values and save the bag.
For detailed instructions on how to edit metadata fields and their obligation level, create a new profile, or change an existing profile to meet the project/institution’s requirements, please refer to the Bagger User Guide in the “doc” folder inside the downloaded Bagger.zip file.

BagIt has been adopted for digital preservation by The Library of Congress, the Dryad Data Repository, the National Science Foundation DataONE and the Rockefeller Archive Center. BagIt is also used at Cornell, Purdue, Stanford, Ghent, New York and the University of California. BagIt has been implemented in Python, Ruby, Java, Perl, PHP, and in other programming languages.

We encourage feedback for BagIt. Here are some ways to contribute:

Tara Robertson: embodied library work

planet code4lib - Tue, 2016-04-26 02:04

I’m coming down from the Gender and Sexuality in Library and Information Studies colloquium that Emily Drabinski, Baharak Yousefi and I organized. For me one of the big themes was bodies and embodiment.

Vanessa Richards’ keynote was amazing. She spoke a bit and facilitated us in singing together. It was powerful, transformative and extremely emotional for me. One of the instructions she gave us was to pay attention to our bodies: “what do you feel and where in your body do you feel it when I tell you we are going to sing together?” Both my body and my mind are very uncomfortable with singing. At some point in my life someone told me I was a bad singer and ridiculed me, and I think I believed them. Vanessa Richards said something like: “Your body is the source code. Your body knows how to sing. All the people who told you that you can’t sing, kick them to the curb. This is your human right.”

For me this was deeply transformative and created magic in the room. We sang 3 songs together, and by the last one there was a beautiful transformation. I observed people’s bodies. People’s shoulders had dropped and their weight was sinking down into their feet. People were taking up more space and looking less self-conscious. Also, our voices were much louder and they were beautiful. This was an unconventional and magical way to start the day together.

There were so many excellent presentations. I was so excited to learn about GynePunk, the cyborg witches of DIY gynecology in Spain. James Cheng, Lauren Di Monte, and Madison Sullivan completely blew my mind in their talk titled Makerspace Meets Medicine: Politics, Gender, and Embodiment in Critical Information Practice. This is the most exciting talk I’ve heard about makerspaces, though they argued that because it’s gendered and political we’re unlikely to see this in a library makerspace. GynePunk reminds me of the zine Hot Pantz that starts with:

Patriarchy sucks. It’s robbed us of our autonomy and much of our history. We believe it’s integral for women to be aware and in control of our own bodies.

I also loved Stacy Wood’s talk on Mourning and Melancholia in Archives. She told the story of working in an archive and having cremated ashes fall out of a poorly sealed bag that was in a poorly sealed envelope. I hope I have a chance to read her paper as she had many smart things to say about institutional practice, as well as melancholia.

Marika Cifor presented Blood, Sweat, and Hair: The Archival Potential of Queer and Trans Bodies in three acts: blood, sweat and hair. She used examples of these parts of our bodies that were part of archival objects:

  • blood – blood on a menstrual sponge, blood during the AIDS crisis, blood on Harvey Milk’s clothing from when he was shot and killed
  • sweat – sweat stains on a tshirt from a gay leather bar
  • hair – hair on a lipstick of Victoria Schneider, a trans woman, sex worker and activist, and hair samples (both pubic hair and regular hair from your head) in Samuel Steward’s stud file, where he documented his lovers, which is in the Yale Archives

It was so exciting and nourishing to talk about bodies in relation to libraries, archives and information work. I didn’t realize that I was so hungry to have these conversations. I realized that when I’m doing my daily work I’m fairly unembodied and dissociated. I bike to work, hang up my body on the back of my office door, and then let my brain run around for the day. I put on my body and go about the rest of my life. I’ve been working to try and be my whole self at work, and have realized that the brain/body binary needs to be dismantled.

I’m not really sure what this is going to look like. I fear it might be messy, as bodies often are. I also fear that there will be failure, as is common with trying new things. To start, I think I’m going to go join the Woodward’s Community Singers this Thursday and sing again.

Woodward’s Community Singers – An Invitation to Sing Together from Woodward’s Community Singers on Vimeo.

DuraSpace News: DSpace 6.0 Testathon Gets Underway–Through May 6

planet code4lib - Tue, 2016-04-26 00:00

From Tim Donohue, DSpace Tech Lead, on behalf of the DSpace Committers

The DSpace 6.0 Testathon is now underway. We ask that you take a few minutes of your time in these coming weeks to help us fully test this new release. We want to ensure we are maintaining the same level of quality that you come to expect out of a new DSpace release. We'd also love to hear your early feedback on 6.0!

Roy Tennant: Are you Particular, Promiscuous, or Private?

planet code4lib - Mon, 2016-04-25 18:02

There are of course as many different ways to use social media (or not) as there are people. But I was thinking the other day that most of us who use social media tools such as Twitter and Facebook probably fall into one of three camps:

  • Promiscuous — These are the people who share just about everything. You know how many pets they have and of which kind, you know if they have kids, you know how many, how old, and also that you will see every cute or horrible thing that they do as it will be posted by their adoring parent.
  • Particular — These people don’t post everything, they post very selectively.
  • Private — These are the lurkers. They like to see what is going on with their friends but they only rarely, if ever, share themselves. I know a young person like this. She is a very private person and she has never shared anything on Facebook despite having an account.

Having created these categories, I would guess that many of us go in and out of these categories at different times and for different purposes. But do you consider yourself to be promiscuous, particular, or private when it comes to social media?

Photo by e-codices, Creative Commons License CC BY-NC 2.0

Cynthia Ng: Imagine Living Without Books Part 1: The Importance of Supporting Print Disabled Readers

planet code4lib - Mon, 2016-04-25 17:54
People often ask me what I do, and I tend to respond with “providing books in accessible formats to print disabled”, but most people seem to simply accept that as another job or project description. Some people do ask me to explain further, but often, I don’t think we (and that includes myself, and other … Continue reading Imagine Living Without Books Part 1: The Importance of Supporting Print Disabled Readers
