Planet Code4Lib - http://planet.code4lib.org

D-Lib: Capturing Interdisciplinarity in Academic Abstracts

Thu, 2016-09-15 11:13
Article by Federico Nanni, Data and Web Science Research Group, University of Mannheim, Germany and International Centre for the History of Universities and Science, University of Bologna, Italy; Laura Dietz, Stefano Faralli, Goran Glavas and Simone Paolo Ponzetto, Data and Web Science Research Group, University of Mannheim, Germany

D-Lib: Scraping Scientific Web Repositories: Challenges and Solutions for Automated Content Extraction

Thu, 2016-09-15 11:13
Article by Philipp Meschenmoser, Norman Meuschke, Manuel Hotz and Bela Gipp, University of Konstanz, Germany

Mashcat: Call for proposals for Mashcat in Atlanta 2017

Wed, 2016-09-14 17:11

The MashcatATL planning group is excited to announce that the second face-to-face Mashcat event in North America will be held January 24, 2017, at Georgia State University Library in Atlanta, Georgia. We invite you to save the date, and we hope to have registration and a schedule for this low-cost (less than $10), 1-day event open by November.

At present, we are accepting proposals for talks, events, panels, workshops or other sessions for the Mashcat event. We are open to a variety of formats, with the reminder that this will be a one-day, single-track event aiming to support the cross-pollination goals of Mashcat (see more below). We are open to proposals for sessions led virtually. Please submit your proposals using this form. All proposals must be received by October 25, 2016, midnight EST, and we will respond to all proposals by November 8, 2016.

What is Mashcat? “Mashcat” was originally an event in the UK in 2012 aimed at bringing together people working on the IT systems side of libraries with those working in cataloguing and metadata. Four years later, Mashcat is a loose group of metadata specialists, cataloguers, developers and anyone else with an interest in how metadata in and around libraries can be created, manipulated, used and re-used by computers and software. The aim is to work together and bridge the communications gap that has sometimes gotten in the way of building the best tools we possibly can to manage library data. Among our accomplishments in 2016 were holding the first North American face-to-face event in Boston in January and running webinars. If you’re unable to attend a face-to-face meeting, we will be holding at least one more webinar in 2016. For more information about Mashcat in general, see http://www.mashcat.info/.

For more information about the Atlanta event or for questions about the proposal form, please contact Erin Leach (eleach@uga.edu). Thanks for considering, and we hope to see you in January.

The Mashcat ATL planning team:

  • Galen Charlton (gmcharlt@gmail.com)
  • Erin Grant (erin.grant@emory.edu)
  • Elaine Hardy (ehardy@georgialibraries.org)
  • Marlene Harris (marlene@readingreality.net)
  • Mary Jinglewski (mary.jinglewski@gmail.com)
  • Erin Leach (eleach@uga.edu)
  • Emily Williams (ewill220@kennesaw.edu)
  • Susan Wynne (swynne@gsu.edu)

Equinox Software: Milford Joins Bibliomation with Equinox Support

Wed, 2016-09-14 15:11

FOR IMMEDIATE RELEASE

Duluth, Georgia–September 14, 2016

Bibliomation has partnered with Equinox on many occasions over the years.  Equinox is pleased to announce the completion of a project with the Connecticut-based organization.  Milford Public Library was successfully migrated to Evergreen in late August.  This was a joint effort between Equinox and Bibliomation.

Milford Public Library is located in Milford, Connecticut and serves 40,300 patrons with over 126,000 items.  Equinox provided migration of bibliographic records, items, patrons, and transactions from their previous system.  Equinox also handled project management and configuration of notifications as well as a deduplication of bibliographic records.  Bibliomation handled the rest of the configuration and will provide ongoing hosting and support.

Amy Terlaga, Director of Member Services for Bibliomation, remarked: “We are so excited to have the Milford Public Library back as part of the Bibliomation family.  Their staff brings tremendous enthusiasm and will be fantastic assets to the Evergreen community.  We look forward to our continued partnership with them.”

Rogan Hamby, Project and Data Analyst for Equinox, added: “I really enjoyed working with the Bibliomation and Milford staff on this migration; both are very passionate about library services, and it’s exciting to watch services grow as new libraries join a consortium.”

About Equinox Software, Inc.

Equinox was founded by the original developers and designers of the Evergreen ILS. We are wholly devoted to the support and development of open source software in libraries, focusing on Evergreen, Koha, and the FulfILLment ILL system. We wrote over 80% of the Evergreen code base and continue to contribute more new features, bug fixes, and documentation than any other organization. Our team is fanatical about providing exceptional technical support. Over 98% of our support ticket responses are graded as “Excellent” by our customers. At Equinox, we are proud to be librarians. In fact, half of us have our ML(I)S. We understand you because we *are* you. We are Equinox, and we’d like to be awesome for you. For more information on Equinox, please visit http://www.esilibrary.com.

About Bibliomation

Bibliomation is Connecticut’s largest library consortium. Sixty public libraries and seventeen schools share an Evergreen system with centralized cataloging and a shared computer network. Bibliomation is Connecticut’s only open source consortium.  Members enjoy benefits beyond the ILS. Bibliomation offers top-notch cataloging, expert IT support, discounts on technology and library products, and regular trainings and workshops on a variety of topics. Non-members can take advantage of Bibliomation’s services as well. BiblioTech, OverDrive, and a wide range of consulting services are available.  For more information on Bibliomation, please visit http://www.biblio.org/.

About Evergreen

Evergreen is an award-winning ILS developed with the intent of providing an open source product able to meet the diverse needs of consortia and high-transaction public libraries. However, it has proven to be equally successful in smaller installations including special and academic libraries. Today, over 1,400 libraries across the US and Canada are using Evergreen including NC Cardinal, SC Lends, and B.C. Sitka.
For more information about Evergreen, including a list of all known Evergreen installations, see http://evergreen-ils.org.

Library of Congress: The Signal: Digital Collections and Data Science

Wed, 2016-09-14 14:55

Unlock the access. By James Levine on Flickr.

Researchers, of varying technical abilities, are increasingly applying data science tools and methods to digital collections. As a result, new ways are emerging for processing and analyzing the digital collections’ raw material — the data.

For example, instead of pondering one single digital item at a time – such as a news story, photo or weather event – a researcher can compute massive quantities of digital items at once to find patterns, trends and connections. Such data visualization can be revelatory. Ian Milligan, assistant professor in the Department of History at the University of Waterloo, said, “Visualizations can, at a glance, tell you more than if you get mired down in the weeds of reading document after document after document.”

The NEH Chronicling America Data Challenge is an example of extracting data visualizations from a large, publicly available data set. Recently, the National Endowment for the Humanities invited people to “Create a web-based tool, data visualization or other creative use of the information found in the (Library of Congress’s) Chronicling America historic newspaper database.” The results are diverse and imaginative. According to the NEH website,

  • America’s Public Bible “tracks Biblical quotations in American newspapers to see how the Bible was used for cultural, religious, social, or political purposes”
  • American Lynching “…explores America’s long and dark history with lynching, in which newspapers acted as both a catalyst for public killings and a platform for advocating for reform”
  • Historical Agricultural News, “a search tool site for exploring information on the farming organizations, technologies, and practices of America’s past”
  • Chronicling Hoosier “tracks the origins of the word ‘Hoosier’”
  • USNewsMap.com “discovers patterns, explores regions, investigates how stories and terms spread around the country, and watches information go viral before the era of the internet”
  • Digital APUSH “…uses word frequency analysis…to discover patterns in news coverage.”

Dame Wendy Hall and Vint Cerf at Archives Unleashed 2.0: Web Archive Datathon. Photo by Mike Ashenfelder.

The explicit purpose of the Library of Congress’s Archives Unleashed 2.0 Web Archive Datathon was exploratory, open-ended discovery. The data came from a variety of sources, such as the Internet Archive’s crawl of the Cuban web domain. Participants divided into teams, each with a general objective of what to do with the data. The room bustled with people clacking laptop keys, poking at screens, bunching around whiteboards and scrawling rapidly on easel pads. At one table, a group queried website data related to the Supreme Court nominations of Justice Samuel Alito and Justice John Roberts. They showed off a word cloud view of their results and pointed out that the cloud for the archived House of Representatives websites prominently displayed the words “Washington Post” while the word cloud for the Senate prominently displayed “The New York Times.” The group was clearly excited by the discovery. This was solid data, not conjecture. But what did it mean? And who cares?

Well, it was an intriguing fact, one that invites further research. And surely someone might be curious enough to research it someday to figure out the “why” of it. And the results of that research, in turn, might open a new avenue of information that the researcher didn’t know was available or even relevant.
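At its core, a comparison like the House-versus-Senate word clouds is just term-frequency counting per collection. Here is a minimal sketch in Python (the page snippets below are invented stand-ins, not the actual datathon data):

```python
from collections import Counter
import re

def top_terms(texts, n=3):
    """Count word frequencies across a collection of documents."""
    counts = Counter()
    for text in texts:
        # Lowercase and pull out word-like tokens.
        counts.update(re.findall(r"[a-z']+", text.lower()))
    return counts.most_common(n)

# Stand-in snippets playing the role of archived web pages.
house_pages = ["coverage cited the washington post", "the washington post reported"]
senate_pages = ["analysis in the new york times", "the new york times noted"]

print(top_terms(house_pages))
print(top_terms(senate_pages))
```

A word cloud is essentially a visual rendering of these counts, with font size proportional to frequency; the real datathon work additionally had to extract plain text from the web archive records first.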

Events such as hackathons and the upcoming Collections as Data conference bring together librarians, archivists, digital humanities professionals, data scientists, artists, and scholars. That mix of disparate backgrounds is evidence that computation of large data sets in research is blurring the lines between disciplines, and there are a lot of best practices to be shared.

Data labs
A variety of digital research centers, scholars’ labs, digital humanities labs, learning labs and visualization labs are opening in libraries, universities and other institutions. But, despite their variety, these data labs are coalescing around a set of identifiable, standard components that include:

  • A work space
  • Hardware resources
  • Network access
  • Databases and data sets
  • Teaming researchers with technologists
  • Powerful processing capability
  • Software resources and tools
  • Repositories for end-result data sets

A work space
A quiet room or rooms should be available for brainstorming. Whiteboards and easel pads enable people to quickly jot down ideas and diagram half-formed thoughts. A brain dump, no matter how unfocused, contains bits of value that may clump into solid ideas and strategies. The room also needs enough tables, chairs and power outlets.

Nikola Tesla, with his equipment for producing high-frequency alternating currents. Wellcome Library, London. M0014782.

Hardware resources
The lab should provide computer workstations, monitors, laptops, conference phones and possibly a webcam for video teleconferencing.

Network access
Because of the constant flow of network requests and transactions, some moving potentially large files around, Wi-Fi must be consistent and reliable, and wired networks should be optimized for the highest bandwidth possible.

Databases and data sets
The data may need to be cleaned. Web harvesting, for example, grabs almost everything related to the seed URL – even with some filtering — and the archive often includes web pages that the researcher does not care about. Databases and data sets, if they are to be accessed over the network, should be small enough so they can be moved about easily. A researcher can also download large databases in advance of the scheduled work time.

Teaming researchers with technologists
In a complementary collaboration between a researcher or subject matter expert and an information technologist, the researcher conveys what she would like to query the data for and the technologist makes it happen. The researcher may analyze the results and make suggestions to the technologist for refining them. Some workshops, such as Ian Milligan’s web archiving analysis workshop, require their researchers to take a Data Carpentry workshop, which is an overview of computation, programming and analysis methods that a data researcher might need. The researcher could then either conduct data analyses herself or become conversant enough in data analysis methods to understand her options and communicate with the technologist.

Powerful processing capability
Processing large data sets foists a load on computational power, so a lab needs ample processing muscle. At the Archives Unleashed event, it took one group ten hours to process their query. Milligan is a big proponent of cloud processing and storage, using powerful network systems supported and maintained by others. He said, “We started out using machines on the ground and we found the issue was to have real sustainable storage that’s backed up and not risky to use, that’s going to have to live in network storage anyway. We found that we’re moving data all over the place and we do some of our stuff on our server itself and when we have to spin up other machines, it’s so much quicker to actually move stuff — especially when you’re working with terabytes of web archives — until you get to that last mile of the actual Ethernet cable coming to your workstation. That’s turning out to be the mass bottleneck. Our federation of provincial computing organizations has a sort of Amazon AWS-esque dashboard where we can spin up virtual machines. We have a big server at York University and we sometimes use Jimmy Lin’s old cluster down at Maryland. So the physical equipment turns out not to be that important when we have so many network resources to draw on.”

Software resources and tools
As data labs spring up, newer and better tools are appearing too. Data labs may offer a gamut of tools for

  • Content and file management
  • Data visualization
  • Document processing
  • Geospatial analysis
  • Image editing
  • Moving image editing
  • Network analysis
  • Programming
  • Text mining
  • Version control

The Digital Research Tools site is a comprehensive place to start for an overview of what is available.

Repositories for end-result data sets
The data set at the end of a project may be of value to other researchers and the researcher might want her project to be discoverable. The data set should include metadata to describe the project and how to repeat the work in order to arrive at the same data set and conclusions. The repository where the data set resides should have long-term preservation reliability.
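One illustrative shape such a descriptive record might take, sketched in Python; the field names here are assumptions for illustration, not a prescribed repository schema:

```python
import json

# Hypothetical metadata record accompanying a deposited result data set.
# The fields are illustrative: enough to describe the project and to let
# someone repeat the processing steps and arrive at the same data set.
record = {
    "title": "Supreme Court nomination coverage, term frequencies",
    "source_collection": "example web archive crawl",
    "processing_steps": [
        "extract plain text from archived web records",
        "lowercase and tokenize",
        "count term frequencies per site",
    ],
    "software": {"name": "example-toolkit", "version": "1.0"},
    "license": "CC0-1.0",
}

# Store the metadata alongside the data set in the repository.
with open("dataset-metadata.json", "w") as f:
    json.dump(record, f, indent=2)
```

Whatever the exact schema, the point is that the record travels with the data set, so a future researcher can both discover the project and rerun it.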

Conclusion
Data science is drifting out of the server rooms and into the general public. The sharp differences among professions and areas of interest are getting fuzzier all the time as researchers increasingly use information technology tools. Archaeologists practice 3D modelling. Humanities scholars practice data visualization. Students of all kinds query databases.

For the near future, interacting with data is a specialized skill that requires a basic understanding of data science and knowledge of its tools, whether through training or teaming up with knowledgeable technologists. But eventually the relevant instruction should be made widely available, whether in person or by video, and tools need to be simplified, especially as API-enabled databases proliferate and more sources of data become available.

In time, computationally enhanced research will not be a big deal and cultural institutions’ data resources and growing digital collections will be ready for researchers to access, use and enjoy.

LibUX: Mark Dodgson Talks about Bluespark’s Awesome Process

Wed, 2016-09-14 11:53

In this episode of the podcast, user experience and interface designer Mark Dodgson joins me to talk about the kind of work Bluespark — a design agency that has sort of found itself popular in the library and higher-ed web niche — does. I kind of let him just talk about their process. It’s pretty fascinating and super instructive.

Notes

Help us out and say something nice. Your sharing and positive reviews are the best marketing we could ask for.

If you like, you can download the MP3 or subscribe to LibUX on Stitcher, iTunes, YouTube, Soundcloud, Google Play Music, or just plug our feed straight into your podcatcher of choice.

Open Knowledge Foundation: Open Knowledge Ireland Summer 2016 Update

Wed, 2016-09-14 11:45

This blog post is part of our summer series featuring chapter updates from across the Open Knowledge Network and was written by the team of Open Knowledge Ireland.

What is OK Ireland and what do we do?

Open Knowledge Ireland is a team of 9 volunteers who envision an information age where everyone, not just a few, has access to and the ability to use the massive amounts of information and data generated by entities such as our government or public service.

We believe everyone should have access to this information and data to be able to make better decisions, receive better services and ensure money is spent in the right places. Our goal is to make taxpayer-supported information openly available, so that it can be used and re-used without the public having to pay for it again.

In so doing we want to ensure that vital research can happen. We want people to be able to leverage information to hold powerful institutions to account, whether in health care, the charity sector, or through Freedom of Information requests in the public service.

Past events:

In June we organised and ran an event dedicated to Knowledge Preservation in the 21st century (https://ti.to/open-knowledge-ireland/knowledge-preservation/). The event was attended by 20 enthusiasts. Kalpana Shankar, Stan Nazarenko and Rufus Pollock shared their visions of how knowledge and information can and should be preserved today and what the current challenges are. (Photos: https://www.flickr.com/photos/139932355@N08/sets/72157669330777481)

In August we were delighted to help our friends and colleagues from OpenStreetMap to map the Kingdom of Lesotho.

To see a list of our past events click here.

Current projects:

A notable highlight from the last few months has been our work on hospital waiting list data. For a more extensive look at the activities we have initiated, see here.

In May we presented the findings from our Hospital Waiting List Project at the all-Ireland conference ‘Knowledge For Health’ organised by the Institute of Public Health (IPH), which operates on both sides of the island of Ireland. The reason we took on this project is that people with illnesses requiring them to visit a hospital (bad enough in itself!) are currently waiting up to 18 months and more to be seen by specialist doctors and consultants. No one in Europe should have to wait so long for a consultation on what may prove to be a severe or life-threatening illness.

As a way of reducing waiting times to see specialist doctors in hospitals, we would like waiting times to be publicly (= ‘openly’) available so that the public, journalists, and social media can hold service providers accountable where waiting times are unusually high. This would also allow experts to use the data and carry out sophisticated analyses that could help improve waiting lists.

While advocating for open data, we realise that for the data to be useful and to help answer real questions, users need to be sure that the data is authentic and that it will be accessible tomorrow or ten years from now.  We believe that the InterPlanetary File System (IPFS) has great potential to facilitate the preservation of the authenticity and accessibility of public data.

IPFS is a peer-to-peer distributed file system that seeks to connect all computing devices with the same system of files where each file is given a unique fingerprint called a cryptographic hash.

IPFS provides historical versioning (like git) and makes it simple to set up resilient networks for mirroring data.
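The core idea, content-derived addressing, can be illustrated with an ordinary SHA-256 digest. This is a simplification: real IPFS identifiers use a self-describing multihash encoding, so they look different, but the property being relied on is the same.

```python
import hashlib

def content_id(data: bytes) -> str:
    # Identify content by a digest of its bytes, as IPFS does
    # (simplified: real IPFS addresses are multihash-encoded).
    return hashlib.sha256(data).hexdigest()

# Stand-in versions of a published data file.
report_v1 = b"waiting list, week 1: 1200 patients"
report_v2 = b"waiting list, week 2: 1150 patients"

# Identical bytes always yield the identical address...
assert content_id(report_v1) == content_id(report_v1)
# ...and any change to the data yields a different address,
# which is what makes tampering or silent edits detectable.
assert content_id(report_v1) != content_id(report_v2)
```

Because the address is derived from the content itself, anyone holding a copy can verify its authenticity, and any number of mirrors can serve the same address.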

At the conference, we demonstrated that the hospital waiting list data could and should be permanently and publicly available via the IPFS. See here for the examples of hospital waiting list data we presented. (https://ipfs.io/ipfs/QmT66oHDwzb8dU5vnZt3Ez5aStcWCjbqjNE2pA25ShTjmM/)

What are we working on next?

Plans for the future:

  • Hospital Waiting List: OK Ireland continues to work with the Irish government on making Hospital Waiting List data open, linking it with Wikimedia data, and displaying it on OpenStreetMap. As any of us might become ill, we believe that making health data accessible and comprehensible to everyone is the best way to demonstrate the potential value of open data.

We aim to get existing data on waiting times released on Data.gov.ie. To do so, it is likely that a tender will have to be announced to get this work under way, so we are drafting a description of the work along with a project plan and costs.

  • Developing a sustainable fundraising strategy: We are struggling, as are many non-profits, to secure funds. Are there proven methods and tools that the Open Knowledge International Network could share to support us in developing a strategic plan for fundraising? For example, how could we leverage globally prominent personalities at the local level? Where should a strategic fundraising plan focus? And how do we go about sustaining a constant output of fundraising applications?

And our upcoming events:

  • During Open Access Week (October 24–30, 2016), Open Knowledge Ireland and the Institute of Public Health (IPH) are co-organising an event dedicated to Open Data, Open Access, and Social Justice. The event will take place on Tuesday, 25 October at Pearse Street Library. More information to follow.

If you want to contact us:

If you found the above interesting and/or want to learn more about anything we talked about here, please feel free to email, tweet, or message us on Facebook.

Twitter

Facebook

Email

To read more about Open Knowledge Ireland, visit their website.
Learn more about the Open Knowledge Network by visiting the Open Knowledge International website.

DuraSpace News: REMINDER Call for Expressions of Interest in Hosting Open Repositories Conference: 2018 and 2019

Wed, 2016-09-14 00:00

From William Nixon and Elin Stangeland on behalf of the Open Repositories Steering Committee

Glasgow, Scotland – The Open Repositories Steering Committee seeks Expressions of Interest (EoI) from candidate host organizations for the 2018 and 2019 Open Repositories Annual Conference series. The call is issued for two years this time to enable better planning ahead of the conferences and to secure a good geographical distribution over time. Proposals from all geographic areas will be given consideration.

Important dates

Equinox Software: New Addition for Virginia Evergreen

Tue, 2016-09-13 17:24

FOR IMMEDIATE RELEASE

Duluth, Georgia–September 13, 2016

Equinox is happy to announce that yet another library has successfully migrated to Virginia Evergreen.  Halifax County-South Boston Public Library System is the ninth library system to join Virginia Evergreen, which boasts close to thirty branches in total.  Halifax County-South Boston includes two branches: Halifax Public Library and South Boston Public Library.

Jay Stephens, Director of Halifax Public Library, remarked: “It has been a pleasure working with Equinox.  Everyone is very knowledgeable and willing to share that knowledge to help out.”  He later added, “The training was awesome!  Mary rocks!”

In response, Equinox Training Service Librarian Mary Jinglewski had this to say about the migration: “I greatly enjoyed training with Halifax County South Boston Library System. They have a great community of caring staff members and I am excited that they’ll be a part of Virginia Evergreen moving forward!”

Equinox handled the migration from start to finish and will continue to support Halifax County-South Boston along with the rest of Virginia Evergreen.  

About Equinox Software, Inc.
Equinox was founded by the original developers and designers of the Evergreen ILS. We are wholly devoted to the support and development of open source software in libraries, focusing on Evergreen, Koha, and the FulfILLment ILL system. We wrote over 80% of the Evergreen code base and continue to contribute more new features, bug fixes, and documentation than any other organization. Our team is fanatical about providing exceptional technical support. Over 98% of our support ticket responses are graded as “Excellent” by our customers. At Equinox, we are proud to be librarians. In fact, half of us have our ML(I)S. We understand you because we *are* you. We are Equinox, and we’d like to be awesome for you. For more information on Equinox, please visit http://www.esilibrary.com.

About Evergreen
Evergreen is an award-winning ILS developed with the intent of providing an open source product able to meet the diverse needs of consortia and high-transaction public libraries. However, it has proven to be equally successful in smaller installations including special and academic libraries. Today, over 1,500 libraries across the US and Canada are using Evergreen including NC Cardinal, SC Lends, and B.C. Sitka.
For more information about Evergreen, including a list of all known Evergreen installations, see http://evergreen-ils.org.

SearchHub: Hello Boston, My Old Friend

Tue, 2016-09-13 16:10

It all started in Boston…

In 2010, for the inaugural Lucene Revolution in Boston MA, I tried to weasel out of giving a prepared talk by proposing a Live Q&A style session where I’d be put on the spot with tough, challenging, unusual questions about Solr & Lucene — live, on stage. I don’t remember what my original session title was, but the conference organizer realized it sounded a lot like the “Stump The Chump” segment of the popular “Car Talk” radio show, hosted by Boston’s own Click & Clack, and insisted that be the title we use.

It’s been 6 years since that first “Stump The Chump” session in Boston, and now — one month from today — Stump The Chump will be returning to Boston for Lucene/Solr Revolution 2016.

If you’ve never seen our version of “Stump the Chump,” it’s a little different than Click & Clack’s original radio call-in format. In addition to being live in front of hundreds of rowdy convention goers, we also have a panel of judges who have had a chance to see and think about many of the questions in advance — because folks like you are free to submit questions via email prior to the conference (even if you can’t attend in person). The judges take every opportunity to mock The Chump (i.e., me) anytime I flounder, and ultimately the panel will award prizes to the people whose questions do the best job of “Stumping The Chump.”

As my boss Cassandra (a Boston native, and this year’s Stump the Chump moderator) would say: “It’s a Wicked Pissa!”

You can see for yourself by checking out the videos from the past events like Lucene/Solr Revolution 2015 in Austin TX, or Lucene/Solr Revolution Dublin 2013. If you want a real blast from the past, check out the video from the last time “Stump The Chump” was in Boston: Lucene Revolution 2012. (Regrettably, there is no video from that first Stump The Chump in 2010)

Information on how to submit questions can be found on the session agenda page, and I’ll be posting more details with the Chump tag as we get closer to the conference.

(And don’t forget to register for the conference ASAP if you plan on attending! The registration price will be increasing on September 16th.)

The post Hello Boston, My Old Friend appeared first on Lucidworks.com.

LITA: Learn about “Online Productivity Tools” at this LITA webinar

Tue, 2016-09-13 15:14

Online Productivity Tools: Smart Shortcuts and Clever Tricks

Presenter: Jaclyn McKewan
Tuesday November 8, 2016
11:00 am – 12:30 pm Central Time

Register Online, page arranged by session date (login required)

This course has been re-scheduled from a previous date.

Become a lean, mean productivity machine!

In this 90 minute webinar we’ll discuss free online tools that can improve your organization and productivity, both at work and home. We’ll look at to-do lists, calendars, and other programs. We’ll also explore ways these tools can be connected, as well as the use of widgets on your desktop and mobile device to keep information at your fingertips. Perfect for any library workers who spend a significant portion of their day at a computer.

Details here and Registration here

Webinar takeaways will include:

  • Keep track of regular repeating tasks by letting your to-do list remember for you
  • Connect your calendars and to-do lists
  • Use mobile and desktop widgets to keep information at your fingertips

Jaclyn McKewan is the Digital Services Coordinator at WNYLRC, where she has worked since 2008. Her job duties include managing the Ask Us 24/7 virtual reference program, New York Heritage Digital Collections, and internal networking/IT.

And don’t miss other upcoming LITA fall continuing education offerings:

Social Media For My Institution; from “mine” to “ours”
Instructor: Plamen Miltenoff
Starting Wednesday October 19, 2016, running for 4 weeks
Register Online, page arranged by session date (login required)

Beyond Usage Statistics: How to use Google Analytics to Improve your Repository
Presenter: Hui Zhang
Tuesday, October 11, 2016
11:00 am – 12:30 pm Central Time
Register Online, page arranged by session date (login required)

Questions or Comments?

For questions or comments, contact LITA at (312) 280-4268 or Mark Beatty, mbeatty@ala.org

David Rosenthal: Scary Monsters Under The Bed

Tue, 2016-09-13 15:00
So don't look there!

I sometimes hear about archives which scan for and remove malware from the content they ingest. It is true that archives contain malware, but this isn't a good idea:
  • Most content in archives is never accessed by a reader who might be a target for malware, so most of the malware scan effort is wasted. It is true that increasingly these days data mining accesses much of an archive's content, but it does so in ways that are unlikely to activate malware.
  • At ingest time, the archive doesn’t know what it is about the content that future scholars will be interested in. In particular, it doesn’t know that the scholars aren’t studying the history of malware. By modifying the content during ingest it may be destroying the content’s usefulness to future scholars.
  • Scanning and removing malware during ingest doesn't guarantee that the archive contains no malware, just that it doesn't contain any malware known at the time of ingest. If an archive wants to protect readers from malware, it should scan and remove it as the preserved content is being disseminated, creating a safe surrogate for the reader. This will guarantee that the reader sees no malware known at access time, likely to be a much more comprehensive set.
This is essentially the same argument that lies behind the LOCKSS system's approach to format migration, demonstrated more than a decade ago: create temporary access surrogates on demand in the dissemination pipeline, in a less doomed format or shorn of malware, as the case may be.
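As a purely illustrative sketch (none of this code comes from LOCKSS or any real archive; the signature list and function names are hypothetical stand-ins for a real, regularly updated malware scanner), the idea of preserving content verbatim and defanging it only at access time might look like this:

```python
# Hypothetical sketch: the archive stores content exactly as received,
# and only creates a cleaned "safe surrogate" when a reader requests it.
# KNOWN_SIGNATURES stands in for a real scanner's database, which keeps
# growing after ingest -- which is exactly why scanning at access time
# catches more than scanning at ingest time.

KNOWN_SIGNATURES = [b"EVIL_PAYLOAD"]  # updated over time, unlike the archive

def ingest(store, item_id, content):
    """Store the content exactly as received -- no modification."""
    store[item_id] = content

def make_safe_surrogate(content):
    """Create a temporary, defanged copy for the reader on demand."""
    surrogate = content
    for sig in KNOWN_SIGNATURES:
        surrogate = surrogate.replace(sig, b"[REMOVED]")
    return surrogate

def disseminate(store, item_id):
    """Readers see a surrogate cleaned against signatures known *now*."""
    return make_safe_surrogate(store[item_id])

store = {}
ingest(store, "doc-1", b"header EVIL_PAYLOAD footer")
safe = disseminate(store, "doc-1")
# The preserved original stays intact; only the access copy is defanged.
```

Note that the stored original is never touched, so a future scholar of malware history still has the authentic bits.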

See, for example, the Internet Archive's Malware Museum, which contains access surrogates of malware which has been defanged.

DPLA: GIF IT UP Returns on October 1

Tue, 2016-09-13 15:00
Calling all gif makers, creatives, history nuts, animators, and more! GIF IT UP — DPLA’s annual competition seeking innovative and endlessly looping uses of archival videos and images — returns on October 1.

The rules are simple:

  1. Find your favorite piece of copyright-free material from DPLA, Europeana, Trove, or DigitalNZ
  2. Create a sweet gif
  3. Submit it for a chance to win some nifty prizes

To find out more about the 2016 competition, including available prizes and submission rules, visit https://dp.la/info/gif-it-up/. In the lead-up to the October 1st kick-off, here are some fun and easy ways that you can start your source material exploration and build your gif-making skills.

Join our free gif-making workshops

Interested in participating in this year’s competition but aren’t sure how to make a gif? Looking to sharpen your existing gif-making skills with some more advanced techniques? Look no further! We’ve enlisted the help of some gif experts to teach you how to get started with gifs using open materials and beyond.

Workshop #1: GIF-Making 101, Wed, September 21, 3pm – 4pm Eastern

Ever wondered how to make an animated gif? Join gif-making experts Shaelyn Amaio (Consultant at Lord Cultural Resources) and Derek Tulowitzky (Web, Social Media, and Outreach Manager at the Muncie Public Library) for an hour long webinar workshop on how to make gifs using open materials found in DPLA and other digital libraries. The workshop will cover what gifs are, how to find suitable materials in DPLA and elsewhere, and how to make a simple gif. This workshop is the first part of a two-part series leading up to the GIF IT UP 2016 competition (October 1-31, 2016). Part two will cover advanced gif-making techniques. Attendees are encouraged but not required to attend both sessions.

Register for workshop

Workshop #2: Advanced GIF-Making Techniques, Wed, September, 28, 3pm – 4pm Eastern

Join us for a hands-on, hour long workshop on how to use photo editing software to perform advanced gif-making techniques, such as how to use frame animation in order to make objects disappear and then reappear, move around, and change color. This workshop will be led by two seasoned gif-making vets, Richard Naples (Outreach and Education Technical Information Specialist at the Smithsonian Institution) and Darren Cole (Digital Engagement Specialist at the National Archives and Records Administration’s Office of Innovation). This workshop is part two of a two-part series leading up to the GIF IT UP 2016 competition (October 1-31, 2016). Part one will provide a basic introduction to gifs and the materials used to make them. Attendees are encouraged but not required to attend both sessions.

Register for workshop

Where to Start? Explore DPLA and other participating digital libraries

GIF IT UP is all about exploring DPLA and the other participating digital libraries for the perfect piece of open content. If you’re not sure what type of material you should be looking for when creating a gif, here are some helpful suggestions to get you started.

That’s just a taste of the types of materials that can be found in DPLA and the other participating digital libraries. To explore the many open collections available for the competition, check out our list of select public domain and open collections for re-use.

Need Inspiration? Check out past competition galleries

This is the third year of GIF IT UP, so we have an awesome array of gifs from our previous couple of competitions that may help get your creative juices flowing.

2015

Following the inaugural GIF IT UP in 2014, the competition returned in 2015, seeking innovative and endlessly looping uses of archival videos and images. In 2015 the challenge expanded internationally with support from Europeana and Trove and featured an esteemed line-up of judges and cool prizes. Check out last year’s winning gifs.


2014

Over the course of Fall 2014, DPLA and DigitalNZ held GIF IT UP, an international competition to find the best GIFs reusing public domain and openly licensed digital video, images, text, and other material available via our search portals. Check out the 2014 winning gifs.

GIF IT UP: Reusable gifs from DPLA & DigitalNZ

LITA: Transmission #9 – #litadanceparty

Tue, 2016-09-13 15:00

It’s our ninth webisode, and we’re lucky to be joined by Whitni Watkins, intrepid Web Systems Engineer at Analog Devices and long-time LITA Blogger. Watch this ‘sode, read her posts, and check out her web presence.

Begin Transmission will return on 9/26/2016!

LITA: Deadline Extended, Call for Proposals, LITA @ ALA Annual 2017

Mon, 2016-09-12 15:32

The proposals submission deadline for LITA programs at the 2017 ALA Annual conference has been extended two weeks until September 23, 2016.

Submit Your Call for Proposals for the 2017 Annual Conference Programs and Preconferences!

The LITA Program Planning Committee (PPC) is now accepting innovative and creative proposals for the 2017 Annual American Library Association Conference. We’re looking for 60- and 90-minute conference presentations. In addition to program session proposals, we are also eager to see your proposals for half-day or full-day preconferences to help participants develop skills through interactive learning. The focus should be on technology in libraries, whether that’s use of, new ideas for, trends in, or interesting/innovative projects being explored – it’s all for you to propose.

When and Where is the Conference?

The 2017 Annual ALA Conference will be held in Chicago, IL, from June 22nd through 27th.

What kind of topics are we looking for?

We’re looking for programs of interest to all library/information agency types that inspire technological change and adoption, and/or generally go above and beyond the everyday.

We regularly receive many more proposals than we can program into the 20 slots available to LITA at the ALA Annual Conference. These great ideas and programs all come from contributions like yours. We look forward to hearing the great ideas you will share with us this year.

This link from the 2016 ALA Annual conference scheduler shows the great LITA programs from this past year.

When are proposals due?

September 23, 2016

How do I submit a proposal?

Fill out this form bit.ly/litacfpannual2017

Program descriptions should be 150 words or less.

When will I have an answer?

The committee will begin reviewing proposals after the submission deadline; notifications will be sent out on October 3, 2016.

Do I have to be a member of ALA/LITA? or a LITA Interest Group (IG) or a committee?

No! We welcome proposals from anyone who feels they have something to offer regarding library technology. Unfortunately, we are not able to provide financial support for speakers. Because of the limited number of programs, LITA IGs and Committees will receive preference where two equally well written programs are submitted. Presenters may be asked to combine programs or work with an IG/Committee where similar topics have been proposed.

Got another question?

Please feel free to email Nicole Sump-Crethar (PPC chair) (sumpcre@okstate.edu)

DPLA: Michael Della Bitta Joins DPLA as Developer for Data and Usage Analytics

Mon, 2016-09-12 14:00

The Digital Public Library of America is pleased to announce that Michael Della Bitta is joining its staff as Developer for Data and Usage Analytics, beginning September 12, 2016.

In this role, Della Bitta will work with DPLA’s Technology Team to process, evaluate, and share information about how DPLA’s diverse collections are discovered, used, and shared through our user-facing platforms, social media, and APIs. Della Bitta will also play a key role in improving data ingestion systems and supporting DPLA’s ongoing work to assess and provide meaningful feedback to our partners about data quality to enhance discoverability and use of collections internally, across our partner network, and by our broad community of users.

“Michael’s experience in digital libraries, high volume data processing and analysis, and interest in serving the cultural heritage sector make him a valuable member of the DPLA Team,” said Director of Technology Mark Matienzo. “I believe his background will serve him well in making our operations run more smoothly and efficiently, and will benefit the DPLA Network as a whole.”

Prior to joining DPLA, Michael worked in software development and publishing in the startup, library, and education spaces for nearly twenty years. Michael most recently worked as a data and analytics developer, architect, and engineering manager at the content marketing company ScribbleLive. Before that, Michael worked as a developer and architect on the repository and Digital Gallery teams at The New York Public Library, and built content management, online learning, and semantic metadata applications at Columbia University. Michael holds a B.A. in Philosophy from Bates College.

Welcome, Michael!


Islandora: Islandora CLAW Sprint 10 - Complete!

Mon, 2016-09-12 13:08

The 10th Islandora CLAW Community Sprint finished up last week. Running August 22nd to September 5th, this sprint was mostly about learning and design, with "homework" tickets to read up on specifications, and long discussions about how various pieces of CLAW should work. You can do a little homework of your own and follow the discussions about ORE and IIIF.

The MVP for this sprint was Everyone. We had some really great discussions, both in GitHub issues and via IRC (#islandora on freenode).

Danny Lamb (Islandora Foundation)
Nick Ruest (York University)
Jared Whiklo (University of Manitoba)
Diego Pino (Metro.org)
Melissa Anez (Islandora Foundation)
Ed Fugikawa (University of Wyoming)
Nat Kanthan (University of Toronto Scarborough)
Kirsta Stapelfeldt (University of Toronto Scarborough)
Kim Pham (University of Toronto Scarborough)
Bryan Brown (Florida State University)

Next up is CLAW Sprint 11, running September 19th - October 3rd. A few issues are listed here, with more to come. Non-developers may be interested in signing on for Homework Ticket #360, where we will be exploring the Drupal 8 UI. You can sign up for the sprint here.

Open Knowledge Foundation: Open Knowledge Austria Summer 2016 update

Mon, 2016-09-12 09:38

This blog post is part of our summer series featuring updates from chapters across the Open Knowledge Network and was written by the team of Open Knowledge Austria.

The last two months have been very vibrant within Open Knowledge Austria. We co-organized the monthly Vienna Open Data Meetup, but made no other public appearances because we were setting up some major projects. We also held our bi-annual plenary meeting and the election of the new board.

 

First the projects:

  1. Our project OpenDataPortal, in cooperation with Wikimedia AT, got funded by the Austrian Ministry for Mobility, Innovation, and Technology (BMVIT). In the so-called “Data Pioneers” project, we work with companies on open innovation strategies around using and sharing open data. The central goal is to work out use-cases and narratives for companies in order to get them to open up some of their own data. We will organize two workshops and one hackathon in the next months and we will guide the companies during the opening process.
  2. We will organize our second data literacy event for kids, for the first time under the branding of the German project “Jugend hackt”. The 3-day event will take place in Linz at the beginning of November and will show children between 12 and 18 how to code with open data.

Second, our governance.  

The members of our brand new Open Knowledge Austria board for the next two years are:

  • Stefan Kasberger – @stefankasberger, stefan.kasberger@okfn.at
  • Christopher Kittel – @chris_kittel, christopher.kittel@okfn.at
  • Clara Landler – @clara_l, clara.landler@okfn.at


They are now planning the activities for the next half year, setting up a working group for a funding strategy and one for a community strategy, and figuring out a guideline for better handling of projects. The current situation in numbers does not look good: we have one person employed for 5 hours a week, a few thousand euros to survive the next months, and about 30 volunteer members. The good news: it’s getting better each day, but there are still huge challenges in front of us.

The next months will be the most active time of Open Knowledge Austria so far. The very active and well-organized Open Science working group will organize a hackathon and a meetup around OpenKnowledgeMaps and disseminate their past activities and involvements, like the Vienna Principles and the copyright recommendations from the OANA (Open Access Network Austria) working groups. Additionally, Michela Vignoli, coordinator of the Open Science group alongside Peter Kraker, was nominated for the EU Commission’s Open Science Policy Platform for her involvement in the YEAR network, and will also represent the interests of the Open Science community.

We will also start an Austrian City Open Data Census and, after the disastrous last presidential elections, a project about open data in elections called “Offene Wahlen Österreich”. Alongside the already mentioned Jugend hackt and BMVIT events, we will co-organize the Vienna Open Data Meetups as usual, as well as a panel about net-political processes at the European level at the Elevate Festival in Graz. And last but not least: the above-mentioned working groups on funding and community strategy will start their activity; input welcome.

In terms of collaboration, we can offer expertise in the field of Open Science, Knowledge Discovery, Content Mining, Open Data Repositories, Data Literacy and Data Science Trainings. If there is an interest in the outcome of the funding- and community strategy, just ask, and we will try to translate it at the end. In general, we are always happy about international cooperation and we are looking forward to requests and feedback from other Open Knowledge chapters.

Jez Cope: Software Carpentry: SC Track; hunt those bugs!

Mon, 2016-09-12 07:50

This competition will be an opportunity for the next wave of developers to show their skills to the world — and to companies like ours. — Dick Hardt, ActiveState (quote taken from SC Track page)

All code contains bugs, and all projects have features that users would like but which aren’t yet implemented. Open source projects tend to get more of these as their user communities grow and start requesting improvements to the product. As your open source project grows, it becomes harder and harder to keep track of and prioritise all of these potential chunks of work. What do you do?

The answer, as ever, is to make a to-do list. Different projects have used different solutions, including mailing lists, forums and wikis, but fairly quickly a whole separate class of software evolved: the bug tracker, which includes such well-known examples as Bugzilla, Redmine and the mighty JIRA.

Bug trackers are built entirely around such requests for improvement, and typically track them through workflow stages (planning, in progress, fixed, etc.) with scope for the community to discuss and add various bits of metadata. In this way, it becomes easier both to prioritise problems against each other and to use the hive mind to find solutions.
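As a purely illustrative sketch (the stage names and fields here are generic, not taken from Bugzilla, Redmine, JIRA, or any other real tracker), the core idea of an issue moving through workflow stages while accumulating metadata and discussion can be modeled in a few lines:

```python
# Illustrative sketch of the core bug-tracker idea: an issue carries
# metadata for triage and moves through a fixed workflow of stages,
# while the community attaches discussion to it.

WORKFLOW = ["new", "planned", "in-progress", "fixed", "closed"]

class Issue:
    def __init__(self, title, priority="normal"):
        self.title = title
        self.priority = priority   # metadata used to rank issues against each other
        self.state = "new"
        self.comments = []         # community discussion lives here

    def advance(self):
        """Move the issue to the next workflow stage, if one remains."""
        i = WORKFLOW.index(self.state)
        if i < len(WORKFLOW) - 1:
            self.state = WORKFLOW[i + 1]

bug = Issue("Crash on empty input", priority="high")
bug.comments.append("Reproduced on v1.2; stack trace attached.")
bug.advance()  # new -> planned
bug.advance()  # planned -> in-progress
```

Real trackers layer search, notifications, permissions, and integrations on top of this skeleton, which is exactly the complexity that makes them heavyweight for single-developer projects.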

Unfortunately most bug trackers are big, complicated beasts, more suited to large projects with dozens of developers and hundreds or thousands of users. Clearly a project of this size is more difficult to manage and requires a certain feature set, but the result is that the average bug tracker is non-trivial to set up for a small single-developer project.

The SC Track category asked entrants to propose a better bug tracking system. In particular, the judges were looking for something easy to set up and configure without compromising on functionality.

The winning entry was a bug tracker called Roundup, proposed by Ka-Ping Yee. Here we have another tool which is still in active use and development today. Given that there is now a huge range of options available in this area, including the mighty GitHub, this is no small achievement.

These days, of course, GitHub has become something of a de facto standard for open source project management. Although GitHub is ostensibly a version control hosting platform, each repository comes with a built-in issue tracker, which is well integrated with the “pull request” workflow that allows contributors to submit bug fixes and features themselves.

GitHub’s competitors, such as GitLab and Bitbucket, also include similar features. Not everyone wants to work in this way, though, so it’s good to see that there is still a healthy ecosystem of open source bug trackers, and that Software Carpentry is still having an impact.
