ARCHITECTING ScholarSphere: How We Built a Repository App That Doesn't Feel Like Yet Another Janky Old Repository App
Pitfall! Working with Legacy Born Digital Materials in Special Collections
Hacking the DPLA
The Digital Public Library of America is a growing open-source platform to support digital libraries and archives of all kinds. DPLA-alpha is available for testing, with data from six initial Hubs. New APIs and data feeds are in development, with the next release scheduled for April. Come learn what we are doing, how to contribute or hack the DPLA roadmap, and how you (or your favorite institution) can draw from and publish through it. Larger institutions can join as a (content or service) hub, helping to aggregate and share metadata and services from across their {region, field, archive-type}. We will discuss current challenges and possibilities (UI and API suggestions wanted!), apps being built on the platform, and related digitization efforts. DPLA has a transparent community and planning process; new participants are always welcome. Half the time will be for suggestions and discussion. Please bring proposals, problems, partnerships and possible paradoxes to discuss.
EAD without XSLT: A Practical New Approach to Web-Based Finding Aids
The Avalon Media System: A Next Generation Hydra Head For Audio and Video Delivery
Practical Relevance Ranking for 10 million books
HathiTrust Full-text search indexes the full text and metadata for over 10 million books. There are many challenges in tuning relevance ranking for a collection of this size. This talk will discuss some of the underlying issues, some of our experiments to improve relevance ranking, and our ongoing efforts to develop a principled framework for testing changes to relevance ranking. Some of the topics covered will include:
n Characters in Search of an Author
When it comes to author names, the disconnect between our metadata and what a user might enter into a search box presents challenges when trying to maximize both precision and recall [0]. When indexing a paper written by "Wäterwheels, A", a goal should be to preserve as much of the original information as possible. However, users searching by author name may frequently omit the diaeresis and search simply for "Waterwheels". The reverse of this scenario is also possible, i.e., your decrepit metadata contains only the ASCII "Supybot, Zoia", whereas the user enters "Supybot, Zóia". If recall is your highest priority, the simple solution is to always downgrade to ASCII when indexing and querying. However, this strategy sacrifices precision, as you will be unable to provide an "exact" search, which is necessary in cases where "Hacker, J" and "Häcker, J" really are two distinct authors.
Evolving Towards a Consortium MARCR Redis Datastore
Citation search in SOLR and second-order operators
Citation search is basically about connections (Is the paper read by a friend of mine more important than others? Can I get a paper read by somebody who cites many papers or is cited by many papers?), but the implementation of the citation search is surprisingly useful in many other areas. I will show the 'guts' of the new citation search for astrophysics; it is generic and can be applied recursively to any Lucene query. Some people would call it a second-order operation because it works with the results of the previous (search) function. The talk will cover technical details of the special query class, its collectors, how to add a new search operator and how to influence relevance scores. Then you can type with me:
friends_of(friends_of(cited_for(keyword:"black holes") AND keyword:"red dwarf"))
Hybrid Archival Collections Using Blacklight and Hydra
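As an illustration of the second-order operators described in the citation search abstract above, here is a minimal Python sketch of the general idea: an operator that takes the result set of an inner query and returns a new result set derived from a citation graph. Everything in it is an assumption made for illustration. The toy corpus, the in-memory sets, and the operator names `citing` and `cited_by` are hypothetical; they are not the talk's Lucene query class, and they are not meant to reproduce `friends_of` or `cited_for`, whose semantics the abstract does not define. Relevance scoring is left out entirely.

```python
from typing import Dict, Set

# Hypothetical toy corpus: paper id -> keywords attached to that paper.
KEYWORDS: Dict[str, Set[str]] = {
    "p1": {"black holes"},
    "p2": {"red dwarf"},
    "p3": {"black holes", "red dwarf"},
    "p4": {"galaxies"},
}

# Hypothetical citation graph: paper id -> ids of the papers it cites.
CITES: Dict[str, Set[str]] = {
    "p1": set(),
    "p2": {"p1"},
    "p3": {"p1", "p2"},
    "p4": {"p3"},
}


def keyword(term: str) -> Set[str]:
    """First-order query: ids of papers carrying the given keyword."""
    return {pid for pid, kws in KEYWORDS.items() if term in kws}


def citing(results: Set[str]) -> Set[str]:
    """Second-order operator: papers that cite at least one paper in `results`."""
    return {pid for pid, refs in CITES.items() if refs & results}


def cited_by(results: Set[str]) -> Set[str]:
    """Second-order operator: papers cited by at least one paper in `results`."""
    out: Set[str] = set()
    for pid in results:
        out |= CITES.get(pid, set())
    return out


if __name__ == "__main__":
    # Operators nest because each one consumes and returns a result set.
    nested = citing(citing(keyword("black holes")) & keyword("red dwarf"))
    print(sorted(nested))
```

Because every operator both accepts and returns a set of document ids, operators can be nested to any depth and combined with ordinary boolean logic (here `&` stands in for AND), which is the property the abstract's nested query relies on.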