news aggregator

Cohen, Dan: Digital Humanities and the Disciplines

planet code4lib - Wed, 2008-10-01 23:36

On Thursday and Friday, October 2-3, 2008 (that is, starting tomorrow, if you’re reading this immediately from my feed) I’ll be at Rutgers University for the conference “Digital Humanities and the Disciplines,” sponsored by the Center for Cultural Analysis. If you’re in the area, please stop by—the conference is open to the public. If I can find some wifi I’ll also do my best to blog the conference and send brief updates via my Twitter feed (which I’ve been neglecting lately; sorry, been a little busy).

Equinox Software Incorporated: Happy days are here again!

planet code4lib - Wed, 2008-10-01 22:51

There’s a small element of defiance in that blog post title. The news “out there” in the wider world is a little rough, no doubt about it. But I am going to keep my eyes fixed on the good stuff happening around us — and it’s exciting to watch libraries joining the Evergreen community and watch Evergreen continue to grow and mature.

The National Weather Center Library will be the first (known) special library to “go Evergreen.” Evergreen scales all the way down to home libraries and all the way up to huge consortia such as Georgia PINES, and it also fits every type of library.

Meanwhile, Grand Rapids Public Library went live with Evergreen on Monday, September 29. GRPL is the second Michigan library to go live with Evergreen. You can follow the Michigan Evergreen Project as five more libraries go live before 2009 (they’re even blogging).

Also, in addition to all the great new features in 1.4 — which is coming along briskly — Evergreen is going to get some very pretty foliage, just in time for the fall fashion lineup! Yes, there will be a very lovely, wonderful new design option for Evergreen’s catalog — not just more attractive, but more usable and a little bit more lightweight.

Plus we did a special Acquisitions webinar for Georgia PINES (the mother ship for Evergreen) and it went great. PINES is now doing what other Evergreen communities might consider: growing a committee to engage with us on acquisitions development. Expect more webinars on other topics, announced more widely.

Next, a small thing, but in case you hadn’t noticed, in the last month or so the Evergreen project site has spiffy new Frequently Asked Questions (in three acts, no less!) and a simpler-to-follow Roadmap, and also sports a new Open Source Glossary and updated Evergreen Feature Request Procedures.

Finally, we have found several terrific documentation writers — woohoo! (But if you were thinking “gee, you know, I’m still interested,” drop me a note at kgs at esi library dot com — we have some other documentation needs.)

Nodalities (Talis): This Week’s Semantic Web

planet code4lib - Wed, 2008-10-01 22:48

Special Edition : SIOC Update

I had a man cold when I should have been doing my duty, but with no apologies (fairly safely assuming John has a CC-with-attribution kind of policy) here’s a good proxy :

It’s time for another installment from the world of SIOC!

Previous SIOC-o-sphere articles:

#7 http://sioc-project.org/node/328
#6 http://sioc-project.org/node/310
#5 http://sioc-project.org/node/294
#4 http://sioc-project.org/node/272
#3 http://sioc-project.org/node/271
#2 http://sioc-project.org/node/138
#1 http://sioc-project.org/node/79

If you wish to contribute to the next article, join the SIOC Twine and use the tag “siocosphere9” when you add items.

Voss, Jakob: Digital libraries sleep away the web 2.0

planet code4lib - Wed, 2008-10-01 21:58

From time to time I still publish on paper, so I have to deposit the publication in a repository to make it (and its metadata) available; mostly I use the “open archive for Library and Information Science” named E-LIS. But each time I get angry because uploading and describing a submission is so complicated - especially compared to popular commercial repositories like flickr, slideshare, youtube and such. These web applications pay a lot of attention to usability - which sadly is of low priority in many digital libraries.

I soon realized that E-LIS uses a very old version (2.13.1) of GNU EPrints. EPrints 3 has been available since December 2006 and there have been many updates since then. To find out whether it is usual to run a repository with such outdated software, I did a quick study. The Registry of Open Access Repositories (ROAR) should list all relevant public repositories that run EPrints. With 30 lines of Perl I fetched the list (271 repositories) and queried each repository via OAI to find out the version number. Here is the summarized result:

76 x unknown (script failed to get or parse OAI response), 8 x 2.1, 18 x 2.2, 98 x 2.3, 58 x 3.0, 13 x 3.1

Of the 195 repositories that I could successfully query and whose version number I could determine, only 13 use the newest version, 3.1 (released September 8th). Moreover, 124 still use version 2.3 or older. EPrints 2.3 was released before the web 2.0 hype in 2005! One true point of this web 2.0 bla is the concept of “perpetual beta”: release early and often and follow user feedback, so your application will quickly improve. But most repository operators do not seem to have a real interest in improvement and in their users!
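The original survey used about 30 lines of Perl that are not reproduced here. As a rough illustration only, the version-tallying step could look like the following Python sketch. It assumes that a repository reports its software version in the optional OAI "toolkit" description of its Identify response. Real EPrints installations may expose the version differently, and the sample response below is made up for the example.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter

# Assumption: the version is exposed via the optional OAI "toolkit"
# description block. This is not confirmed by the post.
TOOLKIT_NS = "{http://oai.dlib.vt.edu/OAI/metadata/toolkit}"

# Made-up Identify response, used only to exercise the functions below.
SAMPLE_IDENTIFY = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <Identify>
    <description>
      <toolkit xmlns="http://oai.dlib.vt.edu/OAI/metadata/toolkit">
        <version>3.1.0</version>
      </toolkit>
    </description>
  </Identify>
</OAI-PMH>"""


def extract_version(identify_xml):
    """Return 'major.minor' from an Identify response, or 'unknown'."""
    try:
        root = ET.fromstring(identify_xml)
    except ET.ParseError:
        return "unknown"  # response could not be parsed
    node = root.find(f".//{TOOLKIT_NS}version")
    if node is None or not node.text:
        return "unknown"  # no version information present
    match = re.match(r"\s*(\d+)\.(\d+)", node.text)
    return f"{match.group(1)}.{match.group(2)}" if match else "unknown"


def tally(responses):
    """Count how many responses fall into each major.minor bucket."""
    return Counter(extract_version(r) for r in responses)
```

Fetching the Identify responses over HTTP is left out on purpose, so the sketch runs without network access. Anything that cannot be fetched or parsed would land in the "unknown" bucket, as in the summary above.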

Ok, I know that managing and updating a repository server is work - I would not be the right guy for such a job - but then don’t wail over low acceptance or wonder why libraries have an antiquated image. For real progress one should perpetually do user studies and engage in the development of one’s software. Digital libraries with fewer resources should at least join the Community and follow updates to keep up to date.

"Python urllib" by mikeybe

#code4lib paste - Wed, 2008-10-01 20:36
Paste to channel #code4lib with 0 annotations.

Matienzo, Mark: The Apex of Hipster XML GeekDOM: TEI-Encoded Dylan

planet code4lib - Wed, 2008-10-01 20:26

Via Language Log: The Electronic Textual Cultures Lab (ETCL) at the University of Victoria has, in an effort to draw more attention to TEI, chosen to prepare an encoded version of the lyrics to Bob Dylan's "Subterranean Homesick Blues" and overlaid the resulting XML over the song's video. The resulting video is available, naturally, on YouTube.

ETCL's Ray Siemens writes about the reasoning behind this on the TEI Video Widgets blog: At the last gathering of the Text Encoding Initiative Consortium, in Maryland, a few of us were discussing the ways in which TEI has eluded some specific types of social-cultural representation that are especially current today . . . things like an avatar, or something that could manifest itself as a youtube posting. A quick search of youtube did reveal a significant and strong presence of sorts, but it was that of Tei the Korean pop singer (pronounced, we’re told, ‘tay’); so, our quest began there, setting out modestly to create a video widget that would balance T-E-I and Tei in the youtube world.

In addition, a high-quality MP4 video is available for download for those so inclined. This has just about made my week.

Bisson, Casey: Solaris’ CacheFS Could Be The Space Ship I’ve Been Looking For

planet code4lib - Wed, 2008-10-01 18:31

Joerg Moellenkamp’s post explaining CacheFS has me excited:

Long ago, admins didn’t want to manage dozens of operating system installations. Instead of this they wanted to store all this data on a central fileserver (you know, the network is the computer). Thus netbooting Solaris and SunOS was invented. But there was a problem: All the users started to work at 9 o’clock. They switched on their workstations and the load on the fileserver and the network got higher and higher. Thus the idea of CacheFS [as a way of using the speed of local disk and the convenience of central management] was born.

Remove the corporate office and the uncaring sysadmins, and CacheFS might be exactly what I’m looking for to elastically expand the capacity of my laptop’s internal storage. This isn’t some newfangled technology; Sun developed it in the early 90s and it’s been available for Linux since 2003 (try it in Gentoo). And most importantly, it’s “designed to be as transparent as possible to a user of the system. Applications should just be able to use NFS files as normal, without any knowledge of there being a cache.”

The local cache isn’t expected to be a complete mirror of the remote filesystem, just the recently opened files. So the capacity of your local disk is limited only by your willingness to wait for files to be retrieved from the network. The biggest problem is figuring out what happens when the network isn’t available. CacheFS doesn’t appear to solve that and would likely fail if the network dropped.
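To make the behaviour described above concrete, here is a toy Python model of a read-through cache. It is not how CacheFS is implemented. The class name, the least-recently-used eviction policy and the in-memory "remote" are all assumptions made for illustration. It shows the two points from the paragraph: only recently opened files are kept locally, and a miss fails when the network is down.

```python
from collections import OrderedDict


class ReadThroughCache:
    """Toy model of a small local cache in front of a remote file store."""

    def __init__(self, fetch_remote, capacity):
        self._fetch_remote = fetch_remote  # callable: path -> bytes, may raise
        self._capacity = capacity          # max number of files kept locally
        self._local = OrderedDict()        # path -> bytes, least recent first

    def read(self, path):
        if path in self._local:
            self._local.move_to_end(path)  # mark as most recently used
            return self._local[path]
        data = self._fetch_remote(path)    # cache miss: needs the network
        self._local[path] = data
        if len(self._local) > self._capacity:
            self._local.popitem(last=False)  # evict least recently used file
        return data


# Hypothetical stand-in for a file server, with a switch for "network down".
remote_files = {"a": b"A", "b": b"B", "c": b"C"}
network = {"online": True}
remote_calls = []


def fetch(path):
    if not network["online"]:
        raise ConnectionError("network unavailable")
    remote_calls.append(path)
    return remote_files[path]


cache = ReadThroughCache(fetch, capacity=2)
for name in ["a", "b", "a", "c"]:  # "b" is evicted when "c" arrives
    cache.read(name)
```

With the network switched off, `cache.read("a")` still succeeds from the local copy, while `cache.read("b")` raises `ConnectionError`. That is the gap the post identifies: a cache of recently opened files does not by itself give offline access to everything else.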

I know nothing about filesystem development, but this challenge is interesting enough to make me consider jumping in. The availability of a partial solution helps too.

Ups to hindesite for the sweet drive photo. Too bad about what happened to it, though.

Powell, Andy and Johnston, Pete: Losing it

planet code4lib - Wed, 2008-10-01 16:57

I spent much of yesterday in what felt like a time warp - sorry, I can't think of a nicer way of putting it.

I was at the JISC Services Skills event, Illuminating Event Management, a day that was intended to "explore all aspects of Event Management, from traditional 'Dressing a Stand' through to new and novel methods such as using web 2.0 to enhance your event".  Unfortunately, on the day, the event felt far more "traditional" than "novel" - since when did a 'skills' day involve listening to presentations that wouldn't have been out of place 10 years ago?

I'm not being critical of the organisers here - on paper they looked to have pulled together an interesting set of sessions covering event management, getting the most from your conference stand, the use of online conferencing tools, the impact of Web 2.0 and Second Life and so on.  No... it was just the way the day panned out I think, in part because the scheduled speaker on Web 2.0 (Matt Jukes) was unable to attend.  As a result, the day lacked some of the balance that it might otherwise have had.

You can get a feel for the day by reading my live-blog for the event on eFoundations LiveWire - but note that I was pretty despondent by the end and not typing much :-(  Look, I know it's important to label the vegetarian options correctly at lunchtime - 't was ever thus - and I accept that we don't always do it successfully at our Eduserv events (despite having a vegetarian on the team) but did we really need that level of information from a 'skills' day?  JISC is supposed to be about innovation... right?

Where was the stuff about the amplified conference?  About using tags successfully?  About streaming options?  About Flickr and Crowdvine and blogging and live-blogging and Slideshare and ... oh, you get the picture.  I'd expect these things to be at the forefront of every event manager's thinking these days?  In our sector at least.  This stuff isn't that cutting edge after all... look at this paper by Brian Kelly et al. from 2005.

Instead, the closest we got to the Web during the first presentation were some URLs for venue searches (very useful BTW) and a suggestion that you need to get all your presenters to sign a bit of paper saying they are happy for you to put their slides on the Web (as PDF - OMG!).  I was desperate to do a James Clay - leaping up with my iPhone streaming live to qik.com to ask the speaker if she'd like me to ask her to sign a bit of paper.  This stuff is out there - get used to it.  In many cases, it's not even happening over our networks anymore.

Grace Porter of the JISC was up second.  She spoke about her event manager's toolkit - essentially a wiki (to which people in the community are invited to contribute).  This was more like it!  Good stuff. I've always thought that there was space for a social network of some kind for event managers - sharing reviews about venues, information about streaming providers, sample budget templates and the like.  This sounds spot on to me and I'll certainly try and get the guys here involved.  Grace also talked about making events greener, again a useful and timely contribution.

Then there was a talk about getting the most out of your conference exhibition stand.  My innovative side wondered if we'd hear something about using an ARG to get people to your stand.  Maybe something about Moo cards at the very least.  Alas, no - just advice about dress codes, setting 'new contact' targets for staff on the stand and remembering to shower before turning up!  Hmmm...

Accessibility seemed to feature very highly in the day - I'm not quite sure why?  Not that I have anything against accessibility you understand.  But two presentations, one about 'accessible email'  - surely that was over the top (even just as a way to demonstrate some remote presentation software)?

Then in the afternoon we had presentations about using online conferencing systems - particularly focussing on Elluminate and Wimba.  This was much more on target (for me at least) and it was interesting to see the tools in action.

Is it just me that hates the use of Java in systems like this?  I know these tools are now the accepted norm but I find Java applications pretty much unbearable!  I tried to construct a question around this in terms of accessibility but all I got back was assurance that they were fully accessible (whatever that means).  I didn't make myself clear enough.  Accessibility is about inclusion - it's a social thing more than a technical thing.  Java applications aren't inclusive because they're bloody horrible.  I guess it's just a personal thing...

So what else did I learn?

That Networkshop attendees don't like people typing on their laptops while they are listening to presentations - at least not according to the evaluation forms.  Hmmm... all that proves is that luddites are at least as loud on evaluation forms as evangelists.  The reality is probably somewhere in the middle?  And if the loudness of typing really is a problem, how about putting all your mains sockets in one area of the auditorium, thus naturally pulling all the live-bloggers together in one place and letting everyone else sleep peacefully.

Oh... and that delegates to virtual conferences can sometimes be stupid enough to want to tell you their dietary requirements! Lol.

So, there was some stuff I found useful and some stuff I didn't and for some reason I allowed the latter to get the better of me.  The straw that broke the camel's back (for me) was a question from the audience about whether the DPA allows JISC services to keep lists of email addresses to which spam about future events can be emailed.  I kinda lost it at that point... pointing out that spamming people by email might not be the best approach to sharing information about events, even if it turns out to be legal.

My comments were misplaced and I probably went too far.  Everyone uses email and there are target audiences for whom it is the only option.  In my defence, I'd say that my interjection did at least cause a nice bit of discussion.  When I started with, "I probably live on a different planet to everyone else, but ..." about 80% of the room nodded cheerfully!  And when the next questioner referred to me as "passionate", everyone in the room knew that what he really meant was, "why did you just completely lose it, you *@#%ing idiot"! :-)

On balance and after some reflection, I think it was a useful day for me.  It's good to be reminded that we don't all live in a world where blogging and live-blogging and Twitter and Slideshare and the rest are the norm - in fact, for many people, they are not even on the horizon.  This is a shame... and part of the JISC's role is to encourage people to think about these things.  I'm absolutely sure they will continue to do so.  But I guess they also have to be mindful of where people actually are.

Oh, and I nearly forgot...  I was at the event to give a talk about Second Life and how it can be used for events.  I was up last.  What can I tell you?  Getting wound up and pissing off the majority of the audience just before your own presentation probably doesn't feature in most 'presentation skills' good-practice guides but I think I got away with it.  I did the whole session in-world, with a virtual audience as well as the real audience.

I'll blog the details of my session separately, probably over on ArtsPlace SL, but suffice to say that this is a much more stressful way of giving a presentation than usual, since you have two sets of people and the technology to worry about.  In many ways, it is a whole new way of giving a presentation - one that I think will grow in popularity and one that I hope I'm getting a bit better at each time I do it (but I'll have to let the two audiences be the judge of that).

If I offended anyone yesterday I apologise - I think it's better to be honest and upfront about stuff even if it can be painful at times.  I also know that I'm at one end of a spectrum and other people are, rightly, elsewhere.  If you want to respond to this post, positively or negatively, please do so - and I'm happy to be called an idiot, because I know I act like one some of the time.  Yesterday being a case in point.

Dempsey, Lorcan: Optimal disclosure of published materials

planet code4lib - Wed, 2008-10-01 16:14

Simon Inger and Tracy Gardner released an interesting report a little while ago on How scholars navigate to scholarly content. This is a followup to a similar study carried out in 2005 [pdf], and one of the interesting strands of this report is an account of changes in that period.

The focus is on how publishers should think about their network presence in light of changing network behaviors of scholars. They report that readers are increasingly likely to land in a publisher's website from some other starting point (RSS, Google, A&I database, library portal, etc). This switches focus from navigation of the publisher website to effective disclosure (my word) to those other starting points. They suggest that the "most highly sought-after features of journal web sites are content alerting services, but not personalization and not search functions". They emphasize the importance of link and data syndication strategies for publishers to increase the exposure of their content to potential readers.

There is much of interest in the specific results, and they have been collected into a readable and brief report. The conclusion provides a good summary.

A key measure of publisher success is the usage of its e-journals, which can be maximised by influencing and enabling all the routes to its content. Library technology plays a key role in user navigation, as well as the more apparent starting points such as Google or major subject A&I databases. Publishers need to support all conceivable routes to their content through the web. This can best be achieved through the open distribution of XML metadata catalogues, through RSS feeds, collaboration with CrossRef, library technology vendors and through working with major gateways, A&Is and search engines. Just as was stated in 2005, as metadata distribution is maximised and users are able to choose more freely their preferred routes to content, many of the advanced features that users require are likely to migrate to their chosen gateways (or portals) leaving the publisher site ever more as a content silo, without the need for many of the advanced features that are currently present there. At the same time it remains true that publishers are under pressure from editorial boards, society members and perversely even from librarians, to create a high level of functionality and the publisher has to manage a careful balancing act to satisfy all of its constituencies. [How readers navigate to scholarly content PDF]

One question I had as I was reading it. They make a distinction between A&I services and library web pages as starting points. When the former was made available through the latter, it was not clear to me which way it was counted.

Some takeaways for me:

  • The report provides good news for libraries, especially in relation to the important 'channeling' role of link resolvers. The authors report that nearly 60% of respondents were guided to e-journals by the library over 95% of the time. They note that this is an 'amazing result'.
  • Disclosure to user workflows has been a recurrent theme of this blog, and I was interested in how this was a major theme of the report. Increasingly we have to build services around user workflow, rather than expect them to build their workflow around services.
  • I recognized the truth of the last paragraph in the conclusions above, and smiled at the expanded version in the body of the text where it was noted that features sometimes had to be incorporated to support a 'political position with respect to societies and powerful editorial board members'.

"Yo La Tengo != Six Layer Cake" by lbjay

#code4lib paste - Wed, 2008-10-01 15:22
Paste to channel #code4lib with 0 annotations.

Crosstech (CrossRef): The Last Mile

planet code4lib - Wed, 2008-10-01 15:16

The figure above is probably self-explanatory, but a few words may be in order.

With no end-to-end delivery of data from the Handle System to the user's application (browser or reader), getting data out of the Handle System has traditionally meant using the Web (i.e. HTTP) as a courier - in effect, this is the "last mile" for Handle data. Typically an upstream (Handle) client provides services to the user. The best known of these services is the URL redirect service which underpins the CrossRef reference linking service. Another hosted service is the web form which displays data stored in the Handle records in a simple HTML table for user browsing. See panel a) in the figure above.

By contrast, the OpenHandle proposal aims to move data in the Handle record in structured form (JSON or XML) over the Web for downstream processing - either in the user's browser or on the desktop. See panel b). Advantages are that the Handle data and data structures are moved closer to the user and the services provided are capable of being better targeted and made more relevant. Data mobility as a whole is much improved. The data are accessible using standard Web description and scripting languages. One might almost say (to paraphrase the well-known Java slogan "write once, run anywhere") that this is a case of "read once, write anywhere".
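As a rough illustration of what downstream processing of a structured Handle record might look like, here is a minimal Python sketch. The JSON layout, the field names and the example handle are hypothetical and chosen only for this example. The actual OpenHandle output format is not shown in this post and may differ.

```python
import json

# Hypothetical JSON shape for a Handle record; placeholder handle and values.
SAMPLE_RECORD = """
{
  "handle": "10.1000/example",
  "values": [
    {"index": 1, "type": "URL", "data": "http://www.example.org/article"},
    {"index": 100, "type": "HS_ADMIN", "data": "(admin data omitted)"}
  ]
}
"""


def values_of_type(record_json, wanted_type):
    """Return the data of every value in the record with the given type."""
    record = json.loads(record_json)
    return [
        value["data"]
        for value in record.get("values", [])
        if value.get("type") == wanted_type
    ]


# A downstream client can pick out just the values it cares about.
urls = values_of_type(SAMPLE_RECORD, "URL")
```

The point of the sketch is only that once the record arrives as structured data, the client decides what to do with each value, rather than being limited to a redirect or a prebuilt HTML table.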

Crosstech (CrossRef): Handle Clients #1, #2, #3

planet code4lib - Wed, 2008-10-01 15:05

Three alternate clients for viewing a Handle (or DOI): #1 (sky - text), #2 (black - tuples), #3 (white - cards) - the image above is "clickable". When Handle clients become JavaScript-able, one really can have it one's own way. (The JavaScript library is here, the demo service interface here - the code for setting up a new service interface can be got from the OpenHandle project.)

del.icio.us: LOL: Planet Code4Lib

planet code4lib - Wed, 2008-10-01 15:04
Planet Code4Lib RSS feed superimposed on LOLCATS. Silly.

del.icio.us: Code4lib NYC

planet code4lib - Wed, 2008-10-01 14:59
Code4libNYC "chapter" of Code4Lib.

del.icio.us: Code4Lib New England

planet code4lib - Wed, 2008-10-01 14:57
Welcome to the current home of the New England "chapter" of Code4Lib. My friends, let's get something started, eh?

del.icio.us: Code4Lib Appalachia

planet code4lib - Wed, 2008-10-01 14:50
This local/regional "chapter" of Code4Lib, known as Code4Lib Appalachia, aspires to channel the spirit of the larger Code4Lib. We want to provide a forum for software and web developers and programmers, working in libraries, to discuss their ongoing projects.

Mignault, John: Rails Kits - Ready-made Rails Code [del.icio.us]

planet code4lib - Wed, 2008-10-01 14:20
Great idea - rails app in a box

Future Archives (Bodleian Library): Fun with tag clouds

planet code4lib - Wed, 2008-10-01 14:16
Not the traditional form of indexing an archive, I know, but it seems to me that automagically extracted metadata formed into tag clouds would be a marvelous way of navigating through some digital archives.

We could present clouds at different levels of granularity - at the collection level, in series and lower levels all the way down to the item. We could even present clouds across multiple aggregations, be they of series, collections or items. This could be fun.
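A minimal Python sketch of the idea, assuming plain word counts stand in for "extracted metadata" and that a higher level's cloud is simply the sum of the clouds beneath it. The stopword list, the function names and the sample texts are invented for illustration.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; a real one would be much longer.
STOPWORDS = {"the", "and", "for", "with", "from", "that", "this"}


def item_cloud(text):
    """Word counts for a single item, ignoring short words and stopwords."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if len(w) > 2 and w not in STOPWORDS)


def aggregate(clouds):
    """Combine item-level clouds into one cloud for a series or collection."""
    total = Counter()
    for cloud in clouds:
        total.update(cloud)
    return total


def top_tags(cloud, n):
    """The n most frequent tags, ties broken alphabetically."""
    return sorted(cloud.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


# Invented email subjects standing in for items in one series.
emails = [
    "Budget meeting moved to Friday",
    "Re: budget figures for the archive",
    "Archive budget approved",
]
series_cloud = aggregate(item_cloud(text) for text in emails)
```

Here `top_tags(series_cloud, 2)` gives `[("budget", 3), ("archive", 2)]`. The same `aggregate` step could be applied again to series-level clouds to produce a collection-level cloud.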

For some digital archives, I think tag clouds are probably a 'must'. Poorly structured and overly large email archives are a good candidate.

One of the downsides of the 'hybrid archive' is that we can't necessarily generate tag clouds that draw on all the contents of the archive. All 'physical' material and non-textual digital formats are excluded unless these things are already tagged by creators. They can, of course, be tagged later by cataloguers and/or users. I guess that we need to recognise that imbalance in our user interface, to help our users get to grips with the nature of research in a hybrid archive.

I know that automatic metadata extraction may have shortcomings, but I'd really like to see a fusing of standardised subject headings with tag clouds. We can have the best of both worlds, surely?

There have been lots of examples of tag clouds about recently, including TagCrowd and Wordle.

This is a Tag Crowd entry for this blog...



archives available comments digital document email evaluate experiments file format futurearch ipaper library material planets plato policy present preservation project software strategy systems tools useful users work created at TagCrowd.com
