planet code4lib

Planet Code4Lib - http://planet.code4lib.org
Updated: 1 day 7 hours ago

Jez Cope: Tools for collaborative markdown editing

Thu, 2016-09-15 19:52

Photo by Alan Cleaver

I really love Markdown[1]. I love its simplicity; its readability; its plain-text nature. I love that it can be written and read with nothing more complicated than a text-editor. I love how nicely it plays with version control systems. I love how easy it is to convert to different formats with Pandoc and how it’s become effectively the native text format for a wide range of blogging platforms.

One frustration I’ve had recently, then, is that it’s surprisingly difficult to collaborate on a Markdown document. There are various solutions that almost work but at best feel somehow inelegant, especially when compared with rock solid products like Google Docs. Finally, though, we’re starting to see some real possibilities. Here are some of the things I’ve tried, but I’d be keen to hear about other options.

1. Just suck it up

To be honest, Google Docs isn’t that bad. In fact it works really well, and has almost no learning curve for anyone who’s ever used Word (i.e. practically anyone who’s used a computer since the 90s). When I’m working with non-technical colleagues there’s nothing I’d rather use.

It still feels a bit uncomfortable though, especially the vendor lock-in. You can export a Google Doc to Word, ODT or PDF, but you need to use Google Docs to do that. Plus as soon as I start working in a word processor I get tempted to muck around with formatting.

2. Git(hub)

The obvious solution to most techies is to set up a GitHub repo, commit the document and go from there. This works very well for bigger documents written over a longer time, but seems a bit heavyweight for a simple one-page proposal, especially over short timescales.

Who wants to muck around with pull requests and merging changes for a document that’s going to take 2 days to write tops? This type of project doesn’t need a bug tracker or a wiki or a public homepage anyway. Even without GitHub in the equation, using git for such a trivial use case seems clunky.

3. Markdown in Etherpad/Google Docs

Etherpad is a great tool for collaborative editing, but suffers from two key problems: no syntax highlighting or preview for markdown (it’s just treated as simple text); and you need to find a server to host it or do it yourself.

However, there’s nothing to stop you editing markdown with it. You can do the same thing in Google Docs, in fact, and I have. Editing a fundamentally plain-text format in a word processor just feels weird though.

4. Overleaf/Authorea

Overleaf and Authorea are two products developed to support academic editing. Authorea has built-in markdown support but lacks proper simultaneous editing. Overleaf has great simultaneous editing but only supports markdown by wrapping a bunch of LaTeX boilerplate around it. Both OK but unsatisfactory.

5. StackEdit

Now we’re starting to get somewhere. StackEdit has both Markdown syntax highlighting and near-realtime preview, as well as integrating with Google Drive and Dropbox for file synchronisation.

6. HackMD

HackMD is one that I only came across recently, but it looks like it does exactly what I’m after: a simple markdown-aware editor with live preview that also permits simultaneous editing. I’m a little circumspect simply because I know simultaneous editing is difficult to get right, but it certainly shows promise.

7. Classeur

I discovered Classeur literally today: it’s developed by the same team as StackEdit (which is now apparently no longer in development), and is currently in beta, but it looks to offer two killer features: real-time collaboration, including commenting, and pandoc-powered export to loads of different formats.

Anything else?

Those are the options I’ve come up with so far, but they can’t be the only ones. Is there anything I’ve missed?

  1. Other plain-text formats are available. I’m also a big fan of org-mode.

District Dispatch: Ellen Satterwhite joins ALA telecom policy team

Thu, 2016-09-15 17:00

I’m pleased to announce that ALA has bolstered its telecommunications policy team with the addition of Ellen Satterwhite. As a new Fellow of ALA’s Office for Information Technology Policy (OITP), Ellen will provide leadership, counsel and representation on the full array of telecommunications issues that affect libraries and the general public, as well as those that intersect with information policy more broadly.

Ellen Satterwhite is a new OITP Fellow

Ellen is a Director at the policy communications firm Glen Echo Group, where she helps clients formulate policy positions and tell their stories within the rubric of information policy. WifiForward is one of several coalitions managed by Ellen and Glen Echo, and ALA was a founding member of the group, which advocates for abundant Wi-Fi and balanced spectrum policy.

As a co-author of the Federal Communications Commission’s (FCC) National Broadband Plan, Consumer Policy Advisor to the FCC and freelance consultant, Ellen’s work has been written about in the Huffington Post, AllThingsD, CNet, Geekwire, GigaOm, and CivSource. Previously, Ellen also served as Program Director for Gig.U, supporting communities seeking gigabit speeds. She earned a master’s degree in Public Affairs from University of Texas at Austin and completed her undergraduate degree at Grinnell College.

OITP Deputy Director Larra Clark will continue to contribute to our telecommunications policy work, with OITP Associate Director Marijke Visser, OITP Senior Fellow Robert Bocher, and me, working in coordination on legislative matters with Kevin Maher of ALA’s Office of Government Relations, and telecommunications counsel John Windhausen.

Please welcome Ellen in her new role, as you see her inside the beltway or in libraryland.

The post Ellen Satterwhite joins ALA telecom policy team appeared first on District Dispatch.

Islandora: Islandoracon 2017 Call for Proposals now open

Thu, 2016-09-15 16:53

The Islandoracon Planning Committee invites you to submit your proposals to present at the second Islandoracon, May 15 - 19, 2017 in Hamilton, Ontario, Canada.

This year’s conference theme is Beyond the Island. Since its creation at the University of Prince Edward Island in 2006, Islandora has spread around the world. It has grown to include diverse institutions, collections, and strategies for digital repository management that add to the richness of the Islandora community. The 2017 Islandoracon will celebrate these multifaceted visions of Islandora that are continually emerging, inspiring constant revision in the concept of a digital repository.

Regular Sessions:
A 20-minute talk (plus 10 minutes for questions) with accompanying A/V.

Poster/Island Getaways:
A conference poster to be displayed during a poster session, accompanied by a five-minute ‘lightning talk’ style presentation of that poster, which will be known as "Island Getaways." Conference attendees will be invited to vote on their favourite "Island Getaway" to award prizes for the most inspirational posters and talks.

Post-Conference Sessions:
A free-form day for Islandora gatherings. The last day of Islandoracon is open for sessions, workshops, working groups, training, meetings, or whatever events you care to propose. Please let us know what kind of event you’d like to hold, the expected audience, and the space/time needed.

Speakers will have access to a discounted Speakers Rate for registration. Submissions will be accepted until December 16th, 2016.

Submission Form

DPLA: Job Opportunity: Director of Technology

Thu, 2016-09-15 16:30

The DPLA has an opening for the position of Director of Technology.

The Digital Public Library of America seeks a Director of Technology to lead its staff of developers and technologists, and to further DPLA’s mission to bring together the riches of America’s libraries, archives, and museums, and make them freely available to all. A belief in this mission and the drive to accomplish it over time in a collaborative spirit both within and beyond the organization is essential.

The Director of Technology will be responsible for the overall technology vision for the DPLA and, in consultation with the DPLA senior staff, will develop new initiatives and cross-institutional partnerships. The Director of Technology will report to the Executive Director.

The Director of Technology will:

  • Orchestrate the design, implementation, and improvement of DPLA’s core infrastructure, user-facing applications, and back-end systems.
  • Oversee and manage a team of four developers, and work with external contractors as necessary.
  • Act as the primary technical contact for outside organizations, partners, and developers.
  • Cultivate and develop the culture and values of the DPLA Technology Team and the larger organization.
  • Support the philosophy of open source, shared, and community-built software, frameworks, and technologies.
  • Be conversant and comfortable with a broad range of technologies used by the cultural heritage sector, and engage with cultural heritage communities and consortial efforts.

Requirements

  • Experience with managing developers and technologists in the cultural heritage or non-profit sectors, with a demonstrated capacity for building an environment for success for technical teams.
  • Demonstrated experience working effectively in a team environment and the ability to interact effectively with stakeholders.
  • Demonstrated experience contributing to or managing collaborative open source software projects.
  • Excellent written and verbal communication skills.
  • Excellent analytical and organizational skills.
  • A high degree of emotional intelligence and empathy.

Preferred

  • Experience with architectures, standards, and protocols to support interoperability and reuse within and beyond the cultural heritage sector.
  • Demonstrated experience with technical project management, grant writing and administration, and/or development and oversight of RFP processes.
  • Demonstrated desire to learn new technologies or programming languages.

This position is full-time. DPLA is a geographically distributed organization, with headquarters in Boston, Massachusetts. Ideally, this position would be situated in the Northeast Corridor between Washington and Boston (with a preference for Boston), but remote work based in other locations will also be considered. While not a requirement, a history of working effectively in a distributed organization is helpful.

Like its collection, DPLA is strongly committed to diversity in all of its forms. We provide a full set of benefits, including health care, life and disability insurance, and a retirement plan. Starting salary is commensurate with experience.

About DPLA

The Digital Public Library of America strives to contain the full breadth of human expression, from the written word, to works of art and culture, to records of America’s heritage, to the efforts and data of science. Since launching in April 2013, it has aggregated more than 14 million items from over 2,000 institutions. DPLA is a registered 501(c)(3) non-profit.

To apply, send a letter of interest detailing your qualifications, resume and a list of 3 references in a single PDF to jobs@dp.la. First preference will be given to applications received by September 30, 2016, and review will continue until the position is filled.

David Rosenthal: Nature's DNA storage clickbait

Thu, 2016-09-15 15:00
Andy Extance at Nature has a news article that illustrates rather nicely the downside of Marcia McNutt's (editor-in-chief of Science) claim that one reason to pay the subscription to top journals is that:

    Our news reporters are constantly searching the globe for issues and events of interest to the research and nonscience communities.

Follow me below the fold for an analysis of why no-one should be paying Nature to publish this kind of stuff.

Extance's article is entitled How DNA could store all the world's data, and starts with this scary thought:

    The latest experiment signals that interest in using DNA as a storage medium is surging far beyond genomics: the whole world is facing a data crunch. Counting everything from astronomical images and journal articles to YouTube videos, the global digital archive will hit an estimated 44 trillion gigabytes (GB) by 2020, a tenfold increase over 2013. By 2040, if everything were stored for instant access in, say, the flash memory chips used in memory sticks, the archive would consume 10–100 times the expected supply of microchip-grade silicon[3].

He then claims a solution to this problem is at hand:

    If information could be packaged as densely as it is in the genes of the bacterium Escherichia coli, the world's storage needs could be met by about a kilogram of DNA.

The article is based on research at Microsoft that involved storing 151KB in DNA. The research is technically interesting, starting to look at fundamental DNA storage system design issues. But it concludes (my emphasis):

    DNA-based storage has the potential to be the ultimate archival storage solution: it is extremely dense and durable. While this is not practical yet due to the current state of DNA synthesis and sequencing, both technologies are improving at an exponential rate with advances in the biotechnology industry[4].

The paper doesn't claim that the solution is at hand any time soon. Reference 4 is a two year old post to Rob Carlson's blog. A more recent post to the same blog puts the claim that:

    both technologies are improving at an exponential rate

in a somewhat less optimistic light. It is (or may be, Carlson believes the last two data points are not representative) true that DNA sequencing is getting cheaper very rapidly. But already the cost of sequencing (read) was insignificant in the total cost of DNA storage. What matters is the synthesis (write) cost. Lower down the article Extance writes:

    A closely related factor is the cost of synthesizing DNA. It accounted for 98% of the expense of the $12,660 EBI experiment. Sequencing accounted for only 2%, thanks to a two-millionfold cost reduction since the completion of the Human Genome Project in 2003.

The rapid decrease in the read cost is irrelevant to the economics of DNA storage; if it was free it would make no difference. Carlson's graph shows that the write cost, the short DNA synthesis cost (red line) is falling more slowly than the gene synthesis cost (yellow line). He notes:

    But the price of genes is now falling by 15% every 3-4 years (or only about 5% annually).

A little reference checking, that should have been well within the capability of one of Nature's expert news reporters, reveals that the Microsoft paper's claim that:

    both technologies are improving at an exponential rate

while strictly true is deeply misleading. The relevant technology is currently getting cheaper slower than hard disk or flash memory! And since this has been true for around two decades, making the necessary 3-4 fold improvement just to keep up with the competition is going to be hard.
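
As a rough check on the figures quoted above, the short Python sketch below annualizes Carlson's "15% every 3-4 years" and splits the $12,660 EBI cost into its 98% and 2% shares. The function and variable names are illustrative assumptions, not anything taken from the article or the paper, and a constant yearly rate of decline is assumed.

```python
def annual_decline(total_decline: float, years: float) -> float:
    """Constant yearly fractional decline that compounds to
    `total_decline` over `years` (assumes a steady rate)."""
    return 1.0 - (1.0 - total_decline) ** (1.0 / years)


# Carlson: gene prices falling by 15% every 3-4 years.
for years in (3, 4):
    rate = annual_decline(0.15, years)
    print(f"15% over {years} years is about {rate:.1%} per year")

# Extance: synthesis was 98% and sequencing 2% of the $12,660 EBI experiment.
EBI_TOTAL_USD = 12_660
synthesis_usd = 0.98 * EBI_TOTAL_USD
sequencing_usd = 0.02 * EBI_TOTAL_USD
print(f"synthesis (write): ${synthesis_usd:,.2f}")
print(f"sequencing (read): ${sequencing_usd:,.2f}")

# Even if reading were free, the total would only drop by the read share.
saving_if_read_free = sequencing_usd / EBI_TOTAL_USD
print(f"saving if sequencing were free: {saving_if_read_free:.0%}")
```

Under the constant-rate assumption this gives roughly 4-5% per year, in line with Carlson's "about 5% annually", and it shows that free sequencing would trim the cost of the EBI experiment by only about $250.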

I actually believe that, decades from now, DNA will be an important archival medium. But I've been criticizing the level of hype around the cost of DNA storage for years. Extance's article admits that cost is a big problem, yet it finishes by quoting Goldman, lead author of a 2013 paper in Nature whose massively over-optimistic cost projections I debunked here. Goldman's quote is possibly true but again definitely deeply misleading:

    "Our estimate is that we need 100,000-fold improvements to make the technology sing, and we think that's very credible," he says. "While past performance is no guarantee, there are new reading technologies coming onstream every year or two. Six orders of magnitude is no big deal in genomics. You just wait a bit."

Yet again the DNA enthusiasts are waving the irrelevant absolute cost decrease in reading to divert attention from the relevant lack of relative cost decrease in writing. They need an improvement in relative write cost of at least 6 orders of magnitude. To do that in a decade means halving the relative cost every year, not increasing the relative cost by 10-15% every year.

Extance's article doesn't simply regurgitate the hype in the paper he's reporting on by failing to scrutinize its claims, he amplifies it by headlining claims the paper is careful not to make, and giving it prominence in Nature's news section. This kind of clickbaiting is a classic example of problem #6 in The 7 biggest problems facing science, according to 270 scientists by Julia Belluz, Brad Plumer, and Brian Resnick. I blogged about their article here:
    Science journalism is often full of exaggerated, conflicting, or outright misleading claims. If you ever want to see a perfect example of this, check out "Kill or Cure," a site where Paul Battley meticulously documents all the times the Daily Mail reported that various items — from antacids to yogurt — either cause cancer, prevent cancer, or sometimes do both.

My problem with the oligopoly of academic publishers isn't that they are incredibly expensive, but that they are incredibly poor value for money, as shown by the fact that it took me about an hour to show how misleading Extance's article is.
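
For readers who want to check the write-cost arithmetic earlier in this post, here is a minimal Python sketch of the compound-improvement calculation. It assumes a constant yearly improvement factor; the helper names are hypothetical and chosen only for illustration.

```python
import math


def annual_factor(total_factor: float, years: float) -> float:
    """Constant yearly improvement factor needed to reach
    `total_factor` overall after `years` years."""
    return total_factor ** (1.0 / years)


def years_needed(total_factor: float, yearly_factor: float) -> float:
    """Years of constant `yearly_factor` improvement needed to
    accumulate `total_factor` overall."""
    return math.log(total_factor) / math.log(yearly_factor)


SIX_ORDERS = 1e6        # the "at least 6 orders of magnitude" in the text
GOLDMAN_TARGET = 1e5    # Goldman's "100,000-fold improvements"

print(f"10^6 in 10 years needs about {annual_factor(SIX_ORDERS, 10):.2f}x per year")
print(f"10^5 in 10 years needs about {annual_factor(GOLDMAN_TARGET, 10):.2f}x per year")
print(f"halving every year reaches 10^6 in about {years_needed(SIX_ORDERS, 2.0):.1f} years")
```

On these assumptions, halving the relative cost every year is a floor rather than a target: six orders of magnitude within a decade would need something closer to a fourfold cut each year, which makes the contrast with a relative cost that is currently rising even starker.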

Open Knowledge Foundation: Why civil society organisations are using OpenSpending to share fiscal data with the public

Thu, 2016-09-15 14:00

OpenSpending is one of Open Knowledge International’s current projects. It is a free and open platform for citizens looking to track and analyse public fiscal information globally.

While the OpenSpending team was busy revamping the platform over the last year, we have been fortunate to have a community of users actively involved in testing the new tools. Here we highlight the experiences of three partner civil society organisations collecting and structuring budget and spending data and using OpenSpending tools to present this data to the public. This post also gives an insight into the challenges these organisations faced in data collection and the solutions they employed to reduce data barriers.

Public Domain icons by David Merfield

Sinar Project in Malaysia: Open Spending Data in Constrained Environments

Sinar Project is an initiative that uses open source technology and applications to make important information accessible to the Malaysian people. Sinar has been working to engage disenfranchised communities in the budget process, in order to hold the government accountable for budgets that respond to the needs of citizens.

Over the course of 2016, the team at Sinar has been working to obtain and to prepare over 100 datasets for upload on OpenSpending. So far, they have uploaded over 40 datasets on the platform. Amongst others, the team published the 2014 allocated budgets for public housing maintenance in Kota Damansara township. Data uploaded and visualized on OpenSpending was shared with the community’s leaders for review. This gave the community the opportunity to compare and contrast how planned budget allocation matched up with how funds were actually spent. The community leaders identified potential misuse of funds in some budget lines and are continuing to conduct investigations and collect evidence to expose poor management of public finances in Kota Damansara. Data and visualizations are available on OpenSpending Viewer.

It wasn’t easy for the team to obtain such data. First, they had to file a Freedom of Information (FOI) request to the state-owned Selangor Housing and Property Agency. They also went into meetings with the authorities to get an in-depth understanding of the data. Sinar continuously faces challenges in data collection of budgets at all levels of government. For example, for previous years, budgets for the federal government are not publicly available and there is no FOI law applicable to the federal government. There are roadblocks in data collection for state governments and for city councils as well.

In spite of the roadblocks and reluctance of authorities to collaborate, the team at Sinar have filed FOI requests to the Selangor state government and Petaling Jaya city council to get access to fiscal budgets. They have also filed FOI requests to the management company responsible for the Kota Damansara public housing, obtaining access to data on how MYR 5 million (USD 1.2 million) were allocated to repair railings for all housing blocks and data on allocated budgets for public housing maintenance in 2014 and 2015.

Moving forward, Sinar Project is planning to continue using OpenSpending to:

  1. Address budget priorities at all levels of government
  2. Visualize allocated budgets and compare to official government policies and implementation of government programmes
  3. Make use of evidence based budget data and various survey results to hold the decision makers at all levels accountable
  4. Advocate for transparency in open data, promote better access to government budgets data, and push for better open data policies.

Metamorphosis Project in Macedonia: Revamp the current Follow the Money website

Metamorphosis Foundation is a civil society organization from Macedonia that has been active for more than 15 years. Several years ago they started collaborating with Open Knowledge International to implement the “Open Data Civil Society network” project, with the aim of improving the capacity of civil society organizations in the country. Moreover, they established School of Data Macedonia in order to promote an open agenda.

In 2012, Metamorphosis Project in Macedonia developed their Follow the Money website to familiarise citizens with the fiscal policies of local authorities. However, although budget information was presented on the site, it lost popularity over time. In 2015, the School of Data fellow conducted in-depth user research to better understand why the site wasn’t being used and how it could be improved to better serve its potential user communities. Ultimately, the team at Metamorphosis decided to revamp the website.

They focused on collecting, cleaning and preparing budget data from all 80 municipalities as well as the country’s central budget. Take a look at the planned Central Budget for 2016 made available on OpenSpending:

For the above visualization, click this link. Explore years 2010 to 2016 at this link.

As with the Sinar Project, data collection was incredibly challenging. Budget data for most municipalities was “locked” in PDFs or not published at all. Instead of trying to get the data from the source, Metamorphosis partnered with other CSOs in the country that work closely with the municipalities and were willing to share the data that they had already collected.

Another issue they are facing is the lack of granularity of the published data and official institutions unwilling to provide more detailed data. Finally, while the central government budget was made available in machine readable format, it only included the economic budget classification, which identifies the type of budget and expenditure incurred, for example, salaries, goods and services, transfers and interest payments, or capital spending. Since the team needed the functional classification (expenditure according to the purposes and objectives for which it is intended) for the website, they had to scrape it from the website of the Ministry of Finance. The website includes functional classification data, since the team found this the most useful way to display data to users.

Over the next few months, the team will be working to identify funds to launch the revamped version of the Macedonian Follow the Money website with embedded visualizations created on OpenSpending, and to continue updating their data on the platform.

AfroLeadership in Cameroon: Open Local Budgets

AfroLeadership is a civil society organization in Cameroon, founded in 2007 and committed to the promotion of open data and civic technologies for governance, transparency and citizen participation. For several years, AfroLeadership has been promoting the use of a financial management information system in local governments, in order to improve budget transparency, accountability and public participation in budgeting. The adoption of the Financial Management Information System by several councils aims at improving budget reliability, budget execution and the ratio of budget reports to supreme audit institutions (SAI).

The Cameroon Open Local Budgets (COLB) project, launched in 2016, seeks to fight corruption, improve local accountability and ensure effective service delivery by collecting the approved budgets and accounts of all 374 councils in Cameroon and publishing them on OpenSpending. This project is a continuation of the organisation’s effort to bring budget information to citizens and CSOs in an accessible and open way, and engage them in public and local affairs.

The goal of the current OpenSpending Cameroon pilot phase is to upload 50 data sets for 2015 budget reports. For example, uploaded data on Cameroon’s Dschang council looks at functional expenses versus investment expenses, while a drill down into these categories lets users explore the expenses for each budget category.

The AfroLeadership team also faces challenges in data collection. Even though the deadline for 2015 budget reports and account production was the end of May this year, collection of these accounts has been more difficult than expected. The Audit Bench of the Supreme Court of Cameroon has stressed the fact that fewer than 10% of budget reports are received at their desk each year.

To address data collection challenges, AfroLeadership has organized information workshops to present to diverse stakeholders (Mayors, Supreme Audit Institutions, Civil Society Organisations, Journalists, etc.) the necessity of involving citizens in the budget cycle. Also, AfroLeadership has invited its institutional partner on this project, the national community-driven development program (PNDP), to help collect approved 2015 budget reports and accounts. AfroLeadership is currently in touch with the Ministry of Finance to explore opportunities for improving budget report collection in local governments.

All these organizations have been involved in upload training sessions on OpenSpending, and now that the platform is available in Alpha, they are working to publish the data to the larger public through OpenSpending.

To browse existing datasets and to upload your data, visit OpenSpending. For questions, the OpenSpending team is available via the OpenSpending discussion forum, on Gitter.im in the OpenSpending chat room, or on the OpenSpending issue tracker.

pinboard: Library of Congress LCCN Permalink sh2016001442

Thu, 2016-09-15 13:02
RT @JulieSwierczek: #code4lib #c4l16 - "Black Lives Matter movement" is now a SUBJECT HEADING. Catalogers, make sure you USE IT!

D-Lib: Measuring Scientific Impact Beyond Citation Counts

Thu, 2016-09-15 11:13
Article by Robert M. Patton, Christopher G. Stahl and Jack C. Wells, Oak Ridge National Laboratory

D-Lib: Quantifying Conceptual Novelty in the Biomedical Literature

Thu, 2016-09-15 11:13
Article by Shubhanshu Mishra and Vetle I. Torvik, University of Illinois at Urbana-Champaign

D-Lib: Current Research on Mining Scientific Publications

Thu, 2016-09-15 11:13
Guest Editorial by Drahomira Herrmannova and Petr Knoth, Knowledge Media Institute, The Open University

D-Lib: Virtuous Cycle

Thu, 2016-09-15 11:13
Editorial by Laurence Lannom, CNRI

D-Lib: An Analysis of the Microsoft Academic Graph

Thu, 2016-09-15 11:13
Article by Drahomira Herrmannova and Petr Knoth, Knowledge Media Institute, The Open University

D-Lib: Rhetorical Classification of Anchor Text for Citation Recommendation

Thu, 2016-09-15 11:13
Article by Daniel Duma and Ewan Klein, University of Edinburgh; Maria Liakata and James Ravenscroft, University of Warwick; Amanda Clare, Aberystwyth University

D-Lib: Preliminary Study on the Impact of Literature Curation in a Model Organism Database on Article Citation Rates

Thu, 2016-09-15 11:13
Article by Tanya Berardini and Leonore Reiser, The Arabidopsis Information Resource; Ron Daniel Jr. and Michael Lauruhn, Elsevier Labs

D-Lib: Temporal Properties of Recurring In-text References

Thu, 2016-09-15 11:13
Article by Iana Atanassova, Centre Tesniere, University of Franche-Comte, France and Marc Bertin, Centre Interuniversitaire de Recherche sur la Science et la Technologie (CIRST), Universite du Quebec a Montreal (UQAM)

D-Lib: The Impact of Academic Mobility on the Quality of Graduate Programs

Thu, 2016-09-15 11:13
Article by H. P. Silva, Alberto H. F. Laender, Clodoveu A. Davis Jr., Ana Paula Couto da Silva and Mirella M. Moro, Universidade Federal de Minas Gerais, Brazil

D-Lib: Capturing Interdisciplinarity in Academic Abstracts

Thu, 2016-09-15 11:13
Article by Federico Nanni, Data and Web Science Research Group, University of Mannheim, Germany and International Centre for the History of Universities and Science, University of Bologna, Italy; Laura Dietz, Stefano Faralli, Goran Glavas and Simone Paolo Ponzetto, Data and Web Science Research Group, University of Mannheim, Germany
