June 18, 2010
The DepoST – Deposit Show and Tell Meeting – was held last October at the University of London Student Union. The aim of the event was “to identify deposit tools (and perhaps combinations of tools) that would clearly benefit repository users if they could be taken up easily and with confidence, and to plot a path for those tools toward widespread and sustainable take-up.”
I was not actually at the event myself but have long been interested in implementing a user-friendly SWORD client; there has also been recent interest from UKCoRR colleagues so I thought it would be useful to revisit DepoST and try to establish which of the various tools that were demoed at the event have since become “going concerns” and, if so, how repository managers can go about exploiting them.
I have contacted several of the developers who showed their wares back in the autumn and have reviewed the information available online; the full list of demos at the event is as follows:
(1) DepositAir IE Demonstrator - Julian Cheal, SUE/SIS Systems Developer, UKOLN
(2) ePrints 3 Upload Handler plugin – Dave Tarrant, Postgraduate researcher, University of Southampton
(3) PDFMetaExtractor - Pat McSweeney, ePrints project developer, University of Southampton
(4) ICE – Integrated Content Environment - Peter Sefton, eScholarship Tech Team Manager, University of Southern Queensland, Australia
(5) Dashboard deposit in ‘Publications’ product – Richard Jones, Symplectic Limited
(6) Email-based deposit plugin for SWORD – Alex Strelnikov, UKOLN
(7) Mendeley – Jan Reichelt, Mendeley
(8) The Open Access Repository Junction – Ian Stuart, Software Engineer, EDINA
(9) Drag & Drop Deposit Tool – Joe Lambert, University of Southampton
(10) GAip desktop curation tool – Viv Cothey, Gloucestershire Archives
(11) Map a WebDav or FTP drive directly into ePrints 3.2 – Tim Brody, EPrints WebDav, University of Southampton
(12) EM-Loader (Extracting Metadata to Load for Open Access Deposit) – Theo Andrew & Fred Howell, The Open Access Repository, EDINA
(13) EasyDeposit configurable deposit client – Stuart Lewis, IT Innovations Analyst & Developer, University of Auckland Library
(14) WordDeposit – Alex Wade, Director for Scholarly Communication, Microsoft External Research
(15) sWordInbox - Seb Francois, University of Southampton
(16) Xerte online authoring toolkit and Xpert deposit tool – Julian Tenney and Patrick Lockey, Xerte, University of Nottingham
(17) Copyright Licensing Applications using SWORD for Moodle (CLASM) – James Ballard & Richard Davis, University of London
(18) Names Project – Dan Needham, University of Manchester & Alan Danskin, British Library
DepositAir IE Demonstrator is an Adobe AIR application which borrows its look and feel from the Flickr Uploadr. The user drags and drops the files to deposit from the source folder to the application which auto-populates metadata fields such as title, ISSN, publisher, author name, and then sends the files and metadata to http://dspace.swordapp.org/jspui/. Lack of resources has meant that it hasn’t been fully developed since DepoST and the code has not yet been released to the community; Julian has indicated that he could release the code if people would find it useful.
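For readers less familiar with what a SWORD client actually sends, the following is a minimal sketch of a SWORD 1.3-style deposit request, not DepositAir's own code (which has not been released). The collection URL, filename and payload are hypothetical placeholders, and the packaging identifier is an assumption that suits a DSpace target; a real client would read the correct values from the repository's service document. The request is only built here, never sent.

```python
import urllib.request

# Hypothetical collection URL; a real one comes from the repository's
# SWORD service document.
COLLECTION_URL = "http://repository.example.org/sword/deposit/123456789/1"


def build_deposit_request(collection_url, package_bytes, filename, dry_run=True):
    """Build (but do not send) an HTTP POST carrying a zipped deposit package."""
    headers = {
        "Content-Type": "application/zip",
        "Content-Disposition": "filename=" + filename,
        # Assumed packaging format: a METS package as accepted by DSpace.
        "X-Packaging": "http://purl.org/net/sword-types/METSDSpaceSIP",
    }
    if dry_run:
        # Asks the server to process the request without storing the item.
        headers["X-No-Op"] = "true"
    return urllib.request.Request(
        collection_url, data=package_bytes, headers=headers, method="POST"
    )


# Stand-in for the bytes of a real zip package.
package = b"placeholder-for-zip-bytes"
request = build_deposit_request(COLLECTION_URL, package, "article.zip")
print(request.get_method(), request.full_url)
```

Actually sending the request would mean passing it to urllib.request.urlopen along with suitable authentication, which is deliberately left out of this sketch.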
ePrints 3 Upload Handler plugin works with Microsoft Word 2007 and PowerPoint to extract metadata and media during the deposit process. The plug-in is included in EPrints 3.2; there is a five-minute demo at http://www.eprints.org/software/training/3.2/videos/word_addin-only.swf
Although the current extraction process is in-line, the plan is to make it an unobtrusive background operation; as I understand it, this work is pending subject to funding.
PDFMetaExtractor searches the user’s computer for PDFs and then intelligently extracts metadata as well as keywords specified within the document. Patrick has been unable to develop the tool as far as he would like, though EPrints issued a developer bounty at Dev8D which was taken up by John Harrison – see http://code.google.com/p/pdfssa4met/. Patrick’s original implementation extracts the title, abstract, number of pages and references section from a PDF with a reasonably high level of accuracy, and sometimes a string of authors. John Harrison’s extension allows titles, dates, pages and sometimes authors to be extracted from the references section which, in theory, should allow some reasonable assumptions about who cites whom within a repository.
ICE – Integrated Content Environment – http://ice.usq.edu.au/ – is a sophisticated suite of tools that integrates with OpenOffice and Word to convert content produced on a word-processor into “usable, self-contained course web sites in IMS package format”. It comprises a toolbar add-on for OpenOffice and Word that makes it easy to structure and format a document, convert it to HTML/PDF and distribute it to the web via the open source Subversion revision control system. There are a number of use cases described on the website, including integrating (an EPrints) repository with the authoring workflow to produce “research publications which are available not just as paper-ready PDF files but as fully interactive semantically aware web documents which can be disseminated via repository software as complete supported web-native and PDF publications.”
My understanding is that SWORD deposit only currently works with EPrints – see http://techteam.wordpress.com/2009/02/10/procedure-configure-ice-to-eprints/ for technical implementation – Peter has said that the ICE team are working on making SWORD deposit much easier in the future.
Dashboard deposit in ‘Publications’ product – http://www.symplectic.co.uk/products/publications.html – Symplectic is a commercial research management system. This tool links the repository module of the Symplectic Publications Management System to a repository and is compatible with all major digital repository technologies. Users can upload full text documents and supporting information directly from the Symplectic Publications interface; copyright guidance is collected automatically from SHERPA/RoMEO and made available to users.
The University of Leeds are currently in the process of implementing Symplectic – http://wrro.blogspot.com/2009/08/symplectic-eprints-link-testing.html - and are optimistic it will improve ease of deposit and the volume of content deposited.
Email-based deposit plugin for SWORD – I haven’t been able to contact Alex or find out anything more than what was posted on http://infteam.jiscinvolve.org/ which is a shame as it sounds like a simple and effective solution – if anyone knows more please let me know.
Mendeley – http://www.mendeley.com/ – not strictly a repository deposit tool, Mendeley combines a slick iTunes-esque desktop client with a social networking website; researchers use the desktop client to organise their research (annotating PDFs, sorting, tagging etc) and can use the web to collaborate with fellow researchers through shared and public collections. It is also possible to automatically generate bibliographic references in Word.
Mendeley has been described as Last.fm for researchers and certainly has the potential to affect the research workflow in interesting ways which could, in turn, impact on repository workflows. Mendeley is a venture-capital-funded start-up and particularly noteworthy, I think, is its slick implementation and sophisticated web 2.0 functionality compared with some of the other repository tools out there.
The Open Access Repository Junction (OA-RJ) – http://edina.ac.uk/projects/oa-rj/ – aims to scope, build and test a deposit broker tool to assist open access deposit into, and interoperability between, existing repository services. Currently, multiple-authored journal articles are deposited singly in either an institutional, funder or subject-based repository, and the primary aim is to simplify the repository deposit workflow for such articles. OA-RJ will therefore offer an API that supports redirect and deposit of research outputs into multiple repositories.
OA-RJ is currently working on delivering a proof-of-concept service for two multiple deposit use cases:
- Journal publishers depositing content directly into institutional repositories.
- Subject repositories depositing content into institutional repositories.
There is a project blog at http://oarepojunction.wordpress.com/
Drag & Drop deposit tool takes a strong end user perspective, aiming for a simple tool that reduces the workload involved in submitting to a repository. I haven’t been able to discover anything new about this tool beyond what is on the blog at http://blog.mspace.fm/2009/10/13/jisc-depost-meetup/, which describes the demo prototype that extracts metadata from a PDF (i.e. no SWORD integration as yet) and how the developers would seek to integrate with other project demos at DepoST.
GAip desktop curation tool – Again, I haven’t been able to discover anything other than what was posted on http://infteam.jiscinvolve.org/, specifically that this tool stood out for addressing archive and repository materials other than academic research papers. The Gloucestershire Archives deals with physical materials as well as digital records, and faces the problem of taking “a 100-year view”. The intended user for GAip is an archivist rather than the creator or author.
Map a WebDAV or FTP drive directly into ePrints 3.2 – there is a screencast demo of this tool at http://files.eprints.org/451/ but Tim has indicated that development work has pretty much been abandoned due to problems with WebDAV itself, though he might think about picking it up again if operating systems had reliable DAV implementations.
EM-Loader (Extracting Metadata to Load for Open Access Deposit) – The presentation at DepoST – see http://a.nnotate.com/php/pdfnotate.php?d=2009-10-11&c=fwHrIkD8#page1 – showed a proof-of-concept middleware that links the Depot and http://publicationslist.org/, a web site for researchers to build a web page listing their publications. The goal is to make batch deposits easier by handling multiple queries for metadata from web-based resources like PubMed and Web of Science, and from personal databases such as EndNote, Reference Manager, BibTeX etc.
The follow-on from EM-Loader harvests from publications lists rather than pushing into a repository via SWORD, which avoids researchers having to be involved in a submission at all – it just fetches new stuff to put in the right place in the repository, and researchers just maintain their own list using the normal http://publicationslist.org/ GUI (with auto-fetch from PubMed, Web of Science, EndNote etc).
Integration is currently with DSpace only but it should be straightforward to also integrate with EPrints.
The full report is available at http://publicationslist.org/em-loader/emloader-report-intro.html
(This sounds particularly interesting and I plan to look at it more closely when I get a chance.)
EasyDeposit – http://easydeposit.swordapp.org/ – is an open source SWORD client creation toolkit for building customised SWORD deposit web interfaces from within your browser. This is one of the most fully realised of the tools from the event, as well as being one of the best suited to my own immediate requirements from a SWORD client, and I looked at it in detail in a previous post.
WordDeposit – http://research.microsoft.com/en-us/projects/authoring/ – is an add-in for Word 2007 that enables metadata to be captured and stored at the authoring stage and enables semantic information to be preserved through the publishing process, which is essential for enabling search and semantic analysis once the articles are archived in a repository. It is SWORD enabled and Alex has indicated that Microsoft External Research are currently working on an updated release that will also make it easier for authors to add new/multiple SWORD end-points and to deposit to multiple repositories.
sWordInbox, as I understand it, has evolved into two separate plug-ins for EPrints, neither of which actually uses SWORD; one uses XML-RPC to post to a blog at deposit time and the other is a widget that allows uploads to EPrints from anywhere:
- Post To Blog plug-in – http://files.eprints.org/482/ – allows people to send their publications to their blog. It integrates into EPrints’ Manage Deposits screen and is displayed as an action you can perform on any of your publications. To test/use it you need a blog account (e.g. on WordPress…), and your blog provider must support the XML-RPC standard.
- EPrints Remote Uploader plug-in – http://files.eprints.org/483/ – allows publications to be created remotely from web pages: a site owner adds a single line of HTML to include the Remote Uploader on their pages, and any user can then select a file, set a title and send this information to the repository. Authentication is carried out by the “classic HTTP dialog box”, in order to minimize phishing attacks.
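Returning to the Post To Blog plug-in for a moment: to give a feel for what "posting to a blog over XML-RPC" involves, here is a minimal sketch that builds (and decodes again) the kind of request body a client might send. I don't know which blogging API the plug-in itself calls; the use of metaWeblog.newPost, which WordPress and many other blog platforms support, is my assumption, and the blog id, credentials and URL are made up. Nothing is sent over the network.

```python
import xmlrpc.client


def build_new_post_call(blog_id, username, password, title, body, publish=True):
    """Serialise a metaWeblog.newPost call as an XML-RPC request body."""
    # MetaWeblog represents a post as a struct with 'title' and 'description'.
    post = {"title": title, "description": body}
    return xmlrpc.client.dumps(
        (blog_id, username, password, post, publish),
        methodname="metaWeblog.newPost",
    )


# Hypothetical values for illustration only.
payload = build_new_post_call(
    "1",
    "author",
    "secret",
    "New deposit: My Paper",
    "Now available at http://repository.example.org/123/",
)

# Decode the XML again to show what the blog server would receive.
params, method = xmlrpc.client.loads(payload)
print(method)
print(params[3]["title"])
```

A real client would hand the same arguments to xmlrpc.client.ServerProxy pointed at the blog's XML-RPC endpoint rather than serialising them by hand.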
Xerte online authoring toolkit and Xpert deposit tool – Aimed at content other than research publications, Xerte – http://www.nottingham.ac.uk/xerte/ – is an open source suite of tools to rapidly develop richly interactive learning content. Content created in Xerte can be deposited into Xpert – http://www.nottingham.ac.uk/xpert/ – a searchable distributed repository compiled by harvesting content from the publishing institution via RSS feed. The aim is to make learning content available for re-use, re-purposing and adaptation.
In order to deposit into Xpert it is simply necessary to submit a valid RSS feed via the form at http://xpert.nottingham.ac.uk/feedsubmit.php
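As a rough illustration of what such a feed contains, the sketch below builds a minimal RSS 2.0 document with one item using only the Python standard library. The titles and URLs are invented, and I haven't checked whether Xpert expects any elements beyond the basic RSS 2.0 ones, so treat this as a starting point rather than a guaranteed-acceptable feed.

```python
import xml.etree.ElementTree as ET


def build_feed(title, link, description, items):
    """Return a minimal RSS 2.0 feed as a string.

    `items` is a list of dicts with 'title', 'link' and 'description' keys.
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # RSS 2.0 requires these three elements on the channel.
    for tag, text in (("title", title), ("link", link), ("description", description)):
        ET.SubElement(channel, tag).text = text
    for item in items:
        node = ET.SubElement(channel, "item")
        for tag in ("title", "link", "description"):
            ET.SubElement(node, tag).text = item[tag]
    return ET.tostring(rss, encoding="unicode")


# Hypothetical learning object for illustration only.
feed_xml = build_feed(
    "Example University learning objects",
    "http://www.example.ac.uk/learning/",
    "Openly licensed teaching materials",
    [
        {
            "title": "Introduction to referencing",
            "link": "http://www.example.ac.uk/learning/referencing/",
            "description": "An interactive tutorial on citing sources.",
        }
    ],
)
print(feed_xml)
```

The resulting XML would be published at a stable URL, and it is that URL which gets submitted via the form.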
Copyright Licensing Applications using SWORD for Moodle (CLASM) – http://clasm.ulcc.ac.uk/wiki/index.php/Main_Page – also aimed at teaching and learning materials rather than research outputs, CLASM has developed a SWORD plug-in for the Open Source VLE, Moodle so that objects in the VLE can be easily deposited into a SWORD enabled repository.
The project has now completed and there is a succinct summary – including caveats around limitations of interoperability offered by the SWORD library – at http://clasm.jiscinvolve.org/wp/2010/02/15/working-with-sword-and-moodle/
Names Project - http://names.mimas.ac.uk/ - focused on the critical issue of author disambiguation rather than deposit workflow, the Names Project is developing a pilot name authority system and uses data from Zetoc, British Library and contextual information from research documents to build a database of all UK research authors to uniquely identify individuals and institutions. A public beta API is available for testing.