Infrastructure schematic (1st draft)

There are several significant developments that will affect our repository, research management, and OER dissemination and discovery over the next 12 months or so… briefly these are:

This is a quick schematic of how the developing infrastructure might look (a bit big to fit in my WordPress theme so click on image for full size):

Repository deposit from the desktop

Thinking about repository workflows for staff – put a deposit client where their resources live, on their desktop…

What I have:

A (slightly unwieldy) set of files comprising:

Quick drop file set

How it works:

The VB script was written by Boyd Duffy at Keele University and, as a non-developer, I know only that I need to edit sword_deposit.vbs with my SWORD DEPOSIT_TARGET. It’s then simply* a matter of dragging and dropping a file (or multiple files) onto the VBS icon for them to be uploaded into the repository. What happens next can be configured in the repository itself: the files could be published immediately**, for example, or, more likely, go into a workflow where metadata can be added according to a particular Application Profile.

** I think Keele use it as a quick and dirty method for image files to be transferred from desktop to repository from where they can be immediately accessed via a VLE PowerLink.

Here is a screen capture that I did a while ago: http://www.leedsmet.ac.uk/inn/repository/video/SWORD_drop_from_desktop/

* Re simple – I can, in fact, only make it work from a Leeds Met IP! Perhaps something to do with PROXY_HOST / wireless?

What I need:

METADATA of course!

The current tool is of limited use as it just pushes a file into the repository. It will, however, quite happily push a Content Package – a Zip comprising a file and some metadata as XML, either an IMSMANIFEST (which I would need for intraLibrary) or METS for DSpace (i.e. Jorum).
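To make "pushing a package via SWORD" a little more concrete, here is a minimal sketch in Python of the HTTP request involved. It is not a translation of the Keele script. The header names follow the SWORD 1.3 profile as I understand it, the packaging URI is the one commonly quoted for DSpace METS packages, and the deposit URL is entirely hypothetical, so all three should be checked against the target repository before relying on them.

```python
import urllib.request

# Assumption: packaging identifier commonly used for DSpace METS packages.
# An intraLibrary / IMS deposit would need a different value.
METS_DSPACE_SIP = "http://purl.org/net/sword-types/METSDSpaceSIP"

# Hypothetical collection URL, standing in for the DEPOSIT_TARGET in the script.
DEPOSIT_TARGET = "http://repository.example.ac.uk/sword/deposit/collection"


def sword_headers(filename, packaging):
    """Return the HTTP headers for a zipped package deposit."""
    return {
        "Content-Type": "application/zip",
        "Content-Disposition": f"filename={filename}",
        "X-Packaging": packaging,
    }


def build_deposit_request(deposit_target, package_bytes, filename, packaging):
    """Build (but do not send) the POST request that carries the package."""
    return urllib.request.Request(
        deposit_target,
        data=package_bytes,
        headers=sword_headers(filename, packaging),
        method="POST",
    )


request = build_deposit_request(
    DEPOSIT_TARGET, b"zip bytes go here", "package.zip", METS_DSPACE_SIP
)
print(request.get_method(), request.full_url)
```

Actually sending the request would be a call to urllib.request.urlopen(request), plus whatever authentication (and possibly proxy settings) the repository and network require; those are left out here.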

Though I don’t have the skills myself, I’m hoping someone can tell me how we might develop a desktop app to integrate a way of capturing the metadata associated with a resource, converting it into an IMSMANIFEST and/or METS, zipping the whole lot up and pushing it to a repository (or multiple repositories) via SWORD …

If we were to use our current ukoer AP we would need to capture:

  • Title
  • Description
  • (Uncontrolled) Keyword(s)
  • Author / owner / contributor
  • Date
  • Type of resource
  • Technical format
  • Licence information
  • Subject classification (HEA and JACS)
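The capture-and-zip step described above might look something like the sketch below. It only shows the mechanics of collecting these fields, writing them out as XML and zipping them up with the resource. The XML layout and the file name metadata.xml are placeholders of my own, not a valid IMSMANIFEST or METS document; a real tool would have to map each field onto the relevant schema.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# One entry per item in the ukoer list above (names are illustrative only).
FIELDS = [
    "title", "description", "keyword", "author", "date",
    "type", "format", "licence", "subject",
]


def build_package(resource_name, resource_bytes, metadata,
                  manifest_name="metadata.xml"):
    """Zip a resource together with a simple XML file of its metadata.

    The XML is a flat placeholder layout, NOT schema-valid IMS CP or METS.
    """
    missing = [field for field in FIELDS if field not in metadata]
    if missing:
        raise ValueError("missing metadata: " + ", ".join(missing))

    root = ET.Element("metadata")
    for field in FIELDS:
        values = metadata[field]
        if isinstance(values, str):
            values = [values]  # allow a single value or a list (e.g. keywords)
        for value in values:
            ET.SubElement(root, field).text = value
    manifest = ET.tostring(root, encoding="utf-8")

    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr(manifest_name, manifest)
        package.writestr(resource_name, resource_bytes)
    return buffer.getvalue()


example = {
    "title": "Example resource",
    "description": "A made-up record for illustration",
    "keyword": ["ukoer", "repositories"],
    "author": "A. N. Author",
    "date": "2010-04-01",
    "type": "Presentation",
    "format": "application/pdf",
    "licence": "CC BY-NC-SA",
    "subject": ["HEA: example", "JACS: example"],
}
package_bytes = build_package("slides.pdf", b"%PDF-", example)
```

The resulting bytes are what would be handed to a SWORD deposit, once the placeholder XML has been replaced with a proper manifest.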

Click link below for an example IMSCP:

http://repository-intralibrary.leedsmet.ac.uk/IntraLibrary?command=open-package-download&learning_object_key=i3605n162666t.zip

Or link below for METS (with cut-down metadata); this package has been successfully deposited in Jorum (dev) via SWORD:

http://repository-intralibrary.leedsmet.ac.uk/IntraLibrary?command=open-preview&learning_object_key=i3128n92902t

N.B. A practical issue with this approach might be getting such an application included on an institutional staff build. I have also heard rumours that it might be possible to achieve similar drag and drop functionality with a web-based app using HTML5, though I think browser support is still inconsistent.

Four JISC repository infrastructure projects

I was contacted this week by Evidence Base at Birmingham City University, who are conducting a “short lightweight review” of four key repository infrastructure projects, preliminary to a larger evaluation of the IE programme as a whole. They are talking to JISC programme managers and project managers as well as seeking views from lowly repository managers like me!

The four projects I was asked to discuss were:

Repository Search (UK Institutional Repository Search – IRS) – http://www.intute.ac.uk/irs/
Repository Support Project (RSP) – http://www.rsp.ac.uk
Repository Junction (Open Access Repository Junction – OA-RJ) – http://edina.ac.uk/projects/oa-rj/ and
Repository Aggregation (RepUK) – http://www.ukoln.ac.uk/projects/repuk/

Now I like to think I’ve got my ear to the ground, and I was immediately struck that I was only actually familiar with two of these projects (the Intute IR search and, of course, the good old RSP). So I followed the links for the other two projects to learn what I could. Both, in my view, need to be very much more high profile than they are currently (though they do both have another 12 months to run, until 31st March 2011). My ensuing discussion with the lady from Evidence Base was more around the conceptual value of all four projects.

OA-RJ

I expect that OA-RJ in particular will gain traction over the coming months, not least because it is referenced in the current JISC Grant Funding Call, “Deposit of research outputs and Exposing digital content for education and research”.

The purpose of the project is to scope, build and test a deposit broker tool to assist open access deposit into, and interoperability between, existing repository services. Currently, multiple-authored journal articles are deposited singly in either an institutional, funder or subject-based repository, and the primary aim is to simplify the repository deposit workflow for such articles. OA-RJ will therefore offer an API that supports redirect and deposit of research outputs into multiple repositories.

RepUK


I was particularly interested in RepUK and IRS as I have for some time been a little nonplussed by our collective, continued obsession with the woefully under-used OAI-PMH, and both these projects are using the protocol (I think!).

There is not a huge amount of information on the RepUK website but the paragraph below gives a flavour of the project:

“The interest in exploiting the content to be found in institutional repositories is growing. At the same time, there is a range of possible uses for a central cache of metadata records held by institutional repositories. Most notably, with a recent emphasis on ‘rapid innovation’, there exists an opportunity to position this aggregation of data to support research and development generally in the fields of metadata and/or repositories. Rapid innovation projects which require a corpus of metadata to work with will benefit from this readily available data-store, avoiding the resource-intensive overhead of developing their own harvesting and aggregation solution.”

RepUK also invokes Lorcan Dempsey’s concept of ‘concentration’ in a Web 2.0 environment as a “major characteristic of our network experience” involving “major gravitational hubs” that “concentrate data, users (as providers and consumers), and communications and computational capacity” and posits that “a central cache of metadata records held by institutional repositories” in this way, exposed by a simple, RESTful API, would allow the community to start building value added services around this (hopefully) high quality metadata.

UK Institutional Repository Search (IRS)

This service has come to the end of its funded period as a JISC project but is being maintained at a basic level by Mimas. I presume that it is using OAI-PMH* to cross-search UK IRs, and it offers “conceptual search” and “text mining search”**. With the best will in the world, it is difficult to see how this facility, in its current incarnation, can compete with the likes of Google.

* Maybe conceptual search uses OAI-PMH but “text mining” is more Google-style?
** At least it did but the text mining search was broken and was giving a “Bad Gateway” yesterday – it now appears to have been rerouted to the “conceptual search” only, presumably while it is fixed.


Google, of course, withdrew support for OAI-PMH back in April 2008, and though I’m aware of a few harvesters around like OAIster, even OpenDOAR uses a Google custom search (http://www.opendoar.org/search.php), not OAI-PMH, to search repository content.

I can offer only anecdotal evidence, but I’m pretty sure that your average academic will tend towards Google/Google Scholar to source research on the open web and has no idea about OAI-PMH, which simply isn’t widely used enough to justify our ongoing fixation. The reasons for that fixation are several. To some extent they reflect the protocol’s pedigree (it dates back to the earliest days of the open access and institutional repository movements) and the associated investment by the community, in software specification for example. They also stem from a recognition of the limitations of Google for academic purposes and the undoubted potential of OAI-PMH (though this potential has arguably been watered down by so many repositories also carrying metadata-only records rather than exclusively full text).
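For anyone who has not looked under the bonnet, an OAI-PMH harvest is just an HTTP GET with a "verb" parameter, answered with XML. The Python sketch below builds a ListRecords request for simple Dublin Core and pulls titles and identifiers out of a response. The repository base URL and the sample record are invented for illustration, and a real harvester would also need to handle resumption tokens and error responses, which this does not.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespaces defined by OAI-PMH 2.0 and simple Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hand-written sample response (not taken from any real repository).
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:repository.example.ac.uk:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An example article</dc:title>
          <dc:identifier>http://repository.example.ac.uk/1</dc:identifier>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build the URL for a ListRecords request."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return base_url + "?" + query


def titles_and_identifiers(response_xml):
    """Return a (title, [identifiers]) pair for each record in a response."""
    root = ET.fromstring(response_xml)
    results = []
    for record in root.iterfind(".//oai:record", NS):
        title = record.findtext(".//dc:title", default="", namespaces=NS)
        identifiers = [el.text for el in record.iterfind(".//dc:identifier", NS)]
        results.append((title, identifiers))
    return results


url = list_records_url("http://repository.example.ac.uk/oai")
records = titles_and_identifiers(SAMPLE_RESPONSE)
```

Note that the dc:identifier here is just a link to a repository page. Nothing in the record says whether a full-text file sits behind it, which is exactly the metadata-only problem mentioned above.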

RSP

When I was new to repositories I found the RSP absolutely invaluable as a source of information and support; they came to a soft launch of our repository back in 2008, which was really useful in giving colleagues a little bit of a wider view of the repository landscape in the UK. I must confess that I haven’t been back to the RSP website for a while and I was pleasantly surprised that there is now a great deal more content covering everything from a primer on the OAI-PMH to advice and resources for successful advocacy. I was also reminded that the RSP do outreach visits and I may well consider giving them a call – it would certainly be useful to get an objective perspective on some of the issues we continue to face with repository development here at Leeds Met.

I’m not naive, of course, to the reasons for JISC conducting these project evaluations, and they clearly want to think carefully about where future investment can most effectively be made. I was asked a few leading questions around how the RSP still meets the needs of the community (I think they do!) and how they might adapt their approach to meet shifting requirements. The start-ups are all but finished I think but, no doubt, new people are coming into the sector all the time who will most certainly benefit from the clear information and support of the RSP. I also speculated somewhat idly whether the website could be a bit more dynamic and, well, Web 2.0 – they do have a presence on Twitter – @RepoSupport – but I couldn’t find it from the website and I don’t think it feeds there. I even wondered whether a social network style site using Ning or Elgg might work… just a thought.