Repository Day

Yesterday we ran several workshops designed to introduce the Leeds Met Repository (comprising PERSoNA and Streamline) as an integrated system-in-development. We also wanted colleagues to engage with some of the tools that will eventually (soon!) be incorporated into a complementary infrastructure surrounding the repository, facilitating easy and intuitive deposit, discovery and sharing of a myriad of different scholarly resources amongst academic colleagues bent on distributing their wares far and wide.

Note to self – might there be a trade off betwixt ambitious concept and project deliverable?

The plan was to deliver a short introductory presentation that contextualised the three projects before allowing participants to sit at a laptop and interact with the tools we have made accessible from our new blog (see PERSoNA News for more info and link).

In retrospect I think that I was missing a crucial slide that might have more clearly illustrated how intraLibrary might fit within this infrastructure.  Also, it is not at all easy to succinctly describe the dual aim of our project (an Open Access research archive/Repository of RLOs) along with their respective issues and challenges when, frankly, many of the details are still to be worked out, but then that is where the end user comes in of course!

When let loose on a laptop, many made a beeline for intraLibrary itself.  Perfectly understandable, of course, and perfectly OK within the context of our workshop, but it did throw into relief that the undoubted sophistication and flexibility of intraLibrary also equates to complexity, and I found myself faced with a cohort of beginners at the bottom of a steep learning curve that I myself have only partly ascended.  Some of the questions led Dawn to wonder whether people had misunderstood and thought that we were responsible for developing the interface to intraLibrary itself – see Streamline News – and I’ll certainly be clearer next time (I’ll try to post that missing slide soon but might it look something like an evolved version of this?)

Having said this, people were definitely engaged and interested during and after the presentation, with many keen to explore the potential of the system with me, especially with respect to RLOs. I wonder if it is now wise to disentangle the different types of content in order to more accurately target relevant groups of stakeholders – I just think the issues are too disparate, and the fact that intraLibrary is the common underlying technology for storing and making them both available in the appropriate way is really irrelevant to the end user.

Janet made the point that, in the case of RLOs, we are perhaps confronted more by issues of changing academic culture; the arguments in favour of Open Access to research are relatively well established and most researchers would agree that they would like their published research to be as widely available, read and cited as possible.  For a number of reasons, this is not necessarily the case with Learning Objects – for a discussion of some of them see this EdSpace blog post by Hugh Davis of Southampton University.

I started each workshop by emphasising that the ultimate goal of the three projects is to facilitate engagement with the repository in as fluid and flexible a manner as possible – not to impose another monolithic tool on people and expect them to use it (‘cos they won’t!).  Towards the end of the final workshop, one colleague expressed the view that his own conception of The Repository was perhaps ‘blinkered’ though he could see how it would be useful for a very particular need of his own!!  I seized upon this as precisely the type of thing we are looking for – tell me what you want to do, let’s see if we can do it, then we can show and tell others how useful it is!  I hope that if we are able to build some real use cases and exemplars we can start to build some momentum, that ongoing developments to our repository infrastructure will be informed by what people actually use and want, and that we can approach a realisation of our goals. For now, I was encouraged by the enthusiasm of many of the participants and intend to engage with them as much as possible over the coming months.

Getting there, slowly but surely

The Repository is really starting to take shape; the search interface has now been installed on a development server (as discussed previously, we are using the IRISS SRU client) and is returning very satisfying results on my test content. Now we can start adding the extra functionality (browse, advanced search) – well Mike T can at any rate, and my more technically inclined colleagues – and then to customise the look and feel, though Mike has already added an enormous Leeds Met Rose!

Ongoing development of the interface will also feed into PERSoNA – in a meeting today with John and Mike, Wendy and I discussed one initial approach being to embed the search box/additional search functionality from the interface into a google app (feeding into Leeds Met’s developing partnership with Google) or some kind of generic plug-in or widget. I’ll try to expand on this at some point on PERSoNA News and ask for some pertinent blog input from John and Mike.

And I’ve uploaded my first research paper! A colleague in the library has a paper published in the Reference Services Review – which is published by Emerald and is RoMEO green: Do Academic Enquiry Services Scare Students? (This links to the Emerald full text, not the author’s version in The Repository.)

At the moment I am very much focussed on the Staff Development Festival in September and have also been uploading citation information for demonstration purposes – I hope to use the Festival to encourage folk to supply full text copies of their research papers which can then be uploaded in line with publishers’ copyright transfer agreements and we can finally start building that representative body of content. I’ve set up a basic taxonomy within intraLibrary based on Leeds Met faculties and intend to upload 5-10 citations per faculty which I’m linking through to publishers’ abstract pages where possible. This should give us the opportunity to review metadata and get a preliminary idea of the workflow as well as illustrating to people why they might want to release copies of their work from behind subscription barriers (look, there can be links to your work all over the web but you can’t get any further than the abstract without a subscription fee.) The final choice of taxonomy should also be informed by demonstrations to academic staff – we already know that the steering group does not want to base it on faculties as the major organisational structure.

Mike has said that he can do some very preliminary customisation of the search interface before the festival to illustrate how the external browse functionality might work – this will be based on the taxonomies as they currently appear within intraLibrary and, given the short amount of time, will be for demonstration purposes only and probably won’t return dynamic results but should give people the opportunity to visualise the interface and comment on its development.

Search interface, URLs, taxonomy, policies and content…

It is now established that we will be using the SRU interface developed by IRISS as the public search interface for the repository. I hope to install the current incarnation of the interface on a Leeds Met server very soon and two of my more technically adept colleagues are looking at the recently released code in order to scope the extent of the development work that will be required to incorporate advanced search and browse functionality. As this page will effectively be the repository by proxy (the URL that I have requested is repository.leedsmet.ac.uk – intraLibrary itself will require a different URL) we also need to think about what other elements it might need to comprise; authenticated log-in to intraLibrary itself (yet to be determined if this will be the appropriate route for self-archiving; it will certainly be one route but we may also need an authenticated link to a SWORD interface for example); About this repository; FAQs; Operational policies; Contact etc. It is also likely that this page will form the basis of – or at least link to – the PERSoNA web-tool(s).

What about learning objects which will require their own taxonomy and a different workflow for deposit (via SWORD perhaps)? Should they be incorporated into the search interface at all or will users need to authenticate into intraLibrary to browse? This would seem to make sense given intraLibrary is a specialised LO repository and access to this type of content is more likely to be restricted to Leeds Met staff.

I’ve adapted my schematic recently posted on PERSoNA News to try to represent what the repository might now look like:

The customisation of the search interface is one of the issues that I am taking to the steering group meeting tomorrow afternoon.

Other decisions that need to be ratified by that group are:

  • The URL for the search interface
  • The URL for intraLibrary
  • The taxonomy system that we shall use within intraLibrary and that the search interface (SRU) will map directly on to (at least for research)

Other items on the agenda are:

  • Development of operational policies for the repository

I have so far drafted the following:

  1. Metadata policy
  2. Data policy
  3. Takedown policy
  4. Content policy
  5. Submission policy
  6. Preservation policy

These are all fairly standard in terms of Open Access repositories and, with the exception of 3. Takedown policy, were all generated using the OpenDOAR Policies Tool. Nevertheless, it may be necessary to identify specialised sub-groups to review these drafts to ensure they are appropriate for the Leeds Met repository; the issue is more complex of course due to our repository incorporating Learning Objects as well as research.

  • Content for the repository

There needs to be a discussion about how best to contact researchers and research-active staff to ask them for appropriate material for the repository. In the first instance, in line with the project plan, this will be their own versions of published research articles that are allowed to be self-archived into an OA repository. I have begun to identify such material and have drafted correspondence for review at the meeting.

  • Authentication

With the implementation of the search interface (SRU) it will not be necessary to authenticate in order to browse for research content (essential for OA). It will, however, be necessary to generate authenticated accounts for Leeds Met staff that require access to intraLibrary itself and these will need to be integrated with LDAP. Though much will depend on the precise configuration of our integrated repository systems it is likely that, in time, all staff will require an authenticated account whether to deposit material, search for learning objects or access their internal workspace. There are also authentication issues pertaining to the potential use of SWORD/other external interfaces such that only authorised Leeds Met staff/students can deposit material/access federated content. I am still unsure of some of the issues involved and require input from Intrallect and IMTS.

  • Integration with other Leeds Met systems

This is an area where it is perhaps still too early to think much beyond priorities and broad timescales. Given that there is already a plug-in for X-stream and that this is functionality that can be used as a selling point to the university community it makes sense to focus on this integration first. Also, perhaps, library online and the portal.

Adapting intraLibrary

intraLibrary is designed as a learning object repository and it is only now becoming clear just what is involved in making the platform also function as an Open Access repository of research.

Access to learning objects is generally federated. For example, in order to access resources in JORUM it is necessary to authenticate via Athens (soon to be Shibboleth) or by a UK Access Management Federation log-in mechanism and, so far as I know, it is not possible to search the repository externally via a search engine. As the very point of an Open Access repository is to make research discoverable and accessible on the public internet this is obviously not desirable! It is, I think, relatively straightforward to expose metadata out to search engines via the OAI-PMH but the majority of search engines no longer support the protocol and we really need to allow the full text to be crawled by Googlebot and other search engine spiders which, I suspect, will not be able to get past the authentication gateway (need more info on this). Moreover, if an external user does come to the repository via Google it will not be possible for them to search content without first authenticating into the system – not very open. Notwithstanding the fact that about 80% of traffic comes to a repository via search engines (assuming they can index content in the first place) we obviously also want an accessible search interface as well.
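For anyone unfamiliar with the OAI-PMH, a harvest is just an HTTP GET with a handful of standard parameters. The following is a minimal sketch of how such a request is put together. The verb and parameter names come from the protocol itself, but the base URL is a made-up placeholder and nothing here has been checked against intraLibrary's own OAI-PMH endpoint.

```python
from urllib.parse import urlencode

# Hypothetical endpoint, for illustration only; the real address would be
# whatever the repository software actually exposes.
OAI_BASE_URL = "http://repository.example.ac.uk/oai"


def list_records_url(metadata_prefix="oai_dc", set_spec=None, from_date=None):
    """Build an OAI-PMH ListRecords request URL.

    verb, metadataPrefix, set and from are standard OAI-PMH 2.0 arguments;
    oai_dc (unqualified Dublin Core) is the format every OAI-PMH repository
    is required to support.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec        # restrict the harvest to one set
    if from_date:
        params["from"] = from_date      # only records changed since this date
    return f"{OAI_BASE_URL}?{urlencode(params)}"


print(list_records_url())
print(list_records_url(from_date="2008-09-01"))
```

Note that a request like this only returns metadata records. It does nothing to expose the full text to a crawler, which is the gap described above.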

The potential solution to these problems that I am currently investigating is to use a separate, web-based SRU interface which sits outside the repository and is accessible on the public internet.

As part of the CD-LOR project Intrallect have already developed a basic SRU interface which, in turn, has been substantially improved by a third party – IRISS interface here – who have made the code available under an open source licence. The IRISS interface is still fairly basic and does not incorporate all of the functionality that we require – it is essentially a search box only and, for example, would not facilitate browsing the research collection by faculty. It should be reasonably straightforward to customise the interface to incorporate the functionality that we require; we essentially need a series of hyperlinks that map onto the internal repository structure and that will return the appropriate queries. I also need to clarify if such an approach will enable Googlebot and other search engine spiders to crawl the full text thus making the content searchable on the open web.
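The "series of hyperlinks that return the appropriate queries" idea can be sketched roughly as follows. This is an illustration under assumptions, not the IRISS code: the SRU parameter names (operation, version, query, startRecord, maximumRecords) are standard SRU 1.1, but the endpoint URL, the CQL index name and the category names are all hypothetical and would need to match whatever intraLibrary actually exposes.

```python
from urllib.parse import urlencode

# Hypothetical values: neither the endpoint nor the CQL index name comes from
# intraLibrary or the IRISS code; substitute whatever the real server exposes.
SRU_BASE_URL = "http://repository.example.ac.uk/sru"
TAXONOMY_INDEX = "dc.subject"


def browse_url(category, start_record=1, maximum_records=10):
    """Return an SRU 1.1 searchRetrieve URL listing records in one category."""
    # Quote the term so that multi-word category names stay a single CQL term,
    # escaping any backslashes or double quotes inside it.
    escaped = category.replace("\\", "\\\\").replace('"', '\\"')
    cql = f'{TAXONOMY_INDEX} = "{escaped}"'
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": cql,
        "startRecord": start_record,
        "maximumRecords": maximum_records,
    }
    return f"{SRU_BASE_URL}?{urlencode(params)}"


def browse_links(categories):
    """Map each category name to the hyperlink that would run its query."""
    return {name: browse_url(name) for name in categories}


# Category names below are invented for the example.
for name, url in browse_links(["Faculty of Health", "Faculty of Arts"]).items():
    print(name, "->", url)
```

Each link is just a canned search, so the browse view stays in step with the repository as long as the category names used here match the taxonomy held inside it.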

For each object, intraLibrary generates a public URL which can be linked to directly – on the open web and with no need for authentication. However, a further issue is that, due to the way that intraLibrary works, a query return (either from a search engine or the SRU interface) will link directly to the resource itself – i.e. a PDF of a research article will open immediately in the browser window. When facilitating Open Access to research this is undesirable for several reasons and we require some sort of “landing screen” that can provide context and basic information (abstract, copyright info, whether the paper has been refereed); indeed, there will often be a legal requirement to provide copyright information with many publishers also stipulating that there must also be a link to the published version of the paper. Precisely how we will resolve this issue is yet to be determined; it might be possible to embed a link to the PDF into some sort of HTML template and have this template returned at the public URL?
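One way to picture the "landing screen" is as a small HTML template filled in from the record's metadata, with the PDF reduced to a link. The sketch below assumes the metadata arrives as a plain dictionary; the field names and the template layout are invented for illustration and say nothing about how intraLibrary itself could be made to serve such a page.

```python
from html import escape
from string import Template

# Illustrative layout only; the field names are assumptions, not intraLibrary's.
LANDING_TEMPLATE = Template("""<!DOCTYPE html>
<html>
<head><title>$title</title></head>
<body>
  <h1>$title</h1>
  <p>$abstract</p>
  <p>Peer reviewed: $refereed</p>
  <p>$copyright</p>
  <p><a href="$published_url">Published version</a></p>
  <p><a href="$pdf_url">Full text (PDF, author's version)</a></p>
</body>
</html>
""")


def render_landing_page(record):
    """Fill the template from a metadata dict, HTML-escaping every value."""
    fields = {
        "title": record["title"],
        "abstract": record["abstract"],
        "refereed": "yes" if record["refereed"] else "no",
        "copyright": record["copyright"],
        "published_url": record["published_url"],
        "pdf_url": record["pdf_url"],
    }
    # Escaping guards against stray markup in titles or abstracts.
    safe = {key: escape(value, quote=True) for key, value in fields.items()}
    return LANDING_TEMPLATE.substitute(safe)
```

The page would then be what the public URL returns, so a visitor arriving from a search engine sees the abstract, copyright statement and link to the published version before opening the PDF.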

Watch this space…

By working closely with Intrallect and with a little ingenuity I am confident that these issues will be resolved and that we have, in intraLibrary, an excellent solution to our diverse needs.
