PowerLink to X-stream and CLA copies

One of the selling points of intraLibrary was the PowerLink to X-stream (Blackboard Vista) which, as I understand it, will enable a tutor to link directly to an object stored in the repository without the need to upload it to the X-stream module.

We hope to be able to use the repository to store and make available digitised books in line with the CLA licence. Our copyright officer has outlined her ideal requirements from the combined system as follows:

• Closed, secure storage space for digitised files (“Digital Copies”)
• Tutor is provided with a link to a Digital Copy stored within the repository
• The link can be added to an X-stream module (to connect the VLE and intraLibrary)
• Student doesn’t need to login to access Digital Copy when already logged into X-stream
• The Digital Copy remains within the repository
• Library maintains control over the Digital Copies; the Digital Copies can be removed after end of course

She points out that there may well be copyright implications associated with using the repository in this way:

The CLA licence states:

Digital Copies may not be stored, or systematically indexed, with the intention of creating an electronic library or similar educational learning resource

On the face of it this seems to preclude the use of a repository, but might it be allowed if the storage is entirely secure, i.e. it cannot be accessed by students (or unauthorised staff) without a PowerLink to the VLE, which will only make the digital copy available in accordance with the licence – that is, as though it had simply been uploaded to the VLE?

I suppose it would be an indexed, electronic library of sorts but purely for archival purposes – for authorised library staff to have a centralised, searchable store of digital copies that can be linked to directly from X-stream without needing to email the actual resource to an individual tutor so that they can upload it to X-stream. Given the flexible nature of intraLibrary (another selling point) it should be straightforward to federate access to a particular collection (digitised books) and a particular user group (librarians) in this way, but is the PowerLink secure? Will staff be able to share the link (which is ok, but only if we know about it and can record it)? Will tutors just be linking to the resource and not actually copying the file from the repository into X-stream?

I need to learn more about how the PowerLink actually works – and X-stream itself for that matter. Not to mention the CLA licence and copyright!


Repository Steering Group meeting: 22nd July 2008

The staff development festival in September is a unique opportunity to promote the repository and our agenda for yesterday’s meeting aimed to get some much needed input from the steering group before the quiet month of August.

Item 1. Recap of previous meetings:

Documentation approved.

Item 2. Update on progress with intraLibrary

2a. Configuration:

Search interface (SRU):

Getting the search interface online is the first priority. My request for the server is still pending with IMTS, but I hope we can install the IRISS interface as is within the next few weeks (JohnG is installing it on a local server as we speak, which can then be transferred to our Leeds Met domain when it is available), and I think it will be straightforward to switch the CSS to get a very rough Leeds Met branding.

Content structure:

This is also crucial and needs to be put in place ASAP. Several members of the group expressed the opinion that it should not be based on faculties, which tend not to be fixed entities within the university; it was also thought that such a schema would not reflect institutional emphasis upon cross-disciplinary research. There was consensus that organisation at the top level should be by content type (i.e. Research/Learning Objects) but exactly what hierarchy should be employed beneath is still not clear (Library of Congress Subject Headings?). We also need to make a decision on what other material types will be accommodated in the prototype (e.g. Dissertations and Theses).

Landing screen:

Technical challenges aside, the current conception of the landing screen is that it will essentially use the same template as the search interface i.e. it will be branded the same and share the same look and feel; it will also share some of the same functionality and link back ‘home’ to the search interface.

Given the close relationship between these configuration issues, a sub-group was identified that will liaise as necessary to develop the content structure; branding; look and feel; usability and will also inform the technical development of the additional functionality.

2b. Policies:

The group was briefed on the types of policies that need to be developed (see last post) with emphasis on the fact that the ‘standard’ institutional repository policies may be insufficient for our requirements given our wider remit (i.e. not just research outputs). A sub-group was identified that will liaise as necessary to develop suitable policies.

2c. URL:

The suggestion mooted – repository.leedsmet.ac.uk – was deemed suitable by the group.

Item 3. Content for the repository:

To discuss method of contacting researchers / research active staff and soliciting content

Review of draft correspondence for research active staff and discussion of when this would most usefully be disseminated; consensus that it would have the greatest impact some time after the staff development festival. Content was broadly approved though it was suggested that greater emphasis be placed on the benefits of OA to citation and the increased importance of citation under proposals for REF (to replace RAE).

Emphasis was placed on the need to identify and recruit interested parties within specific faculties/research groups to help drive the advocacy process to the wider community; liaison with University Research Office for appropriate contact lists.

(NB. This is an ongoing process that is already underway but will increase in profile with the implementation of the prototype system.)

The staff development festival was confirmed as a key opportunity.

There was discussion whether content would be full text only or would also comprise citation of material that we do not have copyright permission to make available as full text (i.e. bibliographic reference only). Given that including such material will enable us to ‘hit the ground running’ and considering the increasing importance of citation data/bibliometrics for the RAE / REF the consensus was that citations should be included at the outset.

Item 4. Authentication

It was emphasised to the group that we can be fully functional as a mediated repository without the need for authentication in the first instance.

A representative from IMTS was able to inform the discussion in the light of recent feedback from Intrallect and will continue to liaise as necessary.

Item 5. Integration with other Leeds Met systems

In light of the decision to include citations as well as full text, an important early integration will be with SFX such that citations in the repository can incorporate a link to Leeds Met holdings of subscribed material; hardly Open Access as it will only be available to authenticated staff and students but will offer another local route to that material and can also be used to generate data on OA friendly publishers and perhaps to raise awareness of OA.
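As a rough illustration of what the SFX integration might involve, the sketch below builds an OpenURL (Z39.88-2004, key/encoded-value format) from a citation so that a repository record could link through to a link resolver. The resolver address, the function name and the shape of the citation dictionary are all assumptions made for the example; how intraLibrary would actually supply citation metadata still needs to be investigated.

```python
from urllib.parse import urlencode

# Hypothetical link resolver address; the real SFX base URL would come from the library.
SFX_BASE_URL = "https://sfx.example.ac.uk/sfx_local"

# Journal article fields from the OpenURL journal metadata format that we pass on if present.
JOURNAL_KEYS = ("atitle", "jtitle", "issn", "volume", "issue", "spage", "date")


def build_openurl(citation):
    """Return an OpenURL for a journal article citation (a plain dict, assumed shape)."""
    params = {
        "url_ver": "Z39.88-2004",
        "rft_val_fmt": "info:ofi/fmt:kev:mtx:journal",
        "rft.genre": "article",
    }
    for key in JOURNAL_KEYS:
        value = citation.get(key)
        if value:
            # Each citation field is sent as an rft.* (referent) key.
            params[f"rft.{key}"] = value
    return f"{SFX_BASE_URL}?{urlencode(params)}"
```

A record with only a journal title and ISSN would still produce a usable link; the resolver decides what, if anything, Leeds Met holds.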

The PowerLink to X-stream should also be a priority such that it is operational at the earliest opportunity.

NB. Precise functionality of the PowerLink still needs to be determined.

Other systems flagged up for integration were iTunesU and the streaming server; pending investigation!

The next meeting of the steering group will take place after the staff development festival, probably late September/early October.

Search interface, URLs, taxonomy, policies and content…

It is now established that we will be using the SRU interface developed by IRISS as the public search interface for the repository. I hope to install the current incarnation of the interface on a Leeds Met server very soon and two of my more technically adept colleagues are looking at the recently released code in order to scope the extent of the development work that will be required to incorporate advanced search and browse functionality. As this page will effectively be the repository by proxy (the URL that I have requested is repository.leedsmet.ac.uk – intraLibrary itself will require a different URL) we also need to think about what other elements it might need to comprise; authenticated log-in to intraLibrary itself (yet to be determined if this will be the appropriate route for self-archiving; it will certainly be one route but we may also need an authenticated link to a SWORD interface for example); About this repository; FAQs; Operational policies; Contact etc. It is also likely that this page will form the basis of – or at least link to – the PERSoNA web-tool(s).

What about learning objects which will require their own taxonomy and a different workflow for deposit (via SWORD perhaps)? Should they be incorporated into the search interface at all or will users need to authenticate into intraLibrary to browse? This would seem to make sense given intraLibrary is a specialised LO repository and access to this type of content is more likely to be restricted to Leeds Met staff.

I’ve adapted my schematic recently posted on PERSoNA News to try to represent what the repository might now look like:

The customisation of the search interface is one of the issues that I am taking to the steering group meeting tomorrow afternoon.

Other decisions that need to be ratified by that group are:

  • The URL for the search interface
  • The URL for intraLibrary
  • The taxonomy system that we shall use within intraLibrary and that the search interface (SRU) will map directly on to (at least for research)

Other items on the agenda are:

  • Development of operational policies for the repository

I have so far drafted the following:

  1. Metadata policy
  2. Data policy
  3. Takedown policy
  4. Content policy
  5. Submission policy
  6. Preservation policy

These are all fairly standard in terms of Open Access repositories and, with the exception of 3. Takedown policy, were all generated using the OpenDOAR Policies Tool. Nevertheless, it may be necessary to identify specialised sub-groups to review these drafts to ensure they are appropriate for the Leeds Met repository; the issue is more complex, of course, because our repository incorporates Learning Objects as well as research.

  • Content for the repository

There needs to be a discussion about how best to contact researchers and research active staff to ask them for appropriate material for the repository. In the first instance, in line with the project plan, this will be their own versions of published research articles that are allowed to be self-archived into an OA repository. I have begun to identify such material and have drafted correspondence for review at the meeting.

  • Authentication

With the implementation of the search interface (SRU) it will not be necessary to authenticate in order to browse for research content (essential for OA). It will, however, be necessary to generate authenticated accounts for Leeds Met staff that require access to intraLibrary itself and these will need to be integrated with LDAP. Though much will depend on the precise configuration of our integrated repository systems it is likely that, in time, all staff will require an authenticated account whether to deposit material, search for learning objects or access their internal workspace. There are also authentication issues pertaining to the potential use of SWORD/other external interfaces such that only authorised Leeds Met staff/students can deposit material/access federated content. I am still unsure of some of the issues involved and require input from Intrallect and IMTS.

  • Integration with other Leeds Met systems

This is an area where it is perhaps still too early to think much beyond priorities and broad timescales. Given that there is already a plug-in for X-stream and that this is functionality that can be used as a selling point to the university community it makes sense to focus on this integration first. Also, perhaps, library online and the portal.

Research repositories: the debate continues

Damyanti Patel and Owen Stephens summarise a presentation given by Bill Hubbard at the JISC Innovation forum here

As they say, nothing new perhaps but a concise review of the status quo and the big ideas for moving forward:

“OK – where are we going?

  • Exposure for harvesting
  • Linkage to departmental pages
  • Linkage to personal pages (we do this at Imperial)
  • REF – citation and usage analysis
  • Beyond pdf – text and data-mining
  • Virtual Research Environments
  • Embedding into institutional workflows
  • Repository as a set of services
  • Staffing and management
  • Funder mandate compliance

Bill drawing analogy between library processes and repository – getting a book into the library depends on many different people being involved, inside and outside the library, ‘Repositories’ need to be embedded to this degree (I’d argue, and think Bill would agree, that it isn’t ‘repositories’ but services that need to be embedded – afterall with the library analogy we don’t talk about it in terms of the library management system)”

Adapting intraLibrary

intraLibrary is designed as a learning object repository and it is only now becoming clear just what is involved in making the platform also function as an Open Access repository of research.

Access to learning objects is generally federated. For example, in order to access resources in JORUM it is necessary to authenticate via Athens (soon to be Shibboleth) or by a UK Access Management Federation log-in mechanism and, so far as I know, it is not possible to search the repository externally via a search engine. As the very point of an Open Access repository is to make research discoverable and accessible on the public internet this is obviously not desirable! It is, I think, relatively straightforward to expose metadata out to search engines via the OAI-PMH but the majority of search engines no longer support the protocol and we really need to allow the full text to be crawled by Googlebot and other search engine spiders which, I suspect, will not be able to get past the authentication gateway (need more info on this). Moreover, if an external user does come to the repository via Google it will not be possible for them to search content without first authenticating into the system – not very open. Notwithstanding the fact that about 80% of traffic comes to a repository via search engines (assuming they can index content in the first place) we obviously want an accessible search interface as well.
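For anyone unfamiliar with OAI-PMH, a harvester simply requests URLs of a standard shape from the repository. The sketch below builds a ListRecords request for unqualified Dublin Core metadata; the endpoint address and the function name are assumptions for illustration, and I have not yet confirmed where intraLibrary exposes its OAI-PMH interface.

```python
from urllib.parse import urlencode

# Hypothetical OAI-PMH endpoint; the real one depends on the intraLibrary installation.
OAI_BASE_URL = "https://repository.example.ac.uk/oai"


def list_records_url(metadata_prefix="oai_dc", set_spec=None, from_date=None):
    """Return an OAI-PMH ListRecords request URL.

    set_spec and from_date are the optional 'set' and 'from' arguments
    defined by the protocol for selective harvesting.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    if from_date:
        params["from"] = from_date
    return f"{OAI_BASE_URL}?{urlencode(params)}"
```

Note that this only exposes metadata; it does nothing for full text crawling, which is the harder problem described above.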

The potential solution to these problems that I am currently investigating is to use a separate, web-based SRU interface which sits outside the repository and is accessible on the public internet.

As part of the CD-LOR project Intrallect have already developed a basic SRU interface which, in turn, has been substantially improved by a third party – IRISS interface here – who have made the code available under an open source licence. The IRISS interface is still fairly basic and does not incorporate all of the functionality that we require – it is essentially a search box only and, for example, would not facilitate browsing the research collection by faculty. It should be reasonably straightforward to customise the interface to incorporate the functionality that we require; we essentially need a series of hyperlinks that map onto the internal repository structure and that will return the appropriate queries. I also need to clarify if such an approach will enable Googlebot and other search engine spiders to crawl the full text thus making the content searchable on the open web.
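To make the idea of browse links concrete, here is a minimal sketch of how a set of hyperlinks could each carry a fixed SRU searchRetrieve query. The operation, version, query and maximumRecords parameters are standard SRU; the endpoint address, the category labels and the CQL index name "collection" are assumptions, since the indexes that intraLibrary actually supports still need to be checked with Intrallect.

```python
from urllib.parse import urlencode

# Hypothetical SRU endpoint; the real address depends on the intraLibrary installation.
SRU_BASE_URL = "https://repository.example.ac.uk/sru"

# Hypothetical browse categories, each mapped to an assumed CQL index and term.
BROWSE_CATEGORIES = {
    "Research": ("collection", "research"),
    "Learning Objects": ("collection", "learning-objects"),
}


def build_sru_url(index, term, maximum_records=20):
    """Return a searchRetrieve URL for a single CQL 'index = "term"' clause."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": f'{index} = "{term}"',
        "maximumRecords": maximum_records,
    }
    return f"{SRU_BASE_URL}?{urlencode(params)}"


def browse_links():
    """Return a mapping of link label to the SRU URL that the link should request."""
    return {
        label: build_sru_url(index, term)
        for label, (index, term) in BROWSE_CATEGORIES.items()
    }
```

Each entry returned by browse_links() could then be rendered as an ordinary hyperlink on the search page, so browsing becomes a set of pre-built searches rather than new functionality in the repository itself.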

For each object, intraLibrary generates a public URL which can be linked to directly – on the open web and with no need for authentication. However, a further issue is that, due to the way that intraLibrary works, a query return (either from a search engine or the SRU interface) will link directly to the resource itself – i.e. a PDF of a research article will open immediately in the browser window. When facilitating Open Access to research this is undesirable for several reasons and we require some sort of “landing screen” that can provide context and basic information (abstract, copyright info, whether the paper has been refereed); indeed, there will often be a legal requirement to provide copyright information with many publishers also stipulating that there must also be a link to the published version of the paper. Precisely how we will resolve this issue is yet to be determined; it might be possible to embed a link to the PDF into some sort of HTML template and have this template returned at the public URL?
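Purely as a sketch of the HTML template idea, the code below fills a simple landing page from a record's metadata and links to the PDF rather than opening it directly. The field names, the layout and the function name are all assumptions for illustration; how the metadata would be pulled out of intraLibrary, and whether the page could be served at the public URL, is exactly what remains to be determined.

```python
from html import escape
from string import Template

# Minimal landing page; every $placeholder is a hypothetical metadata field.
LANDING_TEMPLATE = Template("""<!DOCTYPE html>
<html>
<head><title>$title</title></head>
<body>
  <h1>$title</h1>
  <p>$abstract</p>
  <p>Refereed: $refereed</p>
  <p>$copyright</p>
  <p><a href="$pdf_url">Download full text (PDF)</a></p>
  <p><a href="$published_url">Published version</a></p>
</body>
</html>
""")


def render_landing_page(record):
    """Return landing page HTML for a record dict containing all template fields."""
    # Escape every value so that metadata cannot break or inject markup.
    safe = {key: escape(str(value), quote=True) for key, value in record.items()}
    return LANDING_TEMPLATE.substitute(safe)
```

A record missing any of the fields raises a KeyError, which seems the right behaviour where copyright information is a legal requirement.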

Watch this space…

By working closely with Intrallect and with a little ingenuity I am confident that these issues will be resolved and that we have, in intraLibrary, an excellent solution to our diverse needs.

MSc Dissertation

I am excited by some of the material being generated by Masters student Beth Hall and we’d both like to thank all those staff and postgraduates (48 so far!) who have taken the time to complete her questionnaire (there’s still time if you haven’t yet got round to it!).

Beth is in the process of conducting follow-up interviews with those that have agreed.

I will post a couple of her early outputs to this blog:

Here is a graphical summary of some of the data gathered by the questionnaire [N=48]

Here are the questions that will comprise the face to face interviews.

I should emphasise that these are early outputs and very much represent work in progress; nevertheless there is some useful data that we can use as a foundation for ongoing advocacy and I look forward to reading Beth’s dissertation when it is written.