Implementing the Symplectic API

We’ve made real progress implementing the Symplectic API, which I hope will help motivate academic staff to update and maintain their Symplectic profile and, who knows, perhaps even encourage them to upload full-text to the repository! Kudos to web-developer Mike Taylor, who has done all the clever stuff (though this summary reflects my understanding so may contain errors!).

As can be seen in the screen-shot below, Mike has been able to submit a query to the API (using Leeds Met username as a parameter) and differentially parse the resulting XML by publication type including, where available, links to DOI and full-text in the repository (currently labelled as Public URL). Symplectic also has the option to “favourite” records, which is flagged in the XML and which we’ve used to identify “Selected publications” in order to give academics greater control over their profile (there is also a “make invisible” option to prevent specific records being exposed from the API).
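As a rough illustration of that parsing step, here is a minimal Python sketch. It is not Mike's code, and the element and attribute names (`publication`, `type`, `favourite`, `doi`, `public-url`) are invented for the example rather than taken from the real Symplectic schema.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical response: element and attribute names are assumptions,
# not the actual Symplectic API schema.
SAMPLE_XML = """
<publications>
  <publication type="journal-article" favourite="true">
    <title>An example article</title>
    <doi>10.1000/example.1</doi>
    <public-url>http://repository.example.ac.uk/item/1</public-url>
  </publication>
  <publication type="book" favourite="false">
    <title>An example book</title>
  </publication>
  <publication type="journal-article" favourite="false">
    <title>Another article</title>
    <doi>10.1000/example.2</doi>
  </publication>
</publications>
"""


def parse_publications(xml_text):
    """Group records by publication type and pick out favourites."""
    by_type = defaultdict(list)
    selected = []
    for pub in ET.fromstring(xml_text).iter("publication"):
        doi = pub.findtext("doi")
        record = {
            "title": pub.findtext("title", default="").strip(),
            # Link to the published version only when a DOI is present.
            "doi_link": f"https://doi.org/{doi}" if doi else None,
            # Link to full text in the repository, where available.
            "public_url": pub.findtext("public-url"),
        }
        by_type[pub.get("type", "other")].append(record)
        # Favourited records feed the "Selected publications" list.
        if pub.get("favourite") == "true":
            selected.append(record)
    return dict(by_type), selected


by_type, selected = parse_publications(SAMPLE_XML)
```

Keeping the favourites in their own list means the “Selected publications” block can be rendered separately from the full list grouped by type.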

The next step will be to liaise with the corporate web-team to explore how the feed can be embedded in the institutional CMS. We’ve already picked a few brains and it shouldn’t be too difficult, though there are still one or two technical issues, including how best to submit a query. We wouldn’t want to use username as that would be a privacy issue; the preference would be email address, though this will require a layer of translation from email address (which isn’t searchable) to either Leeds Metropolitan username or Symplectic internal user id. In addition, the API isn’t designed to be hammered dynamically, so results need to be cached, and there are questions about how best to refresh that cache to reflect changes that academics may wish to make to their profile.
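To make the translation and caching questions a little more concrete, here is a minimal Python sketch of one possible approach: look up the username from the email address, serve cached results while they are fresh, and allow an entry to be dropped when a profile changes. The lookup table, the one-hour lifetime and the `fetch` function are all assumptions for illustration, not a description of what we will actually build.

```python
import time

# Hypothetical lookup table: in practice this would come from HR data.
EMAIL_TO_USERNAME = {"a.n.other@example.ac.uk": "another01"}


class ProfileCache:
    """Cache API results per user and refetch once they are too old."""

    def __init__(self, fetch, max_age_seconds=3600, clock=time.time):
        self._fetch = fetch              # function(username) -> result
        self._max_age = max_age_seconds  # assumed lifetime of an entry
        self._clock = clock
        self._store = {}                 # username -> (timestamp, result)

    def get_by_email(self, email):
        # Translate the public identifier (email) to the private one.
        username = EMAIL_TO_USERNAME[email.lower()]
        cached = self._store.get(username)
        now = self._clock()
        if cached is not None and now - cached[0] < self._max_age:
            return cached[1]
        result = self._fetch(username)
        self._store[username] = (now, result)
        return result

    def invalidate(self, email):
        """Drop a cached entry, e.g. after an academic edits their profile."""
        self._store.pop(EMAIL_TO_USERNAME[email.lower()], None)


# Stand-in for the real API call, recording how often it is used.
calls = []


def fake_fetch(username):
    calls.append(username)
    return f"<publications user='{username}'/>"


cache = ProfileCache(fake_fetch)
cache.get_by_email("A.N.Other@example.ac.uk")
cache.get_by_email("a.n.other@example.ac.uk")  # served from the cache
```

A time-based expiry alone would leave profile edits invisible until the entry ages out, which is why the sketch also includes an explicit `invalidate` step.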

An institutional tangram – musings on developing an integrated research management system

“The tangram (Chinese: 七巧板; pinyin: qī qiǎo bǎn; literally “seven boards of skill”) is a dissection puzzle consisting of seven flat shapes, called tans, which are put together to form shapes. The objective of the puzzle is to form a specific shape (given only an outline or silhouette) using all seven pieces, which may not overlap.”

http://en.wikipedia.org/wiki/Tangram

Having implemented an institutional repository at Leeds Metropolitan and learned by experience some of the difficulties associated with advocacy around the use of that repository (both for OA research and OER), I have become all too aware “that repositories are ‘lonely and isolated’; still very much under-used and not sufficiently linked to other university systems”. So said JISC’s Andy McGregor at an event called “Learning How to Play Nicely: Repositories and CRIS” in May 2010 at Leeds Metropolitan (see my report for Ariadne here). This quote is still relevant, though perhaps a little less so than when I heard it nearly 2 years ago, thanks to the ongoing work of JISC and particularly the RSP. In any case, the event was a revelation for me and I have coveted a so-called Current Research Information System (or CRIS for short) ever since!

And now, in Symplectic Elements, I have one…or at least the components of one.

The finished tangram? (click on image for full size)

It’s a puzzle though. A tangram if you will…one with considerably more than seven pieces:

intraLibrary, Symplectic, institutional website, University Research Office (URO), faculty research administrators, The Research Excellence Framework (REF), academic staff, web-developers, bibliographic information, research outputs, Open Educational Resources (OER)…

In fact, this may well not be all the pieces…pretty sure a few have been pushed down the back of the settee. I’ll look for them later.

Anyway, tortured metaphors aside, I have become increasingly aware that working in a large institution, in a role that encompasses technology and institutional policy (though I’m not, by any means, a policy maker…or indeed a real techie) is largely about communication and getting the right people, with the right skills, in the right place at the right time! Absorb policy and technical requirements from senior stakeholders and communicate those requirements to the proper techies – while also trying to ensure any motivating passions of one’s own don’t get lost along the way – Open Access to research and Open Education in my case.

For various reasons, individual user accounts have never been implemented for our repository and historically it has been administered centrally from the Library. In Symplectic we now have a system that is populated with central HR data; all staff will have an account they can access with their standard user name and password from where they can manage their own research profile including uploading full-text outputs directly to the repository*. In addition, administration by the University Research Office and faculty research administrators will be more easily centralised (particularly for the REF).

* In actual fact this functionality is not yet available, pending development work from Intrallect to capture the Atom feed from Symplectic and transform it with XSLT to a suitable format for intraLibrary. I think.

One of the clever bits of functionality used to sell the software is automatic retrieval of bibliographic data from online citation databases – we are currently running against various APIs: Web of Science (lite), PubMed and arXiv. I think this may actually be a bit of a red herring for an institution like Leeds Metropolitan, though, at least until more (preferably free) data sources are available (JournalTOCs API please!). Early testing has shown that, at best, it will only retrieve a subset of (the types of) outputs that we will need to record. It will therefore be necessary to manually import existing records (e.g. EndNote) as well as implementing other administrative procedures at faculty level to capture information at the point of publication, especially for book-items, monographs, conference material, reports and grey literature.

More important, I think, to ensure that academic staff actually engage with the software rather than just seeing it as a tool for administrators, is to re-use the data to generate a list of research outputs – a dynamic bibliography – on a personal web-profile which has the potential to dramatically increase the visibility of research including Open Access to full-text.

Developing staff profiles of this type has been something of an obsession of mine for a while; we explored doing so from the repository (using SRU and email address as a Unique Identifier) and did develop a working prototype. Symplectic, however, integrated with central HR data and with its more sophisticated API, should make it much easier, at least from a technical perspective, and we are currently liaising with the central web-team to develop something similar to this example from Keele University – http://www.keele.ac.uk/chemistry/staff/mormerod/ (like us, Keele run Symplectic alongside intraLibrary.)
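For anyone unfamiliar with SRU, the repository-based prototype boiled down to requesting a URL along the lines of the one built in this minimal Python sketch. The parameter names (`operation`, `version`, `query`, `maximumRecords`) are standard SRU; the endpoint address and the CQL index used for email address are assumptions that would depend on how the repository's SRU server is configured.

```python
from urllib.parse import urlencode

# Hypothetical endpoint and CQL index name; both depend on how the
# repository's SRU server is configured.
SRU_BASE_URL = "http://repository.example.ac.uk/sru"
EMAIL_INDEX = "rec.authorEmail"


def build_sru_url(email, max_records=20):
    """Build an SRU searchRetrieve URL for one member of staff."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        # CQL query: match records whose author email equals the given one.
        "query": f'{EMAIL_INDEX} = "{email}"',
        "maximumRecords": str(max_records),
    }
    return f"{SRU_BASE_URL}?{urlencode(params)}"


url = build_sru_url("a.n.other@example.ac.uk")
```

The response is XML, so the resulting records can be parsed and listed on a profile page in much the same way as the output of any other API.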

N.B. From the Symplectic interface, a user is able to “favourite” a research record and a flag comes out in the xml from the API which I understand is used on this page to display “Selected Publications”. DOI is also available from the API to link to the published version and if a user uploads full-text to the repository from Symplectic, this link is also in the xml – the first two records on this page include links to the full-text in Keele’s intraLibrary repository.

Our own Library web-dev Mike Taylor has been looking at the Symplectic API in detail and has put together a couple of prototype pages on a development server and after a meeting this week with a representative of the central web-team I’m reasonably confident we can move forward with this work fairly quickly…though there’s still a bit of a chicken & egg situation in populating the Symplectic database to then be re-surfaced via the API in this way.

There is also the question of whether we might alter our repository policy to become full-text only. One limitation of repositories across UK HE, given an original conception (in the arXiv mould) of holding, disseminating and preserving full-text research outputs, is that they have in effect become “diluted” by metadata records for which it has not (yet) been possible to procure full-text or where copyright does not permit deposit; “hybrid” repositories like ours, of full-text and metadata, typically contain more metadata records than full-text (see figures from the RSP survey here). As I have argued on the UKCoRR blog, I think it makes sense to separate a bibliographic database (in Symplectic) from full-text only in a repository.

N.B. As Symplectic does not have the same search functionality as the repository, this approach has the potential disadvantage that it makes it more difficult to search across the entire corpus of research records, though one potential solution may be along the lines of that implemented by City Research Online which, in my view, is rapidly becoming an exemplar of a research management system (Symplectic) + full-text repository (EPrints). Another good example is St Andrews (PURE + DSpace), who presented a case study at “Learning How to Play Nicely: Repositories and CRIS” (video here).

And what of OER? Along with our EasyDeposit SWORD interface, using OER to resource the refocus of the undergraduate curriculum, and the soon to be released intraLibrary 3.5 that will enable us to harvest OER from other repositories…for now I think they may be the bits down the back of the settee…

Anti-Green OA propaganda?

I shall present this without comment for now:

“The second model is known as the ‘Green Road’. It might be described as “no one pays” and thus is unlikely to be sustainable. The basic idea is that in response to the demand for public access, research funders mandate grantees to post articles for free access, on publication or after an embargo period. There are two obvious problems with this policy. Making available copies for free access will undermine the economic base of the publication. If much of the contents of a journal, albeit in an inferior version, can be found over the internet within, say, six months of publication why should a library continue to subscribe? In addition, once the publisher taken the article through a process of selection and improvement supported by peer review, it has a copyright interest in the final version. Not sufficiently widespread yet to undermine paid circulations, the Green Road could become a serious problem: we could land up with several versions of an article available on repositories with no proper stewardship, and libraries will be more inclined to cancel subscriptions.”

Bob Campbell, Senior Publisher and Cliff Morgan, VP Planning and Development – Wiley-Blackwell

http://blogs.wiley.com/publishingnews/2010/12/22/scholarly-communication-the-future-for-academic-authors/

British Library special collection: ‘Race’, Ethnicity and Sport

Hylton, K. (2008) 'Race' and Sport: Critical Race Theory. Routledge.

Dr. Kevin Hylton, Course Leader – MA Sport, Leisure and Equity here at Leeds Met, is working with the British Library to assemble a special collection of material around ‘Race’, Ethnicity and Sport.  Dr Hylton has already collaborated with the British Library on their website Sport & Society – the Summer Olympics and Paralympics through the lens of Social Science which includes a synopsis of his book ‘Race’ and Sport: Critical Race Theory published by Routledge and which “takes on the controversial subject of racial attitudes in sport and beyond. With sport as his primary focus, Hylton unpacks the central concepts of race, ethnicity, social constructionism and racialisation, and helps the reader navigate the complicated issues and debates that surround the study of race in sport.”

The new collection will be archived at www.webarchive.org.uk which, under the auspices of the BL, aims “to collect and permanently preserve the UK web” – more info here – and the Public Call states that “we hope that the ‘Race’, Ethnicity and Sport Collection will provide a valuable resource for researchers now and in the future.”

As far as I understand, Dr. Hylton is currently at the stage of identifying suitable material for the archive and asked me whether it was possible to cross-search UK Institutional Repositories to discover relevant full-text research material in this area (having, on numerous occasions, had the [mis]fortune to hear my advocacy on Open Access and repositories!). As far as I am aware there are two services currently available – the UK Institutional Repository Search from MIMAS and the custom Google Search at OpenDOAR (I’d be interested to know of any others). Some preliminary searches yielded a few relevant results, though there is no way of specifying full-text only, of course, which means many results are bib records only.

It’s perhaps still a moot point whether there is real value in a fully functional IR cross-search tool (in the style of http://rian.ie/en for Irish repositories), and the MIMAS and OpenDOAR tools are described respectively as “demonstrator” and “beta”. As Dr. Hylton’s interest suggests, though, I’m inclined to think that such a tool, properly promoted and combined with a fully realised system of Green OA, would indeed benefit the academic community, especially since Google abandoned support for OAI-PMH. I do think it would be necessary, somehow, to be able to filter by full text, however, which perhaps keeps the idea moot for now…

In the meantime, if anyone does have appropriate full text material archived in their repository please let us know and/or pass the call on to interested colleagues.

How to Build a Case for University Policies and Practices in Support of Open Access

Briefing paper written by Alma Swan and Frederick Friend on behalf of JISC:

http://www.jisc.ac.uk/media/documents/publications/programme/2010/howtoopenaccessfinal.pdf

Repository Steering Group meeting: 22nd July 2008

The staff development festival in September is a unique opportunity to promote the repository and our agenda for yesterday’s meeting aimed to get some much needed input from the steering group before the quiet month of August.

Item 1. Recap of previous meetings:

Documentation approved.

Item 2. Update on progress with intraLibrary

2a. Configuration:

Search interface (SRU):

Getting the search interface online is the first priority – my request for the server is still pending with IMTS but I hope we can install the IRISS interface as is within the next few weeks (JohnG is installing it on a local server as we speak, which can then be transferred to our Leeds Met domain when it is available) and I think it will be straightforward to switch the CSS to get a very rough Leeds Met branding.

Content structure:

This is also crucial and needs to be put in place ASAP. Several members of the group expressed the opinion that it should not be based on faculties, which tend not to be fixed entities within the university; it was also thought that such a schema would not reflect institutional emphasis upon cross-disciplinary research. There was consensus that organisation at the top level should be by content type (i.e. Research/Learning Objects) but exactly what hierarchy should be employed beneath is still not clear (Library of Congress Subject Headings?). We also need to make a decision on what other material types will be accommodated in the prototype (e.g. Dissertations and Theses).

Landing screen:

Technical challenges aside, the current conception of the landing screen is that it will essentially use the same template as the search interface i.e. it will be branded the same and share the same look and feel; it will also share some of the same functionality and link back ‘home’ to the search interface.

Given the close relationship between these configuration issues, a sub-group was identified that will liaise as necessary to develop the content structure; branding; look and feel; usability and will also inform the technical development of the additional functionality.

2b. Policies:

The group was briefed on the types of policies that need to be developed (see last post) with emphasis on the fact that the ‘standard’ institutional repository policies may be insufficient for our requirements given our wider remit (i.e. not just research outputs). A sub-group was identified that will liaise as necessary to develop suitable policies.

2c. URL:

The suggestion mooted – repository.leedsmet.ac.uk – was deemed suitable by the group

Item 3. Content for the repository:

To discuss method of contacting researchers / research active staff and soliciting content

Review of draft correspondence for research active staff and discussion of when this would most usefully be disseminated; consensus that it would have the greatest impact some time after the staff development festival. Content was broadly approved though it was suggested that greater emphasis be placed on the benefits of OA to citation and the increased importance of citation under proposals for REF (to replace RAE).

Emphasis was placed on the need to identify and recruit interested parties within specific faculties/research groups to help drive the advocacy process to the wider community; liaison with University Research Office for appropriate contact lists.

(NB. This is an ongoing process that is already underway but will increase in profile with the implementation of the prototype system.)

The Staff development festival confirmed as a key opportunity.

There was discussion whether content would be full text only or would also comprise citation of material that we do not have copyright permission to make available as full text (i.e. bibliographic reference only). Given that including such material will enable us to ‘hit the ground running’ and considering the increasing importance of citation data/bibliometrics for the RAE / REF the consensus was that citations should be included at the outset.

Item 4. Authentication

It was emphasised to the group that we can be fully functional as a mediated repository without the need for authentication in the first instance.

A representative from IMTS was able to inform the discussion in the light of recent feedback from Intrallect and will continue to liaise as necessary.

Item 5. Integration with other Leeds Met systems

In light of the decision to include citations as well as full text, an important early integration will be with SFX such that citations in the repository can incorporate a link to Leeds Met holdings of subscribed material; hardly Open Access as it will only be available to authenticated staff and students but will offer another local route to that material and can also be used to generate data on OA friendly publishers and perhaps to raise awareness of OA.
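As an illustration of what such a link might look like, here is a minimal Python sketch that turns a citation into an OpenURL-style query for a link resolver. The field names (`genre`, `issn`, `date`, `volume`, `issue`, `spage`, `atitle`, `title`) follow the common OpenURL 0.1 convention; the resolver address is invented, and the exact set of fields our SFX instance needs is an assumption still to be confirmed.

```python
from urllib.parse import urlencode

# Hypothetical resolver address; the real SFX base URL is institution-specific.
SFX_BASE_URL = "http://sfx.example.ac.uk/sfx_local"


def build_openurl(citation):
    """Turn a citation dict into an OpenURL 0.1-style link for the resolver."""
    # Keep only the fields that are present so the link stays short.
    keys = ("genre", "issn", "date", "volume", "issue", "spage", "atitle", "title")
    params = {k: citation[k] for k in keys if citation.get(k)}
    return f"{SFX_BASE_URL}?{urlencode(params)}"


link = build_openurl({
    "genre": "article",
    "issn": "1234-5678",
    "date": "2008",
    "volume": "12",
    "spage": "45",
    "atitle": "An example article",
})
```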

The PowerLink to X-stream should also be a priority such that it is operational at the earliest opportunity.

NB. Precise functionality of the PowerLink still needs to be determined.

Other systems flagged up for integration were iTunesU and the streaming server; pending investigation!

The next meeting of the steering group will take place after the staff development festival, probably late September/early October.

Technology and learning day

Dawn and me at the TEL Day 03/06/08

Appealing to people’s acquisitive natures, Dawn and I offered a small incentive to encourage people to complete our questionnaire at the TEL day on 3rd June (the lucky winner has been informed so I’m afraid if you haven’t heard then it wasn’t you!)

We ended up with 20 respondents and, from my perspective, some interesting preliminary data; reassuring in that it suggests there is already a good awareness about the yet to be implemented Leeds Met repository and also a reasonable general knowledge about Open Access with only 4 of the 20 respondents professing ignorance about the project and 14 saying they have “some” (12) or “good” (2) awareness of OA.

I realise this is a small sample and that attendees at this event may well be better informed about new technological initiatives within the university than the academic population at large, but it is nevertheless encouraging to know that there is a kernel of folk for whom these ideas aren’t entirely new, with almost half (9) of respondents also being familiar with publisher self-archiving policies.

Just one of our respondents had actually submitted an article to an Open Access repository; hopefully this number will increase dramatically once they have an institutional repository of their own!

The penultimate question in the OA section of the questionnaire focussed on 6 potential benefits of Open Access and asked people to rank each of them from 1 (not important) to 5 (important). For the purposes of summary here I am treating ranks 1 and 2 as “not important”, rank 3 as “of medium importance” and ranks 4 and 5 as “important”. The full spreadsheet is available here.

a. Public have access to research they have helped fund through taxation

15 respondents considered this important; 4 respondents considered it of medium importance; 1 did not respond

b. Teachers/students have access to key resources without subscription barriers

18 respondents considered this important; 1 respondent considered it of medium importance; 1 respondent did not consider it important

c. Maximise research impact/increase citation of your work

12 respondents considered this important; 5 respondents considered it of medium importance; 3 respondents did not consider it important

d. Increased return on investment for funding bodies

10 respondents considered this important; 8 respondents considered it of medium importance; 2 respondents did not consider it important

e. Scholars in economically disadvantaged areas of the world (eg. developing countries) have greater access to published research

17 respondents considered this important; 2 respondents considered it of medium importance; 1 respondent did not consider it important

f. Reduced economic constraints on institutional libraries that can currently afford to subscribe to a relatively small sub-set of published research

17 respondents considered this important; 2 respondents considered it of medium importance; 1 respondent did not consider it important
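For anyone wanting to repeat the banding used in the summaries above against the spreadsheet, the grouping of ranks can be sketched in a few lines of Python. The ranks in the example are illustrative only (they are not the real responses); they are simply chosen so that the totals match the summary for benefit (a).

```python
from collections import Counter


def bucket(rank):
    """Map a 1-5 rank onto the three bands used in the summary."""
    if rank is None:
        return "no response"
    if rank <= 2:
        return "not important"
    if rank == 3:
        return "medium importance"
    return "important"


def summarise(ranks):
    """Count how many responses fall into each band."""
    return Counter(bucket(r) for r in ranks)


# Illustrative ranks only (not the real spreadsheet), chosen so that the
# totals match the summary for benefit (a): 15 important, 4 medium and
# 1 respondent who did not answer.
example_ranks = [5] * 9 + [4] * 6 + [3] * 4 + [None]
summary = summarise(example_ranks)
```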

The final question in the OA section asked:

“In the course of your online research, how frequently do you encounter resources that you are unable to access (eg. LeedsMet does not subscribe to the resource)?”

For half of respondents (10) this is a problem “occasionally” with 7 encountering it more frequently; only 3 respondents said this was “hardly ever” a problem for them.

This brief questionnaire is just a staging post on the advocacy journey but it has certainly been a useful exercise; aside from the data itself, both Dawn and I need volunteers from the university community to become actively involved in our respective development and evaluation processes and many of our respondents indicated their willingness to do just that. I hope that from this small but interested kernel we can begin to reach out to others, spreading knowledge and enthusiasm for Open Access and the Leeds Met repository as we go.

For a summary of the PERSoNA section of the questionnaire see PERSoNA NEWS

What is the actual proportion of journal publishers in the SHERPA RoMEO database?

This is a question that was raised at the recent CRI seminar and as promised I’ve done a bit of digging. Well, anyway, I emailed the fellow at SHERPA so thank you to Bill Hubbard who I paraphrase/quote here.

First of all, there are 386 academic publishers accounted for in the SHERPA RoMEO database but it is very difficult to establish what proportion of the world’s publishers this figure actually represents. Moreover, having Elsevier in the database is obviously more significant than some obscure specialist publisher that may only publish a single title a year; such publishers, of course, come and go all the time.

It is similarly difficult to give a meaningful figure for the actual number of academic journals published by those 330+ publishers and estimates vary (wildly!) between 14,000 and 28,000, but many of those will be of extremely limited circulation, specific to a country etc.

It is currently estimated that there are 8-9,000 journals covered by RoMEO although this does vary. It is based on a combination of British Library holdings and publishers’ outputs. And of course publishers acquire new titles, take over other companies, sell off titles, new ones start, etc. every week.

So what, then, is Bill’s conclusion?

“I don’t think you can get better than saying the majority – and the vast majority of titles that are of interest to UK researchers.”