Using Yahoo Pipes to redirect to Open Search metadata page instead of intraLibrary public URL (and the power of Twitter)
December 21, 2009 5 Comments
First of all, a big thank you to Owen Stephens – @ostephens – who responded to my musing tweets on RSS by assembling a Pipe that “Rewrites Intralibrary RSS feed to use ‘link’ to metadata rather than object”; a great example of the power of Twitter for anyone who still thinks it’s an exercise in pointless self-revelation, full of trivial noise. As Amber Thomas – @ambrouk – put it recently, and as I also tend towards, Owen exemplifies “whole person Tweeting”: not restricting our interaction on Twitter to our professional sphere but filling it with more personal and sociable “noise” – the closest thing to a virtual office you will find. I’ve never met Owen in real life but I shall certainly buy him a pint if our paths ever do cross!
As anyone who has passed by these parts before will know, we have been wrestling with intraLibrary for about two years now to develop a blended repository of Leeds Met’s research output (both Open Access full text and citation only) and Teaching & Learning material (both Open Educational Resources and material for federated access only). We have also spent a lot of time developing the IRISS SRU interface as a front end to provide appropriately differentiated Open Access to the different types of resources.
One of the simplest ways for a repository to alert users to new content is via RSS and it is very easy to generate a feed for pretty much any criteria in intraLibrary; I have generated several feeds for both research collections (by faculty) and for OER. There are, however, two main issues with these feeds:
- The first problem is associated with the metadata template I have implemented for research and the lack of flexibility to customise which fields are exposed via RSS – I haven’t yet got a solution to this issue.
- The second problem, however, arises because the URL exposed by the feed points to the public URL generated by intraLibrary, whereas I need it to point to the Open Search metadata page, and this is where Yahoo Pipes can come in.
I don’t have much experience with Pipes but it is billed by Yahoo as “a free online service that lets you remix popular feed types and create data mashups using a visual editor. You can use Pipes to run your own web projects, or publish and share your own web services without ever having to write a line of code.”
Owen’s pipe has three components:
- “Fetch feed” which is simply the intraLibrary generated RSS feed
- “Regex” which applies a regular expression to an item attribute. In this case it takes the components of the public URL in item.guid.content – oai:com\.intralibrary\.leedsmet:(.*) – and replaces them with the components to build the URL of the Open Search metadata page – http://repository.leedsmet.ac.uk/main/view_record.php?identifier=$1&SearchGroup=Open+Educational+Resources
- “Rename” which does what it says on the tin and simply renames or copies item attributes – in this case item.link becomes objectlink and item.guid.content becomes link
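For anyone who prefers to see the three steps as code, here is a rough Python sketch of what the “Regex” and “Rename” modules do to a single feed item. It is only an illustration, not how Pipes itself is implemented: the item is a plain dictionary standing in for a fetched feed entry, the function name rewrite_item and the example public URL are made up, and Python writes the captured group as \1 where Pipes uses $1.

```python
import re

# Pattern and replacement taken from the "Regex" module described above.
GUID_PATTERN = re.compile(r"oai:com\.intralibrary\.leedsmet:(.*)")
METADATA_URL = (
    "http://repository.leedsmet.ac.uk/main/view_record.php"
    r"?identifier=\1&SearchGroup=Open+Educational+Resources"
)


def rewrite_item(item):
    """Return a copy of a feed item that links to the metadata page.

    `item` is a hypothetical dict with "link" and "guid" keys, standing in
    for one entry of the intraLibrary RSS feed.
    """
    out = dict(item)
    # "Regex" step: turn the OAI identifier into the metadata page URL.
    out["guid"] = GUID_PATTERN.sub(METADATA_URL, item["guid"])
    # "Rename" step: keep the old link as objectlink, copy guid to link.
    out["objectlink"] = out.pop("link")
    out["link"] = out["guid"]
    return out


if __name__ == "__main__":
    example = {
        "title": "An example resource",
        "link": "http://example.org/public/object",  # made-up public URL
        "guid": "oai:com.intralibrary.leedsmet:1673",
    }
    print(rewrite_item(example)["link"])
```

The original public URL is kept under objectlink rather than thrown away, which mirrors the way the “Rename” module is set up in the Pipe.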
The Pipe Output can be subscribed to as RSS which gives us a feed that does indeed link to the Open Search metadata page rather than the intraLibrary public URL:
http://pipes.yahoo.com/pipes/pipe.run?_id=5c085e83cb144f9a1796558fa7d6d253&_render=rss
Simple when you know how!
N.B. Some of these links actually DON’T work and I haven’t yet been able to figure out quite why. As far as I can tell the affected resources were all uploaded as part of the Reproduce project and the links are to non-existent unique identifiers e.g. http://repository.leedsmet.ac.uk/main/view_record.php?identifier=1432&SearchGroup=Open+Educational+Resources . I think it may be because some of these records seem to have been ascribed 2 unique identifiers – this is an automatic process in intraLibrary and configured as uneditable so I’m not sure how it has happened. However, they were originally uploaded by another user before the user profiles and metadata template for ukoer were fully configured and I may need to delete and re-upload as I have done already with http://repository.leedsmet.ac.uk/main/view_record.php?identifier=1673&SearchGroup=Open+Educational+Resources – I’m not sure how/when this will be updated in the feed and currently it’s still linking to the non-existent UID; it may take a little time to update.
Anyway, I’ve copied @ostephens’s Pipe – hope you don’t mind Owen, I couldn’t find any rights information(!) – and replaced the feed with one of my research feeds – Carnegie Faculty of Sport and Education – and modified the “Regex” module to redirect appropriately by replacing oai:com\.intralibrary\.leedsmet:(.*) with http://repository.leedsmet.ac.uk/main/view_record.php?identifier=$1&SearchGroup=research.
N.B. As mentioned, there is an additional problem associated with the research metadata template and the lack of flexibility to customise which fields are exposed via RSS. The only way I have been able to accommodate all of the information required for research (using EPrints software as a template) within the intraLibrary metadata schema (UK LOM) is by using multiple description fields as extensions. Ideally I would like the abstract to be exposed via RSS, but this is in the second description field, whereas it is the bottommost description field that is exposed via RSS, and this generally records whether the resource is refereed or not. This is not terribly relevant, so I’ve added a Mapping to the “Rename” module to remove it, which means the feed now exposes just the title, which does indeed link to the Open Search metadata page:
http://pipes.yahoo.com/pipes/pipe.run?_id=286bbb1d8d30f65b54173b3b752fa4d9&_render=rss
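Again purely as an illustration, the research version of the Pipe could be sketched in Python along these lines. The helper names metadata_link and research_item are invented for the example, the item is a plain dictionary rather than a real feed entry, and dropping the "description" key is my stand-in for the extra Mapping in the “Rename” module.

```python
import re
from urllib.parse import quote_plus

GUID_PATTERN = re.compile(r"oai:com\.intralibrary\.leedsmet:(.*)")
BASE_URL = "http://repository.leedsmet.ac.uk/main/view_record.php"


def metadata_link(guid, search_group):
    """Build the Open Search metadata page URL for an OAI identifier.

    Returns the guid unchanged if it does not match the expected pattern.
    """
    match = GUID_PATTERN.fullmatch(guid)
    if match is None:
        return guid
    return (
        f"{BASE_URL}?identifier={match.group(1)}"
        f"&SearchGroup={quote_plus(search_group)}"
    )


def research_item(item):
    """Rewrite one (hypothetical) research feed item.

    The description is dropped because the field exposed via RSS only
    says whether the resource is refereed.
    """
    out = {key: value for key, value in item.items() if key != "description"}
    out["objectlink"] = out.pop("link")
    out["link"] = metadata_link(item["guid"], "research")
    return out
```

Passing the search group as a parameter means the same helper covers both feeds: metadata_link(guid, "Open Educational Resources") produces the OER form of the URL, since quote_plus turns the spaces into plus signs.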
I’ve also become aware of another issue with our feed through correspondence with the Xpert project at Nottingham University; the RSS URL generated by intraLibrary has a dynamic element or token which means that the URL is different each time it is generated, so every time Xpert harvests the feed they see these URLs as new resources, whereas they are actually duplicates. Might it be that submitting the RSS feed for the Pipe instead of that generated by intraLibrary will also solve this problem?
There’s only one way to find out – no, fans of Harry Hill, not a fight – I was thinking just of resubmitting the new feed…
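To make the duplicate problem concrete, here is a toy Python sketch of a harvester that remembers what it has seen by URL. I don’t know how Xpert actually identifies resources, so the harvest function, the token parameter in the made-up example.org URLs and the idea of keying on the link are all assumptions for illustration only.

```python
def harvest(seen, feed_items, key):
    """Return the items whose `key` value has not been seen before.

    `seen` is a set of previously harvested values and is updated in place.
    This is a guess at how a simple harvester might spot new resources.
    """
    new_items = [item for item in feed_items if item[key] not in seen]
    seen.update(item[key] for item in new_items)
    return new_items


# The same resource as it might appear in two successive harvests: the
# token in the link changes each time, the rewritten link does not.
# Both example.org URLs are invented for this sketch.
first_harvest = [{
    "link": "http://example.org/open?id=1673&token=abc",
    "stable_link": "http://repository.leedsmet.ac.uk/main/view_record.php"
                   "?identifier=1673&SearchGroup=research",
}]
second_harvest = [{
    "link": "http://example.org/open?id=1673&token=xyz",
    "stable_link": first_harvest[0]["stable_link"],
}]
```

Harvesting both lists keyed on "link" reports the resource as new twice, whereas keying on "stable_link" reports it only once. Whether the Pipe feed behaves like the stable case for Xpert is exactly what resubmitting should tell us.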
I have also summarised another issue arising from intraLibrary’s lack of flexibility to customise which fields are exposed via RSS on Lorna Campbell’s blog-post about OER, RSS and JorumOpen which examines the possibility of bulk deposit by ingest of RSS feeds.
Note: After a little experimentation, Mike has actually put together a basic prototype that does include dc:date and a dc:rights
Cheers Nick – you are very welcome to use/copy the pipes I’ve done of course 🙂
I don’t know if this helps on the metadata front, but I can see a way of using the existing RSS feeds in combination with OAI-PMH to enhance the RSS feed with the relevant DC fields. You can see the start of the pipe work for this if you go to http://pipes.yahoo.com/ostephens and look for the pipe called “IntraLibrary RSS feed enhanced via OAI to include DC fields”
The basics of what I’m trying to do seem to work – that is, I can get the relevant DC fields populated – however, when I choose ‘RSS’ as the format on the Pipe output it gives me RSS 2.0 only, which isn’t able to include the enhanced DC metadata – I need RSS 1.0 for that. The Yahoo Pipes documentation says it supports RSS 1.0 as well, but at a brief look I can’t quite work out how. I’ll try to have a look when I get some more time.
Have I got the right end of the stick in terms of the problem you need to solve for the metadata?