Separate HTML pages for individual records

I’m returning here to an old theme that is still nagging away at the back of my mind, and one that I think needs further exploration as the functionality of the SRU interface develops, both through the work Mike and I are doing and through Intrallect’s ongoing development of the research repository aspect of intraLibrary.

Can we generate individual HTML pages for records, so that a search query produces a list of hyperlinks pointing to those pages rather than to the location URL stored in intraLibrary, as is currently the case?  This would more closely approximate the way that EPrints and DSpace work and could potentially solve the Google problem by providing an easily indexable page of static HTML for search engine spiders to crawl.  Could these pages also have nice, short, human-readable URLs instead of convoluted search strings or machine-generated public URLs from intraLibrary, again more like EPrints/DSpace?  Currently the only ways I can give a link to an item are:

http://repository.leedsmet.ac.uk/main/search.php?q=promoting+open+access+to+research&x=22&y=26&exacttext=1

(The SRU search string that will provide the metadata)

Or

http://repository-intralibrary.leedsmet.ac.uk/IntraLibrary?command=open-preview&learning_object_key=i05n27905t

(The machine-generated public URL for the actual PDF)
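
To make the idea concrete, here is a rough sketch in Python of what "one static page per record, with a short URL" could look like. Everything in it is an assumption for illustration: the record fields (`title`, `authors`, `location_url`), the `/record/` path and the `repository.example.org` host are all made up, and nothing here reflects how intraLibrary or our SRU front end actually works.

```python
import html
import re


def slugify(title: str) -> str:
    """Turn a record title into a short, lowercase, hyphen-separated slug."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower())
    return slug.strip("-")


def record_url(title: str, base: str = "http://repository.example.org/record/") -> str:
    """Build a human-readable URL. The base is a placeholder, not a real address."""
    return base + slugify(title)


def render_record_page(record: dict) -> str:
    """Render one metadata record as a minimal static HTML page."""
    # Field names are hypothetical; escape everything that ends up in the markup.
    title = html.escape(record["title"])
    authors = html.escape(", ".join(record.get("authors", [])))
    full_text = html.escape(record["location_url"], quote=True)
    return (
        "<!DOCTYPE html>\n"
        f"<html><head><title>{title}</title></head>\n"
        f"<body><h1>{title}</h1>\n"
        f"<p>{authors}</p>\n"
        f'<p><a href="{full_text}">Full text</a></p>\n'
        "</body></html>\n"
    )
```

A slug built from the title alone would collide whenever two records share a title, so a real version would probably need to append some unique identifier to the URL.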

I’ve recently been adding RSS feeds to http://repos-dev.leedsmet.ac.uk/main/browse.php and another issue (aside from the fact that the wrong field is exposed by RSS) is that these also point to the location URL stored in intraLibrary – the PDF in the case of full text but the published URL in instances where there is a citation only.  It would be much better if these feeds could point at a Leeds Met repository metadata record.
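
As a sketch of what pointing the feeds at record pages might involve, the Python below rewrites each item's `<link>` in an RSS 2.0 feed. It assumes we already have a lookup from item title to record page URL (the `record_urls` dictionary, which is hypothetical), and that is exactly the part I don't know how to produce.

```python
import xml.etree.ElementTree as ET


def point_items_at_records(rss_xml: str, record_urls: dict) -> str:
    """Replace each item's <link> with the record page URL for its title.

    Items whose title is not in record_urls keep their original link.
    """
    root = ET.fromstring(rss_xml)
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        link = item.find("link")
        if link is not None and title in record_urls:
            link.text = record_urls[title]
    return ET.tostring(root, encoding="unicode")
```

Matching on title is only a stand-in here; a unique record identifier would be a much safer key.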

I simply do not have the technical insight to know whether any of this is achievable at all and, if it is, how big a job it will be.

6 Responses to Separate HTML pages for individual records

  1. lms4lwebdev says:

    The main question for this sort of functionality is, ‘Can we query the SRU in such a way as to guarantee that a single record is returned and that the record returned is the record we were expecting?’ Without some form of unique identifier being exposed (or at least query-able) via the SRU I think we’re a bit stuck.

    A query based on record titles might seem like a good idea, but duplicates are only unlikely, not impossible. A query based on a combination of title and author would decrease the likelihood of duplicates further, but this adds unwanted complexity. Even presuming that we have these unique URLs, there is still the issue of finding a way to gather or generate them.

    I agree that this would be the perfect way to (hopefully) solve the Google indexing problem and in fact this concept has been in my head from when we first had issues with Google.

    My best case scenario would be that we could query the SRU with a unique record code (similar to the way we can query collection codes currently). There would also be a way to routinely extract these codes along with the record titles (as a minimum), as a feed or otherwise.

    Essentially, I think that it’s not particularly practical to attempt this currently, unless there are features of intraLibrary that we are unaware of.
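
For illustration only, the best case scenario described above might look something like the following Python sketch: build an SRU `searchRetrieve` request for a unique record code, then check that the response reports exactly one record. The index name `rec.identifier`, the base URL passed in and the use of SRU version 1.1 are all assumptions; whether intraLibrary exposes any such index is precisely the open question.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespace used by SRU 1.1/1.2 response documents.
SRW_NS = "{http://www.loc.gov/zing/srw/}"


def build_record_query(base_url: str, record_code: str,
                       index: str = "rec.identifier") -> str:
    """Build a searchRetrieve URL asking for one record by its unique code."""
    # Escape backslashes and double quotes so the code is a valid CQL string.
    safe_code = record_code.replace("\\", "\\\\").replace('"', '\\"')
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": f'{index}="{safe_code}"',  # index name is an assumption
        "maximumRecords": "2",  # ask for two so duplicates would show up
    }
    return base_url + "?" + urlencode(params)


def is_single_record(response_xml: str) -> bool:
    """Return True only if the SRU response reports exactly one match."""
    root = ET.fromstring(response_xml)
    count = root.findtext(f"{SRW_NS}numberOfRecords")
    return count is not None and count.strip() == "1"
```

If `is_single_record` came back false for a given code, the page generator would have to skip or flag that record rather than guess which result was meant.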
