Research records – filling the gaps with Google Scholar + Zotero

The stated aim of our Symplectic implementation – and its integration with the repository – is to make it easier to maintain a constant, up-to-date picture of research activity across the University. Historically, however, research management has been somewhat variable across the institution. Frankly, I knew this already: the repository had become the de facto research management tool, but is itself far from comprehensive. Nor are the automatic data sources (Web of Science, Scopus and PubMed) likely to solve the problem, as their results vary depending on, for example, the subject area and type of publication. I have also been importing existing records from EndNote libraries where they exist, but large swathes of research from the past 10 years or so are still missing, especially less formal publications, and these are the gaps we are trying to fill.

Other than automated search, the easiest way to get data into Symplectic is by importing RefMan (RIS) or BibTeX files. Google Scholar can export both, but only as single records (so far as I can tell), unless you use Zotero in Firefox…

1. Install Zotero in Firefox – https://addons.mozilla.org/en-US/firefox/addon/zotero/
2. Go into settings in Google Scholar (top right)
3. Bibliography manager -> Check “Show links to import citations into”, select the preferred output (RefMan/BibTeX etc.) and save preferences
4. Now a search in Google Scholar should show a folder icon in the address bar. Click the folder.


5. A small window drops down that shows the Google Scholar citations, with an empty check box in front of each citation
6. Select the citations that you need and click “OK”

7. A small window pops up that indicates the records are being saved into Zotero
8. Open the Zotero window with the icon at the bottom of the browser, where the records should be displayed (you can keep searching and sending additional records to Zotero for eventual export)
9. Highlight (select) the Zotero records that you wish to export. Right-click on the selection and select “Export selected items”

Choose the appropriate format (in my case RIS) and save the file to the desktop with an appropriate name for subsequent import to Symplectic / research management system of choice. Ta da!
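Before importing, it may be worth a quick sanity check of the exported file. The following is a minimal sketch in Python, not part of the Zotero or Symplectic workflow itself: it assumes the export uses the usual RIS layout (a two-character tag, two spaces, a hyphen, then the value, with `ER` closing each record) and that titles, authors and years appear under the tags `TI`/`T1`, `AU`/`A1` and `PY`/`Y1`/`DA`. The function names and the sample records are made up for illustration.

```python
import re
from typing import Dict, List

# Assumed RIS line layout: "TY  - JOUR" (two-character tag, two spaces, hyphen, value).
TAG_LINE = re.compile(r"^([A-Z][A-Z0-9])  - ?(.*)$")

Record = Dict[str, List[str]]


def parse_ris(text: str) -> List[Record]:
    """Split RIS text into records, each mapping a tag to its list of values."""
    records: List[Record] = []
    current: Record = {}
    for raw in text.splitlines():
        line = raw.lstrip("\ufeff").rstrip()
        match = TAG_LINE.match(line)
        if not match:
            # Blank lines and untagged continuation lines are ignored in this sketch.
            continue
        tag, value = match.group(1), match.group(2).strip()
        if tag == "ER":  # end of record
            if current:
                records.append(current)
            current = {}
        else:
            current.setdefault(tag, []).append(value)
    if current:  # tolerate a final record with no closing ER line
        records.append(current)
    return records


def has_value(record: Record, tags) -> bool:
    """True if any of the given tags carries a non-empty value."""
    return any(value for tag in tags for value in record.get(tag, []))


def missing_fields(record: Record) -> List[str]:
    """Name the basic fields (title, author, year) that a record lacks."""
    problems = []
    if not has_value(record, ("TI", "T1")):
        problems.append("title")
    if not has_value(record, ("AU", "A1")):
        problems.append("author")
    if not has_value(record, ("PY", "Y1", "DA")):
        problems.append("year")
    return problems


# Invented sample data, standing in for a file exported from Zotero.
SAMPLE = """TY  - JOUR
TI  - An example article title
AU  - Author, A.
PY  - 2009
ER  - 
TY  - RPRT
TI  - An example report with no author or year
ER  - 
"""

if __name__ == "__main__":
    for number, record in enumerate(parse_ris(SAMPLE), start=1):
        problems = missing_fields(record)
        status = "ok" if not problems else "missing " + ", ".join(problems)
        print(f"record {number}: {status}")
```

To check a real export, read the saved file into a string and pass it to `parse_ris` in place of `SAMPLE`. Any record flagged as incomplete can then be corrected in Zotero before the file goes anywhere near Symplectic.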

Records in Google Scholar aren’t necessarily the most reliable, so care will need to be taken with this process, but it’s certainly worth exploring as a method of filling the gaps in our research records.

Still baffled by Google…

Just reproducing an email to ukcorr-discuss here in case any technically minded folk not on the list might pass by these parts…

To revisit the whole Google Scholar / full-text indexing “thing”: I was just looking at results in GS for a particular academic who has raised a query about his full text not being visible in Google Scholar. He has 6 full-text items in the repository, but a site: search of GS appears to return only 2:

http://scholar.google.co.uk/scholar?hl=en&q=site%3Ahttp%3A%2F%2Frepository-intralibrary.leedsmet.ac.uk+%22x.+font%22&btnG=Search&as_sdt=0%2C5&as_ylo=&as_vis=0

Initially I thought it might be an artefact of when the full text was added. The records were all added at the same time (24th May 2011), but full text was added then for only one of the GS results (plus one that is not indexed at all – see below); for all the others, including the other GS result, it was added in October 2011. Even so, that’s still a good 6 months, which you would think would be long enough to be indexed. Wouldn’t you?

Normal Google, by contrast, returns 4 full-text records:

https://www.google.co.uk/search?hl=en&as_q=&as_epq=xavier+font&as_oq=&as_eq=&as_nlo=&as_nhi=&lr=&cr=&as_qdr=all&as_sitesearch=http%3A%2F%2Frepository-intralibrary.leedsmet.ac.uk%2F&as_occt=any&safe=images&tbs=&as_filetype=pdf&as_rights=
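For anyone wanting to repeat the comparison for another author, the two searches can be put together programmatically. This is only a sketch in Python: the query parameters (`q`, `as_epq`, `as_sitesearch`, `as_filetype`) are copied from the two URLs above, the function names are my own invention, and I am assuming that Google and Google Scholar still honour these parameters in the same way.

```python
from urllib.parse import urlencode

# Repository address as it appears in the searches above.
REPOSITORY = "http://repository-intralibrary.leedsmet.ac.uk"


def scholar_site_search(site: str, phrase: str) -> str:
    """Build a Google Scholar URL for an exact phrase restricted to one site."""
    query = f'site:{site} "{phrase}"'
    return "http://scholar.google.co.uk/scholar?" + urlencode({"hl": "en", "q": query})


def google_pdf_site_search(site: str, phrase: str) -> str:
    """Build a normal Google URL for PDFs containing an exact phrase on one site."""
    params = {
        "hl": "en",
        "as_epq": phrase,       # exact phrase
        "as_sitesearch": site,  # restrict results to this site
        "as_filetype": "pdf",   # full-text PDFs only
    }
    return "https://www.google.co.uk/search?" + urlencode(params)


if __name__ == "__main__":
    print(scholar_site_search(REPOSITORY, "x. font"))
    print(google_pdf_site_search(REPOSITORY + "/", "xavier font"))
```

Opening the two printed URLs side by side gives the same Scholar versus normal Google comparison as above, minus the empty parameters that the advanced search forms add.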

The missing results are http://repository.leedsmet.ac.uk/main/view_record.php?identifier=4881&SearchGroup=Research (full-text added 24th May 2011) / http://repository.leedsmet.ac.uk/main/view_record.php?identifier=4893&SearchGroup=Research (full-text added 10th October 2011).

The only other difference I can spot is that several of those not indexed in GS don’t have metadata in the PDF (which is why normal Google has just picked them up as “Leeds Metropolitan University Repository” from the coversheet)…

As a caveat, there is a technical peculiarity in that we effectively have a two-server setup: our Open Search interface sits on an institutional server and queries intraLibrary by SRU, while the software itself is hosted for us in a server farm somewhere. That might explain idiosyncratic behaviour to some extent…

Am I missing anything else?!