Episode 2, SUNCAT: Notes from initial chat

We talked through use cases for the data: what the benefits are to end users or to libraries.
This is something we will have to formulate with the data in hand. Morag mentions a 2005 document discussing SUNCAT use cases.

One suggested usage is deriving information about the focus and specialisms of libraries, by extending library subject metadata using journal/article subject metadata – so identifying the bent of universities through the holdings of their libraries.
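A rough sketch of how such a subject profile could be computed, assuming we can get a list of journal identifiers held by a library and a lookup from journal to subject terms. The function name, the identifiers and the subject terms here are all invented for illustration; nothing about the real SUNCAT data layout is implied.

```python
from collections import Counter


def subject_profile(holdings, journal_subjects):
    """Return the share of each subject term across a library's holdings.

    holdings: iterable of journal identifiers held by one library (hypothetical).
    journal_subjects: dict mapping journal identifier -> list of subject terms.
    Journals with no known subjects are simply skipped.
    """
    counts = Counter()
    for journal_id in holdings:
        counts.update(journal_subjects.get(journal_id, ()))
    total = sum(counts.values())
    if total == 0:
        return {}
    # Normalise so that libraries of different sizes can be compared.
    return {subject: n / total for subject, n in counts.items()}


if __name__ == "__main__":
    # Invented sample data, not taken from SUNCAT.
    subjects = {"j1": ["medicine"], "j2": ["medicine", "chemistry"]}
    print(subject_profile(["j1", "j2", "j3"], subjects))
```

Normalising to shares rather than raw counts is a design choice: it keeps a large general library from looking like a specialist in everything.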

Another immediate usage is linking bibliographic datasets of journal articles to the journal issues and journal information found in SUNCAT. Medline is a useful example of a dataset that can be integrated; work on Linked, Open Medline metadata is happening through the OpenBiblio project.

SUNCAT holds a record for each institution and its library location; this could helpfully be linked to the OARJ Linked Data for institutions, and to the JISC CETIS PROD work collecting different sources of UK HE/FE information.

Sources in SUNCAT may have an OID which could be re-used as part of a URI. Journals, both electronic and hardcopy, usually (though not always) also have ISSNs.
There are restrictions on re-use of data licensed from the ISSN Network, but one can get some of it from other sources – CONSER is a North-America-focused example, with a bit of a scientific bent (thus useful for Medline).
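As a sketch of what minting identifiers might look like: prefer the OID where there is one, and fall back to an ISSN only if it passes the standard ISSN check-digit test (weights 8 down to 2 over the first seven digits, modulo 11, with "X" standing for 10). The base URI and the path layout are placeholders of mine, and I am treating the OID as an opaque string since we have not yet seen its real format.

```python
from typing import Optional
from urllib.parse import quote

# Hypothetical namespace; the real one would be agreed during the sprint.
BASE = "http://example.org/suncat/journal/"


def issn_check_digit(first_seven: str) -> str:
    """Compute the ISSN check character from the first seven digits."""
    total = sum(int(d) * w for d, w in zip(first_seven, range(8, 1, -1)))
    check = (11 - total % 11) % 11
    return "X" if check == 10 else str(check)


def is_valid_issn(issn: str) -> bool:
    """Accept 'NNNN-NNNC' or 'NNNNNNNC' and verify the check character."""
    compact = issn.replace("-", "").upper()
    if len(compact) != 8 or not compact[:7].isdigit():
        return False
    return issn_check_digit(compact[:7]) == compact[7]


def journal_uri(oid: Optional[str], issn: Optional[str]) -> Optional[str]:
    """Mint a URI from the OID if present, else from a valid ISSN, else None."""
    if oid:
        return BASE + "oid/" + quote(oid, safe="")
    if issn and is_valid_issn(issn):
        return BASE + "issn/" + issn.upper()
    return None
```

Returning None when neither identifier is usable leaves the decision about those records (skip, or use a blank node) to whoever calls it.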

SUNCAT uses OpenURL to search for journal articles and holdings data in institutional libraries. Libraries run an “OpenURL resolver” – often with a bit of proprietary software such as SFX – to map OpenURLs to stuff in their holdings. Would be interesting to find out more about the inside of an OpenURL resolver and how useful a Linked Data rendering of it would be…

Surprised to learn that university libraries often don’t maintain their own subscription database; journals are bought in “bundles” whose contents are shifting, and libraries depend on vendors to sell them back their processed catalogue data.

SUNCAT contains a dataset describing libraries, their affiliations and locations, held in a set of text files. This would be a good place to start with a simple Linked Data model that we can link up to the outcome of the previous LDFocus sprint, and then work on connecting up the library holdings data.
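A first pass at that simple model could be as small as the sketch below, which turns one line per library into N-Triples. We have not looked at the actual text files yet, so the layout assumed here (tab-separated: library id, name, institution id, latitude, longitude) is a guess, and the base URI is a placeholder. The properties are existing ones: rdfs:label, Dublin Core isPartOf for the affiliation, and the W3C Basic Geo lat/long.

```python
import csv
import io

BASE = "http://example.org/suncat/"  # hypothetical namespace
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
PART_OF = "http://purl.org/dc/terms/isPartOf"
GEO = "http://www.w3.org/2003/01/geo/wgs84_pos#"


def literal(value):
    """Quote a plain string as an N-Triples literal."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def library_triples(row):
    """Yield N-Triples lines for one (assumed) five-column library row."""
    lib_id, name, institution_id, lat, lon = row
    subject = f"<{BASE}library/{lib_id}>"
    yield f"{subject} <{RDFS_LABEL}> {literal(name)} ."
    yield f"{subject} <{PART_OF}> <{BASE}institution/{institution_id}> ."
    yield f"{subject} <{GEO}lat> {literal(lat)} ."
    yield f"{subject} <{GEO}long> {literal(lon)} ."


def convert(text):
    """Convert tab-separated text to a list of N-Triples lines."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    # Rows that do not match the assumed layout are skipped, not guessed at.
    return [t for row in reader if len(row) == 5 for t in library_triples(row)]


if __name__ == "__main__":
    # Invented sample row, not real SUNCAT data.
    print("\n".join(convert("lib1\tExample Library\tinst1\t55.9\t-3.2\n")))
```

The institution URIs are where the link to the previous sprint's output would go: swap the placeholder for the matching OARJ institution URI once we have a mapping.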

Starting a separate notepad for SUNCAT links. Should have done this earlier, but have been busy with the new release of Unlock and the wrap-up of the Chalice project.

Preparing for Episode 2: SUNCAT

We’re preparing to start the second Linked Data Focus sprint next week (from May 16th) – working with the developers from the SUNCAT team, who are bibliographic data specialists.

Our notepad from the first sprint has a lot of links to relevant resources – introductions to RDF, tools in different languages, and descriptions of related work around academic institutions and communications.

This presentation by Jeni Tennison from the Pelagios workshop is also worth looking at for sensible advice about taking an existing data model into a Linked Data form. Ian took this sort of approach for the Open Access Repository Junction work – working through the different objects in a relational database model, thinking about how to decorate them with common RDF elements, then creating a vocabulary for the missing pieces. Some of the same questions about publishing and structuring Linked Data should come up; and in the middle of the sprint we’ll hold another Linked Data Learn-in at EDINA.

SUNCAT should have a fair bit more in common with existing Linked Data projects – particularly the JISC-supported OpenBiblio – and we’ll try to make links between SUNCAT-listed publications and some of their metadata. If we can get as far as then linking through to pre-prints in the institutional repositories found in OARJ, then I’ll be entirely satisfied.