Using source identifiers to link data

In the Chalice project we’ve used Unlock Places to make links across the Linked Data web, using the source identifier which appears in the results of each place search. As this might be useful to others, it’s worth walking through an example.

This search for “Bosley” shows us results in the UK from geonames and from the Ordnance Survey 50K gazetteer: http://unlock.edina.ac.uk/ws/nameSearch?name=Bosley&country=uk

Here’s an extract of one of the results, the listing for Bosley in the Ordnance Survey 1:50K gazetteer:

<identifier>11083412</identifier>
<sourceIdentifier>28360</sourceIdentifier>
<name>Bosley</name>
<country>United Kingdom</country>
<custodian>Ordnance Survey</custodian>
<gazetteer>OS Open 1:50 000 Scale Gazetteer</gazetteer>

The sourceIdentifier shown here is the identifier published by each of the original data sources that Unlock Places is using to cross-search.

Ordnance Survey Research re-uses these identifiers to create its Linked Data namespace. For any place in the 50K gazetteer, we can reconstruct the link that refers to that place by appending the source identifier to this URL, which is the namespace for the 50K gazetteer: http://data.ordnancesurvey.co.uk/id/50kGazetteer/

So our reference to Bosley can be made by adding the source identifier to the namespace:

http://data.ordnancesurvey.co.uk/id/50kGazetteer/28360

The same goes for source identifiers for places found in the geonames.org place-name gazetteer.

<sourceIdentifier>2655141</sourceIdentifier>
<name>Bosley</name>
<gazetteer>GeoNames</gazetteer>

Geonames uses http://sws.geonames.org/ as a namespace for its Linked Data links for places. So we can reconstruct the link for Bosley using the source identifier like this:

http://sws.geonames.org/2655141/

Note that the link needs the forward slash on the end to work correctly. If one looks at either of these links with a web browser, one is redirected to a human-readable page describing that place. To see the machine-readable, RDF version of the link’s contents, look at it with a command-line program such as curl, asking to “Accept” the RDF version:

curl -L http://data.ordnancesurvey.co.uk/id/50kGazetteer/28360 -H "Accept: application/rdf+xml"

I hope this is useful to others. We could add the links directly into the default search results, but many users may not be that interested in seeing RDF links in place-name search results. Thoughts on how we could offer this as a more useful function would be much appreciated.

Comments are closed.