ADS use case

Jo and I recently met with Stuart Jeffrey and Michael Charno at the Archaeology Data Service in York, to discuss a putative third CHALICE use case. The ADS is the main repository for archaeological data in the UK, and thus has many potential crossovers with CHALICE, and faces many comparable issues in terms of delivering the kind of information services its users want.

Much of the ADS’s discovery metadata, as far as topography is concerned, is based on the National Monument Record (NMR), and therefore on modern placenames. The ADS’s ArchSearch facility is based on a faceted classification principle: users can come into the system from a national perspective, and use parameters of ‘what’, ‘when’ and ‘where’ to pare the data down until they have a result set that conforms to their interests, with the indexing and classification into facets undertaken by ADS staff during the accession process.
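The ‘what’, ‘when’ and ‘where’ narrowing described above can be sketched roughly as follows. This is only an illustration of the faceted principle, not ArchSearch’s actual implementation: the `Record` class, the `narrow` function and the three sample records are all made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """A hypothetical catalogue record carrying three facets."""
    title: str
    what: str   # type of monument, site or feature
    when: str   # period term
    where: str  # modern placename


def narrow(records, **facets):
    """Keep only the records that match every facet value supplied."""
    return [
        r for r in records
        if all(getattr(r, name) == value for name, value in facets.items())
    ]


# Invented sample data, standing in for a national collection.
RECORDS = [
    Record("Record 1", what="hillfort", when="Iron Age", where="Dorset"),
    Record("Record 2", what="villa", when="Roman", where="Dorset"),
    Record("Record 3", what="villa", when="Roman", where="Kent"),
]

# Start from everything, then pare down one facet at a time.
roman = narrow(RECORDS, when="Roman")
roman_dorset = narrow(roman, where="Dorset")
print([r.title for r in roman_dorset])
```

Each call only ever shrinks the result set, which is the behaviour users rely on when they drill down from a national view.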

In parallel with this, the ADS has experimented with Natural Language Processing (NLP) algorithms to extract place types – types of monument, types of site, types of feature etc. – from so-called ‘grey literature’, employing the MIDAS period terms. The principle of using NLP to build metadata is not in itself unproblematic: many depositors prefer to be certain that *they* are responsible for creating, and signing off, the descriptive metadata for their records. As with other organizations that we’ve spoken to, Stuart noted that georeferencing collections according to county > district > parish can create problems due to boundary changes; also many users do not necessarily approach administrative units in a systematic way. For example, most people would not, in their searching behaviour, characterize ‘Blackpool’ as a subunit of ‘Lancashire’. This throws up interesting structural parallels with what we heard from the CCED project.

Another good example, which the ADS recently encountered, is North Lincolnshire, which is described by Wikipedia as “a unitary authority area in the region of Yorkshire and the Humber in England… [and] for ceremonial purposes it is part of Lincolnshire.” This came up while the ADS was creating a Web service for the Heritage Gateway. It was assumed that users would naturally look for North Lincolnshire in Lincolnshire; however, the Heritage Gateway used the official hierarchy, which put North Lincolnshire in Yorkshire and the Humber. They were working on addressing that in the next version of their interface.
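One way of picturing the North Lincolnshire problem is to let a place carry more than one parent, one per hierarchy, and to let a search consult whichever hierarchies the user is likely to have in mind. The sketch below is an assumption about how that might look, not a description of the Heritage Gateway or ADS services; the `PARENTS` table and `places_within` function are hypothetical names.

```python
# Each place may have a different parent in each hierarchy.
# The two parents for North Lincolnshire are those given in the
# Wikipedia description quoted above.
PARENTS = {
    "North Lincolnshire": {
        "official": "Yorkshire and the Humber",
        "ceremonial": "Lincolnshire",
    },
}


def places_within(area, hierarchies=("official", "ceremonial")):
    """Return places whose parent is `area` in any of the given hierarchies."""
    return sorted(
        place
        for place, parents in PARENTS.items()
        if any(parents.get(h) == area for h in hierarchies)
    )


# Searching both hierarchies finds North Lincolnshire under Lincolnshire;
# searching the official hierarchy alone does not.
print(places_within("Lincolnshire"))
print(places_within("Lincolnshire", hierarchies=("official",)))
```

The design choice here is simply that the user’s expectation, rather than a single authoritative tree, decides which containment relationships count as a match.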

It was strongly agreed that there is a very good case to be made for using CHALICE to enrich ADS metadata with historical variants, and that those wishing to search the collections via location would benefit from such enrichment. This view of things sits well alongside the CCED case (which focuses on connections of structure and georeferencing) and VCH (which focuses on connections between semantic entities). What is interesting is that all three cases have different implications for the technology, costs and research use: in the next three months or so the project will work on describing and addressing these implications.


Discussions with CCED (or how I learned to stop worrying about vagueness and love point data)

I met recently with Prof. Stephen Taylor of the University of Reading. Prof. Taylor is one of the investigators of the Clergy of the Church of England database (CCED) project, whose backend development is the responsibility of the Centre for Computing in the Humanities (CCH). Like so many other online historical resources, CCED’s main motivation is to bring things together, in this case information about the CofE clergy between 1540 and 1835, just after which predecessors to the Crockford directory began to appear. There is, however, a certain divergence between what CCED does and what Crockford (simply a list of names of all clergy) does.

CCED started as a list of names, with the relatively straightforward ambition of documenting the name of every ordained person between those dates, drawing on a wide variety of historical sources. Two things fairly swiftly became apparent: that a digital approach was needed to cope with the sheer amounts of information involved (CD-ROMs were mooted at first), and that a facility to build queries around location would be critical to the use historians make of the resource. There is therefore clearly scope for considering how Chalice and CCED might complement one another.

Even more importantly, however, some of the issues which CCED have come up against in terms of structure have a direct bearing on Chalice’s ambitions. What was most interesting from Chalice’s point of view was the great complexity which the geographic component contains. It is important to note that there was no definitive list of English ecclesiastical parish names prior to the CCED (crucially, what was needed was a list which also followed through the history of parishes – e.g. dates of creation, dissolution, merging, etc.), and this is a key thing that CCED provides, and is in and of itself of great benefit to the wider community.

Location in CCED is dealt with in two ways: jurisdictional and geographical (see this article). Contrary to popular opinion, which tends to perceive a neat cursus honorum descending from bishop to archdeacon to deacon to incumbent to curate etc, ecclesiastical hierarchies can be very complex. For example, a vicar might be geographically located within a diocese, and yet not report to the bishop responsible for that diocese (‘peculiar’ jurisdictions).

In the geographic sense, location is dealt with in two distinct ways – according to civil geographical areas, such as counties, and according to what might be described as a ‘popular understanding’ of religious geography, treating a diocese as a single geographic unit. Where known, each parish name has a date associated with it, and for the most part this remains constant throughout the period, although where a name has changed there are multiple records (a similar principle to the attestation value of Chalice names, but a rather different approach in terms of structure).
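The idea of one record per name, each with its own dates, might be sketched like this. It is an assumption about the general shape of such data rather than CCED’s actual schema: the `NameRecord` fields, the identifier `P1` and the two names are invented placeholders, with the date range simply borrowed from the 1540 to 1835 span of the project.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NameRecord:
    """One name for a parish, with the years it applies (hypothetical)."""
    parish_id: str
    name: str
    start: Optional[int]  # first year the name applies; None if unknown
    end: Optional[int]    # last year the name applies; None if unknown


def names_in_year(records, parish_id, year):
    """Return every name recorded for a parish that covers the given year."""
    return [
        r.name
        for r in records
        if r.parish_id == parish_id
        and (r.start is None or r.start <= year)
        and (r.end is None or year <= r.end)
    ]


# Placeholder data: a parish whose name changed once during the period.
NAME_RECORDS = [
    NameRecord("P1", "Old Name", 1540, 1700),
    NameRecord("P1", "New Name", 1701, 1835),
]

print(names_in_year(NAME_RECORDS, "P1", 1650))
print(names_in_year(NAME_RECORDS, "P1", 1800))
```

Treating an unknown date as open-ended is one possible choice; a stricter reading would exclude undated records from dated queries altogether.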

Sub-parish units are a major issue for CCED, and there are interesting comparisons in the issues this throws up for EPNS. Chapelries are a key example: these existed for sure, and are contained within CCED, but it is not always possible to assign them to a geographical footprint (I left my meeting with Prof. Taylor considerably less secure in my convictions about spatial footprints), at least beyond the fact that, almost by definition, they will have been associated with a building. Even then there are problems, however. One example comes from East Greenwich, where there is a record of a curate being appointed, but there is no record of where the chapel is or was, and no visible trace of it today.

Boundaries are particularly problematic. The phenomenon of ‘beating the bounds’ around parishes only occurred where there was an economic or social interest in doing this, e.g. when there was an issue of which jurisdiction tithes should be paid to. Other factors in determining these boundaries were folk memories, and the memories of the oldest people in the settlement. However, it is the case that, for a significant minority of parishes at least, pre Ordnance Survey there was very little formal/mapped conception of parish boundaries.

For this reason, many researchers consider that mapping based on points is more useful than boundaries. An exception is where boundaries followed natural features such as rivers. This is an important issue for Chalice to consider in its discussion about capturing and marking up natural features: where and how have these featured in the assignation and georeferencing of placenames, and when?

A similar issue is the development of urban centres in the late 18th and 19th centuries: in most cases these underwent rapid changes, and a system of ‘implied boundaries’ reflects the situation then more accurately than hard and fast geolocations.

Despite this, CCED reflects the formal structured entities of the parish lists. Its search facilities are excellent if you wish to search for information about specific parishes whose name(s) you know, but, for example, it would be very difficult to search for ‘parishes in the Thames Valley’, or (another example given in the meeting) to define all parishes within one day’s horse riding distance of Jane Austen’s home, thus allowing the user to explore the clerical circles she would have come into contact with but without knowing the names of the parishes involved.
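If parishes were represented as simple points, the ‘one day’s ride’ kind of query could be approximated by a radius search. The sketch below is an illustration under several assumptions: the coordinates and parish names are invented, the 30 km radius is an arbitrary stand-in for a day’s ride, and straight-line (great-circle) distance ignores roads and terrain.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def parishes_within(points, centre, radius_km):
    """Return the names of all points lying within radius_km of centre."""
    lat0, lon0 = centre
    return sorted(
        name
        for name, (lat, lon) in points.items()
        if haversine_km(lat0, lon0, lat, lon) <= radius_km
    )


# Invented points: name -> (latitude, longitude). Not real parish data.
PARISH_POINTS = {
    "Parish A": (51.1, -1.0),
    "Parish B": (51.0, -0.8),
    "Parish C": (52.0, -1.0),
}

HOME = (51.0, -1.0)   # hypothetical starting point
DAY_RIDE_KM = 30.0    # assumed radius, purely illustrative

print(parishes_within(PARISH_POINTS, HOME, DAY_RIDE_KM))
```

The user never needs to know a parish name in advance: the names fall out of the spatial query, which is exactly what a name-driven search cannot offer.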

At sub-parish level, even the structured information is lacking. For example, there remains no definitive list of chapelries. CCED has ‘created’ chapelries, where the records indicate that one is apparent (the East Greenwich example above is an instance of this). In such cases, a link with Chalice and/or Victoria County History (VCH) could help establish/verify such conjectured associations (posts on Chalice’s discussions with VCH will follow at some point).

When one dips below even the imperfect georeferencing of parishes, there are non-geographic, or semi-geographic, exceptions which need to be dealt with: chaplains of naval vessels are one example; as are cathedrals, which sit outside the system, and indeed maintain their own systems and hierarchies. In such cases, it is better to pinpoint the things that can be pinpointed, and leave it to the researcher to build their own interpretations around the resulting layers of fuzziness. One simple point layer that could be added to Chalice, for example, is Ordnance Survey data describing the locations of churches: a set of simple points which would associate the names of a parish with a particular location, not worrying too much about the amorphous parish boundaries, and yet eminently connectible to the structure of a resource such as CCED.

In the main, the interests that CCED share with Chalice are ones of structural association with geography. Currently, Chalice relies on point-based grid georeferencing, where that has been provided by county editors for the English Place Name Survey. However, the story is clearly far more complex than this. If placename history is also landscape history, one must also accept that it is intimately linked to Church history, since the Church exerted so much influence over all areas of life for so much of the period of history in question.

Therefore Chalice should consider two things:

  1. what visual interface/structure would work best to display complex layers of information?
  2. how can the existing (limited) georeferencing of EPNS be enhanced by linking to it?

The association of (EPNS, placename, church, CCED, VCH) could allow historians to construct the kind of queries they have not been able to construct before.