IGIBS Final Product Post

“An INSPIREing tool enabling researchers to share their geospatial data over the web”

The Open Geospatial Consortium’s Web Map Service (WMS) is a core standard underpinning many Spatial Data Infrastructures (SDIs) throughout the world, including INSPIRE, the UK Location Programme and our own UK academic SDI.  The WMS Factory Tool created by the IGIBS project allows users, for the first time, to upload their data and automatically generate a fully standards-based, INSPIRE-compliant WMS.  Users can control styling and view their data alongside data from a broad range of content providers.  The WMS Factory Tool has been created in partnership with the Welsh Government and students within UK academia, in anticipation of the revolution in the use of Geographic Information that will come about through the increasing availability of data via interoperability standards, in conjunction with the UK Location Programme and INSPIRE.
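To make “standards-based” concrete: any WMS, including those the Factory Tool generates, answers the same small set of HTTP requests. Below is a minimal sketch of the first of these, GetCapabilities; the endpoint URL is a hypothetical placeholder, while the request parameters are the standard WMS ones.

```python
# Minimal sketch: ask a WMS endpoint to describe itself. The base URL below
# is a hypothetical placeholder, not the actual Factory Tool address.
from urllib.request import urlopen
from urllib.parse import urlencode

base_url = "http://example.edina.ac.uk/wms/my_uploaded_data"  # hypothetical

# GetCapabilities returns an XML document advertising the service's layers,
# styles and coordinate reference systems -- this is what makes the service
# usable by any standards-aware client.
params = urlencode({
    "SERVICE": "WMS",
    "VERSION": "1.3.0",
    "REQUEST": "GetCapabilities",
})
capabilities_xml = urlopen(f"{base_url}?{params}").read()
print(capabilities_xml[:200])
```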

The WMS Factory Tool was developed in close cooperation with students at the University of Aberystwyth’s Institute of Geography and Earth Science, in the context of their growing repository of data related to the UNESCO-designated Dyfi Biosphere Reserve.  A student generating data for a project often needs, for purposes of analysis and integration, to view that data alongside data from the spectrum of Welsh public authorities establishing INSPIRE-compliant services.  This tool lets them do so quickly, without wasting time sourcing, extracting, transforming and uploading data from a range of non-interoperable proprietary formats.

The working prototype has been developed and configured so that data is uploaded to EDINA machines.  The following video gives a flavour of how the tool works:

[Embedded video]

Note that, as an advanced feature, access can be restricted using Shibboleth (an open source Security Assertion Markup Language implementation used in the UK Access Management Federation), so that only authorised users can access the service and other organisations in the federation can make more data available.

The software is easy to deploy and can be configured so that data may be uploaded and WMS generated at user-specified locations.  Here is a good place to start with the documentation.

And here is a picture of the team that brought you this product.  More information on IGIBS can be found throughout this blog starting with the about page.

Core IGIBS Project Team at Welsh Government Offices in Cardiff on the 11th Nov, 2011

The software is at prototype stage, but is in a condition where it can be deployed.  EDINA commits to maintaining this software for a minimum of three years, i.e. until November 2014, though it is likely the software will have developed considerably by then.

It is likely that this software will contribute to the growing suite of open source tooling available for use with INSPIRE compliant services and encodings, most obviously as a means for users within the UK academic sector to create WMS (temporary or persistent) for use with UK Location Programme network services.

At its heart is the Minnesota MapServer: very stable, well understood and highly regarded WMS software.  The IGIBS software is available for download.  It is licensed under the modified BSD licence, meaning, in précis, that the software is made available under a permissive free software licence with minimal requirements on how it can be redistributed.
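For a flavour of what sits underneath: MapServer is driven by a “mapfile” describing layers, projections and WMS metadata, so a factory tool of this kind essentially writes a mapfile per uploaded dataset. Here is a sketch of generating a minimal one; the names, paths and extent are invented for illustration, and the Factory Tool’s actual templates will differ.

```python
# Sketch: render a minimal MapServer mapfile for one uploaded dataset.
# All values passed in here are hypothetical examples.
def make_mapfile(name: str, shapefile: str, title: str) -> str:
    return f"""MAP
  NAME "{name}"
  EXTENT 200000 150000 350000 300000  # illustrative British National Grid extent
  PROJECTION
    "init=epsg:27700"
  END
  WEB
    METADATA
      "wms_title"          "{title}"
      "wms_srs"            "EPSG:27700 EPSG:4326"
      "wms_enable_request" "*"
    END
  END
  LAYER
    NAME "{name}"
    TYPE POLYGON
    STATUS ON
    DATA "{shapefile}"
  END
END
"""

print(make_mapfile("student_survey", "uploads/student_survey.shp",
                   "Dyfi Biosphere student survey"))
```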

STEEV Final Product Post

This blog post provides details about the web tool developed by the STEEV project.

Problem Space:

  • There is a requirement on the UK government to reduce the country’s carbon emissions by 80% by 2050.
  • Buildings account for 45% of energy use in the UK, the equivalent of all transport and manufacturing combined (ESRC, 2009).
  • Most of the building stock that will exist in 2050 has already been built.
  • To achieve this target, massive alteration of the existing building stock is required. Part of the solution would be a tool that enables planners, local authorities and government to estimate the impact of policy changes and to target interventions appropriately.

Cue – the STEEV demonstrator, a stakeholder engagement tool developed to visualise spatio-temporal patterns of modelled energy use and efficiency outcomes for the period 1990–2050 – http://steevsrv.edina.ac.uk/

For a portable overview of the project, download the STEEV postcard.

Primary Users:

Students, researchers, lecturers from a wide variety of disciplines/sub-disciplines, including geography, architecture, ecology, environmental science, economics, energy engineering and management.

The tool is also aimed at a range of stakeholders such as policy makers, urban developers, climate change specialists, carbon energy analysts, town planners.

Key Product Information – motivations and mechanisms

The STEEV demonstrator was developed to complement a larger project, Retrofit 2050 – Re-Engineering the City 2020-2050: Urban Foresight and Transition Management (EPSRC EP/I002162/1) – which aims, working with a range of stakeholders, to reach a clearer understanding of how urban transitions can be undertaken to achieve UK and international targets for reducing carbon emissions. The Retrofit 2050 project focuses on two large urban case study areas (Manchester and Neath/Port Talbot, South Wales – the latter being the focus of the STEEV demonstrator due to data availability within the project time-frame), modelling scenarios of carbon emissions and energy use, both now and in the future.

The demonstrator itself is a client web application that enables researchers and stakeholders to examine how the spatial and temporal distribution of energy efficiency measures may affect likely regional outcomes for a given future state. It takes the form of a spatio-temporal exploration and visualisation tool for building-level energy efficiency modelling outputs, such as the energy rating of a building, its likely energy demand and the related CO2 emissions. A finite series of modelled scenario permutations has been ‘pre-built’, providing a limited number of parameters that can be interactively altered in order to explore the spatio-temporal consequences of various policy measures.
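Because the permutations are finite and pre-built, the client’s job reduces to mapping the user’s choices onto a pre-computed result. A minimal sketch of that lookup, using invented identifiers rather than the actual STEEV names:

```python
# Sketch: resolve a user's parameter choices to a pre-built permutation.
# Scenario, output and layer names here are illustrative assumptions.
PREBUILT = {
    # (scenario, model output, decade) -> pre-rendered map layer
    ("low_carbon_reference", "co2_emissions", 2010): "steev:lcr_co2_2010",
    ("low_carbon_reference", "energy_use", 2020):    "steev:lcr_energy_2020",
    # ... one entry per permutation of scenario x output x decade
}

def layer_for(scenario: str, output: str, decade: int) -> str:
    """Return the pre-built layer for this permutation, or fail clearly."""
    try:
        return PREBUILT[(scenario, output, decade)]
    except KeyError:
        raise ValueError("this combination was not pre-computed by the model")
```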

View the STEEV Demonstrator website: http://steevsrv.edina.ac.uk/

Note: A further workpackage to establish a small area data viewer as part of the presentation layer will also be implemented shortly. This replaces the Memento geo-Timegate component of Workpackage 3.

The user interface has two main areas of activity, namely:

  • three ‘pre-built’ policy scenarios which depict government investment in energy efficiency measures (from best to worst case), plus a user-generated scenario created by selecting a combination of the energy efficiency variables that make up the ‘pre-built’ scenarios.
  • a map viewer that enables model output values (SAP rating, energy use, CO2 emissions) for each scenario to be viewed for each decade (1990 to 2050) at Output Area level of spatial granularity.

Further information about the policy scenarios and variable descriptions is available from the help page.

Fig. 1 – The STEEV Demonstrator

STEEV tool interface

Fig. 2 – Policy Scenario 2 – Low Carbon Reference

CO2 emissions, 2010 - Low carbon reference

Fig. 2 – Policy Scenario 2 – Low Carbon Reference (i.e. the government invests in partial decarbonisation of the grid through reduced dependence on fossil fuels; large investment in energy efficiency and small-scale renewables; some change in occupant behaviour) has been selected for 2010. CO2 emissions have been chosen as the model output value.

Fig. 3 – User-generated Scenario

Energy use for Custom Scenario 2020

Fig. 3 – A zoomed-in view of a user-generated scenario for energy use in 2020. Note: user-generated scenarios are forecast only.

Fig. 4 – Policy scenario 3 – Google Earth Time Slider

Energy efficiency data can be downloaded as Keyhole Markup Language (KML) files for use with the Google Earth Time Slider (for ‘pre-built’ scenarios only – see below) or as raw ASCII files complete with spatial reference for analysis in a Geographic Information System.

Energy Use policy scenario

Fig. 4 – KML files viewed in Google Earth for Energy Use model output values for policy scenario 3 (i.e. the government invests in decarbonisation of the grid through renewables and nuclear, plus huge investment in energy efficiency and small-scale renewables; large-scale change in occupants’ behaviour)
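The Time Slider works because each KML feature carries a <TimeSpan> element, which Google Earth aggregates into the slider control. A minimal sketch of generating such a file follows; the coordinates and names are invented, and the real STEEV files carry styled Output Area geometries rather than points.

```python
# Sketch: build a tiny KML document whose features carry <TimeSpan> elements,
# which is what drives the Google Earth Time Slider. All data is invented.
def placemark(name: str, coords: str, begin: int, end: int) -> str:
    return (f"  <Placemark>\n"
            f"    <name>{name}</name>\n"
            f"    <TimeSpan><begin>{begin}</begin><end>{end}</end></TimeSpan>\n"
            f"    <Point><coordinates>{coords}</coordinates></Point>\n"
            f"  </Placemark>")

kml = "\n".join([
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<kml xmlns="http://www.opengis.net/kml/2.2">',
    "<Document>",
    placemark("Energy use, 2010s", "-3.78,51.59,0", 2010, 2019),
    placemark("Energy use, 2020s", "-3.78,51.59,0", 2020, 2029),
    "</Document>",
    "</kml>",
])
print(kml)
```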

Fig. 5 – Model output for individual buildings

Model output for individual buildings

Fig. 5 – Forecasted model output values (SAP rating, Energy use, CO2 emissions, CO2 emissions based on 1990 levels) for an individual building in 2030.

Note: Click on a blue dot and select the Buildings map layer.

Engagement:
Members of the STEEV project presented at the following events:

  • STEEV / GECO Green Energy Tech Workshop at the Edinburgh Centre on Climate Change (13 October 2011) – for further details see blog post
  • Post-event comments include:

    “STEEV provides a new simple tool to quickly visualise a series of scenarios concerning energy consumption and carbon emissions within the complexities of the urban fabric. By facilitating the visual and historical understanding of these issues in a wider area, and for its forecasting capability considering a series of energy efficiency variables, it has a great potential to assist the planning and design processes.” – Cristina Gonzalez-Longo (School of Architecture, University of Edinburgh)

    “The STEEV system’s geospatial information on energy consumption and CO2 emissions can help planners and project developers target projects and initiatives related to energy efficiency and reduction of carbon emissions. Furthermore, the forecasting tools built into STEEV enables energy and carbon emissions to be estimated through to 2050 on the basis of alternative scenarios for energy efficiency initiatives, renewable energy, etc. This facility should help to determine where the opportunities for future emissions reductions will be, and the contributions made by existing policies and plans to future (e.g. 2020 and 2050) emissions reduction targets.” – Jim Hart (Business Manager, Edinburgh Centre for Carbon Innovation)

  • The Low Carbon Research Institute 3rd Annual Conference held at the National Museum of Wales on 15-16 November 2011
  • Post-Industrial Transformations – sharing knowledge and identifying opportunities, a two-day architectural symposium held at the Welsh School of Architecture on 22-23 November 2011

Technologies:
The STEEV demonstrator is a JavaScript client application which uses OpenLayers to display the map data over the web. It also deploys a Web Map Service with temporal querying capabilities (WMS-T) to deliver Ordnance Survey open mapping products via the Digimap OpenStream API. The modelled energy efficiency variables are held in PostGIS (an open source spatial database extension to PostgreSQL).
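In WMS-T, the temporal query is simply an extra TIME parameter on an otherwise standard GetMap request. A sketch of what such a request could look like; the endpoint path, layer name and bounding box are illustrative assumptions, not the live service’s actual values.

```python
# Sketch: a WMS-T GetMap request -- standard WMS parameters plus TIME.
# The endpoint path, layer name and bounding box are illustrative only.
from urllib.parse import urlencode

params = urlencode({
    "SERVICE": "WMS",
    "VERSION": "1.1.1",
    "REQUEST": "GetMap",
    "LAYERS": "steev:co2_emissions",        # hypothetical layer name
    "SRS": "EPSG:27700",                    # British National Grid
    "BBOX": "269000,185000,292000,205000",  # roughly Neath/Port Talbot
    "WIDTH": 800,
    "HEIGHT": 600,
    "FORMAT": "image/png",
    "TIME": "2010",                         # the temporal dimension
})
print(f"http://steevsrv.edina.ac.uk/wms?{params}")  # hypothetical path
```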

Licences:
Data – Open Database License (ODC-ODbL) — “Attribution Share-Alike for data/databases”
Code – GNU General Public License version 3.0
Blog & other website content – Creative Commons Attribution 3.0 Unported License


Project Logos:

combined logos of EDINA, JISC, WSA

Project Team:

STEEV Project Team

EDINA team members (L to R: Lasma Sietinsone, George Hamilton, Stuart Macdonald, Nicola Osborne. Fiona Hemsley-Flint is currently on maternity leave.)

Simon Lannon: project partner from the Welsh School of Architecture, Cardiff University.

Help – Policy-based scenarios and variables explained!

The first port of call for explanation or definition of STEEV tool functionality or terminology is this Help page.

We thought it useful to make available contextual information describing both the policy scenarios and the variables.

Thus: here are the Policy Scenario Descriptions, and here are the Variable Descriptions.

Note: As part of the usability and user testing we shall endeavour to make the variable and policy scenario descriptions more explicit, for the purposes of informing end use of the tool.

Other Help and Guidance notes:

STEEV Camtasia broadcast – explains and walks users through the functionality and features of the energy efficiency visualisation tool.

Contextual Overview of the STEEV tool

Overview of the Energy and Environment Prediction (EEP) model developed by the Welsh School of Architecture

The STEEV tool uses ‘hover-over’ boxes to provide explanations of functionality. Use the mouse to hover over the buttons, slider gauge, markings and labels to get further information. Green information buttons provide further details about each scenario.

The Share Link feature on the interface uses a STEEV RESTful API to define a URI representing the model output value, each variable, the year, the map extent and the map zoom level. This facilitates sharing a URL that returns the client to the state in which it was saved.
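A minimal sketch of how such state can round-trip through a URL; the parameter names and the /share path are assumptions for illustration, not the actual STEEV API.

```python
# Sketch: encode client state into a share URL and restore it again.
# Parameter names and the /share path are hypothetical.
from urllib.parse import urlencode, parse_qs, urlparse

def make_share_link(model, variables, year, extent, zoom):
    query = urlencode({
        "model": model,
        "vars": ",".join(variables),
        "year": year,
        "bbox": ",".join(map(str, extent)),
        "zoom": zoom,
    })
    return f"http://steevsrv.edina.ac.uk/share?{query}"  # hypothetical path

def restore_state(url):
    q = parse_qs(urlparse(url).query)
    return {
        "model": q["model"][0],
        "vars": q["vars"][0].split(","),
        "year": int(q["year"][0]),
        "bbox": [float(v) for v in q["bbox"][0].split(",")],
        "zoom": int(q["zoom"][0]),
    }

link = make_share_link("co2", ["insulation", "glazing"], 2030,
                       (269000, 185000, 292000, 205000), 5)
assert restore_state(link)["year"] == 2030  # the state survives the round trip
```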

Printing – Version 1.0 of the STEEV demonstrator does not include a print or save-map-image facility. To print (and edit) a map image created by the demonstrator, use the Print Screen button on your keyboard and paste the image into an image editing package such as PaintShop Pro. Save the map image in the file format required (JPEG, GIF, WMF, TIF, PNG).

Model Output Value Feature Return functionality: further information about displaying model output values at the individual building level.

Data Download – further information about the raw ASCII Comma Separated Value (CSV) and Keyhole Markup Language (KML) format data file download.

Guidance notes on viewing the Policy Scenario KML files in Google Earth.

Alternatively, view the ‘Using the Time Slider bar in Google Earth’ YouTube clip:

[Embedded video]

Usability and Time Sliders

As we move into the final phases of STEEV, thoughts now turn to user testing and usability. OK, so we’ve built a visualisation tool to view time-series energy efficiency variables for a specific geographic area. But just how intuitive is the interface? How easy is it to use, for the practitioner or for the novice user? What functionality is missing, and what is superfluous?

The first step was to meet with the EDINA training officer (who has experience in conducting usability and user testing for EDINA projects and services). It was immediately apparent that work was required in terms of workflow and instruction. A detailed list of requirements has been assembled for implementation.

As the next step in this process we have approached a ‘Usability Expert’ to take an overall look at the tool’s features and functionality, in order to identify and finesse possible ambiguities. At the end of this process we hope to have a usability guide detailing both process and outcome, and to make this available through the STEEV blog.

Our aim is to have conducted this exercise in time for the STEEV/GECO Green Energy Tech Workshop on 13 October. This will give practitioners the opportunity to use the tool in earnest whilst providing further feedback from an expert’s perspective.

Expect a future blog post detailing the results of the extended usability exercise.

Regarding part 2 of the title: OK, so there wasn’t a fit between STEEV and Memento. What does fit, however, is the deployment of the Google Earth Time Slider to view the policy-based scenarios (as provided by our project partner) for each of the four modelled outputs over time (namely: SAP rating, energy use, CO2 emissions, and CO2 emissions based on 1990 levels). Our GI Analyst (Lasma Sietinsone – replacement for Fiona, who is currently on maternity leave) has created a dozen KML files which can be viewed in Google Earth using the Time Slider utility. The KML files can be downloaded from http://steevsrv.edina.ac.uk/data/.

Note: Guidance notes on viewing the KML files in Google Earth are available.

Alternatively, view the ‘Using the Time Slider bar in Google Earth’ YouTube clip:

[Embedded video]

Final Product Post: Chalice: past places and use cases

This is our “final product post” as required by the #jiscexpo project guidelines. Image links somehow got broken; they are fixed now, so please re-view.

Chalice – Past Places

Chalice is for anyone working with historic material – be that archives of records, objects, or ideas. Everything happens somewhere. We aimed to provide a historic place-name gazetteer covering a thousand years of history, linked to attestations in old texts and maps.

Place-name scholarship is fascinating; looking at names, a scholar can describe the lay of the land and trace political developments. We would like to pursue further funding to work with the English Place-Name Survey on an expert-crowdsourced service consuming the other 80+ volumes and extracting the detailed information – etymology, field-names.

Linked to other archival sources, the place-name record has the potential to reveal connections between them, and in turn feed into deeper coverage in the place-name survey.

There is a Past Places browser to help illustrate the data and provide a Linked Data view of it.

Stuart Dunn did a series of interviews and case studies with different archival sources, making suggestions for integration. The report on our use case for the Clergy of the Church of England Database may be found here, and that on our study of the Victoria County History is here. We also had valuable discussions with the Archaeology Data Service, which were reported in a previous post.

Rather than take a classical ‘user needs’ approach, targeting groups such as historians, linguists and indeed place-name scholars, we decided to look in detail at other digital resources containing reference material. This allowed us to start considering various ways in which a digitised, linkable EPNS could be automatically related to such resources. The problems are not only the ones we anticipated, of usability and semantic crossover between the placename variants listed in EPNS and elsewhere, but also ones of data structure, domain terminology and the relationship of secondary references across such corpora. We hope these considerations will help inform future development of placename digitisation.

Project blog

This covers the work of the four partners in the project.

CeRch at KCL developed use cases through interviews with maintainers of different historic sources; there are blog descriptions of the conversations.

LTG did some visualisations for these use cases and, more substantially, text-mined the semi-structured text of different sample volumes of the English Place Name Survey.

The extraction of corrected text from previously digitised pages was done by CDDA in Belfast. There is a blog report on the final quality of the work; however, the full resulting text is neither openly licensed nor distributed through Chalice.

EDINA took care of project management and software development. We used the opportunity to try out a Scrum-style “sprint” way of working with a larger team.

TOC of the project blog – here is an Atom feed of all the project blog posts, categorised by project partner.

Project tag: chaliced

Full project name: Connecting Historical Authorities with Links, Contexts and Entities

Short description: Creating and re-using a linked data historic gazetteer through text mining.

Longer description:Text mining volumes of the English Place Name Survey to produce a Linked Data historic gazetteer for areas of England, which can then be used to improve the quality of georeferencing other archives. The gazetteer is linked to other placename sources on the Linked Data web via geonames.org and Ordnance Survey Open Data. Intensive user engagement with archive projects that can benefit from the open data gazetteer and open source text mining tools.

Key deliverables: Open source tools for text mining archives; Linked open data gazetteer, searchable through JISC’s Unlock service; studies of further integration potential.

Lead Institution: University of Edinburgh

Person responsible for documentation: Jo Walsh

Project Team: EDINA: Jo Walsh (Project Manager), Joe Vernon (Software Developer), Jackie Clark (UI design), David Richmond (Infrastructure), CDDA: Paul Ell (WP1 Coordinator), Elaine Yates (Administration), David Hardy (Technician), Karleigh Kelso (Clerical), LTG: Claire Grover (Senior Researcher), Kate Byrne (Researcher), Richard Tobin (Researcher), CeRch: Stuart Dunn (WP3 Coordinator).

Project partners and roles: Centre for Data Digitisation and Analysis, Belfast – preparing digitised text, Centre for e-Research, Kings College London – user engagement and dissemination, Language Technology Group, School of Informatics, Edinburgh – text mining research and tools.

This is the Chalice project blog and you can follow an Atom feed of blog posts (there are more to come).

The code produced during the Chalice project is free software; it is available under the GNU Affero GPL v3 license. You can get the code from our project sourceforge repository. The text mining code is available from LTG – please contact Claire Grover for a distribution…

The Linked Data created by text mining volumes of the English Place Name Survey – mostly covering Cheshire – is available under the Open Database License, a share-alike license for data by Open Data Commons.

The contents of this blog itself are available under a Creative Commons Attribution-ShareAlike 3.0 Unported license.

 


Link to technical instructional documentation

Project started: July 15th 2010
Project ended: April 30th 2011
Project budget: £68,054


Chalice was supported by JISC as a project in its #jiscexpo programme. See its PIMS project management record for information about where responsibility fits in at JISC.

Musings on the first Chalice Scrum

For a while I’d been hearing enthusiastic noises about how Scrum development practice can focus productivity and improve morale, and I’d been agitating within EDINA to try it out. So Chalice became the guinea-pig first project for a “Rapid Application Development” team; we did three weeks between September 20th and October 7th. In the rest of this post I’ll talk about what happened, what seemed to work, and what seemed soggy.

What happened?

  • We worked as a team 4 days a week, Monday-Thursday, with Fridays either to pick up pieces or to do support and maintenance work for other projects.
  • Each morning we met at 9:45 for 15 minutes to review what had happened the day before and what would happen that day
  • Each item of work-in-progress went on a post-it note in our meeting room
  • The team was of 4+1 people – four software developers, with a database engineer consulting and sanity checking
  • We had three deliverables –
        a data store and data loading tools
        a RESTful API to query the data
        a user interface to visualise the data as a graph and map

In essence, this was it. We slacked on the full Scrum methodology in several ways:

  • No estimates.

Why no estimates? The positive reason: this sprint was mostly about code re-use and concept re-design; we weren’t building much from scratch. The data model design, and API to query bounding boxes in time and space, were plundered and evolved from Unlock. The code for visualising queries (and the basis for annotating results) was lifted from Addressing History. So we were working with mostly known quantities.
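For context, the inherited query pattern is simple: a spatial bounding box plus a time window. A sketch of what a request against such an API might look like; the endpoint and parameter names are illustrative assumptions, not the actual Unlock or Chalice interface.

```python
# Sketch: query a gazetteer API by bounding box in space and time.
# Endpoint and parameter names are illustrative assumptions.
from urllib.parse import urlencode

params = urlencode({
    "minx": -2.9, "miny": 53.0,   # roughly Cheshire, in WGS84 lon/lat
    "maxx": -2.3, "maxy": 53.4,
    "start_year": 1086,           # attestations no earlier than this
    "end_year": 1500,
    "format": "json",
})
print(f"http://example.edina.ac.uk/chalice/search?{params}")  # hypothetical
```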

  • No product owner

This was mostly an oversight, as we went into the process without much preparation time. I put myself in the “Scrum master” role by instinct, whereas other project managers might be more comfortable playing “product owner”. With hindsight, it would have been great to have a team member from a different institution (the user-facing folk at CeRch), or our JISC project officer, visit for a day and play product owner.

What seemed to work?

The “time-boxed” meeting (every morning for 15 minutes at 9:45) seemed to work very well. It helped keep the team focused and communicating. I was surprised that team members actually wanted to talk for longer, and broke up into smaller groups to discuss specific issues.

The team got to share knowledge on fundamentals that should be reusable across many other projects and services – for example, the optimum use of Hibernate to move objects around in Java, decoupled from the original XML sources and the database implementation.

The emphasis on code re-use meant we could put together a lot of stuff in a compressed amount of time.

Where did things go soggy?

From this point we get into some collective soul-searching, in the hope that it’s helpful to others for future planning.

The start and end were both a bit halting – so out of the 12 days available, we were actually “on” for only 7 or 8. The start went a bit awkwardly because:

      We didn’t have the full team available ’til day 3 – holidays scheduled before the Scrum was planned
      It wasn’t clear to other project managers that the team were exclusively working on something else; so a couple of team members were yanked off to do support work before we could clearly establish our rules (e.g. “you’ll get yours later”).

We could address the first problem through more upfront public planning. If the Scrum approach seems to work out and EDINA sticks with it for other projects and services, then a schedule of intense development periods can be published with a horizon of up to 6 months – team members know which times to avoid – and we can be careful about not clashing with school holidays.

We could address the second problem by broadcasting more, internally to the organisation, about what’s being worked on and why. Other project managers will hopefully feel happier with the arrangements once they’ve had a chance to work with the team. It is a sudden adjustment in development practice, where the norm has been one or two people full-time for a longish stretch on one service or project.

The end went a bit awkwardly because:

    I didn’t pin down a definite end date – I wasn’t sure if we’d need two or three weeks to get enough done, and my own dates for the third week were uncertain
    Non-movable requirements for other project work came up right at the end, partly as a by-product of this

The first problem meant we didn’t really build to a crescendo, but rather turned up at the beginning of week 3 and looked at how much of the post-it-note map we still had to cover. Then we lost a team member, and the last couple of days turned into a fest of testing and documentation. This was great in the sense that one cannot overstate the importance of tests and documentation. It was less great in that the momentum somewhat trickled away.

On the basis of this, I imagine that we should:

  • Schedule up-front more, making sure that everyone involved has several months advance notice of upcoming sprints
  • Possibly leave more time than the one review week between sprints on different projects
  • Wait until everyone, or almost everyone, is available, rather than make a halting start with 2 or 3 people

We were operating in a bit of a vacuum as to end-user requirements, and we also had somewhat shifting data (changing in format and quality during the sprint). This was another scheduling fail for me – in an ideal world we would have waited another month, seen some in-depth use case interviews from CeRch and had a larger and more stable collection of data from LTG. But when the chance to kick off the Scrum process within the larger EDINA team came up so quickly, I just couldn’t postpone it.

We plan a follow-up sprint, with the intense development time between November 15th and 25th. The focus here will be on

  • adding annotation / correction to the user interface and API (the seeds already existing in the current codebase)
  • adding the ability to drop in custom map layers

Everything we built at EDINA during the sprint is in Chalice’s subversion repository on Sourceforge – which I’m rather happy with.

CHALICE: Our Budget

This is the last of the seven blog posts we were asked to complete as participants in a #jiscexpo project. I like the process. This is a generalised version of our project budget. More than half goes to the preparation and annotation of digitised text from scans, both manually and using named entity recognition tools.

The other half is for software development and user engagement; we hope to work together closely here. Of course we hope to over-deliver. We also have a small amount allocated for people to travel to a workshop. There’s another, independently supported JISC workshop planned to happen at EPNS on September 3rd.

Budget by institution, April 2010 – March 2011:

  • EDINA National Datacentre, University of Edinburgh (project management, design, software development) – £21,129
  • Language Technology Group, School of Informatics, University of Edinburgh (text mining archival work, named entity recognition toolkit development) – £19,198
  • Centre for Data Digitisation and Analysis, Queens University Belfast (preparation of corrected digitised texts for use in archival text mining – the EPNS in a set schedule of volumes) – £15,362
  • Centre for e-Research, Kings College London (backup project management, user needs and use case gathering, interviews, dissemination) – £12,365

Total amount requested from JISC: £68,054

CHALICE: Team Formation and Community Engagement

Institutional and Collective Benefits describes who, at an institutional level, is engaged with the CHALICE project. We have three work packages split across four institutions – the Centre for Data Digitisation and Analysis at Queens University Belfast; the Language Technology Group at the School of Informatics, and the EDINA National Datacentre, both at the University of Edinburgh; and the Centre for e-Research at Kings College, London.

The Chalice team page contains more detailed biographical data about the researchers, developers, technicians and project managers involved in putting the project together.

The community engagement aspect of CHALICE will focus on gathering requirements from the academic community on how a linked data gazetteer would be most useful to historical research projects concerned with different time periods. Semi-structured interviews will be conducted with relevant projects, and the researchers involved will be invited to critically review existing gazetteer services, such as geonames, with a view to identifying how they could get the most out of such a service. This will apply the same principles, based loosely on the methodology employed by the TEXTvre project. The project will also seek to engage with providers of services and resources: CHALICE will be able to enhance such resources, but also link them together; in particular the project will collaborate with services funded by JISC to gather evidence as to how these services could make use of the gazetteer. A rapid analysis of the information gathered will be prepared, and a report published within six months of the project’s start date.

When a first iteration of the system is available, we will revisit these projects and develop brief case studies that illustrate practical instances of how the resource can be used.

The evidence base thus produced will substantially inform design of the user interface and the scoping and implementation of its functionalities.

Gathering this information will be the responsibility of project staff at CeRch.

We would love to be more specific at this point about exactly which archive projects CHALICE will work with, but a lot will depend both on the spatial focus of the gazetteer and on the investigation and outreach during the course of the project. So we have half a dozen candidates in mind right now, but the detailed conversations and investigations will have to wait some months… see the next post on the project plan, describing when and how things will happen.

CHALICE: The Plan of Work

DRAFT

A Gantt-like chart showing the interconnection between different work packages and participants in CHALICE – not a very high-quality scan, sorry. When there are shifts and revisions in the workplan, Jo will rub out the pencil markings and scan the chart in again, but more clearly this time.

As far as software development goes, we aspire to do Scrum, though given the resources available it will be more of a Scrum-but. Depending on how many people we can get to Scrum, we may have to compress the development schedule in the centre – spike one week, deliver the next, pretty much – and then have an extended maintenance and integration period with just one engineer involved.

The preparation of structured versions of the digitised text, with markup of extracted entities, will be more of a long slog, but perhaps I can ask CDDA and LTG to write something about their methodologies.

The use case gathering and user engagement parts of the project will build on the approach already used in the TEXTvre project.


CHALICE: Open Licensing

Software

We commit to making source code produced as part of CHALICE available under a free software license – specifically, the GNU Affero General Public License Version 3. This is the license that was suggested to the Unlock service during consultation with OSS Watch, the open source advisory service for UK research.

GPL is a ShareAlike kind of license, implying that if someone adapts and extends the CHALICE code for use in a project or service, they should make their enhancements available to others. The Affero flavour of GPLv3 invokes the ShareAlike clause if the software is used over a network.

Data

We plan to use the Open Database License from Open Data Commons to publish the data structures extracted from EPNS – and from other sources where we have the freedom to do this. ODbL is a ShareAlike license for data – the OpenStreetMap project is moving to use this license, which is especially relevant to geographic factual data.

As far as we know, this will be the first time ODbL has been used for a research project of this kind – if there are other examples, we would love to hear about them. We’ll seek advice from JISC Legal and from the Edinburgh Research and Innovation office legal service as to the applicability of ODbL to research data, just to be sure.