This morning NASA successfully landed a probe on Mars. After a nine-month journey, the final seven minutes of the trip were expected to be the riskiest, but the probe landed safely and has started transmitting data back to mission control. The probe, called Curiosity, is about the size of a small car, and its purpose is to investigate the possibility of past microbial life on Mars. If you are interested in finding out more about the mission, there are a number of links that provide further details:

If you have any more interesting links about the Curiosity mission, please add them as comments and I will add them to the list.

Consultation on access to address register data for social science research

Do you use address register data in your research?  If you do then you might be interested to know that ESRC are currently running a public consultation to gather the opinions of researchers around the UK.

There have recently been substantial changes to the creation and management of address data in the UK with the amalgamation of the National Land and Property Gazetteer and Ordnance Survey Mastermap Address Layer 2 into a National Address Gazetteer managed by GeoPlace.

GeoPlace is a public sector partnership between the Local Government Association and Ordnance Survey. The project brought together address data from local government and Royal Mail and Ordnance Survey. The Ordnance Survey data includes high resolution grid references for each address which permit mapping and detailed spatial analysis.

The establishment of the National Address Gazetteer marks an extremely important development in the UK data infrastructure. However, it also presents new challenges for the academic research community in terms of access to national address products.

It is therefore important for ESRC to assess the implications of these changes in address data arrangements, due to the potential costs and impacts for the social science research community.

This is important: the consultation document will be used to shape the way we access the data in the future.  It is your chance to express your opinion.  The deadline for filling in the consultation document is the 3rd August. To access the survey, click the link below.

Online Survey

GISRUK 2012 – Thursday

The second part of GoGeo’s review of GISRUK 2012 covers Thursday. If you want to find out what happened on Wednesday, please read this post.

Thursday saw a full programme of talks split between two parallel sessions.  I chose to go to the Landscape Visibility and Visualisation strand.

  • Steve Carver (University of Leeds) started proceedings with No High Ground: visualising Scotland’s renewable landscape using rapid viewshed assessment tools. This talk brought together new modelling software that allows multiple viewsheds to be analysed very quickly, with a practical and topical subject.  The SNP want Scotland to be self-sufficient in renewable energy by 2020, an ambitious target. In 2009, 42% of Scotland’s “views” were unaffected by human developments; by 2011 this had declined to 28%.  Wind farms are threatening the “wildness” of Scotland, and this may have implications for tourism.  Interestingly, the SNP also wants to double the income from tourism by 2020. So how can you achieve both?  By siting new wind farms in areas that do not further impact on the remaining wild areas.  This requires fast and efficient analysis of viewsheds, which is what Steve and his team presented.
  • Sam Meek (University of Nottingham) was next up, presenting on The influence of digital surface model choice on visibility-based mobile geospatial applications.  Sam’s research focused on an application called Zapp. He is looking at how to efficiently and accurately run visibility models on mobile devices in the field, and how the results are influenced by the choice of surface model.  In each case, all processing is done on the device. Resampling detailed DTMs obviously makes processing less intensive, but it often leads to issues such as smoothing of features.  Other general issues with visibility models are stepping, where edges form in the DTM and interrupt the line of sight, and an overestimation of vegetation.  This research should help make navigation apps that use visual landmarks to guide the user more accurate and usable.
  • Possibly the strangest and most intriguing paper title at GISRUK 2012 came from Neil Sang (Swedish University of Agricultural Sciences) with New Horizons for the Stanford Bunny – a novel method for view analysis.  The “bunny” reference was a bit of a red herring, but the research did look at horizon-based view analysis.  The essence was to identify horizons in a landscape to improve the speed of viewshed analysis, as the horizons often persist even when the local position changes.
  • The final paper of the session took a different direction, with David Miller of The James Hutton Institute looking at Testing the public’s preferences for the future. This linked public policy with public consultations through the use of virtual-reality environments.  The research investigated whether familiarity with a location altered opinions of planned changes to its landscape.  Findings showed agreement on developing amenity woodland adjacent to a village, and on environmental protection, but differences arose in relation to proposals for medium-sized wind farms (note – medium-sized wind farms are defined here as those that might be constructed to supply power to a farm, rather than commercial wind farms).
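
For readers unfamiliar with viewshed analysis, the core operation underneath the visibility talks above is a line-of-sight test against a terrain grid: a viewshed is just that test repeated for every cell. Below is a deliberately naive sketch in Python (the DEM and positions are invented for illustration; tools like those Steve's team presented are far more optimised):

```python
import numpy as np

def line_of_sight(dem, obs, target, obs_height=1.7):
    """Return True if the `target` cell is visible from the `obs` cell.

    dem        : 2-D array of elevations (metres)
    obs/target : (row, col) index tuples
    obs_height : observer eye height above the ground (metres)
    """
    r0, c0 = obs
    r1, c1 = target
    n = max(abs(r1 - r0), abs(c1 - c0))
    if n == 0:
        return True
    eye = dem[r0, c0] + obs_height
    tgt = dem[r1, c1]
    for i in range(1, n):
        t = i / n
        r = int(round(r0 + t * (r1 - r0)))
        c = int(round(c0 + t * (c1 - c0)))
        # elevation of the straight sight line at this fraction of the ray
        sight = eye + t * (tgt - eye)
        if dem[r, c] > sight:
            return False   # terrain blocks the view
    return True

dem = np.zeros((50, 50))
dem[25, 10:40] = 100.0                          # a wall-like ridge
print(line_of_sight(dem, (25, 5), (25, 45)))    # ridge in the way -> False
print(line_of_sight(dem, (10, 5), (10, 45)))    # no ridge on this row -> True
```

The speed problem the rapid-assessment tools tackle is obvious from this version: a full viewshed repeats this O(n) ray walk for every cell in the grid, and multiple viewsheds multiply that again.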

After coffee I chose to go to the Qualitative GIS session, as it provided an interesting mix of papers that explored social media and enabling “the crowd”.

  • First up was Amy Fowler (Lancaster University), who asked How reliable is citizen-derived scientific data?  This research looked at the prevalence of aircraft contrails using data derived through the Open Air Laboratories (OPAL) Climate Survey. Given the dynamic nature of the atmosphere, it is impossible to validate user-contributed data. Amy hopes to script an automated confidence calculator to analyse nearly 9,000 observations, but initial analysis suggests that observations with accompanying photographs tend to be more reliable.
  • Iain Dillingham (City University) looked at Characterising Locality Descriptors in crowd-sourced information, focusing specifically on humanitarian organisations. Using the wealth of data available from the 2010 Haiti earthquake, they investigated the uncertainty of locations derived from social media, comparing this with georeferenced locality descriptors in MaNIS (the Mammal Networked Information System).  The conclusion was that while there were similarities between the datasets, the crowd-sourced data presented significant challenges with respect to vagueness, ambiguity and precision.
  • The next presentation changed the focus somewhat: Scott Orford (Cardiff University) presented his work on Mapping interview transcript records: technical, theoretical and cartographical challenges. This research formed part of the WISERD project and aimed to geo-tag interview transcripts.  Geo-tagging was done using UNLOCK, but there were several issues with getting useful results out and reducing the noise in the data.  Interviews were transcribed in England, and complicated Welsh place-name spellings often got transcribed incorrectly.  In addition, fillers such as “Erm” were quite frequent and got parsed as place names, so they had to be removed as they did not actually relate to a place. Interesting patterns did emerge about which areas appeared to be of interest to people in different regions of Wales, but care had to be taken in preparing and parsing the dataset.
  • Chris Parker (Loughborough University) looked at Using VGI in design for online usability: the case of access information. Chris used a number of volunteers to collect data on the accessibility of public transport. The volunteers might be considered an expert group, as they were all wheelchair users.  A comparison was made between an official map and one that used the VGI data. It was found that public perception of quality increased when VGI data was used, making it an attractive and useful option for improving confidence in online information. However, it would be interesting to look at this issue with a more mixed crowd of volunteers, rather than just the expert user group, who seem to have been commissioned (but not paid) to collect specific information. I am also not too sure where the term “usability” from the title fits.  Trusting the source of online data may increase its use, but this is not usability, which refers more to the ability of users to engage with and perform tasks on an interface.
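
Amy's automated confidence calculator was only mentioned in outline, so the scoring scheme below is entirely hypothetical; it just illustrates the general idea of weighting an observation up when evidence such as a photograph backs it up:

```python
def observation_confidence(has_photo, n_fields_completed, total_fields):
    """Toy confidence score in [0, 1] for a citizen-science observation.

    Hypothetical scheme (the real OPAL calculator is not published here):
    record completeness contributes up to 0.6, and an accompanying
    photograph adds a fixed 0.4 boost, capped at 1.0.
    """
    completeness = n_fields_completed / total_fields
    score = 0.6 * completeness + (0.4 if has_photo else 0.0)
    return min(score, 1.0)

# Same record, with and without a photograph
print(observation_confidence(True, 8, 10))    # 0.88
print(observation_confidence(False, 8, 10))   # 0.48
```

Running something this cheap over ~9,000 observations is trivial, which is presumably the point of automating it.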

There was a good demonstration from ESRI UK of their service.  This allows users to upload their own data, theme it and display it against one of a number of background maps. The service then allows you to publish the map and restrict access to it by creating groups.  Users can also embed the map into a website by copying some code that is automatically created for them. All good stuff; if you want to find out more, have a look at the website.

Most of Friday was given over to celebrating the career of Stan Openshaw.  I didn’t work with Stan but it is clear from the presentations that he made a significant contribution to the developing field of GIS and spatial analysis and had a huge effect on the development of many of the researchers that regularly attend GISRUK.  If you want to find out more about Stan’s career, have a look at the Stan Openshaw Collection website.

Friday’s keynote was given by Tyler Mitchell, who was representing the OSGeo community.  Tyler was a key force in the development of the OSGeo group and has championed the use of open software in GIS.  His presentation focused on interoperability and standards, and how they combine to allow you to create a software stack that can easily meet your GIS needs.  I will try to get a copy of the slides of Tyler’s presentation and link to them from here.

GISRUK 2012 – Wednesday

GISRUK 2012 was held in Lancaster, hosted by Lancaster University. The conference aimed to cover a broad range of subjects including Environmental Geoinformatics, Open GIS, Social GIS, Landscape Visibility and Visualisation, and Remote Sensing. In addition to the traditional format, this year’s event celebrated the career of Stan Openshaw, a pioneer in the field of computational statistics and a driving force in the early days of GIS.


The conference kicked off with a Keynote from Professor Peter Atkinson of the University of Southampton.  This demonstrated the use of remotely sensed data to conduct spatial and temporal monitoring of environmental properties. Landsat data provides researchers with 40 years of data making it possible to track longer term changes. Peter gave two use case examples:

  1. River channel monitoring on the Ganges. The Ganges forms the international boundary between India and Bangladesh, so understanding channel migration is extremely important for both countries.  The influence of man-made structures, such as barrages to divert water to Calcutta, can have a measurable effect on the river channel; barrages were found to stabilise the migrating channel.
  2. Monitoring regional phenology. Studying the biomass of vegetation is tricky, but using “greenness” as an indicator provides a useful measure. Greenness can be calculated for large areas, up to continental scale.  Peter gave an example where MODIS and MERIS data had been used to calculate the greenness of India. Analysis at this scale and resolution reveals patterns and regional variation, such as the apparent “double greening” of the western Ganges basin, which would allow farmers to have two harvests for some crops.
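
Peter did not spell out which index was used, but "greenness" in remote sensing is typically a vegetation index computed per pixel from the red and near-infrared bands; NDVI is the usual choice. A quick sketch, with band reflectance values invented for illustration:

```python
import numpy as np

def ndvi(nir, red):
    """Normalised Difference Vegetation Index: (NIR - Red) / (NIR + Red).

    Healthy vegetation absorbs red light and strongly reflects
    near-infrared, so values near +1 indicate dense green vegetation,
    values near 0 bare soil, and negative values typically water.
    Works element-wise on whole raster bands as well as scalars.
    """
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    return (nir - red) / (nir + red + 1e-10)  # epsilon avoids divide-by-zero

# Toy reflectances: vegetation-like pixel vs. bare-soil-like pixel
print(ndvi(0.5, 0.08))   # high greenness, roughly 0.72
print(ndvi(0.25, 0.22))  # low greenness
```

A "double greening" like the western Ganges basin example would show up as two peaks per year in a pixel's NDVI time series, one per growing season.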

However, these monitoring methods are not without their challenges and limitations.  Remote sensing provides continuous data on a regular grid, whereas ground-based measurements are sparse and may not tie in, spatially or temporally, with the remotely sensed data. Ground-based phenology measurements can also be derived using a number of different methods, making comparisons difficult.  A possible solution would be to adopt a crowd-sourcing approach where data is collected and submitted by enthusiasts in the field. This would certainly result in a better spatial distribution of ground-based measurements, but would the resulting data be reliable? Automatically calculating greening from web-cams is currently being trialled.

The first session was then brought to a close with two talks on the use of terrestrial LiDAR. Andrew Bell (Queen’s University Belfast) was investigating the use of terrestrial LiDAR for monitoring slopes.  DEMs were created from the scans and used to detect changes in slope, roughness and surface.  The project aims to create a probability map to identify surfaces that are likely to fail and pose a hazard to the public.  Andrew’s team will soon receive some new airborne LiDAR data; however, I feel that if this technique is to be useful to the highways agency, the LiDAR would have to be mounted on a car, as cost and repeatability would be two key drivers.  Andrew pointed out that this would reduce the accuracy of the data, but perhaps such a reduction would be acceptable and change would still be detectable.

Neil Slatcher’s (Lancaster University) paper discussed the importance of calculating the optimum location at which to deploy a terrestrial scanner.  Neil’s research concentrated on lava flows, which meant the landscape was rugged, some areas were inaccessible, and the dynamic target had to be scanned in a relatively short period of time. When a target cannot be fully covered by just one scan, analysis of the best positions to give complete coverage is needed.  Further, a 10 Hz scanner makes 10 measurements per second, which seems quick, but a dense grid can result in scan times in excess of 3 hours.  By sub-dividing the scan into smaller scan windows centred over the target, you can significantly reduce the size of the grid and the number of measurements required, and hence the time it takes to acquire the data. This method reduced scan times from 3 hours to 1 hour 15 minutes.
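
The arithmetic behind those scan times is simple: acquisition time is just the number of grid points divided by the measurement rate, which is why cropping the scan window to the target helps so much. The grid sizes below are my own guesses, chosen only to roughly match the quoted times:

```python
def scan_time_hours(n_points, rate_hz=10):
    """Time (hours) to scan n_points at rate_hz measurements per second."""
    return n_points / rate_hz / 3600

# Hypothetical grid sizes for a full dense scan vs. a window
# sub-divided and centred over the target
full_grid = 110_000
windowed  = 45_000
print(f"full grid: {scan_time_hours(full_grid):.2f} h")   # ~3.06 h
print(f"windowed:  {scan_time_hours(windowed):.2f} h")    # 1.25 h
```

The saving is linear in the number of points dropped, so every part of the grid that misses the target is pure wasted scan time.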

The final session of the day had two parallel sessions, one on Mining Social Media and the other on Spatial Statistics.  Both are interesting subjects, but I opted to attend the Social Media strand.

  • Lex Comber (University of Leicester) gave a presentation on Exploring the geographies in social networks.  This highlighted that there are many methods for identifying clusters or communities in social data but that the methods for understanding what a community means are still quite primitive.
  • Jonny Huck (Lancaster University) presented on Geocoding for social networking of social data.  This focused on the Royal Wedding, as it was an announced event that was expected to generate traffic on social media, allowing the team to plan rather than react. They found that less than 1% of tweets contained explicit location information. You could parse the tweets to extract geographic information, but this introduced considerable uncertainty.  Another option was to use the location information in users’ profiles and assume they were at that location.  The research looked at defining levels of detail, so Lancaster University Campus would be defined as Lancaster University Campus / Lancaster / Lancashire / England / UK.  By geocoding the tweets at as many levels of detail as possible, you could then run analysis at the appropriate level.  What you had to be careful of was creating false hot-spots at the centroids of each country.
  • Omar Chaudhry (University of Edinburgh) explained the difficulties in Modelling Confidence in Extraction of Place Tags from Flickr.  Using Edinburgh as a test case, they tried to use Flickr tags to define the dominant feature of each grid cell covering central Edinburgh.  Issues arose when many photos were tagged for a personal event such as a wedding, and efforts were made to reduce the impact of these events. Weighting the importance of a tag by the number of users who used it, rather than the absolute number of times it was used, seemed to improve results. There was still the issue of tags relating to what the photo was of, rather than where it was taken.  Large features such as the Castle and Arthur’s Seat dominated the coarser grids, as they are visible over a wide area.
  • Andy Turner and Nick Malleson (University of Leeds) gave a double-header as they explained Applying geographical clustering methods to analyse geo-located open micro-blog posts: a case study of tweets around Leeds.  The research showed just how much information you can extract from location information in tweets, almost giving you a socio-economic profile of the people posting. There was some interesting discussion around the ethics of this, specifically in relation to the Data Protection Act, which states that you can only use data for the purpose for which it was collected.  Would this research/profiling be considered the purpose for which the original data had been collected?  Probably not.  However, that was part of the research: to see what you could do, and hence what companies could do, if social media sites such as Twitter start to allow commercial organisations to access your personal information. For more information on this, look at this paper or check out Nick’s Blog.
  • One paper that was suggested as a good read on relating tweets to place and space was Tweets from Justin Bieber’s heart: the dynamics of the location field in user profiles.
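
Omar's user-based weighting is easy to sketch: score each tag by the number of distinct users who applied it, rather than by the raw tag count, so one prolific wedding photographer cannot dominate a grid cell. A toy version (the photo data is invented):

```python
from collections import defaultdict

def tag_scores(photos):
    """Score tags by the number of *distinct* users who used them.

    photos: iterable of (user_id, tags) pairs, e.g. one pair per
    geotagged Flickr photo falling inside a grid cell.
    """
    users_per_tag = defaultdict(set)
    for user, tags in photos:
        for tag in tags:
            users_per_tag[tag].add(user)   # a set dedupes repeat users
    return {tag: len(users) for tag, users in users_per_tag.items()}

photos = [
    ("alice", ["castle", "edinburgh"]),
    ("bob",   ["castle"]),
    # one user tagging a personal event across 200 photos
    *[("carol", ["wedding"]) for _ in range(200)],
]
print(tag_scores(photos))
# "castle" (2 users) outranks "wedding" (1 user) despite 200 wedding photos
```

A raw-count scheme would have called this cell "wedding"; the distinct-user weighting correctly surfaces "castle" as the dominant feature.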

I will post a summary of Thursday as soon as I can.

UK Biobank

While watching the news on Friday night (yes, it doesn’t get much more exciting than that these days) I saw a piece on UK Biobank.  UK Biobank is a major national health resource with the aim of improving the prevention, diagnosis and treatment of a wide range of serious and life-threatening illnesses – including cancer, heart disease, stroke, diabetes, arthritis, osteoporosis, eye disorders, depression and forms of dementia. UK Biobank recruited 500,000 people aged between 40 and 69 from across the country in 2006–2010 to take part in the project. They have undergone a range of measurements; provided blood, urine and saliva samples for future analysis; given detailed information about themselves; and agreed to have their health followed.

So what is the significance of this study?  There are a couple of important differences from previous studies, the most obvious being the number of participants.  Half a million subjects is a huge, and statistically powerful, sample size.  It should allow researchers to cut through background noise and discover trends that have not been apparent in smaller studies.  Another difference is that this study covers a diverse range of people: some are already suffering from an illness, but many are perfectly healthy.  Much previous research has focused only on those already suffering from an illness.

So why am I writing about UK Biobank on a geospatial blog? Well, along with the wide array of physiological measurements being collected about each subject, the research team are collecting information about:

  • where participants live
  • where they grew up
  • where they have lived throughout their life
  • the income of their family while they grew up
  • their employment

and I am sure many more things.  This gives the study a spatial element, and geographical factors can have a strong influence on health.  With such a large sample size, GIS is the obvious tool to analyse and extract patterns from the noise of the data.  Packages such as ArcGIS and R will help researchers explore the dataset.  I am sure UK Biobank will become an important research resource in years to come and will have a significant impact on epidemiology research.

UK Biobank was established by the Wellcome Trust medical charity, the Medical Research Council, the Department of Health, the Scottish Government and the Northwest Regional Development Agency. It has also had funding from the Welsh Assembly Government and the British Heart Foundation. UK Biobank is hosted by the University of Manchester and supported by the National Health Service (NHS).

Digging for Data in Archives

Since our last post the Trading Consequences team have been working with our identified and potential data providers to begin gathering digital data for the project.

As the various data providers were sending us millions of pages of text from digitized historical documents, I flew over to London to spend some time in the archives.

A major component of our Digging Into Data project will involve doing traditional historical research, in archives and using the digitized repositories, to provide a comparison between what the historians are able to find and what the data mining and visualization components discover. So I set about researching a few of the more interesting commodities flowing into London industry during the nineteenth century. This included archival records related to the palm oil trade in west Africa and records at Kew Gardens’ archives related to John Eliot Howard’s scientific investigations into cinchona and quinine. John Eliot was one of the “Sons” in Howard & Sons, who manufactured chemicals and drugs in Stratford (near the site of the 2012 Olympics) throughout the nineteenth century. After photographing most of his papers at Kew, I also spent time at the London Metropolitan Archive, looking through the company records. It was at the LMA that I was reminded of the disappointments often associated with historical research. It turned out that the single most interesting document listed in the archival holdings, a ledger listing the imports of cinchona bark throughout the middle of the century, had been destroyed at some point, and a second document on the company’s trade with plantations in Java is missing.

After collecting enough material to begin my study of the relationships between factories in the Thames Estuary and commodity frontiers in South America, Africa and India, I focused my final day in the archive on a set of sources that will directly assist with the data mining aspects of the project. I recorded four years of customs ledgers, which record the quantity, declared value and country of origin of the hundreds of different commodity categories imported into Britain (everything from live animals to works of art). This source will provide the foundation of the taxonomy of commodities that we will create over the next few months, which will then be used to mine the data. Moreover, these ledgers provide a good starting point for our research into Canada’s trade with Britain, and we are recording the quantity and value of all the goods shipped across the Atlantic. Just through the monotonous process of photographing a few thousand pages, the major changes between the early and late nineteenth century began to stand out. Not only were there a lot more commodities by the century’s end, but Britain was relying on far more countries to supply it with raw materials.


Digitisation as research

With the REF on the horizon, most academics are currently concerned with matters of impact and academic recognition. Therefore, getting academic recognition for a digitisation project, such as those funded under the JISC eContent programmes, is an important question. In order to receive JISC funding to digitise content, one has, of course, to demonstrate the academic value of the resource to be digitised, and to explain how making it available digitally will increase that value. The impact and value of digitisation outputs themselves, and how they fit into peer-review structures, has been the subject of previous studies, but the issue of getting credit for undertaking digitisation itself is less clear. This can cause problems when dealing with outside bodies concerned with the review or evaluation of research; or even with one’s own institution. In some cases, for example, digitisation activities might be interpreted as software development or IT support, thus preventing those involved from getting academic credit. How this classification is made varies from HEI to HEI. In some cases, an email from the PI or Co-I confirming that the project is ‘research’ will suffice, in others there is a questionnaire or some other pro forma. However they classify activities, most Higher Education Institutions adopt the principles of the Frascati Manual’s definition of research, or something very similar to them. These break research down into three headings:

  • Basic research is experimental or theoretical work undertaken primarily to acquire new knowledge of the underlying foundation of phenomena and observable facts, without any particular application or use in view.
  • Applied research is also original investigation undertaken in order to acquire new knowledge. It is, however, directed primarily towards a specific practical aim or objective.
  • Experimental development is systematic work, drawing on existing knowledge gained from research and/or practical experience, which is directed to producing new materials, products or devices, to installing new processes, systems and services, or to improving substantially those already produced or installed.

Most academic digitisation work is likely to fall into the third category, provided that the digital resources made available are accompanied by some form of enhancement, such as machine-readable mark-up or a crowd-sourcing platform. This is especially so if it can be shown that the enhancement draws directly on the project team’s experience and expertise. Certainly in the context of the DEEP project, there are complicated questions of data structure, interpretation and mark-up, the exploration of which would appear as research questions to most scholars, and deserving of recognition as such. Undoubtedly they require the highly interdisciplinary skill set of all the partners.

Projects needing to make this argument may wish to consider the following suggestions:

1. Ensure the research question or questions that your resource will be addressing is clearly articulated, and that you have to hand a clear statement describing the unique knowledge needed to make it digitally available in the way you have chosen.

2. Refer to the Frascati guidelines, and any relevant institutional definitions of research and related activities.

3. Ensure you are talking to the right person. It may be the case that staff charged with classifying activities are not familiar with digitisation. This is especially so in departments or schools with little experience of such projects. In such cases, the decision on whether to classify the project as research may well need to be taken at a higher level than normal.

Both the Centre for Data Digitisation and Research at QUB and the Centre for e-Research in the Dept. of Digital Humanities at KCL have extensive experience in dealing with such projects, and would be happy to offer discussion and advice to any project which needs to make the argument that their work constitutes research.



Guest Blog Post: SPIRES Network Technological Spaces Event

We have a short guest blog post this week from Mòrag Burgon-Lyon of SPIRES who have an event coming up in October that should be of interest to those using AddressingHistory.

SPIRES is a network for researchers, young, old and somewhere in between, in academia, industry and leisure.  They run seminars and workshops, provide travel funding for these and other events, promote discussion and generally support members in any way they can. Anyone can join SPIRES (it’s free!) and you can find out more about how to do this on their about page.

The SPIRES (Supporting People who Investigate Research Environments and Spaces) network would like to invite some leisure researchers to join our next workshop on Technological Spaces at City University, London on 7th October.  We aim to get people together from academia, industry and leisure research for networking, and to better understand the physical, social and digital environments in which research is conducted.

The day will comprise short talks of around 15 minutes on various topics, discussion sessions and group activities.  Confirmed talks include a digital curator from the British Library about the Growing Knowledge exhibition and some academic projects on digital tools including SerenA (a Serendipity Arena) and Brain (Building Research and Innovation Networks).  More talks are in the pipeline from academic and industry speakers.

If you would like to present a short talk about your research and the tools (digital and otherwise) you use, we would love to hear from you!  If you would rather not present a talk but would still like to attend the workshop, or just join the SPIRES network (it is free, and there are lots of benefits), please get in touch.  Assistance with travel costs is available for workshop attendees (though please check with me before booking travel) and lunch will be provided.  Contact @SPIRES13 on Twitter, or email us.  Further information is also available on our website.