2012 FOSS4G-CEE Conference

Long time no post. Well, the best things come to those who wait, and today we have a guest blog from fellow EDINA Geodata team member James Crone. James attended the recent FOSS4G-CEE Conference, which was held at the Faculty of Civil Engineering, Czech Technical University in Prague between the 21st and 23rd of May. Over to James…

Seen as a regional add-on to the global FOSS4G conference, which attracts developers and users of open source geospatial software as well as managers and decision-makers and which will be held in Beijing this year, FOSS4G-CEE focuses on all things open source and geospatial in Central and Eastern Europe. The official language of FOSS4G-CEE was English.

The conference consisted of workshops followed by parallel presentation/tutorial streams, unconference birds of a feather sessions and post-conference code sprints. I only attended the presentation streams which ran from Monday afternoon through to Wednesday.

The Plenary session on Monday consisted of introductory talks on different strands of what is meant by Open. Arnulf Christl of OSGeo/metaspatial covered open software; Athina Trakas of the Open Geospatial Consortium covered open standards whilst Markus Neteler of Edmund Mach Foundation covered open science. A local Central and East Europe flavour was provided by Jiri Polacek of the Czech Office for Surveying, Mapping and Cadastre who covered cadastre and INSPIRE in the Czech Republic and Vasile Craciunescu of the Romanian National Meteorological Administration / geo-spatial.org who provided an overview of open source software projects, applications and research projects using open source geospatial software in the Central and Eastern Europe region.

From Tuesday through to Wednesday the presentations proper started. Thematically, the presentations were grouped around INSPIRE, case studies of the use of geospatial FOSS, geoinformatics and the more technical data/development topics. As an opportunity to track changes in open geospatial software itself, I mostly attended the technical data/development presentations.

There were many awesome things presented during FOSS4G-CEE but my top three were:

1. MapServer

EDINA have been using MapServer, the open source platform for publishing spatial data to the web, for some time. The next release, MapServer 6.2, promises improved cartography, map caching and feature serving. The first two of these were covered in two talks by Thomas Bonfort of Terriscope.

In Advanced Cartography with MapServer 6.2, Thomas described some of the improved features that will be available for rendering vector data through MapServer, including better support for complex symbols and improvements to feature labelling.

Nobody likes waiting for their maps. In a second presentation, MapServer MapCache, the fast tile serving solution, Thomas described MapServer MapCache, which provides all of the features of a certain tile caching system with added goodness in the form of increased performance, native MapServer sources without the overhead of going through a WMS, and configuration directly within the mapfile.

MapServer 6.2 certainly seems like it could be a release to watch for.

2. PostGIS Topology

Vincent Picavet of Oslandia provided an introduction to graphs and topology in PostGIS.

Here at EDINA we use PostGIS extensively within services such as UKBORDERS and Digimap. Within our UKBORDERS service we provide academics with access to digital boundary datasets, so we've been tracking developments in the storage of topology within PostGIS with a great deal of interest. With PostGIS topology we can store shared boundaries only once, which is good for data normalisation and helps when it comes to generalising boundary datasets. These capabilities, along with network operations such as routing, were demonstrated in Vincent's very informative talk.
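For the curious, here is a minimal sketch of what moving a polygon table into a PostGIS topology can look like. The table and column names are purely illustrative (not our UKBORDERS schema), but the topology functions are the standard PostGIS ones.

```python
# Minimal sketch: move a hypothetical 'boundaries' polygon table into a
# PostGIS topology so that shared boundaries are stored only once.
# Table/column names and connection details are illustrative placeholders.
import psycopg2

conn = psycopg2.connect("dbname=gis user=postgres")
cur = conn.cursor()

# Create an empty topology in British National Grid (EPSG:27700).
cur.execute("SELECT topology.CreateTopology('boundaries_topo', 27700)")

# Register a TopoGeometry column on the source table; this returns a layer id.
cur.execute("""
    SELECT topology.AddTopoGeometryColumn(
        'boundaries_topo', 'public', 'boundaries', 'topo_geom', 'POLYGON')
""")
layer_id = cur.fetchone()[0]

# Populate it: edges shared between adjacent polygons become single, shared
# edges in the topology, which helps normalisation and keeps generalisation
# consistent across neighbouring areas.
cur.execute(
    """
    UPDATE boundaries
    SET topo_geom = topology.toTopoGeom(geom, 'boundaries_topo', %s, 0.1)
    """,
    (layer_id,),
)

conn.commit()
cur.close()
conn.close()
```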

Although not related to topology, in a later talk, Efficiently using PostGIS with QGIS, Vincent mentioned numerous extremely useful features and plugins for working with PostGIS from QGIS. Once back in the EDINA office I duly installed the Fast SQL Layer plugin, which has made working with PostGIS in QGIS even nicer than it was before.
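The plugin mostly just gives you a handy UI, but for completeness here is a rough PyQGIS sketch of the underlying idea: loading the result of an arbitrary SQL query from PostGIS as a map layer. The connection details and query are made up, and the class names are the ones in current QGIS releases, so treat it as a sketch rather than a recipe.

```python
# Rough PyQGIS sketch (run from the QGIS Python console): load the result of
# an arbitrary SQL query from PostGIS as a map layer. Connection details,
# table and column names are placeholders.
from qgis.core import QgsDataSourceURI, QgsVectorLayer, QgsMapLayerRegistry

uri = QgsDataSourceURI()
uri.setConnection("localhost", "5432", "gisdb", "gisuser", "secret")

# A bracketed subquery can stand in for a table name, so any SQL can be mapped.
uri.setDataSource(
    "",                                                        # schema (empty for a subquery)
    "(SELECT gid, name, geom FROM boundaries WHERE name ILIKE 'a%')",
    "geom",                                                    # geometry column
    "",                                                        # extra SQL filter
    "gid")                                                     # primary key column

layer = QgsVectorLayer(uri.uri(), "boundaries_a", "postgres")
if layer.isValid():
    QgsMapLayerRegistry.instance().addMapLayer(layer)
```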

3. TinyOWS

The talk TinyOWS, the high performance WFS Server, by Vincent Picavet of Oslandia, showcased some of the features of TinyOWS, a lightweight, fast implementation of the OGC WFS-T standard. Tightly coupled to PostGIS, TinyOWS will be released as part of MapServer 6.2.
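To give a flavour of what a client sees, here is a small Python sketch of a standard WFS GetFeature request against a hypothetical TinyOWS endpoint. The URL and layer name are placeholders; transactional edits (the "T" in WFS-T) are POSTed to the same endpoint as XML Transaction documents.

```python
# Sketch of querying a WFS endpoint (such as one served by TinyOWS) from
# Python using standard OGC WFS 1.1.0 key-value parameters. The URL and
# layer name are hypothetical placeholders.
import requests

WFS_URL = "http://example.org/cgi-bin/tinyows"

params = {
    "service": "WFS",
    "version": "1.1.0",
    "request": "GetFeature",
    "typename": "tows:parcels",      # layer as declared in the TinyOWS config
    "bbox": "440000,1090000,450000,1100000",
    "maxfeatures": "10",
}

resp = requests.get(WFS_URL, params=params, timeout=30)
resp.raise_for_status()
print(resp.text[:500])   # a GML FeatureCollection describing the parcels
```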

Real world use of TinyOWS was demonstrated in a talk held during a Wednesday morning session: IPA-Online, an application built on FOSS to assist Romanian farmers to prepare their application form for direct payments, by Boris Leukert.

The IPA-Online system allows Romanian farmers to prepare single area payment applications for EU subsidies by drawing parcel boundaries in an online application, replacing a previously manual system of drawing the parcels on paper maps. It is built around MapBender, MapServer, PostgreSQL and PostGIS, with TinyOWS providing WFS-T, and supports a very large number of concurrent users. Boris concluded that deploying a system based on geospatial FOSS brought savings in time, money and the environment, removing the need to print 1.6 million paper maps.

Overall, attendance at FOSS4G-CEE was very worthwhile. Slides for these and other talks are available for viewing over at the FOSS4G-CEE homepage.

GISRUK 2012 – Thursday

The second part of GoGeo's review of GISRUK 2012 covers Thursday. If you want to find out what happened on Wednesday, please read this post.

Thursday saw a full programme of talks split between two parallel sessions. I chose to go to the Landscape Visibility and Visualisation strand.

  • Steve Carver (University of Leeds) started proceedings with No High Ground: visualising Scotland’s renewable landscape using rapid viewshed assessment tools. This talk brought together new modelling software that allows multiple viewsheds to be analysed very quickly with a practical and topical subject. The SNP want Scotland to be self-sufficient in renewable energy by 2020, an ambitious target. In 2009, 42% of Scotland’s “views” were unaffected by human developments; by 2011 this had declined to 28%. Wind farms are threatening the “wildness” of Scotland and this may have implications for tourism. Interestingly, the SNP also wants to double the income from tourism by 2020. So how can you achieve both? By siting new wind farms in areas that do not further impact on the remaining wild areas. This requires fast and efficient analysis of viewsheds, which is what Steve and his team presented (a toy sketch of the underlying line-of-sight test follows this list).
  • Sam Meek (University of Nottingham) was next up, presenting on The influence of digital surface model choice on the visibility-based mobile geospatial application. Sam’s research focused on an application called Zapp. He is looking at how to efficiently and accurately run visibility models on mobile devices in the field and how the results are influenced by the surface model; in each case, all processing is done on the device. Resampling detailed DTMs obviously makes processing less intensive, but it often leads to issues such as smoothing of features. Other general issues with visibility models are stepping, where edges form in the DTM and interrupt the line of sight, and an overestimation of vegetation. This research should help make navigation apps on mobiles that use visual landmarks to guide the user more accurate and usable.
  • Possibly the strangest and most intriguing paper title at GISRUK 2012 came from Neil Sang (Swedish University of Agricultural Sciences) with New Horizons for the Stanford Bunny – A novel method for view analysis. The “bunny” reference was a bit of a red herring, but the research did look at horizon-based view analysis. The essence was to identify horizons in a landscape to improve the speed of viewshed analysis, as the horizons often persist even when the local position changes.
  • The final paper of the session took a different direction, with David Miller of The James Hutton Institute looking at Testing the public’s preferences for future. This linked public policy with public consultations through the use of virtual reality environments. The research investigated whether familiarity with the location altered opinions of planned changes to the landscape. Findings showed agreement on developing amenity woodland adjacent to a village, and on environmental protection, but differences arose in relation to proposals for medium-sized wind farms (note: medium-sized wind farms are defined as those that might be constructed to supply power to a farm, not commercial wind farms).
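As promised above, here is a toy illustration of the line-of-sight test that sits at the heart of any viewshed calculation: walk the DEM cells between observer and target and check whether the terrain ever rises above the sight line. It is deliberately naive (a random numpy array stands in for the DEM) and nothing like the optimised implementations presented in this session.

```python
# A toy line-of-sight check on a gridded DEM - the basic building block of a
# viewshed. The DEM here is a random numpy array purely for illustration.
import numpy as np

def line_of_sight(dem, obs, target, obs_height=1.7):
    """Return True if the 'target' cell is visible from the 'obs' cell."""
    (r0, c0), (r1, c1) = obs, target
    n = max(abs(r1 - r0), abs(c1 - c0))
    if n == 0:
        return True
    z0 = dem[r0, c0] + obs_height          # eye height above the observer cell
    z1 = dem[r1, c1]
    for i in range(1, n):
        t = i / n
        r = int(round(r0 + t * (r1 - r0)))
        c = int(round(c0 + t * (c1 - c0)))
        sight_z = z0 + t * (z1 - z0)       # height of the sight line here
        if dem[r, c] > sight_z:
            return False                   # terrain interrupts the line of sight
    return True

rng = np.random.default_rng(0)
dem = rng.uniform(0, 100, size=(200, 200))
print(line_of_sight(dem, (10, 10), (150, 180)))
```

A full viewshed simply repeats this test for every cell in the grid (or, as in the talks above, uses much cleverer sweep and horizon methods to avoid doing so).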

After coffee I chose to go to the Qualitative GIS session as it provided an interesting mix of papers that explored social media and enabling “the crowd”.

  • First up was Amy Fowler (Lancaster University), who asked How reliable is citizen-derived scientific data? This research looked at the prevalence of aircraft contrails using data derived through the Open Air Laboratories (OPAL) Climate Survey. Given the dynamic nature of the atmosphere, it is impossible to validate user-contributed data. Amy hopes to script an automated confidence calculator to analyse nearly 9,000 observations, but initial analysis suggests that observations with accompanying photographs tend to be more reliable.
  • Iain Dillingham (City University) looked at Characterising Locality Descriptors in crowd-sourced information, with a specific focus on humanitarian organisations. Using the wealth of data available from the 2010 Haiti earthquake, they investigated the uncertainty of location from social media and looked at georeferencing locality descriptors in MaNIS (the Mammal Networked Information System). The conclusion was that while there were similarities in the datasets, the crowd-sourced data presented significant challenges with respect to vagueness, ambiguity and precision.
  • The next presentation changed the focus somewhat: Scott Orford (Cardiff University) presented his work on Mapping interview transcript records: technical, theoretical and cartographical challenges. This research formed part of the WISERD project and aimed to geo-tag interview transcripts. Geo-tagging was done using UNLOCK, but there were several issues with getting useful results out, or reducing the noise in the data. Interviews were transcribed in England, and complicated Welsh placename spellings often got transcribed incorrectly. In addition, filler phrases such as “Erm” were quite frequent and got parsed as place names, so had to be removed as they did not actually relate to a place. Interesting patterns did emerge about which areas appeared to be of interest to different people in different regions of Wales, but care had to be taken in preparing and parsing the dataset.
  • Chris Parker (Loughborough University) looked at Using VGI in design for online usability: the case of access information. Chris used a number of volunteers to collect data on accessibility to public transport. The volunteers might be considered an expert group as they were all wheelchair users. A comparison was made between an official map and one that used the VGI data. It was found that the public perception of quality increased when VGI data was used, making it an attractive and useful option for improving confidence in online information. However, it would be interesting to look at this issue with a more mixed crowd of volunteers, rather than just the expert user group who seem to have been commissioned (but not paid) to collect specific information. I am also not too sure where the term usability from the title fits. Trusting the source of online data may increase its use, but this is not usability, which refers more to the ability of users to engage with and perform tasks on an interface.

There was a good demonstration from ESRI UK of their ArcGIS.com service. This allows users to upload their own data, theme it and display it against one of a number of background maps. The service then allows you to publish the map and restrict access to it by creating groups. Users can also embed the map into a website by copying some code that is automatically created for them. All good stuff; if you want to find out more then have a look at the ArcGIS.com website.

Most of Friday was given over to celebrating the career of Stan Openshaw.  I didn’t work with Stan but it is clear from the presentations that he made a significant contribution to the developing field of GIS and spatial analysis and had a huge effect on the development of many of the researchers that regularly attend GISRUK.  If you want to find out more about Stan’s career, have a look at the Stan Openshaw Collection website.

Friday’s keynote was given by Tyler Mitchell, who was representing the OSGeo community. Tyler was a key force in the development of the OSGeo group and has championed the use of open software in GIS. His presentation focused on interoperability and standards and how they combine to allow you to create a software stack that can easily meet your GIS needs. I will try to get a copy of the slides of Tyler’s presentation and link to them from here.

GISRUK 2012 – Wednesday

GISRUK 2012 was held in Lancaster, hosted by Lancaster University. The conference aimed to cover a broad range of subjects including Environmental Geoinformatics, Open GIS, Social GIS, Landscape Visibility and Visualisation, and Remote Sensing. In addition to the traditional format, this year’s event celebrated the career of Stan Openshaw, a pioneer in the field of computational statistics and a driving force in the early days of GIS.

Wednesday

The conference kicked off with a keynote from Professor Peter Atkinson of the University of Southampton. This demonstrated the use of remotely sensed data to conduct spatial and temporal monitoring of environmental properties. The Landsat archive provides researchers with 40 years of data, making it possible to track longer term changes. Peter gave two use case examples:

  1. River channel monitoring on the Ganges. The Ganges forms the international boundary between India and Bangladesh, so understanding channel migration is extremely important for both countries. The influence of man-made structures, such as barrages built to divert water to Calcutta, can have a measurable effect on the river channel; barrages were found to stabilise the migrating channel.
  2. Monitoring regional phenology. Studying the biomass of vegetation is tricky, but using “greenness” as an indicator provides a useful measure that can be calculated for large areas, up to continent scale. Peter gave an example where MODIS and MERIS data had been used to calculate the greenness of India. Analysis at this scale and resolution reveals patterns and regional variation, such as the apparent “double greening” of the western Ganges basin, which would allow farmers to have two harvests for some crops (a generic sketch of how a greenness index is computed follows this list).
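As flagged in the second example, “greenness” is commonly summarised per pixel with a vegetation index such as NDVI, computed from red and near-infrared reflectance. The numpy sketch below uses synthetic values purely for illustration; it is not the actual MODIS/MERIS processing chain from the talk.

```python
# Generic "greenness" sketch: Normalised Difference Vegetation Index (NDVI),
# computed per pixel as (NIR - Red) / (NIR + Red). Synthetic arrays only.
import numpy as np

def ndvi(red, nir):
    red = red.astype("float64")
    nir = nir.astype("float64")
    denom = nir + red
    # Guard against division by zero over water/no-data pixels.
    return np.where(denom == 0, 0.0, (nir - red) / np.where(denom == 0, 1, denom))

red = np.array([[0.05, 0.10], [0.20, 0.30]])
nir = np.array([[0.60, 0.50], [0.30, 0.31]])
print(ndvi(red, nir))   # values near +1 indicate dense green vegetation
```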

However, these monitoring methods are not without their challenges and limitations. Remote sensing provides continuous data on a regular grid, whereas ground based measurements are sparse and may not tie in, spatially or temporally, with the remotely sensed data. Ground based phenology measurements can also be derived using a number of different methods, making it difficult to make comparisons. A possible solution would be to adopt a crowd-sourcing approach where data is collected and submitted by enthusiasts in the field. This would certainly result in a better spatial distribution of ground based measurements, but would the resulting data be reliable? Automatically calculating greening from web-cams is currently being trialled.

The first session was then brought to a close with two talks on the use of terrestrial LiDAR. Andrew Bell (Queen’s University Belfast) is investigating the use of terrestrial LiDAR for monitoring slopes. DEMs are created from the scans and used to detect changes in slope, roughness and surface. The project aims to create a probability map to identify surfaces that are likely to fail and cause a hazard to the public. Andrew’s team will soon receive some new airborne LiDAR data; however, I feel that if this technique is to be useful to the highways agency, the LiDAR would have to be mounted on a car, as cost and repeatability would be two key drivers. Andrew pointed out that this would reduce the accuracy of the data, but perhaps such a reduction would be acceptable and change would still be detectable.

Neil Slatcher’s (Lancaster University) paper discussed the importance of calculating the optimum location to deploy a terrestrial scanner. Neil’s research concentrated on lava flows, which meant the landscape was rugged, some areas were inaccessible and the target was dynamic and had to be scanned in a relatively short period of time. When a target cannot be fully covered by just one scan, analysis of the best positions to give complete coverage is needed. Further, a 10Hz scanner makes 10 measurements per second, which sounds quick, but a dense grid can still result in scan times in excess of 3 hours. By sub-dividing the scan into smaller scan windows centred over the target you can significantly reduce the size of the grid and the number of measurements required, and hence the time it takes to acquire the data. This method reduced scan times from 3 hours to 1 hour 15 minutes (a rough back-of-the-envelope version of this arithmetic is sketched below).
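Very roughly, and with grid sizes invented purely to match the quoted times rather than Neil’s actual figures:

```python
# Back-of-the-envelope arithmetic for terrestrial scan times at 10 Hz
# (10 range measurements per second). Grid dimensions are made up to roughly
# reproduce the quoted 3 h and 1 h 15 min figures.
RATE_HZ = 10

def scan_time_hours(n_points, rate_hz=RATE_HZ):
    return n_points / rate_hz / 3600

full_grid = 360 * 300          # one dense grid over the whole scene
windowed  = 3 * 150 * 100      # three smaller windows centred on the target

print(f"full grid: {full_grid} pts -> {scan_time_hours(full_grid):.2f} h")   # ~3.00 h
print(f"windowed : {windowed} pts -> {scan_time_hours(windowed):.2f} h")     # ~1.25 h
```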

The final session of the day had two parallel sessions, one on Mining Social Media and the other on Spatial Statistics. Both are interesting subjects, but I opted to attend the Social Media strand.

  • Lex Comber (University of Leicester) gave a presentation on Exploring the geographies in social networks.  This highlighted that there are many methods for identifying clusters or communities in social data but that the methods for understanding what a community means are still quite primitive.
  • Jonny Huck (Lancaster University) presented on Geocoding for social networking of social data. This focused on the Royal Wedding as it was an announced event that was expected to generate traffic on social media, allowing the team to plan rather than react. They found that less than 1% of tweets contained explicit location information. You could parse the tweets to extract geographic information, but this introduced considerable uncertainty. Another option was to use the location information in users’ profiles and assume they were at that location. The research looked at defining levels of detail, so Lancaster University Campus would be defined as Lancaster University Campus / Lancaster / Lancashire / England / UK. By geocoding the tweets at as many levels of detail as possible you could then run analysis at the appropriate level; what you had to be careful of was creating false hot-spots at the centroids of each country (a small sketch of this multi-level geocoding follows this list).
  • Omar Chaudhry (University of Edinburgh) explained the difficulties in Modelling Confidence in Extraction of Place Tags from Flickr. Using Edinburgh as a test case, they tried to use Flickr tags to define the dominant feature of each grid cell covering central Edinburgh. Issues arose when many photos were tagged for a personal event such as a wedding, and efforts were made to reduce the impact of these events. Weighting the importance of a tag by the number of users who used it, rather than the absolute number of times it was used, seemed to improve results. There was still the issue of tags relating to what the photo was of, rather than where it was taken. Large features such as the Castle and Arthur’s Seat dominated the coarser grids as they are visible over a wide area.
  • Andy Turner and Nick Malleson (University of Leeds) gave a double header as they explained Applying geographical clustering methods to analyse geo-located open micro-blog posts: a case study of tweets around Leeds. The research showed just how much information you can extract from location information in tweets, almost giving you a socio-economic profile of the people posting them. There was some interesting discussion around the ethics of this, specifically in relation to the Data Protection Act, which states that data can only be used for the purpose for which it was collected. Would this research/profiling be considered the purpose the original data had been collected for? Probably not. However, that was part of the research: to see what you could do, and hence what companies could do if social media sites such as Twitter start to allow commercial organisations to access your personal information. For more information, look at this paper or check out Nick’s blog.
  • One paper that was suggested as a good read on relating tweets to place and space was Tweets from Justin Bieber’s heart: the dynamics of the location field in user profiles.
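As mentioned in Jonny Huck’s item above, here is a toy sketch of the multi-level geocoding idea. The gazetteer, coordinates and matching logic are invented placeholders; the point is simply that each tweet can be resolved at several levels of detail, analysis then picks the appropriate level, and matches resolved only to a country should not be mapped as points (which is what creates false hot-spots at country centroids).

```python
# Toy multi-level geocoding of a free-text profile location. The gazetteer
# entries and coordinates are made up for illustration only.
GAZETTEER = {
    "lancaster university campus": ("campus",  (54.010, -2.785)),
    "lancaster":                    ("town",    (54.047, -2.801)),
    "lancashire":                   ("county",  (53.800, -2.600)),
    "england":                      ("country", (52.355, -1.174)),
}

def geocode_levels(profile_location):
    """Return every gazetteer match, one per level of detail."""
    text = profile_location.lower()
    return [(level, coords, name)
            for name, (level, coords) in GAZETTEER.items()
            if name in text]

matches = geocode_levels("Lancaster University Campus, Lancashire, England")
for level, coords, name in matches:
    print(f"{level:8s} {name:30s} {coords}")

# Only matches finer than 'country' are safe to map as individual points.
point_worthy = [m for m in matches if m[0] != "country"]
```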

I will post a summary of Thursday as soon as I can.

UK Biobank

While watching the news on Friday night (yes, it doesn’t get much more exciting than that these days) I saw a piece on UK Biobank. UK Biobank is a major national health resource with the aim of improving the prevention, diagnosis and treatment of a wide range of serious and life-threatening illnesses, including cancer, heart disease, stroke, diabetes, arthritis, osteoporosis, eye disorders, depression and forms of dementia. UK Biobank recruited 500,000 people aged between 40 and 69 years in 2006-2010 from across the country to take part in the project. Participants have undergone measurements, provided blood, urine and saliva samples for future analysis, given detailed information about themselves and agreed to have their health followed.

So what is the significance of this study? There are a couple of important differences from previous studies, the most obvious being the number of participants. Half a million subjects is a huge, and importantly a statistically powerful, sample size. It should allow researchers to cut through background noise and discover trends that have not been apparent in smaller studies. Another difference is that this study covers a diverse range of people: some are already suffering from an illness, but many are perfectly healthy. Much of the previous research has focused only on those already suffering from an illness.

So why am I writing about the UK Bio Bank on a geospatial blog? Well, along with the wide array of physiological measurements that are being collected about each subject, the research team are collecting information about:

  • where participants live
  • where they grew up
  • where they have lived throughout their life
  • the income of their family while they grew up
  • their employment

and I am sure many more things. This gives the study a spatial element, and geographical factors can have a strong influence on health. With such a large sample size, GIS is the obvious tool to analyse and extract patterns from the noise in the data. Packages such as ArcGIS and R will help researchers explore the dataset. I am sure UK Biobank will become an important research resource in years to come and will have a significant impact on epidemiology research.

UK Biobank was established by the Wellcome Trust medical charity, the Medical Research Council, the Department of Health, the Scottish Government and the Northwest Regional Development Agency. It has also had funding from the Welsh Assembly Government and the British Heart Foundation. UK Biobank is hosted by the University of Manchester and supported by the National Health Service (NHS).