Visualising OS MasterMap® Topography Layer Building Height Attribute in AutoCAD Map 3D and InfraWorks

We’ve recently written blogs on visualising OS MasterMap® Topography Layer Building Height Attribute (BHA) data in ESRI’s ArcGIS and ArcGlobe and also in QGIS. These blogs have proved very popular, so we have written a third instalment on how to achieve similar results with Autodesk products: AutoCAD Map 3D and InfraWorks.

Please see the previous post for information on BHA data coverage, an explanation of the different height attributes supplied by Ordnance Survey (OS) and for details of further information sources, including the excellent Getting Started Guide produced by OS. Please remember this is an alpha release of the data and OS do not guarantee that BHA is error-free or accurate. Additionally the dataset is not yet subject to update and maintenance.

Getting Started

Download the following datasets for your area of interest from Digimap using the OS Data Download application:

  1. OS MasterMap® Topography Layer: select ‘GML’ as the format for your data rather than ‘DWG’ as we need to get to some of the raw data values stored in the GML file, and limit your download to just the Buildings using the Layers drop-down in the basket.
  2. OS Terrain™ 50 DTM: this will be used as the base (surface) heights for the area;
  3. BHA data (BHA data is found in the ‘OS MasterMap’ group): select CSV as the format;
  4. Optionally download any additional data you may wish to use as a backdrop draped over the DTM surface. In this example we’re going to use OS MasterMap® 1:2,000 Raster but other datasets could be used.

Preparing BHA data for use

If your downloaded BHA data is made up of more than one CSV file we recommend merging them all together into a single CSV file first to make subsequent processing easier and quicker. Use a text editor such as Notepad or TextPad rather than Excel, as Excel can change the formatting of numbers which contain leading zeros.
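
If you would rather script the merge than do it by hand, a short Python sketch along these lines concatenates the tiles while keeping a single header row and copying the TOID strings verbatim, so no leading zeros are lost. The folder and file names here are placeholders, and it assumes every tile shares the same header row:

    import csv
    import glob

    # Placeholder pattern: point this at wherever your downloaded BHA tiles live.
    input_files = sorted(glob.glob("bha_tiles/*.csv"))

    with open("bha_merged.csv", "w", newline="") as merged:
        writer = csv.writer(merged)
        header_written = False
        for path in input_files:
            with open(path, newline="") as tile:
                reader = csv.reader(tile)
                header = next(reader)        # each tile starts with the same header row
                if not header_written:
                    writer.writerow(header)
                    header_written = True
                writer.writerows(reader)     # rows are copied verbatim, so leading zeros survive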

Each object in MasterMap Topography Layer has a unique identifier called a Topographic Identifier, or TOID for short. TOIDs supplied by OS take the format of a 13- or 16-digit number prefixed with ‘osgb’, e.g. ‘osgb1000039581300’ or ‘osgb1000002489201973’. Some applications, including AutoCAD Map 3D, automatically strip off the ‘osgb’ prefix and add three leading zeros to any TOID that has only 13 digits to make them all 16 characters long. In order to make it easier to join BHA data to building features in MasterMap® Topography Layer, the BHA files supplied by EDINA have two TOID values:

  • os_topo_toid_digimap is the TOID formatted to match TOIDs in AutoCAD Map 3D, ArcGIS and in the File Geodatabase format supplied through Digimap.
  • os_topo_toid is the original TOID as supplied by Ordnance Survey

You should check the TOID values in your MasterMap data and those in the BHA data to ensure that there is a common field that you can use to match on; we will use os_topo_toid_digimap as this field in the BHA data matches the TOID values in the MasterMap data when used in AutoCAD Map 3D.
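
For reference, the reformatting described above (drop the ‘osgb’ prefix and left-pad 13-digit TOIDs with three zeros to give 16 characters) is easy to reproduce yourself. A minimal Python sketch, should you ever need to derive the os_topo_toid_digimap form from an original OS TOID, might look like this:

    def toid_to_digimap(toid):
        """Convert an OS TOID such as 'osgb1000039581300' to the 16-character,
        zero-padded form used in the BHA os_topo_toid_digimap column."""
        digits = toid[4:] if toid.lower().startswith("osgb") else toid
        return digits.zfill(16)  # 13-digit TOIDs gain three leading zeros

    print(toid_to_digimap("osgb1000039581300"))     # -> 0001000039581300
    print(toid_to_digimap("osgb1000002489201973"))  # -> 1000002489201973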

Open the merged CSV file in Excel. To ensure that the data is displayed correctly in Excel you should import the data as follows:

  1. Open a blank Excel document then use the ‘From Text’ option, which can be found on the Data ribbon. This allows you to specify the correct field delimiter and the data types for the TOID columns, ensuring they are imported as text fields.
  2. Import the data as a ‘delimited’ file.
  3. Specify ‘Comma’ as your delimiter.
  4. On the Text Import Wizard – Step 3 of 3 window select the first column in the ‘Data Preview’ section and set the ‘Column data format’ to ‘Text’. Repeat this step for the second column. This ensures that Excel treats the two TOID columns as text rather than numbers and so doesn’t strip off the leading zeros from any of the values (which are needed when joining the data to the building features in MasterMap later on).
  5. Press Finish to complete the process, after which your data in Excel should look like the image below, with the TOID values in the first column all 16 characters long and including three leading zero characters where necessary:
    Excel showing TOID values imported as text
  6. AutoCAD Map 3D requires a ‘named range’ of cells to connect to. To create this, select all the cells that contain data in the workbook and, using the ‘Name Box’, give this selection of cells a name. In the screen grab below we have called the selection ‘BuildingHeightValues’ (note that the name cannot include spaces):
    Excel named range
  7. Save your file as an .xlsx file.
  8. The next step is to use the Windows ODBC Data Source Administrator to create a connection that points to this .xlsx file. Open the ODBC Data Source Administrator; the easiest way of doing this is to use the Windows search tool to search for ‘ODBC Data Source’.
    ODBC Data Connection Administrator
  9. On the ‘User DSN’ tab press the ‘Add…’ button to create a new ODBC connection:
    Add ODBC Connection
  10. Select ‘Microsoft Excel Driver’ and press the Finish button.
  11. Give your connection a name in the ‘Data Source Name’ field, and using the ‘Select Workbook…’ button browse and select the .xlsx file created above.
    ODBC Select File
  12. Press OK and the newly created User DSN will be listed:
    ODBC Data Source Administrator
  13. Press OK to close the ODBC Data Source Administrator.
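
Optionally, you can check that the new DSN and named range are readable before moving on. The sketch below is one way to do that from Python, assuming the pyodbc package is installed; ‘BHAData’ is only a placeholder for whatever you called your Data Source Name, and ‘BuildingHeightValues’ is the named range created in step 6:

    import pyodbc

    # Placeholder names: substitute the DSN and named range you created above.
    conn = pyodbc.connect("DSN=BHAData", autocommit=True)
    cursor = conn.cursor()

    # With the Excel ODBC driver, a named range is exposed as a table.
    cursor.execute("SELECT * FROM [BuildingHeightValues]")
    for row in cursor.fetchmany(5):  # print the first few rows as a sanity check
        print(row)

    conn.close()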

Preparing MasterMap Topography Layer GML data for use

  1. Open AutoCAD Map 3D.
  2. At the command prompt type: MAPIMPORT, or select ‘Map Import’ from the Insert menu.
    AutoCAD Map 3D Map Import
  3. Browse to the .gz file downloaded from Digimap, ensure the ‘Files of type’ drop-down is set to ‘OS (GB) Mastermap (*.gml, *.xml, *.gz)’.
    AutoCAD Map 3D Browse
  4. Import just the TopographicArea layer by deselecting all other layers in the import dialog.
    AutoCAD Map 3D Import
  5. Click on the word <None> in the Data Column for the TopographicArea layer.
  6. In the Attribute Data window select ‘Create object data’ and press OK.
    AutoCAD Map 3D Attribute Data
  7. Select ‘Import polygons as closed polylines’ and press OK.
    AutoCAD Map 3D Closed Polylines
  8. The data will be imported into your current map window. Note you may need to select View > Extents to see the data.
  9. The data needs to be converted to an .sdf file to allow the Building Height Attribute data to be joined to it.
    1. At the command prompt type: MAPEXPORT
    2. Select ‘Autodesk SDF (*.sdf)’ as the file type.
    3. On the Feature Class tab, click on the ‘Select Attributes…’ button.
    4. In the Select Attributes window select ‘Object Data’ and press OK.
      AutoCAD Map 3D Select Attributes
    5. On the Map Export window press OK to export the data.
  10. Connect to the .sdf file just created:
    1. In the Task Pane select Data > Connect to Data…
      AutoCAD Map 3D Connect To Data
    2. Select ‘Add SDF Connection’.
      AutoCAD Map 3D Add SDF Connection
    3. Give your connection a name and browse to the .sdf file exported in the previous step.
    4. Click the Connect button to establish the connection.
    5. Press the ‘Add to Map’ button to add this data to your current map window.
      AutoCAD Map 3D Add to map

You have now added the MasterMap buildings to your current map window; the next step is to connect to the Building Height Attribute data (the Excel spreadsheet) and join it to the building features in MasterMap.

Joining Building Height Attribute to buildings in MasterMap

  1. Connect to the BHA spreadsheet using the ODBC connection:
    1. In the Task Pane select Data > Connect to Data…
    2. Select ‘Add ODBC Connection’
    3. Give your connection a name and select the Data Source Name created above using the ‘…’ button next to the ‘Source’ field.
      AutoCAD Map 3D select DNS
    4. Press the ‘Test Connection’ button.
      AutoCAD Map 3D create ODBC connection
    5. The table in the bottom half of the window will display all named ranges in your spreadsheet; we called our named range ‘BuildingHeightValues’. Before you can select this range for use in AutoCAD Map 3D you need to select a column to use as the ‘Identify Property’. To do this click on the text that says ‘<Click to select>’.
      AutoCAD Map 3D create ODBC connection
    6. In the drop-down that appears put a tick in the box next to the value ‘os_topo_toid_digimap’.
      AutoCAD Map 3D create ODBC connection
    7. Now you can tick the box next to the named range in the spreadsheet and press the ‘Connect’ button.
      AutoCAD Map 3D create ODBC connection
    8. The connection details will be displayed.
      AutoCAD Map 3D create ODBC connection
    9. The Data Connection window can now be closed.
  2. In the Task Pane right click on the MasterMap data and select ‘Create a Join…’
    AutoCAD Map 3D create a join
  3. In the ‘Create a Join’ window:
    1. select the building height data (in the Excel spreadsheet) as the ‘Table (or feature class) to join to’;
    2. select ‘TOID’ in the left hand drop-down menu;
    3. select ‘os_topo_toid_digimap’ in the right hand drop-down menu;
    4. select ‘Keep only left-side records with a match’ in the ‘Type of Join’ section;
    5. press ‘OK’ to create the join.
  4. To verify that the join has worked, open the data table for the MasterMap data; this is done in the Task Pane by selecting the MasterMap data and then pressing the ‘Table’ button. The table will be displayed; scroll to the right to see the joined building height values:
    AutoCAD Map 3D attribute table
  5. The final step is to export the joined data as a new .sdf file which we can then visualise in 3D in InfraWorks. This is done by either right clicking on the MasterMap layer in the Task Pane and selecting ‘Export Layer Data to SDF…’ or by using the ‘Export to SDF’ function on the ‘Vector Layer’ ribbon in the ‘Save’ group.

Visualising the data in 3D using Autodesk InfraWorks

So far we have downloaded OS MasterMap® Topography Layer and BHA data for the same area and joined the two together to create a new dataset containing just the building features which now include the various height attributes published by OS. We also downloaded additional data to use as a backdrop draped over the DTM surface, in this example we will use OS MasterMap® 1:2,000 Raster, but OS VectorMap® Local Raster or OS 1:25,000 Scale Colour Raster would also be suitable depending on the scale of your study area.

Visualising the data in 3D is achieved using Autodesk’s InfraWorks product. The steps below describe how to use the application to create a 3D model:

  1. Open InfraWorks and create a new model.
  2. Specify a location to save the model and give it a name:
    InfraWorks New Model
  3. Click and drag the OS Terrain 50 DTM in to InfraWorks; the file to drag is the one with the .asc extension.
  4. In the Data Source Configuration window, ensure the Type is set to ‘Terrain’ and Coordinate System is set to ‘BritishNatGrid’:
    InfraWorks Data Source DTM
  5. Press the ‘Close & Refresh’ button; the DTM should be displayed:
    InfraWorks showing DTM only
  6. Click and drag into InfraWorks the .sdf file created in the final step of the previous section, which contains the heighted building data (i.e. the .sdf file created after joining the MasterMap buildings to the Building Height Attribute spreadsheet).
  7. In the Data Source Configuration window, set the ‘Type’ drop-down to ‘Buildings’ and select a suitable ‘Roof Height’ attribute using the drop-down on the Common tab. As with previous blogs we have used the RelH2 attribute as we found this gave the best overall representation of building heights relative to each other:
    Data source configuration
  8. On the ‘Geo Location’ tab select ‘BritishNatGrid’ as the coordinate system:
    Data source configuration
  9. On the Source tab select ‘Drape’ from the drop-down under the ‘Draping Options’:
    Data source configuration
  10. Press the ‘Close & Refresh’ button; the buildings should now be displayed on top of the DTM (you may need to pan or zoom to view the data):

    InfraWorks with 3d buildings draped over OS Terrain50

    OS Terrain™ 50 with buildings from OS MasterMap® Topography Layer extruded on top using Building Height Attribute data.

  11. To give some more context to the visualisation you can drape additional raster layers on top of the DTM, such as OS MasterMap® 1:2,000 Raster. This is done by selecting all the raster files and dragging them into the InfraWorks window.
  12. In the Data Source Configuration window ensure ‘Type’ is set to ‘Ground Imagery’, and on the ‘Geo Location’ tab select the ‘BritishNatGrid’ Coordinate System:
  13. Select the ‘Close & Refresh’ button and the map data will be draped over the DTM surface:

    InfraWorks with Terrain 50 DTM, MasterMap 1:2,000 Raster and heighted buildings

    OS MasterMap® 1:2,000 Raster draped on top of OS Terrain™ 50, with buildings from OS MasterMap® Topography Layer extruded on top using Building Height Attribute data.

The finished visualisation

The screen grab below shows the final visualisation centred on Biggar using OS MasterMap® 1:2,000 Raster as the surface layer.

InfraWorks visualisation

OS MasterMap® 1:2,000 Raster draped on top of OS Terrain™ 50, with buildings from OS MasterMap® Topography Layer extruded on top using Building Height Attribute data.

Visualising OS MasterMap® Topography Layer Building Height Attribute in QGIS

Our recent blog post about visualising OS MasterMap® Topography Layer Building Height Attribute (BHA) data in ArcGIS and ArcGlobe prompted a number of questions about whether it’s possible to do something similar in open source software. In this post we’ll show you how to achieve something similar using QGIS and the freely available Qgis2ThreeJS plugin.

Please see the previous post for information on BHA data coverage, an explanation of the different height attributes supplied by Ordnance Survey (OS) and for details of further information sources, including the excellent Getting Started Guide produced by OS. Please remember this is an alpha release of the data and OS do not guarantee that BHA is error-free or accurate. Additionally the dataset is not yet subject to update and maintenance.

Getting started

  1. Download the following datasets for your area of interest from Digimap using the OS Data Download application:
    1. OS MasterMap® Topography Layer: select the ‘File Geodatabase’ format for your data as this format does not require any conversion to use it in QGIS;
    2. OS Terrain™ 50 DTM: this will be used as the base (surface) heights for the area;
    3. BHA data (BHA data is found in the ‘OS MasterMap’ group): select CSV as the format;
    4. Optionally download any additional data you may wish to use as a backdrop, such as OS VectorMap® Local Raster or OS 1:25,000 Scale Colour Raster;
  2. Open QGIS and load in the OS MasterMap® Topography Layer, OS Terrain™ 50 DTM and your backdrop map data.

Preparing BHA data for use

If your downloaded BHA data is made up of more than one CSV file we recommend merging them all together into a single CSV file first to make subsequent processing easier and quicker. Use a text editor such as Notepad or TextPad rather than Excel, as Excel can change the formatting of numbers which contain leading zeros.

Each object in MasterMap Topography Layer has a unique identifier called a Topographic Identifier, or TOID for short. TOIDs supplied by OS take the format of a 13- or 16-digit number prefixed with ‘osgb’, e.g. ‘osgb1000039581300’ or ‘osgb1000002489201973’. Some applications, such as ArcGIS, automatically strip off the ‘osgb’ prefix and add three leading zeros to any TOID that has only 13 digits to make them all 16 characters long. Additionally, this same formatting is applied to the File Geodatabase format of MasterMap supplied through Digimap. In order to make it easier to join BHA data to building features in MasterMap® Topography Layer, the BHA files supplied by EDINA have two TOID values:

  • os_topo_toid_digimap is the TOID formatted to match TOIDs in ArcGIS and in the File Geodatabase format supplied through Digimap.
  • os_topo_toid is the original TOID as supplied by Ordnance Survey

You should check the TOID values in your MasterMap data and those in the BHA data to ensure that there is a common field that you can use to match on; we will use os_topo_toid_digimap as this field in the BHA data matches the TOID values in the MasterMap data downloaded in File Geodatabase format from Digimap.

Before BHA data can be loaded into QGIS it is necessary to create a small text file (called filename.csvt, where ‘filename’ is the name of your BHA csv file) that specifies the data type of each field so that QGIS handles it correctly. Specifically, the .csvt file is used to ensure that QGIS treats the two TOID values as text rather than numbers, and all height values as numbers. The steps required are detailed below, with a small scripted shortcut sketched after them:

  1. Create a new file called filename.csvt (replacing ‘filename‘ with the name of your BHA csv file) in the same folder as the BHA csv file you wish to import.
  2. Open the file in a text editor such as Notepad or TextPad.
  3. Copy and paste the following text in to the file:
    "String","String","Integer","Date","String","Real","Real","Real","Real","Real","Integer"
  4. Save your changes to the file. Ensure it is saved in the same folder as the CSV file you wish to import.
  5. Add your BHA CSV file to QGIS through the Add Vector Layer function; this will add the data as a table in the QGIS project.
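
If you have several BHA CSV files to prepare, the sidecar .csvt files can also be written by a short script rather than by hand. The sketch below simply drops the field-type line from step 3 alongside every CSV in a folder (the folder name is a placeholder):

    from pathlib import Path

    CSVT_LINE = '"String","String","Integer","Date","String","Real","Real","Real","Real","Real","Integer"'

    # Placeholder folder: point this at wherever your BHA CSV files live.
    for csv_file in Path("bha_data").glob("*.csv"):
        csv_file.with_suffix(".csvt").write_text(CSVT_LINE)  # writes filename.csvt next to filename.csv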

Creating a heighted buildings dataset

In order to create a new heighted buildings dataset from the building features in OS MasterMap Topography Layer and the BHA data we use the GIS ‘join’ function. A join links these two datasets together through a common unique identifier (the TOID) resulting in a set of buildings with height values stored as additional attributes.
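
If you prefer to work programmatically, the same idea can be expressed outside QGIS. The sketch below is only a rough illustration of the join concept using geopandas and pandas, not a replacement for the QGIS steps that follow; the file, layer and column names are assumptions and will need adjusting to match your own download:

    import geopandas as gpd
    import pandas as pd

    # Placeholder paths, layer and column names; adjust to match your data.
    buildings = gpd.read_file("mastermap.gdb", layer="TopographicArea")
    bha = pd.read_csv("bha_merged.csv",
                      dtype={"os_topo_toid_digimap": str, "os_topo_toid": str})

    # Join buildings to BHA records on the TOID, keeping only matching features,
    # then drop any rows without a value for the chosen height attribute.
    heighted = buildings.merge(bha, left_on="TOID",
                               right_on="os_topo_toid_digimap", how="inner")
    heighted = heighted[heighted["relh2"].notna()]

    heighted.to_file("heighted_buildings.shp")
    print(len(heighted), "buildings with height attributes")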

  1. Bring up the Layer Properties dialog for the Topographic Area layer in the MasterMap data either by double clicking on the layer in the Layer panel or by right clicking on the layer and selecting Properties from the pop-up menu.
  2. Select the ‘Joins’ tab on the left hand side to display the join panel:
    QGIS Join Window
  3. Press the green plus button to add a new join:
    1. Select your BHA dataset as the ‘Join Layer’.
    2. Select the correct TOID field that matches the TOIDs in your MasterMap data; as mentioned above we’re using os_topo_toid_digimap as the formatting of this matches the TOIDs supplied in the MasterMap data downloaded from Digimap in File Geodatabase format.
    3. In the ‘Target Field’ select the attribute column that contains the TOIDs in your MasterMap® data; by default this is called ‘TOID’ in MasterMap downloaded from Digimap.
    4. Leave the checkbox selected to ‘cache join layer in virtual memory’ as this will speed up query and display of the data.
    5. Press OK to create the join.
  4. Press OK on the Layer Properties dialog to close the window.
  5. Open the attribute table for the TopographicArea MasterMap layer to verify the join has worked. You will see the additional BHA columns at the end of the attribute table. Note you will see a lot of ‘null’ values in these additional columns as BHA values are only available for ‘building’ features (the TopographicArea feature class contains features for everything, not just buildings).
  6. Having joined the datasets together, before we can create a heighted buildings dataset we need to select only those buildings which now have height information. This is done using the QGIS ‘Select features using an expression’ button.
  7. We are looking to select only features which have a value for the height attribute we wish to use in the 3D visualisation. As mentioned in the previous post, we have found that the ‘RelH2’ attribute provides a good representation of the height of buildings relative to one another. The expression used is shown below. Note the field name, in quotes below, is automatically created by QGIS by adding the BHA table name (NT27) to the attribute column name (RelH2) with an underscore between them:
    "NT27_relh2" IS NOT Null

    QGIS save vector layer

  8. Having selected just the buildings that include height information we can now export these features as a new dataset by right clicking on the TopographicArea dataset in the Layer panel and selecting ‘Save As…’ from the pop-up menu.
  9. Save the dataset as a new Shapefile in a suitable location, selecting the checkboxes ‘save only selected features’ and ‘add saved file to map’.
  10. The newly created heighted buildings dataset will be added to your QGIS project; now it’s time to visualise it in 3D.

Visualising the data in 3D

So far we have downloaded OS MasterMap® Topography Layer and BHA data for the same area and joined the two together to create a new dataset containing just the building features which now include the various height attributes published by OS. We also downloaded OS Terrain™ 50 DTM to use as the surface heights, 1:25,000 Colour Raster and OS VectorMap® Local Raster to drape over the surface.

Visualising the data in 3D in QGIS is achieved using the Qgis2ThreeJS plugin, which can be installed using the QGIS plugin manager if you don’t have it already. The steps below describe how to use the plugin to create a 3D model:

  1. Ensure you have all the data loaded into your QGIS project that you wish to include in the 3D model; as a minimum you should have your DTM, the heighted buildings dataset and a suitable map layer to drape over the DTM.
  2. Turn off all layers in the Layers panel apart from the surface you wish to drape over the DTM; the buildings will be styled using the Qgis2ThreeJS plugin.
  3. Launch the plugin, which can be found on the Web toolbar.
  4. Using the ‘DEM’ panel of the plugin select your DTM data as the ‘DEM Layer’, leaving all other settings at their default values. Tip: by default the surface has a vertical exaggeration of 1.5; if you wish to reduce or increase this, the setting is configured on the ‘World’ panel of the plugin.
  5. In the ‘Polygon’ panel of the plugin select your heighted buildings dataset and complete the following settings:
    1. Z coordinate: set to ‘Height from surface’ – this will ensure the buildings sit on the DTM surface.
    2. Under ‘Style’: ensure the ‘Object type’ is set to ‘Extruded’ and select the height attribute you wish to use for the extrusion using the ‘Height’ drop-down; as mentioned above, we’re using the RelH2 attribute from OS which is in the column ‘NT27_relh2’ in our data.
    3. Select suitable colours and transparency; we used a medium grey colour with 10% transparency to give a glasshouse effect.
  6. Optionally specify an ‘Output HTML file path’ to save the resultant files. Whilst you’re experimenting we recommend you leave this blank and the plugin will save the data in a temporary location; when you’re happy with the result you can use this setting to save your final visualisation.
  7. Press ‘Run’ to create the 3D model. Once it’s finished processing the model will open in your default web browser.

The plugin outputs an HTML file, along with a small number of accompanying files. The HTML file requires a WebGL-compatible browser (WebGL is a method of generating dynamic 3D graphics using JavaScript). Most modern browsers are WebGL compatible, including IE 11, Firefox, Chrome, Safari and Opera; the Can I Use site offers further information on browser compatibility.

As the files are output as a web page, you can share the results of your work with colleagues without them needing any specialist GIS software. However, you are not permitted to make the website publicly available, as the HTML and JavaScript files contain map data rather than just images of maps. The Licence does not permit the sharing of licensed data from Digimap with anyone other than registered users of the service: Digimap Licence

Tips

  • The plugin uses the extents of the current QGIS map canvas, so the bigger the area being displayed, the bigger the generated 3D scene and the slower it will display. We have found that areas of up to 10km² display okay, anything bigger tends to be a little slow to respond.
  • If you wish to define specific extents for your 3D scene instead of using the map canvas extents this can be done on the ‘World’ panel of the plugin.
  • Applying a vertical exaggeration to your buildings is achieved through the ‘Multiplier’ setting on the ‘Polygon’ panel of the plugin.
  • You can create 3D models of multiple layers. For example in the screen shot below the trees were created by selecting the ‘Positioned Non Coniferous Trees’ from the OS MasterMap Topographic Point layer. These were then added to the QGIS project twice. Using the plugin one of these layers was extruded as a brown cylinder with a radius of 0.75m and a height of 3m to form the trunk; the other was extruded as a green sphere with a radius of 4.5m and a z coordinate of 4.5m (i.e. the height above the ground surface of the centre of the sphere) to form the tree canopy:
QGIS 3D visualisation with trees

OS VectorMap® Local Raster draped on top of OS Terrain™ 50, with buildings from OS MasterMap® Topography Layer Building Height Attribute and Positioned Non Coniferous Trees extruded on top

The finished visualisation

The screen grab below shows the final visualisation centred on the south side of Edinburgh using OS 1:25,000 Colour Raster as the surface layer.

3D visualisation

1:25,000 Colour Raster draped on top of OS Terrain™ 50, with buildings from OS MasterMap® Topography Layer extruded on top using Building Height Attribute data.

Visualising OS MasterMap® Topography Layer Building Height Attribute in ArcGIS and ArcGlobe

In March 2014 Ordnance Survey (OS) published an alpha release of the much anticipated Building Height Attribute (BHA) dataset, which is an enhancement to OS MasterMap Topography Layer. You can read all about it in their blog post. In this blog we’re going to show you how to integrate the BHA dataset with buildings in the OS MasterMap Topography Layer to create a heighted buildings dataset and visualise it in 3D. We used ArcGIS 10.2 and ArcGlobe to do this but other software could be used.

The first alpha release of BHA included buildings covering approximately 8,000km² of the country. A second alpha release of BHA was published in July 2014 which covers around 10,000km² of the major towns and cities in Great Britain. OS publish an interactive map which shows the extents of the areas covered by the alpha release, so you can check if your area of interest is included.

A note of caution: this is an alpha release of the data and OS do not guarantee that BHA is error-free or accurate. Additionally the dataset is not subject to update and maintenance. However, in time OS intend to include BHA in OS MasterMap Topography Layer, so in future it will be supplied and maintained as a part of the Topography Layer.

Attributes supplied by OS

A number of attributes are provided for each building, as shown in the image:

  • ground level [AbsHMin]
  • the base of the roof [AbsH2]
  • highest part of the roof [AbsHMax]

Using these three values two additional relative heights are calculated:

  • relative height from ground level to the highest part of the roof [RelHMax]
  • relative height from ground level to base of the roof [RelH2]
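
In other words, as described above, each relative height is simply the difference between two of the absolute heights. A small worked sketch with made-up values (in metres) makes this concrete:

    def relative_heights(abshmin, absh2, abshmax):
        """Derive the two relative heights from the three absolute heights."""
        relh2 = absh2 - abshmin      # ground level to the base of the roof
        relhmax = abshmax - abshmin  # ground level to the highest part of the roof
        return relh2, relhmax

    # Illustrative (not real) values in metres:
    print(relative_heights(abshmin=52.3, absh2=60.1, abshmax=63.8))  # approximately (7.8, 11.5)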

Data availability

OS publish the data as a single CSV file containing over 20 million records. This is a very large dataset and can cause data management problems in a desktop environment so EDINA have split the dataset up using the OS 5km grid allowing you to download the data in tiles for your study area. The data is available in CSV and KML formats. To use the data in GIS or CAD packages you should download the data in CSV format; KML is used to visualise the data in Google Earth.

The ‘Show Grid/Overlay’ menu on the right-hand side of the Data Download application displays the OS 5km grid. This will draw a grid with each square containing the OS 5km tile reference, as shown in the image.

Please note: BHA data is not currently available for the whole country, you should consult the interactive map published by the OS to see if data exists for your area of interest.

Using the data

OS provide an excellent Getting Started Guide which explains in detail the process of getting BHA data in to GIS for subsequent analysis. The main steps are described below but please refer to the Getting Started Guide for full details.

The data is supplied as CSV files. Each record in the file has a unique TOID which can be used to join the data to building features in OS MasterMap Topography Layer.

Getting started
  1. Download OS MasterMap Topography Layer data for your area of interest from Digimap using the OS Data Download application. Select the ‘File Geodatabase’ format for your data as this is a native ArcGIS format and doesn’t require any conversion.
  2. Download BHA data for your area of interest from Digimap using the OS Data Download application (BHA data is found in the ‘OS MasterMap’ group), selecting CSV format.
  3. Open the OS MasterMap Topography Layer data in ArcGIS.

Preparing BHA data for use

If your downloaded BHA data is made up of more than one CSV file we recommend merging them all together in to a single CSV file first to make subsequent processing easier and quicker. Use a text editor such as Notepad or TextPad rather than Excel, as Excel can change the formatting of numbers which contain leading zeros.

Each object in MasterMap Topography Layer has a unique identifier called a Topographic Identifier, or TOID for short. TOIDs supplied by Ordnance Survey take the format of a 13- or 16-digit number prefixed with ‘osgb’, e.g. ‘osgb1000039581300’ or ‘osgb1000002489201973’. ArcGIS automatically strips off the ‘osgb’ prefix and adds three leading zeros to any TOID that has only 13 digits to make them all 16 characters long. In order to make it easier to join BHA data to building features in MasterMap Topography Layer, the BHA files supplied by EDINA have two TOID values:

  • os_topo_toid_digimap is the TOID formatted to match TOIDs in ArcGIS
  • os_topo_toid is the original TOID as supplied by Ordnance Survey (this should be used in other GIS packages such as QGIS which do not modify the TOIDs in MasterMap Topography Layer)

Before BHA data can be loaded into ArcGIS it is necessary to create a small text file (called schema.ini) that specifies the data type of each field so that ArcGIS handles it correctly. Specifically, the schema.ini file is used to ensure that ArcGIS treats the two TOID values as text rather than numbers. The steps required are detailed below:

  1. Create a new file called schema.ini in the same folder as the BHA csv file you wish to import.
  2. Open the file in a text editor such as Notepad or TextPad.
  3. Copy and paste the following text in to the file:
    [bha_filename.csv]
    Format=CSVDelimited
    ColNameHeader=True
    Col1=OS_TOPO_TOID_DIGIMAP Text
    Col2=OS_TOPO_TOID Text
    Col3=OS_TOPO_VERSION Long
    Col4=BHA_ProcessDate DateTime
    Col5=TileRef Text
    Col6=AbsHMin Double
    Col7=AbsH2 Double
    Col8=AbsHMax Double
    Col9=RelH2 Double
    Col10=RelHmax Double
    Col11=BHA_Conf Long
  4. The first section, in square brackets above, refers to the name of the csv file you wish to import. You should modify this filename so that it references your BHA csv file.
  5. Save your changes to the file. Ensure it is called schema.ini and is saved in the same folder as the csv file you wish to import.
  6. Add your BHA csv file to ArcGIS through the Add Data function; this will add the data as a table in the map document.

Creating a heighted buildings dataset

In order to create a new heighted buildings dataset from the building features in OS MasterMap Topography Layer and the BHA data we use the GIS ‘join’ function. A join links these two datasets together through a common unique identifier (the TOID) resulting in a set of buildings with height values stored as additional attributes.

  1. Right click on the Topographic Area layer in the table of contents > Joins and Relates > Join. This will bring up the Join Data window which can be completed as shown. Remember to join to the TOID in the csv file that is formatted to match the TOIDs displayed in ArcGIS (os_topo_toid_digimap).
    Tip: to create a dataset which just includes the heighted buildings select ‘Keep only matching records’.
  2. Having joined the datasets together we can then export the result as a new Feature Class in our File Geodatabase for subsequent use and analysis. This is done by right clicking on the Topographic Area layer in the table of contents > Data > Export Data… Give your new dataset a suitable name and select your existing File Geodatabase as the destination.

Visualising the result in ArcGlobe

So far we have downloaded data from OS MasterMap Topography Layer and BHA data for the same area and joined the two together to create a new dataset containing just the building features which now include the various height attributes published by OS. Now the fun begins!

We can easily visualise the heighted buildings dataset in 3D using ArcGlobe or ArcScene. The following steps describe how to import the data in to ArcGlobe.

  1. Download the OS Terrain 50 DTM for your area of interest from Digimap using the OS Data Download application. This will be used as the base (ground) heights for the area to provide a more accurate terrain model than is available by default in ArcGlobe.
  2. Open ArcGlobe and add in the DTM. You will be asked if you wish to use the DTM as an ‘image source’ or an ‘elevation source’. You should select the ‘elevation source’ option:

ArcGlobe add DEM window

  3. The Geographic Coordinate Systems Warning dialog will appear as OS MasterMap Topography Layer data is in a different coordinate system (British National Grid) from that used by ArcGlobe (WGS 84):

ArcGlobe Geographic Coordinate Systems Warning

  4. You should specify the transformation used to ensure that the data is accurately positioned on the globe. Using the Transformations… button you should specify the ‘OSGB_1936_To_WGS_1984_Petroleum’ transformation:

ArcGlobe Geographic Coordinate System Transformation

  5. Adding your heighted building dataset from your File Geodatabase is achieved through the Add Data button. Once added you may need to zoom to the layer to view it: right click on the layer in the table of contents > Zoom To Layer.
  6. By default the data is not extruded vertically so appears flat on the earth’s surface. To visualise the buildings in 3D right click on the layer in the table of contents and select Properties and then click on the Globe Extrusion tab.
  7. Select the ‘Extrude features in layer’ checkbox and then in the ‘extrusion value or expression’ box enter the following:
[relh2] * 1.5

ArcGlobe layer properties

This will extrude the buildings using the RelH2 attribute with a vertical exaggeration of 1.5 times (i.e. buildings will be shown at 1.5 times their actual height). We found that using RelH2 (the relative height from ground level to the base of the roof) provides a more useful visualisation than RelHMax (the relative height from ground level to the highest part of the roof), which can lead to some overly tall looking buildings where they include towers that extend significantly beyond the height of the rest of the roof.

The end result

The image below shows an area of Edinburgh including Edinburgh Castle with Arthur’s Seat in the background. Aerial imagery from ArcGlobe is draped over OS Terrain 50 data for the region with heighted buildings drawn on top. Using the tools in ArcGlobe it is easy to explore the landscape, navigating across the surface and examining the relationships between buildings in the built environment.

BHA data in ArcGlobe

Further information

OS published Release Notes for the alpha releases of BHA. Additional information can be found in Annexe D of the OS MasterMap Topography Layer User Guide and Annexe E of the OS MasterMap Topography Layer Technical Specification.

 

Palimpsest Methodologies Day

The Palimpsest Methodologies Day, which took place on 13th May 2014,  was an opportunity for the project team to meet and share ideas with our wonderful advisory board, whose experience and expertise spans the wide range of academic and technical areas needed to help guide our multidisciplinary research. It was a chance for us to discuss past projects and methodological approaches, as well as to reflect on how Palimpsest is developing so far.

The afternoon began with brief presentations from the project team, covering the background to the project and the literary tasks and aims (James L), the gazetteer and textmining challenges (Bea), the mapping and database aspects (James R), social media and communications (Nicola) and finally the creation of data visualisations (Uta and David). Slides from these sessions will be available soon via our forthcoming Publications and Presentations page.

The introductions to the Palimpsest project and project team were followed by presentations from the advisory board, many of whom will soon themselves appear as guest bloggers on this blog:

Screenshot of map selector in Walking Through Time App

Chris Speed (Edinburgh College of Art) discussed the concept of  “Temporal Ubiquity”: the notion that many times and places co-exist. Many of Chris’s projects have played with this idea, such as Walking Through Time which allows you to explore old maps of Edinburgh as you walk through the modern world and so experience both time periods. Chris added: “I wish I had put the Abercrombie plans into that app – a utopian future that never actually happened”. He went on to explain the ways in which technologies are supplementing our mobile temporal consciousness: we can follow long-dead people on Twitter, we can re-experience our own earlier lives through tools like TimeHop, and send messages to our future selves. All of these aspects create temporal ubiquity and provide new ways to open up and explore time and place.

David Cooper (Manchester Metropolitan University) spoke about his work as a literary geographer. David’s interest in this area began with his research on post-war writers’ descriptions of the Lake District. Of particular interest to David are the ways in which the burden of the past affects contemporary authors, and he explores this in terms of issues of spatial intertextuality and imaginative embedding. David was involved with the digital humanities project Mapping the Lakes, which reapplied urban studies of literary place to rural topography, micromapping works by Thomas Gray and Samuel Taylor Coleridge. This included an attempt to map authors’ emotional responses to landscapes and raised the issue of what we are doing when we attempt to map subjective emotional qualities. Finally, David emphasized the need to consider how we can convey the multisensory aspects of literary works.

Jonathan Hope (University of Strathclyde) described his research on the Visualising English Print 1470-1800 project as part of the Text Creation Partnership, a collaborative venture between universities who have been creating a corpus of works input as text (rather than just page images). From 1st January 2015, they will start making freely available the Early English Books Online texts, which range from 1450 to 1700. The TCP project will then move on to Eighteenth Century Collections Online (ECCO), although these present more complex copyright challenges. But it’s one thing to have the texts – what do you do with them? The team Jonathan works with is creating new tools and methodologies for working with these texts. Jonathan also raised questions for the Palimpsest team around the multidimensional nature of the data we are considering and stressed the need to enable the exploration of their intricate relations without flattening their complexity.

Jason Dykes (City University London) raised five areas of reflection for the project team to consider:

  1. Location – designing for legibility and comparison.
  2. Representation – you don’t need precision for qualitative narratives.
  3. Annotation – text as spatial information, the words are the map!
  4. Connection – maps that tell (spatial) stories.
  5. Collection – explore content through visualisation.

In discussing these areas Jason raised examples of work he and his colleagues have done at the giCentre at City University London that offer clever and playful takes on each of these dimensions.

Screenshot of the Slave Revolt in Jamaica, 1760-61, built by Axis Maps for Dr Vincent Brown’s African Rebellion Project, Harvard University.

David Heyman (Axis Maps) talked about the difference between map making and cartography, defining cartography as “the purposeful design of maps”. He talked about the importance of communicating a message to an audience and what that means for map design. For instance, in interactive cartography that means ensuring we design features that define functionality, communicate it to the user, and contextualise the thematic display.

Miguel Nacenta (University of St Andrews) described several ways in which meaningful distortions in visualisations can help to communicate the information shown. FatFonts, which he created with Uta Hinrichs (on our visualisation team) and Sheelagh Carpendale, is a typeface that provides a hybrid of the symbolic and the visual, by using a thickness of ink that is in proportion to the number being represented. These fonts are designed to highlight numerical changes and cluster numbers in a multilevel way to give you a sense of scale and meaning when you glance at a visualisation. Another of Miguel’s projects, called Transmogrifiers, enables users to interact with and transform a map to allow a new view of the information – as did the Johnson and Ward 1862 atlas, which showed rivers’ lengths juxtaposed in order to make visible a comparative sense of their scale.

The 1862 Johnson and Ward Map or Chart of the World’s Mountains and Rivers – Geographicus. Image via Wikimedia Commons.

Ewan Klein (University of Edinburgh) closed the presentations with a discussion of vernacular geography – the sense of place reflected in ordinary people’s language, which carries a vagueness in semantics that is often not acknowledged. The Natural Neighbourhood Questionnaire included two questions: what is your postcode, and where do you live? The answers suggested that there was no clear boundary between, or shared definition of, neighbourhoods, with a heatmap version of the responses making visible a number of in-between places and showing Leith over-represented territorially, perhaps partly due to streets such as Leith Walk, which despite their name stretch beyond Leith itself.

The day concluded with very useful discussions of future challenges and opportunities for the Palimpsest project – as well as some fantastic and inspiring resources, which will feed into our work over the coming months.

As a thank you to our advisory board, and as an introduction to Edinburgh’s rich literary past, many of us followed the Methodologies Day with a literary walking tour of the city, finding out all about some of our best known literary figures as well as some of their less well-known peers and the inspirations for their characters. Thanks to a serendipitous accident of timing this included sighting one of Edinburgh’s most prominent present-day writers, Ian Rankin, who was conveniently standing outside the very pub our guide had just mentioned as a favourite of his!

Resources and projects highlighted during the day:

– Nicola Osborne and Miranda Anderson

Data Visualisation Talk by Martin Hawksey

Today EDINA is hosting a talk by Martin Hawksey on data visualisation. He has posted a whole blog post on this, which includes his slides, so I won’t be blogging verbatim but hoping to catch key aspects of his talk.

Martin will be talking about achievable and effective ways to visualise data. He’s starting with John Snow’s 1850s map of cholera deaths, identifying the epicentre of the outbreak through maps of deaths. And on an information literacy note, you do need to know how to find the story in the graphics. Visualisation takes data, takes stories, and turns them into something of a narrative, explaining and enabling others to explore that data.

Robin Wilton georeferenced that original Snow data, then Simon Rogers (formerly of the Guardian, latterly of Twitter) put the data into CartoDB. This reinterpretation of the data really makes the infected pump jump out at you; the different ways of visualising that data make the story even clearer.

Not all visualisations work, and you may need narration. Graphics may not be meaningful to all people in the same way, e.g. the location of the pumps on these two maps. So this is where we get into theory. Jacques Bertin, a French cartographer, came up with his own system of points, lines, symbols etc., though not based on research – his own cheat system. If you look at Gestalt psychology you get more research-based visualisations – laws of similarity, proximity, continuity. There is something natural about where the eye is drawn, but there is theory behind that too.

John Snow’s map was about explaining and investigating the data. His maps were explanatory visualisation, and we have that same idea in Simon Rogers’ map, but it is also an exploratory visualisation: the reader/viewer can interact with and interrogate it. But there are limitations to both approaches. Both maps are essentially heat maps – more of something (in this case deaths). And you see that in visualisations you often get heat maps that actually map population rather than trends. Tony Hirst says “all charts are lies”. They are always an interpretation of the data from the creator’s point of view…

So going back to Simon Rogers’ map, we see that the radius of each dot is based on the number of deaths. Note from the crowd: “how to lie with statistics”. Yes, a real issue is that a lot of the work to get to that map is hidden – lots of room for error and confusion.

So having flagged up some examples and pitfalls, I want to move on to the process of making data visualisations. Tools include Excel, CartoDB, Gephi, IBM Many Eyes, etc., but in addition to those tools and services you can also draw. Even now many visualisations are made via drawing, if only for final tweaking. Sometimes a sketch of a visualisation is the way to prototype ideas too. There are also code options: D3.js, Sigma.js, R, ggplot, etc.

Some issues around data: data access can be an issue – data can be hard to find, it can be hard to identify the source data, etc. Tony Hirst really recommends digging around for feeds, for RSS, finding the stuff that feeds and powers pages. There are tools for reshaping feeds and data, places like Yahoo Pipes, which lets you do drag-and-drop programming with input data. And that touches on data shapes: data may be provided in certain ways or shapes, but it may not suit your use. So a core skill is the transformation of data to reshape it, with tools like Yahoo Pipes and OpenRefine – which also lets you clean up data as well. I’ve tried OpenRefine with public Jiscmail lists, to normalise entries for people with multiple user names.

So now the fun stuff…

For the Cultural Olympiad in Scotland last year we had the #citizenrelay project tracking the progress of the Olympic torch, so there was lots of data to play with. The first example is a Twitter (Topsy) media timeline, using TimelineJS (from Vérité) plus Topsy data. This was really easy to do. Data access was via Topsy, which pulls in data from Twitter to build its own archive and has an API that makes it easy to query for media against a hashtag. It can return data in XML, but here it was grabbed as JSON, and the output was created with TimelineJS. You can also use the Google spreadsheet template provided by TimelineJS (populated manually or automatically); a spreadsheet was used here, with Yahoo Pipes to manipulate the data. You can pull data in with Google spreadsheet formulas, and once you’ve created the formula it will constantly refresh and update, so the timeline updates itself once published.

Originally Topsy allowed data access without an API key but now they require one. Google Apps Script, which is JavaScript-based – and has a big Stack Overflow community – has a similar function for fetching URLs and dumping the results back into a spreadsheet. The same thing can be done with Yahoo Pipes (using the Rate module to handle the API key).

Next, as the relay went around the country they used AudioBoo. When you upload, AudioBoo geolocates your Boos. AudioBoo has an API (no key required) and you can filter for a tag. You can get the data out as XML, JSON or CSV, but they also produce KML. If you paste the URL of a public KML file into the Google Maps search box it just gives you the map, which you can then embed or share a link to – a super easy visualisation. Disappointingly it didn’t embed audio in the map pins, but that’s a Google Maps limitation; Google Earth does let you do that though…

Using Google Earth we only have a bit of work to do: we need to work out the embed code. Google now provides a spreadsheet template that lets you bring in placemark data (a place marker template). You can easily make changes here and choose how to format variables. You can fill it in manually, but it can also be done automatically, so Google Apps Script is used here: it goes to the AudioBoo API, grabs the data as JSON, parses it, and then pushes each item to the spreadsheet. For this sort of partial geodata these Google templates are really useful. Something else to mention: Google Spreadsheets are great and sit in the cloud, but I was recently using Kasabi when it went down… and everything relying on it stopped working. It’s sometimes useful to take a flat capture of the spreadsheet as a backup.
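
The fetch, parse and push pattern Martin describes is much the same whatever tool you use. As a purely generic illustration (the endpoint, field names and output file below are made up, not the real AudioBoo API), a Python version of the same idea might look like this:

    import csv
    import requests

    # Hypothetical endpoint and field names, purely to illustrate the pattern:
    # fetch JSON from an API, parse it, then push each item out as a flat row.
    FEED_URL = "https://example.org/api/boos?tag=citizenrelay&format=json"

    items = requests.get(FEED_URL, timeout=30).json()["items"]

    with open("citizenrelay_boos.csv", "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow(["title", "latitude", "longitude", "audio_url"])
        for item in items:
            writer.writerow([item.get("title"), item.get("lat"),
                             item.get("lon"), item.get("audio_url")])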

The next visualisation used NodeXL for social network analysis (SNA). This is an open source plug-in for Excel. It has a number of data importers (including for Twitter, Facebook, MediaWiki, etc., straight from the menu), lots of room for reformatting, and then a grid view of the data.

And this is where we start chaining tools together. I had Twitter data, and NodeXL to identify the community (who follows whom, who is friends with whom), so I used Gephi, which lets you work with network graphs – a great way to see how nodes relate to each other. It is often used for social network analysis, but people have also used it for cocktail recipes (there’s an academic paper on it), and there is a recipe site that lets you reformulate recipes using the same approach. With Gephi you can spend an hour playing… and then wonder how to convey the result to others, often ending up with a flat graphic. So I created something called TAGSExplorer to let anyone interact – and there are others who have done similar.

Another example here: a network of those using the #ukoer hashtag, looking for bridges in the community – the key people. This is an early visualisation I created. It was generated from Twitter connections and tag use with Gephi, but then combined and finished in a drawing package.

This is another example looking at different sources: a bubble chart of click-throughs for tweets. You can get a degree of that information from bit.ly, but if you use another service it’s hard to get click-through data. However, you can see referrals in Google Analytics – each Twitter URL is unique to the person who tweets it, so you can see the click-through rate for an individual tweet. This is created in a Google spreadsheet you can explore interactively and reshape for your own exploration. The spreadsheet uses the Google Analytics API and the Twitter API, then combines the results with some reshaping. One thing to be aware of is that spreadsheets have a duality of values and formulae, so when you call on APIs it can get confusing; it’s sometimes good to use two sheets, with the second for manipulation. There’s a great blog post on this duality – “spreadsheet addiction”. If you are at IWMW next week I’m doing a whole session on Google Analytics data and reshaping.

Q&A

Comment: a study/working group on social network analysis – some of these techniques could be built into our community of expertise here.

Comment: it would have to slow way down for me, but hopefully we can devise materials and workshops to make these step by step.

Martin: But there are some really easy wins, like that Google Maps one. And there is a good community of support around these tools. With R, for instance, if I ask on Stack Overflow then I will get an answer back.

Q) Is there a risk that if you start trying to visualise data you might miss out on proper statistical processes and rigour?

Martin: yes, that is a risk. People tend to be specialists in one area rather than all of them. Manchester Metropolitan use R as part of their analysis of student surveys, recruitment etc.; this came from an idea of Mark Stubbs, head of eLearning, raised after speaking to a specialist in the field. R is widely used in the sciences and increasingly in big data analysis, so there it started with an expert who did know what he was doing.

Q: Have you done much with data mining or analysis, like Google Ngrams?

Martin: not really. Done some work on sentiment analysis and social network data though.


Guest Post on Kew Gardens’ Blog

The Trading Consequences team have written a guest post, “Bringing Kew’s Archive Alive”, for Kew Gardens’ Library, Art and Archives blog.

The post looks at how digital data produced by Kew’s Directors’ Correspondence team can be used as a source for visualising the British Empire’s 19th Century trade networks.

You can read the post in full here: http://www.kew.org/news/kew-blogs/library-art-archives/bringing-kews-archive-alive.htm

Visualisation Allsorts

I sometimes receive quite specific requests about social media, new tech or other slightly more tangential things. A few weeks ago I was asked for advice on visualisation tools for a research project and thought that others here might be interested in the tools, sites and resources that came to mind.

The links and recommendations come from a mixture of angles: some I’ve looked at or been aware of through specific work projects; some come recommended by colleagues as new, interesting, or well crafted; and some came from looking for visualisation options for my MSc in eLearning dissertation. Do let me know what you think of any of these tools or the list itself and I’ll be very happy to update the list if you have others to recommend!

Tools 

This section generally focuses on online tools (with varying policies over data use/retention) that allow you to visualise your data one way or another:

Wordle is about the simplest visualisation tool but can be effective if you want a word/tag cloud: http://www.wordle.net/

Image of the Closing Session at OR2012 with Wordle by Adam Field shown in the background.

Textal is a new, more academically targeted and mobile-friendly alternative to Wordle, specifically designed for use with text research data sets. I think it should be due out soon… : http://www.textal.org/.

FigShare is a site for sharing academic data, particularly scientific data. It includes some automatic visualisation functionality as well as inspiration via other people’s shared resources, graphs, visualisations: http://figshare.com/

ManyEyes is an IBM tool for visualising data – very useful and once data is uploaded it can be re-visualised: http://www-958.ibm.com/software/data/cognos/manyeyes/

Visual.ly is a consumer web 2.0 tool for visualising data – generally social media related data – and is probably primarily useful as a source of inspiration for other visualisations: http://visual.ly/

Google Apps/Drive includes a series of pretty good visualisation tools that can be accessed from any spreadsheet. Standard Excel type charts can be accessed via:

Insert>Chart

You can also access more sophisticated visualisations from

Insert > Gadget

There are various examples of these being used well on the web but they really come into their own when you hook up a data collection form to a spreadsheet and then visualise it – it all connects up rather nicely.

Voyant Tools offers a number of approaches to large collections of prepared text-based data. It’s worth noting that, as with all of these tools really, you should anonymise and edit the text before submitting it. That’s particularly important for Voyant Tools as you can’t edit the data once it’s up and you can’t easily delete it either. But it does clever stuff in a simple way and for free: http://voyant-tools.org/.

Data-Driven Documents is a site focusing on D3.js, a JavaScript library for working with data – lots of very practical but very technical materials and ideas here: http://d3js.org/

SIMILE Widgets are a great wee set of visualisation tools from a project at MIT that are relatively easy to reuse and widely used on websites to make swishy looking previews etc.: http://simile-widgets.org/

Timeline JS is a flexible way to create timeline visualisations – useful if that type of visual is what you’re after: http://timeline.verite.co/

Tableau is a free data visualisation tool and rather less techie to handle than many of those mentioned above. I haven’t had much experience of using it but have heard good things: http://www.tableausoftware.com/public/

SourceMap is a web service that lets you create one type of visualisation – maps visualising “where things come from”, whether those be sources, commodities, trade routes, etc. Very useful, but only if that’s the visualisation you actually want to create: http://sourcemap.com/. You can find some good examples of these visualisations over on my Trading Consequences colleague Jim Clifford’s blog.

British Tallow trade map by Jim Clifford. Click through to see his full blog post about these maps.

Gource is a specific version control visualisation tool – again it’s very niche, but nice if that’s your niche: https://code.google.com/p/gource/

Logstalgia is, similarly, a specific niche tool, this time for visualising web server access logs: https://code.google.com/p/logstalgia/

Dedoose is also worth noting. This is a text analysis tool and isn’t really a visualisation tool but there are visual aspects and it does help you reimagine and reinterpret text data by colour coding, tagging, grouping and viewing trends as you mark up your data: http://www.dedoose.com/

 

Useful Lists of Visualisation Tools and Resources

These are some articles and listings I’ve found useful in the past – I suspect there are many more to add…

The Next Web did a great guide to visualisation tools in May 2012 (some of which have already been mentioned): http://thenextweb.com/dd/2012/05/10/want-to-make-your-own-data-visualizations-check-out-this-awesome-set-of-tools/

ComputerWorld also shared a very useful post on good free data visualisation tools. The article is here: http://www.computerworld.com/s/article/9215504/22_free_tools_for_data_visualization_and_analysis and you can view a chart of all of the tools featured here: http://www.computerworld.com/s/article/9214755/Chart_and_image_gallery_30_free_tools_for_data_visualization_and_analysis

GoGeo (http://www.gogeo.ac.uk/) includes a visualisation software area where you can find several useful tools: http://www.gogeo.ac.uk/gogeo-java/resources.htm?&searchcat=Software&search=visualisation. There are also a number of useful collections of geographically related visualisation tools featured in the news section: http://www.gogeo.ac.uk/gogeo-java/resources.htm?&searchcat=News&search=visualisation

Downloadable Software

I must note two fabulous blogs for finding out about these: Tony Hirst’s OUseful blog and Martin Hawksey’s MASHe blog. Both are brilliant resources and contain many, many more recommendations for software for visualisation and data analysis.

R – free software for statistical computing and visualisation: http://www.r-project.org/

Gephi – a powerful (but complex to start out with) open source tool for data visualisation: http://gephi.org/
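For a scriptable counterpart to Gephi’s point-and-click workflow, NetworkX in Python can draw small network graphs with a force-directed layout. This is only a minimal sketch with invented edge data, not a substitute for Gephi on anything large:

```python
# Minimal network-graph sketch using NetworkX and matplotlib as a
# scriptable alternative to Gephi; the edges below are made up.
import networkx as nx
import matplotlib.pyplot as plt

g = nx.Graph()
g.add_edges_from([
    ("Alice", "Bob"),
    ("Bob", "Carol"),
    ("Carol", "Alice"),
    ("Carol", "Dave"),
])

# A spring layout spaces nodes by their connections, similar in spirit
# to Gephi's force-directed layouts.
pos = nx.spring_layout(g, seed=42)
nx.draw(g, pos, with_labels=True, node_color="lightblue")
plt.show()
```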

Expertise – Technical

These are useful websites:

Visualizing.org is a site dedicated to visualisation and includes a wealth of examples and useful links – well worth browsing for ideas, practical solutions, etc.: http://www.visualizing.org/

Visual Complexity is a collection of best practice visualisations which can be searched, browsed, etc: http://www.visualcomplexity.com/vc/

FlowingData is a blog collecting best practice visualisations and usually also indicating technology used: http://flowingdata.com/

Visualisation of Facebook photo virality featured on Flowing Data. Click through to read the full article.

There are also some individuals whose blogs are always well worth a read:

Steven Gray specialises in working with data and geospatial data visualisation with several very interesting current projects (including Textal). His Big Data Toolkit website http://bigdatatoolkit.org/ includes updates on his research, links to useful resources, discussion of ideas, etc.

Melissa Terras is co-director of the Centre for Digital Humanities at UCL and has worked on a variety of visualisation, research and interaction projects around Digital Humanities, including Textal; details of her work can be found on her website: http://www.ucl.ac.uk/dis/people/melissaterras

Martin Hawksey (already mentioned above) of JISC CETIS blogs at MASHe (http://mashe.hawksey.info/) and often examines data analysis and visualisation including some superb work on Twitter data and visualisation. A search or browse of his blog for visualisations should find some interesting examples using web and downloadable data visualisation tools. As with any of these notable folks he is likely to respond to comments or questions so do comment on his blog!

Visualisation of UK University Twitter Following patterns by Martin Hawksey. Click through to read more about this visualisation and view his and Tony Hirst’s IWMW 2012 presentation on Data Visualisation.

Tony Hirst (already mentioned above) of the Open University blogs at OUseful.info (http://blog.ouseful.info/) and his posts often revolve around visualisation of data, particularly social data. I would recommend having a browse around his site (e.g. http://blog.ouseful.info/?s=visualisation) and leaving comments/questions.

Aaron Quigley of St Andrews University (http://www.cs.st-andrews.ac.uk/~aquigley/) is an expert on Human Computer Interaction and shares great resources and ideas around HCI and visualisation regularly. Aaron is also working on the Trading Consequences project and occasionally blogs about visualisation plans/issues related to that project here: http://tradingconsequences.blogs.edina.ac.uk/

The giCentre at City University London works on geographic information, and visualisation is a major part of that work. Their projects – which have included special commissions for the BBC and others – and related materials can be found here: http://www.soi.city.ac.uk/organisation/is/research/giCentre/

Patrick McSweeney of the University of Southampton has worked on a couple of nice visualisation projects and hacks – notably his OR2012 Developer Challenge winning concept of provenanced visualisation within/connected to the repository – and usually shares the technologies behind them. You can browse recent projects here: http://users.ecs.soton.ac.uk/pm5/portfolio/projects/

 

Expertise – Artistic/Creative/Inspirational

This section focuses on those who offer visual inspiration and expertise. I had hoped to include Douglas Coupland who worked on a very creative data visualisation project a few years back but I can’t recall the name of the project nor find the link – do let me know if you can help me out with a link here.

Hint.fm is a site collating new ways to visualise data of various sorts. This is about novel artistic rather than automated approaches: http://hint.fm/

Information is Beautiful, which I’m sure you’ve all seen before, is the home of David McCandless’ work and is really useful for inspiration/artistic visualisation and interpretation of data: http://www.informationisbeautiful.net/

Pinterest includes a number of visualisation boards that may be useful as inspiration/a connecting point to further websites and technical details: http://pinterest.com/search/boards/?q=visualisation

Culture Hack Scotland has included some fantastic visualisation and interpretation work in the past – and I’m sure the same is true for other hackdays working with large data sets. For previous projects from 2012 and 2011, have a look here: http://www.welcometosync.com/hack/

And finally…

Ellie Harrison is a visual artist based in Glasgow who specialises in interpreting data, including some lovely visualisation work. Her website is here: http://www.ellieharrison.com/ and her internet projects can be found here: http://www.ellieharrison.com/index.php?pagecolor=2&pageId=menu-internet

Screenshot from Ellie Harrison’s most recent web project Trajectories. Click through to access this art project which uses visualisation to explore self comparison.

 

Hopefully some of the above will be of interest or use to you, as well as to the person who originally asked the question. As I’ve already said, I’d appreciate any comments, additions, etc. you may have. Visualisations aren’t the core thing I spend my time on, but images and visual aspects are so important to making an impact on social media that they are, of course, an area of great interest.

Visualisation of some early results

Claire showed us some early results from the work of the Language Technology Group, text mining volumes of the English Place Name Survey to extract geographic names and relations between them.

LTG visualisation of some Chalice data

What you see here (or in the full-size visualisations – start with the files *display.html) is the set of names extracted from an entry in EPNS (one town name, and associated names of related or contained places). Note this is just a display; the data structures are not published here at the moment – we’ll talk about that next week.

The names are then looked up in the geonames place-name gazetteer, to get a set of likely locations; then the best-match locations are guessed at based on the relations of places in the document.

Looking at one sample, for Ellesmere: five names are found in geonames, five are not. Of the five that are found, only two are certainly located, i.e. we can tell that the place in EPNS and the place in geonames are the same, and establish a link.

What will help improve the number of matches we can establish is filtering searches by county – using either detailed boundaries or bounding boxes that will definitely contain the county. Contemporary boundary data is now freely re-usable through Unlock Places, which is a place to start.
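As a rough sketch of that kind of county-limited lookup, the public GeoNames search web service accepts a bounding box alongside the name query. In this Python sketch the bounding box values are rough and illustrative, and you would need to register your own GeoNames username rather than relying on the rate-limited demo account:

```python
# Rough sketch of a county-limited GeoNames lookup using the public
# search web service. The bounding box below is approximate and the
# "demo" username is a placeholder - register your own GeoNames account.
import requests

def lookup(name, bbox, username="demo"):
    params = {
        "q": name,
        "country": "GB",
        "maxRows": 5,
        "north": bbox["north"], "south": bbox["south"],
        "east": bbox["east"], "west": bbox["west"],
        "username": username,
    }
    r = requests.get("http://api.geonames.org/searchJSON", params=params)
    r.raise_for_status()
    return r.json().get("geonames", [])

# Illustrative bounding box roughly covering Shropshire.
shropshire = {"north": 52.99, "south": 52.31, "east": -2.23, "west": -3.24}
for hit in lookup("Ellesmere", shropshire):
    print(hit["name"], hit["lat"], hit["lng"])
```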

Note – the later volumes of EPNS do provide OS National Grid coordinates for town names; the earlier ones do not; we’re still not sure when this starts, and will have to check in with EPNS when we all meet there on September 3rd.

How does this fit expectations? We know from past investigations with mixed sets of user-contributed historic place-name data that geonames does well, but typically locates no more than 50% of the names. Combining geonames with OS Open Data sources should help a bit.

The main thing I’m looking to find out now is what proportion of the full set of names will be left floating without a georeference, and how many hops or links we’ll have to traverse to connect floating place-names with something that does have a georeference. I also want to know how important it will be to convey uncertainty about locations, and what the cost/benefit will be of making interfaces that allow one to annotate and correct the locations of place-names against different historic map data sources.
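One rough way to get at the “how many hops” question, assuming the extracted relations were loaded into a graph in which some nodes are already georeferenced, would be a simple shortest-path query. The place-names, links and georeferenced set in this Python sketch are invented for illustration:

```python
# Sketch: measure how many relation "hops" separate an un-georeferenced
# place-name from the nearest name that does have coordinates.
# The graph and the set of georeferenced names are invented examples.
import networkx as nx

relations = nx.Graph()
relations.add_edges_from([
    ("Ellesmere", "Oteley"),
    ("Oteley", "Newton"),
    ("Ellesmere", "Colemere"),
])
georeferenced = {"Colemere"}  # names we could match in geonames

def hops_to_georeference(graph, name, located):
    """Shortest number of hops from `name` to any located place, or None."""
    lengths = nx.single_source_shortest_path_length(graph, name)
    candidates = [dist for node, dist in lengths.items() if node in located]
    return min(candidates) if candidates else None

print(hops_to_georeference(relations, "Newton", georeferenced))  # -> 3
```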

Clearly the further back we go the squashier the data will be; some of the most interesting use cases that CeRch have been talking to people about involve Anglo-Saxon place references. No maps – not a bad thing – but potentially many hops to a “certain” reference. We’re thinking about how we can re-use, or turn into RDF namespaces, some of the Pleiades Ancient World GIS work on attestation/confidence of place-names and locations.