Welcome to the new Research Data Management Service Coordinator: Stuart Macdonald

We welcome Stuart Macdonald to the position of Research Data Management Service Coordinator, on a 1-year secondment to cover for the current post-holder. Stuart will continue the work of developing the research data services provided by Information Services at the University of Edinburgh. He will spend three quarters of his time on the programme, and the remaining quarter in his current role as Associate Data Librarian for EDINA and the Data Library.


Stuart has recently returned from a six-month secondment as Data Services Librarian at the Cornell Institute for Social and Economic Research (CISER), where he co-ordinated the successful Data Seal of Approval trusted repository application for the CISER Data Archive and modernised archival processes and practice.

When not working as service coordinator, Stuart will be working towards gaining the Data Seal of Approval for DataShare, the University’s open data repository.

On the role of service coordinator, Stuart says: “This is a marvellous opportunity to be at the heart of research data management activities here at the University and to continue the great work that has already been put in place.”


Dealing with Data – Call For Papers


Dealing with Data Conference 2014
Call for Papers


Tuesday 26th August 2014, 9am – 1pm
Including a formal launch of the University Research Data Management services by the Principal at 11:30am


University of Edinburgh (room to be confirmed)


Themes:

Data creation
Data management planning
Data visualisation
Data archiving and sharing
Open Data
Data re-use
Electronic lab books
Data preservation
Software preservation
Non-traditional data types
Data analysis
New requirements for Research Data Management
Data infrastructure
Linked Data


Presentations will be 20 minutes long, with 10 minutes for questions. Depending on numbers, thematic parallel strands may be used. Presentations should be aimed at an academic audience drawn from a wide range of disciplines.

Call for papers:

A half-day conference on the subject of ‘Dealing with Data’ is being run to coincide with the launch of the University of Edinburgh’s Research Data Management services, which consist of tools and support to deal with the whole lifecycle of research data, from planning and storage to sharing and archiving.

The conference invites proposals for presentations from University of Edinburgh researchers on any aspect of the challenges and advances in working with data, particularly novel methods of creating, using, storing, visualising or sharing research data. A list of themes is given above, although proposals that cover any aspect of working with research data are welcome.

Please send proposals (2 sides of A4 max) to Stuart Lewis (stuart.lewis@ed.ac.uk) before Friday 25th July 2014.  Papers will be reviewed and the programme compiled by the 8th August. A PDF version of this Call for Papers is available for printing: DealingwithDataConference2014-cfp


Open data repository – file size analysis

The University of Edinburgh’s open data sharing repository, DataShare, has been running since 2009. During this time, over 125 items of research data have been published online. This blog post provides a quick overview of the number, extent, and distribution of file sizes and file types held in the repository.

First, some high level statistics (as at March 2014):

  • Number of items: 125
  • Total number of files: 1946
  • Mean number of files per item: 16
  • Total disk space used by files: 76GB (0.074TB)
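As a quick check, the derived figures above can be reproduced from the raw counts. This is only a sketch in Python: it assumes the terabyte figure uses the binary convention (1TB = 1024GB), and the mean file size it prints is an extra illustration that is only approximate, because the 76GB total is itself rounded.

```python
# Figures quoted in the list above (as at March 2014).
items = 125
files = 1946
disk_gb = 76

mean_files_per_item = files / items    # about 15.6, reported as 16
disk_tb = disk_gb / 1024               # assumes 1TB = 1024GB
mean_file_mb = disk_gb * 1024 / files  # approximate, since 76GB is rounded

print(round(mean_files_per_item))  # 16
print(round(disk_tb, 3))           # 0.074
print(round(mean_file_mb))         # 40
```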

DataShare uses the open source DSpace repository platform. As well as storing the raw data files that are uploaded, it creates derivative files such as thumbnails of images, and plain text versions of text documents such as PDF or Word files, which are then used for full-text indexing. Of the files held within DataShare, about 80% are the original files, and 20% are derived files (including, for example, licence attachments).


When considering capacity planning for repositories, it is useful to look at the likely size of the files that may be uploaded. With research data, the assumption is often that files will be quite large. Sometimes this can be true, but the next graph shows the distribution of files by file size. The largest proportion of files is under 1/10th of a megabyte (100KB). Ignoring these small files, there is a normal distribution peaking at about 100MB. The largest files are nearer to 2GB, but there are very few of these.
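For anyone wanting to produce a similar breakdown for their own repository, the following Python sketch groups file sizes into power-of-ten buckets. It is not the method used to produce the DataShare graph: the bucket boundaries, the decimal units (1KB = 1000 bytes), and the sample sizes are all assumptions made for illustration.

```python
from collections import Counter

# Upper bound (in bytes) and label for each bucket.
# Assumes decimal units: 1KB = 1000 bytes.
BUCKETS = [
    (100 * 10**3, "under 100KB"),
    (10**6, "100KB-1MB"),
    (10 * 10**6, "1MB-10MB"),
    (100 * 10**6, "10MB-100MB"),
    (10**9, "100MB-1GB"),
]

def size_bucket(size_bytes):
    """Return the label of the first bucket whose upper bound exceeds the size."""
    for upper, label in BUCKETS:
        if size_bytes < upper:
            return label
    return "1GB and over"

def size_distribution(sizes):
    """Count how many files fall into each bucket."""
    return Counter(size_bucket(s) for s in sizes)

# Hypothetical file sizes in bytes, not real DataShare data.
sample_sizes = [2_000, 45_000, 80_000, 3_500_000, 120_000_000, 1_900_000_000]

for label, count in size_distribution(sample_sizes).items():
    print(label, count)
```

Grouping by powers of ten keeps the many small files and the few very large ones visible on the same chart.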


Finally, it is interesting to look at the file formats stored in the repository. Unsurprisingly, the largest number of files are plain text, followed by a number of Wave Audio files (from the Dinka Songs collection). Other common file formats include XML files, ZIP files, and JPEG images.
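A rough version of this kind of format count can be made from file names alone. The Python sketch below tallies files by extension; it assumes the extension is a fair proxy for the format, which is not necessarily how DSpace identifies formats, and the file names are invented for illustration.

```python
from collections import Counter
from pathlib import PurePath

def format_counts(filenames):
    """Tally files by lower-cased extension; files without one are grouped as '(none)'."""
    return Counter(PurePath(name).suffix.lower() or "(none)" for name in filenames)

# Hypothetical file names, not taken from DataShare.
sample = [
    "song01.wav", "song02.WAV", "readme.txt", "notes.txt",
    "meta.xml", "photos.zip", "LICENSE",
]

# Most common formats first.
for extension, count in format_counts(sample).most_common():
    print(extension, count)
```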


Stuart Lewis
Head of Research and Learning Services, Library & University Collections

Data provided by the DataShare team.