Data Carpentry & Software Carpentry workshops

The Research Data Service hosted back-to-back two-day workshops in the Main Library this week, run by the Software Sustainability Institute (SSI), to train University of Edinburgh researchers in basic data science and research computing skills.

Learners at Data Carpentry workshop

Software Carpentry (SC) is a popular global initiative originating in the US, aimed at training researchers in good practice in writing, storing and sharing code. Both SC and its newer offshoot, Data Carpentry, teach methods and tools that help researchers make their science reproducible. The SSI, based at Edinburgh Parallel Computing Centre (EPCC), organises workshops for both throughout the UK.

Martin Callaghan, University of Leeds, introduces goals of Data Carpentry workshop.

Each workshop is taught by instructors trained by the SC organisation, using proven methods of delivery, to learners using their own laptops, with plenty of support from knowledgeable helpers. Instructors at our workshops were from Leeds and EPCC. Comments from the learners – staff and postgraduate students from a range of schools – included ‘Variety of needs and academic activities/disciplines catered for. Useful exercises and explanations,’ and ‘Very powerful tools.’

Lessons can vary between different workshops, depending on the level of the learners and their requirements, as determined by a pre-workshop survey. The Data Carpentry workshop on Monday and Tuesday included:

  • Using spreadsheets effectively
  • OpenRefine
  • Introduction to R
  • R and visualisation
  • Databases and SQL
  • Using R with SQLite
  • Managing Research & Data Management Plans

The Software Carpentry workshop was aimed at researchers who write their own code, and covered the following topics:

  • Introduction to the Shell
  • Version Control
  • Introduction to Python
  • Using the Shell (scripts)
  • Version Control (with GitHub)
  • Open Science and Open Research
Software Carpentry learners

Clearly the workshops were valued by learners and very worthwhile. The team will consider how it can offer similar workshops in the future at a similarly low cost; your ideas are welcome!

Robin Rice
EDINA and Data Library

New MOOC! Research Data Management and Sharing

[Guest post from Dr. Helen Tibbo, University of North Carolina-Chapel Hill]

The School of Information and Library Science and the Odum Institute at the University of North Carolina-Chapel Hill and the MANTRA team at the University of Edinburgh are pleased to announce the forthcoming Coursera MOOC (Massive Open Online Course), Research Data Management and Sharing.

This is a collaboration of the UNC-CH CRADLE team (Curating Research Assets and Data Using Lifecycle Education) and MANTRA. CRADLE has been funded in part by the Institute of Museum and Library Services to develop training for both researchers and library professionals. MANTRA was designed as a prime resource for postgraduate training in research data management skills and is used by learners worldwide.

The MOOC uses the Coursera on-demand format to provide short, video-based lessons and assessments across a five-week period, but learners can proceed at their own pace. Although no formal credit is assigned for the MOOC, Statements of Accomplishment will be available, for a small fee, to any learner who completes the course.

The Research Data Management and Sharing MOOC will launch 1st March, 2016, and enrolment is open now. Subjects covered in the 5-week course follow the stages of any research project. They are:

  • Understanding Research Data
  • Data Management Planning
  • Working with Data
  • Sharing Data
  • Archiving Data

Dr. Helen Tibbo from the School of Information and Library Science (SILS) at the University of North Carolina at Chapel Hill delivers four of the five sets of lessons, and Sarah Jones, Digital Curation Centre, delivers the University of Edinburgh-developed content in Week 3 (Working with Data). Quizzes and supplementary videos add to the learning experience, and assignments are peer reviewed by fellow learners, with questions and answers handled by peers and team teachers in the forum.

Staff from both organizations will monitor the learning forums and the peer-reviewed assignments to make sure learners are on the right track, and to watch for adjustments needed in course content.

The course is open to enrolment now, and will ‘go live’ on 1st March.
https://www.coursera.org/learn/research-data-management-and-sharing

Hashtag: #RDMSmooc

A preview of one of the supplementary videos is now available on YouTube:
www.youtube.com/watch?v=yhVqImna7cU

Please join us in this data adventure.
-Helen

Dr. Helen R. Tibbo, Alumni Distinguished Professor
President, 2010-2011 & Fellow, Society of American Archivists
School of Information and Library Science
201 Manning Hall, CB#3360
University of North Carolina at Chapel Hill
Chapel Hill, NC 27599-3360
Tel: 919-962-8063
Fax: 919-962-8071
tibbo@ils.unc.edu

MANTRA @ Melbourne

The aim of the Melbourne_MANTRA project was to review, adapt and pilot an online training program in research data management (RDM) for graduate researchers at the University of Melbourne. Based on the UK-developed and acclaimed MANTRA program, the project reviewed current UK content and assessed its suitability for the Australian and Melbourne research context. The project team adapted the original MANTRA modules and incorporated new content as required, in order to develop the refreshed Melbourne_MANTRA local version. Local expert reviewers ensured the localised content met institutional and funder requirements. Graduate researchers were recruited to complete the training program and contribute to the detailed evaluation of the content and associated resources.

The project delivered eight revised training modules, which were evaluated as part of the pilot via eight online surveys (one for each module) plus a final, summative evaluation survey. Overall, the Melbourne_MANTRA pilot training program was well received by participants. The content of the training modules generally gathered high scores, with low scores markedly sparse across all eight modules. The participants recognised that the content of the training program should be tailored to the institutional context, as opposed to providing general information and theory around the training topics. In its current form, the content of the modules only partly satisfies the requirements of our evaluators, who made valuable recommendations for further improving the training program.

In 2016, the University of Melbourne will revisit MANTRA with a view to implementing evaluation feedback in the program; updating the modules with new content, audiovisual materials and exercises; augmenting targeted delivery via the University’s LMS; and working towards incorporating Melbourne_MANTRA in induction and/or reference materials for new and current postgraduates and early career researchers.

The current version is available at: http://library.unimelb.edu.au/digitalscholarship/training_and_outreach/mantra2

Dr Leo Konstantelos
Manager, Digital Scholarship
Research | Research & Collections
Academic Services
University of Melbourne
Melbourne, Australia

Fostering open science in social science

On 10th of June, the Data Library team ran two workshops in association with the EU Horizon 2020 project, FOSTER (Facilitate Open Science Training for European Research), and the Scottish Graduate School of Social Science.

The aim of the morning workshop, “Good practice in data management & data sharing with social research,” was to provide new entrants into the Scottish Graduate School of Social Science with a grounding in research data management using our online interactive training resource MANTRA, which covers good practice in data management and issues associated with data sharing.

The morning started with a brief presentation by Robin Rice on ‘open science’ and its meaning for the social sciences. Pauline Ward then demonstrated the importance of data management plans in ensuring that work is safeguarded and that data sharing is made possible. I introduced MANTRA briefly, and then Laine Ruus assigned different MANTRA units to participants, asking them to go through the units quickly, extract one or two key messages and report back to the rest of the group. After the coffee break we had another presentation on ethics, informed consent and the barriers to sharing. We finished the morning session with a ‘Do’s and Don’ts’ exercise, in which we asked participants to write on post-it notes the things they remembered and were taking away from the workshop: green for things they should DO, and pink for those they should NOT. Here are some of the points the learners posted:

DO
– consider your usernames & passwords
– read the Data Protection Act
– check funder/institution regulations/policies
– obtain informed consent
– design a clear consent form
– give participants info about the research
– inform participants of how we will manage data
– confidentiality
– label your data with enough info to retrieve it in future
– develop a data management plan
– follow the certain policies when you re-use dataset[s] created by others
– have a clear data storage plan
– think about how & how long you will store your data
– store data in at least 3 places, in at least 2 separate locations
– backup!
– consider how/where you back up your data
– delete or archive old versions
– data preservation
– keep your data safe and secure with the help of facilities of fund bodies or university
– think about sharing
– consider sharing at all stages. Think about who will use my data next
– share data (responsibly)

DON’T
– unclear informed consent
– a sense of forcing participants to be part of research
– do not store sensitive information unless necessary
– don’t staple consent forms to de-identified data records/store them together
– take information security for granted
– assume all software will be able to handle your data
– don’t assume you will remember stuff. Document your data
– assume people understand
– disclose participants’ identity
– leave computer on
– share confidential data
– leave your laptop on the bus!
– leave your laptop on the train!
– leave your files on a train!
– don’t forget it is not just my data, it is public data
– forget to future proof

Robin Rice presenting at FOSTERing Open Science workshop

Our message was that open science will thrive when researchers:

  • organise and version their data files effectively,
  • provide comprehensive and sufficient documentation for others to understand and replicate results, and thus cite the source properly,
  • know how to store and transport their data safely and securely (ensuring backup and encryption),
  • understand legal and ethical requirements for managing data about human subjects, and
  • recognise the importance of good research data management practice in their own context.

The afternoon workshop on “Overcoming obstacles to sharing data about human subjects” built on one of the main themes introduced in the morning, with a large overlap of attendees. The ethical and regulatory issues in this area can appear daunting. However, data created from research with human subjects are valuable, and therefore are worth sharing for all the same reasons as other research data (impact, transparency, validation etc). So it was heartening to find ourselves working with a group of mostly new PhD students, keen to find ways to anonymise, aggregate, or otherwise transform their data appropriately to allow sharing.

Robin Rice introduced the Data Protection Act, as it relates to research with human subjects, and ethical considerations. Naturally, we directed our participants to MANTRA, which has detailed information on the ethical and practical issues, with specific modules on “Data protection, rights & access” and “Sharing, preservation & licensing”. Of course not all data are suitable for sharing, and there are risks to be considered.

In many cases, data can be anonymised effectively, to allow the data to be shared. Richard Welpton from the UK Data Archive shared practical information on anonymisation approaches and tools for ‘statistical disclosure control’, recommending sdcMicroGUI (a graphical interface for carrying out anonymisation techniques, which is an R package, but should require no knowledge of the R language).
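To give a flavour of the kind of check that statistical disclosure control involves, here is a minimal Python sketch of a k-anonymity test. It counts how many records share each combination of quasi-identifiers (variables that could indirectly identify someone) and flags records whose combination occurs fewer than k times. This is only an illustration of the general idea, not a description of how sdcMicroGUI works; the field names, the sample records and the threshold k=2 are all hypothetical.

```python
from collections import Counter

# Hypothetical survey records; none of these values come from a real dataset.
records = [
    {"age_band": "20-29", "postcode_area": "EH8", "response": "yes"},
    {"age_band": "20-29", "postcode_area": "EH8", "response": "no"},
    {"age_band": "30-39", "postcode_area": "EH8", "response": "yes"},
    {"age_band": "30-39", "postcode_area": "EH8", "response": "no"},
    {"age_band": "60-69", "postcode_area": "EH9", "response": "yes"},
]

# Variables assumed (for this example) to be indirectly identifying.
QUASI_IDENTIFIERS = ["age_band", "postcode_area"]


def group_key(record, quasi_identifiers):
    """Return the tuple of quasi-identifier values for one record."""
    return tuple(record[name] for name in quasi_identifiers)


def group_sizes(records, quasi_identifiers):
    """Count how many records share each quasi-identifier combination."""
    return Counter(group_key(r, quasi_identifiers) for r in records)


def smallest_group_size(records, quasi_identifiers):
    """Size of the rarest combination; the data are k-anonymous for any k up to this."""
    return min(group_sizes(records, quasi_identifiers).values())


def risky_records(records, quasi_identifiers, k):
    """Records whose quasi-identifier combination occurs fewer than k times."""
    sizes = group_sizes(records, quasi_identifiers)
    return [r for r in records if sizes[group_key(r, quasi_identifiers)] < k]


print("Smallest group size:", smallest_group_size(records, QUASI_IDENTIFIERS))
print("Records needing attention:", risky_records(records, QUASI_IDENTIFIERS, k=2))
```

Records flagged in this way would typically be candidates for recoding into broader categories, or for suppression, before the data are shared.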

Finally, Dr Niamh Moore from the University of Edinburgh shared her experiences of sharing qualitative data. She spoke about the need to respect the wishes of subjects, her research gathering oral history, and the enthusiasm of many of her human subjects to be named in her research outputs, in a sense to own their own story, their own words.

Rocio von Jungenfeld & Pauline Ward
EDINA and Data Library

Highlights from the RDM Programme Progress Report: Jan – Feb 2015

The Library and University Collections (L&UC) in association with project partner Manchester University received funding from the Jisc “Research Data Spring” programme to define and develop an open source Data Vault application which will allow data creators to describe and store data safely in one of the growing number of archival storage options. Phase 1 of the project started in March 2015.

The University of Edinburgh (UoE) was invited to contribute to a series of EPSRC (Engineering and Physical Sciences Research Council) Compliance Case Studies. Stuart MacDonald, RDM Service Coordinator, was interviewed by Jisc and the DCC in relation to the RDM programme and institutional compliance with forthcoming EPSRC research data expectations. The case study will be published on the Jisc website in May 2015.

RDM Service Coordinator Stuart MacDonald co-presented with Rory Macneil (RSpace) their practice paper “Service Integration to Enhance RDM: RSpace electronic laboratory notebook (ELN) case study” at the International Conference on Digital Curation (IDCC) in London (Feb 2015). The paper has been published in the International Journal of Digital Curation (http://www.ijdc.net/index.php/ijdc/article/view/10.1.163), open access.

The RDM Service Coordinator also presented on ‘RDM Training Initiatives @ Edinburgh’ at the “Comparing Notes: Training Librarians for Research Data Management and Open Science Support” workshop at IDCC.

An EPSRC Expectations Awareness Survey was sent out to 98 EPSRC grant holders, of whom 38 responded. Nine** grant holders agreed to participate in a follow-up interview. The findings of the interviews will follow shortly. Dr Evamaria Krause (Marburg University, Germany) completed a six-week internship with L&UC, where she assisted with the EPSRC Expectations Awareness Survey and EPSRC grant holder interview exercises.

All Schools in the College of Humanities and Social Science (CHSS) have now added links to the RDM Programme website and other RDM pages via their intranets. RDM Project Plan deadlines and deliverables which underpin the RDM Roadmap have been updated.* For more details visit the RDM Programme wiki (some content only available to UoE staff).

Four tailored Data Management Plans sessions have been organised with research groups in the College of Medicine and Veterinary Medicine and CHSS, and two workshops for the European Association for Health Information and Libraries (EAHIL) conference in Edinburgh are scheduled to run in June 2015.

Edinburgh DataShare release 1.71 has been announced, with new features including faceted browsing, SOLR usage statistics, and an increase in the size limit on web deposit of Items from 5 GB to 10 GB.

DataSync (a Dropbox-like service in development) was themed and made available for beta testing to Information Services colleagues.

* IT Infrastructure input pending
** 1 PhD student who was forwarded the survey agreed to be interviewed

Data: photographs in research

In collaboration with Scholarly Communications, the Data Library participated in the workshop “Data: photographs in research” as part of a series of workshops organised by Dr Tom Allbeson and Dr Ella Chmielewska for the pilot project “Fostering Photographic Research at CHSS” supported by the College of Humanities and Social Science (CHSS) Challenge Investment Fund.

In our research support roles, Theo Andrew and I addressed issues associated with finding and using photographs from repositories, archives and collections, and the challenges of re-using photographs in research publications. Workshop attendees came from a wide range of disciplines, and were at different stages in their research careers.

First, I gave a brief introduction to terminology and research data basics, and navigated through media platforms and digital repositories such as Jisc Media Hub, VADS, Wellcome Trust, Europeana, Live Art Archive, Flickr Commons, and the Library of Congress Prints & Photographs Online Catalog (Muybridge http://hdl.loc.gov/loc.pnp/cph.3a45870).

Eadweard Muybridge. 1878. The Horse in motion. Photograph.

From the Library of Congress Prints and Photographs Online Catalog

Then, Theo presented key concepts of copyright and licensing, which opened up an extensive discussion on what things researchers have to consider when re-using photographs and what institutional support researchers expect to have. Some workshop attendees shared their experience of reusing photographs from collections and archives, and discussed the challenges they face with online publications.

The last presentation, tackling the basics of managing photographic research data, was not delivered due to time constraints. The presentation was aimed at researchers who produce photographic materials; however, advice on best RDM practice is relevant to any researcher, regardless of whether they are producing primary data or reusing secondary data. There may be another opportunity to present the remaining slides to CHSS researchers at a future workshop.

New release of Research Data MANTRA (Management Training) online course

The Research Data MANTRA course is an open, online training course that provides instruction in good practice in research data management. There are nine interactive learning units on key topics such as data management planning, organising and formatting data, using shared data and licensing your own data, as well as four data handling tutorials with open datasets for use in R, SPSS, NVivo and ArcGIS.

This fourth release of MANTRA has been revised and systematically updated with new content, videos, reading lists, and interactive quizzes. Three of the data handling tutorials have been rewritten and tested for newer software versions too.

New content in the online learning modules with the September 2014 release:

  • New video footage from previous interviewees and introducing Richard Rodger, Professor of Economic and Social History and Stephen Lawrie, Professor of Psychiatry & Neuro-Imaging
  • Big Data now in Research Data Explained
  • Data citation and ‘reproducible research’ added to Documentation and Metadata
  • Safe password practice and more on encryption in Storage and Security
  • Refined information about the DPA and IPR in Data Protection, Rights and Access
  • Linked Open Data and CC 4.0 and CC0 now covered in Sharing, Preservation & Licensing

This release will also be more stable and more accessible due to back-end enhancements. The flow of the learning units and usability of quizzes have been improved based on testing and feedback. We have simplified our feedback form and added a four-star rating button to the home page. A YouTube playlist for each unit is available on the Data Library channel.

MANTRA was originally created with funding from Jisc and is maintained by EDINA and Data Library, a division of Information Services, University of Edinburgh. It is an integral part of the University’s Research Data Management Programme and is designed to be modular and self-paced for maximum convenience; it is a non-assessed training course targeted at postgraduate research students and early career researchers.

Data management skills enable researchers to better organise, document, store and share data, making research more reproducible and preserving it for future use. Researchers in 144 countries used MANTRA last year; it is available without registration from the website. Postgraduate training organisations in the UK, Canada, and Australia have used the Creative Commons licensed material in the Jorum repository to create their own training. The website also hosts a ‘training kit’ for librarians wishing to increase their skills in supporting Research Data Management.

Visit MANTRA and consider recommending it to your colleagues and research students this term! http://datalib.edina.ac.uk/mantra/

Usage Statistics

According to Google Analytics, the following organisations’ websites were the top ten referrers to the MANTRA website for the academic year 2013-2014 (discounting Data Library, EDINA and Information Services):

  • Institute for Academic Development, University of Edinburgh
  • LIS Links (India)
  • Digital Curation Centre
  • eScience Portal for New England Libraries at University of Massachusetts Medical Library
  • Oxford University
  • University of Nebraska-Lincoln (USA)
  • Carleton University (Canada)
  • Glasgow University
  • Food and Agriculture Organization of the United Nations
  • Jisc

Social media sites Facebook, Twitter and Slideshare provided a large number of referrals; several more came from other UK institutions and from HEIs in Australia, the rest of Europe, and North America, especially university library pages. Forty percent of sessions came from a referring website.

Visitors to MANTRA over the year came from 144 countries. Google searches accounted for 4,000 sessions, 25% of the total. Nearly ten thousand visits were from new users (based on IP addresses) over the year from 22nd August, 2013 – 23rd August, 2014. Here is a link to a Google Analytics summary spreadsheet extracted from our account.

We expect to have more detailed usage statistics over the forthcoming year due to moving the learning units out of the authoring software (Xerte Online Toolkits) onto the main MANTRA website.

Postscript, 15 Sept: See my Storify story, “Research Data MANTRA Buzz” to find out who’s been talking about MANTRA on Twitter!

Robin Rice
Data Librarian

New data curation profile in History

Margaret Forrest, Academic Liaison Librarian for the School of History, Classics and Archaeology, is the latest to contribute a data curation profile. She has interviewed researcher Graham J. Black, who is a PhD candidate in the School. His subject is the aerial bombing during the Vietnam War and he has thousands of government documents, articles and pictures to manage.

The profile has been added to the DIY RDM Training Kit for Librarians web page, alongside previous ones created by other librarians participating in the RDM librarian training. The librarians covered five RDM topics in separate two-hour sessions, where they reinforced what was learned in MANTRA through group discussion, exercises from the UK Data Archive, and listening to local experts.

Each librarian was encouraged to complete an independent study as part of the training: interview a researcher and write up a data curation profile. This was designed to test their self-confidence at talking to researchers about RDM, as well as give them the opportunity to ‘share their data’ by publishing the profile on the website.

Margaret described her experience to Anne Donnelly, one of the trainers:

This was definitely the most enjoyable part of the training and I learned so much from this interview process and the writing up (mainly because of the value of what I had learned from the MANTRA course).

The final group of eight academic service librarians completed their training this summer. This completes a deliverable in the University’s RDM Roadmap. More curation profiles are welcome; we may put them in a collection in Edinburgh DataShare. They could be useful learning objects for others doing training in research data support, in terms of thinking critically about RDM practices.

Robin Rice
Data Librarian