Highlights from the RDM Programme Progress Report: May to July 2016

Posted on February 2, 2017 by admin

The following key results were highlighted in the RDM Programme Progress Report:

There were 42 new users and 69 data management plans created with DMPonline.
An additional 1.5PB has been procured for DataStore’s general capacity expansions.
The Roslin Institute has deposited 16 datasets into Data Vault.
DataShare upload release (2.1) went live on 23 May 2016.
There are now 334 dataset records in PURE, an increase of 124 records from the last reporting period (February to April 2016).
54 datasets have been deposited into DataShare.
The University of Edinburgh was recommended as a preferred supplier on the Framework for the Research Data Management Shared Services for Jisc Services Ltd (JSL) for the following Lots:
Lot 2: Repository Interfaces
Lot 3: Data Exchange Interface
Lot 6: Research Data Preservation Tools Development
Lot 8: User Experience Enhancements
A total of 390 staff and postgraduates attended RDM courses and workshops during this quarter.
A total of 3,649 learners enrolled for the 5-week RDMS MOOC rolling course from March through July, 2016 and a total of 461 people completed the course in the same time frame.
There were 5,198 MANTRA sessions recorded from May to July with 58 to 60 percent identified as new users.
An RDM Forum was set up in collaboration with the College of Arts, Humanities and Social Sciences (CAHSS) Research Officer and Research Outputs Co-ordinator. The first RDM forum is scheduled for Wednesday, 7 September 2016.

Data Management Planning highlights

We currently hold sample data management plans for grant applications submitted to the Arts and Humanities Research Council (AHRC), the Economic and Social Research Council (ESRC) and the Medical Research Council (MRC).

Active Data Infrastructure highlights

DataStore

An additional 1.5PB has been procured for general capacity expansions. This capacity will primarily be deployed to the College of Medicine & Veterinary Medicine (CMVM) and the College of Science & Engineering (CSE).

MRC Institute of Genetics & Molecular Medicine (IGMM) has purchased an additional 1.2PB of capacity, and this is now deployed in their dedicated file system.

Data Stewardship highlights

DataShare

The large data sharing investigation was completed for DataShare and reported previously. Upload release (2.1) went live on 23 May 2016. The download release is planned to follow the ‘embargo release’ and the ShareGeo spatial data migration.

Data Vault

There was a soft release of Data Vault in February 2016, with the Roslin Institute depositing 16 datasets during this quarter.

PURE

There are now 334 dataset records in PURE, an increase of 124 records from the last reporting period (February to April 2016).

Research Data Discovery Service (RDDS)

Two PhD interns are working on School engagement activities (dataset records into PURE / datasets into DataShare) for Divinity & Division of Infection and Pathway Medicine; contracts end 16 September 2016. One PhD intern retrospectively added DataShare metadata to PURE for data deposits prior to PURE Data Catalogue functionality; contract to end 16 September 2016. A fourth PhD intern (to work with School of Informatics) is awaiting approval.

Data Management Support highlights

A total of 390 staff and postgraduates attended RDM courses and workshops during this quarter.

Other related research data management support activities to highlight

A talk, ‘Understanding and overcoming challenges to sharing personal and sensitive data’, was given at the ReCon (Research Communication and Data Visualisation) Conference on 24 June 2016 at The Edinburgh Centre for Carbon Innovation (ECCI).
A ‘Working with sensitive data in research’ guide was written for research staff and students in social sciences.
Another guide is being written on ‘Sharing and retaining data’ for research staff and students in social sciences.
An RDM Forum was set up in collaboration with the College of Arts, Humanities and Social Sciences (CAHSS) Research Officer and Research Outputs Co-ordinator. The first RDM forum is scheduled for Wednesday, 7 September 2016.

Other activities to highlight

The outcome of Jisc RDM Shared Services bid that was submitted in March 2016

The Procurement Panel has recommended the University of Edinburgh as a preferred supplier on the Framework for the Research Data Management Shared Services for Jisc Services Ltd (JSL) for the following Lots:

Lot 2: Repository Interfaces
Lot 3: Data Exchange Interface
Lot 6: Research Data Preservation Tools Development
Lot 8: User Experience Enhancements

Unfortunately, the Procurement Panel has decided not to recommend the University of Edinburgh for the following Lots:

Lot 1: Research Data Repository
Lot 4: Research Information and Administration Systems Integrations

National and International Engagement Activities

From May to June

Stuart Macdonald and Rocio von Jungenfeld ran three workshops for the IS Innovation Fund project, Data-X: Pioneering Research Data Exhibition, with PhD students from across the University. Introduction to Data-X: Pioneering Research Data Exhibition

In June

Stuart Macdonald gave a peer-reviewed presentation at the IASSIST conference, Bergen: Supporting the development of a national Research Data Discovery Service – a Pilot Project

Robin Rice presented a poster at Open Repositories 2016, Dublin: Data Curation Lifecycle Management at the University of Edinburgh

Pauline Ward presented a lightning talk at Open Repositories 2016, Dublin: Growing Open Data: Making the sharing of XXL-sized research data files online a reality, using Edinburgh DataShare

Stuart Macdonald was an invited speaker at NFAIS (National Federation of Abstracting and Information Services) Fostering Open Science Virtual Seminar: NFAIS Fostering Open Science Virtual Seminar

In July

Robin Rice gave two presentations (invited and peer-reviewed) at LIBER 2016, Helsinki: University of Edinburgh RDM Training: MANTRA & beyond; Designing and delivering an international MOOC on Research Data Management and Sharing

Robin Rice filled in for Stuart Lewis as invited speaker for JISC-CNI 2016, London: Managing active research in the University of Edinburgh

This is the last quarterly report as the Research Data Management (RDM) Roadmap Project (August 2012 to July 2016) came to a close on 31 July 2016.

There will be discussions with the RDM Steering Group to decide how future reporting will be conducted. These reports will be released on the Research Data Blog as well.

Tony Mathys
Research Data Management Service Co-ordinator

Highlights from the RDM Programme Progress Report: February to April 2016

Posted on August 19, 2016 by admin

The membership of the Research Data Service Virtual Team across four divisions of IS was confirmed, and the team met for the first time (replacing the former action group meetings) on 11 February, where it was agreed that meetings would be held approximately every six weeks for information and decision-making.

In February, the DataShare metadata was mapped to the PURE metadata, and staff in L&UC and Data Library trained each other in creating dataset records in Pure and reviewing submissions in DataShare. It was agreed that staff would create records in Pure for items deposited in DataShare until the company (Elsevier) provides a mechanism for automatically inputting records into Pure.

In March, Jisc announced that the University of Edinburgh was selected as a framework supplier for their new Research Data Management Shared Service.

A review of the existing ethics processes in each college is in progress with Jacqueline McMahon at the College of Arts, Humanities and Social Sciences (CAHSS) to create a University-wide ethics template. There is also engagement with the School ethics committees at the School of Health in Social Sciences (HiSS), Moray House School of Education (MHSE), Law and School of Social and Political Science (SPS) in CAHSS.

The Research Data Management and Sharing (RDMS) Coursera MOOC opened for enrolment on 1 March 2016. This was completed in partnership with the University of North Carolina-Chapel Hill CRADLE project. Research Data Management and Sharing (RDMS) MOOC stats from the Coursera Dashboard reveal that as of 23 May 2016, there have been 5,429 visitors and 1,526 active learners; 335 visitors have completed the course.

The large data sharing investigation was completed for DataShare and reported previously. (Two new releases in DataShare defined: upload and download). Upload release (2.1) to go live 23 May 2016.

PURE dataset functionality is now included in standard PURE and Research Data Management (RDM) training. There are now 210 dataset records in PURE.

Four PhD interns were hired in mid-March to act as College representatives for the IS Innovation Fund Pioneering Research Data Exhibition. They will be employed until mid-December 2016.

A total of 363 staff and postgraduates attended RDM courses and workshops during this quarter.

There were 30 new DMPonline users and 55 new plans created during this quarter.

There are now 210 dataset metadata records in PURE.

A total of 56 datasets were deposited in DataShare during this quarter.

The total number of DataStore users rose from 12,948 in the previous quarter to 13,239 in this quarter, an increase of 291 new users.

National and International Engagement Activities

In February

Stuart Lewis gave a DataVault presentation at the International Digital Curation Conference (IDCC) in Amsterdam.

In March

A University news item was released to mark the launch of the Research Data Management and Sharing (RDMS) MOOC on Coursera. http://www.ed.ac.uk/news/2016/dataskills-010316
Stuart MacDonald gave an RDM presentation to trainee physicians at the Royal College of Physicians Edinburgh Course: Critical appraisal and research for trainees, Edinburgh. http://www.slideshare.net/smacdon2/rdm-for-trainee-physicians
Three delegates from Göttingen University were hosted here. The delegates have shared interests in RDM and visited to gain more insight into RDM support and experiences here.
Robin Rice gave an invited talk about the RDMS MOOC and web-based Survey Documentation and Analysis (SDA) tool to Learning, Teaching and Web and elearning@Ed Showcase and Network monthly gathering.

In April

Robin Rice attended the ATI Reproducibility and Sustainability workshop at Oxford University on behalf of IS and gave a lightning talk on use of DataShare for reproducible science.
Stuart MacDonald gave an invited presentation on RDM and ELNs to HEIDS at University of Strathclyde: Research Data Management and ELN Information Sharing Event. http://www.slideshare.net/edinadocumentationofficer/rdm-elns-edinburgh
Robin Rice gave a peer-reviewed presentation at the Force11 2016 conference in Portland, Oregon: Overcoming Obstacles to Sharing Data about Human Subjects. http://www.slideshare.net/rcrice/overcoming-obstacles-to-sharing-data-about-human-subjects

As part of my responsibilities to cover the one year interim of Kerry Miller’s maternity leave, I will be writing blogs for this page until Kerry returns next summer.

Prior to this post, I worked the past 12 years as the geospatial metadata co-ordinator at EDINA. My primary role was to promote and support research data management and sharing amongst UK researchers and students using spatial data and geographical information.

Tony Mathys
Research Data Management Service Co-ordinator

Mobile internet time for social networking, games and ad spending up

Posted on September 2, 2015 by admin

People are spending more time on their mobiles…social media accounts for more than 20% of time spent, compared to under 10% on desktop. Currently consumers spend 45% of their internet time on computers, 40% on mobiles and 15% on tablets.

Mobile internet time is more heavily skewed towards social networking and games, whilst desktop is more loaded towards email and entertainment such as film and multimedia.

Read the Guardian article online.

All that’s based on an IAB report with more related data and survey findings showing an increase in advertising spending for mobiles too.

Ofcom’s twelfth annual Communications Market report

Posted on August 6, 2015 by admin

The UK is now a “smartphone society”

Smartphones have overtaken laptops as the most popular device for getting online, Ofcom research has revealed, with record ownership and use transforming the way we communicate.

Two thirds of people now own a smartphone, using it for nearly two hours every day to browse the internet, access social media, bank and shop online.

Ofcom’s 2015 Communications Market Report finds that a third (33%) of internet users see their smartphone as the most important device for going online, compared to 30% who are still sticking with their laptop.

The full, downloadable report is on their website.

IASSIST 2015 41st Annual Conference

Posted on July 10, 2015 by admin

Minneapolis, MN, USA, 2 to 5 June 2015
Host institution: Minnesota Population Center at the University of Minnesota

The theme of the 2015 conference was Bridging the Data Divide: Data in the International Context with many of the sessions dedicated to research data management in academia, which of course is being embraced across a growing number of UK academic institutions. I seem to recall that about 20 percent of UK academic institutions have a research data management strategy in place, so these sessions were of considerable interest, and well attended.

Data Infrastructure and Applications sessions were also prominent at the conference, with some interesting presentations relevant to EDINA, and attendance quite good, especially for the Block 5, E1 session for Geospatial and Qualitative Data on Thursday, June 4, 13:30 to 15:30. My presentation on GoGeo was slotted into this session along with three others, which focussed more on qualitative data. http://iassist2015.pop.umn.edu/program/block5#a1

Plenary Sessions

The first plenary session was interesting as Professor Steven Ruggles, from the Minnesota Population Center, provided an overview of the history of the US Census and how it was at the forefront with regards to data capture, process and dissemination. The second plenary speaker, Curtiss Cobb, from Facebook, tried to make the case that Facebook serves as a force of social good in the world, and Andrew Johnson, from the City of Minneapolis, spoke at the final plenary session on Friday with an overview of the City’s open data policy.

Summaries of relevant presentations

3 June, Wednesday morning session:
A3: Enabling Public Use of Public Data

Mark Mitchell, from the Urban Big Data Centre (UBDC) at the University of Glasgow, provided an interesting presentation titled And Data for all. The UBDC takes the Glasgow City Council’s urban open data that it has created, and makes it available to the public and to academia through its UBDC Data Portal (http://ubdc.gla.ac.uk/), which currently holds 934 datasets, primarily from the Glasgow City Council and Greater London Authority. MM noted the use of CKAN to build their data portal, and the use of R and QGIS at UBDC. He also noted that there are 300+ data portal users and that the UBDC tries to provide good metadata records and crosslink these with their datasets.

MM noted that there was a considerable degree of metadata quality, but indicated that the Glasgow City Council planned to mandate a minimum standard for metadata quality.

Some issues were revealed, most notably differences in projections between datasets where Transport Planning used British National Grid and Health Services used northing-easting.

He also pointed out an interesting result in a survey conducted in Glasgow which revealed support for the use of personal data for societal benefit, but not for commercial interest.

He touched on the ESRC-funded Integrated Multimedia City data (iMCD) project, which is intended to capture urban life through surveys, sensors and multimedia.
http://ubdc.ac.uk/our-research/research-projects/methods-research/integrated-multimedia-city-data-imcd/

Then on that same strand, he made reference to the gamification of data, which would incorporate Minecraft server and Minecraft, an interactive block game, to introduce Glasgow open data to Glasgow primary school children to make geography and maps more engaging and interesting.

More about this can be found on the UBDC website via this link.
http://ubdc.ac.uk/our-services/research-services/ubdc-computing-cluster/minecraft-server/

Someone noted during questions that the Australian Bureau of Statistics has created a mobile game called Run That Town. The ABS use data from every postal area in Australia and incorporate it into this mobile game.

Run That Town gives each player the ability to nominate any Australian town and take over as its virtual ruler. Players have to decide which local projects to approve and which to reject, with the real Census data of their town dictating how their population reacts. To win, players need to maintain their popularity, making Census data core to the gameplay and giving players the chance to use the data themselves.
http://runthattown.abs.gov.au/

Mark also mentioned collaborative efforts between UBDC and the Glasgow School of Art to create noise and light maps for the City of Glasgow, then noted that housing charities were requesting more data from the Glasgow City Council as well.

Winny Akullo, from the Uganda Bureau of Statistics, delivered another presentation in this session, which provided an overview of the results of a quantitative study in Uganda that was carried out to investigate ways of improving the dissemination of statistical information there. The results indicated that the challenge remained, and that more resources were required to improve dissemination.

Margherita Ceraolo, from the UK Data Service, wrapped up the session with her presentation about the global momentum towards promoting open data including support from national governments and IGOs (e.g. IMF, World Bank and UN).

She made reference to macro data as well as boundary data, then made a reference to the UKDS building an open API for data re-use; release is scheduled for the end of 2015. She also made a reference to a map visualisation interface to display all data in their collection.

3 June, Wednesday afternoon session:
B5: Building on Common Ground: Integrating Principles, Practices, and Programs to support Research Data Management

Lizzy Rolando (Georgia Tech Library), and Kelly Chatain, from the Institute for Social Research (ISR) at the University of Michigan, gave interesting presentations on support for research data management at their respective institutions. Session Chair, Bethany Anderson, from University Archives at the University of Illinois-Urbana, also discussed ways of integrating the work of academic archives and research data services to appraise, manage and steward data.

Some key points that they noted during their presentations included the following:

requiring a chain of custody for data to encourage collective ownership and responsibility;
making data use a higher priority than preservation; and
noting Purdue University’s policy for data retention, which requires a reappraisal of data every 10 years.

These are eminently sensible approaches to data management in academia. Granted, the first one faces resistance, but if data creators and users refuse to be accountable for data, then who assumes this responsibility? Ownership needs to be addressed if data are to be managed and shared, and when it becomes a collective responsibility, then perhaps there might be more willingness as a shared activity?

Data re-use ought to be prioritised as well, and periodically assessed rather than stored on various media to be forgotten. It’s become another of many classic excuses when terabytes of data are blamed for eschewing the responsibilities of data documentation/metadata creation.

It’s uncertain, but how many spatial datasets are worth a place in archival storage? If there are spatial datasets of no value, then they should be deleted rather than saved. The question is who makes these decisions, though one could assume that it would be within each department?

3 June, Wednesday afternoon late session:
C5: No Tools, No Standard — Software from the DDI Community

Listened to a presentation about the Ontario Data Documentation, Extraction Service and Infrastructure (ODESI) and the Canadian Data Liberation Initiative (DLI), with reference to Nesstar. Nesstar is a software system for data publishing and online analysis. The Norwegian Social Science Data Services (NSD) owns it, and I recall it from the time I worked years ago at the UK Data Archive.

4 June, Thursday morning session:
D4: Minnesota Population Data Center (MPC) Data Infrastructure: Integration and Access

This session provided an overview of the Minnesota Population Center (MPC) project activities with most of the presentation about Integrated Public Use Microdata Series (IPUMS) (www.ipums.org), which is dedicated to collecting and distributing free and accessible census data, both US and international census data.

It was interesting to note the breakdown of users from the presentation: economists were the largest group at 31 percent, demographers and sociologists accounted for 16 percent, and journalists and government users for 15 percent. Only 8 percent of users were identified as geographers/GIS, though they indicated that their numbers were growing.

The North Atlantic Population Project (NAPP) was mentioned, which includes 19th and early 20th century census microdata from Canada, Great Britain, Germany, Iceland, Norway, Sweden, and the US, so it is worth noting that British census data are available as well.

The Terra Populus project (http://www.terrapop.org/) was also covered and sounded quite interesting. The goal of the project is to integrate the world’s population (census) with environmental data (remotely-sensed land cover, land cover records and climate data).

There is also a temporal aspect to this which examines interactions over time between humans and environment to observe changes that take place between the two.

There is a TerraPop Data Finder being built, which is currently in beta. It holds census data, and land use, land cover and climate data.
https://beta.terrapop.org/

The MPC has also been involved with the State Health Access Data Assistance Center (SHADAC) Data Center, doing analysis on estimates of health insurance coverage, health care use, access and affordability using data from the 2012 National Health Interview Survey (NHIS).
http://datacenter.shadac.org/

4 June, Thursday afternoon session:
E1: Geospatial and Qualitative Data

There was exceptionally good attendance for this session with most of the room filled. Amber Leahey, the Data Services Metadata Librarian at the University of Toronto, was Chair of our session. Had a chance to talk to her after the session, and learned about the Scholars GeoPortal, which is an online resource for Canadian academics and students to access licensed geospatial datasets through a subscription service, much like Digimap. Impressive portal, and data are free, though the portal provides a limited number of Canadian datasets. They encourage data creators to upload their datasets to the portal, much like Digimap ShareGeo, but face similar challenges as here. http://geo1.scholarsportal.info/

Andy Rutkowski (USC) started the session with his presentation on using qualitative data (social media, tweets, interviews, archived newspaper classifieds, photographs) to improve the understanding of quantitative data to produce more meaningful maps, maps as social objects, a move towards spatial humanities?

He alluded to skateboarders’ information about pavement conditions at various locations in Los Angeles that led to a new skateboard park.
http://la.streetsblog.org/2014/07/23/filed-under-mostly-rad-skate-park-to-open-thursday-in-hard-to-skate-to-hazard-park/

He also referred to Professor Nazgol Bagheri’s (UT San Antonio) work on mapping women’s socio-spatial behaviours in Tehran’s public spaces using photographs and narratives linked to GIS data from the Iranian Census, national GIS database and City of Tehran; all this to generate a qualitative GIS map that displays the gendering of spatial boundaries.

He concluded with a reference to the LA Times Mapping project, which started in 2009 and displays the neighbourhoods of Los Angeles, which have been redrawn using feedback from readers whose perceptions of boundaries differed from the original ones. http://maps.latimes.com/neighborhoods/

The next presentation (The Landscape of Geospatial Research: A Content Analysis of Recently Published Articles) was a joint collaboration with library staff at the University of Michigan reporting on their efforts and results to use geospatial research methods to capture information from the body of published literature. Samples of articles, from a selection of multi-disciplinary journals with spatial themes, were UID coded for content including spatial data cited, software used and research methodology; I assume with regards to software, this would be ArcGIS, ERDAS, MapInfo, etc.?

Metadata was also compiled for the articles, which included title, subject, author(s), subject affiliation, number of authors and their gender; this information was extracted through multi-coding. There was also reference to geo co-ordinate analysis and building the schema to support this information extraction.

Certainly the Unlock geo-parser (http://edina.ac.uk/unlock/) comes to mind as being relevant to their project. We’ve already discussed the possibility of doing something similar using GoGeo to extract and harvest metadata from open access journals as publications represent the best sources for spatial data information with most publications peer-reviewed, and the data cited, so this should address data quality concerns, and the purpose for which the data were created. Each publication would also provide the author(s) name(s) and contact details for those interested in acquiring the data, which might in turn pressure researchers to release their data through GoGeo rather than face personal requests for their data.

My presentation followed and can be found on this EDINA page.
GoGeo: A Jisc-funded service to promote and support spatial data management and sharing across UK academia
http://edina.ac.uk/presentations.html#presentations

One of my comments, and a photo of one of my slides, reached the IASSIST conference’s Twitterland and went viral at the conference, though I noted as well that metadata creation was important, but the reality is that after 14 years of metadata coordination both in the public sector and academia, I’ve yet to meet anyone who has actually expressed any pleasure in creating metadata.

My presentation provided an overview of EDINA, Jisc and the GoGeo Spatial Data Infrastructure, then summarised the latter’s successes and shortcomings: the former attributed to GoGeo users searching for data; the latter, to GoGeo users unwilling to share their data. My presentation also offered the audience new approaches that would encourage spatial data management and sharing, including a mandatory requirement for students to use Geodoc to document data cited in their dissertations and theses as a requirement for graduation. It’s often easier for a department to impose this requirement on its students rather than its faculty, but if students document their data, future students can access the metadata records as part of their literature review, and access data that might complement their own research data. This in turn would require university departments to take ownership of their students’ data and make them available to others, so at least spatial data is shared internally. This could be restricted to the department or within a university if there is a data management policy and the infrastructure in place to support it, though if not, GoGeo provides this.

The use of Geodoc and the GoGeo private catalogues was also presented as another approach to supporting spatial data information management, with Geodoc used at the personal level where a researcher can document his/her spatial data, then use Geodoc to store and update those records. There is then the option of exporting Geodoc records to attach to shared spatial datasets, which seems the preferred option as academics will entrust their data to colleagues rather than make them openly available; the data recipient can then import the metadata record into his/her own Geodoc to access for updating and editing. The other option is for Geodoc users, whether part of a research project group, a department, or university, to publish their metadata records to a GoGeo private catalogue, which only those with assigned usernames and passwords can access. As I manage these catalogues, I can assign these to those who’ve been granted permission to access the metadata records, and who can be affiliated with the same project, but from different universities.

The hopeful outcome would be that after these records and their datasets have served their purpose, the records would be published in GoGeo’s open catalogue and the data uploaded to ShareGeo, or a GoGeo database, as it would be better to have both the metadata and data in the GoGeo portal and not separate, as is the case now between GoGeo and the ShareGeo data repository, which records from 500 to 3,000 downloads a month, so better to redirect those users to GoGeo.

My presentation noted as well the Jisc commitment to providing the resources to the UK academic community in support of research data management, then noted that about 20 percent of the UK universities have a research data management policy in place.

Also in line with the Landscape of Geospatial Research: A Content Analysis of Recently Published Articles presentation, the search interface in GoGeo could be updated to search and harvest metadata from peer-reviewed open access journal publications. It would also be an important step forwards if publishers would require authors to release their data, but there seems to be no movement on that front as it is in the financial interest of most publishers to publish more, and might see this as an imposition on researchers which would result in fewer publications?

If there was any consolation, there were other presentations at IASSIST that revealed similar experiences (see 5 June, Friday morning session), so academia represents a formidable challenge both here and the US, and probably in most other countries as well?

Mandy Swygart-Hobaugh (Georgia State University) concluded the session with her presentation on qualitative research. She asked if social sciences data services librarians devoted their primary attention to quantitative researchers to the detriment of qualitative researchers, and her survey indicated that it is overwhelmingly biased towards quantitative data researchers.

5 June, Friday morning session:
F5: Using data management plans as a research tool for improving data services in academic libraries

Amanda Whitmire (Oregon State University), Lizzy Rolando (Georgia Tech Library) and Brian Westra (University of Oregon Libraries) combined to offer interesting presentations.

AW talked about the DART Project (Data management plans as A Research Tool). This NSF-funded project is intended to facilitate a multi-university study to develop an analytic rubric to standardise the review of faculty data management plans for Oregon State University, the University of Michigan, the Georgia Institute of Technology and Penn State University.

This poster offers more insight about the DART project.
https://ir.library.oregonstate.edu/xmlui/bitstream/handle/1957/55482/ACRL2015_DARTPoster_final.pdf?sequence=1

She also talked about the Data Management Plan (DMP) tool, which can provide a rich source of information about researchers and their research data management (RDM) knowledge, capabilities and practices. Her findings included the possibility of plagiarism, with 40 percent of researchers sharing text, and geographical research comprising only 8 percent of the RDM activities. That is probably no different from here in the UK, where the social sciences and geosciences seem more averse to data management and sharing. In addition, only 10 percent of the researchers approached the RDM staff for assistance.

The DMP tool also has the functionality to see cross-disciplinary trends without engaging with the researchers, and with only 10 percent of the researchers approaching the RDM staff, this is probably just as well. She noted that the cross-disciplinary trends were high for the likes of Mathematics and Physics and low for Geography, which is really no surprise.

Further assessment revealed that eight research plans/practices(?) did not indicate any intent of releasing data; five plans indicated a selective release of ‘relevant data’, which she interpreted as leaving it to the researchers’ discretion and just another way of saying ‘no’ to data sharing.

In addition, she reported that researchers’ descriptions of data types were done well, but there was no mention of metadata creation, data protection or data archiving, and only some mention of data re-use.

Lizzy Rolando revealed similar results during her presentation which involved feedback from researchers at Georgia Tech.

Asked about their plans on how they would share their data, researchers indicated the following:

– Citation in journals: 22 percent
– Conferences: 10 percent
– Repository: 9 percent
– Other repository: 7 percent

In effect, most researchers perceived that the citation of their data in journals or at conferences was effectively data sharing; only a minority seemed inclined to share their data directly.

Also, the survey results indicated that researchers weren’t aware of metadata standards, or of metadata at all. They expressed a willingness to share their data, but not to archive it; again, their interpretation of data sharing seems to be sharing only through citation.

LR suggested that one way to encourage researchers to create metadata was to do so informally through note taking. The question I have is whether researchers would be willing to share their notes, or allow librarians or others to reference their notes to create metadata.

I’ve offered my services to academics, but no one has accepted the offer to provide their data so that I could extract information to document their datasets, and this goes a step further than asking researchers to take notes about their data.

It’s a good idea and should be a reasonable approach to data management, but can it succeed? Without any formal structure, what will happen to the notes? Will those files be stored randomly in various media, accidentally deleted, or not properly updated to reflect changes made to the dataset?

Brian Westra, from the University of Oregon, offered another summary of a similar survey conducted at his university; the survey targeted researchers in Chemistry, Biological Sciences and Mathematics.

Asked about data documentation/description and metadata standards, 51 researchers in Biological Sciences and Chemistry acknowledged the following:

– Data description: 14
– Could identify metadata standards: 10
– Making data public: 14
– Mentioned data formats: 12

The Dryad repository was mentioned amongst the 14 who responded to making data public, but again, with only 10 respondents acknowledging familiarity with metadata standards, there are RDM issues here as well.

Feedback also indicated that most researchers were concerned about trusting others with their data. Though 14 respondents acknowledged that they shared their data, most indicated that they did so through citation in publications and on their own website, so again there is a reluctance to physically share data. If they did actually share the data, it can be inferred that it would have been one-to-one with colleagues they could trust.

Turning to the survey for researchers in Chemistry, much the same was suggested in the results. A majority indicated that they shared their data through citations in publications and only shared data through ‘specific requests’. Again trust comes into play here, and I assume these requests would be approved if they came from a close or trusted colleague.

The respondents noted the following as methods of data sharing in this order:

– Publications
– On request
– Personal website
– Data centre
– Repository
– Conferences

None of the respondents made any reference to metadata or standards.

BW concluded with an overview of the National Science Foundation’s (NSF) effort to encourage research data management and sharing, which basically requires the research community, which receives considerable NSF funding, to establish data management practices. However, BW noted that it’s not happening, though he said there was one recent occasion where continued funding for a postgrad student was withheld until the student had submitted an RDM plan to the NSF. So there has been little progress there, even from a major funding body such as the NSF. This sounds similar to experiences at NERC, where researchers saw funding as a one-off and so felt no obligation to submit their data to NERC after the project was finished, though I think they were to review this and try to find another strategy that would encourage better data management and sharing.

The resistance within academia to both data management and sharing is quite concerning, as access to the data should be part of the peer-review process. In this Reuters article, and others, it’s noted that there are publications where the data don’t hold up to scrutiny, which is alarming.
http://www.reuters.com/article/2012/03/28/us-science-cancer-idUSBRE82R12P20120328

As governments continue to cut funding for research, it becomes increasingly difficult for researchers to collect sufficient data for proper analysis, and they become less inclined to share their data. Will this only exacerbate the problem, or are there other issues as well? Certainly trust seems to be a key concern amongst researchers, and these presentations at the IASSIST conference reaffirm the reality here: there is a reluctance to share data, and even data management seems to be too much to ask of most researchers. Metadata creation is far removed from the actual data processing and analysis, and from the publication of the results; hence, most researchers would rather spend more time with their datasets than with their descriptions, especially as most have no intention of sharing their datasets publicly and only share them with those they trust. However, rather than taking questions about their datasets with each request, researchers could use the Geodoc metadata editor tool to document their datasets and bundle the corresponding metadata records with them, sharing both with their trusted colleagues.

Perhaps, over time, researchers will be willing to share both their metadata and data with the public, but that time still seems far in the future. For now, support must be made available to those who want to manage their data and share it with those they can trust.

5 June, Friday afternoon

I had planned to attend the G2 session on Planning Research Data Management Services, but instead had the good fortune to speak with Professor Bob Downs from Columbia University. GoGeo harvests metadata from the Socioeconomic Data and Applications Center’s (SEDAC) portal catalogue, which CIESIN hosts at Columbia University, and Professor Downs had asked me about this during question time after my presentation on Thursday.

We discussed both SEDAC and GoGeo, then he mentioned how DataCite was a useful source for locating catalogues to harvest metadata from, with SEDAC’s catalogue included on the website. He also mentioned tracing the use of SEDAC data in publications through citation, which was quite impressive as the data had been cited more than 1,000 times, clearly demonstrating the benefit of making their data open access, and the success of the SEDAC portal.

That was IASSIST 2015 in Minneapolis, Minnesota. The 2016 conference will be held in Bergen, Norway.

Call for Papers “GeoCom 2015: Resilient Futures”: closing date extended to Friday, 12 June 2015

Posted on May 26, 2015 by admin

The Association for Geographic Information is seeking papers for their flagship annual conference. GeoCom is a key event in the UK geo calendar, with representation from across the GeoCommunity (hence the name!) including commercial, public (central and local government) and the 3rd sector.

The theme of this year’s conference is Resilient Futures, and it brings together all the topics examined during our GeoBig5 events (http://www.agi.org.uk/events/geo-the-big-five) through the year:

Smart Energy (past): securing the future energy sources to meet the growing demand, this event showed the significant role of location in connecting demand to supply.

BIM: The Next Level (past): probably one of the most significant events in the geospatial arena of recent years, BIM is more than a 3D model, but an entire process. A process that is entirely dependent on location based data.

Sensors & Mobile (past): examined the impact of an ever increasing capability to capture and ‘sense’ location based data.

Future Cities: Security (Thursday, 9 July): the role that geospatial information has to play in preparing for future shocks and stresses.

Big Data & You (Thursday, 8 October): examining the ethics of big data, privacy and the special role that location plays in the debate.

The AGI is keen to hear from thought leaders in all these areas and wants to encourage our members to submit an abstract based on their work in these sectors. But this is not just for the ‘usual candidates’! There is a strong development aspect to the conference, with dedicated spaces for those who are at an early stage in their career (with support from our Early Career Network team: http://www.agi.org.uk/news/agi/721-successful-first-ecn-webinar).

Priority will be given to papers that explore themes around Resilience and the Big5 topics; however, papers on any aspect of Geographic Information and Research are encouraged, in particular for our Technology stream.

It’s a very simple process: abstracts of up to 350 words (max) should be submitted before the 30th May 2015 (now extended to Friday, 12 June 2015) via our online form:

https://docs.google.com/forms/d/1CvsmoA01tYneKAHhc2YWtXnQcmxjPc0VEV4O72QXXFQ/viewform?c=0&w=1

For further information about this and other AGI events please see our website:

http://www.agi.org.uk/component/civicrm/?task=civicrm/event/info&Itemid=238&reset=1&id=28

http://www.agi.org.uk/events/calendar

If you have any questions, please contact the AGI via email ( info at agi.org.uk )

Mobile Conferences – in fixed venues

Posted on March 13, 2015 by admin

Prompted by seeing two events listed, I’ve just had a quick scour of the web for other mobile-related conferences this year.

Perhaps surprisingly, when you think of it, there’s still a lucrative circuit of charged, seated events round the country and globe – rather than, or perhaps as well as, a network of virtual spaces where ideas are exchanged. We’ll continue to monitor them and attend the odd one or two. For reference, and to save a few precious minutes for anyone else with a similar curiosity, here’s a short summary of today’s quick search:

The Third Annual Future of Education and Technology Conference 2015: Transforming Education through Digital Technology – Friday 13th March 2015, University of Salford, Manchester (sic).

A couple of other more general ones: Digital Media Strategies 2015, just passed this week in London, and The Guardian Changing Media Summit there as well, next week. Another one whose write-up is worth reading is the Mobile World Congress 2015, held in Barcelona a few days ago (conveniently reviewed in the Guardian already). Looking ahead, this weekend sees the 11th International Conference on Mobile Learning 2015 in Madeira, Portugal, then the UK Mobile Government Summit in London on Tuesday, St Patrick’s Day; next month, there’s a university-based one-day’er – the grandly titled Future of Mobile and Technology Enhanced Learning in Higher and Further Education Conference 2015 in Salford again.

Skipping ahead to the early Summer, although not Mobile per se, there’s an interesting looking gathering planned under the heading: Enabling Transformational Change and Innovation in Higher Education via Technology, in London in June. Followed by at least two dedicated, seriously Mobile conferences in August – International Conference on Mobile Computing & Networking in Birmingham and MobileHCI 2015 (the 17th International Conference on Human-Computer Interaction with Mobile Devices and Services) over in Copenhagen.

Then, one for the Fall, a November-scheduled ForumOxford: Mobile Apps and Technologies Conference 2015 already advertised.

And lastly the one that set me off on this, a free webinar next Wednesday lunchtime which I’ve just registered for: “Mobile learning in practice: special educational needs and essential skills: Phase two of the [Jisc] mobile learning guide”. I’m looking forward to that, after a slightly gruelling trek to Birmingham for Jisc’s Digifest at the beginning of this week (which had at least one relevant session on this theme, Mobile Learning in Practice).

With this data deluge being only the tip of the iceberg, there’s no shortage of themes and insights on offer for us to digest and then use to inform our work. Watch this space, along with plenty of others, to see what we make of it.

GoGeo Mobile has been released

Posted on January 23, 2015 by admin

The GoGeo Mobile iPhone App was created by EDINA at the University of Edinburgh to support teaching, learning and research.

Jisc provided support for the GoGeo App project as part of its commitment to encourage the use of new and emerging technology to support research and learning in the UK.

GoGeo Mobile is an app that allows users to keep abreast of news and events in the geospatial sector. GoGeo Mobile is separated into a number of channels including News, Events, Jobs and Resources for Teachers. Each channel contains useful and relevant resources for anyone working with Geographic Information Systems (GIS), Remote Sensing or spatial data.

In addition, GoGeo Mobile allows users to perform targeted searches for spatial data. Searches can be defined by keyword and/or location, and return a brief description of the data. Users can then forward themselves a direct URL to the metadata record so they can download the data when they are back at their desk.

Compatibility: Requires iOS 7.0 or later. Compatible with iPhone, iPad, and iPod touch. This app is optimised for iPhone 5, iPhone 6, and iPhone 6 Plus.

You can download the GoGeo Mobile App from the UK iTunes App Store.

Please send feedback to edina@ed.ac.uk with GoGeo App in the subject field.

Ordnance Survey to become a GovCo at the end of the financial year

Posted on January 22, 2015 by admin

Matthew Hancock MP has just posted this statement regarding the status of the OS.

I am today announcing the Government’s intention to change Ordnance Survey from a Trading Fund to a Government Company at the end of the financial year.
The change is operational in nature, and is aimed at improving Ordnance Survey’s day-to-day efficiency and performance. It will provide the organisation with a more appropriate platform from which to operate, and one which provides greater individual and collective responsibility for performance.
Ordnance Survey will remain under 100% public ownership with the data remaining Crown property, with ultimate accountability for the organisation staying with the Department for Business, Innovation and Skills.

Further to this change, in the coming weeks I will also be setting out more details on how Ordnance Survey will be building on its existing extensive support for the Government’s Open Data policy and on some senior appointments which will further strengthen the management team.

Ordnance Survey exists in a fast moving and developing global market. There has been rapid technology change in the capture and provision of mapping data, and increasingly sophisticated demands from customers who require data and associated services – including from government. To operate effectively, Ordnance Survey needs to function in an increasingly agile and flexible manner to continue to provide the high level of data provision and services to all customers in the UK and abroad, in a cost effective way, open and free where possible. Company status will provide that.

Mapping data and services are critical in underpinning many business and public sector functions as well as being increasingly used by individuals in new technology. Ordnance Survey sits at the heart of the UK’s geospatial sector. Under the new model, the quality, integrity and open availability of data will be fully maintained, and in future, improved. Existing customers, partners and suppliers will benefit from working with an improved organisation more aligned to their commercial, technological and business needs.
The relationship with Government will be articulated through the Shareholder Framework Agreement alongside the Company Articles of Association. The change will be subject to final Ministerial approval of these governance matters.

Ordnance Survey will also continue to publish a statement of its public task, to subscribe to the Information Fair Trader Scheme and comply with the relevant Public Sector Information Regulations, including Freedom of Information legislation, and make as much data as possible openly available to a wide audience of users.

The statement can be found here.

How can Public Data Group data be made more accessible and useful?

Posted on January 7, 2015 by admin

This survey invitation just came across twitterland, so dropping it into GoGeo blogland. This is certainly important to monitor as it refers to the Ordnance Survey, and these other public sector bodies as well.

The Public Data Group (PDG) brings together four public sector bodies – Companies House, Land Registry, Met Office and Ordnance Survey – that collect, refine, manage and distribute data on the nation’s companies, property, weather and geography.

The Public Data Group’s data is made available through a variety of channels and licenses and includes both commercial agreements and the provision of Open Data.

The value of the data that is charged for is vast – with Ordnance Survey data widely used in the insurance sector, and the billions of pounds saved by the use of Met Office data in the aviation industry as just two examples. Equally, the value of the Open Data released by the Public Data Group is very significant and growing. The most recent estimate placed the value of Open Data released by PDG at over £900m annually.

Data is of increasing importance to the economy, driving innovation and opening a range of new possibilities for businesses.

Although the PDG organisations have a commitment to make as much data freely available as possible, they have to balance this commitment with other requirements such as maintaining the quality of the data, covering the costs of the collection and distribution of the data, and avoiding cross subsidising one data set from another. Companies House has recently committed to making its digital information available free of charge in 2015 but for Land Registry, Met Office and Ordnance Survey some data is charged for.

How you can help

We are keen that the charges for public data do not act as a barrier for those just starting their business or developing their product. There are already some ‘Developer Licenses’ available that allow usage of charged-for PDG data for free under certain criteria and the intention is to enhance and expand these further. We are also keen to understand if PDG data is widely known of and if users find it convenient to utilise so that future development work and publicity can be better targeted.

The purpose of this survey is therefore to seek your views on:

Your awareness of the PDG data that is available;
Any issues you face using PDG data; and
How ‘Developer Licenses’ should be designed to best meet user needs.

More can be found here, including access to the seven page online survey.
