StuartLewis

After 6 years of being Repository Fringe’s resident live blogger, this was the first year that I haven’t been part of the organisation or amplification in any official capacity. From what I’ve seen, though, my colleagues from EDINA, University of Edinburgh Library, and the DCC did an awesome job of putting together a really interesting programme for the 2016 edition of RepoFringe, attracting a big and diverse audience.

Whilst I was mainly participating through reading the tweets to #rfringe16, I couldn’t quite keep away!

Pauline Ward at Repository Fringe 2016

This year’s chair, Pauline Ward, asked me to be part of the Unleashing Data session on Tuesday 2nd August. The session was a “World Cafe” format and I was asked to help facilitate discussion around the question: “How can the repository community use crowd-sourcing (e.g. Citizen Science) to engage the public in reuse of data?” – so I was along wearing my COBWEB: Citizen Observatory Web and social media hats. My session also benefited from what I gather was an excellent talk on “The Social Life of Data” earlier in the event from Erinma Ochu (who, although I missed her this time, is always involved in really interesting projects including several fab citizen science initiatives).

I won’t attempt to reflect on all of the discussions during the Unleashing Data session here – I know that Pauline will be reporting back from the session to Repository Fringe 2016 participants shortly – but I thought I would share a few pictures of our notes, capturing some of the ideas and discussions that came out of the various groups visiting this question throughout the session. Click the image to view a larger version. Questions or clarifications are welcome – just leave me a comment here on the blog.

Notes from the Unleashing Data session at Repository Fringe 2016

If you are interested in finding out more about crowd-sourcing and citizen science in general then there are a couple of resources that may be helpful (plus many more resources and articles if you leave a comment/drop me an email with your particular interests).

This June I chaired the “Crowd-Sourcing Data and Citizen Science” breakout session for the Flooding and Coastal Erosion Risk Management Network (FCERM.NET) Annual Assembly in Newcastle. The short slide set created for that workshop gives a brief overview of some of the challenges and considerations in setting up and running citizen science projects:

Last October the CSCS Network interviewed me on developing and running Citizen Science projects for their website – the interview brings together some general thoughts as well as specific comment on the COBWEB experience:

After the Unleashing Data session I was also able to stick around for Stuart Lewis’ closing keynote. Stuart has been working at Edinburgh University since 2012 but is moving on soon to the National Library of Scotland, so this was a lovely chance to get some of his reflections and predictions as he prepares to make that move. And to include quite a lot of fun references to The Secret Diary of Adrian Mole aged 13 ¾. (Before his talk Stuart had also snuck some boxes of sweets under some of the tables around the room – a popularity tactic I’m noting for future talks!)

So, my liveblog notes from Stuart’s talk (slightly tidied up but corrections are, of course, welcomed) follow. Because old Repofringe live blogging habits are hard to kick!

The Secret Diary of a Repository aged 13 ¾ – Stuart Lewis

I’m going to talk about our bread and butter – the institutional repository… Now my inspiration is Adrian Mole… Why? Well we have a bunch of teenage repositories… EPrints is 15 ½; Fedora is 13 ½; DSpace is 13 ¾.

Now Adrian Mole is a teenager – you can read about him on Wikipedia [note to fellow Wikipedia contributors: this, and most of the other Adrian Mole-related pages could use some major work!]. You see him quoted in two conferences to my amazement! And there are also some Scotland and Edinburgh entries in there too… Brought a haggis… Goes to Glasgow at 11am… and says he encounters 27 drunks in one hour…

Stuart Lewis illustrates the teenage birth dates of three of the major repository softwares as captured in (perhaps less well-aged) pop hits of the day.

So, I have four points to make about how repositories are like/unlike teenagers…

The thing about teenagers… People complain about them… They can be expensive, they can be awkward, they aren’t always self-aware… Eventually though they usually become useful members of society. So, is that true of repositories? Well ERA, one of our repositories, has gotten bigger and bigger – over 18k items… and over 10k paper theses currently being digitized…

Now teenagers also start to look around… Pandora!

I’m going to call Pandora the CRIS… And we’ve all kind of overlooked their commercial background because we are in love with them…!

Stuart Lewis captures the eternal optimism – both around Mole’s love of Pandora, and our love of the (commercial) CRIS.

Now, we have PURE at Edinburgh, which also powers Edinburgh Research Explorer. When you looked at repositories a few years ago, it was a bit like Freshers’ Week… The three questions were: where are you from; what repository platform do you use; how many items do you have? But that’s moved on. We now have around 80% of our outputs in the repository within the REF compliance window (3 months of acceptance)… And that’s a huge change – volumes of material are open access very promptly.

So,

1. We need to celebrate our success

But are our successes as positive as they could be?

Repositories continue to develop. We’ve heard good things about new developments. But how do repositories demonstrate value – and how do we compare to other areas of librarianship?

Other library domains use different numbers. We can use these to give comparative figures. How do we compare to publishers for cost? What’s our CPU (Cost Per Use)? And what is a good CPU? £10, £5, £0.46… But how easy is it to calculate – are repositories expensive? That’s a “to do” – to take the cost to run/IRUS cost. I would expect it to be lower than publishers, but I’d like to do that calculation.
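[A quick aside from me rather than from Stuart's slides: the calculation he describes might look something like the sketch below. I am assuming that "cost to run/IRUS cost" means annual running cost divided by recorded downloads (for instance as reported by IRUS-UK). The function name and all of the figures are made up purely for illustration.]

```python
def cost_per_use(annual_cost_gbp: float, annual_downloads: int) -> float:
    """Return cost per use in pounds, rounded to the nearest penny.

    Assumption: "use" means one recorded download in a year.
    """
    if annual_downloads <= 0:
        raise ValueError("annual_downloads must be positive")
    return round(annual_cost_gbp / annual_downloads, 2)


# Hypothetical figures, not real costs or usage for any repository.
repository_cpu = cost_per_use(annual_cost_gbp=46_000, annual_downloads=100_000)
print(f"Repository CPU: £{repository_cpu:.2f}")
```

The same function could be run against a publisher subscription price and its usage figures to get the comparison Stuart is after, provided both sides count "use" in the same way.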

The other side of this is to become more self-aware… Can we gather new numbers? We only tend to look at deposit and use from our own repositories… What about our own local consumption of OA (the reverse)?

Working within new e-resource infrastructure – http://doai.io/ – lets us see where open versions are available. And we can integrate with OpenURL resolvers to see how much of our usage can be fulfilled.
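[Another aside from me: a rough sketch of how that "how much of our usage can be fulfilled" number might be worked out. I am assuming that doai.io is addressed by appending a DOI to its base URL, and that you already have a list of requested DOIs (say, from resolver logs) plus a list of those found to have an open version. Both function names and the example DOIs are hypothetical, and nothing here contacts the service.]

```python
from urllib.parse import quote

DOAI_BASE = "http://doai.io/"  # assumption: a DOI is appended to this base URL

DOI_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def doai_url(doi: str) -> str:
    """Build a doai.io lookup URL from a bare DOI or a DOI link."""
    doi = doi.strip()
    for prefix in DOI_PREFIXES:
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    # Keep the slash between DOI prefix and suffix; escape anything unusual.
    return DOAI_BASE + quote(doi, safe="/")


def fulfilment_rate(requested_dois, open_dois) -> float:
    """Fraction of requested DOIs that have a known open version."""
    requested = list(requested_dois)
    if not requested:
        return 0.0
    open_set = {d.lower() for d in open_dois}  # DOIs are case-insensitive
    hits = sum(1 for d in requested if d.lower() in open_set)
    return hits / len(requested)


# Made-up DOIs for illustration only.
requested = ["10.1000/a", "10.1000/b", "10.1000/c", "10.1000/d"]
found_open = ["10.1000/A", "10.1000/c"]
print(doai_url("https://doi.org/10.1000/example.123"))
print(f"{fulfilment_rate(requested, found_open):.0%} of requests could be met by open versions")
```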

2. Our repositories must continue to grow up

Do we have double standards?

Hopefully you are all aware of the UK Text and Data Mining Copyright Exception that came into force on 1st June 2014. We have massive, massive access to electronic resources as universities, and can text and data mine those.

Some do a good job here – Gale Cengage Historic British Newspapers: additional payment to buy all the data (images + XML text) on hard drives for local use. Working with local informatics LTG staff to (geo)parse the data.

Some are not so good – basic APIs allow only simple searches… But not complex queries (e.g. could use a search term, but not e.g. sentiment).

And many publishers do nothing at all…

So we are working with publishers to encourage and highlight the potential.

But what about our content? Our repositories are open, with extracted full-text, data can be harvested… Sufficient, but is it ideal? Why not do bulk download from one click… You can – for example – download all of Wikipedia (if you want to). We should be able to do that with our repositories.

3. We need to get our house in order for Text and Data Mining

When will we be finished though? Depends on what we do with open access. What should we be doing with OA? Where do we want to get to? Right now we have mandates so it’s easy – green and gold. With gold there is Pure or Hybrid… Mixed views on Hybrid. Can also publish locally for free. Then for green there are local or disciplinary repositories… For Gold – Pure, Hybrid, Local – we pay APCs (some local options are free)… In Hybrid we can do offsetting, discounted subscriptions, voucher schemes too. And for green we have the UK Scholarly Communications License (Harvard)…

But which of these forms of OA are best?! Is choice always a great thing?

We still have outstanding OA issues. Is a mixed-modal approach OK, or should we choose a single route? Which one? What role will repositories play? What is the ultimate aim of Open Access? Is it “just” access?

How and where do we have these conversations? We need academics, repository managers, librarians, publishers to all come together to do this.

4. Do we know what a grown-up repository looks like? What part does it play?

Please remember to celebrate your repositories – we are in a fantastic place, making a real difference. But they need to continue to grow up. There is work to do with text and data mining… And we have more to do… To be a grown up, to be in the right sort of environment, etc.

Q&A

Q1) I can remember giving my first talk on repositories in 2010… When it comes to OA I think we need to think about what is cost effective, what is sustainable, why are we doing it and what’s the cost?

A1) I think in some ways that’s about what repositories are versus publishers… Right now we are essentially replicating them… And maybe that isn’t the way to approach this.

And with that Repository Fringe 2016 drew to a close. I am sure others will have already blogged their experiences and comments on the event. Do have a look at the Repository Fringe website and at #rfringe16 for more comments, shared blog posts, and resources from the sessions.


EDINA Blogs

A Blogs.edina.ac.uk weblog
