A Mini Adventure to Repository Fringe 2016

After 6 years of being Repository Fringe’s resident live blogger this was the first year that I haven’t been part of the organisation or amplification in any official capacity. From what I’ve seen, though, my colleagues from EDINA, University of Edinburgh Library, and the DCC did an awesome job of putting together a really interesting programme for the 2016 edition of RepoFringe, attracting a big and diverse audience.

Whilst I was mainly participating through reading the tweets to #rfringe16, I couldn’t quite keep away!

Pauline Ward at Repository Fringe 2016

This year’s chair, Pauline Ward, asked me to be part of the Unleashing Data session on Tuesday 2nd August. The session was a “World Cafe” format and I was asked to help facilitate discussion around the question: “How can the repository community use crowd-sourcing (e.g. Citizen Science) to engage the public in reuse of data?” – so I was along wearing my COBWEB: Citizen Observatory Web and social media hats. My session also benefited from what I gather was an excellent talk on “The Social Life of Data” earlier in the event from Erinma Ochu (who, although I missed her this time, is always involved in really interesting projects including several fab citizen science initiatives).

 

I won’t attempt to reflect on all of the discussions during the Unleashing Data Session here – I know that Pauline will be reporting back from the session to Repository Fringe 2016 participants shortly – but I thought I would share a few pictures of our notes, capturing some of the ideas and discussions that came out of the various groups visiting this question throughout the session. Click the image to view a larger version. Questions or clarifications are welcome – just leave me a comment here on the blog.

Notes from the Unleashing Data session at Repository Fringe 2016

Notes from the Unleashing Data session at Repository Fringe 2016

Notes from the Unleashing Data session at Repository Fringe 2016

 

If you are interested in finding out more about crowd sourcing and citizen science in general then there are a couple of resources that may be helpful (plus many more resources and articles if you leave a comment/drop me an email with your particular interests).

This June I chaired the “Crowd-Sourcing Data and Citizen Science” breakout session for the Flooding and Coastal Erosion Risk Management Network (FCERM.NET) Annual Assembly in Newcastle. The short slide set created for that workshop gives a brief overview of some of the challenges and considerations in setting up and running citizen science projects:

Last October the CSCS Network interviewed me on developing and running Citizen Science projects for their website – the interview brings together some general thoughts as well as specific comment on the COBWEB experience:

After the Unleashing Data session I was also able to stick around for Stuart Lewis’ closing keynote. Stuart has been working at Edinburgh University since 2012 but is moving on soon to the National Library of Scotland so this was a lovely chance to get some of his reflections and predictions as he prepares to make that move. And to include quite a lot of fun references to The Secret Diary of Adrian Mole aged 13 ¾. (Before his talk Stuart had also snuck some boxes of sweets under some of the tables around the room – a popularity tactic I’m noting for future talks!)

So, my liveblog notes from Stuart’s talk (slightly tidied up but corrections are, of course, welcomed) follow. Because old Repofringe live blogging habits are hard to kick!

The Secret Diary of a Repository aged 13 ¾ – Stuart Lewis

I’m going to talk about our bread and butter – the institutional repository… Now my inspiration is Adrian Mole… Why? Well we have a bunch of teenage repositories… EPrints is 15 ½; Fedora is 13 ½; DSpace is 13 ¾.

Now Adrian Mole is a teenager – you can read about him on Wikipedia [note to fellow Wikipedia contributors: this, and most of the other Adrian Mole-related pages could use some major work!]. You see him quoted in two conferences to my amazement! And there are also some Scotland and Edinburgh entries in there too… Brought a haggis… Goes to Glasgow at 11am… and says he encounters 27 drunks in one hour…

Stuart Lewis at Repository Fringe 2016

Stuart Lewis illustrates the teenage birth dates of three of the major repository softwares as captured in (perhaps less well-aged) pop hits of the day.

So, I have four points to make about how repositories are like/unlike teenagers…

The thing about teenagers… People complain about them… They can be expensive, they can be awkward, they aren’t always self-aware… Eventually though they usually become useful members of society. So, is that true of repositories? Well ERA, one of our repositories, has gotten bigger and bigger – over 18k items… and over 10k paper theses currently being digitized…

Now teenagers also start to look around… Pandora!

I’m going to call Pandora the CRIS… And we’ve all kind of overlooked their commercial background because we are in love with them…!

Stuart Lewis captures the eternal optimism – both around Mole’s love of Pandora, and our love of the (commercial) CRIS.

Now, we have PURE at Edinburgh which also powers Edinburgh Research Explorer. When you looked at repositories a few years ago, it was a bit like Freshers Week… The three questions were: where are you from; what repository platform do you use; how many items do you have? But that’s moved on. We now have around 80% of our outputs in the repository within the REF compliance window (within 3 months of acceptance)… And that’s a huge change – volumes of materials are open access very promptly.

So,

1. We need to celebrate our success

But are our successes as positive as they could be?

Repositories continue to develop. We’ve heard good things about new developments. But how do repositories demonstrate value – and how do we compare to other areas of librarianship?

Other library domains use different numbers. We can use these to give comparative figures. How do we compare to publishers for cost? What’s our CPU (Cost Per Use)? And what is a good CPU? £10, £5, £0.46… But how easy is it to calculate – are repositories expensive? That’s a “to do” – to take the cost to run/IRUS cost. I would expect it to be lower than publishers, but I’d like to do that calculation.
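
[As a rough illustration of the Cost Per Use idea: the calculation is just the yearly running cost divided by the yearly usage. Here is a minimal Python sketch of that. The function name and both figures are my own invented placeholders, not numbers from Stuart’s talk or from any real repository.]

```python
def cost_per_use(annual_cost_gbp: float, annual_uses: int) -> float:
    """Return cost per use in pounds: running cost divided by number of uses."""
    if annual_uses <= 0:
        raise ValueError("annual_uses must be a positive number")
    return annual_cost_gbp / annual_uses


# Hypothetical figures, for illustration only.
example_cost = 50_000.0       # assumed yearly cost to run the repository (GBP)
example_downloads = 250_000   # assumed yearly downloads from usage statistics

print(f"Cost per use: £{cost_per_use(example_cost, example_downloads):.2f}")
```

[The same function could be run with a publisher subscription cost and its usage count to get a like-for-like comparison, which is the comparison Stuart says he would like to do.]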

The other side of this is to become more self-aware… Can we gather new numbers? We only tend to look at deposit and use from our own repositories… What about our own local consumption of OA (the reverse)?

Working within new e-resource infrastructure – http://doai.io/ – lets us see where open versions are available. And we can integrate with OpenURL resolvers to see how much of our usage can be fulfilled.

2. Our repositories must continue to grow up

Do we have double standards?

Hopefully you are all aware of the UK Text and Data Mining Copyright Exception that came into force on 1st June 2014. We have massive massive access to electronic resources as universities, and can text and data mine those.

Some do a good job here – Gale Cengage Historic British Newspapers: additional payment to buy all the data (images + XML text) on hard drives for local use. Working with local informatics LTG staff to (geo)parse the data.

Some are not so good – basic APIs allow only simple searches… But not complex queries (e.g. could use a search term, but not e.g. sentiment).

And many publishers do nothing at all….

So we are working with publishers to encourage and highlight the potential.

But what about our content? Our repositories are open, with extracted full-text, data can be harvested… Sufficient but is it ideal? Why not do bulk download from one click… You can – for example – download all of Wikipedia (if you want to).  We should be able to do that with our repositories.

3. We need to get our house in order for Text and Data Mining

When will we be finished though? Depends on what we do with open access? What should we be doing with OA? Where do we want to get to? Right now we have mandates so it’s easy – green and gold. With gold there is Pure or Hybrid… Mixed views on Hybrid. Can also publish locally for free. Then for green there is local or disciplinary repositories… For Gold – Pure, Hybrid, Local we pay APCs (some local option is free)… In Hybrid we can do offsetting, discounted subscriptions, voucher schemes too. And for green we have UK Scholarly Communications License (Harvard)…

But which of these forms of OA are best?! Is choice always a great thing?

We still have outstanding OA issues. Is a mixed-modal approach OK, or should we choose a single route? Which one? What role will repositories play? What is the ultimate aim of Open Access? Is it “just” access?

How and where do we have these conversations? We need academics, repository managers, librarians, publishers to all come together to do this.

4. Do we know what a grown-up repository looks like? What part does it play?

Please remember to celebrate your repositories – we are in a fantastic place, making a real difference. But they need to continue to grow up. There is work to do with text and data mining… And we have more to do… To be a grown up, to be in the right sort of environment, etc.

 

Q&A

Q1) I can remember giving my first talk on repositories in 2010… When it comes to OA I think we need to think about what is cost effective, what is sustainable, why are we doing it and what’s the cost?

A1) I think in some ways that’s about what repositories are versus publishers… Right now we are essentially replicating them… And maybe that isn’t the way to approach this.

And with that Repository Fringe 2016 drew to a close. I am sure others will have already blogged their experiences and comments on the event. Do have a look at the Repository Fringe website and at #rfringe16 for more comments, shared blog posts, and resources from the sessions. 

CSCS Network Event: Citizen Science and the Mass Media (Belated) Liveblog

This is a very belated LiveBlog post from the CSCS Network Citizen Science and the Mass Media event, which I chaired back on 22nd October 2015. Since the event took place several videos recorded at the event have been published by the lovely CSCS Network folks and I’ve embedded those throughout this post.

About the Event

This session looked at how media and communications can be used to promote and engage communities in a crowd sourcing and citizen science project. This covered aspects including understanding the purpose and audience for a project; gaining exposure from a project; communicating these types of projects effectively; engaging the press; expectation management; practical issues such as timing, use of interviewees and quotes, etc.

I was chairing this session, drawing on my experience working on the COBWEB project in particular, and I was delighted that we were able to bring in two guest speakers whose work I’ve been following for a while:

Dave Kilbey, University of Bristol and Founder and CEO of Natural Apptitude Ltd. Natural Apptitude works with academic and partner organisations to create mobile phone apps and websites for citizen science projects that have included NatureLocator, Leafwatch, Batmobile, and BeeMapp. Some of these projects have received substantial press interest, in particular Leafwatch (along with the wider Conker Tree Science initiative), and Dave will talk about his personal experience of the way that crowd sourcing and citizen science and the media work together, some of the benefits and risks of exposure, and some of the challenges associated with working with the press based on his own experience. @kilbey252

Alastair (Ally) Tibbitt, Senior Online Journalist at STV, where he has been based since 2011 working both in journalism and community engagement. Ally’s background lies in community projects in Glasgow and Edinburgh, experience that informs his work writing both for STV and Greener Leith. He has particular interests in hyperlocal news, open data and environmental issues, giving him a really interesting insider’s perspective on the way that citizen science and crowd sourcing can engage the press, some of the realities of media expectations, timings, etc. and an insight into effective ways to pitch a citizen engagement story. @allytibbett

My notes from the talks were captured on the day but, due to chairing, I wasn’t able to capture all of the discussion or questions that arose in the session. The video below captures the talks, with my notes from these below. 

Click here to view the embedded video.

Musings on Media and Communications for Citizen Science Projects – Dave Kilbey, Natural Apptitude

I’m not an expert but I have been working in this area for some time so these are some musings informed by my work to date.

I’ve worked on a variety of projects, which started with a project called NatureLocator – all basically mobile apps, but also websites. We try to make it as simple as possible for people to take part in these projects, and we try to do that working with experts so that the data we collect is useful and purposeful. So our projects include work on invasive species, work with the biological monitoring centre. So effectively we work with researchers and organisations, engaging the public in what we do. And we do that with design of bespoke smartphone apps and websites. In theory this is innovative, but actually much of it is established – although BatMobile is an exception, as it was never really good enough to launch. And public engagement is central to what we do, and from that naturally comes much of our engagement with media.

We spend a lot of time and money on design and usability, because if the apps aren’t easy to use and appealing then participants won’t use them or use them again. The apps are for contribution, the website is for looking at the data – that’s more of an unprovoked engagement…

So the content on media and communications is this bit, which I’m calling “Smurfs… and the wrong kind of conkers”.

So I thought about why we want media coverage in the first place? It’s obvious but it matters… And these are selfish through to altruistic…

We want this to get the project (and us) noticed – we want to share what we do, and to get the project out there (important for a business too). You want to engage an army of volunteers – you can’t have citizen science without citizen scientists, you need people engaged. You want to attract more funding – crucial in a university context. Success metrics – which include impact – we are measured on how many people took part, engaged etc. and as researchers we are also measured on media presence to an extent. But there is also the aspect of personal satisfaction, and that matters.

On a more altruistic basis there is increasing knowledge of a concept or problem – we’ve really had that feedback on our invasive plant species work. Citizen science is increasingly about finding solutions to problems – there are all sorts of things like examination of proteins being gamified, so participants contribute regardless of knowledge. We also want to inspire interest, perhaps even the next generation of researchers – we are all passionate about what we do, and want to share that…

But the crux of the matter is that media isn’t always as important in the ways you’d expect.

If your project isn’t ready, the media coverage will be a real pain. There is a project called Ash Town done more or less as a media stunt… The organisation using the data wasn’t ready, the data wasn’t ready… and they had a backlog of verification and that disillusioned participants… The feedback loop wasn’t there but they had to take advantage of that moment. So I tend to be quite conservative about when I share projects, I want them ready.

Quite a few of our projects have had mass media interest and that can be brilliant but it causes a big spike and is largely unfocused… Normally you want a focused set of interested participants. It can be helpful but long term it’s less clear how it is helpful for finding those participants. By contrast micro media and focused marketing and events, such as conferences, lead to better engagement – and the data from targeted audiences tends to be much better. For example there was a big issue of giant hogweed in the media this summer – we had more records than ever before… but 80% of that data was incorrect. Normally the data in Plant Tracker is 90% accurate. That was due to lots of people finding out about giant hogweed and recording lots of false positives. Not necessarily a problem, but an issue for data centric projects.
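
[To make that accuracy point concrete, here is a small Python sketch of the arithmetic. The accuracy rates (90% normally, 20% during the spike, since 80% was incorrect) come from Dave’s talk. The record counts and the function name are hypothetical, purely for illustration.]

```python
def split_records(total_records: int, accuracy: float) -> tuple[int, int]:
    """Return (valid, invalid) record counts for a given accuracy rate."""
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be between 0 and 1")
    valid = round(total_records * accuracy)
    return valid, total_records - valid


# Hypothetical record counts; accuracy rates as quoted in the talk.
normal_valid, normal_invalid = split_records(1_000, 0.90)
spike_valid, spike_invalid = split_records(3_000, 0.20)

print(f"Normal period: {normal_valid} valid, {normal_invalid} to reject")
print(f"Media spike:   {spike_valid} valid, {spike_invalid} to reject")
```

[With these made-up counts, three times as many submissions produce fewer usable records and a far bigger pile to verify, which is why a spike can be an issue for data centric projects.]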

So we find drip feeding/organic networking works best for us. But as they say “Any publicity is good publicity?”… Maybe…. Mostly we’ve had good coverage.

To use a fishing analogy I see the mass media as ground baiting – causing a general feeding frenzy, but then you have to think about how you are baiting your hook to make use of this… So it’s all about how you follow up…

So, with our first app, Leafwatch, we had loads of media coverage. This project was small scale before with maybe 500 records a year, without the photos or georeference. So we set up a smartphone app capturing that sort of data for verification… And we had 5000 records… But also a lot of noise… 3 bottom pictures, and worse… even a smurf!

So, how to attract publicity… Again, I’m no expert… Often it’s about finding an interesting story to tell that has relevance at this point in time – is there a hook to draw people in, trigger their imagination. For the Uni of Bristol it was often our Public Relations Office that got us the gig. Me, on my own using my Twitter feed, isn’t going to get the Times interested… So utilise your existing resources in your organisation, they have some great powerful contacts etc. to call on. And I have a colleague who does a good job of researching likely journalists and contacting them directly…

Really much of this feels random, but it’s about a lot of events coming together, and stuff in the outside world… Looking for those opportunities to tell your story to an audience that’s ready to listen… (And do get in touch).

Engaging the Media – Ally Tibbitt, STV

I work at STV, and have a background in community projects and volunteering activities. I am also setting up a fledgling news site.

So I wanted to set the context of engaging with media… And I wanted to set the scene. Many newspapers are losing 10% circulation, broadcast TV is doing better, but there is still an online transition. But most media company websites are booming – our STV pages collectively reach a few million people a day. So there is still a lot of reason to get word out there. And it’s worth planning that as you do your citizen science project. You need to think about where you will find the people you do want to engage with. More and more people get their news via social media. Many read news via mobile device. It’s getting more visual with videos, images, infographics. Big interactive graphics are great, but hard to scale to a phone so many media companies keep it simple.

So I’ve tried to set this up as a timeline… How you might engage the media… Before your project. When recruiting participants – who do you want to reach, is it a specific geography? Age group? Demographic? That should influence both the social media platforms and media companies you use. What is the benefit for participants? What is the long term goal? Is there an interesting back story – and what change will it bring about? And plan out a communication calendar – can you hook into, e.g. International Authors day. Editors are always looking for a new angle on events, or a local angle on a national news story. And even if that doesn’t fit your timing it can be helpful. The other thing to think about is what digital assets can you share/produce. A press release is nice, but a press release with bells and whistles, with infographics or images etc. – that is brilliant – helps journalists know why they should engage now. It’s about the infotainment, not just the data. And it could be as simple as a slideshow, or animated gifs, or data we could map. Thinking about citizen science projects I’ve already worked on, I thought of a project on happiness in different neighbourhoods – we persuaded them to share some data. If you do want help producing maps etc, then there are skilled journalists who can help. We’ll need a Shapefile. And we need that data to be open to support more open interactive stuff…

So, assuming you had a nice launch and a little publicity boost… How do you engage during the project? Well citizen engagement can be more than just research – can they promote the project for you on social media? You need a #hashtag to generate social media buzz and help you collate conversation. Can you give progress reports to journalists who covered the launch and those you hope will cover final results? And building that buzz from the outset can mean there is a story, and help show the impact of your project. Also, think about things that cannot be shared – could be copyright or child protection etc. issues. And as you aggregate content around the hashtag and curate the best, remove anything with an issue. Tools like Storify let you do this.

From my point of view one of the best ways to engage the press is when there is a result, a discovery… The media thrives on a wee bit of controversy etc. So Niamh Shortt from CRESH at Edinburgh looks at mapping alcohol etc. and social issues – she is a campaigning academic, taking her studies to policy makers, and that, for instance, is always of interest. So an air quality or air pollution crowd sourcing project would certainly have some of those qualities, those cases to engage policy makers. Too often we get press releases about “we did a study… we might be able to do something in the future…” but we need a concrete story really…

A note on press releases… They are fundamentally quite useful. Do send them out. Keep them short. Include multiple short quotes. Have a clear top line, be clear about what you’ve done. They should come with a variety of visuals in different formats – landscape, portrait, infographics, animated films etc. And supplying images in multiple formats – making our job to package it easier – makes a big difference. Is the story important enough for us to send someone out to take new images? Maybe not. But sending 6MB of materials is not good – so send a press release linking to resources.

So, journalists. Do send releases etc. to generic news email addresses. Use tools like Twitter and LinkedIn to find journalists with an interest in your subject, and message them directly. Provide advance warning, reminders, photo and filming opportunities. Don’t do it at the weekend – no TV will come. Do it at a lunchtime on a weekday… Practical stuff. If no one shows up, don’t worry about it, do send them pictures etc. And if there is one place that you really really want to be featured in, offer it as an exclusive and see if it works. Obviously I’d like that to be me… But that’s something useful to hold back in that way…

And, lastly, humour works. If you can find something daft, and can present it in a funny way… Our story “What if Back to the Future was set in Glasgow” is the second most read story on our website having gone up yesterday. The most read story in the last year on STV was about a very tall man who, using the bathroom, had a hand dryer calamity – that did great and almost made the front page of Reddit. We can be too serious… Be fun. Share the 15 things that happened in this project that were most funny, say… Humour works.

And with that we turned to some really interesting questions and discussion – huge thanks to all who came along and took part in this.

Whilst he was in Edinburgh for this event Dave Kilbey was also able to give an interview for the CSCS Network website, which you can watch there, or in the embed below:

Click here to view the embedded video.

Huge thanks to Dave and Ally for making the time to come along and speak to the CSCS network who I know really appreciated their presentations and sharing of experience. Huge thanks too to the lovely CSCS network team for providing a space for this event and support for our speakers and their travel. 

Upcoming Events: Citizen Science & Media; PTAS Managing Your Digital Footprints Seminar

I am involved in organising, and very much looking forward to, two events this week which I think will be of interest to Edinburgh-based readers of this blog. Both are taking place on Thursday and I’ll try to either liveblog or summarise them here.

If you are based at Edinburgh University do consider booking these events or sharing the details with your colleagues or contacts at the University. If you are based further afield you might still be interested in taking a look at these and following up some of the links etc.

Firstly we have the fourth seminar of the new(ish) University of Edinburgh Crowd Sourcing and Citizen Science network:

Citizen Science and the Mass Media

Thursday, 22nd October 2015, 12 – 1.30 pm, Paterson’s Land 1.21, Old Moray House, Holyrood Road, Edinburgh.

“This session will be an opportunity to look at how media and communications can be used to promote a CSCS project and to engage and develop the community around a project.

The kinds of issues that we hope will be covered will include aspects such as understanding the purpose and audience for your project; gaining exposure from a project; communicating these types of projects effectively; engaging the press; expectation management;  practical issues such as timing, use of interviewees and quotes, etc.

We will have two guest presenters, Dave Kilbey from Natural Apptitude Ltd, and Ally Tibbitt from STV, followed by plenty of time for questions and discussion. The session will be chaired by Nicola Osborne (EDINA), drawing on her experience working on the COBWEB project.”

I am really excited about this session as both Dave and Ally have really interesting backgrounds: Dave runs his own app company and has worked on a range of high profile projects so has some great insights into what makes a project appealing to the media, what makes the difference to that project’s success, etc; Ally works at STV and has a background in journalism but also in community engagement, particularly around social and environmental projects. I think the combination will make for an excellent lunchtime session. UoE staff and students can register for the event via Eventbrite, here.

On the same day we have our Principal’s Teaching Award Scheme seminar for the Managing Your Digital Footprints project:

Social media, students and digital footprints (PTAS research findings)

Thursday, 22nd October 2015, 2 – 3.30pm, IAD Resources Room, 7 Bristo Square, George Square, Edinburgh.

“This short information and interactive session will present findings from the PTAS Digital Footprint research http://edin.ac/1d1qY4K

In order to understand how students are curating their digital presence, key findings from two student surveys (1457 responses) as well as data from 16 in-depth interviews with six students will be presented. This unique dataset provides an opportunity for us to critically reflect on the changing internet landscape and take stock of how students are currently using social media; how they are presenting themselves online; and what challenges they face, such as cyberbullying, viewing inappropriate content or whether they have the digital skills to successfully navigate in online spaces.

The session will also introduce the next phase of the Digital Footprint research: social media in a learning & teaching context.  There will be an opportunity to discuss e-professionalism and social media guidelines for inclusion in handbooks/VLEs, as well as other areas.”

I am also really excited about this event, at which Louise Connelly, Sian Bayne, and I will be talking about the early findings from our Managing Your Digital Footprints project, and some of the outputs from the research and campaign (find these at: www.ed.ac.uk/iad/digitalfootprint).

Although this event is open to University staff and students only (register via the Online Bookings system, here), we are disseminating this work at a variety of events, publications etc. Our recent ECSM 2015 paper is the best overview of the work to date but expect to see more here in the near future about how we are taking forward this work. Do also get in touch with Louise or me if you have any questions about the project or would be interested in hearing more about the project, some of the associated training, or the research findings as they emerge.

CSCS Network – Seminar 1 Science and the citizen worker: the Zooniverse – LiveBlog

This morning I am at the first seminar arranged by the University of Edinburgh Citizen Science and Crowdsourced Data and Evidence Network. The Network brings together those interested in citizen science and crowdsourcing from across the organisation and this event is also supported by the Academic Networking Fund, IAD. Today’s seminar looks at the Zooniverse crowdsourcing organisation and suite of projects with two guest speakers, and I’ll be taking live notes here. As usual, because these are live notes there may be errors, typos etc and corrections are welcomed. 
We are starting our day with an introduction by James Stewart on the focus of the network, which will particularly focus on methodological approaches.
Grant Miller (Zooniverse): ‘The Zooniverse – Real Science Online’
About Grant and his talk:
‘The Zooniverse is the world’s largest and most successful citizen science platform. I will discuss what we have learned from building over 40 projects, and where the platform is heading in the future.’
 
Grant Miller is a recovering astrophysicist who gained his PhD from the University of St Andrews, searching for planets orbiting distant stars. He is now the communications lead for the Zooniverse on-line citizen science platform.
I had kind of a weird introduction into crowdsourcing and citizen science.. But the main thing I will be talking about today is about how we engage the Zooniverse community to participate and enjoy doing that and being part of our community.
Zooniverse all started with Kevin, a student at Oxford who was tasked with looking at thousands of images of the universe to find two sorts of galaxies: elliptical galaxies and spiral galaxies. He had a million to classify. He did 50,000 and then met with his supervisor and had some strong arguments: he didn’t want to spend his whole academic career classifying galaxies, and he argued that it didn’t require his training. So, by show of hands who thinks this image of a galaxy (we are looking at one of many) is an elliptical, how many think it is a spiral? The room votes that this is a spiral and it is indeed a spiral – and that’s basically how Zooniverse works. We show an image, we ask people what it is, and they choose. And people, en masse, really went for this. They went through huge amounts of images very quickly.
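[The show-of-hands idea can be sketched in a few lines of Python: collect everyone’s answer for an image and take the most common one. This is only my illustration of a simple plurality vote. Grant didn’t describe how Zooniverse actually combines classifications, and the function name and vote counts are hypothetical.]

```python
from collections import Counter


def consensus(votes: list[str]) -> tuple[str, float]:
    """Return the most common label and the share of votes it received."""
    if not votes:
        raise ValueError("at least one vote is needed")
    label, count = Counter(votes).most_common(1)[0]
    return label, count / len(votes)


# Hypothetical show of hands for a single galaxy image.
room_votes = ["spiral"] * 18 + ["elliptical"] * 2
label, share = consensus(room_votes)
print(f"Consensus: {label} ({share:.0%} of votes)")
```

[The vote share is worth keeping alongside the label, since a narrow split might flag an image that is unusual or worth a closer look.]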
Other things started to happen too… The first community around the project was the Galaxy Zoo forum. A participant called Hanny found a thing (Voorwerp)… It didn’t look like the galaxies she was classifying. This was a completely new astronomical phenomenon, which was never known about. An amateur had found this through this very simple platform. People aren’t just good at recognising patterns, they also get distracted and find new things. And after discovering and publishing on this phenomenon – a huge cloud of gas associated with a galaxy – a group from the community decided to make a project of looking for more of these in other Galaxy Zoo images. And this is why communities are so brilliant. On another project our community found a whole new worm under the sea. That’s the power of having this community taking part.
So, how do we do this? Well we really simplify the language of the task, and make it easy for people to take part. And when Galaxy Zoo took off we found other scientists and researchers approaching us to build new projects, including humanities projects and biological projects. So we set up projects such as Snapshot Serengeti – where you indicate what you can see in images from camera traps on the Serengeti. I was working with a group of computer scientists trying to work out how to identify the object in the image, and also with my 4 year old nephew… He named it in seconds; the computer scientists are still looking for a solution.
So at this point in time we now have 42 projects in the Zooniverse. Old Weather in 2010 was our first humanities project. It started as a climatology project, but because it was using historic ship logs, and those include so many other types of data, we found humanities researchers and historians coming on board, so it has had a second life. We have other humanities projects, cancer research projects, etc. Of those projects about 30-35 are currently live. We think this will expand rapidly soon but I’ll come back to that. And last year we passed the 1 million volunteer mark; that’s registered volunteers. Mostly those are in Western Europe and North America, but we have participants in 200 countries (only 7 countries have none).
The community is expanding, the projects are expanding… But there is a lot of potential out there, a huge cognitive surplus we could be using. For instance Clay Shirky notes that 200 billion hours are spent watching TV by adults in the UK, whereas it took only 100 million hours to create Wikipedia. We are only beginning to tap that potential. On January 7th last year we relaunched a project called Space Warps – we had over a million classifications an hour – when Prof Brian Cox and Dara Ó Briain asked the public to do it on live TV. That meant that overnight we had discovered an object it can take astronomers years to discover. It’s good but it’s no 200 billion hours… Imagine what you could do with that much time. Every hour there are 16 years’ worth of human effort spent playing Angry Birds… How do we get that effort into citizen science?
So, is gamification the way to go? For those working in citizen science you could probably run a week long conference just on whether you should or should not do gamification. We have decided not to, but some of the most successful projects – Foldit and Eyewire – do use it. Those projects gave huge thought to how to reward participants’ efforts in the right way so that people don’t just game the system. For us, we are worried that it won’t work: we are not convinced we would be good enough at building a game, and we could end up with something that is neither game nor citizen science. But some of our projects have tried gamification and we have studied this. On Galaxy Zoo we used a leader board to start with but that caused some tension: those in the lead were doing hundreds of thousands of classifications and people felt the leaders might have cheated, while others felt that they could never get there so just left. On Old Weather we enabled those participants who focused on a particular ship’s log to become captain – but it put off as many people as it attracted. And those who became captain had nowhere to go.
This comes back to motivation for taking part. When we ask our volunteers, it frequently comes down to those participants wanting to contribute to research. So, for instance, the Andromeda Project involved images that weren’t that exciting… They were asked to circle clusters in the galaxy. The task is simple, they feel they are really contributing… They finished the task in a week. This time, when we had finished, we put up a message thanking participants for their contribution, saying that we had enough for the paper, but that they were welcome to carry on… And the data shows a rapid fall down to zero participation – they were only interested while the task at hand was useful. And that pattern reminds us not to mess with our community: they use precious spare time and they want to be doing something useful and meaningful.
Planet Hunters is a project we used to detect planets based on data. People don’t take part to discover planets; it is because they really are interested in the science. Some of our really active participants choose to download the data and write their own code, doing work at PhD level as volunteers and sending data back… The planets discovered in that project are rare and weird – things we didn’t spot with algorithms – the first one found had 4 suns. And recently we found a seven planet solar system, the largest other than our own.
Volunteers are keen to go further, so we have a discussion area – labelled Talk – for all of our projects. That means you can comment, Twitter style, or you can use old style discussion boards for long form discussions. Those areas are also used by the scientists, the researchers, the technical teams and developers, and the community can interact with them there – the most productive findings often come from that interaction between volunteers and scientists. The Talk areas of our community are really important. In fact we have a network diagram for our community on which we can see some of our most active participants – one huge green blob on this diagram is a wonderful woman called Elizabeth who posts, comments, moderates, and helps fellow volunteers come along. And we are looking at those networks, at who those lynchpins are, etc.
I said that people write their own code, do their own analysis… So can we get that on the site? We have been playing with the tools area, which we’ve tried for Galaxy Zoo and for Snapshot Serengeti. We’ve been funded to build a broader set of tools, to map data, etc. from the website itself.
One of the other big things we are trying to do is to translate the site. For instance here is Galaxy Zoo in traditional character Mandarin. And we are doing this through crowdsourcing. You pick your site, and you show words or sections for users to translate.
I talked about understanding the community and their interest and motivation. You also need to understand how we allocate images etc. We have done it based on seen/not seen but have been toying with the idea of shaping what images you see based on what you have seen, what you particularly like, or what you are good at identifying. We tried that on Snapshot Serengeti, shaping images to suit interested folk, and it wasn’t that successful. We realised we hadn’t been showing them blank images… So we looked at usage data to see to what extent seeing blank images impacts classifying images. It seems that the more blank images a user sees, the more they classify. When users classify a lot of images with animals in one go, they leave the site sooner. But psychologically we aren’t sure why this is yet – to classify a blank image it’s one click, that’s quick… But also what is the reward there for that image – is it just as rewarding to classify a blank image? There seems to be a sweet spot here… The same team trying to automatically spot a zebra has also been looking at identifying whether there is anything in the image at all… But filtering out the blanks that way may mean volunteers leave the site sooner, so we could be shooting ourselves in the foot…
So, we’ve been thinking: who should see what? And as part of that we have been trying, with some of the space image projects, putting some simulated images into the mix to rank/detect expert level – and looking at that in comparison to volunteers’ experience/expert level within the system. We want to see if there is a smarter way to do a Zooniverse project.
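Again as my own aside, not something shown in the talk: one very simple way to use simulated images is to score each volunteer only on the images where the right answer is known. The sketch below assumes a hypothetical data layout (dictionaries keyed by image ID) and a hypothetical function name, `accuracy_on_simulations`; the real Zooniverse approach may be quite different.

```python
def accuracy_on_simulations(answers, truth):
    """Fraction of simulated images a volunteer labelled correctly.

    answers: {image_id: label} for everything one volunteer classified.
    truth:   {image_id: label} for the simulated images only, where the
             correct answer is known in advance.
    Returns None if the volunteer has not seen any simulated image.
    """
    seen = [image_id for image_id in answers if image_id in truth]
    if not seen:
        return None
    correct = sum(1 for image_id in seen if answers[image_id] == truth[image_id])
    return correct / len(seen)


# Two simulated images with known answers, plus one real image.
truth = {"sim1": "lens", "sim2": "no_lens"}
answers = {"sim1": "lens", "sim2": "lens", "real9": "lens"}
print(accuracy_on_simulations(answers, truth))
```

Real images are ignored in the score because nobody knows their correct label yet; only the simulated ones can say anything about a volunteer’s skill.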
The other thing that can happen is fear, a sort of classification anxiety. For instance for cancer images people can be quite scared to click the button and contribute to the research. So we are toying with showing volunteers how the consensus clustering works – so we can show people that their marking counts but that they are backed up by the wisdom of the crowds. We think that may help them trust themselves. At the moment we just blog about this stuff, but how can we show this on the site?
Panoptes is our new infrastructure platform, which we’ve been building for the last year with 2 million dollars of funding from Google. And the first project using this appeared on Stargazing Live this year, looking for supernovae. We discovered five supernovae during the week long run of that programme.
Mark Hartswood (Oxford University & CSCS Data and Evidence network founder): ‘Intervening in Citizen Science: From incentives to value co-creation’
About Mark and his talk:
‘This talk reflects upon a collaboration between SmartSociety, an EU project exploring how to architect effective collectives of people and machines, and the Zooniverse, a leading on-line citizen science platform.
Our collaboration tackled the question of how to increase engagement of Zooniverse volunteers. In the talk I will chart how our thinking has progressed from framing volunteering in terms of motivation and incentives, and how it moved towards a much richer conceptualisation of multiple participating groups engaging in complicated relationships of value co-creation.’
 
Mark Hartswood is a Social Informatician whose main employer is Oxford University and who is currently working in the area of Responsible Research and Innovation.
