LiveBlog: JISC Cetis Workshop

CETIS teaching and learning repositories workshop (Phil Barker/Lorna Campbell)

The full programme is available on the JISC CETIS webpage.

Phil Barker introduced the day by talking about how we are thinking about repositories today: as places to put stuff, but not necessarily as strict…

John Robertson is the warm up!

Many of you will be here because of your background in repositories but perhaps not in learning and teaching materials so we are going to be thinking about learning and teaching resources and what might be special about them.

What is distinctive about Teaching and Learning?

  • Type of content
  • source of content
  • value of content
  • system functions
  • expected users and use

Two scenarios: a collection of openly licensed materials connected to teaching and learning; a set of high-stakes (final exam) assessment items (questions, answers, rubrics).

For exam material there are far more management challenges at ingest around quality and around checks and balances; open material is a more open process. I suggest that the exam materials already go through all those management processes, so the open material is more challenging.

For each of these we need to think about the key management considerations, the key discovery routes and metadata, and what the user wants to do with that content.

So we are splitting into groups of about 6 people. I’m with Sheila from CETIS, who is playing devil’s advocate on behalf of users; Dan, technical architect for the Learning Registry, who is interested in discovery and effective learning and teaching materials; Yogesh Patel of Mimas, who works on Jorum; Nicola of EDINA, also an MSc in eLearning student who previously worked with Jorum; Lisa, a cataloguer working on the DELORES project; and Peter of Intrallect, who make Intralibrary.

We are now spending 8 minutes thinking about management issues, discovery routes, and the expected uses for both types of materials: for open teaching and learning stuff, and for exam stuff.

Open teaching and learning materials and exam materials – SQA have been pulling those materials into the same repository. The two types of materials sit in the same place but have totally different business cases. What kinds of exams are we talking about – there are the static materials (past papers etc.) and there are online assessments. But we are assuming both of those are fairly static – you could be generating personal assessments per person. And there are scenarios where you generate questions from a bank – the question as an object, with metadata about how it can be used, how questions are performing, and feedback on those.

Discovery – most people focus on Google. When they get into exams they are very much concerned with how content aligns with curricular standards; that pops up right away, rather than in other materials. Experience is that exam materials often have more and better metadata to allow algorithms to draw on them. But metadata could be seen as political in some ways.

And we are back to report…

Group 1: We thought about whether students can contribute and whether you can use external content. In terms of exam papers, the lecturers decide what to expose in terms of marking schemas, so there can be different approaches. Should the statistics for cohort performance be available? Should they be shared internally? Externally? What does it tell you about how a course is doing if one cohort performs unexpectedly?

Group 2 (my group): We talked about formal and informal processes. In some cases you can be dealing with one final copy but the metadata may be more complex. For open materials there may be more management issues re: clearing copyright etc. Google but also trusted sources. High stakes assessment can include some metadata you might not be able to share for various political reasons. We did talk about the difference of live exams versus past papers. We also talked about dynamic exam creation where the question is the object and knowing how it performs will help you develop your exams in the future.

Group 3: In an open teaching and learning environment, how do you surface good stuff – do you want students to see everything, how do you assign quality marks? In discovery Google matters, but without guidance how do you find relevant content? How do you index it all? How do you find what is useful? Expected use – enhance the teaching and learning process. Finding material around your subject. Enhance your learning through this content.

Group 4: Management issues – we got bogged down and sidetracked. We mentioned the word metadata and it all went crazy from there. How much do you need, how much would actually be good enough, what’s realistic? Discovery – social networks, particularly discipline networks, for relevant content. And we talked about being able to search for content, news, etc.

Group 5: We talked about management issues, including policy control – should you control what goes up? Should material be within the institution or shared more widely? In terms of exams, security is an issue – particularly if distance learning students have a deadline, for instance. There may be third-party rights issues over content etc. We expect access through institutions and virtual learning environments, but also perhaps through Google. In terms of exams, would we be just focusing on our own assessments or sharing those globally?

Back to John: hopefully seeing these different responses will give you pause to think about why those responses vary and think about those different perspectives on learning and teaching in this context.

  • Community Engagement in Teaching and Learning Repositories: ePrints, HumBox and OER – Patrick McSweeny, University of Southampton

How many of us have a learning and teaching repository in our institution? OK, by show of hands…

HumBox is a digital humanities teaching repository, it’s community led with some 1400 resources (units of teaching as defined by the uploader) including rich media content (video, high quality images, sound). It is run by the HEA subject centre, based at Southampton, and has built a hugely productive community around the resources.

CRIG – the Common Repository Interfaces Group – came out of a hackday and looked at Repository 2.0, looking to discover why Web 2.0 worked: not just mimicking it, but understanding how to use its techniques.

The EdShare team runs around 8 to 10 repositories now. We have community, project and institutional repositories but they are united by being about teaching materials. We’ve learned things about motivation:

From an institutional point of view the main reason for having a repository is to distribute the content in a way that raises the profile of the organisation. People do worry that people won’t join an institution, they’ll just use the materials at home. But the inverse is true. Textbooks already contain much course content but we still come to university – it’s an experience, it’s an engaging process, it’s about interactions with other people – that’s what makes that course. Your teaching materials are therefore the low value thing. Students pay to hear you talk, to talk to others, to share their thoughts.

So if lecturers and teachers can upload their content there are some easy wins – no need to worry if a hard disc fails, if a staff member leaves, etc. And it enables a more flexible teaching approach and richer content in teaching. A few years back there weren’t a great number of appropriate options for sharing rich content.

Library managers find repositories much more fun to manage, and a more engaging environment to work in.

After building lots of repositories we’ve built up a process:

  • Stakeholder analysis
  • Rich personas – you name these, you build your repository with strong use cases in mind
  • Validated by the community – you get them to tell you if those personas and ideas match that community
  • Firm foundation to move forward – so when you develop functionality or documentation you can call back upon those personas and think about how relevant your action is here.

We found that people generally want community tools in some way – see HumBox for example – there is no mandate, so you need to enable people to deposit material. Users needed a focus for their work in the repository; they wanted to feel it was important for their content. And they wanted mechanisms to interact with other users – you need more than star ratings to maintain integrity. And these materials can be reused in so many ways that you need subtle understandings of why a resource is or is not useful.

We used MePrints to enable this community – you have a profile, your published items, your most viewed items, bookmarks and collections. There are two faces to this – one for you, one that is a public-facing profile. Some content will sit in both versions; some only appears in your personal space. The photo of the academic was an important inclusion here, but there is also commenting, email via the system, alerts to bookmarking and remixing etc. All important.

So how do we do it? How do we build a community? We did flop before we got this right! We got academics from across the UK into a room and discussed the issues with their teaching and the issues with their community. They were interested in discussing ways to improve and share learning and teaching materials. The ability to share advice and meet each other in person is hugely valuable. Pictures on profiles are so important as they allow continuity between meeting someone in person and the materials and work profile of that person. They allow you to make the most of those connections.

So you need a tight community, and that is very powerful. For HumBox we had 50 people along who engaged with workshops and got properly into it. It was a bit like a street team (think punk music) who evangelised the project to their colleagues and peers. They give rich feedback, but it also makes the project fun! They also keep momentum up even after funding comes to an end.

We’ve done some follow-up work on the impact of HumBox using surveys and stats. We surveyed 55 HumBox users, examined usage logs, and looked for patterns of use as well as just statistics. It’s about understanding how people use the repository, what their path through materials was, and how behaviour changed.

Many people began as a way of distributing content. The VLE doesn’t facilitate sharing – even with colleagues – so uploading to HumBox made sharing easy and quick. People often shared existing materials – which did mean they were high quality.

An astonishing number of users told us that they’d gone back and changed teaching resources as a result of sharing them on HumBox. Some made minor revisions, some changed minor elements. Look for the paper online for more detail.

The other thing we found was that remixing and repurposing was going on. Over 50% of reusers modify or augment the material to make it suit their own needs, or just take inspiration from each other. Not reusing slides or video directly, but “I will use a video, and having seen your use and spoken to you has helped me believe this could work for me.” So there was real change to teaching practice. Some 66.7% of those reusing another person’s resource said they had changed their teaching practice as a result. At the time we surveyed we didn’t specifically enable this reuse, but we’ve added it since and I think it will be very popular.

Repositories about teaching content are NOT about teaching content but about the people that create and contribute to teaching content – and that does include students on occasion. Actually the problems around repositories are almost always social. Being appreciated for your work is hugely valuable to people. Complete the loop, keep people engaged, and they will come back and continue.

So to build Humbox: 3 years; 8 JISC projects; 5 developers; 4 interns; 4 SVN trees (and Pat’s had 2 desks).

The software is online and free (GPL). You can get the papers, project reports, advice, materials in the EPrints community etc. All you need to do is find the community that needs this.


Q1) Did your participants have an established pattern of building online profiles before?

A1) Not really, the group was scared of YouTube, Flickr, etc. Those workshops let people voice their concerns and worries about inappropriate reuse of content, comments from the wrong communities. We were able to…

Q2) The title suggests humanities, but was there a cross-disciplinary approach?

A2) Humanities was defined very widely here but there was a critical mass of engagement in several different areas. But comparing styles of teaching across those diverse materials was really helpful.

Q2 again) Do you think the same pattern might translate to sciences?

A2) We just started Bloodhound online looking at engineering materials for teaching so it will be interesting to see how that works in comparison to Humbox

Q3) In terms of feeling exposed, where are you now – where are we at with sharing beyond institutions?

A3) LORO, the OU repository for languages, is really interesting. They made all that content Creative Commons licensed. But their definition of repositories changed in that process: they didn’t just want public access but also comments and feedback. Really interesting. But generally people are keen to be open.

A3 – Yvonne) We very much made sure that by doing rich stakeholder analysis and rich personas we really understood our users, what they were comfortable with, and how to make these things accessible to them. This sceptical set of people has moved on to really rich engagement now.

And we are back with Phil Barker:

I like very much what Patrick was saying about changes to teaching practice there. And now, for a break….

We are back!

First up is Chris Awre standing in for John Casey from the University of the Arts who is unwell and unable to make it today.

Hydra, Fedora and learning materials at University of Hull – Chris Awre

I am aware that our repository is for the whole institution, not just learning and teaching materials, but including those.

We will be talking about what we have done – our learning materials activity at Hull. Then I will talk about Fedora, then the Hydra project, and what we would like to do in the next year or so.

We have had our repository live since 2008 and we have thought about how it can be used for learning materials at various points, but we haven’t particularly focused on this. We have focused on lower-hanging fruit, and more immediate university priorities. But that changed slightly when the UK Physical Sciences Subject Centre (based at Hull until last week) got one of the OER phase 1 projects, “Skills for Scientists”, to generate resources for the physical sciences – chemistry, physics and forensic science being the particular focus. All those materials were placed in Jorum, but they were also to be stored locally as a backup and a further access point.

This collection appears in a fairly basic way. It’s hierarchical in its construction but quite easy to move through. That collection now sits alongside an archive of the website for that subject centre, as it is now closing down.

So having said that, we have other teaching materials in the repository and we are keen to expand that. However, there are other materials also used for teaching, though not perhaps presented that way. Several data sets used by the Department of History function in this way as well. There are also digitised books and poems where lecturers have requested this – where the items were out of copyright this has been easy and very effective to do. And we also have some permissions from copyright holders to use materials for teaching. We also have audio and video recordings for a particular creative writing course. We also hold the exam papers, though departments decide whether or not to deposit these. And we have CLS digitised materials in use at the university, but these are NOT in the repository at present – we’d like them to be if we can overcome the rights issues/complexities in some practical way.

In terms of using Fedora, we needed a scalable solution – nothing with an upper limit on content storage. We wanted something standards-based and open source where possible, in order to future-proof our repository to some extent. And we wanted to be content agnostic – we don’t know what content will come along.

And we were also interested in content semantics, so that we could record the relationships between projects and materials and how these change over time.

Fedora (Flexible Extensible Digital Object Repository Architecture) tries to live up to its name – it’s an architecture on which to build a repository solution. It’s fairly well formed out of the box but allows huge amounts of customisation. It also has a powerful digital object model and user interface flexibility according to the needs of particular content.

That digital object model lets you define a digital object identifier; you can associate reserved datastreams for key metadata, further datastreams (content or metadata of any type), and disseminators – tools for reuse etc. The system doesn’t mind what is packaged under that object identifier, and that is very powerful.
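That grouping of an identifier with arbitrary datastreams can be pictured with a small sketch. This is an illustrative Python analogy only, not Fedora's actual API: the class, method and datastream names here are invented for the example.

```python
# Illustrative analogy of the Fedora object model (NOT the Fedora API):
# a digital object groups a persistent identifier with any number of
# typed datastreams, and the system doesn't care what each one holds.

class DigitalObject:
    def __init__(self, pid):
        self.pid = pid            # persistent identifier, e.g. "hull:1234"
        self.datastreams = {}     # metadata or content, any type

    def add_datastream(self, ds_id, mime_type, content):
        """Attach a datastream (content or metadata) under this object."""
        self.datastreams[ds_id] = {"mime": mime_type, "content": content}

# A journal article and its descriptive metadata under one identifier:
obj = DigitalObject("hull:1234")
obj.add_datastream("DC", "text/xml", "<dc:title>Sample article</dc:title>")
obj.add_datastream("PDF", "application/pdf", b"%PDF-1.4 ...")
```

The point of the model is visible even in this toy: nothing constrains what sits under `hull:1234`, so the same machinery serves theses, skull scans and learning objects alike.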

The development of Fedora has been overseen by DuraSpace since July 2009; it is the parent non-profit body for Fedora, DSpace, Mulgara, Akubra and DuraCloud. Fedora is now at version 3.5 (coming soon) and there is also a roadmap for future development outlined. There is a core development team, but there are also lots of community-based committers who drive software development in conjunction with community input. The user group is active and really helpful. There is a UK community that meets every year, so there is great support available here.

Current areas of activity for the Hull repository include:

Theses, dissertations, exam papers, committee papers, university policies, procedures and regulations, HR documentation, e-prints/journal articles (and there will be more activity as we connect to the CRIS), images, skull scan images, audio recordings, lectures (particularly inaugural lectures), digitised content, LTSU documents, student handbooks, and open educational resources.

Given that we have that repository, why would we need anything else? Well, partly because our interface to Fedora, Muradora, ceased development when its funding ceased. There wasn’t a critical mass of users to take Muradora forward, and it was in need of major re-engineering, having only ever been a serious proof of concept. Essentially that interface made Fedora act like a Dublin Core registry with files attached, whereas we wanted to take advantage of the richness of Fedora’s model. But based on the experience of Muradora, we needed to be part of a larger sustainable community connected to our interface.

We presented on the REMAP/RepoMMan projects at OR2008. These were JISC projects looking at how repositories could be incorporated into the digitisation life cycle and how Fedora could be improved. As a result we made contact with partners, and the grouping of Hull, the University of Virginia, Stanford University, Fedora Commons/DuraSpace and MediaShelf LLC got together as the unfunded Hydra project, to build a flexible multi-institutional interface solution for Fedora. We needed to work together and collaborate to build a community of developers and adopters to extend and enhance the core – where technology and community were tightly connected.

So we wanted 4 key capabilities:

  • Support for any kind of record or metadata
  • Object-specific behaviours – for books, images, music, video, manuscripts, finding aids, learning objects etc.
  • Tailored views for domain or discipline specific materials
  • Tailored for local needs and views

We looked to build a semi-legal basis for contribution and partnership around Hydra to make this work well, and we have enabled others to join the project on these terms if they want. One of the Fedora dilemmas is that it is so flexible you need to have focus for what you actually want to build. So here is an overview of what we use: Fedora with the Hydra Rails plugin (create/update/delete), Solr with Blacklight (read) and Solrizer, and Ruby gems to enable ActiveFedora, opinionated metadata, etc.

In 2011/12 we are implementing Hydra in Hull. We will switch off our current repository (eDocs) when everything has migrated over. Implementations are also going live in the US this year – Virginia’s is already up and running. It will provide an end-user UI with graded levels of access, and create-and-manage functionality for particular users and groups. And it will connect with tools including SharePoint, so that the repository can be used to store materials created elsewhere.

How can we apply Hydra/Fedora to learning materials? Well, the object model lets us structure and describe learning objects within the repository, and Hydra provides a way of delivering this through CRUD interfaces. Two possible approaches: a learning objects Hydra head with specific workflows etc., or including learning objects in an IR head. We are going the latter route to further develop a broad approach to the provision of a repository for Hull, based on MODS metadata. But we could always alter this in the future; Hydra enables this flexibility.

We have OER pilots scheduled for 2012 (via FLEX elearning peer support group):

  • Building on Skills for Scientists and a local RLO project
  • OER phase 3 funding?
  • University of Hull projects
  • And building the ability to view the repository from the VLE, SAKAI


Q1 – Mahendra) Can you tell us about your strategy for getting content?

A1) To date it has been to not tell anyone we have a repository and wait until they come to us – as we have plenty to be getting on with. We’ve sort of dealt with content when we are asked for content solutions. So when the Physical Sciences Subject Centre got their OER project we were able to tie that to the repository, etc. We are conscious that we have to officially launch it at some point. We are slightly wary of this, as we need to be able to deal with queries and resource any future development. But our strategy is to support the needs of the university, and that won’t change. In theory we can take any content or format, but we do take some practical decisions – for instance, for dissemination theses are required to be in PDF format at the moment.

Q2) You talked about these different views on material – how have you used these and what is the key thing that you need to do with objects when depositing to allow you to package them well in the repository? How do disciplinary views get created and how do they change?

A2) At the moment the different views are based on materials we know we are providing. We are not clear on demand for new views but we know that if others request a new view we can develop it for them – and we will need to go out and ask people what they want in these ways. The repository is flexible but you need to describe the materials at ingest to give an idea of how the content should be displayed. So if you ingest a journal article you could associate a journal article content model with that – that would present the abstract and publication information say. You could also associate that article with a learning object and viewing that might display the article differently.

We use the Solr index to create tailored views of the repository. You can create subset indexes over the whole content to enable specific views, but that depends on an indication of how the content could be used. So you could create a subset of medical materials, while the generic view of all materials is always there. The issue of peripheral views or interests is a persistent issue for any sort of subject view of materials.
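The subset-view idea can be pictured with a minimal sketch. This is a loose Python analogy with invented sample records, not Hull's actual Solr configuration: the full index is always there, and a "view" is just a filter over it.

```python
# Loose analogy of the subset-index idea (not Hull's actual Solr setup):
# the generic index of everything remains, and each tailored view is
# simply a filtered slice of it. Records and fields here are invented.

records = [
    {"id": 1, "title": "Anatomy atlas", "subjects": ["medicine"]},
    {"id": 2, "title": "Exam paper 2010", "subjects": ["chemistry"]},
    {"id": 3, "title": "Skull scans", "subjects": ["medicine", "history"]},
]

def subset_view(index, subject):
    """Return only records tagged with the given subject."""
    return [r for r in index if subject in r["subjects"]]

medical = subset_view(records, "medicine")   # the "medical materials" view
```

The dependency the speaker notes is visible here too: the view only works if the ingest step recorded how each item could be used (the `subjects` tags).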

Social Networking for Metadata: The Federal Learning Registry Initiative, Dan Rehak, ADL

The word registry can be confusing to some people – it’s not what libraries think of as a registry, for instance. We have the tagline that we are Social Networking for Metadata! Find us on Twitter: @learningreg / #learningreg, and online here…

I had hoped to show you a video of Aneesh Chopra, Chief Technology Officer of the US, based at the White House. He spoke a few weeks ago about the importance of learning technology to the administration. We call ourselves the Learning Registry as we want to be global, we want to be relevant to broad audiences. Let’s start with an emotive example:

NASA is the biggest STEM digital materials producer in the US. They announce a physics video… PBS discovers and shares it, NSDL discovers it and posts a link, it’s used in a course in Moodle, OER Commons incorporates it into its site. This is an ecosystem of usage. There’s no one place for that resource; it sits all over the place. You search on Google and all that reuse doesn’t translate into PageRank. And then there are further aggregations. We want to change how people think about this material and metadata.

We want to create a Twitter-style public timeline of metadata, which acts as a central spine of knowledge about the usage of these materials. And we ask the users of these resources to put usage information back in, to feed into knowledge about that item. We pull in where the resource has been used, how, etc. We can put this information into recommendation systems and create really great information about context and usage for other sites, other portals etc. We expect this “paradata” to be more useful than formally structured metadata. We think it will supersede metadata for this material.

We don’t know our content really well. We have a lot. We don’t share it very well. We have lost the data exhaust. There is so much analytical data that Google sees, but when you copy material into your silo you lose that connection around usage etc.

Learning Registry sounds confusing… what is it?

  • a concept
  • a research project
  • a community project
  • a codebase
  • a public social metadata distribution network

We are not creating portals or user interfaces. We are creating an enabling infrastructure that others can use as they like. We think they will do interesting things with it. We had an interoperability workshop a few months ago and Pat Lockley did a quick thing but it was a great example of what you might do with the data.

So the approach of the Learning Registry is to be enabling. We want to provide capabilities, not solutions; anyone can participate; no default winners; no single point of failure or control – and once you switch it on you can’t switch it off; anyone can provide information on anything; identity and trust exist and are really important; re-aggregation and sharing are natural; usage/utility is shared; as simple as possible.

We want to look at all the ways in which learning resources, services, applications, communities all interact and connect, how material is discovered, described, recommended, how feedback improves the system.

We are resource agnostic – they can be open resources, they can be federal resources, they can be commercial resources, they can be your resources. Anything is welcome. There are common APIs and a Resource Network to ingest material. Then there are aggregators, publishers, amplifiers, app builders, curators, governments, organisations and businesses that use that material and lead to learning and discovery. The Learning Registry is that mixture of APIs and Resource Network. The model is a fast distributed model; there is no preferred place to find information. Consumers can access any node in the system. If you’ve been around the internet a long time, this is NNTP reimagined.

Learning Registry Resource Data:

  • Resource Locator (resource identity: URL)
  • Who’s providing the data (identity: submitter, owner, curator)
  • Terms of Service (URL, optional)
  • Digital Signature (OpenPGP, for trust)
  • Hashtags (recommended)
  • Formal metadata (optional, any schema)
  • Workflow stuff (message IDs, versions, times, transit notes – assertions)
  • Weight (Confidence)

All of this is JSON in a document-oriented, schema-free database.
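As a rough illustration of such a resource-data document, here is a hedged Python sketch. The field names are paraphrased from the bullet list above rather than taken from the exact Learning Registry envelope schema, and the URLs and values are hypothetical:

```python
import json

# Sketch of a Learning Registry resource-data document. Field names are
# paraphrased from the list above (locator, identity, ToS, hashtags,
# optional formal metadata); they are NOT the exact envelope schema.
doc = {
    "resource_locator": "http://www.nasa.gov/video/physics-demo",  # hypothetical URL
    "identity": {"submitter": "NASA", "curator": "NSDL"},
    "TOS": "http://example.org/terms",       # terms of service (optional)
    "keys": ["#physics", "#STEM"],           # hashtags (recommended)
    "payload_schema": ["LOM"],               # formal metadata is optional, any schema
    "resource_data": "<lom>...</lom>",
}

# Documents travel between nodes as schema-free JSON:
envelope = json.dumps(doc)
```

Because the store is schema-free, a publisher can include as much or as little of this as they have; only the locator and identity really anchor the document to a resource and a submitter.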

We have APIs to publish: SWORD (1.3, 2.0), third-party OAI-PMH (we don’t harvest).

Access (pull to get data): obtain (by ID, record, URL); harvest (JSON, or OAI-PMH – we have extended OAI to get information by URL); slice (subset by identity, schema, keyword) – this is in place of a query API, which we are not going to do (e.g. use Elastic Search instead).

Distribute (node-to-node with regex “filtering”) – this is an internal API really.

Admin (Status, discovery, etc…)
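A consumer might address the pull APIs described above roughly as follows. This is a sketch only: the node URL is hypothetical and the parameter names are assumptions for illustration, not the definitive Learning Registry specification.

```python
from urllib.parse import urlencode

# Sketch of how a consumer could address a node's pull APIs. Since there
# is no preferred node, any node's base URL will do. The URL and the
# parameter names below are illustrative assumptions, not the spec.
NODE = "http://node.example.org"   # hypothetical node

def obtain_url(resource_url):
    """Ask the node for everything it knows about one resource, by URL."""
    return NODE + "/obtain?" + urlencode({"request_ID": resource_url})

def harvest_url(from_ts):
    """OAI-PMH-style harvest of documents published since a timestamp."""
    return NODE + "/harvest/listrecords?" + urlencode({"from": from_ts})

url = obtain_url("http://www.nasa.gov/video/physics-demo")
```

The "no preferred place" property from the talk falls out naturally: swapping `NODE` for any other node in the network should yield the same data, since documents replicate between nodes.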

We have this idea of nodes; nodes form networks, networks form communities. And we have nodes that connect networks together. We partition the space to some extent. Policies are common to particular networks or communities. We have instances where people want to be private: we run 3 communities – production, tech, development. There is no way to accidentally propagate one community into another, so our development version doesn’t accidentally creep into the production system. This is important as the Dept of Defense co-funded this research and they need one-way gateways to take in materials only. See forthcoming image for the infrastructure in use across the Registry.

We are in prototype implementation (version 0.3). We use RESTful APIs; data-driven policies and descriptions; CouchDB (NoSQL) storage with master-master replication – as used by the BBC and the LHC; map-reduce views; and a Python apps layer which abstracts Couch. We have a test and development network, and a public production network hosted on Amazon EC2 – the easiest, cheapest solution.
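The map-reduce views mentioned can be pictured with a tiny Python emulation. Real CouchDB views are JavaScript functions stored inside the database; this just mirrors their map/reduce shape with invented sample documents:

```python
# Tiny Python emulation of a CouchDB-style map/reduce view. In CouchDB
# the map and reduce steps are JavaScript functions stored in a design
# document; this only mirrors the shape. Sample documents are invented.
docs = [
    {"submitter": "NASA", "resource": "video-1"},
    {"submitter": "NSDL", "resource": "link-2"},
    {"submitter": "NASA", "resource": "video-3"},
]

def map_fn(doc):
    yield (doc["submitter"], 1)      # like CouchDB's emit(key, value)

def reduce_fn(values):
    return sum(values)               # e.g. count documents per key

# Run the view: collect emitted pairs, group by key, reduce each group.
emitted = [kv for d in docs for kv in map_fn(d)]
view = {k: reduce_fn([v for key, v in emitted if key == k])
        for k, _ in emitted}
# view maps each submitter to a document count
```

In CouchDB the result is incrementally maintained and queryable by key, which is what makes such views usable at the scale the Registry is aiming for.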

So in terms of looking at the data, you can see the interface Jim Klo built at DevCSI this week – more on their blog. If you look at the ARIADNE federated search engine and see how their system connects up, you can see why they have had concerns about distributing documentation about collections. We can do this easily via nodes in the Learning Registry. We will be issuing assertions about what a node will harvest, when they will harvest, and alerts to harvest, to balance the work across the nodes – the load can be distributed easily.

Pat Lockley’s plugin (as mentioned earlier) looks like a normal site search in Google and can do the normal stuff, BUT he added a button that pulls up the educational details from the Learning Registry. Actually what you want is not those results on Google; you want a site for users to request use of the plugin, which will then allow them to display that data on their site. It exposes a wealth of information from other sources on materials they are sharing.

All tools shown have used the same basic open API. Go ahead and do this yourself!

We are all about openness, mashups etc. We believe in open processes, open data, open products, open standards. We let people put in proprietary data, BUT we let nodes choose whether or not to look at proprietary data. We make everything except detailed finances open to all who are interested.

Anyone can join the community. You can be a provider of learning resources, metadata, paradata, analytic data etc. you can make your own stuff, you can use the data etc.

We started in June 2010. We had 6-week Agile development sprints from Oct 1st 2010. Production version 0.5 arrived Sept 30th 2011. Formal partner integration from October 2011.

Learning Registry Plugfest 2, Dec 12-15, Boulder CO USA

We are working with a huge range of partners. We have spoken to JISC and the BBC – Mo mentioned Digital Public Space and there will be paradata through this. We are working with Open Nottingham as well, for instance. Australian partners too.

Everything is on the website – presentations, documentation, github has the code. Anybody can join and participate.



Q1) Could what you describe be used beyond just learning and educational materials?

A1) It could be used for lots of different things. Nodes can control what they do/don’t interact with. You know this structure could even be a porn network – that’s up to those nodes to do that. But we are seeding with learning stuff.

Q2 – Phil Barker for the audience) What would you do once the Learning Registry is built? How could CETIS and JISC help you do that?

A2 – audience member) We’d like to use it to reinforce the supporting community in the life sciences, particularly in the biological sciences. I’d need some technical support – it looks like great material but a steep learning curve.

A2 – Dan) Our goal is that you should be able to understand anything in 2 hours and implement it in 6 hours… otherwise we’ve failed.

Phil: we are really supportive and actively engaged in this project and we have been encouraging people to go to Plugfest!

Q3) I like the idea of used materials rising to the top. How can you increase the amplification of those resources?

A3) Two things: more and more people put in data and amplify it themselves. And we think that someone will write a recommendation system using trust to help decide the priority of options.

And now we are off to lunch…

We are back and starting a little early as the sunshine beckons! First up we have:

Getting Bioscience Open Educational resources into ‘Academic Orbit’. Tales from the OeRBITAL launchpad – Terry McAndrew, University of Leeds

I am going to tell you about the OeRBITAL project which myself and Chris Taylor have been working on. We are doing some Fringe touches today… there’ll be some hyped claims, some risks, some leaps of faith and imagination. There will be audience participation opportunities, some death-defying PowerPoint… and an ice breaker!

Discuss and Declare the BEST OER you have (a) seen and (b) used with your adjacent colleague.

And what was “best” about it, so things like:

  • What qualities did it have?
  • Who did you tell?
  • Did it improve from your collaboration?
  • What happened to it?

So some of those mentioned…

  • Landmap geographical teaching resources
  • Online brain atlas in 3D
  • Video of Silly Putty
  • Virtual Dutch Timeline – moved location causing 404 errors that undermined the resource.
  • YouTube, Flickr gets mention

The biosciences domain is rich for content, many disciplines. Great potential for broad learning objects at UG level but less and less flexible higher up the learning chain.

The Interactive Laboratory and Fieldwork Manual for the Biosciences was a large release project of £250k: 10 project partners, over 140 records in Jorum and 200+ resources.

We developed the STEM OER Guidance Wiki but at the same time the JISC OER InfoKit appeared. OERs were still completely new to our audience at the time, but we did find realistic profiles of academic habits to help us find workflows for creating OERs.

The problem OERs have to tackle is to make resources more effective and reduce duplication of effort – sharing is costly but reinvention more so – but IPR clearance, tagging, branding and discovery are all challenges to deal with in getting material out there. Resources should also raise the profile of the UK educational sector.

But OERs don’t fall naturally into an accommodating culture – be realistic, not naive. For discovery, trust is important to establish early on (copyright, pedigree, date). Google is an expected route to finding things: “any other solution is often more trouble than it’s worth”. And timing is crucial, as at certain points in the academic year it’s not realistic to get OERs shared.

Big OER is a difficult fit to existing courses. It is a significant thing to bring large objects into your courses as it takes away from your own teaching content time. Senior staff don’t like junior staff using others’ materials too much. Funded work gets prioritised – staff move on to new funded projects afterwards. Is the academic role the key OER stakeholder? Are there other roles that can boost OERs more?

We ran 10 projects within the consortia with Bioscience. Various approaches here – web applications can be OERs if we trust that they will be maintained (e.g. Oxford iCases). Some of the key issues and outcomes were that:

  • Starting from scratch is still seen as the preferred route, for various reasons
  • Repositories are not “core” practice according to our survey – we have to keep selling their use
  • The Learning Technologist’s role is unclear
  • OER awareness was raised in the community
  • An OER approach was established

For phase 2 we decided to work on the Collections strand (under £75k) and look at getting a value-for-money project to identify, collect and promote collections of OER on the Bioscience Learning and Teaching theme. And we sought to establish an OER cycle – to get some velocity around reuse.

We wanted to expose the interface and relate the stories around a wiki. We recruited 8-10 Discipline Consultants (under 180 hours in total) with network connections through Learned Societies and Subject Associations. They investigated open repositories and many other sources/projects to highlight best of breed and identify enhancements using existing OERs, and we wanted to do this work with Learning Technologists. We were concerned to deliver on the needs of our subject areas.

We used the OeRBITAL wiki and encouraged supporting blogs. MediaWiki was used because of broad familiarity with Wikipedia. The wiki allowed collaboration but also encouraged competition to produce work.

We identified individuals and supporting information on topic specialty. We supported our group with information about how to use the wiki, space to comment, space for issues raised. We’ve even spawned an issues raised page and discussion around these. We could grab the whole dialogue and people can investigate the process, the way in which resources were found – these are important to understanding the use of the Wiki in the future.

We found that the Learned Societies are not “gearing up” for OER. The small ones are under-resourced. There can be monolithic behaviour and there are some ageing resources out there. Network ownership and identities are new via web 2.0, and not likely to be a primary network. It is difficult for other roles to attach and boost support for OERs. There is potential to provide content to the Learning Registry.

We hope that the gains here will be better teaching, feedback and reputation. Accessibility of resources was important – see the Techdis alter-ego on this. SWOT analysis will be done for found resources, but we found that reward and recognition is still a major concern for all looking at OER. Are we expecting the academic culture to change too much here? Does it have too much gravity – harming a sustainable orbit? Should we (additionally) launch from a moon instead – Learning Technologists (R&R) perhaps? Students as producers, with academics as mentors, may be a great low-risk approach for creating resources.



A1) Some sort of Bogwart. There are so many disciplines within biosciences and the approaches vary hugely!

Q2) Have all contributed equally?

A2) No, but there are patterns and we can learn from that and feed back to the repository provider. And you have to work with the attitude of the “academic in a hurry” – if you put them off they will not come back, and they will tell others!

Q3) How sustainable is this approach?

A3) I think it is very sustainable, especially through something like the Learning Registry – if your profile is boosted by OER, or by reviewing OER materials. We like Humbox’s profile approach but really that should be in people’s own spaces – you don’t want loads of diluted effort across different profiles.

Next up is:

Intralibrary and Item Banking – Charles Duncan, Intrallect Ltd.

We have worked with repositories for some ten years or so, mainly through Intralibrary which we run. We will be looking at Interoperability and Integration both generally and through various specific examples.

Intralibrary as a digital repository now has a different variant: Intralibrary Plus, which includes third party tools – the Plus is interoperability. Interoperability and integration lead to certain situations/opportunities/limitations.

If you have low interoperability and low integration, then combining resources involves lots of effort, much of it unproductive, and requires lots of technical skills. Low interoperability but high integration leads to proprietary systems and a high degree of lock-in. High interoperability and low integration we are probably quite used to: flexible mashups are possible, but technical skills are required and you need to understand interoperability. But high interoperability and high integration makes the level of skill and understanding needed to get the best out of combining materials much less, and the process more productive. Standards, including metadata standards, make both interoperability and integration possible.

The high integration and high interoperability model enables exchange of materials in many directions, redeposit etc. Workflows are crucial here. You may want to ensure both a technical and a quality assurance sign off before deposit perhaps. Perhaps the metadata may need signoff as well. And there may be review and revision of your materials as well – connected or separate from the deposit workflow.

So a specific example using MathAssessEngine. This is integrated into Intralibrary Plus as a special workflow that enables additional buttons – you can use algorithms to build assessment around particular problems, questions and themes. You can create a question for a collection in the system using another tool on another server. Lots of interoperable elements combine here. The Create a Question tool lets you add any number of differently formatted content items – these can be included or excluded depending on the workflow one needs (so maths assessments may not have an essay question option). So let’s try an example – a “find Edinburgh on the map” question that uses hotspots in the map. We can classify the question according to curricula options in the system. And we can save and publish that item.

We can use SVG images as well as traditional images in the system here. You can also assign quality marks to materials and elements here to make content more relevant and useful to others.

So now I will demo how a question within the system can be used in another space. We just add a question on another site and use that question – and see stats about it. So we link together the Create a Question tool, Intralibrary and the MathAssessEngine, all connected together.


Q1) This is sharable across multiple institutions yes? So how do we motivate teaching staff to create materials for others?

A1) The incentive is to share and reduce effort locally but it can also be useful in the same institution for staff to share materials between them. You can also edit the material – being editable makes it updatable and reusable.

Q1 again) It’s getting them to take that approach though, we struggle with that.

And so, with a brief pause for coffee, it’s into our final presentation of the day…

WordPress for hosting and describing learning resources: Reflections from UKOER Delores and other projects – Phil Barker

The thing about speaking at your own event is that no-one introduces you! But I will be talking about a non CETIS project here. I will be talking about a project that was in the same strand as Terry’s and I’ll be talking about assessing OERs using WordPress.

Delores is: Delivering Open Educational Resources for Engineering Design.

We have static and dynamic collections of university level OERs and other openly available resources relevant to Engineering Design. A static collection may include dynamic resources but the collection itself is static. Dynamic collections can have new materials added or taken away or developed.

ICBL, School of mathematical and computer sciences, Heriot-Watt University and the University of Bath worked together on this project, funded by HEA and JISC under OER Phase 2.

We used WordPress to gather resources collected by experts in design engineering as being of high quality and usefulness for the collection. We aimed for about 100 objects in that collection of materials. The dynamic collection is everything underneath that. We use a tool called Sux0r which does Bayesian filtering of content – this is how Spam filtering works. We are using that idea the other way around – filtering out the useful information to detect likely design engineering materials. Then we put material through a tool designed by Bath called Waypoint which enables faceted searching. Anything they can build a search for, that will do this classification, can be passed to the interface. Because Sux0r pulls RSS feeds from collections we know of, those feeds are continually updated and the Waypoint continues to grow the collection that is made available to users. I am going to focus on WordPress but I mention this context to point out that the hard technical stuff, the effort, the hard thinking wasn’t really in the bit I am talking about.
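The "reverse spam filter" idea described above can be sketched very simply. This is a hedged toy, not Sux0r's actual implementation (Sux0r is a PHP application and its internals aren't described here): a minimal naive Bayes classifier that, instead of filtering items out, flags feed items likely to be design-engineering material. All training text below is invented for illustration.

```python
import math
from collections import Counter

class BayesFilter:
    """Tiny naive Bayes text classifier sketching the Sux0r-style idea
    run in reverse: score feed items as 'relevant' (design engineering)
    rather than discarding them as spam."""

    def __init__(self):
        self.word_counts = {"relevant": Counter(), "other": Counter()}
        self.doc_counts = {"relevant": 0, "other": 0}

    def train(self, label, text):
        # Accumulate word frequencies per class from labelled examples.
        self.word_counts[label].update(text.lower().split())
        self.doc_counts[label] += 1

    def score(self, label, text):
        # Log prior plus log likelihood, with add-one smoothing.
        counts = self.word_counts[label]
        vocab = set(self.word_counts["relevant"]) | set(self.word_counts["other"])
        total = sum(counts.values()) + len(vocab)
        s = math.log(self.doc_counts[label] / sum(self.doc_counts.values()))
        for w in text.lower().split():
            s += math.log((counts[w] + 1) / total)
        return s

    def classify(self, text):
        return max(("relevant", "other"), key=lambda lab: self.score(lab, text))
```

In the project's pipeline, items classified as relevant from the incoming RSS feeds would then be handed on to the faceted-search stage.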

So, starting off… what do we think we need in order to have this static collection? What are the needs for describing these OERs? First up, you may not want to hold an actual copy of the resources: we decided we didn’t want to hold a copy, as these were pre-existing resources. What metadata do we need? Title, description, authors, origin, date, subject, classification of some sort, licence, and probably something about the resource type. Users want that information, not necessarily locked up in an XML file. We want to embed a preview. We may or may not want to allow comments – but we don’t want to have to manage and spam-filter those long term. We want something with a good web presence (and findable by Google) and something that has good participation (links in many directions, embedded material, widgets etc. – we want it to take part in the web). We want RSS feeds – great for pushing metadata around; we want embedded metadata (thinking RDFa, microformats etc.); we want flexibility; we want something easy to use and maintain (perhaps familiar); and exportable metadata?

The idea that we had was to use WordPress. One blog post per resource – if required you can attach resources that are single files. It gives a basic description and good web presence. WordPress handles author, date, title, and tags and topics for classification. There are also extensions for metadata and additional functionality (a big developer community there).

We weren’t the first people with this idea…

  • Oxford’s Triton Project are running the Politics In Spires blog. They are creating OERs within WordPress – describe and comment on current affairs and other items. They have focused on add ons around that blog.
  • Edinburgh University have an initiative called OpenMed
  • CETIS has been exploring the use of WordPress to disseminate our publications. We see a sneak preview and should note that resources are attached to posts and it looks nothing like a blog
  • Scriblio (formerly WPopac) – a WordPress theme to create an OPAC using WordPress

How were our goals met? Well, most of what we wanted was possible. What I like about WordPress is trackbacks – you can see when you’ve been blogged or linked to; people can write about you and you can then aggregate those comments on your post. You do get RSS feeds, but with some question marks to mention there. It’s easy to use and maintain and familiar – though the more flexibility you use, the harder it is to maintain. All those question marks are where WordPress gives you information about the author but not the originator of the resource.

So some customisation…

We used WordPress’s custom fields and we adapted the theme so that these are displayed. And we will have either a plugin or a theme written so that the right metadata goes into the RSS feed. So let’s have a look in the system for bridges…

We can find a description and preview of the resource, links to it etc. Looking at the admin screen you can see we are using custom fields to include metadata about the object, and we have set up categories that fit the curriculum. Lisa in the audience here wrote all of the resource descriptions – she is a trained librarian and that has really been helpful here.
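The plugin or theme that pushes the custom-field metadata into the RSS feed was still to be written at this point, so here is only a hedged sketch of what a consumer of such a feed might look like. The namespace URI and the `originator`/`licence` element names are hypothetical, invented for illustration; only the general pattern (custom-field metadata carried as namespaced RSS extensions) reflects what the project describes.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace for the custom-field metadata the project
# planned to expose in its RSS feed (URI and element names are assumptions).
NS = {"oer": "http://example.org/delores/oer"}

def parse_feed(xml_text):
    """Extract the title plus custom-field metadata from each RSS item."""
    items = []
    root = ET.fromstring(xml_text)
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title"),
            "originator": item.findtext("oer:originator", namespaces=NS),
            "licence": item.findtext("oer:licence", namespaces=NS),
        })
    return items
```

The appeal of this design is that ordinary feed readers ignore the extra elements, while metadata-aware consumers can pick them up.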

And connected to that we will be hearing from:

OpenMed – Ross Reed, Edinburgh University

OpenMed is being used in a very similar way. We are using the same format of a blog post as a reference to a resource. All the backend is written for you – with a few plugins to use. This was a spin-off project from another, bigger project. We wanted to cherry-pick good resources and get away from the Google type discovery process for trying to find good resources. Resources are represented by posts. Categories and tags are in use. We have four people who are looking at their specialties here. Our key aim was: how can each resource be entered in 30 seconds if necessary? We really trimmed down the editor view to make it very focused. The added value we have over the normal resources is that they are peer rated. And we’ve used Pages to create rough curricula areas, to cater for those who wish to browse. We noticed when we started that repositories are super if you know what you are looking for, but we are trying to push people to things for their curriculum in easier ways. We’ve gone for utilising WordPress to add value to the materials we point to.

The benefit was that it saved Ross time – he’d started writing a repository system at huge time cost, but future-proofing that was a bit of a nightmare. WordPress was far quicker and easier to implement. It had to be easy to author, edit and add on, as we only have funding for a limited time, so if we can build up a good community it should be easy to update and maintain – someone yesterday said that you wouldn’t use Facebook if you needed to be trained to use it, and that’s why we’ve tried to make it easy.

If you click on a resource you see ratings (stars for quality, A/B/C/C+ indicate level of teaching) and brief information. We hope to include more of the resource itself as well. We try to autocomplete as much as possible, so although we don’t use a metadata schema as such, we do collect some metadata in standard forms.

WordPress had RSS and security all built in. The search tools are also pretty good in WordPress – and there are lots of plugins to let you customise it further, additional filters for instance.

But we soon realised that the WordPress

WordPress SEO is quite good. It’s only been live for a few weeks and there are 170 resources over 4 subject areas.

Advantages: it was quick; good support from the community and from WordPress plugin authors (requests followed up almost immediately); it’s opened my eyes to the benefits of the open source community; it’s easy to edit; it’s fairly customisable for the future as well; and the academics contributing to it have been very keen to contribute themselves. We also want to make it as easy as possible to find, to link to, to link up to other resources etc.

We have switched off comments as we build it up but we may well be adding those. And there is the possibility of user rating, notwithstanding what Pat said this morning about


Q1) How much time did it take to go from start to finish with WordPress to live OpenMed system?

A1 – Ross) Probably 2 months calendar time, at 2 days/week developer time. I use 6 plugins and only had to customise one of these. My issue was with curriculum pages – importing lists of resources into a page.

A1 – Phil) In terms of language, WordPress is written in PHP with CSS stylesheets. But most of the time is spent trying out different plugins rather than developing them.

A1 – Ross) Yes, lots of trial and error. But plugins can cause headaches with maintenance. Remember to delete those you

Q2) When you seek a resource, do you look at where else that resource is used? Finding communities of use can help support the use?

A2 – Ross) I’m a bit detached from that part of the process but it would be interesting to look for. It would also be interesting to capture the process of finding and selecting the resources – largely talking to colleagues, Google, contacts etc. We’ve not had a protocol for finding these; we’ve just asked for good resources already in use.

A2 – Phil) I’m waiting for Dan/Learning Registry to do that for me!

A2 – Ross) It would be great to be able to get a Plugin for Learning Registry…

A2 – Dan) Speak to Pat, he’s already working on one!

Phil is rounding off the day with thanks to all attendees, all the speakers, to the organisers of Repository Fringe who made this easy to set up and thanks to all who have been tweeting, engaging online, etc.

The page on the CETIS wiki with the programme information will have the links to presentations etc. in the next week or so.




LiveBlog: DevCSI Hackathon Awards; Closing Keynote

Mahendra is introducing the DevCSI Hackathon Awards by giving an honourable mention to Mark McGillivrey and Jo Walsh for their entries. Here is the first of the three presentations we will see:

Visual Filter for the Learning Registry – Jim Klo

The registry is a network for sharing material across many different repositories – really, anything by anyone. It could be service data, could be a publication, could be from an institution or an individual. This means there is a lot of data – a real challenge to usage and access. The Learning Registry is an infrastructure, not an interface. So, to help communicate this better and show how people could build on top of it, I wanted to build a simple browser interface.

We accept everything so there are no standards – no schema standards! I have taken a pure HTML5, device-agnostic approach: a touch graph for search results.

Topic Modelling – Michael Fourman

This was a spur of the moment 24 hour project. We have taken some of the topics from 3000 items in the research archive here. We wanted to address the issue of creating bridges between people using this. If you know someone writes a lot of papers on chemistry then you can see how their work relates to their peers’. So if we look at each topic we can find the closest 7 people on that topic. You can see and drag connections around. The idea is to browse people and topics seamlessly and explore connections.
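The "closest 7 people on a topic" idea can be sketched as a nearest-neighbour search over per-person topic-weight vectors. This is a hedged illustration only: the hack's actual model and data aren't documented here, and the names, vectors and cosine-similarity choice below are all my assumptions.

```python
import math

def cosine(u, v):
    """Cosine similarity between two topic-weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def closest_people(person, profiles, k=7):
    """Rank the other people by similarity of their topic-weight vectors
    and return the k closest (a sketch, not the hack's actual method)."""
    target = profiles[person]
    others = [(cosine(target, vec), name)
              for name, vec in profiles.items() if name != person]
    return [name for _, name in sorted(others, reverse=True)[:k]]
```

With topic weights derived from a topic model over the archive, calling `closest_people("alice", profiles)` would give the neighbours to draw edges to in the browser.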

Bridges between authors – Pat MacSween, Matthew Taylor, Andrew Day

Dave Millar’s career viewed temporally as a co-author diagram. You can see the increasing breadth of his network. You can click a contact to view co-authored papers, click again to view that researcher’s graph, and go back in time to see their graph as well. This isn’t EPrints dependent – it runs off RDF and any old triple store.

And finally… we are awesome… ! Matt coded the backend and prepared the Pecha Kucha for me this morning so I could code the front end.

So…. the prizes…

So in Third Place is… Jim Klo, he gets the £50 Amazon voucher!

In Second Place is Michael Fourman – they share an Amazon voucher for £150.

And in First Place is Pivot People! They share a prize of a £300 Amazon voucher!

And finally the best idea prize. We had ten ideas submitted, by Robin Rice, Peter Murray Rust, Nicola Osborne, … with a special mention for Jodie Double, who submitted 5 ideas. The winner is Jodie, for an idea which was very simple: how collections – community collections – could be enhanced with content from the community. A lovely idea. We will blog about all the ideas!

Pecha Kucha Prizes

Martin Donnelly has just announced the Day 1 Pecha Kucha Prize goes to Sheila Fraser. The Day 2 Pecha Kucha Prize goes to Mark MacGillvrey.

Closing Keynote -  Prof. Gary Hall (Coventry University)

Firstly huge thanks to Martin, Florance and everyone who has made me feel so welcome over the last few days.

In March this year the Radical Publishing event took place. Despite the title, very little discussion took place about radical notions of publishing, authorship or copyright. Mainly, publishers who publish radical content, rather than using radical business models, used this event to advertise their work. It’s some of these issues I’ll speak about today. I will talk about projects that intersect between art, media and new media – “media gifts” if you will.

Ten items fall into this category. My starting point for thinking about these projects was a focus on the free distribution of research. So Culture Machine – the open access journal I publish. We have opened the Open Humanities Press. There has been much discussion about why academics might publish in open access journals. We have looked over the last few days at how we can make repositories more accessible, more useful, more full of quality content; how we can take advantage of social and mobile media; how we can engage with the public through our research. However we mustn’t lose sight of the open access arguments: the taxpayer’s argument (not to pay twice); the moral argument (that we should circulate our work as widely as possible, particularly in less affluent parts of the world); and that it enables a healthy democratic public sphere. Most people who see Open Access as important will have a mind to one of these political arguments. However, how I think Open Access is most interestingly political is the extent to which it can create an undecidable terrain:

“the political is a decision taken in an undecidable terrain” – Ernesto Laclau.

Cue five minutes of political philosophy!

There are two senses of hegemony: as the leadership or dominance of one class over another – society defines itself by those outside of itself, and this means society cannot be a unified community. It is within this context that stability operates: hegemony provides a stable reinforcement of society, a Them and Us articulation. These are the consequence of temporary, shifting events – and that does mean they can change.

“the political is a decision taken in an undecidable terrain” – Chantal Mouffe

Mouffe sees hegemony as inevitable. I’m not saying that we should not use hegemony, nor that we should attempt to create a chain of equivalence. And we do not live in a post-hegemony world. In an era of Facebook, YouTube and Twitter we are in a default position of joining up with each other – if we only have outsider groups, how do we create new ways of being in society?

So what it means to be political now isn’t something that can be decided in advance, once and for all. It must be about taking decisions as needed, in an undecidable terrain. And it’s opportunities for doing this, using new media to create such decisions, that I’ve been experimenting with. The Culture Machine open access repository we launched in 2003 is just part of this. Look at this piece by Ted Striphas on Taylor and Francis and connections to military and dubious political regimes. We designed a site to draw attention to the open access movement. This isn’t about peer review – fixed processes. We also saw this repository as a way to disarticulate, change and maybe reform the concept of publishing. And this is how Open Access is most interestingly political for me: making it difficult for people to take decisions about their own political, publishing practices.

I’d argue that these projects are thinking beyond Open Access. Currently the Open Access community is actually quite conditional. It may allow unconditional sharing, but that is often to the exclusion of allowing us to ask valid questions about authorship, ownership, etc.

OK, philosophy over!

And now onto CUTV – an IPTV project with Clare Birchall and colleagues. There is a need to invent new, flexible, fluid ways to express intellectual ideas within and beyond the university. We want this to be cheap and easy. We don’t do this because we feel the need to reach out to audiences who are not usually engaging with academic books and journals – we don’t want to be personalities, build our brand, or become Brian Cox. This is why after the first broadcast we moved from individualised forms to more experimental, democratised videos. We want to make an intervention into the academic field, to find a new way of being in the world as academics. And that’s something Clare and I are also trying to do with the Liquid Theory Reader, which came about as a response to a publisher asking us to write a follow-up to New Cultural Studies – they wanted another volume gathering papers by the key people, but we felt that fixed brand was the wrong way to do this. So we are creating a liquid book instead. We gathered text from some of the first volume’s authors, and biographies for others we’d want to include. We have published this online. This allows us to challenge traditional formats and include whole books, video, sound, etc. Publishing a book in this way has allowed us to explore possibilities of new formats and devices. And we could make this book not only open access but also open on a read/write basis, allowing edits, annotations, remixes etc.

Producing this book in a fluid, open style raises questions: what does it do to our notion of the author, of authority, of the concept of the book itself? We endeavour to raise similar issues in our current project, the Living Books Series. We are combining biological and theoretical books that repackage existing open access materials clustered around selected topics. These books are about life and are living objects themselves.

There has already been a radical shift to decentralised authorship over time. One year of social media sees a broader array of authors come forward than 100 years of early book publishing. In 2010 the Guardian ran an experimental network of science blogs – bringing content over in a very decentralised way. For Amy Alexander, the impact of New York Times content might diminish the importance of the publication compared to the experience of the paper on its own terms.

Will decentralised aggregation and editing see a shift in the role of the academic author to editor or publisher? Will publishing in traditional journals lose its importance over time? Or could a more radical shift take place? Really, shifting responsibility from author to editor/aggregator is not so radical. But read/write access offers a further challenge, particularly if authors are not easily identifiable – perhaps not even human, in the era of Google News.

Even more important still is the role of the work itself, with everyone a potential author here. Any attempt to entirely eliminate the role of the author risks placing authority on the work itself (Michel Foucault).

Are the future editors of Zizek going to have to publish his tweets? If not, why not? (Though according to Zizek’s publisher, his Twitter page was run by an impostor.) Books have the capacity to be extremely pluralistic – multi-medium, multi-location objects. A few publishers are exporting and universalising their works. So looking at Michel Foucault’s work – his work was the most cited by authors in the humanities in a 2009 THES chart.

In a 2009 Open Humanities Press talk Ngugi wa Thionh’o described how some languages have higher value for dissemination than others but provide important insights: in 2004 some 90% of the world’s scientific research was done by just 15 countries – a risk of a centre-periphery relationship being perpetuated.

One final project, inspired by film and video art – specifically Anders Weberg’s P2P text, Pirate Philosophy. I shared material for a limited time only and made it available only in the peer-to-peer version. Once downloaded, I deleted my copy, making all copies pirate copies only. What if I did that with this lecture? What of it? How does that affect your notion of authority, of the author, or of the conference?


Q1 – Les Carr) Really pleased to hear about the politics of Open Access. Part of me wonders what you would have thought had we turned your paper into a pirate-only document and deleted your original from your machine. I can’t help but think that open access is broken if it sees authors only as commodities for bitstreams. I think we just skate over that completely, but I would like to see some serious thinking on the politics of eresearch.

A1) As far as I can see you are recording me here. That’s fine, you can cut it up, mash it up etc. Someone once asked me to give a talk and felt weird about putting things online but that’s the price for doing something interesting. The internet creates opportunities for community, thinking differently about community. It gives us a chance to look at new forms of economic circulation and distribution. We talk about academics not using Facebook or Twitter but maybe we can have different ways of gifting and sharing in less commodified ways. That’s why I am experimenting as I am.

Q2 – Ian Stuart) I sort of have a question. Social science people take data, reexamine data and combine it and create new research. That’s close to piracy in some ways. There is a fine line between that activity and pirating. It’s an interesting line that no-one has cleared up.

A2) Until yesterday we were probably mainly pirates – transferring our CDs to our PCs, say. That may be legal now but yes, I am working on piracy and thinking about the new idea of a university. Perhaps having a pirate department. I'm sure some of you already know that 20th Century Fox was based on Fox pirating Henry Talbot's technology and starting a new studio in Los Angeles. None of us is free of piracy. But why do we pick some people and actions as authoritative, and others as piracy?

And finally we move to Martin to wrap up:

So it falls to me to sum up this year’s event for the organisers. Thank you to our sponsors the University of Edinburgh, EDINA, the DCC, OA-RJ, JISC-CETIS and particularly DevCSI.

Thanks to Stuart, Phillip, Nicola, Florance, Both Robins, Ianthe, Clare. To the weather for staying clear yesterday, our caterers, our audio visual support Blue Lizard and all at Informatics who have made us so welcome.

Enjoy the rest of your stay in Edinburgh and we look forward to seeing you next year both here and at OR2012.



LiveBlog: Presentations: Anna Clements & Janet Aucock; Niamh Brennan; Heather Rea; Siobhán Jordan

William Nixon is introducing our afternoon presentations:

Anna Clements & Janet Aucock (St Andrews University) – PURE-OAR Implementation

We started back in the days of CERIF in 2003. In 2005 we set up a link to DSpace. After our experience of the RAE we looked at setting up a PURE CERIF-CRIS – a joint procurement with Aberdeen in 2010. There was a realisation in Scotland that we should work together over the RAE/REF. We launched the Research Portal in 2011, ready to prepare for our REF submission in 2013. And we are still thinking about DSpace and our research data.

We pull in administrative information and we harvest publications, manual input and reference systems; we enter activities and impact; we link to full text, the repository and open access; and these are fed out to the industry/SME interface, HEI information, REF and funding councils, public media, collaborations and research pools. We are working on the eResearch Repository (open access) and on the authority data from the Research Councils' IRIOS. We are also looking at the WoS API from Thomson, which is CERIFied, and working with various JISC-supported REF-related projects.

Our graphs of activity show spikes in deposit – this is where we told our academics to deposit material in time for the Research Portal going live.

Over to Janet:

We have a robust infrastructure and that's a real opportunity. We have a substantial set of publication data, very rich research information, and functionality to add full text in PURE and to send metadata and full text to our DSpace repository. The Research Portal is a great way to raise our visibility. We do still have some drivers here: the REF, Research Council mandates and Open Access. These aren't competing factors but together they engender the support services for research.

Our team is communicating more and more. We engage with training, information and guidelines far more now than we did previously. We have really had to up our game in research support. We are making it visible, and we have to support it. Our research office staff help put information on research support on our research pages, joining that information together in a really constructive way. And the latest team to come on board are our liaison team in the library – we've really joined up the dots of what we have to do.

The portal lets you browse works, researchers, etc. And we have been blogging and taking lots of advantage of the possibilities to highlight our research and engage with our researchers. Recent theses are important for our researchers, and there are news items surfaced this way. And we have a midigraph – a mini monograph. The academic wanted to distribute this via the repository and it really fits into what we want to achieve. And we are hoping to become the distributor of several open access journals in the next year – really building on our infrastructure.

We reached our 1000th full text item in June 2011. In graduation week we took a celebratory picture of depositors and staff.

Niamh Brennan (Trinity College Dublin) – CERIFy

Actually it's not just one person but four! Niamh Brennan (TCD) has been joined by Mahendra Mahey, Stephanie Taylor and Kevin Kiely (TCD). Mahendra is starting us off – he's project manager for the CERIFy project. We want to engage institutions with the CERIF standard. We feel we have a methodology. We have an 8-month project which finishes in September. Aberystwyth University, University of Bath, Queens University Belfast, University of Huddersfield and Thomson Reuters (commercial partner) are all involved. Our philosophy was that institutions care about business processes and making those better. We got institutions to engage with CERIF by articulating their business processes.

We went on site visits to these institutions. We asked them to articulate their research information management processes – process mapping and gap analysis. This found us four priority areas to look at, and it was a fascinating process. We also asked them about duplication, crosswalking etc. We only asked one question about CERIF.

We then had data surgeries where we could drill down to the data level and really engage with CERIF. And we focused on two business processes: Measuring Esteem and Insight Exchange. And we CERIFied the data around these priorities so that it could be seen in a working CRIS system.

Over to Stephanie:

We wanted to put the users at the heart of everything we did. We spoke to everyone we could at these 4 institutions. We captured as much information as we could.

InCites exchange of data – we asked people how this was used. The highlights were: the RAE requirement; comparison with themselves and other institutions. We asked about collecting data: a two-way process between Thomson Reuters and local activity. User issues: there is a lot of effort involved in understanding the data – a big barrier to understanding and using it. The dream scenario would be standard data, nightly updates etc.

Measuring Esteem – personal reviews, promotions and other inward-facing uses were as important as external needs here. Collection of data was hugely varied and ad hoc. Everything was too woolly, making it difficult to provide meaningful data. They dreamed of systematic capture of data, bringing in huge numbers of resources, and personalisation and personalised audit tools brought into RIM tools.

Over to Niamh:

We found such a huge amount…

We used InCites in 2002 to populate our repository and our CRIS. But it's not good enough. The data is unsatisfactory, the process of exchange is unsatisfactory, and the views of the institution it provides can be really problematic. There is a non-standard schema. But you can find materials that are key to the REF. Huge amounts of effort are involved in converting InCites to something standard. Queens University Belfast had already tried to build something and we were able to make this so:

Over to Kevin:

I’m going to talk about data conversion. The CERIF 2008 XML specification is extremely helpful for converting data. Ultimately we ended up with xml we could send to Thomson Reuters who could return the InCites data as CERIF XML with additional requested fields.
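Kevin's conversion step can be sketched in a few lines. This is a toy illustration only: the element names are in the flavour of CERIF 2008 XML (cfResPubl and friends) rather than the full schema, and the record itself is invented:

```python
import xml.etree.ElementTree as ET

def publication_to_cerif(pub):
    """Render one publication record as a CERIF-style XML element.

    The element names follow the flavour of CERIF 2008 XML but are a
    simplified illustration, not the full specification.
    """
    res = ET.Element("cfResPubl")
    ET.SubElement(res, "cfResPublId").text = pub["id"]
    ET.SubElement(res, "cfResPublDate").text = pub["date"]
    title = ET.SubElement(res, "cfTitle", cfLangCode="en")
    title.text = pub["title"]
    return res

# Invented example record.
record = {"id": "tcd-001", "date": "2011-06-01",
          "title": "Open repositories and CERIF"}
xml_out = ET.tostring(publication_to_cerif(record), encoding="unicode")
print(xml_out)
```

In the project the resulting XML went to Thomson Reuters, who returned the InCites data as CERIF XML with the additional requested fields.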

Back to Niamh:

We have a CERIF data model and a data exchange model. We have extended publication types here, we have multiple identifiers, full metrics.

The next step is to ask Queens University Belfast to properly pioneer this approach.

Notes on Esteem and the REF – although the guidelines don't mention esteem, they do say collaboration and contribution to the discipline. We will use our 2008 RAE esteem factors, and everything else we can collect, to feed this reporting. We have models here from the PURE user group.

Issues: Most of the data is not currently available from sources other than narratives or reports supplied by members of academic staff. Where data can be imported from elsewhere it should be.

And now a very very short break….

Heather Rea (Beltane Beacon of Engagement) – Social innovation, research output and engagement

Unfortunately my blog has fallen over so this post will be sorted out later; this portion covered: about Heather; what the Beacons are; outputs; social innovation.

Spectrum of Engagement – I have done a mapping that looks at how academics might contribute along this spectrum: from informing, to consulting, to involving, to delegating. The shape of this diagram is a wedge. That thin end of involved participation is very valuable but it is rare and expensive to achieve. You have to do everything before that to even get to that point. General informing is crucial to get you started.

Where does open data or open scholarship sit here? It's between consulting and involving the public.

We also talk about public engagement. We talk about the general public. No. That's not the right way to think about it. It's groups of publics. It's people with special interests in your work. So you might think about:

– Policy makers – where are they, at what level. They could be local or institutional or they could be national, UK, EU etc.

– Community workers/NGOs – communities of practice who will share their ideas. Also funders – these people look for funding and that's a means of reaching out. And Twitter is a tool that can be useful here.

– Individuals, e.g. patients – in doctors' offices/hospitals, community support groups, online forums, searches.

You have to think about the audiences and how you might actually reach them.

With the NCCPE we have done some work on how engagement can be seen in REF impact. Engagement is not evidence of impact but it IS a pathway to impact.

We have approached our institutions and challenged them to change their culture. They have reached a concordat for engaging the public with research. This is a statement that is clear about the role of the university. Three of our four partners have also signed up to our manifesto “The Engaged University”.

My call to action to you: come to our conference – Engaging Scotland takes place on 20th September 2011. And look at

Siobhán Jordan (ERI) – OpenBIZ – knowledge exchange between HE & Business

We work with universities across Scotland to engage them with companies, particularly small and medium-sized companies. Universities can seem like scary places to companies so we do a lot of face-to-face meetings with companies; it's quite a resource-intensive project. When JISC put a call out for engaging with business it seemed like a great opportunity to pilot online engagement and I will be talking about the work we've done under that call.

Building a Smarter Future: towards a sustainable Scottish solution for the future of higher education (Scottish Government, 2011) really supports this sort of interaction.

Businesses say that "we don't know what we don't know". We were recently working with a small company working on speech technology for stroke victims. The anecdotal evidence was great but there was no clinical evidence. I suggested working with the Synapse group, who look at brain images, giving the business a whole new research area – their business has grown 25% just from the impact of working with the university. It's great for the Scottish economy. And that company is now confident to go forward and work with other universities.

Our role is to overcome challenges to exchanging knowledge in these ways.

The OpenBiz project was to see what could be piloted online in the West of Scotland, where our uptake and connections were quite low. But it was important that we connected with Scottish Business Gateway and others that work every day with our audience.

To date we have worked with about 800 companies and we have taken forward about 400 projects or contracts. The first port of call for companies isn't looking at a publications list. We wanted something accessible and some peer-to-peer interaction. So we started by making a series of short 1 to 3 minute videos. We worked with VidioWiki here at University of Edinburgh. This is quite a distinctive way to use YouTube video to promote what we do.

We also wanted to increase the reach of our events. It's hard and cost-prohibitive to travel from remote areas to our events. But running a webinar in a very active way and capturing immediate feedback and interest has proved very productive. We were able to triple the audience for our events. Our first event was on the day of the crazy snow in Edinburgh – we had a large event online as even those planning to attend in person attended online.

Over to Michael Fourman:

Topic modelling: take a document and look at the words to find the topics. The word distributions are different for different types of documents. Can we simplify these characteristics into a simple set of topics? Well, we can if we have documents which we know are on the same topics. And we can look at what topics explain the variability of materials in a collection – the machine learns about papers with overlapping topics, perhaps.
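Michael's point about word distributions can be illustrated with a toy sketch. Real topic modelling would use something like LDA, but even plain term-frequency vectors and cosine similarity show how documents of different types separate by their word distributions. All the document text here is invented:

```python
import math
from collections import Counter

def term_freq(text):
    """Normalised word-frequency vector for one document."""
    counts = Counter(text.lower().split())
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def cosine(a, b):
    """Cosine similarity between two sparse frequency vectors."""
    dot = sum(a[w] * b.get(w, 0.0) for w in a)
    norm = lambda v: math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm(a) * norm(b))

# Invented mini-corpora: "research" vs "industry" vocabulary.
research = term_freq("repository metadata deposit open access repository")
industry = term_freq("product market customer revenue product")
case_study = term_freq("repository deposit metadata for open access pilots")

# The case study's word distribution is closer to the research corpus.
assert cosine(case_study, research) > cosine(case_study, industry)
```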

I was hoping to be able to do topic modelling of research materials and topic modelling of industrial materials and then use case studies to cross-search these. We didn't have enough case studies to do this so instead we used the topics to create word clouds to give a sense of content.

Back to Siobhan:

We want to challenge businesses to engage and for business and universities to find a common language. Early days but great potential here.

And our last work package in this project is an iPhone application (Interface On) to connect businesses with universities easily.

We have kept a blog of the project – you can find it on the Interface website. We saw this as a great way to expand our contact with businesses. And this has been a unique opportunity for the partner universities to showcase their work to business across a wider area of the West of Scotland.


Q1 – William Nixon) Do you see yourself being more involved in Impact in the future?

A1 – Heather Rea) We see ourselves working more with early stage bids with impact in mind but we’ve moved away from the Impact agenda as such as what we do isn’t directly an impact activity.

Q2- William Nixon) The CERIF project – you said you would be handing work over to Queens – are they ready for this yet?

A2 – Niamh) That’s just part of what we’re doing, taking our own researchers information and exchanging this data in a real world situation.

Q3 – William Nixon) Really interesting form of brokering that you are doing. Any upcoming webinars?

A3 – Siobhan) We have a webinar coming up with our new office in Inverness and we are working on design-led expertise, involving Glasgow School of Art for instance.

Q4 – Ian Stuart) We have a larger meeting next year – Open Repositories 2012. How easy will it be to get businesses along to these sorts of events?

A4) One of the objectives of OpenBiz was to look at connecting research to business. We can try. I know that businesses are interested in searching for material and also social media aspects so any work in that area should interest them for sure.


LiveBlog: Round Tables (Day Two)

After a lovely lunch we are now onto the next round of Round Tables. Those taking place today are:

  • Open Scholarship Principles – Jo Walsh & Mark McGillivray (Open Knowledge Foundation)
  • Mapping the Repository Landscape – Theo Andrew, Peter Burnhill, Sheila Fraser (EDINA)
  • How Repositories are being used for REF & repository advocacy – Helen Muir (QMUC)

I am sitting in on the Mapping the Repository Landscape session at the moment so will record some notes from the session here [to follow].

Brief reports from round tables – facilitator Ianthe Hind
Neil Stewart, City University: How Repositories are being used for REF & repository advocacy
Recognition that the institutional repository could be seen as only for the REF, nudging out everything else – you need to keep Open Access in your advocacy agenda. You also need to avoid REF spikes. There's the perpetual problem of academic engagement – following up on calls, keeping them informed. Keeping allies in the research office is also important but can be tricky when under pressure for the REF. We did talk about citations, Web of Science etc. and the difficulty of scanning coverage in non-hard-sciencey areas. The REF is great for backfilling the repository and making it as complete as possible. And the problem of multiple author affiliations and identities, changes in the names of research groups etc. was discussed. The REF is a massive opportunity for repositories and libraries and it is a real chance to put the repository at the heart of the institution. Repositories should be for Open Access, not necessarily Current Research Information Systems – don't lose sight of Open Access!

Peter Burnhill- Mapping the Repository Landscape
We worked around this graphic, looking at funders, authors and PIs, final copies for deposit and print, and the REF of course. We talked about ways in which grant information could be transmitted with materials, what the role of PIs is, how PIs and authors fit into multiple institutions and the challenges to tracking work there. If you focus too much on funded research activity you might miss out on all the unfunded research that goes on in institutions and is important for Open Access. And we looked at how we deal with traditional literature and where it fits in the wider scholarly comms landscape – you can't include everything, but only looking at the publications carries risks…

Open Scholarship Principles – Mark McGillivray
The group reported online here:
We looked at open scholarship and five areas to aspire towards:
1.    open scholarship is a move beyond open access
2.    it is a commitment to produce scholarly output with the intention of sharing it with the world
3.    open scholarship enables the ideal of scholarship by using currently available tools to the full, for that ideal
4.    when scholarship is open, the creative works of the world will be made freely available to everyone as widely as possible
5.    open scholarship – scholarship for the world


LiveBlog: Pecha Kucha session 2

And we move right on to the next Pecha Kucha session now…

Robbie Ireland & Toby Hanning (Enlighten, Glasgow University) – Glasgow Mini-REF exercise

We will look at the mini-REF exercise we did at Glasgow to see how our repository would work as a selection tool for the REF. Last year we talked about embedding Enlighten into the university research structures; that's now in place. We have learned from the RAE – placing everything in one place ahead of time was clearly going to be important.

We asked 1200 academics to select 4 publications from 2008 onwards, to explain why they selected those, and to approve the appropriate details for the REF. We added a plugin to Enlighten to enable selection, self-rating of the work, and placing items in order of preference. Once the selections had been made the academic was asked to look at the impact and esteem of their work.

As soon as the exercise began we saw a 2000% increase in enquiries. Staff got really engaged in depositing all of their materials. We added 4000 records to Enlighten. We had 700 items selected. It was important that REF information could be extracted and compared (to see if more than one researcher had picked the same item). 90% of participants completed the process online.
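The extract-and-compare step (spotting items picked by more than one researcher) is simple to sketch; the researcher names and item IDs below are invented:

```python
from collections import defaultdict

# Hypothetical REF selections: researcher -> the items they picked.
selections = {
    "smith": ["ep:1001", "ep:1002", "ep:1003", "ep:1004"],
    "jones": ["ep:1002", "ep:2001", "ep:2002", "ep:2003"],
}

# Invert to item -> researchers, then keep items picked more than once.
picked_by = defaultdict(list)
for researcher, items in selections.items():
    for item in items:
        picked_by[item].append(researcher)

clashes = {item: who for item, who in picked_by.items() if len(who) > 1}
print(clashes)  # → {'ep:1002': ['smith', 'jones']}
```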

After the exercise we found improvements that could be made to Enlighten to improve its usefulness for the REF. We have started using Supportworks to track queries about Enlighten. We've also added a Request a Correction form for particular items. We added one new item type to accommodate required items. We have also added MePrints and we want a REF selection widget that tracks selections as part of that too.

So we won't stop; we want to carry on, running a mini REF every 6 months so that we are prepared.

Staff are now better prepared for 2014 REF and there is better awareness of Enlighten and how it is useful to them.

Nicola Osborne (EDINA) – Social media and repositories

That’s me, look out for the presentation and video soon…

Andy Day & Patrick McSweeney (University of Southampton) – Harnessing the power of your institutions research news

Please note that Patrick hasn't seen the slides at all – Andy made the slides so it could be exciting. We work at Southampton; we have a communications department, and you almost certainly do too. They manage the profile of the institution and attract students. We communicate what we do. We do research. These guys write articles, they write blog posts. They are getting much better at sharing their work: one researcher to rule them all. The communications department don't seem to monitor what their people do… so we wrote a tool for finding out what others in the institution are actually doing. It's about building the brand and improving the brand. If you can see what's happening around the campus then you can cherry-pick what's going on.

So we built a web spider over the domain; it builds a database, goes through items and generates keywords – looking for common occurrences etc. to find out what a post is about. And we care about "hot" posts – a hotness metric that looks at relevance and age to give you personalised news. You can put in keywords and it gives you stuff that's current and relevant to your work. So the point is that there is engagement at multiple levels. There is the at-the-desk experience: personalised magazine articles. You wake up in the morning, you look at your email or your personalised magazine on your iPad. It's pretty cool on a personal level but we can give you broader news – news at an institutional level, news at a national level. And we can give you more information about this – we can give you value-add. We can tell you about your own news. We can tell you trends in your news. We can tell you the speed of change. How much are your researchers engaging, how much are they blogging?
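The talk doesn't define the hotness metric itself, but a plausible minimal sketch combines keyword relevance with exponential age decay (the half-life value and function names here are assumptions, not Southampton's actual formula):

```python
import math

def hotness(post_keywords, user_keywords, age_days, half_life_days=7.0):
    """Score a post: keyword overlap with the user's interests,
    decayed exponentially so newer posts rank higher."""
    relevance = len(set(post_keywords) & set(user_keywords))
    decay = math.exp(-math.log(2) * age_days / half_life_days)
    return relevance * decay

user = ["repositories", "metadata", "open-access"]
fresh = hotness(["repositories", "metadata"], user, age_days=1)
stale = hotness(["repositories", "metadata"], user, age_days=30)
assert fresh > stale  # same relevance, but the newer post scores higher
```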

So, future work…

We want to autodetect what you do and what you want.


Q1) hotness

Q2) tweeting bad data

Q3) Informatics work in this area

Dan Needham & Phil Cross (Mimas) – Names Project

We are working with the British Library to identify names in academia and the possibility of a names authority. We started by pulling in RAE, Zetoc, ? and started trying to disambiguate individuals. And as we looked for ways to do that we set up ways to share that data as an API and pull the data out as HTML, MARC, NAMES, JSON, RDF.

Various use cases: using identifiers for paper submissions; publishers using it to track contributors; searches for people; libraries using it for cataloguing.

The next step is to pull in more data – from institutional repositories for instance, look at interoperating with ISNI, ORCID, etc.

Thanks to Brian for being an unwitting participant here!

And now Phil will talk about our work in repositories. We've worked mainly with EPrints. We have been working on a Names plugin for EPrints. The plugin augments name auto-completion via our Names API. One of the problems is disambiguating names here. You can look at fields of interest but you might also be able to look at co-authors, key papers etc. We stick the Names information in the email field but we don't want to overwrite local URIs. We will be demoing this outside all day so do ask questions.
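Phil's suggestion of looking at co-authors can be sketched as a simple heuristic. The records, field names and threshold below are invented for illustration; this is not the Names project's actual algorithm:

```python
def coauthor_overlap(record_a, record_b):
    """Jaccard overlap between the co-author sets of two name records."""
    a, b = set(record_a["coauthors"]), set(record_b["coauthors"])
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

def probably_same_person(record_a, record_b, threshold=0.3):
    """Heuristic: same name string plus substantial shared co-authors."""
    return (record_a["name"] == record_b["name"]
            and coauthor_overlap(record_a, record_b) >= threshold)

# Invented records: two likely the same person, one likely different.
rec1 = {"name": "J. Smith", "coauthors": ["A. Brown", "C. Green", "D. White"]}
rec2 = {"name": "J. Smith", "coauthors": ["A. Brown", "C. Green", "E. Black"]}
rec3 = {"name": "J. Smith", "coauthors": ["X. Zhang", "Y. Li"]}

assert probably_same_person(rec1, rec2)      # shared co-authors: likely same
assert not probably_same_person(rec1, rec3)  # no overlap: likely different
```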

So, future plugins: submit a name from a repository to the Names API to add yourself. We are also looking at possibilities of exporting an RDF graph of the data in a repository – we've written a tool to do that. And we are looking at ways in which you could send us data to generate Names URIs.

Mark McGillivray (CottageLabs) – Open Scholarship perspective

So I am from CottageLabs, also an Edinburgh PhD student, and I have also worked with the Open Knowledge Foundation and JISC before. What do we do when we do scholarship? We learn stuff. We research things. We tell stories. We say why we've done what we've done, what we've done and how we've done it. This is a package of information. We can use technology to distribute our package. Printed pages used to be the best technology for dissemination. We use bibliographic references to stitch our stories together. But we can do more than that now.

So we have reference lists. You don't need a pointer; these can be the pointers themselves. Let's put this together. BibSoup is an idea for doing this. Embed the reference list in your document – including the search, the look around, not just a list at the back. If the data is in your work you can do better stuff too – use d3 and embed it in your own work. So Ben O'Steen did a global visualisation of publications in the world. With an open bibliography we have pointers. We can measure the use of the pointers to show the impact of our work.

Is everything we do perfect? No, we publish what we can, but how do we change the publishing paradigm to reflect that nothing is perfect? Publishing used to be closed. What's holding us back is that academic research sits in a closed revenue system. We need to move to open knowledge. Scholarship is discovering and disseminating ideas. Perhaps open scholarship is doing this by the best possible means.

Never mind "why open our data" – what about "who closed it?!" Why would we want it closed? Let's see what we can do with that data. Scholarship relies on dissemination – it's how new discoveries are made. We are putting up barriers to scholarship. There are some issues around copyright and legalities but come and join our round table later and we can see what we can discuss and find out.

Stephanie Taylor (UKOLN) – Metadata Forum

This is a project that I run. I started working at UKOLN as a research officer working with repositories. The Metadata Forum is run by UKOLN, funded by JISC, and it's a space for everyone that works with metadata in any way, at any level of knowledge or experience. We actually started the Forum at the 2010 Open Repositories Conference in Madrid. We had people from England, Wales, Scotland, Ireland and the USA. We particularly discussed the complexity and simplicity of metadata.

At last year's RepoFringe we ran a round table on metadata for time-based media. We have also tried doing a remote conference with the RSP – an interesting process. We had a Complex Objects session in York, with 25 people despite huge amounts of snow, and we'll be repeating this. We also did a hack event via Dev8D – getting practitioners and developers together via some speed-dating at the start and a developer challenge afterwards. We had some great ideas – more on the blog.

What have we learned in the last year? There are experienced practitioners who don't call themselves experts – that's where the Forum can do great work. We have funding for another year. This will be a more informal, community-led forum.

There is a real gap between novices and experts. It can be like running a group therapy session. We are planning focused meetings on specific types of material – scientific data, music etc. There are potential micro-communities here, for hands-on help and experience.

We are currently working on a Dublin Core workshop – we may trial this online to see if this could work as a format for the future. Please join in and let me know where you'd like to join in and what your problem areas are. We want the community leading this. All our events have been based on suggestions so we welcome your input!


Q1 – Mark Hahnel) About the Names work – if you want to disambiguate individuals, would their username have to be the URI? If you want to have a user in a repository be part of the extra layer.

A1) We can store internal identifiers from repositories and vice versa – various information that can be used. It's a two-way thing really. Us getting data from them will only help us disambiguate authors. I'm not sure if EPrints can hold multiple identifiers but we do have SameAs fields in Names so we can store multiple identifiers here.

And now a change to the programme… Mahendra and the DevCSI hackathon folk will be giving a wee presentation of what they've been up to.

Mahendra Mahey – DevCSI

DevCSI, the Developer Community Supporting Innovation project, encourages developers in higher education. We have been running a developer challenge during Repository Fringe. We already have 5 entries in (the deadline is 3pm). We have another challenge, for the best idea – you don't need to be a developer for that. You just need to tell me or email me:

What we are going to do now is give you a sneak preview of what's been happening so far. A bit like an elevator pitch. First of all…

People Pivot – Patrick, Matt and Andy – all Southampton folk

A spatial and temporal way to browse repositories. Some technical limitations to be fixed in the next few hours. It's about people, connections, people you work with…

Building Bridges between people using Topics – Michael Fourman and Chen ?

A tool to let you wander between people and topics and people….

Mark McGillivray

Been looking at the social side… Looking at Open Biblio data and how to include that data in an embeddable faceted browse of other content. Try it out.


Taking disparate data sources in any schema, any format – hugely difficult to browse and see what's there – so working on a visual browser to explore this huge network. And collating metadata with activity data and social data. And it works!

Name Graph – Jo Walsh

A tool to link data and documents in repositories via people and topics. See more later perhaps.

Mahendra: And a few non-dev folk have submitted good ideas:

Peter Murray-Rust

It's on my blog – create linked open repositories in the UK and show that we can lead the world in terms of providing linked open repositories – it can be done in an afternoon!

Yvonne ?

My idea is about how you create a challenge. There are lots of folk doing stand-up and improvisation. How cool would it be to turn up and come up with ideas via improvisation here – come up with new stuff we haven't done already.

Mahendra: Open Repositories will be here next year. We've been talking about this (an idea from Graham Triggs) and we were thinking that when people register for the event we ask for their biggest challenge in repositories. Then at the welcome we summarise the ideas into groups, with stickers on badges around thematic areas. So we know the participants and their interests and can matchmake.

Michael: We did something similar at Social Innovation Camp here and interestingly the NOT like-minded people formed great teams – a real mixture of people brings great ideas together, so I'd avoid the coloured blobs.

Mahendra: I think we just invite all interested folk to the lounge, and we want that nearer the action so that everyone can easily come and go.

Peter Burnhill: OR2012 will be here. But there is a definite wish to keep the spirit of the Fringe so we intend to keep a strand of Repository Fringe, and we can learn from the Edinburgh Festival and Fringe model.

Jodie ?

Tools to crowdsource and transcribe materials – to throw out material that needs doing. As a tool or plugin.

Mahendra again…

You will see pitches from the winners later today but they won't know what they've won until they've presented.

      So this is Dave Tarrant and gave this presentation at the University of Texas earlier this year and had by far the best reaction. The theme for OR2011 was “show us the future of repositories” so David gave his take on this theme.

      And it’s deposit via Kinect…

      Dave Tarrant (University of Southampton) – MS Kinect & SWORD v2 deposit

      This is  a bit tricky to blog so I’ve videoed it – it’s a process that looks like Minority Report – and there will be pictures but…

Dave did a 2 minute drag and drop of an item into 3 repositories – some running EPrints, one on DSpace – all without using a mouse at all, just using his hands in the air via a Kinect. The metadata is generated automatically and deposit is immediate. This was possible using SWORD2 so could theoretically work on any repository.

      We’ve done various user testing around repositories and we have found that the more metadata you can automatically generate, the more researchers will actually take time to correct it, complete optional fields etc.

      One other demo…

Here is a document in Microsoft Word. You can mark up the title, the abstract, etc. This is standard stuff. However we have built a new widget that lets you add in the SWORD deposit repository location (a URL), providing a simple one-button submission directly from the document. It deposits instantly. But better yet, you can make edits – change the title perhaps – and redeposit in real time (as the same item, just a newer version) just by pressing the update button.
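Both demos sit on top of the same protocol machinery: SWORD v2 creates an item with a POST to a collection IRI and updates it with a PUT to the item’s edit IRI. A minimal sketch of what those requests look like (the endpoint URLs and filename here are hypothetical, and the actual HTTP send is deliberately omitted):

```python
import mimetypes
import os

# Hypothetical collection endpoint -- a real repository advertises its
# collection IRIs in its SWORD service document.
COLLECTION_IRI = "https://repository.example.ac.uk/sword2/collection"

def build_deposit_request(file_path, edit_iri=None, in_progress=False):
    """Build the method, URL and headers for a SWORD v2 deposit.

    A POST to the collection IRI creates a new item; a PUT to an
    existing item's edit IRI replaces its content (the 'update' button
    in the Word demo).
    """
    content_type = mimetypes.guess_type(file_path)[0] or "application/octet-stream"
    headers = {
        "Content-Type": content_type,
        "Content-Disposition": "attachment; filename=%s" % os.path.basename(file_path),
        "In-Progress": "true" if in_progress else "false",
    }
    if edit_iri:
        return "PUT", edit_iri, headers      # update an existing deposit
    return "POST", COLLECTION_IRI, headers   # create a new deposit

# Shape of a first deposit; the HTTP call itself (urllib.request, or the
# sword2 client library) is left out of this sketch.
method, url, headers = build_deposit_request("paper.docx")
```

Because the deposit target is just a URL plus a handful of headers, the same helper serves a Kinect front end, a Word widget, or anything else that can speak HTTP.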

      Both of these projects came out of our project to increase the connections and communication between the repository and the user. That’s the best way to make repositories relevant and easy to use.

      Ian Stuart adds: a lot of this kinect stuff came out of discussion at dev8d and devcsi so the message here is let the geeks play!


Q1 – Les Carr) Is this the normal practice? What’s the message?

      A1) DepositMO is looking at familiar tools – people won’t use things if they have to be trained to do that. The point is to do with the familiarity. We need to get things into the repository and the key to do that is making it simple and intuitive and very quick.

A1 – Mahendra) The point of DevCSI is the central belief that developers aren’t fully appreciated within their organisations and that they can offer a lot in a creative space. And we are trying to enable that creative space and support to innovate.

Q2 – Peter Murray Rust) This is more history. This came out of a project to twiddle molecules around with the Kinect – the university wasn’t happy to fund buying that as…

      A2) Yup, being able to manipulate stuff in 3D requires 3D type actions

      Mahendra: We ran a hack event where someone who is a developer working on chemistry and visualisation software, and he sat next to someone from the BBC. As a result of applying that visualisation to her data she now has a funded project on that.


      LiveBlog: Mark Hahnel – Figshare: Publish All Your Data

      After a refreshing coffee we’re back and Robin Rice of EDINA is introducing our next speaker. All of the work in the Research Data Management strand is about long term cultural change and I think Mark’s approach here is really inspired.

      Mark Hahnel (Imperial College London) – Figshare – Publish All Your Data

      Don’t be mad at me for not having a guitar!

      Basically this is a bit different to the other repositories in terms of what it does. One problem everyone seems to have is incentivising people to upload and share their data. This is about what would incentivise me as someone from a science background.

I was doing a PhD, generating lots of data, charts, graphs, etc. Only a tiny percentage of what I produced will ever be written up, but that other data is useful too. Only that smaller subset will get out there with traditional publication methods. What can I do so that others can use, cite or be aware of the rest? This was the whole idea behind FigShare. This was originally an idea selfishly for myself. It’s built on a MediaWiki base. Others said – well, it’s useful for you but it might be useful to others too…

But why do this? Well, within that data I have tested what x does to y. But I know that 20 other labs may be doing the same research. There is this whole issue of negative data – it’s part of what is broken in the current publishing system. In those 20 labs you can get 19 with negative results and 1 with a false positive, but it’s much easier to publish that one result than those negative ones.

So FigShare comes in here. A very simple set of boxes – I won’t use a repository that I have to be trained in. No one would use Facebook if you needed training for it! And researchers want their data to be visualised – we are working on making that embeddable. Each set of data has a persistent URL (no matter where it is hosted). And this has clickable everything. You can also preview datasets on the page without having to download everything. And a researcher profile automatically collects their work.

And we also have space for videos – again not publishable but they show interesting things. You can link your theses to this permanent URL in the same way. One of the things I have learned is that if you build a platform for scientists they will do their own thing with it. I thought it would be great for disseminating data and finding stuff on Google. Others have said they want feedback on material before publication. People started sharing their research through different outputs. If you click on a person you can pull in an RSS feed of their research. So people have been plugging that RSS into FriendFeed to disseminate their work, and people have given great feedback, questioned methodology and collaborated. You could also plug the RSS feed into a blog as an eLab book.
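The profile feed is just plain RSS, which is why it slots into FriendFeed, blogs and aggregators so easily. A minimal sketch of consuming such a feed with the standard library (the feed content here is invented for illustration; a real FigShare profile feed will have more fields):

```python
import xml.etree.ElementTree as ET

# A tiny RSS 2.0 fragment standing in for a researcher-profile feed.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0"><channel>
  <title>Example researcher outputs</title>
  <item><title>Negative result: x vs y</title>
        <link>https://example.org/item/1</link></item>
  <item><title>Mobilisation dataset</title>
        <link>https://example.org/item/2</link></item>
</channel></rss>"""

def feed_items(rss_text):
    """Return (title, link) pairs for each item in an RSS 2.0 feed."""
    root = ET.fromstring(rss_text)
    return [(item.findtext("title"), item.findtext("link"))
            for item in root.iter("item")]

items = feed_items(SAMPLE_FEED)
```

Any service that can poll a URL and parse XML can re-publish the researcher’s outputs, which is the whole dissemination trick being described.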

And the permanent storage of something online means you can access your research anywhere, which means you can instantly show people what you are working on. In terms of permanence we are working on exports to EndNote and so on. The handles are similar to DOIs. Everything is listed by tags, searches etc. It is discoverable. You can search or browse by anything here. I wanted to do this for selfish reasons. When I started my PhD (on mobilisation of MSCs) my lab had just had a huge paper released, reviewed in Nature, a feature on page 3 of the Guardian. If I search my own work now – which is useful for others – the FigShare copies are the top result, even though it will not be published in a journal. I am happy to see that it is working in terms of discoverability. So the thing about this is that the data is more discoverable, it’s disseminated, it’s available for sharing. We have done all this on a budget of zero and for that reason we are asking researchers to make their data open when they upload it here. The thing about JISC is that they fund these amazing tools and resources but even as an interested researcher I don’t find things out. When I do, I retweet, I get the word out. Retweet everything! Make the most of the amazing stuff that is being built.

In the first few months we had several hundred researchers and 700-ish data sets submitted. Even with 700 objects that’s not great to search. It was suggested that I seed the database. There is an open subset of PMC articles but finding the figures is tough, so this is about breaking figures out of repositories. About a month ago we began parsing the XML files and we have been pulling in about 2-3000 figures per day. About 50,000 figures so far. We should make about half a million figures more discoverable in total in this process. The other thing is that if you publish in an open access journal you may therefore already have a profile and data available.
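PMC open-subset articles are distributed as JATS-style XML, where each figure sits in a `<fig>` element with a label, caption and a `<graphic>` pointing at the image file. A hedged sketch of how such parsing could work (the fragment below is a made-up miniature article, not real PMC output, and FigShare’s actual pipeline is not public):

```python
import xml.etree.ElementTree as ET

XLINK = "{http://www.w3.org/1999/xlink}"

# A tiny JATS-style fragment; real PMC article XML is much larger, but
# figures follow this <fig>/<label>/<caption>/<graphic> shape.
SAMPLE = """<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <body>
    <fig id="f1">
      <label>Figure 1</label>
      <caption><p>Cell counts over time.</p></caption>
      <graphic xlink:href="fig1.jpg"/>
    </fig>
  </body>
</article>"""

def extract_figures(xml_text):
    """Pull (label, caption, image reference) for every figure in an article."""
    figures = []
    for fig in ET.fromstring(xml_text).iter("fig"):
        graphic = fig.find("graphic")
        figures.append({
            "label": fig.findtext("label"),
            "caption": "".join(fig.find("caption").itertext()).strip(),
            "href": graphic.get(XLINK + "href") if graphic is not None else None,
        })
    return figures

figs = extract_figures(SAMPLE)
```

Run over a few thousand articles a day, each extracted figure plus its caption becomes an individually discoverable object, which is exactly the seeding being described.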

      We’ve been looking at what else might be needed…

We were asked to allow grouped files – for projects but also for complex 3D imaging objects. Researchers like to big themselves up, so we have included altmetrics here – allowing new ways to boast about their work. Also graphical representations of page views – in a nice graph it’s quite appealing. And we also provide embed code for adding their data to their theses or papers etc.

So that is the long and short of the features as they stand. And everyone I’ve talked to in science has an opinion – positive or negative. I am really pleased that so many repositories are educating researchers on depositing data and articles and on open access.


Q1 – Les Carr) It’s just amazing what one can accomplish almost as a diversion from one’s PhD. Looking at all these figures from external data sources, the actual data sets – which are so important – you have a handful of dozens of those. Any sense of how you will increase this?

A1) I have an idea that when we’re doing journal clubs and things like that you can use the QR code to look at the figure, see the data, explore further. Some journals require you to upload all of your data. There are projects like Dryad. There are lots of datasets under CC0 – I could pull those in in the same way as we have for the figures but I’d prefer people to upload their own data.

Q2 – Peter Murray-Rust) I think this is fantastic. Have you had any interest from journals about this? For instance I work with BioMedCentral and this would be trivial to link back and forth.

      A2) BioMedCentral have been in touch, mainly as we have been compiling a list of repositories to deposit specialist materials.

Q3 – Robin Rice) If journals and publishers are becoming dependent on figures being there, what do you see as the sustainability model for FigShare?

A3) In the first week of pre-beta a not-for-profit organisation offered to host FigShare indefinitely – at least 3 years – and it has just had funding for at least the next 20 years.


      LiveBlog: JISC Repositories Takeup and Embedding programme project presentations

Welcome to day two of Repository Fringe. We have opened with a little breakfast courtesy of the DevCSI hackathon and we have moved straight into our first session:

      Laurian Williamson, RSP Project Co-ordinator for JISCrte programme – Introduction

Balvier Notay is also giving some additional background: in 2002 we had an exploratory phase of the programme – looking at OAI-PMH – then a building capacity phase starting up repositories, then an enhancement phase, then a rapid innovation phase – SNEEP and MePrints came out of that – and then we had the Reposit project looking at workflows etc.; those are just coming to an end, as are various projects about automatic metadata. The Repositories Take Up and Embedding programme was about getting these developments out there, about embedding repositories in institutions. We are creating a guide to embedding from the RSP and a technical guide from Southampton.
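That early OAI-PMH work is still the plumbing underneath most of these repositories: harvesting is just a handful of URL query parameters plus an XML response. A small illustrative sketch (the endpoint URL is hypothetical, and the sample response is truncated to the bare minimum):

```python
import urllib.parse
import xml.etree.ElementTree as ET

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"

def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build an OAI-PMH ListRecords request URL for a repository."""
    params = {"verb": "ListRecords"}
    if resumption_token:
        # A resumptionToken continues a previous batch and must not be
        # combined with metadataPrefix.
        params["resumptionToken"] = resumption_token
    else:
        params["metadataPrefix"] = metadata_prefix
    return base_url + "?" + urllib.parse.urlencode(params)

# Hypothetical endpoint -- real base URLs are published by each repository.
url = list_records_url("https://repository.example.ac.uk/oai")

# Parsing a (heavily truncated, illustrative) response for identifiers:
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example:1234</identifier></header></record>
  </ListRecords>
</OAI-PMH>"""
ids = [e.text for e in ET.fromstring(SAMPLE_RESPONSE).iter(OAI_NS + "identifier")]
```

The simplicity of this request/response loop is a big part of why the exploratory phase could standardise on it so early.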

      Laurian is back to give some additional framing information. We have been encouraging take up of repositories, embedding, sharing good practice and experience. It’s a JISC funded project and we were working with six very different projects.

      Jackie Wickham (RSP, University of Nottingham) – Overview of Repository Embedding and Integration Guide

We don’t have a guide to show you yet but we will publish it in September. How to embed the repository in the institution, case studies and video interviews are included. There will be a list of tools and apps, trying to pull all of that together. There will also be a self-assessment tool to gauge the level of embeddedness. This will be publicised very widely in the community when it is launched.

      Xiaohong Gao (Middlesex University) – MIRAGE 2011 (developing an embedding visualization toolkit (for 3D images) and a plug-in for uploading queries)

This is mainly a medical images repository. We want to maximise the benefit for the community of this specialist repository. This project is running as part of WIDTH – Warehousing Images in the Digital Hospital – a project with 11 partners. We are also in touch with a German repository of a similar ilk, IRMA, for bone age, and with a Swiss repository, HRCT, for lung images. Also a Greek semantic-based repository, i-SCoRe.

Our repository has been disseminated through three articles, and a paper for the eTELEMED conference won best paper as it explained the technical challenges of delivering 3D visualisations in the repository.

The enlarged project team includes 2 BioMed MSc students doing final dissertations based on the MIRAGE database, and also a PhD student working with us.

      During this project we have enabled 3D visualisations and upload of images, our next stage is to digest 2D/3D movie data. We want to use Grid technology for this work. We also want to undertake some user evaluation and dissemination.

From the students’ point of view the repository has widened both expectations and experience. For the developers this has been a great experience.

Now handing over to Susan(?) for a demo – there are over 2 million images in the database and you can search by image – the system looks at shape, texture, size, etc. A lot of medical images are 2D but can be combined into 3D or 4D images. A limitation for us was that you could see 3D images only as 2D slices. And you could not upload an image as a query image. We have added this via the 3D Brain link – you can view the slices (and page through them) and view a 3D image alongside. And you can now upload an image from the internet to look for a comparison in the collection based on the image. You can also mark results as relevant or non-relevant and search again. Obviously this technique would also work for other types of repositories and searching, and we are happy to talk more about this.
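Search-by-image boils down to turning each image into a feature vector (MIRAGE uses shape, texture and size; the details of its matching are not public) and ranking the collection by similarity to the query’s vector. A toy sketch of the idea using only grayscale histograms and cosine similarity, with flat pixel lists standing in for real images:

```python
from math import sqrt

def histogram(pixels, bins=8, max_value=256):
    """Bin grayscale pixel values (0-255) into a fixed-length histogram."""
    hist = [0] * bins
    width = max_value / bins
    for p in pixels:
        hist[min(int(p / width), bins - 1)] += 1
    return hist

def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (1.0 = identical direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(x * x for x in b))
    return dot / norm if norm else 0.0

def rank_by_similarity(query_pixels, collection):
    """Rank a {name: pixels} collection against a query image's histogram."""
    q = histogram(query_pixels)
    scores = {name: cosine_similarity(q, histogram(px))
              for name, px in collection.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Toy 'images' as flat pixel lists; a dark query should match the dark scan.
collection = {"dark_scan": [10, 20, 30, 15], "bright_scan": [200, 220, 240, 210]}
ranked = rank_by_similarity([12, 25, 28, 18], collection)
```

The relevance-feedback step in the demo fits the same frame: marking results as relevant or not lets the system reweight the query vector before searching again.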

      You can also view random images from collections as a browsing tool.

      Marie-Therese Gramstadt (University of Creative Arts) – eNova (aims to extend the functionality of the MePrint profile tool)

The reason we are doing this project is to improve the take up and embedding of repositories. The take up of repositories in the arts is really low so this is very important to us. This project is funded by JISC and run by the University for the Creative Arts and the University of the Arts London – we worked together before on the Kultur project and we have been engaging with the Kultur2 group.

We wanted to use the MePrints plugin – a profile tool providing pages about depositors for EPrints. We have now installed the plugin on two repositories.

Our approach on the project has been to get feedback from arts researchers. We have done work with 10 researchers on a long-term basis (at least 3 points at which we are in contact). We are getting a lot more into the culture of the institution. The User Needs Analysis involved showing MePrints in use, looking at what researchers are using, and a short survey (6 from UAL, 4 from UCA) – this will be written up as a full report soon.

        At the University for the Creative Arts the profiles are fairly textual. The University of the Arts London includes links to projects and materials in a more visual way.

All staff had experience of staff research profile pages but most were not very experienced with repositories. All wanted a web presence. We looked at some of their personal websites – they liked that there were no limits to their personal spaces online. We looked widely at what could or could not be included. We thought widely but then focused down.

We created visualisations based on feedback and based on what was already in place. MePrints already has a space for a profile image but it would be great to be able to use video here. Rather than go crazy with social media we’ve focused on the key headings and the use of controlled AHRC keywords for research interests – some require customisation to MePrints. The outputs tab includes categories of Publications, Exhibitions and Conferences – hopefully that would be from the repository but we hope to also add a field for material that can be linked to elsewhere – we think this may improve deposit as well. Finally the Gallery tab provides visual highlights of the depositor’s work.

        Alan Cope (De Montfort University) – EXPLORER (create workflows and processes to enable the embedding of the DMU repository within the DMU research systems and processes)

DORA is the De Montfort repository. We had millions of items but a low self-deposit level and little connection to other research systems. EXPLORER had 2 strands: one to develop and implement workflows and processes to enhance and embed DORA within the DMU research environment. We looked via focus groups and questionnaires but will also be looking at the Embed and Enrich projects. Strand 2 was to adapt and integrate tools to enlarge DORA and enable deposit of a wider variety of outputs in line with REF2014, looking at the Kultur, AIR, MePrints and IRRA projects here.

We had 81 survey respondents and conducted three focus groups. That was nearly twice as many responses as expected. Most respondents knew about DORA and most produced outputs as text, but others create music, graphics, photos, etc. We asked what might make people use DORA more and they pointed to the production of statistics and the reuse of data held in DORA – not having to resubmit multiple times. They also suggested ways to improve the process and the look and feel.

We will be creating an updated process map – previously each department had their own guide. We are simplifying it down to one process. We are improving and planning advocacy and will be more proactive with that.

The key technical work for strand 2 is to improve the display of non-text items. We will create a Kultur plugin for DSpace (as Kultur doesn’t work in DSpace). We want images and video that work on the page – video is now working, and we are looking now at other media types.

One big advantage is that the university is to have a new website – we wanted to integrate DORA more closely with this and feed DORA outputs to new individual researcher profiles. And we are currently looking at the functionality from AIR, IRRA and MePrints. We have also been testing the CERIF4REF plugin. It works in DORA but we need to review what is pulled out, test it, and see if it is right for us.

The survey was very successful and provided much information. The technical work has been more complicated than expected but will provide useful functionality for DSpace users on completion. And we are currently documenting the new processes. See:

        Miggie Picton (Northampton University) – NECTAR (implement new tools, procedures, and services to enhance their repository – in readiness for the needs of the REF and EThOS)

This project came out of the fact that NECTAR has been around for a few years, and while it had done well with collecting metadata we had not done so well with getting full text. We had a very mediated process – which our researchers were keen on. We had a bit of a nudge as the theses mandate was about to come in, and the REF is of course also a key and timely driver.

We wanted to modify the university procedures for submission to NECTAR and to make such technical and procedural changes as necessary to ensure NECTAR connects to EThOS. We also wanted to keep up with rebranding. And we wanted to bring in added value from the sector – ways to make it more valuable to repository users, particularly researchers. We also wanted to provide training and advocacy, and to collaborate with colleagues at the RSP.

So far we have rebranded NECTAR to match the current university website. EPrints Services have implemented the Kultur extension on the NECTAR test server – we have used this to get testing done before going live. So we have added things like the scrolling display of images on the homepage and the reformatted item pages which present full content (if available) before metadata. We have also made some changes around the home page – we did have a top 5 papers list. We dropped this mainly because it rarely changed much and that can be discouraging; now it’s a latest additions list. And we’ve made some other minor changes – one user had problems with the search box so we have added a more obvious link to the advanced search.

We are now working on making NECTAR ready to talk to EThOS. We are also working with the Graduate School to gather theses. We have only awarded research degrees since 2005 so we should be able to get a full metadata record for all theses and may be able to get full text from alumni. Procedures for depositing into NECTAR are now part of the research degrees handbooks. We have also altered the data entry process so that researchers can enter their own data, but it is still a mediated process and that’s what our research committee want.

Ongoing advocacy has involved presentations to research groups, school research forums, and work with school NECTAR administrators and academic librarians. NECTAR training is now in the university’s IT training programme. And we are having an Open Access Week event to raise the profile.

Researchers are now notified by email when items are deposited in live NECTAR (with thanks to William Nixon). And the NECTAR bibliographies have been revised in response to researcher feedback about how to present publications etc. The university web team will be including NECTAR bibliographies on staff profile pages. We’d love more ideas like this!

Still to come… changes for the REF via an EPrints plugin, and the promotional campaign for our Open Access Week event.

        But on the wish list: we want improved statistics on usage and dissemination to users; integration of NECTAR with a university CRIS – and our research office would be all for that. And I’d like more staff for NECTAR!

        Chris Awre (Univ of Hull) – implement the Hydrangea software

I will be talking about Hydra in Hull. We are a collaborative project between the University of Hull, the University of Virginia, Stanford University, Fedora Commons/DuraSpace and MediaShelf LLC. This was an unfunded project for a reusable framework for multipurpose, multifunction, multi-institutional repository-enabled solutions. The solution is modelled to be useful to our own and others’ repositories. We initially set ourselves a three-year time frame from 2008-11 but we have agreed to work together indefinitely.

The Hydra in Hull project was funded to apply these solutions at the University of Hull. We are working with MediaShelf as a technology partner and their role as part of this project is to do a lot of the implementation and a lot of knowledge transfer. I am pleased that our developer and a colleague have picked up and loved this work, so we know we’ll be in a good place moving forward.

The project had three phases – a read-only interface; ingest and metadata edit functionality launched in June; full CRUD capability for some genres should be done by September and we hope to replace our interface by then as well.

The idea is to create a flexible interface over a repository that allows for the management of different types of content in the same repository – the end user has a single place for all materials and we think that supports embedding. And we think this encourages take up through our flexible development of end-user interfaces, where these are designed for the users according to content types – with separate management interfaces for repository staff. Hydra provides a framework to support adaptability.

There are four key capabilities: support for any kind of record or metadata; object-specific behaviours (e.g. books, images, music, video etc.); tailored views for domain- or discipline-specific materials; and being easy to augment and adapt over time.

We have developed partnerships from the outset as we think that’s crucial to the sustainability of the project. We don’t want too formal an agreement here, but partners must feel comfortable with expectations, sharing of code, etc. We are trying to establish a sustainable community around Hydra. For Hull specifically we are providing a UK reference implementation, creating a local knowledge base for others to tap into, and a place to start building a UK or European community.

Hydra has developed guidelines around the organisation and structure of content – though the guidelines have wider applicability. Hydra runs on Fedora 3.x with a range of additional technology.

Hydrangea was Hydra’s initial reference beta implementation. It is now deprecated but played its part. But we have for all our code. And here I shall borrow from a Prezi from Open Repositories to explain the technologies. We use Blacklight as a next generation library interface, but it is content aware and metadata agnostic, and it has a strong community around it.

Why these technologies? Well, we all use Fedora. Solr is very powerful. Blacklight was in use at Virginia and now at Stanford/JHU. And Ruby allows for very agile programming approaches.

Hydra in Hull creates records that look fairly traditional, though we couldn’t resist including a QR code. Usability testing went well, with lots of advice on improvements as well. We only had a few people but they provided great feedback. See

          Robin Burgess (Glasgow School of Art) – Enhancement of the design of their interface

Glasgow School of Art has no public-facing repository yet, just an inward-facing tool, so this is quite an important project for them.

You can see I have a guitar. This being the Edinburgh Fringe, I thought I’d give my presentation in the form of song!

So we embarked on a project with JISC; we were very new to it but we weren’t afraid as we’d done our research… what we had found was that there was plenty of help around. We are building a repository, something new to GSA. It still doesn’t have a name. We looked at different systems – DPrints, PURE, EPrints, all quite similar. We decided to invest in DPrints.

Requirements building is the next step, with some help from EPrints and Kultivate. We hope to develop an integrated system to help showcase the work at GSA. We have to move from FileMaker Pro to something new with a much better interface. We are building RADAR, our new repository.


Laurian adds a pre-Q&A note that it is so important to disseminate this work and we are so keen to do this!


Q1 – Peter Murray Rust) How many of your repositories can be exported as RDF?

          A1) A few hands raised – from Northumbria, Hull, eNova

Q2 – Les Carr) As a software engineer, if it were in the gift of the people who build repositories to do one thing to make your life easier, what would you ask for? What have been your biggest problems?

A2 – Northumbria) Magically change copyright law so that everything can be uploaded.

A2 – eNova) It’s not really been about the technology; it’s the culture that is such a challenge.

Q2) So Open Access Rohypnol?

A2 – Hull) An easily tweakable interface that can be adjusted according to need. The ability to change things easily.



          LiveBlog: Final Notes on Day 1 – Networking, Hackathon

          We are moving to the Roof Terrace!

But from the hackathon we already have something useful for you: a post from Mark MacGillivray showing his and Richard Jones’ entry for the RepoFringe Developer Challenge. More info can be found in his blog post.

          “This allows you to find out what is on at the fringe, whilst also checking what is on at RepoFringe, by embedding a search of the Fringe catalogue right here in the RepoFringe website! (May only work in Firefox tho…)”