New blog for the Jisc Publications Router

The latest phase of the projects documented in this blog has moved to a new blog.


Our new blog will be used to outline the developments and benefits of the Jisc Publications Router service. It begins with an introductory post that includes links to the service page and information on interacting with the Router.

The Publications Router is a free-to-use standalone middleware tool that automates the delivery of research publications from data suppliers to institutional repositories. The Router extracts authors’ affiliations from the metadata provided to determine appropriate target repositories, then transfers the publications to those repositories that are registered with the service. In the increasingly collaborative world of research publications, a single research output is often recorded separately by each institution involved; the Router offers a solution to this duplication of effort. It is intended to minimise effort on behalf of potential depositors while maximising the distribution and exposure of research outputs.
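The routing idea described above can be illustrated with a minimal sketch. It is not the Router's actual implementation: the `Publication` shape, the registry contents and the simple substring matching are all assumptions made purely for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Publication:
    """A publication record from a data supplier (hypothetical shape)."""
    title: str
    author_affiliations: List[str] = field(default_factory=list)


# Hypothetical registry: repository name -> institution names it matches on.
REGISTERED_REPOSITORIES: Dict[str, List[str]] = {
    "Example University Repository": ["example university"],
    "Sample Institute Archive": ["sample institute"],
}


def target_repositories(publication, registry=REGISTERED_REPOSITORIES):
    """Return registered repositories whose institution appears in any affiliation."""
    targets = []
    for repository, patterns in registry.items():
        for affiliation in publication.author_affiliations:
            text = affiliation.lower()
            if any(pattern in text for pattern in patterns):
                targets.append(repository)
                break  # one matching author is enough for this repository
    return targets


paper = Publication(
    "A collaborative paper",
    [
        "School of Physics, Example University",
        "Sample Institute, Edinburgh",
        "Unregistered College",
    ],
)
print(target_repositories(paper))
```

A single multi-institution paper is matched to every registered repository with an affiliated author, so no depositor has to record it by hand at each institution. Affiliations with no registered repository are simply ignored in this sketch.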

The Router has its origins in the Open Access Repository Junction project. A brief recap of the various stages of evolution can be found in a post on the history of the project.

If you wish to find out more about the service the Router offers please see the about page.

Developer Challenge: The Results

The winners of the developer challenge were announced during the Show & Tell Session just before the closing keynote.

The top prize went to Russell Boyatt for his Preserving a MOOC toolkit. This idea fits very well with the preservation theme of this developer challenge. As universities are putting more resources into deploying MOOCs, it is very appropriate that capturing the social interaction generated by students and their tutors should become a priority in order to enable future analysis, feedback and validation of MOOCs. This hack was therefore very timely and inspired our judges. It provided them with a take-home message: let’s do more to save MOOC interaction data, and we must do it now!

There were two runners-up:

  • The Image Liberation Team, made up of Peter Murray-Rust and Cesare Bellini, whose hack overlays the license type on top of an image.
  • ePrints plug-in for image copyright by Chris Gutteridge, which adds a license and copyright to an image in ePrints.

There were two additional entries:

  • The Preservation Toolkit from Patrick McSweeney which provides a webservice for file format conversion.
  • The Metadata Creator from Richard Wincewicz which extracts the metadata embedded in PDF files.

These last four hacks are all about improving metadata, its quality and ease of capture. This gives a strong signal as to what is a major concern for repositories and their users.

These were all very interesting and exciting hacks! It was a challenge in itself for the judges to reach a decision and award the prizes. They had to take into account the novelty, relevance and potential of each idea and balance these against the amount of code produced during Repository Fringe. Not an easy task! Thanks again to our four judges, Paul Walk, Bea Alex, Stuart Lewis and Padmini Ray-Murray, for their excellent job!

What struck me most during the 24 hours of the challenge is that most developers were happy to enter a hack but didn’t want to win! Maybe it was the lack of time to dedicate to the coding due to the Repository Fringe sessions running in parallel: ‘it’s not fully working yet’. Maybe it was that some of them had won previous challenges: ‘been there, done it and got the T-shirt’. Maybe it was the lack of a new generation of coders to compete with: ‘where is the new blood?’. Maybe prizes are not the main motivation.

The feeling was that the challenge should come from the questions to be answered rather than the competition with other developers. There was a demand for a different type of event where developers could work together to solve problems that would be set as goals. This would provide a chance for developers to collaborate, learn from each other and code solutions to important and current issues. The opportunity to learn and demonstrate their skills seems more valuable to the developers than prize money. It is more important to have fun, meet other people and build a developer community. Back to basics! I couldn’t agree more.



RSP Webinar on RJ Broker: Automating Delivery of Research Output to Repositories

On Wednesday 29th May 2013, Muriel Mewissen presented a Webinar on the Repository Junction Broker (RJ Broker) for the Repository Support Project (RSP).

This presentation discusses:

  • the need for a broker to automate the delivery of research output to Institutional Repositories
  • the development of the middleware tool for RJ Broker
  • the data deposit trials involving a publisher (Nature Publishing Group) and a subject repository (Europe PubMed Central) which have recently taken place
  • what the future holds for the RJ Broker.

A recording of the webinar and the presentation slides are available on the RSP website.

Report on the Will’s World Online Hack Event (Part 5/5) – Legacy

This post is the fifth and final part of our reflections on the organisation of an online hack for the Will’s World project. The first post looked at the planning, the second post at the promotion, the third post at the format of the event and the fourth post at how things unfolded during the hack. In this last post we look back over the hack and reflect on the experience.

All’s Well That Ends Well

Following the event we sent out an email summarising the final day to the mailing list and ensured that a blog post announced our winners. We also tried to ensure we linked to and acknowledged posts about the event – several of our participants have written about (or planned to write about) their hack experience, see Owen Stephens’ post  Shakespeare as you like it.

As the Will’s World Project drew to an end with the hack event, any potential follow-up time has been limited, although connections were made between data suppliers and hack participants where there was interest in taking ideas forward. The mailing list and Google+ Community remain available to allow on-going collaboration.

This Will’s World Online Hack felt a lot like a roller coaster ride. The planning of the event was the sharp ascent, with a lot to learn, organise and set up in a short time frame. Then came the exhilarating ride that was the event itself, with its share of excitement and fear at the unknown twists and turns, and the hack presentations provided a smooth and happy finale. It was very enjoyable! I would do it again and urge others to do so!

Now that the daze has settled, we can reflect on what we have achieved.


The use of social media as the main support for communication helped in creating a well-documented trail of the event. The event wiki, project blog, YouTube channel, Pinterest boards, Twitter and Google+ feeds were used throughout to broadcast the event and provide a catch-up facility to the participants. They remain available after the event and act as an account of the hack. The Shakespeare Registry itself is an Open Access resource fully available to all at:

The Will’s World Hack YouTube channel has recorded 473 views spread over 15 videos:

| Video | 16 Nov 12 – 16 Dec 12 | 17 Dec 12 – 4 Feb 13 | Total |
|---|---|---|---|
| Event Introduction | 174 | 3 | 177 |
| Data Introduction | 49 | 1 | 50 |
| Winning hack presentation | 35 | 9 | 44 |
| Opening session | 29 | | 29 |
| Closing session | 25 | 9 | 34 |
| Day 1 2nd hangout | 21 | | 21 |
| Day 7 hangout | 20 | 1 | 21 |
| Day 2 hangout | 17 | | 17 |
| Day 3 1st hangout | 16 | | 16 |
| Day 3 2nd hangout | 15 | | 15 |
| Day 5 hangout | 14 | | 14 |
| Day 4 hangout | 6 | | 6 |
| Infographic hack presentation | 3 | 2 | 5 |
| Prize giving session | 2 | | 2 |
| Day 6 hangout | 1 | | 1 |

We encouraged all participants to share their experience of the hack:


We used a Google Form to capture feedback, and you told us that:

  • The use of social media was great. Twitter had the lowest rating, probably because not all participants engaged with it. Twitter works best for immediate chat, which didn’t work well with people taking part at different times.
  • You liked the friendly spirit of the hack and the opportunity to collaborate with people outside your field of work.
  • You liked the XML plays.
  • You would have liked a bit more direction and suggestions on what to do.
  • You would definitely consider taking part in an online hack event.
  • You have no suggestions on what would improve our online hack format! (Don’t worry, we do!)

Top Nine Outcomes

The main benefits of the Will’s World Online Hack were:

  1. To promote the Shakespeare Registry to the developer community within and beyond the UK academic sector. The dissemination efforts surrounding the event have reached a very large audience from both the arts and technology worlds.
  2. To validate the outcome of the Will’s World project in terms of ‘aggregation as tactic’ and discovery principles. The submitted hacks provide specific use cases for such aggregations that may help inform other aggregation and registry projects.
  3. To promote the online resources and services listed in the Shakespeare Registry. It was very encouraging to see people contributing data and pointing to other relevant projects and resources.
  4. To evaluate a new format for a hack event and share reflections on the challenges and success of using such a format. The main success was the use of enabling technologies which supported the flexibility we wanted for the event. The biggest challenge was team building.
  5. To experiment with new combinations of social media and technologies as a primary channel for short term collaborative events.  Google+ Hangout was definitely the highlight here.
  6. To encourage Open Access. All the resources provided for the hack are freely available. The event itself was shared in as many ways as possible. The participants were encouraged to share their code, ideas and experiences. Several hacks had their code published on GitHub and on personal blogs.
  7. To create rich networking opportunities for professional and amateur developers, Shakespeare scholars, cultural organisations, and interested others both from across the UK academic sector and a broader international community.
  8. To seed potential new developments. It is clear that several hacks have great potential to be turned into fully fledged applications or to serve as enabling technology for future projects.
  9. A range of fantastic hacks using the data provided in very different ways.

A Few Improvements

If we were to organise another online hack, what would we do differently?

  • A different topic: Shakespeare is a very popular subject with the advantage that there is a large audience for it, but it can be rather over-covered. It was a challenge in itself to find a focus that was novel and different for each individual hack.
  • More time: participants need more time to familiarise themselves with the data ahead of the event, and people need longer notice to register.
  • More effort: We had been warned it was going to be hard work! It was. The planning wasn’t much different to an in-person event. The broadcasting of the event required a lot more rigour and effort in terms of communication.
  • More structure: Too much flexibility can be confusing and create a lot of work. For example, having fewer check-in sessions and reducing the number of social media used would make for a more manageable event. Compulsory attendance at the first session may also boost participants’ active involvement and collaboration opportunities.
  • Fewer communication channels: being more selective would allow effort to be focussed on a selected set of preferred channels and avoid information, questions and collaboration opportunities being missed.
  • Different dates: Mid December is not conducive to much except Christmas shopping!

What next?

We hope to continue to inspire other projects to use our data or stage their own hack event. We will share our experience of setting up an online hack format with others: a presentation was made at the University of Edinburgh MSc in E-Learning Alumni Seminar, Virtual University of Edinburgh, Second Life, on 20 February 2013; an article has been written for the next issue of BITS, the digital magazine for Information Services at the University of Edinburgh; and guest posts for other organisations (RSC, OKF) to share with their audiences have been sought. We will keep contributing the Will’s World Shakespeare Registry to future hackathons; the first such event will be the Innovative Learning Week Hack held by the School of Informatics at the University of Edinburgh on 18-22 February 2013.

We still have a few branded goodies to give away, which will be awarded for use of the Shakespeare Registry in future hack events.


Report on the Will’s World Online Hack Event (Part 4/5) – Hacking

This post is the fourth part of our reflections on the organisation of an online hack for the Will’s World project. The first post looked at the planning, the second post at the promotion and the third post at the format of the event. Today’s post focuses on what happened during the hack – how we communicated, what hacking took place and what excited our judges…


A lot of thought and effort was put into communication during the event. Email, IRC and social media technologies were used to keep in touch with the participants, in an effort to reach them in their favourite forum.

In particular we provided:

  • A daily reminder of the Hangout sessions, with connection information, was posted on Google+ and tweeted.
  • A video of each live session was posted on YouTube straight after the end of the session.
  • A daily summary of the hangout, news and progress of the day was emailed to the participant mailing list, posted on the project blog and added to the event wiki. These summaries included links to the videos recorded that day and made available on YouTube: Day 1, Day 2, Day 3, Day 4, Day 5, Day 6, Day 7, Presentations and Prizes.
  • The IRC channel, Twitter account and event email were monitored throughout to answer any query and moderate discussion.
  • Personal contact was made by email with all registered participants who did not actively participate during the first couple of days to check that there were no access or technical issues and to encourage them to be more actively involved.
  • The team posted photos of their hack space on Pinterest to encourage participants to share their own setups.

In addition, the wiki required regular updates about organisational aspects during the event itself, for example, to add details of participants who registered during the hack, announce the prizes and members of the judging panel, list the registered hacks and document the latest additions to the Registry.

The Google+ Community for the Will’s World Hack

Keeping an active presence on so many social media sites proved time-consuming but did enable participants who may have missed some of the daily developments to catch up at a later time. These project updates, comments and shared materials also provide a very well documented record of the interactions which took place during the project.

At an in-person event participants typically communicate their own progress with the group – often through ad hoc check-ins. The online and somewhat asynchronous nature of this event meant that the Will’s World team frequently acted as collector and curator of progress, updates, calls for help, etc. so that these could be shared with participants. In retrospect, this also had a significant impact on the time required to support the hack.


As with in-person hacks, some participants were effective and active communicators throughout the week whilst others worked away in the background, sharing their hack at the very end of the event. Like traditional events, not all of those who registered chose to turn up or to actively participate – although this was a minority of those registered for the hackathon, several of whom contacted us to indicate last minute changes in schedules and commitments.

Although a great deal of discussion did take place amongst participants, and between the participants and the project team, it was somewhat disappointing that only one team was successfully formed. Individual hacks are of course very common, even at in-person hackathons where they often outnumber team hacks, but we had hoped for more collaboration between participants. It seems that social media did not, for our participants, adequately replace the informal face-to-face interaction needed for people to connect, build teams, overcome their inhibitions and offer or seek skills. It may also be that people attracted to an online hack were more interested in the technological aspects of the hack and/or in the data itself, whilst those attracted to in-person events may be keener to work collaboratively because they see the event as a social occasion or an opportunity to meet and learn from others.

Several academics and literature specialists got in touch during the feasibility survey and we were hoping that some of them would participate in the hack event, sharing their expertise with participants whose expertise was more related to technology and coding. The online format of the event and emphasis on social media may have been a factor in the subject experts choosing not to take an active part in the event, although they may also not have been clear on how to take part. Joining an in-person hack is certainly less technically demanding – you can just show up and begin to find a role without having to engage with multiple logins etc. However, even at in-person hack events, engaging subject experts and articulating the value and role of these non-developer participants can be challenging.

Prizes and Judges

During the planning of the event we knew that we wanted the jury to be external to the project and include representatives both from the cultural and Shakespeare world, and from the technology and developer community, to provide a balanced judgement on what would make a good hack for the Will’s World project.

We contacted the British Museum and the Royal Shakespeare Company (RSC). Both had held significant Shakespeare celebrations in the past year: the British Museum held the exhibition Shakespeare: Staging the World over the summer, and the RSC held the World Shakespeare Festival between April and September 2012 as part of the London 2012 Festival, the culmination of the Cultural Olympiad. For the technical judges, we turned to established hacking organisations: Culture Hack Scotland and Developer Community Supporting Innovation (DevCSI).

Our jury consisted of:

  • Sarah Ellis from the Royal Shakespeare Company
  • Erin Maguire from Culture Hack Scotland
  • Mahendra Mahey from DevCSI

Unfortunately, a last minute diary conflict meant that our British Museum judge was unable to join the final presentation session.

We had a generous £1,000 in prize money available to reward the efforts of our hackers. Amazon vouchers were chosen for their universal appeal, their availability in various currencies which could be chosen according to the location of the winners, and their ideal fit with the online format as they can be emailed to the winners.

Five prize categories were identified:

  • Best Set Up (£50) to reward photos of participants’ hack environments.
  • Best Presentation (£100) to recognise the most engaging and effective communication of participants’ hacks/ideas.
  • Best Shakespeare Hack (£100) to reward the best hack “in the spirit of Shakespeare”.
  • Best Open Hack (£250) to reward the best hack for Open Access (open source and open data).
  • Best Overall Hack (£500)

In addition, the RSC kindly gifted an amazing Shakespeare goodie bag to each winner full of lovely bard-related collectibles.

Royal Shakespeare Company stamps and First Day Cover, part of the goodie bag provided by the RSC.

Mahendra Mahey from DevCSI was very helpful in sharing his experience of organising and judging hackathons ahead of the event. A set of rules for submitting hacks was drawn up and made available on the wiki. These were very useful in clarifying the scope of the hack, in particular the fact that concepts, ideas and demonstrators were valid entries; the hack was not only about creating prototypes or applications. These rules also stated deadlines for the submission of titles, hacks and presentations in an attempt to encourage hackers to register their intentions to submit early. However, these were not adhered to and there were last minute submissions – not too unexpected from a community known for working late at night and right up to the deadline!


Following much anticipation and a couple of last-minute entries, the line-up of submitted hacks was impressive. We had nine entries ranging from the concept to simulated application, from the very technical to the fun and visual. The full list of submitted hacks is available on the wiki. Participants were asked to either present their hack live during the closing session on Google+ Hangout or to submit a pre-recorded presentation. There were six live presentations, two pre-recorded presentations and one by proxy.

Google+ Hangout worked amazingly well for the presentations. On top of the live videos of the participants, it allows screen sharing which meant presenters could easily switch between talking, slide shows and software demonstrations. Showing pre-recorded videos was (theoretically) similarly easy but we encountered some screen freezing issues with one of the larger presentations.

The quality and diversity of the hacks meant the jury had their work cut out for them. After an hour of deliberation, the panel had agreed on the winners, and the prizes were awarded in the last Hangout session. The winners were:

  • Best Set Up – Neil Mayo
  • Best Presentation – Kate Ho & Tom Salyers
  • Best Shakespeare Hack – Richard Wincewicz
  • Best Open Hack – Owen Stephens
  • Best Overall Hack – Kate Ho & Tom Salyers

More details on the winning hacks can be found in this post.

We were hugely impressed with the amount of work that had taken place, and with the imagination and quality of the hacks that were produced during the week. Some of the hacks did not, however, specifically fit into any of the prize categories and it was felt that an additional category such as “best technical hack” or “hack with the most potential” would have helped in rewarding extremely valuable efforts.


Report on the Will’s World Online Hack Event (Part 3/5) – Event

Ready, steady, hack!

This post is the third part of our reflections on the organisation of an online hack for the Will’s World project. The first post looked at the planning and the second post at the promotion for the event.

After 8 weeks of planning and much anticipation, the launch day of Will’s World Online hack finally arrived!

Format and hangouts

The format selected was a week-long event starting with a live, interactive opening session to introduce the data, the goal of the event, prize categories, social media tools and technologies, participants and hack ideas. A similar live session closed the event with the presentation of the hacks and prizes. Additional daily sessions were scheduled during the week to foster regular interaction and, we hoped, collaboration between the participants.

Google+ Hangout was used for all live sessions. It is a free video conferencing tool allowing up to 9 participants to join a call, with optional live streaming and archiving to YouTube. This live-streaming and archiving functionality was a key factor in our selection of this tool, as it enables an unlimited number of people to view the meeting live on YouTube without actively taking part, and creates a very accessible and sharable copy after the event. Most other video conferencing software we considered, such as Skype, was less flexible for streaming or archiving.

The recording facility ensured that the video of each session was available to view on YouTube straight after the end of the meeting. We were keen to make use of this for two reasons, first, to promote flexibility and to give the opportunity to participants who couldn’t make the meeting to catch up at a later time; and second, to document the hack and build a library of videos that captured the event as it took place.

Most participants already had a Google email address that they could use to access Google+ but few of them had used the Hangout facility. However, it is easy to set up, only requiring the installation of a browser plug-in, and was relatively easy to use both from a meeting organiser and participant point of view. Most participants were able to join the meeting simply by following the step-by-step instructions circulated ahead of the event. A couple of help enquiries were received but quickly solved. It should also be noted that where we did encounter teething issues with Google+ our participants were very patient and forgiving of issues around our experimentation with these technologies.

The schedule of the opening session was chosen to enable as many participants as possible to join. Most participants were located in the UK but a few were based in mainland Europe and the USA. We chose to hold the session at 1pm (GMT), or 2pm (CET), 8am (EST), 7am (CST) and 5am (PST), to enable people participating in their own time to join either during lunch time or before starting work. Live sessions were not compulsory and no prior registration was required to join a session, which meant that we had no indication of how many people would attend. This added to the anticipation – particularly on the first day! The opening session saw four participants actively joining the hangout and one watching, in addition to the four project team members.

During this session we presented an introduction to both the Will’s World Project and the Shakespeare Registry to be used during the hack. Participants were encouraged to introduce themselves and put forward ideas. Participants were invited to hack at any time that suited them during the week. They were also encouraged to form teams and to use the wiki, Twitter and the mailing list to advertise for wanted and offered skills.

To ensure all participants were given the best start on the first day of the hack, we held a second live session at 5pm (GMT), or 6pm (CET), 12 noon (EST), 11am (CST) and 9am (PST). It covered the same practical information as the opening session and offered an alternative time for people unable to attend the earlier one. This second session was attended by one active participant and one additional viewer.
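The local times quoted for the two first-day sessions can be checked with a short sketch. It assumes fixed standard-time (winter) offsets from GMT, which is appropriate for a December event but ignores daylight saving; the calendar date used is only a placeholder, not the actual date of the hack.

```python
from datetime import datetime, timedelta, timezone

# Assumed standard-time offsets from GMT, in hours (no daylight saving).
WINTER_OFFSETS = {"GMT": 0, "CET": 1, "EST": -5, "CST": -6, "PST": -8}


def local_times(hour_gmt):
    """Convert a session start hour in GMT to a local clock time per zone."""
    # Placeholder date: with fixed offsets the day itself does not matter.
    start = datetime(2012, 12, 1, hour_gmt, tzinfo=timezone.utc)
    return {
        zone: start.astimezone(timezone(timedelta(hours=offset))).strftime("%H:%M")
        for zone, offset in WINTER_OFFSETS.items()
    }


for label, hour in [("Opening session", 13), ("Second session", 17)]:
    print(label, local_times(hour))
```

Both sets of times quoted in the posts agree with this calculation: 1pm GMT falls at 5am on the US west coast, while the 5pm session falls at a more comfortable 9am there.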

Further daily hangouts were planned during the week to provide regular drop-in sessions where participants could raise issues and queries and discuss ideas and the progress of their hacks, and where the project team could provide updates, help or feedback on the Shakespeare Registry. These sessions were planned for 1pm (GMT) every day. We chose to hold these at the same time every day to make it easier to remember and provide consistency. Although we considered holding the sessions at a different time every day to cater for different working patterns and time zones, we decided against this to avoid participants having to remember a complex schedule. Instead, we offered to move the daily session to any other time suggested by the participants and to hold additional sessions on demand at any specific time. Participants were happy with the 1pm sessions and no other time or additional sessions were requested. On Friday, ahead of the weekend and what was likely to be a busy hacking time, we held one additional session at 5pm to support the participants.

On the final day, a closing session was planned for the presentation of the hacks and the awarding of the prizes. Following the presentations, the judges left the hangout session to join a separate, private session to deliberate while the participants were able to share their experience of the hack in the main session. The quality of the hacks was impressive and the jury took slightly longer than planned to decide the prizes. Instead of keeping the original hangout live, a separate hangout session was started to announce the prizes.

The turnouts for the Google+ Hangout sessions were:

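| Session | Team Members | Active Participants | Viewers | Total |
|---|---|---|---|---|
| Opening session | 4 | 4 | 1 | 9 |
| Day 1, 5pm | 4 | 1 | 1 | 6 |
| Day 2, 1pm | 4 | 1 | 3 | 8 |
| Day 3, 1pm | 4 | 0 | 1 | 5 |
| Day 3, 5pm | 3 | 1 | 0 | 4 |
| Day 4, 5pm | 2 | 0 | 0 | 2 |
| Day 5, 1pm | 2 | 3 | 1 | 6 |
| Day 6, 1pm | 4 | 3 | 1 | 8 |
| Day 7, 1pm | 4 | 3 | 1 | 8 |
| Hack Presentations | 4 | 8 | 4 | 16 |
| Prize giving | 4 | 4 | 1 | 9 |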
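As a quick sanity check on the turnout figures, the sketch below re-enters the rows by hand, confirms that each row's total equals the sum of its three columns, and picks out sessions that nobody outside the project team attended. The function names are illustrative only.

```python
# (session, team members, active participants, viewers, total) from the table.
TURNOUT = [
    ("Opening session", 4, 4, 1, 9),
    ("Day 1, 5pm", 4, 1, 1, 6),
    ("Day 2, 1pm", 4, 1, 3, 8),
    ("Day 3, 1pm", 4, 0, 1, 5),
    ("Day 3, 5pm", 3, 1, 0, 4),
    ("Day 4, 5pm", 2, 0, 0, 2),
    ("Day 5, 1pm", 2, 3, 1, 6),
    ("Day 6, 1pm", 4, 3, 1, 8),
    ("Day 7, 1pm", 4, 3, 1, 8),
    ("Hack Presentations", 4, 8, 4, 16),
    ("Prize giving", 4, 4, 1, 9),
]


def inconsistent_rows(rows):
    """Return sessions whose stated total differs from the column sum."""
    return [name for name, team, active, viewers, total in rows
            if team + active + viewers != total]


def team_only_sessions(rows):
    """Return sessions attended by project team members only."""
    return [name for name, _team, active, viewers, _total in rows
            if active + viewers == 0]


print(inconsistent_rows(TURNOUT))   # [] : every row adds up
print(team_only_sessions(TURNOUT))  # ['Day 4, 5pm']
```

Only one session drew no external attendees at all, which fits the observation that some of the daily hangouts may not have been needed.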

The scheduling of the sessions was probably the most challenging aspect: the number of sessions needed and the best time for these were largely a guess. We had low attendance for some of the sessions, in particular over the weekend, indicating that either the time wasn’t convenient or simply that there wasn’t a need for that many sessions. It may be the case that the online format promotes independent work, with individuals happy to hack on their own without needing much input, and therefore fewer hangouts may have been better.

We were very impressed with the Google+ Hangout facility. It was easy to use and very effective, and the broadcast and streaming facilities were remarkable. We would however advise caution, as on one occasion active participation in the hangout was made public by mistake instead of being restricted to the invited participants. Soon enough, an unwelcome guest treated us to some unwanted behaviour and had to be swiftly blocked from the hangout!


The data is obviously at the core of the hack. The Shakespeare Registry was released a couple of days before the hack and more data were added during the event itself. The Registry gives access to metadata for over 1.6 million Shakespeare-related online resources, as well as marked-up XML for the plays, metadata schemas, search and documentation on the APIs. The content of the Registry is very eclectic, including text, pictures, movies and audio based on Shakespeare, his life, his work, his time and any interpretation of these. Participants were free to use as much or as little data from the Registry as they liked, and to combine it with other data sources. This resulted in very different hacks, some academic, some playful, some technical and some visual.

Releasing the Registry earlier, which would have allowed participants to make themselves familiar with the data prior to the event, would have been preferable and might have resulted in more participation in the hack itself. However, the presence of our developers at the Hangout sessions and their availability via email, Twitter and IRC did mean that hack participants had very good access to support, help and further information on the data and registry throughout the week.


Report on the Will’s World Online Hack Event (Part 2/5) – Promoting

This post is the second part of our reflections on the organisation of an online hack for the Will’s World project. Following a first post on planning the event, we are now looking at the event promotion.


We used MediaWiki to set up the Will’s World Online hack wiki prior to announcing the event. MediaWiki was chosen as a tool because it is easy to install, maintain and use, and the team had prior experience with this technology. We felt that participants would also be familiar with the MediaWiki software through their use of or contribution to Wikipedia.

The wiki was used as the main communication hub for the hack to promote the hack, facilitate the registration process, encourage participants to form teams, support communication before, during and after the hack, and disseminate the outcomes after the event.

It was particularly useful for providing prospective participants with all required information about the hack. It included links to:

The participant profile page was updated as the registrations were received with the details made available for publication by the registrants. It was very effective in sharing the number of registrations and information about registrants, highlighting the range of expertise and background, from developer to literature scholars, and geographic distribution of participants.

Other Communication Tools

In order for the Will’s World Online Hack to have a strong and unified presence, we set up a range of communication channels with the Will’s World identity:

  • Email:
  • Google+ account and, from halfway through the week (when they officially launched the functionality) a Google+ Community
  • Twitter: @WillsWorldHack & #willhack
  • A YouTube channel
  • Pinterest board
  • An IRC channel: ##willsworldhack on also accessed via:
  • A mailing list (WILLSWORLDHACK@JISMAIL.AC.UK), set up to help targeted communication, with registrants as recipients (they were encouraged to opt in to the list as part of the registration form).


With all the communication channels in place and only two weeks to go until the start of the hack, we advertised the event in the same forums (blogs, websites, Twitter, mailing lists and direct contacts) that had been used to disseminate the surveys and emailed those who had responded positively to the survey. This provided some continuity and feedback to an audience already alerted to the possibility of the hack in earlier posts.

In addition, a news release was produced and advertised on the EDINA website. This news release also featured in the JISC Headlines issue 110 in November 2012 and on their website.

To emphasise our desire to make the event personal, friendly and interactive despite its online nature, we produced a couple of short videos to promote the event and present the Shakespeare Registry. These videos also provided a record of key aspects of the event that can be viewed at any time for the convenience of participants and interested parties. The videos were made available on YouTube and advertised in a separate blog post a week later to keep up the interest in the hack and serve as a reminder while providing new information.

Reminder messages, mails, tweets and posts were sent a few days before the event; they detailed how to join the event, including the required technical steps.

Additional Data Contribution

The promotion of the hackathon was effective not only in recruiting participants for the event but also in raising the profile of the Shakespeare Registry. Nora McGregor, digital curator at the British Library, contacted us to contribute additional data to the Registry. Within a few days, the British Library was able to provide us with formatted metadata on Shakespeare related titles from their digitised 19th century books ready for inclusion in the Registry and for use during the hack.

Inspiring Others

We were pleased to hear that the SPRUCE Project spotted our tweet about the online hack and were inspired to set up their own one-day remote hackathon to make file format identification better (crucial for preservation). Their CURATEcamp 24h event took place in November 2012 and more information about what was achieved is available on the event wiki.


Organising the online hack was actually very enjoyable. The novelty aspect made it easy to engage with people and bring out enthusiastic responses. However, the short time frame made it quite challenging. We would advise a much longer lead time to make it easier to order customised goods and to promote the event. Dissemination activities also took more time and effort than anticipated, due to the many (perhaps too many) social media channels to cover and the follow-up needed for the large number of enquiries we received.


Report on the Will’s World Online Hack Event (Part 1/5) – Planning

It’s been a couple of months since the Will’s World Hack and we wanted to reflect on the process of planning and running the Hack. We’ve decided to split our thoughts into five posts which we will be sharing over the next week or so. We would love your feedback, ideas and reflections on the Hack and on our reflections here so please do leave us comments or any questions.

Why a Hack?

The idea of an online Hack event came from the need to promote the use of the Shakespeare Registry designed by the Will’s World Project. We wanted the ‘use’ of the Registry to be innovative and we thought that the ‘promotion’ should reflect this and be innovative too. Hackathons are a common way to encourage developers and other creative people to collaborate on quick prototypes and proofs of concept based on a specific dataset or theme, and were therefore an obvious choice for increasing awareness of the Shakespeare Registry. Running the event online was the idea of the EDINA Social Media Officer as it would be innovative and allow us to experiment with a range of social media technologies.

Feasibility Study

There were obvious advantages to running an online hack such as increased flexibility and inclusion for participants, simpler logistics and reduced time scales for the organisers, and the opportunity to experiment with the use of social media technologies to support creative sharing and collaboration. We shared our thoughts on the potential pluses and minuses of holding an online hack in two blog posts: Online Hack Event and Can one desire too much of a good thing?

The project team carried out a feasibility study in which we looked into the practical aspects of running an online event, including suitable formats, collaboration and social media technologies, costs, publicity materials and prizes.

We also sought feedback and input on the idea of an online hack and on various aspects of the practical organisation of such an event through an online survey which was available from the 18 October 2012 to the 10 December 2012.

The survey was widely disseminated using posts on the project blog, project staff personal blogs, Google+, and Facebook presences, as well as posts on appropriate websites and mailing lists, targeted emails and Twitter.

The responses to the survey were analysed together with the direct feedback received via email, Twitter etc. The results of this analysis were shared in our post: Will’s World Online Hack Survey Results – Your Views!

We were delighted with the wide interest shown in a potential Will’s World Online Hack event but also in the general concept of running a hack event online. Many people were enthused by the idea and wished to be kept informed of outcomes of this innovative experience. In particular, the Royal Shakespeare Company (RSC) invited us to provide a guest post for their MyShakespeare blog.

We summarised our background research into the organisation of an online hack for the Will’s World Registry and how we proposed to hold the event in a feasibility report which was put forward to the project funder, JISC, for approval in November 2012. With the support from JISC and (as the Discovery Programme supporting our efforts was about to end) less than a month to get everything in place, we set out to organise the event based on the input received.

Promotional Material

Hackathons generally provide free items or goodie bags to participants to promote the hack and its sponsors, act as a memento and create a sense of being part of something special. They tend to be fun, useful and tongue-in-cheek items like a mug and free coffee to enable developers to stay up through the night. We wanted to create a similar feel for our online event and put together a ‘survival pack’ to be sent to participants ahead of the event. This free goodie bag was promised to the first 50 people to register for the hack, which encouraged potential participants to actively sign up ahead of the event.

Picture of the Will's World Survival Pack

Each pack included:

  • On the useful side:
    • A list with the essential contact points for Will’s World Online Hack: Wiki, YouTube, Google+, Twitter, Emails, mailing list and blog to reinforce the channels available for communicating during the hack.
    • Post-it notes & pen to jot down ideas.
    • A USB stick to store that great new code.
    • A mug to be filled with a favourite beverage (caffeine or not?) to see participants through these hacking hours.
  • On the fun side:
    • A badge to advertise and show support for the event.
    • Ruff-making instructions to get in the Shakespeare spirit.
    • Twelfth Night cake recipe. A very appropriate and festive alternative to the traditional fuel of many hacks: pizza!
    • Some sweet treats for that sugar rush and energy boost

The pen, USB keys, mugs and badges were branded with the Will’s World project, hack event, funder and/or developer logo to promote the Shakespeare Registry and advertise the hackathon. The ruff-making instructions and recipe were a playful way to encourage creativity in a Shakespeare-themed way. We also hoped it would help build some links with the participants by encouraging interaction with everyone taking part in the hackathon. This tied in with the ‘Best Setup’ prize to be awarded for people sharing their hack environment. The ruff featured in the introductory video for the Registry as well as in the example photos for hack setups, modelled by Kiwi the cat.

We had hoped to send the packs ahead of the event itself to build anticipation but unfortunately the customisation of the mugs took 2-3 weeks. This meant that the packs were only sent out a couple of days before the start of the hack. Most participants did, however, receive their pack during the week of the hack (depending on their location).


A Google form was used to create a registration page to capture information about participants including email addresses, Twitter and Skype names, skills, expertise, short biography statement and what they were looking for in the hack. We explicitly asked for consent to publish any of that information (participants could opt out though most chose to share some or all of the information they had provided) on our Meet the participants wiki page.

A total of 22 people registered, which we felt was a very good outcome for this type of event, especially considering the short notice and the proximity to the festive holiday period. It is worth noting that a few additional people also registered during the hack itself.


SPARQL endpoint for SUNCAT

As we explored how to extend access to the metadata contributed by a set of libraries using the SUNCAT service in order to promote discovery and reuse of the data, it soon became clear that Linked Data was one of the preferred formats to enable this.

The previous phase of this project developed a transformation to express the information on holdings in an RDF model. The XSLT produced converts MARC-XML into RDF/XML. This XSLT transformation was used to process over 1,000,000 holdings records made available by the British Library, the National Library of Scotland, the University of Bristol Library, the University of Nottingham Library, the University of Glasgow Library and the library of the Society of Antiquaries of London in order to make them available through a Linked Data SPARQL endpoint interface.
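
The project's transformation is an XSLT stylesheet, which is not reproduced in this post. As a rough illustration of the same idea, the Python sketch below reads one MARC-XML record and emits a few triples. The sample record, the example.org URIs and the holder predicate are assumptions made up for this example; only the MARC fields (245 $a for the title, 852 $a for the holding location), the MARC-XML namespace and the Dublin Core title URI are standard.

```python
import xml.etree.ElementTree as ET

MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}
DC_TITLE = "http://purl.org/dc/elements/1.1/title"
# Hypothetical predicate, used only for this illustration.
HOLDER = "http://example.org/holdings#holder"

# Invented sample record; not real SUNCAT data.
SAMPLE = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <controlfield tag="001">rec0001</controlfield>
  <datafield tag="245" ind1="0" ind2="0">
    <subfield code="a">Archaeological reports</subfield>
  </datafield>
  <datafield tag="852" ind1=" " ind2=" ">
    <subfield code="a">Example Library</subfield>
  </datafield>
</record>"""


def subfields(record, tag, code):
    """Return the text of every subfield `code` in datafields with `tag`."""
    path = f"marc:datafield[@tag='{tag}']/marc:subfield[@code='{code}']"
    return [el.text for el in record.findall(path, MARC_NS)]


def record_to_triples(xml_text):
    """Turn one MARC-XML record into (subject, predicate, object) tuples."""
    record = ET.fromstring(xml_text)
    record_id = record.findtext("marc:controlfield[@tag='001']", namespaces=MARC_NS)
    subject = f"http://example.org/record/{record_id}"  # made-up URI scheme
    triples = [(subject, DC_TITLE, t) for t in subfields(record, "245", "a")]
    triples += [(subject, HOLDER, h) for h in subfields(record, "852", "a")]
    return triples


triples = record_to_triples(SAMPLE)
for s, p, o in triples:
    print(f'<{s}> <{p}> "{o}" .')
```

A real conversion would cover many more MARC fields and would need to escape literals properly; the point here is only the shape of the mapping from fields to triples.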

Setting up the Triplestore

We built on previous experience at EDINA of providing SPARQL endpoints to set up the interface for the SUNCAT Linked Data.

We chose the 4Store application which is fully open source, efficient, scalable, and provides a stable RDF database. Our experience is that it is also simpler to install than other products. We installed 4Store on an independent host in order to keep this application separate from other services for security and easy maintenance.

Loading the data

The data contributed by each library was processed separately. First, the data was extracted from SUNCAT following any restrictions placed by the specific library. It was then transformed into RDF/XML and finally loaded into the triplestore. Each of these steps can be fairly time consuming, depending on the size of the data file. Once the data from each library has been added to the triplestore, queries can be made across the whole RDF database.


An HTTP server is required to provide external access and allow querying of the triplestore. 4Store includes a simple SPARQL HTTP protocol server which answers SPARQL 1.1 queries. Once the server is running, you can query the triplestore using:

  1. A machine to machine  API at
  2. A basic GUI is available at: 
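
A machine-to-machine client typically sends the query as the `query` parameter of an HTTP request, as defined by the SPARQL Protocol. The Python sketch below only builds such a request and does not send it. The endpoint URL is a placeholder, not the address of the SUNCAT service, and the Accept header is one common choice rather than a documented requirement of this endpoint.

```python
from urllib.parse import urlencode, urlsplit, parse_qs
from urllib.request import Request

# Placeholder address: substitute the real endpoint of the service.
ENDPOINT = "http://example.org/sparql/"


def build_sparql_request(query, endpoint=ENDPOINT):
    """Build (but do not send) a GET request carrying a SPARQL query."""
    url = endpoint + "?" + urlencode({"query": query})
    # Ask for SPARQL JSON results; a server may offer other formats.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_sparql_request("SELECT * WHERE { ?s ?p ?o } LIMIT 5")
print(request.full_url)

# Decode the URL again to confirm the query survives the encoding step.
sent_query = parse_qs(urlsplit(request.full_url).query)["query"][0]
print(sent_query)
```

Passing the request to `urllib.request.urlopen` would actually send it, once a real endpoint address is filled in.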


The functionality of the basic test GUI is rather limited and only enables SELECT, CONSTRUCT, ASK and DESCRIBE operations. In order to customise the interface and provide additional information like example queries, we used an open source SPARQL frontend designed by Dave Challis called SPARQLfront and available on GitHub. SPARQLfront is a PHP and JavaScript based frontend and can be installed on top of a default Apache2/PHP server. It supports SPARQL 1.0.

An improved GUI is available at:

The DiscoverEDINA SUNCAT SPARQL endpoint GUI provides four sample queries to help the user with the format and syntax required to compose correct SPARQL queries. For example, one of the queries is:

Is the following title (i.e. archaeological reports) held anywhere in the UK? 

SELECT ?title ?holder
WHERE {
        ?j foaf:primaryTopic ?pt.
        ?pt dc:title ?title;
            lh:held ?h.
        ?h lh:holder ?holder.

        FILTER regex(str(?title), "archaeological reports", "i")
}

The user is provided with a box in which to enter queries. Syntax highlighting is provided to help with composition. The user can also select whether to display the namespaces in the box or not. There is a range of output formats that can be selected:

  • SPARQL XML (the default)
  • JSON
  • Plain text
  • Serialized PHP
  • Turtle
  • Query structure
  • HTML table
  • Tab Separated Values (TSV)
  • Comma Separated Values (CSV)
  • SQLite database
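
When the JSON output is selected, a standards-compliant response follows the W3C SPARQL Query Results JSON Format: a `head` listing the variables and `results.bindings` holding one object per result row. The Python sketch below flattens such a response into CSV text. The sample response is invented for illustration and is not real SUNCAT data.

```python
import csv
import io
import json

# Invented sample response; not real data from the service.
SAMPLE_JSON = """{
  "head": {"vars": ["title", "holder"]},
  "results": {"bindings": [
    {"title": {"type": "literal", "value": "Archaeological reports"},
     "holder": {"type": "uri", "value": "http://example.org/library/1"}}
  ]}
}"""


def bindings_to_rows(response_text):
    """Return (variables, rows), with unbound variables as empty strings."""
    data = json.loads(response_text)
    variables = data["head"]["vars"]
    rows = [
        [binding.get(var, {}).get("value", "") for var in variables]
        for binding in data["results"]["bindings"]
    ]
    return variables, rows


def rows_to_csv(variables, rows):
    """Serialise the header and rows as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(variables)
    writer.writerows(rows)
    return buffer.getvalue()


variables, rows = bindings_to_rows(SAMPLE_JSON)
print(rows_to_csv(variables, rows))
```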

The SPARQL endpoint GUI is ideal for running interactive queries, developing or troubleshooting queries to be run by the m2m SPARQL API or used in conjunction with the SRU target.

RJ Broker: a Research Output Delivery Service

Back in August 2009, the idea for a system to deposit research output directly to Institutional Repositories (IRs) was formulated (like many other great ideas) on the back of a napkin, and presented in this ‘Basic Premise’ post. Development work on the Open Access Repository Junction finished in March 2011 and was followed a year later by the current project on the Repository Junction Broker (RJ Broker).

Development work has progressed over these last three years and a prototype RJ Broker has been designed, but many questions were raised along the way. We decided to take advantage of the attendance of many representatives of the RJ Broker stakeholders at the 7th International Conference on Open Repositories (OR2012) in Edinburgh in July 2012 to refine our vision using direct input from key stakeholders. An evening workshop was organised on the 9th July and representatives of our stakeholders (IR managers, funders, publishers, IR software and service developers from the UK, Europe, US and Australia) were invited to take part. The summary of this brainstorming is presented here.

RJ Broker: A delivery service for research output

The RJ Broker Team first set the scene with a short presentation on the current and intended RJ Broker functionality.

The RJ Broker is in effect a delivery service for research output. It accepts deposits from data providers (institutional and subject repositories, funder and publisher systems). For each deposit, it uses the metadata for the deposit to identify organisations and any associated repositories that are suitable for receiving the deposit. It then transfers the deposit to the repositories that have registered with the RJ Broker service. The metadata acts as the address card for the deposit which is the parcel.
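
The routing step described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the RJ Broker's code: the `Deposit` class, the lookup table of organisations and the repository names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Deposit:
    """A parcel: metadata (the address card) plus optional files."""
    title: str
    affiliations: list
    files: list = field(default_factory=list)


# Hypothetical lookup tables; the real service's data is not described here.
ORG_REPOSITORIES = {
    "University A": ["repo-a"],
    "University B": ["repo-b", "repo-b-theses"],
    "University C": ["repo-c"],
}
# Repositories assumed to have registered their SWORD credentials.
REGISTERED = {"repo-a", "repo-b"}


def target_repositories(deposit):
    """Return registered repositories matching the deposit's affiliations."""
    targets = []
    for organisation in deposit.affiliations:
        for repo in ORG_REPOSITORIES.get(organisation, []):
            if repo in REGISTERED and repo not in targets:
                targets.append(repo)
    return targets


deposit = Deposit("A joint paper", ["University A", "University B", "University C"])
print(target_repositories(deposit))
```

In this sketch a deposit with three affiliations reaches only the two repositories that have registered; unregistered or unknown repositories are simply skipped.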

In order to receive deposits from the RJ Broker, IRs have to register their SWORD credentials with the service to give the RJ Broker an access point to their systems for data input, just as a letter box lets mail into the house.

At this stage, the focus for the type of deposit (or content of the parcel) is research publications. However, the way the RJ Broker works is independent of the deposit type. The RJ Broker will transfer parcels of any type, big or small. For example, a deposit can be a publication with several supporting data files, just an article or just the data files. The parcel can be empty or even taped shut.  Indeed in the case  of an article published under Gold Open Access, the address card is the only information needed to provide a notification of the availability of that article in the publisher system. When an article is subject to an embargo period, a sealed parcel is required to enable the delivery to take place straight away even if it can only be opened later, like the presents sent by far away relatives placed under the tree to be opened on Christmas day!

The mind map below was used to inform the discussion of all the questions we were seeking to answer.

Metadata: The all important label

Like the address card on the top of a parcel, the metadata is available for all to see. Indeed the metadata is always fully open access regardless of the embargo period imposed on the data itself. The metadata is owned by the person who creates it (author, publisher or IR manager) but there is no copyright on it. The metadata can even be considered to act as an advertisement flyer for the data itself, which benefits its owner (whether author, publisher or IR manager) and therefore explains why owners support open access for metadata.

Standards are generally a good thing, improving quality and facilitating exchange. For example, the use of a funders’ code field in the metadata would significantly ease reporting on return on investment for the funding agencies. Several metadata standards are currently being developed, for example CERIF, RIOXX, COUNTER or the OpenAIRE Guidelines. The RJ Broker will support these standards but it is not its duty to ensure these standards are adhered to or that, for example, all required fields have been entered. One expects an address card to provide enough space for the information required to deliver the parcel, but one does not expect the postman to fill in a missing house number or any other missing information.

Deposit: What is in that parcel?

The RJ Broker has to assume some responsibility for the object it is trusted with transferring to IRs, mainly the correct identification of appropriate IRs and the subsequent delivery to these IRs. The RJ Broker is also responsible for the safe keeping of the deposit while it is in transit.

Once it has been successfully transferred to the registered IRs, the responsibility of the RJ Broker ends. It may seem tempting to extend the functionality of the RJ Broker to store a copy of every deposit in order to allow later downloads by newly registered IRs or simply to provide a safety backup. However, this is not the purpose of a delivery service. This would also turn the RJ Broker into a repository that could grow to a massive size. Therefore the RJ Broker will only keep recently transferred deposits for a limited period of time to allow IRs time to accept and process these deposits. Similarly, the postman is not required to scan each postcard he delivers for future safe keeping but undelivered items will be returned to the sorting office and held for a while to allow collection.
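
A limited retention period of this kind could be sketched as below. The 30-day window and the identifiers are assumptions chosen for the example; the post does not state how long deposits would actually be kept.

```python
from datetime import datetime, timedelta

# Assumed retention window; the post does not state an actual figure.
RETENTION = timedelta(days=30)


def purge_expired(transferred, now, retention=RETENTION):
    """Keep only deposits transferred within the retention window.

    `transferred` maps a deposit identifier to its transfer time.
    """
    return {
        deposit_id: when
        for deposit_id, when in transferred.items()
        if now - when <= retention
    }


now = datetime(2012, 9, 1)
held = {
    "dep-1": datetime(2012, 8, 25),  # 7 days old: kept
    "dep-2": datetime(2012, 7, 1),   # 62 days old: dropped
}
print(purge_expired(held, now))
```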

If none of the identified IRs have registered with the RJ Broker then no delivery is possible. This constitutes a successful processing of a deposit for the RJ Broker. Future developments could consider transferring the deposit to open repositories like or sending a notification to IRs to advise them to register with the RJ Broker should they wish to receive direct deliveries of research output from the RJ Broker.

The RJ Broker will transfer every deposit it receives. It does not provide an inspection or validation service. Therefore it will not flag an empty, duplicate, incomplete or badly formatted deposit.

Dealing with embargo

The RJ Broker aims to support Open Access (OA) by enabling the dissemination of research output across the UK and beyond. It does not matter for the delivery process whether this OA is gold or green. However, it is important that any embargo period is dealt with appropriately.

A legal agreement between the RJ Broker and each data provider requesting the respect of embargo periods will be signed before any data from that provider is transferred by the RJ Broker. Each IR will in turn have to accept a similar agreement before they can receive data, through the RJ Broker, from providers enforcing an embargo. Data providers have to ensure that embargo periods are correctly noted in the metadata. IRs have to respect any embargo specified in the metadata. The RJ Broker acts as a trusted, enabling technology between both parties, not as a control point; it does not have any responsibility regarding the enforcement of embargoes. Legal agreements are currently being set in place for the early adopters of the RJ Broker. The hope is that a set of standard agreements can be derived from these to promote take up and ease the administration process.
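
On the receiving side, respecting an embargo noted in the metadata amounts to a date check before the content is opened. The sketch below shows one way an IR might do this. The `embargo_until` field name is hypothetical, and treating the end date itself as the first open day is an assumption.

```python
from datetime import date


def can_release(metadata, today):
    """Return True when the full text may be made openly available.

    `embargo_until` is a hypothetical field name; a missing value
    (None) is treated here as "no embargo".
    """
    embargo_until = metadata.get("embargo_until")
    return embargo_until is None or today >= embargo_until


article = {"title": "An embargoed article", "embargo_until": date(2013, 6, 1)}
print(can_release(article, date(2013, 1, 15)))  # embargo still running
print(can_release(article, date(2013, 6, 1)))   # embargo has ended
```

The check sits with the repository, in line with the point above that the RJ Broker itself does not enforce embargoes.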

Beside the legal agreement, the RJ Broker will not perform additional checks or require further certification or accreditation from IRs. The aim of the RJ Broker is to disseminate research output widely. It is not its purpose to rate IRs for trust or reliability which is best left to the appropriate authorities.

Tracking a Deposit

The RJ Broker assigns a tracking ID to each deposit which enables data suppliers to check on the onward progress of the deposit after it was successfully delivered to an IR SWORD endpoint.

As mentioned previously, the responsibility of the RJ Broker ends once the deposit has been successfully transferred to the registered IRs. Institutions follow different procedures, workflows and timetables when it comes to processing deposits left for inclusion in their repositories. Therefore asserting that a deposit has been successfully ingested by an IR is a complex task which is not part of the RJ Broker’s remit as a delivery service. However, the RJ Broker will provide the data suppliers with a send receipt, as proof that the deposit has been processed by the RJ Broker, which includes a tracking ID. The data supplier can later use this ID to check on the status of the deposit with the IRs to which it has been transferred, i.e. received, queued for processing, accepted, live or rejected.
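
The receipt and status check could be modelled roughly as follows. The status values are the ones listed above; everything else (the class and function names, the use of a UUID as tracking ID, and the shape of the per-repository records) is an assumption made for this sketch.

```python
import uuid
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    """Statuses named in the post for a deposit at a repository."""
    RECEIVED = "received"
    QUEUED = "queued for processing"
    ACCEPTED = "accepted"
    LIVE = "live"
    REJECTED = "rejected"


@dataclass(frozen=True)
class Receipt:
    """Proof that the broker processed a deposit."""
    tracking_id: str
    repositories: tuple


def issue_receipt(repositories):
    """Create a receipt with a fresh tracking ID (a UUID is an assumption)."""
    return Receipt(tracking_id=str(uuid.uuid4()), repositories=tuple(repositories))


def check_status(receipt, repository_records):
    """Look up the deposit's status at each repository it was sent to.

    `repository_records` stands in for whatever each repository exposes:
    a mapping of repository name to {tracking_id: Status}.
    """
    return {
        repo: repository_records.get(repo, {}).get(receipt.tracking_id)
        for repo in receipt.repositories
    }


receipt = issue_receipt(["repo-a", "repo-b"])
records = {"repo-a": {receipt.tracking_id: Status.QUEUED}}
print(check_status(receipt, records))
```

A repository with no record for the tracking ID yields `None` here, which a caller could read as "status not yet known".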

Keep it simple!

The discussion was very productive: all topics set in our mind map were covered and answers to all questions regarding the functionality of the RJ Broker were agreed. The unanimous conclusion was to keep it simple!


The RJ Broker should aim to be a delivery service only. It will follow a “push only” model. Deposits will be pushed to the RJ Broker by data suppliers and the RJ Broker will push the deposits to the IRs. This enables the RJ Broker to have a streamlined workflow.

Specifically, the RJ Broker will NOT:

  • provide any reporting or statistics
  • filter incoming data
  • improve data or metadata
  • enforce standard compliance
  • be a repository
  • collect (“pull”) data from suppliers

I would like to thank everyone who took part in the workshop and helped us shape the functionality of the future RJ Broker service! Development and trials are on-going with a first version of the RJ Broker due for release to UK RepositoryNet+ in Spring 2013. Watch this space!

List of Attendees

Tim Brody (University of Southampton, UK), Yvonne Budden (University of Warwick, UK), Thom Bunting (UKOLN, UK), Peter Burnhill (UK RepositoryNet+, UK), Pablo de Castro Martin (UK RepositoryNet+, UK), Andrew Dorward (UK RepositoryNet+, UK), Kathi Fletcher (Shuttleworth Foundation, USA), Robert Hilliker (Columbia University, USA), Richard Jones (Cottage Labs, UK), Stuart Lewis (The University of Edinburgh, UK), John McCaffery (University of Dundee, UK), Paolo Manghi (OpenAIRE, Italy), Muriel Mewissen (RJ Broker, UK), Balviar Notay (JISC, UK), Tara Packer (Nature Publishing Group, USA), Marvin Reimer (Shuttleworth Foundation, USA), Anna Shadbolt (University of Melbourne, Australia), Terry Sloan (UK RepositoryNet+, UK), Elin Strangeland (University of Cambridge, UK), Ian Stuart (RJ Broker, UK), James Toon (The University of Edinburgh, UK), Jin Ying (Rice University, USA)