About Nicola Osborne

I am Digital Education Manager and Service Manager at EDINA, a role I share with my colleague Lorna Campbell. I was previously Social Media Officer for EDINA working across all projects and services. I am interested in the opportunities within teaching and learning for film, video, sound and all forms of multimedia, as well as social media, crowdsourcing and related new technologies.

Guardian Teacher Network Seminar: Technology in schools: money saver or money waster? – Belated Liveblog

Last Thursday I attended the Guardian Teacher Network Seminar: Technology in schools: money saver or money waster? at Kings Place, London. The panel was chaired by Kate Hodge (KH), head of content strategy at Jaywing Content and former editor of the Guardian Teacher Network, and featured:

  • John Galloway (JG), advisory teacher for ICT/special educational needs and inclusion, Tower Hamlets Council.
  • Donald Clark (DC), founder, PlanB Learning and investor in EdTech companies with experience of teaching maths and physics in FE in the UK and US.
  • Michael Mann (MM), senior programme manager, education team, Nesta Innovation Lab.
  • Naureen Khalid (NK), school governor and co-founder of @UkGovChat.

These are my live notes from the event – although these are a wee bit belated they are more or less unedited so comments, corrections, additions etc. are welcomed. 

The panel began with introductions, mainly giving an overview of their background. The two who said a wee bit more were:

John Galloway (JG): I’m a specialist on technologies for students with special needs and inclusion. I work half time at Tower Hamlets with students but also do a lot of training – it’s the skills of adults that are often the challenge. The rest of my time I consult, I’m a freelance writer, and I’m a judge of the BETT awards.

Michael Mann (MM), NESTA: Our interest is that we don’t think EdTech has reached its potential yet… Our feeling is that we haven’t seen that impact yet. And since our report five years ago we’ve invested in companies and charities who focus on impact. We also do research with UCL, and work with teachers to trial things in real classrooms.

All comments below are credited to the speakers with their initials (see above), and audience comments and questions are marked as such… 

KH: What’s the next big thing in tech?

DC: It’s AI… It’s the new UI no matter what you use really… I only invest in AI now… Education is curiously immune from this at the moment but it won’t be… It is perfect for providing feedback and improving the eLearning experience – that crappy gamification or read-then-quiz experience… We are in a funny transitional phase…

MM: There has been an interesting trend recently where specialist kit is becoming mainstream… touch screens for instance, or speech to text… So, I think that is closing the gap between our minds and our machines… The gap is closing… The latest thing in special educational needs has been eye games – your eyes are the controller… That is moving into mainstream gaming so that will become bigger… So I see a bigger convergence there… And the other thing I see happening is VR. That will allow children to go places they can’t go – for all kids, but that has particular benefits and relevance for, say, a child in a wheelchair. For autistic children you can put them in environments so they can understand size, lights, noise, and deal with the anxiety… before they visit…

KH: What are the challenges of implementing that in the classroom?

JG: The tech – and costs, the space… But also the creativity… A lot of what’s created isn’t particularly engaging or educational. I’d like to see teachers able to make things themselves… And then we need to think about pedagogy… But that’s the big issue…

DC: I can give you an example in the context of teaching Newton’s Laws with kids… We downloaded a bunch of VR apps… And the NASA apps were great for understanding and really feeling Newton’s three laws… Couldn’t do that with a blackboard… And that’s all free…

KH: How accessible is that… ?

DC: Almost every kid has a smartphone… Google Cardboard is maybe £5… It’s very cheap… It won’t replace a teacher, at least not yet. I wouldn’t teach basic mathematics with VR, but I wouldn’t teach Newton’s three laws any other way…

MM: We are piloting a thing called RocketFund and one of the first people to use VR used it in history… After that ran we had about 10 projects because they’d seen what was possible…

DC: “Fieldtrips” can be free… I’ve also seen a brilliant project with a 360 degree camera used in a teaching space – a £250 camera – brilliant for showing issues with behaviour, managing the classroom etc.

NK: Now if something is free, I would have no objection at all!

KH: How do you measure impact?

NK: Well if someone has a really old PC and it runs slow… that’s a quick and clear impact. But it’s about how they will use it, what studies are there and are they reliable… Could you do this any other way? What’s different?

MM: A lot of these technologies do not have evidence on them… But you will have toolkits, ideas that are well grounded in peer instruction, or tutoring… If you can take pedagogical approaches and link them to a tool you are using, that’s great. There’s work on online tutoring, and there is a company which provides tutoring from India… And I want to know how they ensure that they follow established criteria…

DC: I think we’ve had a lot of device fetishism… We’ve seen huge numbers of tablets imposed… and abandoned… You have to regard tech as a medium – not a gadget or a school. I think we’ve had disastrous experiences with iPads in secondary schools… They work in primary schools but actually writing on iPads doesn’t work well… It’s a disaster… And it’s a consumer device, not enabling higher order writing, coding, creation skills… I recommend that you look at Audrey Mullen’s work – she was a school kid when she started a company called Kite Reviews… She said we don’t want tablets or mobiles, that laptops were better…

Comment: What about iPads in schools… I did a David Hockney project with Year 10 students, that riffed off his use of iPads, and the students really engaged with it… I’ve also used it in a portrait project as well… And one of the things I’m interested in is how you use it for more than writing and literacy…

JG: I just want to come back to measuring impact… It depends what you want to use it for… Donald gave us an example of using an iPad for the wrong thing, and from the audience we heard an example of using iPads in the right ways… No-one in industry would code on an iPad… We have to use technology appropriate to the context and the wider world.

KH: How would you know that?

JG: As a teacher you have to gain expertise and transfer that to your teaching…

KH: You might be an expert in history but not in ITT…

JG: As a teacher you have to understand the technology you are being given to use… You have to understand the pedagogy… And you have to prove to teachers that the technology will improve their practice… I’m not sure any teacher has ever taught the perfect lesson, you can always think of ways to improve it… And that’s how you consider your work… One of the best innovations in teaching has been TeachMeets – informal exchanges of practice, experiences, etc. The reasons technology in classrooms is not as successful as it should be are complex…

NK: I know of someone who purchased an app, bought into it, sent people off to training… But it was the wrong app for what they were trying to do… So do the research first before you purchase anything…

DC: I think that the key word here is procurement… And teachers shouldn’t be doing that with hardware… You have to start with teaching needs, but actually general school software too – website, comms with parents, VLEs etc… It’s back end stuff… Take the art example… I know lots of artists… none using iPads… They use more sophisticated computers that enable the same stuff and more… It’s not David Hockney, that’s the tail wagging the dog… It’s general needs… Most kids have devices… I’d spend money on topping up for inclusion… And you have to do that cost benefit analysis first…

MM: Cost benefit analysis and expert approaches aren’t realistic in many schools… Often it’s more realistic to do small scale trialling… If it works, guide their peers; if not, then quit there… Practical experimentation, test and learn is the way forward I would say…

JG: I think that the challenge is often the enthusiast… You need to give things to the cynic!

DC: There is a role for sensible professional advice. In Higher Ed we have Jisc, we are quite sensible… But we don’t have that advice available for schools… It all goes a bit odd… It’s all anecdotal rather than evidence based… Otherwise we are just pottering about… And we end up with the lowest common denominator in terms of skills and understanding…

JG: I’m getting a bit nostalgic for BECTA, and NESTA FutureLab… doing interesting stuff. A lot of research now is funded by companies engaged in the research…

MM: I agree… but there is no evidence for white boards, tablets, whatever as they don’t work on their own… Has to be evidence informed…

DC: Cost effectiveness is always about tech as an intervention in education… The evidence for schools is that writing accuracy goes down 31% and is a huge problem on tablets… Unless…

NK: There’s good evidence that typing notes in class doesn’t work

DC: Absolutely… Although there is plenty of evidence that lectures don’t work and we still do that… They have power devolved and in my view they are not really teachers… That happens every day…

Comment from audience: That doesn’t happen every day…

MM: We have to be careful about how we use the word evidence… Lectures may not be correlated with success but that may be to do with the quality of teaching staff, of lecturers…

KH: One of you talked about giving technology to the cynic… How do you overcome this…

JG: I think that the doubter, the cynic… will ask all the questions, find all the faults… But also see what works if it works…

KH: Often use of tech comes down to the enthusiasts and evangelists… But teachers lack space to be creative… How can we adopt technology if we lack that time and opportunity…

JG: We have so much more technology now, it has permeated our lives more… Our thinking, our discussion, potentially our classrooms… But I haven’t seen smartphones in schools much yet… We haven’t talked about bring your own device… There is an element of risk.. potential for videoing, for sharing bad practice, for bullying and harassment… But there is a lot of nervousness there…

DC: I think we have to move away from just thinking about technology in the classroom. I’m dead against it. Bringing tech into a room in a one-to-many context… I’d rather use learner technology… Good teachers are teachers in the classroom… Kids really use tech at home, with homework… When I was a kid, if you struggled you got stuck… but now you can use devices… to find the answer but also the method… And we have adaptive learning that can tailor to every kid. I think learner technology, away from the classroom, is where it needs to be… Rather than the smart board debacle… Where one minister brought that in, Promethean made millions…

JG: I don’t recognise the classroom you are describing… I see teachers using technology, with big changes over the last twenty years… It is the appropriate use of technology in the appropriate places in learning… And thinking about the right technology for the job… If we took technology out of the classroom we’d just have lectures wouldn’t we?!

DC: The issue of collaboration is interesting… There is work from Stanford on group work and collaborative technology-driven things in the classroom… It found that most kids aren’t doing anything, but it looks collaborative… versus a good teacher doing the Socratic thing…

MM: I don’t think the in/outside the classroom thing is as important as the issue of what works, how things adapt, immediate feedback with the tech… But it all comes back to pedagogy…

NK: It all comes back to what the problem is that you are trying to solve…

KH: What about the right way to do this… There’s the start-up like run fast, fail fast approach… Then the procurement approach…

NK: We want evidence based procurement… I don’t want to fund trials… Schools are poor…

KH: Start ups don’t throw it and see if it works… They use data to change their approach… And that’s what I’m talking about… Trialling then using evidence to inform decisions…

DC: The last thing I want to do is to waste time or money with start ups going into schools… I think taking risks in schools like that is very risky… I’m also not sure governors should be procuring… The senior team should… But often there is no digital strategy… It needs to be strategic, not tactical…

JG: Suppose we get the kids to assess the start up product… There is a great project called Apps For Good… It gets kids to engage in the idea, the design process, the entrepreneurial aspect… There is a role for start ups for teaching kids about how this happens… I think education is a risky business anyway… We think something good will happen, kids have to trust the teacher… I think risk can be quite a healthy thing, and managing risk… Introducing something new can be edgy and can be quite invigorating…

NK: As a governor I don’t want my school going into the red financially… We need to operate within our means…

KH: It wasn’t about start ups in the classrooms… Even a small spend…. Can be risky…

MM: Isn’t there a risk of a big roll out of something that doesn’t work for your school? Some risks will feel riskier than others… School culture and character all matter…

JG: We do have examples of technologies that didn’t work but now do… VLEs didn’t take off… Schools don’t use them… It was an expensive risk… But many use Google Classroom which is essentially the same thing… It’s free but needs maintenance…

DC: Actually with new start ups… you want evidence, you want research to prove the usefulness. 50% of start ups fail, and you don’t want to adopt stuff that will fail…

JG: But someone has to try things first, to try new things, to bring something new into the classroom.

KH: How do we take Ed Tech forward… ?

DC: At risk of repeating myself… Professional procurement, technology strategy, strategic leadership in this…

Comment from crowd: Where do you get the evidence if you don’t test it in the classroom…

DC: I am involved in a big adaptive learning company… We are doing research with Cambridge University…

Comment from crowd: so for the schools taking part, that is a risk!

DC: No, it’s all carefully set up, with control groups… Not just by recommendation by colleagues…

JG: Setting up trials in schools is incredibly difficult, especially with control groups… Even if you do that you have to look at who was teaching, who was unwell then, etc. It’s very very hard to compare… And if it is showing improvement then morally should you withhold that technology from some pupils… One of the trials I can think of was around use of iPads… Give them their own budget for apps… But give them free choice… And then have them talk about that… It’s a trial but it’s very low cost, it’s very effective, it’s judging fit of tech to the space…

NK: I’ve known schools go for the iPad whether or not it works… Why go for the most expensive tablets… to try them!

DC: In the US there was a $1.3bn deal with Apple in California… And iPads are not there now… They now use Chromebooks…

JG: But that was imposed from the top.. And that’s an important issue…

Comment: I want to take issue with something Donald was talking about… I am all in favour of evidence based research and everything… But it is hard to find time to find the research, and a lot of effort to actually read through it… 3 pages of methodology before the conclusion… By the time it’s published it’s out of date anyway… I write about evidence on my website and often no firm conclusions come out of this… Ultimately anecdotal evidence matters… Asking questions of what was this trying to solve, what worked, what didn’t… Question: does Donald agree with me?

DC: No!

Comment: We all know the digital age is coming, kids have to work with computers, how can schools prepare children for that work and keep traditional teaching too..?

MM: For me there are two aspects: digital skills like codeclubs, programming… The other side is that when we are in this world with automation, what sort of jobs will survive… We have a report at Nesta called Creativity vs Robots… Skills that are most robust are creative, collaborative, dexterous… Preparing kids for the future still requires factual knowledge but also collaborative and problem solving skills… It’s not that it doesn’t exist, we just really need to focus on that…

JG: Maybe controversially I will say that we don’t… We should teach flexibility and how to learn. A few years back I wrote for the Times Ed… I visited Harrow – relatively unlimited funding… They don’t teach computing… They don’t get there until Year 9… Prep schools don’t teach it… Not “academic” enough for A-level or GCSE. They do some ICT skills… I guess they will get jobs, good ones… But they don’t prepare them for that… They prepare them to be leaders and the elite… I’m not necessarily sold on the idea that you have to prepare kids to be the makers… We teach reading and writing, but not digital literacy… Or how to read a film or a computer game, why failure is important… We don’t teach that… We might teach them how to create the game… So in part “don’t” and in part “expand the curriculum”.

Comment: For Mr Galloway… Why did you go to Harrow not Eton… They invest in innovation and you get to be amused at top hats and tails?

JG: Tube ride!

DC: It would be madness to ignore technology in schools… But coding is this year’s thing… ! Kids need skills when they leave school…

NK: I have great problems with the idea of 21st Century skills… We can’t train kids for jobs that don’t exist… Jobs from hundreds of years ago…

MM: There is a social justice aspect here… Mark Zuckerberg went to one of the top schools… If we don’t expose all children to technology opportunities they can miss out…

JG: In Harrow they don’t impose technology on teachers… but they get it if they ask for it. They also give kids Facebook accounts and teach them how to use them…

Comment: When we think about technology in schools, when do we think about the teachers’ perspective… Can we motivate and engage students with 21st century skills and possibilities…

NK: With all the money in the world, yes. We are in the position where schools can barely afford the teachers… We have to live within our means…

DC: Are teachers the right people to teach these skills… Is that what teachers are best suited to… I’m not sure subject orientated teachers are well placed for that.

JG: Teachers do teach collaboration. Social media is about relationships… It’s just a form of that… CPD for teachers is outside of school time and that means keen teachers engage there…

MM: You have some teachers who are into smartphones, some who are not… Some teachers are into outdoor education and camping… Others are not… You wouldn’t want to exclude kids from the experience of camping… That’s how you can think about the idea of digital literacy here… Finding the enthusiasm and a route in…

Comment: A lot of what we, in this room, know of technology is through past exposure and experience of technology. Children are sponges… They can often teach the teachers, with scaffolding from the teachers, about this era of technology… The kids are often better and quicker at using the technology… We have to think about where this might lead them…

Comment: On procurement and evidence… Michael talked about small trials… Do specific and unique school contexts not justify that type of small scale trialling…

MM: I think context is key in trials… Even outside of tech… Approaches like peer learning have great evidence… But the actual implementation can make a big difference… But you have to weigh up whether your context is as unique as you think…

DC: That can also be an excuse… Having been involved in procurement in tech… You don’t throw tech about… You think about what the context is, do serious homework before spending the money… You need the strategy and change management to roll things out and sustaining the effort… That’s almost invariably absent in the school context… Quite haphazard… “everyone’s unique… Let’s just play with this stuff”

Comment (from the director of a startup using augmented reality to empower primary aged girls and encourage routes into STEM subjects): In terms of costs and being a governor… Start ups are obsessed with evidence. One of the best things you can do is work with start ups, they really want that evidence… If you are worried about costs you can trial things… But it is a risk when you are teaching… You were also talking about jobs that don’t exist at the moment… That means new jobs in new fields… One thing that strikes me this evening is that no one has talked about science, technology, arts and maths… And teachers don’t come into schools from that route… We’ve been talking to Jim Knight. In primary schools you don’t get labs but you can use AR to do experiments… to look in this area… My point is you’ve been talking about technology, is it worth it… It would have been great to hear from someone with positive experiences, or from an Ed Tech company… This feels like a lot of slamming down of technology…

JG: Can I talk about positive experiences… Technology is life changing and amazing… Removing technology from classrooms would be horrendous… Your example of not having enough well qualified science teachers is an important one…

DC: I am not sure about AR and VR… I’d be careful with some of these things… Hololens isn’t there yet… Leading edge tech is a bit of a honeytrap… I raise VR as it’s on every phone… and free…

Commenter: AR is on phones… !

KH: Thank you for a really lively discussion!

And with that the rather spirited discussions came to an end! Some interesting things to consider but I felt like there was so much that wasn’t discussed properly because of the direction the conversation took – issues like access to wifi; measures to use but make technology safe – and what they mean for information literacy; technology beyond devices… So, I’d love to hear your comments below on Ed Tech in Schools.


IIPC WAC / RESAW Conference 2017 – Day Three Liveblog

It’s the final day of the IIPC/RESAW conference in London. See my day one and day two posts for more information on this. I’m back in the main track today and, as usual, these are live notes so comments, additions, corrections, etc. all welcome.

Collection development panel (Chair: Nicola Bingham)

James R. Jacobs, Pamela M. Graham & Kris Kasianovitz: What’s in your web archive? Subject specialist strategies for collection development

We’ve been archiving the web for many years but the need for web archiving really hit home for me in 2013 when NASA took down every one of their technical reports – for review on various grounds. And the web archiving community was very concerned. Michael Nelson said in a post “NASA information is too important to be left on nasa.gov computers”. And I wrote about when we rely on pointing not archiving.

So, as we planned for this panel we looked back on previous IIPC events and we didn’t see a lot about collection curation. We posed three topics all around these areas. So for each theme we’ll watch a brief screen cast by Kris to introduce them…

  1. Collection development and roles

Kris (via video): I wanted to talk about my role as a subject specialist and how collection development fits into that. As a subject specialist that is a core part of the role, and I use various tools to develop the collection. I see web archiving as absolutely being part of this. Our collection is books, journals, audio visual content, quantitative and qualitative data sets… Web archives are just another piece of the pie. And when we develop our collection we are looking at what is needed now but in anticipation of what will be needed 10 or 20 years in the future, building a solid historical record that will persist in collections. And we think about how our archives fit into the bigger context of other archives around the country and around the world.

For the two web archives I work on – CA.gov and the Bay Area Governments archives – I am the primary person engaged in planning, collecting, describing and making available that content. And when you look at the web capture life cycle you need to ensure the subject specialist is included and their role understood and valued.

The CA.gov archive involves a group from several organisations including the government library. We have been archiving since 2007 in the California Digital Library initially. We moved into Archive-It in 2013.

The Bay Area Governments archives includes materials on 9 counties, but primarily and comprehensively focused on two key counties here. We bring in regional governments and special districts where policy making for these areas occur.

Archiving these collections has been incredibly useful for understanding government, their processes, how to work with government agencies and the dissemination of this work. But as the sole responsible person that is not ideal. We have had really good technical support from Internet Archive around scoping rules, problems with crawls, thinking about writing regular expressions, how to understand and manage what we see from crawls. We’ve also benefitted from working with our colleague Nicholas Taylor here at Stanford who wrote a great QA report which has helped us.

We are heavily reliant on crawlers, on tools and technologies created by you and others, to gather information for our archive. And since most subject selectors have pretty big portfolios of work – outreach, instruction, as well as collection development – having good ties to developers, and to the wider community with whom we can share ideas and questions, is really vital.

Pamela: I’m going to talk about two Columbia archives, the Human Rights Web Archive (HRWA) and Historic Preservation and Urban Planning. I’d like to echo Kris’ comments about the importance of subject specialists. The Historic Preservation and Urban Planning archive is led by our architecture subject specialist and we’d reached a point where we had to collect web materials to continue that archive – and she’s done a great job of bringing that together. Human Rights seems to have long been networked – using the idea of the “internet” long before the web and hypertext. We work closely with Alex Thurman, and have an additional specially supported web curator, but there are many more ways to collaborate and work together.

James: I will also reflect on my experience. The FDLP – Federal Depository Library Program – involves libraries receiving absolutely every government publication in order to ensure a comprehensive archive. There is a wider programme allowing selective collection. At Stanford we are 85% selective – we only weed out content (after five years) very lightly and usually flyers etc. As a librarian I curate content. As an FDLP library we have to think of our collection as part of the wider set of archives, and I like that.

As archivists we also have to understand provenance… How do we do that with the web archive? And at this point I have to shout out to Jefferson Bailey and colleagues for the “End of Term” collection – archiving all gov sites at the end of government terms. This year has been the most expansive, and the most collaborative – including FTP and social media. And, due to the Trump administration’s hostility to science and technology, we’ve had huge support – proposals of seed sites, data capture events etc.

2. Collection Development approaches to web archives, perspectives from subject specialists

As subject specialists we all have to engage in collection development – there are no vendors in this space…

Kris: Looking again at the two government archives I work on, there are Depository Program Statuses to act as a starting point… But these haven’t been updated for the web. However, this is really a continuation of the print collection programme. And web archiving actually lets us collect more – we are no longer reliant on agencies putting content into the Depository Program.

So, for CA.gov we really treat this as a domain collection. And no-one is really doing this except some UCs, myself, and the state library and archives – not the other depository libraries. However, we don’t collect think tanks, or the not-for-profit players that influence policy – this is for clarity, although this content provides important context.

We also had to think about granularity… For instance for the CA transport there is a top level domain and sub domains for each regional transport group, and so we treat all of these as seeds.

Scoping rules matter a great deal, partly as our resources are not unlimited. We have been fortunate that with the CA.gov archive that we have about 3TB space for this year, and have been able to utilise it all… We may not need all of that going forwards, but it has been useful to have that much space.

Pamela: Much of what Kris has said reflects our experience at Columbia. Our web archiving strengths mirror many of our other collection strengths and indeed I think web archiving is this important bridge from print to fully digital. I spent some time talking with our librarian (Chris) recently, and she will add sites as they come up in discussion, she monitors the news for sites that could be seeds for our collection… She is very integrated in her approach to this work.

For the human rights work one of the challenges is the time that we have to contribute. And this is a truly interdisciplinary area with unclear boundaries, and those are both challenging aspects. We do look at subject guides and other practice to improve and develop our collections. And each fall we sponsor about two dozen human rights scholars to visit and engage, and that feeds into what we collect… The other thing that I hope to do in the future is more assessment, looking at more authoritative lists in order to compare with other places… Colleagues look at a site called Idealist which lists opportunities and funding in these types of spaces. We also try to capture sites that look more vulnerable – small activist groups – although it is not clear if they actually are that at risk.

Cost wise, the expensive parts of collecting are both the human effort to catalogue, and the permission process within the collecting process. And yesterday’s discussion raised the possible need for ethics groups as part of the permissions process.

In the web archiving space we have to be clearer on scope and boundaries as there is such a big, almost limitless, set of materials to pick from. But otherwise plenty of parallels.

James: For me the material we collect is in the public domain so permissions are not part of my challenge here. But there are other aspects of my work, including LOCKSS. In the case of the Fugitive US Agencies Collection we take entire sites (e.g. CBO, GAO, EPA) plus sites at risk (e.g. Census, Current Industrial Reports). These “fugitive” agency publications should be in the depository programme but are not. Those lost documents that fail to make it out are what this collection is about. When a library notes a lost document I will share that on the Lost Docs Project blog, and then am also able to collect it and seed the cloud and web archive – using the WordPress Amber plugin – for links. For instance the CBO report on the health bill, aka Trump Care, was missing… In fact many CBO publications were missing, so I have added it as a seed for our Archive-It collection.

3. Discovery and use of web archives

Discovery and use of web archives is becoming increasingly important as we look for needles in ever larger haystacks. So, firstly, over to Kris:

Kris: One way we get archives out there is in our catalogue, and into WorldCat. That’s one place to help other libraries know what we are collecting, and how to find and understand it… So I would be interested to do some work with users around what they want to find and how… I suspect it will be about a specific request – e.g. a city council in one place over a ten year period… But they won’t be looking for a web archive per se… We have to think about that, and what kind of intermediaries are needed to make that work… Can we also provide better seed lists and documentation for this? In Social Sciences we have the Code Book and I think we need to share the equivalent information for web archives, to expose documentation on how the archive was built… And linking to seeds and other parts of collections.

One other thing we have to think about is the process and the document ingest mechanism. We are trying to do this for CA.gov, to better describe what we do… But maybe there is a standard way to produce that sort of documentation – like the Codebook…

Pamela: Very quickly… At Columbia we catalogue individual sites. We also have a customised portal for the Human Rights collection. That has facets for “search as research” so you can search and develop and learn by working through facets – that’s often more useful than item searches… And, in terms of collecting for the web, we do have to think of what we collect as data for analysis, as part of larger data sets…

James: In the interests of time we have to wrap up, but there was one comment I wanted to make, which is that there are tools we use but also gaps that we see for subject specialists [see slide]… And Andrew’s comments about the catalogue struck home with me…

Q&A

Q1) Can you expand on that issue of the catalogue?

A1) Yes, I think we have to see web archives both as bulk data AND collections as collections. We have to be able to pull out the documents and reports – the traditional materials – and combine them with other material in the catalogue… So it is exciting to think about that, about the workflow… And about web archives working into the normal library work flows…

Q2) Pamela, you commented about a permissions framework as possibly vital for IRB considerations for web research… Is that from conversations with your IRB or speculative?

A2) That came from Matt Webber’s comment yesterday on IRB becoming more concerned about web archive-based research. We have been looking for faster processes… But I am always very aware of the ethical concern… People do wonder about ethics and permissions when they see the archive… Interesting to see how we can navigate these challenges going forward…

Q3) Do you use LCSH and are there any issues?

A3) Yes, we do use LCSH for some items and the collections… Luckily someone from our metadata team worked with me. He used Dublin Core, with LCSH within that. He hasn’t indicated issues. Government documents in the US (and at state level) typically use LCSH so no, no issues that I’m aware of.

 


IIPC WAC / RESAW Conference 2017 – Day Two (Technical Strand) Liveblog

I am again at the IIPC WAC / RESAW Conference 2017 and, for today, I am in the technical strand. As usual these are live notes, so comments, corrections, additions, etc. are welcome.

Tools for web archives analysis & record extraction (chair Nicholas Taylor)

Digging documents out of the archived web – Andrew Jackson

This is the technical counterpoint to the presentation I gave yesterday… So I talked yesterday about the physical workflow of catalogue items… We found that the Digital ePrints team had started processing eprints the same way…

  • staff looked in an outlook calendar for reminders
  • looked for new updates since last check
  • download each to local folder and open
  • check catalogue to avoid re-submitting
  • upload to internal submission portal
  • add essential metadata
  • submit for ingest
  • clean up local files
  • update stats sheet
  • Then ingest is usually automated (but can require intervention)
  • Updates catalogue once complete
  • New catalogue records processed or enhanced as necessary.

It was very manual, and very inefficient… So we have created a harvester:

  • Setup: specify “watched targets” then…
  • Harvest (harvester crawls targets as usual) –> Ingested… but also…
  • Document extraction (a rough sketch follows this list):
    • spot documents in the crawl
    • find landing page
    • extract machine-readable metadata
    • submit to W3ACT (curation tool) for review
  • Acquisition:
    • check document harvester for new publications
    • edit essential metadata
    • submit to catalogue
  • Cataloguing
    • cataloguing records processed as necessary
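
The “spot documents / find landing page” step can be sketched roughly like this – my own illustration, not the actual harvester or W3ACT code, with an invented record format and field names:

```python
# Hypothetical sketch of the document-spotting step: scan crawl-log style
# records for PDF responses and treat the page that linked to each PDF as its
# landing page. The dict keys ('url', 'mime', 'via') are illustrative only.
def spot_documents(crawl_records):
    documents = []
    for rec in crawl_records:
        is_pdf = rec.get("mime") == "application/pdf" or rec["url"].lower().endswith(".pdf")
        if is_pdf:
            documents.append({
                "document_url": rec["url"],
                "landing_page": rec.get("via"),  # the page that linked to it
            })
    return documents


# Example: one HTML landing page linking to a PDF report
records = [
    {"url": "https://www.gov.uk/government/publications/example-report",
     "mime": "text/html", "via": None},
    {"url": "https://assets.publishing.service.gov.uk/example-report.pdf",
     "mime": "application/pdf",
     "via": "https://www.gov.uk/government/publications/example-report"},
]
print(spot_documents(records))
```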

This is better but there are challenges. Firstly, what is a “publication”? With the eprints team there was a one-to-one print and digital relationship. But now, no more one-to-one. For example, gov.uk publications… An original report will have an ISBN… But that landing page is a representation of the publication, that’s where the assets are… When stuff is catalogued, what can frustrate technical folk is that you take the date and text from the page – honouring what is there rather than normalising it… We can dishonour intent by capturing the pages… It is challenging…

MARC is initially alarming… For a developer used to current data formats, it’s quite weird to get used to. But really it is just encoding… There is how we say we use MARC, how we do use MARC, and where we want to be now…

One of the intentions of the metadata extraction work was to provide an initial guess at the catalogue data – hoping to save cataloguers and curators time. But you probably won’t be surprised that the authors’ names etc. in the document metadata are rarely correct. We use the worst extractor first, and layer up so we have the best shot. What works best is extracting from the HTML. Gov.uk is a big and consistent publishing space so it’s worth us working on extracting that.

What works even better is the gov.uk API data – it’s in JSON, it’s easy to parse, it’s worth coding as it is a bigger publisher for us.
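
To make the contrast concrete, here is a rough sketch of the two routes – scraping the landing page HTML versus parsing the content API’s JSON. This is my illustration, not the BL harvester’s actual code, and the JSON fields and meta tag names are assumptions based on the public gov.uk content API:

```python
# Illustrative only: guess basic catalogue metadata for a gov.uk publication,
# preferring the structured JSON from the content API over scraping HTML.
# Field and meta tag names are assumptions, not a documented contract.
import json
from urllib.request import urlopen

from bs4 import BeautifulSoup  # pip install beautifulsoup4


def metadata_from_api(path):
    """Try the gov.uk content API first: JSON is easy and cheap to parse."""
    with urlopen("https://www.gov.uk/api/content" + path) as resp:
        doc = json.load(resp)
    return {
        "title": doc.get("title"),
        "description": doc.get("description"),
        "published": doc.get("first_published_at"),
    }


def metadata_from_html(landing_page_html):
    """Fall back to the landing page HTML: <title> plus <meta> tags."""
    soup = BeautifulSoup(landing_page_html, "html.parser")
    meta = {m.get("name"): m.get("content")
            for m in soup.find_all("meta") if m.get("name")}
    return {
        "title": soup.title.string if soup.title else None,
        "description": meta.get("description"),
        "published": meta.get("govuk:first-published-at"),  # assumed tag name
    }
```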

But now we have to resolve references… Multiple use cases for “records about this record”:

  • publisher metadata
  • third party data sources (e.g. Wikipedia)
  • Our own annotations and catalogues
  • Revisit records

We can’t ignore the revisit records… Have to do a great big join at some point… To get best possible quality data for every single thing….

And this is where the layers of transformation come in… Lots of opportunities to try again and build up… But… When I retry document extraction I can accidentally run up another chain each time… If we do our Solr searches correctly it should be easy, so we will be correcting this…

We do need to do more experimentation in future… Multiple workflows bring synchronisation problems. We need to ensure documents are accessible when discoverable. And we need to be able to re-run automated extraction.

We want to iteratively improve automated metadata extraction:

  • improve HTML data extraction rules, e.g. Zotero translators (and I think LOCKSS are working on this).
  • Bring together different sources
  • Smarter extractors – Stanford NER, GROBID (built for sophisticated extraction from ejournals) – see the sketch below
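
As a small example of the kind of “smarter extractor” listed above, here is a hedged sketch of calling a locally running GROBID service to pull the title and first author surname out of the TEI header it returns for a PDF. The port and endpoint follow GROBID’s documented defaults, but treat the details as assumptions rather than a recipe:

```python
# Hedged sketch: send a PDF to a local GROBID service and read the TEI header.
# Assumes GROBID's default port (8070) and the processHeaderDocument endpoint.
from xml.etree import ElementTree as ET

import requests  # pip install requests

TEI = {"tei": "http://www.tei-c.org/ns/1.0"}


def grobid_header(pdf_path, host="http://localhost:8070"):
    with open(pdf_path, "rb") as f:
        resp = requests.post(host + "/api/processHeaderDocument",
                             files={"input": f}, timeout=60)
    resp.raise_for_status()
    tei = ET.fromstring(resp.content)
    return {
        "title": tei.findtext(".//tei:titleStmt/tei:title", namespaces=TEI),
        "author_surname": tei.findtext(".//tei:author//tei:surname", namespaces=TEI),
    }
```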

And we still have that tension between what a publication is… A tension between established practice and publisher output. We need to trial different approaches with catalogues and users… Close that whole loop.

Q&A

Q1) Is the PDF you extract going into another repository… You probably have a different preservation goal for those PDFs and the archive…

A1) Currently the same copy for archive and access. Format migration probably will be an issue in the future.

Q2) This is quite similar to issues we’ve faced in LOCKSS… I’ve written a paper with Herbert Van de Sompel and Michael Nelson about this thing of describing a document…

A2) That’s great. I’ve been working with the Government Digital Service and they are keen to do this consistently….

Q2) Geoffrey Bilder also working on this…

A2) And that’s the ideal… To improve the standards more broadly…

Q3) Are these all PDF files?

A3) At the moment, yes. We deliberately kept scope tight… We don’t get a lot of ePub or open formats… We’ll need to… Now publishers are moving to HTML – which is good for the archive – but that’s more complex in other ways…

Q4) What does the user see at the end of this… Is it a PDF?

A4) This work ends up in our search service, and that metadata helps them find what they are looking for…

Q4) Do they know its from the website, or don’t they care?

A4) Officially, the way the library thinks about monographs and serials, would be that the user doesn’t care… But I’d like to speak to more users… The library does a lot of downstream processing here too..

Q4) For me as an archivist all that data on where the document is from, what issues in accessing it they were, etc. would extremely useful…

Q5) You spoke yesterday about engaging with machine learning… Can you say more?

A5) This is where I’d like to do more user work. The library is keen on subject headings – that’s a big high level challenge so it’s quite amenable to machine learning. We have a massive golden data set… There’s at least a masters thesis in there, right! And if we built something, then ran it over the 3 million-ish items with little metadata, it could be incredibly useful. In my opinion this is what big organisations will need to do more and more of… making best use of human time to tailor and tune machine learning to do much of the work…

Comment) That thing of everything ending up as a PDF is on the way out by the way… You should look at Distill.pub – a new journal from Google and Y Combinator – and that’s the future of these sorts of formats, it’s JavaScript and GitHub. Can you collect it? Yes, you can. You can visit the page, switch off the network, and it still works… And it’s there and will update…

A6) As things are more dynamic the re-collecting issue gets more and more important. That’s hard for the organisation to adjust to.

Nick Ruest & Ian Milligan: Learning to WALK (Web Archives for Longitudinal Knowledge): building a national web archiving collaborative platform

Ian: Before I start, thank you to my wider colleagues and funders as this is a collaborative project.

So, we have fantastic web archival collections in Canada… They collect political parties, activist groups, major events, etc. But, whilst these are amazing collections, they aren’t accessed or used much. I think this is mainly down to two issues: people don’t know they are there; and the access mechanisms don’t fit well with their practices. Maybe when the Archive-It API is live that will fix it all… Right now though it’s hard to find the right thing, and the Canadian archive is quite siloed. There are about 25 organisations collecting, most use the Archive-It service. But, if you are a researcher… to use web archives you really have to be interested and engaged, you need to be an expert.

So, building this portal is about making this easier to use… We want web archives to be used on page 150 in some random book. And that’s what the WALK project is trying to do. Our goal is to break down the silos, take down walls between collections, between institutions. We are starting out slow… We signed Memoranda of Understanding with Toronto, Alberta, Victoria, Winnipeg, Dalhousie, Simon Fraser University – that represents about half of the archive in Canada.

We work on workflow… We run workshops… We separated the collections so that post docs can look at this

We are using Warcbase (warcbase.org) and command line tools: we transferred data from the Internet Archive, generate checksums, and generate scholarly derivatives – plain text, hypertext graph, etc. In the front end you enter basic information, describe the collection, and make sure that the user can engage directly themselves… And those visualisations are really useful… Looking at visualisations of the Canadian political parties and political interest group web crawls, you can track changes, although that may include crawler issues.
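
The project itself does this with Warcbase and command line tools, but to give a flavour of the “checksums plus plain text derivatives” idea, here is a small Python sketch using the warcio library instead. It is purely illustrative, not the WALK pipeline:

```python
# Illustrative only: walk a WARC file, compute a SHA-1 checksum for each
# response payload and derive very crude plain text from HTML responses.
# (pip install warcio)
import hashlib
import re

from warcio.archiveiterator import ArchiveIterator


def derive(warc_path):
    with open(warc_path, "rb") as stream:
        for record in ArchiveIterator(stream):
            if record.rec_type != "response":
                continue
            url = record.rec_headers.get_header("WARC-Target-URI")
            payload = record.content_stream().read()
            checksum = hashlib.sha1(payload).hexdigest()
            ctype = record.http_headers.get_header("Content-Type", "") if record.http_headers else ""
            # Rough plain-text derivative: strip tags from HTML payloads only
            text = re.sub(r"<[^>]+>", " ", payload.decode("utf-8", "replace")) if "html" in ctype else ""
            yield url, checksum, text


# for url, sha1, text in derive("example.warc.gz"):
#     print(url, sha1, len(text))
```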

Then, with all that generated, we create landing pages, including tagging, data information, visualizations, etc.

Nick: So, on a technical level… I’ve spent the last ten years in open source digital repository communities… This community is small and tightknit, and I like how we build and share and develop on each others work. Last year we presented webarchives.ca. We’ve indexed 10 TB of warcs since then, representing 200+ M Solr docs. We have grown from one collection and we have needed additional facets: institution; collection name; collection ID, etc.

Then we have also dealt with scaling issues… From a 30-40GB index to a 1TB sized index. You probably think that’s kinda cute… But we do have more scaling to do… So we are learning from others in the community about how to manage this… We have Solr running on OpenStack… Right now it isn’t at production scale, but it is getting there. We are looking at SolrCloud and potentially using a shard per collection.
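
To give a feel for what faceting across institutions and collections looks like against a Solr index like this, here is a hedged sketch of a query via Solr’s JSON response API. The core name and field names are assumptions for illustration, not the actual WALK schema:

```python
# Hedged sketch: full-text query with facets on (assumed) institution and
# collection fields, against a hypothetical Solr core called "walk".
import requests  # pip install requests

SOLR = "http://localhost:8983/solr/walk/select"

params = {
    "q": 'content:"pipeline protest"',  # full-text query
    "rows": 10,
    "facet": "true",
    "facet.field": ["institution", "collection"],  # repeated facet.field params
    "wt": "json",
}
resp = requests.get(SOLR, params=params).json()
print(resp["response"]["numFound"])
print(resp["facet_counts"]["facet_fields"]["institution"])
```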

Last year we had a Solr index using the Shine front end… It’s great but… it doesn’t have an active open source community… We love the UK Web Archive but… Meanwhile there is Blacklight, which is in wide use in libraries. There is a bigger community, better APIs, bug fixes, etc… So we have set up a prototype called WARCLight. It does almost all that Shine does, except the tree structure and the advanced searching…

Ian spoke about derivative datasets… For each collection, via Blacklight or ScholarsPortal, we want domain/URL counts; full text; graphs. Rather than them having to do the work, they can just engage with particular datasets or collections.

So, that goal Ian talked about: one central hub for archived data and derivatives…

Q&A

Q1) Do you plan to make graphs interactive, by using Kibana rather than Gephi?

A1 – Ian) We tried some stuff out… One colleague tried R in the browser… That was great but didn’t look great in the browser. But it would be great if the casual user could look at drag and drop R type visualisations. We haven’t quite found the best option for interactive network diagrams in the browser…

A1 – Nick) Generally the data is so big it will bring down the browser. I’ve started looking at Kibana for stuff so in due course we may bring that in…

Q2) Interesting as we are doing similar things at the BnF. We did use Shine, looked at Blacklight, but built our own thing…. But we are looking at what we can do… We are interested in that web archive discovery collections approaches, useful in other contexts too…

A2 – Nick) I kinda did this the ugly way… There is a more elegant way to do it but haven’t done that yet..

Q2) We tried to give people WARC files… Our actual users didn’t want that, they want full text…

A2 – Ian) My students are quite biased… Right now if you search it will flake out… But by fall it should be available, I suspect that full text will be of most interest… Sociologists etc. think that network diagram view will be interesting but it’s hard to know what will happen when you give them that. People are quickly put off by raw data without visualisation though so we think it will be useful…

Q3) Do you think in a few years’ time…

A3) Right now that doesn’t scale… We want this more cloud-based – that’s our next 3 years and next wave of funded work… We do have capacity to write new scripts right now as needed, but when we scale that will be harder…

Q4) What are some of the organisational, admin and social challenges of building this?

A4 – Nick) Going out and connecting with the archives is a big part of this… Having time to do this can be challenging…. “is an institution going to devote a person to this?”

A4 – Ian) This is about making this more accessible… People are more used to Blacklight than Shine. People respond poorly to WARC. But they can deal with PDFs and CSV, those are familiar formats…

A4 – Nick) And when I get back I’m going to be doing some work and sharing to enable an actual community to work on this..

 


Digital Conversations @BL: Web Archives: truth, lies and politics in the 21st century (part of IIPC/RESAW 2017)

Following on from Day One of IIPC/RESAW I’m at the British Library for a connected Web Archiving Week 2017 event: Digital Conversations @BL, Web Archives: truth, lies and politics in the 21st century. This is a panel session chaired by Elaine Glaser (EG) with Jane Winters (JW), Valerie Schafer (VS), Jefferson Bailey (JB) and Andrew Jackson (AJ). 

As usual, this is a liveblog so corrections, additions, etc. are welcomed. 

EG: Really excited to be chairing this session. I’ll let everyone speak for a few minutes, then ask some questions, then open it out…

JB: I thought I’d talk a bit about our archiving strategy at the Internet Archive. We don’t archive the whole of the internet, but we aim to collect a lot of it. The approach is multi-pronged: to take entire web domains in a shallow but broad strategy; to work with other libraries and archives to focus on particular subjects or areas or collections; and then to work with researchers who are mining or scraping the web, but not necessarily having preservation strategies. So, when we talk about political archiving or web archiving, it’s about getting as much as possible, with different volumes and frequencies. We know we can’t collect everything, but we collect important things frequently, less important things less frequently. And we work with national governments, with national libraries…

The other thing I wanted to raise is T. R. Schellenberg, who was an important archivist at the National Archives in the US. He had an idea about archival strategies: that there is a primary documentation strategy, and a secondary strategy. The primary is for a government and its agencies to do for their own use, the secondary is for future use in unknown ways… And it includes documentary and evidentiary material (the latter being how and why things are done). Those evidentiary elements become much more meaningful on the web, and that has emerged and become more meaningful in the context of our current political environment.

AJ: My role is to build a web archive for the United Kingdom. So I want to ask a question that comes out of this… “Can a web archive lie?” Even putting to one side that it isn’t possible to archive the whole web… There is confusion because we can’t get every version of everything we capture… Then there are biases from our work. We choose all UK sites, but some are captured more than others… And our team isn’t as diverse as it could be. And what we collect is also constrained by technology capability. And we are limited by time issues… We don’t normally know when material is created… The crawler often finds things only when they become popular… So the academic paper is picked up after a BBC News item – they are out of order. We would like to use more structured data, such as Twitter which has clear publication dates…

But can the archive lie? Well, it is much easier to make an untraceable change to digital material than to print. As digital is increasingly predominant we need to be aware that our archive could be hacked… So we have to protect against that, and evidence that we haven’t been hacked… And we have to build systems that are secure and can maintain that trust. Libraries will have to take care of each other.

JW: The Oxford Dictionaries word of the year in 2016 was “post-truth” whilst the Australian dictionary went for “fake news”. Fake news for them is either disinformation on websites for political purposes, or for commercial benefit. Merriam-Webster went for “surreal” – their most searched for word. It feels like we live in very strange times… There aren’t calls for resignation where there once were… Hasn’t it always been thus though…? For all the good citizens who point out the errors of a fake image circulated on Twitter, for many the truth never catches the lie. Fakes, lies and forgeries have helped change human history…

But modern fake news is different to that which existed before. Firstly there is the speed of fake news… Mainstream media can only counteract or address this after the fact. Some newspapers and websites do public corrections, but that isn’t the norm. Once, publishing took time and means. Social media has made it much easier to self-publish. One can create, but one can also check accuracy and integrity – reverse image searching to see when a photo has been photoshopped or actually shows an earlier event…

And we have politicians making claims that they believe can be deleted and disappear from our memory… We have web archives – on both sides of the Atlantic. The European Referendum NHS pledge claim is archived and lasts long beyond the bus – which was bought by Greenpeace and repainted. The archives have also been capturing political parties’ websites throughout our endless election cycle… The DUP website crashed after the announcement of the election results because of demand… But the archive copy was available throughout. There was also a rumour that a hacker was creating an Irish language version of the DUP website… But that wasn’t a new story, it was from 2011… And again the archive shows that, and archives of news websites do that.

Social Networks Responses to Terrorist Attacks in France – Valerie Schafer. 

Before 9/11 we had some digital archives of terrorist materials on the web. But these events challenged archivists and researchers. The Charlie Hebdo, Paris Bataclan and Nice attacks are archived… People can search at the BnF to explore these archives, to provide users a way to see what has been said. And at the INA you can also explore the archive, including Twitter archives. You can search, see keywords, explore timelines crossing key hashtags… And you can search for images… including the emojis used in discussion of Charlie Hebdo and Bataclan.

We also have Archive-It collections for Charlie Hebdo. This raises some questions of what should and should not be collected… We did not normally collect newspapers and audio visual sites, but decided to in this case as we faced a special event. But we still face challenges – it is easier to collect data from Twitter than from Facebook. It is free to collect Twitter data in real time, but the archived/older data is charged for, so you have to capture it in the moment. And there are limits on API collection… INA captured more than 12 million tweets for Charlie Hebdo, for instance; it is very complete but not exhaustive.
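
For a sense of what “collecting Twitter data in real time” involves in practice, here is a hedged sketch using the legacy (circa 2017) tweepy streaming interface to capture tweets for a hashtag as they are posted. This is purely illustrative – not INA’s actual collection infrastructure – and the Twitter API has changed substantially since:

```python
# Illustrative only: real-time capture of tweets matching a hashtag using the
# legacy tweepy 3.x streaming API (the current Twitter/X API differs).
# The credentials are placeholders.
import json

import tweepy  # pip install "tweepy<4"


class ArchiveListener(tweepy.StreamListener):
    def on_status(self, status):
        # Append each tweet as one JSON line, as received, for later analysis.
        with open("jesuischarlie.jsonl", "a") as out:
            out.write(json.dumps(status._json) + "\n")

    def on_error(self, status_code):
        return status_code != 420  # disconnect if rate limited


auth = tweepy.OAuthHandler("CONSUMER_KEY", "CONSUMER_SECRET")
auth.set_access_token("ACCESS_TOKEN", "ACCESS_SECRET")
stream = tweepy.Stream(auth=auth, listener=ArchiveListener())
stream.filter(track=["#jesuischarlie"])  # blocks, capturing tweets as they appear
```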

We continue to collect for #jesuischarlie and #bataclan… They are continually used and added to, in similar or related attacks, etc. There is a time for exploring and reflecting on this data, and space for critique too…

But we also see that content gets deleted… It is hard to find fake news on social media, unless you are looking for it… Looking for #fakenews just won’t cut it… So, we had a study on fake news… And we recommend that authorities are cautious about material they share. But also there is a need for cross checking – the kinds of projects with Facebook and Twitter. Web archives are full of fake news, but also full of others’ attempts to correct and check fake news as well…

EG: I wanted to go back in time to the idea of the term “fake news”… In order to understand what “fake news” actually is, we have to understand how it differs from previous lies and mistruths… I’m from outside the web world… We are often looking at tactics to fight fire with fire, to use an unfortunate metaphor… How new is it? And who is to blame and why?

JW: Talking about it as a web problem, or a social media issue isn’t right. It’s about humans making decisions to critique or not that content. But it is about algorithmic sharing and visibility of that information.

JB: I agree. What is new is the way media is produced, disseminated and consumed – those have technological underpinnings. And they have been disruptive of publication and interpretation in a web world.

EG: Shouldn’t we be talking about a culture, not just technology… It’s not just the “vessel”… Doesn’t the dissemination have more of a role than perhaps we are suggesting…

AJ: When you build a social network or any digital space you build in different affordances… So that Facebook and Twitter is different. And you can create automated accounts, with Twitter especially offering an affordance for robots etc which allows you to give the impression of a movement. There are ways to change those affordances, but there will also always be fake news and issues…

EG: There are degrees of agency in fake news.. from bots to deliberate posts…

JW: I think there is also the aspect of performing your popularity – creating content for likes and shares, regardless of whether what you share is true or not.

VS: I know terrorism is different… But for any tweet sharing fake news you get four retweets denying it… You have more tweets denying than sharing fake news…

AJ: One wonders about the filter bubble impact here… Facebook encourages inward-looking discussion… Social media has helped like-minded people find each other, and perhaps they can be clipped off more easily from the wider discussion…

VS: I think what is also interesting is the interplay between social media and traditional media… You have questions and relationships there…

EG: All the internet can do is reflect the crooked timber of reality… We know that people have confirmation bias; we are quite tolerant of untruths, and less tolerant of information that contradicts our perceptions, even where those perceptions are untrue. You have people and the net being equally tolerant of lies and mistruths… But isn’t there another factor here… The people demonised as gatekeepers… By putting in place structures of authority – which were journalism and academia… Their resources are reduced now… So what role do you see for those traditional gatekeepers…

VS: These gatekeepers are no longer the traditional gatekeepers that they were… They work in 24-hour news cycles and have to work to that. In France they are trying to rethink that role, and there were a lot of questions about this… Whether that’s about how you react to changing events, and what happens during elections… People are thinking about that…

JB: There is still an authority and responsibility for media, but has the web changed that? Looking back it’s surprising now how few organisations controlled most of the media… But is that so different now?

EG: I still think you are being too easy on the internet… We’ve had investigative journalism by Carole Cadwalladr and others on Cambridge Analytica and others who deliberately manipulate reality… You talked about witness testimony in relation to terrorism… Isn’t there an immediacy and authenticity challenge there… Donald Trump’s tweets… They are transparent but not accountable… Haven’t we created a problem that we are now trying to fix?

AJ: Yes. But there are two things going on… It seems that people care less about lying… People see Trump lying, and they don’t care, and media organisations don’t care as long as advertising money comes in… There is a parallel for that in social media – the flow of content and ads takes priority over truth. There is an economic driver common to both mediums that is warping that…

JW: There is a popularity aspect too… a (nameless) newspaper here that shares content to generate “I can’t believe this!” reactions, and then that sharing generates advertising income… But on a positive note, there is scope and appetite for strong investigative journalism… and that is facilitated by the web and digital methods…

VS: Citizens do use different media, and cross media… Colleagues are working on how TV is used, and on different channels, to compare… Mainstream and social media are strongly intertwined…

EG: I did want to talk about the temporal element… Twitter exists in the moment, making it easy to hold people accountable… Do you see Twitter doing what newspapers did?

AJ: Yes… A substrate…

JB: It’s amazing how much of the web is archived… With “Save Page Now” we see all kinds of things archived – including pages that helped expose the Russian downing of a Ukrainian plane… Citizen action, spotting the need to capture data whilst it is still there, and that happens all the time…

EG: I am still sceptical about citizen journalism… It’s a small group of people from a narrow demographic, and it’s time consuming… Perhaps there is still a need for journalist roles… We did talk about filter bubbles… We hear about newspapers and media being biased… But isn’t the issue that communities of misinformation are not being penetrated – not so much by the other side as by the truth…

JW: I think bias in newspapers is quite interesting and different to unacknowledged bias… Most papers are explicit in their perspective… So you know what you will get…

AJ: I think so, but bias can be quite subtle… Different perspectives on a common issue allows comparison… But other stories only appear in one type of paper… That selection case is harder to compare…

EG: This really is a key point… There is a difference between facts and truth, and explicitly framed interpretation or commentary… Those things are different… That’s where I wonder about web archives… When I look at Wikipedia… It’s almost better to go to a source with an explicit bias where I can see a take on something, unlike Wikipedia which tries to focus on fact. Talking about politicians lying misses the point… It should be about a specific rhetorical position… That definition of truth comes up when we think of the role of the archive… How do you deal with that slightly differing definition of what truth is…

JB: I talked about different, complementary collecting strategies… The archivist has some political power in deciding what goes into the historical record… The volume of the web does undercut that power in a way that I think is good – archives have historically been about the rich and the powerful… So making archives non-exclusive somewhat addresses that… But there will be fake news in the archive…

JW: But that’s great! Archives aren’t about collecting truth. Things will be in there that are not true, partially true, or factual… It’s for researchers to sort that out later…

VS: Your comment on Wikipedia… They do try to be factual, neutral… But not truth… And to have a good balance of power… For us as researchers we can be surprised by the neutral point of view… Fortunately the web archive does capture a mixture of opinions…

EG: Yeah, so that captures what people believed at a point of time – true or not… So I would like to talk about the archive itself… Do you see your role as being successors to journalists… Or as being able to harvest the world’s record in a different way…

JB: I am an archivist with that training and background, as are a lot of the people working on web archives and in interesting adjacent spaces. Certainly historic preservation drives a lot of collecting decisions… But there are also engineering and technological aspects. So it’s people interested in archiving and preservation, but also technology… And software engineers interested in web archiving.

AJ: I’m a physicist but I’m now running web archives. And for us it’s an extension of the legal deposit role… Anything made public on the web should go into the legal deposit… That’s the theory, in practice there are questions of scope, and where we expend quality assurance energy. That’s the source of possible collection bias. And I want tools to support archivists… And also to prompt for challenging bias – if we can recognise that taking place.

JW: There are also questions of what you foreground in Special Collections. There are decisions being made about collections that will be archived and catalogued more deeply…

VS: At the BnF my colleagues work in an area with a tradition, with legal deposit responsibility… There are politics of heritage and what it should be. I think that is the case for many places where that activity sits with other archivists and librarians.

EG: You do have this huge responsibility to curate the record of human history… How do you match the top-down requirements with the bottom-up nature of the web as we now talk about it?

JW: One way is to have others come in to your department to curate particular collections…

JB: We do have special collections – people can choose their own, public suggestions, feeds from researchers, all sorts of projects to get the tools in place for building web archives for their own communities… I think for the sake of longevity and use going forward, the curated collections will probably have more value… Even if they seem more narrow now.

VS: It is also interesting that not all archives selected bottom-up curation. In Switzerland they went top-down – there are a variety of approaches across Europe.

JW: We heard about the 1916 Easter Rising archive earlier, which was through public nominations… Which is really interesting…

AJ: And social media can help us – by seeing links and hashtags. When we looked at this 4-5 years ago everyone linked to the BBC, but now we have more fake news sites etc…

VS: We do have this question of what should be archived… We see capture of the vernacular web – kitten or unicorn gifs etc… !

EG: I have a dystopian scenario in my head… Could you see a time, years from now, when newspapers are dead and public broadcasters are more or less dead… And we have flotsam and jetsam… We have all this data out there… And all kinds of actors who use all this social media data… Can you reassure me?

AJ: No…

JW: I think academics are always ready to pick holes in things, I hope that that continues…

JB: I think more interesting is the idea that there may not be a web… Apps, walled gardens… Facebook is pretty hard to web archive – they make it intentionally more challenging than it should be. There are lots of communication tools that disappeared… So I worry more about loss of a web that allows the positive affordances of participation and engagement…

EG: There is the issue of privatising and sequestering the web… I am becoming increasingly aware of the importance of organisations like the BL and the Internet Archive… Those roles used to be taken on by publicly appointed organisations and bodies… How are they impacted by commercial privatisation… And how are those roles changing… How do you envisage that public sphere of collecting…

JW: For me more money for organisations like the British Library is important. Trust is crucial, and I trust that they will continue to do that in a trustworthy way. Commercial entities cannot be trusted to protect our cultural heritage…

AJ: A lot of people know what we do with physical material, but are surprised by our digital work. We have to advocate for ourselves. We are also constrained by the legal framework we operate within, and we have to challenge that over time…

JB: It’s super exciting to see libraries and archives recognised for their responsibility and trust… But that also puts them at higher risk from those they hold accountable – being recognised as bastions of accountability makes them more vulnerable.

VS: Recently we had the 20th birthday of the Internet Archive, and 10 years of French internet archiving… This is all so fast moving… People are more and more aware of web archiving… We will see new developments, ways to make things open… How to find, search and explore the archive more easily…

EG: The question then is how we access this data… The new masters of the universe will be those emerging gatekeepers who can explore the data… What is the role between them and the public’s ability to access data…

VS: It is not easy to explain everything around web archives but people will demand access…

JW: There are different levels of access… Most people will be able to access what they want. But there is also a great deal of expertise in organisations – it isn’t just commercial data work. And working with the Alan Turing Institute and cutting edge research helps here…

EG: One of the founders of the internet, Vint Cerf, says that “if you want to keep your treasured family pictures, print them out”. Are we overly optimistic about the permanence of the record?

AJ: We believe we have the skills and capabilities to maintain most if not all of it over time… There is an aspect of benign neglect… But if you are active about your digital archive you could have a copy on every continent… Digital allows you to protect content from different types of risk… I’m confident that the library can do this as part of its mission.

Q&A

Q1) Coming back to fake news and journalists… There is a changing role between the web as a communications media, and web archiving… Web archives are about documenting this stuff for journalists for research as a source, they don’t build the discussion… They are not the journalism itself.

Q2) I wanted to come back to the idea of the Filter Bubble, in the sense that it mediates the experience of the web now… It is important to capture that in some way, but how do we archive that… And changes from one year to the next?

Q3) It’s kind of ironic to have nostalgia about journalism and traditional media as gatekeepers, in a country where Rupert Murdoch is traditionally that gatekeeper. Global funding for web archiving is tens of millions; the budget for the web is tens of billions… The challenges are getting harder – right now you can use robots.txt but we have DRM coming and that will make it illegal to archive the web – and the budgets have to increase to match that to keep archives doing their job.

AJ: To respond to Q3… Under the legislation it will not be illegal for us to archive that data… But it will make it more expensive and difficult to do, especially at scale. So your point stands, even with that. In terms of the filter bubble, those personalised experiences are out of our scope, but we know they are important… It would be good to partner with an organisation where the modern experience of media is explicitly part of its role.

JW: I think that idea of the data not being the only thing that matters is important. Ethnography is important for understanding the context around all that other stuff… To help you with supplementary research. On the expense side, it is increasingly important to demonstrate the value of that archiving… We need to think in terms of financial return to digital and creative economies, which is why researchers have to engage with this.

VS: Regarding the first two questions… Archives reflect reality, so there will be lies there… Of course web archives must be crossed and compared with other archives… And contextualisation matters – the digital environment in which the web was living… And with the terrorist attacks archive we tried to document the process of how we selected content, and to archive that too, so that future researchers can understand what is there and why…

JB: I was interested in the first question, this idea of what happens in preserving the conversation… That timeline was sometimes decades before, but is now weeks or days or less… In terms of experience, websites are now personalised and our ability to capture that on a broad basis is impossible. So we need to capture that experience, and the emergent personalisation… The web wasn’t public before, as ARPAnet, then it became public, but that seems to be ebbing a bit…

JW: With a longer term view… I wonder if the open stuff which is easier to archive may survive beyond the gated stuff that traditionally was more likely to survive.

Q4) Today we are 24 years into advertising on the web. We take ad-driven models as a given, and we see fake news as a consequence of that… So, my question is, Minitel was a large system that ran on a different model… Are there different ways to change the revenue model to change fake or true news and how it is shared…

Q5) Theresa May has been outspoken on fake news and wants a crackdown… The way I interpret that is censorship and banning of sites she does not like… Jefferson said that he’s been archiving sites that she won’t like… What will you do if she asks you to delete parts of your archive…

JB: In the US?!

Q6) Do you think we have sufficient web literacy amongst policy makers, researchers and citizens?

 


IIPC WAC / RESAW Conference 2017 – Day One Liveblog

From today until Friday I will be at the International Internet Preservation Coalition (IIPC) Web Archiving Conference 2017, which is being held jointly with the second RESAW: Research Infrastructure for the Study of Archived Web Materials Conference. I’ll be attending the main strand at the School of Advanced Study, University of London, today and Friday, and at the technical strand (at the British Library) on Thursday.

I’m here wearing my “Reference Rot in Theses: A HiberActive Pilot” – aka “HiberActive” – hat. HiberActive is looking at how we can better enable PhD candidates to archive web materials they are using in their research, and citing in their thesis. I’m managing the project and working with developers, library and information services stakeholders, and a fab team of five postgraduate interns who are, whilst I’m here, out and about around the University of Edinburgh talking to PhD students to find out how they collect, manage and cite their web references, and what issues they may be having with “reference rot” – content that changes, decays, disappears, etc. We will have a webpage for the project and some further information to share soon but if you are interested in finding out more, leave me a comment below or email me: nicola.osborne@ed.ac.uk.
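[Sidenote: this is not the HiberActive implementation, but to give a flavour of the sort of plumbing involved in archiving a cited web page, here is a minimal sketch using the Internet Archive’s public “Save Page Now” endpoint and the Wayback Machine availability API. The function name and flow are illustrative assumptions of mine, not project code.]

    # Minimal sketch: ask the Wayback Machine to capture a URL, then look up
    # the closest archived snapshot so it can be cited alongside the live link.
    import requests

    def archive_reference(url):
        # Trigger a capture via Save Page Now.
        requests.get("https://web.archive.org/save/" + url, timeout=120)
        # Query the availability API for the closest archived snapshot.
        resp = requests.get("https://archive.org/wayback/available",
                            params={"url": url}, timeout=30)
        snapshot = resp.json().get("archived_snapshots", {}).get("closest", {})
        return snapshot.get("url")  # e.g. https://web.archive.org/web/<timestamp>/<url>

    print(archive_reference("https://example.com/"))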

These notes are being taken live so, as usual for my liveblogs, I welcome corrections, additions, comment etc. (and, as usual, you’ll see the structure of the day appearing below with notes added at each session). 

Opening remarks: Jane Winters and Nicholas Taylor

Opening plenary: Leah Lievrouw – Web history and the landscape of communication/media research Chair: Nicholas Taylor


CILIPS Conference 2017: Strategies for Success (#CILIPS17) – Liveblog (day two)

Today I am at the CILIPS Conference 2017: Strategies for Success. I’ll be talking about our Digital Footprint work and Digital Footprint MOOC (#DFMOOC). Meanwhile back in Edinburgh my colleagues Louise Connelly (PI for our Digital Footprint research) and Sian Bayne (PI for our Yik Yak research) are at the Principal’s Teaching Award Scheme Forum 2017 talking about our “A Live Pulse”: YikYak for understanding teaching, learning and assessment at Edinburgh research project. So, lots of exciting digital footprint stuff afoot!

I’ll be liveblogging the sessions I’m sitting in today here; as usual corrections, additions, etc. are always welcome. You’ll see the programme below being populated with notes as the day goes on.

We have opened with the efficient and productive CILIPS AGM. Now, a welcome from the CILIPS President, Liz McGettigan, reflecting on the last year for libraries in Scotland. She is also presenting the student awards to Adam Dombovari (in absentia) and Laura Anne MacNeil, and announcing the inauguration of a new CILIPS award, Scotland’s Library and Information Professional of the Year – nomination information coming soon on the website – with the first award to be given out at the Autumn Gathering.

Keynote One – The Road to Copyright Literacy: a journey towards library empowerment Dr. Jane Secker, Senior Lecturer in Educational Development at City, University of London and Chris Morrison, Copyright and Licensing Compliance Officer, University of Kent

 

11.25-11.45am Refreshments and exhibition

11.45-12.30pm Parallel Sessions 1

City Suite Scotland welcomes refugees – the role of the library in resettlement and inclusion Dr. Konstantina Martzoukou, Post Graduate Programme Leader, Robert Gordon University – iSchool

Art Gallery Suite Overcoming disability and barriers: Using assistive Technologies in libraries A joint presentation from Craig Mill – CALL Scotland and Edinburgh Libraries award winning Visually Impaired People Project – Jim McKenzie – Lifelong Learning Library Development Leader – Disability Support, Paul McCloskey – Lifelong Learning Strategic Development Officer (Libraries) and Lindsay MacLeod – Project Volunteer

Melbourne Suite Perfect partnerships: Spotlight on Macmillan in Libraries Craig Menzies, Macmillan Programme Manager

12.30-1.35pm Networking lunch and exhibition

1.35-2.20pm Parallel Sessions 2

City Suite Overcoming barriers in reaching Readers Jacqueline Geekie and Jennifer Horan, Aberdeenshire Libraries and Fiona Renfrew, South Lanarkshire Leisure and Culture Libraries

Art Gallery Suite Spotlight on research – Papers on: Linked Data: opening Scotland’s library content to the world (Dr. Diane Pennington, University of Strathclyde) Towards an information literacy strategy for Scotland (Dr. John Crawford)

Melbourne Suite Informal recognition routes: Learn more about Open Badges – Robert Stewart, Learning and Development Adviser, Digital Learning Team, Scottish Social Services Council

10 minute break to transfer between rooms

2.30-3.15pm Parallel Sessions 3

City Suite Families matter: Innovation in public libraries Every Child a Library Member evaluation (Peter Reid, Professor of Librarianship and Head of Information Management at Robert Gordon University) and DigiDabble, North Ayrshire’s award winning family project (Alison McAllister, Systems and Support Officer, North Ayrshire Libraries)

Art Gallery Suite Fake news and alternative facts – The challenge for information professionals Jenny Foreman, Morag Higginson and Paul Gray, The Information Literacy Community of Practice

Melbourne Suite If I googled you – what would I find? Managing your digital footprint Nicola Osborne, Digital Education Manager, EDINA

3.15-3.35pm Refreshments and exhibition

3.35-4.15pm Keynote 2 (City Suite) – Securing the future: where next for our community in 2018 and beyond? Nick Poole, Chief Executive, CILIP

4.15-4.30pm President’s closing remarks


Behind the scenes at the Digital Footprint MOOC

Last Monday we launched the new Digital Footprint MOOC, a free three week online course (running on Coursera) led by myself and Louise Connelly (Royal (Dick) School of Veterinary Studies). The course builds upon our work on the Managing Your Digital Footprints research project and campaign, and also draws on some of the work I’ve been doing in piloting a Digital Footprint training and consultancy service at EDINA.

It has been a really interesting and demanding process working with the University of Edinburgh MOOCs team to create this course, particularly focusing in on the most essential parts of our Digital Footprints work. Our intention for this MOOC is to provide an introduction to the issues and equip participants with appropriate skills and understanding to manage their own digital tracks and traces. Most of all we wanted to provide a space for reflection and for participants to think deeply about what their digital footprint means to them and how they want to manage it in the future. We don’t have a prescriptive stance – Louise and I manage our own digital footprints quite differently but both of us see huge value in public online presence – but we do think that understanding and considering your online presence and the meaning of the traces you leave behind online is an essential modern life skill and want to contribute something to that wider understanding and debate.

MOOCs – Massive Open Online Courses – are courses which people tend to take in their own time, for pleasure and interest but also as part of their CPD and personal development, so the format seemed like a good fit for digital footprint skills and reflection, along with some of the theory and emerging trends from our research work. We also think the course has potential to be used in supporting digital literacy programmes and activities, those looking for skills for transitioning into and out of education, and those developing their careers. On that note we were delighted to see the All Aboard: Digital Skills in Higher Education 2017 event programme running last week – their website, created to support digital skills in Ireland, is a great complementary resource to our course, and one which we made a (small) contribution to during their development phase.

Over the last week it has been wonderful to see our participants engaging with the Digital Footprint course, sharing their reflections on the #DFMOOC hashtag, and really starting to think about what their digital footprint means for them. From the discussion so far the concept of the “Uncontainable Self” (Barbour & Marshall 2012) seems to have struck a particular chord for many of our participants, which is perhaps not surprising given the degree to which our digital tracks and traces can propagate through others posts, tags, listings, etc. whether or not we are sharing content ourselves.

When we were building the MOOC we were keen to reflect the fact that our own work sits in a context of, and benefits from, the work of many researchers and social media experts both in our own local context and the wider field. We were delighted to be able to include guest contributors including Karen Gregory (University of Edinburgh), Rachel Buchanan (University of Newcastle, Australia), Lilian Edwards (Strathclyde University), Ben Marder (University of Edinburgh), and David Brake (author of Sharing Our Lives Online).

The usefulness of making these connections across disciplines and across the wider debate on digital identity seems particularly pertinent given recent developments that emphasise how fast things are changing around us, and how our own agency in managing our digital footprints and digital identities is being challenged by policy, commercial and social factors. Those notable recent developments include…

On 28th March the US Government voted to remove restrictions on the sale of data by ISPs (Internet Service Providers), potentially allowing them to sell an incredibly rich picture of browsing, search, behavioural and intimate details without further consultation (you can read the full measure here). This came as the UK Government mooted the banning of encryption technologies – essential for private messaging, financial transactions, access management and authentication – claiming that terror threats justified such a wide ranging loss of privacy. Whilst that does not seem likely to come to fruition given the economic and practical implications of such a measure, we do already have the  Investigatory Powers Act 2016 in place which requires web and communications companies to retain full records of activity for 12 months and allows police and security forces significant powers to access and collect personal communications data and records in bulk.

On 30th March, a group of influential privacy researchers, including danah boyd and Kate Crawford, published Ten simple rules for responsible big data research in PLoSOne. The article/manifesto is an accessible and well argued guide to the core issues in responsible big data research. In many ways it summarises the core issues highlighted in the excellent (but much more academic and comprehensive) AoIR ethics guidance. The PLoSOne article is notably directed to academia as well as industry and government, since big data research is at least as much a part of commercial activity (particularly social media and data driven start ups, see e.g. Uber’s recent attention for profiling and manipulating drivers) as traditional academic research contexts. Whilst academic research does usually build ethical approval processes (albeit conducted with varying degrees of digital savvy) and peer review into research processes, industry is not typically structured in that way and often not held to the same standards, particularly around privacy and boundary crossing (see, e.g. Michael Zimmer’s work on both academic and commercial use of Facebook data).

The Ten simple rules… are also particularly timely given the current discussion of Cambridge Analytica and its role in the 2016 US Election, and the UK’s EU Referendum. An article published in Das Magazin in December 2016, and a subsequent English language version published on Vice’s Motherboard, have been widely circulated on social media over recent weeks. These articles suggest that the company’s large scale psychometric analysis of social media data essentially handed victory to Trump and the Leave/Brexit campaigns, which naturally raises personal data and privacy concerns as well as influence, regulation and governance issues. There remains some scepticism about just how influential this work was… I tend to agree with Aleks Krotoski (social psychologist and host of BBC’s The Digital Human) who – speaking with Pat Kane at an Edinburgh Science Festival event last night on digital identity and authenticity – commented that she thought the Cambridge Analytica work was probably a mix of significant hyperbole but also some genuine impact.

These developments focus attention on access, use and reuse of personal data and personal tracks and traces, and that is something we hope our MOOC participants will have the opportunity to pause and reflect on as they think about what they leave behind online when they share, tag, and delete, and particularly when they consider terms and conditions, privacy settings and how they curate what is available and to whom.

So, the Digital Footprint course is launched and open to anyone in the world to join for free (although Coursera will also prompt you with the – very optional – possibility of paying a small fee for a certificate), and we are just starting to get a sense of how our videos and content are being received. We’ll be sharing more highlights from the course, retweeting interesting comments, etc. throughout this run (which began on Monday 3rd April), but also future runs since this is an “on demand” MOOC which will run regularly every four weeks. If you do decide to take a look then I would love to hear your comments and feedback – join the conversation on #DFMOOC, or leave a comment here or email me.

And if you’d like to find out more about our digital footprint consultancy, or would be interested in working with the digital footprints research team on future work, do also get in touch. Although I’ve been working in this space for a while this whole area of privacy, identity and our social spaces seems to continue to grow in interest, relevance, and importance in our day to day (digital) lives.

 


Somewhere over the Rainbow: our metadata online, past, present & future

Today I’m at the Cataloguing and Indexing Group Scotland event – their 7th Metadata & Web 2.0 event – Somewhere over the Rainbow: our metadata online, past, present & future.

Paul Cunnea, CIGS Chair, is introducing the day, noting that this is the 10th year of these events: we don’t have one every year but we thought we’d return to our Wizard of Oz theme.

On a practical note, Paul notes that if we have a fire alarm today we’d normally assemble outside St Giles Cathedral but as they are filming The Avengers today, we’ll be assembling elsewhere!

There is also a cupcake competition today – expect many baked goods to appear on the hashtag for the day #cigsweb2. The winner takes home a copy of Managing Metadata in Web-scale Discovery Systems / edited by Louise F Spiteri. London : Facet Publishing, 2016 (list price £55).

Engaging the crowd: old hands, modern minds. Evolving an on-line manuscript transcription project / Steve Rigden with Ines Byrne (not here today) (National Library of Scotland)

 

Ines has led the development of our crowdsourcing side. My role has been on the manuscripts side. Any transcription is about discovery. For the manuscripts team we have to prioritise digitisation so that we can deliver digital surrogates that enable access, and to open up access. Transcription hugely opens up texts but it is time consuming and that time may be better spent on other digitisation tasks.

OCR has issues but works relatively well for printed texts. Manuscripts are a different matter – handwriting, ink density, paper, all vary wildly. The REED(?) project is looking at what may be possible but until something better comes along we rely on human effort. Generally the manuscript team do not undertake manual transcription, but do so for special exhibitions or very high priority items. We also have the challenge that so much of our material is still under copyright so cannot be done remotely (but can be accessed on site). The expected user community generally can be expected to have the skill to read the manuscript – so a digital surrogate replicates that experience. That being said, new possibilities shape expectations. So we need to explore possibilities for transcription – and that’s where crowd sourcing comes in.

Crowdsourcing can address transcription at scale, but issues with copyright and data protection still have to be resolved. It has taken time to select suitable candidates for transcription. In developing this transcription project we looked to other projects – from Transcribe Bentham, which was highly specialised, through to projects with much broader audiences. We also looked at the transcription undertaken for the John Murray Archive, aimed at non-specialists.

The selection criteria we decided upon was for:

  • Hands that are not too troublesome.
  • Manuscripts that have not been re-worked excessively with scoring through, corrections and additions.
  • Documents that are structurally simple – no tables or columns for example where more complex mark-up (tagging) would be required.
  • Subject areas with broad appeal: genealogies, recipe book (in the old crafts of all kinds sense), mountaineering.

Based on our previous John Murray Archive work we also want the crowd to provide us with structured text, so that it can be easily used, by tagging the text. That’s an approach borrowed from Transcribe Bentham, but we want our community to be self-correcting rather than us doing QA of everything going through. If something is marked as finalised and completed, it will be released with the tool to a wider public – otherwise it is only available within the tool.

The approach could be summed up as “keep it simple” – and that requires feedback to ensure it really is simple (something we gathered through a survey). We did user testing on our tool; it particularly confirmed that users just want to go in and use it, so it has to be intuitive – that’s a problem with transcription and mark-up, so there are challenges in making that usable. We have a great team who are creative and have come up with solutions for us… But meanwhile other projects have emerged. If the REED project is successful in getting machines to read manuscripts then perhaps these tools will become redundant. Right now there is nothing out there or in scope for transcribing manuscripts at scale.

So, let’s take a look at Transcribe NLS.

You have to log in to use the system. That’s mainly to help deter malicious or erroneous data. Once you log into the tool you can browse manuscripts; you can also filter by the completeness of the transcription and the grade of the transcription – we ummed and ahhed about including that but we thought it was important to include.

Once you pick a text you click the button to begin transcribing – you can enter text, special characters, etc. You can indicate if text is above/below the line. You can mark up where a figure is. You can tag where the text is not in English. You can mark up gaps. You can mark that an area is a table. And you can also insert special characters. It’s all quite straightforward.
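[Sidenote: the markup itself wasn’t shown, but to give a sense of the lightweight “TEI Very Light” tagging described here (and in the Q&A below), this is a hypothetical fragment using standard TEI element names – gap, foreign, add, figure – with a trivial well-formedness check in Python. The actual Transcribe NLS tag set is my assumption, not a confirmed detail.]

    # Illustrative guess at a lightly tagged transcription, checked for
    # well-formedness with the standard library XML parser.
    import xml.etree.ElementTree as ET

    transcription = """
    <div>
      <p>Set out for the summit at dawn
         <add place="above">in heavy rain</add>,
         carrying <gap reason="illegible"/> of rope and a
         <foreign xml:lang="gd">sgian-dubh</foreign>.
         <figure><figDesc>pencil sketch of the ridge</figDesc></figure>
      </p>
    </div>
    """

    root = ET.fromstring(transcription)           # raises if the markup is malformed
    print(ET.tostring(root, encoding="unicode"))  # round-trip the fragment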

Q&A

Q1) Do you pick the transcribers, or do they pick you?

A1) Anyone can take part but they have to sign up. And they can indicate a query – which comes to our team. We do want to engage with people… As the project evolves we are looking at the resources required to monitor the tool.

Q2) It’s interesting what you were saying about copyright…

A2) The issue of copyright here is about sharing off site. A lot of our manuscripts are unpublished. We use exceptions such as the 1956 Copyright Act for old works whose authors have died. The selection process has been difficult, working out what can go in there. We’ve also cheated a wee bit…

Q3) What has the uptake of this been like?

A3) The tool is not yet live. We think it will build quite quickly – people like a challenge. Transcription is quite addictive.

Q4) Are there enough people with palaeography skills?

A4) I think that most of the content is C19th, where handwriting is the main challenge. For much older materials we’d hit that concern and would need to think about how best to do that.

Q5) You are creating these documents that people are reading. What is your plan for archiving these?

A5) We do have a colleague considering and looking at digital preservation – longer term storage being more the challenge – as part of our normal digital preservation scheme.

Q6) Are you going for a Project Gutenberg model? Or have you spoken to them?

A6) It’s all very localised right now, just seeing what happens and what uptake looks like.

Q7) How will this move back into the catalogue?

A7) Totally manual for now. It has been the source of discussion. There was discussion of pushing things through automatically once transcribed to a particular level but we are quite cautious and we want to see what the results start to look like.

Q8) What about tagging with TEI? Is this tool a subset of that?

A8) There was a John Murray Archive, including mark up and tagging. There was a handbook for that. TEI is huge but there is also TEI Light – the JMA used a subset of the latter. I would say this approach – that subset of TEI Light – is essentially TEI Very Light.

Q9) Have other places used similar approaches?

A9) Transcribe Bentham is similar in terms of tagging. The University of Iowa Civil War Archive has also had a similar transcription and tagging approach.

Q10) The metadata behind this – how significant is that work?

A10) We have basic metadata for these. We have items in our digital object database and simple metadata goes in there – we don’t replicate the catalogue record but ensure it is identifiable, log the date of creation, etc. And this transcription tool is intentionally very basic at the moment.

Coming up later…

Can web archiving the Olympics be an international team effort? Running the Rio Olympics and Paralympics project / Helena Byrne (British Library)

Managing metadata from the present will be explored by Helena Byrne from the British Library, as she describes the global co-ordination of metadata required for harvesting websites for the 2016 Olympics, as part of the International Internet Preservation Consortium’s Rio 2016 web archiving project

Statistical Accounts of Scotland / Vivienne Mayo (EDINA)

Vivienne Mayo from EDINA describes how information from the past has found a new lease of life in the recently re-launched Statistical Accounts of Scotland

Lunch

Beyond bibliographic description: emotional metadata on YouTube / Diane Pennington (University of Strathclyde)

Diane Pennington of Strathclyde University will move beyond the bounds of bibliographic description as she discusses her research about emotions shared by music fans online and how they might be used as metadata for new approaches to search and retrieval

Our 5Rights: digital rights of children and young people / Dev Kornish, Dan Dickson, Bethany Wilson (5Rights Youth Commission)

Young Scot, Scottish Government and 5Rights introduce Scotland’s 5Rights Youth Commission – a diverse group of young people passionate about their digital rights. We will hear from Dan and Bethany what their ‘5Rights’ mean to them, and how children and young people can be empowered to access technology, knowledgeably, and fearlessly.

Playing with metadata / Gavin Willshaw and Scott Renton (University of Edinburgh)

Learn about Edinburgh University Library’s metadata games platform, a crowdsourcing initiative which has improved descriptive metadata and become a vital engagement tool both within and beyond the library. Hear how they have developed their games in collaboration with Tiltfactor, a Dartmouth College-based research group which explores game design for social change, and learn what they’re doing with crowd-sourced data. There may even be time for you to set a new high score…

Managing your Digital Footprint : Taking control of the metadata and tracks and traces that define us online / Nicola Osborne (EDINA)

Find out how personal metadata, social media posts, and online activity make up an individual’s “Digital Footprint”, why they matter, and hear some advice on how to better manage digital tracks and traces. Nicola will draw on recent University of Edinburgh research on students’ digital footprints which is also the subject of the new #DFMOOC free online course.

16:00 Close

Sticking with the game theme, we will be running a small competition on the day, involving cupcakes, book tokens and tweets – come to the event to find out more! You may be lucky enough to win a copy of Managing Metadata in Web-scale Discovery Systems / edited by Louise F Spiteri. London : Facet Publishing, 2016 – list price £55! What more could you ask for as a prize?

The ticket price includes refreshments and a light buffet lunch.

We look forward to seeing you in April!


Jisc Digifest 2017 Day Two – LiveBlog

Today I’m still in Birmingham for the Jisc Digifest 2017 (#digifest17). I’m based on the EDINA stand (stand 9, Hall 3) for much of the time, along with my colleague Andrew – do come and say hello to us – but will also be blogging any sessions I attend. The event is also being livetweeted by Jisc and some sessions livestreamed – do take a look at the event website for more details. As usual this blog is live and may include typos, errors, etc. Please do let me know if you have any corrections, questions or comments. 

Part Deux: Why educators can’t live without social media – Eric Stoller, higher education thought-leader, consultant, writer, and speaker.

I’ve snuck in a wee bit late to Eric’s talk but he’s starting by flagging up his “Educators: Are you climbing the social media mountain?” blog post. 

Eric: People who are most reluctant to use social media are often those who are also reluctant to engage in CPD, to develop themselves. You can live without social media, but social media is useful and important. Why is it important? It is used for communication, for teaching and learning, in research, in activism… Social media gives us a lot of channels to do different things with, that we can use in our practice… And yes, they can be used in nefarious ways but so can any other media. People are often keen to see particular examples of how they can use social media in their practice in specific ways, but how you use things in your practice is always going to be specific to you, different, and that’s ok.

So, thinking about digital technology… “Digital is people” – as Laurie Phipps is prone to say… Technology enhanced learning is often tied up with employability, but there is a balance to be struck between employability and critical thinking. So, what about social media and critical thinking? We have to teach students how to determine if an online source is reliable or legitimate – social media is the same way… And all of us can be caught out. There was a piece in the FT about the chairman of Tesco saying unwise things about gender, and race, etc. And I tweeted about this – but I said he was the CEO – and it got retweeted and included in a Twitter moment… But it was wrong. I did a follow up tweet and apologised but I was contributing to that…

Whenever you use technology in learning it is related to critical thinking so, of course, that means social media too. How many of us here did our educational experience completely online… Most of us did our education in the “sage on the stage” manner, that’s what was comfortable for us… And that can be uncomfortable (see e.g. tweets from @msementor).

If you follow the NHS on Twitter (@NHS) then you will know it is phenomenal – they have a different member of staff guest posting to the account, including live tweeting an operation from the theatre (with permissions etc. of course) – if you are a medical student this would be very interesting. Twitter is the delivery method now but maybe in the future it will be HoloLens or Oculus Rift Live or something. Another thing I saw about a year ago was Phil Baty (Times Higher Education – @Phil_Baty) talking about Liz Barnes revealing that every academic at Staffordshire will use social media, and will build it into performance management. That really shows that this is an organisation that is looking forward and trying new things.

Do any of you take part in the weekly #LTHEchat? They were having chats about considering participation in that chat as part of staff appraisal processes. That’s really cool. And why wouldn’t social media and digital be a part of that?

So I did a Twitter poll asking academics what they use social media for:

  • 25% teaching and learning
  • 26% professional development
  • 5% research
  • 44% posting pictures of cats

The cool thing is you can do all of those things and still be using it in appropriate educational contexts. Of course people post pictures of cats… Of course you do… But you use social media to build community. It can be part of building a professional learning environment… You can use social media to lurk and learn… To reach out to people… And it’s not even creepy… A few years back I could say “I follow you” and that would be weird and sinister… Now it’s like “That’s cool, that’s Twitter”. Some of you will have been using the event hashtag and connecting there…

Andrew Smith, at the Open University, has been using Facebook Live for teaching. How many of your students use Facebook? It’s important to try this stuff, to see if it’s the right thing for your practice.

We all have jobs… Usually when we think about networking and professional networking we often think about LinkedIn… Any of you using LinkedIn? (yes, a lot of us are). How about blogging on LinkedIn? That’s a great platform to blog in as your content reaches people who are really interested. But you can connect in all of these spaces. I saw @mdleast tweeting about one of Anglia Ruskin’s former students who was running the NHS account – how cool is that?

But, I hear some of you say, Eric, this blurs the social and the professional. Yes, of course it does. Any of you have two Facebook accounts? I’m sorry you violate the terms of service… And yes, of course social media blurs things… Expressing the full gamut of our personality is much more powerful. And it can be amazing when senior leaders model for their colleagues that they are a full human, talking about their academic practice, their development…

Santa J. Ono (@PrezOno/@ubcprez) is a really senior leader but has been having mental health difficulties and tweeting openly about that… And do you know how powerful that is for his staff and students that he is sharing like that?

Now, have you seen the Jisc Digital Literacies and Digital Capabilities models? You really need to take a look. You can use these to shape and model development for staff and students.

I did another poll on Twitter asking “Agree/Disagree: Universities must teach students digital citizenship skills” (85% agree) – now we can debate what “digital citizenship” means… Have any of you ever gotten into it with a troll online? Those words matter, they affect us. And digital citizenship matters.

I would say that you should not fall in love with digital tools. I love Twitter but that’s a private company, with shareholders, with its own issues… And it could disappear tomorrow… And I’d have to shift to another platform to do the things I do there…

Do any of you remember YikYak? It was an anonymous geosocial app… and it was used controversially and for bullying… So they introduced handles… But their users rebelled! (and they reverted)

So, Twitter is great but it will change, it will go… Things change…

I did another Twitter poll – which tools do your students use on a daily basis?

  • 34% snapchat
  • 9% Whatsapp
  • 19% Instagram
  • 36% use all of the above

A lot of people don’t use Snapchat because they are afraid of it… When Facebook first appeared the response was that it’s silly, we wouldn’t use it in education… But we have moved on from there…

There is a lot of bias about Snapchat. @RosieHare posted “I’m wondering whether I should Snapchat #digifest17 next week or whether there’ll be too many proper grown ups there who don’t use it.” Perhaps we don’t use these platforms yet, maybe we’ll catch up… But will students have moved on by then… There is a professor in the US who was using Snapchat with his students every day… You take your practice to where your students are. According to global web index (q2-3 2016) over 75% of teens use Snapchat. There are policy challenges there but students are there every day…

Instagram – 150 M people engage with daily stories so that’s a powerful tool and easier to start with than Snapchat. Again, a space where our students are.

But perfection leads to stagnation. You have to try and not be fixated on perfection. Being free to experiment, being rewarded for trying new things, that has to be embedded in the culture.

So, at the end of the day, the more engaged students are with their institution – at college or university – the more successful they will be. Social media can be about doing that, about the student experience. All parts of the organisation can be involved. There are so many social media channels you can use. Maybe you don’t recognise them all… Think about your students. A lot will use WhatsApp for collaboration, for coordination… Facebook Messenger, some of the asian messaging spaces… Any of you use Reddit? Ah, the nerds have arrived! But again, these are all spaces you can develop your practice in.

The web used to involve having your birth year in your username (e.g. @purpledragon1982), it was open… But we see this move towards WhatsApp, Facebook Messenger, WeChat, these different types of spaces and there is huge growth predicted this year. So, you need to get into the sandbox of learning, get your hands dirty, make some stuff and learn from trying new things #alldayeveryday

Q&A

Q1) What audience do you have in mind… Educators or those who support educators? How do I take this message back?

A1) You need to think about how you support educators, how you do sneaky teaching… How you do that education… So.. You use the channels, you incorporate the learning materials in those channels… You disseminate in Medium, say… And hopefully they take that with them…

Q2) I meet a strand of students who reject social media and some technology in a straight edge way… They are in the big outdoors, they are out there learning… Will they not be successful?

A2) Of course they will. You can survive, you can thrive, without social media… But if you choose to engage in those channels and spaces… You can be successful… It’s not an either/or.

Q3) I wanted to ask about something you tweeted yesterday… That Prensky’s idea of digital natives/immigrants is rubbish…

A3) I think I said “#friendsdontletfriendsprensky”. He published that over ten years ago – 2001 – and people grasped onto that. And he’s walked it back to being about a spectrum that isn’t about age… Age isn’t a helpful factor. And people used it as an excuse… If you look at Dave White’s work on “visitors and residents” that’s much more helpful… Some people are great, some are not as comfortable but it’s not about age. And we do ourselves a disservice to grasp onto that.

Q4) From my organisation… One of my course leaders found their emails were not being read, asked students what they should use, and they said “Instagram” but then they didn’t read that person’s posts… There is a bump, a challenge to get over…

A4) In the professional world email is the communications currency. We say students don’t check email… Well you have to do email well. You send a long email and wonder why students don’t understand. You have to be good at communicating… You set norms and expectations about discourse and dialogue, you build that in from induction – and that can be email, discussion boards and social media. These are skills for life.

Q5) You mentioned that some academics feel there is too much blend between personal and professional. From work we’ve done in our library we find students feel the same way and don’t want the library to tweet at them…

A5) Yeah, it’s about expectations. Liverpool University has a brilliant Twitter account, Warwick too, they tweet with real personality…

Q6) What do you think about private social communities? We set up WordPress/BuddyPress thing for international students to push out information. It was really varied in how people engaged… It’s private…

A6) Communities form where they form. Maybe ask them where they want to be communicated with. Some WhatsApp groups flourish because that’s the cultural norm. And if it doesn’t work you can scrap it and try something else… And see what works…

Q7) I wanted to flag up a YikYak study at Edinburgh on how students talk about teaching, learning and assessment on YikYak, that started before the handles were introduced, and has continued as anonymity has returned. And we’ll have results coming from this soon…

A7) YikYak may rise and fall… But that functionality… There is a lot of beauty in those anonymous spaces… That functionality – the peers supporting each other through mental health… It isn’t tools, it’s functionality.

Q8) Our findings in a recent study was about where the students are, and how they want to communicate. That changes, it will always change, and we have to adapt to that ourselves… Do you want us to use WhatsApp or WeChat… It’s following the students and where they prefer to communicate.

A8) There is balance too… You meet students where they are, but you don’t ditch their need to understand email too… They teach us, we teach them… And we do that together.

And with that, we’re out of time… 


Jisc Digifest 2017 Day One – LiveBlog

Liam Earney is introducing us to the day, with the hope that we all take something away from the event – some inspiration, an idea, the potential to do new things. Over the past three Digifest events we’ve taken a broad view. This year we focus on technology expanding and enabling learning and teaching.

LE: So we will be talking about questions we asked through Twitter and through our conference app with our panel:

  • Sarah Davies, head of change implementation support – education/student, Jisc
  • Liam Earney, director of Jisc Collections
  • Andy McGregor, deputy chief innovation officer, Jisc
  • Paul McKean, head of further education and skills, Jisc

Q1: Do you think that greater use of data and analytics will improve teaching, learning and the student experience?

  • Yes 72%
  • No 10%
  • Don’t Know 18%

AM: I’m relieved at that result as we think it will be important too. And it is backed up by evidence emerging in the US and Australia around data analytics use in retention and attainment. There is a much bigger debate around AI and robots, and around Learning Analytics there is that debate about human and data, and how human and machine can work together. We have several sessions in that space.

SD: Learning Analytics has already been around its own hype cycle… We had huge headlines about the potential about a year ago, but now we are seeing much more in-depth discussion, discussion around making sure that our decisions are data informed… There is concern around the role of the human here, but the tutors, the staff, are the people who access this data and work with students, so it is about human and data together, and that’s why adoption is taking a while as they work out how best to do that.

Q2: How important is organisational culture in the successful adoption of education technology?

  • Total make or break 55%
  • Can significantly speed it up or slow it down 45%
  • It can help but not essential 0%
  • Not important 0%

PM: Where we see education technology adopted we do often see that organisational culture can drive technology adoption. An open culture – for instance Reading College’s open door policy around technology – can really produce innovation and creative adoption, as people share experience and ideas.

SD: It can also be about what is recognised and rewarded. About making sure that technology is more than what the innovators do – it’s something for the whole organisation. It’s not something that you can do in small pockets. It’s often about small actions – sharing across disciplines, across role groups, about how technology can make a real difference for staff and for students.

Q3: How important is good quality content in delivering an effective blended learning experience?

  • Very important 75%
  • It matters 24%
  • Neither 1%
  • It doesn’t really matter 0%
  • It is not an issue at all 0%

LE: That’s reassuring, but I guess we have to talk about what good quality content is…

SD: I think materials – good quality primary materials – make a huge difference; there are so many materials we simply wouldn’t have had (any) access to 20 years ago. But it is also about good online texts and how they can change things.

LE: My colleague Karen Colbon and I have been doing some work on making more effective use of technologies… Paul you have been involved in FELTAG…

PM: With FELTAG I was pleased when that came out 3 years ago, but I think only now we’ve moved from the myth of 10% online being blended learning… And moving towards a proper debate about what blended learning is, what is relevant not just what is described. And the need for good quality support to enable that.

LE: What’s the role for Jisc there?

PM: I think it’s about bringing the community together, about focusing on the learner and their experience, rather than the content, to ensure that overall the learner gets what they need.

SD: It’s also about supporting people to design effective curricula too. There are sessions here, talking through interesting things people are doing.

AM: There is a lot of room for innovation around the content. If you are walking around the stands, there is a group of students from UCL who are finding innovative ways to visualise research, and we’ll be hearing pitches later with some fantastic ideas.

Q4: Billions of dollars are being invested in edtech startups. What impact do you think this will have on teaching and learning in universities and colleges?

  • No impact at all 1%
  • It may result in a few tools we can use 69%
  • We will come to rely on these companies in our learning and teaching 21%
  • It will completely transform learning and teaching 9%

AM: I am with the 9% here; there are risks, but there is huge reason for optimism. There are some great companies coming out, and working with them increases the chance that this investment will benefit the sector. Startups are really keen to work with universities, to collaborate with us.

LE: It is difficult for universities to take that punt, to take that risk on new ideas. Procurement and governance are both essential to facilitating that engagement.

AM: I think so. But I think if we don’t engage then we do risk these companies coming in and building businesses that don’t take account of our needs.

LE: Now, that’s a big spend for the rather small potential change that many who answered this question perceive…

PM: I think there are savings that will potentially come out of those changes…

AM: And in fact that potentially means saving money on tools we currently use by adopting new ones, and investing those savings in staff…

Q5: Where do you think the biggest benefits of technology are felt in education?

  • Enabling or enhancing learning and teaching activities 55%
  • In the broader student experience 30%
  • In administrative efficiencies 9%
  • It’s hard to identify clear benefits 6%

SD: I think many of the big benefits we’ve seen over the last 8 years have been around things like online timetables – the wider student experience and administrative spaces. But we are also seeing that, when used effectively, technology can really enhance the learning experience, and we have a few sessions here around that. Key here are the digital capabilities of staff and students – awareness, confidence, and understanding of how technology fits with disciplinary practice. There is lots here at Digifest around digital skills. [sidenote: see also our new Digital Footprint MOOC which is now live for registrations]

I’m quite surprised that 6% thought it was hard to identify clear benefits… There are still lots of questions there, and we have a session on evidence-based practice tomorrow, and on how evidence feeds into institutional decision making.

PM: There is something here around the Apprenticeship Levy, which is about to come into force. A surprisingly high percentage of employers aren’t actually aware that they will be paying it! Technology has a really important role here for teaching, learning and assessment, but also for tracking and monitoring around apprenticeships.

LE: So, with that, I encourage you to look around, chat to our exhibitors, craft the programme that is right for you. And to kick that off here is some of the brilliant work you have been up to. [we are watching a video – this should be shared on today’s hashtag #digifest17]
