Quick update

Just a quick post to say that we’re still here and still working on making City a more open access-friendly place! In lieu of any major pieces of news about the service, here are a couple of other places I’ve been writing. First is over at LSE’s Impact of Social Sciences blog, where I wrote with a couple of colleagues in defence of institutional repositories. Second, I’ve set up a new blog with my colleague Lucy to discuss another project that’s taking up a fair bit of time at the moment, implementing Serials Solutions’ Summon resource discovery software. One of the many advantages of Summon is that it will make City Research Online content more visible to one of our key user groups, the staff and students at City. I intend to write about repositories and web-scale resource discovery at some point, so keep an eye out if you’re interested in that.

Making City Research Online OpenAire compliant

We’ve just made City Research Online (CRO) OpenAire compliant. This means that all EU FP7 funded research added to CRO will be made available via OpenAire’s Discovery Portal, and that this research will be fully compliant with the EU’s open access mandate for FP7 funded research.

Making CRO OpenAire compliant was relatively straightforward, since the ever-helpful guys at Eprints Services did the hard work of installing the OpenAire Compliance Plug-In. It was then a matter of using OpenAire’s validation tool to ensure things were working properly, then registering CRO with OpenAire (see CRO’s entry in this list of compliant repositories). All we need to do now is work out which of our full text papers have received FP7 funding!
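
For anyone curious what compliance looks like from the harvester’s side, here is a rough Python sketch. It rests on my understanding of the OpenAire guidelines of the time: FP7-funded records carry a dc:relation value of the form info:eu-repo/grantAgreement/EC/FP7/<project id>, and are exposed over OAI-PMH in a set called ec_fundedresources. The base URL and project number below are hypothetical placeholders, not CRO’s real endpoint or a real grant, and this is not the plug-in’s own code.

```python
from urllib.parse import urlencode


def grant_relation(project_id):
    """Build the dc:relation value used to flag an FP7-funded output.

    Assumes the info:eu-repo grant agreement form described above.
    """
    return f"info:eu-repo/grantAgreement/EC/FP7/{project_id}"


def list_records_url(base_url, set_spec="ec_fundedresources"):
    """Build an OAI-PMH ListRecords request limited to one set.

    The default set name is an assumption taken from the OpenAire
    guidelines; check the version your repository targets.
    """
    query = urlencode(
        {"verb": "ListRecords", "metadataPrefix": "oai_dc", "set": set_spec}
    )
    return f"{base_url}?{query}"


# Hypothetical values, for illustration only.
print(grant_relation("123456"))
print(list_records_url("http://repository.example.org/cgi/oai2"))
```

Pasting the second URL into a browser (with a real repository’s OAI-PMH base URL) is a quick sanity check alongside OpenAire’s own validation tool: it should return only the records flagged as EC-funded.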

I’m happy that we’ve managed to do this piece of work. There is currently something of a push to get UK repositories OpenAire compliant (there has been lots of activity on the various repository email lists), since very few in the UK are at the moment. It allows us in the CRO team to offer another service to our users: if you have FP7-funded research, give the outputs to us and we will do the legwork in making it comply with the EU’s open access mandate. There is also the imminent (possible but strongly rumoured) prospect of the EU mandating Green open access for all the research it funds- and if that happens we’ll be ahead of the game in offering this service to our users.

Research Libraries UK conference 2012

Last week I was lucky enough to be able to attend the Research Libraries UK conference 2012. It was held at St James’ Park in Newcastle, a rather huge football stadium (though of course we used the conference centre rather than the terraces!). The conference was “high level” insofar as it examined big-picture issues relating to research libraries. In this it reflected the membership of RLUK, and attendees were mainly senior library managers as well as the odd interloper such as myself.

The reason for my attendance was hearing Janet Finch talk about the report her committee produced, which has become known as the Finch Report. Below, I summarise the interesting points of the other plenary sessions and examine in more detail the session in which Finch and others talked about open access. For reasons of space I have omitted reporting on some very lively Pecha Kucha sessions and one of the plenary sessions, as well as Stephen Curry’s excellent and engaging researcher’s perspective on open access, since I reported on a very similar presentation of his recently.

Roly Keating, British Library. Starting off the conference at a very high level, new BL Chief Executive Keating explained how he viewed the BL’s place in the “library ecosystem”. It was a dense presentation, but a few things in particular that he said stuck with me:

  • That the BL is a guarantor of information for future generations- and that this guarantee now extended (by statutory remit) to web content.
  • That the BL (and by extension other libraries) is a cultural institution in its own right, as well as a traditional library in the sense of being a repository for physical objects.
  • Data management is a new horizon for the BL- for example they partnered with the BBC recently to digitise and turn into a dataset the Radio Times, giving the BBC for the first time a complete record of its broadcasting schedule since its inception.
  • The power and value of the physical object has not diminished in this new digital world; in fact it is enhanced.
  • The move to non-physical legal deposit is the biggest challenge on the BL’s horizon, but presents some amazing opportunities, e.g. turning the UK’s entire web domain into a dataset to allow its programmatic analysis.

User-centred cataloguing: thinking differently. This session was on the opportunities presented by shared services for cataloguing, in light of recent developments in data interoperability (not least that old library favourite, linked data). Economies of scale can be derived from a shared approach to the re-use of cataloguing data. The question is how to usefully do this.

Redefining the Research Library Model. A report on the RLUK project of the same name, which summarised new thinking in this area (and the website above includes some very interesting position papers on this subject). Most interesting was news from JISC on their forthcoming changes, which (for repositories at least) seem to lay emphasis on research data management.

Hidden Gems: Revealing our Special Collections. An overview of the state of play with special collections in research libraries. Some provocative points were made here, including one from Andrew Green, National Library of Wales, that perhaps those collections of uncatalogued material should be gotten rid of- how useful are they really?

Open Access to UK Research Outputs. As mentioned above, this session was of the most interest to me personally. Janet Finch kicked off, summarising how the Finch Report came to the conclusions it did, and the implications of those conclusions for libraries. The Finch Report has been discussed at great length elsewhere (in particular the way it favours Gold Open Access (OA) over Green) so I won’t rehash that discussion here, but some of the points and questions I took from Finch’s presentation were as follows:

  • Finch made very clear that there was no ministerial or other governmental influence over the findings of the committee and its report.
  • Finch stated that the remit of the report was (among other things) to make peer reviewed research available “free”, but free for whom? The emphasis on Gold OA means that journal publishing will move from a “reader pays” to an “author pays” model. Moreover, the cost of Gold OA author processing charges has been estimated at £60m per year on top of (or taken out of) the UK’s research budget, which will go directly to publishers.
  • Finch stated that “Maintaining the viability of the publishing industry” was one of the Committee’s success criteria, but it’s unclear to me why this should have been deemed a criterion for success, if the goal was free access to research and if there was no Ministerial influence on the Committee.
  • Finch made some reassuring (from my perspective as a repository manager!) remarks about the expectation that we will be in a “mixed economy” of Green and Gold for the foreseeable future.

Following on from Janet Finch was Mark Thorley from RCUK, who explained RCUK’s also much-discussed and recently revised open access policy. This policy puts into practice the Finch Report’s recommendations by enforcing open access for research it funds, with a clear preference for the Gold over the Green route (which at first glance seems to say that researchers with RCUK funding must, when they publish, go Gold if they can; and if they can’t go Gold they must go Green). When asked about whether this would circumscribe authors’ choices, Thorley was very clear that the policy applied at the level of journals rather than individuals. In other words, RCUK won’t be policing the choices of individuals as long as they have made their work openly accessible, whether that is by going Gold or Green (or indeed both). What remains unclear to me is how researchers themselves are supposed to know this, given the wording of the current policy and the advice that surrounds it.

All in all the conference was very interesting for the “core” aspects of my role (i.e. open access), but it was also fascinating to find out about the many other hot topics around research libraries. I also managed to catch up with some old friends and meet some new people, which is always good!

Using City Research Online to serve papers to RePEc

One of the promises of the creation of a network of institutional repositories was that this would truly be a network, in the sense that there would be facility for appropriate transfer of material between services (I wrote about this for UKCoRR’s blog a while ago if you want more context). For example, an academic should be able to post a paper in the home repository, and also see this transferred automatically to e.g. the ArXiv.

We saw an opportunity to do this here at City when we began archiving our Department of Economics Discussion Papers Series. It soon emerged that the main point of discovery for economists looking for papers was the Repository of Papers in Economics (RePEc). The person in charge of the series had set up a page on the Economics website that pushed the papers in the series to RePEc, but this required an awful lot of maintenance, in particular ensuring that data could be transferred to RePEc in an appropriate format as RDF files.

So, we offered to take care of ensuring the series was automatically transferred from City Research Online (CRO) to RePEc. This involved some work with Eprints Services and the people at RePEc to set up an area at CRO which indexes the papers as RDF files using the eprints2redif script. This area is then used to push these files to City’s Department of Economics page at RePEc. The CRO RDF file-set updates overnight, meaning that additions, deletions and changes to the files therein will quickly be reflected on our RePEc page.
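
To give a flavour of what ends up in one of those RDF files, here is a minimal Python sketch that writes a single working-paper record in RePEc’s ReDIF template format. It is an illustration only, not the eprints2redif script itself, which I haven’t reproduced here. The archive code “aaa”, the series code “wpaper”, the paper number, the author names, the title and the file URL are all hypothetical placeholders.

```python
def redif_paper(title, authors, year, number, file_url, handle_prefix):
    """Return one ReDIF-Paper template as a string.

    handle_prefix is the 'RePEc:<archive>:<series>' part of the handle;
    the paper number is appended to make the record's unique handle.
    """
    lines = ["Template-Type: ReDIF-Paper 1.0"]
    # Each author gets their own Author-Name field.
    for name in authors:
        lines.append(f"Author-Name: {name}")
    lines.append(f"Title: {title}")
    lines.append(f"Creation-Date: {year}")
    # File-URL is what RePEc services link to when a reader downloads.
    lines.append(f"File-URL: {file_url}")
    lines.append("File-Format: Application/pdf")
    lines.append(f"Number: {number}")
    lines.append(f"Handle: {handle_prefix}:{number}")
    return "\n".join(lines) + "\n"


# Hypothetical record, for illustration only.
record = redif_paper(
    title="An example discussion paper",
    authors=["A. Author", "B. Author"],
    year="2007",
    number="07-01",
    file_url="http://repository.example.org/123/1/paper.pdf",
    handle_prefix="RePEc:aaa:wpaper",
)
print(record)
```

The File-URL line is the one that matters most for a repository: if it points at the full text held in the repository, downloads started from RePEc are served from (and counted by) the repository.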

This will hopefully be a convenient and useful service for our economists- add your discussion paper to CRO, and it will automatically appear in RePEc! For us it’s a real win as well- we can take the administrative and technical burden off the economists’ hands, and also demonstrate that we are able to offer this kind of service to other departments. Also, it means that we should see a significant improvement in our download statistics, since the papers’ records in RePEc actually point back to full text papers in CRO when people hit the download button (see the URL to download this paper, for example). So it really is a win-win situation!

I would encourage other repository managers to have a think about this. I found the Department of Economics to be very receptive when we approached them, particularly when it became clear that we would take on work they were spending time on. There is some technical work that has to be done, but nothing that should flummox an experienced Eprints administrator. The next thing I’m going to think about is whether we can arrange something similar for our Centre for Mathematical Science, who are keen users of CRO and the aforementioned ArXiv.

1,000th paper added to City Research Online!

Last Wednesday, 5th September, we made live our 1,000th full text, openly accessible paper. The article in question was part of the Department of Economics Discussion Papers series, entitled: García-Alonso, M. D. C. & Garcia-Marinoso, B. (2007). “The strategic interaction between firms and formulary committees: effects on the prices of new drugs”. We’re now up to a total of 1,030 full text papers in the repository.

Given that City Research Online was only formally launched in October 2011, and that we aimed to make 500 papers live in its first year of operation, we’re pretty pleased with our progress thus far. It is of course testament to the support our team has received and to the willingness of City authors to contribute to the service. Thanks to all our users and depositors, here’s to our next 1,000 openly accessible papers!

City Research Online & IRUS-UK

Regular readers of this blog will know that we like our stats here at City Research Online. Therefore, when we were approached by representatives of the IRUS-UK project, we leapt at the chance to participate.

The JISC-funded project is intending to set up a national infrastructure to aggregate and disseminate institutional repository (IR) download statistics, thereby demonstrating the vital importance of IRs in the scholarly communications landscape (and by extension, the importance of Green Open Access). These statistics will also be COUNTER-compliant, meaning they can be reported on to SCONUL and other interested parties. The statistics gathered will be freely available and re-usable, in the spirit of the openly accessible IRs on which they report.

For us, the project is a chance to be involved in (and perhaps in a small way influence) the early stages of a project which is likely to be an important piece of infrastructure. It also will provide a way to verify the statistics we gather from our in-house tools (Eprints’ IR Stats package and the ubiquitous Google Analytics), and to benchmark ourselves against other institutions. The early indications are that the other four participating institutions (Bournemouth, Cranfield, Huddersfield and Salford) receive LOTS of downloads compared to us, but then they are all larger and more established than us.

I’ll blog about this project more in future, when we have more to report upon.

OR2012: Microsoft Academic Search

I’ve failed fairly miserably to blog about Open Repositories 2012, but here at least is something I’ve taken from the conference to work on here at City. This was from the session on Repositories and Microsoft Academic Search presented by Alex Wade from Microsoft, and you can see a video of this presentation here (it’s the first presentation).

Microsoft Academic Search is precisely what you might guess it to be- an academic search engine, in the vein of (e.g.) Google Scholar. Where it seems to offer added value over Google’s offering is its ability to build and enrich the data it holds, through wiki-like functionality, then to display this data in interesting ways. For example, here’s City academic Jason Dykes’ Citation Graph, showing the authors who have most often cited his work. The service also aggregates data at an institutional level- see for example City University London’s listing.

Where it gets interesting in repository terms is the ability to “seed” publication records with links to PDFs, for example those PDFs held in City Research Online, using the feature that allows you to edit the metadata of any record. I’ve experimented with doing this for the aforementioned Prof Dykes. The process is not quite wiki-like, in that there is a delay and verification before changes go live, but it seems to me that this is an easy way of pointing back to repository materials, and should also help with Google page rankings. There was also a commitment given, during Alex Wade’s presentation, that the Microsoft Academic Search team would be looking at automatically harvesting repository records to further enrich the service’s data, and to point back to the wealth of open access material held in repositories.

If anyone else has experience doing this, I’d be very interested to hear about it!

RSP event: Scholarly Communications: New Developments in Open Access

Laura and I attended the Repositories Support Project (RSP) event Scholarly Communications: New Developments in Open Access last Friday. It was held in the spectacular surroundings of RIBA’s Portland Place building, which gave proceedings a suitable air of grandeur. The event had a first-class line-up of speakers, and was really excellent- the RSP should be congratulated for the event’s high quality and depth of content. A Storify archive of the event’s Tweets is worth taking a look at, if you’re into the whole micro-blogging thing. What follows are my thoughts about the sessions, and as ever they are partial and impressionistic, so apologies in advance for any errors or mistakes in emphasis.

Where next with Open Access – keynote presentation – Martin Hall, Chair of Open Access Implementation Group and Vice Chancellor of the University of Salford. Professor Hall was the biggest scoop for the event- not only is he a VC, but he is also a member of the Working Group on Expanding Access to Published Research Findings, AKA the Finch Committee. He was therefore perfectly placed to deliver the keynote, which took a very high-level view of open access (OA) developments in light of the work of the Finch Committee and other developments such as the Elsevier boycott and the recent White House petition. His vision was one of a slow transition towards full Gold OA predicated on a market in Article Processing Charges (APCs), with a mixed economy (subscription journals, Gold OA journals and Green OA repositories) in the intervening period. He noted two particular possible victims of “collateral damage” in this change:

  • Learned Societies, who often rely on journal subscription charges to fund their activities and operate on very tight margins. The withdrawal of these subs could have a disastrous effect.
  • Independent researchers, who would not have access to institutional funds for APCs (though of course these people are currently in the opposite position- able to publish in journals, but having to rely on the c. 20% of openly accessible articles)

He offered no solutions to these snags, but at least they are being considered. His final remark was heartening for those of us plugging away with institutional repositories: legislative and academics’ attitudinal changes are likely to result in heightened interest in all forms of repositories during this change. I would add to this (as [namedrop alert] Bill Hubbard said to me during a break between sessions) that the emphasis on OA during the next REF cycle is likely to be another driver for interest in repositories.

The Budapest Open Access Initiative (BOAI) at 10 – recommendations for the next ten years of scholarly communications – Alma Swan, Director of European Advocacy, SPARC and Key Perspectives. This presentation was by another very prominent OA advocate, Alma Swan, who had recently participated in updating the BOAI, work which must have been extremely challenging given the stakeholders involved. A few particular points of interest arose from her presentation:

  • For green repositories, gratis OA is better than no OA at all; Libre OA is better than gratis OA. Ideally green OA content should be licensed as CC-BY for full libre re-use. I agree with this (though the distinction is not uncontroversial); the question for us becomes how to license then flag our repository content as Libre, i.e. CC-BY. There are also implications for text- and data-mining of green repository content if this shift is not implemented.
  • BOAI 2012 will make explicit recommendation of use of “Alternative metrics” to assess impact, for example the Altmetric service, or green repositories’ native download and access path statistics.
  • She noted one (in my view) telling statistic regarding access to PubMed Central. This was that 40% of people accessing this site can be defined as “citizens” as opposed to researchers or governmental people, which I think gives the lie to arguments that people do not need (or cannot make sense of) open scholarly research.

I’m now looking forward to the publication of the new version of the BOAI, which should provide yet more impetus toward OA.

Next up were some sessions about some projects and services, which in the interest of keeping this post to a vaguely manageable length I’ll just summarise here:

  • OAPEN-UK – collecting evidence on scholarly monograph publishing – Caren Milloy, Head of Projects JISC Collections. An introduction to the OAPEN-UK project, which is doing some interesting work collating attitudes towards the tricky prospect of archiving books and book chapters.
  • Building campus-based OA journal capacity: SAS Open Journals – Peter Webster, School of Advanced Study. A look at the School of Advanced Study’s impressive integration of the Open Journals System with an Eprints repository, SAS Space, to provide in-house journal publishing services- see for example Amicus Curiae, a fully open access in-house journal. This is work we need to start looking at here at City.
  • Encouraging data publication – the JISC Managing Research Data Programme – Simon Hodson, JISC Programme Manager – Managing Research Data. An overview of this JISC programme, which is looking into data curation. Data curation is at a tangent to OA as understood as access to research articles, but is just as important. Again, it’s an area we need to start looking at here at City.
  • Figshare and open science – Mark Hahnel, Product Manager, Figshare. An overview of the excellent Figshare service, essentially a Mendeley for research data. As someone said on Twitter, one of the challenges posed by Figshare for repositories is how nice it looks, about a million times better than most repository interfaces.
  • Frontiers – Online community-based peer review, publishing and research networking. – Graeme Moffat, Frontiers. Frontiers is a new open access journals platform. There are some fascinating innovations with the platform, most notably the (nearly) open peer review process, which utilises a web forum to exchange feedback about submitted articles.

Using social media to disseminate research outputs – Melissa Terras, Reader in Electronic Communications in the Department of Information Studies and Co-Director of the Centre for Digital Humanities at UCL. The final presentation of the day came from Melissa Terras, who talked about her experiences of Tweeting and blogging about research papers placed in her institution’s repository, UCL Discovery. You can read a full account of her experiments here (and it’s well worth reading in full), but what’s worth noting for the purposes of this post is the hundreds of extra downloads her papers received, merely by virtue of using social media to tell people about them. She also used her slot to make a plea for repository managers to understand academics’ attitudes with regard to self-archiving. Academics are essentially forward-looking, and when a paper is written it is generally considered to be over and done with. This makes retrospective appeals for academics to trawl their hard drives pretty onerous. I think the implication is that it’s incumbent on repository managers to make deposit as simple as possible, and to not get too hung up on back-runs of papers.

All in all an excellent event. RSP will have to do well to top this one; I’m looking forward to seeing if they can do it!

RIN event: How do we make the case for research data centres?

I attended the Research Information Network’s event, “How do we make the case for research data centres?”. The event was to mark the launch of the RIN/JISC report, “Data centres: their use, value and impact”. It was pretty high-level stuff, with plenty of discussion about the relationship of data centres to the research process and ways in which datasets are curated. There were a couple of very interesting examples of how data are used by researchers themselves, one from an academic who noted the value of data centre data because it didn’t require awkward conversations with potentially rivalrous labs; and another from a researcher in a small company building socio-economic models using data derived from ESDS’ wealth of datasets.

There were a few lessons for City Research Online, though, and I outline them briefly here:

  • Institutions (and by extension institutional repositories) remain important for the curation of data, given their local knowledge and relationships with researchers.
  • Institutional repositories are an excellent and cost effective method of storing data. What they are less good at is the managerial aspects of serving datasets to those who might wish to access them. IRs can’t provide the rich metadata and sophisticated web front ends that dedicated data centres provide.
  • However, there is still a role for data curation by IRs, for simple and/or small datasets. Where data are presented in tabular form and are easily catalogued, IRs can take on data that would not be interesting for data centres.
  • This is my own opinion, but I would extend the above role for IRs to include datasets which underlie published journal articles, particularly in those cases where we already archive the paper(s) in question- the ability of IRs to link together items is of benefit here. The challenge here is to advertise this as a viable and meaningful service for data creators.

So, some challenges for us to look at. In my experience, datasets are one of those repository things that can be “worried about later”, but I also think that datasets are of increasing importance to research. If we can identify ways in which City Research Online can usefully provide (perhaps modest) data curation services, then so much the better.

First OA items available in City Research Online

I’m pleased to announce that after a lot of testing and configuration, we have made our first seven items available in City Research Online, City’s open access research repository. The items can be viewed and accessed at the latest additions area of the website. The items are courtesy of Neil Thurman, who directs the Masters in Electronic Publishing here at City, which somehow seems very appropriate!

So, we have some content! Now we need to get more, and do all the tedious stuff like writing documentation and supporting web pages. But this is a real milestone, and a reminder (if one were needed) why we are doing this in the first place.

