City Research Online & IRUS-UK

Regular readers of this blog will know that we like our stats here at City Research Online. Therefore, when we were approached by representatives of the IRUS-UK project, we leapt at the chance to participate.

The JISC-funded project is intending to set up a national infrastructure to aggregate and disseminate institutional repository (IR) download statistics, thereby demonstrating the vital importance of IRs in the scholarly communications landscape (and by extension, the importance of Green Open Access). These statistics will also be COUNTER-compliant, meaning they can be reported on to SCONUL and other interested parties. The statistics gathered will be freely available and re-usable, in the spirit of the openly accessible IRs on which they report.

For us, the project is a chance to be involved in (and perhaps in a small way influence) the early stages of a project which is likely to be an important piece of infrastructure. It also will provide a way to verify the statistics we gather from our in-house tools (Eprints’ IR Stats package and the ubiquitous Google Analytics), and to benchmark ourselves against other institutions. The early indications are that the other four participating institutions (Bournemouth, Cranfield, Huddersfield and Salford) receive LOTS of downloads compared to us, but then they are all larger and more established than us.

I’ll blog about this project more in future, when we have more to report upon.

OR2012: Microsoft Academic Search

I’ve failed fairly miserably to blog about Open Repositories 2012, but here at least is something I’ve taken from the conference to work on here at City. This was from the session on Repositories and Microsoft Academic Search presented by Alex Wade from Microsoft, and you can see a video of this presentation here (it’s the first presentation).

Microsoft Academic Search is precisely what you might guess it to be- an academic search engine, in the vein of (e.g.) Google Scholar. Where it seems to offer added value over Google’s offering is its ability to build and enrich the data it holds, through wiki-like functionality, then to display this data in interesting ways. For example, here’s City academic Jason Dykes’ Citation Graph, showing the authors who have most often cited his work. The service also aggregates data at an institutional level- see for example City University London’s listing.

Where it gets interesting in repository terms is the ability to “seed” publication records with links to PDFs, for example those PDFs held in City Research Online, using the feature that allows you to edit the metadata of any record. I’ve experimented with doing this for the aforementioned Prof Dykes. The process is not quite wiki-like, in that there is a delay and verification before changes go live, but it seems to me that this is an easy way of pointing back to repository materials, and should also help with Google page rankings. There was also a commitment given, during Alex Wade’s presentation, that the Microsoft Academic Search team would be looking at automatically harvesting repository records to further enrich the service’s data, and to point back to the wealth of open access material held in repositories.

If anyone else has experience doing this, I’d be very interested to hear about it!

Browse City authors in City Research Online!

Thanks to the efforts of the ever-helpful guys at Eprints Services, we now have a live author browse functionality for City Research Online (CRO). As you might have guessed, it allows you to have a look at City authors who have contributed full text material to CRO. As part of this work, the Services people also managed to get links working from individual authors as listed in citations, allowing users to click through from the record level to a full listing of any City author’s work- see for example the author listings linked to from this record.

Name authority, to allow useful browsing of “home” authors, is something I (with others) have grappled with in the past. This time round, we decided to go for a quick solution. A City author is defined by being someone with a City email address in the field used to record authorship of any given item. The email address is used as a key, and multiple items with the same email address are grouped together by virtue of being related to that key. The form of name that displays is based on the email address, which is not ideal as it means we end up with some authors as “Smith, J” and others as “Jones, David”, but I think it will do. There is now some cleaning up we need to do- for example making sure that all City authors have a valid email address associated with records they have authored.
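As a rough illustration of the email-as-key approach described above, here is a minimal sketch in Python. It is not the Eprints implementation: the record layout, the `city.ac.uk` domain check and the function name are all assumptions made for the example.

```python
from collections import defaultdict

CITY_DOMAIN = "city.ac.uk"  # assumed domain for City email addresses


def group_by_city_author(items):
    """Group item titles under each City email address found among their creators."""
    authors = defaultdict(list)
    for item in items:
        for creator in item["creators"]:
            # Normalise the address so it can act as a stable key.
            email = (creator.get("email") or "").strip().lower()
            if email.endswith("@" + CITY_DOMAIN):
                authors[email].append(item["title"])
    return dict(authors)


# Hypothetical records: the same author appears under two forms of name.
items = [
    {"title": "Paper A",
     "creators": [{"name": "Smith, J", "email": "J.Smith@city.ac.uk"}]},
    {"title": "Paper B",
     "creators": [{"name": "Smith, John", "email": "j.smith@city.ac.uk"},
                  {"name": "Jones, David", "email": None}]},
]
grouped = group_by_city_author(items)
```

In this sketch both papers end up under one key despite the differing forms of name, while the creator with no email address is left out, which is why the clean-up of missing addresses matters.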

We also took the opportunity to tweak the look of the repository, changing the colour of URLs to a more prominent shade of blue, and causing them to be underlined when hovered over. This will, I hope, make the author browse links as well as other internal links more obvious and therefore more used.

Open Repositories 2012 – brief conference report

A couple of weeks ago I attended Open Repositories 2012 in Edinburgh. The conference is the big event for anyone interested in open access technology and policy, and it had over 450 delegates from more than 40 countries in attendance. It featured a packed schedule, and I’ll post further over the course of the next week or two with musings upon this. For far more comprehensive overviews of the conference, see the incredibly full accounts of various sessions from the conference’s in-house liveblogging team; Natalia from LSE Library’s posts (part 1, part 2); Nick from UKCoRR’s reflections; and Yvonne from Warwick’s thoughts.

In yet more shameless self-publicity, I gave a paper on the Friday morning of the conference in the Eprints User Group strand, immediately before hopping on a train back to London. It was on our set-up here at City, and the extent to which we’ve managed to integrate City Research Online with the rest of the University’s systems. You can see the abstract and slides in the open access repository, and I have also uploaded the slides to Slideshare.

Finally, a photograph, courtesy of Dave Puplett from LSE. It was taken at the Playfair Library, where a drinks reception was held. It’s of me giving the thumbs up to my favourite philosopher and doyen of the Scottish Enlightenment, David Hume.


One of these people is a renowned philosopher. The other isn’t.

Open Repositories 2012- the excitement mounts!

Open Repositories 2012 is approaching, and I’m getting excited/nervous, as I will be presenting a paper. It’s on our experiences here at City of coming to the repository game relatively late, and how to go about integrating repository systems with other university systems, policy and stakeholders given that situation. You can read the abstract of my presentation at City Research Online (I’ll update that record with slides when I’ve written the damn thing, which I will hopefully do tomorrow).

If you’re planning on attending the conference, I would encourage you to create a Crowdvine profile, which is a lightweight social network for delegates- you can see my profile here (warning: contains mugshot). Looking forward to seeing repository types in Edinburgh!

RSP event: Scholarly Communications: New Developments in Open Access

Laura and I attended the Repositories Support Project (RSP) event Scholarly Communications: New Developments in Open Access last Friday. It was held in the spectacular surroundings of RIBA’s Portland Place building, which gave proceedings a suitable air of grandeur. The event had a first-class line-up of speakers, and was really excellent- the RSP should be congratulated for the event’s high quality and depth of content. A Storify archive of the event’s Tweets is worth taking a look at, if you’re into the whole micro-blogging thing. What follows are my thoughts about the sessions, and as ever they are partial and impressionistic, so apologies in advance for any errors or mistakes in emphasis.

Where next with Open Access – keynote presentation – Martin Hall, Chair of Open Access Implementation Group and Vice Chancellor of the University of Salford. Professor Hall was the biggest scoop for the event- not only is he a VC, but he is also a member of the Working Group on Expanding Access to Published Research Findings, AKA the Finch Committee. He was therefore perfectly placed to deliver the keynote, which took a very high-level view of open access (OA) developments in light of the work of the Finch Committee and other developments such as the Elsevier boycott and the recent White House petition. His vision was one of a slow transition towards full Gold OA predicated on a market in Article Processing Charges (APCs), with a mixed economy (subscription journals, Gold OA journals and Green OA repositories) in the intervening period. He noted two particular possible victims of “collateral damage” in this change:

  • Learned Societies, who often rely on journal subscription charges to fund their activities and operate on very tight margins. The withdrawal of these subs could have a disastrous effect.
  • Independent researchers, who would not have access to institutional funds for APCs (though of course these people are currently in the opposite position- able to publish in journals, but having to rely on the c. 20% of openly accessible articles)

He offered no solutions to these snags, but at least they are being considered. His final remark was heartening for those of us plugging away with institutional repositories: legislative and academics’ attitudinal changes are likely to result in heightened interest in all forms of repositories during this change. I would add to this (as [namedrop alert] Bill Hubbard said to me during a break between sessions) that the emphasis on OA during the next REF cycle is likely to be another driver for interest in repositories.

The Budapest Open Access Initiative (BOAI) at 10 – recommendations for the next ten years of scholarly communications – Alma Swan, Director of European Advocacy, SPARC and Key Perspectives. This presentation was by another very prominent OA advocate, Alma Swan, who had recently participated in updating the BOAI, work which must have been extremely challenging given the stakeholders involved. A few particular points of interest arose from her presentation:

  • For green repositories, gratis OA is better than no OA at all; Libre OA is better than gratis OA. Ideally green OA content should be licensed as CC-BY for full libre re-use. I agree with this (though the distinction is not uncontroversial); the question for us becomes how to license, and then flag, our repository content as Libre, i.e. CC-BY. There are also implications for text- and data-mining of green repository content if this shift is not implemented.
  • BOAI 2012 will make explicit recommendation of use of “Alternative metrics” to assess impact, for example the Altmetric service, or green repositories’ native download and access path statistics.
  • She noted one (in my view) telling statistic regarding access to PubMed Central. This was that 40% of people accessing this site can be defined as “citizens” as opposed to researchers or governmental people, which I think gives the lie to arguments that people do not need (or cannot make sense of) open scholarly research.

I’m now looking forward to the publication of the new version of the BOAI, which should provide yet more impetus toward OA.

Next up were some sessions about some projects and services, which in the interest of keeping this post to a vaguely manageable length I’ll just summarise here:

  • OAPEN-UK – collecting evidence on scholarly monograph publishing – Caren Milloy, Head of Projects JISC Collections. An introduction to the OAPEN-UK project, which is doing some interesting work collating attitudes towards the tricky prospect of archiving books and book chapters.
  • Building campus-based OA journal capacity: SAS Open Journals – Peter Webster, School of Advanced Study. A look at the School of Advanced Study’s impressive integration of Open Journal Systems with an Eprints repository, SAS Space, to provide in-house journal publishing services- see for example Amicus Curiae, a fully open access in-house journal. This is work we need to start looking at here at City.
  • Encouraging data publication – the JISC Managing Research Data Programme – Simon Hodson, JISC Programme Manager – Managing Research Data. An overview of this JISC programme, which is looking into data curation. Data curation is at a tangent to OA as understood as access to research articles, but is just as important. Again, it’s an area we need to start looking at here at City.
  • Figshare and open science – Mark Hahnel, Product Manager, Figshare. An overview of the excellent Figshare service, essentially a Mendeley for research data. As someone said on Twitter, one of the challenges posed by Figshare for repositories is how nice it looks, about a million times better than most repository interfaces.
  • Frontiers – Online community-based peer review, publishing and research networking. – Graeme Moffat, Frontiers. Frontiers is a new open access journals platform. There are some fascinating innovations with the platform, most notably the (nearly) open peer review process, which utilises a web forum to exchange feedback about submitted articles.

Using social media to disseminate research outputs – Melissa Terras, Reader in Electronic Communications in the Department of Information studies and Co-Director of the Centre for Digital Humanities at UCL. The final presentation of the day came from Melissa Terras, who talked about her experiences of Tweeting and blogging about research papers placed in her institution’s repository, UCL Discovery. You can read a full account of her experiments here (and it’s well worth reading in full), but what’s worth noting for the purposes of this post were the hundreds of extra downloads her papers received, merely by virtue of using social media to tell people about them. She also used her slot to make a plea for repository managers to understand academics’ attitudes with regard to self-archiving. Academics are essentially forward-looking, and when a paper is written it is generally considered to be over and done with. This makes retrospective appeals for academics to trawl their hard drives pretty onerous. I think the implication is that it’s incumbent on repository managers to make deposit as simple as possible, and to not get too hung up on back-runs of papers.

All in all, an excellent event. RSP will have to do well to top this one, and I’m looking forward to seeing if they can do it!

Symplectic Conference

Laura and I attended Symplectic’s Conference earlier in the week, which featured a number of interesting presentations and some intriguing feature announcements. The presentations included:

  • An introduction to the VIVO Project, an initiative to create a network of scientific researchers and enable discovery of those researchers’ publications. VIVO has a number of interesting features which we might look at here at City, not least the ability to create staff profiles.
  • An update on the DURA Project, which will synchronise users’ Mendeley accounts with Symplectic Elements, allowing easy addition of both metadata and full text to Elements, and hence to repository systems. This looks like an excellent feature development, though of course it will depend on City people using Mendeley. A quick search on Mendeley reveals about 20 City users of the system- not loads, but a start.
  • An introduction to Digital Science, a spin-off from the Nature Publishing Group, which is investing in many networked science start-up companies including Symplectic, and notably also Figshare and Altmetric, two companies we like!

The conference then heard from Symplectic CEO Daniel Hook, who outlined development priorities for Symplectic over the course of the next year. There were a lot of them, so I thought I would summarise some of the ones we’re particularly looking forward to here at City:

  • New data sources, including RePEc (actually available in the latest version of Elements, which we will be upgrading to soon), the British Library (book and chapter data?) and CrossRef (with the ability to pull through article-level metadata, hopefully)
  • User profiling and CV generation.
  • An upgraded user interface, featuring Symplectic’s snazzy new branding and the ability to customise look and feel. We’ll certainly want to make our Elements installation look more like the rest of City’s web presence.
  • Enhanced search, including via the API. This should assist us with outputting publications data to City’s web presence, particularly if and when a university-wide staff profiling system is put in place.
  • New reporting functionality. Reporting is already pretty good in my opinion, but any way to improve this is to be welcomed. Hopefully a report scheduler will be added.

The afternoon was taken up with a focus group session, which involved answering some (tricky) questions about the functionality of the REF module, which will hopefully help make that part of the system more user friendly.

All in all, a really good event, which looked at the bigger picture, but also promised some exciting developments for Elements over the course of the next year. It was also heartening to hear about Symplectic’s commitment to its software interacting with repository systems, something that is always high on our agenda here at City.

Stat attack!

I’ve become a little bit obsessed with monitoring the statistics for City Research Online recently, particularly Google Analytics’ Real Time functionality, which allows you to see visitors as they look at the site in real time.

Obsession aside, we’ve reached a couple of milestones in the last week or so. First, we had our first 100-download day, which weirdly enough was last Saturday the 21st of April:

April 2012 download statistics

Second, we seem to now be attracting more than 100 individual visitors day on day for the first time, which I think is indicative of the increasing amounts of traffic we’re getting via search (inevitably largely Google). I think it probably also indicates that we’re getting indexed more often and more comprehensively by the big G.

It’s also further proof that material added to City Research Online really does get found, downloaded and (presumably) read, built upon and cited.

City Research Online & electronic theses

We’ve been slowly working towards making City’s PhD theses available in City Research Online, and we’re now at the stage where we’re going to be adding a lot more full text versions of these important pieces of research. Working out these issues, and talking to PhD students about the uses made of their work, is also an opportunity to persuade early-career researchers of the benefits of open access, hopefully hooking them for the remainder of their career!

We already have some theses available (four at the time of writing) in the open access repository, thanks to PhD students getting in touch with us and passing on electronic versions. There are a few problems specifically associated with managing theses: you have to be particularly careful about how they are handled, since they represent three or more years of research, and are often intended to be published further down the line; there are potentially tricky issues with copyright (author permissions, 3rd party copyright) and sensitive data (commercial or personal); and there are the various places e-theses can both be sourced from and also end up- for example, there are already over 200 City theses held by the BL’s EThOS service, not to mention DART Europe.

I think we’ve worked through these issues to our satisfaction (or at least I’ve produced some papers on them!), and we’re now at a stage where we can recruit more content. There are two sources of e-theses we’re going to examine first. They are:

  1. A nice back-run of c. 50 we have here in the Library (on CD-ROMs!), with permissions forms all signed off. We’re going to add these, then email students to tell them we have done so.
  2. All examined theses going forward. We need to do a bit more liaison to make sure that the Schools and Departments are clear about what we will do with newly received e-theses (this shouldn’t come as a shock to anyone!), and it will mean that we receive c. 250 newly examined theses per year.

Once we’re comfortable with the work-flows for managing these two sets of theses (and have exhausted the former set), we can have a look at other sources, including theses currently in EThOS but not held locally. I’m also in the process of setting up EThOS automatically harvesting our content, meaning that theses deposited in City Research Online will automatically be added to EThOS- a two for one offer!

This work has taken a while to come to fruition, but it’s really pleasing to think that over time we’ll become a comprehensive source of Doctoral research produced here at City.

City Research Online search functionality

We’ve just launched search functionality for publications data held in City Research Online. We’ve created a dedicated search page on the Research area of City’s website, as well as a supporting page with information about the service in general. This is the first time we’ve surfaced data from Symplectic (our Current Research Information System) to the web, and it took us a while to sort it out as well as some dedicated web development time, but we’re pleased with the results.

The search has been created by using Symplectic’s Application Programming Interface (API). The API pushes out “approved” (i.e. items validated by their author(s)) publications (in the form of citations plus abstracts) to a cache. The cache is then indexed and ranked by Funnelback, City’s corporate website’s indexing tool. The indexed data is then exposed to a keyword search via the form at the page linked to above.
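The flow described above (approved items pushed to a cache, the cache indexed, then a keyword search over the index) can be sketched in a few lines of Python. This is purely illustrative: it does not use Symplectic’s API or Funnelback, and the record fields, function names and sample data are assumptions made for the example.

```python
def build_cache(publications):
    """Keep only author-approved publications, as citation plus abstract."""
    return [
        {"citation": p["citation"], "abstract": p.get("abstract", "")}
        for p in publications
        if p.get("approved")
    ]


def build_index(cache):
    """Map each lower-cased word to the set of cache positions containing it."""
    index = {}
    for pos, record in enumerate(cache):
        text = f"{record['citation']} {record['abstract']}".lower()
        for word in text.split():
            # Strip surrounding punctuation so "concrete." matches "concrete".
            index.setdefault(word.strip(".,;:()"), set()).add(pos)
    return index


def search(index, cache, term):
    """Return citations whose citation or abstract contains the keyword."""
    return [cache[pos]["citation"] for pos in sorted(index.get(term.lower(), ()))]


# Hypothetical publication records.
publications = [
    {"citation": "Smith, J. (2011). Concrete durability.",
     "abstract": "A study of ageing.", "approved": True},
    {"citation": "Jones, D. (2012). Map design.",
     "abstract": "A concrete example of layout.", "approved": True},
    {"citation": "Brown, A. (2010). Concrete mixes.",
     "abstract": "", "approved": False},
]
cache = build_cache(publications)
index = build_index(cache)
hits = search(index, cache, "concrete")
```

Here the unapproved record never reaches the cache, and the second hit matches only because the keyword appears in its abstract rather than its citation.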

There are a few features of the search, and the results it creates, worth flagging:

  • We have top-ranked search results where there are full text open access papers associated with those results. See, for example, a search for Jason Dykes’ publications– you’ll note that the first 40 or so hits allow you to click through to an openly accessible paper. This was done on the rationale that people are more likely to be interested in results with papers associated (and it doesn’t hurt our download statistics!)
  • As mentioned above, the search’s index includes abstracts, where present in the publication’s metadata. This means that search terms can sometimes appear a little fuzzy, particularly when you get towards the bottom of a list of hits- see for example this page, which is the fourth page of four when searching for the term “concrete”. We’re not too worried about this, given the propensity of searchers to only look at the first couple of pages of hits for any given search.
  • The advanced search is not particularly advanced. Our web developer is going to include a date range for results, but generally we weren’t looking to re-create a City version of e.g. Scopus, so we felt that relatively few advanced search options would be adequate.
  • We still need to do a bit of re-jigging of the formatting of the main search page, for example to include some text fields after the search form, to make the layout look a bit nicer. We’ve also included the service’s Twitter stream and an RSS feed of new items on this page, to give an idea of full text content being made live.
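The first feature in the list above, putting results with an associated full text paper at the top, amounts to a stable sort on a single flag. The sketch below shows the idea in Python; the real ranking is done by Funnelback, so the `has_full_text` field and the function name are hypothetical.

```python
def rank_results(hits):
    """Place hits with an open access full text first, keeping the original order otherwise."""
    # sorted() is stable and False sorts before True, so negating the flag
    # moves full-text hits to the front without reshuffling either group.
    return sorted(hits, key=lambda hit: not hit["has_full_text"])


# Hypothetical search hits in their original relevance order.
hits = [
    {"title": "A", "has_full_text": False},
    {"title": "B", "has_full_text": True},
    {"title": "C", "has_full_text": False},
    {"title": "D", "has_full_text": True},
]
ranked_titles = [hit["title"] for hit in rank_results(hits)]
```

Because the sort is stable, relevance order is preserved within the full-text group and within the metadata-only group.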

As ever, any feedback on any aspect of this new functionality is much appreciated- you can email the team at

