Finding the Proof of the PID Pudding

DataCite occasionally has guest blog posts covering important topics and recent developments in the open infrastructure landscape. In this blog post by Alice Meadows (MoreBrains Cooperative), Josh Brown (MoreBrains Cooperative), and Natasha Simons (Australian Research Data Commons), we hear about the recently published cost-benefit analysis of persistent identifiers in the Australian research ecosystem.

If you’re reading this blog post, the chances are you’re a bit of a PID enthusiast. You understand the value of PIDs and their metadata, and you advocate for them to be widely adopted and implemented so that everyone can benefit from them. But sometime, somewhere, someone is going to ask you for proof that investing in PIDs is really worthwhile. And, other than anecdotes and a few small and quite specific use cases, such as this simulator developed by Portuguese funder FCT, that proof has been largely lacking. Until recently, that is.

Earlier this year, DataCite consortium lead and partner organization, the Australian Research Data Commons (ARDC), together with Australian ORCID consortium lead organization, the Australian Access Federation (AAF), commissioned the MoreBrains Cooperative to undertake a cost-benefit analysis of the incentives for adoption of persistent identifiers (PIDs) by the Australian research sector. The resulting report, Incentives to invest in identifiers: A cost-benefit analysis of persistent identifiers in Australian research systems, published in September, found that 80% adoption of five priority PIDs would lead to savings of 38,000 researcher days per year. The direct financial cost of this currently wasted effort is close to AUD24 million per year (around 15M USD/EUR); accounting for the opportunity cost associated with technology transfer and innovation-led growth, the savings increase to a staggering AUD84 million per year!

The PIDs in question are ORCID iDs for people, ROR IDs for institutions, ARDC’s own RAiDs for projects, Crossref and DataCite DOIs for research outputs, and Crossref DOIs for grants. In addition, as part of a longer-term strategy, the report recommends that work should continue on developing PIDs for instruments, expanding the uses of IGSN IDs for samples, and potentially other IDs, in collaboration with other research communities. Other recommendations include:

  • Developing a national PID strategy for Australia, which builds on the success of the AAF-led Australian ORCID consortium and leverages the leadership ARDC is already providing on PIDs
  • Ensuring that the five priority PIDs are integrated by key stakeholders in the Australian research sector
  • Building on the success of the Australian Research Council’s (ARC) integration of ORCID into its research management system, with other Australian funders adopting a similar approach and expanding it to include the full suite of priority PIDs
  • Engaging with commercial providers of Research Information Management Systems (RIMS) and repositories and the communities that support open source RIMS in order to encourage and enable the further wholesale adoption of PIDs into those systems

These findings follow on from a previous PID cost-benefit analysis that the MoreBrains team carried out last year for Jisc in the UK, which focused on the same five priority PIDs. The Jisc analysis took a much more conservative approach, aiming to identify the minimum likely cost savings if these PIDs were more widely adopted and implemented at UK universities. It also factored in the cost of establishing a UK PID support network to facilitate wider PID adoption and implementation. The Jisc analysis found that there would be minimum savings of £5.67M over the course of five years if PID adoption targets of 67% by year 3 and 85% by year 5 could be met, even after major investments in PID integrations and, of course, the national support network.

Both analyses highlight the three main benefits of PIDs:

  • Metadata reuse: PID registries act both as repositories for metadata and as services that provide programmatic access to it, which saves the time and effort of rekeying it and improves accuracy.
  • Automation: The presence of a PID in a system or a metadata record can act as a trigger for an action. The value of automation can go beyond time saved to include more complete information and more timely information processing.
  • Aggregation and analysis: At the institutional or national scale, aggregating information about entities and the relationships between them enables strategic analysis, benchmarking, the plotting of trends, and other insights.
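
For readers who like to see this in practice, the first two benefits can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record below is a hypothetical, trimmed-down stand-in for what a PID registry might return, and the field names, the DOI, and both helper functions are invented for this example rather than taken from any registry's actual API or from the reports.

```python
# Hypothetical record, loosely shaped like metadata a DOI registry could return.
# The DOI, names, and field layout are illustrative assumptions only.
SAMPLE_RECORD = {
    "doi": "10.1234/example-dataset",
    "titles": [{"title": "Example survey dataset"}],
    "creators": [
        {"name": "Doe, Jane", "orcid": "0000-0002-1825-0097"},
        {"name": "Roe, Richard"},  # no ORCID iD on this creator
    ],
    "publicationYear": 2022,
}


def prefill_deposit_form(record):
    """Metadata reuse: copy registry metadata into a local form
    instead of asking a researcher to rekey it."""
    return {
        "title": record["titles"][0]["title"],
        "year": record["publicationYear"],
        "authors": [creator["name"] for creator in record["creators"]],
    }


def creators_to_notify(record):
    """Automation: the presence of an ORCID iD on a creator acts as
    the trigger for a follow-up action (here, just collecting the iD)."""
    return [c["orcid"] for c in record["creators"] if c.get("orcid")]


form = prefill_deposit_form(SAMPLE_RECORD)
to_notify = creators_to_notify(SAMPLE_RECORD)
print(form)
print(to_notify)
```

In a real integration the record would come from a registry lookup rather than a hard-coded dictionary, but the pattern is the same: read the metadata once, reuse it everywhere, and let the PID itself decide which automated steps run.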

And both focus primarily on the savings attached to the first of these, metadata reuse, since it is the easiest (if still not exactly easy) to measure. The benefits of automation and aggregation/analysis would result in even more substantial cost savings.

So, the next time someone asks you for proof that PIDs are a worthwhile investment, you’ll know what to say: just point them to ARDC, AAF, and MoreBrains’ report, where they’ll find all the evidence they need.

Alice Meadows