Data Repository Selection: Which Criteria Matter?

This blog post was written together with Matthew Cannon, Wei Mun Chan, Ilaria Carnevale, Imogen Cranston, Scott Edmunds, Nicholas Everitt, Emma Ganley, Chris Graf, Iain Hrynaszkiewicz, Varsha K. Khodiyar, Thomas Lemberger, Catriona J. MacCallum, Kiera McNeice, Hollydawn Murray, Philippe Rocca-Serra, Kathryn Sharples, Marina Soares E Silva, and Jonathan Threlfall. It is cross-posted across several blogs to request feedback from different communities.

Publishers and journals are developing data policies to ensure that datasets, as well as other digital products associated with articles, are deposited and made accessible via appropriate repositories, in line with the FAIR Principles. With thousands of options available, however, the lists of deposition repositories recommended by publishers often differ, and consequently the guidance provided to authors may vary from journal to journal. This is due partly to a lack of common criteria for selecting data repositories, and partly to the fact that there is still no consensus on what constitutes a good data repository.

To address this, FAIRsharing and DataCite have joined forces with a group of publisher representatives who are actively implementing data policies and recommending data repositories to researchers. The result of our work is a set of proposed criteria that journals and publishers believe are important for the identification and selection of data repositories, which can be recommended to researchers when they are preparing to publish the data underlying their findings.

Our work intends to:

  • reduce complexity for researchers when preparing their submissions to journals,
  • increase efficiency for data repositories that currently have to work with all individual publishers, and
  • simplify the process of recommending data repositories for publishers.

The aim is to make the implementation of research data policies more efficient and consistent, which may help to improve approaches to data sharing by promoting the use of reliable data repositories.

Although we recognize the role of researchers and other stakeholders in the research data life cycle, in this first instance the target audience for our work is other journals and publishers, repository developers and maintainers, certification and other evaluation initiatives, and other policy makers.

These proposed criteria are intended to:

  • guide journals and publishers in providing authors with consistent recommendations and guidance on data deposition, and improve authors’ data sharing practices;
  • reduce the potential for confusion among researchers and support staff, and reduce duplication of effort by different publishers and data repositories;
  • inform data repository developers and managers of the features believed to be important by journals and publishers;
  • provide a basis for collaboration with certification and other evaluation initiatives, serving as a reference and perspective from journals and publishers;
  • drive the curation of data repository descriptions in FAIRsharing, which will enable display, filtering and searching based on these criteria.

We invite you to read the pre-print article that describes the work and provide us with feedback via this form.

DataCite will use your input to inform the work we are doing within the European-funded FAIRsFAIR project on findability of trustworthy repositories, collaborating closely with re3data and CoreTrustSeal.