Scott Edmunds is Editor-in-Chief at GigaScience Press, which has worked closely with DataCite since 2011 as part of the publisher’s commitment to open science and open data. Scott is a member of the Make Data Count advisory group and supports the initiative’s efforts to make it possible to evaluate the usage of data in the later steps of the data-sharing cycle, so that we can better assess the benefits of public sharing of data. In the interview below, Scott talks with the new Make Data Count Director, Iratxe Puebla, about Make Data Count and what’s coming next for the initiative.
Can you tell us a bit about yourself, and your journey towards Making Data Count?
I have been involved in the open science space for a while, first working in open-access journals and more recently at ASAPbio, where I worked on promoting preprints in the life sciences – I shared a bit more about my professional path in this earlier post. I became interested in open data during my time implementing the data policy at PLOS ONE. The journal’s wide scope meant working through the nuances of datasets from a wide range of disciplines, and I experienced first-hand the issues that arise for rigor, reproducibility and research exploration when datasets underlying articles are not available. After pursuing my interest in open data through voluntary community efforts (e.g. as co-lead of the FORCE11/COPE Research Data Publishing Ethics Working Group), this role as Director of Make Data Count allows me to combine my interests in open data and community engagement. It is also an exciting opportunity to support the evaluation of data usage and impact, a topic that will become increasingly important in the coming years.
What exactly is Make Data Count?
Make Data Count is an initiative aimed at advancing open data metrics to enable the evaluation and reward of research data reuse. We have been telling researchers for years that sharing their data matters, but there are few tangible rewards for those who do so in the current research ecosystem. We have also been telling ourselves that open data advances discovery, but we need to meaningfully evaluate whether, how, and for whom this happens. To get to a place where research data are valued across scholarly activities, evaluation, and communications, we need metrics that enable assessment of how data are used, according to the interests and needs of different stakeholders. We also need those metrics to be transparent and meaningful, so that they provide adequate context for each specific use. This is why it is crucial to engage as diverse a range of groups as possible in developing open data metrics. Make Data Count aims to bring the different stakeholders together (researchers, institutions, policymakers, publishers, infrastructure providers) to address needs and best practices, and advance the implementation of open data metrics in evaluation.
Data citation sounds like quite an abstract and technical concept. Why is promoting it and measuring data reuse so important?
Data citations make the links between datasets and other outputs transparent and visible. They are the signal that an individual or a group (e.g. a researcher, policymaker, or a member of the public) found that dataset relevant to them. By establishing those connections, data citations bring benefits to the data producers, the data users, and the research ecosystem as a whole. Data citations assign attribution to the data producers, allow data users to raise the reproducibility and transparency of their work, and enable ecosystem-level evaluation of the impact of open data. When they include a persistent identifier and associated metadata, data citations become particularly valuable, as the citation is then visible across different sources and workflows. Data citations also make it possible to undertake bibliometric studies into the reuse of datasets, so that we can understand what the right indicators for data reuse are in specific contexts, for example, across research disciplines.
Who else is currently involved in Make Data Count, and how can others get involved?
Make Data Count is a community-led initiative with an advisory group whose members bring perspectives from research and from research-supporting organizations and infrastructure. Different global organizations and individuals have contributed over the years, including DataCite, California Digital Library, DataONE, meta-researchers and others. The initiative’s goals align closely with DataCite’s mission to make research outputs findable, citable, connected and reused globally, and I am thrilled to be in this role, which signals DataCite’s continued commitment to the initiative.
Make Data Count is a collaborative effort open to anyone interested in developing and promoting open data metrics. Data repositories and publishers have a key role: repositories by standardizing data usage reporting, and both repositories and publishers by submitting data citations in metadata. Institutions and funders can provide support and create incentives for researchers to share, reuse and cite data. And of course, we want all of this to help researchers get attribution for the data they share. Make Data Count aims to act as a convener of stakeholders to facilitate conversations about data usage and data metrics. To this end, we are hosting the Make Data Count Summit in Washington, DC, on September 12-13. This will be a unique opportunity to discuss the next steps toward the implementation of open data metrics across the research ecosystem, and I invite everyone interested to join us in September.
The Make Data Count Summit in September sounds interesting. What will be covered and what does it hope to achieve?
The goal of the Summit is to bring together all stakeholders invested in open data metrics – researchers, government data administrators, funders, policymakers, publishers, and infrastructure providers – to discuss the evaluation of open research data usage, reach, and impact. The meeting will cover ongoing data metrics initiatives, including Make Data Count and Democratizing Data, review the existing evidence on data metrics from meta-research studies, and explore how open data metrics can be incorporated into different evaluation processes such as policy and funding decisions, scholarly communications, and academia. We will be sharing more details on the program and speakers for the Summit very soon, so stay tuned!
What is coming up next for the initiative?
In addition to the Make Data Count Summit, another important immediate focus is the development of the open global data citation corpus. This is a project that DataCite is leading, in partnership with the Chan Zuckerberg Initiative and with support from the Wellcome Trust. The corpus will store data citations from a diverse set of sources and will include visualization tools to respond to the needs of different stakeholders. We see the corpus as an important step toward tools and services that make it possible for the community to evaluate data usage in easier and more automated ways. We will engage with the community to understand how the corpus can best address their needs and we will be sharing updates as our work progresses. I invite anyone interested in learning more about the corpus and how this resource can be useful to them to reach out to me.