DataCite’s commitment to The Principles of Open Scholarly Infrastructure

DataCite was founded in 2009 on the principle of being an open, stakeholder-governed community that welcomes participation from organizations around the world. Today, that continues to be true. Although our services have expanded, we remain grounded in our roots. DataCite’s umbrella was formed with the aim of safeguarding common standards worldwide to support research, thereby facilitating compliance with the rules of good scientific practice. DataCite’s identifier registration, Data File, and services are foundational components of the scholarly ecosystem. As the ecosystem continues to evolve, governance, sustainability and living-will insurance have become increasingly important components of the open infrastructure.

Recently, several open scholarly infrastructure organizations and initiatives have adopted The Principles of Open Scholarly Infrastructure. DataCite has conducted its own audit against the principles and would like to affirm our commitment to upholding them.


POSI Governance Self-Assessment

Coverage across the research enterprise – it is increasingly clear that research transcends disciplines, geography, institutions and stakeholders. The infrastructure that supports it needs to do the same.

As a global organization, DataCite represents a member community from over 47 countries, more than 2,300 repositories and a global user base that has the means to find, cite and reuse research. DataCite provides the means to create, find, cite, connect, and use research across a range of resource types and seeks to address the needs of our members throughout the research lifecycle. We support research organizations, including universities, government bodies, research facilities, medical facilities, funders, and research-producing companies. In this way, DataCite services can be used for many types of research activities.

Stakeholder Governed – a board-governed organisation drawn from the stakeholder community builds more confidence that the organisation will take decisions driven by community consensus and consideration of different interests.

DataCite Members are the voting body of the organization. Membership is open to all entities that support DataCite’s data sharing mission. Members are governed by the statutes (the bylaws and operating procedures) of the organization. The statutes are developed, ratified, and approved by the members.

Non-discriminatory membership – we see the best option as an “opt-in” approach with a principle of non-discrimination where any stakeholder group may express an interest and should be welcome. The process of representation in day to day governance must also be inclusive with governance that reflects the demographics of the membership.

As an open, community-based organization, DataCite provides the means for any organization to either join as a member of the association or participate within one of the existing consortia globally. We uphold a set of standards and request that new members accept the DataCite statutes as the governing procedures of the organization.

Transparent operations – achieving trust in the selection of representatives to governance groups will be best achieved through transparent processes and operations in general (within the constraints of privacy laws).

DataCite has continued to adhere to transparent operations by making operational decisions, documentation and processes openly available to the community. In addition, we engage openly with the DataCite General Assembly to ensure that we continue to operate on a transparent basis.

Our members meet annually at the General Assembly to approve DataCite’s revenue and expenditures, stand and/or vote for the Executive Board, guide DataCite’s strategy, put forth resolutions and modify the association’s statutes. Members also participate in DataCite’s Steering and Working Groups and provide input on new member applications. In addition, DataCite members contribute to our public development roadmap and other initiatives through a range of platforms, such as our monthly open hours.

Cannot lobby – the community, not infrastructure organisations, should collectively drive regulatory change. An infrastructure organisation’s role is to provide a base for others to work on and should depend on its community to support the creation of a legislative environment that affects it.

DataCite does not lobby, nor does it include regulatory change as part of its remit. DataCite’s core purpose is to provide the means to create, find, cite, connect, and use research.

Living will – a powerful way to create trust is to publicly describe a plan addressing the condition under which an organisation would be wound down, how this would happen, and how any ongoing assets could be archived and preserved when passed to a successor organisation. Any such organisation would need to honour this same set of principles.

As a nonprofit, DataCite is governed by a set of statutes that are approved by our board and members. The DataCite Statutes provide the procedures to follow should the association need to be dissolved. In addition, DataCite provides the Data File under a CC0 license and all code is made publicly available under the open MIT License.

DataCite intends to archive the Data File with a third party in the coming months. This is therefore an area for improvement.

Formal incentives to fulfil mission & wind-down – infrastructures exist for a specific purpose and that purpose can be radically simplified or even rendered unnecessary by technological or social change. If it is possible the organisation (and staff) should have direct incentives to deliver on the mission and wind down.

DataCite operates on a strict cost recovery basis with a lean team that supports a global community. The mission of the organization is to address the needs of the global research community and our sustainability is supported through a cost-recovery membership model.


POSI Sustainability Self-Assessment

Time-limited funds are used only for time-limited activities – day to day operations should be supported by day to day sustainable revenue sources. Grant dependency for funding operations makes them fragile and more easily distracted from building core infrastructure.

DataCite continues to make efforts to ensure that our sustainability is supported through membership fees. We reached an important milestone during 2020 with the approval of an updated membership model that ensures that DataCite can operate on a sustainable basis and is no longer dependent on grant funding. As we continue to seek opportunities to collaborate and innovate with community partners, there are occasions where DataCite will opt to participate in funded projects and, in these cases, allocate time-based resources to these projects. DataCite’s day-to-day operations are fully supported through the membership and service fees model.

Goal to generate surplus – organisations which define sustainability based merely on recovering costs are brittle and stagnant. It is not enough to merely survive, it has to be able to adapt and change. To weather economic, social and technological volatility, they need financial resources beyond immediate operating costs.

DataCite operates on a cost-recovery membership model. As our community continues to grow and we streamline our operations, we seek to generate a surplus each year. The General Assembly aims to continue to grow the reserve funds to support continued operations and pre-finance future organizational projects.

Goal to create contingency fund to support operations for 12 months – a high priority should be generating a contingency fund that can support a complete, orderly wind down (12 months in most cases). This fund should be separate from those allocated to covering operating risk and investment in development.

DataCite currently maintains an operational reserve in accordance with statutory association laws of Germany. The DataCite General Assembly is required to approve the use of reserves in accordance with the activities of the association. As such, we maintain a steady approach to continue to build a contingency fund that supports the charitable mission of the organization.

Mission-consistent revenue generation – potential revenue sources should be considered for consistency with the organisational mission and not run counter to the aims of the organisation.

DataCite revenue is generated directly through membership, use of member services and funded project activities. All revenue streams are directly linked to the organizational mission.

Revenue based on services, not data – data related to the running of the research enterprise should be a community property. Appropriate revenue sources might include value-added services, consulting, API Service Level Agreements or membership fees.

DataCite offers a portfolio of member services and has no intention to change our strategy in this regard. Our services are grouped within three distinct categories: discovery, registration and integration. We provide the means to create, find, cite, connect, and use research.


POSI Insurance Self-Assessment

Open source – All software required to run the infrastructure should be available under an open source license. This does not include other software that may be involved with running the organisation.

All of DataCite’s code and software processes are openly managed on GitHub and the development roadmap and prioritization are discussed in regular communications with members. As part of this, all code is published openly on GitHub under a fully permissive MIT License. Whenever possible we leverage open source components and we work hard to ensure that our documentation allows other projects to leverage our CC0 Data File, open APIs, and other tools.

Open data (within constraints of privacy laws) – For an infrastructure to be forked it will be necessary to replicate all relevant data. The CC0 waiver is best practice in making data legally available. Privacy and data protection laws will limit the extent to which this is possible.

To the extent possible under law, DataCite has waived all copyright and related or neighboring rights to the DataCite Data File. The DataCite Data File includes all DOIs and deposited metadata in our database. CC0 enables scientists, educators, artists and other creators and owners of copyright- or database-protected content to waive those interests in their works and thereby place them as completely as possible in the public domain, so that others may freely build upon, enhance and reuse the works for any purposes without restriction under copyright or database law.

Available data (within constraints of privacy laws) – It is not enough that the data be made “open” if there is not a practical way to actually obtain it. Underlying data should be made easily available via periodic data dumps.

As stated above, the DataCite Data File is freely available via our open APIs and discovery interfaces under a CC0 license. Our APIs and discovery interfaces allow any user to retrieve, query and browse DataCite DOI metadata records.
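As a rough illustration of what retrieving metadata through an open API can look like, here is a minimal Python sketch. It rests on assumptions that this post does not state: that the public REST endpoint is `https://api.datacite.org/dois`, that it accepts `query` and `page[size]` parameters, and that it returns a JSON:API-style body with a top-level `data` list whose items carry the DOI in `id`. The function names are hypothetical helpers, not part of any DataCite library.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: public REST endpoint for DOI metadata (not stated in this post).
API_BASE = "https://api.datacite.org/dois"


def build_query_url(query: str, page_size: int = 5) -> str:
    """Build a search URL for DOI metadata records."""
    # Assumption: the API accepts `query` and `page[size]` parameters.
    params = {"query": query, "page[size]": page_size}
    return f"{API_BASE}?{urlencode(params)}"


def extract_dois(payload: dict) -> list:
    """Collect DOI names from a JSON:API-style response body."""
    # Assumption: each record in `data` exposes its DOI as `id`.
    return [record["id"] for record in payload.get("data", [])]


def search_dois(query: str, page_size: int = 5) -> list:
    """Fetch one page of matching records (requires network access)."""
    with urlopen(build_query_url(query, page_size), timeout=30) as response:
        return extract_dois(json.load(response))
```

Calling `search_dois("climate")` would return the DOIs from the first page of matches, provided the assumptions above hold. Keeping URL construction and response parsing separate from the network call makes each piece easy to check offline.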

Patent non-assertion – The organisation should commit to a patent non-assertion covenant. The organisation may obtain patents to protect its own operations, but not use them to prevent the community from replicating the infrastructure.

We regard the DataCite Data File as a fully open and public collection of factual information about research. As facts, the information stored in the Data File, by its nature, cannot be patented. DataCite makes no copyright, related or neighboring rights claims to the aggregated data. Consistent with this broad waiver, DataCite does not impose any conditions on access to and use of the DataCite Data File.

We continue to work with the community as an open, scalable infrastructure providing services to researchers across the globe. Our governance structures and principles such as these provide a framework for us as a community and support the continued development of open infrastructure.

Matthew Buys
Executive Director at DataCite