The FAIR Principles for Research Data
The idea behind FAIR data practices is to move data through the various stages of the data life cycle without hindrances or any loss of information. FAIR principles set guiding standards that aim to ensure research data remain findable, accessible, interoperable and reusable. It is important to take these principles into account when creating services and research data infrastructures.
Data and metadata should be easy to find for both humans and machines. Rich, machine-readable and descriptive metadata can facilitate the discovery of interesting, task-appropriate datasets. To achieve this, the data must be tagged with various metadata, which might include a title, author, summary of content, and a description of the data collection method, to cite just a few examples. Another essential step on the path toward making data and metadata findable is to assign a globally unique and persistent identifier, one example of which is a digital object identifier (DOI).
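As a rough illustration of the findability requirements above, the following Python sketch checks that a metadata record carries a few descriptive fields and an identifier that at least looks like a DOI. The field names, the sample record and the `findability_problems` helper are assumptions made for this example, not part of any standard; the regular expression is a loose syntax check and does not confirm that a DOI is actually registered.

```python
import re

# Loose syntactic check for modern DOIs: "10.", a 4-9 digit registrant code,
# a slash, and a non-empty suffix. It does not prove the DOI is registered.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

# Hypothetical record; every value here is a placeholder.
record = {
    "identifier": "10.1234/example-dataset",
    "title": "Example soil moisture measurements",
    "creators": ["Doe, Jane"],
    "description": "Hourly soil moisture readings from a test plot.",
    "collection_method": "Capacitive sensors at 10 cm depth",
}

# Assumed minimum set of fields for this sketch.
REQUIRED_FIELDS = ("identifier", "title", "creators", "description")


def findability_problems(rec):
    """Return a list of human-readable problems; empty if none were found."""
    problems = [
        f"missing field: {name}" for name in REQUIRED_FIELDS if not rec.get(name)
    ]
    doi = rec.get("identifier", "")
    if doi and not DOI_PATTERN.match(doi):
        problems.append(f"identifier does not look like a DOI: {doi}")
    return problems


print(findability_problems(record))  # prints []
```

A real repository would validate against its own metadata schema; the point here is only that such checks can be automated once metadata are machine-readable.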
Data and metadata should be made accessible and preserved for the long term, so that humans and machines can easily download and use the data through standardised communication protocols such as HTTPS. A further key to accessibility is ensuring that metadata remain accessible even if the research data themselves are not directly available, for example due to copyright restrictions.
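One way a machine might ask for metadata over HTTPS is DOI content negotiation, where the client requests a metadata format through the `Accept` header. The sketch below only builds such a request and does not send it. The DOI is a placeholder, `metadata_request` is a hypothetical helper, and the media type shown is one that DataCite documents for its DOIs; whether a given DOI returns metadata in this format depends on the registration agency.

```python
from urllib.parse import quote
from urllib.request import Request

# Media type documented by DataCite for JSON metadata (an assumption that the
# DOI in question is registered with an agency that supports it).
DATACITE_JSON = "application/vnd.datacite.datacite+json"


def metadata_request(doi):
    """Build (but do not send) an HTTPS request for a DOI's metadata."""
    url = "https://doi.org/" + quote(doi, safe="/")
    return Request(url, headers={"Accept": DATACITE_JSON})


req = metadata_request("10.1234/example-dataset")  # placeholder DOI
print(req.full_url)              # https://doi.org/10.1234/example-dataset
print(req.get_header("Accept"))  # application/vnd.datacite.datacite+json
```

Passing `req` to `urllib.request.urlopen` would perform the actual download; because only metadata are requested, this can work even when the data files themselves are restricted.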
Metadata should use a formal and broadly applicable language that allows humans and machines to link a dataset to other datasets. This can be achieved by using ontologies or thesauri such as Medical Subject Headings (MeSH) or AGROVOC, and by storing metadata in machine-readable formats such as XML. If the metadata also include qualified references to other datasets, ideally expressed through their persistent identifiers and typed with relations such as “is part of” or “is a version of”, this makes it easier to establish links between datasets.
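Qualified references of this kind can be sketched as typed links in a machine-readable record. The example below is loosely modelled on the related-identifier structure of the DataCite metadata schema, which defines relation types such as `IsPartOf` and `IsVersionOf`; the DOIs are placeholders and the `related_identifier` helper is hypothetical.

```python
import json

# Small subset of relation types, chosen for this sketch.
ALLOWED_RELATIONS = {"IsPartOf", "HasPart", "IsVersionOf", "HasVersion"}


def related_identifier(relation, doi):
    """Return one typed link to another dataset, identified by its DOI."""
    if relation not in ALLOWED_RELATIONS:
        raise ValueError(f"unsupported relation type: {relation}")
    return {
        "relationType": relation,
        "relatedIdentifier": doi,
        "relatedIdentifierType": "DOI",
    }


# Hypothetical dataset that belongs to a collection and updates an older version.
metadata = {
    "identifier": "10.1234/example-dataset-v2",
    "relatedIdentifiers": [
        related_identifier("IsPartOf", "10.1234/example-collection"),
        related_identifier("IsVersionOf", "10.1234/example-dataset"),
    ],
}

print(json.dumps(metadata, indent=2))
```

Restricting relations to a controlled list is the design choice that matters here: a machine can only follow links whose meaning it can rely on.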
Metadata aid reusability by describing data in a way that enables them to be reused in other research projects and compared with other datasets. The origin, or provenance, of the research data plays a key role here: in other words, which methods and devices were used to generate the data. Reusability also relies on proper citability of the data, generally by means of a persistent identifier such as a DOI. Clear licence conditions are equally important: the conditions under which the data may be reused by humans and machines should be clearly specified by a Creative Commons or other licence.
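The reusability criteria above (provenance, citability and a clear licence) can likewise be checked and used programmatically. The following sketch reports which of these elements a record lacks and formats a simple citation in the order creators, year, title, publisher, identifier. The record, its field names and both helper functions are assumptions for illustration; `CC-BY-4.0` is the SPDX identifier of the Creative Commons Attribution 4.0 licence.

```python
# Hypothetical record; every value here is a placeholder.
record = {
    "identifier": "10.1234/example-dataset",
    "creators": ["Doe, Jane"],
    "year": 2024,
    "title": "Example soil moisture measurements",
    "publisher": "Example Repository",
    "license": "CC-BY-4.0",
    "provenance": "Capacitive sensors at 10 cm depth, logged hourly",
}


def reuse_gaps(rec):
    """Return the reuse-related fields that are missing or empty."""
    required = ("identifier", "license", "provenance")
    return [name for name in required if not rec.get(name)]


def citation(rec):
    """Format a simple data citation that ends in a resolvable DOI link."""
    creators = "; ".join(rec["creators"])
    return (
        f"{creators} ({rec['year']}). {rec['title']}. "
        f"{rec['publisher']}. https://doi.org/{rec['identifier']}"
    )


print(reuse_gaps(record))  # prints []
print(citation(record))
```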