11. Creating a metadata profile

Recipe Overview
Reading Time
20 minutes
Executable Code
Creating a metadata profile
FAIRPlus logo
Recipe Type
Maturity Level & Indicator
not applicable
hover me Tooltip text

11.1. Graphical Overview

11.2. How to generate a metadata template

The following steps are intended as a starting point to guide the generation of a metadata template.

11.2.1. Step 1: Define competency questions

  • What are the questions you would like to address with the template? Without a set of competency questions, important variables may easily be forgotten. It is equally possible to collect too much metadata, making the resulting metadata model opaque and difficult to navigate. Competency questions serve as a guide to identify the most relevant experimental factors.

11.2.2. Step 2: Define a Minimal Set Of Metadata (MSOM) according to these questions

  • Compile metadata from different sources

  • Generate a consolidated view on metadata by merging attributes as far as possible

  • Differentiate metadata available for most of the studies from metadata occurring rarely (sparse matrix)

  • Identify gaps in the metadata available for most of the studies comprising data that is considered important but has not been captured in the past

  • Define a MSOM to be captured in the future from the metadata that is available for most of the studies and the metadata considered to be important

  • Identify available community standards regarding minimal sets of metadata

  • Add metadata attributes from those community standards to the MSOM, if they are not yet included

  • Assign cardinality to the MSOM (identify mandatory metadata and how many times the attributes may be reported. Some metadata might not be mandatory but are still important to capture, if available)

  • Identify appropriate ontologies representing your data and establish an application ontology (see recipe 4 of UC3)

  • Assign, as far as possible, ontologies to the MSOM and the sparse matrix

11.2.3. Step 3: Introduce semantics into the template

  • Identify most important objects to be represented in the model (e.g. study, sample, treatment, result, etc.)

  • Make sure to have an appropriate naming strategy for the objects (e.g. an NGSstudy is an OMICSstudy is a Study; do not call an NGSstudy a Study; make sure the granularity fits your purposes)

  • Assign MSOM and sparse matrix attributes to the respective objects

  • Identify and introduce relationships among the identified objects (e.g. “an NGSstudy contains samples”, “a result is derived from a sample”)

  • Identify dependencies to data not represented as objects at this point in time, but, e.g. as termlists

  • Make sure that your model can be expanded subsequently to represent those data as objects, as well

  • Integrate the sparse matrix of metadata not contained in the MSOM in the model

11.2.4. Step 4: Reality check

  • Introduce measures allowing to identify errors in reported data according to your model

  • Expose your model to actual data delivered by independent colleagues and capture the errors and gaps that occurred

  • Identify errors and gaps that are related to the model and not occurring due to errors in the data

  • Adjust the model according to these errors and gaps

  • Re-iterate the reality check until no more severe errors and gaps are occurring that are relevant for the previously defined competency questions

11.4. Authors