A quality denoting the ease of access of a resource. Accessibility can be measured according to a number of metrics or indicators.

Access Rights

a legal document stipulating who can access a dataset and under which conditions as well as for which purpose.



Bring Your Own Data. A type of hackathon where the participants are requested to bring to the table datasets they would like to see worked on. In the context of FAIRplus, events organized quarterly with participating IMI projects to carry out tasks improving the level of FAIRness of said datasets.



a Standard Development Organization developing data standards for reporting clinical trial data.


Common Data Element. As defined by the NIH, a data element that is common to multiple data sets across different studies. CDEs can be defined using a metadata standardization specification such as ISO/IEC 11179 and persisted in a standard-compliant metadata registry (see [MDR](## M))


Capability Maturity Model Integration is a process level improvement training and appraisal program. Administered by the CMMI Institute, a subsidiary of ISACA, it was developed at Carnegie Mellon University (CMU). (source: wikipedia)

Controlled Terminology

a list of vetted concepts used by a computer system or a database for marking content to ensure annotation consistency and query recall.



Data Access Agreement, a legally binding document articulating the conditions of access to data generated by an organization or consortium with other organizations or partners. The DAA document usually identifies the different parties involved, the contractual obligations binding the partners, the modalities of data access and transfers as well as the allowed uses of the data being transferred.


Directed Acyclic Graph is a type of directed graph (i.e. the edges between nodes have a direction) where cycles are not allowed.
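
As an illustration of the definition above, the following minimal sketch checks whether a set of directed edges forms a DAG. It relies on the standard library `graphlib` module (Python 3.9 or later); the function name `is_dag` and the node names are hypothetical and only serve the example.

```python
from graphlib import CycleError, TopologicalSorter


def is_dag(edges):
    """Return True if the (source, target) pairs describe a graph without cycles."""
    graph = {}
    for source, target in edges:
        # TopologicalSorter expects each node mapped to its predecessors.
        graph.setdefault(target, set()).add(source)
        graph.setdefault(source, set())
    try:
        # A topological order exists only when the graph has no cycle.
        list(TopologicalSorter(graph).static_order())
        return True
    except CycleError:
        return False


# Hypothetical processing steps linked by directed edges.
acyclic = [("raw", "cleaned"), ("cleaned", "annotated"), ("raw", "annotated")]
cyclic = acyclic + [("annotated", "raw")]

print(is_dag(acyclic))
print(is_dag(cyclic))
```

Adding the edge from `annotated` back to `raw` closes a loop, which is exactly what the definition forbids.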

Data Catalogue

a metadata database meant to allow registration of datasets to increase their findability in an organization.

Data Dictionary

a data management document which lists all the variables and data types collected in a project.

Data Enclave

a data resource whose content can only be accessed by being physically present at the site where the data is held, with no connection to the outside world or the web possible. This model is used to safeguard extremely sensitive data by drastically controlling who accesses the information.


a W3C specification in the form of an RDF vocabulary for describing datasets. DCAT enables a publisher to describe datasets and data services in a catalog using a standard model and vocabulary that facilitates the consumption and aggregation of metadata from multiple catalogs. DCAT is the foundation for open dataset descriptions in the European Union public sector and was adapted by the ISA programme of the European Commission.

Descriptive Metadata

metadata the aim of which is to provide domain-specific content and make it available to agents, human or machine.



European Chemicals Agency, an agency of the European Union which manages the technical and administrative aspects of the implementation of the European Union regulation called Registration, Evaluation, Authorisation and Restriction of Chemicals. (source: wikipedia)


European Medicines Agency is an agency of the European Union in charge of the evaluation and supervision of medicinal products.


Extract Transform Load is the general procedure of copying data from one or more sources into a destination system which represents the data differently from the source(s) or in a different context than the source(s). (source: wikipedia)
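
A minimal sketch of the three ETL steps, using only the Python standard library. The source CSV content, the table name `sample` and the unit conversion are assumptions made up for the example, not part of any particular ETL tool.

```python
import csv
import io
import sqlite3

# Hypothetical source data, inlined so the example is self-contained.
SOURCE_CSV = "sample_id,weight_mg\nS1,1200\nS2,850\n"


def extract(text):
    """Extract: read rows from the CSV source as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: convert weights from milligrams to grams."""
    return [(row["sample_id"], int(row["weight_mg"]) / 1000) for row in rows]


def load(records, connection):
    """Load: write the transformed records into the destination table."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS sample (sample_id TEXT PRIMARY KEY, weight_g REAL)"
    )
    connection.executemany("INSERT INTO sample VALUES (?, ?)", records)
    connection.commit()


connection = sqlite3.connect(":memory:")
load(transform(extract(SOURCE_CSV)), connection)
rows = connection.execute(
    "SELECT sample_id, weight_g FROM sample ORDER BY sample_id"
).fetchall()
print(rows)
```

The destination represents the same data differently from the source (typed columns, grams instead of milligrams), which is the point of the transform step.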



the acronym for Findable, Accessible, Interoperable and Reusable data as defined by Wilkinson et al, 2016.

FAIR assessment

the process of rating a dataset or service along the 4 dimensions of Findability, Accessibility, Interoperability and Reusability.


United States Food and Drug Administration is a regulatory agency of the United States government in charge of the evaluation and supervision of food products and medicines.


a FAIR Implementation Profile, which can be understood as the set of metadata elements and their value sets, semantic resources and constraints used to capture information in a domain of interest.


see HL7 FHIR.



GO FAIR is a bottom-up, stakeholder-driven and self-governed initiative that aims to implement the FAIR data principles, making data Findable, Accessible, Interoperable and Reusable (FAIR). It offers an open and inclusive ecosystem for individuals, institutions and organisations working together through Implementation Networks (INs). The INs are active in three activity pillars: GO CHANGE, GO TRAIN and GO BUILD. (source:


General Data Protection Regulation 2016/679 is a regulation in EU law on data protection and privacy in the European Union and the European Economic Area. It also addresses the transfer of personal data outside the EU and EEA areas. (source: wikipedia)


GraphQL is a query language for APIs, typically served over HTTP, and a runtime for fulfilling those queries with your existing data. GraphQL provides a complete and understandable description of the data in your API, gives clients the power to ask for exactly what they need and nothing more, makes it easier to evolve APIs over time, and enables powerful developer tools (source:


a graph traversal and query language developed as part of the Apache TinkerPop project.


Globally Unique Persistent Resolvable Identifier, a type of persistent identifier with added properties such as guaranteed uniqueness and accessibility via the HTTP protocol.



An Act by the United States of America to amend the Internal Revenue Code of 1986 to improve portability and continuity of health insurance coverage in the group and individual markets, to combat waste, fraud, and abuse in health insurance and health care delivery, to promote the use of medical savings accounts, to improve access to long-term care services and coverage, to simplify the administration of health insurance, and for other purposes.


Health Level 7 Fast Healthcare Interoperability Resources is a standard describing data formats and elements and an application programming interface for exchanging electronic health records. The standard was created by the Health Level Seven International health-care standards organization. (source: wikipedia)


Hypertext Transfer Protocol is the communication protocol used by the World Wide Web to exchange resources between clients and servers.



Identifier, a sequence of characters associated with an entity to track or retrieve it.


Innovative Medicines Initiative, a European Union public-private partnership funding program bringing together academic and industrial stakeholders on the theme of developing new therapeutic solutions.


The property of a resource to work together with other software agents.



JavaScript Object Notation, a syntax for data structures, used to transmit data objects consisting of attribute-value pairs.
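
To make the notion of attribute-value pairs concrete, here is a minimal sketch that serializes a record to JSON text and parses it back with the standard library `json` module. The record content is hypothetical.

```python
import json

# Hypothetical dataset description expressed as attribute-value pairs.
record = {
    "identifier": "dataset-001",
    "title": "Example dataset",
    "keywords": ["FAIR", "metadata"],
}

text = json.dumps(record, indent=2)  # Python object -> JSON text
restored = json.loads(text)          # JSON text -> Python object

print(text)
```

The round trip returns an object equal to the original, which is what makes JSON convenient for transmitting data between systems.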



a Knowledge Graph is a knowledge base that uses a graph-structured data model or topology to integrate data. Knowledge Graphs are often used to store interlinked descriptions of entities - objects, events, situations or abstract concepts - with free-form semantics. (source: wikipedia)


Key Performance Indicator, a generic term to cover any metric or indicator used to gauge efficiency.



the process of assigning a license, i.e. a formal, legally binding document which articulates the conditions of use of an entity.


Linked Data is structured data which is interlinked with other data so it becomes more useful through semantic queries. (source: wikipedia)



a Material Transfer Agreement is a legal document stipulating the conditions under which physical entities / properties may be transferred between the organizations that are signatories of the agreement.


Metadata Registry, a database storing individual definitions of metadata elements and data elements.


annotation about the data. Different types of metadata exist depending on their function.


Minimal Information Checklist, which is a document, formal or not, defining the annotation requirements of a domain.



Not Only SQL database is a type of database which persists information in structures other than the tables classically used by RDBMS.


a graph database and a company marketing the system. see



Open Digital Rights Language (ODRL) is a W3C specification for a policy expression language that provides a flexible and interoperable information model, vocabulary, and encoding mechanisms for representing statements about the usage of content and services. (source: wikipedia)


Observational Medical Outcomes Partnership (OMOP) Common Data Model is a data model developed by the OHDSI consortium to support the capture and reporting of observational data but which can be used to capture clinical trial data.


a formal representation of a domain of knowledge. A more advanced artefact than a simple controlled terminology, as relations between entities are captured, concepts can be defined axiomatically and inferences can be made automatically by specialized tools such as reasoners. Ontologies can be used to accomplish tasks such as data entry validation or query expansion.


The OpenAPI Specification, originally known as the Swagger Specification, is a specification for machine-readable interface files for describing, producing, consuming, and visualizing RESTful web services. (source: wikipedia)



Persistent Identifier, a long-lasting reference to a document, file, web page, or other object. The term “persistent identifier” is usually used in the context of digital objects that are accessible over the Internet. (source: wikipedia)

Property Graph

a type of knowledge graph which

Provenance Metadata

metadata the aim of which is to provide traceability and audit trail capability, allowing consumers to understand the origin of the data.


a W3C RDF vocabulary for representing and expressing provenance information.



Quality Management Systems are a set of practices and tools used to manage and control quality in production lines. QMS use approaches such as Capability Maturity Model Integration (see CMMI).


Quantities, Units, Dimensions and dataTypes vocabulary specifications produced by the consortium.



Relational Database Management System. A type of information persistence and storage software which uses tables related to each other through primary keys and foreign keys to allow queries across tables. Ideally, model normalization is sought to avoid duplication of information within and between tables. Examples of relational database management systems are MySQL, MariaDB, PostgreSQL or Oracle. RDBMS contrast with NoSQL and graph database systems, the latter storing information as graphs instead of tables and claiming gains in query efficiency owing to fast traversal algorithms.
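
The role of primary and foreign keys can be sketched with SQLite, which ships with the Python standard library. The `study` and `sample` tables and their content are hypothetical; other RDBMS would use the same SQL concepts with their own client libraries.

```python
import sqlite3

connection = sqlite3.connect(":memory:")
# Ask SQLite to enforce foreign key constraints on this connection.
connection.execute("PRAGMA foreign_keys = ON")
connection.executescript(
    """
    CREATE TABLE study (
        study_id INTEGER PRIMARY KEY,
        title TEXT NOT NULL
    );
    CREATE TABLE sample (
        sample_id INTEGER PRIMARY KEY,
        study_id INTEGER NOT NULL REFERENCES study(study_id),
        tissue TEXT NOT NULL
    );
    INSERT INTO study VALUES (1, 'Study A');
    INSERT INTO sample VALUES (10, 1, 'liver'), (11, 1, 'kidney');
    """
)

# The foreign key sample.study_id lets a query span both tables.
rows = connection.execute(
    "SELECT study.title, sample.tissue FROM sample "
    "JOIN study ON sample.study_id = study.study_id "
    "ORDER BY sample.sample_id"
).fetchall()
print(rows)
```

The study title is stored once and referenced by key from each sample, which is the kind of duplication that normalization aims to avoid.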


Resource Description Framework, a W3C standard specification for representing information in the form of subject / predicate / object statements known as triples.
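
The triple structure can be illustrated without any RDF library. The sketch below models triples as plain Python tuples and is not an RDF implementation; the `https://example.org/` namespace, the property names and the `match` helper are all hypothetical.

```python
EX = "https://example.org/"  # hypothetical namespace

# Each statement is a (subject, predicate, object) tuple.
triples = {
    (EX + "dataset1", EX + "title", "Example dataset"),
    (EX + "dataset1", EX + "creator", EX + "alice"),
    (EX + "alice", EX + "name", "Alice"),
}


def match(statements, subject=None, predicate=None, obj=None):
    """Return the sorted statements matching the pattern; None is a wildcard."""
    return sorted(
        t
        for t in statements
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (obj is None or t[2] == obj)
    )


about_dataset = match(triples, subject=EX + "dataset1")
for statement in about_dataset:
    print(statement)
```

Because the object of one triple (`alice`) is the subject of another, the statements link together into a graph.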


The Research Data Alliance is a research community organization started in 2013 by the European Commission, the American National Science Foundation and National Institute of Standards and Technology, and the Australian Department of Innovation. (source: wikipedia)


Reification is the process by which an abstract idea about a computer program is turned into an explicit data model or other object created in a programming language. (source: wikipedia)


the property of a data entity to be consumed, integrated or repurposed, ideally with as few transformation requirements as possible.


a Research Object is a method for the identification, aggregation and exchange of scholarly information on the Web. The primary goal of the research object approach is to provide a mechanism to associate related resources about a scientific investigation so that they can be shared using a single identifier. (source: wikipedia)


Return on Investment. This is a measure of how much value has been added or lost by a financial or technical commitment. It is often used to gauge the gain from deploying a new technology in terms of cost savings versus cost of deployment.


Research Resource Identifiers (RRIDs) are resource identifiers designed to be globally unique and persistent. RRIDs are endorsed by a number of scientific publications listed at and are supported by the UC San Diego SciCrunch knowledge base.


RWD stands for Real World Data, which in medicine is data derived from a number of sources that are associated with outcomes in a heterogeneous patient population in real-world settings, such as patient surveys, clinical trials, and observational cohort studies. (source:wikipedia)


RWE stands for Real World Evidence, which is defined by the FDA as “clinical evidence regarding the usage and potential benefits or risks of a medical product derived from analysis of RWD”.



Search Engine Optimization is the process of improving the quality and quantity of website traffic to a website or a web page from search engines. (source:wikipedia)


Shapes Constraint Language, a W3C standard for expressing constraints over RDF data to enable validation.


Shape Expressions, a specification for expressing constraints over RDF data to enable validation. A technical alternative to SHACL.


SPARQL Query Language, a W3C standard for querying RDF graphs.

Structural Metadata

metadata the aim of which is to define data structure and content organization.

Swagger API

see OpenAPI.



Terse RDF Triple Language (Turtle) is a syntax and file format for expressing data in the Resource Description Framework (RDF) data model. Turtle syntax is similar to that of SPARQL, an RDF query language. It is a common data format for storing RDF data, along with N-Triples, JSON-LD and RDF/XML. (source:wikipedia)



A Uniform Resource Identifier (URI) is a unique sequence of characters that identifies a logical or physical resource used by web technologies.(source:wikipedia)
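
As a small illustration, the standard library function `urllib.parse.urlsplit` breaks a URI into its components. The URI used here is a made-up example.

```python
from urllib.parse import urlsplit

# Hypothetical URI identifying a dataset record.
parts = urlsplit("https://example.org/datasets/42?format=json#summary")

print(parts.scheme)    # protocol used to access the resource
print(parts.netloc)    # authority hosting the resource
print(parts.path)      # location of the resource on that host
print(parts.query)     # optional parameters
print(parts.fragment)  # optional pointer inside the resource
```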



the range of discrete values allowed for a categorical variable.
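
A value set can be sketched as an enumeration against which incoming values are checked. The `Sex` variable, its allowed values and the `is_allowed` helper are hypothetical and only illustrate the idea.

```python
from enum import Enum


class Sex(Enum):
    """Hypothetical value set for a categorical variable."""

    FEMALE = "female"
    MALE = "male"
    UNKNOWN = "unknown"


def is_allowed(value):
    """Return True if the value belongs to the value set."""
    return value in {member.value for member in Sex}


print(is_allowed("female"))
print(is_allowed("F"))
```

Rejecting `"F"` shows how a value set keeps annotations consistent: only the agreed discrete values are accepted.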



Wikidata is a collaboratively edited multilingual knowledge graph hosted by the Wikimedia Foundation. It is a common source of open data that Wikimedia projects such as Wikipedia, and anyone else, can use under the CC0 public domain license. Wikidata is powered by the software Wikibase.



Extensible Markup Language.



YAML Ain't Markup Language (originally Yet Another Markup Language), a human-readable data serialization language.


Yummydata is a site that lists and monitors SPARQL endpoints that provide data of interest to the biomedical community.


No entries yet.