From ISA JSON to ISA JSON-LD (RDF): Dataset Maturity Level 3¶
Abstract:¶
The goal of this tutorial is to show how to go from an ISA document to an equivalent RDF representation using Python tools. It also highlights some limitations of existing libraries and points to alternative options for completing a meaningful conversion to the RDF Turtle format.
This notebook mainly highlights the functionality introduced with the ISA-API rc10.3 release, which converts ISA-JSON to ISA-JSON-LD with a choice of three popular ontological frameworks for semantic anchoring. These are:
obofoundry, a set of interoperable ontologies for the biological domain.
schema.org, the search-engine-oriented ontology developed by companies such as Yandex, Bing, and Google.
wikidata, a set of semantic concepts backing wikipedia and wikidata resources.
These frameworks have been chosen for interoperability.
This notebook has a companion notebook which goes over the exploration of the resulting RDF representations using a set of SPARQL queries. Check it out here
support: isatools@googlegroups.com
issue tracker: https://github.com/ISA-tools/isa-api/issues
Let’s get started¶
We start by getting all the necessary ISA tools and importing the module that converts ISA-JSON to JSON-LD.
import os
import json
from json import load
import datetime
import isatools
from isatools.convert.json2jsonld import ISALDSerializer
1. Loading an ISA-JSON document in memory with the json.load() function¶
Prior to invoking the ISALDSerializer, we need to do three things.
First, pass a URL or a path to the ISA-JSON instance to convert to JSON-LD.
Second, select the ontology framework used for the semantic conversion. One may choose from the following 3 options:
obofoundry.org ontologies, abbreviated as obo
schema.org ontology, abbreviated as sdo
wikidata.org ontology, abbreviated as wd, the prefix for http://www.wikidata.org/entity
Third, choose whether to embed the @context in the output or to rely on URLs pointing to the individual contexts. By default, the converter embeds the all-in-one context information. The reason for this is the lack of support for the JSON-LD 1.1 specification in many of the Python libraries that support RDF parsing (e.g. RDFLib).
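The difference between the two @context options can be sketched with plain Python dictionaries. The property name, the identifiers, and the context URL below are hypothetical placeholders chosen for illustration, not the actual ISA contexts; the only thing taken from JSON-LD itself is that `@context` may be either an inline object or a URL string.

```python
# Option A: the context is embedded, so the term mappings travel with the document.
# "name" and the identifiers are hypothetical placeholders.
embedded = {
    "@context": {"name": "https://schema.org/name"},
    "@id": "https://example.org/investigation/1",
    "name": "Example investigation",
}

# Option B: the context is referenced by URL and must be fetched by the parser.
# This URL is a hypothetical placeholder, not a real ISA context location.
referenced = {
    "@context": "https://example.org/contexts/investigation.jsonld",
    "@id": "https://example.org/investigation/1",
    "name": "Example investigation",
}


def context_is_embedded(document):
    """Return True when the document carries its term mappings inline."""
    context = document.get("@context")
    # An inline context is an object, or a list containing at least one object.
    if isinstance(context, dict):
        return True
    if isinstance(context, list):
        return any(isinstance(item, dict) for item in context)
    return False


print(context_is_embedded(embedded))
print(context_is_embedded(referenced))
```

An embedded context makes the file self-contained, which is convenient when the parsing library cannot resolve or process remote contexts.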
instance_path = os.path.join("./output/BII-S-3-synth/", "isa-new_ids.json")
# The with statement closes the file automatically, so no explicit close() is needed
with open(instance_path, 'r') as instance_file:
    instance = load(instance_file)
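Where the input file may be missing or malformed, the loading step could be wrapped in a small helper. This is a minimal sketch: the function name `load_isa_json` is hypothetical, and the only check it makes on the content is that the top level is a JSON object, which is an assumption about the shape of an ISA-JSON investigation.

```python
import json
from pathlib import Path


def load_isa_json(path):
    """Load an ISA-JSON file and run a few basic sanity checks."""
    path = Path(path)
    if not path.is_file():
        raise FileNotFoundError(f"ISA-JSON instance not found: {path}")
    with path.open("r", encoding="utf-8") as handle:
        document = json.load(handle)
    # Assumption: the top-level investigation is a JSON object.
    if not isinstance(document, dict):
        raise ValueError("Expected a JSON object at the top level")
    return document
```

The returned dictionary can then be used in place of `instance` in the following steps.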
2. Transforming ISA-JSON to ISA JSON-LD with the ISALDSerializer¶
# we now invoke the ISALDSerializer function
ontology = "isaterms"
serializer = ISALDSerializer(instance)
serializer.set_ontology(ontology)
serializer.set_instance(instance)
json_ld_content = serializer.output
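To compare the output of several ontology frameworks, the same call sequence could be repeated in a loop. The sketch below is not part of the ISA-API: `convert_with_each_ontology` is a hypothetical helper, and the serializer class is passed in as a parameter so that the snippet stays self-contained. It only assumes the constructor, `set_ontology`, `set_instance`, and `output` members used in the cell above.

```python
def convert_with_each_ontology(instance, ontologies, serializer_cls):
    """Return a mapping from ontology abbreviation to JSON-LD output."""
    results = {}
    for ontology in ontologies:
        # Same call sequence as in the cell above, once per ontology.
        serializer = serializer_cls(instance)
        serializer.set_ontology(ontology)
        serializer.set_instance(instance)
        results[ontology] = serializer.output
    return results
```

With isatools installed, this might be called as `convert_with_each_ontology(instance, ["obo", "sdo", "wd"], ISALDSerializer)`. Whether the serializer accepts exactly these abbreviations is an assumption based on the list of options given earlier.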
Now that the conversion is performed, we can write the resulting ISA-JSON-LD to file:
3. Writing ISA JSON-LD to file¶
isa_json_ld_path = os.path.join("./output/BII-S-3-synth/", "isa-new_ids-BII-S-3-ld-" + ontology + "-v1.json")
# ensure_ascii=False can emit non-ASCII characters, so write the file as UTF-8
with open(isa_json_ld_path, 'w', encoding='utf-8') as outfile:
    json.dump(json_ld_content, outfile, ensure_ascii=False, indent=4)
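Before handing the file to an RDF parser, it can be useful to confirm that it reloads as JSON and carries a context. The helper below is a hypothetical sketch, not an ISA-API function, and it assumes the `@context` key sits at the top level of the document.

```python
import json


def summarize_json_ld(path):
    """Reload a JSON-LD file and report whether a context is present."""
    with open(path, "r", encoding="utf-8") as handle:
        document = json.load(handle)
    is_object = isinstance(document, dict)
    # Assumption: the context, if any, is declared at the top level.
    context = document.get("@context") if is_object else None
    return {
        "has_context": context is not None,
        "context_is_inline": isinstance(context, dict),
        "top_level_keys": sorted(document) if is_object else [],
    }
```

Calling `summarize_json_ld(isa_json_ld_path)` on the file written above would show at a glance whether the context was embedded or referenced.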
4. Converting the ISA JSON-LD instance to RDF Turtle using the Python RDFLib library¶
Note
The Python RDFLib version should be at least 6.0.2.
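The version requirement can be checked programmatically before parsing. This is a minimal sketch using only the standard library; the helper names `parse_version` and `rdflib_is_recent_enough` are hypothetical, and the version comparison is deliberately simple (it ignores pre-release suffixes).

```python
from importlib import metadata


def parse_version(text):
    """Turn '6.0.2' into (6, 0, 2), ignoring non-numeric suffixes."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for char in piece:
            if not char.isdigit():
                break
            digits += char
        if not digits:
            break
        parts.append(int(digits))
    # Pad short versions such as '6.1' so tuples compare as expected.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def rdflib_is_recent_enough(minimum="6.0.2"):
    """Return True if the installed rdflib meets the minimum version."""
    try:
        installed = metadata.version("rdflib")
    except metadata.PackageNotFoundError:
        # rdflib is not installed at all.
        return False
    return parse_version(installed) >= parse_version(minimum)


print(rdflib_is_recent_enough())
```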
from rdflib import Graph
graph = Graph()
# State the format explicitly rather than relying on the .json file extension
graph.parse(isa_json_ld_path, format="json-ld")
print(f"Graph has {len(graph)} statements.")
# Write turtle file
rdf_path = os.path.join("./output/BII-S-3-synth/", "isa-new_ids-BII-S-3-ld-" + ontology + "-v3.ttl")
with open(rdf_path, 'w') as rdf_file:
    rdf_file.write(graph.serialize(format='turtle'))
Authors¶
Name | ORCID | Affiliation | Type | ELIXIR Node | Contribution |
---|---|---|---|---|---|
 | | University of Oxford | | | Writing - Original Draft |
 | | University of Oxford | | | Writing - Original Draft |