15.9.1. Introduction to the VCF2FHIR python library




Recipe Overview
Converting VCF file to FHIR JSON
Reading Time: 30 minutes
Executable Code: Yes
Recipe Type: Hands-on
Maturity Level & Indicator: DSM-1-C0, DSM-1-C1

This Python library (at an early stage of development) by Dolin et al., 2021 [1] provides an initial capability to convert genetic variation information stored in a standard Variant Call Format (VCF) file into a JSON-based HL7 FHIR message compliant with the HL7 FHIR Genomics Reporting guidelines.

This notebook offers a simple way for anyone interested in how FAIR principles can be connected to the clinical world to try the library out for themselves.

  • Main Features:

    • supports simple variants (SNVs, MNVs, Indels)

  • Limitations:

    • does not support structural variants

    • This software is not intended for use in production systems

15.9.1.1. Let’s get going by importing all the necessary python libraries

import os
import json
import logging
import vcf2fhir

15.9.1.1.1. VCF2FHIR python library

The main entry point of vcf2fhir is the Converter class, whose constructor takes a number of arguments, most of which are optional.

  • Required arguments:

  • vcf_filename (required): the path to a text-based or bgzipped VCF file.

    IMPORTANT:

    • Valid path and filename without whitespace must be provided.

    • VCF file must conform to VCF Version 4.1 or later.

    • FORMAT.GT must be present.

    • Multi-sample VCFs are allowed, but only the first sample will be converted.

    • bgzipped VCF files are allowed but then the additional argument has_tabix must be set to True and a tabix index file must be provided. The Tabix file must have the same name as the bgzipped VCF file, with a ‘.tbi’ extension, and must be in the same folder.

  • ref_build (required): Genome Reference Consortium genome assembly to which variants in the VCF were called.

    IMPORTANT:

    • Must be one of ‘GRCh37’ or ‘GRCh38’.

  • Optional arguments are:

  • patient_id (optional)

  • conv_region_dict (optional)

  • conv_region_filename (optional)

  • annotation_filename (optional)

  • region_studied_filename (optional)

  • nocall_filename (optional)

  • ratio_ad_dp (optional, default value = 0.99)

  • genomic_source_class (optional, default value = somatic)

For more information about those options, refer to the library documentation.
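The input requirements listed above can be checked before the converter is called. The sketch below is only an illustration using the standard library: the helper name check_vcf_inputs is hypothetical and not part of vcf2fhir, it inspects only the file header and the first data record, and it assumes that a bgzipped file is recognizable by a '.gz' suffix.

```python
import gzip
import os

SUPPORTED_BUILDS = {"GRCh37", "GRCh38"}
MIN_VCF_VERSION = (4, 1)


def check_vcf_inputs(vcf_filename, ref_build, has_tabix=False):
    """Hypothetical pre-flight check; returns a list of problems (empty if none found)."""
    problems = []
    if ref_build not in SUPPORTED_BUILDS:
        problems.append(f"ref_build must be GRCh37 or GRCh38, got {ref_build!r}")
    if any(ch.isspace() for ch in vcf_filename):
        problems.append("path must not contain whitespace")
    if not os.path.isfile(vcf_filename):
        problems.append(f"file not found: {vcf_filename}")
        return problems

    # Assumption: bgzipped input is identified by its '.gz' suffix.
    is_gz = vcf_filename.endswith(".gz")
    if is_gz:
        if not has_tabix:
            problems.append("bgzipped input requires has_tabix=True")
        if not os.path.isfile(vcf_filename + ".tbi"):
            problems.append("missing '.tbi' index next to the bgzipped VCF")

    opener = gzip.open if is_gz else open
    with opener(vcf_filename, "rt") as handle:
        # The first header line carries the VCF version, e.g. ##fileformat=VCFv4.2
        first = handle.readline().strip()
        prefix = "##fileformat=VCFv"
        try:
            version = tuple(int(part) for part in first[len(prefix):].split("."))
        except ValueError:
            version = ()
        if not first.startswith(prefix) or version < MIN_VCF_VERSION:
            problems.append("VCF version must be 4.1 or later")

        # Look at the first data record only: column 9 is FORMAT.
        for line in handle:
            if line.startswith("#") or not line.strip():
                continue
            columns = line.rstrip("\n").split("\t")
            if len(columns) < 10 or "GT" not in columns[8].split(":"):
                problems.append("FORMAT.GT is missing from the first record")
            break
    return problems
```

An empty list means the file passed these basic checks; any returned message points at a requirement from the list above that is worth fixing before running the conversion.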

15.9.1.1.1.1. Invoking the converter is as simple as the following command:

fhir = vcf2fhir.Converter('vcftests.vcf','GRCh37')

15.9.1.1.1.2. Invoking the convert() method to serialize the information as an HL7 FHIR JSON message written to the default output file.

fhir.convert()

15.9.1.1.1.3. Performing both actions in one go while using an additional optional argument

vcf2fhir.Converter('vcftests.vcf','GRCh38', 'patient01').convert()

15.9.1.1.1.4. Invoking the conversion and writing to a user-defined file instead of the default file.

output = vcf2fhir.Converter('vcftests.vcf', 'GRCh37', 'patient01', ratio_ad_dp=0.89).convert(output_filename='patient01.json')

15.9.1.1.2. Peeking at the resulting JSON file:

# Use a name other than `input` so the Python builtin is not shadowed.
with open('patient01.json', 'r') as infile:
    fhirmsg = json.load(infile)
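Because the full message is long, a compact summary can be easier to read than the complete dump. The sketch below is not part of vcf2fhir: summarize_variants is a hypothetical helper that assumes the layout visible in this recipe's output, namely variant Observations in the "contained" array whose components are identified by the codes 48013-7, 69547-8, 69551-0 and exact-start-end. The small sample dictionary is a trimmed stand-in taken from the first variant in that output.

```python
LOINC_REF_SEQ = "48013-7"      # Genomic reference sequence ID
LOINC_REF_ALLELE = "69547-8"   # Genomic Ref allele [ID]
LOINC_ALT_ALLELE = "69551-0"   # Genomic Alt allele [ID]
CODE_START_END = "exact-start-end"
VARIANT_PROFILE_SUFFIX = "/StructureDefinition/variant"


def summarize_variants(message):
    """Return one small dict per variant Observation found in message['contained']."""
    rows = []
    for obs in message.get("contained", []):
        profiles = obs.get("meta", {}).get("profile", [])
        # Skip anything that is not a variant (e.g. diagnostic implications).
        if not any(p.endswith(VARIANT_PROFILE_SUFFIX) for p in profiles):
            continue
        row = {"id": obs.get("id")}
        for comp in obs.get("component", []):
            code = comp["code"]["coding"][0]["code"]
            if code == LOINC_REF_SEQ:
                row["ref_seq"] = comp["valueCodeableConcept"]["coding"][0]["code"]
            elif code == LOINC_REF_ALLELE:
                row["ref"] = comp["valueString"]
            elif code == LOINC_ALT_ALLELE:
                row["alt"] = comp["valueString"]
            elif code == CODE_START_END:
                row["start"] = comp["valueRange"]["low"]["value"]
        rows.append(row)
    return rows


def _component(code, **value):
    """Build a minimal component entry for the sample below."""
    return {"code": {"coding": [{"code": code}]}, **value}


# Trimmed stand-in for a converted message (values copied from this recipe's output).
sample = {
    "contained": [
        {
            "id": "dv-506559af936d4",
            "meta": {"profile": [
                "http://hl7.org/fhir/uv/genomics-reporting/StructureDefinition/variant"]},
            "component": [
                _component(LOINC_REF_SEQ,
                           valueCodeableConcept={"coding": [{"code": "NC_000023.10"}]}),
                _component(LOINC_REF_ALLELE, valueString="T"),
                _component(LOINC_ALT_ALLELE, valueString="C"),
                _component(CODE_START_END, valueRange={"low": {"value": 60466}}),
            ],
        },
        {
            "id": "di-6da3099b6b204",
            "meta": {"profile": [
                "http://hl7.org/fhir/uv/genomics-reporting/StructureDefinition/diagnostic-implication"]},
            "component": [],
        },
    ]
}

for row in summarize_variants(sample):
    print(row)
```

Calling summarize_variants(fhirmsg) on the message loaded from patient01.json should, under the same assumptions, give one entry per variant Observation.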

print(json.dumps(fhirmsg, indent=4, sort_keys=True))
{
    "category": [
        {
            "coding": [
                {
                    "code": "GE",
                    "system": "http://terminology.hl7.org/CodeSystem/v2-0074"
                }
            ]
        }
    ],
    "code": {
        "coding": [
            {
                "code": "81247-9",
                "display": "Master HL7 genetic variant reporting panel",
                "system": "http://loinc.org"
            }
        ]
    },
    "contained": [
        {
            "category": [
                {
                    "coding": [
                        {
                            "code": "laboratory",
                            "system": "http://terminology.hl7.org/CodeSystem/observation-category"
                        }
                    ]
                }
            ],
            "code": {
                "coding": [
                    {
                        "code": "69548-6",
                        "display": "Genetic variant assessment",
                        "system": "http://loinc.org"
                    }
                ]
            },
            "component": [
                {
                    "code": {
                        "coding": [
                            {
                                "code": "48004-6",
                                "display": "DNA change (c.HGVS)",
                                "system": "http://loinc.org"
                            }
                        ]
                    },
                    "valueCodeableConcept": {
                        "coding": [
                            {
                                "code": "NC_000023.10:60465:T:C",
                                "system": "http://varnomen.hgvs.org"
                            }
                        ]
                    }
                },
                {
                    "code": {
                        "coding": [
                            {
                                "code": "48013-7",
                                "display": "Genomic reference sequence ID",
                                "system": "http://loinc.org"
                            }
                        ]
                    },
                    "valueCodeableConcept": {
                        "coding": [
                            {
                                "code": "NC_000023.10",
                                "system": "http://www.ncbi.nlm.nih.gov/nuccore"
                            }
                        ]
                    }
                },
                {
                    "code": {
                        "coding": [
                            {
                                "code": "48002-0",
                                "display": "Genomic Source Class",
                                "system": "http://loinc.org"
                            }
                        ]
                    },
                    "valueCodeableConcept": {
                        "coding": [
                            {
                                "code": "LA6684-0",
                                "display": "Somatic",
                                "system": "http://loinc.org"
                            }
                        ]
                    }
                },
                {
                    "code": {
                        "coding": [
                            {
                                "code": "69547-8",
                                "display": "Genomic Ref allele [ID]",
                                "system": "http://loinc.org"
                            }
                        ]
                    },
                    "valueString": "T"
                },
                {
                    "code": {
                        "coding": [
                            {
                                "code": "69551-0",
                                "display": "Genomic Alt allele [ID]",
                                "system": "http://loinc.org"
                            }
                        ]
                    },
                    "valueString": "C"
                },
                {
                    "code": {
                        "coding": [
                            {
                                "code": "92822-6",
                                "display": "Genomic coord system",
                                "system": "http://loinc.org"
                            }
                        ]
                    },
                    "valueCodeableConcept": {
                        "coding": [
                            {
                                "code": "LA30102-0",
                                "display": "1-based character counting",
                                "system": "http://loinc.org"
                            }
                        ]
                    }
                },
                {
                    "code": {
                        "coding": [
                            {
                                "code": "exact-start-end",
                                "display": "Variant exact start and end",
                                "system": "http://hl7.org/fhir/uv/genomics-reporting/CodeSystem/TbdCodes"
                            }
                        ]
                    },
                    "valueRange": {
                        "low": {
                            "value": 60466
                        }
                    }
                }
            ],
            "id": "dv-506559af936d4",
            "meta": {
                "profile": [
                    "http://hl7.org/fhir/uv/genomics-reporting/StructureDefinition/variant"
                ]
            },
            "resourceType": "Observation",
            "status": "final",
            "subject": {
                "reference": "Patient/patient01"
            },
            "valueCodeableConcept": {
                "coding": [
                    {
                        "code": "LA9633-4",
                        "display": "present",
                        "system": "http://loinc.org"
                    }
                ]
            }
        },
        {
            "category": [
                {
                    "coding": [
                        {
                            "code": "laboratory",
                            "system": "http://terminology.hl7.org/CodeSystem/observation-category"
                        }
                    ]
                }
            ],
            "code": {
                "coding": [
                    {
                        "code": "diagnostic-implication",
                        "display": "Diagnostic Implication",
                        "system": "http://hl7.org/fhir/uv/genomics-reporting/CodeSystem/TbdCodes"
                    }
                ]
            },
            "component": [
                {
                    "code": {
                        "coding": [
                            {
                                "code": "53037-8",
                                "display": "Genetic variation clinical significance [Imp]",
                                "system": "http://loinc.org"
                            }
                        ]
                    },
                    "valueCodeableConcept": {
                        "coding": [
                            {
                                "display": "not specified",
                                "system": "http://loinc.org"
                            }
                        ]
                    }
                }
            ],
            "derivedFrom": [
                {
                    "reference": "#dv-506559af936d4"
                }
            ],
            "id": "di-6da3099b6b204",
            "meta": {
                "profile": [
                    "http://hl7.org/fhir/uv/genomics-reporting/StructureDefinition/diagnostic-implication"
                ]
            },
            "resourceType": "Observation",
            "status": "final",
            "subject": {
                "reference": "Patient/patient01"
            }
        },
        {
            "category": [
                {
                    "coding": [
                        {
                            "code": "laboratory",
                            "system": "http://terminology.hl7.org/CodeSystem/observation-category"
                        }
                    ]
                }
            ],
            "code": {
                "coding": [
                    {
                        "code": "69548-6",
                        "display": "Genetic variant assessment",
                        "system": "http://loinc.org"
                    }
                ]
            },
            "component": [
                {
                    "code": {
                        "coding": [
                            {
                                "code": "48004-6",
                                "display": "DNA change (c.HGVS)",
                                "system": "http://loinc.org"
                            }
                        ]
                    },
                    "valueCodeableConcept": {
                        "coding": [
                            {
                                "code": "NC_000023.10:60578:G:A",
                                "system": "http://varnomen.hgvs.org"
                            }
                        ]
                    }
                },
                {
                    "code": {
                        "coding": [
                            {
                                "code": "48013-7",
                                "display": "Genomic reference sequence ID",
                                "system": "http://loinc.org"
                            }
                        ]
                    },
                    "valueCodeableConcept": {
                        "coding": [
                            {
                                "code": "NC_000023.10",
                                "system": "http://www.ncbi.nlm.nih.gov/nuccore"
                            }
                        ]
                    }
                },
                {
                    "code": {
                        "coding": [
                            {
                                "code": "48002-0",
                                "display": "Genomic Source Class",
                                "system": "http://loinc.org"
                            }
                        ]
                    },
                    "valueCodeableConcept": {
                        "coding": [
                            {
                                "code": "LA6684-0",
                                "display": "Somatic",
                                "system": "http://loinc.org"
                            }
                        ]
                    }
                },
                {
                    "code": {
                        "coding": [
                            {
                                "code": "69547-8",
                                "display": "Genomic Ref allele [ID]",
                                "system": "http://loinc.org"
                            }
                        ]
                    },
                    "valueString": "G"
                },
                {
                    "code": {
                        "coding": [
                            {
                                "code": "69551-0",
                                "display": "Genomic Alt allele [ID]",
                                "system": "http://loinc.org"
                            }
                        ]
                    },
                    "valueString": "A"
                },
                {
                    "code": {
                        "coding": [
                            {
                                "code": "92822-6",
                                "display": "Genomic coord system",
                                "system": "http://loinc.org"
                            }
                        ]
                    },
                    "valueCodeableConcept": {
                        "coding": [
                            {
                                "code": "LA30102-0",
                                "display": "1-based character counting",
                                "system": "http://loinc.org"
                            }
                        ]
                    }
                },
                {
                    "code": {
                        "coding": [
                            {
                                "code": "exact-start-end",
                                "display": "Variant exact start and end",
                                "system": "http://hl7.org/fhir/uv/genomics-reporting/CodeSystem/TbdCodes"
                            }
                        ]
                    },
                    "valueRange": {
                        "low": {
                            "value": 60579
                        }
                    }
                }
            ],
            "id": "dv-6f399c7fb0be4",
            "meta": {
                "profile": [
                    "http://hl7.org/fhir/uv/genomics-reporting/StructureDefinition/variant"
                ]
            },
            "resourceType": "Observation",
            "status": "final",
            "subject": {
                "reference": "Patient/patient01"
            },
            "valueCodeableConcept": {
                "coding": [
                    {
                        "code": "LA9633-4",
                        "display": "present",
                        "system": "http://loinc.org"
                    }
                ]
            }
        },
        {
            "category": [
                {
                    "coding": [
                        {
                            "code": "laboratory",
                            "system": "http://terminology.hl7.org/CodeSystem/observation-category"
                        }
                    ]
                }
            ],
            "code": {
                "coding": [
                    {
                        "code": "diagnostic-implication",
                        "display": "Diagnostic Implication",
                        "system": "http://hl7.org/fhir/uv/genomics-reporting/CodeSystem/TbdCodes"
                    }
                ]
            },
            "component": [
                {
                    "code": {
                        "coding": [
                            {
                                "code": "53037-8",
                                "display": "Genetic variation clinical significance [Imp]",
                                "system": "http://loinc.org"
                            }
                        ]
                    },
                    "valueCodeableConcept": {
                        "coding": [
                            {
                                "display": "not specified",
                                "system": "http://loinc.org"
                            }
                        ]
                    }
                }
            ],
            "derivedFrom": [
                {
                    "reference": "#dv-6f399c7fb0be4"
                }
            ],
            "id": "di-12de791e725b4",
            "meta": {
                "profile": [
                    "http://hl7.org/fhir/uv/genomics-reporting/StructureDefinition/diagnostic-implication"
                ]
            },
            "resourceType": "Observation",
            "status": "final",
            "subject": {
                "reference": "Patient/patient01"
            }
        },
        {
            "category": [
                {
                    "coding": [
                        {
                            "code": "laboratory",
                            "system": "http://terminology.hl7.org/CodeSystem/observation-category"
                        }
                    ]
                }
            ],
            "code": {
                "coding": [
                    {
                        "code": "69548-6",
                        "display": "Genetic variant assessment",
                        "system": "http://loinc.org"
                    }
                ]
            },
            "component": [
                {
                    "code": {
                        "coding": [
                            {
                                "code": "48004-6",
                                "display": "DNA change (c.HGVS)",
                                "system": "http://loinc.org"
                            }
                        ]
                    },
                    "valueCodeableConcept": {
                        "coding": [
                            {
                                "code": "NC_000023.10:60582:G:C",
                                "system": "http://varnomen.hgvs.org"
                            }
                        ]
                    }
                },
                {
                    "code": {
                        "coding": [
                            {
                                "code": "48013-7",
                                "display": "Genomic reference sequence ID",
                                "system": "http://loinc.org"
                            }
                        ]
                    },
                    "valueCodeableConcept": {
                        "coding": [
                            {
                                "code": "NC_000023.10",
                                "system": "http://www.ncbi.nlm.nih.gov/nuccore"
                            }
                        ]
                    }
                },
                {
                    "code": {
                        "coding": [
                            {
                                "code": "48002-0",
                                "display": "Genomic Source Class",
                                "system": "http://loinc.org"
                            }
                        ]
                    },
                    "valueCodeableConcept": {
                        "coding": [
                            {
                                "code": "LA6684-0",
                                "display": "Somatic",
                                "system": "http://loinc.org"
                            }
                        ]
                    }
                },
                {
                    "code": {
                        "coding": [
                            {
                                "code": "69547-8",
                                "display": "Genomic Ref allele [ID]",
                                "system": "http://loinc.org"
                            }
                        ]
                    },
                    "valueString": "G"
                },
                {
                    "code": {
                        "coding": [
                            {
                                "code": "69551-0",
                                "display": "Genomic Alt allele [ID]",
                                "system": "http://loinc.org"
                            }
                        ]
                    },
                    "valueString": "C"
                },
                {
                    "code": {
                        "coding": [
                            {
                                "code": "92822-6",
                                "display": "Genomic coord system",
                                "system": "http://loinc.org"
                            }
                        ]
                    },
                    "valueCodeableConcept": {
                        "coding": [
                            {
                                "code": "LA30102-0",
                                "display": "1-based character counting",
                                "system": "http://loinc.org"
                            }
                        ]
                    }
                },
                {
                    "code": {
                        "coding": [
                            {
                                "code": "exact-start-end",
                                "display": "Variant exact start and end",
                                "system": "http://hl7.org/fhir/uv/genomics-reporting/CodeSystem/TbdCodes"
                            }
                        ]
                    },
                    "valueRange": {
                        "low": {
                            "value": 60583
                        }
                    }
                }
            ],
            "id": "dv-6175dec7e9904",
            "meta": {
                "profile": [
                    "http://hl7.org/fhir/uv/genomics-reporting/StructureDefinition/variant"
                ]
            },
            "resourceType": "Observation",
            "status": "final",
            "subject": {
                "reference": "Patient/patient01"
            },
            "valueCodeableConcept": {
                "coding": [
                    {
                        "code": "LA9633-4",
                        "display": "present",
                        "system": "http://loinc.org"
                    }
                ]
            }
        },
        {
            "category": [
                {
                    "coding": [
                        {
                            "code": "laboratory",
                            "system": "http://terminology.hl7.org/CodeSystem/observation-category"
                        }
                    ]
                }
            ],
            "code": {
                "coding": [
                    {
                        "code": "diagnostic-implication",
                        "display": "Diagnostic Implication",
                        "system": "http://hl7.org/fhir/uv/genomics-reporting/CodeSystem/TbdCodes"
                    }
                ]
            },
            "component": [
                {
                    "code": {
                        "coding": [
                            {
                                "code": "53037-8",
                                "display": "Genetic variation clinical significance [Imp]",
                                "system": "http://loinc.org"
                            }
                        ]
                    },
                    "valueCodeableConcept": {
                        "coding": [
                            {
                                "display": "not specified",
                                "system": "http://loinc.org"
                            }
                        ]
                    }
                }
            ],
            "derivedFrom": [
                {
                    "reference": "#dv-6175dec7e9904"
                }
            ],
            "id": "di-cda284da44504",
            "meta": {
                "profile": [
                    "http://hl7.org/fhir/uv/genomics-reporting/StructureDefinition/diagnostic-implication"
                ]
            },
            "resourceType": "Observation",
            "status": "final",
            "subject": {
                "reference": "Patient/patient01"
            }
        },
        {
            "category": [
                {
                    "coding": [
                        {
                            "code": "laboratory",
                            "system": "http://terminology.hl7.org/CodeSystem/observation-category"
                        }
                    ]
                }
            ],
            "code": {
                "coding": [
                    {
                        "code": "69548-6",
                        "display": "Genetic variant assessment",
                        "system": "http://loinc.org"
                    }
                ]
            },
            "component": [
                {
                    "code": {
                        "coding": [
                            {
                                "code": "48004-6",
                                "display": "DNA change (c.HGVS)",
                                "system": "http://loinc.org"
                            }
                        ]
                    },
                    "valueCodeableConcept": {
                        "coding": [
                            {
                                "code": "NC_000023.10:60592:A:T",
                                "system": "http://varnomen.hgvs.org"
                            }
                        ]
                    }
                },
                {
                    "code": {
                        "coding": [
                            {
                                "code": "48013-7",
                                "display": "Genomic reference sequence ID",
                                "system": "http://loinc.org"
                            }
                        ]
                    },
                    "valueCodeableConcept": {
                        "coding": [
                            {
                                "code": "NC_000023.10",
                                "system": "http://www.ncbi.nlm.nih.gov/nuccore"
                            }
                        ]
                    }
                },
                {
                    "code": {
                        "coding": [
                            {
                                "code": "48002-0",
                                "display": "Genomic Source Class",
                                "system": "http://loinc.org"
                            }
                        ]
                    },
                    "valueCodeableConcept": {
                        "coding": [
                            {
                                "code": "LA6684-0",
                                "display": "Somatic",
                                "system": "http://loinc.org"
                            }
                        ]
                    }
                },
                {
                    "code": {
                        "coding": [
                            {
                                "code": "69547-8",
                                "display": "Genomic Ref allele [ID]",
                                "system": "http://loinc.org"
                            }
                        ]
                    },
                    "valueString": "A"
                },
                {
                    "code": {
                        "coding": [
                            {
                                "code": "69551-0",
                                "display": "Genomic Alt allele [ID]",
                                "system": "http://loinc.org"
                            }
                        ]
                    },
                    "valueString": "T"
                },
                {
                    "code": {
                        "coding": [
                            {
                                "code": "92822-6",
                                "display": "Genomic coord system",
                                "system": "http://loinc.org"
                            }
                        ]
                    },
                    "valueCodeableConcept": {
                        "coding": [
                            {
                                "code": "LA30102-0",
                                "display": "1-based character counting",
                                "system": "http://loinc.org"
                            }
                        ]
                    }
                },
                {
                    "code": {
                        "coding": [
                            {
                                "code": "exact-start-end",
                                "display": "Variant exact start and end",
                                "system": "http://hl7.org/fhir/uv/genomics-reporting/CodeSystem/TbdCodes"
                            }
                        ]
                    },
                    "valueRange": {
                        "low": {
                            "value": 60593
                        }
                    }
                }
            ],
            "id": "dv-c5a54b1cd5684",
            "meta": {
                "profile": [
                    "http://hl7.org/fhir/uv/genomics-reporting/StructureDefinition/variant"
                ]
            },
            "resourceType": "Observation",
            "status": "final",
            "subject": {
                "reference": "Patient/patient01"
            },
            "valueCodeableConcept": {
                "coding": [
                    {
                        "code": "LA9633-4",
                        "display": "present",
                        "system": "http://loinc.org"
                    }
                ]
            }
        },
        {
            "category": [
                {
                    "coding": [
                        {
                            "code": "laboratory",
                            "system": "http://terminology.hl7.org/CodeSystem/observation-category"
                        }
                    ]
                }
            ],
            "code": {
                "coding": [
                    {
                        "code": "diagnostic-implication",
                        "display": "Diagnostic Implication",
                        "system": "http://hl7.org/fhir/uv/genomics-reporting/CodeSystem/TbdCodes"
                    }
                ]
            },
            "component": [
                {
                    "code": {
                        "coding": [
                            {
                                "code": "53037-8",
                                "display": "Genetic variation clinical significance [Imp]",
                                "system": "http://loinc.org"
                            }
                        ]
                    },
                    "valueCodeableConcept": {
                        "coding": [
                            {
                                "display": "not specified",
                                "system": "http://loinc.org"
                            }
                        ]
                    }
                }
            ],
            "derivedFrom": [
                {
                    "reference": "#dv-c5a54b1cd5684"
                }
            ],
            "id": "di-7771ce77a9a54",
            "meta": {
                "profile": [
                    "http://hl7.org/fhir/uv/genomics-reporting/StructureDefinition/diagnostic-implication"
                ]
            },
            "resourceType": "Observation",
            "status": "final",
            "subject": {
                "reference": "Patient/patient01"
            }
        },
        {
            "category": [
                {
                    "coding": [
                        {
                            "code": "laboratory",
                            "system": "http://terminology.hl7.org/CodeSystem/observation-category"
                        }
                    ]
                }
            ],
            "code": {
                "coding": [
                    {
                        "code": "69548-6",
                        "display": "Genetic variant assessment",
                        "system": "http://loinc.org"
                    }
                ]
            },
            "component": [
                {
                    "code": {
                        "coding": [
                            {
                                "code": "48004-6",
                                "display": "DNA change (c.HGVS)",
                                "system": "http://loinc.org"
                            }
                        ]
                    },
                    "valueCodeableConcept": {
                        "coding": [
                            {
                                "code": "NC_000023.10:60691:T:C",
                                "system": "http://varnomen.hgvs.org"
                            }
                        ]
                    }
                },
                {
                    "code": {
                        "coding": [
                            {
                                "code": "48013-7",
                                "display": "Genomic reference sequence ID",
                                "system": "http://loinc.org"
                            }
                        ]
                    },
                    "valueCodeableConcept": {
                        "coding": [
                            {
                                "code": "NC_000023.10",
                                "system": "http://www.ncbi.nlm.nih.gov/nuccore"
                            }
                        ]
                    }
                },
                {
                    "code": {
                        "coding": [
                            {
                                "code": "48002-0",
                                "display": "Genomic Source Class",
                                "system": "http://loinc.org"
                            }
                        ]
                    },
                    "valueCodeableConcept": {
                        "coding": [
                            {
                                "code": "LA6684-0",
                                "display": "Somatic",
                                "system": "http://loinc.org"
                            }
                        ]
                    }
                },
                {
                    "code": {
                        "coding": [
                            {
                                "code": "69547-8",
                                "display": "Genomic Ref allele [ID]",
                                "system": "http://loinc.org"
                            }
                        ]
                    },
                    "valueString": "T"
                },
                {
                    "code": {
                        "coding": [
                            {
                                "code": "69551-0",
                                "display": "Genomic Alt allele [ID]",
                                "system": "http://loinc.org"
                            }
                        ]
                    },
                    "valueString": "C"
                },
                {
                    "code": {
                        "coding": [
                            {
                                "code": "92822-6",
                                "display": "Genomic coord system",
                                "system": "http://loinc.org"
                            }
                        ]
                    },
                    "valueCodeableConcept": {
                        "coding": [
                            {
                                "code": "LA30102-0",
                                "display": "1-based character counting",
                                "system": "http://loinc.org"
                            }
                        ]
                    }
                },
                {
                    "code": {
                        "coding": [
                            {
                                "code": "exact-start-end",
                                "display": "Variant exact start and end",
                                "system": "http://hl7.org/fhir/uv/genomics-reporting/CodeSystem/TbdCodes"
                            }
                        ]
                    },
                    "valueRange": {
                        "low": {
                            "value": 60692
                        }
                    }
                }
            ],
            "id": "dv-b69d08e525b44",
            "meta": {
                "profile": [
                    "http://hl7.org/fhir/uv/genomics-reporting/StructureDefinition/variant"
                ]
            },
            "resourceType": "Observation",
            "status": "final",
            "subject": {
                "reference": "Patient/patient01"
            },
            "valueCodeableConcept": {
                "coding": [
                    {
                        "code": "LA9633-4",
                        "display": "present",
                        "system": "http://loinc.org"
                    }
                ]
            }
        },
        {
            "category": [
                {
                    "coding": [
                        {
                            "code": "laboratory",
                            "system": "http://terminology.hl7.org/CodeSystem/observation-category"
                        }
                    ]
                }
            ],
            "code": {
                "coding": [
                    {
                        "code": "diagnostic-implication",
                        "display": "Diagnostic Implication",
                        "system": "http://hl7.org/fhir/uv/genomics-reporting/CodeSystem/TbdCodes"
                    }
                ]
            },
            "component": [
                {
                    "code": {
                        "coding": [
                            {
                                "code": "53037-8",
                                "display": "Genetic variation clinical significance [Imp]",
                                "system": "http://loinc.org"
                            }
                        ]
                    },
                    "valueCodeableConcept": {
                        "coding": [
                            {
                                "display": "not specified",
                                "system": "http://loinc.org"
                            }
                        ]
                    }
                }
            ],
            "derivedFrom": [
                {
                    "reference": "#dv-b69d08e525b44"
                }
            ],
            "id": "di-69f5a76124f04",
            "meta": {
                "profile": [
                    "http://hl7.org/fhir/uv/genomics-reporting/StructureDefinition/diagnostic-implication"
                ]
            },
            "resourceType": "Observation",
            "status": "final",
            "subject": {
                "reference": "Patient/patient01"
            }
        },
        {
            "category": [
                {
                    "coding": [
                        {
                            "code": "laboratory",
                            "system": "http://terminology.hl7.org/CodeSystem/observation-category"
                        }
                    ]
                }
            ],
            "code": {
                "coding": [
                    {
                        "code": "69548-6",
                        "display": "Genetic variant assessment",
                        "system": "http://loinc.org"
                    }
                ]
            },
            "component": [
                {
                    "code": {
                        "coding": [
                            {
                                "code": "48004-6",
                                "display": "DNA change (c.HGVS)",
                                "system": "http://loinc.org"
                            }
                        ]
                    },
                    "valueCodeableConcept": {
                        "coding": [
                            {
                                "code": "NC_000023.10:60881:T:G",
                                "system": "http://varnomen.hgvs.org"
                            }
                        ]
                    }
                },
                {
                    "code": {
                        "coding": [
                            {
                                "code": "48013-7",
                                "display": "Genomic reference sequence ID",
                                "system": "http://loinc.org"
                            }
                        ]
                    },
                    "valueCodeableConcept": {
                        "coding": [
                            {
                                "code": "NC_000023.10",
                                "system": "http://www.ncbi.nlm.nih.gov/nuccore"
                            }
                        ]
                    }
                },
                {
                    "code": {
                        "coding": [
                            {
                                "code": "48002-0",
                                "display": "Genomic Source Class",
                                "system": "http://loinc.org"
                            }
                        ]
                    },
                    "valueCodeableConcept": {
                        "coding": [
                            {
                                "code": "LA6684-0",
                                "display": "Somatic",
                                "system": "http://loinc.org"
                            }
                        ]
                    }
                },
                {
                    "code": {
                        "coding": [
                            {
                                "code": "69547-8",
                                "display": "Genomic Ref allele [ID]",
                                "system": "http://loinc.org"
                            }
                        ]
                    },
                    "valueString": "T"
                },
                {
                    "code": {
                        "coding": [
                            {
                                "code": "69551-0",
                                "display": "Genomic Alt allele [ID]",
                                "system": "http://loinc.org"
                            }
                        ]
                    },
                    "valueString": "G"
                },
                {
                    "code": {
                        "coding": [
                            {
                                "code": "92822-6",
                                "display": "Genomic coord system",
                                "system": "http://loinc.org"
                            }
                        ]
                    },
                    "valueCodeableConcept": {
                        "coding": [
                            {
                                "code": "LA30102-0",
                                "display": "1-based character counting",
                                "system": "http://loinc.org"
                            }
                        ]
                    }
                },
                {
                    "code": {
                        "coding": [
                            {
                                "code": "exact-start-end",
                                "display": "Variant exact start and end",
                                "system": "http://hl7.org/fhir/uv/genomics-reporting/CodeSystem/TbdCodes"
                            }
                        ]
                    },
                    "valueRange": {
                        "low": {
                            "value": 60882
                        }
                    }
                }
            ],
            "id": "dv-e18135b890434",
            "meta": {
                "profile": [
                    "http://hl7.org/fhir/uv/genomics-reporting/StructureDefinition/variant"
                ]
            },
            "resourceType": "Observation",
            "status": "final",
            "subject": {
                "reference": "Patient/patient01"
            },
            "valueCodeableConcept": {
                "coding": [
                    {
                        "code": "LA9633-4",
                        "display": "present",
                        "system": "http://loinc.org"
                    }
                ]
            }
        },
        {
            "category": [
                {
                    "coding": [
                        {
                            "code": "laboratory",
                            "system": "http://terminology.hl7.org/CodeSystem/observation-category"
                        }
                    ]
                }
            ],
            "code": {
                "coding": [
                    {
                        "code": "diagnostic-implication",
                        "display": "Diagnostic Implication",
                        "system": "http://hl7.org/fhir/uv/genomics-reporting/CodeSystem/TbdCodes"
                    }
                ]
            },
            "component": [
                {
                    "code": {
                        "coding": [
                            {
                                "code": "53037-8",
                                "display": "Genetic variation clinical significance [Imp]",
                                "system": "http://loinc.org"
                            }
                        ]
                    },
                    "valueCodeableConcept": {
                        "coding": [
                            {
                                "display": "not specified",
                                "system": "http://loinc.org"
                            }
                        ]
                    }
                }
            ],
            "derivedFrom": [
                {
                    "reference": "#dv-e18135b890434"
                }
            ],
            "id": "di-ee4e4752910f4",
            "meta": {
                "profile": [
                    "http://hl7.org/fhir/uv/genomics-reporting/StructureDefinition/diagnostic-implication"
                ]
            },
            "resourceType": "Observation",
            "status": "final",
            "subject": {
                "reference": "Patient/patient01"
            }
        },
        {
            "category": [
                {
                    "coding": [
                        {
                            "code": "laboratory",
                            "system": "http://terminology.hl7.org/CodeSystem/observation-category"
                        }
                    ]
                }
            ],
            "code": {
                "coding": [
                    {
                        "code": "69548-6",
                        "display": "Genetic variant assessment",
                        "system": "http://loinc.org"
                    }
                ]
            },
            "component": [
                {
                    "code": {
                        "coding": [
                            {
                                "code": "48004-6",
                                "display": "DNA change (c.HGVS)",
                                "system": "http://loinc.org"
                            }
                        ]
                    },
                    "valueCodeableConcept": {
                        "coding": [
                            {
                                "code": "NC_012920.1:6017:A:C",
                                "system": "http://varnomen.hgvs.org"
                            }
                        ]
                    }
                },
                {
                    "code": {
                        "coding": [
                            {
                                "code": "48013-7",
                                "display": "Genomic reference sequence ID",
                                "system": "http://loinc.org"
                            }
                        ]
                    },
                    "valueCodeableConcept": {
                        "coding": [
                            {
                                "code": "NC_012920.1",
                                "system": "http://www.ncbi.nlm.nih.gov/nuccore"
                            }
                        ]
                    }
                },
                {
                    "code": {
                        "coding": [
                            {
                                "code": "48002-0",
                                "display": "Genomic Source Class",
                                "system": "http://loinc.org"
                            }
                        ]
                    },
                    "valueCodeableConcept": {
                        "coding": [
                            {
                                "code": "LA6684-0",
                                "display": "Somatic",
                                "system": "http://loinc.org"
                            }
                        ]
                    }
                },
                {
                    "code": {
                        "coding": [
                            {
                                "code": "81258-6",
                                "display": "Sample VAF",
                                "system": "http://loinc.org"
                            }
                        ]
                    },
                    "valueQuantity": {
                        "code": "1",
                        "system": "http://unitsofmeasure.org",
                        "value": 0.8
                    }
                },
                {
                    "code": {
                        "coding": [
                            {
                                "code": "69547-8",
                                "display": "Genomic Ref allele [ID]",
                                "system": "http://loinc.org"
                            }
                        ]
                    },
                    "valueString": "A"
                },
                {
                    "code": {
                        "coding": [
                            {
                                "code": "69551-0",
                                "display": "Genomic Alt allele [ID]",
                                "system": "http://loinc.org"
                            }
                        ]
                    },
                    "valueString": "C"
                },
                {
                    "code": {
                        "coding": [
                            {
                                "code": "92822-6",
                                "display": "Genomic coord system",
                                "system": "http://loinc.org"
                            }
                        ]
                    },
                    "valueCodeableConcept": {
                        "coding": [
                            {
                                "code": "LA30102-0",
                                "display": "1-based character counting",
                                "system": "http://loinc.org"
                            }
                        ]
                    }
                },
                {
                    "code": {
                        "coding": [
                            {
                                "code": "exact-start-end",
                                "display": "Variant exact start and end",
                                "system": "http://hl7.org/fhir/uv/genomics-reporting/CodeSystem/TbdCodes"
                            }
                        ]
                    },
                    "valueRange": {
                        "low": {
                            "value": 6018
                        }
                    }
                }
            ],
            "id": "dv-3d5841401bfb4",
            "meta": {
                "profile": [
                    "http://hl7.org/fhir/uv/genomics-reporting/StructureDefinition/variant"
                ]
            },
            "resourceType": "Observation",
            "status": "final",
            "subject": {
                "reference": "Patient/patient01"
            },
            "valueCodeableConcept": {
                "coding": [
                    {
                        "code": "LA9633-4",
                        "display": "present",
                        "system": "http://loinc.org"
                    }
                ]
            }
        },
        {
            "category": [
                {
                    "coding": [
                        {
                            "code": "laboratory",
                            "system": "http://terminology.hl7.org/CodeSystem/observation-category"
                        }
                    ]
                }
            ],
            "code": {
                "coding": [
                    {
                        "code": "diagnostic-implication",
                        "display": "Diagnostic Implication",
                        "system": "http://hl7.org/fhir/uv/genomics-reporting/CodeSystem/TbdCodes"
                    }
                ]
            },
            "component": [
                {
                    "code": {
                        "coding": [
                            {
                                "code": "53037-8",
                                "display": "Genetic variation clinical significance [Imp]",
                                "system": "http://loinc.org"
                            }
                        ]
                    },
                    "valueCodeableConcept": {
                        "coding": [
                            {
                                "display": "not specified",
                                "system": "http://loinc.org"
                            }
                        ]
                    }
                }
            ],
            "derivedFrom": [
                {
                    "reference": "#dv-3d5841401bfb4"
                }
            ],
            "id": "di-07dcac1e12104",
            "meta": {
                "profile": [
                    "http://hl7.org/fhir/uv/genomics-reporting/StructureDefinition/diagnostic-implication"
                ]
            },
            "resourceType": "Observation",
            "status": "final",
            "subject": {
                "reference": "Patient/patient01"
            }
        },
        {
            "category": [
                {
                    "coding": [
                        {
                            "code": "laboratory",
                            "system": "http://terminology.hl7.org/CodeSystem/observation-category"
                        }
                    ]
                }
            ],
            "code": {
                "coding": [
                    {
                        "code": "82120-7",
                        "display": "Allelic phase",
                        "system": "http://loinc.org"
                    }
                ]
            },
            "derivedFrom": [
                {
                    "reference": "#dv-6f399c7fb0be4"
                },
                {
                    "reference": "#dv-6175dec7e9904"
                }
            ],
            "id": "sid-b582d59d887d4",
            "meta": {
                "profile": [
                    "http://hl7.org/fhir/uv/genomics-reporting/StructureDefinition/sequence-phase-relationship"
                ]
            },
            "resourceType": "Observation",
            "status": "final",
            "subject": {
                "reference": "Patient/patient01"
            },
            "valueCodeableConcept": {
                "coding": [
                    {
                        "code": "Cis",
                        "display": "Cis",
                        "system": "http://hl7.org/fhir/uv/genomics-reporting/CodeSystem/SequencePhaseRelationshipCS"
                    }
                ]
            }
        },
        {
            "category": [
                {
                    "coding": [
                        {
                            "code": "laboratory",
                            "system": "http://terminology.hl7.org/CodeSystem/observation-category"
                        }
                    ]
                }
            ],
            "code": {
                "coding": [
                    {
                        "code": "82120-7",
                        "display": "Allelic phase",
                        "system": "http://loinc.org"
                    }
                ]
            },
            "derivedFrom": [
                {
                    "reference": "#dv-6175dec7e9904"
                },
                {
                    "reference": "#dv-c5a54b1cd5684"
                }
            ],
            "id": "sid-87677dfd43394",
            "meta": {
                "profile": [
                    "http://hl7.org/fhir/uv/genomics-reporting/StructureDefinition/sequence-phase-relationship"
                ]
            },
            "resourceType": "Observation",
            "status": "final",
            "subject": {
                "reference": "Patient/patient01"
            },
            "valueCodeableConcept": {
                "coding": [
                    {
                        "code": "Cis",
                        "display": "Cis",
                        "system": "http://hl7.org/fhir/uv/genomics-reporting/CodeSystem/SequencePhaseRelationshipCS"
                    }
                ]
            }
        }
    ],
    "id": "dr-0646393fe2044",
    "issued": "2021-10-22T10:49:11+00:00",
    "meta": {
        "profile": [
            "http://hl7.org/fhir/uv/genomics-reporting/StructureDefinition/genomics-report"
        ]
    },
    "resourceType": "DiagnosticReport",
    "result": [
        {
            "reference": "#dv-506559af936d4"
        },
        {
            "reference": "#di-6da3099b6b204"
        },
        {
            "reference": "#dv-6f399c7fb0be4"
        },
        {
            "reference": "#di-12de791e725b4"
        },
        {
            "reference": "#dv-6175dec7e9904"
        },
        {
            "reference": "#di-cda284da44504"
        },
        {
            "reference": "#dv-c5a54b1cd5684"
        },
        {
            "reference": "#di-7771ce77a9a54"
        },
        {
            "reference": "#dv-b69d08e525b44"
        },
        {
            "reference": "#di-69f5a76124f04"
        },
        {
            "reference": "#dv-e18135b890434"
        },
        {
            "reference": "#di-ee4e4752910f4"
        },
        {
            "reference": "#dv-3d5841401bfb4"
        },
        {
            "reference": "#di-07dcac1e12104"
        },
        {
            "reference": "#sid-b582d59d887d4"
        },
        {
            "reference": "#sid-87677dfd43394"
        }
    ],
    "status": "final",
    "subject": {
        "reference": "Patient/patient01"
    }
}
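
A report of this size is hard to check by eye. The following is a minimal sketch, not part of the vcf2fhir library, of how the converted JSON could be summarised with the standard library only. It assumes the report has been loaded into a Python dict and that the embedded observations sit under a top-level `contained` key (an assumption, since that key is not visible in the excerpt above). The helper names `summarise_report` and `dangling_references` and the tiny `example_report` are hypothetical and only for illustration.

```python
from collections import Counter


def summarise_report(report):
    """Count embedded resources by the last segment of their profile URL."""
    counts = Counter()
    # Assumption: embedded observations are stored under "contained".
    for resource in report.get("contained", []):
        for profile in resource.get("meta", {}).get("profile", []):
            counts[profile.rsplit("/", 1)[-1]] += 1
    return counts


def dangling_references(report):
    """Return result references that do not match any embedded resource id."""
    known_ids = {"#" + res["id"] for res in report.get("contained", []) if "id" in res}
    return [
        entry["reference"]
        for entry in report.get("result", [])
        if entry.get("reference") not in known_ids
    ]


# Hypothetical, heavily shortened stand-in for a converted report.
base = "http://hl7.org/fhir/uv/genomics-reporting/StructureDefinition/"
example_report = {
    "resourceType": "DiagnosticReport",
    "contained": [
        {"id": "dv-1", "meta": {"profile": [base + "variant"]}},
        {"id": "di-1", "meta": {"profile": [base + "diagnostic-implication"]}},
        {"id": "dv-2", "meta": {"profile": [base + "variant"]}},
    ],
    "result": [
        {"reference": "#dv-1"},
        {"reference": "#di-1"},
        {"reference": "#dv-2"},
        {"reference": "#dv-missing"},
    ],
}

profile_counts = summarise_report(example_report)
unresolved = dangling_references(example_report)
print(profile_counts)
print(unresolved)
```

On a real report, the same two calls could be applied to the dict returned by `json.load` on the converter's output file; an empty list from `dangling_references` would indicate that every entry in `result` points at an embedded observation.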

15.9.1.1.3. Tracking conversion errors by activating the logger function

As with all conversions, things can go awry, so it is good practice to log any errors raised while the code runs. The authors of the vcf2fhir library provide two distinct loggers, a general one and one dedicated to invalid records, which we’ll now use.

The vcf2fhir logging process builds on the well-established Python logging library, so using it is as simple as using that library:

15.9.1.1.3.1. i. instantiate a logger and set a logging level

general_logger = logging.getLogger('vcf2fhir.general')
general_logger.setLevel(logging.DEBUG)

15.9.1.1.3.2. ii. define a file as output and a formatter pattern

# create a file handler (not a console handler) and set its level to debug
ch = logging.FileHandler('vcf2fhir-generic-errors.log')
ch.setLevel(logging.DEBUG)
# create formatter
formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
# add formatter to ch
ch.setFormatter(formatter)
# add ch to logger
general_logger.addHandler(ch)
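
In a notebook, cells are often re-run, and each extra `addHandler` call makes every message appear once more in the log file. The two steps above can be wrapped in a small helper that guards against this. This is only a sketch built on the standard logging module: the helper name `attach_file_handler` and the `demo.general` logger name are hypothetical and not part of vcf2fhir.

```python
import logging
import os
import tempfile


def attach_file_handler(logger_name, log_path, level=logging.DEBUG):
    """Attach a single FileHandler for log_path to the named logger (hypothetical helper)."""
    logger = logging.getLogger(logger_name)
    logger.setLevel(level)
    target = os.path.abspath(log_path)
    # Skip if a handler for the same file is already attached (e.g. after a cell re-run).
    for handler in logger.handlers:
        if isinstance(handler, logging.FileHandler) and handler.baseFilename == target:
            return logger
    handler = logging.FileHandler(log_path)
    handler.setLevel(level)
    handler.setFormatter(
        logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s'))
    logger.addHandler(handler)
    return logger


# Demonstration with a throwaway logger name and a temporary file.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, 'demo.log')
demo_logger = attach_file_handler('demo.general', demo_path)
attach_file_handler('demo.general', demo_path)  # second call adds no new handler
demo_logger.debug('conversion started')

with open(demo_path) as log_file:
    demo_lines = log_file.readlines()
print(demo_lines)
```

Calling the helper with `'vcf2fhir.general'` and `'vcf2fhir-generic-errors.log'` would give the same configuration as the cells above.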

15.9.1.1.4. Using the dedicated invalid_record_logger:

15.9.1.1.4.1. i. get the dedicated vcf2fhir logger by name, as follows:

invalid_record_logger = logging.getLogger('vcf2fhir.invalidrecord')

15.9.1.1.4.2. ii. configure the logger output file, error logging level and output formatting

inv_ch = logging.FileHandler('vcf2fhir-invalid-record-errors.log')
inv_ch.setLevel(logging.DEBUG)
inv_ch.setFormatter(formatter)

Note: we reuse the formatter created previously.

15.9.1.1.4.3. iii. attach the error handler to the logger and execute

invalid_record_logger.addHandler(inv_ch)
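
Both logger names share the `vcf2fhir.` prefix, so the standard logging hierarchy offers an alternative to two separate files: a single handler attached to the parent `vcf2fhir` logger. The sketch below simulates this with hand-written messages rather than a real conversion, and it assumes the library leaves propagation at its default (enabled), which the source does not confirm.

```python
import io
import logging

# In-memory stream so the sketch leaves no file behind.
stream = io.StringIO()
parent_handler = logging.StreamHandler(stream)
parent_handler.setLevel(logging.DEBUG)
parent_handler.setFormatter(
    logging.Formatter('%(name)s - %(levelname)s - %(message)s'))

# Handler on the parent logger receives records from both child loggers.
parent_logger = logging.getLogger('vcf2fhir')
parent_logger.setLevel(logging.DEBUG)
parent_logger.addHandler(parent_handler)

# Simulated messages: in real use the library emits these, not the notebook.
logging.getLogger('vcf2fhir.general').debug('simulated general message')
logging.getLogger('vcf2fhir.invalidrecord').debug('simulated invalid-record message')

captured = stream.getvalue().splitlines()
print(captured)
```

Keeping two files, as done in this notebook, makes the invalid-record log easier to review on its own; a shared parent handler is mainly useful when one chronological log is preferred.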

15.9.1.1.5. We can now read the log file to check what happened during the conversion from VCF to FHIR JSON.

This is an important quality control step, as vcf2fhir is still experimental and under active development. Users of the tool therefore need to exercise critical thinking, understand the parsing and conversion rules, and understand the limits of what the tool can handle, as explained by the authors in their manuscript.

with open('vcf2fhir-invalid-record-errors.log', 'r') as log_file:
    lines = log_file.readlines()

print(lines[0:10])
["2021-10-22 09:37:17,523 - vcf2fhir.invalidrecord - DEBUG - Reason: VCF INFO.SVTYPE must be in ['INS', 'DEL', 'DUP', 'CNV', 'INV']. Record: Record(CHROM=M, POS=11551, REF=T, ALT=[TN[M:16141[]), considered sample: CallData(GT=1, PS=None)\n", "2021-10-22 09:37:17,524 - vcf2fhir.invalidrecord - DEBUG - Reason: VCF INFO.SVTYPE must be in ['INS', 'DEL', 'DUP', 'CNV', 'INV']. Record: Record(CHROM=M, POS=11562, REF=T, ALT=[TN]11:49883566]]), considered sample: CallData(GT=1)\n", "2021-10-22 09:37:17,525 - vcf2fhir.invalidrecord - DEBUG - Reason: Mitochondrial DNA with GT = 0 or its diploid, Record: Record(CHROM=M, POS=6021, REF=A, ALT=[C]), considered sample: CallData(GT=0|1, PS=60003, DP=15, AD=['12', '3'], CGA_RDP=12)\n", "2021-10-22 09:37:17,526 - vcf2fhir.invalidrecord - DEBUG - Reason: Mitochondrial DNA with GT = 0 or its diploid, Record: Record(CHROM=M, POS=6027, REF=A, ALT=[C]), considered sample: CallData(GT=0|1, PS=60003, DP=17, AD=['13', '4'], CGA_RDP=13)\n", "2021-10-22 09:37:17,526 - vcf2fhir.invalidrecord - DEBUG - Reason: VCF FORMAT.GT is in ['0/0','0|0','0'], Record: Record(CHROM=M, POS=6028, REF=A, ALT=[C]), considered sample: CallData(GT=0, PS=60003, DP=17, AD=['13', '4'], CGA_RDP=13)\n", "2021-10-22 09:37:20,983 - vcf2fhir.invalidrecord - DEBUG - Reason: VCF INFO.SVTYPE must be in ['INS', 'DEL', 'DUP', 'CNV', 'INV']. Record: Record(CHROM=M, POS=11551, REF=T, ALT=[TN[M:16141[]), considered sample: CallData(GT=1, PS=None)\n", "2021-10-22 09:37:20,983 - vcf2fhir.invalidrecord - DEBUG - Reason: VCF INFO.SVTYPE must be in ['INS', 'DEL', 'DUP', 'CNV', 'INV']. 
Record: Record(CHROM=M, POS=11562, REF=T, ALT=[TN]11:49883566]]), considered sample: CallData(GT=1)\n", "2021-10-22 09:37:20,987 - vcf2fhir.invalidrecord - DEBUG - Reason: Mitochondrial DNA with GT = 0 or its diploid, Record: Record(CHROM=M, POS=6021, REF=A, ALT=[C]), considered sample: CallData(GT=0|1, PS=60003, DP=15, AD=['12', '3'], CGA_RDP=12)\n", "2021-10-22 09:37:20,987 - vcf2fhir.invalidrecord - DEBUG - Reason: Mitochondrial DNA with GT = 0 or its diploid, Record: Record(CHROM=M, POS=6027, REF=A, ALT=[C]), considered sample: CallData(GT=0|1, PS=60003, DP=17, AD=['13', '4'], CGA_RDP=13)\n", "2021-10-22 09:37:20,988 - vcf2fhir.invalidrecord - DEBUG - Reason: VCF FORMAT.GT is in ['0/0','0|0','0'], Record: Record(CHROM=M, POS=6028, REF=A, ALT=[C]), considered sample: CallData(GT=0, PS=60003, DP=17, AD=['13', '4'], CGA_RDP=13)\n"]
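
Reading raw log lines quickly becomes tedious for a large VCF. As a rough sketch, the rejection reasons can be tallied with a regular expression. The pattern below is an assumption derived only from the lines printed above, where each message reads `Reason: ...` followed by `. Record:` or `, Record:`; other vcf2fhir versions may format messages differently. The function name `count_reasons` is hypothetical.

```python
import re
from collections import Counter

# Three lines copied from the log output above, used as sample input.
sample_lines = [
    "2021-10-22 09:37:17,523 - vcf2fhir.invalidrecord - DEBUG - Reason: VCF INFO.SVTYPE must be in ['INS', 'DEL', 'DUP', 'CNV', 'INV']. Record: Record(CHROM=M, POS=11551, REF=T, ALT=[TN[M:16141[]), considered sample: CallData(GT=1, PS=None)\n",
    "2021-10-22 09:37:17,525 - vcf2fhir.invalidrecord - DEBUG - Reason: Mitochondrial DNA with GT = 0 or its diploid, Record: Record(CHROM=M, POS=6021, REF=A, ALT=[C]), considered sample: CallData(GT=0|1, PS=60003, DP=15, AD=['12', '3'], CGA_RDP=12)\n",
    "2021-10-22 09:37:17,526 - vcf2fhir.invalidrecord - DEBUG - Reason: Mitochondrial DNA with GT = 0 or its diploid, Record: Record(CHROM=M, POS=6027, REF=A, ALT=[C]), considered sample: CallData(GT=0|1, PS=60003, DP=17, AD=['13', '4'], CGA_RDP=13)\n",
]

# Capture the text between "Reason: " and the separator that precedes "Record: ".
REASON_PATTERN = re.compile(r"Reason: (?P<reason>.*?)[.,] Record: ")


def count_reasons(lines):
    """Count how often each rejection reason occurs (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = REASON_PATTERN.search(line)
        if match:
            counts[match.group("reason")] += 1
    return counts


reason_counts = count_reasons(sample_lines)
for reason, count in reason_counts.most_common():
    print(count, reason)
```

Passing the `lines` list read from the log file to `count_reasons` gives the same summary for the whole conversion run.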

15.9.1.1.6. Conclusion

With this notebook, we’ve shown how to convert genetic variation information held in a VCF file (which must comply with VCF v4.1 or later for the conversion to work) into a JSON-based HL7 FHIR Genomics Report message.

15.9.1.1.6.1. Why does this matter and how does it relate to FAIR:

The conversion from VCF to an HL7 FHIR JSON message relates to the I and R of FAIR, that is, interoperability and reusability. From a syntactic standpoint, having genetic variation information available at a granular level in an easily parseable form (JSON) is a gain for anyone looking to merge this information with other clinical messages. From a semantic standpoint, the reliance on the LOINC vocabulary to mark up the patterns defined in the HL7 FHIR Genomics Reports enhances interoperation between systems by providing unambiguous annotations. Finally, as more systems are able to produce FHIR messages from a variety of instruments or data sources, the availability of a FHIR message covering a subset of the genetic variation available from testing facilities makes investigating and mining phenotypic and genotypic relations more straightforward.

However, one needs to remember that the capability afforded by the vcf2fhir library is at an early stage and only supports simple cases. More effort is needed before the functionality reaches a Technology Readiness Level compatible with production systems.

15.9.1.1.7. Reference:

15.9.1.2. Authors

15.9.1.3. License