3. Creating InChIKeys for IUPAC names

Recipe Overview
Reading Time: 15 minutes
Executable Code: Yes
Recipe Type: Hands-on
Maturity Level & Indicator: DSM-4-C4

3.1. Main Objectives

The main purpose of this recipe is:

To take an IUPAC name and generate an InChIKey from it.


3.1.1. Using the OPSIN website

OPSIN is an open-source library that parses IUPAC names into chemical graphs [2].

OPSIN has a website where IUPAC names are converted into other representations, including an InChIKey.

The InChIKey itself is generated by the official InChI library [1].

3.1.2. Automating translations with Google Colab

Google Colaboratory (Colab for short) lets us automate the conversion of IUPAC names with Python.

In Colab, we can use Bacting [3] to access the OPSIN library.

We first need to set up Colab with Java, Maven, and scyjava, then download the Bacting libraries and create the Bacting manager objects.

Java 17 and Maven are installed with the following commands, the last of which confirms which Java version is active:

# Lines starting with "!" run as shell commands in a Colab cell.
!apt-get install openjdk-17-jre-headless maven -qq > /dev/null
import os
os.environ["JAVA_HOME"] = "/usr/lib/jvm/java-17-openjdk-amd64"
!update-alternatives --set java /usr/lib/jvm/java-17-openjdk-amd64/bin/java
!java -version
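
Since the Bacting managers need a JVM later on, it can help to confirm from Python that a Java runtime is reachable. The following is a minimal sketch using only the Python standard library; the helper name `java_version_line` is hypothetical and is not part of the recipe or of any library.

```python
import shutil
import subprocess


def java_version_line():
    """Return the first line reported by `java -version`, or None if java is not on PATH."""
    java = shutil.which("java")
    if java is None:
        return None
    # `java -version` writes its report to stderr rather than stdout.
    result = subprocess.run([java, "-version"], capture_output=True, text=True)
    output = (result.stderr or result.stdout).strip()
    return output.splitlines()[0] if output else None


print(java_version_line())
```

A return value of `None` suggests that the installation step did not put `java` on the path.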

Scyjava is installed with the following command:

!pip install scyjava

We can then continue by installing Bacting and setting up the two Bacting managers, inchi and opsin:

from scyjava import config, jimport

# Maven coordinates of the two Bacting modules that scyjava should make available.
config.endpoints.append('io.github.egonw.bacting:managers-inchi:0.4.1')
config.endpoints.append('io.github.egonw.bacting:managers-opsin:0.4.1')

# Each manager takes a workspace root folder; "." is the current directory.
inchi_cls = jimport("net.bioclipse.managers.InChIManager")
inchi = inchi_cls(".")
opsin_cls = jimport("net.bioclipse.managers.OpsinManager")
opsin = opsin_cls(".")

After that, we use the manager API to parse the IUPAC name and generate an InChI and an InChIKey:

anInChI = inchi.generate(opsin.parseIUPACName("methane"))
print(f"InChI: {anInChI.getValue()}")
print(f"InChIKey: {anInChI.getKey()}")
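
A quick sanity check on the printed key is possible because InChIKeys have a fixed layout: 27 characters in three blocks of 14, 10, and 1 uppercase letters, separated by hyphens. The sketch below checks only that layout, not whether the key belongs to a real compound; the helper name `looks_like_inchikey` is hypothetical.

```python
import re

# Block 1: 14 letters (hash of the connectivity).
# Block 2: 10 letters (8-letter hash of the remaining layers, then a
#          standard/non-standard flag and a version letter).
# Block 3: 1 letter (protonation flag).
INCHIKEY_PATTERN = re.compile(r"[A-Z]{14}-[A-Z]{10}-[A-Z]")


def looks_like_inchikey(text):
    """Return True if text has the 14-10-1 block layout of an InChIKey."""
    return INCHIKEY_PATTERN.fullmatch(text) is not None


print(looks_like_inchikey("VNWKTOKETHGBQD-UHFFFAOYSA-N"))  # InChIKey of methane -> True
print(looks_like_inchikey("InChI=1S/CH4/h1H4"))            # an InChI, not a key -> False
```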

The full Jupyter notebook can be found here, including a button to open the notebook in Colab.
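
To convert several names in one go, the two manager calls can be wrapped in a small loop. The sketch below is one possible shape and is not taken from the recipe: `names_to_inchikeys` is a hypothetical helper that receives the parse and generate functions as arguments, and it assumes that a name that cannot be parsed surfaces as an exception.

```python
def names_to_inchikeys(names, parse_name, generate_inchi):
    """Map each name to its InChIKey, or to None when conversion fails.

    parse_name and generate_inchi are passed in so the helper does not
    depend on how the managers were created.
    """
    results = {}
    for name in names:
        try:
            results[name] = str(generate_inchi(parse_name(name)).getKey())
        except Exception:
            # Assumption: a failed parse or conversion raises an exception.
            results[name] = None
    return results


# In the notebook, with the managers created earlier, a call could look like:
# names_to_inchikeys(["methane", "ethanol"], opsin.parseIUPACName, inchi.generate)
```

Recording `None` for failed names keeps one bad entry from stopping the whole batch.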

3.1.3. Automating translations with Apache Groovy

Because Bacting is written in Java and its libraries are available from Maven Central, it can also be used in Apache Groovy and other Java-based environments.

In Groovy, the code above looks like this:

@Grab(group='io.github.egonw.bacting', module='managers-inchi', version='0.4.1')
@Grab(group='io.github.egonw.bacting', module='managers-opsin', version='0.4.1')

workspaceRoot = "."
inchi = new net.bioclipse.managers.InChIManager(workspaceRoot);
opsin = new net.bioclipse.managers.OpsinManager(workspaceRoot);

anInChI = inchi.generate(opsin.parseIUPACName("methane"))
println "InChI: ${anInChI.getValue()}"
println "InChIKey: ${anInChI.getKey()}"

3.2. Conclusion

Cheminformatics provides us with the tools to parse IUPAC names and convert them into identifiers based on the chemical graph, such as the InChIKey.

The InChIKey identifier can be used to find more information about the chemicals represented by the original IUPAC names.
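
As one illustration of such a lookup, the sketch below builds a PubChem PUG REST request that resolves an InChIKey to PubChem compound identifiers (CIDs). PubChem is an assumption here, chosen as an example of a public database and not named in the recipe; the function names `pubchem_cids_url` and `pubchem_cids` are hypothetical.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def pubchem_cids_url(inchikey):
    """Build the PUG REST URL that lists the CIDs for an InChIKey."""
    return f"{PUG_REST}/compound/inchikey/{quote(inchikey)}/cids/JSON"


def pubchem_cids(inchikey, timeout=10):
    """Fetch the list of CIDs for an InChIKey (requires network access)."""
    with urlopen(pubchem_cids_url(inchikey), timeout=timeout) as response:
        payload = json.load(response)
    return payload.get("IdentifierList", {}).get("CID", [])


# Only the URL is built here; calling pubchem_cids() performs the request.
print(pubchem_cids_url("VNWKTOKETHGBQD-UHFFFAOYSA-N"))
```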

3.3. References

3.4. Authors