3. Creating InChIKeys for IUPAC names

3.1. Main Objectives
The main purpose of this recipe is:
To take an IUPAC name and generate an InChIKey
3.1.1. Using the OPSIN website
The OPSIN library is an open source tool that parses IUPAC names into chemical graphs [2]. OPSIN has a website where IUPAC names are converted into other representations, including an InChIKey. The InChIKey is generated by the official InChI library [1].
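The website lookup can also be scripted. The following is a minimal sketch, assuming that the OPSIN site exposes a JSON endpoint of the form `<base>/<name>.json` and that the response carries the key under a field called `stdinchikey`. The base URL, the field name, and the helper names `opsin_url`, `extract_inchikey`, and `fetch_inchikey` are all assumptions made for illustration and should be checked against the OPSIN website before use.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Assumed base URL of the public OPSIN web service; verify it on the
# OPSIN website before relying on it.
OPSIN_BASE = "https://opsin.ch.cam.ac.uk/opsin/"


def opsin_url(name: str) -> str:
    """Build the (assumed) JSON endpoint URL for one IUPAC name."""
    # Percent-encode the whole name so spaces, commas and brackets are safe.
    return OPSIN_BASE + quote(name, safe="") + ".json"


def extract_inchikey(payload: dict):
    """Return the InChIKey from a decoded response, or None if it is absent."""
    # "stdinchikey" is the field name assumed here for the standard InChIKey.
    return payload.get("stdinchikey")


def fetch_inchikey(name: str, timeout: float = 10.0):
    """Query the service over the network (hypothetical helper)."""
    with urlopen(opsin_url(name), timeout=timeout) as response:
        return extract_inchikey(json.load(response))
```

Only `fetch_inchikey` touches the network; the other two functions are pure, so the URL construction and the response handling can be checked offline.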
3.1.2. Automating translations with Google Colab
Google Colaboratory (Colab for short) allows us to use Python to automate conversions of IUPAC names. In Colab we can use Bacting [3] to access the OPSIN library. We first download the Bacting libraries and create the Bacting manager objects:
from scyjava import config, jimport
# Declare the Bacting libraries (Maven coordinates) that scyjava should download
config.endpoints.append('io.github.egonw.bacting:managers-inchi:0.1.0')
config.endpoints.append('io.github.egonw.bacting:managers-opsin:0.1.0')
# Load the Java manager classes and create the managers; "." is the workspace root
inchi_cls = jimport("net.bioclipse.managers.InChIManager")
inchi = inchi_cls(".")
opsin_cls = jimport("net.bioclipse.managers.OpsinManager")
opsin = opsin_cls(".")
After that, we use the manager API to parse the IUPAC name and generate an InChI and InChIKey:
# Parse the name into a molecule, then generate its InChI and InChIKey
anInChI = inchi.generate(opsin.parseIUPACName("methane"))
print(f"InChI: {anInChI.getValue()}")
print(f"InChIKey: {anInChI.getKey()}")
The full Jupyter notebook can be found here, including a button to open the notebook in Colab.
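Building on the manager calls above, a list of names can be converted in one pass. This is a sketch under stated assumptions: `names_to_inchikeys` is a hypothetical helper that is not part of Bacting, and it assumes that a name that fails to parse surfaces as a Python exception. The converter is passed in as a function, so the helper itself does not depend on the Java libraries.

```python
from typing import Callable, Dict, Iterable, Optional


def names_to_inchikeys(
    names: Iterable[str],
    to_key: Callable[[str], str],
) -> Dict[str, Optional[str]]:
    """Map each IUPAC name to its InChIKey, or to None if conversion fails."""
    results: Dict[str, Optional[str]] = {}
    for name in names:
        try:
            # str() turns a Java string returned by the manager into a Python str.
            results[name] = str(to_key(name))
        except Exception:
            # One unparsable name should not abort the whole batch.
            results[name] = None
    return results


# With the managers created above, the converter could look like this
# (an assumption based on the calls shown in this recipe):
#   to_key = lambda n: inchi.generate(opsin.parseIUPACName(n)).getKey()
#   keys = names_to_inchikeys(["methane", "ethanol"], to_key)
```

Recording `None` for failed names keeps the output aligned with the input list, which makes it easy to review which names need manual attention.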
3.1.3. Automating translations with Apache Groovy
Because Bacting is written in Java and its libraries are available from Maven Central, it can also be used in Apache Groovy and other Java-based environments. In Groovy, the above code looks like this:
// Fetch the Bacting libraries from Maven Central
@Grab(group='io.github.egonw.bacting', module='managers-inchi', version='0.1.0')
@Grab(group='io.github.egonw.bacting', module='managers-opsin', version='0.1.0')
workspaceRoot = "."
// Create the manager objects
inchi = new net.bioclipse.managers.InChIManager(workspaceRoot)
opsin = new net.bioclipse.managers.OpsinManager(workspaceRoot)
// Parse the name into a molecule, then generate its InChI and InChIKey
anInChI = inchi.generate(opsin.parseIUPACName("methane"))
println "InChI: ${anInChI.getValue()}"
println "InChIKey: ${anInChI.getKey()}"
3.2. Conclusion
Cheminformatics provides us with the tools to parse IUPAC names and convert them to identifiers based on the chemical graph, such as the InChIKey. The InChIKey can then be used to find more information about the chemicals represented by the original IUPAC names.
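Before using a generated key for lookups, it can help to check that it has the expected shape. The sketch below assumes the standard InChIKey layout of 14 uppercase letters, a hyphen, 10 uppercase letters, a hyphen, and one final letter. The helper names `is_inchikey_like` and `connectivity_block` are hypothetical, and the check only covers the format, not whether the key belongs to a real compound.

```python
import re

# Standard InChIKey layout: 14 letters, hyphen, 10 letters, hyphen, 1 letter.
INCHIKEY_PATTERN = re.compile(r"[A-Z]{14}-[A-Z]{10}-[A-Z]")


def is_inchikey_like(text: str) -> bool:
    """Return True if the text has the shape of an InChIKey."""
    return INCHIKEY_PATTERN.fullmatch(text) is not None


def connectivity_block(key: str) -> str:
    """Return the first 14-letter block, which is derived from the connectivity."""
    if not is_inchikey_like(key):
        raise ValueError(f"not an InChIKey-shaped string: {key!r}")
    return key.split("-")[0]


# Example with the InChIKey of methane.
methane_key = "VNWKTOKETHGBQD-UHFFFAOYSA-N"
print(is_inchikey_like(methane_key))
print(connectivity_block(methane_key))
```

Searching on the first block alone is a common way to find records that share the same skeleton but differ in stereochemistry or isotopes.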
3.3. References
[1] Jonathan M. Goodman, Igor Pletnev, Paul Thiessen, Evan Bolton, and Stephen R. Heller. InChI version 1.06: now more than 99.99% reliable. Journal of Cheminformatics, May 24, 2021.
[2] Daniel M. Lowe, Peter T. Corbett, Peter Murray-Rust, and Robert C. Glen. Chemical Name to Structure: OPSIN, an Open Source Solution. Journal of Chemical Information and Modeling, 51(3):739–753, March 28, 2011.
[3] Egon Willighagen. Bacting: a next generation, command line version of Bioclipse. Journal of Open Source Software, 6(62):2558, June 23, 2021.
3.4. Authors
Name | ORCID | Affiliation | Type | ELIXIR Node | Contribution |
---|---|---|---|---|---|
 | | Maastricht University | | | Writing - Original Draft |