Getting Started

This is a short introduction to help you get started with the Chemaxon Python API. For thorough documentation, go to the API Reference. A more detailed list of examples can be found in the Python API examples project.

Molecule import

You can import molecules from various formats. Either specify the format yourself or omit it,
in which case the function recognizes the format automatically.

from chemaxon import import_mol
mol = import_mol("CC(=O)OC1=CC=CC=C1C(O)=O")
mol2 = import_mol("CC(=O)OC1=CC=CC=C1C(O)=O", "smiles")

The import_mol function returns a chemaxon.molecule.Molecule object on success.

Bulk import from file:

from chemaxon import open_for_import, export_mol

with open_for_import("my_molecules.sdf") as mol_iterator:
    for mol in mol_iterator:
        print(export_mol(mol, "smiles:u"))

Bulk export to a file:

from chemaxon import open_for_export, import_mol

molecule_list = ... # assume we have a list containing molecules

with open_for_export("file.sdf", "sdf") as out:
    write_res = all(out.write(mol) for mol in molecule_list)
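
Note that all(...) over a generator stops at the first falsy value, so molecules after a failed write would not be written at all. If every molecule should be attempted, a small helper along the following lines could be used. It is only a sketch: write_all is a hypothetical name, and it assumes that write returns a truthy value on success, which the expression above implies but which should be checked against the API Reference.

```python
def write_all(writer, molecules):
    """Try to write every molecule; return the indexes of the failed writes.

    Assumption: writer.write(mol) returns a truthy value on success.
    """
    failed = []
    for index, mol in enumerate(molecules):
        # Keep going after a failure instead of stopping at the first one
        if not writer.write(mol):
            failed.append(index)
    return failed
```

Inside the with block above, this would be called as failed = write_all(out, molecule_list); an empty list means every write reported success.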

Molecule export

You can also export molecules into various formats. You need to specify the format in this case.

from chemaxon import import_mol, export_mol
mol = import_mol("CC(=O)OC1=CC=CC=C1C(O)=O")
mol_str = export_mol(mol, "smiles")
mol2_str = export_mol(mol, "mrv")

The export_mol function returns a str object on success, which contains the molecule in the specified format.

Export to file (even multiple molecules):

from chemaxon import open_for_export
molecules = list(...) # A list of chemaxon.molecule.Molecule objects

with open_for_export("my_molecules.sdf", "sdf") as mol_exporter:
    for mol in molecules:
        mol_exporter.write(mol, "mol:V3")

Molecule display

The molecule can be rendered as an SVG string:

from chemaxon import import_mol
mol = import_mol("CC(=O)OC1=CC=CC=C1C(O)=O")
svg = mol._repr_svg_()

Example svg file.

Calculations

Once you have a molecule, you can calculate various properties of it.

from chemaxon import logp, LogpMethod
result = logp(mol, method=LogpMethod.CHEMAXON, consider_tautomerization=True)

This function returns a chemaxon.calculations.logp.LogPResult object, which contains the calculated logP value,
and the logP values of the individual atoms.

You can also calculate logP using the default parameter values:

from chemaxon import logp
result = logp(mol)

Fingerprint calculations

Chemaxon has a number of functions that you can use to generate fingerprints.

from chemaxon import cfp, ecfp
# Use distinct variable names so the imported functions are not shadowed
cfp_fp = cfp(mol)
ecfp_fp = ecfp(mol, 4, 1024)

These functions return chemaxon.fingerprints.fingerprint.Fingerprint objects. You can get the fingerprints
in bytes or in binary string format.

ecfp_fp.to_bytes()
ecfp_fp.to_binary_string()

You can also calculate pharmacophore fingerprints:

from chemaxon import pharmacophore_fp
pf = pharmacophore_fp(mol)

This function returns a FloatVectorFingerprint, which contains a float array.

You can also calculate the Tanimoto dissimilarity between fingerprints:

from chemaxon import tanimoto, ecfp, pharmacophore_fp, float_vector_tanimoto
ecfp1 = ecfp(mol, 4, 1024)
ecfp2 = ecfp(mol2, 4, 1024)

result1 = tanimoto(ecfp1, ecfp2)

pf1 = pharmacophore_fp(mol)
pf2 = pharmacophore_fp(mol2)
result2 = float_vector_tanimoto(pf1, pf2)
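
To make the metric concrete, the Tanimoto dissimilarity of two bit fingerprints can be computed by hand as one minus the ratio of shared set bits to set bits in either fingerprint. The sketch below works on plain strings of '0' and '1' characters. It assumes that to_binary_string() returns such a string, and tanimoto_dissimilarity is a hypothetical helper, not part of the Chemaxon API; whether the library's tanimoto function yields exactly the same number is likewise an assumption.

```python
def tanimoto_dissimilarity(bits_a: str, bits_b: str) -> float:
    """Return 1 - |A and B| / |A or B| for two equal-length bit strings."""
    if len(bits_a) != len(bits_b):
        raise ValueError("fingerprints must have the same length")
    # Count positions set in both fingerprints and in at least one of them
    both = sum(1 for a, b in zip(bits_a, bits_b) if a == "1" and b == "1")
    either = sum(1 for a, b in zip(bits_a, bits_b) if a == "1" or b == "1")
    if either == 0:
        # Convention chosen here: two empty fingerprints count as identical
        return 0.0
    return 1.0 - both / either
```

For example, "1100" and "1010" share one set bit out of three set in total, so their dissimilarity is 1 - 1/3.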

Structure standardization

Standardization in Python works similarly to our other tools.
For more information about standardization in general, see the Standardizer Introduction
and the list of available standardizer actions.

Standardizer can be configured via action strings:

from chemaxon import import_mol, Standardizer

standardizer = Standardizer("dearomatize..removeexplicith:lonely")
mol = import_mol("[H]c1c([H])c([H])c([H])c([H])c1[H].[H]")
standard_mol = standardizer.standardize(mol)

Or via XML configuration:

from chemaxon import import_mol, Standardizer

configFile = "path/to/my_config.xml"
with open(configFile) as config:
    standardizer = Standardizer(config)
    mol = import_mol("[H].[H]C1=C([H])C([H])=C([H])C([H])=C1[H]")
    standard_mol = standardizer.standardize(mol)

Structure checker

For the concept of Structure Checker in general, see the Structure Checker User's Guide.
Built-in checkers and fixers are available from the Chemaxon Python API; custom implementations are not supported yet.
The structure fixer keeps the input molecule unchanged and returns an object containing the result molecule and a flag indicating whether the fix was successful.

The following example shows the configuration and usage of structure checker via action strings:

from chemaxon import import_mol, StructureChecker

mol = import_mol('[NH3+]C1=CC(O)=CC=C1.O')
checker = StructureChecker('solvent->removeatom..moleculecharge->neutralize')

check_result = checker.check(mol)
for r in check_result.results:
    print('Checker name:', r.checker_name)

fix_result = checker.fix(mol)
print("Fix was" + ("" if fix_result.is_fix_successful else "n't") + " successful.")
fixed_molecule = fix_result.fixed_mol

Action strings can also be specified as a list:

...
checker = StructureChecker(["solvent->removeatom","moleculecharge->neutralize"])
...

When using an XML file, the structure checker is created as follows:

...
configFile = "path/to/my_config.xml"
with open(configFile) as config:
    checker = StructureChecker(config)
...

Searching for a query molecule in a target molecule is supported by the MoleculeSearch class. For detailed documentation of our search engine, see our search documentation. Second-generation search engine functionality and semantics are not supported yet. The search type and the standardization configuration can optionally be specified; they default to substructure search and general aromatization:

from chemaxon import import_mol, SearchType, MoleculeSearch, Standardizer

search = MoleculeSearch(search_type=SearchType.DUPLICATE, standardizer="aromatize:b")
...

The standardizer can also be specified as a Standardizer object:

...
search = MoleculeSearch(standardizer=Standardizer("aromatize:b"))
...

A search is executed with the find_hit method, which takes the query and target molecules as mandatory arguments and an optional return_colored_hit argument that defaults to False. The method returns a SearchHit object if there is a hit and None otherwise. The hit object has hit_indexes and colored_hit fields; the latter is None if hit coloring was switched off.

...
search = MoleculeSearch()
query = import_mol('CC')
target = import_mol('CCC')
hit = search.find_hit(query, target)
hit.hit_indexes
hit = search.find_hit(query, target, return_colored_hit=True)
hit.colored_hit
...
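
Because find_hit returns None when nothing matches, reading hit.hit_indexes directly raises an AttributeError for a non-matching pair. A small guard such as the one below avoids that. It is a sketch only: matched_atom_indexes is a hypothetical helper, and it assumes that hit_indexes is an iterable of indexes.

```python
def matched_atom_indexes(hit):
    """Return the hit's indexes as a list, or an empty list if there is no hit."""
    if hit is None:
        return []
    # Assumption: hit.hit_indexes is iterable
    return list(hit.hit_indexes)
```

With the objects above, matched_atom_indexes(search.find_hit(query, target)) would then be safe to call whether or not the query matches.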

Searching on a list of molecules

The MoleculeSearch class also supports searching a list of molecules. In this case the search is multithreaded, but no fingerprint screening is executed before the search.

...
query_mol = import_mol('C', 'smiles')
target_mol_1 = import_mol('CCCCCl', 'smiles')
target_mol_2 = import_mol('NCCCCCN', 'smiles')
target_mol_3 = import_mol('OO', 'smiles')  # hydrogen peroxide; no carbon, so no hit
input_list = [target_mol_1, target_mol_2, target_mol_3]

mol_search = MoleculeSearch()
result_list = mol_search.find_hits_in_list(query_mol, input_list)

The result contains one entry for every input target structure, in the same order as the targets were specified: a SearchHit object, or None if there is no hit for the given target.
As with the find_hit method, the return_colored_hit option can be specified if a colored hit is also required.
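
Since the result list is aligned with the input list and uses None for misses, the matching targets can be recovered by pairing the two lists and dropping the None entries. The following is a minimal sketch; targets_with_hits is a hypothetical helper, not part of the Chemaxon API.

```python
def targets_with_hits(targets, results):
    """Pair each target with its hit, skipping targets whose result is None."""
    if len(targets) != len(results):
        raise ValueError("targets and results must have the same length")
    return [(target, hit) for target, hit in zip(targets, results) if hit is not None]
```

With the example above, this would be called as targets_with_hits(input_list, result_list).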

Exception handling

from chemaxon import Molecule, LogpMethod, logp

try:
    # To illustrate exception raising, a non-importable molecule is used
    mol = Molecule("non-importable", None)
    logp(mol, LogpMethod.CHEMAXON, -12, -34, consider_tautomerization=True, ph=15)
except RuntimeError as e:
    # Custom error handler code goes here
    print("Calculation failed:", e)
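
When many molecules are processed in a loop, it can be convenient to wrap the try/except pattern in a helper so that one failing structure does not abort the whole run. The sketch below is one possible way to do that. call_or_default is a hypothetical helper, and it catches only RuntimeError, as in the example above; whether every API function signals failure with RuntimeError is an assumption.

```python
def call_or_default(func, *args, default=None, **kwargs):
    """Call func(*args, **kwargs); return default if it raises RuntimeError."""
    try:
        return func(*args, **kwargs)
    except RuntimeError as error:
        # Report the failure and fall back to the default value
        print(f"Calculation failed: {error}")
        return default
```

It could be used as call_or_default(logp, mol), which returns None for a molecule whose calculation raises RuntimeError.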