Tautomerization and tautomer models of ChemAxon

Introduction

This page describes the different tautomerization methods of ChemAxon and how they are used in various chemical scenarios. For a brief chemical introduction to tautomerization and tautomers, please read this page.

Tautomerization methods and their results

A tautomerization method generates a set of molecules (possibly a single molecule) by applying a set of tautomerization rules suited to a given chemical scenario, e.g. chemical searching or predicting the tautomer distribution in water.

The tautomer generation algorithm

The tautomer generation algorithm goes as follows:

  1. Identifying a set of possible donors and acceptors of the input molecule that can take part in tautomerization.

  2. Filtering this set based on the pre-set tautomerization method and other parameters. The tautomerization method applies a set of tautomerization rules that filters the original donor/acceptor set.

  3. Considering bond path length between identified donor and acceptor atoms, and filtering the set based on pre-set lengths.

  4. Generating the result tautomer set after this initial pre-processing.
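The four steps above can be illustrated with a small Python sketch. It is not ChemAxon's implementation: the toy atom chain, the donor/acceptor sets, the rule function and the names path_length, candidate_pairs and enumerate_shift_sets are all hypothetical, and the final step only enumerates combinations of hydrogen shifts rather than building real structures.

```python
from collections import deque
from itertools import combinations

# Toy molecule: a chain of five atoms, 0-1-2-3-4, stored as an adjacency map.
# All atom indices and roles below are made up for illustration.
BONDS = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
DONORS = {0}         # step 1: atoms assumed to carry a movable hydrogen
ACCEPTORS = {2, 4}   # step 1: atoms assumed able to receive it


def path_length(bonds, start, goal):
    """Return the number of bonds on the shortest path, or None if unreachable."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        atom, dist = queue.popleft()
        if atom == goal:
            return dist
        for neighbour in bonds[atom]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


def candidate_pairs(bonds, donors, acceptors, rule, max_path_length):
    """Steps 2 and 3: keep pairs that pass the rule and the path length limit."""
    pairs = []
    for donor in sorted(donors):
        for acceptor in sorted(acceptors):
            if not rule(donor, acceptor):
                continue
            dist = path_length(bonds, donor, acceptor)
            if dist is not None and dist <= max_path_length:
                pairs.append((donor, acceptor))
    return pairs


def enumerate_shift_sets(pairs):
    """Step 4 stand-in: every subset of pairs in which no atom is used twice."""
    results = []
    for size in range(len(pairs) + 1):
        for subset in combinations(pairs, size):
            atoms = [atom for pair in subset for atom in pair]
            if len(atoms) == len(set(atoms)):
                results.append(subset)
    return results


def accept_all(donor, acceptor):
    """A rule that filters nothing, comparable in spirit to an unfiltered method."""
    return True


pairs = candidate_pairs(BONDS, DONORS, ACCEPTORS, accept_all, max_path_length=4)
shift_sets = enumerate_shift_sets(pairs)
```

In this sketch, lowering max_path_length to 2 removes the pair (0, 4), which is how a path length limit shrinks the result set before any enumeration takes place.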

The tautomerization methods

All Tautomers

The All Tautomers tautomerization does not apply any additional filtering rules to the original donor/acceptor set, so the set is filtered only by the parameters pre-set by the user. The original donor/acceptor set is then used during combinatorial enumeration of all tautomers (step #4).

Example

Let's take 1,3-dimethyl-1H-pyrazol-5-ol as an example throughout this documentation:

images/download/thumbnails/1806170/original.png

We get the following All Tautomers set as output:

images/download/attachments/1806170/pyrazol_all.png

Fig. 1. All tautomers of 1,3-dimethyl-1H-pyrazol-5-ol

Application

The All Tautomers method is for exploring all possible tautomer forms of the input molecule.

Normal All Tautomers

The Normal tautomer generation scope can be applied to filter the All Tautomers set. The Normal scope narrows down the original donor/acceptor set by applying empirical rules that result in a chemically more relevant tautomer set.

This method yields the tautomers most likely to be present in common solvents (e.g. water, DMSO, CCl4).

Example

In the case of 1,3-dimethyl-1H-pyrazol-5-ol, the original tautomer set is narrowed down to the following three tautomers, which are the ones most likely present in different solvents:

images/download/attachments/1806170/pyrazol_normal_all.png

Fig. 2. Normal All tautomers of 1,3-dimethyl-1H-pyrazol-5-ol

Tautomer #1 is the one most likely present in water. Tautomer #2 can be considered the most abundant form in CCl4, while #3 is the most abundant form in DMSO.

Application

The Normal All tautomerization method is for predicting the tautomer forms into which the input molecule can, in general, spontaneously tautomerize.

Canonical Tautomer

The Canonical tautomerization generates a single tautomer form that chemically represents the whole tautomer set, i.e. all forms that are stable in water.

Application

The Canonical tautomerization can be applied when e.g. a set of molecules needs to be standardised and stored as a single tautomer.
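The idea of storing one representative per tautomer set can be sketched as follows. This is only an illustration under stated assumptions: the form labels, the TAUTOMER_SETS lookup and both function names are hypothetical, and picking the alphabetically smallest label merely stands in for a real canonicalization, which relies on chemical rules.

```python
from collections import defaultdict


def canonical_key(tautomer_forms):
    """Pick one deterministic representative; here simply the smallest string."""
    return min(tautomer_forms)


def group_by_canonical(records, tautomers_of):
    """Group input records that share the same canonical representative."""
    groups = defaultdict(list)
    for record in records:
        groups[canonical_key(tautomers_of(record))].append(record)
    return dict(groups)


# Hypothetical lookup: each input form maps to its full tautomer set.
TAUTOMER_SETS = {
    "enol_form": {"enol_form", "keto_form"},
    "keto_form": {"enol_form", "keto_form"},
    "unrelated": {"unrelated"},
}

groups = group_by_canonical(
    ["keto_form", "enol_form", "unrelated"], TAUTOMER_SETS.__getitem__
)
```

Because every member of a tautomer set maps to the same key, two records drawn as different tautomers of one compound end up in the same group and can be stored once.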

Example

In the case of 1,3-dimethyl-1H-pyrazol-5-ol, the canonical tautomer is the following form. Here the original OH group can be converted into an =O group, while a ring N can be tautomerized into an NH group.

images/download/attachments/1806170/canonical.png

Fig. 3. Canonical tautomer of 1,3-dimethyl-1H-pyrazol-5-ol

Normal Canonical Tautomer

The Normal Canonical tautomerization is very similar to the Canonical tautomerization, but it incorporates more empirical rules to be chemically more accurate.

Application

The application of the Normal Canonical tautomerization is very similar to that of the Canonical tautomerization. This tautomer form is usually used for searching or standardization purposes.

Example

The following example shows that canonical and normal canonical tautomers are not necessarily the same:

images/download/attachments/1806170/can_vs_nc.png

The above-mentioned general principle of generating normal canonical tautomers applies here. In this case the normal canonical form is chemically more relevant than the plain canonical form.

Ionization and tautomerization

Canonical tautomerization tries to eliminate the pH dependence of the tautomer form by neutralising charges that can be linked to an ionisable group in the molecule. In that sense, ionisation is not considered part of canonicalisation; it is part of resonance handling.

Here is an example:

images/download/attachments/1806170/charge_can.png

The Dominant Tautomer distribution and the Major Tautomer

The dominant tautomer distribution in water can be predicted. This process tries to transform the input molecule from an unstable form into a more stable one. If pH dependence is taken into account, pKa values also play a role in the prediction because of ionisation.

The major tautomer is the most abundant form in water.
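The relationship between a tautomer distribution and the major tautomer can be shown with a short Python sketch. The function names and the form labels are hypothetical, and the weights are arbitrary placeholders, not predicted values for any molecule; how the distribution itself is calculated is not covered here.

```python
def normalise(distribution):
    """Rescale raw weights so that the shares add up to 100 percent."""
    total = sum(distribution.values())
    if total <= 0:
        raise ValueError("distribution must have a positive total")
    return {form: 100.0 * share / total for form, share in distribution.items()}


def major_tautomer(distribution):
    """Return the label of the form with the largest share."""
    if not distribution:
        raise ValueError("distribution is empty")
    return max(distribution, key=distribution.get)


# Placeholder weights only; these are not calculated results.
EXAMPLE_WEIGHTS = {"form_1": 6.0, "form_2": 3.0, "form_3": 1.0}

percentages = normalise(EXAMPLE_WEIGHTS)
major = major_tautomer(percentages)
```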

Application

Knowing the tautomer distribution in water can be important because tautomerization can affect molecular properties (e.g. pKa) and therefore the behaviour of the molecule in water.

Example

In the case of 1,3-dimethyl-1H-pyrazol-5-ol, we get the following distribution and major form:

images/download/attachments/1806170/dominant_major.png