Maximum Common Substructure (MCS) search¶
This manual introduces Chemaxon's Maximum Common Substructure search.
Introduction¶
Finding the Maximum Common Substructure (MCS) of two molecules is a problem with many applications in the field of cheminformatics. It can be used for similarity search, hierarchical clustering, molecule alignment, and automated reaction mapping. An example is presented in Fig. 1.
Chemaxon provides powerful heuristic algorithms for MCS search, which typically find large common substructures in a short time. However, they do not always provide the exact optimal result due to the complexity of the MCS problem (especially for large molecules).
Fig. 1. Maximum Common Substructure (MCS) of two molecules
From a graph-theoretical point of view, the MCS of two molecules is defined as the maximum common edge subgraph (MCES) of the two molecule graphs. That is, MCS and MCES mean exactly the same thing in Chemaxon's terminology.
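To make the MCES definition concrete, here is a minimal brute-force sketch for very small graphs. It is not Chemaxon's heuristic algorithm: it tries every label-preserving partial atom mapping and keeps the one that preserves the most bonds. The function name and the graph representation (a list of element symbols plus a list of atom index pairs) are assumptions made for illustration, and bond types are ignored.

```python
def max_common_edge_subgraph(q_atoms, q_bonds, t_atoms, t_bonds):
    """Return (edge_count, mapping) for a maximum common edge subgraph.

    Atoms are lists of element symbols; bonds are lists of index pairs.
    Exhaustive search: only suitable for a handful of atoms.
    """
    t_bond_set = {frozenset(b) for b in t_bonds}
    best_count, best_mapping = 0, {}
    mapping, used = {}, set()

    def common_edges():
        # A query bond is common if both ends are mapped onto a target bond.
        return sum(
            1 for u, v in q_bonds
            if u in mapping and v in mapping
            and frozenset((mapping[u], mapping[v])) in t_bond_set
        )

    def extend(i):
        nonlocal best_count, best_mapping
        if i == len(q_atoms):
            count = common_edges()
            if count > best_count:
                best_count, best_mapping = count, dict(mapping)
            return
        extend(i + 1)  # option 1: leave query atom i unmapped
        for j, symbol in enumerate(t_atoms):  # option 2: map it to a free target atom
            if j not in used and symbol == q_atoms[i]:
                mapping[i] = j
                used.add(j)
                extend(i + 1)
                del mapping[i]
                used.discard(j)

    extend(0)
    return best_count, best_mapping


# Ethanol (C-C-O) against 1-propanol (C-C-C-O), heavy atoms only.
ethanol = (["C", "C", "O"], [(0, 1), (1, 2)])
propanol = (["C", "C", "C", "O"], [(0, 1), (1, 2), (2, 3)])
count, mapping = max_common_edge_subgraph(*ethanol, *propanol)
print(count, mapping)
```

The search is exponential in the number of atoms, which is why practical tools rely on heuristics for larger molecules.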
Even though the roles of the two molecules in MCS search are generally the same, we distinguish a query and a target molecule. The reason for this is that some special query features are only allowed in the query molecule (see details below).
Search features¶
Search options¶
MCS search can be customized using various search options:
- considering atom and bond types;
- considering charges, isotopes, and radicals (see Table 1);
- connected or disconnected MCS search (see Table 2);
- setting a minimum size for extra fragments (in case of disconnected MCS);
- setting whether and how rings can be broken (see Table 3).
Furthermore, two search modes are provided with different speed/accuracy trade-offs. Consider using the FAST mode if search speed matters more to you than the accuracy of the MCS results.
Examples¶
| Charge matching | Query | Target |
|---|---|---|
| False (default): formal charges of the atoms need not match. | | |
| True: formal charges of the atoms must match. | | |
Table 1. Charge matching option
| Connected mode | Query | Target |
|---|---|---|
| False (default): the MCS can consist of multiple fragments. | | |
| True: the MCS must consist of only one fragment. | | |
Table 2. Connected mode
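The difference between connected and disconnected results comes down to how many fragments the common substructure has. The sketch below groups the atoms of a common substructure into fragments with a small union-find, and shows one possible reading of the minimum size option for extra fragments (the largest fragment is always kept, smaller extras are dropped). All function names are hypothetical, and the way the option is actually applied during the search is not described here.

```python
def fragments(atom_indices, bonds):
    """Group atoms into connected fragments, largest first."""
    parent = {a: a for a in atom_indices}

    def find(a):
        # Follow parent links to the fragment representative (with path halving).
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    for u, v in bonds:
        parent[find(u)] = find(v)

    groups = {}
    for a in atom_indices:
        groups.setdefault(find(a), []).append(a)
    return sorted(groups.values(), key=len, reverse=True)


def is_connected(atom_indices, bonds):
    """True if the substructure consists of exactly one fragment."""
    return len(fragments(atom_indices, bonds)) == 1


def drop_small_extra_fragments(atom_indices, bonds, min_size):
    """Keep the largest fragment plus any extra fragment of at least min_size atoms."""
    frags = fragments(atom_indices, bonds)
    return frags[:1] + [f for f in frags[1:] if len(f) >= min_size]


# A common substructure with a 3-atom fragment and a 2-atom fragment.
atoms = [0, 1, 2, 3, 4]
bonds = [(0, 1), (1, 2), (3, 4)]
print(fragments(atoms, bonds))
print(is_connected(atoms, bonds))
print(drop_small_extra_fragments(atoms, bonds, min_size=3))
```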
Table 3. Ring handling mode
Note that the latter two options only consider rings within a given size limit. The default maximum size is eight, i.e., rings of nine or more atoms may be broken even if the KEEP_RINGS option is used. (This limit can be changed.)
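The effect of the ring size limit can be sketched as follows. The example assumes that a ring counts as "broken" when only some of its bonds are part of the common substructure, and that rings larger than the limit are simply not checked. The function name, the ring representation (a list of bonds per ring), and this reading of the option are assumptions, not the documented behavior.

```python
def broken_rings(rings, mcs_bonds, max_ring_size=8):
    """Return the rings within the size limit that are only partly in the MCS.

    rings: list of rings, each given as a list of bonds (atom index pairs).
    mcs_bonds: bonds of the molecule that belong to the common substructure.
    """
    mcs = {frozenset(b) for b in mcs_bonds}
    broken = []
    for ring in rings:
        if len(ring) > max_ring_size:
            continue  # larger than the limit: not protected, so not reported
        present = sum(1 for b in ring if frozenset(b) in mcs)
        if 0 < present < len(ring):
            broken.append(ring)
    return broken


def cycle(start, size):
    """Bonds of a simple ring over atoms start .. start+size-1."""
    return [(start + i, start + (i + 1) % size) for i in range(size)]


six_ring = cycle(0, 6)     # atoms 0-5
nine_ring = cycle(10, 9)   # atoms 10-18
partial = six_ring[:3] + nine_ring[:4]  # only parts of both rings are matched

print(len(broken_rings([six_ring, nine_ring], partial)))
print(len(broken_rings([six_ring, nine_ring], partial, max_ring_size=9)))
```

With the default limit only the six-membered ring is flagged. Raising the limit to nine flags both.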
Query features¶
The following query features are supported in MCS search, but only in the query molecule:
- generic query atoms (any, halogen, metal, etc.);
- atom lists and not-lists;
- generic bonds (any, single or double, etc.).
However, complex features such as stereochemistry, tautomers and Markush structures are not supported.
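As an illustration of why these features are one-sided, the sketch below shows how a query atom with a generic type, an atom list, or a not-list could be compared with a plain target atom. The tuple encoding of query atoms, the function name, and the halogen set are assumptions made for this example. Target atoms are taken to be heavy atoms given by their element symbol.

```python
HALOGENS = {"F", "Cl", "Br", "I", "At"}


def atom_matches(query_atom, target_symbol):
    """Decide whether a (kind, value) query atom matches a target element symbol."""
    kind, value = query_atom
    if kind == "element":   # ordinary atom: symbols must be equal
        return target_symbol == value
    if kind == "any":       # generic "any" atom
        return True
    if kind == "halogen":   # generic halogen atom
        return target_symbol in HALOGENS
    if kind == "list":      # atom list: one of the listed elements
        return target_symbol in value
    if kind == "not_list":  # not-list: anything except the listed elements
        return target_symbol not in value
    raise ValueError(f"unsupported query atom kind: {kind}")


print(atom_matches(("halogen", None), "Cl"))
print(atom_matches(("list", {"N", "O"}), "S"))
print(atom_matches(("not_list", {"N", "O"}), "S"))
```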
Example¶
| Query | Target |
|---|---|
Usage¶
Command line usage¶
JChem also provides a simple command line application for MCS search (mainly for evaluation and demonstration purposes).
The program can be used as:
Options¶
Examples¶
- Example 1. Search for the MCS of the given query (`-q`) and target (`-t`) molecules.
| Command | `mcs -q "C12CCC(O)C1(C)CCC1C2CCC2=CC(=O)CCC12C" -t "CC(=O)C1CCC2C3CCC4=CC(=O)CCC4(C)C3CCC12C"` |
|---|---|
| Result (console) | Query: CC12CCC3C(CCC4=CC(=O)CCC34C)C1CCC2O<br>Target: CC(=O)C1CCC2C3CCC4=CC(=O)CCC4(C)C3CCC12C<br>MCS: CC12CCCC1C1CCC3=CC(=O)CCC3(C)C1CC2<br>Atom count: 20<br>Bond count: 23<br>Fragment count: 1<br>Similarity score: 0.8519 |
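The manual does not say how the similarity score is defined. One formula that reproduces the value reported in Example 1 is a Tanimoto coefficient over bond counts, which would fit the edge-based (MCES) definition of the MCS. The sketch below is that assumption written out. The bond counts of the query (24) and the target (26) were counted by hand from the SMILES strings above, and the function name is hypothetical.

```python
def bond_tanimoto(query_bonds, target_bonds, mcs_bonds):
    """Tanimoto coefficient computed from bond counts (assumed definition)."""
    return mcs_bonds / (query_bonds + target_bonds - mcs_bonds)


# Example 1: 24 bonds in the query, 26 in the target, 23 in the MCS.
score = bond_tanimoto(24, 26, 23)
print(f"{score:.4f}")  # 0.8519
```

The same coefficient over atom counts (21, 23 and 20) gives 0.8333, which does not match the reported score.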
- Example 2. Search for the MCS of the given two molecules (`-q`, `-t`) and display the results (`-w`).
| Command | `mcs -w -q "CN(O)C1=CC(=CC=C1)C(O)=O" -t "OC(=O)CC1=CC=C(C=C1)[N+]([O-])=O"` |
|---|---|
| Result | ![]() |
- Example 3. Search for the connected MCS (`-c`) of the given two molecules (`-q`, `-t`) using charge matching (`-m c+`) and display the results (`-w`), including atom mapping numbers (`-a`).
| Command | `mcs -c -m c+ -w -a -q "CN(O)C1=CC(=CC=C1)C(O)=O" -t "OC(=O)CC1=CC=C(C=C1)[N+]([O-])=O"` |
|---|---|
| Result | ![]() |
- Example 4. Search for the pairwise MCS of the molecules in the given two input files (`-q`, `-t`) and display the results in a grid view (`-g`).
| Command | mcs -g -q queries.mrv -t targets.smiles |
|---|---|
| Result | ![]() |
API usage¶
The com.chemaxon.search.mcs package contains classes that provide an API for MCS search. Here is a simple example demonstrating the usage of these classes:
You can specify search options like this:
For more information, see the API documentation of MaxCommonSubstructure and McsSearchOptions.
References¶
- Péter Englert and Péter Kovács. Efficient Heuristics for Maximum Common Substructure Search. Journal of Chemical Information and Modeling, 2015, 55:941-955.
- John W. Raymond and Peter Willett. Maximum common subgraph isomorphism algorithms for the matching of chemical structures. Journal of Computer-Aided Molecular Design, 2002, 16:521-533.
- Takeshi Kawabata. Build-up algorithm for atomic correspondence between chemical structures. Journal of Chemical Information and Modeling, 2011, 51:1775-1787.




