Differences in matching Daylight and MDL formats

We aim for compatibility with both MDL and Daylight structure searches. However, some query features have different meanings in the two systems, so the interpretation of these features depends on the query input format. Queries of type SMILES, SMARTS, cxsmiles and cxsmarts are matched the Daylight way; all others are matched the MDL way.

The affected query features and their different matchings are detailed below.
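The format-based choice of matching rules described above can be sketched as a small lookup. This is only an illustration: the function name, the lower-case format labels and the returned strings are assumptions made for this sketch, not part of any search API.

```python
# Hypothetical helper: names and labels are chosen for this sketch only.
DAYLIGHT_FORMATS = {"smiles", "smarts", "cxsmiles", "cxsmarts"}


def matching_convention(query_format: str) -> str:
    """Return "daylight" or "mdl" for the given query input format name."""
    normalized = query_format.strip().lower()
    # SMILES, SMARTS, cxsmiles and cxsmarts follow Daylight rules;
    # every other input format follows MDL rules.
    return "daylight" if normalized in DAYLIGHT_FORMATS else "mdl"
```

For example, `matching_convention("SMARTS")` gives `"daylight"`, while a molfile label such as `"mol"` gives `"mdl"`.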

ANY and not list atoms

In MDL terminology, ANY atoms never match hydrogens; this excludes plain H, deuterium, charged H, etc. In Daylight, however, ANY matches isotopic and charged H, but not plain hydrogens.

For not list atoms where H (or #1) does not appear in the excluded list, Daylight behaves as above: it accepts isotopic and charged H only. MDL, on the other hand, never accepts hydrogens for not lists. Here we chose not to follow the MDL behavior, even for MDL format input, to avoid misinterpretation: for an MDL format query, all hydrogens match not lists. (Of course, if the H atom type is included in the not list, it will NOT match H.) See examples below.

Table 1.

MDL Query (molfile)    Daylight Query (SMARTS)
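The hydrogen rules for ANY and not list atoms can be summarized as two small predicates. This is a minimal sketch of the rules as stated in this section, not the actual search code; the function names and the `"mdl"` / `"daylight"` labels are hypothetical.

```python
def any_atom_matches_h(convention: str, isotopic: bool = False,
                       charged: bool = False) -> bool:
    """Does an ANY query atom match a hydrogen target atom?"""
    if convention == "mdl":
        # MDL: ANY never matches hydrogen of any kind.
        return False
    # Daylight: only isotopic or charged hydrogens match, plain H does not.
    return isotopic or charged


def not_list_matches_h(convention: str, h_in_not_list: bool,
                       isotopic: bool = False, charged: bool = False) -> bool:
    """Does a not list query atom match a hydrogen target atom?"""
    if h_in_not_list:
        # H is explicitly excluded, so no hydrogen matches.
        return False
    if convention == "mdl":
        # Deliberate deviation from MDL described above: all hydrogens match.
        return True
    # Daylight: same rule as for ANY atoms.
    return isotopic or charged
```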

H query property

In MDL terminology, the query property H<number> means at least <number> hydrogens in addition to those explicitly drawn on the query. H0 is a special case meaning no hydrogens beyond those explicitly drawn. In Daylight, on the other hand, H<number> means a total of <number> hydrogens (explicit and implicit).

Table 2.

MDL Query (molfile)    Daylight Query (SMARTS)
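The two readings of the H query property can be compared with a short predicate. This is a sketch under the definitions given in this section; the function and parameter names are hypothetical, and `target_total_h` is assumed to be the total (explicit plus implicit) hydrogen count on the matched target atom.

```python
def h_property_matches(convention: str, n: int, target_total_h: int,
                       query_explicit_h: int = 0) -> bool:
    """Check the H<n> query property against a target atom."""
    if convention == "daylight":
        # Daylight: exactly n hydrogens in total (explicit and implicit).
        return target_total_h == n
    # MDL: count hydrogens beyond those explicitly drawn on the query atom.
    excess = target_total_h - query_explicit_h
    if n == 0:
        # H0 is the special case: no hydrogens beyond the drawn ones.
        return excess == 0
    return excess >= n
```

With no hydrogens drawn on the query, a target atom carrying three hydrogens satisfies H2 under the MDL reading (at least two) but not under the Daylight reading (exactly two).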

Double bond stereo matching mode

This relates to cis-trans isomerism of double bonds. As described above, a search option controls this: setDoubleBondStereoMatchingMode(), which defaults to DBS_MARKED. When the DBS_MARKED option is set, cis/trans is only considered at marked double bonds. (Marking is an MDL query feature, also called the stereo care flag; it is depicted as a square over the double bond.) Daylight terminology, however, has no marked double bonds; it uses the directional bonds / and \ instead. To evaluate stereo SMARTS queries correctly with the default search, the DBS_MARKED option considers directional bonds for Daylight format queries. (Please note that Marvin has no special depiction for these SMARTS stereo bonds; however, non-stereo double bonds such as CC=CC are depicted with a wiggly ligand bond.)
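The behavior of the DBS_MARKED mode for the two query conventions can be sketched as follows. This models only the rule stated above for DBS_MARKED and does not call the search API; the function name, its parameters and the convention labels are assumptions made for illustration.

```python
def checks_cis_trans_under_dbs_marked(convention: str,
                                      has_stereo_care_flag: bool = False,
                                      has_directional_bonds: bool = False) -> bool:
    """Is cis/trans considered at a query double bond in DBS_MARKED mode?"""
    if convention == "daylight":
        # Daylight queries have no stereo care flag; directional bonds
        # (/ and \) next to the double bond play the same role.
        return has_directional_bonds
    # MDL queries: only double bonds marked with the stereo care flag count.
    return has_stereo_care_flag
```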

'D' and 's' features

The SMARTS feature 'D' (degree) in the Daylight implementation does not, by default, follow its description ("explicit connections"): it ignores explicit H connections (but counts explicit H isotopes). These are the same semantics as the MDL feature 's' ("substitution count"), so in searches the two features have the same meaning.
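The shared meaning of 'D' and 's' can be illustrated with a simple neighbor count. This is a minimal sketch, assuming each neighbor is represented as a hypothetical `(element symbol, mass number or None)` pair; it is not the search implementation.

```python
from typing import Iterable, Optional, Tuple

# Hypothetical neighbor representation: (element symbol, mass number or None).
Neighbor = Tuple[str, Optional[int]]


def substitution_count(neighbors: Iterable[Neighbor]) -> int:
    """Count explicit connections, ignoring plain H but counting H isotopes."""
    return sum(
        1
        for symbol, mass_number in neighbors
        # A plain hydrogen has symbol "H" and no isotope label.
        if not (symbol == "H" and mass_number is None)
    )
```

An atom bonded to one carbon, one plain hydrogen and one deuterium, `[("C", None), ("H", None), ("H", 2)]`, therefore has a count of 2.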

SMARTS feature matrix

Supported SMARTS features

Table 3.

SMARTS notation



Aromatic/Aliphatic atoms


Any atom








Total H count


Ring membership


Ring size








Atomic number

@, @@

Tetrahedral chirality


Chiral or unspecified

- / \ = # : ~    -,=    -,:

Bond types


Atom list


Atom not list




Reaction SMARTS


Component level grouping

/? \?

Directional bond or unspecified




Implicit H-count


Any ring bond

! & ; ,

General logical expressions within atom and bond descriptions.


Recursive SMARTS

NOT YET supported SMARTS features

Table 4.

SMARTS notation



Chirality class


Chirality class or unspecified

Molfile (MDL) query feature matrix

Supported Molfile (MDL) query features

Table 5.

Generic atoms: hetero (Q), any (A)

Atom list

Atom not list

No implicit hydrogens





Atom to atom map (reactions)

Chiral atoms

Chiral flag of molecules

Enhanced stereo representation (ABS, AND<n>, OR<n>)

Bond types: single, double, triple, aromatic, double cis or trans, single or double, single or aromatic, double or aromatic, any

Stereo bond types: single up, single down, single up or down

Double bond stereo care flag

Reactions: starting materials, products

Reaction stereo: inversion, retention

Reacting center

Atom alias

Pseudo atoms

LP atom type

R-group queries: up to two connections per R-group

R-logic: occurrence range, restH, if-then

S-groups: Super atom (abbreviated group), multiple group, mixture, component, formulation

Bond topology: in ring, in chain, none

Unsaturated atom

Ring bond count (RB)

Substitution count

Link atom

Polymer and attached data S-group types

NOT YET supported Molfile (MDL) features

Table 6.

3D special features

Exact change flag (reaction)

Beilstein generics