MDL MOLfiles, RGfiles, SDfiles, Rxnfiles, RDfiles formats
MOL V2000 files
-
Atom block:
- x, y, z coordinates
-
atom type:
- 1H, 2He, 3Li, ..., 103Lr,
- atom list and exclusive list L,
- "any" atoms A, Q, *,
- lone pair LP
- R-group: R or RN, where N is a positive integer. Before version 5.9, R without a number was written as R#
- charge
- stereo-care box
- valence
- atom-atom mapping (for reactions)
- inversion/retention flag (for reactions)
Codename: mol
Extension: .mol
-
Bond block:
- bond type: 1, 2, 3, aromatic, "any", "single or double", "single or aromatic", "double or aromatic", "hydrogen" or "coordinate" (import only)
- bond stereo information: up or down
- bond topology: ring or chain
-
Properties block:
M ALS- atom list and exclusive list
M APO- Rgroup attachment point
M CHG- charge
M RAD- radical
M ISO- isotope mass numbers
M RGP- Rgroup labels on root structure
M LOG- Rgroup logic
M LIN- link nodes
M SUB- substitution count query property (s)
M UNS- unsaturated atom query property (u)
M RBC- ring bond count query property (rb)
M STY- Sgroup type
M SST- Sgroup subtype
M SCN- Sgroup connectivity (head-to-head, head-to-tail or either/unknown)
M SAL- atoms that define the Sgroup
M SPA- multiple group parent atom list (paradigmatic repeating unit atoms)
M SBL- Sgroup's crossing bonds
M SMT- Sgroup label
M SPL- Sgroup parent list
M SDS EXP- Sgroup expansion
M SDT- Data sgroup field description
M SDD- Data sgroup display information
M SCD- Data sgroup data
M SED- Data sgroup data end of line
M SNC- Sgroup component numbers
M CRS- Sgroup correspondence
M SDI- display coordinates in each S-group bracket
M SBT- the displayed S-group bracket style
M SAP- the S-group attachment point information
M MRV SMA- SMARTS H, X, R, r, a, A properties (Marvin extension)
A- Atom alias
V- Atom value
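As an illustration of the bond block and of the simpler property lines listed above, the following minimal Python sketch decodes one V2000 bond line and one count-prefixed property line such as M  CHG. It assumes the fixed-width bond layout (three-character fields for the two atom indices, the bond type, the stereo code and, after one unused field, the topology code) and the "M  XXXnn8 aaa vvv ..." layout used by M  CHG, M  RAD and M  ISO. The function names and lookup tables are hypothetical and are not part of Marvin's API. The Marvin-specific "hydrogen" and "coordinate" bond types are left out because their numeric codes are not stated here.

```python
# Numeric codes of the V2000 bond block (Marvin-only bond types omitted).
BOND_TYPES = {
    1: "single", 2: "double", 3: "triple", 4: "aromatic",
    5: "single or double", 6: "single or aromatic",
    7: "double or aromatic", 8: "any",
}
BOND_STEREO = {0: "none", 1: "up", 4: "either", 6: "down"}  # single bonds
BOND_TOPOLOGY = {0: "either", 1: "ring", 2: "chain"}


def _int_field(line, start, end):
    """Read a fixed-width integer field; blank or missing fields count as 0."""
    text = line[start:end].strip()
    return int(text) if text else 0


def parse_bond_line(line):
    """Decode the fields of one bond line: 111222tttsssxxxrrrccc."""
    bond_type = _int_field(line, 6, 9)
    stereo = _int_field(line, 9, 12)
    topology = _int_field(line, 15, 18)
    return {
        "atom1": _int_field(line, 0, 3),
        "atom2": _int_field(line, 3, 6),
        "type": BOND_TYPES.get(bond_type, f"unknown ({bond_type})"),
        "stereo": BOND_STEREO.get(stereo, f"unknown ({stereo})"),
        "topology": BOND_TOPOLOGY.get(topology, f"unknown ({topology})"),
    }


def parse_property_pairs(line):
    """Parse an 'M  CHG'-style line into (tag, {atom_index: value})."""
    parts = line.split()
    if len(parts) < 3 or parts[0] != "M":
        raise ValueError(f"not a property line: {line!r}")
    tag, count = parts[1], int(parts[2])
    numbers = [int(p) for p in parts[3:3 + 2 * count]]
    if len(numbers) != 2 * count:
        raise ValueError(f"expected {count} atom/value pairs: {line!r}")
    # Even positions hold atom indices, odd positions hold the values.
    return tag, dict(zip(numbers[0::2], numbers[1::2]))


if __name__ == "__main__":
    print(parse_bond_line("  1  2  1  1  0  1  0"))
    print(parse_property_pairs("M  CHG  2   1   1   4  -1"))
```

Splitting the property line on whitespace is safe in this layout because every atom index and value is preceded by a space; the bond line is sliced by column instead, since adjacent three-digit fields may touch.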
Extended MOLfiles (V3000)
The extended format is used if the number of atoms or bonds exceeds 999, for reactions with Rgroups, or when the molecule contains enhanced stereo. In an extended MOLfile, the following properties and features are supported:
-
Atom block:
- x, y, z coordinates
-
atom type:
- 1H, 2He, 3Li, ..., 103Lr,
- "any" atoms A, Q, *,
- lone pair LP
- atom-atom mapping (for reactions)
- inversion/retention flag (INVRET)
CHG- charge
RAD- radical
CFG- parity
VAL- valence
MASS- isotope mass number
HCOUNT- number of implicit hydrogens
STBOX- stereo-care box
INVRET- inversion/retention flag
ATTCHPT- R-group attachment point
RGROUPS- R-groups that comprise this R# atom
SUBST- Substitution count query property (s)
UNSAT- Unsaturated atom query property (u)
RBCNT- Ring bond count query property
Restriction (RGROUPS): only one R-group can comprise an atom in Marvin
Codename: mol:V3
Extension: .mol
-
Bond block:
- bond type: 1, 2, 3, aromatic, "any", "single or double", "single or aromatic", "double or aromatic"
CFG- bond stereo configuration: up or down
TOPO- bond topology: ring or chain
STBOX- stereo-care box
LINKNODE- Link nodes.
-
Sgroup block:
ATOMS- atoms that define the Sgroup
PATOMS- multiple group parent atom list (paradigmatic repeating unit atoms)
XBONDS- crossing bonds
MULT- multiple group multiplier
CONNECT- connectivity (head-to-head, head-to-tail or either/unknown)
LABEL- display label
PARENT- parent Sgroup
ESTATE- expanded state
FIELDNAME- data Sgroup field name
FIELDINFO- data Sgroup field information (type and units)
FIELDDISP- data Sgroup field display information
QUERYTYPE- data Sgroup program query code
QUERYOP- data Sgroup query operator
FIELDDATA- data Sgroup field value
BRKXYZ- display coordinates in each S-group bracket
BRKTYP- the displayed S-group bracket style
COMPNO- Sgroup component numbers
CBONDS- Sgroup's crossing bonds
XBHEAD, XBCORR- Sgroup correspondence
SUBTYPE- Sgroup subtype
CLASS- Sgroup class; AA: amino acid
CSTATE- Sgroup contracted state
ESTATE- Sgroup expanded state
SAP- the S-group attachment point information
-
Collection block:
Enhanced stereo features, see also the V3ec and V3ea export options.
MDLV30/STEABS- ABSOLUTE stereochemical group
MDLV30/STEREL- OR stereochemical group
MDLV30/STERAC- AND stereochemical group
-
Atom highlighting.
MDLV30/HILITE- Highlighted atoms and bonds, currently represented as atom/bond set 1. (This feature is experimental and import only!)
- Rgroup blocks with RLOGIC entries
- Template block (import only)
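To make the keyword-based atom properties above more concrete, here is a minimal Python sketch that splits one V3000 atom line into its positional fields (index, atom type, x, y, z, atom-atom mapping) and its optional KEY=value properties such as CHG or MASS. The function name parse_v30_atom_line is hypothetical. The sketch assumes a single, complete "M  V30" line and does not handle continuation lines, quoted values or atom lists that contain spaces.

```python
def parse_v30_atom_line(line):
    """Split a V3000 atom line into positional fields and KEY=value properties."""
    prefix = "M  V30 "
    if not line.startswith(prefix):
        raise ValueError(f"not a V3000 line: {line!r}")
    tokens = line[len(prefix):].split()
    if len(tokens) < 6:
        raise ValueError(f"incomplete atom line: {line!r}")

    properties = {}
    for token in tokens[6:]:
        key, separator, value = token.partition("=")
        if separator:  # keep only well-formed KEY=value tokens
            properties[key] = value

    return {
        "index": int(tokens[0]),
        "type": tokens[1],
        "xyz": tuple(float(t) for t in tokens[2:5]),
        "aamap": int(tokens[5]),  # atom-atom mapping number, 0 if unmapped
        "props": properties,      # values are kept as strings
    }


if __name__ == "__main__":
    atom = parse_v30_atom_line("M  V30 1 C -0.5 1.2 0 0 CHG=-1 MASS=13")
    print(atom["type"], atom["xyz"], atom["props"])
```

Property values are left as strings on purpose, because some keywords (for example RGROUPS) carry lists rather than single integers.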
Reaction files (V2000)
A reaction file consists of a REACTANT block, a PRODUCT block, and (optionally) an AGENT block. Reaction files containing reaction agents are non-standard.
Codename: rxn
Extension: .rxn
Each block starts with a 'Molecule or Reaction Identifier'. A molecule identifier must have one of the following forms:
- $MFMT $MIREG N
- $MFMT $MEREG N
- $MIREG N
- $MEREG N
Here $MFMT means that the molecule is given in molfile format, $MIREG N is the internal and $MEREG N is the external registry number of the molecule. Similarly, a reaction identifier has one of the following forms:
- $RFMT $RIREG N
- $RFMT $REREG N
- $RIREG N
- $REREG N
Here $RFMT means that the reaction is given in rxnfile format, $RIREG N is the internal and $REREG N is the external registry number of the reaction.
A reaction agent is a molecule that does not take part in the chemical reaction but is added to the reaction equation for informational purposes only. Agents are normally displayed above the reaction arrow and are written to the reaction file after the reactants and the products. If the number of agents is non-zero, it appears in the file header after the number of reactants and the number of products. Reaction files containing agents are non-standard.
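The header counts described above can be read with a short Python sketch. It assumes the counts line uses three-character, right-aligned fields for the number of reactants and products, and that the non-standard agent count, when present, follows as a third field of the same width. That third-field layout is an assumption drawn from the description above, and parse_rxn_counts is a hypothetical helper name.

```python
def parse_rxn_counts(line):
    """Return (reactants, products, agents) from an rxn counts line.

    The agent field is optional; a missing or blank field means no agents.
    """
    def field(start):
        text = line[start:start + 3].strip()
        return int(text) if text else 0

    if not line[0:3].strip() or not line[3:6].strip():
        raise ValueError(f"counts line needs reactant and product counts: {line!r}")
    return field(0), field(3), field(6)


if __name__ == "__main__":
    print(parse_rxn_counts("  2  1"))     # standard file, no agents
    print(parse_rxn_counts("  2  1  1"))  # non-standard file with one agent
```

A reader that only understands standard files would stop after the second field, which is why files with agents are flagged as non-standard.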
Extended reaction files (V3000)
This format is used automatically if a reaction includes Rgroups and/or the number of atoms or bonds exceeds 999. An extended reaction file consists of a REACTANT block, a PRODUCT block, (optionally) an AGENT block, and (optionally) RGROUP blocks.
Codename: rxn:V3
Extension: .rxn
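The conditions under which the extended (V3000) formats are chosen, as stated in this document, can be summarized in a small Python sketch. The function needs_extended_format and its parameters are hypothetical and do not correspond to any Marvin API; the sketch only restates the documented rules and makes no claim about how Marvin implements the choice.

```python
MAX_V2000_COUNT = 999  # largest atom or bond count mentioned for V2000


def needs_extended_format(n_atoms, n_bonds, is_reaction=False,
                          has_rgroups=False, has_enhanced_stereo=False):
    """Return True if the documented rules call for the V3000 format."""
    too_large = n_atoms > MAX_V2000_COUNT or n_bonds > MAX_V2000_COUNT
    reaction_with_rgroups = is_reaction and has_rgroups
    return too_large or reaction_with_rgroups or has_enhanced_stereo


if __name__ == "__main__":
    print(needs_extended_format(1200, 1300))
    print(needs_extended_format(12, 12, is_reaction=True, has_rgroups=True))
    print(needs_extended_format(12, 12))
```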
SD Files
In SDfiles read by Marvin, the name field is special: it overrides the molecule name specified in the molfile part.
- Incompatibility note: the MDL definition limits the line length of molecule properties to 200 characters. Marvin ignores this limit.
Codename: sdf
Extension: .sdf
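The following Python sketch shows one way to read SDfile records and to apply the name override described above. It assumes the usual SDfile layout: a molfile part ending with "M  END", data items introduced by a header line such as ">  <TAG>" and closed by a blank line, and "$$$$" as the record delimiter. The tag NAME used for the name field is an assumption made for illustration, and all function names are hypothetical. Like Marvin, the sketch applies no 200-character limit to property lines.

```python
import re


def split_sd_records(text):
    """Split SDfile text into records (lists of lines) at '$$$$' delimiters."""
    records, current = [], []
    for line in text.splitlines():
        if line.strip() == "$$$$":
            records.append(current)
            current = []
        else:
            current.append(line)
    return records  # lines after the last delimiter are ignored


def parse_sd_record(lines):
    """Return (molfile_lines, fields) for one record."""
    end = next((i for i, line in enumerate(lines) if line.rstrip() == "M  END"),
               len(lines) - 1)
    fields, tag = {}, None
    for line in lines[end + 1:]:
        if line.startswith(">"):
            match = re.search(r"<([^>]+)>", line)
            tag = match.group(1) if match else None
            if tag is not None:
                fields[tag] = []
        elif not line.strip():
            tag = None  # a blank line closes the current data item
        elif tag is not None:
            fields[tag].append(line)  # no line-length limit is enforced
    return lines[:end + 1], {key: "\n".join(value) for key, value in fields.items()}


def effective_name(molfile_lines, fields, name_tag="NAME"):
    """Prefer the name field (assumed tag: NAME) over the molfile header name."""
    header_name = molfile_lines[0].strip() if molfile_lines else ""
    return fields.get(name_tag) or header_name


if __name__ == "__main__":
    sample = (
        "ethanol\n  prog\n\n"
        "  0  0  0  0  0  0  0  0  0  0999 V2000\n"
        "M  END\n>  <NAME>\nEtOH\n\n>  <MW>\n46.07\n\n$$$$\n"
    )
    for record in split_sd_records(sample):
        molfile_lines, fields = parse_sd_record(record)
        print(effective_name(molfile_lines, fields), fields)
```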
RG Files
A special feature of Marvin RGfiles is that they can contain a reaction as the root structure. This feature is non-standard; such mixed RG/Rxnfiles can only be imported by Marvin.
Codename: rgf
Extension: .rgf