Markush DARC format - VMN¶
VMN import was discontinued in version 14.7.7.0 and reintroduced in version 19.27 as a beta version. VMN export was introduced for the first time in version 19.27, also as a beta version.
VMN format¶
VMN files describe Markush structures in a Markush DARC compatible format.
When to use VMN format?¶
VMN support at Chemaxon is designed for those who want to use Chemaxon products but still need an interface to tools that accept only Markush DARC compatible formats.
Chemaxon software provides a wider set of Markush and variable structure features than the VMN format can represent, so VMN format is recommended only when there is no other option.
Import from VMN format¶
VMN is a binary format that may have a human-readable AMN companion file. An AMN file is processed automatically if it is in the same folder as the corresponding VMN file and has the same file name (apart from the extension).
For example, if something.vmn and something.amn are in the same folder, then something.amn is processed automatically during the import of something.vmn.
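The companion-file rule above can be sketched in a few lines of Python. This is only an illustration of the naming convention, not Chemaxon's importer: the function name `find_amn_companion` is hypothetical, and the lowercase `.amn` extension is an assumption.

```python
from pathlib import Path
from typing import Optional


def find_amn_companion(vmn_path: str) -> Optional[Path]:
    """Return the AMN file that sits next to a VMN file, or None if absent.

    Assumption: the companion uses a lowercase ".amn" extension.
    """
    vmn = Path(vmn_path)
    # Same folder, same base name, only the extension differs.
    candidate = vmn.with_suffix(".amn")
    return candidate if candidate.is_file() else None
```

Calling `find_amn_companion("data/something.vmn")` returns the path `data/something.amn` when that file exists and `None` otherwise.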
Export to VMN format¶
Export mirrors import. If the Markush structure contains information that has to be stored in an AMN file, the export process automatically generates the AMN file next to the generated VMN file.
Code : vmn
Extension : .vmn
Interpretation of VMN features¶
- Groups :
G0 is read in as the scaffold, while G1, G2, ... are stored in the corresponding R-groups R1, R2, ... The representation of attachments is described below.
- Undefined attachment information is ignored.
- Moieties on the scaffold are represented as repeating units with repetition ranges with no crossing bonds.
- Atom attributes : we interpret the following VMN atom attributes:
| VMN attribute name | Chemaxon terminology |
|---|---|
| AM - Abnormal mass | isotope |
| AV - Abnormal valence | valence |
- Homology atom attributes : we store the following VMN homology atom attributes in Marvin atom properties:
| VMN attribute name | Marvin atom property name | property values |
|---|---|---|
| DT - Deuterium-Tritium counts | DTCOUNT | D[deuterium count]T[tritium count] (e.g. D3T2) |
| CR - Carbon ring attributes | BRANCHING | BRA, STR |
| | SIZE | LO, MID, HI, LO MID, MID HI, LO HI |
| | SATURATION | SAT, UNS |
| | RINGTYPE | MON, FU |
| data in AMN | TEXTNOTES | AMN text referring to the atom (e.g. N0-4,S0-4) |
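As an illustration of the DTCOUNT value format from the table above, the following sketch splits a value such as D3T2 into its two counts. The helper name `parse_dtcount` and the `DTCount` type are hypothetical, and it is an assumption that both counts are always present as non-negative integers.

```python
import re
from typing import NamedTuple

# Assumed shape: "D<deuterium count>T<tritium count>", for example "D3T2".
_DTCOUNT = re.compile(r"D(\d+)T(\d+)")


class DTCount(NamedTuple):
    deuterium: int
    tritium: int


def parse_dtcount(value: str) -> DTCount:
    """Split a DTCOUNT property value into deuterium and tritium counts."""
    match = _DTCOUNT.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"not a DTCOUNT value: {value!r}")
    return DTCount(int(match.group(1)), int(match.group(2)))
```

For the example value in the table, `parse_dtcount("D3T2")` yields 3 deuterium and 2 tritium atoms.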
Structure shortcuts (abbreviated groups)¶
The following structure shortcuts (abbreviated groups) are supported:
| C2, C3, ..., C50 | | | | | |
|---|---|---|---|---|---|
| ACE | BU | CN | CO1 | CO2 | |
| COI | ET | IBU | IPR | MBE | NBU |
| NO2 | NPR | OBE | PBE | PH | PO3 |
| PO4 | SBU | SO2 | SO3 | TBU | |
Amino acids (peptides)¶
The following standard amino acids (peptide abbreviated groups) are supported:
| ALA | ARG | ASN | ASP | CYS | GLN |
|---|---|---|---|---|---|
| GLU | GLY | HIS | ILE | LEU | LYS |
| MET | PHE | PRO | SER | THR | TRY |
| TYR | VAL | | | | |
The following non-standard peptides are also supported:
| ABU | aminobutyric acid |
|---|---|
| ASU | aminosuberic acid |
| GLP | pyroglutamic acid |
| HCY | homocysteine |
| HSE | homoserine |
| NLE | norleucine |
| NVA | norvaline |
| ORN | ornithine |
| SAR | sarcosine |
| STA | statine |
Note that peptide connection bonds defined in VMN are not currently handled; they are interpreted as single covalent bonds.
For more information on peptide representation refer to the Sequences - peptide, DNA, RNA documentation.
Superatoms (homology pseudo atoms)¶
Superatoms representing homology groups are read in as pseudo atoms. The following homologies are interpreted by enumeration and search:
| CHK | CHE | CHY | CYC | ARY | HET |
|---|---|---|---|---|---|
| HEA | HEF | UNK | MX | AMX | A35 |
| TRM | LAN | ACT | HAL | ACY | PRT |
| XX |
Multiple R-group attachments¶
Markush Compound Number¶
VMN files contain a 12-byte segment in the header that holds the Markush Compound Number. It is an alphanumeric string with the following restrictions:
- Maximum 12 characters
- Can contain only capital letters, numbers and dashes
- Some software may impose additional restrictions
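The first two restrictions can be checked before export with a small helper. This is a minimal sketch, not part of any Chemaxon API: the function name `is_valid_header_string` is hypothetical, "capital letters" and "numbers" are assumed to mean ASCII A-Z and 0-9, and rejecting an empty string is also an assumption.

```python
import re

# Assumption: only ASCII capital letters, digits and dashes are allowed,
# and at least one character is required.
_ALLOWED = re.compile(r"[A-Z0-9-]+")


def is_valid_header_string(value: str, max_len: int = 12) -> bool:
    """Check a header string against the length and character rules."""
    return len(value) <= max_len and _ALLOWED.fullmatch(value) is not None
```

The default `max_len` of 12 matches the Markush Compound Number; passing `max_len=32` applies the same character rule to the File Segment value.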
The Markush Compound Number is read as the title of the scaffold molecule and this is the value exported to VMN as well. If this value is not present, the exporter generates a Markush Compound Number.
Format of the automatically generated Markush Compound Number: MMYY-mmmmm, where MM is the current month, YY is the current year and mmmmm is the last 5 digits of the current epoch time in milliseconds.
This value can be set through the API: use the setName method on the scaffold molecule. For further details on how to access the scaffold molecule, see the Markush representation documentation; the scaffold molecule is called root in the API.
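To make the MMYY-mmmmm pattern concrete, here is a sketch that builds a value of the same shape. It is not the exporter's actual code: the function name `generate_compound_number` is hypothetical, and it assumes a two-digit year, local time for the month and year, and zero padding of the millisecond part.

```python
import time
from datetime import datetime
from typing import Optional


def generate_compound_number(now: Optional[datetime] = None,
                             epoch_ms: Optional[int] = None) -> str:
    """Build an MMYY-mmmmm string from a date and an epoch time in ms."""
    if now is None:
        now = datetime.now()  # assumption: local time
    if epoch_ms is None:
        epoch_ms = time.time_ns() // 1_000_000
    # %m is the zero-padded month, %y the two-digit year;
    # the modulo keeps the last 5 digits of the millisecond count.
    return f"{now:%m%y}-{epoch_ms % 100_000:05d}"
```

The result is 10 characters long and uses only digits and a dash, so it fits the 12-character header segment.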
File Segment¶
VMN files contain a 32-byte segment in the header that holds data called the File Segment. It is an alphanumeric string with the following restrictions:
- Maximum 32 characters
- Can contain only capital letters, numbers and dashes
- Some software may impose additional restrictions
The File Segment value is read as a molecule property of the scaffold molecule and this is the value exported to VMN as well. The property key is "FileSegment".
This value can be set through the API: use the properties method on the scaffold molecule. For further details on how to access the scaffold molecule, see the Markush representation documentation; the scaffold molecule is called root in the API.
This value can also be set on the UI; you can find the guide here. To select the proper component, make sure nothing is selected, right-click on an empty spot on the canvas and follow the guide from there. Another method is to double-click on an atom of the scaffold, which should select the entire scaffold. If your scaffold has multiple fragments, repeat this selection for all fragments while holding down Shift.
Limitations¶
Not supported variable structure features:¶
- Position Variation Bonds
- Link Nodes
- Alias for R-groups
Not supported substructure-group features:¶
- Polymer related groups (Monomer, Polymer, Copolymer, Graft, Crosslink, etc.)
- Non-exact repetition
- Mixtures
VMN specification related limitations:¶
- Atoms with an atomic number of 104 or higher are not supported
- The maximum recommended number of R-group definitions is 50
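The two specification limits above lend themselves to a simple pre-export check. The sketch below works on plain numbers rather than on a real molecule object; the function name `check_vmn_limits` and its inputs are hypothetical and would have to be filled from whatever structure representation is in use.

```python
from typing import Iterable, List

MAX_ATOMIC_NUMBER = 103        # atoms with 104 or higher are not supported
RECOMMENDED_MAX_RGROUPS = 50   # recommended, not a hard limit


def check_vmn_limits(atomic_numbers: Iterable[int],
                     rgroup_count: int) -> List[str]:
    """Return human-readable problems; an empty list means no issue found."""
    problems = []
    unsupported = sorted({z for z in atomic_numbers if z > MAX_ATOMIC_NUMBER})
    if unsupported:
        problems.append(f"unsupported atomic numbers: {unsupported}")
    if rgroup_count > RECOMMENDED_MAX_RGROUPS:
        problems.append(
            f"{rgroup_count} R-group definitions exceed the recommended "
            f"maximum of {RECOMMENDED_MAX_RGROUPS}"
        )
    return problems
```

Returning a list of messages instead of raising lets a caller treat the R-group count as a warning while still refusing to export unsupported atoms.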
External references used during implementation¶
- Derwent World Patents Index, Markush DARC User Manual, The Thomson Corporation, 1993, 2008