Markush DARC format - VMN

    {warning} VMN import was discontinued in version 14.7.7.0 and reintroduced in version 19.27 as a beta version. VMN export was introduced for the first time in version 19.27, also as a beta version.

    VMN format

    VMN files describe Markush structures in a Markush DARC compatible format.

    When to use VMN format?

    VMN support in Chemaxon products is designed for users who want to work with Chemaxon tools but still need an interface to tools that only accept Markush DARC compatible formats.

    Chemaxon software supports a wider set of Markush and variable structure features than VMN can represent, so the VMN format is recommended only when there is no other option.

    Import from VMN format

    VMN is a binary format, and a VMN file may have a human-readable AMN companion file. An AMN file is processed automatically if it is in the same folder as the corresponding VMN file and has the same file name (apart from the extension).

    For example, if a something.vmn file and a something.amn file are in the same folder, then something.amn is processed automatically during the import of something.vmn.
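
    The naming rule above can be sketched in a few lines of Java. This is only an illustration of the lookup convention, not the importer's actual code: the class and method names are hypothetical, and the lowercase .amn extension is an assumption taken from the example.

```java
import java.nio.file.Path;

// Hypothetical helper: illustrates the companion-file naming rule only.
final class AmnCompanion {
    private AmnCompanion() {}

    // Returns the path where an AMN companion would be expected for the
    // given VMN file: same folder, same base name, ".amn" extension
    // (the lowercase extension is an assumption).
    static Path companionFor(Path vmnFile) {
        String name = vmnFile.getFileName().toString();
        int dot = name.lastIndexOf('.');
        // Strip the extension if there is one; keep dot-files intact.
        String base = (dot > 0) ? name.substring(0, dot) : name;
        return vmnFile.resolveSibling(base + ".amn");
    }
}
```

    The helper only computes the expected path; whether the companion file actually exists still has to be checked separately, for example with java.nio.file.Files.exists.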

    Export to VMN format

    Export is similar to import. If the Markush structure contains information that has to be stored in an AMN file, the export process automatically generates an AMN file next to the generated VMN file.

    Code : vmn

    Extension : .vmn

    Interpretation of VMN features

    • Groups : G0 is read in as the scaffold while G1, G2, ... are stored in corresponding R-groups R1, R2, ... The representation of attachments is described below.

    • Undefined attachment information is ignored.

    • Moieties on the scaffold are represented as repeating units with repetition ranges with no crossing bonds.

    • Atom attributes : we interpret the following VMN atom attributes:

      VMN attribute name      Chemaxon terminology
      AM - Abnormal mass      isotope
      AV - Abnormal valence   valence

    • Homology atom attributes : we store the following VMN homology atom attributes in Marvin atom properties:

      VMN attribute name              Marvin atom property name   Property values
      DT - Deuterium-Tritium counts   DTCOUNT                     D[deuterium count]T[tritium count] (e.g. D3T2)
      CR - Carbon ring attributes     BRANCHING                   BRA, STR
                                      SIZE                        LO, MID, HI, LO MID, MID HI, LO HI
                                      SATURATION                  SAT, UNS
                                      RINGTYPE                    MON, FU
      data in AMN                     TEXTNOTES                   AMN text referring to the atom (e.g. N0-4,S0-4)
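
    As an illustration of the DTCOUNT value format, the following minimal Java sketch splits a value such as D3T2 into its two counts. The class name is hypothetical, and the sketch assumes that both parts are always present in the order D, then T; the format description above does not say whether shorter forms occur.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: parses a DTCOUNT property value such as "D3T2".
final class DtCount {
    // Assumption: both counts are always present, as D<count>T<count>.
    private static final Pattern FORMAT = Pattern.compile("D(\\d+)T(\\d+)");

    final int deuterium;
    final int tritium;

    private DtCount(int deuterium, int tritium) {
        this.deuterium = deuterium;
        this.tritium = tritium;
    }

    // Throws IllegalArgumentException if the value does not match the assumed format.
    static DtCount parse(String value) {
        Matcher m = FORMAT.matcher(value);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a DTCOUNT value: " + value);
        }
        return new DtCount(Integer.parseInt(m.group(1)), Integer.parseInt(m.group(2)));
    }
}
```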

    Structure shortcuts (abbreviated groups)

    The following structure shortcuts (abbreviated groups) are supported:

    C2, C3, ..., C50
    ACE BU CN CO1 CO2
    COI ET IBU IPR MBE NBU
    NO2 NPR OBE PBE PH PO3
    PO4 SBU SO2 SO3 TBU

    Amino acids (peptides)

    The following standard amino acids (peptide abbreviated groups) are supported:

    ALA ARG ASN ASP CYS GLN
    GLU GLY HIS ILE LEU LYS
    MET PHE PRO SER THR TRY
    TYR VAL

    The following non-standard peptides are also supported:

    ABU aminobutyric acid
    ASU aminosuberic acid
    GLP pyroglutamic acid
    HCY homocysteine
    HSE homoserine
    NLE norleucine
    NVA norvaline
    ORN ornithine
    SAR sarcosine
    STA statine

    Note that VMN-defined peptide connection bonds are not currently handled; they are interpreted as single covalent bonds.

    For more information on peptide representation, refer to the Sequences - peptide, DNA, RNA documentation.

    Superatoms (homology pseudo atoms)

    Superatoms representing homology groups are read in as pseudo atoms. The following homologies are interpreted by enumeration and search:

    CHK CHE CHY CYC ARY HET
    HEA HEF UNK MX AMX A35
    TRM LAN ACT HAL ACY PRT
    XX

    Multiple R-group attachments

    [Figure: multiple R-group attachments (vmn_1.png)]

    Markush Compound Number

    VMN files contain a 12-byte segment in the header that holds the Markush Compound Number. It is an alphanumeric string with the following restrictions:

    • Maximum 12 characters

    • Can contain only capital letters, numbers and dashes

    • Some software may impose additional restrictions

    The Markush Compound Number is read as the title of the scaffold molecule, and the same value is written when exporting to VMN. If this value is not present, the exporter generates a Markush Compound Number.

    The automatically generated Markush Compound Number has the format MMYY-mmmmm, where MM is the current month, YY is the two-digit current year, and mmmmm is the last 5 digits of the current epoch time in milliseconds.
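
    The MMYY-mmmmm pattern can be reproduced with the short Java sketch below. It is an illustration of the format, not the exporter's code: the class name is hypothetical, and both the time zone used for the month and year and the zero-padding of the millisecond part are assumptions.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.util.Locale;

// Hypothetical helper: builds a string in the MMYY-mmmmm pattern.
final class McnSketch {
    private McnSketch() {}

    static String generate(Instant now, ZoneId zone) {
        // Month and year are taken in the given zone (an assumption).
        ZonedDateTime local = now.atZone(zone);
        // Last 5 decimal digits of the epoch time in milliseconds.
        long lastFive = Math.floorMod(now.toEpochMilli(), 100_000L);
        return String.format(Locale.ROOT, "%02d%02d-%05d",
                local.getMonthValue(), local.getYear() % 100, lastFive);
    }
}
```

    A value built this way is 10 characters long and contains only digits and a dash, so it stays within the restrictions listed for the Markush Compound Number.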

    This value can be set through the API: use the setName method on the scaffold molecule. For details on how to access the scaffold molecule, see the Markush representation documentation; in the API, the scaffold molecule is called the root.

    File Segment

    VMN files contain a 32-byte segment in the header that holds a value called the File Segment. It is an alphanumeric string with the following restrictions:

    • Maximum 32 characters

    • Can contain only capital letters, numbers and dashes

    • Some software may impose additional restrictions
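
    The documented restrictions on the two header fields (Markush Compound Number and File Segment) can be checked before export with a small validator such as the Java sketch below. The class, constants and method are hypothetical; the sketch assumes that "capital letters" means ASCII A-Z and treats an empty value as invalid. It cannot cover the additional restrictions that other software may apply.

```java
import java.util.regex.Pattern;

// Hypothetical helper: checks the documented restrictions on VMN header fields.
final class VmnHeaderFields {
    static final int MCN_MAX_LENGTH = 12;          // Markush Compound Number
    static final int FILE_SEGMENT_MAX_LENGTH = 32; // File Segment

    // Capital letters (assumed ASCII), digits and dashes only.
    private static final Pattern ALLOWED = Pattern.compile("[A-Z0-9-]+");

    private VmnHeaderFields() {}

    static boolean isValid(String value, int maxLength) {
        return value != null
                && value.length() <= maxLength
                && ALLOWED.matcher(value).matches();
    }
}
```

    For example, isValid(value, VmnHeaderFields.MCN_MAX_LENGTH) checks a Markush Compound Number, and isValid(value, VmnHeaderFields.FILE_SEGMENT_MAX_LENGTH) checks a File Segment value.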

    The File Segment value is read as a molecule property of the scaffold molecule and this is the value exported to VMN as well. The property key is "FileSegment".

    This value can be set through the API: use the properties method on the scaffold molecule. For details on how to access the scaffold molecule, see the Markush representation documentation; in the API, the scaffold molecule is called the root.

    This value can also be edited in the UI; you can find the guide here. To select the proper component, make sure nothing is selected, right-click an empty spot on the canvas, and follow the guide from there. Another method is to double-click an atom of the scaffold, which should select the entire scaffold. If your scaffold has multiple fragments, repeat this selection for each fragment while holding down Shift.

    Limitations

    Not supported variable structure features:
    • Position Variation Bonds

    • Link Nodes

    • Alias for R-groups

    Not supported substructure-group features:
    • Polymer related groups (Monomer, Polymer, Copolymer, Graft, Crosslink, etc.)

    • Non-exact repetition

    • Mixtures

    VMN specification related limitations:
    • Atoms with 104 or higher atomic number are not supported

    • Maximum recommended R-group definition count is 50
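
    The two VMN specification limits above lend themselves to a simple pre-export check. The Java sketch below works on plain numbers rather than on Chemaxon molecule objects; the class, constant and method names are hypothetical, and how the atomic numbers and the R-group definition count are collected from a structure is left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: reports violations of the VMN specification limits.
final class VmnLimits {
    static final int MAX_ATOMIC_NUMBER = 103;      // 104 and above are not supported
    static final int RECOMMENDED_MAX_RGROUPS = 50; // recommended, not a hard limit

    private VmnLimits() {}

    // Returns one message per problem found; an empty list means no issue.
    static List<String> check(int[] atomicNumbers, int rgroupDefinitionCount) {
        List<String> problems = new ArrayList<>();
        for (int z : atomicNumbers) {
            if (z > MAX_ATOMIC_NUMBER) {
                problems.add("Unsupported atomic number: " + z);
            }
        }
        if (rgroupDefinitionCount > RECOMMENDED_MAX_RGROUPS) {
            problems.add("R-group definition count " + rgroupDefinitionCount
                    + " exceeds the recommended maximum of " + RECOMMENDED_MAX_RGROUPS);
        }
        return problems;
    }
}
```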

    External references used during implementation

    1. Derwent World Patents Index, Markush DARC User Manual, The Thomson Corporation, 1993, 2008