Molconvert

    Molconvert is a command-line program in Marvin Suite and JChem that converts between various file types.

    Syntax

    molconvert [options] outformat[:exportoptions] [files...]

    The outformat argument is the outformat value of one of the supported formats listed below.

    Chemical Formats

    Format owner             Format name                         Format type  Outformat value
    Chemaxon                 Marvin Document (MRV)               Document     mrv
    Chemaxon                 Chemaxon Object Notation (CXON)     Document     cxon
    Chemaxon                 Chemaxon Compressed Molfile         Molecule     csmol csrxn cssdf csrdf
    Chemaxon                 Chemaxon Extended SMILES            Molecule     cxsmiles
    Chemaxon                 Chemaxon Extended SMARTS            Molecule     cxsmarts
    Chemaxon                 Chemaxon SMILES Abbreviated Groups  Molecule     abbrevgroup
    PerkinElmer Informatics  ChemDraw sketch file (CDX)          Document     cdx
    Dassault Systemes        ISIS/Draw sketch file (SKC)         Document     skc
    Dassault Systemes        CTFile formats                      Molecule     mol rgf rxn sdf rdf
    Daylight                 SMILES                              Molecule     smiles
    Daylight                 SMARTS                              Molecule     smarts
    IUPAC/InChI Trust        IUPAC InChI                         Molecule     inchi
    IUPAC/InChI Trust        IUPAC InChIKey                      Hash         inchikey
    IUPAC                    IUPAC Name                          Molecule     name
    -                        Peptide Sequence                    Molecule     peptide
    -                        CSV                                 N/A          csv

    Image Formats

    Format name                Outformat value
    Portable Network Graphics  png
    MS Bitmap                  bmp
    JPEG                       jpeg
    Enhanced Windows Metafile  emf
    Tag Image File Format      tiff
    Encapsulated PostScript    eps

    Molconvert Options

    -o file Write output to specified file instead of standard output
    -m Produce multiple output files
    -e charset Set the input character encoding. The encoding must be supported by Java.
    -e [in]..[out] Set the input (in) and/or output (out) character encodings. Examples: UTF-8, ASCII, Cp1250 (Windows Eastern European), Cp1252 (Windows Latin 1), ms932 (Windows Japanese).
    -s string Read molecule from specified SMILES, SMARTS or peptide string (try to recognize its format)
    -s string{format:options} Read molecule from the string in the specified format (can be omitted), using the specified import options (can be omitted)
    -f string Specify the import format and options
    --peptide string Read molecule from specified peptide string
    -g Continue with next molecule on error (default: exit on error)
    -Y Remove explicit H atoms
    -I <range> Process input molecules whose molecule index (1-based) falls into the specified range (e.g. 5-8,15 refers to molecules 5,6,7,8,15)
    -U Fuse input molecules and output the union
    -R <file>[:<range>] Fuse fragments to input molecule(s) from file, using the specified molecule index range. Range syntax: "-5,10-20,25,26,38-" (e.g. -R frags.mrv:20-)
    -R<i> <file>[:<range>] Fuse R definition members to input molecule(s) from file in the specified index range (e.g. -R1 rdef1.mrv:5-8,19)
    -R<i>:<1|2> <file>[:<range>] Fuse R definition members to input molecule(s) from file in the specified index range, filtering for molecules that have 1 (or 2, respectively) attachment points (e.g. -R1:2 rdef1.mrv:-3,8-10)
    -T "<f1>:<f2>:..." Export molecule properties <f1>, <f2>, ... with the result separated by tab characters. Supported in SMILES, SMARTS, CXSMILES, CXSMARTS, InChI, InChIKey (where the result is a single line). There is an option to export all properties with -T "*".
    -F Remove small fragments, keep the largest
    -c "f1 OP value&f2 OP value..." Filter by field values when importing SDF. OP may be: =, <, >, <=, >=
    --mol-fields-to-records Convert molecule type fields to separate records.
    -v Verbose
    -vv Very verbose (print stack trace at error)
    -2[:options][:F<i1>,<i2>,...,<iN>] Calculate 2D coordinates, with optional options for the coordinate calculation. If the F parameter is specified, performs a partial clean with fixed atom coordinates for atoms <i1>, <i2>, ..., <iN> (1-based indexes).
    -3[:options] Calculate 3D coordinates, with optional options for the coordinate calculation.
    -H3D Help on options for 3D calculations. For a detailed list, see Clean 3D Options.
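
    The index range syntax used by -I and -R can be illustrated with a small Python sketch. This is not Molconvert's own parser: the function names parse_range and selects are hypothetical, and the sketch assumes that an open-ended part such as -5 means "up to 5" and 38- means "from 38 onwards".

```python
def parse_range(spec):
    """Split a range string such as "-5,10-20,25,38-" into (low, high) pairs.

    None marks an open end. Assumption: "-5" means up to 5, "38-" means 38 and above.
    """
    intervals = []
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            low, high = part.split("-", 1)
            intervals.append((int(low) if low else None,
                              int(high) if high else None))
        else:
            n = int(part)
            intervals.append((n, n))
    return intervals


def selects(spec, index):
    """Return True if the 1-based molecule index falls into the range string."""
    for low, high in parse_range(spec):
        if (low is None or index >= low) and (high is None or index <= high):
            return True
    return False


# The example from the -I description: 5-8,15 refers to molecules 5,6,7,8,15.
print([i for i in range(1, 21) if selects("5-8,15", i)])
```

    Under these assumptions, "20-" as in -R frags.mrv:20- would select every molecule from index 20 to the end of the file.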

    Export options

    Format-specific export options are given as part of the format descriptor. The outformat value and the options are separated by a colon, and the options are separated from each other by commas.

    The following example creates a 100x100 pixel JPEG image on a yellow background, with 95% quality:

      molconvert jpeg:w100,Q95,#ffff00 nice.mol -o nice.jpg
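
    When Molconvert is called from a script, the format descriptor can be assembled programmatically. The Python sketch below only builds the argument list; the helper names export_descriptor and molconvert_argv are hypothetical, and nothing here is part of Marvin or JChem.

```python
def export_descriptor(outformat, options=()):
    """Join an outformat value and its export options as outformat:opt1,opt2,..."""
    options = list(options)
    if not options:
        return outformat
    return outformat + ":" + ",".join(options)


def molconvert_argv(outformat, files, export_options=(), output=None):
    """Build an argument list following: molconvert outformat[:exportoptions] [files...]."""
    argv = ["molconvert", export_descriptor(outformat, export_options)]
    argv.extend(files)
    if output is not None:
        argv.extend(["-o", output])  # -o writes to a file instead of standard output
    return argv


# Rebuilds the JPEG example: colon after the format, commas between options.
print(molconvert_argv("jpeg", ["nice.mol"], ["w100", "Q95", "#ffff00"], "nice.jpg"))
# prints ['molconvert', 'jpeg:w100,Q95,#ffff00', 'nice.mol', '-o', 'nice.jpg']
```

    A list built this way could be passed to subprocess.run, assuming molconvert is on the PATH.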

    Import options

    The format specific import options can be specified between braces, in one of the following forms:

    Form                               Description
    filename{options}
    filename{MULTISET,options}         Merge molecules into one that contains multiple atom sets
    filename{format:}                  Skip automatic format recognition
    filename{format:options}           Skip automatic format recognition
    filename{format:MULTISET,options}
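
    The forms above follow one pattern, which can be sketched in Python as follows. The helper name import_spec and its parameters are hypothetical and serve only to show how the pieces combine; the strings it produces match the forms listed above.

```python
def import_spec(filename, fmt=None, options="", multiset=False):
    """Compose a file argument of the form filename{format:MULTISET,options}.

    fmt      -- import format; giving it skips automatic format recognition
    options  -- format-specific import option string
    multiset -- add MULTISET to merge molecules into one with multiple atom sets
    """
    parts = []
    if multiset:
        parts.append("MULTISET")
    if options:
        parts.append(options)
    body = ",".join(parts)
    if fmt is not None:
        body = fmt + ":" + body  # the colon is kept even when no options follow
    if not body:
        return filename
    return filename + "{" + body + "}"


print(import_spec("foo.xyz", fmt="xyz"))          # foo.xyz{xyz:}
print(import_spec("foo.xyz", options="f1.4C4"))   # foo.xyz{f1.4C4}
print(import_spec("foo.xyz.gz", fmt="gzip:xyz", options="f1.4C4", multiset=True))
# foo.xyz.gz{gzip:xyz:MULTISET,f1.4C4}
```

    On a shell command line such an argument is usually quoted, as in "foo.xyz{xyz:}", so that the shell does not interpret the braces.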

    Format specific options

    Format name Export options Import options
    MRV >>
    CXSMARTS,CXSMILES >> >>
    CTFile formats (MOL, SDF, RXN, RDF, RGF) >> >>
    SMILES, SMARTS >> >>
    InChI,InChIKey >>
    Name >>
    Peptide >> >>

    You can also use the Basic export options for all formats.

    Examples

    1. Printing the SMILES string of a molecule in a molfile:

      molconvert smiles caffeine.mol
    2. Dearomatizing an aromatic molecule:

      molconvert smiles:-a -s "c1ccccc1"
    3. Aromatizing a molecule:

      molconvert smiles:a -s "C1=CC=CC=C1"

      (The default general aromatization is used.)

    4. Aromatizing a molecule using the basic algorithm:

      molconvert smiles:a_bas -s "CN1C=NC2=C1C(=O)N(C)C(=O)N2C"
    5. Converting a SMILES file to MDL Molfile:

      molconvert mol caffeine.smiles -o caffeine.mol
    6. Making an SDF from molfiles:

      molconvert sdf *.mol -o molecules.sdf
    7. Printing the encodings of SDfiles in the working directory:

      molconvert query-encoding *.sdf
    8. SMILES to Molfile with optimized 2D coordinate calculation, converting double bonds with unspecified cis/trans to "either":

      molconvert -2:2e mol caffeine.smiles -o caffeine.mol
    9. 2D coordinate calculation with optimization and fixed atom coordinates for atoms 1, 5, 6:

      molconvert -2:2:F1,5,6 mol caffeine.mol
    10. Import a file as XYZ without attempting automatic format recognition:

      molconvert smiles "foo.xyz{xyz:}"

      Note: This is just an example. XYZ and other formats known by Marvin are always recognized (send us a bug report otherwise), so the specification of the input format is usually not needed. It is only relevant if a user-defined import module is used.

    11. Import a file as XYZ, with a bond-length cut-off of 1.4 and a maximum of 4 connections per carbon atom, and export to SMILES:

      molconvert smiles "foo.xyz{f1.4C4}"
    12. Import a file as Gzipped XYZ, with the same import options as in the previous example:

      molconvert smiles "foo.xyz.gz{gzip:xyz:f1.4C4}"
    13. Like the previous example, but merge the molecules into one molecule that contains multiple atom sets, and export an MDL molfile:

      molconvert mol "foo.xyz.gz{gzip:xyz:MULTISET,f1.4C4}"
    14. Import an SDF and export a table containing selected molecules with columns: SMILES, ID, and logP:

      molconvert smiles -c "ID<=1000&logP>=-2&logP<=4" -T "ID:logP" foo.sdf
    15. Fuse R2 definition from file, filter fragments with 1 attachment point:

      molconvert mrv in.mrv -R2:1 rdef.mrv
    16. Fuse fragments from file (note that the input molecule to which the fragments are fused must also be specified):

      molconvert mrv in.mrv -R frags.mrv
    17. Generate all common names for a structure:

      molconvert "name:common,all" -s tylenol
    18. Generate the most popular common name for a structure (fails if none is known):

      molconvert name:common -s viagra
    19. Generate SMILES for the molecules whose names are mentioned in the file foo.html:

      molconvert smiles foo.html
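
    To make the combination of -c and -T in example 14 more concrete, the Python sketch below mimics the selection and the tab-separated output on in-memory records. It is an illustration only, not Molconvert's implementation: the names record_matches and table_row are hypothetical, the records are made-up placeholders, and the sketch assumes that & means all conditions must hold, that values are compared numerically, and that a record lacking a field does not match.

```python
import operator
import re

# Comparison operators listed for the -c option: =, <, >, <=, >=
OPS = {"=": operator.eq, "<": operator.lt, ">": operator.gt,
       "<=": operator.le, ">=": operator.ge}
CLAUSE = re.compile(r"\s*([^<>=\s]+)\s*(<=|>=|=|<|>)\s*(\S+)\s*$")


def record_matches(filter_expr, fields):
    """Check a -c style expression such as "ID<=1000&logP>=-2" against a record.

    Assumptions: "&" is a logical AND and values are compared as numbers.
    """
    for clause in filter_expr.split("&"):
        m = CLAUSE.match(clause)
        if m is None:
            raise ValueError("cannot parse condition: " + clause)
        name, op, value = m.groups()
        if name not in fields:
            return False
        if not OPS[op](float(fields[name]), float(value)):
            return False
    return True


def table_row(smiles, fields, columns):
    """Format one line as with -T "f1:f2": the SMILES, then the fields, tab-separated."""
    return "\t".join([smiles] + [str(fields[n]) for n in columns.split(":")])


# Made-up placeholder records (SMILES, SDF fields); the values are not real data.
records = [
    ("CCO", {"ID": "7", "logP": "0.5"}),
    ("CCCCCCCCCCCC", {"ID": "8", "logP": "6.0"}),
    ("C", {"ID": "2000", "logP": "1.0"}),
]

for smiles, fields in records:
    if record_matches("ID<=1000&logP>=-2&logP<=4", fields):
        print(table_row(smiles, fields, "ID:logP"))
```

    With these placeholder records only the first one passes all three conditions, so a single tab-separated line is printed.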