Standardizer Command-line Application

    The Standardizer command-line tool provides access to all Standardizer functionality.

    This page covers the general usage schema, the available options, the Standardizer actions, input, output, and usage examples.

    General usage schema

    You can standardize molecules with Standardizer using the following command:

    standardize [input file(s)/string(s)] -c <config file/string> [options]

    Before first use, set up the standardize script or batch file as described in the product documentation.
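
    The usage schema above can also be assembled programmatically, for example when the tool is driven from a script. The following is a minimal sketch, not part of Standardizer: the function name build_standardize_argv and its parameters are hypothetical, and only the -c, -f and -o options documented on this page are used.

```python
from typing import List, Optional, Sequence


def build_standardize_argv(
    inputs: Sequence[str],
    config: str,
    output: Optional[str] = None,
    fmt: Optional[str] = None,
    executable: str = "standardize",
) -> List[str]:
    """Assemble an argument list that follows the usage schema.

    `inputs` may hold file names or molecule strings; an empty sequence
    means the tool will read standard input.
    """
    if not config:
        # The --config parameter is mandatory according to this page.
        raise ValueError("-c/--config is mandatory")
    argv = [executable, *inputs, "-c", config]
    if fmt is not None:
        argv += ["-f", fmt]
    if output is not None:
        argv += ["-o", output]
    return argv


argv = build_standardize_argv(
    ["nci10000.smiles"], "Standardizer.xml", output="nci10000.sdf", fmt="sdf"
)
print(" ".join(argv))
```

    The resulting list can be passed to subprocess.run(argv, check=True), provided the standardize script is on the PATH.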

    Options

    The following command displays the available options:

     standardize -h
    
    General options:
     -h, --help                          this help message
     -ha, --help-actions                 list of available actions
     -g, --ignore-error                  continue with next molecule on error
     --empty-mol-on-error                write an empty molecule, and continue
                                         with next molecule on error
     --unstandardized-mol-on-error       write the original molecule, and continue
                                         with next molecule on error

     Input options:
     -c, --config <filepath|string>      configuration XML file
                                         or action string,
                                         actions separated by ".."

     Output options:
     -f, --format <format>               output file format (default: smiles)
     -o, --output <filepath>             output file (default: standard output)
     -e, --export-fields-to-smiles       export property fields to SMILES
     -v, --verbose                       verbose output with time results
     --log <filepath>                    write log messages to file
                                         default: write log to system error
     --loglevel <level>                  sets the log level
                                         levels: [severe|warning|info|off]
     -rp, --report-property <prop-name>  name of the property to write the report to

    The --config command-line parameter is mandatory. It specifies either the path of a configuration XML file or an action string. We highly recommend creating the configuration file with the Standardizer GUI or Standardizer Editor.
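
    When an action string is used instead of a configuration file, the individual actions are joined with ".." as stated in the help text. A minimal sketch of this joining step follows; the helper name make_action_string is hypothetical, and the rule that a single action must not itself contain ".." is an assumption made here to keep the result unambiguous.

```python
from typing import Iterable

ACTION_SEPARATOR = ".."  # separator named in the --config help text


def make_action_string(actions: Iterable[str]) -> str:
    """Join individual actions into one action string for -c/--config."""
    parts = [a.strip() for a in actions]
    if not parts or any(not p for p in parts):
        raise ValueError("at least one non-empty action is required")
    if any(ACTION_SEPARATOR in p for p in parts):
        # Assumption: an embedded separator would split the action in two.
        raise ValueError("an action must not itself contain '..'")
    return ACTION_SEPARATOR.join(parts)


config = make_action_string(["aromatize", "[O-:2][N+:1]=O>>[O:2]=[N:1]=O"])
print(config)  # aromatize..[O-:2][N+:1]=O>>[O:2]=[N:1]=O
```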

    By default, the program exits on molecule import/export errors. If the command-line parameter -g or --ignore-error is specified, errors do not stop the process: the error is written to the console and the molecule is omitted from the output, so the resulting file contains fewer molecules than the input file. With the option --empty-mol-on-error, the failed structure is replaced by an empty molecule. With the option --unstandardized-mol-on-error, the molecule is written in its original form. Both of these settings produce a file containing the same number of structures as the input file.
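
    The error-handling modes described above differ in how many structures end up in the output. The sketch below summarizes that behavior as a lookup; the policy names ("abort", "skip", "empty", "original") and both function names are hypothetical labels introduced for illustration, while the flags themselves are the ones listed on this page.

```python
from typing import List, Optional

# Command-line flag for each policy; None stands for the default behavior.
ERROR_FLAGS = {
    "abort": None,                               # default: exit on error
    "skip": "--ignore-error",                    # drop the failed molecule
    "empty": "--empty-mol-on-error",             # write an empty molecule
    "original": "--unstandardized-mol-on-error", # write the input unchanged
}


def error_flag_args(policy: str) -> List[str]:
    """Return the extra command-line arguments for a policy."""
    if policy not in ERROR_FLAGS:
        raise ValueError(f"unknown policy: {policy!r}")
    flag = ERROR_FLAGS[policy]
    return [] if flag is None else [flag]


def expected_output_count(policy: str, n_input: int, n_failed: int) -> Optional[int]:
    """Structures expected in the output; None if the run stops early."""
    error_flag_args(policy)  # validates the policy name
    if policy == "abort":
        return None if n_failed else n_input
    if policy == "skip":
        return n_input - n_failed
    return n_input  # "empty" and "original" keep one record per input


print(error_flag_args("skip"), expected_output_count("skip", 100, 3))
```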

    Standardizer Actions

    The following command lists the available Standardizer actions:

     standardize -ha 

    Valid action strings are:

     addexplicitH                    convert implicit hydrogens to explicit
     aliastoatom                     convert pseudo and alias atoms to normal atoms
     aliastogroup                    convert pseudo and alias atoms to groups
     aromatize                       perform aromatization (default: general)
       :general                      perform general aromatization
       :basic                        perform basic aromatization
       :loose                        perform loose aromatization
     clean2d                         perform partial clean in 2D (alternative: clean)
       :full                         perform full clean in 2D
       :removezcoordinate            remove z coordinate of 3D structures
       :<template file>              perform template based clean in 2D
     clean3d                         perform clean in 3D
     clearisotopes                   convert isotope to the given chemical element
     clearstereo                     clear stereo information
                                     (default: chirality, doublebond)
       :chirality                    clear tetrahedral stereo information
       :doublebond                   clear double bond stereo information
       :singleupordownbond           clear "Single Up or Down" bonds (wiggly bonds)
                                     connecting to chiral centers
     contractsgroups                 contract S-groups
       :exclude='<group>'            specified group(s) remain expanded
     convertdoublebonds              convert unspecified double bond stereo
                                     (default: crossed)
       :crossed                      convert unspecified double bond stereo
                                     to crossed representation
       :wiggly                       convert unspecified double bond stereo
                                     to wiggly representation
     convertpimetalbonds             convert metal-multicenter single bonds
                                     to coordinate type
     converttoenhancedstereo         convert to enhanced stereo (default: and)
       :abs                          convert to enhanced stereo; stereogenic
                                     centers without enhanced stereo flag go into a
                                     new "abs" group
       :and                          convert to enhanced stereo; stereogenic
                                     centers without enhanced stereo flag go into a
                                     new "and" group
       :or                           convert to enhanced stereo; stereogenic
                                     centers without enhanced stereo flag go into a
                                     new "or" group
     creategroup                     create group from abbreviated group definition
       :group=<group>                specify group, e.g. creategroup:Val,Ser,Boc
     dearomatize                     convert aromatic bonds to Kekule form
     disconnectmetalatoms            remove covalent bonds between
                                     metals and non-carbons
     expand multiple fragments according to stoichiometry data
     expandsgroups                   expand S-groups
       :exclude='<group>'            specified group(s) remain contracted
     map                             add atom maps
                                     (default: complete, keepmapping)
       :complete                     map all atoms
       :matching                     map matching atoms of a reaction
       :changing                     map changing atoms of a reaction
       :keepmapping                  keep existing mapping
       :markbonds                    mark changing bonds
     mapreaction                     add atom maps to reaction
                                     (default: complete, keepmapping)
       :complete                     map all atoms
       :matching                     map matching atoms of a reaction
       :changing                     map changing atoms of a reaction
       :keepmapping                  keep existing mapping
       :markbonds                    mark changing bonds
     mesomerize                      take canonical mesomer form
     neutralize                      neutralize charged molecules
     removeabsolutestereo            remove absolute stereo flag (chiral flag)
     removeatomvalues                remove atom values (extra atom labels)
     removeattacheddata              remove attached data from molecules
     removeexplicitH                 convert explicit hydrogens to implicit ones
                                     by default, plain hydrogen atoms are removed
       :bridgehead                   connecting to bridgehead atom
       :charged                      charged explicit hydrogens
       :hconnected                   hydrogen connected to hydrogen atom
       :isotopic                     isotopic explicit hydrogens
       :lonely                       hydrogens without any connections
       :mapped                       mapped explicit hydrogens
       :polymerendgroup              hydrogen connected to a SRU S-group
       :radical                      radical explicit hydrogens
       :sgroup                       hydrogen which is the only atom in an S-group
       :sgroupend                    hydrogen connected to a Superatom S-group
       :valenceerror                 hydrogen connected to an atom which has
                                     valence error
       :wedged                       wedged explicit hydrogens
     removefragment                  remove fragments from molecules
                                     (default: method=keeplargest,
                                     measure=atomcount)
       :method=keeplargest           keep the largest fragment
       :method=keepsmallest          keep the smallest fragment
       :method=removelargest         remove the largest fragment
       :method=removesmallest        remove the smallest fragment
       :measure=atomcount            remove fragments according to atom count
       :measure=molmass              remove fragments according to molecular mass
       :measure=heavyatomcount       remove fragments according to
                                     heavy atom count
     removergroupdefinitions         remove R-group definitions; keep root
                                     structure
     removestereocarebox             remove stereo search markers from double bonds
     replaceatoms                    replace a chemical element with another
                                     element
       :queryatom='<atom symbol>'    atom to be replaced
       :replaceatom='<atom symbol>'  new atom to be included instead
       :allowValenceError            tolerance of valence errors
     setabsolutestereo               set absolute stereo flag (chiral flag)
     stripsalts                      remove salts listed in the salt dictionary
     tautomerize                     take canonical tautomer form
     ungroupsgroups                  ungroup S-groups
       :exclude='<group>'            specified group(s) remain grouped
     unmap                           remove atom maps
     wedgeclean                      rearranges stereo wedges according to
                                     IUPAC recommendations
     <reaction SMARTS>               transform molecule according to reaction
                                     SMARTS, e.g. nitro transformation
                                     [O-][N+]=O>>O=N=O
    

    See Standardizer Actions for details on the functionality of each action.
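
    An action string can be checked against the action names listed above before it is passed to the tool. The following sketch is an illustration only, under stated assumptions: the function name split_action_string is hypothetical; an option is assumed to follow the action name after the first colon (as in creategroup:Val,Ser,Boc); and any segment containing ">>" is treated as a reaction SMARTS, because atom maps such as [O-:2] also contain colons. Standardizer's own parser may apply different rules.

```python
from typing import List, Optional, Tuple

# Action names taken from the listing on this page.
KNOWN_ACTIONS = {
    "addexplicitH", "aliastoatom", "aliastogroup", "aromatize", "clean2d",
    "clean", "clean3d", "clearisotopes", "clearstereo", "contractsgroups",
    "convertdoublebonds", "convertpimetalbonds", "converttoenhancedstereo",
    "creategroup", "dearomatize", "disconnectmetalatoms", "expandsgroups",
    "map", "mapreaction", "mesomerize", "neutralize", "removeabsolutestereo",
    "removeatomvalues", "removeattacheddata", "removeexplicitH",
    "removefragment", "removergroupdefinitions", "removestereocarebox",
    "replaceatoms", "setabsolutestereo", "stripsalts", "tautomerize",
    "ungroupsgroups", "unmap", "wedgeclean",
}


def split_action_string(config: str) -> List[Tuple[str, Optional[str]]]:
    """Split an action string into (action name, option) pairs."""
    result: List[Tuple[str, Optional[str]]] = []
    for segment in config.split(".."):
        if ">>" in segment:
            # Assumption: a segment with ">>" is a reaction SMARTS.
            result.append(("<reaction SMARTS>", segment))
            continue
        name, _, option = segment.partition(":")
        if name not in KNOWN_ACTIONS:
            raise ValueError(f"unknown action: {name!r}")
        result.append((name, option or None))
    return result


parsed = split_action_string("aromatize:basic..[O-:2][N+:1]=O>>[O:2]=[N:1]=O")
print(parsed)
```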

    Input

    Most molecular file formats are accepted.

    Input is given either as input file(s) or as input string(s), usually in SMILES format.

    If neither input file names nor input strings are given on the command line, the standard input is read.
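
    Because the standard input is read when no input argument is given, molecules can be piped to the tool from another program. The sketch below shows one way to do this with Python's subprocess module. The function name standardize_via_stdin is hypothetical, and feeding one molecule string per line is an assumption based on the usual layout of SMILES files.

```python
import subprocess
from typing import List, Sequence


def standardize_via_stdin(
    molecules: Sequence[str],
    config: str,
    executable: str = "standardize",
) -> List[str]:
    """Pipe molecule strings to the tool and return its output lines."""
    # Assumption: one molecule string per line, as in a SMILES file.
    payload = "".join(f"{m}\n" for m in molecules)
    completed = subprocess.run(
        [executable, "-c", config],  # no input arguments, so stdin is read
        input=payload,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return completed.stdout.splitlines()
```

    With the standardize script on the PATH, a call such as standardize_via_stdin(["[O-][N+](=O)c1ccccc1"], "aromatize") would return the lines the tool writes to the standard output, in SMILES format by default.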

    Output

    Standardizer writes output molecules in the format specified by the --format option (the default format is SMILES). If the --output option is omitted, results are written to the standard output.

    If the command-line parameter --export-fields-to-smiles is specified, the property fields (SDF fields) of the molecules are exported even if the output format is SMILES, SMARTS, ChemAxon Extended SMILES, or ChemAxon Extended SMARTS. For other formats the property fields are always exported, so this option has no effect.
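
    The rule above can be written as a small decision function. This is a sketch only: the function name is hypothetical, and the format identifiers "smarts", "cxsmiles" and "cxsmarts" are assumed spellings of the formats named in the text (only "smiles" and "sdf" appear as identifiers on this page).

```python
# Formats for which property fields are written only on request.
# The identifiers other than "smiles" are assumed spellings.
SMILES_LIKE_FORMATS = {"smiles", "smarts", "cxsmiles", "cxsmarts"}


def property_fields_exported(
    output_format: str = "smiles",
    export_fields_to_smiles: bool = False,
) -> bool:
    """Tell whether property fields appear in the output."""
    if output_format.lower() in SMILES_LIKE_FORMATS:
        # Only written when -e/--export-fields-to-smiles is given.
        return export_fields_to_smiles
    # Other formats always carry the property fields.
    return True


print(property_fields_exported("smiles"))        # False
print(property_fields_exported("smiles", True))  # True
print(property_fields_exported("sdf"))           # True
```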

    Usage examples

    1. A UNIX command that reads molecular structures from the mols.sdf file and writes the standardized molecules to the standard output in SMILES format:

       standardize -c Standardizer.xml mols.sdf

    2. A UNIX command that reads molecules given as SMILES strings from the file nci10000.smiles located in the ./test/pharmacophore directory and writes the results to the file nci10000.sdf, created in the same directory:

       standardize -c Standardizer.xml nci10000.smiles -f sdf -o nci10000.sdf

    3. A similar command for nci100.smiles with the -e (--export-fields-to-smiles) and -v (verbose output) options, followed by displaying the result in MarvinView:

       standardize -c Standardizer.xml -e -v nci100.smiles -f sdf -o nci100.sdf
       mview nci100.sdf

    4. Processing an SD file and displaying the standardized molecules using MarvinView:

       standardize -c Standardizer.xml med100.sdf | mview -

       Note that such piping does not work on Windows.

    5. Standardization with an action string:

       standardize -c "aromatize..[O-:2][N+:1]=O>>[O:2]=[N:1]=O" med100.sdf -o med100.smiles

    6. Standardization with an action string, taking input molecules as SMILES strings:

       standardize -c "aromatize..[O-:2][N+:1]=O>>[O:2]=[N:1]=O" \
       "[O-][N+](o)C1=CC=CC=C1" "[H]C1=C(C=C(C=C1)[N+](o)=O)[N+](o)=O"

    7. Processing only the tasks that belong to no group or to the task group "target":

       standardize -c Standardizer.xml -u target targets.sdf -f sdf -o output.sdf