Structure Checker Command Line Application
Structure Checker is a chemical validation tool that detects and fixes common structural errors and special features that can be potential sources of problems. structurecheck is the command-line tool of Structure Checker.
Usage
The command line parameter -c or --config is mandatory. It specifies either the path of a configuration file or a simple action string (see Creating a Configuration for more details).
Parameter -m or --mode specifies the operation mode. The following operation modes are available:
- check (default): searches for errors;
- fix: searches for errors and fixes those that can be fixed automatically.
Note: When a molecule import/export error occurs, the program continues to run. The error is written to the console, and the molecule is discarded from the results (i.e., the resulting output file contains fewer molecules than the input file).
Note: The syntax of commands can be different under various command line shells (bash, tcsh, zsh, etc.).
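As a rough illustration of the two operation modes, the sketch below runs the same action string first in check mode and then in fix mode. The file name in.sdf is a placeholder, and the sc helper with its RUN variable is a hypothetical wrapper added here so that the commands are only printed unless RUN=1 is set; it is not part of structurecheck.

```shell
#!/bin/sh
# Sketch: the same configuration in check mode and in fix mode.
# Assumptions: structurecheck is on PATH; in.sdf is a placeholder input file.

# Hypothetical helper: prints the command unless RUN=1 is set.
sc() {
    if [ "${RUN:-0}" = "1" ]; then
        structurecheck "$@"
    else
        echo "structurecheck $*"
    fi
}

# check is the default mode, so "-m check" could be left out here.
sc -c "aromaticity..valence" -m check in.sdf

# fix mode also repairs the errors that can be fixed automatically.
sc -c "aromaticity..valence" -m fix in.sdf
```

Depending on the shell, the quoting of the action string may need to be adjusted.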
General options
Input
structurecheck accepts most molecular file formats as input (Marvin Documents (MRV), MDL molfile, SDfile, RXNfile, RDfile, SMILES, etc.). The input can be specified as:
- input file(s),
- input string(s), or
- SMILES (default).
Note: If neither the input file nor the input string is specified, the standard input (console) will be read.
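Reading from the standard input can be sketched as follows. The SMILES string is an arbitrary illustrative molecule (benzene), and using aromaticity on its own as the action string is an assumption based on the combined form that appears elsewhere in this document. The script only reports what it would do when structurecheck is not installed.

```shell
#!/bin/sh
# Sketch: pass a molecule to structurecheck through standard input.
# Assumption: structurecheck is on PATH; otherwise only a message is printed.
smiles='c1ccccc1'   # benzene, used purely as an example input

if command -v structurecheck >/dev/null 2>&1; then
    # No input file or input string is given, so standard input is read.
    printf '%s\n' "$smiles" | structurecheck -c "aromaticity"
else
    echo "structurecheck not found on PATH; would send: $smiles"
fi
```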
Output
structurecheck's output contains the file(s) of the checked/fixed molecules and optionally a report of the results. The molecules are written to the output file(s). The format of the output file(s) can be specified by the -f or --format option (default format is: "smiles"). The type of output is defined by the -t or --output-type parameter. The possible values of the output type are the following:
- single (default): all molecules are written to the file defined by the --output parameter. If the --output parameter is omitted, the result is written to the standard output (console). (The --discarded parameter is ignored in this case.)
- separated: valid and invalid molecules are written to two different files. The --output parameter defines the output file of molecules with valid structures, and the --discarded parameter defines the output file of molecules with invalid structures (or, in fix mode, those which cannot be fixed automatically).
  - If the --discarded parameter is omitted, molecules with invalid structures are written to the standard output;
  - If the --output parameter is omitted, molecules with valid structures are written to the standard output.

  Note: At least one of the --output and --discarded parameters must be specified. If neither is defined, the program stops.
- accepted: only molecules with valid structures are written to the file defined by the --output parameter. If the --output parameter is omitted, molecules with valid structures are written to the standard output. (The --discarded parameter is ignored in this case.)
- discarded: only molecules with invalid structures are written to the file defined by the --discarded parameter. If the --discarded parameter is omitted, molecules with invalid structures are written to the standard output. (The --output parameter is ignored in this case.)
The report of structure checking can be written either to a separate file, defined by the --report-file parameter, or to the output file(s) as an additional molecule property. The name of the property can be defined by the --report-property parameter.
Note: Not all molecules with structure errors are discarded. When fix mode is selected, only molecules with errors that cannot be fixed automatically are discarded.
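The output options above can be combined roughly as in the following sketch, which separates valid and invalid molecules into two SDF files and writes the report to its own file. All file names (config.xml, in.sdf, valid.sdf, invalid.sdf, report.txt) are placeholders chosen for illustration, and the script only prints the command when structurecheck is not available.

```shell
#!/bin/sh
# Sketch: separated output in fix mode, with a report file.
# Assumptions: all file names are placeholders; structurecheck is on PATH.

args="-c config.xml -m fix -t separated -f sdf"
args="$args -o valid.sdf"        # molecules with valid (or fixed) structures
args="$args -d invalid.sdf"      # molecules that cannot be fixed automatically
args="$args --report-file report.txt in.sdf"

if command -v structurecheck >/dev/null 2>&1; then
    # $args is left unquoted on purpose so it splits into separate arguments.
    structurecheck $args
else
    echo "would run: structurecheck $args"
fi
```

Setting -f explicitly matters here: without it, both output files would be written in the default SMILES format despite their .sdf names.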
Usage examples
Below you can find short descriptions of some examples. If you want to check, fix, or filter structures in evaluate or JChem Cartridge, find examples here.
- Executes a check with configuration metallocene on the molecule(s) defined in the standard input, and writes the result to the standard output (console);

- ```no-highlight
  structurecheck -c "bondLength" in.sdf
  ```

  Executes a check with configuration bondLength on the molecule(s) defined in the in.sdf file, and writes the result to the standard output (console);

- ```no-highlight
  structurecheck -c "isotope->converttoelementalform" in.sdf
  ```

  Executes a check with configuration isotope->converttoelementalform on the molecule(s) defined in the in.sdf file, and writes the result to the standard output (console);

- ```no-highlight
  structurecheck -c "aromaticity..valence" -m fix -f sdf -o out.sdf in.sdf
  ```

  Executes a fix with configuration aromaticity and valence on the molecule(s) defined in the in.sdf file, and writes the molecules with valid structures (including automatically fixed molecules) in sdf format to the out.sdf output file;

- ```no-highlight
  structurecheck -c config.xml -t separated -o out.sdf -d discarded.sdf
  ```

  Executes a check with the configuration contained in config.xml, writes the molecules with valid structures to out.sdf, and writes the molecules with invalid structures to discarded.sdf.

  Note: The format of both outputs is SMILES (!), because --format (-f) is not specified;

- ```no-highlight
  structurecheck -c config.xml -m fix -t separated -d discarded.sdf
  ```

  Executes a fix with the configuration contained in config.xml, writes the molecules with invalid structures to discarded.sdf, and writes the molecules with valid structures to the standard output (console);

- ```no-highlight
  structurecheck -c config.xml -m fix -t discarded -d discarded.sdf in.sdf
  ```

  Executes a fix with the configuration contained in config.xml on the molecule(s) defined in the in.sdf file, writes the molecules with invalid structures to discarded.sdf, and omits the molecules with valid structures.