Codename | Default Value | Explanation |
---|---|---|
smiles | on | enable the conversion of SMILES strings |
inchi | on | enable the conversion of InChI strings |
ocr | on | enable the processing of scanned text in PDF documents |
osr | on if any external OSR tool is installed | enable the conversion of structure drawings by any available OSR external tool |
osra | on if OSRA is installed | enable the conversion of structure drawings by the OSRA external tool. Using this option will specify that OSRA should be used even if other OSR tools are available. |
clide | on if CLiDE is installed | enable the conversion of structure drawings by the CLiDE external tool. Using this option will specify that CLiDE should be used even if other OSR tools are available. |
imago | on if Imago is installed | enable the conversion of structure drawings by the Imago external tool. Using this option will specify that Imago should be used even if other OSR tools are available. |
timeout=N | no timeout | the maximum number of seconds to run, with 0 for no timeout |
osraTimeout=N | 20 seconds | configure the maximum number of seconds to run OSRA on an image |
clideTimeout=N | 20 seconds | configure the maximum number of seconds to run CLiDE on an image |
imagoTimeout=N | 20 seconds | configure the maximum number of seconds to run Imago on an image |
filterOSR | on | enable the filtering of OSR structures for incomplete recognition |
text | on | enable the processing of the textual content of the document. The text is searched for text-based formats: names, SMILES, InChI (all on by default), and CAS Registry Numbers® (off by default; see the cas option below) |
acronyms | off | enable the conversion of acronyms, such as ATP for Adenosine TriPhosphate |
vernacular | off | enable the conversion of everyday terms like "water" or "steam" |
OLE | on | enable the conversion of structures embedded in office documents |
startPage=N | no limit (starts at first page) | start processing the document at page N (can be combined with endPage to process a range of pages) |
endPage=N | no limit (stops at last page) | stop processing the document at page N |
insideTag=TAG | off | for markup formats, enable the conversion only inside the given tag (typically insideTag=body for HTML) |
contextRadius=N | 40 | maximum number of characters of context to include, on each side of the hit |
contextIndex | off | whether to include the index of the hit in the context |
Each option can be preceded by a minus sign - (for instance -smiles) to disable it. Both forms smiles and +smiles are accepted to enable an option.
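The enable/disable syntax can be illustrated with a small sketch. This is not the parser used by Document to Structure; the function name `parse_options` and the comma separator between options are assumptions made for illustration only. It shows how a bare name or a `+` prefix enables an option, a `-` prefix disables it, and `name=N` carries a value.

```python
def parse_value(raw):
    """Return an int for numeric values (e.g. timeout=30), else the raw string."""
    try:
        return int(raw)
    except ValueError:
        return raw


def parse_options(spec, separator=","):
    """Parse an option string into a dict (hypothetical helper).

    Assumption: options are separated by `separator` (a comma here).
    """
    options = {}
    for token in spec.split(separator):
        token = token.strip()
        if not token:
            continue
        if "=" in token:
            # Valued option such as osraTimeout=30 or insideTag=body
            name, value = token.split("=", 1)
            options[name.lstrip("+")] = parse_value(value)
        elif token.startswith("-"):
            # A leading minus sign disables the option
            options[token[1:]] = False
        else:
            # Both "smiles" and "+smiles" enable the option
            options[token.lstrip("+")] = True
    return options


parsed = parse_options(
    "-smiles,+acronyms,vernacular,osraTimeout=30,startPage=2,endPage=5,insideTag=body"
)
print(parsed)
```

In this sketch, options that are not mentioned are simply absent from the result, so a caller would fall back to the default values listed in the table.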
Apart from its specific options listed above, Document to Structure also accepts all Name to Structure format options, to configure which name conversions are attempted.
In some cases, however, Document to Structure uses different default values than Name to Structure:
Codename | Default value |
---|---|
ocrCorrection | on |
elements | off |
ions | off |
cas | off |
casNames | off |