
Document to Structure Format Options

Format Options Specific to Document to Structure

| Codename | Default Value | Explanation |
| --- | --- | --- |
| smiles | on | enable the conversion of SMILES strings |
| inchi | on | enable the conversion of InChI strings |
| ocr | on | enable the processing of scanned text in PDF documents |
| osr | on if any external OSR tool is installed | enable the conversion of structure drawings by any available OSR external tool |
| osra | on if OSRA is installed | enable the conversion of structure drawings by the OSRA external tool. Using this option will specify that OSRA should be used even if other OSR tools are available. |
| clide | on if CLiDE is installed | enable the conversion of structure drawings by the CLiDE external tool. Using this option will specify that CLiDE should be used even if other OSR tools are available. |
| imago | on if Imago is installed | enable the conversion of structure drawings by the Imago external tool. Using this option will specify that Imago should be used even if other OSR tools are available. |
| timeout=N | no timeout | the maximum number of seconds to run, with 0 for no timeout |
| osraTimeout=N | 20 seconds | configure the maximum number of seconds to run OSRA on an image |
| clideTimeout=N | 20 seconds | configure the maximum number of seconds to run CLiDE on an image |
| imagoTimeout=N | 20 seconds | configure the maximum number of seconds to run Imago on an image |
| filterOSR | on | enable the filtering of OSR structures for incomplete recognition |
| text | on | enable the processing of the textual content of the document. The text is searched for text-based formats: name, SMILES, and InChI (all on by default), plus CAS Registry Numbers® (off by default; see the cas option below) |
| acronyms | off | enable the conversion of acronyms, such as ATP for Adenosine TriPhosphate |
| vernacular | off | enable the conversion of everyday terms like "water" or "steam" |
| OLE | on | enable the conversion of structures embedded in office documents |
| startPage=N | no limit (starts at first page) | start processing the document at page N (can be combined with endPage to process a range of pages) |
| endPage=N | no limit (stops at last page) | stop processing the document at page N |
| insideTag=TAG | off | for markup formats, enable the conversion only inside the given tag (typically insideTag=body for HTML) |
| contextRadius=N | 40 | maximum number of characters of context to include, on each side of the hit |
| contextIndex | off | whether to include the index of the hit in the context |
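
To make the meaning of contextRadius concrete, the following is a minimal Python sketch of a context window of at most N characters on each side of a hit. It only illustrates the idea and is not the tool's implementation: the function name hit_context and the sample sentence are hypothetical, and how Document to Structure treats word boundaries or line breaks is not described here.

```python
def hit_context(text: str, start: int, end: int, radius: int = 40) -> str:
    """Return text[start:end] plus up to `radius` characters on each side.

    `radius` mirrors the documented default of contextRadius (40).
    The window is clipped at the beginning and end of the text.
    """
    left = max(0, start - radius)
    right = min(len(text), end + radius)
    return text[left:right]


# Hypothetical document text containing one hit ("ethanol").
doc = "The sample was dissolved in ethanol and stirred for two hours."
start = doc.index("ethanol")
end = start + len("ethanol")

# With a radius of 10, the hit keeps 10 characters of context on each side.
snippet = hit_context(doc, start, end, radius=10)
print(repr(snippet))
```

With the default radius of 40, a hit near the start or end of a short text simply returns everything available on that side.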

Each option can be preceded by a minus sign - (for instance -smiles) to disable it.
Both forms smiles and +smiles are accepted to enable an option.
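
The enable/disable syntax can be sketched in Python as follows. This is an illustrative reading of the rules above, not the product's parser: the function name parse_option_tokens is hypothetical, and because this section does not state how several options are joined into one option string, the sketch takes a list of individual option tokens.

```python
from typing import Dict, Iterable, Union

OptionValue = Union[bool, str]


def parse_option_tokens(tokens: Iterable[str]) -> Dict[str, OptionValue]:
    """Interpret option tokens such as "-smiles", "+acronyms" or "timeout=30".

    - "name" and "+name" enable an option (True)
    - "-name" disables an option (False)
    - "name=value" stores the value as a string
    """
    settings: Dict[str, OptionValue] = {}
    for token in tokens:
        token = token.strip()
        if not token:
            continue  # ignore empty tokens
        if "=" in token:
            name, value = token.split("=", 1)
            settings[name.lstrip("+")] = value
        elif token.startswith("-"):
            settings[token[1:]] = False
        else:
            settings[token.lstrip("+")] = True
    return settings


parsed = parse_option_tokens(["-smiles", "+acronyms", "inchi", "timeout=30"])
print(parsed)
```

Here smiles is disabled, acronyms and inchi are enabled, and timeout carries the value "30".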

Name to Structure Format Options

Apart from its specific options listed above, Document to Structure also accepts all Name to Structure format options, to configure which name conversions are attempted.

In some cases, however, Document to Structure uses different default values than Name to Structure:

| Codename | Default value |
| --- | --- |
| ocrCorrection | on |
| elements | off |
| ions | off |
| cas | off |
| casNames | off |
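
As a rough illustration of how these defaults interact with user-supplied options, the Python sketch below starts from the Document to Structure defaults in the table and applies +/- toggles on top. The names D2S_NAME_DEFAULTS and apply_toggles are hypothetical, and the sketch covers only the five options listed here, not the full Name to Structure option set.

```python
from typing import Dict, Iterable

# Defaults used by Document to Structure for these Name to Structure options,
# copied from the table above (on -> True, off -> False).
D2S_NAME_DEFAULTS: Dict[str, bool] = {
    "ocrCorrection": True,
    "elements": False,
    "ions": False,
    "cas": False,
    "casNames": False,
}


def apply_toggles(defaults: Dict[str, bool], tokens: Iterable[str]) -> Dict[str, bool]:
    """Return a copy of `defaults` with "-name" / "+name" / "name" toggles applied."""
    result = dict(defaults)  # leave the defaults themselves untouched
    for token in tokens:
        enabled = not token.startswith("-")
        result[token.lstrip("+-")] = enabled
    return result


# Turn on CAS Registry Number conversion and turn off OCR correction.
effective = apply_toggles(D2S_NAME_DEFAULTS, ["+cas", "-ocrCorrection"])
print(effective)
```

Any option not mentioned by the user keeps its Document to Structure default.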