Document to Structure Format Options

    Format Options Specific to Document to Structure

    Codename | Default Value | Explanation
    smiles | on | enable the conversion of SMILES strings
    inchi | on | enable the conversion of InChI strings
    ocr | on | enable the processing of scanned text in PDF documents
    osr | on if any external OSR tool is installed | enable the conversion of structure drawings by any available external OSR tool
    osra | on if OSRA is installed | enable the conversion of structure drawings by the OSRA external tool. Using this option specifies that OSRA should be used even if other OSR tools are available.
    clide | on if CLiDE is installed | enable the conversion of structure drawings by the CLiDE external tool. Using this option specifies that CLiDE should be used even if other OSR tools are available.
    imago | on if Imago is installed | enable the conversion of structure drawings by the Imago external tool. Using this option specifies that Imago should be used even if other OSR tools are available.
    timeout=N | no timeout | the maximum number of seconds to run, with 0 for no timeout
    osraTimeout=N | 20 seconds | the maximum number of seconds to run OSRA on an image
    clideTimeout=N | 20 seconds | the maximum number of seconds to run CLiDE on an image
    imagoTimeout=N | 20 seconds | the maximum number of seconds to run Imago on an image
    filterOSR | on | enable the filtering of OSR structures for incomplete recognition
    text | on | enable the processing of the textual content of the document. The text is searched for text-based formats: names, SMILES and InChI (all on by default) and CAS Registry Numbers® (off by default, see the cas option under Name to Structure Format Options)
    acronyms | off | enable the conversion of acronyms, such as ATP for adenosine triphosphate
    vernacular | off | enable the conversion of everyday terms like "water" or "steam"
    OLE | on | enable the conversion of structures embedded in office documents
    startPage=N | no limit (starts at first page) | start processing the document at page N (can be combined with endPage to process a range of pages)
    endPage=N | no limit (stops at last page) | stop processing the document at page N
    insideTag=TAG | off | for markup formats, enable the conversion only inside the given tag (typically insideTag=body for HTML)
    contextRadius=N | 40 | maximum number of characters of context to include on each side of the hit
    contextIndex | off | whether to include the index of the hit in the context
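
    To illustrate what contextRadius controls, here is a minimal Python sketch of one plausible reading of the option: take the hit and keep at most N characters on each side of it. The function name hit_context and its parameters are hypothetical and not part of the product; how the tool itself treats word boundaries or document edges is not specified here.

```python
def hit_context(text, start, end, radius=40):
    """Return text[start:end] (the hit) plus up to `radius` characters per side.

    Hypothetical helper: it only illustrates the idea of a context radius.
    The default of 40 mirrors the contextRadius default in the table above.
    """
    left = max(0, start - radius)           # do not run past the beginning
    right = min(len(text), end + radius)    # do not run past the end
    return text[left:right]


sentence = "The sample contained aspirin and caffeine."
start = sentence.index("aspirin")
print(hit_context(sentence, start, start + len("aspirin"), radius=4))
```

    With a small radius the context is a short window around the hit; with a radius larger than the document, the whole text is returned.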

    Each option can be preceded by a minus sign - (for instance -smiles) to disable it. Both forms smiles and +smiles are accepted to enable an option.
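
    The enable/disable syntax described above can be sketched in Python. This is an illustration only: the helper names build_options and parse_options are hypothetical, and the comma used to separate options is an assumption of this sketch. Only the -name, name, +name and name=N forms come from the text above.

```python
def build_options(flags=None, values=None, explicit_plus=False):
    """Build an option string such as "-smiles,acronyms,timeout=30".

    flags:  dict mapping a codename to True (enable) or False (disable)
    values: dict mapping a codename to its value (the name=N form)
    The comma separator is an assumption of this sketch.
    """
    parts = []
    for name, enabled in (flags or {}).items():
        if enabled:
            # "smiles" and "+smiles" are both accepted to enable an option
            parts.append(("+" if explicit_plus else "") + name)
        else:
            # a leading minus sign disables the option
            parts.append("-" + name)
    for name, value in (values or {}).items():
        parts.append(f"{name}={value}")
    return ",".join(parts)


def parse_options(option_string):
    """Split an option string back into (flags, values) dictionaries."""
    flags, values = {}, {}
    tokens = (token.strip() for token in option_string.split(","))
    for token in filter(None, tokens):
        if "=" in token:
            name, value = token.split("=", 1)
            values[name] = value            # values are kept as strings
        elif token.startswith("-"):
            flags[token[1:]] = False
        else:
            flags[token.lstrip("+")] = True
    return flags, values


print(build_options({"smiles": False, "acronyms": True}, {"timeout": 30}))
```

    Keeping on/off flags and valued options in separate dictionaries avoids ambiguity between an option that is switched off and an option whose value is 0 (for example timeout=0, which means no timeout).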

    Name to Structure Format Options

    Apart from its specific options listed above, Document to Structure also accepts all Name to Structure format options, to configure which name conversions are attempted.

    In some cases, however, Document to Structure uses different default values than Name to Structure:

    Codename | Default Value
    ocrCorrection | on
    elements | off
    ions | off
    cas | off
    casNames | off
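
    The way these defaults combine with explicit choices can be sketched as a simple overlay. The snippet below is a hypothetical Python illustration, not the product's implementation: the dictionary copies the defaults listed above, and the function name effective_name_options is an assumption of this sketch.

```python
# Defaults that Document to Structure applies to these Name to Structure
# options, copied from the table above (True = on, False = off).
D2S_NAME_OPTION_DEFAULTS = {
    "ocrCorrection": True,
    "elements": False,
    "ions": False,
    "cas": False,
    "casNames": False,
}


def effective_name_options(user_flags=None):
    """Return the defaults with the user's explicit choices applied on top."""
    effective = dict(D2S_NAME_OPTION_DEFAULTS)  # copy, so defaults stay intact
    effective.update(user_flags or {})
    return effective


# Enabling cas explicitly leaves the other defaults unchanged.
print(effective_name_options({"cas": True}))
```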