Salts and solvates are stripped from compounds when the source-dependent registration option "Analyze Salt Solvate Fragments" is ON. Only salts/solvates that are present in the Salts & Solvates Dictionary can be stripped.
During the upload, the user can provide the salt/solvate Ids and multiplicities for each compound structure if needed. Salts/solvates from the Salts & Solvates Dictionary can be referenced by their Id and multiplicity in SDF data fields that are mapped to the "Version salt/solvate ID" and "Version salt/solvate multiplicity" fields (Figure).
The result of the upload process is always summarized in the:
Uploading No structures under different parent compound Ids
No structure compounds can be successfully uploaded through an SDF if the file contains empty structures and the structure field is mapped.
Example SDF that contains empty structure
Each empty molfile will be registered as a new compound in the Registration system.
Similarly, to bulk register No structure compounds with a CST, the corresponding value from the SDF should be mapped to the "CST" field in the application.
Mapping the CST on the Upload page
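As an illustration of the input described above, the following minimal Python sketch writes an SDF whose records each contain an empty V2000 molfile and one data field. The tag name "CST" and the sample values are assumptions made for the example; the tag in a real file can be named anything, as long as it is mapped to the "CST" field on the Upload page.

```python
# Sketch: build SDF text whose records carry an empty V2000 molfile plus data fields.
# The tag name "CST" and the sample values below are illustrative assumptions.

EMPTY_MOLFILE = "\n".join([
    "",                                         # header line 1: molecule name (blank)
    "",                                         # header line 2: program/timestamp (blank)
    "",                                         # header line 3: comment (blank)
    "  0  0  0  0  0  0  0  0  0  0999 V2000",  # counts line: 0 atoms, 0 bonds
    "M  END",
])


def sdf_record(fields):
    """Return one SDF record: empty molfile, data items, record terminator."""
    lines = [EMPTY_MOLFILE]
    for tag, value in fields.items():
        lines.append(f"> <{tag}>")
        lines.append(str(value))
        lines.append("")  # a blank line closes each data item
    lines.append("$$$$")  # SDF record terminator
    return "\n".join(lines) + "\n"


def build_sdf(rows):
    """Concatenate one record per row (each row is a dict of tag -> value)."""
    return "".join(sdf_record(row) for row in rows)


sdf_text = build_sdf([
    {"CST": "Polymer resin, batch A"},
    {"CST": "Plant extract, fraction 3"},
])

if __name__ == "__main__":
    print(sdf_text)
```

Each record produced this way has no atoms or bonds, so it corresponds to one empty molfile in the uploaded file.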
Uploading No structures under the same parent compound Id (bulk register new lots)
No structure compounds can be successfully uploaded through an SDF under a specific parent compound Id if the Id value is mapped to the "PCN" field in the system.
If the same CST is provided for a set of No structures within the SDF, the records will be registered under the same parent compound Id.
Example SDF that contains empty structure and mapped PCN
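Before uploading, it can help to preview which records share the same value in the field that will be mapped to "PCN". The sketch below is a plain-text SDF reader written for this illustration only; it is not part of the Registration system. The tag name "PCN" and the Ids "CXN0001" and "CXN0002" are hypothetical.

```python
from collections import defaultdict


def split_records(sdf_text):
    """Split SDF text into lists of lines, one list per '$$$$'-terminated record."""
    records, current = [], []
    for line in sdf_text.splitlines():
        if line.strip() == "$$$$":
            records.append(current)
            current = []
        else:
            current.append(line)
    return records


def data_fields(record_lines):
    """Collect the '> <TAG>' data items of one record into a dict."""
    fields, tag = {}, None
    for line in record_lines:
        if line.startswith(">") and "<" in line and ">" in line[1:]:
            tag = line[line.index("<") + 1 : line.rindex(">")]
            fields[tag] = []
        elif tag is not None:
            if line == "":
                tag = None  # a blank line closes the data item
            else:
                fields[tag].append(line)
    return {t: "\n".join(v) for t, v in fields.items()}


def group_by_field(sdf_text, tag):
    """Map each value of `tag` to the 1-based positions of the records carrying it."""
    groups = defaultdict(list)
    for index, record in enumerate(split_records(sdf_text), start=1):
        value = data_fields(record).get(tag, "").strip()
        groups[value].append(index)
    return dict(groups)


# Hypothetical sample: three empty structures, two of them with the same parent Id.
EMPTY = "\n\n\n  0  0  0  0  0  0  0  0  0  0999 V2000\nM  END\n"
sample = (
    EMPTY + "> <PCN>\nCXN0001\n\n$$$$\n"
    + EMPTY + "> <PCN>\nCXN0001\n\n$$$$\n"
    + EMPTY + "> <PCN>\nCXN0002\n\n$$$$\n"
)
groups = group_by_field(sample, "PCN")

if __name__ == "__main__":
    print(groups)  # {'CXN0001': [1, 2], 'CXN0002': [3]}
```

Records that lack the tag are grouped under an empty string, which makes missing values easy to spot before the upload.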
Please note that the Bulk Upload does NOT support the registration of multi-component compounds. When a structure in the loaded SDF file contains multiple components, it is registered as a "single" compound consisting of several components (not as a Mixture, Formulation or Alternate). Alternatively, a multi-component checker can be introduced, in which case these records are not autoregistered but fall to the Staging area, from where they can be manually registered as multi-component compounds.
Unreadable file format: The molfile cannot be parsed and no information can be extracted from it (because it might be corrupted or does not follow the expected file format specification).
Inconsistent file format: The molfile is valid (it can be parsed according to the syntax specification and passes format validation), but the molecule it describes does not make sense and is likely wrong.
Unsupported file format: ChemAxon may not recognize that a section of the file contains structure-relevant information and may simply skip parsing it. This is often caused by vendor- or organization-specific modifications to the file format specification that are not widely used and not sufficiently described publicly. In other words, data may be missed when it is not supposed to exist according to the publicly available file format specifications, because neither its type nor its context can be assumed.
With Compound Registration, invalid structures can be uploaded successfully. The compounds are stored as "No structures", while the original structure is stored as text in an additional data field. Later, the "No structures" can be amended at the parent compound level and the correct structure can be provided (the Ids are kept).
To prepare the system for storing the original (invalid) molfile as text, an additional data field should be created (Administration/Forms and Fields) with "errorStructure" as its field identifier.
New field configuration
On the Upload page the previously created additional data field can be appended:
Appending an additional field
After the file is uploaded, records with invalid structures are converted to "No structures", and the original structure is stored as text in the additional data field of the compound.
During registration on the Staging area/Submission page
After registration: Browse page
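The conversion described above is performed by the Registration system itself. The following Python sketch only illustrates its outcome on a simplified record. The dictionary key "structure", the helper names and the structural check are assumptions made for this example; the check is a rough test of the V2000 layout and not the parser the system uses. Only the "errorStructure" field identifier comes from the configuration described above.

```python
# Sketch: turn a record with an unreadable molfile into a "No structure" record
# that keeps the original text under the "errorStructure" field identifier.

EMPTY_MOLFILE = "\n\n\n  0  0  0  0  0  0  0  0  0  0999 V2000\nM  END"


def looks_readable(molfile):
    """Rough V2000 layout check (hypothetical helper, not a chemistry check)."""
    lines = molfile.splitlines()
    if len(lines) < 5 or "V2000" not in lines[3]:
        return False
    try:
        n_atoms = int(lines[3][0:3])  # counts line, columns 1-3
        n_bonds = int(lines[3][3:6])  # counts line, columns 4-6
    except ValueError:
        return False
    # 3 header lines + counts line + atom block + bond block, then "M  END"
    body_end = 4 + n_atoms + n_bonds
    return len(lines) > body_end and any(
        line.startswith("M  END") for line in lines[body_end:]
    )


def to_registrable(record):
    """Return a copy of the record; unreadable structures become 'No structure'."""
    converted = dict(record)
    if not looks_readable(record["structure"]):
        converted["structure"] = EMPTY_MOLFILE
        converted["errorStructure"] = record["structure"]  # original kept as text
    return converted


# One-atom molfile (readable) and a truncated variant (counts line claims 2 atoms).
ATOM = "    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0"
good = "\n\n\n  1  0  0  0  0  0  0  0  0  0999 V2000\n" + ATOM + "\nM  END"
broken = "\n\n\n  2  0  0  0  0  0  0  0  0  0999 V2000\n" + ATOM

if __name__ == "__main__":
    print(sorted(to_registrable({"structure": good})))
    print(sorted(to_registrable({"structure": broken})))
```

In this sketch a readable record passes through unchanged, while the truncated one ends up with an empty structure and the original text in "errorStructure", mirroring what is later visible on the Browse page.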
The result of the upload process is always summarized: