Bulk Upload

    Upload is the process of calling the registration service to register a set of compounds automatically based on a predefined configurable set of business rules (validation, standardization, structure checking).

    {info} Since version 24.1.0, file processing in bulk upload has been redesigned to fix several issues and to improve performance. Instead of relying on the embedded broker to store intermediate upload records, the redesigned bulk upload saves these records as submissions. The change also reduces the memory footprint during bulk registrations and two-step registrations.

    The Upload page consists of two main sections: the File uploader and All uploads.

    The Upload page with the recent uploads

    An SD file can be dragged and dropped into the File uploader. You can also browse for files or paste the file content as text.

    Supported formats

    The Upload page is designed to simultaneously handle multiple submissions imported from an SD file according to configurable system settings.

    The following molecule formats are supported: SDF, SMILES, SMARTS, MRV, CSV.

    {info} In the case of SDF files, the structure field is mandatory. When the "structure" part is "empty", a No structure registration is triggered. More details can be found in Uploading No structures.

    The structure within the CSV file can be in Chemaxon Extended SMARTS (1), SMARTS (2), Chemaxon Extended SMILES (3), or SMILES (4) format.


    {info} Since version 22.19.0 the UTF-8 BOM file encoding is supported.

    {info} In the CSV file, the header is mandatory, and columns can be separated by commas or semicolons. The first column must be the structure column, followed by any other fields.

    {info} The structure column can contain a structure in any of the formats listed above to register a Single structure, or can be left empty to register a No-structure. At least one field (e.g. Id or additional data) must be mapped before initiating the upload, or the upload will fail.

    {info} Note that for CSV files the system might not recognize the fields automatically. In that case, map the fields manually so that the file can be uploaded successfully.
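
    The CSV layout rules above can be checked locally before a file is uploaded. The following Python sketch is not part of the registration system: the function name check_upload_csv and the shape of the returned dictionary are assumptions made for illustration. It only verifies that a header row exists, guesses whether the file is comma or semicolon separated, treats the first column as the structure column, and reports which records have an empty structure (these would be registered as No-structures).

```python
import csv
import io


def check_upload_csv(text: str) -> dict:
    """Pre-check CSV text against the layout rules described above.

    Hypothetical helper for illustration only; it does not call the
    registration service and does not validate the structures themselves.
    """
    # Drop a UTF-8 byte order mark if the text was decoded without removing it.
    text = text.lstrip("\ufeff")
    lines = text.splitlines()
    if not lines or not lines[0].strip():
        raise ValueError("The header row is mandatory.")

    # Only comma and semicolon are allowed; guess which one from the header.
    header_line = lines[0]
    delimiter = ";" if header_line.count(";") > header_line.count(",") else ","

    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    header = rows[0]
    # csv.reader yields an empty list for blank lines; skip those.
    records = [row for row in rows[1:] if row]

    # An empty first column means the record has no structure.
    no_structure_rows = [
        index for index, row in enumerate(records, start=1) if not row[0].strip()
    ]
    return {
        "delimiter": delimiter,
        "structure_column": header[0],
        "other_fields": header[1:],
        "no_structure_rows": no_structure_rows,
    }
```

    For example, calling check_upload_csv on a semicolon-separated file whose second record has an empty first column reports record 2 in no_structure_rows. Field mapping itself still has to be done on the Upload page.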

    Upload options

    You can reach the upload options by clicking on the gear wheel icon on the Upload page.

    Multi-value input delimiter

    Since version 20.8.0, a multi-value input delimiter can be defined in the Upload options. Field value splitting is supported for both SDF and CSV files. For CSV files, the CSV delimiter and the multi-value delimiter must be different, and the CSV delimiter must not be contained within the multi-value delimiter. The delimiter applies to the whole uploaded file, not to individual fields. Field values are split only for fields where ‘Enable multi-value input’ is set to true.

    Upload options icon

    Defining a delimiter

    Delimiter mapping
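
    The delimiter constraints for multi-value input can be illustrated with a short Python sketch. The function names validate_delimiters and split_field_value are hypothetical and only mirror the rules stated in this section; how the system trims or otherwise normalizes the split values is not covered here.

```python
def validate_delimiters(csv_delimiter: str, multi_value_delimiter: str) -> None:
    """Raise ValueError if the two delimiters break the rules for CSV uploads.

    Hypothetical helper for illustration only.
    """
    if not multi_value_delimiter:
        raise ValueError("A multi-value delimiter must be defined.")
    if csv_delimiter == multi_value_delimiter:
        raise ValueError("CSV and multi-value delimiters must be different.")
    if csv_delimiter in multi_value_delimiter:
        raise ValueError(
            "The CSV delimiter must not be contained within the multi-value delimiter."
        )


def split_field_value(
    value: str, multi_value_delimiter: str, multi_value_enabled: bool
) -> list:
    """Split a field value only if multi-value input is enabled for the field.

    The same delimiter is used for every field of the uploaded file.
    """
    if not multi_value_enabled:
        # Fields without 'Enable multi-value input' keep their value unsplit.
        return [value]
    return value.split(multi_value_delimiter)
```

    With a comma as the CSV delimiter and a pipe character as the multi-value delimiter, validate_delimiters(",", "|") passes, and a value such as red|green|blue is split into three values only for fields that have multi-value input enabled.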

    ID-based fields

    Since version 21.3.0, the ‘File contains dictionary item IDs, instead of values’ option is available in the Upload options.

    'File contains dictionary item IDs, instead of values' checkbox

    The effect of the 'File contains dictionary item IDs, instead of values' checkbox:

    • ID-based fields:

      When this checkbox is ON: File contains dictionary item IDs, instead of values.

      When this checkbox is OFF: File contains values.

    • Non-ID-based fields:

      Non-ID-based fields are not affected.

    When you use the Append field feature and choose a Value from the drop-down, that ID-value pair is used. The Value is displayed on the UI whether the checkbox is ON or OFF.
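
    The effect of the checkbox can be pictured as a lookup in one of two directions. The sketch below is a simplified model, not the system's implementation: the function resolve_dictionary_item, the example dictionary, and the use of string IDs are all assumptions made for illustration.

```python
def resolve_dictionary_item(
    file_value: str,
    dictionary: dict,
    is_id_based: bool,
    file_contains_ids: bool,
) -> tuple:
    """Return an (item_id, value) pair for a value read from the uploaded file.

    Hypothetical model: `dictionary` maps item IDs to displayed values.
    """
    if not is_id_based:
        # Non-ID-based fields are not affected by the checkbox.
        return None, file_value
    if file_contains_ids:
        # Checkbox ON: the file holds the dictionary item ID.
        return file_value, dictionary[file_value]
    # Checkbox OFF: the file holds the value, so look up its ID.
    for item_id, value in dictionary.items():
        if value == file_value:
            return item_id, value
    raise KeyError(file_value)


# Example dictionary with made-up IDs and values.
projects = {"101": "Oncology", "102": "Neurology"}
on_result = resolve_dictionary_item("101", projects, True, True)
off_result = resolve_dictionary_item("Neurology", projects, True, False)
```

    In this model both settings end in an ID-value pair, which matches the statement that the Value is shown on the UI in either case. Only the content expected in the file differs.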

    Upload compounds, salts and solvates

    From the Upload page, compounds and salts/solvates can be bulk registered: