Design Hub developer guide - import plugins

    Import plugins can be used to push or pull data into Design Hub from any external source (message queue, REST API, etc.).

    NodeJS module API

    An import plugin exports the following properties:

    Name | Type | Required | Description
    name | string | yes | Unique identifier of the plugin, used by Design Hub for identification and internal communication. If multiple plugins use the same identifier, the last one loaded overrides the others.
    label | string | yes | Human-readable name of the plugin, used by Design Hub in GUI elements related to this plugin: as the menu entry that enables the plugin and as the title of the panel that displays the results.
    domains | array of strings | yes | List of domains where this plugin may be used when authentication is enabled in Design Hub. Use * to allow any domain.
    init | async function | yes | Inside init, this is a domain-specific context, described later in this document. This method can be used to initialize connections and/or start scheduled jobs for the plugin. (!) In a multi-domain setup, init is called for each domain separately with its own context.
    settings | array of objects | no | Including this property indicates a manual import plugin that can be triggered from the UI. A settings dialog is displayed to the user based on the exported settings.
    getSettings | async function | no | Including this property indicates a manual import plugin that can be triggered from the UI. A settings dialog is displayed to the user based on the return value of getSettings().
    runImport | async function | no | This is triggered in case of a manual import, and gets the configuration in this
    cannotProcess | async function | no | The import service calls this with the ids of the records for which processing failed.
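
    The properties above can be put together as in the following minimal sketch of an automatic import plugin. The identifier, label and log message are placeholders for this example. The sketch assumes that init is invoked with this bound to the domain-specific context, and that cannotProcess receives the ids as an array; neither detail is confirmed beyond the table above.

```javascript
// Ids reported back by the import service; kept in memory for this sketch only.
const failedIds = [];

const plugin = {
  // Placeholder identifier and label; choose values unique to your plugin.
  name: "example-import",
  label: "Example import",
  // "*" allows the plugin in every domain.
  domains: ["*"],

  // Assumption: `this` is the domain-specific context (domain, logger, ...).
  async init() {
    this.logger.info(`example-import initialized for domain ${this.domain}`);
  },

  // Assumption: `ids` is an array of record identifiers.
  async cannotProcess(ids) {
    failedIds.push(...ids);
  },
};

module.exports = plugin;
```

    Because init is written as a regular method rather than an arrow function, it can read the context from this. An arrow function would not receive that binding.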

    Domain-specific import context properties:

    Name | Type | Description
    domain | string | Domain of this context
    logger | object | Context-specific logger with the typical debug, info, warn and error logging methods
    schedule | object | Job scheduler
    storeRawData | function(RawRecord[]): Promise<number> | Stores the raw data immediately in the database (external data storage). Resolves to the number of stored records.

    (!) Manual import plugins should store this context instance per domain, so that it can be reused later when an import is triggered manually.
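
    One way to follow the note above is to keep the contexts in a Map keyed by domain. The sketch below assumes, as before, that init receives the context as this. The helper storeForDomain is hypothetical and not part of the plugin API. How runImport learns which domain it runs in is not covered here.

```javascript
// Contexts received in init(), keyed by domain name.
const contexts = new Map();

const manualPlugin = {
  name: "example-manual-import", // placeholder identifier
  label: "Example manual import", // placeholder label
  domains: ["*"],

  // Assumption: `this` is the domain-specific context.
  async init() {
    contexts.set(this.domain, this);
  },
};

// Hypothetical helper: stores records through the context saved for a domain.
async function storeForDomain(domain, records) {
  const context = contexts.get(domain);
  if (!context) {
    throw new Error(`No import context registered for domain "${domain}"`);
  }
  // storeRawData resolves to the number of stored records.
  const stored = await context.storeRawData(records);
  context.logger.info(`Stored ${stored} of ${records.length} raw records`);
  return stored;
}

module.exports = manualPlugin;
```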

    JSDoc for RawRecord

    /**
     * @typedef {Object} RawRecord
     * @prop {string} [external_id] - Design Hub external, but virtual identifier of a compound
     * @prop {string} [substance_id] - physical substance identifier of a compound
     * @prop {boolean} [generate_virtual_id] - if set to true, a virtual identifier is generated instead of using external_id
     * @prop {string} source - chemical structure
     * @prop {string} owner_username - username or identifier as obtained from the identity provider
     * @prop {number} [project_id] - Design Hub internal project identifier
     * @prop {string} [project_key] - Design Hub external project identifier (acquired when fetching projects from `company` plugin)
     * @prop {string} [hypothesis_title] - title of the hypothesis in which the compound is stored
     * @prop {string} [designset_title] - title of the design set in which the compound is stored
     * @prop {number} [status_id] - Design Hub internal status identifier
     * @prop {string} [status_label] - Design Hub status label
     * @prop {number} [visibility] - private/shared visibility flag: 0 for private, 1 for public (default: 1)
     * @prop {{[key: string]: string|number|NumberWithModifier}} raw_data - compound properties (e.g. assay data)
     */

    /**
     * @typedef {Object} NumberWithModifier
     * @prop {number} value
     * @prop {string} modifier
     */

    Either external_id or substance_id is mandatory, because it is used as the primary identifier of the imported record.

    For values with modifiers, the following value modifiers are accepted: <<, <, <=, =, *, ~, >=, >, >>.
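
    The rules above can be checked before calling storeRawData. The following sketch is one possible pre-check, not part of the plugin API: validateRawRecord is a hypothetical helper, and the sample values (identifiers, user name, assay names) are invented for illustration. The structure is given as a SMILES string, which is an assumption, since this guide does not state which structure formats are accepted.

```javascript
// Value modifiers listed in this guide.
const MODIFIERS = ["<<", "<", "<=", "=", "*", "~", ">=", ">", ">>"];

// Hypothetical helper: returns a list of problems found in a RawRecord.
function validateRawRecord(record) {
  const problems = [];
  if (!record.external_id && !record.substance_id) {
    problems.push("external_id or substance_id is required");
  }
  for (const field of ["source", "owner_username"]) {
    if (typeof record[field] !== "string" || record[field] === "") {
      problems.push(`${field} is required`);
    }
  }
  if (record.raw_data === null || typeof record.raw_data !== "object") {
    problems.push("raw_data is required");
    return problems;
  }
  for (const [key, value] of Object.entries(record.raw_data)) {
    // Object values are treated as NumberWithModifier.
    const isObject = value !== null && typeof value === "object";
    if (isObject && !MODIFIERS.includes(value.modifier)) {
      problems.push(`raw_data.${key} has unsupported modifier "${value.modifier}"`);
    }
  }
  return problems;
}

// Sample record with invented values.
const record = {
  external_id: "EX-0001",
  source: "CCO", // assumption: SMILES is an accepted structure format
  owner_username: "jdoe",
  project_key: "PRJ-1",
  raw_data: {
    "IC50 (nM)": { value: 10, modifier: "<" }, // reads as "< 10"
    "Assay": "kinase panel",
  },
};

console.log(validateRawRecord(record).length === 0 ? "record looks valid" : validateRawRecord(record));
```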

    Note: the visibility value for private compounds is used only during the initial setup of the given hypothesis and design set. Records processed afterwards inherit the visibility of the design set.


    All new data in the temporary storage is processed automatically by scheduled jobs. After successful processing, the content becomes visible to users.

    Domain-level configuration options for the jobs:

    Name | Default | Description
    importProcessBatchSize | 100 | Number of records to process in a batch
    importSchedulerPlan | normal | normal: every 5 minutes on Mon-Fri; crazy: every minute on Mon-Fri; dontTryThisAtHome: every second

    Error handling

    Preprocessing steps:

    • generate images for structures
    • resolve internal status_id from status_label
    • resolve internal project_id from project_key
    • identify owner user based on owner_username
    • identify target design set based on hypothesis_title and designset_title. Notes:
      • the owner user should have write permission on it
      • the design set must be public
    • identify records which need to be attached to already existing content.
      Id matching strategy: currently a query based on external_id or substance_id
    • insert or update content
    • chemically index structures with JChem Microservices

    If any of these steps fails, the record is marked as FAILED and the cause of the error is stored. The failures can be queried through the following endpoint:

    GET http://{host}:{port}/data/plugins/import/{plugin-name}/errors?offset={offset}&limit={limit}
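
    As an illustration, the endpoint above can be queried from Node.js as sketched below. The base URL, the plugin name and the default page size are placeholders. The sketch assumes Node.js 18 or later (global fetch), a JSON response body, and that no extra authentication headers are needed; none of these are stated in this guide.

```javascript
// Builds the error-listing URL for a plugin.
// Note: the path is absolute, so any path prefix in baseUrl is ignored.
function importErrorsUrl(baseUrl, pluginName, offset = 0, limit = 50) {
  const path = `/data/plugins/import/${encodeURIComponent(pluginName)}/errors`;
  const url = new URL(path, baseUrl);
  url.searchParams.set("offset", String(offset));
  url.searchParams.set("limit", String(limit));
  return url.toString();
}

// Fetches one page of failures. Assumption: the response body is JSON.
async function fetchImportErrors(baseUrl, pluginName, offset, limit) {
  const response = await fetch(importErrorsUrl(baseUrl, pluginName, offset, limit));
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return response.json();
}

// Placeholder host, port and plugin name.
console.log(importErrorsUrl("http://localhost:8888", "example-import", 0, 20));
```

    Keeping the URL construction separate from the request makes the paging parameters easy to verify without a running server.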