Design Hub developer guide - import plugins

    Import plugins can be used to push or pull data (compounds with metadata) into Design Hub from any external source, such as a message queue or a REST API.

    NodeJS module API

    An import plugin exports the following properties:

    Name Type Required Description
    name string yes Unique identifier of the plugin, used by Design Hub for identification and internal communication. If multiple plugins use the same identifier, the last one to be loaded overrides the others.
    label string yes Human readable name of the plugin, used by Design Hub to display GUI elements related to this plugin: as menu entry in the menu to enable the plugin, as title of the panel displaying the results.
    domains array of strings yes List of domains where this plugin may be used, when authentication is enabled in Design Hub. Use * to allow any domain.
    init function yes Called with this bound to a domain-specific context, described later in this document. This method can be used to initialize connections and/or start scheduled jobs for the plugin.
    (!) In a multi-domain setup, init will be called for each domain separately with its own context.
    getSettings async function no Including this property indicates a manual import plugin, which can be triggered from the UI. A settings dialog is displayed to the user based on the return value of getSettings(). Takes precedence over settings.
    settings array of objects no Including this property indicates a manual import plugin, which can be triggered from the UI. A settings dialog is displayed to the user based on the exported settings. This property is ignored if getSettings is defined.
    runImport async function yes Triggered in case of a manual import; receives its configuration through this.
    cannotProcess async function The import service calls this function with the ids of the records whose processing failed.
    onConfigurationChanged function no A callback function that Design Hub calls during initialization and whenever an administrator updates the Secrets of the system.

    Arguments:
    config (Object) An object with a secrets attribute containing the key-value pairs of secrets from the Admin interface
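
    As a minimal sketch of how a plugin might keep the latest secrets available for later calls, the callback below copies config.secrets into a module-level variable. The getSecret helper and any secret names passed to it are hypothetical and not part of the Design Hub API.

```javascript
"use strict";

// Latest secrets received from Design Hub; empty until the first callback.
let secrets = {};

/**
 * Called by Design Hub during initialization and after each Secrets update.
 * @param {{secrets: {[key: string]: string}}} config
 */
function onConfigurationChanged(config) {
  // Copy the values so later changes to the incoming object do not leak in.
  secrets = { ...(config && config.secrets) };
}

/**
 * Hypothetical helper: read a secret and fail loudly when it is missing.
 * @param {string} key
 * @returns {string}
 */
function getSecret(key) {
  if (!(key in secrets)) {
    throw new Error(`Secret "${key}" is not configured`);
  }
  return secrets[key];
}
```

    Reading the secret at the moment it is needed, rather than once at start-up, means that an update made by an administrator takes effect on the next import run.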

    Domain specific import context properties:

    Name Type Description
    domain string Domain of this context
    logger object Context specific logger with the typical debug, info, warn, error logging methods
    schedule object Job scheduler
    storeRawData function(RawRecord[]): Promise<number> Callback provided by the plugin API to store compounds immediately in the database (external data storage). Resolves to the number of stored records.
    (!) Manual import plugins should store this method in a domain specific way, so it can be reused later when import is triggered manually.
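
    One possible way to follow the note above is to keep a per-domain registry that init fills in. This is only a sketch: the contexts map and the getStore helper are hypothetical names, and it assumes that init is called with this bound to the context described in the table.

```javascript
"use strict";

// One entry per domain, filled in by init().
const contexts = new Map();

/** Called by Design Hub once per domain, with `this` bound to the domain context. */
function init() {
  contexts.set(this.domain, {
    logger: this.logger,
    storeRawData: this.storeRawData
  });
}

/**
 * Hypothetical helper: look up the store callback saved for a domain.
 * @param {string} domain
 */
function getStore(domain) {
  const ctx = contexts.get(domain);
  if (!ctx) {
    throw new Error(`Plugin was not initialized for domain "${domain}"`);
  }
  return ctx.storeRawData;
}
```

    Keying the registry by domain keeps a multi-domain setup from mixing callbacks, since init runs separately for each domain.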

    RawRecord

    The storeRawData callback function accepts a list of RawRecords which carry the compound information. The table below lists all the accepted attributes of such a record:

    Name Type Required Description
    external_id string Yes A unique record identifier
    substance_id string No Physical substance identifier of the compound
    virtual_id string No A non-Design Hub generated virtual identifier of the compound
    source string Yes Chemical structure of the compound
    owner_username string Yes Username or identifier as obtained from the identity provider
    generate_virtual_id boolean No Deprecated
    project_id number One of project_id and project_key required Design Hub internal project identifier
    project_key string One of project_id and project_key required Design Hub external project identifier (acquired when fetching projects from company plugin)
    hypothesis_title string No Title of the hypothesis in which the compound is stored
    designset_title string No Title of the design set in which the compound is stored
    status_id number One of status_id and status_label required Design Hub internal status identifier
    status_label string One of status_id and status_label required Status label
    source_system string No Label of the compound source. This attribute will be published for storage plugins.
    visibility number No Private/shared visibility flag: 0 for private, 1 for public. The default is 1.
    raw_data Object No Compound properties. Object keys matching the name of compoundFields will be used to update the value of custom fields, while the rest are stored as Imported data.
    add_tags string[] No Tags to be added to the compound records.
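
    The "Required" column above can be checked before records are submitted. The function below is a sketch of such a check, not part of the plugin API: validateRawRecord is a hypothetical helper, and it only tests for presence and basic types, not whether the identifiers exist in Design Hub.

```javascript
"use strict";

/**
 * Return a list of problems found in a record; an empty list means the
 * record satisfies the "Required" column of the RawRecord table.
 * @param {object} record
 * @returns {string[]}
 */
function validateRawRecord(record) {
  const problems = [];
  const isNonEmptyString = (v) => typeof v === "string" && v.length > 0;

  // Unconditionally required attributes.
  for (const key of ["external_id", "source", "owner_username"]) {
    if (!isNonEmptyString(record[key])) {
      problems.push(`${key} is required`);
    }
  }
  // Either the internal id or the external key/label must be present.
  if (typeof record.project_id !== "number" && !isNonEmptyString(record.project_key)) {
    problems.push("one of project_id and project_key is required");
  }
  if (typeof record.status_id !== "number" && !isNonEmptyString(record.status_label)) {
    problems.push("one of status_id and status_label is required");
  }
  // visibility is optional, but only 0 and 1 are documented values.
  if (record.visibility !== undefined && record.visibility !== 0 && record.visibility !== 1) {
    problems.push("visibility must be 0 (private) or 1 (public)");
  }
  return problems;
}
```

    Records that fail this check can be logged and skipped, so that one malformed record does not end up as a FAILED entry in the import queue.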

    Compound Properties

    The raw_data attribute of a RawRecord is a simple object whose values can be strings, numbers, or numbers with a modifier. See the example below:

    {
       "external_id": "CHEMBL25",
       "substance_id": "CHEMBL25",
       ...
       "raw_data": {
          "Toxicity Assessment": "Safe",
          "Purity %": 99,
          "COX-1 IC50 uM": {
             "value": 4.45,
             "modifier": ">"
          }
       }
    }

    For values with modifiers, the following value modifiers are accepted: <<, <, <=, =, *, ~, >=, >, >>.
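
    External systems often deliver such values as text, for example ">4.45". The sketch below shows one way to convert that text into the shapes accepted by raw_data. The toRawDataValue helper is hypothetical, and the input format (modifier directly followed by the number) is an assumption about the external source.

```javascript
"use strict";

// Modifiers accepted by Design Hub, as listed above. Two-character tokens
// come first so that ">>" is not matched as ">" followed by leftover text.
const MODIFIERS = ["<<", "<=", ">>", ">=", "<", "=", "*", "~", ">"];

/**
 * Hypothetical helper: turn text such as ">4.45" into a raw_data value.
 * Plain numbers become numbers; anything non-numeric is kept as a string.
 * @param {string} text
 * @returns {string|number|{value: number, modifier: string}}
 */
function toRawDataValue(text) {
  const trimmed = text.trim();
  const modifier = MODIFIERS.find((m) => trimmed.startsWith(m));
  const numericPart = modifier ? trimmed.slice(modifier.length).trim() : trimmed;
  const value = Number(numericPart);
  if (numericPart === "" || Number.isNaN(value)) {
    return text; // not numeric: store as a plain string
  }
  return modifier ? { value, modifier } : value;
}
```

    With this helper, toRawDataValue(">4.45") yields an object with value 4.45 and modifier ">", toRawDataValue("99") yields the number 99, and toRawDataValue("Safe") is returned unchanged.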

    Configuration

    All fresh data in the temporary storage is processed automatically by scheduled jobs. After successful processing, the content becomes visible to users.

    Domain-level configuration options for the jobs:

    Name Default Description
    importProcessBatchSize 100 Number of records to process in a batch
    importSchedulerPlan normal One of: normal (every 5 minutes on Mon-Fri), crazy (every minute on Mon-Fri), dontTryThisAtHome (every second)

    Error handling

    Preprocessing steps:

    • generate images for structures
    • resolve internal status_id from status_label
    • resolve internal project_id from project_key
    • identify owner user based on owner_username
    • identify the target design set based on hypothesis_title and designset_title. Notes:
      • the owner user should have write permission on it
      • the design set must be public
    • identify records that need to be attached to already existing content.
      Id matching strategy: currently it is a query based on external_id or substance_id
    • insert or update content
    • chemically index structures with JChem Microservices

    If any of these steps fails, the record is marked as FAILED and the cause of the error is stored. The failures can be queried through the following endpoint:

    GET http://{host}:{port}/data/plugins/import/{plugin-name}/errors?offset={offset}&limit={limit}
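
    The endpoint above can be queried page by page. The following is a sketch under stated assumptions: buildErrorsUrl and fetchImportErrors are hypothetical helpers, the default page size of 50 is arbitrary, the global fetch requires Node.js 18 or later, and the response is assumed to be JSON. Any authentication the server requires is not shown.

```javascript
"use strict";

/**
 * Build the URL of the import error listing for one plugin.
 * @param {string} baseUrl scheme, host and port of Design Hub
 * @param {string} pluginName the plugin's `name` property
 * @param {number} [offset]
 * @param {number} [limit]
 * @returns {string}
 */
function buildErrorsUrl(baseUrl, pluginName, offset = 0, limit = 50) {
  const url = new URL(
    `/data/plugins/import/${encodeURIComponent(pluginName)}/errors`,
    baseUrl
  );
  url.searchParams.set("offset", String(offset));
  url.searchParams.set("limit", String(limit));
  return url.toString();
}

/**
 * Fetch one page of failures (assumes a JSON response body).
 * @param {string} baseUrl
 * @param {string} pluginName
 * @param {number} [offset]
 * @param {number} [limit]
 */
async function fetchImportErrors(baseUrl, pluginName, offset, limit) {
  const response = await fetch(buildErrorsUrl(baseUrl, pluginName, offset, limit));
  if (!response.ok) {
    throw new Error(`Error listing failed with HTTP ${response.status}`);
  }
  return response.json();
}
```

    Increasing offset by limit on each call walks through all stored failures.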

    Plugin skeleton

    Below are two skeleton files, one for a manual and one for an automatic import plugin, implementing the API methods. The code includes TypeScript type definitions (written as JSDoc annotations) for all parameters and expected results, so that editors like Visual Studio Code can assist with static code analysis and adherence to the specifications.

    skeleton-manual.import.js

    //@ts-check
    "use strict";
    
    const dhutils = require("@chemaxon/dh-utils");
    
    /**
     *
     * @typedef {Object} RawRecord
     * @prop {string} external_id
     * @prop {string} [substance_id]
     * @prop {string} [virtual_id]
     * @prop {boolean} [generate_virtual_id] - deprecated
     * @prop {string} source - chemical structure
     * @prop {string} owner_username
     * @prop {number} [project_id] Internal DH project identifier
     * @prop {string} [project_key]
     * @prop {number} [status_id] Internal DH status identifier
     * @prop {string} [status_label]
     * @prop {string[]} [add_tags]
     * @prop {string} [hypothesis_title]
     * @prop {string} [designset_title]
     * @prop {number} [visibility]
     * @prop {string} [source_system]
     * @prop {{[key: string]: string|number|NumberWithModifier}} [raw_data] - compound properties (assay data)
     *
     * @typedef {Object} NumberWithModifier
     * @prop {number} value
     * @prop {string} modifier
     *
     * @typedef PluginSettings
     * @prop {string} label
     * @prop {'boolean'|'number'|'enum'|'multienum'|'project'|'text'|'objectenum'|'objectmultienum'} type
     * @prop {string[]|number[]|{id: string, label: string, category?: string}[]} [values]
     * @prop {string|number|boolean} [default]
     * @prop {number} [min]
     * @prop {number} [max]
     *
     * @typedef ImportInitContext
     * @prop {string} domain
     * @prop {Logger} logger
     * @prop {import("node-schedule").schedule} schedule
     * @prop {StoreCallback} storeRawData
     *
     * @typedef Logger
     * @prop {function(...any): void} debug
     * @prop {function(...any): void} info
     * @prop {function(...any): void} warn
     * @prop {function(...any): void} error
     *
     * @typedef {function(RawRecord[]): Promise<number>} StoreCallback
     *
     * @typedef GetSettingsContext
     * @prop {User} user
     *
     * @typedef RunImportContext
     * @prop {User} user
     * @prop {PluginConfiguration} settings
     * @prop {string} domain
     * @prop {StoreCallback} storeRawData
     *
     * @typedef User
     * @prop {string} userName
     * @prop {any} tokens OIDC TokenSet
     *
     * @typedef {any} PluginConfiguration Project is DH internal project ID
     *
     * @typedef ConfigurationValues
     * @prop {{[key: string]: string}} secrets
     */
    
    /**
     * @this {ImportInitContext}
     */
    function init() {
      //store the logger instance
    }
    
    /**
     * @this {GetSettingsContext}
     * @returns {Promise<PluginSettings[]>}
     */
    async function getSettings() {
      console.log("plugin-name getSettings", this.user);
      return [];
    }
    
    /**
     * @this {RunImportContext}
     * @returns {Promise<{ successCount: number }>}
     */
    async function runImport() {
      console.log("user is requesting data with settings", this.user, this.settings);
    
      //obtain data
      //transform data to records
      /** @type {RawRecord[]} */
      const records = [];
    
      //submit records to DH API
      const successCount = await this.storeRawData(records);
    
      return { successCount };
    }
    
    /**
     * @this {CannotProcessContext}
     * @param {string[]} externalIds
     */
    async function cannotProcess(externalIds) {
      console.log("Cannot import IDs", externalIds);
    }
    
    /**
     * Store and use values provided by Admin interface's Secret manager
     * @param {ConfigurationValues} config
     */
    function onConfigurationChanged(config) {
      console.log("plugin-name configuration", config.secrets);
    }
    
    module.exports = {
      name: "manual-plugin-name",
      label: "Plugin Label",
      init: init,
      runImport: runImport,
      getSettings: getSettings,
      cannotProcess: cannotProcess,
      domains: ["*"],
      onConfigurationChanged: onConfigurationChanged
    };

    skeleton-automatic.import.js

    //@ts-check
    "use strict";
    
    const dhutils = require("@chemaxon/dh-utils");
    
    /**
     *
     * @typedef {Object} RawRecord
     * @prop {string} external_id
     * @prop {string} [substance_id]
     * @prop {string} [virtual_id]
     * @prop {boolean} [generate_virtual_id] - deprecated
     * @prop {string} source - chemical structure
     * @prop {string} owner_username
     * @prop {number} [project_id] Internal DH project identifier
     * @prop {string} [project_key]
     * @prop {number} [status_id] Internal DH status identifier
     * @prop {string} [status_label]
     * @prop {string[]} [add_tags]
     * @prop {string} [hypothesis_title]
     * @prop {string} [designset_title]
     * @prop {number} [visibility]
     * @prop {string} [source_system]
     * @prop {{[key: string]: string|number|NumberWithModifier}} [raw_data] - compound properties (assay data)
     *
     * @typedef {Object} NumberWithModifier
     * @prop {number} value
     * @prop {string} modifier
     *
     * @typedef PluginSettings
     * @prop {string} label
     * @prop {'boolean'|'number'|'enum'|'multienum'|'project'|'text'|'objectenum'|'objectmultienum'} type
     * @prop {string[]|number[]|{id: string, label: string, category?: string}[]} [values]
     * @prop {string|number|boolean} [default]
     * @prop {number} [min]
     * @prop {number} [max]
     *
     * @typedef ImportInitContext
     * @prop {string} domain
     * @prop {Logger} logger
     * @prop {import("node-schedule").schedule} schedule
     * @prop {StoreCallback} storeRawData
     *
     * @typedef Logger
     * @prop {function(...any): void} debug
     * @prop {function(...any): void} info
     * @prop {function(...any): void} warn
     * @prop {function(...any): void} error
     *
     * @typedef {function(RawRecord[]): Promise<number>} StoreCallback
     *
     * @typedef GetSettingsContext
     * @prop {User} user
     *
     * @typedef RunImportContext
     * @prop {User} user
     * @prop {PluginConfiguration} settings
     * @prop {string} domain
     * @prop {StoreCallback} storeRawData
     *
     * @typedef User
     * @prop {string} userName
     * @prop {any} tokens OIDC TokenSet
     *
     * @typedef {any} PluginConfiguration Project is DH internal project ID
     *
     * @typedef ConfigurationValues
     * @prop {{[key: string]: string}} secrets
     */
    
    /**
     * @this {ImportInitContext}
     */
    function init() {
      //store the logger instance
    
      //set up the cron job: second 0 of minutes 0 and 30 of every hour
      const job = this.schedule.scheduleJob("0 0,30 * * * *", runImport.bind(this));
    }
    
    /**
     * @this {RunImportContext}
     * @returns {Promise<{ successCount: number }>}
     */
    async function runImport() {
      //obtain data
      //transform data to records
      /** @type {RawRecord[]} */
      const records = [];
    
      //submit records to DH API
      const successCount = await this.storeRawData(records);
    
      return { successCount };
    }
    
    /**
     * @this {CannotProcessContext}
     * @param {string[]} externalIds
     */
    async function cannotProcess(externalIds) {
      console.log("Cannot import IDs", externalIds);
    }
    
    /**
     * Store and use values provided by Admin interface's Secret manager
     * @param {ConfigurationValues} config
     */
    function onConfigurationChanged(config) {
      console.log("plugin-name configuration", config.secrets);
    }
    
    module.exports = {
      name: "automatic-plugin-name",
      label: "Plugin Label",
      init: init,
      runImport: runImport,
      cannotProcess: cannotProcess,
      domains: ["*"],
      onConfigurationChanged: onConfigurationChanged
    };
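
    Both skeletons leave the submission step as a single storeRawData call. When an external source returns many records at once, a plugin may prefer to submit them in smaller chunks. The helper below is a sketch of that idea only: storeInChunks and its default chunk size of 100 are hypothetical and unrelated to the importProcessBatchSize option.

```javascript
"use strict";

/**
 * Hypothetical helper: submit records in fixed-size chunks and add up the
 * counts returned by the store callback.
 * @param {function(object[]): Promise<number>} storeRawData
 * @param {object[]} records
 * @param {number} [chunkSize]
 * @returns {Promise<number>} total number of stored records
 */
async function storeInChunks(storeRawData, records, chunkSize = 100) {
  let successCount = 0;
  for (let i = 0; i < records.length; i += chunkSize) {
    // Chunks are sent one after another, not in parallel.
    successCount += await storeRawData(records.slice(i, i + chunkSize));
  }
  return successCount;
}
```

    Inside runImport this could be called as storeInChunks(this.storeRawData, records), with the result returned as successCount.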