
Architecture

Overview

The DSClient is designed to receive and store pre-formatted registration data, which a Chemaxon Compound Registration (CompReg) system publishes as messages.
Messages can be received either by stateless HTTP polling (recommended) or through an ActiveMQ queue.

Messages are automatically triggered by the following four actions:

  1. Registering a new compound
  2. Amending an existing compound — editing the compound data
  3. Updating the layout of a compound — changing the compound itself
  4. Deleting a compound

ActiveMQ Push Messaging

CompReg can directly push registration messages to an ActiveMQ queue, REGISTRATION.DS, which can be consumed by the DSClient.
This approach requires a dedicated ActiveMQ broker, which communicates over the OpenWire protocol, typically on port 61616, with optional authentication.


[Figure: DSClient Architecture]

ActiveMQ authentication

Anonymous access to the ActiveMQ broker should be used only within a secure container-based network.
In all other scenarios, configure a username and password for the JMS broker.
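
The rule above can be expressed as a small validation step. The following is a minimal sketch, not the DSClient's actual implementation: the `BrokerSettings` class and its field names are hypothetical, and only the `tcp://` OpenWire URI scheme and the typical port 61616 come from ActiveMQ conventions and the text above.

```python
from dataclasses import dataclass
from typing import Optional

DEFAULT_OPENWIRE_PORT = 61616  # typical OpenWire port named in the text


@dataclass(frozen=True)
class BrokerSettings:
    """Hypothetical container for broker connection settings."""
    host: str
    port: int = DEFAULT_OPENWIRE_PORT
    username: Optional[str] = None
    password: Optional[str] = None
    trusted_network: bool = False  # True only inside a secure container network

    def url(self) -> str:
        # ActiveMQ OpenWire transports are addressed with the tcp:// scheme.
        return f"tcp://{self.host}:{self.port}"

    def validate(self) -> None:
        # Anonymous access is accepted only when the network is trusted.
        has_credentials = bool(self.username) and bool(self.password)
        if not has_credentials and not self.trusted_network:
            raise ValueError(
                "broker credentials are required outside a trusted network"
            )
```

Calling `validate()` before connecting makes a missing username or password fail early rather than at the first message.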

HTTP polling

The DSClient can poll CompReg directly over HTTP for broadcast messages.
Compared with ActiveMQ, HTTP polling needs fewer components and a simpler configuration, and it provides better error handling, audit trails in both CompReg and the DSClient, and fine-grained control over throughput.
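
To illustrate how polling gives control over throughput, here is a minimal sketch of a paginated polling loop. It is an illustration under stated assumptions, not the DSClient's code: `fetch_page` and `handle` are hypothetical callables supplied by the caller, and messages are assumed to be dictionaries carrying a numeric `id`. The `page_limit` and `frequency_seconds` parameters play the roles of the RegDsClientHttpPaginationLimit and RegDsClientHttpPollingFrequencySeconds settings.

```python
import time
from typing import Callable, Dict, List

Message = Dict[str, object]


def poll_once(fetch_page: Callable[[int, int], List[Message]],
              handle: Callable[[Message], None],
              last_id: int,
              page_limit: int) -> int:
    """Drain pending messages page by page and return the last ID seen."""
    while True:
        # fetch_page(after_id, limit) is assumed to return messages with
        # IDs greater than after_id, in ascending order.
        page = fetch_page(last_id, page_limit)
        for message in page:
            handle(message)
            last_id = int(message["id"])
        if len(page) < page_limit:  # a short page means nothing is left
            return last_id


def poll_forever(fetch_page: Callable[[int, int], List[Message]],
                 handle: Callable[[Message], None],
                 page_limit: int,
                 frequency_seconds: float) -> None:
    """Run poll_once repeatedly, sleeping between polling cycles."""
    last_id = 0
    while True:
        last_id = poll_once(fetch_page, handle, last_id, page_limit)
        time.sleep(frequency_seconds)
```

A smaller page limit or a longer polling interval reduces load on CompReg at the cost of higher latency.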


[Figure: DSClient Architecture]

CompReg Downstream API Documentation

The CompReg system utilizes a REST API for accessing registration data.
The REST API can be used independently of the push mechanisms.
The DSClient only uses the /downstream/messages/ endpoint for retrieving data from CompReg.
However, the additional downstream endpoints can be used for manual testing/data integrity verification.
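
For manual testing, the request URL for the messages endpoint can be assembled with the standard library. This is only a sketch: the host, the base path, and any query parameter names are assumptions for illustration and should be checked against the CompReg Downstream API documentation. Only the `/downstream/messages/` path comes from the text above.

```python
from urllib.parse import urlencode, urljoin

MESSAGES_PATH = "downstream/messages/"  # endpoint path named in the text


def messages_url(base_url: str, **query: object) -> str:
    """Join a base URL with the messages path and optional query string."""
    # urljoin replaces the last path segment unless the base ends with "/".
    base = base_url if base_url.endswith("/") else base_url + "/"
    url = urljoin(base, MESSAGES_PATH)
    return f"{url}?{urlencode(query)}" if query else url


# Hypothetical host and base path, for illustration only.
print(messages_url("https://compreg.example.com/api"))
```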

Database

The DSClient currently supports PostgreSQL, Oracle, MySQL and SQLite databases.
The downstream database schema is a snapshot of the CompReg registration tree structure, organized in a non-normalized format that mirrors the CompReg database architecture.

Key Relationships

The schema follows this relationship hierarchy:

flowchart LR
    STRUCTURE --> PARENT --> VERSION --> PREPARATION

Terminology Difference

In the CompReg system, the lowest level of the registration hierarchy is called a lot. In the DSClient downstream database schema, this same entity is a preparation.

flowchart LR
    STRUCTURE --> PARENT --> VERSION --> LOT
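
The hierarchy can be pictured as nested records. The sketch below is a simplified model for illustration, not the actual schema: the class and field names are hypothetical, and the one-to-many cardinality at each level is an assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Preparation:
    """Lowest level of the tree; called a "lot" in CompReg."""
    preparation_id: str


@dataclass
class Version:
    version_id: str
    preparations: List[Preparation] = field(default_factory=list)


@dataclass
class Parent:
    parent_id: str
    versions: List[Version] = field(default_factory=list)


@dataclass
class Structure:
    structure_id: str
    parents: List[Parent] = field(default_factory=list)


def count_preparations(structure: Structure) -> int:
    """Walk STRUCTURE -> PARENT -> VERSION and count the preparations."""
    return sum(
        len(version.preparations)
        for parent in structure.parents
        for version in parent.versions
    )
```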

Schema Overview

The database schema consists of the following main table groups. For a detailed description of each group, see the Database Schema documentation.

Core Structure Tables

Structure Storage

The structure table stores all parent and salt/solvate modifier structures (separately) in JChem structure table format. The structure files are stored in the cd_structure CLOB column and can be searched using JChem libraries.
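
As a rough illustration of reading a stored structure, the sketch below uses an in-memory SQLite table as a stand-in for the real one. Only the table name `structure` and the `cd_structure` column come from the description above; the `cd_id` key column and the sample value are assumptions. Chemical searching itself requires the JChem libraries and is not shown.

```python
import sqlite3
from typing import Optional


def make_demo_db() -> sqlite3.Connection:
    """Create a stand-in table; the real schema has more columns."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE structure (cd_id INTEGER PRIMARY KEY, cd_structure TEXT)"
    )
    # Sample row: a SMILES string standing in for a structure file.
    conn.execute("INSERT INTO structure (cd_id, cd_structure) VALUES (1, 'CCO')")
    return conn


def load_structure(conn: sqlite3.Connection, cd_id: int) -> Optional[str]:
    """Return the stored structure source for one row, or None if absent."""
    row = conn.execute(
        "SELECT cd_structure FROM structure WHERE cd_id = ?", (cd_id,)
    ).fetchone()
    return row[0] if row else None
```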

Registration Tree Tables

Supporting Tables

System Tables

Schema Modifications

The database schema is statically defined. Any changes to database architecture, table or column names, data types, or workflows will require code changes and a new build of the custom DSClient.

Detailed Schema Documentation

For detailed information about each table, including column definitions, relationships, and data types, see the Database Schema documentation.


Configuration

Settings are stored in flat key-value configuration files:

  • CompReg: registry.properties
  • DSClient: registry.dsclient.properties
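
A flat key-value file of this kind can be read with a few lines of code. The parser below is a simplified sketch that assumes one `key=value` pair per line; it does not handle the full Java properties syntax (for example `:` separators or line continuations). The sample values are illustrative only.

```python
from typing import Dict


def parse_properties(text: str) -> Dict[str, str]:
    """Parse simple key=value lines, skipping blanks and comments."""
    settings: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith(("#", "!")):
            continue  # blank line or comment
        key, separator, value = line.partition("=")
        if separator:  # ignore lines that have no "="
            settings[key.strip()] = value.strip()
    return settings


# Illustrative content for registry.dsclient.properties (values are made up).
sample = """
# communication settings
RegDSClientCommunicationType=HTTP
RegDsClientHttpPollingFrequencySeconds = 30
"""
settings = parse_properties(sample)
print(settings["RegDSClientCommunicationType"])
```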

Configuration Variables

CompReg

Variable                                Description
RegDBType                               Database type (e.g., PostgreSQL)
RegDBDriver                             JDBC driver class (e.g., org.postgresql.Driver)
RegDBUrl                                Database connection URL
RegDBUser                               Database username
RegDBPass                               Database password
RegDBMaxActive                          Maximum number of active database connections
RegDBValidationQuery                    SQL query to validate connections (e.g., SELECT 1)
RegDownstreamMode                       Downstream mode (e.g., Database)
RegDownstreamPublishEnabled             Enable/disable downstream publishing (true/false)
RegDownstreamFusedImageFormat           Format for fused structure images (e.g., mol:V3)
CHEMAXON_LICENSE_URL                    URL to the Chemaxon license server

DSClient

Variable                                Description
REGISTRYCXN_DSCLIENT_HOME               Home directory for DSClient
RegDSDBType                             Downstream database type (e.g., PostgreSQL)
RegDSDBDriver                           JDBC driver class for downstream database
RegDSDBUrl                              Downstream database connection URL
RegDSDBUser                             Downstream database username
RegDSDBPass                             Downstream database password
RegDSDBMaxActive                        Maximum number of active downstream database connections
RegDSDBValidationQuery                  SQL query to validate downstream connections
RegDSClientCommunicationType            Communication type (HTTP or JMS)
RegDsClientHttpCompRegHost              CompReg host URL (for HTTP communication)
RegDsClientHttpCompRegClientId          Client ID for HTTP authentication (must match created client)
RegDsClientHttpCompRegClientSecret      Client secret for HTTP authentication (must match created secret)
RegDsClientHttpCompRegUser              CompReg user for HTTP authentication
RegDsClientHttpPaginationLimit          Number of records per page for HTTP polling
RegDsClientHttpPollingFrequencySeconds  Polling frequency in seconds for HTTP communication
RegDSClientStrictConsistency            Enables strict consistency mode (default: false). When true, the DSClient validates that messages arrive in consecutive order with no missing IDs, and any processing error causes a rollback and stops processing entirely. When false, failed messages are skipped and saved to the failed message table.
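
The difference between the two consistency modes can be sketched as follows. This is an illustration of the behavior described for RegDSClientStrictConsistency, not the DSClient's implementation: `process_batch` and `apply` are hypothetical names, messages are assumed to carry a numeric `id`, and the returned list stands in for the failed message table.

```python
from typing import Callable, Dict, List, Tuple

Message = Dict[str, object]


def process_batch(messages: List[Message],
                  apply: Callable[[Message], None],
                  last_id: int,
                  strict: bool) -> Tuple[int, List[Message]]:
    """Apply messages in order; return the last ID seen and the failures."""
    failed: List[Message] = []
    for message in messages:
        message_id = int(message["id"])
        # Strict mode: IDs must be consecutive with no gaps.
        if strict and message_id != last_id + 1:
            raise RuntimeError(
                f"expected message {last_id + 1}, got {message_id}"
            )
        try:
            apply(message)
        except Exception:
            if strict:
                raise  # the caller is expected to roll back and stop
            failed.append(message)  # lenient mode: record and continue
        last_id = message_id
    return last_id, failed
```

In lenient mode a single bad message does not block the ones behind it, while strict mode trades availability for a guarantee that the downstream database never skips a message.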