Getting Started JChem PostgreSQL Cartridge

    {warning} This online page relates to the latest released JChem PostgreSQL Cartridge. Find the version specific documentation in your installed package in /opt/jchem-psql/doc/ directory.

    This guide will serve as a basic introduction to installing and using JChem PostgreSQL Cartridge (JPC).

    See also Installation and Administration and API Usage pages for detailed description of administration and API use cases.

    System requirement

    • CentOS, RedHat, Debian or Ubuntu operating system x86_64 version.

    Find information for cases of other linux systems

    • postgreSQL 9.4 relational database (preferred download page) for JPC versions up to 2.0

    • postgreSQL 9.5 relational database for JPC versions starting from 2.1

    See help to Upgrade of PostgreSQL database together with JPC

    • postgreSQL 10 relational database for JPC versions starting from 3.0

    • postgreSQL 12 relational database for JPC versions starting from 5.0

    • availability of hstore extension of postgresql (e.g., contrib package)

    • Java8 runtime environment. We support AdoptOpenJDK 8 as well. In the case of AdoptOpenJDK libfontconfig package must be included.

    Installation and Setup

    Install PostgreSQL Cartridge

    Download the latest version of PostgreSQL Cartridge from here. In order to install the latest version, change x.y (version information) in the following statement to the current one:

    In CentOS:

    
    sudo yum localinstall jchem-psql-x.y.x86_64.rpm

    In RedHat or Debian:

    
    sudo dpkg -i jchem-psql-x.y.x86_64.deb

    Before the first use

    1. Install your license file

      Copy a valid ChemAxon license to /etc/chemaxon/license.cxl. Required license in the license file is 'Postgres Cartridge'. The jchem-psql user should have read access to the license file.

    2. Initialize index directory

      
      sudo service jchem-psql init

      {primary} If you encounter a problem in your java installation, set your JAVA_HOME in the file /etc/default/jchem-psql.

    3. First start

      
      sudo service jchem-psql manual-start

    Using the extension in a database

    1. Create postgres user and database

      
      sudo su postgres
      createuser testuser
      createdb testdb -O testuser
    2. Install the extensions

      
      psql testdb
      testdb> CREATE EXTENSION chemaxon_type;
      testdb> CREATE EXTENSION hstore;
      testdb> CREATE EXTENSION chemaxon_framework;
    3. Checking JChem-psql cartridge installation

      If jchem-psql service is running, the following query can be executed without any problem:

      
      testdb> SELECT 'C'::Molecule('sample') |<| 'CC'::Molecule;

    Service

    The service can be started/stopped using

    
    sudo service jchem-psql manual-start
    sudo service jchem-psql stop

    By default, the service starts on system boot. This can be switched off by setting ENABLED=0 in the /etc/default/jchem-psql file.

    BASIC API Usage

    CREATE TABLE

    
    CREATE TABLE table_name (structure_column_name MOLECULE('molecule_type_name'));

    Example:

    
    CREATE TABLE ttest (mol MOLECULE('sample'));

    Import data

    • Import from sdf/mol file (located on your local machine):

      
       et variable_name `cat sdf_file_name`
      CREATE TABLE table_name AS 
          SELECT molSrc::molecule('molecule_type_name') AS (structure_column_name)[, props ->'sdf_field_name' AS column_name] FROM parse_sdf(:'variable_name');

      Example:

      
       et content `cat ~/a.sdf`
      CREATE TABLE mysdftable AS 
          SELECT molSrc::molecule('sample') AS mol,
              props -> 'Molformula' AS formula 
          FROM parse_sdf(:'content');
    • Import from local smiles/cxsmiles/smarts/cxsmarts files:

      
      \COPY table_name (structure_column_name) FROM 'file_name' (FORMAT csv);

      Example:

      
      \COPY ttest(mol) FROM '/home/posgresuser/targetfiles/nci-pubchem_1m_unique.smiles' (FORMAT csv);

    CREATE INDEX

    Indextype named chemindex has to be used when indexing a column that contains chemical structures.

    
    CREATE INDEX index_name ON table_name USING chemindex(structure_column_name);

    Example:

    
      CREATE INDEX ttest_idx ON ttest USING chemindex(mol);

    SELECT

    Substructure search

    
    SELECT * FROM table_name WHERE 'query_structure' |<| structure_column_name;

    Examples:

    
        SELECT * FROM ttest WHERE '<?xml version="1.0" encoding="windows-1250"?>
        <cml xmlns="http://www.chemaxon.com" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.chemaxon.com/marvin/help/formats/schema/mrvSchema_6_1_0.xsd" version="ChemAxon file format v6.1, generated by v6.1.3">
        <MDocument>
        <MChemicalStruct>
        <molecule molID="m1">
        <atomArray atomID="a1" elementType="C" x2="3.4100000858306885" y2="3.740000009536743"/>
        <bondArray/>
        </molecule>
        </MChemicalStruct>
        </MDocument>
        </cml>' |<| mol;
     
        SELECT * FROM ttest WHERE 'c1ccccc1' |<| mol;
     
        SELECT * FROM ttest WHERE '[#6]-[#6]' |<| mol;

    Full fragment (exact fragment) search

    
    SELECT * FROM table_name WHERE query_transform('query_structure', 'fullfragment') |<| structure_column_name;

    Example:

    
    SELECT * FROM ttest WHERE query_transform('CC', 'fullfragment') |<| mol;

    Superstructure search

    
    SELECT * FROM table_name WHERE 'query_structure' |>| structure_column_name;

    Example:

    
    SELECT * FROM ttest WHERE 'CCC' |>| mol;

    Duplicate search

    
    SELECT * FROM table_name WHERE 'query_structure' |=| structure_column_name;

    Example:

    
    SELECT * FROM ttest WHERE 'CCC' |=| mol;