Migration Guide

from JChem Oracle Cartridge to JChem PostgreSQL Cartridge

Here we describe a basic migration guide for transferring an Oracle table storing chemical structures indexed with jc_idxtype index of JChem Oracle Cartridge into PostgreSQL database and making it usable for JChem PostgreSQL Cartridge.

Presumptions

Oracle side

Your Oracle table to be migrated is named COMPOUND_STRUCTURES (in our example it is the compound table of the ChEMBL database).

It has the following columns:


MOLREGNO NUMBER (9,0)
MOLFILE CLOB
STANDARD_INCHI VARCHAR2(4000 BYTE)
STANDARD_INCHI_KEY VARCHAR2(27 BYTE)
CANONICAL_SMILES VARCHAR2(4000 BYTE)

The jc_idxtype index created was created on MOLFILE column.

This index has a parameter relating standardization:


std_conf:sql=select setting_val from
Chemaxon_settings where abbrev = ''STD_CONFIG''

This setting_val has the following value:


<StandardizerConfiguration>
<Actions>
<RemoveExplicitH ID="RemoveExplicitH"/>
<Aromatize ID="Aromatize" Type="general"/>
</Actions>
</StandardizerConfiguration>

PostgreSQL side

The database is PostgreSQL or Enterprise DB (EDB).

The Cartridge application server can run on the same or on a separate machine.

Migration steps

Oracle side

Export the data from the Oracle table into csv file using Oracle SQL Developer (required minimum version number 18.4)
1. Right click on the table to be exported, and select Export...
2. Export DDL is not needed, only Export Data to csv.
  
  Name the exported file as TableMigration.csv
3. Run the export
Copy file TableMigration.csv to machine of the PostgreSQL server into the /tmp/ folder. The user postgres must have read right on that file.

PostgreSQL side

Install JChem PostgreSQL Cartridge
1. Install the jchem-psql-db-server ( jchem-psql-edb-db-server) package on the machine with the PostgreSQL server (EDB server).
2. Install the jchem-psql-app-server (jchem-psql-edb-app-server) package on the appropriate machine
  
  Alternatively the package jchem-psql (jchem-psql-edb) containing the full installation can be installed if the JChem Service will run on the same machine where the database server is running.
Create a molecule type file according the standardization set at Oracle cartridge index creation
1. Make a copy from the file /etc/chemaxon/types/sample.type and name it /etc/chemaxon/types/example.type
2. In example.type file set
  
  typeID=2
  
  standardizerAction = aromatize:g..removeexplicitH
3. Take care, the file must have the same privileges as sample.type has.
Init jchem-psql service

sudo service jchem-psql init
Start jchem-psql service

sudo service jchem-psql start
Connect with psql to the database as user postgres. Step into the database where the cartridge was installed (where the CREATE EXTENSION steps were executed during the installation).
Create table in PostgreSQL with molecule type column for the chemical structures
```
psql:>
CREATE TABLE COMPOUND_STRUCTURES
( MOLREGNO numeric,
MOLFILE molecule('example'),
STANDARD_INCHI VARCHAR(4000),
STANDARD_INCHI_KEY VARCHAR(27),
CANONICAL_SMILES VARCHAR(4000) );
```
{warning} if the original table in Oracle contained CLOB type columns other than the molecular column then the corresponding column in PostgreSQL should have type TEXT.

Import the data from the csv file into that table


psql:> COPY COMPOUND_STRUCTURES (MOLREGNO,MOLFILE,STANDARD_INCHI,STANDARD_INCHI_KEY, CANONICAL_SMILES) FROM '/tmp/TableMigration.csv' DELIMITER ',' CSV HEADER;

Create index on the chemical structure column


psql:> CREATE INDEX jpc_idx1 on compound_structures USING sortedchemindex(molfile);

Test chemical structure searching


psql:> SELECT COUNT(*) FROM compound_structures WHERE 'Cc1ccccc1' |<| molfile;

PostgreSQL side if the chemical structures are already filled in a CLOB or TEXT column of a table

Same as above
Same as above
Same as above
Same as above
Same as above

Rename the column storing the chemical structures


psql:> ALTER TABLE compound_structures RENAME molfile TO molfile_original;

Add a new, molecule type column with example subtype to the table


psql:> ALTER TABLE compound_structures ADD COLUMN molfile molecule('example');

Copy the chemical structures by casting to molecule type to the newly created molecule type column
```
psql:> UPDATE compound_structures SET molfile=molfile_original::molecule('example');
```

Create index on the chemical structure column


psql:> CREATE INDEX jpc_idx1 on compound_structures USING sortedchemindex(molfile);

Test chemical structure searching


psql:> SELECT COUNT(*) FROM compound_structures WHERE 'Cc1ccccc1' |<| molfile;

Optional. Remove the column storing the original chemical structures if they won’t be needed anymore
```
psql:> ALTER TABLE compound_structures DROP COLUMN molfile_original;
```
Alternatively,

instead of steps 6, 7, 8, and 11, change the type of the original CLOB or TEXT column to molecule type.

Be careful, the original chemical structures will be lost.


ALTER TABLE compound_structures ALTER COLUMN molfile TYPE molecule('example') USING molfile::molecule('sample');

Migration demo

See the video demonstrating migration from JChem Oracle Cartridge to JChem PostgreSQL Cartridge.