Here we describe a basic migration guide for transferring an Oracle table storing chemical structures indexed with jc_idxtype index of JChem Oracle Cartridge into PostgreSQL database and making it usable for JChem PostgreSQL Cartridge.
Your Oracle table to be migrated is named COMPOUND_STRUCTURES (in our example it is the compound table of the ChEMBL database).
It has the following columns:
MOLREGNO NUMBER (9,0)
MOLFILE CLOB
STANDARD_INCHI VARCHAR2(4000 BYTE)
STANDARD_INCHI_KEY VARCHAR2(27 BYTE)
CANONICAL_SMILES VARCHAR2(4000 BYTE)
The jc_idxtype
index created was created on MOLFILE column.
This index has a parameter relating standardization:
std_conf:sql=select setting_val from
Chemaxon_settings where abbrev = ''STD_CONFIG''
This setting_val
has the following value:
<StandardizerConfiguration>
<Actions>
<RemoveExplicitH ID="RemoveExplicitH"/>
<Aromatize ID="Aromatize" Type="general"/>
</Actions>
</StandardizerConfiguration>
The database is PostgreSQL or Enterprise DB (EDB).
The Cartridge application server can run on the same or on a separate machine.
Export the data from the Oracle table into csv file using Oracle SQL Developer (required minimum version number 18.4)
Right click on the table to be exported, and select Export...
Export DDL is not needed, only Export Data to csv.
Name the exported file as TableMigration.csv
Run the export
Copy file TableMigration.csv
to machine of the PostgreSQL server into the /tmp/ folder. The user postgres must have read right on that file.
Install JChem PostgreSQL Cartridge
Install the jchem-psql-db-server ( jchem-psql-edb-db-server) package on the machine with the PostgreSQL server (EDB server).
Install the jchem-psql-app-server (jchem-psql-edb-app-server) package on the appropriate machine
Alternatively the package jchem-psql (jchem-psql-edb) containing the full installation can be installed if the JChem Service will run on the same machine where the database server is running.
Create a molecule type file according the standardization set at Oracle cartridge index creation
Make a copy from the file /etc/chemaxon/types/sample.type
and name it /etc/chemaxon/types/example.type
In example.type
file set
typeID=2
standardizerAction = aromatize:g..removeexplicitH
Take care, the file must have the same privileges as sample.type has.
sudo service jchem-psql init
Start jchem-psql service
sudo service jchem-psql start
Connect with psql to the database as user postgres. Step into the database where the cartridge was installed (where the CREATE EXTENSION steps were executed during the installation).
Create table in PostgreSQL with molecule type column for the chemical structures
psql:>
CREATE TABLE COMPOUND_STRUCTURES
( MOLREGNO numeric,
MOLFILE molecule('example'),
STANDARD_INCHI VARCHAR(4000),
STANDARD_INCHI_KEY VARCHAR(27),
CANONICAL_SMILES VARCHAR(4000) );
{warning} if the original table in Oracle contained CLOB type columns other than the molecular column then the corresponding column in PostgreSQL should have type TEXT.
Import the data from the csv file into that table
psql:> COPY COMPOUND_STRUCTURES (MOLREGNO,MOLFILE,STANDARD_INCHI,STANDARD_INCHI_KEY, CANONICAL_SMILES) FROM '/tmp/TableMigration.csv' DELIMITER ',' CSV HEADER;
Create index on the chemical structure column
psql:> CREATE INDEX jpc_idx1 on compound_structures USING sortedchemindex(molfile);
Test chemical structure searching
psql:> SELECT COUNT(*) FROM compound_structures WHERE 'Cc1ccccc1' |<| molfile;
Same as above
Same as above
Same as above
Same as above
Same as above
Rename the column storing the chemical structures
psql:> ALTER TABLE compound_structures RENAME molfile TO molfile_original;
Add a new, molecule type column with example subtype to the table
psql:> ALTER TABLE compound_structures ADD COLUMN molfile molecule('example');
Copy the chemical structures by casting to molecule type to the newly created molecule type column
psql:> UPDATE compound_structures SET molfile=molfile_original::molecule('example');
Create index on the chemical structure column
psql:> CREATE INDEX jpc_idx1 on compound_structures USING sortedchemindex(molfile);
Test chemical structure searching
psql:> SELECT COUNT(*) FROM compound_structures WHERE 'Cc1ccccc1' |<| molfile;
Optional. Remove the column storing the original chemical structures if they won’t be needed anymore
psql:> ALTER TABLE compound_structures DROP COLUMN molfile_original;
Alternatively,
instead of steps 6, 7, 8, and 11, change the type of the original CLOB or TEXT column to molecule type.
Be careful, the original chemical structures will be lost.
ALTER TABLE compound_structures ALTER COLUMN molfile TYPE molecule('example') USING molfile::molecule('sample');
See the video demonstrating migration from JChem Oracle Cartridge to JChem PostgreSQL Cartridge.