Page tree

FAQ

How to check/change the host Oracle calls back to

SELECT * FROM <CHORAL_OWNER>.choral_settings; 

Configuration

How to find problematic structures during indexing or adding structures to an indexed table

The error message will contain a key (e.g.):

“Unable to import molecule with key 1101010960388”

Run this query (be careful, the table owner and name are in uppercase):

SELECT * FROM <table_name> WHERE rowid=(SELECT long_to_rowid(data_object_id,<key>) FROM ALL_OBJECTS 
	WHERE object_name='<TABLE_NAME>' AND owner='<TABLE_OWNER>');

How to check the progress of creating an index

SELECT * FROM v$session_longops WHERE sofar<totalwork ORDER BY start_time DESC;

How to check the execution plan of an sql statement

EXPLAIN PLAN

It is advised always to check your select sql statement by generating its explain plan.

How to import SD file into Oracle database

Prerequisites

  • sdf file (example.sdf)
  • Oracle DB access with table creation grants
  • sqlldr utility from Oracle Client installation

Import steps

The following example will import the example.sdf file records into the mol column. Note that the fields of the sdf records are not separated and imported into separate columns.

  1. Create table in the database (this table will be used to import the data into):

    CREATE TABLE testtable (
    id NUMBER GENERATED BY DEFAULT AS IDENTITY,
    mol CLOB NOT NULL);
  2. Create an sqlldr control file with the following content.

    sdfload.ctl:
    load data
    infile 'example.sdf' "str '$$$$\n'"
    into table testtable
    fields terminated by '$$$$'
    (mol char(10000))
  3. Load the sdf file from command line with sqlldr command (the command will ask you username and password to access the database). The sqlldr command should be executed in the directory where the example.sdf and sdfload.ctl are located.

    sqlldr CONTROL=sdfload.ctl LOG=/tmp/test.log BAD=/tmp/test.bad

Result

As a result of the previous operations you will have a table with primary keys in id column and the molecules with additional fields (which are not extracted) in the mol column.

Known Issues

Search result inconsistency

Inconsistent search results might be received if during the run of the search some modifications (insert, update or delete)  of stored chemical structures of the chemical structure table(s) affected by the search are committed by a different user than the searcher one.

Order by inconsistency of uncommitted structures

Uncommitted chemical structures are always at the end of the order by queries in index mode.

Slow indexing

Indexing can be slow due to not doing calculations in parallel in special cases. If the indexed table is not owned by the user who issued the CREATE INDEX command and the indexing user don’t have select role with grant option on the indexed table or doesn’t have the create view role and the cartridge owner doesn’t have select role on the indexed table.

Order by inconsistency at similarity search in function mode

If the Oracle Cost Based Optimizer chooses not to use the index at similarity search, ordering based on similarity relevance will be corrupted.

Slow calculation of chemical terms

Calculation of chemical terms for great number of structures might be slow.  

Cost estimation for query structures in CLOB

Cost estimation does not work for query structures in CLOB.

The following workarounds are suggested to convert the CLOB query to VARCHAR2 and run the search (then cost estimation will work):

Workaround 1

CREATE OR REPLACE FUNCTION my_search(query CLOB) RETURN NUMBER AS
    query_v VARCHAR2(32767);
    result NUMBER;
BEGIN
    IF dbms_lob.getlength(query) <= 32767 THEN
        query_v := dbms_lob.substr(query, dbms_lob.getlength(query), 1);
        SELECT COUNT(*) INTO result FROM test WHERE general_search(mol,  query_v, 'SUBSTRUCTURE') = 1;
    ELSE     
        SELECT COUNT(*) INTO result FROM test WHERE general_search(mol,  query, 'SUBSTRUCTURE') = 1;
    END IF;
    RETURN result;
END;
/

SELECT my_search(TO_CLOB('CC')) FROM DUAL;

Workaround 2

CREATE OR REPLACE FUNCTION my_search2(table_name VARCHAR2, query CLOB, strategy VARCHAR2) RETURN NUMBER AS
    query_v VARCHAR2(32767);
    result NUMBER;
BEGIN
    IF (dbms_lob.getlength(query) <= 32767) then
        query_v := dbms_lob.substr(query, dbms_lob.getlength(query), 1);
        EXECUTE IMMEDIATE 'SELECT COUNT(*) FROM ' || table_name || '  WHERE general_search(mol, ''' || query_v || ''', ''' || strategy || ''') = 1' INTO result;
    ELSE     
        EXECUTE IMMEDIATE 'SELECT COUNT(*) FROM ' || table_name || '  WHERE general_search(mol, ''' || query || ''', ''' || strategy || ''') = 1' INTO result;
    END IF;
    RETURN result;
END;
/

SELECT my_search2('test', TO_CLOB('C'), 'SUBSTRUCTURE') FROM DUAL;
SELECT my_search2('test', TO_CLOB('C'), 'DUPLICATE') FROM DUAL;
SELECT my_search2('test', (SELECT mol FROM test WHERE id = 4), 'SUBSTRUCTURE') FROM DUAL;

Cost estimation can not be calibrated automatically yet

Cost estimation automatically runs for a given environment using predefined values. It is not possible to run an automated calibration for it  to adjust to the custom environment. However, cost estimation factors can be adjusted manually and thus the plan selected for chemical search by Oracle can be influenced. See how to make calibration settings manually.

No more data to read from socket error

After an apparently successful installation and index creation, the search processes can throw No more data read from socket error.

Ocassionally, the error appears also at the end of the installation process.

At the moment, the only viable workaround is repeating the installation.