JChem PostgreSQL Cartridge FAQ and Known Issues

    FAQ

    How to cut off data fields from structures stored in sdf format

    In case you want to remove the data fields from the structures stored in sdf format, the following UPDATE statement can be helpful:

    UPDATE <table> SET <structure_column> = substring(<structure_column>::text,'^.*M\ \ END')::molecule('molecule_subtype');

    Example:

    UPDATE mytable SET mol = substring(mol::text,'^.*M\ \ END')::molecule('sample');

    How to apply different business rules on the same chemical structure dataset

    The following business rules can be defined in the molecule type files.

    • standardizer configuration
    • assuming absolute or relative stereochemistry
    • switching tautomer search on or off
    • fingerprint configuration

    If you want more than one combination of the above listed business rule parameters be available - for example you want to run searches with and without tautomer search - you have to create separate table columns with different molecule types. Unfortunately, this means duplication, multiplication of the stored chemical structure data.

    How to install extensions to different schema than public

    By default, the extensions chemaxon_type and chemaxon_framework are installed into the PUBLIC schema of the PostgreSQL database. However, there can be cases when these extensions must be installed into a different schema. For example, if there is an object - like a table - named to molecule already present in the PUBLIC schema.

    The following steps should be executed:

    CREATE SCHEMA chemaxon;
    CREATE EXTENSION chemaxon_type WITH SCHEMA chemaxon;
    CREATE EXTENSION chemaxon_framework WITH SCHEMA chemaxon;
    CREATE EXTENSION hstore;  -- this extension does not need to be installed into the chemaxon schema

    In order to make searches able to use these extensions, search_path must be modified.

    Session level modification:

    SET search_path TO chemaxon,public;

    Database level modification:

    ALTER DATABASE <db_name> SET search_path TO chemaxon,public;

    See further possibilities in PostgreSQL documentation about schema search path.

    How to install HSTORE extension

    JPC can only be successfully installed if the

    CREATE EXTENSION HSTORE; 

    step of the installation process can be executed.

    HSTORE extension is part of the contrib package of PostgreSQL database (for example postgresql-contrib-12), so it is necassary to have the contrib package installed.

    How to check invalid structures

    1. Use the is_valid_molecule method. Note, there are some structures which are qualified as valid, but cannot be indexed by the cartridge.

    2. Check the logs on the server in /var/log/jchem-psql/. The log level can be set in /etc/chemaxon/jpc-log4j.xml, see documentation.

      • INFO level log files contain only the error.
      • DEBUG level log files contain the the error and the structure as well.
    3. In the client log when running insert scripts, the insert is successful, but warnings appear:

      INSERT INTO <table_name>

      Warnings: <...> Invalid molecule source: <error> See PostgreSQL documentation .

    4. By this query (preferably on an indexed table):

      select * from test except select * from test where '*'|<|mol;

    Can I use JChem PostgreSQL Cartridge in cloud environment?

    In Amazon environment, JChem PostgreSQL Cartridge requires a PostreSQL database installed on an EC2 instance, it does not work with Amazon PosgreSQL RDS.
    Two setups are possible:

    1. installing JPC on the same EC2 where the database is
    2. installing JPC's DB-server on the same EC2 where the database is and installing JPC's App-server on a separate EC2 instance


    JPC is tested in Amazon environment. Other cloud provider's environment was not tested, but we see no reason why they shouldn't work.

    How to identify the molecule stopping the create index process?

    If the create index process is not finished sucessfully, you can check the log file /var/log/jchem-psql/stderr.log and look for the characters label=> and copy the number following label=>.

    If this number is - for example - 8589934592, the row where the molecule stopping the create index process is present can be identified with the following SQL statement:

    SELECT *        
    FROM <tablename> where (ctid::text::point)[0]::bigint + 
     (ctid::text::point)[1]::bigint * POWER(2,32) = 8589934592;

    How to solve Molecule set with name <number> does not exist error?

    Reindexing is needed.

    drop index <indexname>
    update table_name set structure_column_name = structure_column_name::text::molecule('molecule_type_name');
    create index ...

    How to migrate JPC database?

    1. install PostgreSQL in the new environment
    2. run postgresql service in the new environment
    3. install JPC in the new environment. Configure it (memory, etc.), add the same type files as the ones in the old environment
    4. Initialize and run jchem-psql service in the new environment, create the chemaxon_type, hstore and chemaxon_framework extensions
    5. dump the old database - for example by pg_dump. Exporting into csv is not an applicable method!
    6. restore the database from the dump in the new environment; reindexing is automatically run during the restore process, which can significantly lengthen the restore process

    Known Issues

    False similarity search results in the case of molecule types with tautomer=GENERIC parameter

    In the case molecule types with tautomer=GENERIC parameter, similarity search gives false results. These is no workaround at the moment, please do not execute similarity search in structure columns having molecule type with tautomer=GENERIC parameter. From version 21.9.0 similarity search works correctly even in the case of molecule types with tautomer=GENERIC parameter.

    Drop index doesn't drop the index on the server if there were no other JPC calls before

    If you log in to a database and immediately call drop index, e.g.

    drop index <index_name>;

    the JPC-specific call won't be executed. This means that the index data won't be cleared up in the JChem Server. The following workarounds are recommended: