Type of search

From version 20.8.0-2005111440, the "Exact" match name is replaced by "Duplicate" to align with JChem terminology, and "2D match" is replaced by "Stereo matches".

Three types of structure search are currently available and can be selected from the Search type drop-down list:

Duplicate search

  • Search for structures identical to the query structure, including stereochemistry (and atropisomers, if configured), isotopic forms, and charged forms. The structure is treated as a complete entity: all of its atoms and bonds must be identical in the retrieved compound. Stereo and tautomer information can be ignored by using the corresponding check-boxes. See the Match type section for more details.

Identical structures that contain CSTs are also included in the Duplicate search results.

Substructure search

  • Search for structures in which the query structure is embedded. For single compounds, all versions are listed, including those with different isotopic or charged states and those containing a salt/solvate.

A substructure search also returns multi-component compounds if the query structure is a component of the multi-component compound. For example, a substructure search for the piperazine structure returns the lots of a mixture, of an alternate, and of a single compound.

If the component structure is present in the DB without any single-component lot, only the multi-component compound is listed in the search results. For example, a substructure search for 4-aminocyclohexan-1-ol returns only an alternate if 4-aminocyclohexan-1-ol is not present as a lot in the DB.

To search for CST-only records (compounds that have no chemical structure, only a CST) on the Search page using a Duplicate (formerly Exact) or Substructure search, put a star atom in the Marvin structure editor. If the structure editor contains a star atom, the CST-only records are listed (currently only for single compounds).

Similarity search

  • Corresponds to the similarity search in JChem Base.

  • Search for structures that are "similar" to the query structure. The distance between the query and target structures is calculated from the chemical hashed fingerprints generated in the JChem table (these are also used for the screening part of the substructure search). The metric is currently set to the default, Tanimoto. The similarity threshold can be provided when this search type is selected; a decimal value between 0 and 1 is accepted. The 2D and Tautomer search option checkboxes are disabled for this search type.

  • The search returns all structures whose similarity is higher than the specified threshold. By default, the hits are listed in order of decreasing similarity (most similar at the top). See Figure S1. Note that the displayed value is a similarity level (not a dissimilarity), so 1.0 stands for a structure that is identical to the query.


Figure S1. Results of similarity search for a query structure

The results are always listed in order of increasing dissimilarity (although the dissimilarity values are not displayed).
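
The thresholding and ordering described above can be illustrated with a minimal Python sketch. It is not the JChem implementation: fingerprints are modelled here simply as sets of "on" bit positions, and the function names (tanimoto, similarity_search), the toy fingerprints, and the treatment of two empty fingerprints are assumptions made for illustration only.

```python
from typing import Dict, List, Set, Tuple


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto similarity of two fingerprints given as sets of on-bit positions."""
    if not a and not b:
        # Assumed convention for this sketch: two empty fingerprints are identical.
        return 1.0
    common = len(a & b)
    return common / (len(a) + len(b) - common)


def similarity_search(
    query: Set[int], targets: Dict[str, Set[int]], threshold: float
) -> List[Tuple[str, float]]:
    """Return (name, similarity) pairs above the threshold, most similar first."""
    scored = [(name, tanimoto(query, fp)) for name, fp in targets.items()]
    # Keep only hits whose similarity is strictly higher than the threshold.
    hits = [(name, sim) for name, sim in scored if sim > threshold]
    # Decreasing similarity is the same order as increasing dissimilarity (1 - sim).
    hits.sort(key=lambda hit: hit[1], reverse=True)
    return hits


# Hypothetical toy fingerprints, not real chemical hashed fingerprints.
query_fp = {1, 2, 3, 4}
target_fps = {
    "identical": {1, 2, 3, 4},
    "close": {1, 2, 3, 5},
    "distant": {1, 9},
}

for name, sim in similarity_search(query_fp, target_fps, threshold=0.5):
    print(f"{name}: similarity={sim:.2f}, dissimilarity={1 - sim:.2f}")
```

In this sketch the "distant" target shares one bit out of five distinct bits with the query (similarity 0.2), so it falls below the 0.5 threshold and is not returned, while the identical fingerprint scores 1.0 and is listed first.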