Definitions of Terms

    The list of terms with their definitions in the Chemaxon Compound Registration system:

    Additional Data

    Additional data are configurable, source-dependent data that either arrive with the compounds to be registered or are attached to them during registration. Stereo Comments, editable from the Registration and Submission pages, can be used as additional data. When enabled for the given source, Stereo Comments are recalculated on the Details page during each amendment process.

    A Comment can also be added, edited and saved for each level (parent, version, lot) from the Details page, whereas Project can be modified after registration only at the lot level.

    Additional data can be defined on the Administration page, where the source, the grouping of the data, the data type, validation rules, etc. can also be set.

    Alternate

    The Alternate is an abstract representation of uncertain structural information. An Alternate can be a list of possible chemical forms of a certain compound. The Alternate is a multi-component compound without quantitative composition information. For other multi-component types, see also formulation and mixture. No MF or MW is calculated for Alternates. An "UNKNOWN #" data item is added automatically by the registration system so that alternates can be registered under different parent compound Ids.

    Amendment

    Amendment is the process of modifying a registered compound. On the Details page, besides the chemical structure (with or without CST), the salts/solvates, the Restriction value, the Molweight, the LnbRef, and Additional Data can also be modified.

    Analyze Salt/Solvate

    Analyze Salt/Solvate is a procedure that automatically extracts salt/solvate fragments from a compound's chemical structure and replaces them with references to the corresponding records in the Salts and Solvates dictionary. It can be activated or deactivated as part of the Registration Options. The switch is on by default for some sources, and it can be applied manually to records during Advanced Registration and in the Staging area.

    Assigned/Unassigned Submission

    An Assigned submission is one that is being worked on by a specific user (e.g. a registrar) in order to be manually registered from the Staging area. Multiple submissions can be assigned to the same registrar. A submission that is not assigned to any user is considered Unassigned and can be freely worked on by any user.

    Audit

    A detailed history of the changes that have been made to a compound during amendment steps performed on the parent, version or lot level. The Audit will always contain the originally drawn structure too (the one provided for registration).

    Autoregistration

    The process of calling the registration service to register a compound automatically, based on a predefined configurable set of business rules. In case a submission cannot be registered automatically, it will fall to the Staging area and a user with corresponding privileges will need to review and manually register it.

    Autoregistrations can be performed either from the Registration page or from the Upload page.

    Bulk Upload

    The Bulk Upload is the process of registering multiple structures (possibly thousands) in one request. From the Upload page designated for this process, it is possible to register multiple compounds and also to add more salts and/or solvates (used as version details) and several additional data in one request to the DB.

    On the Upload page a file (preferably an SD file) can be loaded, then the type of the load can be chosen (compounds or salts/solvates). After the fields have been mapped properly, the upload can be started.

    During the upload, the registration process can be customized through custom structure checkers, structure fixers and registration options.

    Chemically Significant Text (CST)

    CST is text that is attached to the chemical structure and takes part in defining the structure uniqueness. Structure S-Groups as "NAME" are recognised and interpreted as CSTs by the system.

    CST can be attached either to a component, or to the whole structure.

    It is possible to register a record with CST without a chemical structure. Such a record is usually referred to as a "CST-only" or "No structure CST" record.

    Users with the appropriate privileged role can add CSTs to a dictionary. CSTs from the dictionary are considered "known" CSTs; they can be retrieved and used for compound registrations and amendments. CSTs that are not present in the dictionary are considered "unknown" CSTs, and their registration can be prevented by setting the appropriate Registration option.

    When compounds with CSTs are being registered (regardless of whether the CSTs are "known" or "unknown") and a Match/Suggestion list is shown (manual registration from the Advanced Registration page or from the Submission page of the Staging area), Similar CST matching is considered.

    ID

    The same as Submitter.

    Comment

    Comment is a configurable additional data field. It can arrive with the structure to be registered, be attached to the structure during registration, or be added or modified for registered compounds. After registration, Comment is stored on lot level, but on the Details page it can also be set for versions and parents.

    Compound

    The Compound is a proper representation of any chemical entity, including charges, isotopes, salts, solvates, etc. During registration, a parent compound is created for each Compound; it contains the so-called parent standardized form of the chemical structure. Compounds and parent compounds are in general referred to as structures.

    Compound Number (CN)

    The Compound Number (CN) is a unique identifier of the registered compound, which is either generated automatically by the registration system according to predefined rules, or it is inherited from an existing version in case of an exact compound match. It is also possible to specify the CN (just like the PCN) during the registration, which can be useful e.g. during migration of legacy data. The CN explicitly identifies a version. When it is generated, CN is usually derived from the parent compound number (PCN) according to customizable rules.

    Created by

    "Created by" refers to the user who started the registration procedure when a submission failed and ended up in the Staging area, or to the user who actually picks up the submission and registers it from the Staging area.

    It is possible that one user sends a submission to the Staging area, but another user actually registers it. In this case, for the not yet registered submission, the "Created by" column on the Submission correction page is already populated with the user who initiated the registration procedure. Once the submission is successfully registered, the "Created by" column on the Details page or on the Search page contains the name of the user who actually registered it.

    Details page

    The Details page is designed for viewing registered compounds in detail, with all their accompanying data, such as text (CST), salt/solvate info, registration IDs, MW, MF, Project info, Stereo Comments and general Comments. On the Details page the user can modify the compound and the additional data and save these changes in a so-called amendment process.

    Dictionaries

    Multiple Dictionaries can be added to the Registry database and populated; they can be used later during registration or amendment. Dictionaries, and also their items, can be searched, edited and deleted.

    By default, the Compound Registration includes five (local) dictionaries: Chem. Sig. Text (empty by default), Double bond panel, Geometric Isomerism, Stereocenter panel and Stereochemistry, which contain some sample items. Dictionaries can be accessed from the Administration page, Dictionary Manager tab.

    All the items present in the Chem. Sig. Text dictionary are considered "known" by the system. An item that has not been added to the dictionary is considered "unknown", and its registration can be prevented. A submission that falls to the Staging area with the "Unknown CST" status message can be registered if the appropriate switch is enabled.

    The items of the Stereochemistry and Geometric isomerism dictionaries are present in drop-down lists on the Registration and Submission pages of the application.

    The contents of the Stereocenter and Double Bond Panels are used on the Submission page for the Stereo Fixer panel.

    External dictionaries can also be used. For this, a name and a URL should be provided.

    External ID

    External IDs are IDs derived from an external source. Currently two external IDs are implemented: LnbRef and Lot ID. The LnbRef is mandatory, whereas the Lot ID is optional. If the LnbRef is not provided, the system will (depending on configuration) use the generated LN instead.

    File Format

    The default file formats for structures are MRV, MDL Extended Molfile V3000 (.mol) and SD file. For more details, please consult the documentation on file formats in Marvin.

    It is possible to use any other molecule format that Marvin can import and export. Please note that only the MRV, MOLV3000 and CXN Extended SMILES store the enhanced stereo information, and the CXN Extended SMILES format cannot store data attached to atoms.

    Formulation

    A multi-component compound with exact quantitative composition information (e.g. component 1: 37%, component 2: 63%). A practically arbitrary number of components can be defined. All the component percentages should be positive and their sum should be equal to 100. See also alternate, mixture.
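    The composition rule above (all percentages positive, summing to 100) can be illustrated with a minimal sketch. The function name and the tolerance are hypothetical and are not part of the Compound Registration API; the sketch only restates the rule as a check.

```python
from math import isclose


def is_valid_formulation(percentages, tolerance=1e-6):
    """Return True if every percentage is positive and they sum to 100.

    `tolerance` is an illustrative assumption that absorbs floating-point
    rounding; the registration system's own precision rules may differ.
    """
    if not percentages:
        return False
    if any(p <= 0 for p in percentages):
        return False
    return isclose(sum(percentages), 100.0, abs_tol=tolerance)


print(is_valid_formulation([37, 63]))   # the example from the definition
print(is_valid_formulation([50, 60]))   # sum is not 100
```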

    Fused Image

    A structure image that is generated on the fly from the components of a structure. Fused images are generated for multi-component compounds on all hierarchy levels and for single-component structures with salts/solvates on version and lot level.

    JChem Structure Table

    A JChem Structure Table is a database table maintained by the Chemaxon JChem libraries that contains structural information. A JChem Table stores the proper representation(s) of the structure and a list of additional fields (e.g. fingerprints) that support easy and fast screening/searching of the table by the available search types (duplicate, substructure, similarity, etc.). There are different table types based on the intended usage. For further details, please visit the JChem documentation.

    Library ID

    When a Bulk upload process is initiated, all submissions within that bulk registration attempt receive a Library ID. A desired Library name can be set for a bulk upload. If none is set, a Library ID is always generated for the upload (like LIBRARY_1, LIBRARY_2, etc.), even if the submissions are automatically registered. Failed submissions found in the Staging area can be filtered by Library ID.

    LnbRef

    Acronym for (Electronic) Laboratory NoteBook (LNB) Reference. The identifier is provided by the source prior to the registration. It is a compulsory data field for every submission and it is guaranteed to be unique in the whole registration database. The format of LnbRef can be customized by the company and is validated during the registration process. The LnbRef can be modified after the registration, but the attached lot ID cannot be modified.

    Locked/Unlocked Submission

    Deprecated. See assigned/unassigned submission.

    Lot

    Lot is the bottom level of the data hierarchy. A lot (preparation) represents the unit of material obtained in one definite chemical process.

    On lot level only IDs and additional data are stored. Structures are not stored here. (Structures are stored on the parent and version level).

    A lot entry has external IDs like an LnbRef and/or lot ID, as unique identifiers and also has a calculated ID: LN.

    For lot level, configurable additional data (e.g. Comment) can be stored. Project information is also stored at lot level and is inherited by the version and parent. Stereo Comments, on the other hand, are stored on parent level and are inherited by the versions and lots.

    Lot ID

    The Lot ID is an external ID attached to a lot. Lot ID is optional, but if it is required by the system, it cannot be modified.

    Lot Number (LN)

    The LN is a unique identifier attached to the lot, typically derived from the PCN (regardless of whether the PCN was specified or generated). When a lot is moved to another tree, the LN is regenerated. As with the PCN and CN, the generation of the LN can also be configured.

    Manual Registration

    The process of registering a failed submission (or multiple failed submissions) by a user with corresponding privileges. The result of the Manual Registration is driven by a set of structure checkers, structure fixers and registration options. The user also has the opportunity to modify the structure manually for the given submission before re-submitting it to Manual Registration.

    Match

    A Match is an already registered parent structure that could potentially serve as a parent for a compound that is to be registered or amended. Several different types of Matches exist, based on the level of structural similarity: exact, 2D, component, etc. During autoregistration, depending on the configuration of the registration options, any non-exact match type is either ignored or causes the submission to fall to the Staging area. During manual registration and amendment, the user is presented with the available Matches and can choose a Match and a match action. Finally, if it could not be done automatically, the user might have to reconcile the Matched tree with the new compound through a process called version fix or version correction.

    Match Action

    The way to respond to matches during manual registration or amendment (relevant for the Match list). There are 3 Match Actions:

    Match list

    The Match list is a popup window that shows the possible matches of a structure to be registered or amended when registering Markush structures or multi-component compounds. There are two different types of match lists: component level match list (for the components in multi-component compounds) and multi-component level match list.

    The component match list is shown for Markush compounds or for the components of a multi-component compound. In the match list, the arrows located above the input structure are used to navigate between components, and for each component the match (if available) is displayed. The input parent standardized structure of the current component is on the left side; the matching parent structure is on the right side. The possible match actions (accept or unique) can be selected with the buttons located under the matching structure. If more than one match is available, the match itself can be selected from a drop-down list. The first item of the drop-down list is always the exact match, if available, followed by all the other match types. If an exact match is available, only the Accept button is available (below the structure of the exact match), in accordance with the fundamental business rules of the registration; unique and accept for other match types are not allowed.

    The multi-component level match list is very similar to the previous one, with the only difference that it shows the whole multi-component parent standardized structure instead of the several component match tabs. In the multi-component match list only those already registered multi-component compounds are listed, that have exact matches for all the components - as discussed above. No multi-component match list is presented if there are no matches (exact/component/external) on that level for the compound. In the multi-component match the Accept and Unique options can be found. In case of an exact match, Accept is the only available option.

    Match Type

    The Match Type describes how the parent-standardized compound and its match are related.

    For single component compounds the Match Type can be: exact, tautomer, 2D, 2D&tautomer and similar CST. The stereo isomers and/or CST matches are considered 2D matches.

    For more details about stereoisomers, please consult the Documentation about stereochemistry.

    For details related to tautomers, please consult the Documentation about tautomers. CSTs are considered to be Similar if they have the same content apart from differences in whitespace and letter case. E.g. "test" and "T est" are Similar CST matches.
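    The Similar CST rule can be sketched as a normalization step followed by a comparison. This is only an illustration of the rule as worded above; the function names are hypothetical, and the actual matching implementation in the registration system may treat further details (such as identical CSTs, which would be exact rather than similar) differently.

```python
def normalize_cst(text):
    """Remove all whitespace and ignore letter case."""
    return "".join(text.split()).casefold()


def is_similar_cst(first, second):
    """Return True if two CSTs differ at most in whitespace and case."""
    return normalize_cst(first) == normalize_cst(second)


print(is_similar_cst("test", "T est"))
print(is_similar_cst("test", "toast"))
```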

    For multi-component compounds, when all components have exact matches, the Match Type can be exact, component, 2D or external component. The Match Type is exact match when two multi-component compounds have the same type, the same components, and the same ranges/percentages (e.g. a mixture to be registered consists of 21-44% benzene and 56-79% toluene, while another mixture is already in the registry consisting of 21-44% benzene and 56-79% toluene). The Match Type is component match when two multi-component compounds have the same type and the same components, but with different ranges/percentages (e.g. a mixture to be registered consists of 21-44% benzene and 56-79% toluene, while another mixture is already in the registry consisting of 45-55% benzene and 45-55% toluene). The Match Type is 2D match when two multi-component compounds have the same components, but the composition is unknown (e.g. an alternate having "ALTERNATE 1" attached data will be a 2D match with another alternate having the same components). The Match Type is external component match when two multi-component compounds have different types, but have the same components (e.g. a mixture to be registered consists of 21-44% benzene and 56-79% toluene, while there is a registered alternate consisting of benzene and toluene).
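    The four multi-component Match Types can be summarized as a small decision procedure. The sketch below is an interpretation of the paragraph above, not the registration system's implementation: the `MultiComponent` class, the use of `None` for an unknown composition, and the order of the checks are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# (low, high) percentage range; None stands for an unknown composition.
Range = Optional[Tuple[float, float]]


@dataclass(frozen=True)
class MultiComponent:
    kind: str                      # e.g. "mixture", "alternate"
    composition: Dict[str, Range]  # component name -> range


def match_type(new, registered):
    """Classify the match between two multi-component compounds."""
    if set(new.composition) != set(registered.composition):
        return None  # different components: no match on this level
    if new.kind != registered.kind:
        return "external component"
    amounts = list(new.composition.values()) + list(registered.composition.values())
    if any(amount is None for amount in amounts):
        return "2D"  # same components, composition unknown
    if new.composition == registered.composition:
        return "exact"
    return "component"


new = MultiComponent("mixture", {"benzene": (21, 44), "toluene": (56, 79)})
other = MultiComponent("mixture", {"benzene": (45, 55), "toluene": (45, 55)})
print(match_type(new, other))
```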

    For Polymers only two match types can be considered: exact, when the same components have the same repeating units and %, and component match, when the same components have the same repeating units but different %.

    Mixture

    A type of multi-component compound with semi-quantitative composition information. In case of a Mixture, every component has an assigned range that represents the relative amount of the component (e.g. component 1 composes 30-40% of the mixture, while component 2 composes 60-70%). The maximum number of the components and the component range values can be configured independently. Some of them can also be used as unknown ranges, in case of uncertain information. When a Mixture has an unknown component range an additional 'UNKNOWN' data is also attached to the structure. See also formulation, alternate.

    Modified by

    While "Created by" refers to the privileged user who initiated a registration (in case of failure) or who actually registered a compound, "Modified by" refers to the user who amends the compound once it is registered.

    Molecular Formula (Formula, MF) and Molecular Weight (MolWeight, MW)

    The Formula for a compound is generated according to the Hill system: the number of carbon atoms is indicated first, the number of Hydrogen atoms next, and then the number of all other chemical elements subsequently, in alphabetical order. Isotopes are listed separately in square brackets following the related chemical element. When the formula contains no carbon, all the elements, including hydrogen, are listed alphabetically.

    In the Formula representation dots are used to separate the structure from the salt/solvate and components in multi-component compounds, e.g. a 21-44% benzene 56-79% toluene mixture will have "C6H6.C7H8" in the Molecular Formula field.
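    The Hill ordering and the dot-separated notation can be illustrated with a short sketch. It takes element counts as plain dictionaries, omits a count of 1 as is usual in formula notation, and leaves out isotope handling entirely; the function names are hypothetical and the sketch is not the calculation used by the registration system.

```python
def hill_formula(counts):
    """Build a Hill-system formula from a {element symbol: count} mapping."""
    symbols = sorted(counts)
    if "C" in counts:
        # Carbon first, hydrogen second, everything else alphabetically.
        rest = [s for s in symbols if s not in ("C", "H")]
        symbols = ["C"] + (["H"] if "H" in counts else []) + rest
    # Without carbon, all elements (including hydrogen) stay alphabetical.
    return "".join(
        symbol + (str(counts[symbol]) if counts[symbol] != 1 else "")
        for symbol in symbols
    )


def combined_formula(fragments):
    """Join the formulas of components or salts/solvates with dots."""
    return ".".join(hill_formula(fragment) for fragment in fragments)


benzene = {"C": 6, "H": 6}
toluene = {"C": 7, "H": 8}
print(combined_formula([benzene, toluene]))  # C6H6.C7H8
```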

    Average molecular mass is calculated from the standard atomic weights.

    For further information, please check Appendix A. Calculations.

    On the Search page the MolWeight (Structure) and MolWeight (Structure+Salt) represent the calculated molweight for the parent and version structures, whereas the MolWeight (Parent) and MolWeight (Version) stand for the user-specified molweight.

    Multi-Component Compound

    A compound which is composed of two or more components. Regarding the actual technical solution, these components also exist as independently-registered single-component compounds in the registration system. Different types of Multi-Component Compounds exist based on purpose and on the level of accuracy of the composition information: alternates, formulation, mixtures, and polymers. A Multi-Component Compound is distinct from a structure having multiple fragments within a structure field that has been registered as a single compound (without registering each fragment individually).

    No Structure

    The No Structure type is available for registering compounds that have no available structure. No Structures can be registered either with or without a CST.

    Note: Configured Structure checkers will not run on "No structure" type compounds.

    Parent

    The highest level of the storage hierarchy of the registration system. A Parent in the registration service database represents a parent compound along with a set of additional information. For example, Stereo Comments are stored at Parent level, but they are also inherited by the versions and lots. A Parent is referred to by a unique identifier called the parent compound number (PCN). Each Parent can have multiple versions, which represent the registered compounds that are grouped together because they have a common parent compound.

    Parent compound

    The Parent Compound belongs to the top level of the storage hierarchy of the registration system. It is derived from the compound structure through parent standardization, which includes neutralization and salt/solvate/isotope removal by default, but can be customized according to the corporate business logic.

    Structures are only stored on parent and version level.

    The Stereo Comments, if available, are stored in the parent structure, and the versions and lots inherit them.

    Parent Compound Number (PCN)

    The Parent Compound Number (PCN) is a unique identifier of the registered compound, which is either generated automatically by the registration system according to predefined rules, or it is inherited from an existing parent in case of an exact or accepted match. PCNs can also be specified during registration, which can be useful e.g. during migration of legacy data. The PCN explicitly identifies a parent.

    Polymer

    Polymers can be registered as multi-component compounds in the Registration system. The representation of polymers (polycondensates) that are created via a condensation reaction from monomers X-A-X and Y-B-Y, resulting in alternating copolymers with the general structure ...A-B-A-B-A-B..., is supported. Polymers are created according to a predefined set of leaving group pairs (X, Y) or rules that can be defined.

    Preparation

    A synonym of lot.

    Project

    Project is a simple textual data field attached to the lot level. It can typically be interpreted as a reference to the business project within which the lot was created.

    Projects can be specified either during autoregistration, or when registering the submission from the Staging area. Each lot can be a part of multiple Projects. The Project information is calculated on version and parent levels as the union of the Projects defined for the lots of the tree (or sub-tree).
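    The way Project information propagates upward can be sketched as a set union over the lots of a tree. The tree layout and the identifiers below (CN-1, CN-2, the project names) are made up for illustration and do not reflect how the registry stores its data.

```python
def union_projects(project_sets):
    """Return the union of several collections of project names."""
    result = set()
    for projects in project_sets:
        result |= set(projects)
    return result


# Hypothetical parent tree: version id -> list of per-lot project sets.
tree = {
    "CN-1": [{"Oncology"}, {"Oncology", "Screening"}],
    "CN-2": [{"Formulation"}],
}

# Version level: union over the lots of that version (sub-tree).
version_projects = {cn: union_projects(lots) for cn, lots in tree.items()}

# Parent level: union over every lot in the tree.
parent_projects = union_projects(version_projects.values())

print(sorted(version_projects["CN-1"]))
print(sorted(parent_projects))
```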

    Note: Project field values can be up to 200 characters long. Asian characters, and - _ @ ( ) [ ] are accepted.

    Special characters like: \ / % $ ß & are not acceptable.
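    A pre-submission check of Project values could look like the following sketch. It only tests the two constraints stated above, the 200-character limit and the listed unacceptable characters; since the text says "like", the real set of rejected characters may be larger, and the function name is hypothetical.

```python
MAX_PROJECT_LENGTH = 200
# Only the characters explicitly listed as unacceptable in the note above.
FORBIDDEN_CHARACTERS = set("\\/%$ß&")


def project_value_problems(value):
    """Return a list of human-readable problems; empty if none were found."""
    problems = []
    if len(value) > MAX_PROJECT_LENGTH:
        problems.append("longer than 200 characters")
    bad = sorted(set(value) & FORBIDDEN_CHARACTERS)
    if bad:
        problems.append("contains unacceptable characters: " + " ".join(bad))
    return problems


print(project_value_problems("Alpha_1 (EU) [phase-2]"))
print(project_value_problems("R&D/2024"))
```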

    The Compound Registration system can be configured to use project-based access when registering compounds or when retrieving registered compounds (and their lots).

    Quality Checks

    During the process of registration, a certain set of quality assurance rules/checks can be defined, as a list of structure checker and structure fixer pairs. Quality Checks are defined at the level of the entire registration service, and cannot be configured individually for a specific source, although there exists a source-dependent registration option that controls whether the quality checks are run or not.

    Reject duplicate Id switch

    If this switch is ON, a submission that has failed because of a duplicate Id error (like LnbRefDuplicated or LotIdDuplicated) will get the "Rejected Id" status and will be excluded from the Staging area. Users will not be able to retrieve these submissions unless they specifically type the submission Id in the URL, like: https://your.domain.com/RegistryCxn/client/index.html#/submission?submissionId=xxx.

    Register with specified Id

    It is possible to register a compound with a specified Id: specific PCN and/or CN. The compound to be registered can go under an existing version, in which case specified PCN and/or CN are not considered, or it can be registered under a new version, in which case specified CN, if there is one, is kept.

    Registrar

    An advanced user of the registration service, typically responsible for manually registering failed submissions, amending registered compounds and administering the registry database.

    Registration

    The process of deciding on the uniqueness of new (small) molecules compared to the ones already stored in a database. The decisions are made according to predefined corporate business rules. The result of the registration process is a dedicated database, the registry, that is used to store the relevant structural and accompanying information.

    A compound that has been submitted for Registration is first checked and processed in several configurable steps (see standardization, structure checkers, structure fixers and registration options) that ensure that the compound is fit to be consistently introduced into the database. The compound is then placed into the appropriate parent tree in the database: either a unique tree (a new tree created for this compound) or a matched tree, if such a tree exists. The registration service aims to register a compound automatically (known as autoregistration) whenever possible. If a compound cannot be registered automatically, a privileged user can register it manually.

    Registration Page

    A form from which the autoregistration process can be started.

    Registration successful / Registration summary

    The Registration successful window appears after a compound is registered from the Registration page (through autoregistration) or from the Submission page (through manual registration).

    The window can be configured to contain the PCN, CN, LN, LnbRef and Lot ID. Optionally, in this window, another button can be present, which using an ID parameter (e.g. LnbRef) can redirect the user to a specified URL (configurable).

    The window is not received, when bulk registration or bulk loader/upload is used. In case of bulk registration (from the Submission page) a "Bulk registration summary" window appears, containing the failed, the successfully registered and the "in progress" registrations. When registering using bulk upload, no message window appears, but on the Dashboard page we are informed about the process and, when finished, the successful and failed registrations will be present in the proper sections of the page.

    A Registration summary window also appears for unsuccessful registrations. When a record cannot be autoregistered according to the business rules, the submission falls to the Staging area and the user receives a Registration Summary, Registration failed window containing the error message (e.g. restricted match, invalid LnbRef, No Structure, Unknown CST, etc.).

    Registry

    The Registry is a database, where all the data related to the registered compounds are stored.

    Registration options

    A set of options that can be switched to either yes or no in order to modify the registration process. Some examples include Perform Quality Checks and Analyze Salt Solvate Fragments. Registration options are configured through source dependent configuration files at the level of the registration service but can be additionally configured e.g. during an individual manual registration.

    Restriction Level

    A numeric value associated with a registered compound, which indicates the level of exclusivity or confidentiality of that compound. A compound with a Restriction Level of 0 is considered unrestricted, while any higher Restriction Level makes the compound restricted. Restricted compounds are highlighted in the Match/Suggestion list, on the Submission and Details/Browse pages with a watermark, and on the Search page with a red frame. The registration system gives additional warnings and prevents certain behavior in the case of registration, matching and amendment of restricted compounds.

    Salts and Solvates

    A set of chemical structures that are stored in a list. Salts and Solvates can be added to any compound during registration or amendment. By default, salts can be added with a positive integer multiplicity, whereas solvates can be added with 0.1 increments. It is also possible to define the stoichiometry as a pair of parent x m and salt/solvate x p values, where m and p are the multiplicities of the parent and salt/solvate. The required precision of the multiplicities is configurable on the server. To add a salt/solvate, click on the appropriate button, select the salt/solvate from the list (using ID or name), set the multiplicity, then click on the Add button. The parent multiplicity can be set only if a salt or solvate with multiplicity was already added to the structure.

    See also salt/solvate fragment.
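    The default multiplicity rules for salts and solvates described in this entry can be sketched as follows. The function and its `step` parameter are hypothetical; the sketch assumes that a solvate multiplicity must also be positive, and it uses `Decimal` to avoid binary floating-point rounding when checking 0.1 increments. The precision that applies in practice is configured on the server.

```python
from decimal import Decimal


def is_valid_multiplicity(value, kind, step=Decimal("0.1")):
    """Check a multiplicity against the default rules.

    Salts need a positive integer; solvates need a positive multiple of
    `step` (0.1 by default).
    """
    multiplicity = Decimal(str(value))
    if multiplicity <= 0:
        return False
    if kind == "salt":
        return multiplicity == multiplicity.to_integral_value()
    if kind == "solvate":
        return multiplicity % step == 0
    raise ValueError("kind must be 'salt' or 'solvate'")


print(is_valid_multiplicity(2, "salt"))
print(is_valid_multiplicity(0.5, "solvate"))
print(is_valid_multiplicity(0.25, "solvate"))
```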

    Salt/Solvate ID

    A unique identifier assigned to the salt/solvate entry in the list. Salts and solvates are stored in a common table and therefore share a common sequence of IDs. It is possible (as a forced registration) to add the same structure as a salt and also as a solvate to the list, but they will have different IDs.

    Salt/Solvate Fragment

    A fragment of a compound's chemical structure that can be identified with a record from the salt/solvate list. See also Analyze Salt/Solvate.

    Search Page

    A form of the application, where a search in the Registration DB can be performed and search results can be visualized.

    Single Structure

    Both Single Structure type compounds and multi-component compounds can be registered. Usually one structure, with or without a salt component and with or without isotopes, or a multi-fragment structure can be registered as a Single Structure type compound. Single Structure is the default structure type.

    Compounds lacking a structure (with or without CST) can be registered as No Structures.

    Structure Type

    When registering structures, the Structure Type must be set. The default is the Single Structure type, but No Structure, Mixture, Formulation, Alternate or Polymer can also be chosen if available (configurable).

    Source

    The Source identifies the origin of the compound to be registered.

    Structure checkers and Registration options can be configured according to each source. Also the additional data accompanying the registered compound can be configured according to the source.

    The registration system can accept different configurable Sources e.g.: REGISTRAR, ELNB, BULKLOAD, WEBREG.

    Submissions arriving from a Source that is not listed in the configuration file will fall to the Staging area with the error message: "Unknown source".

    Specified MolWeight

    The molecular mass of a compound can also be supplied by the user; this value is referred to as the specified MW.

    A specified MW for the version can be set during registration (including when uploading SDFiles), but a specified MW for the parent cannot be set separately.

    If the MW is specified, and a salt component is also provided when registering a lot, the system does not "recalculate" the specified MW.

    The specified MW can be provided during registration:

    • when a new compound is created: the whole tree will inherit the specified MW

    • when a new lot is registered under an existing compound:

      • the specified MW will be lost if the version already exists (but it can be set again after registration on the Details page)

      • the specified MW will be kept if a new version is created
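
    The registration-time rules in the list above can be summarized in a short sketch. The function and parameter names are hypothetical; the sketch only restates the three cases and does not cover changes made later on the Details page.

```python
from typing import Optional


def retained_specified_mw(
    specified_mw: float, compound_exists: bool, version_exists: bool
) -> Optional[float]:
    """Return the specified MW kept after registration, or None if it is lost."""
    if not compound_exists:
        # New compound: the whole tree inherits the specified MW.
        return specified_mw
    if version_exists:
        # New lot under an existing version: the specified MW is lost
        # (it can be set again afterwards on the Details page).
        return None
    # New lot that creates a new version: the specified MW is kept.
    return specified_mw


print(retained_specified_mw(250.3, compound_exists=True, version_exists=True))  # None
```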

    Specified MW can be set for each level of the registered tree hierarchy after registration.

    The specified MW of a compound can be changed after registration.

    The specified MW of a parent structure is not inherited by the versions and lots of the corresponding parent.

    The specified MW of a version structure is also displayed for its lots, but it is not set for the corresponding parent.

    The specified MW of a lot is inherited by its version, but it is not set for the corresponding parent.

    Searching in the database is possible for each level of the tree (parent, version and lot). When searching for a given level, in the search results table (with configurable columns) different types of molecular weights can be found:

    • for a parent level search: the Molweight (Structure) and the Molweight (Parent) are available, representing the calculated and specified parent molweights.

    • for a version and lot level search: the Molweight (Structure), Molweight (Structure+Salt) for the calculated molweights and the Molweight (Parent) and Molweight (Version) for the specified molweights are available.

    Staging Area

    The entries of failed submissions are collected in the Staging Area. It is a dedicated area for compounds to be verified for manual registration. The site is under the authority of privileged users, who can correct and register failed submissions manually during the registration process, while registration options and structure checkers/fixers are enabled/disabled.

    Standardization

    The process of converting a chemical structure to a Standardized form, defined by certain predefined rules, that is used in the registration service database. There are two separate steps of Standardization: general and parent. General Standardization is run for all compounds that are to be registered and can consist of any kind of structure transformation as configured by the user. Parent Standardization, consisting of neutralization and isotope removal, is performed after general Standardization in order to create/find the appropriate parent compounds.

    Stereo Analyzer

    Stereo Analyzer displays the result after analyzing the stereocenters and the stereo double bonds of a structure. Fixers are also available, which can be applied instantaneously on the structure. The available "labels" for a given structure are basically the Stereo Comments: Stereochemistry and Geometric isomerism, which are included in the Dictionaries.

    Stereo Comments

    Stereo Comments are calculated during registration if the "Stereo Comment Check" switcher (Registration option, source dependent) is enabled. If the switcher is disabled, no data is compulsory and arbitrary values can be set for these fields.

    If the switcher is on and the fields (Stereochemistry and Geometric isomerism) are configured, the registration system expects the correct stereo comment for the structure. If the provided stereo comment is missing or incorrect, the submission will fail and end up in the Staging area. In case of an advanced registration (Registration page) or a manual registration (Staging area), the system does not expect the comment: it ignores the provided one if it is not correct, and calculates the correct stereo comments, which will be stored for the registered structure.
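
    The behavior described above can be sketched as a small decision function. All names are hypothetical, the sketch assumes that the Stereochemistry and Geometric isomerism fields are configured for the Source, and the calculated comment is passed in rather than computed.

```python
from typing import Optional, Tuple


def resolve_stereo_comment(
    check_enabled: bool,
    manual: bool,
    provided: Optional[str],
    calculated: str,
) -> Tuple[str, Optional[str]]:
    """Return (outcome, stored comment) for one submission.

    manual is True for advanced (Registration page) or manual
    (Staging area) registration, False for autoregistration.
    """
    if not check_enabled:
        # Switcher off: arbitrary values are accepted as provided.
        return ("registered", provided)
    if manual:
        # The provided value is not required; the calculated one is stored.
        return ("registered", calculated)
    if provided is None or provided != calculated:
        # Autoregistration with a missing or incorrect comment fails.
        return ("staging", None)
    return ("registered", provided)


print(resolve_stereo_comment(True, False, None, "Achiral"))  # ('staging', None)
```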

    {info} Since CR version 21.14.0, No structure compounds can be successfully registered even if the Stereo comment fields (Stereochemistry and Geometric isomerism) are configured for your system Source. In this case no Stereo comment values will be set. Before this version, if the Stereo fields are configured for your system, please use another Source in order to register No structures.

    {info} For multi-component compounds having No structures with CST as components, the system behaves as in the case of Single structures. It expects users to provide values for the Stereochemistry and Geometric isomerism fields during autoregistration, and sets Stereochemistry: "Achiral" and Geometric isomerism: "None" during advanced registration or when registering from Staging.

    The Stereo Comments are stored for the parent structure, and the versions and lots inherit them.

    Currently, we distinguish between two types of Stereo Comments: Stereochemistry and Geometric isomerism, which are included in the Dictionaries.

    The default items in the Stereochemistry dictionary are: Achiral, Diastereomeric mixture, Racemic diastereomer with known relative stereochemistry, Racemic or presumed racemic, Single known enantiomer, Single unknown enantiomer, Single unknown enantiomer with known relative stereochemistry, Unequal mixture of enantiomers (please describe).

    The default items in the Geometric isomerism dictionary are: E, Equal mixture of geometric isomers, Known isomer with E and Z double bonds (as drawn), None, Single unknown geometric isomer, Unequal mixture of geometric isomers (please describe), Unknown, Z.

    Structure

    The Structure term in the registration system refers to the chemical structure itself and a set of additional data (CST, unknown attached data) that are considered when deciding the uniqueness of a compound. The union of compounds and parent compounds can be referred to as Structures. Three types of Structure S-groups are interpreted by the Compound Registration system:

    • NAME, that will be interpreted as a CST
    • UNKNOWN, that will be interpreted as an ISOMER
    • STEREO, that will be interpreted for double-bond stereochemistry

    We can distinguish single Structures and multi-component Structures.

    Structure Checker

    An automatic way to check for structural problems in compounds submitted for registration. The registration service comes with several default Structure Checkers, and users can define additional custom checkers based on their own requirements. Depending on the configuration of the registration service, a structure that has been flagged as problematic by a given Structure Checker can either be prevented from being registered or be automatically corrected by an associated structure fixer. Structure Checkers do not work if ChemDraw is set as the structure editor.

    For more information about Chemaxon's Structure Checkers please consult the Structure Checker Documentation.

    Structure Checker Software

    Structure Checker is an interactive tool to detect and fix structure related issues using JChem technology. It comes with numerous checkers and fixers to search and correct various structural issues. The correction process can be manual, completely automatic, or somewhere in between. Structure Checker can operate in batch and provide flags for problems which cannot be automatically corrected. The checking and fixing functionality can also be accessed from external Java code through the JChem API.

    Structure Editor

    The default structure editor is Marvin JS, but Marvin Sketch or ChemDraw can also be set for editing structures. When choosing the editor, please be aware that Chrome and Mozilla Firefox (since version 52) no longer support applets. Marvin Sketch and ChemDraw can be used as structure editors only with Internet Explorer.

    ChemDraw 15 and 16, and since version 20.8.0 also ChemDraw 17 and 18, can be used as the structure editor within the Compound Registration web application if ChemDraw is installed locally on your computer, but only with Internet Explorer. In order to use ChemDraw, you also need to have the ChemDraw ActiveX plugin installed and to allow it access in Internet Explorer.

    {info} Since version 21.20.0 Internet Explorer is not supported.

    {info} Since version 21.20.0 using ChemDraw directly as structure editor in Compound Registration is no longer supported. Structures drawn with ChemDraw can still be imported to MarvinJS as before.

    {info} Since version 22.6.0 using Marvin applet directly as structure editor in Compound Registration is no longer supported.

    Structure Fixer

    An automatic way to correct structural problems that have been found by an associated structure checker. Several Fixers can be associated to a given structure checker in order to provide different ways of dealing with a structural problem. During manual registration or bulk registration, the privileged user can choose which Fixer should be applied to a particular compound. The registration service comes with several default Structure Fixers, and users can define additional custom Fixers based on their own requirements.

    For more information about Chemaxon's Structure Checkers please consult the Structure Checker Documentation.

    Submission

    Submission is a record of a successful, failed, or in-progress registration. A Submission comprises the information needed for a registration (such as a structure, a lot ID, LnbRef, etc.), a submission status, and additional meta-information (such as the time of registration). Failed and in-progress Submissions can be seen in the Staging area.

    Submission ID

    The Submission ID is an automatically generated identifier for a submission entry. IDs are assigned in increasing numerical order, in increments of 1, when a record is entered into the registration system.

    Submission page

    The Submission page is the page where a submission from the Staging area is opened in order to register it manually. On the Submission page, you can edit the structure, CST, LnbRef, Molweight, Restriction, Salts and Solvates and the Additional data. On the Submission page, you can turn on or off registration options and can apply structure checker/fixers.

    Submission Type

    The Submission Type describes which service was used to create each submission and under what circumstances. The Submission Type can be e.g. AutoRegister, AutoRegisterBulk, ManualRegister, DeleteId, DeleteTree etc.

    Submission Status

    A status indicating whether a submission is successfully registered, is still "in progress", or has failed for some reason (e.g. the LnbRef was invalid, or a non-exact match was found). If the submission ended up in the Staging area, a detailed description of the reason for the failure is shown in addition to the Submission Status.

    Submitter

    The identifier of the chemist who actually owns the physical lot. This might be distinct from the ID of the user who autoregisters the submission or has to register it manually in case it cannot be autoregistered (Created by), or who makes an amendment to the compound once it is registered (Modified by). The Submitter (ID) appears under different tabs of the application. The Submitter (ID) plays an important role in Project-based access, e.g. a user having the "read own submissions" permission in a certain project will be able to read only those submissions that have the given username in the Submitter field.

    Suggestion list

    The Suggestion list is a full screen window that displays how the submitted single type compound would look in the database compared to the existing similar (stereoisomer, tautomer, different / similar CST) compounds.
    The submitted compound is always displayed in its standardized form on the left side. Below the submitted structure the system displays whether the lot to be registered is a new lot or version of an already existing compound or is a new compound (parent). Additional information that "Your submitted compound already exists" or "Your submitted compound is novel" is also displayed here. By choosing the drawn structure the submitted compound will be registered under a new Id (in case of new compounds), while choosing one of the suggested structures the submitted lot will be registered as a new lot under an existing version or as a new version under an existing parent compound structure.

    Synonym

    Alternative names can be available for Compound Numbers (PCNs and CNs) in the DB. If the system is configured for this and synonyms are available for versions and parents (these PCNs and CNs are displayed in red), the synonyms will appear when hovering over the PCN or CN on the Details and Search pages. If a synonym is available for a parent, that will be displayed also in the Match list. It is also possible to use a synonym to find a compound on the Details page.

    Tree

    The Tree is a storage hierarchy of the parent with all versions and lots in the registration database. Each Tree has one parent, but can have any number of versions under that parent, and any number of lots/preparations under each version. Each Tree can be displayed on the Details page.
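
    As an illustration of this hierarchy, the following minimal sketch models a Tree with one parent, any number of versions, and any number of lots under each version. The class and field names, as well as the identifier values, are hypothetical and do not reflect the actual database schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Lot:
    lot_id: str  # hypothetical identifier format


@dataclass
class Version:
    cn: str  # compound number of the version
    lots: List[Lot] = field(default_factory=list)


@dataclass
class Tree:
    pcn: str  # compound number of the single parent
    versions: List[Version] = field(default_factory=list)

    def lot_count(self) -> int:
        """Total number of lots under all versions of this parent."""
        return sum(len(version.lots) for version in self.versions)


tree = Tree(
    pcn="PCN-1",
    versions=[
        Version(cn="CN-1", lots=[Lot("LOT-1"), Lot("LOT-2")]),
        Version(cn="CN-2", lots=[Lot("LOT-3")]),
    ],
)
print(tree.lot_count())  # 3
```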

    Undelete

    Deleted compounds can be restored with the "undelete" function. Parents, versions or lots can be restored. Restoring lots is possible even if the structure of the tree has changed. Versions can be restored only if the structure was kept, otherwise, an error message is received that the version cannot be restored.

    Unknown ID / Unknown Attached Data

    Unknown Attached Data and IDs are generated for multi-component compounds without any quantitative composition (alternates) or semi-quantitative composition (mixtures) that involves unknown ranges. Examples for Unknown Attached Data and ID are: "Alternate 1", "Alternate 2", "Mixture 1", "Mixture 2", etc. For each registered unique compound a new Unknown ID is set. In a similar way "Isomer 1", "Isomer 2",... IDs are set for chiral compounds with an unknown configuration having e.g. an "OR1" stereo flag.

    Update Layout

    Structures can be modified within an amendment process. If the user wants to change only the arrangement of the structure (e.g. 2D clean, rotation), the amendment cannot be performed, since a "Structure not modified" message would be received. To change the structure display, Update Layout should be used.

    This feature can be applied when the user prefers to display the whole tree (parent, versions, and lots) with the same arrangement of the structure. The [Update Layout] button is available only for single compounds on parent level on the Browse page Edit mode. It is not active for multi-component compounds, though the displayed fused images of the multi-component compounds will be renewed if the component structures are updated. However, the stored structures of the multi-component compounds will still remain the same.

    Upload Page

    A form of the application, where an upload of an SD file can be initialized to carry out Bulk Uploads.

    User ID

    The User ID (=username) indicates the user who has submitted the record, registered a record or initiated the amendment in question.

    Validation

    Every registration and amendment step begins with a thorough check of the input data provided to the services. Input values are validated against a predefined set or range of possible values, regular expressions, etc. The series of steps to be performed might depend on the company business rules. The uniqueness of the external IDs is also checked during the Validation procedure. If any of the defined Validation steps fails, the submission ends up in the Staging area with the proper error message.
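
    The three kinds of checks mentioned above (predefined set, range, regular expression) can be illustrated with a minimal sketch. The field names, the molweight range and the LnbRef pattern are assumptions made up for this example; real rules are defined by the system configuration and the company business rules.

```python
import re

# Hypothetical rules: one set check, one range check, one pattern check.
RULES = {
    "source": {"allowed": {"REGISTRAR", "ELNB", "BULKLOAD", "WEBREG"}},
    "molweight": {"min": 0.0, "max": 5000.0},
    "lnb_ref": {"pattern": r"[A-Z]{3}-\d{4}"},
}


def validate(record: dict) -> list:
    """Return a list of error messages; an empty list means the record passed."""
    errors = []
    for name, rule in RULES.items():
        value = record.get(name)
        if value is None:
            errors.append(f"{name}: value is missing")
        elif "allowed" in rule and value not in rule["allowed"]:
            errors.append(f"{name}: '{value}' is not an allowed value")
        elif "min" in rule and not rule["min"] <= value <= rule["max"]:
            errors.append(f"{name}: {value} is out of range")
        elif "pattern" in rule and not re.fullmatch(rule["pattern"], value):
            errors.append(f"{name}: '{value}' has an invalid format")
    return errors


good = {"source": "ELNB", "molweight": 180.16, "lnb_ref": "ABC-0001"}
bad = {"source": "ELNB", "molweight": 180.16, "lnb_ref": "bad"}
print(validate(good))  # []
print(validate(bad))   # ["lnb_ref: 'bad' has an invalid format"]
```

    A non-empty result corresponds to the case where the submission ends up in the Staging area with an error message.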

    Version

    A Version in the registration service database represents a compound along with a set of additional information. It is defined as the second level in the data hierarchy. Each Version is referred to by a unique identifier called compound number (CN).

    Structures are only stored on parent and version level.

    Version Correction / Fix

    A process of reconciling existing versions within a matched parent tree with a new version created through manual registration or amendment. The registration system attempts to do this automatically, but in cases where an automatic Version Fix is not possible, the user is prompted to make these changes by hand before registration or amendment can be completed.

    Version Fingerprint

    The Version Fingerprint lists the ID and multiplicity of each component, separated by a colon. For a version without salts/solvates only the parent ID is available and the version fingerprint will be 0:1.

    If a version with salt(s)/solvate(s) is available, the parent and its multiplicity are followed by the salt/solvate IDs with their multiplicities, separated by commas, e.g. 0:1,1:2.0 (the salt with ID 1 is present with multiplicity 2).

    {info} The multiplicity of salts/solvates in the version fingerprint should be given with at least one fractional digit.
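
    The format can be illustrated with a minimal sketch that builds and parses such strings. The function names are hypothetical, and the sketch assumes, based only on the examples in this entry, that the parent is always written as 0:1 and that salt/solvate multiplicities are written as decimals.

```python
from typing import List, Tuple


def format_version_fingerprint(salts: List[Tuple[int, float]]) -> str:
    """Build a fingerprint from (salt/solvate ID, multiplicity) pairs."""
    parts = ["0:1"]  # assumed parent entry, as in the examples in the text
    for salt_id, multiplicity in salts:
        # str(float(...)) keeps at least one fractional digit, e.g. 2 -> "2.0"
        parts.append(f"{salt_id}:{float(multiplicity)}")
    return ",".join(parts)


def parse_version_fingerprint(fingerprint: str) -> List[Tuple[int, float]]:
    """Split a fingerprint into (ID, multiplicity) pairs, parent first."""
    pairs = []
    for item in fingerprint.split(","):
        component_id, multiplicity = item.split(":")
        pairs.append((int(component_id), float(multiplicity)))
    return pairs


print(format_version_fingerprint([]))          # 0:1
print(format_version_fingerprint([(1, 2)]))    # 0:1,1:2.0
print(parse_version_fingerprint("0:1,1:2.0"))  # [(0, 1.0), (1, 2.0)]
```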

    Virtual Compound

    When registering a Virtual Compound (chemical entity, including charges, isotopes, salts, solvates, etc.) only a parent and a version compound (no lot) are created. During the registration of virtual compounds the lot-specific fields are excluded from the Registration form.