Special search types: Polymer search

    JChem supports searching polymers with polymer queries. A polymer can also be found by a substructure of its structural repeating unit (SRU). In addition to the polymer search features described in this section, all query features of non-polymer search can be used. A query that spans two or more structural repeating units of the represented polymer is not found in that polymer. Polymer search compares polymers using their source-based or structure-based representation. Polymers can be stored in tables of molecule type. Markush structures with polymers are not yet supported.

    Polymer representation

    Polymers are represented either by their repeating unit (structure-based representation) or by the original monomer (source-based representation). Both representations require enclosing the structure in the appropriate bracket (see Table 1).

    Table 1. SRU and monomer brackets

    images/download/attachments/1806751/poly1.jpg | images/download/attachments/1806751/poly2.jpg
    Structural repeating unit (structure-based representation) | Monomer bracket (source-based representation)

    Repeat patterns

    Structural repeating units (SRU) with two bracket-crossing bonds can have the following three repeat patterns:

    • ht: head-to-tail repeat pattern.

    • hh: head-to-head repeat pattern.

    • eu: either-unknown repeat pattern. Both repetitions are possible or the sequence is unknown.

    A polymer structure represented by a monomer bracket is considered to have a head-to-tail repeat pattern.

    In duplicate search, repeat patterns must match exactly. In other search types, the either-unknown repeat pattern can match all types, while the "hh" and "ht" repeat patterns can each match only themselves.
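    The rule above can be sketched as a small predicate. This is an illustrative model only, not a JChem API: the function name and the `duplicate` flag are hypothetical, and it assumes that the either-unknown pattern acts as a wildcard on the query side.

```python
VALID_PATTERNS = {"ht", "hh", "eu"}

def repeat_patterns_match(query: str, target: str, duplicate: bool = False) -> bool:
    """Toy model of repeat-pattern matching for SRUs with two crossing bonds."""
    if query not in VALID_PATTERNS or target not in VALID_PATTERNS:
        raise ValueError("repeat pattern must be one of ht, hh, eu")
    if duplicate:
        # Duplicate search: patterns must be identical.
        return query == target
    # Assumption: "eu" on the query side matches every target pattern;
    # "ht" and "hh" match only themselves.
    return query == "eu" or query == target
```

    Under these assumptions, an "eu" query finds "ht" and "hh" targets in substructure search but not in duplicate search.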

    Ladder-type polymers (structures with four bracket-crossing bonds) additionally have a flip option for the specific repeat patterns. Hence these polymers have five repeat patterns:

    • ht,f: head-to-tail with flip

    • ht: head-to-tail without flip

    • hh,f: head-to-head with flip

    • hh: head-to-head without flip

    • eu: either-unknown repeat pattern. All repetitions are possible or the sequence is unknown.

    As in the case of two crossing bonds, the either-unknown repeat pattern can match all other types in non-duplicate search types. For the other repeat patterns, the flip parameter must match as well. Examples of ladder-type structures with different repeat patterns are shown in Table 2.

    Table 2. Ladder type structures

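    polymer SRU | repeat pattern | example structure
    images/download/attachments/1806751/polyladder.jpg | ht | images/download/attachments/1806751/polyladderht.jpg
    images/download/attachments/1806751/polyladder.jpg | ht,f | images/download/attachments/1806751/polyladderhtf.jpg
    images/download/attachments/1806751/polyladder.jpg | hh | images/download/attachments/1806751/polyladderhh.jpg
    images/download/attachments/1806751/polyladder.jpg | hh,f | images/download/attachments/1806751/polyladderhhf.jpg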

    Monomer-SRU matching

    Monomer and structural repeating unit representations of the same polymer match each other in all search types except duplicate search. In duplicate search, the source-based and the structure-based representations, as well as any alternative source-based representations, are considered non-equivalent. This enables the database registration of different monomers of the same polymer.

    Matching of monomers on SRUs and vice versa is achieved by transforming monomer representations to SRU representations. This transformation is based on polymerization rules, of which the system currently supports the following types:

    • Addition to a double bond. E.g. polystyrene.

    • Polymerization through elimination of water or HCl. E.g. polyester, polyamide.

    Monomer transformation can be switched off by the appropriate search option. Table 3 contains examples of matching between SRU type polymers and monomers.

    Table 3. Monomer and SRU matching

    query | target | hit (monomer transformation) | hit (NO transformation)
    images/download/attachments/1806751/poly2.jpg | images/download/attachments/1806751/poly2.jpg | yes | yes
    images/download/attachments/1806751/poly2.jpg | images/download/attachments/1806751/poly3.jpg | yes | no
    images/download/attachments/1806751/poly4.jpg | images/download/attachments/1806751/poly4.jpg | yes | yes
    images/download/attachments/1806751/poly4.jpg | images/download/attachments/1806751/poly5.jpg | yes | no

    Cyclization and phase-shifting

    For some polymers with a head-to-tail repeat pattern, the brackets can be shifted along the polymer chain. The resulting "phase-shifted" SRU differs from the original structural repeating unit but represents the same polymer, so a structural repeating unit with a head-to-tail repeat pattern can find its phase-shifted variant. Table 4 illustrates this behavior.

    Table 4. Phase shifting for SRU type polymers.

    query (rows) / target (columns) | images/download/attachments/1806751/poly5.jpg | images/download/attachments/1806751/poly6.jpg
    images/download/attachments/1806751/poly5.jpg | yes | yes
    images/download/attachments/1806751/poly6.jpg | yes | yes

    For SRUs with head-to-head or either-unknown repeat pattern the phase shifted version does not match the original.

    Phase shifting can be switched off in order to maintain compatibility with MDL search types (see compatibility notes).
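    As a rough illustration of phase shifting, a linear SRU backbone can be modeled as a sequence of atom symbols, so that a phase-shifted unit is a cyclic rotation of the original. This is a simplified sketch rather than the JChem implementation: the function names and the `phase_shift` flag are hypothetical, and only whole-unit comparison of unbranched backbones is modeled.

```python
from typing import Sequence

def is_phase_shift(query_unit: Sequence[str], target_unit: Sequence[str]) -> bool:
    """Return True if target_unit is a cyclic rotation of query_unit."""
    q, t = list(query_unit), list(target_unit)
    if len(q) != len(t):
        return False
    if not q:
        return True
    doubled = q + q  # every rotation of q is a window of q + q
    return any(doubled[i:i + len(t)] == t for i in range(len(q)))

def sru_units_match(query_unit: Sequence[str], target_unit: Sequence[str],
                    pattern: str = "ht", phase_shift: bool = True) -> bool:
    """Toy whole-unit comparison of two SRUs sharing one repeat pattern."""
    if list(query_unit) == list(target_unit):
        return True
    # Only head-to-tail units may match their phase-shifted variants,
    # and only while phase shifting is enabled.
    return phase_shift and pattern == "ht" and is_phase_shift(query_unit, target_unit)
```

    For example, `sru_units_match(["C", "C", "O"], ["C", "O", "C"])` is true for the default head-to-tail pattern, but false with `pattern="hh"` or `phase_shift=False`.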

    End group matching

    Polymers in structure-based representation can have specific or undefined end groups. Undefined end groups are denoted by star atoms.

    If the end group matching option is chosen, the end groups have to match exactly, although an undefined end group can still match specific end groups. Otherwise the end groups are ignored.

    When end group matching is switched on and the structures carry specific end groups, there is no "phase-shifting" behavior.
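    The end group rule can be sketched as follows. This is a hypothetical model, not JChem code: end groups are plain labels, `"*"` stands for a star atom, each polymer has a (left, right) pair of end groups, and the undefined end group is assumed to act as a wildcard on the query side.

```python
from typing import Tuple

STAR = "*"  # label used here for an undefined end group (star atom)

def end_groups_match(query_ends: Tuple[str, str],
                     target_ends: Tuple[str, str],
                     end_group_matching: bool) -> bool:
    """Toy model of end group comparison for two structure-based polymers."""
    if not end_group_matching:
        # Option switched off: end groups are ignored.
        return True
    # Assumption: a star atom in the query matches any target end group;
    # a specific query end group must be identical to the target end group.
    return all(q == STAR or q == t for q, t in zip(query_ends, target_ends))
```

    With these assumptions, a query with two star atoms matches any target, while a query with specific end groups matches only targets with the same end groups when the option is on.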

    Which groups are considered end groups? Only groups that have exactly one bond crossing the bracket(s) are considered end groups. The examples below show end groups (highlighted in green) and groups that are not end groups (highlighted in red).

    images/download/attachments/1806751/poly8_endgroup01.png
    images/download/attachments/1806751/poly_noendgroup01.png images/download/attachments/1806751/poly_noendgroup03.png images/download/attachments/1806751/poly_noendgroup04.png

    Matching of end groups is illustrated in Table 5.

    Table 5. Matching of end groups

    query | target | hit (end group matching) | hit (NO end group matching)
    images/download/attachments/1806751/poly7.jpg | images/download/attachments/1806751/poly7.jpg | yes | yes
    images/download/attachments/1806751/poly7.jpg | images/download/attachments/1806751/poly8.jpg | no | yes
    images/download/attachments/1806751/poly7.jpg | images/download/attachments/1806751/poly5.jpg | no | yes
    images/download/attachments/1806751/poly7.jpg | images/download/attachments/1806751/poly9.jpg | no | yes
    images/download/attachments/1806751/poly5.jpg | images/download/attachments/1806751/poly7.jpg | yes | yes
    images/download/attachments/1806751/poly5.jpg | images/download/attachments/1806751/poly8.jpg | yes | yes
    images/download/attachments/1806751/poly5.jpg | images/download/attachments/1806751/poly5.jpg | yes | yes
    images/download/attachments/1806751/poly5.jpg | images/download/attachments/1806751/poly9.jpg | yes | yes

    Further polymer types

    Copolymers

    Copolymers are formed of several polymers. They have the following subtypes:

    • co - unspecified

    • alt - alternating - the components are alternating without repetition (e.g. ABABABAB...)

    • rnd - random - the components are randomly distributed (e.g. AAABABBBAB...)

    • blk - block - the components are arranged in blocks (e.g. ...AAAAABBBB...)

    The unspecified subtype matches all other subtypes; the other subtypes have to match exactly, as shown in Table 6.

    Table 6. Matching of copolymer subtypes.

    query (rows) / target (columns) | images/download/attachments/1806751/poly_co.jpg | images/download/attachments/1806751/poly_alt.jpg | images/download/attachments/1806751/poly_blk.jpg
    images/download/attachments/1806751/poly_co.jpg | yes | yes | yes
    images/download/attachments/1806751/poly_blk.jpg | no | no | yes
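    The subtype rule shown in Table 6 can be expressed as a short predicate. The function below is a hypothetical sketch, not part of JChem; it assumes, as the table suggests, that the unspecified subtype is a wildcard only when it appears in the query.

```python
COPOLYMER_SUBTYPES = {"co", "alt", "rnd", "blk"}

def copolymer_subtypes_match(query: str, target: str) -> bool:
    """Toy model of copolymer subtype matching (query against target)."""
    if query not in COPOLYMER_SUBTYPES or target not in COPOLYMER_SUBTYPES:
        raise ValueError("subtype must be one of co, alt, rnd, blk")
    # An unspecified ("co") query matches every subtype;
    # any other query subtype must equal the target subtype.
    return query == "co" or query == target
```

    This reproduces the two rows of Table 6: a "co" query matches "co", "alt" and "blk" targets, whereas a "blk" query matches only a "blk" target.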

    Copolymers are distinguished by whether bracket-crossing bonds cross the copolymer bracket and by whether the polymer components are connected to each other. A connection between polymer components specifies their order. A bond crossing the copolymer bracket means that the unit inside must repeat. If no bracket-crossing bond crosses the copolymer bracket, the copolymer should have the either-unknown repeat pattern. Duplicate search requires exact matching; the matching behavior in substructure search is shown in Table 7.

    Table 7. Copolymer matching.

    query (rows) / target (columns) | images/download/attachments/1806751/poly_co.jpg | images/download/attachments/1806751/poly_co3.jpg | images/download/attachments/1806751/poly_co4.jpg
    images/download/attachments/1806751/poly_co.jpg | yes | yes | yes
    images/download/attachments/1806751/poly_co3.jpg | no | yes | yes
    images/download/attachments/1806751/poly_co4.jpg | no | no | yes

    Copolymers can be matched by simple SRU-type polymers and by copolymers. If the "copolymerMatching" option is chosen, only copolymers can match a copolymer.

    Other types

    • grf - grafted copolymer

    • xl - cross-linked copolymer

    • mer - mer bracket; represents a structural unit that does not repeat with itself

    • mod - modification of another structure

    Attached data search

    Data sgroups attached to atoms of polymers or to polymer brackets are considered during searching. See the attached data matching documentation for details.

    Attached data matching can be switched off with the attachedDataMatch search option, in which case all attached data is ignored.

    Polymer Mixtures

    Mixtures are built of different components. Depending on whether the order of the components is relevant, we distinguish ordered and unordered mixtures. Ordered mixtures, or formulations (sign "f"), can match only ordered mixtures with the same order, though different numbering is possible. Unordered mixtures (sign "mix") can match both types of mixtures. See mixture documentation.

    Polymers can be part of mixture-type brackets. Arbitrary depth of nesting is allowed. The criterion for matching is that the polymer or mixture brackets enclosing a given query structure have corresponding target-side brackets with the same order of nesting.

    Examples of polymer mixture matching are shown in Table 8.

    Table 8. Polymer mixture matching.

    query (rows) / target (columns) | images/download/attachments/1806751/polymer_mix1.jpg | images/download/attachments/1806751/polymer_mix2.jpg | images/download/attachments/1806751/polymer_mix4.jpg
    images/download/attachments/1806751/polymer_mix1.jpg | yes | yes | yes
    images/download/attachments/1806751/polymer_mix2.jpg | no | yes | no
    images/download/attachments/1806751/polymer_mix4.jpg | no | no | yes

    Relation to MDL polymer search types

    Chemaxon polymer searching can be configured to correspond to MDL's polymer search types. Table 9 shows the settings that can be used for the different MDL polymer search types.

    Table 9. MDL and Chemaxon polymer search

    MDL search type | Chemaxon search type | Additional options | Remarks
    Polymer exact | duplicate | polymer:y phaseShift:n | -
    Find monomer or sru | duplicate | polymer:y transformMonomer:y | -
    Polymer substructure | substructure search | polymer:y transformMonomer:n | -
    Copolymer search | substructure search | polymer:y copolymerMatching:y | -
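    For scripting purposes, the settings in Table 9 could be kept in a small lookup table. The sketch below is only a convenience wrapper around the values listed above: the dictionary and the helper function are hypothetical names, not a JChem API, and how the option strings are passed to a search is not shown here.

```python
# MDL search type -> (Chemaxon search type, additional options), as listed in Table 9.
MDL_TO_CHEMAXON = {
    "Polymer exact": ("duplicate", "polymer:y phaseShift:n"),
    "Find monomer or sru": ("duplicate", "polymer:y transformMonomer:y"),
    "Polymer substructure": ("substructure", "polymer:y transformMonomer:n"),
    "Copolymer search": ("substructure", "polymer:y copolymerMatching:y"),
}

def parse_options(option_string: str) -> dict:
    """Split a 'key:value key:value' option string into a dictionary."""
    return dict(item.split(":", 1) for item in option_string.split())

search_type, options = MDL_TO_CHEMAXON["Polymer exact"]
parsed = parse_options(options)  # option names mapped to their y/n values
```

    Keeping the mapping in one place makes it easy to see which option distinguishes each MDL search type from the default behavior.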