Protein Data Bank (PDB) file format

    Import from PDB format

    Codename : pdb

    PDB files complying with the PDB Contents Guide version 2.3 are processed, with some minor limitations. PDB files produced by third-party applications may not comply with the PDB standard; most of these files are also handled properly, though there may be exceptions.

    All covalent bonds in proteins and nucleic acids are properly assigned, but hydrogen bonds, sulfur and water bridges, and coordinate bonds are not yet recognised. Covalent bonds in hetero groups are perceived from geometry, and their bond types are guessed, with occasional errors. Hydrogen atoms are identified and bonded to the appropriate heavy atom in chains, in hetero groups, and in water molecules.

    Multiple models are properly processed as well as insertions and modified residues.
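    As an illustration of what handling multiple models involves, the sketch below groups the coordinate records of a PDB file by its MODEL and ENDMDL records. The function name split_models is hypothetical, and this is not Marvin's implementation; files without MODEL records are simply treated as a single model.

```python
def split_models(lines):
    """Group ATOM/HETATM lines into models delimited by MODEL/ENDMDL.

    A file without MODEL records yields a single model.
    """
    models = []
    current = []
    for line in lines:
        record = line[:6].strip()  # record name occupies columns 1-6
        if record == "MODEL":
            current = []           # start collecting a new model
        elif record == "ENDMDL":
            models.append(current)
            current = []
        elif record in ("ATOM", "HETATM"):
            current.append(line)
    if current:                    # atoms not closed by ENDMDL
        models.append(current)
    return models
```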

    Import Options

    H or +H Add explicit hydrogen atoms. Usage: PDB:H
    -H Remove explicit hydrogen atoms. Usage: PDB:-H
    c Omit CONECT records for hetero compounds. Bonds are detected by the PDB reader module based on local geometry unless the b option is specified. Usage: PDB:c
    b Do not recognize bond order. All bonds, whether defined by CONECT records or generated by PDB import, are represented as ANY bonds. Usage: PDB:b
    f# Set the bond length cut-off to # (default: 1.12).
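    Geometry-based bond perception with a cut-off can be sketched as follows. This is a minimal illustration, not the algorithm used by the PDB reader module. It assumes that the cut-off is a multiplier applied to the sum of two covalent radii, which this document does not state. The radii table and the name perceive_bonds are likewise illustrative assumptions.

```python
import math

# Approximate single-bond covalent radii in angstroms (illustrative subset).
COVALENT_RADII = {"H": 0.31, "C": 0.76, "N": 0.71, "O": 0.66, "S": 1.05}


def perceive_bonds(atoms, cutoff=1.12):
    """Return index pairs (i, j) of atoms judged to be bonded.

    atoms: list of (element, (x, y, z)) tuples.
    ASSUMPTION: two atoms are bonded when their distance is at most
    cutoff * (r_i + r_j); the real meaning of the f option may differ.
    """
    bonds = []
    for i in range(len(atoms)):
        elem_i, pos_i = atoms[i]
        for j in range(i + 1, len(atoms)):
            elem_j, pos_j = atoms[j]
            limit = cutoff * (COVALENT_RADII[elem_i] + COVALENT_RADII[elem_j])
            if math.dist(pos_i, pos_j) <= limit:
                bonds.append((i, j))
    return bonds
```

    Under this assumption, a larger cut-off accepts longer contacts as bonds, and a smaller one rejects stretched bonds.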

    Limitations

    Standard record types listed below are not recognised by the current version of PDB import:

    • Optional: OBSLTE, CAVEAT, SPRSDE, JRNL, REMARK, SEQADV, FTNOTE, HETSYN, FORMUL, SSBOND, LINK, HYDBND, SLTBRG, CISPEP, SITE, MTRIX1, MTRIX2, MTRIX3, TVECT, SIGATM, ANISOU, SIGUIJ

    • Mandatory: CRYST1, ORIGX1, ORIGX2, ORIGX3, SCALE1, SCALE2, SCALE3, MASTER

    The recognition and proper processing of these record types will be implemented in forthcoming releases on demand.
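    To see in advance which records of a given file fall under these limitations, the record names can be counted before import. The sketch below is a standalone helper, not part of Marvin; the names UNSUPPORTED and count_unsupported are hypothetical, and the set simply copies the two lists above.

```python
from collections import Counter

# Record types listed above as not recognised by PDB import.
UNSUPPORTED = {
    "OBSLTE", "CAVEAT", "SPRSDE", "JRNL", "REMARK", "SEQADV", "FTNOTE",
    "HETSYN", "FORMUL", "SSBOND", "LINK", "HYDBND", "SLTBRG", "CISPEP",
    "SITE", "MTRIX1", "MTRIX2", "MTRIX3", "TVECT", "SIGATM", "ANISOU",
    "SIGUIJ", "CRYST1", "ORIGX1", "ORIGX2", "ORIGX3", "SCALE1", "SCALE2",
    "SCALE3", "MASTER",
}


def count_unsupported(lines):
    """Count how often each unrecognised record type occurs in the lines."""
    counts = Counter()
    for line in lines:
        record = line[:6].strip()  # record name occupies columns 1-6
        if record in UNSUPPORTED:
            counts[record] += 1
    return counts
```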

    Export to PDB format

    Marvin exports simplified PDB files containing the record types listed below:

    • Title section:

      • HEADER contains the following fields: classification="PROTEIN" (or imported value), date, idCode="NONE" (or imported value).

      • TITLE, SOURCE, KEYWDS, EXPDTA: The imported value is exported. Default: "NULL".

      • COMPND: The imported value is exported. Default: "MOLECULE:name", where "name" is the molecule name.

      • AUTHOR: The imported value is exported. Default: "Marvin".

      • REVDAT: The following line plus the imported value.

        REVDAT N DD-MMM-YY 3

        (N is the modification number, DD-MMM-YY is the date of the modification.)

    • Coordinate section:

      • ATOM and HETATM: The atom name includes the remoteness indicator and the branch designator character in case of amino acids. For non-standard residues, the atom name and the element symbol field contain the same value. The occupancy and the temperature factor are zero. The residue field contains one of the standard residue symbols.

    • Connectivity section:

      • CONECT: Only the first five fields are used. If the number of bonds is greater than four, a second CONECT line with the same atom serial number (first field) will be used.

      • TER: Indicates the end of a chain. Imported but not exported in the current version.

    • Bookkeeping section:

      • MASTER
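    The CONECT rule described in the connectivity section (one atom serial number followed by at most four bonded atoms, with further lines repeating the same serial number) can be sketched as below. The function name conect_lines is hypothetical and this is not the exporter's actual code; it relies only on the standard PDB layout in which each serial number field is five characters wide.

```python
def conect_lines(serial, bonded):
    """Format CONECT records for one atom, at most four partners per line.

    serial: serial number of the atom (first field).
    bonded: serial numbers of the atoms bonded to it.
    """
    lines = []
    for start in range(0, len(bonded), 4):
        chunk = bonded[start:start + 4]
        # Record name, then right-aligned five-character serial fields.
        fields = "".join(f"{b:5d}" for b in chunk)
        lines.append(f"CONECT{serial:5d}{fields}")
    return lines
```

    For example, an atom with five bonds yields two lines that both start with the same atom serial number.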

    Export options can be specified in the format string. The format descriptor and the options are separated by a colon. The options listed below are available for PDB output.

    H or +H Add explicit hydrogen atoms. Usage: PDB:H
    -H Remove explicit hydrogen atoms. Usage: PDB:-H
    -c Do not store connections. Usage: PDB:-c
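    The structure of such a format string can be illustrated with a small helper that separates the format descriptor from its options at the first colon. The name split_format_string is hypothetical and the helper is independent of Marvin's own parsing; it makes no assumption about how several options are combined.

```python
def split_format_string(format_string):
    """Split 'descriptor:options' at the first colon.

    Returns (descriptor, options); options is empty when no colon is present.
    """
    descriptor, _, options = format_string.partition(":")
    return descriptor, options
```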

    Limitations

    The exporter writes atoms in the molecule object's internal atom order, which may differ from the order of residues in a chain. Export is therefore not yet reliable for macromolecules composed of residues.
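    One way to check whether an exported file is affected is to test that the atoms of each residue appear as one unbroken run of records. The sketch below is a hypothetical standalone check (the name residues_contiguous is not part of Marvin); it reads the chain identifier, residue number and insertion code from their standard PDB columns and restarts the check at every MODEL record.

```python
def residues_contiguous(lines):
    """Return True if each residue's atoms form one unbroken run of records."""
    seen = set()
    previous = None
    for line in lines:
        record = line[:6].strip()
        if record == "MODEL":
            # Residues legitimately repeat in every model, so start over.
            seen.clear()
            previous = None
            continue
        if record not in ("ATOM", "HETATM"):
            continue
        # Column 22: chain ID, columns 23-26: residue number,
        # column 27: insertion code (slices tolerate short lines).
        key = (line[21:22], line[22:26], line[26:27])
        if key != previous:
            if key in seen:
                return False  # residue reappears after another residue
            seen.add(key)
            previous = key
    return True
```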
