PDB File questions also

SCHUMAN@bnlstb.bio.bnl.gov
Thu, 9 Mar 1995 8:26:36 -0500 (EST)

Hi PPSers,

I have been looking at the data for my protein
(trypsinogen, 1tgn in pdb-ese). There are a number of
anomalies that I cannot figure out. (I am located at
the same place as the PDB, but I thought others might see
the same problems, so if I can find anything out, I will
help to answer my own question.)

The protein is listed as having 245 residues,
but only 229 are listed in the sequence records.
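
For anyone who wants to repeat the count, here is a minimal sketch of how the declared and listed residue totals could be compared. It assumes the usual fixed-column SEQRES layout (chain in column 12, declared count in columns 14-17, residue names from column 20 onward); older entries may deviate from this, and the function name seqres_counts is only illustrative.

```python
from collections import defaultdict

def seqres_counts(lines):
    """Return {chain: (declared_count, listed_count)} from SEQRES records.

    Assumes the standard fixed-column layout; not validated against
    every historical variant of the format.
    """
    declared = {}
    listed = defaultdict(int)
    for line in lines:
        if not line.startswith("SEQRES"):
            continue
        chain = line[11]                      # column 12: chain id (may be blank)
        declared[chain] = int(line[13:17])    # columns 14-17: declared residue count
        # columns 20-70: up to 13 three-letter residue names per record
        listed[chain] += len(line[19:70].split())
    return {chain: (declared[chain], listed[chain]) for chain in declared}
```

It can be called as seqres_counts(open("1tgn.pdb")), where the filename is a placeholder. A mismatch between the two numbers for a chain points at residues that are declared but never listed.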

There is one remark about residues 10-16 not
being included, which accounts for only 7 of the 16
missing residues.

The atom records start at res. number 17, which
is not where the sequence records start. There are
residues missing from the atom records and sequence
records that are not described in the remarks. There is
at least one residue pair labeled 65 and 65A, which I
presume means that either residue can occur at position 65.
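
A rough sketch of how the residue numbering in the atom records could be listed and checked for jumps. It assumes the standard fixed columns of an ATOM record (chain in column 22, residue number in columns 23-26, and the single letter that follows the number, as in 65A, in column 27, which the format calls the insertion code). The function names are made up for illustration.

```python
def residue_ids(lines):
    """Return ordered (chain, number, insertion_code) tuples from ATOM records.

    Consecutive atoms of the same residue are collapsed into one entry.
    """
    seen = []
    for line in lines:
        if not line.startswith("ATOM"):
            continue
        rid = (line[21], int(line[22:26]), line[26].strip())
        if not seen or seen[-1] != rid:
            seen.append(rid)
    return seen

def numbering_gaps(ids):
    """Return (previous, next) pairs where the residue number within a
    chain jumps by more than one."""
    gaps = []
    for prev, cur in zip(ids, ids[1:]):
        if prev[0] == cur[0] and cur[1] - prev[1] > 1:
            gaps.append((prev, cur))
    return gaps
```

Note that a jump in the numbering is only a hint: it does not by itself say whether residues are physically absent from the model or whether the numbering scheme simply skips those values.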

The questions are...

Where are those missing residues?

Why does RasMol show the residues as being
connected if two residues are missing from the atom
records between them?
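
One way to check whether a jump in the numbering is a physical gap is to measure the distance between successive alpha carbons: neighbouring residues in a chain sit roughly 3.8 Angstroms apart, so a much larger step suggests missing residues. The sketch below says nothing about how RasMol itself decides connectivity. The 4.2 Angstrom cutoff and the function names are assumptions chosen for illustration, and alternate locations are not handled.

```python
import math

def ca_coords(lines):
    """Return [(residue_label, (x, y, z))] for alpha carbons in file order."""
    out = []
    for line in lines:
        # " CA " in columns 13-16 is an alpha carbon ("CA  " would be calcium)
        if line.startswith("ATOM") and line[12:16] == " CA ":
            label = line[22:27].strip()   # residue number plus insertion code
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            out.append((label, xyz))
    return out

def long_ca_steps(cas, cutoff=4.2):
    """Return (label_a, label_b, distance) for successive alpha carbons
    that are further apart than `cutoff` Angstroms (assumed threshold)."""
    steps = []
    for (la, a), (lb, b) in zip(cas, cas[1:]):
        d = math.dist(a, b)
        if d > cutoff:
            steps.append((la, lb, d))
    return steps
```

If the two residues on either side of a numbering jump are about 3.8 Angstroms apart, the chain is probably continuous there and only the numbers skip.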

There are apparently some additional residues
not listed in the beginning of the sequence records,
since the listed residues don't count correctly if only
10-16 are missing...I guess you need to be there! Since
there is no atom information for residues before number
17, it is hard to tell who is who in that part of the
atom information.

FYI, I have modified one of our programs to read
a pdb file and calculate total molecular weight, various
centers of gravity, (e.g., based on backbone atom
positions, based on alpha carbon positions, weighted by
mainchain mass, by full residue mass, etc.) and also the
radius of gyration based on different percentages of D-H
exchange (we are neutron scatterers here). The program is
dependent on VAX/VMS because it uses screen management
calls, but the pdb reading and other calculations can
probably be lifted. It counts the frequency of
appearance of the residues, accounts for the waters, and
some of the special groups and atoms (heme, Ca), and
nucleic acids. I can pull the calculation parts out and
make them available if anyone would like to use them.
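
To give an idea of the kind of calculation involved, here is a small illustrative sketch in Python. It is not the VAX/VMS program described above. It computes a total mass, a mass-weighted center of gravity, and a mass-weighted radius of gyration, Rg = sqrt(sum(m * |r - r_cm|^2) / sum(m)), from ATOM and HETATM records. It weights by atomic mass rather than by neutron scattering length, ignores D-H exchange, and guesses the element from the atom name, which is a simplification. The mass table and the function names are assumptions.

```python
import math

# Approximate atomic masses; extend the table for other elements as needed.
MASS = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999,
        "S": 32.06, "CA": 40.078, "FE": 55.845}

def element_of(line):
    """Guess the element from columns 13-14 of the atom name (simplified)."""
    sym = line[12:14].strip().lstrip("0123456789").upper()
    return sym if sym in MASS else sym[:1]

def mass_center_rg(lines):
    """Return (total_mass, (cx, cy, cz), radius_of_gyration).

    Atoms whose element cannot be matched to the MASS table are skipped.
    """
    atoms = []
    for line in lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue
        m = MASS.get(element_of(line))
        if m is None:
            continue
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        atoms.append((m, xyz))
    if not atoms:
        raise ValueError("no usable atom records found")
    total = sum(m for m, _ in atoms)
    center = tuple(sum(m * p[i] for m, p in atoms) / total for i in range(3))
    rg2 = sum(m * sum((p[i] - center[i]) ** 2 for i in range(3))
              for m, p in atoms) / total
    return total, center, math.sqrt(rg2)
```

Restricting the loop to backbone atoms or to alpha carbons only would give the other centers mentioned above.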

Anyway, this has gotten too long...Thanks.

Gail (R Group)
schuman@bnlstb.bio.bnl.gov