PDB Questions - one answer

Glenn Proctor (proctor@yorvic.york.ac.uk)
Thu, 9 Mar 95 15:09:11 GMT

Following on from Gail Schuman's (SCHUMAN@bnlstb.bio.bnl.gov) question
about the "missing" residues in the PDB file for trypsinogen
(1tgn.pdb), I've had a look and agree that the PDB file is rather
confusing.

For instance, the residue numbers run 33, 34, 37, 38, so the numbers
35 and 36 are never used. There is a good reason for this.
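
To see where such jumps occur in a file, something like the following
rough Python sketch could be used. It assumes the standard
fixed-column PDB layout for ATOM records (chain identifier in column
22, residue number in columns 23-26); the function names are made up
for illustration, and insertion codes are ignored.

```python
def residue_numbers(pdb_lines):
    """Return {chain_id: [residue numbers in file order]} from ATOM records."""
    chains = {}
    for line in pdb_lines:
        # Skip anything that is not an ATOM record or is too short to parse.
        if not line.startswith("ATOM") or len(line) < 26:
            continue
        chain = line[21]            # column 22: chain identifier
        resseq = int(line[22:26])   # columns 23-26: residue sequence number
        seen = chains.setdefault(chain, [])
        # Record each residue once, not once per atom.
        if not seen or seen[-1] != resseq:
            seen.append(resseq)
    return chains


def numbering_gaps(pdb_lines):
    """Return (chain, last_number_before_gap, first_number_after_gap) tuples."""
    gaps = []
    for chain, numbers in residue_numbers(pdb_lines).items():
        for prev, nxt in zip(numbers, numbers[1:]):
            if nxt - prev > 1:
                gaps.append((chain, prev, nxt))
    return gaps
```

Calling numbering_gaps(open("1tgn.pdb")) would list every jump of
this kind. Note that a jump only says the numbers are not
consecutive; it does not by itself say whether residues are absent
from the model or the numbers were skipped on purpose.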

Often, when a new structure from a known family (in this case the
serine proteases) is solved, the residues are not simply numbered
consecutively. Instead the numbering is chosen to match other known
structures (or perhaps one archetypal structure) as closely as
possible. In serine proteases of the chymotrypsin family, numbered
after chymotrypsinogen, the "catalytic triad" is therefore _always_
His 57, Asp 102, Ser 195. Occasionally residue numbers have to be
"missed out" to keep the numbering in step with the existing
structures.
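
One quick way to check that a file follows this convention is to look
up which residue types sit at the triad positions. The sketch below
assumes the standard fixed-column ATOM record layout (residue name in
columns 18-20, chain in column 22, residue number in columns 23-26);
residue_names_at is a hypothetical helper, not part of any library.

```python
def residue_names_at(pdb_lines, numbers, chain=None):
    """Return {residue number: residue name} for the requested numbers.

    If chain is given, only ATOM records from that chain are considered.
    """
    wanted = set(numbers)
    found = {}
    for line in pdb_lines:
        if not line.startswith("ATOM") or len(line) < 26:
            continue
        if chain is not None and line[21] != chain:
            continue
        resseq = int(line[22:26])      # columns 23-26: residue number
        if resseq in wanted:
            # Keep the first residue name seen for each number.
            found.setdefault(resseq, line[17:20].strip())
    return found
```

For a file numbered this way, residue_names_at(lines, [57, 102, 195])
should report HIS, ASP and SER respectively.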

This seems confusing at first, but it makes very similar structures
much simpler to compare, e.g. by least squares fitting, because
equivalent residues carry the same number in each file.
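
As an illustration of why this helps, here is a rough Python sketch
that pairs up C-alpha atoms from two files purely by residue number
and insertion code, which is the list of equivalent atoms a least
squares fit needs as input. It assumes single-chain files in the
standard fixed-column layout; the function names are invented for
this example, and the fit itself is not shown.

```python
def ca_coords(pdb_lines):
    """Map (residue number, insertion code) -> (x, y, z) for C-alpha atoms."""
    coords = {}
    for line in pdb_lines:
        if not line.startswith("ATOM") or len(line) < 54:
            continue
        if line[12:16].strip() != "CA":    # columns 13-16: atom name
            continue
        key = (int(line[22:26]), line[26])  # residue number, insertion code
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        # Keep the first position if alternate locations are present.
        coords.setdefault(key, xyz)
    return coords


def equivalent_ca_pairs(pdb_a, pdb_b):
    """Return [(key, xyz_in_a, xyz_in_b)] for residue numbers found in both."""
    a = ca_coords(pdb_a)
    b = ca_coords(pdb_b)
    shared = sorted(a.keys() & b.keys())
    return [(key, a[key], b[key]) for key in shared]
```

With a common numbering scheme no sequence alignment is needed first:
numbers present in only one of the two files simply drop out of the
list.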

Hope this has helped!

Glenn. (T group)

-----------------------------------------------------------------------------
Glenn Proctor Protein Structure Research Group Voice: +44 (0) 904 432573
Internet: proctor@yorvic.york.ac.uk Fax: +44 (0) 904 432519
WWW: http://www.yorvic.york.ac.uk/~proctor/
York University Protein Structure Research Group, Heslington, York, England.
-----------------------------------------------------------------------------