Re: Science, Aug 2nd

Simon Brocklehurst (smb@bioch.ox.ac.uk)
Tue, 20 Aug 1996 09:46:38 +0100 (BST)

Hi all,

Well, I've just had a quick look through the paper... so I thought
I'd throw another point or two into the discussion (I'm afraid
our photocopier is out of toner, so I'm "(mis?)quoting" extracts
from memory!).
Before I go further though, I hope someone will draw the authors'
attention to this discussion because it's publicly viewable
on the Web. I have hesitated about making the points below without
the authors seeing what I'm writing, because what I'm about to say is
critical of their work.
I'm not saying "I'm right and they're wrong", I'm simply offering
this contribution in the spirit of a discussion. I very much hope
people will point out errors in what I say/support the authors'
hypotheses etc...

One of the authors' central claims seems to be (sorry if I've got the
wording slightly wrong - the sense is preserved I hope):

"It is the requirement that many sequences design a structure that
leads to formation of secondary and tertiary structure."

It seems (to me at least) that this is highly unlikely to be a
correct statement. Why? Because I believe the reasons that secondary
and (some) tertiary structure form are already known (and they're not
at all related to the authors' hypothesis)... but I won't digress
further along this line...

I would prefer, instead, to make a point that arises directly
out of the above claim, and which I believe supports my previous
comment about proteins not being stable to mutation (contrary
to the authors' claims). We have to be clear what the authors mean
by the phrase, "many sequences". Remember they have only two types
of residue in their simulation (H & P) - so to them different
sequences simply mean different patterns of hydrophobic and polar
residues.

The authors observe that their "designable" structures are characterised
by having _both_ of the following kinds of sequences:

1. sequence families with many conserved positions (i.e.
conserved H residues, or conserved P residues)

2. some sequences completely unrelated to other sequences
(i.e. statistically insignificant sequence similarity -
e.g. no positions with conserved H across the family)
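
To make the two cases concrete, here is a minimal Python sketch (mine,
not taken from the paper). It assumes equal-length sequences written as
strings of H and P, and it reports the positions at which every member
of a set carries the same residue type. The function name and the toy
sequences are invented purely for illustration, and "no conserved
positions at all" is only the extreme case of type 2; it is not a test
of statistical significance.

```python
def conserved_positions(family):
    """Return {index: residue} for positions where all sequences agree.

    `family` is a list of equal-length strings over the alphabet {H, P}.
    """
    length = len(family[0])
    if any(len(seq) != length for seq in family):
        raise ValueError("all sequences must have the same length")
    return {
        i: family[0][i]
        for i in range(length)
        if all(seq[i] == family[0][i] for seq in family)
    }


# Hypothetical toy sequences, invented for illustration only.
type1_family = ["HPHHPPHP", "HPHPPPHP", "HPHHPHHP"]  # many conserved positions
type2_pair = ["HPHHPPHP", "PHPPHHPH"]                # nothing conserved

print(conserved_positions(type1_family))
print(conserved_positions(type2_pair))
```

In this toy example the first set keeps six of its eight positions
fixed, while the second pair shares none.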

In real proteins, however, it seems to me that we observe something
fundamentally different: divergently evolved protein families are almost
always of ONLY type 1 (remember we're not talking real sequence identity,
rather we're talking conserved "key" residue types). Anyone disagree?

We tend to observe type 2 sequence relations when comparing _UNRELATED_
proteins that have the same fold - this is nothing to do with
divergent evolution though.

Thus, it seems to me that the evidence points to the fact that
observed divergently evolved protein sequences are characterised by
having "key" residue positions that are required for the protein to fold.

In other words, don't the points I've made indicate that "designability"
of a structure is unlikely to have much to do with general features
of protein evolution or protein folding?

As I've said above, I'd really love someone who's read the paper
more carefully than I have, to tell me what I've missed/gotten wrong
in all the above...

-- Simon
_____________________________________________________________________________
|
| ,_ o Simon M. Brocklehurst,
| / //\, Oxford Centre for Molecular Sciences, Department of Biochemistry,
| \>> | University of Oxford, Oxford, UK.
| \\, E-mail: smb@bioch.ox.ac.uk | WWW: http://www.ocms.ox.ac.uk/~smb/
|____________________________________________________________________________