Transcript of meeting in BioMOO, PPS Base 14th Mar '96 17:00 GMT on Protein Assignment 1

Thanks to Peter Murray-Rust for holding this well-attended meeting, and for dealing with so many questions.

This transcript can be found on the tape 'mar15' (sic) in the PPS Base, for at least one week after the meeting, but Peter will eventually have to delete it to save disk space.



PeterMR turns the recorder on.

PeterMR says, "the recorder is now on"

JohnW asks, "does everyone have the appropriate URL on a WWW browser?"

PeterMR says, "we are going to discuss protein assignment 1."

PeterMR asks if everyone has got the assignment

Giovanni says, "I got the assignment"

Luis says, "I got it"

PeterMR says, "any problems with Q1? - similar proteins?"

Salim says, "I have the correct browser on"

Franco says, ""I got it""

KarlS says, " I got it too"

Auroram says, "yes I have started working with the assignment "

Paolo says, "Me too"

JohnW says, "that's .../PPS2/assignments/proteins1.html"

PeterMR says, "does everyone have a protein of the sort 1ABC?"

Luis says, "How do you know if a protein is similar, just from the pdb code?"

PeterMR says, "1abc and 2abc are usually related"

Franco says, ""what about 1ccr and 1ccx?""

KarlS says, "I found other members of 'my' protein family, but the PDB codes are not similar at all"

PeterMR says, "1ccr and 1ccx are not necessarily related"

PeterMR says, "the prob is the PDB started with only 4-letters, so really the codes don't mean very much."

PeterMR says, "the early ones, e.g. 1ins for insulin were OK"

PeterMR ghashas a bad deelet keey

PeterMR says, "there are about 100 structures of mutant lysozymes."

PeterMR says, "so it is impossible to use similar codes."

PeterMR says, "I shall go to another terminal IRL - 1 minute - chat among yourselves..."

Ahotz says, "me too"

Auroram [to so]: we have to search for homologies in the sequence to find the possible family members?

Paulyta says, "At some sites you can key-word browse PDB files"

JohnW [to auroram]: one way of finding other family members is to search a database of structures, e.g. there is a search interface at PDB

PeterMR is back again

PeterMR says, "does anyone have a protein with no similar ones in the database?"

Kdenton says, "no"

Salim says, "I have 4 similar ones"

PeterMR says, "how many people have mutant proteins?"

Ahotz says, "no, mine even have similar names 4cms,1cms,2cms and 3cms"

Giovanni says, "I have two similars"

Auroram says, "[to Peter]I do not know yet"

PeterMR says, "who has a protein from two species?"

Franco says, ""nothing like 1abc 2abc, but lots of mutants or otherwise related""

KarlS says, "I have structures of different oxidation states"

Luis says, "I have 2cpl and could not find other Xcpl's"

PeterMR says, "how many of you have searched for the name of your proteins?"

Salim says, "Is it the text file that tells me about the mutant forms?"

Giovanni says, "I have 1ald-human and 1fba is drosophila and 1kag is pseudomonas"

Kdenton says, "I don't know how to distinguish it as a mutant"

Paolo says, "I have several structures of the protein with different het groups"

PeterMR thinks it is a good idea to ask the questions and to let the answers come in at random...

Ahotz says, "I have searched with the name and didn't find more"

KarlS says, "I did and I found structures from three other species and oxidation states"

Auroram says, "I know the name of my protein"

PeterMR says, "the discussion is going very well. I am learning a lot already! :-)"

Kdenton says, "I have about 10 xabp's and more 1abpx's"

Luis says, "Is there a systematic way to go about finding related proteins ?"

Salim says, "What about the mutants, fellas??"

PeterMR [to ahotz]: - what is your proteins *name*?

Giovanni says, "and what about subunits?"

PeterMR says, "lusi - the best way is to use a database of comparison like SCOP"

PeterMR meant Luis

JohnW nods

PeterMR says, "another good way is to look at Swissprot."

Luis says, "I must try that. How do I get there?"

Luis says, "I did look at swiss prot and eventually found my protein."

JohnW says, "searchable databases are what is required- such as Scop, and also Brookhaven PDB has a search interface on its WWW pages."

PeterMR suggest we move to Q3----------------------------------------------------

Franco says, ""let's put it this way. I have no experience in protein sequence, and I went randomly around databases. Which is the best route to follow?""

Luis says, "I finally did by reference search, which I'm sure is not the right way"

Ahotz says, "chymosinB"

Kdenton says, "what about different non-protein constituents?"

PeterMR [to Franco]: - swissprot is usually the best starting place

Luis says, "Question 3 is easy. I think I'm getting to be a rasmol wizard"

JohnW says, "SCOP URL (also mirrored) is"

Marek says, "About non-protein constituents - can you rely on RasMol?"

Kdenton says, "how do you work out %MW of water?"

PeterMR [to kdenton.]: That is a misprint. It should be *but* different non-protein constits...

Auroram says, "my protein ( a cytokine) has missed 24 amino acids from the N-terminal, can I consider it as a mutant?"

PeterMR [to auroram]: that is a good observation. It could be a signal sequence

PeterMR says, "or it could be that the residues could not be seen experimentally"

PeterMR [to auroram]: or it could be a pre-protein that was cleaved.

Paolo says, "I also have a signal peptide, not included in PDB entry, that lack 1 amino acid more"

Kdenton says, "or molecular volume?"

PeterMR: interleukin-1 occurs as a pre-protein of about twice the length

Tday says, "I also would like to know about molecular volume"

KarlS says, "My PDB file says No. of solvent atoms is 110. And I find 110 HOH molecules. Is this the same?"

PeterMR asks about water. Does anyone NOT have water?

PeterMR says, "HOH == water. water is often called solvent"
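
The check KarlS describes, counting the HOH entries, can be sketched in Python. This is a rough illustration only: the function name count_waters and the sample lines are assumptions made up for the example, and the code assumes the usual fixed-width PDB layout (residue name in columns 18-20, chain and residue number in columns 22-27).

```python
def count_waters(pdb_lines, water_names=("HOH", "WAT")):
    """Count distinct water residues found on HETATM records."""
    seen = set()
    for line in pdb_lines:
        if not line.startswith("HETATM"):
            continue
        res_name = line[17:20].strip()
        if res_name in water_names:
            # Key on chain ID plus residue number/insertion code, so a water
            # deposited with hydrogens is still counted once.
            seen.add((line[21], line[22:27]))
    return len(seen)


# Made-up sample records, for illustration only.
sample_waters = [
    "ATOM      1  N   MET A   1      27.340  24.430   2.614  1.00  9.67",
    "HETATM  602  O   HOH A 101      10.000  11.000  12.000  1.00 20.00",
    "HETATM  603  O   HOH A 102      13.000  14.000  15.000  1.00 25.00",
    "HETATM  604  S   SO4 A 201      16.000  17.000  18.000  1.00 30.00",
]
print(count_waters(sample_waters))
```

With a real entry, the lines would come from open("yourfile.pdb") instead of the sample list; the count can then be compared with the number of solvent atoms quoted in the header.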

Franco says, ""how do you know about water?""

Ahotz says, "I have water, too"

Giovanni says, "i dont have water. 1ald= aldolase A"

Marek says, "My ubiquitin did not have water?"

KarlS says, "Are other solvents than water signified by their names in pdb files?"

PeterMR says, "early structures (or badly crystalline ) do not have water since the diffraction is not good enough. 1ald is probably early. There really IS water there!"

Luis [to Marek]: isn't that because its a membrane prot?

PeterMR [to marek.]: when was your structure done?

PeterMR doesn't think that ubiquitin is a membrane prot :-)

Luis says, "sorry. I'm not a biochemist."

Auroram says, "no, ubiquitin is cytoplasmatic"

PeterMR [to asks]: what the resolution is for proteins without water?

KarlS does not understand PeterMR's last question

PeterMR [to Marek]: - do you know what the resolution is? It should be near the top of the file
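
For anyone who would rather not scroll through the header by hand, here is a small Python sketch that pulls the resolution out of the REMARK 2 line. It assumes the entry carries a line of the form "REMARK   2 RESOLUTION. ... ANGSTROMS."; the function name read_resolution and the sample lines are illustrative assumptions.

```python
import re


def read_resolution(pdb_lines):
    """Return the resolution in Angstroms from REMARK 2, or None if absent."""
    for line in pdb_lines:
        if line.startswith("REMARK   2") and "RESOLUTION" in line:
            match = re.search(r"RESOLUTION\.\s+(\d+(?:\.\d+)?)", line)
            # Entries without a numeric value (e.g. "NOT APPLICABLE") give None.
            return float(match.group(1)) if match else None
    return None


# Made-up header lines, for illustration only.
xray_header = ["REMARK   2", "REMARK   2 RESOLUTION.    1.63 ANGSTROMS."]
other_header = ["REMARK   2 RESOLUTION. NOT APPLICABLE."]
print(read_resolution(xray_header))
print(read_resolution(other_header))
```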

Marek says, "Ubiquitin PDB entry was last revised in 1994"

Giovanni says, "1ald is 3 A"

PeterMR says, "every protein diffracts x-rays to a different amount (every crystal, I mean). If the diffraction is poor, the resolution is a large number (e.g. 3 Angstroms)"

PeterMR [to Giovanni]: well done. 3 A is poor, therefore no waters observed. (But they are there :-)

JohnW says, "for example, I spoke to Jens about his structure last week, which has a low resolution (2.0) and no water molecules in the PDB file"

PeterMR says, "has anyone got another solvent other than HOH or WAT??"

Auroram says, "is it 2 considered as low resolution?"

Marek says, "I have a rather slow connection - same for other people?"

Silke thought that only ordered water would be seen

PeterMR [to JohnW]: - maybe the experimenters were lazy :-). Water determination is the last thing that is normally carried out before final refinement

Luis says, "mine is 1.63 A. I guess that's pretty good."

PeterMR [to Luis]: - it is good. The best is crambin at 0.9

Ahotz says, "in the HETATM records it should appear if the solvent is different; mine is water"

KarlS says, " I found some structures which were determined by NMR. They did not have information about resolution. Why?"

Ahotz says, "my resolution is 2.2A"

PeterMR says, "resolution is very important. At 3 A you can't see atoms well. at 1.5 you can start seeing them."

Kdenton says, "the resolution for my protein is 1.8 ang"

PeterMR [to karls.]: resolution is an X-ray term only. NMR produces multiple structures. There has been some work trying to create an analogous result for NMR

Paolo says, "My protein has resolution 2 A"

PeterMR says, "suggests we move to Q4 ************************************************"

Kdenton says, "I have to go shortly-could we talk about counter-ions?"

PeterMR says, "any non proteins bound? covalent? non-covalent?"

Marek says, "[to PeterMR] Resolution 2.4 Angstroms"

PeterMR says, "counter ions. any examples?"

Silke says, "how can you see the non Protein? my protein has NADPH bound, says the file"

PeterMR says, "these should all be on HETATM cards and maybe have HET records describing them."

Kdenton says, "my protein has galactose bound"

PeterMR [to Silke]: - this is non-covalently bound

Luis says, "no non-protein groups in mine (a bit of a disappointment)"

KarlS says, "If sulfate is a counter ion then I have one in my protein"

Paolo says, "My protein has a Fe3+ and a CO3--"

Ahotz says, "I have no counterions or non-protein const."

PeterMR [to kdenton.]: This *might* be covalent or might be non-covalent. Any ideas?

PeterMR [to KarlS]: it is :-)

Marek says, "Is RasMol a sufficient way to recognise non-protein parts?"

PeterMR says, "++++++ Q5 ++++++++++++++ disulphides"

Kdenton says, "non covalent modulated by water?"

PeterMR [to marek.]: yes. there is a command (I can't remember it) to show solvents and substrates.

Luis says, "no disulphides. but a few Cys"

Ahotz says, "I have 3 SS bridges in mine"

Paolo says, "My protein has six disulphide bonds"

PeterMR says, "does everyone know how to look for disulphides?"

KarlS says, "No s-s, but two Cys"

Kdenton says, "no s-s but 1 cys"

Franco says, ""no""

Marek says, "Me neither"

Paolo says, "ssbonds on ?"

TRex [to Marek]: in the command line of RasMol type not water

Silke says, "I tried: color ssbonds green and there wasn't anything green then"
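
Apart from the RasMol display, disulphides that the depositors annotated can be read from the SSBOND records in the header of a PDB file. A rough Python sketch follows, assuming the entry has such records in the fixed-width layout (chain IDs in columns 16 and 30, residue numbers in columns 18-21 and 32-35). The function name read_ssbonds and the sample records are illustrative assumptions.

```python
def read_ssbonds(pdb_lines):
    """Return ((chain, resnum), (chain, resnum)) pairs from SSBOND records."""
    bonds = []
    for line in pdb_lines:
        if line.startswith("SSBOND"):
            first = (line[15], int(line[17:21]))
            second = (line[29], int(line[31:35]))
            bonds.append((first, second))
    return bonds


# Made-up sample records, for illustration only.
sample_ssbonds = [
    "SSBOND   1 CYS A    6    CYS A  127",
    "SSBOND   2 CYS A   30    CYS A  115",
]
for (chain1, num1), (chain2, num2) in read_ssbonds(sample_ssbonds):
    print(f"Cys {chain1}{num1} -- Cys {chain2}{num2}")
```

An empty list only means that no bridges were annotated; free cysteines still have to be checked in the structure itself.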

PeterMR will keep moving one through the Q's....

PeterMR +++Q6 any ambiguity about the composition?????

Luis says, "I couldn't make head or tail of that question."

KarlS says, "Me too"

Silke doesn't know how to find out about ambiguities

PeterMR [to luis.]: Sometimes it isn't poss to know what the aminoacid is.

PeterMR says, "for example, the code ZSX means ASN or ASP. "

PeterMR meant ASX

Paolo says, "Some aa don't have side chain density"

Franco says, ""What about ACE in seqres?""

PeterMR says, "early structures didn't have a known sequence. So people guessed from the X-ray data."

PeterMR says, "ACE = acetyl (n-terminus)"

JohnW [to Marek]: e.g. 'select protein' 'select solvent' etc shows the different parts of the structure

PeterMR says, "sometimes a protein is given all ALA sidechains because the seq. isn't known"

PeterMR says, "paolo - that is unfortunate!! do you know the sequence from Swissprot?"

PeterMR [to ahotz.]: thanks

Paolo says, "Yes, 1lct trfl_human"

Paolo says, "in SEQRES they are correctly displayed, however"

PeterMR [to paolo.]: do not feel second-class if your protein has a poor structure! you can learn a lot even from a rough structure

JohnW [to paolo]: what is your PDB code?

Paolo says, "My prot has 2A"

JohnW . o O ( too late )

Paolo says, "1lct"

Silke says, "when SEQRES looks unambiguous, is the sequence determined ok then?"

JohnW is looking at 1lct structure file

Paolo says, "Some residues were also omitted"

PeterMR moves to Q7++++++++++++++++++++++++++++++++++

PeterMR says, "how many people have more than one *macro*molecule (i.e. not solvent, ligand) in the structure?"

PeterMR [to silke.]: Both sequences and structures are *fairly* reliable now, but early ones were often poor.

PeterMR [to silke.]: SEQRES can sometimes be wrong but not often.

Auroram says, "Could anybody tell me what it means that the completeness of the structure is, let's say, 90%?"

Ahotz says, "me not"

PeterMR notes that Q8 is about discrepancies between SEQRES and ATOM.

Marek says, "[to PeterMR]I have two chains in the ubiquitin"

Luis says, "what is ATOM?"

JohnW says, "ATOM lists the coordinates of each protein atom"

PeterMR [to auroram.]: Probably means that 90% of the residues have been correctly identified. (or that 90% of the atoms have been reliably located.)

Silke says, "[to auroram]: where have you found the note about completeness?"

PeterMR [to marek.]: I think you will find these are two separate molecules of ubiquitin?

JohnW says, "(and some other data in ATOM record- more of this later)"

Luis says, "ATOM is a never ending list. I thought it was not meant for humans, only for programs"

TRex smiles

Marek says, "[to PeterMR]Yes, the chains have the same composition"

PeterMR [to luis.]: I think you need to view it with a screen editor...

JensJL says, "In the atom list you may check how complete the data is"

JensJL says, "Of course it takes time and patience"

Ahotz says, "sometimes in the remark section you find also hints about completeness"

PeterMR says, "Marek has an important case. Two identical mols."

Luis says, "I gather structural biologists are *very* patient people :-)"

Auroram says, "[to Silke] It is on the information file about my protein, in data collection"

Marek says, "[to PeterMR]Why important?"

PeterMR says, "this is *sometimes* because biological molecules are dimeric, *sometimes* just a crystallographic artefact. Not always easy to tell."

PeterMR says, "examples of dimers are many allosteric enzymes."

JohnW says, "I think I should get some notes on crystal symmetry up. Or does anyone know of some good ones already? BTW we will be covering symmetry in more detail in a later chapter..."

Paulyta says, "there is a crystallography site"

JohnW thanks JensJL

PeterMR Q9 addresses this +++++++++++++++++++++++++++++++++

Silke says, "why are insulin dimers and hexamers biologically important, since insulin is dilute in blood?"

JohnW says, "I should also mention the symmetry tutorial program you can get from Birkbeck, but it only runs on MS Windows :"

JensJL says, "I've the link on my computer back in my office. (the crystallography site). Tomorrow I'm there again."

PeterMR will try to circulate the recording!!

PeterMR [to silke.]: Insulin is stored in the beta cells as hexamers. they actually form mini-crystals in the cells!!

Giovanni says, "I have a question. I dont know if related."

Marek says, "[to PeterMR and Silke]Would it not be the case with many peptide hormones -tightly packed?"

PeterMR [to Giovanni]: - ask!!


Giovanni says, "how can i get the complete helix"

JensJL says, "This means you have a mirror-plane or something like it which gives you the same protein again in the cell."

Giovanni says, "actually is DNA"

JensJL says, "If you could provide me with the symetry operation, I could look up, what it does."

PeterMR [to giovanni.]: DNA has two strands related by a symmetry axis (one does one way the other in the opposite direction)

PeterMR [to giovanni.]: It will be a two-fold rotation axis.

JohnW says, "so it must be a 'palindromic' sequence"

ClareS [to Giovanni]: A lot of modelling programs will be able to generate symmetry replicates automatically, given the space group

Giovanni says, "Yes. but how can you regenerate the complete structure? I have only one strand in the pdb file."

PeterMR [to JensJL]: biological molecules are normally chiral so don't have mirror planes :-). normally 2-, 3- 4- fold axes, etc

Giovanni says, "yes its palindromic"

JensJL says, "Sorry, coming from inorganic chemistry. "

TRex [to JohnW]: no,it doesn't have to be palindromic

JensJL says, "To Giovanni. Probably there are programs out which will do the job. Alternatively you cut out all the atoms and use the symmetry operation on the coordinates."

Giovanni says, "do you think should do it manually?"

Silke [to TRex]: if it's not palindromic, what symmetry does it have then?

JensJL says, "A basic program doing it could be easily written (10-30 min) and then you paste the additional atoms back in the file. This would be a way I would do it if I wouldn't find another way"
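
The kind of basic program JensJL has in mind might look like the Python sketch below. It is only an illustration under stated assumptions: the symmetry operation is taken to be already expressed as a rotation matrix and translation in the same orthogonal Angstrom frame as the coordinates (real crystallographic operators are normally given in fractional coordinates and depend on the space group), and the two-fold axis is placed along z purely for the example. The names apply_symmetry and TWO_FOLD_Z are hypothetical.

```python
def apply_symmetry(coords, rotation, translation=(0.0, 0.0, 0.0)):
    """Apply x' = R x + t to each (x, y, z) coordinate."""
    new_coords = []
    for x, y, z in coords:
        new_coords.append(tuple(
            row[0] * x + row[1] * y + row[2] * z + shift
            for row, shift in zip(rotation, translation)
        ))
    return new_coords


# Two-fold rotation about the z axis: (x, y, z) -> (-x, -y, z).
TWO_FOLD_Z = ((-1, 0, 0), (0, -1, 0), (0, 0, 1))

# Made-up coordinates standing in for the atoms of one strand.
strand = [(1.0, 2.0, 3.0), (4.0, -5.0, 6.0)]
partner = apply_symmetry(strand, TWO_FOLD_Z)
print(partner)
```

Applying a two-fold operation twice should return the starting coordinates, which is a quick sanity check on the matrix. The generated atoms would then be pasted back into the file as extra records.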

Giovanni says, "ok. thanks."

PeterMR [to giovanni.]: This is a mathematical operation. Unfortunately RasMol doesn't have the code yet.

PeterMR [to giovanni.]: You have to have a program that understands symmetry (most commercial modelling packages do).

PeterMR thinks that symmetry comes a bit later in the course???

Paulyta says, "what is the space group?"

PeterMR [to JensJL]: I used to be an inorganic chemist as well :-)

PeterMR [to giovanni.]: I think that PDB offer some examples of the *biologically* relevant molecules (e.g. 2-stranded DNA).

PeterMR has to go upstairs.

JensJL says, "I've done it manually for 5-10 atoms, but normally I wouldn't do it for more."

JohnW says, "go on"

Paulyta says, "there are several PDB file manipulation programs that will do the job"

Gayle schulte says, "the two fold operation that takes the 5'----3' and turns it into the 3'----5' should make your DNA double stranded"

JohnW nods to PeterMR

JohnW says, "symmetry is examined in the chapter on quaternary structure of proteins"

PeterMR is back - are we still discussing or is everyone off to the pub?

PeterMR asks whether anyone has Qs from the bottom half (10-18) of the list??

Ahotz says, "there was a question what is the space group, thats what also interests me"

KarlS says, "How can I find out about processing?"

PeterMR asks whether another session would be useful???

Tday says, "I am still interested in how you measure molecular volume and surface area"

JensJL says, "The spacegroup defines, what symmetry operations you have in the unit-cell."

ClareS thinks it might be

Ahotz says, "I also would like to know what the R factor and the temperature factors are for"

TRex says, "I don't understand processing either"

Silke agrees with ClareS

PeterMR [to KarlS]: processing is often mentioned in swissprot rather than in PDB.

Paolo says, "I agree with ClareS"

KarlS nods

Ahotz [to TRex]: I have a PrePro protein

JensJL says, "Proteins being quite complicated structures shouldn't have much symmetry and therefore only a few spacegroups should be found."

PeterMR says, "processing.... Some proteins are produced in a form that is later modified."

ClareS says, "the R factor describes mathematically the "goodness of fit" between the model structure and the X-ray data"

KarlS says, "Like cutting off signal peptides?"

PeterMR says, "e.g. there is a signal sequence which gets them thro a membrane and then it might be chopped off."

ClareS says, "the temperature factors describe how well resolved each atom is "

Franco says, "where do you get info about processing?"

TRex says, "Is processing info in the PDB file, or do you have to go to Expasy?"

ClareS says, "an atom with a high B (temperature) factor is "fuzzier", indicating it is moving more and is "hotter" (in quotes)"

KarlS says, "Where can I find more information about R-values and temperature factors?"

PeterMR says, "insulin is a preproprotein. It has a signal sequence which is cleaved to a pro-insulin, then pro-insulin is cleaved to give an A chain, a B chain and a C chain."

Ahotz says, "in my c@more"

TRex says, "I'm not quite clear about the difference between R factor and resolution"

ClareS [to KarlS]: try a standard classic protein crystallography text, like Blundell & Johnson

JensJL says, "I will try to get some crystallographic links together."

PeterMR [to KarlS]: - any crystallography book (e.g. Ladd and Palmer, Glusker and Trueblood or buy a crystallographer a drink.

Marek says, "[to ClareS] R-factor - the smaller the better?!"

ClareS [to Marek]: yes

JohnW nod to JensJL

Ish says, ""there is indirect processing info in the PDB, like in trypsin which has "holes" for the pieces cut out of trypsinogen...kind of looks like missing sequence parts."

PeterMR thanks jens very much that would be excellent - nothing too detailed.

JensJL says, "Yes with R-factor. Smaller means better."

PeterMR says, "what about the temp factor???"

TRex says, "Can you have a good R factor but poor resolution?"

Silke says, "what about temperature? does it have anything to do with real temperature or is it just an arbitrary number?"

ClareS says, "smaller temperature factor means an atom or residue is particularly well defined"

ClareS says, "atoms with large B factors are often on the protein surface"

PeterMR [to TRex]: - it wouldn't be a very good structure

Marek says, "I can't see any temp factor entry in my PDB file"

JohnW [to ahotz]: you e-mailed a question about 'u' . u squared is the mean square amplitude of displacement of the atom from the equilibrium position.

Silke [to ClareS]: so resolution applies to the whole molecule, temperature to individual atoms?

PeterMR [to TRex]: - poor resolution isn't worth refining very much so you wouldn't always get an R-factor at all...

ClareS [to Marek]: temp factors are the last (rightmost) column in the ATOM records
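
To look at the temperature factors without reading the ATOM list by eye, a short Python sketch can average them per residue. It assumes the fixed-width PDB layout with the B factor in columns 61-66 of each ATOM record; the function name mean_b_by_residue and the sample records are illustrative assumptions.

```python
from collections import defaultdict


def mean_b_by_residue(pdb_lines):
    """Average the temperature (B) factor over the atoms of each residue."""
    values = defaultdict(list)
    for line in pdb_lines:
        if line.startswith("ATOM"):
            # Key: (chain ID, residue number, residue name)
            key = (line[21], int(line[22:26]), line[17:20])
            values[key].append(float(line[60:66]))
    return {key: sum(b) / len(b) for key, b in values.items()}


# Made-up sample records, for illustration only.
sample_atoms = [
    "ATOM      1  N   MET A   1      27.340  24.430   2.614  1.00  9.67",
    "ATOM      2  CA  MET A   1      26.266  25.413   2.842  1.00 11.33",
    "ATOM      3  N   GLN A   2      24.000  26.000   3.000  1.00 20.00",
]
for (chain, number, name), mean_b in mean_b_by_residue(sample_atoms).items():
    print(chain, number, name, round(mean_b, 2))
```

Residues with unusually high averages are the ones to inspect first, for instance loops on the protein surface.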

Ahotz [to JohnW]: thanks

ClareS [to JensJL]: do you know of ?

KarlS says, "what's the difference between temperature and thermal factors?"

JohnW says, "u again- I should say <u2>, where < > means a summation over the different axes (I think?). The B factor is therefore 8 x (pi)sqd x <u2>"
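
As a worked example of this relation: the isotropic B factor is usually written B = 8 * pi^2 * <u^2>, that is 8 times pi squared (not the square of 8 pi) times the mean square displacement in square Angstroms. The Python sketch below converts in both directions; the function names are illustrative assumptions.

```python
import math


def b_from_mean_square_displacement(u2):
    """B factor (A^2) from the mean square displacement <u^2> (A^2)."""
    return 8.0 * math.pi ** 2 * u2


def rms_displacement_from_b(b_factor):
    """Root mean square displacement (A) implied by an isotropic B factor."""
    return math.sqrt(b_factor / (8.0 * math.pi ** 2))


# A B factor of 20 A^2 corresponds to an rms displacement of about half an Angstrom.
print(round(rms_displacement_from_b(20.0), 2))
```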

Marek says, "[to ClareS] Thanks!"

TRex says, "Thanks Peter - R factor depends on resolution, and is a measure of how good the model you construct is?"

JensJL says, "Yes I visited the site regularly during my time in Frankfurt."

PeterMR [to trex.]: R-factor is a measure of error between the calculated structure and the observed diffraction. It can depend on many things.

JensJL says, "About the R-factor debate. A good R-Factor is dependend on several points."

JensJL says, "Most important are the 'heavy' atoms. Of course you don't have them in proteins."

PeterMR [to TRex]: bad data --> high R-factor. bad model --> high R-factor. Complete screwup gives high R-factor
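
For reference, the conventional crystallographic R-factor is the sum of the absolute differences between observed and calculated structure-factor amplitudes divided by the sum of the observed amplitudes. A minimal Python sketch, assuming the two lists are already matched reflection by reflection and on the same scale (the scaling step is left out); the function name r_factor and the numbers are made up for illustration.

```python
def r_factor(f_obs, f_calc):
    """R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|) over matched reflections."""
    numerator = sum(abs(abs(obs) - abs(calc)) for obs, calc in zip(f_obs, f_calc))
    denominator = sum(abs(obs) for obs in f_obs)
    return numerator / denominator


# Made-up amplitudes, for illustration only.
observed = [10.0, 20.0, 30.0]
calculated = [9.0, 22.0, 30.0]
print(r_factor(observed, calculated))
```

A perfect model would give 0; either noisy amplitudes or a wrong model push the value up, which is why the number alone does not say which of the two is at fault.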

TRex smiles

JohnW says, "well, you do have to use heavy-atom derivatives of proteins, to solve the phase problem."

JensJL says, "But for example missing out hydrogens doesn't affect the R-factor much."

PeterMR says, "crystallographers can 'fudge' the R-factors so people are working on 'absolute' methods to see if they are telling the truth."

Gayle schulte says, ": how "

TRex says, "Some proteins contain heavy atoms naturally (hemoglobin)"

PeterMR says, "one method is called 'r-free'.. But always always always keep your critical brain switched on."

Gayle schulte says, "how many observations to parameters is a good ratio in protein crystallography?"

JohnW nods

JensJL says, "I worked a lot with investigating structures and getting data out of correlations. R-factors are often used to define quality, but I couldn't recommend this."

PeterMR [to gayle.]: not sure. about 3???

PeterMR feels that 90 mins is quite enough. Also has to go home...

ClareS agrees with Peter

KarlS agrees with Peter


PeterMR meant session?

JohnW also agrees!

ClareS will leave that decision to the students...

Gayle schulte says, "yes this seems very productive to me"

TRex says, "I have learnt a great deal tonight - thank you to everyone!"

Tday says, "me too"

Ish says, ""Before you all leave, can you tell me where to find out about these sessions...are they posted somewhere...thanks? Are there planned agendas or questions?"

Silke agrees

JohnW says, "I would like to thank Peter for answering so many questions and keeping up such a fast typing rate!"

Gayle schulte says, "ditto JohnW, thank you!"

TRex says, "How can we tell if the students want another. Do we email them?"

Silke says, "Thank you, Peter!"

ClareS [to TRex]: Probably the best idea...

TRex [to Ish]: the notices are usually emailed to PPS-general mail-list

JohnW says, "Yes, info about forthcoming meetings is posted to the lists, in which we ask people for their particular queries, so we can put together an agenda."

Ish says, ""hmmm...better get on that list! Thanks!"

JohnW says, "I also try to put the details on the Noticeboard but I think I forgot about this one, oops... "

TRex says, "Will a transcript of this meeting be posted or mailed?"

JohnW says, "The list you want is pps96-general"

ClareS [to JohnW]: this meeting was on the Noticeboard, I've just checked

PeterMR [to trex.]: I have tried to capture the session and will mail it (or post it if I am successful!!!)

TRex smiles

PeterMR turns the recorder off.