PRINCIPLES of PROTEIN STRUCTURE
An interactive course on the Internet

Birkbeck College, London, UK
and Virtual School of Natural Sciences (VSNS)
(Affiliated to the Globewide Network Academy)

Introduction
------------
With the explosive growth in the number of protein structures and their
key role in understanding biological processes, there is a huge demand
for effective ways for biologists, chemists and other scientists to
learn the principles of three-dimensional protein structure. This
course is a completely new approach and a new step in the use of
networks and computers for educational purposes.

Background
----------
The Globewide Network Academy (GNA) has received acclaim for its
pioneering work in the world's first fully distributed interactive
course (an introduction to C++). This course used several technologies
for the first time:

  A hypertextbook on the WorldWideWeb with a vast range of other
    resources
  An interactive environment (in Diversity University MOO) for
    discussions
  A rapid and extremely flexible set of discussions using listservs
    which were also html'ised for the WWW
  A set of interactive student projects.

All this was run at no monetary cost with volunteers (including
students) from over 20 countries, none of whom have ever physically
met. As a result several participants are continuing to make advances
in this educational technology.

The GNA is now developing very impressive WWW technology for the
administration of all aspects of the educational process, including
student registration, course admin, assessment, project work, etc.
There are also very effective discussion groups for the creation of new
courses.

This course (VSNS-PS)
---------------------
Building on this experience we are offering this course and adding many
new elements to create a resource which would be difficult to match in
any one *physical* location.
A major feature of the C++ course was that we developed it as we learnt
from experience, and we are keeping that flexibility here. Therefore the
details below may change, even after the course has started!

We shall start in January 1995 and take about 15 weeks, although we
shall allow students to proceed at different paces where possible. The
course will be prepared by about 7 leading groups in Protein Structure
who will also act as interactive consultants. (During the C++ course 3
students became consultants halfway through!)

In the period Sept-Dec 1994, the consultants will discuss the structure
and details of the course and prepare material of many different kinds.
Consultants (whose number will probably increase) will use the GNA
technology to discuss the course, decide on responsibilities and
implement central resources.

Students
--------
We envisage two types of student, real-life (RL) and virtual. The
virtual students (the only type on the C++ course) may be anywhere in
the world and will be selected as below. A new feature is that we hope
that some consultants will arrange for students in their own locality
also to take the course, which gives us a basic element of feedback
(not easy with virtual students). (The RL students may take this for
their own enlightenment or as an *additional resource* for an existing
course.)

Student registration will begin in (?) October and we expect to be
heavily oversubscribed. From our experience with C++ we've realised
that students must make an effective commitment to donate something to
the course, and we are therefore developing *learning contracts* whereby
both students and consultants agree at the start what each can expect
from the other. (As the course progresses the distinction between
student and consultant often becomes minimal, but it is important at
the start that students are aware that they intend to commit
themselves.)

STUDENTS SHOULD *ONLY* APPLY WHEN THE GNA REGISTRATION WWW FORMS ARE IN
PLACE AND CANNOT EXPECT ANSWERS TO E-MAIL REQUESTS.

Effectively the students commit themselves to give (some of):

  - feedback and discussion
  - project work
  - technology development
  - resource creation (e.g. hyper-articles, diagrams)
  - attendance at interactive electronic

Course Content
--------------
This will evolve as consultants discuss the course, but it should not
be overambitious, nor should it deal with too many areas where there are
differences of opinion (it's difficult to devise a collaborative
electronic course in these cases). Collaborative courses also lend
themselves to well defined modules whose content can be generally
determined from their title and description.

The course divides into two parts (*very* approximately equal):

  "Basic principles" and "technology"
  Examples of protein structure in families

"Basic principles" and "technology" are two parallel threads which
students can follow at their own pace and to a certain extent in their
own order. Note that some students will be familiar with some
components already. Provisionally the contents are:

Basic principles:
  Overview of protein synthesis (transcription, translation, transport,
    cellular organisation)
  Primary structure of proteins and nomenclature
  Protein geometry and nomenclature
  Overview of molecular forces (not mathematical)
  Secondary structure
  Tertiary structure (possibly through alpha, beta, alpha-beta)

Technology:
  Using WWW (browsers, navigation, forms, imagemaps)
  WWW resources (e.g. databases)
  Installing and running VSNS-PS programs locally
  BIOMOO and listservs

Protein families (very much up for discussion and *not* comprehensive):
  Globins
  Serine proteases
  Immunoglobulins
  MHC
  Nucleotide-binding proteins
  Membrane-bound proteins and receptors
  DNA-binding proteins
  Virus structure

Other topics which will be relevant and may be covered in passing are
protein structure prediction, quaternary structure, methods of
structure determination, etc.

We have to set limits for the content and the level and will err on the
side of too little. There is ample scope for discussion of advanced
topics in the listserv and MOO, but we will probably give little formal
coverage of:

  Molecular modelling
  Sequence alignment (especially multiple)
  Molecular evolution
  Enzyme mechanisms
  Nucleic acid and carbohydrate structure

Prerequisites and Objectives
----------------------------
Students should be familiar with (or be prepared to learn beforehand):

  Simple chemistry and nomenclature of common groups
  Basic geometry (coordinates, distances, angles)
  Appropriate computer skills (editing, installing programs, WWW and
    ftp, mail)

We intend to give clear guidelines on what is expected and, if possible,
pointers to where material or help can be found.

Each module will have objectives which all students should try to
attain, but many will go beyond. Where possible we hope to devise WWW
tools for self-assessment (e.g. students can test themselves on how
well they can recognise amino acids from pictures, formulae, etc.). This
self-assessment might be used (anonymously) for feedback for the
consultants.

Consultants
-----------

Resources and materials
-----------------------
The course will have a wide range of resources available. Most people
will probably use a subset...

** Local Programs:
These could be installed on a student's PC/Mac/Unix machine at home as
well as at their place of work/study/recreation...
They must be freely available, preferably multi-platform and well
supported by a subset of the consultants and students. This is a
minimal core which might grow, but we don't want to overstretch
students who find computer installations difficult:

  Mosaic
  A MOO client (tf, Muddweller...)
  graphics viewers (xv, lview)
  RasMol (3-D structure viewer)
  Kinemage (protein image viewer)
  WPDB (PDB browser and display)
  MACAW (sequence alignment - are there other PD ones?)

** Servers
The course resources will have links to many URLs, but there will be a
number of servers over which we have control (e.g. where we can mount
forms, imagemaps, etc.) or on which we can store course-related info
(this list will grow):

  PDB:
  SWISSPROT/Expasy/SWISSMODEL:
  GNA:
  SEQNET/CBMT:
  O:
  Birkbeck:

** BioMOO
BioMOO is used by a wide range of biologists and has been featured in
Science, but until recently was text-only. Gustavo and friends have
pioneered the use of graphics in the last few weeks and there are real
possibilities for the use of the MOO in communal discussions, without a
large investment in learning the MOO technology. There may be
programmers in BioMOO who are keen to help develop this. We have learnt
a lot from our MOO experience on C++ and will make the MOO interface
more friendly with additional resources within it.

** Glossary
With the WWW forms technology an extended glossary can now be compiled
as a communal effort from *all* course members, and made searchable.

** Clickable images (hyperdiagrams)
These are a very powerful learning tool and can be readily created by
willing course members for many parts of the course (genomes, cells,
protein images, biochemical pathways...)

** Self-assessment and drill tools
Several parts of the course require memorisation, such as amino acid
nomenclature, torsion angles and cellular components. WWW technology
allows us to build multiple choice questions which can be used for
drill or assessment. The GNA could help with the generic technology.
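
As a rough illustration of the kind of drill meant here, the sketch
below builds multiple choice questions on amino acid nomenclature. It
is our own minimal example, not part of any existing GNA or VSNS-PS
software: the names make_question and score are hypothetical, and a
real tool would sit behind a WWW form rather than print to a terminal.
Only the standard three-letter and one-letter codes are taken as given.

```python
import random

# Standard three-letter and one-letter codes for the 20 common amino acids.
AMINO_ACIDS = {
    "alanine": ("Ala", "A"), "arginine": ("Arg", "R"),
    "asparagine": ("Asn", "N"), "aspartic acid": ("Asp", "D"),
    "cysteine": ("Cys", "C"), "glutamic acid": ("Glu", "E"),
    "glutamine": ("Gln", "Q"), "glycine": ("Gly", "G"),
    "histidine": ("His", "H"), "isoleucine": ("Ile", "I"),
    "leucine": ("Leu", "L"), "lysine": ("Lys", "K"),
    "methionine": ("Met", "M"), "phenylalanine": ("Phe", "F"),
    "proline": ("Pro", "P"), "serine": ("Ser", "S"),
    "threonine": ("Thr", "T"), "tryptophan": ("Trp", "W"),
    "tyrosine": ("Tyr", "Y"), "valine": ("Val", "V"),
}


def make_question(rng, n_choices=4):
    """Build one multiple choice question (hypothetical helper).

    Picks an amino acid at random and offers its one-letter code
    alongside n_choices - 1 distinct wrong codes, in shuffled order.
    """
    name = rng.choice(sorted(AMINO_ACIDS))
    three_letter, correct = AMINO_ACIDS[name]
    wrong = [one for _, one in AMINO_ACIDS.values() if one != correct]
    choices = rng.sample(wrong, n_choices - 1) + [correct]
    rng.shuffle(choices)
    return {
        "prompt": f"One-letter code for {name} ({three_letter})?",
        "choices": choices,
        "answer": correct,
    }


def score(questions, responses):
    """Count how many responses match the stored answers."""
    return sum(q["answer"] == r for q, r in zip(questions, responses))


if __name__ == "__main__":
    rng = random.Random(1995)  # fixed seed so a drill can be repeated
    for question in [make_question(rng) for _ in range(5)]:
        print(question["prompt"], " ".join(question["choices"]))
```

Keeping question generation separate from scoring means the same
questions could be used either for private drill or, with responses
collected anonymously, as feedback for the consultants.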

** Listservs
This was a very powerful tool for C++, allowing discussion to take
place on an intermediate timescale. I'd suggest that we have several
listservs for different topics so that no one needs to read all the
traffic. Moreover the traffic can all be html'ised and put on the Web,
which makes it very easy to read.

** Hypermodules
The structure of the course breaks down into modules (of varying sizes)
and these can be run by different groups/people. An excellent example
of a hypermodule is HISTO (MHC) which, though not originally created
for this course, can be fitted in extremely easily.

** New technology
Web technology is advancing daily. For example, we have developed a way
of driving RasMol from WWW pages so that a click on a Ramachandran map
will highlight the residue in RasMol. It's very likely that some of the
groups will come up with new ways of driving things from WWW.

** Projects
There are a number of areas where projects may be an appropriate way of
learning, very often as a group. These projects might be to produce a
resource (e.g. a hyperarticle or a hyperimage) or to investigate an
idea (e.g. "are disulphide bridges always conserved?"). These could
very possibly run on after the course has finished.

Why run a course like this?
---------------------------
There is no doubt that at their best courses like this (and C++) offer
things that can't be done in many places IRL. We can get a critical
mass of enthusiastic students, many of whom bring complementary
expertise to the course. There is an enormous feeling of
internationality and cooperation, and, of course, many of us love being
involved in the learning process.

But also this course will spawn new technology, and possibly new ideas
in protein structure. The world will be watching, and this is a
powerful incentive to produce a course of the highest quality. Because
of that it's also available to anyone who can connect (even if they are
not registered as students).
We had hundreds of unofficial 'readers' of the C++ course, which was
even 'syndicated' in one country!

For those groups with technology or ideas that they want others to
test, the course community can be very useful. And it may also spawn
collaborative projects which continue afterward (as the C++ course
did).

Some general considerations
---------------------------
A course like this *must* be used as a way of enhancing present
education and not as a substitute for it. (The lack of human contact in
C++ was a problem at times and I'm not keen on totally electronic
education just yet!) It's important that it's international and that as
far as possible it can be a way of bridging the gap between countries
with different levels of resources. Thus we set the lower limits of
technology as:

  PC (386) or Mac equivalent
  free software
  Internet access at (?) 9600 baud
  No audio/realtime movies
  No hardware-specific tools (e.g. SGI-based modelling programs)

Obviously it will run better on UNIX boxes with high bandwidth
connexions, and consultant groups will undoubtedly prepare much of their
material in this way.

Birkbeck's role
---------------
Birkbeck is a world leader in protein structure and also adult
education. Several courses related to macromolecules are taught at MSc
level. It is totally committed to this type of venture and is funding
Alan Mills to develop network-learning techniques (including home
study). VSNS-PS will run very closely with some of the MSc courses and
although students are not *required* to take it, many will do so and
will contribute material as project work. The symbiosis between GNA and
a RL academic institution is seen as beneficial by both sides.

Peter Murray-Rust (pmr1716@ggr.co.uk)
Glaxo Research and Development, Greenford
mbglx@seqnet.dl.ac.uk
(Mailbox at Daresbury by kind permission of Alan Bleasby)