Introduction
Component Programming
In an ideal world, software application development would be straightforward.
One would simply decide what code modules are needed, in what order they should
be executed and what information should be passed between them. One would
then go to a software library to find which of the required modules had
already been written. Modules written from scratch would then be combined
with the library modules to make up the finished application. Component
programming of this sort is not a new idea. In fact, McIlroy (McIlroy,
1968) proposed it almost thirty years ago. Unfortunately, this form of
programming is still not mainstream and code reuse remains limited. However,
it has been suggested that recent advances in computer science now make
component programming a practical proposition if standard libraries of
components are constructed along certain guidelines (Jazayeri, 1995). Another
factor that is likely to make component programming more popular is the
advent of new component technologies such as JavaBeans and ActiveX.
Code Reuse
There have been several attempts at increasing the amount of software reuse
within biocomputing. These efforts have relied on procedural or object-oriented
libraries. Procedural libraries of note are the program suites provided
by CCP4, which are widely used in protein crystallography. Their approach is
to create and adapt programs so that they can communicate via files in common
formats. The GCG, EGCG and SEQNET libraries contain programs for biological
sequence analysis. These efforts have proven very
popular but are aimed at program users rather than developers.
Object-oriented Programming
The use of object-oriented programming languages such as C++ and Java is
now widespread within biocomputing and such languages facilitate component
programming and contain features that make code reuse easier (Stroustrup,
1997). There are two published object-oriented class libraries developed
for use in biomolecular computing. PDBlib (Chang et al., 1994) represents
three-dimensional macromolecular structure in the form of a C++ class library.
SCL (Sequence Class Library) (Vahrson et al., 1996) is designed for use
in analysing DNA and protein sequences. Peter Murray-Rust has produced,
but not published, a C++ class library for use in computational chemistry
and bioinformatics called Democritos.
Our impression is that, while these class libraries have proven useful
to their authors, there has been little uptake by other programmers. In
this paper the limitations of pure object-oriented design (OOD) that can
constrain code reuse are discussed.
Generic Programming
One of the conditions that Jazayeri (Jazayeri, 1995) puts upon software
component libraries is that the constituents should be as generic as possible.
Generic components can be used in more than one context. A generic algorithm
can be applied to data stored in a range of data structures. Thus, generic
algorithms and data structures can be combined in an orthogonal manner (Musser
and Stepanov, 1988). This flexibility can only be achieved if some abstraction
and parameterization of the components is allowed. Such a facility is provided
in the object-oriented programming languages Ada (Musser and Stepanov,
1989) and C++.
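The orthogonal combination of algorithms and data structures described above can be illustrated with a minimal C++ sketch. The names count_matching, count_in and check_sequence_counts are hypothetical and chosen only for this illustration; they are not taken from any of the libraries cited here. The sketch assumes nothing about a container beyond its providing begin() and end() iterators, so one algorithm serves a string, a vector and a linked list alike.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>
#include <vector>

// Generic algorithm (hypothetical name): counts the elements in the range
// [first, last) that compare equal to `value`. It relies only on the
// iterator supporting !=, ++ and *, not on any particular container.
template <typename InputIt, typename T>
std::size_t count_matching(InputIt first, InputIt last, const T& value) {
    std::size_t n = 0;
    for (; first != last; ++first) {
        if (*first == value) {
            ++n;
        }
    }
    return n;
}

// Hypothetical convenience wrapper for any container with begin()/end().
template <typename Container, typename T>
std::size_t count_in(const Container& c, const T& value) {
    return count_matching(c.begin(), c.end(), value);
}

// Illustrative check: the same algorithm is applied, unchanged, to three
// different data structures holding the same (made-up) DNA sequence.
inline void check_sequence_counts() {
    const std::string as_string = "GATTACA";
    const std::vector<char> as_vector(as_string.begin(), as_string.end());
    const std::list<char> as_list(as_string.begin(), as_string.end());

    const std::size_t adenines = count_in(as_string, 'A');
    assert(count_in(as_vector, 'A') == adenines);
    assert(count_in(as_list, 'A') == adenines);
}
```

The algorithm is parameterized on the iterator type rather than written against one container, which is the kind of abstraction the paragraph above refers to. The C++ standard library provides the same operation as std::count in the header <algorithm>; the hand-written version is shown only to make the mechanism visible.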