BTL - Introduction

Introduction

Component Programming

In an ideal world, software application development would be straightforward. One would simply decide what code modules are needed, in what order they should be executed and what information should be passed between them. One would then go to a software library to find which of the required modules had already been written. Modules written from scratch would then be combined with the library modules to make up the finished application. Component programming of this sort is not a new idea. In fact, McIlroy (McIlroy, 1968) proposed it almost thirty years ago. Unfortunately, this form of programming is still not mainstream and code reuse remains limited. However, it has been suggested that recent advances in computer science now make component programming a practical proposition, provided that standard libraries of components are constructed along certain guidelines (Jazayeri, 1995). Another factor that is likely to make component programming more popular is the advent of new component technologies such as JavaBeans and ActiveX.

Code Reuse

There have been several attempts at increasing the amount of software reuse within biocomputing. These efforts have relied on procedural or object-oriented libraries. Procedural libraries of note are the program suites provided by CCP4, which are widely used in protein crystallography. The CCP4 approach is to create and adapt programs so that they can communicate via files in common formats. The GCG, EGCG and SEQNET libraries contain programs for biological sequence analysis. These efforts have proven very popular but are aimed at program users rather than developers.

Object-oriented Programming

The use of object-oriented programming languages such as C++ and Java is now widespread within biocomputing. Such languages facilitate component programming and contain features that make code reuse easier (Stroustrup, 1997). There are two published object-oriented class libraries developed for use in biomolecular computing. PDBlib (Chang et al., 1994) represents three-dimensional macromolecular structure in the form of a C++ class library. SCL (Sequence Class Library) (Vahrson et al., 1996) is designed for use in analysing DNA and protein sequences. Peter Murray-Rust has produced, but not published, a C++ class library for use in computational chemistry and bioinformatics called Democritos. Our impression is that, while these class libraries have proven useful to their authors, there has been little uptake by other programmers. This paper discusses the limitations of pure object-oriented design (OOD) that can constrain code reuse.

Generic Programming

One of the conditions that Jazayeri (Jazayeri, 1995) puts upon software component libraries is that the constituents should be as generic as possible. Generic components can be used in more than one context. A generic algorithm can be used to operate upon data stored in a range of data structures. This allows generic algorithms and data structures to be combined in an orthogonal manner (Musser and Stepanov, 1988). This flexibility can only be achieved if some abstraction and parameterization of the components is allowed. Such a facility is provided in the object-oriented programming languages Ada (Musser and Stepanov, 1989) and C++.
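The orthogonal combination of algorithms and data structures described above can be illustrated with a minimal C++ sketch. It is not taken from any of the libraries named in this paper; the function names count_residues and residue_fraction are hypothetical and were chosen only for illustration. The sketch assumes a sequence is simply a container of characters, and shows one algorithm, parameterized on the iterator type, that works unchanged on data held in a std::string, a std::vector or a std::list.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <string>
#include <vector>

// Hypothetical generic algorithm (illustrative only, not a library API):
// counts the elements in the range [first, last) that compare equal to
// `target`. It requires only that the iterator can be incremented,
// compared and dereferenced, so it is independent of the container type.
template <typename InputIt, typename T>
std::size_t count_residues(InputIt first, InputIt last, const T& target) {
    std::size_t n = 0;
    for (; first != last; ++first) {
        if (*first == target) {
            ++n;
        }
    }
    return n;
}

// Hypothetical convenience wrapper: the fraction of elements in a
// non-empty container that are equal to `target`. It works for any
// container that std::begin and std::end accept.
template <typename Container, typename T>
double residue_fraction(const Container& seq, const T& target) {
    const auto total = std::distance(std::begin(seq), std::end(seq));
    assert(total > 0 && "sequence must not be empty");
    const std::size_t hits =
        count_residues(std::begin(seq), std::end(seq), target);
    return static_cast<double>(hits) / static_cast<double>(total);
}
```

Given the sequence "GATTACA" stored as a std::string, a call such as count_residues(s.begin(), s.end(), 'A') yields 3, and the same call gives the same answer if the characters are first copied into a std::vector<char> or a std::list<char>. The algorithm is written once and the choice of data structure is left to the caller, which is the kind of parameterization the paragraph above refers to.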