Evolutionary Biology and Evolutionary Computations: Cross-field Dissemination of Ideas
"An accident, a random change, in any delicate mechanism can hardly be expected to improve it. Poking a stick into the machinery of one's watch or one's radio set will seldom make it work better."—Theodosius Dobzhansky, Heredity and the Nature of Man (1964), p. 126.
The State of the Problem: "Something from Nothing"
"There are only two ways we know of to make extremely complicated things, one is by engineering, and the other is evolution. And of the two, evolution will make the more complex." - Danny Hillis.
"If complex computer programs cannot be changed by random mechanisms, then surely the same must apply to the genetic programs of living organisms. The fact that systems [such as advanced computers], in every way analogous to the living organism, cannot undergo evolution by pure trial and error [by mutation and natural selection] and that their functional distribution invariably conforms to an improbable discontinuum comes, in my opinion, very close to a formal disproof of the whole Darwinian paradigm of nature. By what strange capacity do living organisms defy the laws of chance which are apparently obeyed by all analogous complex systems?"—*Michael Denton, Evolution: A Theory in Crisis (1985), p. 342.
Indeed, under certain conditions, the evolution of fit by way of cumulative blind variation and selection appears inevitable. Consider a population of self-replicating entities that vary in ways relevant to their reproductive success, and that inhabit an environment of limited space and resources that does not undergo large fluctuations from one generation of entities to the next. If these entities produce quite (but not always perfectly) accurate copies of themselves, after a few generations the winnowing effect of selection will be noticed as the population inexorably shifts toward a preponderance of new entities that better fit their environment. This, in a nutshell, is what the process of cumulative blind variation and selection is all about.
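The conditions listed above (imperfect self-replication, heritable variation that affects reproductive success, limited resources, a stable environment) can be tried out in a few lines. The following is only a minimal sketch under stated assumptions: each entity is reduced to a single number standing for its fit to the environment, and the function name `evolve`, the population size, and the mutation spread are all illustrative choices, not taken from the text.

```python
import random

def evolve(pop_size=200, generations=60, mutation_sd=0.05, seed=0):
    """Track mean fitness in a fixed-size population of imperfect replicators."""
    rng = random.Random(seed)
    # ASSUMPTION: an entity is just one heritable number in (0, 1],
    # its degree of fit to a fixed environment.
    population = [rng.uniform(0.1, 0.2) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        history.append(sum(population) / pop_size)
        # Limited space and resources: the next generation has the same size,
        # and parents are drawn in proportion to their fitness.
        parents = rng.choices(population, weights=population, k=pop_size)
        # Copying is accurate but not perfect: small blind changes, clipped to (0, 1].
        population = [min(1.0, max(1e-6, p + rng.gauss(0.0, mutation_sd)))
                      for p in parents]
    return history

history = evolve()
print(f"mean fitness: start {history[0]:.3f}, end {history[-1]:.3f}")
```

Nothing in the loop looks ahead or knows what a good value is; the upward drift of the mean comes only from blind variation followed by differential copying.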
However, as we have now seen, these entities do not have to be restricted to living organisms and the genes they contain [Campbell, 1974]. They can be molecules, antibodies, neural synapses, behaviors, scientific theories, technological products, cultural beliefs, words, or computer programs. And selection does not have to be restricted to the natural and purposeless selection of Mother Nature, but may involve purposeful humans selecting for plants growing bigger tomatoes, cows giving more milk, scientific theories providing better predictions, automobile engines yielding greater efficiency, or molecules providing more powerful drugs. The robustness of the selection process was dramatically demonstrated by the findings of artificial life researchers [Ray, 199?] who were amazed at just how easy it is to get adaptive evolution happening on their computers. As long as the basic conditions of some (but not too much) variability, accurate (but not too accurate) replication, and a fairly stable environment prevail, the mechanistic, unforesighted evolution of fit appears inescapable.
Richard Dawkins has written several computer programs which function, he says, like evolution. Computers today are powerful and can run the programs very quickly. This speed enables Dawkins to compress a single generation, or computer trial, into a fraction of a second. In his widely acclaimed 1987 book The Blind Watchmaker, he describes several such programs. (The Richard Dawkins Unofficial Website)
Adaptations are the result of natural selection on spontaneously arising, heritable variation. Little doubt remains on these essentials 150 years after the publication of Darwin's and Wallace's original proposals. Natural selection is a universal principle in every system where self-replicating entities show variation, heritability and differential reproduction (Lewontin, 1970). This universality of natural selection allowed an unexpected triumph of Darwinism in computer science, where the principles of biological adaptation are increasingly used to solve computational problems. The rapidly growing arena of evolutionary algorithms includes genetic algorithms (Holland, 1992), evolution strategies (Rechenberg, 1973) and genetic programming (Koza, 1992). These applications show, in an artificial reality, the efficacy of the Darwinian process in solving even the most intricate optimization problems. In principle, this success should lay to rest the worry whether selection on spontaneous variation is able to explain the origin of complex adaptations, such as the notorious vertebrate eye (Darwin, 1859). However, evolutionary computer science not only demonstrates the problem-solving power of the Darwinian method, but also brings into focus the largely unsolved real problem of the origin of complex adaptations. The process of adaptation can only proceed to the extent that favourable mutations occur. This is usually not a major concern to biologists, because they study the end products of evolution, and their very existence is powerful evidence that the favourable mutations have occurred at a sufficient rate. However, computer scientists who want to solve engineering problems with evolutionary algorithms start with inferior designs and want to improve them. For them, the time it takes to actually obtain the improvement is money.
In studying how quickly an improvement can be obtained it was discovered that the mutation/selection process is not universally effective. This is not an entirely new insight (see for instance Eden (1967), Bossert (1967), Simon (1965) or Bremermann (1966)), and it is also easy to see why this is the case. There is no way to improve the performance of a conventional computer program by randomly exchanging letters in the source code. However, Darwinian "evolution" of computer programs is indeed possible, as shown by the Tierra program of Tom Ray (1992) and the genetic programming methods (Koza, 1992).
Further, it will be suggested that computer science and biology are experiencing a phase of substantive convergence of interests, which makes computer science the strongest ally in the attack on many of the still unsolved problems in evolutionary biology [Adaptation and the Modular Design of Organisms, G.P. Wagner].
The issue of evolution in nature has received renewed attention over the past two decades. Darwin's fundamental theory, while still sound today, is in need of expansion. For example, one well-known principle is that of natural selection, usually regarded as an omnipotent force capable of molding organisms into perfectly adapted creatures. The work of Stuart Kauffman [1993] has revealed that other factors can influence evolution besides natural selection. He demonstrated that certain complex systems tend to self-organize; that is, order can arise spontaneously. A major conclusion is that such order constrains evolution, to the point where natural selection cannot divert its course.
Another principle of Darwin's theory is that of gradualism - small phenotypic changes accumulate slowly in a species. Paleontological findings over the years have revealed a different picture - long periods of relative phenotypic stasis, interrupted by short bursts of rapid change. This phenomenon has been named punctuated equilibria by paleontologists Niles Eldredge and Stephen Jay Gould. While a full explanation does not yet exist, the phenomenon has recently been observed in a number of ALife works, suggesting that it may be inherent in certain evolutionary systems.
ALife offers opportunities for conducting experiments that are extremely complicated in traditional biology or not feasible at all. ALife complements biological research, raising the possibility of joint ventures leading to valuable new scientific discoveries.
To scientists, the most exhilarating news to come out of Ray's artificial evolution machine is that his small worlds display what seems to be punctuated equilibrium. For relatively long periods of time, the ratio of populations remains in a steady tango of give and take with only the occasional extinction or birth of a new species. Then, in a relative blink, this equilibrium is punctuated by a rapid burst of roiling change with many newcomers and eclipsing of the old. For a short period change is rampant. Then things sort out and stasis and equilibrium reign again.
Evolution requires new genes. In the four-billion year history of Earth’s biosphere, life has evolved from prokaryotic cells to eukaryotic cells, to multicelled plants and animals, to creatures with specialized tissues and organs, etc. This increase in organs, systems, and features has been accompanied by an increase in the size of the genome. Admittedly, overall genome size is not a reliable measure of a species’ place on the phylogenetic tree of life. Even similar species can have genomes of quite different sizes. However, if we eliminate from consideration all silent DNA and all redundant copies of genes, the size of the necessary genome has increased as evolution has produced new systems, functions and features. Specifically, the number of necessary genes has increased. It is clear that the evolution of new organs, systems, or features requires new genes.
Two scientific theories give differing accounts of the appearance of new genes in biological evolution on Earth. The more accepted one, by far, is the neo-Darwinian theory, as elaborated by Susumu Ohno. According to Ohno, new genes arise when existing genes are duplicated, undergo “forbidden mutations” as silent DNA, and reemerge as new genes with new functions.
While silent, a gene cannot be improved or even maintained by natural selection. To arrive at a substantially different nucleotide sequence, we would expect many mutations to be required. These mutations will randomize the original sequence of the gene and it will lose its original meaning. As everyone knows, a random nucleotide strand as long as an average gene has an absurdly high number of possible sequences. If an average gene is 1,000 nucleotides, the number of possible sequences it can have is 4^1000, or about 10^600. The chance of finding any gene currently expressed anywhere in biology in that sequence space, in even 10^50 trials, is less than 10^-500.
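The arithmetic in this paragraph can be checked directly by working in base-10 logarithms. The sketch below assumes, as the argument itself does, that each trial is a uniformly random sequence; the count of functional target sequences is a purely hypothetical placeholder, since the text gives no such number, and `log10_hit_bound` is an illustrative helper name.

```python
import math

GENE_LENGTH = 1000   # nucleotides, the figure used in the text
ALPHABET = 4         # A, C, G, T

# log10 of the number of possible sequences: log10(4**1000) = 1000 * log10(4)
log10_sequences = GENE_LENGTH * math.log10(ALPHABET)
print(f"possible sequences ~ 10^{log10_sequences:.0f}")

def log10_hit_bound(log10_targets, log10_trials):
    """Union bound in log10: P(at least one hit) <= trials * targets / sequences."""
    return log10_trials + log10_targets - log10_sequences

# HYPOTHETICAL: 10^20 distinct functional target sequences (not a figure from the text).
print(f"log10 of upper bound on success: {log10_hit_bound(20, 50):.0f}")
```

The exponent comes out at about 602, in line with the order of magnitude quoted above. The bound stays below 10^-500 for 10^50 trials as long as fewer than roughly 10^52 target sequences are assumed, so the conclusion rests on the uniform-sampling premise rather than on the exact numbers.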
If a duplicated gene suffers only a few mutations, it may retain its original function and soon again become expressed. In fact it could evolve this way without necessarily ever becoming silent. This is a possible neo-Darwinian account for the many variants of cytochrome C, for example. Furthermore, if the right few mutations occur, according to this line of thought, the gene may acquire a new function.
But if this is proposed to be the basis for the evolution of all new genes, then neo-Darwinism must maintain that there are series of expressed genes whose sequences are closely related — one-to-the-next — leading from a small set of original prokaryotic genes to every gene subsequently expressed in biology. This method of gene creation would imply ultra-gradualism in evolution. This line of thought was discussed by Manfred Eigen in 1987, but its development has not been fruitful.
Chandra Wickramasinghe has compared the neo-Darwinian account of evolution to saying that all of world literature came from the book of Genesis by occasional typos and paragraph swapping. The mechanism discussed here is analogous to stipulating that every text along the way was viable as literature. Such gradualistic series have not been shown to be possible in written text or computer programs. Nor have they been shown to exist in biology. If this is how new genes are supposed to evolve, the mechanism remains to be demonstrated.
Competition on Rugged Fitness Landscapes
The concept of a "fitness landscape", a picturesque term for a mapping of the vertices of a finite graph to the real numbers, has arisen in several fields, including evolutionary theory.
A fitness landscape consists of a fitness function mapping objects in a search space to their fitness, and a neighborhood relation defining neighboring objects in the search space. In biological contexts the neighboring objects are called mutants. Fitness landscapes are an important aid to understanding optimization.
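The two ingredients named above, a fitness function and a neighborhood relation, can be written down in a few lines. This is a sketch under assumptions: genotypes are taken to be short bit strings, neighbors are one-bit mutants, and fitness values are drawn independently at random, which gives a maximally rugged landscape. The function names are illustrative.

```python
import itertools
import random

def neighbors(genotype):
    """Neighborhood relation: all one-bit mutants of a bit-string genotype."""
    return [genotype[:i] + (1 - genotype[i],) + genotype[i + 1:]
            for i in range(len(genotype))]

def random_landscape(n, seed=0):
    """Fitness function: an independent random value for each of the 2**n genotypes."""
    rng = random.Random(seed)
    return {g: rng.random() for g in itertools.product((0, 1), repeat=n)}

def local_maxima(fitness):
    """Genotypes that are at least as fit as every one of their mutants."""
    return [g for g, f in fitness.items()
            if all(f >= fitness[m] for m in neighbors(g))]

landscape = random_landscape(n=10)
peaks = local_maxima(landscape)
print(f"{len(peaks)} local maxima among {len(landscape)} genotypes")
```

On such an uncorrelated landscape a genotype is a local maximum when it is the fittest of itself and its n mutants, which happens with probability 1/(n+1), so roughly 2^n/(n+1) peaks are expected. Correlated landscapes, where neighbors have similar fitness, have far fewer.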
Energy landscapes with many local minima are by now a well-studied subject in the statistical mechanics of spin glasses, and are currently studied in the statistical mechanics of protein folding. Similar landscapes of fitness with many local maxima instead of minima have also attracted attention in evolutionary biology and in computer science. Theoretical biologists use them in models of evolution. Computer scientists have to live with them, as they crop up in combinatorial optimization problems and in the training of neural networks, and use algorithms mimicking evolution to attack these hard optimization problems [Goldberg, 1989].
The notion of an adaptive "landscape" representing the abstract "fitness" of various kinds of organisms in various contexts has been a fixture of evolutionary biology ever since it was proposed by Sewall Wright in 1932. Although there are problems with a notion of "fitness" that is a property of an individual, independent of other individuals and the environment, the discoveries of molecular biology have significantly reinforced the power of this idea. We now understand, for example, the role of a discrete genomic "blueprint" in specifying the chemical constituents of enzymes, and we have a glimmering of how sensitive the fitness of the organism can be to variations in enzyme chemistry, so that it makes sense to identify the specific sequence of nucleotide bases in the genome as the argument of the fitness function. It has also become increasingly clear that the "design" of organisms involves a host of complex trade-offs, implying that there must inevitably be large numbers of local optima in such fitness landscapes. It was thus only a matter of time before the analogy between optimization (i.e. selection) on landscapes and combinatorial optimization problems appeared in the biological literature.
Evolution is nature's way of solving problems. The main problem, of course, for a bacterium or any other organism, is how to survive long enough to reproduce. In the struggle for existence, genetically distinct individuals compete for limited resources. Those that do best produce more offspring and become the major genetic shareholders of the next generation. Random mutations and sexual reproduction shuffle genes to create offspring with new genetic combinations. And every generation, natural selection sifts through the competitors, eliminating combinations that don't work, and adapting organisms to their environments. But what happens if the environment isn't stable, but is constantly changing? How does evolution cope with a moving target? This is a problem that pathogenic bacteria such as Salmonella or Escherichia coli know well. For them, the problems come thick and fast in the form of a vicious molecular bombardment, courtesy of their host's immune system and, sometimes, antibiotic reinforcements. Still, they manage to evolve quickly, on the hoof, to find new genetic solutions. A bad bout of food poisoning is living proof that somehow, for a time, the bacteria have got the better of your defences.
Evolutionary escapology. To fuel their evolutionary escapology, bacteria rely on random mutations: rare, unforeseeable and uncontrollable mistakes in the process that copies DNA from one generation to the next. Here or there, without planning, the DNA of a gene is copied incorrectly. More often than not, these mutations are harmful rather than beneficial, the biological equivalent of throwing a spanner into the works. With such blind and basic machinery, it is a wonder that bacteria are so flexible and adaptive. And this is the puzzle that is driving biologists to dishes teeming with hundreds of thousands of microbes. By studying their growth and genetic alterations, scientists are beginning to shed light on how bacteria evolve and evade their enemies so effectively. Oddly, it seems that a bacterium sometimes finds evolutionary success through biochemical failure. Evolution uses some very peculiar tricks when it comes to making organisms that are not only adapted, but adaptable as well. The problem confronting a population of bacteria can be visualised more clearly in a mathematical context than in a strictly biological one. Faced with a changing environment, bacteria need to explore a vast range of alternative genetic solutions. An adaptive landscape is a useful metaphoric tool, often used by evolutionary theorists to describe this genetic prospecting. Imagine an abstract mathematical space and an undulating surface within it, complete with hills and valleys. The height of any point on the landscape corresponds to the extent to which a given genetic constitution produces an organism well-suited to its environment. Movement on the landscape corresponds to changes in the organism's genes, and the highest peaks represent those combinations which give an organism the highest fitness, as measured by reproductive success. This is, admittedly, a fantastically oversimplified picture.
For the possible directions of movement are extremely limited in a three-dimensional landscape. An organism can move north or south, east or west, or any combination of these directions. But real organisms have a huge number of genes that can mutate independently of one another. And each gives a different possible direction for change. The bacterium E. coli, for instance, has about 4 000 genes. So the true adaptive landscape for E. coli crawls up and down in a mind-boggling space of 4 000 dimensions. Still, even in a weird landscape of such high dimension, the basic idea is the same. Fuelled by mutation and driven by natural selection, organisms will tend to march uphill in the landscape toward fitness peaks and, having arrived, stay there. A particular adaptive landscape is fixed as long as an organism's environment doesn't change. But genetic combinations that were favoured in one environment will not necessarily be favoured in another. If their environment changes, a population sitting proudly on a mountain top may suddenly find itself down in a valley. In order to survive in this new environment, organisms are banking on mutations to inch them towards the top of a new peak on the landscape.
Trouble is, the navigational skills of organisms on an adaptive landscape are limited - they cannot anticipate where they should go. And the mutations that must take them there are random. For a bacterium, the very next mutation is more likely to cause trouble than help.
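The uphill march and the effect of a changed environment can be illustrated with a toy adaptive walk. This is a sketch under loud assumptions: a genotype is a list of 40 two-state genes, and the hypothetical `fitness` function simply counts genes that match what the current environment favours, which gives a smooth single-peaked landscape rather than a realistic rugged one.

```python
import random

def fitness(genotype, environment):
    """HYPOTHETICAL fitness: number of genes matching what the environment favours."""
    return sum(g == e for g, e in zip(genotype, environment))

def adaptive_walk(genotype, environment, steps, rng):
    """Blind one-gene mutations; a mutant replaces the parent only if it is fitter."""
    genotype = list(genotype)
    for _ in range(steps):
        mutant = genotype[:]
        i = rng.randrange(len(mutant))   # the mutation site is chosen without foresight
        mutant[i] = 1 - mutant[i]
        if fitness(mutant, environment) > fitness(genotype, environment):
            genotype = mutant
    return genotype

rng = random.Random(1)
n = 40
env_a = [rng.randint(0, 1) for _ in range(n)]
env_b = [rng.randint(0, 1) for _ in range(n)]
start = [rng.randint(0, 1) for _ in range(n)]

adapted = adaptive_walk(start, env_a, steps=2000, rng=rng)
print("fitness in A after walk:", fitness(adapted, env_a))
print("same genotype in B:     ", fitness(adapted, env_b))
readapted = adaptive_walk(adapted, env_b, steps=2000, rng=rng)
print("fitness in B after walk:", fitness(readapted, env_b))
```

In this toy setting a genotype with k of n genes already matched has only a fraction (n - k)/n of its possible mutations left that help, so the closer the walker is to a peak, the more likely the next mutation is to be harmful.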
[Weinberger, E. (1988). ``A More Rigorous Derivation of Some Results on Rugged Fitness Landscapes,'' J. theor. Biol. 134 No. 1, 125-129. Weinberger, E. (1990). ``Correlated and Uncorrelated Fitness Landscapes and How to Tell the Difference,'' Biological Cybernetics 63, No. 5, 325-336. Weinberger, E. (1991a). ``Local Properties of Kauffman's N-k model, a Tuneably Rugged Energy Landscape,'' Physical Review A, 44, No. 10, 6399-6413. ]
Complexification and Redundancy
Yet it can hardly be denied that over the course of biological evolution the complexity of the most complex things around has increased dramatically. It can indeed be said that in the earliest stages of life, there was nothing like the great variety of complex and wonderful creatures that now grace our world. Somewhere along the line complexity has evolved - not monotonically (witness the extinction of the dinosaurs), but it certainly has happened, and the mystery is why.
Biological Evolution and Evolutionary Computations
"How does evolution produce increasingly fit organisms in environments which are highly uncertain for individual organisms?"
"How does an organism use its experience to modify its behavior in beneficial ways (i.e. how does it learn or 'adapt under sensory guidance')?"
"How can computers be programmed so that problem-solving capabilities are built up by specifying 'what is to be done' rather than 'how to do it'?" [Holland, 1975, page 1]
Any theory holding that life originates on Earth de novo from nonliving chemicals has problems for which computers provide a good metaphor. The problems are in two categories, hardware and software. For life as for computers, both are required, together.
After eukaryotic cellular life has become established, in this metaphor, the machinery for creating new biological hardware is in place, and the remaining problem is one of software only. How does the genetic programming for new evolutionary features get written and installed?
This aspect of the problem of evolution is a good one to focus on because computers are everywhere and can be readily observed. We can ask the same question about real computers: how do new computer programs get written and installed? Of course, the answer is that computer programmers write the programs and computer users install them.
But neo-Darwinism holds that during the course of evolution there were no programmers for genetic programs: the process was blind, self-driven. An analogous process in the world of computers would cause new computer programs or subroutines to appear spontaneously in the traffic of computer code being copied and transferred. If such a program or subroutine somehow became able to replicate itself, it would have taken a significant step toward "life." If, subsequently, it accrued other advantages, like concealment, it would have a "survival" advantage in the world of computer traffic. From there, by analogy with neo-Darwinism, it could grow and multiply and have properties similar to life [Klyce, Brig, Computer Models of Evolution]. Does this ever happen?
Alternatively, it should be possible for scientists to artificially create a computer "environment" in which the evolution of computer programs could occur. Parameters such as the mutation rate and the recombination parameters could be optimized for the evolution of new programs. At the lightning speed of modern computers, jillions of trials could be run to see if randomness coupled with any nonteleological iterative process can ever write computer programs with genuinely new functions. Has this been done?
These were some of the questions that concerned John Holland when he conceived of Genetic Algorithms (GAs) in the 1960s. Basically, all these questions were shown to reduce to the problem of optimizing a multiparameter function that describes the task at hand. Nature's "problem" is to create organisms that reproduce more (are more fit) in a particular environment: the environment dictates the selective pressures, and the solutions to these pressures are organisms themselves. In the language of optimization, the solutions to a particular problem (say, an engineering problem) will be selected according to how well they solve that problem. GAs are inspired by natural selection in that the solutions to our problem are not algebraically calculated, but rather found by a population of solution alternatives which is altered in each time step of the algorithm in order to increase the probability of having better solutions in the population. In other words, GAs (and other evolutionary algorithms such as Evolution Strategies (ES) and Evolutionary Programming (EP)) explore the multi-parameter space of solution alternatives for a particular problem by means of a population of encoded strings (standing for alternatives), which undergo variation (crossover and mutation) and are reproduced in such a way as to lead the population to ever more promising regions of this search space (selection).
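The loop described above (a population of encoded strings, variation by crossover and mutation, and selection toward promising regions) can be sketched as follows. This is not Holland's original formulation: the toy problem (`one_max`, counting 1s), tournament selection, one-point crossover, and all parameter values are assumptions chosen for brevity.

```python
import random

def one_max(bits):
    """HYPOTHETICAL problem: fitness is the number of 1s in the string."""
    return sum(bits)

def tournament(population, rng, k=3):
    """Selection: the fittest of k randomly drawn individuals becomes a parent."""
    return max(rng.sample(population, k), key=one_max)

def crossover(a, b, rng):
    """One-point crossover: a prefix of one parent joined to a suffix of the other."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]

def mutate(bits, rng, rate):
    """Flip each bit independently with probability `rate`."""
    return [1 - b if rng.random() < rate else b for b in bits]

def run_ga(length=50, pop_size=60, generations=80, rate=0.01, seed=0):
    """Return the best fitness seen in each generation."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    best_per_generation = []
    for _ in range(generations):
        best_per_generation.append(max(map(one_max, population)))
        # Each child comes from two selected parents, recombined and then mutated.
        population = [
            mutate(crossover(tournament(population, rng),
                             tournament(population, rng), rng), rng, rate)
            for _ in range(pop_size)
        ]
    return best_per_generation

best = run_ga()
print("best fitness, first vs last generation:", best[0], best[-1])
```

No step computes the answer algebraically; the population is simply resampled each generation in a way that makes better strings more likely to appear. Swapping `one_max` for another scoring function changes the problem without touching the loop.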
Genetic Algorithms (GAs) are adaptive heuristic search algorithms premised on the evolutionary ideas of natural selection and genetics. The basic concept of GAs is to simulate the processes in natural systems necessary for evolution, specifically those that follow the principle, first laid down by Charles Darwin, of survival of the fittest. As such they represent an intelligent exploitation of a random search within a defined search space to solve a problem.
First pioneered by John Holland in the 1960s, Genetic Algorithms have been widely studied, experimented with and applied in many fields of engineering. Not only do GAs provide an alternative method of solving problems, they can also outperform traditional methods on many problems. Many real-world problems involve finding optimal parameters, which might prove difficult for traditional methods but ideal for GAs. However, because of their outstanding performance in optimisation, GAs have been wrongly regarded as mere function optimisers. In fact, there are many ways to view genetic algorithms. Perhaps most users come to GAs looking for a problem solver, but this is a restrictive view [De Jong, 1993].
Genetic algorithms (GAs) are optimization techniques based on the concepts of natural selection and genetics [Goldberg, D.E. Genetic Algorithms in Search, Optimization, and Machine Learning; Addison-Wesley: Reading, MA, 1989]. In this approach, the variables are represented as genes on a chromosome. Similar to Simplex optimization, GAs feature a group of candidate solutions (population) on the response surface. Through natural selection and the genetic operators, mutation and recombination, chromosomes with better fitness (response function scores) are found. Natural selection makes it likely that chromosomes with the best fitness will propagate in future populations. Using the recombination operator, the GA combines genes from two parent chromosomes to form two new chromosomes (children) that may have better fitness than their parents. Mutation allows new areas of the response surface to be explored. One of the reasons GAs work so well is that they offer a combination of hill-climbing ability (natural selection) and a stochastic method (recombination and mutation). GAs offer a generational improvement in the fitness of the chromosomes and after many generations can create chromosomes containing the optimized variable settings.
Advantages to using genetic algorithms:
It is well known that GAs are not speedy or precise, but they are robust search procedures. Because of their robustness, they have been employed in many fields and applied to many types of problems. It is also well known, however, that the canonical GA cannot easily discover optima on some types of problems, such as those whose search space is massively multimodal [Goldberg & Richardson, 1987], deceptive [Goldberg, 1989, Complex Systems, 1989], nonstationary, and so on. These are called GA-hard problems.
Although a framework for GAs has been developed, not all applications can be optimized efficiently by GAs [Goldberg, 1989]. Optimization problems in which GAs have difficulty are called "GA-hard" or "GA-deceptive" and occur when above average schemata do not combine to form better performing schemata. This problem arises for many reasons. Understanding why certain problems are GA-hard is a very important research problem in the GA community.
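What it means for above-average schemata to fail to combine into better ones can be seen on a small deceptive "trap" function. The sketch below is an illustration under assumptions: the 4-bit trap and the helper `schema_average` are illustrative choices, not a construction taken from the cited works. A schema is given as a mapping from fixed positions to bit values, with all other positions free.

```python
import itertools

K = 4  # block size of the trap; an illustrative choice

def trap(bits):
    """Deceptive trap: all ones scores best, otherwise fewer ones scores higher."""
    ones = sum(bits)
    return K if ones == K else K - 1 - ones

def schema_average(fixed):
    """Mean fitness of all strings matching a schema, given as {position: bit}."""
    matches = [s for s in itertools.product((0, 1), repeat=K)
               if all(s[i] == b for i, b in fixed.items())]
    return sum(map(trap, matches)) / len(matches)

print("global optimum:", max(itertools.product((0, 1), repeat=K), key=trap))
for order in range(1, K):
    zeros = schema_average({i: 0 for i in range(order)})
    ones = schema_average({i: 1 for i in range(order)})
    print(f"order {order}: 0-schema mean {zeros:.3f}, 1-schema mean {ones:.3f}")
```

At every order below K the schema made of zeros has the higher average, so selection on partial building blocks steers the population toward the all-zeros string and away from the all-ones optimum.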
For applications where the calculation of the gradient vector is numerically precise and fast, I recommend using some form of gradient descent or Powell's method (see Press, W.H.; Flannery, B.P.; Teukolsky, S.A.; Vetterling, W.T. Numerical Recipes; Cambridge University Press: Cambridge, 1986 for more details). GAs will work for these types of applications but will reach the optimal region much more slowly than hill-climbing methods.
Applications which require that the exact global optimum be found may be a challenge for a GA. GAs are best at reaching the global optimum region but sometimes have trouble reaching the exact optimum location. Many researchers use GAs to get close to the optimal region then switch to another method for final exploration.
One of the most commonly cited difficulties with GAs is that compared to hill-climbing techniques they generally require more response (fitness) function evaluations. If the response surface is quite smooth then a hill-climbing method such as Simplex optimization will outperform a GA for a given number of evaluations.
Unfortunately, there are certain optimization problems, termed "GA-hard", which present a difficult challenge to GAs. One of the main areas of research in GAs is the study of these types of applications and the development of methods for determining beforehand whether an optimization problem is GA-hard. Only recently have GA theoreticians been able to understand some of the more common causes of GA-hardness.
Variable-length genomes (VLGs) have been considered to be very important in the evolution of complex entities by a number of researchers (see e.g. the SAGA genetic algorithm of Harvey). This affirms the principle that in ALife GAs are not employed to solve a particular problem, but instead provide a substrate for open-ended evolution.
Building up of Biological Complexity
Darwin's theory of evolution is widely known and accepted, but some of its sub-theories are not. Among these debated sub-theories is the explanation of a constructive evolutionary process. No such debate would be complete without the arguments of the famous Richard Dawkins and his opponent Stephen Jay Gould. As expected, they propose contrasting theories, but this time the line between the sides is not quite as clear.
According to Dawkins there are exactly two ways in which genes evolve constructively, one of which he names 'coadaptive genotypes' and the other 'arms races'. It is these two ways "in which mutation and natural selection can lead, over the long span of geological time, to a building up of complexity" (a169).
The second of these types, 'arms races', is an equally good, if not better, explanation of evolution's apparent constructivity. 'Arms races' "consist of the improvement in one lineage's equipment to survive, as a direct consequence of improvement in another lineage's evolving equipment" (a178). The race relies on the relationship between two competing organisms 'tracking' each other's changes. A lion improves its camouflage and thus its prey, the zebra, increases its speed to escape a predator that can now attack unnoticed from a closer range. This example of the arms race is what Dawkins terms a "change in habit of weaponry of predators" which "will be tracked by evolutionary changes in their prey" (a179). For "lineages of animals and plants will, in evolutionary time, 'track' changes in their enemies" (a180). The plural 'enemies' is intended, for the reason that "both participants have other enemies against whom they are simultaneously running" (a182).
Specifically, we might expect natural selection on predators to increase their efficiency of capturing prey and on prey to increase their efficiency of escape from predators (you saw one example of the latter in the case of industrial melanism in the lecture on natural selection and lots of other examples of neat ways in which predators try to trick their prey or prey to trick their predators in the film on the tropical rain forest). In fact, predators and prey can impose reciprocal selection in what has been called an "evolutionary arms race"--this is a kind of coevolution (mutual influence on the evolution of two interacting species).
In brood parasitism, interactions between a parasite and its host lead to a coevolutionary process called an arms race, in which evolutionary progress on one side provokes a further response on the other. The host should evolve defensive means to reduce the impact of parasitism, while the parasite should evolve means to counter the host's defense. To gain insight into the coevolutionary process of the arms race, a model is developed and analyzed in which the host's defense and the parasite's counterdefense are assumed to be genetically determined.
Gazelles, for example, must cope with the arid plains of Africa, and with lions. They respond with adaptations that let them conserve moisture and with stronger leg muscles to outrun lions. Lions adapt by becoming better runners themselves to deal with fleeing gazelles. In short, a great deal of adaptation is driven by the adaptations of other species: evolution is an arms race.
But if nature is in fact static then Dawkins does have an explanation; the race has ended. The price, comparable to an economic ‘opportunity cost', is too high, the competition has gone extinct, or the race has reached its physical limits. Nevertheless the arms race is a race to genetic complexity that meets its end in the construction of a body.
Dawkins and Krebs emphasize that ritualization usually arises in situations of conflicting interests, which leads to greater escalation of manipulation and sales resistance (a coevolutionary arms race) [Dawkins and Krebs 1979].
To summarize Dawkins' theory, in the constructive evolutionary process, genes that can cooperate better with other genes and genes that are advantageous in the ‘arms race' are more likely to replicate than those that are not, and the adaptive advantages of these genes are derived through a gradual process of cooperating teams and competing enemies.
Dawkins' "arms race theory" states that two different groups of genes are always in competition to gain superiority. The struggle for domination causes both teams of genes to keep mutating toward more complex organisms. A species is always trying to improve itself, whether by changing its skin tone or by becoming whiter to outsmart its predators. "Indeed, it is precisely because there has been approximately equal progress on both sides that there has been so much progress in the level of sophistication of design." The element of competition causes the gene to mutate, making a more complex organism.
Competitive co-evolution has recently attracted considerable interest in the Artificial Life and Evolutionary Computation communities. In the simplest scenario of two co-evolving populations, fitness progress in one population is achieved at the expense of the other population's fitness. Although it is easy to point out several examples of such situations in nature (e.g., competition for limited food resources, host-parasite, predator-prey), it is more difficult to analyze and understand the importance and long-term effects of such ``arms races'' on the development of specific genetic traits and behaviors. An interesting complication is given by the ``Red Queen effect'' (footnote 1), whereby the fitness landscape of each population is continuously changed by the competing population. Given the relative lack of empirical evidence for the importance of the Red Queen effect in biological evolution, Artificial Life techniques seem well suited to study this phenomenon (Cliff and Miller, 1995). For example, Ray's ``Tierra system'' (1991) and Sims' creatures (1994) are based on co-evolutionary competing species; also, several other simulated eco-worlds make use of co-evolving species and competitive fitness schemes (Menczer and Belew, 1993; Yeager, 1994). It has been argued that pursuit-evasion contests might favor the emergence of ``protean behaviors'', that is, behaviors which are adaptively unpredictable (Cliff and Miller, 1995). For example, prey could take advantage of unpredictable escape behaviors based on short sequences of stochastic motor actions. Similarly, predators could take advantage of enhanced perceptual characteristics and/or adaptive sensory-motor intelligence which could enable predictive tracking strategies. Miller and Cliff provided an excellent review of the biological significance of pursuit-evasion contests and several arguments for its relevance in the study of protean adaptive behavior (Miller and Cliff, 1994).
Some researchers have attempted to provide a theoretical understanding of the underlying complex dynamics; notably among others, Axelrod (1989) in the context of the Iterated Prisoner's Dilemma, Renshaw (1991) by modeling spatially distributed populations, and Kauffman (1992) in the extended framework of his ``NKC'' class of statistical models of rugged fitness landscape.
-------------------------------------------------------------------
1) The Red Queen is a character created by the novelist Lewis Carroll who was always running without making any progress, because the landscape was moving with her.
From a computational perspective, competing co-evolutionary systems are appealing because the ever-changing fitness landscape, caused by the struggle of each species to exploit its competitors' weaknesses, could potentially be used to prevent stagnation in local maxima. Hillis (1990) reported a significant improvement in the evolution of sorting programs when parasites (programs deciding the test conditions for the sorting programs) were co-evolved, and similar results were found by Angeline and Pollack (1993) on co-evolution of players for the Tic Tac Toe game. Koza (1991, 1992) applied Genetic Programming to the evolution of pursuer-evader behaviors, and Reynolds (1994) observed in a similar scenario that co-evolving populations of pursuers and evaders display increasingly better strategies. Cliff and Miller realised the potential of co-evolving pursuit-evasion tactics in evolutionary robotics. In the first of a series of papers (Miller and Cliff, 1994), they provided an extensive review of the literature in biology and in differential game theory and introduced their 2D simulation of simple robots with ``eyes''. Later, they proposed a new set of performance and genetic measures to describe evolutionary progress that could not otherwise be tracked because of the Red Queen effect (Cliff and Miller, 1995). Recently, they described some results in which simulated robots with evolved eye morphologies could either evade or pursue their competitors from several generations earlier, and proposed some applications of the approach in biology and in the entertainment industry (Cliff and Miller, 1996).
The Red Queen hypothesis (Hamilton, 1980) emphasises the importance of frequency-dependent selection resulting from interspecific interactions, such as those between hosts and their parasites. The Red Queen hypothesis states that sex is an adaptation to escape from parasites. Under this hypothesis, obligate asexuality is believed not to be viable because rapidly coevolving parasites efficiently adapt their strategies for infiltrating host defences. As asexuals often stay genetically the same over several generations unless a mutation occurs, an obligate asexual lineage would accumulate coadapted harmful parasites. Computer models have previously been proposed to test the Red Queen hypothesis (e.g. Bell & Maynard Smith, 1987; Hamilton et al., 1990; May & Anderson, 1983). These models used simplified analytical methods or game-theoretical versions of host-parasite interaction, with fixed population sizes and fixed patterns of parasite infection. Most of these models considered only haploid organisms, random mating and a simplified expression of fitness. One of the most important criticisms of the use of models in biology is that biological and ecological systems are rather complex. Models that are too simple ignore the fact that many relevant biological phenomena are emergent properties of complex interactions. This criticism is difficult to refute, as evidence for the emergence of unexpected properties in complex system simulations is growing (e.g. Cliff & Miller, 1994; Jefferson, 1991; Kauffman & Johnson, 1992; Levin et al., 1997; Ray, 1991). In this respect, evolutionary computer simulations are ideal tools for studying co-evolution (Cliff & Miller, 1995). They allow researchers to model more complex and realistic genotypes, phenotypes and interactions than population-genetic or evolutionary game theory models do. Moreover, they allow researchers to make detailed measurements during and after co-evolution, revealing much more information than conventional methods.
Within the Artificial Life community, computer simulation methods have proven important in understanding the dynamics of co-evolution (Cliff & Miller, 1995; Hillis, 1992; Kauffman & Johnson, 1992). Computer simulation has also proven useful in understanding the reciprocal interactions within species between mate preferences and sexually selected traits (Jaffe, 1996; Miller & Todd, 1993). Additionally, from the point of view of epidemiology, a simulation model allows the monitoring of single hosts or parasites, even when they are inside hosts.
As mentioned, the fundamental difference between the AGR model and the RQE model is that in the latter there is no optimal host genotype for parasite resistance. In this model, we have the `Red Queen Effect', which arises from co-evolutionary arms races. In co-evolution between hosts and parasites, host resistance evolves against parasite virulence, which evolves in turn: each lineage's fitness landscape changes perpetually.
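The claim that no host genotype is optimal can be made concrete with a very small simulation. What follows is only a sketch under stated assumptions, not the RQE model itself: it assumes two haploid host genotypes and two matching parasite genotypes, and the function names and the parameters host_cost and para_gain are invented here for illustration. A host loses fitness in proportion to how common its matching parasite is, and a parasite gains fitness in proportion to how common its matching host is.

```python
def red_queen_cycles(generations=300, host_a=0.6, para_a=0.5,
                     host_cost=0.2, para_gain=0.2):
    """Iterate a two-genotype matching-alleles model (illustrative only).

    host_a: frequency of host genotype A (the other genotype is B).
    para_a: frequency of the parasite genotype that can infect host A.
    Returns the list of (host_a, para_a) pairs, one per generation.
    """
    history = []
    for _ in range(generations):
        history.append((host_a, para_a))
        # A host suffers in proportion to how common its matching parasite is.
        w_host_a = 1.0 - host_cost * para_a
        w_host_b = 1.0 - host_cost * (1.0 - para_a)
        # A parasite profits in proportion to how common its matching host is.
        w_para_a = 1.0 + para_gain * host_a
        w_para_b = 1.0 + para_gain * (1.0 - host_a)
        # Replicator update: new frequency = old frequency * relative fitness.
        host_mean = host_a * w_host_a + (1.0 - host_a) * w_host_b
        para_mean = para_a * w_para_a + (1.0 - para_a) * w_para_b
        host_a, para_a = host_a * w_host_a / host_mean, para_a * w_para_a / para_mean
    return history


def count_crossings(history, level=0.5):
    """Count how often the frequency of host genotype A crosses `level`."""
    sides = [h > level for h, _ in history]
    return sum(1 for a, b in zip(sides, sides[1:]) if a != b)


if __name__ == "__main__":
    for generation, (h, p) in enumerate(red_queen_cycles()):
        if generation % 20 == 0:
            print(generation, round(h, 3), round(p, 3))
```

In this toy setting the common host genotype is always the one whose parasite is about to become common, so the frequencies chase each other instead of settling on a winner. That is the sense in which each lineage's fitness landscape keeps moving.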
"Rather than spending uncountable hours designing code, doing error-checking, and so on, we'd like to spend more time making better parasites!" - Danny Hillis
"It seems to be a universal property of life that all successful systems attract parasites," (Thomas Ray). In nature parasites are so common that hosts soon coevolve immunity to them. Then eventually the parasites coevolve strategies to circumvent that immunity. And eventually the hosts coevolve defenses to repel them again. In reality, these actions are not alternating steps but two constant forces pressing against one another.
Ray seeded his world (which he called "Tierra") with a single creature he programmed by hand, the 80-byte creature, inserted into a block of RAM in his virtual computer. The 80 creature reproduced by finding an empty RAM block 80 bytes big and then filling it with a copy of itself. Within minutes the RAM was saturated with copies of 80.
On Ray's first run of Tierra, random variation, death, and natural selection worked. Within minutes Ray witnessed an ecology of newly created creatures emerge to compete for computer cycles. The competition rewarded creatures of smaller size since they needed fewer cycles, and with Darwinian ruthlessness, terminated the greedy consumers, the infirm, and the old. Creature 79 (one byte smaller than 80) was lucky. It worked productively and soon outpaced the 80s.
Ray also found something very strange: a viable creature with only 45 very efficient bytes which overran all other creatures. "I was amazed how fast this system would optimize," Ray recalls. "I could graph its pace as the system would generate organisms surviving on shorter and shorter genomes."
On close examination of 45's code, Ray was amazed to discover that it was a parasite. It contained only a part of the code it needed to survive. In order to reproduce, it "borrowed" the reproductive section from the code of an 80 and copied itself. As long as there were enough 80 hosts around, the 45s thrived. But if there were too many 45s in the limited world, there wouldn't be enough 80s to supply copy resources. As the 80s waned, so did the 45s.
"I started with a creature 80 bytes large," Ray remembers, "because that's the best I could come up with. I figured that maybe evolution could get it down to 75 bytes or so. I let the program run overnight and the next morning there was a creature, not a parasite but a fully self-replicating creature, that was only 22 bytes!"
Hillis's first massively parallel Connection Machine had 64,000 processors working in unison. He inoculated his computer with a population of 64,000 very simple software programs. Each bug in his soup was initially a random sequence of instructions, but over tens of thousands of generations they became a program that sorted a long string of numbers into numerical order. Such a sort routine is an integral part of most larger computer programs; over the years many hundreds of man hours have been spent in computer science departments engineering the most efficient sort algorithms. Hillis let thousands of his sorters proliferate in his computer, mutate at random, and occasionally sexually swap genes. Then in the usual evolutionary maneuver, his system tested them and terminated the less fit so that only the shortest (the best) sorting programs would be given a chance to reproduce. Over ten thousand generations of this cycle, his system bred a software program that was nearly as short as the best sorting programs written by human programmers.
Hillis then reran the experiment but with this important difference: He allowed the sorting test itself to mutate while the evolving sorter tried to solve it. The string of symbols in the test varied to become more complicated in order to resist easy sorting. Sorters had to unscramble a moving target, while tests had to resist a moving arrow. In effect Hillis transformed the test list of numbers from a harsh passive environment into an active organism. Like foxes and hares or monarchs and milkweed, sorters and tests got swept up by a textbook case of coevolution.
A biologist at heart, Hillis viewed the mutating sorting test as a parasitic organism trying to disrupt the sorter. He saw his world as an arms race: parasite attack, host defense, parasite counterattack, host counter-defense, and so on. Conventional wisdom claimed such locked arms races are a silly waste of time or an unfortunate blind trap to get stuck in. But Hillis discovered that rather than retard the advance of the sorting organisms, the introduction of a parasite sped up the rate of evolution. Parasitic arms races may be ugly, but they turbocharged evolution.
Just as Tom Ray would discover, Danny Hillis also found that evolution can surpass ordinary human skills. Parasites thriving in the Connection Machine prodded sorters to devise a solution more efficient than the ones they found without parasites. After 10,000 cycles of coevolution, Hillis's creatures evolved a sorting program previously unknown to computer scientists. Most humbling, it was only a step short of the all-time shortest algorithm engineered by humans. Blind dumb evolution had designed an ingenious, and quite useful, software program.
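The host-parasite arrangement described above can be sketched in a few dozen lines. This is a toy under loud assumptions and not a reconstruction of Hillis's system: the networks have only 4 inputs, the population is tiny, hosts and parasites are paired at random each generation, and the selection rule (keep the better half, refill with mutated copies) is chosen here for brevity. All function names are hypothetical. The test inputs are strings of 0s and 1s because a comparator network that sorts every 0/1 input sorts every input (the zero-one principle).

```python
import random
from itertools import product

N = 4  # inputs per network; kept tiny so the sketch runs in well under a second


def apply_network(network, values):
    """Run a list of (i, j) compare-and-swap steps, with i < j, over a sequence."""
    v = list(values)
    for i, j in network:
        if v[i] > v[j]:
            v[i], v[j] = v[j], v[i]
    return v


def host_fitness(network, tests):
    """Fraction of the test inputs that the network sorts correctly."""
    return sum(apply_network(network, t) == sorted(t) for t in tests) / len(tests)


def random_comparator(rng):
    i, j = sorted(rng.sample(range(N), 2))
    return (i, j)


def random_test(rng):
    return tuple(rng.randint(0, 1) for _ in range(N))


def mutate(individual, make_gene, rng):
    """Blind variation: overwrite one randomly chosen gene with a fresh random one."""
    child = list(individual)
    child[rng.randrange(len(child))] = make_gene(rng)
    return child


def coevolve(generations=200, pop=30, net_len=6, tests_per_parasite=4, seed=0):
    """Co-evolve sorters (hosts) and test sets (parasites); pop should be even."""
    rng = random.Random(seed)
    hosts = [[random_comparator(rng) for _ in range(net_len)] for _ in range(pop)]
    parasites = [[random_test(rng) for _ in range(tests_per_parasite)] for _ in range(pop)]
    for _ in range(generations):
        rng.shuffle(parasites)  # each host meets a randomly chosen parasite
        scores = [host_fitness(h, p) for h, p in zip(hosts, parasites)]
        order = sorted(range(pop), key=lambda k: scores[k])
        # Parasites do well when their host scored badly; hosts do well when it scored high.
        best_parasites = [parasites[k] for k in order[:pop // 2]]
        best_hosts = [hosts[k] for k in order[pop // 2:]]
        hosts = best_hosts + [mutate(h, random_comparator, rng) for h in best_hosts]
        parasites = best_parasites + [mutate(p, random_test, rng) for p in best_parasites]
    all_inputs = list(product((0, 1), repeat=N))
    best = max(hosts, key=lambda h: host_fitness(h, all_inputs))
    return best, host_fitness(best, all_inputs)


if __name__ == "__main__":
    network, score = coevolve()
    print(network, score)
```

The point of the design is the one made in the text: the test cases are not a fixed exam but a second population under selection, so a sorter that passes only easy tests is soon paired with tests bred to break it. Whether this toy converges faster than a version with fixed random tests depends on the parameters; it is offered as an illustration of the mechanism, not as evidence for the speed-up Hillis reported.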
"We want these systems to solve a problem we don't know how to solve, but merely know how to state." One such problem is creating multimillion-line programs to fly airplanes. Hillis proposes setting up a swarm system which would try to evolve better software to steer a plane, while tiny parasitic programs would try to crash it. As his experiments have shown, parasites encourage a faster convergence to an error-free, robust software navigation program.
Even when technicians do succeed in engineering an immense program such as navigation software, testing it thoroughly is becoming impossible. But things grown, not made, are different. "This kind of software would be built in an environment full of thousands of full-time adversaries who specialize in finding out what's wrong with it," Hillis says, thinking of his parasites. "Whatever survives them has been tested ruthlessly." In addition to its ability to create things that we can't make, evolution adds this: it can also make them more flawless than we can. "I would rather fly on a plane running software evolved by a program like this, than fly on a plane running software I wrote myself," says Hillis, programmer extraordinaire.
"What are the limits to natural selection? What can't evolution make? And if blind natural selection has limits, what else is operating within or beyond evolution as we understand it?" - Kelly, Kevin "Out of Control"
In the pursuit of artificial evolution, the limits (if any) to natural selection, or to evolution in general, take on practical importance. We'd like an artificial evolution that generates neverending diversity, but so far, that isn't so easy to do. We'd like to extend the dynamics of natural selection to very large systems with many levels of scale, but we don't know how far natural selection can be extended. We'd like an artificial evolution that we could control a bit more than we control organic evolution. Is that possible? [ Kelly, Kevin, "Out of Control"]
Death gives room for the new, it eliminates the ineffective. But to say that death causes wings to be formed, or eyeballs to work, is essentially wrong. Natural selection merely selects away the deformed wing, the unseeing eye. "Natural selection is the editor, not the author," says Lynn Margulis. What, then, authors innovation in flight and sight?
'Macroevolution' Needs Macromutations
"Systemic mutation [large numbers of positive, perfect, coordinated mutations suddenly changing one species to another] have never been observed, and it is extremely improbable that species are formed in so abrupt a manner." —*Theodosius Dobzhansky, Genetics and the Origin of Species (1941), p. 80.
"It is true that nobody thus far has produced a new species or genus, etc., by macromutation [a combination of many mutations]; it is equally true that nobody has produced even a species by the selection of micromutation [one or only a few mutations]."—*Richard Goldschmidt, "Evolution, As Viewed by One Geneticist," American Scientist, January 1952, p. 94.
The sudden appearance of a wholly new creature or even a solitary complex organ like an eye or wing would amount to astronomical numbers of highly coordinated mutations occurring simultaneously, a statistical miracle and a genetic impossibility.
"After perhaps a million breedings of different varieties from around the world, he [Goldschmidt] came to the conclusion that geographic variation is a blind alley that leads only to microevolution within the species. Because of his studies, he had to conclude that for major progressive evolution to occur, large mutations or macromutants must have occurred in the past."—Harold G. Coffin, "Creation: The Evidence from Science," These Times, January 1970, p. 25.
We noted earlier that some evolutionists adhered to natural selection as the cause of cross-species changes. Later, when mutations were discovered and the inadequacies of natural selection were realized, many turned to mutations as the solution.
"We have established that a single cell bacteria requires about 3,000,000 nucleotides so as to function and reproduce as a unicell species. A human cell contains about 3,000,000,000 nucleotides in a very specific sequence. We may assume that the cell of a trilobite was somewhere in between. Shall we extend it the benefit of the doubt and guesstimate it to have 500,000,000 meaningfully aligned nucleotides? (The argument would still be valid were it eventually established that a trilobite had, for example, as few as 20 million or as many as 920 million nucleotides). How will we get from 3 million to 500 million? What is the probability that 497 million nucleotides would align themselves—all by themselves—into a very, very specific sequence? Certainly Gould and Eldredge would agree that the probability is nil."—I.L. Cohen, Darwin Was Wrong (1984), pp. 98-99.
But, later still, several prominent evolutionists turned to a new variation on the mutation theory: they came up with the "hopeful monster" theory. This is the idea that, once every 50,000 years or so, a gigantic set of helpful, positive mutations occurs all at once: a lizard lays an egg and a beaver hatches from it!
Goldschmidt spent an unrewarded lifetime showing that extrapolating the gradual transitions of microevolution (red rose to yellow rose) could not explain macroevolution (worm to snake). Instead, he postulated from his work on developing insects that evolution proceeded by jumps. A small change made early in development would lead to a large change, a monster, at the adult stage. Most radically altered forms would abort, but once in a while a large change would cohere and a hopeful monster would be born. The hopeful monster would have a full wing, say, instead of the half-winged intermediate form Darwinian theory demanded. Organisms could arrive fully formed in niches that a series of partially formed transitional species would never get to. The appearance of hopeful monsters would also explain the real absence of transitional forms in fossil lineages.
"After observing mutations in fruit flies for many years, Professor Goldschmidt fell into despair. The changes, he lamented, were so hopelessly micro [insignificant] that if a thousand mutations were combined in one specimen there would still be no new species."—*Norman Macbeth, Darwin Retried (1971), p. 33.
Goldschmidt concluded that Darwinian evolution could account for no more than variations within the species boundary; unlike Grassé, he thought that evolution beyond that point must have occurred in single jumps through macromutations. He conceded that large-scale mutations would in almost all cases produce hopelessly maladapted monsters, but he thought that on rare occasions a lucky accident might produce a "hopeful monster," a member of a new species with the capacity to survive and propagate (but with what mate?).
"There has recently been renewed expression of support for the importance in macroevolution of what Goldschmidt termed the hopeful monster . . At least in principle, Goldschmidt accepted Schindewolf's extreme example of the first bird hatching from a reptile egg. The problem with Goldschmidt's radical concept is the low probability that a totally monstrous form will find a mate and produce fertile offspring."—*Steven M. Stanley, Macroevolution: Pattern and Process (1979), p. 159.
If Goldschmidt really meant that all the complex interrelated parts of an animal could be reformed together in a single generation by a systemic macromutation, he was postulating a virtual miracle that had no basis either in genetic theory or in experimental evidence.
"Although he [Goldschmidt] recognized the constant accumulation of small changes in populations (microevolution) [changes within species], he believed they did not lead to speciation. Between true species he saw `bridgeless gaps' that could only be accounted for by large sudden jumps, resulting in `hopeful monsters.' "—*R. Milner, Encyclopedia of Evolution (1990).
You are extrapolating results from a directed intelligent genetic engineering experiment into proof for an undirected, random, "blind watchmaker" process. It is a good model for Intelligent Design. It is not a model for random mutation + cumulative selection.
"Suppose that, following a massive research program, scientists succeed in altering the genetic program of a fish embryo so that it develops as an amphibian. Would this hypothetical triumph of genetic engineering confirm that amphibians actually evolved, or at least could have evolved, in similar fashion? No it wouldn't, because Gould and the others who postulate developmental macromutations are talking about random changes, not changes elaborately planned by human (or divine) intelligence. A random change in the program governing my word processor could easily transform this chapter into unintelligible gibberish, but it would not translate the chapter into a foreign language, or produce a coherent chapter about something else. What the proponents of developmental macromutations need to establish is not merely that there is an alterable genetic program governing development, but that important evolutionary innovations can be produced by random changes in the genetic instructions." (Johnson P.E., "Darwin on Trial", Second Edition, 1993, InterVarsity Press, Illinois, p42)
"Do we, therefore, ever see mutations going about the business of producing new structures for selection to work on? No nascent organ has ever been observed emerging, though their origin in pre-functional form is basic to evolutionary theory. Some should be visible today, occurring in organisms at various stages up to integration of a functional new system, but we don't see them: There is no sign at all of this kind of radical novelty. Neither observation nor controlled experiments has shown natural selection manipulating mutations so as to produce a new gene, hormone, enzyme system, or organ."—*Michael Pitman, Adam and Evolution (1984), pp. 67-68.
...It says that variations can be chosen in a deliberate way. Rather than have the gene bureaucracy merely edit random variations, have it produce variations by some agenda. Mutations would be created by the genome for specific purposes. Directed mutations could spur the blind process of natural selection out of its slump and propel it toward increasing complexity. In a sense, the organism would direct mutations of its own making in response to environmental factors.
Stress-Dependent Mutation Rates
Similar compatibility is not the case, however, for another view of evolution that has attracted considerable interest and led to much recent controversy. In 1988 John Cairns, a well-respected molecular biologist and cancer researcher, published with two associates a paper in the prestigious British journal Nature that threatened to undermine the basic tenets of Darwinian evolution [20].
Cairns and his colleagues claimed to have found evidence that E. coli was able somehow to direct its mutations to achieve adaptive changes when placed in a new, challenging environment. This research involved placing bacteria that could only metabolize glucose in an environment where only a foreign sugar (lactose) was available. Here the stressed bacteria continued to duplicate and, as would be expected, some of the descendants contained mutations that permitted them to metabolize the new sugar. This in itself is not surprising, since the genetic change necessary to transform an E. coli from a glucose- to a lactose-eating bacterium is quite small, and in a large colony it would be expected that at least some of the naturally occurring mutants would have stumbled on it by sheer blind chance. But these scientists reached the highly unorthodox conclusion that instead of being produced randomly, the bacteria were somehow able to produce the adaptive mutations at a much higher frequency than other, nonadaptive mutations. In other words, they believed that their studies provided evidence that "bacteria can choose which mutations they should produce" which would "provide a mechanism for the inheritance of acquired characteristics."[*]
As would be expected, these statements immediately elicited both considerable interest and controversy, since the central dogma of biology was being challenged, that is, that changes in the environment cannot direct (instruct) changes in the genome. Some researchers rejected this conclusion out of hand, but others were impressed enough to attempt to find possible mechanisms by which the environment could somehow instruct the genome to produce just the right mutations to allow it to digest the new sugar. Cairns himself proposed that environmental changes could affect changes in proteins that could consequently instruct the DNA to make certain adaptive changes in the genes, in flagrant violation of the central dogma.
However, it may well be that this and other explanations for directed or "instructed" mutation are not necessary after all. Australian microbiologist Donald MacPhee and his colleagues provided evidence that, when placed in a medium of lactose, the mutations produced by glucose-metabolizing E. coli are indeed produced blindly.[*] What seems to happen under the stressed condition of a glucose-poor environment is not a specific increase in the rate of adaptive mutations, but rather a general increase in the overall mutation rate due to inhibition of the mechanism that usually checks and repairs the genetic errors that arise during the normal functioning of the bacterium. So while mutations continue to be produced blindly, the higher rate of genetic change allows the bacteria to stumble on the adaptive genetic change more quickly than they would if left in their normal glucose-rich environment.
But let us continue to imagine for a moment that a bacterium was able to change just those genes regulating metabolism in just the right way to allow for the digestion of a foreign sugar. If this were the case, it would be yet another example of a puzzle of fit demonstrating that the bacterium had somehow acquired the ability to sense a new sugar in its environment and alter its genome to digest it. But then we would be led to ponder how this adapted complexity could have originated in the first place, with cumulative blind variation and selection as a prime candidate to explain the source of this remarkable ability that somehow permitted the bacterium to instruct its genome to make the required changes to digest the new, strange food that was being served.
Although no convincing evidence exists that adaptive changes in genes can be directed by the environment in a Lamarckian manner, the findings of Cairns and MacPhee and their respective colleagues are important. If organisms are able to increase their mutation rate in the presence of new environmental stresses but keep mutations in check when these stresses are absent, it would enable organisms to exert a certain degree of control over evolution that is absent from the classic neo-Darwinian perspective. Instead of producing mutations at a constant rate regardless of environmental conditions, organisms may produce more mutations and therefore more varied offspring just when such innovative variation is necessary to keep the species extant.[*]
This view ascribes to the evolutionary process decidedly more "intelligence" than does the neo-Darwinian perspective. It nonetheless preserves the required blindness of genetic variations. What is altered is only the rate of production of these variations. This sensitivity of mutation rate to environmental stress could simply be the result of a stress-related breakdown of genetic repair mechanisms. Or it could be the result of a more sophisticated active mechanism that itself had evolved by natural selection, since individuals that by chance produced more genetic variability under difficult environmental conditions would have been more likely to leave better adapted progeny than those insensitive to environmental stress.
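The distinction between directing mutations and merely changing their rate can be illustrated with a short simulation. It is a sketch under stated assumptions: the genome is a list of bits, an individual counts as adapted when bit 0 is set, the rates are arbitrary illustrative numbers, and no selection or reproduction is modelled at all. The function name adapted_fraction is invented for this example. The one property that matters is that a mutation, when it happens, hits a site chosen blindly; stress changes only how often mutations happen.

```python
import random


def adapted_fraction(base_rate, stress_rate, pop_size=500, genome_len=10,
                     generations=200, seed=0):
    """Fraction of genomes carrying the adaptive bit after `generations`.

    A genome is 'stressed' while bit 0 is unset (it cannot use the food on
    offer). Stressed genomes mutate with probability stress_rate per
    generation, unstressed ones with probability base_rate.
    """
    rng = random.Random(seed)
    population = [[0] * genome_len for _ in range(pop_size)]
    for _ in range(generations):
        for genome in population:
            stressed = genome[0] == 0
            rate = stress_rate if stressed else base_rate
            if rng.random() < rate:
                site = rng.randrange(genome_len)  # blind: any site may be hit
                genome[site] ^= 1                 # flip that bit
    return sum(g[0] for g in population) / pop_size


if __name__ == "__main__":
    print("constant low rate :", adapted_fraction(0.001, 0.001))
    print("constant high rate:", adapted_fraction(0.1, 0.1))
    print("stress-dependent  :", adapted_fraction(0.001, 0.1))
```

Under these assumptions a constant low rate rarely finds the adaptive change in the time allowed, and a constant high rate finds it quickly but loses it again just as easily. The stress-dependent rule finds it quickly and then keeps it, even though no individual mutation is aimed at anything.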
The work of Cairns and MacPhee concerned the metabolism of different types of food. It is not difficult to imagine how other types of biological functions could also be involved, such as thermoregulation. For example, as temperatures dropped at the onset of an ice age, mammals would undergo stress as did Cairns's bacteria when placed in an environment where no useful food was available. This would lead to an increase in the mutation rate during reproduction, resulting in a second generation of animals with greater variation in the length and texture of their coats. Those particular descendants having, by chance, longer and thus warmer coats would suffer less from the cold environment, resulting in lower mutation rates and consequently less variation in the coats of their third-generation, extra-hairy offspring. But those second-generation animals with short coats would maintain a higher rate of mutation, so that at least some of their offspring would likely have warmer coats than their parents.
This hypothesis has some interesting consequences. As in the ice-age example, by varying the mutation rate, a species would adapt more quickly to changing environmental conditions. It is also of interest to realize that such stress-dependent mutation rates would result in occasional short periods of relatively rapid (although still gradual) evolutionary change separated by longer periods of little or no change during periods of environmental stability. And this is exactly what Gould, Eldredge, and their associates refer to as punctuated equilibrium.
Does mutability carry a selective advantage under stress? Nine years ago John Cairns and his colleagues at the Harvard School of Public Health reported in the influential journal Nature sensational experiments "suggesting that cells may have mechanisms for choosing which mutations will occur", specifically in ways that give those cells an advantage in stressful conditions. This radical proposal collided head-on with the sacrosanct principle of genetics that mutations occur at a rate that is completely unrelated to whatever consequences they might have. Cairns's suggestion thus conjured the ghost of Jean-Baptiste Lamarck, who argued in the 19th century that species evolve through the inheritance of "acquired" characteristics: ones that individuals develop in response to environmental challenges. Cairns postulated that bacterial cells, in effect, mysteriously know in advance which mutations are likely to benefit them. Then, when investigators stress the cells by starving them, the bacteria tip fate's scales so that rare beneficial mutations happen more often than chance would allow. This incendiary idea, known as directed mutation, ignited a firestorm of debate. Almost a decade later the dust has still not settled. Investigators around the world have immersed themselves in complex experiments to learn whether the apparent surplus of beneficial mutations in Cairns's studies, confirmed by other researchers, might have a less explosive alternative explanation. Potentially far-reaching discoveries are now emerging. Most biologists now believe, and Cairns has acknowledged, that the seeming excess of beneficial mutations found in many directed-mutation studies might arise because researchers are more likely to spot and so count beneficial events than they are harmful ones. Various theories have been advanced to explain why, although none has gained universal acceptance.
Recent experiments, however, provide important evidence for one effect that could produce such a counting bias. The effect, hypermutation, thus might make true directed mutation unnecessary. But hypermutation itself opens the door to some intriguing possibilities.
Another molecular biologist, Barry Hall, published results which not only confirmed Cairns's claims but laid on the table startling additional evidence of directed mutation in nature. Hall found that his cultures of E. coli would produce needed mutations at a rate about 100 million times greater than would be statistically expected if they came by chance. Furthermore, when he dissected the genes of these mutated bacteria by sequencing them, he found mutations in no areas other than the one where there was selection pressure. This means that the successful bugs did not desperately throw off all kinds of mutations to find the one that works; they pinpointed the one alteration that fit the bill. Hall found some directed variations so complex they required the mutation of two genes simultaneously. He called that "the improbable stacked on top of the highly unlikely." These kinds of miraculous change are not the kosher fare of serial random accumulation that natural selection is supposed to run on.
Hypermutation was first proposed as an explanation for Cairns's results in 1990, by Barry G. Hall of the University of Rochester. Hall conjectured that when starving, a few bacterial cells might enter an unusual state in which they generate multiple mutations. Cells that by random chance produced favorable mutations in extremis would survive to be counted, but others would probably die and leave no trace. So investigators would see more beneficial mutations than harmful or neutral ones. For some years, technical obstacles made it hard to confirm or refute this explanation. Now Patricia L. Foster of Boston University and, separately, Susan Rosenberg of the University of Alberta have performed experiments that give it a boost. Like Cairns, the researchers studied bacteria that lack the ability to feed on the sugar lactose. When Foster and Rosenberg deprived the bacteria of all sugars except lactose, excess mutations arose not only in a gene that allowed the bacteria to use the lactose but in other genes, too. The two sets of results "together show the generality of hypermutation under lactose selection," commented Bryn A. Bridges of the University of Sussex in Nature on June 6. The results suggest, as Hall had proposed, that hypermutation occurs in some cells that are under physiological stress, possibly because DNA is more likely to break under such conditions. Bridges reserved judgment on whether bacteria evolved the capacity for hypermutation as an adaptation to overcome nutritional stress or whether the effect is merely a mechanical response to starvation. But studies reported in the same journal a week later suggest (to some, at least) a possible way that hypermutation may indeed have evolved as an adaptation. These latest findings show that in natural populations of bacteria, "mutator genes," which increase the mutation rate, can spread through a population by allowing the bacteria to evolve faster.
Paradoxically, this happens even though mutations produced by the mutator genes, like others, are on average harmful. The seemingly impossible occurs because mutators occasionally arise in individuals that also carry an advantageous gene. In an asexual population, the mutator may then spread with the advantageous gene, a phenomenon called the hitchhiking effect. François Taddei of the CNRS in Paris and an Anglo-French team showed in a theoretical study that in a changing environment, the faster evolution made possible by mutator genes often outweighs their disadvantage to the individual. And Paul D. Sniegowski of the University of Pennsylvania and his colleagues showed that mutators can get ahead in real populations as well. In three out of 12 bacterial colonies evolving in a new environment, mutator genes swept through the population and became ubiquitous. Researchers have found evidence that mutator genes are especially common in tumors and pathogens. By allowing faster evolution, they might help the villains evade hosts' immune systems, Sniegowski suggests. And although he emphasizes that his finding has no immediate bearing on the notion of directed mutation, the new crop of results leads some biologists to suspect that mutation might play a more complicated role in evolution than they had believed. In a Nature commentary on June 12, E. Richard Moxon of John Radcliffe Hospital in Oxford, England, and David S. Thaler of the Rockefeller University note that many pathogens have some collections of genes that are excessively prone to mutation. Mutation frequently varies the combinations of these hypermutable genes that are in active service by making individual genes functional or not. Because the genes affect how the pathogen interacts with its host, hypermutation within such special sets of genes allows the microbe to confound immune defenses. Other hypermutable gene sets might assist in solving different challenges, Moxon and Thaler conjecture.
If, for example, the genes' rate of mutation is affected by a microbe's physiological state, like the mutation rates Rosenberg and Foster studied, hypermutable genes could generate mutations when a cell was starving and so help mimic directed mutation. The mutations would still be random, but the most beneficial ones would remain long enough to be counted.
The appearance of directed mutation might thus arise "with no requirement for new molecular mechanisms," Moxon and Thaler surmise. The scientists suggest further that if physiological factors can influence hypermutable genes, perhaps separate mutator genes can also switch on and off hypermutable genes. Mutation rates would then be subject to fine-grained genetic control. Thaler says that "the mechanisms for the generation of variants are themselves subject to evolution." It might take another decade to learn whether evolution routinely plays such a sophisticated game with mutation rates. But one piece of unpublished work lends support to the notion that mutator genes might have a part in how hypermutation simulates directed mutation. Hall has recently isolated five bacterial genes that make excess favorable mutations seem to appear elsewhere in the bacterial DNA. Hall thinks his newly isolated genes somehow stimulate hypermutation and so generate the illusion of overabundant advantageous mutations. "In my gut I feel it's an evolved phenomenon," he says. Pure directed mutation, with its spooky foreknowledge, may be dead. But real mechanisms that produce the ghost of directed mutation could yet shake up biology. "In evolutionary theory there has been an overemphasis on the power of selection as opposed to the generation of diversity," Thaler goes on to reflect. "Maybe this will take it to another level." - Tim Beardsley in Washington, D.C.
[New Scientist, 14 February 1998. Martin Brookes is a science writer based in London.]
Evolution is nature's way of solving problems. The main problem, of course, for a bacterium or any other organism, is how to survive long enough to reproduce. In the struggle for existence, genetically distinct individuals compete for limited resources. Those that do best produce more offspring and become the major genetic shareholders of the next generation. Random mutations and sexual reproduction shuffle genes to create offspring with new genetic combinations. And every generation, natural selection sifts through the competitors, eliminating combinations that don't work, and adapting organisms to their environments. But what happens if the environment isn't stable, but is constantly changing? How does evolution cope with a moving target? This is a problem that pathogenic bacteria such as Salmonella or Escherichia coli know well. For them, the problems come thick and fast in the form of a vicious molecular bombardment, courtesy of their host's immune system and, sometimes, antibiotic reinforcements. Still, they manage to evolve quickly, on the hoof, to find new genetic solutions. A bad bout of food poisoning is living proof that somehow, for a time, the bacteria have got the better of your defences.
Fuelled by mutation and driven by natural selection, organisms will tend to march uphill in the landscape toward fitness peaks and, having arrived, stay there. A particular adaptive landscape is fixed as long as an organism's environment doesn't change. But genetic combinations that were favoured in one environment will not necessarily be favoured in another. If their environment changes, a population sitting proudly on a mountain top may suddenly find itself down in a valley. In order to survive in this new environment, organisms are banking on mutations to inch them towards the top of a new peak on the landscape.
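The uphill march and the sudden drop into a valley can be made concrete with a toy model. The sketch below is an illustration under stated assumptions, not a model taken from the article: genotypes are lists of binary switches, fitness is simply the number of switches that match the current peak, and selection accepts any single-switch change that raises fitness. The names `fitness`, `climb`, `old_peak` and `new_peak` are hypothetical, and real mutations are random rather than tried in order as they are here.

```python
def fitness(genotype, peak):
    """Toy fitness: how many switches match the current adaptive peak."""
    return sum(g == p for g, p in zip(genotype, peak))


def climb(genotype, peak):
    """Greedy uphill walk: keep any single-switch change that raises fitness.

    Returns the final genotype and the number of accepted changes.
    """
    genotype = list(genotype)
    steps = 0
    improved = True
    while improved:
        improved = False
        for i in range(len(genotype)):
            mutant = genotype.copy()
            mutant[i] = 1 - mutant[i]          # flip one binary switch
            if fitness(mutant, peak) > fitness(genotype, peak):
                genotype = mutant              # selection keeps the improvement
                steps += 1
                improved = True
    return genotype, steps


# Assumed peaks for illustration only (1 = one state of a gene, 0 = the other).
old_peak = [1, 1, 0, 1, 0]
new_peak = [1, 0, 1, 1, 1]

# The population climbs to the old peak and stays there.
population, _ = climb([0, 0, 0, 0, 0], old_peak)
fitness_before = fitness(population, old_peak)

# The environment changes: the same genotype now sits in a valley.
fitness_after_shift = fitness(population, new_peak)

# Only further changes can carry it to the new peak.
adapted, steps_needed = climb(population, new_peak)

print(fitness_before, fitness_after_shift, steps_needed)
```

In this toy landscape the genotype itself never changes when the environment shifts; only the peak it is measured against moves, which is why a perfectly adapted population can lose most of its fitness overnight.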
Trouble is, the navigational skills of organisms on an adaptive landscape are limited - they cannot anticipate where they should go. And the mutations that must take them there are random. For a bacterium, the very next mutation is more likely to cause trouble than help. But Richard Moxon, a biologist from the John Radcliffe Hospital at Oxford University, points out that the randomness of mutations does not imply the lack of any constraint. Some genes are much more likely to mutate than others, and Moxon is convinced that these highly mutable genes are accident prone by design. If you take away radiation and chemical nasties from an organism's environment, the mutation rate will diminish, but it won't disappear. Mutations are not just the result of insults from outside the cell. They can also come about as a consequence of "honest" mistakes in DNA metabolism. All cells possess a suite of enzymes which maintain the upkeep of the DNA, and ensure that it is faithfully replicated before being passed on to the next generation. For example, proofreading enzymes detect and correct chemical defects in the DNA, and polymerase enzymes catalyse the synthesis of a replica DNA molecule prior to cell division. Although these enzymes are efficient, they do make mistakes. Errors may be "overlooked" or the wrong base may be inserted into a new DNA template. These mistakes in DNA metabolism are not distributed uniformly across the genome. They tend to occur again and again at specific "hypermutable" genes that contain chemical banana skins - repetitive motifs embedded in their DNA sequence - which increase the chances of genetic slip-ups. Apart from being accident black spots, these genes have something else in common - they code for cell surface proteins. Cell surface proteins are naturally in the front line of the evolutionary war between an invading bacterium and its host.
For the host, they are the visible signatures of foreign intruders and the primary targets for its defence mechanisms. To stem the tide of a bacterial infection, molecules produced by the host's immune system must first recognise and lock on to these proteins. In order to deceive the host, bacteria set up novel arrays of targets on their cell surface, which the host's immune system is unable to recognise. To do this requires rapid and frequent changes to the structure and conformation of the cell surface proteins.
Dual strategy. Moxon believes that the high mutation rates of genes coding for these proteins have evolved to give bacteria flexibility where it is most needed. "Hypermutable genes are those which generate useful biological noise at the cell surface," he says. Because mutations are, on average, likely to have detrimental effects, most genes - those which carry out the basic "housekeeping" functions for the cell - are selected to have a low mutation rate. But a limited set of genes, the so-called contingency genes, are selected to have a high mutation rate. This combination of highly mutable contingency genes and more sedate housekeeping genes promotes flexibility where it is most needed, while minimising the risks associated with genetic mutations. "We're dealing with two strategies side by side," says Moxon. To what does this correspond in the adaptive landscape? Any change in a housekeeping gene is catastrophic. So there must be a high and narrow ridge running through the landscape (see Diagram). Changes to housekeeping genes correspond to movement off the edge of the ridge, the precipitous faces of which reflect the high cost of altering crucial elements.

Changes in the hills: a bacterium evolved to a fitness peak (a) may suddenly find itself in a valley (b). Mutations being rare, the next generation would also be stuck nearby. A high mutation rate enables bacteria to evolve rapidly in one generation and to find a new peak.
Mutations in contingency genes, on the other hand, correspond to movement along the ridge, which may wander gently up and down. By restricting mutations to the contingency genes, bacteria ensure that mutations don't throw them off the cliff, but still maintain flexibility where they need it. So, the chances of a mutation being beneficial are increased (see "Restricted wandering"). But contingency genes are not the only way in which bacteria can optimise their genetic prospecting. Last year, so-called mutator genes were discovered in the organism's problem-solving armoury. Mutator genes code for the enzymes involved in DNA metabolism. Their name is appropriate, for a mutation in a mutator gene can alter the efficiency of a proofreading enzyme and ultimately lead to mutational mayhem across the genome. A supermarket provides an analogy. To be successful, your local food emporium depends on ordered efficiency. Products are neatly arranged on their designated shelves by a small army of underpaid workers. The job of the floor manager - the human equivalent of the mutator gene - is to check that the shelf-stackers are putting the foodstuffs in the right places. But if the floor manager develops a fondness for extended liquid lunches, the results could be disastrous. Before too long, baked beans in the wine section and nappies in the frozen foods will send confused and disgruntled shoppers scurrying for the exits.
Bad caretakers. The point is, of course, that an inefficient caretaker, be it in the form of a mutator gene or a supermarket floor manager, can leave a once ordered system in chaos. "A mutation in a mutator gene was always thought to be bad news for the organism," says Paul Sniegowski, "but weird and surprising stuff turns up when you study bacteria." Sniegowski, a biologist from the University of Pennsylvania, is speaking from experience. He has just spent four long years studying 10 000 generations of bacteria, and believes that "inefficient caretakers" just might have their uses. At the start of the study, Sniegowski and his colleagues set up 12 genetically identical populations of E. coli in an environment lacking a vital nutrient. "When the populations were started, they weren't adapted to this environment, so there was lots of potential for adaptation," says Sniegowski. After 10 000 generations, 3 of the 12 populations had mutation rates that had risen from 0.002 to 0.2 mutations per gene per generation. The mutation rates of the other nine populations remained unchanged. When Sniegowski introduced normal working copies of the mutator gene into the three hypermutable populations, the ancestral mutation rate was restored. This confirmed that the high mutation rate had been caused by a mutation in a mutator gene.
How can populations with such high mutation rates survive? The answer may be surprisingly simple. An "inefficient caretaker" will prompt a cascade of mutations throughout the genome. Most will be harmful to the organism and threaten its chances of survival. But, by chance, a new mutation might appear whose benefits outweigh the total cost of all the deleterious changes. Or, in one or a few offspring bacteria, just the right combination may occur to produce a bacterium adapted to the new environment. Of course, there is no guarantee that the drastic measure of employing an "inefficient caretaker" will have the desired effect - the changes could be entirely deleterious. But it does at least make rapid change a possibility. More importantly, it allows the process to be switched on or off with a single mutation. Evidence from outside the lab suggests that most bacteria have low genome-wide mutation rates. But where high mutation rates are found, they tend to be in highly pathogenic species like Salmonella. "I think mutator genes can themselves be thought of as contingency genes," says Moxon. For an organism facing a wildly fluctuating environment, high mutability of mutator genes could be an adaptation in itself. Moxon refers to the genome's mutational machinery as the organism's "tool box", a metaphor which accurately describes its problem-solving role. What's more, he believes that natural selection is constantly refining the contents of an organism's tool box. "I think this fine tuning is going on all the time," he says, "with contingency genes constantly being recruited and dismissed." A lethal bacterial infection is clearly bad news for the host, but it is not an ideal situation for the bacteria either. The death of a host will end any prospect of the bacteria establishing further infections. A fatal outbreak of meningitis is an example where the bacteria responsible have let success go to their head.
In these circumstances, natural selection is likely to remove some of the tools from the tool box to reduce the bacterial potency. Conversely, if bacterial survival is suddenly threatened by a barrage of new environmental problems or opportunities, additional tools may be incorporated.
Not so blind watchmaker
It is almost ten years now since John Cairns, from the Harvard School of Public Health, wrote, in a highly publicised article in Nature (volume 335, 1988, pp 142-145), that "cells may have mechanisms for choosing which mutations occur". Cairns believed that organisms might somehow be able to perceive changes in their environment and direct mutations to the most appropriate genes. In other words, Cairns was suggesting that Richard Dawkins's blind watchmaker had suddenly regained his sight. The jury is still out on the so-called directed mutation controversy. Cairns's idea did not go down too well with the disciples of Dawkins and Darwin. But more importantly, a feedback mechanism to explain directed mutation has so far proved elusive. Nevertheless, at the time, there was something intuitively attractive about Cairns's concept. On the face of it, directed mutation seemed like a much more efficient problem-solving system than random change, and offered a possible explanation for the rapid evolutionary change observed in pathogenic species of bacteria. Moxon's contingency theory is not a million miles away from Cairns's original notion of directed mutation, yet it neither invokes unknown mechanisms nor commits Darwinian blasphemy. What's more, Moxon believes that "biased randomness", as he calls it, is a more efficient system of evolutionary problem solving than directed mutation. Directed change would restrict navigational flexibility on the adaptive landscape. Populations would always be evolving towards the nearest local peak, leaving more prominent peaks out of reach. Random change, on the other hand, allows populations to explore genetic solutions over the entire adaptive surface. "Random genetic variation and natural selection is a far more powerful system of problem solving," says Moxon.
Bacteria may be the organisms of choice for Moxon, Sniegowski and others who are fascinated by evolutionary problem solving, but they are by no means the sole proprietors of an evolving tool box. "Contingency genes are characteristic of all organisms," says Moxon. Furthermore, the study of evolutionary problem solving may have important implications for understanding other cases of rapid evolution, such as in HIV and cancer. Many cancers are initiated by a mutation in a mutator gene of a normal cell. The subsequent growth and spread of the cancer is exactly analogous to the growth and spread of pathogenic bacteria inside the host. Says Sniegowski, "We're just at the early stages of this subject, and some pretty exciting things lie ahead."
Restricted Wandering. Contingency genes accelerate an organism's mutational wandering over its adaptive landscape. Imagine a simple case in which an organism has 1000 genes, including five contingency genes. 'A binary switch is a realistic way of describing mutations in these contingency genes,' says Moxon. So each of the contingency genes can exist in one of two alternative states, A or a, B or b, C or c, and so on, with mutations transforming a gene from one state to another. With 5 contingency genes and two alternative states there are 2^5 = 32 possible genetic combinations.
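The count of 32 can be checked by listing the combinations directly. This is a minimal sketch, assuming each contingency gene is written as an upper-case or lower-case letter as in the text; the helper name `all_genotypes` is hypothetical.

```python
from itertools import product

GENES = "ABCDE"  # the five contingency genes of the example


def all_genotypes(genes=GENES):
    """List every combination of the two states (upper or lower case) per gene."""
    states_per_gene = [(g.upper(), g.lower()) for g in genes]
    return ["".join(combo) for combo in product(*states_per_gene)]


genotypes = all_genotypes()
print(len(genotypes))  # 2**5 combinations
```

Each extra binary contingency gene doubles the number of combinations, so even a handful of such switches gives a bacterium a sizeable repertoire of surface configurations.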
Inside a host, a bacterium with the genetic combination ABcDe gets lucky, finding itself more than capable of countering the host's repertoire of immune responses. ABcDe is the Mount Everest of this particular adaptive landscape. Within hours, this single individual has given rise to millions of identical clones, leaving the host in some discomfort. For the host, drastic measures are needed, so it decides to self-administer some antibiotics. The antibiotics induce radical changes in the bacterial population's adaptive landscape. The highest peak collapses to a valley and fresh new peaks rise up around it. Population ABcDe must rapidly evolve to a new peak if it is to survive the antibiotic onslaught. If this new peak corresponds to the genotype AbCDE, for example, then three mutational changes would be required.
Let's assume initially that the mutation rate in the contingency genes is the same as that in the rest of the genome: 10^-3 per gene per generation. Then the chance of the three mutational changes occurring together in a single individual is (10^-3)^3 = 10^-9, or one in a billion cells. Even for rapidly reproducing bacteria, this remains a pretty remote possibility. But in real life, contingency genes have much higher mutation rates. With a mutation rate of 10^-2, the chance of an individual reaching the new adaptive peak in one generation rises to (10^-2)^3 = 10^-6, or one in a million cells. Although the descendants of any one bacterium are still almost surely doomed, in a population of millions, there is a good chance of a survivor being created.
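The arithmetic of this example can be reproduced in a few lines. The sketch below rests on assumptions that the text leaves implicit: the three changes are treated as independent events, the small chance of unwanted changes in the other two contingency genes is ignored, and a population of exactly one million cells is used for illustration. The function names are hypothetical.

```python
def sites_to_change(current, target):
    """Count the genes whose state differs between two genotypes."""
    return sum(a != b for a, b in zip(current, target))


def chance_in_one_generation(rate, changes):
    """Chance that all required changes occur together in one cell.

    Assumes independent mutations at the same per-gene rate.
    """
    return rate ** changes


def chance_of_any_survivor(per_cell_chance, population_size):
    """Chance that at least one cell in the population makes the jump."""
    return 1 - (1 - per_cell_chance) ** population_size


changes = sites_to_change("ABcDe", "AbCDE")            # B->b, c->C, e->E
baseline = chance_in_one_generation(1e-3, changes)     # genome-wide rate
elevated = chance_in_one_generation(1e-2, changes)     # contingency-gene rate

population = 10**6  # assumed population size
print(changes, baseline, elevated)
print(chance_of_any_survivor(baseline, population))
print(chance_of_any_survivor(elevated, population))
```

Under these assumptions a tenfold rise in the per-gene mutation rate makes the triple change a thousand times more likely per cell, which is the difference between a population that almost certainly dies out and one with a better-than-even chance of producing a survivor.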
Mutator Processes, Enhanced Adaptability and Event Space Dimensions. Bacterial transposons Tn5 and Tn10 have been shown to confer an advantage similar to that of mutator genes over otherwise isogenic strains under chemostat competition (Chao et al. 1983). Successful strains show transposition and in particular insertion of IS10 at determining sites. Hence possessing the transposon is a strongly favoured phenotype, in contradiction to the selfish DNA hypothesis (Doolittle & Sapienza 1980, Orgel & Crick 1980). The same conclusion holds for phages λ and Mu and IS50 (Weiner et al. 1986). The conservation of the major classes of eukaryote transposable element may also be a result of enhanced adaptability, particularly in times of genomic crisis. Mutator genes cannot function in eukaryotes because sexual recombination separates unlinked genes.
Genes are often defined as the DNA nucleotide sequences required for the synthesis of an RNA transcript. These DNA nucleotide sequences include two types of regions: coding regions (which encode a functional protein or RNA molecule) and non-coding regions (transcriptional control elements, introns, RNA processing signals).
Chromosome evolution, higher order and parasitic elements. With the accumulation of genomic sequence data, certain unexplained patterns of genome evolution have begun to emerge. One striking observation is the general tendency of the genomes of higher organisms to evolve an ever decreasing gene density with higher order. For example, E. coli has a gene density of about 2 kb per gene, Drosophila 4 kb per gene and mammals about 30 kb per gene. Much of the decreased density is due to the increased accumulation of non-coding or 'parasitic DNA' elements, such as type one and type two transposons.
Intermediate-Repeat DNA
Intermediate-repeat DNA consists of a large number of copies of relatively few families of DNA sequences:
1. SINEs (short interspersed elements) are 150-300 bp in length.
2. LINEs (long interspersed elements) are 5-7 kilobases in length.
SINEs and LINEs are not necessarily exact repeats.
Mobile DNA Elements
The bulk of intermediate-repeat DNA is comprised of DNA elements that are able to move (or transpose) throughout the genome.
These mobile DNA elements are sometimes termed "selfish DNA," since they do not appear to directly benefit the host.
They would appear to be DNA parasites in the host genome.
However, they may benefit the host organism indirectly by providing recombination sites that permit the evolution of new genes (e.g. in gene duplication).
There are two basic types of mobile genetic elements:
1. Insertion sequences or transposons
2. Retrotransposons (viral and non-viral retrotransposons).
These two types of mobile genetic elements have different modes of transposition:
1. Insertion sequences or transposons: DNA intermediates.
2. Retrotransposons: RNA and DNA intermediates.
LINEs and SINEs are nonviral retrotransposons.
Many SINEs, such as the Alu type, appear to be cytoplasmic mRNAs that have undergone retrotransposition.
If your computer suddenly begins to greet you, at various times, with a vulgar message, you will automatically know that the computer has contracted a virus. It might have arrived via the modem, it might have come with a new program on a disk, or someone might have stealthily keyed it in.
Computer viruses are called viruses because they are analogous to real viruses, the ones that infect living cells. Viruses are not independently capable of metabolism or reproduction. Biologists now think that viruses evolved after cells. What is a virus?
A virus is a piece of genetic instructions in a protective coat. Viruses are not living things. When they are outside of their host cell, they are just very complex molecular particles that have no metabolism and no way to reproduce. In our computer metaphor, they're like software with no hardware, floppy disks or diskettes without a computer.
The viruses that infect bacteria are more specifically called bacteriophages, or simply phages. The kind and amount of genetic instructions in phages vary from 3,600 RNA nucleotides to 166,000 DNA nucleotide pairs (6). To restate these dimensions in terms of our computer analogy, the computer viruses that infect handheld calculators range in size from 900 bytes to over 40 kilobytes. For comparison, the simplest handheld calculator (bacterium) has about 200 kilobytes of stored programs.
The viruses that infect eukaryotic cells vary in size also. The poliovirus has 7,600 RNA nucleotides; the vaccinia (cowpox) virus has 240,000 DNA nucleotide pairs (7). To use computer terms again, the computer viruses that infect personal computers range in size from 1.9 kilobytes to 60 kilobytes. For comparison, a very simple personal computer (a yeast cell) has genetic instructions equivalent to about 8 megabytes. An advanced personal computer (a human cell) contains about 1.5 gigabytes of stored programs, counting the backup copy and the unused programs (silent DNA).
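The byte figures in this analogy are consistent with counting two bits per nucleotide (or per nucleotide pair), since each position holds one of four bases. That encoding is an inference from the quoted numbers rather than something the text states, and the names `BITS_PER_NUCLEOTIDE` and `genome_bytes` are hypothetical. A minimal sketch of the conversion:

```python
BITS_PER_NUCLEOTIDE = 2  # assumption: four possible bases need log2(4) = 2 bits
BITS_PER_BYTE = 8


def genome_bytes(nucleotides):
    """Convert a count of nucleotides (or nucleotide pairs) to bytes."""
    return nucleotides * BITS_PER_NUCLEOTIDE / BITS_PER_BYTE


# Genome sizes quoted in the text.
genomes = {
    "smallest phage (RNA)": 3_600,
    "largest phage (DNA pairs)": 166_000,
    "poliovirus (RNA)": 7_600,
    "vaccinia virus (DNA pairs)": 240_000,
}

sizes = {name: genome_bytes(n) for name, n in genomes.items()}
for name, size in sizes.items():
    print(f"{name}: {size:.0f} bytes")
```

On this reckoning the smallest phage comes to 900 bytes and the poliovirus to 1.9 kilobytes, matching the text, provided a kilobyte is taken as 1,000 bytes.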
Once inside, the virus causes the machinery of the host cell to enter one of two cycles, the lytic cycle or the lysogenic cycle. In the lytic cycle, which leads to cell degradation, the host begins to carry out the reproductive instructions in the invading virus's genome. Those instructions are, in summary, "make more of me." The host becomes a slave to the invader; it drops everything and begins to manufacture copies of the virus. After many copies have been made, the cell breaks open and dies, and many viruses are released. This is the normal way in which a virus causes symptoms of disease in its host.
In the 1940s, a visionary scientist and Nobel prize winner, Barbara McClintock, predicted the existence of pieces of DNA which could jump in and out of chromosomes - 'jumping genes'. This must have seemed incredible at the time, since DNA was believed to be stable and invariable. 'Jumping genes' were, in fact, isolated from the bacterium Escherichia coli in the late 1960s and were further defined as specific, small fragments of DNA which were given the name transposons.
Scientific interest in transposons increased during the 1970s, when it appeared that they assisted in the transfer of bacterial resistance to antibiotics. Furthermore, it soon became evident that they caused most of the spontaneous mutations occurring in laboratory populations of more sophisticated organisms, such as vinegar flies.
Transposons may sound like something you would buy from a toy shop; however, the most striking feature of transposable elements (TEs) is their mobility. In fact, some have been given esoteric names which are indicative of their mobile nature, for example, HMS Beagle (Darwin's boat), Stalker (a cartoon character from the Soviet Union), mariner, hobo, Tyrant (a Castilian knight), first identified by a Spanish geneticist, and roo (found in Australia).
Thirty different families of transposons have been isolated from the vinegar fly, often misnamed the fruit fly. They range in size from 1 to 10 kilobases of DNA and encode so-called 'DNA sites' and the enzymes required for their own transposition and maintenance. Many transposons have a unique DNA site (a short, specific sequence of DNA), which acts as a forwarding address, directing the transposon to a complementary DNA site in its host genome. There are usually multiple copies of any given DNA site in the host genome, and exactly which site a transposon will attach to is completely random.
The enzymes encoded by transposons provide the physical mechanism for jumping into a host's DNA. Two methods of jumping are known to exist and their characteristic differences have been utilised to classify transposons into two groups.
Transposons in the first group, Class I, appear to jump with an RNA 'parachute'; in other words, they change from their initial DNA status into RNA. It is then necessary for this RNA intermediate to change back to DNA. Class I transposons have an enzyme called 'reverse transcriptase', which converts RNA into DNA; that is why they are called retrotransposons. After the reverse transcriptase acts on the Class I transposon's RNA, the resulting DNA is incorporated into the host's DNA.
It could be said that Class II transposons 'free-fall' into their host's genome, as they do not have an RNA intermediate. To accomplish this, they use an enzyme called 'transposase' to incorporate their DNA into their host.
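The difference between the two classes can be sketched on plain strings. This is only a toy illustration, under the common textbook reading that Class I elements move by copy-and-paste (the original stays put, a new copy arrives via RNA) while Class II elements move by cut-and-paste. The function names, the toy genome and the choice of target positions are assumptions made for the example, not part of the source.

```python
def transcribe(dna: str) -> str:
    """DNA -> RNA intermediate (T becomes U)."""
    return dna.replace("T", "U")


def reverse_transcribe(rna: str) -> str:
    """RNA -> DNA, the step carried out by reverse transcriptase."""
    return rna.replace("U", "T")


def class1_jump(genome: str, element: str, target: int) -> str:
    """Copy-and-paste: the element stays in place and a DNA copy,
    made through an RNA intermediate, is inserted at `target`."""
    if element not in genome:
        raise ValueError("element not present in genome")
    new_copy = reverse_transcribe(transcribe(element))
    return genome[:target] + new_copy + genome[target:]


def class2_jump(genome: str, element: str, target: int) -> str:
    """Cut-and-paste: the element is excised and reinserted at `target`
    (an index into the genome after excision); no RNA step is involved."""
    start = genome.find(element)
    if start < 0:
        raise ValueError("element not present in genome")
    excised = genome[:start] + genome[start + len(element):]
    return excised[:target] + element + excised[target:]


# Host DNA in lower case, element in upper case, purely for readability.
ELEMENT = "TACGTA"
GENOME = "cccc" + ELEMENT + "gggg"

after_class1 = class1_jump(GENOME, ELEMENT, target=len(GENOME))
after_class2 = class2_jump(GENOME, ELEMENT, target=8)
```

In this sketch a Class I jump lengthens the genome by one element, while a Class II jump only relocates the element, which is one reason retrotransposons can accumulate to very high copy numbers.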
It would appear that transposons are the ultimate example of 'selfish DNA'. After all, they are purely parasitic - jumping between different parts of a genome in order to propagate themselves, usually to the detriment of their host.
We now know that transposons are ubiquitous and may comprise up to 20% of an organism's genome. Prof John Gibson is interested in genetic variation in natural populations and has been investigating transposons in natural populations of the vinegar fly, Drosophila melanogaster, at the Research School of Biological Sciences. He has asked the question: do transposons produce diversity in natural populations? In order to answer this important question, he has monitored genetic variation by measuring the activity of certain enzymes. When he finds a significant difference in enzyme activity within a natural population of flies, he searches for transposon DNA within the gene. Prof Gibson has found that transposons do generate genetic variation in natural populations and, interestingly, their impact is rarely positive!
The explosion of diversity in the Cambrian occurred in the lineage of the eukaryotes; the prokaryotes did not participate. One of the most striking genetic differences between eukaryotes and prokaryotes is that most of the genome of prokaryotes is translated into proteins, while most of the genome of eukaryotes is not. It has been estimated that typically 98% of the DNA in eukaryotes is neither translated into proteins nor involved in gene regulation, that it is simply "junk" DNA [*]. It has been suggested that much of this junk code is the result of the self-replication of pieces of DNA within rather than between cells [*, *].
Mobile genetic elements, transposons, have this intra-genome self-replicating property. It has been estimated that 80% of spontaneous mutations are caused by transposons [*, *]. Repeated sequences, resulting from the activity of mobile elements, range from dozens to millions in numbers of copies, and from hundreds to tens of thousands of base pairs in length. They vary widely in dispersion patterns from clumped to sparse [*].
Larger transposons carry one or more genes in addition to those necessary for transposition. Transposons may grow to include more genes; one mechanism involves the placement of two transposons into close proximity so that they act as a single large transposon incorporating the intervening code. In many cases transposons carry a sequence that acts as a promoter, altering the regulation of genes at the site of insertion [*].
Interspersed repetitive sequences are major components of eukaryotic genomes. Repetitive elements can make up nearly half of a genome, e.g., 45% of that of the silkworm (Bombyx mori). Some repetitive sequences are lineage-specific, others are common in a wide spectrum of organisms, which suggests that the former are more recent than the latter in terms of their evolutionary origin. Among different categories of interspersed repetitive elements, the most abundant in eukaryotic genomes are long and short interspersed elements (LINEs and SINEs, respectively). Because the specific function of these elements remains to be defined and because of their unusual 'behaviour' in the genome, they are often described as selfish or junk DNA [Doolittle and Sapienza, 1980; Orgel and Crick, 1980]. Although very important in the development of molecular evolutionary biology, the idea of selfish DNA arose at a time when our knowledge of the biology of repetitive elements was very narrow, and it reflected the sociobiological wave of thinking which dominated biology in the seventies. Our view of the entire phenomenon of repetitive elements now has to be revised in light of data on their biology and evolution, especially in light of what we know about the retroposons. The question addressed in this essay is: are interspersed elements junk DNA?
The SINE families are the most highly repeated elements in eukaryotic genomes. The well-characterized SINE families have several features in common. By definition, all SINEs are small, up to 1000 nt in length. They are present in tens to hundreds of thousands of copies per genome, and a single species may have more than one SINE family (see [Makalowski, 1995]). SINEs evolved multiple times in eukaryotes from the genes coding for small, untranslated RNAs such as 7SL RNA or tRNA. All known SINE families lack coding capacity but include an internal RNA polymerase III promoter and an A-rich 3' end. Some of the elements underwent fusion during their evolutionary history, giving rise to new homodimeric families (e.g., primate Alu and artiodactyl art 2). Others are made up of a fusion between two unrelated sequences (e.g., the galago type II family is a fusion of a 7SL-based repeat with a tRNA-like repeat, and the artiodactyl C element is a fusion of an A element with a sequence derived from tRNA [*]). SINE elements proliferate in genomes via RNA intermediates in the process named retroposition by Rogers [Rogers, 1983] (Fig. #). Although there are no specific integration signals for retroposition, it seems that some regions are predisposed to retroposon integration. For example, the 40.6 kilobase region around the human nucleoprotamine genes contains 42 Alu elements, three truncated L1 sequences, and numerous short stretches of MER elements.
Illustration from Makalowski review [*]. SINEs proliferate in the genome through the RNA intermediate. This process, called retroposition, requires a reverse transcription of the RNA and its subsequent insertion into the genome. The insertion can occur in the direct or reverse orientation with respect to the orientation of SINE master gene.
Transposons may produce gene products and often are involved in gene regulation [*]. However, they may have no effect on the external phenotype of the individual [*]. Therefore they evolve through another paradigm of selection, one that does not involve an external phenotype. They are seen as a mechanism for the selfish spread of DNA which may become inactive junk after mutation [*].
DNA of transposon origin can be recognized by its palindromic ends flanked by short non-reversed repeated sequences resulting from insertion after staggered cuts. In Drosophila melanogaster approximately 5 to 10 percent of the total DNA is composed of sequences bearing these signs. There are many families of such repeated elements, each family possessing a distinctive nucleotide sequence, and distributed in many sites throughout the genome. One well known repeated sequence occurring in humans is found to have as many as a half million copies in each haploid genome [*].
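The signature described above can be made concrete with a small string model. The sketch below assumes a simplified picture in which the host is cut with a stagger of a few bases, the element is joined to the overhangs, and gap repair copies the staggered stretch on both sides, so the element ends up flanked by a direct (non-reversed) repeat. The sequences, function names and parameters are hypothetical and chosen only for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def make_element(terminal: str, payload: str) -> str:
    """Build an element whose two ends are inverted copies of each other."""
    return terminal + payload + reverse_complement(terminal)


def insert_with_staggered_cut(host: str, element: str, pos: int, stagger: int) -> str:
    """Insert `element` at `pos`; the `stagger` host bases starting at `pos`
    end up duplicated on both sides of the element after gap repair."""
    site = host[pos:pos + stagger]
    return host[:pos] + site + element + site + host[pos + stagger:]


def has_transposon_signature(seq: str, start: int, length: int,
                             stagger: int, terminal_len: int) -> bool:
    """Check for inverted terminal repeats flanked by a direct repeat."""
    if start < stagger:
        return False
    element = seq[start:start + length]
    left = seq[start - stagger:start]
    right = seq[start + length:start + length + stagger]
    inverted_ends = element[:terminal_len] == reverse_complement(element[-terminal_len:])
    return inverted_ends and len(right) == stagger and left == right


host = "GGGGACTGACCCC"
element = make_element("CAGG", "TTTT")      # CAGG ... CCTG at the two ends
inserted = insert_with_staggered_cut(host, element, pos=4, stagger=5)
found = has_transposon_signature(inserted, start=9, length=len(element),
                                 stagger=5, terminal_len=4)
```

A real search would have to scan for unknown element boundaries and tolerate mutated repeats; the check here takes the boundaries as given to keep the logic visible.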
Elaborate mechanisms have evolved to edit out junk sequences inserted into critical regions. An indication of the magnitude of the task comes from the recent cloning of the gene for cystic fibrosis, where it was discovered that the gene consists of 250,000 base pairs, only 4,440 of which code for protein; the remainder are edited out of the messenger RNA before translation [*, *, *, *].
It appears that many repeated sequences in genomes may have originated as transposons favored by selection at the level of the gene, favoring genes which selfishly replicated themselves within the genome. However, some transposons may have coevolved with their host genome as a result of selection at the organismal or populational level, favoring transposons which introduce useful variation through gene rearrangement. It has been stated that: "transposable elements can induce mutations that result in complex and intricately regulated changes in a single step", and they are "A highly evolved macromutational mechanism" [*].
In this manner, "smart" genetic operators may have evolved, through the interaction of selection acting at two or more hierarchical levels (it appears that some transposons have followed another evolutionary route, developing inter-cellular mobility and becoming viruses [*]). It is likely that transposons today represent the full continuum from purely parasitic "selfish DNA" and viruses to highly coevolved genetic operators and gene regulators. The possession of smart genetic operators may have contributed to the explosive diversification of eukaryotes by providing them with the capacity for natural genetic engineering.
In designing self-replicating digital organisms, it would be worthwhile to introduce such genetic parasites, in order to facilitate the shuffling of the code that they bring about. Also, the excess code generated by this mechanism provides a large store of relatively neutral code that can randomly explore new configurations through the genetic operations of mutation and recombination. When these new configurations confer functionality, they may become selected for.
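As a rough sketch of what such a genetic parasite could look like in a digital organism, the code below treats a genome as a string of instruction symbols and adds a transposon-like operator that copies a segment to a random position, alongside ordinary point mutation. The instruction alphabet, the function names and the copy-and-paste behaviour are all assumptions made for illustration; the source does not specify how the operator should be implemented.

```python
import random

ALPHABET = "abcdefgh"  # hypothetical instruction set of the digital organism


def point_mutate(genome: str, rng: random.Random, rate: float) -> str:
    """Replace each symbol with a random one with probability `rate`."""
    return "".join(
        rng.choice(ALPHABET) if rng.random() < rate else symbol
        for symbol in genome
    )


def transpose(genome: str, start: int, length: int, rng: random.Random):
    """Copy genome[start:start+length] and insert the copy at a random
    position. Returns the enlarged genome and the insertion position."""
    segment = genome[start:start + length]
    pos = rng.randrange(len(genome) + 1)
    return genome[:pos] + segment + genome[pos:], pos


rng = random.Random(0)
genome = "abcdefgh"
grown, pos = transpose(genome, start=2, length=3, rng=rng)
# The inserted copy is spare code: point mutation can now alter it
# while the original segment, if it was not interrupted, keeps working.
explored = point_mutate(grown, rng, rate=0.1)
```

The design choice worth noting is that transposition changes genome length and layout, whereas point mutation only changes symbols in place, so the two operators explore different kinds of variation.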
There is now sufficient evidence to suggest that horizontally transmitted agents and gene sets allow the rapid adaptation of various living systems, including bacteria, yeast, Drosophila and Hymenoptera. 'Pathogenic islands' are contiguous regions of DNA that contain gene sets in bacteria that appear to be horizontally acquired and can exist as prophages, episomes or genomic sequences (21). These pathogenic islands appear to account for much of the rapid adaptability in bacteria. Transposons of Drosophila appear to require horizontal transmission in order to be maintained during evolution and appear to have been the underlying mechanism of hybrid dysgenesis (10). Most species of parasitoid wasp (Hymenoptera) maintain genomic polydnaviruses, which are produced at high levels as non-replicating viral forms during egg development and subsequently suppress host larval immunity, making them essential for egg survival (47,74). Thus horizontally transmitted genetic elements are common in the genomes of all species.
The mammalian chromosome presents an especially interesting case of accumulation of 'parasitic' DNA. All placental species have unique LINE elements present at very high abundance as well as other related and even more abundant elements, such as the SINEs or primate-specific Alu elements (see (70) for references). Yet there appears to be no common progenitor to these elements. All these elements appear to be products of reverse transcription of cellular RNAs; however, there is no explanation for the conservation of RT activity in mammals.
Although endogenous retroviruses are found in most organisms prior to the mammalian radiation, the levels of these genomic agents are relatively low in non-mammals, and the nature of the retroposons seems distinct from that in mammals. Mammalian LINEs, for example, lack a precise 5' end, have no poly-A 3' end, and lack RT coding regions that are characteristic of all LINE elements, as opposed to avian or other retroposon elements of vertebrates that do not have these features. Why are mammalian (eutherian) chromosomes especially so full of these RT-derived agents? What selects for their generation or retention?
Additional evidence that genes can move across species boundaries even in eukaryotes comes in the June 13, 1997, issue of Science. A report there by Frederico J. Gueiros-Filho and Stephen M. Beverley of Harvard describes the "Trans-kingdom Transposition" of a gene-size piece of DNA known as a transposable element (*). The particular transposable element they studied, called mariner, has already been found in planaria, nematodes, centipedes, many insects, and humans (*). Until recently, transposable elements were considered to be functionless, or "junk DNA." But John McDonald, a professor in the department of genetics at the University of Georgia, concludes, "It now appears that at least some transposable elements may be essential to the organisms in which they reside. Even more interesting is the growing likelihood that transposable elements have played an essential role in the evolution of higher organisms, including humans" (*). Another team of biologists has demonstrated that by transformation (discussed above in bacteria) a mariner element can become installed into the inherited genome of zebrafish (*). Viruses are not the only mobile genetic elements.
Damage control! By virtue of their mobility, transposons have a considerable capacity to cause havoc in their host's genome! The arrival of a transposon can have a range of effects, including a mild alteration in gene expression, gene deletion, or a major chromosome rearrangement; in some cases its action can be lethal to the host, and in others it has no effect at all. These various effects are determined by the particular characteristics of the transposon, as well as its site of integration, that is, whether it is within a gene or the gene's regulatory regions. Most commonly, transposons have a negative effect upon their host by inhibiting normal gene action and reducing the normal quantity of a given gene product.
Transposons can sometimes become deleterious to a host when they attempt to leave. They not only display a random pattern of site selection, they also leave random patterns of left-over DNA in the host's genome, when they depart. An 'excision event' may be precise - leaving the host's DNA as it was found, or it may leave certain pieces of DNA behind - catalysing the deletion of genes, chromosomal rearrangements or the translocation of genes within the host's genome. The transposon may even take some of the host's DNA with it to the next insertion site.
To complicate matters further, incomplete transposons, which cannot move by themselves, may be reactivated by an 'active' transposon located elsewhere in the host's genome. For example, an active transposon can share its 'transposase' with a locally situated incomplete transposon, restoring its mobility, including the ability to leave the host's genome!
Not surprisingly, the rate at which transposons jump into a genome is very low. The higher the frequency of insertion, the higher the probability of a lethal insertion. In the vinegar fly, under normal conditions, an average of 10^-4 insertions occur per generation. This does not mean that the entire genome has the same affinity for transposons; rather, some genes appear to be particularly attractive to some transposons.
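To get a feel for what a rate of this size means over evolutionary time, the following back-of-the-envelope sketch reads the quoted figure as 10^-4 insertions per generation and treats generations as independent. Both the independence assumption and the function names are illustrative choices, and the text does not say whether the rate is per element, per locus or per genome.

```python
import math


def expected_insertions(rate: float, generations: int) -> float:
    """Expected number of insertions accumulated along one lineage."""
    return rate * generations


def prob_at_least_one(rate: float, generations: int) -> float:
    """Probability of one or more insertions, assuming independent generations."""
    return 1.0 - (1.0 - rate) ** generations


RATE = 1e-4  # insertions per generation, as quoted for the vinegar fly

mean_after_10k = expected_insertions(RATE, 10_000)
chance_after_10k = prob_at_least_one(RATE, 10_000)
# For small rates the chance is close to 1 - exp(-rate * generations).
poisson_approx = 1.0 - math.exp(-RATE * 10_000)
```

Under these assumptions an event that is rare in any single generation becomes more likely than not within roughly ten thousand generations, which is short on an evolutionary timescale.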
[Eukaryotic transposable elements and genome evolution. D.J. Finnegan. Trends in Genetics, 1989, 5, 103-107.]
The Viruses in All of Us - The Viruses That Make Us. In addition, mammals appear to have retained at least some copies of non-defective 'genomic retroviruses', such as intracisternal A-type particles (IAPs) or endogenous retroviruses (ERVs) (Lower et al., 1996; Urnovitz and Murphy, 1996). It is currently difficult to account for the selective pressure that retains these genomic viruses, since they often lack similarity to existing free autonomous retroviruses. It is widely accepted that viral agents act as a negative selecting force on their host. However, viral agents have very high mutation and adaptation rates. This characteristic led Salvador Luria (1959) to speculate early on that perhaps viruses contribute to host evolution.
Similarly, other routines might result in rearrangement of code within organisms, rather than point mutations. While most such alterations of code are likely to result in the death of the organism, some small percentage of the altered individuals would presumably survive and would provide the raw material for evolution.
Selfish Code - Transposons. The explosion of diversity in the Cambrian occurred in the lineage of the eukaryotes; the prokaryotes were not players in the explosion. One of the most striking genetic differences between eukaryotes and prokaryotes is that most of the genome of prokaryotes is translated into proteins, while most of the genome of eukaryotes is not. It has been estimated that typically 98% of the DNA in eukaryotes is neither translated into proteins nor involved in gene regulation, that it is simply "junk" DNA (Thomas, 1971). Orgel and Crick (1980) and Doolittle and Sapienza (1980) have suggested that this junk code is the result of the self-replication of pieces of DNA within rather than between cells.
Mobile genetic elements, transposons, have this intra-genome self-replicating property. It has been estimated that 80% of spontaneous mutations are caused by transposons (Green, 1988). Repeated sequences, resulting from the activity of mobile elements, range from dozens to millions in numbers of copies, and from hundreds to tens of thousands of base pairs in length. They vary widely in dispersion patterns from clumped to sparse (Jelinek and Schmid, 1982).
Transposons code for an enzyme, "transposase", which makes copies of the transposon and inserts them somewhere else in the genome of the same cell, though not all mobile elements code for their own transposase. Larger transposons carry one or more genes in addition to those necessary for transposition. Transposons may grow to include more genes; one mechanism involves the placement of transposons into close proximity so that they act as a single large transposon incorporating the intervening code. In many cases transposons carry a sequence that acts as a promoter, altering the regulation of genes at the site of insertion (Syvanen, 1984).
Transposons do produce gene products (e.g., transposase) and often are involved in gene regulation. However, they may have no effect on the external phenotype of the individual (Doolittle and Sapienza, 1980). Therefore they evolve through another paradigm of selection, one that does not involve an external phenotype. They are seen as a mechanism for the selfish spread of DNA which may become inactive junk after mutation (Orgel and Crick, 1980).
DNA of transposon origin can be recognized by its palindromic ends flanked by short non-reversed repeated sequences resulting from insertion after staggered cuts. In Drosophila melanogaster approximately 5 to 10 percent of the total DNA is composed of sequences bearing these signs. There are many families of such repeated elements, each family possessing a distinctive nucleotide sequence, and distributed in many sites throughout the genome. One well known repeated sequence occurring in humans is found to have as many as a half million copies in each haploid genome (Strickberger 1985). Elaborate mechanisms have evolved to edit out junk sequences inserted into critical regions. An indication of the magnitude of the task comes from the recent cloning of the gene for cystic fibrosis, where it was discovered that the gene consists of 250,000 base pairs, only 4,440 of which code for protein; the remainder are edited out of the messenger RNA before translation (Kerem et al. 1989; Marx 1989; Riordan et al. 1989; Rommens et al. 1989).
It appears that repeated sequences in genomes originated as transposons favored by selection at the level of the gene, favoring genes which selfishly replicated themselves within the genome. However, some transposons coevolved with their host genome as a result of selection at the organismal or populational level, favoring transposons which introduce useful variation through gene rearrangement. In this manner, "smart" genetic operators evolved, through the interaction of selection acting at two or more hierarchical levels (it appears that some transposons have followed another evolutionary route, developing inter-cellular mobility and becoming viruses). It is likely that transposons today represent the full continuum from purely parasitic "selfish DNA" to highly coevolved genetic operators and gene regulators. The possession of smart genetic operators must have contributed to the explosive diversification of eukaryotes by providing them with the capacity for natural genetic engineering.
In designing self-replicating digital organisms, it would be worthwhile to introduce such genetic parasites, in order to facilitate the shuffling of the code that they bring about. The excess code generated by this mechanism provides a large store of relatively neutral code that can randomly explore new configurations through the genetic operations of mutation and recombination. When these new configurations confer functionality they may become selected for.
Transposable Elements in Eukaryote Mutations. Investigation of spontaneous sequence rearrangements in different species gives a window on classes of recombination occurring in different eukaryote genomes. Although this provides only a very restricted time view of a very long-term process, it does throw up several hints of the action of transpositional and recombinational processes.
The picture in mammals is illustrated by studies of several human and murine mutational pathologies (Meuth 1989). In some mammalian genes, point mutations outnumber the other most frequent feature, deletions, possibly associated with the nuclear scaffold and topoisomerase action. In others large rearrangements predominate. A significant class of rearrangements is crossing over involving repeated sequences, most frequently an Alu at one or both ends, or palindromic stem-loop structures. Others are retroviral insertions and anomalous immuno-recombinational events, involving switch regions. Germ-line mutations are generally more varied and complex than somatic ones, possibly because they include inter-chromosome events from crossing-over.
In Drosophila melanogaster, discounting the diverse modulated mutations caused by dysgenesis elements such as P and I, and FB and its tandem forms, a majority of spontaneous mutations are insertions of transposable elements. Although Copia is more abundant in RNA transcripts, Gypsy predominates in spontaneous transposition (Green 1988). A variety of mutations including the dilute locus (Jenkins et al. 1981, Copeland et al. 1983a) and the agouti locus (Copeland et al. 1983b) in the mouse display transpositional insertions with accompanying suppressor action by other genes and formation of solo LTRs. The yellow locus in Drosophila (Geyer et al. 1988) illustrates selective developmental alterations in regulation through insertion and consequent release by solo LTR formation.
One of the most significant studies of the potential of transpositional mutation comes from a Gypsy insert into the ct locus which resulted in concerted transposition (Tchurikov et al. 1988), possibly involving both DNA-transposition and retrotransposition, giving rise to transposition explosions - significant movements of Gypsy, other retrotransposons and possibly also P and FB elements, but without obvious chromosome aberrations. Because of the identical arrangement of these in all mutant offspring, they occur in a single germ cell at a premeiotic stage. Repeated reversions of transposable elements back to deleted positions occurred, presumably by reinsertion into the remaining LTR. These included reversions to wild-type of deletions of single-copy gene sequences. There were also multiple insertions of other mobile elements such as jockey, hercules, burdock and roo into the 5' LTR of Gypsy. Several multiple transpositions and deletions were repeated in a significant proportion of the cases, indicating a controlled explosion.
This leads to the hypothesis that under conditions of genomic shock (McClintock 1978, 1982), or cellular processes connected for example with meiosis, concerted transposition of diverse types of element spanning both the retroelements and DNA transposons may occur, resulting in significant changes in genomic organization. These may precipitate species discontinuities, along with processes such as hybrid dysgenesis.
Central to these ideas is the contrast between microevolution involving selective optimization of phenotype, maintaining species optimality and graduated variety, and macroevolution arising from genomic catastrophe, resulting in discontinuous changes in species, caused by major changes to the environment or niche of the organism. Existing organisms have survived, both because they have been optimized through phenotypic selection, and because they have sustained extensive transformations of structure in times of evolutionary disruption. Negative regulation of transposable elements reduces mutational load in times of optimization but provides structurally significant modes of mutation through coordinated transposition under situations where phenotypic optimization is breaking down. Since the principal challenges to the long-term survival of a given line occur during the disruptive phase, traditional models of Darwinian selective advantage or neutral evolution may be a relatively secondary feature of the evolutionary development of complexity, providing optimizing stability between catastrophic discontinuities - Waddington's chreode. Such discontinuities may be responsible for the great radiative adaptations such as the Cambrian and mammalian radiations.
Transposition as a Major Insertional Mutation. A retroposition event can be treated as a major insertional mutation. As such, the effect of a SINE insertion on the host organism can be deleterious, advantageous, or neutral. According to Kimura [*], most insertions should be deleterious, some of them may be neutral or nearly so, and only a minute fraction of them could be advantageous. Indeed, no retropositions with a direct advantageous effect have been observed to date. All de novo Alu insertions gave pathological phenotypes [*] [*]. Other insertions seem to have a neutral or nearly neutral effect on the products of nearby genes (for example, insertion of Alu-Sb2 into the open reading frame of the cholinesterase gene resulted in a truncated, nonfunctional protein that has no significant effect on the phenotype [*]).
From the genomic perspective, retroposition can occur in exons, introns or intergenic regions. In each of these places, it can affect the host genes. Retroposons can become part of open reading frames, be captured for different regulatory functions, serve as recombinational hot-spots, or be involved in the creation of functionally and structurally new genes. I will describe below the role of retroposons in each of these genomic events.
Recombination is a very powerful factor of evolution that produces genetic variability by using already existing blocks of biological information. Some genes evolved by exon shuffling (for review see [*]) and recent computer simulations show that DNA sequences may evolve more rapidly by homologous recombination than by point mutations [*]. The repetitive elements play an important role in the unequal recombination events. Because of their sequence similarity, they enable pairing and exchange between the unrelated fragments of chromatin. The SINEs serve as recombination hot spots (Fig. 3).
Illustration from Makalowski review [*]. SINE elements serve as recombination hot-spots allowing the exchange of genetic material between unrelated sequences.
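The hot-spot idea can be sketched with two otherwise unrelated sequences that happen to carry the same repeat. In this toy model the shared repeat is the only place where the two sequences can pair, and a crossover there swaps everything downstream. The sequences, the short stand-in for a SINE and the function name are hypothetical, and real recombination involves far more than an exact string match.

```python
def recombine_at_repeat(seq_a: str, seq_b: str, repeat: str):
    """Cross over two sequences at the first copy of a shared repeat.

    Each product keeps one parent's sequence up to and including the
    repeat and continues with the other parent's downstream sequence.
    """
    i = seq_a.find(repeat)
    j = seq_b.find(repeat)
    if i < 0 or j < 0:
        raise ValueError("both sequences must contain the repeat")
    end_a = i + len(repeat)
    end_b = j + len(repeat)
    return seq_a[:end_a] + seq_b[end_b:], seq_b[:end_b] + seq_a[end_a:]


REPEAT = "GGCCGG"                      # short stand-in for a SINE copy
parent_a = "aaaa" + REPEAT + "tttt"    # unrelated flanks in lower case
parent_b = "cc" + REPEAT + "gggggg"

child_1, child_2 = recombine_at_repeat(parent_a, parent_b, REPEAT)
```

Because the repeat sits at different offsets in the two parents, the products differ in length from both of them, which is the "unequal" aspect of the exchange: whole blocks of existing sequence are recombined in a single step.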
Sometimes recombination has positive effects, although most of the observed recombination events lead to pathological phenotypes (e.g., a rearrangement between two Alu sequences from chromosomes 5 and 18 led to a tumorigenic oncogene tre [*]).
Retroposons can mediate the meiotic cross-over between homologous chromosomes. The mouse MT elements are involved in the recombinational hot spots located in MHC locus [*], [*], [*]. The Drosophila melanogaster HeT family is also involved in the recombination events. Biessman et al. [*] showed that these sequences insert into the chromosome breaks and effectively stabilize a damaged chromosome. The mammalian L1 elements behave similarly [*].
One of the most important steps in the gene transcription process is formation of the transcription complex, which is strongly dependent on a special signal within a gene called a promoter. Promoters usually lie upstream of the transcribed sequence but they can also be part of it (internal promoters are characteristic of genes transcribed by polymerase III). The insertion of a retroposon in the proximity of a transcribed sequence can affect the efficiency of transcription. In certain situations the inserted element can become part of the promoter signal.
As mentioned, the insertion of a repetitive sequence into a gene can influence its transcription. The SINEs and other repeats can act as tissue-specific enhancers or silencers of the adjacent genes. Below I review several examples of the repetitive elements that were captured and came to be involved in the regulation of host gene transcription.
Alu sequences may be a factor in Primate Evolution [New Scientist 25 Sept 95] THE human genome is littered with "junk" DNA that everyone used to think had no real function. But now one of the most common types of genetic junk turns out to contain a working copy of a genetic switch that activates other genes. The junk sequence, known as Alu, may have played an important role in the evolution of primates, says Wanda Reynolds of the Sidney Kimmel Cancer Center in San Diego. Alu is a 283-nucleotide sequence that acts as a "jumping gene". From time to time, it inserts copies of itself randomly into the genome. Over the past 30 to 60 million years these insertions have occurred repeatedly, leaving roughly a million copies of Alu scattered through the human genome and making up almost 10 per cent of all the DNA in each cell. During this time, the sequences of the various Alus have begun to diverge, so that four distinct subfamilies of Alu can now be recognised. While studying one of these subfamilies, Reynolds noticed a short stretch of DNA, only 14 bases long, that looked familiar. Elsewhere in the genome, there are nearly identical sequences that function as anchor points for proteins that bind to hormones and which therefore provide a way for hormones to turn genes on and off. Reynolds and her colleague Gordon Vansant learned that the Alu sequence also binds to a hormone receptor, in this case the receptor for a hormone called retinoic acid, which activates genes at the proper times during development (Proceedings of the National Academy of Sciences, vol 92, p 8229). Vansant and Reynolds then turned their attention to a naturally occurring Alu that sits close to the human gene for keratin, a protein found in our skin, hair and nails. They looked at cells in which they had replaced the keratin gene with a "marker" gene whose activity could be easily measured. When the researchers then deleted the Alu sequence, they found that the marker gene became 35 times less active.
Since submitting their paper for publication, they have found functional binding sites for the retinoic acid receptor in a second subfamily of Alus. This subfamily also contains sequences that bind to thyroid hormone receptors, says Reynolds, "so the story is going to get even more interesting". A few Alus have previously been shown to affect the activity of nearby genes, but the new study is the first to show how. The results also provide the first clear evidence that most Alus could have the potential to regulate human genes. Other researchers have been hunting for similar effects but without success. "I've been looking for mobile elements carrying out significant regulatory roles, and I've made little progress," says Roy Britten of the California Institute of Technology in Pasadena. Reynolds believes that most Alus have little effect on nearby genes, perhaps because they are bundled deep within folds of DNA. But she says that with a million Alus strewn randomly through the genome during the course of primate evolution, at least a few are likely to have landed where they could regulate a nearby gene. When this occurred, she suggests, the effect would be equivalent to randomly twisting a knob on an instrument panel. Usually the effect would be harmful, but once in a while it might produce an interesting and beneficial genetic novelty. "We can't prove it," she says, "but it seems that over the last 30 to 50 million years, it would provide good evolutionary fodder." Bob Holmes, Santa Cruz
A fascinating possibility from the evolutionary point of view is that SINEs can contribute to the origin of new gene functions. A new function can be initiated by a copy of an existing gene, by a new composite made of unused pieces of DNA, or by any intermediate between both. In the two last cases the genetic information may derive from repetitive elements scattered through the genome.
They interact with the surrounding sequences and nearby genes, may serve as recombination hot spots (just because of their high copy number in a genome) and they can acquire specific cellular functions such as RNA transcription control and stabilization of the chromatin structure.
It is obvious that maintaining these functions requires the coevolution of SINEs and other cellular components involved in the process. For instance, in order for an enhancer to maintain its proper function, the specific sequence (such as a part of a SINE element) and protein interacting with this signal have to coevolve. It is also clear that none of the retroposons is 'ready' for a specific function. The de novo insertion of a repeat, if not deleterious, is only neutral or near neutral for the nearby genetic elements. If the insertion is neutral, its fate in the population is determined by the genetic drift [*]. At this stage of their evolution inserted sequences change relatively fast by accumulating random mutations with the rate close to the mutation rate. If by chance they change in such a way as to support a cellular function (for example if a cryptic splicing site is activated or a SINE sequence starts interacting with a specific transcription factor), the natural selection replaces the genetic drift. The selection can favor changes in a certain direction to better adapt a SINE element to the acquired function or to protect the element from disruptive mutations. Once SINEs reach this stage of their evolutionary history their fate is controlled by natural selection [*]. The genomes are dynamic entities. New functional elements appear and the old ones go extinct. The information reviewed above shows that SINEs and other retroposable elements are a major evolutionary force serving at least as a reservoir for motifs used by natural selection in its evolutionary experiments. As such, SINEs should be called genomic scrap yard rather than junk DNA.
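The hand-over from drift to selection described above can be illustrated with a small simulation. This is only a minimal sketch under stated assumptions, not a model taken from the cited literature: it assumes a haploid Wright-Fisher population of constant size, and the names `simulate_insertion` and `fixation_fraction`, the population size and the selection coefficient `s` are all hypothetical choices made for illustration.

```python
import random

def simulate_insertion(pop_size, s=0.0, rng=None, max_generations=100_000):
    """Follow one new insertion in a haploid Wright-Fisher population.

    pop_size: number of genome copies (assumed constant).
    s: selective advantage of carriers (0.0 means neutral, i.e. drift only).
    Returns the final copy count, normally 0 (lost) or pop_size (fixed).
    """
    rng = rng or random.Random()
    count = 1  # the insertion starts as a single copy
    for _ in range(max_generations):
        if count == 0 or count == pop_size:
            break
        # Expected carrier frequency after selection.
        p = count * (1 + s) / (count * (1 + s) + (pop_size - count))
        # Binomial sampling of the next generation (this is the drift step).
        count = sum(1 for _ in range(pop_size) if rng.random() < p)
    return count

def fixation_fraction(pop_size, s, trials, seed=0):
    """Fraction of independent insertions that reach fixation."""
    rng = random.Random(seed)
    fixed = sum(simulate_insertion(pop_size, s, rng) == pop_size
                for _ in range(trials))
    return fixed / trials

if __name__ == "__main__":
    print("neutral :", fixation_fraction(50, 0.0, 2000))
    print("favoured:", fixation_fraction(50, 0.1, 2000))
```

With `s = 0.0` the insertion behaves as in the neutral stage of the text, and its theoretical fixation probability in this haploid setting equals its starting frequency, 1/`pop_size`. Setting `s` above zero stands in for the later stage, in which the element has acquired a function and selection starts to favour it.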
It is common knowledge now that the evolution of genetic ensembles occurs by means of gene duplications (Altenberg, 1994; Li and Noll, 1994; Wagner, 1994). The typical story of gene origin is as follows: (1) gene duplication; (2) its fixation in the population through selection or drift; (3) maintenance of gene function by selection; (4) gene evolution under mutation and selection.
In recent years, data have been obtained showing that, besides duplications, gene transpositions are involved in this process. Transposable elements (transposons and retroposons) are a major source of genetic change, including the creation of novel genes, the alteration of gene expression in development, and the genesis of major genomic rearrangements (Lozovskaya, 1995).
A set of possible strategies of interaction between transposable elements (TEs) and the host genome is being discussed in the biological literature. Not all of these strategies are incompatible. First of all, the destabilisation of the host genome by transposons appears to be of particular interest in the context of macroevolution.
McClintock characterised these genetic phenomena as "genomic shock" (McClintock, 1984). In particular, it is worth noting the phenomenon of hybrid dysgenesis (Lozovskaya, 1995), in which multiple unrelated TEs are mobilised simultaneously through destabilisation of the Drosophila genome. As a rule, TEs remain silent in the Drosophila genome until some stress factor (temperature, irradiation, DNA damage, the introduction of foreign chromatin, viruses, etc.) activates them. The insertion of activated TEs into a number of loci can lead to alteration of the gene expression pattern.
From the viewpoint of evolution, stress induction of transpositions is a powerful factor generating new genetic variation in populations under stressful environmental conditions (Vasil'eva LA, 1997). Passing through a "bottleneck," a population can rapidly and significantly alter its population norm and become the founder of new, normal forms.
By contrast, the estimated typical rate of transposition in natural populations of Drosophila (number of transpositions per element per generation) is of the order of 10^-4 (Charlesworth, 1992).
The next extremely interesting fact is that TE families differ substantially in the number of copies per genome, in how strongly their insertion sites are predetermined, and in the negative selection pressure observed on different TE lines in natural populations. Both transposons with rigidly predetermined insertion sites and transposable elements with low insertion-site specificity (high and low insertion polymorphism) are well characterized and identified.
On the other hand, coefficients of negative selection can differ substantially between Drosophila species. These facts have stimulated a general discussion in the biological literature of possible alternative strategies of interaction between TEs and the host (Drosophila) genome: the model of selection against insertional mutations (Charlesworth, 1991) and the alternative model in which the deleterious effect comes from chromosomal rearrangements caused by recombination between TE insertions.
The long coexistence of TEs in the genome is expected to be accompanied by host - transposon co-evolution. Indeed, the important role of host factors in the regulation of TEs has been illuminated by recent studies of several systems in Drosophila (Lozovskaya, 1995).
Transposon insertions can affect the expression patterns of endogenous genes by adding and distributing specific control elements throughout the host genome. TEs in addition to reducing the fitness of their hosts may also provide a rich pool of regulatory elements that contribute to the long-term evolutionary potential of the population in a beneficial manner (Bucheton, 1995).
Transposition of a transposon into or near a particular host gene, possibly followed by an excision event leaving behind the transposon's regulatory sequences, might impose novel developmental control on such a host gene. Such a mechanism would serve to confer evolutionarily significant alterations in the spatial-temporal control of gene expression (Bronner, 1995).
Ecology of transposable elements: feedback between mobile and cellular protocols
The theme of this paper is that two forms of feedback between mobile and cellular protocols lead to a complex ecological relationship between the forms of transposable element and recombination processes in the cellular genome. The first of these is the feedback between phenotypic and element survival, which may require, for example, negative regulation of element transposition. The second is the feedback that cellular recombination processes have on the transposable element ecology, as a result of both element and organism adaptation and survival.
It is proposed that the evolution of major phyla is accompanied by particular structural relationships between the resident mobile families, constituting a stochastic dynamical system, which is characteristic both of the individual protocols of the elements and of cellular recombinational and structural features. In particular in mammals, a global linkage is proposed between five types of transposable structure, LINEs, SINEs, retrogenes, endogenous LTR-retroelements and cellular conversion processes, which constitutes a stochastic process phased to the reproduction cycle, providing a higher-level event space structure, enhancing survival through adaptability.
Evolution Evolving [King, 1978; 1985; 1989; 1991]
One of the principal roles of transposable elements in evolution may be to introduce new classes of probability structure which are also phased with the higher level informational structure of the genome and its regulation, rather than the lower level primary structure of base sequence mutation and random deletion/insertion. The larger scale spontaneous mutations in mammals are dominated by deletions unrelated to gene structure, possibly associated with topoisomerase and nuclear scaffold effects. By contrast, although often lacking promoter sequences, retrogenes consist of a complete coding gene unit, an insertional mutation of relatively high probability of functional advantage if coupled with suitable promoters. Their capacity to associate genes with new regulatory elements may complement the more restricted action of cellular gene duplication (Brosius 1991) making the lack of promoter a positive feature. Various forms of gene conversion may provide an avenue for assimilating information contained in inactive pseudogenes or regaining new function.
The very much smaller information content of a typical response element by comparison with a typical coding sequence makes it possible that retrogenes constitute an optimal form of higher level mutation, given the upstream promoter structure of pol II transcription. Calculation of the probability of finding a cryptic response element consisting of 8-10 key bases with 20%-25% divergence, flanked by a 6 bp inverted repeat (20bp) with possible 1 base inserts in either side of the stem or loop can be approximated as follows :
Probability of 8-10 bp loop ~ 0.0004 to 0.004; probability of stem ~ 0.01.
Thus probability of a response element ~ 4 x 10^-6 to 4 x 10^-5.
Hence probability (response element / 1 kb pseudogene) ~ 4 x 10^-3 to 4 x 10^-2 ~ 10^-2.
The very short sequences in the MuLV LTR enhancer (5-15 bp) show that even this construct may overestimate complexity. By contrast, if a 450 bp open reading frame has 70 amino acids with 25/64 possible codon changes, 40 with 8/64 and 40 with 4/64, we have a probability of ~ 1.3 x 10^-113. Allowing 550 positions per kb we have ~ 7 x 10^-111.
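The arithmetic behind these estimates can be checked with a few lines of Python. The sketch below only multiplies the figures quoted in the text. Treating "per 1 kb" as roughly 1000 independent start positions is an assumption about how the original estimate was formed, and the variable names are illustrative.

```python
from math import log10

# Figures quoted in the text.
p_loop = (0.0004, 0.004)   # 8-10 bp loop, low and high estimate
p_stem = 0.01              # 6 bp inverted-repeat stem

# Probability of a response element at one position (low, high).
p_element = tuple(p * p_stem for p in p_loop)

# Assumption: about 1000 candidate start positions per 1 kb pseudogene.
p_per_kb = tuple(p * 1000 for p in p_element)

# A 450 bp open reading frame has 150 codons, grouped here by the
# fraction of the 64 codons tolerated at each position.
codon_classes = [(70, 25 / 64), (40, 8 / 64), (40, 4 / 64)]
log10_p_orf = sum(n * log10(p) for n, p in codon_classes)

# Allowing 550 positions per kb, as in the text.
log10_p_orf_per_kb = log10_p_orf + log10(550)

print("response element:", p_element, "per kb:", p_per_kb)
print(f"open reading frame: 10^{log10_p_orf:.2f}, per kb: 10^{log10_p_orf_per_kb:.2f}")
```

Working in base-10 logarithms keeps the very small coding-sequence probability readable. The results come out close to the values quoted above, and they make the contrast concrete: a short regulatory signal is more than a hundred orders of magnitude likelier to arise by chance than a specific coding sequence.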
It is thus possible that although only a minority of retrogenes carry 5' regulatory signals, the probability of neighbouring sequences to an insert harbouring cryptic promoters is high enough to provide useful retrogenes such as the case noted with human amylase. Trends such as increasing AT content in spacer regions (Moreau et al. 1982) and similar increase in pseudogene mutations (Graur et al. 1989) may further increase the likelihood of specific signals such as TATA and AATAAA and contribute to enhancer structures, through weaker base-pairing.
LTR-retroelements provide transposition of coupled LTR regulatory sequences and recombinant coding genes, and support cell to cell transfer of information through budding. The formation of solo-LTRs in high copy number also provides an important independent mechanism for the dissemination of regulatory sequences, complementary to the spread of coding sequences by retrogenes (McDonald 1990). LTRs are both highly variable and include generic promoters and multiple enhancers capable of constitutive, tissue-specific and hormonal regulation.
SINEs such as Alu may contribute further types of transformation through their ubiquitous repeated nature as illustrated in fig. The existence of repeated sequences makes further provision for gene duplication through unequal crossing over and subsequent conversion events. Limitation of the effects of inversions and translocations caused by repeated sequences to tolerable levels is likely. The apparent absence of DNA transposons and compound foldback structures in mammals may reflect the replacement of the TE type translocations by more effectively modulated Alu-based events. LINEs provide for modulated expression of transposition in a manner possibly linked to cellular recombination in gametogenesis fig, thus forming a central catalyst for modular transposition.
If each of these modes of transpositional transformation is limited to a tolerable load per generation, probabilities of deleterious mutation remain fixed while probabilities of functional mutation are elevated by a factor of up to 10^100. The variation of global family distributions between, for example, Drosophila and mammals may thus reflect distinct stochastic dynamical systems linking element and organism survival.
Coordinated Models of Recombinational and Element-Mediated Transposition
Cellular recombination, including gene duplication and divergence, is pivotal to the structure of the genome. The very existence of repeated sequences in the genome can lead to unequal meiotic crossing over through recombination between staggered repeats, and gene duplication or the spread of one sequence copy through a tandemly repeated family. Gene conversion, resulting from repair of mismatch between partially homologous alleles of a gene, is a more general process which can occur between sequences on different chromosomes. The free single-stranded feeler appears to be dominant in conversion.
Both these processes constitute important cellularly-regulated mechanisms of sequence transposition, which play a role in processes from the cassettes of the yeast MAT mating locus, through to the maintenance of sequence uniformity in repeated genes such as those for rRNA and sequence relationships in immunoglobulin families (Edelman & Gally 1970). The particular form of cellular recombination mechanisms varies from a predominantly homologous form in yeast through to non-homologous forms in the majority of eukaryotes (Fink 1987). The exact form of recombination is likely to be specific to each major phylum, and to apply differently to coding gene families, transposable elements and simple sequence DNA.

Figure: Coordinated family recombination: (a) Repeated elements such as Alu and L1 facilitate gene duplication (b). This is complemented by pol II and pol III (promoter-containing) retrogene formation (d). LTR insertion (c) and cryptic promoters recombine further regulatory signals. Conversion (e) results in intra-family recombination.
The universality of recombinational enzymes is hinted at by the common use of the χ sequence in both prokaryotic λ-phage and immunoglobulin somatic recombination signals (Kenter & Birshtein 1981). The relation between the mini-satellite consensus GGAGGTGG GCAGGAXG and χ (Jeffreys 1985) appears to provide a mechanism for its transposition throughout the genome. The observed rates of recombination are 10 times higher than average, consistent with these sequences being recombinational hot spots.
The high incidence of transposable families in the cellular genome is a source of frequent recombinational events including inversion, excision or insertion. Merely by existing, a repeated family can promote duplication, both of itself and of any gene flanked by tandem repeats. Gene conversion can also act on transposable families both to create new variants and to standardize a family through concerted evolution. Cellular recombination processes between simple sequence DNA may also assist in disseminating promoter/enhancer sequences.
One of the most interesting possibilities is that gene duplication and retrogene formation can act as complementary transpositional pathways promoting recombinational diversity. Duplicative and retro-pseudogenes together form a family gene pool in the organism capable of generating diversity through conversion events. Novel regulatory or functional genes existing in a divergent family would enable mutational divergence to induce differential changes in coordinated regulation, with recombinant pseudogenes providing either a type of class switching or more graduated change than that possible by a single mutating gene.
Coordinated family recombination : Duplicative gene families and their processed retrogenes constitute a mutually interactive recombinational pool of variants optimizing modular gene diversity through independent routes which both preserve and recombine regulatory sequences, fig.
These mechanisms taken together lead to a model which generates a diversity of recombinational mechanisms to provide as general a class of genetic transformations as possible. High copy repeat SINEs and LINEs provide local nuclei for unequal crossing over, which increases the probability of duplication of cellular genes, complete with regulatory signals. Three further processes recombine coding genes with new regulatory signals. Pol II retrogenes provide coding sequence, which if it is preferentially inserted into an AT-rich region already has enhanced probability of possessing a cryptic promoter. Pol III transcripts provide a further class of sequence sometimes including promoter regions. Complementary to this, LTR insertion provides an independent route for generating new control regions. Finally the interaction of duplicative gene families with their collective retrogenes provides a pool of successive mutual rearrangements through cellular conversion.
If Evolution Can, Why Cannot We?
And in computer life, where the term "species" does not yet have meaning, we see no cascading emergence of entirely new kinds of variety beyond an initial burst. In the wild, in breeding, and in artificial life, we see the emergence of variation.
No one has yet witnessed, in the fossil record, in real life, or in computer life, the exact transitional moments when natural selection pumps its complexity up to the next level. There is a suspicious barrier in the vicinity of species that either holds back this critical change or removes it from our sight.
Synthetically reproduced protolife and artificial evolution in computers have already unearthed a growing body of nontrivial surprises. Yet artificial life suffers from the same malaise that afflicts its cousin, artificial intelligence. No artificial intelligence that I am aware of-be it autonomous robot, learning machine, or massive cognition program-has run more than 24 hours in succession. After a day, artificial intelligence stalls. Likewise, artificial life. Most runs of computational life fizzle out of novelty quickly. While the programs sometimes keep running, churning out minor variation, they ascend to no new levels of complexity or surprise after the first spurt (and that includes Tom Ray's world of Tierra). Perhaps given more time to run, they would. Yet, for whatever reason, computational life based on unadorned natural selection has not seen the miracle of open-ended evolution that its creators, and I, would love to see.
As the French evolutionist Pierre Grasse said, "Variation is one thing, evolution quite another; this cannot be emphasized strongly enough.... Mutations provide change, but not progress." So while natural selection may be responsible for microchange-a trend in variations-no one can say indisputably that it is responsible for macrochange-the open-ended creation of an unexpected novel form and progress toward increasing complexity.
Spontaneously directed variation and selection is an incredibly powerful problem solver. Natural selection indeed works over the immediate short term. We can use it to find what we can't see and fill in what we can't imagine. The question comes down to whether random variation and selection are sufficient alone to produce ever increasing novelty over the very long term. And if "natural selection is not enough" then what else might be at work in wild evolution, and what may we import into artificial evolution that will generate self-organizing complexity?
Postdarwinism suggests that other forces are at work in evolution in the long run. These lawful mechanisms of change reorganize life into new fitnesses. These unseen dynamics extend the Library in which natural selection may operate. This deepened evolution need not be any more mystical than natural selection is. Think of each dynamic-symbiosis, directed mutation, saltationism, self-organization-as a mechanism that will foster evolutionary innovation over the long term in complement to Darwin's ruthless selection.
Throughout this course emphasis was put on identifying the most important tools utilized in the field of Artificial Life. We started with self-organizing systems, exemplified by the logistic equation, random boolean networks, and cellular automata (e.g. Conway’s Game of Life), all characterized in terms of dynamical systems theory. Later, with the von Neumann self-reproduction scheme, I argued that state-determined (purely dynamic) systems are not able to offer open-ended evolution, that is, to increase their complexity with genuine emergence of new functionalities. Dynamic systems are restricted to the complexity of their attractor landscape.
For this purpose, systems inspired by von Neumann’s scheme, which demand the separation between the description of a machine from the machine itself, and therefore introduce the concept of memory and external selection, were introduced. Such systems offer a model of the mechanisms utilized by natural selection, and are accordingly known as evolutionary systems (or evolutionary strategies) — e.g. genetic algorithms and evolutionary programming. We can also refer to the mechanisms utilized to model the kind of evolution that natural selection offers as memory based selective strategies: selection acting on memory elements in order to change the dynamic structure they encode.
I further emphasized hybrid systems which try to model both the self-organizing and selective mechanisms of biological systems, and can therefore offer a more complete understanding of evolutionary systems. I referred to these systems as getting close to the category of local memory based selective self-organization, or semantic closure. In practice I showed those approaches aiming at the introduction of non-deterministic, self-organizing, developmental steps between genotype and phenotype, such as the evolution of boolean/neural networks encoded through L-System rules in a genetic algorithm. Understanding the relative importance of the two basic categories of organization in artificial systems introduces a very powerful way to study the relative importance of self-organization and natural selection in biological systems themselves. In other words, by creating different forms of "life-as-it-could-be" with different degrees of both these categories, we may shed some light on the credit assignment problem of biology: how much of evolution is a result of natural selection and how much is a result of the self-organizing characteristics of its specific materiality.
Heinz von Foerster [1965, 1969, 1977] equated the ability of an organization to classify its environment with the notion of eigenbehavior. He postulated the existence of some stable structures (eigenvalues) which are maintained in the operations of an organization's dynamics. An eigenvalue of an organizationally closed system can be seen as an attractor of a self-organizing dynamical system. The global "cooperation" of the elements of a dynamical system which spontaneously emerges when an attractor state is reached is understood as self-organization [von Foerster, 1960; Haken, 1977; Prigogine, 1985; Forrest, 1991; Kauffman, 1993]. The attractor behavior of any dynamical system is dependent on the structural operations of the latter, e.g. the set of boolean functions in a boolean network. Speaking of an attractor makes sense only in relation to its dynamical system, likewise, the attractor landscape defines its corresponding dynamical system. Further, attractor values can be used to refer to observables accessible to the dynamical system in its environment and therefore perform relevant classifications in such environment (e.g. neural networks).
What is usually referred to as self-organization is the spontaneous formation of well organized structures, patterns, or behaviors, from random initial conditions. The systems used to study this phenomenon are referred to as dynamical systems: state-determined systems. They possess a large number of elements or variables, and thus very large state spaces. However, when started with some initial conditions they tend to converge to small areas of this space (attractor basins), which can be interpreted as a form of self-organization. Since such formal dynamical systems are usually used to model real dynamical systems such as chemical reaction networks and non-equilibrium thermodynamic behavior [Nicolis and Prigogine, 1977], the conclusion is that in nature there is a tendency for spontaneous self-organization, which is therefore universal [Kauffman, 1993].
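Convergence from a random initial condition to an attractor can be seen directly in a small random boolean network. The following is a minimal sketch, assuming synchronous updates and randomly drawn truth tables. The network size, the number of inputs per node and the function names are illustrative choices, not taken from the works cited.

```python
import random

def make_network(n, k, rng):
    """Random boolean network: each node reads k inputs through a random truth table."""
    inputs = [[rng.randrange(n) for _ in range(k)] for _ in range(n)]
    tables = [[rng.randrange(2) for _ in range(2 ** k)] for _ in range(n)]
    return inputs, tables

def step(state, network):
    """Synchronously update every node from the current state."""
    inputs, tables = network
    new_state = []
    for node_inputs, table in zip(inputs, tables):
        index = 0
        for i in node_inputs:
            index = index * 2 + state[i]  # build the truth-table row index
        new_state.append(table[index])
    return tuple(new_state)

def find_attractor(state, network):
    """Iterate until a state repeats; return (transient_length, attractor_states)."""
    seen = {}
    trajectory = []
    while state not in seen:
        seen[state] = len(trajectory)
        trajectory.append(state)
        state = step(state, network)
    start = seen[state]
    return start, trajectory[start:]

rng = random.Random(42)
network = make_network(n=12, k=2, rng=rng)
initial = tuple(rng.randrange(2) for _ in range(12))
transient, attractor = find_attractor(initial, network)
print(f"transient length {transient}, attractor cycle length {len(attractor)}")
```

Because the system is state-determined and has finitely many states, every trajectory must eventually revisit a state and then cycle forever. The attractor found is typically a tiny part of the 2^12 possible states, which is the sense in which the dynamics "converge to small areas" of state space.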
This process of self-organization is also often interpreted as the evolution of order from a disordered start. Self-organizing approaches to life take chaotic attractors as the mechanism which will be able to increase the variety (physiological or conceptual) of organizationally closed systems. External random perturbations will lead to internal chaotic state changes; the richness of strange attractors is converted to a wide variety of discriminative power. Dynamic systems such as boolean networks clearly have the ability to discriminate inputs. Generally, the attractors of their dynamics are used to represent events in their environments: depending on inputs, the network will converge to different attractors. However, for any classification to have survival value, it must relate its own constructed states (attractors) to relevant events in its environment, thus, similar events in the world should correspond to the same attractor basin. Chaotic systems clearly do not have this property due to their sensitivity to initial conditions. Ordered systems follow this basic heuristic.
As previously discussed, for a self-organizing system to be informationally open, that is, for it to be able to classify its own interaction with an environment, it must be able to change its structure, and subsequently its attractor basins, explicitly or implicitly. Explicit control of its structure would amount to a choice of a particular dynamics for a certain task (the functional would be under direct control of the self-organizing system) and can be referred to as learning. Under implicit control, the self-organizing system is subjected to some variation of its structure (including its distributed memory) which may or may not be good enough to perform our task. Those self-organizing systems which are able to perform the task are thus externally selected by the environment to which they are structurally coupled. If reproduction is added to the list of tasks these systems can produce based on their dynamic memories, then we have the ingredients for natural selection: heritable variation and selection.
The underlying idea of computational evolutionary strategies (ES) is the separation of solutions for a particular problem (e.g. a machine) from descriptions of those solutions through a code. Genetic algorithms (GA's) work on these descriptions and not on the solutions themselves, that is, variation is applied to descriptions, while the respective solutions are evaluated, and the whole (description-solution) selected according to this evaluation. This leads to the conclusion that the form of organization attained by GA's is not self-organizing in the sense of a boolean network or cellular automata. Even though the solutions are obtained from the interaction of a population of elements, and in this sense following the general rules usually observed by computationally emergent systems, they do not strictly self-organize since they rely on the selective pressures of some fitness function. The order so attained is not solely a result of the internal dynamics of a collection of interacting elements, but also dictated by the external selection criteria.
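The separation between descriptions and solutions can be made concrete with a very small genetic algorithm. This is a sketch under stated assumptions rather than any particular published algorithm: the bit-string encoding, the target value 0.3, the truncation selection with elitism, and all function names are hypothetical choices made for illustration.

```python
import random

GENOME_LENGTH = 16

def decode(genome):
    """Map a description (bit list) to a solution (a number in [0, 1])."""
    value = int("".join(map(str, genome)), 2)
    return value / (2 ** GENOME_LENGTH - 1)

def fitness(solution, target=0.3):
    """External selection criterion: closeness to a hypothetical target value."""
    return -abs(solution - target)

def mutate(genome, rate, rng):
    """Flip each bit of the description with a small probability."""
    return [bit ^ 1 if rng.random() < rate else bit for bit in genome]

def crossover(a, b, rng):
    """One-point crossover between two descriptions."""
    point = rng.randrange(1, GENOME_LENGTH)
    return a[:point] + b[point:]

def evolve(pop_size=30, generations=40, rate=0.02, seed=0):
    rng = random.Random(seed)
    population = [[rng.randrange(2) for _ in range(GENOME_LENGTH)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        # Solutions are evaluated, but only their descriptions are kept and varied.
        ranked = sorted(population, key=lambda g: fitness(decode(g)), reverse=True)
        history.append(fitness(decode(ranked[0])))
        parents = ranked[: pop_size // 2]
        children = [ranked[0]]  # elitism: the best description survives unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, rng), rate, rng))
        population = children
    best = max(population, key=lambda g: fitness(decode(g)))
    return decode(best), history

if __name__ == "__main__":
    solution, history = evolve()
    print("best solution:", solution, "best fitness:", history[-1])
```

Variation (`mutate`, `crossover`) touches only the bit strings, while `fitness` sees only the decoded solutions. The order that emerges is therefore imposed from outside by the fitness function, which is the sense in which such a system does not strictly self-organize.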
Lately much attention has been paid to evolutionary strategies that bring together self-organizing systems and natural selection inspired algorithms. Particularly in the field of Artificial Life, Kitano [1994], and Dellaert and Beer [1994], have proposed GA's which do not encode their solutions directly, but rather encode generic rules (through L-Systems) which develop into boolean networks simulating given metabolic cycles. With these approaches, GA's no longer model exclusively selection, but also a self-organizing dimension standing for some materiality. The GA does not search the very large space of possible solutions, but a space of basic rules which can be manipulated to build different self-organizing networks. These networks are then started (sometimes with some learning algorithm) and will converge to some attractor behavior standing for a solution of our simulation. Rather than directly encoding solutions, the GA harnesses a space of possible self-organizing networks which will themselves converge to a solution: emergent morphology.
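The idea of encoding rules rather than solutions can be illustrated with a toy string-rewriting L-system. This sketch is not the encoding used by Kitano or by Dellaert and Beer. It assumes the simplest deterministic, context-free case and uses Lindenmayer's classic two-rule "algae" system as a stand-in genotype. The function name `develop` is hypothetical.

```python
def develop(axiom, rules, steps):
    """Rewrite every symbol in parallel for a number of developmental steps."""
    state = axiom
    for _ in range(steps):
        # Symbols without a rule are copied unchanged.
        state = "".join(rules.get(symbol, symbol) for symbol in state)
    return state

# The "genotype" is the small rule set; the "phenotype" is what it grows into.
genotype = {"A": "AB", "B": "A"}
phenotype = develop("A", genotype, 4)
print(phenotype)  # ABAABABA
```

A GA working in this style would mutate and recombine the entries of `genotype`, while selection would act on the developed structure. A two-rule description can unfold into an arbitrarily long structure, which is why the search space of rules is so much smaller than the space of final solutions.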
Digital Yin and Yang: Host-Parasite Coevolution in Computer
Combining artificial evolution with massively parallel computers not only magnified the power of the technique but provided further insights into the workings of evolution in nature. Danny Hillis' Connection Machine, with its 65,536 connected processors (typical computers merely have 1 processor), could simulate the growth and development of a very large number of artificial organisms and their interactions with each other at fantastically high speeds. In an early experiment, Hillis started with 65,536 randomized digital "organisms" dubbed "Ramps", one per processor, and threw them at the task of sorting a set of sixteen integers in descending order in the fewest number of exchanges possible. Each Ramp was a small program that was to compete with the ingenuity of the best human programmers; Milton Green held the record with sixty exchanges in his program. Like Holland's GAs, the Ramps were evaluated at every generation on their ability to sort the set of integers; the most successful could breed with one another (Hillis gave them an ability to choose) and survived to the next generation, where the fitness test was repeated. Hillis later tossed in artificial "parasites" who also evolved and continually challenged the Ramps to greater efficiency, quickening the pace of their evolution considerably. Eventually, after hundreds of thousands of generations, the Ramps managed to sort the integer-set in sixty-one exchanges. Their highly evolved, optimized code contained many subtle tricks that would have impressed a human programmer, in addition to being more robust and flexible than the highly specific and relatively fragile program of a human. Many AL researchers feel that the future of programming lay with the evolutionary approach of GAs: if the problem could be defined well enough, one could ignore the tedious details in programming a solution and instead evolve flexible, efficient and robust code superior to perhaps any human product.
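The kind of object being evolved in such an experiment can be sketched as a sorting network, that is, a fixed list of compare-exchange steps. The code below is an illustration only and does not reproduce Hillis' representation. It sorts in ascending order (the descending case is symmetric), uses the standard zero-one principle for exhaustive checking, and the function names and the small four-input example are assumptions chosen for brevity.

```python
from itertools import product

def apply_network(network, values):
    """Run a list of compare-exchange pairs (i, j), with i < j, over a sequence."""
    values = list(values)
    for i, j in network:
        if values[i] > values[j]:
            values[i], values[j] = values[j], values[i]
    return values

def sorts_everything(network, n):
    """Zero-one principle: a network sorts all inputs iff it sorts all 0/1 inputs."""
    return all(
        apply_network(network, bits) == sorted(bits)
        for bits in product((0, 1), repeat=n)
    )

def score(network, test_cases):
    """Fraction of test cases sorted correctly; co-evolving parasites would supply the cases."""
    passed = sum(apply_network(network, case) == sorted(case) for case in test_cases)
    return passed / len(test_cases)

# A well-known 5-exchange network for 4 inputs.
network4 = [(0, 1), (2, 3), (0, 2), (1, 3), (1, 2)]
print(sorts_everything(network4, 4))       # True
print(sorts_everything(network4[:-1], 4))  # False: one exchange too few
print(score(network4[:-1], [[3, 1, 2, 0], [0, 1, 2, 3]]))  # 0.5
```

The `score` function hints at why the parasites mattered. A network that passes an easy, fixed set of test cases can still be wrong, so test cases that evolve to expose the current weaknesses of the networks keep selection pressure on exactly the inputs that still fail.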
The coevolution of Ramps and parasites provided many insights into the evolutionary process in general. The advantage of simulating a process of the natural world on a computer lay in the operator being able to examine every detail of the system, surpassing even the evolutionary biologist's dream of having the sequenced genome of every extinct animal along with the complete fossil record. Not only did Hillis find the coevolution-evolution cycle of host and parasite to be more important than tracking the evolution of a single species, but his system also put a new twist on the old evolutionary debate of gradualism versus punctuated equilibrium. Gradualism maintained that evolutionary changes were small but constant between generations of organisms; punctuated equilibrium proposed that evolutionary changes came suddenly, spurred on by large environmental disturbances. Hillis found that, despite the appearances of punctuated equilibrium in that the Ramps typically stayed at evolutionary plateaus for some time before being forced to further adapt, underneath the calm veneer of stability the steady exchange and interactions of their digital genes prepared them for the next surge of adaptation. Although most biologists were far from convinced that a computer model could tell them anything about the real world, a growing number of them appreciated the new insights given. Until much more sophisticated tools become available to study large communities of organisms in the field, increasingly sophisticated computer models may be among the best means to understand evolution's workings. But what would be needed to move beyond models, and begin to cross the boundary between living and nonliving on a computer? The answer lay in open-ended evolution.
As the sphere of AL expanded, some began to wonder at the further tantalizing possibilities of evolution on computers. In particular, how could complexity rivaling that seen in biological life be realized? Biologist Thomas Ray saw that the unnatural selection process of previous work was far too limiting. He developed an artificial environment, named Tierra, where his digital organisms were to vie with one another for computer resources (memory, processing cycles), so that their fitness criteria resembled nature's more than a programmer's fancy. Each artificial creature was again a collection of computer instructions, initially with the simple behavior of replicating itself; each had a digital genotype and phenotype. The world of Tierra welcomed the possibility of mutation and was more forgiving of it as well; mutations were less likely to be fatal than in the real world. On January 3, 1990, Tierra received its first visitor, a creature eighty instructions long that Ray named the "Ancestor". The Ancestor's descendants rapidly propagated throughout Tierra, and much to Ray's shock and delight, Tierra soon bloomed with new "life".
Can open-ended evolution be constructed within a computer, proceeding without any human guidance? This issue was addressed by Thomas Ray who devised a virtual world called Tierra, consisting of computer programs that can undergo evolution [6]. In contrast to genetic programming where fitness is defined by users, the Tierra creatures (programs) receive no such direction. Rather, they compete for the natural resources of their computerized environment, namely CPU time and memory. Since only a finite amount of these are available, the virtual world's natural resources are limited, as in nature, serving as the basis for competition between creatures.
Ray modeled his system on a relatively late period of Earth's history, the Cambrian period, which began roughly 540 million years ago. The beginning of this period is characterized by the existence of simple, self-replicating organisms, marking the onset of evolution that resulted in the astounding diversity of species found today. For this reason, the opening of the period is often referred to as the Cambrian explosion. Ray did not wish to investigate how self-replication is attained, but rather wanted to discover what happens after its appearance on the scene. He inoculated his system with a single, self-replicating organism, called the "Ancestor", which is the only engineered (human-made) creature in Tierra. He then set his system loose, and the results obtained were quite provocative: an entire ecosystem formed within the Tierra world, including organisms of various sizes, parasites, hyper-parasites, and so on. The parasites that evolved, for example, are small creatures that use the replication code of larger organisms (such as the Ancestor) to self-replicate. In this manner they proliferate rapidly without needing to carry replication code of their own.
Like Hillis' Ramps and other GAs, the organisms of Tierra had many subtle, counter-intuitive coding methods that nonetheless were very efficient and complex. Mutants appeared that could replicate with seventy-nine instructions, then seventy-eight and even fewer. Parasites with only forty-five instructions soon followed, and like Hillis' system, an evolutionary tug-of-war broke out: as hosts developed defenses, parasites found new means of attack, and the war raged on. Later, "hyperparasites" arose that could attack the parasites through inspecting themselves for the marauders, and thence steal the parasite's replication process for their own growth. With the parasites driven to extinction, a cooperative cycle arose between groups of hyperparasites who relied on their neighbors for more efficient growth. Soon, a new breed of parasite appeared which took advantage of this cooperative cycle for its own ends, and so on.
Such remarkably rich interactions were typical in Tierra, which brought fame not only to its creator but to AL as well. Even diehard biologists took note of it and similar systems, which so clearly illustrated the cycles and interactions of evolution they observed in nature, and also provided a way to test theories that were impractical to verify in the field.
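The pressure toward shorter replicators described above can be caricatured in a few lines. The sketch below is a toy model under stated assumptions, not Tierra's instruction set or scheduler: each creature is reduced to a genome length, every creature receives the same CPU slice per tick, copying a genome costs CPU in proportion to its length, and when the total length exceeds a fixed memory capacity the oldest creatures are removed. The function `run_soup` and all of its parameters are hypothetical names chosen for illustration.

```python
import random
from collections import deque


def run_soup(capacity=800, ancestor_length=80, slice_size=10,
             ticks=200, min_length=20, seed=0):
    """Toy resource-limited soup; returns the genome lengths still alive.

    Each creature is a two-item list [length, cpu_credit], kept oldest first.
    """
    rng = random.Random(seed)
    soup = deque([[ancestor_length, 0]])
    for _ in range(ticks):
        newborns = []
        for creature in soup:
            creature[1] += slice_size          # equal CPU slice for everyone
            if creature[1] >= creature[0]:     # enough credit to copy itself
                creature[1] -= creature[0]
                # Copying is imperfect: the child may gain or lose one unit.
                drift = rng.choice((-1, 0, 0, 1))
                newborns.append([max(min_length, creature[0] + drift), 0])
        soup.extend(newborns)
        # Memory is finite: remove the oldest creatures until the rest fit.
        while sum(creature[0] for creature in soup) > capacity:
            soup.popleft()
    return [creature[0] for creature in soup]


if __name__ == "__main__":
    lengths = run_soup()
    print(len(lengths), min(lengths), max(lengths))
```

Because a shorter genome needs fewer ticks of credit to copy, shorter lineages tend to leave more descendants in this toy, which loosely echoes the seventy-nine and seventy-eight instruction mutants mentioned above. Parasitism and the other ecological interactions are not modeled here.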
VEGA. Fuzzy systems have recently been applied in a wide range of settings. However, because they lack a learning ability, most fuzzy rules and membership functions have had to be determined by human experts. In this paper, we propose a self-tuning fuzzy controller based on a virus-evolutionary genetic algorithm (VEGA). This learning algorithm is based on the virus theory of evolution. The VEGA realizes both horizontal propagation and vertical inheritance of genetic information in a population. Its main operator is a reverse transcription operator, which plays the role of crossover and selection at the same time. In addition, a transduction operator generates a substring to be transmitted. The VEGA can reduce the number of fuzzy rules by means of the reverse transcription and transduction operators. The effectiveness of the proposed method is shown through simulations of the cart-pole problem.
[K. Shimojima, N. Kubota, T. Fukuda. Virus-Evolutionary Genetic Algorithm for Fuzzy Controller Optimization.]
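The two operators named in the abstract can be pictured with plain strings. The following is a minimal sketch under assumptions, not the authors' implementation: a host is a bit string, a virus is a shorter substring, transduction copies a substring out of a host, and reverse transcription overwrites part of a host with the virus. The `infect` helper, which keeps the overwritten host only when a supplied fitness function does not get worse, is one assumed way of letting the operator act as crossover and selection at once. All function names and the fixed-position scheme are hypothetical.

```python
def transduction(host, position, length):
    """Copy a substring out of a host to serve as a virus."""
    return host[position:position + length]


def reverse_transcription(host, virus, position):
    """Overwrite part of the host with the virus substring."""
    end = position + len(virus)
    if position < 0 or end > len(host):
        raise ValueError("virus does not fit inside the host at this position")
    return host[:position] + virus + host[end:]


def infect(host, virus, position, fitness):
    """Keep the infected host only if its fitness is not lower (assumed rule)."""
    candidate = reverse_transcription(host, virus, position)
    return candidate if fitness(candidate) >= fitness(host) else host


def count_ones(genome):
    """Stand-in fitness function: the number of '1' characters."""
    return genome.count("1")


if __name__ == "__main__":
    virus = transduction("00111000", 2, 3)
    print(virus)
    print(infect("00000000", virus, 2, count_ones))
    print(infect("11111111", "000", 2, count_ones))
```

In this picture, horizontal propagation is the spread of one virus substring across many hosts within a generation, while vertical inheritance is the ordinary parent-to-child copying of a genetic algorithm.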
Genotype Outgrowth in Computer Evolution Programs
If neo-Darwinian evolution works, it should be possible to mimic the process in software. By whatever mechanism, Ohno’s or another, computers should be able to mimic what biological evolution has done. In the discussion above, we have focused on the creation of new genes that code for new functions.
1) One well-known computer program that purports to mimic evolution is the one by Richard Dawkins that creates "biomorphs". The program generates stick figures that resemble insects, trees, bats, spiders, etc. The figures show a certain amount of variety as they evolve. But the evolution is by artificial selection, and nothing like gene duplication occurs. Instead, only nine or sixteen variables (in different versions) are allowed to wander within narrow ranges. These few variables occupy a tiny fraction of the “genome” that generates the biomorphs, which includes Dawkins’s application program and the necessary parts of the computer's operating system. The sequence space explored by Dawkins’s program is tightly confined and every member of it is functional. Certainly, nothing analogous to a new gene is created by Dawkins’s biomorphs.
Dawkins acknowledges that he uses artificial selection to guide the process. His creatures are tightly constrained by his Biomorph software and completely dependent on the computer's operating software. Deleterious mutations are not possible. The ratio of actual to possible creatures in his genetic scheme is one to one: every sequence makes a viable creature. He achieves the "evolution" by adjusting only a few variables within narrow ranges. So the only changes that occur in the creatures are those whose potential is already available in the program originally. The evolution that he is able to simulate is, at most, microevolution.
2) The program by Tom Ray called Tierra is also well-known. It starts with a species that originally has 80 instructions. The creatures multiply and evolve until the computer’s storage capacity is full. From then on the population is controlled by killing off creatures ranking lower on a fitness scale. One common outcome is the evolution of parasitism. Parasitism is known to be important in biological evolution. But the evolution of parasitism does not necessarily require any new genes — the genes of the parasites and hosts already exist beforehand. True, biological genomes that become related in this way may in fact require new genes to make them compatible with each other. But in Tierra, nothing suggests that anything analogous to a new gene is ever created.
In the program there is a standard "ancestor," an "organism" consisting of eighty computer instructions. From it other organisms descend. Mutations are introduced into these descendants at a rate about a million times higher than the average mutation rate in eukaryotes. When the computer's memory is full, the older or more defective organisms are "killed off." The outcome of this artificial process is the evolution of a more compressed version of the original ancestor, one with only 36 instructions instead of eighty... Yet the noteworthy outcome of biological evolution is not the reduction of the instruction set, but the growth of it, and the emergence of new features. As real life evolves, new genes with new meaning are added to the genome.
3) "Prisoner's Dilemma." In September 1995, Langton was asked: if chance can write the new genetic code behind evolution, then it should be able to write new computer code. Can it? Langton answered yes (6). His example was a string of computer code two bits long used to specify one strategy in a computer game simulating evolution (the "Prisoner's Dilemma"). Because mutations are allowed, the string of code can occasionally double to four bits; sometimes this doubling can lead to a superior strategy. This example seems weak. If any random process can write computer programs, there should be examples more impressive than the duplication of two bits, which is equivalent to the insertion of one nucleotide in a real genome. One possible excuse for this weakness is that not enough time has gone by for self-generated computer programs to emerge. But evolution is a robust process. If new genetic programs can be created without input in the biological world, wouldn't there be some convincing indication of an analogous process in the computer world by now?
4) John Koza’s models of evolution, called Genetic Programming, start with selected algorithms that are shuffled and duplicated to create new subroutines. The subroutines are bred for their ability to solve a basic problem. While it is possible for an evolved subroutine to contain more algorithms than its parent, there is no suggestion that any new algorithms are created in Koza’s process. All of the necessary algorithms — and some that may be unnecessary — are supplied from the outset.
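The point made in item 4 can be checked on a small scale. The sketch below is an illustrative assumption rather than Koza's own system: programs are nested tuples whose first element is a function name, and subtree crossover grafts a randomly chosen piece of one parent into a randomly chosen position in the other. The helper names (`subtrees`, `replace_at`, `crossover`, `primitives`, `size`) are hypothetical. A child can be larger than either parent, yet every primitive it contains was already present in one of them.

```python
import random


def subtrees(tree, path=()):
    """Yield (path, subtree) for every node; a path is a tuple of child indexes."""
    yield path, tree
    if isinstance(tree, tuple):
        for index, child in enumerate(tree[1:], start=1):
            yield from subtrees(child, path + (index,))


def replace_at(tree, path, new_subtree):
    """Return a copy of tree with the node at path replaced by new_subtree."""
    if not path:
        return new_subtree
    index = path[0]
    grafted = replace_at(tree[index], path[1:], new_subtree)
    return tree[:index] + (grafted,) + tree[index + 1:]


def crossover(receiver, donor, rng):
    """Graft a random subtree of donor into a random position of receiver."""
    path, _ = rng.choice(list(subtrees(receiver)))
    _, piece = rng.choice(list(subtrees(donor)))
    return replace_at(receiver, path, piece)


def primitives(tree):
    """Set of function names and terminals that appear in the tree."""
    if not isinstance(tree, tuple):
        return {tree}
    found = {tree[0]}
    for child in tree[1:]:
        found |= primitives(child)
    return found


def size(tree):
    """Number of nodes in the tree."""
    return sum(1 for _ in subtrees(tree))


if __name__ == "__main__":
    mother = ("add", "x", "y")
    father = ("mul", "x", ("add", "y", "1"))
    child = crossover(mother, father, random.Random(0))
    print(child, size(child), primitives(child))
```

Crossover on its own only rearranges material that was supplied at the start. Whether a rearrangement of existing primitives should count as a new algorithm is exactly the question the passage above raises, and the sketch takes no position on it.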
Makalowski W., SINEs as a Genomic Scrap Yard: An Essay on Genomic Evolution, Chapter 5 of the book: The Impact of Short Interspersed Elements (SINEs) on the Host Genome edited by Richard J. Maraia. 1995 R.G. Landes Company, Austin.
Gueiros-Filho, Frederico J. and Stephen M. Beverley. "Trans-kingdom Transposition of the Drosophila Element mariner Within the Protozoan Leishmania" p 1716-1719 v 276 Science. 13 June 1997.
Hartl, Daniel L. "Mariner Sails into Leishmania" p 1659-1660 v 276 Science. 13 June 1997.
Williams, Phil. "Transposable Elements May Have Had A Major Role In The Evolution Of Higher Organisms" at EurekAlert!, 9 February 1998.
Fadool, James M.; Daniel L. Hartl and John E. Dowling. "Transposition of the mariner element from Drosophila mauritiana in zebrafish" p 5182-5186 v 95 n 9 Proceedings of the National Academy of Sciences of the USA. 28 April 1998.
Daniels, G.R. and P.L. Deininger, Characterization of a third major SINE family of repetitive sequences in the galago genome. Nucleic Acids Res, 1991. 19(7): p. 1649-56.
Altenberg, L. Evolving Better Representations through Selective Genome Growth. pp. 182-187. In Proceedings of the IEEE World Congress on Computational Intelligence, (1994b).
Altenberg, L. Genome growth and the evolution of the genotype-phenotype map. Pp. 205-259 in W. Banzhaf and F. H. Eeckman, eds. Evolution and Biocomputation. Computational Models of Evolution. New York. Springer-Verlag, Berlin, Heidelberg, (1995).
Bronner G; Taubert H; Jackle H Mesoderm-specific B104 expression in the Drosophila embryo is mediated by internal cis-acting elements of the transposon. Chromosoma 103: 669-75 (1995).
Bucheton A. The relationship between the flamenco gene and gypsy in Drosophila: how to tame a retrovirus. Trends Genet 11: 349-353 (1995).
Charlesworth B, Lapid A, Canada D, The distribution of transposable elements within and between chromosomes in a population of Drosophila melanogaster. I. Element frequencies distribution. Genet Res, 60: 103-114 (1992).
Ding D. and Lipshitz HD Spatially regulated expression of retrovirus-like transposons during Drosophila melanogaster embryogenesis. Genet Res 64: 167-81 (1994).
Li X, and Noll M. Evolution of distinct developmental functions of three Drosophila genes by acquisition of different cis-regulatory regions. Nature 367: 83-87 (1994).
Lozovskaya ER; Hartl DL; Petrov DA Genomic regulation of transposable elements in Drosophila. Curr Opin Genet Dev 5: 768-73 (1995).
McClintock B, The significance of responses of the genome to challenge. Science, 226: 792-801 (1984).
Petrov DA; Schutzman JL; Hartl DL and Lozovskaya ER Diverse transposable elements are mobilized in hybrid dysgenesis in Drosophila virilis. Proc Natl Acad Sci USA 92: 8050-8054 (1995).
Smith PA and Corces VG The suppressor of Hairy-wing protein regulates the tissue-specific expression of the Drosophila gypsy retrotransposon. Genetics 139: 215-228 (1995).
Wagner A. Evolution of gene networks by gene duplications: a mathematical model and its implications on genome organization. Proc Natl Acad Sci U S A 91: 4387-4391 (1994).
Urnovitz, H. B. and W. H. Murphy. 1996. Human endogenous retroviruses: nature, occurrence, and clinical implications in human disease. Clin. Microbiol. Rev. 9:72-99.
Lower, R., J. Lower, and R. Kurth. 1996. The viruses in all of us: characteristics and biological significance of human endogenous retrovirus sequences. Proc. Natl. Acad. Sci. U. S. A. 93:5177-5184.
Luria, S. E. 1959. Viruses: A survey of some current problems, p. 1-10. In A. Isaacs and B. W. Lacey (eds.), Virus Growth and Variation. Cambridge University Press, London, England.
S. A. Kauffman. The Origins of Order. Oxford University Press, New York, 1993.
J. R. Koza. Genetic Programming. The MIT Press, Cambridge, Massachusetts, 1992.
C. G. Langton. Self-reproduction in cellular automata. Physica D, 10:135-144, 1984.
C. G. Langton. Preface. In C. G. Langton, C. Taylor, J. D. Farmer, and S. Rasmussen, editors, Artificial Life II, volume X of SFI Studies in the Sciences of Complexity, pages xiii-xviii, Redwood City, CA, 1992. Addison-Wesley.
T. S. Ray. An approach to the synthesis of life. In C. G. Langton, C. Taylor, J. D. Farmer, and S. Rasmussen, editors, Artificial Life II, volume X of SFI Studies in the Sciences of Complexity, pages 371-408, Redwood City, CA, 1992. Addison-Wesley.
J. von Neumann. The Theory of Self-Reproducing Automata. University of Illinois Press, Illinois, 1966. Edited and completed by A.W. Burks.
Dawkins, Richard. The Blind Watchmaker. W.W. Norton and Company, Inc. 1987.
Koza, John R. “Genetic Evolution and Coevolution of Computer Programs” pp 603-630 Artificial Life II, Christopher G. Langton et al. eds. Addison-Wesley Publishing Company 1992.
Wickramasinghe, Chandra. "The Evidence of Professor Chandra Wickramasinghe in the Trial at Arkansas, December, 1981"
Conrad, Michael [1983]. Adaptability. Plenum Press.
Conrad, Michael [1990]."The geometry of evolutions." In: BioSystems Vol. 24, pp. 61-81.
Dellaert, F. and R.D. Beer [1994]."Toward an evolvable model of development for autonomous agent synthesis." In: Artificial Life IV. R. Brooks and P. Maes (Eds.). MIT Press.
Haken, H. [1977]. Synergetics. Springer-Verlag.
Kauffman, S. [1993]. The Origins of Order: Self-Organization and Selection in Evolution. Oxford University Press.
Kitano, Hiroaki [1994]."Evolution of Metabolism for Morphogenesis." In: Artificial Life IV. Brooks, R.A. and P. Maes. MIT Press. pp 49-58.
Mitchell, M. and S. Forrest [1994]."Genetic algorithms and Artificial Life." In: Artificial Life Vol. 1, pp 267-289.
Nicolis, G. and I. Prigogine [1977]. Self-Organization in Nonequilibrium Systems.
Rocha, Luis M. [1995]."Contextual Genetic Algorithms: Evolving Developmental Rules." In: Advances in Artificial Life. F. Moran, A. Moreno, J.J. Merelo, and P. Chacon (Eds.). Springer-Verlag. pp. 368-382.
Rocha, Luis M. [199?]."Selected Self-Organization and the Semiotics of Evolutionary Systems". In: Evolutionary Systems: The Biological and Epistemological Perspectives on Selection and Self-Organization. S. Salthe, G. Van de Vijver, and M. Delpos (eds.). Kluwer Academic Publishers, pp. 341-358.
Bell, G., and Maynard Smith J. (1987) "Short-term selection for recombination among mutually antagonistic species". Nature, 328, pp. 66-68.
Cliff, D. and Miller, G. F. (1994). "Protean Behavior in Dynamic Games: Arguments for the co-evolution of pursuit-evasion tactics". In D. Cliff, P. Husbands, J. A. Meyer and S. Wilson, editors, Proc. Third International Conf. Simulation Adaptive Behavior (SAB 94), M.I.T. Press Bradford Books.
Cliff, D. and Miller, G.F. (1995) "Tracking the Red Queen: Measurements of adaptive progress in co-evolutionary simulations". In F. Moran, A. Moreno, J. J. Merelo and P. Chacon (editors) Advances in Artificial Life: Proceedings of the Third European Conference on Artificial Life (ECAL95). Lecture Notes in Artificial Intelligence 929, Springer Verlag, pp. 200-218.
Ewald, P.W. (1995). "The evolution of virulence: a unifying link between parasitology and ecology". Journal of Parasitology, 81(5), pp. 659-669.
Goldberg, David E (1989). Genetic Algorithms in Search, Optimization and Machine Learning. Addison-Wesley Publishing Company, Inc.
Hamilton, W. D. (1980). "Sex versus non-sex versus parasite". Oikos 35: 282-290.
Hamilton, W.D., Axelrod, R. and Tanese, R. (1990). "Sexual reproduction as an adaptation to resist parasites: A review." Proc. Nat'l Acad. Sciences (USA), 87(9): 3566-3573.
Hillis, W.D. (1992). "Co-evolving parasites improve simulated evolution as an optimization procedure." Artificial Life II, SFI Studies in the Sciences of Complexity, Ed. C. Langton, C. Taylor, J. D. Farmer, & S. Rasmussen, Addison-Wesley Publishing Company.
Howard, R. S. and Lively, C. M. (1994). "Parasitism, mutation accumulation and the maintenance of sex". Nature, 367, pp. 554-556.
Hurst, L.D. and Peck, J.R. (1996). "Recent advances in understanding of the evolution and maintenance of sex". TREE, vol 11, no. 2, pp. 46-52.
Jaffe, K. (1996). "The dynamics of the evolution of sex, why the sexes are, in fact, always two?" Interciencia, 21: 259-267.
Jefferson, D.R. et al. (1991). "Evolution as a Theme in Artificial Life: The Genesys / Tracker System." Artificial Life II, SFI Studies in the Sciences of Complexity, Ed. C. Langton, C. Taylor, J. D. Farmer, & S. Rasmussen, Addison-Wesley Publishing Company.
Kauffman, S.A., and S. Johnsen. (1992). "Co-Evolution to the Edge of Chaos: Coupled Fitness Landscapes, Poised States, and Co-evolutionary Avalanches". Artificial Life II, SFI Studies in the Sciences of Complexity, Ed. C. Langton, C. Taylor, J. D. Farmer, & S. Rasmussen, Addison-Wesley Publishing Company.
Levin S.A., et al. (1997). "Mathematical and Computational Challenges in Population Biology and Ecosystems Science". Science, 275, pp. 334-343.
Maynard Smith, J. (1971). "The origin and maintenance of sex". In G. C. Williams (ed.), Group Selection. Aldine Atherton, Chicago, USA.
Maynard Smith, J. (1978). The Evolution of Sex. Cambridge University Press, Cambridge, UK.
Miller, G.F., and Todd, P.M. (1994). "Evolutionary wanderlust: Sexual selection with directional mate preferences". In D. Cliff, P. Husbands, J. A. Meyer and S. Wilson, editors, Proc. Third International Conf. Simulation Adaptive Behavior (SAB 94), M.I.T. Press Bradford Books, 1994.
Muller, H.J. (1964). "The relation of recombination to mutational advance". Mutat. Res., 1, pp. 2-9.
Ray, T. (1992). "An Approach to the Synthesis of Life." Artificial Life II, SFI Studies in the Sciences of Complexity, Ed. C. Langton, C. Taylor, J. D. Farmer, & S. Rasmussen, Addison-Wesley Publishing Company.
Syswerda, G. (1989). "Uniform crossover in genetic algorithms." In J. David Shaffer (ed.), Proceedings of the Third International Conference on Genetic Algorithms. Morgan Kaufmann Publishers.
Angeline, P. J. and Pollack, J. B. 1993. Competitive environments evolve better solutions for complex tasks. In Forrest, S., editor, Proceedings of the Fifth International Conference on Genetic Algorithms, pages 264--270, San Mateo, CA. Morgan Kaufmann.
Axelrod, R. 1989. Evolution of strategies in the iterated prisoner's dilemma. In Davis, L., editor, Genetic Algorithms and Simulated Annealing. Morgan Kaufmann, San Mateo, CA.
Calabretta, R., Galbiati, R., Nolfi, S., and Parisi, D. 1996. Two is better than one: A diploid genotype for neural networks. Neural Processing Letters, In press.
Cliff, D. and Miller, G. F. 1995. Tracking the red queen: Measurements of adaptive progress in co-evolutionary simulations. In Morán, F., Moreno, A., Merelo, J. J., and Chacón, P., editors, Advances in Artificial Life: Proceedings of the Third European Conference on Artificial Life, pages 200--218. Springer Verlag, Berlin.
Cliff, D. and Miller, G. F. 1996. Co-evolution of Pursuit and Evasion II: Simulation Methods and Results. In Maes, P., Mataric, M., Meyer, J., Pollack, J., Roitblat, H., and Wilson, S., editors, From Animals to Animats IV: Proceedings of the Fourth International Conference on Simulation of Adaptive Behavior. MIT Press-Bradford Books, Cambridge, MA.
Floreano, D. and Mondada, F. 1994. Automatic Creation of an Autonomous Agent: Genetic Evolution of a Neural-Network Driven Robot. In Cliff, D., Husbands, P., Meyer, J., and Wilson, S. W., editors, From Animals to Animats III: Proceedings of the Third International Conference on Simulation of Adaptive Behavior. MIT Press-Bradford Books, Cambridge, MA.
Floreano, D. and Mondada, F. 1996. Evolution of homing navigation in a real mobile robot. IEEE Transactions on Systems, Man, and Cybernetics-Part B, 26:396--407.
Floreano, D. and Mondada, F. 1996. Evolution of plastic neurocontrollers for situated agents. In Maes, P., Mataric, M., Meyer, J., Pollack, J., Roitblat, H., and Wilson, S., editors, From Animals to Animats IV: Proceedings of the Fourth International Conference on Simulation of Adaptive Behavior. MIT Press-Bradford Books, Cambridge, MA.
Hillis, W. 1990. Co-evolving parasites improve simulated evolution as an optimization procedure. Physica D, 42:228--234.
Kauffman, S. A. and Johnsen, S. 1992. Co-evolution to the edge of chaos: Coupled fitness landscapes, poised states, and co-evolutionary avalanches. In Langton, C., Farmer, J., Rasmussen, S., and Taylor, C., editors, Artificial Life II: Proceedings Volume of Santa Fe Conference, volume XI. Addison Wesley: series of the Santa Fe Institute Studies in the Sciences of Complexities, Redwood City, CA.
Koza, J. R. 1991. Evolution and co-evolution of computer programs to control independently-acting agents. In Meyer, J. and Wilson, S., editors, From Animals to Animats. Proceedings of the First International Conference on Simulation of Adaptive Behavior. MIT Press, Cambridge, MA.
Koza, J. R. 1992. Genetic programming: On the programming of computers by means of natural selection. MIT Press, Cambridge, MA.
Maechler, P. 1997. Robot odometry correction using grid lines on the floor. In Proceedings of 2nd International Workshop on Mechatronical Computer Systems for perception and Action, Pisa, Italy.
Mataric, M. and Cliff, D. 1996. Challenges in Evolving Controllers for Physical Robots. Robotics and Autonomous Systems. In press.
Menczer, F. and Belew, R. K. 1993. Latent energy environments. In Belew, R. K. and Mitchell, S., editors, Plastic Individuals in Evolving Populations. Addison Wesley, Redwood City, CA.
Miglino, O., Lund, H. H., and Nolfi, S. 1996. Evolving Mobile Robots in Simulated and Real Environments. Artificial Life, 2:417--434.
Miller, G. F. and Cliff, D. 1994. Protean behavior in dynamic games: Arguments for the co-evolution of pursuit-evasion tactics. In Cliff, D., Husbands, P., Meyer, J., and Wilson, S. W., editors, From Animals to Animats III: Proceedings of the Third International Conference on Simulation of Adaptive Behavior. MIT Press-Bradford Books, Cambridge, MA.
Nolfi, S. 1997. Using emergent modularity to develop control system for mobile robots. Adaptive Behavior, 5.
Nolfi, S., Floreano, D., Miglino, O., and Mondada, F. 1994. How to evolve autonomous robots: Different approaches in evolutionary robotics. In Brooks, R. and Maes, P., editors, Proceedings of the Fourth Workshop on Artificial Life, pages 190--197, Boston, MA. MIT Press.
Nolfi, S. and Parisi, D. 1995. Genotypes for neural networks. In Arbib, M. A., editor, The Handbook of Brain Theory and Neural Networks. MIT Press, Cambridge, MA.
Ray, T. S. 1992. An approach to the synthesis of life. In Langton, C., Farmer, J., Rasmussen, S., and Taylor, C., editors, Artificial Life II: Proceedings Volume of Santa Fe Conference, volume XI. Addison Wesley: series of the Santa Fe Institute Studies in the Sciences of Complexities, Redwood City, CA.
Renshaw, E. 1991. Modeling Biological Populations in Space and Time. Cambridge University Press, Cambridge.
Reynolds, C. W. 1994. Competition, Coevolution and the Game of Tag. In Brooks, R. and Maes, P., editors, Proceedings of the Fourth Workshop on Artificial Life, pages 59--69, Boston, MA. MIT Press.
Sims, K. 1994. Evolving 3D Morphology and Behavior by Competition. In Brooks, R. and Maes, P., editors, Proceedings of the Fourth Workshop on Artificial Life, pages 28--39, Boston, MA. MIT Press.
Yao, X. 1993. A review of evolutionary artificial neural networks. International Journal of Intelligent Systems, 4:203--222.
Yeager, L. 1994. Computational Genetics, Physiology, Metabolism, Neural Systems, Learning, Vision, and Behavior or PolyWorld: Life in a New Context. In Langton, C., editor, Artificial Life III. Addison-Wesley, Redwood City, CA.