Case 4

Living systems encompass a staggering range of scales, from the interactions of individual molecules to the dynamics of whole populations. The factors that govern the way they function -- the material properties, the physical forces at work, the resources required -- vary a great deal across those scales. (For example, the effects of gravity are negligible at the molecular level, and nearly so for cells, but often decisive at the larger phenotypic scales.) Mapping the relationships between the things happening at each level is mind-bogglingly difficult, and it's not at all clear that doing so will provide meaningful answers to systemic questions.

The consequences of low-level activities at higher levels tend to be impossible to predict a priori¹. The mess of chemical reactions going on around DNA gives rise, through a loony set of intermediate steps, to all kinds of protein molecules, which in turn contribute in unexpected ways to changes in the structure and behaviour of the containing cell, which in turn nudge other cells into slightly different states, which in turn...

...until eventually here I am, writing these notes, and there you are, reading them.

The remarkable feature of living systems is not, as the IDiots bleat, that simple processes give rise to complexity -- we're all familiar with that. Rather, it is the cohesive simplicity of the outcome. It would be easy to imagine the low-level complications aggregating up to create something ever more chaotically unpredictable, but they don't. The intricacies smooth away and order emerges, intimately dependent on what goes on below and yet astonishingly robust.

One aspect of cellular robustness is an armoury of techniques for dealing with invaders. Like everything in life, these can be rather ad hoc: just random heuristics that happened along and -- by increasing the ability to reproduce successfully -- became endemic. One such, which has only become understood recently, forms the basis of a neat method for investigating experimentally the workings of individual genes and their contributions to the overall behaviour of cells and organisms.

The familiar double helix of DNA is the storehouse of genetic information in cells, and almost all of it lives in the nucleus. To actually use the information, individual genes are transcribed into single strands of a related molecule, RNA, which then travels outside the nucleus to the bits of cell machinery that build the proteins from its template.

Viruses are much simpler. They don't have those bits of cell machinery at all, relying on being able to hijack ours, and many don't bother with the niceties of DNA. Instead, their genetic information is encoded directly as RNA; in some cases as double strands.

Since double-stranded RNA almost never occurs natively in eukaryotic cells, its presence is an indicator that the sequences in question might be alien and dangerous. There are protein complexes in the cell that identify those sequences and stomp on them, suppressing their expression. This ability is widespread among eukaryotes and is obviously a pretty successful defence, but it can also be used to mislead cells into suppressing the products of their own genes.

RNA interference, whose discoverers won the Nobel Prize this year, does exactly that trick: introduce into the cell double-stranded RNA matching a legitimate gene, and bingo. The protective mechanism targets the sequence and the corresponding protein doesn't get expressed. (The suppression isn't absolute, but it's usually enough to effectively stall pathways dependent on the protein.)

By inhibiting each gene in turn, it is possible to get some idea of which ones contribute to particular cell features and activities, and even of the order of expression of proteins along their relevant pathways. This has been done in, for example, the geneticist's favourite, Drosophila melanogaster. But interpreting the results is far from straightforward and it's still a long way shy of explaining how it all works.

For example, there are quite a lot of Drosophila genes that have an impact on the overall shape of a cell -- making it spiky or deformed, say -- but these effects may be consequent on very different kinds of failure in the cell biology, and may be quite incidental to the "real" function of the gene. The fact that disabling the gene screws up the membrane morphology doesn't mean that the gene in question is "for" membrane morphology².

There is an ever-present danger that the answers we get from this kind of investigation will be prejudiced by the questions being asked: if you are looking for effects on cell structure, that's what you'll see. This is not to suggest that asking such questions is invalid or unimportant -- au contraire, mon frère -- just that we have to be careful when drawing conclusions. There will almost always be parts of the picture we aren't seeing³.

This is where an evolutionary perspective can help.

The "solutions" produced by natural selection are only ever answers to the nebulous question "How can I reproduce better?" rather than, say, "How can I fly?" or "How can I see?", even if they sometimes include those latter features. Such solutions are intimately entwined with the environments in which they developed, and are often quite difficult to apprehend in the more linear terms favoured by humans. (This may be why we design jumbo jets, while nature comes up with bumblebees.)

Evolutionary algorithms have been used to interesting effect in software, simulation and industrial design in recent years, with some notable successes. Such efforts can provide important insights into the way evolution works with respect to a problem space.

The example under consideration just now uses a genetic algorithm to develop rulesets for a cellular automaton. The CA itself is entirely deterministic, although the rulesets are combinatorially explosive and not suited to encoding in the classic Wolfram "Rule 110" bit-table form. A ruleset defines 100 rules, each of which votes for some piece of very simple cell behaviour at the next turn based on neighbourhood state. The majority vote wins. (This "democratic" approach is convenient for evolution because it allows for genotype changes to accrue incrementally.) The genetic algorithm used to evolve the rulesets uses multi-point crossover with elitism (ie, some of the best performers in each parental generation are preserved unchanged into the filial). I'm not sure whether point mutations are also included, but I'd imagine so. (Update: they are.)
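As a rough illustration of the breeding step just described, here is a minimal Python sketch of multi-point crossover with elitism and point mutation. The integer genome encoding, the truncation selection of parents, and every name and default value (`n_points`, `rate`, `n_alleles`, `n_elite`) are assumptions made for the example; the actual system's details aren't given in these notes.

```python
import random

def multipoint_crossover(a, b, n_points, rng):
    """Build a child by copying alternating segments of a and b between cut sites."""
    cuts = sorted(rng.sample(range(1, len(a)), n_points))
    child, src, prev = [], 0, 0
    for cut in cuts + [len(a)]:
        child.extend((a, b)[src][prev:cut])
        src, prev = 1 - src, cut          # switch parent at every cut
    return child

def mutate(genome, rate, n_alleles, rng):
    """Point mutation: each gene is independently replaced with probability `rate`."""
    return [rng.randrange(n_alleles) if rng.random() < rate else g for g in genome]

def next_generation(population, fitness, n_elite, rng,
                    n_points=3, rate=0.01, n_alleles=16):
    """One breeding step. All defaults are illustrative assumptions."""
    ranked = sorted(population, key=fitness, reverse=True)
    # Elitism: the best performers pass unchanged into the filial generation.
    elite = [list(g) for g in ranked[:n_elite]]
    # Assumption: parents are drawn from the top half (truncation selection).
    parents = ranked[:max(2, len(ranked) // 2)]
    children = []
    while len(elite) + len(children) < len(population):
        mum, dad = rng.sample(parents, 2)
        child = multipoint_crossover(mum, dad, n_points, rng)
        children.append(mutate(child, rate, n_alleles, rng))
    return elite + children
```

Because the elite are copied before any crossover or mutation happens, the best score in the population can never go down from one generation to the next, whatever the other operators do.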

The interesting feature here is the choice of selection criteria. Each ruleset is run starting with a single seed cell in the centre of a 3D space. Its fitness is scored according to growth after 50 generations and homeostasis (ie, maintaining the same overall shape) at 100 and 150 generations. Homeostasis was chosen as a feature that most living things seem to need and -- to a good approximation -- exhibit. There were also some boundary conditions on the space, to avoid having to deal with infinitely large organisms.
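One plausible way to turn those criteria into a number is sketched below in Python. It is a guess at the general shape of such a score, not the scoring actually used: the Jaccard overlap as a measure of "same overall shape", the `target_size` cap on growth, and the choice to multiply the two terms are all assumptions.

```python
def jaccard(a, b):
    """Overlap of two sets of occupied lattice sites; 1.0 means identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def fitness(history, target_size=200):
    """Score one run of a ruleset.

    history[t] is the set of (x, y, z) sites occupied at generation t.
    target_size is a hypothetical cap so that growth saturates at 1.0.
    """
    # Growth: how far the organism has got by generation 50.
    growth = min(len(history[50]) / target_size, 1.0)
    # Homeostasis: how well the shape is preserved at generations 100 and 150.
    stability = (jaccard(history[50], history[100])
                 + jaccard(history[100], history[150])) / 2
    # Assumption: multiply, so a ruleset must both grow and hold its shape.
    return growth * stability
```

Comparing absolute positions means an organism that keeps its shape but drifts across the lattice is scored as unstable. Whether that is desirable depends on what "homeostasis" is meant to capture.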

This is a pretty abstract environment and might be expected not to produce results with much relevance to biological systems, which are so much more grubby and complicated. However, it actually turns out to be rather illuminating.

The first lesson of the system was how good evolution is at exploiting loopholes in the problem specification: early versions were quickly dominated by rulesets that did exactly that, making use of the spatial boundaries to maintain shape.

When these weaknesses were corrected, the most successful rulesets tended to use strategies very similar to the way tissues grow in real-world creatures, with an actively-growing layer at the bottom, a dying-off layer at the top, and cell migration from one to the other. This was unexpected.

Because the CA is deterministic, it's easy to experiment with successful rulesets, making small changes to both their environment and their genome. One such experiment involved switching off one gene at a time and observing the results -- just as was done in Drosophila with RNAi. It turned out that only a small number of genes were crucial to homeostasis across a wide range of successful rulesets, all with pretty much the same substance: if X then die. In other words, they were genes for apoptosis.
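The knockout procedure is easy to sketch in miniature. The Python below uses a hand-picked toy ruleset of seven rules voting on a live-neighbour count, with ties falling back to a default action; the rules, the `decide` and `knockout_screen` names, and the tie-breaking are all assumptions for illustration. It screens only the voting outcome rather than a full CA run, and says nothing about which genes mattered in the real experiment.

```python
from collections import Counter

# Hypothetical toy ruleset: each rule looks at the number of live neighbours
# and either votes for an action or abstains (None).
RULES = [
    lambda n: "grow" if n <= 2 else None,
    lambda n: "grow" if n <= 2 else None,
    lambda n: "stay" if 2 <= n <= 4 else None,
    lambda n: "stay" if 3 <= n <= 5 else None,
    lambda n: "die" if n >= 5 else None,
    lambda n: "die" if n >= 5 else None,
    lambda n: "die" if n >= 6 else None,
]

def decide(rules, n, default="stay"):
    """Tally the votes; the most popular action wins, ties go to `default`."""
    votes = Counter(v for v in (rule(n) for rule in rules) if v is not None)
    ranked = votes.most_common(2)
    if not ranked or (len(ranked) == 2 and ranked[0][1] == ranked[1][1]):
        return default
    return ranked[0][0]

def knockout_screen(rules, states):
    """Disable one rule at a time; report the states where the decision changes."""
    baseline = [decide(rules, n) for n in states]
    report = {}
    for i in range(len(rules)):
        mutant = rules[:i] + rules[i + 1:]      # ruleset with rule i switched off
        report[i] = [n for n, b in zip(states, baseline)
                     if decide(mutant, n) != b]
    return report

print(knockout_screen(RULES, range(7)))
# {0: [2], 1: [2], 2: [], 3: [], 4: [5], 5: [5], 6: []}
```

Even in this toy, three of the seven knockouts change nothing at all: the vote absorbs the loss. That is the same redundancy that lets genotype changes accrue incrementally.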

From a biological perspective this may not seem a huge surprise: cell death is essential to health in all multicellular organisms. What's remarkable is that it is a consistently-evolved feature in such an abstract mathematical space.

Another feature shared by most successful rulesets -- even though it was not selected for explicitly -- was an ability to heal wounds: the ruleset would generally restore its original shape reasonably quickly after a chunk of cells was randomly killed. This was, of course, an event that was never encountered during the rulesets' evolution.

Similar behaviour is also found in the embryonic stages of many creatures, where it is assumed to be a specialised function evolved for that purpose. But embryos are typically in the most protected environment they will ever encounter -- they simply do not have to deal with wound healing under any normal circumstance, and when they do they're pretty much stuffed because it means their parent or egg has been catastrophically damaged. As with the evolved CA rulesets, embryonic wound healing is something that has likely never been specifically tested by natural selection -- it simply doesn't provide a reproductive advantage.

This is the real, remarkable result that links the CA experiment -- so deliberately divorced from the material world -- with the earlier discussion about relating Drosophila genes to cell structure. Evolution in a complex system -- and even in a comparatively simple one -- is not a matter of neat, linear relationships. Evolved behaviours exist as complex networks, and some -- like wound healing -- may turn out to be almost inevitable adjuncts to the things that are genuinely advantageous.

1 This seems to me as good an argument as any against the infuriatingly obtuse claptrap of intelligent design. Cell mechanics are simply not amenable to a de novo design process. Life is so profoundly contingent, and the solutions it comes up with so Heath Robinson [US: Rube Goldberg], that it can only plausibly have reached its present state by trial and error. Any procedure by which a Flying Spaghetti Monster might have come up with the likes of us must be so suck-it-and-see as to make the deity itself entirely superfluous.
2 Observe how basic vocabulary becomes fraught with risk in this sort of discussion; hence the quotes. Careless talk can easily lead to unproductive diversions in teleology. The reasons why things are as they are do not amount to a purpose.
3 Vaguely interesting bicycle analogy omitted. I may restore this in the submitted case paper if it seems pertinent when I get around to writing that.