Gould provides a remarkable but very informal answer to this question. He bases it on the analysis of the asymmetric random diffusion of life, understood as constrained proliferation and diversification. Asymmetric because, by a common assumption, life cannot be less complex than bacterial life.10 We may understand Gould’s analysis through a general principle: any asymmetric random diffusion propagates, by local interactions, the original symmetry breaking along the diffusion.
The point is to propose a pertinent (phase) space for this diffusive phenomenon. For example, in a liquid, a drop of dye against a (left) wall diffuses in space (toward the right) as its particles bump against each other locally. That is, particles transitively inherit the original (left) wall asymmetry and propagate it globally by local random interactions. By considering the diffusion of biomass over complexity, after the early formation (and explosion) of life, one can then apply the principle above to this fundamental evolutionary dynamic: biomass asymmetrically diffuses over complexity in time. There is then no need for a global design or aim: the random paths that compose any diffusion help, in this case too, to understand a random growth of complexity on average. Only on average, as there may be local inversions of complexity; the asymmetry randomly forces the diffusion toward “higher complexity”, a notion to be defined formally, of course.
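A minimal Monte Carlo sketch of this principle (illustrative parameters, not taken from Gould or the references): a symmetric random walk with a reflecting wall at 0. Each step is locally unbiased, yet the wall alone makes the mean position drift away from it.

```python
import random

def reflected_walk_mean(n_walkers=5_000, n_steps=1_000):
    """Symmetric +/-1 random walk with a reflecting wall at 0 (the
    'bacterial' lower bound of complexity). Steps are locally unbiased;
    only the wall breaks the symmetry."""
    positions = [0] * n_walkers
    for _ in range(n_steps):
        for i, x in enumerate(positions):
            positions[i] = max(0, x + random.choice((-1, 1)))
    return sum(positions) / n_walkers

# The average position drifts to the right, roughly like sqrt(n_steps):
print(reflected_walk_mean())   # about 25 after 1,000 steps
```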
In Bailly and Longo (2009), and more informally in Longo and Montévil (2014b), a definition of phenotypic complexity was given by counting fractal dimensions, networks, tissue differentiations, etc., and a mathematical analysis of this phenomenon was developed. In short, in the suitable phase space, that is “biomass × complexity × time”, one can write a diffusion equation with real coefficients, inspired by Schrödinger’s equation (which is a diffusion equation, but in a Hilbert space). In a sense, while Schrödinger’s equation describes the diffusion of a probability law (an amplitude), here it is the potential variability of biomass over complexity in time that is analyzed, once biological or phenotypic complexity is quantified, in a tentative but precise way, as hinted above (and better specified in the references).
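For illustration only, here is a finite-difference sketch assuming a plain heat equation with a no-flux (reflecting) boundary at zero complexity; the operator actually studied in Bailly and Longo (2009) is richer, so this only mimics the qualitative behavior: the mean of the biomass density over the complexity axis increases in time.

```python
import numpy as np

D, dx, dt = 1.0, 0.1, 0.001        # assumed diffusion coefficient and grid
x = np.arange(0, 30, dx)           # 1-D 'complexity' axis, x >= 0
u = np.zeros_like(x)
u[:10] = 1.0                       # all biomass starts at low complexity
u /= u.sum() * dx                  # normalize total biomass to 1

for _ in range(20_000):            # explicit Euler steps up to t = 20
    lap = np.empty_like(u)
    lap[1:-1] = u[2:] - 2 * u[1:-1] + u[:-2]
    lap[0] = u[1] - u[0]           # no-flux (reflecting) wall at x = 0
    lap[-1] = u[-2] - u[-1]        # no-flux far boundary
    u += (D * dt / dx**2) * lap

print((x * u).sum() * dx)          # mean complexity has drifted upward (~5)
```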
Note that the idea that the complexity (however defined) of living organisms increases with time has more recently been adopted in Shanahan (2012) as a principle. It is thus assumed that there is a trend toward complexification and that this trend is intrinsic to evolution, while Darwin only assumed the divergence of characters. The very strong “principle” in Shanahan (2012) may instead be derived, also in evolution, if one gives a due role to randomness along an asymmetric diffusion.
A further but indirect fall-out of this approach to phenotypic complexity results from some recent collaborations with cancer biologists (see Longo et al. 2015). We must first distinguish the notion of complexity, based on “counting” some key anatomical features, from biological organization. The first is given by the “anatomy” of a dead animal; the second usually refers to the functional activities of a living organism. It seems that cancer is the only disease that diminishes functional organization by increasing complexity. In tissues affected by cancer, ducts in glands, villi in epithelia, etc., increase in topological numbers (e.g. ducts have more lumina) and in fractal dimensions (as for villi). This very growth of mathematical complexity reduces functionality, by lowering flow rates, and thus diminishes biological organization. This is probably a minor remark, but within the very obscure etiology of cancer it may provide a hallmark for this devastating disease.
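As a hedged illustration of “counting” one such feature, the following sketch estimates a fractal (box-counting) dimension from a binary image; inputs and parameters are hypothetical, and this is not the analysis of Longo et al. (2015).

```python
import numpy as np

def box_counting_dimension(mask, sizes=(1, 2, 4, 8, 16, 32)):
    """Estimate the box-counting dimension of a binary 2-D image: count
    the occupied boxes N(s) at each box size s, then fit the slope of
    log N(s) against log(1/s)."""
    counts = []
    for s in sizes:
        h, w = mask.shape
        grid = mask[: h - h % s, : w - w % s].reshape(h // s, s, w // s, s)
        counts.append(grid.any(axis=(1, 3)).sum())
    slope, _ = np.polyfit(np.log(1.0 / np.array(sizes)), np.log(counts), 1)
    return slope

# A filled square scores ~2; a convoluted villous contour would score
# strictly between 1 and 2, and rises as the contour grows more complex.
print(box_counting_dimension(np.ones((64, 64), dtype=bool)))
```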
1.5. Random sequences: a theory-invariant approach
Sequences are the simplest infinite mathematical objects. We use them to discuss some subtle differences in the quality of randomness. To evaluate the quality of randomness of the following four examples, we employ various tests of randomness for sequences, i.e. formal tests modeling properties or symptoms intuitively associated with randomness.
The Champernowne sequence, 012345678910111213…, is random with respect to the statistical test which checks equal frequency – a clearly necessary condition of randomness.11 Indeed, the digits 0, 1, 2, …, 9 appear with the right frequency 10^{-1}, every string of two digits, like 23 or 00, appears with the frequency 10^{-2}, and, by a classical result of Champernowne (1933), every string – say 36664788859999100200030405060234234 or 00000000000000000000000000000000000 – appears with the frequency 10^{-(length of string)} (10^{-35} in our examples). Is this condition sufficient to declare the Champernowne sequence random? Of course not. The Champernowne sequence is generated by a very simple algorithm – just concatenate all strings on the alphabet {0, 1, 2, 3, …, 9} in increasing length order and use the lexicographical order for all strings of the same length. This algorithm allows a perfect prediction of every element of the sequence, ruling out its randomness. A similar situation appears if we concatenate the prime numbers in base 10, obtaining the Copeland-Erdös sequence 235711131719232931374143… (Copeland and Erdös 1946).
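A minimal sketch of this predictability (parameters illustrative): a generator that concatenates the decimal numerals 0, 1, 2, …, reproducing the digits displayed above, followed by a naive frequency check on a prefix.

```python
from itertools import count, islice

def champernowne_digits():
    """Digits of the Champernowne sequence, obtained by concatenating the
    decimal numerals 0, 1, 2, ... - a trivially predictable rule."""
    for n in count(0):
        yield from (int(d) for d in str(n))

# Frequency test on a prefix: each digit appears with frequency close to 1/10.
prefix = list(islice(champernowne_digits(), 1_000_000))
print([round(prefix.count(d) / len(prefix), 3) for d in range(10)])
```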
Now consider your favorite programming language L and note that each syntactically correct program has an end-marker (end or stop, for example) which makes correct programs self-delimited. We now define the binary halting sequence H(L) = h_1 h_2 … h_n …: enumerate all strings over the alphabet used by L in the same way as we did for the Champernowne sequence and define h_i = 1 if the ith string, considered as a program, stops, and h_i = 0 otherwise. Most strings are not syntactically correct programs, so they will not halt: only some syntactically correct programs halt. The Church-Turing theorem – on the undecidability of the halting problem (Cooper 2004) – states that there is no algorithm which can correctly calculate (predict) all the bits of the sequence H(L); so, from the point of view of this randomness test, the sequence is random. Does H(L) pass the frequency test? The answer is negative: since most strings do not halt, the 0s vastly outnumber the 1s.
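The halting sequence itself is incomputable, but its definition can be sketched on a hypothetical toy language (not any real L): one counter, instructions +, -, [, ], and '.' as the end-marker that makes programs self-delimited. The step budget means halts() only lower-approximates H(L): a False is “not observed to halt”, never a proof of non-termination.

```python
from itertools import product

ALPHABET = "+-[]."   # '.' plays the role of the end-marker in the text

def programs():
    """All strings over ALPHABET in increasing length, lexicographically
    within each length (the Champernowne-style enumeration)."""
    length = 1
    while True:
        for chars in product(ALPHABET, repeat=length):
            yield "".join(chars)
        length += 1

def parse(prog):
    """Bracket-matching table of prog's body, or None when prog is not
    syntactically correct (one '.', at the end, brackets balanced)."""
    if prog.count(".") != 1 or not prog.endswith("."):
        return None
    stack, match = [], {}
    for i, c in enumerate(prog[:-1]):
        if c == "[":
            stack.append(i)
        elif c == "]":
            if not stack:
                return None
            j = stack.pop()
            match[i], match[j] = j, i
    return None if stack else match

def halts(prog, max_steps):
    """True if prog stops within max_steps, running on a single counter
    initialized to 0 ('[' skips past its ']' when the counter is 0, ']'
    jumps back when it is not). False means only 'not observed to halt'."""
    match = parse(prog)
    if match is None:
        return False                  # ill-formed strings never halt
    body, cell, pc = prog[:-1], 0, 0
    for _ in range(max_steps):
        if pc >= len(body):
            return True               # ran off the end: halted
        c = body[pc]
        if c == "+":
            cell += 1
        elif c == "-":
            cell = max(0, cell - 1)
        elif c == "[" and cell == 0:
            pc = match[pc]
        elif c == "]" and cell != 0:
            pc = match[pc]
        pc += 1
    return False

# First bits of a step-bounded lower approximation of H(L):
gen = programs()
print([int(halts(next(gen), 10_000)) for _ in range(25)])
```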
The Champernowne sequence and the halting sequence are both non-random, because each fails to pass a randomness test. However, each sequence passes a non-trivial randomness test. The test passed by the Champernowne sequence is “statistical” and more quantitative; the test passed by the halting sequence is more qualitative. Which sequence is “more random”?
Using the same programming language L, we can define the Omega sequence as the binary expansion Ω(L) = ω_1 ω_2 … ω_n … of the Omega number, the halting probability of L:

Ω(L) = Σ_{p halts} 2^{−|p|},

where |p| is the length of the (binary) program p. Because correct programs are self-delimited, they form a prefix-free set, so the sum converges to a number strictly between 0 and 1.
It has been proved that the Omega sequence passes both the frequency and the incomputability tests; hence, it is “more random” than the Champernowne, Copeland-Erdös and halting sequences (Calude 2002; Downey and Hirschfeldt 2010). In fact, the Omega sequence passes an infinity of distinct randomness tests, called Martin-Löf tests – technically making it Martin-Löf random (Calude 2002; Downey and Hirschfeldt 2010), one of the most robust and interesting forms of randomness.
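Ω(L) is incomputable, but it can be approximated from below. Reusing programs() and halts() from the toy-language sketch above (so this inherits all of its assumptions): since every valid program ends with the unique end-marker '.', the set of programs is prefix-free and the Kraft sum stays below 1; with a five-symbol alphabet the weight of a program p is 5^{−|p|} rather than the binary 2^{−|p|} of the text.

```python
def omega_lower_bound(n_programs, max_steps):
    """Sum the weights of the programs observed to halt: a lower bound on
    the halting probability that improves as both budgets grow, without
    ever being certified as close (Omega is incomputable)."""
    gen, total = programs(), 0.0
    for _ in range(n_programs):
        p = next(gen)
        if halts(p, max_steps):
            total += 5.0 ** (-len(p))
    return total

print(omega_lower_bound(10_000, 1_000))
```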
Have we finally found the “true” definition of randomness? The answer is negative. A simple way to see this is via the following infinite set of computable correlations, present in almost all sequences – including the Omega sequence (Calude and Staiger 2014) – but not in all sequences: for almost all infinite sequences, there exists an integer k > 1 (depending on the sequence) such that, for every m ≥ 1:

ω_{m+1} ω_{m+2} … ω_{mk} ≠ 00…0 (the all-zero string of length m(k − 1)).
In other words, every substring ω_{m+1} ω_{m+2} … ω_{mk} has to contain at least one 1, for all m ≥ 1 – a “non-randomness” phenomenon that no Martin-Löf test can detect. A more general result appears in Calude and Staiger (2014, Theorem 2.2).
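The correlation is computable, so it can be checked directly on any finite prefix; a small sketch (bits as a 0/1 list, the 1-indexed positions of the text translated into Python slices):

```python
def has_correlation_property(bits, k):
    """Check, on a finite 0/1 prefix, the correlation above: every window
    at positions m+1 .. m*k (1-indexed) contains at least one 1."""
    m = 1
    while m * k <= len(bits):
        if not any(bits[m : m * k]):   # 0-indexed slice of positions m+1..m*k
            return False               # found an all-zero window
        m += 1
    return True

print(has_correlation_property([1] * 100, 2))        # True
print(has_correlation_property([1] + [0] * 99, 2))   # False: all-zero window
```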
So, the quest for a better definition continues! What about considering not just an incremental improvement over the previous definition, but a