Wednesday, May 15, 2013

Resistance to network thinking


Phylogeneticists are used to the idea of tree thinking, in which evolutionary history is seen as a branching tree-like pattern. Clearly, for many phylogeneticists this has not yet been extended to network thinking, in which evolutionary history can also be seen as a reticulating network. Indeed, I have recently come across several people who have actively insisted that "trees are still central" to phylogenetics (to quote one of my correspondents). As Mindell (2013) has claimed, the Tree of Life is still a useful metaphor, model and heuristic device.

So, there is not just indifference to networks but there seems also to be some resistance to them. This is somewhat unexpected, as a network simplifies to a tree if there are no incompatible phylogenetic signals, and so there is no intrinsic reason to restrict phylogenies to being tree-like.
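The claim that a network reduces to a tree when the signals are compatible can be made concrete with splits (bipartitions of the taxa). The following is a minimal sketch, not code from any of the cited papers. It relies on the standard result that a set of splits fits on a single tree exactly when every pair of splits is compatible. The taxon names and the function names `compatible` and `is_tree_like` are hypothetical, chosen only for illustration.

```python
from itertools import combinations


def compatible(split_a, split_b, taxa):
    """Return True if two splits can sit on the same tree.

    Each split is given by one of its two sides; the other side is the
    complement within `taxa`. Two splits are compatible when at least
    one of the four side-by-side intersections is empty.
    """
    a1 = frozenset(split_a)
    a2 = taxa - a1
    b1 = frozenset(split_b)
    b2 = taxa - b1
    return any(not (x & y) for x in (a1, a2) for y in (b1, b2))


def is_tree_like(splits, taxa):
    """Return True if all splits are pairwise compatible (a tree suffices)."""
    taxa = frozenset(taxa)
    return all(compatible(s, t, taxa) for s, t in combinations(splits, 2))


# Hypothetical taxa and splits, for illustration only.
taxa = {"A", "B", "C", "D", "E"}
tree_splits = [{"A", "B"}, {"D", "E"}]      # both fit on one tree
conflicting = tree_splits + [{"B", "C"}]    # {B, C} clashes with {A, B}

print(is_tree_like(tree_splits, taxa))
print(is_tree_like(conflicting, taxa))
```

In this toy case the first set of splits needs nothing more than a tree, whereas the second set can only be displayed in full by a network.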

As a typical example from the literature, Losos et al. (2012) have recently commented:
"Although molecular data have rarely changed our understanding of the major multicellular groups of the evolutionary tree of life, they have suggested changes in the relationships within many groups, such as the evolutionary position of whales in the clade of even-toed ungulates. Further investigation has usually resolved conflicts, often by revealing inadequacies in previous morphological studies. This has led to a presumption by many in favor of molecular data."
Needless to say, this is a biased point of view, because conflicts can also be resolved by revealing inadequacies in molecular studies. For example, molecular analyses involve many subjective decisions about substitution models and rates of molecular change, and any one of the underlying assumptions may be violated. There is no theoretical justification for favouring one source of data over another.

Similarly, there is no theoretical justification for trying to resolve conflicts by preferring one hypothesis over another. Phylogenetic conflicts can also be "resolved" by recognizing that evolutionary history is not necessarily tree-like. Losos et al. do not even consider this possibility:
"When two phylogenies are fundamentally discordant, at least one data set must be misleading."
In fact, the only misleading thing here is the word "must", because both datasets may be perfectly correct but are simply the product of two different evolutionary histories.

This point is perhaps most obvious when comparing molecular datasets. The evolutionary history revealed by between-gene evolutionary processes (e.g. recombination, hybridization, horizontal gene transfer) often conflicts with that from within-gene processes (e.g. nucleotide substitutions and insertions / deletions), and this leads to a reticulating evolutionary history.
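As a rough illustration of what conflict between gene trees looks like in practice, here is a small sketch that lists the clades of two rooted trees that overlap without one containing the other. Such pairs cannot both appear in a single tree. The nested-tuple tree format, the taxon labels and the function names `clusters` and `conflicts` are assumptions made for this example; the two gene trees are invented, not taken from any dataset.

```python
def clusters(tree):
    """Collect every clade (set of leaf names) in a rooted tree.

    The tree is a nested tuple whose leaves are strings,
    e.g. (("A", "B"), "C").
    """
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(leaves(child) for child in node))
        found.add(below)
        return below

    leaves(tree)
    return found


def conflicts(tree_1, tree_2):
    """Return pairs of clades that overlap but are not nested."""
    pairs = []
    for c1 in clusters(tree_1):
        for c2 in clusters(tree_2):
            if c1 & c2 and not (c1 <= c2 or c2 <= c1):
                pairs.append((c1, c2))
    return pairs


# Two invented gene trees for the same four taxa.
gene_tree_1 = ((("A", "B"), "C"), "D")
gene_tree_2 = ((("A", "C"), "B"), "D")

for c1, c2 in conflicts(gene_tree_1, gene_tree_2):
    print(sorted(c1), "conflicts with", sorted(c2))
```

Here the only clash is between the clades {A, B} and {A, C}. Neither gene tree needs to be wrong for that clash to exist; a network can display both groupings at once.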

Indeed, the more we learn about genomes, the less tree-like the evolutionary history of species seems to be. There are long-standing controversies regarding the evolutionary history of many taxonomic groups, and it has been hoped that genome-scale data would resolve these controversies. However, to date none of these controversies has been satisfactorily resolved into an unambiguous tree-like genealogical history using genome data. They all apparently involve reticulate evolutionary processes.

For example, the estimated relationships among humans, chimpanzees and gorillas did not change as a result of genome sampling (Galtier and Daubin 2008), nor did those of malaria species (Kuo et al. 2008) nor those of placental superorders (Hallström and Janke 2010). In all three cases the estimated relationships were just as complex after the genome sequencing as before. The resolution of controversial branches in our trees has not occurred as a result of increased access to character data or improved data analyses, but our recognition of reticulating relationships certainly has occurred.

There are many other examples where increased character sampling is yet to resolve long-standing controversies about branching patterns, and where reticulation may also be the true explanation. Birds seem to provide many of these examples (e.g. Smith et al. 2013), but insects are a rich source as well (e.g. Thomas et al. 2013), and sometimes even plants (e.g. Goremykin et al. 2013).

Clearly, when two or more phylogenies are fundamentally discordant, none of the datasets needs to be misleading, because a reticulating history may be involved. Network thinking should thus be a standard tool in the arsenal of every phylogeneticist. Tree thinking excludes networks but network thinking does not exclude trees, and so the more general model will always be the more useful one.

[Note: An empirical example is discussed in this later blog post: Conflicting placental roots: network or tree?]

References

Galtier N, Daubin V (2008) Dealing with incongruence in phylogenomic analyses. Philosophical Transactions of the Royal Society of London, Series B, Biological Sciences 363: 4023-4029.

Goremykin VV, Nikiforova SV, Biggs PJ, Zhong B, Delange P, Martin W, Woetzel S, Atherton RA, McLenachan PA, Lockhart PJ (2013) The evolutionary root of flowering plants. Systematic Biology 62: 50-61.

Hallström BM, Janke A (2010) Mammalian evolution may not be strictly bifurcating. Molecular Biology and Evolution 27: 2804-2816.

Kuo C-H, Wares JP, Kissinger JC (2008) The Apicomplexan whole-genome phylogeny: an analysis of incongruence among gene trees. Molecular Biology and Evolution 25: 2689-2698.

Losos JB, Hillis DM, Greene HW (2012) Who speaks with a forked tongue? Science 338: 1428-1429.

Mindell DP (2013) The Tree of Life: metaphor, model, and heuristic device. Systematic Biology 62: 479-489.

Smith JV, Braun EL, Kimball RT (2013) Ratite nonmonophyly: independent evidence from 40 novel loci. Systematic Biology 62: 35-49.

Thomas JA, Trueman JW, Rambaut A, Welch JJ (2013) Relaxed phylogenetics and the Palaeoptera problem: resolving deep ancestral splits in the insect phylogeny. Systematic Biology 62: 285-297.

2 comments:

  1. David,

    The article by Losos et al. was referring to two phylogenies, one molecular and one morphological. In all (computational) work done in the area of networks, I'm only aware of work that reconciles different *molecular* phylogenies. So, I'm not sure there's a problem with the statement by Losos et al.

    Best,
    Luay Nakhleh

    1. Luay, That may be how the authors intended their statement to be interpreted, but it is not quite how it is written. Either way, I think that conflict between trees based on different datasets is often a straightforward extension of within-dataset conflict, because the conflict is likely to be related to incompatible phylogenetic signals in the organisms. (It could also be due to computational issues, such as different models.) The authors are not thinking that way -- to them, conflict means that one dataset is wrong, rather than that there are non-treelike signals in all of the datasets. All the best, David
