Wednesday, November 7, 2012

Explanation of the many names for types of phylogenetic networks


Two types of phylogenetic network are commonly recognized, although there can be gradations between the two extremes. These go by many different names, which inevitably leads to some confusion on the part of users.

Some of the names are listed here, along with an explanation of what the terminology is intended to convey. The terms are arranged in pairs, indicating the two different types of network. The "network" part of the name is assumed in each case unless indicated otherwise.

      Type 1          Type 2
  1.  Affinity        Genealogical
  2.  Data-display    Reticulogeny
  3.  Implicit        Explicit
  4.  Undirected      Directed
  5.  Unrooted        Rooted
  6.  Splits graph    Augmented tree, Reconciliation, Recombination,
                      Hybridization

1.  This reflects the biologists' perspective, describing the different purposes for which networks have been used. Affinity networks display overall similarity relationships among the organisms, whereas genealogical networks display only historical relationships of ancestry.

2.  This reflects the assumptions used for the data analysis. Data-display networks are interpreted solely as visualizations of the patterns of variation in the data, while reticulogenies are based on some inferences about those data patterns (such as their possible cause). Some network types, such as Reduced Median Networks and Median-Joining Networks, are based on algorithms that make partial inferences from the data, and so lie between the two extremes. Data-display networks have mainly been used as affinity networks and reticulogenies as genealogical networks.

3.  This reflects the computational perspective, describing the goal of the algorithm used to analyze the data. Explicit networks are intended to provide a phylogeny in the traditional sense used for phylogenetic trees, displaying both vertical and horizontal patterns of descent with modification. Implicit networks provide information that can be used to explore phylogenetic patterns in a dataset without any direct interpretation as necessarily showing a phylogeny. Implicit networks have mainly been used as data-display networks and explicit networks as reticulogenies.

4.  This reflects the mathematical interpretation of networks as graphs, that is, sets of nodes connected by edges. In a directed graph the edges have a direction, usually indicated by an arrow, in which case the edges are more correctly referred to as arcs. In an undirected graph the edges have no direction.

5.  This reflects the tree-thinking view of phylogenetic networks, in which directed graphs correspond to rooted trees and undirected graphs to unrooted trees. Rooted networks are usually treated as explicit networks and are thus used as genealogical networks, although there is no reason why they could not be used simply as a convenient form of data display.
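
As a rough illustration of what direction and rooting add, the following Python sketch stores a small network as a list of (parent, child) arcs and uses in-degree to pick out the root and any reticulation nodes (nodes with more than one parent). The function name `classify_nodes` and the example taxa are hypothetical, chosen only for this illustration; this is a minimal sketch, not the method of any particular network program.

```python
from collections import defaultdict


def classify_nodes(arcs):
    """Return (roots, reticulations) for a network given as (parent, child) arcs.

    A root has no incoming arc; a reticulation node has two or more.
    """
    in_degree = defaultdict(int)
    nodes = set()
    for parent, child in arcs:
        nodes.update((parent, child))
        in_degree[child] += 1
    roots = sorted(n for n in nodes if in_degree[n] == 0)
    reticulations = sorted(n for n in nodes if in_degree[n] >= 2)
    return roots, reticulations


# Hypothetical example: taxon H descends from two parent lineages, x and y.
arcs = [
    ("root", "x"), ("root", "y"),
    ("x", "A"), ("x", "H"),
    ("y", "H"), ("y", "B"),
]
roots, reticulations = classify_nodes(arcs)
print(roots, reticulations)
```

If the same connections were stored as undirected edges, neither the root nor the reticulation node could be identified from the graph alone, which is one reason unrooted networks are usually read as data displays rather than as genealogies.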

6.  This reflects the modelling approach to network analysis based on mathematical structures. Splits graphs model phylogenetic patterns as bipartitions of the data, and build the network from those partitions (the result will be a tree if there are no incompatible bipartitions). Augmented trees are essentially trees with a few added reticulation edges / arcs, while reconciliation networks are based on reconciling the differences between trees. Recombination networks are based on analyzing data patterns in terms of a simple model of genetic cross-over, while hybridization networks model the data in terms of patterns in conflicting trees.
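
The remark that a splits graph reduces to a tree when no bipartitions are incompatible can be sketched in Python as follows. The sketch assumes the standard pairwise test: two splits A|B and C|D of the same taxon set are compatible when at least one of the four intersections A∩C, A∩D, B∩C, B∩D is empty. The names `compatible`, `is_tree_like` and `make_split`, and the four taxa, are hypothetical and used only for illustration.

```python
from itertools import combinations

TAXA = frozenset("ABCD")  # hypothetical taxon set


def make_split(side):
    """Build a bipartition of TAXA from the taxa on one side."""
    side = frozenset(side)
    return (side, TAXA - side)


def compatible(split1, split2):
    """Two splits are compatible if some pair of their sides is disjoint."""
    a, b = split1
    c, d = split2
    return any(not (x & y) for x in (a, b) for y in (c, d))


def is_tree_like(splits):
    """True if every pair of splits is compatible, so a tree can display them all."""
    return all(compatible(s, t) for s, t in combinations(splits, 2))


# AB|CD together with two single-taxon splits: all pairwise compatible.
tree_splits = [make_split("AB"), make_split("A"), make_split("C")]

# AB|CD and AC|BD conflict, so a network is needed to show both.
conflicting_splits = [make_split("AB"), make_split("AC")]

print(is_tree_like(tree_splits), is_tree_like(conflicting_splits))
```

Only the compatibility check is shown here; constructing and drawing the splits graph itself is a separate and more involved step.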

So, there are reasons why so many different terms have appeared in the literature. Unfortunately, they are not always used consistently with the meaning that was originally intended.
