'Tree-based' phylogenetic networks provide a mathematically tractable model for representing reticulate evolution in biology. Such a network consists of an underlying 'support tree' together with arcs between the edges of this tree. However, a tree-based network can have several support trees, and this leads to a variety of algorithmic problems that are relevant to the analysis of biological data. Recently, Hayamizu (arXiv:1811.05849 [math.CO]) proved a structure theorem for tree-based phylogenetic networks and obtained linear-time and linear-delay algorithms for many basic problems on support trees, such as counting, optimisation, and enumeration. In the present paper, we consider the following fundamental problem in statistical data analysis: given a tree-based phylogenetic network $N$ whose arcs are each assigned a probability, rank the top-$k$ support trees of $N$ by their likelihood values. We provide a linear-delay (and hence optimal) algorithm for this problem, and thus reveal an interesting property of tree-based phylogenetic networks: ranking the top-$k$ support trees is as computationally easy as picking $k$ arbitrary support trees.
phylogenetic tree, tree-based phylogenetic network, support tree, top-k ranking problem, maximum likelihood estimation, enumeration, algorithm
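
To make the top-$k$ ranking problem in the abstract concrete, the following is a minimal Python sketch of a generic best-first ranking over independent choices. It is not the linear-delay algorithm of the paper and makes no claim about its complexity. It assumes, purely for illustration, that a support tree can be described by picking one option from each of several independent groups and that its likelihood is the product of the probabilities of the picked options. The function name `top_k_products`, the `groups` input format, and the labels are all hypothetical.

```python
import heapq
from math import prod


def top_k_products(groups, k):
    """Return up to k (likelihood, labels) pairs in non-increasing likelihood order.

    groups: list of lists of (label, probability) pairs; exactly one option
    is picked per group (an illustrative assumption, not the paper's model).
    """
    # Sort each group's options by descending probability, so that index 0
    # is always the locally best option.
    groups = [sorted(g, key=lambda opt: -opt[1]) for g in groups]
    if not groups or any(not g for g in groups):
        return []

    def score(idx):
        # Likelihood of a combination: product of the chosen probabilities.
        return prod(g[i][1] for g, i in zip(groups, idx))

    start = (0,) * len(groups)           # the single best combination
    heap = [(-score(start), start)]      # max-heap via negated scores
    seen = {start}
    ranking = []

    while heap and len(ranking) < k:
        neg_score, idx = heapq.heappop(heap)
        labels = tuple(g[i][0] for g, i in zip(groups, idx))
        ranking.append((-neg_score, labels))
        # Successors: move one group to its next-best option. A successor
        # can never score higher than its parent, which keeps the output sorted.
        for j, g in enumerate(groups):
            if idx[j] + 1 < len(g):
                nxt = idx[:j] + (idx[j] + 1,) + idx[j + 1:]
                if nxt not in seen:
                    seen.add(nxt)
                    heapq.heappush(heap, (-score(nxt), nxt))
    return ranking
```

For example, calling `top_k_products([[("a1", 0.6), ("a2", 0.4)], [("b1", 0.7), ("b2", 0.2), ("b3", 0.1)]], 3)` ranks the combinations `("a1", "b1")`, `("a2", "b1")`, and `("a1", "b2")` first, in that order. The sketch recomputes each product from scratch and stores visited index tuples, so it only illustrates the ranking task itself.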