Background: The relationship between host conditions and microbiome profiles, typically characterized by operational taxonomic units (OTUs), contains important information about the microbial role in human health. Traditional association testing frameworks are challenged by the high dimensionality and sparsity of typical microbiome profiles. Phylogenetic information is often incorporated to address these challenges, under the assumption that evolutionarily similar taxa tend to behave similarly. However, this assumption may not always hold because microbial effects are complex, so phylogenetic information should be incorporated in a data-supervised fashion.
Results: In this work, we propose a local collapsing test called the Phylogeny-guided microbiome OTU-Specific association Test (POST). In POST, whether to borrow information from the neighboring OTUs in the phylogenetic tree, and how much to borrow, are supervised by phylogenetic distance and the outcome-OTU association. POST is constructed under the kernel machine framework to accommodate complex OTU effects, and it extends kernel machine microbiome tests from the community level to the OTU level. Using simulation studies, we show that when the phylogenetic tree is informative, POST performs better than existing OTU-level association tests. When the phylogenetic tree is not informative, POST achieves performance similar to that of existing methods. Finally, we show that POST can identify more outcome-associated OTUs of biological relevance in real data applications on bacterial vaginosis and on preterm birth.
Conclusions: Using POST, we show that the power to detect associated microbiome features can be enhanced by adaptively leveraging phylogenetic information when testing a target OTU. We developed a user-friendly R package, POSTm, which is available on CRAN (https://CRAN.R-project.org/package=POSTm) for public access.
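The local collapsing idea described in the Results can be illustrated with a minimal sketch. It is not the POSTm implementation: the function names, the Gaussian-type weight exp(-d^2 / c), and the tuning parameter `c` are assumptions made here for illustration only. The sketch shows how a target OTU might borrow abundance from its neighbors with weights that decay with phylogenetic distance, and how setting `c = 0` reduces to using the target OTU alone.

```python
import math


def collapse_weights(dist_to_target, c):
    """Hypothetical borrowing weights for one target OTU.

    dist_to_target: phylogenetic distance from each OTU to the target
                    (the target itself has distance 0).
    c: assumed tuning parameter; larger c borrows from more distant OTUs,
       and c == 0 means no borrowing at all.
    """
    if c < 0:
        raise ValueError("c must be non-negative")
    weights = []
    for d in dist_to_target:
        if c == 0:
            # No borrowing: only the target OTU contributes.
            weights.append(1.0 if d == 0 else 0.0)
        else:
            # Assumed Gaussian-type decay in phylogenetic distance.
            weights.append(math.exp(-(d * d) / c))
    return weights


def locally_collapsed_abundance(abundance, dist_to_target, c):
    """Weighted sum of OTU abundances per sample, centered on the target OTU.

    abundance: list of samples, each a list of OTU abundances in the same
               order as dist_to_target.
    """
    w = collapse_weights(dist_to_target, c)
    return [sum(wi * a for wi, a in zip(w, row)) for row in abundance]


# Toy data (made up): three OTUs, the first being the target.
distances = [0.0, 0.1, 2.0]
samples = [[1, 2, 3], [0, 4, 0]]

no_borrowing = locally_collapsed_abundance(samples, distances, c=0)
with_borrowing = locally_collapsed_abundance(samples, distances, c=0.05)
```

With `c = 0` the collapsed profile equals the target OTU's own abundance; with `c = 0.05` the close neighbor at distance 0.1 contributes substantially while the distant OTU at distance 2.0 contributes almost nothing. In a data-supervised procedure such as the one the abstract describes, the amount of borrowing would be chosen with reference to the outcome-OTU association rather than fixed in advance; that selection step is not shown here.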