MCluster-VAEs: an end-to-end variational deep learning-based clustering method for subtype discovery using multi-omics data

doi:10.21203/rs.3.rs-1668552/v1


Research Article



This work is licensed under a CC BY 4.0 License

Version 1


Background

Discovering cancer subtypes through unsupervised clustering helps provide precise diagnoses, guide treatment, and improve patients’ prognoses. Compared with single-omics data, multi-omics data can improve clustering performance because they offer a more comprehensive view of biological systems and mechanisms. However, heterogeneous data from multiple sources introduce high complexity and different kinds of noise, both of which hinder the extraction of clustering information.

Methods

We propose an end-to-end deep learning-based method, Multi-omics Clustering Variational Autoencoders (MCluster-VAEs), that extracts cluster-friendly representations from multi-omics data. First, a unified network architecture with an attention mechanism is developed to model multi-omics data precisely. Then, using a novel objective function derived with the variational Bayes technique, the model is trained to estimate the posterior distribution over cluster assignments.
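The two ideas named above, attention-based fusion of omics types and a posterior over cluster assignments, can be illustrated with a toy NumPy sketch. This is not the MCluster-VAEs architecture or objective, which the abstract does not specify. The function names (`attention_fuse`, `categorical_kl_to_uniform`), the dot-product scoring vector, the random weights, and the uniform cluster prior are all assumptions made for illustration only.

```python
import numpy as np


def softmax(a, axis=-1):
    """Numerically stable softmax along the given axis."""
    a = a - a.max(axis=axis, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=axis, keepdims=True)


def attention_fuse(embeddings, score_vector):
    """Fuse per-omics embeddings with attention weights (hypothetical scheme).

    embeddings:   array of shape (n_omics, n_samples, d)
    score_vector: array of shape (d,), an assumed attention parameter
    """
    scores = embeddings @ score_vector           # (n_omics, n_samples)
    weights = softmax(scores, axis=0)            # normalise across omics types
    fused = (weights[..., None] * embeddings).sum(axis=0)  # (n_samples, d)
    return fused, weights


def categorical_kl_to_uniform(q_y):
    """KL(q(y|x) || uniform prior) per sample, a common term in
    variational clustering objectives (assumed here, not taken from the paper)."""
    k = q_y.shape[-1]
    return (q_y * (np.log(q_y + 1e-12) + np.log(k))).sum(axis=-1)


# Toy data: 3 omics types, 5 samples, 4-dimensional embeddings, 2 clusters.
rng = np.random.default_rng(0)
emb = rng.normal(size=(3, 5, 4))
fused, w = attention_fuse(emb, rng.normal(size=4))

# Approximate posterior over cluster assignments from the fused representation.
q_y = softmax(fused @ rng.normal(size=(4, 2)))
kl = categorical_kl_to_uniform(q_y)
labels = q_y.argmax(axis=1)  # hard assignment: most probable cluster per sample
```

In a trained model the scoring vector and the cluster head would be learned jointly with the encoder and decoder; here they are random, so the resulting labels carry no meaning beyond showing the data flow.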

Results

Compared with twelve other state-of-the-art multi-omics clustering methods, MCluster-VAEs achieved outstanding performance on benchmark datasets from the TCGA database. On the Pan Cancer dataset, MCluster-VAEs achieved an adjusted Rand index of around 0.78 for cancer category recognition, an increase of more than 18% compared with other methods. Furthermore, survival analysis and clinical parameter enrichment tests on ten cancer datasets demonstrate that MCluster-VAEs delivered comparable or even better results than many typical integrative methods.
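For readers unfamiliar with the metric, the adjusted Rand index (ARI) compares two labelings by counting sample pairs that are grouped consistently and correcting for chance agreement. The sketch below is a plain NumPy implementation of the standard contingency-table formula; the function name `adjusted_rand_index` and the toy labels are illustrative and not taken from the paper's evaluation code.

```python
import numpy as np


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index from the contingency table of two labelings."""
    labels_true = np.asarray(labels_true).ravel()
    labels_pred = np.asarray(labels_pred).ravel()
    n = labels_true.size
    if n < 2:
        return 1.0  # no pairs to compare

    # Map arbitrary label values to 0..k-1 and build the contingency table.
    _, t_idx = np.unique(labels_true, return_inverse=True)
    _, p_idx = np.unique(labels_pred, return_inverse=True)
    table = np.zeros((t_idx.max() + 1, p_idx.max() + 1), dtype=np.int64)
    np.add.at(table, (t_idx, p_idx), 1)

    def pairs(c):
        """Number of unordered pairs, 'c choose 2', elementwise."""
        return c * (c - 1) / 2.0

    index = pairs(table).sum()                 # pairs together in both labelings
    row = pairs(table.sum(axis=1)).sum()       # pairs together in labels_true
    col = pairs(table.sum(axis=0)).sum()       # pairs together in labels_pred
    expected = row * col / pairs(n)            # chance-level agreement
    maximum = 0.5 * (row + col)
    if maximum == expected:
        return 1.0  # degenerate case, e.g. both labelings are a single cluster
    return (index - expected) / (maximum - expected)


# Identical partitions with swapped label names score 1.0.
ari_same = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
# A partition that splits every true cluster evenly scores below zero.
ari_cross = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
```

An ARI of 1 means the two partitions agree exactly up to label renaming, values near 0 indicate chance-level agreement, and negative values indicate agreement worse than chance.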

Conclusions

These results demonstrate that MCluster-VAEs is a powerful new tool for dissecting complex multi-omics relationships and for providing new insights into cancer subtype discovery.

cancer subtype discovery

multi-omics data integration

cluster

deep learning

variational bayes