Background: Integrating multi-omics data for cancer subtype recognition is an important task in bioinformatics. Recently, deep learning has been applied to cancer subtype recognition. However, most existing studies integrate multi-omics data by simply concatenating them into a single dataset and then learning a latent low-dimensional representation with a deep learning model, which does not account for the different distributions of the individual omics data types. Moreover, these methods ignore the relationships among samples.
Results: To tackle these problems, we propose SADLN, a self-attention based deep learning network that integrates multi-omics data for cancer subtype recognition. SADLN combines an encoder, self-attention, a decoder, and a discriminator into a unified framework, which not only integrates multi-omics data but also adaptively models the relationships among samples to learn an accurate latent low-dimensional representation. Using the integrated representation learned by the network, SADLN applies a Gaussian mixture model to identify cancer subtypes. Experiments on ten TCGA cancer datasets demonstrated the advantages of SADLN over ten comparison methods.
Conclusions: The Self-Attention Based Deep Learning Network (SADLN) is an effective method for integrating multi-omics data for cancer subtype recognition.
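
As an illustration of how self-attention can model relationships among samples, the following is a minimal sketch, not the SADLN implementation. It assumes plain scaled dot-product attention with identity query, key, and value projections, applied to a small hypothetical matrix of latent sample representations; the function names and example values are illustrative only, and the encoder, decoder, discriminator, and Gaussian mixture model steps are not shown.

```python
import math


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def sample_self_attention(latent):
    """Scaled dot-product self-attention across samples.

    latent: list of n samples, each a list of d latent features.
    Assumption: queries, keys, and values are the latent vectors
    themselves (no learned projections).
    Returns (weights, attended), where weights[i][j] is how strongly
    sample i attends to sample j.
    """
    n = len(latent)
    d = len(latent[0])
    scale = math.sqrt(d)
    # Pairwise similarity between every pair of samples.
    scores = [
        [sum(a * b for a, b in zip(latent[i], latent[j])) / scale
         for j in range(n)]
        for i in range(n)
    ]
    # Each row becomes a probability distribution over samples.
    weights = [softmax(row) for row in scores]
    # Each sample is re-expressed as a weighted mix of all samples.
    attended = [
        [sum(weights[i][j] * latent[j][k] for j in range(n))
         for k in range(d)]
        for i in range(n)
    ]
    return weights, attended


# Hypothetical latent representations for three samples.
latent = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
weights, attended = sample_self_attention(latent)
```

In this toy input, the first two samples are similar, so the first sample attends more strongly to the second than to the third. A learned version would add trainable projections, typically through a tensor library.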