With advances in science and technology, cancer research has expanded beyond the analysis of a single type of data. Researchers now focus on the integrated analysis of multiple data types, which is more informative for predicting cancer. This paper reviews existing techniques for analyzing multi-omics data through statistical integration, including correlation analysis, multivariate analysis, and series integration analysis. As high-throughput sequencing technology advances, however, multi-omics datasets have grown increasingly large. This leads to the "curse of dimensionality": traditional analysis methods struggle to extract relevant variables and are prone to overfitting. This paper applies sparse canonical correlation analysis to DNA methylation and RNA-seq gene expression data from breast cancer patients. The relationship between the methylation status of particular DNA regions and gene expression levels is examined, and the most significant features in each dataset are identified. This approach offers a novel way to analyze complex, high-dimensional multi-omics datasets.
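To make the idea of sparse canonical correlation analysis concrete, the following is a minimal sketch and not the implementation used in this paper. It assumes a simplified rank-one scheme in the spirit of penalized matrix decomposition: alternating power iterations on the cross-covariance matrix, with soft-thresholding to set small weights to zero. The function names (`soft_threshold`, `sparse_cca`) and the relative threshold parameters `lam_u` and `lam_v` are illustrative assumptions, and the data are synthetic stand-ins for a methylation matrix `X` and an expression matrix `Y`.

```python
import numpy as np

def soft_threshold(a, lam):
    """Shrink each entry of `a` toward zero by `lam`; small entries become exactly 0."""
    return np.sign(a) * np.maximum(np.abs(a) - lam, 0.0)

def _unit(v):
    """Scale `v` to unit Euclidean norm (left unchanged if it is all zeros)."""
    n = np.linalg.norm(v)
    return v / n if n > 0 else v

def sparse_cca(X, Y, lam_u=0.3, lam_v=0.3, n_iter=100, tol=1e-8):
    """Return sparse weight vectors (u, v) for one pair of canonical variates.

    Assumption: lam_u and lam_v are thresholds relative to the largest
    absolute entry at each step (0 <= lam < 1), so at least one weight
    always survives. Constant columns must be removed beforehand.
    """
    # Standardize each feature so the cross-product is a correlation matrix.
    X = (X - X.mean(axis=0)) / X.std(axis=0)
    Y = (Y - Y.mean(axis=0)) / Y.std(axis=0)
    C = X.T @ Y / (X.shape[0] - 1)

    # Initialize v with the leading right singular vector of C.
    v = np.linalg.svd(C, full_matrices=False)[2][0]
    u = np.zeros(C.shape[0])
    for _ in range(n_iter):
        a = C @ v
        u = _unit(soft_threshold(a, lam_u * np.abs(a).max()))
        b = C.T @ u
        v_new = _unit(soft_threshold(b, lam_v * np.abs(b).max()))
        converged = np.linalg.norm(v_new - v) < tol
        v = v_new
        if converged:
            break
    return u, v

# Synthetic example: one hidden factor drives 3 features of X and 2 of Y.
rng = np.random.default_rng(0)
n, p, q = 200, 50, 40
z = rng.normal(size=n)
X = rng.normal(size=(n, p))
Y = rng.normal(size=(n, q))
X[:, :3] = z[:, None] + 0.3 * rng.normal(size=(n, 3))
Y[:, :2] = z[:, None] + 0.3 * rng.normal(size=(n, 2))

u, v = sparse_cca(X, Y)
selected_x = np.flatnonzero(u)  # indices of features kept in X
selected_y = np.flatnonzero(v)  # indices of features kept in Y
```

The nonzero entries of `u` and `v` mark the features selected from each dataset, which is how the most significant features in each omics layer can be read off. A fuller treatment would choose the penalties by cross-validation or permutation and extract further component pairs by deflation.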