Transcriptomic alternations in DNA copy number or DNA MET
DNA CNVs and MET at genetic and epigenomic levels, as well as gene EXP patterns, were collected from 313 STAD specimens. Then, the raw materials were preprocessed according to the method described previously in “Methods.” Later, the correlation coefficients of DNA CNV or MET profiles with corresponding mRNA EXP data were computed, so as to evaluate the influence of genomic and/or epigenomic abnormalities. The correlation coefficient r had been standardized according to the Fisher’s Z-transformation for variance stabilization.
Consistent with prior study, overall distribution of correlation coefficients between DNA CNV and related EXP data markedly skewed to the right (skewness = 1.2352, p < 1e-5). In comparison, the correlation coefficients between DNA MET and related EXP data skewed to the left (skewness = -0.37363, p < 1e-5) (Fig.1A), which indicated that DNA CNVs and MET abnormalities positively and negatively modulated transcription, respectively.
A great number of genes participated in the above 2 gene sets (CNVcor and METcor genes, Table S2-3); as a result, genes that were markedly correlated with OS were selected in later analysis (log rank p < 0.05). Gene signatures with positive correlation were screened to examine the DNA copy number (CNVcor, n = 368), and while those with negative correlation were screened for DNA MET (METcor, n = 348). CNVcor genes indicated DNA CNVs-dependent transcriptional dysregulation, but METcor genes suggested MET-dependent transcriptional dysregulation. CNVcor genes were not overlapped with METcor ones, of them, 116 overlapped gens had been identified, which suggested that CNVcor and METcor genes could specifically regulate transcriptional dysregulation (Fig.1B).
CNVcor genes preferred to DNA CNVs within some genomic regions, especially on chromosomes 10, 17, 18 and 21 (Fig.1C and Table S4). Consistent with prior research, CNVcor genes were abundant on chromosomes 10 and 17, which suggested that gene expression was sensitive to DNA dose at certain regions[22]. Additionally, METcor genes could be found within desired chromosomal regions, like chromosome 19 (Fig.1D and Table S5), most of which were genes that encoded proteins (Fig.1E) and distributed in the CpG islands (Fig.1F). This study suggested that, CNVcor and METcor genes were remarkably effective on transcriptional dysregulation of STAD, which required to be further examined in future studies.
Different CNVcor and METcor genes-dependent molecular subtypes.
Subsequently, the effects of CNVcor and METcor gene expression on predicting prognostic subgroups were explored. NMF cluster analysis was performed for all gene set data, and the cluster number k was set at 2-10; later, the value of k was calculated for each profile (k=3 for CNV and MET, respectively) (Fig.2A-B). Surprisingly, CNVcor genes-identified subtypes were remarkably overlapped with the METcor genes-identified ones (p < 1e-5 upon χ2-test), and such results were consistent with CNVcor and METcor gene regulation in STAD (Fig.2E-F). Additionally, Kaplan–Meier (KM) curve results suggested that, CNVcor or METcor genes-identified subtypes predicted the OS of patients (Fig.2C-D), separately (p < 0.05).
Molecular subtypes associated with CNVcor and METcor gene expression were identified at various aspects. iCluster, the integrated cluster method, was employed to integrate genomic data on mRNA EXP, DNA CNVs and MET. Afterwards, cluster analysis would be carried out, with the cluster number k of 2-4. 20 cluster iterations at K=4 (category 5), K=3 (category 4) and K=2 (category 3) were carried out separately to assess the best iCluster cluster results. According to our findings, stable cluster results were obtained at K=2 relative to those at K=3 or 4 (Fig.S1-2). As a result, all samples were clustered as 3 subclasses of iC1-iC3 (n=93, 100 and 120, respectively). Fig.S3A-B display the cluster results of those 3 subclasses, and Table S6 presents those of all samples.
KM analysis results suggested that, iC1 could attain the optimal OS across those 3 subtypes (p < 0.05, Fig.3A). The OS of patients in iC1 subgroup was compared with that of the other 2 subgroups (Fig.3B and Fig.S4), and the results suggested that, the difference in prognosis between iC1 and iC2 subgroups was statistically significant (p < 0.001). Notably, the subtypes identified by icluster remarkably overlapped with those identified through CNVcor and METcor genes (p < 1e-5, χ2-test, Fig.3C-D). According to the above findings, comprehensive analysis on CNVcor and METcor genes facilitated to identify various molecular subtypes, and all of them showed heterogenous combinations of genomic and epigenomic features associated with prognosis and transcriptional dysregulation.
Coordinated DNA CNVs and METs abnormalities.
The frequencies between genome-wide DNA METs abnormalities and DNA CNVs were compared following batch effect correction. Additionally, DNA copy-number gain (CNVgain, β>0.3) and loss (CNVloss, β<-0.3), together with DNA hypermethylation (METhyper, β>0.8) and hypomethylation (METhypo, β<0.2) were computed according to the determined threshold (fold change of 0.3), which were subsequently compared with the level of every probe. Our results indicated that (Table S7), CNVgain frequency showed marked correlation with CNVloss frequency (p < 1e-5, Fig.4A). Besides, METhyper frequency displayed evident correlation with METhypo frequency (p < 1e-5, Fig.4F). Directional METhyper and CNVgain were tightly correlated with CNVloss, which indicated that all correlations were directional abnormality-free (Fig.4B-E, p < 0.05). In conclusion, our results suggested that, STAD patients with increased frequency of DNA CNVs had elevated frequency of DNA MET abnormality. Correlations in the frequency of abnormal METcor and CNVcor genes indicated the close correlation of DNA MET with DNA CNVs.
Identification of STAD subtype key features
Firstly, clinical features (including TNM, Stage, Gender and Primary Site) were compared across those 3 subtypes. Fig.S5 and Table 1 indicate that, the differences in clinical characteristics are not statistically significant among 3 STAD subtypes. Nonetheless, as for stage distribution, III and IV samples were mainly the iC2 subtype that had the poorest prognosis. In addition, samples across these 3 subtypes were calculated and determined based on the tumor immune estimation resource (TIMER) method (Table S8)[23]. As could be seen, differences in the 5/6 immune cell scores (CD8+T, CD4+T, macrophages, neutrophils, and dendritic cells) of samples across these 3 subtypes were statistically significant (Fig.5A and B, p < 1e-5). The above results indicated that, immunocyte infiltration degree or the immune microenvironment of STAD was correlated with DNA CNV or MET level.
Table 1 Clinical informations of STAD patients across the 3 subtypes were compared.
Event
|
Total
|
iC1
|
iC2
|
iC3
|
Alive
|
188
|
69
|
47
|
68
|
Dead
|
125
|
22
|
49
|
49
|
T
|
|
|
|
|
T1
|
14
|
7
|
5
|
2
|
T2
|
63
|
17
|
26
|
20
|
T3
|
146
|
41
|
47
|
58
|
T4
|
81
|
26
|
18
|
37
|
N
|
|
|
|
|
N0
|
90
|
35
|
24
|
31
|
N1
|
80
|
20
|
25
|
35
|
N2
|
65
|
20
|
22
|
23
|
N3
|
62
|
14
|
23
|
25
|
NX
|
7
|
2
|
2
|
3
|
M
|
|
|
|
|
M0
|
275
|
87
|
86
|
102
|
M1
|
18
|
2
|
5
|
11
|
MX
|
11
|
2
|
5
|
4
|
Stage
|
|
|
|
|
I
|
39
|
14
|
14
|
11
|
II
|
99
|
34
|
27
|
38
|
III
|
129
|
36
|
40
|
53
|
IV
|
28
|
3
|
13
|
12
|
Un
|
9
|
4
|
2
|
3
|
Grade
|
|
|
|
|
G1
|
8
|
2
|
3
|
3
|
G2
|
107
|
37
|
49
|
21
|
G3
|
180
|
51
|
40
|
89
|
GX
|
9
|
1
|
4
|
4
|
Age
|
|
|
|
|
0~50
|
27
|
9
|
7
|
11
|
50~60
|
72
|
15
|
20
|
37
|
60~70
|
91
|
23
|
33
|
35
|
70~80
|
99
|
38
|
32
|
29
|
80~100
|
15
|
6
|
4
|
5
|
Gender
|
|
|
|
|
FEMALE
|
101
|
29
|
27
|
45
|
MALE
|
203
|
62
|
69
|
72
|
Besides, differences in gene EXP, DNA CNVs and MET across iC2 and iC1 subtype samples were compared. Later, DNA MET and CNVs levels had been classified as 3 types, including Normal, Gain and Loss (for CNV), as well as Normal, HyperMethy and HypoMethy (for MET). DNA CNVcor or METcor genes with marked difference between iC2 and iC1 subtypes were obtained through Fisher’s exact test. These results are presented in Table S9-10. As for EXP patterns, differentially expressed genes (DEGs) between iC1 and iC2 subtypes were acquired through DESeq2[24] (p < 0.05), and the results are presented in Table S11. Fig. S7 displays EXP, CNV and METcor genes that were markedly different across samples of 3 subtypes. In addition, to illustrate the crucial prognostic features across various subtypes, altogether 5 genes with distinct difference between iC1 and iC2 samples at all the three (MET, CNV and EXP) levels had been screened to carry out univariate survival analysis. According to our findings, 2 genes (PLCXD3 and KCNB1) showed correlation with prognosis (log-rank p < 0.05), suggesting that the above-mentioned two genes in iC2 subtype (with poorer prognosis) had greater CNV and hypermethylation levels than iC1 subtype (with positive prognosis), and the expression of the 2 genes in iC1 subtype were down-regulated compared with ones in the iC2 subtype. Afterwards, PLCXD3 and KCNB1 expression was divided as low (L1), medium (L2) or high (L3). Our findings suggested that, L1, L2 and L3 groups in terms of the PLCXD3 and KCNB1 expression levels had positive correlation with prognosis (Fig.6A-B). The above findings indicated that, the distinct PLCXD3 and KCNB1 expression was related to DNA MET or CNV level as well as the patients prognostic outcome. Moreover, the GEO GSE62254[25] STAD data set (n=266) had been utilized to analyze the relationship between those of 5 genes and the prognosis for patients. The expression levels of JPH3 and KCNB1 were available, which suggested that only JPH3 in GSE62254 STAD data set[26] showed marked correlation with prognosis. Fig.S6 displays these results.
Eventually, the STAD mutation profiles had been determined for exploring the associations with the sub-classification. Specifically, the synonymous mutations were eliminated to obtain the missense and nonsense mutations. Overall, the mutation frequency of genes in different subtypes showed statistically significant differences. In addition, some distinctly different mutant genes (PIK3CA, ARID1A, SPECC1, ZFHX3, KMT2D, OBSCN, ZBTB20, FSIP2, RANBP2 and TTN) between iC1 and iC2 subtype were selected upon fisher’s test, as presented in Fig.7 and Table S12 (FDR < 0. 1 upon fisher’s test). The different mutational spectra were analyzed, which suggested that the above 10 genes in iC1 subtype had remarkably increased mutation frequencies compared with those in iC2 and iC3 subtypes (p < 0.01). Interestingly, mutation frequencies of these 10 genes in iC3 subtype showing common prognosis were also increased compared with that of iC2 subtype (p < 0.05). In conclusion, the above findings indicated that, the STAD molecular subtypes related to DNA copy numbers and DNA MET showed correlation with mutations in the above 10 genes, and they might modulate the progression of STAD subtypes.