The MSI/LOH status in tumor-related genes
To determine the MSI/LOH status in tumor-related genes, we selected 61 MS loci (Table S2) for which appropriate PCR conditions could be optimized. Of the 61 MS loci, 53 were located in introns, 1 in an exon, 5 in non-coding regions (defined here as regions outside the 3'-untranslated region (UTR), 5'-UTR, exons and introns of a gene), and the remaining 2 in the 3'-UTR. Based on STR scanning, 217 MSI events in 18 genes were detected in 48 tumor specimens, corresponding to approximately 4.52 MSI events per affected tumor. In addition, 909 LOH events in 18 genes were detected in 147 tumors, corresponding to approximately 6.18 LOH events per affected tumor. Of the 256 cases, 18.8% harbored one or more MSI events, 57.4% had one or more LOH events, and 23.8% had no mutation (neither MSI nor LOH). Of the 61 loci, 70.49% (43/61) showed at least one MSI event and 83.61% (51/61) showed at least one LOH event. MSI occurrence varied widely across the 61 loci. BRAF-9 was the most frequently affected locus, with a significantly higher occurrence rate than the other loci (5.08%, 13/256) (Fig. 1A). The MSI frequency of the 3 most frequently affected MS loci in the tumor-related genes (range 4.30% to 5.08%) was lower than that of the B5 loci (range 7.42% to 9.38%). Thus, the MSI frequencies of the B5 loci BAT25, BAT26, D5S346, D2S123 and D17S250 were higher than those of any locus in the tumor-related genes. Notably, locus TP53-1 had the highest LOH occurrence rate (26.95%, 69/256). LOH occurrence also varied widely across the 61 loci (Fig. 1B). The LOH frequency of the 3 most frequently affected loci in the tumor-related genes (range 14.45% to 26.95%) was similar to that of the dinucleotide loci in B5 (range 10.16% to 21.88%). At the gene level, P21 was the gene most commonly affected by MSI (4.69%, 12/256), and TP53 had a much higher LOH frequency (26.95%, 69/256) than the other genes (Fig. 1C-D and Table S3-S4).
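The summary rates in this paragraph rest on three different denominators: the tumors carrying at least one event, the 61 loci, and the 256 cases. A minimal Python sketch of that arithmetic follows. The counts are those reported above, whereas the function and variable names are illustrative assumptions and not part of the study.

```python
# Counts as reported in the text; all names here are illustrative assumptions.
N_CASES = 256
N_LOCI = 61

def events_per_tumor(n_events, n_affected_tumors):
    """Mean number of events among tumors that carry at least one event."""
    return n_events / n_affected_tumors

def percent(part, whole):
    """Share of `part` in `whole`, expressed as a percentage."""
    return 100.0 * part / whole

msi_per_tumor = events_per_tumor(217, 48)    # MSI events / tumors with MSI
loh_per_tumor = events_per_tumor(909, 147)   # LOH events / tumors as reported
msi_loci_pct = percent(43, N_LOCI)           # loci with at least one MSI event
loh_loci_pct = percent(51, N_LOCI)           # loci with at least one LOH event
braf9_msi_pct = percent(13, N_CASES)         # per-locus rate, BRAF-9
tp53_1_loh_pct = percent(69, N_CASES)        # per-locus rate, TP53-1

print(f"MSI events per affected tumor: {msi_per_tumor:.2f}")
print(f"LOH events per affected tumor: {loh_per_tumor:.2f}")
print(f"Loci with MSI: {msi_loci_pct:.2f}%, loci with LOH: {loh_loci_pct:.2f}%")
```

Keeping the denominators explicit makes it easier to see that the per-tumor means describe only affected tumors, while the per-locus rates are taken over all 256 cases.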
Colorectal carcinomas with high-frequency microsatellite instability (MSI-H) account for 15% of all colorectal cancers, comprising sporadic cases (12%) and cancers associated with Lynch syndrome (3%). Using the B5 panel, we classified the tumors as MSI-H, MSI-L or MSS. MSI-H tumors accounted for 10.94% of all cases, which is comparable with published data. Moreover, MSI-H tumors defined by the B5 panel were more prone to mutation in the MS loci of tumor-related genes (Fig. S2).
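The thresholds used for the B5 classification are not spelled out here. The sketch below assumes the commonly used Bethesda-style rule: MSI-H when two or more of the five B5 loci are unstable, MSI-L when exactly one is unstable, and MSS when none is. The function name and the input format (a list of unstable locus names per tumor) are hypothetical.

```python
# The five B5 loci named in the text.
B5_LOCI = ("BAT25", "BAT26", "D5S346", "D2S123", "D17S250")

def classify_b5(unstable_loci):
    """Classify one tumor from the B5 loci that show instability.

    Assumed rule (not stated in the text): >= 2 unstable loci -> MSI-H,
    exactly 1 -> MSI-L, 0 -> MSS. Loci outside the B5 panel are ignored.
    """
    n_unstable = len(set(unstable_loci) & set(B5_LOCI))
    if n_unstable >= 2:
        return "MSI-H"
    if n_unstable == 1:
        return "MSI-L"
    return "MSS"

# Hypothetical tumors, for illustration only.
print(classify_b5(["BAT25", "BAT26"]))   # two unstable B5 loci
print(classify_b5(["D2S123"]))           # one unstable B5 locus
print(classify_b5([]))                   # no unstable B5 locus
```

The same counting rule, applied to LOH calls instead of MSI calls, would give an LOH-H/LOH-L/non-LOH grouping of the kind used later in this section.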
The prognostic value of MSI/LOH in tumor-related genes and its prediction of the response to chemotherapy
Microsatellite instability can provide rich information for prognosis and for evaluating the response to chemotherapy in cancer patients [18,19]. The overall survival (OS) of patients with MSI-H was also longer than that of patients with MSS/MSI-L (63.5 months versus 60.0 months, p=0.013). In the present study, we explored the relationship between MSI/LOH at 32 sensitive loci and the outcomes of CRC patients in the training group only (n=256), because survival information was not available for the second batch of samples.
However, according to the MSI status of the B5 panel, outcomes did not differ significantly among CRC patients in the all-stage group (Fig. 2A-B), the stage Ⅱ group (Fig. 2C-D), the stage Ⅲ group (Fig. 2E-F) or the adjuvant chemotherapy group (Fig. 2G-H). The LOH-H status of the B5 panel (at least two of the B5 loci showing LOH) also failed to predict the outcome of CRC patients (Fig. 3). Nevertheless, we found that the MSI/LOH status of BAT25 and BAT26 in the B5 panel and of 12 loci in tumor-related genes could serve as sensitive markers for predicting the outcome of CRC patients (Table 1 and Fig. 4).
In the entire cohort (n=256), MSI at the D17S250 (p=0.001), MSH2-15 (p=0.001), Pinch-5 (p=0.03) and MCC-10 (p=0.001) loci was associated with poorer 5-year OS (Fig. 4A); patients with MSI at the D17S250 (p=0.02), MSH2-15 (p=0.006), MCC-25 (p=0.048) and MCC-10 (p=0.001) loci showed significantly poorer 5-year progression-free survival (PFS) (Table 1 and Fig. 4B).
In stage Ⅱ patients (n=127), MSI at the D17S250 (p=0.006), Pinch-5 (p=0.001), MSH2-15 (p=0.001), MCC-25 (p=0.024), MCC-10 (p=0.001), MCC-3 (p=0.036), MCC-26 (p=0.049), MGMT-10 (p=0.04) and APC-6 (p=0.049) loci was associated with poorer 5-year OS (Fig. 4C). Patients with MSI at the Pinch-5 (p=0.001), MSH2-15 (p=0.001), MCC-25 (p=0.024), MCC-10 (p=0.001), MGMT-10 (p=0.04) and BRAF-9 (p=0.001) loci showed significantly poorer 5-year PFS (Table 1 and Fig. 4D).
In stage Ⅲ patients (n=93), MSI at D17S250 (p=0.002) and MCC-10 (p=0.003), and LOH at P21 (p=0.009) and MLH1-2 (p=0.006), were associated with poorer 5-year OS (Fig. 4E); MSI at D17S250 (p=0.01) and MCC-10 (p=0.001), and LOH at BRAF-9 (p=0.001), P21 (p=0.021), MLH1-2 (p=0.004) and Pinch-13 (p=0.035), were associated with significantly poorer 5-year PFS (Table 1 and Fig. 4F).
We also examined the association of MSI/LOH in the tumor-related genes with the response to adjuvant chemotherapy. In the adjuvant chemotherapy group (n=132), patients with MSI at D17S250 (p=0.01) or MCC-10 (p=0.001), or with LOH at BAT25 (p=0.048), had poorer 5-year OS (Fig. 4G); patients with MSI at the MCC-10 locus (p=0.001) also had poorer 5-year PFS (Table 1 and Fig. 4H).
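The survival comparisons above contrast patients with and without MSI/LOH at a given locus. The test behind the reported p-values is not stated in this section; a two-group log-rank test is a common choice for such comparisons and is assumed in the sketch below. The function name, argument layout and toy data are hypothetical and are not taken from the study.

```python
import math

def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test (assumed method, see lead-in).

    times_*: follow-up times (e.g. months); events_*: 1 = event, 0 = censored.
    Returns (chi-square statistic with 1 degree of freedom, p-value).
    """
    group_a = list(zip(times_a, events_a))
    group_b = list(zip(times_b, events_b))
    # Distinct times at which at least one event occurred in either group.
    event_times = sorted({t for t, e in group_a + group_b if e})

    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        n_a = sum(1 for x, _ in group_a if x >= t)        # at risk, group A
        n_b = sum(1 for x, _ in group_b if x >= t)        # at risk, group B
        d_a = sum(1 for x, e in group_a if x == t and e)  # events, group A
        d_b = sum(1 for x, e in group_b if x == t and e)  # events, group B
        n, d = n_a + n_b, d_a + d_b
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0, 1.0
    chi2 = observed_minus_expected ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # chi-square tail, df = 1
    return chi2, p_value

# Hypothetical toy data, NOT taken from the study.
marker_positive = ([5, 8], [1, 1])      # (months, event flags)
marker_negative = ([10, 12], [1, 0])
chi2, p_value = logrank_test(*marker_positive, *marker_negative)
print(f"chi2 = {chi2:.3f}, p = {p_value:.3f}")
```

In practice each locus would be tested separately within each patient group (all stages, stage Ⅱ, stage Ⅲ, adjuvant chemotherapy), once for OS and once for PFS.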
The association of MSI/LOH profile with CRC clinical features
Clinical features such as the TNM (tumor-node-metastasis) stage and pathological type are usually important prognostic factors for patients with colorectal cancer. The association of the MSI/LOH profile with CRC clinical features was analyzed in the training cohort (n=256) and then tested in the validation cohort (n=440). The number of patients with MSI in BAT25 (p=0.005), MSI/LOH in BAT26 (p=0.004) or MSI-H in the B5 panel (p=0.012) was significantly higher among mucinous carcinomas than among adenocarcinomas (Table S5-S6). These results illustrate that, compared with loci in tumor-related genes, MSI/LOH of certain B5 loci or of the whole B5 panel is more closely related to the pathological type of CRCs. Next, we explored the MSI/LOH profile and its association with other clinicopathological features. Although MSI/LOH of several loci was significantly related to TNM stage, lymphatic metastasis, infiltration depth, differentiation degree and recurrence in the training group, none of these associations was confirmed in the validation group (Table S7-S14). With regard to the B5 panel, LOH-H patients showed more lymphatic metastasis than LOH-L+non-LOH patients in both the training (p=0.05) and validation (p=0.04) sets (Table S9-S10).
The characteristics of the MSI/LOH within the tumor-related genes
Among the MSI/LOH events, 46% of MSI events (100/217) and 56% of LOH events (511/909) were found in tumor suppressor (TS) genes. Specifically, the MSI frequency in TS genes (1.50%, 100/(26×256)) and DNA repair (DNAR) genes (1.50%, 23/(6×256)) was higher than that in oncogenes (1.25%, 64/(20×256)) and MMR genes (1.30%, 30/(9×256)), but the differences were not statistically significant. However, the LOH frequency in TS genes (7.68%, 511/(26×256)) was significantly higher than that in DNAR genes (5.79%, 89/(6×256)), MMR genes (4.69%, 108/(9×256)) and oncogenes (3.93%, 201/(20×256)) (p=0.011; p<0.001; p<0.001, respectively). In addition, the LOH frequency differed significantly between DNAR genes and oncogenes (p=0.002) (Fig. 5A-B).
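Each frequency in this comparison is the number of events divided by the number of locus-by-case observations (loci in the gene class × 256 cases). The sketch below reproduces that normalization and compares two classes with a Pearson chi-square test on a 2×2 table. The choice of test is an assumption, since it is not stated in this section, and all names are illustrative.

```python
import math

N_CASES = 256

# Gene class -> (number of loci, MSI events, LOH events), as reported above.
GENE_CLASSES = {
    "TS": (26, 100, 511),
    "DNAR": (6, 23, 89),
    "MMR": (9, 30, 108),
    "oncogene": (20, 64, 201),
}

def frequency(events, n_loci, n_cases=N_CASES):
    """Percentage of locus-by-case observations that carry an event."""
    return 100.0 * events / (n_loci * n_cases)

def chi2_2x2(events_1, total_1, events_2, total_2):
    """Pearson chi-square test (no continuity correction) for two proportions.

    Assumed test; returns (chi-square statistic, p-value) with df = 1.
    """
    a, b = events_1, total_1 - events_1
    c, d = events_2, total_2 - events_2
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # chi-square tail, df = 1
    return chi2, p_value

def compare_loh(class_1, class_2):
    """Compare LOH frequencies between two gene classes."""
    loci_1, _, loh_1 = GENE_CLASSES[class_1]
    loci_2, _, loh_2 = GENE_CLASSES[class_2]
    return chi2_2x2(loh_1, loci_1 * N_CASES, loh_2, loci_2 * N_CASES)

for name, (n_loci, msi, loh) in GENE_CLASSES.items():
    print(f"{name}: MSI {frequency(msi, n_loci):.2f}%, LOH {frequency(loh, n_loci):.2f}%")

chi2_ts_dnar, p_ts_dnar = compare_loh("TS", "DNAR")
chi2_ts_onc, p_ts_onc = compare_loh("TS", "oncogene")
```

With the reported counts, this sketch gives a p-value close to 0.01 for the TS versus DNAR LOH comparison, in line with the reported p=0.011. Because the test actually used is not stated, the agreement should be read as indicative only.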
Regarding the location of the MS loci in the tumor-related genes, the MSI frequency within introns was 1.5% (203/(53×256)), which was higher than that in the non-coding (1.02%, 13/(5×256)) and exon (0.39%, 1/(1×256)) regions, although the differences were not statistically significant (Fig. 5C). On the other hand, the LOH frequencies within the 3'-UTR (7.03%, 36/(2×256)) and introns (6.31%, 856/(53×256)) were significantly higher than that in the non-coding regions (1.33%, 17/(5×256)) (p<0.001; p<0.001, respectively) (Fig. 5D). These results suggest that MS loci are enriched in introns and that intronic MS loci are more prone to mutation than those in other regions.
Most MSI events (75.1%, 163/217) and LOH events (63.3%, 575/909) occurred at dinucleotide repeats within the tumor-related genes. The frequency of MSI at dinucleotide repeats (1.68%, 163/(38×256)) was significantly higher than that at tetranucleotide repeats (0.78%, 26/(13×256)) (p<0.001) and also higher than that at trinucleotide repeats (0.93%, 19/(8×256)) (p=0.013) (Fig. 5E). The frequency of LOH at dinucleotide repeats (5.91%, 575/(38×256)) was higher than that at tetranucleotide repeats (4.72%, 157/(13×256)) (p=0.010), but did not differ significantly from that at trinucleotide repeats (5.27%, 108/(8×256)) (p=0.262) (Fig. 5F). These data indicate that most of the MS loci consisted of dinucleotide repeats, which were more prone to mutation than the other repeat types.
To investigate the mutation patterns of the tumor-related genes in human CRCs, we divided mutations into two patterns: MSI and LOH. Among the 1126 mutation events, MSI and LOH accounted for 19.27% (n=217) and 80.73% (n=909), respectively (Fig. S3A). LOH was therefore the more common mutation type in the tumor-related genes (Fig. S3B). Of the 61 MS loci, mutations were found in 54, and most of these (40 loci) exhibited both MSI and LOH patterns (Fig. S3C); 11 loci exhibited the LOH pattern alone and 3 loci showed only the MSI pattern. Statistical analysis indicated that the MSI frequency was similar among the four types of genes: the MSI frequency in TS genes (1.50%, 100/(26×256)) was similar to that in DNAR genes (1.50%, 23/(6×256)), MMR genes (1.30%, 30/(9×256)) and oncogenes (1.25%, 64/(20×256)). However, the proportion of MSI among all mutation events was much lower in TS genes than in oncogenes (p<0.01) (Fig. S3D). With respect to the location of the MS loci, introns and non-coding regions harbored both mutation types, whereas the 3'-UTR had only LOH mutations and the exon had only an MSI mutation. In introns, LOH accounted for 80.83% (856/1059) of all mutation events. In non-coding regions, the proportions of MSI and LOH were similar (Fig. S3E).
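The pattern analysis uses a different denominator from the frequency analysis: each share is taken over the mutation events (MSI + LOH) rather than over locus-by-case observations. A minimal sketch of this bookkeeping, using the counts reported in the text, follows; the names are illustrative assumptions.

```python
# Event counts as reported in the text; names are illustrative assumptions.
MSI_EVENTS, LOH_EVENTS = 217, 909

# Gene class -> (MSI events, LOH events)
BY_CLASS = {
    "TS": (100, 511),
    "DNAR": (23, 89),
    "MMR": (30, 108),
    "oncogene": (64, 201),
}

def msi_share(msi, loh):
    """Percentage of MSI among all mutation events (MSI + LOH)."""
    return 100.0 * msi / (msi + loh)

overall_msi_share = msi_share(MSI_EVENTS, LOH_EVENTS)
class_msi_share = {name: msi_share(m, l) for name, (m, l) in BY_CLASS.items()}

# Locus-level patterns among the mutated loci.
both, loh_only, msi_only = 40, 11, 3
loci_with_msi = both + msi_only   # loci showing at least one MSI event
loci_with_loh = both + loh_only   # loci showing at least one LOH event

print(f"MSI share of all events: {overall_msi_share:.2f}%")
for name, share in class_msi_share.items():
    print(f"{name}: MSI share {share:.2f}%")
```

Under these counts the MSI share is lower in TS genes than in oncogenes, which is the direction reported for Fig. S3D, and the locus totals agree with the 43 MSI-positive and 51 LOH-positive loci given earlier.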
We further analyzed the mutation patterns according to the number of repeat units (especially in introns), the type of repeat unit and the length of the repeat unit. No correlation was found among these subgroups (Fig. S4), indicating that the mutation pattern was not influenced by the repeat unit.
Mutational profile of MS in human CRCs
Given that the B5 panel is frequently applied in clinical practice and that the MMR system is of pivotal importance for the occurrence of MSI, we analyzed whether the MSI of the tumor-related genes we studied was related to the B5 or MMR status. The CRC samples were divided into B5-MSI and B5-MSS groups, or into MMR-deficient (MMR-d) and MMR-proficient (MMR-p) groups, according to the B5 MSI status or the MMR status, respectively.
The MSI frequency of 16 of the tumor-related genes we tested (84.2%, 16/19) was significantly higher in the B5-MSI group than in the B5-MSS group (Table S15). This indicates that the B5 panel is an efficient criterion for assessing the overall MSI status of the genome.
Similarly, apart from the four MMR genes (MSH2, MLH1, MSH6 and PMS2), the MSI frequency of most of the tumor-related genes we tested (80%, 12/15) was markedly higher in MMR-d tumors than in MMR-p tumors (Table S16), in accordance with the view that the MMR system plays a vital role in the occurrence of MSI.
The MSI/LOH spectrum in CRC patients
An increased number of mutations has been detected in CRCs, suggesting that the mutation spectrum in CRCs is very complicated. In our study, 54.17% (26/48) of CRCs with MSI harbored MSI events within one gene, while 6.25% (3/48) harbored MSI events in two genes simultaneously. Among the 48 MSI patients, the number of MSI events detected in each patient ranged from 1 (52.08%, 25/48) to 18 (2.08%, 1/48), with a mean of 4.52 MSI events (217/48) per patient (Table S17-18). Furthermore, 22.22% (42/189) of CRCs with LOH harbored LOH events within one gene, while 17.46% (33/189) harbored LOH events in two genes. Among the 189 LOH patients, the number of LOH events detected in each patient ranged from 1 (22.22%, 42/189) to 18 (0.53%, 1/189), with a mean of 4.81 LOH events (909/189) per patient (Table S19-20). These results suggest a complicated mutation spectrum in CRC patients.
We further found that both the number of genes with MSI and the number of loci with MSI were higher in non-adenocarcinoma patients than in adenocarcinoma patients (p=0.002; p=0.002, respectively) (Fig. S5A-B). Moreover, both the number of genes with MSI and the number of loci with MSI were higher in patients with colon cancer than in patients with rectal cancer (p=0.006; p=0.007, respectively) (Fig. S5C-D). However, no significant differences in the number of genes with LOH or the number of loci with LOH were found between either pair of groups. These findings suggest that the MSI frequency of tumor-related genes in colorectal cancer is associated with pathological type and tumor location.