The mutational landscape of both MHCC97L and HCCLM6 cell lines
To explore the relationship between MHCC97L and the HCCLM6 cell populations, we first examined their karyotypes to uncover their chromosomal abnormality. The conventional karyotype analysis revealed that MHCC97L showed the heterogeneity with various forms of abnormal morphologies of chromosomes 1, 4, 6 and16, which exhibits at least three types of karyotypes (Supplementary 1), whereas HCCLM6 cells are relatively homogenous in the karyotypic appearances. This phenomenon that HCCLM6 cell populations are less diversified than the MHCC97L cells suggested that HCCLM6 cells could originate from a given cell subclone within MHCC97L cell populations. In addition, multiplex-fluorescence in situ hybridization (M-FISH) analysis indicated that the chromosomal karyotypes of the MHCC97L and HCCLM6 cells exhibited the dramatic aneuploidy and morphology abnormities (Supplementary Figure 1). Generally, the karyotypes of both cell lines are composed of 53 and 55 chromosomes respectively, with similar chromosomal aberrations. Chromosomal translocations between chromosomes 1 and 21, 4 and 8, 9 and 7, 9 and 2, 14 and 22 were common in the two cell lines, whereas HCCLM6 cells showed extra translocations between 1 and 8, 16 and 3, as well as more numbers of chromosomes 7 and 16. This implied that HCCLM6 cells could undergo the featured genetic events during metastatic clonal evolution.
To test the hypothesis, we further performed whole genomic analysis on the two cell lines, including whole genome sequencing (WGS), whole exome sequencing (WES) and Affymetrix single-nucleotide polymorphism (SNP) genotyping as well as methylome profiling with Illumina Infinium HumanMethylation450 BeadChip and the transcriptome profiling with RNA sequencing (RNA-Seq), according to the workflow illustrated in Supplementary Figure 2. And then genomic aberrations including single nucleotide variant (SNV), copy number variation (CNV) and structural variant (SV) based on WGS, WES and SNP genotyping were called in both cell lines (see Methods).
Briefly, through SNV calling and filtering on the WES data, we obtained 335 overlapping mutations, as well as 165 and 160 mutations exclusively in MHCC97L and the HCCLM6 cells, respectively (Supplementary Table 1), of which 347 were nonsilent mutations. CNV data was obtained based on Affymetrix SNP 6.0 genotyping (Supplementary Table 2). Besides, based on the WGS data, whole genomic structural aberrations, including CNV (Supplementary Table 3) and genomic rearrangements (Supplementary Table 4), were obtained. The comparison of the two above CNV data showed that Pearson’s correlation coefficient is 0.91 (P < 2.2e-16) between them, indicating the high data reliability. Meanwhile, to supplement the information about the epigenetic status and transcriptional activities of the two cell lines, DNA methylation profiles and the gene expression profiles were yielded (Supplementary Tables 5 and 6). The genome-wide representation of these results depicted with Circos plots (Figure 1), in general, revealed that the two cell lines presented similar genomic aberrations and highly homologous characteristics, supporting that they share common ancestry. However, some distinctive features might drive the metastatic clonal evolution within HCCLM6 cell populations.
A minor subclone in the MHCC97L cells became dominant in the HCCLM6 cells
To study the heterogeneities and phylogenetic relationships in MHCC97L and HCCLM6, here we performed clonal analysis on the two cell lines based on the SNVs and CNVs (see Methods). The mutations were deconvoluted into 3 subclones (denoted as 1, 2 and 3) within the MHCC97L cells (Figure 2A), and 2 subclones (denoted as A, B) in the HCCLM6 cells (Figure 2A). Then we counted the mutations in each subclone. Interestingly, the minor clones, subclone 3 in the MHCC97L and subclone B in the HCCLM6, are mainly characterized with their specific mutations, whereas the shared mutations were expanded in subclone 1, 2 and 3 in the MHCC97L, and majority in subclone A in the HCCLM6 (Figure 2B). The subclonal structure of the cell lines complied with notion that the MHCC97L could be more heterogenous. In addition, the results indicated that the subclone A in the HCCLM6 could be originated from subclone 3 in the MHCC97L. To confirm the implication, next, we performed phylogenetic reconstruction based on the mutated alleles. It was indicated that the HCCLM6 are derived from the same node with subclone 3 in the MHCC97L (Figure 2C), which consolidate the notion that the ancestry minor subclone 3 in MHCC97L, when it first emerged, with all the MHCC97L and HCCLM6 shared mutations, and without the current MHCC97L unique mutations, is the ancestor of the subclone A in the HCCLM6 (Figure 2D). It was noticed that c.2156C>T in DMBT1 was implicated as early mutation in the evolutionary history as it was in subclone 1 and A respectively in the MHCC97L and HCCLM6 (Supplementary Table 1). The newly formed functional damaging nonsilent mutations in the MHCC97L includes those influencing DDX59, SETD5, CXCR6, CDK14, etc. (Supplementary Table 1). Meanwhile, newly formed functional damaging nonsilent mutations in the HCCLM6 includes those influencing EFHC2, BCORL1 etc. (Supplementary Table 1). The nonsilent mutations that were specifically expanded from the subclone 3 in the MHCC97L to the subclone A in the HCCLM6 included c.1061T>G in SMAD5 and c.1373A>G in RNF169 (Supplementary Table 1 and Supplementary Figure 3), both of which were predicted as functional damaging.
MHCC97L and HCCLM6 cells concurrently harbored amplification of CAV1, CAV2 and MET
Due to the HCCLM6 cells originating from the subclone of MHCC97L, the common genetic events shared by the two cell lines could be initial drivers responsible for metastasis.
Next, the unstable genome integrities of the two cell lines were evaluated by SV and CNV analyses. For SV, we observed 24 concordant inversions and 30 concordant translocations in the two cell lines. Moreover, the MHCC97L harbored 44 inversions and 15 translocations exclusively, whilst the HCCLM6 had 8 unique translocations and none unique inversions (Figure 1). Annotation of the SV events revealed that they did not flank gene coding regions.
The CNV analysis using SNP6.0 array revealed that the average autosomal ploidy for the MHCC97L and the HCCLM6 were respectively 2.03 and 2.06. The MHCC97L includes 121 CNV gain (average segment size 3.2 Mb) and 121 CNV loss (average segment size 1.6 Mb); the HCCLM6 includes 128 CNV gain (average segment size 3.1 Mb) and 145 CNV loss (average segment size 1.6 Mb). As shown in the global track of the genomic CNVs, the MHCC97L and HCCLM6 have undergone analogous copy number altering events (Figure 3A), with typical amplification on chromosome 7 (Figure 3B) and the deepest deletions on chromosome 8 and 2 (Figure 3C and 3D). In consistence with previous studies, we have observed chromosome 8p loss in the two cell lines, especially the region harboring the genes of LOC392196, USP17L7, ZNF705D, FAM66D and USP17L2 (Figure 3C). Chromosome 8p loss is widely studied and testified to associate with tumorigenesis and metastasis as it harbored several tumor suppressing genes. Interestingly, the 8P loss is slightly deeper in the HCCLM6 than in the MHCC97L, which is consistent, as well as most of the genome, in the WGS and Chip revealed results (Supplementary Figure 4A). In despite of the concordance, the both ends of chromosome 2, as well as the head of chromosome 3p, chromosome 15, are differed in the two cell lines in their CN states.
The genes with copy number larger than 6 or lower than 1 was listed in Supplementary Table 7, among which The chromosome 7q31.2 region flanking CAV1, CAV2 and MET was the mostly amplified region in the whole genome and therefore constitute the prominent peaks in the CNV profiles in the two cell lines (Figure 3B). MET amplification is a significant recurrent CNV event in the TCGA HCCs according to cBioportal. The proto-oncogene, have amplified along with CAV1 and CAV2 to reach more than 10-13 copies in the MHCC97L and HCCLM6. In contrast, the most deeply deleted gene in the two cell lines lied in the segment of Chr8:11968639-12017716, followed by Chr2:191833566-192049769, where tumor suppressor gene STAT1 is located (Figure 3D). We compared the gene expressions of these most amplified and deleted genes in the two cell lines, as well as in the normal liver tissue and the HepG2 cell line, which do not report such alterations. As shown in Supplementary Figure 4B, the dose effect of the gene copies seemed obvious. In contrary to the corresponding expressions in the normal liver and HepG2, CAV1, CAV2, MET were largely overexpressed in the two cell lines, especially in the HCCLM6, while STAT1 was utterly silenced (Supplementary Table 8). Amplification and overexpression of MET are reported to be associated with cell survival and invasiveness in multiple human cancers, especially HCC. Therefore, it is highly likely that MET amplification was a driver mutation that propelled the malignancy and metastasis in the early tumor phase of the MHCC97.
EMT associating genes were remarkably elevated in HCCLM6
To further investigate the dose effects of CNV events on gene expressions, we performed intra- and inter-cell comparisons. Firstly, for the intra- cell comparison of gene expressions, we assigned the genes into four groups according to their copy numbers and then observed a significant overall increase (P= 1.3e-5) of gene expressions along with the increment of gene copies in both cell lines (Supplementary Figure 4C), suggesting that increasing gene dose had generally elevated the transcription activities.
Next, for inter-cell comparisons, we compared the expressions of the genes that have different copy numbers in the MHCC97L and the HCCLM6. Interestingly, the group of genes that had more DNA copy numbers in the HCCLM6 had up-regulated expressions compared to their correspondents in the MHCC97L (Supplementary Figure 4D). In contrast, the genes that had concordant copy numbers were highly expressed in the MHCC97L (P=2.7e-4) and the genes that had less copy numbers in the HCCLM6 had no significant differences in their total expression distributions in both cell lines. The result showed that, in despite of dose effect of copy numbers, the global gene transcriptions in the HCCLM6 are more active.
Among the genes that had differed copy numbers between the two cell lines, the top ranked genes that had gained copy numbers and upregulated transcriptions in the HCCLM6 in comparison to the MHCC97L are CAV1, CAV2 and MET. Although these genes were amplified in both cell lines, the HCCLM6 had 2 more copies and had an elevation of more than 100-fold in the expressions. Previous studies reported that the Caveolins had significantly facilitate portal vein invasion of HCC. They can trigger the EMT process by up-regulating Vimentin and inhibit E-cadherin. Also, it increased secreted MMPs. Therefore, we examined the expression of these genes in the cell lines. It was found that N-cadherin, which helped the tumor cells to interact with stromal cells, is 23-fold higher expressed in the HCCLM6. In contrast, E-cadherin that guarantees tight junction is lowly expressed in both cell lines. Moreover, the secretory protease of MMP7 is 6-fold higher expressed in the HCCLM6. There is also implication that CAV1 and CAV2 favors metastasis via exome secretion pathway as they were in the exosome cargoes of metastatic HCCs.
In addition, in the substantial genes that are slightly differed between the two cell lines, we noticed that EMT transcription factor TWIST2 was among it; it had a shallow deletion in the MHCC97L but not in the HCCLM6. Then we compared the expressions of these three genes of the cell lines with the liver tissue and the HepG2. The trend of the expressions of the genes reflect to their malignancy states (Supplementary Table 8). Their expressions were elevated in the MHCC cells than in the liver and the HepG2. Besides, they were largely up-regulated in the HCCLM6 than in the MHCC9L.
DLEC1 is aberrantly methylated in the two cell lines
To investigate the DNA methylome of the two cell lines, we performed HK450 Array hybridization analysis. As illustrated in Supplementary Figure 5 and Supplementary Table 4, the probe intensities had similar distribution in the two cell lines. In addition, we observed that larger than 99 per cent of the probes had approximate intensities (differences within 0.1), which indicated that the two cells are similar in the landscapes of their genomic methylations. The two cell lines had concurrent high densities of methylation in the promoter region of tumor suppressor gene DLEC1. As shown in Figure 4A, the promoter region of DLEC1 is occupied with a CpG island and its N-terminus shore region, which are densely marked with high methylations intensities. Besides, the methylations in the HCCLM6 are slightly higher than in the MHCC97L. The DLEC1 expressions is down-regulated by 19- and 36- fold change in MHCC97L and HCCLM6 respectively than in the normal liver tissue.
In the purpose of screening differentially methylated genes, we chose the genes that had at least deviated methylated CpG sites that exhibited more than three tandem sites that had intensity differences larger than 0.3 in the promoter region (Supplementary Table 9). We selected the most heavily marked genes and aligned them with elevated expressions, among which top ranked genes are TP6V0E2, ZC3HAV1L, SGCE, NIPA1, YTHDC1, HUWE1, GRB10 and SNRPN that are relatively hypomethylated in the HCCLM6.Several encompassed genes are strong biological candidates for tumorigenesis and metastasis. Specifically, YTHDC1, HUWE1, VIM and GRB10, which are elevated with 27.5-, 8- and 7.6- fold change in the HCCLM6, are involved in mRNA splicing, tumor suppressor ubiquitination and tyrosine signaling respectively (Figure 4B).
Next, we examined the pathways of the cell lines via performing enrichment analysis on the transcription profiles. The result showed that the MHCC97L exceled in the chromatin package, cell cycle and proliferation pathways (Supplementary Figure 6). In contrast, the HCCLM6 transcription activities are more enriched in the protein folding, exome secretion processes, which is explicitly in consistence with the implications of the up-regulated expressions of CAV1 and CAV2.
HCCLM6 exceled via reshaping the clonal structure rather than obtaining novel alterations
It is worth mentioning that the most implicative metastatic favoring alterations are presented concordantly in the MHCC97L and HCCLM6. For instance, in addition to the well mentioned MET, CAV1, CAV2 amplification, the deletion of STAT1 implicates favors to metastasis as its high expression is reported to suppresses angiogenesis in human glioma cells. Moreover, the hypermethylated DLEC1, as a tumor suppressor, is reported to be deleted or aberrantly methylated in multiple cancer forms. All the superiority of HCCLM6 in metastasis are likely to be based on these same materials, merely with minor elevations in the damaging extents. Specifically, the further amplification of MET,CAV1, CAV2 in the HCCLM6 had resulted in larger gene doses and remarkable transcriptionally elevations. Additionally, the DLEC1 promoter methylation in the HCCLM6 is slightly lower and so is the expression. Moreover, the promoter region of HUWE1 is mediately methylated in the MHCC97L and become largely erased in the HCCLM6. HUWE1 encodes a protein that functions as an E3 ubiquitin ligase that helped degradation of the TP53 tumor suppressor, core histones, and DNA polymerase beta. Given the fact that clonal analysis revealed that the HCCLM6 is likely to be formed via expansion of an early minor subclone like that in the MHCC97L, these mentioned changes in the extents of damaging alterations suggested that the HCCLM6 may become aggressively metastatic via reshaping the clonal structure to push its evolution in the course towards the metastasis favoring path. To conclude, we proposed a schema chart to illustrate the relationships between the two cell lines with the aforementioned important mutations under the clonal expansion proposition (Supplementary Figure 7).