DOI: https://doi.org/10.21203/rs.3.rs-1195512/v1
High-risk human papillomaviruses (HPV) are an important agent that cause head and neck squamous cell carcinomas (HNSCC). Besides viral induced cancers, tobacco use and heavy alcohol consumption are believed to cause DNA damage, promoting tumorigenesis. How gene expression and DNA methylation differ between HNSCC based on HPV status is yet unknown. We used publicly available gene expression and DNA methylation data from the Cancer Genome Atlas and compared HPV positive and HPV negative HNSCC. We used differential gene expression analysis, differential methylation analysis and a combination of these two analyses to identify differences. Differential expression analysis identified 1854 differentially expressed genes including PCNA, TNFRSF14, TRAF1, TRAF2, BCL2, and BIRC3. SYCP2 was identified as one of the top deregulated genes in the differential methylation analysis and in the combined differential expression and methylation analysis. Additionally, pathway and ontology analysis identified the extracellular matrix and receptor interaction pathway to be the most altered between HPV negative and HPV positive HNSCC groups. Combining gene expression and DNA methylation can help elucidate genes involved in HPV positive HNSCC tumorigenesis such as SYCP2 and TAF7L.
It has become evident that increasing number of head and neck squamous cell carcinomas (HNSCC) are a direct cause of Human Papillomavirus (HPV) infection. The clinical determination of whether a particular patient’s HNSCC is HPV positive, or HPV negative is crucial for determination of the subsequent treatment. In this study we explored the possibility of identifying a set of genes that would be indicative of having an HPV positive HNSCC and identify the site of the primary tumor. These genes’ altered function was tested based on two biological standards, the gene methylation state and the level of transcripts. We identified a set of genes that have the potential to identify HPV positive tumors and specify the anatomic location of the tumor.
Head and neck squamous cell carcinomas (HNSCC) are a group of cancers from anatomically distinct areas: oropharynx, larynx, hypopharynx, oral cavity, and tongue. HNSCC accounted for 700, 000 cancer cases worldwide in 2018, an alarming number [1, 2]. Etiologic agents identified as causes of HNSCC include alcohol consumption and tobacco use, and high-risk human papillomavirus (HPV) infection [3, 4]. Changes in sexual behavior have been found to be associated with higher HPV oral and oropharyngeal incidence and HPV is becoming increasingly indicated as one of the major HNSCC etiologic agents [5–9]. The predominant HPV genotype identified in HNSCC is HPV 16 (in as many as 90% of cases).
HPVs are non-enveloped, double stranded DNA viruses with a genome that includes six early expressed genes, and two late expressed genes [10]. Two of the early genes, E6 and E7, are characterized as oncogenes in cervical, oral, anal, and penile cases [11–13]. E6 interferes with cell survival pathways by targeting p53 for proteasome degradation, and E7 promotes cell proliferation interfering with the function of the Retinoblastoma protein (pRb) [11, 12].
Genes involved in cancer development and progression can affect cell proliferation, metastasis, and invasion [14]. As a result of HPV infection, pathways that control cytoskeletal rearrangement, immune response, extracellular matrix formation, and receptor activation are differentially altered [14–16]. These genetic changes sustained during carcinogenesis and viral oncogenesis are a result of changes in gene expression and transcriptome profile. HNSCC onset, progression, and outcome differ depending on the presence or absence of HPV. In HPV negative (HPVN) HNSCC patients, the tumor suppressor genes TP53 and p16, alongside with CCND1 oncogene are the most frequently identified mutated genes [17, 18]. In HPV positive (HPVP) HNSCC, the viral oncogenes E6 and E7 initiate deregulation by targeting p53 and pRb, respectively [11, 12]. Studies have begun describing epigenetic profile changes, specifically on the level of DNA methylation, and it has been reported that the methylation status in HNSCC patients is associated with HPV infection (i.e. positive versus negative) [19–22].
Recently, there has been a growth in interest and need for understanding the biological significance of HPVP and HPVN HNSCC. Many of these studies have utilized tools of the rapidly expanding field of bioinformatics [23–25].
We performed a meta-analysis of The Cancer Genome Atlas (TCGA) HPVP and HPVN HNSCC transcriptome and DNA methylome data [26, 27]. To our knowledge, this is the first time a study bridges these two data sets and compares groups based on the HPV status in HNSCC patient samples. Our findings show that pathways involved in viral invasion and trafficking, as well as immune system activation are differentially expressed in HPVP HNSCC. We identified that the differential expression of these pathways positively correlates with the differential methylation analysis.
This study demonstrates the ability of computational methods to identify biomarkers of potential clinical significance from a centralized resource of available datasets such as TCGA.
Data in this study were acquired through the publicly available database TCGA, NCI Genomic Data Commons (GDC) [26, 27]. We focused on HNSCC tumor data, and all data used in this study was open access (downloaded in 2019). We grouped the HNSCC patient samples in two experimental groups: 1) HPVP HNSCC, and 2) HPVN HNSCC. We were interested in comparing gene expression and methylation state of tumors in the absence or presence of HPV. The TCGA gene expression and DNA methylome data were extracted from RNA-seq studies of HNSCC, and from DNA methylation arrays, respectively. We also requested corresponding clinical data [27]. We used the clinical information to filter the samples into HPVP or HPVN HNSCC. We used two criteria to determine the presence of HPV: 1) the expression levels of p16 gene, a well-known tumor-suppressor gene indicative of high-risk HPV related cancers [28]; and 2) we used the in-situ hybridization information for p16 gene if expression information of p16 was not available. Using these criteria, we were able to acquire the information from 73 HPVN patients and 41 HPVP patients from the transcriptome studies, and 74 HPVN and 44 HPVP patients from the DNA methylome studies (detailed list of patients Supplementary Table 1). For the patients that we had RNA-seq data available, we performed the analysis on clinical status as well. To visualize clinical data, we used gplots, ggplot2, RColorBrewer and colorRamps Bioconductor packages [29–32]. TCGA data consisted of already mapped reads that were downloaded using Bioconductor’s package TCGAbiolinks for TCGA data handling. R (version 3.6.1) and RStudio software were used for all data analyses [33–38]. Figure 1 shows our overall workflow, with each section described in detail in the following sections.
We preprocessed and filtered the data according to the parameters of HPV status. Preprocessing makes the data as uniform as possible, rearranges, and enables it for the analysis software to handle it. We also normalized the data in order to be able to perform subsequent clustering steps. Data were filtered using TCGAbiolinks and xlsx packages, and used embedded functions TCGAanalyze_Preprocessing, TCGAanalyze Normalization and TCGAanalyze_Filtering [35, 39].
To investigate if clustering was as expected (HPVP vs HPVN HNSCC), Principal Component Analysis (PCA) and Hierarchical clustering with heatmaps using edgeR and gplots packages, and heatmap.2 function in R were performed [32, 40, 41]. For the PCA analysis, we used prcomp function already existent in R, and for the Hierarchical clustering with heatmaps we used edgeR package for R and gplots, ggplot2 and RColor Brewer libraries for data visualization throughout the analyses [29, 30, 32, 40]. For heatmap clustering, we followed a recommended online tutorial [42]. Using heatmap clustering we investigated the most variable transcripts as well as the genes that have the highest mean values across 114 patients, using it as a proxy for the most abundant transcripts.
In order to understand differential gene expression of the filtered data, a DGE analysis was performed using TCGAbiolinks TCGAanalyze_DEA function. We used an FDR cutoff=0.01 and logFC cutoff of point=1. To understand the nature of the extracted deregulated genes, we performed a pathway analysis using clusterProfiler Bioconductor package, and the function enrichKEGG, along with packages SummarizedExperiment, MultiAssayExperiment and genefilter [43–46]. To visualize the identified pathways, we used pathview Bioconductor package, and to visualize DEG in a volcano plot we used EnhancedVolcano Bioconductor package [47, 48]. We used PANTHER (Protein ANalysis THrough Evolutionary Relationships) and Enrichr, two comprehensive gene set enrichment analysis tools, to investigate the enriched pathways in the DEG dataset [49–51].
To analyze the DNA methylation patterns, we used TCGAbiolinks function TCGAanalyze_DMR, and used p-value cutoff=10−5 and \(\stackrel{-}{\beta }\)≥0.25. “\(\stackrel{-}{\beta }\)” is a parameter for differential methylation levels that ranges between 0 and 1, 0 being unmethylated and 1 being fully methylated.
To observe common patterns of gene silencing or overexpression, we combined the two datasets (DEG and DMR) using TCGAbiolinks TCGAvisualize_starburst function [35]. We used \(\stackrel{-}{\beta }\)≥0.25, FDRexpression≤10−5, FDRDNAmethylation≤10−5 and │logFC│≥1. We also tested the data with a more stringent parameter of │logFC│≥3, and decided to work with the former parameter, as the analysis demonstrated that the most prominent genes were filtered under both parameters.
To ensure that the data from the TCGA database were comparable, we first examined the clinical profile of the patients in both HPVP and HPVN patient groups. Patient age distribution showed that the median values were comparable in both HPVN and HPVP groups (HPVP = 58, HPVN = 59 years) (Figure 2A). Most patients fell into 56-65 years of age, and the collective (both for HPVP and HPVN groups) median age fell in this range as well (median = 58.5 years) (Figure 2B). We observed that there were no HPVP patients in the oldest patient category 76-85 years of age (Figure 2B). Sex distribution in our sample groups revealed that females (n = 21) were underrepresented in comparison to males (n = 93) (Figure 2C). As per TCGA’s classification in different race categories, race distribution showed that white race was by far the most represented one (n = 98) (Figure 2D). Anatomical site analysis of these HNSCC showed that there were apparent differences in location of tumor depending on the HPV status (Figures 2E, 2F and 2G). HPVP cancers were primarily found in the tonsil region and the base of the tongue, and HPVN cancers primarily in the oral tongue and the larynx (Figures 2F and 2G).
To explore clustering and similarity of samples, we analyzed the two experimental groups, HPVN and HPVP, by PCA and clustering on heatmap. PCA revealed that the two groups (detailed explanation in Material and Methods) clustered mostly as two separate and distinct groups with an overlapping middle area (Figure 3A, HPVN in blue, HPVP in pink).
Heatmaps (Figures 3B-E) revealed specific patterns:
To explore the impact of HPV on gene expression in HNSCC, we performed Differential Gene Expression Analysis (DGE) using TCGAbiolinks Bioconductor package for R (Material and Methods). Using FDR ≤ 0.01 and │logFC│ ≥ 1, with HPVN as baseline, DGE identified 1854 Differentially Expressed Genes (DEG), 941 downregulated and 913 upregulated in HPVP samples (Figure 4). Significant DEGs are in purple, and genes that are non-significant or significant by one of the parameters are in grey, blue, and green. Some of key representative DEGs are: PCNA, TNFRSF14, TRAF1, TRAF2, BCL2, and BIRC3.
To functionally explain the up- and down-regulated genes, we performed KEGG analysis [54]. In Table 1 are ten of the significantly enriched pathways (a full list of DEGs and KEGG pathways can be found in the Supplementary Tables 2 and 3, respectively) [47, 54]. The KEGG enriched pathways included those involved in ECM-interaction, cytokine production, cell cycle regulators, apoptosis, and genes identified as part of an HPV infection.
Pathway and ontology analysis was performed using the Enrichr and PANTHER classification systems (Table 2A and 2B) [49–51]. These tools identified similar pathways and patterns as KEGG (Tables 2 and 3, Supplementary Table 3). The top KEGG significantly enriched pathways (i.e., enrichment of genes) were consistent with HPV infection (Table 3). It is notable that transcription factors and genes involved in cell cycle progression were identified as upregulated. In contrast, genes involved in cellular response to stimulus, including chemotherapeutic agents and radiation were downregulated [55].
To explore epigenetic changes in HNSCC due to HPV, we focused on DNA methylation. We performed a Differential Methylation Analysis (DMA) using the following parameters: \(\stackrel{-}{\beta }\)≥0.25 and p≤10−5 that identified top hypo- and hyper-methylated regions of the genome and genes involved (Table 4, Supplementary Table 4). We compared the overall median methylation levels of our two groups of patient samples, HPVN and HPVP, and observed that their median values were comparable (\(\stackrel{-}{\beta }\)~ 0.46) (Figure 5A). DMA results are represented on a volcano plot - green represents hypomethylated and red hypermethylated regions in HPVP (HPVN samples used as baseline comparison) (Figure 5B).
3.5. Starburst: an analysis that bridges differentially expressed and methylated genes revealed similar patterns to DNA methylome analysis and potential biomarker gene for HPVP HNSCC
To identify common DEG and DMA genes, we performed a Starburst analysis [35]. This analysis identifies genes with similar DEG and DMR patterns (i.e. hypomethylated and upregulated and hypermethylated and downregulated), using following parameters: \(\stackrel{-}{\beta }\)≥0.25, FDRexpression≤10−5, FDRDNAmethylation≤10−5. Our analysis showed that a similar pattern was observed with │logFC│ being set to ≥ 1, and more stringent │logFC│ set to be ≥ 3. The pattern of DEG and DMR expression remained comparable with both parameters being used, and the top statistically significant DEG and DMR identified in both analyses were consistent (Figures 5C and 5D). We decided to proceed with │logFC│≥ 1 and depict some of the representative results (Table 5), and a complete list can be found in Supplementary Table 5.
HPV has been recognized as an important driver of HNSCC [23, 56, 57]. The patient treatment varies depending on HPV positive (HPVP) vs. negative (HPVN) HNSCC, thus it is important to gain further knowledge of the genetic profile of HNSCC. Our study showed that HPVP HNSCC patients exhibit gene deregulation at the gene transcription and methylation levels different from HPVN HNSCC. When analyzed, both independently and collectively, gene expression and methylation deregulation patterns specifically point out changes in gene pathways including those involved in controlling invasion, immune response, differentiation, and cell division.
In total, the cohort of patient’s samples analyzed were 114 (HPVN = 73 and HPVP = 41). There was a disparity in the male/female self-described sample ratio, male samples account for 93, and female samples the remaining 21 (Figure 2C). A possible explanation for this disparity might be that HNSCC cases are sex biased and more prevalent in males, but a larger cohort needs to be analyzed to address this disparity. There was also an overrepresentation of white race (n = 98) in our cohort for HNSCC (Figure 2D). This lack of racial representation is unfortunately not uncommon in clinical studies. We have since identified studies that report HNSCC incidence in non-white population, and a similar analysis will be conducted in the future in order to include more equally distributed races [58–61]. There was an apparent absence of HPVP HNSCC patients in the oldest patient category (76-85 years of age - Figure 2B), and we theorize that might be due to HPVP HNSCC being rarer than HPVN patients, thus causing this age groups’ underrepresentation. Alternatively, the HPVP HNSCC patients do not survive long enough to be included in the data [62, 63]. We observed differences in anatomical sites of HNSCC dependent on HPV status (Figures 2E, 2F and 2G). Tonsil was the predominant location in HPVP patients, while the oral tongue had the most cases in HPVN patients (Figures 2F and 2G). In the U.S., regardless of HPV status, the oral tongue is the most common site for HNSCC [58].
In our analysis, genes that play a role in all HNSCC development belonged to four main functional pathways: cell survival, cellular proliferation, squamous epithelial differentiation, and invasion/metastasis. We identified differentially expresses and methylated genes in HPVP versus HPVN HNSCC. Of the 1854 DEG, 16 genes were the top hits identified in the transcriptome and methylome analyses. The functions of these genes range from cell cycle, immune response, to cell death regulation. Specifically, we found that SYCP2 and TAF7L were the two most deregulated genes in both analyses. Synaptonemal Complex Protein 2 (SYCP2) was the top hypomethylated and upregulated gene in HPVP HNSCC. This gene is testis-specific human gene and it has been associated with impaired meiosis [64]. It is known that SYCP2 aberrant expression in HPVP cancers may contribute to the genomic instability induced by high-risk HPVs and subsequent oncogenic change [65]. In 2015, a paper by Masterson et al. reported that deregulation of SYCP2 predicts early-stage human papillomavirus-positive oropharyngeal carcinoma. The same authors concluded their study proposing SYCP2 as a potential biomarker [66]. In addition to this, an independent study showed SYCP2 was hypomethylated in HPVP HNSCC, which is in concordance with what we have discovered [19]. This might imply that the previously proposed biomarker function for SYCP2 is not unlikely. Besides these reports, the elevated expression of SYCP2 in HPV-associated tumors has previously been observed in three additional gene expression analysis studies [67–69]. Similarly, the second highlighted gene that was hypomethylated and upregulated in HPVP HNSCC is TATA-Box Binding Protein Associated Factor 7 Like (TAF7L), a gene involved in spermatogenesis [70]. According to a study by Mobasheri et al., TAF7L is upregulated in breast cancer, so it is possible that is not an exclusive feature observed only in breast cancer tissue [71].
DEG analysis identified that PCNA, TNFRSF14, TRAF1, TRAF2, BIRC3, and BCL2 were significantly altered in HPVP HNSCC.
Proliferating cell nuclear antigen (PCNA) is a gene that was significantly overexpressed in HPVP versus HPVN HNSCC patient samples. It has been shown that PCNA expression levels change during cell cycle, as PCNA is associated with proliferation and cell transformation in cancer [72, 73]. PCNA is one of the crucial regulators in cell cycle as it makes complexes with cell cycle activators (cyclins and cyclin dependent kinases) and inhibitors (p21) [72]. Post-translational modifications are crucial for PCNA function, so much so, that PCNA exists in an alternative methylated form in cancers [74].
Tumor necrosis factor receptor superfamily member 14 (TNFRSF14) is known to be a herpesvirus entry mediator by being a part of signal transduction pathways that activate inflammatory and inhibitory T-cell immune response [75]. It is not surprising to see that it was upregulated in HPVP HNSCC, although it is interesting that a herpesvirus related gene has been upregulated upon HPV infection in this cancer type. TNFRSF14 is known to interact with TNF Receptor Associated Factor 2 (TRAF2), that is also upregulated in HPVP HNSCC. This protein directly interacts with the TNF receptors, and forms a complex with another TRAF family member, TRAF1 that is also upregulated in HPVP HNSCC. This is all necessary for TNFα mediated activation of MAPK8/JNK and NF-kβ, which are known to be involved in cell survival. The protein complex formed by TRAF2 and TRAF1 interacts with the inhibitor-of-apoptosis proteins (IAPs), and functions as a mediator of the anti-apoptotic and pro-survival signals from TNF receptors. One of those IAPs that is upregulated in HPVP HNSCC is BIRC3 - apoptosis inhibitor [76–78]. According to The Human Protein Atlas (THPA), TRAF2 has the highest expression in HNSCC, followed by cervical cancer among all sampled cancer types (17 cancer types) [79]. BIRC3 shows similar observations, implying this pattern may be specific for HPV related HNSCC [79]. Another role of TRAF1 is a negative regulation of Toll-like receptor (TLR) and Nod-like receptor (NLR) signaling. TRAF1 can also, independently from TRAF2, contribute to NF-kβ activation; conversely, during TLR and NLR signaling, TRAF1 can also negatively regulate NF-kβ activation. According to THPA, TRAF1 has been found to be overexpressed in HNSCC. Additionally, TRAF1 can contribute to chronic viral infection and limit inflammation, contributing to survival of Epstein-Barr virus dependent cancers [76, 79]. TRAF family genes (TRAF1 and TRAF2 specifically) have been found to be differentially expressed in a couple of HPV related studies, including one in our lab [80, 81]. An interesting question follows: does TRAF1 have a similar role in HPV dependent cancers as well? To investigate this, more research is needed.
In addition to BIRC3 - apoptosis inhibitor being upregulated in HPVP HNSCC, BCL-2, an anti-apoptotic gene has been observed to be upregulated in HPVP HNSCC as well. An existing model explains the observed picture in our data. Similarly to oncogene addiction, some tumor cells may become dependent on BCL-2 for survival [82]. As tumor environment may induce higher stress signal production that is pro-apoptotic nature, a proportion of cancer cells manage to overexpress BCL-2 and survive producing this anti-apoptotic signal. In this way, BCL-2 helps cancer progression by promoting the survival of altered cells [83, 84]. BCL-2 is also known to be overexpressed in non-hematologic tumors as ovarian, neuroblastoma, colorectal and HNSCC [85–88].
Starburst analysis combined DEG and DMR results and highlighted genes that were the most hypomethylated and upregulated and the most hypermethylated and downregulated. We performed Starburst with FDR cut=1 and a more stringent parameter FDR cut=3 and maintained top highlighted gene profile (Supplementary Table 5, and Figures 5C and 5D), specifically SYCP2 and TAF7L. Taken together, some of the DEG identified as top hits may be used as potential biomarkers for early identification of HPVP HNSCC, including SYCP2, TAFL7 and ZFR2. The analysis of DEG of tonsil HPVP HNSCC and oral tongue HPVN HNSCC (predominant anatomical locations of samples), identified unique genes that were downregulated in HPVP tonsil HNSCC (Supplementary Table 6). One of these, RBM24 is shown to mediate repression of p53/TP53 mRNA translation and INHBA, a member of transforming growth factor-beta (TGF-β) superfamily of proteins. According to THPA, the highest expression of RBM24 is observed in HNSCC, followed by cervical cancer, although we have not seen its use as a diagnostic tool [79, 89]. This implies that these genes when downregulated might specifically indicate HPVP HNSCC site specific (tonsil) cancer development.
In conclusion, using TCGA transcriptome data enabled us to identify 1854 DEG and these DEG belong to a wide range of pathways including cell cycle, papillomavirus infection, transcriptional misregulation, TNF signaling, cytoskeletal rearrangement and apoptosis. Combining knowledge gained, both by genome wide transcriptome and DNA methylome data analysis, we identified potential players that might contribute to cancer development in HPVP HNSCC. In particular, SYCP2 and TAF7L, that have been shown in the past to be involved in cancer development. SYCP2 specifically attracts our attention, as it has been shown that deregulation of SYCP2 predicts early stage HPVP oropharyngeal carcinoma and it has been proposed to serve as a biomarker by other authors [66]. We also propose a potential panel of genes to serve for HPVP HNSCC detection and possible anatomical characterization. Screening for circulating tumor DNA from peripheral blood is low invasive and provides fast results, and we suggest screening for HPVP HNSCC using a panel including RBM24, INHBA, SYCP2, TAFL7, and ZFR2. This may serve as an informative tool for HNSCC HPVP screening, and even direct to the specific anatomical location.
HPV: Human Papillomavirus
HNSCC: head and neck squamous cell carcinoma
pRb: Retinoblastoma protein
PCNA: proliferating cell nuclear antigen
TNFRSF14: Tumor Necrosis Factor (TNF) Receptor Superfamily Member 14
TRAF1: TNF Receptor Associated Factor 1
TRAF2: TNF Receptor Associated Factor 2
BCL2: B-cell lymphoma 2 apoptosis regulator
BIRC3: Baculoviral Inhibitor of Apoptosis (IAP) Repeat Containing 3
SYCP2: Synaptonemal Complex Protein 2
TAF7L: TATA-Box Binding Protein Associated Factor 7 Like
E6: HPV early gene 6
E7: HPV early gene 7
TP53/p53: tumor suppressor gene/protein p53
HPVN: HPV negative HNSCC
HPVP: HPV positive HNSCC
p16: cyclin-dependent kinase inhibitor
CCND1: cyclin D1
TCGA: The Cancer Genome Atlas
GDC: NCI Genomic Data Commons
PCA: Principal Component Analysis
DGE: differential gene expression
DMA: differential methylation analysis
DMR: differential methylated regions
GPX2: Glutathione peroxidase 2
KRTDAP: Keratinocyte Differentiation Associated Protein
CRNN: Cornulin
PCA: Principal component analysis
KEGG: Kyoto Encyclopedia of Genes and Genomes
ECM: extracellular matrix
PANTHER: protein annotation through evolutionary relationship
THPA: The Human Protein Atlas
TLR: Toll-like receptor
NLR: Nod-like receptor
NF-kβ: nuclear factor kappa-light-chain-enhancer of activated B cells
ZFR2: Zinc Finger RNA Binding Protein 2
INHBA: Inhibin Subunit Beta A
Ethics approval and consent to participate. Institutional Review Board Statement: Not Applicable
Consent for publication: All data used in this study was open access and obtained from the publicly available NIH GDC legacy archive.
Availability of data and material: Data used in this study are open access and can be found on NIH GDC Legacy Archive under the following link https://portal.gdc.cancer.gov/legacy-archive/search/f .
Competing Interests: The authors declare no conflict of interest.
Funding: This research was funded by investigator university provided funding, no external funding.
Author Contributions: Conceptualization, S.S., P.I.M.; methodology, S.S, A.R.; formal analysis, S.S., A.R.; investigation, S.S, P.I.M.; data curation, S.S.; writing—original draft preparation, S.S.; writing—review and editing, S.S., N.V.A., L.S., S.C.P., P.I.M.; visualization, S.S.; supervision, P.I.M.; funding acquisition, P.I.M. All authors have read and agreed to the published version of the manuscript.
Acknowledgments: We thank the NIH/GDC.
Table 1 to 5 is available in the Supplemental Files section.