Two matrices were constructed, the first containing 199 experiments and the second 666. The rows amount to 33622, representing the studied genes as expression foldchanges. The adjust_matrix() function of the cola package was used to clean the matrices.
In the main matrix of 199 columns, 4 treatments were removed because more than 80% of their values were missing, leaving us 195 samples and 16127 genes filtered for analysis. Consensus partitioning recommended grouping the experiments into 5 classes, and alternatively into 2 or 4. Table 1 shows the 1-PAC, mean silhouette, and concordance evaluation indices for all classes with a maximum of 6. Having calculated the product of 1-PAC, concordance, and silhouette for all 3 suggested k, k = 4 classes was the highest k with a product ≥ 0.90 and is hence preferred. The cumulative distribution function (CDF) curves for a maximum k of 6 and the PCA plot for k equal to 4 are shown in Fig. 1a and Fig. 1b. Only treatments with silhouette scores ≥0.5 were retained in their corresponding classes, resulting in a new total of 193 analyzable experiments.
Table 2. Studied drugs' action on the glycolytic pathway
Action
|
Drug
|
Action
|
Drug
|
Akt
|
Afuresertib
|
Mitochondria
|
Buformin
|
GLUT
|
Silybin
|
Metformin
|
Glucose
|
No Glucose
|
Phenformin
|
HK2
|
2DG
|
mTORC
|
Everolimus
|
Genistein
|
MLN128
|
Resveratrol
|
PP242
|
KRAS
mutations
|
Selumetinib
|
BEZ235
|
Sorafenib
|
Rapamycin
|
Trametinib
|
Temsirolimus
|
|
PDK
|
DCA
|
Samples were not unequally distributed (p = 0.054), with class 1 having 61 experiments, class 2 having 54, class 3 with 38, and class 4 having 40 experiments. The treatments were then classified by molecular mechanisms and the goodness-of-fit test was applied to the frequency of experiments in each class. Table 2 lists the treatments used and their respective molecular mechanism as explained by Abdel-Wahab. Treatments acting on Hexokinase 2 (HK2) showed a significant preference for class 2, and those acting on variants of mTORC (1/2) demonstrated high affinity for class 1. Combination of treatments acting on the mitochondrion (e.g., biguanides) with those acting on HK2 or with glucose restricting, constituting a type of “pan-glycolysis” inhibition, had a preference for class 3. This classification in class 3 was also seen with glucose starvation, a true “pan-glycolysis” inhibition, which additionally showed an affinity for class 1. Nevertheless, it should be noted that drugs targeting mitochondrial function and the KRAS pathway, alone, did not show a predilection for any class, proving their wide range of activity. The results described are visualized in Figure 2.
Concerning cell type, Cluster 1 seems to house the majority of cells, comprising iCluster 15 (C15) and C24 cells, almost exclusively being SKCM (skin/uveal melanoma) and LAML (myeloid leukemia), respectively. Both iClusters are homogeneous, having mostly one type of cells, with C15 being enriched with UVB exposure and having the highest median mutation per megabase, and C24 having the lowest median mutation per megabase. Likewise, C27 (HNSC and CESC, head, neck, and cervical, both being squamous cell cancers) was primarily classified in Cluster 1 alongside Cluster 3, the same goes for C22, C7, C3, enriched with SARC (sarcomas, mostly bone). One type of cell is predominant in C27, which is also linked to HPV. The pathways related to squamous cells and proliferation are activated, and there is a relatively high level of hypoxia, signaling related to the immune system, and basal signaling. As for C22, C7, and C3, these iClusters were primarily mesenchymal, having most commonly a chr9 deletion, and were enriched for gene programs representing PD1, CTLA4, and JAK2/STAT1,3,6 signaling. As for Cluster 2, it was primarily enriched with C8 and C19, themselves known to cluster BRCA and UCEC (breast and uterus), with C8 equally and nearly solely finding itself in Cluster 4. C8 is also a homogeneous iCluster, dominated by a single tumor type, with POLE mutation being present in highly somatically mutated cells. Alongside C19, they both demonstrate hormonal and cell-of-origin dependency, with C19 expressing the estrogen-signaling gene program. Figure 3 summarizes these findings. Supplementary file S3 illustrates the analysis process described for drugs and tissue cluster distribution.
Signature genes were identified using a False Discovery Rate (FDR) threshold of <0.001. K-means clustering on the genes yielded 5 signature functions, 2 being biologically more meaningful than the rest. Functional analysis of the first group using Gene Ontology (GO) terms for biological properties resulted in 17 term clusters (Figure 4). Function A revolves around cell cycle inhibition, including (i) DNA replication, repair, as well as RNA splicing, and telomere maintenance, and (ii) mitotic cell cycle and regulation. Function B tackles adaptation and molecular quality control mechanisms describing (i) apoptosis and unfolded proteins response (UPR) and related (ii) RNA and protein transport and localization in cells, (iii) organ development (perception and learning), and (iv) ion homeostasis. The activity of these functions in each cluster relative to each other (greatest activity, lowest activity, or reference activity) is summarized in Table 3 and can be visualized in Figure 5.
Table 3. Function clusters to antiglycolytic agents
|
|
Cluster\Function
|
A: mitotic activity
& DNA repair
|
B: Quality control, UPR
& apoptosis
|
|
1
|
↓
|
–
|
|
2
|
↑
|
–
|
|
3
|
↓
|
↑
|
|
4
|
–
|
↓
|
|
The functions previously enumerated describe the expression of biological functions which are different between classes regardless of whether they are upregulated or downregulated in each experiment (FC significantly high). That said, the up/downregulated responses are left to be explored. Because our data were quantile normalized, an FC ≥ 0.5 or ≤ -0.5 were considered to define an upregulated or downregulated gene, respectively. Only Class 1 and 3 were able to be enriched, showing downregulation in mitotic activity and DNA replication, and Class 3 showed further upregulation in apoptotic activity and response to cellular stress, cellular differentiation, and immune activation of cells.
Additional consensus clustering was performed on 341 sorafenib-treated and 325 rapamycin-treated samples (from the accession of 1014 samples), each treatment having mainly 3 available doses given for 3 exposure times. Consensus partitioning of Rapamycin samples resulted in 3 classes and 4 Sorafenib classes. This classification controls for the drug used, leaving room for tissue-, dose-, and duration-based classification and interpretation. The classes obtained for both treatments simply shared the samples almost uniformly between them; this being true for tissues, and treatment duration and dose if tissue-dose and tissue-duration interactions were considered. Therefore, no hidden trends or categorization of these parameters could be inferred. It should be however noted that when not controlling for tissues, treatment duration, and dose showed a preference for some classes (P <0.01). The signature genes for the treatments were also classified and resulted in the same biological processes as described earlier, with organ development/differentiation being more vascular.
It was deemed unnecessary to combine the 666 experiments with the 195 and perform a clustering analysis because of the potential bias that could result from adding 300 samples treated with only 2 agents, to 195 samples treated with a total of 26 agents. In addition, partitioning that maintains constant the tissue variable and allows for a treatment-based study was also deemed impossible because of the limited number of samples per tissue.