Overview of the Method
We used transcriptomic and phosphoproteomic data of five cancer cell lines treated with 89 perturbagens and the associated control treatment (Connectivity Map Project - CMap) to understand the drugs' signaling level differences and commonalities systematically. We obtained the upstream regulators - the set of transcription factors - of the significantly expressed genes for each cell line-drug pair from transcriptomic data. Additionally, we retrieved the targets of each drug from CMAP Drug Repurposing tool47, which combines the data from DrugBank, PubChem, and other drug-related databases. Finally, we merged the set of transcription factors, phosphoprotein hits, and drug targets to obtain the list of seed proteins of each cell line-drug pair for the network modeling. In Figure 1, we conceptually illustrated our integrative approach.
Although we started with 89 drugs and five cell lines, the number of proteins in the seed list is very low for some drugs, making the network modeling not feasible. We have also tested less stringent thresholds to enlarge the size of the seed list. However, we still could not overcome this issue for some drugs (i.e., decitabine, ginkgetin in A549 and YAPC, vemurafenib, and tacrolimus in MCF7 and PC3 cells). Therefore, we continued with 70 drug treatments on five cell lines. We calculated Tanimoto similarities and MACCS key distances of these 70 drugs (Supplementary Figure 1, Supplementary Note 1) to understand their chemical similarities and elucidated their mechanism of action (Supplementary Figure 2) to later use as a reference for the analysis.
To discover the hidden mechanisms underlying the effects of drugs, we integrated the seed proteins list (drug targets, transcription factors, and phosphoproteins) with the human interactome using Omics Integrator48 software. The edge weights in the human interactome (iRefWeb v13.0) represent the confidence of the interactions and are calculated by the MI-Score function. We additionally refined the human interactome with several approaches to decrease the impact of false positive and false negative interactions. Additionally, the link prediction followed by cellular localization filters enriched the interactome with the missing interactions (see Methods). Omics Integrator solves the prize-collecting Steiner Forest (PCSF) problem to reconstruct optimal subnetworks by integrating the list of seed proteins (terminal set) and a reference interactome. The performance of the network modeling approaches is highly dependent on the interactome, parameter tuning, and inclusion of known biological insights. Omics Integrator has a better performance compared to all pairs shortest paths, page rank, and heat diffusion approaches when rigorous parameter tuning is applied and multiple suboptimal solutions are merged. This modification increases the precision and recall of PCSF49. Additionally, PCSF can reconstruct a network from a single list of seed nodes whereas other powerful approaches such as PathLinker20 and ResponseNet50 require two sets of inputs. Because of its good performance among other methods requiring only a single seed list, Omics Integrator was selected as the main tool for network reconstruction. In this study, we merged the outputs of multiple parameter sets and refined the underlying interactome to reconstruct better networks. The terminal set size varies between 6 and 214 across the cell line-drug pairs, and the representation of phosphoproteins is relatively limited (max of 20 hits). Proteins have weights reflecting their importance in the terminal set based on their type. Transcription factors have the mean of their target gene expressions, phosphoproteins have the fold change compared to the control as the weight, and drug targets have a uniform weight inferred from the overall weight distribution of all proteins.
Overall, we constructed 236 subnetworks from the combination of 70 drugs and five cell lines. However, not all cell lines have the same number of subnetworks. Out of 236 networks, cell line A375 has 70, A549 has 46, MCF7 has 43, PC3 has 59, and YAPC has 18 drug-specific subnetworks (Supplementary Figure 3A). The topological characteristics of the drug networks can give clues about the perturbation or the connectivity of the modulated proteins. We summarized the topological properties of networks for each cell line (Supplementary Figure 4). We found that the final network size is dependent on the number of altered proteins/genes from multi-omic data rather than the centrality of the drug targets in the interactome. All topological analyses are present in the Supplementary Material (Supplementary Note 2, Supplementary Figure 5).
We initially quantified the node level overlap between network pairs where extensive overlap means similar network neighborhood. 98.4% of network pairs share at least one protein. Therefore, a comparison solely based on the overlapping nodes and node frequencies does not allow us to understand the comprehensive modulation of drugs (Supplementary Figure 3 and Supplementary Note 2).
The reconstructed networks preserve more detailed information about the drug effects beyond the common proteins and their association with cancer pathways. If two drug networks highly overlap, these drugs' action and phenotypic outcomes may be potentially similar. We applied the network-based separation approach, developed by Menche et al. 51, to reveal disease-disease relations on each drug network pair within and across cell lines to explore the network level separation. The separation score represents how close two networks are to each other, a topology-based comparison calculating the average shortest distances between the nodes in each network in the reference interactome (see Methods). The topological overlap of two drug networks in a cell line reflects their similarities at the pathway level. We found several overlapping subnetworks for drug pairs that do not have common target proteins or chemical structures, such as BIX-01338, entinostat, etoposide. The similarity between two networks represented by the separation scores is found statistically significant based on a hypergeometric test (Supplementary Data 1).
We also compared the application of the separation score method on reconstructed networks against the direct application on the seed proteins. Direct application of the method resulted in a distribution of separation scores mainly within the negative range. Because of the molecular heterogeneity, it is rarely expected for any drugs to modulate the same pathways in all cancer types. This situation implies false positives across cancer types. The difference between the two applications stems mainly from the intermediate (Steiner) nodes found via the network reconstruction method. These intermediate proteins can potentially reveal the off-target effects of drugs (Supplementary Figure 6 and Supplementary Note 3).
Moreover, the effect of link prediction applied on the interactomes used in network reconstruction is investigated to reveal whether the predicted edges cause any misleading results or not. We observed that predicted edges constitute very low percentages of total edge count in the networks, and they do not cause any spurious proteins to be included in the networks. On the other hand, the predicted edges can be considered as the noise introduced to the reference interactome. The very low level contribution of link prediction to the reconstructed networks show the robustness to this noise (Supplementary Note 4, Supplementary Table 1 and Supplementary Data 2).
Chemically and Functionally Different Drugs May Modulate Overlapping Networks
Transforming the networks into a matrix of separation scores (Supplementary Data 3) and clustering represents the overlaps of the networks as drug modules. We observed that the network overlap of some drugs is very high in specific cell lines such as A375, MCF7, and PC3, although their chemical similarity is limited. All-pair separation scores of drug modules in the A375 (skin melanoma) cell line are illustrated in Figure 2A with corresponding MoA similarities (T1=same MoA, T2=different MoA). We noticed that the chemically and functionally different drugs might modulate similar networks. Separation scores of network pairs in other cell lines can be found in Supplementary Figure 7. For example, we observed that tacedinaline and geldanamycin have overlapping networks in the A375 cell line, suggesting similar signaling output despite their different targets or mechanism of action (separation-score = -0.61, Figure 2B).
Tacedinaline is a selective HDAC1 inhibitor, while geldanamycin is a heat-shock protein (HSP) inhibitor targeting HSP90AB1 and HSP90AA1 proteins. Effects of HDAC inhibitor and HSP90 inhibitor drugs on HDACs and HSPs and the potential interplay between HDACs (especially HDAC6) and HSP90 has been highlighted in several studies52–58. In our analysis, two drugs share several downstream transcription factors (i.e., HIF1A, HSF1, CEBPA, CEBPB, AR, FOXO1/3) in the reconstructed networks. We then linked these drug targets to cancer phenotypes using CancerGeneNet59 which calculates the shortest paths between genes and phenotypes. We found that the downstream transcription factors of HDAC1 and HSP90s are shared on the path toward cell death and differentiation phenotypes (Figure 2B sub-panel). The intersection of tacedinaline and geldanamycin networks in A375 is enriched in several pathways such as MAPK, AMPK, and TGF-beta signaling pathways. Moreover, transcriptomic data of the two drugs are highly correlated. Despite the high overlap, they also have some disjoint pathways enriched in each specific network. Tacedinaline affects chemokine, neurotrophin, ErbB, TNF, FoxO, PPAR, prolactin, and thyroid hormone signaling pathways that are not present in the geldanamycin network (Figure 2C).
To systematically evaluate the performance of this method, we classified the drug pairs having the same MoA and similar resulting networks as true positives (TP) and drug pairs having different MoA and overlapping networks as false positives (FP). We defined drug pairs having the same MoA and separated networks as false negatives (FN) and drug pairs having different MoA and separated networks as true negatives (TN). We plotted the ROC curve and Precision-Recall curve using different threshold values (Supplementary Figure 8 and Supplementary Data 4). Because the data is imbalanced (200 positive cases - pairs with same MoA, 6017 negative cases - pairs with different MoA), the best performance is found based on Matthews Correlation Coefficient (MCC). The best performance is achieved when the separation score threshold is selected as -0.45. (precision=0.138, recall=0.400, accuracy=0.900, sensitivity=0.400, specificity=0.917, TPR= 0.400, FPR=0.083, F1-score=0.205, MCC=0.192). Same MoA and overlapping networks are expected, but the interesting cases are the drug pairs with different MoA and overlapping networks. We, therefore, further studied the literature to find evidence about mechanistic similarities among FP drug pairs. This evidence strengthens our claim that even drugs without the same MoA can perturb similar pathways (Supplementary Table 2). Despite being an FP based on the ground truth, the synergistic behavior of decitabine and entinostat was previously shown in the activation of FoxO160.
There are network pairs following the ground truth, i.e., drugs having the same MoA and overlapping networks. One example of drugs with the same MoA is sirolimus and OSI-027, which are selective mTOR inhibitors. Phase I clinical trials of OSI-027 were completed in 2013 for the investigation on patients with advanced solid tumors or lymphoma. It has activity on pancreatic ductal adenocarcinoma as an inhibitor of cell proliferation61. The networks of sirolimus and OSI-027 in A375 cells highly overlap (ss = -0.562). They have many common proteins and shared signaling pathways such as MAPK signaling, PI3K-Akt signaling, and cGMP-PKG signaling pathways. However, the OSI-027 network has specifically an enrichment of Hippo signaling, Jak-STAT signaling pathways, while sirolimus differs from OSI-027 with the enrichment of Wnt signaling, ErbB signaling pathways, and some immunity-related pathways such as Toll-like receptor signaling pathway (Supplementary Figure 9A-B).
Selumetinib and PD-0325901(Mirdametinib) are MEK inhibitors targeting MAP2K1. We showed a consistent similarity between two drugs for the same cancer types. The separation scores between two drugs reach up to -0.64 in YAPC (pancreatic cancer) (Supplementary Figure 9C-D). This similarity is a supportive result for the clinical studies. However, the network perturbation of these drugs may vary in different cancer types. For example, the separation score between PD-0325901 in the PC3 cell line and selumetinib in the YAPC cell line is 0.48. While the affected pathways are FoxO, VEGF, ErbB cAMP, Rap1, Ras, GnRH signaling pathways in PD-0325901 treated PC3, Jak-STAT, TGF-beta, PI3K-Akt, TNF, and Wnt signaling pathways are enriched in selumetinib treated YAPC (Supplementary Figure 9E-F).
Target selectivity of the drug pairs determines the network separation, despite having the same MoA.
The analysis of drug modulation in a cell line-specific manner helped determine a subset of drugs acting similarly in A375, MCF7, and PC3 (Supplementary Data 5). Afterward, we measured the separation of the drugs across different cell lines to discover if the same subset of drugs similarly acts in various tumor types. We compared the separation scores of drug pairs from 236 networks (Figure 3A) by dividing them into the same cell line and the different cell line groups. Next, we checked if two similar drugs in the same cell line act similarly in different cancer types. The results show that the network level overlap between drug pairs in the same cell line is preserved across different cell lines, but their separation is relatively higher. For example, resveratrol and geldanamycin are highly overlapping in the A375, MCF7, and PC3 cell lines (in A375, separation score=-0.60; in MCF7, separation score=-0.47; in PC3, separation score=-0.54), but their overlap is lower across cell lines such that separation score between A375-resveratrol and MCF7-geldanamycin is -0.40 and separation score between A375-resveratrol and PC3-geldanamycin is -0.43. Another aspect of this difference is the target selectivity of the drugs. Two examples of this phenomenon are the HDAC inhibitors and JAK inhibitors.
RGFP-966 is a slow-on/slow-off, competitive tight-binding selective HDAC3 inhibitor. Tacedinaline selectively targets HDAC1. Other HDAC inhibitors, Belinostat, Entinostat, Trichostatin-a, and Vorinostat, inhibit several HDACs. The network separations of these HDAC inhibitors across all cell lines vary based on the target selectivity of these drugs. We observed both overlapping and non-overlapping networks depending on the cell types. Belinostat, Trichostatin-a, and Vorinostat networks usually have some part overlapping in all cell types, as these three drugs are in the same structural group, and it is expected for them to behave similarly. Tacedinaline treatment activates similar mechanisms on A375, MCF7, and PC3 cells as separation scores between these are negative, but the network of Tacedinaline on YAPC cells is more distant than other cell networks. The separation score of Tacedinaline networks on A375 and YAPC cells is 0.38. YAPC-Tacedinaline network is a small network. It is not significantly enriched for any pathways other than Osteoclast differentiation which is recently shown to be induced in pancreatic cancer-derived exosomes 62. In contrast, the A375-Tacedinaline network is enriched for several critical signaling pathways such as MAPK, ErbB, FoxO, TGF-beta, and TNF signaling pathways (Figure 3B-C). However, RGFP-966 treatment on A375, MCF7, and PC3 results in distant networks both from each other and from the networks of other HDAC inhibitors. All pairwise separation scores of RGFP-966 networks are positive (MCF7-PC3:0.42, A375-PC3:0.28, and A375-MCF7:0.45), and the number of shared proteins is minimal. Selectivity of RGFP-966 on HDAC3 among all HDAC proteins results in the induction of various pathways in different cell types. When RGFP-966 is compared to Belinostat, Entinostat, Trichostatin-a, and Vorinostat, the most distant networks have separation scores around 0.4, and these networks generally belong to A375 and PC3 cells. For example, the PC3-RGFP-966 network is separated from A375-Trichostatin-a with a score of 0.46. Two networks are not commonly enriched in any pathway, while PC3-RGFP-966 is only enriched in cell cycle, but A375-Trichostatin-a is enriched in several pathways such as estrogen, neurotrophin, and TNF signaling pathways (Supplementary Figure 10A-B). As Tacedinaline is the other selective HDAC inhibitor, RGFP-966 and Tacedinaline treatments cause more distant PPI networks. RGFP-966 in PC3 and Tacedinaline in A375 have the most different networks with a separation score of 0.62. They only have three proteins in common, and there are no commonly affected pathways (Supplementary Figure 10C-D).
Among three JAK inhibitors, ruxolitinib, tofacitinib, and TG-101348 (fedratinib), only TG-101348 is known for its selectivity to JAK2 protein, and the rest targets JAK1, JAK2, JAK3, and TYK2 proteins with varying affinities63–68. According to our results, the same drug may modulate different subnetworks in different cell lines, despite a marginal overlap. Within the same cell lines, tofacitinib and ruxolitinib networks have the most distant network pairs with separation scores of 0.68 in A375 and 0.46 in PC3. In A375, they only share the target proteins, JAK1 and JAK2. The tofacitinib network differs from ruxolitinib with the enrichment of Jak-STAT, ErbB, PI3K-Akt signaling pathways. In PC3, they share two more proteins other than JAK1/2 while the ruxolitinib network is enriched in the AMPK signaling pathway, and the tofacitinib network is only enriched in the Jak-STAT signaling pathway. A relatively low separation score (-0.147) is obtained from networks of TG-101348- and tofacitinib-treated A549 cells. Two networks share 23 proteins besides the target proteins, and commonly induced pathways include Jak-STAT and Notch signaling pathways. However, TG-101348 and tofacitinib networks have a separation score of 0.4 in A375, and the Jak-STAT signaling pathway is commonly enriched while the A375-tofacitinib network differs from TG-101348 with ErbB, mTOR, PI3K-Akt signaling pathways (Supplementary Figure 11).
Topologically separated drug networks may guide the use of drug combinations
The most effective strategy for finding drug combinations is to modulate the orthogonal sets of proteins or pathways simultaneously. The modulated pathways should be close enough to influence the disease module commonly. That can be described as targeting a disease through multiple signaling pathways11,69. Therefore, networks play an essential role in identifying drug combinations. Cheng et al. showed the efficacy of drug combinations using drug targets and disease proteins by calculating their separation mapped on the interactome46. Given two drugs, if each drug module overlaps with a disease module and these two drug modules are topologically separated, this is defined as complementary exposure46. They demonstrated six classes of drug combination-disease interactions from FDA-approved or experimentally validated pairs and found one class correlates with therapeutic effects (complementary exposure). In our study, we found the intersection between our network models and the drug combinations provided in Cheng et al. Our reconstructed drug networks include many hidden proteins and modulated omic hits besides the drug targets. Therefore, the coverage of the networks is higher than the drug modules described in previous studies. In our analysis, we considered the disease module as the set of cancer driver genes (obtained from Cancer Genome Interpreter70). We defined a rule where if two drug networks have at least one cancer driver protein in common and each has a topologically disjoint region, then the combination of these drugs can be effective. (Figure 4A). We only have four drug pairs intersecting with Cheng et al., but when we consider them per cell type, we could analyze seven drug pairs in total.
The separation scores of seven cases vary between -0.09 and 0.675. The network pairs in this analysis share a limited number of nodes (min three, max 16 proteins). KEGG pathway enrichment of these drug networks shows that each drug modulates a different region of the interactome so that different signaling pathways are perturbed, and their combination may be effective (Supplementary Figure 12-13).
Considering these seven cases and the concept of complementary exposure as a guideline, we rated other drug pairs in our dataset with similar criteria. If two networks are separated (ss>0.0) and share less than two common enriched pathways and the networks are not in small size (number of nodes > 40), and at least one is a large network (number of nodes > 100), then these drug pairs have potential to be used in combination. Given these criteria, we found 1,441 potential drug combinations out of 6,217 possible cell line–drug network pairs. Some of these new drug pairs were previously supported in the literature for their use in combination therapy. The top-ranking drug pair was I-BET-762 and lenalidomide in the A375 cell line. Two networks are separated with a score of 0.693, and they do not share any modulated signaling pathways. Lenalidomide network is enriched in several essential pathways such as PI3K-Akt, ErbB, MAPK, Ras, Rap1, VEGF, and FoxO signaling pathways, while the I-BET-762 network is enriched in Jak-STAT signaling and transcriptional misregulation in cancer pathways (Figure 4B). Three proteins are shared between these networks. Among them, "BRAF" is a cancer driver protein. These networks also contain different disease-related proteins (Figure 4C). The combination of lenalidomide with CPI-203, a primary amide analog of (+)-JQ1 and having the same mechanism of action as I-BET-762, is shown to synergistically induce cell death in bortezomib-resistant mantle cell lymphoma71. Similar activities of BET inhibitors and lenalidomide are reported in different studies of multiple myeloma72,73.
Other examples within the top-ranking 100 predicted drug pairs are geldanamycin and tofacitinib (A375, ss= 0.656), trichostatin-a and tofacitinib (A375, ss= 0.562), and resveratrol and ginkgetin (PC3, ss= 0.53). HSP90 and JAK2 inhibition was shown to synergistically overcome resistance to JAK2-TKI in human myeloproliferative neoplasm cells74. Tofacitinib targets JAK proteins nonspecifically; however, selective JAK2 inhibitor TG-101348 was also predicted in combination with tofacitinib. Moreover, HDAC and JAK dual inhibition were previously studied to improve treatment strategies75–78. Resveratrol and ginkgetin combination was also shown to suppress VEGF-induced angiogenesis in colorectal cancer synergistically79.
The separation between networks of a drug across cell lines gives clues about their sensitivity levels.
Next, we associated the network models with the drug sensitivity of the cell lines. For this purpose, we collected drug sensitivity scores of cell lines deposited in the Genomics of Drug Sensitivity in Cancer (GDSC) Database80. The intersection of GDSC and CMap datasets results in five drugs that at least one cell line is significantly resistant or sensitive to. A375 is sensitive to CHIR-99021 and PD-0325901. CHIR-99021 has reconstructed networks in three cell lines (A375, MCF7, and PC3). PD-0325901 has reconstructed networks in four cell lines (A375, A549, PC3, and YAPC). Trichostatin-a has networks in two cell lines (MCF7 and A375), and MCF7 is significantly sensitive to Trichostatin-a. Thus, we can compare the networks to better understand the changes in sensitivity of cell lines to these drugs. PC3 is resistant to Staurosporine and YAPC is resistant to Dinaciclib. Since there may be several processes underlying the resistance of cells and we can not directly infer these mechanisms from our reconstructed networks, we focused only on the drugs that cell lines are sensitive to (CHIR-99021, PD-0325901, and Trichostatin-a).
If a drug has dissimilar network models across different cell lines, sensitivity to that drug may also vary (Figure 5A). Since Trichostatin-a(TSA) has highly overlapping networks across three cell lines, we could not observe this pattern for sensitive MCF7 across A375 and PC3. MCF7-TSA network differs from others with active TGF-beta signaling and cell cycle pathways, while A375 is differently enriched in MAPK signaling, neurotrophin signaling, estrogen signaling, TNF signaling pathways, and PC3 differs with Jak-STAT signaling and AMPK signaling pathways. TSA targets class I and class II HDACs. Treatment with HDAC inhibitors is reported to restore TGF-beta signaling in breast cancer81, so it is expected to observe that the TGF-beta signaling pathway is enriched in MCF7 cells (Figure 5B). The higher difference between z-scores implies the more separated networks in CHIR-99021 and PD-0325901 modulation. Therefore, analyzing the networks of these drugs and exploring enriched pathways may give clues about the resistance to these drugs. PD-0325901, to which A375 cells are sensitive, is a selective MAP2K1 (MEK1) inhibitor directly related to cell proliferation. Given the MEK is in the middle of the RAS/RAF/MEK/ERK pathway, it is on the upstream of several cellular mechanisms for cell proliferation and cell survival. In networks of two cell lines whose z-scores are close to the resistance threshold, YAPC and PC3, we observed several modulated pathways as expected. While the neurotrophin signaling pathway is enriched in all cells, PC3 cells differ with several pathways such as cGMP-PKG signaling, sphingolipid signaling, Rap1, and Ras signaling, VEGF signaling, Oxytocin signaling, FoxO signaling, and Fc epsilon RI signaling. Sphingolipid signaling is shown to be involved in the resistance of prostate cancer cell lines to antineoplastic drug treatment (z-score of PC3 is 1.95, very close to the resistance threshold)82 (Figure 5C).
We also analyzed all possible relationships between z-score differences and separation scores. We could find z-scores of 30 drugs and calculated pairwise z-score differences of those with cell line-drug networks and their associated separation scores. After filtering for pairs with one negative z-score and one positive z-score, and separation scores higher than -0.45, we performed a regression analysis and observed a moderately positive correlation such that separation score increases with increasing delta z-scores (R=.29, p-value=0.011) (Figure 5D).