Optimization of the CanSel strategy
To optimize the model for in vivo selection, we compared the growth pattern of three cancer cell lines (B16-F10 melanoma, 4T1 breast cancer, and Lewis Lung Carcinoma, LLC) upon implantation into syngeneic skeletal muscle, which is highly permissive to AAV transduction (17, 18). B16-F10 and 4T1 cells formed a compact tumor mass, compressing, but not destroying, surrounding muscle fibers. In contrast, LLC cells exhibited an infiltrative pattern of growth, resulting in the progressive replacement of muscle fibers (Supplementary Fig. S1A). This progressive disappearance of transduced fibers would allow the selection of fibers expressing anti-invasive proteins and thereby resisting cancer invasion. Consequently, the viral genomes encoding these anti-invasive proteins were expected to be progressively enriched over time. Thus, the entire CanSel screening was performed using LLC cells. To optimize the timing of viral genome retrieval, we generated a pilot pool composed of 50 randomly chosen AAV vectors, including the positive control Semaphorin3A (Sema3A), which is known to inhibit cell invasion and tumor growth in vivo (19, 20). This pool was injected in the tibialis anterior muscle of five animals and, after seven days, a time sufficient to drive robust transgene expression by AAV vectors (21), three animals were injected into the same muscles with LLC cells and sacrificed after an additional two or ten days. As expected, LLC cells progressively replaced muscle fibers, each one containing a few AAV genomes (Supplementary Fig. S1B, C). As described above, any transgene inhibiting cancer cell invasion, thereby promoting fiber survival, was expected to be enriched. We evaluated barcode frequency, representing the frequency of each transgene in muscles injected with the AAV pool, either in the presence or in the absence of cancer cells, and expressed it as a ratio (Supplementary Fig. S1D).
At day ten, when cancer cells had invaded a massive portion of the injected muscles, five factors were enriched, including our positive control, Sema3A (Supplementary Table S1). Thus, this time point was chosen for the final CanSel experiment, using the whole AAV library.
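The enrichment ratio described above can be illustrated with a minimal sketch. It assumes that barcode frequency is the read count of a barcode divided by the total reads in the sample, and that the ratio is taken between muscles with and without cancer cells. The function names, the optional pseudocount, and the example read counts are hypothetical and are not taken from the study.

```python
def barcode_frequencies(counts):
    """Convert raw read counts per barcode into relative frequencies."""
    total = sum(counts.values())
    return {barcode: n / total for barcode, n in counts.items()}


def enrichment_ratios(tumor_counts, control_counts, pseudocount=1):
    """Ratio of barcode frequency with cancer cells to frequency without.

    The pseudocount is an assumption of this sketch (it avoids division by
    zero for barcodes missing from one sample); set it to 0 to disable it.
    """
    barcodes = set(tumor_counts) | set(control_counts)
    tumor = {b: tumor_counts.get(b, 0) + pseudocount for b in barcodes}
    control = {b: control_counts.get(b, 0) + pseudocount for b in barcodes}
    freq_tumor = barcode_frequencies(tumor)
    freq_control = barcode_frequencies(control)
    return {b: freq_tumor[b] / freq_control[b] for b in barcodes}


# Hypothetical read counts for three barcodes of a pilot pool.
tumor_counts = {"Sema3A": 300, "geneB": 100, "geneC": 100}
control_counts = {"Sema3A": 100, "geneB": 200, "geneC": 200}

ratios = enrichment_ratios(tumor_counts, control_counts, pseudocount=0)
for barcode, ratio in sorted(ratios.items(), key=lambda kv: -kv[1]):
    print(f"{barcode}: {ratio:.2f}")
```

A ratio above 1 indicates a barcode that is over-represented in the presence of cancer cells, which is the behavior expected for a transgene that protects fibers from invasion.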
In vivo selection identifies secreted proteins inhibiting cancer cell invasion
We prepared 21 pools of AAV9, each one composed of an equimolar amount of 50 vectors, carrying transgenes of comparable lengths, as validated by our previous studies (3). Each AAV pool was injected bilaterally into the tibialis anterior muscle. After seven days, three muscles per pool were injected with 5 × 10⁴ LLC cells and harvested after ten additional days for DNA extraction and identification of the relative abundance of each barcode by NGS, as schematically represented in Fig. 1A.
Cumulative barcode recovery was > 70% for 18 out of 21 pools (Supplementary Fig. S2A). We first analyzed pools individually and calculated the relative abundance of each transgene in its pool (Supplementary Table S2 and Supplementary Fig. S2B). All pools were then ranked together. As shown in Fig. 1B, some transgenes were lost, meaning that they favored cancer cell invasion (blue area), most transgenes were neither lost nor enriched, consistent with their neutral role on cell invasiveness (gray area), and 60 transgenes were enriched (avlog > 1SD, red area). The 10 top enriched transgenes were selected for further validation and included parathormone (PTH), Fibroblast Growth Factor 3 (FGF3), polypeptide N-acetylgalactosaminyltransferase 2 (GALNT2), EMID2, galectin 3 (LGALS3), lactalbumin alpha (LALBA), resistin-like beta (RTNLB), nyctalopin (NYX), Leukemia Inhibitory Factor Receptor (LIFR), and lipocalin 12 (LCN12).
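The three-way classification into lost, neutral, and enriched transgenes can be sketched as follows. The sketch assumes that "avlog" is the mean log2 enrichment ratio across replicate muscles and that the 1 SD threshold refers to the standard deviation of avlog values across all ranked transgenes, applied symmetrically around zero. Both points are assumptions, since the exact definition is not given in this section, and the gene names and values are hypothetical.

```python
import math
import statistics


def average_log_enrichment(ratios):
    """Mean log2 enrichment ratio across replicate muscles (assumed 'avlog')."""
    return statistics.fmean(math.log2(r) for r in ratios)


def classify(avlog_by_gene):
    """Label each transgene as 'enriched', 'lost', or 'neutral'.

    Assumption: the cut-off is +/- 1 sample standard deviation of the
    avlog values of all transgenes ranked together.
    """
    sd = statistics.stdev(list(avlog_by_gene.values()))
    labels = {}
    for gene, avlog in avlog_by_gene.items():
        if avlog > sd:
            labels[gene] = "enriched"
        elif avlog < -sd:
            labels[gene] = "lost"
        else:
            labels[gene] = "neutral"
    return labels


# Hypothetical avlog values for five transgenes.
avlog = {"geneA": 2.0, "geneB": 0.0, "geneC": 0.1, "geneD": -0.1, "geneE": -2.0}
for gene, label in classify(avlog).items():
    print(gene, label)
```

Ranking the transgenes by avlog and selecting the highest values among those labeled "enriched" would correspond to picking the top candidates for validation.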
Gene ontology analysis revealed that the most abundant categories among both enriched and lost factors were related to ECM remodeling, immunity, angiogenesis and cell migration, confirming our initial aim of selecting factors able to modulate ECM and extracellular space (Supplementary Fig. S3A, B and Supplementary Table S3). Next, we used a transwell invasion assay to further score the top 10 factors, using both LLC cells (Fig. 1C, D) and immortalized mouse aortic endothelial cells (iMAEC), as an additional cell type that populates most cancer types (Supplementary Fig. S3C, D). In both cases, the same factors (PTH, EMID2, FGF3, and LCN12) significantly inhibited cell migration. Thus, we selected these four factors for further in vivo validation.
EMID2 is the most effective protein in inhibiting cancer cell invasiveness in vivo
We produced AAV9 vectors for the individual expression of PTH, EMID2, FGF3, and LCN12 and injected them intramuscularly prior to implantation of LLC cells. As shown in Fig. 1E, F, EMID2 was the most potent factor in reducing muscle infiltration by cancer cells, identified by their positivity for the proliferation marker Ki67. We repeated the experiment using LG cells, which form more compact masses (Supplementary Fig. S1), for easier quantification of tumor size. In this experiment, we observed that LG cells expanded with a branching pattern of infiltration (Fig. 1G). At higher magnification, the invasive front was characterized by branches composed of cancer cells growing along collagen extensions and forming road-like structures, radiating perpendicularly from the tumor border toward surrounding tissues (Fig. 1H). This pattern was unaffected or even exacerbated by the presence of PTH, FGF3, and LCN12 (Fig. 1G, I). In contrast, EMID2 overexpression resulted in a more circular tumor mass, suggesting reduced invasiveness of cancer cells (Fig. 1G, I). This was further validated in a larger cohort of animals, in which we detected a significant reduction in tumor growth by EMID2 (Fig. 1J, K).
Inhibition of TGFβ maturation by EMID2 hampers CAF activation
To explore the mechanism by which EMID2 inhibits tumor invasion, we analyzed the structural domains of the protein. EMID2 belongs to the EMI domain endowed (EDEN) superfamily of proteins, which all contain a cysteine-rich sequence of approximately 80 amino acids, defined as the EMI domain (22). This domain is typically found at the N terminus of matricellular proteins that form multimers. Because the EMI domain of Emilin 1 was shown to inhibit TGFβ maturation in the context of hypertension (23), we assessed whether EMID2 could do the same. We transfected HEK293T cells with a plasmid expressing pro-TGFβ, either alone or in combination with a second plasmid expressing EMID2. As shown in Fig. 2A, B, pro-TGFβ was mostly detected inside the cells, while the mature, active form was exclusively present in the cell supernatant. Overexpression of EMID2 did not affect the levels of pro-TGFβ, but inhibited its maturation into the active isoform. Next, we wondered whether the same happened in vivo, in EMID2-treated tumors. Consistent with the in vitro data, homogenized EMID2-treated tumors contained lower levels of active TGFβ (shown and quantified in Fig. 2C, D).
Since TGFβ is a key player in stimulating cancer-associated fibroblasts (CAFs), we investigated whether EMID2 could inhibit their activation, using primary skin fibroblasts from COLL-EGFP/αSMA-RFP mice, which simultaneously express the enhanced green fluorescent protein (EGFP) and the red fluorescent protein (RFP) under the control of the collagen α1(I) and the α-smooth muscle actin (αSMA) promoter/enhancer respectively (14). When kept in culture for five days, these fibroblasts almost completely transdifferentiate into myofibroblasts, resulting in a shift from green to red fluorescence (24).
Administration of rEMID2 significantly reduced the number of RFP+ fibroblasts at three days after plating (Fig. 2E, F). The same effect was observed by co-culturing primary COLL-EGFP/αSMA-RFP fibroblasts with LG cells to enhance their differentiation into myofibroblasts (Supplementary Fig. S4A, B). We then assessed whether this inhibition also occurred in tumors in vivo, by injecting LG cells in COLL-EGFP/αSMA-RFP muscles transduced with either AAV9-control or AAV9-EMID2. The area covered by activated CAFs inside tumors was significantly smaller in muscles pre-injected with AAV9-EMID2 compared to those injected with AAV9-control (Fig. 2G, H). The same result was obtained by injecting LG cells in wild type muscles followed by staining with anti-αSMA antibodies (Supplementary Fig. S4C-E). The injection of AAV9-EMID2 in the absence of cancer cells did not induce any change in αSMA expression (Supplementary Fig. S4F).
CAFs are the major source of tumor ECM and a shift from a laminin- to a collagen- and fibronectin-rich environment is known to promote cancer cell invasiveness (25–27). By analyzing the content of these ECM proteins by immunofluorescence, we observed that both collagen I and fibronectin were reduced by the overexpression of EMID2 (Fig. 2I-L), while laminin was up-regulated to a level similar to that of normal muscle (Supplementary Fig. S5A, B). These data were confirmed by western blot of homogenized tumors implanted in muscles transduced with either AAV9-control or AAV9-EMID2 (Fig. 2M-P).
Thus, EMID2 inhibits TGFβ maturation and, consequently, CAF activation, resulting in a normalized ECM.
EMID2 reduces matrix stiffness and nuclear YAP localization
To characterize the effect of EMID2 on ECM mechanical properties, we added recombinant EMID2 (rEMID2) to a Matrigel layer and evaluated the gelation kinetics of Matrigel with a rheometer. The elastic modulus (G’) and the viscous modulus (G’’) of the samples were measured for 30 minutes to allow Matrigel polymerization. While the viscous modulus was only minimally increased by the presence of rEMID2 and remained constant during gelation, G’ showed a steep increase during the first minutes and further increased at later times (Fig. 3A). This is consistent with an interaction of EMID2 with Matrigel proteins and a consequent increase in the elastic component of the resulting gel. Next, we applied three consecutive sweeps of increasing strain. Upon the first sweep, resulting in 100% deformation, neither the control Matrigel nor the Matrigel with rEMID2 reached the stiffening limit and their G’ showed a comparable, linear response (Fig. 3B, first panel). This indicates that the gel resisted the applied forces without breaking. We then applied a second, more intense sweep, resulting in 1000% deformation. The EMID2-containing Matrigel showed a peak in G’ that was absent in the control, indicating higher resistance (Fig. 3B, second panel). Finally, we applied a third sweep, again resulting in 1000% deformation and acting on the previously deformed materials. The inclusion of EMID2 in the Matrigel again increased resistance, as shown by the higher level of strain at which G’ started to decline (46% in the case of control Matrigel and 92% in the presence of EMID2) (Fig. 3B, third panel). Overall, these rheological measurements confirmed that rEMID2 interacted with Matrigel components and resulted in increased elasticity at high strains. To assess whether EMID2-induced ECM elasticity altered cell adhesion and invasiveness, we cultured primary fibroblasts for nine days, to allow abundant ECM secretion (15), either in the presence or in the absence of rEMID2 (Fig. 3C).
We then seeded LG cells on decellularized matrices and stained them with phalloidin and anti-paxillin antibodies to label F-actin cytoskeleton and focal adhesions (FA), which are essential for cell migration (28). Exposure to EMID2 almost completely abrogated the formation of stress fibers, composed of lamellar F-actin bundles, and reduced the length of FAs, which appeared scattered along the whole cell membrane (Fig. 3D-F). We also seeded NIH-3T3 fibroblasts on the same matrices, as these cells are known to extend multiple, long filopodia prior to their movement (29). In this case too, the presence of EMID2 significantly reduced both the number and length of filopodia (Fig. 3G-I).
As changes in ECM mechanical properties and cytoskeletal actin organization are generally transduced by the Hippo pathway (30), we monitored YAP localization in LG cancer cells seeded on control and rEMID2-containing matrices and found that YAP was preferentially excluded from the nucleus in the presence of rEMID2 (Fig. 3J, K).
Thus, EMID2 significantly increases ECM elasticity, resulting in reduced cancer cell adhesion, filopodia formation and nuclear YAP localization.
EMID2 inhibits tumor growth and dissemination in clinically relevant animal models
We validated the anti-invasive effect of EMID2 in clinically relevant animal models of highly invasive and metastatic cancer types, such as lung and pancreatic cancer (31, 32).
First, we used an orthotopic model of lung cancer, by systemically injecting LG cells, which spontaneously colonize the lung, forming numerous neoplastic nodules (24). In this case, we used AAV6.2FF vectors, herein referred to as AAV6, due to the high tropism of this AAV serotype for the lung (33). Delivery of AAV6-EMID2 significantly increased EMID2 levels compared to endogenous expression (Supplementary Fig. S6A, B), and resulted in reduced number of tumor foci and their total extension (Fig. 4A-C).
Next, we moved to a model of pancreatic ductal adenocarcinoma. KPC cells were orthotopically implanted in the pancreas of C57BL/6 mice, together with AAV9-EMID2 or AAV9-control vectors, due to the tropism of AAV9 for the pancreas (34). We assessed KPC proliferation by immunostaining for Ki67 and found that the number of proliferating cancer cells, as well as the tumor size, was significantly reduced upon EMID2 overexpression in the pancreas (Fig. 4D-F). In addition, the delivery of AAV9-EMID2 to the pancreas inhibited cancer cell dissemination to the lung, resulting in decreased density of Ki67+ cancer cells in EMID2-treated animals compared to controls (Supplementary Fig. S7A, B).
Finally, we used a genetic model of pancreatic cancer in KC mice (13). In this case too, overexpression of EMID2 resulted in a significant reduction in overall pancreatic weight, paralleled by fewer Ki67+ cancer cells and primary tumor nodules (Fig. 4G-J). Consistent with our data on LG cells, overexpression of EMID2 resulted in reduced levels of YAP in pancreatic cancers (Fig. 4K, L). Even more evident was the effect on cancer dissemination to the lung, with a minimal number of metastatic, Ki67+ cancer cells detectable in the lungs of EMID2-treated animals (Supplementary Fig. S7C, D).
EMID2 is a positive prognostic marker in aggressive human cancer
To verify whether EMID2 also inhibited the migration of human pancreatic cancer cells, we transfected human fibroblasts with a plasmid encoding EMID2 and, after two days, added Panc1 cells, previously labeled with a fluorescent membrane dye. As shown in the scratch assay in Fig. 4M, N, EMID2 overexpression significantly inhibited Panc1 cell migration.
To further explore the relevance of EMID2 in human cancer, we interrogated The Cancer Genome Atlas (TCGA) to analyze the correlation between EMID2 expression levels and survival rates in multiple cohorts of patients diagnosed with aggressive cancer (https://www.nuffieldtrust.org.uk/resource/cancer-survival-rates). First, we focused on pancreatic and lung adenocarcinoma (PAAD and LUAD), to validate the protective effect of EMID2 observed in animal models. As shown in Fig. 4O, P, Kaplan-Meier curves show increased survival in patients with high expression of EMID2 in their tumor. Next, we extended our analysis to lung squamous cell carcinoma (LUASC), mesothelioma (MESO), glioblastoma and glioma (GBMLGG, LGG), liver hepatocarcinoma (LIHC), cholangiocarcinoma (CHOL), bladder cancer (BLCA), and esophageal cancer (ESCA). As shown in the forest plot of Fig. 4Q, the hazard ratio (HR) value for most of these tumors was < 1, indicating that high expression of EMID2 was associated with increased probability of survival. This association was significant for GBMLGG, LGG, PAAD, LIHC and BLCA (p-value < 0.05). The size of each square in Fig. 4Q reflects the weight of the significance, which was highest for glioma and glioblastoma. The protective effect of high EMID2 expression was most evident for GBMLGG, LGG, LIHC and BLCA, as shown in Supplementary Fig. S8.
Thus, high expression levels of EMID2 are associated with a better prognosis in most aggressive human cancers.
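For readers unfamiliar with the Kaplan-Meier curves referred to above, the following is a minimal sketch of the product-limit estimator in plain Python. It is an illustration only and not the pipeline used for the TCGA analysis. The criterion used to split patients into high and low EMID2 expression groups is not stated in this section, so the sketch takes one already-defined group as input, and the example follow-up times are hypothetical.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier (product-limit) estimate of the survival function.

    times  : follow-up time of each patient
    events : 1 if death was observed at that time, 0 if the patient was censored
    Returns a list of (time, survival probability) at each observed event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Both deaths and censored patients leave the risk set after t.
        at_risk -= removed
    return curve


# Hypothetical follow-up (months) for one expression group.
times = [1, 2, 3, 4]
events = [1, 0, 1, 1]
for t, s in kaplan_meier(times, events):
    print(f"t={t}: S(t)={s:.3f}")
```

Running the estimator separately on the high and low expression groups yields the two curves that are compared in a Kaplan-Meier plot. A hazard ratio below 1 for the high expression group, as reported in the forest plot, corresponds to a curve that stays above that of the low expression group.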