LncRNA FAM83H-AS1 Amplification is Associated With a Poor Prognosis in Lung Adenocarcinoma and Can Serve as a Therapeutic Target

Background: Few long noncoding RNAs (lncRNAs) have been identified and investigated as oncogenic drivers. Identifying noncoding drivers provides potential strategies for novel interventions in lung adenocarcinoma (LUAD). Methods: We constructed a machine learning model for driver gene annotation using pan-cancer and clinical prognosis data from OncoKB and TCGA to predict potential oncogenic drivers of lncRNAs; then, we used zebrafish models to validate the biological function of candidate targets. The full length of FAM83H-AS1 was obtained by a rapid amplification of cDNA ends (RACE) assay. RNA pull-down, RNA immunoprecipitation (RIP), quantitative mass spectrometry (QMS) and RNA sequencing (RNA-Seq) assays were used to explore the potential mechanisms. Additionally, we used a CRISPR interference (CRISPRi) system and a patient-derived tumor xenograft (PDTX) model to evaluate the therapeutic potential of targeting FAM83H-AS1 in vivo. Results: The results suggested that FAM83H-AS1 was a potential oncogenic driver from the chromosome 8q24 amplicon; increased expression of FAM83H-AS1 was associated with poor prognosis for LUAD patients in both the JSCH and TCGA cohorts. Functional assays revealed that FAM83H-AS1 promotes malignant progression and inhibits apoptosis. Mechanistically, FAM83H-AS1 binds with HNRNPK to enhance the translation of the oncogenes RAB8B and RAB14. Experiments using CRISPRi-mediated xenografts and PDTX models indicated that targeting FAM83H-AS1 inhibited LUAD progression in vivo. Conclusions: Our work demonstrated that FAM83H-AS1 is a potential oncogenic driver that inhibits apoptosis in LUAD via the FAM83H-AS1-HNRNPK-RAB8B/RAB14 axis.
Importantly, we suggest targeting of FAM83H-AS1 as ...
Abbreviations: The Cancer Genome Atlas, TCGA; Gene Expression Omnibus, GEO; protein-coding genes, PCGs; tissue microarray, TMA; Jiangsu Cancer Hospital, JSCH; chromogenic in situ hybridization, CISH; human bronchial epithelial cell, HBE; internal ribosome entry segment, IRES; liquid chromatography mass spectrometry, LC-MS; small guide RNAs, sgRNA; transcription start site, TSS; TdT-mediated dUTP nick end labeling, TUNEL; untranslated regions, UTR.
Figure 3 (legend fragment): ... poor survival in TCGA cohort; l qRT-PCR results demonstrated a high correlation between gene expression and amplification of FAM83H-AS1 in LUAD tissues. *, P < 0.05.
Figure 6 (legend fragment): ... fold change after silencing FAM83H-AS1; b Common upregulated and downregulated genes in the RNA-Seq and QMS results; c Significantly differentially expressed genes harboring HNRNPK-specific motifs in their 5'UTRs were identified by RNA-Seq and QMS results.
P values were determined by Fisher’s exact test; d The top differentially expressed genes identified by QMS and the corresponding mRNA changes in RNA-Seq; e Predicted IRES sites and identified HNRNPK motifs in the 5'UTRs of RAB8B and RAB14; f RIP evaluation of the interaction between HNRNPK and mRNAs of RAB8B and RAB14 using an anti-HNRNPK antibody as described above; g Dual luciferase reporter assays showed that HNRNPK directly binds to the 5'UTRs of RAB8B and RAB14 and activates luciferase activity; h Effect of the translation inhibitor cycloheximide on the HNRNPK overexpression-induced increase in the protein levels of RAB8B and RAB14 in FAM83H-AS1 knockdown A549 cells; i The silencing of FAM83H-AS1 decreased the expression of RAB8B and RAB14, but overexpressing FAM83H-AS1 increased their expression; j The colony formation assay results suggested an oncogenic function for RAB8B and RAB14 in A549 cells; k Colony formation assays suggested that HNRNPK knockdown partially abolished the effects of FAM83H-AS1; l.m HNRNPK knockdown abolished the effects of FAM83H-AS1 on apoptosis, as indicated by flow cytometry.


Background
Identifying cancer driver genes is essential for precision oncology. Somatic mutations in driver genes have been revealed across multiple types of cancers [1,2], and a number of these driver genetic alterations have become therapeutic targets or prognostic markers [3]. Recently, somatic copy number alterations (SCNAs) have been found to affect a larger fraction of cancer genomes than any other type of somatic genetic alteration, and these frequently altered genomic regions have critical roles in activating and inactivating oncogenic pathways [4]. In lung cancer, the TracerX program demonstrated that a high frequency of SCNAs, but not somatic mutations, is significantly correlated with a poor survival rate [5]. Therefore, among the considerable number of genes located in SCNA regions, novel cancer drivers should be further investigated.
Long noncoding RNAs (lncRNAs) play critical roles in cancer development, and the expression levels of lncRNAs are closely associated with oncogenic functions. Several oncogenic lncRNAs, such as FAL1 and PRAL, have been found to be regulated by SCNAs [6,7]. According to expression profiles of matched clinical and SCNA data, oncogenic drivers of lncRNAs can be distinguished from passengers by mathematical methods [8]. However, few lncRNAs have been identified as oncogenic drivers in lung adenocarcinoma (LUAD), which is the leading cause of cancer-associated deaths worldwide and accounts for nearly 40% of all lung cancer cases [9]. Therefore, systematic exploration and identification of noncoding drivers of LUAD is warranted.
The expression levels of oncogenic drivers of lncRNAs are thought to be regulated by the corresponding genomic alterations [8]. In addition, multidimensional data, including clinical prognosis data and gene expression and SCNA data, could increase the ability to detect potential oncogenic drivers. Machine learning is well suited to constructing classifiers that identify oncogenic drivers [10]. Based on genomic data, a machine learning method was demonstrated to have clear advantages in estimating prognostic signatures in transcription and methylation data [11-13]. Using in vivo and in vitro functional assays, the performance of machine learning algorithms could be assessed, and the identified oncogenic drivers could also be validated.
In the current study, we extracted SCNA data, gene expression data and clinical prognosis data from The Cancer Genome Atlas (TCGA) LUAD population. Using a decision tree machine learning method, several potential oncogenic drivers of lncRNAs were identified. We further characterized the lncRNA FAM83H-AS1, which is highly expressed in LUAD tumor tissues and is associated with frequent 8q24 amplification and poor prognosis. The characteristics of FAM83H-AS1 were subsequently validated in independent cohorts and with additional public datasets. Experimental investigation revealed that FAM83H-AS1 could bind with heterogeneous nuclear ribonucleoprotein K (HNRNPK) and increase the protein levels of RAB8B and RAB14, thus suppressing apoptosis and promoting tumorigenesis of LUAD cells. Importantly, targeting FAM83H-AS1 significantly reduced LUAD growth in a patient-derived tumor xenograft (PDTX) model.

Materials And Methods
Identification of differentially expressed lncRNAs
RNA sequencing data from the TCGA LUAD dataset were downloaded from the data portal (https://portal.gdc.cancer.gov) for 585 LUAD patients, including 56 normal lung tissue samples. The R package DESeq2 [14] was applied to HTSeq count data, and it detected 7320 differentially expressed genes (P < 0.01 and fold change > 2.0) among 60483 genes. According to the "Gene_type" annotation in the Ensembl genes database, 596 lncRNAs were screened from the differentially expressed genes. Two independent datasets, GSE74095 and GSE12236, were obtained from Gene Expression Omnibus (GEO) (https://www.ncbi.nlm.nih.gov/gds) for use.
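The screening thresholds above can be illustrated with a minimal sketch. It is not the authors' DESeq2 pipeline: the gene names, the `gene_type` labels and the choice to count changes in either direction are assumptions made for illustration, and the text does not state whether P was adjusted.

```python
import math

def is_differentially_expressed(p_value, fold_change, p_cut=0.01, fc_cut=2.0):
    """Apply the thresholds quoted in the text (P < 0.01, fold change > 2.0).

    Assumption: a fold change below 1/fc_cut (downregulation) also counts,
    so the test is done on the absolute log2 fold change.
    """
    return p_value < p_cut and abs(math.log2(fold_change)) > math.log2(fc_cut)

# Hypothetical result rows; names and values are illustrative only.
rows = [
    {"gene": "geneA", "gene_type": "lncRNA", "p": 0.0004, "fc": 8.0},
    {"gene": "geneB", "gene_type": "protein_coding", "p": 0.002, "fc": 0.25},
    {"gene": "geneC", "gene_type": "lncRNA", "p": 0.2, "fc": 3.0},
    {"gene": "geneD", "gene_type": "lncRNA", "p": 0.001, "fc": 1.5},
]

de_genes = [r for r in rows if is_differentially_expressed(r["p"], r["fc"])]
de_lncrnas = [r["gene"] for r in de_genes if r["gene_type"] == "lncRNA"]
```

In this toy input, geneC fails the P cut-off and geneD fails the fold-change cut-off, so only geneA remains after the lncRNA filter.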
Driver gene annotation and machine learning classifier
The oncogenic annotations of protein-coding genes (PCGs) were obtained from the OncoKB database [3]. Among 290 PCGs annotated with oncogenic genomic variation, we classified PCGs according to whether or not they exhibited amplification to construct the decision tree model. We used the J48 decision tree function in the Weka package [15] to construct a pruned decision tree; we used the feature matrix as input and oncogenic driver annotation as the class variable. Subsequently, LUAD-upregulated lncRNAs were classified by the decision tree to discover candidate oncogenic drivers. In detail, total SCNA and focal SCNA profiles were both included, and Pearson's correlation analyses were performed between SCNA and expression profiles.
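As an illustration of the correlation feature mentioned above, the following sketch computes Pearson's correlation coefficient between a per-sample SCNA vector and an expression vector for one gene. The function name and the toy vectors are hypothetical; the study's actual feature matrix is not described in enough detail to reproduce here.

```python
import math

def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return sxy / math.sqrt(sxx * syy)

# Toy per-sample values for a single gene (illustrative only).
scna = [0.0, 0.1, 0.4, 0.8]
expression = [1.0, 1.2, 1.8, 2.6]
r = pearson_r(scna, expression)
```

Here expression rises linearly with copy number, so `r` is approximately 1; a gene whose expression is driven by amplification would be expected to show a strongly positive value.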

Tissue samples and microarrays
All primary LUAD tissues and adjacent normal tissues were collected from patients who had undergone surgery at the Department of Thoracic Surgery, The Affiliated Cancer Hospital of Nanjing Medical University (Jiangsu Cancer Hospital, Nanjing, China). All included tissue samples were confirmed by experienced pathologists, and the study was conducted in accordance with the International Ethical Guidelines for Biomedical Research Involving Human Subjects. Written informed consent was obtained from all patients. This study was approved by the Ethics Committee of the Affiliated Cancer Hospital of Nanjing Medical University. A tissue microarray (TMA) was constructed as described previously [16]. Sixty-eight pairs of lung cancer tissues and adjacent normal tissues from the Jiangsu Cancer Hospital (JSCH) cohort were used to construct the TMA. RNA chromogenic in situ hybridization (CISH) was performed to detect FAM83H-AS1 expression in the TMA using a digoxigenin-labeled probe (C10910 lnc1100151, RiboBio). According to the percentages of positively stained cancer cells and areas, the CISH score was rated on a scale of one to twelve as described previously [16]. The characteristics and prognostic information of the patients included in this study were obtained from the follow-up team of Jiangsu Cancer Hospital.
Cell proliferation was examined using a CCK-8 Kit (Roche Applied Science) and a real-time xCELLigence analysis system (RTCA) following the protocol provided by the manufacturer (ACEA Biosciences). Colony formation assays were performed to monitor the clonogenic capability of LUAD cells. A flow cytometer (FACScan; BD Biosciences) equipped with CellQuest software (BD Biosciences) was used to detect the level of apoptosis.
RNA extraction, genomic DNA extraction, western blot and qRT-PCR analysis, and nuclear and cytoplasmic fraction extraction
RNA extraction, DNA extraction, and qRT-PCR were performed as described previously [16]. GAPDH, β-Actin and snRNA U6 were used as internal controls. All primer sequences are listed in Additional file 1: Table S1. Protein was extracted from transfected cells and quantified as previously described [17] using 12% or 4%-20% polyacrylamide gradient SDS gels. All antibodies are listed in Additional file 1: Table S2. RNA and protein were isolated from nuclear and cytoplasmic fractions using the PARIS Kit according to the manufacturer's protocol (Ambion, Life Technologies).

SiRNA and plasmid construction and cell transfection
The siRNAs were provided by Realgene Biotechnology (Nanjing, China). The full-length cDNA of human FAM83H-AS1 was synthesized and cloned into the expression vector pCDNA3.1 by Vigene Bioscience (Jinan, China). The final construct was verified by sequencing. Transfection of siRNA and plasmid vectors was performed as described previously [16]. All siRNA sequences used are listed in Additional file 1: Table S3.
RACE (rapid amplification of cDNA ends)
5′-RACE, 3′-RACE, and full-length amplification of FAM83H-AS1 were performed using a SMART RACE cDNA Amplification Kit (Clontech) according to the manufacturer's instructions. The gene-specific primers used for RACE analysis are presented in Additional file 1: Table S1.
RNA immunoprecipitation and pull-down assays
RNA immunoprecipitation was performed as described previously [16], and magnetic beads were conjugated with anti-HNRNPK or control anti-IgG antibody. In vitro transcription assays were performed using the mMESSAGE mMACHINE T7 Transcription Kit (Invitrogen) according to the manufacturer's instructions. Then, FAM83H-AS1 RNAs were labeled with desthiobiotin using the Pierce RNA 3′ End Desthiobiotinylation Kit (Magnetic RNA-Protein Pull-Down Kit, Components; Thermo Fisher). RNA pull-down assays were performed with the Magnetic RNA-Protein Pull-Down Kit according to the manufacturer's instructions. After elution, the lncRNA-interacting proteins were subjected to mass spectrometric analysis. Liquid chromatography mass spectrometry (LC-MS) experiments were performed with a linear ion trap quadrupole mass spectrometer (Thermo Finnigan) equipped with a micro-spray source.

Luciferase reporter assays
The mRNA internal ribosome entry segments (IRESs) of RAB8B and RAB14 were predicted by IRESite (http://iresite.org). The HNRNPK-binding sites of RAB8B and RAB14 mRNA were identified by the Blast program. The sequences of different fragments were synthesized and then inserted into the pGL3-basic vector (Vigene Bioscience). All constructs were verified by sequencing, and luciferase activity was assessed using the Dual Luciferase Assay Kit (Promega) according to the manufacturer's instructions.
RNA sequencing and quantitative mass spectrometry
A549 cells were plated in a 6-well plate and transfected with an siRNA targeting FAM83H-AS1 or a negative control. Twenty-four hours after transfection, cells were harvested for RNA extraction and subsequent library construction and sequencing (CapitalBio Technology, Beijing, China). Similarly, cells were harvested for protein extraction and subsequent iTRAQ (Isobaric Tag for Relative Absolute Quantitation)/TMT (Tandem Mass Tags) detection (PTM Bio, Hangzhou, China).

CRISPR interference (CRISPRi)-mediated generation of FAM83H-AS1 knockdown LUAD cells
For the CRISPRi experiments, six paired small guide RNAs (sgRNAs) were designed to target regions near the transcription start site (TSS) of FAM83H-AS1 (within 250 bp upstream and downstream). The location of the TSS was determined using NCBI (http://www.ncbi.nlm.nih.gov/). The sgRNA oligos were designed, phosphorylated, annealed, and cloned into a pBHCas-ZXS 023 vector using a BsmBI ligation strategy. Additional details and a list of the sgRNA sequences can be found in Additional file 1: Table S1.
In vivo tumor growth assays, tumor engraftment, and PDTX maintenance
All animal experiments were approved by the Nanjing Medical Experimental Animal Care Commission.
The zebrafish tumor model was constructed according to a previous study [18]. In brief, 4 × 10² A549 cells from the control or silenced group were labeled with CellTracker CM-DiI (Invitrogen), and zebrafish embryos were monitored for 96 hours to investigate tumor invasion and metastasis using a fluorescence microscope. BALB/c nude mice (4 to 6 weeks old), purchased from the Vital River Laboratory Animal Technology (Beijing, China), were maintained under specific pathogen-free conditions. For the tumor formation assay, 10⁶ CRISPRi-constructed or control cells were subcutaneously injected into one flank of each mouse. Tumor volume was calculated using the following equation: V = 0.5 × D × d² (V, volume; D, longitudinal diameter; d, transverse diameter). The method of building the PDTX model has been described in a previous study [16].
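The tumor volume equation above, V = 0.5 × D × d², can be written as a small helper. The function name and the example measurements are illustrative, and the units are an assumption (the text does not state them); diameters in millimetres would give a volume in cubic millimetres.

```python
def tumor_volume(longitudinal_diameter, transverse_diameter):
    """Return V = 0.5 * D * d**2, the equation given in the text."""
    if longitudinal_diameter < 0 or transverse_diameter < 0:
        raise ValueError("diameters must be non-negative")
    return 0.5 * longitudinal_diameter * transverse_diameter ** 2

# Illustrative measurement: D = 10, d = 8 (units assumed to be mm).
example_volume = tumor_volume(10.0, 8.0)  # 0.5 * 10 * 64 = 320.0
```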

Statistical analysis
GraphPad Prism 8, R software version 3.5.1 and SPSS 23 were used to plot the figures. Differences between groups were assessed by two-tailed Student's t test. The strength of the association between continuous variables was tested with Pearson's correlation test. Univariate and multivariate Cox regressions were used to identify independent risk factors of LUAD. For survival analysis, overall survival was calculated using the Kaplan-Meier method and the log-rank test. All P values were two-sided, and P values < 0.05 were considered to be statistically significant.
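For readers unfamiliar with the Kaplan-Meier method named above, the following is a minimal sketch of the product-limit estimator. It is not the software used in the study, and the follow-up times and event flags are made-up values.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times  : follow-up time for each patient
    events : 1 if the event (death) was observed, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # deaths and censored patients leave the risk set
    return curve

# Toy cohort: the patient at time 2 is censored.
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
```

Censored patients reduce the number at risk without lowering the survival estimate, which is why the step at time 3 is computed over two remaining patients rather than three.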

Results
A machine learning classifier identified candidate oncogenic drivers of lncRNAs in LUAD
OncoKB oncogenic driver annotation was used with TCGA LUAD data to build a type of machine learning classifier known as the dichotomous decision tree. In detail, OncoKB-annotated PCGs that harbored genomic amplicons were considered to be amplified oncogenic drivers, and then SCNA, gene expression and prognosis data for the PCG drivers were organized according to TCGA datasets (Additional file 2: Fig. S1a). In the screening phase, upregulated lncRNAs in LUAD were narrowed by the decision tree classifier to discover candidate oncogenic drivers of lncRNAs. Five-dimensional data resources were utilized with SCNA, gene expression and clinical prognosis data to construct the resulting tree, and this model had an accuracy of 81% in predicting the original 141 PCG drivers (Additional file 2: Fig. S1b).
SCNA profiles in different prognosis statuses are presented in Fig. 1, and total and focal SCNAs are shown in the upper and bottom heatmaps, respectively. These results suggested that amplified PCG drivers had significantly higher levels of accumulated SCNAs than other regions, and they exhibited a trend correlating with poor prognosis (Fig. 1a.c). The decision tree identified 72 candidate drivers of lncRNA expression, and they also harbored high levels of SCNA accumulation and a strong correlation with poor prognosis (Fig. 1b.e). The decision tree classifier excluded the remaining 524 upregulated lncRNAs, and 75 of them were randomly selected for presentation in Fig. 1d. These results indicated that amplified oncogenic drivers of lncRNAs had a higher level of SCNA accumulation (Additional file 2: Fig. S1c), an association with worse prognoses (Additional file 2: Fig. S1d) and a stronger correlation between SCNA and expression levels (Additional file 2: Fig. S1e). Among these candidate oncogenic drivers of lncRNAs, 18 lncRNAs were found to be located in high-frequency amplified regions, and most of them (11 of 18) were grouped on chromosome 8q24 (Fig. 2a.b). We subsequently used zebrafish tumor models to quickly validate the biological functions of the four top-ranked lncRNAs in vivo (Fig. 2c), and FAM83H-AS1 was shown to exhibit the most significant effect in promoting proliferation and metastasis.
Chromosome 8q24 amplified FAM83H-AS1, leading to poor prognosis of LUAD
We gained insights into the oncogenic functions of FAM83H-AS1 in LUAD cohorts and datasets. The expression of FAM83H-AS1 was analyzed in 40 pairs of primary LUAD and adjacent nontumor tissues collected at the Affiliated Cancer Hospital of Nanjing Medical University between August 1 and October 1, 2017. FAM83H-AS1 was highly upregulated in LUAD, with an average fold change of 13.30 (P < 0.001) (Fig. 3a). TCGA and other gene expression datasets, including GSE74095 and GSE12236, all indicated that FAM83H-AS1 was overexpressed in LUAD tissues (Fig. 3b.c.d).
FAM83H-AS1 expression was then detected in the JSCH cohort by CISH using a TMA of 68 pairs of LUAD and adjacent nontumor tissues. Overexpression of FAM83H-AS1 in LUAD was validated by CISH scores from the TMA (Fig. 3e.f). Kaplan-Meier survival analysis showed that patients with a higher CISH score for FAM83H-AS1 had a shorter overall survival (Fig. 3h), and the result was validated by data from TCGA (Fig. 3j). The multivariable Cox proportional hazards model indicated that the FAM83H-AS1 level was an independent prognostic factor for LUAD patients (Fig. 3g; Additional file 1: Table S4). Additionally, the expression level of FAM83H-AS1 was positively correlated with tumor size and TNM stage in both the JSCH and TCGA cohorts (Fig. 3i; Additional file 3: Fig. S2a.b). An increase in the SCNV for FAM83H-AS1 was shown to be related to poor prognosis in both the GSE29065 and GSE28572 datasets (Fig. 3k; Additional file 3: Fig. S2c). PCR results validated that the expression level of FAM83H-AS1 was positively correlated with the amplification level of FAM83H-AS1 (Fig. 3l), which was consistent with the TCGA data.

The high expression level of FAM83H-AS1 induced malignant behavior in LUAD cells
The expression of FAM83H-AS1 was first detected in several LUAD cell lines and was found to be remarkably higher than in the normal bronchial epithelial cell line HBE (Fig. 4a). FAM83H-AS1 is located on chromosome 8q24 in humans; a transcript length of 2162 nt was determined by 5' and 3' RACE assays, which is shorter than the annotated full transcript length of 2743 nt (NR_033849.1) (Fig. 4b; Additional file 2: Fig. S2g). Further, no translation of FAM83H-AS1 was found according to a coding potential assay (Fig. 4c).
FAM83H-AS1 regulated LUAD cell apoptosis by binding with HNRNPK
Although the head-to-head coding gene FAM83H is known to be involved in the progression of human cancers [19, 20], we did not find any significant changes in FAM83H mRNA or protein in FAM83H-AS1-reduced LUAD cells (Fig. 4i). Additionally, the oncogene MYC, which is located in 8q24 and is known to be a driver in human cancers [4,21], showed no significant changes after the silencing of FAM83H-AS1 (Fig. 4i).
To explore the molecular mechanisms by which FAM83H-AS1 promotes LUAD tumorigenesis, we first performed nuclear and cytoplasmic fractionation and FISH assays and found that FAM83H-AS1 was mainly distributed in the cell cytoplasm (Fig. 4j.k), which indicated that FAM83H-AS1 might exert its biological function at the posttranscriptional level. A subsequent RNA pull-down experiment was performed to identify potential proteins binding with FAM83H-AS1 (Fig. 5a). Mass spectrometry analysis of a differentially displayed band revealed that HNRNPK was associated with FAM83H-AS1 (Fig. 5b). Then, we confirmed the association of FAM83H-AS1 and HNRNPK by western blot of the proteins isolated from the RNA pull-down assays (Fig. 5c). Additionally, a RIP assay was performed with an HNRNPK antibody to verify that FAM83H-AS1 formed a complex with HNRNPK (Fig. 5d).
HNRNPK is an RNA-binding protein that is localized in both the cytoplasm and the nucleus [22], and it has been shown to regulate the translation of oncogenes in cancer cells [23,24]. Western blot assays showed that FAM83H-AS1 did not affect the overall expression of HNRNPK within the A549 cells (Fig. 5e); however, FAM83H-AS1 overexpression increased HNRNPK expression within the cytoplasm, whereas FAM83H-AS1 knockdown had the opposite effect (Fig. 5e). Immunofluorescence assays validated these results (Fig. 5f). A further finding confirmed by RNA pull-down using biotinylated truncations of FAM83H-AS1 was that the stem-loop structure from 1301 nt to 1881 nt (3rd truncation) was sufficient for enabling interaction between FAM83H-AS1 and HNRNPK (Fig. 5g). In addition, western blotting demonstrated that this identified truncation of FAM83H-AS1 could suppress apoptosis at a rate similar to that of the full-length FAM83H-AS1 (Fig. 5h).

FAM83H-AS1 targeted RAB8B and RAB14 and promoted their translation
To identify the potential downstream molecular targets of FAM83H-AS1, we conducted RNA-Seq and QMS assays after silencing FAM83H-AS1 in A549 cells. A total of 230 differentially expressed mRNAs (FDR < 0.01 and fold change > 2) were detected (Fig. 6a; Additional file 1: Table S5; Additional file 4: Fig. S4a), and the QMS assay demonstrated 258 differentially expressed proteins (FDR < 0.05 and fold change > 1.2; Fig. 6a; Additional file 1: Table S6; Additional file 4: Fig. S4b). However, few genes overlapped at both the mRNA and protein levels (Fig. 6b). Considering that the FAM83H-AS1 and HNRNPK complex is mainly distributed in the cytoplasm, we speculated that FAM83H-AS1 may affect the translation of downstream genes in LUAD cells.
HNRNPK has been shown to have a preference for AU/CU-rich sequences in 5' untranslated regions (UTRs) and to have a specified motif of N₂₋₅C(C/U)ACC(C/A)N₁₁₋₁₇ [25]. Therefore, the differentially expressed genes harboring the HNRNPK motif in their 5'UTRs were identified from both the RNA-Seq and QMS data, which indicated that HNRNPK-targeting genes were significantly enriched in the QMS results (Fig. 6c). Among the top-ranked genes in the QMS results, no significant changes in expression level were revealed by RNA-Seq (Fig. 6d). Two known oncogenes, RAB8B and RAB14, of the RAS family of oncogenic proteins, were found to be significantly downregulated at the protein level after FAM83H-AS1 silencing. Considering the extremely low RNA levels of RAB8B and RAB14 and the effect of HNRNPK in stimulating the activity of mRNA IRESs to regulate translation [26, 27], we analyzed the 5'UTRs of RAB8B and RAB14, which indicated potential IRES segments harboring HNRNPK motifs (Fig. 6e). The RIP assays performed with HNRNPK antibody indicated that HNRNPK could bind to RAB8B and RAB14 mRNAs (Fig. 6f). Furthermore, we cloned the wild-type and mutated 5'UTRs of the RAB8B and RAB14 mRNAs and performed dual luciferase reporter assays with them. Compared with the control group, the overexpression of HNRNPK efficiently promoted luciferase activity in the wild-type groups but not the mutated groups (Fig. 6g). These results suggested that the 5'UTRs of both RAB8B and RAB14 could be bound by HNRNPK. We also observed that the translation inhibitor cycloheximide inhibited the HNRNPK overexpression-induced increase in the protein levels of RAB8B and RAB14 in FAM83H-AS1-knockdown LUAD cells (Fig. 6h).
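A scan for the motif quoted above can be sketched with a regular expression. This is only an illustration, not the authors' procedure (the Methods mention the Blast program): it assumes that N stands for any nucleotide, so the pattern reduces to the core C(C/U)ACC(C/A) with at least 2 nt upstream and at least 11 nt downstream. The function name and test sequences are hypothetical.

```python
import re

# Core C(C/U)ACC(C/A), preceded by >= 2 nt and followed by >= 11 nt.
# A lookahead is used so that overlapping candidates are all reported.
HNRNPK_MOTIF = re.compile(r"(?=[ACGU]{2}(C[CU]ACC[CA])[ACGU]{11})")

def find_motif_cores(utr_sequence):
    """Return 0-based start positions of the motif core in a 5'UTR sequence.

    DNA input is accepted and converted to RNA (T -> U) before scanning.
    """
    rna = utr_sequence.upper().replace("T", "U")
    return [match.start(1) for match in HNRNPK_MOTIF.finditer(rna)]

# Made-up sequences, not the real RAB8B or RAB14 5'UTRs.
with_motif = "GG" + "CCACCC" + "A" * 11
too_short_upstream = "G" + "CCACCC" + "A" * 11
hits = find_motif_cores(with_motif)
no_hits = find_motif_cores(too_short_upstream)
```

Because the flanking N runs carry no sequence constraint, only their minimum lengths affect whether a core position is reported; the upper bounds (5 and 17) would matter only if additional downstream elements were being matched.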
The silencing of FAM83H-AS1 decreased the expression of RAB8B and RAB14, whereas overexpressing FAM83H-AS1 increased the expression of these genes at the protein level (Fig. 6i; Additional file 5: Fig. S4c). The high expression level of RAB14 has been reported to inhibit apoptosis in non-small cell lung cancer (NSCLC) [28], and we found that the silencing of RAB8B or RAB14 suppressed the clonogenicity of A549 cells (Fig. 6j). To determine whether FAM83H-AS1 inhibits LUAD cell apoptosis via the FAM83H-AS1-HNRNPK-RAB8B/RAB14 axis, colony formation, flow cytometry assays and western blotting of cleaved PARP and Caspase-3 were performed; silencing of HNRNPK partially rescued the apoptosis-inhibiting effect induced by FAM83H-AS1 (Fig. 6k.l.m). Additionally, the silencing of HNRNPK partially reversed the effects of FAM83H-AS1 on RAB8B and RAB14 (Fig. 6m).
FAM83H-AS1 promoted LUAD in vitro and was revealed as a potential therapeutic target
To validate the biological function of FAM83H-AS1 in vivo, we constructed A549 cells with CRISPRi-mediated FAM83H-AS1 silencing. A total of six sgRNAs around the transcription start site (TSS) of FAM83H-AS1 were designed to suppress the transcription of FAM83H-AS1 (Additional file 5: Fig. S4d), and the combination of three sgRNAs in the 3'-end of the TSS produced the highest knockdown efficiency (Additional file 5: Fig. S4e) without affecting the expression of FAM83H (Additional file 5: Fig. S4f). Consequently, xenograft tumor models demonstrated that the tumors derived from CRISPRi-mediated FAM83H-AS1-silenced A549 cells were smaller than those of the control (Fig. 7a).
We then developed a PDTX model from four LUAD patients and evaluated the therapeutic potential of targeting FAM83H-AS1 by intratumor injection of cholesterol-conjugated siFAM83H-AS1 and a control siRNA (4 times and twice a week) (Fig. 7b.c). Immunohistochemistry (IHC) revealed that the siFAM83H-AS1 group had fewer RAB8B- and RAB14-positive cells but more TdT-mediated dUTP nick end labeling (TUNEL)-positive cells than the control group (Additional file 5: Fig. S4g). As a result, suppressing FAM83H-AS1 inhibited PDTX growth in vivo, suggesting that FAM83H-AS1 could serve as a promising therapeutic target for LUAD.

Discussion
In this study, we identified potential oncogenic drivers of lncRNAs in LUAD using machine learning algorithms. The oncogenic functions of four candidate lncRNAs located at 8q24 were tested in vivo using zebrafish models, and the molecular function of FAM83H-AS1 was revealed. In LUAD cells, FAM83H-AS1 bound with cytoplasmic HNRNPK to form an RNA-protein complex, which further bound to the 5'UTRs of the target mRNAs RAB8B and RAB14. This interaction enhanced the translation of these oncogenes and upregulated their protein levels, which finally promoted the malignant progression of LUAD (Fig. 7d).
To identify oncogenic drivers of lncRNAs in cancers, genomic variation-associated data have been widely used in the discovery phase [6-8, 29]. Integrative analysis of genomic and transcriptional data provided a theoretical basis for identifying these candidate drivers. Unlike PCGs, noncoding RNAs have been demonstrated to lack hotspot point mutations, but structural variants, including SCNAs, breakpoints and fusion events, are thought to be substantial contributors to noncoding drivers [30]. High-frequency SCNA gain or loss in cancer genomes has now been revealed by the TCGA project, with joint analyses performed on lncRNA profiles in several cancer types, including glioblastoma multiforme, ovarian cancer, lung squamous cell carcinoma (LSCC) and prostate cancer [8]. In addition, the lncRNAs RBPMS-AS1, TDRKH-AS1, LINC00578, RP11-470M17.2 and LINC00941 were revealed to be key prognostic biomarkers of LUAD as a result of weighted gene coexpression and GISTIC analyses, but none of these lncRNAs has been validated in functional assays [29]. Chromosome 8q24 is a region with frequent SCNAs at both the arm and focal levels in LUAD [31], and it was also found to harbor most candidate drivers of lncRNAs in our study. This so-called "8q24 gene desert" was shown to be a hotspot region linking oncogenic lncRNAs and genomic variations [32-34], and this study added further insights into the oncogenic function of the 8q24 amplicon in LUAD.
FAM83H-AS1 has been reported to have potent tumor-promoting activity in colorectal carcinoma, breast cancer, bladder cancer and NSCLC [35-38]. Zhang et al. found that the proliferation, migration and invasion of NSCLC cells were decreased after FAM83H-AS1 downregulation, which is consistent with our results [37]. However, the molecular function of FAM83H-AS1 in human cancer cells has not been uncovered, especially in LUAD. Furthermore, the upregulation of FAM83H-AS1 was demonstrated to be related to poor prognosis in lung cancer, ovarian cancer and gastric cancer patients [39, 40]. All these results indicated that FAM83H-AS1 has a conserved oncogenic function among different types of malignant tumors, even though its expression level varies greatly; however, mechanistic investigations and in vivo functional experiments were rarely conducted in these previous studies. In summary, the oncogenic lncRNA FAM83H-AS1 exhibits a trend of overexpression in human cancers; therefore, we considered that transcriptional activation could also contribute to the high expression level of FAM83H-AS1.

HNRNPK is a multifunctional protein that plays important roles in cancer cells. Previous studies found that HNRNPK can regulate biological processes at both the transcriptional and posttranscriptional levels. For example, HNRNPK was shown to interact with the RNA polymerase II transcription machinery to stimulate transcription [41,42] and to be involved in regulating the translation of MYC, P21 and ERK in cancer cells [23,24,43]. Additionally, HNRNPK was found to be essential for the anti-apoptotic mechanism in cancer cells, independent of p53 status [44,45]. Furthermore, the HNRNPK protein has been shown to play a regulatory role in the molecular mechanisms of lncRNAs [46]. Previous studies discovered that HNRNPK is required for Xist-mediated chromatin modifications [47] and that it binds the lncRNAs CASC11 and linc00460 to form RNA-protein complexes in colorectal and lung cancers, respectively [48,49]. In the current study, we identified a complex of FAM83H-AS1 and HNRNPK in LUAD cells. To limit possible confounding bias, we used high-throughput methods at both the RNA and protein levels to elucidate the underlying molecular mechanisms.
In conclusion, we have identified FAM83H-AS1 as a potential oncogenic driver and described its regulatory function in malignant phenotypes, especially apoptosis and clonogenicity. Importantly, our study discovered that FAM83H-AS1 interacts with HNRNPK to promote the translation of RAB8B and RAB14 and that LUAD PDTX growth was inhibited by targeting FAM83H-AS1.

Availability of data and materials
All data that support the findings of this study are available from the corresponding authors upon reasonable request.
Ethics approval and consent to participate

Figure 1
The input and output somatic copy number alteration profiles generated by the decision tree classifier. a.b Total SCNA and focal SCNA profiles of amplified PCG drivers and predicted oncogenic lncRNA drivers. Red indicates an accumulation of SCNAs, while blue indicates a loss of SCNAs. Total SCNA gain or loss: the genomic segment value is greater than 0.2 or less than -0.2. Focal SCNA gain or loss: the GISTIC value is greater than 0.3 or less than -0.3. Heatmaps were separated into three parts according to prognosis status, and a trend test was conducted. Correlation estimation was based on Pearson's correlation coefficient between SCNA and expression levels; c.d Total SCNA and focal SCNA profiles of non-amplified PCG drivers and excluded upregulated lncRNAs (75 of 524 were randomly selected for presentation); e A J48 decision tree was constructed on the upregulated lncRNAs in LUAD.

sequences were added to the silencing and upregulation groups, respectively. GAPDH and Histone 3 served as loading controls; f Immunofluorescence assays indicated an increase in cytoplasmic HNRNPK after increasing the expression of FAM83H-AS1 in A549 cells; g The secondary structure of FAM83H-AS1 is shown as predicted by the centroid method (http://rna.tbi.univie.ac.at). The red color indicates strong confidence for the prediction of each base. RNA pull-down detection of the interaction between HNRNPK and FAM83H-AS1 truncations according to the predicted secondary structure; h Apoptosis assay of cells with the full-length and truncation mutant FAM83H-AS1, as assessed by Caspase-3 cleavage. *, P < 0.05, and **, P < 0.01. N.S., nonsignificant. Error bars, standard error of the mean.

Figure 6
The FAM83H-AS1-HNRNPK complex coregulates the expression of RAB8B and RAB14. a Significantly differentially expressed genes identified by RNA-Seq (FDR < 0.01 and fold change > 2) and QMS (FDR < 0.05 and fold change > 1.2) after silencing FAM83H-AS1; b Common upregulated and downregulated genes in the RNA-Seq and QMS results; c Significantly differentially expressed genes harboring HNRNPK-specific motifs in their 5'UTRs were identified by RNA-Seq and QMS results. P values were determined by Fisher's exact test; d The top differentially expressed genes identified by QMS and the corresponding mRNA changes in RNA-Seq; e Predicted IRES sites (http://iresite.org) and identified HNRNPK motifs in the 5'UTRs of RAB8B and RAB14; f RIP evaluation of the interaction between HNRNPK and mRNAs of RAB8B and RAB14 using an anti-HNRNPK antibody as described above; g Dual luciferase reporter assays showed that HNRNPK directly binds to the 5'UTRs of RAB8B and RAB14 and activates luciferase activity; h Effect of the translation inhibitor cycloheximide on the HNRNPK overexpression-induced increase in the protein levels of RAB8B and RAB14 in FAM83H-AS1 knockdown A549 cells; i The silencing of FAM83H-AS1 decreased the expression of RAB8B and RAB14, but overexpressing FAM83H-AS1 increased their expression; j The colony formation assay results suggested an oncogenic function for RAB8B and RAB14 in A549 cells; k Colony formation assays suggested that HNRNPK knockdown partially abolished the effects of FAM83H-AS1; l.m HNRNPK knockdown abolished the effects of FAM83H-AS1 on apoptosis, as revealed by flow cytometry and the cleavage of PARP and Caspase-3. The effect of HNRNPK knockdown on FAM83H-AS1 overexpression-induced protein expression of RAB8B and RAB14. *, P < 0.05, and **, P < 0.01. N.S., nonsignificant. Error bars, standard error of the mean.

Supplementary Files