Patients and clinical samples
A total of 10 paired LUAD and adjacent normal fresh tissues were obtained from Second Affiliated Hospital of Fujian Medical University between January, 2019 and May, 2019. The specimens were collected from LUAD patients with treatment-naïve who underwent primary surgical treatment. All of the specimens were immediately snap-frozen by liquid nitrogen after surgical resection, and stored at -80℃ until RNA extraction. The clinicopathological diagnosis were confirmed by two pathologists according to the guidelines of the World Health Organization (WHO, version 2015). All participants provided written informed consent, and the bioethical committees at The Second Affiliated Hospital of Fujian Medical University, China, gave written approval for the study (2020-206).
The other group resulted from the LUAD-related RNA-sequencing data in TCGA database, containing the expression data of mRNAs, miRNAs and lncRNAs. A total of 595 paired LUAD tissues RNA-seq data were retrieved, with their clinicopathological features. RNA-sequencing data and corresponding clinical information in LUAD were obtained from TCGA database as following criteria: 1) histologically diagnosed as LUAD; 2) data with complete clinical information; 3) diagnosed as stage I patients. Finally, a total of 260 paired LUAD tissues was included for further analysis. The present study meets the criteria of data usage and publishing of the National Cancer Institute of NIH, and no approval from the ethics committee was required. Progression-Free Survival (PFS) was calculated from the date when patients first received treatment until the date of progression or the last follow-up or death.
RNA extraction and sequencing
Total RNA of LUAD tissues and corresponding normal tissues was extraction from frozen tissues using RNeasy Mini Kit (Qiagen, Germany) following the manufacturer’s protocol. The RNA concentration was evaluated by the Qubit 4.0 (Thermo Fisher Scientific, Wilmington, DE, USA), and the RNA quality was evaluated by agarose gel eletrophoresis.
Then, ribosomal RNA was removed from total RNA to obtain the maximum residual ncRNA. After fragment of rRNA-depleted RNA, the cDNA library was constructed using the TruSeq RNA sample Prep Kit (Illumina, San Diego, CA, USA). LncRNA/mRNA sequencing libraries were prepared using VAHTS total RNA-seq Library Prep kit for Illumina (Vazyme NR603, China) according to the manufacturer’s protocol. The cDNA fragments with 150-bp paired-end reads were generated for RNA sequencing. Additionally, NEBNext® Multiplex Small RNA Library Prep Set for Illumina® (NEB) was used to prepare the miRNA library for samples. 12 libraries were pooled and sequenced in a single lane of Illumina HiSeq Xten sequencing platform. And Illumina’s TruSeq small RNA library preparation kit was used to prepare the miRNA library with 50-bp paired-end reads generated. After library construction, RNA sequencing for both lncRNA/mRNA and miRNA was performed using Illumina HiSeq Xten platform.
Identifications of DEmRNAs, DEmiRNA and DElncRNAs
Mirdeep2 (v2.0.0.5) was used for new miRNA prediction, whose expression was calculated and standardized using CPM (counts per million read, CPM). LncRNAs were annotated by the following database, including PLEX (https:// sourceforge.net/projects/plek/), CPAT (https://sourceforge.net/projects/rna-cpat/), CNCI (https://github.com/www-bioinfo-org/CNCI) and CPC2 (http:// cpc2.cbi.pku.edu.cn/). The intersection was exhibited using Venn diagram. (Fig. S1).
DESeq2Rpackage(https://bioconductor.org/packages/release/bioc/html/DESeq2.html) in Bioconductor project was used to screen the differentially expressed mRNA (DEmRNAs), differentially expressed miRNA (DEmiRNAs) and differentially expressed lncRNA (DElncRNAs) between LUAD and normal tissues. |Log2(fold change)| (|log2FC|) ≥1 and statistical P value ≤0.05 were set as cut-off criteria. Finally, unsupervised hierarchical clustering was carried out for DE-RNAs, and the expression patterns of which in paired tissues were displayed in form of heatmap using the pheatmap R package (https://cran.r-project.org/web/ packages/pheatmap/index.html).
Analysis of the DElncRNAs Enrichment pathway
Gene ontology (GO, http://www.bioconductor.org/packages/release/bioc/ html/topGO.html) analysis was conducted to screen enrichment of target genes to determine the biological functions regulated by lncRNAs, including biological processes (BP), cellular component (CC), and molecular function (MF) annotations. Kyoto Encyclopedia of Genes and Genomes (KEGG, http://www.genome.jp /keggbin/show_organism?menu_type=pathway_maps&org=hsa) analysis was performed to annotate the signaling pathways mainly involved for target genes. The GO and pathway analysis with enriched gene count ≥2 and P value <0.05 as the threshold for statistical significance.
Predication of miRNA Regulation Relationship
The predication of miRNA-gene analysis of DEmiRNA obtained was performed using the online website miRWalk 2.0 (http://zmf.umm.uni-heidelberg.de/ pps/zmf/mirwalk2/). Furthermore, miRWalk, miRanda, miRDB, miRMap and TargetScan databases were used to predict the possible DEmiRNA-DEmRNA regulatory relationships. Using the StarBase (http://starbase.sysu.edu.cn/) database, the miRNA-lncRNA regulatory relationships by DEmiRNA were predict. Then the DEmRNA-DEmiRNA and DEmiRNA-DElncRNA regulatory regulation relationship were constructed based on shared miRNAs with which both lncRNAs and mRNAs interact, and were illustrated by Cytoscape software.
Construction of lncRNA-miRNA-mRNA ceRNA network
Under the hypothesis of miRNA sponge, the positive correlation expression of lncRNAs-mRNAs was focused on, therefore, and the positive correlation coexpression relationship between mRNAs and lncRNAs simultaneously regulated by miRNAs was obtained. ceRNA network were constructed based on shared miRNAs with which mRNAs and lncRNAs interact. For further analysis, we used a hypergeometric cumulative distribution function test to determine potential ceRNA pairs[19], with correlation coefficient>0.5 and P value<0.05 as a ceRNA triplet with statistical significance.
Weighted gene correlation network analysis (WGCNA)
As described previously [20, 21], gene co-expressed network analysis was performed on LUAD and adjacent normal tissues using R WGCNA package. The expression matrix was restricted to expressed lncRNAs, following by a Pearson correlation matrix and a weighted adjacency matrix generated. The module eigengene (ME) was calculated for each module, the values of which clinical traits associated with was calculated by Pearson’s correlation. In the present study, we set the optimal soft-thresholding power at 4 (scale-free R2=0.85), cut height at 0.25, and minimal modules sized to 30, to identify key modules. topological overlap matrix (TOM) was constructed and transformed. Then, the lncRNAs were grouped into co-expression modules using tree pruning of gene hierarchical clustering dendrograms by cutreeDynamic method, with correlated modules merged. The module significantly correlated with sample traits were used for hub genes selection.
LUAD-specific prognostic lncRNA signatures identification
Progression-free survival (PFS) time was calculated from the date when patients first received chemotherapy until the date of progression or the last follow-up or death. Using TCGA dataset, the associations between DElncRNAs in ceRNA network and PFS in stage I LUAD patients were evaluated using univariate Cox regression analysis. The samples were divided into high expression and low expression groups based on the median FPKM of lncRNAs. Kaplan-Meier (KM) survival analysis and Log-rank (LR) test were performed, and LR P value, hazard ratio (HR) with 95% confidence interval (CI) were computed. Survival and survminer R package were used for Cox regression analysis, and the statistical P<0.05 was considered as significant.
RNA extraction and quantitative real-time RT-PCR assay
Total RNA was extracted using (Trizol®; Sigma) from 28 paired LUAD tissues, according to the manufacturer’s recommendations. After purification, the RNA was transcribed into cDNA using Reverse Transcription Kit (Takara, Tokyo, Japan). RT-PCR analysis was performed using the SYBR Green (Takara). The primers used for RT-PCR were: FENDRR primers were AGTCACAGCACCAGAAAGCCAAC (sense) and TGATGTTCTCCTTCTT GCCTCAGC (anti-sense); LNC00639 primers were GTGAGTGTTCAGACATGCCAGGAG (sense) and AGCCGAGTGGATTCAGCGAGAG (anti-sense); RP4-676L2.1 primers were TGTTTGAAGCCGTGAGACTGAGTG (sense) and CTCCTTTGCTGGCTC TTCCTCATC (anti-sense).