Identification of a novel autophagy-related gene signature for predicting metastasis and survival in patients with osteosarcoma

Background: Osteosarcoma (OS) is a bone malignant tumor that occurs in children and adolescents. Due to a lack of reliable prognostic biomarkers, the prognosis of OS patients is often uncertain. This study aimed to construct an autophagy-related gene signature to predict the prognosis of OS patients. Methods: The gene expression profile data of OS and normal muscle tissue samples were downloaded separately from the Therapeutically Applied Research To Generate Effective Treatments (TARGET) and Genotype-Tissue Expression (GTEx) databases . The differentially expressed autophagy-related genes (DEARGs) in OS and normal muscle tissue samples were screened using R software, before being subjected to Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis. A protein-protein interaction (PPI) network was constructed and hub autophagy-related genes were screened. Finally, the screened autophagy-related genes were subjected to univariate Cox regression, Lasso Cox regression, survival analysis, and clinical correlation analysis. Results: A total of 120 DEARGs and 10 hub autophagy-related genes were obtained. A prognostic autophagy-related gene signature consisting of 9 genes ( BNIP3 , MYC , BAG1 , CALCOCO2 , ATF4 , AMBRA1 , EGFR , MAPK1 , and PEX ) was constructed. This signature was significantly correlated to the prognosis ( P <0.0001) and distant metastasis of OS patients ( P = 0.013). Conclusion: This signature based on 9 autophagy-related genes could predict metastasis and survival in patients with OS.


Background
Osteosarcoma (OS) is a common primary bone malignant tumor that occurs in children and adolescents, and has a high rate of metastasis and recurrence [1,2]. Approximately 10-20% of OS patients have numerous lesions associated with pain and swelling [3]. While many studies have shown that adjuvant chemotherapy could improve survival in patients with OS, the prognosis of OS patients remains poor, and the long-term survival rate is less than 20% [4,5]. Improved understanding of disease pathogenesis and the identification of reliable prognostic biomarkers are crucial to better the prognosis of patients with OS.
Autophagy is a highly conserved intracellular degradation pathway that maintains homeostasis by degrading intracellular macromolecules and organelles through the formation of autophagosomes and subsequent fusion with lysosomes [6,7]. Autophagy is widely involved in the physiological and pathological processes of cells in vivo, such as cell homeostasis, hematopoiesis, organ development, neurodegeneration, and tumorigenesis [8][9][10]. Basal levels of autophagy are employed by cells under normal circumstances. When cells become deficient in nutrition or are exogenously stimulated, their autophagy levels increase accordingly to cope with unfavorable factors, as an adaptive response to metabolic stress [10]. Many studies have shown that autophagy could promote or inhibit the survival of tumor cells at different stages of cancer development [11]. In the early stages of cancer, autophagy promotes the degradation of damaged proteins and organelles to reduce cellular damage and chromosomal instability, thereby inhibiting cancer progression. However, once cancer has formed, autophagy allows tumor cells to survive under stressful conditions, thereby promoting tumor progression [12]. The relationship between autophagy and OS has been previously reported. For example, in a mouse OS model, silencing BECN1 (an autophagy-promoting gene) could increase OS cell metastasis [13]. By regulating autophagy in human OS MG-63 cells, the survival rate of OS cells and cisplatin-induced chemoresistance was significantly increased [14]. These studies have confirmed that autophagy is closely related to OS, suggesting that autophagy-related genes may have potential application as prognostic markers of OS.
However, autophagy is a complex process involving hundreds of molecules. Therefore, models that integrate multiple autophagy-related genes can improve the accuracy of prognostic prediction [15].
Compared to traditional single-molecule prognosis predictors, genomic profiling based on "Omics" could predict the prognosis of a set of genes for patients, which are called "classifiers" or "signatures" [16]. With the rapid development of gene expression profile chip technology and bioinformatics methods, it has become feasible to use high-throughput expression profile data to analyze the relationship between autophagy-related gene expression and the clinical outcomes of cancer patients [17]. Therefore, in this study, we used a bioinformatics approach to identify potential autophagy-related genes for use as predictive biomarkers of the prognosis of OS patients. batches in the expression matrix of the two datasets, and then merge them for subsequent analysis.

Identification of differentially expressed autophagy-related genes (DEARGs)
The Human Autophagy Database (HADb, http://autophagy.lu/clustering/index.html) is the first public database dedicated to human autophagy, which contains a complete and current list of human autophagy-related genes reported in literature from PubMed and biological public databases [20]. The 222 autophagy genes were obtained from the HADb. The limma package [21] was used to screen DEARGs between OS samples and normal muscle tissue samples. The intercept value was Adjusted P (adj. P) values < 0.05.

DEARGs function and pathway enrichment analysis
Gene ontology (GO) is an important method and tool for annotating genes and their products, which is beneficial for the integration and utilization of biological data [22]. Kyoto Encyclopedia of Genes and Genomes (KEGG) is a database integrating genomics, chemistry, and system function information, which can provide currently known biological metabolic signaling pathways [23]. ClusterProfiler package [24] was used to perform GO and KEGG enrichment analysis on DEARGs. FDR < 0.05 was set as the threshold for significant gene enrichment.

PPI network construction, hub autophagy-related gene selection, and functional similarity analysis of DEARGs
STRING (http://www.string-db.org/, Version: 11.0) is a biological tool for evaluating PPI information [25]. The DEARGs were imported into the STRING online software, and the PPI network was derived, and exported in TSV format. The interactions with a combined score of greater than 0.9 were considered statistically significant. The obtained source file was then imported into Cytoscape software [26], and the cytoHubba plug-in [27] was used to screen the top 10 hub autophagy-related genes according to the maximum correlation standard algorithm. Based on the semantic similarity of gene annotation GO terms [28], we applied the GOSemSim package [29] to calculate the strength of the relationship between molecular function and cell localization among the 10 hub genes, and the average value of functional similarity to compare to the 10 hub autophagy-related genes were sorted with a cut-off value of 0.65, and the results were visualized using the ggplot2 package [24].

Construction of prognostic autophagy-related gene signature
First, we performed univariate Cox regression analysis on the obtained DEARGs to screen for genes with prognostic value; the screening criterion was P < 0.05. Then, glmnet software [30] was used to perform Lasso Cox regression analysis on genes with prognostic value. Based on the gene expression level, an prognostic autophagy-related gene signature for OS patients was constructed. A prognostic risk score for each sample based on this signature was derived, and patients were divided into high and low-risk groups based on the median value of the risk score. The risk factor correlation diagram was drawn using R language to show the prognosis of patients in the high and low-risk groups. The difference in survival between the high and low-risk groups was evaluated by Kaplan-Meier curve.

Relationship between prognostic autophagy-related gene signature and clinical characteristics
To further evaluate the correlation between prognostic autophagy-related gene signatures and clinical characteristics, we used the ggstatsplot package (https://github.com/IndrajeetPatil/ggstatsplot) to perform correlation analysis between the obtained risk scores and clinical data (age, gender, and metastasis) from the TARGET database, and the results are shown graphically through the ggplot2 package (CRAN.R-project.org/package = ggplot2).

Data processing and identification of DEARGs
After the obtained data were subjected to background correction, data normalization, and removal of the difference between batches, a total of 222 autophagy-related genes and 120 DEARGs were obtained.

DEARGs function and pathway enrichment analysis
The results of GO analysis demonstrated that the biological process (BP) changes of the DEARGs were significantly enriched in 'autophagy', 'process utilizing autophagic mechanism', 'macroautophagy',
To further identify the strength of the interactions between the hub autophagy-related genes, we sorted them based on their average functional similarity, and the results showed that MAP1LC3B, GABARAPL2, GABARAPL1, and GABARAP were the genes with the closest interactions. Among them, MAP1LC3B is the only gene with a cut-off value > 0.7, with the highest average functional similarity ( Figure 2C).

Construction of prognostic autophagy-related genes signature
To study the prognostic value of DEARGs, the univariate Cox analysis revealed that 25 genes   including BNIP3, MYC, VEGFA, ATG9B, NKX2.3, ATG4A, BAG1, CALCOCO2, ATF4, ATG7, ATG4D,   AMBRA1, BAK1, APOL1, EGFR, CCL2, MAPK1, CASP1, MAP2K7, DLC1, DNAJB9, VAMP3, Table 2). These 9 genes were subsequently used to construct a gene signature. We calculated the risk score of each sample, and divided the samples into high-risk and low-risk groups using the median risk as a cut-off value, and obtained a risk factor correlation diagram of the prognostic autophagy-related gene risk score. Compared to the low-risk group, the number of deaths in the high-risk group was significantly greater, and the levels of ATF4, BAG1, MYC, and BNIP3 were higher in the high-risk group, while the expression levels of MAPK1, EGFR, CALCOCO2, and AMBRA1 were lower ( Figure 5A). Kaplan-Meier survival analysis showed that the overall survival rate of patients in the high-risk group was significantly lower than that in the low-risk group (P <0.0001) ( Figure 5B).

Relationship between prognostic autophagy-related gene signature and clinical characteristics
As shown in Figure 6 and Supplementary material 2, the risk score was significantly associated with the distant metastasis of OS (P = 0.05), but not with age (P = 0.710) and gender (P = 0.887).

Discussion
OS is the most common bone malignancy in children and adolescents, and is also the leading cause of disability and death in this age-group [31]. Although adjuvant chemotherapy can improve patient prognosis, the long-term survival rate of patients after relapse remains less than 20% [32]. Many studies have shown that autophagy-related proteins are closely related to the prognosis of patients with OS. Low expression of AGT5 may be related to the poor prognosis of OS patients [33]. Silencing the autophagy-promoting gene BECN1 will increase cancer cell metastasis [13]. Currently, hundreds of proteins are thought to be involved in the autophagy process. In view of the importance of autophagy in OS, it can be reasonably speculated that autophagy-related genes have broad prospects in predicting prognosis, and multi-gene signatures generated by reliable algorithms will be superior to single molecules in predicting the prognosis of OS patients. In this study, for the first time, we integrated the genetic chip sequencing data of OS samples from the TARGET and GTEx databases with normal muscle tissue samples through bioinformatics methods to construct a prognostic autophagy-related gene signature for identifying prognostic OS biomarkers.
A total of 120 DEARGs were screened in this study. GO results showed that the BP of DEARGs mainly involved autophagy, macroautophagy, autophagosome assembly, autophagosome organization, and autophagy of mitochondrion; the CC were mainly enriched in autophagosome, phagophore assembly site, autophagosome membrane, vacuolar membrane, and late endosome; and the MF mainly  [58,59]. Xiang et al. [60] showed that abnormal expression of EGFR is related to drug resistance in tumor patients. Chu et al. [61] showed that PEX3 is related to the invasion and metastasis of ovarian cancer cells. Importantly, our results show that the prognostic autophagy-related gene signature is significantly associated with OS metastasis, but not with age and gender. The autophagy-related gene signature revealed in this study are closely related to the clinical prognosis of OS patients, which may have important significance for the prognostic management and monitoring of these patients.
Current research on the prognosis of OS is mainly based on the prognosis of single genes, and the signature of multiple genes could well avoid the differences caused by individual heterogeneity. We used univariate Cox analysis, Lasso Cox, and survival analysis to predict the prognosis of OS patients by including multiple genes as a whole, and further verified by clinical correlation analysis. Therefore, this prognostic model has high prognostic value. However, our research has certain limitations. We used the recently opened TARGET database. Although it contains relatively complete clinical information of OS patients, this database lacks control group sample data. Therefore, the number of control group samples used in this study was derived from the GTEx database, which may lead to the data having a certain heterogeneity. In addition, there is currently an insufficient dataset to further verify the accuracy of our analysis results. Therefore, large-scale analysis of clinical tissue specimens is required in the future.

Conclusion
In conclusion, through in-depth analysis of OS at the level of autophagy genes, we constructed a novel prognostic autophagy-related gene signature based on 9 genes (BNIP3, MYC, BAG1, ALCOCO2, ATF4, AMBRA1, EGFR, MAPK1, and PEX3), which was used to predict the metastasis and survival of OS patients. Our finding suggests that the 9 prognostic autophagy-related gene signature may help facilitate personalized medicine in the clinical setting.

Ethics approval and consent to participate
Not applicable.

Consent for publication
All authors agreed on the manuscript

Availability of data and materials
The datasets used and/or analyzed during the current study are available from the corresponding author upon reasonable request.

Competing interests
The authors declare that they have no competing interests.

Funding
No funding was received.    Characteristics of the prognostic autophagy-related genes signature (A) The distribution of prognostic autophagy-related genes risk score, patient survival status, and gene expression heatmap were analyzed. The black dotted line is the optimal cut-off value for dividing patients into low-risk and high-risk groups. (B) Kaplan-Meier survival curves for samples of the high-risk score group and the low-risk score group. The high-risk score is related to the overall poor survival of OS patients. The red line represents the high gene expression group, and the blue line represents the low gene expression group. P <0.05 is statistically significant.