Lipoprotein-associated phospholipase A2 predicts exercise tolerance in COPD patients

Impaired exercise tolerance is a clinical feature of chronic obstructive pulmonary disease (COPD) associated with disease progression and increased mortality. The 6-minute walk test (6MWT) is a reliable and widely used measure of exercise capacity. However, it is not commonly used in primary medical institutions because it requires a suitable venue and professional training. Developing a simple tool to assess exercise tolerance is important. Molecular biomarkers show potential for evaluating the clinical outcomes (mortality, exacerbation) of COPD patients. The aim of this study was to identify simple and effective biomarkers to predict poor exercise tolerance in COPD patients. Methods: Ten genes were selected by weighted correlation network analysis and differentially expressed gene analysis. Validation in an independent dataset led to the identification of PLA2G7, which was verified as a potential biomarker in COPD by bioinformatics analysis. The concentration of lipoprotein-associated phospholipase A2 (Lp-PLA2), which is encoded by the PLA2G7 gene, was assessed by enzyme-linked immunosorbent assay in a prospective validation cohort. The predictive capacity of Lp-PLA2 for 6-minute walk distance (6MWD) < 350 m was assessed using the area under the receiver operating characteristic curve (AUC). Traditional clinical features and Lp-PLA2 levels were incorporated into a nomogram to build a predictive model for poor exercise tolerance.


Introduction
Impaired exercise tolerance is one of the clinical features of COPD, and it is associated with disease progression and increased mortality of patients [1]. Therefore, assessing and monitoring exercise tolerance effectively are important. The 6-minute walk test (6MWT) is a reliable and widely used measure of exercise capacity [2]. However, the 6MWT has been difficult to popularize in primary medical institutions because it requires a suitable venue and professional training. Developing a screening method that is highly accurate, simple, and can be performed in a medical facility is important.
The mechanism underlying exercise intolerance in COPD patients is complex and multi-factorial. The factors contributing to impaired exercise tolerance include respiratory function, systemic inflammation, and cardiovascular and muscle system dysfunction [3]. The major limitation in COPD patients is impaired pulmonary function. Therefore, a biomarker related to impaired respiratory function and systemic inflammation may predict the exercise tolerance status of COPD patients. Advances in omics technology permit the large-scale evaluation of biomarkers (genetic, transcriptomic) to assess the clinical outcomes (i.e., mortality, exacerbation, and hospital admission) of COPD patients [4][5][6][7].
In this study, we screened out the hub gene PLA2G7 via combined differentially expressed gene (DEG) analysis, weighted co-expression network analysis (WGCNA), and validation in a separate dataset. The level of Lp-PLA2 (encoded by the PLA2G7 gene) was assessed as a potential biomarker to predict poor exercise tolerance in COPD patients in clinical practice.

Dataset preparation
Microarray datasets were screened from Gene Expression Omnibus (GEO, http://www.ncbi.nlm.nih.gov/geo). The selection criteria were as follows: (1) lung tissue samples from COPD patients and normal smokers were included, and lung cancer samples were excluded; (2) COPD patients had pulmonary function test data; and (3) datasets contained at least 20 COPD patients and tissue samples from smokers. Based on these criteria, the GSE76925 and GSE38974 datasets were obtained. GSE76925 contained 111 lung tissue samples from COPD patients and 40 lung tissue samples from normal smokers. GSE38974 contained 23 lung tissue samples from COPD patients and nine lung tissue samples from normal smokers. GSE76925 was used to screen DEGs and for WGCNA. GSE38974 was used for validation of hub genes.

Differentially expressed genes and enrichment analysis
The R package "Limma" was used to identify DEGs between COPD samples and normal smoker samples. An adjusted P-value <0.05 and |log2 fold change| ≥1 were used as cut-off values. The R package "clusterProfiler" was used for Gene Ontology (GO) enrichment and Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis. P <0.05 was considered statistically significant.
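The two cut-off values can be illustrated with a minimal sketch. It assumes that the per-gene statistics have already been exported from the differential expression step; the `GeneStat` container, the `select_degs` helper, and the gene names and values are hypothetical and are not taken from GSE76925.

```python
from typing import Dict, List, NamedTuple


class GeneStat(NamedTuple):
    log2_fc: float  # log2 fold change, COPD vs. normal smokers
    adj_p: float    # multiple-testing-adjusted P-value


def select_degs(stats: Dict[str, GeneStat],
                p_cutoff: float = 0.05,
                lfc_cutoff: float = 1.0) -> Dict[str, List[str]]:
    """Split genes passing both cut-offs into up- and downregulated lists."""
    up: List[str] = []
    down: List[str] = []
    for gene, s in stats.items():
        # Both criteria must hold: adjusted P < 0.05 and |log2 FC| >= 1.
        if s.adj_p < p_cutoff and abs(s.log2_fc) >= lfc_cutoff:
            (up if s.log2_fc > 0 else down).append(gene)
    return {"up": sorted(up), "down": sorted(down)}


# Hypothetical values for illustration only.
example = {
    "GENE_A": GeneStat(log2_fc=1.4, adj_p=0.001),   # passes, upregulated
    "GENE_B": GeneStat(log2_fc=-1.0, adj_p=0.03),   # passes, downregulated
    "GENE_C": GeneStat(log2_fc=2.1, adj_p=0.20),    # fails the P-value cut-off
    "GENE_D": GeneStat(log2_fc=0.6, adj_p=0.0001),  # fails the fold-change cut-off
}
result = select_degs(example)
```

In this toy input only GENE_A and GENE_B satisfy both criteria, which mirrors how a gene with a large fold change but a non-significant adjusted P-value is not counted as a DEG.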

Gene set enrichment analysis
Gene set enrichment analysis (GSEA) was performed as previously described [8]. The MSigDB KEGG gene set was used as a reference.

Evaluation of tissue-infiltrating immune cells
In this study, we used the "CIBERSORTx" website to estimate the fraction of 22 types of immune cells among GSE76925 samples. CIBERSORTx [9] is an analytical tool to impute gene expression profiles and provide an estimate of the abundance of member cell types in a mixed cell population using gene expression data.

Weighted co-expression network analysis
A total of 4283 genes (selected according to variance) were extracted for WGCNA using the "WGCNA" package. The adjacency matrix was converted into the topological overlap matrix (TOM) with the soft-threshold power β set to 3 (scale-free R² = 0.906). Similar modules were merged following a height cutoff of 0.25. The module showing the highest correlation with clinical features was selected to explore its biological function through GO and KEGG analyses.
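The conversion from adjacency to topological overlap can be sketched as follows. This is a simplified illustration, not the "WGCNA" package itself: it assumes an unsigned network (the network type is not stated here), and the function names and the three-gene correlation matrix are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def adjacency(cor: Matrix, beta: float = 3.0) -> Matrix:
    """Unsigned soft-threshold adjacency: a_ij = |cor_ij| ** beta, with a_ii = 0."""
    n = len(cor)
    return [[0.0 if i == j else abs(cor[i][j]) ** beta for j in range(n)]
            for i in range(n)]


def topological_overlap(adj: Matrix) -> Matrix:
    """TOM_ij = (sum_u a_iu * a_uj + a_ij) / (min(k_i, k_j) + 1 - a_ij)."""
    n = len(adj)
    k = [sum(row) for row in adj]        # connectivity of each gene
    tom = [[1.0] * n for _ in range(n)]  # diagonal set to 1 by convention
    for i in range(n):
        for j in range(n):
            if i != j:
                shared = sum(adj[i][u] * adj[u][j] for u in range(n))
                tom[i][j] = (shared + adj[i][j]) / (min(k[i], k[j]) + 1.0 - adj[i][j])
    return tom


# Hypothetical gene-gene correlation matrix for three genes.
cor = [[1.0, 0.5, 0.0],
       [0.5, 1.0, 0.5],
       [0.0, 0.5, 1.0]]
adj = adjacency(cor, beta=3.0)
tom = topological_overlap(adj)
```

Genes 0 and 2 are uncorrelated in this example, yet their overlap is non-zero because both are connected to gene 1. Shared neighbours of this kind are what the TOM adds on top of the raw adjacency before modules are clustered.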

Patients and clinical information
The study included 92 stable-stage COPD patients and 16 healthy smokers recruited from the Department of Respiratory and Critical Care Medicine of the First Hospital of China Medical University. Clinical features including age, sex, height, weight, pulmonary function, and mMRC (modified British Medical Research Council) and COPD Assessment Test (CAT) results were obtained from medical records. The five-repetition sit-to-stand test (5STS), the 30-second sit-to-stand test (30STS), and the 6MWT were performed as described previously [10].

Statistical analysis
Statistical analyses were performed using SPSS 13.0 software (IBM, Armonk, NY, USA). The association between continuous variables was assessed using Spearman's correlation coefficient. Relationships between categorical variables were analyzed using the chi-square test. For continuous variables, differences between three or more groups were assessed using one-way analysis of variance (ANOVA) with the post-hoc Tukey multiple comparison test (for normally distributed data) or Kruskal-Wallis test (for non-normal distribution). Differences between two groups were assessed using the t-test (normally distributed data) or Mann-Whitney test (non-normal distribution). P values <0.05 were considered statistically significant.
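For readers unfamiliar with Spearman's correlation coefficient, the following minimal sketch computes it as the Pearson correlation of average ranks. It is an illustration only and not the SPSS procedure used in this study; the helper names and the five paired values are hypothetical.

```python
from math import sqrt
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical paired measurements (biomarker level vs. walk distance in m).
marker = [100, 120, 140, 160, 180]
distance = [450, 400, 420, 330, 300]
rho = spearman_rho(marker, distance)
```

Because only the ranks enter the calculation, the coefficient is insensitive to the measurement scale and to monotone transformations of either variable.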

Results
Identification of DEGs and enrichment analysis in samples from COPD patients and normal smokers
DEGs in lung tissue samples from COPD patients and normal smokers were analyzed using the "Limma" package. As shown in Fig. 1A, 39 significantly upregulated genes and 313 significantly downregulated genes were identified. GO analysis of the 352 DEGs showed that genes were mainly involved in biological processes (BP) associated with the cytoskeleton and cytokines (Fig. S1A). The GSEA results showed that "ALANINE_ASPARTATE_AND_GLUTAMATE_METABOLISM", "HEMATOPOIETIC_CELL_LINEAGE", "INTESTINAL_IMMUNE_NETWORK_FOR_IGA_PRODUCTION", "KEGG_PANTOTHENATE_AND_COA_BIOSYNTHESIS", "PRIMARY_IMMUNODEFICIENCY", "PROXIMAL_TUBULE_BICARBONATE_RECLAMATION", "RENIN_ANGIOTENSIN_SYSTEM", and "TASTE_TRANSDUCTION" pathways were enriched in the COPD group compared with normal smokers (Fig. S1B). Taken together, these results identify potential biomarkers and abnormal signaling pathways involved in the progression of COPD.

Immune landscape associated with the characteristics of COPD
Functional enrichment analysis showed that immune-related pathways were enriched in the COPD group compared with the normal smokers' group. To explore the differential immune landscape in COPD patients and normal smokers, lung tissue microarray data from the GSE76925 dataset were analyzed. CIBERSORTx was used to estimate the fraction of 22 types of immune cells among the GSE76925 samples. CIBERSORTx is a web-based tool that enables evaluation of the relative proportion of immune cells in tissues via a deconvolution algorithm. The distribution of 22 types of immune cells in GSE76925 samples is shown in Fig. S2A. The immune landscape results showed that T cells CD8, T cells follicular helper, T cells gamma delta, and macrophages M0 were upregulated, whereas T cells CD4 memory activated, monocytes, and eosinophils were downregulated in the lung tissues of COPD patients (Fig. S2B). Next, we analyzed the relationship between immune infiltration and clinical features. As shown in Fig. S2C, the infiltration levels of neutrophils (r = 0.231, P = 0.004), monocytes (r = 0.226, P = 0.005), T cells CD4 memory resting (r = 0.174, P = 0.032), eosinophils (r = 0.170, P = 0.037), and T cells CD4 memory activated (r = 0.168, P = 0.039) were positively correlated with FEV1/FVC; the infiltration levels of T cells follicular helper (r = -0.217, P = 0.007), T cells CD8 (r = -0.267, P < 0.001), and macrophages M0 (r = -0.300, P < 0.001) were negatively correlated with FEV1/FVC.
Moreover, the infiltration levels of eosinophils (r = 0.219, P = 0.007) and T cells CD4 memory resting (r = 0.181, P = 0.026) were positively correlated with FVC% predicted; the infiltration levels of T cells gamma delta (r = -0.160, P = 0.049), macrophages M0 (r = -0.215, P = 0.008), and T cells CD8 (r = -0.260, P = 0.001) were negatively correlated with FVC1% predicted (Fig. S2D).

Identification of key modules via WGCNA
To identify the key genes related to the clinical features of COPD patients, co-expression network analysis was performed via WGCNA using the GSE76925 dataset. Clinical features (age, sex, BMI, FEV1/FVC, and FVC1% predicted) were obtained from the GSE76925 dataset. The parameters were established by setting the soft-threshold power to 3 (scale-free R² = 0.906) and the module merge height cutoff to 0.25. The association between the modules and clinical features was determined by assessing the correlation between module eigengene (ME) values and clinical features. Data were visualized using heat map profiles. The results shown in Fig. 1B indicated that the brown module was the most closely correlated with COPD (Pearson's coefficient = 0.36, P = 6E-06), FEV1/FVC (Pearson's coefficient = 0.38, P = 1E-06), and FVC1% predicted (Pearson's coefficient = 0.4, P = 5E-07). Next, 536 genes from the brown module were selected as hub genes for GO and KEGG analyses.
In the brown module, T cells and leukocyte function, cytokine receptor, and cytokine activity were the most frequent pathways in the GO analysis (Fig. S3A). "Cytokine-cytokine receptor interaction," "Hematopoietic cell lineage" and "Viral protein interaction with cytokine and cytokine receptor" were enriched in the KEGG pathway analysis (Fig. S3B).

Selection and validation of hub genes
To screen stable and robust hub genes accurately, ten commonly changed genes shared by the brown module and upregulated DEGs were selected (Fig. 1C). These included HTR2B, CLECL1, FGG, CORIN, PLA2G7, BHLHE22, SPP1, TIMP4, TM4SF19, and MMP9. The expression levels of these ten genes were first validated in GSE38974 (nine smokers and 26 smokers with COPD). As shown in Fig. S4A, the expression levels of HTR2B (P = 0.0075) and CORIN (P = 0.0049) were significantly lower in COPD lung tissues than in the smoker controls; the expression levels of PLA2G7 (P = 0.0042), SPP1 (P = 0.00032), TM4SF19 (P = 0.0087), and MMP9 (P = 0.0042) were significantly higher in COPD lung tissues than in the smoker controls. Next, the relationship between these ten genes and the GOLD stage was verified in GSE69818 (11 patients with GOLD stage 1, 41 patients with GOLD stage 2, nine patients with GOLD stage 3, and nine patients with GOLD stage 4). As shown in Fig. S4B, the expression levels of PLA2G7 (P = 0.014) and BHLHE22 (P = 0.015) increased significantly with advanced GOLD stage. PLA2G7 was selected for subsequent analyses because it showed significant differences in two independent datasets.
To investigate whether PLA2G7 is differentially expressed in other tissues relevant to COPD, we analyzed a series of datasets. As shown in Fig. S5D, PLA2G7 expression was significantly higher in the blood of COPD patients than in that of non-COPD controls (including 94 patients with COPD and 42 non-COPD controls from GSE42057 and 49 patients with COPD and 29 non-COPD controls from GSE56766). Given the significant correlation between PLA2G7 expression level and macrophages, we analyzed the differences in PLA2G7 expression in alveolar macrophages from bronchoalveolar lavage fluid (BALF). As shown in Fig. S5E, PLA2G7 expression was significantly higher in alveolar macrophages from COPD samples than in never-smokers and smokers (including 22 patients with COPD, 24 never-smokers, and 42 smokers from GSE130928; and 12 patients with COPD, 24 never-smokers, and 34 smokers from GSE13896). These results showed that PLA2G7 expression was higher in different body fluid specimens from COPD patients than in those from normal controls, indicating that PLA2G7 may function in immune regulation by regulating macrophages.

Validation of the PLA2G7 encoded protein Lp-PLA2 in clinical samples
To verify the clinical application potential of the PLA2G7 gene, the level of the protein encoded by the PLA2G7 gene was measured using ELISA in clinical samples. Lp-PLA2, which is encoded by the PLA2G7 gene, is a plasma enzyme bound to lipoproteins. The serum concentration of Lp-PLA2 was higher in COPD patients than in healthy smokers (Fig. 2A). In addition, the expression of Lp-PLA2 increased in correlation with GOLD stage (Fig. 2B). Next, we analyzed the relationship between the expression of Lp-PLA2 and the clinical characteristics of COPD patients. Analysis of the relationship between Lp-PLA2 level and pulmonary function showed that Lp-PLA2 level was negatively correlated with FEV1/FVC (r = -0.528, P < 0.001) (Fig. 2C).
The Global Initiative for Chronic Obstructive Lung Disease (GOLD) states that in addition to the assessment of lung function, a comprehensive assessment of the clinical symptoms, acute exacerbations, and comorbidities of COPD is required. The CAT and the mMRC are widely used to assess the clinical symptoms of COPD patients [11]. We analyzed the relationship between Lp-PLA2 levels and the CAT and mMRC scores. As shown in Fig. 2D, Lp-PLA2 levels were positively correlated with the mMRC score (r = 0.339, P < 0.001) and CAT score (r = 0.339, P < 0.001).
Malnutrition impairs exercise capacity, muscle function, and lung function, and it increases exacerbations and mortality [12]. Body mass index (BMI) and fat-free mass index (FFMI) are used to assess nutritional status, and are decreased in COPD patients [13]. As shown in Fig. 2E, Lp-PLA2 levels were negatively correlated with BMI (r = -0.312, P = 0.002) and FFMI (r = -0.336, P = 0.002).
Collectively, these findings suggested that Lp-PLA2 increased significantly in correlation with disease progression and is an important biomarker in COPD patients.

Lp-PLA2 level effectively evaluates exercise tolerance
Reduced exercise tolerance is one of the main clinical features of COPD. It increases the frequency of acute exacerbations and all-cause mortality, leading to a poor prognosis [1]. The 6-min walk distance (6MWD) assesses the exercise tolerance of COPD patients [2]. As shown in Fig. 3A, Lp-PLA2 levels were negatively correlated with 6MWD (r = -0.578, P = 0.002).
Because finding an appropriate site is difficult (a 30 m flat course is required, and the layout of the track may influence the performance), the 6MWT is not common in primary medical institutions. We found that Lp-PLA2 level is highly correlated with 6MWD. Therefore, we explored whether Lp-PLA2 level can predict a poor 6MWD. The sit-to-stand test (STST) is widely used to indirectly evaluate exercise tolerance [14]. Therefore, we compared the efficacy of the STST and Lp-PLA2 levels for predicting a poor 6MWD. As shown in Fig. 3B, the AUC of the 5STS score predicting a poor 6MWD (6MWD <350 m) was 0.728, the AUC of the 30STS score was 0.750, and the AUC of the Lp-PLA2 level was 0.796. The cutoff values of Lp-PLA2 level, 5STS, and 30STS scores were 133.7 ng/mL, 23.5, and 6.42, respectively. The sensitivity and specificity for predicting a poor 6MWD based on the cutoff value of the Lp-PLA2 level were 88.57% and 61.40%, respectively. The sensitivity and specificity for predicting a poor 6MWD according to the cutoff value of the 5STS score were 71.43% and 65.38%, respectively. The sensitivity and specificity for predicting a poor 6MWD based on the cutoff value of the 30STS score were 82.14% and 55.56%, respectively. These results suggested that the predictive power of Lp-PLA2 level is higher than that of the STST modes, suggesting its potential for use in research and clinical practice.
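The quantities reported above (AUC, cutoff value, sensitivity, and specificity) can be illustrated with a minimal sketch. It assumes that a higher marker level indicates a poor 6MWD and that the cutoff is chosen by maximizing the Youden index; the cutoff selection rule is an assumption made for illustration, and the eight marker values and labels are hypothetical rather than patient data.

```python
from typing import Sequence, Tuple


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the probability that a positive case outscores a negative one
    (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sens_spec(scores: Sequence[float], labels: Sequence[int],
              cutoff: float) -> Tuple[float, float]:
    """Sensitivity and specificity when score >= cutoff is called positive."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < cutoff)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def youden_cutoff(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Observed score maximizing sensitivity + specificity - 1
    (the lowest such score if several tie)."""
    best_cut, best_j = min(scores), -1.0
    for c in sorted(set(scores)):
        se, sp = sens_spec(scores, labels, c)
        if se + sp - 1.0 > best_j:
            best_cut, best_j = c, se + sp - 1.0
    return best_cut


# Hypothetical marker levels (ng/mL); label 1 means 6MWD < 350 m.
levels = [110, 125, 130, 140, 150, 160, 170, 180]
poor = [0, 0, 1, 0, 1, 1, 0, 1]
area = auc(levels, poor)
cut = youden_cutoff(levels, poor)
se, sp = sens_spec(levels, poor, cut)
```

A cutoff chosen this way trades specificity for sensitivity whenever the two classes overlap, which is the same pattern as the high sensitivity and moderate specificity reported for Lp-PLA2.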

Construction of a nomogram to predict impaired exercise tolerance
We combined the traditional clinical features of age, grade, FEV1/FVC, BMI, CAT score, and mMRC score with Lp-PLA2 level to construct a nomogram model to predict impaired exercise tolerance (6MWD <350 m) in COPD patients (Fig. 4A). Calibration plots were used to visualize the performances of the nomograms. The calibration plot confirmed the performance of our model (Fig. 4B). To demonstrate the clinical advantages of the nomogram model, we compared the ROC curves of the single variables against the nomogram curve. The nomogram model had the highest AUC value (Fig. 4C). Finally, decision curve analysis (DCA) was used to confirm the findings. Compared with a single clinical variable, the combined nomogram model showed the highest efficacy for 6MWD <350 m predictions (Fig. 4D). These methods confirmed the clinical utility of our nomogram model.
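Decision curve analysis compares models by their net benefit across threshold probabilities. The sketch below uses the standard net benefit definition, TP/n - (FP/n) x pt/(1 - pt), and assumes that predicted risks are already available from a fitted model. The function names, predicted risks, and outcomes are hypothetical and do not reproduce the nomogram in Fig. 4.

```python
from typing import Sequence


def net_benefit(probs: Sequence[float], labels: Sequence[int],
                threshold: float) -> float:
    """Net benefit when patients with predicted risk >= threshold are flagged."""
    n = len(labels)
    tp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - fp / n * threshold / (1.0 - threshold)


def net_benefit_flag_all(labels: Sequence[int], threshold: float) -> float:
    """Reference strategy: every patient is flagged as having poor tolerance."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical predicted risks of 6MWD < 350 m and observed outcomes.
risks = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.6, 0.1]
outcomes = [1, 1, 1, 0, 0, 0, 0, 1]
nb_model = net_benefit(risks, outcomes, threshold=0.5)
nb_all = net_benefit_flag_all(outcomes, threshold=0.5)
```

A model is considered useful at a given threshold when its net benefit exceeds both the flag-all strategy and the flag-none strategy, whose net benefit is zero.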

Discussion
Early detection of impaired exercise tolerance would assist doctors in providing personalized treatment for COPD patients, thereby improving prognosis. The 6MWT is the most common method to assess exercise tolerance. The STST was introduced to indirectly assess exercise tolerance. In recent studies, hematological indexes such as neutrophil-to-lymphocyte ratio [15] and serum bilirubin level [16] showed a better correlation with exercise tolerance. These findings suggest the potential clinical value of hematological markers for predicting impaired exercise tolerance.
In this study, analysis of DEGs and WGCNA were combined to identify potential biomarkers related to the severity of COPD. PLA2G7 was selected for further analysis based on validation in independent datasets. We found that PLA2G7 levels were significantly correlated with BMI and pulmonary function, and were increased in the blood and alveolar macrophages of COPD patients. Mechanistically, PLA2G7 may function via several immune-related pathways and macrophages. The present study is the first report demonstrating the clinical value of PLA2G7 in COPD and suggesting its potential as a biomarker for COPD.
To explore the clinical application value of PLA2G7, we recruited COPD patients and healthy smokers, and detected the expression levels of the PLA2G7-encoded protein Lp-PLA2. Lp-PLA2 is mainly produced by macrophages and activated platelets [17,18]. Lp-PLA2 functions as an inflammatory biomarker for cardiovascular and cerebrovascular diseases [19,20]. Systemic inflammation in COPD is a risk factor for reduced exercise tolerance [3,21]. In this study, Lp-PLA2 was upregulated in COPD patients and increased along with GOLD stage. It was significantly correlated with clinical symptoms, nutritional status, and exercise tolerance in COPD patients. These results suggest that Lp-PLA2 is a potential biomarker for COPD. Because Lp-PLA2 level was highly correlated with 6MWD, we explored the ability of Lp-PLA2 level to predict a poor 6MWD. The 6MWT is a measure of exercise capacity [2]. However, the 6MWT requires a suitable venue and professional training, and it is thus difficult to use in the busy clinical setting.
Developing a screening method that is highly accurate, simple, and can be performed in any medical facility is important. Studies from our team and other groups show that the STST is an effective method to indirectly assess exercise tolerance [10,22]. In this study, we compared the ability of the STST modes and Lp-PLA2 levels to predict a poor 6MWD. Lp-PLA2 level had the highest AUC value (0.796), highest sensitivity (88.57%), and moderate specificity (61.40%), indicating that it is a useful predictor of a poor 6MWD compared with the 5STS and the 30STS. A nomogram model constructed by combining traditional clinical features and Lp-PLA2 level further improved the predictive ability (AUC: 0.884). The present results suggest that serum Lp-PLA2 levels and the nomogram model are simple and accurate methods to predict exercise tolerance in patients with COPD.
The present study had several limitations. First, we only detected the expression of Lp-PLA2 in the serum, and its expression level in sputum and alveolar lavage fluid remains unknown. These two kinds of samples are more closely related to airway inflammation. In addition, the clinical application of Lp-PLA2 as a potential biomarker needs to be further assessed and subjected to external verification. These data will be provided in a future study.

Conclusion
Lp-PLA2 is a promising biomarker for COPD patients, and it is suitable for predicting poor exercise tolerance in clinical practice.

Declarations
Ethics approval and consent to participate
All procedures performed in studies involving humans were reviewed and permitted by The First Hospital of China Medical University.

Consent for publication
NA

Availability of data and materials
All data generated or analyzed during this study are included in this article.