2.1 Data acquisition
In total, 497 LUAD samples with associated clinical data, including sex, age, number of pack-years smoked, T stage, N stage, M stage, vital signs, epidermal growth factor receptor (EGFR) status, anaplastic lymphoma kinase (ALK) status, and BUB1 gene expression data (Table 1), were downloaded from the TCGA data portal (https://tcga-data.nci.nih.gov/tcga/). RNA-sequencing data in the fragments per kilobase per million mapped reads (FPKM) format were converted into the standardized transcripts per million mapped reads (TPM) format using R software (V.3.6.2). To eliminate technical errors in the sequencing data, the GSE13213, GSE50081, and GSE37745 datasets were downloaded from the National Center for Biotechnology Information (https://www.ncbi.nlm.nih.gov/) for comparison with the TCGA dataset. In addition, tissue microarrays containing 96 LUAD samples and 81 adjacent normal tissues (HLugA180Su08) were purchased from Outdo Biotech (Shanghai, China). Detailed clinical features of the immunohistochemical samples are shown in Table 3.
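The conversion from fragments per kilobase per million mapped reads to transcripts per million can be illustrated with a minimal Python sketch. It assumes the usual per-sample rescaling, in which each gene's value is divided by the sample total and multiplied by one million; the function name and the expression values are hypothetical and are not taken from the study, which performed this step in R.

```python
def fpkm_to_tpm(fpkm):
    """Convert one sample's FPKM values to TPM.

    Assumed formula: TPM_i = FPKM_i / sum_j(FPKM_j) * 1e6
    """
    total = sum(fpkm.values())
    if total <= 0:
        raise ValueError("sample has no expressed genes")
    return {gene: value / total * 1e6 for gene, value in fpkm.items()}


# Hypothetical FPKM values for a single sample (illustrative only).
sample = {"BUB1": 5.0, "GAPDH": 900.0, "EGFR": 95.0}
tpm = fpkm_to_tpm(sample)
```

After conversion, the values in each sample sum to one million, which is what makes TPM values comparable across samples.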
2.2 Construction and evaluation of a prognostic model
In this study, a nomogram was created in R using the nomogram function from the rms package (version 6.0-1). C-index and calibration curve analyses were performed using the Hmisc R package (version 4.4-1). Nomograms were evaluated using calibration plots and C-indexes, which compared nomogram-predicted probabilities with observed outcomes. A C-index of 1 indicates perfect prediction accuracy, whereas a C-index of 0.5 indicates a model no better than random chance.
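To make the interpretation of the C-index concrete, the following Python sketch computes Harrell's concordance index for right-censored survival data. It is an illustration of the general definition, not the Hmisc or rms implementation used in the study; the function name and the toy data are hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored data.

    A pair (i, j) is comparable when subject i had an observed event
    strictly before subject j's follow-up time. The pair is concordant
    when subject i also has the higher predicted risk. Tied risk
    scores count as half-concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (illustrative only): follow-up time, event indicator (1 = death).
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]

perfect = concordance_index(times, events, [0.9, 0.7, 0.5, 0.1])
reversed_order = concordance_index(times, events, [0.1, 0.5, 0.7, 0.9])
uninformative = concordance_index(times, events, [0.5, 0.5, 0.5, 0.5])
```

In this toy example, risk scores that order patients exactly by survival time give a C-index of 1, the reversed ordering gives 0, and a constant score gives 0.5.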
2.3 Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses
In the current study, we used Gene Expression Profiling Interactive Analysis to identify genes that were highly correlated with the expression level of BUB1 in LUAD. Genes with an absolute correlation coefficient (|r|) greater than or equal to 0.65 were considered BUB1-related genes. Subsequently, we submitted these BUB1-related genes to Metascape (http://metascape.org) for GO and KEGG analyses to predict the potential biological functions of BUB1. Enrichment terms with a P value less than 0.01, a minimum count of 3, and an enrichment factor greater than 1.5 were considered significant. In addition, gene set enrichment analysis (GSEA) was performed to predict significant functional differences between the low and high BUB1 expression groups using the clusterProfiler package (3.8.0) in R software and the Molecular Signatures Database (MSigDB) Collection (c2.cp.v7.2.symbol.gmt). Results with a P value less than 0.05, absolute normalized enrichment score (|NES|) greater than 1, and false discovery rate (FDR) less than 0.25 were considered significantly enriched.
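The three GSEA significance criteria stated above can be expressed as a single filter. The Python sketch below is illustrative only: the record type, the function name, and the gene set labels and values are hypothetical placeholders, not results from the study or output of clusterProfiler.

```python
from dataclasses import dataclass


@dataclass
class GseaResult:
    gene_set: str
    nes: float      # normalized enrichment score
    p_value: float
    fdr: float      # false discovery rate


def is_significant(result, p_cut=0.05, nes_cut=1.0, fdr_cut=0.25):
    """Apply the thresholds from the text: P < 0.05, |NES| > 1, FDR < 0.25."""
    return (
        result.p_value < p_cut
        and abs(result.nes) > nes_cut
        and result.fdr < fdr_cut
    )


# Placeholder records (made-up values, for illustration only).
results = [
    GseaResult("SET_A", nes=1.8, p_value=0.001, fdr=0.01),
    GseaResult("SET_B", nes=0.9, p_value=0.010, fdr=0.05),   # fails |NES| > 1
    GseaResult("SET_C", nes=-1.4, p_value=0.030, fdr=0.20),  # negative NES passes
    GseaResult("SET_D", nes=1.5, p_value=0.040, fdr=0.30),   # fails FDR < 0.25
]
significant = [r for r in results if is_significant(r)]
```

Taking the absolute value of the NES keeps gene sets enriched in either the high or the low BUB1 expression group.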
2.4 Single-cell function and pathway enrichment of BUB1
We used CancerSEA to analyze the functional state of BUB1 in LUAD and other cancer types. CancerSEA is a dedicated database designed to comprehensively explore the different functional states of cancer cells at the single-cell level. Cancer-related single-cell sequencing data for human samples in CancerSEA were derived from 72 datasets in the Sequence Read Archive, GEO, and ArrayExpress. CancerSEA was therefore used to examine the functional correlation of BUB1 with LUAD. Correlations between BUB1 and functional states in individual single-cell datasets were considered significant at an FDR of less than 0.05 and a correlation coefficient greater than or equal to 0.2.
2.5 Immune infiltration analysis
To investigate the association between BUB1 and immune cell infiltration, single-sample GSEA (ssGSEA) with the GSVA package was used to detect the correlation between the relative proportions of different types of infiltrating immune cells in the tumor microenvironment and BUB1 expression. Spearman correlations were employed to evaluate the relationships between BUB1 expression and the infiltration of 24 types of immune cells, and lollipop charts were used to show correlations.
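For readers unfamiliar with the Spearman correlation used here, the following Python sketch shows the underlying calculation: both variables are converted to ranks (tied values receive their average rank), and the Pearson correlation of the ranks is returned. The function names and input values are hypothetical; the study itself performed this analysis in R.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next sorted value is tied.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant input")
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical expression values and enrichment scores (illustrative only).
expression = [1.2, 2.5, 3.1, 4.8]
enrichment = [0.10, 0.20, 0.25, 0.90]
rho = spearman(expression, enrichment)
```

Because only the ranks matter, any strictly increasing relationship yields a coefficient of 1, whether or not it is linear.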
2.6 Analysis of methylation
The methylation data for BUB1 were downloaded from the cBioPortal web platform (https://www.cbioportal.org/). The correlation between BUB1 methylation level and BUB1 gene expression (Spearman and Pearson correlations) was evaluated. MethSurv was used to analyze the prognostic value of BUB1 methylation in LUAD.
2.7 LUAD cell lines and cell culture
MRC-5 human embryonic lung fibroblasts, MRC-5 culture medium, and the NSCLC cell lines H1299, H1975, A549, H1650, and PC9 were obtained from Procell Life Science & Technology (Wuhan, China). A549 and PC9 cells were cultured in high-glucose Dulbecco’s modified Eagle’s medium (Hyclone, USA) containing 10% fetal bovine serum (FBS; Biological Industries, Israel) and 1% penicillin/streptomycin (Biological Industries), whereas other NSCLC cell lines were grown in RPMI-1640 medium (Hyclone) supplemented with 10% FBS and 1% penicillin/streptomycin. All cells were incubated in a humidified incubator with 5% CO2 at 37°C.
2.8 RNA extraction and reverse transcription quantitative polymerase chain reaction (RT-qPCR)
RNA was extracted using TRIzol reagent (TaKaRa, Japan) according to the manufacturer’s instructions. The extracted RNA was then reverse transcribed with a reverse transcription kit (TaKaRa) to yield cDNA. Subsequently, RT-qPCR was performed to measure the expression of BUB1 using TB Green Premix Ex Taq II (TaKaRa). Relative gene expression levels were calculated using the 2^−ΔΔCt method. The primers used were as follows: BUB1-forward, 5′-AGCCCAGACAGTAACAGACTC-3′; BUB1-reverse, 5′-GTTGGCAACCTTATGTGTTTCAC-3′; glyceraldehyde 3-phosphate dehydrogenase (GAPDH)-forward, 5′-AGGTCGGTGTGAACGGATTTG-3′; GAPDH-reverse, 5′-TGTAGACCATGTAGTTGAGGTCA-3′.
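The relative quantification step can be sketched in Python as follows. The sketch assumes the standard form of the method, in which the target gene Ct is first normalized to the reference gene (GAPDH here) within each condition and the treated sample is then compared with the control. The function name and all Ct values are hypothetical.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of the target gene by the 2^(-ddCt) method."""
    # dCt: target Ct normalized to the reference gene in each condition.
    delta_ct_sample = ct_target_sample - ct_ref_sample
    delta_ct_control = ct_target_control - ct_ref_control
    # ddCt: normalized sample relative to normalized control.
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2 ** (-delta_delta_ct)


# Hypothetical Ct values (illustrative only).
fold_change = relative_expression(
    ct_target_sample=26.0,   # BUB1 in knockdown cells
    ct_ref_sample=18.0,      # GAPDH in knockdown cells
    ct_target_control=24.0,  # BUB1 in control cells
    ct_ref_control=18.0,     # GAPDH in control cells
)
```

With these made-up values, ΔΔCt is 2 and the relative expression is 0.25, that is, a fourfold reduction relative to the control.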
2.9 Western blot analysis
Cells were lysed using RIPA buffer to extract proteins. The proteins were then subjected to sodium dodecyl sulfate polyacrylamide gel electrophoresis on 10% gels and transferred to polyvinylidene difluoride membranes (Millipore, USA). The membranes were then incubated in 3% bovine serum albumin and probed with appropriate primary antibodies. Immunoblotting images were collected on a Bio-Rad system after incubation with secondary antibodies. The antibodies used in this study were as follows: rabbit anti-GAPDH (cat. no. 10494-1-AP, 1:20000), anti-BUB1 (cat. no. 13330-1-AP, 1:2000), anti-E-cadherin (cat. no. 20874-1-AP, 1:5000), anti-N-cadherin (cat. no. 22018-1-AP, 1:4000), anti-cyclinB1 (cat. no. 55004-1-AP, 1:2000), anti-AKT (cat. no. 10176-2-AP, 1:4000), anti-phosphatidylinositol 3-kinase (PI3K; cat. no. 20584-1-AP, 1:300), anti-phospho-AKT (cat. no. 28731-1-AP, 1:3000), and anti-rabbit secondary antibodies (cat. no. B900210, 1:10000) purchased from Proteintech (Wuhan, China); and anti-phospho-PI3K (cat. no. AF3242, 1:1000) purchased from Affinity Biosciences (Cincinnati, OH, USA).
2.10 Downregulation of BUB1
LUAD cell lines were infected with a knockdown lentivirus (sh-BUB1, TCCTACACTTCCTGATATT) and corresponding negative control lentivirus (sh-NC, TTCTCCGAACGTGTCACGT), purchased from Hanbio Tech (Shanghai, China). The cells were plated in 6-well plates at 1 × 10^5 cells/well and incubated for 8 h. Subsequently, depending on the multiplicity of infection value corresponding to the cells, the appropriate amount of virus suspension was added to the 6-well plate and incubated with the cells for 6 h. After replacing the medium with fresh medium and incubating for 48 h, stably transfected clones were screened with puromycin. RT-qPCR and western blot assays were used to confirm the transfection efficiency.
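The volume of virus suspension implied by a given multiplicity of infection (MOI) follows from simple arithmetic, sketched below in Python. The MOI and viral titer in the example are hypothetical, because the source does not report them; only the seeding density of 1 × 10^5 cells per well comes from the text.

```python
def virus_volume_ul(cell_count, moi, titer_tu_per_ml):
    """Volume of virus suspension (in microliters) for a target MOI.

    MOI = transducing units added / number of cells, so
    volume (mL) = cell_count * MOI / titer (TU/mL).
    """
    if titer_tu_per_ml <= 0:
        raise ValueError("titer must be positive")
    volume_ml = cell_count * moi / titer_tu_per_ml
    return volume_ml * 1000.0


# 1e5 cells per well is from the text; the MOI and titer are assumed values.
volume = virus_volume_ul(cell_count=1e5, moi=10, titer_tu_per_ml=1e8)
```

Under these assumed values, about 10 μL of virus suspension would be added per well.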
2.11 MTS assays
CellTiter 96 AQueous One Solution Cell Proliferation Assays (Promega, Madison, WI, USA) were employed to measure cell proliferation activity. After transfection, cells were seeded in 96-well plates at a density of 1000 cells/well and cultured at 37°C for 24, 48, 72, or 96 h. Next, 10 μL MTS solution was added to 90 μL RPMI 1640 medium in each well of a 96-well plate, and plates were incubated for an additional 30 min. Subsequently, a microplate reader (Bio-Rad Laboratories) was used to measure the absorbance at 490 nm. Additionally, H1299 and H1975 cells were cultured in complete medium containing 0, 5, or 10 μM 2OH-BNPP1 (a BUB1 inhibitor; MedChemExpress, NJ, USA) in 96-well plates, and absorbance was measured as described above.
2.12 Wound healing assays
When the density of cells in six-well plates reached 80–90%, the cell monolayers were scraped using a 200-μL sterile pipette tip. Cells were cultured in medium containing 3% FBS and photographed under a microscope at 0, 24, and 48 h. In addition, to observe the effects of 2OH-BNPP1 on LUAD cell migration, we treated H1299 and H1975 cells with medium containing 3% FBS plus 0, 5, or 10 μM 2OH-BNPP1 and photographed at 0, 24, and 36 h.
2.13 Transwell assays
Two hundred microliters of cell suspension in serum-free medium containing 1 × 10^4 cells was added to the upper chambers of Transwell inserts (8-μm pore size; Corning, NY, USA). The inserts were placed in 24-well plates containing 700 μL of medium with 20% FBS as a chemoattractant. For analysis of invasion, Matrigel (BD, NJ, USA) was added to the upper chambers. After incubation in a cell incubator for 48 h, the cells were fixed with 4% paraformaldehyde and then stained with 0.1% crystal violet solution (Solarbio, Beijing, China). Finally, the cells remaining in the upper chamber were wiped off with a cotton swab, and the inserts were dried naturally and photographed under a light microscope.
2.14 Clone formation assay
Cells were seeded in 6-well plates at 500 cells/well and incubated for approximately 2 weeks. After individual cell clones formed clusters of more than 50 cells, the cells were fixed and stained. Finally, the cell clusters in each well were photographed and counted.
2.15 Flow cytometry
After transfection or drug treatment, apoptosis was detected using an Annexin V-FITC Apoptosis kit (Beyotime, Shanghai, China). Apoptosis was then measured by flow cytometry analysis (CytoFLEX S; Beckman Coulter, USA), and the data were analyzed using CytExpert 2.4.
2.16 Immunohistochemistry
Tissue slides were analyzed using immunohistochemistry. Antigen retrieval was performed by boiling the slides in citrate buffer (pH 6.0) for 10 min, followed by cooling at room temperature for 20 min. The slides were incubated at 4°C with anti-BUB1 primary antibodies (Proteintech; cat. no. 13330-1-AP, 1:300 dilution) overnight, followed by anti-rabbit peroxidase-conjugated secondary antibodies (1:500). Subsequently, scoring was performed as previously reported. The staining score was determined based on the staining intensity and the percentage of positive staining. The intensity of staining was scored as 0 (no staining), 1 (weak), 2 (medium), or 3 (strong). Percentage scores were assigned as 0 (< 5%), 1 (5–25%), 2 (26–50%), 3 (51–75%), or 4 (76–100%). The staining score for each sample was assessed independently by two skilled pathologists.
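The scoring scheme can be sketched in Python as follows. The percentage bins are taken from the text. How the two components are combined is not stated here, so the sketch assumes the commonly used product of intensity and percentage score (range 0 to 12); that combination rule and the function names are assumptions, not the authors' documented method.

```python
def percentage_score(percent_positive):
    """Map the percentage of positively stained cells to a 0-4 score."""
    if not 0 <= percent_positive <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if percent_positive < 5:
        return 0
    if percent_positive <= 25:
        return 1
    if percent_positive <= 50:
        return 2
    if percent_positive <= 75:
        return 3
    return 4


def staining_score(intensity, percent_positive):
    """Combine intensity (0-3) and percentage score (0-4).

    ASSUMPTION: the final score is the product of the two components.
    """
    if intensity not in (0, 1, 2, 3):
        raise ValueError("intensity must be 0, 1, 2, or 3")
    return intensity * percentage_score(percent_positive)


# Hypothetical sample: strong staining in 80% of cells.
example_score = staining_score(intensity=3, percent_positive=80)
```

Under the assumed product rule, strong staining in 80% of cells gives the maximum score of 12.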
2.17 Statistical analysis
Statistical analyses were carried out using R software (v3.6.3), GraphPad Prism 8.0.2 (San Diego, CA, USA), and ImageJ (1.8.0). Wilcoxon signed-rank and Wilcoxon rank-sum tests were employed to analyze BUB1 expression in paired and nonpaired samples, respectively. The associations between clinicopathological features and BUB1 expression were evaluated using Wilcoxon signed-rank tests or Kruskal-Wallis tests. Receiver operating characteristic (ROC) curves were generated using the pROC and ggplot2 R packages. Correlation analysis was performed using Spearman tests. Survival curves were estimated using the Kaplan-Meier method, and differences between groups were assessed using log-rank tests. Univariate Cox analysis was used to identify potential prognostic factors, and multivariate Cox analysis was used to determine whether BUB1 was an independent risk factor for overall survival (OS) in patients with LUAD. Differences between groups were analyzed using t-tests, and results with P values less than 0.05 were considered significant. Mann-Whitney tests were used to analyze differences in BUB1 expression between LUAD and adjacent lung tissues.
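As an illustration of the Kaplan-Meier method mentioned above, the following Python sketch computes the product-limit survival estimate from follow-up times and event indicators. It is a didactic sketch, not the R code used in the study, and the function name and data are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  : follow-up time for each patient
    events : 1 if the event (death) was observed, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Process everyone who shares this follow-up time together.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Both deaths and censored patients leave the risk set.
        at_risk -= removed
    return curve


# Toy cohort (illustrative only): one patient is censored at time 2, one at time 4.
curve = kaplan_meier(times=[1, 2, 2, 3, 4], events=[1, 1, 0, 1, 0])
```

For this toy cohort, the estimated survival probability falls to 0.8, 0.6, and 0.3 at times 1, 2, and 3. Censored patients reduce the number at risk without lowering the curve.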