2.1. Chemicals and Materials
Methanol and acetonitrile of high performance liquid chromatography (HPLC) grade were purchased from Fisher Scientific (Loughborough, UK). HPLC-grade formic acid was obtained from TCI (Shanghai, China), and HPLC-grade ammonium acetate from Sigma-Aldrich (Shanghai, China). 2-Chloro-L-phenylalanine was obtained from Aladdin (Shanghai, China). Distilled water was filtered through a Milli-Q system (Millipore, Bedford, USA).
2.2. Study Design and Sample Collection
This study was conducted at the Second Affiliated Hospital of Fujian Medical University from 2021 to 2022 and was approved by the hospital's Ethics Committee (reference number 2021[168]). Written informed consent was obtained from all 51 participants before blood collection. The study comprised two phases: a discovery phase with 18 patients diagnosed with TNBC and 21 HC participants, and a validation phase with 7 TNBC patients and 5 control subjects. TNBC was diagnosed according to the international consensus, based on the absence of estrogen receptor, progesterone receptor, and HER2 expression. The control cohort consisted of age-matched healthy volunteers with no prior history of breast disease, whose health status was verified by comprehensive physical examination.
Blood was collected from fasting participants into serum separation tubes. Serum was isolated by centrifugation at 3000 rpm for 5 min at 4 °C and immediately stored at −80 °C until metabolomics analysis.
2.3. Sample Preparation
Samples were thawed at 4 °C and vortexed for 1 min to homogenize them. A 100 µL aliquot of each sample was transferred to a 2 mL centrifuge tube, mixed with 400 µL of methanol pre-cooled to −20 °C, and vortexed for a further 1 min. The mixture was centrifuged at 12,000 rpm for 10 min at 4 °C to precipitate proteins. The supernatant was collected and evaporated to dryness in a vacuum centrifuge. The residue was reconstituted in 150 µL of an 80% methanol-water solution containing 2-chloro-L-phenylalanine (4 ppm), kept at 4 °C. The supernatant was then collected, filtered through a 0.22 µm membrane, and transferred to a vial for liquid chromatography-mass spectrometry (LC-MS) analysis.
Pooled quality control (QC) samples were prepared by combining equal volumes of all serum supernatants and were used to assess the stability and consistency of the analysis. To equilibrate the analytical column, the pooled QC sample was injected five consecutive times at the start of the analytical batch. Thereafter, a QC sample was injected after every six serum samples throughout the run.
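The QC injection scheme can be illustrated with a minimal Python sketch. The function name, the sample labels, and the use of 39 samples (the size of the discovery set) are assumptions made for illustration; the text does not state how serum samples were ordered within the batch or whether a closing QC injection was run.

```python
def build_injection_sequence(sample_ids, conditioning_qc=5, qc_interval=6):
    """Return a run order: conditioning QC injections first, then the
    serum samples with one pooled QC after every `qc_interval` samples."""
    sequence = ["QC"] * conditioning_qc
    for index, sample_id in enumerate(sample_ids, start=1):
        sequence.append(sample_id)
        if index % qc_interval == 0:
            sequence.append("QC")
    return sequence


# Hypothetical labels for 39 serum samples (order is an assumption).
samples = [f"S{i:02d}" for i in range(1, 40)]
run_order = build_injection_sequence(samples)
```

With 39 samples this yields 5 conditioning injections plus 6 in-run QC injections, which gives a reasonable number of QC points for monitoring signal drift.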
2.4. UHPLC–MS based metabolome profiling
Chromatographic separation was performed on a Vanquish ultra-high performance liquid chromatography (UHPLC) system (Thermo Fisher Scientific, USA) with an ACQUITY UPLC® HSS T3 column (150 × 2.1 mm, 1.8 µm; Waters, Milford, MA, USA).
Metabolomic analyses were performed in both positive (ESI+) and negative (ESI−) electrospray ionization modes. For ESI+, the mobile phases were A2 (0.1% formic acid in water) and B2 (0.1% formic acid in acetonitrile), with the following elution gradient: 0 to 1 min, 2% B2; 1 to 9 min, 2% to 50% B2; 9 to 12 min, 50% to 98% B2; 12 to 13.5 min, 98% B2; 13.5 to 14 min, 98% to 2% B2; and 14 to 20 min, 2% B2. For ESI−, the mobile phases were A3 (5 mM ammonium formate) and B3 (acetonitrile), with the following elution gradient: 0 to 1 min, 2% B3; 1 to 9 min, 2% to 50% B3; 9 to 12 min, 50% to 98% B3; 12 to 13.5 min, 98% B3; 13.5 to 14 min, 98% to 2% B3; and 14 to 17 min, 2% B3. The column oven was maintained at 40 °C, the flow rate was 0.25 mL/min, and the injection volume was 2 μL. All pre-treated serum samples were kept at 4 °C throughout the experiment.
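For readers who wish to reproduce the gradient programs, the breakpoints can be written as a simple table and interpolated. The sketch below is illustrative only: it assumes linear ramps between breakpoints, which the text does not state explicitly, and the names ESI_POS_GRADIENT, ESI_NEG_GRADIENT, and percent_b are hypothetical.

```python
# Gradient breakpoints as (time in min, % organic phase B), from the text.
ESI_POS_GRADIENT = [(0.0, 2.0), (1.0, 2.0), (9.0, 50.0), (12.0, 98.0),
                    (13.5, 98.0), (14.0, 2.0), (20.0, 2.0)]
ESI_NEG_GRADIENT = [(0.0, 2.0), (1.0, 2.0), (9.0, 50.0), (12.0, 98.0),
                    (13.5, 98.0), (14.0, 2.0), (17.0, 2.0)]


def percent_b(gradient, time_min):
    """Return % B at `time_min`, assuming linear ramps between breakpoints."""
    if not gradient[0][0] <= time_min <= gradient[-1][0]:
        raise ValueError("time is outside the programmed gradient")
    for (t0, b0), (t1, b1) in zip(gradient, gradient[1:]):
        if t0 <= time_min <= t1:
            return b0 + (b1 - b0) * (time_min - t0) / (t1 - t0)
    raise ValueError("gradient breakpoints are not in ascending order")
```

For example, percent_b(ESI_POS_GRADIENT, 5.0) falls halfway along the 1 to 9 min ramp and returns 26% B2 under the linearity assumption.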
Metabolites were detected on a Q Exactive HF-X mass spectrometer (Thermo Fisher Scientific, USA) equipped with an ESI ion source and operated in Full MS-ddMS2 mode (MS1 with data-dependent MS/MS). The parameters were as follows: sheath gas pressure, 30 arb; auxiliary gas flow, 10 arb; spray voltage, 3.50 kV for ESI(+) and −2.50 kV for ESI(−); capillary temperature, 325 °C; MS1 scan range, m/z 81 to 1000; MS1 resolving power, 60,000 FWHM; data-dependent scans per cycle, 8; MS/MS resolving power, 15,000 FWHM; normalized collision energy, 30%; and dynamic exclusion time, automatic.
2.5. Metabolomics data analysis
Raw data were converted to mzXML format with MSConvert from the ProteoWizard software suite (version 3.0.8789) [13]. Feature detection, retention time correction, and alignment were performed with XCMS. Multivariate statistical analyses, namely principal component analysis (PCA) and orthogonal partial least squares discriminant analysis (OPLS-DA), were then carried out in SIMCA-P 14.0 to separate the groups and identify candidate biomarkers of TNBC. The robustness of the OPLS-DA model was assessed with a permutation test of 200 random permutations, based on its R2 (explained variance) and Q2 (predictive ability) parameters. Discriminating metabolites were selected from the OPLS-DA model by variable importance in projection (VIP); only metabolites with VIP > 1 were considered important for the classification of TNBC. A nonparametric univariate analysis was then performed using the Mann-Whitney U test (p < 0.05) together with fold change (FC) values ≤ 0.67 or ≥ 1.5 to define differential metabolites (DMs).
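The selection thresholds can be expressed as a single filter. The following Python sketch assumes that the three criteria (VIP, Mann-Whitney p-value, and FC) are applied jointly to precomputed statistics; the class name, function name, and example values are hypothetical and do not come from the study data.

```python
from dataclasses import dataclass


@dataclass
class FeatureStats:
    name: str
    vip: float          # VIP from the OPLS-DA model
    p_value: float      # Mann-Whitney U test p-value
    fold_change: float  # group ratio; the direction (TNBC/HC) is an assumption


def is_differential(feature, vip_min=1.0, alpha=0.05, fc_low=0.67, fc_high=1.5):
    """Apply VIP > 1, p < 0.05, and FC <= 0.67 or FC >= 1.5 together."""
    fc_ok = feature.fold_change <= fc_low or feature.fold_change >= fc_high
    return feature.vip > vip_min and feature.p_value < alpha and fc_ok


# Hypothetical features used only to exercise the filter.
features = [
    FeatureStats("feature_1", vip=1.8, p_value=0.003, fold_change=2.1),
    FeatureStats("feature_2", vip=1.4, p_value=0.020, fold_change=1.2),
    FeatureStats("feature_3", vip=0.7, p_value=0.001, fold_change=0.4),
    FeatureStats("feature_4", vip=1.2, p_value=0.010, fold_change=0.5),
]
differential = [f.name for f in features if is_differential(f)]
```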
The predictive capacity of the DMs was evaluated by receiver operating characteristic (ROC) curve analysis, with the area under the ROC curve (AUC) as the measure of overall test performance. The optimal cutoff, with its corresponding sensitivity and specificity, was determined by maximizing the Youden index, calculated as sensitivity + specificity − 1 [4]. The analysis was performed in SPSS (version 22.0).
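The Youden index criterion can be sketched in a few lines of Python. This is an illustration, not the SPSS procedure: it assumes that higher marker values indicate TNBC (label 1), that each observed value is tried as a cutoff, and that ties in the index are resolved in favor of the lowest cutoff. The function name and the toy data are hypothetical.

```python
def youden_optimal_cutoff(scores, labels):
    """Return (J, cutoff, sensitivity, specificity) maximizing
    J = sensitivity + specificity - 1, calling score >= cutoff positive."""
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes are required")
    best = None
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < cutoff and y == 0)
        sensitivity = tp / positives
        specificity = tn / negatives
        j = sensitivity + specificity - 1
        if best is None or j > best[0]:
            best = (j, cutoff, sensitivity, specificity)
    return best


# Toy data: 1 = TNBC, 0 = HC (values are invented for illustration).
toy_scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
toy_labels = [0, 0, 0, 1, 0, 1, 1, 1]
j, cutoff, sensitivity, specificity = youden_optimal_cutoff(toy_scores, toy_labels)
```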
DMs were first identified by accurate mass (mass error < 30 ppm). They were then examined further using the accurate mass and high-resolution targeted MS/MS spectra, together with the known fragmentation patterns of the metabolites. Candidate structures were retrieved by searching databases (including METLIN, HMDB, and MassBank) and the literature.
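The mass accuracy criterion is the relative deviation of the observed m/z from the theoretical m/z, expressed in parts per million. A minimal Python sketch follows; the function names and the example m/z values are hypothetical.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz, theoretical_mz, tolerance_ppm=30.0):
    """True if the absolute mass error is below the tolerance (30 ppm here)."""
    return abs(ppm_error(observed_mz, theoretical_mz)) < tolerance_ppm


# An observed m/z of 100.001 against a theoretical 100.000 is about 10 ppm off.
example_error = ppm_error(100.001, 100.0)
```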
Metabolite Set Enrichment Analysis (MSEA) was performed in MetaboAnalyst 6.0 (https://www.metaboanalyst.ca) to identify metabolic pathways altered in TNBC patients compared with HC subjects.
2.6. Transcriptomics analysis
Three TNBC datasets were selected from the Gene Expression Omnibus (GEO) database: GSE65194 (55 TNBC tissue samples and 11 healthy breast tissue samples from mammoplasty), GSE45827 (11 TNBC and 5 healthy breast tissue samples), and GSE36295 (41 TNBC and 11 normal tissue samples). Differentially expressed genes (DEGs) between TNBC and normal breast tissue were identified with GEO2R, using cutoffs of |log2 FC| > 2 and adjusted p-value < 0.05. Volcano plots and Venn diagrams were generated (http://www.bioinformatics.com.cn/) to identify DEGs shared across the three datasets.
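The DEG cutoffs and the Venn-style overlap can be sketched as a set intersection in Python. The sketch assumes that each dataset has been exported as a mapping from gene symbol to (log2 FC, adjusted p-value); the gene names and values are placeholders, and the text does not state whether a consistent direction of change across datasets was also required.

```python
def select_degs(results, lfc_cutoff=2.0, padj_cutoff=0.05):
    """Keep genes with |log2 FC| > cutoff and adjusted p-value < cutoff."""
    return {
        gene
        for gene, (log2_fc, adj_p) in results.items()
        if abs(log2_fc) > lfc_cutoff and adj_p < padj_cutoff
    }


def shared_degs(*deg_sets):
    """Genes present in every dataset (the center of the Venn diagram)."""
    return set.intersection(*deg_sets)


# Placeholder results for three datasets: gene -> (log2 FC, adjusted p-value).
dataset_a = {"GENE_A": (3.1, 0.001), "GENE_B": (-2.4, 0.010), "GENE_C": (1.2, 0.001)}
dataset_b = {"GENE_A": (2.6, 0.020), "GENE_B": (-2.9, 0.030), "GENE_C": (2.5, 0.200)}
dataset_c = {"GENE_A": (2.2, 0.004), "GENE_B": (-1.8, 0.001), "GENE_C": (2.8, 0.010)}

common = shared_degs(select_degs(dataset_a), select_degs(dataset_b), select_degs(dataset_c))
```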
The shared DEGs were then subjected to gene ontology (GO) enrichment analysis, covering biological processes (BP), molecular functions (MF), and cellular components (CC), and to Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis, using the Database for Annotation, Visualization, and Integrated Discovery (DAVID, version 12.0). These analyses aimed to clarify the molecular mechanisms and pathways relevant to TNBC.
2.7. Joint analysis of metabolomics and transcriptomics
An integrative analysis was performed to explore the relationship between the DMs and DEGs identified in the metabolomic and transcriptomic analyses. The Joint-Pathway Analysis module of MetaboAnalyst 6.0 was used to construct a metabolic pathway enrichment diagram. Pathway significance was evaluated against the total number of identified metabolites, and pathways with P < 0.05 were considered significantly enriched. The KEGG database was used to identify genes involved in the significantly enriched pathways. Compound networks linking metabolites and genes were visualized in Cytoscape (version 3.9.1) with the Metscape plugin.
2.8. Validation of the expression of hub DEGs
Gene Expression Profiling Interactive Analysis (GEPIA; http://gepia.cancer-pku.cn/) is an interactive web server for analyzing RNA sequencing expression data from 9,736 tumor and 8,587 normal samples from the Cancer Genome Atlas (TCGA) and the Genotype-Tissue Expression (GTEx) projects [14]. UALCAN (http://ualcan.path.uab.edu) is a web portal for the analysis of cancer OMICS data; it supports gene expression analysis based on clinical data from TCGA as well as protein expression analysis using the Clinical Proteomic Tumor Analysis Consortium (CPTAC) Confirmatory/Discovery dataset [14, 15]. The Human Protein Atlas (HPA) database (https://www.proteinatlas.org) is an open-access repository of immunohistochemical images of both neoplastic and normal tissues [16]. Using these resources, the mRNA and protein expression of the hub genes was compared between breast cancer and normal breast tissue, with protein expression in tissue assessed from the immunohistochemistry data. Because these databases are open access, approval from a local ethics committee was not required.
2.9. Kaplan-Meier plotter database analysis
The Kaplan-Meier plotter database (www.kmplot.com) was used to assess the association between the mRNA level of each hub DEG and the prognosis of patients with TNBC. For each gene, patients were divided into high- and low-expression groups at the median expression level. The platform automatically computes hazard ratios (HR) with 95% confidence intervals (CI) and log-rank P values.
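The median-based stratification can be illustrated with a short Python sketch. It is not the Kaplan-Meier plotter implementation: the function name and patient identifiers are hypothetical, and assigning values equal to the median to the low-expression group is an assumption.

```python
from statistics import median


def split_by_median(expression):
    """Split patients into high and low groups at the median expression.
    Values equal to the median go to the low group (an assumption)."""
    cutoff = median(expression.values())
    high = sorted(p for p, value in expression.items() if value > cutoff)
    low = sorted(p for p, value in expression.items() if value <= cutoff)
    return cutoff, high, low


# Hypothetical expression values for one gene in four patients.
expression = {"P1": 1.0, "P2": 2.0, "P3": 3.0, "P4": 4.0}
cutoff_value, high_group, low_group = split_by_median(expression)
```

The two groups would then be compared by survival analysis, which the platform performs automatically.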