Patient enrollment and sample collection
A sample of 161 symptomatic individuals aged 35-70 years was included for metabolite analyses. These patients were drawn from a set of 1,058 patients who were tested for SARS-CoV-2 by RT-qPCR between March 15 and November 1, 2020 at the Zacatecas General Hospital’s Respiratory Triage Unit of the Mexican Institute of Social Security (IMSS). This is a public facility with ≈200 hospital beds located in Zacatecas, the capital city of the state of Zacatecas in central Mexico. Screened individuals were categorized into four mutually exclusive groups: Group 1: PCR-, controls (n=39); Group 2: PCR+, not hospitalized (n=40); Group 3: PCR+, hospitalized with or without oxygen mask (n=42); and Group 4: PCR+, intubated (n=40). Blood specimens for plasma analyses were collected, on average, within two days after admission. Sociodemographic, epidemiological, and clinical data of study participants by study group are provided in Table 1. The study was performed in accordance with the Declaration of Helsinki. It was also reviewed and approved by the Ethics Committee of the Comité Nacional de Investigación Científica del Instituto Mexicano de Seguridad Social, with the registration number R-2020-785-068. Informed consent was obtained from all participants. All patients included in this study were informed in writing about the collection of their samples for research purposes and were given the right to refuse such use.
Metabolomics profile of plasma samples
Targeted quantitative metabolomics was used to identify and determine the concentration of 143 different endogenous metabolites. Amino acids, biogenic amines and derivatives, and organic acids were analyzed using a custom reversed-phase liquid chromatography-tandem mass spectrometry (LC-MS/MS) assay. Glycerophospholipids, acylcarnitines, sphingomyelins, and glucose were measured by direct injection (DI). Mass spectrometric analyses were performed on an ABSciex 4000 Qtrap tandem MS instrument (Applied Biosystems/MDS Analytical Technologies, Foster City, CA, USA) equipped with an Agilent 1260 series UHPLC system (Agilent Technologies, Palo Alto, CA, USA). The custom assay contained a 96-deep-well plate with a filter plate attached with sealing tape; reagents and solvents were used to prepare the plate assay. The first 14 wells were used for one blank, three zero samples, seven standards, and three quality control samples. Details of the assay have been published previously 15.
Sample preparation
For organic acid analyses, 150 µL of ice-cold methanol and 10 µL of isotope-labeled internal standard mixture were added to 50 µL of plasma sample for overnight protein precipitation at -20°C, followed by centrifugation at 13,000 × g for 20 min. A total of 50 µL of supernatant was loaded into the center of a 96-deep-well plate, followed by the addition of a 3-nitrophenylhydrazine reagent. After incubation for 2 hours, butylated hydroxytoluene stabilizer (2 mg/mL) and water were added before the LC-MS injection.
For amino acids and biogenic amines and derivatives, glycerophospholipids, acylcarnitines, and sphingomyelins, samples were thawed on ice and subsequently vortexed and centrifuged at 13,000 × g; 10 µL of each sample was then loaded onto the center of the filter on the upper 96-well plate and dried in a stream of nitrogen. Subsequently, phenyl-isothiocyanate was added for derivatization. After incubation, the filter spots were dried again using an evaporator. Metabolite extraction was then achieved by adding 300 µL of extraction solvent. Extracts were obtained by centrifugation into the lower 96-deep-well plate, followed by a dilution step with MS running solvent (0.2% formic acid in water, 0.2% formic acid in acetonitrile).
LC-MS/MS method
An Agilent reversed-phase Zorbax Eclipse XDB C18 column (3.0 mm × 100 mm, 3.5 μm particle size, 80 Å pore size) with a Phenomenex (Torrance, CA, USA) SecurityGuard C18 pre-column (4.0 mm × 3.0 mm) was used. LC parameters used were as follows: mobile phase A was 0.2% (v/v) formic acid in water, and mobile phase B was 0.2% (v/v) formic acid in acetonitrile. The gradient profile was: t=0 min, 0% B; t=0.5 min, 0% B; t=5.5 min, 95% B; t=6.5 min, 95% B; t=7.0 min, 0% B; and t=9.5 min, 0% B. The column oven was set at 50°C. The flow rate was 500 μL/min, and the sample injection volume was 10 μL.
For the analysis of organic acids, the mobile phases used were A) 0.01% (v/v) formic acid in water, and B) 0.01% (v/v) formic acid in methanol. The gradient profile was as follows: t = 0 min, 30% B; t = 2.0 min, 50% B; t = 12.5 min, 95% B; t = 12.51 min, 100% B; t = 13.5 min, 100% B; t = 13.6 min, 30% B and finally maintained at 30% B for 4.4 min. The column oven was set to 40°C. The flow rate was 300 μL/min, and the sample injection volume was 10 μL.
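The two gradient programs above can be written as time tables and queried for the mobile-phase composition at any time point. The sketch below is illustrative only: it assumes linear ramps between the listed time points (the usual pump behavior, but not stated in the text), and the names `GRADIENT_FORMIC_ACN`, `GRADIENT_ORGANIC_ACIDS`, and `percent_b` are hypothetical.

```python
from bisect import bisect_right

# Gradient tables transcribed from the text as (time in min, % mobile phase B).
# First method: 0.2% formic acid in water (A) / acetonitrile (B).
GRADIENT_FORMIC_ACN = [
    (0.0, 0.0), (0.5, 0.0), (5.5, 95.0), (6.5, 95.0), (7.0, 0.0), (9.5, 0.0),
]
# Organic acid method: 0.01% formic acid in water (A) / methanol (B).
# The final hold at 30% B for 4.4 min ends at 13.6 + 4.4 = 18.0 min.
GRADIENT_ORGANIC_ACIDS = [
    (0.0, 30.0), (2.0, 50.0), (12.5, 95.0), (12.51, 100.0),
    (13.5, 100.0), (13.6, 30.0), (18.0, 30.0),
]


def percent_b(gradient, t):
    """Return %B at time t (min), assuming linear ramps between table entries."""
    times = [point[0] for point in gradient]
    if t <= times[0]:
        return gradient[0][1]
    if t >= times[-1]:
        return gradient[-1][1]
    i = bisect_right(times, t)           # index of the first entry after t
    (t0, b0), (t1, b1) = gradient[i - 1], gradient[i]
    return b0 + (b1 - b0) * (t - t0) / (t1 - t0)
```

For example, `percent_b(GRADIENT_FORMIC_ACN, 3.0)` falls halfway along the 0.5 to 5.5 min ramp.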
DI-MS/MS method
The LC autosampler was connected directly to the MS ion source by red PEEK tubing. The mobile phase was prepared by mixing 60 μL of formic acid, 10 mL of water, and 290 mL of methanol. The flow rate was programmed as follows: t=0 min, 30 μL/min; t=1.6 min, 30 μL/min; t=2.4 min, 200 μL/min; t=2.8 min, 200 μL/min; and t=3.0 min, 30 μL/min. The sample injection volume was 20 μL.
Quantification
To quantify organic acids, amino acids, and biogenic amines and derivatives, an individual seven-point calibration curve was generated for each analyte. The ratio of each analyte’s signal intensity to that of its corresponding isotope-labeled internal standard was plotted against the known concentrations and fitted by quadratic regression with 1/x² weighting.
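As an illustration of this kind of calibration, the following minimal sketch fits a quadratic curve with 1/x² weighting and back-calculates a concentration from a measured ratio. It is not the vendor software's implementation; the function names, the calibrator concentrations used in any example, and the rule of picking the root inside the calibration range are assumptions made here.

```python
import numpy as np


def fit_calibration(conc, ratio):
    """Fit ratio = a*conc**2 + b*conc + c with 1/x^2 weighting.

    np.polyfit applies `w` to the unsquared residuals, so w = 1/x
    corresponds to a 1/x^2 weight on the squared residuals.
    Calibrator concentrations must be strictly positive.
    """
    conc = np.asarray(conc, dtype=float)
    ratio = np.asarray(ratio, dtype=float)
    return np.polyfit(conc, ratio, deg=2, w=1.0 / conc)


def back_calculate(coeffs, ratio, lo, hi):
    """Invert the curve for one measured analyte/internal-standard ratio.

    Assumption: the physically meaningful root is the real one lying
    inside the calibration range [lo, hi].
    """
    a, b, c = coeffs
    roots = np.roots([a, b, c - ratio])
    real = roots[np.isreal(roots)].real
    in_range = real[(real >= lo) & (real <= hi)]
    if in_range.size == 0:
        raise ValueError("ratio is outside the calibrated range")
    return float(in_range[0])
```

The 1/x² weighting gives the low-concentration calibrators more influence on the fit, which generally improves accuracy near the lower limit of quantification.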
Lipids, acylcarnitines, and glucose were analyzed semi-quantitatively. For each group of compounds sharing the same core structure, a single-point calibration was built from a representative analyte, assuming a linear regression through zero.
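A single-point calibration through zero reduces to one response factor per compound class. The sketch below shows the arithmetic under that assumption; the function names and the numbers in the usage note are hypothetical and do not come from the study.

```python
def response_factor(calibrator_signal, calibrator_conc):
    """Slope of the line through the origin and the single calibrator point."""
    if calibrator_conc <= 0:
        raise ValueError("calibrator concentration must be positive")
    return calibrator_signal / calibrator_conc


def semi_quantify(signal, factor):
    """Estimate concentration of any analyte in the class from its signal."""
    return signal / factor
```

For instance, a representative calibrator giving a signal of 2.0 at a concentration of 10.0 yields a factor of 0.2, so an analyte of the same class with a signal of 0.5 is estimated at 2.5 in the same concentration units.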
All metabolite analyses were done using Analyst 1.6.2 and MultiQuant 3.0.3.
Statistical analysis
Frequencies and proportions stratified by study group were used to describe nominal variables. Since continuous data were not normally distributed, medians with quartiles 1 (Q1) and 3 (Q3) were used as measures of central tendency and dispersion, stratified by study group.
Metabolites with >50% missing values were removed from further analysis (n=33). For metabolites with <50% missing values, missing values were imputed with half of the minimum concentration value. Metabolites were log-transformed and auto-scaled. Principal component analysis (PCA) and two-dimensional partial least squares discriminant analysis (2-D PLS-DA) scores plots were used to compare plasma metabolite data across and between study groups; 2000-fold permutation tests were used to minimize the possibility that the observed separation in the PLS-DA was due to chance. Coefficient scores and the least absolute shrinkage and selection operator (LASSO) algorithm were used to identify the most discriminating metabolites for group comparisons. Metabolite data analyses were performed using MetaboAnalyst 37.
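The preprocessing steps (missing-value filter, half-minimum imputation, log transformation, auto-scaling) can be sketched outside MetaboAnalyst as follows. This is an illustrative reimplementation, not the MetaboAnalyst code. Assumptions made here: the matrix is samples × metabolites with NaN for missing values, all measured concentrations are positive, the log base is 10, metabolites with exactly 50% missing values are kept, and auto-scaling means mean-centering followed by division by the sample standard deviation.

```python
import numpy as np


def preprocess(X, max_missing=0.5):
    """Filter, impute, log-transform and auto-scale a samples x metabolites matrix.

    Returns the processed matrix and a boolean mask of retained metabolites.
    """
    X = np.asarray(X, dtype=float)

    # 1. Drop metabolites whose fraction of missing values exceeds the threshold.
    keep = np.isnan(X).mean(axis=0) <= max_missing
    X = X[:, keep].copy()

    # 2. Impute remaining gaps with half of each metabolite's minimum observed value.
    half_min = np.nanmin(X, axis=0) / 2.0
    rows, cols = np.where(np.isnan(X))
    X[rows, cols] = half_min[cols]

    # 3. Log-transform (base 10 is an assumption; values must be positive).
    X = np.log10(X)

    # 4. Auto-scale: zero mean and unit sample standard deviation per metabolite.
    X = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    return X, keep
```

A metabolite that is constant after imputation would have zero standard deviation and should be removed before the scaling step.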
The metabolites with the highest coefficient scores and LASSO scores were used to create metabolite panels for COVID-19 status or outcomes using multivariate logistic regression (metabolites-only models). Additionally, models were adjusted for potential confounders such as sex, age, relevant comorbidities (i.e., type 2 diabetes mellitus [DM-II], hypertension [HTN], and obesity), and clinical laboratory data, but only statistically significant variables (p<0.05) remained in the final models (metabolites + demographic/clinical data models). Receiver-operating characteristic (ROC) analysis was performed using MetaboAnalyst to identify the best metabolite combination. In this analysis, balanced sub-sampling-based Monte Carlo cross validation (MCCV) was used to generate the ROC curves.
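To make the idea of balanced sub-sampling MCCV concrete, the sketch below repeatedly draws an equal number of samples per class for training, fits a logistic regression, and scores the held-out samples by area under the ROC curve (AUC). It is a simplified stand-in and not MetaboAnalyst's procedure: the two-thirds training fraction, the gradient-descent fit with a small L2 penalty, the number of splits, and all function names are assumptions chosen for illustration.

```python
import numpy as np


def fit_logistic(X, y, n_iter=500, lr=0.1, l2=1e-3):
    """Logistic regression by gradient descent (small L2 penalty for stability)."""
    Xb = np.column_stack([np.ones(len(X)), X])   # prepend intercept column
    w = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))
        w -= lr * (Xb.T @ (p - y) / len(y) + l2 * w)
    return w


def predict_score(w, X):
    """Linear predictor; monotone in probability, so sufficient for ROC/AUC."""
    return np.column_stack([np.ones(len(X)), X]) @ w


def auc(y_true, score):
    """P(random positive outscores random negative); ties count as 0.5."""
    diff = score[y_true == 1][:, None] - score[y_true == 0][None, :]
    return float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size)


def mccv_auc(X, y, n_splits=100, train_frac=2 / 3, seed=0):
    """Balanced sub-sampling MCCV: equal class counts in every training draw."""
    rng = np.random.default_rng(seed)
    pos, neg = np.flatnonzero(y == 1), np.flatnonzero(y == 0)
    n_train = int(train_frac * min(len(pos), len(neg)))
    aucs = []
    for _ in range(n_splits):
        train = np.concatenate([
            rng.choice(pos, n_train, replace=False),
            rng.choice(neg, n_train, replace=False),
        ])
        test = np.setdiff1d(np.arange(len(y)), train)   # held-out samples
        w = fit_logistic(X[train], y[train])
        aucs.append(auc(y[test], predict_score(w, X[test])))
    return np.array(aucs)
```

Here `X` is a NumPy array of preprocessed metabolite values and `y` a NumPy array of 0/1 class labels. The distribution of AUC values across splits gives both a point estimate (for example, the mean) and an empirical interval for the performance of a given metabolite panel.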