NOC analysis of EBA samples from CORTIS participants
In total, 99 CORTIS trial participants classified as either GeneXpert (GXP)-negative or GXP-positive were included in this study, from whom 200 EBA samples (including repeat visits for some participants) were collected at different clinical visits (Figure 1 and Supplementary Table 1). At the 1st visit, there were 26 GXP-negative and 20 GXP-positive participants; at the 2nd visit, 63 GXP-negative and 11 GXP-positive participants; and at the 3rd visit, 53 GXP-negative and 11 GXP-positive participants. All EBA samples were processed by 0.2 µm membrane filtration (see methods), in which the bioaerosol pellet was separated from the supernatant. The dual-mobile phase extraction exploits differences in the hydrophobicity of molecules: because lipids are non-polar, 70% isopropanol (IPA) was used for their elution, whereas small metabolites were eluted with 50% acetonitrile. This method enabled us to extract both small metabolites and lipids from the same EBA sample, deepening molecular identification and analysis by LC-MS/MS (Figure 1).
Validation of the dual-mobile phase extraction method
In our previous work, lipids were extracted using a classic Folch solvent separation method. Since the EBA samples were collected in ~10 mL of buffer solution, they required overnight lyophilization and centrifugation for extraction (Figure 2A). These steps are resource intensive and difficult to manage at scale; therefore, in the current study, we applied our newly developed solid phase extraction (SPE) approach, which uses a C18 resin column as the capture matrix. For sample preparation, EBA samples were loaded directly onto the column. After washing, small metabolites and proteins, which are more polar, were eluted using acetonitrile (1st elution). Lipids, which are nonpolar, were then eluted with 2-propanol (isopropanol; 2nd elution, Figure 2A). Three representative molecules (methadone, dilauroyl-sn-glycero-3-phosphorylcholine (DLPC), and insulin) were used for method validation (Figure 2B). Methadone and insulin were detected only in the 1st elution sample and not in the wash sample (Figure 2C, red and green dots). DLPC signals were detected in the 2nd elution sample, which suggests efficient capture of lipid molecules by the C18 resin. The SPE-based sample preparation was rapid, with the separation process taking ~3 min per sample, and the use of a C18 resin cartridge makes it amenable to automation. Notably, dual mobile phases for the separation of polar and nonpolar molecules have been used by others, reinforcing the reliability and reproducibility of this approach.
Molecular profiles of EBA samples by liquid chromatography and MS
The total ion chromatograms of extracted small metabolites and lipids showed that molecular signals were detected in EBA samples relative to blank samples (Figure 3A, blue, green, and orange lines). Representative total ion chromatograms of small metabolites in EBA samples from GXP-positive and GXP-negative participants revealed no obvious differences between the two groups, suggesting that a deeper analysis was required (Figure 3B). In general, ~500 features were extracted from the small metabolite analysis and ~330 features from the lipid analysis (Figure 3C,D). No statistically significant difference in feature numbers was observed between the two groups. Both the small metabolite and lipid MS analyses achieved a wide dynamic range: ~5 orders of magnitude in small metabolite profiles and ~4 orders of magnitude in lipid profiles (Figure 3E,F).
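As an illustration of how a dynamic range in orders of magnitude can be expressed, the sketch below takes the base-10 logarithm of the ratio between the highest and lowest positive feature intensity. The function name and the example intensities are hypothetical and are not taken from this study; the exact definition used for Figure 3E,F may differ.

```python
import math


def dynamic_range_orders(intensities):
    """Return log10(max / min) over the positive intensities.

    Zero or negative values (e.g. missing signals) are ignored, which is
    an assumption of this sketch rather than a documented processing step.
    """
    positive = [x for x in intensities if x > 0]
    if len(positive) < 2:
        raise ValueError("need at least two positive intensities")
    return math.log10(max(positive) / min(positive))


# Hypothetical feature intensities (arbitrary units), not study data.
example_intensities = [2.0e3, 5.5e4, 1.2e6, 2.0e8]
print(round(dynamic_range_orders(example_intensities), 2))
```

With these illustrative values the highest and lowest intensities differ by a factor of 100,000, i.e. 5 orders of magnitude.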
Selection of definitive molecules in each study group
A global correlation heat map based on Pearson correlation coefficient analysis demonstrated that molecular profiles could be used to segregate the two groups of study participants, with GXP-negative and GXP-positive participants each exhibiting clear correlations (Figure 4A). The relative standard deviation of each feature was calculated separately for the GXP-negative and GXP-positive samples (Figure 4B-E). A relative standard deviation threshold of 30% was used to select features in each group for statistical analysis (Figure 4B-E, red line), resulting in 347 small metabolites and 217 lipids in the GXP-negative group, and 325 small metabolites and 198 lipids in the GXP-positive group (Figure 5A). The ion intensity distributions of 12 representative molecules in each participant group are shown in Figure 5B; these molecules were identified as proline, all-trans-retinoic acid (RA), chalcone (CC), 5-hydroxyindole-3-acetic acid (HIAA), D-2-aminobutyric acid (D2AA), uridine, cholesteryl ester (CE) 16:4, ceramide (Cer) 8:0, diacylglycerol (DG) 21:1, phosphatidylethanolamine (PE) 24:2, phosphatidylinositol (PI) 20:3, and phosphatidylserine (PS) 5:3.
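The relative standard deviation filter described above can be sketched as follows. This is a minimal illustration and not the pipeline used in the study: the function names and intensities are hypothetical, and the sketch assumes that features with a relative standard deviation at or below the 30% threshold are the ones retained, which the text does not state explicitly.

```python
import statistics


def relative_sd_percent(values):
    """Relative standard deviation (sample SD / mean) as a percentage."""
    mean = statistics.fmean(values)
    if mean == 0:
        raise ValueError("RSD is undefined when the mean is zero")
    return 100.0 * statistics.stdev(values) / abs(mean)


def select_features(intensity_by_feature, max_rsd_percent=30.0):
    """Return the names of features whose within-group RSD is <= the cutoff.

    Keeping features at or below the cutoff is an assumption of this sketch.
    """
    return sorted(
        name
        for name, values in intensity_by_feature.items()
        if relative_sd_percent(values) <= max_rsd_percent
    )


# Hypothetical intensities for one participant group (arbitrary units).
example_group = {
    "feature_a": [100.0, 110.0, 90.0, 105.0],  # low spread
    "feature_b": [100.0, 300.0, 20.0, 250.0],  # high spread
}
print(select_features(example_group))
```

In this sketch the filter is applied to each participant group separately, mirroring the per-group feature counts reported above.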
Identification of 22 metabolite and lipid molecules associated with active TB status
Correlations between the metabolite and lipid profiles detected in the GXP-negative and GXP-positive groups were visualized using pairwise combinations of the three clinical visits (Figure 6A), and the metabolites and lipids that were statistically significant by t-test were identified in volcano plots (Figure 6B, C). The molecules that were statistically significant at all three visits were marked as either red dots, meaning increased prevalence in the GXP-positive group, or green dots, meaning decreased prevalence in the GXP-positive group. In summary, 22 molecules, comprising 13 metabolites and 9 lipids, were identified that were statistically significant between the two groups at all three visits (Figure 6B, C and Figure 7A, B).
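The selection of molecules that are significant at every visit can be sketched as an intersection over per-visit test results. The sketch below is illustrative only: the function name, the feature names, the fold changes, and the p-values are hypothetical, and both the significance cutoff of 0.05 and the requirement that the direction of change agree across visits are assumptions, since neither is specified in the text.

```python
def consistent_hits(results_by_visit, alpha=0.05):
    """Find features significant at every visit with a consistent direction.

    results_by_visit maps visit -> {feature: (log2_fold_change, p_value)},
    where a positive fold change means higher in the GXP-positive group.
    Returns {feature: "up" | "down"}.
    """
    visits = list(results_by_visit.values())
    if not visits:
        return {}
    # Only features measured at every visit can qualify.
    shared = set(visits[0])
    for visit in visits[1:]:
        shared &= set(visit)
    hits = {}
    for feature in sorted(shared):
        stats = [visit[feature] for visit in visits]
        if not all(p < alpha for _, p in stats):
            continue
        if all(fc > 0 for fc, _ in stats):
            hits[feature] = "up"      # red dots in a volcano plot
        elif all(fc < 0 for fc, _ in stats):
            hits[feature] = "down"    # green dots in a volcano plot
    return hits


# Hypothetical per-visit results, not study data.
example_results = {
    "visit_1": {"m1": (1.2, 0.001), "m2": (-0.8, 0.01), "m3": (0.5, 0.20)},
    "visit_2": {"m1": (0.9, 0.004), "m2": (-1.1, 0.03), "m3": (0.7, 0.01)},
    "visit_3": {"m1": (1.5, 0.002), "m2": (-0.6, 0.04), "m3": (0.4, 0.03)},
}
print(consistent_hits(example_results))
```

Here the hypothetical feature m3 is excluded because it does not pass the cutoff at the first visit, even though it does at the other two.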
We then evaluated the utility of the identified metabolites and/or lipids to distinguish GXP-negative from GXP-positive participants by generating ROC curves and calculating AUC values. The AUC for segregating GXP-negative from GXP-positive study participants was ~0.87 (95% confidence interval: 0.832-0.919) for metabolites and 0.93 (95% confidence interval: 0.889-0.971) for lipids. When metabolites and lipids were combined, the AUC was ~0.97 (95% confidence interval: 0.926-0.986), slightly higher than with lipids alone (Figure 7C). The correlation between the identified molecules was investigated by Pearson correlation coefficient analysis (Figure 7D), with the heatmap identifying seven molecules (Cer 8:0, uridine, PS 24:4, DG O-8:0, NAM, PI 20:4, PI 18:4) with the tightest correlation. In our previous work, we used significance analysis of microarrays (SAM) to identify the features with the most segregation power. Applying SAM analysis to the current dataset revealed 10 molecules (Figure 7E) that were significantly different between GXP-positive and GXP-negative samples, with PS 24:4 emerging as the top driver of segregation between the two groups.
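For readers less familiar with the AUC, the sketch below computes it in its rank-based form: the probability that a randomly chosen GXP-positive sample receives a higher score than a randomly chosen GXP-negative sample, with ties counted as one half. The function name and scores are hypothetical. How per-sample scores were derived from the metabolite and lipid panels is not described in this section, so the sketch simply assumes one numeric score per sample.

```python
def roc_auc(scores_positive, scores_negative):
    """Area under the ROC curve from two lists of per-sample scores.

    Equivalent to the Mann-Whitney U statistic divided by the number of
    positive/negative pairs; ties contribute 0.5.
    """
    if not scores_positive or not scores_negative:
        raise ValueError("both groups need at least one score")
    wins = 0.0
    for pos in scores_positive:
        for neg in scores_negative:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(scores_positive) * len(scores_negative))


# Hypothetical classifier scores, not study data.
example_positive = [0.9, 0.8, 0.6, 0.55]
example_negative = [0.7, 0.5, 0.4, 0.3, 0.2]
print(roc_auc(example_positive, example_negative))
```

An AUC of 0.5 corresponds to no discrimination between the groups and 1.0 to perfect separation; the pairwise loop is quadratic in sample count, which is acceptable for cohorts of this size.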