Workflow, Design and Summary
To evaluate the impact of hemolysis on biomarker discovery using a multi-omics platform, we compared the proteins, lipids, and metabolites identified in plasma, serum, and buffy coat (proteomics only) samples acquired from 420 non-diseased individuals and pancreatic cancer patients. The workflow for the proteomic, lipidomic, and metabolomic analyses is shown in Fig. 1. A hemolysis score was recorded for each sample, ranging from 0–4 for buffy coat and 1–3 for plasma and serum. The distribution of hemolysis scores within each sample type is summarized in Fig. 2. Buffy coat yielded the largest proportion of hemolyzed samples: 37.1% of buffy coat samples were scored 0, 25.1% were scored 1, 24.8% were scored 2, 12.4% were scored 3, and 0.4% were scored 4. The protocol used to isolate buffy coat from blood may be one of the major reasons for the large number of contaminated buffy coat samples.
In proteomics, 7,302, 1,971, and 2,146 proteins were identified and quantified in buffy coat, serum, and plasma, respectively, using TMT labeling and 2D online LC-MS/MS. After filtering the data to retain proteins with less than 85% missing values, 3,648, 453, and 492 proteins in buffy coat, serum, and plasma, respectively, were used for further analysis. In lipidomics, 1,318 structural lipids and 106 mediator lipids were identified and quantified in plasma and serum samples after data filtering. In metabolomics, 514 and 508 metabolites were identified and quantified in plasma and serum samples, respectively, after data filtering and were kept for further analysis.
Differentially Expressed Metabolites And Lipids
Lipidomics analysis revealed no significant changes in lipid expression in the mediator lipidomics data when comparing samples with hemolysis scores of 2+ to samples with a score of 1 in both plasma and serum. In the structural lipidomics analysis, however, 5 lipids were downregulated and 2 lipids upregulated in plasma, and 14 lipids were downregulated and 11 lipids upregulated in serum (Table 1). More profound effects were seen in the metabolomics data. When comparing samples with hemolysis scores of 2+ to samples with a score of 1, a total of 51 metabolites were downregulated and 25 were upregulated due to hemolysis in plasma (Table 1). For the same comparison in serum, 93 metabolites were downregulated and 21 were upregulated due to hemolysis (Table 1). A summary of these results can be found in Supplemental Table 1.
Table 1
Platform | Proteomics | Proteomics | Proteomics | Signaling Lipidomics | Signaling Lipidomics | Structural Lipidomics | Structural Lipidomics | Metabolomics | Metabolomics |
Matrix | Buffy Coat | Plasma | Serum | Plasma | Serum | Plasma | Serum | Plasma | Serum |
Downregulated Species | 250 | 2 | 2 | 0 | 0 | 5 | 14 | 51 | 93 |
Upregulated Species | 208 | 22 | 13 | 0 | 1 | 2 | 11 | 25 | 21 |
* Differentially expressed species due to hemolysis
Missingness
To assess missingness, a subset of samples with the lowest hemolysis score was created: a score of 0 for buffy coat samples and a score of 1 for plasma and serum samples. This subset was used to filter the proteins, and only proteins with less than 85% missing values in the subset were kept in the full proteomics data. The proportion of missing proteins was then computed for each sample, and buffy coat samples were grouped by hemolysis score: 244 samples with a score of 0, 165 with a score of 1, 163 with a score of 2, and 85 with a score of 3+ (Fig. 2). The boxplots show that as the hemolysis score of a sample increases, the number of proteins from the filtered set that are identified in that sample decreases: the median proportions of missing proteins are 0.299, 0.353, 0.406, and 0.410 for the groups with hemolysis scores of 0, 1, 2, and 3+, respectively (Fig. 3). This can be explained by increased signal from the highly abundant hemoglobin proteins released by lysed red blood cells, which suppresses the signal of less abundant proteins and alters the dynamic range of the protein content relative to what would ideally be identified in samples with little to no hemolytic contamination.
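The filtering and per-sample missingness computation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the function names, the dictionary layout, the use of `None` for a missing value, and the toy data are all hypothetical.

```python
from statistics import median

# Threshold from the text: keep proteins with less than 85% missing values.
MAX_MISSING = 0.85


def missing_fraction(values):
    """Fraction of entries that are missing (represented here as None)."""
    return sum(v is None for v in values) / len(values)


def filter_proteins(abundance, scores, reference_score):
    """Keep proteins with < MAX_MISSING missing values in the reference subset.

    abundance: dict protein -> dict sample -> value or None (assumed layout)
    scores:    dict sample -> hemolysis score
    reference_score: lowest hemolysis score (0 for buffy coat in the text)
    """
    reference = [s for s, score in scores.items() if score == reference_score]
    return [
        protein
        for protein, by_sample in abundance.items()
        if missing_fraction([by_sample[s] for s in reference]) < MAX_MISSING
    ]


def median_missing_by_score(abundance, scores, proteins):
    """Median per-sample proportion of missing proteins for each score group."""
    groups = {}
    for sample, score in scores.items():
        frac = missing_fraction([abundance[p][sample] for p in proteins])
        groups.setdefault(score, []).append(frac)
    return {score: median(fracs) for score, fracs in sorted(groups.items())}


# Invented toy data, not study data.
scores = {"s1": 0, "s2": 0, "s3": 1, "s4": 1}
abundance = {
    "P1": {"s1": 1.0, "s2": 2.0, "s3": None, "s4": 1.5},
    "P2": {"s1": 1.0, "s2": None, "s3": None, "s4": None},
    "P3": {"s1": None, "s2": None, "s3": 3.0, "s4": 2.0},  # absent in reference
}

kept = filter_proteins(abundance, scores, reference_score=0)
medians = median_missing_by_score(abundance, scores, kept)
print(kept)
print(medians)
```

In this sketch the protein filter is defined on the lowest-score subset only and then applied to every sample, so the per-group medians are computed over the same protein set, as in the comparison reported in Fig. 3.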
Differentially Expressed Proteins
To assess the effect of hemolysis on relative protein expression in buffy coat, comparisons between hemolysis groups were performed and visualized as volcano plots (Fig. 4A). Overall, 657 samples were included in this analysis. A total of 3,647 proteins were identified when assessing the differentially expressed proteins between samples with a score of 0 vs. 1 (Fig. 4A), with 394 proteins downregulated and 310 proteins upregulated at a 1.3 fold-change threshold and a p-value of 0.05. Comparing samples with a score of 0 vs. 2 (Fig. 4B), a total of 701 proteins were consistently identified across all samples, with 348 proteins downregulated and 251 proteins upregulated at a 1.3 fold-change threshold and a p-value of 0.05 (Supplemental Table 1). Lastly, comparing samples with a score of 0 vs. 3+ (Fig. 4C), a total of 592 proteins were consistently identified across all samples, with 238 proteins downregulated and 187 proteins upregulated at a 1.3 fold-change threshold and a p-value of 0.05. Hemolysis therefore affected not only which proteins were identified but also the quantitation of the differentially expressed proteins.
Further, samples with no visual hemolysis (scores of 0 for buffy coat, scores of 1 for plasma and serum) were compared with samples with visual hemolysis (scores of 1–3+ for buffy coat, scores of 2+ for plasma and serum). Differential expression was visualized using volcano plots (Fig. 4), and proteins were considered differentially expressed at a fold-change threshold of 1.3 with a corresponding p-value of 0.05. Overall, 250 proteins were downregulated and 208 were upregulated in buffy coat. In plasma and in serum, 2 proteins were downregulated in samples scored 2+ compared with samples scored 1. A total of 22 proteins in plasma and 13 proteins in serum were upregulated in the same comparison. A summary of these results can be found in Supplemental Table 1.
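The thresholding behind the up- and downregulated counts can be illustrated with a short sketch. It makes several assumptions that the text does not state: the fold change is the hemolyzed-to-non-hemolyzed abundance ratio, the threshold is applied symmetrically (a ratio of at least 1.3 or at most 1/1.3), significance means p < 0.05, and fold changes and p-values have already been computed by whatever statistical test was used. The function names and the toy values are hypothetical.

```python
FC_THRESHOLD = 1.3  # fold-change threshold stated in the text
P_THRESHOLD = 0.05  # p-value threshold stated in the text


def classify(fold_change, p_value):
    """Label one species as 'up', 'down', or 'ns' (not significant).

    fold_change: hemolyzed / non-hemolyzed ratio (assumed direction).
    p_value:     precomputed p-value for the same comparison.
    """
    if p_value >= P_THRESHOLD:  # assumes a strict p < 0.05 criterion
        return "ns"
    if fold_change >= FC_THRESHOLD:
        return "up"
    if fold_change <= 1.0 / FC_THRESHOLD:  # assumed symmetric cutoff
        return "down"
    return "ns"


def count_calls(results):
    """Tally the labels over a dict of species -> (fold_change, p_value)."""
    counts = {"up": 0, "down": 0, "ns": 0}
    for fold_change, p_value in results.values():
        counts[classify(fold_change, p_value)] += 1
    return counts


# Invented toy values, not study data.
toy_results = {
    "protein_A": (4.0, 0.001),  # large increase, significant
    "protein_B": (2.1, 0.010),  # increase, significant
    "protein_C": (0.6, 0.020),  # decrease, significant
    "protein_D": (1.1, 0.001),  # significant but below the fold-change cutoff
    "protein_E": (2.0, 0.200),  # large change but not significant
}

counts = count_calls(toy_results)
print(counts)
```

Requiring both criteria at once corresponds to the upper-left and upper-right regions of a volcano plot, so the tallies from a function like `count_calls` are the kind of numbers summarized in Table 1.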
Impact Of Hemolysis On Hemoglobin
To study hemolysis via protein identification and relative quantitation, we assessed the expression of hemoglobin subunit alpha (HBA1), hemoglobin subunit beta (HBB), and hemoglobin subunit delta (HBD) across all sample types, grouped by hemolysis score within each sample type. Hemolysis is generally defined as the lysis of red blood cells (RBCs) in circulation or during sample preparation. Because hemoglobin is one of the most abundant proteins in RBCs, hemoglobin expression increased with hemolysis (Fig. 5A), rising stepwise with increasing hemolysis score. A similar pattern was seen in both plasma and serum, with lower levels observed in samples with a hemolysis score of 1 and significantly higher levels observed in samples scored 2+ (Fig. 5B and Fig. 5C).
We also assessed the expression of carbonic anhydrase 1 (CA1), histone H2B type 1-L (HIST1H2BL), and ubinuclein-2 (UBN2) (Fig. 6). CA1 is another major protein found in RBCs and catalyzes the reversible hydration of carbon dioxide. The expression of CA1 is low in samples with a hemolysis score of 0 and, like hemoglobin expression, increases with increasing hemolysis score (Fig. 6). HIST1H2BL and UBN2 are both nuclear proteins that are expected to be identified in buffy coat samples and not to originate from red blood cells. HIST1H2BL and UBN2 expression follows the expected pattern, with higher expression in samples with a hemolysis score of 0 and lower expression with increasing hemolysis score (Fig. 6), indicating signal suppression of these proteins as a result of hemolysis.