Experimental design, protein yield, and peptide identifications
In clinical proteomic studies, biospecimen availability can be a constraining factor, depending on the tissue being analyzed. In this experiment, we used peripheral blood mononuclear cells (PBMCs) isolated from AML patients as a model and followed a modified version of the CPTAC TMT-based clinical proteomics pipeline (7). The general workflow used to process the samples, from cell lysis through LC-MS/MS analysis, is illustrated in Figure 1. To determine the amount of protein obtainable from patient samples of varying sizes and volumes, we set up three replicates each of AML patient cells representing 1E7, 5E6, 1E6, 5E5, 1E5, 5E4, and 1E4 total cell counts. Samples were lysed in proportional amounts of lysis buffer; protein concentration was determined by BCA assay and total protein yield was calculated. Achieving the most commonly used amount of 400 µg total protein requires more than 10 million cells (Figure 2A). Lower numbers of cells resulted in a linear decrease in protein yield, with less than 1 µg of total protein extracted from the smallest samples (5E4 and 1E4 cells). While these lower protein yields are substantially less than what is preferred for multiplexed proteomic studies, the sample sizes represent total amounts of biomaterial isolated in real-world clinical settings where available material is constrained. In these cases, researchers may face difficult decisions about inclusion of samples in experimental designs; thus, we set out to evaluate the impact of lower and varying protein amounts on LC-MS/MS peptide and protein identification.
As technical variability in proteomics workflows (e.g. protein extraction and digestion) is unavoidable, and likely amplified when sample size, protein amount, and buffer volumes differ, we opted to pool all protein extracted from the different cell pellets and carry out digestion in a single reaction (Figure 1). From this homogeneous pool of starting material, we generated aliquots containing 400 µg (the conventional amount loaded into each channel of clinical TMT proteomics experiments), 200 µg, 100 µg, 40 µg, or 20 µg of total peptides to assess the effect of variable peptide amount. We then prepared 2 sets of samples for labeling with TMT 11-plex reagents, with each set containing 2 replicates of each peptide aliquot distributed randomly across the first 10 TMT channels (Figure 2B). An additional aliquot of 400 µg of peptides was included in each plex as a universal reference and was assigned to the 131C channel. This experimental design, in which the peptides in every channel come from a single large pool homogenized after digestion and clean-up, ensures that differences detected in downstream data processing and analysis reflect peptide loading and not sample processing. Following TMT labeling, the samples were mixed into their respective multiplexes, desalted by C18 SPE, fractionated by high-pH reversed-phase HPLC, and concatenated into 12 total fractions. An aliquot of each fraction was removed for global proteomics analysis, and the remaining material was further concatenated into 6 fractions and subjected to IMAC phosphopeptide enrichment. Global and phosphopeptide-enriched samples were analyzed by LC-MS/MS, and the data were used to evaluate reporter ion intensities for each TMT channel and to calculate total numbers of peptide and protein identifications.
Figure 2C displays the overall labeling efficiency calculated in each plex as well as the total reporter ion intensities acquired for each channel across both TMT11 multiplexes. In general, channels with equivalent peptide loadings showed consistent reporter ion intensities within and across the 2 plexes (Figure 2C). Furthermore, the median intensities for different channels increase linearly with the amount of peptide loaded per channel, demonstrating the general quantitative information achievable through utilization of TMT methodology (Figure 2D). Data from global and phospho-enriched fractions were used to determine the number of unique peptides and proteins identified from these samples (Table 1). Additionally, Table 1 displays the number of peptides/proteins from global and phosphoproteomic datasets that are quantified in 25%, 33%, 50%, and 100% of samples—cut-offs commonly applied to proteomics datasets prior to statistical analysis.
Table 1. Peptide and protein identifications.
| | Global Proteomics | Phosphoproteomics |
|---|---|---|
| Unique peptide identifications (total) | 138373 | 27351 |
| Unique peptides quantified in >25% of samples | 137302 (99.2%) | 26753 (97.8%) |
| Unique peptides quantified in >33% of samples | 134399 (97.1%) | 24230 (88.6%) |
| Unique peptides quantified in >50% of samples | 118418 (85.6%) | 17409 (63.7%) |
| Unique peptides quantified in 100% of samples | 45925 (33.2%) | 3366 (12.3%) |
| Unique protein identifications (total) | 8926 | NA |
| Unique proteins quantified in >25% of samples | 8910 (99.8%) | NA |
| Unique proteins quantified in >33% of samples | 8887 (99.6%) | NA |
| Unique proteins quantified in >50% of samples | 8722 (97.7%) | NA |
| Unique proteins quantified in 100% of samples | 7641 (85.6%) | NA |
Table 1. Number of unique peptides and proteins identified from global proteomics and phosphoproteomics datasets across the two experimental TMT11 multiplexes. Unique peptide and protein counts were filtered based on quantification in >25%, >33%, >50%, and 100% of sample channels; the percentages shown represent the fraction of total unique peptide or protein identifications that pass each filtering criterion.
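The completeness cut-offs in Table 1 can be illustrated with a minimal sketch. It assumes each peptide is stored as a list of per-channel reporter ion intensities with `None` marking a missing value; the function names and the toy data are hypothetical and are not part of the original analysis pipeline.

```python
from typing import Dict, List, Optional

Intensities = Dict[str, List[Optional[float]]]


def fraction_quantified(values: List[Optional[float]]) -> float:
    """Fraction of sample channels with a non-missing intensity."""
    return sum(v is not None for v in values) / len(values)


def count_passing(table: Intensities, cutoff: float, strict: bool = True) -> int:
    """Count features quantified in more than (strict) or at least `cutoff` of channels."""
    n = 0
    for values in table.values():
        frac = fraction_quantified(values)
        if (frac > cutoff) if strict else (frac >= cutoff):
            n += 1
    return n


def percent_of_total(n_passing: int, n_total: int) -> float:
    """Percentage of all identifications that pass a filter, to one decimal place."""
    return round(100.0 * n_passing / n_total, 1)


# Hypothetical toy data: three peptides measured in four channels.
toy: Intensities = {
    "PEPTIDEA": [1.0, 2.0, 3.0, 4.0],
    "PEPTIDEB": [1.0, None, 3.0, None],
    "PEPTIDEC": [None, None, None, 4.0],
}

print(count_passing(toy, 0.25))           # peptides quantified in >25% of channels
print(percent_of_total(137302, 138373))   # Table 1, global peptides, >25% filter
```

The strict comparison mirrors the ">25%", ">33%", and ">50%" rows, while `strict=False` with a cutoff of 1.0 corresponds to the "100% of samples" rows.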
Effects of differential peptide loading on missing data
While our results demonstrate that reporter ion intensities correlate strongly with peptide loadings and that multiplex experiments with differentially loaded channels yield good proteome coverage, a larger question remains regarding the impact of differential loading on the reproducibility and reliability of quantitative data. In multiplex proteomic experiments, missing data (that is, identified spectra for which reporter ion intensities are not detected in one or more TMT channels) pose a significant challenge, especially when comparing across multiple TMT plexes (25). Indeed, when evaluating these datasets, a clear trend of increasing missingness was evident as peptide loading amounts decreased (Figure 3A). In general, this issue was more pronounced in the phosphoproteomic datasets, as the differences between samples were likely exacerbated by the phosphopeptide enrichment protocol. The effects of missing data in global proteomics datasets can be largely mitigated by rolling peptide identifications and quantifications up to the protein level (Figure 3A); however, in cases where comparisons are to be made between individual peptide intensities (e.g. phosphoproteomic datasets), missing data can have substantial implications for downstream statistical analysis. Consistent with the increased levels of missing data as peptide loadings decrease, comparison of the peptides identified in all replicates of each loading group shows that, as channel loading decreases to 40 µg or 20 µg, increasing numbers of peptides are not quantified in these samples (Figure 3B-C).
We sought to compare the levels of missing data in these differentially loaded TMT plexes with those that might arise in a standard TMT experiment where all channels contain equivalent peptide loadings. To this end, we analyzed data generated in our laboratory from an experiment using two TMT11 multiplexes where all channels were loaded with 400 µg of peptides derived from similar biological material (human AML cell lines), processed with the same sample preparation protocols, fractionated into 12 global fractions and 6 phospho fractions per plex, and analyzed on the same instrument with the same acquisition settings (these data are deposited in the MassIVE repository under the same accession as the data from the differential loading experiment). As illustrated in Table 2, differential peptide loading results in higher levels of missing data within multiplexes: on average, only 36% of phosphopeptides were quantified in all 10 channels, and only 83% of phosphopeptides were quantified in more than 6 channels. These rates of missing data are significantly higher than those seen in the standard loading experiment, where on average more than 95% of phosphopeptides are observed in all channels and over 99% are observed in more than 6 channels (Table 2). While the issue of missing data is more apparent in phosphoproteomics measurements, likely due to the low abundance of enriched phosphopeptides, the problem still exists in global proteomics. At the peptide level, only 76% of observations were quantified in all channels of either multiplex, while 96% of observations were quantified in more than 6 channels (Table 2). Global proteomics measurements benefit from the aggregation of data to the protein level; when evaluating quantification at the protein level, 96% of observations have values in all channels of either plex (Table 2).
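A minimal sketch of why protein-level roll-up reduces missingness is given here. It assumes, purely for illustration, that roll-up sums the reporter ion intensities of all peptides mapped to a protein within each channel; the aggregation actually used in this study is not specified in this section, and the function name and toy values are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Row = Tuple[str, List[Optional[float]]]  # (protein accession, per-channel peptide intensities)


def rollup_to_protein(peptide_rows: List[Row]) -> Dict[str, List[Optional[float]]]:
    """Sum peptide intensities per protein and channel.

    A protein-level value stays missing (None) only when every
    peptide of that protein is missing in that channel.
    """
    proteins: Dict[str, List[Optional[float]]] = {}
    for protein, values in peptide_rows:
        acc = proteins.setdefault(protein, [None] * len(values))
        for i, v in enumerate(values):
            if v is not None:
                acc[i] = v if acc[i] is None else acc[i] + v
    return proteins


# Hypothetical toy data: two peptides of P1 with complementary gaps, one peptide of P2.
rows: List[Row] = [
    ("P1", [1.0, None, 3.0]),
    ("P1", [None, 2.0, 1.0]),
    ("P2", [5.0, None, None]),
]
print(rollup_to_protein(rows))
```

In this toy case both P1 peptides are incomplete, yet the protein-level row is complete, whereas a protein supported by a single peptide (P2) inherits that peptide's gaps. The same logic explains why phosphopeptide-level comparisons cannot benefit from aggregation.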
Again, these values are lower than standard, equally loaded TMT experiments, where ~99% of peptides and proteins are typically observed in all channels (Table 2). In all cases, the higher levels of missing data occur in channels with lower peptide loading, which we attribute to the reduced signal-to-noise ratio for these channels.
Table 2. Missing data in differential loading vs. standard loading TMT experiments.
| Observations | Phosphoproteomics, phosphopeptide-level: Differential Loading (average % total) | Phosphoproteomics, phosphopeptide-level: Standard Loading (average % total) | Global Proteomics, peptide-level: Differential Loading (average % total) | Global Proteomics, peptide-level: Standard Loading (average % total) | Global Proteomics, protein-level: Differential Loading (average % total) | Global Proteomics, protein-level: Standard Loading (average % total) |
|---|---|---|---|---|---|---|
| 10 | 35.91 | 95.86 | 76.09 | 98.90 | 95.63 | 99.89 |
| 9 | 55.89 | 97.95 | 88.43 | 99.42 | 97.91 | 99.95 |
| 8 | 70.24 | 98.84 | 92.95 | 99.65 | 98.80 | 99.97 |
| 7 | 82.78 | 99.27 | 96.05 | 99.76 | 99.35 | 99.98 |
| 6 | 93.02 | 99.53 | 98.18 | 99.84 | 99.70 | 99.99 |
| 5 | 97.16 | 99.69 | 99.10 | 99.89 | 99.85 | 99.99 |
| 4 | 98.79 | 99.82 | 99.55 | 99.93 | 99.93 | 99.99 |
| 3 | 99.55 | 99.91 | 99.82 | 99.97 | 99.98 | 99.99 |
| 2 | 99.9 | 99.96 | 99.95 | 99.99 | 99.99 | 99.99 |
| 1 | 100 | 100 | 100 | 100 | 100 | 100 |
| Total IDs | 20,818 | 26,428 | 103,582 | 119,335 | 8,514 | 7,214 |
Table 2. Phosphoproteomics and global proteomics data from two TMT11 multiplexes from this experiment (Differential Loading) were compared with data acquired across two TMT11 multiplexes of peptides derived from a similar biological source, but with equal loading of 400 μg in all channels (Standard Loading). This table quantifies the percentage of total peptides, proteins, or phosphopeptides identified that were quantified in 1 through all 10 TMT channels (excluding the reference channel).
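The cumulative percentages in Table 2 can be reproduced in outline with the sketch below. It assumes that each row reports the share of identified features quantified in at least that many non-reference channels of a plex (consistent with the 100% value in the final row); the function name and the toy data are hypothetical.

```python
from typing import Dict, List, Optional


def cumulative_quantified(
    table: Dict[str, List[Optional[float]]], n_channels: int
) -> Dict[int, float]:
    """Percent of features quantified in at least k channels, for k = n_channels down to 1."""
    counts = [sum(v is not None for v in values) for values in table.values()]
    # Only features seen in at least one channel count as identified.
    observed = [c for c in counts if c >= 1]
    total = len(observed)
    return {
        k: 100.0 * sum(c >= k for c in observed) / total
        for k in range(n_channels, 0, -1)
    }


# Hypothetical toy plex with three sample channels and four phosphopeptides.
toy_plex = {
    "pep1": [1.0, 1.2, 0.9],
    "pep2": [2.0, 2.1, 1.8],
    "pep3": [0.5, None, 0.4],
    "pep4": [None, None, 0.1],
}
print(cumulative_quantified(toy_plex, n_channels=3))
```

Averaging the per-plex output over both multiplexes would give values of the kind listed in the "average % total" columns.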
Statistical differences induced by differential channel loading
Before making any comparisons across differential loading groups, data for each sample were normalized by the central tendency method based on median values (32, 33), a standard approach in proteomics data analysis that accounts for technical variation between samples (Supplemental Figure 1). Following median normalization, principal component analysis (PCA) of both global and phosphoproteomic datasets indicates that peptide loading influences quantification once loading falls below a certain threshold: while 400 µg, 200 µg, and 100 µg samples all group reasonably close to one another post-normalization, samples with 40 µg or 20 µg of peptides drift away from the other samples and show more variation among replicates (Figure 4A-B). Additionally, the reproducibility within each loading group decreases as a function of peptide quantity, as demonstrated by plotting the percent coefficient of variation calculated from raw TMT reporter ion intensities among replicates in both global and phosphoproteomic datasets (Figure 4A-B). These data indicate that as the amount of peptide loaded per channel decreases, the precision of reporter ion intensity measurement decreases. In settings where comparisons are to be made between channels (e.g. when comparing patient samples), large variations in the amount of material loaded will likely impair the ability to discern statistically relevant biological differences.
While preliminary visualization by PCA suggests that samples with lower peptide loading cluster less tightly than samples with higher loadings, we sought to gain a better understanding of the effects of differential channel loading on statistical data analysis. Based on the unequal variances detected among the loading groups, we used unequal variance t-tests to compare each sample loading group to the 400-µg sample group. As samples were all derived from a common pool of peptide digest, we assume that there should be no statistically significant differences between peptide loading levels. In both global and phosphoproteomic datasets, we observe more differences from the 400-µg sample group as peptide loading decreases. While few proteins or peptides remain significantly different after correction for multiple hypothesis testing (defined as Benjamini-Hochberg adjusted p-value <0.05), p-value histograms comparing the 20-µg or 40-µg samples with the 400-µg sample group show a more anti-conservative distribution, suggesting larger quantitative differences (Figure 4C-D). Combined, these data demonstrate that, using standard data normalization methods, up to 4-fold differences in channel loading can be effectively corrected without significant impact on quantification precision, while more drastic differences in channel loading (10-fold or 20-fold) may cause difficulties when trying to detect differences between patient samples.
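The two calculations described above can be sketched as follows. The sketch assumes that median normalization is applied to log2-transformed intensities by shifting each channel so that its median matches the mean of all channel medians, which is one common form of central tendency normalization and not necessarily the exact variant used here. Function names and toy values are hypothetical.

```python
import statistics
from typing import Dict, List, Optional


def median_normalize(
    log2_columns: Dict[str, List[Optional[float]]]
) -> Dict[str, List[Optional[float]]]:
    """Shift each channel so its median equals the mean of all channel medians."""
    medians = {
        ch: statistics.median([v for v in vals if v is not None])
        for ch, vals in log2_columns.items()
    }
    target = statistics.mean(list(medians.values()))
    return {
        ch: [None if v is None else v - medians[ch] + target for v in vals]
        for ch, vals in log2_columns.items()
    }


def percent_cv(raw_replicates: List[float]) -> float:
    """Percent coefficient of variation (sample SD / mean * 100) of raw intensities."""
    return 100.0 * statistics.stdev(raw_replicates) / statistics.mean(raw_replicates)


# Hypothetical log2 intensities for two channels with a constant offset.
channels = {"126": [10.0, 12.0, 14.0], "127N": [11.0, 13.0, 15.0]}
normalized = median_normalize(channels)
print(normalized)

# Hypothetical raw reporter ion intensities of one peptide across two replicates.
print(percent_cv([90.0, 110.0]))
```

Note that a median shift of this kind removes a global loading offset but cannot restore precision lost to low signal, which is consistent with the higher coefficients of variation seen in the 40 µg and 20 µg groups.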
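A minimal sketch of the statistical comparison is given below, using only the Python standard library. It computes the unequal variance (Welch) t statistic with Welch-Satterthwaite degrees of freedom and applies the Benjamini-Hochberg adjustment to a list of p-values. The function names and toy numbers are hypothetical, and converting the statistic to a p-value is left to a t-distribution routine.

```python
import math
import statistics
from typing import List, Tuple


def welch_t(a: List[float], b: List[float]) -> Tuple[float, float]:
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom."""
    va = statistics.variance(a) / len(a)
    vb = statistics.variance(b) / len(b)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    # Walk from the largest p-value down, enforcing monotonicity.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for rank_from_top, i in enumerate(order):
        rank = m - rank_from_top  # 1-based rank in ascending order
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical normalized log2 intensities of one peptide: a low-loading group
# versus the 400-µg group, four replicate channels each.
t_stat, dof = welch_t([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
print(t_stat, dof)
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

In practice the p-value for each peptide or protein could be obtained directly from `scipy.stats.ttest_ind(a, b, equal_var=False)`, and the adjusted values compared against the 0.05 threshold described above.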