Development of a supervised model using placental DNAme that predicts EOPE
Using sparse partial least discriminant analysis (sPLS-DA), we developed a placental DNAme-based classifier that selected EOPE-associated CpG sites in a training group and were able to use this signature to accurately predict the disease status of samples in an independent validation group.
Parameter tuning during CV selected the optimal number of components and features (CpGs) per component, with a component being constructed from a linear combination of the features. One component constructed from 90 CpGs resulted in the lowest average overall misclassification error rate (OER) across all CV folds. The optimal number of components is typically K-1, with K being the number of classes to be predicted by the model. Thus, as expected, introducing more components in the model did not significantly improve CV performance. While this model performed well on CV (OER=0.10, AUC=0.97), 35% of the 90 CpGs selected had low stability (frequency of selection across CV folds < 0.5). These CpGs also contributed less than those with higher stability to the model’s discriminative ability, suggesting that a simpler model with less CpGs may be equally successful in predicting the outcome of new samples. To test this, we developed six additional models by selecting the number of CpGs a priori (75, 60, 45, 30, 15, 5) and assessed their performance on CV.
Decreasing the number of CpGs did not significantly affect model performance as evaluated using two discrete classification metrics (OER, ROC-AUC) and one probability-based metric (Brier score) (Table 3). The mean OER value of the seven models with different numbers of CpGs was 0.11 (±0.01 standard deviation); we saw a slight but non-significant increase in OER as the number of CpGs decreased (Figure 1a, Table 3). For all seven models, we also assessed the area under the curve (AUC, Table 3) of the receiver operating characteristic (ROC, Figure 1b) curve, which informs about the trade-off between the true positive rate and the false positive rate. Once again, the difference between the seven models on this metric was not significant, and all seven showed outstanding discrimination with an AUC of 0.95 (Applied Logistic Regression, 3rd Edition, n.d.). Lastly, we computed the average Brier score across CV folds for each model, which measures the mean squared difference between the predicted probability assigned to the possible outcomes for a given sample, and its actual outcome. All seven models had an average Brier Score of 0.14 (Table 3), which confirmed that the predicted probability of most samples matched the true likelihood of the outcome of interest.
We selected the 45 CpG classifier as our final model for outcome prediction on independent data (Table S4, Supplementary Materials). While a model relying on a very small number of CpGs (e.g., 5) may demonstrate a good average performance on CV, it might not be as generalizable to new data. At the same time, the larger model (90 CpGs) initially selected by CV does not perform significantly better than any of the smaller models. Therefore, we chose a model with an intermediate number of features (45 CpGs) to evaluate on new data.
Table 3. Average performance metrics from cross-validation folds in the seven models tested during development, and in the permuted model. OER is the overall error rate, defined as the number of all incorrect predictions divided by the total number of samples in the data. OER (EOPE) is calculated exclusively over the EOPE samples (n=38), and OER (nPTB) is calculated over the nPTB samples (n=11). AUC is the area under the receiver operating characteristic curve, it is a measure of separability. The Brier score measures the accuracy of probabilistic predictions.
Model
|
OER
|
OER SD
|
OER (EOPE)
|
OER (nPTB)
|
AUC
|
Brier score
|
90 CpGs
|
0.10
|
0.01
|
0.13
|
0.07
|
0.95
|
0.14
|
75 CpGs
|
0.11
|
0.02
|
0.13
|
0.08
|
0.95
|
0.14
|
60 CpGs
|
0.11
|
0.02
|
0.14
|
0.08
|
0.95
|
0.14
|
45 CpGs (Final Model)
|
0.11
|
0.01
|
0.13
|
0.08
|
0.95
|
0.14
|
45 CpGs (Permuted Model)
|
0.53
|
0.05
|
0.52
|
0.55
|
0.45
|
0.26
|
30 CpGs
|
0.12
|
0.02
|
0.14
|
0.09
|
0.95
|
0.14
|
15 CpGs
|
0.12
|
0.02
|
0.15
|
0.09
|
0.95
|
0.14
|
5 CpGs
|
0.13
|
0.13
|
0.15
|
0.10
|
0.93
|
0.14
|
Lastly, we compared the average performance across CV folds of the final 45 CpG model to a 45 CpG model that was trained on randomly permuted labels. The goal of this permutation test was to assess whether our final model performs better than a model would be expected to by chance. Across all metrics, it was clear that the performance of the model with permuted labels was significantly worse (0.52 OER, AUC=0.45) as compared to our final model (Table 3).
eoPred demonstrated good discriminatory ability (AUC=0.73) in an independent validation group
To test the performance of eoPred, we applied it to predict the disease status of 49 samples (38 EOPE, 11 nPTB) in the validation group. Each sample was assigned a probability of being classified as either EOPE or nPTB; while some were labelled with a high confidence, others were intermediate (Figure 2b). Youden’s J statistic, defined as the sum of sensitivity and specificity minus one, was used to choose an optimal threshold of 55% to classify samples (Figure 2a). However, this threshold can be modified by the user in the eoPred function, and it is encouraged that users explore the spread of probabilities in their data and determine a threshold that fits their question best. For example, higher probabilities of EOPE may be used to select samples with a more homogeneous placental phenotype.
Of the 38 EOPE samples in the validation group, 23 were predicted as EOPE (60% sensitivity) with a mean EOPE probability of 70% (Figure 2). the EOPE samples that were correctly classified as EOPE (31.85 weeks) had significantly lower average gestational age (p=0.00037) between and those that were misclassified as nPTB (35.46 weeks) (Figure 2c). In addition, 15 EOPE samples were misclassified as nPTB, these samples were all from the same dataset (GSE125605), which consisted of samples of Han Chinese ancestry as reported by the original authors (Figure 2d). All samples from GSE98224 (16 EOPE, 5 nPTB) were correctly classified. There was a wide range of EOPE probability values in the EOPE group, ranging from 56% to 82% (Figure 2b).
In the small sample size (n=11) of nPTB cases in the validation group, eoPred had a high specificity of 91%. The probability of EOPE assigned to nPTB cases, 10 of which were correctly classified as nPTB using a 55% probability threshold, ranged from 24% to 53% (Figure 2b), with an average of 41%. The misclassified nPTB sample had a EOPE probability of 67%.
Applying eoPred to the validation cohort produced an area under the curve (AUC) value of 0.725, meaning that 73% of the time, for a randomly-selected pair of nPTB and EOPE samples, the model will correctly assign a higher absolute risk to the sample with EOPE than to the sample without EOPE. Generally, an AUC of 0.7-0.8 is considered good discrimination (Applied Logistic Regression, 3rd Edition, n.d.).
Lastly, we also tested the 5 CpG model in the validation group. The predictions made by eoPred (45 CpGs) and the 5 CpG model were very similar (MAE in EOPE probability between the two models=0.4). The 5 CpG model predicted EOPE with 61% sensitivity, and correctly classified 25 EOPE samples. The 5 CpG model demonstrated 91% specificity, only misclassifying 2 more nPTB samples than eoPred (Figure S2, Supplementary). The threshold chosen as optimal using Youden’s J statistic was also remarkably similar (56%).
FGR and late-PE cases with earlier gestational ages tend to be classified as EOPE
We then predicted the disease status of 328 samples in the exploratory group, which included 2 EOPE, 86 late-PE, 3 nPTB, 181 nTB, 46 FGR, and 10 CPM16 cases. We applied eoPred to these samples to further explore its ability to detect true negatives (normotensive samples) and to investigate the EOPE probability of cases from pathologies that may overlap with the placental phenotype of EOPE, such as FGR and late-PE.
Of cases known with high confidence to be normotensive, 91% were correctly classified as nPTB (n=184, 181 nTB and 3 nPTB), with a mean EOPE probability of 31%. The 2 EOPE cases in the exploratory group were correctly assigned a high EOPE of 73% and 86%, respectively (Figure 2b). In addition, 28% of 86 late-PE cases and 20% of 46 FGR cases were predicted as EOPE, which could reflect the subset of cases in these groups with similar molecular pathology to EOPE. Both late-PE and FGR cases classified as EOPE had significantly lower gestational ages than their counterparts predicted to be normotensive (Table 4), with late-PE cases classified as EOPE having an average GA of 35.65 weeks and nPTB classified as EOPE having an average GA of 33.37 weeks. Moreover, of the 6 late-PE cases that presented with co-occurring FGR (which is commonly associated with EOPE), 4 were classified as EOPE with a mean EOPE probability of 69%, whereas 2 were predicted to be normotensive with 54% and 28% EOPE probability, respectively. Late-PE+FGR cases were more likely to be predicted EOPE (mean p(EOPE) = 69%); but not significantly more than late-PE without FGR (mean P(EOPE)=41%, p-value=0.15).
Table 4. Characterization of late-PE and FGR cases in the exploratory cohort, by their predicted outcome. P-values for categorical values were calculated with Fisher's exact test. Welch's t-test was used to compute the p-values of the continuous variables.
|
Late-preeclampsia (n=86)
|
|
Fetal growth restriction (n=46)
|
p
|
Predicted outcome
|
EOPE
|
nPTB
|
EOPE
|
nPTB
|
Total N (%)
|
24 (28%)
|
62 (72%)
|
|
9 (20%)
|
37 (80%)
|
|
Probability of EOPE
(range)
|
0.70
(0.57-0.82)
|
0.34
(0.13-0.54)
|
<2.2e-16
|
0.68
(0.55-0.79)
|
0.35
(0.13-0.55)
|
3.05E-08
|
Inferred weeks of gestational age (range)
|
35.65
(34.02-38.80)
|
37.48
(34.65-40.69)
|
5.41E-07
|
33.37
(28.17-37.65)
|
37.18
(32.16-39.75)
|
8.22E-05
|
Placental sex (XY/XX)
|
14/10
|
31/31
|
0.63
|
4/5
|
18/19
|
1
|
Predicted ancestry
|
|
|
0.32
|
|
|
0.17
|
African
|
5
|
23
|
|
2
|
5
|
|
Asian
|
3
|
9
|
|
3
|
5
|
|
European
|
16
|
30
|
|
4
|
27
|
|
FGR status
|
|
|
0.03
|
|
|
NA
|
Yes
|
4
|
2
|
|
9
|
37
|
|
No
|
4
|
5
|
|
0
|
0
|
|
Unknown
|
16
|
55
|
|
0
|
0
|
|
Confined placental mosaicism for chromosome 16 (CPM16) is highly associated with EOPE and almost always results in FGR. DNAme patterns of CPM16 placentae were previously reported to overlap with those observed in EOPE (Yong et al., 2003). Using the same data as in (Yong et al., 2003), we evaluated how eoPred would classify CPM16 samples (n=10). Five of the 10 CPM16 samples were also diagnosed as EOPE; eoPred correctly predicted four of these samples as EOPE, while the fifth fell just below our 55% probability cut-off. All 5 CPM16 samples in which PE had not been diagnosed were correctly classified as nPTB.
Relationship of eoPred with placental cell composition
DNAme patterns associated with PE have been reported both in cord blood and placenta, but with little overlap between the two, likely due to tissue-specificity of DNAme. We applied eoPred to 110 cord blood samples (including EOPE, LOPE, nPTB and nTB) to test whether the 45 CpG sites in our placental model could predict disease status in cord blood. However, all 110 samples were predicted as nPTB (average probability of 93%), demonstrating that the DNAme signature of EOPE developed in this study is specific to placental tissue.
Many variables have been reported to influence the placental DNA methylome throughout gestation, including cell composition, GA, and genetic ancestry. There were significant differences (Bonferroni adjusted p-value < 0.05) in the PlaNET-predicted proportion of stromal and Hofbauer cells between EOPE and nPTB (original author labels) in the training data, and in the proportion of stromal but not Hofbauer cells in the validation data (Figure S3, Supplementary Materials). However, none of the 45 predictive CpGs overlapped with any of the first or third trimester cell-type specific CpGs used by PlaNET to estimate cell composition, suggesting that these CpGs are not strongly driven by cell composition differences and should be robust to minor cell composition variation arising from sampling differences. The probability of EOPE was not strongly and significantly associated with any cell type proportion (R>0.3 and p-value<0.05) regardless of the predicted class of the samples, with the exception of cytotrophoblast, which was moderately (R=0.31, p=0.0053) associated with the probability of EOPE in samples in the validation and exploratory groups predicted as EOPE (Figure 3).
We also used the placental methylome browser (Yuan, 2021) to assess the cell-specific DNAme signature of the top five most predictive CpGs (cg10994126, cg26787199, cg14605117, cg10246581) in the model (i.e., those that were selected as most stable in the CV process). While DNAme varied by cell type at these CpGs, there were no consistent changes: some CpGs had higher DNAme in trophoblast cells than other cell types, whereas Hofbauer cells had the highest DNAme at others eoPred CpGs (Figure S4, Supplementary).
Influence of genetic ancestry on the DNAme signature of EOPE
Ancestry probabilities computed with PlaNET indicated that 65% of samples in the training group and 61% in the validation and exploratory groups were primarily of European ancestry (>75% probability) (Figure S5, Supplementary). We measured the strength of association between the probability of EOPE and each of the three ancestry probabilities using Spearman’s rank order correlation for all samples in the validation and exploratory cohorts. None of the three PlaNET ancestry probabilities (African/Asian/European) were strongly associated with the probability of EOPE (ρ < 0.2).
We then compared how the β values at the 45 predictive CpGs varied by ancestry within the EOPE and nPTB groups. DNAme at 96% of the 45 eoPred CpGs was not significantly different by ancestry within 52 samples originally labelled as nPTB in the training and validation groups, suggesting that these CpGs are not systematically affected by genetic ancestry (Figure S6, Supplementary). Interestingly, the differences at these 45 CpGs across ancestries within the eoPred-predicted groups were smaller than within the groups created based on clinical labels; for instance, only 11 CpGs showed significant differences by ancestry within samples with >55% probability of EOPE in the validation and exploratory groups (Figure S7, Supplementary), as opposed to 35 within samples defined as EOPE by dataset authors in the training and validation groups (Figure S6, Supplementary).
Lastly, we explored how eoPred’s predictions may be affected by ancestry by assessing the average DNAme β values of EOPE versus nPTB samples in the training and validation cohorts at the 45 predictive CpGs (Table 5). DNAme was, as expected, significantly lower in EOPE compared to nPTB samples (p-value < 0.05) within each ancestry group, and within each of the datasets included in the training and validation groups, with the exception of GSE125605 (Table 5). The mean β of EOPE samples in this dataset, which is comprised exclusively of samples of Asian ancestry, was also higher than all the other datasets in both the training and validation groups (Table 4). In the training group, the mean β of EOPE samples of Asian ancestry was very similar to those of European ancestry (Table 4), and the mean β value of the 4 EOPE samples of Asian ancestry from GSE98224 was also similar (0.40). Therefore, the misclassification of a greater proportion of EOPE samples in GSE125605 compared to GSE98224 may be due to a dataset effect rather than to genetic ancestry. We confirmed that there were no major cell composition differences due to sampling between GSE125605 and GSE98224, which could have been a potential source for this dataset effect (Figure S8, Supplementary).
Table 5. Average β values at 45 predictive CpGs, by dataset and by ancestry. The mean β at the 45 predictive CpGs was computed for EOPE and nPTB samples separately, using author-assigned labels. Δβ is defined as the absolute difference between EOPE mean β and nPTB mean β. p-values were computed with an unpaired t-test.
|
EOPE mean β (N)
|
nPTB mean β (N)
|
Δβ
|
p-value
|
Training group (N=83)
|
|
|
|
All samples
|
0.47 (42)
|
0.61 (41)
|
0.14
|
< 2.2e-16
|
|
|
|
|
|
By inferred ancestry
|
|
|
|
|
Asian ancestry
|
0.49 (7)
|
0.62 (6)
|
0.13
|
< 2.2e-16
|
African ancestry
|
0.48 (7)
|
0.62 (5)
|
0.14
|
< 2.2e-16
|
European ancestry
|
0.48 (28)
|
0.62 (30)
|
0.14
|
< 2.2e-16
|
|
|
|
|
|
By dataset
|
|
|
|
|
GSE100197
|
0.48 (22)
|
0.62 (28)
|
0.14
|
< 2.2e-16
|
GSE103253
|
0.47 (11)
|
0.61 (8)
|
0.15
|
< 2.2e-16
|
GSE57767
|
0.52 (5)
|
0.59 (3)
|
0.07
|
0.000408
|
GSE73375
|
0.38 (4)
|
0.48 (2)
|
0.1
|
3.57E-06
|
|
|
|
|
|
|
|
|
|
|
Validation (N=49)
|
|
|
|
|
All samples
|
0.49 (38)
|
0.55 (11)
|
0.06
|
5.73E-10
|
|
|
|
|
|
By inferred ancestry
|
|
|
|
|
Asian ancestry
|
0.51 (26)
|
0.55 (2)
|
0.04
|
0.06439
|
African ancestry
|
0.43 (5)
|
0
|
NA
|
NA
|
European ancestry
|
0.43 (7)
|
0.55 (9)
|
0.12
|
< 2.2e-16
|
|
|
|
|
|
By dataset
|
|
|
|
|
GSE98224
|
0.42 (16)
|
0.56 (5)
|
0.14
|
< 2.2e-16
|
GSE125605
|
0.53 (22)
|
0.54 (1)
|
0.006
|
0.8235
|
GSE203396
|
0
|
0.53 (5)
|
NA
|
NA
|
Model performance varies with different DNAme normalization methods
As the performance of machine learning models can be affected by data transformations, we evaluated the performance of eoPred with different normalization methods using differently normalized iterations of a subset of the validation group data with IDATs available (GSE98224 and GSE125605, n=44). We assessed eoPred’s predictions on data normalized using 7 common methods: beta-mixture quantile normalization (BMIQ) (Teschendorff et al., 2013), quantile normalization (QN) (Aryee et al., 2014), subset-quantile within array normalization (SWAN) (Maksimovic et al., 2012), Dasen (Pidsley et al., 2013), noob (Triche et al., 2013), QN + BMIQ, and noob + BMIQ.
While our model was trained on BMIQ normalized data, it performed better on other iterations with different normalization methods (Table 5). Applying eoPred to quantile-normalized data demonstrated the best performance across all metrics evaluated (Table 5). This evaluation is strictly based on machine learning evaluation metrics and does not consider model interpretability. Our sample size is small, and we cannot conclude a given normalization method will consistently outperform others: testing on more data is required to systematically assess the effect of normalization on eoPred’s performance.
Table 5. Performance metrics of eoPred's predictions on the validation cohort, with different data transformations. Metrics of performance that resulted from prediction on filtered, BMIQ and batch corrected data are highlighted since eoPred performed best (albeit not significantly) using this processing. BMIQ (beta-mixture quantile normalization) is a within-array normalization method that corrects the methylation distribution of type II features to make them comparable to the distribution of type I features. SWAN (subset-quantile within-array normalization) is similar to BMIQ in that it adjusts the intensity distributions between the two probe types, but it does so with raw intensities. Dasen performs background adjustment followed by between-array normalization performed separately by probe design. Noob also performs background correction and dye-bias equalization. QN (quantile normalization) also performs between-array normalization.
Subset of validation data (n=44)
|
Sensitivity
|
Specificity
|
Precision
|
Recall
|
F1
|
PPV
|
Brier score
|
Raw
|
0.68
|
0.67
|
0.93
|
0.68
|
0.79
|
0.93
|
0.20
|
BMIQ
|
0.61
|
1.00
|
1.00
|
0.61
|
0.75
|
1.00
|
0.21
|
Dasen
|
0.87
|
0.83
|
0.97
|
0.87
|
0.92
|
0.97
|
0.17
|
SWAN
|
0.61
|
1.00
|
1.00
|
0.61
|
0.75
|
1.00
|
0.21
|
QN
|
0.87
|
0.83
|
0.97
|
0.87
|
0.92
|
0.97
|
0.17
|
Noob
|
0.84
|
0.83
|
0.97
|
0.84
|
0.90
|
0.97
|
0.15
|
QN+BMIQ
|
0.74
|
1.00
|
1.00
|
0.74
|
0.85
|
1.00
|
0.19
|
Noob+BMIQ
|
0.82
|
1.00
|
1.00
|
0.82
|
0.90
|
1.00
|
0.16
|
DNAme signature of EOPE is not strongly associated with gene expression, but may be involved in pathways relevant to placentation and preeclampsia
We next sought to investigate the biological underpinnings of the eoPred CpGs. DNAme is an important epigenetic regulator of gene expression that can affect the interactions of chromatin-binding proteins and transcription factors with DNA at the specific locus where the methyl group is present, and over moderate genomic distances (Jones, 2012). Based on the knowledge that proximal CpGs have correlated DNAme states, we grouped eoPred CpGs with proximal correlated CpG sites not included in eoPred, across samples in the training group. This was achieved by discovering 57,666 co-methylated regions (CMRs) in the training group and identifying the CMRs that comprised one or more eoPred CpGs. Only 19 of the 45 predictive CpGs occurred in CMRs; the 19 CpGs mapped to 16 distinct CMRs, which varied in size from 2 (smallest CMR) to 11 (largest CMR) CpGs. For the next step, we considered all CpGs that mapped to CMRs overlapping the 45 eoPred CpGs (n=79).
We then analyzed the EOPE versus nPTB gene expression of 14 genes that mapped to the 79 CpGs, using a linear model. Of these 14 genes, 7 were differentially expressed by EOPE status in the discovery cohort (absolute fold-change (FC)>1.4 and p-value<0.05) ; only 2 were validated with the same thresholds in an independent cohort: synapse defective rho GTPase homolog 1 (SYDE1) on chromosome 19, which was higher expressed in EOPE samples (FC = 1.5) and filamin B (FLNB) on chromosome 3, which was also more highly expressed in EOPE (FC = 1.7).
We also tested whether gene expression was significantly different between EOPE and nPTB samples for the 25 genes that the full set of 45 predictive CpGs mapped to. These 45 CpGs are distributed throughout the autosomes; 7 of them (15%) are located on chromosome 1, followed by 5 (11%) on chromosome 16 and 5 (11%) on chromosome 3. Ten genes were significantly differentially expressed in the discovery cohort by EOPE status (absolute FC>1.4 and p-value<0.05), and three of these ten genes (SYDE1, FLNB, and SCARB1) were differentially expressed using the same thresholds in the validation cohort. All three were higher expressed in EOPE compared to nPTB samples. Notably, pregnancy-associated plasma protein 2 (PAPPA2), which has been extensively reported to be associated with PE and overlaps with 3 of the 45 predictive CpGs, was identified as a DEG in the validation but not the discovery cohort.
The 45 predictive CpGs were not significantly enriched for any GO terms (FDR<0.05). The top GO term (p-value=0.0002), despite not reaching FDR significance, was “adhesion of symbiont to host”. When GO enrichment was conducted only on the 10 CpGs that map to promoter regions, “regulation of vascular associated smooth muscle cell proliferation” is the top term, though not significant after FDR multiple test correction. Lastly, we assessed whether the 45 predictive CpG probes were enriched in placentally-relevant gene sets reported in a recent study (Naismith and Cox, 2021). Once again, no gene sets were significant (FDR<0.05), but the top gene set by nominal p-value was “genes upregulated in invasive cytotrophoblasts versus human chorionic trophoblast progenitor cells”, and the genes associated with that set were PPARG, PAPPA2, and RRBP1, all of which were identified as differentially expressed in the discovery cohort, though none were validated.
Lastly, we evaluated whether the 45 CpGs overlapped lists of CpGs with PE-associated placental DNAme found in eleven previous studies (Martin et al., 2015; Yeung et al., 2016; Berg et al., 2017; Kim et al., 2017; Zhao et al., 2017; Wilson et al., 2018; Wang et al., 2019a; Lim et al., 2020; van den Berg et al., 2020; Workalemahu Tsegaselassie et al., 2020). Of the 45 eoPred CpGs, 36 overlapped hits found to be EOPE-associated by Wilson et al., 2018, which is to be expected given that the discovery dataset (GSE100197) was included in our training group. Of the eoPred CpGs reported in Wilson et al., cg10246581, one of the top 5 most predictive eoPred CpGs, was also reported in Wang et al., 2019 and Kim et al., 2017. The data used by Wang et al., 2019 was in our validation group (GSE125605). Another CpG among the 5 most predictive (cg26625897) was also reported to be associated with EOPE in placental tissue by 3 studies (Yeung et al., 2016; Kim et al., 2017; Lim et al., 2020).