Patient characteristics and sample description
Sixty-one patients diagnosed with AP were included in the current study. The median age was 46.2 years (range, 22 to 65), and 62.3% were men. According to the revised Atlanta classification, the cohort comprised 17 patients with MAP, 7 with MSAP and 37 with SAP (Fig. 1, Supplementary Table S1 and Supplementary Table S2). For all patients, whole blood samples were drawn on days 1, 3 and 7 after hospital admission to identify DNA methylation markers during the period when AP may be rapidly advancing; additionally, patients with CT grades indicating significant pancreatic pathology also had blood drawn on days 14 and 21 to monitor changes in methylation markers after initial treatment.
To discover tissue-specific DNA methylation markers of organ injury, we first generated genome-wide DNA methylation profiles from 120 cfDNA samples collected at multiple time points from the 45 AP patients. We also mapped DNA methylation profiles from cfDNA samples of 24 age- and sex-matched healthy individuals as controls to minimize the interference of random background DNA methylation signals with AP diagnosis. cfDNA was extracted from the plasma of both AP patients and healthy individuals, prepared into DNA methylation libraries and sequenced on the Illumina HiSeq X10 platform.
Identify DNA methylation markers in plasma that detect acute pancreatitis
We reasoned that MAP and SAP may share similar DNA methylation features, but that these features are likely more pronounced in SAP samples because SAP involves more damage to internal organs or tissues, which releases more cfDNA into the blood than MAP does. Therefore, we stood a better chance of identifying general AP markers by first contrasting the DNA methylation profiles of SAP samples with those of healthy individuals in the training phase. To improve the power to detect subtle methylation differences in plasma DNA, we focused on a set of Methylation Haplotype Blocks (MHBs), in which local CpG methylation status is coordinated along single DNA molecules, such that tissue-specific signals are easier to detect with a haplotype-based scoring scheme.
To this end, we randomly assigned half of the healthy control cases (12 cases) and half of the SAP cases (22 cases, 69 samples) to a training set for marker discovery. The remaining samples of both classes, together with all MAP cases (17 cases, 39 samples), were assigned to an independent test set for validation. After filtering out poorly covered MHBs, a total of 43,358 MHBs were used for the following analysis (Data S1).
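The case-level random assignment described above can be sketched as follows. This is a minimal illustration, not the procedure actually used: the case identifiers, the seeds, the three samples per SAP case and the helper names `split_cases` and `samples_for` are all hypothetical. The point it shows is that splitting by case (rather than by sample) keeps all time points of one patient on the same side of the split.

```python
import random

def split_cases(case_ids, train_fraction=0.5, seed=0):
    """Randomly split case identifiers into (train, test) lists."""
    ids = sorted(case_ids)                 # fixed starting order for reproducibility
    random.Random(seed).shuffle(ids)       # seeded shuffle, independent of global state
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]

def samples_for(cases, samples_by_case):
    """Collect every sample that belongs to the given cases."""
    return [s for c in cases for s in samples_by_case[c]]

# Hypothetical identifiers: 24 healthy controls and 44 SAP cases.
healthy_cases = [f"HC{i:02d}" for i in range(1, 25)]
sap_cases = [f"SAP{i:02d}" for i in range(1, 45)]

# Toy sample layout (assumption): three time points per SAP case, one per control.
samples_by_case = {c: [f"{c}_day{d}" for d in (1, 3, 7)] for c in sap_cases}
samples_by_case.update({c: [f"{c}_plasma"] for c in healthy_cases})

healthy_train, healthy_test = split_cases(healthy_cases, seed=1)
sap_train, sap_test = split_cases(sap_cases, seed=2)

train_samples = samples_for(healthy_train + sap_train, samples_by_case)
test_samples = samples_for(healthy_test + sap_test, samples_by_case)
```

Because the split is made on case identifiers, no patient contributes samples to both the training and the test set.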
We quantified DNA methylation patterns on MHBs using several metrics, such as methylation haplotype load (MHL) and average methylation frequency (AMF), as candidate classifiers for AP diagnosis. We eventually determined that uMHL, a metric that quantifies the degree and linkage disequilibrium of unmethylated CpG sites in each MHB, is the most appropriate metric from which to derive a classifier: we identified 565 MHBs that are hypermethylated (uMHL scores < 0.1) in over 50% of healthy training samples and methylated to a lesser degree (uMHL scores ≥ 0.1) in more than 40% of SAP training samples (Fig. 2A, Supplementary Table S3). An AP-predicting model was then formulated that uses the aggregated uMHL scores on these MHBs to score each training sample. By plotting the scores of healthy and SAP samples separately, we demonstrated that these markers can accurately separate these two classes of plasma samples (p = 0.00085, Welch’s t-test) (Fig. 2B). The accuracy of classification was quantified using a receiver operating characteristic (ROC) curve, which achieved an AUC of 0.91 (sensitivity: 95.7%; specificity: 83.3%) on the training samples (Supplementary Fig. 1A). To validate these markers, we applied the AP prediction model to the test samples using the same cutoff as on the training samples (0.215) and achieved accurate prediction of AP (sensitivity: 97.2%; specificity: 75%), confirming the robustness of our uMHL-based model in AP diagnosis (Fig. 2C).
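The marker selection rule and the sample score described above can be sketched as follows. The thresholds (0.1, 50%, 40%) and the cutoff (0.215) are taken from the text. Everything else is an assumption made for illustration: the MHB identifiers and uMHL values are toy data, "aggregated" is interpreted as the arithmetic mean, and a score at or above the cutoff is read as AP.

```python
from statistics import mean

def fraction(values, predicate):
    """Fraction of values that satisfy the predicate."""
    return sum(1 for v in values if predicate(v)) / len(values)

def select_ap_markers(healthy, sap, umhl_cut=0.1, healthy_frac=0.5, sap_frac=0.4):
    """Keep MHBs with uMHL < cut in > healthy_frac of healthy samples
    and uMHL >= cut in > sap_frac of SAP samples."""
    selected = []
    for mhb in healthy:
        frac_healthy = fraction(healthy[mhb], lambda v: v < umhl_cut)
        frac_sap = fraction(sap[mhb], lambda v: v >= umhl_cut)
        if frac_healthy > healthy_frac and frac_sap > sap_frac:
            selected.append(mhb)
    return selected

def aggregate_score(sample_umhl, markers):
    """Aggregate a sample's uMHL over the markers (assumed here: the mean)."""
    return mean(sample_umhl[m] for m in markers)

# Toy training data: MHB id -> uMHL value per training sample.
healthy_train = {
    "mhb_a": [0.02, 0.05, 0.30, 0.01],
    "mhb_b": [0.20, 0.30, 0.05, 0.40],
    "mhb_c": [0.01, 0.02, 0.03, 0.04],
    "mhb_d": [0.00, 0.08, 0.09, 0.50],
}
sap_train = {
    "mhb_a": [0.40, 0.05, 0.60, 0.35],
    "mhb_b": [0.50, 0.60, 0.70, 0.80],
    "mhb_c": [0.02, 0.01, 0.50, 0.03],
    "mhb_d": [0.20, 0.30, 0.05, 0.02],
}

markers = select_ap_markers(healthy_train, sap_train)

AP_CUTOFF = 0.215  # cutoff reported in the text
new_sample = {"mhb_a": 0.4, "mhb_b": 0.9, "mhb_c": 0.0, "mhb_d": 0.2}
score = aggregate_score(new_sample, markers)
predicted_ap = score >= AP_CUTOFF  # direction of the comparison is an assumption
```

In this toy example `mhb_b` fails the healthy criterion and `mhb_c` fails the SAP criterion, so only `mhb_a` and `mhb_d` contribute to the score.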
To investigate the potential biological functions of these methylation markers, especially whether and how they are involved in the pathology of AP, we annotated these 565 MHBs using GREAT, a web portal for Gene Ontology (GO) annotation of regulatory regions. We observed significant enrichment of several GO terms that are closely connected to AP pathology (Fig. 2D, Supplementary Table S4), including regulation of cellular response to insulin stimulus and regulation of peptide hormone secretion, which are associated with normal pancreatic function, and regulation of metanephros development and foregut morphogenesis, which are associated with non-pancreatic organs that are often damaged during SAP (kidney and upper digestive tract, respectively). We also found enrichment of genes involved in myeloid differentiation and leukocyte degranulation, which are potentially related to SAP-induced local or systemic inflammatory responses. Overall, these enriched GO categories are consistent with the known pathology of AP, especially SAP.
Identify cfDNA methylation markers to classify MAP or SAP cases
We next sought to identify SAP-specific DNA methylation markers in order to assess severity and distinguish SAP from MAP, which has immediate clinical utility. To this end, we randomly assigned roughly half of the MAP cases (9 cases, 18 samples) and half of the SAP cases (22 cases, 72 samples) to a training set for marker discovery, and the remaining cases (8 MAP cases, 21 samples, and 22 SAP cases, 64 samples) to an independent test set for validation. Based on the results of AP-specific marker discovery, we again chose uMHL as the quantitative metric for SAP marker discovery and predictive model building. MHBs were filtered based on sequencing coverage to ensure statistical robustness.
We performed multiple rounds of exploratory marker screening on these MHBs and their uMHL values. Initial attempts to identify MHBs that are differentially methylated between MAP and SAP samples using a single uMHL score did not yield the desired results. We then turned to an alternative strategy of looking for MHBs whose mean uMHL values differ between SAP and MAP samples. After evaluating multiple cutoffs for the mean uMHL values and for the maximal or minimal uMHL values, we discovered 59 MHBs that are more methylated (max. uMHL < 0.7, mean uMHL < 0.5) in over 65% of MAP cases and less methylated (min. uMHL > 0.3, mean uMHL > 0.5) in over 65% of SAP cases, and used them to distinguish MAP from SAP plasma samples (Fig. 3A). We plotted the arithmetic average of the uMHL values of these MHBs for both MAP and SAP training samples (Fig. 3B). SAP samples had significantly higher average uMHL scores than MAP samples (p = 2.83 × 10⁻¹¹, Welch’s t-test), demonstrating that these MHBs (Supplementary Table S5) are less methylated in SAP samples than in MAP samples and that the average uMHL score can be used to differentiate MAP and SAP plasma samples. We therefore used the average uMHL scores to classify MAP and SAP training samples. With a cutoff of 0.532, we were able to classify SAP with an area under the receiver operating characteristic curve (AUC) of 0.97 (sensitivity: 87.5%; specificity: 94.4%) on the training samples. We then applied the MHB classifier to the independent test samples for validation. Using the same cutoff as in the training set, we were able to classify MAP and SAP samples with an AUC of 0.81 (sensitivity: 85.9%; specificity: 85.7%) (Fig. 3C). This accuracy is comparable to the performance of several clinically used stratification systems for early assessment of AP severity, including APACHE-II, BISAP and Ranson’s score.
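One possible reading of the selection criteria above is sketched below. The thresholds (0.7, 0.5, 0.3, 65%) and the cutoff of 0.532 come from the text. The sketch assumes that the maximum, minimum and mean are taken per case across that case's samples, which the text does not state explicitly. The MHB and case identifiers, the uMHL values and the function names are hypothetical.

```python
from statistics import mean

def is_map_like(samples):
    """Case looks 'more methylated': max uMHL < 0.7 and mean uMHL < 0.5."""
    return max(samples) < 0.7 and mean(samples) < 0.5

def is_sap_like(samples):
    """Case looks 'less methylated': min uMHL > 0.3 and mean uMHL > 0.5."""
    return min(samples) > 0.3 and mean(samples) > 0.5

def case_fraction(cases, is_hit):
    """Fraction of cases whose per-case uMHL values satisfy is_hit."""
    return sum(1 for samples in cases.values() if is_hit(samples)) / len(cases)

def select_severity_markers(map_umhl, sap_umhl, min_frac=0.65):
    """Keep MHBs that are MAP-like in > min_frac of MAP cases
    and SAP-like in > min_frac of SAP cases."""
    return [
        mhb for mhb in map_umhl
        if case_fraction(map_umhl[mhb], is_map_like) > min_frac
        and case_fraction(sap_umhl[mhb], is_sap_like) > min_frac
    ]

# Toy data: MHB id -> case id -> uMHL values across that case's time points.
map_umhl = {
    "mhb_1": {"map01": [0.2, 0.3], "map02": [0.1, 0.4, 0.3], "map03": [0.6, 0.8]},
    "mhb_2": {"map01": [0.6, 0.8], "map02": [0.55, 0.6], "map03": [0.2, 0.3]},
}
sap_umhl = {
    "mhb_1": {"sap01": [0.6, 0.7, 0.8], "sap02": [0.4, 0.9], "sap03": [0.2, 0.9]},
    "mhb_2": {"sap01": [0.6, 0.7], "sap02": [0.8, 0.9], "sap03": [0.7, 0.6]},
}

severity_markers = select_severity_markers(map_umhl, sap_umhl)

SAP_CUTOFF = 0.532  # cutoff reported in the text

def classify_sample(sample_umhl, markers, cutoff=SAP_CUTOFF):
    """Label a sample by the arithmetic average of its uMHL over the markers."""
    avg = mean(sample_umhl[m] for m in markers)
    return "SAP" if avg >= cutoff else "MAP"  # tie handling is an assumption

label = classify_sample({"mhb_1": 0.61, "mhb_2": 0.10}, severity_markers)
```

Here `mhb_1` passes both criteria in two of three cases on each side (about 67%), while `mhb_2` is MAP-like in only one of three MAP cases and is dropped.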
Identify optimal clinical blood tests to predict SAP
We have demonstrated that a set of cfDNA methylation markers can predict the severity of AP with an accuracy comparable to that of several commonly used clinical AP stratification systems. To further improve the accuracy of severity prediction, we sought to integrate conventional biomarkers from body fluids into the cfDNA methylation-based SAP prediction model. A number of traditional biomarkers are routinely used by clinicians either to diagnose AP (such as the level of blood amylase or lipase) or to monitor AP patients’ physiological conditions (blood electrolytes, etc.), inflammatory responses (levels of white blood cells, etc.) or organ damage (indicators of kidney, liver and lung function, etc.). We sought to identify a small subset of these markers that are most indicative of SAP and that can be combined with the cfDNA methylation SAP markers to improve the overall SAP prediction accuracy. Furthermore, we aimed to select markers that can be measured during the first 24 hours of AP patients’ hospitalization, in order to inform treatment decisions in a timely manner.
We surveyed 93 non-invasive clinical tests (Supplementary Table S1), which were performed on 61 AP cases and a total of 175 samples. Samples from all collection dates were used in the analyses; they therefore provided a comprehensive and dynamic measurement of key biomarkers with which to assess the temporal progression of AP pathology and severity. The types of body fluids used in these tests included venous and arterial blood, and urine. We also included vital signs such as body temperature in our analyses. For benchmarking, each case was given a revised Atlanta classification (RAC) grade, against which the accuracy of SAP prediction by the selected clinical tests was evaluated.
We first performed a proof-of-principle prediction of SAP samples using all available clinical test results. We trained the all-markers model using a training set (9 MAP cases, 18 samples; 22 SAP cases, 72 samples), and detected SAP cases at a reasonable level of accuracy (AUC = 0.8) in the test set (8 MAP cases, 21 samples; 22 SAP cases, 64 samples) (Fig. 4A). This suggested that a substantial number of the clinical tests in this all-tests SAP prediction model are informative of the pathologies that define SAP, so that even without any marker selection the all-tests model was still capable of predicting SAP with moderate accuracy. We reasoned that by removing measurements that contribute little to detection accuracy, we should be able to simplify the predictor and improve it to an accuracy comparable to RAC. Meanwhile, the large number of tests also allowed us to choose those that can be completed within 24 hours after the collection of body fluids, thus enabling early SAP diagnosis.
We then focused on 66 tests that measure biomarkers in venous blood. This was mainly because venous blood contains the majority of measurable biomarkers and is safe and easy to collect, and because many venous blood-based tests return results within 24 hours after blood collection. Body temperature was also included because it is convenient to measure. Filtering the 66 clinical tests based on data availability left 57 tests for marker discovery. Using the Recursive Feature Elimination algorithm of the Python package “sklearn” and the training set samples, we screened those 57 venous blood-based tests by recursively pruning the tests that contributed the least to the accuracy of SAP diagnosis, and identified the top 20 tests, which formed an SAP prediction model with an AUC of 0.99 (Supplementary Table 1, Supplementary Fig. 1B), nearly as high as that of RAC classification (1.0 by definition).
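The feature screening step can be sketched with scikit-learn's `RFE` class as follows. This is a minimal sketch under stated assumptions: the base estimator used for elimination is not given in the text, so logistic regression is assumed here; the data are synthetic (57 random features, the first five made informative); and the standardization step, the seed and the sample counts are illustrative choices, not the study's settings.

```python
import numpy as np
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.preprocessing import StandardScaler

N_TESTS = 57   # venous blood-based tests retained after filtering
N_KEEP = 20    # size of the reduced model

def make_synthetic(n_samples, rng):
    """Synthetic stand-in for clinical test results (not real data)."""
    y = np.arange(n_samples) % 2            # alternate labels: 0 = MAP, 1 = SAP
    X = rng.normal(size=(n_samples, N_TESTS))
    X[:, :5] += 1.5 * y[:, None]            # only the first five tests carry signal
    return X, y

rng = np.random.default_rng(0)
X_train, y_train = make_synthetic(90, rng)
X_test, y_test = make_synthetic(85, rng)

# Standardize so that coefficient magnitudes are comparable across tests.
scaler = StandardScaler().fit(X_train)

# Drop one test per round (step=1) until N_KEEP remain; the assumed base
# estimator ranks tests by the absolute value of their coefficients.
selector = RFE(LogisticRegression(max_iter=1000), n_features_to_select=N_KEEP, step=1)
selector.fit(scaler.transform(X_train), y_train)

selected_tests = np.flatnonzero(selector.support_)   # column indices of kept tests
train_auc = roc_auc_score(y_train, selector.predict_proba(scaler.transform(X_train))[:, 1])
test_auc = roc_auc_score(y_test, selector.predict_proba(scaler.transform(X_test))[:, 1])
```

Comparing `train_auc` with `test_auc` on held-out samples is the natural check for overfitting when the number of retained tests is large relative to the number of training samples.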
However, this model underperformed in the test set (AUC = 0.78) (Supplementary Fig. 1C), possibly due to overfitting. To improve the prediction model, we first kept the tests in the 20-test model that contribute the most to prediction accuracy and whose targets are known to be associated with the risk of AP (for example, triglyceride level) or with its symptoms (urea nitrogen elevated by kidney damage and dysfunction, etc.). We also added 5 tests to the prediction model based on their clinical significance in AP. Among them, globulin level represents the inflammatory response; creatinine, uric acid and estimated glomerular filtration rate all indicate kidney damage and dysfunction; and serum chloride has been reported to be indicative of SAP.
We then rebuilt a logistic regression model (Python package “statsmodels”) using the 25 tests and recursively removed the tests that contributed the least to SAP prediction accuracy, based on performance on the training set. The final model (Supplementary Table S6) contains 7 tests from the original 20 and all 5 new ones. It classified MAP and SAP samples in the training set with an AUC of 0.95 (sensitivity: 95.82%; specificity: 83.33%). We proceeded to validate the 12-biomarker SAP prediction model on the validation set, in which it predicted SAP samples with an AUC of 0.97 (sensitivity: 87.5%; specificity: 100%) (Fig. 4B). This accuracy is nearly as high as that of RAC; we therefore believe this model is likely sufficient for routine clinical diagnosis of SAP during the first 24 hours of AP patients’ hospitalization.
The 12-biomarker model mainly measured markers indicative of organs that are known to be frequently damaged in SAP, especially the kidney (urea nitrogen, creatinine, etc.), or markers informative of inflammatory responses (levels of neutrophils, lymphocytes, erythrocytes, etc.). Both categories are intimately connected to the main pathologies of SAP, which may explain their capacity to collectively predict SAP. Among them, two measurements of red blood cell level and volume have the highest overall weight in the prediction model (Fig. 4C), followed by markers indicative of inflammation (neutrophil and lymphocyte levels), and then by those of kidney function.
Finally, we built an expanded SAP prediction model by combining the average uMHL scores of the predefined 59 cfDNA methylation markers with the 12 clinical tests and performing logistic regression on these markers using the training set (9 MAP cases, 18 samples; 22 SAP cases, 72 samples). The expanded model (Supplementary Table S7) was able to classify an independent test set comprising 8 MAP cases (21 samples) and 22 SAP cases (64 samples), achieving an AUC of 0.96 (sensitivity: 92.2%; specificity: 90.5%) (Fig. 4D). While the AUC is almost identical to that of the 12-biomarker-only model, the shape of the ROC curve is slightly different, such that the sensitivity improved from 87.5% to 92.2%. This is clinically significant in pancreatitis care, because a minor reduction in specificity from 100% to 90.5% is manageable and does not lead to adverse outcomes. In contrast, identifying SAP earlier and with higher sensitivity allows for timely adjustment of treatment options, such as fluid resuscitation, enteral nutrition, interventional endoscopy, continuous regional arterial infusion and surgical treatment, which have been well documented to reduce the mortality of SAP patients.
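The trade-off described above can be made concrete by converting the reported rates back into sample counts. The sketch below assumes that sensitivity and specificity were computed per sample on the 64 SAP and 21 MAP test samples; the resulting counts are inferred from the reported percentages and are not stated in the text. The helper names are hypothetical.

```python
N_SAP_TEST = 64   # SAP test samples reported in the text
N_MAP_TEST = 21   # MAP test samples reported in the text

def implied_count(rate_percent, n_samples):
    """Nearest whole number of samples consistent with a reported percentage."""
    return round(rate_percent / 100 * n_samples)

def summarize(sensitivity, specificity):
    """Confusion counts implied by sensitivity and specificity (in percent)."""
    true_pos = implied_count(sensitivity, N_SAP_TEST)
    true_neg = implied_count(specificity, N_MAP_TEST)
    return {
        "true_pos": true_pos,
        "false_neg": N_SAP_TEST - true_pos,
        "true_neg": true_neg,
        "false_pos": N_MAP_TEST - true_neg,
    }

clinical_only = summarize(87.5, 100.0)   # 12-biomarker model
combined = summarize(92.2, 90.5)         # 12 biomarkers + cfDNA methylation score

extra_sap_detected = combined["true_pos"] - clinical_only["true_pos"]
extra_false_alarms = combined["false_pos"] - clinical_only["false_pos"]
```

Under these assumptions, the combined model detects 59 rather than 56 of the 64 SAP samples, at the cost of flagging 2 of the 21 MAP samples as SAP.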