Liquid Biopsy in Cholestatic Liver Disease Using Hepatocyte-Specific Methylation Markers

Background: Quantification of circulating organ-specific cell-free DNA (cfDNA) provides a sensitive measure of ongoing cell death that could benefit evaluation of the cholestatic liver diseases primary biliary cholangitis (PBC) and primary sclerosing cholangitis (PSC), which lack reliable non-invasive biomarkers. Our goal in this pilot study was to determine whether liver-specific cfDNA levels are increased in PBC and PSC patients relative to controls and in advanced versus early disease, to evaluate their potential as novel disease biomarkers. Methods: Peripheral blood derived bisulfite-treated DNA was PCR amplified from patients with PBC (n=48), PSC (n=48) and controls (n=96) to evaluate methylation status at 16 CpG sites reported to be specifically unmethylated in liver tissue near the genes IGF2R, ITIH4 and VTN. Amplicons were used to prepare paired-end libraries, which were sequenced on a MiSeq sequencer. Trimmed reads were aligned and used to determine unmethylation ratios and to calculate the concentration of liver-specific cfDNA. Comparisons between groups were performed using the two-tailed Mann-Whitney test and relationships between variables were evaluated using Pearson's correlation. Results: Levels of liver-specific cfDNA, as measured at the 3 genetic loci, were increased in PBC and PSC patients relative to controls and in late-stage relative to early-stage patients. In addition, cfDNA levels were correlated with levels of alkaline phosphatase, a biochemical test commonly used to evaluate disease severity in liver disease, in patients but not in controls. Conclusions: cfDNA offers promise as a non-invasive liquid biopsy to evaluate liver-specific cell death in patients with cholestatic liver diseases.

are approved medications to treat PBC, ursodeoxycholic acid and obeticholic acid, neither of which has FDA approval for use in PSC, which currently lacks therapeutic options. (2) Regardless of the differences, both PBC and PSC are progressive diseases and orthotopic liver transplantation (OLT) is eventually required in many patients. (3,4) Despite some recent progress, PSC and PBC still lack reliable noninvasive prognostic biomarkers, (5) hampering the prediction of disease outcomes and assessment of the effect of therapy. (6) To address this unmet need, we have utilized an assay designed to detect liver-specific circulating cell-free DNA (cfDNA) in plasma as a potential prognostic biomarker for PBC and PSC.
Apoptotic and injured dying cells are constantly releasing DNA into the blood and levels of this cfDNA have been shown to increase in cancer, cardiovascular disease, sepsis, autoimmune diseases and following intensive exercise. (7)(8)(9)(10)(11) Detection of cfDNA coming from particular organs relies on DNA methylation signatures that are organ specific. Such signatures have recently been reported for a wide range of tissues and cell types including the liver. (12)(13)(14) For instance, a recent study reported CpGs near the genes IGF2R, VTN and ITIH4 to be specifically unmethylated in the liver and showed these marks to be detectable in plasma of normal controls and increased following liver transplantation and in the context of liver damage in the setting of sepsis. (12) However, other liver pathologies were not assessed in this report. Our goal in this pilot study was to determine whether the levels of these liver-specific unmethylated CpGs are increased in the plasma cfDNA of PBC and PSC patients relative to controls and in late-stage versus early-stage disease, as a means to evaluate their potential utility as novel disease biomarkers.

Study subjects
The study was approved by the Mayo Clinic Institutional Review Board and conforms to standards laid out in the Declaration of Helsinki. All participants provided written informed consent. Patients with PSC were selected from the PSC Resource of Genetic Risk, Environment and Synergy Studies (PROGRESS) (15) and patients with PBC were participants of the Mayo Clinic PBC Genetic Epidemiology Registry and Biospecimen repository. (16) As age and sex distributions differ between PBC and PSC, separate control populations with no history of liver disease were selected for each disease from the aforementioned resources. The diagnosis of PSC and PBC was based on standard clinical, biochemical, cholangiographic and histological criteria. (17,18) PBC and PSC patients were selected to equally represent early and late disease stages. For PSC, late disease was defined as having serum alkaline phosphatase (ALP) greater than 3 times the upper limit of normal (ULN) and/or bilirubin greater than 2.5 mg/dL at time of sample collection, or progression to OLT within 4 years of follow-up. Late PBC was defined similarly, although bilirubin values were not available. Early PSC and PBC were defined as having ALP less than 1.1 times the ULN at sample collection with no evidence of elevated bilirubin, cirrhosis or OLT in follow-up.
Plasma and cfDNA preparation
Plasma samples were collected in EDTA-containing tubes and stored at -80°C prior to use. Thawed samples were centrifuged twice for 10 minutes at 1,500 rpm at 4°C to remove cellular debris and the supernatant was stored at -80°C prior to further processing. cfDNA was extracted from 2 ml of plasma using the Qiagen Cell-Free DNA (cfDNA) Purification Kit (Qiagen) and cfDNA concentration was measured using Qubit (Thermo Scientific). The cfDNA was then treated with bisulfite using the Zymo Research EZ DNA Methylation-Gold™ Kit (Zymo Research) following the manufacturer's recommended protocol.
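The staging rules above can be written as a small decision function. The following is only a sketch of the stated criteria, not the authors' selection procedure: the record type `Patient`, its field names and the "unclassified" label are hypothetical, and because the text gives no numeric threshold for "elevated bilirubin" in the early-stage definition, that criterion is passed in as a flag rather than computed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Patient:
    """Hypothetical record; field names are illustrative, not from the study."""
    alp_x_uln: float                         # serum ALP as a multiple of the ULN
    bilirubin_mg_dl: Optional[float] = None  # None when unavailable (as for PBC)
    years_to_olt: Optional[float] = None     # None if no OLT during follow-up
    bilirubin_elevated: bool = False         # clinical flag; threshold not given in the text
    cirrhosis: bool = False


def classify_stage(p: Patient) -> str:
    """Return 'late', 'early' or 'unclassified' per the definitions in the text."""
    # Late: ALP > 3x ULN and/or bilirubin > 2.5 mg/dL, or OLT within 4 years.
    late = (
        p.alp_x_uln > 3.0
        or (p.bilirubin_mg_dl is not None and p.bilirubin_mg_dl > 2.5)
        or (p.years_to_olt is not None and p.years_to_olt <= 4.0)
    )
    if late:
        return "late"
    # Early: ALP < 1.1x ULN with no elevated bilirubin, cirrhosis or OLT.
    early = (
        p.alp_x_uln < 1.1
        and not p.bilirubin_elevated
        and not p.cirrhosis
        and p.years_to_olt is None
    )
    # Patients meeting neither definition fall outside both groups.
    return "early" if early else "unclassified"
```

Under these assumptions, a patient with ALP between 1.1 and 3 times the ULN and no other qualifying feature belongs to neither group, which reflects the deliberate selection of patients from the two ends of the severity range.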

Next generation sequencing
Bisulfite-treated DNA was amplified by multiplex PCR using the Qiagen multiplex PCR kit (Qiagen) with primers specific for bisulfite-treated DNA but independent of methylation status at 16 monitored CpG sites in the vicinity of IGF2R (6 CpGs), VTN (5 CpGs) and ITIH4 (4 CpGs), which are specifically unmethylated in liver tissue, as described previously. (12) Primer sequences were: IGF2R: L: TGGGTGTTGTTATTTTGTTGA and R: CTACAAAAATACACACCCCAA (94 bp); ITIH4: L: ATAGTGAAGATGTTAGTTTGTTTTT and R: AACACACTTACCTAATAACCAAAC (137 bp); VTN: L: GGTATTTTGAAGAGGTAGGTTT and R: ACCTAAATACCCCAAACTCAT (108 bp). CpG locations are provided in Table 1. PCR products were cleaned with ExoSap-IT (Thermo Scientific) and sent to the genome analysis core at Mayo Clinic for library preparation and sequencing. Quality and quantity of amplicon DNA were analyzed by Qubit (Thermo Scientific) and Bioanalyzer (Agilent). Individual paired-end libraries were prepared using the NEBUltra II kit (New England Biolabs) without DNA fragmentation. As the combined length of the 3 multiplex amplicons was only 339 bp, each disease/control group of 96 samples was barcoded and sequenced on a single lane of a MiSeq sequencer (Illumina).

Bioinformatics and statistical data analysis
Adapter sequences were trimmed from the de-multiplexed raw sequence data in fastq format using Trim Galore [Trim Galore v0.4.4, https://www.bioinformatics.babraham.ac.uk/projects/trim_galore/]. Paired-end reads greater than 20 bases long after trimming and low-quality base removal were aligned to human reference genome hg38 using BSMAP (v2.73) (19) with default parameters, followed by sorting and indexing of the aligned BAM files. Methylation data were extracted for uniquely mapped read pairs from the aligned BAM files by a BSMAP script and the data were merged by CpG position across all samples. Off-target CpG sites were excluded and only the 16 targeted CpGs were analyzed further. CpGs were considered unmethylated if "TG" was read and methylated if "CG" was read. We determined absolute levels of cfDNA in genome equivalents per ml (Geq/ml) as previously described. (14) Briefly, we calculated the unmethylation ratio for each locus by dividing the number of unmethylated reads by the total number of reads for all included CpGs. Then, we multiplied this ratio by the total concentration of cfDNA isolated from the 2 ml plasma sample. Finally, we converted from units of ng/ml to genomic equivalents per ml by multiplying by a factor of 303, assuming the mass of a single haploid genome to be 3.3 picograms. The values obtained represent the amount of liver-specific cfDNA in circulation, as measured for each locus, and were used in downstream analyses. Categorical variables were compared using the chi-square or Fisher's exact test and continuous variables were compared using the Mann-Whitney test, with values expressed as median and interquartile range (IQR). Correlation between variables was determined by calculating the Pearson correlation coefficient. P-values of 0.05 or less were considered significant.
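The conversion from read counts to Geq/ml described above can be illustrated with a short sketch. This is not the authors' pipeline: the function names and the example numbers are hypothetical, and the code assumes that the measured cfDNA concentration is expressed in ng per ml of plasma. The factor of 303 follows from 1 ng = 1,000 pg divided by 3.3 pg per haploid genome.

```python
HAPLOID_GENOME_PG = 3.3   # assumed mass of one haploid genome, from the text
PG_PER_NG = 1000.0        # picograms per nanogram


def call_cpg(dinucleotide: str):
    """Classify a bisulfite-converted CpG read: 'TG' unmethylated, 'CG' methylated."""
    calls = {"TG": "unmethylated", "CG": "methylated"}
    return calls.get(dinucleotide.upper())  # None for any other dinucleotide


def unmethylation_ratio(unmethylated_reads: int, total_reads: int) -> float:
    """Unmethylated reads divided by all reads over the included CpGs of a locus."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return unmethylated_reads / total_reads


def liver_cfdna_geq_per_ml(unmethylated_reads: int, total_reads: int,
                           cfdna_ng_per_ml: float) -> float:
    """Liver-specific cfDNA in genome equivalents per ml for one locus."""
    ratio = unmethylation_ratio(unmethylated_reads, total_reads)
    liver_ng_per_ml = ratio * cfdna_ng_per_ml
    # ng/ml -> pg/ml -> haploid genome equivalents per ml (factor of about 303)
    return liver_ng_per_ml * PG_PER_NG / HAPLOID_GENOME_PG


if __name__ == "__main__":
    # Hypothetical locus: 400 unmethylated of 10,000 reads, 10 ng/ml total cfDNA
    print(round(liver_cfdna_geq_per_ml(400, 10_000, 10.0), 1))
```

With these illustrative inputs the unmethylation ratio is 0.04, so 0.4 ng/ml of the cfDNA is attributed to the liver, which corresponds to about 121 Geq/ml.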

Patient Characteristics
A total of 48 PBC patients and 48 PSC patients were selected and matched to separate groups of 48 unaffected controls based on sex, reported race and age at sample collection. Following data generation, one of the PSC patients was found to be an outlier, having liver-specific DNA levels greater than 2-fold higher at each locus than all other patients, and was removed from the study, leaving 47 PSC patients. The characteristics of these patient-control groups are presented in Table 2 (PBC) and Table 3 (PSC). The patient groups were further separated into two groups of 24 patients with early- or late-stage disease based on biochemical and clinical data. These groups were well-matched for most parameters, but PBC patients with late-stage disease were younger at diagnosis than those with early disease, median 42.9 years vs. 52.1 years, respectively, p=0.0185 (Table 2). In PSC this trend was opposite, with advanced disease patients being diagnosed later than patients in the early disease group, median 46.1 years vs. 36.2 years, respectively. However, this difference was not statistically significant, p=0.0693 (Table 3).

Assay Performance
We found that the multiplex amplicon-based method provides very high read counts at each CpG site, with median counts in the range of 10,000 for CpGs in the VTN amplicon and over 30,000 for CpGs in the ITIH4 and IGF2R amplicons (Table 1). The ratios of unmethylated to methylated CpGs were relatively consistent in the CpG sites at IGF2R and VTN across the study population, with median values ranging from 0.030-0.041 and 0.045-0.064, respectively (Table 1). However, the unmethylation ratios of CpGs in ITIH4 were more variable, with one of the CpGs, ITIH4-1, being significantly higher than other evaluated CpGs, with a median unmethylation ratio of 0.280 (Table 1). This suggests either an assay-based artifact or that ITIH4-1 unmethylation may not truly be liver-specific; thus, it was removed from the analysis. Liver-specific DNA concentrations in our controls seemed to be higher than those in the original report, (12) possibly due to minor technical differences in the assay used. Consistent with the previous report, (12) we did not detect an influence of age on liver-specific DNA concentration in controls (Figure 1A). Likewise, age did not influence liver-specific DNA concentration in PBC (Figure 1B) or PSC (Figure 1C) patients. Finally, we found that sex did not influence liver-specific DNA levels as measured by all 3 genes: IGF2R (Figure 2A), ITIH4 (Figure 2B) and VTN (Figure 2C).
Liver-specific circulating cfDNA is increased in PBC and PSC patients compared to controls and in late-stage compared to early-stage disease
The liver-specific circulating cfDNA (Geq/ml) values were used to make comparisons between patient and control groups and between patients with early- and late-stage disease. Results of these analyses are shown in Figure 3. Figure 3D).
Liver-specific circulating cfDNA levels are correlated with alkaline phosphatase levels in PBC and PSC patients but not in controls
Liver function tests, particularly ALP, are often used to evaluate liver damage and disease severity in cholestatic liver diseases such as PBC and PSC. (20,21) Thus, we evaluated the potential correlation between liver-specific circulating cfDNA and ALP (expressed as times the ULN) using the Pearson correlation coefficient. The results of these analyses are presented in Figure 4 and show significant correlation between ALP and cfDNA levels as measured by all 3 genes in PBC (Figure 4A) and PSC (Figure 4B) but not in controls (Figure 4C). We also had data available for total bilirubin, another commonly used liver function test, in the PSC patients and found that those values did not correlate with liver-specific DNA levels as measured by any of the 3 genes (Figure 4D).
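For readers less familiar with the statistic, the Pearson correlation coefficient used in this analysis can be computed as in the sketch below. The function name and all input values are illustrative assumptions, not study data, and the sketch reports only r; it does not reproduce the significance testing shown in Figure 4.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when either variable is constant")
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical values: ALP as multiples of the ULN and cfDNA in Geq/ml
    alp_x_uln = [0.8, 1.0, 2.5, 3.2, 4.1]
    cfdna_geq_ml = [90.0, 110.0, 180.0, 260.0, 300.0]
    print(pearson_r(alp_x_uln, cfdna_geq_ml))
```

Pearson's r measures linear association and is sensitive to extreme values, which is one reason the handling of outliers matters for an analysis of this kind.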

Discussion
Interrogation of organ-specific methylation patterns in circulating cfDNA is an emerging approach with great clinical potential, especially in settings where traditional means of evaluation require invasive techniques such as biopsy. Such an approach would be particularly valuable for evaluating cholestatic liver diseases such as PBC and PSC, as clinical guidelines do not recommend routine use of biopsy in these conditions due to the risk of complications related to this invasive procedure. Here we demonstrate that levels of liver-specific circulating cfDNA, as measured by methylation patterns, are increased in PBC and PSC patients relative to control groups and in late-stage compared to early-stage disease. We also demonstrate that the cfDNA levels correlate with ALP, a biochemical test commonly used to evaluate disease severity in PBC and PSC. Together, these findings suggest cfDNA assays may have potential clinical utility in cholestatic liver disease.
The bulk of research into the use of circulating cfDNA to evaluate disease has focused on noninvasive tumor evaluation, (7) prenatal testing (22) and solid organ transplantation, (23) primarily exploiting differences in DNA sequence. Studies relying on organ-specific DNA methylation patterns have recently become more practical and are showing promise in a wide range of diseases including diabetes, (24) cardiovascular disease (25) and neurodegenerative disorders. (26) Utility of cfDNA in the context of liver transplantation (27,28) and other liver diseases including hepatitis B, (29) nonalcoholic fatty liver disease (30) and hepatocellular carcinoma (31) has been reported. However, to our knowledge, there has not been another study looking at the potential of cfDNA as a biomarker in PSC and PBC.
In our study we focus on an assay that interrogates CpGs at 3 genetic loci that were previously reported to be specifically unmethylated in the liver. For the genes IGF2R and ITIH4 the unmethylated state was specific to hepatocytes, while VTN was unmethylated in both hepatocytes and cholangiocytes (i.e., biliary epithelial cells). (12) Bile acid induced hepatocellular injury due to ongoing cholestasis has long been appreciated as a pathological feature of PBC and PSC (32) and the precise mechanisms by which this occurs are becoming clearer. (33) Thus, monitoring hepatocyte death as a proxy for ongoing disease activity is a valid approach, which our data support. However, the use of cholangiocyte-specific epigenetic marks may prove more beneficial, particularly for PBC, in which cholangiocyte apoptosis plays a pivotal role in pathogenesis. (34) Indeed, discovery of cell-type specific epigenetic modifications in cholangiocytes and other liver-resident cells should be a priority for future studies seeking to utilize cfDNA to monitor cholestatic and other liver diseases.
While our study was designed to be able to detect the differences in cfDNA that we describe, there are limitations to our approach. First, we used stored plasma samples collected under variable conditions; thus, the cfDNA could include additional DNA from leukocytes that underwent cell death after sample collection, potentially diluting the liver-specific signal. To avoid this, future studies should use samples that were purpose-collected using up-to-date methods and appropriate sampling tubes designed for collection of cfDNA. Second, we rely on amplicon-based next-generation sequencing, which is a time-consuming process. Future studies should focus on using emerging approaches such as digital droplet PCR, (35) which once optimized can be performed quickly and reproducibly. Finally, there is significant inter-individual variability present in the data. Most notably, we find that some patients with early-stage, and even late-stage, disease have liver-specific cfDNA levels at the low end of what is observed in the controls. Whether this was due to variation in sample handling or is influenced by other factors such as ursodeoxycholic acid treatment remains to be determined. Larger studies, purpose-designed to evaluate such effects and the extent of intra-individual variability in cfDNA measurements over time, will be needed to establish the clinical utility of cfDNA in PBC and PSC.

Conclusions
In conclusion, cfDNA offers promise to become a non-invasive liquid biopsy to evaluate liver-specific cell death in patients with cholestatic and other liver diseases. However, several challenges need to be overcome before this technology is ready for routine clinical use.

Availability of data and materials
The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.

Competing interests
The authors declare that they have no competing interests.

Figure 1
Lack of correlation between liver-specific cfDNA levels and participant age. Our study did not identify correlation between age and liver-specific DNA levels as measured at all 3 genes: IGF2R, ITIH4 and VTN in (A) Controls, (B) PBC patients or (C) PSC patients. Data presented as a plot of age in years vs. cfDNA values expressed as genomic equivalents per ml (Geq/ml), with linear regression line and 95% confidence interval shown. Correlation was evaluated using the Pearson correlation coefficient (r).

cfDNA values expressed as genomic equivalents per ml (Geq/ml). P-values determined using the two-tailed Mann-Whitney test, exact p-values shown.

Figure 4