Evaluation of Low Coverage Whole Genome Sequencing as a New Method for Detecting Malignant Ovarian Mass

This study evaluated whether low coverage whole genome sequencing (LCWGS) is suitable for detecting malignant pelvic masses and compared its diagnostic value with that of traditional tumor markers. We enrolled 63 patients with a pelvic mass suspicious for ovarian malignancy. Each patient underwent LCWGS and traditional tumor marker testing. The pelvic masses were finally confirmed by pathological examination. Copy number variants (CNVs) across the whole genome were detected, and the Stouffer's Z-score for each CNV was extracted. The risk of malignancy (RM) of each suspicious sample was calculated from the CNV counts and Z-scores and was then compared with the ovarian cancer markers CA125 and HE4 and with the risk of ovarian malignancy algorithm (ROMA). Receiver operating characteristic (ROC) curves were used to assess the diagnostic value of each variable. Pathological diagnosis identified 44 (70%) patients with malignancy and 19 (30%) patients with benign masses. CA125, HE4, the CNV count, the mean of Z-scores (Zmean), the maximum of Z-scores (Zmax), RM and ROMA all differed significantly between patients with malignant and benign masses. The areas under the curve (AUC) of CA125, HE4, CNV, Zmax and Zmean were 0.775, 0.866, 0.786, 0.685 and 0.725, respectively. ROMA and RM showed similar AUCs (0.876 and 0.837) but differed in sensitivity and specificity. In conclusion, we developed an LCWGS-based method for evaluating pelvic masses suspicious for ovarian cancer. LCWGS gave accurate results and could complement existing diagnostic methods.


Introduction
According to the 2018 global cancer statistics report, ovarian tumors accounted for 3.4% of all tumors among women in China, and deaths from malignant ovarian tumors accounted for 4.4% of all cancer deaths among women [1].
Ovarian cancer has the second highest incidence and mortality among tumors of the female reproductive system, after cervical cancer [1,2]. Because the ovary is small and lies within the pelvic cavity, ovarian tumors lack typical symptoms at an early stage [3]. Patients often find the tumor only after a large pelvic mass or vaginal bleeding has developed [4,5]. By then the tumor has usually reached a late stage, has often spread to other pelvic organs, and the best time for treatment has been missed [6]. Therefore, early detection of ovarian tumors is critical for the clinical management and prognosis of patients. Multiple efforts have been made to evaluate traditional markers, including serum concentrations of CA125 and HE4, for ovarian cancer screening [7]. However, the diagnostic sensitivity and/or specificity of these markers did not meet the standards required to advocate population-based screening [8,9]. Additional cancer-specific diagnostic methods may therefore be required to improve the accuracy of ovarian cancer diagnosis.
In recent years, rapid developments in the field of next generation sequencing (NGS) and its application to low coverage whole genome sequencing (LCWGS) have made it feasible to detect tumor-specific copy number alterations (CNA) in cell-free DNA [10,11]. Evidence has shown that tumor-derived chromosome abnormalities can be detected in the plasma of patients prior to surgery [10,12].
Previous studies have reported that occult pelvic cancers can be detected by LCWGS testing, although it may produce false positive results [13]. However, the diagnostic accuracy of an LCWGS platform and analytic pipeline for ovarian cancer remains unknown. The aim of this study was to investigate whether a clinical LCWGS platform could detect ovarian cancers in patients with pelvic masses based on abnormal plasma DNA copy number variants (CNVs), and to compare its diagnostic accuracy with that of traditional screening markers, including CA125, HE4 and the risk of ovarian malignancy algorithm (ROMA) score [14].

Methods
1. Subjects and samples
Sixty-three patients with a pelvic mass suspicious for ovarian malignancy who were referred to the gynecology department of the First Affiliated Hospital of Sun Yat-sen University from January 2018 to July 2019 were recruited. In addition, a cohort of 39 healthy female individuals was recruited. Blood samples were collected in EDTA-anticoagulated tubes and sent to the laboratory within 2 hours. The study was approved by the ethical committee of the First Affiliated Hospital of Sun Yat-sen University (S/55904). All participants gave written informed consent.

2. Sample processing and LCWGS
The blood samples were first centrifuged at 1600 g for ten minutes at 4°C, and the supernatant was then centrifuged at 16000 g for a further ten minutes at 4°C. The plasma was stored at −80°C until analysis.
Isolation, purification, library construction and sequencing of cell-free DNA from the blood were performed using a Fetal Aneuploidies Trisomy Detection Kit (Daan Gene Corp, China) on an Ion Proton next-generation sequencer (Life Technologies), which was certified by the China Food and Drug Administration. All procedures were performed according to the manufacturer's protocol.

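4. Analysis of malignant risk
For further analysis of the risk of malignancy, data from the 39 healthy females were used to form a baseline. First, we calculated the mean CNV count and the mean |Z-score| (for |Z| ≥ 3) in the healthy cohort. The risk of malignancy (RM) of each suspicious sample was then calculated as

$$\mathrm{RM} = \left(N_{\mathrm{CNV}}^{\mathrm{suspicious}} - \overline{N}_{\mathrm{CNV}}^{\mathrm{healthy}}\right) \times \left(|Z|^{\mathrm{suspicious}} - \overline{|Z|}^{\mathrm{healthy}}\right)$$

where $N_{\mathrm{CNV}}$ is the CNV count, $|Z|$ is the |Z-score|, and the overlined terms are the means in the healthy cohort.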
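The RM calculation described above can be sketched in Python as follows. This is only an illustration under stated assumptions: each sample is represented as a list of per-CNV Z-scores, only CNVs with |Z| ≥ 3 are counted, and the |Z-score| term of a sample is taken to be the mean |Z| over those CNVs. The function names and the toy numbers are hypothetical and are not the study's pipeline or data.

```python
from statistics import mean

Z_THRESHOLD = 3.0  # |Z| cut-off for keeping a CNV, as stated in the text


def summarise(z_scores):
    """Return (CNV count, mean |Z|) for one sample, keeping only |Z| >= 3."""
    called = [abs(z) for z in z_scores if abs(z) >= Z_THRESHOLD]
    # Assumption: a sample with no called CNV contributes a mean |Z| of 0.
    return len(called), (mean(called) if called else 0.0)


def healthy_baseline(healthy_samples):
    """Mean CNV count and mean of the per-sample mean |Z| in the healthy cohort."""
    summaries = [summarise(s) for s in healthy_samples]
    return mean(c for c, _ in summaries), mean(z for _, z in summaries)


def risk_of_malignancy(z_scores, baseline):
    """RM = (count - healthy mean count) * (|Z| - healthy mean |Z|)."""
    count, z_abs = summarise(z_scores)
    base_count, base_z = baseline
    return (count - base_count) * (z_abs - base_z)


# Hypothetical toy data: three healthy samples and one suspicious sample.
healthy = [[3.2, -1.0], [0.5, 2.0], [-3.4, 3.0, 1.1]]
baseline = healthy_baseline(healthy)
suspicious = [6.0, -8.0, 4.0, 1.5, 10.0]
rm = risk_of_malignancy(suspicious, baseline)
```

Under this literal reading of the formula, a sample that falls below the healthy baseline on both terms would also give a positive product. The text does not say how such cases were handled.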

5. Tumor marker detection and ROMA scores
HE4 and CA125 were tested in stored plasma using the ARCHITECT HE4 and CA125 assays (Abbott Diagnostics, Abbott Park, IL, USA) according to the manufacturer's instructions.
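The section heading names the ROMA score, but the text does not give its formula. As a hedged sketch, ROMA as published by Moore et al. [14] is commonly cited as a logistic model with separate coefficients for pre- and postmenopausal women. The coefficients below are recalled from that literature rather than taken from this study, and the input units (HE4 in pmol/L, CA125 in U/mL), the function name and the status labels are assumptions.

```python
import math

# Coefficients commonly cited for ROMA (Moore et al. [14]); they are not stated
# in this paper and should be verified before any real use.
COEFFS = {
    "pre": (-12.0, 2.38, 0.0626),  # premenopausal: intercept, ln(HE4), ln(CA125)
    "post": (-8.09, 1.04, 0.732),  # postmenopausal: intercept, ln(HE4), ln(CA125)
}


def roma(he4_pmol_l, ca125_u_ml, menopausal_status):
    """Predicted probability of malignancy (0 to 1) from HE4, CA125 and status."""
    intercept, b_he4, b_ca125 = COEFFS[menopausal_status]
    predictive_index = (
        intercept + b_he4 * math.log(he4_pmol_l) + b_ca125 * math.log(ca125_u_ml)
    )
    # Logistic transform of the predictive index gives the predicted probability.
    return math.exp(predictive_index) / (1.0 + math.exp(predictive_index))
```

The function returns a fraction between 0 and 1, which matches the scale on which ROMA is reported in this paper; multiplying by 100 gives the percentage form.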

6. Pathology diagnosis of pelvic mass
All diagnoses were confirmed by pathological examination performed by pathologists who were blinded to the results of clinical laboratory testing. Tumor staging was performed according to the International Federation of Gynecology and Obstetrics (FIGO) criteria (2010).

7. Statistical analysis
Statistical analysis was carried out using an online statistics tool (http://dxonline.deepwise.com/) and R software (version 4.0.1) with the pROC and Rattle packages (5-7). Receiver operating characteristic (ROC) curves were used to evaluate diagnostic value. A two-tailed P value of less than 0.05 was considered statistically significant.
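For readers unfamiliar with the AUC, the following Python sketch shows the quantity that an ROC analysis reports. It is not the pROC implementation used in this study. It computes the AUC as the probability that a randomly chosen malignant case scores higher than a randomly chosen benign case, with ties counted as one half. The function name and the toy scores are hypothetical.

```python
def auc(scores_pos, scores_neg):
    """AUC as P(positive score > negative score), counting ties as 0.5."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical index values for malignant and benign cases.
malignant = [5.1, 3.3, 8.0, 2.2]
benign = [1.0, 2.2, 0.4]
value = auc(malignant, benign)
```

An AUC of 0.5 corresponds to a marker that does not separate the two groups, and 1.0 to perfect separation.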

Results
1. Clinical and pathology data of subjects
This study included 63 patients with a pelvic mass suspicious for ovarian malignancy, who were finally identified by pathological diagnosis as having high grade malignancy (34, 54%), low grade malignancy (10, 16%) or a benign mass (19, 30%). The median age of premenopausal patients was 35 years (range, 16-53 years), and the median age of postmenopausal patients was 62 years (range, 46-83 years). The median age of patients with malignancies was 51 years (range, 21-70 years) and that of patients with benign diseases was 30 years (range, 18-52 years). The age distribution differed significantly between these 2 groups (P < 0.01). Among the patients with ovarian cancer, the FIGO stage was I in 13 (30%), II in 6 (14%), III in 18 (41%) and IV in 7 (16%). The clinical and pathological data of the subjects are listed in Table 1.

2. LCWGS on CNVs
LCWGS used a whole genome low coverage strategy to analyze CNVs. For each sample, more than 5 million reads were obtained (5.9 ± 0.68 million across all samples), giving a coverage of about 0.35× per sample. Representative LCWGS figures for ovarian cancer and benign disease are shown in Fig. 1. The results from a patient with FIGO stage III serous cystadenocarcinoma showed multiple regions of CNV (Fig. 1A), whereas the results from a patient with teratoma showed no CNV (Fig. 1B). In this study, only 7 patients with malignancy showed trisomy or monosomy on LCWGS. To further investigate the diagnostic performance of LCWGS, the CNV count, the maximum Z-score (Zmax) of all CNVs, the mean Z-score (Zmean) and RM were calculated for each sample. The LCWGS-based indexes differed significantly between patients with malignant and benign tumors, with higher values in patients with malignancy. In addition, these indexes were closely related to FIGO stage (Figure 2 and Table 2).
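The read count and coverage reported at the start of this subsection are linked by the usual relation coverage = reads × mean read length / genome size. The Python sketch below works through that arithmetic. The genome size of 3.1 Gb and the function names are assumptions, and the mean read length is not reported in the text, so it is back-calculated here rather than taken from the study.

```python
GENOME_SIZE_BP = 3.1e9  # assumed size of the human genome in base pairs


def mean_coverage(n_reads, mean_read_length_bp, genome_size_bp=GENOME_SIZE_BP):
    """Average sequencing depth: total sequenced bases divided by genome size."""
    return n_reads * mean_read_length_bp / genome_size_bp


def implied_read_length(n_reads, coverage, genome_size_bp=GENOME_SIZE_BP):
    """Mean read length (bp) that would give the stated coverage."""
    return coverage * genome_size_bp / n_reads


# Reported values: about 5.9 million reads per sample at about 0.35x coverage.
read_len = implied_read_length(5.9e6, 0.35)
```

Under these assumptions the implied mean read length is about 184 bp.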

3. Traditional tumor markers
Across all subjects, the serum concentration of CA125 was 416.457 ± 747.887 (mean ± SD), that of HE4 was 219.192 ± 457.614, and the ROMA score was 0.534 ± 0.422. The concentrations of CA125 (560.282 ± 854.994 vs 83.387 ± 112.353) and HE4 (286.382 ± 534.32 vs 63.595 ± 51.849) differed significantly between patients with malignant and benign diseases. We also compared the serum concentrations of CA125 and HE4 across FIGO stages, and the levels of both markers were correlated with FIGO stage (Figure 2 and Table 2).

4. Correlation between traditional tumor markers and LCWGS index
Spearman correlation was used to investigate the relationship between the tumor markers and the LCWGS indexes. As shown in Fig. 3 and Table 3, all indexes were significantly correlated (P < 0.01). However, the correlation between the traditional tumor markers and the LCWGS indexes was weak (r values ranged from 0.38 to 0.77).
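As a brief illustration of the Spearman correlation used above, the Python sketch below ranks both variables (tied values share their average rank) and computes the Pearson correlation of the ranks. The function names and the toy values are hypothetical and are not the study's data.

```python
def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical marker values and CNV counts for five patients.
ca125 = [35.0, 120.0, 560.0, 80.0, 900.0]
cnv_count = [0, 2, 9, 3, 7]
rho = spearman(ca125, cnv_count)
```

Because only ranks enter the calculation, the coefficient is insensitive to the strongly skewed distributions typical of CA125 and HE4.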

5. Comparison of the diagnostic value of LCWGS and traditional tumor markers
First, we evaluated the diagnostic value of each single index in the research subjects. The AUCs of CA125 and HE4 were 0.775 and 0.866, respectively. HE4 showed better diagnostic accuracy than the other single markers. The integrated indexes were then evaluated. The AUCs of ROMA and RM were 0.876 and 0.837, respectively. ROMA had a higher AUC than every single index, and RM had a higher AUC than every single index except HE4. However, no significant difference was found between ROMA and RM (DeLong test: P = 0.476), which indicated that ROMA and RM had similar diagnostic value for distinguishing ovarian cancers from benign diseases. With a cutoff of 0.085, the sensitivity and specificity of ROMA were 0.684 and 0.909, respectively. With a cutoff of 1.25, the sensitivity and specificity of RM were 0.895 and 0.773, respectively (Figure 4 and Table 4).
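The sensitivity and specificity values above depend on the chosen cutoff. The Python sketch below shows how both are obtained for a given cutoff and how a cutoff could be selected. The text does not state how the cutoffs of 0.085 and 1.25 were chosen, so the use of the Youden index (sensitivity + specificity − 1) here is an assumption, as are the function names and the toy data.

```python
def sensitivity_specificity(scores, is_malignant, cutoff):
    """Call a case positive when score >= cutoff; return (sensitivity, specificity)."""
    pairs = list(zip(scores, is_malignant))
    tp = sum(1 for s, m in pairs if m and s >= cutoff)
    fn = sum(1 for s, m in pairs if m and s < cutoff)
    tn = sum(1 for s, m in pairs if not m and s < cutoff)
    fp = sum(1 for s, m in pairs if not m and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def best_cutoff_youden(scores, is_malignant):
    """Return (J, cutoff, sensitivity, specificity) for the first cutoff maximising J."""
    best = None
    for cutoff in sorted(set(scores)):
        sens, spec = sensitivity_specificity(scores, is_malignant, cutoff)
        j = sens + spec - 1.0
        if best is None or j > best[0]:
            best = (j, cutoff, sens, spec)
    return best


# Hypothetical index values with pathology labels (True = malignant).
scores = [0.2, 0.9, 1.3, 2.5, 4.0, 0.1, 1.4, 0.5]
labels = [False, False, True, True, True, False, False, True]
sens, spec = sensitivity_specificity(scores, labels, 1.25)
best = best_cutoff_youden(scores, labels)
```

Lowering the cutoff raises sensitivity at the expense of specificity, which is the trade-off seen between RM and ROMA in this study.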

Discussion
Ovarian cancer has the second highest incidence and mortality among tumors of the female reproductive system, after cervical cancer, and its early clinical presentation, such as bloating or abdominal pain, is difficult to differentiate from digestive tract diseases [15,16]. When ovarian cancer develops and spreads to the abdominal cavity, an abdominal mass may appear [17]. Therefore, distinguishing between benign and malignant abdominal masses is very important for the early diagnosis of ovarian cancer.
Oncogenesis involves many types of genomic variation, such as point mutation, copy number variation and gene fusion [18]. Tumors differ from genetic diseases in that their genomic variation is frequently acquired [19]. The development of ovarian cancer is a complex process involving changes in DNA, RNA and proteins [20,21]. Abnormal DNA can be released from cancer tissues and detected in blood samples in the form of cell-free DNA [22]. Therefore, the detection of CNVs would be a promising method for the identification of malignant abdominal masses.
In this study, we evaluated whether CNVs detected by an LCWGS platform could accurately predict the presence of malignancy. In our study cohort, the number of patients with malignant disease (44 cases) was higher than the number with benign disease (19 cases). Our results showed that chromosome variation could be detected in the cell-free DNA of patients with malignancy. However, only a few patients with a malignant mass showed trisomy or monosomy. Although chromosome instability is common in tumor cells, the low concentration of tumor-derived cell-free DNA means that detection of trisomy or monosomy may lack sensitivity for clinical diagnosis [23]. We therefore set our detection target to CNVs at a resolution of 5 Mb.
With this strategy, more chromosome instabilities could be found in the subjects; however, specificity might be reduced. To address this problem, we extracted more indexes from the LCWGS results and used a healthy cohort to calibrate them. Our results indicate that the LCWGS-based indexes differed significantly between patients with malignant and benign diseases and were closely related to FIGO stage, which would be valuable in the diagnosis of malignant masses. The diagnostic value of the LCWGS-based indexes was evaluated by ROC curves. Although CNV, Zmax and Zmean were useful for the diagnosis of malignant masses, their AUCs were below 0.80. The integrated RM index, which is calculated from CNV and Zmean and calibrated against a healthy cohort, showed better diagnostic performance, with an AUC of 0.837. With a cut-off value of 1.25, RM was highly sensitive in detecting malignant masses of all stages.
CA125 and HE4 are the most widely used markers in ovarian cancer diagnosis [24]. In our study, CA125 and HE4 differed significantly between malignant masses and benign disease, which is consistent with previous reports. In 2009, Moore proposed ROMA as a new algorithm that combines HE4 and CA125 levels with menopausal status, which was defined as 6 months of menopause without menstruation or clinical symptoms. The ROMA score corresponds to the predicted probability (PP), expressed as a percentage [14]. The sensitivity of ROMA for ovarian cancer diagnosis varies from 75% to 97%; however, the detection of early stage malignancy remains a problem [25-27]. We compared the diagnostic value of RM and ROMA: although ROMA showed a higher AUC than RM, the difference was not statistically significant. The sensitivity of RM (0.895) was superior to that of ROMA (0.684), while the specificity of RM (0.773) was inferior to that of ROMA (0.909). CA125 and HE4 were correlated with the LCWGS-based indexes, but the correlation was weak. Therefore, RM and ROMA could only be used as complementary tools in the diagnosis of malignant pelvic masses.
The low specificity of RM may originate from the bioinformatics pipeline used in LCWGS, in which all CNVs across the whole genome were used for further analysis. Ovarian cancers show specific chromosomal gains or losses in tissue, as demonstrated by other studies; however, there are no widely accepted ovarian cancer-specific CNVs in cell-free DNA [28]. Further studies should focus on ovarian cancer-specific CNVs to improve diagnostic specificity. In addition, increasing the sequencing depth would help to increase diagnostic value. Further studies could try to determine the sequencing depth that best balances cost and benefit.
A limitation of this study is the small number of patients. A larger sample size is needed to validate our findings and to conduct further studies across the FIGO stages of ovarian cancer and in pre- and postmenopausal patients.
In conclusion, our study provided a new methodology with high accuracy for the diagnosis of ovarian cancers, which could be a supplement to the existing diagnostic methods.

Declarations
Ethics approval and consent to participate
The Research Ethics Committee of the Sun Yat-sen University approved the study.

Consent for publication
All authors agreed to publication.

Availability of data and material
The data and materials from this study are available.
Due to technical limitations, Tables 1, 2, 3 and 4 are only available as a download in the Supplemental Files section.

ROC of ROMA and RM. The ROC analysis included age, CA125, HE4, CNV, Zmax, Zmean, ROMA and RM. There was no significant difference in AUC between ROMA and RM (DeLong test: P = 0.476).