Salivary Desulfovibrio Desulfuricans Level and Oral Hygiene Index for the Prediction of Colorectal Cancer in Chinese Patients

Background: Mounting evidence has shown that the fecal microbiome can act as a biomarker for diagnosing colorectal cancer (CRC). Recent studies demonstrate that the oral microbiome is concordant with the gut microbiome. The role of the oral microbiome in colorectal cancer has not been fully elucidated. Methods: We collected preoperative saliva from a final cohort of 237 patients who underwent surgical resection or colorectal endoscopy in XX Hospital from January 2018 to January 2020. Clinical demographics, comorbidities and oral conditions were obtained from medical records or questionnaires. Salivary microbial biomarkers were detected by quantitative polymerase chain reaction (PCR) after DNA extraction. Multivariate logistic regression analysis was employed to analyze the risk factors for colorectal cancer. A four-variable prediction model was constructed based on the logistic analysis. Results: Among the 237 patients enrolled, there were 95 endoscopy-confirmed healthy controls and 142 patients with pathologically confirmed colorectal adenocarcinoma. Logistic regression analysis demonstrated that the risk factors associated with CRC included age at diagnosis (OR=1.111, 95%CI=1.072-1.151), male sex (OR=2.111, 95%CI=1.068-4.175), oral hygiene index (OR=1.769, 95%CI=1.116-2.804) and relative salivary Desulfovibrio desulfuricans (Dd) abundance (OR=1.156, 95%CI=1.05-1.272), based on which a four-variable model was developed. The four-variable model had good discriminative (Brier score=0.144, concordance index=0.866) and calibration (calibration slope=0.834) abilities after bias correction. Conclusions: Elevated salivary Dd level is an independent risk factor for CRC. We have developed a four-variable model that could help identify patients at risk for CRC.


Background
Colorectal cancer (CRC) is a major public health problem with high incidence and mortality rates 1 . In China, there are approximately 376,300 newly diagnosed CRC cases and 191,000 CRC-related deaths annually 2 . Early identification of CRC could significantly reduce cancer incidence and mortality rates 3 . Classical CRC screening methods, including serum carcinoembryonic antigen (CEA), colonoscopy and the fecal occult blood test (FOBT), are widely used in clinical practice. These methods, however, either lack sensitivity or are invasive and have low patient compliance 4,5 . Thus, a novel non-invasive biomarker with higher sensitivity and patient compliance is warranted.
Typically, colorectal tumorigenesis from precancerous adenoma to colorectal cancer involves several stepwise genetic events 6 . Numerous studies have reported that microbiota dysbiosis plays a pivotal role in the initiation and development of CRC and that the fecal microbiome can act as a biomarker for diagnosing CRC 7 .
Given the concordance and relevance of oral and gut microbiome 8 , many researchers started to explore the oral microbiome structure and function of CRC 9 . Nevertheless, these studies lack investigations on clinical parameters and oral health status, which might impact the composition of oral microbiota. In addition, qPCR technique-based detection of certain oral bacteria has not been reported.
Poor oral hygiene has been demonstrated to be associated with a higher risk of oral cancer and head and neck cancer 10,11 . Moreover, a study conducted in Austria reported that patients with head and neck squamous cell carcinoma had a great need for dental care, which was underestimated by most patients 12 . Oral hygiene indicators comprise tooth brushing frequency, number of lost teeth, regularity of dental visits and presence of dental caries.
By measuring salivary microbial biomarkers that have previously been identified for their possible role in carcinogenesis and by assessing oral conditions, the current study aimed to establish a model based on clinical characteristics, oral hygiene indicators and the salivary microbiome for predicting the risk of colorectal cancer. We hypothesized that combining the oral hygiene index with biomarker abundance could maximize the prediction accuracy of the model in diagnosing CRC.

Patient Selection and Sample Collection
Consecutive patients who underwent colonoscopy or surgical resection were enrolled at Renji Hospital, affiliated with Shanghai Jiao Tong University, from Jan. 2018 to Jan. 2020. Patients with no neoplasms confirmed by colonoscopy or with pathologically diagnosed colorectal adenocarcinoma were assigned to the healthy control (HC) or CRC group, respectively. Strict exclusion criteria were applied to the subjects recruited, as follows: 1) history of gastrointestinal neoplasia; 2) history of upper gastrointestinal tract surgery; 3) specific types of CRC, including Lynch syndrome, familial adenomatous polyposis (FAP) and Peutz-Jeghers syndrome; 4) prior chemotherapy or radiotherapy; 5) use of any of the following drugs in the month before enrollment: NSAIDs (nonsteroidal anti-inflammatory drugs), immuno-inhibitors, antibiotics or probiotics. The workflow chart is shown in Figure 1. A total of 237 patients were included in the final statistical analysis. Preoperative unstimulated saliva was collected in a sterile container and transferred to a -80℃ freezer within 30 minutes, where it was stored until use.
The procedures were all performed by specialized doctors who were blinded to the research content. The current study received approval from the Renji Hospital ethics committee, and written informed consent was obtained from all subjects enrolled.

Data Collection
Patient demographics including age, sex, body weight, height, tobacco and alcohol use as well as comorbidities were either retrieved from the hospital medical system or obtained by face-to-face questionnaires. Body mass index (BMI) was calculated as weight (kg) divided by height (m) squared. Patients' self-reported oral hygiene information was obtained from questionnaires recorded by licensed doctors, an approach that has been proven to be reliable 13,14 . Due to some restrictions on dental assessment, the oral hygiene index (OHI) was constructed from three variables with minor modifications 15 . The three variables were scored categorically: number of teeth lost ≥ 5 scored 1; tooth brushing frequency < twice/day scored 1; irregular dental visits scored 1. Otherwise, the variable was scored 0. The sum of the three variables equaled the oral hygiene index, which ranged from 0 to 3. The presence of dental caries was defined as a separate categorical variable. Tobacco or alcohol use was defined as consumption during the past six months; those with no use or only past consumption were defined as non-users. Comorbidities of diabetes or hypertension were also recorded as categorical variables.
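The BMI and OHI definitions above can be written out as a short sketch. The function and parameter names below are hypothetical and chosen for illustration only; the scoring thresholds are the ones stated in the text.

```python
def body_mass_index(weight_kg: float, height_m: float) -> float:
    """BMI = weight (kg) divided by height (m) squared."""
    return weight_kg / (height_m ** 2)


def oral_hygiene_index(teeth_lost: int,
                       brushings_per_day: float,
                       regular_dental_visits: bool) -> int:
    """Sum of three binary items, giving a score from 0 to 3.

    Each item scores 1 when it indicates poorer oral hygiene:
      - number of teeth lost >= 5
      - tooth brushing frequency < twice/day
      - irregular dental visits
    """
    score = 0
    if teeth_lost >= 5:
        score += 1
    if brushings_per_day < 2:
        score += 1
    if not regular_dental_visits:
        score += 1
    return score


# Example: 6 teeth lost, brushing once a day, regular dental visits.
example_ohi = oral_hygiene_index(teeth_lost=6, brushings_per_day=1,
                                 regular_dental_visits=True)
```

A higher score therefore corresponds to poorer self-reported oral hygiene habits.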

Genomic DNA Extraction and Quantitative PCR
After thawing, saliva genomic DNA was extracted using the QIAamp DNA mini kit (Qiagen, Germany) according to the manufacturer's instructions. After extraction, the purity and concentration of the DNA were measured with a Nanodrop 2000 spectrophotometer (Thermo Fisher Scientific, USA), and the DNA was stored at -20℃ until use. Primers used for detecting 16S rRNA, Fn and Pg were taken from previous articles 16,17 , while the primers for Dd and Pm were designed with Primer-BLAST in NCBI, targeting the flavodoxin gene and hemolysin gene, respectively 18,19 . Primers used in this study are shown in the supplementary materials. Quantitative PCR amplifications were performed in a 10 µl reaction system containing SYBR Green premix master mix in an ABI thermocycler (Thermo Fisher). qPCR reactions were performed in duplicate for each target, with 16S rRNA as the internal control, which has been shown to indicate total bacterial DNA load 20 . qPCR conditions were as follows: a short pre-denaturation at 95℃ for 30 s, followed by 40 cycles of denaturation at 95℃ for 5 s and annealing and extension at 60℃ for 30 s.

Microbial Biomarker Selection
In an attempt to search for salivary microbial biomarkers, the current study selected four pathogens as detection targets in the saliva, namely Fusobacterium nucleatum (Fn), Porphyromonas gingivalis (Pg), Desulfovibrio desulfuricans (Dd) and Prevotella melaninogenica (Pm). Fn is among the most widely investigated pathogens associated with CRC, although not in the oral cavity. Fn was first identified as enriched in colorectal carcinoma tissues through whole genome sequencing, followed by quantitative PCR and 16S rRNA sequencing analysis 21 . Fecal Fn has been widely reported as a diagnostic biomarker for CRC, and a high abundance of mucosal Fn was associated with chemotherapy resistance 7,22 . Mechanistically, Fn could promote carcinogenesis through E-cadherin/β-catenin signaling via FadA adhesin 23 . Pg is a Gram-negative anaerobe associated with chronic periodontitis, which can eventually lead to tooth loss 24 . Pg is an important player in the development of various diseases, including Alzheimer's disease 25 . The role of salivary Pg in colorectal cancer, however, remains unexplored. Dd is a Gram-negative anaerobe belonging to the sulfate-reducing bacteria (SRB). Functionally, SRB can degrade the organic matter entering the gastrointestinal tract and utilize a wide range of substrates to reduce sulfate to hydrogen sulfide (H2S) 26 . Bacteria-derived H2S is toxic to the colon epithelium, causing DNA damage to colon epithelial cells and promoting colon cancer cell proliferation 27,28 . In vitro experiments showed that Dd endotoxin could transcriptionally activate the NF-κB and IκBα genes in Caco2 colon cancer cells 29 . Recently, a study conducted in the United States revealed that sulfur-metabolizing bacteria in the stool, including Dd, were associated with the risk of distal colorectal cancer in men 30 . All of the above prompted us to investigate the role of salivary Dd in CRC.
As for Pm, a prospective study demonstrated that Pm in pre-diagnostic mouth rinse samples was associated with a decreased risk of CRC 31 . Pm was enriched in the tumoral microhabitat in a cohort of 276 gastric cancer patients 32 , suggesting a role in gastric carcinogenesis.

Data Processing and Statistical Analysis
For undetermined qPCR readouts, Ct values were replaced with a maximum value of 40. The relative abundance of each target salivary microbial biomarker was expressed as the target Ct value minus the 16S rRNA Ct value. The Kolmogorov-Smirnov normality test was first applied to assess the distribution of continuous data. Values were expressed as mean ± standard deviation (SD), median with interquartile range (IQR), or count with percentage, as appropriate. Univariable comparisons of clinical characteristics and oral conditions between CRC and HC patients were made using the independent t test (normal distribution) or the Wilcoxon-Mann-Whitney U nonparametric test (non-normal distribution). For categorical variables, the chi-square test was used. Factors independently associated with CRC were selected using multivariate logistic analysis with forward selection. All statistical analyses were performed with SPSS version 25 (IBM). Missing data on the salivary microbiome occurred because some patients provided a limited volume of saliva. Data on BMI (0.4%), dental caries (2.5%) and salivary microbiome level (2.5% of data for Fn, 2.1% for Pg, 0.4% for Dd and 2.1% for Pm) were missing at random and were compared separately, but were multiply imputed as mean or median when analyzed in multivariate logistic analysis according to Rubin's rule 33 . A P value less than 0.05 was considered significant. The nomogram was constructed by R Package based on the odds ratios (OR) calculated from multivariate logistic analysis.
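As a minimal sketch of the Ct handling described above, the snippet below caps undetermined readouts at 40 and computes the relative abundance as the target Ct minus the 16S rRNA Ct. The function names are hypothetical, and representing an undetermined readout as `None` or NaN is an assumption about how the instrument output is encoded.

```python
import math
from typing import Optional

CT_MAX = 40.0  # value substituted for undetermined qPCR readouts


def fill_undetermined(ct: Optional[float]) -> float:
    """Return the Ct value, or CT_MAX when the readout is undetermined.

    Assumption: undetermined readouts arrive as None or NaN.
    """
    if ct is None or math.isnan(ct):
        return CT_MAX
    return ct


def relative_abundance(target_ct: Optional[float],
                       reference_ct: Optional[float]) -> float:
    """Target Ct minus 16S rRNA (reference) Ct, after capping."""
    return fill_undetermined(target_ct) - fill_undetermined(reference_ct)


# Example: a detected target and an undetermined target, same 16S rRNA Ct.
detected = relative_abundance(30.5, 18.0)
undetermined = relative_abundance(None, 18.0)
```

The sketch works on a single Ct value per target; how the duplicate reactions were combined is not stated in the text and is left out here.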

Patient Characteristics and Oral Hygiene Indicators
A total of 237 patients met our criteria and were included in the final cohort, with 95 healthy controls and 142 CRC patients (Fig. 1). The patients' overall characteristics, oral hygiene indicators and microbial biomarker levels are described as follows. Average age at diagnosis was 59.49 ± 12.48 years and sex was almost evenly distributed (men 53.16%). The rate of comorbid hypertension in the cohort was 32.07%, while that of type 2 diabetes was 10.13%. There was no significant difference between the two groups in BMI or drinking status (P > 0.05). However, there were significant differences in age, sex, OHI, the presence of dental caries, comorbidities and smoking status (P < 0.05). Average age at diagnosis was 51.16 ± 10.75 years for the HC group, while the CRC group was significantly older, with an average age of 65.07 ± 10.27 years. The HC group had a lower proportion of men (44.21% vs 59.15%, P = 0.024). The median OHI was 2 in the CRC group, higher than the median of 1 in the HC group, indicating worse oral hygiene habits in the CRC group. A greater proportion of CRC patients had dental caries, comorbid type 2 diabetes or hypertension, and were smokers. Univariate analysis between HC and CRC is summarized in Table 1.

Biomarker Comparison
As for salivary microbial biomarkers, the relative abundance of salivary Fn and Pm showed no significant difference between the two groups. The relative abundance of salivary Pg and Dd in the CRC group was higher than that in the control group. Scatter plots comparing the relative abundance of salivary Pg and Dd between the HC and CRC groups are shown in Fig. 2 (Variance Inflation Factor, VIF ranged from 1 to 2). Moreover, we developed a nomogram based on the ORs calculated from logistic regression (Fig. 3). Nomograms graphically present a complex mathematical algorithm that incorporates biological and clinical variables 34 . The uppermost line is the reference scale for scoring points from 0 to 100. The predictive variables are displayed directly below it, with regularly segmented bars that show the relative weight of each variable and allow points to be assigned to each variable's value. Total points for the nomogram and the corresponding predicted probability of CRC are shown in the bottom two bars.
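To illustrate how the reported ORs translate into the weights that a nomogram displays, the following sketch converts each OR into a logistic regression coefficient (its natural logarithm) and computes the multiplicative change in the odds of CRC for a given change in the predictors. The variable and function names are hypothetical. The model intercept is not reported in the text, so the sketch gives only relative changes in odds and not absolute predicted probabilities.

```python
import math

# Odds ratios reported for the four-variable model (per one-unit increase).
ODDS_RATIOS = {
    "age_years": 1.111,
    "male_sex": 2.111,
    "ohi": 1.769,
    "dd_relative_abundance": 1.156,
}

# A logistic regression coefficient is the natural log of its odds ratio.
COEFFICIENTS = {name: math.log(value) for name, value in ODDS_RATIOS.items()}


def log_odds_shift(**changes: float) -> float:
    """Change in the log-odds of CRC for the given predictor changes."""
    return sum(COEFFICIENTS[name] * delta for name, delta in changes.items())


def odds_multiplier(**changes: float) -> float:
    """Factor by which the odds of CRC are multiplied."""
    return math.exp(log_odds_shift(**changes))


# Example: ten additional years of age, everything else held fixed.
ten_years_older = odds_multiplier(age_years=10)
```

Because the coefficients add on the log-odds scale, the per-unit ORs multiply: a male patient whose OHI is one point higher has odds multiplied by 2.111 × 1.769 relative to an otherwise identical female patient.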

Evaluation of the Model
The predictive accuracy of the four-variable model was evaluated using the Brier score, the Hosmer-Lemeshow goodness-of-fit test, the calibration curve and the concordance index (C-index). The Brier score was used to assess the difference between observed and predicted values, with values closer to 0 indicating better predictive ability. The Hosmer-Lemeshow goodness-of-fit test is an overall assessment of the difference between predicted and actual values, with a P value greater than 0.05 indicating no significant difference and good calibration ability. The calibration slope reflects the agreement between observed and predicted values, with values closer to 1 indicating better performance. The C-index was used to assess the discriminatory ability of the nomogram, with values between 0.5 and 1 35 . Bootstrapping with 237 repetitions was used for internal validation, and the bias-corrected accuracy measures of our predictive model were obtained. The Brier score was 0.144, which indicated good predictive ability of the model. The Hosmer-Lemeshow goodness-of-fit test P value was 0.054, indicating no statistically significant difference between the values predicted by the model and the actual values. The calibration slope and C-index were rather high, with values of 0.834 and 0.866, respectively. In addition, the calibration curve and receiver operating characteristic curve are displayed (Fig. 4) to demonstrate graphically the calibration and discrimination abilities of the model, respectively.
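For readers unfamiliar with these measures, the sketch below computes the Brier score and the C-index from binary outcomes and predicted probabilities. It is an illustration on made-up toy data, not the study's data or the R code used by the authors, and it gives apparent values without the bootstrap bias correction described above. The function names are hypothetical.

```python
from typing import Sequence


def brier_score(outcomes: Sequence[int], probs: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(outcomes, probs)) / len(outcomes)


def c_index(outcomes: Sequence[int], probs: Sequence[float]) -> float:
    """Fraction of case/control pairs in which the case has the higher
    predicted probability; tied predictions count as half."""
    cases = [p for y, p in zip(outcomes, probs) if y == 1]
    controls = [p for y, p in zip(outcomes, probs) if y == 0]
    concordant = 0.0
    for p_case in cases:
        for p_control in controls:
            if p_case > p_control:
                concordant += 1.0
            elif p_case == p_control:
                concordant += 0.5
    return concordant / (len(cases) * len(controls))


# Toy data (hypothetical): 1 = CRC, 0 = healthy control.
toy_outcomes = [0, 0, 1, 1]
toy_probs = [0.1, 0.4, 0.35, 0.8]
toy_brier = brier_score(toy_outcomes, toy_probs)
toy_c = c_index(toy_outcomes, toy_probs)
```

For a binary outcome, the C-index computed this way equals the area under the receiver operating characteristic curve.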

Discussion
The oral cavity harbors various microorganisms, whose diversity is second only to that of the gut microbiome 35 .
The oral microbiome has been associated with human health and diseases, including oral cancer 37 , pancreatic cancer 38 and systemic diseases 39 . One prospective study of the oral microbiome and colorectal cancer risk was conducted in African American populations 31 ; however, the oral microbiome was represented by mouth rinse samples and low-income populations were recruited, so selection bias was possible. Saliva is a fluid secreted by the salivary glands with biological functions 40 , and salivary microbiome diversity exhibits temporal stability 41 , making saliva an optimal research target. In a study conducted by Italian scholars, preliminary comparisons of oral and gut microbiota in colorectal cancer patients were made, and a different taxonomic composition was identified in CRC patients compared with the HC group 9 . However, that study lacked strict inclusion and exclusion criteria, and its sample size was limited.
In addition, it employed 16S rRNA high-throughput sequencing rather than quantitative PCR for oral microbial biomarker detection, which is costly. In the present study, we utilized quantitative PCR to detect the relative abundance of specific pathogens in the saliva.
In searching for candidate salivary biomarkers, Fn, Pg, Dd and Pm were selected as qPCR targets in our cohort based on the previous literature. One recent study reported that the salivary Fn level in the CRC group was higher than in the control group, while the amount of salivary Pg was similar between the two groups 42 . However, the sample size was limited, and the findings need further validation. Oral carriage of Pg has been demonstrated to be associated with an increased risk of pancreatic cancer in a prospective study 38 . The presence of SRB, including Dd, in the periodontal pocket and their co-occurrence with Pg have been observed, and their prevalence was reduced after treatment 43 . Indeed, we also found a positive correlation between salivary Pg and Dd levels in our cohort (Spearman correlation coefficient = 0.269, P < 0.001). As for salivary Pm, it has been reported to be a diagnostic indicator of oral squamous cell carcinoma 44 . The diagnostic role of salivary Pm in CRC remains unexplored.
In our study, both salivary Pg and Dd were associated with the diagnosis of CRC in univariate analysis (P < 0.01). After adjusting for other risk factors, only Dd remained an independent risk factor for CRC (OR = 1.156, 95%CI = 1.05-1.272). However, the discriminative ability of this salivary microbial biomarker in differentiating CRC from HC was limited when used alone (AUC = 0.653, sensitivity 46.32%, specificity 82.39%). In addition, three other risk factors were also included in the logistic regression analysis, namely older age, male sex and higher oral hygiene index. As mentioned before, although we observed a positive correlation between OHI and relative salivary Dd abundance, no collinearity between the two predictive variables included in the model remained after adjustment for other risk factors for CRC (1 < VIF < 2). According to the latest statistics 1 , incidence rates are higher in men and at older ages, especially over 50 years, which is concordant with our results. A high oral hygiene index, which reflects poorer oral hygiene habits, was demonstrated to be a risk factor for CRC. One retrospective study from the Nurses' Health Study found that women with fewer teeth and moderate to severe periodontal disease might be at increased risk of developing CRC 45 , which is consistent with our findings. As expected, we observed a higher rate of type 2 diabetes comorbidity in CRC, concordant with previous findings from a prospective cohort that type 2 diabetes mellitus was associated with a higher risk of CRC 46 . Nevertheless, due to the low rate of diabetes comorbidity in our cohort, this variable did not enter the final prediction model.
Colonoscopy is currently the most widely used screening method for detecting colorectal neoplasia; however, it is rather invasive and expensive and has low patient compliance 4 . Less invasive screening methods include FOBT and detection of serum CEA and CA199 (carbohydrate antigen 199). Although FOBT has been demonstrated to reduce mortality rates, its sensitivity is limited 5 . Serum CEA is a relatively mature biomarker used in clinical practice to monitor CRC recurrence and response to therapy, but it lacks sensitivity and specificity when used as a screening method 47 . Elevated serum CA199 is usually seen in late-stage CRC and has demonstrated good value in monitoring the prognosis of CRC 48 . Hence, there is an urgent need to develop a novel classifier for predicting the risk of CRC that is less invasive, more convenient and better accepted by patients.
Nomograms have long been used by oncologists to predict cancer survival and metastasis 49,50 . Based on the coefficients calculated from multivariate logistic analysis, we assigned a value to each risk factor to calculate the total score used by the nomogram. In the current study we developed a visual nomogram to predict the risk of CRC. In addition, we developed a four-variable model to predict CRC and internally validated its prediction accuracy. Derived from logistics analysis, our four- convenience. Moreover, the performance of the predictive model is encouraging, as it demonstrated a high predictive ability for CRC, with both high sensitivity and specificity. Because continuous variables can be transformed into a score, constructing a nomogram is superior to other statistical models.
However, our research also has some limitations. This is a single-center study whose results may not be generalizable. Our predictive model needs further external validation in the future. In addition, there is inevitable selection bias for the variable age during the patient enrollment process, because there is a significant difference in age between outpatients attending for medical examinations and inpatients prepared for surgery.

Conclusions
Our results demonstrate that an elevated salivary Dd level is a risk factor for CRC, even after adjusting for other risk factors and salivary microbial biomarkers. We have developed a predictive model that can differentiate CRC patients from those without colorectal neoplasms and that has promising clinical applications. Given that a high oral hygiene index is a risk factor for CRC, developing good oral hygiene habits may help reduce the risk of CRC. However, the four-variable model should be externally validated in other centers, and prospective studies should be conducted to evaluate the risk-predicting value of the aforementioned predictive variables before further clinical application. Additional research is warranted to explore a mechanistic role for Dd in the initiation and development of CRC.

Declarations
Ethics approval and consent to participate: The current study has received the approval from Renji Hospital ethics committee and written informed consent was obtained from all subjects enrolled.

Consent for publication:
Not applicable.