Prevalence of human cytomegalovirus in colorectal cancer and viral gene expression profiles

Background Human cytomegalovirus (HCMV) infection plays a crucial role in the development and progression of cancer. However, the effect of HCMV on colorectal cancer (CRC) remains controversial. This study was performed to explore the pathogenesis of HCMV in CRC. Methods HCMV DNA was detected in 74 CRC and paired normal samples by PCR. HCMV IEA protein expression was confirmed in 717 CRC biopsies by immunohistochemistry. HCMV gene expression profiles (GEPs) were further analyzed in 5 CRC tissues by transcriptome sequencing. The associations of HCMV infection with clinical features and prognosis were also evaluated. Results The prevalence rates of HCMV in CRC tissues were 29.73% and 23.17% at the DNA and protein levels, respectively, which were significantly higher than those in normal tissues (0%). Transcriptome sequencing to evaluate the GEPs revealed 119 HCMV genes in CRC tissues. The transcripts with the highest read counts were RNA2.7, RNA4.9, RNA5.0, RL5A, UL82, UL83, and UL70, which are associated with gene expression or regulation. Survival analysis showed that patients with CRC and pIEA(++) had longer overall survival (OS) than those with pIEA(+) at the protein level. However, there was no correlation between pIEA expression and clinical features. Conclusions HCMV, a common virus found in CRC tissues, is related to the development and progression of CRC. GEP analysis revealed genes correlated with lytic infection. Additionally, genes functioning in gene expression or regulation showed high expression in CRC. We found that CRC patients with HCMV lytic infection have a better prognosis than those with non-HCMV infection. Here, we revealed features of the pathogenic mechanism and provide insight that may be useful for targeted treatment of CRC.


Background
Colorectal cancer (CRC) is one of the most common cancers worldwide and has a high mortality rate.
With increasing studies of the etiology and pathogenesis of cancer, molecular-based targeted therapy has enabled specific targeted treatment. However, few targeted therapies are used in CRC because the etiology and the mechanisms underlying the occurrence and development of CRC remain unclear.
Numerous viruses have been widely recognized as carcinogens [1], such as human papillomavirus, which causes cervical cancer, and hepatitis B and C viruses, which cause liver cancer. Additionally, many studies have suggested that pathogens participate in the occurrence and development of malignant tumors in the digestive tract, such as human papillomavirus in esophageal cancer, CRC, and anal cancer; Epstein-Barr virus in esophageal cancer [2]; and JC virus and human cytomegalovirus (HCMV) in CRC [3]. HCMV is regarded as an oncovirus that may enhance the malignancy of cancer cells or tumor-associated cells, a paradigm termed oncomodulation. HCMV has been shown to be associated with malignant glioma, leukemia, gastric cancer, and prostate cancer, among others.
Recently, HCMV has been frequently detected in CRC, and increasing evidence suggests an association between HCMV and CRC [4]. HCMV exerts oncomodulatory effects in the pathogenesis of CRC, as its genes have multiple functions. HCMV can impact infected cells, surrounding tissues, and/or immune reactions [5], encode a homologue of human interleukin-10 (LAvIL-10), control host immunity [6], and alter the tumor microenvironment [7,8]. Chen et al. showed that in patients with CRC aged < 65 years, those who were HCMV-positive had a more favorable disease-free survival (DFS) rate than HCMV(-) patients; however, HCMV infection was associated with a shorter DFS in elderly patients with CRC [9,10]. Studies have shown that pUS28 can enhance tumor inflammation by inducing the production of IL-6, RANTES, MCP-1, and fractalkine and can activate invasion and metastasis. Polymorphisms in the UL144 gene are related to the clinical outcome of CRC, and HCMV may play an immunomodulatory role in the tumor microenvironment of CRC [11]. pUL123 and pUL122 (pIE1/IE2, pIEA) are expressed in proliferative HCMV infection and may promote virus replication and interfere with host immune responses against tumor cells [12]. However, Knosel et al. showed that HCMV is not associated with the progression and metastasis of CRC, and Akintola-Ogunremi et al. failed to detect CMV DNA and protein in 24 CRC samples. Taken together, whether HCMV is closely related to the behavior and prognosis of CRC, and the underlying mechanism driving these changes, remain unclear. Moreover, the viral gene profiles in CRC tissues have not been clearly determined.
Few studies have examined the relationship between HCMV and CRC. Here, we determined the prevalence of HCMV infection at the DNA and protein levels (865 samples in total). The gene expression profiles (GEPs) of HCMV in CRC tissues were detected by transcriptome sequencing. Moreover, the correlations of HCMV with the clinical features and prognosis of CRC were analyzed. These results may provide a foundation for new concepts regarding CRC pathogenesis and novel strategies for targeted treatment.

Study population and specimens
Seventy-four patients with CRC at the Second Affiliated Hospital of Wenzhou Medical University (Wenzhou, China) were enrolled in the study between February 2017 and December 2017. A total of 148 (74 × 2) paired fresh-frozen CRC and adjacent normal tissue specimens (at least 5 cm from the resection margin) were obtained during surgical resection. The status of both tissue types was confirmed by histopathologic diagnosis performed by two pathologists. The study was approved by the institutional review board of the Second Affiliated Hospital of Wenzhou Medical University. Written informed consent was obtained from each patient. Small pieces (each approximately 500 mg) of tumoral and paired adjacent normal tissues were resected, mixed with 500 µL RNA-Later, held at 4°C for 1 h and then at -20°C for 24 h, and finally stored long-term at -80°C. Moreover, 4 tissue microarrays (TMAs) with 717 biopsies were obtained from Professor Chang (Changhai Hospital, The Navy Military Medical University, Shanghai, China). Of these tissues, 669 were from patients who had undergone curative surgery and the other 48 were paracancerous tissues. The basic information of patients with CRC is shown in Supplementary Table 1.
Polymerase chain reaction (PCR) detection
DNA was extracted from 100 mg of these tissues using the QIAamp DNA mini kit (DP304, Qiagen, Hilden, Germany) according to the manufacturer's instructions. The concentration and purity of the extracted DNA were quantified using a NanoDrop 1000 spectrophotometer (Thermo Fisher Scientific, MA, USA), and the DNA was then stored at -20°C until use.
A previous study showed that the UL47, UL56, and UL77 genes of HCMV can be detected simultaneously in five different tumors [13]. Therefore, conserved regions of these three genes were selected to detect HCMV infection in this study to minimize the differences in detection caused by gene mutation. PCR was carried out using specific primers for the UL47, UL56, and UL77 genes of HCMV [13] (Table 1).

RNA-seq and GEP analysis
Five HCMV(+) CRC tissues detected by PCR were selected for RNA-seq. Raw reads were pre-processed with Cutadapt and FastQC to remove adapter-containing reads and low-quality reads. Tophat (v.2.0.0) software was used to align the sequences to the human and HCMV genomes. The GEPs of HCMV were determined using Integrative Genomics Viewer and Partek® Genomics Suite™ (version 6.5 beta, Partek, Inc., St. Louis, MO, USA) (Figure 1). Gene expression was calculated as follows: gene expression = total exon reads / [mapped reads (million) × exon length (kb)].
Shnayder et al. [14] considered more than two HCMV reads in a sample as positive for viral gene expression. Another study considered at least 10 reads as indicating viral-positive infection [15]. Because in most cases only a few sequences are identical to those of specific viruses, owing to the large differences between the human and viral genomes, we considered a sample containing at least one viral fragment (read) as positive for HCMV infection.
The IHC results were independently assessed by eight researchers, including two pathologists, blinded to the clinical data. pIEA staining in any cells of a tissue was considered pIEA-positive expression. Scores for pIEA expression were assigned from the extent of staining as follows: 0 (0-1%), 1 (2-24%), 2 (25-50%), and 3 (51-100%). The numbers of pIEA(+) cells and total adenocarcinoma cells in each block were counted to determine the staining extent. Grade 0 was regarded as negative, Grades 1 and 2 were identified as low expression (pIEA(+)), and Grade 3 was identified as high expression (pIEA(++)).
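The read-count threshold and the IHC grading rule described above can be written as a short Python sketch. The function names are hypothetical and are not taken from the study. The handling of fractional percentages that fall between the stated integer ranges (for example 1.5% or 24.5%) is an assumption, because the text gives only integer boundaries.

```python
def hcmv_read_positive(viral_reads: int, min_reads: int = 1) -> bool:
    """Call a sample HCMV-positive when it has at least `min_reads` viral reads.

    The default of 1 follows the threshold chosen in the text; other studies
    cited there used stricter thresholds (more than 2 reads, or at least 10).
    """
    return viral_reads >= min_reads


def staining_percent(positive_cells: int, total_cells: int) -> float:
    """Percentage of pIEA(+) cells among all counted adenocarcinoma cells."""
    if total_cells <= 0:
        raise ValueError("total_cells must be positive")
    if not 0 <= positive_cells <= total_cells:
        raise ValueError("positive_cells must lie between 0 and total_cells")
    return 100.0 * positive_cells / total_cells


def piea_grade(percent: float) -> int:
    """Map staining extent to grade 0-3.

    Stated ranges: 0 (0-1%), 1 (2-24%), 2 (25-50%), 3 (51-100%).
    Assumption: values between the integer boundaries are assigned to the
    grade whose range starts above them (e.g. 1.5% -> grade 1).
    """
    if percent <= 1:
        return 0
    if percent < 25:
        return 1
    if percent <= 50:
        return 2
    return 3


def piea_category(grade: int) -> str:
    """Grade 0 is negative, grades 1-2 are pIEA(+), grade 3 is pIEA(++)."""
    return {0: "pIEA(-)", 1: "pIEA(+)", 2: "pIEA(+)", 3: "pIEA(++)"}[grade]


if __name__ == "__main__":
    # Hypothetical block: 60 stained cells out of 100 counted cells.
    pct = staining_percent(60, 100)
    print(pct, piea_grade(pct), piea_category(piea_grade(pct)))
```

Keeping the grade and the category as separate steps mirrors the text, which first scores staining extent and only then groups grades into negative, low, and high expression.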

Clinical features and survival analysis
Patients with CRC with intact IHC data were included in the survival analysis. Continuous clinicopathological data such as patients' age were classified as dichotomous variables. Our primary outcomes of interest were disease-free survival (DFS) and overall survival (OS).

HCMV prevalence is high in CRC tissues
Previous studies showed that the prevalence of HCMV is higher in CRC tissues than in normal intestinal tissues. We evaluated HCMV infection at the DNA and protein levels to confirm its prevalence in CRC tissues. First, we detected the UL47, UL56, and UL77 genes of HCMV in CRC and adjacent tissues by PCR, as these three genes can be detected simultaneously in five different tumors [13]. We defined detection of at least one of these genes as HCMV infection. Of the 74 paired samples, 29.73% of patients with CRC were positive for HCMV infection, whereas HCMV was not detected in normal tissues (Figure 2A). Additionally, we detected pIEA of HCMV in 669 CRC tissues and 48 adjacent normal tissues by IHC (Supplementary Figure 1). pIEA was expressed in the cytoplasm and was mainly detected in glandular epithelial cells near the adenoma lumen of the CRC (left panel, Figure 2B). As observed by PCR, the prevalence of pIEA was higher in CRC (23.17%) than in normal tissues (0%, right panel, Figure 2B). These results demonstrate that HCMV was present in patients with CRC and suggest that HCMV is correlated with the occurrence and development of CRC.
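The infection definition and the prevalence figures above can be illustrated with a small Python sketch. The positive counts used here (22 of 74 by PCR and 155 of 669 by IHC) are not stated in the text; they are back-calculated from the reported percentages and should be read as assumptions. The function names are hypothetical.

```python
def is_hcmv_infected(gene_detected: dict) -> bool:
    """A sample counts as infected if at least one target gene is detected."""
    return any(gene_detected.values())


def prevalence_percent(positive: int, total: int) -> float:
    """Prevalence as a percentage, rounded to two decimals as in the text."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * positive / total, 2)


if __name__ == "__main__":
    # Hypothetical PCR result for one sample: only UL56 amplified.
    sample = {"UL47": False, "UL56": True, "UL77": False}
    print(is_hcmv_infected(sample))

    # Assumed counts, back-calculated from the reported 29.73% and 23.17%.
    print(prevalence_percent(22, 74))    # DNA level, 74 CRC tissues
    print(prevalence_percent(155, 669))  # protein level, 669 CRC tissues
```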
HCMV gene expression profile in CRC
Studies have shown that HCMV genes have multiple functions, including oncomodulatory effects. However, no studies have defined the HCMV GEPs in CRC tissues. Therefore, in this study, five HCMV-DNA(+) CRC tissues were randomly selected for transcriptome sequencing. However, reads from one sample did not align with any HCMV coding sequence region. The other 4 CRC samples were compared against 119 HCMV genes (Supplementary Table 2); the HCMV transcripts with the highest accumulation are shown in Figure 3. The GEPs showed that RL5A and RNA2.7 had the largest number of reads in CRC, followed by RNA4.9, RNA5.0, UL82, UL83, and UL70. Notably, multiple genes associated with lytic infection [21], such as UL29 [22], UL54 [23,24], UL83 [19], UL122 [25], and UL123 [25], were detected in patients with CRC. This indicates that HCMV presents a lytic infection state in patients with CRC.
HCMV is correlated with the occurrence and development of CRC
A previous study reported that pIEA was closely correlated with HCMV lytic infection. The major immediate early (MIE) genes, UL123 and UL122 (IE1/IE2, IEA), play a critical role in subsequent viral gene expression and viral replication efficiency [26]. To further define the infection state of HCMV in patients with CRC, pIEA protein was detected by IHC. The associations of HCMV with survival and the clinical features of patients with CRC were further analyzed. Survival analysis showed that patients with CRC with pIEA(++) expression had longer OS than those with pIEA(+) expression (HR = 0.214, 95% CI: 0.047-0.969, P = 0.045, log rank P = 0.028, Supplementary Table 3), whereas no significant association was found with DFS (log rank P = 0.551, Figure 4A). However, no significant difference in survival was observed between pIEA(+) and pIEA(-) patients with CRC. Moreover, we found no significant correlation between pIEA expression and clinical information in 669 patients with CRC (Table 3).
Studies have suggested that HCMV infection affects the efficacy of postoperative radiotherapy and chemotherapy in CRC. Therefore, we also explored the influence of HCMV infection on the efficacy of postoperative radiotherapy and chemotherapy in patients with CRC. We divided the 669 patients into two groups according to chemotherapy administration to analyze the effect of HCMV on CRC. In patients with CRC who had been administered chemotherapy, survival analysis showed that patients with pIEA(++) had a longer OS than those with pIEA(+) (HR = 0.120, 95% CI: 0.015-0.942, log rank P = 0.016, Figure 4B), with no difference observed in DFS. No difference was observed among patients with CRC who had not been administered chemotherapy (Figure 4B).
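As a reading aid for the hazard ratios reported above, the following Python sketch checks whether a reported HR, its 95% CI, and a P value are mutually consistent under a Wald (normal approximation on the log scale) assumption. The text does not state how the P values were derived, so this approximation is an assumption and not the authors' method. The function name is hypothetical.

```python
import math


def wald_summary(hr: float, ci_low: float, ci_high: float,
                 z_crit: float = 1.959964) -> dict:
    """Recover log-scale SE, z statistic, and two-sided P from an HR and its CI.

    Assumes the interval is exp(log(HR) +/- z_crit * SE), i.e. a Wald interval.
    """
    if not 0 < ci_low < ci_high:
        raise ValueError("CI bounds must satisfy 0 < ci_low < ci_high")
    log_low, log_high = math.log(ci_low), math.log(ci_high)
    se = (log_high - log_low) / (2 * z_crit)
    z = math.log(hr) / se
    # Two-sided P for a standard normal: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    # For a Wald interval the HR is the geometric mean of the CI bounds.
    ci_midpoint = math.exp((log_low + log_high) / 2)
    return {"se": se, "z": z, "p": p_two_sided, "ci_midpoint_hr": ci_midpoint}


if __name__ == "__main__":
    # All patients: HR = 0.214, 95% CI 0.047-0.969 (reported P = 0.045).
    print(wald_summary(0.214, 0.047, 0.969))
    # Chemotherapy subgroup: HR = 0.120, 95% CI 0.015-0.942.
    print(wald_summary(0.120, 0.015, 0.942))
```

For both intervals the geometric mean of the bounds lies close to the reported HR, and the implied two-sided P falls just below 0.05, which agrees with a CI whose upper bound sits just under 1.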

Discussion
Chen et al. detected HCMV DNA by PCR in 69 (42.3%) samples resected before formalin fixation (HCMV UL55, UL73, and UL144 genes), whereas only 14 (8.6%) samples from adjacent non-neoplastic tissue showed a positive result [27]. In another study, the HCMV UL73 gene transcript was detected in 42.2% (n = 83) of CRC samples [28]. Dimberg et al. showed that HCMV DNA was significantly higher in cancerous tissue than in paired normal tissue in Swedish (39.8%, n = 119) and Vietnamese (21.9%, n = 83) patients with CRC according to qRT-PCR (the artus CMV TM PCR kit (Qiagen)) [29]. Harkins et al. detected IE1-72 immunoreactivity in 12 (80%) of 15 adenocarcinomas but not in most of the normal colonic epithelium in areas adjacent to the tumor within the same pathological section, nor was IE1-72 detected in tumor-free surgical biopsy specimens of the colon from seven of these patients [30]. Similarly, in this study, we found that the prevalence of HCMV was significantly higher in CRC than in normal colorectal tissues. We evaluated HCMV at the DNA and protein levels for the first time in a large number of samples, supporting the reliability of our results. However, different results were observed for the three genes evaluated in this study (UL47, UL56, and UL77). This may be because of differences in primer specificity and amplification efficiency of PCR for the different genes. Additionally, the number of copies of the HCMV genes may have been too low to be detected [31] because of the differences in the disease states between patients. These factors may explain the different positive rates for different HCMV genes in CRC.
HCMV latent infection is a key part of viral persistence and primary infection or reactivation. HCMV genes have diverse functions, which are closely correlated with the infection state. Previous studies revealed the gene profiles after HCMV infection by RNA-seq in different cell types. Rossetto et al. found that the expression of HCMV genes was correlated with the time after infection in CD14(+) cells: primary infection, latent state, and reactivation [32]. Almost all HCMV open reading frames were expressed at 5 days post-infection, only a subset of viral-encoded RNAs but immediate early IE2 and IE1 were detected at 18 days post-infection, and expression of the entire HCMV genome was observed after reactivation, which is mostly consistent with lytic virus replication. Guo et al. defined the HCMV gene profiles in peripheral blood mononuclear cells from patients with SLE by RNA-seq [33]. Additionally, Tang et al. detected the HCMV sequence in CRC [13]. Shnayder et al. showed that the infective state is governed mainly by quantitative changes, with a limited number of qualitative changes, in HCMV gene expression. However, the HCMV gene expression profile had not been explored in CRC tissue; it was evaluated by next-generation sequencing in this study. A total of 119 HCMV genes were detected in CRC tissues. Of the 18 HCMV genes detected in patients with SLE [33], 17 were also detected here, the exception being UL112, which was subsequently found to function in immunomodulation [34]. We found that HCMV genes associated with DNA replication, gene expression regulation, and immunity showed high read counts in CRC tissues. Among these genes, multiple lytic infection-associated genes were detected. As lytic-associated genes, UL122 and UL123 (IE1/IE2) play a critical role in viral gene expression and viral replication [26]. pUL83 (pp65) can modulate the expression of other HCMV genes and inhibit NK cell lysis [35]. pIE can promote the expression of early and late genes.
The interaction of pUL34 with pIE2 and pUL44 contributes to viral gene expression and DNA replication. Additionally, UL136, US29, and US33 are involved in viral DNA replication. Similar to previous studies, RNA2.7 showed high levels of transcript accumulation after infection. Genes encoding putative membrane proteins or uncharacterized proteins such as UL1, UL2, UL8, UL59, UL90, UL120, UL127, UL134, and UL148b were not detected [36]. Moreover, in this study, one sample showed HCMV infection by PCR, but no HCMV gene was detected by RNA-seq. This may be because the genes were not transcribed or because the read counts were too low to be detected. Further, 115 HCMV genes were detected in one sample, possibly because of the viral titer, disease stage, or other reasons.
Unfortunately, the characteristics of the HCMV GEPs in CRC remain unclear because of the small number of samples evaluated. Thus, the effect of HCMV genes on CRC and the infective status of HCMV in CRC require further analysis in larger sample sizes.
Studies have shown that HCMV is related to the development and progression of CRC. Harkins et al. reported that pIEA detected by IHC in tumor tissues was associated with progression in the colon (n = 29) [30]. Analyses of the relationship between HCMV and CRC by PCR have produced inconsistent results in different populations. HCMV is associated with a shorter DFS in non-elderly patients with CRC (n = 89), whereas the opposite results were observed in elderly patients (n = 95); neither pattern was found in our study. Because previous studies reported diverse results from small sample sizes, we analyzed the impact of pIEA on CRC in a large sample. We found that pIEA is mainly expressed in the cytoplasm in CRC cells [10], which contrasts with the results observed in other cancers. Interestingly, for the first time, we found that patients with CRC with pIEA(++) had improved OS. However, OS can be affected by many factors. There was no difference in DFS in patients with CRC, contrasting with the results of previous studies. Thus, pIEA(++) may play a role in prolonging the life of patients with CRC recurrence and in promoting the recovery of these patients through other processes. In our study, pIEA(+) was not associated with survival in patients with CRC who had not been administered chemotherapy, whereas pIEA(++) was positively correlated with OS in patients with CRC who had undergone chemotherapy. We suggest that the expression of pIEA can increase the response of CRC tumor cells to chemotherapy and enhance the tumor immune response. Patients with CRC who do not undergo chemotherapy generally have earlier-stage cancer and milder lesions, suggesting that the beneficial effects of pIEA(++) on OS may be exerted mainly in more severe cancer lesions. The protective mechanism of pIEA remains unclear; however, our results provide insight into the mechanism of HCMV in CRC, which will be further examined in our future research.

Conclusion
This is the first study to detect HCMV infection in CRC in a large sample size at the DNA and protein levels. The prevalence of HCMV was higher in CRC tissues than in para-carcinoma tissues. HCMV lytic infection-associated genes were found in CRC by RNA-seq. Additionally, patients with CRC with pIEA(++) had a longer OS but not DFS. HCMV may affect CRC via its gene expression, which can influence chemotherapy. The GEPs of HCMV in CRC remain unclear because of the small sample size used in this study, and thus our results should be confirmed in more samples. In conclusion, we revealed features of the pathogenic mechanism and provide insight that may be useful for targeted treatment of CRC.

Declarations
Figure 1 Flow chart of GEPs analysis.
... with CRC according to whether they had been administered chemotherapy and found no difference among different pIEA expression in patients with CRC who had not undergone chemotherapy. In patients with CRC who had undergone chemotherapy, the pIEA(++) group showed longer OS than the pIEA(+)