Nucleotide Excision Repair Proteins and Risk of Head and Neck Squamous Cell Carcinomas in a Chinese Population

Background Nucleotide excision repair (NER) plays a pivotal role in the development of smoking-related malignancies. We hypothesized that expression levels of NER proteins are associated with the risk of head and neck squamous cell carcinomas (HNSCCs) in a Chinese population. Methods To test this hypothesis, we conducted a case-control study of 337 HNSCC patients and 285 cancer-free controls, measuring the expression levels of nine core NER proteins in cultured peripheral lymphocytes. Results Compared with the controls, cases had statistically significantly lower expression levels of XPA (P = 0.001). After dichotomizing the subjects at the controls' median expression levels, we found that low XPA expression levels were associated with an increased risk of HNSCCs (adjusted OR, 1.42; 95% CI, 1.03-1.96; P trend = 0.031). We identified multiplicative interactions of XPA expression levels with smoking status and with drinking status (P = 0.005 and 0.044, respectively). Finally, the sensitivity of the expanded model that included protein expression levels in addition to demographic variables was significantly improved for HNSCC risk, especially among ever smokers and ever drinkers. Conclusions Reduced XPA expression levels were associated with an increased risk of HNSCCs in a Chinese population.

Functional mutations in any of these proteins may lead to abnormal NER and subsequently increase susceptibility to cancers, including cancers of the skin, lung, and head and neck [23][24][25][26][27].
In a study of 57 HNSCC patients and 63 cancer-free controls, it was reported that an increased risk of HNSCCs was associated with reduced expression levels of NER proteins in lymphocytes in a non-Hispanic White population [28]. Later, the same group validated these results in a study with a much larger sample size and more NER proteins [29]. To date, no study has explored these associations in a Chinese population, in which the composition of HNSCCs is quite different from that in the non-Hispanic White population. Specifically, oropharyngeal cancers account for most HPV-positive HNSCCs in the United States, and HPV-positive cases made up about 91.4% of all oropharyngeal cancer cases in the previous non-Hispanic White study, whereas in the Chinese population most oropharyngeal cancer cases are HPV-negative, meaning they are primarily caused by cigarette smoking [29][30][31][32][33]. In addition, the etiology of smoking-related HNSCCs is different from that of HPV-positive HNSCCs [34][35][36]. A study in a different racial group is therefore required to validate the previously reported associations. Accordingly, we conducted a case-control study to test associations between expression levels of nine core NER proteins and risk of HNSCCs in a Chinese population.

Study subjects
We recruited 337 HNSCC patients and 285 cancer-free controls from the First Affiliated Hospital of Xi'an Jiaotong University between 2013 and 2018. The cases were selected based on the following criteria: 40 years and older, newly diagnosed, and histologically confirmed HNSCC with no other cancers. The controls were recruited among visitors accompanying patients to the First Affiliated Hospital of Xi'an Jiaotong University; they were biologically unrelated to the cases, frequency-matched with cases by age and sex, and had no history of prior malignancies. All subjects included in the current study were Han Chinese. Written informed consent was obtained from all cases and controls. Participants who had smoked more than 100 cigarettes during their lifetime were defined as ever smokers; of these, those who had quit smoking for at least one year were defined as former smokers and the remainder as current smokers. All others were considered never smokers. Participants who drank alcoholic beverages at least weekly for one year were considered ever drinkers; of these, those who had quit drinking for more than one year were considered former drinkers and the remainder current drinkers. All others were defined as never drinkers. Each subject donated a 15-ml blood sample. The HPV status of all subjects was tested by RT-PCR assay. In the previous study, the expression levels of NER proteins were not correlated with HPV status in the non-Hispanic White population [29]. Because only three HNSCC cases were identified as HPV-positive, we could not determine whether NER protein expression was correlated with HPV status in the current Chinese population. The HPV-positive HNSCC subjects were therefore excluded to avoid further heterogeneity in the current study. The study protocol was approved by the Institutional Review Board of the First Affiliated Hospital of Xi'an Jiaotong University.

Reverse-phase Protein Lysate Microarrays
Details regarding the reverse-phase protein lysate microarray (RPPA) assay have been reported previously [29]. Briefly, we isolated T-lymphocytes from whole peripheral blood by Ficoll gradient centrifugation. Cellular proteins were extracted from the cells and prepared for the RPPA analysis. Serially diluted lysates were applied to nitrocellulose-coated slides (Schleicher & Schuell BioScience, Inc., USA) using an Aushon Arrayer (Aushon BioSystems, USA). Each sample containing the antigens (the NER proteins) to be detected was spotted in duplicate, with additional positive and negative controls prepared from mixed cell lysates or dilution buffer, respectively. Each slide was probed with a validated primary antibody plus a biotin-conjugated secondary antibody. Mouse anti-goat or anti-rabbit polyclonal or anti-human monoclonal antibodies were used against XPA, XPB, XPC, XPD and ERCC1 (Santa Cruz, USA); XPF (Abcam, USA); XPG (Proteintech, USA); and DDB1 and DDB2 (Invitrogen, USA). The arrays were incubated with individual antibodies for 1 h at room temperature. The secondary antibodies were then added to the slides and incubated at room temperature for 30 min.
Signals were amplified using a Dako system according to the protocol previously described [29]. We then incubated the slides with a secondary conjugated streptavidin for 30 min and visualized the signals by DAB colorimetric reaction. The signals on the microarrays were processed using the Array-Pro Analyzer software (Media Cybernetics, USA) to determine spot intensities, which were then analyzed with a logistic model using an R package. A fitted curve was plotted with the relative log2 concentration of each protein on the X-axis and the signal intensities on the Y-axis using the B-spline model as previously described [37].
Protein concentrations were determined for each lysate from the fitted curve and normalized by the median value for protein loading as described [38,39]. RPPA_CF denotes the correction factor in RPPA. A sample was considered an outlier if its correction factor was below 0.25 or above 2.5.
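The outlier rule above can be illustrated with a minimal Python sketch. The sample identifiers and correction-factor values are hypothetical, and the function name `flag_outliers` is ours; only the thresholds 0.25 and 2.5 come from the text.

```python
def flag_outliers(correction_factors, low=0.25, high=2.5):
    """Return the IDs of samples whose RPPA correction factor is
    below `low` or above `high` (the thresholds stated in the text)."""
    return [sample_id
            for sample_id, cf in correction_factors.items()
            if cf < low or cf > high]


# Hypothetical correction factors, for illustration only.
rppa_cf = {"S001": 1.02, "S002": 0.21, "S003": 2.50, "S004": 3.10}
print(flag_outliers(rppa_cf))  # S002 and S004 fall outside [0.25, 2.5]
```

Values exactly equal to 0.25 or 2.5 are retained here, following the wording "below 0.25 or above 2.5".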

Statistical Analysis
The distributions of demographic variables were compared between cases and controls using the chi-square test. Differences in the relative expression levels of NER proteins between cases and controls were assessed by the Wilcoxon rank-sum test.
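As an illustration of the rank-sum comparison, the following Python sketch implements a two-sided Wilcoxon rank-sum test using the large-sample normal approximation. It is a simplified version under stated assumptions: tied values receive average ranks, but no tie or continuity correction is applied to the variance, so its P values may differ slightly from those produced by SAS. The expression values are hypothetical.

```python
import math


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test (normal approximation).

    Returns (W, z, p), where W is the sum of the ranks of `x` in the
    pooled sample. Ties get average ranks; the variance is not
    corrected for ties (a simplification).
    """
    pooled = sorted(list(x) + list(y))
    # Assign the average rank to each distinct value.
    avg_rank = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        avg_rank[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j
    n1, n2 = len(x), len(y)
    w = sum(avg_rank[v] for v in x)
    mean_w = n1 * (n1 + n2 + 1) / 2
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean_w) / sd_w
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area
    return w, z, p


# Hypothetical relative expression levels, for illustration only.
cases = [0.82, 0.91, 0.77, 0.88, 0.95, 0.70]
controls = [1.01, 0.97, 1.10, 0.93, 1.05, 0.99]
w, z, p = rank_sum_test(cases, controls)
print(f"W = {w}, z = {z:.3f}, P = {p:.4f}")
```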
The median expression values in the controls were used as the cutoff values for calculating crude odds ratios (ORs) and their 95% confidence intervals (CIs). The associations between protein expression levels and HNSCC risk were estimated by computing ORs and CIs from multivariate logistic regression models. Further stratification analyses were used to evaluate effect modification between NER protein expression levels and demographic variables. A multiplicative interaction was defined as OR11 > OR01 × OR10, in which OR11 was the OR when both factors were present, OR10 was the OR when only factor 1 was present, and OR01 was the OR when only factor 2 was present [40].
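The crude OR calculation and the interaction criterion can be sketched in Python as follows. This is an illustrative sketch, not the SAS procedure used in the study: the CI uses the standard log-OR (Woolf) approximation, "low expression" is assumed to mean strictly below the control median, and all data values are hypothetical.

```python
import math
from statistics import median


def crude_or(case_values, control_values, z=1.96):
    """Crude OR (low vs. high expression) with an approximate 95% CI.

    Assumption: 'low' means strictly below the median of the controls.
    Every cell of the 2x2 table must be non-zero.
    """
    cutoff = median(control_values)
    a = sum(v < cutoff for v in case_values)      # cases, low expression
    b = len(case_values) - a                      # cases, high expression
    c = sum(v < cutoff for v in control_values)   # controls, low expression
    d = len(control_values) - c                   # controls, high expression
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


def exceeds_multiplicative(or11, or10, or01):
    """Criterion from the text: OR11 > OR01 x OR10."""
    return or11 > or01 * or10


# Hypothetical expression values, for illustration only.
controls = [1, 2, 3, 4, 5, 6, 7, 8]
cases = [1, 1, 2, 2, 3, 3, 5, 6]
or_, lo, hi = crude_or(cases, controls)
print(f"OR = {or_:.2f} (95% CI, {lo:.2f}-{hi:.2f})")
print(exceeds_multiplicative(or11=6.0, or10=2.0, or01=2.0))
```

Adjusted ORs, as reported in the study, require a logistic regression model with covariates and are not reproduced by this 2×2 calculation.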
To assess the effects of protein expression levels on HNSCC risk prediction, two risk models were constructed to examine the area under the receiver operating characteristic (ROC) curve (AUC): the baseline model including only demographic variables, and the protein model including the expression levels in addition to these demographic variables. All tests were two-sided, and P < 0.05 was considered significant. All statistical analyses were performed using SAS software (version 9.4; SAS Institute, Inc., Cary, NC).
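For readers unfamiliar with the AUC, the following Python sketch computes it as the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen control, with ties counted as one half. The predicted risks for the two models are hypothetical, and the sketch does not include the statistical test used to compare the two AUCs, which the text does not specify.

```python
def auc(case_scores, control_scores):
    """AUC as the fraction of case-control pairs in which the case has
    the higher predicted risk (ties count as 0.5)."""
    wins = 0.0
    for s in case_scores:
        for t in control_scores:
            if s > t:
                wins += 1.0
            elif s == t:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical predicted risks from the two models, for illustration only.
baseline_cases, baseline_controls = [0.6, 0.4, 0.7], [0.5, 0.3, 0.65]
protein_cases, protein_controls = [0.7, 0.55, 0.8], [0.5, 0.3, 0.6]

print("baseline AUC:", auc(baseline_cases, baseline_controls))
print("protein AUC: ", auc(protein_cases, protein_controls))
```

The pairwise formulation is quadratic in the sample size, which is acceptable for a study of this scale; a rank-based formula gives the same value more efficiently.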

Characteristics of the Study Population
The distributions of selected characteristics of cases and controls are summarized in Table 1. There were no significant differences in the distributions of age and sex between cases and controls.

Differences in NER Protein Expression Levels between Cases and Controls
Cases had lower mean relative expression levels than controls for six of the nine core NER proteins analyzed; the exceptions were XPC, XPG, and ERCC1 (Table 2). In the Wilcoxon rank-sum test for differences in NER protein expression levels between cases and controls, only XPA levels were statistically significantly lower in cases than in controls (P = 0.001; Fig. 1A). Because the expression levels of the nine NER proteins were measured at the same time, they were likely to be correlated with each other. As shown in Supplementary Table 1, expression levels of XPA were statistically significantly correlated with those of XPB, XPC, XPD, and ERCC1 (P = 0.019, P = 0.050, P < 0.001, and P = 0.012, respectively).

Stratification Analyses of Expression Levels of XPA by Selected Variables
Stratification analyses of XPA expression levels revealed that patients in the subgroups of age ≤ 59, age > 59, male, female, former and current smokers, and former and current drinkers exhibited significantly lower mean expression levels of XPA than did controls (all P < 0.001; Table 3). Among cases, women had lower expression levels of XPA than men, whereas among controls, women had higher expression levels of XPA than men; the sex differences were not significant in either group (P = 0.249 and P = 0.889, respectively; Table 3). Moreover, ever smokers and ever drinkers had significantly lower expression levels of XPA than never smokers and never drinkers, respectively (all P < 0.001; Table 3). There were no significant differences in the expression levels of XPA by tumor site, suggesting that expression levels of XPA may not differ among HNSCC tumor sites (Supplementary Table 2).

Associations between NER Protein Expression Levels and Risk of HNSCCs
To estimate HNSCC risk, the relative expression levels were dichotomized at the median values of the controls (Table 4). The crude OR for HNSCC risk associated with low relative expression levels of XPA was 1.43 (95% CI, 1.04-1.97), compared with high expression levels of XPA. After adjustment for age, sex, smoking status and alcohol consumption in multivariate logistic regression analysis, the OR for XPA remained essentially unchanged. When continuous expression values were used in the logistic regression model with adjustment for all covariates, there was also a dose-response relationship between reduced expression levels of XPA and increased HNSCC risk (P trend = 0.031).

Interactions between XPA Expression Levels and Selected Variables
We further assessed possible interactions on a multiplicative scale between expression levels of XPA and the selected variables listed in Table 1. The multiplicative interaction was tested by including an interaction term (i.e., relative expression levels of XPA × each of the risk factors) in a multivariate regression model that also included the main effects of NER protein expression levels and other covariates. We found that smoking status and drinking status each had a significant multiplicative interaction with relative expression levels of XPA in association with HNSCC risk (P = 0.005 and P = 0.044, respectively; Table 3). To further unravel these multiplicative interactions, we stratified the adjusted ORs by smoking status and drinking status. The ORs for the relative expression levels of XPA, dichotomized at the median, were greater in ever smokers than in never smokers (Fig. 1B). Similarly, the ORs for the relative expression levels of XPA, dichotomized at the median, were greater in ever drinkers than in never drinkers (Fig. 1C).
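For clarity, the interaction model described above can be written out as follows. This is a sketch under our own notation, assuming binary coding: X = 1 for low XPA expression, E = 1 for the exposure (e.g., ever smoking), and Z the vector of remaining covariates; the symbols are not taken from the original analysis.

```latex
\[
\operatorname{logit} P(\mathrm{HNSCC}) =
  \beta_0 + \beta_1 X + \beta_2 E + \beta_3 (X \times E)
  + \boldsymbol{\gamma}^{\top}\mathbf{Z}
\]
\[
\mathrm{OR}_{10} = e^{\beta_1}, \qquad
\mathrm{OR}_{01} = e^{\beta_2}, \qquad
\mathrm{OR}_{11} = e^{\beta_1 + \beta_2 + \beta_3}
\quad\Longrightarrow\quad
e^{\beta_3} = \frac{\mathrm{OR}_{11}}{\mathrm{OR}_{10}\,\mathrm{OR}_{01}}
\]
```

Under this coding, testing the interaction term amounts to testing whether β3 = 0, and β3 > 0 corresponds to OR11 > OR01 × OR10.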
We further assessed the prediction performance for HNSCCs of models integrating demographic variables and protein expression levels, using ROC curves that measure the effect of XPA expression levels in two dimensions. The AUC was significantly improved in the model that included the effect of XPA expression levels, compared with the model that did not (Fig. 2A, P = 0.004). Furthermore, among former and current smokers, the AUC was significantly improved in the model that included the effects of XPA expression levels, compared with the model that did not (Fig. 2C and 2D, P < 0.001 and P < 0.001, respectively), but the improvement was not significant among never smokers (Fig. 2B, P = 0.462). Likewise, among former and current drinkers, the AUC was significantly improved in the model that included the effects of XPA expression levels, compared with the model that did not (Supplementary Fig. 1B and 1C, P = 0.001 and P = 0.001, respectively), but the improvement was not significant among never drinkers (Supplementary Fig. 1A, P = 0.404).

Discussion
In this study, using the RPPA assay, we confirmed the previous finding that reduced NER protein expression is associated with an increased risk of HNSCCs. Our results showed that reduced relative expression levels of XPA were associated with an increased risk of HNSCCs in a Chinese population. We further assessed interactions between XPA expression levels and selected variables and found that smoking and drinking each had a significant multiplicative interaction with XPA expression on HNSCC risk. Moreover, the AUC analysis suggested that including XPA expression levels further improved risk prediction in ever smokers and ever drinkers.
In an early study, an association was reported between an increased risk of HNSCCs and reduced expression levels of XPD, XPF, XPA and XPC in a non-Hispanic White population, at a time when appropriate antibodies for DDB1 and XPB were not available [28]. Later, the same group validated these results with antibodies for more of the essential proteins, and found that the relative expression levels of XPA and XPB were significantly lower in cases than in controls and that the risk of HNSCCs was associated with lower expression levels of XPA and XPB [29]. Because the composition of HNSCCs in the Chinese population is quite different from that in the non-Hispanic White population, we tested the associations between expression levels of nine core NER proteins and risk of HNSCCs in a Chinese population, and found that reduced expression levels of XPA, but not of XPB, were associated with HNSCC risk. These results further support the notion that altered translational levels of NER genes, which have a more direct effect on NER capacity than transcript levels do, may contribute to the risk of HNSCCs. Moreover, our previous work at the transcript level suggested that mRNA expression levels of XPA and XPB were statistically significantly lower in cases than in controls, and that reduced mRNA expression levels of XPB were associated with an increased risk of HNSCCs in a Chinese population [41]; however, we did not find this association with XPB at the translational level. One reason for this discrepancy is that the transcript levels and translational levels of NER genes may not be directly correlated. Although the mRNA of an NER gene is ultimately translated into an NER protein, the relationship between transcription and translation is far from a simple linear correlation [42].
The underlying mechanism is likely that cis-acting and trans-acting processes create a series of systems that promote or inhibit the synthesis of proteins from a given copy number of mRNA molecules, and translational levels are more directly involved in the NER process [43]. Another reason is that the sample size of the current study is still not large enough; future studies with more cases and controls are warranted to validate the current results.
A previous study suggested an effect modification by smoking status on XPB, indicating that the association between reduced expression levels of XPB and increased risk of HNSCCs may differ by smoking status [29]. In the current study, we observed that smoking and drinking status each had a significant multiplicative interaction with expression levels of XPA, rather than XPB, on HNSCC risk. We then stratified the ORs for XPA by smoking and drinking status and found that the adjusted ORs for XPA in ever smokers or ever drinkers were greater than those in never smokers or never drinkers, indicating that ever smokers or ever drinkers with reduced XPA expression levels might have a higher risk of developing HNSCCs.
The XPA protein consists of several domains: the C-terminal domain, which interacts with transcription factor IIH; the N-terminal domain, which contains the RPA34 and ERCC1 binding sites; and the central domain, which is responsible for DNA binding [44]. Variation in XPA function may lead to an aberrant NER process and subsequently increase susceptibility to cancer. Our data suggested an increased risk of HNSCCs associated with reduced expression levels of XPA in a Chinese population, and these results were consistent with previously published studies of HNSCC risk in non-Hispanic Whites, suggesting that XPA may serve as a general biomarker for HNSCCs across these two racial groups.
Previously, we assessed the performance of NER proteins in predicting HNSCC risk in a non-Hispanic White population and found that the AUC was significantly improved by including the combined effect of XPB and XPA expression, compared with the model that did not, especially in former smokers [29]. In the current Chinese population study, we found that the AUC was significantly improved by including XPA expression levels, compared with the model that did not, especially in ever smokers and ever drinkers, suggesting that suboptimal XPA expression levels may play a more important role in the risk of HNSCCs in ever smokers and ever drinkers.
The RPPA assay is a rapid, cost-effective and, most importantly, efficient method for measuring the expression levels of NER proteins, and the current study is the first to examine the associations between NER proteins and risk of HNSCCs in a Chinese population. Although the present study extends previous work on NER proteins with more antibodies, several limitations remain to be addressed. Although the results of the current translational study differ from those of the transcriptional study, translational levels are more directly involved in the NER process. As in previous hospital-based studies, the control group may not be representative of the general population, and future studies may need a much larger sample size and controls recruited from a community-based population.