NUSAP1 and KIAA0101 downregulation by neoadjuvant therapy is associated with better outcome and survival in breast cancer.

Background: Studies of molecular changes occurring before and after neo-adjuvant chemotherapy (NCT) for breast cancer may unveil genetic biomarkers to predict therapy response. This study aimed at identifying genomic changes in breast primary tumors of patients under NCT. Gene expression changes were correlated with pathological response and survival. Methods: Gene expression profiles in tissue samples from pre and post NCT were obtained by a non-supervised classification analysis. Thirty-nine patients were classified according to their response to the chemotherapy as pathologic complete responders or non-responders (pCR and no-pCR, respectively). Overall survival was assessed by comparing gene expression values before NCT using the Log-rank (Mantel-Cox) test. Results: A signature constituted by 43 genes was obtained to stratify pCR and no-pCR patients after NCT (FC = + 3, FDR p-value < 0.0298). These genes were involved in regulation of the mitotic nuclear division and the anaphase-promoting complex-dependent catabolic process. Remarkably, over-expression of NUSAP1 and KIAA0101 was associated with poor overall survival. Conclusions: A new expression signature evaluating response to neo-adjuvant chemotherapy stratified pathological response. The expression levels of NUSAP1 and KIAA0101 before and after the neo-adjuvant therapy may be useful to predict overall survival.

The objective of this work was to observe and analyze the genomic adaptations in the primary tumor of patients with BC to the effects of NCT, and to correlate these changes with pathological response and prognosis.

Materials And Methods
Patient population. BC patients were recruited, consented, and enrolled in the study in the Breast Cancer or cyclophosphamide IV (500-1500 mg/m²) and epirubicin IV (> 60 mg/m²). After receiving any of these schemes, the patients received 12 weekly cycles of paclitaxel IV infusions (80 mg/m²) administered over 1 h. In patients who presented drug toxicity, cycles of carboplatin replaced the drug responsible for the toxicity. After NCT, surgical resection of the breast was performed in all patients.
Tumor sample collection. Two tissue samples were collected from each patient: a biopsy sample (BS) before NCT paired with tissue after completing the NCT cycles (surgery sample or SS). Thick needle puncture biopsies were obtained using a Bard Magnum 12 Fr gauge needle (Bard®). Tumor location was marked at diagnosis using a carbon tracking technique [13]. Six to eight tissue cylinders were obtained from each patient. Four samples were used for histopathological analysis and three samples were preserved in RNA-later solution for analysis. The SS were obtained from surgeries for locoregional control (modified radical mastectomy in most of the cases). Tissues were sent to pathology for histopathological analysis. A 2 x 1 cm piece, marked by the carbon track used during the diagnostic biopsy procedure, was preserved in RNA-later solution for the gene expression analysis.
Treatment response. Two pathologists evaluated surgical specimens and assessed tumor response to NCT using the Miller-Payne grading system. For the purposes of this study, a grade 5 score in the grading system was considered a pathological complete response (pCR), and the remaining scores (including partial pathological response) were classified as non-pathological complete responses (no-pCR) [14].
RNA isolation and microarray analysis (expression profiles). RNA extraction from BS and SS was performed using the RNeasy Fibrous Tissue Mini Kit (Qiagen, Germantown, MD) following the manufacturer's instructions. RNA quality was assessed by capillary electrophoresis using the Experion™ Automated Electrophoresis Station (Bio-Rad, CA, USA). Sample processing, microarray hybridization and gene expression analysis of the selected RNAs were conducted using the GeneChip 3'IVT Express Kit and GeneChip Human Genome U133 Plus 2 (Affymetrix, Santa Clara, CA), according to the manufacturer's instructions and as previously described [15].
Microarray data processing. Normalization was performed using RMA (Robust Multi-array Average). Samples from five patients were removed because they showed clearly altered profiles compared to the others (abnormal microarray quality controls), leaving 39 patients for this study. Probes with a mean expression lower than 3 (on the logarithmic scale produced by RMA) were also removed.
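The low-expression filter described above can be sketched as follows. This is only an illustration under stated assumptions, not the pipeline used in the study: the function name `filter_low_expression`, the dictionary layout, and the toy probe names are hypothetical, and the values stand in for RMA log-scale intensities.

```python
from statistics import mean

def filter_low_expression(expr, min_mean=3.0):
    """Keep probes whose mean log-scale expression is at least min_mean.

    expr: dict mapping a probe ID to a list of log-scale values (one per sample).
    Probes with a mean strictly lower than min_mean are dropped.
    """
    return {probe: values for probe, values in expr.items()
            if mean(values) >= min_mean}

# Hypothetical toy data, not taken from the study
toy = {
    "probe_A": [2.1, 2.8, 3.0],   # mean below 3, removed
    "probe_B": [5.4, 6.1, 5.9],   # kept
    "probe_C": [3.0, 3.0, 3.0],   # mean exactly 3, kept
}
kept = filter_low_expression(toy)
```

The threshold is applied to the per-probe mean across all samples, so a probe is judged once for the whole cohort rather than per group.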
Differential gene expression analysis was performed using a t-test with multiple-comparison correction by the False Discovery Rate (FDR) method [16]. Probes with a significant p-value (FDR: p < 0.05) were considered positive. These analyses were performed using the free Transcriptome Analysis Console (TAC) 4.0.1 software from Applied Biosystems (Thermo Fisher Scientific, Waltham, MA).
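For readers unfamiliar with FDR correction, the sketch below shows one common way to adjust a list of p-values. It assumes the Benjamini-Hochberg step-up procedure; the text does not state how TAC computes its FDR values internally, so this should be read as an illustration of the idea rather than a reproduction of the software. The function name and the toy p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the adjusted values stay monotone
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical raw p-values for five probes
raw = [0.001, 0.008, 0.039, 0.041, 0.60]
adj = benjamini_hochberg(raw)
significant = [p < 0.05 for p in adj]
```

In this toy example the third and fourth probes have raw p-values below 0.05 but are no longer called significant after adjustment, which is the behavior the correction is meant to provide.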
Gene network. An interaction analysis of the selected genes was carried out using the online tool STRING: functional protein association networks version 11.0 [17]. The combined score was computed by combining the probabilities from the different evidence channels and corrected for the probability of randomly observing an interaction [18].
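The combination of evidence channels described above can be illustrated with a simplified sketch. The assumptions are important here: the prior probability of a random interaction (0.041) is taken as a default parameter and should be checked against the STRING documentation, the channel-specific homology corrections that STRING applies are omitted, and the function name and example scores are hypothetical.

```python
def combined_score(channel_scores, prior=0.041):
    """Combine per-channel probabilities into one score (simplified sketch).

    Each channel score is first corrected for the prior probability of
    observing an interaction at random, the corrected scores are combined
    as independent evidence, and the prior is added back once at the end.
    """
    product = 1.0
    for s in channel_scores:
        # Remove the prior from this channel (never below zero)
        s_no_prior = max(0.0, (s - prior) / (1.0 - prior))
        product *= (1.0 - s_no_prior)
    combined_no_prior = 1.0 - product
    # Add the prior back exactly once
    return combined_no_prior * (1.0 - prior) + prior

# Hypothetical channel scores (e.g. experiments, co-expression, text mining)
single = combined_score([0.9])
several = combined_score([0.6, 0.5, 0.4])
```

With a single channel the function returns the input score unchanged, and adding further supporting channels can only raise the combined value.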
Real-time qPCR validation. To validate the microarray data, four genes were selected based on the consistency of their expression in the different analyses: two over-expressed (MME and DST) and two under-expressed (NUSAP1 and KIAA0101). GRAMD1A was used as the endogenous control gene because of its low variation between samples [15]. Expression of these genes and of the endogenous control was assessed using predesigned hydrolysis probes (MME: Hs00153510_m1, DST: Hs00156137_m1, NUSAP1: Hs01006195_m1, KIAA0101: Hs00207134_m1, GRAMD1A: Hs.PT.5840681431) (Thermo Fisher Scientific). The total RNA aliquots used for the microarray assays were analyzed by qPCR using a QuantStudio 3 instrument (Applied Biosystems). The mean Ct of each gene was used to compute dCt (target gene minus endogenous control), and the relative expression 2^-dCt was calculated from the dCt of each gene. To compare gene expression between the pCR and no-pCR groups, the 2^-dCt values normalized to GRAMD1A were evaluated with an unpaired t-test with Welch's correction (p-value < 0.05).
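The relative quantification and group comparison described above can be sketched as follows. The Ct values are hypothetical and the function names are illustrative; the sketch computes the 2^-dCt value per sample, then Welch's t statistic and the Welch-Satterthwaite degrees of freedom. It stops short of the p-value, which requires the t distribution.

```python
from statistics import mean, variance

def relative_expression(ct_target, ct_reference):
    """Return 2^-dCt, where dCt = Ct(target gene) - Ct(endogenous control)."""
    return 2.0 ** -(ct_target - ct_reference)

def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)   # squared standard error of group a
    vb = variance(b) / len(b)   # squared standard error of group b
    t = (mean(a) - mean(b)) / (va + vb) ** 0.5
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Hypothetical mean Ct pairs (target gene, endogenous control) per sample
responders = [relative_expression(t, r)
              for t, r in [(28.0, 25.0), (29.0, 25.0), (28.5, 25.5)]]
non_responders = [relative_expression(t, r)
                  for t, r in [(26.0, 25.0), (25.0, 24.0), (27.0, 25.0), (26.0, 24.0)]]
t_stat, dof = welch_t(responders, non_responders)
```

A higher dCt means later amplification of the target relative to the control, so the toy responder group here has lower relative expression and a negative t statistic. If SciPy is available, `scipy.stats.ttest_ind(a, b, equal_var=False)` performs the same Welch test and also returns the p-value.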
Overall survival. Comparison of BS gene expression values with overall survival (OS) was evaluated in the 39 patients. To evaluate differences in OS, the Log-rank (Mantel-Cox) test was used to compare Kaplan-Meier survival curves using GraphPad Prism version 6.01 for Windows, GraphPad Software (La Jolla, CA, www.graphpad.com). A p-value ≤ 0.05 was considered significant in all statistical analyses.
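A two-group log-rank comparison of the kind described above can be sketched in plain Python. This is a minimal illustration with hypothetical follow-up times, not the GraphPad Prism implementation; the function names are illustrative, and the p-value uses the chi-square distribution with one degree of freedom.

```python
from math import erfc, sqrt

def chi2_1df_p_value(chi2):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return erfc(sqrt(chi2 / 2.0))

def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank (Mantel-Cox) chi-square and p-value.

    events: 1 = death observed, 0 = censored at that time.
    Assumes at least one event time with more than one patient at risk.
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e == 1})
    observed_a = expected_a = var = 0.0
    for t in event_times:
        n = sum(1 for u, _, _ in data if u >= t)                 # at risk, both groups
        n_a = sum(1 for u, _, g in data if u >= t and g == 0)    # at risk, group a
        d = sum(1 for u, e, _ in data if u == t and e == 1)      # deaths at time t
        d_a = sum(1 for u, e, g in data if u == t and e == 1 and g == 0)
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            # Hypergeometric variance of the deaths in group a at time t
            var += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    chi2 = (observed_a - expected_a) ** 2 / var
    return chi2, chi2_1df_p_value(chi2)

# Hypothetical follow-up in months for high- and low-expression groups
high_t, high_e = [5, 8, 12, 20, 30], [1, 1, 1, 1, 0]
low_t, low_e = [15, 25, 40, 50, 60], [1, 0, 0, 0, 0]
chi2, p_value = logrank_test(high_t, high_e, low_t, low_e)
```

As a consistency check on the conversion step, `chi2_1df_p_value(4.517)` evaluates to approximately 0.0336, which agrees with the chi-square and p-value pair reported for NUSAP1 in the Results.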
For external validation, the Kaplan-Meier Plotter (http://kmplot.com/) online database [19] was used to analyze OS in relation to high versus low mRNA expression levels. The Kaplan-Meier Plotter split the BC patient samples (n = 1,402) into two groups according to their median mRNA levels. The Affymetrix probe IDs used for the Kaplan-Meier analysis were KIAA0101/PCLAF 202503_s_at and NUSAP1 219978_s_at.

Results
Patients. Fifty-four patients were enrolled in the study, but only 44 paired samples (BS and SS) satisfied the RNA quality and quantity standards needed for microarray analysis. Of these, five sets of samples were eliminated because they failed to achieve quality standards, leaving 39 patient sample sets for the final analysis. Demographics and clinical characteristics of the patients are described in Table 1. Only 8 (20.5%) of the 39 patients reached pCR, according to the Miller-Payne grading system.
Gene expression profile analysis. Different comparisons between pCR and no-pCR sets were made to evaluate the molecular gene expression modifications induced by NCT (Figure 1). The first comparison assessed changes induced by NCT in samples of patients achieving pCR (BS = 8 vs SS = 8). A profile of 21 probes representing 14 genes was found (FC = + 3, FDR p-value < 0.05) (Figure 1A). Three genes were overexpressed (TOP2A, RRM2, and CDKN3) and eleven were downregulated (EGR2, ADAMTS5, JUN, APOLD1, DUSP1, CYR61, ATF3, EGR1,
Gene network. The online tool "STRING: functional protein association networks version 11.0" [17] was used to investigate interactions between the genes that were identified as differentially expressed in each comparison. The interaction network was significant (PPI enrichment p-value: 2.26E-10), meaning that these proteins have more interactions among themselves than would be expected from a random set of proteins of similar size drawn from the genome. Figure 1 shows the gene network for each gene expression profile. According to the protein interaction analysis, AURKA, CCDC8, CCNB1, NUSAP1, and UBE2C are involved in the regulation of nuclear division during mitosis, while AURKA, CCNB1, and UBE2C are part of the anaphase-promoting complex-dependent catabolic process.
qPCR validation. DST, MME, NUSAP1 and KIAA0101/PCLAF were used to validate the microarrays by qPCR. Only 31 samples (pCR = 5, no-pCR = 26) out of 39 had enough quality and quantity of total RNA to perform the qPCR validation analysis. Supplementary Figure 4 shows the box plots of NUSAP1 and KIAA0101 expression by qPCR (4A and 4B) and by microarray (4C and 4D). A similar analysis was made for DST and MME (Supplementary Figure 4E-F). The qPCR analysis corroborated the expression patterns that the microarray revealed for these four genes.
Overall survival. The primary outcome variable was death due to BC. Patients were followed up for 46.5 months on average (SD = 20.34, range 5.1-79.2 months). Higher levels of NUSAP1 gene expression in the BS (56 to 89%) were associated with a decreased OS (Log-rank Mantel-Cox test, Chi square = 4.517, p-value = 0.0336) (Figure 4A). Similarly, KIAA0101 overexpression was accompanied by a reduction in OS from 84 to 60%, although this difference did not reach the predefined significance threshold (Log-rank Mantel-Cox test, Chi square = 2.827, p-value = 0.0927) (Figure 4B). Consistent results were observed when analyzing public data from 1,402 patients from the Kaplan-Meier Plotter website (http://kmplot.com). Low levels of NUSAP1 and KIAA0101 were associated with greater OS (Log-rank HR = 1.82, CI = 1.46-2.26, p-value = 6.2e-08 and Log-rank HR = 1.47, CI = 1.19-1.82, p-value = 0.00039) (Figures 4C and 4D, respectively).

Discussion
High-throughput genomic technologies like microarray and mass sequencing platforms may influence the understanding of the changes in tumor biology of BC lesions after NCT. Usually, clinicians rely only on the histopathological assessment of biopsies and surgical materials to predict a patient's prognosis. A better understanding of the tumor response to chemotherapy is important to design better treatment regimens for aggressive tumors.
In this study, tumor tissue samples of patients with BC were collected before (BS) and after neo-adjuvant therapy (SS), and gene expression was then evaluated on a microarray platform. Comparison of gene expression profiles in patients who did or did not respond to treatment (pCR vs. no-pCR) provided a gene signature constituted by 43 differentially expressed genes in the SS sample. Validations with pCR corroborated the observations, confirming that 30 genes were over-expressed and 13 were under-expressed in these patients (Supplementary Table 1). Some of these genes are involved in the regulation of nuclear division during mitosis or participate in the anaphase-promoting complex-dependent catabolic process.
Deregulated expression of some of these genes has been implicated in BC. For example, CCNB1, KIAA0101, NUSAP1, RRM2, UBE2C, and UBE2T alterations are part of a gene signature identified in the tumorigenesis process in young women from the Middle East [20]. NUSAP1 and KIAA0101 were overexpressed in DCIS (ductal carcinoma in situ) and IDC (invasive ductal carcinoma) when compared to normal age-matched controls. CCNB1, RRM2, and UBE2C are reported in the PAM50 signature as elements for the molecular classification of BC lesions [21]. However, to our knowledge, the 43-gene signature described here for predicting BC response after NCT has not been reported.
KIAA0101 and NUSAP1 were found to be downregulated by NCT in patients with pCR. Furthermore, higher expression levels of these same genes in the tumor biopsy before treatment (BS) were related to worse survival, indicating that they are potential predictors of survival in diagnostic biopsies. The protein encoded by KIAA0101 (also known as PCLAF or PCNA Clamp Associated Factor) is a PCNA-binding protein that acts as a regulator of the number of centrosomes and is involved in repair mechanisms during DNA replication [22]. Over-expressed KIAA0101 has been associated with decreased survival in BC patients [22], but not with the pathological response to NCT. NUSAP1 gene expression levels showed a remarkable inverse correlation with survival (Fig 4A and 4C). This gene encodes the nucleolar-spindle associated protein, which binds to chromatin and microtubules and is critical for cytokinesis and spindle assembly during mitosis [23]. NUSAP1 overexpression has been described in bladder, cervix, colon, glioblastoma, liver, lung, oral squamous cell carcinoma, prostate, kidney, and breast [24][25][26][27][28][29][30][31][32][33][34][35][36][37][38][39] cancers. Multiple studies have correlated its over-expression with a poor prognosis [22,28,30,32,36,37,40,41]. Zhang et al. [37] demonstrated that down-regulation of NUSAP1 suppressed proliferation, migration and invasion of MCF7 cells by disturbing the regulation of CDK1 and DLGAP5, and reported increased susceptibility to epirubicin.
Both NUSAP1 and KIAA0101 are associated with DNA repair mechanisms through their relationship with BRCA1. NUSAP1 promotes increased expression of the BRCA1 protein [42]. KIAA0101 regulates the number of centrosomes through its interaction with BRCA1 [22]. Since the biological role of NUSAP1 and KIAA0101 involves cell cycle pathways, patients with elevated levels of these genes may benefit from chemotherapeutic drugs that interfere with BRCA1 DNA repair pathways, such as platinum derivatives. In addition, the expression levels of these genes after NCT in patients not reaching pCR may suggest a second-line therapy.
Galiellalactone (GL), a fungal metabolite with anti-tumor and anti-inflammatory properties, downregulates NUSAP1 in DU145 cells by targeting the NF-κB and STAT3 pathways, inducing cell cycle arrest. NUSAP1 overexpression may therefore be a target for GL [43]. Another option to target NUSAP1 overexpression is the antitumor compound isopicrinin, isolated from the Rhazya stricta plant, which has cytotoxic activity by inhibiting the assembly of microtubules [44].
The under-expression of NUSAP1 seems to sensitize osteosarcoma cells to treatment with taxol, since NUSAP1 interacts with the SUMO E3 ligase complex contributing to adequate chromosomal segregation [45]. In oral squamous cell carcinoma, NUSAP1 knockdown has been observed to potentiate paclitaxel-induced apoptosis [46]. In our case, the under-expression of NUSAP1 before NCT treatment was shown to be associated with better survival and pCR.

Conclusions
Differential expression analysis between pCR and no-pCR patients showed a profile constituted by 43 differentially expressed genes in tumor tissue samples collected after chemotherapy. Among these, overexpression of NUSAP1 and KIAA0101 was associated with poor prognosis in BC and with pathological response, opening the option for these genes to be used as prognostic markers of response to the NCT.
Furthermore, drugs that impair the cell cycle and DNA repair mechanisms or microtubule assembly during mitosis are potential candidates for a second line of treatment in those patients who do not reach the pCR after NCT.
Authors' contributions. GI M-G participated in the conception and design of the study, recruited patients, collected clinical data, interpreted and analyzed data, participated in the manuscript drafting, and contributed with a critical revision of the article. SK S-F participated in the conception and design of the study, performed the molecular biology experiments, interpreted and analyzed data, participated in the manuscript drafting, and contributed with a critical revision of the article. AF V-V contributed with the molecular biology experiments and interpreted and analyzed data. S C-H contributed with the design of the study, recruited patients, collected clinical data, and interpreted and analyzed data. P R-F contributed with data interpretation and analysis and in the manuscript drafting. J H-S-C participated in data interpretation and analysis and in the manuscript drafting. YX P-P participated in the molecular biology experiments and in the data interpretation and analysis. GS G-M contributed with the pathological analysis of biopsies and tumor samples and in the data analysis process. D D-G helped to recruit patients, collected clinical data, and participated in the data interpretation and analysis. J V-G contributed with the conception of the study and participated in the data interpretation and analysis. G B-S contributed with the molecular biology experiments and interpreted and analyzed data. A R-M participated in the interpretation and data analysis process, contributed to the drafting of the manuscript, and contributed with a critical revision of the article.
R O-L participated in the conception and design of the study, in the molecular biology experiments, in the data analysis and interpretation, in drafting the article, and in performing a critical revision of the article, and approved the final version to be published.
Figure 1 Heat map of differential gene expression profile in SS (pCR vs no-pCR). Blue areas represent low gene expression, while red represents high gene expression. The top row separates the no-pCR (blue) and pCR (red) patients. Each column represents a different sample and each row, a single probe. Official gene or probe symbols are displayed at the right-side margin.

Supplementary Files
This is a list of supplementary files associated with this preprint. SupplementaryMaterials.25JUL20.docx