Ppia is the most stable housekeeping gene for qRT-PCR normalization in kidney tissues of Pkd1 decient mouse models

Autosomal Dominant Polycystic Kidney Disease (ADPKD) is the most common inherited renal disorder, characterized by renal cyst development leading to end-stage renal disease. Although the appropriate choice of suitable reference is critical for quantitative RNA analysis, no comparison of frequently used “housekeeping” genes is available. Here, we determined the validity of 7 candidate housekeeping genes (Actb, Actg1, B2m, Gapdh, Hprt, Pgam1 and Ppia) in kidney tissues from mouse models orthologous to ADPKD, including a cystic mice (CY) 10–12 weeks old (Pkd1 ox/ox :Nestin cre /Pkd1 ox/- :Nestin cre , n = 10) and noncystic (NC) controls (Pkd1 ox / ox /Pkd1 ox / - , n = 10), Pkd1-haploinsucient (HT) mice (Pkd1 +/- , n = 6) and wild-type (WT) controls (Pkd1 +/+ , n = 6) and a severely cystic (SC) mice 15 days old (Pkd1 V/V , n = 7) and their controls (CO, n = 5). Gene expression data were analyzed using six distinct statistical softwares. The estimation of the ideal number of genes suggested the use of Ppia alone as sucient, although not ideal, to analyze groups altogether. Actb, Hprt and Ppia expression proles were correlated in all samples. Ppia was identied as the most stable housekeeping gene, while Gapdh was the least stable for all kidney samples. Stat3 expression level was consistent with upregulation in SC compared to CO when normalized by Ppia expression. In conclusion, present ndings identied Ppia as the best housekeeping gene for CY + NC and SC + CO groups, while Hprt was the best for the HT + WT group. people worldwide and the leading monogenic cause of end-stage kidney disease. Mutations in the PKD1 gene account for ~ 78% of the affected families mutations in PKD2 are detected in ~ 15% of them, with the remaining ones being genetically unresolved or associated with rare mutations in the GANAB and DNAJB11 gene expression performed in in models orthologous ADPKD. the rst assessment of candidate housekeeping genes for normalization of mRNA expression by RT-qPCR in kidney tissue of mouse models orthologous to ADPKD with distinct patterns of Pkd1 deciency. gene in reference genes beta-Actin, GAPDH, 18S B2M. anchors present investigation about the stability of seven candidate genes selected from previous studies (Actb, Actg1, B2m, Gapdh, Hprt, Pgam1 and Ppia) 34,35 24 31,36,37 38 in kidneys samples of Pkd1-decient noncystic and cystic mouse models. The stability of gene expression was analyzed using distinct statistical models, including a pairwise comparison model, geNorm, and an model, NormFinder. commonly used gene as endogenous control in RT-qPCR analyses, since its expression is usually constant. 23,24,38,39 25 Using a similar approach, a recent report evaluated six commonly used reference genes (Actb, B2m, Gapdh, Hmbs, Hprt and Ppia) to identify the most constantly expressed gene under the inuence of testosterone in rat. 40 This study showed that Hmbs and Ppia were the most stably expressed genes in the hypothalamus while Hmbs and Gapdh appeared to be the most stable genes in kidneys, indicating that in this setting the Gapdh expression prole was more stable than in our case. Concentrations of GAPDH, however, may vary among individuals, 41 during pregnancy, 42 according to developmental stage 43 44 and during the cell cycle. 45 Other reports also documented such a limitation. 46–49 50 . Therefore, the recognition that the expression of Gapdh may exhibit tissue-specic regulationis consistent with our results and emphasizes the need for validation of commercially available control assays.


Introduction
Autosomal dominant polycystic kidney disease (ADPKD) is manifested by bilateral development of multiple uid-lled epithelialderived cysts. 1 2 It is the most common Mendelian disorder of the kidneys, affecting 3-5:10,000 people worldwide 3 and the leading monogenic cause of end-stage kidney disease. 4 Mutations in the PKD1 gene account for ~ 78% of the affected families while mutations in PKD2 are detected in ~ 15% of them, 5 6 with the remaining ones being genetically unresolved or associated with rare mutations in the GANAB and DNAJB11 loci. 7 The elucidation of PKD1/PKD2-related biology has allowed major steps toward the understanding of ADPKD pathogenesis and the development of diagnostic tools and biomarkers. 8 The signi cant number of abnormally functioning pathways involved in the disease pathophysiology, in turn, opened a number of tracks to develop potentially speci c therapies. In addition to eGFR, the development of progression and prognostic markers has been also essential to guide clinical decisions in ADPKD. The classi cation based on total kidney volume growth 9 has been widely used as well as the multiple criteria developed by Cornec-Le Gal et al. 10 A combination of distinct biomarkers in a classi er including new biomarkers should further increase sensitivity and speci city. 11 The characterization and elucidation of speci c strategic transcriptional pro les in turn, are expected to expand the options of therapeutic targets and the number of useful biomarkers for the disease.
Gene expression analysis plays a central role on identifying and characterizing pathways involved in speci c phenotypes and diseases, potentially allowing elucidation of pathogenetic aspects and biomarker discovery. Reverse transcription quantitative polymerase chain reaction (RT-qPCR) is one of the most sensitive and reproducible means of quantifying RNA expression. 12 13 Nonetheless, to provide accurate expression analysis this method requires expression normalization of the genes of interest to a reference gene that is stable and not affected by experimental conditions. In this scenario, expression stability is a major criterion for housekeeping gene selection. 14 Reference genes are generally selected among housekeeping genes ubiquitously expressed and not transcriptionally affected by experimental conditions. 13 While the expression of some housekeeping genes is constant under certain conditions, it can signi cantly change in some circumstances 15 such as developmental stages, cell types and experimental conditions. 14, 16 17 13 It is therefore essential to characterize the suitability of candidate housekeeping genes to serve as appropriate internal mRNA expression controls under a given experimental condition where transcriptional effects are being investigated.
Glyceraldehyde-3-phosphate dehydrogenase (Gapdh), actin beta (Actb) and 18s ribosomal RNA (18s rRNA) are the most common reference genes. In kidney tissues of Pkd1-de cient mouse models, Gapdh has been the most often used endogenous control for gene expression studies. 18 However, its reliability in this context has not been demonstrated yet. Moreover, the indiscriminate use of these genes may be inappropriate, as they have been implicated in disease processes 19 including ADPKD 20 . Therefore, whether the expression of housekeeping genes is or is not stable in kidney tissues of Pkd1-de cient mouse models remains unknown. Determining the stability of genes known to be involved in ADPKD and used as housekeeping is particularly important.
To elucidate this issue and bring appropriate information to PKD experimental studies involving kidney transcriptional pro les, we aimed to identify the most stable housekeeping controls from a panel, namely seven candidate genes commonly used as endogenous controls in kidney disease, not limited to ADPKD: Actb, Actg1, B2m, Gapdh, Hprt, Pgam1, and Ppia., to be considered in kidney RT-qPCR obtained from Pkd1-de cient mouse models.

Results
Quantitative gene expression analyses: the rst step to identify optimal housekeeping genes for RT-qPCR studies of renal tissue in Pkd1-de cient mouse models We applied a stepwise strategy to select and evaluate candidate housekeeping genes with the aim of identifying optimal housekeeping genes for RT-qPCR analyses of kidney tissue in three Pkd1-de cient mouse models: Pkd1 +/-[haploinsu cient (HT), noncystic] and its control Pkd1 +/+ (wild type, WT); Pkd1 ox/ ox :Nestin cre and Pkd1 ox/-:Nestin cre (cystic, CY) and their controls Pkd1 ox/ ox and Pkd1 ox/-(noncystic, NC); and Pkd1 V/V [severely cystic, SC) and its control Pkd1 +/+ (CO, noncystic). Such steps are outlined in Fig. 1.
Raw RT-qPCR data in renal tissue were obtained for the Actb, Actg1, B2m, Gapdh, Hprt, Pgam1 and Ppia candidate housekeeping genes for each of the Pkd1-de cient mouse models. The corresponding median Ct values are shown in Fig. 2 [25.38 (1.32)], in turn, displayed moderate (Ct between 20 and 25) and relatively low expression (Ct above 25), respectively (Fig. 2). This process was preceded by assessment of RNA quality, which showed that RIN (RNA Integration Numbers) ranged from 5.9 to 8.8 among the analyzed samples (Supplementary Table S4).
Separate assessments and comparisons involving the renal cystic mice and their corresponding controls were performed, including only the Pkd1 ox/ ox :Nestin cre and Pkd1 ox / ox genotypes as well as both cystic genotypes and corresponding controls: Pkd1 ox/ ox :Nestin cre , Pkd1 ox/-:Nestin cre , Pkd1 ox/ ox and Pkd1 ox / - (Supplementary Tables S2 and S3). The results yielded by the two analyses did not signi cantly differ, showing similarity and stability of genetic background between both model pairs. Of note, this pattern was also observed between such model pairs in a previous study of ours directed to identifying housekeeping genes for microRNA expression analysis. 21 In this context, in the current study we analyzed this cystic model merging Pkd1 ox/ ox :Nestin cre and Pkd1 ox/-:Nestin cre mice in the CY group and Pkd1 ox/ ox and Pkd1 ox /in the NC group.
Analysis of expression stability of the candidate housekeeping genes The following step in the process of selecting the best housekeeping genes was the assessment of expression stability for each of the candidates, taking into account the related genetic backgrounds. This task was accomplished using the NormFinder, GeNorm, RefFinder, ΔCt method, Bestkeeper and DataAssist software packages (Fig. 2), which allowed determining their relative expression stability and generating a ranking among the best ones. The most stably expressed genes in the kidney samples of the three Pkd1-de cient mouse models were de ned by the lowest variability values observed in each condition of interest. It must be pointed out that the results for the CY-NC model did not differ between the two and four-genotype analyses (Supplementary Tables S2 and S3).
All candidate housekeeping mRNAs presented M values below 1.5, the GeNorm set threshold, ndings that are consistent with stability. 22 The Bestkeeper software points out inconsistency when SD is higher than 1.0, a pattern detected only for Hprt and Pgam1 in the SC sample groups (Supplementary Table S2). NormFinder evaluates expression stability by analyzing the intraand intergroup transcriptional variation of the candidate reference genes. An SD value lower than 0.5 is recommended to consider a gene a suitable housekeeping. Ppia, Hprt, Pgam1 and B2m had SD values below 0.5, while the Actb, Actg1 and Gapdh values lied above 0.5 in all sample groups. Moreover, Pgam1 and B2m displayed SD higher than 0.5 in the SC group, B2m in CO and SC + CO, and Gapdh in CY, NC, HT, WT, SC e SC + CO groups. These results suggest that these candidate genes should not be considered suitable housekeepings in the mentioned groups and settings.
Based on the different used algorithms and considering all ranks yielded by the referred analyses, Ppia appeared as the most stably expressed housekeeping gene for analyses including all groups, as well as for the CY, SC, CY + NC and SC + CO groups. The data that supported this conclusion were yielded by NormFinder (SD = 0,22), GeNorm (M value = 0,53), RefFinder (Geomean = 1,00), ΔCt method (Mean SD = 0,80), Bestkeeper (CV = 3,24 and SD = 0,54) and DataAssist (Score = 0,68) (Supplementary Table  S2). Pgam1, on the other hand, was the most stably expressed housekeeping gene for analyses involving the WT + CO groups, Actb for the NC group, and Hprt for analyses including the HT + WT groups as well as only HT animals (Table 1).
Taken together, our data ranked Ppia and Hprt as the most stable candidate housekeeping genes. Gapdh, on the other hand, was deemed as the least stable one, being associated with the following values: NormFinder (SD = 0.  Table S2).
Since the selection of the most suitable housekeeping gene depends on the Pkd1 de ciency model, it is important to conclude that Ppia was identi ed as the best housekeeping gene for the CY-NC and SC-CO pairs of samples while Hprt was the most suitable for HT-WT. Therefore, and interestingly, Ppia seems to be the best housekeeping in analysis involving models with cystic phenotypes ( Table S2). Pgam1, in turn, appears as the most appropriate housekeeping gene when comparing noncystic controls (WT, NC and CO) ( Table 1).

Analysis of the best combination of housekeeping genes
The use of combined housekeeping genes is a strategy often employed to improve comparisons of relative target gene expression among groups. A comprehensive analysis of potential combinations of evaluated candidates, therefore, should be performed to establish the best anchors to quantify and compare gene expression. Ultimately, this approach constitutes a way of minimizing possible outliers and inappropriate differences between measurements.
The software packages recommend at least two genes to be used together as housekeeping genes. The best combination of candidate housekeepings for each of our group of samples, based on analyses yielded by different software packages, is shown in Table 2. Such analyses revealed that the Actb + Hprt and Hprt + Ppia pairs are the combinations most often identi ed as the best housekeeping gene options for comparisons including all groups, with the same statistical weight; Actg1 + Ppia was the most frequent pair for the CY group; Actb + Pgam1 for NC and WT; Hprt + Pgam1 for HT; Actb + Ppia or Actg1 + Ppia for SC; and Actg1 + Pgam1 for the CO group.
Comparisons of the different Pkd1-de cient kidney tissues with their respective controls identi ed Pgam1 + Ppia as the best pair of housekeeping genes for CY and NC, B2m + Hprt for HT and WT, and Actg1 + Ppia for SC and CO. Additionally, Actb + Hprt + Ppia was the most frequent best trio identi ed for analyses including all groups; Actg1 + Pgam1 + Ppia for CY and NC, Actg1 + B2m + Hprt for HT and WT, and Actb + Actg1 + Ppia or Actg1 + Hprt + Ppia for SC and CO (Table 2).
Notably, the identi ed best pair and trio combinations of housekeeping genes provided further support to the association of Ppia and Pgam1 with the cystic and non-cystic phenotypes, respectively. Moreover, considering all samples together Ppia remains the most stable candidate housekeeping gene. Genes in bold: higher frequency of appearance in softwares. Determination of the optimal number of housekeeping genes The use of an optimal number of reference genes is important to save samples and primers, select and validate housekeeping genes, classify samples, group genes and monitor time-dependent processes. While the selection and use of stable housekeeping genes is often limited to only one, in most of such cases the stability of gene expression analysis could improve with the inclusion of one or more additional housekeeping controls. In this scenario, in the current study we established the minimum number of housekeeping genes that should be used in gene expression analysis involving each of our groups, each model with its respective control, and all groups.
The GenEx software package was used to calculate the Acc.SD for the seven candidate housekeeping genes and to determine the optimal number of reference genes to be used for each dataset (Fig. 3). Based on the 0.15 Acc.SD cut-off, Ppia was identi ed as the most stable candidate gene for normalization when considering all sample groups (All; Fig. 4). The addition of another gene led to a cut-off greater than 0.15, demonstrating that the insertion of a second normalizer in the gene expression analysis would increase the Acc.SD. Moreover, the addition of more than one gene led to a cut-off greater than 0.2, indicating that the insertion of a third gene would increase even more the Acc.SD. Applying a global analysis and based on the lowest Acc.SD value, the best normalizers for each sample group were found to be: two (Actg1 + Ppia) was the optimal number of housekeeping genes to be considered for the CY group, three (Actb + Hprt + Pgam1) for the NC group, two (Hprt + Pgam1) for HT, two (Actb + Pgam1) for the WT group, one (Ppia) for SC, and three (Actg1 + Gapdh + Pgam1) for the CO group. Importantly, we observed that different numbers of housekeeping genes should be used when comparing the renal tissue of each Pkd1-de cient model with its respective control: one (Ppia) for CY + NC, two (B2m + Hprt) for HT + WT, and two (Actg1 + Ppia) for SC + CO.
Intergene expression normalization among the top three candidate housekeeping genes An optimal housekeeping gene should present expression stability, displaying low expression variability with respect to other housekeeping genes. This property can be evaluated by normalizing their expression by each other's. To evaluate this aspect for the best candidate housekeeping genes, we used as the third strategy different algorithms to normalize the expression levels (Ct) of the top three candidates (Actb + Hprt + Ppia) to each other's expression (Fig. 4). Their expression levels did not differ between any two-sample groups. These results suggest that any of these three genes are suitable to be employed as housekeeping genes among the analyzed samples. Interestingly, the presence of Ppia as a housekeeping gene decreased the gene expression dispersion of both Actb and Hprt (Fig. 4).

Correlation of mRNAs expression between pairs of the top three candidate housekeeping genes
Pair correlation analysis between the best housekeeping genes can provide options to choose relevant housekeeping genes for gene expression studies. This approach was applied to our models orthologous to ADPKD associated with different pro les of Pkd1 de ciency, displaying or not a cystic phenotype. Such correlation analyses were performed using the mRNAs expression data yielded by all evaluated kidney samples. The expression levels (Ct) of the three best candidate housekeeping genes revealed a very strong correlation between Hprt and Ppia (ρ = 0.89, p < 0.05, Fig. 5) and a strong correlation between Actb and Hprt (ρ = 0.84, p < 0.05, Fig. 5). In addition, a moderate correlation was observed between Actb and Ppia (ρ = 0.81, p < 0.05, Fig. 5). These results showed that the Actb, Hprt and Ppia expression pro les are correlated in all the samples herein evaluated and can be used together as suitable housekeeping genes.
Validation of the best candidate housekeeping genes by normalizing expression of Stat3 target gene In order to validate the expression stability of the three best candidate housekeeping genes, the relative expression of the Stat3 target gene was assessed using different combinations of Actb, Hprt and Ppia normalizers (Fig. 6). Stat3 was selected for this purpose because it is a well-de ned gene with increased expression in ADPKD and in murine polycystic kidney disease models.
The Stat3 expression level was consistent with upregulation in SC kidneys compared to its control CO when normalized by Ppia expression (Fig. 6C). Trends of upregulation were also observed in SC kidneys compared to CO when normalized by other combinations of the best candidate housekeeping genes (Figs. 6B, 6D, 6E, 6F and 6G). Trends of Stat3 upregulation in CY compared to their control kidneys were also observed when its expression was normalized by different combinations of Actb, Hprt and Ppia (Figs. 6A, 6B, 6D, 6E, 6F and 6G). In contrast, Stat3 expression was not statistically different between HT and WT kidneys regardless the housekeeping gene combination used for normalization (Fig. 6). Our results also suggest that the SC kidneys, and likely their controls CO, have expression pro les slightly different from the other analyzed mouse model kidneys. Taken together, the use of Ppia showed to be the most suitable housekeeping gene among all considered sample groups.
Many studies with other animal models, including kidney tissues, have used Gapdh as the housekeeping gene, assuming unaltered expression. 20,23,24 25 Given the weakness of Gapdh as a housekeeping gene observed in our models, it is essential that this gene be analyzed and compared to other potential housekeeping gene normalizers. Indeed, when we analyzed Stat3 expression using Gapdh as the housekeeping normalizer, we did not nd different expression levels between CY and NC, HT and WT, and SC and CO (Supplementary Figure S2). These nding are in disagreement with the trends of Stat3 upregulation detected in CY kidneys compared to NC and in SC compared to CO evidenced when our housekeeping gene combinations were applied.
Expression of the best candidate housekeeping genes does not correlate with the kidney weight/body weight ratio in CY mice Since analyses involving mouse models orthologous to human ADPKD often require assessment of severity of the cystic phenotype or cyst growth, we reasoned that the identi ed housekeeping genes should have their stability evaluated with respect to the renal cyst burden, re ected in kidney size/weight. This approach could allow the investigation whether the disease stage, progression and severity signi cantly modify the expression patterns of potential housekeeping genes. To accomplish this task, we sought for potential correlations between the expression levels of Actb, Hprt and Ppia and kidney weight/body weight ratio (KW/BW) in the CY group. The expression levels of Actb, Hprt and Ppia, however, did not correlate with KW/BW (ρ=-0.05, p < 0.91, Supplementary Figure  Discussion PKD1/PKD2 pathogenic mutations deregulate a number of downstream pathways that aberrantly affect major cellular properties and contribute to the ADPKD phenotype. Such pathways encompass mTOR, cMYC, STATs, cAMP and intracellular calcium homeostasis, promoting cyst epithelial cell proliferation, apoptosis, epithelial secretion, and cellular metabolic rewiring 26 which in turn constitute logical targets for the development of therapeutic interventions. Given there is no widely accepted single model of ADPKD for therapeutic testing at present, most agents are tested in multiple preclinical models. 27 Since orthologous animal models with slowly progressive renal cystic disease reproduce more closely the human disease, they are more appropriate to unravel mechanisms underlying ADPKD pathogenesis and generate useful biomarkers. 28 To systematically evaluate the complexity of target gene expression analyses in in vivo models orthologous to ADPKD and propose the best standards for such studies, we aimed to establish the most appropriate housekeeping genes to be employed in three mouse models with distinct pro les of Pkd1 gene de ciency, in different experimental scenarios. While HT mice show almost exclusively Pkd1 +/− renal cells but do not display cysts, reproducing the background cell environment found in ADPKD type 1 patients, CY mice have renal cysts presumably formed by Pkd1 -/cells, reproducing the ADPKD type 1 cystic phenotype and its expected consequences. Notably, CY mice have preserved GFR at the evaluated ages (12 weeks). To complete the diverse phenotypes associated with Pkd1 de ciency, we included in our systematic assessment, an early, severely cystic animal model, the Pkd1 V/V mouse (SC).
While the use of non-validated endogenous control genes in gene expression studies results in unreliable data, a universal, invariably expressed gene is unlikely to exist. 29 This hypothetical control gene, in fact, may not even exist within individual tissues and cell types. In this context, a more adequate purpose is to identify the most reliable gene or set of genes to be applied in each experimental setting. Although several studies have evaluated gene expression pro les to identify new biomarkers and therapeutic targets for ADPKD, 30 26 to the best of our knowledge, no study to date has addressed the appropriateness of housekeeping gene usage in animal models orthologous to ADPKD. The present study identi ed Ppia as the best housekeeping gene for CY + NC and SC + CO groups, while Hprt was the best for the HT + WT group.
Cui et al (2009) 31 have previously analyzed endogenous genes as potentially useful housekeeping genes for analyses of target gene expression in kidney tissue samples of cpk mice, a well-characterized recessive cystic kidney disease model. 31 They studied the expression of 16 commonly used housekeeping genes in seven mildly and seven severely affected whole kidney tissue samples using TaqMan RT-qPCR assays and Affymetrix GeneChip arrays, normalized and tested for overall variance and equivalence of the means. Both statistical approaches and both TaqMan-and GeneChip-based methods converged on 3 out of the 4 top-ranked genes (Ppia, Gapdh and Pgk1) that displayed the most constant expression levels across the assessed phenotypes. Such results led to the conclusion that a combination of the top-ranked genes would provide suitable endogenous internal control for gene expression studies in cpk kidney tissues across a wide range of disease severity. The different disease etiology, however, precludes the application of these data to gene expression studies performed in in vivo models orthologous to ADPKD. The rationale of the current study was to run the rst assessment of candidate housekeeping genes for normalization of mRNA expression by RT-qPCR in kidney tissue of mouse models orthologous to ADPKD with distinct patterns of Pkd1 de ciency. Leal et al (2016) 32 has successfully evaluated the suitability of reference genes in meniscus samples in pathological and control conditions using the software packages applied in our current analysis. 32 These investigators revealed HPRT1 as the best single reference gene but as expected, showed that two or more reference genes should be used for gene expression normalization instead. According to the samples tested and proposed comparisons, the appropriate housekeeping gene combination included HPRT1 + TBP, HPRT1 + GAPDH, or HPRT1 + TBP + GAPDH. Other authors suggested that PPIA and RPS13, especially in combination, were the best suitable references to normalize gene expression in ccRCC tissues as compared to classical reference genes such as beta-Actin, GAPDH, 18S or B2M. 33 Therefore, such reports served as anchors for our present investigation about the stability of seven candidate genes selected from previous studies (Actb, Actg1, B2m, Gapdh, Hprt, Pgam1 and Ppia) 34,35 24 31,36,37 38 in kidneys samples of Pkd1-de cient noncystic and cystic mouse models. The stability of gene expression was analyzed using distinct statistical models, including a pairwise comparison model, geNorm, and an ANOVAbased model, NormFinder.
Each of the applied algorithms ranked the best candidate reference genes, identifying Ppia as the most stable and reliable housekeeping gene, while Gapdh was least stable for all kidney samples. Gapdh is the most commonly used gene as endogenous control in RT-qPCR analyses, since its expression is usually constant. 23,24,38,39 25 Using a similar approach, a recent report evaluated six commonly used reference genes (Actb, B2m, Gapdh, Hmbs, Hprt and Ppia) to identify the most constantly expressed gene under the in uence of testosterone in rat. 40 This study showed that Hmbs and Ppia were the most stably expressed genes in the hypothalamus while Hmbs and Gapdh appeared to be the most stable genes in kidneys, indicating that in this setting the Gapdh expression pro le was more stable than in our case. Concentrations of GAPDH, however, may vary among individuals, 41 during pregnancy, 42 according to developmental stage 43 44 and during the cell cycle. 45 Other reports also documented such a limitation. 46-49 50 . Therefore, the recognition that the expression of Gapdh may exhibit tissue-speci c regulationis consistent with our results and emphasizes the need for validation of commercially available control assays.
In line with the aforementioned raised concerns, our ndings revealed that the selection of appropriate housekeeping genes and consequent use of Ppia as the reference, allowed the detection of differences between groups regarding Stat3 expression, thought to be upregulated in kidney cysts and associated with PKD progression in mouse models. 8,51−56 57 On the other hand, the normalization using Gapdh showed no difference. Interestingly, despite the different ages between the CY and SC groups, normalization using Ppia showed very close expression pro les in these two groups, decreasing the expression dispersion observed with Actb and Hprt. These data suggest that Ppia is a good expression normalizer for models associated with renal cystic phenotypes, a piece of information that may be highly relevant to investigate and compare different cystic disease stages, progression and severity. On the other hand, Pgam1 was associated with the Pkd1-haploinsu cient non-cystic model, suggesting that this might be a more appropriate normalizer to be used in studies involving Pkd1-de cient non-cystic renal phenotypes.
One limitation of the present study relies on the selection of only seven candidate housekeeping genes to be tested for validation, a process that was based on data available in previous studies. We believe, therefore, that further research is still required is this eld, since the evaluation of additional candidate housekeeping genes in additional Pkd1-orthologous mouse models and experimental conditions are likely to improve the speci cies of gene normalization in this biological scenario. Moreover, the pipeline adopted in the present study should be further tested using other models as well.
In conclusion, our ndings established Ppia as the most appropriate housekeeping gene for comparisons involving a cystic model and its respective control, Hprt for non-cystic Pkd1-haploinsu cient and wild-type mice, and Ppia for a severely cystic model and its corresponding wild-type control. Overall, our analysis identi ed Ppia as the best and the most stable housekeeping gene in the comprehensive Pkd1-de ciency scenario, while Gapdh was the least stable in the three evaluated mouse models. Such data will allow more robust and reliable analyses of target gene expression in kidney tissue of Pkd1-de cient mouse models, a reality that will contribute to the elucidation of the role of different genes in different scenarios related to ADPKD or Pkd1 biology. Normalizing to a suitable housekeeping gene or sets of them can not only remove artifactual differences due to sampling and quality of mRNA but also identify real changes in gene expression levels.

Ethical statement
The authors con rm that all experiments were carried out in accordance with ARRIVE guidelines and regulations (https://arriveguidelines.org). The ethical approval for all animal care and procedures were carried out in accordance with relevant guidelines and regulations by the Ethic Committee on Animal Use of the Federal University of Sao Paulo -Brazil (CEUA/UNIFESP), protocol number CEUA 4558140219.

ADPKD mouse models
The three mouse models were maintained on the C57BL/6 strain background, an important requirement for the performance of our study. Two models were evaluated at 10-12 weeks of age: a renal cystic mouse (Pkd1 ox/ ox :Nestin cre /Pkd1 ox/− :Nestin cre , CY, n = 10; and their corresponding noncystic controls Pkd1 ox / ox /Pkd1 ox / − , NC, n = 10) and a Pkd1-haploinsu cient noncystic mouse (Pkd1 +/− , HT, n = 6) and its respective wild-type control (Pkd1 +/+ , WT, n = 6). The third model was assessed at 15 days of life due to its severely renal cystic phenotype (Pkd1 V/V , SC, n = 7) along with its same-age wild-type control (CO, n = 5). It must be noted that the WT and CO controls harbor the same genotype (Pkd1 +/+ ) but were analyzed at different ages. Only kidneys from male animals were analyzed in order to avoid potential gender-related experimental variability. The mice were genotyped using speci c PCR reactions. 21 The CY mice, homozygous for a Pkd1oxed allele (Pkd1 ox ) or compound heterozygous Pkd1 ox/-, display a mosaic pattern of full gene inactivation induced by a Nestin-Cre transgene through excision of exons 2-4 (Pkd1 ox/ ox :Nestin cre ). 38,58,59 60 The HT model is heterozygous for a Pkd1 null allele, characterized by early transcriptional interruption, 61 60 and develops no renal cysts by 12 weeks of age. The SC model is homozygous for the Pkd1 knockin T3041V allele, which prevents the autoproteolytic cleavage of PC1 at the GPS site. 62 60 Pkd1 V/V animals have no gross phenotype by postnatal day (P) 6 but develop rapid and progressive distal nephron cysts thereafter. This severe renal phenotype, which eventually leads to renal failure, is responsible for the early mortality that occurs between the 2nd and 6th week. 62 The mice were fed ad libitum and housed at constant ambient temperature in a 12-hour light cycle. Animal procedures were approved by the Internal Biosafety Commission of Genetically Modi ed Organisms of the University of São Paulo School of Medicine and by the Universidade Federal de São Paulo (UNIFESP) Research Ethics Committees. SC (Pkd1 V/V ) and its wild-type control (CO) animals were euthanized by cervical dislocation, whereas the other animal groups were euthanized with intraperitoneal thiopental (0.4 mg/g of body weight). Their kidneys were appropriately harvested for RT-qPCR analyses. All experiments were conducted in accordance with international standards of animal care and experimentation. Both kidneys were collected and stored at − 80°C for further use. 21 Housekeeping genes Using the terms "genes" and "polycystic and reverse transcriptase polymerase chain reaction" in the PubMed search, we found seven articles employing tissue samples from human ADPKD and animal ADPKD-orthologous kidneys. We also included housekeeping genes selected from other tissues in the literature, 24,31,34−37 38 namely Actb, Actg1, B2m, Gapdh, Hprt, Pgam1 and Ppia. All of these seven genes are constitutively expressed in kidney tissue of mouse models orthologous to ADPKD, have independent cellular functions, and are assumed not to be co-regulated.

RNA extraction
Renal tissue lysis was performed using zirconia beads (Interprise, USA) and the Precellys (BioAmerica, USA) homogenizer. TRIzol (Life Technologies, USA) was employed for total RNA extraction according to the manufacturer's protocol. The RNA quantity and purity were determined using the NanoVue spectrophotometer (GE Healthcare Life Sciences, USA) and analyzed with the Agilent 2100 Bioanalyzer 6000 Nanochip (Agilent Technologies Inc., Waldbronn, BW, Germany). Total RNA was stored at -80°C until further use. 21 cDNA preparation and RT-qPCR Complementary DNA (cDNA) was synthesized from 2 µg of total RNA using the High-Capacity cDNA Reverse Transcription Kit (Applied Biosystems), according to the manufacturer`s instructions.
Gene expression was performed in triplicate using SYBR Green (Applied Biosystems) in the QuantStudio7 qPCR system according to the manufacturer`s instructions. Primer sequences for the 7 candidate housekeeping genes and the target gene are shown in Supplementary Table S1. Gene expression quanti cation was performed in the same run for each sample to eliminate technical variation.

Analysis of housekeeping gene expression stability
The tissue samples were classi ed into 7 distinct groups: (1) 12 and RefFinder (https://www.heartcure.com.au/re nder/) 65 following the authors' recommendations. These software packages were used to determine the relative expression stability of the candidate housekeeping genes and to generate a ranking for the best ones.
The NormFinder software is a Microsoft Excel-based application that evaluates the expression stability of candidate reference genes. The stability value is calculated by analyzing their intra-and intergroup transcriptional variation. A lower variation in the expression levels indicates more stable gene expression (low stability value). 63 The GeNorm program determines the gene expression stability value M according to the average pairwise variation between one particular gene and all other candidate genes. The most stable gene expression yields the lowest M value. 22 The BestKeeper program calculates the standard deviation (SD) and the coe cient of variance (CV) of the Ct levels, and applies the Pearson correlation coe cient to estimate the intergene relations of all possible candidate gene pairs. 64 The DataAssist software, in turn, determines a score for the best reference genes based on the GeNorm algorithm. It uses the RQ to calculate the stability value of each gene. A lower score indicates more stable expression (Thermo Fisher, USA). The comparative ΔCt method is based on a comparison of the relative expression of possible gene pairs within each sample. Gene stability is ranked according to the reproducibility of the gene expression differences in the analyzed samples. 12 Lastly, the RefFinder software integrates the NormFinder, GeNorm, BestKeeper, and the comparative ΔCt method assigning an appropriate weight to each gene, calculating the geometric mean of these weights, and generating a rank of the best candidate reference genes. 65 The optimal number of reference genes was selected using the GenEx software package.

Statistical analysis
The Shapiro-Wilk test showed that the Ct values of the candidate housekeeping genes were not normally distributed, so that the results are expressed as median and interquartile range (IQ). The ΔCt values were determined by: Ct(target gene) -Ct(housekeeping gene) or Ct(target gene) -mean[Ct(housekeeping gene 1);Ct(housekeeping gene 2)].
To evaluate potential expression differences of Actb, Actg1, B2m, Gapdh, Hprt, Pgam1, Ppia and Stat3 among the sample groups, the Kruskal-Wallis with Dunn's post-hoc test, followed by FDR correction using the Benjamini-Hochberg method, was used. Comparisons between the Pkd1-de cient model model and its respective control were performed with the Mann-Whitney U test.
Lastly, the Spearman correlation test was employed to verify potential correlations among the expression levels of Actb, Hprt and Ppia, and between the expression levels of such genes and kidney weight in CY cystic kidney samples. A value between 0.30-0.50 was determined as a weak correlation, 0.50-0.70 as moderate, 0.70-0.90 as strong, and 0.90-1.00 as a very strong correlation. 66 Declarations Data Availability: All data including supporting datasets are made available as main gures or supplementary information les.
Work ow diagram illustrating the strategy used to identify the best housekeeping normalizer for RT-qPCR studies in Pkd1de cient kidneys. To proceed with appropriate comparisons, we included housekeeping genes selected from the available literature (Actb, Actg1, B2m, Gapdh, Hprt, Pgam1 and Ppia). Pkd1, polycystic kidney disease 1 gene.

Figure 2
Renal expression pro les of the seven selected candidate housekeeping genes in Pkd1-de cient mouse models. Lower threshold values (Ct) indicate higher expression levels. The median values are expressed as horizontal lines while the error bars represent the corresponding interquartile ranges. The Actb Ct values were the highest ones, re ecting the lowest expression levels whereas the Actg1 Ct values were the lowest, indicating the highest expression.

Figure 3
Optimal number of reference genes for each dataset according to GenEx analyses. Accumulated standard deviation (Acc.SD) for the ve candidate housekeeping genes in all groups of samples allows estimating the ideal number of genes for normalization. Acc.SD was calculated with the GenEx software package. Lower values of Acc.SD indicate the optimal number of reference genes. HT, Pkd1-haploinsu cient; WT, wild type; CY, cystic; NC, noncystic; SC, severely cystic; CO, control for the severely cystic phenotype.

Figure 4
The ΔCt values of the three best candidate housekeeping genes: Actb (A -C), Hprt (D -F) and Ppia (G -I). A lower threshold value (Ct) indicates a higher gene expression level. The median values are expressed as horizontal lines while the error bars represent interquartile ranges. CY, cystic; NC, noncystic; HT, haploinsu cient; WT, wild type; SC, severely cystic phenotype; CO, control of severely cystic phenotype. Actb, target expression normalized by Actb; Hprt, target expression normalized by Hprt; Ppia, target expression normalized by Ppia; Actb+Hprt, target expression normalized by Actb+Hprt; Actb+Ppia, target expression normalized by Actb+Ppia; Hprt+Ppia, target expression normalized by Hprt+Ppia. Comparisons between the Pkd1-de cient model and its respective control were performed with the Mann-Whitney U test. P<0.05 was not detected for any of the comparisons.

Figure 5
Pair correlation matrix between the number of ampli cation cycles of the best housekeeping candidate genes Actb, Hprt and Ppia. Circle color and radius are proportional to the Spearman's rank correlation coe cient (signi cance level: p<0.05).