Evaluation of candidate reference genes stability for gene expression analysis by reverse transcription qPCR in Clostridium perfringens

Identification of stable reference genes for normalization purposes is necessary for obtaining reliable and accurate results of reverse transcription quantitative polymerase chain reaction (RT-qPCR) analyses. To our knowledge, no reference gene(s) have been validated for this purpose in Clostridium perfringens. In this study, the expression profile of ten candidate reference genes from three strains of C. perfringens were assessed for stability under various experimental conditions using geNorm in qbase + . These stability rankings were then compared to stability assessments evaluated by BestKeeper, NormFinder, delta Ct, and RefFinder algorithms. When comparing all the analyses; gyrA, ftsZ, and recA were identified within the most stable genes under the different experimental conditions and were further tested as a set of reference genes for normalization of alpha toxin gene expression over a 22-h period. Depending on the condition, rpoA and rho might also be suitable to include as part of the reference set. Although commonly used for the purpose of normalizing RT-qPCR data, the 16S rRNA gene (rrs) was found to be an unsuitable gene to be used as a reference. This work provides a framework for the selection of a suitable stable reference gene set for data normalization of C. perfringens gene expression.


Results
Selection of the candidate reference genes and their amplification efficiencies. Ten genes were identified as potential reference genes from Clostridium species based on their previous use in this or other bacterial species. DNA sequences for eight candidate reference genes previously evaluated for C. difficile (adk, gyrA, rho, rpsJ, tpiA, gluD (gdhA in C. perfringens), rpoA, rrs) 10 were acquired from NCBI for four Clostridium perfringens strains (ATCC 13,124, strain 13, EHE-NE18, and SM101) and aligned in MegAlignPro in order to identify highly (100%) conserved areas suitable for primer design. Two additional genes (ftsZ and recA) were chosen as they have been previously validated as suitable reference genes for other bacteria 21 . Species-specific primers for these targets were designed using Primer3 22,23 (Table 1). Primers sets for rpoA and rrs were used as previously described as they had not been previously validated 7,24,25 .
Primer melting temperatures (Tm) are listed in Table 2 and corresponded to predicted Tms. Amplification specificity was confirmed for all primer pairs by visualizing single PCR amplicons of the correct size on agarose gel electrophoresis (Supplementary Fig. S1). Additionally, single peaks were observed for all 10 primer sets from  : TGG AAA ATG TGA TCT TTG CGGA  173  This study  R: AGA CCT CGT TTA TTG CTT GTGT   gyrA  DNA replication  F: GAG GTG GTA AGG GAA TAC AAGC  167  This study  R: TGT TCC CTT AGC AGT TCT TCCT   recA  DNA replication  F: CTG GTA AAA CAA CAG TGG CTTT  167  This study  R: AGC TTG TTC TCC TGT ATC TGGT   rho  Transcription  F: TGA AAG ACC AGA AGA AGT AACGG  150  This study  R: ACA TCT CTT CCT TGT TCA  www.nature.com/scientificreports/ dissociation curve analysis ( Supplementary Fig. S2) indicating to absence of primer-dimers and non-specific amplification products. Amplicons for each of the ten candidate reference genes were cloned into a cloning vector in order to generate standard curves for the calculation of amplification efficiencies. Amplification efficiencies for these genes ranged from 1.72 (gdhA) to 1.86 (rpsJ) with regression coefficients (R 2 ) of 0.99 ( Table 2).

Establishment of experimental conditions
We chose to evaluate three varied growth conditions to simulate potential environments the bacterial cells could encounter during pathogenesis. The first condition (referred to as 'incubation period') assessed expression of the candidate reference genes after 4 h and 9 h of growth. These time points were chosen to approximate log phase and stationary phase growth; however, due to logistical constraints, real-time bacterial densities could not be monitored during the trial. For this trial, the density of bacterial growth equaled an OD 600 of 0.15, 0.3, and 0.78 respectively for CP-1, CP-2, and CP-4 for the 4-h time point and an OD 600 of 1.2, 1.3, and 1.4 respectively for CP-1, CP-2, and CP-4 for the 9-h time point.
In the second trial, two additional conditions were assessed. One condition (referred to as 'oxygen shock') evaluated expression of the candidate reference genes after a 30-min exposure to an aerobic environment following 4 h of growth under normal anaerobic conditions. No measurements of oxygen status were recorded. The other condition (referred to as 'heat shock') evaluated expression of the candidate reference genes after a 30-min exposure to 50 °C following 4 h of growth under normal 37 °C growth.

Expression stability of candidate reference genes
The stability of transcription levels of the candidate reference genes under different stress conditions was initially evaluated by geNorm 13 within the qbase + software package, version 3.2 (Biogazelle, Zwijnaarde, Belgium-www. qbase plus. com). These results were then compared to analyses by additional algorithms─ BestKeeper 15 , comparative delta Ct 26 , normFinder 14 , and RefFinder 16 .
A geNorm pilot experiment utilizing all 10 candidate reference genes produced M-values ranging from 0.69 (gyrA) to 2.0 (gdhA) when assessing stability only for the incubation period experimental condition (Fig. 1). Genes with medium stability, defined as having average geNorm M-values between 0.5 and 1.0, are often seen when heterogeneous sets of samples, such as our bacterial cultures, are evaluated 27 . Four genes fall into the category of medium stability when evaluating the incubation period condition (simulating log and stationary phase)-ftsZ, gyrA, recA, and rpoA or heat shock condition-ftsZ, gyrA, rpoA, and rpsJ (Fig. 1). Under our conditions, oxygen shock had the least influence on transcript levels of the candidate reference genes. Six out of the 10 candidate genes were considered stable after being subjected to oxygen shock with ftsZ, gyrA. recA, rho, rpsJ, and rpoA having M-values less than 1.0; gyrA, recA, and rho had M-values less than 0.5. Mean Cq values for an isolate and cycle variation, as indicated by standard deviation of the mean Cq values, across all conditions for a respective isolate are shown in Table 3. Average Cq values for any given gene would often differ by up to 10 Cq values between isolates for different growth/stress parameters indicating sample heterogeneity. The greatest variation is observed from all genes during comparisons during the incubation period trial (standard deviations range from 0.33 for CP-1 rrs to 6.47 for CP-2 rpsJ). The genes gyrA and recA showed the least variation in cycle number (standard deviation less than 1.0) for each isolate when assessing transcript levels during the oxygen shock and heat shock trials.
To compare the stability ranking determined by qbase + software to other algorithms, Cq data was submitted to the online tool RefFinder (as found at www. heart cure. com. au/ reffi nder/). RefFinder 16 is a comprehensive tool that integrates the output of other computational algorithms by assigning weight to each candidate genes then calculating a final ranking based on the geometric mean of the weighting value for each gene. In this process, the tool provides the output for each of the additional computational programs. Comparisons of the expression stability ranking calculated by RefFinder for geNorm, BestKeeper, NormFinder, and the delta Ct method, as well as the comprehensive RefFinder ranking are shown in Table 4. For most of the algorithms and conditions, the genes ftsZ, recA, and gyrA were consistently within the top four ranks. The BestKeeper 15 computational algorithm www.nature.com/scientificreports/ showed the most different ranking of the 10 genes under the incubation period condition compared to the other analysis platforms with rrs, rho, and gdhA identified as the three most stable genes. Two of these genes (rrs and gdhA) were identified among the least stable genes by the other platforms. RefFinder comprehensive rankings identified ftsZ, recA, and gyrA consistently within the top four ranks when comparing any of the conditions (incubation period, oxygen shock or heat shock, all conditions together) with geometric means of rank values between 1.41 and 3.71 depending on the growth/stress condition being evaluated. Under two conditions (incubation period and all conditions together), RefFinder identified rho as the third most stable gene with a geometric mean of the rank value at 3.36 and 2.51, respectively. Under these same conditions, rpoA was not determined to be within the top four most stable genes (rank 9 for incubation period and rank 5 for all conditions together).

Expression profile over time of the alpha toxin gene (plc)
It has long been recognized that alpha toxin production occurs during active cellular growth 17,28-30 , so evaluating the expression of plc at various time points during growth should allow us to evaluate the reliability of our selected reference genes for use in normalization of transcript levels. The incubation time points we evaluated were 2.5 h and 4 h for simulating early-and mid-log phase, and 6 h and 22 h for simulating early-and latestationary phase. Because of the variability in growth rates between our three C. perfringens isolates, it was difficult to achieve the desired growth phase targets (early-log, mid-log, early-stationary) for all three strains when cells were harvested at our experimental time points. The actual OD 600 of the cultures used for RNA harvest are shown in Supplementary Table S1. Expression of the plc gene relative to expression of this gene after 22 h of growth was assessed following normalization with our previously determined set of reference genes (ftsZ, recA, and gyrA), with these genes individually, with one of the least stable but reliably amplified genes (tpiA), and with rrs as a historically used reference gene (Fig. 2). Our set of reference genes and the same genes individually as reference genes produced similar fold changes in plc expression; however, using the set of reference genes more often produced more consistent results indicated by shorter error bars. When a Student's t-test was used to assess statistical significance (p ≤ 0.05) of the changes in expression at 4 h in relation to the 22-h time point, a maximal and significant upregulation of plc transcription was observed when employing ftsZ, gyrA, and recA either as a set or individually. Although the transcript levels decreased by 6 h, the level of plc expression was still significantly upregulated when normalizing with the reference set genes or ftsZ and gyrA individually, but not recA individually.
When tpiA, one of the less stably expressed candidate reference genes, was used for normalization; the same pattern of expression was observed but to a greatly depressed level. Expression of plc at the 4-h time point was maximal with 9 times less transcript observed when tpiA was used for normalization as compared to the same www.nature.com/scientificreports/ expression when normalization was performed using the set of three reference genes (~ sixfold increase vs 54-fold increase, respectively). None of the change in plc expression were found to be significant when tpiA was used in the normalization process. When plc expression was normalized with rrs, the apparent relative expression of plc decreased over time instead of increasing during active growth. As such, normalization with the 16S RNA gene produced a different expression profile for plc than was observed when using our reference set genes (either as a set or individually). Significant upregulation of plc at the 2.5-h time point was only noted when rrs was utilized for normalization.

Discussion
Clostridium perfringens causes a wide variety of diseases in humans and animals as a result of the toxins it produces 2 . These toxins are regulated by a complex network of signaling pathways and understanding how different environmental conditions modulate the expression of these toxins will provide us with a better understanding of the role of these toxins the development of disease. Reverse transcription qPCR is the gold standard method in studying expression of specific genes and virulence regulation under different environmental conditions 31 , and proper normalization is essential to achieve reliable results 8 . Under the most appropriate normalization strategy, selected reference genes should be verified to be stably expressed under the physiological state or experimental conditions under investigation 8 . To identify suitable genes for normalization of gene expression in C. perfringens, we performed the necessary validation analyses under three different conditions that could be relevant during pathogenesis.
In this study, we selected and evaluated ten candidate reference genes (adk, ftsZ, gyrA, gdhA, recA, rho, rpoA, rpsJ, rrs, and tpiA) for use in normalization of transcript levels in C. perfringens. As suggested by others, we selected genes representing different functional classes to minimize the likelihood of co-regulation 10,32 (Table 1). Of these genes; rrs, gyrA, recA, rpoA, rho, ftsZ, and adk have all been experimentally tested for use in other bacteria; however, the stability for each gene was not validated in all studies 21 . For instance, adk has been experimentally tested in 3 different organisms (Bacillus cereus group strains 33 , C. difficile 10 , and C. botulinum 12 ), yet it was only identified as stably expressed in C. difficile. Three of the 10 genes investigated (gyrA, rrs, and www.nature.com/scientificreports/ www.nature.com/scientificreports/ rpoA) have been used previously as an endogenous control for C. perfringens, although validity for these genes was not documented 7,17,18,20,34 . In our study, we determined the most suitable reference genes for normalization in C. perfringens to be ftsZ, gyrA, and recA. This determination was made by assessing the rankings of genes by all the available computational algorithms under the different growth/stress conditions and choosing the genes that ranked highest (within the top four) for the different conditions by each of the computational algorithms. The three genes of our proposed reference set ranked within the top spots under all conditions (analyzed separately and together) by geNorm, NormFinder, and RefFinder (Table 4). These three genes also ranked highest using the delta Ct analysis method when assessing stability under the incubation period trial and when results of all the experimental conditions were combined; whereas gyrA and recA ranked second and third, but ftsZ ranked fifth when assessing expression in the oxygen shock and heat shock experiment. The delta Ct and geNorm methods are similar in that a stepwise comparison of expression values is made to exclude or rank less favorably genes that fluctuate expression under different conditions. Other genes, especially rho and rpoA, occasionally were ranked in the top 4, but this ranking was not consistent between conditions and computational algorithms.
The 16S rRNA gene (rrs) is frequently used as a reference gene for normalization; yet it was not found to be stably expressed in this study. One reason for this instability is likely linked to the considerable variation in rrs transcript levels where they were often extremely abundant (Cq values < 10) and difficult to reproduce consistently as the high abundance makes it difficult to accurately set the baseline 13 . Another reason the rrs gene is not appropriate as a reference gene is that rrs transcript levels were quite abundant in relation to the transcript levels of the other genes including the plc gene as our gene of interest (Cq less than 10 vs. Cq values around 20), and its use is not appropriate as its abundance is overrepresented in relation to the total amount of mRNA in the samples 8 . In fact, when we used rrs as the internal control for normalization of plc, a decreasing upregulation of the alpha toxin gene over time was observed, which is not the same expression profile observed when using the genes in our proposed reference set (Fig. 2).
One hurdle of this study, but one that is likely common for researchers studying bacterial populations with a high degree of heterogeneity, is the inherent variability in gene expression between different strains of the same bacterial species. In C. difficile, a discrepancy between reference gene stability attributable to the heterogeneity of C. difficile ribotypes was observed 10 . In this study, we used three C. perfringens isolates representing three different toxinotypes─ types B, E, and G. Each isolate showed different growth kinetics with CP-2 (type E) growing most rapidly and CP-1 (type B) growing slowest. Similarly, each isolate differed in the expression levels at the different sampling points. To illustrate this variability in Cq values, we can look at recA and tpiA expression across all the conditions between the different strains. In the oxidative stress experiment, for recA transcripts, there was approximately a 3 Cq value spread between the three strains, but close values between the conditions within a strain as evidenced by the low standard deviations (average Cq ± SD values for CP-1, CP-2 and CP-4 are 20.26 ± 0.45, 23.83 ± 0.58, and 21.92 ± 0.30, respectively; Table 3). On the other hand, for tpiA in the same experiment, there was approximately a 6 Cq value spread between the three strains, but close Cq values between www.nature.com/scientificreports/ the conditions within a strain as evidenced by the low standard deviations (average Cq ± SD values for CP-1, CP-2 and CP-4 are 19.18 ± 0.32, 26.08 ± 0.26, and 22.91 ± 0.67, respectively; Table 3). For robust qPCR assays, the general rule-of-thumb is to have near 100% amplification efficiency as this indicates that the amount of the target doubles at each cycle 8 . The range for a good PCR is 90 -110% with a range of 20% (E = 1.9-2.1). A potential pitfall of this study is our amplification efficiencies are not within this range; however, they are close to each other. Our PCR efficiencies ranged from 72 to 86% with the efficiencies of our chosen reference set between 73% (gyrA) to 83.2% (ftsZ). This range is narrower than the range of efficiency for a "good" qPCR (10 vs. 20%). So although these efficiencies are lower than we would like, they are similar enough to result in accurate relative quantification 35 .
To test the ability of the proposed set of C. perfringens reference genes to be used for normalization, we applied ftsZ, gyrA, and recA transcript levels for normalization of the alpha toxin gene (plc) at different stages of growth. Ohtani and colleagues 4 noted that toxin gene expression in C. perfringens is unique as it is initiated much earlier during growth thereby reaching maximum expression during the exponential phase of growth rather than stationary phase of growth. Ohtani and colleagues 4 , by using Northern blot analysis, attributed this expression pattern to the agrBD system in C. perfringens rather than a traditional LuxS-based quorum-sensing mechanism. Using RT-qPCR, Abildgaard and colleagues 17 observed a similar induction of plc during early exponential phase with a subsequent decrease in expression occurring by stationary phase. This is the same expression profile observed in this study when using our proposed reference set for normalization. We found that using our proposed reference genes, either in combination or individually, produced similar expression trends over the tested time points from early exponential phase (2.5-h) to late stationary phase (22-h). The observance of the transcript level peaking supports previous findings that mRNA expression is induced rather than constitutive 17,29 .
Selection of stable reference genes for data normalization is required for reliable and accurate RT-qPCR analysis. In this study, we assessed potential reference gene stability of three different C. perfringens strains representing three different toxinotypes under three different experimental conditions that could have physiologic relevance. Overall, we recommend the use of gyrA, ftsZ, and recA as a set of reference genes for use in C. perfringens for normalization of gene transcripts in relative expression studies. While we know that ideal universal endogenous control genes have not been identified and likely do not exist 13 especially for species like C. perfringens that are heterogeneous nature, the proposed set of reference genes identified in this study will provide researchers with a good starting point for developing an accurate normalization strategy for their C. perfringens-based expression assay. Ideally, these reference genes should be confirmed as stable under other experimental conditions or with additional strains prior to their use.

Methods. Strains, media, and growth conditions
Two strains of C. perfringens were purchased from ATCC belonging to toxintype B (ATCC 3626, CP-1) and toxintype E (ATCC 27,324, CP-2). A third strain containing netB (C103-100, CP-4) was obtained from Auburn University (Ken Macklin, extension specialist and professor) that originated from the Mitchem-Sparks Regional Diagnostic Laboratory in Boaz, Alabama. This netB-containing isolate is classified as subtype G in the recently expanded toxin-based typing scheme 3 .
Bacterial cultures were grown on perfringens agar (HiMedia, Kennett Square, PA) for routine maintenance and in BHI (Criterion, Hardy Diagnostics, Santa Maria, CA) broth during experimental assays. An anaerobic environment was maintained in standard sized growth chambers containing gas-generating sachets with anaerobic indicators (Becton, Dickinson and Company; Franklin Lakes, NJ). Starter BHI cultures were inoculated from plates and allowed to grow anaerobically overnight (~ 16 h) at 37 °C. Trials were carried out as two experiments. In Experiment 1 for the incubation period trial, 10 ml BHI cultures were inoculated using 200 µl of the appropriate starter culture, then allowed to grow 4 h to approximate the logarithmic phase (target OD 600 0.3-0.4) or 9 h to approximate the stationary phase (target OD 600 1.2-1.4). After the appropriate time points, the cultures were divided into 2 ml aliquots, centrifuged at 3,260 X g for 4 min, then pellets were stored at − 80 °C. For experiment 2, the oxygen shock and heat shock trials, all cultures were inoculated and incubated as for the incubation period samples in Experiment 1. At 4-h, the control samples were divided into 1.5 ml aliquots, centrifuged at 14,000 × g for 2 min, then stored at − 80 °C or immediately processed for RNA extraction. The oxygen shock samples were removed from the anaerobic growth chamber and incubated aerobically at 37 °C for 25 min, after which time 1.5 ml aliquots were centrifuged at 14,000 × g for 2 min, then stored at − 80 °C or immediately processed for RNA extraction. For the heat shock samples, cultures were transferred to a 50 °C water bath for 30 minutes 36 , after which time 1.5 ml aliquots were centrifuged at 14,000 × g for 2 min, then stored at − 80 °C or immediately processed for RNA extraction.

Primers
Ten genes were identified as potential reference genes from the pertinent Clostridium sp. literature (Table 1). Eight C. perfringens-specific primers were designed for the genes utilized by Metcalf et al 10 for C. difficile. Primers were designed using Primer3web 22,23 with the following parameters: optimal primer size-20 bp (range 18 bp-23 bp); Tm-59 °C (range 57-62 °C), GC content-50% (range 30-70%). Using Primer-BLAST against the non-redundant (nr) nucleotide database, potential primers were assessed to check that C. perfringens entries were the only match at > 95% similarity for the primer set. Primers sets for rpoA and rrs were used as previously described 7,24,25 . All primers were synthesized by Integrated DNA Technologies (IDT, Coralville, IA) and purified by standard desalting methods.

RNA isolation and reverse transcription
Total RNA was isolated from the stored pellets of C. perfringens isolates harvested at logarithmic and stationary phases of growth using the Promega SV Total RNA kit (Promega Corporation, Madison, WI) following the manufacturer's protocol for Isolation of RNA from Gram-Positive and Gram-Negative Bacteria with a few modifications. Pellets were resuspended in 100 μl of freshly prepared TE containing 20 mg/ml lysozyme and incubated at room temperature for 10 min. Purified RNA was eluted with 100 μl nuclease-free water. A postelution DNase treatment was performed (TURBO DNA-free kit, Invitrogen). RNA concentrations were assessed spectrophotometrically using a Nanodrop 1000 (Thermo Scientific, Wilmington, DE) prior to normalization of RNA concentration. Concentration of normalized RNA was subsequently verified by Nanodrop to be within 10% of each other.

Reference gene validation
Ten µl reactions were prepared in duplicate for each C. perfringens cDNA with each of 10 potential reference genes using PowerUp SYBR Green Master Mix (Applied Biosystems) and 500 nM of each primer of the respective primer set. No reverse transcriptase controls were included for each RNA isolation. Amplification was performed in clear low profile plates using a CFX-96 Connect real time thermal cycler (Bio-Rad, Hercules, CA) with the following cycling parameters: 50 °C UDG activation for 2 min, 95 °C DNA polymerase unlock step for 2 min, followed by 40 cycles of 95 °C for 15 s, 61 °C for 15 s, 72 °C for 1 min with fluorescent images captured at the completion of each cycle. Amplification was immediately followed by dissociation curve analyses to check for primer-dimer formation and any nonspecific amplification products. Cq values for each candidate reference gene were analysed for stability using geNorm 13 within the qbase + software package, version 3.2 (Biogazelle, Zwijnaarde, Belgium-www. qbase plus. com). Comparisons of stability rankings were made following additional computational analyses using BestKeeper 15 , comparative delta Ct 26 , normFinder 14 , and RefFinder 16 . Cq values for all candidate reference genes from experiments of this study are included in supplementary Table S2.

Alpha toxin gene (plc) qPCR
The set of reference genes identified by geNorm to be most stable under these stress conditions were used to normalize expression of plc in response to growth phase. Primers were designed using the same parameters as for the candidate reference genes to amplify a 174 bp fragment of the plc gene of C. perfringens (forward primer 5′-CTG ACA CAG GGG AAT CAC AA-3′, reverse primer 5′-CAT GTC CTG CGC TAT CAA CG-3′). As alpha toxin has been shown to be regulated by cell density mechanisms, the growth time points we assessed were 2.5 h (early-log), 4 h (mid-log), 6 h (early stationary), and 22-h (late stationary). Bacteria were grown and harvested at each experimental time point as previously described. Procedures for RNA isolation, cDNA synthesis, and RT-qPCR conditions were the same as for the reference gene validation experiments. Stability of the reference genes under these growth conditions were confirmed with geNorm (qbase +). Normalized gene expression of plc was determined using the multi-gene normalization quantification model 27 and statistical significance of plc expression due to growth phase in relation to the 22-h time point simulating stationary phase when alpha toxin gene expression should be minimal was assessed using a two-tailed Student's t-test (p ≤ 0.05).

Data availability
All data relevant to the study are included in the article or uploaded as supplementary information. In addition, the datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.