Identification and validation of miRNA reference genes in poplar under pathogen stress

Quantitative real time polymerase chain reaction (qRT-PCR) is a common method to analyze gene expression. Due to differences in RNA quantity, quality, and reverse transcription efficiency between qRT-PCR samples, reference genes are used as internal standards to normalize gene expression. However, few universal genes, especially miRNAs, have been identified as reference so far. Therefore, it is essential to identify reference genes that can be used across various experimental conditions, stress treatments, or tissues. In this study, 14 microRNAs (miRNAs) and 5.8S rRNA were assessed for expression stability in poplar trees infected with canker pathogen. Using geNorm, NormFinder and Bestkeeper reference gene analysis programs, we found that miR156g and miR156a exhibited stable expression throughout the infection process. miR156g, miR156a and 5.8S rRNA were then tested as internal standards to measure the expression of miR1447 and miR171c, and the results were compared to small RNA sequencing (RNA-seq) data. We found that when miR156a and 5.8S rRNA were used as the reference gene, the expression of miR1447 and miR171c were consistent with the small RNA-seq expression profiles. Therefore, miR156a was the most stable miRNAs examined in this study, and could be used as a reference gene in poplar under canker pathogen stress, which should enable comprehensive comparisons of miRNAs expression and avoid the bias caused by different length between detected miRNAs and traditional reference genes. The present study has expanded the miRNA reference genes available for gene expression studies in trees under biotic stress.


Introduction
Quantitative real time polymerase chain reaction (qRT-PCR) is a commonly used method for studying gene expression due to its high quantitative accuracy, sensitivity, and specificity, as well as its low sample concentration requirements [1,2]. The precision of qRT-PCR is impacted by variations in the concentration of RNA between samples, efficiency of reverse transcription, pipetting errors, and cDNA quality [3][4][5]; thus, the correction and standardization of qRT-PCR results are vital for accurate and reliable data acquisition [6]. To overcome the challenges of intra-and inter-run qRT-PCR variation, an appropriate stable reference gene is required as an internal standard for data normalization. However, a universal reference gene that can be used to normalize all genes expressed under any condition has not yet been identified [7]. Therefore, it is necessary to identify suitable reference genes among different species, tissues, and stress conditions. Genes encoding Actin 7, glyceraldehyde-3-phosphate dehydrogenase, eukaryotic initiation factor 4A III, polyubiquitin, protein phosphatase 2A-2, and tubulin, have been used as reference genes for expression analyses in numerous plant species [8][9][10]. MicroRNAs (miRNAs) are ~ 21 nucleotides in length and regulate various activities in plants, such as development, hormone signaling, and nitrogen regulation [11][12][13]. However, these genes are not typically used as references for miRNA expression analyses [14]; instead, non-coding RNAs and small RNAs, such as 5.8S ribosomal RNA (5.8S rRNA) and U6 small nuclear RNA (U6 snRNA), 1 3 are more commonly used as reference genes to quantify the expression of miRNAs. Notably, the expression of some miRNAs is more stable than that of protein-coding reference genes; therefore, miRNAs could be used as reference genes in plants under biological and abiotic stress [15]. In a previous study searching for novel miRNA reference genes in cotton, the expression of 12 non-protein-coding genes was examined in root, stem, leaf, and flower tissues taken from three different plant cultivars; miR172, miR168, and miR390 were found to be the best reference genes [16]. Furthermore, in cotton miR169, miR171/170, and miR172 were found to be stably expressed under various abiotic stresses [16]. miR390-5p, miR7694-3p, miR1868, and miR1849 were found to be suitable miRNA reference genes in rice throughout infection with the root-knot nematode Meloidogyne graminicola or treatment with beta-aminobutyric acid (BABA) [17]. miR156b and miR1520d also exhibited stable expression in soybean under abiotic stress [15]. In woody plants, there have been relatively few studies screening for miRNA reference genes for use in small RNA expression studies [18]. The majority of reference genes used in woody plants so far are protein-coding genes, such as tonoplast intrinsic protein 4-like and phosphate transporter 1, which can be used for photoperiod and low-temperature studies in Populus tomentosa [19]. Moreover, elongation factor 1 alpha and 18S ribosomal RNA are considered to be the most appropriate reference genes for studying developmental processes in poplar. It is therefore important to discover novel miRNAs that could be used as reference genes for expression analyses. Poplar is a model perennial plant used for its fast growth and efficient genetic transformation protocols [20,21]. In our previous work, we found that miRNA expression levels vary greatly among poplar trees infected by pathogens [22]. The accurate determination of miRNA expression levels is crucial to understanding the functions of miRNAs under pathogen stress. Thus, the identification and selection of suitable reference genes is vital for the normalization and quantification of expression changes in poplar miRNAs. In this investigation, 15 candidate reference miRNA genes were selected for expression stability analyses, including 14 miR-NAs (miR1444b, miR156a, miR156g, miR172a, miR172b-5p, miR172d, miR390a, miR390d-3p, miR397a, miR399f, miR408-3p, miR6421-3p, miR159a, and miR642-3p) and the commonly-used reference 5.8S rRNA for comparison. The candidate miRNA reference genes were selected according to previously published small RNA profiles of poplar under biotic or abiotic stress [22][23][24]. Here, we identified miRNAs with stable expression and selected the most suitable reference genes for qRT-PCR miRNA expression assays in poplar under biotic stress. The regression coefficient (R 2 ) and the amplification efficiency were evaluated for each miRNA gene, then the cycle threshold (Cq) values of candidate genes were used to assess their expression stability using three algorithms: geNorm, NormFinder, and Bestkeeper [25][26][27]. We found miR156a to be the most suitable reference gene for poplar under biotic stress, thus providing a novel gene reference for future investigations into miRNA expression under pathogenic stress in woody plants.

Plant material and pathogen inoculation
Populus tomentosa plantlets were grown in a natural environment for 2 years at the Forestry Research Institute of Puyang (114°87′E, 35°81′E), Henan Province, China. 15 cm-long stems of the poplar were sampled and a total of 12 samples were used in this investigation. The plant pathogen Lonsdalea quercina subsp. populi strain N-5-1 (L. quercina) (provided by Prof. Wei He in the Key Laboratory for Silviculture and Conservation of the Ministry of Education, Beijing Forestry University) was cultured in liquid LB medium for 24 h at 30 °C with 180 rpm agitation. The poplar stems were inoculated with L. quercina or LB as a mock-inoculation control. Before inoculation, the stem inoculation points were sterilized with 75% alcohol and wounds 1.5 cm × 1.5 cm in size were created to the depth of the xylem with a sterile scalpel. Then, 100 μL LB medium with or without L. quercina was pipetted into each wound and packed with sterile cotton saturated with sterile water. The inoculated wounds were then sealed with plastic wrap to retain the moisture. Next, the inoculated stems were placed in a growth chamber at 28 °C with 90% relative humidity and a 16/8 h light/dark photoperiod. At 3, 6, and 9 days postinoculation (dpi), bark and developing xylem tissue samples of 1 cm × 1 cm in size were taken from around the inoculation sites and immediately frozen in liquid nitrogen.

Small RNA isolation and cDNA synthesis
Small RNAs were isolated using an Ambion mirVana™ miRNA Isolation Kit (Thermo Scientific, Waltham, MA, USA). RNA integrity was evaluated by electrophoresis using 3% agarose gels (Fig. S1). A Nanodrop 2000 spectrophotometer (Thermo Scientific) was used to measure the concentration and integrity of RNA. A total of 1 µg of small RNA was polyadenylated with 1.2 U poly(A) polymerase (Ambion, Cat# AM2030), and the reaction was incubated at 37 °C for 30 min as instructed. The tailed small RNA was reverse transcribed at 37 ℃ for 60 min using the Quantscript RT Kit of TIANGEN (Beijing, China, Cat# KR103-03) with a 20-µl volume containing 1 µg RNA, 2 µl of 10 × RT Mix, 2 µl of Super pure dNTPs (2.5 mM each), 2 µl of oligo(dT)3′-RACE adaptor (Table S1), 1 µl of Quant Reverse Transcriptase. 5.8S rRNA and the 14 candidate miRNAs were selected based on profiles of miRNAs in poplars previously subjected to biotic or abiotic stress [22][23][24]. To ensure that 14 mature miRNA sequences were unique in the poplar genome, sequences were aligned with NCBI nr database using BLAST with default parameters (https:// blast. ncbi. nlm. nih. gov/ Blast. cgi). Forward primers for small RNA amplification were designed based on conserved poplar miRNA sequences and generated using Primer Premier 5.0 software and 3′-RACE Outer Primer was used as a universal reverse primer (Table S1).

qRT-PCR experiment
qRT-PCR was performed using Takara SYBR Premix Ex TaqTM (Tli RNaseH Plus). Reaction volumes of 10 µL were used, containing 1 μL cDNA, 5 μL SYBR, 0.2 μL forward primer (Table S1), 0.2 μL reverse primer (Table S1), 0.2 μL ROXII, and 3.4 μL distilled water. qRT-PCR was performed using an ABI 7500 Real-Time PCR System (Applied Biosystems, Foster City, CA, USA). The thermocycling conditions were as follows: 95 °C for 10 min followed by 40 cycles of 95 °C for 30 s, 95 °C for 5 s, and 60 °C for 30 s. The melting curve ranged from 65 °C to 97 °C. Amplification efficiency was analyzed using serial fivefold dilutions of pooled cDNA. Non-template reactions were used as negative controls and all assays were replicated three times.

Data analysis
Miner software was used to estimate primer amplification efficiency for each candidate reference gene. SDS software (ABI 7500 Real-Time PCR System v. 2.0) was used to calculate R 2 . GeNorm (https:// genorm. cmgg. be/), Nor-mFinder (https:// www. moma. dk/ normf inder-softw are/), and BestKeeper (https:// www. gene-quant ifica tion. de/ bestk eeper. html) software were used to calculate the expression stability of 15 candidate reference genes. For GeNorm, expression stability was evaluated using two metrics: stability value (M) and pairwise variation (V). The lower the M value, the greater the stability of the reference gene. The V ratio (V = V n /V n+1 ) between normalization factors (NF), indicated the optimal number of reference genes the value of V n /V n+1 selection threshold is 0.15. If V n /V n+1 > 0.15, there was n + 1 suitable reference genes, if V n /V n+1 < 0.15, there was n suitable reference genes. Expression stability was further evaluated using NormFinder, which enabled the evaluation of intra-and inter-group variations by calculating the NF. Raw mean Cq values were transformed into relative expression data according to the 2 −∆∆Ct method in GeNorm and NormFinder.

Validation of amplification specificity, efficiency and suitability of candidate reference genes
The amplification specificity of the 14 candidate miRNAs and 5.8S rRNA was validated using single melting peak analyses and the amplification efficiency of each miRNA was calculated using five diluted cDNA templates. miR1447 and miR171c were selected to verify the suitability of the selected reference genes combined with the comparison with 5.8S rRNA.

Amplification specificity and efficiency of candidate reference genes
Poplar stems were inoculated with L. quercina, and the infected tissues exhibited typical disease symptoms that became more severe over time (Fig. S2). The amplification specificities of the candidate miRNAs and 5.8S rRNA were further verified using single melting peak analyses (Fig. S3). The melting peaks for most miRNAs tested were unimodal. Only miR390a exhibited a slight fluctuation before its sharp melting peak; however, the final T m value for miR390a was not affected.
Most of the candidate reference genes exhibited moderate amplification efficiencies ranging from 82.32% to 106.242% and regression coefficients (R 2 ) between 0.975 and 0.998 (Table S1). miR6427-3p displayed a significantly lower amplification efficiency compared to the other miRNAs (82.321%), and the R 2 value of miR159a was also relatively low (0.975). Interestingly, the amplification efficiency of 5.8S rRNA, which is widely used as a reference gene, was lower than the majority of other miRNAs in poplar, suggesting that miRNAs could be more suitable references than 5.8S rRNA. The comparatively lower amplification efficiencies and R 2 values of miR390a, miR159a, miR6427-3p, and 5.8S rRNA suggest that they are not optimal reference genes; however, they were included in our subsequent analyses to ensure the comprehensive comparison of the results. Our qRT-PCR results indicated that the majority of the candidate reference genes exhibited high amplification efficiencies and specificities.

Expression abundance of candidate reference genes post pathogen infection
Next, the expression levels of the 15 candidate reference genes in poplar over 9 days of canker infection were determined. The mean Cq values calculated for the reference gene candidates displayed significant variation, ranging from 21.01 (5.8S rRNA) to 34.13 (miR6421-3p) ( Fig. 1; Table S2). Most of the Cq values fell between 26 and 30; however, the expression levels of miR397a, miR399f, miR408-3p, and miR6421-3p were comparatively lower, displaying Cq values of 30 − 35. The most highly expressed gene was 5.8S rRNA, with a mean Cq value of 21.01.

Expression stability of reference genes analyzed using geNorm
Based on the parameters of the geNorm algorithm, genes with M values above 1.5 were regarded as unsuitable reference genes. The M values of the candidate reference genes were all below 1.5 for each time point examined (Fig. 2a). The pairwise variation value of V 2/3 was 0.12 (Fig. 2b). After applying a V n /V n+1 selection threshold of below 0.15, we identified two suitable reference genes miR156a and miR156g (Fig. 2b), which exhibited the highest expression stabilities (M = 0.18 for both genes), followed by miR6427-3p. The least stably expressed gene was miR1444b, as it exhibited the highest M value (1.21) in all samples (Fig. 2a).
These results indicate that miR156a and miR156g were the most stably expressed miRNAs under biotic stress, whereas miR1444b exhibited the most variation in expression levels.

Expression stability of reference genes analyzed using NormFinder
The NormFinder algorithm was used to determine the expression stability of the reference genes using NF values. The stability values of the candidate reference genes ranged from 0.062 to 1.342, where lower NF values represented higher expression stability. miR156g was the most stably expressed reference gene, followed by miR156a and miR6427-3p. miR1444b was the least stably expressed miRNA (Fig. 3). For the most part, the NormFinder results were consistent with the geNorm analysis. However, miR156a and miR156g shared the same M value (0.18) in the geNorm analysis, but the stability value of miR156a was lower than that of miR156g in the NormFinder analysis. miR1444b was the least stable gene according to both the geNorm and NormFinder analyses.

Validation of the selection of reference genes
To assess the suitability of the selected candidate miRNAs, miR156g and miR156a were further used as reference genes to investigate the expression of miRNAs in poplars. Compared with small RNAs in other plants, miR1447 is a unique small RNA in poplars and was found to be induced under biotic stress condition [22]. It was reported that miR171c was involved in the response of rice to salt stress, and the stress tolerance was decreased in rice overexpressing Fig. 1 The expression levels of candidate reference genes. Box plots shows the Cq values of 15 candidate reference genes. The box indicates the 25th and 75th percentiles, and the whisker caps represent the maximum and minimum values. The cross in box denotes the median miR171c [28]. The expression abundance of miR171c was changed significantly under biotic or abiotic stress in poplars [22][23][24]. Therefore, the expression changes of miR1447 and miR171c were examined in poplars infected with the canker pathogen using miR156g and miR156a as internal standards. We found that miR1447 and miR171c were upregulated in the bark of infected poplars when using miR156a as the reference gene (Fig. 4a, b), which was consistent with the expression trend of miR1447 and miR171c in previously reported miRNA profiles [22]. However, we found that the expression of miR1447 and miR171c were downregulated in infected poplars when using miR156g as the reference (Fig. 4c, d) [22]. Using both reference genes together, miR1447 expression was found to be induced and miR171c to be repressed (Fig. 4e, f). To further validate the suitability of miR156g and miR156a as reference genes, the comparison with 5.8S rRNA was performed considering that 5.8S rRNA was commonly used as the reference gene in the quantification of miRNAs expression. The result indicated that miR171c and miR1447 were upregulated, which was similar with the expression changes using miR156a as the reference gene (Fig. 4g, h). These comparisons demonstrated

Average pairwise variations
that miR156a could be used as a novel reference gene to investigate the expression of miRNAs in poplar under pathogenic stress, and may be suitable as a general-use reference gene in plants.

Discussion
Over recent years, several novel reference genes have been identified for quantitative gene expression studies in plants [16,18,29]. However, these researches have primarily been focusing on annual crop plants, as opposed to perennial plants [7]. Moreover, studies investigating the use of miRNAs as reference genes under biotic stress in trees are limited.
Due to the short length of miRNAs, some RNA is typically lost during extraction procedures, thus complicating RNA quantification experiments [30]. Using miRNAs as reference genes could avoid the bias caused by traditional RNA extraction methods and reverse transcription processes [31]. Therefore, it is necessary to discover relatively short internal standard genes for miRNA quantification. In contrast with traditional housekeeping genes, using miRNA reference genes could enable comprehensive comparisons of detected miRNAs, since their PCR products are approximately 100 bp in length and effectively avoid the bias of different amplication length as mentioned above.
In this investigation, 14 miRNAs and 5.8S rRNA were chosen based on previously published miRNA profiles [22][23][24]. The Cq values for each miRNA were analyzed Stability value by three programs to determine expression stability. The expression patterns of most of the candidate reference miR-NAs were stable throughout the course of canker pathogen inoculation. The two best reference miRNA candidates were then used as internal standards to evaluate the expression of two miRNAs, miR1447 and miR171c, which exhibit dramatic changes in expression in pathogen-infected poplars.
Taken together, we found that miR156a was the most suitable reference miRNA. At present, qRT-PCR is one of the most popular methods for analyzing gene expression. The selection of an appropriate reference gene for specific tissues, developmental stages, or stress treatments ensures the reliable normalization of qRT-PCR data [32]. An ideal reference gene would Fig. 4 The expression abundance of miR1447 and miR171c based on miR156g, miR156a and 5.8S rRNA as reference gene. miR156a as reference gene (a, b). miR156g as reference gene (c, d). miR156a together with miR156g as reference gene (e, f). 5.8S rRNA as reference gene (g, h) be stably expressed under all conditions. However, few reference genes have been identified thus far, some of which are differentially expressed in various tissues and stress treatments [33,34]. In bottle gourd roots, miR166b was found to be the best reference gene under nutrient limitation stress. miR167c and miR167f can be used as internal standards for gene expression studies in both squash roots and watermelon leaves under nutrient starvation [35]. miR1849, miR390-5p, miR1868, and miR7694-3p have been used as reference genes in rice during infection with the root-knot nematode Meloidogyne graminicola or treatment with BABA [18]. In this work, we found that miR390a expression levels decreased 0.78-fold, and were ranked as the fourth most stable according to the geNorm and NormFinder programs, suggesting that miR390a is not the optimal reference gene in poplar under biotic stress. This is supported by a previous study, in which disease-resistant and disease-susceptible cultivars of Brassica oleracea were each inoculated with a pathogen; miR390 expression levels in disease-resistant cultivars increased 1.5-fold, whereas its levels decreased 1.7fold in disease-susceptible cultivars [36]. It was reported that the expression of mi390a-5p increased 2.8-fold in Arabidopsis inoculated with Verticillium dahlia [37]. Taking RNA samples at multiple time points ensured that our study could cover changes in miRNA expression at three different infection stages: the invasive period, the incubation period, and the disease period. Therefore, we concluded that miR156a could be used as a reference gene throughout each infection period of P. tomentosa under biotic stress.
The results of qRT-PCR experiments are influenced by the stability and expression abundance of the reference genes [31]. Relatively small differences in expression levels between the reference and target genes indicate that the gene expression quantification calculations are less likely to be impacted by the amplification efficiency of the reference gene. Conversely, if there is a large difference between the expression abundances of the target and reference genes, the reference gene is not a reliable means of measuring target gene expression [25]. Therefore, it is necessary to identify reference genes whose expression abundances are similar to those of the target genes. The expression abundances of the 15 candidate reference genes of this study are shown in Fig. 1. miR156g and miR156a exhibited moderate gene expression levels in canker-infected bark tissue, making them suitable for assessing miRNAs with moderate expression abundances in qRT-PCR experiments. Furthermore, we found that 5.8S rRNA was more abundant than the other reference gene candidates; hence, 5.8S rRNA is a more suitable reference for highly expressed genes. Other miRNAs that exhibit stable expression abundances in poplar in response to biotic stress exist; however, due to their low expression levels and specificity to tree species, other miRNAs were not considered in this investigation. miR156 is involved in abiotic stress responses [25], therefore miR156 may not be the optimal reference for gene expression studies in environmental stress [29]. For example, the miR156/SQUAMOSA promoter-binding proteinlike protein (SPL) 9/dihydroflavonol-4-reductase pathway coordinates the relationship between development and abiotic stress tolerance in plants [38]. miR156 targeted Tamarix chinensis SPLs (TcSPLs) under salt stress responses in Tamarix chinensis [39]. Therefore, further investigation is required to determine whether miR156a can be used as a broad-spectrum internal reference gene under abiotic stress. However, under biological stress, miR156a was moderately expressed and more stable than 5.8S rRNA in poplar, suggesting that miR156a is more suitable as an internal control for measuring miRNA expression levels in poplar under canker pathogen stress. Considering the diverse expression patterns and regulatory pathways of miRNAs in various stress responses, the usefulness of miR156a as a reference gene in other tree species or for infection with pathogens other than L. quercina should be determined. This investigation has provided novel insights into the applicability of miRNAs as internal standards in gene expression studies in trees under biotic stress.

Conclusion
The present study has expanded the miRNA reference genes available for gene expression studies in trees under biotic stress. miR156a and miR156g ranked similarly following analyses with geNorm and NormFinder, and were the most stably expressed genes in all samples analyzed. The Cq value of miR156a suggested that it is expressed at a moderate level. Furthermore, miR156a exhibited the highest r values and moderate SD and CV values in the Bestkeeper analysis. Following testing of miR156a as a reference gene by measuring miR1447 and miR171c expression levels, we found that miR156a could be used as a reference gene in poplar under canker pathogen stress.