Synonymous sites for accessibility around microRNA binding sites in bacterial spot and speck disease resistance genes of tomato

The major causes of mass tomato infections in both covered and open ground are agents of bacterial spot and bacterial speck diseases. MicroRNAs (miRNAs) are 16–21 nucleotides in length, non-coding RNAs that inhibit translation and trigger mRNA degradation. MiRNAs play a significant part in plant resistance to abiotic and biotic stresses by mediating gene regulation via post-transcriptional RNA silencing. In this study, we analyzed a collection of bacterial resistance genes of tomato and their binding sites for tomato miRNAs and Pseudomonas syringe pv. tomato miRNAs. Our study found that two genes, bacterial spot disease resistance gene (Bs4) and bacterial speck disease resistance gene (Prf), have a 7mer-m8 perfect seed match with miRNAs. Bs4 was targeted by one tomato miRNA (sly-miR9470-3p) and three Pseudomonas syringe pv. tomato miRNAs (PSTJ4_3p_27246, PSTJ4_3p_27246, and PSTJ4_3p_27246). Again, Prf gene was found to be targeted by two tomato miRNAs namely, sly-miR9469-5p and sly-miR9474-3p. The accessibility of the miRNA-target site and its flanking regions and the relationship between relative synonymous codon usage and tRNAs were compared. Strong access to miRNA targeting regions and decreased rate of translations suggested that miRNAs might be efficient in binding to their particular targets. We also found the existence of rare codons, which suggests that it could enhance miRNA targeting even more. The codon usage pattern analysis of the two genes revealed that both were AT-rich (Bs4 = 63.2%; Prf = 60.8%). We found a low codon usage bias in both genes, suggesting that selective restriction might regulate them. The silencing property of miRNAs would allow researchers to discover the involvement of plant miRNAs in pathogen invasion. However, the efficient validation of direct targets of miRNAs is an urgent need that might be highly beneficial in enhancing plant resistance to multiple pathogenic diseases.


Introduction
Tomato (Solanum lycopersicum) has excellent nutritional and gastronomic value, making it the world's second most significant vegetable crop after potato (Gerszberg, et al., 2015). Tomatoes are infected by a variety of pathogens, including bacteria, viruses, and fungi. Bacterial speck and bacterial spot are the two most frequent bacteria-induced tomato diseases. Bacterial speck disease of tomato (caused by the bacterium Pseudomonas syringae pv. tomato) can be found everywhere tomatoes are cultivated (Preston 2000). The disease severely damages the leaves early in the growing season, resulting in a drastically reduced yield. When symptoms occur on tomato fruit, the disease has the potential to have a significant impact on quality and market value for corporate tomato farmers. Bacterial spot is a major tomato disease all over the world. Different species of the genus Xanthomonas cause bacterial spot disease (but primarily by Xanthomonas perforans) (Koenraadt et al. 2007). It can infect only the green fruits and not the red fruits. This disease, like bacterial speck, may be a major tomato disease that is difficult to treat when the disease pressure is high under favorable climatic conditions. The necessity for protection against pathogen is thought 1 3 247 Page 2 of 9 to be a powerful evolutionary force that leads to diverse selection and significant levels of diversity in plant genes encoding essential defense-related proteins. Many studies have found that microRNAs (miRNAs) are extremely sensitive to various physiological processes such as abiotic or biotic stress.
MicroRNAs are small (16-21 nucleotides) non-coding RNAs that form the miRNA-induced silencing complex (miRISC) with argonaute proteins to inhibit the translation process and induce mRNA degradation (Fabian and Sonenberg 2012). The degree of translational inhibition and mRNA degradation for each target region of miRNA might be quite different (Béthune et al. 2012;Djuranovic et al. 2012;Nam et al. 2014;Selbach et al. 2008). Poor miRNA binding or RNA binding proteins influencing inhibition might explain the variation in inhibition for particular target regions (Kedde et al. 2010;Kertesz et al. 2007;Kundu et al. 2012). Previous researches have revealed the mechanisms by which miRNAs detect their targets in the 5′ untranslated regions (UTR), coding sequences (CDS), and 3′UTR and assessed whether these regions might influence miRNA-mediated suppression (Cottrell et al. 2017;Kertesz et al. 2007). MiRNAs are known to control the expression of a variety of developmental and stress-related genes. In plants, miRNA target sites were identified in the CDS and 3′ UTRs of mRNA; however, it was not confirmed whether the efficacy of regulation was associated to target site locations or not (Bartel 2009;Jones-Rhoades and Bartel 2004). Previous study has shown that protein coding sequences can encode regulatory information by selecting certain synonymous codons (Itzkovitz et al. 2010).
In this study, we attempted to evaluate whether selection on synonymous codons near miRNA target regions might occur at the gene level. Based on 7mer-m8 seed match, we identified the miRNA targets in the CDS of bacterial spot disease resistance gene (Bs4) and bacterial speck disease resistance gene (Prf) of tomato. It has been reported that the majority of miRNA targets in plants were found in protein-coding domains (Jones-Rhoades and Bartel 2004). Plant miRNAs have nearly perfect target site matching, making computational prediction of miRNA targets much more effective (Dai et al. 2011). As a result, we tested our hypothesis using the coding sequences of bacterial spot and bacterial speck disease resistance genes of tomato. To determine the parameters that could control miRNA binding to target mRNA molecules, we predicted the miRNA target genes and then evaluated the target site accessibility; translational efficiency of the miRNA target's upstream, downstream, and target regions; mRNA stability, codon usage bias; and nucleotide base compositions using bioinformatics tools. In addition, we employed the free energy calculation technique, compAI, and cosine similarity metric (COSM), as well as a t-test to analyze the variation among the upstream, downstream, and target regions.

Coding sequences and microRNA data
The CDS of bacterial disease resistance genes of tomato were retrieved from the Plant Resistance Genes database (PRGdb 4.0; http:// prgdb. org/ prgdb4/) and National Center for Biotechnology (NCBI) GenBank database (http:// www. ncbi. nlm. nih. gov). The mature miRNA sequences of tomato (147 in total) were also downloaded from miRBASE version 21 (http:// www. mirba se. org). The miRNAs of Pseudomonas syringae pv. tomato (Pst) were collected from our previously reported data (Sophiarani and Chakraborty 2021). The coding regions of bacterial disease resistance genes were searched for Pst and tomato miRNA targets.

MicroRNA target prediction in coding sequences
The 7mer-m8 seed match was used to find the miRNA targets in the CDS of bacterial disease resistance genes of tomato. A 7mer-m8 miRNA seed is defined as a of region 2-8 nucleotides in the miRNA 5′-3′ direction (Peterson et al. 2014). Watson-Crick (WC) base pairing has been used by the miRNAs to bind to their targets. The miRNA non-seed region is positioned adjacent to the miRNA seed. In the present study, up to 4 mismatches between miRNA non-seed and target mRNA were permitted. After carefully screening the outputs, we selected only the CDS with the highest number of miRNA targets for further investigation. To obtain a deeper understanding, the miRNA target region, including its upstream and downstream regions, were all cleaved and saved individually with codons in a frame. Based on the findings of Kertesz et al. (2007), we collected 18 nucleotides (6 codons in frame) from the miRNA target's upstream and downstream sections separately.

MiRNA target free energy and site accessibility analysis
Site accessibility describes how easily a miRNA-RISC complex may identify and hybridize with its target mRNA sequence (Axtell et al. 2011;Gu et al. 2012a). The hybridization of miRNA with its target mRNA involves binding of miRNA to a specific accessible region of mRNA and unfolding of the mRNA once it completes binding to the target (Peterson et al. 2014;Riolo et al. 2020). In this study, we employed the SantaLucia formula (in a 7bp sliding window at 37°C) to evaluate the folding energy/free energy (in kcal/ mol) of the target mRNA and its sequence flanks, as well as the accessibility of the miRNA-binding region (SantaLucia 1998). In this case, the absolute value of folding energy was taken into account, and the larger value of folding energy was interpreted as significant mRNA folding. For each miRNA target, the folding energies of the upstream, downstream, and target regions were computed.

Translational efficiency analysis
The parameter compAI was used to estimate the translational efficiencies of the miRNA targets. It was used to compare the miRNA-binding sites and their flank regions. The value of compAI varies from 0 to 1, with 0 denoting the slowest and 1 denoting the fastest translation rate (Dilucca et al. 2015). The translational efficiency assessed by com-pAI is unaffected by gene expression bias. It examines the competition of comparable tRNA species.

Cosine similarity metric for microRNA targets analysis
The parameter COSM was used to measure the degree of parallel association between tRNA loops and relative synonymous codon usage (RSCU) (Sun et al. 2016). COSM values vary from 0 to 1, with 0 signifying the highest similarity and 1 denoting the least similarity.

Nucleotide composition analysis
We calculated the nucleotide composition of the CDS of bacterial spot and bacterial speck disease resistance genes of tomato for (i) overall nucleotide composition (A%, T%, G%, and C%), and its composition at the 3rd codon position (A3%, T3%, G3%, and C3%); (ii) frequencies of nucleotides GC (total G and C nucleotides) present at the 1st (GC1%), 2nd (GC2%), and 3rd (GC3%) synonymous codon positions; and (iii) total GC and AT3% (total A and T nucleotides at the 3rd synonymous codon positions) of the genes.

Relative synonymous codon usage analysis
The RSCU values of different synonymous codons in the CDS of bacterial spot and bacterial speck disease resistance genes of tomato were determined using the formula: where g ij represents the frequency of the relative codon usage of the ith codon for the jth amino acid which is encoded by n i synonymous codons (Sharp et al. 1993).
A positive codon usage bias related to a given codon is indicated by an RSCU value larger than one, and the associated codon is called a favored codon. However, RSCU less ni than 1 denotes a bias against the usage of codons, and the associated codon is regarded as less prevalent for the relevant amino acid (Sau et al. 2006). When the RSCU score is 1, the codon is considered unbiased for the specific amino acid and is selected equally or randomly in the RNA transcript with other synonymous codons from the same family. Furthermore, the synonymous codons with RSCU value higher than 1.6 are considered over-represented whereas those codons with value less than 0.6 are considered underrepresented (Butt et al. 2014).

Synonymous codon usage order analysis
In the present study, we employed synonymous codon usage order (SCUO) as a metric to quantify the codon usage as well as gene expression. It determines how frequently synonymous codons of an amino acid are being used in a nonrandom manner. SCUO has a value ranging from 0 (lowest) to 1 (highest) (Angellotti et al. 2007). Genes with a strong preference for specific codons are highly expressed, whereas genes with little or no codon preference are often underexpressed (Wan et al. 2003).

Statistical analysis
The t-test statistical analysis was used to understand the nucleotide base compositional changes of target regions in relation to upstream and downstream regions individually.

MicroRNA targets in the coding sequences of tomato bacterial spot and speck disease resistance genes
The 7mer-m8 seed match was used to search for miRNA binding sites on the CDS of disease resistance genes of tomato. The collected data were filtered, and two genes (bacterial spot disease resistance gene (Bs4) and bacterial speck disease resistance gene (Prf)) were chosen for further investigation. In this study, it was found that three Pseudomonas syringae pv. tomato (Pst) miRNAs (PSTJ4_3p_27246, PSTJ4_3p_27246, and PSTJ4_3p_27246) and one tomato miRNA (sly-miR9470-3p) targeted the Bs4 gene (Table 1). Additionally, two tomato miRNAs, sly-miR9469-5p and sly-miR9474-3p, were found to target the Prf gene (Table 2). However, we found no Pst miRNA targets in the Prf CDS. Brodersen et al. (2008) suggested that in the miRNA-mRNA target hybrids, imperfect pairing with central mismatches enhances translational repression because it prevents cleavage. On the other hand, translational repression is prevented by miRNAs that have exact central matches to their target mRNA, which allows for cleavage. In this study, we observed that all of the miRNAs have central mismatches with their target mRNAs (Fig. 1). As a result, we hypothesized that these miRNAs might suppress the expression of Bs4 and Prf genes by translational repression, which is comparable to the findings of (Brodersen et al. 2008).

Target site accessibility
We first calculated the free energy/folding energy (kcal/mol) of mRNA secondary structure using SantaLucia method in a 7-bp sliding window at 37°C, moving upstream and downstream in 18-nucleotide steps from the real miRNA target region for each gene to understand the differences that might exist among these regions. Lastly, we calculated the mean free energy in each miRNA target's upstream, downstream, and target sequences (Table 3). Here, we found that the free energy values in the upstream, downstream, and target regions varied from 2.8 to 4.13 kcal/mol, indicating weak folding of the target site and sequence flanks (Tuller et al. 2010). It was suggested that a positive free energy value indicates selection for loose RNA secondary structure at miRNA target sites (Gu et al. 2012b). Therefore, we hypothesized that the miRNA binding domains (target and flank) on these genes might fold into loose secondary structures.

Rare codon usage and translational efficiency
To determine the rate of translation in each gene segment, the CDS of the Bs4 and Prf were examined in their upstream, downstream, and target regions of the miRNA targets. In the present study, we found that the mean values of compAI in the two genes' upstream, downstream, and target areas exhibited lower translational efficiency (Table 4). Many studies have revealed that miRNA binding in the CDS results in translational inhibition (Brümmer and Hausser 2014;Hausser et al. 2013). Furthermore, it was reported that the miRNA-mediated translational repression   with a high translation rate was shown to be more strongly repressed (Cottrell et al. 2017). As a result, the low translational efficiency of the three regions reported in this study might interfere with the suppression of miRNA-mediated gene expression process. We calculated the COSM values to determine if the RSCU values correspond to the number of tRNA species. In general, COSM value ranges from 0 to 1 and a COSM value of 1 suggests a close resemblance, whereas a value of 0 reveals total dissimilarity. Table 5 shows the COSM values of the two genes in their upstream, downstream, and target regions. We found that the mean values of COSM were nearly equal to 0 in the three regions of the two genes, suggesting a weak association between synonymous codons found in the three regions and the tRNA pool. As a result, we hypothesized that the codons accessible in the three regions of Bs4 and Prf genes might be nonoptimal, resulting in a low translation efficiency. This is comparable to the previous work on growth rate-optimized tRNA abundance and codon usage, which revealed that the more frequently used codons, recognized by numerous tRNAs, resulted in faster translation elongation and greater translation efficiency (Berg and Kurland 1997; Gustafsson et al. 2004).

Nucleotide base composition analysis in the coding sequences of Bs4 and Prf genes
In the overall nucleotide composition and CUB indices of the CDS of Bs4 and Prf genes which included frequencies of nucleotide bases (adenine, guanine, cytosine, and thymine) at the 3rd codon positions (A3%, G3%, C3%, and T3%), GC and AT contents at the 3rd codon position   (GC3% and AT3%) were estimated to understand their effect on CUB (Table 6). The nucleotide composition analysis at the 3rd codon position revealed that the average percentage of T3 (37.5%) was found to be the highest followed by A3 (29.15%), G3 (17.45%), and C3 (15.95%) in the two genes analyzed. GC and AT distributions over the two genes are shown in Fig. 2. The CDS of Bs4 and Prf have an overall GC content lower than 50% (Bs4 = 36.8%; Prf = 39.1%). This suggested that the GC content of both genes was low. The AT distributions of the CDS of Bs4 and Prf were significantly distinct (p≤0.01). This revealed that the AT-ending codons are systematically preferred over the GCending ones.

Most favored codons for bacterial spot and speck disease resistance genes of tomato
The patterns of codon usage in the two genes were evaluated using RSCU analysis. The heatmap of RSCUs (Fig. 3) revealed that Bs4 and Prf originated to encode for amino acids using a limited number of relatively optimum (A/Tending) codons and were selectively enriched in AT-ending codons. Among the 59 synonymous codons of Bs4 and Prf genes, we found 6 highly preferred codons (RSCU>1.6) (TCT (Ser), CCA (Pro), AGA (Arg), ACA (Thr), GCT (Ala), and GAT (Asp)) for Bs4 gene, and 8 highly preferred codons (TCA (Ser), TCT (Ser), CCT (Pro), CAT (His), AGA (Arg), ACT (Thr), GTT (Val), and GAT (Asp)) for Prf gene (Fig. 4). In the present study, all the favored codons of Bs4 and Prf genes ended with A/T. Previous research had revealed that GC-poor codons were probably selected for the miRNA target region in plants to increase site accessibility and facilitate miRNA binding (Gu et al. 2012a). Our findings revealed that the Bs4 and Prf genes favored AT-rich codons across the miRNA-binding sites. These findings clearly show that evolution has resulted in significantly different codon and amino acid distributions in the Bs4 and Prf genes.

Relationship between codon usage and gene expression
The Bs4 and Prf genes exhibited SCUO values close to 0 (Bs4: SCUO = 0.09; Prf: SCUO = 0.1), indicating a low codon usage pattern. Moreover, we conducted the correlation analysis between effective number of codons (ENC), which determines the degree of synonymous codon bias in a particular gene or genome (Wright 1990) and SCUO values to assess if the codon usage pattern influenced gene expression in the two genes. Here, we found a significant negative correlation (r = −1.00, p<0.01) between ENC and SCUO. Our results suggested that gene expression level might have been influenced by the codon usage pattern.

Discussion
Many studies have shown that protein coding sequences can encode regulatory information by exploiting certain synonymous codons (Bollenbach et al. 2007;Itzkovitz et al. 2010). MiRNAs have gained prominence due to their vital role in gene regulation, which may lead to the development of numerous disorders (Liu and Wang 2019;Ni and Leng 2015). The regulatory process starts when miRNAs bind to their target sites in mRNAs (Brodersen and Voinnet 2009). MiRNA binding, which is brought on by local mRNA secondary structure or upstream translation efficiency, might be impacted by the surrounding nucleotides of miRNA target sites (Lin and Ganem 2011). Several studies have suggested that miRNAs trigger mRNA degradation by binding to 3′ UTRs (Grimson et al. 2007;Guo et al. 2010;Nielsen et al. 2007), while miRNAs binding to the CDS mostly inhibit translation (Brümmer and Hausser 2014;Larsson et al. 2010). Since our miRNAs all targeted in the 3′ UTRs regions of the CDS (Tables 1 and 2), wet lab validation is required to better describe the amplitude and time-scale of gene regulation at CDS sites. It was suggested that miRNA binding to 3′ UTRs leads to changes in protein abundance mostly through mRNA deadenylation, whereas binding to CDS sites primarily represses translation with a minimal influence on the polyA tail of the target mRNA (Brümmer and Hausser 2014).
In this study, we analyzed the local translation efficiency and site accessibility in the upstream, downstream, and target regions of all coding targets of tomato and Pst miR-NAs in the CDS of bacterial spot and bacterial speck disease resistance genes of tomato. In a previous study, it was revealed that site accessibility and the translational process were the two most important elements in miRNA binding to its targets (Gu et al. 2013). MiRNA activity is also influenced by RNA secondary structure around miRNA target sites. We have found that the free energies calculated in each gene's three locations (upstream, downstream, and target) were low, suggesting weak mRNA folding and easy access to miRNA. We also observed a weak association between the RSCU values of the synonymous codons found in the three regions and the tRNA pool, suggesting that the existence of rare codons might enhance miRNA targeting even more. Elf et al. (2003) revealed that in response to amino acid deficiency, the charge levels of different tRNAs recognizing synonymous codons varied dramatically in bacteria. As a result, while some synonymous tRNA pools remain completely charged, the charged fraction of others might fall to zero. Therefore, an efficient validation of direct targets of miRNAs is an urgent need that might be highly beneficial in enhancing plant resistance to multiple pathogenic diseases.
We found that the synonymous codons around the target sites were GC-poor. This finding is comparable to the previous study that GC-poor codons were locally favored around miRNA target sites for loose secondary structure (Gu et al. 2012b). As a result, we hypothesized that the miRNAs might have functioned in such a way as to reach the target locations easily. Since the coding segments were AT-rich, their complementary miRNAs were also AT-rich. Thus, miRNA binding to their target regions might be greater, resulting in effective repression of disease resistance genes. Again, lower free energy of miRNA targets was associated with lower stability. This is consistent with the findings of a previous study (Gu et al. 2012a), which found significantly lower mRNA stability at miRNA targets in Oryza sativa and Zea mays. In this study, we found no selection signal in the miRNA target regions (upstream, downstream, and target). This is consistent with the observation of low purifying selection in A. thaliana with novel miRNA genes and its targets (Cuperus et al. 2011;Fahlgren et al. 2010). Previous studies revealed that in A. thaliana, synonymous codons were largely chosen for efficient and accurate mRNA translation (Morton and Wright 2007). We found that translational rates were low in the three miRNA target regions (upstream, downstream, and target) of Bs4 and Prf genes. Our result is comparable to the findings that selection for translation in A. thaliana inhibits the selective pressure for site accessibility (Gu et al. 2012b). Thus, we could come to the conclusion that the ineffective translation process might inhibit the miRNA-induced gene regulation. Furthermore, it was revealed that site accessibility is a local selection limit operating on the synonymous codons in the miRNA target area, as opposed to translation efficiency and accuracy, indicating that in A. thaliana, the selection signal of enhanced site accessibility at miRNA target sites is most likely messed up by strong signals caused by translational selection.
We found a weak codon usage bias in the two genes studied. Gu et al. (2012a) reported in their study that certain miRNA targeting the CDS with a higher codon usage bias preferred to use rare codons in the upstream region of their target sites. This observation was contrary to our result. Although we studied the miRNA targets in the protein coding sequences of Bs4 and Prf genes by comparing just to a few restricted previous findings (Gu et al. 2009;Lin and Ganem 2011), the conclusions based on a single or several miRNA targets may be biased. As a consequence, we revealed that several additional genomic properties, such as folding energy near miRNA targets or local mRNA secondary structure (Kertesz et al. 2007;Long et al. 2007), could be potential factors facilitating miRNA binding, which is similar to the findings of Lin and Ganem (2011).

Conclusion
In conclusion, we hypothesize that synonymous codon usage in the upstream, downstream, and target regions of miRNA targets is related to local translation efficiency in Bs4 and Prf genes of tomato. The selective restrictions around most miRNA targets, however, were too weak to be observed. When miRNA targets are located in the CDS, additional genetic traits, in addition to local translation efficiency, might even be more relevant to miRNA activity. We provide a description of selective effects on synonymous codon usage in the three regions of miRNA targets for optimal miRNA activity by assessing the local translation efficiency of all miRNA targets in the CDS of the two genes.