Isolation of Genomic Sequences of Rca Genes in Wheat
The bread wheat TaRca1 and TaRca2 sequences were used as BLAST queries in the Svevo portal (https://d-gbrowse.interomics.eu/gb2/gbrowse/Svevo/), and several durum wheat sequences annotated as Ribulose-1,5-bisphosphate carboxylase-oxygenase activase were retrieved. Specifically, 79 different splicing isoforms were identified on chromosome 4A, and 74 on the homoeologous chromosome 4B.
Unlike in bread wheat and other species, the two genes were not annotated separately but were reported in tandem as different splicing forms of the same gene: TRITD4Av1G139700 on the minus strand of chromosome 4A, and TRITD4Bv1G060980 on the plus strand of 4B. By analysing and comparing the CDS and predicted amino acid sequences, it was possible to identify the isoforms most likely to correspond to the durum wheat Rca1 and Rca2 sequences for both the A and B genomes: Rca1-4A: TRITD4Av1G139700.79 (chr4A:442,229,641..442,232,675); Rca2-4A: TRITD4Av1G139700.4; Rca1-4B: TRITD4Bv1G060980.1; Rca2-4B: TRITD4Bv1G060980.64. Both Rca1 homoeologous genes comprise 2 exons, a complete CDS of 1299 bp and a predicted protein of 432 aa.
Rca2 genes have 6 exons, a CDS of 1404 bp and a protein of 467 aa. It should be noted that, in this case, alternative splicing at exon 5 might produce a shorter isoform (Fig. 1).
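The reported CDS and protein lengths are mutually consistent if the CDS length is assumed to include the terminal stop codon. A minimal Python check under that assumption (the function name is illustrative, not part of any tool used in this study):

```python
def protein_length(cds_bp: int) -> int:
    """Number of residues encoded by a complete CDS.

    Assumes cds_bp counts the stop codon, so one codon is subtracted.
    """
    if cds_bp % 3 != 0:
        raise ValueError("a complete CDS must be a multiple of 3 bp")
    return cds_bp // 3 - 1  # drop the terminal stop codon


# CDS lengths reported in the text for the durum wheat genes
for gene, cds_bp in {"Rca1": 1299, "Rca2": 1404}.items():
    print(f"{gene}: {cds_bp} bp -> {protein_length(cds_bp)} aa")
```

Under this assumption, 1299 bp corresponds to 432 aa and 1404 bp to 467 aa, matching the values given for Rca1 and Rca2.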
Phylogenetic Analysis
Orthologous Rca1 and Rca2 gene sequences were retrieved for eleven species from the EnsemblPlants database (http://plants.ensembl.org/): Triticum aestivum, Triticum durum, Triticum dicoccoides, Triticum urartu, Aegilops tauschii, Hordeum vulgare, Arabidopsis thaliana, Brachypodium distachyon, Oryza sativa, Sorghum bicolor and Zea mays. These species were chosen either as sequenced progenitors or relatives of wheat, or as model plants whose genomes have been fully sequenced and annotated. Furthermore, two C4 species were included in order to compare the evolutionary distance of their Rca1 genes with those of C3 species.
The identified CDS were checked for sequence structure and similarity. A total of 28 gene sequences from the eleven species considered (Table 1) were retained and first aligned with the ClustalW method in MEGA X; the alignment was then used to build a phylogenetic tree.
The evolutionary history was inferred by using the Maximum Likelihood method based on the Tamura-Nei model (Tamura and Nei 1993). The tree with the highest log likelihood (-8623.3812) is shown (Fig. 2). Initial tree(s) for the heuristic search were obtained automatically by applying the Neighbor-Join and BioNJ algorithms to a matrix of pairwise distances estimated using the Maximum Composite Likelihood (MCL) approach, and then selecting the topology with the superior log likelihood value. The tree is drawn to scale, with branch lengths measured in the number of substitutions per site. Codon positions included were 1st+2nd+3rd+Noncoding. There were a total of 1774 positions in the final dataset. Evolutionary analyses were conducted in MEGA X (Kumar et al. 2018).
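For readers unfamiliar with the Tamura-Nei model, the sketch below builds its instantaneous rate matrix in plain Python. It only illustrates the structure of the model (separate rates for purine transitions, pyrimidine transitions and transversions, each scaled by the frequency of the target base). The base frequencies and rate values are hypothetical and are not the parameters estimated by MEGA X for this dataset.

```python
BASES = "ACGT"
PURINES = {"A", "G"}


def tn93_rate_matrix(pi, alpha_r, alpha_y, beta):
    """Return the Tamura-Nei rate matrix as a dict of dicts.

    pi      -- equilibrium base frequencies keyed by 'A', 'C', 'G', 'T'
    alpha_r -- rate of purine transitions (A <-> G)
    alpha_y -- rate of pyrimidine transitions (C <-> T)
    beta    -- rate of transversions (purine <-> pyrimidine)
    """
    q = {i: {} for i in BASES}
    for i in BASES:
        for j in BASES:
            if i == j:
                continue
            if (i in PURINES) != (j in PURINES):
                rate = beta          # transversion
            elif i in PURINES:
                rate = alpha_r       # transition between purines
            else:
                rate = alpha_y       # transition between pyrimidines
            q[i][j] = rate * pi[j]   # scaled by target-base frequency
        q[i][i] = -sum(q[i].values())  # each row sums to zero
    return q


# Hypothetical parameter values, for illustration only
example_pi = {"A": 0.25, "C": 0.30, "G": 0.25, "T": 0.20}
q_example = tn93_rate_matrix(example_pi, alpha_r=2.0, alpha_y=4.0, beta=1.0)
for base in BASES:
    print(base, [round(q_example[base][other], 3) for other in BASES])
```

Each row of the matrix sums to zero, and the matrix is time-reversible with respect to the supplied base frequencies.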
As shown in the tree in Fig. 2, two main clades and a third, smaller one were generated. The first grouped 11 Rca1 sequences (shown in pink) belonging to 7 of the analyzed species; as expected, the A-genome sequences from the Triticum species clustered together with that of T. urartu, and the same pattern was observed for the orthologous Rca1 sequences of the B and D genomes, the latter also clustering with the Rca1 of Aegilops. As expected, Hordeum and Brachypodium had the most dissimilar sequences. The same pattern was observed for the second cluster, which grouped 12 Rca2 sequences: Triticum orthologues belonging to the same genome clustered together, as observed for the Rca1 CDS. Interestingly, a third, smaller cluster grouped both Rca1 and Rca2 from Sorghum and maize, the only C4 species included in the analysis. One divergent sequence was also found, the Rca2 CDS of rice.
Table 1. Ensembl entries of the Rca1 and Rca2 genes retrieved in Triticum aestivum, Triticum durum, Triticum dicoccoides, Triticum urartu, Aegilops tauschii, Hordeum vulgare, Arabidopsis thaliana, Brachypodium distachyon, Oryza sativa, Sorghum bicolor and Zea mays (EnsemblPlants website: http://plants.ensembl.org/).
Genetic and Physical Mapping of Wheat Rca Gene Sequences
The ECOTILLING approach requires treatment of the amplified DNA with CEL I endonuclease, or any of a number of single-strand-specific endonucleases, after heteroduplex formation between the lines under investigation. Surveyor nuclease cleaves with high specificity at the 3′ side of any mismatch site in both DNA strands, including all base substitutions and insertions/deletions of up to at least 12 nucleotides. Treatment of all amplicons for each Rca gene allowed the identification of a mismatch in the Rca1 sequence; specifically, a SNP was identified within the second exon of the Rca1-4A gene. The T/C SNP identified between the two parental lines, 02-5B-318 (C) and Saragolla (T), was mapped in the RIL mapping population and localized at 123.9 cM (Fig. 3). Analysis of the SNP in the predicted mature protein showed that the polymorphism results in an amino acid substitution at position 260, from leucine to phenylalanine (L to F; C/T). Unfortunately, no polymorphism was detected within the Rca2 gene sequence. The SNP of Rca1 projected onto the Svevo genome mapped at 37.7 cM, at physical position 442,230,162 bp. The meta-QTL analysis conducted by Maccaferri et al. (2019) identified 14 different QTLs in the Rca gene region, most of which were related to yield traits, while two were found to be involved in resistance to Fusarium graminearum and leaf rust. On this basis, a new QTL analysis for Fusarium resistance was conducted using the same phenotypic data and genetic map as Giancaspro et al. (2016), with the addition of the Rca1 marker data. The QTL analysis confirmed the presence of a QTL for FHB resistance coincident with the Rca gene, with a LOD of 3.
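The leucine-to-phenylalanine substitution reported for the C/T SNP can be checked against the standard genetic code. The sketch below lists every leucine codon in which a single C-to-T change yields a phenylalanine codon. It assumes the standard genetic code and that the alleles are read on the coding strand; the actual codon at position 260 is not given in the text and is not inferred here.

```python
from itertools import product

# Standard genetic code in the conventional T, C, A, G order
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def c_to_t_leu_to_phe():
    """List (Leu codon, Phe codon, codon position) for single C->T changes."""
    hits = []
    for codon, aa in CODON_TABLE.items():
        if aa != "L":
            continue
        for pos, base in enumerate(codon):
            if base != "C":
                continue
            mutant = codon[:pos] + "T" + codon[pos + 1:]
            if CODON_TABLE[mutant] == "F":
                hits.append((codon, mutant, pos + 1))
    return hits


for leu, phe, pos in c_to_t_leu_to_phe():
    print(f"{leu} (Leu) -> {phe} (Phe), C>T at codon position {pos}")
```

Only CTT and CTC qualify, in both cases through a change at the first codon position, which is compatible with the C allele encoding leucine and the T allele encoding phenylalanine.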
Expression Profile of Rca Genes in Wheat
Using the genome browser for the ‘Svevo’ reference genome (https://d-gbrowse.interomics.eu/gb2/gbrowse/Svevo/) and the RNA-seq data available at http://www.wheat-expression.com/ (Borrill et al. 2016), we carried out an in silico gene expression analysis to identify the tissues and phenological stages in which Rca1 gene transcripts were most abundant.
In addition, the analysis was used to relate gene expression to biotic stress conditions and to determine when expression was highest during plant development.
Rca1-4A expression was highest in leaves (including the flag leaf), followed by roots and spikes. Across developmental stages and leaf types, the highest levels of Rca1-4A expression were reported at the seedling, three-leaf and reproductive stages and during grain filling.
With regard to stress response, the wheat Rca1-4A gene was particularly expressed during Zymoseptoria tritici, stripe rust, powdery mildew and Fusarium infections (Fig. 4), consistent with the results of the QTL analysis for FHB resistance.
The expression analysis of Rca1-4A under abiotic stress covered drought stress, heat stress, combined drought and heat stress, water stress, chitin addition, and PEG 6000 treatment to simulate drought and cold stress.
The homoeologous gene Rca1-4B was highly expressed in leaves, while lower levels were detected in roots and spikes. Comparing the expression of Rca1-4A and Rca1-4B under stress conditions, the latter showed a higher expression level under powdery mildew infection. Overall, the Rca genes located on chromosome 4B appeared to be more highly expressed than their 4A homoeologues.