Differential accumulation of repetitive DNA families on Capsicum genomes
TEs can move through genomes and thus act as an evolutionary force that modifies genome structure through different mechanisms, such as illegitimate recombination, gene capture, shuffling of regulatory motifs, and the generation of new functions or silencing (see [35]). Ultimately, TEs may cause polymorphism in the global structure of genomes and fluctuations in DNA C-values [10]. TEs have been useful for comparing genomes and karyotypes in evolutionary studies as well as in applied approaches, such as in grape and blood orange, where the insertion of TEs near genes changes gene expression [36].
The set of TEs, called the "Mobilome", occupies the largest portion of plant genomes and plays an important role in the physical and functional aspects of chromosomal structures, as exemplified by the CR family (centromeric retrotransposons), which is associated with chromosomal kinetics [37, 38]. In some monocotyledons, for instance, the Mobilome represents around 75% of the genome, reaching 80% in maize [6], and the LTR-RTs are the most dynamic elements in them [39, 7]. In Nicotiana attenuata and N. obtusifolia, for example, LTR retrotransposons reach up to 81% and 64% of the genomes, respectively [40]. Knowledge of how LTR-RTs are distributed in the chromosomal landscape can help us understand the regulatory potential of TEs along chromosomes and may eventually be applied in crop breeding programs, such as those for peppers. A good example is maize, in which several TE families located near genes can act as enhancers or repressors under stressful conditions [41, 42].
Previous studies addressing TEs in the Solanaceae family have shown that Gypsy elements are more frequent than Copia superfamily members in Solanum lycopersicum [43], although an approach using only autonomous elements showed a predominance of the Tekay/Del (Gypsy) and Tork (Copia) families in this species [44]. This is interesting because it shows that estimates can vary depending on whether only autonomous elements are considered or sequences of non-autonomous elements are also included. Our results indicate that only 0.002% of the C. annuum dataset corresponds to autonomous elements, compared with 0.001% in C. chinense and 0.004% in C. baccatum. The remaining sequences correspond to non-autonomous elements. It is important to point out that the three genomic datasets were obtained by high-coverage sequencing [17, 29], which may support the assembly of complete elements in the pseudochromosomes. According to Lisch [36], the major part of the coding repetitive fraction consists of fragments of non-autonomous elements, which may be amplified by the activity of autonomous elements. This could explain the high percentage of LTR-RT fragments in these three datasets.
According to Qin et al. [29], Gypsy members were the most abundant LTR retrotransposons in Capsicum, with the highest insertion activity among Solanaceae species. When we extended this comparison to the lower-level clades of Gypsy and Copia in Capsicum annuum (89% of LTR-RTs), C. chinense (98%) and C. baccatum (70%), a predominance of the Gypsy (>70%) over the Copia superfamily (<10%) was evident, as well as a contrasting accumulation of the Tekay/Del, Athila/Tat and CRM families of Gypsy in these three genomes. These data point toward the importance of the fate of LTR-RTs in the process of genome organization and differentiation between related species, even considering that C. annuum and C. chinense (Annuum clade) are more closely related to each other than to C. baccatum (Baccatum clade) [45]. These three species also differ in the accumulation of 35S rDNA, with about 20% more sequences in C. baccatum than in C. annuum and C. chinense, and in the number of rDNA sites [25]. These genomic differences may contribute to the difficulties in performing interspecific crosses between species of distinct clades because of pre- and post-zygotic barriers, as reported by Manzur et al. [46] and Cremona et al. [47].
The differential activity of retrotransposons among closely related genomes was also reported in Helianthus [48] and Solanum [49, 50], and those results agree with the ones obtained here for Capsicum. Our data also show differences between C. annuum and C. chinense, especially in the accumulation of Tekay/Del, ERVs and Line-RTE, suggesting that elements other than LTR-RTs also evolve independently. The differential accumulation of the Tnt1 retrotransposon in Nicotiana may be a good example for comparison and may support the idea of an independent fate of TEs during genome differentiation [51, 52].
Recovered Del, CRM and Athila/Tat autonomous elements support the Gypsy LTR-RTs’ predominance
The ability of retrotransposons to become activated and invade plant genomes may be associated with internal and external factors, such as biotic and abiotic stresses, breeding processes, injuries, climatic changes, polyploidization, hybridization and other events (see [35, 7]). However, the activation and proliferation of TEs also depend on their ability to evade cellular silencing controls [53], and only autonomous elements containing the complete and functional enzymatic machinery can do that. According to the criteria of Kumar and Bennetzen [54], autonomous elements are those that have a complete polygenic chain, regulators and both LTRs. The absence of any of these regions qualifies an element as non-autonomous. Following these criteria, the difference observed between potentially autonomous sequences in the three Capsicum datasets and those considered non-autonomous was 87.27%. This suggests that these repetitive element classes may have undergone different degeneration events during genome differentiation.
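The classification rule described above can be illustrated with a minimal Python sketch. It assumes that each annotated element has already been scored for the presence of its structural regions; the record type, field names and example entries are hypothetical and serve only as an illustration, not as the annotation pipeline used in this study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LtrElement:
    """Hypothetical annotation record for one LTR-RT copy."""
    name: str
    has_5prime_ltr: bool
    has_3prime_ltr: bool
    has_complete_polygenic_chain: bool
    has_regulators: bool


def is_putatively_autonomous(element: LtrElement) -> bool:
    # An element is kept as putatively autonomous only if every
    # required region is present; a single missing region makes it
    # non-autonomous.
    return all((
        element.has_5prime_ltr,
        element.has_3prime_ltr,
        element.has_complete_polygenic_chain,
        element.has_regulators,
    ))


def count_by_class(elements: list[LtrElement]) -> dict[str, int]:
    # Tally putatively autonomous versus non-autonomous copies.
    autonomous = sum(1 for e in elements if is_putatively_autonomous(e))
    return {
        "autonomous": autonomous,
        "non_autonomous": len(elements) - autonomous,
    }


# Invented example records, for illustration only.
examples = [
    LtrElement("complete_copy", True, True, True, True),
    LtrElement("solo_ltr_like", True, False, False, False),
    LtrElement("truncated_coding", True, True, False, True),
]
print(count_by_class(examples))
```

Treating the criteria as a strict conjunction mirrors the statement that the absence of any region is sufficient to classify a copy as non-autonomous.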
The putative autonomous elements recovered in our analysis, i.e., ten sequences of Tekay/Del, two sequences of Athila/Tat of C. annuum and seven of CRM, were quantitatively different in each dataset, supporting the idea of the independent fate of TEs among genomes [55]. This independent fate of retrotransposons is exemplified by the occurrence of some exclusive Tekay/Del elements in C. baccatum against the three in C. annuum and C. chinense, as well as by the thirty-fold difference in the amount of CRM in C. baccatum in relation to C. annuum and C. chinense. This is in accordance with the report of Hawkins et al. [56], which suggests that in Gossypium species, different lineages of LTR-RTs evolved at different moments in the evolutionary history of the genome, generating a threefold difference in DNA content among diploid species. In another example, De Castro Nunes et al. [38] observed a greater accumulation of CRM copies in the diploid Coffea species than in the hybrid tetraploid Coffea arabica.
Not all LTR-RT rich regions in Capsicum chromosomes are heterochromatin hotspots
There is a widespread idea in the literature that TEs, especially LTR-RT superfamilies, occupy “specific” chromosomal regions, with general agreement that Copia elements are preferentially distributed along the chromosomes in association with euchromatin, while Gypsy elements reside in heterochromatin-rich regions (see [7]). In Coffea, Brachiaria and Secale, for example, Gypsy probes were located in proximal heterochromatin-rich chromosome regions [57–59], but in Gossypium species, Gypsy probes hybridized along the chromosomes [60]. However, when the elements are considered according to their phylogenetic positions, i.e., families of Copia and Gypsy [61, 5, 7], it becomes evident that there are many differences in the TE distribution profiles, in both plants (see [10, 62]) and animals [63, 64]. Thus, it seems more reasonable to assume that each element has its own characteristics, including chromosomal positioning, genome impact, epigenetic influence, diversification rate and other features.
Previous studies using FISH in Capsicum spp. have been restricted to rDNA probes [25], which demonstrated that C. baccatum accumulates more terminal 35S rDNA sites than C. annuum and C. chinense, which exhibited just two to four pairs. This is interesting because the reports of Moscone et al. [21, 65], Scaldaferro et al. [24] and Martins et al. [27] have shown wide variability in the presence of terminal, interstitial and proximal heterochromatic bands in Capsicum species, such as the large and minor heterochromatic terminal bands in C. annuum, C. chinense and C. baccatum observed here. The FISH results using different LTR-RT probes showed hybridization signals accumulated from proximal (CRM) to interstitial regions (Athila/Tat and Tekay/Del), scattered, or as minor dots along chromosomes (Tekay/Del, Oryco and Tork), as in Brachiaria [59]. However, in no case were preferential accumulation or strong signals found in terminal chromosome regions, suggesting that LTR-RT sequences do not accumulate in regions containing rDNA or terminal heterochromatin in Capsicum chromosomes. This differs from what Balint-Kurti et al. [66] found in Musa, where the Copia-Monkey retrotransposons appeared accumulated in the NOR.
FISH with the Athila/Tat probe showed strong hybridization at proximal to interstitial regions in almost all chromosomes, and in this case there was also no evident co-localization with heterochromatic regions; although small AT- and GC-rich minor bands were present in very few chromosomes, they showed no evident correlation with heterochromatin hotspots. Likewise, the scattered signals (or dots) observed after FISH with the Tekay/Del and Oryco probes agree with the concept of dispersed localization of retroelements within plant genomes, without dependence on co-localization with heterochromatin blocks. This Athila/Tat dispersion pattern, such as interstitial dots, has been described by Park et al. [67] in C. annuum. In Passiflora edulis, for comparison, members of the Ty3/Gypsy superfamily were the most accumulated, and their sequences appeared scattered along chromosomes, including the pericentromeric regions [68]. In some Solanaceae species, such as tomato and peppers, elements of the Tekay/Del family (Gypsy superfamily) had a scattered accumulation profile, as reported by Park et al. [68], with hybridization in chromosomes of Solanum lycopersicum (tomato) and Capsicum annuum (pepper), where pepper showed more numerous and more intense signals than tomato.
Unlike the other LTR-RT probes, the CRM probe exhibited intense signals in the proximal regions, associated with centromeres, and in Capsicum chromosomes these regions were rich in CMA+ and DAPI+ signals. The Centromeric Retrotransposon lineage of Chromovirus (CRM, or centromeric retrotransposon of maize) is thus a notable exception, occurring preferentially in proximal chromosome regions. CRMs carry particular domains called chromodomains (CHRomatin Organization MOdifier DOMAIN) and CR motifs that have the potential to interact with the CENH3 centromeric protein and to participate in centromere function [69, 37]. Centromeric FISH signals using CRM family probes have been described in several plant species, for example in some monocotyledon groups [70, 71, 59], suggesting that, besides association with specific centromeric proteins, this accumulation may also be associated with recombination-poor regions.