Teleosts are the largest group of vertebrates, with more than 35,000 species described (Fricke et al., 2020) and an undoubtedly associated disparity of forms and functions. About 20 years ago, the first complete genome of a teleost, the Japanese pufferfish (Takifugu rubripes), was sequenced (Aparicio et al., 2002). Since then, the number of sequenced fish species has continuously increased, largely because of advances in sequencing technologies and assembly algorithms (Ravi & Venkatesh, 2018). According to NCBI, as of August 2023 the genomes of only about 3% (1071) of all fish species had been sequenced. However, these numbers will soon be outdated owing to large-scale fish genome sequencing efforts. For example, the 10,000 Fish Genomes Project (Fish10K) (Fan et al., 2020) aims to obtain reference genomes from representative fish species and has recently begun using long-read sequencing and Hi-C technology for at least one representative species from every family to obtain better-quality reference genomes. The information obtained from Fish10K, together with genome sequences from other laboratories (such as the one presented in this study), will help to improve breeding programs, promote sustainable aquaculture, and support genome editing and genomic selection (Lu & Luo, 2020).
Using next-generation sequencing (NGS) technologies, we were able to sequence, assemble, and annotate the genome of a non-model species, the Black flounder. Genome annotation identified 25,231 protein-coding genes, more than the 21,787 protein-coding genes described in Japanese flounder (Shao et al., 2017). The estimated genome size of Black flounder is ~ 538 Mb, similar to the genome sizes of other flatfish species: Japanese flounder (534 Mb) (Shao et al., 2017), Turbot (568 Mb) (Figueras et al., 2016), and Spotted halibut (556 Mb) (Zhao et al., 2021). It is ~ 15% bigger than the tongue sole genome (477 Mb) (Chen et al., 2014) and ~ 11% smaller than the Senegalese sole genome (~ 612 Mb) (Manchado et al., 2016). In addition, another 8 flatfish species have genome sizes ranging from 399.64 Mb in Ocellated flounder to 643.91 Mb in Japanese flounder (Lü et al., 2021). Before a genome size estimate is considered final, it is recommended to assess genome quality with multiple complementary methods (Gurevich et al., 2013). In this study, we assessed the N50 sizes of contigs and scaffolds and the completeness of the genome using KOG and BUSCO. In turbot, quantitative assessment of assembly completeness with BUSCO showed a high percentage of conserved orthologs (Xu et al., 2020). Here, a high BUSCO value (94.7%) was obtained for the Actinopterygii gene set using the combined assemblies, which were used to compensate for the lack of long reads. Future transcriptome and long-read sequencing studies are therefore required to provide a more comprehensive, comparative, and quantitative overview of the level of completeness achieved.
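As an illustration of the N50 metric mentioned above, the following minimal Python sketch computes N50 from a list of contig or scaffold lengths. The function name `n50` and the example lengths are hypothetical and are not taken from the Black flounder assembly; the tools actually used for assembly assessment may compute this statistic with additional options.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembly length.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must not be empty")


# Hypothetical contig lengths (in bp), for illustration only.
example_lengths = [100, 80, 60, 40, 20]
print(n50(example_lengths))
```

With these example values the total length is 300 bp, and the two longest sequences (100 + 80) already cover half of it, so the reported N50 is the length of the second sequence.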
The comparative analysis of homologous genes revealed that 1234 putative orthologous groups were exclusive to the three flatfish families (Scophthalmidae, Cynoglossidae, and Paralichthyidae) (Fig. 3). Within these clusters, several semantically similar GO terms correspond to features that are key for adaptation to a benthic lifestyle (Supplementary Table S2). It is important to note that a recent comparative genome analysis examined the origins of flatfish body structure, among other important features, from an evolutionary perspective using the genomes of eight new species from fourteen families of Pleuronectiformes (Lü et al., 2021), and we identified some gene clusters shared by Pleuronectiformes that are associated with these features. However, more comprehensive comparative analyses including the South American Black flounder can be performed once those annotated genomes become publicly available.
In this study, we performed a comparative analysis of 27 fish species from 10 orders to understand why Black flounder has a relatively small genome. Genome size varies from very small genomes, as in Tetraodon nigroviridis (~ 350 Mb) (Neafsey & Palumbi, 2003), to large genomes, as in Salmo salar (2967 Mb) (Yuan et al., 2018). As expected, the comparative analysis revealed that Black flounder has one of the smallest genomes among the compared species, which is consistent with previous studies in which flatfish genome sizes were among the smallest of all teleosts (Figueras et al., 2016; Xu et al., 2020). This observation is also consistent with the analysis of fish C-values (Figs. 4 and 5). Indeed, the genomes of B. splendens (Anabantiformes), G. aculeatus (Perciformes, suborder Gasterosteiformes), C. semilaevis (Pleuronectiformes), T. rubripes, and T. nigroviridis (Tetraodontiformes) in the selected data set belong to groups with typically small genome sizes according to the average C-value (Fig. 4). In addition, several C-values of Pleuronectiformes are within the 10th percentile of the distribution, especially in the families Paralichthyidae (including Black flounder), Rhombosoleidae, and Pleuronectidae. The large genome size in salmonids could be explained by a lineage-specific round of whole-genome duplication (called 4R) (Lien et al., 2016). Although genome size diversity in teleosts could be one cause of the tremendous diversity of morphology, ecology, and behavior in this group (Volff, 2005), the origin of this variation is still unknown.
Genome size is not only affected by the teleost-specific rounds of whole-genome duplication (3R and 4R) (Meyer & Van de Peer, 2005). Other genomic features have also been related to variation in genome size. It is well known that changes in the proportion of repetitive elements (REs) can modify the size of the genome (Yuan et al., 2018; Canapa et al., 2015). In addition to changes in REs, the numbers of exons and introns could also explain changes in genome size in teleost fish (Jakt & Johansen, 2022).
Most of these REs derive from specific sequences that replicate and move through the genome, called transposable elements (TEs) (Bourque, 2009). The Black flounder genome, with ∼5% TEs, is consistent with the compact genomes observed in other flatfish and contains slightly more TEs than the 3% found in pufferfish, another species with an extremely small genome (Aparicio et al., 2002). Consequently, the results of this study are consistent with the generally accepted view that variation in genome size is related to the amount of repetitive DNA in eukaryotic species (Kidwell, 2002), which has also been demonstrated in teleost fishes (Chalopin et al., 2015). The abundance of TEs varies with genome size and position in the fish tree of life (Shao et al., 2019). In this work, we compared the proportion of REs (Fig. 1), namely DNA transposons, long and short interspersed nuclear elements (LINEs and SINEs, respectively), and long terminal repeats (LTRs), in 27 fish species. In Black flounder, and in the order Pleuronectiformes in general, the proportion of all REs was low, which is probably one of the reasons for the small size of the Black flounder genome. This study also revealed an extraordinary abundance of TEs in zebrafish (Fig. 2). It is particularly surprising that, despite this large number of TEs, the zebrafish genome itself is not exceptionally large (Fig. 2). Our data do not provide a clear explanation for the lack of correspondence between the proportion of TEs and the total genome size in this particular species, and further studies are needed to shed light on this intriguing phenomenon.
As mentioned before, another feature that could modify genome size is the number and size of exons and introns. To examine this, the coordinates of all exons and introns in the genomes of twenty-seven fish species, including P. orbignyanus, were determined. The results showed that P. orbignyanus has smaller gene sizes than all other fish species studied, including groups with small genomes such as the Tetraodontiformes (Brainerd et al., 2001) and some Perciformes (Reid et al., 2021). Flow cytometric analyses revealed that in four pufferfish species of the family Tetraodontidae, genome size varies between 0.38 and 0.82 pg, whereas genomes in the sister family Diodontidae are larger (0.8–1 pg), likely due to DNA loss as the two families diverged during evolution (Noleto et al., 2009). This is consistent with our study, which showed that Pleuronectiformes generally have small genomes (0.3–1.1 pg), among the smallest of all teleost fishes. Since T. nigroviridis also shows a trend towards smaller gene sizes, this may represent one of the possible explanations for genome shrinkage.
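The relationship between exon coordinates, intron coordinates, and gene size can be sketched as follows. This is a minimal Python illustration, not the pipeline used in this study: it assumes 1-based, inclusive exon coordinates for a single transcript (as in GFF-style annotations), and the function names and example coordinates are hypothetical.

```python
def introns_from_exons(exons):
    """Derive intron intervals from the exons of one transcript.

    `exons` is a list of (start, end) tuples, 1-based and inclusive.
    Introns are the gaps between consecutive exons.
    """
    ordered = sorted(exons)
    introns = []
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        if next_start > prev_end + 1:  # skip adjacent or overlapping exons
            introns.append((prev_end + 1, next_start - 1))
    return introns


def interval_lengths(intervals):
    """Length in bp of each 1-based, inclusive interval."""
    return [end - start + 1 for start, end in intervals]


def gene_span(exons):
    """Genomic span from the first exon start to the last exon end."""
    return max(end for _, end in exons) - min(start for start, _ in exons) + 1


# Hypothetical three-exon transcript, for illustration only.
example_exons = [(100, 200), (301, 400), (1001, 1100)]
example_introns = introns_from_exons(example_exons)
print(example_introns)                    # intron intervals
print(interval_lengths(example_introns))  # intron sizes in bp
print(gene_span(example_exons))           # gene size in bp
```

In this example, the gene span equals the summed exon and intron lengths, which is why a shift in intron sizes translates directly into a change in gene size.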
In eukaryotes, intronic DNA is the major component of genes and genomes and plays a key role in gene regulation, and intron size is important from an evolutionary perspective (Zhang & Edwards, 2012). In teleosts, genome size and intron size are closely related, with intron size reflecting genome size (Jakt et al., 2022). To further investigate the origin of the small genome size of Black flounder, we examined the size distributions of introns and exons for 27 species in 10 fish orders. Our analyses show that, as expected, the mean and median exon sizes are smaller than intron sizes, suggesting a more compact distribution, as observed in other Pleuronectiformes species (Robledo et al., 2017). However, we detected a difference in the distribution of intron size in the Black flounder genome, particularly a lower number of both very large and small introns (Fig. 7 and Supplementary Fig. 4). In other species with small genomes (such as T. nigroviridis and T. rubripes), the distribution of intron size indicates smaller introns overall, suggesting that this may be a mechanism leading to genome size reduction in those species. In P. orbignyanus, by contrast, the low content of small and very large introns could explain the gene size reduction observed in this species. Regarding exon size, as reported in previous studies, no remarkable differences in exon size distribution were detected between species (Li et al., 2017). Considering these results, we conclude that intron size may have an impact on gene size, and consequently on genome shrinkage, in Black flounder, particularly in association with a lower content of very large and small introns. Further comparative analysis with genomes annotated at the chromosome level among Pleuronectiformes and other fish species with remarkably small or large genomes (such as T. nigroviridis and salmon) may help to shed light on other mechanisms involved, such as the size of intergenic regions.
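One simple way to compare the content of small and very large introns between species is to bin intron lengths into size classes and report the proportion in each class. The Python sketch below illustrates the idea; the thresholds `small_max` and `large_min`, the function name, and the example lengths are hypothetical assumptions chosen for illustration and are not the cut-offs used in this study.

```python
from collections import Counter


def intron_size_classes(intron_lengths, small_max=100, large_min=10_000):
    """Return the proportion of introns in each size class.

    Thresholds are illustrative assumptions: introns of length
    <= small_max are 'small', those >= large_min are 'very_large',
    and everything in between is 'intermediate'.
    """
    if not intron_lengths:
        raise ValueError("intron_lengths must not be empty")
    counts = Counter()
    for length in intron_lengths:
        if length <= small_max:
            counts["small"] += 1
        elif length >= large_min:
            counts["very_large"] += 1
        else:
            counts["intermediate"] += 1
    total = len(intron_lengths)
    return {
        name: counts[name] / total
        for name in ("small", "intermediate", "very_large")
    }


# Hypothetical intron lengths (bp) for one species, for illustration only.
example = [50, 500, 700, 20_000]
print(intron_size_classes(example))
```

Computing these proportions per species, rather than comparing only means or medians, makes it possible to see differences at both tails of the intron size distribution.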
One implication of changes in gene structure, such as a reduction in intron size, is a possible effect on the alternative splicing (AS) of genes. AS is an essential mechanism that plays a key role in cellular differentiation and organism development (Wang et al., 2015). In teleosts, lower AS frequencies have been observed in highly duplicated genomes (e.g., zebrafish) and higher frequencies in compact genomes (e.g., pufferfish). This inverse correlation between AS frequency and genome size appears to hold across fish species (Lu et al., 2010). The present study opens a new line of research that asks whether smaller introns could affect the AS mechanism (Mechaly et al., 2009; 2011; 2018). A large and constantly increasing amount of data from transcriptome analyses and fish genomes is now available. This resource is extremely helpful for conducting comparative genome studies and for investigating the potential correlation between alternative splicing and genome size through RNA-seq analysis.
In the last decade, several genomic, proteomic, and metabolomic studies have focused on characterizing the reproduction, development, nutrition, immunity, and toxicology of flatfish (Cerdà et al., 2008; 2010; Forné et al., 2010). Flatfish genomics is important for studying the management of wild fish populations, improving fish conservation, and increasing productivity in aquaculture (Cerdà et al., 2010). In the last two years, long-read sequencing technologies have been applied to several Pleuronectiformes species, allowing the assembly of chromosomes (Guerrero-Cózar et al., 2021; Jasonowicz et al., 2022; Lü et al., 2021; Martínez et al., 2021; Zhao et al., 2021). High-quality reference genomes are important for studying evolutionary variation in fish genome structure and organization (Varadharajan et al., 2019). Moreover, P. orbignyanus is considered ‘data deficient’ in the IUCN Red List of Threatened Species (Riestra et al., 2020), and thus genomic-level data on genetic variation can become the starting point for future studies of the population genetic structure of this species. This technology will soon allow us to perform new comparative and in-depth analyses. For example, one important question that remains to be addressed is whether there is a general sex-determining gene (locus) in fish, and specifically in flatfish. In a recent study, Ferchaud et al. (2021) suggested SRY-Box Transcription Factor 2 (Sox2) as a possible candidate not only in Greenland halibut but also in other flatfishes.
Other gene candidates include follicle stimulating hormone receptor (fshr) in Senegalese sole (de la Herrán et al., 2023), bone morphogenetic protein receptor type-1B (bmpr1ba) in Hippoglossus stenolepis (Jasonowicz et al., 2022), Forkhead box L2 (Foxl2) and Doublesex and mab-3 related transcription factor 1 (dmrt1) in Japanese flounder (Paralichthys olivaceus) (Shu et al., 2022), and Gonadal soma-derived factor (gsdf) in Atlantic Halibut (Hippoglossus hippoglossus) (Palaiokostas et al., 2013; Einfeldt et al., 2021). More recently, another study in Japanese flounder, carried out with amh-mutant flounders created using CRISPR-Cas9 technology, showed that amhy is necessary for testicular formation in this species (Hattori et al., 2022). In this study, as a preliminary analysis of sex determination (SD) genes in Black flounder, we searched for and found in our genome all the genes that have been proposed as SD gene markers in flatfish, as mentioned above (see Table 4). Since all genes related to sex determination in flatfish were found in the genome of Black flounder, we suggest that the first attempts to study the mechanism of sex determination in this species should focus on the closest species, e.g., by investigating the role of dmrt1 and amhy, both of which have been studied in Japanese flounder, a species of the same order.
Table 4
Summary of sex determination (SD) genes described in flatfish
Species (Common name) | Gene name (Abbreviation) | Physiological role in fish reproduction | References |
Reinhardtius hippoglossoides (Greenland Halibut) | SRY-Box Transcription Factor 2 (sox2) | role in sex determination and differentiation | Ferchaud et al., 2021 |
Solea senegalensis (Senegalese sole) | Follicle stimulating hormone receptor (fshr) | role in folliculogenesis | De la Herrán et al., 2023 |
Hippoglossus stenolepis (Pacific Halibut) | bone morphogenetic protein receptor type-1B (bmpr1ba) | potential candidate master sex-determining gene | Jasonowicz et al., 2021 |
Paralichthys olivaceus (Japanese flounder) | Forkhead box L2 (foxl2) | role in ovarian differentiation | Shu et al., 2021 |
| A male-specific duplication of anti-Müllerian hormone (amhy) | role in sex determination and differentiation | Hattori et al., 2021 |
| Doublesex and mab-3 related transcription factor 1 (dmrt1) | roles in sex determination and neural development | Shu et al., 2021 |
Hippoglossus hippoglossus (Atlantic Halibut) | Gonadal soma-derived factor (gsdf) | role in testicular differentiation | Palaiokostas et al., 2013; Einfeldt et al., 2021 |
Based on the analyzed features, we conclude that the most important components that could be responsible for the reduced flounder genome are (i) the low frequency of repetitive elements and (ii) the overall reduced gene size, which may be associated mainly with (iii) the lower number of both very large and small introns, with possible implications for AS. The last two components (ii and iii) have lower values than in other species with similar C-values, suggesting that this may be a novel genome reduction strategy.
In summary, in this study we generated a genome assembly of Black flounder (Paralichthys orbignyanus). The results show a reduced genome size and a low frequency of REs in this flounder, and our analyses indicate that the reduced genome size is related to reduced gene locus size resulting from a lower number of very large and small introns. Finally, our study also provides a basis for investigating a strategy of genome reduction in teleost fishes.