Detection of subgenome bias using an anchored syntenic approach in Eleusine coracana (finger millet)

doi:10.21203/rs.3.rs-20447/v1


Research article


https://doi.org/10.21203/rs.3.rs-20447/v1

This work is licensed under a CC BY 4.0 License

Journal Publication

published 12 Mar, 2021

Published version: BMC Genomics

You are reading this older preprint version


Background: Finger millet (Eleusine coracana, 2n=4x=36) is a hardy, nutraceutical, climate change tolerant orphan crop that is consumed throughout eastern Africa and India. Its genome has been sequenced multiple times, but the A and B subgenomes could not be separated because no published genome for E. indica existed. The classification of A and B subgenomes is important for understanding the evolution of this crop and provides a means to improve current and future breeding programs.

Results: We produced subgenome calls for 704 syntenic blocks and inferred A or B subgenomic identity for 59,377 genes, 81% of the annotated genes. Phylogenetic analysis of a supermatrix containing 455 genes shows high support for A and B divergence within the genus Eleusine. Synonymous substitution rates between A and B genes support the A and B calls. The repetitive content on highly supported B contigs is higher than that on comparable A contigs. Analysis of syntenic singletons showed evidence of biased fractionation with a pattern of A genome dominance (61% A, 38% B and 1% unassigned), which was further supported by the pattern of loss observed among cytonuclear interacting genes. Examination of expression within the circadian rhythm pathway suggests an A subgenomic preference.

Conclusion: The evidence from individual gene calls within each syntenic block provides a powerful tool for inferring subgenome classification. Our results show the utility of a draft genome in resolving A and B subgenome calls; primarily, it allows for the proper polarization of A and B syntenic blocks. There have been multiple calls for the use of phylogenetic inference in subgenome classification, and our use of synteny is a practical application in a system that has only one parental genome available.

Epigenetics & Genomics

Eleusine coracana

finger millet

Eleusine indica

subgenome

allotetraploid

Eleusine coracana (finger millet) is an important small-seeded cereal crop in its native Africa and South Asia [1, 2]. It is believed to be the product of an allopolyploid hybridization between E. indica and another, likely extinct, species [3–7]. Eleusine indica is consistently identified as an A genome donor [3, 4]; however, based on the strength of plastid phylogenetic analysis, some have suggested that E. coracana is the result of multiple hybridization events between the B genome donor, E. indica and E. tristachya [6, 7].

The allopolyploid speciation event of Eleusine coracana is potentially much older than most crop origin scenarios, occurring 1.4 mya according to molecular clock estimates of plastid gene markers (ndhA intron, ndhF, rps16-trnK, rps16 intron, rps3, and rpl32-trnL) [6]. It has been hypothesized that the original allopolyploidy was also the point of origin of the wild species E. africana (with an identical 2n=4x=36 chromosome number) that underwent domestication to form the crop species E. coracana, and evidence in support of this hypothesis is largely based on the phylogenetic analysis of small sets of single copy nuclear genes (e.g. waxy) [3, 6, 7], ITS and plastid markers [4, 8, 9], or cytogenetic methods [5]. To our knowledge, it has yet to be tested on a genomic scale.

Long considered an orphan crop [10], Eleusine coracana is attracting growing interest [2]. Two genome projects [11, 12] have published results recently, including a scaffold-length assembly resolving many homeologs [12]. These assemblies provide foundational resources for interpreting past studies of gene function [13] and for understanding the origin and evolution of the A and B genomes. The key to unlocking these valuable resources is the genomic characterization of the most likely A genome donor, E. indica, because it enables the separation of the subgenomes of a phased E. coracana assembly [12].

Assigning identity to phased polyploid assemblies is still time consuming and resource intensive, even with access to advanced sequencing methods such as Single Molecule Real-Time sequencing [14], nanochannel genome mapping [15] and other approaches (e.g. Hi-C [16]), which readily produce phased genomic assemblies [17]. Subgenomic phasing can be done with or without parental genomes. In the worst case scenario, where the progenitor is extinct or unknown and no sequenced progenitor genome is available, homeologs are binned based on observed intrinsic differences between homeologous copies, such as consistent biased fractionation caused by subgenomic dominance [18, 19] or differences in repeats [20, 21]. In cases where only one parental genome donor is known, the parental sequence is used to assign homeolog identity [3], relying on the assumption that the homeolog least similar to the parent is from the other parent. In the optimal case, where the genome donors are known, subgenomic regions or even transcripts are identified by their similarity to the parents [22–24]. Genome painting has been widely employed, both with probes and through in silico analogs (e.g. the mapping of repetitive sequence from known progenitors to the allopolyploid [20]).

The characterization of subgenomes is the first step in describing the subgenomic dominance that may occur when plants undergo diploidization and a single parental genome is preserved to a greater extent than would be expected by chance. The dominant subgenome is marked by the smaller number of repetitive elements that it contains and by its preferential retention of single copy genes [25]. Dominance emerges as the polyploid returns to diploid status following a whole genome duplication (WGD) event. Consistent patterns of subgenomic dominance and homeolog expression are conserved [26, 27]. Homeolog expression bias often occurs in tandem with subgenomic dominance, but these two phenomena are not inextricably linked [28]. Homeolog expression bias may be precipitated by broad patterns of heterochromatin, by more specific cases of methylation associated with transposable element clusters near down regulated genes, or by novel interactions caused by trans-acting regulatory elements [27]. Down regulated genes may experience relaxed selection, undergo neofunctionalization [29, 30] and cause less of an impact if lost during the process of diploidization [18], or they may experience conserving selection in the presence of processes such as subfunctionalization [31, 32].

Subgenomic dominance may be driven by the effects of cytonuclear interacting genes.

Nuclear encoded cytoplasmic and organellar genes must coordinate expression and optimal macromolecular structure in concert with each other and with organellar encoded genes to maintain normal function [33, 34]. Whole genome duplication events, such as allopolyploidization, may perturb cytonuclear interaction and function through the addition of incongruous copies of nuclear genes. It is believed that a newly formed allopolyploid genome will attempt to retain the antecedent cytonuclear interaction of the maternal progenitor by suppressing the expression of the paternal cytonuclear genes [35]. These new patterns of expression that result from allopolyploidization are likely the basis for the selective advantage gained by polyploids [36].

Homeolog expression bias can cause the emergence of novel phenotypes through the creation of new trans-regulatory interactions among subgenomes, leading to transgressive traits. The circadian rhythm pathway offers a highly conserved network of genes with which to examine homeolog expression bias, because this pathway has a high impact on the rest of the genome and is estimated to control more than 6% of genes. Among these are key genes that influence traits of high economic and evolutionary interest that are tied to yield, such as flowering time (TOC1 (K12127), GI (K12124) [37]), drought response and stomatal conductance (ZTL (K12115), CRY1 (K12118), COP1 (K10143) [38, 39]), and growth rate and starch accumulation (LHY (K12133), CCA1 (K12134) [40, 41]).

Here we phase a previous assembly produced by Hatakeyama et al. [12], calculate synonymous substitution rate (kS) values to examine the evolutionary relationships of the A and B genomes to several members of the genus Eleusine, look for subgenomic bias across single copy cytoplasmic genes, and examine the relationship of subgenomic bias to homeolog expression bias in several previously published transcriptomes.

To identify A and B genes we used a two-step method. First, A and B calls were made using direct gene-to-gene comparison. Gene relationships were determined using CoGe SynFinder. All analyzed genes occurred in a triad of genes: two Eleusine coracana genes linked by a single Eleusine indica gene. The E. coracana gene most similar to the E. indica gene was annotated as A, while the other was annotated as B. In total, 12,296 direct calls were made; 500 triads were left uncalled because both E. coracana copies were equidistant from the E. indica copy. Direct calls were used to infer whether the syntenic block in which they occurred was A or B. If a block had significantly more A or B calls under a chi-square test (p=0.05), it was called A or B respectively; 704 blocks were called (351 A, 353 B) and 116 blocks had no call. Of the uncalled blocks, 76% (88/116) contained fewer than 8 genes; in these cases a single conflicting call was enough to keep the block from being characterized as A or B. Syntenic block calls were used to obtain contig calls. Contig calls were divided into three categories: strong (no uncalled blocks), provisional (the contig contained an uncalled block), and ambiguous (the contig contained multiple uncalled blocks). We identified 12 strong and 16 provisional crossovers, 68 strong and 29 provisional A contigs, and 82 strong and 23 provisional B contigs. Using strong syntenic regions and the surrounding sequence, we inferred subgenome identity for 33% of the 62,347 predicted genes in E. coracana (9,985 A genes and 10,349 B genes).
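The block-level call described above can be illustrated with a minimal Python sketch. It is not the authors' ab_call.py. It assumes a chi-square goodness-of-fit test with one degree of freedom against an equal (50:50) expectation of A and B calls, no continuity correction, and it interprets the reported cutoff of 5 as a minimum number of called genes per block. The function names and the min_genes parameter are hypothetical.

```python
import math


def chi_square_p(a_calls, b_calls):
    """P-value of a 1 d.f. chi-square test against an assumed 50:50 split."""
    n = a_calls + b_calls
    expected = n / 2.0
    stat = ((a_calls - expected) ** 2 + (b_calls - expected) ** 2) / expected
    # Survival function of the chi-square distribution with 1 d.f.
    return math.erfc(math.sqrt(stat / 2.0))


def call_block(a_calls, b_calls, alpha=0.05, min_genes=5):
    """Return 'A', 'B', or None (uncalled) for one syntenic block.

    min_genes is an assumed reading of the 'cutoff value of 5'.
    """
    n = a_calls + b_calls
    if n < min_genes or a_calls == b_calls:
        return None
    if chi_square_p(a_calls, b_calls) >= alpha:
        return None
    return "A" if a_calls > b_calls else "B"


if __name__ == "__main__":
    for a, b in [(7, 0), (6, 1), (7, 1), (1, 7)]:
        print(a, b, round(chi_square_p(a, b), 4), call_block(a, b))
```

Under these assumptions a 6:1 split in a 7 gene block gives a p-value of about 0.059 and stays uncalled, whereas a 7:1 split in an 8 gene block gives about 0.034 and is called. This is consistent with the observation that one conflicting call is enough to leave a block of fewer than 8 genes uncalled.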

Second, we extended A and B calls to a larger set of contigs by mapping reads from A and B repetitive elements to the entire genome and then making an A or B call per region. Repetitive elements were identified for the entire E. coracana genome using RepeatScout with default settings, and its output was used to create a custom database for annotation with RepeatMasker. A subset of A contigs (61) and B contigs (73) was chosen to identify A and B repeat elements. The B genome had a higher density of repetitive elements per sliding 100 kbp window (Fig. 1). A total of 50,416 repetitive elements longer than 200 bp were identified (A 21,481 and B 28,935), spanning 81 Mbp (A 32 Mbp and B 49 Mbp). The longest repetitive elements spanned several thousand base pairs (A 12,578 bp and B 15,440 bp). Of all retro-element families, 898 were represented in both subgenomes by 84,965 entries (A 30,896 and B 54,069), while 180 families were uniquely predicted for one subgenome (A 50 and B 130) with 9,281 entries (A 1,175 and B 8,106). Reads and their pairs that mapped to either A or B repeats were extracted and mapped against the entire genome. High quality coverage was calculated for A and B read mappings in which both reads mapped to the same contig, the insert was less than 1000 bp, and the mapping score was greater than or equal to 30. A sliding window was used to sum all A and B reads mapped to a region of the genome, and region calls were made and aggregated using a custom python script (abPainting.ipynb). Using this method we called 31,543 A genes and 28,483 B genes. When added to existing calls, we made a total of 59,377 unambiguous calls accounting for 81% of the gene annotations (Additional File 1).
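The high quality mapping criteria above can be sketched as a filter over SAM alignment records. This is only an illustration and not the authors' code. It assumes that "mapping score" means the SAM MAPQ field and that "insert" means the absolute value of the SAM TLEN field. The function name is hypothetical, and a full pipeline would apply the check to both mates of each pair.

```python
def is_high_quality_pair(sam_line, max_insert=1000, min_mapq=30):
    """Check one SAM record against the assumed pair-quality criteria."""
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])
    contig = fields[2]
    mapq = int(fields[4])
    mate_contig = fields[6]
    template_len = int(fields[8])

    # Flag bits 0x4 and 0x8 mark an unmapped read and an unmapped mate.
    both_mapped = not (flag & 0x4) and not (flag & 0x8)
    # '=' in RNEXT means the mate maps to the same reference sequence.
    same_contig = mate_contig == "=" or mate_contig == contig
    short_insert = 0 < abs(template_len) < max_insert
    return both_mapped and same_contig and short_insert and mapq >= min_mapq


if __name__ == "__main__":
    record = "r1\t99\tcontig1\t100\t42\t50M\t=\t300\t250\t*\t*"
    print(is_high_quality_pair(record))
```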

We produced the largest Eleusine supermatrix compiled to date, containing 455 genes, to resolve A and B genome relationships within Eleusine coracana (Fig. 2abc). We tested the effects of targeted assembly on Eleusine indica to determine whether it could assemble false B transcripts from a completely A genome. Our results show that some putative B transcripts (31) were assembled, but that they fell in the same clade as E. indica and the A genome (Fig. 2b). This demonstrates that our process of targeted assembly does not create B genomic transcripts as an artifact or unduly bias the transcriptomic assembly process. The successful assembly and inclusion of 31 putative B transcripts suggests that our syntenic approach was not sensitive to cases of gene conversion, which would be expected when designating A and B genes by region. Our phylogenomic analysis reveals that E. indica is indeed sister to the A genome and that Eleusine tristachya is sister to the E. indica - A genome clade, while the B genome is sister to the E. tristachya - A genome clade, further confirming that the B genome did not arise from E. floccifolia [5]. More complete species level sampling is required to determine the precise relationship of E. indica to the Eleusine africana A genome and the E. coracana A genome, since bootstrap values and tree topology vary between the gappy supermatrices (Fig. 2bc) and the ungapped supermatrix (Fig. 2a).

Our genome guided phylogenomic approach established an expected pattern of divergence among subgenomes for kS (Fig. 3). The kS patterns confirm the genome calls made by our methods, in that we can observe the expected splits between Eleusine indica and A, and between E. indica and B. The kS values suggest that there are a small number of mischaracterized genes or instances of gene conversion, which would be expected given the size of the sliding window used to characterize A and B blocks. The comparison between E. indica and the A genome shows a peak at approximately 1.1 mya when using the standard conversion rate of 6.5e-9 substitutions per year [42].
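The conversion from kS to divergence time can be sketched as follows. The text gives only the rate, so the formula used here, T = kS / (2r), is an assumption. It is the common convention in which substitutions accumulate along both diverging lineages. The function names are hypothetical.

```python
RATE_PER_SITE_PER_YEAR = 6.5e-9  # rate quoted in the text [42]


def ks_to_mya(ks, rate=RATE_PER_SITE_PER_YEAR):
    """Divergence time in millions of years, assuming T = kS / (2 * rate)."""
    return ks / (2.0 * rate) / 1e6


def mya_to_ks(mya, rate=RATE_PER_SITE_PER_YEAR):
    """Inverse conversion: the kS expected after `mya` million years."""
    return 2.0 * rate * mya * 1e6


if __name__ == "__main__":
    # kS value that would correspond to a 1.1 mya split under this formula.
    print(mya_to_ks(1.1))
    print(ks_to_mya(mya_to_ks(1.1)))
```

Under this assumed formula, a peak near 1.1 mya corresponds to a kS of roughly 0.014.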

Analysis of repetitive DNA on high confidence contigs using RepeatMasker and a custom repeat library indicated that the B regions contain more repetitive elements per base pair than the A genome. Analysis of repeat family density within a sliding window of 100 kbp was applied to A and B homeologs called using syntenic blocks, and it revealed a significant difference in repeat count density for only one family, LTR_Copia, out of 30 families. Most families exist in clusters of 1-20 per 100,000 bp; DNA_Mule_MudDR is a striking example of a repeat occurring in dense clusters in the A genome, with a maximum of 114 in the sliding window, compared to 13 for the B genome. Several of the Line_L1 elements show an elevated density of single count insertions in the B genome, which contained 1080 windows with a single Line_L1, compared to 823 such windows in the A genome (Additional File 2). When all repeat counts and coverage are taken together, they show that B contains more variation in repeat counts per sliding window and markedly higher coverage: 17.4% of windows from the B contigs had coverage of 75% or greater, compared to 9.64% in A contigs. This suggests that the repetitive elements in B are less well controlled and may be undergoing an expansion relative to A repeats (Fig. 1). This wider coverage of repeats is likely to affect expression levels because of TE induced methylation (Fig. 1).

Expression of A and B genes within the circadian rhythm pathway (ko04712) [43] visibly demonstrated homeolog expression bias with a subgenome preference for A, in line with expression patterns suggested by repeat coverage. PHYA (K12120), LHY, PRR5 (K12130), CSNK2A (K03097), FKF1 (K12116), and COP1 all showed a strong A bias in their expression, while CHE (K16221) was the only gene to show a strong B preference, caused by the loss of the A homeolog (Additional File 3). This loss was confirmed by examination of the region with GEvo using CoGe (https://genomevolution.org/coge last accessed: 3/18/2020). The rest of the genes exhibited a mixture of A and B homeolog expression. There are two interesting cases. First, floral transcriptome sets showed a marked preference for A homeolog expression. Second, comparison of pooled drought and control leaf tissue samples from a single study [12] showed notable shifts from A to B homeolog expression for key genes: ZTL, ELF3 (K12125), CRY1, and LHY. Further examination of individual read sets confirmed A to B switching, when expression was detected, for LHY. The circadian rhythm pathway thus shows a visible preference for A homeolog expression but does exhibit a case of putative A homeolog loss.

When we examined patterns of homeolog loss at a global scale, we detected a strong pattern of biased fractionation favoring A genome homeolog retention. A comparison of syntenic diads created between hard masked assemblies of Setaria italica and E. coracana using CoGe shows that 61% of singletons are A, 38% are B, and 1% are unassigned (A: 1506, B: 925, Unassigned: 21) (Additional File 4,5,6).
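The reported shares follow directly from the singleton counts, as the short sketch below shows. The function name is hypothetical and the code only repeats the arithmetic on the counts given above.

```python
def percent_shares(counts):
    """Convert a mapping of category counts to percentages of the total."""
    total = sum(counts.values())
    return {label: 100.0 * n / total for label, n in counts.items()}


singletons = {"A": 1506, "B": 925, "Unassigned": 21}
shares = percent_shares(singletons)

if __name__ == "__main__":
    for label, pct in shares.items():
        print(f"{label}: {pct:.1f}%")
```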

To determine whether cytoplasmic genes showed appreciable subgenomic bias in their retention, we started with a list of 4,042 genes that were determined to be lost from either the A or B subgenome during an unmasked syntenic analysis. These genes occur in Eleusine indica to Eleusine coracana diads, not the expected E. coracana to E. indica to E. coracana triads. In the initial BLAST search we identified 111 single copy genes with potential cytonuclear interaction. These were further searched against the E. coracana genome using the CoGe (https://genomevolution.org/coge/ last accessed: 3/18/2020) BLAST search, which found that 26 of them had a single hit, while 8 had two hits of which the second had either partial or poor alignment. The remaining 77 genes had two or more hits. Of the 34 genes (26 with a single hit and 8 with two hits), 24 were on the A subgenome and 10 were on the B subgenome (Additional File 7). These results suggest that the rate of gene loss is generally slow, but that genes on the A subgenome are favored for retention.

The 34 genes present in single copy are involved in defense signaling pathways, synthesis of indole-3-acetic acid, RNA interference, heat shock proteins, seed germination, plant growth, formation and repair of the photosystem II supercomplex, protein kinases, and flowering time. Genes of interest that are important for the normal functioning of chloroplasts and mitochondria included Met1, Ribose-5-phosphate isomerase 3 (RPI3), Brassinazole insensitive pale green 2 (BPG2), 3-hydroxyisobutyrate dehydrogenase (HIBADH), and Translocase of outer membrane 34 kDa (TOM34).

Our findings for A and B genome donors concur with past research [5, 8, 44–46]. Synonymous substitution values show evidence of ancient Poaceae genome duplications [47–49], and suggest that Eleusine coracana arose approximately 1.1 mya, at the divergence of Eleusine indica and the A subgenome, around the same time as Eragrostis tef [21]. Our estimate is within the range of previous predictions (0.50-2.7 mya), albeit slightly more recent than the 1.40 mya previously predicted [6].

Past work designating the genome donors of Eleusine coracana has been limited to a few loci [3, 6, 7] or to organellar genomes [46]. Our implementation of a genome guided orthology approach [50] is the most comprehensive treatment of this genus to date and the first to include a sample for each likely B genome donor; Eleusine multiflora and Eleusine jaegerii can be ruled out as genome donors based on chromosome number alone, and Eleusine semisterilis has never been considered a strong contender owing to its morphological divergence [51]. The placement of Eleusine tristachya as sister to the Eleusine indica - A genome clade (Fig. 2abc) is interesting because it opens the possibility that E. tristachya, the sole species of New World origin [52], was the result of an ancient long distance dispersal event. Previous work using plastid markers suggested that E. tristachya was only recently transferred to the New World [6]. The A genome - E. indica clade shows low resolution for the relationship among E. indica, E. coracana and E. africana (Fig. 2abc) in the most complete supermatrices, calling into question the high bootstrap values in the gappy supermatrix (Fig. 2b). Here it appears that there is a positively misleading bias, since the data rich genome based datasets are drawn together in the gappy supermatrix, while they are split in the ungapped supermatrix.

Eleusine coracana still maintains several pairs of copy resistant genes, as shown through BUSCO analysis [12]. Yet we were able to detect a discernible genome wide bias toward A genome retention in our syntenic diad analysis, with 61% of the diads determined to be A and 38% determined to be B. Furthermore, this pattern was upheld among cytonuclear interacting genes, where a total of 34 reversions of cytonuclear genes to a single copy state were heavily biased towards retention in the A subgenome. We found four genes of note retained on the A genome that are instrumental to growth: Met1, RPI3, BPG2, and HIBADH. Met1 is a thylakoid-associated tetratricopeptide protein which is highly conserved in photosynthetic eukaryotes and is a major player in the formation and repair of the photosystem II complex [53]. When white light intensity was fluctuated, two independent Met1 mutants showed reductions in growth, rosette diameter, biomass and PSII compared to wild type [53]. RPI3 catalyzes the reversible conversion of ribose-5-phosphate to ribulose 5-phosphate in the non-oxidative phase of the pentose phosphate pathway and in photosynthesis [54]. Map based cloning identified a point mutation in a ribose 5-phosphate isomerase (RPI) gene that causes reduced cellulose synthesis, radial swelling and reduced growth of roots [55]. An RPI2 knockout mutant showed abnormalities in chloroplast structure and function, reduced starch in leaves, delayed flowering and untimely cell death [56]. BPG2 is a phytochrome-regulated gene which encodes a protein required for normal chloroplast biogenesis and the greening process [57]. Mutation in this gene can curtail the brassinazole-induced accumulation of chloroplast proteins, carotenoid pigmentation in the plastids, and expression of rbcL and psbA, and can result in inefficient photosystem II and altered photosystem I function [57, 58]. HIBADH encodes a mitochondrial enzyme which catalyzes the reversible oxidation of 3-hydroxyisobutyrate to methylmalonate semialdehyde in the presence of NAD+ [59]. It is also involved in the degradation of branched-chain amino acids. Knockdown of this gene reduced degradation of valine and isoleucine and affected root growth in the presence of valine and isoleucine [60]. It seems apparent that these subgenomic biases are instrumental in shaping the phenotypes and future plasticity of E. coracana, no matter the driving mechanism.

The interaction of subgenomes is the genesis of novel polyploid phenotypes [61], and in this context homeolog expression bias creates beneficial transgressive phenotypes, as reviewed by Chen et al. [62]. We identified a homeolog expression bias for several genes in the circadian rhythm pathway and identified multiple cases of homeolog preference shift under drought conditions, in concert with similar global patterns reported in Eragrostis tef [21]. This shift in expression due to drought may be a transgressive phenotype. Analysis of drought and control expression sets suggests that several genes are undergoing subfunctionalization with respect to drought response. Notably, LHY shows a shift from A to B expression under drought conditions. Changes in LHY binding efficiency affect growth and starch accumulation within plants, with less efficient LHY binding producing an increased rate of growth and starch production [40, 41]. ELF3 regulates the increase in blue light response linked to stomatal control, and CRY1 and ZTL are involved in regulating stomatal conductance [38, 39, 63]. Since stomata are more likely to be closed during drought conditions [64] and growth rates depressed [65], this suggests that the A subgenome is driving fast growth through high starch accumulation and, likely, higher rates of stomatal conductance. By contrast, B subgenome expression during drought decreases starch storage, perhaps due to higher binding efficiency of trans-regulatory interactions, and longer periods of stomatal closure are facilitated in part by the expression of B subgenomic versions of critical circadian rhythm genes. We propose that the A subgenome accelerates growth under optimal conditions, while the B subgenome decelerates growth under drought conditions.

Analysis of A and B homeolog bias across our pooled transcriptome set suggests that LHY expression is consistently A under control conditions regardless of sample time, variety, or place, while the other mechanisms, stomatal conductance and blue light sensitivity, exhibit variation. This pattern could be explained by drought response arising independently across cultivars, or by incomplete and inconsistent sampling, since we are comparing results from three different experiments, presumably sampled under different light conditions at different times of day. The underlying variability in the pooled transcriptomic data makes it easier to detect genes exhibiting constitutive subgenomic expression bias, but harder to find nuanced patterns. We found constitutive homeolog bias for 5 genes (A: PHYA, COP1, PRR5, CK2; B: CHE). Of these, only one case, CHE, appears to be linked directly to subgenomic bias via gene loss, while the others may be attributable to differences in either methylation or trans-acting regulators. More work is needed to make this determination.

In conclusion, we were able to characterize more than 80% of the Eleusine coracana genome and show that there is discernible homeolog expression bias within our selected pathway. These patterns are in line with the biased fractionation observed in cytonuclear interacting genes as well as genome wide biased fractionation, which is correlated with the repetitive element content of both subgenomes, with the TE rich B subgenome exhibiting more frequent gene loss than the A subgenome. Examination of homeolog expression bias in the circadian rhythm pathway leads us to hypothesize that interactions between the A and B genomes create transgressive patterns of expression that expand phenotypic plasticity based on complementary responses to environmental stress, with the A subgenome driving a more aggressive response while the B subgenome drives a more conservative response.

The Eleusine indica genome (NCBI Accession: QEPD01000000) [66] was compared to the Eleusine coracana genome (DDBJ DRA Accession: DRA005897) [12] using SynMap in CoGe [67, 68], with quota align [69] set to limit 2 E. coracana to 1 E. indica and the minimum number of aligned gene pairs set to 5. Ka/Ks values were calculated with the CoGe version of codeml [70], with a maximum value of 3 and a minimum value of 0. A vs B genome calls were made from raw downloaded CoGe output using dc_ks_bagofgenes.py (https://github.com/NDHall/coge_tools/blob/master/dc_tools/dc_ks_bagofgenes.py last accessed: 3/31/2020). This approach first categorized E. coracana genes by the identity of their matching syntenic E. indica gene. Only Eleusine indica genes with exactly 2 syntenic E. coracana genes were used. For these genes, the copy with the highest sequence similarity to E. indica, which is known to be the maternal genome donor [11, 12], is designated A and the other is designated B. The called E. coracana genes were then categorized by syntenic block using ab_call.py (https://github.com/NDHall/coge_tools/blob/master/dc_tools/ab_call.py last accessed: 3/31/2020) (p-value set as default and cutoff value of 5). A vs B calls per block are compared with chi-square analysis. If the p-value is less than 0.05, the block is called as either A or B, depending on the dominant gene call. Syntenic blocks are then categorized by scaffold, and scaffolds are designated A, B or AB depending on region calls and assigned a level of confidence based on the presence or absence of uncalled syntenic blocks. Genes that had only 1 E. coracana hit to 1 E. indica hit were extracted as a list and manually searched against the E. coracana genome to confirm singleton status, and searched against other meso-allopolyploids, rice and peanut genomes to determine whether these genes frequently revert to one copy.
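The direct gene-to-gene call can be sketched in Python as follows. This is an illustration only and not the authors' dc_ks_bagofgenes.py. It assumes each syntenic pair comes with a single distance to the E. indica gene (for example a kS value, where smaller means more similar). The function name and the input layout are hypothetical.

```python
from collections import defaultdict


def call_triads(pairs):
    """Assign A/B to E. coracana genes that form a triad with one E. indica gene.

    pairs: iterable of (indica_gene, coracana_gene, distance) tuples, where
    distance is an assumed dissimilarity measure such as kS.
    Returns (calls, uncalled): a {coracana_gene: 'A' or 'B'} mapping and a
    list of E. indica genes whose two copies are equidistant.
    """
    by_indica = defaultdict(list)
    for indica, coracana, distance in pairs:
        by_indica[indica].append((distance, coracana))

    calls, uncalled = {}, []
    for indica, hits in by_indica.items():
        if len(hits) != 2:
            continue  # only genes with exactly two E. coracana partners are used
        (d_near, near), (d_far, far) = sorted(hits)
        if d_near == d_far:
            uncalled.append(indica)  # equidistant copies cannot be polarized
            continue
        calls[near] = "A"  # copy most similar to E. indica
        calls[far] = "B"
    return calls, uncalled


if __name__ == "__main__":
    example = [
        ("Ei_1", "Ec_10", 0.01), ("Ei_1", "Ec_11", 0.09),
        ("Ei_2", "Ec_20", 0.05), ("Ei_2", "Ec_21", 0.05),
        ("Ei_3", "Ec_30", 0.02),
    ]
    print(call_triads(example))
```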

Transcriptome assembly for phylogenomics

RNA-Seq reads (Additional File 8) were downloaded from NCBI, converted to fastq format with fastq-dump v2.8.2 from Sratoolkit v2.8.2-1 (https://trace.ncbi.nlm.nih.gov/Traces/sra/sra.cgi?view=software last accessed: 3/31/2020), and cleaned using fastp v0.19.4 [71] with default settings. Cleaned reads from diploid species were assembled using Trinity v2.4.0 [72] (--max_memory 102, --CPU 40, --trimmomatic, --fullcleanup, --verbose). To produce the A and B sequences for tetraploid E. africana, its reads were mapped to the E. coracana reference with Tophat v2.1.1 [73] employing Bowtie 2 v2.2.9 [74] with default flags. Bedtools intersect v2.27.1 [75] was used to extract A and B regions identified by the bag of genes approach for targeted assembly carried out by Trinity (--genome_guided_bam, --genome_guided_max_intron 10000, --max_memory 102G, --CPU 40). To test whether this process was artificially creating a B genome, E. indica transcriptomic reads were also assembled to the reference and split into A and B. Protein sequences were predicted using TransDecoder v3.0.1 [76], with Transdecoder.LongOrfs and Transdecoder.Predict run with default flags. Predicted proteins were condensed into a set of unigenes using cd-hit v4.7 [77] with default flags.

Genome guided phylogenomics

Genome guided orthology [50] was implemented with ggOrtho (https://github.com/NDHall/ggOrtho/tree/master/gg_ortho last accessed: 3/31/2020). A set of reference genes was extracted by a comparison between Eleusine indica and Oropetium thomaeum [14] using ggGetSet.py. A vs B genes were classified using region calls per gene and added separately as either A or B. Transcriptomes were matched to each gene as per Washburn et al. [50] using ggGoAdd.py. Multifasta files were then filtered with ggSelectAlns.py to exclude any file that contained more than 1 sequence per species and exclude any file lacking more than 1. Alignments were made with the codon aware MACSE v2 [78] run with default flags and trimmed with Gblocks v0.91b in default mode; sequences were organized using fasta_ghost.py (https://github.com/NDHall/pysam_tools/tree/master/fasta_ghost last accessed: 3/31/2020) and concatenated using FASconCat v1.0 [79]. Concatenated sequences were partitioned by gene using PartitionFinder v2.0 [80]. RAxML v8.2.9 [81] was run with output from PartitionFinder v2.0, using the GTR Gamma model and 1000 bootstraps.

kS calculations

Modified DagChainer [82] files were downloaded from CoGe [67, 68] for Eleusine coracana vs E. coracana, E. coracana vs Eleusine indica, and E. coracana vs Oropetium thomaeum. Gene relationships were extracted using dc2multiFasta.py (https://github.com/NDHall/ggOrtho/blob/master/util_scripts/dc2multiFasta.py last accessed: 3/31/2020). Sequences were aligned with MACSE and cleaned with Gblocks using default settings. kS values were calculated using codeml from PAML and annotated as A or B using unambiguous calls (Additional File 9,10,11). PAML values were matched with KEGG pathway annotation and updated using a total unambiguous filtered list of A and B calls (labelAvsBKsKa.py).

KEGG pathway expression

To obtain expression values, SRA datasets (DRR095904, DRR095905, DRR095906, DRR095907, DRR095908, DRR095909, ERR2040786, ERR2040787, ERR2040788, ERR2040789, SRR4021829, SRR4021830, SRR5341138, SRR5341139, SRR5341140, SRR5341141, SRR5341142, SRR5341143, SRR5341144, SRR5341145, SRR5341146, SRR5341147, SRR5341148) were downloaded, cleaned with fastp using default settings, and mapped against a custom masked E. coracana genome in which Super-Scaffold_7:7120824-13553037 and Super-Scaffold_152:5200000-9256248 were masked using bedtools. This step was necessary because we found that these scaffolds contained large identical repeats that interfered with our ability to detect expression levels for key genes in the circadian rhythm pathway. Because tophat2 assigns low mapping quality scores in the case of equally likely mapping positions, initial expression counts speciously suggested that there was no expression in these regions. SRAs were mapped to the custom masked genome using tophat2 with the --transcriptome-only flag and counted using htseq with the following options (-r pos, --mode intersection-nonempty, and --nonunique all).

We annotated transcripts with KEGG pathway IDs using the best blastx hit from Setaria italica that covered at least 30% of the query as a single hit with an e-value less than 1e-5. KEGG annotations were retrieved iteratively using custom Python scripts, with the final script employed to add manual annotations to complete the Circadian Rhythm Pathway (abCallToKegg.py, abCallToKeggAppend.py, abCallToKeggAppendPaintedCalls.py and handAnnontationCircadianR.py). These manual annotations were required because of the rigorous cutoff values applied to the initial blastx hits. KEGG pathway counts were extracted and summed using another custom Python script (KeggPathWayExtraction.py).
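The hit filter described above can be sketched in Python as follows. This is an illustration, not one of the authors' scripts. It assumes standard 12-column tabular BLAST output, a separately supplied dictionary of query lengths, and that "best" means the highest bit score among the hits that pass; all three are assumptions.

```python
def best_hits(blast_lines, query_lengths, min_cov=0.30, max_evalue=1e-5):
    """Return {query_id: subject_id} for the best passing hit of each query.

    Columns assumed: qseqid sseqid pident length mismatch gapopen
    qstart qend sstart send evalue bitscore.
    """
    best = {}
    for line in blast_lines:
        if not line.strip() or line.startswith("#"):
            continue
        f = line.rstrip("\n").split("\t")
        qid, sid = f[0], f[1]
        qstart, qend = int(f[6]), int(f[7])
        evalue, bitscore = float(f[10]), float(f[11])
        # blastx reports reverse-frame hits with qstart > qend, hence abs().
        coverage = (abs(qend - qstart) + 1) / query_lengths[qid]
        if coverage < min_cov or evalue >= max_evalue:
            continue
        if qid not in best or bitscore > best[qid][1]:
            best[qid] = (sid, bitscore)
    return {qid: sid for qid, (sid, _) in best.items()}
```

Coverage is computed per hit rather than summed across hits, matching the requirement that a single hit cover at least 30% of the query.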

Genome Painting

To extend A and B homeolog calls we employed an in silico genome painting approach. We began by selecting repetitive regions identified with RepeatMasker on a subset of the previously identified A contigs. We chose this approach so that we could test the concordance between A and B calls made with the painting method and those made with the syntenic method. These elements were extracted, labeled as A or B, and added to a common reference fasta to which all reads from NCBI SRA DRR095893 were mapped. Each mapped read and its pair were extracted and labeled as A or B. During this process, read pairs that were split between A and B repetitive elements were excluded. A and B reads were then mapped to the entire E. coracana genome, and bedtools was used to calculate A and B read coverage for a sliding window of 250,000 bp that advanced 2,000 bp per step. A custom Python script (abPainting.ipynb) was used to determine A vs B bed regions. Regions were designated A (Additional File 12), B (Additional File 13), low coverage, or ambiguous. Calls were then compared among all painted and syntenically called regions, and unambiguous regions were reported using bedtools. The resulting unambiguous A and B bed files were used to extract a list of A and B genes from the gtf file using bedtools intersect (Additional Files 14, 15). Call accuracy was confirmed on a set of test contigs excluded from the initial mapping.
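The window scheme and the four-way call can be sketched in Python as below. The window size and step come from the text; the coverage thresholds (min_total, min_ratio) are hypothetical placeholders, since the criteria used in abPainting.ipynb are not given here.

```python
def sliding_windows(contig_length, size=250_000, step=2_000):
    """Yield 0-based half-open (start, end) windows; the last one is clipped to the contig end."""
    start = 0
    while start < contig_length:
        end = min(start + size, contig_length)
        yield start, end
        if end == contig_length:
            break
        start += step


def call_window(a_cov, b_cov, min_total=10.0, min_ratio=2.0):
    """Classify one window from its A and B read coverage.

    min_total and min_ratio are illustrative assumptions, not the authors' values.
    """
    if a_cov + b_cov < min_total:
        return "low coverage"
    if a_cov >= min_ratio * b_cov:
        return "A"
    if b_cov >= min_ratio * a_cov:
        return "B"
    return "ambiguous"
```

Because windows overlap heavily (250,000 bp wide, 2,000 bp step), adjacent calls are strongly correlated, so merging consecutive windows with the same call is a natural next step before intersecting with gene annotations.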

Singleton analysis

Here we first identified 4,042 genes that are present in single copy in the current Eleusine coracana genome. These genes occur in Eleusine indica to Eleusine coracana diads, not the expected E. coracana to E. indica to E. coracana triads. We retrieved a list of genes from the Arabidopsis genome database TAIR 10 (https://www.arabidopsis.org) that are directly or indirectly involved in cytonuclear interaction. These cytonuclear genes were compared by BLAST (basic local alignment search tool) against the single-copy genes identified in the Eleusine coracana genome sequence with an e-value cutoff of e^-10. Single-copy genes that matched a cytonuclear gene were searched by BLAST again against the Eleusine coracana genome to reconfirm that no other copy was present, and only those with a single hit in the genome were selected to identify putative functional gene matches in the UniProt (Universal Protein Resource) database in NCBI (https://www.ncbi.nlm.nih.gov/ last accessed 3/31/2020) using blastx.

To establish a global pattern of genome bias we compared hardmasked Setaria italica (CoGe ID: 12241) and Eleusine coracana (CoGe ID: 52747) genomes using CoGe with a syntenic depth of 1 to 2. This comparison created triads when both A and B copies were present and diads when only the A or B copy was present. Syntenic linkages were parsed with basic command line tools and bedtools, and homeolog identities (A, B, or unassigned) were assigned using regions called by our in silico genome painting process (Additional File 16).
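The diad tally behind a singleton bias estimate can be sketched in Python as follows. The input layout, one (Setaria gene, Eleusine gene, call) tuple per syntenic link, is a hypothetical stand-in for the parsed CoGe output, and the function name is illustrative.

```python
from collections import Counter, defaultdict


def singleton_bias(links):
    """Return the fraction of diads (single retained copy) per subgenome call.

    links: iterable of (reference_gene, target_gene, call) tuples,
    where call is 'A', 'B', or 'unassigned'.
    """
    calls_by_ref = defaultdict(list)
    for ref_gene, _target_gene, call in links:
        calls_by_ref[ref_gene].append(call)
    counts = Counter()
    for calls in calls_by_ref.values():
        # One linked copy is a diad (singleton); two linked copies form a triad and are skipped.
        if len(calls) == 1:
            counts[calls[0]] += 1
    total = sum(counts.values())
    return {call: n / total for call, n in counts.items()} if total else {}
```

A fraction of A singletons well above the B fraction would correspond to more frequent loss of the B copy, which is the pattern reported as biased fractionation.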

Abbreviations

BLAST: basic local alignment search tool

kS: synonymous substitution rate

MYA: million years ago

TE: transposable element

WGD: whole genome duplication

Ethics approval and consent to participate

Not applicable

Consent for publication

The named authors provide consent to publish.

Competing interests

The authors declare no competing interests.

Funding

Not applicable

Authors' contributions

LRG conceived of synteny-based phasing. NDH wrote the custom programs used for analysis, functional annotation, differential expression, and phylogenetic analysis. JDP identified cytonuclear interacting singletons. All authors contributed to the writing and revision of this manuscript.

Acknowledgements

Not applicable

Data availability

The data sets analyzed during the current study are available in NCBI (https://www.ncbi.nlm.nih.gov/), DDBJ (https://www.ddbj.nig.ac.jp/index-e.html), and CoGe (https://genomevolution.org/coge/).

Shobana S, Krishnaswamy K, Sudha V, Malleshi NG, Anjana RM, Palaniappan L, et al. Finger millet (Ragi, Eleusine coracana L.): a review of its nutritional properties, processing, and plausible health benefits. Adv Food Nutr Res. 2013;69:1–39.
Goron TL, Raizada MN. Genetic diversity and genomic resources available for the small millet crops to accelerate a New Green Revolution. Front Plant Sci. 2015;6:157.
Werth CR, Hilu KW, Langner CA. Isozymes of Eleusine (Gramineae) and the origin of finger millet. Am J Bot. 1994;81:1186–97.
Hilu KW, Johnson JL. Systematics of Eleusine Gaertn. (Poaceae: Chloridoideae): Chloroplast DNA and Total Evidence. Ann Mo Bot Gard. 1997;84:841–7.
Bisht MS, Mukai Y. Genomic in situ hybridization identifies genome donor of finger millet (Eleusine coracana). Theor Appl Genet. 2001;102:825–32.
Liu Q, Triplett JK, Wen J, Peterson PM. Allotetraploid origin and divergence in Eleusine (Chloridoideae, Poaceae): evidence from low-copy nuclear gene phylogenies and a plastid gene chronogram. Ann Bot. 2011;108:1287–98.
Liu Q, Jiang B, Wen J, Peterson PM. Low-copy nuclear gene and McGISH resolves polyploid history of Eleusine coracana and morphological character evolution in Eleusine. Turk J Botany. 2014;38:1–12.
Hilu KW. Identification of the "A" genome of finger millet using chloroplast DNA. Genetics. 1988;118:163–7.
Neves SS, Swire-Clark G, Hilu KW, Baird WV. Phylogeny of Eleusine (Poaceae: Chloridoideae) based on nuclear ITS and plastid trnT–trnF sequences. Mol Phylogenet Evol. 2005;35:395–419.
Varshney RK, Ribaut J-M, Buckler ES, Tuberosa R, Rafalski JA, Langridge P. Can genomics boost productivity of orphan crops? Nat Biotechnol. 2012;30:1172–6.
Hittalmani S, Mahesh HB, Shirke MD, Biradar H, Uday G, Aruna YR, et al. Genome and Transcriptome sequence of Finger millet (Eleusine coracana (L.) Gaertn.) provides insights into drought tolerance and nutraceutical properties. BMC Genomics. 2017;18:465.
Hatakeyama M, Aluri S, Balachadran MT, Sivarajan SR, Patrignani A, Grüter S, et al. Multiple hybrid de novo genome assembly of finger millet, an orphan allotetraploid crop. DNA Res. 2017. doi:10.1093/dnares/dsx036.
Rahman H, Ramanathan V, Nallathambi J, Duraialagaraja S, Muthurajan R. Over-expression of a NAC 67 transcription factor from finger millet (Eleusine coracana L.) confers tolerance against salinity and drought stress in rice. BMC Biotechnol. 2016;16 Suppl 1:35.
VanBuren R, Bryant D, Edger PP, Tang H, Burgess D, Challabathula D, et al. Single-molecule sequencing of the desiccation-tolerant grass Oropetium thomaeum. Nature. 2015;527:508–11.
Hastie AR, Dong L, Smith A, Finklestein J, Lam ET, Huo N, et al. Rapid genome mapping in nanochannel arrays for highly complete and accurate de novo sequence assembly of the complex Aegilops tauschii genome. PLoS One. 2013;8:e55864.
Mifsud B, Tavares-Cadete F, Young AN, Sugar R, Schoenfelder S, Ferreira L, et al. Mapping long-range promoter contacts in human cells with high-resolution capture Hi-C. Nat Genet. 2015;47:598–606.
Chin C-S, Peluso P, Sedlazeck FJ, Nattestad M, Concepcion GT, Clum A, et al. Phased diploid genome assembly with single-molecule real-time sequencing. Nat Methods. 2016;13:1050–4.
Schnable JC, Springer NM, Freeling M. Differentiation of the maize subgenomes by genome dominance and both ancient and ongoing gene loss. Proc Natl Acad Sci U S A. 2011;108:4069–74.
McKain MR, Estep MC, Pasquet R, Layton DJ, Vela Díaz DM, Zhong J, et al. Ancestry of the two subgenomes of maize. bioRxiv. 2018:352351. doi:10.1101/352351.
Gordon SP, Levy JJ, Vogel JP. PolyCRACKER, a robust method for the unsupervised partitioning of polyploid subgenomes by signatures of repetitive DNA evolution. BMC Genomics. 2019;20:580.
VanBuren R, Wai CM, Pardo J, Yocca AE, Wang X, Wang H, et al. Exceptional subgenome stability and functional divergence in allotetraploid teff, the primary cereal crop in Ethiopia. bioRxiv. 2019:580720. doi:10.1101/580720.
Salmon A, Flagel L, Ying B, Udall JA, Wendel JF. Homoeologous nonreciprocal recombination in polyploid cotton. New Phytol. 2010;186:123–34.
Zhang T, Hu Y, Jiang W, Fang L, Guan X, Chen J, et al. Sequencing of allotetraploid cotton (Gossypium hirsutum L. acc. TM-1) provides a resource for fiber improvement. Nat Biotechnol. 2015;33:531–7.
Paape T, Briskine RV, Halstead-Nussloch G, Lischer HEL, Shimizu-Inatsugi R, Hatakeyama M, et al. Patterns of polymorphism and selection in the subgenomes of the allopolyploid Arabidopsis kamchatica. Nat Commun. 2018;9:3909.
Woodhouse MR, Cheng F, Pires JC, Lisch D, Freeling M, Wang X. Origin, inheritance, and gene regulatory consequences of genome dominance in polyploids. Proc Natl Acad Sci U S A. 2014;111:5283–8.
Flagel LE, Wendel JF. Evolutionary rate variation, genomic dominance and duplicate gene expression evolution during allotetraploid cotton speciation. New Phytologist. 2010;186:184–93. doi:10.1111/j.1469-8137.2009.03107.x.
Bottani S, Zabet NR, Wendel JF, Veitia RA. Gene Expression Dominance in Allopolyploids: Hypotheses and Models. Trends Plant Sci. 2018;23:393–402.
Li Q, Qiao X, Yin H, Zhou Y, Dong H, Qi K, et al. Unbiased subgenome evolution following a recent whole-genome duplication in pear (Pyrus bretschneideri Rehd.). Hortic Res. 2019;6:34.
Stephens SG. Possible Significance of Duplication in Evolution. In: Demerec M, editor. Advances in Genetics. Academic Press; 1951. p. 247–65.
Ohno S. Gene duplication. In: Evolution by Gene Duplication. New York: Springer-Verlag; 1970. p. 59–65.
Lynch M, Force A. The probability of duplicate gene preservation by subfunctionalization. Genetics. 2000;154:459–73.
Prince VE, Pickett FB. Splitting pairs: the diverging fates of duplicated genes. Nat Rev Genet. 2002;3:827–37.
Sharbrough J, Conover JL, Tate JA, Wendel JF, Sloan DB. Cytonuclear responses to genome doubling. Am J Bot. 2017;104:1277–80.
Oberprieler C, Talianova M, Griesenbeck J. Effects of polyploidy on the coordination of gene expression between organellar and nuclear genomes in Leucanthemum Mill. (Compositae, Anthemideae). Ecol Evol. 2019;9:9100–10.
Wolf JB. Cytonuclear interactions can favor the evolution of genomic imprinting. Evolution. 2009;63:1364–71.
Soltis PS, Marchant DB, Van de Peer Y, Soltis DE. Polyploidy and genome evolution in plants. Curr Opin Genet Dev. 2015;35:119–25.
Hotta CT, Gardner MJ, Hubbard KE, Baek SJ, Dalchau N, Suhita D, et al. Modulation of environmental responses of plants by circadian clocks. Plant Cell Environ. 2007;30:333–49.
Más P, Kim W-Y, Somers DE, Kay SA. Targeted degradation of TOC1 by ZTL modulates circadian function in Arabidopsis thaliana. Nature. 2003;426:567–70.
Mao J, Zhang Y-C, Sang Y, Li Q-H, Yang H-Q. A role for Arabidopsis cryptochromes and COP1 in the regulation of stomatal opening. Proc Natl Acad Sci U S A. 2005;102:12270–5.
Ni Z, Kim E-D, Ha M, Lackey E, Liu J, Zhang Y, et al. Altered circadian rhythms regulate growth vigour in hybrids and allopolyploids. Nature. 2009;457:327–31.
Stitt M, Zeeman SC. Starch turnover: pathways, regulation and role in growth. Curr Opin Plant Biol. 2012;15:282–92.
Gaut BS, Morton BR, McCaig BC, Clegg MT. Substitution rate comparisons between grasses and palms: synonymous rate differences at the nuclear gene Adh parallel rate differences at the plastid gene rbcL. Proc Natl Acad Sci U S A. 1996;93:10274–9.
Kanehisa M, Araki M, Goto S, Hattori M, Hirakawa M, Itoh M, et al. KEGG for linking genomes to life and the environment. Nucleic Acids Res. 2008;36 Database issue:D480–4.
Neves SS, Swire-Clark G, Hilu KW, Baird WV. Phylogeny of Eleusine (Poaceae: Chloridoideae) based on nuclear ITS and plastid trnT-trnF sequences. Mol Phylogenet Evol. 2005;35:395–419.
Bisht MS, Mukai Y. Identification of Genome Donors to the Wild Species of Finger Millet, Eleusine africana by Genomic in situ Hybridization. Breed Sci. 2001;51:263–9.
Zhang H, Hall N, Scott McElroy J, Lowe EK, Goertzen LR. Complete plastid genome sequence of goosegrass (Eleusine indica) and comparison with other Poaceae. Gene. 2016. doi:10.1016/j.gene.2016.11.038.
Jiao Y, Li J, Tang H, Paterson AH. Integrated syntenic and phylogenomic analyses reveal an ancient genome duplication in monocots. Plant Cell. 2014;26:2792–802.
Soltis DE, Albert VA, Leebens-Mack J, Bell CD, Paterson AH, Zheng C, et al. Polyploidy and angiosperm diversification. Am J Bot. 2009;96:336–48.
Paterson AH, Bowers JE, Chapman BA. Ancient polyploidization predating divergence of the cereals, and its consequences for comparative genomics. Proc Natl Acad Sci U S A. 2004;101:9903–8.
Washburn JD, Schnable JC, Conant GC, Brutnell TP, Shao Y, Zhang Y, et al. Genome-Guided Phylo-Transcriptomic Methods and the Nuclear Phylogenetic Tree of the Paniceae Grasses. Sci Rep. 2017;7:13528.
Hilu KW, de Wet JMJ. Domestication of Eleusine coracana. Econ Bot. 1976;30:199–208.
Phillips SM. A survey of the genus Eleusine Gaertn. (Gramineae) in Africa. Kew Bull. 1972;27:251–70.
Bhuiyan NH, Friso G, Poliakov A, Ponnala L, van Wijk KJ. MET1 Is a Thylakoid-Associated TPR Protein Involved in Photosystem II Supercomplex Formation and Repair in Arabidopsis. The Plant Cell. 2015;27:262–85. doi:10.1105/tpc.114.132787.
Ishikawa K, Matsui I, Payan F, Cambillau C, Ishida H, Kawarabayasi Y, et al. A hyperthermostable D-ribose-5-phosphate isomerase from Pyrococcus horikoshii characterization and three-dimensional structure. Structure. 2002;10:877–86.
Howles PA, Birch RJ, Collings DA, Gebbie LK, Hurley UA, Hocart CH, et al. A mutation in an Arabidopsis ribose 5-phosphate isomerase reduces cellulose synthesis and is rescued by exogenous uridine. Plant J. 2006;48:606–18.
Xiong Y, DeFraia C, Williams D, Zhang X, Mou Z. Deficiency in a cytosolic ribose-5-phosphate isomerase causes chloroplast dysfunction, late flowering and premature cell death in Arabidopsis. Physiol Plant. 2009;137:249–63.
Komatsu T, Kawaide H, Saito C, Yamagami A, Shimada S, Nakazawa M, et al. The chloroplast protein BPG2 functions in brassinosteroid-mediated post-transcriptional accumulation of chloroplast rRNA. Plant J. 2010;61:409–22.
Kim B-H, Malec P, Waloszek A, von Arnim AG. Arabidopsis BPG2: a phytochrome-regulated gene whose protein product binds to plastid ribosomal RNAs. Planta. 2012;236:677–90.
Hawes JW, Crabb DW, Chan RM, Rougraff PM, Harris RA. Chemical modification and site-directed mutagenesis studies of rat 3-hydroxyisobutyrate dehydrogenase. Biochemistry. 1995;34:4231–7.
Schertl P, Danne L, Braun H-P. 3-Hydroxyisobutyrate Dehydrogenase Is Involved in Both, Valine and Isoleucine Degradation in Arabidopsis thaliana. Plant Physiology. 2017;175:51–61. doi:10.1104/pp.17.00649.
Paterson AH, Wendel JF, Gundlach H, Guo H, Jenkins J, Jin D, et al. Repeated polyploidization of Gossypium genomes and the evolution of spinnable cotton fibres. Nature. 2012;492:423–7.
Chen ZJ. Genetic and epigenetic mechanisms for gene expression and phenotypic variation in plant polyploids. Annu Rev Plant Biol. 2007;58:377–406.
Yu J-W, Rubio V, Lee N-Y, Bai S, Lee S-Y, Kim S-S, et al. COP1 and ELF3 control circadian function and photoperiodic flowering by regulating GI stability. Mol Cell. 2008;32:617–30.
Sionit N, Patterson DT, Coffin RD, Mortenson DA. Water relations and growth of the weed, goosegrass (Eleusine indica), under drought stress. Field Crops Res. 1987;17:163–73.
Park Y-M. Effects of Drought on Two Grass Species with Different Distribution Around Coastal Sand-Dunes. Funct Ecol. 1990;4:735–41.
Zhang H, Hall N, Goertzen LR, Bi B, Chen CY, Peatman E, et al. Development of a goosegrass (Eleusine indica) draft genome and application to weed science research. Pest Manag Sci. 2019. doi:10.1002/ps.5389.
Lyons E, Pedersen B, Kane J, Alam M, Ming R, Tang H, et al. Finding and comparing syntenic regions among Arabidopsis and the outgroups papaya, poplar, and grape: CoGe with rosids. Plant Physiol. 2008;148:1772–81.
Lyons E, Freeling M. How to usefully compare homologous plant genes and chromosomes as DNA sequences. Plant J. 2008;53:661–73.
Tang H, Lyons E, Pedersen B, Schnable JC, Paterson AH, Freeling M. Screening synteny blocks in pairwise genome comparisons through integer programming. BMC Bioinformatics. 2011;12:102.
Yang Z. PAML 4: phylogenetic analysis by maximum likelihood. Mol Biol Evol. 2007;24:1586–91.
Chen S, Zhou Y, Chen Y, Gu J. fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics. 2018;34:i884–90.
Grabherr MG, Haas BJ, Yassour M, Levin JZ, Thompson DA, Amit I, et al. Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nat Biotechnol. 2011;29:644–52.
Kim D, Pertea G, Trapnell C, Pimentel H, Kelley R, Salzberg SL. TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biol. 2013;14:R36.
Langmead B, Salzberg SL. Fast gapped-read alignment with Bowtie 2. Nat Methods. 2012;9:357–9.
Quinlan AR. BEDTools: The Swiss-Army Tool for Genome Feature Analysis. Curr Protoc Bioinformatics. 2014;47:11.12.1–34.
Haas BJ, Papanicolaou A. TransDecoder (find coding regions within transcripts). 2016.
Chen S, McElroy JS, Dane F, Goertzen LR. Transcriptome Assembly and Comparison of an Allotetraploid Weed Species, Annual Bluegrass, with its Two Diploid Progenitor Species, Schrad and Kunth. Plant Genome. 2016;9. doi:10.3835/plantgenome2015.06.0050.
Ranwez V, Douzery EJP, Cambon C, Chantret N, Delsuc F. MACSE v2: Toolkit for the Alignment of Coding Sequences Accounting for Frameshifts and Stop Codons. Mol Biol Evol. 2018;35:2582–4.
Kück P, Meusemann K. FASconCAT: Convenient handling of data matrices. Mol Phylogenet Evol. 2010;56:1115–8.
Lanfear R, Frandsen PB, Wright AM, Senfeld T, Calcott B. PartitionFinder 2: New Methods for Selecting Partitioned Models of Evolution for Molecular and Morphological Phylogenetic Analyses. Mol Biol Evol. 2016. doi:10.1093/molbev/msw260.
Stamatakis A. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics. 2014;30:1312–3.
Haas BJ, Delcher AL, Wortman JR, Salzberg SL. DAGchainer: a tool for mining segmental genome duplications and synteny. Bioinformatics. 2004;20:3643–6.
Rambaut A. FigTree. Tree figure drawing tool version 1.3.1. Institute of Evolutionary Biology, University of Edinburgh. 2009.

AdditionalFiles.zip


Editorial decision: Major revision
13 Sep, 2020
Review #4 received at journal
08 Sep, 2020
Review #3 received at journal
06 Sep, 2020
Reviewer #3 agreed at journal
21 Aug, 2020
Reviewer #4 agreed at journal
21 Aug, 2020
Review #2 received at journal
23 Jul, 2020
Review #1 received at journal
17 Jul, 2020
Reviewer #2 agreed at journal
25 Jun, 2020
Reviewer #1 agreed at journal
27 Apr, 2020
Reviewers invited by journal
26 Apr, 2020
Editor assigned by journal
03 Apr, 2020
First submitted to journal
02 Apr, 2020
Submission checks completed at journal
02 Apr, 2020
Editor invited by journal
02 Apr, 2020
