Revelation of candidate genes and molecular mechanism of reproductive seasonality in carp sh (Labeo rohita Ham) by RNA sequencing

Background Carp sh, rohu (Labeo rohita Ham) is important freshwater aquaculture species of South-East Asia having seasonal reproductive rhythm. There is no holistic study at transcriptome level revealing key candidate genes involved in such circannual rhythm regulated by biological clock genes (BCGs). Seasonality manifestation has two contrasting phases of reproduction, i.e., post-spawning regression and initiation of gonadal activity appropriate for discovery of associated candidate genes. It can be deciphered by RNA sequencing of tissues involved in BPGL (Brain-Pituitary-Gonad-Liver) axis controlling seasonality. How far such BCGs of this sh are evolutionarily conserved across different phyla is unknown. Such study can be of further use to enhance sh productivity as seasonality restricts seed production beyond monsoon season. A total of ~150 Gb of transcriptomic data of four tissues viz., BPGL were generated using Illumina TruSeq. De-novo assembled BPGL tissues revealed 75554 differentially expressed transcripts, 115534 SSRs, 65584 SNPs, 514 pathways, 5379 transcription factors, 187 mature miRNA which regulates candidate genes represented by 1576 differentially expressed transcripts which are available in the form of web-genomic resources. Findings were validated by qPCR. This is rst report in carp sh having 32 BCGs found widely conserved in sh, amphibian, reptile, birds, prototheria, marsupials and placental mammals. This is due to universal mechanism of rhythmicity in response to environment and earth rotation having adaptive and reproductive signicance. This study elucidates evolutionary conserved mechanism of photo-periodism sensing, neuroendocrine secretion, metabolism and yolk synthesis in liver, gonadal maturation, muscular growth with sensory and auditory perception in this sh. Study reveals sh as a good model for research on biological clock besides its relevance in reproductive eciency enhancement.

in process of speciation thus comparison of such genes using whole genome sequence data of each phyla can throw light on evolutionary signi cance in terms of adaptation [8].
Present work aims at 1) Discovery of candidate genes of seasonal biological rhythm of breeding and gene regulatory networks. 2) Investigation of differentially expressed biological clock genes (BCGs). 3) Finding the extent of evolutionary conservation of these BCGs across jawless sh, cartilaginous sh (whale shark), bony shes (zebra sh and European carp), amphibia, bird, reptile, monotreme (egg laying mammal), marsupials (young born) and placental mammals using comparative analysis of whole genome sequences of each phyla. 4) Developing a genomic resource database of evolutionary conserved BCGs of each phyla along with seasonality associated candidate genes of carp sh, Labeo rohita (rohu).

Results And Discussion
Reproductive phase identi cation by estimation of GSI and histology Histological analysis of ovarian tissue of PSR and IGA phases revealed the changes in response to seasonality. Primary ovarian follicles (POF), smaller in size were present during PSR dormancy (lower GSI). During IGA phase having higher GSI maturation of secondary ovarian follicles (SOF) with larger size was seen in the histological photomicrographs, revealing conspicuous seasonal differences ( Figure 1). It con rmed the accuracy of tissues collected in the investigation. Similar histological attributes of ovarian development stages has also been reported in other shes like zebra sh (Koç 2008).
Like other Indian teleosts, the reproductive cycle of L. rohita may be divided into four stages, viz., preparatory period (February-April), pre-spawning period (May-June), spawning period (July-August), and post-spawning period (September-January) and at each stage gonads show discrete changes [9].
Generally in an adult rohu female, the ovary remains quiescent and almost inactive during last part of post spawning period also called post spawning regression (PSR). With gradual increase in day length and temperature, there is increase of gonadal activity (IGA) for preparatory period. This con rmed that these two reproductive contrasting phases, viz., PSR and IGA as most appropriate transcriptomic data points to study the changes that happens at the molecular level during onset of gonad maturation in this tropical species manifesting biological rhythm of seasonal breeding.
Pre-processing and assembly A total of ~150 Gb of transcriptomic data of four tissues viz., BPGL were pre-processed before assembly by removing 1098340 poor quality reads. The remaining 305461765 high quality reads were subjected to de novo assembly. A total of 440665 transcripts were generated with k-mer size 25. The minimum contig length was 201 bp and maximum of 19331 bp with average length of 867 bp. N50 was found to be 1850 bp and total GC content as 42.94% (Table 1). Figure 2 shows the sequence length distribution of the data under study. A total of 45019 unigenes were involved in the formation of isoforms. Numbers of isoforms were found in single unigenes ranging from 2 to 30.  Table 2. Expression pro ling and hierarchical clustering of DEGs of all tissue speci c comparison were shown in the form of heatmaps. MA plots and volcano plots of differentially expressed genes were generated by edgeR tool which showed the red color dots are differentially expressed and black color dot represents the non-DEGs and all these plots were generated on the basis of log2 fold change values (Supplementary le 1).
Venn diagram was constructed to identify the unique and common unigenes among all tissue speci c differentially expressed genes ( Figure 3). Venn diagram was constructed to identify the unique and common unigenes among all tissue speci c candidate genes. A total of 705 unigenes were found common in all datasets of tissue speci c DEGs. We found 20845, 9388 Figure 4). Some of the TFs are known to regulate biological clock genes in ies and humans by acting as time switch genes [12]. We found some of these are well known to control BCGs and are also conserved even in this carp sh. Example of such TFs are clock, foxl, bmal, kiss, esr, prdm, ofx, myc, srebf, bhlh and PAS. These TF and/or BCGs regulates gene regulatory network (GRN) of BPGL axis manifesting circannual rhythm of breeding.
Prediction of miRNA controlling seasonality genes Though 1637 microRNA have been identi ed and reported in mirRBase in nine sh species and most of them are widely conserved but still experimental identi cation of miRNAs in speci c shes are too less. In case of rohu there is no micro RNA reported so far. This is due to di culty in isolation by cloning due to low expression, temporal expression and stability, tissue. Thus, computational identi cation of miRNAs binding site can be predicted and classi cation of the functional attributes of these miRNA can be done. Instead of isolating small RNA afresh, the same larger RNA can be used to predict speci c binding site to have information without any additional cost and time. It has been well reported that molecular markers for egg quality and embryonic development is a potential biomarker in rainbow trout [13]. Moreover, miRNAs are also reported to be involved in oocyte and early embryo [14,15,16] and gonadal development [17].
We identi ed miRNA target prediction of all tissue speci c differential expressed genes with 1637 mature miRNAs of nine sh species with cut-off threshold of total score at 1000 and total energy > -14 kcal/mol. We obtained 790, 641, 764 and 629 mature miRNAs that regulate 1444, 770, 1229 and 1044 differential expressed genes at tissue speci c stages IGA  [19]. SNPs of genes involved in GRN affects phenotype/trait [20] thus are valuable for eQTL discovery.
Since prediction of the behaviour of GRN are rapid and economical than wet lab experiments, thus computational methods can be a valuable research tool [21]. GRN studies has been used to identify candidate genes associated with the traits [22]. Gene regulatory networks (GRN) is further controlled by transcription factor and microRNA [23].
The GRN models of brain, pituitary, gonad and liver tissues were represented by 21, 35, 46 and 25 genes, respectively. Some of them were predicted as hub genes having similar protein-protein interaction. Magnitude of such genes expression can be correlated with trait of interest [24]. As not much information is available on seasonality aspect of these tissues, thus limited sample size transcriptomic data was used to develop "logical model" to have a basic picture of GRN re ecting qualitative attributes of genes involved in the network rather than precise quantitative relationship [25]. BPGL/BPG axis has already been reported with several candidate genes controlling reproduction and its synchronization with food availability and photoperiodism in farm shes [26].
GRN in brain controlling seasonality GRN in brain controlling seasonality was constructed using top 100 co-expressed genes to depict proteinprotein interactive network. Genes were selected at a cut off degree value of 167. A total of 20 hub genes were predicted having 8 up, 13 down regulated genes shown in Supplementary le 10. Since sh neuroendocrine system and brain is having very high similarity (82%) with human brain [27], thus available literature can be used to understand and explain the GRN operating in brain manifesting seasonal breeding. GRN was found to control sensing of time by visual stimuli, neuro modulation, neuro translation, neurotransmitter synthesis and release, integration of light and temperature regulated circadian gene expression, melanocytes and growth and learning regulation of feeding behaviour.
Speci c role of predicted hub genes are summarized in table 6.
Seasonality is synchronized by temperate latitudes, photoperiod, abundance of food, temperature and other environmental cues which are input to neuroendocrine output. Brain perceives the signal of photoperiod by melatonin pathway. Eyes are the only photoreceptive organs operating through pineal gland producing melatonin. Melatonin receptors are present in suprachiasmatic nuclei (SCN) which are the transducer of such signal acting as circadian rhythm pacemaker. Saccus vasculosus (SV) acts as a sensor of such seasonal changes in day length in sh as its removal affects the BPGL axis [28]. Photoperiod is sensed by SCN and transduced to give signal to thyroid for seasonal physiological events thus thyroid-stimulating hormone (TSH) and DIO gene also play role in seasonality [29].
There are two steps in cyclical phase of seasonality viz., PSR and IGA with slow and faster growth, respectively which is controlled by BPGL axis in rohu sh [30]. Such biological rhythm is even connected with sh auditory function for environmental cue required for reproduction and breeding [31].

GRN in pituitary controlling seasonality
In order to depict protein-protein interactive network in pituitary GRN was constructed using co-expressed genes in the tissue by selecting the genes at a cut off degree of 175. A total of 35 hub genes were regulating the network with 15 and 20 up and down regulated genes, respectively (Supplementary le 10). MAPK signalling pathway was found to control environmental signal mediation, gonadogenesis, retenoic acid pathway controlling gametogenesis, ovarian growth and gametogenesis, increase in cell size and proliferation, oocyte maturation and ovulation, control of BPGL axis for liver metabolism, skeletal muscle development and immunity control. The speci c role of hub genes in controlling seasonality has been summarized in  10). Hub gene activities were found to control mitogenic action of estrogen in grandulosa cell of ovary, growth of ovarian follicles, oocyte maturation, chondrocyte differentiation, prostaglandin synthesis, collagen synthesis and receptor mediated endocytosis. This re ects that neuroendocrine mechanism regulating reproduction through hypothalamo-pituitary-gonadal (HPG) axis is evolutionarily conserved in vertebrates [38]. All these neuroendocrine induced physiological events are coordinated by set of regulatory genes which are depicted in Supplementary le 10 and the role of various candidate genes/ hub genes identi ed in rohu sh ovary are shown in Table 8. Seasonal breeding sh, Labeo rohita has seasonal variation in ovarian development having seven distinct stages (phases) viz., Virgin/Immature, primary growth, perinucleolar, previtellogenic (yolk vesicle), vitellogenic (postvitellogenic), germinal vesicle break down and spawning stage [39]. In ovary, there is coordinated activities for various regulators of ovulation like proteases, protease inhibitors, progestational steroids, eicosanoids, catecholamines and vasoactive peptides [40.
Thus, we can see that neuroendocrine control of BPGL axis and its associated gene regulatory network operating in brain, pituitary, gonad and liver are well coordinated to manifest seasonality in sh breeding by sensing photoperiod using biological clock genes along with temperature and food abundance [3]. We observed LH controlled DEG (poliovirus receptor and Bloodthirsty) involved in gonad development and maturation associated genes (ehmt2 and racgap1) which is also reported in rainbow trout. FSH modulates steroidogenic pathway as well as early germ cell proliferation and differentiation. We found four major pathways and their genes which are differentially expressed in BPGL axis viz., IGF pathway (insulin gene enhancer protein, insulin receptor, insulin-induced gene, precursor of insulin and its receptor), the TGF pathway (amh, inha and fstl3), the Wnt pathway (wisp1), and pleiotrophin (mdka) which are also reported in rainbow trout [3]. Insulin gene enhancer protein isl with its 3 isoforms families (Isl-1,2,3), observed in our dataset which is reported to play role in regulation of IGF pathway [41]. SV organ acts as a sensor of seasonality by sensing the day length/photoperiod by rhodopsin molecule. Thus, rhodopsin gene family must be one of the major hub in brain tissue acting as a transducer for neuroendocrine output in form of GnRH [28].
GRN in Liver controlling seasonality GRN in liver controlling seasonality was constructed using top 100 co-expressed genes to depict proteinprotein interactive network. Genes were selected at a cut off degree 165. A total of 25 hub genes were predicted having 9 up, 16 down regulated genes shown in Supplementary le 10. JAK/STAT signaling pathway was found to control glycolysis, gluconeogenesis, oxidative phosphorhylation, apolipo protein production, MHC response, retenoic acid, betain and GABA, immune response, cholesterol and lipid transport (Table 9). Like all oviparous vertebrates, in sh also, egg-yolk and chorionic proteins like vitellogenin and choriogenin are synthesized heterologously in the liver [42]. Fish liver is also involved in synthesis, degradation, transportation, and storage of lipid. These major activities are well coordinated by BPGL axis along with oocyte growth and development in tune of seasonality. The axis synchronises the food availability and ingestion behaviour along with photoperiod, temperature and gonadal maturation leading to seasonal reproduction in temperate shes [43]. In vitellogenesis period, there is increase in GSI and HSI due to increase in total protein, glycogen and cholesterol content of the ovary and liver and stored fat reserve is used for gonadal maturation [44].
We also found upregulation of Wnt and Notch signaling genes in all the four tissues of BPGL axis having multifarious role including higher perceptivity of auditory system in breeding season. This is in response to estrogen induced seasonal changes [45]. Besides tissue speci c GRN, there are some common set of genes which are expressed in different magnitudes in different tissue with respect to seasonality. Expression of plectin gene was found to be most abundant in liver and gonads rather than brain and pituitary. Plectin proteins are abundant (75%) in cytoskeleton of ovary, which is a major phosphoacceptor, in phosphorylation, thus important for the protein's association with the cytoskeleton [46]. Plectin is a hemidesmosomal protein that mediates hyperproliferation of gonads and liver. It also increases the expression of estrogen receptor in sh gonads which triggers seasonal response in ovary manifesting seasonality [47]. Claudin gene was found to be upregulated in all the 4 tissues but its highest expression was found in gonads. Claudin in teleost sh is reported to be tissue speci c in expression. It functions as pore forming TJ protein increasing luminal uid accumulation and volume expansion in ovary [48] as observed in histological picture ( Figure 1).

qPCR for validation of DEG
To validate the nding, qPCR analysis was performed. Relative gene expression of randomly selected 20 differentially expressed genes along with one housekeeping gene (beta actin) showed similar magnitude with the computed log fold change value except three genes (Supplementary le 11). This could be due to cross-reactivity in the designed primer [49].
Discovery and role of biological clock genes in carp sh for circannual breeding Blast hit of four sets of DEGs revealed presence of at least 32 well known biological genes in carp sh. These sets of transcripts represent carp sh speci c BCG sequences which is also a direct evidence for presence of these biological clock genes in this sh. These differentially expressed BCGs in BPGL tissues are involved in manifestation of circannual rhythm of breeding. In case of European sh reproductive axis gene expression in brain is known to be modulated by photoperiod which is regulated by biological clock genes and conserved transcription factors [3]. This is the rst report having transcriptomic evidence of biological clock genes in rohu mediating seasonality over BPGL axis controlling reproductive behaviour. Interestingly, all these 32 biological clock genes are well conserved from lower vertebrate like cyclostome (Lamprey) to higher vertebrate like mammals. In pituitary, we found differential expression of Bmal1, Clock, Per1, Per2, Cry1 and Cry2 genes sensing duration of photoperiod for synchronization in BPGL axis. This culminates in manifestation of reproductive seasonality with circannual rhythm [50].
Few already reported genes of rohu sh involved in seasonality regulation were also found in our dataset. For example Kisspeptin and its receptor Kissr [51]. We found differential expression of known sex determining and seasonality genes for example aromatase cyp19a1a, estrogen receptor esr1a, and foxl2 in ovaries similar to Iberian cyprinid sh (Squalius pyrenaicus) [52] and GnRH3, similar to other teleost [53]. Similar gene family like Kiss and gnrh are reported to control reproductive axis under in uence of contrasting day length as photoperiod in sh sea bass [54].
Though expression of biological clock genes are tissue speci c as well as magnitude speci c [55] but they are well coordinated and synchronized across various organs involved in sensing photoperiod, ambient temperature, food supplies, immunity, locomotory activity and body growth [56] enabling the sh for seasonal breeding. Isotocin-neurophysin gene was found to be upregulated in brain. Isotocin is teleost homologue of mammalian oxytocin. In case of zebra sh it has been reported that isotocin stimulates the proliferation of cells enhancing the functional activities of ionocytes. Isotocin stimulates the proliferation of cells in sh gonads [57]. Similarly, higher expression of synaptotagmin gene was also found in hypothalamus and pituitary which is involved in neurotransmitter secretion [58].
Reproductive seasonality in sh is reported to be controlled by hypothalamus/ SV organ which acts as a sensor of seasonal changes in day length [28]. Existence of this organ is already reported in Labeo rohita [59]. Present transcriptome based study in rohu also con rms the seasonality control by similar regulatory pathway of BPGL axis. Seasonality in tropical shes is an adaptation to environment and its optimal growth and perpetuation of species in its speci c ecological niches. Fish represents vertebrate circadian timing system operated by circadian biological clock and photoreceptor organ. This ability originates during embryogenesis and continues till reproductive age [60]. Seasonal reproductive behaviour is induced by environmental cues. Various sets of biological clock sense the environmental cues and acts as a transducer eliciting neuroendocrine mediated developmental and physiological events under coordination of BPGL axis [61].
Evolutionary conservation of BCGs from cyclostome to eutherian mammals Extent of conservation in primitive jawless sh like lamprey, cartilaginous, teleost, lung sh; amphibian, bird, reptile and three mammalian classes (monotreme, marsupials and eutherians) is summarized in table 10. The comparative analysis of carp sh BCGs sequence with different vertebrate phyla indicates that these biological clock genes are well conserved in all. Maximum percentage similarity of all the 32 biological clock genes of rohu was found with jawless sh, lamprey. However, query coverage was maximum up to 100% with cyprinid (common carp) and zebra sh up to 100% (Supplementary le 12).
Circular plot shows further qualitative depiction of extent of conservation in selected six phyla namely, amphibian, sh, birds reptiles, eutherian mammals and ovoviviparous mammals ( gure 5). These ndings infer that genes and neuroendocrine pathways sensing environmental cues are well conserved during evolutionary process of species diversi cation into different classes of vertebrates [62]. Such biological clock genes are reported to be conserved in other shes also like Atlantic cod sh having 18 conserved biological clock genes [63]. Thus, investigation of present transcriptomic landscape revealed that at least 32 reported biological clock genes of other diverse species are also present in rohu sh. With seasonal changes these genes are differentially expressed synchronizing environment and reproduction.
In evolution, biological clock has been found universal, which originated 2.5 billion years ago [64]. This is because of universality in response of all evolving organisms against change in day length with rotation of earth having daily and seasonal variation in environment. Environmental signal entrains the BCGs to trigger seasonal behaviour for annual biological rhythm for reproduction/migration [65]. Magnitude of biological rhythm varies from 20 minutes (bacterial reproduction) to daily as circadian rhythm but such rhythm may have much longer duration like tidal rhythm (14 days), lunar rhythm (28 days), circannual rhythm one year, sometime it may be of 17 years (cicada species emergence from ground) to 60 years (bamboo owering) [66].
Circadian rhythm is well connected to lunar, tidal, circannual rhythm in hibernating mammals. It is also present in invertebrates [67]. Circannual rhythm of breeding is well known in farm shes, birds and mammals [68]. These nding re-endorses that there is a universal mechanism at molecular level regulating seasonal reproduction in vertebrates right from sh to mammal which starts from photo period sensing by retina and signals to neurohypophyseal complex for endocrinal output in a sh-speci c organ called saccus vasculosus [69]. Arti cial manipulation of day length affects rhythmicity of biological clock affecting reproductive activities like ovulation, spermatogenic activity , gamete quality and sexual behaviour in sh, bird and mammals [68]. Oscillation in biological rhythm has adaptive signi cance to reduce competition between species by occupying differential species speci c spatial niches. This is done by differential phasing of biological rhythm which is entrained by both internal and external force [8]. Thus we observe ubiquity of circadian rhythm phylogenetically having strong adaptive signi cance [70]. This evolutionary comparative genomic study endorses that sh biological clock can be used as a model for research and knowledge discovery in terms of depicting molecular and cellular mechanism in response to environment in applied research like drug e cacy for jetlag etc [71].
Web genomic resources of evolutionary conserved BCGs and candidate genes of seasonality Present investigation of environmentally triggered reproduction can be of much relevance in the endeavour of out of seasonal breeding leading to higher sh productivity [56]. The present genomic resource catalogues 75554 differentially expressed transcripts, 115534 SSRs, 65584 SNPs from de novo assembly, 514 pathways, 5379 TFs, 187 mature miRNA regulating 1576 DEGs and 32 BCGs. It also catalogues ready to use primers. This is freely accessibility for academic purpose.
Fish transcriptome can be a powerful tool to establish relationship between genotype and phenotype using structural and functional annotation approach [72]. In rohu sh, SSRs and SNPs have been reported to be associated with important disease resistant trait against aeromoniasis bacterial disease [72]. Our transcriptome database can be used as genomic resources as it is having putative markers useful for similar association studies in future [73]. Such genomic resources having SNP markers has also been used in linkage mapping and QTL discovery in rohu sh [74].
Candidate genes involved in GRN of ovary can be used in SNP discovery and future association studies in sh selection. Liver GRN genes can be used for biomonitoring of sh breeding area with seasonality to predict magnitude of sh productivity [75]. GRN of liver having TOR signaling pathway which is associated with lipogenesis can be over-activated to utilize glucose more e ciently by genetic selection [76]. Similarly genes expressed in liver associated with innate immune system can be used for biomonitoring of sh mortality [77].

Conclusion
This is the rst report in carp sh rohu deciphering candidate genes of circannual biological rhythm of reproduction along with BCGs. Study reveals reproductive seasonality of rohu sh with elucidation of neuroendocrine physiological events at molecular level. We present the transcriptomic landscape of BPGL axis operating through various gene regulatory network associated with tissue involved in manifestation of circannual breeding having environmental synchronization by BCGs and BPGL axis.
Differentially expressed genes in BPGL tissues along with putative molecular markers (SSRs and SNPs), pathways, transcription factors, mature miRNA and BCGs are available in the form of web genomic resources (LrSATDb: http://webtom.cabgrid.res.in/lrsatdb/). This study reveals atleast 32 BCGs in this carp sh for the rst time which are also highly conserved across jawless, cartilaginous, teleost sh, amphibian, reptile, birds, prototheria, marsupials and placental mammals. This is because of universal molecular mechanism of rhythmicity in response to environment and earth rotation. Biological rhythm and its amplitude has different adaptive and reproductive signi cance during speciation and evolutionary divergence of different taxa.
Gene regulatory network operating in BPGL tissues mediating photo-periodism sensing, neuroendocrine secretion, metabolism and yolk synthesis in liver, gonadal maturation, muscular growth with sensory and auditory perception in this sh are elucidated. Study further reveals that sh can be a good model for research on biological clock besides its immense use in the endeavour of reproductive e ciency improvement.

Ethics Statement
This study was approved by the Animal Ethics Committee of ICAR-CIFA, Bhubaneswar. All the shes (rohu, Labeo rohita) used in the experiments were handled according to the prescribed guidelines of the Institute.
Tissue sample collection Brain, pituitary, gonad (ovary) and liver tissues of adult rohu female were collected during resting PSR phase (in December) and IGA phase (in February), respectively. The shes were euthanized with MS-222 at 300 mg/L before dissection. Tissues were collected from minimum ten shes for each phase, quickly frozen in liquid nitrogen and stored at -80 0 C, until used for RNA extraction. Carp sh, brood stock was reared in ICAR-CIFA farm ponds (Latitude 20°1'06"-20°11'45"N, Longitude 80°50'52"-85°51'35"E, 45 m MSL). For optimal growth maintenance, stocking density was maintained at the rate of 1500kg/ha along with routine monitoring of water temperature, pH, dissolved oxygen, total alkalinity and total hardness at 28-30˚C, 7.5-8.5, 100-140 ppm, and 100-130 ppm, respectively.

Reproductive phase identi cation by estimation of GSI and histology
The gonado somatic index (GSI) and histological analysis of gonads were carried out to con rm the reproductive phases. For GSI and histological observation, shes were humanely sacri ced. GSI was calculated using the formula: GSI=Gonad weight/ Total tissue weight x100 [78]. For histology, middle portion of the right and left lobes of the gonads from the sampled sh were taken after dissection. The gonad samples (1-1.5mm thickness) were immersed in Bouin's xative for 24 hour and 5µm thick sections were cut using a mechanical microtome (WESWOX Optik Rotary Microtome, Ambala Cantt, India) and stained using haematoxylin and eosin. Gonads were processed for examination with light microscopy using routine histological procedures [79].

RNA extraction and sequencing
Total RNA was extracted using Qiagen RNeasy mini kit using manufacturer's protocol. RNA concentration and purity were estimated using Bioanalyzer 2100 (Agilent, Santa Clara, CA, USA) and samples having RIN (RNA Integrity Number) > 7.0 were used for sequence data generation. Library preparation was performed using Illumina TruSeq RNA library protocol and sequenced on Illumina HiSeq 2000 platform (Illumina, San Diego, CA) to generate 100 nucleotide pair-end reads.
Pre-processing, de-novo assembly, abundance estimation and annotation The paired end reads generated from Illumina for PSR and IGA phases from BPGL tissues were taken for the study. Data were pre-processed for quality check and low quality reads, reads with ambiguous bases 'N' and adaptor sequences were trimmed using FASTQC [80] and trimmomatic [81] tool. The processed tissues speci c paired end reads of PSR and IGA phases were pooled together and de novo assembly with default parameters were performed using Trinity [82]. Further, abundance estimation was performed using RSEM (RNA-Seq by Expectation Maximization) [83] and DEGs were identi ed from the data set under study. DEGs from comparative studies of IGA season vs. PSR season were identi ed in each of the BPGL tissues using edgeR package [84].
Homology search and functional characterization of candidate genes Standalone NCBI blast package was used for sequence similarity search of all novel transcripts and differential expressed genes against the NCBI non-redundant database (nr.32). Annotation was done for the DEGs along with functional categorization, interproscan, mapping and identi cation of KEGG pathways using Blast2GO PRO 3.1 software [10]. Danio rerio transcription factors from AnimalTFDB 2.0 database [11] were retrieved to predict similar transcription factors for differential expressed genes of BPGL tissues in rohu.
SNPs and Indels were detected using two references i.e. Danio rerio reference genome data and generated de novo transcription reference assembly. The reference genome of Danio rerio GRCz10 was retrieved from ensemble database (http://asia.ensembl.org/Danio_rerio/Info/Index?redirect=no) as it is closely related species of L. rohita and both belongs to same family, Cyprinidae. We separately identi ed the SNPs and Indels from IGA and PSR tissue samples. Burrows-Wheeler Aligner (BWA) tool [86] was used to alignment and SAMtools package used for calling SNPs and Indels [87]. Annotation of SNPs and Indels were performed using SnpEff tool (http://snpeff.sourceforge.net/).

miRNA target prediction of DEGs
A total of 1637 mature miRNA sequences of nine sh species such as Danio rerio, Fugu rubripes, Cyprinus carpio, Hippoglossus hippoglossus, Ictalurus punctatus, Oryzias latipes, Paralichthys olivaceus, Salmo salar and Tetraodon nigroviridis were retrieved from miRBase database release 21 [88]. MiRNA target predictions of L. rohita from DEGs of all stages were performed by using miRanda-3.3a tool [89].

Construction of Gene Regulatory Network
Gene regulatory network (GRN) was constructed using up and down regulated DEGs having > 8 fold difference in expression. Visualization was done using open source tool, Cytoscape version 3.2.1 [90].
Network centrality was analysed using Network Analyzer plug-in. Analysis of degree, betweenness and stress were used to identify hub genes and its network. The "logical model" was constructed based on sample size [25].

qPCR for validation of DEG
For quantitative PCR, 20 transcripts were randomly selected. Primers were designed using Primer 3 software [91]. The primer were synthesised in 10nm scale and puri ed by HPLC. The total RNA were converted into cDNA using a nity script qPCR cDNA synthesis kit (Agilent Technologies, USA) as per manufacturer`s protocol. Gene expressions were measured using SYBR Green chemistry (Brilliant II SYBR Green qPCR master mix, Agilent Technologies, USA) with standard 40 cycles in Stratagene mx3005P instrument (Agilent Technologies, USA). The dissociation curve analysis was performed to ensure speci city of ampli cation. The mean Ct value of technical replicates were calculated. Magnitude of differential gene expression were calculated in terms of ΔΔCT fold change value as described by Pfa [92]. Housekeeping beta actin gene was used as reference to normalize the qPCR data. Fold change values were computed by log transformation (Log2).

Discovery of biological clock genes in carp sh
Four sets of DEG obtained from BPGL tissue were subjected to homology search using BLAST. From literature, names of known BCGs [50, 79,63] were used to identify carp sh speci c BCG transcripts as an evidence of BCGs. These selected BCG transcripts were used for further analysis (supplementary le 12).

Evolutionary conservation of BCGs from cyclostome to eutherian mammals
To identify biological clock genes and their extent of conservation, comparative genomics approach was used. For this study, 32 genes were selected viz. Bmal2,clock, cry1,cry2, foxl2, GNRH, kisspeptin, LH (Luteinizing hormone), Per1, Per2, Per3, esr1a, PVR, Bty, wisp1, mdka, amh, ependymin, CaMK2, ehmt1, ehmt2, racgap1, fstl3, fstl4, fstl5, prdm, Otx, Myc, srebf1, pac, picalm and aanat and homology search was done using Blastn programme of NCBI Web genomic resources of evolutionary conserved BCGs and candidate genes of seasonality An online relational database of rohu sh transcriptome was developed which catalogues tissue wise transcripts/contigs, putative SSRs, SNPs, Indels, transcription factors, miRNA targets representing two reproductive phases (IGA and PSR). The architecture is "three-tier architecture" viz., client-, middle-and database tier. This genomic resource is freely accessible for non-commercial use at http://webtom.cabgrid.res.in/lrsatdb/. In order to browse and query, user can go through the web pages in client tier. All the information are available in various tables corresponding to MySQL in the database tier.
Server side scripting in PHP was done in the middle tier for database connectivity, query execution and fetching. In order to generate primers over selected markers, Primer3 executable was integrated at the backend.

Ethics Statement
This study was approved by the Animal Ethics Committee of ICAR-CIFA, Bhubaneswar. All the shes (rohu, Labeo rohita) used in the experiments were handled according to the prescribed guidelines of the Institute.

Availability of Supporting Data
The transcriptome dataset of the study used in this article are available in the NCBI repository with following accessions and is kept at hold till the publication. These would be made public after publication. (Bioproject: PRJNA401304; BioSamples: SAMN07602341, SAMN07602342,   SAMN07602343, SAMN07602344, SAMN07602345, SAMN07602346,     STCH gene This is a Hsp70 family gene reported to be a promising candidate gene for sensing incubation time by ocular dominance and cortical plasticity mediating visual stimuli [99] AATK gene Apoptosis associated tyrosine kinase (AATK) gene: This gene has epigenetic regulation associated with growth and differentiation. Regulates GRIN2 gene, which is one of the key genes of learning and memory in zebra sh [100] EIF31 Controls lipid metabolism, biosynthesis, proteolysis and gluconeogenesis in liver of zebra sh [101] WDFY gene Regulates feeding behaviour and growth involved in brain-pituitary axis of zebra sh [102] PHF24 A well conserved gene across various phyla, involved in integration of light and temperature in the regulation of circadian gene expression [103]  EIF3i Eukaryotic translation initiation factor i (EIF3i) is involved in the initiation process of protein translation. Overexpression leads to increase in cell size and proliferation. In zebra sh, it regulates development of the brain, heart, vasculature, and lateral line. [106,107] Myo1b Myosin IB (Myo1B) gene affects BPGL axis by directly affecting liver metabolism and growth in cyprinid sh, fathead minnow [108] FLNA Filamin A (FLNA) regulates neuro-epilthelia cell proliferation and differentiation by regulating actin cytoskeleton and transmembrane receptor complex in cardiac and skeletal muscles development in medaka sh [109] dpp6B Dipeptidyl-peptidase (dpp6B) gene is involved in retinoic acid pathway controling development of eye and germ cells in medaka sh. Retinoic acid pathway controls gametogenesis. [110,111] Ik gene Ik Gene codes for protein Red. This gene is present in zebra sh. It is involved in cell proliferation and differentiation [112] LAMP2 Pituitary being endocrine gland has several secretory activities. Lamp2 has been reported to control such activities of lysosomes in zebra sh [113] IRF2 BP Iterferon Regulatory Factors (IRFs): In grass carp and zebra sh, this highly conserved gene has been reported to modulate innate immune response. Seasonality in breeding of sh is highly correlated with such immunity [114] ZGC Gene ZGC gene is involved in oocyte maturation and ovulation in zebra sh [115] CMYCB MYCB gene is involved in neuroendocrine secretion in brain and physiological regulation in liver, heart, kidney in sh (European sea bass) affecting growth and differentiation [116] ARHGEF ARHGEF is also known as Rho Guanine Nucleotide Exchange Factor gene family which is present in all eukaryotes including jawless (Lampreys, Agnatha) as well as jaw (cartilaginous and boney) shes. These are the adaptive cell signaling mediators of environmental signals in mediating seasonality [117] DNAJC Also known as HSP 40 play major role in immune responses and their overexpression in liver is associated with defense mechanism [118] LIMA1 and LIMA1A splice variant LIM Domain And Actin Binding 1 (LIMA1) gene is associated with nuclear receptor essential for gonado-genesis [119]  ZGC gene is involved in oocyte maturation and ovulation in zebra sh [115] ZAR1 Zar1 is highly conserved gene also present in zebra sh, involved in growth of ovarian follicles [120] NASP nuclear autoantigenic sperm protein:Expression of zona pellucida protein of primary oocytes and cell cycle regulation in zebra sh is associated with NASP gene. This gene is also reported to promote female development by repressing male-speci c genes [121] Col12A1: Collagen Type XII Alpha 1 Chain This is a signalling molecule associated with chondrocyte differentiation especially where collagen molecules are used in tissue remodeling during physiological growth and differentiation in zebra sh [122] SCML4 (Sex Comb On Midleg Like) SCML4 is differentially expressed and its GRN study has revealed its role in development of optic, retina and olfactory epithelia with reproductive seasonality in teleost sh [123] ptgdsb Prostaglandin D2 Synthase: This gene is involved in synthesis of prostaglandins which controls spawning by stimulating contraction of smooth muscles leading to ovulation in gold sh [124] Col 6a3 Collagen Type VI Alpha 3 Chain;There is seasonal change in collagen content in muscles of sh synchronised with swimming ability. In farm sh,swimming exercise is reported to control puberty. [125,126] CDC42 gene Cell Division Cycle 42: Fish egg yolk mainly contains vitellogenins, synthesized in liver, growing oocytes takes it from circulation by receptormediated endocytosis in oocyte growth [127] ACD Adrenocortical dysplasia homolog (ACD), Shelterin Complex Subunit And Telomerase Recruitment: This gene controls telomeric elongation mediating mitogenic actions of estrogen in granulosa cells for ovarian growth with season [128]  This gene is highly conserved from sh to mammal and associated with seasonal changes in retinoic acid, betaine and gamma-aminobutyric acid production in sh liver [134] PMT (mitogenic): Protein O-Mannosyltransferase In zebra sh, it controls apoptotic signaling mechanisms, immune response and homeostasis in unfavourable environmental stimuli or unfavourable period for breeding, thus it plays role in synchronising the liver activity with environment and seasonality [135] pSAT1 is similar to GPR54 Fish GPR54-1 contains PSAT1 loci, there is evolutionary propinquity between GPR54 and kisspeptin genes, later is well known for controling seasonality [136] G6PCA G6PCA is associated with JAK-STAT5 signalling pathway operating in hepatocytes having glycolysis and gluconeogenesis in tilapia sh [137] OSBPL2B OSBPL2B protein is involved in regulation of cholesterol tra cking and intracellular transport of lipids in zebra sh [138]