The Chemical Defensome of Fish: Conservation and Divergence of Genes Involved in Sensing and Responding to Pollutants Among Five Model Teleosts

19 How an organism copes with chemicals is largely determined by the genes and proteins that 20 collectively function to defend against, detoxify and eliminate chemical stressors. This 21 integrative network includes receptors and transcription factors, biotransformation enzymes, 22 transporters, antioxidants, and metal- and heat-responsive genes, and is collectively known 23 as the chemical defensome . Although the types of defensome genes are generally conserved 24 in animals, there are important differences in the complement and function of specific genes 25 between species. Teleost fish is the largest group of vertebrate species and can provide 26 valuable insights into the evolution and functional diversity of defensome genes. 27 In this study, we compared the genes comprising the chemical defensome of five fish 28 species that span the teleosteii evolutionary branch often used as model species in 29 toxicological studies and environmental monitoring programs: zebrafish ( Danio rerio ), Atlantic 30 cod ( Gadus morhua ), medaka ( Oryzias latipes ), Atlantic killifish ( Fundulus heteroclitus ) and 31 three-spined stickleback ( Gasterosteus aculeatus ). Genome mining revealed evolved 32 differences in the number and composition of defensome genes that can have implication for 33 how these species sense and respond to environmental pollutants. The results indicate that 34 knowledge regarding the diversity and function of the defensome will be important for 35 toxicological testing and risk assessment studies. 36 37 38


43
The aquatic environment is a sink for anthropogenic compounds, and aquatic animals are 44 particularly vulnerable to chemical stressors in their natural habitats. Many of these chemicals 45 may profoundly influence organism health, including viability, growth, performance, and 46 reproductive abilities. Aquatic species are also widely used as model organisms to assess 47 responses to environmental pollutants. In the OECD Guidelines for the Testing of Chemicals, 48 13 tests for toxic properties of chemicals use fish in general, and often specific fish species 49 such as zebrafish, medaka, Atlantic killifish and three-spined stickleback. 50 The intrinsic defense against toxic chemicals largely depends on a set of genes and 51 proteins collectively known as the chemical defensome 1 . The chemical defensome include a 52 wide range of transcription factors, enzymes, transporters, and antioxidant enzymes that 53 together function to detoxify and eliminate harmful compounds, including xenobiotic and 54 endobiotic chemicals. Thus, the composition of genes comprising a species' chemical 55 defensome will affect the species overall responsiveness and sensitivity towards chemicals 56 stressors. The recent years of sequencing efforts have produced high quality genome 57 assemblies from a wide range of species, facilitating genome-wide mapping and annotation 58 of genes. The chemical defensome was first described in the invertebrates sea urchin and sea 59 anemone 1,2 , and was later mapped in zebrafish (Danio rerio), coral, arthropods, and partly in 60 tunicates 3-6 . Although these reports show that the overall metabolic pathways involved in the 61 chemical defensome are largely evolutionarily conserved, the detailed comparison of 62 defensome gene composition in different teleost fish species is not studied. 63 The diversity in both presence and number of gene homologs can vary substantially 64 between fish species due to the two whole genome duplication (WGD) events in early 65 vertebrate evolution 7 and a third fish-specific WGD event 8 , in addition to other evolutionary 66 mechanisms such as gene loss, inversions and neo-and subfunctionalizations. For example, 67 we have previously shown that several losses of the pregnane x receptor (pxr * , or nr1i2) have 68 occurred independently across teleost evolution 9 . PXRis an important xenosensor and as a 69 ligand-activated transcription factor one of the key regulators of the chemical defensome 10,11 . 70 The importance of PXR in response to chemical stressors in vertebrates, raises questions of 71 how some fish species cope without this gene. 72 Thus, the objective of this study was to compare the chemical defensome of zebrafish 73 (Danio rerio), medaka (Oryzias latipes), and Atlantic killifish (Fundulus heteroclitus), which are 74 species that have retained a pxr gene, to Atlantic cod (Gadus morhua) and three-spined 75 stickleback (Gasterosteus aculeatus), that have lost this gene by independent mechanisms. 76 Zebrafish, medaka, killifish, and stickleback are established laboratory and environmental 77 model species in both developmental and toxicological studies 12-15 . Atlantic cod is an 78 ecologically and economically important species in the North Atlantic Ocean, and has 79 commonly been used as a bioindicator species in environmental monitoring programs [16][17][18] . 80 The genome of Atlantic cod was published in 2011 19 , which has facilitated its increased use 81 as model in toxicological studies [20][21][22][23][24] . Although these fish are all benthopelagic species, their 82 natural habitats range from freshwater to marine environments, from tropical to temperate 83 conditions, and reach sizes ranging from less than five cm (zebrafish and medaka) and 15 cm 84 (killifish and stickleback) to 200 cm (Atlantic cod). 85 Our results showed that putative orthologs of all genes comprising the chemical 86 defensome were retained in these five fish species, except for the absence of pxr in Atlantic 87 cod and stickleback. We found that the number of homologs in some gene families can vary 88 greatly between fish species, which could result in differences in the corresponding defense 89 pathway. However, these variations appeared to be randomly distributed in the defensome 90 gene families and were not unique to known target genes of Pxr or other xenosensors. 91 Furthermore, we found that many of chemical defensome genes are not transcribed in early 92 development of zebrafish and stickleback until after hatching. The consequential lack of 93 transcriptional response in zebrafish embryo compared to larvae were further demonstrated 94 following exposure to the model polycyclic aromatic hydrocarbon contaminant, 95 benzo(a)pyrene. 96 In conclusion, our study represents the first interspecies comparison of the full 97 complement of chemical defensome genes in teleost model species. We found that although 98 most defensome genes have been retained in the teleost genomes over millions of years, 99 there are distinct differences between the species. Based on our results, we suggest a holistic 100 approach to analyze omics datasets from toxicogenomic studies, where differences in the 101 chemical defensome gene complement are taken into consideration. 102 Putative orthologs of the retrieved protein sequences were identified using reciprocal 130 best hit BLAST searches against the well-annotated zebrafish proteome. To capture any 131 species-specific duplications in the fish genomes compared to the zebrafish reference 132 genome, hits from one-way BLAST hits were also included. The identified peptide sequence 133

Material and methods
IDs were subsequently converted to their related gene IDs using the BioMart tool on ENSEMBL The zebrafish dataset included 18 time points from one cell to five days post 156 fertilization (dpf). In order to best compare to the available stickleback developmental data, 157 we chose to include the following time points in this study: Cleavage_2 cell (early morula), 158 blastula_1k cell (late morula), mid-gastrula, segmentation_1-4-somites (early organogenesis), 159 and larval_protruding_mouth (24 hph). 160 161 2.4 Exposure response on defensome genes in zebrafish early development 162 RNA-Seq datasets of zebrafish exposed to benzo(a)pyrene (B(a)P) (gene counts from NCBI 163 GEO: GSE64198) were previously published by Fang, et al. 32 . Briefly, the datasets are results 164 from the following in vivo experiments: Adult zebrafish were exposed to B(a)P for 7-11 days 165 before their eggs were collected and further exposed until 3.3 (embryonic state) and 96 166 (larvae state) hours post fertilization (hpf). 200 embryos and 10 larvae were pooled for each 167 group, giving three replicate groups of control and exposed at 3.3 hpf and two replicate groups 168 of control and exposed at 96 hpf. The gene counts were then normalized followed by 169 differential expression analysis using edgeR 31 . 170 171 172 stickleback to 510 in zebrafish (Figure 1). Although the number of putative homologous genes 180

Results and discussion
in each subfamily varies, we found that all gene subfamilies of the chemical defensome is 181 represented in each species, except for the absence of pxr in stickleback and Atlantic cod. 182 searching gene names and using HMMER searches with Pfam profiles, followed by reciprocal 185 or best-hit blast searches towards the zebrafish proteome. The gene families are organized in 186 categories following Gene Ontology annotations and grouped by their role in the chemical 187 defensome. The size of the disk represents the relative number of genes in the different fish 188 genomes within each group, with the number of genes in a specific gene family as slices. 189

Soluble receptors and transcription factors 191
Stress-activated transcription factors serve as important first responders to many chemicals, 192 and in turn regulate the transcription of other parts of the chemical defensome. Nuclear 193 receptors (NRs) are a superfamily of structurally similar, ligand-activated transcription factors, 194 where members of subfamilies NR1A, B, C, H, and I (such as retinoid acid receptors, 195 peroxisomal proliferator-activated receptors, and liver X receptor), NR2A and B (hepatocyte 196 nuclear factors and retinoid x receptors), and NR3 (such as estrogen receptors and androgen 197 receptor) are involved in the chemical defense 33-36 . All NR subfamilies were found in the five 198 fish genomes, except for the nr1i2 gene. 199 NR1I2, or pregnane x receptor (PXR) is considered an important xenosensor responsible 200 for the transcription of many genes involved in the biotransformation of xenobiotics 10,37 . We 201 have previously shown that loss of the pxr gene has occurred multiple times in teleost fish 202 evolution 9 , including in Atlantic cod and stickleback. Interestingly, our searches did not reveal 203 a pxr gene in the ENSEMBL genome assembly of Japanese medaka HdRr, which is considered 204 the reference medaka strain 38,39 . In contrast, a pxr gene is annotated in the ENSEMBL 205 genomes of the closely related medaka strains HSOK and HNI. Our previous study identified a 206 sequence similar to zebrafish Pxr in the MEDAKA1 (ENSEMBL release 93) genome 9 , and a 207 partial coding sequence (cds) of pxr is cloned from medaka genome 40 . However, the specific 208 strain of these resources is not disclosed. 209 To assess the possible absence of pxr in the medaka HdRr strain, we also performed 210 synteny analysis. In vertebrate species, including fish, pxr is flanked by the genes maats1 and 211 gsk3b. These genes are also annotated in medaka HdRr, but the specific gene region has a 212 very low %GC and low sequence quality. Thus, we suspect that the absence of pxr in the 213 Japanese medaka HdRr genome is due to a sequencing or assembly error. However, until more 214 evidence of its absence can be presented, we chose to include the medaka pxr gene (UniProt 215 ID A8DD90_ORYLA) in our resulting list of medaka chemical defensome genes.  prostaglandin-endoperoxide synthases (PTGS, also known as cyclooxygenases, EC 1.14.99.1).

242
Of these, aldh represented the largest family in our study, with number of putative gene 243 orthologs ranging from 19 to 22 genes. In comparison, fmo had only one gene in zebrafish and 244 four in killifish and cod. 245 Furthermore, reductases modify chemicals by reducing the number of electrons. 246 Reductases include aldo-keto reductases (AKRs, EC 1. ranging from one in zebrafish to ten in stickleback. A phylogenetic analysis of the evolutionary 251 relationship of the sequences (Figure 2), shows that all fish species have a Nqo1 annotated 252 gene. In addition, medaka, killifish, stickleback and cod have three to nine other closely related 253 genes. The endogenous functions of the different nqo genes found in fish, and thus the 254 consequences of their putative evolutionary gain in teleost fish, remains unknown and should 255 be studied further.  In our study, we found putative orthologs for all gene families, except a mt encoding 341 gene in stickleback or cod genome assemblies. Metallothioneins are cysteine-rich, low 342 molecular weight proteins, and can thus be lost due to low-quality sequence and subsequent 343 assembly. In discrepancy with the genome data, Mts are previously described in both Atlantic 344 cod (Hylland et al 1994) and stickleback (Uren Webster 2017), and these protein IDs were 345 included into our overview. Similarly, only one or two mt genes were found in zebrafish, 346 killifish, and medaka, and this low number is in line with previous findings on metallothioneins 347 in fish 73 . 348 349

Expression of defensome genes in early development of fish 350
The developmental stage at which a chemical exposure event occurs greatly impacts the effect 351 on fish. In general, chemical exposures during early developmental stages of fish cause the 352 most adverse and detrimental effects. Based on data from the ECETOC Aquatic Toxicity 353 database, fish larvae are more sensitive to substances than embryos and juveniles 74 . Our results showed that there are many defensome genes that are not expressed until 360 after hatching in both species. The delayed genes belonged to all functional categories but 361 were especially prominent where there are several paralogs within the same gene subfamily, 362 for example the transporters and the transferases (Figure 3). Other genes were highly 363 expressed at the early developmental stages, before gradually decreasing. Glutathione-364 related genes, such as gclc, were previously shown to be highly expressed in early 365 development of zebrafish due to maternal loading 75 , and this was supported in our findings 366 in both zebrafish and stickleback. 367 Moreover, we found patterns of clustered transcriptional regulation of oxygenase and 368 transferase enzymes in both zebrafish and stickleback. For example, the Ahr target genes 369 cyp1a, gsta.1, ugt1a1/7 were transcribed at the early morula stage in both fish species, before 370 the levels decreased at late morula and mid gastrula, before again increasing post hatching 371 (Figure 3). In both zebrafish and stickleback, ahr2 is continuously expressed throughout 372 development, with the highest levels at the 24 hph stage. It has been demonstrated that genes 373 regulated by common transcription factors tend to be located spatially close in the genome 374 sequence and thus facilitate a concerted gene expression 76 , and our findings could be 375 supporting such an arrangement. 376 Finally, there were some genes that were transcribed at very high levels at similar 377 developmental stages in both fish. The heat shock proteins hspa8 and hsp90ab1 were highly 378 transcribed at all stages in both zebrafish and stickleback (Figure 3). Hspa8 is a constitutively 379 expressed member of the Hsp70 subfamily, which is previously known as important in rodent 380 embryogenesis 77 . The role of Hsp90ab1 in development is less known 78 . Furthermore, the 381 ferritin genes fth1a and fth1b were expressed at high levels in both zebrafish and stickleback, 382 respectively (Figure 3). However, although ferritin mRNA was found present throughout early 383 development of brown trout, the translated protein was only present after hatching 79 . 384 Importantly, this suggests that there are additional mechanisms that regulate the expression 385 of chemical defensome genes. 386 late morula, mid gastrula, early organogenesis, and 24 hours post hatching (hph). 391

Exposure response of the defensome genes 393
Next, we studied the transcriptional effect of a well-known Ahr agonist, benzo(a)pyrene (BaP), 394 on embryonic and larval stages of zebrafish (relevant data available on FAIRDOMHub: post fertilization (hpf), the exposure led to a strong upregulation of cyp2aa9 (3.87 fold) and 397 an increased transcription of single genes such as ahr2, nfe2, aanat1, and hspb1 (Figure 4a). 398 Cyp1a, which is an established biomarker of exposure to BaP and other polycyclic aromatic 399 hydrocarbons (PAH) in fish 80-82 , was not induced at this stage. However, as described in the 400 original study 32 , we found a strong induction of cyp1a (5.67 fold) at the larvae developmental 401 stage (Figure 4b). Induction of zebrafish cyp1a is previously shown from 24 hpf following 402 exposure to the Ahr model-agonist TCDD 83 . Following exposure at the larval stage, we found 403 a trend of clustered regulation of functionally grouped genes. In general, transcription factors 404 were downregulated, whereas biotransformation enzymes were upregulated. However, the 405 BaP xenosensor, ahr2, and the oxidative stress-responsive transcription factor nfe2l1a, were 406 both upregulated (0.53 and 1.28 fold, respectively). The crosstalk between these transcription 407 factors following exposure to chemical stressors is previously studied in zebrafish 84,85 . 408

417
The chemical defensome is essential for detoxification and subsequent clearance of 418 xenobiotic compounds, and the composition of the defensome can determine the 419 toxicological responses to many chemicals. Our results showed that the number of chemical 420 defensome genes ranged from 446 in three-spined stickleback to 510 in zebrafish, due to a 421 varying number of gene homologs in the evolutionarily conserved modules. Of the five fish 422 included in this study, zebrafish has the highest number of gene homologs in most gene 423 families, with the interesting exception of the nqo reductases where medaka, killifish, cod, 424 and especially stickleback, had retained a higher number of homologs compared to only one 425 in zebrafish.

426
We have previously shown that the stress-activated receptor pxr gene has been lost in 427 stickleback and cod, but is retained in zebrafish, Atlantic killifish and medaka (Eide et al., Traditionally, studying single molecular biomarkers of exposure has proven very useful 446 in toxicological studies 67,91 . Now, the recent advances in omics technologies enable a more 447 holistic view of toxicological responses, including gene set enrichment analysis and pathway 448 analysis approaches 92-95 . However, these analyses can be challenging when working with less 449 studied and annotated species, such as marine teleosts. As seen from our results, studying the 450 full gene complement of the chemical defense system can identify trends of grouped 451 responses that can provide a better understanding of the overall orchestrated effects to 452 chemical stressors. Such insights will be highly useful in chemical toxicity testing and 453 environmental risk assessment.         Figure 1 Chemical defensome genes in ve model sh species. The genes were identi ed by searching gene names and using HMMER searches with Pfam pro les, followed by reciprocal or best-hit blast searches towards the zebra sh proteome. The gene families are organized in categories following Gene Ontology annotations and grouped by their role in the chemical defensome. The size of the disk represents the relative number of genes in the different sh genomes within each group, with the number of genes in a speci c gene family as slices.

Figure 2
Phylogenetic tree of NAD(P)H:quinone oxidoreductases (NQO), also known as DT-diaphorase (DTD). Multiple sequence alignment and phylogenetic tree was built using Clustal Omega49 with standard settings. The tree was drawn using iTol50, and rooted with the archaebacterial NQO5.  Transcriptional responses on chemical defensome genes in a) zebra sh embryo (3.3 hours post fertilization (hpf)) and b) zebra sh larvae (96 hpf) following exposure to benzo(a)pyrene. The transcription is shown as log2 fold change between exposed and control group at each timepoint. The genes are grouped into their functional categories in the chemical defensome and the name of some genes are indicated for clarity.