Combined QTL and GWAS Analysis to Identify the Growth-Related Gene in Rhopilema Esculentum with the Help of 2b-RAD Sequencing

R. esculentum is a popular seafood in Asian countries and an economic marine shery resource in China. However, the high-resolution genetic map and growth-related molecular markers still lack, hindering the process of the genetic breeding of R. esculentum. Therefore, we rstly used the 2b-RAD method to sequence 152 R. esculentum specimens, identied 9100 single nucleotide polymorphism (SNP) markers and constructed a high-resolution genetic map with a marker interval of 0.58 cM, covering 98.68% of the genome. Then, we separately detected four and three quantitative trait loci (QTLs) associated with umbrella diameter and body weight based on the linkage map, which is located on linkage group (LG) 4, 13, 14 and 15. Finally, 27 genes were found both associated with umbrella diameter and body weight of R. esculentum by genome-wide association study (GWAS), in which one gene named RE13670 containing calcium-binding EGF-like domain may play an important role in controlling the growth. This study will be benecial for underlying the growth mechanism of R. esculentum and also provide background knowledge for guiding its genetic breeding.


Introduction
Edible jelly sh R. esculentum distributes in the northwest Paci c Ocean and is a popular seafood in Asian countries, especially in China 1,2 . The edibility, nutritional value and medicinal properties of R. esculentum make it an economical sh resource and widely farmed in aquaculture systems of China 2 . Although morphological development, environmental physiology and culture techniques have been well investigated in R. esculentum 2 , the molecular marker studies still lag in the genetic breeding industry.
The advance of the 2b-RAD approach attracts many researchers' attention and has accelerated the identi cation of SNP markers in aquatic shes, such as Carassius auratus 3 , Cyprinus carpio haematopterus 4 , Hypophthalmichthys nobilis 5 and Hemibagrus wyckioides 6 . Based on the 2b-RAD approach, the high-resolution genetic linkage maps were constructed with a marker interval varying from Phenotyping The body weight and umbrella diameter of R. esculentum F1 offsprings varied from 2.7 to 33.5 g and 3.2 to 7 cm, respectively ( Figure 1). The body weight of most R. esculentum was between 5 and 8 g and the umbrella diameter of most R. esculentum was between 3 and 5 cm ( Figure 1A, B). Figure 1C showed that the body weight and umbrella diameter are positively correlated in R. esculentum and the correlation between them was also supported by the value of CORREL function (0.90) (Supplementary Table 1). In addition, the value of the coe cient of variation (C.V) for body weight in R. esculentum offsprings is higher than that of umbrella diameter (Supplementary Table S1).

2b-RAD sequencing
Total 1539 million raw reads were generated by 2b-RAD sequencing, including 19 million from the male and female parents and 1520 million from the F1 offspring (Table 1). The ratio of clean reads to raw reads is higher than 93.63% and the alignment rate of the clean reads to R. esculentum genome was close to 63% ( Table 1). The tag number of female and male parents was both higher than that of F1 offspring while the tag depth was both lower than that of F1 offspring.   Table S2). The longest LG is LG14 with a genetic length of 88.9 cM, 1.75-fold higher than that of the shortest LG20 ( Figure 3).

Discussion
Growth shows a moderate heritability in aquaculture animals and plays an important role in production output, therefore, making it a primary target for selective breeding 15,16 . Our study showed the growth traits body weight and umbrella diameter are positively correlated in R. esculentum, manifesting any trait can independently represent the growth. Excluding growth-related phenotype, molecular marker shows potential for investigating the growth of aquaculture animals in the genetic breeding industry 17 . SNP markers as the molecular marker were genotyped easily to construct genetic linkage maps for the genetic breeding of aquaculture animals 18 .
For R. esculentum, four and three QTLs were detected associated with the umbrella diameter and body weight, respectively ( Table 2). These QTLs are located on LG 4, 13, 14 and 15 and the LOD value oated between 3.0 and 4.4 (Table 2). For umbrella diameter and body weight, two QTLs in the similar regions separated existed on LG 14 and 15, explaining 9.4 to 13.0% variation ( Table 2). Candidate genes related to umbrella diameter and body weight GWAS analysis indicated that the candidate genes related to umbrella diameter and body weight located on LG 14 and 15, following the results of QTL analysis ( Figure 4A, Table 2). In R. esculentum, 28 and 35 candidate genes were identi ed associated with umbrella diameter and body weight, respectively (Supplementary Table S3). Of these candidate genes, 27 genes were overlapped and one overlapped gene RE13670 (Appendix 1) that located on LG14 may play the key role in controlling the growth of R. esculentum ( Figure 4B, Supplementary Table S3). RE13670 was annotated as EGF and pentraxin domaincontaining protein 1-like and contained six types of conserved domains, including DUF5011 super family, Calcium-binding EGF-like (EGF_CA), Ephrin_rec_like, FXa_inhibition, IG_like and PLAT ( Figure 4C). In addition, RE13670 was homologous to Acropora digitifera with 30.36% similarity after NCBI-blast analysis.
Due to the simplicity and exibility, the 2b-RAD method was extensively used for constructing highdensity linkage maps for sh 5 albi ora, six QTLs distributing on six LGs were related to the body weight 24 . Additionally, two QTLs in similar regions are both associated with umbrella diameter and body weight in R. esculentum, further supporting the above inference and the positive correlation of the two growth traits.
QTL analysis often combines with GWAS for identifying growth-related genes in aquaculture animals 29,30 . For example, ve QTLs were revealed associated with the body weight of cat sh and the candidate genes in these regions were related to bone development and muscle growth 29 . In Epinephelus fuscoguttatus, 23 QTLs are detected corresponding to the growth and 19 candidate genes were detected 30 . Table 3 summarized the candidate genes related to growth traits of aquaculture animals by GWAS and different species showed the difference (Table 3). More than one gene about multiple functions was identi ed in relation to the growth of C. carpio L., L. crocea and O. mykiss while one candidate gene was identi ed controlling the growth in P. yessoensis, E. fuscoguttatus and L. maculatus (Table 3). In R. esculentum, RE13670 containing the EGF_CA domain showed the most possibility in controlling the growth, which is in accordance with the growth-related genes reported in C. auratus 3 . EGF_CA domain needs calcium for performing biological function and is present in extracellular (mostly animal) and membrane-bound 31 . Moreover, this domain has three main roles, including protein-protein interactions, as a spacer unit and structural stabilization 32 . Nevertheless, the function of EGF_CA domains has not been well understood in aquaculture animals. With the release of genomic information of R. esculentum 14 and the development of biotechnology, the gene function studies of R. esculentum will be improved and more genes will be investigated for genetic breeding.

Conclusion
In this study, a high-density linkage map of R. esculentum was constructed with a marker interval of 0.58 mM using the 2b-RAD method. Based on the linkage map, seven QTLs were identi ed associated with the growth of R. esculentum and one candidate gene RE13670 encoding EGF_CA protein may play the key role in controlling the growth of R. esculentum by QTL mapping and GWAS analysis. Both the constructed linkage map and identi ed candidate genes will expand the knowledge of molecular markers of the genetic breeding industry in R. esculentum.

Mapping family and phenotyping
A full-sib family of R. esculentum was generated from a breeding pond in Yingkou City, Liaoning province, China. Total 152 specimens of R. esculentum including two parents and 150 F1 offsprings at sevenmonth were collected from the mapping family for measuring the body weight and umbrella diameter.

DNA extraction
The R. esculentum specimens were collected for extracting genomic DNA and the DNA extraction process referred to our previous method 14 . The DNA quality was measured by Qingdao OE Biotech Co., Ltd.
2b-RAD library construction, sequencing and analysis After removing the low-quality DNA samples, 2b-RAD libraries of 144 specimens were constructed at Qingdao OE Biotech Co., Ltd., following the published method 40 . Firstly, 100-200 ng genomic DNA was digested by 1 U BsaXI (New England Biolabs). Secondly, the ligation reaction was conducted to add speci c adaptors to the digested genomic DNA. Thirdly, the ligation products were ampli ed in MyCycler thermal cyclers (Bio-Rad). Fourthly, the PCR products were puri ed using a MinElute PCR Puri cation Kit and digested using SapI (New England Biolabs). Fifthly, the digested products were transferred to the tube containing magnetic beads for incubation and then transferred the supernatant to a new tube for ligation using T4 DNA ligase (New England Biolabs). After that, the ligation products were puri ed and barcodes were introduced by PCR using barcode-bearing primers. Finally, PCR products were puri ed and pooled for sequencing using the Illumina Novaseq 6000 PE150 sequencing platform.
The raw data of 2b-RAD sequencing were trimmed for getting the high-quality data, and then the highquality data were aligned to reads. The reads with the BsaXI site were extracted and aligned to the reference reads using SOAP (version 2.21) 41 . The reference reads were extracted from R. esculentum genome (NCBI Genome ID: 56778) after electronic digestion using the BsaXI enzyme. The maximum likelihood method was used for SNP typing 42  esculentum were performed by software MapQTL 6 47 . The interval mapping method was used for QTL analysis and every one cM on each LG was scanned for searching the possible QTL. LOD threshold value at 95% level was calculated via 1000 permutation tests for each trait and QTL. LOD score of QTL that was greater than the LOD threshold value at 95% level was declared signi cant.
Genome-wide association analysis The genome-wide association analysis of body weight and umbrella diameter and SNP locus on the