Experimental Design:
ENU mutant grass carp (meiotic gynogenetic offspring induced by UV-inactivated heterologous sperm from Megalobrama amblycephala) were obtained from the Bream Genetics and Breeding Center of Shanghai Ocean University, Shanghai, China. On arrival in the laboratory, the fish were maintained at 28 ± 0.5 °C for at least 7 days before experimental use and were fed well to keep them healthy. A total of 60 fish were used. The average weight of the fish was 4.4-6.0 g. The trials were conducted in aerated glass aquaria (120 × 40 × 30 cm), each containing 100 L of water. After the acclimatization period, all fish were intraperitoneally inoculated with 20 μl/g of the GCRV-873 strain. Fourteen days after the challenge, all surviving fish were collected, and 30 of them were chosen arbitrarily as the resistant group.
Fish sampling:
In this way, the mutant grass carp were divided into two groups: 30 fish were assigned to the susceptible (infected/morbid) group (S01), and the remaining 30 fish were assigned to the GCRV-resistant group (R02). About fourteen days after injection, liver tissue was collected as the sample and submitted for bulked segregant analysis (BSA).
Sequencing analysis:
Genomic DNA was extracted using a conventional phenol-chloroform method combined with RNase treatment and stored at −20 °C until use. Two bulks were produced by pooling equal quantities of DNA from the susceptible group (S01) and from the resistant group (R02). DNA from each bulk was used to build paired-end (PE) sequencing libraries, which were sequenced on an Illumina HiSeq platform (Illumina Casava 1.8). The whole procedure followed the standard Illumina protocol, including sample testing, library construction, library quality testing, and sequencing, with a read length of 150 bp (Biomarker Technology, Beijing, China).
After removing adapter and low-quality reads, the clean reads were further checked for quality using FastQC. High-quality paired-end reads were mapped to the grass carp reference genome sequence (PRJEB5920) [12] using BWA with default parameters [31]. The clean reads were positioned on the reference genome, statistics such as sequencing depth and genome coverage were computed for each sample, and variants were then detected. The sequencing output statistics, alignment results, average coverage depth, and proportion of the genome covered at each depth for the S01 and R02 groups are given in Table 1.
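The two mapping statistics named above, average coverage depth and the proportion of the genome covered at a given depth, can be illustrated with a minimal Python sketch. It assumes the per-position read depths have already been extracted from the alignments into a list; the function name, the depth thresholds, and the example values are hypothetical and are not taken from Table 1.

```python
def coverage_summary(depths, thresholds=(1, 5, 10)):
    """Summarise per-position read depths for one sample.

    depths: one integer read depth per reference position
            (hypothetical input, e.g. parsed from a depth table).
    Returns the mean depth and, for each threshold t, the fraction
    of positions covered by at least t reads.
    """
    depths = list(depths)
    if not depths:
        raise ValueError("no positions supplied")
    n = len(depths)
    mean_depth = sum(depths) / n
    coverage = {t: sum(d >= t for d in depths) / n for t in thresholds}
    return mean_depth, coverage


# Toy example with eight reference positions (illustrative values only).
mean_depth, coverage = coverage_summary([0, 3, 12, 7, 10, 0, 5, 20])
print(mean_depth, coverage)
```

Reporting coverage at several thresholds rather than one makes it easier to see whether both bulks were sequenced deeply enough for allele frequencies to be compared.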
SNP and InDel detection:
SnpEff is a software tool for annotating variants (SNPs, small InDels) and predicting their effects [32]. SNP detection was implemented mainly with the GATK toolkit [33]. Based on the positions of the clean reads on the reference genome, Picard (http://sourceforge.net/projects/picard/) was used for preprocessing such as marking duplicates. GATK was then used for local realignment and base recalibration to ensure the accuracy of detection.
InDel refers to the insertion or deletion of a short sequence of bases. Insertions and deletions in the samples were detected using GATK. Small InDels are generally less frequent than SNPs; they likewise reflect differences between the sample and the reference genome, and InDels in coding regions can cause frameshift mutations that alter gene function.
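Why a coding-region InDel may or may not shift the reading frame can be shown with a small Python sketch. It is a simplification based only on the length change between the reference and alternate alleles: a change that is not a multiple of three shifts the frame. It is not how SnpEff works internally, and the function name and alleles are hypothetical.

```python
def indel_effect(ref, alt):
    """Classify a coding-region InDel from its ref and alt alleles.

    A length change that is not a multiple of 3 disrupts the codon
    reading frame (frameshift); a multiple of 3 keeps it (in-frame).
    """
    diff = len(alt) - len(ref)
    if diff == 0:
        raise ValueError("ref and alt have equal length: not an InDel")
    kind = "insertion" if diff > 0 else "deletion"
    frame = "in-frame" if diff % 3 == 0 else "frameshift"
    return kind, frame


print(indel_effect("A", "AT"))    # one inserted base
print(indel_effect("ACGT", "A"))  # three deleted bases
```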
Euclidean distance calculation:
The Euclidean distance (ED) algorithm uses sequencing data to find markers that differ significantly between pools and to evaluate the association of SNPs and InDels with the trait [8]. In theory, the two pools constructed in a BSA project differ at sites related to the target trait and tend to be identical elsewhere, so the ED value at non-target sites should tend to 0. The calculation formula of the ED method is shown below. The larger the ED value, the greater the difference at that marker between the two pools.
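$$ED = \sqrt{(A_{mut} - A_{wt})^2 + (C_{mut} - C_{wt})^2 + (G_{mut} - G_{wt})^2 + (T_{mut} - T_{wt})^2}$$
where each letter (A, C, G, T) corresponds to the frequency of the corresponding DNA nucleotide in the mutant (mut) and wild-type (wt) pool or bulk, respectively.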
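As a minimal sketch of the ED calculation, the Python below assumes the usual form of the statistic: the square root of the sum, over the four bases, of the squared difference in base frequency between the two pools. The read counts and the names `s01` and `r02` are hypothetical values chosen for illustration, not data from this study.

```python
import math

BASES = "ACGT"


def base_frequencies(counts):
    """Convert per-base read counts at one site into frequencies."""
    total = sum(counts.get(b, 0) for b in BASES)
    if total == 0:
        raise ValueError("no reads at this site")
    return {b: counts.get(b, 0) / total for b in BASES}


def euclidean_distance(counts_a, counts_b):
    """ED between two pools at one site (assumed standard form)."""
    fa = base_frequencies(counts_a)
    fb = base_frequencies(counts_b)
    return math.sqrt(sum((fa[b] - fb[b]) ** 2 for b in BASES))


# Hypothetical read counts at one SNP in the two bulks.
s01 = {"A": 18, "G": 2}
r02 = {"A": 3, "G": 17}
ed_site = euclidean_distance(s01, r02)
print(round(ed_site, 4))
```

Under this form the value ranges from 0, when both pools have identical base frequencies, to the square root of 2, when the pools are fixed for different bases.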
Functional annotation of genes containing SNPs:
Genes containing SNPs associated with resistance or susceptibility to GCRV were annotated using BLAST against multiple databases (NR [non-redundant protein database, NCBI], Swiss-Prot [http://www.uniprot.org/], GO [Gene Ontology, http://www.geneontology.org/], KEGG [http://www.genome.jp/kegg/], COG [http://www.ncbi.nlm.nih.gov/COG/]) to obtain in-depth annotation of the coding genes in the candidate interval. These detailed annotations allowed candidate genes to be screened quickly.
Gene expression and SNP verification analysis:
qPCR was carried out on a CFX96 Touch™ Real-Time PCR System using the SYBR Premix Ex Taq kit (TaKaRa, Japan). All primers used in the study were designed with Primer Premier 5.0 and are listed in Table 8. The relative expression of each target gene versus 18S rRNA (the reference gene used to normalize expression levels between samples) was calculated using the 2^(−ΔΔCt) method. SYBR Green reactions were performed in a 20 µL volume containing 10 µL of 2× SYBR® Green Realtime PCR Master Mix (Toyobo, Osaka, Japan), 1 µL each of forward and reverse primer (10 µM), 7 µL of water, and 1 µL of diluted cDNA (100 ng/µL). All experiments were performed in the two groups. We identified five functional genes carrying disease-causing mutations in response to GCRV in mutant grass carp.
For SNP verification, we again injected GCRV into 100 ENU grass carp and divided the fish into 50 resistant and 50 susceptible individuals to determine allele frequencies (Table 8). The two SNPs examined were both located in the intron region of EPHB2. Using these two SNPs, we counted the genotypes in the 50 dead (susceptible) and 50 surviving (resistant) fish after the GCRV challenge and tested for a statistical association between genotype and resistance with a chi-square test. A p-value of 0.05 or less was considered statistically significant.
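The 2^(−ΔΔCt) calculation can be sketched in a few lines of Python. The sketch assumes roughly 100% amplification efficiency for both the target gene and the 18S rRNA reference, which is the usual assumption of this method. The function name and the Ct values are hypothetical and do not come from this study.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene by the 2^(-ddCt) method.

    Each dCt normalises the target Ct to the reference gene Ct;
    ddCt compares the sample of interest with the control sample.
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    ddct = delta_sample - delta_control
    return 2 ** (-ddct)


# Hypothetical Ct values: the target crosses threshold two cycles
# earlier in the sample than in the control, with equal reference Ct.
fold = relative_expression(24.0, 12.0, 26.0, 12.0)
print(fold)
```

A value above 1 indicates higher expression in the sample than in the control, and a value below 1 indicates lower expression.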
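The genotype-by-outcome association test can be illustrated with a minimal Python sketch of a Pearson chi-square test on a 2 × k contingency table (dead versus surviving fish by genotype class), without continuity correction. The genotype counts are hypothetical and are not the values in Table 8. To stay dependency-free, the p-value is computed only for 1 or 2 degrees of freedom, which covers a biallelic SNP with two or three observed genotypes.

```python
import math


def chi_square_association(dead_counts, survived_counts):
    """Pearson chi-square test for genotype vs. outcome (2 x k table).

    Returns (chi2, degrees of freedom, p-value).
    """
    if len(dead_counts) != len(survived_counts):
        raise ValueError("both groups need the same genotype classes")
    # Drop genotype classes that were not observed in either group.
    cols = [(d, s) for d, s in zip(dead_counts, survived_counts) if d + s > 0]
    row_totals = [sum(c[0] for c in cols), sum(c[1] for c in cols)]
    if len(cols) < 2 or 0 in row_totals:
        raise ValueError("table is too sparse for a chi-square test")
    n = sum(row_totals)
    chi2 = 0.0
    for col in cols:
        col_total = col[0] + col[1]
        for i in range(2):
            expected = row_totals[i] * col_total / n
            chi2 += (col[i] - expected) ** 2 / expected
    df = len(cols) - 1
    if df == 1:
        p = math.erfc(math.sqrt(chi2 / 2))  # exact upper tail for 1 df
    elif df == 2:
        p = math.exp(-chi2 / 2)             # exact upper tail for 2 df
    else:
        raise NotImplementedError("sketch handles 1 or 2 df only")
    return chi2, df, p


# Hypothetical genotype counts (e.g. AA, AG, GG) in 50 dead and 50 surviving fish.
chi2, df, p = chi_square_association([30, 15, 5], [10, 20, 20])
print(round(chi2, 3), df, p <= 0.05)
```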
Statistical Analysis:
The statistical results (expressed as mean ± standard deviation) were analyzed by one-way analysis of variance followed by Dunnett's test for multiple comparisons using IBM SPSS Statistics 22. Values of p < 0.05 and p < 0.01 were considered statistically significant. All experiments were repeated at least three times.