Origin of pacu families
Data were obtained from 381 juvenile individuals belonging to 14 full-sibling families of pacu, generated in 2016 by a hierarchical mating scheme using 4 dams and 14 sires (approximately 1 dam for each 4 sires) These breeders were originated from three different fish farms located at Sao Paulo State (Brazil). The names of the fish farms were kept confidential. Induced spawning was performed using carp pituitary extract dissolved in saline solution (0.9 % NaCl) and applied in two dosages, with a 12 h interval (first and second dosage of 0.6 and 5.4 mg/kg, respectively). For sires, a single dosage was used, at the same time of the second dosage for dams, equivalent to 1.5 mg/kg of carp pituitary extract.
After hatching in 20 l conical fiberglass incubators installed in the Laboratory of Genetics in Aquaculture and Conservation (LaGeAC), at the São Paulo State University (UNESP), Jaboticabal city (São Paulo State, Brazil), the larvae were fed with artemia nauplii for 20 days. Gradually, the feed was replaced by 50 % of crude protein. In the fingerling stage, 1.2 mm pelleted feeds were used (40 % of crude protein), being gradually replaced by 2 to 3 mm pelleted feeds (36 % of crude protein) provided twice daily in 60 l tanks. Animals used in the experiment were pit-tagged to maintain the pedigree information during the challenge experiments. Laterally, fish were kept in 800 l fiberglass tanks.
Aeromonas hydrophila challenge
Bacterial challenge was performed in 2016 at LaGeAC using a strain of A. hydrophila isolated from an outbreak of pacu aeromoniosis in a commercial facility from the São Paulo State, by the Laboratory of Microbiology and Parasitology of Aquatic Organisms, at UNESP, Jaboticabal city (São Paulo State, Brazil). The strain of A. hydrophila was the same of that used by Mastrochirico-Filho et al. (2019) . The strain was cultured in Agar Trypticase Soy (ATS), Vegitone (Sigma-Aldrich) for 24 h (28°C). The colony was then transferred to a nutrient tryptic soy broth (TSB) (Sigma-Aldrich) and cultured for 24 h (28°C). After bacteria growth, the culture was centrifuged at 5,000 g for 10 min (4°C, in the Eppendorf Centrifuge 5810), forming a bacterial pellet suspended in a saline solution (PBS) and washed twice.
The challenge test performed in this study was carried out by intraperitoneal inoculation, according to protocols of Mastrochirico-Filho et al. (2019) . The LD50 (lethal dose in 50 % of individuals) was previously tested in 60 randomly chosen juvenile individuals from the same pacu families using concentrations adjusted by optical density of the solution at 0.400, 0.600 and 0.800 at 625 nm in spectrophotometer (2100 Unico, Japan). A sample of 100 μl of the LD50 was removed from the inoculum to perform serial dilutions and plate counts in duplicate on Tripticase Soy Agar (TSA).
Prior to challenge, the presence of A. hydrophila and other pathogens, such as Flavobacterium columnare and Streptococcus agalactiae in the populations was discarded checking subsamples of the populations by routine microbiological tests. In the experimental design of the challenge test, 381 juvenile individuals were distributed into three communal tanks of 2 m3 (length = 2 m, width = 1 m, depth = 1 m), where approximately ten individuals from each family (9.1 ± 1.3 individuals) were randomly distributed into each treatment tank (about 127 fish per tank), according to the sample size recommendation for disease resistance challenges proposed by Gjedrem and Baranski (2009)  . Prior to the inoculation of bacteria, fish were placed for 1 minute in a 3 l bucket containing anesthesia dissolved in water (benzocaine - 10 mg/l) with continuous aeration, and weighed. The mean weight of animals prior to the bacterial inoculation time was 23.0 ± 9.06 g. During the anesthetic state, individual fish were injected by intraperitoneal inoculation of the predefined LD50 dose of live cells of A. hydrophila (8 × 105 CFU/g body weight), according to protocols carried out by Mastrochirico-Filho et al. (2019) . Moreover, ten fish from each family (140 fish) were also used as control and kept in a separated tank (called as control tank) of 2 m3 (length = 2 m, width = 1 m, depth = 1 m). Individuals of the control tank were injected by intraperitoneal inoculation of saline solution (PBS).
Each treatment and control tank were maintained with an independent water recirculation system, fitted with mechanical and biological filters, external aeration system, and controlled temperature using thermal controller connected to heaters (2 x 500 w). Water quality parameters were determined daily during 14 days of the challenge. Temperature, dissolved oxygen and pH were measured with a Multiparameter Water Quality Checker U-50 (Horiba, Kyoto, Japan). Water temperature was maintained at 30°C (SD = 0.5), with dissolved oxygen at 5.8 mg/l (SD = 1.0) and pH at 7.0 (SD = 1.1) during the challenge period. No water was exchanged during the challenge, but the tanks were topped off to compensate for evaporation. The control tank had the same condition when compared with treatment tanks, except for an antibacterial treatment through a UV filter.
Fish mortality was observed throughout the entire day (24 h) in the initial three days of challenge, and in intervals of 8 h in the remaining days of the challenge experiment. Fish that showed mortality by clinical signs of A. hydrophila infection (e.g. disequilibrium, hemorrhage, isolation from the group) were recorded and removed immediately from the tanks. Necropsy examination and microbiological tests were performed in a sub-sample of dead fish in order to confirm mortality by A. hydrophila and discard other pathogens. The isolation of bacteria was performed in a specific growth medium (phenol red agar and ampicillin) incubated at 28°C for 48 h.
All surviving fish were examined externally for clinical signs of disease, and posteriorly euthanized using benzocaine (5 g in 50 ml of ethanol) diluted in 3 l bucket containing distilled water.
Genetic parameter estimation
Resistance was assessed as survival to the challenge test using the following trait definitions, according to Mastrochirico et al. (2019) :
- Test survival (TS), which was scored as 1 if the fish died in the challenge test period and 0 if the fish survived at the end of the experiment. This trait was analysed using a binary threshold (probit) model (THR) to account for the binary nature of the trait.
- Time of death (TD), which was scored in minutes, ranging from the moment of the first and last event of mortality. If fish survived to the end of testing period, the time was recorded as 3,880 min. This trait was analysed using a linear mixed model (LIN).
Data were analyzed with two different univariate animal models as defined below:
- LIN: A linear model was used to fit the continuous variable of TD:
yij is the phenotype for the fish j, in tank i; μ is the fixed effect of the overall mean; ti is the fixed effect of the tank i; wij is the covariate of weight prior bacteria inoculation for fish j, in tank i; aij is the random animal genetic effect of individual j, in tank i; and eij was the random residual effect for the fish j, in tank i.
- THR: A binary threshold (probit) model was used for analysing TS:
Where, yij is the phenotype (TS) for the fish j, in the tank i; Φ(·) is the cumulative standard normal distribution and the other parameters are similar to those described above.
THR and LIN models were fitted using ASREML 3.0 package . For all the models, the random animal genetic effect was assumed to be N (0,), where A is the pedigree-based additive genetic kinship matrix among all the animals included in the population and is the additive genetic variance. Random residuals for LIN were assumed to be N (0,), where I is an identity matrix and is the residual variance. For THR model, the residual variance on the underlying scale was set to 1. For both models, heritability was calculated as:
Where, was the additive genetic variance and was the residual variance.
DNA extraction and construction of RAD libraries
For RAD-seq analysis, fins from 391 individuals (373 offspring + 18 parents) were sampled for DNA extraction using the Dneasy Blood & Tissue QIAGEN kit. The quantification of extracted DNA (ng/μl) was measured by the Qubit fluorescence detector using the Qubit dsDNA BR Assay kit (Invitrogen, USA).
Each library was constructed with approximately 29 individuals identified with barcodes (14 libraries in total). Genomic DNA from each individual was digested by Sbfl enzyme. For each 11.35 μl of DNA (concentration at 10 ng/μl), 0.25 μl of Sbfl and 1.3 μl of NEBuffer 4 (New England Biolabs) were used. The enzyme digestion was performed at 37°C for 1 h, followed by an inactivation period of 20 min at 80°C.
Barcode adapters were ligated to the end of the DNA fragments following digestion with Sbfl, using the T4 DNA ligase (New England Biolabs). The ligation was incubated at 20°C for 2 h, and then at 65°C for 20 min. Following library construction, their quality was determined by PCR that combined 8.5 μl of water; 12.5 μl Phusion High-Fidelity Master Mix; 1 μl of RAD primer mix (forward and reverse) and 1 μl of digested DNA. The amplification was carried out with 18 cycles in a thermocycler (98°C for 10 s, 65°C for 30 s, 72°C for 30 s) and 72°C for 5 min. Size selection (300 - 500 bp) was carried out using excision of the appropriate band from gel electrophoresis. Purification was then carried out by the MinElute Gel Extraction kit, following the recommendations of the manufacturer. The concentration of each library was then normalized to 2.5 nm for sequencing by Edinburgh Genomics (University of Edinburgh, UK) on a Illumina NovaSeq platform.
Identification of SNPs from RAD sequencing
Sequences were filtered to discard those of low quality and trimmed to 150 bp. Sequences with ambiguous barcode sequences were also eliminated from the subsequent marker processing using Stacks software . The sequences were separated according to the barcodes linked to the DNA fragments. Redundant sequences were filtered using Dedupe Python Library . The minimum depth of coverage (m) to create a putative allele (stack) was 3. The number of mismatches allowed between two alleles of sample (parameter M) was 3. Considering the RAD locus catalog containing parental individuals, the number of mismatches allowed between two alleles from the population to form a catalog was 3. To identify putative SNPs, only loci present in at least 70 % of individuals in a population were considered (call rate > 0.70). In order to differentiate putative SNPs from sequencing errors, Plink 1.9 software  were used to exclude SNPs with minor allele frequencies (MAF) lower than 0.05, p-value of Hardy-Weinberg disequilibrium (PHWE) lower than 1x10-6 and individuals and SNPs with high missing genotype rates (mind 0.3; geno 0.2). In addition, SNPs with more than 10 % Mendelian error rate were also discarded for linkage map construction.
Linkage map and synteny analysis
Analysis of parentage assignment was performed by software Cervus3 [53, 54], with sex information of parents to confirm the pedigree. A linkage map was created using Lep-MAP3 . ParentCall2 module confirmed reliable parental genotypes using joint information on offspring and parents. Filtering2 module was used to remove markers with significant segregation distortion (dataTolerance = 0.001). Markers were assigned into linkage groups (LGs) by SeparateChromosomes2 (lodLimit = 8.6) and adjusted to 27 LGs (corresponding to the 27 pairs of chromosomes of pacu) . Posteriorly, orphan markers were assigned to existing linkage groups (LOD score = 6.7) using JoinSingles2 and ordered within linkage groups using OrderMarkers2 module. The generated linkage map was drawn using the LinkageMapView . In order to verify the ordering of loci within the LGs belonging to the linkage map, correspondence synteny analysis between the chromosomes of Astyanax mexicanus and the LGs for pacu was performed with the circos plots by Circa software (http://omgenomics.com/circa/).
Genome-wide association study (GWAS)
Genome-wide association study for test of survival (TS) and time of death (TD) traits were tested using single-step GBLUP (ssGBLUP) and weighted single-step GBLUP (wssGBLUP). The weights for each SNP were 1 for the first iteration, which means that all SNPs have the same weight (i.e., single-step GBLUP). For the next iteration (wssGBLUP), more weights were attributed to SNPs that are associated with QTLs with a relatively large effect. The weights were estimated from the variance explained by each SNP . The kinship matrix used was H , in which genotype and pedigree data are combined. The inverse of the matrix H is:
Where A-1 is the inverse numerator relationship matrix for all individuals, the inverse of a pedigree-based relationship matrix for genotyped individuals, and G-1 the inverse genomic relationship matrix.
TD was analyzed as a linear trait using AIREMLF90 and BLUPF90, whereas, TS was analyzed as a threshold trait using THRGIBBS1F90 in the BLUPF90 family of programs . Manhattan plots were drawn based on the proportion of additive genetic variance explained by windows of 20 adjacent SNPs. 12,657 SNPs and 268 genotyped animals which passed on the quality control (call rate > 90%; MAF > 0.05) were analyzed.
SNP windows that explained more than 1 % of the additive genetic variance for the trait were defined as putative QTL associated with A. hydrophila resistance. Each RAD-tag containing highly associated SNPs were aligned against the Pygocentrus nattereri genome (a phylogenetically close fish species) using BLASTn tool to investigate candidate surrounding genes (within 500Kb region) associated with A. hydrophila resistance.