Ethical approval
The housing of all animals located at the Department of Genetics, (UFS, South Africa) as well as the experimental procedures of this study adheres to the guidelines approved by the Interfaculty Animal Ethics Committee at the University of the Free State (Ethical approval number: UFS-AED2018/0037). Section 20 veterinary authorization was obtained from the South African Department of Agriculture, Forestry, and Fisheries (DAFF).
Sampling
A total of 65 zebrafish were selected from several sources, with localities coded for anonymity. Forty-six specimens were obtained from supplier “1” and supplier “2” in Bloemfontein, Free State Province, South Africa. These specimens were kept in the same quarantine tank and formed sample group “A” (ZFA). Another eight specimens were acquired from supplier “3” in the Bloemfontein area (ZFB), and ten fish were obtained from a large-scale ornamental fish supplier based in Johannesburg, Gauteng Province (supplier “4”) (ZFC).
Samples for DNA extraction were obtained from both living fish and laboratory fish that died from natural causes. Live fish were sampled by swabbing as described by Le Vin et al. (2011). Samples from dead fish (stored at -20°C) were taken by means of tail cuttings of roughly 4mm x 2mm.
For comparative purposes, a further 178 cytb sequences from Whiteley et al. (2011) were downloaded from the GenBank database (Accession numbers, JN234180–JN234356), including the cytb sequence of an outgroup species (Danio kyathit; Accession number, EF452733). See Fig. 1 for sampling localities from the Whiteley et al. (2011) data. Abbreviations for the sampled areas are indicated in Supplementary Table S1.
DNA extraction and PCR amplification
DNA was extracted from the zebrafish samples using the Roche© High-Pure PCR Template Preparation Kit (Roche Diagnostics, Indianapolis, IN, USA), following the manufacturer’s protocol. An assessment of DNA quality and quantity was performed on a NanoDrop® Spectrophotometer ND-1000 (ThermoFisher Scientific, Waltham, MA, USA). All DNA extracts were subsequently stored at -20˚C.
A 1 122 bp region of the cytb gene was amplified, using the primers utilised by Whiteley et al. (2011) (Whiteley et al. 2011). For the forward primer we used fishcytbzf-F from Fang et al. (2009), with HA-danio from Mayden et al. (2007) used as reverse primer (Table 1). PCR reaction mixes (12.5 μl volume) were composed of 6.25 µl Ampliqon TEMPase Hot Start 2X Master Mix (Odense M, Denmark), 3.5 µl dH2O, 0.375 µl of each primer (10µM stock), and 2 µl DNA. The PCR reaction conditions were as follows: 95°C for 15 min, 35 cycles of 95°C for 30 s, 56.5°C for 40 s, 72°C for 1 min, 72°C for an additional 5 min and 12°C until manually terminated. All PCR products were sequenced on an ABI 3500 Genetic Analyser at the Department of Genetics, UFS. All sequences generated from this study were deposited in GenBank (Accession numbers: MK893921-MK893984).
Five highly variable microsatellite loci (Ztri1, Z249, Z6104, Z9230, and Z20450), previously utilised by Coe et al. (2009), were used to determine the genetic diversity levels for the zebrafish from different domestic sources. All primer sequences are presented in Table 1. The markers were divided into two multiplex sets (multiplex 1: Ztri1 and Z9230; multiplex 2: Z249, Z6104, and Z20450). The forward primer in each microsatellite pair was fluorescently labelled at the 5’ end. Ampliqon TEMPase Hot Start 2X Master Mix was used for all amplifications (Odense M, Denmark). The PCR reactions for multiplex 1 (12.5 µl) consisted of 4.5 µl dH2O, 0.375 µl of each Ztri1 primer (10 µM stock), 0.250 µl of each Z6104 primer (10 µM stock), 6.25 µl Ampliqon TEMPase Hot Start 2X Master Mix, and 0.5 µl of template DNA. The PCR reaction for multiplex 2 consisted of 4.25 µl dH2O, 6.25 µl Ampliqon TEMPase Hot Start 2X Master Mix, 0.25 µl of each primer (10 µM stock), and 0.5 µl DNA to give a final reaction volume of 12.5 µl. Samples that did not amplify at all markers when using multiplex, were then amplified separately, with the PCR reaction (11 µl) composed of 3.5 µl dH2O, 6.25 µl Ampliqon TEMPase Hot Start 2X Master Mix, 0.375 µl for each primer (10 µM stock), and 0.5 µl DNA.
A touchdown PCR was performed for all microsatellite amplifications. The PCR profile consisted of 95°C for 3 min, 8 cycles of 95°C for 30 s, 64°C for 40 s, -1°C per cycle, 72°C for 1 min. Twenty-five cycles of 95°C for 30 s, 56°C for 40 s, 72°C for 1 min, and a final extension step at 72°C for 5 min.
Statistical analysis
All DNA sequences were assembled and aligned in GENEIOUS v4.7.4 (Drummond et al. 2009) using the ClustalW option (Thompson et al. 1994). DnaSP software (Rozas et al. 2003) was used to calculate the number of haplotypes (h), haplotype diversity (HD) and nucleotide diversity (π) .
Nucleotide diversity (π) results were rounded to four decimal places due to the small magnitude of values obtained. Pairwise population PhiPT values among groups were estimated in GenAlEx v6.5 (Peakall and Smouse 2012).
A Maximum likelihood (ML) analysis of the identified haplotypes was performed using the online PhyML platform (Guindon et al. 2010) (http://www.atgc-montpellier.fr/phyml/) to assess the relationship between our sequences and previously published data. Automatic model selection by Smart Model Selection (SMS) (Lefort et al. 2017) was selected, and branch support was estimated by performing 1 000 bootstrap iterations. The closely related Danio kyathit (Mayden et al. 2007) was used as the outgroup (Accession number, EF452733).
The genetic variation estimated from the microsatellite loci were quantified in terms of observed and expected heterozygosity, number of alleles observed, polymorphic information content (PIC), allelic richness, conformation of expected numbers of genotypes to expectations under Hardy-Weinberg Test equilibrium (HWE), the inbreeding coefficient, null alleles, and presence of linkage disequilibrium. The expectation maximization (EM) algorithm for detection of null allele frequencies was used as implemented in the software program FreeNA (Chapuis and Estoup 2007). A paired t test was performed to determine if null alleles has a significant effect on FST estimates by comparing null allele corrected and uncorrected FST values calculated by the excluding null allele (ENA) method (Chapuis and Estoup 2007). The test for HWE and linkage disequilibrium was done using GENEPOP (Raymond and Rousset 1995). Number of alleles and observed, expected heterozygosity, and pairwise FST and associated p-values, with a significance level of 0.05, were calculated using GenAlEx v6.5 (Peakall and Smouse 2012). Polymorphic information content was determined using Cervus (Kalinowski et al. 2007). Allelic richness and inbreeding coefficient were calculated using FSTAT 1.2 (Goudet 1995).