Animals and Training
All dogs used in this study were either purebred German Shepherds (GSs, n = 127) or purebred Labrador Retrievers (LRs, n = 497). These two breeds are the most common dog breeds used as drug detection dogs by Japan Customs.
Dogs were trained to be drug detection dogs at the Canine Training Center, Tokyo Customs, between 2002 and 2019. They were provided by dog breeders (i.e., general dog population in Japan) at about 1 year of age. They did not receive any specific training for detection dogs until then. At the training center, dogs were kept in kennels and the animal management staff provided daily care in accordance with the relevant regulations in Japan. We could not obtain the precise pedigree data for confidentiality reasons.
The training method was based on positive reinforcement with social rewards (i.e., food rewards were not used). Canine trainers played tug-of-war with the dogs using a rolled towel with a target scent (i.e., reinforcer) so that dogs were motivated to search for an object with the scent. The training period lasted for about 4 months, and was divided into 3 phases (i.e., familiarization, basic training, and advanced training). The Familiarization phase was aimed at habituating the dogs to the training methods and to the novel environment of the training facility. Basic and advanced training phases were aimed at training the dogs to detect strong- and soft-scent drugs, respectively. At the end of each training phase, dog experts determined which dogs were suitable for scent work. Dogs that succeeded in all training sessions and phases were employed for drug detection at each field (e.g., airport, port) and were categorized as ‘qualified,’ whereas those that did not, were categorized as ‘unqualified.’
Finally, 121 GSs and 205 LRs were subjected to genetic analysis. The qualification ratios were 64/121 (52.9%) in GSs and 73/205 (35.6%) in LRs. Previous studies have suggested that the rate of successful training of working dogs reached 30% to 50%[15,16], which is consistent with the qualification rate in the present study. Note that these ratios refer to the qualification rate of dogs for which detailed genetic analysis was possible in this study. Japan Customs reported that the qualification rate is about 30 % for both breeds. The outline of the training protocol has been described in further detail in previous studies[5,14]. This study was approved by the Ethics Committee of the Wildlife Research Center, Kyoto University (WRC 2010EC001) and performed in accordance with the relevant guidelines and regulations. This study used a non-invasive method based on behavioral observation, except for blood sampling. All methods using dogs in this study were reported in accordance with ARRIVE guidelines (https://www.nature.com/srep/journal-policies/editorial-policies#experimental-subjects).
Assessment of behavioral traits
The dogs’ behavioral traits were assessed after 2 weeks of training in the Canine Training Center. We obtained scores of seven behavioral traits regarding suitability for scent work as follows: activity, boldness, concentration, friendliness to humans, independence, and interest in the target (dummy) (Table 1). Using a 5-point scale (i.e. 1 to 5) with a score of 5 indicating ‘very high,’ dog experts working at the training facility rated the extent to which a behavioral trait was applicable to each dog. The detailed procedure of behavioral assessment has been described in previous studies[5,14].
Behavioral differences based on sex, qualification status, and breed were evaluated using the Wilcoxon rank-sum test implemented in R software (version 3.6). Multiple comparisons (seven traits based on sex, qualification, and breeds) were conducted, and the Bonferroni correction was applied. The significance level of the corrected P value was 0.0014 (0.05/35).
Genomic DNA of all dogs was extracted from drawn blood, which was obtained via syringe, using a DNeasy Blood and Tissue Kit (Qiagen, CA, USA), according to manufacturer’s instructions. The Canine 230K Consortium BeadChip Array (Illumina, CA, USA) for genome-wide SNP genotyping was then employed following the standard protocols provided by the manufacturer. Prior to genomic analyses, quality control for samples and markers was performed using PLINK v1.90 and v2.00a2LM with the following settings: missingness per dog < 0.05; miner allele frequency > 0.01; missingness per marker < 0.05; remove high linkage disequilibrium SNPs by --indep-pairwise option including 50 kb windows, 10 SNPs step, and r2 threshold 0.1; sex-check by confirming genetic sex and interview sheet; remove genetically identical dogs based on kinship value exceeding 0.354; exclude mitochondrial DNA, sex chromosomes, and unlocalized SNPs.
Population genetic analysis
We determined the inbreeding coefficient and performed principal component analysis (PCA) to evaluate the population genetic structure in target populations. The inbreeding coefficient considering runs of homozygosity (FROH) was calculated using PLINK v1.90 and detectRUNS v.0.9.6 R package (options: maxOppRun = 0, maxMissRun = 0, minSNP = 2, minLengthBps = 100, and maxGap = 500,000), and was based on the methods of sliding windows and consecutive runs. The difference in FROH between breeds were tested using the Welch Two Sample t-test implemented in R. Statistical significance was set at 0.05. PCA was performed using the --pca option implemented in PLINK v1.90.
Genomic heritability estimation
The genomic heritability for each trait and qualification was estimated using Genome-wide Complex Trait Analysis (GCTA) software v1.91.7beta. The log-likelihood ratio test was used to estimate genetic variance, residual variance, phenotypic variance, and standard errors. Genomic heritability was estimated based on genome-wide SNP data as a ratio of genetic variance to phenotypic variance. SNP heritability (h2SNP) was tested using the Genomic Restricted Maximum Likelihood (GREML) method implemented in GCTA software.
Further, the Pearson’s product-moment correlation coefficient was used to reveal the relationships of heritability estimates between breeds. The significance level was set as 0.05 for each heritability analysis.
Genome-wide association study
In the GWAS, a linear mixed model was fitted using GEMMA software. We tested the associations between SNP genotypes and behavioral scores for seven traits (1 to 5) or binary for qualification (qualified or unqualified). To handle the effect of population structure on the GWAS, we used a centered relatedness matrix option (-gk 1) in GEMMA as a random effect. P-values were calculated using the Wald test. The genomic inflation factor (λ) was calculated using the R package GenABEL v. 1.8. to estimate the effect of the population genetic structure on GWAS results. The λ values were estimated using a regression model. To address the multiple testing problem, we used a test measuring the proportion of false positives incurred (the false discovery rate, FDR). The FDR was calculated using p.adjust function in R software (version 3.6). The genome-wide significance and suggestive level thresholds for adjusted p-values were set at 0.05 and 0.10, respectively.
Candidate SNP and gene analysis
A subsequent analysis based on linkage disequilibrium was performed to identify the potentially effective genes for each trait. We denoted the candidate loci associated with traits as follows. The candidate SNPs exceeding significance and suggestive level thresholds were subjected to uncover the candidate genes. All gene data sets were obtained from Ensembl genome browser (release 104, CanFam 3.1). We denoted the length between adjacent SNPs as within 200 kb. When the SNP did not share the high-LD SNPs then the region located + /− 200 kb of the SNP was denoted as the same locus for target. The bedtools v2.27.1 were used to extract the Ensembl gene IDs from the gene feature format (GTF) file downloaded from the Ensembl genome browser (release 104, CanFam 3.1).
To determine the functional classes of genes associated with the analyzed traits, we used the gene ontology (GO) enrichment analysis powered by PANTHER. The target genes were selected using the above linkage disequilibrium-based analysis. PANTHER Overrepresentation Test (Released 20210224) was used for GO enrichment analysis. We used “Biological process” for annotated analysis with PANTHER 16.0. for domestic dog data sets (Canis lupus familiaris) as the analysis type. Ensembl gene IDs were used for all annotation data.
As in vivo evidence in mice facilitates the search of genes associated with traits, we used Mouse Genomics Informatics (MGI) database (v 6.17) to investigate the relationships between genes and phenotypes (http://www.informatics.jax.org/). The phenotype annotations related to “behavior/neurological” categoly were targeted to identify the relationship.