Development of highly validated SNP markers for genetic analyses of chestnut species

To better study and manage chestnut trees and species, we identified nuclear single nucleotide polymorphism (SNP) markers using restriction-associated DNA sequencing. Out of 343 loci tested, 68 SNP markers were selected that met stringent quality criteria such as quasi-systematic amplification across species and Mendelian segregation in both purebred and hybrid individuals. They provide sufficient power for characterizing species, hybrids and backcrosses as well as for clonal identification, as shown by a comparison with simple sequence repeat (SSR) loci.

Chestnuts are self-incompatible, insect-pollinated Fagaceae trees from the Northern Hemisphere (Stout 1926; Xiong et al. 2019; Barreneche et al. 2019; Larue et al. 2021a, b). Three species are widely cultivated for their nutritious nuts: the Japanese (Castanea crenata), Chinese (C. mollissima) and European (C. sativa) chestnuts (Barreneche et al. 2019). C. sativa is highly vulnerable to ink disease and chestnut blight caused by pathogens originating from Asia (Gonthier and Robin 2019). Hybrids between Asian species and C. sativa proved resistant to ink disease and were selected for cultivation in Europe. Genetic markers could help differentiate chestnut species, hybrids and other introgressed material as well as varieties, thereby facilitating the management of orchards. Such markers could also clarify the status of natural chestnut stands threatened by the spread of diseases and by genetic pollution. A number of molecular markers have previously been developed in chestnuts, especially SSRs (e.g. Buck et al. 2003; Marinoni et al. 2003; Durand et al. 2010; Laurent et al. 2020), and have been widely used (e.g. Barreneche et al. 2004; Casasoli et al. 2006; Bodénès et al. 2012; Fernández-Cruz and Fernández-López 2012; Mattioni et al. 2013; Pereira-Lorenzo et al. 2017; Bouffartigue et al. 2020; Nishio et al. 2021). However, SNPs have some important advantages over SSRs (Guichoux et al. 2011). First, genotyping errors are much rarer with SNPs than with SSRs, facilitating standardisation across laboratories. Second, SNP genotyping platforms make it possible to quickly characterize and score a large number of samples at reduced cost. Although SNPs have already been developed in chestnuts (Santos et al. 2017; García et al. 2018; Nunziata et al. 2020; Sun et al. 2020), no SNP assay has been designed and optimized for the applications mentioned above.
We first tested the markers on a set of 95 samples including the three sequenced parents, the offspring of two interspecific crosses (Ca 577 × Ca 737 and Ca 577 × Ca 04) and nine French cultivars. Their DNA was isolated from leaves dried in silica gel using the Qiagen DNeasy 96 Plant kit. We then checked the markers on another set of 95 unique genotypes from the INRAE chestnut germplasm collection, which includes the three chestnut species and several F1, F2 and advanced hybrids. Their DNA was isolated from frozen leaves using a modified CTAB protocol (Supplementary 1) adapted from Doyle and Doyle (1987).
We selected 343 candidate SNPs, including 37 loci from Santos et al. (2017) and 306 loci originating from a restriction-associated DNA sequencing experiment (García et al. 2018). These loci were successfully sequenced in all three parents, were heterozygous in at least one of them, and lacked variation within at least 50 bp around the SNP position. We designed nine MassARRAY multiplexes (Assay Design Suite v2.0, Agena Bioscience, San Diego, USA) of up to 40 loci each. Data analysis relied on MassARRAY Typer Analyzer 4.0.26.75 (Agena Bioscience). We excluded all monomorphic SNPs, loci with a weak or ambiguous signal (i.e., displaying more than three genotype clusters or unclear cluster delimitation) and loci with > 10% missing data. Out of the 343 loci tested, 68 were validated (Larue 2021, File 1).
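The two exclusion criteria that can be applied directly to genotype calls (monomorphic loci and loci with > 10% missing data) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the genotype coding (two-letter strings such as "AG", with None for a failed call) and the example data are assumptions. The signal-quality criterion is not covered, since it depends on cluster plots rather than on the calls themselves.

```python
def keep_locus(calls, max_missing=0.10):
    """Return True if a locus passes the missing-data and polymorphism filters.

    calls: one genotype call per sample, e.g. "AA", "AG", "GG",
    or None for a failed call (assumed coding).
    """
    n = len(calls)
    called = [c for c in calls if c is not None]
    # Reject loci whose share of failed calls exceeds the threshold.
    if n == 0 or (n - len(called)) / n > max_missing:
        return False
    # A monomorphic locus shows a single allele across all called samples.
    alleles = {allele for call in called for allele in call}
    return len(alleles) > 1


def filter_loci(genotypes, max_missing=0.10):
    """Keep only loci (dict: locus name -> list of calls) that pass both filters."""
    return {
        name: calls
        for name, calls in genotypes.items()
        if keep_locus(calls, max_missing)
    }


# Hypothetical example data: three loci scored on ten samples.
example = {
    "snp_ok": ["AA", "AG", "GG", "AA", "AG", "AA", "GG", "AG", "AA", "AG"],
    "snp_mono": ["AA"] * 10,
    "snp_missing": ["AA", "AG", None, None, "GG", "AA", "AG", "AA", "GG", "AG"],
}
retained = filter_loci(example)
```

In this toy data set only "snp_ok" is retained: "snp_mono" carries a single allele and "snp_missing" has 20% failed calls.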
These markers are listed in Table 1, and 66 of the 68 SNPs are located on the chromosome assembly of the C. mollissima genome (Sun et al. 2020). Table 1 also includes allelic frequencies for the three chestnut species and for C. sativa × C. crenata hybrids, computed using GenAlEx 6.51 (Peakall and Smouse 2012).
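Per-group allelic frequencies of the kind reported in Table 1 amount to simple allele counting within each taxon. The sketch below shows that calculation under stated assumptions; it is not the GenAlEx implementation, and the function names, group labels and genotype coding (two-letter strings, None for missing calls) are hypothetical.

```python
from collections import Counter


def allele_frequencies(calls):
    """Allele frequencies at one locus, ignoring missing calls (None)."""
    counts = Counter(allele for call in calls if call is not None for allele in call)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()} if total else {}


def frequencies_by_group(calls, groups):
    """Allele frequencies at one locus, computed separately for each group label."""
    by_group = {}
    for call, group in zip(calls, groups):
        by_group.setdefault(group, []).append(call)
    return {group: allele_frequencies(c) for group, c in by_group.items()}


# Hypothetical calls at one locus for four trees of two species.
calls = ["AA", "AG", "GG", None]
groups = ["C. sativa", "C. sativa", "C. crenata", "C. crenata"]
freqs = frequencies_by_group(calls, groups)
```

Here the two C. sativa trees carry three A alleles and one G allele, and the single scored C. crenata tree is homozygous for G.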
To evaluate the utility of these markers for species and hybrid identification, we used the Bayesian clustering software Structure (Pritchard et al. 2000) and compared the results with those obtained with SSRs (Laurent et al. 2020). A total of 91 unique genotypes were characterized with both types of markers and used for the comparison (Larue 2021, Files 3 and 4; Supplementary 3). Three clear-cut genetic clusters were identified with both marker types, matching the known identity of the trees and confirming the taxonomic utility of these SNPs (Fig. 1). We also computed the probability of identity for the 68 SNPs and the 94 SSRs (Supplementary 4). For the SNPs, these probabilities were all close to zero, showing that all chestnut genotypes can be easily differentiated with these markers. To conclude, the developed SNPs are suitable for the identification of chestnut cultivars, species and hybrids. They should help manage production orchards and monitor the few remaining wild European chestnut stands.
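The probability of identity mentioned above can be illustrated with the standard unbiased-population formula, PI = Σ p_i^4 + Σ_{i<j} (2 p_i p_j)^2 per locus, multiplied across loci under the assumption that loci are independent. The sketch below is an illustration only: the source does not state which estimator was used, and the function names and example frequencies are hypothetical.

```python
from itertools import combinations
from math import prod


def locus_pi(freqs):
    """Probability that two unrelated individuals share a genotype at one locus.

    freqs: allele frequencies at the locus (should sum to 1).
    """
    # Probability that both individuals are the same homozygote.
    homozygous = sum(p ** 4 for p in freqs)
    # Probability that both individuals are the same heterozygote.
    heterozygous = sum((2 * p * q) ** 2 for p, q in combinations(freqs, 2))
    return homozygous + heterozygous


def multilocus_pi(loci):
    """Product of per-locus values, assuming loci are independent."""
    return prod(locus_pi(freqs) for freqs in loci)


# Hypothetical panel: 68 biallelic loci, each with two equally frequent alleles.
panel = [[0.5, 0.5]] * 68
single = locus_pi([0.5, 0.5])
combined = multilocus_pi(panel)
```

A balanced biallelic locus gives a per-locus value of 0.375, so a hypothetical panel of 68 such loci yields a combined value on the order of 10^-29. Real panels have less balanced frequencies and therefore higher values, but this shows why many biallelic markers together can discriminate genotypes effectively.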