## 2.1. Study area and data collection

**Study area and plant material**

The study area comprises the Saharan and Sahelian zones, the two main biogeographical regions in which date palm grows in Niger (Fig. 1). The Saharan zone is generally divided into the Grand Nord and Air regions, while the Sahelian zone is divided into the Manga and Damagaram regions (Fig. 1). The plant material consists of date palm leaf samples collected from Iférouane, Timia, Ingall, Bilma, Dirkou, Aguer, Aney, Achenouma, Siguidine and Djado (ten villages of the Sahara in the Agadez region) and from Guidimouni, Dan Kalou, Riria Loloji, Wacha, Lakiré, Jambirji, Kilboa, Anassaboul, Broumwadi and Madkwari (ten villages of the Sahel in the Zinder and Damagaram regions) (Fig. 1). The samples were collected according to the geographical distribution of the species in order to cover the greatest possible genetic diversity in the Saharan and Sahelian zones.

**Data collection**

DNA was extracted from the young date palm leaves using the DNeasy Plant mini kit (Qiagen S.A., Courtaboeuf, France). DNA purification was performed with the PureLink purification kit (Invitrogen). After purification, the DNA concentrations were determined using the GeneQuant spectrometer (Amersham Pharmacia Biotech, France).

Eighteen (18) microsatellite markers (Simple Sequence Repeats, SSRs; Billotte et al., 2004; Ludeña et al., 2011; Aberlenc-Bertossi et al., 2014; Zehdi-Azouzi et al., 2015) and one chloroplast minisatellite marker (Henderson et al., 2006) were used to collect genetic information from 140 date palm samples from twenty (20) palm groves (one per village) in four (4) sub-regions across the two (2) main zones.

## 2.2. Data analysis

To compare the spatial relationships between subpopulations, we used Principal Component Analysis (PCA) to assess the spatial genetic distribution of the population units along spatial gradients. A Principal Coordinates Analysis (PCoA) (Peakall & Smouse, 2006) was performed on the genetic distance matrices constructed from allele frequencies at the SSR markers. Genetic variation between subpopulations was evaluated using the genetic differentiation index (Wright, 1951). To analyze the relationship between date palm cultivation activities in the palm groves, the geographical distribution and the genetic structure of date palm subpopulations, the Mantel test (Peakall & Smouse, 2006) was performed. Two correlations were thus established: one between the genetic and geographical distance matrices, and one between the genetic distance matrix and the matrix of fidelity to cultural practices.

The matrix of genetic distances, expressed as the proportion of genetic differentiation between subpopulations, *Fst* (Meirmans, 2011), was determined by:

$$F_{st}=\frac{H_{T}-H_{S}}{H_{T}} \qquad (1)$$

The Euclidean distance matrices for the implementation of date palm cultural practices and for geographical distances were estimated by:

$$D_{i,j}=\sqrt{(x_{i}-x_{j})^{2}+(y_{i}-y_{j})^{2}+\dots} \qquad (2)$$

where \(x_{i}\) and \(x_{j}\) are the values of the \(i^{th}\) and \(j^{th}\) individuals at locus \(x\), and \(y_{i}\) and \(y_{j}\) are their values at locus \(y\); \(H_{T}=1-\sum_{i=1}^{K}\bar{P}_{i}^{2}\) is the total expected heterozygosity of a population in Hardy-Weinberg equilibrium; \(H_{S}=1-\sum P_{i}^{2}\) is the average intra-population heterozygosity; \(K\) is the number of subpopulations; and \(P_{ix}\) and \(P_{iy}\) are the frequencies of the \(i^{th}\) allele in subpopulations \(x\) and \(y\).
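As an illustration of Eqs. (1) and (2), the following minimal Python sketch computes *Fst* for a single locus and a Euclidean distance between two observation vectors. It is not the code used in the study (the analyses were run in R); the function names, the equal weighting of subpopulations and the example frequencies are assumptions made purely for illustration.

```python
from math import sqrt

def expected_heterozygosity(freqs):
    """Return 1 - sum(p_i^2) for one list of allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)

def fst(subpop_freqs):
    """Eq. (1) for one locus.

    subpop_freqs: one list of allele frequencies per subpopulation,
    all lists in the same allele order (hypothetical input layout).
    Assumption: subpopulations are weighted equally.
    """
    k = len(subpop_freqs)
    n_alleles = len(subpop_freqs[0])
    # H_S: expected heterozygosity averaged over subpopulations
    h_s = sum(expected_heterozygosity(f) for f in subpop_freqs) / k
    # H_T: expected heterozygosity of the mean allele frequencies
    mean_freqs = [sum(f[a] for f in subpop_freqs) / k for a in range(n_alleles)]
    h_t = expected_heterozygosity(mean_freqs)
    return (h_t - h_s) / h_t

def euclidean(u, v):
    """Eq. (2): Euclidean distance between two equal-length vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Hypothetical frequencies of two alleles in two subpopulations
example_fst = fst([[0.8, 0.2], [0.4, 0.6]])
# Hypothetical coordinates of two palm groves
example_distance = euclidean([0.0, 0.0], [3.0, 4.0])
```

In a multi-locus setting such as the eighteen SSR markers used here, \(H_{S}\) and \(H_{T}\) would typically be averaged over loci before applying Eq. (1); that step is omitted from the sketch.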

The correlation between the matrices of genetic distances, geographical distances and fidelity to cultural practices was assessed using simple and partial Mantel tests (Peakall & Smouse, 2006):

$$r_{mAB}=\frac{1}{N-1}\sum_{i=1}^{N}\sum_{j=1}^{N}\left[\frac{A_{ij}-\bar{A}}{S_{A}}\right]\left[\frac{B_{ij}-\bar{B}}{S_{B}}\right] \qquad (3)$$

where \(N\) is the number of elements in the upper or lower triangular part of the matrix; \(A\) and \(B\) are the two distance matrices; \(\bar{A}\) and \(\bar{B}\) are their respective means; \(S_{A}\) and \(S_{B}\) are their respective standard deviations; and \(r_{mAB}\) is the Mantel correlation coefficient. The value of the association parameter, or correlation coefficient, between the two matrices is calculated from the real data and then compared with the series of pseudo-values obtained by random permutation. The permutation test assumes that matrices \(A\) and \(B\) have the same size (\(n\times n\)) and that their rows and columns correspond to the same objects.

**Permutation design**

The following steps were used: (i) calculate the correlation coefficient \(r_{mAB}\) between the two matrices using Eq. (3); (ii) randomly permute the rows and the corresponding columns of one of the matrices to create a new matrix \(A'\); (iii) calculate a new correlation coefficient \(r_{mA'B}\) between the permuted matrix and the non-permuted matrix using Eq. (3); (iv) repeat steps (ii) and (iii) 100,000 times for better precision of the test.
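The permutation steps above can be sketched in Python as follows. This is an illustrative sketch of a simple Mantel test, not the R code used in the study. The function names, the one-sided p-value with its +1 correction and the toy matrices are assumptions, and the partial Mantel test is not covered.

```python
import random
from math import sqrt

def upper_triangle(m):
    """Flatten the strict upper triangle of a square matrix into a list."""
    n = len(m)
    return [m[i][j] for i in range(n) for j in range(i + 1, n)]

def pearson(x, y):
    """Standardized cross-product of two equal-length lists, as in Eq. (3)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sx = sqrt(sum((v - mx) ** 2 for v in x) / (n - 1))
    sy = sqrt(sum((v - my) ** 2 for v in y) / (n - 1))
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / ((n - 1) * sx * sy)

def mantel_test(a, b, n_perm=100_000, seed=0):
    """Simple Mantel test between two square distance matrices.

    Returns the observed correlation and a one-sided permutation p-value
    (assumption: the p-value convention is not stated in the text).
    """
    rng = random.Random(seed)
    n = len(a)
    b_vec = upper_triangle(b)
    # Step (i): observed correlation
    r_obs = pearson(upper_triangle(a), b_vec)
    count = 0
    for _ in range(n_perm):
        # Step (ii): permute rows and columns of A with the same ordering
        order = list(range(n))
        rng.shuffle(order)
        a_perm = [[a[order[i]][order[j]] for j in range(n)] for i in range(n)]
        # Step (iii): correlation between permuted A and unpermuted B
        if pearson(upper_triangle(a_perm), b_vec) >= r_obs:
            count += 1
    # Step (iv) is the loop above; the observed value counts as one permutation
    return r_obs, (count + 1) / (n_perm + 1)

# Toy example: distances between four hypothetical sites on a line
coords = [0.0, 1.0, 3.0, 7.0]
dist = [[abs(p - q) for q in coords] for p in coords]
r_demo, p_demo = mantel_test(dist, dist, n_perm=999)
```

Permuting rows and columns together keeps each permuted matrix a valid distance matrix among the same objects, which is why the ordering is shuffled once and applied to both indices.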

All tests and analyses in this study were carried out in R version 3.6.0 (R Development Core Team, 2019).