Genetic and Metabolomic Differentiation of Physalis ixocarpa Brot. Populations in Michoacan State, Mexico

Physalis ixocarpa Brot. is a native species consumed in many localities of the Cienega-Chapala region in Mexico's Michoacan state. Its fruits are cultivated and collected within traditional maize fields. The fruits are similar to those of P. philadelphica, but they differ in size and organoleptic properties (flavor, sweetness). Given the history of domestication in this zone of Mexico, it is possible that P. ixocarpa shows incipient signals of differentiation in its genetic structure and metabolomic fingerprint. Our objective was to find evidence of genetic and metabolomic differentiation among populations of P. ixocarpa in the Cienega-Chapala region. We sequenced the chloroplast intergenic regions psbJ-petA and trnL-rpL32 and obtained metabolomic fingerprints by GC-MS. The results showed genetic differentiation (FST) and signatures of selection (Fu's Fs neutrality test) among populations. Moreover, the metabolomic fingerprinting showed differences among populations and an increase in aldehydes, aromatic aldehydes, esters, and alcohols related to the organoleptic properties of P. ixocarpa. We conclude that P. ixocarpa is an important genetic resource with signatures of differentiation in the Cienega-Chapala region, Michoacan state, Mexico, which could eventually be related to domestication.


Introduction
Mexico is the center of diversity of the genus Physalis, with around 50 native species (Zamora-Tavares et al. 2015). The Cienega-Chapala region, which belongs to the Balsas Basin, is an important center of domestication where some Physalis species produce edible fruits that are consumed by local people (Vargas-Ponce et al. 2016). In this region, some species of Physalis are cultivated inside corn fields (the milpa system); for this reason, the fruits are known as "tomate milpero". One of the harvested species is Physalis ixocarpa Brot. The shape of its fruits is similar to that of the domesticated husk tomato (P. philadelphica), but the principal difference between the species is fruit size: P. philadelphica fruits are around five times bigger than those of P. ixocarpa. Thus, some authors have proposed that P. ixocarpa is the same species as, or the wild form of, the currently cultivated husk tomato (Morales-Contreras 2018; Ramos-López and Morales 2018; Ayala-Armenta et al. 2020). Local people highly value P. ixocarpa fruits because they have a better flavor than commercial husk tomatoes. In the markets, a kilogram of this species sells for about four times the price of cultivated tomatoes (Torres-García, personal observation). Genetic analyses (sequencing, molecular markers, genomics, among others) have been used to study domestication processes in many species with reliable results (Gross and Olsen 2010). However, the use of other approaches for phenotyping has lagged behind genetic approaches (García-Flores et al. 2015). The use of non-targeted metabolomics to produce metabolic fingerprints allows a larger number of traits to be determined with high levels of confidence (Sumner et al. [...]). For this reason, this study hypothesized that P. ixocarpa populations in the Cienega-Chapala region would show evidence of genetic and metabolomic differentiation caused by farmers' management.
The objectives of this study were 1) to determine the genetic differentiation of P. ixocarpa populations using chloroplast intergenic sequences (psbJ-petA and trnL-rpL32), and 2) to determine the metabolomic differentiation using metabolomic fingerprinting by GC-MS.

Study Area
This study was carried out in the region known as the extreme northwest Balsas-Jalisco region, in Mexico's Michoacan state. The region has a complex topography due to the Western Sierra Madre, so many climatic conditions can be found within short distances. The annual mean temperature ranges from around 14 to 18°C, the altitude from 1500 to 1700 m a.s.l., and the annual precipitation from 400 to 800 mm.
Agricultural activities in this region are influenced by Lake Chapala, the biggest lake in Mexico, with an area of 1100 km². However, the lake has shrunk because of natural and human events (such as dam construction). These events left new arable land where agricultural areas have been established. Currently, this zone is one of the principal berry- and avocado-producing zones in the world.

Collection of Biological Material
The biological samples used in this study were obtained in the region's itinerant local markets, known as "tianguis", where local producers sell their products every week. Six local markets were visited, each located in the center of its municipality. The municipalities sampled were Chavinda, Jacona, Tangamandapio, Villamar, Sahuayo, and La Barca (Fig. 1). We used the domesticated husk tomato (P. philadelphica) as a reference group.
In each market, local producers were located and asked about the origin of their husk tomatoes. The husk tomatoes were bought from farmers who cultivated and collected their own P. ixocarpa fruits. In La Barca's market, the sellers could not identify the origin of their husk tomatoes. La Barca is one of the largest cities of the region, and its principal economic activity is intensive agriculture. Nevertheless, this municipality was included because it is the eastern limit of the Cienega-Chapala region.
The samples were washed twice with soap and water and then three times with distilled water.
The samples were then separated, treating each fruit as a different individual, and lyophilized in a vacuum chamber at −50°C for 72 h. Lyophilized samples were stored in airtight containers until use.

Amplification by PCR of Intergenic Chloroplast Regions psbJ-petA and trnL-rpL32
Individual lyophilized samples were ground and used for DNA extraction with the CTAB method (Chen and Ronald 1999). Four other species of the genus were used as outgroups: P. angulata, P. lagascae, P. peruviana, and P. pruinosa. These species were obtained from the herbarium of CIIDIR-IPN, Unidad Michoacán (Centro Interdisciplinario de Investigación para el Desarrollo Integral Regional). The quality of the extraction was verified on a 1% agarose gel in 1 % TBA buffer at 76 V for 40 min.
For the amplification of the intergenic chloroplast regions, five individuals from each population were amplified with a set of primers designed by Shaw et al. (2005). These primers are universal across plant species; in addition, chloroplast regions are better suited to establishing the phylogenetic relationships among populations (Shaw et al. 2005). The two amplified regions correspond to psbJ-petA and trnL-rpL32.
The PCR mix contained 1x buffer, 10 mM dNTPs, 1 µL of DNA, 1.5 mM MgCl2, 1 U of Taq (GoTaq® Flexi DNA polymerase; Promega), and 10 mM of each primer in a final volume of 15 µL. The amplification conditions were as follows: denaturation at 80°C for 5 min, followed by 30 cycles of 95°C for 1 min, 50-65°C (gradient of 0.25°C every 1 s) for 1 min, and 72°C for 4 min, followed by a final extension at 72°C for 5 min. The amplicons were sent to Macrogen (Korea) for sequencing.

Sequence Analyses
Sequence analysis was performed using the software MEGA (Molecular Evolutionary Genetics Analysis) version 7 (Kumar et al. 2016). We confirmed the quality and effective size of the sequences included in the alignment before analysis. The sequences were aligned separately for each intergenic region; the two alignments were then concatenated, indicating each fragment's start and end site. Each region's alignment included sequences of P. angulata, P. lagascae, P. peruviana, and P. pruinosa, which were used as outgroups. A model test was run to find the best-fitting evolutionary model.
Genetic diversity (π) was quantified within populations based on the number of mutations. Additionally, Fu's Fs neutrality test was calculated for each population. We evaluated the genetic distance between populations (FST) with a test performed in Arlequin v.3 (Excoffier et al. 2005), using 100 000 steps in the Markov chain and 1000 dememorization steps, with a significance level of 0.05. To establish whether the genetic distance between accessions is independent of geographic location, we performed a Mantel test with 1000 permutations.
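As an illustration of the diversity statistic mentioned above, nucleotide diversity (π) can be computed as the average proportion of differing sites over all pairs of aligned sequences. The following is a minimal pure-Python sketch under that definition; the function names and the toy alignment are hypothetical, gaps and missing data are not handled, and it is not the MEGA or Arlequin implementation.

```python
from itertools import combinations


def pairwise_differences(seq_a, seq_b):
    """Count the sites at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


def nucleotide_diversity(sequences):
    """Average proportion of differing sites over all sequence pairs."""
    if len(sequences) < 2:
        raise ValueError("need at least two sequences")
    length = len(sequences[0])
    pairs = list(combinations(sequences, 2))
    total = sum(pairwise_differences(a, b) for a, b in pairs)
    return total / (len(pairs) * length)


# Hypothetical toy alignment (not data from this study).
toy_alignment = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC"]
pi = nucleotide_diversity(toy_alignment)
print(round(pi, 4))
```

In the toy alignment the three pairs differ at 1, 1, and 2 sites out of 10, so π is 4/30.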

Phylogenetic Reconstruction
Phylogenetic reconstruction of the different populations was carried out using two approaches. The first was a phylogenetic tree built with 1000 bootstrap replicates under the maximum likelihood method, with gamma distribution and invariant sites, under the Tamura 3-parameter evolutionary model. Additionally, because genetic differentiation is a continuum, and to trace the possible expansion of P. ixocarpa populations, a haplotype network was estimated. This network was constructed in the PopArt software (Leigh and Bryant 2015) using the TCS method (Clement et al. 2000).

Metabolomic Fingerprinting Using GC-MS
Fifty milligrams of lyophilized, ground tissue from individual samples were placed in amber vials (five individuals for each population). Vials were incubated at 80°C for 1 h. After this time, solid-phase microextraction (SPME) was performed by inserting the fiber holder into the vials and waiting 10 min for each sample. Samples were injected into a gas chromatograph (Clarus 680, Perkin-Elmer Inc., Waltham, MA, USA) equipped with a capillary column with a 5% diphenyl 95% dimethylpolysiloxane phase, 30 m long, 0.32 mm i.d., 0.25 µm film thickness, and temperature limits between −60 and 320/350°C (Elite-5 MS, Perkin-Elmer Inc., Waltham, MA, USA). Helium was used at a constant flow rate of 1 mL min⁻¹, with an initial wait time of 0.5 min. The column temperature was initially held at 50°C for 1 min and then ramped to 250°C at 30°C/min, remaining at this temperature for a further 10 min. The temperature of the injector was 230°C. A mass spectrometer (Clarus SQ8T, Perkin-Elmer Inc., Waltham, MA, USA) with an electron impact ionization source (70 eV) was used in full scan mode. The analysis range was 40-500 m/z. The transfer line and ionization source temperatures were 230 and 250°C, respectively.
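The oven program described above implies a fixed run time per sample, which can be checked with a short calculation. This is a minimal sketch with a hypothetical helper name; it counts only the initial hold, the linear ramp, and the final hold, and it assumes the 0.5 min initial wait is not part of the oven program.

```python
def oven_program_minutes(initial_hold, start_temp, end_temp, ramp_rate, final_hold):
    """Total oven time in minutes: initial hold + linear ramp + final hold."""
    ramp_time = (end_temp - start_temp) / ramp_rate
    return initial_hold + ramp_time + final_hold


# Values taken from the text: 50°C for 1 min, 30°C/min to 250°C, hold 10 min.
total = oven_program_minutes(1, 50, 250, 30, 10)
print(f"{total:.2f} min")
```

The ramp from 50 to 250°C at 30°C/min takes 200/30 min, so the program lasts a little under 18 min per injection.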
Feature detection, retention time correction, and peak alignment of the original chromatograms were performed in XCMS Online (https://xcmsonline.scripps.edu) (Tautenhahn et al. 2012). To avoid false positives in the detection of metabolites, we only used metabolites with q-values ≤ 0.05. A Principal Component Analysis (PCA) was performed to select the metabolites that contributed significantly to the metabolomic differentiation. The principal metabolites were annotated with the NIST library using a cutoff value of 0.8.
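The q-value filter above can be sketched as follows. The text does not state how the q-values were derived, so this example assumes the Benjamini-Hochberg adjustment as one common choice; the function names, feature labels, and p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted q-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so q-values stay monotonic.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


def keep_features(p_by_feature, threshold=0.05):
    """Keep only the features whose q-value is at or below the threshold."""
    names = list(p_by_feature)
    q_values = benjamini_hochberg([p_by_feature[name] for name in names])
    return {name: q for name, q in zip(names, q_values) if q <= threshold}


# Hypothetical p-values for four features (not data from this study).
toy_p = {"feat_1": 0.001, "feat_2": 0.02, "feat_3": 0.04, "feat_4": 0.30}
selected = keep_features(toy_p)
print(sorted(selected))
```

Here feat_3 has a raw p-value below 0.05 but is dropped after adjustment, which is the kind of false positive the filter is meant to remove.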
The results were represented in a heatmap with biclustering. An ion matrix was constructed using the metabolites with the highest differentiation levels. The heatmap was built on the MetaboAnalyst platform (www.metaboanalyst.ca) (Chong et al. 2018). For the heatmap, the data were normalized and auto-scaled. The dendrograms used the Minkowski distance as the distance function and the Ward clustering algorithm; the significance level for the branches was p ≤ 0.05.
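The auto-scaling and distance steps behind such a heatmap can be sketched in a few lines. This is a minimal illustration, not MetaboAnalyst's code: the intensity matrix is hypothetical, the Minkowski order p is not given in the text (p = 2, the Euclidean case, is assumed here), and the Ward clustering step is omitted.

```python
from statistics import mean, stdev


def autoscale(matrix):
    """Center each column (metabolite) to mean 0 and scale to unit sample SD."""
    columns = list(zip(*matrix))
    means = [mean(col) for col in columns]
    sds = [stdev(col) for col in columns]
    return [[(value - m) / s for value, m, s in zip(row, means, sds)]
            for row in matrix]


def minkowski(u, v, p=2):
    """Minkowski distance of order p (p=1 Manhattan, p=2 Euclidean)."""
    return sum(abs(a - b) ** p for a, b in zip(u, v)) ** (1 / p)


def distance_matrix(rows, p=2):
    """Pairwise Minkowski distances between samples (rows)."""
    return [[minkowski(a, b, p) for b in rows] for a in rows]


# Hypothetical intensities: 4 samples (rows) x 3 metabolites (columns).
toy_intensities = [
    [120.0, 15.0, 3.0],
    [130.0, 14.0, 2.5],
    [400.0, 40.0, 9.0],
    [390.0, 42.0, 8.0],
]
scaled = autoscale(toy_intensities)
distances = distance_matrix(scaled)
```

Auto-scaling keeps high-abundance metabolites from dominating the distances, so the first two toy samples end up close to each other and far from the last two.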

Genetic Differentiation Among Populations
Each of the intergenic regions amplified by PCR was around 550 bp long. The final alignment had 1023 sites. The number of transversions was higher than the number of transitions in most populations, except in Santiago (Table 1). The population from La Barca showed the highest number of variable sites (59), whereas Jacona showed the lowest (eight). Nucleotide diversity followed the same trend as the number of variable sites: La Barca showed the highest diversity and Jacona the lowest. The reference group, P. philadelphica, had 14 variable sites and one of the lowest nucleotide diversities.

Genetic Differentiation
The genetic differentiation measured with FST indicated seven groups (Table 2). The first group was formed by the populations from Chavinda, Jacona, Sahuayo, and Santiago; the second by Jacona and Chavinda; the third by Sahuayo and Chavinda; and the fourth by Santiago, Chavinda, and Sahuayo. Villamar, La Barca, and the commercial husk tomato were each considered an independent population. The isolation-by-distance analysis showed no correlation between geographic distance (km) and genetic differentiation (FST) (R² = 0.01944).
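The isolation-by-distance check reported above rests on a Mantel test, which correlates two distance matrices and assesses significance by permuting population labels. The sketch below is a minimal pure-Python version under stated assumptions: the function names and both toy matrices are hypothetical, the p-value is one-sided, and it is not the implementation used in this study.

```python
import random
from itertools import combinations


def upper_triangle(matrix, order):
    """Flatten the above-diagonal entries after reordering rows and columns."""
    return [matrix[order[i]][order[j]]
            for i, j in combinations(range(len(order)), 2)]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def mantel_test(geo, gen, permutations=1000, seed=0):
    """Return (observed r, one-sided permutation p-value)."""
    identity = list(range(len(geo)))
    geo_flat = upper_triangle(geo, identity)
    observed = pearson(geo_flat, upper_triangle(gen, identity))
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # relabel populations in the genetic matrix
        if pearson(geo_flat, upper_triangle(gen, order)) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Hypothetical symmetric matrices for four populations (km and FST).
toy_geo = [[0, 10, 20, 30], [10, 0, 15, 25], [20, 15, 0, 12], [30, 25, 12, 0]]
toy_gen = [[0, 0.05, 0.20, 0.10], [0.05, 0, 0.12, 0.30],
           [0.20, 0.12, 0, 0.08], [0.10, 0.30, 0.08, 0]]
r, p_value = mantel_test(toy_geo, toy_gen)
r_squared = r ** 2
```

An R² close to zero, as reported in the text, means that geographic distance explains almost none of the variation in FST.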

Phylogenetic Reconstruction
The phylogenetic tree shows that three of the four species used as outgroups (P. angulata, P. lagascae, P. peruviana, and P. pruinosa) were grouped in the tree's basal branch (Fig. 2). The most ancestral P. ixocarpa populations were Sahuayo and Villamar, followed by Chavinda, Jacona, and Santiago. However, in the most recent branches, P. lagascae was placed between the P. ixocarpa from La Barca and the commercial husk tomatoes (P. philadelphica). This result suggests that the tomates milperos sold in La Barca belong to a species other than P. ixocarpa.
In the haplotype network, the ancestry of the populations places Chavinda at the network center (Fig. 3). As in the phylogenetic tree, the commercial husk tomato and La Barca's accessions appear at the network's extremes.

Metabolomic Fingerprinting
The metabolomic fingerprinting of the different accessions of tomates milperos by GC-MS detected 552 metabolites with q-values ≤ 0.05. The PCA reduced the number of metabolites to 34 in the first two components, which explained 80% of the total variation among samples. On the upper side of the heatmap, the dendrogram shows the grouping of populations based on the 34 most important metabolites; this dendrogram has two principal branches (Fig. 4). La Barca, Sahuayo, and Villamar were grouped in the left branch. Within this branch, two sub-branches can be observed: Sahuayo and Villamar are sister groups, while the La Barca population was excluded.
The second branch (right side of the heatmap) contains two sub-branches. One is formed by Santiago and Chavinda; these populations did not show statistical differences in their metabolomic fingerprints. The other sub-branch is formed by Jacona and, as the most divergent population in this sub-branch, the samples of commercial husk tomatoes (P. philadelphica).
The heatmap's left side shows the grouping dendrogram of metabolites; this dendrogram is divided into two principal branches. The first branch is located in the [...] can thus give an idea of the high variation levels and distinct degrees of domestication found in this area.
The two approaches to phylogenetic reconstruction gave different results. In the phylogenetic tree, the species used as outgroups were separated from the P. ixocarpa populations, as expected. However, the P. lagascae sample was grouped among the P. ixocarpa populations. Nevertheless, in the case of La Barca's population, the samples were collected in the local market and not directly from the farmers. This opens the possibility that the samples correspond to another species, such as Physalis angulata, or to small fruits of P. philadelphica. In some regions of Jalisco, Physalis angulata is consumed and is also called tomate milpero. Another critical result of the phylogenetic tree is that P. ixocarpa is not the ancestor of the present-day husk tomato (P. philadelphica); according to the branching pattern, the ancestor of P. philadelphica may be P. lagascae.
The haplotype network showed different results from the phylogenetic tree. Chavinda was located in the center of the network, which indicates that this municipality probably holds the most conserved genetics of the populations sampled in this study. Fu's Fs showed that this population had the lowest value of all populations, which means that the selection process has not been as intense for this population. The individuals from La Barca and the commercial husk tomato (P. philadelphica) were located at the network's extremes; this confirms the grouping estimated by FST and the phylogenetic tree. The FST showed that the La Barca population did not have any genetic relationship with the other populations.
The genetic analyses provided evidence of population differentiation, probably caused by human selection. In addition, the metabolomic fingerprinting gave evidence of the traits under selection. The heatmap showed differences among populations and confirmed that La Barca and P. philadelphica differ from the rest of the populations. This study contributes to the knowledge of a poorly studied species with high potential for use. The evidence showed that tomates milperos (P. ixocarpa) cultivated in the Cienega-Chapala region, Michoacan State, Mexico, have been under a continuous selection process. This differentiation process has changed some organoleptic properties, and consumers in the region value these characteristics highly.

Conflicts of interest/Competing interests: Not applicable
Availability of data and material: Available
Code availability: Not applicable