Molecular Detection and Quantification of the Striga Seedbank in Ethiopian Sorghum Field Soils

Aims Striga hermonthica is a devastating parasitic weed in Sub-Saharan Africa (SSA), and its persistent soil seedbank is the major factor contributing to its prevalence and persistence. So far, there is little to no information on the Striga seedbank density in agricultural fields in SSA due to the lack of reliable detection and quantification methods. Methods We developed a high-throughput method that combines density- and size-based separation techniques with quantitative polymerase chain reaction (qPCR)-based detection of Striga seeds in soil. The method was optimized and validated on two physicochemically different Striga-free Dutch agricultural soils by introducing increasing numbers of Striga seeds (0, 1, 3, 9, 27, 81 and 243 seeds). Results The results showed that as few as one seed of S. hermonthica per 150 g of soil can be detected. This technique was subsequently tested on soil samples of 48 sorghum fields from different agroecological zones in Ethiopia to map the geospatial distribution of the Striga seedbank along a trajectory of more than 1500 km. Considerable variation in Striga seed densities was observed for these soils: Striga seeds were detectable in 75% of the field soils, at densities of up to 86 seeds per 150 g of soil. Correlation analyses further revealed a significant non-linear relationship between the seed density and Striga incidence assessed in the same sorghum field soils at the time of soil sampling. Conclusions The method developed allows for high-throughput and accurate mapping of the Striga seedbank in physicochemically diverse field soils and can be used to predict Striga incidence and to assess the impact of management strategies on Striga seedbank dynamics.


Introduction
Striga is one of the major genera of parasitic plants in Africa, Asia and Australia. More than 50 species of Striga have been reported across the globe, and S. hermonthica, S. asiatica, S. gesnerioides, S. aspera and S. forbesii are the most common and destructive in cultivated cereal and legume crops (Scholes and Press 2008; Parker 2009). Striga is responsible for more crop losses in Africa than any other weed species. It is estimated that two-thirds of the total area of cereals and legumes in sub-Saharan Africa is infested with Striga, and its spread has accelerated at an alarming rate (Parker 2012). The annual yield losses due to Striga alone were estimated at US $7 billion in Sub-Saharan Africa (SSA), posing a major threat to the livelihood of over 300 million people (Abate et al. 2014). Management of Striga in many parts of the world is constrained to a great extent by Striga seeds residing in the soil, also referred to as the "seedbank".
Striga produces 10,000-200,000 tiny seeds (0.2-0.3 mm, 4-7 µg) per plant, which survive in soil for at least two years (Hearne 2009). Managing Striga requires a better understanding of seedbank replenishment and depletion, also referred to as the seedbank dynamics. Replenishment encompasses seed production by mature Striga plants and 'immigration' of seeds from neighbouring field soils via wind and farmer activities. Seedbank depletion is caused by suicidal germination (i.e. germination in absence of a host plant), pathogen infection, seed predation, seed aging, and 'emigration' of seeds to neighbouring fields (van Mourik 2007). Hence, a methodology that allows accurate detection and quantification of the Striga seed density in agricultural fields is of paramount importance in the management of Striga in general and for understanding the seedbank dynamics in particular. The advancement of techniques to extract environmental DNA and RNA (eDNA and eRNA) from soils with different physicochemical characteristics, followed by qPCR or sequencing, has opened new means for sensitive and accurate detection and quantification of specific (micro)organisms and marker genes. Such techniques have been used for several years to detect and quantify pathogenic microorganisms in order to deploy or optimize early measures to prevent disease outbreaks in farms (Ophel-Keller et al. 2008; Taparia et al. 2020). Furthermore, integration of high-throughput eDNA extraction and qPCR can maximize the number of samples that can be processed in a single day, reducing labor costs and turnaround times (Prider et al. 2013). Recently, the use of qPCR has received attention for determining the seedbank of weeds present in a soil. Examples include the use of molecular markers and DNA-based assays for the quantification and identification of seeds of different species of the parasitic weeds Orobanche and Phelipanche (Dongo et al. 2012; Aly et al. 2012; Prider et al. 2013).
With the recent public release of the genome sequence of Striga (http://ppgp.huck.psu.edu/), we here developed a high-throughput molecular detection method for quantification of the Striga seedbank in field soils. The method encompasses several steps, starting with a density- and size-based separation of Striga seeds from the soil matrix followed by eDNA extraction and qPCR-based detection and quantification. Furthermore, the optimized protocol was then used to quantify and map the geospatial distribution of the Striga seedbank in sorghum field soils collected from different agro-ecological zones in Ethiopia covering a trajectory of more than 1500 km and to relate the seed densities to Striga incidence in these fields.

Results And Discussion
Selection of marker genes, primer design and specificity
From the parasitic plant genome project (http://ppgp.huck.psu.edu/) the Striga StHe0GB1 genome assembly was downloaded. The genome sequence was generated from Striga seeds that had been imbibed for various lengths of time, covering seeds that were conditioned as well as seeds that were treated for up to 6 h with the germination stimulant GR-24 (Westwood et al. 2012). From the assembly, several genes were selected and blasted against the nr database. We selected five genes (StHe0GB1_1, StHe0GB1_9, StHe0GB1_20, StHe0GB1_76 and StHe0GB1_93) with Striga-specific sequences as putative marker genes for Striga seed detection and quantification (Table 1). For these five genes, a total of 14 primer pairs were designed targeting the Striga-specific sequences (Table 1). Using the NCBI primer blast web tool and the nr database, the specificity of the forward and reverse primers was validated in silico (data not shown). To validate the efficacy of these primers experimentally, we extracted DNA from ~6000 Striga seeds, from 100 mg of Striga-free Dutch agricultural soil spiked with ~6000 Striga seeds, and from 100 mg of the soil sample without Striga seeds (control). All primer sets, except set 3 (targeting StHe0GB1_1), resulted in a PCR product of the predicted size (Table S1) for the samples containing the Striga seeds, whereas no amplification product was observed for any of the primer sets with DNA extracted from the soil samples without Striga seeds (Fig. 1a). Next, we tested the primer pairs in qPCR at two annealing temperatures (56 and 60 °C) to determine sensitivity, specificity and stability of the primers. All primers amplified the genomic DNA of Striga seeds at both temperatures but with different sensitivity, specificity and stability (Fig. 1b).
Primer set 14 (P14), targeting the StHe0GB1_93 gene, showed high sensitivity as manifested by a low quantification cycle (Cq) value and a single melting curve for both annealing temperatures (Fig. 1b and S1). Furthermore, P14 resulted in a PCR product of the expected size for five independent S. hermonthica seed batches and one S. asiatica seed batch collected from different agroecological zones in Ethiopia (Fig. S2). Hence, primer set P14 was selected for testing the specificity and sensitivity of PCR-based detection and quantification of Striga seeds in soil samples.
Optimizing DNA extraction and qPCR efficiency in different agricultural soils
Accurate determination of the seed density of Striga in a field soil requires a substantial amount of soil sample. Previous assays to microscopically determine Striga seed densities used 100 g soil samples (van Mourik 2007). However, such a large sample cannot be directly accommodated in the currently available high-throughput DNA extraction kits. Moreover, the application of molecular techniques for the detection and quantification of eDNA from soils may be hampered by humic acids, polysaccharides, urea, phenolic compounds and heavy metals (Frostegård et al. 1999). Here, we introduced 65 Striga seeds into seven physicochemically different Striga-free Dutch agricultural soils (Table S2), each weighing 100 mg. qPCR analysis on eDNA extracted from these 'spiked' soil samples showed variation in the mean Cq value from 27.3 to 29.3 cycles (Fig. 2a). This variation could be due to differences in soil physicochemical properties between these soils affecting the efficiency of eDNA extraction and/or qPCR analysis. Correlation analysis between the Cq values and a number of physicochemical properties of the seven soil samples revealed that Cq values were positively correlated to Fe, Mg, S, C, N, C/N and organic matter (OM) content, whereas they were negatively correlated to pH, K and P contents (albeit not statistically significant, p ≥ 0.05).
To minimize interference of soil physicochemical properties, we then tested if separation of Striga seeds from the bulk soil prior to eDNA extraction and qPCR could improve the sensitivity of Striga detection and quantification. To this end, we adopted a density-dependent K2CO3 separation of the Striga seeds from the soil matrix followed by successive sieving through two filters with meshes of 425 and 75 µm, respectively. The Striga seeds and smaller soil particles and organic debris retained on the 75 µm filters were collected and dried at 35 °C for 48 hours, followed by grinding and DNA extraction. This procedure reduced the soil volume by on average 99.7% for the two physicochemically different Dutch soils tested. With the soil volume reduced so effectively, the remaining mixture can be used directly for DNA extraction with widely available commercial extraction kits.
Next, we introduced increasing densities of Striga seeds in soils D08 and D17 at final densities of 0, 1, 3, 9, 27, 81 and 243 seeds per 150 g of soil and processed these soil samples as described above. Results of the qPCR analysis revealed that even a single Striga seed introduced into 150 g of soil sample can be detected by qPCR in both soil types (Fig. 2b). Furthermore, the variations in Cq values for the same seed density in both soil types were minimal, suggesting efficient recovery of Striga seeds and qPCR efficiency in both soil types (clay, sand) (Fig. 2b). In the study by van Delft et al. (1997), where Striga seeds were manually counted, the flotation method had a recovery of up to 85%. Hence, our approach substantially improved Striga seed detection and provided a molecular confirmation of Striga seed presence.

Optimizing quantification of Striga seeds in agricultural soil
For accurate quantification of the Striga seedbank in naturally infested soils, standard curves are typically generated by using genomic DNA extracted from the weed seeds (Dongo et al. 2012; Aly et al. 2012, 2019). To be precise on gene copy number, we amplified and cloned the marker gene (StHe0GB1_93) into the pGEM®-T Easy vector to establish an absolute standard curve. Additionally, we established a second standard curve with genomic DNA extracted from Striga seeds introduced in 150 g D08 soil at six different densities. An initial number of gene copies (NGC) of 258129 single-stranded (ss)-rpDNA µl⁻¹ was calculated from the initial DNA concentration of the rpDNA (0.5 pg/µl) extracted from the transformed E. coli using Eq. 1 (Brankatschk et al. 2012). From the five-point 10-fold serial dilution (0.5 pg/µl to 0.00005 pg/µl) of the purified rpDNA, mean Cq values of 16.51 and 30.82 were calculated for the highest (0.5 pg/µl) and lowest (0.00005 pg/µl) concentration, corresponding to 258129 and 26 gene copies µl⁻¹ of ss-rpDNA, respectively. The absolute standard curve for StHe0GB1_93 is linear in the range tested (R² = 0.9959) with a slope of -3.5965 (Fig. 3a). From the slope, an amplification efficiency of 89.69% was determined for StHe0GB1_93. DNA concentrations as low as 0.00005 pg/µl could be detected in the assay. This result revealed that our method is more sensitive than that in the recent study of Aly and coworkers (Aly et al. 2019), in which 0.001 ng/µl was the minimum DNA concentration that could be detected for genomic DNA of the parasitic weed Orobanche cumana. The high detection sensitivity obtained in our study with qPCR may be due to the use of plasmid DNA, which is devoid of other DNA and inhibitors from seed samples that can interfere in qPCR.
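The efficiency and detection-limit figures quoted above can be checked with a few lines of arithmetic. The sketch below assumes the common relation E = 10^(-1/slope) - 1 between the slope of a standard curve and the amplification efficiency; the function name is illustrative and not part of the original analysis.

```python
# Minimal sketch: amplification efficiency from the slope of a qPCR standard
# curve, assuming the common relation E = 10**(-1/slope) - 1.
# The function name is illustrative, not taken from the original study.

def amplification_efficiency(slope: float) -> float:
    """Return the amplification efficiency as a fraction (1.0 = 100%)."""
    return 10 ** (-1.0 / slope) - 1.0

# Slope reported for the StHe0GB1_93 absolute standard curve.
slope = -3.5965
efficiency = amplification_efficiency(slope)
print(f"Efficiency: {efficiency * 100:.2f}%")

# Five-point 10-fold dilution: the lowest standard is 10**4 times more dilute
# than the highest one (258129 gene copies per microlitre).
highest_copies = 258129
lowest_copies = highest_copies / 10 ** 4
print(f"Lowest standard: about {round(lowest_copies)} copies per microlitre")
```

A slope of about -3.32 would correspond to 100% efficiency (a doubling per cycle), so the reported slope of -3.5965 is consistent with the stated efficiency of just under 90%.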
Following in the footsteps of the elegant study by Aly et al. (2019) to quantify O. cumana seeds in naturally infested soil samples, we plotted the seed number against the gDNA extracted from six Striga seed densities (1, 3, 9, 27, 81 and 243 seeds) introduced into 150 g of field soil D08. The obtained standard curve was then used to establish a relationship between the number of Striga seeds in a soil and the estimated copy number of the marker gene (StHe0GB1_93), calculated based on the Cq values of the different seedbank densities (Fig. 3b). The standard curve is linear in the range of Striga seed numbers tested (R² = 0.9942). Hence, this standard curve was then used for quantification of Striga seeds in naturally infested soil samples collected from sorghum growing fields in Ethiopia.

Striga seedbank density in naturally infested sorghum fields in Ethiopia
The method that was validated on artificially infested soils was used to detect and quantify Striga seeds in 48 naturally infested soil samples (referred to as E01 to E50, except soils E15 and E48) collected from sorghum field soils from different agroecological zones in Ethiopia and covering a trajectory along the sorghum belt of more than 1500 km (Fig. 4a). Following our new method described above, the results showed substantial variation in Striga seed density among the 48 Ethiopian soil samples (Fig. 4b and Table S3). The Striga seed densities ranged from 0 to 86 per 150 g of soil sample, with soil samples E22, E12 and E27 harboring the highest Striga seedbank densities of 86, 67 and 46 seeds per 150 g, respectively (Fig. 4b). Striga seeds were not detected in soil samples of 12 Ethiopian sorghum fields (E13, E16, E17, E19, E20, E21, E30, E33, E38, E40, E43, and E45) (Fig. 4b).
When looking into the geospatial distribution of the Striga seedbank in Ethiopian sorghum fields, most of the soils with relatively high Striga seed densities were collected from the Tigray region of Ethiopia. The majority of the samples that showed relatively low Striga seed densities were collected from sorghum growing areas of North Shewa. Here we would like to emphasize that the terms 'relatively low' and 'relatively high' Striga seed densities are merely used to categorize the seedbank of our soil samples and may not reflect the extent to which it poses an adverse effect on sorghum growth and yield. For example, if the seed density is expressed per square meter of field soil, with 300 kg of top soil per square meter (considering only the top 20 cm and assuming an average bulk density of agricultural soil of 1.5 g/cm³), then the lowest seed density detected (1 seed per 150 g of soil sample) still corresponds to approximately 2,000 Striga seeds per m². Translating the numbers shown in Fig. 4b to numbers that are relevant at field scale suggests the persistence of a high Striga seedbank in multiple fields in the sorghum belt of Ethiopia.
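The field-scale conversion described above is a simple proportion, sketched below under the same assumptions as in the text (top 20 cm of soil, bulk density of 1.5 g/cm³, 150 g samples). The function and parameter names are illustrative only.

```python
# Minimal sketch: convert a seed count per soil sample to seeds per square
# meter of field. Defaults follow the assumptions stated in the text; the
# function and parameter names are illustrative.

def seeds_per_square_meter(
    seeds_per_sample: float,
    sample_mass_g: float = 150.0,
    depth_cm: float = 20.0,
    bulk_density_g_cm3: float = 1.5,
) -> float:
    # One square meter is 100 cm x 100 cm, so the top-soil volume is
    # depth_cm * 10,000 cm^3 and its mass follows from the bulk density.
    topsoil_mass_g = depth_cm * 100.0 * 100.0 * bulk_density_g_cm3
    return seeds_per_sample * topsoil_mass_g / sample_mass_g

lowest = seeds_per_square_meter(1)    # lowest non-zero density detected
highest = seeds_per_square_meter(86)  # highest density detected (soil E22)
print(lowest, highest)
```

Under these assumptions a single seed per 150 g sample scales to 2,000 seeds per m², and the highest density observed (86 seeds per 150 g) to 172,000 seeds per m².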

Relationship between Striga seedbank and Striga incidence
Determining the relationship between Striga seed densities and field infestation would be highly instrumental to predict the risk for crop losses in different agroecological conditions and to test the efficacy of specific management practices. At the same time, establishing causal relationships is difficult as these are highly dependent on the sorghum genotypes and management practices used by the farmers at the time of sampling and in future cultivations. Both linear and non-linear relationships between the number of emerged Striga plants and the initial seedbank density were previously reported (Smith and Webb 1996; van Delft et al. 1997). Our analysis revealed a significant (p ≤ 0.0001) positive correlation (r = 0.561) between the Striga seed density and the percentage of Striga emergence per square meter assessed in the same field where the soil samples were collected. Moreover, of the different regression analyses tested (linear, non-linear), the non-linear regression provided the best fit (R² = 0.362) between the number of Striga seeds per 150 g of soil and the number of emerged Striga seedlings counted per square meter of sorghum field (Fig. 5b). The asymptotic nature of this non-linear relationship appears to make more biological sense than a linear relationship, considering intraspecific competition for infection sites and/or outgrowth and emergence. Despite the overall positive correlation between these two parameters, however, some soil samples deviated to some extent from this relationship. For example, soil E04 showed high Striga incidence but low Striga seed density, whereas soil E27 showed high seed density but low Striga incidence (Fig. 5b). The underlying mechanisms of this deviation are under investigation and may be due to soil physicochemical and/or microbiological attributes that act on the Striga seedbank or on Striga infection.
A previous study also showed that even in fallow fields, one year after the last harvest, a decrease of 62% in the number of seeds was recorded for the top soil fraction; this was not the case for samples originating from below a depth of 10 cm, possibly reflecting the decrease in microbial activity with soil depth (van Delft et al. 1997). Soils E12 and E22, which showed high Striga seedbank density and high Striga incidence, could be considered soils conducive for Striga, whereas soil E27 can be considered a potential Striga-suppressive soil. Although this regression analysis might not provide a conclusive means to categorize field soils as Striga conducive or suppressive, it can serve as a lead to further interrogate these soils for Striga-suppressive physicochemical or microbiological traits. Furthermore, Striga seeds were detected in soil samples collected from push-pull fields (E01, E07, E49 and E50, having 1, 3, 12 and 1 Striga seeds per 150 g of soil, respectively) though no or low Striga incidence was observed during soil sampling (Fig. 4b). This result is in line with earlier observations of low Striga incidence in push-pull fields but also suggests that Striga seeds may persist in push-pull soils that are assumed to diminish the seedbank of this parasitic weed.
Whether the Striga seeds detected in these and other soils tested in this study are still viable remains to be determined. Hence, selecting marker genes that distinguish viable non-dormant from viable dormant seeds and designing primers that differentially amplify the genomic DNA/RNA extracted from these seeds is the next research priority to get an even more detailed insight into Striga seedbank dynamics.

Conclusions
In this study, we developed a high-throughput and robust molecular technique for the detection and quantification of the Striga seedbank in agricultural soils. This technique is a first important step to screen large numbers of samples to assess the impact of different intervention strategies on Striga seedbank dynamics and to unravel the impact of soil microbiological and physicochemical properties. The proof-of-principle experiment we performed by mixing known numbers of Striga seeds in 150 g of two Striga-free Dutch agricultural soil samples showed that our procedure can detect and quantify a single Striga seed per 150 g of soil. The qPCR detection and quantification of Striga seeds in soils were further tested on soil samples collected from naturally infested sorghum fields and showed considerable variation in Striga seedbanks across the sorghum belt in Ethiopia. Correlation analysis also revealed a significant (p ≤ 0.0001) positive correlation (r = 0.561) between the density of Striga seeds and the number of emerged Striga per square meter of sorghum field. The next challenge will be differentiating viable non-dormant and viable dormant seeds to further fine-tune the relationship between Striga seedbank dynamics and Striga incidence.

Soil sampling and study areas
The soil samples were collected from naturally Striga infested sorghum fields in Amhara (Kemise, North Shewa, South and North Wollo Zones) and Tigray (West, Central and South zones) regions of Ethiopia in October 2017 (Fig. 4a). For representative soil sampling, sorghum fields with four categories (zero, low, medium and high) of Striga field infestation were randomly selected. These categories were determined based on the number of emerged Striga plants counted for four quadrants of 1 m × 1 m. Soil samples from the top layer (0-20 cm) around the root zone of sorghum plants in these quadrants were sampled separately and later combined to form one composite sample per field. Utensils used for the sampling were washed with water and rinsed with 70% ethanol between successive samplings to avoid cross contamination of samples. In total, 48 composite soil samples covering a trajectory of more than 1500 km were collected from naturally Striga infested sorghum growing agro-ecological zones in Ethiopia. Among the soil samples, four soil samples from push-pull demonstration sorghum fields in North Shewa, Kemise and West Hararghae Zones of Ethiopia were included to investigate the effect of push-pull technologies on the Striga seedbank density in agricultural fields. Soil samples were brought to the lab in Holeta research centre, air dried and sieved through a 4-mm mesh sieve to remove stones and plant debris. Furthermore, seven Striga-free soil samples were collected from different parts of the Netherlands and used to investigate Striga DNA recovery and qPCR efficiency.

Selection of marker genes, primer design and specificity
From the parasitic plant genome project (http://ppgp.huck.psu.edu/) the Striga StHe0GB1 genome assembly was downloaded. We selected five genes (StHe0GB1_1, StHe0GB1_9, StHe0GB1_20, StHe0GB1_76 and StHe0GB1_93) with Striga-specific sequences as putative marker genes for Striga seed detection and quantification. For these five genes, a total of 14 primer pairs were designed targeting the Striga-specific sequences. Using the NCBI primer blast web tool and the nr database, the specificity of the forward and reverse primers was validated in silico. To validate the efficacy of these primers experimentally, we extracted DNA from ~6000 Striga seeds, from 100 mg of Striga-free Dutch agricultural soil spiked with ~6000 Striga seeds, and from 100 mg of the soil sample without Striga seeds (control).
Samples were ground manually with mortar and pestle in liquid nitrogen and kept at -80 °C until further use. The genomic DNA was extracted using the DNeasy PowerSoil Kit (QIAGEN) according to the manufacturer's instructions. The DNA quality and quantity were determined using a NanoDrop spectrophotometer.
PCR was carried out in a 25 µl reaction volume using GoTaq hot start polymerase master mix (12.5 µl), primer mix (1 µl), template DNA (0.5 µl) and water (11 µl) on a thermocycler equipped with a heated lid. The cycling conditions were an initial denaturation for 2 min at 95 °C; 35 cycles of 30 sec at 95 °C, 30 sec at 50 °C and 30 sec at 72 °C; and a final elongation for 5 min at 72 °C. The primer sets were also evaluated in qPCR. The qPCR mixes amounted to a total volume of 20 µl, consisting of 4 µl of the template DNA, 10 µl of SYBR Green, 1 µl of each forward and reverse primer (10 ppm), 2 µl of BSA (4 mg/ml) and 2 µl Sigma water. Two annealing temperatures (56 °C, 60 °C) were tested to assess the specificity of the primers. A Bio-Rad qPCR machine was used with the following conditions: 3 min at 95 °C followed by 35 amplification cycles of 5 sec at 95 °C, 15 sec at 56 °C or 60 °C, and 25 sec for elongation at 72 °C.
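When many soil samples are processed in one run, the per-reaction volumes listed above are usually scaled into a master mix. The sketch below encodes the two recipes from the text and scales them to a chosen number of reactions. The 10% pipetting overage and the choice to add template DNA separately to each well are assumptions for illustration, not part of the described protocol.

```python
# Minimal sketch: scale the per-reaction PCR and qPCR recipes from the text
# to a master mix. The overage and the exclusion of template DNA from the
# master mix are illustrative assumptions, not the authors' stated procedure.

PCR_MIX_UL = {
    "GoTaq hot start master mix": 12.5,
    "primer mix": 1.0,
    "template DNA": 0.5,
    "water": 11.0,
}

QPCR_MIX_UL = {
    "template DNA": 4.0,
    "SYBR Green": 10.0,
    "forward primer": 1.0,
    "reverse primer": 1.0,
    "BSA (4 mg/ml)": 2.0,
    "water": 2.0,
}

def scale_master_mix(per_reaction_ul, n_reactions, overage=0.10,
                     exclude=("template DNA",)):
    """Return total volumes (µl) per component for n_reactions plus overage."""
    factor = n_reactions * (1.0 + overage)
    return {
        name: round(volume * factor, 2)
        for name, volume in per_reaction_ul.items()
        if name not in exclude
    }

# Sanity check: the listed components add up to the stated reaction volumes.
print(sum(PCR_MIX_UL.values()), sum(QPCR_MIX_UL.values()))

# Example: master mix for 48 qPCR reactions.
for name, volume in scale_master_mix(QPCR_MIX_UL, 48).items():
    print(f"{name}: {volume} µl")
```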

Establishment of standard curves
We developed a recombinant plasmid containing the marker gene (StHe0GB1_93) to generate an absolute standard curve relating Cq values to gene copy numbers. Furthermore, we established a second standard curve using the genomic DNA extracted from six densities of Striga seeds introduced in 150 g of Striga-free agricultural soil (D08) to establish the relationship between seed number and Cq value. The combined use of the two standard curves enabled us to establish the relationship between Cq value and seed number when analyzing the naturally infested Ethiopian soil samples.

Recombinant plasmid DNA-based standard curve
The marker gene was first cloned in the pGEM®-T Easy vector. Then, the vector containing the marker gene was transformed into E. coli, and positive colonies were identified using colony PCR and cultured in LB medium. The rpDNA was isolated and purified and the concentration of the rpDNA was determined. A five-point 10-fold serial dilution (0.5 pg/µl to 0.00005 pg/µl) of the purified rpDNA was subjected to qPCR to establish the relationship between Cq values and the calculated gene copies of the marker gene.
An initial number of gene copies µl⁻¹ of single-stranded (ss)-rpDNA (NGC ss-rpDNA) was calculated from the initial DNA concentration of the rpDNA (c; 0.5 pg/µl, expressed in ng/µl), the length of the plasmid containing the target gene (l; 3535 bp), the number of targets per DNA fragment (n target; 2 copies), the Avogadro constant (N_A; 6.022 × 10²³ bp mol⁻¹), and the average weight of a double-stranded base pair (660 g mol⁻¹ = 6.6 × 10¹¹ ng mol⁻¹) (Eq. 1) (Brankatschk et al. 2012).

$$NGC_{ss\text{-}rpDNA} = \frac{c \times n_{target} \times N_A}{l \times 6.6 \times 10^{11}} \tag{1}$$

The linear regression of the Cq value of each dilution versus the corresponding log₁₀ gene copy number (N₀ Sample) was used to calculate the slope (b) and intercept (a) of the standard curve (Eq. 2) (Brankatschk et al. 2012).

$$Cq = a + b \times \log_{10}(N_{0\,Sample}) \tag{2}$$

The amplification efficiency (E) was calculated from the slope of the standard curve using Eq. 3.

$$E = 10^{-1/b} - 1 \tag{3}$$
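As a worked check of the copy-number calculation, the sketch below combines the quantities listed for Eq. 1 (concentration, plasmid length, targets per fragment, Avogadro constant and base-pair weight) in the usual way for such conversions. The function and variable names are illustrative. With the rounded constants given in the text it returns roughly 258,100 copies per µl, which agrees with the reported 258129 to within rounding.

```python
# Minimal sketch: gene copies per microlitre of plasmid standard, using the
# quantities listed for Eq. 1. Names are illustrative; the concentration must
# be given in ng/µl so that it matches the base-pair weight in ng/mol.

AVOGADRO = 6.022e23              # per mol, as given in the text
BP_WEIGHT_NG_PER_MOL = 6.6e11    # 660 g/mol per double-stranded base pair

def gene_copies_per_ul(conc_ng_per_ul: float, plasmid_length_bp: int,
                       n_target: int = 2) -> float:
    return (conc_ng_per_ul * n_target * AVOGADRO
            / (plasmid_length_bp * BP_WEIGHT_NG_PER_MOL))

# Highest standard: 0.5 pg/µl = 0.5e-3 ng/µl, plasmid length 3535 bp.
top_standard = gene_copies_per_ul(0.5e-3, 3535)

# Five-point 10-fold dilution series of the plasmid standard.
dilution_series = [top_standard / 10 ** i for i in range(5)]
for copies in dilution_series:
    print(f"{copies:,.1f} copies per µl")
```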
Striga seedbank density-based standard curve
Another standard curve was established from the genomic DNA extracted, as described above, from soil sample D08 mixed with six densities of Striga seeds (1, 3, 9, 27, 81 and 243 seeds). The gene copies for each seed density were calculated from the average Cq value by using the regression formula generated above from the rpDNA gene copies and the corresponding average Cq values. Then, the relationship between the number of Striga seeds and the estimated gene copies was generated. This regression equation was used to convert the Striga seed DNA detected by qPCR into a number of seeds in naturally infested soils.
Eq. 2 was rearranged, and the antilogarithm of both sides was taken, to calculate the number of gene copies of Striga seed DNA (NGC ssDNA) extracted from the different densities of seeds introduced in Striga-free Dutch soil D08, as described by Gallup (2011).

$$NGC_{ssDNA} = 10^{(Cq - a)/b} \tag{4}$$
The number of gene copies of Striga seed DNA extracted from naturally infested field soils (NGC ssDNA soil) was calculated using Eq. 5, as described by Gallup (2011).

5
The standard curve relating Striga seed number to gene copy number, generated above from soil artificially infested with different densities of Striga seeds, was then used to estimate the number of Striga seeds in naturally infested soil samples from their average Cq values via the Cq-gene copy relationship.
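The two-step conversion described above (Cq to gene copies, then gene copies to seeds) can be sketched as follows. The slope is the reported value for the absolute standard curve. The intercept is an assumption, anchored here on the highest plasmid standard (Cq 16.51 at 258129 copies) because the fitted intercept is not given in the text. The parameters of the seed curve (`copies_per_seed`, `offset`) are hypothetical placeholders for the fitted regression in Fig. 3b.

```python
import math

# Minimal sketch of the two-step conversion: Cq -> gene copies -> seeds.
# SLOPE_B is the reported slope; INTERCEPT_A is an assumption derived from
# the highest plasmid standard, not the fitted intercept of the study.
SLOPE_B = -3.5965
INTERCEPT_A = 16.51 - SLOPE_B * math.log10(258129)

def copies_from_cq(cq: float, a: float = INTERCEPT_A,
                   b: float = SLOPE_B) -> float:
    """Invert Cq = a + b * log10(copies) to estimate gene copies."""
    return 10 ** ((cq - a) / b)

def seeds_from_copies(copies: float, copies_per_seed: float,
                      offset: float = 0.0) -> float:
    """Invert a linear seed curve, copies = offset + copies_per_seed * seeds.

    Both parameters are hypothetical and must come from the fitted curve.
    """
    return max(0.0, (copies - offset) / copies_per_seed)

# Example with made-up seed-curve parameters, for illustration only.
estimated_copies = copies_from_cq(28.0)
estimated_seeds = seeds_from_copies(estimated_copies, copies_per_seed=50.0)
print(f"{estimated_copies:.0f} copies -> about {estimated_seeds:.1f} seeds")
```

Clamping the estimate at zero avoids negative seed counts for samples whose signal falls below the intercept of the seed curve; whether such samples should instead be reported as "not detected" is a design choice left open here.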
Striga seed separation from the soil matrix
To reduce the influence of soil physicochemical properties on DNA recovery and qPCR efficiency, we investigated methods that separate Striga seeds from the bulk soil to enhance the detection and accurate quantification of Striga seeds in soil. As a first step, using two Striga-free Dutch agricultural soil samples (D08 and D17) with contrasting physicochemical properties, we evaluated two seed-soil separation methods: 1) combined washing and sieving of the Striga-soil mixture and 2) density-dependent K2CO3 separation (flotation) followed by sieving. In the first method, 150 g of the soil samples were washed by gently mixing on five sieves arranged sequentially in order of decreasing pore size (300, 200, 180, 150 and 100 µm), and the soil retained on the three smallest sieves was pooled. In the second method, density-based extraction with K2CO3 solution followed by size-dependent separation by sieving was performed to separate Striga seeds and some small and lighter soil particles from heavier and larger organic debris and soil particles. Each sample was divided over three 250 ml centrifuge bottles, each containing 50 g of soil suspended in 150 ml of 5.5 M K2CO3 solution.
Then, the soil samples were dispersed by shaking at 250 rpm for 15 min followed by sonication for 15 min using a Bransonic® Ultrasonic Cleaner operating at an RF frequency of 47 kHz ± 6%. The dispersed soil samples were centrifuged at 5,000 × g for 5 min at room temperature using a high-speed centrifuge. The Striga seeds and other lighter organic matter floated on top of the supernatant whereas the majority of the soil particles settled at the bottom. The supernatants from the three bottles of the same sample were collected into 1000 ml bottles and the process was repeated a second time to ensure full recovery of all the Striga seeds. Then, size-dependent separation of the Striga seeds, smaller soil particles and organic debris from larger particles was performed using two meshes (pore sizes 425 and 75 µm) arranged in successive order. The Striga seeds and smaller particles retained on the 75 µm mesh were dried at 35 °C for 48 hours and collected for further grinding and DNA extraction.
The efficiency of the density- and size-dependent method described above was assessed in proof-of-principle experiments involving introduction of known numbers of S. hermonthica seeds (0, 1, 3, 9, 27, 81 and 243 seeds) into 150 g of two soil samples (D08 and D17) with contrasting soil physicochemical properties. Then, the Striga seeds were re-separated from the 150 g soil samples and ground manually by mortar and pestle under liquid nitrogen. The genomic DNA was extracted using the DNeasy PowerSoil Kit (QIAGEN) according to the manufacturer's instructions, as indicated above. Then, qPCR was carried out to assess the effectiveness of the recovery of the Striga seeds from the soil matrix.

Figure 2. Influence of soil type on recovery of Striga seeds. a) qPCR detection of 65 Striga hermonthica seeds mixed into seven physicochemically different Dutch agricultural soils (D08, D10, D11, D13, D20, D21, D17). After mixing the seeds into these soils, total DNA was extracted and subjected to qPCR with primer set 14 (see Figure 1b). b) qPCR detection of different Striga seed densities introduced into two physicochemically distinct Dutch agricultural soils (D08, D17). In contrast to the procedure used in panel a, soils containing the Striga seeds were first treated with K2CO3 for size-dependent separation of the Striga seeds from the soil matrix prior to DNA extraction. For both experiments, the mean Cq values (± SE) are shown for three biological replications and two technical replications per biological replication.

Figure 3. … Figure 1A) and the logarithm of the gene copy number. For each log gene copy number, 3 replicates were used in qPCR; b) relationship between different Striga hermonthica seed densities mixed into agricultural soil and the estimated gene copy number. For each Striga seed density, three biological replications and two technical replications per biological replication were used.
