Selection of marker genes, primer design and specificity
The Striga StHe0GB1 genome assembly was downloaded from the parasitic plant genome project (http://ppgp.huck.psu.edu/). The genome sequence was generated from Striga seeds that had been imbibed for various lengths of time, covering seeds that were conditioned as well as seeds that were treated for up to 6 h with the germination stimulant GR-24 (Westwood et al. 2012). From the assembly, several genes were selected and blasted against the nr database. We selected five genes (StHe0GB1_1, StHe0GB1_9, StHe0GB1_20, StHe0GB1_76 and StHe0GB1_93) with Striga-specific sequences as putative marker genes for Striga seed detection and quantification (Table 1). For these five genes, a total of 14 primer pairs were designed targeting the Striga-specific sequences (Table 1). The specificity of the forward and reverse primers was validated in silico using the NCBI primer blast web tool and the nr database (data not shown). To validate the efficacy of these primers experimentally, we extracted DNA from ~6000 Striga seeds, from 100 mg of Striga-free Dutch agricultural soil spiked with ~6000 Striga seeds, and from 100 mg of the same soil without Striga seeds (control). All primer sets, except set 3 (targeting StHe0GB1_1), yielded a PCR product of the predicted size (Table S1) for the samples containing Striga seeds, whereas no amplification product was observed for any of the primer sets with DNA extracted from the soil samples without Striga seeds (Fig. 1a). Next, we tested the primer pairs in qPCR at two annealing temperatures (56 and 60 °C) to determine the sensitivity, specificity and stability of the primers. All primers amplified the genomic DNA of Striga seeds at both temperatures but with different sensitivity, specificity and stability (Fig. 1b). Primer set 14 (P14), targeting the StHe0GB1_93 gene, showed high sensitivity, as manifested by a low quantification cycle (Cq) value, and a single melting curve peak at both annealing temperatures (Fig. 1b and S1).
Furthermore, P14 resulted in a PCR product of the expected size for five independent S. hermonthica seed batches and one S. asiatica seed batch collected from different agroecological zones in Ethiopia (Fig. S2). Hence, primer set P14 was selected for testing the specificity and sensitivity of PCR-based detection and quantification of Striga seeds in soil samples.
Optimizing DNA extraction and qPCR efficiency in different agricultural soils
Accurate determination of the seed density of Striga in a field soil requires a substantial amount of soil sample. Previous assays to microscopically determine Striga seed densities used 100 g soil samples (van Mourik 2007). However, such a large sample cannot be directly accommodated in the currently available high-throughput DNA extraction kits. Moreover, the application of molecular techniques for the detection and quantification of eDNA from soils may be hampered by humic acids, polysaccharides, urea, phenolic compounds and heavy metals (Frostegård et al. 1999). Here, we introduced 65 Striga seeds into 100 mg samples of seven physicochemically different Striga-free Dutch agricultural soils (Table S2). qPCR analysis of eDNA extracted from these ‘spiked’ soil samples showed variation in the mean Cq value from 27.3 to 29.3 cycles (Fig. 2a). This variation could be due to differences in physicochemical properties between these soils that affect the efficiency of eDNA extraction and/or qPCR analysis. Correlation analysis between the Cq values and a number of physicochemical properties of the seven soil samples revealed that Cq values were positively correlated with Fe, Mg, S, C, N, C/N and organic matter (OM) content, whereas they were negatively correlated with pH, K and P contents (albeit not statistically significant, p ≥ 0.05).
To minimize interference of soil physicochemical properties, we then tested whether separating Striga seeds from the bulk soil prior to eDNA extraction and qPCR could improve the sensitivity of Striga detection and quantification. To this end, we adopted a density-dependent K2CO3 separation of the Striga seeds from the soil matrix, followed by successive sieving through two filters with meshes of 425 and 75 µm, respectively. The Striga seeds, smaller soil particles and organic debris retained on the 75 µm filters were collected and dried at 35 °C for 48 hours, followed by grinding and DNA extraction. This procedure reduced the soil volume by 99.7% on average for the two physicochemically different Dutch soils tested. Because the soil volume is reduced so effectively, the remaining material can be used directly for DNA extraction with widely available commercial extraction kits.
Next, we introduced increasing densities of Striga seeds in soils D08 and D17 at final densities of 0, 1, 3, 9, 27, 81 and 243 seeds per 150 g of soil and processed these soil samples as described above. Results of the qPCR analysis revealed that even a single Striga seed introduced into 150 g of soil sample can be detected by qPCR in both soil types (Fig. 2b). Furthermore, the variations in Cq values for the same seed density in both soil types were minimal, suggesting efficient recovery of Striga seeds and qPCR efficiency in both soil types (clay, sand) (Fig. 2b). In the study by van Delft et al. (1997), where Striga seeds were manually counted, the flotation method had a recovery of up to 85%. Hence, our approach substantially improved Striga seed detection and provided a molecular confirmation of Striga seed presence.
Optimizing quantification of Striga seeds in agricultural soil
For accurate quantification of the Striga seedbank in naturally infested soils, standard curves are typically generated by using genomic DNA extracted from the weed seeds (Dongo et al. 2012; Aly et al. 2012, 2019). To be precise on gene copy number, we amplified and cloned the marker gene (StHe0GB1_93) into the pGEM®-T Easy vector to establish an absolute standard curve. Additionally, we established a second standard curve with genomic DNA extracted from Striga seeds introduced into 150 g of D08 soil at six different densities. An initial number of gene copies (NGC) of 258129 single-stranded (ss)-rpDNA µl⁻¹ was calculated from the initial DNA concentration of the rpDNA (0.5 pg/µl) extracted from the transformed E. coli using Eq. 1 (Brankatschk et al. 2012). From the five-point 10-fold serial dilution (0.5 pg/µl to 0.00005 pg/µl) of the purified rpDNA, mean Cq values of 16.51 and 30.82 were calculated for the highest (0.5 pg/µl) and lowest (0.00005 pg/µl) concentrations, corresponding to 258129 and 26 gene copies µl⁻¹ of ss-rpDNA, respectively. The absolute standard curve for StHe0GB1_93 is linear in the range tested (R² = 0.9959) with a slope of −3.5965 (Fig. 3a). From the slope, an amplification efficiency of 89.69% was determined for StHe0GB1_93. DNA concentrations as low as 0.00005 pg/µl could be detected in the assay. This result indicates that our assay is more sensitive than that of Aly and coworkers (Aly et al. 2019), in which 0.001 ng/µl (i.e. 1 pg/µl) was the minimum DNA concentration that could be detected for genomic DNA of the parasitic weed Orobanche cumana. The high detection sensitivity obtained in our study may be due to the use of plasmid DNA, which is devoid of other DNA and of inhibitors from seed samples that can interfere with qPCR.
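The arithmetic behind the reported efficiency and dilution series can be retraced with a short sketch. It assumes the conventional relation between standard-curve slope and amplification efficiency, E = 10^(−1/slope) − 1, which the text does not state explicitly, and it takes the starting copy number (258129 copies µl⁻¹) as given rather than recomputing it from Eq. 1. The function names are hypothetical and serve illustration only.

```python
def amplification_efficiency(slope):
    """Return qPCR amplification efficiency in percent.

    Assumes the conventional relation E = 10^(-1/slope) - 1 for a
    standard curve of Cq versus log10(template amount).
    """
    return (10 ** (-1.0 / slope) - 1.0) * 100.0


def dilution_series(start_copies, fold, points):
    """Return copy numbers per microlitre for a serial dilution."""
    return [start_copies / fold ** i for i in range(points)]


# Values reported in the text.
slope = -3.5965          # slope of the absolute standard curve
start_copies = 258129    # gene copies per microlitre at 0.5 pg/ul

efficiency = amplification_efficiency(slope)
copies = dilution_series(start_copies, fold=10, points=5)

print(f"Amplification efficiency: {efficiency:.2f}%")
print("Copies per ul along the dilution series:",
      [round(c) for c in copies])
```

Under these assumptions the efficiency derived from the slope agrees with the value given in the text, and the fifth dilution point rounds to the 26 copies µl⁻¹ reported for 0.00005 pg/µl.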
Following in the footsteps of the elegant study by Aly et al. (2019) to quantify O. cumana seeds in naturally infested soil samples, we plotted the seed number against the gDNA extracted from six Striga seed densities (1, 3, 9, 27, 81 and 243 seeds) introduced into 150 g of field soil D08. The obtained standard curve was then used to establish a relationship between the number of Striga seeds in a soil and the estimated copy number of the marker gene (StHe0GB1_93), calculated based on the Cq values of the different seedbank densities (Fig. 3b). The standard curve is linear in the range of Striga seed numbers tested (R² = 0.9942). Hence, this standard curve was used for quantification of Striga seeds in naturally infested soil samples collected from sorghum-growing fields in Ethiopia.
Striga seedbank density in naturally infested sorghum fields in Ethiopia
The method that was validated on artificially infested soils was used to detect and quantify Striga seeds in 48 naturally infested soil samples (referred to as E01–E50, excluding E15 and E48) collected from sorghum fields in different agroecological zones in Ethiopia, covering a trajectory of more than 1500 km along the sorghum belt (Fig. 4a). Using the method described above, we found substantial variation in Striga seed density among the 48 Ethiopian soil samples (Fig. 4b and Table S3). The Striga seed densities ranged from 0 to 86 per 150 g of soil sample, with soil samples E22, E12 and E27 harboring the highest Striga seedbank densities of 86, 67 and 46 seeds per 150 g, respectively (Fig. 4b). Striga seeds were not detected in soil samples from 12 Ethiopian sorghum fields (E13, E16, E17, E19, E20, E21, E30, E33, E38, E40, E43 and E45) (Fig. 4b).
With regard to the geospatial distribution of the Striga seedbank in Ethiopian sorghum fields, most of the soils with relatively high Striga seed densities were collected from the Tigray region of Ethiopia. The majority of the samples that showed relatively low Striga seed densities were collected from sorghum-growing areas of North Shewa. Here we would like to emphasize that the terms ‘relatively low’ and ‘relatively high’ Striga seed densities are merely used to categorize the seedbank of our soil samples and may not reflect the extent to which the seedbank adversely affects sorghum growth and yield. For example, if the seed density is expressed per square meter of field soil, with 300 kg of top soil per square meter (considering only the top 20 cm and assuming an average bulk density of agricultural soil of 1.5 g/cm³), then the lowest seed density detected (1 seed per 150 g of soil sample) still corresponds to approximately 2,000 Striga seeds per m². Translating the numbers shown in Fig. 4b to numbers that are relevant at field scale suggests the persistence of a high Striga seedbank in multiple fields in the sorghum belt of Ethiopia.
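The field-scale conversion in the example above can be written out as a minimal sketch. It uses only the assumptions stated in the text (top 20 cm of soil, bulk density of 1.5 g/cm³, 150 g samples) and additionally assumes that seeds are distributed uniformly through that layer; the function names are hypothetical. Apart from the 2,000 seeds per m² for a single seed per sample, the extrapolated numbers are illustrative and are not results reported in this study.

```python
def topsoil_mass_g_per_m2(depth_cm=20.0, bulk_density_g_cm3=1.5):
    """Mass of top soil (g) under one square meter of field."""
    area_cm2 = 100.0 * 100.0  # 1 m^2 expressed in cm^2
    return area_cm2 * depth_cm * bulk_density_g_cm3


def seeds_per_m2(seeds_per_sample, sample_mass_g=150.0,
                 depth_cm=20.0, bulk_density_g_cm3=1.5):
    """Scale a per-sample seed count to seeds per square meter.

    Assumes seeds are spread uniformly through the top soil layer.
    """
    soil_mass = topsoil_mass_g_per_m2(depth_cm, bulk_density_g_cm3)
    return seeds_per_sample * soil_mass / sample_mass_g


print("Top soil per m^2 (kg):", topsoil_mass_g_per_m2() / 1000.0)
for count in (1, 46, 67, 86):  # lowest and three highest densities in Fig. 4b
    print(f"{count} seeds per 150 g -> {seeds_per_m2(count):.0f} seeds per m^2")
```

Because the conversion is linear, any change in the assumed sampling depth or bulk density scales the field-level estimate proportionally.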
Relationship between Striga seedbank and Striga incidence
Determining the relationship between Striga seed densities and field infestation would be highly instrumental for predicting the risk of crop losses under different agroecological conditions and for testing the efficacy of specific management practices. At the same time, establishing causal relationships is difficult as these are highly dependent on the sorghum genotypes and management practices used by the farmers at the time of sampling and in future cultivations. Both linear and non-linear relationships between the number of emerged Striga plants and the initial seedbank density were previously reported (Smith and Webb 1996; van Delft et al. 1997). Our analysis revealed a significant (p ≤ 0.0001) positive correlation (r = 0.561) between the Striga seed density and the percentage of Striga emergence per square meter assessed in the same field where the soil samples were collected. Moreover, of the regression analyses tested (linear, non-linear), the non-linear regression provided the best fit (R² = 0.362) between the number of Striga seeds per 150 g of soil and the number of emerged Striga seedlings counted per square meter of sorghum field (Fig. 5b). The asymptotic nature of this non-linear relationship appears to make more biological sense than a linear relationship, considering intraspecific competition for infection sites and/or outgrowth and emergence. Despite the overall positive correlation between these two parameters, however, some soil samples deviated to some extent from this relationship. For example, soil E04 showed high Striga incidence but low Striga seed density, whereas soil E27 showed high seed density but low Striga incidence (Fig. 5b). The underlying mechanisms of this deviation are under investigation and may be due to soil physicochemical and/or microbiological attributes that act on the Striga seedbank or on Striga infection.
A previous study also showed that even in fallow fields, one year after the last harvest, a decrease of 62% in the number of seeds was recorded for the top soil fraction; this was not the case for samples originating from below a depth of 10 cm, possibly reflecting the decrease in microbial activity with soil depth (van Delft et al. 1997). Soils E12 and E22, which showed high Striga seedbank density and high Striga incidence, could be considered conducive to Striga, whereas soil E27 could be considered a potential Striga-suppressive soil. Although this regression analysis might not provide a conclusive means to categorize field soils as Striga-conducive or Striga-suppressive, it can serve as a lead to further interrogate these soils for Striga-suppressive physicochemical or microbiological traits. Furthermore, Striga seeds were detected in soil samples collected from push-pull fields (E01, E07, E49 and E50, having 1, 3, 12 and 1 Striga seeds per 150 g of soil, respectively), although no or low Striga incidence was observed during soil sampling (Fig. 4b). This result is in line with earlier observations of low Striga incidence in push-pull fields, but also suggests that Striga seeds may persist in soils under push-pull management, which is assumed to diminish the seedbank of this parasitic weed. Whether the Striga seeds detected in these and other soils tested in this study are still viable remains to be determined. Hence, selecting marker genes that distinguish viable non-dormant from viable dormant seeds, and designing primers that differentially amplify the genomic DNA/RNA extracted from these seeds, is the next research priority to gain more detailed insight into Striga seedbank dynamics.