Effect of Soil Water on GPR Estimation of Bulked Roots, Methods, and Suggestions

Brody L Teare (  blteare@tamu.edu ) Texas A and M University: Texas A&M University https://orcid.org/0000-0001-7122-2127 Henry Ruiz Texas A&M University Afolabi Agbona Texas A&M University Matthew Wolfe Texas A&M University Iliyanna Dobreva Texas A&M University Tyler Adams Texas A&M University Michael Selvaraj Centro Internacional de agricultura Tropical CIAT: Alliance of Bioversity International and International Center for Tropical Agriculture Dirk B Hays Texas A&M University

In 2017, Delgado et al. reported on the use of ground penetrating radar (GPR) for estimating cassava root mass and detecting the onset of root bulking [17]. A later 2019 study by Delgado et al. furthered the work by performing high density scans on buried roots in a climate controlled sand box [18]. This study was able to approximate 3D models of the buried roots. GPR has also been used to estimate the root mass of sugar beet, wheat, and peanut [19][20][21]. GPR is a geophysical tool for detecting belowground features such as fault lines or buried utilities [22]. GPR works by emitting a pulse of electromagnetic energy into the ground, where it is either transmitted, absorbed, scattered, or re ected. Re ections are caused by changes in dielectric permittivity, a measure of how strongly molecules can be polarized. In agricultural soils, the primary drivers of dielectric are soil texture and water content, with water content having the greater in uence by an order of magnitude. Therefore, the high water content of roots has the potential to re ect GPR signals.
In general, GPR systems consist of system electronics which generate and process the signal, a transmitting and receiving antenna or antenna array, and computer-based control and capture software.
GPR data are in the time domain, and the images are not representative of spatial relations, making GPR data di cult to interpret visually. This is because the GPR is directly measuring time-of-ight for the signal, the speed of which is controlled by the dielectric of the medium, such that the velocity (V) can be estimated by the ratio of the speed of light in a vacuum (C) to the square root of the dielectric (ε) (see Eq. 1). In many cases, GPR data are used qualitatively for the location of objects rather than quantitatively [23]. Soil moisture has the effect of attenuating GPR signals; consequently, it is standard practice to prefer dry soil for GPR measurements [24]. However, in an earlier experiment which attempted to estimate ne roots with GPR, Liu et al. suggest that wet soil may improve data quality [25]. Equation 1 The velocity (V) of an electromagnetic wave is dependent on the dielectric (ε) of the medium in which it travels, and is relative to the speed of light in a vacuum (C). Delgado et al. (2017) used a small radar system which was carefully passed along the soil surface in a grid around each plant, and then measured the depth of each root to allow a supervised processing method. While this method serves as proof of concept, it is unsuitable for high-volume phenotypic evaluations common in the early-stage testing in plant breeding. GPR systems commonly use antennas which are in contact with the ground, called ground coupled. In agricultural systems, because of the soft, uneven soil, and the likelihood of standing plant mass, an air-launched antenna (antenna not ground coupled) is more appropriate. However, lifting the antenna from the ground has the potential to add observational error to the data. Air-launched antennas tend to suffer from increased ground clutter, which is the bright re ection caused by the interface of the air and soil, and any material at or near that interface, such as plant mass. Additionally, variations in the orientation of the antenna, whether caused by wobbling of the cart or other factors, can introduce variations in the measurement. Lastly, all GPR systems potentially receive interference from outside sources, such as cell phone signals or other nearby instrumentation. These often can be minimized during the collection by stacking, which is the rapid and automatic collection and averaging of several GPR pulses.
In this paper, we describe a eld ready GPR system which is more suited to high volume applications in estimating bulked root mass by using an air launched antenna array. We describe preliminary experiments in controlled eld settings using a model root crop, daikon radish, and novel data processing methods for extracting quanti able data from GPR scans. Importantly, the effect of soil water was explored. We present methods for collection, and suggest some good practices which future researchers should consider. The strength, limitations, and future potential of GPR will discussed.

Location
A raised bed of loamy sand was built in the Brazos River Bottoms, near Texas A&M University in College Station, Texas (Fig. 1). The bed is built of concrete blocks, stacked approximately 2 meters high, and the bed is approximately 3.5 by 22.5 meters long. The soil is sandy loam, transported from nearby farmland, and is kept free of weeds, therefore, the soil is homogenous and without distinct horizons. The soil was broken with a shallow till, leveled, and settled by watering before experiments began. 21 plots were measured out at 80 cm intervals, and marked off with stakes and string, then holes were carefully dug in the center of each plot. The holes were square shaped, approximately 65 cm on a side, with at bottoms and measured 15 cm deep from the surface. The number of plots was limited by available space.

Root Mass
Daikon radish were purchased from local grocery stores then weighed and labeled individually in random order. Daikon radish were used rather than cassava because of local availability, cassava being unavailable in the required quantities. Daikon radish were considered an appropriate model root because of similarities in size and shape. The roots were placed in the holes horizontally, and arranged to maximize the angle between adjacent roots and mimic the root growth of cassava. The number of roots in each plot was varied between 1 and 5 to increase variation in plot mass and variation in root orientations ( Figure 2). Enough roots were obtained to ll 19 plots. Per plot root mass ranged from 542 g to 2931 g.

Sensors
Campbell Scienti c CS655 soil moisture sensors (Logan, Utah, USA) were placed in the rst and last plots, at two depths: 5 cm and 20 cm. The sensors were inserted horizontally into the undisturbed soil on the side of the plots. Moisture levels were recorded before radar scanning. Additionally, 2 at metal plates were place in the rst and second plot at the same depths as the sensors, so that the plates straddled the root zone. These were meant to demarcate the root zone in the GPR data by acting as distinct re ectors.
The radar sensor used was an experimental loaded-vee dipole array, manufactured by IDS Georadar (Pisa, Italy) [26-28]. The array consists of 4 transmitters and 4 receivers in alternating pairs, each spaced 4 cm from adjacent antennas (Fig.3). The antennas are wideband with a center frequency of 1.8 GHz. The radar captures 512 samples over 18 ns, and pulses every 1 cm, as measured by an encoder wheel. Channel con gurations paired every antenna with its directly adjacent neighbor, giving a total of 7 channels, each offset by 4 cm. The sampling time increases with the number of channels, and in this case prevented automatic stacking as that would result in lost data due to hardware limitations of sampling speed.
The array was air launched, and mounted on a 4 wheeled cart that straddled the plots and placed the bottom of the antenna enclosure 39 cm from the ground surface (Fig. 4). The antenna was pointed directly at the ground, and a plastic rod was attached at the center of the enclosure to give a ground indication of the nadir of the radar. The plastic rod indicated the center of the antenna at ground level, and was used as a reference for marking the plots digitally in the capture software.

Soil Moisture
The experiment was scanned at 5 different water contents, beginning with a 'dry' state. Two oscillating sprinklers were placed in the eld such that the sprays evenly covered the entire expanse without overlap. The sprinklers were measured to provide about 13.5mm per hour by collecting water in two 5-gallon buckets placed in the sprinklers' path. The sprinklers were run for times varying from 2 hours to 8 hours, then the eld was allowed to rest between 4 and 48 hours before scanning, allowing surface pools to drain and attempting to let subsurface moisture equalize spatially. Table 1 shows the percent volumetric water content (VWC) of each treatment, and the standard deviation across the sensors.

Capture
The GPR cart was assembled in the eld and given time to equilibrate to ambient temperatures. Before collecting data, several scans were passed across the entire transect to allow the electronics to "warm up". More than 1 meter was allowed between the starting position of the cart and the rst plot, and the ending position and the last plot, to ensure the radar captured the entire extent. Using the plastic rod ( Fig. 4e) and string as an indicator, a digital marker ( ducial) was placed in the data in between each plot. The experiment was scanned 6 times for each treatment.

Data Processing and Analysis
Data was processed using GPR Studio version 1.0 (Crop Phenomics LLC, College Station, TX, USA), a Python software library developed for the quantitative analysis of GPR data. The software utilizes published data processing libraries and custom-built functions speci c to GPR analysis. Analysis was conducted in two stages.
In Stage 1, each scan was separated into plots based on the digital markers placed in the data during eld capture. The plots were then ltered to only those containing roots. Data were subset to the approximate root zone, then passed through a Butterworth bandpass lter, removing noise below 0.5 GHz and above 1.05 GHz. The 7 channels were each standardized to themselves by subtracting the channel mean from each value, and dividing by the channel standard deviation, similar to how a statistical z-score is calculated. This removed offset differences between channels caused by automated signal calibration in the eld. Standardized channels were then squared to move all values to the positive domain and minimize the background information which tends to gather about the mean, or 0 in standardized data. Channels were interpolated into a 3D cube using linear interpolation. A horizontal window, or time slice, 5 rows deep was passed from the top of the cube to the bottom, summing the amplitude in each window, which can be considered an indicator of total re ected energy in that window. Several window depths and alternative measurements, such as window variance, were tested -the most effective is reported here. Window values were correlated against known root mass in each plot, the results ltered for p-value < 0.1, while also controlling for appropriateness in the depth of the window and consistency between observations. The depth with the lowest p-value was stored for further processing. The cumulative energy values for each plot were divided by the length of that plot, in scan columns, resulting in relative energy density (Fig. 5).
In Stage 2, the results of Stage 1 for repeated observations were averaged together to reduce observational variance. Table 1 shows the mean energy density per plot with the standard error of the mean (SEM), and the coe cient of variation (CV) for each plot and treatment, demonstrating the observational variance in each repeated scan. Observational variance is an indication of noise in the data, which is a signi cant source of error here, and is further discussed below. Repeated observations help to remove the error through averaging.
Linear regression analysis was performed between the averaged energy density and the known root mass. Table 2 Observational variance per plot, the mean relative energy density for each plot and treatment, and the coe cient of variation. Plots are arranged in order of increasing root mass to aid interpretation. The metal plates placed above and below the root zone were not identi able in the radar data, possibly because of the type of paint applied to them. Therefore, it was not possible to use them to ne tune the root zone as had been planned.

Results
Our results demonstrate a signi cant relationship between re ected GPR signal and bulked root biomass. Additionally, the results demonstrate the importance of a homogenous dielectric environment in the soil, independent of water content. Lastly, we show that for relatively shallow agricultural studies, dry soil is not necessarily superior to wet soil for GPR measurement, a nding which supports the hypothesis of Liu et al, 2018 [20].
All treatments were signi cant at the p < 0.05 level. The nal treatment, which had the most homogenous wet soil but not the wettest, showed the strongest correlation, followed by the wettest treatment, then the dry treatment, which had the least variation in soil moisture. Larger variation in soil moisture decreased the strength of the correlations (Fig. 6). Although the number of plots was low, the repeated measurements and multiple treatments reinforces the probability that GPR features are indicative of root mass, and are not random. Dividing the sum of energy by the scan length modestly improved the correlations by adjusting for the effect of scan length, indicating that scan length was not a signi cant contributor to the correlations, but that variations in scan length introduced error.
The window depth of signi cant correlations varied slightly between treatments, which is expected as an effect of the varying dielectric. The depth of the dry treatment was greater than the wet treatments, which is unexpected, and will be discussed below. Standard deviation of the plot size, as measured by scan lengths, was 2.86 cm, less than 4% of the target plot length of 80 cm, and was randomly distributed in relation to plot number, observation, and treatment. Standard deviation between observations, within treatments, was not correlated to volumetric water content (VWC), variation of VWC, or plot biomass. The only exception for this is the Irrigation 4 treatment, in which relative energy density deviation was correlated to plot biomass at the p < 0.05 level.

Discussion
Rapid estimation of bulked root mass is possible with GPR. These results show correlation strength up to 79% using these methods. Further, we have demonstrated that increased VWC can improve the detection of bulked roots, as long as the dielectric is homogeneous across the study. The bulk dielectric of soil is driven primarily by water, and the interfaces of dielectric change cause the re ection of GPR energy. With a su ciently high signal frequency, soil structure has the potential to introduce noise in GPR data through the minute re ection and scattering of EM energy, driven by soil features such as compaction layers, aggregates, and pores. By increasing the VWC, some soil pores ll with water, and the dielectric heterogeneities are reduced, leading to less noisy GPR data. Additionally, as the dielectric of the soil increases, the signal velocity reduces, effectively increasing the sampling resolution of the system, as the sampling is a function of time.
These two factors may explain why increased VWC improved the strength of correlation. Indeed, we hypothesize that as VWC variance decreases, the strength of the correlation should increase, possibly maximizing the predictive potential near eld capacity. Unfortunately, soils near eld capacity are easily compacted, and are di cult to work in. Therefore, some compromise must be found to maximize predictive power of the GPR and minimize the impact and di culty of eld work. This optimal level of soil moisture is most likely dependent on soil texture, and could be expressed as a fraction of eld capacity. Further studies in multiple soil types should lead to standardized recommendations of optimum water content for major texture groups.
This study, like others before it, presents a supervised correlation -that is, the depth and mass of roots is known, so it becomes less di cult to determine the optimum depth of radar information to analyze. The window of analysis is relatively narrow -only ve rows, or approximately 2 cm of soil depth -and selecting the correct depth without previous knowledge of the root depth is di cult at this time. As research continues, it may become possible to distinguish the zone of highest information density, and researchers are already working towards that goal [21]. In this study, however, the noise was too great to establish the root zone from only GPR data. For this application, noise may be considered as all recorded energy which is not re ected by plant roots. As discussed earlier, GPR systems record all intercepted energy in the antenna range, regardless of the energy source. Noise may also be generated within the GPR system itself, and there has been some indication that the prototype system used here is not immune to this type of noise. This can be reduced by careful engineering, and through data ltering, if the inherent noise has been characterized. Other sources of noise include re ections and scattering caused by variations in soil structure, stones, clay clods, surface roughness, and above-ground biomass. Because we placed roots in the soil, rather than growing them, aboveground biomass was not an issue in this study, but has been in other data which are not yet published.
As noted, the soil type and water content have a large effect on GPR data. This variation makes it di cult to build a uni ed correlation between studies, elds, or even dates. As such, GPR remains a relative measure of root mass, suitable for ranking within a single eld and date, otherwise requiring a speci c calibration at each use. It remains possible that a correction factor could change this. Inclusion of multiple blank plots in the study may provide that correction factor, such that data can be normalized to the feature values of the blank plot, accounting for the soil type and moisture content. Further studies are planned to investigate this possibility. Without locational correction, GPR data may still be used to rank plots for genotype, and rankings may be compared across locations and/or time.
These results demonstrate the effect of soil moisture not just on the ability to pick out roots, but also the effect on the method. As mentioned in the results, the depth of best correlation was deeper for the dry treatment than the irrigated treatments, which was unexpected. GPR energy is re ected at the interface of dielectric contrast. When the object causing the re ection, such as a root, has su cient diameter, the re ection may happen at both interfaces on the signal vector -that is, it can re ect from both the top and bottom of the root. In other uses of GPR, the thickness of large objects can be estimated by measuring the distance between the top and bottom re ection. In this study, however, discreet returns were not observed; rather, the total re ected energy for a volume was measured. It is possible the re ected energy in the dry treatment was more intense at the bottom of the root, whereas the irrigated treatments re ected primarily from the upper interface. In all treatments, a nearly continuous range of window depths showed signi cant correlation to root mass, indicating that information about the root mass was present across a depth corresponding approximately to the root diameters. This also suggests the possibility of some distinct characteristic for that region, such that it may be possible to nd that region using machine learning techniques, so that supervised correlation is no longer required, and furthering the usefulness of GPR as a predictive tool.
In 2019, Delgado et al. reported a similar study designed to test commercially available GPR models in bulked root imaging [18]. A C-Thrue GPR system (IDS Georadar) was mounted to a computer controlled gantry and passed over a climate controlled sandbox. Cassava roots of varying sizes were buried at orientations parallel, orthogonal, and 45° to the scan direction. A single antenna pair was passed in transects at 2.5 cm intervals over the sandbox, with signal pulses every 0.2 cm. The GPR data were interpolated to form 3D models of the buried roots and interpolated image dimensions were compared to physical dimensions. The study illuminates several important factors for the application of GPR to root measurement, namely, the superiority of vertical antenna polarization over horizontal, and the effect of root orientation on measurement accuracy. However, the study differs signi cantly from the current -the focus was on 3D imaging rather than mass estimation, a single antenna pair was used in a high-density grid rather than an antenna array, the antenna was ground coupled rather than air-launched, and the soil medium was air dry such that no effect of soil water content was studied. Finally, the data collection method was not appropriate for high volume phenotyping.
The application of GPR for the quanti cation of roots is still in its infancy, and signi cant research is required before it can be used as a predictive tool. We have shown here that GPR data contain information about root mass, but it is also clear that other factors in uence the data, and noise is a problem. Radar data is highly sensitive to processing parameters, such that adding or removing a step readily effects correlation. The presented methods utilized a multi-channel radar to rapidly collect 3D information. To produce the 3D information, the channels were interpolated using simple linear interpolation. Similar to other 3D data, such as LiDAR, care must be taken to align the interpolated entities. In the case of GPR, the primary point of alignment is usually the soil surface, because it is discrete and constant. In this study, the eld was level, and the antenna was facing straight down at nadir, resulting in well-aligned channels with consistent positioning of the surface between channels. This is not always possible. In many cases, the antenna array cannot pass directly over the center of the root mass because plants are still present, so the antenna may be angled to point towards the plant center. Additionally, errors in channel calibration can produce small offsets that change the apparent height of the antenna relative to the ground surface. Finally, uneven ground surface can cause differences between channels. In these cases, the ground surface must be identi ed in each channel so they can be aligned before interpolation, as described by Dobreva et al [21]. Automated methods of identifying the ground surface would greatly reduce the time required for channel alignment.
Though each application will have its unique problems, there remain several constant considerations which we suggest become standard practice when using GPR to measure roots. Foremost among these is to understand your radar system. Unlike visual tools such as LiDAR, GPR emissions are not highly focused and are generally shaped like an ellipsoid bubble, meaning the energy extends in front of, behind, and to the sides of the antenna. This is why at least 1 m was allowed between the cart and the rst study plot, so that initial readings would be outside the plot. This also means care must be taken for transitory re ectors, such as workers, to not enter the volume of sensitivity while collecting the data.
GPR data are highly dependent on the dielectric of the soil; therefore, it is strongly recommended that dielectric measurements always be made at the time of scanning. This can be done in many ways, such as measuring dielectric directly with a probe, measuring water content and converting using the Topp equation, or by burying a re ector at a known depth, which allows a velocity estimation by dividing the known depth below the surface by the difference in signal time from the surface to the re ector [24,29]. Knowing the dielectric, or the signal velocity (see Eq. 1), allows the conversion of data from the time domain to the space domain, enabling estimation of depth. Further, some GPR processing techniques require these parameters. Many studies will be interested in the root mass at certain depths, as is currently measured with destructive techniques. This is only possible if the signal velocity is known.
Published methods to date have relied on measuring re ected energy, whether by amplitude threshold and pixel counting, or summations of other features. These techniques are inherently tied to the volume of soil analyzed, meaning that plot size will auto-correlate with feature count. Plot size must therefore be carefully controlled. In this study, plot size was controlled in the eld through careful measurement and marking. Other studies have controlled plot size by cropping the data, and others have controlled by conversion to either feature density, root density, or both. We recommend the former whenever possible, as it protects the integrity of the data. However, current root phenotyping methods frequently use root density as a measurement and is acceptable to many researchers [30].
Whichever way the plot length is controlled, the data must be related to the eld. Some GPR systems are capable of integrating GPS data into the scan data, while others can utilize digital markers. Some have neither capability, thus plot positions must be derived another way, possibly by placing re ectors at plot ends. Experience dictates caution in the latter method -the re ector must be easily identi able in the radargram, and re ectors placed on the soil surface are easily lost in the surface re ection. In such cases, an aerial re ector is recommended.
Finally, based on the results of this study, care must be taken to ensure homogeneous dielectric environment at the time of scanning. Depending on the hydraulic conductivity of the soil, several days may be required after an irrigation event.

Conclusion
The use of GPR technology to quantify root mass in agricultural elds is still very young and is yet to be widely accepted. However, interest is growing as more studies are published showing the potential. With up to 63% explained variability (r 2 = 0.626, r = 0.792), this study con rms previous publications, and demonstrates the feasibility of an air-launched antenna array for rapidly collecting GPR-based root estimations. Further, it is the rst study to show improved data quality for wet soil over dry soil. It is novel in demonstrating the importance of dielectric homogeneity for estimating root mass. Though this study was small and should be con rmed with more samples, it serves as a proof of concept that merits further investigation. Importantly, we have demonstrated considerations in the use of agricultural GPR and begun the work of establishing a standardized method. BT performed the data acquisition, data processing and analysis, and wrote the manuscript. HR assisted with acquisition and created the software framework used in data processing. AA assisted with statistical analysis. MW assisted with data acquisition and interpretation. ID assisted in creating the software. TA provided substantial assistance in revising the manuscript. DH aided with conception, design, and data acquisition.

Figure 1
Study was conducted in large raised beds lled with sandy loam soil. Plots were carefully measured and marked by string.

Figure 2
Roots were placed horizontally to maximize the angle between adjacent roots, as space allowed. Plots contained between 1 and 5 roots.  The antenna cart had 4 wheels and straddled the plots. The antenna array was hung in the middle with a plastic rod on one side to indicate the position of the center of the array on the ground.