CIAT Pheno-i: an automated image analysis framework for HTFP
The increased use of UAVs in field phenotyping has considerably decreased hardware costs; however, image processing remains the major challenge for crop phenotyping scientists around the world [56]. As mentioned in the introduction, the intermediate steps needed to extract information from plot-level field experiments require full automation and integration. There is therefore a need for an accurate, robust, and automated analysis framework that builds orthomosaics and extracts phenotyping information for each image of micro-plot (breeding) or large-scale (precision agriculture) field experiments. Here, we describe the Pheno-i image analysis software (Additional File 1) developed by the CIAT phenomics platform (https://phenomics.ciat.cgiar.org/) and the automated orthomosaic generation pipeline. Any image analysis software should, above all, be cost-effective and easy to use, and should rapidly generate actionable data from time-series images irrespective of experimental plot size. Using the Agisoft Metashape Python API, we automated the orthomosaic and DEM generation process (Figure S1), reducing processing time by ~30% compared with our manual processing method and saving ~1.1 h for RGB imagery and ~0.33 h for MS imagery (Figure S5). The CIAT Pheno-i back-end design brings a significant improvement over a regular single-threaded Python implementation, reducing the processing time of MS imagery by up to 5x (Figure 4). These processing times were measured on two different CPU architectures (Table S3). The CIAT Pheno-i front end allows the user to create, upload, calibrate, visualize, and analyze orthomosaics in a map-based canvas, enabling non-programmers to analyze their own data over the internet.
The image analysis report is delivered in CSV format with a timestamp and a reference to the quantified plot-level data, which can be used either to develop plant models or simply to monitor crop health status. We offer CIAT Pheno-i as a simple, easy-to-use solution for extracting plot- and plant-level information.
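As an illustration of the kind of report described above, the following minimal sketch writes timestamped plot-level features to CSV. The column names (`timestamp`, `plot_id`, `feature`, `value`), the function `build_report`, and the example values are hypothetical; the actual Pheno-i report layout is not specified in this section.

```python
import csv
import io
from datetime import datetime, timezone

# Hypothetical column layout; the real Pheno-i report format may differ.
FIELDS = ["timestamp", "plot_id", "feature", "value"]


def build_report(plot_features, flight_time):
    """Return a CSV string with one row per (plot, feature) pair."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for plot_id in sorted(plot_features):
        for feature, value in sorted(plot_features[plot_id].items()):
            writer.writerow({
                "timestamp": flight_time.isoformat(),
                "plot_id": plot_id,
                "feature": feature,
                "value": f"{value:.4f}",
            })
    return buffer.getvalue()


# Illustrative values only, not data from the trials.
example = {
    "P001": {"NDVI": 0.8123, "CHuav": 1.52},
    "P002": {"NDVI": 0.7641, "CHuav": 1.31},
}
report = build_report(example, datetime(2020, 1, 15, 10, 30, tzinfo=timezone.utc))
print(report)
```

A long ("tidy") layout with one row per plot and feature is assumed here because it is easy to append new flights to and to filter by date; a wide layout with one column per feature would serve equally well.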
We validated the platform in proof-of-concept experiments with cassava genotypes over two seasonal field trials to demonstrate the end-to-end application. The results obtained from the platform are described below.
High-throughput field phenomics for aerial imaging of cassava
UAVs offer attractive advantages such as convenient operation and high spatial and temporal resolution with reasonable spatial coverage [57–59], which make it possible to document within-microplot variability in phenotyping field experiments [60,61]. UAVs are an invaluable tool for crop monitoring at large scale (e.g., [27,59,62–65]) and have proved useful for estimating canopy height and biomass in crops including rice [65], wheat [66], maize [30], sorghum [67], and peas [17]. In cassava, however, UAV-based high-throughput phenotyping methods still need to be standardized for feasibility and accuracy in estimating various phenotyping parameters, such as responses to biotic and abiotic stresses. So far, most studies have attempted to correlate morpho-physiological data with the productive potential (root yield) of the genotypes at the end of the crop cycle [68]. Consequently, these pre-breeding field experiments go through long selection cycles, leading to high maintenance costs. The correlations between important breeding traits at different phenological stages and UAV image-derived VIs are discussed below.
Relationship between UAV images derived features and canopy height
Canopy height (CH) is a key factor in cassava root yield, dry matter, leaf area, and plant architecture [69]. Collecting CH data within cassava field breeding programs is labor intensive and prone to assessment error. In this study, orthomosaics and DEMs were generated using the Agisoft Metashape API. Canopy metrics (CHuav, CCuav and CVuav) and VIs derived from high-resolution MS images (2.7 cm per pixel) were extracted through our CIAT Pheno-i web-based application. Pearson's correlation analysis between UAV features (VIs, CHuav, CCuav and CVuav) and CH at the EL and LBK stages showed that the UAV features were positively correlated with CH (Figure 5c and Figure 7a), except in trial two, where most of the VIs showed low and negative correlations at the DMA stage (Figure 7a). This poor correlation is mainly due to the saturation of VIs at later growth stages and to crop lodging. A significant correlation was found at the EL stage between manually measured CH and CHuav (Figure 8a). However, the best relationship was reached at the late bulking (LBK) stage in both trials, with r values of 0.89 and 0.92, respectively (Figure 5c, Figure 7a, and Figure 8b). Similar results were found in cotton using DEMs from MS cameras [70]. In trial one, among the VIs, the NDRE index showed a significant relationship with manually measured CH, with an r value of 0.83 at the LBK stage (Figure 5c). The CH estimates obtained by UAV were therefore credible, correlating strongly with ground-truth measurements. UAV-based CH measurement in cassava thus has great potential for use in physiological studies and genetic mapping experiments.
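The r values reported here are Pearson correlation coefficients between a UAV-derived feature and the corresponding ground measurement. A minimal sketch of that computation is given below; the function name `pearson_r` and the example heights are illustrative assumptions, not the platform's code or trial data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squared deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Illustrative plot-level heights in metres (not measured values).
ch_manual = [1.10, 1.35, 1.62, 1.80, 2.05]
ch_uav = [1.02, 1.30, 1.55, 1.85, 1.98]
print(round(pearson_r(ch_manual, ch_uav), 3))
```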
Relationship between UAV metrics and canopy structure related traits
Time-series measurements of canopy-related traits are very useful for developing crop growth curves. Estimating AGB traits such as canopy volume is laborious, destructive, and time-consuming, and therefore calls for an easier and more convenient method [71]. In cassava, AGB can provide valuable insights into the carbon assimilation mechanism and storage root development. In this study, canopy metrics such as CCuav and CVuav showed significant positive relationships with AGB across the phenological stages. In trials one and two, significant correlations (r = 0.80 and r = 0.54, respectively) were found between CCuav and AGB at the LBK stage (Figure 5b and Figure 6b). A similar relationship was previously reported between dry leaf biomass and UAV-derived green CC [72]. Also at the LBK stage, a similar relationship (r = 0.70) was found between CVuav and BGB in trial one (Figure 5a). The high-throughput canopy metrics tools developed in this study could provide quantitative data for novel traits that define canopy structure. Repeated measurements yield time-series data from which growth rates and dynamics can be estimated. Such non-invasive measurements are very useful for understanding genotype-specific responses to environmental stresses during the growth period. Cassava canopy structure data can also contribute to the development of root yield prediction models and could help cassava breeders during selection by providing early hints on the performance of novel lines.
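To make the canopy metrics concrete, the sketch below derives a canopy cover percentage and a canopy volume from a per-plot canopy height grid. The definitions used (cover as the share of pixels above a height threshold, volume as summed height times pixel area) and the 0.10 m threshold are assumptions made for illustration; the exact definitions of CCuav and CVuav are not given in this section. The default pixel size follows the 2.7 cm resolution stated earlier.

```python
def canopy_metrics(chm, pixel_size_m=0.027, min_height_m=0.10):
    """Return (cover_percent, volume_m3) for a canopy height grid in metres.

    chm is a list of rows, each a list of per-pixel canopy heights.
    min_height_m is a hypothetical threshold separating canopy from ground.
    """
    heights = [h for row in chm for h in row]
    if not heights:
        raise ValueError("canopy height grid is empty")
    canopy = [h for h in heights if h > min_height_m]
    pixel_area_m2 = pixel_size_m ** 2
    cover_percent = 100.0 * len(canopy) / len(heights)
    # Each canopy pixel contributes a column of height h over one pixel area.
    volume_m3 = sum(canopy) * pixel_area_m2
    return cover_percent, volume_m3


# Illustrative 3 x 3 plot, not measured data.
plot = [
    [0.00, 0.85, 1.20],
    [0.05, 1.10, 1.40],
    [0.00, 0.90, 1.05],
]
cover, volume = canopy_metrics(plot)
print(f"cover = {cover:.1f} %, volume = {volume:.5f} m^3")
```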
Correlation between LAI and UAV derived features
The leaf area index (LAI) is the one-sided leaf area per unit area of ground surface. The maximum LAI in cassava ranges from 4 to 8, depending on the cultivar and on the atmospheric and edaphic conditions that prevail during the crop growth stages [73]. Selection for higher LAI should favor high root yield, since there is an optimum relationship between root yield and LAI [68]. A positive contribution of LAI to cassava yield has also been reported [74], and a significant, high correlation between ground cover and LAI has been reported in grass, legume and crucifer crops [75]. Measuring LAI is a tedious [76] and time-consuming process, so an image-derived trait that complements LAI could be very useful. To establish this relationship, LAI was measured in trial two and correlated with UAV-derived canopy metrics and VIs. The canopy metrics (CCuav and CVuav) and VIs showed highly significant positive correlations with LAI at all the tested phenological stages, whereas CCuav and CVuav correlated with DMA with an r value of 0.56 (Figure 7b). Among the tested VIs, NDREI showed highly significant correlations with LAI at the EL and DMA stages, with r values of 0.53 and 0.63, respectively (Figure 9a, d), whereas the correlation decreased slightly at the bulking stages (EBK and LBK) (Figure 9b, c). Highly significant correlations were also found between LAI and NDVI at the EL and DMA stages, with r values of 0.55 and 0.59, respectively (Figure 7b). Strong correlations between LAI and UAV-derived NDVI have also been reported in crops such as rice [65] and sorghum [67], and for NDREI in bread wheat [77]. These results indicate that NDREI could explain the green leaf area during senescence.
Relationship between UAV features and above-ground biomass
Breeding early-vigor, fast-growing cassava genotypes is ideal for tackling several issues, especially in the early stages of crop management. Vigorous, early-growing cultivars are less sensitive to a lack of weed control than non-vigorous, slow-growing types. Above-ground biomass (AGB) estimation in cassava is laborious and time-consuming, and requires a multi-step process: plants are destructively sampled from the field plot and oven dried before being weighed to assess the fresh and dry biomass of each sample. This multi-step destructive process is prone to error, from variability in the sampled area within the plot to the potential loss of material during collection and transport [6]. In the present study, we estimated fresh canopy biomass in cassava using remote aerial imaging. Results from both trials revealed significant positive correlations between VIs (NDRE, NDVI, GNDVI, BNDVI, NDREI, NPCI and GRVI) and AGB at three phenological stages (EL, EBK and LBK). At the LBK stage, NDRE alone also showed a significant positive correlation with AGB in both trials, with r values of 0.84 and 0.65, respectively (Figure 5b, Figure 6b). Among the UAV-derived canopy metrics at the LBK stage, CCuav correlated significantly with AGB, with r values of 0.54 or higher (Figure 5b, Figure 6b). Our results indicate that EBK is one of the key phenological stages for predicting AGB through remote sensing in cassava. When VIs from the three phenological stages (EL, EBK and LBK) were combined, trial two showed good relationships between AGB and NDRE, NDVI, GNDVI, BNDVI, NDREI, NPCI and GRVI, with r values of 0.71, 0.62, 0.66, 0.59, 0.64, 0.55, and 0.66, respectively (Figure 6b, Figure 10a).
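Several of the VIs named above are normalized differences between the near-infrared band and one other band. The sketch below computes four of them from plot-mean reflectances using their commonly cited forms; these are assumed to match the definitions used in this study, which are not restated in this section. NDREI, NPCI and GRVI are omitted because their formulations vary between sources. The function names and reflectance values are illustrative.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b), or NaN when the denominator is zero."""
    denom = a + b
    return (a - b) / denom if denom != 0 else float("nan")


def vegetation_indices(blue, green, red, red_edge, nir):
    """Common normalized-difference VIs from band reflectances (0 to 1)."""
    return {
        "NDVI": normalized_difference(nir, red),
        "NDRE": normalized_difference(nir, red_edge),
        "GNDVI": normalized_difference(nir, green),
        "BNDVI": normalized_difference(nir, blue),
    }


# Illustrative plot-mean reflectances, not values from the trials.
vis = vegetation_indices(blue=0.04, green=0.08, red=0.05, red_edge=0.20, nir=0.45)
for name, value in vis.items():
    print(f"{name}: {value:.3f}")
```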
Relationship between UAV derived VIs and below-ground biomass
Measuring root biomass non-destructively across different cassava varieties will help breeders efficiently select cultivars with favorable rooting architectures, e.g. root area and harvesting [78]. In this way, the impact of agronomic research on root bulking through unique agricultural practices can also be assessed. Destructive root sampling in cassava requires large populations and trials that are laborious and expensive [8]. A rapid, non-destructive process for estimating below-ground biomass (BGB) across different environments would reduce the time, cost and sample size required for phenotypic data collection. In this study, we assessed the capability of MS aerial imaging to estimate BGB. In both trials, all the tested VIs showed positive and significant correlations with fresh BGB at the EL and LBK stages; the DMA stage was the exception (Figure 5a, Figure 6a). The later stage (DMA) of the cassava crop cycle was the least correlated, which we attribute to the fact that at later crop stages (i.e. when the roots are actively accumulating dry matter), the cassava canopy tends to senesce.
In both trials, the NDRE, NDVI, GNDVI, BNDVI, NDREI, NPCI and GRVI indices showed significant positive correlations with fresh root biomass, with r values ranging from 0.18 to 0.72 from the EL to the LBK stage; the highest correlation coefficient (r = 0.72) corresponded to NDRE at the EL stage in trial two (Figure 5a, Figure 6a). The canopy metrics (CCuav and CVuav), in turn, exhibited their strongest correlations with BGB at LBK in trial one, with r = 0.70 for both (Figure 5a). We also found that at the DMA stage, some VIs and the CHuav and CVuav metrics showed poor, non-significant correlations (Figure 6a). In addition, the multi-temporal analysis improved the correlations with BGB: combining VIs from the [EL+EBK] stages gave a highly significant correlation (r = 0.77) for GNDVI (Figure 6a, Figure 10b). Generally, from three to five months after planting (MAP), intense development of the photosynthetic apparatus and the aerial part of cassava plants is observed. Consequently, vigor in this phase produces the greatest increase in AGB, which is in turn reflected in fresh root yield [13]. The relationships between aerial imaging features and BGB obtained in this study are encouraging and can serve as an add-on to our ongoing ground penetrating radar (GPR) research on predicting BGB in cassava. Furthermore, the data produced by above-ground (UAV multispectral) and below-ground (GPR) sensors could be merged using a high-precision Geographic Information System (GIS) to achieve a more comprehensive estimation of BGB.
Cassava root yield predictions using ML models
Accurate estimation of crop yield is essential for plant breeders. Yield is a very important harvest trait that integrates the cumulative effect of weather and management practices throughout the entire growing cycle [79]. Early detection and management of problems associated with yield limitations can help increase productivity [4,23,80]. Crop yield prediction models could aid early decision-making and optimize the time required for field evaluation, thus reducing the resources allocated to research programs [81]. Furthermore, predicted yield maps could be used to implement variable rate technology (VRT) systems in spatial databases, thereby enabling precise input application across the entire field [82]. Traditional cassava growth models have certain limitations, such as the high cost of the inputs required to run the models, the lack of spatial information, and the quality of the input data [13]. Remote sensing approaches can provide growers with final yield assessments and show variations across the field [79]. For example, MS imagery can describe crop development across time and space for potato tuber yield forecasting in a cost-effective manner [81,82].
To our knowledge, there are no predictive models for cassava root yield using aerial imaging and ML techniques. We therefore explored ML techniques as a means of early prediction of cassava root yield using MS UAV remote sensing at field scale. PCA and PCR analyses were used, with more than 600 predictor variables retained to train the models. The PCA results showed that the first 10 components explain 90% of the variance (Figure 11), and PCR after 10-fold cross-validation achieved an R2 of 0.89. The most important component, PC1, explained 55.6% of the total variance (Table 3). Using the first four components provided by PCA (80% of the total variance) and by PCR, SVM, RF, kNN, and ANN models were built to predict BGB from multi-temporal VI combinations and canopy metrics (Figure 12). Across the four ML models, performance was consistent, with small differences between the PCA and PCR techniques ranging from 0 to 9% across the metrics (Table 4). PCA performed slightly better than PCR in terms of RRMSE and R2, which ranged from 20.51% to 22.73% and from 0.61 to 0.67, respectively. The RF model gave the best-balanced results, with a high R2 and the lowest RMSE, indicating the importance of VIs and canopy metrics for predicting BGB with MS sensors. Even though the accuracy of the developed models is not very high, given how laborious cassava phenotyping is, CIAT Pheno-i can still help breeders reduce their time and effort. Model accuracy could likely be improved by adding other features such as climate and soil data and more time points.
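The evaluation metrics quoted above can be sketched as follows. This is a minimal illustration that assumes R2 is the coefficient of determination and RRMSE is the RMSE expressed as a percentage of the observed mean; the study may use different conventions (for example, R2 as a squared correlation). The function names and the interleaved fold assignment are hypothetical and are not taken from the Pheno-i code.

```python
import math


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rrmse_percent(observed, predicted):
    """RMSE relative to the observed mean, in percent."""
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    mean_obs = sum(observed) / len(observed)
    return 100.0 * math.sqrt(mse) / mean_obs


def k_fold_indices(n_samples, k=10):
    """Yield (train, test) index lists; fold i tests every k-th sample from i."""
    for fold in range(k):
        test = list(range(fold, n_samples, k))
        train = [i for i in range(n_samples) if i % k != fold]
        yield train, test


# Illustrative observed and predicted root biomass values (not trial data).
obs = [2.1, 3.4, 2.8, 4.0, 3.1]
pred = [2.4, 3.0, 2.9, 3.6, 3.3]
print(f"R2 = {r_squared(obs, pred):.2f}, RRMSE = {rrmse_percent(obs, pred):.1f} %")
```

In a 10-fold setting, a model would be fitted on each `train` subset and the two metrics computed on the pooled or averaged `test` predictions.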