Exploring the possibilities of remote yield estimation using crop water requirements for area yield index insurance in a data-scarce dryland

The need for accurate and meaningful agricultural data as a basis for sound policies and informed decisions is an increasing concern for policymakers in developing countries such as Ethiopia, where such information is usually scarce. In Ethiopia, information on the impacts of climate change on crop yields is rarely available at the lowest administrative levels, such as wards or villages, for the benefit of the grassroots populace. Thus, this research sought to evaluate the use of crop water requirements in the estimation of crop yield. FAO’s CROPWAT 8.0 application was used to pre-determine the possibility, preceding the use of CROWRAYEM. Both CROPWAT and CROWRAYEM had high coefficients of determination when tested with survey data of barley and sorghum farmers’ yield for the 2015 to 2018 cropping seasons in semi-arid southern Tigray, northern Ethiopia. Furthermore, the infusion of the estimated crop yield into a recently published area yield index insurance payout structure increases the functionality of the proposed yield estimation model (CROWRAYEM).

their evapotranspiration, to facilitate planning of water management, and the scheduling of irrigation in the event of rainfall deficits (Bhat et al., 2017; Ewaid et al., 2019; Khan and Walker, 2015). The use of the CROPWAT model has given rise to improved decisions and practices in terms of timely, appropriate and supplemental irrigation. However, most studies have limited their application of the CROPWAT model to estimating crop water requirements (Bhat et al., 2017; Yazew et al., 2012), irrigation scheduling (Berhane et al., 2010), yield response and reduction (Loum et al., 2014), a combination of crop water requirement and yield reduction (Bana et al., 2013), or crop water requirement and irrigation schedules (Ewaid et al., 2019; Surendran et al., 2017).
This study explores the use of the crop water requirement of barley and sorghum in estimating farmers' yield and yield losses. This idea is considered novel in the Ethiopian context, as no other known work has attempted yield estimation using our approach. The models were validated and found highly promising, especially for remote calculation of the area's yield insurance payouts based on the earlier work of Eze et al. (2020). A study such as this addresses an expressed shortcoming of index insurance, namely the lack of empirical validation and monitoring of experiences in data-scarce regions (Clarke, 2016). Our methods are presented in simple, explicit steps and could aid the replication of such a study anywhere in the world.

The study area
The study was undertaken in two selected woredas (districts) of Tigray region, Ethiopia.
A purposive sampling approach was adopted in selecting the study communities, based on their contribution to regional and national food security. The study area is located between 12°38'44'' N and 12°57'10'' N latitude and 39°27'18'' E and 39°55'56'' E longitude, with an altitude ranging from 1507 to 3760 m above sea level (m.a.s.l.).
The agro-climatology of the study area is classified as semi-arid, with predominantly irregular rainfall accompanied by frequent droughts. From analysis of the climate data obtained, the mean annual temperature of the area is between 13 °C and 29 °C. Annual rainfall ranges from 624 mm to 827 mm, with an average of around 725 mm. Places at higher altitudes within the area have a bimodal rainfall pattern, with the main rainy season lasting from June to September (locally called "Kiremt") and the shorter rainy season from February to May (locally called "Belg"). Four traditional agro-ecological divisions are used for categorization in Ethiopia, namely: Kola (lowland, 1400-1800 m.a.s.l.), Woina-dega (midland, 1800-2400 m.a.s.l.), Dega (highland, 2400-3400 m.a.s.l.) and Wurch (very cold highlands, above 3400 m.a.s.l.). However, only the Kola and Dega agro-ecological divisions are found in the study area (Table 1).

Yield data (2015-2018)
These data, collected in the first quarter of 2019, came from direct interviews with farmers, in lieu of historical yield data, which were unavailable at ward and district levels. An IPAQ device was used to delineate agricultural fields, to aid the collection of more accurate field areas and realistic crop yield measurements from farmers. Thirty-four (34) farm owners (sorghum = 16; barley = 18) engaged in only rain-fed cropping in the highland and lowland areas were used for this study. Although more farm fields were randomly selected, only farmers who could provide a history of the crop yield obtained in the last four years were eventually included.
In most cases, farmers were also found in pairs and in groups of at least three, so group consensus strongly influenced farmers' responses, which increased the accuracy of the farmers' memory-based crop yield data. The unit of measurement of crop yield is quintals per hectare (Qt/Ha).
Barley and sorghum were purposively selected as the subject of this study due to their nutritional value and their daily intake in Ethiopia. Also, barley is the major cultivated crop in Endamekhoni, due to agro-climatic factors and a ready market in the nearby beer factory, while sorghum is popular in Raya Azebo for agro-climatic reasons and as a cushion for the availability of livestock feed in the event of drought. These reasons necessitated the restriction of this study to the most commonly cultivated crops, which also increased the accuracy of yield recall from farmers' memory. Barley was therefore selected for consideration in Endamekhoni district and sorghum in Raya Azebo district. According to Shuai et al. (2018), barley, sorghum, teff, maize and wheat are the core of the Ethiopian food economy, covering about 75% of the total cultivated area.

Meteorological data (2015-2018)
The Climate Hazards group InfraRed Precipitation with Stations (CHIRPS) gridded precipitation dataset from 2015 to 2018, obtained from the IRI Data Library (ftp://ftp.chg.ucsb.edu/pub/org/chg/products/CHIRPS-2.0/), was used for this study. CHIRPS is described as a combination of a "high-resolution climatology, time-varying cold cloud duration precipitation estimates, and in situ precipitation estimates", and has very low errors and high research quality (Funk et al., 2014; Shukla et al., 2017). Being grid-based, CHIRPS is preferred to point-station precipitation data for this study because it is highly reliable, provides data even for areas without a meteorological station, and has been shown to be accurate for use in Ethiopia (Bayissa, 2018; Gebrechorkos et al., 2018).
The daily Enhancing National Climate Services (ENACTS) maximum and minimum temperature data (2015-2018) were obtained from the Ethiopian National Meteorological Agency (NMA) and used for this study. The NMA refined the global ENACTS data (http://iri.columbia.edu/resources/enacts/) further using several in-situ stations in Ethiopia. The production of this temperature data has been described by the International Research Institute for Climate and Society (2016) as a combination of observation records, digital elevation models and reanalysis products for temperature; being highly reliable, it is considered appropriate for this study.

CROPWAT
The climate data (temperature and rainfall data) were sourced from the IRI Data Library and resources page, while the crop data for the crops of interest (barley and sorghum) were derived from the FAO 56 Manual (Allen et al., 1998) and served as input into the relevant fields of the software. Soil data used for the area include black clay soil for Endamekhoni and red sandy loam soil for Raya Azebo. Planting dates were obtained from farmers during the survey.

Model efficiency evaluation
Four statistical analyses were implemented to evaluate the performance of both CROPWAT 8.0 and the crop water requirement analysis yield estimation model (CROWRAYEM). These are the coefficient of determination (R²), root mean square error (RMSE), mean absolute error (MAE) and the index of agreement (d).
In equations 1 to 4, the predicted values are the model-estimated yields and the observed values are the farmers' reported yields. The R² (equation 1) is the square of the Pearson's product moment correlation of simulated and recorded yield. It can be interpreted as how well the models explain variation in farmers' reported yield. The RMSE (equation 2), which squares the errors and returns positive numbers, is used so that over- and under-predictions by the models do not cancel out. However, because RMSE uses average squared differences, large differences are weighted more heavily. The classes of the index of agreement (d) follow […] (2017) and Liu et al. (2017), viz: "excellent" (d > 0.9), "good" (0.8 ≤ d < 0.9), "moderate" (0.7 ≤ d < 0.8) and "poor" (d < 0.7).
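The four evaluation statistics can be sketched in Python as follows. This is a minimal illustration that assumes the standard textbook definitions (squared Pearson correlation for R², and Willmott's form of the index of agreement for d); the exact forms of equations 1 to 4 are not reproduced in this text, and the function names `evaluate_yield_model` and `classify_agreement` are hypothetical.

```python
import math


def evaluate_yield_model(predicted, observed):
    """Return R², RMSE, MAE and the index of agreement d for paired yields."""
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equally long series with at least 2 values")

    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    pairs = list(zip(predicted, observed))

    # R²: squared Pearson correlation between predicted and observed yield.
    cov = sum((p - mean_p) * (o - mean_o) for p, o in pairs)
    ss_p = sum((p - mean_p) ** 2 for p in predicted)
    ss_o = sum((o - mean_o) ** 2 for o in observed)
    if ss_p == 0 or ss_o == 0:
        raise ValueError("R² is undefined for a constant series")
    r2 = cov ** 2 / (ss_p * ss_o)

    # RMSE and MAE: average size of the prediction errors.
    sq_err = sum((p - o) ** 2 for p, o in pairs)
    rmse = math.sqrt(sq_err / n)
    mae = sum(abs(p - o) for p, o in pairs) / n

    # Index of agreement (Willmott's form, assumed here).
    potential = sum((abs(p - mean_o) + abs(o - mean_o)) ** 2 for p, o in pairs)
    d = 1 - sq_err / potential

    return {"r2": r2, "rmse": rmse, "mae": mae, "d": d}


def classify_agreement(d):
    """Map d onto the classes quoted in the text.

    The quoted classes leave d == 0.9 unassigned; it is treated as "good" here.
    """
    if d > 0.9:
        return "excellent"
    if d >= 0.8:
        return "good"
    if d >= 0.7:
        return "moderate"
    return "poor"
```

For example, `evaluate_yield_model([10, 20, 30], [12, 18, 33])` takes model estimates first and farmers' reported yields second, both in Qt/Ha, and the resulting `d` can be passed to `classify_agreement`.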

Preliminary analyses
The crop irrigation schedule was computed for each farmer based on the required parameters. Following the computation of the cumulative yield reduction (in percent), yield values were computed for each farmer, based on the average expected crop yield during years of normal precipitation and optimum harvest. Thus, we derived the "CROPWAT yield data", in addition to the farmers' survey crop yield data, for correlation analyses. Both datasets were plotted to produce a trend line, with a linear equation and coefficient of determination (R²) generated (Figures 3 and 4). The estimated yield was evaluated using relevant statistics (Table 2). […] and index of agreement (d) and is good in years of (above-)normal precipitation and yield.
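The conversion from a cumulative yield reduction to a yield value can be sketched as below. This is an illustration only: it assumes the reduction is applied as a simple percentage of the average expected yield in a normal year, and the function name and the example figures are hypothetical rather than taken from the survey.

```python
def yield_from_reduction(expected_yield_qt_ha, reduction_pct):
    """Estimate yield (Qt/Ha) from a cumulative yield reduction in percent.

    expected_yield_qt_ha: average expected yield in a year of normal
        precipitation and optimum harvest (assumed input).
    reduction_pct: cumulative yield reduction reported by the model, 0 to 100.
    """
    if not 0 <= reduction_pct <= 100:
        raise ValueError("reduction_pct must lie between 0 and 100")
    return expected_yield_qt_ha * (1 - reduction_pct / 100)


# Hypothetical example: an expected 25 Qt/Ha with a 20% cumulative reduction.
estimated = yield_from_reduction(25.0, 20.0)
```

The estimated values can then be paired with the farmers' reported yields for the trend line and the evaluation statistics.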

Farmers Reported Yield vs CROPWAT8 Barley Yield Estimate
Conversely, in Endamekhoni, the coefficient of determination (R²) indicates a very strong prediction of farmers' reported yield, ranging from 68% to 86%.

Crop water requirement analysis yield estimation model (CROWRAYEM)
Following a strong relationship between the CROPWAT 8.0 predictions and farmers' reported yield in most of the years (Figures 3 and 4), […]. Furthermore, from the rainfall data (Rf), effective rainfall (Peff) is determined as Rf × 0.7. Moisture deficit (Def) is then obtained by deducting Peff from the crop water requirement (CWR).
Yield response to moisture deficit (Yr) is then calculated, and the value is multiplied by the maximum yield potential of the crop to determine its yield value (CWR yield) (Figure 5).
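The steps above can be sketched in Python as follows. The effective rainfall and moisture deficit follow the text directly; the form of the yield response (Yr) is not given here, so the sketch assumes an FAO-style linear yield response in which the relative yield loss equals a crop response factor `ky` times the relative moisture deficit (Def/CWR). The function name, the `ky` parameter and the example figures are hypothetical.

```python
def crowrayem_yield(rainfall_mm, cwr_mm, max_yield_qt_ha, ky=1.0):
    """Sketch of a crop-water-requirement-based yield estimate (Qt/Ha).

    rainfall_mm: seasonal rainfall (Rf).
    cwr_mm: seasonal crop water requirement (CWR), must be positive.
    max_yield_qt_ha: maximum yield potential of the crop.
    ky: assumed yield response factor (not specified in the text).
    """
    if cwr_mm <= 0:
        raise ValueError("cwr_mm must be positive")

    peff = 0.7 * rainfall_mm              # effective rainfall, Peff = Rf * 0.7
    deficit = max(cwr_mm - peff, 0.0)     # moisture deficit, Def = CWR - Peff
    relative_deficit = deficit / cwr_mm

    # Assumed linear yield response, floored at zero yield.
    yr = max(1.0 - ky * relative_deficit, 0.0)

    return {
        "peff_mm": peff,
        "deficit_mm": deficit,
        "yield_response": yr,
        "cwr_yield_qt_ha": yr * max_yield_qt_ha,
    }


# Hypothetical season: 400 mm rainfall, 500 mm CWR, 30 Qt/Ha potential.
season = crowrayem_yield(400.0, 500.0, 30.0, ky=0.9)
```

Clamping the deficit at zero means that a season in which effective rainfall exceeds the crop water requirement returns the maximum yield potential rather than a value above it.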
Results of the estimated yield of sorghum and barley are plotted (Figures 6 and 7) and compared for an evaluation of performance (Table 3). The sorghum estimate for Raya Azebo is quite poor in 2015, but is higher and better in years of mild drought and normal precipitation. This is evident from the very low coefficient of determination (R²) and index of agreement (d) in 2015, while the estimate is good in years of (above-)normal precipitation and yield. For the barley estimate in Endamekhoni, the evaluation indices present diverse results, but the coefficient of determination (R²) indicates a moderate to good estimation of the farmers' yield.

Application of the use of CROWRAYEM estimated yields for insurance payouts
The outputs of the CROWRAYEM model were applied as yield inputs in the determination of insurance payouts, based on the area yield index insurance payout structure (equation 5) used by Choudhury et al. (2016) and Eze et al. (2020). The overall model evaluation statistics, especially for the CROWRAYEM, indicate prospects of improving the model's ability to estimate crop yields, owing to its high accuracy (Table 3).
All statistics used in evaluating the yield estimates for sorghum, the crop for the lowland area of Raya Azebo, showed an overall promising performance for the years under review, which is likely to improve further with more accurate yield data from farmers or other agricultural agencies.
On the other hand, the barley yield estimates from the highland areas show a gradually improving coefficient of determination (56% to 89%), but the other evaluation parameters applied indicate a weak performance of the model. A question arises as to whether other factors not considered by this study have an impact on crop yield and should be incorporated in crop yield models. Factors such as the agro-ecology of an area, in addition to other strong determinants of crop yield outlined by Batchelor et al. (2002), such as plant genetics, crop population and crop management, have not been considered in this study. Also, the heterogeneous crop performance within a plot raised by Sapkota et al. (2016) was not incorporated into the yield estimation model, which could also affect its accuracy.
For drought-prone Raya Azebo, the ability of these crop water requirement-based yield estimation models to provide approximate yield data is of great importance. Since, as reported by Abera et al. (2019), climate change adversely affects water availability in northern Ethiopia, in which Raya Azebo falls, the yield estimation mechanisms developed in this study could be adapted as a decision tool for investors. Insurance companies could leverage this possibility to establish their own yield estimation mechanisms, reducing the cost of physical travel for verification of crop losses each farming season. Hence, the determination and affirmation of crop losses recorded by farmers for compensation during droughts, within the area-yield-index-insurance framework, can be more easily achieved.
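To illustrate how an estimated yield could feed an area yield index payout, the sketch below uses a generic payout form in which compensation is proportional to the shortfall of the area yield below a trigger yield. It is not necessarily the payout structure of equation 5, which is not reproduced in this text; the coverage level, sum insured, function name and all figures are hypothetical.

```python
def area_yield_payout(estimated_yield, average_yield, coverage, sum_insured):
    """Generic area-yield payout: proportional to the shortfall below a trigger.

    estimated_yield: area yield for the season, e.g. a model estimate (Qt/Ha).
    average_yield: long-term average area yield (Qt/Ha), must be positive.
    coverage: fraction of the average yield that is insured, in (0, 1].
    sum_insured: maximum payout, in any currency unit.
    """
    if average_yield <= 0 or not 0 < coverage <= 1:
        raise ValueError("average_yield must be > 0 and coverage in (0, 1]")

    trigger = coverage * average_yield
    # Shortfall as a fraction of the trigger yield; zero if the trigger is met.
    shortfall = max(0.0, (trigger - estimated_yield) / trigger)
    return shortfall * sum_insured


# Hypothetical case: 20 Qt/Ha average, 80% coverage, 12 Qt/Ha estimated yield.
payout = area_yield_payout(12.0, 20.0, 0.8, 1000.0)
```

Because the payout depends only on the area-level yield estimate, it can in principle be computed remotely once a season's estimate is available, which is the cost saving discussed above.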

Conclusion
This study has demonstrated that barley and sorghum farmers' yield can be estimated using the CROPWAT 8.0 and CROWRAYEM models, with a strong relationship found between farmers' historical yield and the model estimations. This approach has also been successfully applied to the remote determination of payouts, which has prospects of reducing physical trips to farms and other related costs for verification of crop losses, thus increasing the possibility of a fully automated approach to crop insurance (especially area-yield-index-insurance).
Using the developed CROWRAYEM to predict yield and yield loss is feasible. With the availability of improved historical crop yield data and adequate soil tests, farmers' yield can be predicted more accurately to determine yield losses and the required compensation for farmers in the context of area yield index insurance. Furthermore, a reassessment with accurate historical data, and a comparison of the same with the output of the CROWRAYEM, is likely to provide stronger evidence for the efficiency of the model.

Limitations and further research
This study has apparent limitations in terms of the length of the yield data and a meagre sample size. This reduces the generalizability of its findings, and the results herein cannot be directly applied to places outside of the study area. The required time series data for crop yield at ward and district levels were unavailable at the agricultural and statistics offices, hence our improvisation for data acquisition. Furthermore, certain strict criteria necessitated the trimming of the sample size, such as including only farmers who cultivate the specified crops as their major crops and who could easily recall the yields obtained in the four previous seasons. Comprehensive time series crop yield data at ward level, or more accurate yield data from surveys, could improve the findings of the study. Future research in the area could also include soil tests to increase accuracy in detecting the yield loss rates from CROPWAT 8.0. Also, a regression model could be developed that includes other factors affecting crop yields in drylands, for yield prediction.