Estimation and determinants of technical efficiency of smallholder cashew (anacardium) farmers in Dassa district, Benin: a bootstrap data envelopment approach

Cashew nut production is becoming more important in Benin’s economy, and it helps to improve the living conditions of rural populations. This paper examines cashew producers’ technical efficiency in Dassa District using survey data collected from 100 farms in 2020. The research relies on bootstrap modeling of Data Envelopment Analysis, and a Fractional Regression Model is applied to assess the determinants of producers’ technical efficiency. The results show the technical efficiency of the farms to be 34.56%, meaning that the same level of production could be obtained with inputs reduced by 65.44%. The research shows that, contrary to theory, farmers without access to credit are more efficient than those with access, with a difference of 12.64%. Finally, the results show that schooling expenses and the sale of cashew apples as a by-product of production are promising factors that may increase the efficiency of cashew nut production. Policies to promote cashew farm efficiency need to focus on promoting formal education in rural areas, establishing a financial literacy training program for farm managers, and promoting the expansion of agricultural extension for farmers. In addition, producers should set up a system for productive, sustainable, and profitable use of cashew apples to increase their income.


Introduction
Measuring efficiency in resource use and determining the factors that influence it remain important steps toward improving agricultural productivity (Abate et al. 2019; Anang 2021; Awunyo-Vitor et al. 2016). Moreover, the reduction of technical inefficiency plays a key role in the definition of sustainable production systems, given the environmental arguments for reducing emissions and waste. Thus, the measurement of technical efficiency in agricultural production becomes relevant due to the competition for natural resources, such as water and land, which are necessary inputs in the production process (Gaviglio et al. 2021). Therefore, this paper will provide useful information to policymakers in determining the factors that make it possible to increase agricultural production without increasing resource use, in relation to the objectives of reducing greenhouse gas emissions, especially in the framework of the Kyoto Protocol.
Technical efficiency is defined as a situation in which maximum output is produced using a given set of inputs, or in which a minimum amount of resources is applied to obtain a given level of output (Battese 1992; Battese and Coelli 1992; Farrell 1957). In the agricultural sector, a farm is considered technically efficient when, given its use of inputs, it produces the maximum possible output. The more efficiently farmers produce, the closer they are to the efficient production frontier and the more productive they are in using their limited resources. In other words, the production frontier is reached when the available inputs are used at their optimal level. The quantification of a firm's ability to transform inputs into output has been the subject of several controversial debates in economic analysis. For example, according to neoclassical theory, productivity gains are assessed using the changes achieved in output that are not explained by the main inputs, such as labor and capital stock (Solow 1957). Thus, Total Factor Productivity (TFP), which is defined as the residual, was initially interpreted by Solow (1957) as technical progress (a shift of the production frontier). However, this approach, known as the growth accounting approach, had some weaknesses, as it required first the definition of a functional form for production, second assumptions about the market structure, and finally, the assumption that the market is perfect (Del Gatto et al. 2011). Thus, to overcome these difficulties, Farrell (1957) introduced the Data Envelopment Analysis (DEA) approach. This approach consists in determining the optimal combinations of inputs and output to obtain an approximation of the production frontier (best practice). This makes it possible to identify the share of technological progress and technical efficiency in productivity growth.
Therefore, productivity growth is achieved by increasing three factors: capital accumulation, technological progress, and efficiency (Kumar and Russell 2002). The concept of efficiency is central to the microeconomic theory of production. Therefore, the optimal use of inputs and the important role of efficiency in increasing agricultural productivity have been recognized by several researchers. Indeed, improving resource use efficiency by producers is important for achieving the goal of sustainable agricultural production for food security and rural household incomes.
In Benin, the difficulties due to the decline in the international prices of cotton, the main export crop, have shown the vulnerability of an economy based on a single export crop. Thus, several programs and projects, such as the Agricultural Sector Development Strategic Plan (PSDSA) and the Support Programme for Agricultural Diversification (PADA), have been implemented to make agricultural diversification a national priority. It is within this framework that cashew nuts have been classified among the profitable priority sectors that can improve public revenues and producers' incomes. Indeed, cashew nut production involves the supply of cashew nuts and cashew apples, which are products whose local and international demand keeps increasing. In addition, from an environmental point of view, it reduces carbon dioxide emissions, the destruction of plantations by bush fires, water consumption, and soil erosion. The average area is 1 hectare, with about 60,000 households and 200,000 actors dependent on this production. Furthermore, in Benin, the cashew industry contributes 3% to the Gross Domestic Product (GDP) and 7% to the agricultural GDP; it provides 8% of export earnings. Benin is ranked as the 9th largest exporter of cashew nuts globally and the 3rd largest exporter in Africa (PSDSA 2017). These statistics show the opportunity that cashew production represents for agricultural diversification in Benin. However, despite the benefits, cashew nut production faces some challenges. Cashew production in Benin remains low, with tree productivity in 2019 at 356 kg/hectare, far lower than the 800 kg/hectare in Ghana with similar land properties (Gbaguidi 2020). In addition, production is characterized by insufficient cost control, which leads to wasteful or inefficient use of resources. This inefficiency can have a negative impact on the environment through the abusive use of organic fertilizers and phytosanitary products. 
Therefore, it becomes necessary to understand the factors that influence technical efficiency in the use of resources by cashew producers in Benin. The objective of this research is to identify the main factors that increase the technical efficiency of cashew farms in Benin. The contributions of this article can be summarized as follows. First, it will not only help to fill the gap in research on the determinants of the technical efficiency of cashew nut producers in Benin but also highlight the economic utility of cashew apples, which are considered as production waste. Second, it will inform public policy on the factors that should be emphasized to boost cashew nut productivity in Benin.
After the introduction, which outlines the problem addressed in this article, the second section reviews the literature on the technical efficiency of agricultural producers. The third section presents the Data Envelopment Analysis (DEA) approach and the Fractional Regression Model (FRM), which makes it possible to determine the factors that influence producers' technical efficiency. The fourth section analyzes and interprets the estimation results. Finally, the fifth section presents the conclusion and policy implications.

Literature review
The notion of efficiency first appeared in economic literature with the reflections of Carlson (1939), Hicks (1935) and Samuelson (1947). This notion is developed in the work of Debreu (1951) on the empirical measurement of resource use, of Koopmans (1951) on the measurement of efficiency, and of Shephard (1981), who introduces the distance function to measure inefficiency. Farrell (1957), drawing on the work of Debreu (1951) and Koopmans (1951), considers efficiency to be composed of Allocative Efficiency (AE) and Technical Efficiency (TE). As stated earlier, TE is the ability of a firm or farm to produce the maximum possible output from a given set of inputs, while AE is the ability of a firm or farm to produce a given level of output using cost-minimizing input ratios (Farrell 1957). Charnes et al. (1978), in showing that Farrell's (1957) model is a special case of programming, develop a field of operations research that gives rise to the DEA, which is described as a nonparametric approach. The main advantage of the DEA is that it allows several outputs and inputs in the production process. Second, it imposes no restrictions on the functional form and no distributional assumptions on firm-specific efficiency, but it is extremely sensitive to variable selection and errors. The main criticism of DEA is that it attributes all deviations from the frontier to inefficiency (Coelli 1995). As a result, it is sensitive to measurement errors and random effects, leading to an overestimation of inefficiencies (Del Gatto et al. 2011).
Parametric approaches, on the other hand, are based on econometric techniques, with a specification of the functional form. One such approach, the Stochastic Frontier Approach (SFA) proposed by Aigner et al. (1977) and Meeusen and van Den Broeck (1977), is widely used in the economic literature to estimate firm efficiency and its determinants. The SFA separates random errors that are beyond the control of the Decision Making Units (DMU) from inefficiency deviations and allows the estimation of standard errors and hypothesis testing (Banker 1996; Coelli 1995, 1998; Grosskopf 1996). Comparing the results of the parametric and non-parametric approaches, most researchers point out that if the chosen functional form is close to the production technology, the SFA is the best performing method. Otherwise, the DEA method is more appropriate. Several empirical studies have measured technical efficiency and its determinants in agricultural production in developing countries (Adamie et al. 2019; Geffersa et al. 2019; Nsiah and Fayissa 2019; Pradhan and Mukherjee 2018). Anang (2021), in studying the technical efficiency of groundnut farmers in Ghana using the DEA and the Tobit model, finds that on average farmers are 50 and 70% technically efficient under the returns to scale hypothesis; farmer's gender, farming experience, household size, and off-farm activities negatively affect TE, unlike extension. Tetteh Anang et al. (2020), combining the double bootstrap DEA and the probit model, show that the technical efficiency of maize farmers increases with technology adoption and herd size and decreases with education level, household size, access to extension, weeding frequency and area cultivated. Balogun et al. (2017), using the SFA, show that cassava farmers have an average technical efficiency of 81.8% in South West Nigeria. They also find a positive impact of land fragmentation on TE, in contrast to distance between farms.
Analyzing the technical efficiency of rice farmers in Cameroon using the SFA, Njikam and Alhadji (2017) find that technical efficiency and the impact of socio-economic characteristics vary across agro-ecological zones. Ahmed et al. (2018) have examined the efficiency of maize production in Ethiopia. Their results indicate that the mean values of technical efficiency, allocative efficiency, and economic efficiency are 82.24%, 37.07%, and 28.97%, respectively. They also conclude that extension services, cooperative membership, distance between farm and home, labor, improved seeds, and off-farm activities have a positive impact on technical efficiency. This is confirmed by Abate et al. (2019), who also find that age, education level, household size, land fragmentation, market information, and access to credit have a positive impact on technical efficiency.
In Benin, Kinkingninhoun-Mêdagbé et al. (2010), using the SFA, find that female rice farmers are discriminated against in regard to adherence to the irrigation system and access to land and equipment. The results also show a positive impact of experience and distance to the irrigation channel on technical efficiency and a negative impact of planting dates. Similarly, Singbo and Lansink (2010), using DEA bootstrap, find that the average inefficiency of rice farmers, at 34.9%, is positively influenced by age, water control, and high land area and negatively impacted by experience, household size, and education. Singbo et al. (2014) find a 13.7% inefficiency in vegetable production. Soil fertility, extension, credit, formal education, and experience have a positive impact on technical efficiency. Lawin and Tamini (2019) show that the technical efficiency of smallholder farms is 71%, with higher efficiency for non-landowners.

Data envelopment analysis
In our research, we use the input-oriented DEA model to assess the extent to which a farm can reduce its input use compared to the best farmers. We have chosen this specification, because Benin's farmers find it much more difficult to acquire inputs at lower cost. Hence, it would be important to determine input efficiency to reduce production costs.
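The input-oriented program described above can be sketched as a small linear program. This is a minimal illustration only: the helper name `dea_input_score`, the three-farm data, and the constant returns to scale default are assumptions made for the example and are not taken from the paper, which does not state its returns-to-scale specification.

```python
import numpy as np
from scipy.optimize import linprog


def dea_input_score(X, Y, k, vrs=False):
    """Input-oriented DEA efficiency of unit k (hypothetical helper).

    X: (n, m) array of inputs, Y: (n, s) array of outputs.
    vrs=False assumes constant returns to scale (an assumption here).
    """
    n, m = X.shape
    s = Y.shape[1]
    # Decision vector: [theta, lambda_1, ..., lambda_n]; minimise theta.
    c = np.zeros(n + 1)
    c[0] = 1.0
    # Inputs: sum_j lambda_j * x_j <= theta * x_k
    A_in = np.hstack([-X[k].reshape(m, 1), X.T])
    # Outputs: sum_j lambda_j * y_j >= y_k
    A_out = np.hstack([np.zeros((s, 1)), -Y.T])
    A_ub = np.vstack([A_in, A_out])
    b_ub = np.concatenate([np.zeros(m), -Y[k]])
    A_eq, b_eq = None, None
    if vrs:
        # Variable returns to scale: the lambdas must sum to one.
        A_eq = np.concatenate(([0.0], np.ones(n))).reshape(1, -1)
        b_eq = np.array([1.0])
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * (n + 1), method="highs")
    return res.x[0]


# Illustrative data: three farms, one input, one output.
X = np.array([[2.0], [4.0], [8.0]])
Y = np.array([[2.0], [2.0], [4.0]])
scores = [dea_input_score(X, Y, k) for k in range(len(X))]
```

With these made-up data the first farm defines the frontier (score 1), while the other two obtain a score of 0.5, meaning they could in principle halve their input and keep the same output.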
Bootstrapping is a method of testing the reliability of original data by creating a pseudo-replicated data set. This method determines whether the distribution has been influenced by stochastic errors and can be used to build confidence intervals for point estimates that cannot be derived analytically. In this context, the DEA bootstrap method is formulated, where the Data Generation Process (DGP) is repeatedly simulated by resampling the sample data and applying the original estimator to each simulated sample (Simar and Wilson 1998, 2000).
This method is based on the idea that the bootstrap distribution will mimic the original unknown sampling distribution of the estimators of interest (using a nonparametric estimate of their densities). Therefore, this procedure can simulate the DGP using the Monte Carlo approximation and can provide a reasonable estimator of the true unknown DGP. In addition, the non-parametric DEA approach has the advantage of having less stringent constraints than other non-parametric methods.
The efficiency of a data point $(x_k, y_k)$ is $\theta_k = \min\{\theta \mid \theta x_k \in X(y_k)\}$, where $X(y_k)$ is the input requirement set. If $\theta_k = 1$, unit $k$ is input-efficient. A value $\theta_k \le 1$ represents the feasible proportionate reduction of inputs that the DMU could realize if $y_k$ were produced efficiently. Simar and Wilson (1998) define the efficient level of input corresponding to the output level $y_k$ as $x^{\partial}(x_k \mid y_k) = \theta_k x_k$. Note that $\theta_k$ is a radial measure of the distance between $(x_k, y_k)$ and the corresponding frontier. Unfortunately, $\theta_k$ is unknown, because $X(y)$ and the efficient input level $\theta_k x_k$ are unknown.
Let us suppose that the DGP $P$ generates a random sample $\chi = \{(x_k, y_k) \mid k = 1, \dots, n\}$. Applying a non-parametric method to these data yields the estimates $\hat{X}(y)$ and $\partial\hat{X}(y)$, from which the efficiency is estimated as $\hat{\theta}_k = \min\{\theta \mid \theta x_k \in \hat{X}(y_k)\}$. Because the DGP $P$ is unknown, the bootstrap procedure is used to determine the DGP $\hat{P}$ as a reasonable estimator of the true unknown DGP that generated the data. Note that, conditional on $\chi$, the sampling distributions of the estimators $\hat{X}^*(y)$ and $\partial\hat{X}^*(y)$ are known, since $\hat{P}$ is known. Analytically, $\hat{P}$ could be difficult to calculate, so the Monte Carlo approximation is used to generate $B$ pseudo-samples $\chi^*_b$, where $b = 1, \dots, B$, and pseudo-estimates of the efficiency scores. The empirical distribution of these pseudo-estimates gives an approximation of the unknown sampling distribution of the efficiency scores.
A smoothed homogeneous bootstrap procedure, following Simar and Wilson (1998), is applied in this research. An algorithm is implemented to generate consistent bootstrap values $\hat{\theta}^*_b$ from a kernel density estimate. For each DMU, given the input-output data $(x_k, y_k)$ with $k = 1, \dots, n$, $\hat{\theta}_k$ is calculated by the linear program underlying the efficiency estimator.
The smoothed bootstrap sample $\theta^*_1, \dots, \theta^*_n$ is generated as follows. First, a simple bootstrap sample $\beta^*_1, \dots, \beta^*_n$ is obtained by drawing uniformly with replacement from $\hat{\theta}_1, \dots, \hat{\theta}_n$.
The sequence $\tilde{\theta}^*_i$, $i = 1, \dots, n$, is defined by
$$\tilde{\theta}^*_i = \begin{cases} \beta^*_i + h\varepsilon^*_i & \text{if } \beta^*_i + h\varepsilon^*_i \le 1, \\ 2 - \beta^*_i - h\varepsilon^*_i & \text{otherwise,} \end{cases}$$
and the corrected bootstrap sample is given by
$$\theta^*_i = \bar{\beta}^* + \frac{\tilde{\theta}^*_i - \bar{\beta}^*}{\sqrt{1 + h^2 / \hat{\sigma}^2_{\hat{\theta}}}},$$
where $\bar{\beta}^* = n^{-1}\sum_{i=1}^{n} \beta^*_i$, $\hat{\sigma}^2_{\hat{\theta}}$ is the sample variance of $\hat{\theta}_1, \dots, \hat{\theta}_n$, $h$ is called the bandwidth factor, and $\varepsilon^*_i$ is a random deviate drawn from the standard normal distribution. With these procedures, the sample values take the same mean and variance as the original values. The bandwidth factor $h$ is calculated according to a methodological procedure discussed in detail by Simar and Wilson (2011).
Then, we use the smoothed bootstrap sample to compute new data $\chi^*_b = \{(x^*_i, y_i) \mid i = 1, \dots, n\}$, with $x^*_i = (\hat{\theta}_i / \theta^*_i)\, x_i$, and the bootstrap estimate $\hat{\theta}^*_{k,b}$ is obtained by solving the DEA linear program for each unit against this pseudo-sample. In this paper, 2000 iterations ($B$) of the last two steps are performed to ensure adequate coverage of the confidence intervals. The bootstrap efficiency scores $\hat{\theta}^*_k$ represent approximations of $\hat{\theta}_k$, just as the DEA efficiency scores $\hat{\theta}_k$ represent approximations to $\theta_k$.
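One replication of the smoothed bootstrap can be sketched as follows. The sketch assumes the reflection method of Simar and Wilson (1998): naive resampling of the scores, a normal perturbation scaled by the bandwidth $h$, reflection of values above one, and a rescaling toward the resampled mean. The function names, the five illustrative scores, the input matrix, and the bandwidth value of 0.05 are all hypothetical; the paper selects $h$ by a separate procedure.

```python
import numpy as np


def smoothed_bootstrap_scores(theta_hat, h, rng):
    """Draw one smoothed bootstrap sample of efficiency scores (sketch)."""
    n = theta_hat.size
    # Step 1: naive bootstrap, drawing uniformly with replacement.
    beta = rng.choice(theta_hat, size=n, replace=True)
    # Step 2: perturb with standard normal noise and reflect values above 1.
    eps = rng.standard_normal(n)
    raw = beta + h * eps
    reflected = np.where(raw <= 1.0, raw, 2.0 - raw)
    # Step 3: shrink toward the resampled mean to correct the variance.
    var_hat = theta_hat.var(ddof=1)
    beta_bar = beta.mean()
    return beta_bar + (reflected - beta_bar) / np.sqrt(1.0 + h**2 / var_hat)


def pseudo_inputs(X, theta_hat, theta_star):
    """Rescale each unit's inputs by theta_hat / theta_star."""
    return X * (theta_hat / theta_star)[:, None]


rng = np.random.default_rng(0)
theta_hat = np.array([0.25, 0.35, 0.50, 0.80, 1.00])  # illustrative scores
X = np.array([[10.0, 4.0], [8.0, 6.0], [6.0, 3.0], [5.0, 2.0], [4.0, 1.0]])
theta_star = smoothed_bootstrap_scores(theta_hat, h=0.05, rng=rng)
X_star = pseudo_inputs(X, theta_hat, theta_star)
```

In a full implementation, the DEA program would then be solved for every unit against `X_star` and the original outputs, and the two steps would be repeated $B$ times.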
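The DEA estimates are biased by construction, and the bootstrap estimates $\hat{\theta}^*_{k,b}$, $b = 1, \dots, B$, are used to estimate this bias. By definition, $\mathrm{BIAS}(\hat{\theta}_k) = E(\hat{\theta}_k) - \theta_k$. The empirical bootstrap bias for the original estimator $\hat{\theta}_k$ is
$$\widehat{\mathrm{BIAS}}_B(\hat{\theta}_k) = B^{-1}\sum_{b=1}^{B} \hat{\theta}^*_{k,b} - \hat{\theta}_k.$$
The adjusted DEA scores are obtained by subtracting the bias from the original efficiency estimates. However, the bias correction introduces additional noise and could have a higher mean square error than the original point estimates, so the analysis also provides interval estimates. The percentile method modified by Simar and Wilson (2000) is applied to obtain confidence intervals, automatically correcting for bias without using a noisy bias estimator. Using the bootstrap scores, we can build confidence intervals for each producer $k$. If we knew the distribution of $\hat{\theta}_k(x_0, y_0) - \theta(x_0, y_0)$, it would be possible to find $a_\alpha$, $b_\alpha$ such that
$$\Pr\left(-b_\alpha \le \hat{\theta}_k(x_0, y_0) - \theta(x_0, y_0) \le -a_\alpha\right) = 1 - \alpha.$$
Because $a_\alpha$, $b_\alpha$ are unknown, we use $\hat{\theta}^*_{k,b}$, $b = 1, \dots, B$, to find the values $\hat{b}_\alpha$, $\hat{a}_\alpha$ such that
$$\Pr\left(-\hat{b}_\alpha \le \hat{\theta}^*_k(x_0, y_0) - \hat{\theta}_k(x_0, y_0) \le -\hat{a}_\alpha \mid \hat{P}\right) \approx 1 - \alpha.$$
Finding $\hat{b}_\alpha$, $\hat{a}_\alpha$ entails sorting the values $\hat{\theta}^*_{k,b}(x_0, y_0) - \hat{\theta}_k(x_0, y_0)$, $b = 1, \dots, B$, in increasing order, deleting $(\alpha/2 \times 100)\%$ of the rows at either end of the list, and setting $-\hat{b}_\alpha$ and $-\hat{a}_\alpha$ equal to the endpoints of the remaining array, with $\hat{a}_\alpha \le \hat{b}_\alpha$. The $(1 - \alpha)$ confidence interval is
$$\hat{\theta}_k(x_0, y_0) + \hat{a}_\alpha \le \theta(x_0, y_0) \le \hat{\theta}_k(x_0, y_0) + \hat{b}_\alpha.$$
This procedure is repeated $n$ times to obtain $n$ confidence intervals, one for each producer, with $\hat{a}_\alpha \le 0$ and $\hat{b}_\alpha \le 0$, so that $\hat{\theta}_k$ lies above its confidence interval.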
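The bias correction and the percentile interval can be illustrated for a single producer. This is a hedged sketch: the function names and the five bootstrap values are invented for the example, and empirical quantiles stand in for the "sort and trim" step, which is only an approximation of deleting rows from the sorted list.

```python
import numpy as np


def bias_corrected_score(theta_hat_k, boot_k):
    """Subtract the bootstrap bias estimate from the DEA score."""
    bias = boot_k.mean() - theta_hat_k
    return theta_hat_k - bias


def percentile_interval(theta_hat_k, boot_k, alpha=0.05):
    """Interval [theta_hat + a_hat, theta_hat + b_hat] from bootstrap draws."""
    diffs = boot_k - theta_hat_k
    # Trimming alpha/2 at each end of the sorted differences,
    # approximated here with empirical quantiles.
    lower_end, upper_end = np.quantile(diffs, [alpha / 2, 1 - alpha / 2])
    # The endpoints correspond to -b_hat and -a_hat respectively.
    a_hat, b_hat = -upper_end, -lower_end
    return theta_hat_k + a_hat, theta_hat_k + b_hat


theta_hat_k = 0.40                                   # illustrative DEA score
boot_k = np.array([0.42, 0.44, 0.46, 0.48, 0.50])    # illustrative draws
corrected = bias_corrected_score(theta_hat_k, boot_k)
low, high = percentile_interval(theta_hat_k, boot_k, alpha=0.20)
```

Here the bootstrap mean is 0.46, so the estimated bias is 0.06 and the corrected score is 0.34; the interval lies entirely below the original score of 0.40, as the text describes.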

Determinants of efficiency
Producers' socio-economic factors that influence DMU efficiency are estimated in a second step based on a regression model of the DEA scores. The standard linear model is inappropriate for this second step, because the predicted values of the DEA scores may lie outside the unit interval. Furthermore, the standard approach of censored normal regressions with limits at zero and unity, such as the Tobit model, is not advisable, as scores less than or equal to unity are due to the DEA model rather than censoring. Moreover, the domain of the Tobit model differs from that of the DEA efficiency scores, as scores equal to zero are not observed (Ramalho et al. 2010). Therefore, since the efficiency scores are bounded by the interval [0, 1], we apply the fractional regression model (FRM) proposed by Papke and Wooldridge (1996), which keeps the predicted values of the conditional mean of the scores in the unit interval. The FRM requires the assumption of a functional form for the efficiency scores $y$, which imposes a restriction on the conditional mean as follows:
$$E(y \mid x) = G(x\beta),$$
where $G(\cdot)$ is a known nonlinear function satisfying $0 \le G(\cdot) \le 1$, $x$ represents a vector of environment variables, and $\beta$ represents a vector of parameters to be estimated. Papke and Wooldridge (1996) suggest as a possible specification for $G(x\beta)$ any cumulative distribution function usually applied to model binary data. The most widely used functions are the logit and probit functional forms, where $G(x\beta) = e^{x\beta} / (1 + e^{x\beta})$ and $G(x\beta) = \Phi(x\beta)$, respectively. However, there are alternatives such as the loglog and cloglog specifications, $G(x\beta) = e^{-e^{-x\beta}}$ and $G(x\beta) = 1 - e^{-e^{x\beta}}$, respectively (Long and Freese 2006).
Producers' specific socio-economic characteristics are used as explanatory variables to identify efficiency determinants. These are the farm manager's gender ($x_1$), experience ($x_2$), number of children ($x_3$), access to credit ($x_4$), non-farm income ($x_5$), farm manager's age ($x_6$), schooling expenditure ($x_7$), health expenditure ($x_8$), and cashew apple sale ($x_9$). To determine the correct specification of the functional form, the RESET test and the P test proposed by Ramsey (1969) and McKinnon (1963), respectively, are used.
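The four candidate link functions and a Bernoulli quasi-maximum likelihood fit in the spirit of Papke and Wooldridge (1996) can be sketched as below. The names `LINKS` and `fit_frm`, the use of a general-purpose optimizer, and the four scores with a credit dummy are illustrative assumptions, not the paper's data or estimation code.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

# Candidate specifications for G(.), each mapping the real line into (0, 1).
LINKS = {
    "logit": lambda z: 1.0 / (1.0 + np.exp(-z)),
    "probit": norm.cdf,
    "loglog": lambda z: np.exp(-np.exp(-z)),
    "cloglog": lambda z: 1.0 - np.exp(-np.exp(z)),
}


def fit_frm(y, X, link="loglog"):
    """Bernoulli quasi-maximum likelihood estimate of E(y | x) = G(x beta)."""
    G = LINKS[link]

    def neg_quasi_loglik(beta):
        # Clip to keep the logarithms finite during optimisation.
        mu = np.clip(G(X @ beta), 1e-10, 1.0 - 1e-10)
        return -np.sum(y * np.log(mu) + (1.0 - y) * np.log(1.0 - mu))

    res = minimize(neg_quasi_loglik, np.zeros(X.shape[1]), method="BFGS")
    return res.x


# Illustrative scores and a credit-access dummy (hypothetical values).
y = np.array([0.45, 0.40, 0.30, 0.25])
credit = np.array([0.0, 0.0, 1.0, 1.0])
X = np.column_stack([np.ones_like(credit), credit])
beta = fit_frm(y, X, link="loglog")
fitted = LINKS["loglog"](X @ beta)
```

Because this toy model is saturated (an intercept plus one dummy), the fitted means reproduce the two group averages, and the negative coefficient on the dummy mirrors the direction of the credit effect discussed in the results. Standard errors would need a robust (sandwich) estimator, which is omitted here.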

Study area description and data collection
Located between 1°41′ and 2°39′ longitude and 7°27′ and 8°31′ north latitude, Dassa District is bordered to the north by the Districts of Glazoué and Savè, to the south by the towns of Covè, Zagnanado, and Djidja, to the west by the town of Savalou, and to the east by the District of Kétou (Fig. 1). The District of Dassa is one of the six districts of the Department of Collines and is also the chief town of the department (Fig. 1). It has ten arrondissements: Dassa I, Dassa II, Akòfojúlé, Gbàfo, Kɛrè, Ìkpɛǹyìn, Lèma, Paouignan, Soclogbo, and Tré. According to the 2013 national census (RGPH-4), the district had 112,122 inhabitants. The district has 3 daily markets and 10 periodic markets that play an important role in its economic life. The district's economy is dominated by the primary sector, particularly agriculture. About 63.5% of the population of 11,268 farming households are engaged in the production of cassava, yams, soya, maize, and groundnuts. Cashew cultivation is common on all farms and practiced by families. Cashew is the main crop cultivated in Collines Department, making it the largest producer of cashew in Benin.
Table 1 presents the descriptive statistics of the variables used for the DEA model and the socio-demographic variables for the analysis of the determinants of technical efficiency. The sample is composed of 100 respondent farms in Dassa District. Table 2 shows the distribution of efficiency scores. It can be observed that all producers in the sample have efficiency scores below 100%, which implies that no farm has reached its cashew production possibility frontier. The overall average efficiency score is 34.56%, which means that the same level of production can be obtained by the average cashew producer in Dassa region when they reduce their input quantity by 65.44%, given the technology currently available. The average TE of cashew farmers who had access to credit is 28.12%, while that of farmers with no credit access is 40.74%.
In practice, this result shows that by reducing their quantity of inputs by 71.88%, farmers with access to credit would have the same level of production; those without access to credit would have the same level of production by reducing their quantity of inputs by 59.26%. Overall, these results show that farmers without access to credit are more efficient in their use of inputs. The average efficiency score is lower than the efficiency rates of 71% and 74.5% obtained, respectively, by Ogundari (2014) in West Africa and by Mugera and Ojede (2014) in the whole of Africa. It is also lower than the rates of 67.52% and 71% obtained, respectively, by Sossou et al. (2014) and Lawin and Tamini (2019) in agricultural farms in Benin. These differences can be explained by the use of the bootstrap method, which allows for the correction of the bias of the estimators, unlike the previous results.
Table 1 (excerpt). Sale of cashew apples (no = 0, yes = 1): mean 0.26, standard deviation 0.440844, minimum 0, maximum 1. Observations: 100.
Table 2. Distribution of efficiency scores.
Figure 2 shows score distribution by gender. Analysis of the distribution of efficiency scores shows that the distribution is much more concentrated on the left and less flat, which means that the majority of farmers, regardless of gender, have scores between 7 and 40%, while a small number have higher scores. Similarly, it is observed that the probability of obtaining scores below 40% is much higher for men than for women, while the opposite effect is observed for scores between 40 and 75%. Scores above 75% are only obtained by men.
Figure 3 shows the relationship between cultivated area and technical efficiency. We observe a U-shaped relationship, which means that technical efficiency increases with high cultivated area in contrast to small cultivated area. This shows an excess use of input resources by small cashew farms.
Table 3 presents the RESET test and the P test for selecting the correct functional form specification from all possible specifications for the second stage of the DEA analysis. The results show that only the loglog model is not rejected at the 10% threshold. Therefore, a loglog functional form is preferred to analyze the determinants of technical efficiency.
The determinants of technical efficiency from the regression analysis for the second stage using the FRM are presented in column 1 of Table 4. The Tobit, truncated regression, and bootstrap truncated regression models shown in columns 2, 3, and 4, respectively, are estimated to test the robustness and sensitivity of the results obtained. We observe that most estimates retain their signs and levels of statistical significance. The effects, whether negative or positive, are higher when estimated using the FRM. The results show that experience, as represented by the number of years of farm operation, has a significant negative impact on the technical efficiency of farms. This result demonstrates that as the number of years of operation increases, the technical efficiency of cashew farmers decreases.
This result is explained by the fact that most cashew farmers in Benin farm the same area of land for many years and use fertilizers and other chemical products, which decrease the quality of the soil and consequently make it less productive over the years. In addition, we also note that cashew farmers often stick to old production methods and are reluctant to adopt modern innovations that improve efficiency. This result is consistent with that obtained by Anang (2021) and partially confirms that obtained by Njikam and Alhadji (2017), who find an inverted U-shaped relationship between technical efficiency and the number of years of experience.

Efficiency score determinants
Access to credit has a significant negative impact on technical efficiency. This could be due to credit use for activities unrelated to cashew production, such as maintaining household consumption levels, education, and health expenditure. In addition, very high interest rates combined with climatic hazards may lead to the sale of cashew nuts at low prices to repay the credit, resulting in a decrease in the producer's income and reduced productivity for the next production cycles. This result is in line with those of Abate et al. (2019) and Miriti et al. (2021), who obtained similar results for red pepper and sorghum producers in Africa, suggesting that agricultural loans are used for other purposes by producers.
A farm manager's age also has a significant negative impact on their technical efficiency. This means that, over the years, a cashew farmer becomes less efficient, and young farmers are more efficient than their older counterparts. This is because older farmers are less energetic and more reluctant to innovate and adopt new production technologies, as opposed to younger farmers who participate in extension.
The results show that schooling expenses have a very small but significant positive effect on cashew producers' technical efficiency. Schooling expenses allow for the education of household children who constitute the family labor force. These educated children are able to manage production more efficiently and have access to relevant information which they can explain to farm managers, positively influencing their adoption of new agricultural strategies and practices. This confirms the positive impact of education on the technical efficiency of agricultural producers in the literature (Akamin et al. 2017; Karimov et al. 2014; Tetteh Anang et al. 2020). Finally, the sale of cashew apples has a positive and significant impact on producers' technical efficiency. This means that the sale of cashew apples, which are considered production waste products, improves producers' technical efficiency. Indeed, cashew apple sale provides additional income for producers and improves their productivity. In addition, the increase in cashew juice production has led to an increase in demand for cashew apples. Thus, producers can increase their profits through higher prices.
Overall, the results show that the sale of cashew apples is the key factor in increasing the technical efficiency of producers. In line with the concept of circular economy, the sale of cashew apples will allow for sustainable growth by maximizing efficiency in the use of resources and by integrating them as an input into other production chains, thus reducing environmental damage due to the abusive use of land and water.

Conclusion
Cultivation of cashew as a cash crop plays an important role in the economy of developing countries. However, despite all the agricultural policies implemented at national and international levels, it remains characterized by low productivity and inefficiency. This paper's general objective is to identify the determinants of technical efficiency of farmers in the district of Dassa in Benin. The levels of technical efficiency and its determinants have been estimated using the DEA bootstrap method and the Fractional Regression Model, with the Tobit model used as a robustness check. The results show that the majority of cashew farms do not make optimal use of available inputs, which makes it difficult to reach their production potential. The average technical efficiency of cashew farmers is 34.56%. In other words, the same level of production can be achieved by the average cashew producer in the district when they reduce their inputs by 65.44%, given the currently available technology. Second, the results show that the average TE of cashew farmers who had access to credit is lower than that of those with no access to credit. Concerning the determinants, cashew apple sale and education, health, and food expenditures improve technical efficiency, whereas age, experience, and access to credit reduce it.
In terms of economic policy recommendations, public policies should focus on capacity building for producers through apprenticeship, professionalization, and extension services to improve their competence. Financing institutions should target credits based on the type of farmers, provide financial literacy training to farm managers, and regularly monitor beneficiaries to ensure that credits are used for targeted activities. We suggest that producers set up a system for harvesting, preserving, and selling cashew apples to increase their income. Finally, cashew producers should be included in relevant policies, such as the Human Capital Strengthening Insurance (HCSI), which supports schooling expenses.
This research study has its limitations. The data only concern a sample of producers in the main cashew production region. Nevertheless, the results reflect the average performance of producers in the region. Future research should extend the assessment to all other producing areas using an integrated model that is composed of producers, processors, and ecosystem services to measure the effect of cashew sales on the efficiency of processing plants; the model should also incorporate land use to account for potential impacts on the environment, climate change, and production systems.