Technical Efficiency of Kenyan Smallholder Dairy Farmers in Different Agro-ecological Regions

We employ a stochastic meta-frontier and region-specific frontiers based on the “true” random effect framework to examine the technical efficiencies, technology gaps and meta-frontier technical efficiencies of Kenyan smallholder dairy farmers in different agro-ecological zones. The empirical analysis is based on comprehensive three-wave household-level panel data from three agro-ecological zones in Kenya. Results show variation in the efficiency measures and indicate that smallholder milk production is characterized by increasing returns to scale across all agro-ecological zones. The milk output of smallholder dairy farmers in every agro-ecological zone lags behind its potential, given the technology available and the prevailing environmental conditions in that zone. We also find a significant technology gap in dairy production across the agro-ecological zones in Kenya. These findings have important policy implications for increasing technical efficiency and reducing the technology gap in smallholder dairy production.

The Kenyan government encourages smallholder dairy farmers (SDFs) to produce more milk by increasing farm productivity and efficiency. To better understand the scope for increasing milk output in the country, it is imperative to have some knowledge of the productive capacity of SDFs. Gelan and Muriithi (2012) and Maina, Mburu, Gitau, and VanLeeuwen (2020) have examined the technical efficiency of smallholder milk production in Kenya. A common feature of these studies is that they rest on the hypothesis that all SDFs share a common technology. However, Kenya has heterogeneous agro-ecological conditions, which would affect the productive efficiency of milk production. Hence, the hypothesis of homogeneous technology may not apply to the SDFs of Kenya. It is more plausible to assume that different agro-ecological regions in Kenya employ different types of technology. To our knowledge, however, almost no studies have taken this into account when evaluating the technical efficiency of smallholder milk production in the agro-ecological regions of Kenya. Ignoring the variation in technologies across Kenya's agro-ecological regions may lead to biased estimates of SDFs' efficiency scores and thus to misleading policy implications.

Smallholder dairy production areas span a wide agro-ecological gradient, including semi-arid, sub-humid and temperate highland areas, which differ in altitude, soil types, landscapes, and climatic conditions. These geographical and climatic variations lead to different levels of resource endowment and production potential, which cause further variation in farming systems and socio-economic conditions. Together, these factors have led to wide heterogeneity in the production technology set available to farming households in different agro-ecological regions.
Therefore, these differences between agro-ecological regions potentially inhibit SDFs in some regions from choosing the best technology from the set of potential technologies, resulting in so-called technology gaps, which are measured by technology gap ratios (TGRs). In this study, we deal with heterogeneous technologies by integrating the measurement of SDFs' efficiency with the parametric meta-frontier approach first proposed by Battese and Rao (2002). This method enables us to estimate technology gaps and comparable efficiencies for milk production under distinct technologies relative to the potential technology available to the SDFs.

The objective of this study is to provide new evidence on the technology gap and the efficiency performance of SDFs in Kenya using a parametric meta-frontier approach. The remainder of this study is organized as follows. In Section 2, we introduce the parametric meta-frontier model used to measure SDFs' efficiency and estimate the technology gap. In Section 3, we present the empirical analysis of the case of SDFs in Kenya. In Section 4, we conclude the study.

2. Methodology

2.1 Analytical strategy

To account for the heterogeneity of agro-ecological regions, we employ meta-frontier analysis, which is deeply rooted in production theory. It captures technology heterogeneity by estimating a region-specific production frontier for each region and then estimates the meta-frontier as the envelope of the region-specific frontiers. Suppose there are $K$ agro-ecological regions in Kenya. The group stochastic frontier for region $k$ can then be written as

\[ y_{it}^{k} = f(x_{it}^{k}, \beta^{k})\, e^{v_{it}^{k} - u_{it}^{k}}, \qquad k = 1, \dots, K, \qquad (1) \]

where $y_{it}^{k}$ denotes the output level of SDF $i$ in the $k$th region in the $t$th time period, $x_{it}^{k}$ is the input vector, and $v_{it}^{k}$ represents the error term and is assumed to be iid as $v_{it}^{k} \sim N(0, \sigma_{v}^{2})$. $u_{it}^{k}$ is a one-sided error representing technical inefficiency and is distributed as $u_{it}^{k} \sim N^{+}(0, \sigma_{u}^{2}(z_{it}))$, where $z_{it}$ denotes inefficiency or production environment determinants, and $\beta^{k}$ is a vector of unknown parameters for the $k$th region. The technical efficiency (TE) of the $i$th SDF relative to the region-$k$ frontier can be computed, following Greene (2005b), as

\[ TE_{it}^{k} = \frac{y_{it}^{k}}{f(x_{it}^{k}, \beta^{k})\, e^{v_{it}^{k}}} = e^{-u_{it}^{k}}, \qquad (2) \]

where $TE_{it}^{k}$ is a measure of the performance of the individual SDF $i$ relative to the regional group frontier.

To estimate the stochastic meta-frontier function that envelops all the frontiers of the $K$ agro-ecological regions, we use the approach of Battese, Rao, and O'Donnell (2004), defined as

\[ y_{it}^{*} = f(x_{it}, \beta^{*}) \equiv e^{x_{it}\beta^{*}}, \qquad i = 1, 2, \dots, N, \quad N = \sum_{k=1}^{K} N_{k}, \quad t = 1, \dots, T, \qquad (3) \]

where $\beta^{*}$ is the vector of unknown meta-frontier parameters. The meta-frontier must be larger than or equal to every group-specific frontier, that is,

\[ x_{it}\beta^{*} \geq x_{it}\beta^{k}. \qquad (4) \]

Equation (3) indicates that the meta-frontier production function envelops all the group frontiers over the entire period. For simplicity, the function in Equation (1) is assumed to be $e^{x_{it}\beta^{k}}$; then observed output can be decomposed as

\[ y_{it} = e^{-u_{it}^{k}} \times \frac{e^{x_{it}\beta^{k}}}{e^{x_{it}\beta^{*}}} \times e^{x_{it}\beta^{*} + v_{it}^{k}}. \qquad (5) \]

The middle term is the ratio of the region-$k$ frontier function to the meta-frontier production function, which is called the technology gap ratio (TGR):

\[ TGR_{it}^{k} = \frac{e^{x_{it}\beta^{k}}}{e^{x_{it}\beta^{*}}}. \qquad (6) \]

The estimated TGR must be less than or equal to unity. The great virtue of the TGR estimator is that establishments with heterogeneous production functions can be compared in terms of relative efficiency. The TE relative to the meta-frontier production function, $TE^{*}$, is expressed as the product of TE and TGR:

\[ TE_{it}^{*} = TE_{it}^{k} \times TGR_{it}^{k} = \frac{y_{it}}{e^{x_{it}\beta^{*} + v_{it}^{k}}}. \]

The TE component captures the performance of an SDF compared with the best-performing SDFs in its own region, while the TGR component captures the difference between the regional frontier and the best-practice technology across all regions. Their product ($TE^{*}$) defines the technical efficiency of SDFs relative to the meta-frontier.
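The relationship between TE, TGR and $TE^{*}$ can be illustrated with a minimal Python sketch. It assumes a frontier that is log-linear in the parameters, so each frontier value is $e^{x\beta}$; the function names and all numerical values are hypothetical and are not estimates from this study.

```python
import math


def linear_index(x, beta):
    """Return x.beta for one observation (x already holds the log-input terms)."""
    return sum(xi * bi for xi, bi in zip(x, beta))


def technology_gap_ratio(x, beta_group, beta_meta):
    """TGR = exp(x.beta_group) / exp(x.beta_meta); at most 1 if the meta-frontier envelops."""
    return math.exp(linear_index(x, beta_group) - linear_index(x, beta_meta))


def meta_technical_efficiency(u, x, beta_group, beta_meta):
    """TE* = TE x TGR, where TE = exp(-u) is measured against the regional frontier."""
    te = math.exp(-u)
    return te * technology_gap_ratio(x, beta_group, beta_meta)


# Hypothetical values for one farm: an intercept plus one log input.
x = [1.0, 2.0]
beta_group = [0.5, 0.3]  # illustrative regional frontier coefficients
beta_meta = [0.7, 0.4]   # illustrative meta-frontier coefficients
u = 0.2                  # illustrative inefficiency term

tgr = technology_gap_ratio(x, beta_group, beta_meta)
te_star = meta_technical_efficiency(u, x, beta_group, beta_meta)
print(f"TE = {math.exp(-u):.3f}, TGR = {tgr:.3f}, TE* = {te_star:.3f}")
```

In this illustration the farm is fairly close to its regional frontier, but because the regional frontier lies below the meta-frontier, its meta-frontier efficiency is lower than its regional efficiency.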

The TE of each individual SDF is estimated assuming a flexible translog functional form for Equations (2) and (3), which accounts for nonlinearity, substitution, and complementarity, as well as non-constant returns to scale. The model is estimated using the 'true' random effect (TRE) frontier model (Greene, 2005b), which extends the conventional stochastic frontier model by disentangling the farm effect (unobserved heterogeneity) from TE. The trend variable, $t$, is introduced to capture the effect of technological change. The estimated flexible translog function for the region-$k$ frontier is specified as

\[ \ln y_{it} = \alpha_{0} + w_{i} + \sum_{j=1}^{J} \beta_{j} \ln x_{jit} + \frac{1}{2} \sum_{j=1}^{J} \sum_{l=1}^{J} \beta_{jl} \ln x_{jit} \ln x_{lit} + \beta_{t}\, t + v_{it} - u_{it}, \qquad (7) \]

where $y_{it}$ denotes dairy output, $x_{jit}$ denotes the inputs ($j = 1, \dots, J$) used by farms ($i = 1, \dots, N$) over time ($t = 1, \dots, T$), and all the Greek letters are parameters to be estimated. The white-noise error term $v_{it}$ is added to allow for random measurement error. The term $v_{it}$ is symmetric and is assumed to satisfy the classical assumptions, that is, $v_{it} \sim N(0, \sigma_{v}^{2})$ and $v_{it} \perp u_{it}$. The term $u_{it}$ is specified as $u_{it} \sim N^{+}(0, \sigma_{u}^{2}(z_{it}))$, and $w_{i}$ is a farm-specific component capturing time-invariant unobserved heterogeneity, which is assumed to have an iid normal distribution.

[...] an altitude range of 1800-2400 m above sea level (asl). Kericho County covered LH1, which is moderate and humid, with an annual average rainfall of >80% of the potential evaporation. UM has a mean temperature of 18-21 °C and an altitude range of 1300-1900 m asl. Bomet County covered UM1 and UM2. UM1 is temperate and humid, with an annual average rainfall of >85% of the potential evaporation; UM2 is temperate and subhumid, with an annual average rainfall of 65-68% of the potential evaporation. LM has a mean temperature of 21-24 °C and an altitude range of 800-1500 m asl. Narok County covered LM1, which is temperate and semiarid, with an annual average rainfall of 25-40% of the potential evaporation.
The sampling procedure involved both purposive and random sampling techniques. The three agro-ecological regions and three counties were selected purposively, considering their potential for smallholder dairy production. The main purpose of the survey was to generate information on current dairy farming practices, technology adoption, access to institutional services, and prevailing production constraints in smallholder dairy production systems in the country. Following the identification of agro-ecological regions and counties, a proportional-to-size sampling procedure was used to select 3-6 wards (the smallest administrative division in Kenya) from each county, in which 84-92 SDFs were identified for interviews. Finally, 1512 randomly selected respondents were interviewed during the first-round survey, and 1444 and 1354 were re-interviewed in the second and third rounds, respectively, using the same questionnaire. In the final analyses, the balanced data of 1344 respondents, with an attrition level of less than 5%, were used. A structured questionnaire was prepared and pre-tested on selected respondents and then modified to ensure the validity of all questions, so that the required information on SDFs could be captured properly. The questionnaires were administered by experienced enumerators who underwent specific training on the questionnaire before the survey. The survey was executed under the close supervision of researchers from Moi University. A thorough data cleaning exercise was carried out before the data were analysed in Stata 16.
The variables considered in both $x$ and $z$ in Equations (2) and (3) [...]

The $z$ variables in this study consist of the following: $z_{1}$ is the number of years of schooling achieved by the household head; $z_{2}$ is the age of the household head in years; $z_{3}$ is age squared, to control for non-linear life-cycle effects; $z_{4}$ is a dummy variable that takes a value of one if the household head has access to credit and zero otherwise; $z_{5}$ is a dummy variable that takes a value of one if the household participated in off-farm work and zero otherwise; $z_{6}$ is a dummy variable that takes a value of one if the farmer participated in a dairy farmer group and zero otherwise; and $z_{7}$ is a dummy variable that equals one if the farmer received extension services and zero otherwise.

The first null hypothesis, that there is no technical inefficiency in the agro-ecological zonal production frontiers or in the pooled data, was rejected for each agro-ecological zone and for the pooled data. Hence, the traditional average response model is inappropriate for the data set, given the assumptions of the stochastic frontier model. The second null hypothesis, that the second-order coefficients of the translog model are zero, was also rejected for each agro-ecological zone and for the pooled data. Thus, the translog model represents the data better than the Cobb-Douglas model. The third null hypothesis, that the technical inefficiency effects in the stochastic frontier model are not explained by any of the covariates in the inefficiency model, was also rejected for all agro-ecological zones and for the pooled data. Thus, SDFs' inefficiency is influenced by socio-economic, environmental and farm-specific characteristics.

The final null hypothesis, that all the agro-ecological zones share the same technology, was then considered. If the three agro-ecological zones were to share the same production frontier, there would be no good reason to estimate the efficiency levels of groups relative to a meta-frontier production function.
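One common way to test whether the zones share a single frontier, used for example by Battese, Rao, and O'Donnell (2004), is a likelihood-ratio test that compares the log-likelihood of the pooled frontier with the sum of the log-likelihoods of the separately estimated zone frontiers. The Python sketch below assumes that approach; the function name, the log-likelihood values and the parameter counts are hypothetical and are not results from this study.

```python
def pooled_vs_zone_lr(loglik_pooled, logliks_by_zone, n_params_pooled, n_params_by_zone):
    """Return the LR statistic and its degrees of freedom for H0: one common frontier.

    LR = -2 * [lnL(pooled) - sum_k lnL(zone k)]; the degrees of freedom equal the
    number of extra parameters estimated when each zone gets its own frontier.
    """
    lr = -2.0 * (loglik_pooled - sum(logliks_by_zone))
    df = sum(n_params_by_zone) - n_params_pooled
    return lr, df


# Hypothetical log-likelihoods for a pooled model and three zone-specific models.
lr, df = pooled_vs_zone_lr(
    loglik_pooled=-1000.0,
    logliks_by_zone=[-300.0, -320.0, -330.0],
    n_params_pooled=30,
    n_params_by_zone=[30, 30, 30],
)
print(f"LR = {lr:.1f} on {df} degrees of freedom")
```

The statistic would then be compared with the chi-square critical value for the computed degrees of freedom; a value above the critical value rejects the hypothesis of a common technology.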

Stochastic regional frontiers and meta-frontier estimates
The technical efficiency estimates were obtained from the stochastic frontier model, estimated separately for the three agro-ecological zones and for the pooled data following the procedure for the "true" random effect model in Stata version 16. The meta-frontier was estimated using SHAZAM version 11 following O'Donnell, Rao, and Battese (2008). The estimated parameters of the flexible translog function specified in Equation (7) for the agro-ecological zones and the pooled data are reported in Table 3. Table 3 also shows the linear programming estimates for the meta-frontier.

Table 3 Estimates of zone frontiers, pooled frontier and meta-frontier
Note: Robust standard errors appear in parentheses. Asterisks denote significance at the following levels: ***=1%, **=5%, *=10%.
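The linear programming step for the meta-frontier can be sketched as follows. This assumes the minimum-sum-of-absolute-deviations formulation described by Battese, Rao, and O'Donnell (2004) for a frontier that is log-linear in the parameters; $\hat{\beta}^{k}$ denotes the estimated coefficients of the zone frontier to which observation $(i, t)$ belongs, and whether this exact variant was used here is an assumption.

```latex
\[
\begin{aligned}
\min_{\beta^{*}} \quad & \sum_{t=1}^{T} \sum_{i=1}^{N} \left( x_{it}\beta^{*} - x_{it}\hat{\beta}^{k} \right) \\
\text{subject to} \quad & x_{it}\beta^{*} \geq x_{it}\hat{\beta}^{k} \quad \text{for all } i \text{ and } t .
\end{aligned}
\]
```

Because the constraint keeps every deviation non-negative, the absolute values drop out and the problem is linear in $\beta^{*}$. The $\hat{\beta}^{k}$ terms are fixed, so the objective is equivalent to minimizing $\bar{x}\beta^{*}$, where $\bar{x}$ is the vector of sample means of the regressors.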

In Table 3, the column titled "pooled" reports the model estimated by pooling all the agro-ecological zones in the same regression. While the signs of the estimated meta-frontier coefficients are consistent with those of the pooled frontier, the magnitudes of the estimated elasticities differ. In general, the zone-specific frontiers in Table 3 fit reasonably well, as the coefficients of most inputs are statistically significant. Notably, the magnitudes of the parameter estimates vary considerably across the agro-ecological zones. Some inputs are statistically significant in some regions but not in others, which indicates that input use varies across agro-ecological zones. Because individual parameter estimates of the translog production function are not readily interpretable, we computed the output elasticities of inputs and the returns to scale, as shown in Table 4. The elasticities are evaluated at the sample means of the inputs, and returns to scale are computed as the sum of the output elasticities. The output elasticities show a divergent pattern across agro-ecological zones. We find increasing returns to scale in milk production in all three agro-ecological zones, implying that SDFs operate in the first stage of the classical production function and have yet to attain their optimal capacity. For policy, these results suggest that SDFs could reduce their long-run average costs by expanding their scale of production.

The estimates for the agro-ecological zone-specific variables are reported in the bottom panel of Table 3. A zone-specific variable with a positive (negative) coefficient has a negative (positive) effect on technical efficiency.
In general, most of the agro-ecological zone-specific variables have a positive effect on technical efficiency across regions (e.g., education, age of household head in years, access to credit, off-farm work participation, group membership and access to extension services), as expected a priori.

As expected, technical efficiency estimates are lower and more dispersed in the meta-frontier model. SDFs in the LH agro-ecological zone achieved the highest mean technical efficiency (0.329), with the least variation (SD = 0.11), and those in the LM agro-ecological zone had the lowest mean technical efficiency (0.203). The greatest variation was recorded in the UM agro-ecological zone (SD = 0.194). The mean technical efficiency across all zones is estimated at 0.272. These results suggest that SDFs in the UM and LM agro-ecological zones could catch up with their counterparts in the LH agro-ecological zone by adopting the most improved milk production techniques.

The frontier and meta-frontier production estimates for each technology may also be used to calculate the technology gap ratios (TGRs) using Equation (6). The TGR measures the proximity of the group-$k$ frontier to the meta-frontier, which represents the current state of knowledge. According to Equation (6), an increase in the TGR implies a decrease in the gap between the group frontier and the meta-frontier. Generally, a larger (smaller) TGR implies a smaller (larger) gap in productive technology between the region-specific frontier and the meta-frontier. We adopted the Kruskal-Wallis non-parametric test to examine further whether the TGRs differ among groups. The result presented in Table 5 strongly rejects the null hypothesis that there are no differences in the TGRs among the groups.
Table 4 shows that the mean TGR estimates vary more widely than the mean technical efficiency estimates in the meta-frontier model. A TGR value of one implies that the zone-specific frontier is tangent to the meta-frontier and there is virtually no room for improvement. The average TGRs for the LH, UM and LM agro-ecological zones are 0.595, 0.376 and 0.307, respectively. The results suggest that, to operate on the meta-frontier, SDFs in the LH, UM and LM agro-ecological zones would need to close technology gaps of 40.5%, 62.4% and 69.3%, respectively. The estimated TGR averages 0.429 across all agro-ecological zones. Adapting productivity-enhancing farm technologies from other regions to local conditions could push TEs towards the meta-frontier (Danso-Abbeam & Baiyegunhi,