Adaptive neuro fuzzy inference system (ANFIS) machine learning algorithm for assessing environmental and socio-economic vulnerability to drought: a study in Godavari middle sub-basin, India

Climate change has increased the frequency of drought occurrence in various parts of the world. Drought as a complex phenomenon causes severe impacts on ecological and socio-economic status. Short-term and long-term occurrences of drought have made many regions vulnerable globally. This paper makes an attempt to assess drought vulnerability in Godavari Middle Sub-basin of India. Twenty-four site specific socio-economic and environmental factors were identified based on the extensive literature review. Drought frequency was assessed using standardized precipitation index (SPI). These datasets were divided into training (70%) and testing (30%) data. Frequency ratio (FR) model was utilized to establish relationship among drought conditioning factors and drought frequency. Weights obtained from the FR model were used as input to the adaptive neuro-fuzzy inference systems (ANFIS) model. Drought vulnerability results were validated using the testing data and receiver operating characteristic (ROC). The accuracy of ANFIS models for 1-month (0.957), 3-months (0.882), 6-months (0.964) and 12-months (0.938) showed high suitability of ANFIS model for the assessment of drought vulnerability. The findings revealed that very low normalized difference vegetation index (NDVI) and increasing trend of highest maximum and mean maximum temperature were major environmental factors which influenced high drought vulnerability in the sub-basin. High proportion of area under fallow land, high infant mortality rate (IMR) and moderate literacy rate were identified as major socio-economic factors making watersheds vulnerable during short and long-term droughts. Largest area of the sub-basin was found under high vulnerability for 3-months, followed by 6-months and 12-months droughts. Thus, the study calls for policy intervention towards lessening the impact of drought in highly vulnerable watersheds.


Introduction
Changes in climatic and hydrological processes, such as rising temperature, a falling ratio of snowfall to total precipitation and decreasing runoff in basins, have intensified the adverse effects of drought (Adib et al. 2021;Lotfirad et al. 2022). Global warming has already raised temperature levels to almost 1.5°C (IPCC 2018;Kuriqi et al. 2020). The frequency of disasters has increased globally over the last few decades due to climate change (Cappelli et al. 2021). The recent assessment report of the Intergovernmental Panel on Climate Change (IPCC) raised concern over the increasing frequency of extreme weather events, specifically marine heat waves, intense hot extremes, heavy precipitation, and hydrological and agricultural drought (IPCC 2021). Globally, food security, drinking water, soil fertility and ecological balance are under threat due to extreme drought conditions (Adnan et al. 2021;Hanjra and Qureshi 2010). Drought is defined as a prolonged shortage of available water, primarily due to insufficient precipitation (Bullock et al. 2018). Aridity, in contrast, is the outcome of a permanent climatic condition. Changes in climate are likely to aggravate drought, interrupting the ecological balance and causing socio-economic vulnerability (Satish Kumar et al. 2021). Drought has led to severe economic losses in many parts of the world. In the USA, it caused an average annual economic loss of around $6 billion, while Australia experienced an 18% reduction in agricultural yield during 2001-2010 due to drought (Erian et al. 2021). In the Asia and Pacific region alone, nearly 122 million people were affected by droughts, floods and storms during 2011-2021. These drought conditions are likely to become more severe by 2100 as temperatures rise under a high-emission scenario (ESCAP 2021). Thus, modelling, prediction and spatial analysis of drought at various scales are important for devising mitigation efforts.
The complex interaction of society and environment has made vulnerability assessment crucial for disaster mitigation (Vargas and Paneque 2017). Earlier studies on drought vulnerability emphasized improving policy efforts for drought mitigation at the regional level (Wilhite 1997). Initial methods of drought vulnerability assessment included interpolating census data and crop yield analysis (Liverman 1990), identifying the deficit area and drought intensity analysis (Correia et al. 1991), stochastic models for evaluating drought characteristics (Rossi et al. 1992), time series analysis (Venema et al. 1995) and socio-economic and environmental factors for drought impact assessment (Knutson et al. 1998). Remote sensing datasets, geographic information systems (GIS), recession curves and network topology were also utilized for drought vulnerability assessment in river basins (Demuth et al. 2000). The decile approach and drought indices became prominent in regional-level drought vulnerability assessment (Wilhite et al. 2000). Logistic regression (Shewmake 2008), crop-drought vulnerability index (Simelton et al. 2009), standardized precipitation index (Edossa et al. 2010;Młyński et al. 2021), temperature condition index (Singh et al. 2003), normal precipitation index (Nohegar and Saeedeh Mahmoodabadi 2015), vegetation condition index (Ghaleb et al. 2015), standardised soil moisture index (Hao and AghaKouchak 2014), hydrological modelling (Jung and Chang 2012), fuzzy clustering iterative model (Wu et al. 2013), weights of the drought risk index (Dong and Liu 2014) and bivariate approaches for meteorological drought assessment (Masud et al. 2015) have been the well-known methods used for drought analysis. The footprints of drought are complicated and thus require the inclusion of interdisciplinary parameters for effective prediction and forecasting (Zagade and Umrikar 2021).
Recently, several indices and models, namely the reconnaissance drought index (RDI), standardized precipitation evapotranspiration index (SPEI), groundwater drought index (GDI), enhanced vegetation index (EVI) (Niu et al. 2019;Singh et al. 2019;Thomas et al. 2016), vegetation health index (Masroor et al. 2022), univariate analysis coupled with climate models (Thilakarathne and Sridhar 2017), reconnaissance trivariate drought index (RTDI), standardized water supply and demand index (Wang et al. 2022), copula-based probabilistic multivariate drought index and quantitative-qualitative scoring technique (Al Qudah et al. 2021), are being utilized for drought vulnerability assessment in river basins (Phung et al. 2021;Udall and Overpeck 2017;Venkatcharyulu and Viswanadh 2021). Machine learning models have an advantage over other data-driven models due to their higher accuracy and drought prediction capabilities (Zhu et al. 2021). Ribeiro et al. (2019) examined drought-induced yield losses using multi-scale remote sensing indices in Iberia. Liu et al. (2020) utilized an integrated agricultural drought index (IDI) based on remote sensing datasets and a neural network for examining agricultural drought in the North China Plain. Zhu et al. (2018) utilized a standardized terrestrial water storage index for measuring hydrological drought vulnerability. Data from phase 6 of the Coupled Model Intercomparison Project (CMIP6) have also been utilized for assessing future meteorological and hydrological drought conditions. Saha et al. (2021a) examined drought vulnerability in Karnataka state of India using GIS-based bagging and artificial neural network (ANN) methods. However, that study analyzed drought vulnerability using these models individually. Apart from these models, new soft computing algorithms, mainly ensemble adaptive neuro-fuzzy inference systems (ANFIS) with fewer statistical limitations, are becoming popular for vulnerability assessment. Chen et al. (2017) utilized a GIS-based ANFIS model for landslide spatial modelling. A knowledge gap was identified in the integration of various socio-economic and environmental indicators of drought vulnerability using advanced ensemble machine learning algorithms. This study addresses that gap with an ensemble approach that integrates the frequency ratio model and the ANFIS model for analyzing integrated drought vulnerability. The approach provides a base for deep learning of drought vulnerability with various socio-economic and environmental indicators.
The Godavari Middle sub-basin, lying in the rain shadow region of the Western Ghats (mountain hills), is vulnerable to drought (Masroor et al. 2020). It has faced several droughts in the past, with a frequency of two droughts per decade (Gore and Ray 2002). Climate change is expected to further exacerbate seasonality and water availability in a multitude of rivers (Kuriqi et al. 2020). Frequent droughts have severe implications for farmers in the sub-basin (Masroor et al. 2021;Venkatcharyulu and Viswanadh 2021). Thus, assessment of drought vulnerability is imperative for identifying the determinants of drought and lessening the degree of vulnerability among local communities. This paper makes an initial attempt to integrate various socio-economic and environmental factors for determining drought vulnerability using a novel ensemble adaptive neuro-fuzzy inference systems (ANFIS) model in the Godavari Middle sub-basin, India. ANFIS is a combination of neural network and fuzzy logic and is capable of optimizing parameters in less time and with fewer statistical errors (Hesami et al. 2019). The main objectives of this study are: (1) identification of site-specific socio-economic and environmental indicators for drought vulnerability assessment, (2) analysis of drought frequency using the standardized precipitation index (SPI), (3) determination of the frequency ratio of drought vulnerability indicators and drought frequency and (4) integration of the frequency ratio model with adaptive neuro-fuzzy inference systems (ANFIS) for drought vulnerability assessment. The findings of this assessment will help in framing proactive measures for increasing the degree of adaptation of local communities. We also argue that the methodology adopted in the study may be effectively utilized in other geographical regions where drought vulnerability needs to be analysed.

Site description
Godavari middle sub-basin is located in the Western Ghats (mountain hills) of Maharashtra and Telangana states of India, extending between 18°21′41″ and 20°34′6″ north latitudes and 75°9′15″ and 78°22′17″ east longitudes (Fig. 1). The sub-basin covers an area of 39,000 km² and is home to nearly 13 million people. Agriculture is the main occupation of the people in the sub-basin (Masroor et al. 2022). It receives an annual average rainfall of 750 mm and experiences an average temperature of 27.5°C (Masroor et al. 2020). The sub-basin lies in an arid and semi-arid climate zone. The dry climate of the sub-basin sometimes leads to severe drought. The sub-basin receives low rainfall due to the presence of the Western Ghats. The distribution of rainfall is uneven in the sub-basin. Most of the rainfall occurs during the south-west monsoon, with some scanty rainfall from western disturbances. The sub-basin has experienced many droughts since 1901 (Gore and Ray 2002;Masroor et al. 2021). The temporal and spatial rainfall pattern causes severe and long-lasting droughts that affect various sectors including agriculture and industry. Consequently, the economic situation of people whose income relies on these resources is destabilized. The drainage system is largely controlled by the geological structure of the sub-basin. Undulating topography and hard rocks act as barriers to large-scale agricultural production. The geological structure, dominated by hard rocks, also affects water storage capacity. Chromic vertisols are the main soil of the sub-basin, with considerable variation in texture and depth. The soils along the Godavari River are quite fertile. Cotton, sugarcane, soybean, sorghum, and green gram are the main crops grown during the kharif (summer) season and wheat during the rabi (winter) season (Shirsath et al. 2020).
The aquifers in the sub-basin are characterized by the poor permeability of the soil and hence restrict groundwater recharge (Masroor et al. 2022;Suhag 2019). Water availability in north-west and south-western parts of the Godavari middle sub-basin is decreasing due to occurrence of frequent droughts (Kuriqi et al. 2020). High intensity of rainfall and bad topographical surface have resulted in the formation of rills and gullies (Liang et al. 2010).

Methodological framework
A total of 24 socio-economic and environmental factors from various data sources were utilized to assess drought vulnerability in Godavari Middle sub-basin. Details of these factors are provided in Table 1. Spatial layers for all the factors were prepared using the inverse distance weighted (IDW) method of interpolation. The drought inventory map was prepared using SPI. The frequency ratio method was used to determine the spatial relationship between the conditioning factors and the occurrence of drought. The weights obtained from the frequency ratio method were then used as input to the ANFIS model. Historical drought frequency was utilized as the target output to train the model. Finally, results were validated with testing data and receiver operating characteristic (ROC) curves. The detailed methodological steps are provided in Fig. 2 and discussed in the following sub-sections:

Environmental and socio-economic drought conditioning factors
Extensive literature survey was carried out for understanding and selection of drought influencing factors (Cappelli et al. 2021;Correia et al. 1991;Demuth et al. 2000;Jung and Chang 2012;Liu et al. 2020;Masud et al. 2015;Niu et al. 2019;Saha et al. 2021a, b). Fifteen environmental factors (soil types, surface water, TMRF, NDVI, NDWI leaf, NDWI water, groundwater pre and post monsoon, HMAX, LMIN, MMAX, MMIN, RH and rainy days trends and rainfall) and nine socio-economic factors (literacy rate, income, population density, IMR, HDI, rabi crop, kharif crop, net sown area, fallow land) were utilized for drought vulnerability assessment in the sub-basin. Rainfall has inverse relationship with drought vulnerability. The areas receiving low rainfall are more prone to high degree of drought vulnerability. Average annual rainfall was calculated using the monthly rainfall of the sub-basin. Increasing trend of rainfall suggested very low chance of drought occurrence and hence low drought vulnerability. Trend of monthly rainfall was examined using the linear regression Eq. (1).
$$Y = a + bX + \varepsilon \quad (1)$$
where $Y$ is the dependent variable, $X$ is the independent variable, $a$ is the intercept, $b$ is the slope and $\varepsilon$ is the residual. Temperature is directly related to the occurrence of drought. An increasing temperature trend indicates greater exposure and thus greater vulnerability. Trends in highest maximum (HMAX) temperature, mean maximum (MMAX) temperature, lowest minimum (LMIN) temperature, mean minimum (MMIN) temperature and relative humidity (RH) were analyzed using Eq. (1). Relative humidity has an inverse relationship with drought vulnerability: the higher the relative humidity, the less severe the drought conditions. Groundwater is a major resource for coping with drought. Areas with a high water table may be less vulnerable to drought. Over-exploitation of groundwater can have a significant impact on drought-related damages (Saha et al. 2021a, b). Thus, groundwater table data for the pre- and post-monsoon seasons were obtained from the Central Groundwater Board for preparing spatial groundwater table layers. Soil texture and depth interact with drought vulnerability in varied ways. Coarse soils are highly prone to drought vulnerability. Spatial soil layers were prepared using data obtained from the Food and Agriculture Organization (FAO). A sufficient quantity of surface water is essential for reducing the impact of drought. However, a high rate of evaporation makes surface water more prone to long-term drought effects (Vishwakarma et al. 2022). JRC Global Surface Water mapping layers, v1.0 were utilized for assessing the sub-basin (Pekel et al. 2016). The NDWI is used to monitor changes related to water content in water bodies. The index was proposed by McFeeters (1996) and calculated using Eq. (2):
$$\mathrm{NDWI}_{water} = \frac{Green - NIR}{Green + NIR} \quad (2)$$
where $Green$ and $NIR$ are the reflectances in the green and near-infrared bands. The NDWI leaf is a remotely sensed indicator that responds to changes in leaf water content (Gao 1996). It was determined using Eq. (3):
$$\mathrm{NDWI}_{leaf} = \frac{NIR - SWIR}{NIR + SWIR} \quad (3)$$
where $SWIR$ is the reflectance in the shortwave infrared band. The Normalized Difference Vegetation Index (NDVI) quantifies vegetation by measuring the difference between near-infrared and red light (Rajpoot and Kumar 2019). Equation (4) was utilized to calculate it:
$$\mathrm{NDVI} = \frac{NIR - Red}{NIR + Red} \quad (4)$$
High income and literacy rate can make society more resilient (Saha et al. 2021a, b). A high IMR reflects the impact of drought (Miyan 2015). The human development index (HDI) is an indicator of overall development. High HDI can lead to a low level of drought vulnerability. Areas with high population density will be more vulnerable to drought (Heidari et al. 2020). We prepared the population density layer using the Gridded Population of the World data (CIESIN 2018). A larger area under fallow land during drought conditions may significantly increase the vulnerability of communities. The higher the percentage of net sown area, the lower the vulnerability to drought. A high number of crops grown during the different seasons (kharif and rabi) reflects lower vulnerability to drought. Raster layers of these factors were prepared using the annual cropland data of NRSC (NRSC 2016).
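The trend and index calculations described above can be illustrated with a minimal Python sketch. It assumes an evenly spaced series (time steps 0, 1, 2, ...) for the linear trend of Eq. (1) and single reflectance values per band for the indices; the function names and the zero-denominator handling are illustrative choices, not taken from the study.

```python
def linear_trend(y):
    """Least-squares intercept a and slope b of y = a + b*x, with x = 0, 1, 2, ..."""
    n = len(y)
    x_mean = (n - 1) / 2
    y_mean = sum(y) / n
    sxy = sum((x - x_mean) * (v - y_mean) for x, v in enumerate(y))
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    b = sxy / sxx
    a = y_mean - b * x_mean
    return a, b


def normalized_difference(band_a, band_b):
    """(band_a - band_b) / (band_a + band_b); 0.0 when the denominator is zero (assumption)."""
    total = band_a + band_b
    return (band_a - band_b) / total if total != 0 else 0.0


def ndvi(nir, red):
    """Vegetation index from near-infrared and red reflectance."""
    return normalized_difference(nir, red)


def ndwi_water(green, nir):
    """Open-water index (McFeeters form) from green and near-infrared reflectance."""
    return normalized_difference(green, nir)


def ndwi_leaf(nir, swir):
    """Leaf-water index (Gao form) from near-infrared and shortwave-infrared reflectance."""
    return normalized_difference(nir, swir)
```

A positive slope `b` from `linear_trend` would correspond to an increasing trend in the factor (for example rainfall or HMAX), which is the quantity the factor layers classify.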

Drought inventory mapping
Historical drought events are considered the foundation for assessing drought vulnerability. Therefore, the frequency of previous droughts during 1980-2015 was assessed using the standardized precipitation index (SPI). The SPI is effective in quantifying drought frequency and severity at various spatial scales (McKee et al. 1993). Monthly rainfall data obtained from the India Meteorological Department (IMD) were utilized to calculate drought frequency for 1-month, 3-month, 6-month and 12-month droughts. The SPI is based on the two-parameter (shape and scale) Gamma probability density function. The Gamma probability density function was fitted to the 1-month, 3-month, 6-month and 12-month moving average precipitation series in order to estimate SPI, using a shape and a scale factor termed $\alpha$ and $\beta$ respectively. The Gamma probability distribution used to describe precipitation variation is shown in Eq. (5):
$$g(x) = \frac{1}{\beta^{\alpha}\,\Gamma(\alpha)}\, x^{\alpha - 1} e^{-x/\beta}, \quad x > 0 \quad (5)$$
where $\Gamma(\alpha)$ is the gamma function. Wet periods are indicated by positive SPI values, whereas a sequence of negative values denotes a dry period.
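The SPI procedure can be sketched in a few lines of standard-library Python. This is a simplified illustration rather than the study's implementation: it pools all months into one gamma fit (operational SPI usually fits each calendar month separately), uses Thom's approximation for the maximum-likelihood shape and scale, and treats zero-precipitation values through a mixed distribution. All function names are hypothetical.

```python
import math
from statistics import NormalDist


def rolling_total(monthly, k):
    """k-month accumulated precipitation ending at each month."""
    return [sum(monthly[i - k + 1:i + 1]) for i in range(k - 1, len(monthly))]


def fit_gamma(values):
    """Thom's approximation to the ML shape (alpha) and scale (beta); values must be > 0
    and not all identical."""
    n = len(values)
    mean = sum(values) / n
    a = math.log(mean) - sum(math.log(v) for v in values) / n
    alpha = (1 + math.sqrt(1 + 4 * a / 3)) / (4 * a)
    beta = mean / alpha
    return alpha, beta


def gamma_cdf(x, alpha, beta):
    """Regularized lower incomplete gamma function evaluated by its power series."""
    if x <= 0:
        return 0.0
    z = x / beta
    term = 1.0 / alpha
    total = term
    for n in range(1, 1000):
        term *= z / (alpha + n)
        total += term
        if term < total * 1e-12:
            break
    return total * math.exp(-z + alpha * math.log(z) - math.lgamma(alpha))


def spi(series):
    """Transform a precipitation series to standard-normal SPI values."""
    positive = [v for v in series if v > 0]
    q = 1 - len(positive) / len(series)          # probability of zero precipitation
    alpha, beta = fit_gamma(positive)
    norm = NormalDist()
    result = []
    for v in series:
        h = q + (1 - q) * gamma_cdf(v, alpha, beta)
        h = min(max(h, 1e-6), 1 - 1e-6)          # keep inside the open interval (0, 1)
        result.append(norm.inv_cdf(h))
    return result
```

For a 3-month SPI, one would call `spi(rolling_total(monthly_rain, 3))`; counting the months whose SPI falls below a chosen dryness threshold then gives a drought frequency per location.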

Spatial relationship between the drought occurrence and conditioning factors using frequency ratio (FR) model
Frequency ratio (FR) is a bivariate statistical method used in natural hazard studies for investigating the spatial relationship between conditioning factors and historical floods, droughts, fires and landslides (Pham et al. 2021). The FR model can be used to present a simplified geospatial assessment for optimizing the probability of dependent and independent variables, including multi-classified maps (Razavi Termeh et al. 2018). Weights were assigned to each class of the conditioning factors using Eq. (6), following Bonham-Carter (1994):
$$FR = \frac{p/q}{r/s} \quad (6)$$
where $p$ denotes the number of drought pixels within each class, $q$ denotes the total number of drought pixels in the sub-basin, $r$ denotes the number of pixels in each class, and $s$ denotes the total number of pixels in the sub-basin.
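As an illustration of the frequency ratio, the sketch below computes FR for every class of one conditioning factor from two flattened rasters. The input layout (one class label and one 0/1 drought flag per pixel) and the function names are assumptions made for the example.

```python
from collections import Counter


def frequency_ratio(p, q, r, s):
    """FR = (p / q) / (r / s): share of drought pixels in a class over share of all pixels in it."""
    return (p / q) / (r / s)


def fr_by_class(class_map, drought_map):
    """class_map: class label per pixel; drought_map: 1 for a drought pixel, 0 otherwise."""
    pixels = Counter(class_map)                                   # r for each class
    droughts = Counter(c for c, d in zip(class_map, drought_map) if d)  # p for each class
    q = sum(droughts.values())                                    # all drought pixels
    s = len(class_map)                                            # all pixels
    return {c: frequency_ratio(droughts.get(c, 0), q, r, s) for c, r in pixels.items()}
```

A class with FR above 1 holds a larger share of drought pixels than of the total area, so it is more strongly associated with drought occurrence; these per-class values are what would be passed on as weights.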

Drought vulnerability mapping using ANFIS
ANFIS is a hybrid of artificial neural networks (ANN) and fuzzy logic which provides the advantages of both methods (Razavi Termeh et al. 2018). ANFIS uses linguistic information from fuzzy logic as well as the learning capability of ANN for automatic fuzzification (Walia et al. 2015). Self-learning is the key strength of this hybrid model. ANFIS is based on a hybrid learning rule which combines the least squares method and gradient descent to determine the optimal parameters (Aghdam et al. 2016). We utilized the hybrid optimization method with 200 iterations for drought vulnerability assessment (Table 2). The structure of the ANFIS model consists of five layers as shown in Fig. 3 (Jang 1993; Kutlu Karabiyik and Can Ergün 2021).
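Layer 1: Each node is an adaptive node (Eqs. 7 and 8), following Jang (1993):
$$O_{1,i} = \mu_{A_i}(X), \quad i = 1, 2 \quad (7)$$
$$O_{1,i} = \mu_{B_{i-2}}(Y), \quad i = 3, 4 \quad (8)$$
Here, $\mu_{A_i}(X)$ and $\mu_{B_i}(Y)$ are the membership functions for the $X$ and $Y$ input nodes, and $A$ and $B$ are the linguistic variables.
Layer 2: It has fixed nodes. The output of each node is the product of all input signals to that node (Eq. 9):
$$O_{2,i} = W_i = \mu_{A_i}(X)\,\mu_{B_i}(Y), \quad i = 1, 2 \quad (9)$$
Here, $W_i$ is the output (firing strength) of each node.
Layer 3: It consists of fixed nodes whose normalized outputs are referred to as the normalized firing strengths (Eq. 10):
$$O_{3,i} = \bar{W}_i = \frac{W_i}{W_1 + W_2}, \quad i = 1, 2 \quad (10)$$
Layer 4: It assigns a node function to each node (Eq. 11):
$$O_{4,i} = \bar{W}_i f_i = \bar{W}_i\,(p_i X + q_i Y + r_i) \quad (11)$$
where $\bar{W}_i$ is the normalized firing strength from layer 3 and $p_i$, $q_i$ and $r_i$ are node parameters. The parameters of this layer can be interpreted as the consequent parameters.
Layer 5: It has a single node, denoted by $\Sigma$, which sums all the incoming signals to give the final output (Eq. 12):
$$O_{5} = \sum_i \bar{W}_i f_i = \frac{\sum_i W_i f_i}{\sum_i W_i} \quad (12)$$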
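The five layers can be traced with a small forward-pass sketch for a two-input, two-rule first-order Sugeno system. Gaussian membership functions and all parameter values are assumptions chosen for illustration; the sketch covers inference only and leaves out the hybrid least-squares and gradient training step.

```python
import math


def gaussian_mf(value, center, sigma):
    """Gaussian membership degree of value (membership shape is an assumption)."""
    return math.exp(-0.5 * ((value - center) / sigma) ** 2)


def anfis_forward(x, y, premise, consequent):
    """premise: {"A": [(center, sigma), ...], "B": [...]}; consequent: [(p, q, r), ...]."""
    # Layer 1: membership degrees of each input
    mu_a = [gaussian_mf(x, c, s) for c, s in premise["A"]]
    mu_b = [gaussian_mf(y, c, s) for c, s in premise["B"]]
    # Layer 2: firing strength of each rule (product of its memberships)
    w = [a * b for a, b in zip(mu_a, mu_b)]
    # Layer 3: normalized firing strengths
    total = sum(w)
    w_bar = [wi / total for wi in w]
    # Layer 4: weighted first-order rule outputs
    layer4 = [wb * (p * x + q * y + r) for wb, (p, q, r) in zip(w_bar, consequent)]
    # Layer 5: overall output as the sum of the layer-4 signals
    return sum(layer4), w_bar
```

In the workflow described here, the inputs would be the frequency ratio weights of the conditioning factors and the target would be the historical drought frequency; a real model would have one input per factor rather than two.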

Validation of drought vulnerability
The ROC curve is widely used for evaluating the predictive accuracy of a model (Mosavi et al. 2019). It graphically illustrates the trade-off between false positive and false negative errors for every possible cut-off value. The larger the area under the ROC curve, the better the predictive capability of the model. As previously mentioned, the dataset was divided into two parts: training and testing (validation) data. ROC curves were generated by overlaying the testing data on each of the resulting drought vulnerability (1-, 3-, 6- and 12-month) maps. The ROC curve was constructed by plotting 1 − specificity on the X axis and sensitivity on the Y axis, using Eqs. 13 and 14:
$$\mathrm{Sensitivity} = \frac{TP}{TP + FN} \quad (13)$$
$$\mathrm{Specificity} = \frac{TN}{TN + FP} \quad (14)$$
where TP is the true positive, TN is the true negative, FP is the false positive, and FN is the false negative (Nsengiyumva and Valentino 2020;Pham et al. 2021).
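A minimal sketch of the ROC construction follows, assuming binary test labels (1 for a drought location, 0 otherwise) and a continuous vulnerability score per location. The function names are illustrative, and the area is obtained with the trapezoidal rule.

```python
def roc_points(labels, scores):
    """(false positive rate, true positive rate) pairs as the cut-off is lowered."""
    pos = sum(labels)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Process all locations tied at this score before emitting a point
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))   # (1 - specificity, sensitivity)
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))
```

An area of 1.0 means every drought location scored above every non-drought location, while 0.5 corresponds to a random ranking.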
The root mean square error (RMSE) values were computed from observed and predicted values following Eq. (15) (Lotfirad et al. 2022;Saha et al. 2021a, b):
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(O_i - S_i\right)^2} \quad (15)$$
where $O_i$ is the field observed value, $S_i$ is the model generated value and $N$ indicates the total number of observations. Mean absolute error (MAE) was calculated as the average of the absolute differences between observed and predicted values, irrespective of their direction (Eq. 16):
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|O_i - S_i\right| \quad (16)$$
where $O_i$ is the field observed value, $S_i$ is the model generated value and $n$ indicates the number of observations. Mean squared error (MSE) measures the amount of error in statistical models. It provides the average squared difference between the observed and predicted values. When a model has no error, the MSE is equal to zero. As model error increases, its value also increases (Zaman and Bulut 2020):
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(O_i - S_i\right)^2$$
where $O_i$ is the field observed value, $S_i$ is the model generated value and $n$ indicates the number of observations.
Spatial layers of drought influencing factors are presented in Figs. 4 and 5.
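For reference, the RMSE, MAE and MSE measures used in the validation step can be written as a short Python sketch. The argument names `observed` and `simulated` are illustrative and stand for the observed and model-generated values.

```python
import math


def mse(observed, simulated):
    """Mean of the squared differences between observed and simulated values."""
    return sum((o - s) ** 2 for o, s in zip(observed, simulated)) / len(observed)


def rmse(observed, simulated):
    """Square root of the mean squared error."""
    return math.sqrt(mse(observed, simulated))


def mae(observed, simulated):
    """Mean of the absolute differences, ignoring their direction."""
    return sum(abs(o - s) for o, s in zip(observed, simulated)) / len(observed)
```

Because RMSE squares the differences before averaging, a single large miss raises it more than it raises MAE, which is why the two are usually reported together.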

Drought frequency using SPI
The drought inventory maps of various time scales are illustrated in Fig. 6. Results revealed that drought frequency varied between 1 and 16 in the sub-basin for the 1-month drought. High drought frequency was observed in the watersheds mostly located in northern and south-central parts of the sub-basin (Fig. 6a). Eastern and north-western parts of the sub-basin experienced low drought frequency. Drought frequency for 3-months varied from 2 to 11. High drought frequency for these months was observed in the watersheds located in the northern part of the sub-basin (Fig. 6b). Low drought frequency was experienced in the watersheds mostly located in the periphery of western and eastern parts of the sub-basin. 6-months drought frequency varied from 2 to 7. High drought frequency was observed in the watersheds located in the northern part of the sub-basin (Fig. 6c). Low drought frequency was noticed in the south-western and eastern parts of the sub-basin for 6-months drought. 12-months drought frequency varied from 1 to 5. Only three watersheds located in the central part of the sub-basin have experienced high drought frequency while one watershed in central part and two watersheds in the eastern parts experienced low frequency (Fig. 6d).

Relationship of conditioning factors with drought occurrence using FR model
Table 2 (recovered fragment): step size decrease rate 0.9; step size increase rate 1.1.
The results of drought conditioning factors and drought frequency using the FR model are presented in Figs. 7 and 8, and Appendix Tables 4 and 5. FR values > 1 indicate high correlation (Razavi Termeh et al. 2018). The relationships of the conditioning factors with 1-month, 3-month, 6-month and 12-month droughts under the FR model are discussed in the following sub-sections:

One-month drought
The frequency ratio between soil types and drought frequency revealed that chromic vertisols have the highest frequency ratio (1.66) and cover nearly 32% of the area of drought occurrence (Appendix Table 4). As drought frequency increases, surface water decreases. The highest drought frequency ratio (3.03) was observed for moderate surface water. However, it covered only 2.38% of the area of drought occurrence (Appendix Table 4). An increasing trend in total monthly rainfall was observed for areas with less drought occurrence. The highest drought frequency ratio (1.27) was recorded for the moderate rainfall trend, which was distributed over 71.42% of the area of drought occurrence (Appendix Table 4). [...] (Appendix Table 4). Almost 62% of drought occurred in the very low to low categories. The same results were found for the post-monsoon season, where 66% of drought occurred in these categories. Highest maximum temperature and mean maximum temperature showed a direct relationship with drought frequency. The highest drought frequency ratios for highest maximum temperature and mean maximum temperature were 1.43 and 1.29 respectively (Fig. 7). Drought frequency in the sub-basin increased with temperature. A similar pattern was also observed for lowest minimum temperature and mean minimum temperature (Appendix Table 4). The moderate and high average rainfall categories have a high drought frequency ratio (1.46 for both classes) and covered nearly 71% of the area of drought occurrence (Appendix Table 4). A negative relationship was found between relative humidity and drought frequency. The highest drought frequency ratio (1.40) for relative humidity was observed for the highly decreasing trend, which covered nearly 9.5% of the area (Appendix Table 4). The highest frequency ratio (1.28) was found for very high population density, which covered nearly 24.76% of drought occurrence (Appendix Table 5). The high IMR class (FR = 8.86) was largely associated with high drought occurrence (Fig. 8).
High HDI (FR = 1.72) was related to high drought occurrence. High drought occurrence affected crops during rabi season (Appendix Table 5). Similar results were also observed for kharif crops. Nearly 35% area of drought occurrence was found in high category of fallow land (Appendix Table 5).

Three-month droughts
Highest drought frequency ratio (1.08) was observed for cambisols vertisols soil and it covered nearly 76% area of drought occurrence (Appendix Table 4). Highest drought  frequency ratio (1.21) for surface water was recoded for moderate surface water class. However, it covered less than 1% of the total drought occurrence (Appendix Table 4). High category of total monthly rainfall trend has highest drought frequency ratio (1.51) and it expanded over nearly 26.19% area of drought occurrence (Appendix Table 4). Relationship between NDVI and drought frequency have shown high frequency ratio (1.04) for low NDVI values and it covered 17.14% area (Appendix Table 4). NDWI leaf values and drought frequency ratio has shown high association (FR = 1.15) with low NDWI leaf category and covered 29.53% area of drought occurrence (Appendix Table 4). Highest drought frequency was recorded for low NDWI water class (Appendix Table 4). Nearly 38% drought occurred in this category. High drought frequency was obtained for moderate to very high category of groundwater table depth during pre-post-monsoon seasons (Appendix Table 4). Highest drought frequency ratio (2.1) for higher maximum temperature was observed in very high category followed by high category (1.40). These two categories covered nearly 37.14% area of drought occurrence (Appendix Table 4). Very high and high categories of mean maximum temperature recorded high drought frequency ratio. These categories covered almost 30% drought occurrence. The frequency ratio was found high for lower minimum temperature (Appendix Table 4). High drought frequency ratio (1.5) for mean minimum temperature was observed in very low categories with 11.42% area under drought occurrence. High drought frequency ratio was observed for low and high rainfall categories. Relative humidity trend showed negative relationship with drought frequency in the sub-basin. 
Fig. 6 Drought frequency for (a) 1-month, (b) 3-months, (c) 6-months and (d) 12-months
The highest drought frequency ratio (1.36) for relative humidity was observed for the very high category. However, other categories, namely low and moderate, also have a high drought frequency ratio. Nearly 61% of drought occurred in these categories (Appendix Table 4). A high association was found between income and drought frequency (Appendix Table 5). Very high literacy rate was associated with a high drought frequency ratio (1.27) and covered nearly 21.42% of the area of drought occurrence (Appendix Table 5). A high drought frequency ratio was found for moderate population density. The high IMR class (2.95) was related to high drought occurrence (Fig. 8). Moderate to high HDI was associated with high drought occurrence. These two categories covered 88.09% of the area of drought occurrence (Appendix Table 5). Rabi crops were directly related to high drought occurrence.
Moderate (1.18), high (1.16) and very high (1.12) rabi crops received high drought frequency ratio and covered 73.33% combine area of drought occurrence (Appendix Table 5). High drought frequency ratio for Kharif crops was observed in very high (1.18) category followed by high (1.04) category combining area of these categories covered nearly 67.61% area of drought occurrence (Appendix Table 5). High (1.11) and very high (1.06) net sown area has high drought frequency and covered 73.33% combining area of drought occurrence. Very low (1.60) fallow land category was mainly related to high drought frequency covered nearly 31.42% of drought occurrence (Appendix Table 5).

Six-month droughts
For 6-month droughts, the highest frequency ratio for soil (1.12) was observed for cambisols vertisols, which covered 78.57% of the area of drought occurrence. The highest drought frequency ratio for surface water was observed in the very low (1.09) category, which covered 41.90% of the drought occurrence area (Appendix Table 4). The highest drought frequency ratio (1.1) for total monthly rainfall trend was observed in the high category, which covered nearly 19.04% of the area. However, other categories (moderate and very high) also had a high frequency ratio and covered 60% and 8.57% of the area of drought occurrence respectively (Appendix Table 4). Very low and low NDVI values have high drought frequency ratios of 1.45 and 1.05 respectively and together cover 20.47% of the area of drought occurrence (Appendix Table 4). [...] (Appendix Table 4). The pre-monsoon groundwater table has a high drought frequency ratio (1.26) for the high category, which covered 9.04% of the area. However, the very low (1.15) category also has a high association and a coverage of 36.66%. The post-monsoon groundwater table has a high drought frequency ratio (more than 1.14) for the moderate to very high categories (Appendix Table 4), which covered nearly 48.57% of the area. For highest maximum temperature, the highest drought frequency was experienced in the very high category, which covered only 9.04% of the area. Mean maximum temperature showed a direct relationship with high drought frequency. All of its categories have a high frequency ratio except the medium category. These categories cover 72.85% of the area of drought occurrence. Lowest minimum temperature has a high drought frequency ratio (1.51) for the very low category, which covered 16.19% of the area (Appendix Table 4). The same applied to mean minimum temperature, where a high drought frequency ratio (1.76) was observed for the very low category (Fig. 7). It covers nearly 13.33% of the area of drought occurrence.
Moderate and high rainfall categories had high drought frequency ratios (1.45 and 1.7, respectively) and covered nearly 77.14% of the drought occurrence area for 6-month droughts (Appendix Table 4). For relative humidity, the highest drought frequency ratio (1.29) was observed for the very high category, with a coverage of 17.14% of the area. However, the very low and high categories also had high drought frequency ratios, and almost 34.28% of drought occurred in these categories. A high drought frequency ratio (1.85) was observed for income, covering nearly 33.33% of the drought occurrence area (Appendix Table 5).
Moderate literacy rate had a high drought frequency ratio (1.49) and covered nearly 66.19% of the drought occurrence area (Appendix Table 5). Moderate population density had a high drought frequency ratio (1.25) with an extent of 27.14%. However, the very high (1.18) category also had a high drought frequency and covered nearly 22.85% of the drought occurrence area (Appendix Table 5). The high IMR class (1.85) was related to high drought occurrence (Fig. 8), although it covered only 33.33% of the area (Appendix Table 5). Moderate to high HDI was largely associated with high drought occurrence; together these categories covered 69.04% of the drought occurrence area. Rabi crops were directly related to high drought occurrence. High (1.31) and very high (1.31) rabi crop categories had high drought frequency ratios and covered a combined area of 49.52% (Appendix Table 5). For kharif crops, a high drought frequency ratio was observed in the very high (1.16) category, followed by the moderate (1.11) category, covering a combined area of 59.52% (Appendix Table 5). The very high (1.24) category of net sown area had a high drought frequency and covered 49.04% of the combined area (Appendix Table 5). The very low (1.40) category of fallow land had a high drought frequency and covered 27.61% of the drought occurrence area (Appendix Table 5).

Twelve-month droughts
For 12-month droughts, the highest frequency ratio for soil (1.13) was observed for chromic vertisols soil, which covered 22.38% of the drought occurrence area. However, the cambisols vertisols category also had a high frequency ratio (1.06) and extended over 74.76% of the drought occurrence area (Appendix Table 4). Together these two categories accounted for almost 97.14% of drought occurrence. The highest drought frequency ratio for surface water was observed in the medium (1.82) category, which covered barely 1.42% of the drought occurrence area (Appendix Table 4). The high (1.65) and very low (1.00) categories also had high frequency ratios and together covered nearly 39% of the drought occurrence area (Appendix Table 4). The highest drought frequency ratio (1.4) for total monthly rainfall trend was observed in the high category, which covered nearly 25.71% of the area. However, the low category also had a high frequency ratio (1.30) and covered 19.04% of the drought occurrence area (Appendix Table 4). Moderate and high NDVI values had high drought frequency ratios (1.15 and 1.12, respectively), with a combined extent of 63.80% of the area (Appendix Table 4). High drought frequency ratios (more than 1.0) for NDWI leaf values were observed in the very low to medium categories, with a combined extent of nearly 77.14% of the drought occurrence area. The highest drought frequency ratio (1.16) for NDWI water values was observed in the low category, which covered 41.42% of the drought occurrence area (Appendix Table 4). The moderate category also had a high drought frequency ratio (1.12) and covered 29.52% of the drought occurrence area. The pre-monsoon groundwater table had its highest drought frequency ratio (1.13) for the high category, which covered only 8.09% of the area. However, the moderate (1.07) and very low categories also showed a high association, with a combined coverage of 60.47% of the drought occurrence area (Appendix Table 4).
The post-monsoon groundwater table had a high drought frequency ratio (2.11) for the very high category, which covered barely 4.28% of the drought occurrence area. However, the high (1.76) and very low (1.07) categories also showed a strong relationship between drought occurrence and groundwater table (Appendix Table 4). For the higher maximum temperature trend, the highest drought frequency ratio (2.1) was experienced in the very high category, which covered only 11.90% of the area. Mean maximum temperature showed a similar relationship with high drought frequency: the highest drought frequency ratio (2.23) was observed in the very high category, which covered only 11.90% of the drought occurrence area. Lower minimum temperature had a high drought frequency ratio (2.25) for the very low category, which covered 24.28% of the drought occurrence area (Appendix Table 4). The same held for mean minimum temperature, where a high drought frequency ratio (2.64) was observed for the very low category, which covered only 20% of the drought occurrence area. High and moderate rainfall categories had high drought frequency ratios (1.95 and 1.17, respectively) and together covered nearly 76.66% of the drought occurrence area for 12-month droughts (Appendix Table 4). High drought frequency ratios (more than 1.0) were observed for the medium to very high categories of relative humidity, with a combined coverage of 99.04% of the drought occurrence area. A high drought frequency ratio (2.09) was observed for income, covering nearly 37.61% of the drought occurrence area (Appendix Table 5). Low and moderate literacy rates had high drought frequency ratios (more than 1.0) and covered a combined area of 86.66% (Appendix Table 5). Moderate population density had a high drought frequency ratio (1.38) with an extent of 30% of the area.
However, the very low (1.15) and high (1.01) categories also showed high drought frequency and together covered 43.33% of the drought occurrence area (Appendix Table 5). The high IMR class (2.09) was largely related to high drought occurrence (Fig. 8), although it covered only 37.61% of the area. High HDI (1.85) was largely associated with high drought occurrence and covered a combined 68.57% of the drought occurrence area (Appendix Table 5). Rabi crops were directly related to high drought occurrence. High drought frequency ratios (more than 1.0) were observed in the high, very high and low categories, which covered a combined area of 68.09%. For kharif crops, the very low category had the highest drought frequency (1.33) but covered only 7.61% of the area. However, the high to very high categories also showed high frequency ratios and covered a combined area of 70% (Appendix Table 5). The highest drought frequency for net sown area was observed in the very low category (1.27), which covered only 6.66% of the area. The high (1.09) and very high (1.16) categories also showed a high association with drought frequency and covered a combined area of 76.66% (Appendix Table 5). The highest drought frequency for fallow land was observed in the high (1.54) category, which extended over nearly 30.95% of the area. However, the very low (1.09) category also showed a high association between drought frequency and fallow land (Appendix Table 5).
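The frequency ratios reported above can be read as the share of drought occurrence falling in a class divided by the share of the total area occupied by that class, so that values above 1 indicate a stronger than average association. The following is a minimal sketch of that calculation under the standard FR definition; the function name and all cell counts are hypothetical and are not taken from Appendix Tables 4 and 5.

```python
def frequency_ratio(drought_cells, class_cells):
    """Return the frequency ratio (FR) of each class of one conditioning factor.

    FR = (drought cells in class / all drought cells)
         / (cells in class / all cells)
    """
    total_drought = sum(drought_cells.values())
    total_cells = sum(class_cells.values())
    ratios = {}
    for name, n_cells in class_cells.items():
        drought_share = drought_cells.get(name, 0) / total_drought
        area_share = n_cells / total_cells
        ratios[name] = drought_share / area_share
    return ratios


# Hypothetical counts for one factor with five classes (illustration only).
class_cells = {"very low": 200, "low": 300, "moderate": 250,
               "high": 150, "very high": 100}
drought_cells = {"very low": 60, "low": 63, "moderate": 45,
                 "high": 27, "very high": 15}

fr = frequency_ratio(drought_cells, class_cells)
for name, value in fr.items():
    print(f"{name:>9}: FR = {value:.2f}")
```

In this made-up example the very low class holds 20% of the area but about 28.6% of the drought cells, so its FR exceeds 1, while the low class holds equal shares of both and has an FR of exactly 1.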

Drought vulnerability using ANFIS model
The ensemble ANFIS models were implemented using the normalized weights of the FR model. The dataset was divided into training (70%) and testing (30%) data to run the ensemble ANFIS model. For the 1-month drought training data, most of the output values fell within the target values (Fig. 9a), although some values fell beyond the limit. The difference between the original and predicted values was represented by the Mean Squared Error (MSE), which is the average of the squared differences over the dataset. The MSE for the 1-month model was 0.10935, indicating high suitability of the ANFIS model (Fig. 9a). The Root Mean Squared Error (RMSE) also reflected a low error (0.33069) between the original and predicted values (Fig. 9a). The testing dataset confirmed the reliability of the training results and followed a similar pattern. The MSE was slightly higher, but the RMSE remained almost the same (Fig. 10a). The largest area for 1-month drought was under the moderate category (44.70%), followed by the high (36.58%), low (14.71%) and very high (4.01%) categories (Fig. 11). No area was observed in the very low category (Fig. 9a). For the 3-month ANFIS model, the MSE for the training data (0.12775) was higher than for the 1-month drought (Fig. 9b). The RMSE between the original and predicted values was 0.35743. The testing dataset showed higher MSE (0.15492) and RMSE (0.3936) values, reflecting some fluctuations in the 3-month model (Fig. 10b). Although the model showed some volatility, its results were still reliable because most of the outputs fell within the targeted values. The largest area was under the high category (51.80%), followed by the moderate (35.11%), low (6.74%) and very high (6.34%) categories. Less than 1% of the area was under the very low category (Fig. 11b).
The 6-month ANFIS model showed high reliability on the training dataset, with an MSE of 0.057795 and an RMSE of 0.24041 (Fig. 9c). The testing dataset followed the pattern of the training data, with an MSE of 0.075604 and an RMSE of 0.27496 (Fig. 10c). The largest area for 6-month drought was under the high category (47.14%), followed by the moderate (26.66%), very high (16.14%) and low (10.06%) categories. The very low category covered less than 1% of the area (Fig. 11c).
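The error measures quoted above are linked by RMSE = sqrt(MSE). The sketch below shows a 70/30 split and the two error measures, and checks that the reported MSE and RMSE pairs agree with each other. The helper names and the shuffling seed are illustrative assumptions; the study's own split procedure is not described in this section.

```python
import math
import random


def train_test_split(samples, train_fraction=0.7, seed=0):
    """Shuffle a copy of the samples and split it into training and testing parts."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def mse(targets, outputs):
    """Mean of the squared differences between targets and model outputs."""
    return sum((t - o) ** 2 for t, o in zip(targets, outputs)) / len(targets)


def rmse(targets, outputs):
    """Square root of the MSE, expressed in the units of the target."""
    return math.sqrt(mse(targets, outputs))


# (MSE, RMSE) pairs as reported in the text above.
reported = {
    "1-month training": (0.10935, 0.33069),
    "3-month training": (0.12775, 0.35743),
    "3-month testing": (0.15492, 0.3936),
    "6-month training": (0.057795, 0.24041),
    "6-month testing": (0.075604, 0.27496),
}
# Largest gap between sqrt(MSE) and the reported RMSE across all pairs.
max_gap = max(abs(math.sqrt(m) - r) for m, r in reported.values())

train, test = train_test_split(range(10))
print(len(train), len(test), f"max |sqrt(MSE) - RMSE| = {max_gap:.1e}")
```

Each reported RMSE matches the square root of its MSE to within rounding, which is a quick internal consistency check on the figures.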

Validation of drought vulnerability maps using ROC curve
The ROC curve is a threshold-based method for evaluating classified outcomes (Saha et al. 2021a, b). The final drought vulnerability maps produced using the ensemble model were divided into five equal-interval classes, namely very low, low, moderate, high and very high (Djurovic et al. 2015; Termeh et al. 2019). The area under the curve (AUC) for the drought vulnerability models is presented in Table 3. It revealed the highest prediction efficiency for the 6-month drought (0.964), followed by the 1-month (0.957), 12-month (0.938) and 3-month (0.882) droughts. Thus, the receiver operating characteristic (ROC) curves reflected high reliability of the ensemble ANFIS models and their suitability for assessing socio-economic and environmental drought vulnerability (Fig. 12).
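To make the two steps above concrete, the sketch below computes an AUC from binary drought labels and model scores, and assigns a score to one of five equal-interval classes. It uses the rank interpretation of the AUC (the probability that a randomly chosen drought sample scores higher than a randomly chosen non-drought sample). The function names, labels and scores are hypothetical, and how class boundaries were handled in the study is not stated, so the boundary rule here is an assumption.

```python
CLASSES = ["very low", "low", "moderate", "high", "very high"]


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def equal_interval_class(value, lowest, highest):
    """Map a score to one of five equal-width classes between lowest and highest."""
    width = (highest - lowest) / len(CLASSES)
    # Assumption: a value on a boundary goes to the upper class, and the
    # maximum value is clamped into the last class.
    index = min(int((value - lowest) / width), len(CLASSES) - 1)
    return CLASSES[index]


# Hypothetical testing labels (1 = drought) and model scores.
labels = [0, 0, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.2f}", [equal_interval_class(s, 0.0, 1.0) for s in scores])
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the values in Table 3 should be read.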

Discussion
Drought can have various socio-economic and environmental implications across spatial scales. Increased frequency of meteorological and agricultural droughts has affected the quantity of surface water and posed severe threats to hydrological and socio-economic drought conditions (Kuriqi et al. 2020). Thus, selection of site-specific socio-economic and environmental factors is crucial for drought vulnerability assessment. Earlier studies have utilized NDVI, rainfall, trend of temperature, surface water, soil types, availability of groundwater, income, health and literacy rate for examining drought vulnerability (Jiang et al. 2020; Thomas et al. 2016; Udmale et al. 2014; Ebi and Bowen 2016; Mundetia et al. 2015). However, integration of these factors with machine learning algorithms has not been explored. The ensemble approach combining drought conditioning factors with machine learning algorithms proved effective for analyzing drought vulnerability in our study area. High prediction accuracy of the ANFIS model was found for the 6-month drought. Thus, the model is suitable for assessing long-term agricultural and hydrological drought conditions. The association of rabi and kharif crops and net sown area with drought clearly indicated implications for the agricultural sector in the study area. Marked spatial variations were observed in drought vulnerability. The sub-basin, located on the eastern slopes of the Western Ghats, receives low rainfall because of its leeward location. Very high vulnerability for 1-month drought was observed in the watersheds located in the southern part of the sub-basin (Fig. 11a). Low rainfall, large area under fallow land, low NDVI, low NDWI leaf and cambisols vertisols soil are the main reasons for very high vulnerability to 1-month drought (Figs. 4 and 5). Watersheds located in the central part of the sub-basin experienced high drought vulnerability (Fig. 11a).
An increasing trend in highest maximum temperature and mean maximum temperature, large area under fallow land, high IMR, low to moderate rainfall and chromic vertisols soil were the main causes of high drought vulnerability. Low to moderate drought vulnerability was noticed in the watersheds located in the eastern and western parts of the sub-basin (Fig. 11a). High literacy rate, high NDVI and high income have made these watersheds moderately vulnerable (Figs. 4 and 5). Socio-economic conditions largely determined the vulnerability to 1-month drought in the sub-basin (Fig. 5). Our findings are in tune with Saha et al. (2021a, b), who assessed drought vulnerability in Odisha state of India using a fuzzy-analytical hierarchical process.
Very high vulnerability to 3-month drought was found in the watersheds located in the northern part of the sub-basin (Fig. 11b). Increasing trends in highest maximum temperature, mean maximum temperature and minimum temperature have contributed to very high drought vulnerability (Figs. 4 and 5). A high rate of evaporation, reduction in surface water and dryness in soil in response to increased temperature made these watersheds vulnerable to drought. This finding is in line with the study conducted by Krishnan et al. (2020). High drought vulnerability was observed in the watersheds located in the central and northeastern parts of the sub-basin (Fig. 11b). Large area under fallow land, an increasing trend in highest maximum temperature and mean maximum temperature, low per capita income, high IMR and low NDVI contributed to high drought vulnerability in these watersheds (Figs. 4 and 5). Low drought vulnerability was observed in watersheds located in the eastern part of the sub-basin (Fig. 11b).
Fig. 9 ANFIS model training data targets and outputs, MSE and RMSE for a 1-month droughts, b 3-month droughts, c 6-month droughts and d 12-month droughts
An increasing trend in monthly rainfall, a decreasing trend in highest maximum temperature and mean maximum temperature, very low area under fallow land and high literacy rate were the main reasons for low drought vulnerability (Figs. 4 and 5). Moderate drought vulnerability was recorded in scattered patches in the watersheds located in the western, south-western and eastern parts of the sub-basin (Fig. 11b). Very high vulnerability to 6-month drought was observed in watersheds located in the central and northern parts of the sub-basin (Fig. 11c). High water table depth was the main cause of the very high 6-month drought vulnerability (Fig. 4). These watersheds receive low rainfall because they are located in the rain shadow zone of the Western Ghats. Similar results were presented by Saha et al. (2021a, b). High drought vulnerability was observed in the watersheds located in the north-central part of the sub-basin (Fig. 11c). An increasing trend in highest maximum and mean maximum temperature, high NDVI and high IMR were responsible for high drought vulnerability (Figs. 4 and 5). Moderate drought vulnerability was recorded in the watersheds located in the eastern and central-west parts of the sub-basin (Fig. 11c). Moderate area under fallow land and low IMR were associated with moderate drought vulnerability (Figs. 4 and 5). Low drought vulnerability was experienced in the watersheds located in the south-eastern and south-western parts of the sub-basin (Fig. 11c). Low water table depth and high NDVI were the main reasons for low drought vulnerability in these watersheds (Fig. 4).
Very high vulnerability to 12-month drought was found in the watersheds located in the central-south part of the sub-basin (Fig. 11d). A decreasing trend in rainy days, few surface waterbodies and high water table depth were the main causes of very high drought vulnerability in these watersheds. High drought vulnerability was observed in the watersheds located in the south-central and north-eastern parts of the sub-basin (Fig. 11d). An increasing trend in highest maximum and mean maximum temperature, high fallow land, moderate HDI and low income were the main reasons for high drought vulnerability (Figs. 4 and 5). Moderate drought vulnerability was found in the watersheds located in the eastern and north-western parts of the sub-basin (Fig. 11d). The relationship of the various socio-economic factors with moderate drought vulnerability could not be effectively established. However, some factors, such as moderate area under fallow land and a gradually increasing trend of highest maximum and mean maximum temperature, have influenced moderate drought vulnerability (Figs. 4 and 5). Low drought vulnerability was observed in the watersheds located in the eastern and western parts of the sub-basin (Fig. 11d). A decreasing trend in highest maximum and mean maximum temperature and high rainfall in the eastern part, together with high income and high HDI, were the main reasons for low drought vulnerability (Figs. 4 and 5).
Long-term temperature trends estimated by CMIP6 models show a 3-6.8°C increase in temperature, while bias-corrected GCMs predict an increase of 4.2°C by the end of the century (Xu et al. 2021). Schellnhuber et al. (2012) estimated that a rise of 4°C will transform semi-arid climate to hot desert climate. The sub-basin experiences a semi-arid climate, and the increasing trend of highest maximum and mean maximum temperature will lead to an unexpected and unprecedented hot desert climate. The increase in drought vulnerability from 1-month to 12-month drought has created high stress on the groundwater table. Kharif crops, especially sugarcane, require frequent watering during the growth period, which has resulted in the decline of groundwater. Further, pumping of water for irrigating sugarcane has resulted in the drying up of wells (Clarke 2013). Thus, highly vulnerable watersheds should be accorded priority for mitigating long-term drought conditions in the sub-basin.

Advantages and limitations
The ANFIS method has many advantages over other methods. It can generate rules based on a neural network and provides robust results. It is an efficient method which takes membership functions and fuzzy rules as input and yields crisp output for many applications (Salleh et al. 2017). The fuzzy rules generated through this method can be applied in different geographical regions. Despite the advantages of low statistical error and high accuracy, ANFIS has some limitations that prevent it from being used on large datasets. Limitations such as the curse of dimensionality, interpretability of rules and parameter training need to be overcome when using large numbers of inputs (Salleh et al. 2017). Training is a complex task in ANFIS modelling and is largely determined by the initial values of the membership functions, the rule base and the partitioning method. The selection of membership functions is challenging and requires prior knowledge of the relationships among the input variables. As the number of inputs increases, the uncertainty of the membership functions also increases. Therefore, a limited number of input parameters is suggested. Grid partitioning of the ANFIS model generates a large number of rules, which cannot be easily understood by model users. We thus first analyzed the relationships among the variables.
Therefore, ANFIS was integrated with the frequency ratio model for input selection. A metaheuristic algorithm was utilized for efficient training and to reduce the computational complexity of the ANFIS model.
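The curse of dimensionality mentioned above follows from grid partitioning, which creates one rule for every combination of membership functions across the inputs. The sketch below counts those rules and runs a forward pass of a small first-order Sugeno system, the inference scheme that ANFIS trains. The choice of three membership functions per input, the Gaussian shape, and all parameter values are assumptions made for illustration; the configuration used in the study is not given in this section, and parameter training is not shown.

```python
import math


def grid_partition_rule_count(n_inputs, n_membership_functions):
    """Number of rules when every input has the same number of membership functions."""
    return n_membership_functions ** n_inputs


def gaussian(x, center, sigma):
    """Gaussian membership degree of x."""
    return math.exp(-0.5 * ((x - center) / sigma) ** 2)


def sugeno_forward(inputs, rules):
    """First-order Sugeno inference.

    Each rule is (mf_params, coeffs): mf_params holds one (center, sigma)
    pair per input, and coeffs holds one linear coefficient per input
    followed by a bias term.
    """
    weights, consequents = [], []
    for mf_params, coeffs in rules:
        strength = 1.0
        for x, (center, sigma) in zip(inputs, mf_params):
            strength *= gaussian(x, center, sigma)  # product T-norm
        weights.append(strength)
        linear = sum(p * x for p, x in zip(coeffs, inputs)) + coeffs[-1]
        consequents.append(linear)
    # Weighted average of the rule outputs gives the crisp result.
    return sum(w * f for w, f in zip(weights, consequents)) / sum(weights)


# 24 conditioning factors with an assumed 3 membership functions each.
n_rules = grid_partition_rule_count(24, 3)

# Two hypothetical rules on a single input: "low" gives 0, "high" gives 1.
rules = [([(0.0, 1.0)], [0.0, 0.0]), ([(1.0, 1.0)], [0.0, 1.0])]
midpoint_output = sugeno_forward([0.5], rules)
print(n_rules, midpoint_output)
```

With 24 inputs even three membership functions per input would give 3^24 = 282,429,536,481 grid-partition rules, which illustrates why reducing or pre-weighting the inputs matters before ANFIS training.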

Conclusion
The present study assessed drought vulnerability in the Godavari middle sub-basin of India using location-specific conditioning factors. Drought inventory maps were prepared for 1-month, 3-month, 6-month and 12-month droughts using the SPI. The frequency ratio model was utilized for analysing the relationship between the conditioning factors and drought frequency. Drought vulnerability was examined using the FR model and the ensemble adaptive neuro fuzzy inference system (ANFIS) model. SPI analysis revealed that the highest drought frequency was recorded for 1-month drought. A significant relationship was found between IMR and the 1-month, 3-month and 6-month drought frequencies. Validation through the ROC curve revealed high suitability of the ANFIS model for predicting drought vulnerability at all drought time scales. However, in our study the highest accuracy was found for the 6-month drought. Findings also revealed that high drought vulnerability increased from 1-month to 12-month drought. Very high vulnerability to 1-month drought was observed in the watersheds located in the southern part of the sub-basin. Watersheds located in the northern part of the sub-basin experienced very high vulnerability to 3-month drought. Watersheds lying in the north-western part of the sub-basin were found vulnerable to 6-month droughts, while watersheds of the south-central part were vulnerable to 12-month droughts. Water-saving farm practices, crop insurance and community participation in drought monitoring should be encouraged in the drought-vulnerable watersheds. Real-time drought monitoring through remote sensing data and development of an early warning system can reduce the long-term impact of drought vulnerability in the sub-basin.
Timely preparedness, effective response, development of health infrastructure and provision of drought relief through various government schemes can help in reducing socio-economic drought vulnerability in the sub-basin. Non-availability of recent data on socio-economic factors has been the main limitation of the study. Prior knowledge of the relationships among the various inputs for selecting membership functions is essential for successful utilization of the ANFIS model. The study demonstrated the effective utility of the model for drought vulnerability assessment. The model can be applied in other geographical regions to analyze drought vulnerability at various scales.
See Appendix Tables 4 and 5.