Multivariate statistics and entropy theory for irrigation water quality and entropy-weighted index development in a subtropical urban river, Bangladesh

Currently, a well-developed combination of irrigation water quality indices (IWQIs) and entropy water quality indices (EWQIs) for surface water appraisal in a polluted subtropical urban river is very scarce in the literature. To close this gap, we developed IWQIs by establishing statistics-based weights for variables recommended by the FAO 29 standard values using the National Sanitation Foundation Water Quality Index (NSFWQI) formula, and compared them with the proposed EWQIs based on information entropy in the Dhaleshwari River, Bangladesh. Fifty surface water samples were collected from five sampling locations during the dry and wet seasons and analyzed for sixteen variables. Principal component analysis (PCA), factor analysis (FA), Moran’s spatial autocorrelation, and a random forest (RF) model were applied to the datasets. Weights were allocated to the primary variables to compute IWQI-1 and IWQI-2 and EWQI-1 and EWQI-2, respectively. The resultant IWQIs showed a trend similar to that of the EWQIs and revealed poor to good water quality; IWQI-1 is suggested for the dry season and IWQI-2 for the wet season. The entropy theory identified Mg2+, Cr, TDS, and Cl− for the dry season and Cd, Cr, Cl−, and SO42− for the wet season as the major contaminants affecting irrigation water quality. The primary input variables were reduced to a shortlist of ten variables, which performed well in demonstrating water quality status because the weights were derived more effectively from PCA than from FA. The results of the RF model depict NO3−, Mg2+, and Cr as the most predominant variables influencing surface water quality. A significant dispersed pattern was detected for IWQImin-3 in the wet season (Moran’s I>0). Overall, both IWQIs and EWQIs can make water quality control cost-effective and completely objective, establishing a scientific basis for sustainable water management in the study basin.


Introduction
Water quality describes the suitability of water for a particular purpose and is controlled by the type and amount of dissolved constituents (Ewaid et al. 2019). These constituents play a pivotal role in plant growth and development, either directly through deficiency or indirectly by affecting nutrient availability (Salem et al. 2019). High-quality crops can therefore only be produced with high-quality irrigation water, as it is directly linked to the soil and plant environment (Singh et al. 2018). Timely monitoring, appraisal, and, where possible, forecasting of variations in water quality are thus required (Sadat-Noori et al. 2014; Matta et al. 2020; Islam et al. 2020a).
Generally, simultaneous measurement of physical, chemical, and biological water quality parameters is essential for a comprehensive surface water quality appraisal. One of the critical issues in water quality studies is the choice of variables that can be continuously monitored, together with the cost of collecting, analyzing, and interpreting the resulting datasets. To address these issues, water quality indices (WQIs) have been employed to classify water quality effectively from many parameters in a form that is widely regarded as informative for the end-user. Most WQIs rely on a numerical model that transforms the chosen water quality features into a single dimensionless number, generally between 0 and 100, which signals the water quality level to the end-user (Misaghi et al. 2017). Besides, a WQI gives decision-makers a comprehensive picture for taking the necessary steps to conserve a surface water body.
Researchers worldwide have made several attempts to develop a universal WQI for widespread use (Panda et al. 2016; Sutadian et al. 2017; Matta et al. 2020; Hossain and Patra 2020; Yotova et al. 2021; Wu et al. 2021). Purpose-specific WQIs, such as the irrigation water quality index (IWQI), have gained considerable attention in recent decades. National and international agencies have also proposed several indices, for instance, the US National Sanitation Foundation Water Quality Index (NSFWQI) (Brown et al. 1970), the Canadian Council of Ministers of the Environment Water Quality Index (CCMEWQI) (Khan et al. 2003), and the Oregon Water Quality Index (OWQI) (Tyagi et al. 2013). Of these widely used indices, the NSFWQI is the most general for demarcating surface water quality globally (Jahin et al. 2020). Weighting approaches are commonly divided into two groups: subjective (participatory) and objective (statistical) techniques. Subjective tools such as the Delphi method (Misaghi et al. 2017), the analytical hierarchy process (AHP) (Sutadian et al. 2016), and the revised Simos' procedure (Zardari et al. 2015) assign weights to parameters based on the opinions of subject experts, decision-makers, and various stakeholders.
Currently, a well-developed comprehensive IWQI to appraise river water suitability for crop production in a polluted subtropical urban river is hardly found in the literature. For example, Jahin et al. (2020) developed an IWQI for an arid and semi-arid region following the FAO 29 guidelines and the NSFWQI formula, where the assigned weights were derived from PCA/FA based on the loading factor of individual indicators. One of the main limitations of that work was the small sample size, which restricts its application. Similarly, Misaghi et al. (2017) introduced a new IWQI for irrigation uses based on the NSFWQI and the FAO 29 guidelines and applied this index to the Ghezel Ozan River, Iran. However, its main disadvantage is that the weights were taken from the Delphi method based on expert opinion, which introduces uncertainty and bias into different steps of the decision-making process (Islam et al. 2017a, 2020c; Singh et al. 2020; Saha et al. 2020; Matta et al. 2020).
A common problem of the subjective method is ambiguity and bias regarding the number and importance of the assigned parameters. To overcome these ambiguities, biases, and subjective weight selection problems, objective techniques such as principal component analysis (PCA) and factor analysis (FA) have been adopted. They assign a weight to a variable based on its loading on each factor, while variables are selected based on the variance explained by the PCs (Jahin et al. 2020; Islam et al. 2020a). These techniques have been employed for generating novel water quality indices (Tripathi and Singal 2019). The main advantage of PCA/FA is that it yields a few linear combinations of the original variables that summarize the dataset while preserving its inherent structure to the greatest possible extent (OECD 2008; Dutta et al. 2018). Besides, water quality in a specific region depends on several inter-associated variables that vary with geogenic and anthropogenic activities (Kumar et al. 2021). Thus, it is necessary to establish a region-specific IWQI, which can be more reasonable and economical because it reduces the sampling time, effort, and cost needed to monitor sites for a large number of variables (Tripathi and Singal 2019; Islam et al. 2020b).
To avoid complexity and uncertainty, a statistics-based objective tool such as PCA/FA is necessary to derive weights for a number of different indicators. For instance, PCA/FA can be an appropriate tool when at least 150-300 cases are available (Sutadian et al. 2017; Tripathi and Singal 2019). Since PCA/FA depends on the data supplied for the assessment (Sutadian et al. 2016), it permits weights to vary with spatiotemporal changes in water chemistry. However, less attention has been paid to assigning weights according to the communality derived from FA. Our research showcases a statistics-based tool that uses both PCA and FA to choose the critical influential water quality indicators and to develop universal and unbiased IWQIs based on selected physicochemical variables following the FAO 29 standard values.
In addition to PCA/FA, this study proposes an entropy-weighted water quality index (EWQI) for agricultural purposes based on information entropy and entropy weight (Amiri et al. 2014; Kamrani et al. 2016), following the DoE (1997) water quality guideline of Bangladesh. The weight-based EWQI is an enhancement over the IWQIs. Information entropy measures the uncertainty in a dataset and has been successfully applied in different fields (Singh et al. 2020; Gao et al. 2020; Amiri et al. 2021a). In this study, for the first time, the IWQIs are employed to quantify the suitability of surface water and are compared with the proposed EWQIs to investigate the sensitivity of the IWQIs in a polluted subtropical urban river basin in Bangladesh. Besides, the random forest model and Moran's spatial autocorrelation are applied to examine the parameters contributing to the IWQIs and their spatial patterns. Although some recent combined techniques give reasonable outcomes in surface water quality appraisal (Ewaid et al. 2019; Matta et al. 2020; Hasan et al. 2020; Yotova et al. 2021; Wu et al. 2021), such techniques are still scarce in Bangladesh. This study contributes to closing this gap in the literature. Therefore, we intend (i) to determine statistics-based weights of broadly accepted parameters recommended by FAO to establish IWQIs using PCA/FA with the formula adopted by the NSFWQI; (ii) to assess surface water suitability with the proposed EWQIs, following the Bangladesh guideline and entropy theory; (iii) to appraise surface water quality through concurrent comparison of IWQIs and EWQIs and to identify the factors influencing them using the random forest model; and (iv) to choose the most critical parameters to include in a fast, easy, and cost-effective surface water quality assessment. The indices were applied to surface water in the Dhaleshwari River basin, Bangladesh, as a case study to appraise its surface water quality.

Study area specification
In this study, the Dhaleshwari River in Hemayetpur of Savar subdistrict, situated in the northwest part of Dhaka city, Bangladesh, was investigated. The city accommodates approximately 1.4 million inhabitants. The study area lies at latitude 23°51′30.0024″ N and longitude 90°16′0.0120″ E (Fig. 1), with a land surface elevation of 15 m (Banglapedia 2018). The Dhaleshwari River divides into two branches: the name Dhaleshwari remains for the northern upstream branch, which joins its other branch, the Kaligonga River, in the southern downstream part at Manikganj district (Hasan et al. 2020). The divided branches merge again before combining with the Shitalakshya River and ultimately merging with the Meghna River, which ends in the delta of the Ganges-Brahmaputra-Meghna Rivers (Ahsan et al. 2019). Three prominent seasons are observed in the study area: (1) the pre-monsoon season (March-June: hot and humid rainy season with temperatures reaching up to 40°C), (2) the monsoon or wet season (July-October: very wet with temperatures of ~30°C), and (3) the post-monsoon or dry season (November-February: winter season with temperatures of 10 to 20°C). On average, 2000 mm of rainfall occurs in the rainy season, with an average of 75% humidity and 60% cloud cover (Rahman et al. 2020). The study area comprises Pleistocene alluvium soil with a gentle slope heading from west to east. Land usages are mostly associated with agriculture (24.3%), agricultural laborer (12.8%), wage laborer (4.44%), forestry, cattle breeding, and fishing (1.90%), industry (1.37%), service (20.7%), commerce (17.4%), transport (3.96%), construction (1.66%), and others (11.5%). The total cultivable land is 16,750 hectares, with fallow land of ~10,550 hectares (Ahsan et al. 2019).

Sample collection process
A total of 50 river-water samples were collected from 5 sampling stations (denoted as S1, S2, S3, S4, and S5 in Fig. 1) in the study area during the monsoon or wet season (July-October: 25 samples) and the post-monsoon or dry season (November-February: 25 samples) of 2018, following the standard guidelines (APHA 2012). Sampling stations were chosen horizontally to cover the entire industrial area, including the main effluent discharge point (S2), with subsequent stations ~0.5 km apart from each other. At each sampling station, five samples (composite samples of two/three independent collections) were collected: one from the center of the sampling point and four around it, separated from each other by 100-200 m. Polyethylene plastic bottles (1000 mL), preconditioned with 5% conc. HNO3 and rinsed with double-deionized water (Ahsan et al. 2019), were used to collect the river-water samples. Before sampling, the bottles were rinsed at least three times with the water to be collected at each sampling station. To collect water samples, the prepared sampling bottles were submerged ~10 cm below the water surface of the river. After sampling, the samples were immediately acidified with 2 mL conc. HNO3 per 1000 mL of sample (Habib et al. 2020; Ahsan et al. 2019); the bottles were then capped carefully and marked with their respective identification numbers. The same number of duplicate samples was also collected without acidification and labeled accordingly for determining the anions and some physicochemical parameters of the Dhaleshwari River. All samples were then placed in an ice bath, carried to the laboratory, and preserved in a refrigerator at 4°C on the same day until analysis (Islam et al. 2020c; Ahsan et al. 2019).

Analytical procedures
All water samples were analyzed at the Institute of National Analytical Research and Service (INARS, ISO/IEC 17025: 2017 accredited laboratory), Bangladesh Council of Scientific and Industrial Research (BCSIR), Dhaka, Bangladesh. Before analysis, samples were allowed to reach ambient room temperature and shaken well for homogeneity. For metal analysis, a water sample (100 mL) was transferred into a cleaned 250 mL glass beaker with a calibrated pipette and acidified with 5 mL of conc. HNO3. For digestion, samples were heated on a hot plate (150-180°C) until the volume was reduced to ~25-30 mL. Samples were then allowed to cool and transferred to a cleaned and calibrated 100 mL volumetric flask by rinsing the beaker at least three times with double-deionized water, and the flask was then filled up to the mark (100 mL) with deionized water. Samples were then filtered and preserved in a cleaned and dried nontransparent polyethylene plastic bottle (250 mL), properly labelled, for the determination of metal content. Multiple procedure blanks were also prepared for quality assurance. The concentration of metals in digested samples was determined with different configurations of atomic absorption spectrometers (Models: a. AA240 FS, b. GTA 120-AA240Z, c. SpectrAA 220, Varian, Australia) following the standard method (APHA 2012; Ahsan et al. 2019; Siddique et al. 2020; Islam et al. 2020b), using the characteristic wavelengths of the metals with hollow-cathode lamps and directly aspirating the digested samples into the air-acetylene flame. Instrumental calibration was performed by analyzing standards with known metal concentrations. During the analysis, a standard solution or certified reference material (CRM) was measured after every five samples and a method blank after every ten samples to monitor the instrument's performance and minimize error.
The metal content of each sample was measured in triplicate, and results were reported as the average (n=3). For sample preparation, analytical reagent grade HNO3 obtained from Merck, Germany, was used. Certified reference materials (CRM) for standard stock solutions of Fe, Pb, Cr, Co, Cu, Ni, Mn, Zn, Hg, Mg, As, Cd, and Ag were obtained from Fluka Analytical (Sigma-Aldrich, Germany) for calibration purposes. All working solutions were prepared in double-deionized water.
The non-acidified water samples were used for anion determination. Samples were filtered and transferred into small sample vials with a syringe equipped with a 40-micron micro-filter, carefully avoiding any possible contamination. The samples in the vials were then analyzed by ion chromatography (Model: SIC10AVP, Shimadzu, Japan) for the anions Cl−, F−, Br−, NO2−, NO3−, and SO42−, following the standard procedure reported elsewhere (APHA 2012; Ahsan et al. 2019; Islam et al. 2020c). For instrumental calibration, a mixed-standard solution of anions prepared from CRM (Fluka Analytical, Sigma-Aldrich, Germany) was used as a working standard. In this study, the spike recoveries for all metals and anions were 90-110%, estimated using suitable equations as reported earlier.
Water temperature was measured in situ using a calibrated thermometer, while DO, EC, and pH were also measured in situ using a portable multi-parameter meter (Sension TM 156, HACH, USA) calibrated with NIST (USA) traceable standards. Biochemical oxygen demand (BOD5) was determined by the 5-day method, whereas TDS was estimated gravimetrically following standard methods (Ahsan et al. 2019). Alkalinity and hardness were estimated titrimetrically. In this study, only 16 parameters were used for both wet and dry seasons, since FAO standard values are available only for these parameters.

Selecting indicators
The parameters for assessing irrigation water quality (IWQ) were first selected according to the recommended limits of FAO-29 (Ayers and Westcot 1994; FAO 2008). The variables were then normalized by computing Z scores, since variables expressed in different units cannot be aggregated meaningfully (Tripathi and Singal 2019; Sutadian et al. 2017). To establish the least data set (LDS) from the resulting Z scores, PCA/FA and correlation analysis were utilized (OECD 2008). In this way, a minimum IWQI (IWQImin) was constructed.
Only principal components with eigenvalues greater than 1 were retained. Varimax rotation was performed to enhance factor interpretation. Under each PC, variables with loadings greater than 0.5 were retained. When more than one variable remained under a single principal component, a multivariate correlation analysis was performed to check for redundancy among them (Jahin et al. 2020). Significantly correlated variables were considered redundant, and only those with the highest loadings were included. However, among non-correlated, highly loaded variables, each variable was deemed significant and was selected for the study.
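The retention rules described above (eigenvalue greater than 1, rotated loading greater than 0.5) can be illustrated with a minimal Python sketch. The function name `select_candidates`, the variable names, and all numbers in the example are hypothetical and are not taken from Table 2; the eigenvalues and rotated loadings are assumed to have been computed beforehand, and the subsequent correlation-based redundancy check is not shown.

```python
def select_candidates(eigenvalues, loadings, eig_cut=1.0, load_cut=0.5):
    """Return {PC number: [variables]} for PCs with eigenvalue > eig_cut.

    loadings maps each variable name to its list of rotated loadings,
    one entry per principal component (same order as eigenvalues).
    Variables are kept when |loading| > load_cut and are listed from the
    highest to the lowest absolute loading.
    """
    selected = {}
    for pc, eig in enumerate(eigenvalues):
        if eig <= eig_cut:
            continue  # component dropped: eigenvalue not above the cutoff
        kept = [var for var, row in loadings.items() if abs(row[pc]) > load_cut]
        kept.sort(key=lambda var: -abs(loadings[var][pc]))
        selected[pc + 1] = kept
    return selected


# Hypothetical illustration only (not the study's data).
example_eigenvalues = [2.1, 1.2, 0.7]
example_loadings = {
    "EC": [0.91, 0.10, 0.05],
    "TDS": [0.20, 0.80, 0.10],
    "pH": [0.55, -0.62, 0.30],
    "Zn": [0.10, 0.20, 0.90],
}
print(select_candidates(example_eigenvalues, example_loadings))
```

In this toy input the third component is discarded because its eigenvalue is below 1, so Zn is not a candidate even though its loading on that component is high.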

Sub-index value estimation
The crucial and original parameters (PCA/FA) were converted to a common unit score (Si) varying between 0 and 100 according to Eq. (1), where Va is the actual laboratory measurement value, Vs is the value recommended by the FAO guidelines, and Vi is the ideal value for pure water (7.0 for pH and zero for the other analytical parameters).

Establishing parameter weights
Based on PCA/FA, a weight value (Wi) was obtained for each parameter. Weights were assigned using two approaches. In the first approach, the eigenvalue of each principal component and the factor loading of each parameter from PCA were utilized (IWQI-1) (Wang et al. 2017). In the second approach, the communality of each indicator after FA was employed (IWQI-2), as described by Jahin et al. (2020). The communality value indicates the portion of an indicator's variance that is explained; it varies between 0 and 1, and a high value implies a more significant contribution (Härdle and Simar 2015). Weight values were then obtained as the ratio of the communality of each indicator to the sum of all indicators' communalities. For the three IWQImin indices, weights were estimated using three tools: communalities (IWQImin-1), variance (IWQImin-2), and eigenvalues and factor loadings (IWQImin-3). IWQImin-2 weights were obtained from the ratio between the variance of each PC (%) and the total variance (%) of all PCs with eigenvalues greater than 1. Weights for IWQImin-3 were estimated based on the eigenvalue of each PC and the factor loading of each parameter retained in the LDS.
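The two ratio-based weighting rules described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the communalities in the example are invented, and the PC variance percentages are the dry-season values quoted later in the text (35.13, 26.65, 12.30, 9.47, and 6.16%). How a PC-level share is distributed among several variables under the same PC is not specified here and is therefore not attempted.

```python
def communality_weights(communalities):
    """Weight of each indicator = its communality / sum of all communalities."""
    total = sum(communalities.values())
    return {name: value / total for name, value in communalities.items()}


def pc_variance_shares(variance_pct):
    """Share of each retained PC = its variance (%) / total variance (%) of retained PCs."""
    total = sum(variance_pct)
    return [v / total for v in variance_pct]


# Hypothetical communalities (illustration only, not from the study).
weights = communality_weights({"EC": 0.9, "TDS": 0.8, "Cr": 0.3})
print(weights)

# Variance explained by the five dry-season PCs with eigenvalues above 1.
shares = pc_variance_shares([35.13, 26.65, 12.30, 9.47, 6.16])
print([round(s, 3) for s in shares])
```

Both rules yield weights that sum to 1, which keeps the final index on the same 0 to 100 scale as the sub-index scores.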

Final index and quality remarks
The irrigation water quality index (IWQI) is a robust tool (Xiao et al. 2014). The final indices were computed by aggregating the scores and weights into a single dimensionless value. The final index varied between 0 and 100, with high values indicating good water quality. Following Jahin et al. (2020), water quality can be classified as excellent (91-100), good (71-90), moderate (51-70), low (26-50), and poor (0-25).
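As a rough illustration of this aggregation and classification step, the sketch below assumes an additive weighted-sum form, in which the index is the sum of Wi × Si divided by the sum of the weights; this form is an assumption made for illustration, since the aggregation equation itself is not reproduced in this text. The function names and example values are hypothetical, and treating the class limits as lower bounds (91, 71, 51, 26) for non-integer index values is likewise an assumption.

```python
def aggregate_index(scores, weights):
    """Weighted average of sub-index scores (0-100); assumes an additive form."""
    total_weight = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total_weight


def classify(index):
    """Map an index value to the quality classes of Jahin et al. (2020).

    Class limits are treated as lower bounds (an assumption for values
    that fall between the integer ranges given in the text).
    """
    if index >= 91:
        return "excellent"
    if index >= 71:
        return "good"
    if index >= 51:
        return "moderate"
    if index >= 26:
        return "low"
    return "poor"


# Hypothetical scores and weights (illustration only).
scores = {"EC": 80, "TDS": 60, "Cr": 20}
weights = {"EC": 0.5, "TDS": 0.3, "Cr": 0.2}
value = aggregate_index(scores, weights)
print(round(value, 2), classify(value))
```

Dividing by the sum of the weights leaves the result unchanged when the weights already sum to 1 and keeps the index within 0 to 100 otherwise.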

Development of entropy water quality index
In information entropy theory (Shannon 1948), entropy is taken as a measure of uncertainty. It is employed to determine the weight of each variable, which can reduce the error associated with artificially assigned weights (Pei-Yue et al. 2010a, b; Amiri et al. 2021a). Entropy denotes how stochastic a probabilistic process can be (Islam et al. 2017b). In this research, the EWQI is employed to describe the river-water quality because of its acceptance, aptness, robustness, and consistency (Amiri et al. 2014; Islam et al. 2020b). Entropy weight is a widely used tool to represent the weights of variables, and three steps are used to estimate the entropy water quality index (EWQI). The equations are described briefly, step by step (Islam et al. 2020a).
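The entropy-weight calculation that forms the first of these steps can be sketched in Python as follows. The formulas in the code (min-max normalization of each parameter, conversion to column proportions with a small offset, Shannon entropy scaled by ln m, and weights proportional to 1 − ej) follow a commonly used entropy-weight recipe; they are assumptions for illustration, because the exact equations are not reproduced in this text. The function name, the offset `eps`, and the example data are hypothetical.

```python
import math


def entropy_weights(data, eps=1e-4):
    """Return (entropies, weights) for an m x n table of measurements.

    data: list of m samples, each a list of n parameter values.
    eps:  small offset that avoids log(0); its value is an assumption.
    """
    m = len(data)
    columns = list(zip(*data))
    entropies = []
    for col in columns:
        lo, hi = min(col), max(col)
        span = hi - lo
        # Min-max normalization; a constant column carries no information.
        y = [(x - lo) / span if span else 0.0 for x in col]
        shifted = [v + eps for v in y]
        total = sum(shifted)
        p = [s / total for s in shifted]
        e = -sum(pi * math.log(pi) for pi in p) / math.log(m)
        entropies.append(e)
    # Lower entropy means more contrast between samples, hence a larger weight.
    diversity = [max(0.0, 1.0 - e) for e in entropies]
    total_div = sum(diversity)
    if total_div == 0:
        weights = [1.0 / len(columns)] * len(columns)
    else:
        weights = [d / total_div for d in diversity]
    return entropies, weights


# Hypothetical data: 3 samples x 2 parameters (illustration only).
ent, w = entropy_weights([[1.0, 0.0], [1.0, 0.0], [1.0, 10.0]])
print([round(e, 4) for e in ent], [round(x, 4) for x in w])
```

In the toy data the first parameter is identical in all samples, so its entropy is close to 1 and its weight is close to 0, while the second parameter receives almost all of the weight.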
In the first step, an entropy weight must be formulated and assigned to the parameters (Pei-Yue et al. 2010a, b; Jian-Hua et al. 2011; Amiri et al. 2021a). According to the observed data, m water samples (i = 1, 2, …, m) were taken to analyze n water quality parameters (j = 1, 2, …, n), which together form the data matrix. The data were then normalized, and the ratio of the index value of index j in sample i was computed. From these ratios, the information entropy ej was calculated. The smaller the value of ej, the bigger the effect of index j. Finally, the entropy weight of each index was calculated from its information entropy.

Random forest model
Breiman (2001) developed the tree-based, supervised random forest (RF) model. The RF model was run in this study using the "randomForest" package in the free R statistical software (R Core Team 2014). The model was used as a feature selection tool to identify variable importance in the dataset, and it has some benefits in studying inconsistent associations compared with other machine learning models such as artificial neural networks and support vector machines (Rahman and Islam 2019; Rahman et al. 2020; Islam et al. 2020d; Salam and Islam 2020). The RF model provides good, robust, and relatively more accurate outcomes than other models. A detailed mathematical computation and theoretical basis of the RF model can be found in Breiman (2001). In this work, the mean decrease in Gini values was employed to measure the relative importance of the physicochemical variables. A lower Gini value denotes a less important input variable (Islam et al. 2020d). Thus, this model was applied to explore the importance of the physicochemical parameters affecting surface-water quality during the dry and wet seasons in the polluted subtropical river.

Statistical analyses
Spatial autocorrelation, measured by Moran's I index, was used to assess the spatial distribution pattern of the surface-water quality indices. This statistical approach involves global spatial autocorrelation analysis, which represents the overall spatial association across the studied locations, whereas local spatial autocorrelation represents the level of spatial autocorrelation at a specific site. Moran's I index varies from −1 to +1 and generates a p value and a Z score to appraise the degree of autocorrelation (Bhuiyan et al. 2016; Liu and Mao 2020; Islam et al. 2020a). A negative Moran's I value indicates that the data are spatially dispersed, whereas a positive value indicates that the data are spatially clustered (Islam et al. 2017a). In addition, Moran's I index was tested based on 999 permutations at the significance level of p<0.05. Details of the computation of Moran's I index can be found elsewhere (Liu and Mao 2020). Statistical analyses were performed with the SPSS statistical package (version 25) for Windows. The Kolmogorov-Smirnov test was run to check the normality and homogeneity of the surface water data for both dry and wet seasons. Pearson's correlation matrix was calculated to check the association between the analyzed hydro-chemical variables. The coefficient of determination (R2) was utilized to estimate the goodness of fit of the tested models. The root mean square error (RMSE) was computed to assess the predictive capability of all the models. A paired-sample t test was run to establish the statistical differences between seasons in the concentrations of the physicochemical parameters.
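For orientation, the global Moran's I statistic and a permutation-based pseudo p value can be sketched as below. This is a minimal Python illustration, not the software workflow used in the study: the function names, the chain-shaped neighbor matrix linking five consecutive stations, the random seed, and the example values are all hypothetical, and the pseudo p value compares each permuted statistic with the expected value −1/(n − 1) under no spatial autocorrelation.

```python
import random


def morans_i(values, w):
    """Global Moran's I for a list of values and an n x n spatial weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in w)  # sum of all spatial weights
    num = sum(w[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / s0) * (num / den)


def permutation_p(values, w, n_perm=999, seed=0):
    """Two-sided pseudo p value from random relabelling of the locations."""
    rng = random.Random(seed)
    expected = -1.0 / (len(values) - 1)
    observed = morans_i(values, w)
    shuffled = list(values)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(morans_i(shuffled, w) - expected) >= abs(observed - expected):
            extreme += 1
    return observed, (extreme + 1) / (n_perm + 1)


# Hypothetical chain of five stations: each station neighbors the next one.
chain = [[1 if abs(i - j) == 1 else 0 for j in range(5)] for i in range(5)]
print(morans_i([1, 2, 3, 4, 5], chain))  # smooth gradient: positive I
print(morans_i([1, 5, 1, 5, 1], chain))  # alternating values: negative I
print(permutation_p([1, 2, 3, 4, 5], chain))
```

With only five locations the permutation distribution is coarse, so pseudo p values from such a small layout should be read with caution.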

Water quality status
The descriptive statistics of all sixteen physicochemical variables, including nine trace metal concentrations, for the 50 surface water samples from five sampling locations in the dry and wet seasons are tabulated in Table 1. Physicochemical variables reflect the type, quality, and characteristics of surface water (Islam et al. 2020a). The pH mainly controls surface water chemistry, such as alkalinity, chemical condition, and the solubility of dissolved metals. During the dry season, the pH values of the water samples varied from 7.49 to 9.87 with a mean of 8.07 ± 0.64, while in the wet season, the pH values ranged from 7.28 to 7.66 with a mean of 7.51 ± 0.10, which indicates slight alkalinity. This is because of the elevated influx of NO3− and SO42− from industrial activities (Islam et al. 2017b; Hasan et al. 2020). In the dry season, the mean values of pH, EC, and TDS were 8.07 (range: 7.28-9.87), 1.15 dS/m (range: 0.71-3.1), and 809.24 mg/L (range: 388-3423), respectively. The mean value of pH (8.07) surpassed 40% of the FAO standard range for irrigation in the dry season (Ayers and Westcot 1994; FAO 2008). The EC varied from 0.71 to 3.1 dS/m, and the elevated EC values were due to the discharge of industrial effluent into the irrigation system in the dry season. The study area is characterized by industrial activities such as tanneries and textiles. Tannery wastewater in Bangladesh has been reported to have very high EC (42500 μS/cm) and TDS (21300 mg/L) (Jahan et al. 2014), and high pH (>10) may be attributed to textile wastewater (Dey and Islam 2015). However, the TDS value crossed 12% of the FAO standard limit for irrigation in the dry season. The elevated EC and TDS at the study sites in the dry season are due to pollutants from the textile, tannery, petrochemical, and automobile industries, agricultural inputs on the lands beside the river, urban waste, and municipal waste (Mandal et al. 2017).
The mean concentrations of Mg (20.69 mg/L), Cl− (108.38 mg/L), and SO42− (177.27 mg/L) were also found within the FAO standards, except for NO3− (36.95 mg/L). The elevated SO42− content in the dry season may be due to the low river flow and high evaporation of water, which accentuate the effect of effluent discharged into the Dhaleshwari River (Hasan et al. 2020 , 15.08 mg/L; and Mg2+, 5.95 mg/L, respectively, in the wet season. In addition, the concentrations of metals (Pb, Cd, Cr, Mn, Fe, Co, Cu, Zn, and As) in the water samples were determined for both seasons. Results revealed that, except for Cr (0.58 mg/L) and Mn (0.42 mg/L) in the dry season, the mean concentrations of all the other metals were within the standard values recommended by the FAO guideline. In the wet season, all the studied metals were within the recommended levels in the river water, but in the dry season, 76%, 88%, and 4% of the samples exceeded the Cr, Mn, and Cu concentrations recommended by FAO, respectively (FAO 2008). In some of the water samples, the concentrations of the studied metals were high enough to make the surface water unsuitable for irrigation, except for As, Pb, and Cd, whose contents were lower than the FAO standard limits for irrigation purposes. The variations in the analyzed parameters were attributed to the location of the sluice gate between the point source and the collection points; seasonal variation was also a dominant factor in the increased contamination level of the Dhaleshwari River.

Principal component analysis for water quality parameter selection
The first five principal components (PCs) had eigenvalues above one and accounted for 89.73% of the variance of the dataset for the dry season (Table 2). PC1 (first) explained 35.13% of the total variance with an eigenvalue of 5.97. It included the most important variables influencing water chemistry in the study area, i.e., pH, EC, Mn, Fe, and SO42−. The components in PC1 are derived from lithologic influence related to the leaching of subordinate salts by precipitation. However, high pH implies severe weathering of the natural ecosystem, and oxidation of Fe and Mn may also be a likely source of metal contamination (Islam et al. 2017a). These variables were well-associated and coherent with each other (Table 2). Only EC and SO42− were taken into consideration for the least dataset (LDS). PC2 (second) explained 26.65% of the variance with an eigenvalue of 4.53. PC2 was dominated by variables of mixed sources, including TDS, Cr, Cu, Co, and NO3−. The components in PC2 originate from an integrated source, where Cr, Cu, and Co may come from untreated tannery waste, municipal sewage, and paint industries that share a common drainage system with the tannery industries (Habib et al. 2020). Vehicular use and coal combustion, via municipal runoff, may be another potential source of the Cr that contaminates the river water (Tamim et al. 2016). NO3− may derive from anthropogenic activities and agro-farming practices, particularly near the river mouth. The maximum-loaded variables, TDS and Cr, were kept for the LDS because of their significant association with each other. PC3 (third) represented 12.30% of the variance with an eigenvalue of 2.09 and was dominated by Pb, Cd, and Cl−. Pb and Cd inputs into the aquatic environment derive from urban waste leachates containing Cd batteries from automobile workshops, scrapping of Pb and Cd from batteries, and smelting plants (Kumar et al. 2021).
Because of the strong significant association among these parameters, all of them were retained for the LDS. PC4 (fourth) was responsible for 9.47% of the variance with an eigenvalue of 1.61. PC4 was dominated by two well-associated parameters, i.e., Mg2+ and NO3−, which originated from lithogenic inputs of magnesium nitrate, a highly water-soluble crystalline source of Mg2+ in the river basin; thus, both maximum-loaded parameters were retained for the LDS. PC5 (fifth) was responsible for 6.16% of the variance with an eigenvalue of 1.05. Under PC5, As, Co, and Zn were highly loaded and did not exhibit a significant association with each other. Hence, the highest loaded element, Zn, was taken for the LDS because Zn in the Dhaleshwari River basin is attributed solely to industrial wastes (Islam et al. 2020a). Arsenic and Co may originate from geogenic sources (Islam et al. 2017b; Islam et al. 2020a). Agricultural runoff from the urban region may be a possible transport pathway of these pollutants into river water bodies. The results of PCA reveal that the first 5 PCs had eigenvalues exceeding one and accounted for 82.52% of the variance of the dataset in the wet season. PC1 (first) explained 32.84% of the total variance with an eigenvalue of 5.25. It included the most critical parameters, e.g., EC, TDS, Cl−, NO3−, and SO42−, affecting the surface water chemistry in the study site (Table 2). These parameters may be considered vital parameters in the surface water samples and originate from geogenic sources, which indicates more ionic substances in the analyzed samples (Hasan et al. 2021). Though they were well-associated, only the higher loaded EC, TDS, and NO3− were chosen for the LDS. PC2 (second) explained 24.06% of the total variance with an eigenvalue of 3.85. The   (Islam et al. 2017c).
PC3 was responsible for 11.33% of the variance with an eigenvalue of 1.81 and was dominated by Co, Cu, and As, which may therefore have a common source. The Cu, Co, and As may be attributed to elevated agricultural pesticide and fertilizer use associated with agro-farming practices alongside the Dhaleshwari River basin (Hasan et al. 2020).
Cobalt showed a strong significant positive association with Cu, whereas As exhibited a negative association with both. Because of their strong significant positive association and higher loadings, Co and Cu were retained for the LDS. PC4 explained 8.18% of the variance with an eigenvalue of 1.31. It was dominated by three parameters: Cd, Mg2+, and Zn. Mg2+ and Zn had the higher loadings, whereas Cd did not show any significant association with Mg2+ or Zn. Cadmium and Mg2+ might be related to geogenic inputs such as various mineral phases, e.g., carbonate minerals (Islam et al. 2017b), while Zn enrichment mainly originated from anthropogenic sources, including the re-tanning and basification phases of leather processing in the tannery industries. Thus, Mg2+ and Zn were chosen for the LDS. PC5 explained 6.10% of the variance with an eigenvalue of 1.08. Chromium had the highest loading and was retained for the LDS. The chromium may originate from untreated industrial waste released into the river water (Habib et al. 2020; Islam et al. 2020b).
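The component retention rule used above (keep PCs with eigenvalues above one) and the cumulative variance figures can be checked with a few lines of Python. The sketch below uses only the eigenvalues and variance percentages quoted in the text; the function names are hypothetical, and `relative_weights` (normalizing explained variance across retained PCs) is an illustrative assumption, not necessarily the weighting scheme behind Tables 2 and 3.

```python
# (eigenvalue, explained variance in %) for the first five PCs, as quoted in the text.
DRY = [(5.97, 35.13), (4.53, 26.65), (2.09, 12.30), (1.61, 9.47), (1.05, 6.16)]
WET = [(5.25, 32.84), (3.85, 24.06), (1.81, 11.33), (1.31, 8.18), (1.08, 6.10)]


def retained_components(components, threshold=1.0):
    """Keep components whose eigenvalue exceeds the threshold (eigenvalue > 1 rule)."""
    return [c for c in components if c[0] > threshold]


def cumulative_variance(components):
    """Sum the explained-variance percentages of the given components."""
    return sum(variance for _, variance in components)


def relative_weights(components):
    """Hypothetical helper: share of each retained PC in the retained variance."""
    total = cumulative_variance(components)
    return [variance / total for _, variance in components]


if __name__ == "__main__":
    for season, comps in (("dry", DRY), ("wet", WET)):
        kept = retained_components(comps)
        print(season, len(kept), round(cumulative_variance(kept), 2))
        print([round(w, 3) for w in relative_weights(kept)])
```

Summing the quoted percentages gives 89.71% (dry) and 82.51% (wet), which agree with the reported 89.73% and 82.52% up to rounding of the individual components.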

Inter-associations among entropy water quality variables
Before assessing the IWQ of all water samples from the study sites, entropy theory was used to appraise surface water quality. It is essential to recognize the association of information entropy and entropy weight with the analyzed parameters. It is well known that physicochemical variables with a lower entropy weight and a higher information entropy value have little influence on overall water quality (Kamrani et al. 2016; Islam et al. 2017a, 2020c; Habib et al. 2020). Moreover, entropy weighting reduces the relative error by replacing artificial (subjective) weights with objective ones derived through a robust and consistent weighting method. The entropy weight and information entropy for the 16 tested parameters are outlined in Table 1. The results showed that Mg2+, Cr, TDS, and Cl− had the greatest influence on surface water quality during the dry season in the Dhaleshwari River basin. These had a comparatively lower information entropy value and a larger entropy weight among the physicochemical variables. In contrast, As and Co had little impact on the overall water quality in the dry season because of their lower entropy weight. For the wet season, Cd, Cr, Cl−, SO42−, and TDS had a greater influence on water quality at the study sites because of their higher entropy weight and lower information entropy value among all water quality parameters, while NO3− and pH had little influence on surface water quality. The influence of the physicochemical variables on surface water quality follows the descending order Mg2+ > Cr > TDS > Cl− > SO42− > EC > Fe > pH > Cu > Mn > Cd > Pb > Zn > NO3− > As > Co for the dry season and Cd > Cr > Cl− > SO42− > TDS > Pb > Co > Cu > Zn > Fe > EC > Mg2+ > As > Mn > NO3− > pH for the wet season.
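The relationship between information entropy and entropy weight described above can be illustrated with a minimal Python sketch of the standard Shannon entropy weighting scheme. It assumes non-negative measurements and uses simple column proportions; many studies first apply a min-max normalization, which is omitted here for brevity. The function name and the toy data are hypothetical and are not taken from Table 1.

```python
import math


def entropy_weights(samples):
    """Return (information entropy e_j, entropy weight w_j) for each variable.

    samples: list of rows (one per water sample), each a list of
    non-negative values (one per variable).
    """
    m = len(samples)
    if m < 2:
        raise ValueError("at least two samples are required")
    n = len(samples[0])
    k = 1.0 / math.log(m)  # scales entropy into the range [0, 1]

    entropies = []
    for j in range(n):
        column = [row[j] for row in samples]
        total = sum(column)
        if total <= 0:
            raise ValueError("each variable needs a positive column sum")
        e = 0.0
        for x in column:
            p = x / total
            if p > 0:  # the limit of p*ln(p) as p -> 0 is 0
                e -= k * p * math.log(p)
        entropies.append(e)

    # Variables with lower entropy carry more information, hence more weight.
    divergence = [1.0 - e for e in entropies]
    d_total = sum(divergence)
    if d_total <= 0:
        raise ValueError("all variables are constant; weights are undefined")
    weights = [d / d_total for d in divergence]
    return entropies, weights


if __name__ == "__main__":
    # Toy data: variable 0 is constant across samples, variable 1 varies strongly.
    e, w = entropy_weights([[5.0, 1.0], [5.0, 0.0], [5.0, 0.0]])
    print(e, w)
```

In the toy data, the constant variable has the maximum entropy and receives essentially no weight, which mirrors the statement that variables with high information entropy and low entropy weight have little influence on the overall index.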

Appraisal of water quality using IWQIs
Based on the outcomes of PCA/FA, the weights (Wi) acquired for each variable are outlined in Tables 2 and 3. Table 3 shows that the weights for the toxic elements Pb and Cd are lower than those of the essential elements Fe and Mn. The main reason is that concentrations of toxic elements are comparatively lower than those of essential elements in this study, because essential elements such as Mn and Zn can catalyze reactions by binding to substrates in surface water, favoring different reactions, including the mediation of redox reactions, via reversible changes in the oxidation state of the metal ions (Hosseini et al. 2015; Islam et al. 2020a). Earlier works (Sutadian et al. 2017; Tripathi and Singal 2019; Islam et al. 2020a; Janin et al. 2020; Wu et al. 2021) reported that a science-based appraisal of variable weights using the PCA/FA approach relies mostly on two conditions: first, the existence of inter-association among variables, and second, a minimum of 150-300 cases for the analysis. Since the chosen variables showed strong significant associations with each other and the number of tested cases was 250, the computed variable weights are reasonable for constructing new irrigation water quality indices (IWQImin-1, IWQImin-2, and IWQImin-3). In addition, the results of the Shannon entropy theory, i.e., the entropy weight (wj) and information entropy (ej) of each variable, are shown in Table 1. Previous studies (Amiri et al. 2014; Islam et al. 2017a, 2020a) also showed that entropy-based assessment of variables can deal with the uncertainty influencing water quality and can be used successfully in water quality evaluation because of its robustness, reliability, and precision. Information entropy helps in extracting the amount of information by measuring the equality, diversity, flexibility, complexity, interactivity, and redundancy of random datasets. It also quantifies the uncertainty or randomness in the dataset.
The entropy weight of a water quality parameter determines its impact on water quality. Parameters with the lowest entropy value and highest entropy weight have the maximum impact on overall quality (Islam et al. 2017a). In the current study, we used the entropy weight and information entropy to prepare entropy irrigation water quality indices (EWQIs); the resulting objective weights are therefore justified.
As displayed in Fig. 2a, the IWQI-1 varied from 19.08 to 77.22, which denotes poor to good quality water for the dry season. The temporal distribution of water quality classes (Fig. 3a) shows that 22.4% of surface water samples were classified as good, 61.7% as moderate, 14.5% as low, and 1.4% as poor-quality water. Likewise, the IWQI-2 implied poor to good water quality, since index values ranged from 18.08 to 76.22; 60.8% of the water samples were classified as good, 31.3% as moderate, 7.2% as low, and 0.7% as poor quality. The closeness of the outcomes of the two methods can be attributed to the similar weights derived from the two approaches. In the case of the IWQImin, the sixteen studied water quality variables recommended by FAO were reduced to a final shortlist of ten indicators/variables to be included in the LDS using the PCA/FA method (Table 3). The IWQImin-1 varied from 5.4 to 97.11, which means poor to excellent water quality. Water quality classes were divided into five grades, i.e., excellent, good, moderate, low, and poor, which accounted for 28.7%, 47.6%, 14.8%, 5.5%, and 3.4% of the total water samples, respectively. The IWQImin-2 varied from 40.32 (low) to 90.44 (good). The temporal distribution of water quality classes was as follows: good (81.6%), moderate (14.8%), low (3.4%), and poor (0.2%). The IWQImin-3 ranged from 2.44 to 94.10, which denotes poor to excellent water quality. Besides, the EWQI-1 varied from 1.32 (poor) to 86.37 (good), with 45.30% (good), 47.42% (moderate), 5.77% (low), and 1.50% (poor) of the total samples, respectively. Similarly, the EWQI-2 ranged from 0.08 (poor) to 60.62 (moderate), with 37.04% (moderate), 60.17% (low), and 1.50% (poor), respectively. The low water quality of the Dhaleshwari River is caused by the discharge of urban waste and industrial effluent from the largest industrial centre in this area.
This finding is confirmed by another analysis (Hasan et al. 2020), in which industrial activities are presented as the major source of waste in this river basin. It can be stated that the departure from good water quality requirements in this river basin is due to untreated municipal wastewater. Thus, suitable measures such as the construction of wastewater treatment plants and the improvement of the sewerage system are required to maintain good water quality status (Rahman et al. 2017).
Fig. 2 Values of IWQIs and EWQIs in the Dhaleshwari River basin in both seasons (similar letters denote no significant difference at p < 0.01)
The IWQI-1 varied from 69.92 to 79.09, which means moderate to good quality water in the wet season (Fig. 2b). The temporal distribution of water quality classes (Fig. 3b) shows that 96.18% of water samples were classified as good and 3.82% as moderate water quality. Similarly, IWQI-2 values varied from 78.08 to 92.59, which denotes 91.37% as good and 8.63% as excellent. For the IWQImin-1, the final shortlist of ten variables was included in the LDS using the PCA/FA results. The LDS comprised EC, TDS, Pb, Mn, Cr, Zn, Mg2+, Co, NO3−, and Cu (Table 3). The IWQImin-1 ranged from 85.21 to 93.42, with 16.55% classified as excellent and 91.37% as good water quality. The IWQImin-2 varied from 89.31 (good) to 94.85 (excellent). For IWQImin-3, all the water samples showed moderate water quality, with a highest value of 57.72. Similarly, the EWQI-1 varied from 1.32 (poor) to 92.76 (good), with 38.69% (excellent), 48.03% (good), 3.21% (moderate), and 0.05% (poor) of the total samples, respectively. In addition, the EWQI-2 ranged from 0.08 (poor) to 85.65 (good), with 99.99% classified as good water quality. Similar findings were also observed by Jahin et al. (2020) in the Kafr El-Sheikh River, Egypt. The anthropogenic inputs, surface run-off from adjacent municipal areas, upstream water flow, and stormwater drains, together with the high stream water flow in the Dhaleshwari River during the wet season, may trigger this abnormal distribution of IWQI values. Our findings can be compared with works performed by many researchers in different regions of the world (Wang et al. 2017; Tripathi and Singal 2019; Janin et al. 2020; Matta et al. 2020; Wu et al. 2021; Yotova et al. 2021).
Fig. 3 Distribution of water quality classes (%) among water samples in the dry and wet seasons
For example, Bora and Goswami (2017) found that the WQI values were 85.73, 122.47, and 80.75 during the pre-monsoon, monsoon, and post-monsoon seasons, respectively, in the Kolong River, Assam, India. Wang et al. (2017) reported that the WQI values varied over 9.40-1734.79, 61.85-1803.64, 57.82-1691.85, and 55.14-2204.90, respectively, among the water samples in the Huaihe River, Anhui, China. Islam et al. (2020b) found that EWQI values ranged from 35.65 (Meghna River) to 159.55 (Teesta River), with 26.7% of samples classified as excellent, 53.03% as good, 3.33% as moderate, 10% as poor, and 6.7% as unfit for drinking purposes in the surface water of six river basins in Bangladesh. Seasonally, water quality fluctuated more in the dry season than in the wet season.

Comparison among water quality indices for both seasons
The results revealed no significant statistical differences between IWQI-1 and IWQI-2 at a 99% confidence level. Besides, the values of R² and RMSE were 0.99 and 0.954, respectively, suggesting a high association between the values obtained from the two methods. This can be ascribed to the similar function of PCA and FA, as both group distinct variables in terms of their correlation coefficients. In addition, FA decreases the influence of the less important parameters obtained from PCA, and the original group of parameters is recovered after Varimax rotation of the axes defined by the PCA (Giri and Singh 2014). We observed variations in WQI values between the two seasons. This may be due to the high stream water flow and heavy rainfall in the monsoon season, which reduce the concentrations of water quality parameters in the wet season (Amiri et al. 2021b). However, the dry season showed many-fold higher IWQI values in all water samples than the wet season. A single number cannot tell the entire story of surface water quality, and many other contributing variables, including Na, Ca, and K, are not included in the index. Nevertheless, a water quality index based on several significant variables can provide a simple and robust indicator of water quality. Thus, it can be stated that weights derived from communality after FA are superior to those obtained from PCA according to their degree of multi-association among variables. Besides, no significant statistical differences were found among IWQImin-1, IWQImin-2, IWQImin-3, EWQI-1, and EWQI-2 at a 99% confidence level.
This implies that the estimated LDS was a good representative of the new variables and can be used effectively to trace the spatial and temporal variations in surface water quality at the study sites. PCA/FA is well documented in the existing literature as a robust technique for producing a more unbiased LDS for water bodies (Tripathi and Singal 2019). In the validation dataset, the IWQI-1 and IWQI-2 showed good performance and close associations with other indices such as EWQI-1 and EWQI-2, with R² values of 0.59, 0.57, 0.89, and 0.88 and RMSE values of 13.85, 14.49, 15.95, and 15.09, respectively (Fig. 4). In this study, for the dry season, two variables each from PC1, PC2, and PC4 (EC, TDS, Cr, Mg2+, SO42−, and NO3−) were retained in the LDS, whereas PC3 contributed three variables (Pb, Cd, and Cl−) and PC5 one variable (Zn). Subsequently, the weights of four variables (EC, TDS, SO42−, and Cr) were equal whether derived from variance and eigenvalues or from factor loadings. Therefore, the two weighting tools showed a similar trend. For IWQImin-1 vs IWQImin-3 and EWQI-1 vs EWQI-2, the values of R² were 0.64 and 0.53 and the values of RMSE were 21.76 and 25.35, respectively, at a 99% confidence level; hence, the performance of these indices was reasonably poorer than that of IWQI-1 and IWQI-2. Moreover, the 95% confidence intervals of IWQI-1 and IWQI-2 were marginally wider than those of the other indices. This also implies that IWQI-1 in particular showed lower uncertainty in predicting surface water quality (Wu et al. 2021).
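The two agreement measures reported above answer different questions: R² measures how well two indices follow a common linear trend, while RMSE measures how far apart their values are in index units. A minimal Python sketch follows, assuming R² is taken as the squared Pearson correlation between two index series; the function names and the sample values are hypothetical and are not data from this study.

```python
import math


def r_squared(x, y):
    """Squared Pearson correlation between two equally long series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


def rmse(x, y):
    """Root mean square difference between two equally long series."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


if __name__ == "__main__":
    # Hypothetical index values: index_b is index_a shifted down by one unit.
    index_a = [20.0, 45.0, 60.0, 77.0]
    index_b = [value - 1.0 for value in index_a]
    print(r_squared(index_a, index_b), rmse(index_a, index_b))
```

A constant offset between two indices leaves R² at its maximum while producing a non-zero RMSE, which is why a pair of indices can show a strong association and still differ in absolute value.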
For the wet season, there were no significant statistical differences between IWQI-1 and IWQI-2 at a 99% confidence level. In addition, the values of R² and RMSE were 0.84 and 12.38, respectively, suggesting a strong association between the values obtained from the two methods but weak performance. Besides, no significant statistical differences were observed among IWQImin-1, IWQImin-2, IWQImin-3, EWQI-1, and EWQI-2 at a 99% confidence level.
In the testing dataset, the IWQI-1 showed weak performance but strong associations with the other indices, such as IWQImin-1, IWQImin-2, and IWQImin-3, with R² values of 0.73, 0.69, and 0.64 and RMSE values of 15.93, 18.88, and 18.78, respectively (Fig. 5). However, IWQI-2 exhibited excellent performance and a good association with IWQImin-1 (R² = 0.57, RMSE = 4.25, p < 0.01). For the wet season, two variables each from PC2, PC3, and PC4 (Pb, Mn, Co, Cu, Mg2+, and Zn) were retained in the LDS, whereas PC1 contributed three variables (EC, TDS, and NO3−) and PC5 one variable (Cr). The weights of three variables (EC, TDS, and NO3−) were equal whether based on variance, eigenvalues, or factor loadings. Therefore, the two weighting tools showed a similar trend. For IWQImin-1 vs IWQImin-2, IWQImin-1 vs IWQImin-3, IWQImin-2 vs IWQImin-3, and EWQI-1 vs EWQI-2, the values of R² were 0.83, 0.93, 0.91, and 0.58 and the values of RMSE were 3.06, 34.64, 37.57, and 13.89, respectively, at a 99% confidence level; hence, the performance of these indices was reasonably poorer than that of IWQI-1 and IWQI-2, except for IWQImin-1 and IWQImin-2. Moreover, the 95% confidence intervals of IWQI-1 and IWQI-2 were slightly wider than those of the other indices. This also implies that IWQI-2 in particular showed lower uncertainty in predicting surface water quality (Wu et al. 2018). This is because PCA is based solely on linear structures in the dataset, whereas FA is based on a particular statistical model. FA assumes that the data are generated by the underlying factors of that model and that the total data variance can be decomposed into the parts accounted for by common and unique factors (OECD 2008). Jahin et al. (2020) also found a similar linear trend among the water quality parameters in their work. IWQI-1 and IWQI-2 are robust because they require a limited number of parameters, which also reduces analytical cost (Sánchez et al. 2007).
Interestingly, the IWQI method selected here was not derived from multiple regression analysis; it nevertheless played a vital role in the selection of significant variables and performed well in the Dhaleshwari River.

Spatial patterns of water quality indices
Spatial autocorrelation analysis was applied to the water quality indices to assess their spatial differences during both seasons (Table 4). In the dry season, non-significant negative Moran's I values (p>0.1) were identified for most indices in the studied basin, which might be attributed to the spatial heterogeneity in pollutant loads and can help inform water quality development policies. In contrast, a non-significant weak positive spatial autocorrelation (p>0.1) was detected for IWQImin-2, indicating reasonable spatial integration in the study region.
In the wet season, a significant spatial autocorrelation (p<0.1) was identified for IWQImin-3. It was negative (Moran's I=−0.1696 and Z value=−1.793), suggesting a dispersed pattern in which geographically neighboring locations tended to have dissimilar values of IWQImin-3. Similarly, the Moran's I values exhibited a weak, non-significant negative autocorrelation (p>0.1) for all indices except IWQI-2 and EWQI-2, which indicates a localized effect on the water quality indices (Islam et al. 2020a). The differences in spatial autocorrelation among the water quality indices might be ascribed to surface water discharge dynamics and the flushing effect of monsoonal rain (Islam et al. 2017a; Liu and Mao 2020).
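To make the sign of Moran's I easier to interpret, the following minimal Python sketch computes the global statistic for values observed at sites along a river. The binary chain adjacency (each site neighbors only the next site upstream and downstream) is an assumption made for illustration; the weighting scheme behind Table 4 is not stated in this section, and the function names and sample values are hypothetical.

```python
def chain_weights(n):
    """Binary spatial weights for n sites in a line: adjacent sites are neighbors."""
    w = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        w[i][i + 1] = 1.0
        w[i + 1][i] = 1.0
    return w


def morans_i(values, weights):
    """Global Moran's I for a list of values and a square spatial weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    weight_sum = sum(sum(row) for row in weights)
    # Cross-products of deviations between every pair of neighboring sites.
    cross = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    variance_term = sum(d * d for d in dev)
    return (n / weight_sum) * (cross / variance_term)


if __name__ == "__main__":
    w = chain_weights(5)
    # Alternating high/low values: neighbors are dissimilar (dispersed pattern).
    print(morans_i([1.0, 0.0, 1.0, 0.0, 1.0], w))
    # Smooth gradient: neighbors are similar (clustered pattern).
    print(morans_i([1.0, 2.0, 3.0, 4.0, 5.0], w))
```

Under the null hypothesis of no spatial autocorrelation, the expected value of Moran's I is −1/(n−1), so values below it point toward dispersion and values above it toward clustering; significance is then judged from the Z score.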

Factor contribution analysis
As stated in the earlier section, ten water quality variables were retained in the LDS based on the PCA/FA, and these were incorporated into the irrigation water quality assessment using the random forest (RF) model. The RF model computed the contribution or importance of each variable. The relative contribution/importance of all the physicochemical variables in irrigation water quality can be appraised from Table 5. Although the ten physicochemical variables do not influence water quality equally, their influence is reflected by the mean decrease in accuracy or Gini index values. For the dry season, among the physicochemical variables, the underlying predictors LDS and Cd have little influence on the optimal IWQI-1 model.
Fig. 4 Associations among water quality indices in the dry season. The black solid line represents the linear fit with its coefficient of determination (R²), and the red dashed lines represent the 95% confidence interval
On the contrary, predictor variables such as NO3−, Mg2+, Cl−, Pb, and Cr have a more significant influence on irrigation water quality in the dry season. NO3− is a crucial parameter that explained most of the variance in the water quality of the Chillan River, Chile (Debels et al. 2005), and the Dongjing River, China (Sun et al. 2016). For the wet season, among the ten variables, Mn, Pb, Mg2+, Cr, and Cu have a more significant influence on predicting the best model. However, underlying variables such as Zn and NO3− have little impact on the prediction of irrigation water quality in the wet season.

Discussion
Water quality deterioration is one of the most significant environmental challenges and has detrimental effects on human health and the aquatic ecosystem. Although rapid development and industrialization have produced large quantities of industrial effluents, including liquid waste, the amounts are still unclear because of a lack of proper monitoring programs by the respective water regulatory bodies. The surface water of the Dhaleshwari River in central Bangladesh has received those wastes, especially liquid waste. The worsening quality of water used for irrigation is a severe issue in the river basin. The deterioration of water resources could substantially impact ecological quality, health risk, and even climate warming (Ayers and Westcot 1994). Besides, the agrochemicals and fertilizers used in adjacent agricultural crop fields are another crucial source of contamination in the area. However, recent advances in Integrated Pest Management (IPM) policies have emphasized reducing the use of pesticides.
The present work can support future surface water quality appraisal for irrigation purposes and inform the local water development board in the study area. First, the outcome explained the water quality status of the river using the IWQI and EWQI models. Practitioners and local government departments can obtain information about water quality conditions and their spatiotemporal distribution at a local scale rather than a regional scale. Second, we retained ten critical parameters in the LDS using PCA/FA, and the IWQI method varied significantly. Third, the surface water quality state was comparatively worse in the dry season and good in the wet season based on the IWQI classification. Finally, we found the critical variables influencing water quality, e.g., NO3−, Mg2+, Cl−, Pb, and Cr in the dry season and Mn, Pb, Mg2+, Cr, and Cu in the wet season, based on the RF model, which supported the most appropriate IWQI method in the study region. Thus, the tested river basin waters are partly suitable for irrigation purposes based on the IWQIs and EWQIs. Greater attention is needed in the parts of the river basin with poor surface water quality, because the river water is still used for irrigation and domestic purposes. Earlier studies suggest adopting a suitable management plan for a cleansing program and implementing strict water laws in regions affected by poor-quality water (Ferreira et al. 2020).
The IWQI methods were more accurate than the EWQI model based on ten physicochemical variables. Thus, the local water development authority should be conscious of these crucial variables during any monitoring and appraisal program. Furthermore, the tentative IWQI-1 and IWQI-2 performed excellently in this vital basin of the Dhaleshwari River, contributing to a better understanding of surface water quality in similar river basins. This study helps policymakers in science-based policy making, because the application of the statistical approach reduces the subjectivity of index generation and makes it more objective in character.
Irrigation water quality indices, e.g., sodium adsorption ratio, soluble sodium percentage, residual sodium carbonate, and total hardness (Raghunath 1987; Islam et al. 2017a; Rahman et al. 2017; Chen et al. 2019), are mostly related to water salinity and hardness, which are linked only to soil fertility and plant yields. Natural processes such as sea-level rise triggered by global warming, source rock-water interactions, and the local geology of water basins and catchment areas primarily govern these irrigation indices, in which the meq/L contents of Na+, K+, and Ca2+ are rigorously used for water quality evaluation. However, anthropogenic processes can release a wide range of chemical entities, e.g., Pb, Cd, Cr, Mn, Co, Zn, As, NO3−, and SO42−, to the aquatic system, which can potentially contaminate the edible parts of plants and enter the human food chain, along with deteriorating irrigation soil quality. Instead of routine monitoring with widely used IWQIs (using the meq/L contents of Na+, K+, and Ca2+), objective-based anthropogenic impacts on irrigation water quality relative to the FAO-29 standard values were used in this study. Moreover, Na+, K+, and Ca2+ do not significantly affect the development of irrigation water quality and entropy-weighted indices based on heavy metals and metalloids (Pb, Cd, Cr, Mn, Co, Cu, Zn, and As). Therefore, Na+, K+, and Ca2+ were not included in the development of the irrigation water quality and entropy-weighted indices.

Conclusion
We developed well-established IWQIs and EWQIs for assessing surface water suitability for agricultural purposes, using representative variables suggested by the FAO-29 standard and two well-accepted methods, namely, NSFWQI and entropy theory. The outcomes of the Shannon entropy theory implied that Mg2+, Cr, TDS, and Cl− for the dry season and Cd, Cr, Cl−, and SO42− for the wet season were the primary pollutants triggering water quality degradation. PCA and FA identified the weights for the preliminary 16 variables used in computing IWQI-1 and IWQI-2, respectively, with the final method being suggested. The IWQIs exhibited a similar trend to the EWQI model, implying that water quality classes varied from poor to good. PCA/FA reduced the dimensionality of the multiple parameters to develop a well-representative LDS of EC, TDS, Cr, Zn, Mg2+, SO42−, Pb, Cd, Cl−, and NO3− for the dry season and Pb, Mn, Cr, Co, Cu, Mg2+, Zn, EC, TDS, and NO3− for the wet season, used in introducing IWQImin-1, IWQImin-2, and IWQImin-3 for both seasons. The performance of the IWQIs was comparatively higher than that of the EWQIs because of their inclusiveness. The IWQImin with weights derived from PCA performed well in measuring and predicting water quality compared with weights derived from FA. The variables chosen for the computation of IWQImin can be readily measured, which will considerably reduce monitoring time and the cost of data collection and analysis compared with a large number of variables. IWQImin-3 showed a statistically significant negative spatial autocorrelation in the wet season (Moran's I value <0). Our research has also identified, using the RF model, that physicochemical variables including NO3−, Mg2+, Cl−, Pb, and Cr may influence irrigation water quality in the dry season and Mn, Pb, Mg2+, and Cr in the wet season in the Dhaleshwari River basin, hence justifying further large basin-scale analysis.