Application of Remodeled Water Quality Indices and Multivariate Statistical Methods for the Appraisal of Water Quality in a Himalayan Lake

This study analyzes the spatio-temporal changes and trends of 15 water quality parameters determined from samples collected at 11 sampling locations during water quality monitoring across the Dal Lake from September 2017 to August 2020. Further, a revised WQI (named WQImin) was developed by multiple regression modeling from six key parameters (NH4-N, DO, COD, WT, TUB and NO3-N) to simplify water quality assessment and reduce its analytical cost. The results show that the overall mean WQI value was 81.9 and that seasonal average WQI values ranged from 79.44 to 84.55. Water quality varied seasonally, with the lowest values in summer, followed by autumn and winter, and the highest in spring. Moreover, the proposed WQImin model based on the six selected parameters performed well in evaluating Dal Lake's water quality, with R² = 0.99, RMSE = 5.48 and PE = 6.34%. This indicates that the WQImin model can be a convenient and efficacious tool for monitoring and assessing Dal Lake's water quality. The results also showed that most nutrients were above their threshold values during the monitoring period, which is the leading cause of eutrophication at some places in the lake.


Introduction
Water is the most plentiful commodity in nature, yet it is also heavily exploited. The easy availability and accessibility of surface water make it prone to pollution, because it is a convenient sink for wastewater and pollutants. Surface water is also degraded by natural processes and anthropogenic forces, which impair its use for industrial, drinking, recreational, agricultural and other purposes (Todd et al., 2012). A substantial need to evaluate surface water quality therefore arises from growing public health awareness of the significance of water quality for potable use and of source water quality for marine and aquatic life (Ouyang, 2005). Moreover, for systematic management and treatment of water resources, identifying and quantifying spatio-temporal trends in water quality is crucial. The term "water quality" was formulated to indicate the suitability of water for human consumption, and it is used in normative documents and scientific publications that set out the prerequisites for "strategic" and "excellent" management of water resources (Parparov et al., 2006). In an aquatic ecosystem, the measured water quality characteristics play a pivotal role in determining overall water quality. The difficulty of examining a large number of parameters, together with high variability due to natural and anthropogenic factors, is therefore a central problem for water quality monitoring (Simeonov et al., 2002; Boyacioglu, 2006).
Water quality data can be examined by a number of methods, depending on the nature of the samples, the study area and the informational objective (Stigter et al., 2006). Large-scale research in this field has produced a number of procedures for modeling, interpreting and categorizing monitoring data (Hamid et al., 2016). Traditional monitoring approaches compare the observed parameters with local prescriptive levels, which provides only limited detail on overall water quality (Pesce and Wunderlin, 2000). The Water Quality Index (WQI) is an effective way to visualize and examine water quality issues. It was proposed by Horton (1965) for evaluating aquatic ecosystems using the ten most frequently applied water quality parameters, such as pH, alkalinity, DO, chloride and conductance, and has since been extensively used and accepted in Asian, European and African nations. In recent years, information on freshwater quality worldwide has been readily obtained through the WQI, which is widely used to assess the water quality of freshwater systems (Kannel et al., 2007; Banerjee and Srivastava, 2009; Avigliano and Schenone, 2016; Varol, 2020). The WQI is considered a practical approach that focuses on the important physicochemical parameters that together reflect the pollution status of a water body (Simoes et al., 2008). Many modifications of the WQI concept have been proposed by various experts and scientists (Dwivedi, 1997; Bhargava, 1998). Several national and international organizations have formulated a large number of indices, such as Brown's Water Quality Index, the CCME WQI, the Two-Tier WQI and the Oregon Water Quality Index.
These WQIs have been applied to assess water quality in specific areas (Dojlido et al., 1994; Lumb et al., 2002; Effendi et al., 2015; Sener et al., 2017; Zotou et al., 2020). The Bascaron WQI, developed by Bascaron in 1979, has also been broadly used in many developing countries to monitor the water quality of rivers and lakes because of the flexibility it offers in weight adjustment and parameter selection (Pesce and Wunderlin, 2000; Kannel et al., 2007; Wu et al., 2018). Besides, the indices depend on the nature and number of the water quality variables, which are compared against the levels and standards of a particular region. Thus, it is important to develop a modified WQI that takes into account the analytical cost and the time needed to examine a large number of parameters, so that unnecessary parameters can be dropped and the cost of analysis reduced. Furthermore, high correlations between WQI values based on a reduced set of variables and the full WQI results have been reported, suggesting that a WQImin model can describe and assess changes in water quality more affordably and effectively (Sanchez et al., 2007; Wu et al., 2018; Nong et al., 2020).
Dal Lake is situated in Srinagar, one of the most developed regions in the Union Territory of Jammu and Kashmir. It is important both for its economic role and as a source of potable water for the adjacent localities, and its ecological condition and the safety of its water have therefore frequently drawn controversy and discussion. In 1997, the government of Jammu and Kashmir formed the Lakes and Waterways Development Authority, an autonomous body, to manage and preserve the ecosystem of Dal Lake and of other water bodies in the region. Given its national importance, various studies on Dal Lake have been carried out, mostly focused on physical, chemical and biological variables, while a few have studied land use influences (Najar et al., 2011; Basharat et al., 2013; Javaid et al., 2017). Some studies have also described the spatio-temporal changes in the water quality of Dal Lake (Rather and Dar, 2020; Ahmad et al., 2020). Nonetheless, more studies are still needed, for the following reasons. Related research mainly examined the physicochemical parameters of Dal Lake and compared them with local standards. Some studies focused partly on a drinking water quality index, whereas comprehensive evaluation of the general surface water quality has seldom been reported (Yawar et al., 2016). Some results were limited by small data sets in evaluating the lake's overall water quality (Mudasir et al., 2017). Additionally, no investigation has identified the critical water quality variables that most affect Dal Lake's water quality.
To address these gaps, we selected 15 water quality parameters, measured monthly at the 11 water quality observation stations of Dal Lake from September 2017 to August 2020. The objectives of the research were (1) to assess and illustrate the spatio-temporal water quality variations in each basin of the lake, (2) to carry out an extensive assessment of the water quality by applying the WQI approach, and (3) to identify the crucial water quality variables and hence develop a WQImin model for simple and economical water quality estimation.

Description of Study Area
Dal Lake is located at an altitude of 1583 m above mean sea level in Srinagar city, at 34°5′ N and 74°50′ E, and is the second largest lake in the Union Territory of Jammu and Kashmir (Rather and Dar, 2020). The lake comprises four main basins: Hazratbal, Nishat, Gagribal and Nigeen. Among these, the Nigeen basin is treated as a separate lake because it is also connected to Gilsar Lake through the Nallah Amir Khan channel (Unni, 2002), and it is therefore not included in this study. The total catchment area of Dal Lake is 337.17 km², about twenty times the area of the lake. A large perennial inflow channel, Telbal nallah, feeds the lake. This stream originates from Marsar Lake high up in the mountains, drains the largest sub-catchment (about 145 km²) and contributes about 60-70% of the total inflow to the lake. Other smaller streams around the shoreline also feed the lake, viz. Boutkal, Merakhsha nallah, Peshpaw nallah, etc., in addition to some contribution from groundwater. The Telbal nallah and other small streams enter the lake at the Hazratbal basin, and the lake finally drains from the Gagribal basin into the river Jhelum. Dal is a shallow, multi-basin lake with an area of about 25.76 km², of which the open water area is not more than 16.78 km². According to recent estimates, about 327 million cum (cubic meters) of water flow into the lake each year, of which 270.34 million cum leave through two outflow channels and 25.92 million cum are drawn for drinking purposes; the rest is lost through seepage, evapotranspiration and suction dredging (Vision Document, 2018).
To track variations in the water quality of Dal Lake in real time, the Lakes and Waterways Development Authority (LAWDA), Government of Jammu & Kashmir, has established 11 water quality monitoring stations across the lake, at which water samples are collected and examined monthly. Details of these monitoring stations and their locations are given in Table S1 (supplementary data).

Specimen measurement and collection of data
The water quality data used in this project were procured from the head of the research laboratory of the Lakes and Waterways Development Authority, together with precipitation and subsequent data. Samples were taken at the center and along the boundaries of each basin of the lake, 15-20 cm below the water surface at each location. EC, pH and WT were determined on-site with multiparameter instruments. Separate DO bottles were used to fix the DO on site (modified Winkler's method) and were kept in travel ice boxes filled with ice packs (0-4°C). For the remaining parameters, samples (> 750 mL per sample) were collected in pre-washed plastic bottles and moved within 6 hours to the nearby research laboratory for further examination. All bottles were labeled with the site description using permanent markers to prevent mix-ups. More details of the chemical methods and instruments used during analysis are recorded in Table S2 (supplementary data).

Water Quality Index
Eq. (1), refined and proposed by Pesce and Wunderlin (2000), is used to calculate the Water Quality Index:
$$\mathrm{WQI} = \frac{\sum_{i=1}^{n} C_i P_i}{\sum_{i=1}^{n} P_i} \qquad (1)$$
where n is the total number of parameters included in the study, C_i is the standardized value of the i-th parameter, and P_i is the weight of the i-th parameter. The value of P_i varies from 1 to 4 depending on the effect of the parameter on water quality (Table 3). The WQI value was calculated every month at each monitoring station and then averaged to obtain a final value. To promote an easy and low-cost water quality assessment approach for Dal Lake, the WQImin model is built from critical parameters selected by stepwise multiple linear regression analysis and then calculated using Eq. (1).
The Mann-Kendall (M-K) trend test is described in detail by Mann (1945). Figure 2 illustrates the different trends of the water quality parameters obtained from the Mann-Kendall test in this study. Prior to the statistical analysis, a one-sample Kolmogorov-Smirnov test was carried out to test the normality of the water quality data, and Bartlett's test was used to check the homogeneity of variance. One-way ANOVA was carried out to determine whether there was significant spatial variation in the water quality parameters (Table 2) (Varol, 2020). The statistical software R was used to carry out both the M-K test and ANOVA, via specific library functions.
In this study, the WQImin models for Dal Lake were established in two steps: (1) to obtain the critical water quality variables for the WQImin model, the WQI value and C_i for every month at each monitoring station from 2017-2018 to 2018-2019 were used as "training data"; (2) WQImin was then tested and evaluated for each station in 2019-2020 using the coefficient of determination (R²), the Root Mean Square Error (RMSE) and the Percentage Error (PE) (Nong, 2020).
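As an illustration of Eq. (1), the sketch below computes a WQI as the weighted mean of standardized values C_i with weights P_i. It assumes that Eq. (1) takes this weighted-mean form and that C_i is on a 0-100 scale; the function name `wqi` and the example values are hypothetical and are not taken from Table 3.

```python
from typing import Mapping


def wqi(normalized: Mapping[str, float], weights: Mapping[str, float]) -> float:
    """Weighted mean of standardized values C_i using weights P_i (assumed form of Eq. 1)."""
    if not normalized:
        raise ValueError("at least one parameter is required")
    # Sum of C_i * P_i over all parameters, divided by the sum of P_i.
    weighted_sum = sum(normalized[name] * weights[name] for name in normalized)
    total_weight = sum(weights[name] for name in normalized)
    return weighted_sum / total_weight


# Hypothetical standardized values (C_i) and weights (P_i), for illustration only.
example_c = {"DO": 80.0, "COD": 60.0, "WT": 90.0}
example_p = {"DO": 4, "COD": 3, "WT": 1}

print(round(wqi(example_c, example_p), 2))  # (320 + 180 + 90) / 8 = 73.75
```

The same function can be applied to a reduced parameter set, which is how a WQImin value would be obtained once the critical parameters are chosen.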
To meet the homogeneity of variance and normality conditions, the data were log-transformed (i.e., log(x + 1)) before stepwise multiple linear regression analysis. OriginPro 2019 was used to prepare the graphics.
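The Mann-Kendall trend test mentioned in this section can be sketched in a few lines. The study itself used R library functions; the Python version below is only an illustrative re-implementation, assuming the standard S statistic, the tie-corrected variance and a two-sided normal approximation. The function name `mann_kendall` is hypothetical.

```python
import math
from collections import Counter
from typing import Sequence, Tuple


def mann_kendall(values: Sequence[float]) -> Tuple[int, float, float]:
    """Return (S, Z, two-sided p-value) for a time-ordered series."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least three observations")
    # S counts later-minus-earlier sign agreements over all pairs.
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = values[j] - values[i]
            s += (diff > 0) - (diff < 0)
    # Variance of S with the usual correction for tied groups.
    tie_sizes = Counter(values).values()
    var_s = (n * (n - 1) * (2 * n + 5)
             - sum(t * (t - 1) * (2 * t + 5) for t in tie_sizes)) / 18.0
    # Continuity-corrected Z score; zero when S is zero.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return s, z, p_value


# Hypothetical monthly series with a steady upward trend.
s_stat, z_score, p_val = mann_kendall([6.1, 6.3, 6.4, 6.8, 7.0, 7.1, 7.5, 7.6])
print(s_stat, z_score > 0, p_val < 0.05)
```

A positive Z with a small p-value corresponds to a significant increasing trend at a station, and a negative Z to a decreasing one.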

Water quality attributes
Biological, physical and chemical parameters (WT, pH, DO, EC, COD and Turbidity)
The annual average concentrations of the water quality parameters are shown in Table 1. Basin-wise mean concentrations, the concentrations at each monitoring location and the trends of 14 water quality variables in Dal Lake are shown in Table 2, Figure 1 and Figure 2, respectively. One-way ANOVA tests showed that all parameters except water temperature and COD varied significantly among the three basins of the lake (Table 2).
Table note: Measured NO3-N values were converted from mg N/L to mg NOx/L prior to using the cited normalization factors.
Mean pH and DO concentration in the Hazratbal basin were 7.9 and 6.52 mg/L, respectively, lower than in the Nishat and Gagribal basins (ANOVA, p < 0.001) (Table 2). According to the M-K test results, no station showed a trend for DO except D2 and D16, where a significant increasing trend was observed, while for pH an increasing trend was observed at D16 and a decreasing trend at D14 (Figure 2).
The maximum and minimum annual average water temperatures (WT) were 18.5°C and 17.2°C, respectively. WT did not vary significantly across the lake (p > 0.05). The annual mean EC value ranged from 225 to 265 µS/cm. Mean EC did not differ significantly between the Nishat and Gagribal basins but was slightly higher in the Hazratbal basin (269 µS/cm) (p < 0.001). A significant downward trend was observed in both the Hazratbal and Nishat basins, whereas no trend was observed in the Gagribal basin (Figure 2).
The annual average concentrations of COD ranged from 18.3 mg/L to 23.0 mg/L. The maximum COD concentration of 42 mg/L was observed in 2018-2019 at D2 in the Hazratbal basin, and the minimum of 10 mg/L was detected in 2019-2020 at D14 in the Gagribal basin (Figure 1). Three stations (D1, D4 and D5) showed a downward trend for COD, and no trend was observed at the other stations. The annual average turbidity was below the acceptable limit of 5 NTU (WHO, 2011) (Figure 1). Mean turbidity in the Hazratbal basin was notably higher than in the other two basins (p < 0.001) (Table 2).

Nutrients (TP, OP, Iron, NO3-N, NH3-N)
The annual mean TP concentration during the sampling period was higher than the standard value (0.

Development And Assessment Of WQI Model
Regression modeling is perhaps the most widely used predictive approach because it is mathematically concise, provides a useful baseline and gives results that are easy to interpret. In multiple regression we are interested in finding which of the independent variables have the larger impact on the dependent variable. To quantify goodness of fit we use R², the coefficient of determination, which can be interpreted as the percentage of variation in Y (the dependent variable) that is explained by changes in X (the independent variables). The larger the R², the better the model fits. To evaluate predictive accuracy, we are primarily interested in how the model performs on data it has not seen. To do so, we randomly divide the data into two parts: (1) a training set, used for fitting the models, and (2) a validation set, used for choosing among different models. This is called data partitioning. To evaluate accuracy on the validation set we use the Root Mean Square Error (RMSE) and the Percentage Error (PE).
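The three accuracy measures named above can be written out as a short sketch. The text does not give their formulas, so the definitions here are assumptions: R² is taken as 1 - SS_res/SS_tot, RMSE as the root of the mean squared difference, and PE as the mean absolute difference relative to the observed WQI, in percent. The function names and example values are hypothetical.

```python
import math
from typing import Sequence


def r_squared(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination, assumed here as 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error between observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def percentage_error(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute relative error in percent (assumed definition of PE)."""
    relative = [abs(p - o) / o for o, p in zip(observed, predicted)]
    return 100.0 * sum(relative) / len(relative)


# Hypothetical full-WQI values and reduced-model estimates for a validation set.
wqi_full = [80.0, 82.0, 78.0, 85.0]
wqi_min = [79.0, 83.0, 77.0, 86.0]

print(rmse(wqi_full, wqi_min))  # every residual is 1, so RMSE = 1.0
print(round(r_squared(wqi_full, wqi_min), 3))
print(round(percentage_error(wqi_full, wqi_min), 2))
```

Lower RMSE and PE on the validation set, together with a higher R², indicate a reduced model that tracks the full WQI more closely.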
For the training dataset, the multiple linear regression results (Table 4) show that NH4-N makes the largest contribution to WQI, with R² = 0.552 (p < 0.001). The parameters DO, COD and WT were then introduced into the model in sequence and markedly improved it, with R² values of 0.777, 0.868 and 0.849, respectively. Two more parameters, TUB and NO3-N, were chosen as the fifth and sixth variables and further improved model performance. The performance of the WQImin models, in terms of R², RMSE and PE, is shown in Table 5. The regression results showed that the R² of the models increased as parameters were added.
Differences were observed between the WQI and WQImin values for the training dataset. The WQImin4 model showed the closest correlation with the WQI and hence the best performance (Figure 4). It also showed the smallest RMSE and PE values among the models established on the training dataset, suggesting that it was the most suitable model for calculating the WQI of Dal Lake. The analysis of the testing data likewise shows that WQImin4 displayed the best overall performance among the models, with the lowest PE and RMSE values of 6.34% and 5.48, respectively (Figure 5) (R² = 0.99, p < 0.001).

Discussion
Physiochemical and Biochemical assessment
pH plays an important role in aquatic ecosystems because it affects aquatic life, and very low or very high pH is associated with corrosion. The pH range in our study indicates that the water of Dal Lake remained alkaline during the study period, as was also observed by Raj et al. (2014). Moreover, pH was usually higher in summer, which may be attributed to runoff from the catchment, as summer is associated with frequent rainfall in the region. The lowest and highest WT were observed in winter and summer, respectively, indicating that water temperature followed air temperature across Dal Lake. Electrical conductivity is directly related to the concentration of dissolved solids. EC values were within the prescribed limits at all sampling stations. Higher EC was observed during autumn, which may be attributed to lower water levels and a high rate of decomposition in the lake (Ahmad et al., 2020). DO is the most important parameter for assessing water quality, as both the trophic status and the biological activity of an aquatic ecosystem depend on it (Granier et al., 2000). To ensure healthy aquatic life, DO should range from 4 to 6 mg/L (Alam et al., 2007). DO concentrations were within the limits prescribed by BIS/WHO. DO values were lowest in summer and increased towards winter, reflecting the fact that the solubility of oxygen increases as temperature decreases (Solanki et al., 2009). Chemical oxygen demand is an important parameter for assessing the organic pollution of a water body (Sun et al., 2016). COD values were slightly lower in the Gagribal basin and higher in the Nishat and Hazratbal basins, which is attributed to agricultural runoff and the presence of sewage treatment plants (STPs) in these basins (Najar et al., 2011).
Higher turbidity values were observed at stations D1 and D2 of the Hazratbal basin during summer, owing to algal growth and excessive dredging at these stations.
The main sources of ions in lake water are atmospheric deposition, carbonic acid and silicates, along with weathering of carbonates and contributions from anthropogenic activities (Sidle et al., 2000; Kalpana et al., 2013; Wayland et al., 2003). SO4 results from leachable sulphate introduced through fertilizer use from non-point sources, and from anthropogenic sources via sulfuric salts in domestic wastewater (Varol and Davraz, 2015). Chloride is used as an indicator to assess organic pollution by domestic sewage (Chandrasekhar et al., 2003). In this study, the concentrations of both Cl and SO4 were below the BIS and WHO prescribed standards; however, the values were higher than those reported by Najar et al. (2011) and Mushtaq et al. (2018), because of the increased number of STPs functioning around the lake. Calcium originates from the dissolution of carbonate minerals, whereas ferromagnesian minerals and dolomite are considered the primary sources of magnesium in fresh waters (Singh et al., 2012). In this study, both Ca and Mg values were within the WHO/BIS standards, and Ca was the dominant cation, in accordance with previous reports. In the fresh water bodies of Kashmir, the concentrations of Ca and Mg are correlated with the plankton population, most commonly Cyanophyceans (Bhat and Pandit, 2003).
Assessment of water quality using WQI
The overall surface water quality of Dal Lake was evaluated in this study, since no comprehensive study of it has been carried out other than that of Yawar et al. (2016). The spatial, seasonal and annual profiles of WQI values are shown in Figure 3. Based on the calculations, the water quality of Dal Lake was at a good level throughout the monitoring period, indicating that it has improved compared with the study by Yawar et al. (2016); this is attributed to strict management and pollution control by the Lakes and Waterways Development Authority. In the spatial profiles of the WQI (Figure 3a), station D2 in the Hazratbal basin showed the lowest WQI value, because of increased anthropogenic activities, mainly agricultural, around this station (Sener et al., 2017), whereas all other stations showed a similar WQI trend. In the seasonal profiles (Figure 3b), WQI values were slightly lower in summer, because Dal Lake is subjected to high anthropogenic stress when most tourists visit during that season. The annual profiles (Figure 3c) clearly show that the water quality of the lake has improved since 2017-2018. This is because the region has been under lockdown since 2019 owing to political disturbance in the area, followed by a national lockdown in 2020 due to the COVID pandemic, which decreased the anthropogenic stress on the lake (Sun et al., 2016).

Development of WQImin model for Dal Lake
Assessing water quality with the WQI method requires measuring and analyzing a large number of parameters. Given the limited budgets available for environmental protection in developing countries, the analytical cost and time needed to analyze many parameters play a pivotal role in the acceptance of the WQI (Ongley, 1998; Sun et al., 2016). It is therefore imperative to identify the parameters that explain the majority of the variation in the water quality data and can thus provide quick and reliable WQI results. Stepwise multiple regression analyses were used to obtain the key parameters and hence develop a WQImin model for evaluating the water quality of Dal Lake. Based on the results, six crucial parameters (NH4-N, DO, COD, WT, TUB and NO3-N) explained the majority of the variance in the water quality data and were selected for the final model, which displayed excellent performance in the water quality assessment of Dal Lake. The parameters used to develop a WQImin model should explain the overall characteristics and variations of water quality and also reduce the cost of analysis (Pesce and Wunderlin, 2000). The first parameter selected, which contributed most to the WQI value in the training dataset, was NH4-N. DO had the second highest explanatory power and was selected as the second parameter, followed by COD, WT, TUB and NO3-N. Additionally, the six selected parameters can be measured easily and are thus favorable for assessing Dal Lake's water quality.
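The idea of picking parameters one at a time by explanatory power can be illustrated with a simplified forward selection. This is not the stepwise procedure used in the study, which typically relies on significance-based entry and removal criteria; the sketch below greedily adds whichever predictor raises R² the most, using ordinary least squares solved by Gaussian elimination. All names (`forward_select`, `x1`, `x2`, `x3`) and data are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_r2(columns: List[Sequence[float]], y: Sequence[float]) -> float:
    """R^2 of a least-squares fit of y on the columns plus an intercept."""
    design = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(design[0])
    xtx = [[sum(row[a] * row[b] for row in design) for b in range(k)] for a in range(k)]
    xty = [sum(row[a] * y[i] for i, row in enumerate(design)) for a in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in design]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def forward_select(predictors: Dict[str, Sequence[float]], y: Sequence[float],
                   max_terms: int) -> List[Tuple[str, float]]:
    """Greedily add the predictor giving the highest R^2 at each step."""
    chosen: List[str] = []
    history: List[Tuple[str, float]] = []
    while len(chosen) < min(max_terms, len(predictors)):
        scores = {name: fit_r2([predictors[c] for c in chosen + [name]], y)
                  for name in predictors if name not in chosen}
        best = max(scores, key=scores.get)
        chosen.append(best)
        history.append((best, scores[best]))
    return history


# Hypothetical data: y depends strongly on x1, weakly on x2, and not on x3.
example_predictors = {
    "x1": [1, 2, 3, 4, 5, 6, 7, 8],
    "x2": [1, -1, 1, -1, 1, -1, 1, -1],
    "x3": [3, 1, 4, 1, 5, 9, 2, 6],
}
example_y = [10 * a + 2 * b for a, b in zip(example_predictors["x1"], example_predictors["x2"])]

for name, r2 in forward_select(example_predictors, example_y, max_terms=2):
    print(name, round(r2, 4))
```

In this toy case the selection order follows the size of each predictor's contribution, which mirrors how the first selected parameter in a stepwise model is the one with the greatest explanatory power.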
The parameters selected in our study have also played a major role in WQImin models developed for other study areas. Wu et al. (2018) and Kocer and Sevgili (2014) reported that ammoniacal nitrogen was an important parameter in assessing the water quality of the Lake Taihu Basin (China) and Esen Stream (Turkey), respectively. Also, in Dongjiang River (China) NH4

Novelty, Challenges And Way Forward
Based on this comprehensive analysis, our study can respond to some of the criticisms of Dal Lake's water quality, providing good evidence that it is currently maintained at a good level. Previous studies reached different conclusions, but their investigations mostly concerned WQIs describing the drinking water quality of Dal Lake (Najar et al., 2011; Javaid et al., 2017). Given the limited budget available to LAWDA, the WQImin model developed in this study from a few critical parameters can play an important role in assessing the water quality of Dal Lake more quickly and economically. There is scope to test the developed WQI model further at different monitoring sites across Dal Lake, to check its reliability for future prediction and thereby support effective management of Dal Lake's water quality. In addition, increased nutrient concentrations, especially TP and NH4-N, have resulted in eutrophication at some places in the lake; managing and controlling this should be the utmost priority for the research division of LAWDA in future. Finally, considering the national importance of Dal Lake, heavy metal analysis and studies of algal proliferation and micropollutants may be needed in future.

Conclusion
In this research, the WQI approach was used to determine Dal Lake's water quality. A total of 14 water quality parameters were selected in this study. The results showed that the majority of the parameters (all except WT and COD) varied significantly between the three basins of the lake. Dal Lake's water quality was "Good" overall throughout the monitoring period and improved from 2017 to 2020. In the spatial profiles of the WQI, the Gagribal basin showed the highest water quality and the Hazratbal basin the lowest among the three basins. Additionally, water quality displayed distinct seasonal variation, with the lowest WQI values in summer and the highest in spring. The seasonal variation in water quality was attributed to land use and anthropogenic activities. The results indicate that the water quality of Dal Lake has improved because of the constant efforts of LAWDA. Moreover, the WQImin model developed in this study, consisting of six variables (ammoniacal nitrogen, dissolved oxygen, chemical oxygen demand, water temperature, turbidity and nitrate nitrogen), displayed excellent performance in explaining Dal Lake's water quality. The selected parameters are easy to measure and can thus provide quick, cost-effective and reliable results on the water quality of Dal Lake. The model can serve as a useful baseline for future monitoring, and it is proposed that effective management of eutrophication should be a concern of future research.

Figure 2
The results of the Mann-Kendall test for 15 water quality parameters at each water quality monitoring station in the Dal Lake from 2017-2018 to 2019-2020.

Figure 4
Contrast of WQI and WQImin estimates from training dataset (the parameters selected for the WQImin models are shown in Table 5).

Figure 5
Contrast of WQI and WQImin estimates from testing dataset (the parameters selected for WQImin models are displayed in Table 5).

Supplementary Files
This is a list of supplementary files associated with this preprint: 20210622Finalsuplimentarymaterial.docx