Lake level fluctuations are significant for the planning, design, construction, and operation of lakeshore structures, as well as for the management of freshwater lakes used for water supply. To manage future lake level changes, models of high or abnormal level variations must be developed. Level measurements, or equally likely future realizations of them generated by a simulation model, are a straightforward way of obtaining decision variables for lake management. Although comprehensive models that incorporate hydrological and hydrometeorological variables such as precipitation, runoff, temperature, and evaporation are available, it is more economical to use a model that simulates level variations from past level records.
Lakes are used for a variety of domestic, industrial, and agricultural purposes [2, 3]. Forecasting lake water levels is important for water resource planning and management, lake navigation, tidal irrigation, and agricultural drainage canal management, among other things. The water level of a lake is a complicated phenomenon that is primarily influenced by the natural water exchange between the lake and its watershed, and it consequently reflects hydrological changes in the watershed [4, 5]. Many practical applications require a model that predicts water-level changes from previously measured levels.
Over the last few decades, many researchers have studied lake water level models, because global climate change has affected the hydrological cycle, causing many lakes to dry up or flood unexpectedly. Several techniques have been devised to model lake level fluctuations. Sen et al. (2000) used periodic and stochastic processes. Altunkaynak et al. (2003) used the diagram model and a Markov process. Altunkaynak (2007) used an artificial neural network. Altunkaynak and Sen (2007) used fuzzy logic. Kisi (2009) used a wavelet conjunction model. Karimi et al. (2012) used gene expression programming and an adaptive neuro-fuzzy inference system (ANFIS). Sanikhani et al. (2015) used ANFIS and gene expression programming. Young et al. (2015) used a time series forecasting model. Shiri et al. (2016) used the extreme learning machine approach. Shafaei and Kisi (2016) used wavelet-Support Vector Regression (SVR), wavelet-ANFIS, and wavelet-ARMA conjunction models. Liang et al. (2018) used a deep learning method. Peprah and Larbi (2021) used Integrated Moving Average and Kalman filtering techniques. Luo et al. (2021) used machine learning methods.
More recently, three data-driven techniques, namely Least Square Support Vector Regression (LSSVR), Multivariate Adaptive Regression Splines (MARS), and the M5 model tree, have emerged and shown remarkable promise in addressing difficult nonlinear problems. These methods have been widely employed to solve hydrologic problems [15–18]. LSSVR is a modified variant of support vector regression (SVR) that replaces the quadratic programming problem of SVR with a set of linear equations. It also avoids a number of flaws found in other data-driven learning systems (e.g., local minima, long computation times, and over-fitting). LSSVR has been applied successfully in engineering, for example to the prediction of wastewater effluent parameters, the cost of designing the structural components of an aircraft wing-box, the design of a superconducting magnetic energy storage controller with adaptive damping, the forecasting of CO2 in reservoirs, the economic analysis of oil recovery, and the forecasting of reservoir oil viscosity. In hydrology, a few studies have used LSSVR, for example for streamflow forecasting and estimation [15, 18, 27], estimation of daily water demand and daily dam inflow, sediment transport modeling, modeling of daily reference evapotranspiration, modeling of reservoir inflows, prediction of water pollution, and forecasting of air pollutants.
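To make concrete why LSSVR training is computationally simple, the following minimal sketch fits an LSSVR model by solving a single linear system, which is the standard textbook formulation of the method. It is an illustration only, not the implementation used in this study: the RBF kernel, the parameter names `gamma` (regularization) and `sigma` (kernel width), and the synthetic sine data are all assumptions chosen for the example.

```python
import numpy as np

def rbf_kernel(A, B, sigma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq = (np.sum(A**2, axis=1)[:, None]
          + np.sum(B**2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def lssvr_fit(X, y, gamma=10.0, sigma=1.0):
    """Solve the LSSVR linear system
        [ 0   1^T         ] [ b     ]   [ 0 ]
        [ 1   K + I/gamma ] [ alpha ] = [ y ]
    and return the bias b and the coefficients alpha."""
    n = X.shape[0]
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = rbf_kernel(X, X, sigma) + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], y))
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]

def lssvr_predict(X_train, b, alpha, X_new, sigma=1.0):
    """Evaluate f(x) = sum_i alpha_i * k(x, x_i) + b at the rows of X_new."""
    return rbf_kernel(X_new, X_train, sigma) @ alpha + b

# Synthetic demonstration data (hypothetical, not lake levels).
X_demo = np.linspace(0.0, 2.0 * np.pi, 40).reshape(-1, 1)
y_demo = np.sin(X_demo).ravel()
b_demo, alpha_demo = lssvr_fit(X_demo, y_demo, gamma=1000.0, sigma=1.0)
y_fit = lssvr_predict(X_demo, b_demo, alpha_demo, X_demo, sigma=1.0)
```

Because the solution comes from one call to a linear solver, there is no iterative optimization and hence no risk of stopping in a local minimum; the price is that every training point receives a nonzero coefficient, so the sparsity of standard SVR is lost.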
Multivariate adaptive regression splines (MARS) is a newer artificial intelligence technique. Its key advantages are the ability to capture the inherent complexity of data mapping in high-dimensional data, a fast and flexible model, and accurate forecasting of continuous and binary output variables. Furthermore, this nonparametric statistical method provides a versatile procedure for modeling the relationship between input and output variables with fewer variable interactions. Previous applications of MARS in water resources include rainfall and temperature forecasting, streamflow forecasting, sediment concentration estimation, water pollution prediction, air pollutant prediction, freshwater distribution system modeling, and river flow simulation during drought events [15, 18, 32, 34–38].
The M5 model tree is a data mining methodology that uses a divide-and-conquer approach to split the data into subspaces, so that the multi-dimensional parameter space is partitioned and the model is generated automatically according to the overall quality requirement. Scholars have recently investigated the utility of the M5 model tree in many hydrological applications, such as water level optimization, precipitation and river flow modeling, streamflow modeling, air pollutant modeling, evapotranspiration modeling, pan-evaporation modeling, flood events, and sediment yield modeling.
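The divide-and-conquer idea can be illustrated with a single split. The sketch below, which is a simplified illustration and not the full M5 algorithm (it omits recursive splitting, pruning, and smoothing), picks the threshold on one input that maximizes the standard deviation reduction and then fits a separate linear model in each resulting subspace. The function names, the `min_leaf` parameter, and the piecewise-linear test data are hypothetical.

```python
import numpy as np

def best_split(x, y, min_leaf=3):
    """Return the threshold on x that maximizes the standard deviation
    reduction SDR = sd(y) - sum_k (n_k / n) * sd(y_k), and that SDR."""
    order = np.argsort(x)
    xs, ys = x[order], y[order]
    n = len(ys)
    best_sdr, best_thr = -np.inf, None
    for i in range(min_leaf, n - min_leaf + 1):
        if xs[i] == xs[i - 1]:
            continue  # cannot split between identical input values
        left, right = ys[:i], ys[i:]
        sdr = ys.std() - (len(left) / n * left.std()
                          + len(right) / n * right.std())
        if sdr > best_sdr:
            best_sdr, best_thr = sdr, 0.5 * (xs[i - 1] + xs[i])
    return best_thr, best_sdr

def fit_line(x, y):
    """Least-squares line for one leaf; returns (slope, intercept)."""
    slope, intercept = np.polyfit(x, y, 1)
    return slope, intercept

# Hypothetical data with two linear regimes and a jump at x = 5.
x_demo = np.linspace(0.0, 10.0, 41)
y_demo = np.where(x_demo < 5.0, 2.0 * x_demo, 30.0 - x_demo)

thr, sdr = best_split(x_demo, y_demo)
left_model = fit_line(x_demo[x_demo <= thr], y_demo[x_demo <= thr])
right_model = fit_line(x_demo[x_demo > thr], y_demo[x_demo > thr])
```

A full model tree applies the same split search recursively to each subspace and over all input variables, which is what allows the leaf models to stay linear while the overall mapping is nonlinear.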
The major goals of this study are to (i) investigate three heuristic regression techniques (M5 model tree, LSSVR, and MARS) for forecasting water levels, (ii) investigate the influence of the periodicity component on water level forecasts, and (iii) demonstrate the effectiveness of the proposed models by applying them to Lake Michigan in the USA.
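As an illustration of how such models are typically set up, the sketch below builds an input matrix from previously measured levels and optionally appends a periodicity input. It assumes monthly data and represents the periodicity component by the month number of the value being forecast; both are assumptions made for this example, as are the function name and the choice of three lags.

```python
import numpy as np

def make_lagged_inputs(levels, n_lags=3, months=None):
    """Build (X, y) for one-step-ahead forecasting.

    Row t of X holds [L(t-1), L(t-2), ..., L(t-n_lags)] and, if `months`
    is given, the month number of time t as an extra periodicity column.
    y holds the target level L(t).
    """
    levels = np.asarray(levels, dtype=float)
    X = np.array([levels[t - n_lags:t][::-1]
                  for t in range(n_lags, len(levels))])
    y = levels[n_lags:]
    if months is not None:
        m = np.asarray(months, dtype=float)[n_lags:].reshape(-1, 1)
        X = np.hstack([X, m])
    return X, y

# Hypothetical series: ten consecutive monthly values starting in January.
levels_demo = np.arange(10.0)
months_demo = (np.arange(10) % 12) + 1

X_plain, y_plain = make_lagged_inputs(levels_demo, n_lags=3)
X_per, y_per = make_lagged_inputs(levels_demo, n_lags=3, months=months_demo)
```

Training the same regression method once on `X_plain` and once on `X_per` gives a direct way to compare forecasts with and without the periodicity input.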