Natural catastrophes such as hurricanes, earthquakes, and floods cause significant economic, ecological, and social damage, as well as casualties. Among them, floods stand out for the severity of their effects, impacting about 109 million people throughout the world between 1995 and 2015 (Alfieri et al., 2017; Hirabayashi et al., 2013). In relative terms, floods accounted for about 43% of all events and about 55% of the people affected, with lost assets totaling over 636 billion USD (Serinaldi et al., 2018), which has encouraged researchers worldwide to work on mitigating this disaster (Hallegatte et al., 2017). The tropical regions of South Asia are characterized by a rapid seasonal reversal of wind direction accompanied by intense precipitation, resulting in wet summers and dry winters (Xie & Saiki, 1999), so large-scale floods and droughts can be expected there (Parthasarathy & Mooley, 1978). One viable option in flood hazard management is a practical and effective flood warning system (Boulange et al., 2021). Early prediction of floods facilitates timely management of hydro-junction operations and fast evacuation of individuals from flood-affected regions, leading to a reduction in socioeconomic losses (Zhang et al., 2022).
A significant challenge in advancing flood forecasting technology is the limited availability of field data. Flood prediction approaches are usually divided into two categories: physically based models and data-driven models. Physical models (Mourato et al., 2021; Pierini et al., 2014) often require substantial amounts of both hydrological and geomorphological data for calibration and validation, and these data might not always be readily accessible. Furthermore, the model parameters must be carefully tested and evaluated, because they are regionally dependent and can be challenging to estimate.
To overcome these limitations, machine learning based data-driven models have gained popularity in flood forecasting because of their ability to capture complex nonlinear patterns, cope effectively with limited data (Rahmati & Pourghasemi, 2017), and extract spatial information from images (Lee et al., 1990). These models can be implemented effectively based solely on available rainfall data and measured discharge data, without the need for detailed catchment characteristics. The Artificial Neural Network (ANN) is a common algorithm for flood simulation because it has outperformed traditional methods on many occasions (Chu et al., 2020; Elsafi, 2014; Tamiru & Dinka, 2021). Subsequently, the Recurrent Neural Network (RNN) was introduced for time series forecasting tasks, with the ability to capture essential information from long sequences of data. Long Short-Term Memory (LSTM), a special type of RNN, has gained significant popularity and widespread adoption in hydrologic prediction tasks (Dtissibe et al., 2024; Fang et al., 2021; Xiang et al., 2020; Zou et al., 2023).
To address some limitations of these traditional neural network algorithms, such as low computational speed and ineffectiveness in capturing long-term dependencies, Google introduced a new architecture called the Transformer (Vaswani et al., 2017), which is based on the attention mechanism (Bahdanau et al., 2014). Although it was originally designed for natural language processing (NLP), the Transformer model has demonstrated its effectiveness in handling other types of sequential data, including time series (Farsani & Pazouki, 2021; Wu et al., 2020). In the realm of flood forecasting, studies that incorporate the Transformer architecture are scarce, representing a notable gap in the literature. Moreover, existing research indicates that the relative accuracy of models, including Transformers, varies across datasets (Wei et al., 2023; Xu et al., 2023).
For tropical regions, especially areas in South Asia, the lack of studies on deep learning-based flood simulation is an issue. This paper is related to our study (Madhushanka et al., 2024), which focused on the lower reach of the Mahaweli catchment, Sri Lanka. In this paper, in addition to the capability of forecasting daily streamflow, the effects of different input features on the output are thoroughly investigated.