Green Roof Hydrological Modelling With GRU and LSTM Networks

Green Roofs (GRs) are increasing in popularity due to their ability to manage roof runoff while providing a number of additional ecosystem services. Improvement of hydrological models for the simulation of GRs will aid the design of individual roofs as well as city-scale planning that relies on the predicted impacts of widespread GR implementation. Machine learning (ML) has exploded in popularity in recent years; however, there are no studies focusing on the use of ML in the hydrological simulation of GRs. We focus on two types of ML-based model, long short-term memory (LSTM) and gated recurrent unit (GRU), for modelling GR hydrological performance, with sequence input and single output (SISO) and synced sequence input and output (SSIO) architectures. The results indicate that both LSTM and GRU are useful tools for GR modelling. For SISO, overall forecast accuracy appears to increase as the time window length (memory length, i.e. the number of time steps of input data) increases. SSIO delivers the best overall performance, close to or even exceeding that of SISO at the maximum window size.


Introduction
Green roofs (GRs) have been used to alleviate stormwater management issues in numerous places around the world (Czemiel 2010; Shafique and Kim 2017). They provide a number of ecosystem services, e.g. biodiversity support, extension of hard-roof lifetime, reduction in stormwater runoff and noise, improved building insulation, and lowering of air temperatures; however, the final effect of implementation depends largely on local context and GR design (Getter and Rowe 2006; Oberndorfer et al. 2007; Lepp 2008; Kolokotsa et al. 2013; Berardi et al. 2014).
Hydrological modelling of GRs can aid the design of individual roofs as well as city-scale planning that relies on the predicted impacts of widespread GR implementation (Li and Babcock 2014; Versini et al. 2015; Li et al. 2021). Currently, GR hydrological simulation may rely on a variety of models or methods, including curve methods (CM), linear/non-linear storage reservoirs (LSR), single reservoir models (SR), and physical models (PM). Most of the models that employ these methods can be categorized as process-driven models (Rasmussen 2006; Hilten et al. 2008; Roehr and Yong 2010; Berthier et al. 2011; Li and Babcock 2015; Versini et al. 2015; Soulis et al. 2017). CM is a purely empirical method that relies on the statistical analysis of measured data (Rasmussen 2006; Roehr and Yong 2010). In LSR, green roofs are treated as a combination of storage reservoirs, which typically represent different GR layers (Kasmin et al. 2010; Berthier et al. 2011). SR models are based on the water balance equation (Stovin et al. 2013; Yang et al. 2015). Finally, PM aim to simulate real-world conditions and physical processes (Palla et al. 2009; She and Pang 2010; Sun et al. 2013).
The results of process-driven models are often thought to be more realistic, scalable, and/or justifiable because they use analytical and empirical formulae based on physical phenomena (Soulis et al. 2017; Peng et al. 2019; Sims et al. 2019; Martin et al. 2020). However, the requirement for extensive meteorological and geometric data, skilled users, and continuous calibration makes this class of models impractical for many applications. To help address this challenge, this study applies machine learning (ML) to GR hydrological simulation.
ML has become increasingly popular in many disciplines in recent years, owing to considerable increases in computing power and data availability. Various studies have demonstrated that an ML algorithm can learn from data and continuously improve its predictions (Mitchell et al. 2003; LeCun et al. 2015).
Among sophisticated ML methods, Recurrent Neural Network (RNN) methods have been applied widely in recent years. With its built-in loops, an RNN is capable of capturing long-term dependencies in data (Informatik et al. 2003; LeCun et al. 2015; Zhang et al. 2018). However, it may also suffer from vanishing and exploding gradient problems during network training (Informatik et al. 2003). Long Short-term Memory (LSTM) networks were created to address this problem by incorporating a cell state and gating mechanisms into standard RNNs (Hochreiter 1998; Gers et al. 2000). The gates of an LSTM network decide whether information is forgotten or remembered, and error signals are kept in memory, which prevents their decay (Hochreiter and Schmidhuber 1997). In other words, the gates help preserve states over the long term while still capturing short-term dependencies. The training procedure for LSTM networks, however, takes considerable time due to their complicated structure. GRU networks were introduced as a simpler version of LSTM networks designed to shorten the training process (Cho et al. 2014).
Part of the advantage of RNNs lies in their sequential mode of operation, which distinguishes them from fixed-size networks. Different designs, such as sequence input and single output (SISO) and synced sequence input and output (SSIO), can be employed depending on the network's usage. Unlike the SSIO design, which depends on the LSTM structure to capture long relationships, the SISO architecture requires a constant window size, i.e. the length of a sliding segment cut from the time series. For example, data x(t) could be modelled using a k-size window as x(n), x(n + 1), …, x(n + k − 1) (Li et al. 2020a).
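As an illustration of the difference between the two designs, the following minimal sketch builds training samples for each of them from a rainfall and a runoff series. It is not the authors' code: the function names, the choice of the runoff value at the last step of the window as the SISO target, and the toy data are assumptions made only for this example.

```python
from typing import List, Sequence, Tuple


def make_siso_samples(
    rain: Sequence[float], runoff: Sequence[float], k: int
) -> List[Tuple[List[float], float]]:
    """Pair every k-step rainfall window with one runoff value (assumed: the
    runoff at the last step of the window)."""
    if len(rain) != len(runoff):
        raise ValueError("rain and runoff must have the same length")
    if not 1 <= k <= len(rain):
        raise ValueError("window size k must be between 1 and the series length")
    samples = []
    for n in range(len(rain) - k + 1):
        window = list(rain[n:n + k])   # k consecutive inputs starting at step n
        target = runoff[n + k - 1]     # single output for this window
        samples.append((window, target))
    return samples


def make_ssio_sample(
    rain: Sequence[float], runoff: Sequence[float]
) -> Tuple[List[float], List[float]]:
    """Use the whole sequence: one output per input step, no window size to set."""
    if len(rain) != len(runoff):
        raise ValueError("rain and runoff must have the same length")
    return list(rain), list(runoff)


# Toy series (hypothetical values, already normalised)
rain = [0.0, 0.2, 0.5, 0.1, 0.0]
runoff = [0.0, 0.0, 0.1, 0.3, 0.2]

siso = make_siso_samples(rain, runoff, k=3)
ssio_inputs, ssio_targets = make_ssio_sample(rain, runoff)
print(len(siso), "SISO samples;", len(ssio_inputs), "SSIO steps")
```

With SISO, the network never sees more than k steps of history per sample, which is why k has to be chosen with some knowledge of the system's response time; with SSIO, the hidden state carries history across the whole sequence.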
When a fixed window size is chosen, LSTM and GRU are forced to confine dependencies to the size of the chosen window. The SSIO architecture, on the other hand, can capture long-term dependencies on its own because hidden states are passed on from earlier time steps. In other words, if the modeller wants to use a fixed window regime, there is little need to employ LSTM or GRU for hydrological modelling, because these networks do not require a fixed window size to be supplied. Furthermore, determining the window size requires a thorough understanding of the watershed's response to rainfall events (Hu et al. 2018; Kratzert et al. 2018, 2019; Yuan et al. 2018).
Some studies do not differentiate between SSIO and SISO and generally use SSIO to build LSTM models, which avoids setting a window size parameter (Kratzert et al. 2018; Li et al. 2020b; Lees et al. 2021; Yin et al. 2021). Gauch et al. (2021) compared different SISO window sizes of LSTM for rainfall-runoff modelling. Li et al. (2020a) compared the SSIO and SISO architectures of LSTM for flood prediction. Asadi et al. (2020) used ML to establish a link between land surface temperature (LST) and different urban characteristic factors, and used an artificial neural network to model GRs and their possible mitigating effects on an urban heat island with a case study in Austin, Texas. In another study, Erdemir and Ayata (2017) proposed an ML model for predicting the temperature drop on a green roof. Tsang and Jim (2016) used meteorological data to model soil moisture variations and design an effective irrigation plan, using artificial intelligence algorithms based on artificial neural networks and fuzzy logic. However, to the authors' knowledge, there are no studies focusing on the use of ML in the hydrological simulation of GRs. Furthermore, little of the ML literature has focused on the comparison of LSTM and GRU models with different architectures.
In this paper, we compare LSTM and GRU models with different architectures to determine which methods and architectures can improve the accuracy of the hydrological modelling of GRs. The paper attempts to broaden the application field of ML models and to demonstrate their validity in a relatively new domain. A more accurate simulation of GR hydrological performance will help GR modellers understand the hydrological processes of GRs under a variety of rainfall conditions, better understand the role of GRs in urban stormwater management, and design more efficient irrigation systems and schemes. With the developed models, we aim to answer the following three questions: (i) which method performs better in GR hydrological modelling, LSTM or GRU; (ii) which architecture (SSIO or SISO) performs better for both LSTM and GRU; and (iii) which window size (for SISO) performs better for both LSTM and GRU.

Data pre-processing
The Blue Green Wave (BGW) of Champs-sur-Marne (France) is the Greater Paris Area's largest green roof (1 hectare). Versini (2019) gathered flow discharge and precipitation data from the BGW with a resolution of 30 s. Data from modified measuring sensors was gathered for 78 days between February and May 2018 in that investigation. Versini (2019) published the open data, which is used in this work.
The study area (Paris) has a mild maritime climate. The average temperature in January is 3 ℃, the average temperature in July is 18 ℃, and the annual average temperature is 10 ℃. The rainfall is distributed throughout the year, with slightly more in summer and autumn, and an annual average of 622 mm.
As described in Versini et al. (2020), the BGW has a substrate depth of 20 cm and is covered by two types of vegetation: grasses that cover the majority of its area, and a mix of perennial planting, grasses, and iris bulbs. A pipe with a diameter of 300 mm collects the water coming from a large part of the BGW (approximately 1143 m²), which is referred to as 'the pipe drained area' in this study. Flow rate (Q) is monitored with a UM18 sensor placed in the pipe. Over the 78-day period, the runoff coefficient (i.e. runoff volume / total rainfall volume) computed for the pipe drained area was 0.706.
There are no missing values in the rainfall dataset. A total of 22,442 rainfall and runoff records from 6 rainfall-runoff events were selected (Table 1).
Equation (1) is used to normalize the rainfall and runoff data, so that the normalized data lie in the [0, 1] range:
$$X_{norm} = \frac{X_i - X_{min}}{X_{max} - X_{min}} \quad (1)$$
where X_norm, X_i, X_min and X_max are the normalized, observed, minimum and maximum values, respectively, of the target variable (rainfall, soil moisture or runoff).
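The scaling to the [0, 1] range described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation; the function names, the handling of a constant series, and the sample values are assumptions for the example.

```python
from typing import List, Sequence


def min_max_normalise(values: Sequence[float]) -> List[float]:
    """Scale a series to [0, 1] using its own minimum and maximum."""
    x_min, x_max = min(values), max(values)
    if x_max == x_min:
        # Assumption: a constant series carries no range, so map it to 0.0
        return [0.0 for _ in values]
    span = x_max - x_min
    return [(x - x_min) / span for x in values]


def denormalise(normalised: Sequence[float], x_min: float, x_max: float) -> List[float]:
    """Invert the scaling, e.g. to report simulated runoff in physical units."""
    return [x * (x_max - x_min) + x_min for x in normalised]


# Hypothetical flow-rate values
flow = [2.0, 4.0, 6.0]
scaled = min_max_normalise(flow)
restored = denormalise(scaled, min(flow), max(flow))
print(scaled, restored)
```

In practice, the minimum and maximum would normally be taken from the training data only and reused for validation and test data, so that no information from the evaluation period enters the scaling.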

Problem formulation
We use Q_t to denote the flow rate at the outflow of a GR at time step t, which is the quantity of interest in this problem. Similarly, rainfall and runoff records at time step t in the GR are denoted by x_t = x_{it}, i = 1, 2, … and y_t = y_{it}, i = 1, 2, …, respectively. In addition to precipitation, runoff is influenced by other factors such as topography, seasons, and other variables. However, as this research involves simulation with a high temporal resolution (every 30 s), rainfall and soil moisture were assumed to be the dominant drivers of runoff. To summarize, runoff modelling is concerned with determining the regression relationship between output runoff and input rainfall.

LSTM
Hochreiter and Schmidhuber (1997) proposed LSTM to cope with the exploding and vanishing gradient problems. Figure 1a depicts the structure of the LSTM. The LSTM unit comprises a cell state c_t, an input gate i_t, a forget gate f_t, a cell gate g_t, and an output gate o_t.
For each time step t, the updated hidden state h_t is computed from the input vector X_t (comprising x_t and y_t), the previous hidden state h_{t−1}, and the previous cell state c_{t−1}. In the update equations, σ(·) is the sigmoid function and * denotes the Hadamard product; all W are weight matrices and all b are bias terms. Because real-world time series data are used, future time steps should not influence past time steps. The bidirectional mechanism was therefore not used in this research, and a unidirectional LSTM network was used in both architectures.

GRU
Figure 1b depicts the structure of the GRU cell. In the GRU, the hidden state h_t and the cell state c_t are merged into one. The GRU cell has two control gates: the update gate z_t and the reset gate r_t (Fig. 1b).
The update gate z_t regulates the amount of state information h_{t−1} carried from the previous time step t−1 into the current time step t. The greater the value of the update gate, the more state information from the previous time step is brought in.
The reset gate r_t determines how much information from the previous state is written into the current candidate state ĉ_t. The smaller the reset gate r_t, the less information from the previous state is written.
The update equations of the GRU cell are illustrated in Fig. 1b. The parameters are the same as those in Sect. 2.3.1.
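To make the gate descriptions above concrete, the sketch below implements one time step of a textbook LSTM cell and a textbook GRU cell in plain Python. It follows the commonly used formulations and the gate names given in the text (i_t, f_t, g_t, o_t for LSTM; z_t, r_t and the candidate state for GRU); the weight layout (each gate acting on the concatenation of input and previous hidden state, with a single bias per gate) and all function names are assumptions, and the code is not taken from the study.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def sigmoid(v: float) -> float:
    return 1.0 / (1.0 + math.exp(-v))


def affine(W: Matrix, b: Vector, v: Vector) -> Vector:
    """Return W v + b for a row-major matrix W."""
    return [sum(w * x for w, x in zip(row, v)) + bias for row, bias in zip(W, b)]


def lstm_step(x: Vector, h_prev: Vector, c_prev: Vector,
              W: Dict[str, Matrix], b: Dict[str, Vector]) -> Tuple[Vector, Vector]:
    """One LSTM update; every W[gate] acts on the concatenation [x, h_prev]."""
    xh = x + h_prev
    i = [sigmoid(v) for v in affine(W["i"], b["i"], xh)]    # input gate
    f = [sigmoid(v) for v in affine(W["f"], b["f"], xh)]    # forget gate
    g = [math.tanh(v) for v in affine(W["g"], b["g"], xh)]  # cell gate
    o = [sigmoid(v) for v in affine(W["o"], b["o"], xh)]    # output gate
    # c_t = f_t * c_{t-1} + i_t * g_t  (element-wise products)
    c = [ft * cp + it * gt for ft, cp, it, gt in zip(f, c_prev, i, g)]
    # h_t = o_t * tanh(c_t)
    h = [ot * math.tanh(ct) for ot, ct in zip(o, c)]
    return h, c


def gru_step(x: Vector, h_prev: Vector,
             W: Dict[str, Matrix], b: Dict[str, Vector]) -> Vector:
    """One GRU update; hidden and cell state are a single vector."""
    xh = x + h_prev
    z = [sigmoid(v) for v in affine(W["z"], b["z"], xh)]  # update gate
    r = [sigmoid(v) for v in affine(W["r"], b["r"], xh)]  # reset gate
    # The reset gate scales how much of the previous state enters the candidate
    x_rh = x + [rt * hp for rt, hp in zip(r, h_prev)]
    c_hat = [math.tanh(v) for v in affine(W["c"], b["c"], x_rh)]
    # Convention matching the text: a larger z keeps more of the previous state
    return [zt * hp + (1.0 - zt) * ct for zt, hp, ct in zip(z, h_prev, c_hat)]


# Tiny check with one input, one hidden unit and all-zero parameters
lstm_W = {k: [[0.0, 0.0]] for k in "ifgo"}
lstm_b = {k: [0.0] for k in "ifgo"}
gru_W = {k: [[0.0, 0.0]] for k in "zrc"}
gru_b = {k: [0.0] for k in "zrc"}
h_lstm, c_lstm = lstm_step([1.0], [0.0], [2.0], lstm_W, lstm_b)
h_gru = gru_step([1.0], [0.4], gru_W, gru_b)
print(h_lstm, c_lstm, h_gru)
```

With all parameters at zero, every sigmoid gate equals 0.5 and every tanh term equals 0, so the LSTM halves the previous cell state and the GRU halves the previous hidden state, which makes the role of each gate easy to verify by hand.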

LSTM and GRU architectures
This paper focuses on two architectures, namely SISO and SSIO (see Fig. 2). The input and output of SSIO have the same length, which does not need to be fixed. For SISO, however, the input window size becomes an additional hyperparameter that must be tuned. The architectures are introduced briefly in the Introduction and are not described in further detail here. Since no previous study has focused on these methods in GR modelling, and the interest in this paper is only in estimating runoff from precipitation, the two architectures, SISO and SSIO, were compared (Hu et al. 2018; Kratzert et al. 2018, 2019; Yuan et al. 2018).

SSIO model of LSTM and GRU
As LSTM and GRU networks are very similar, identical hyperparameters were used for both.

Fig. 2 Different LSTM and GRU architectures
A trial-and-error method was used to tune the parameters in this study (Liang et al. 2018). After a significant number of tests, a single-layer network was found to outperform multi-layer networks in reproducing rainfall-runoff (RR) interactions in the investigated area. At each time step, the proposed networks had a 5-neuron input layer, a 20-neuron hidden layer, and a single-neuron output layer (see Table 2).
The batch size and number of epochs were set to 64 and 500, respectively. An epoch is one complete pass of all training data through the network, so all data in our model were iterated 500 times. The ReLU activation function was selected, and RMSprop was used as the optimizer.
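For the layer sizes given above (5 inputs, 20 hidden units, 1 output), the number of trainable parameters of the two network types can be estimated with a short calculation. The sketch assumes the textbook cell layouts with a single bias vector per gate and a plain dense output layer; some libraries store two bias vectors per gate, so the counts they report may differ. The function names are illustrative.

```python
def lstm_param_count(n_in: int, n_hidden: int) -> int:
    """Four weight blocks (input, forget, cell, output), each acting on
    [x, h] with one bias vector."""
    return 4 * (n_hidden * (n_in + n_hidden) + n_hidden)


def gru_param_count(n_in: int, n_hidden: int) -> int:
    """Three weight blocks (update, reset, candidate), each acting on
    [x, h] with one bias vector."""
    return 3 * (n_hidden * (n_in + n_hidden) + n_hidden)


def dense_param_count(n_in: int, n_out: int) -> int:
    """Fully connected output layer: weights plus biases."""
    return n_in * n_out + n_out


n_in, n_hidden, n_out = 5, 20, 1
lstm_total = lstm_param_count(n_in, n_hidden) + dense_param_count(n_hidden, n_out)
gru_total = gru_param_count(n_in, n_hidden) + dense_param_count(n_hidden, n_out)
print(lstm_total, gru_total)  # 2101 1581
```

Under these assumptions the GRU recurrent layer has three quarters of the parameters of the LSTM layer (1560 versus 2080), which is consistent with GRU being the lighter of the two models to train.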

SISO model of LSTM and GRU
For the SISO model, four alternative window sizes (memory lengths), namely 20 steps, 120 steps, 240 steps, and 360 steps, were evaluated. These options correspond to 5, 30, 60, and 90 min, respectively. The progressively longer windows were intended to reveal how strongly GR hydrological modelling depends on long-term memory. The other hyperparameters of the SISO models were the same as those of the SSIO models in Sect. 2.4.1. Hyperparameters in models with varied time steps and lead times were the same (see Table 2).

Evaluation metrics
The total runoff ratio (R_V), Nash-Sutcliffe efficiency (NSE), root mean square error (RMSE), and mean absolute error (MAE) were employed as error indicators (Najafzadeh 2015). The complexity of hydrological processes and a lack of understanding of hydrological prototypes are usually responsible for the diversity of models (Xia et al. 1997). The information entropy measure I(Q_i) of a model quantifies the amount of information the model contains.
In the computation of I(Q_i), y_i is the normalised residual sequence; Δ_i Q(k) is the residual of model i; {Q(k)}, k = 1, 2, …, n, is the observed runoff sequence of the GR; and Q_i(k), k = 1, 2, …, n, is the runoff flow rate sequence estimated by hydrological model i.
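The four error indicators named above can be computed as in the following sketch, which uses their standard definitions. The definition of R_V as the ratio of total simulated to total observed runoff is an assumption inferred from its name and from how it is interpreted later in the paper; function names and sample values are illustrative only.

```python
import math
from typing import Sequence


def nse(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(obs) / len(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mean_obs) ** 2 for o in obs)
    if den == 0.0:
        raise ValueError("NSE is undefined for a constant observed series")
    return 1.0 - num / den


def rmse(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Root mean square error, in the units of the flow rate."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


def mae(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Mean absolute error, in the units of the flow rate."""
    return sum(abs(o - s) for o, s in zip(obs, sim)) / len(obs)


def runoff_volume_ratio(obs: Sequence[float], sim: Sequence[float]) -> float:
    """R_V, assumed here to be total simulated over total observed runoff;
    values below 1 indicate that the total volume is underestimated."""
    return sum(sim) / sum(obs)


# Hypothetical observed and simulated flow rates
obs = [1.0, 2.0, 3.0]
sim = [1.0, 2.0, 2.0]
print(nse(obs, sim), rmse(obs, sim), mae(obs, sim), runoff_volume_ratio(obs, sim))
```

Note that NSE improves as it rises towards 1, whereas RMSE and MAE improve as they fall towards 0, so the indicators move in opposite directions when a model gets better.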

Comparison between SSIO and SISO architectures of LSTM
than the SSIO, but R_V is lower than that of the SSIO; the index values of the two models are nevertheless very close, which means that the two models have the same modelling level. Table 3 displays the simulation results of the SSIO and SISO architectures of GRU. The indices of the GRU SISO models follow the same pattern as those of the LSTM SISO models. The overall prediction accuracy improves as the time window lengthens. As the window grows from 20 to 360 steps, NSE, MAE, and RMSE show a clear progressive improvement. The GRU SISO 360 model has the best performance. The relationship between the GRU SISO 360 and GRU SSIO models is the same as for the LSTM models: their index values are very close, which means that the two models have the same modelling level.

Comparison between SSIO and SISO architectures of LSTM and GRU
For the SISO architectures of both LSTM and GRU, the overall prediction accuracy improves as the time window lengthens. As the window grows from 20 to 360 steps, NSE, MAE, and RMSE show a clear progressive improvement. This behaviour implies that runoff at the investigated GR truly depends on the long-term history of prior rainfall; hence, to simulate on a fine temporal scale, architectures that can retain long-term memory are necessary.
From Table 3, R_V is less than 1 for all models, which means that none of the methods overestimates the events, and the values of SSIO and SISO 360 are closest to the measured values. The results of SISO 20-240 improve as the window size increases. Table 3 also shows the information entropy metric I(Q). The greater the value of I(Q), the more information the model contains, although the value changes depending on the event. For both LSTM and GRU, I(Q) of SISO 20-360 improves as the window size increases, and the results of SSIO are close to the best values of SISO. This demonstrates that the models contain more information as the window size increases, with SISO 360 containing the most. Comparing the two model types, GRU is better than LSTM; it can be concluded that GRU on the whole contains more information than LSTM.
Figure 3 shows the LSTM and GRU model outputs for the event on 29 and 30 April (23.5 mm) compared to the observed values. All approaches succeed in identifying a similar runoff pattern. However, performance improves as the length of input memory increases, and the simulation performance of SISO approaches that of the SSIO model as the window size approaches 360 steps.
In Fig. 3a, the low flow rates simulated by the LSTM SISO models fit the observed values better than those of the GRU SISO models, and the fit of the SISO models improves as the window size increases. In Fig. 3, the peaks of all simulated runoff series are smaller than the observed peaks, which indirectly confirms that R_V is not greater than 1 for any method. Moreover, the simulated peaks all occur earlier than the measured peaks, and the values of SSIO and SISO 360 are the closest to the measured values. The results of SISO 20-240 improve as the window size increases.
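The peak comparison discussed above (peak magnitude and peak timing) can be quantified with a small helper such as the one sketched below. It is an illustration only: the function name, the sign conventions and the sample series are assumptions, and the 30 s step is the data resolution reported for the BGW dataset.

```python
from typing import Sequence, Tuple


def peak_errors(
    obs: Sequence[float], sim: Sequence[float], step_seconds: float = 30.0
) -> Tuple[float, float]:
    """Return (relative peak error, peak timing offset in seconds).

    A negative relative error means the simulated peak is lower than the
    observed one; a negative offset means the simulated peak occurs earlier.
    If a series has several equal maxima, the first one is used.
    """
    obs_idx = max(range(len(obs)), key=obs.__getitem__)
    sim_idx = max(range(len(sim)), key=sim.__getitem__)
    relative_error = (sim[sim_idx] - obs[obs_idx]) / obs[obs_idx]
    timing_offset = (sim_idx - obs_idx) * step_seconds
    return relative_error, timing_offset


# Hypothetical single-peak event: the simulated peak is lower and one step early
observed = [0.0, 1.0, 4.0, 2.0]
simulated = [0.0, 3.0, 2.0, 1.0]
rel_err, offset = peak_errors(observed, simulated)
print(rel_err, offset)  # -0.25 -30.0
```

The helper assumes one dominant peak per event; for multi-peak events each peak would need to be isolated before comparison.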
According to the above results, ML can simulate the hydrological performance of GRs well, especially in terms of peak flow, peak occurrence time, and the flow hydrograph. Based on the results, it is recommended to use the LSTM SSIO method to simulate the hydrological performance of GRs in practical applications, not only because this method achieves superior NSE, R_V, MAE, and RMSE, but also because it fits the peak flow and peak occurrence time better than the GRU SSIO method. In addition, compared with the SISO method, the SSIO method removes the need to set one parameter (window size), although both achieved a relatively good fit.
Based on these results, it is likely that modellers applying this method can spend less time building and debugging models. The efficient development of accurate hydrologic models representing various types of stormwater infrastructure will ultimately aid the planning and eventual implementation of climate-adapted cities or Sponge Cities.

Comparison with previous studies
Because there are no studies focusing on the use of ML in the hydrological simulation of GRs, this section compares the results of this paper with studies in which LSTM or GRU were applied to other problems such as RR modelling and flood prediction. Li et al. (2020a) compared an LSTM SSIO model with LSTM SISO models for flood prediction. In their study, the prediction accuracy of the SISO models showed a clear progressive increase as the time window increased. The SSIO model and the SISO model with the best window size had the closest agreement with observations and the best overall performance. This is in agreement with the results of this study. Apaydin et al. (2020) compared LSTM and GRU for reservoir inflow forecasting, focusing only on the differences between LSTM and GRU without comparing SSIO and SISO. All input parameters of LSTM and GRU were the same, and the optimal values were obtained by trial and error. In their study, the GRU and LSTM models had the closest results to measured values, but LSTM performed slightly better, with best correlation coefficients of 0.87 and 0.85 for LSTM and GRU, respectively. Gao et al. (2020) compared LSTM and GRU for runoff forecasting using the SISO architecture, with the same input parameters for both. In their study, the GRU and LSTM models predicted runoff with almost identical accuracy, with an NSE for predicted runoff over the next 2 h of 0.987 and 0.990 for LSTM and GRU, respectively.

Conclusion
This paper is, to the authors' knowledge, the first study focusing on the use of ML in GR hydrological simulation. It examines the performance of LSTM and GRU models with SISO and SSIO architectures in modelling the hydrological performance of GRs.
In conclusion, LSTM and GRU are both useful in modelling GRs, but GRU was found to be better than LSTM. Regarding architectures, the overall forecast accuracy of SISO improves as the length of the time window increases. SSIO and SISO 360 share the best overall performance. Given that SSIO has fewer parameters to set, it may be the preferred method for GR modelling in general.
This study has demonstrated the potential of ML methods to improve the simulation of GRs, which may help pave the way for related tools to be implemented in existing hydrologic models. Better representation of individual stormwater management elements including GRs in hydrologic models can aid the design of individual GRs and improve the accuracy of large scale models that support city planning.
Authors Contributions Haowen Xie contributed to the conception of the study and performed the data analyses; Mark Randall contributed significantly to analysis and manuscript preparation; Kwok-wing Chau helped perform the analysis with constructive discussions.
Funding No funding was received to assist with the preparation of this manuscript.

Availability of data and material Not applicable.
Code Availability Not applicable.

Declarations
Ethical Approval Not applicable.

Consent to Publish Not applicable.
Competing Interests All authors declare that we have no financial and personal relationships with other people or organizations that can inappropriately influence our work, there is no professional or other personal interest of any nature or kind in any product, service and/or company that could be construed as influencing the position presented in, or the review of, the manuscript entitled.