3.1 Research Concept and Design
Fig. 2 schematically illustrates the concept of the present study on urban flooding. Urban flooding is not simply flooding that occurs in an urban area. It does not arise when a river overtops its embankment (river flooding) or when a storm surge comes ashore (coastal flooding). Instead, it is caused by excessive runoff from developed watersheds that has nowhere to go. Additionally, coastal and river flooding affect large areas, whereas urban flooding is fragmented and localized (Fig. 1c).
Path (a) in Fig. 2 represents the concept that socioeconomic disparities and the vulnerable population increase as a side effect of rapid economic development; this is expressed as the flood-vulnerable population. Path (b) represents the concept that resource allocation during urbanization reduces or increases flood risk; this is expressed as the flood-vulnerable area. Path (c) shows the relationship between the flood-vulnerable population resulting from path (a) and the flood-vulnerable area resulting from path (b). This relationship represents environmental injustice and is revealed through the distribution pattern of the flood vulnerability characteristics in paths (a) and (b) and the justification of path (c).
In this study, appropriate variables for paths (a) and (b) in Fig. 2 are selected and the corresponding data are collected. Next, the flood-prone environment is identified, and a model that accounts for temporal and spatial effects is selected and estimated. Finally, we discuss environmental injustice based on the results.
3.2 Variable Selection
Three candidate dependent variables were considered: flooded area, human casualties, and flood damage density. Flooded area oversimplifies the damage, and human casualties have rarely occurred in Korea since 1990 (Son et al. 2015). Therefore, flood damage density was selected.
Natural characteristics, such as rainfall (Highfield and Brody 2013), rainfall intensity (Brody and Highfield 2013; Highfield and Brody 2013), and elevation (Wang et al. 2017) can be assumed to affect flooding.
Notably, economic characteristics are closely related to flood damage. These factors are statistically significant because the amount of damage to a parcel of land or a building varies with its value, even when the affected area is the same (Rufat et al. 2015). Therefore, previous empirical studies used these factors as control variables when determining the effects of floods (Brody and Highfield 2013; Highfield and Brody 2013; Lee and Brody 2018).
The following variables were selected in consideration of the disaster prevention plan elements applied to Seoul and environmental justice.
The location and structure of detached houses (Creach et al. 2020), basements and semi-basements (Forrest et al. 2020), aged and poorly built houses (Brody et al. 2011), and house type (Ketteridge and Fordham 1998) are related to flood damage and social vulnerability; therefore, they may be considered as variables.
With regard to land use, when development takes place in an area that is vulnerable to flooding, it is imperative to limit development density (Berke et al. 2009; Stevens et al. 2010) in addition to land coverage (Burby et al. 2001; Stevens et al. 2010). Land use may also be related to the vulnerable population.
Most residential areas in Seoul with low-rise houses are characterized by small housing areas, a large proportion of elderly and single-person households, and no significant change in infrastructure (Maeng et al. 2016). Therefore, the average land area of detached houses is used as a variable representing land use in the areas where the vulnerable population lives.
Notably, we did not select a race variable because Korea is an ethnically homogeneous country with no serious racial conflict. In addition, the higher education completion rate among Korean adults is 10.4 % higher than the OECD average, which places Korea in the top group. Therefore, the education level variable was excluded because it had no discriminatory power (Indicators 2020).
To consider the increase in the population that is vulnerable to flooding and economic inequality, we selected the ratios of the elderly population (Cutter et al. 2003; Parker et al. 2005) and recipients of public assistance (Cutter et al. 2003; Rufat et al. 2015) as variables in our study.
3.3 Data Construction
The unit of analysis is the gu, an administrative district under the Seoul Metropolitan Government (Lee and Brody 2018). For the period from 2000 to 2018, 475 observations were obtained from the 25 gu in Seoul (25 districts × 19 years). The variables were constructed using ArcMap 10.4, and the detailed construction process is explained below.
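The panel dimensions stated above (25 gu observed over 2000 to 2018) can be checked with a short sketch. The district labels are hypothetical placeholders, since the text gives only the number of districts, and `is_strongly_balanced` is an illustrative helper rather than part of the study's workflow.

```python
from itertools import product

# Hypothetical district labels; the text gives only the count (25 gu).
districts = [f"gu_{k:02d}" for k in range(1, 26)]
years = list(range(2000, 2019))  # 2000..2018 inclusive, i.e. 19 years

# One (district, year) pair per observation.
panel_index = list(product(districts, years))

def is_strongly_balanced(index, units, periods):
    """Return True if every unit appears exactly once in every period."""
    expected = set(product(units, periods))
    return len(index) == len(expected) and set(index) == expected

print(len(panel_index))  # 25 * 19 observations
print(is_strongly_balanced(panel_index, districts, years))
```

Dropping any single district-year pair makes the check fail, which is what distinguishes a strongly balanced panel from an unbalanced one.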
3.3.1 Derivation of flood risk areas
Analyzing the entire Seoul area could overestimate or underestimate the flood vulnerability characteristics because the scope is too broad. Therefore, flood risk areas were selected for the analysis. First, topographical factors were considered, because elevation, slope, and relative relief directly affect urban flooding. Second, the design flood level was considered, because drainage depends on the design flood level of each river as well as on the topographical factors. Lastly, even if a stream does not overflow, water from the urban watershed is discharged into the stream, so water may accumulate around the stream when the flood level rises; the River Act stipulates that the area affected by flooding lies within 500 m of the river boundary. For these reasons, flood risk areas were selected according to the following criteria: 1) areas with an elevation of 30 m or lower, a slope of 7° or lower, and a relative relief of 20 m or lower; 2) areas lower than the design flood level of each river; and 3) areas located within 500 m of the river boundary (Fig. 1d, e).
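The three selection criteria can be illustrated with a minimal sketch over a handful of made-up grid cells. The thresholds come from the text; the cell values, the field names, and the `mode` switch are assumptions. In particular, the text does not say whether the criteria are combined as a union or an intersection, so both are shown and neither should be read as the procedure actually used in ArcMap.

```python
# Thresholds taken from the criteria stated in the text.
MAX_ELEVATION_M = 30.0
MAX_SLOPE_DEG = 7.0
MAX_RELIEF_M = 20.0
RIVER_BUFFER_M = 500.0

def criteria(cell):
    """Evaluate the three criteria for one cell (a dict with hypothetical keys)."""
    topographic = (cell["elevation"] <= MAX_ELEVATION_M
                   and cell["slope"] <= MAX_SLOPE_DEG
                   and cell["relief"] <= MAX_RELIEF_M)
    below_design_level = cell["elevation"] < cell["design_flood_level"]
    near_river = cell["dist_to_river"] <= RIVER_BUFFER_M
    return topographic, below_design_level, near_river

def is_flood_risk(cell, mode="any"):
    """Combine the criteria; 'any' = union, 'all' = intersection (both are assumptions)."""
    flags = criteria(cell)
    return any(flags) if mode == "any" else all(flags)

# Made-up example cells, for illustration only.
cells = [
    {"elevation": 10, "slope": 2, "relief": 5, "design_flood_level": 15, "dist_to_river": 100},
    {"elevation": 25, "slope": 9, "relief": 10, "design_flood_level": 15, "dist_to_river": 600},
    {"elevation": 40, "slope": 3, "relief": 10, "design_flood_level": 15, "dist_to_river": 400},
    {"elevation": 28, "slope": 5, "relief": 25, "design_flood_level": 15, "dist_to_river": 900},
]
union_mask = [is_flood_risk(c, "any") for c in cells]
intersection_mask = [is_flood_risk(c, "all") for c in cells]
```

The third cell shows why the combination rule matters: it is too high to meet the topographic criterion but lies within the 500 m buffer, so it is selected under the union and excluded under the intersection.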
3.3.2 Variable data construction
The flood damage data for Seoul were collected from the Water Resources Management Information System (WRMIS) and adjusted for inflation to 2018 values. The total annual precipitation (mm) was obtained by first collecting data at the automated synoptic observing system (ASOS) and automatic weather system (AWS) locations and then calculating average values using inverse distance weighted (IDW) interpolation, which assigns distance-adjusted weights to the observations. Because the AWS does not provide the maximum one-hour precipitation (mm) or the maximum daily precipitation (mm), only the ASOS data were used for these variables. The average surface elevation was derived from the digital elevation model (DEM) data that the National Geographic Information Institute provides for regional statistics. The map scale was 1:5000, and the average cell size was 90 m. The data on rainwater discharge facilities were acquired from the sewage statistics published by the Ministry of Environment. The total length was calculated by summing the lengths of the sewage pipes and rainwater pipes. Building-related variables were acquired from the building registry provided by the civilian open system for architectural data. The property data were combined with the building database (DB) provided with the new road addresses. The average public land price, financial independence, and ratio of public assistance recipients were constructed from data provided by Seoul City. The average public land price was adjusted for inflation and linked to the building-related variables through the Parcel Number (PNU) code. The vulnerable population density for people aged 65 years or above was taken from the census data of the Statistical Geographic Information Service (SGIS). Table 2 shows the variable descriptions and sources.
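The IDW step mentioned above can be sketched as follows. This is a generic illustration of inverse distance weighting, not the ArcMap configuration used in the study: the power parameter of 2, the coordinates, and the station values are all assumptions.

```python
import math

def idw(target, stations, power=2.0):
    """Inverse distance weighted estimate at `target` (x, y).

    `stations` is a list of (x, y, value) tuples. The power of 2 is a
    common default and an assumption here, not a value from the text.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in stations:
        distance = math.hypot(target[0] - x, target[1] - y)
        if distance == 0.0:
            return float(value)  # target coincides with a station
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total

def area_mean(points, stations, power=2.0):
    """Average the IDW estimates over sample points inside one district."""
    estimates = [idw(p, stations, power) for p in points]
    return sum(estimates) / len(estimates)

# Made-up stations: (x, y, annual precipitation in mm).
stations = [(0.0, 0.0, 1000.0), (2.0, 0.0, 1400.0)]
midpoint_estimate = idw((1.0, 0.0), stations)
```

A point equidistant from both stations receives the simple average of their values, while a point close to one station is dominated by that station's reading.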
3.4 Data Analysis
To determine the impact of flood vulnerability characteristics on flood damage, we applied a panel model. Pielke and Downton (2000) argued that, because natural disasters unfold in very complicated ways that depend on regional characteristics and policies, a limited set of independent variables cannot explain all the damage from a natural disaster. The panel model can take unobserved time and individual effects into account. It can be expressed as the following regression model (Hsiao 2014):

$$Y_{it} = a + \beta X_{it} + \epsilon_{it}, \qquad \epsilon_{it} = \mu_i + \lambda_t + v_{it}$$

where $Y_{it}$ represents the dependent variable, $a$ is the Y-intercept, $\beta$ is the slope, $X_{it}$ represents the independent variable, $\epsilon_{it}$ is the error term, $i$ indexes the individual (1, 2, 3, etc.), $t$ indexes time (1, 2, 3, etc.), $\mu_i$ is the unobserved individual effect, $\lambda_t$ is the unobserved time effect, and $v_{it}$ is the remaining stochastic disturbance term.
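To make the role of the unobserved individual and time effects concrete, the sketch below estimates the slope of a two-way panel regression with the within (fixed effects) transformation. It assumes a single regressor and a balanced panel, and the data are synthetic; it illustrates the textbook estimator and is not the estimation code used in this study.

```python
def two_way_demean(z):
    """Subtract unit and period means and add back the grand mean (balanced panel)."""
    n, t = len(z), len(z[0])
    unit_means = [sum(row) / t for row in z]
    period_means = [sum(z[i][j] for i in range(n)) / n for j in range(t)]
    grand_mean = sum(unit_means) / n
    return [[z[i][j] - unit_means[i] - period_means[j] + grand_mean
             for j in range(t)] for i in range(n)]

def within_slope(x, y):
    """Two-way fixed effects slope for one regressor; x and y are N x T lists."""
    xd, yd = two_way_demean(x), two_way_demean(y)
    sxy = sum(a * b for row_x, row_y in zip(xd, yd) for a, b in zip(row_x, row_y))
    sxx = sum(a * a for row_x in xd for a in row_x)
    return sxy / sxx

# Synthetic example: 3 units, 4 periods, true slope 1.5, no noise.
unit_effect = [1.0, -2.0, 3.0]        # plays the role of the individual effect
period_effect = [0.5, 0.0, -0.5, 1.0]  # plays the role of the time effect
x = [[float((i + 1) * (j + 1)) for j in range(4)] for i in range(3)]
y = [[2.0 + 1.5 * x[i][j] + unit_effect[i] + period_effect[j]
      for j in range(4)] for i in range(3)]
slope = within_slope(x, y)
```

Because the transformation removes anything that is constant within a unit or within a period, the recovered slope does not depend on the values chosen for `unit_effect` and `period_effect`.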
This study uses a strongly balanced panel data structure, organized by region and time, to determine the fixed and random effects of flood vulnerability characteristics on flood damage in lowlands. Choosing the model specification is the most important step in setting up a panel model and estimating its parameters from the data.
In this study, we conducted the Chow, Breusch–Pagan Lagrange Multiplier (Breusch–Pagan LM), and Hausman tests, along with autocorrelation and heteroscedasticity tests, to select the most suitable model for the panel analysis from among the pooled (mixed), fixed effect, random effect, and feasible generalized least squares (FGLS) models.
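The way these tests are usually combined can be summarized as a small decision function. The rule below follows a common textbook convention and is an assumption, since the text does not spell out the decision rule: the 5 % significance level, the function name, and the choice to move to FGLS whenever autocorrelation or heteroscedasticity is detected are all illustrative.

```python
def select_panel_model(p_chow, p_bp_lm, p_hausman,
                       autocorrelation, heteroscedasticity, alpha=0.05):
    """Pick a panel specification from test p-values (conventional rule, assumed).

    p_chow:    Chow (F) test, pooled vs. fixed effects
    p_bp_lm:   Breusch-Pagan LM test, pooled vs. random effects
    p_hausman: Hausman test, random vs. fixed effects
    """
    fixed_beats_pooled = p_chow < alpha
    random_beats_pooled = p_bp_lm < alpha

    if fixed_beats_pooled and random_beats_pooled:
        # Both effects models beat pooling; Hausman decides between them.
        base = "fixed effects" if p_hausman < alpha else "random effects"
    elif fixed_beats_pooled:
        base = "fixed effects"
    elif random_beats_pooled:
        base = "random effects"
    else:
        base = "pooled"

    # Assumed convention: non-spherical errors call for FGLS.
    if autocorrelation or heteroscedasticity:
        return "FGLS"
    return base

choice = select_panel_model(0.001, 0.001, 0.01, False, False)
```

In this convention the Hausman test matters only when both the Chow and the Breusch–Pagan LM tests reject pooling, and the error diagnostics are applied last.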