Building an hourly WBGT prediction model from daily weather data
Historical hourly WBGT data
As the objective variable for a machine learning model that predicts the hourly WBGT, a 12-year (2010–2021) historical hourly WBGT dataset for 842 cities in Japan was obtained from the website of the Ministry of the Environment of Japan (MOEJ). Eleven cities provide measured WBGT values, whereas the others provide only estimates derived from a formula based on air temperature, relative humidity, solar radiation, and wind speed (Ono & Tonouchi, 2014); thus, estimated data were used to ensure consistency across all cities. A validation using three years of WBGT data from six cities in various regions of Japan showed that the coefficient of determination between the WBGT estimates and observations is very high (R2 = 0.998) and that 97.6% of the estimation errors are within ±1°C, making the estimates applicable to summer WBGTs throughout Japan (Ono & Tonouchi, 2014). We used these WBGT data to reconstruct hourly WBGTs and evaluate the results.
Historical daily weather data
As explanatory variables for building a machine learning model for predicting the hourly WBGT, daily mean/maximum/minimum air temperatures (Ta/Ta,max/Ta,min) and daily mean wind speed (WS) were obtained from the Automated Meteorological Data Acquisition System (AMeDAS) of the Japan Meteorological Agency (JMA). The daily mean relative humidity (RH) and total solar radiation (SR) were obtained from the corresponding 1 km grid of agro-meteorological grid square data from the National Agriculture and Food Research Organization in Japan (NARO).
Building an hourly WBGT prediction model from daily weather data
To predict the WBGT at hourly resolution, a machine learning technique called extreme gradient boosting (XGBoost) (Chen & Guestrin, 2016) was applied, using the above daily weather indices as explanatory variables. XGBoost has been reported to achieve high WBGT prediction accuracy (Niwa & Manabe, 2024). The modeling concept is expressed in Eq. (1).
$$\mathrm{WBGT}_{c,d,hh} = f_{hh}\left(x_{c,d-1},\, x_{c,d},\, x_{c,d+1},\, \cos\theta_{c,d}\right) \qquad (1)$$
Here, xc,d = (Ta,c,d, Ta(max),c,d, Ta(min),c,d, RHc,d, WSc,d, SRc,d), where Ta,c,d, …, SRc,d are the Ta, Ta(max), Ta(min), RH, WS, and SR in city c on day d, respectively, and d − 1 and d + 1 represent the days before and after d, respectively. In addition, cosθc,d = (cosθc,d,00, cosθc,d,01, …, cosθc,d,23), where θc,d,hh is the solar zenith angle at time hh of day d in city c (when cosθc,d,hh < 0, cosθc,d,hh is set to 0). These indicators were selected because many of them are useful for predicting the hourly WBGT from daily weather indicators (Takakura et al., 2018, 2019). A model was then constructed using these 42 variables (6 daily indices for each of 3 days plus 24 hourly cosθ values) as explanatory variables and the hourly WBGT as the objective variable. To improve the prediction accuracy and versatility of XGBoost, hyperparameters (HPs) were optimized for each hour using training and validation data. For HP tuning, Optuna (Akiba et al., 2019), an open-source HP optimization framework, was used.
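The composition of the 42 explanatory variables can be illustrated with a minimal sketch. The dictionary keys, the function names, and the solar geometry are assumptions made for illustration only: the declination uses a simple cosine approximation and the hour angle assumes local solar time, neither of which is stated in the source.

```python
import math

# Hypothetical key names for the six daily weather indices.
DAILY_KEYS = ("ta", "ta_max", "ta_min", "rh", "ws", "sr")


def cos_zenith_by_hour(lat_deg, day_of_year):
    """Return clipped cos(solar zenith angle) for hours 00..23.

    Assumptions (not from the source): a simple cosine approximation of
    solar declination, and hours expressed in local solar time.
    """
    lat = math.radians(lat_deg)
    decl = math.radians(-23.44) * math.cos(2 * math.pi * (day_of_year + 10) / 365)
    values = []
    for hour in range(24):
        hour_angle = math.radians(15.0 * (hour - 12))
        cos_theta = (math.sin(lat) * math.sin(decl)
                     + math.cos(lat) * math.cos(decl) * math.cos(hour_angle))
        values.append(max(cos_theta, 0.0))  # negative values are set to 0
    return values


def build_features(prev_day, day, next_day, lat_deg, day_of_year):
    """Concatenate 3 days x 6 daily indices and 24 cos(theta) values (42 total)."""
    features = []
    for record in (prev_day, day, next_day):
        features.extend(record[key] for key in DAILY_KEYS)
    features.extend(cos_zenith_by_hour(lat_deg, day_of_year))
    return features
```

In this sketch, the same 42-element vector would be passed to each of the 24 hourly models f_hh; only the objective variable (the WBGT at hour hh) differs between models.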
The results of the model evaluation demonstrated that the model performs well across different hours of the day (Fig. 1, Fig. 2). The mean absolute error (MAE) remains low throughout the day, with values ranging from 0.55 to 0.95. Although there are variations, with peaks at approximately 9:00 and 15:00, the overall values indicate that the model's predictions are consistently close to the actual values. This range of MAEs is sufficiently low: Class 1, the highest accuracy standard for electronic WBGT measurement instruments defined in the Japanese Industrial Standard JIS B 7922:2023 (Japanese Standards Association & Japan Electric Measuring Instruments Manufacturers’ Association, 2023), allows an error of ±1°C. The root mean squared error (RMSE) values vary between 0.8 and 1.3, peaking at 9:00 and 15:00. R2 remains consistently high, with values ranging from 0.96 to 0.99 and dips at 9:00 and 15:00. The bias fluctuates near zero, with values ranging from −0.005 to 0.002. These results indicate that a highly accurate prediction model can be built with this method; the final model was therefore built with all the data used for training and validation.
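For reference, the four evaluation metrics can be computed as in the following minimal sketch. The function name is hypothetical, and two definitions are assumptions not confirmed by the source: bias is taken as the mean of (predicted − observed), and R2 is computed as 1 − SS_res/SS_tot rather than as a squared correlation coefficient.

```python
import math


def evaluate(observed, predicted):
    """Return MAE, RMSE, bias, and R2 for paired observed/predicted values."""
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    bias = sum(errors) / n  # assumed sign convention: predicted minus observed
    mean_obs = sum(observed) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - ss_res / ss_tot  # assumed definition of R2
    return {"mae": mae, "rmse": rmse, "bias": bias, "r2": r2}
```

Applying such a function separately to the samples of each hour would give the per-hour profile of the metrics described above.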
Future projections
Projection of future hourly WBGTs on the basis of climate scenario data across Japan
The bias-corrected daily climate scenario data over Japan (NIES2020) (Ishizaki, 2021) were used as input data to project future hourly WBGTs for the grid cells corresponding to the target cities. NIES2020 data are based on the outputs of five global climate models (GCMs: MIROC6, MRI-ESM2-0, ACCESS-CM2, IPSL-CM6A-LR, and MPI-ESM1-2-HR) and four greenhouse gas (GHG) emission pathways (shared socioeconomic pathway (SSP)1-1.9, SSP1-2.6, SSP2-4.5, and SSP5-8.5). However, SSP1-1.9 data are available for only three GCMs (MIROC6, MRI-ESM2-0, and IPSL-CM6A-LR). We used the data for the 2030s to 2050s and the 2060s to 2080s to make projections for the middle and end of the 21st century, respectively.
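The resulting set of scenario combinations can be enumerated as in the sketch below. The variable and function names are hypothetical, and the exact year boundaries (2030–2059 and 2060–2089) are an assumed reading of "2030s to 2050s" and "2060s to 2080s".

```python
GCMS = ["MIROC6", "MRI-ESM2-0", "ACCESS-CM2", "IPSL-CM6A-LR", "MPI-ESM1-2-HR"]
SSPS = ["SSP1-1.9", "SSP1-2.6", "SSP2-4.5", "SSP5-8.5"]

# SSP1-1.9 is available for three GCMs only.
SSP119_GCMS = {"MIROC6", "MRI-ESM2-0", "IPSL-CM6A-LR"}

# Assumed year boundaries for the two projection periods.
PERIODS = {"mid-century": (2030, 2059), "end-of-century": (2060, 2089)}


def scenario_runs():
    """List every available (SSP, GCM, period) combination."""
    runs = []
    for ssp in SSPS:
        for gcm in GCMS:
            if ssp == "SSP1-1.9" and gcm not in SSP119_GCMS:
                continue  # this GCM has no SSP1-1.9 data
            for period in PERIODS:
                runs.append((ssp, gcm, period))
    return runs
```

Three pathways with five GCMs plus one pathway with three GCMs give 18 SSP-GCM pairs, or 36 runs over the two periods.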
To reproduce city-specific WBGTs, bias correction parameters were identified between the hourly WBGT values predicted from daily weather variables in the agro-meteorological grid square data and the hourly WBGT values in the target cities. This correction addresses the bias between grid-average WBGTs and city-specific WBGTs. As discussed later in the manuscript, WBGT measurements are becoming more commonly recorded at schools; this study therefore presents a method for correcting the bias of grid data by using data from specific points (AMeDAS observation points in this study). This method would enable the evaluation of WBGTs and heat impacts under climate change for each school in the future. To validate the bias correction results, the hold-out method was employed to avoid overfitting. For each city, using only years with more than 100 data points, bias correction parameters were identified from the data of all years except the final year, bias correction was applied to the data of the final year, and the results were compared with the MOEJ-provided hourly WBGT. As a result, across the 842 cities, the MAE decreased from 0.62 to 0.54, the RMSE decreased from 0.79 to 0.71, and the bias decreased from 0.19 to 0.01 after bias correction. The MAE, RMSE, and bias for each city are shown in Fig. 3. In most cities, errors and biases improved significantly. Bias correction using the identified parameters was then applied to the future gridded hourly WBGT data to project the hourly WBGT for the target cities.
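The hold-out validation described above can be sketched as follows. The form of the bias correction is not specified in the text; purely as an assumption for illustration, the sketch uses a linear correction (slope and intercept) fitted by ordinary least squares. The function names and the record layout are likewise hypothetical, and the filter on years with more than 100 data points is omitted for brevity.

```python
def fit_linear_correction(grid_wbgt, city_wbgt):
    """Fit city = slope * grid + intercept by ordinary least squares.

    The linear form is an assumption; the source does not state which
    bias correction parameters were used.
    """
    n = len(grid_wbgt)
    mean_x = sum(grid_wbgt) / n
    mean_y = sum(city_wbgt) / n
    sxx = sum((x - mean_x) ** 2 for x in grid_wbgt)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(grid_wbgt, city_wbgt))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def holdout_validate(records):
    """Fit on all years except the final one and evaluate on the final year.

    records: list of (year, grid_wbgt, city_wbgt) tuples for one city.
    Returns (slope, intercept, mae_before, mae_after) for the final year.
    """
    final_year = max(year for year, _, _ in records)
    train = [(g, c) for year, g, c in records if year != final_year]
    test = [(g, c) for year, g, c in records if year == final_year]
    slope, intercept = fit_linear_correction([g for g, _ in train],
                                             [c for _, c in train])
    mae_before = sum(abs(g - c) for g, c in test) / len(test)
    mae_after = sum(abs(slope * g + intercept - c) for g, c in test) / len(test)
    return slope, intercept, mae_before, mae_after
```

Because the parameters never see the final year, the comparison of errors before and after correction on that year reflects performance on unseen data.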
The overall process for projecting WBGTs for the target cities from climate scenario data across Japan is presented in Fig. 4.
Projection of heat impacts on school sports club activities
School sports club activities in Japan are held mainly after school, from 3 to 6 p.m. Daily activity time is advised to be no more than 2 hours on weekdays and 3 hours on weekends, with at least 2 rest days per week, and to be as short and reasonable as possible (Japan Sports Agency & Agency for Cultural Affairs Japan, 2022). A widely referenced guideline for heat illness prevention in sports activities in Japan, including school club activities, recommends stopping strenuous exercise when the WBGT is 28°C or higher but below 31°C (28°C ≤ WBGT < 31°C) and stopping all exercise when the WBGT is 31°C or higher (31°C ≤ WBGT) (Japan Sport Association, 2019). In Japanese sports club activities, the number of heat illness cases has been reported to correlate well with outdoor WBGTs up to 31°C, regardless of whether the activity takes place indoors or outdoors (Iwashita, 2018). We therefore evaluated whether school sports club activities held for two hours per day between 3 and 6 p.m., five days a week, would be limited by the WBGT under future climate change. The 71st percentile values of the WBGT from 3 to 6 p.m. (corresponding to the highest value on the 5 relatively cool days of a 7-day period) were evaluated as follows: heat level 1 (stop strenuous exercise) when 28°C ≤ WBGT < 31°C for 2 hours or more, heat level 2 (stop all exercise) when 31°C ≤ WBGT for 2 hours or more, and heat level 0 otherwise. A schematic diagram of the heat level evaluation is shown in Fig. 5.
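One possible reading of this evaluation is sketched below. Several details are assumptions not confirmed by the source: the 71st percentile is taken per hour with the nearest-rank method (which, for 7 days, selects the 5th-lowest value), the afternoon is represented by a fixed number of hourly values per day, and hours at or above 31°C also count toward the 2 hours required for heat level 1. All function and parameter names are hypothetical.

```python
import math


def nearest_rank_percentile(values, percent):
    """Nearest-rank percentile: the smallest value with at least
    percent% of the data at or below it (assumed percentile method)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(percent / 100.0 * len(ordered)))
    return ordered[rank - 1]


def heat_level(daily_afternoon_wbgt, level1=28.0, level2=31.0,
               percent=71.0, min_hours=2):
    """Classify a period into heat level 0, 1, or 2.

    daily_afternoon_wbgt: list of days, each a list of hourly WBGT
    values (deg C) between 3 and 6 p.m., same length for every day.
    """
    n_hours = len(daily_afternoon_wbgt[0])
    # Per-hour 71st percentile across the days of the period.
    p_values = [
        nearest_rank_percentile([day[h] for day in daily_afternoon_wbgt], percent)
        for h in range(n_hours)
    ]
    if sum(v >= level2 for v in p_values) >= min_hours:
        return 2  # stop all exercise
    if sum(v >= level1 for v in p_values) >= min_hours:
        return 1  # stop strenuous exercise
    return 0
```

With seven days of data, the nearest-rank 71st percentile ignores the two hottest days, which matches the assumption that activities take place on the five relatively cool days of the week.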
In addition, as a sensitivity analysis, we considered existing findings (Oyama et al., 2024) that lower heat criteria should be used in cooler regions, where the average WBGT from May to October is less than 18°C. In this analysis, the heat impact was projected with the heat level 1 criterion set to 25°C ≤ WBGT < 28°C instead of 28°C ≤ WBGT < 31°C and the heat level 2 criterion set to 28°C ≤ WBGT instead of 31°C ≤ WBGT.
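The lowered criteria can be expressed as a small helper, sketched here under one assumed reading: the 3°C reduction is applied only to cities whose May to October mean WBGT is below 18°C. Whether the sensitivity analysis was restricted to such cities or applied more broadly is not stated in the text, and the function and parameter names are hypothetical.

```python
def heat_thresholds(mean_wbgt_may_oct, cool_region_limit=18.0, reduction=3.0):
    """Return (level 1, level 2) lower bounds in deg C for a city.

    Assumption: the reduced criteria apply only where the May-October
    mean WBGT is below cool_region_limit.
    """
    level1, level2 = 28.0, 31.0
    if mean_wbgt_may_oct < cool_region_limit:
        level1 -= reduction
        level2 -= reduction
    return level1, level2
```

For example, a city with a May to October mean WBGT of 17.5°C would be evaluated against 25°C and 28°C, whereas a city at exactly 18°C would keep the standard 28°C and 31°C criteria.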
Assessment of the effectiveness of countermeasures
As countermeasures against heat, three measures were set, as they are both feasible in school sports club activities and reproducible in this study: A: including early mornings (7 to 9 a.m.) in activity times; B: replacing two of the five outdoor activity sessions with indoor activities (e.g., indoor training, strategy meetings, team building); and C: implementing both A and B. We assessed to what extent these measures reduce the projected heat impact. As in the previous section, a sensitivity analysis was also carried out to evaluate a 3°C reduction in the criteria for heat levels 1 and 2. A schematic diagram of the evaluation of the heat level with such countermeasures is shown in Fig. 6.
Programming environment used in the study
The analysis, modeling, and visualization for this study were conducted in Python and R on Microsoft Windows 11 Pro. To construct the hourly WBGT prediction model from daily weather data and to project future WBGTs in the target cities, we used Python version 3.12 (October 2023, https://www.python.org/downloads/release/python-3120/). To project heat impacts on school sports club activities and assess the effectiveness of countermeasures, we used R version 4.3.2 (October 2023, https://cran.r-project.org/bin/windows/base/old/4.3.2/).