Providing a comfortable space for customers is an important requirement for airlines, and the avoidance and mitigation of aircraft shaking have always been crucial. Turbulence is often the cause of aviation accidents [1, 2]. In addition, the potential increase in aircraft turbulence owing to the effects of global warming is a prevalent concern.
When a pilot reports encountering one or more instances of severe turbulence during a flight, the corresponding aircraft must undergo maintenance work to confirm its airworthiness. Turbulence therefore remains a major issue for airlines. In addition, if the maximum recorded acceleration exceeds the operational acceleration limit of the aircraft, the scope of the maintenance work increases considerably, which significantly impacts aircraft operation schedules. Severe turbulence should thus be avoided as much as possible. However, because turbulence reporting relies primarily on the judgment of individual pilots, the reports inevitably vary from pilot to pilot.
Most existing research concerning turbulence prediction has been performed from a meteorological viewpoint [4, 5]. In these studies, data were acquired in real time from many sensors and analyzed using a time-series approach. Although turbulence forecasting with pinpoint accuracy is desirable, it is expensive and infeasible for airlines to install and maintain the sensor infrastructure necessary to achieve this. In recent years, owing to the accumulation of aviation data and improvements in computational speed, the concept of turbulence prediction via machine learning has been introduced [7, 8]. However, studies on this subject are limited, and it remains difficult to determine an optimal machine learning approach for turbulence prediction. Open data (such as meteorological data) should be utilized to improve analysis accuracy; this could aid in developing turbulence predictions that can be logically deduced from the data provided by the airlines.
In this paper, we propose a method for predicting turbulence occurrence to contribute to the safe and comfortable operation of aircraft. Figure 1 outlines this method, which involves the accumulation and aggregation of open data and quick access recorder (QAR) data [9, 10], and the prediction of turbulence using machine learning methods, the results of which are fed back to airlines and pilots. Flights to and from Matsumoto Airport in Japan, on E-170 aircraft operated by Fuji Dream Airlines (FDA), frequently experience turbulence during the winter season. In this study, we consider Matsumoto Airport as the model airport representing mountainous areas subject to turbulence. This technique can also be adapted to other airports.
For our study, we used meteorological data from Japan and turbulence information provided by FDA. Because turbulence is a relatively rare event, we first estimated the risk cluster. To this end, we performed a principal component analysis (PCA) of the meteorological data to obtain a projection matrix \(W\) that reduces the number of dimensions of the data to be analyzed. Subsequently, using the turbulence-occurrence indicator and the meteorological data transformed by \(W\), we calculated the risk cluster using the k-means method. We then used this risk cluster to predict, through support vector classification (SVC), the days with turbulence risk in the meteorological data from 2019. The results based on these meteorological data revealed that the prediction method accurately identified the days with a risk of turbulence.
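The three steps described above (PCA, k-means, SVC) can be illustrated with a minimal sketch. It is not the implementation used in this study: it assumes scikit-learn and NumPy, uses synthetic data in place of the meteorological and FDA data, and the number of features, components, and clusters are arbitrary choices. In particular, taking the risk cluster to be the k-means cluster with the highest turbulence rate is only one possible reading of how the turbulence-occurrence indicator enters the clustering step.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in for daily meteorological features (hypothetical data):
# 300 days x 6 features, with roughly 10% of days flagged as turbulent.
n_days, n_features = 300, 6
turbulence = rng.random(n_days) < 0.1        # turbulence-occurrence indicator
X = rng.normal(size=(n_days, n_features))
X[turbulence] += 2.5                         # turbulent days follow shifted weather

# Step 1: PCA yields the projection matrix W (features -> 2 components).
pca = PCA(n_components=2).fit(X)
W = pca.components_.T                        # shape (n_features, 2)
Z = pca.transform(X)                         # same as (X - pca.mean_) @ W

# Step 2: k-means on the projected data. Assumption: the "risk cluster" is
# the cluster whose member days have the highest turbulence rate.
n_clusters = 3
kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(Z)
rates = [turbulence[kmeans.labels_ == k].mean() for k in range(n_clusters)]
risk_cluster = int(np.argmax(rates))
risk_label = (kmeans.labels_ == risk_cluster).astype(int)

# Step 3: SVC learns to assign a day to the risk cluster (1) or not (0).
svc = SVC(kernel="rbf").fit(Z, risk_label)

# Apply the fitted projection and classifier to unseen days
# (a stand-in for the 2019 meteorological data).
X_new = rng.normal(size=(30, n_features))
X_new[:5] += 2.5
pred = svc.predict(pca.transform(X_new))     # 1 = day with turbulence risk
```

Training the classifier on risk-cluster membership rather than directly on the turbulence indicator gives it a less imbalanced target, which is one way to handle the rarity of turbulence events noted above.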
We believe that this study will promote the integration and utilization of open data (such as meteorological data) and the aviation data accumulated by airlines; we demonstrated the possibility of calculating logical criteria for determining turbulence using machine learning. We expect that our research will not only improve the safety of aircraft operations but also support the development of human resources by providing a guide for making safety decisions, thereby promoting the effective utilization of aviation data.