Climate classification (CC) divides Earth’s surface into regions based on the similarity of climatic features. The shift of climatology from the classical approach of describing the characteristics of climatic elements to the modern approach of explaining the formation of climatic phenomena is reflected in CC (Yazawa 1980). In other words, CC methods can be divided into two categories: empirical CC based on classical climatology and genetic CC based on modern climatology.
With the development of synoptic meteorology and weather forecasting, modern climatology has been able to reveal the physical processes behind climatic phenomena and the causal relationships among them. Bergeron (1930) first introduced the concept of an “air mass,” which has since been refined into air mass climatology (Fukui 1962). An air mass is a large-scale atmospheric volume with uniform temperature and humidity that forms over a vast oceanic or continental surface until its properties reach near-equilibrium. An air mass tends to form in association with a large-scale stationary anticyclone. The boundary between different air masses is defined as a transition zone or front, where cyclonic disturbances such as extratropical cyclones frequently occur, develop, and move eastward. Because weather and climate are inherent to an air mass, an air mass can give comprehensive information on a climate and the area covered by that climate (Fukui 1962). However, air mass climatology has two drawbacks (Yoshino 1978). First, determining the area of an air mass in a globally applicable manner is difficult. A global classification tends to ignore local climatic features, whereas a local classification would divide Earth’s surface into many small areas. Second, the difficulty in determining air masses can result in an arbitrary determination of fronts. To overcome these two difficulties, an objective and quantitative approach to determining air masses is required.
Figure 1 shows Alisov’s CC (ACC) (Alisov 1936; Alisov 1954), which aims to understand global climatic features based on air masses. While the Köppen–Geiger CC (KGCC) (Köppen 1918; Köppen and Geiger 1936) is a representative empirical CC system based on differences in the vegetation landscape due to climate, the ACC is a representative genetic CC system based on the mechanisms and physical processes caused by air mass zones and their fronts. The mean positions of large-scale air mass zones and fronts usually shift with the seasons because of seasonal changes in the general atmospheric circulation. The north–south variations of air mass zones and fronts can be used to divide climatic zones into two categories: those covered by the same air mass year round, and those whose air mass changes seasonally. The ACC (Alisov 1954) divides the global climate into four air mass zones according to temperature (i.e., latitude): equatorial, tropical, polar, and Arctic/Antarctic. Then, the differences between air masses in January and July are used to determine seven climatic zones, as listed in Table 1. In this study, the air mass zone was defined as the distribution of each air mass in the summer and winter seasons, and the climatic zone was defined as the spatial extent of a climate superimposed on a single map accounting for seasonal changes. According to the ACC, the Sahara and Arabian Deserts are in a tropical climatic zone covered by a tropical air mass zone with hot and dry characteristics throughout the year. However, the fronts at both latitudinal edges do not correspond to lines of large precipitation, which led Suzuki (1961) to question whether the locations of the fronts were correctly determined. Wadachi (1997) argued that the ACC could not identify areas with a prevailing subtropical high. Despite some shortcomings, the ACC demonstrates the most crucial circulation processes in different climatic zones and may be used as a basis to explain global climatic types (Khlebnikova 2009).
Table 1. Alisov’s climate classification (Alisov 1954)

| Climatic zones | Winter air mass | Summer air mass | Detailed types | Landscape of KGCC |
|---|---|---|---|---|
| 1. Equatorial | Equatorial | Equatorial | (1) Continental; (2) Maritime | Equatorial evergreen forest (Af) |
| 2. Subequatorial | Tropical | Equatorial | (3) Continental; (4) Maritime; (5) Monsoon west-coast; (6) Monsoon east-coast | Savanna (Am); Steppes, Semi-deserts (Aw); Savanna, Equatorial forests (Af) |
| 3. Tropical | Tropical | Tropical | (7) Continental; (8) Maritime; (9) Eastern edge of OA; (10) Western edge of OA | Hot, dry. Tropical deserts (BW); Relatively cold, humid. Deserts (BW); Tropical rain forests (Af) |
| 4. Subtropical | Polar | Tropical | (11) Continental; (12) Maritime; (13) West-coast; (14) Monsoon east-coast | Semi-deserts, Steppes (BS); Mediterranean (Cs); Subtropical humid forests (Cw) |
| 5. Polar (Temperate) | Polar | Polar | (15) Continental; (16) Maritime; (17) Maritime west-coast; (18) Monsoon east-coast | Semi-deserts, Steppes, Coniferous and mixed broadleaf forests (Df); Broadleaf deciduous forests (Cf); Temperate forests, Steppes (Dw) |
| 6. Subarctic | Arctic | Polar | (19) Continental; (20) Maritime | Taiga, Sparse growth of trees (Dw) |
| 7. Arctic / Antarctic | Arctic | Arctic | (21) Arctic; (22) Antarctic | Tundra, Ice fields (ET, EF); Tundra, Ice fields (ET, EF) |

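
The winter and summer air mass columns of Table 1 can be read as a simple lookup from a seasonal pair of air masses to one of the seven climatic zones. The following is a minimal sketch of that reading; the dictionary contents are copied from Table 1, while the names `ZONES` and `climatic_zone` are illustrative assumptions and do not appear in Alisov (1954) or in this study.

```python
# Winter/summer air mass pairs copied from Table 1 (Alisov 1954).
# The names ZONES and climatic_zone are illustrative, not from the source.
ZONES = {
    ("Equatorial", "Equatorial"): "Equatorial",
    ("Tropical", "Equatorial"): "Subequatorial",
    ("Tropical", "Tropical"): "Tropical",
    ("Polar", "Tropical"): "Subtropical",
    ("Polar", "Polar"): "Polar (Temperate)",
    ("Arctic", "Polar"): "Subarctic",
    ("Arctic", "Arctic"): "Arctic / Antarctic",
}


def climatic_zone(winter_air_mass, summer_air_mass):
    """Return the Table 1 climatic zone for a (winter, summer) pair.

    Returns None when the pair does not appear in Table 1.
    """
    return ZONES.get((winter_air_mass, summer_air_mass))
```

For example, `climatic_zone("Polar", "Tropical")` gives the subtropical zone, in which a polar air mass prevails in winter and a tropical air mass in summer.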
Many CCs based on climatic causes have the disadvantage of sometimes not corresponding to actual climatic conditions (Nishina 2019). The subtropical and polar climatic zones in the ACC have been criticized for their low correspondence with the vegetation landscape of arid and temperate climates in the KGCC (Khlebnikova 2009). Although the KGCC is based on vegetation rather than the actual climate, it is widely used to explain agricultural and cultural regions as one of the CC systems that best reflect real climate differences (Nishina 2019). The latest revisions to the KGCC (Kottek et al. 2006; Peel et al. 2007; Kriticos et al. 2012; Beck et al. 2018) refer to terrestrial climatic variables (i.e., monthly temperature and precipitation) to approximate the vegetation distribution on land but still exclude climatic zones over the ocean. Because the KGCC is based on vegetation distribution, it is inherently unable to classify the climate over oceans.
Alisov (1954) attempted to subdivide the seven major climatic zones into 22 smaller regions using surface conditions such as continent vs. ocean, eastern vs. western, and plains vs. mountains (Table 1). However, he did not present the figures or the methodology behind these detailed types (Mizukoshi and Yamashita 1985), probably because data with sufficiently long periods and high spatial resolution were unavailable in the 1950s. Such subdivision is now possible with the sophisticated data presently available. Brunschweiler (1957) divided air masses into continental and maritime types and investigated the monthly occurrence frequency of individual air masses by analyzing daily data at significant sites in the Northern Hemisphere. He then used the monthly changes in air mass areas to redefine the annual mean distribution of air masses in the Northern Hemisphere. Although Brunschweiler’s methodology is clear, some researchers have questioned the rationale for using air mass frequencies of 80%, 50%, and 20% (Suzuki 1961). Oliver (1970) classified Australia in a similar manner using the annual prevailing air masses.
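
A monthly occurrence frequency of the kind described above can be computed from daily air mass labels at a site. The sketch below is only an illustration under stated assumptions: the labels and the 30-day record are made up, the function names are hypothetical, and the way the 80%, 50%, and 20% levels are applied here (a simple comparison) is an assumption, since the text does not describe how Brunschweiler (1957) used them.

```python
from collections import Counter


def monthly_frequency(daily_labels):
    """Fraction of days in one month on which each air mass type occurred.

    daily_labels holds one air mass label per day at a single site.
    """
    n_days = len(daily_labels)
    if n_days == 0:
        return {}
    counts = Counter(daily_labels)
    return {label: count / n_days for label, count in counts.items()}


def levels_reached(frequency, thresholds=(0.8, 0.5, 0.2)):
    """Return the thresholds that a frequency meets or exceeds.

    The default levels are the 80%, 50%, and 20% frequencies named in the
    text; comparing against them this way is an assumption.
    """
    return [t for t in thresholds if frequency >= t]


# Hypothetical 30-day record at one site (made-up labels, not observations).
january = ["cP"] * 18 + ["mP"] * 9 + ["mT"] * 3
freq = monthly_frequency(january)
```

With this made-up record, the air mass labeled `cP` occurs on 18 of 30 days, so it reaches the 50% and 20% levels but not the 80% level.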
In recent years, data from weather observatories worldwide have been exploited, and several studies have tried to apply machine learning to cluster such data for global CC (Mahlstein and Knutti 2010; Zscheischler et al. 2012; Metzger et al. 2013; Zhang and Yan 2014; Rohli et al. 2015; Netzel and Stepinski 2016; Sathiaraj et al. 2019). Most of these studies focused on reproducing the revised KGCC (Kottek et al. 2006; Peel et al. 2007) or comparing their results with it. Rohli et al. (2015) used global reanalysis data to extend the KGCC over the whole Earth. Netzel and Stepinski (2016) showed that an information-theoretic measure of clustering called the V-measure (Rosenberg and Hirschberg 2007) could be used to quantitatively assess the homogeneity within a climatic type and the differences between climatic types. Sathiaraj et al. (2019) compared three clustering techniques in terms of their ability to identify climatic types in the United States: K-means (MacQueen 1967), DBSCAN (Ester et al. 1996), and BIRCH (Zhang et al. 1996). Sathiaraj et al. (2019) also showed that data clustering by machine learning can help geographers and climatologists assess and evaluate long-term temporal and spatial changes in climate. Other studies have used clustering techniques to classify upper-level air masses (Vrac et al. 2012; Pernin et al. 2016; Watanabe et al. 2020). These studies dealt with non-stationary air masses on synoptic timescales. However, all of the above studies determined the number of clusters or air masses for classification subjectively.
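
For reference, the V-measure of Rosenberg and Hirschberg (2007) is the harmonic mean of homogeneity and completeness, both of which are defined through conditional entropies of two labelings. The following is a minimal pure-Python sketch of that definition with equal weighting of the two terms; it is not the code used by Netzel and Stepinski (2016), and the function names and toy labels are assumptions made for illustration.

```python
import math
from collections import Counter


def _entropy(labels):
    """Shannon entropy H(labels) in nats."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def _conditional_entropy(targets, given):
    """Conditional entropy H(targets | given) in nats."""
    n = len(targets)
    joint = Counter(zip(given, targets))
    marginal = Counter(given)
    return -sum((c / n) * math.log(c / marginal[g])
                for (g, _), c in joint.items())


def v_measure(classes, clusters):
    """Harmonic mean of homogeneity and completeness (equal weights)."""
    h_classes = _entropy(classes)
    h_clusters = _entropy(clusters)
    # Homogeneity: each cluster contains members of a single class.
    homogeneity = (1.0 if h_classes == 0
                   else 1.0 - _conditional_entropy(classes, clusters) / h_classes)
    # Completeness: all members of a class fall in the same cluster.
    completeness = (1.0 if h_clusters == 0
                    else 1.0 - _conditional_entropy(clusters, classes) / h_clusters)
    if homogeneity + completeness == 0:
        return 0.0
    return 2 * homogeneity * completeness / (homogeneity + completeness)
```

Two labelings that differ only in the names of the groups, such as `[0, 0, 1, 1]` and `[1, 1, 0, 0]`, score 1, whereas placing all grid cells in a single cluster when two classes exist scores 0 because homogeneity vanishes.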
In this study, we applied data clustering by machine learning to global reanalysis data in an attempt to objectively determine the global air mass distribution and develop a revised genetic CC system based on air masses. Because reliable, objective reanalysis data with high spatiotemporal resolution are now available for the whole Earth over a period of more than 40 years, they can be used to revise the genetic CC system so that it reflects actual climatic conditions more accurately. This allows us to classify the global climate in a data-driven manner by focusing on climatic causes (i.e., air masses and fronts) instead of observed responses such as precipitation. Thus, this study revises the classical ACC for the first time in almost 70 years. We subdivide air mass zones into regions that distinguish between land and ocean and that reflect east–west differences. Objective estimation of the global distribution of large-scale air masses, including over the oceans, will lead to a better understanding of future climate change and will be helpful for determining appropriate mitigation and adaptation measures (Mahlstein and Knutti 2010).
The remainder of this paper is organized as follows. Section 2 explains the data and methodology. Section 3 presents the global distribution of air mass zones after the optimal number of clusters is determined and compares the genetic CC of this study with the conventional ACC. Section 4 summarizes the findings of this study and briefly discusses remaining issues.