3.1 Descriptive statistics
Describing single variables, frequencies, and correlation coefficients is a standard technique in statistical analysis [32]. Descriptive statistical analysis of the data was conducted by calculating the mean, standard deviation, minimum, and maximum of all variables (Table 2). This information was determined for each of the independent variables, in underground and surface mines, using data from 1986 to 2018. For underground mines, 93% of observations were in the Appalachian region, 3% in the Western region, and the rest in the Interior region. The majority (about 80%) of mines were small. The highest number of CWP cases in the observations was 946 (Boone County, West Virginia). For surface mines, approximately 86% of mines were in the Appalachian region, 9% in the Interior region, and the rest (5%) in the Western region. As with underground mines, the majority of surface mines (86%) were classified as small. The highest number of CWP cases in the observations was 175 (Logan County, West Virginia). The descriptive statistics of the CWP rate for the various independent variables are discussed below.
Table 2 Descriptive statistics for variables in underground and surface coal mines in the U.S., 1986-2018.
Underground mines: # of observations = 29,707. Surface mines: # of observations = 32,643.

| Variable | Underground Mean | Underground Std. Dev. | Underground Min | Underground Max | Surface Mean | Surface Std. Dev. | Surface Min | Surface Max |
|---|---|---|---|---|---|---|---|---|
| Number of CWP | 0.18 | 1.52 | 0 | 101 | 0.02 | 0.40 | 0 | 34 |
| Number of CWP per Total hour | 0.00 | 0.09 | 0 | 10 | 0.00 | 0.03 | 0 | 5 |
| Appalachian region | 0.93 | 0.25 | 0 | 1 | 0.86 | 0.34 | 0 | 1 |
| Interior region | 0.03 | 0.18 | 0 | 1 | 0.09 | 0.28 | 0 | 1 |
| Western region | 0.03 | 0.18 | 0 | 1 | 0.05 | 0.22 | 0 | 1 |
| Small size | 0.79 | 0.41 | 0 | 1 | 0.86 | 0.35 | 0 | 1 |
| Medium size | 0.09 | 0.29 | 0 | 1 | 0.07 | 0.26 | 0 | 1 |
| Large size | 0.12 | 0.33 | 0 | 1 | 0.07 | 0.26 | 0 | 1 |
| Thin seam height (≤40”) | 0.27 | 0.45 | 0 | 1 | 0.49 | 0.50 | 0 | 1 |
| Medium seam height (>40 & ≤75”) | 0.59 | 0.49 | 0 | 1 | 0.36 | 0.48 | 0 | 1 |
| Thick seam height (>75”) | 0.14 | 0.35 | 0 | 1 | 0.15 | 0.36 | 0 | 1 |
| Bituminous coal rank | 0.96 | 0.20 | 0 | 1 | 0.93 | 0.25 | 0 | 1 |
| Anthracite coal rank | 0.04 | 0.20 | 0 | 1 | 0.07 | 0.25 | 0 | 1 |
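As a minimal sketch of how the summary columns in Table 2 can be computed, the Python snippet below summarizes one column of observations. The function names `describe` and `indicator_sd` and the sample values are illustrative assumptions, not the study's data or code. For a 0/1 indicator such as "Appalachian region", the mean is simply the share of observations in that category, and for large samples the standard deviation is close to sqrt(p(1 - p)); for p = 0.93 this gives about 0.25, in line with the underground row of Table 2.

```python
import math
import statistics


def describe(values):
    """Return mean, sample standard deviation, minimum, and maximum."""
    return {
        "mean": statistics.mean(values),
        "std": statistics.stdev(values),  # sample (n - 1) standard deviation
        "min": min(values),
        "max": max(values),
    }


def indicator_sd(p):
    """Large-sample standard deviation of a 0/1 indicator with mean p."""
    return math.sqrt(p * (1.0 - p))


# Hypothetical 0/1 indicator column (1 = observation is in the region).
indicator = [1, 1, 1, 0]
summary = describe(indicator)

# For a binary indicator the mean equals the proportion of ones.
proportion = sum(indicator) / len(indicator)
```

The same `describe` call applies unchanged to count variables such as the number of CWP cases per observation.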
Mine operation type: Mining method is a significant determinant of RCMD exposure [3]. RCMD composition differs between surface and underground mines because of the different operations involved [6,37]. Underground coal workers are at a higher risk of CWP than surface miners because of the confined working space and the limitations of artificial ventilation systems. Analysis of CWP by mine operation shows that, out of 7,337 CWP cases, the majority (76%) were reported in underground mines. Approximately 11% of the CWP cases were reported in surface mine operations. The remaining 13% of cases were workers in other mine operations, mainly mills or preparation plants (Fig. 2a). As shown by the chronological data of CWP cases in U.S. coal mines (Fig. 2b), the rate of CWP decreased following the implementation of the permissible exposure limit (PEL). However, there was an increase in the prevalence of CWP in the 1990s.
Geographic location: The geographic location of coal mines is an important factor in assessing the RCMD health risk. Regional variations in dust characteristics exist because coal mines in the U.S. are geographically clustered. In Central Appalachia, for instance, mines may have more rock-strata-sourced dust than mines in other regions [14]. Amandus' study showed that coal workers in the eastern part of the Appalachian coal field, including West Virginia and Pennsylvania, are at a higher risk of CWP than workers in western U.S. mines [38]. Sarver’s findings supported the hypothesis that RCMD characteristics differ substantially among mining regions. Understanding the differences in the mineral and elemental composition of RCMD, as well as in its particle size distribution, among geographic locations sheds light on the recent CWP resurgence [14].
From 1986 to 2018, a total of 106 counties in 16 states across the country reported coal miners with CWP. The number and distribution of CWP cases among coal miners in different states and counties are shown in Fig. 3. The hot spot area, which includes West Virginia, Kentucky, Virginia, and Pennsylvania, reported a higher number of CWP cases.
Mine size: Several studies have identified mine size (defined by the number of underground miners employed) as a predictor of CWP risk among U.S. underground coal miners [1,5,19,41,42]. These studies indicated that working in small mines is associated with an increased risk of CWP, but it was unknown whether abnormal lung function is also linked to mine size. Blackley et al. (2014) showed that mine size significantly affects CWP prevalence and lung function abnormality [10]. The spirometry and radiographic analysis of 3,770 coal miners in their study showed a higher risk of abnormal spirometry (18.5% vs. 13.8%, p<0.01), CWP (10.8% vs. 5.2%, p<0.01), and progressive massive fibrosis (2.4% vs. 1.1%, p<0.01) in miners working in small mine operations compared with those in large operations. They also concluded that coal workers in small mines (i.e., fewer than 50 employees) in Kentucky, Virginia, or West Virginia are at a higher risk of CWP than those in large mines. Suarthana et al. (2012) found an association between decreasing mine size and the prevalence of CWP and PMF among coal miners in the U.S. One possible explanation is that smaller mines may have fewer health and safety resources than larger operations [1]. Moreover, previous investigations indicated that average RCMD concentrations are higher in small mines [1,10-12].
Size analysis of coal mines in the U.S. revealed that most underground and surface mines are small (Fig. 4a). The distribution of CWP by mine size indicated that the number of CWP cases in large mines is higher than that in small and medium mines. However, the total number of CWP cases in each size class does not necessarily show that workers in large mines are at a higher risk of CWP. Therefore, the rate of CWP per number of employees was calculated to compare the prevalence of CWP across mine sizes. The data showed that the rate of CWP in small mines is higher than that in medium and large mines (Fig. 4b).
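The distinction between raw counts and rates can be sketched in a few lines of Python. The totals below are made-up numbers chosen only to illustrate how the size class with the most cases need not be the one with the highest rate per employee; they are not the study's data, and the helper `cwp_rate` is a hypothetical name.

```python
def cwp_rate(cases, employees):
    """CWP cases per employee for one mine-size class."""
    if employees <= 0:
        raise ValueError("employees must be positive")
    return cases / employees


# Hypothetical totals per mine-size class: (CWP cases, employees).
# These numbers are invented for illustration only.
totals = {
    "small": (30, 1_000),
    "medium": (40, 4_000),
    "large": (90, 20_000),
}

rates = {size: cwp_rate(cases, n) for size, (cases, n) in totals.items()}

# Class with the largest raw count versus class with the largest rate.
most_cases = max(totals, key=lambda size: totals[size][0])
highest_rate = max(rates, key=rates.get)
```

With these illustrative inputs the large class has the most cases while the small class has the highest rate, which is the pattern the normalization is meant to expose.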
Coal seam thickness: Coal seam thickness is one of the potential contributing factors that could influence the prevalence of CWP among coal miners [9,10,19]. Seam height in coal mines varies with the geographic location and geological properties of the coal reserves. Suarthana et al. (2012) reported that the average coal seam thickness in central Appalachian mines is lower than that in other regions. Furthermore, it was concluded that CWP and abnormal lung function prevalence were likely associated with low seam height and small mine size in the U.S. [9,19].
The distribution of surface and underground mines across three coal seam thickness classes was examined (Fig. S1 in SI). Fewer than 11% of mines had thick coal seams; the majority had either thin or medium seam heights. In underground mines, medium seam height was dominant (44%), while the largest share (50%) of surface mines operated in thin coal seams.
The distribution of CWP by coal seam height and mine size for both underground and surface coal mines was subsequently studied. The results showed that the CWP rate is higher in underground mines operating medium seams than in mines operating thin or thick coal seams (Fig. 5a). In surface mines, however, the CWP rate was higher for thick seams than for thin and medium seams (Fig. 5b). Regardless of coal seam thickness, the majority of CWP cases in underground and surface mines were reported in small mines.
Coal rank: Several studies confirmed that the risk of CWP is higher for higher-rank coal, even at the same level of RCMD concentration [11,17]. Gamble et al. (2011) proposed higher-rank coal as a plausible factor for CWP prevalence within the Appalachian region [17,18]. In many bituminous coal mines, the higher prevalence of CWP has also been linked to a higher quartz content in respirable dust [8]. Previous studies indicated that the apparent link between coal rank and CWP may be attributed to the particle surface charge and mineralogical composition of RCMD [34,35]. However, the causal effects of coal rank have not been exclusively investigated.
The distribution of U.S. surface and underground coal mines by coal rank showed that bituminous mines account for about 95% of coal operations (Fig. 6a). Compared with surface anthracite mines (59 mines), only a few active underground anthracite mines (9 mines) existed in 2018. The distribution of CWP by coal rank indicated that bituminous coal mines account for about 95% of the CWP rate (Figs. 6b and c).
3.2 Regression Analysis
A correlation analysis was performed (at three significance levels: 0.01, 0.05, and 0.1) to determine the presence and strength of correlations among the variables considered in this study. It indicated no strong correlation between the independent variables (Tables S.1 and S.2 in Supplemental information). Thus, these variables can be used in multivariate regression modeling. Regression analysis of the relationship between the CWP rate and the identified contributing factors was carried out using a GEE model. Table 3 shows the main results of the GEE analysis, which was used to test the hypotheses on the relationship between CWP and the independent variables (described in section 3.3).
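A pairwise screen of this kind can be sketched as follows. This is only an illustration under stated assumptions: the column names and values are hypothetical, the 0.8 cut-off for a "strong" correlation is an assumed threshold that the text does not specify, and the sketch flags pairs by magnitude only rather than testing significance at the three levels mentioned above.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant column")
    return sxy / math.sqrt(sxx * syy)


def strongly_correlated_pairs(columns, threshold=0.8):
    """List (name_a, name_b, r) for pairs with |r| >= threshold.

    The default threshold is an assumption made for this sketch.
    """
    names = list(columns)
    flagged = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson_r(columns[a], columns[b])
            if abs(r) >= threshold:
                flagged.append((a, b, r))
    return flagged


# Hypothetical predictor columns, not taken from the study.
columns = {
    "a": [1, 2, 3, 4],
    "b": [2, 4, 6, 8],    # exact multiple of "a"
    "c": [1, -1, -1, 1],  # uncorrelated with "a" and "b"
}
flagged = strongly_correlated_pairs(columns)
```

An empty result from such a screen corresponds to the situation described above, in which all predictors can enter the regression together.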
H1: It was hypothesized that workers in underground mines are more susceptible to developing CWP than those in other operations. The regression analysis showed that workers in underground coal mines are at a higher risk of CWP than surface coal miners (β=4.010, p<0.01). It also showed that workers in other mine operations (including mills and preparation plants) are at a higher risk of CWP than workers at surface mines (β=2.706, p<0.05). Therefore, H1 is supported, and mine operation type is a significant factor contributing to CWP prevalence.
H2: Geographical location was hypothesized to be a contributing factor to the prevalence of CWP. The regression analysis showed a significant positive coefficient for both the Appalachia (including West Virginia, Kentucky, Pennsylvania, Virginia, Alabama, Tennessee, Maryland, and Ohio) and Interior (including Illinois and Indiana) regions compared with the Western (including Wyoming, Texas, New Mexico, Oklahoma, Utah, and Colorado) region. Compared with the Western region, underground coal workers in both the Appalachia (β=4.407, p<0.01) and Interior (β=3.750, p<0.01) regions are at a greater risk of CWP. Therefore, H2 is supported for underground coal mines. The regression results for surface mines, with the Western region as the reference, showed that surface coal workers in the Appalachian region are at a higher risk of CWP (β=5.101, p<0.01). The result for the Interior region was not statistically significant. Therefore, H2 is supported only for Appalachia vs. Western surface coal mines.
H3: The third hypothesis investigated how mine size could influence the prevalence of CWP among coal miners. We categorized mine size based on the average number of employees in each mine (small: less than 50; medium: between 50 and 100; large: more than 100). The results of the statistical analysis indicated that underground coal workers in small mines are at a higher risk of CWP than workers at medium (β=-1.961, p<0.01) and large (β=-1.879, p<0.01) mines. Therefore, H3 is supported for underground mines. In surface mines, coal workers in small mines are at a higher risk of CWP than workers in medium-size mines (β=-1.277, p<0.1). The results were not statistically significant for large operations. Therefore, H3 is supported only for medium vs. small surface coal mines.
H4: It was hypothesized that coal rank contributes to the CWP incidence rate. The statistical analysis showed a significant relationship between CWP rates and bituminous coal rank in underground mines: underground bituminous coal miners are at a higher risk of CWP than underground anthracite coal miners (β=7.383, p<0.01). Therefore, H4 is supported for underground mines. In surface mines, by contrast, anthracite coal miners are at a higher risk of CWP than bituminous coal miners (β=-1.476, p<0.01). Therefore, H4 is also supported for surface mines. It should be noted that the MSHA database classifies coal rank only as bituminous or anthracite. Only 0.3% of coal production in 2018 came from anthracite [19].
H5: Finally, a hypothesis examined how coal seam thickness could influence the prevalence of CWP among coal miners. Seam thickness was categorized into three groups based on the average seam thickness in each mine (thin: seam height ≤ 40”; medium: 40” < seam height ≤ 75”; and thick: seam height > 75”). The GEE result indicated that coal workers in underground mines operating thin (β=1.416, p<0.1) and medium (β=1.397, p<0.05) seams are at a higher risk of CWP than those working in thick-seam underground operations. Therefore, H5 is supported for underground coal mines. The regression results for surface mines show that no conclusion can be drawn for coal workers in thin-seam surface operations, but coal workers in medium-seam operations (β=-1.969, p<0.01) are at a lower risk of CWP than workers in thick-seam mines. Therefore, H5 is not supported for medium vs. thick-seam surface coal mines. The results of hypothesis testing are summarized in Table 4.
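The categorical predictors used in H2 to H5 can be encoded as sketched below. The class boundaries follow the definitions given in the text; treating "between 50 and 100" as inclusive at both ends is an assumption, as are the function names and the three example mines. Each category becomes a 0/1 indicator except the reference level (small size, thick seam), so every coefficient is read as a contrast with that reference.

```python
def classify_mine_size(avg_employees):
    """Small: < 50; medium: 50 to 100 (assumed inclusive); large: > 100."""
    if avg_employees < 0:
        raise ValueError("average number of employees cannot be negative")
    if avg_employees < 50:
        return "small"
    if avg_employees <= 100:
        return "medium"
    return "large"


def classify_seam_height(avg_height_in):
    """Thin: <= 40 in; medium: > 40 and <= 75 in; thick: > 75 in."""
    if avg_height_in <= 0:
        raise ValueError("seam height must be positive")
    if avg_height_in <= 40:
        return "thin"
    if avg_height_in <= 75:
        return "medium"
    return "thick"


def dummy_code(values, reference):
    """One 0/1 column per level other than the reference level."""
    levels = sorted(set(values) - {reference})
    return {lvl: [1 if v == lvl else 0 for v in values] for lvl in levels}


# Hypothetical mines: (average employees, average seam height in inches).
mines = [(12, 36.0), (75, 60.0), (240, 90.0)]
sizes = [classify_mine_size(emp) for emp, _ in mines]
seams = [classify_seam_height(height) for _, height in mines]

# Reference levels match those used in the regression: small size, thick seam.
size_dummies = dummy_code(sizes, reference="small")
seam_dummies = dummy_code(seams, reference="thick")
```

A mine in the reference category has zeros in all of its indicator columns, which is why no coefficient is reported for the reference rows of Table 3.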
Table 3 GEE estimation results. Values are coefficients; standard errors are in parentheses.
| Category | Variables | Underground CWP-rate | Surface CWP-rate |
|---|---|---|---|
| Mine operation | Surface (reference) | | |
| | Underground | 4.010** (0.73) | |
| | Other | 2.706** (0.72) | |
| Geographic location | Western (reference) | | |
| | Appalachia | 4.407*** (0.493) | 4.407*** (0.493) |
| | Interior | 3.750*** (0.670) | 3.750*** (0.670) |
| Mine size | Small (reference) | | |
| | Medium | -1.961*** (0.639) | -1.961*** (0.639) |
| | Large | -1.879*** (0.563) | -1.879*** (0.563) |
| Coal rank | Anthracite coal (reference) | | |
| | Bituminous coal | 7.383*** (1.075) | 7.383*** (1.075) |
| Coal seam thickness | Seam height >75” (reference) | | |
| | Seam height ≤40” | 1.416* (0.791) | 1.416* (0.791) |
| | Seam height >40” and ≤75” | 1.397** (0.554) | 1.397** (0.554) |
| Model Parameters | Constant | -21.027*** (1.554) | -21.027*** (1.554) |
| | Observations | 29,707 | 29,707 |
| | Year | 1986-2018 | 1986-2018 |
| | Wald Chi2 | 996.48*** | 996.48*** |
*** p<0.01, ** p<0.05, * p<0.1
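The relationship between a coefficient, its standard error, and the significance stars in the legend can be sketched as follows. The sketch assumes a two-sided Wald z-test with a standard normal reference distribution; the text does not state how the reported p-values were obtained, so this is an illustration rather than the authors' procedure. The function names are hypothetical, and the example inputs are coefficient and standard-error pairs taken from the underground column of Table 3.

```python
import math


def wald_p_value(coef, std_err):
    """Two-sided p-value for z = coef / std_err under a standard normal."""
    if std_err <= 0:
        raise ValueError("standard error must be positive")
    z = coef / std_err
    # P(|Z| > |z|) = erfc(|z| / sqrt(2)) for a standard normal Z.
    return math.erfc(abs(z) / math.sqrt(2.0))


def stars(p):
    """Map a p-value to the legend: *** p<0.01, ** p<0.05, * p<0.1."""
    if p < 0.01:
        return "***"
    if p < 0.05:
        return "**"
    if p < 0.1:
        return "*"
    return ""


# Coefficient and standard error pairs from the underground column of Table 3.
appalachia = stars(wald_p_value(4.407, 0.493))
medium_seam = stars(wald_p_value(1.397, 0.554))
thin_seam = stars(wald_p_value(1.416, 0.791))
```

Under this assumption the three example rows reproduce the stars printed in Table 3 (three, two, and one, respectively).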
Table 4 Summary of hypothesis testing
| Hypothesis | Underground mine | Surface mine |
|---|---|---|
| H1: Workers in underground coal mines are more susceptible to CWP | Supported | |
| H2: The coal region contributes to CWP incidence rate | Supported | Supported only for Appalachia vs. Western |
| H3: Workers in smaller operations are more susceptible to lung diseases | Supported | Supported only for medium vs. small |
| H4: The coal rank contributes to the CWP incidence rate | Supported | Supported |
| H5: Workers in thin-seam mine operations are more susceptible to the CWP incidence | Supported | Supported |
To examine the accuracy of the regression model results, variance inflation factor (VIF) and homoscedasticity analyses were conducted. VIF identifies multicollinearity in regression models. Multicollinearity exists when the independent variables in a regression model are correlated with one another; this is a concern because the predictors are assumed to be independent, and correlation among them can distort the estimation results. All VIF scores for the dataset were below 5 (mean score of 1.55) (Table S3), indicating that the no-multicollinearity requirement was met.
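For reference, the VIF of predictor j is 1 / (1 - R_j²), where R_j² is the coefficient of determination from regressing predictor j on the remaining predictors; equivalently, it is the j-th diagonal element of the inverse of the predictors' correlation matrix. The sketch below uses the second form with standard-library Python only. The function names and the example columns are hypothetical and do not reproduce the values in Table S3.

```python
import math


def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equal-length columns."""
    n = len(columns[0])
    centered = []
    for col in columns:
        mean = sum(col) / n
        centered.append([v - mean for v in col])
    norms = [math.sqrt(sum(v * v for v in col)) for col in centered]
    if any(norm == 0 for norm in norms):
        raise ValueError("constant columns have no defined correlation")
    k = len(columns)
    return [
        [sum(a * b for a, b in zip(centered[i], centered[j]))
         / (norms[i] * norms[j]) for j in range(k)]
        for i in range(k)
    ]


def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting."""
    k = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(k)]
           for i, row in enumerate(matrix)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix: perfect collinearity")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]


def vif(columns):
    """VIF of each predictor: diagonal of the inverse correlation matrix."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[j][j] for j in range(len(columns))]


# Hypothetical predictors; the first two have a correlation of 0.8.
x1 = [1, 2, 3, 4]
x2 = [1, 3, 2, 4]
x3 = [1, -1, -1, 1]
vif_correlated = vif([x1, x2])
vif_uncorrelated = vif([x1, x3])
```

A VIF of 1 means a predictor is uncorrelated with the others; the cut-off of 5 used in the text is a common rule of thumb for acceptable collinearity.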
Homogeneity of variance of the residuals is one of the main assumptions of GEE: the variance of the residuals should be roughly equal across all predicted values of the dependent variable. When this holds, the regression predictions are unbiased, consistent, and accurate [19, 30]. Homoscedasticity was tested using the Breusch-Pagan test (Table 5). The p-value was statistically significant at the 0.01 level; therefore, the null hypothesis of homoscedasticity was rejected. To account for heteroscedasticity, robust standard errors were used in the GEE model [29, 36].
Table 5 Homoscedasticity test results
| Test | Chi-square | Pr > ChiSq | Variables |
|---|---|---|---|
| Breusch-Pagan | 192025.76 | < 0.0001 | Cross of all variables |
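The idea behind a Breusch-Pagan-type test can be sketched in standard-library Python. The text does not state which variant or software was used, so this sketch makes several assumptions: it uses the studentized (Koenker) form, in which the statistic is n times the R² from regressing the squared OLS residuals on the predictor, and it is restricted to a single predictor, so the statistic has one degree of freedom. The function names and the data are hypothetical and are not meant to reproduce Table 5.

```python
import math


def simple_ols(x, y):
    """Intercept and slope of y = a + b * x by least squares."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((v - mx) ** 2 for v in x)
    if sxx == 0:
        raise ValueError("x must not be constant")
    slope = sum((u - mx) * (v - my) for u, v in zip(x, y)) / sxx
    return my - slope * mx, slope


def r_squared(x, y):
    """Coefficient of determination of the simple regression of y on x."""
    a, b = simple_ols(x, y)
    my = sum(y) / len(y)
    ss_tot = sum((v - my) ** 2 for v in y)
    if ss_tot == 0:
        return 0.0
    ss_res = sum((v - (a + b * u)) ** 2 for u, v in zip(x, y))
    return 1.0 - ss_res / ss_tot


def breusch_pagan_1d(x, y):
    """Return (LM statistic, p-value) for one predictor (Koenker form)."""
    a, b = simple_ols(x, y)
    squared_residuals = [(v - (a + b * u)) ** 2 for u, v in zip(x, y)]
    lm = max(len(x) * r_squared(x, squared_residuals), 0.0)
    # Chi-square survival function with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(lm / 2.0))
    return lm, p_value


# Hypothetical data: the error size grows with x in the first series
# and stays constant in the second.
x = list(range(1, 21))
y_hetero = [2 * v + ((-1) ** v) * v for v in x]
y_homo = [2 * v + (-1) ** v for v in x]

lm_hetero, p_hetero = breusch_pagan_1d(x, y_hetero)
lm_homo, p_homo = breusch_pagan_1d(x, y_homo)
```

A small p-value, as in Table 5, rejects the null hypothesis of constant residual variance, which is the situation that motivates robust standard errors.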