Artificial Intelligence-based Predictive Analysis of Self-Advancing Goaf Edge Support (SAGES) for Improving Underground Mine Safety


An efficient and robust communication channel is a critical factor in underground coal mines to ensure the safety of the working environment. In this paper, we explore how the functionality and the operational and analytical efficiency of SAGES can be analyzed using historical data acquired during depillaring operations carried out in underground coal mines using SAGES, incorporating IoT with Artificial Intelligence for safety and productivity during deployment. Predictive machine learning of convergence, load at withdrawal, and duration of yielding (in hrs) automates the analysis of the after-effects of depillaring operations, which in turn supports the maintenance of SAGES components and the safety measures to be taken at the workplace. Linear estimators, gradient boosting, and clustering methods are employed to predict the output properties of the deployed SAGES. We further illustrate and analyze a fine-tuning approach for the supervised and unsupervised algorithms, and compare the performance of the various estimator models in our experiments. Finally, we build an ensemble of the models for each target variable.


INTRODUCTION
Self-Advancing (mobile) Goaf Edge Supports (SAGES) [Singh et al., 2016] have been developed by IIT (ISM), Dhanbad in collaboration with Jaya Bharat Equipment Pvt. Ltd., Hyderabad under the EOI (Expression of Interest) project of the Ministry of Coal, Govt. of India, for the depillaring operations undertaken in underground coal mines. SAGES (see Figure 1) have been developed to avoid the labor-intensive and time-consuming process of erecting wooden props as a support system at goaf edges (the working front) for protection of the roof during depillaring operations. Goaf edges of the depillaring area in an underground coal mine are inherently high-stress zones, and the support systems erected at such places should be capable of providing adequate support to the roof (see Figure 2). In the caving system of depillaring, it is very important that the initial, periodic, and main falls of the roof take place in time inside the goaf to avoid the roof pressure building up to an unsafe limit. It is equally important that all such falls are restricted inside the goaf edges and do not extend up to the working area. Goaf edge supports play a vital role in achieving these objectives by working as ``Breaker Line Supports''. However, care must be taken in the selection of the support and its capacity, which helps in creating a stiffer and operationally effective breaker line to facilitate regular falls in the goaf, resulting in better strata management. Goaf edge supports also work as a substantial physical barrier between the supported area of extraction and the goaf, and thus discourage persons from entering the goaf. SAGES, when deployed in underground coal mines, are operated using a radio remote control, which eliminates the danger of exposing workers to the hazardous roof conditions associated with the setting and withdrawal of conventional roof supports.
One of the critical contributing factors of accidents is the lack of predictive tools and methods for failure of the roof and sides, leading to a failure to withdraw or remove workers before the actual failure. The historical data were collected from the SAGES machines deployed in underground coal mines between 2017 and 2018. These historical data will be used for the prediction of slices (see Figure 3, which shows a seam called 29L where the shaded area was a slice that became goaf after coal extraction; SAGES is deployed to support the unstable roof for the further extraction of coal from the rest of the 29L slices) during the extraction of SAGES using the newly proposed AI models. Through these predictive analytical results, the required steps can be undertaken before deployment of the SAGES, thereby improving safety in underground coal mines. The following predictions (objectives) are performed using the proposed model:
a. Maximum load at the withdrawal of SAGES after completion of a slice.
b. Maximum convergence between the canopy and the base of the SAGES at the time of withdrawal of the SAGES.
c. Yielding of the SAGES during extraction of a slice, and the duration of yielding.

RELATED WORK
There have been various proposals and developments for the advancement of automated technologies using AI in the mining industry for improving efficiency, reducing hazardous exposure, and improving safety. The research described in Hyder et al., 2019 shows that Artificial Intelligence can be incorporated in mineral processing, hazard detection, exploration, and accident analysis. In coal excavation automation [Nalbantov et al., 2010], the shearer loader has been integrated with a pattern recognition system that detects the correct position of the coal to be excavated, hence automating the mining process. The pattern recognition system consists of two mechanisms, a noise-elimination mechanism and a coal-bed boundary detector; this data-mining approach is based on image recognition. Safety being the primary goal, Matloob et al., 2021 describes how AI and machine learning can provide an effective risk assessment to highlight probable hazards during mining, along with other economic benefits. It prescribes formal and informal approaches for risk assessment, including methods such as fault tree analysis, hazard identification and ranking, and workplace risk assessment and control, but does not show how data preprocessing and analysis can be automated to extract useful insights. Janusz et al., 2017 proposes an approach to assess and predict hazardous seismic events in coal mines, comparing conventional seismic and seismoacoustic methods to an ensemble of diverse regression models called the seismic hazard assessment model, wherein batch data is aggregated and split into static and time-series data. Features are extracted from the time-series data and presented as input to different regression models such as Support Vector Machines (SVM), regression trees, and generalized linear models, but no solution is proposed for automating data processing and analytics to improve mine safety.
Research undertaken in Fu et al., 2020 explains how deep learning models such as convolutional neural networks, long short-term memory networks, deep belief networks, and deep reinforcement learning can be used to carry out operations such as drilling, haulage, blasting, mineral processing, and communication, but does not account for the bottleneck caused by unlabeled image data in the development of sensors and deep learning models.
Although developments and proposals have been made in the field of automating mining operations, these approaches do not account for the automation of data analysis and processing, or for proficient communication of the data recorded during mining operations carried out in underground coal mines. Coal excavation automation automates the discovery of coal-bed boundaries through image processing, but does not account for how human labor can be minimized, nor does it describe how the recorded data can help deduce other mechanical properties during mining operations. Analysis methods can be divided into statistical analysis [Cunningham et al., 1995] and model-based diagnosis. Both categories of analytics depend on the actual data recorded from the XR5 data loggers installed in the SAGES. While model-based analytics can provide an intuition of the behavioral variation of the data, statistical methods are better for anomaly detection and data wrangling. SAGES are used in the goaf area of underground coal mines to support the unstable roof where coal has been freshly extracted through depillaring operations. As shown in Figure 4, the WSN (Wireless Sensor Network) repeaters and the underground base station receive the status of the SAGES sensors, which is transferred to the base station. With the help of IoT, the results of workmen health monitoring, mine environment health monitoring, and strata behavior monitoring are recorded and transferred to the base station, and subsequently to the cloud applications, assuming the ground base station has access to the Internet. This data is preprocessed using an ETL (Extract, Transform and Load) data pipeline in the cloud system. The preprocessed data is used as input for statistical analysis, big data mining, and machine learning.
By embedding machine learning in IoT systems, the processing of the information extracted from the SAGES sensors and data loggers becomes more efficient and faster. In the absence of artificial intelligence, processing constraints arise because IoT nodes have limited processing power, making resource exhaustion attacks inevitable, which in turn degrades the quality of the IoT network. The data analysis structure will contain automated online monitoring, data transmission, and processing, as well as analysis and reporting of production forecasting, thus helping in mine planning and design. By adapting and developing technologies for safe mining and proficient data processing and analysis, we can proceed towards the smart-SAGES goal.

Data logging (acquisition) in SAGES
The SAGES has inbuilt rock-monitoring sensors and data loggers. The data logger is programmed to log the sensor data every minute. One of the sensors is a pressure transmitter fitted in the hydraulic legs of the support. The pressure sensors respond to the strata load building up on the canopy of the SAGES due to the convergence of the roof strata. The stability of the roof depends very much on its convergence. When the convergence of the roof increases, the layers of the strata above the roof separate from the main roof and put excessive load on the support. Ultimately, they fail in the face and working zones, causing loss of machines, men, and coal. Therefore, a draw-wire type convergence sensor is fixed between the base and canopy of the SAGES. The data monitoring system continuously monitors the strata load and convergence at the working faces. It generates a warning when the convergence, the load, or the combined load-convergence value exceeds a prescribed limit for safe working. This monitoring system helps in making timely decisions to avoid any eventuality and ensure a safe working place.
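The warning logic described above can be sketched as a simple per-reading check. The threshold values and the combined-criterion rule below are illustrative assumptions, not field-calibrated limits from the SAGES monitoring system:

```python
# Sketch of the load/convergence warning logic for one logger reading.
# All limit values are assumed for illustration only.
LOAD_LIMIT_T = 40.0        # max safe strata load per leg, in tonnes (assumed)
CONV_LIMIT_MM = 100.0      # max safe roof convergence, in mm (assumed)

def check_reading(load_t: float, convergence_mm: float) -> list:
    """Return the warnings raised by a single one-minute logger reading."""
    warnings = []
    if load_t > LOAD_LIMIT_T:
        warnings.append("load limit exceeded")
    if convergence_mm > CONV_LIMIT_MM:
        warnings.append("convergence limit exceeded")
    # combined load-convergence criterion (assumed form):
    # both readings above 80% of their individual limits
    if load_t > 0.8 * LOAD_LIMIT_T and convergence_mm > 0.8 * CONV_LIMIT_MM:
        warnings.append("combined load-convergence warning")
    return warnings

print(check_reading(42.0, 30.0))   # load alone exceeds its limit
print(check_reading(35.0, 85.0))   # only the combined criterion triggers
```

In a deployment, such a check would run on each one-minute record as it arrives from the data logger.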
The new data logger is integrated and functional in the SAGES. It is deployed in the mines packed inside a steel cover for the field trial (see Figure 5). Load and convergence observations are stored in the data loggers every minute and are afterward analyzed manually for the progressive development and maintenance of SAGES. It is an example of automation and information technology in mines. By adopting an IoT-based [Atzori et al., 2010] information and communication system [A. Singh et al., 2018] and merging digital technologies like artificial intelligence (machine learning), we can implement an efficient communication channel and data collection and analysis process to enable proficient decision making, hence increasing safety in underground mines. We use the data from the XR5 data loggers installed in the initial field trial, before the newly designed and developed IoT-enabled data logger [A. Singh et al., 2018] for SAGES. Taking a close look at these XR5 data logger files, we find that there may exist intercorrelations within these observations that could not be extracted manually. To deal with this issue, we formulate the predictive analysis task as an unsupervised and supervised machine learning problem and propose an approach using Linear Estimators, Gradient Boosting, and the DBSCAN1D algorithm [Meglicki et al., 2021]. Through the related work (see Section 2) and the data analysis (see Section 4.2), we perceive that the proposed approach can precisely distinguish and calculate the load at withdrawal, the convergence, and the duration of yielding of the four deployed supports SAGES-1, -2, -3, and -4, and also develop proficient and seamless processing and analysis of data with IoT.

Development Strategy
Generalized Linear Model (GLM) [Müller et al., 2004] is a statistical method to calculate the relation between continuous and/or categorical variables. It is a flexible generalization of ordinary linear regression for analysis [Kumari et al., 2018], allowing response variables to have error distributions other than the normal distribution. GLM has been used for analytical purposes in various domains such as field research, operations research, and exploration, as described in Wignall et al., 1987.

Figure 6: Representation of difference between (a) Linear Regression and (b) Generalized Linear Model
Data mining is a process used to extract information, patterns, and trends from a large dataset for analytical and decision-making purposes. As the data loggers installed in the SAGES provide us with unlabeled data for the strata load on Leg-1 and Leg-2 and the absolute convergence recorded during deployment of the SAGES in underground coal mines, we employ unsupervised clustering algorithms to group the data into a given number of clusters in the presence of noise, in order to mine the information and trends followed by the load on the legs and the convergence value during depillaring operations. Density-Based Spatial Clustering of Applications with Noise (DBSCAN) [Babur et al., 2015] has been used for unsupervised clustering. Using Euclidean distance, DBSCAN (see Figure 7) groups together points in an n-dimensional space to find associations and structures in the data. The recorded data was monitored using the LogXR software (see Figure 8), which plots the data recorded in the XR5 data logger during deployment of the SAGES. Spatial clustering depends on the location and prominence as well as the feature size of the points to be clustered. Other unsupervised algorithms, for instance K-Means [Na et al., 2010], cannot be used for this analysis: clusters formed using K-Means must have the same feature size, the data must be free of noise, K-Means cannot handle clusters of varying density, and anomalous points are assigned to a regular cluster, which can create false trends. We can see from Figure 9 how the DBSCAN algorithm segregates noise and outliers in the data, whereas the K-Means clustering algorithm does not and treats outliers/noise as "normal" data. Thus, for spatial clustering in 1-D space, we use an enhanced and extended version of the DBSCAN algorithm, DBSCAN1D [Meglicki et al., 2021], which runs in O(N) time instead of O(N log N) and does not depend on the ε-neighborhoods.
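The contrast between the two algorithms can be seen on a tiny 1-D example. This is an illustrative sketch with made-up load values (not logger data), using scikit-learn's DBSCAN and KMeans:

```python
import numpy as np
from sklearn.cluster import DBSCAN, KMeans

# Illustrative 1-D data: two dense groups of load readings plus one outlier.
x = np.array([10.0, 10.2, 10.1, 10.3, 30.0, 30.1, 29.9, 30.2, 80.0]).reshape(-1, 1)

db = DBSCAN(eps=1.0, min_samples=3).fit(x)
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(x)

print("DBSCAN labels:", db.labels_)   # the outlier is labelled -1 (noise)
print("K-Means labels:", km.labels_)  # the outlier is forced into a cluster
```

DBSCAN marks the isolated reading as noise (label -1), whereas K-Means must assign it to one of the two clusters, which is exactly the false-trend risk discussed above.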

Quantitative Methodology
The dataset has been collected from the XR5 data loggers installed inside the SAGES. These data files contain the strata load (in tonnes) on Leg-1 and Leg-2 of the SAGES and the convergence (in mm) caused by variations in load and other environmental factors. The acquired data consists of files from August 4, 2017, to January 25, 2018, for SAGES-1 and SAGES-2, and from November 7, 2017, to February 15, 2018, for SAGES-3 and SAGES-4. The data consists of observations procured every minute for the given duration. Features such as set load, withdrawal load, set date, withdrawal date, location (face, block, level, slice, dip (categorical)), convergence, roof exposure, and blasted holes were extracted at the time of deployment in underground coal mines and from the monitored raw data in the XR5 data loggers, grouped for each slice at every location in the coal mine; these have been used to predict the convergence (in mm) and the strata load at withdrawal on the SAGES.

XR5 Logger Data Analysis
We analyze the data obtained from the XR5 data logger files extracted from SAGES through JointGrid plots, which are used for bivariate analysis and marginal univariate distribution analysis of the feature variables. From Figure 10, we observe that there is an approximately linear relationship between the strata load on Leg-1 and Leg-2 of SAGES. Looking at the same plots for strata load on a leg vs. convergence (mm), we can conclude that although there is an increasing relationship between the two, there is not much linearity, and the value of absolute convergence (in mm) varies irregularly at high values of strata load on Leg-1 and Leg-2. From the residual plots in Figure 11 (a residual plot represents how much a regression line vertically misses a data point), we can observe that linear regression models are not a good fit when the SAGES transitions between slices, where the values of strata load and convergence deviate significantly from the y=0 line. Therefore, we fit a Generalized Linear Model (GLM) on the data points, through which we can observe the regression lines; this fits with our conclusions from the JointGrid plots in Figure 10. GLM generalizes linear regression by allowing the linear model to relate to the response variable via a link function g (which maps a non-linear relationship in the data onto a linear one) and by allowing the magnitude of the variance of each measurement to be a function of its predicted value; precisely, a GLM is determined by the link function g, the variance function v(μ), and x. The network structure of the model can be summarized as E[y | x] = μ = g⁻¹(xᵀβ), with Var(y) = v(μ).
Figure 11: GLM residual plots for SAGES-1
Depillaring operations using SAGES-1 and SAGES-2 were carried out for six slices, whereas using SAGES-3 and SAGES-4, the operations were carried out for four slices.
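A GLM of this kind can be fitted with scikit-learn's TweedieRegressor. The data below is a synthetic stand-in for the load/convergence relationship (illustrative coefficients only, not fitted logger values):

```python
import numpy as np
from sklearn.linear_model import TweedieRegressor

# Synthetic stand-in for logger data: convergence counts whose mean grows
# exponentially with strata load (assumed relationship for illustration).
rng = np.random.default_rng(0)
load = rng.uniform(5, 40, size=200).reshape(-1, 1)   # strata load, tonnes
conv = rng.poisson(np.exp(0.05 * load[:, 0] + 1.0))  # convergence, mm

# GLM with a Poisson variance function and a log link:
# log(E[conv]) = beta0 + beta1 * load
glm = TweedieRegressor(power=1, link="log", alpha=0.0, max_iter=1000)
glm.fit(load, conv)
print("slope:", glm.coef_[0], "intercept:", glm.intercept_)
```

With a log link, the linear predictor β₀ + β₁·load relates to the response through g⁻¹ = exp, which is why the fitted regression curve bends where plain linear regression cannot.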
By employing the K-Means clustering algorithm on the raw data to cluster the strata load on Leg-1 and Leg-2, as shown in Figure 12, we can observe that K-Means clusters the strata load values in the XR5 data loggers in correspondence with the number of slices for which the particular SAGES was used: the number of clusters is equal to the number of slices. We can also observe that the K-Means clustering algorithm generates spatial clusters but includes the noise and outliers, which is not the case with DBSCAN (see Section 3.3). Figure 13 shows the variability of the features, from which we conclude that the variation of each feature is ~95% similar to the others for all days and months of deployment of SAGES; thus, we use a single unsupervised machine learning algorithm to calculate the duration of yielding in SAGES. The similarity in the variation of all three properties arises because the supports are deployed at the same location with the same depillaring driving factor, with the strata load on Leg-1 and Leg-2 having a causal effect on the convergence.

Base Architecture
The base architecture is used to extract task-specific features, which are passed on to task-specific modules during implementation. For predicting withdrawal load and convergence, we extracted block- and SAGES-wise data. Since SAGES-1 and SAGES-2 are deployed at the same location on the rise and dip sides respectively, the set load, convergence, withdrawal load, and face value will be affected by each other's influence. Percent extraction of the rib (%) and roof and side condition were not included in the prediction, as they contained only a constant numerical value. Support yielding duration (in hrs) was not used as a feature due to the unavailability of proper values. A similar approach was followed for predicting the target features for SAGES-3 and SAGES-4. The final data for the predictive analysis of convergence in SAGES has a shape of (62, 13) for both the rise and dip sides. The data for the predictive analysis of the strata load at withdrawal on SAGES has a shape of (62, 15). We use the Conditional Tabular GAN (CTGAN) [Xu et al., 2019] for modeling the probability distribution of rows in tabular format and generating realistic synthetic data of 1000 rows. Within the model, each continuous value is represented using a one-hot vector indicating the mode and a normalized scalar indicating the value within the mode. Figure 14 shows an example of Mode Specific Normalization (MSN), where η1, η2, and η3 are the three modes; MSN is leveraged to deal with columns having non-Gaussian and multimodal distributions. For each column, a Variational Gaussian Mixture (VGM) model estimates the number of modes to fit in the Gaussian mixture. The probability of each mode is calculated, the mode with the highest probability density is sampled, and the value is normalized within that mode. Each column is processed independently, and the mode of each value is represented as a one-hot vector.
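The MSN step can be sketched with scikit-learn's BayesianGaussianMixture standing in for the variational Gaussian mixture; the bimodal column below is synthetic, and the division by 4σ follows the CTGAN normalization convention:

```python
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

# Sketch of mode-specific normalization (MSN): a variational Gaussian
# mixture estimates the modes of a column, each value is assigned to a
# mode, and is then normalized within that mode.
rng = np.random.default_rng(1)
column = np.concatenate([rng.normal(5, 0.5, 300),    # mode 1 (synthetic)
                         rng.normal(20, 1.0, 300)])  # mode 2 (synthetic)

vgm = BayesianGaussianMixture(n_components=5, random_state=0)
vgm.fit(column.reshape(-1, 1))
modes = vgm.predict(column.reshape(-1, 1))  # mode indicator per value

mu = vgm.means_[modes, 0]
sigma = np.sqrt(vgm.covariances_[modes, 0, 0])
normalized = (column - mu) / (4 * sigma)    # scalar value within its mode
print("distinct modes used:", np.unique(modes).size)
```

The variational prior lets unused components shrink away, so although five components are allowed, only the genuinely occupied modes end up carrying data.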
For handling categorical values in the dataset, CTGAN introduces a training-by-sampling and conditional-generator approach to resample the training data and deal with imbalanced labels, so that all the categorical features get an optimum chance to be included in the sample from which the GAN learns. To capture the linear and non-linear relationships and complexities in the data, CTGAN uses generator and critic models to learn the conditional distribution of the real data; these contain fully connected hidden networks with the LeakyReLU activation function to capture correlations and non-linear interactions. The features are extracted from the data recorded by the XR5 data logger installed in the SAGES, and the timestamp objects are converted into DateTime features. The complete non-uniform time series is resampled into a 1-minute window, for which the most recent values were chosen as features and returned as floats. The variation of the strata load on Leg-1 and Leg-2 of the SAGES mimics that of the signal shown in Figure 16. Peaks of these signals were marked using a routine that finds all local maxima by a simple comparison of neighboring values in a 1-D array. A subset of peaks can be selected by filtering on their prominence, distance, height, and width. The parameters below were found by manual trial and experimentation:
a. Distance is the required minimum horizontal distance, in samples, between neighboring peaks in the signal, with a value greater than or equal to 1.
b. Width is the required minimum width of the peaks, in samples, given as a numerical value, a 2-element sequence, or an array of the dimension of the signal.
c. Prominence is the required minimum prominence of the peaks, filtering out peaks whose prominence is not in the given range; its value can be a numerical value, a 2-element sequence, or an array of the dimension of the signal.
d. Height is the required minimum height of the peaks, given as a numerical value, a 2-element sequence, or an array of the dimension of the signal.
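The four filters above match the parameters of SciPy's `find_peaks`. A sketch on a synthetic leg-load signal (the signal shape and the threshold values are illustrative assumptions, not the tuned values used for the logger data):

```python
import numpy as np
from scipy.signal import find_peaks

# Synthetic stand-in for a leg-load trace: low-amplitude background ripple
# with three sharp, prominent peaks standing in for yielding events.
t = np.arange(600)                              # one reading per minute
signal = 0.3 * np.sin(t / 10.0)                 # background ripple
for c in (100, 250, 400):                       # three injected events
    signal += 2.0 * np.exp(-((t - c) ** 2) / (2 * 5.0 ** 2))

# The four filters discussed above (values assumed for illustration):
peaks, props = find_peaks(signal,
                          distance=20,          # min samples between peaks
                          width=2,              # min peak width, in samples
                          prominence=1.0,       # min peak prominence
                          height=1.0)           # min absolute peak height
print("peak positions:", peaks)
```

The height and prominence filters discard the background ripple, leaving only the three injected events; in the real pipeline, the surviving peaks feed the yielding-duration clustering described later.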

Conditional GAN (CTGAN) similarity statistics between real and synthetic data
The statistical scores of the CTGAN-generated synthetic data are used to assess its similarity to the real data. The similarity score is the aggregate of the metrics listed in Table-2. We can observe that the basic statistics and the mean correlation between the two datasets are significant, with an overall similarity score of 0.8220.
Metric                                                Score
Basic statistics                                      0.9774
Correlation: column correlations                      0.7781
Correlation: mean correlation (real vs. synthetic)    0.9462
1 - MAPE (mean absolute percentage error; estimator results)   0.5864
Similarity Score                                      0.8220
Table-2: Similarity statistics between real and CTGAN resampled data
From the plots representing the similarity between the CTGAN-generated synthetic data and the real data (see Figure 17), we can observe the similarity in the distributions of variables such as strata load at withdrawal at the rise side, face value (in m), setting load at the dip side, and dip value. Dip value, setting load at the dip side, and strata load at withdrawal at the rise side follow remarkably similar distributions in the original and the synthetic resampled data. Face value (in m) has distinguishable features in the two datasets, which reduces the similarity score to 0.8220.
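Two of the Table-2 metrics can be sketched directly. The frames below are tiny synthetic stand-ins (not the logger data), and the exact aggregation used for Table-2 is assumed, not reproduced:

```python
import numpy as np
import pandas as pd

# Illustrative real table and a near-copy "synthetic" table.
rng = np.random.default_rng(0)
real = pd.DataFrame({"set_load": rng.normal(20, 3, 500),
                     "withdrawal_load": rng.normal(35, 5, 500)})
synthetic = real + rng.normal(0, 0.5, real.shape)

def basic_stats_score(a, b):
    """1 - mean absolute percentage error between column means and stds."""
    stats_a = np.r_[a.mean().values, a.std().values]
    stats_b = np.r_[b.mean().values, b.std().values]
    return 1 - np.mean(np.abs(stats_a - stats_b) / np.abs(stats_a))

def correlation_score(a, b):
    """1 - mean absolute difference between the correlation matrices."""
    return 1 - np.mean(np.abs(a.corr().values - b.corr().values))

print("basic statistics:", round(basic_stats_score(real, synthetic), 4))
print("column correlations:", round(correlation_score(real, synthetic), 4))
```

Scores near 1 indicate that the synthetic table reproduces the marginal statistics and the correlation structure of the real one; averaging such metrics gives an aggregate similarity score in the spirit of Table-2.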

Load at withdrawal on SAGES
We segment the CTGAN-resampled dataset of shape (1000, 15) to predict two variables, Strata Load at Withdrawal at Rise Side (which includes SAGES-1 and SAGES-3) and Strata Load at Withdrawal at Dip Side (which includes SAGES-2 and SAGES-4), which were passed to a Generalized Linear Estimator tuned manually by minimizing the RMSE (Root Mean Squared Error). We train the Generalized Linear Estimator with a Poisson distribution for five folds, as the dataset contains non-negative responses. The basic network of a Generalized Linear Estimator is the same as the GLM network structure (see Section-3.3 and Section-4.2).
By using the Poisson distribution family with the GLE, the model defines the dependency between the response and the covariates present in the data and is fit by maximizing the penalized likelihood. The network structure of the Poisson regression used in the Generalized Linear Estimator can be expressed as:

log(E[y | x]) = β₀ + xᵀβ,  i.e.,  E[y | x] = exp(β₀ + xᵀβ)

with the penalized log-likelihood

ℓ(β) = Σᵢ [ yᵢ (β₀ + xᵢᵀβ) − exp(β₀ + xᵢᵀβ) ] − λ‖β‖₂²
Analysis of the strata load on Leg-1 and Leg-2 of the SAGES shows that one leg has a causal effect on the other, which conveys significant linearity (see Figure 10). From the GLM residual plots (see Figure 11), it is clear that plain linear regression cannot be used; instead, a Generalized Linear Estimator model gives a substantially improved fit between the load on Leg-1 and Leg-2. We split the resampled data into a train-to-test ratio of 85:15, which gives us ample data both for cross-validated training and for testing. By splitting the training data into five folds, we reduce overfitting while training. We train two separately hyper-tuned models, one each for the rise side and the dip side of deployment, to contain the data's interactivity. To minimize the risk of overfitting, we use regularization techniques. In ridge regularization, a small penalty equal to the square of the magnitude of the coefficients is added to the cost function; its shrinkage parameter minimizes the coefficient values.
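The combination described above, a Poisson GLE with an L2 (ridge) penalty evaluated over five folds, can be sketched with scikit-learn. The features and the Poisson coefficients below are synthetic stand-ins, not the tuned SAGES model:

```python
import numpy as np
from sklearn.linear_model import PoissonRegressor
from sklearn.model_selection import KFold, cross_val_score

# Synthetic stand-in features (e.g. scaled set load and convergence)
# with a non-negative Poisson response, mimicking the withdrawal-load setup.
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(300, 2))
y = rng.poisson(np.exp(1.0 + 0.8 * X[:, 0] + 0.4 * X[:, 1]))

model = PoissonRegressor(alpha=0.1, max_iter=500)   # alpha: L2 shrinkage
scores = cross_val_score(model, X, y,
                         cv=KFold(n_splits=5, shuffle=True, random_state=0),
                         scoring="neg_root_mean_squared_error")
print("mean five-fold CV RMSE:", -scores.mean())
```

In practice, alpha would be tuned manually against the cross-validated RMSE, as described for the GLE above.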

Withdrawal Load at Rise side

Ridge Regularization:

L(β) = Σᵢ (yᵢ − ŷᵢ)² + λ Σⱼ₌₁ᵖ βⱼ²

where p is the total number of predictors and λ is the shrinkage parameter
Assumptions -
• It was assumed that set load, convergence, and load at withdrawal are correlated and had interactivity strength.
• Similar interactivity strength was observed in set date and withdrawal date at both the dip and rise sides of underground coal mines.
• Null values in Blasting to Induce Caving were assumed to be equal to 0.

Convergence in SAGES
For the predictive analysis of convergence at both the rise and dip sides, we segment the data with different target features and create a train-test split in a ratio of 85:15. Three separate models were fit on this data - a Generalized Linear Estimator, a Gradient Boosting Estimator with default parameters, and a Gradient Boosting Estimator with hyper-tuned parameters - the outputs of which were stacked to create a final ensemble model.

• Generalized Linear Estimator model
The Generalized Linear Estimator used for the predictive analysis of convergence in SAGES has been trained for six folds and follows the same usage, architecture, and network as the GLE used for the predictive analysis of load at withdrawal (see Section-5.1), but with Elastic Net regularization. Elastic Net [Zou et al., 2004] is used to create a convex combination of Ridge and Lasso regularization. During regularization, the L1 penalty forms a sparse model, whereas the quadratic part of the penalty stabilizes the L1 penalty, removes the limitation on variable selection, and promotes a grouping effect that identifies correlated variables, subsequently enhancing the sampling procedure. We can describe the geometry of the Elastic Net in a 2-dimensional space with λ1 = λ2; this geometry explains the grouping effect and the sparsity of the L1 penalty.
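The convex combination and the grouping effect can be sketched with scikit-learn's ElasticNet, whose `l1_ratio` parameter mixes the L1 and L2 penalties; the correlated-feature data below is synthetic:

```python
import numpy as np
from sklearn.linear_model import ElasticNet

# Elastic Net penalty: alpha * (l1_ratio * ||w||_1
#                               + 0.5 * (1 - l1_ratio) * ||w||_2^2).
# l1_ratio = 0.5 weights the Lasso and Ridge parts equally, matching the
# lambda1 = lambda2 geometry discussed above.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
X[:, 1] = X[:, 0] + 0.01 * rng.normal(size=200)   # a correlated pair
y = 3.0 * X[:, 0] + 3.0 * X[:, 1] + rng.normal(size=200)

enet = ElasticNet(alpha=0.5, l1_ratio=0.5).fit(X, y)
print("coefficients:", np.round(enet.coef_, 2))
```

The grouping effect is visible in the output: the two nearly identical predictors receive nearly equal coefficients instead of one being arbitrarily zeroed out, as pure Lasso tends to do.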

GBM resulting model equation:

ŷ = ŷ₁ + η r̂₁ + η r̂₂ + … + η r̂_N

where η is the shrinkage (learning rate) and r̂ₖ is the prediction of the k-th residual tree
Boosting, also known as additive modeling, is a method of combining multiple simple or weak models into a single composite model, using gradient descent to minimize the loss. Decision trees are used as weak learners by transforming the data into a tree representation, in which each node represents an attribute and each leaf a label. Cumulatively, each weak predictor corrects its predecessor's errors. The ensemble consists of N trees. Tree_1 is trained using the feature matrix X and the labels y; its predictions ŷ₁ are used to determine the training-set residual errors r₁. Tree_2 is then trained using the feature matrix X and the residual errors r₁ of Tree_1 as labels; its predictions r̂₁ are then used to determine the residuals r₂. The process is repeated until all N trees forming the ensemble are trained (see Figure 21).
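The residual-fitting loop described above can be written out directly; this is a minimal sketch on synthetic data, with shallow scikit-learn decision trees as the weak learners:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Minimal gradient-boosting loop: each weak tree is trained on the
# residuals of the running ensemble prediction.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(300, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=300)   # synthetic target

eta, n_trees = 0.1, 50                        # learning rate, ensemble size
prediction = np.full_like(y, y.mean())        # F_0: constant initial model
trees = []
for _ in range(n_trees):
    residual = y - prediction                 # r_k = y - F_k(X)
    tree = DecisionTreeRegressor(max_depth=2).fit(X, residual)
    prediction += eta * tree.predict(X)       # F_{k+1} = F_k + eta * tree_k
    trees.append(tree)

rmse = np.sqrt(np.mean((y - prediction) ** 2))
print("training RMSE after boosting:", round(rmse, 3))
```

Each iteration shrinks the residuals a little, so the composite model's error falls toward the noise level; a production GBM adds subsampling, column sampling, and early stopping on top of this loop.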

Table 5: GBM parameters

• Hypertuned Gradient Boosting Estimator model
This model has the same architecture as the Gradient Boosting Estimator used with default parameters and has been trained for six folds. Hyperparameter tuning is an essential step, as it directly impacts the performance of the machine learning model. By manually tuning the model hyperparameters, we can observe the behavior of the individual parameters. The hyperparameters used in this particular model are as follows:

• Ensembling

Model Parameters
Separate results for convergence in SAGES at the dip side and the rise side are predicted using each model. The predictions are stacked horizontally for the dip side and the rise side respectively, creating two datasets with convergence as the target variable for the respective segments.
To develop an ensemble composite model, we use a Random Forest regressor and an XGBoost regressor for convergence at the rise side and the dip side respectively. Random Forest [Ali et al., 2012] is a meta-estimator (an estimator that takes another estimator as a parameter) that uses the bagging technique (bootstrap aggregation, used in ensemble learning to reduce variance in a noisy dataset), in which the models are developed in parallel with no interaction between the individual weak learners. In the case of regression, it outputs the mean prediction of the individual trees. It adds additional randomness to the model by searching for the best feature among a subset of features while splitting a node. These models were chosen based on experimentation.
Built on the base structure of gradient-boosted trees, XGBoost [Chen et al., 2016] is a boosting technique that minimizes a regularized (L1 and L2) objective function combining a convex loss function and a penalty term for model complexity, i.e., the regression tree functions. XGBoost approaches the sequential tree-building process with a parallelized implementation, exploiting the interchangeable nature of the loops used to build the base learners: nested loops limit parallelization because the inner loop must complete before the outer loop can proceed, so the order of the loops is interchanged, using initialization and sorting of parallel threads, to improve run time [Chen et al., 2016]. To implement the ensemble models, the data is scaled such that the mean is equal to 0 and the standard deviation is equal to 1. Feature scaling is used to normalize the range of the independent variables used for predictive analysis. The scaled data is fit to the models and the respective RMSE scores are compared.
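The stacking-plus-scaling step can be sketched as follows. The base-model predictions are synthetic stand-ins, and RandomForestRegressor stands in for both meta-models here (per the text, XGBoost would be used for the dip side):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.preprocessing import StandardScaler

# Sketch of the ensembling step: stack the three base-model convergence
# predictions column-wise, scale to zero mean / unit variance, and fit
# the meta-model on the scaled stack.
rng = np.random.default_rng(0)
y = rng.normal(50, 10, 200)                   # convergence target, mm (synthetic)
# stand-in predictions from the three base models, with differing error levels
stacked = np.column_stack([y + rng.normal(0, s, 200) for s in (2, 3, 4)])

X = StandardScaler().fit_transform(stacked)   # mean 0, std 1 per column
meta = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
rmse = np.sqrt(np.mean((y - meta.predict(X)) ** 2))
print("ensemble training RMSE:", round(rmse, 2))
```

The meta-model learns how much to trust each base model's prediction; held-out data, not training data, would be used to report the final RMSE comparison.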

Assumptions -
• Through data analysis, it was found that set date and withdrawal date did not play an important role in the prediction of convergence (mm) and thus were not included in the predictor features.
• It was assumed that load at withdrawal at the dip and rise sides and set load at the dip and rise sides are correlated and had interactivity strength.
• Null values in Blasting to Induce Caving were assumed to be equal to 0.

Duration of Yielding in SAGES
We use the peaks found during feature extraction as input to Density-Based Spatial Clustering of Applications with Noise in one dimension (DBSCAN1D) [Meglicki et al., 2021], an unsupervised machine learning method, to calculate the duration of yielding in SAGES. Density-based clustering refers to unsupervised learning methods that identify distinct groups/clusters in the data, based on the idea that a cluster is a contiguous region of high point density in data space, separated from other clusters by contiguous regions of low point density. DBSCAN1D is a base algorithm for density-based clustering in a single dimension (see Section 3.3). It can discover clusters of different shapes and sizes in large datasets that may contain noise and outliers.
• eps (ε) is the distance used to define the neighborhood of a point. Its value is chosen at the elbow of a nearest-neighbors distance plot, which yields an efficient and effective eps (ε); the appropriate value may differ depending on the data.
• min_samples is the minimum number of points that must be clustered together for a region to be considered dense. Its value should be greater than (dimension of data + 1), or (2 × dimension of data).
DBSCAN1D is suitable for spatial clustering of regions of different densities and sizes in one dimension in large datasets. We apply the algorithm to create spatial clusters of the prominent peak points found in the load and convergence variation of SAGES recorded during deployment. When SAGES undergo yielding, the application of pressure at the valves produces dense clusters of peak-point patterns at various stages of monitoring. Using DBSCAN1D, we create clusters of the prominent peaks generated during yielding and assign them adjacent labels. The resultant dataset consists of the prominent peak values, the cluster labels, the DateTime of each peak, and the uniform step-minute value of each peak point. In the resultant dataframe, we convert the DateTime index to durations for the respective clusters. From Figure 25, which displays the variation and spatial clustering plot, we can read off the total number of clusters (excluding noise) as well as the peak points marked where yielding takes place in SAGES. The duration table of the clusters is given below.
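The clustering-to-duration step above can be sketched as follows. scikit-learn's `DBSCAN` on a single reshaped column stands in for the DBSCAN1D package, and the peak timestamps are synthetic step-minute values, not the paper's monitoring data:

```python
# Cluster 1-D peak positions with DBSCAN, then derive a per-cluster
# duration as (last peak - first peak) within each cluster.
import numpy as np
from sklearn.cluster import DBSCAN

# Two dense bursts of peaks (yielding episodes) plus one isolated outlier.
peaks_min = np.array([10, 11, 12, 13, 60, 61, 62, 63, 64, 200], dtype=float)

labels = DBSCAN(eps=3.0, min_samples=3).fit_predict(peaks_min.reshape(-1, 1))

durations = {}
for lab in set(labels) - {-1}:               # label -1 marks noise points
    pts = peaks_min[labels == lab]
    durations[lab] = pts.max() - pts.min()   # duration in minutes
print(sorted(durations.values()))  # → [3.0, 4.0]
```

The isolated peak at minute 200 is labelled noise rather than forced into a cluster, which mirrors how spurious single spikes in the load trace are excluded from the yielding-duration table.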

Model Comparison
Model development for the predictive analysis of strata load at withdrawal in SAGES uses only Generalized Linear Estimators (GLE) fit on Conditional-GAN-resampled data, one for each segmented location. A similar variance in the performance metric scores has been observed for the predictive analysis of strata load at withdrawal at both the rise and dip sides. The key metrics for machine learning model selection are Mean Absolute Error (MAE) and the R² score, as these metrics provide better insight for regression-based tasks. Comparing the different models, including the composites and ensembles, we observe that the ensemble models, namely the Random Forest regressor and the XGBoost regressor, perform significantly better than the individual learners. Therefore, we use these models for the end-stage predictive analysis of the convergence monitored in SAGES.
Feature importance plots show the percentage probability, or relative importance, of the feature variables used by the machine learning model for predicting the target variable. Importance is calculated as the decrease in node impurity in a boosted tree, weighted by the probability of reaching that node. From Figure 27 and Figure 28, we can observe that the slice category number, the strata load at withdrawal at both the rise and dip sides, and the setting load at the dip side play an important role in the predictive analysis of convergence (mm). Features such as the SAGES number, Level, and Block lie at the lower end of the variable importance plot, which signifies that these features are highly uncorrelated with the target. SHAP dependency plots (see Figure 29) show how much a particular variable affects the target variable; SHAP values allow us to decompose any prediction into the sum of the effects of each feature value.
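A sketch of how the impurity-based importances behind such plots are obtained from a fitted tree ensemble (synthetic data; the feature names are illustrative stand-ins for the SAGES predictors):

```python
# Rank features by mean decrease in impurity across the ensemble,
# as exposed by scikit-learn's feature_importances_ attribute.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)
names = ["slice_category", "load_withdrawal_rise",
         "load_withdrawal_dip", "set_load_dip"]
X = rng.normal(size=(300, 4))
# slice_category carries the strongest signal in this toy target.
y = 3.0 * X[:, 0] + 1.5 * X[:, 2] + rng.normal(scale=0.1, size=300)

rf = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
ranked = sorted(zip(names, rf.feature_importances_), key=lambda t: -t[1])
for name, imp in ranked:
    print(f"{name}: {imp:.3f}")
```

Importances sum to 1, so each value can be read as the relative share of impurity reduction attributable to that feature, matching the percentage-style reading used for Figures 27 and 28.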
The strata load at withdrawal (dip side) in SAGES has a significant positive effect on convergence (mm), followed by the slice category number, indicating a sparse but significant dependency of the convergence value in SAGES on these features.

CONCLUSION
This paper presents our approach for predicting convergence (mm), strata load at withdrawal (ton), and duration of yielding in SAGES deployed in underground coal mines. The predictive models were designed and developed using the following methods: Generalized Linear Estimators, Gradient Boosting Estimators, and Density-Based Spatial Clustering of Applications with Noise in one dimension (DBSCAN1D), to automate the analysis of the after-effects of depillaring operations. The models do not need to be trained separately for each SAGES machine (four SAGES machines, numbered 1 to 4, were deployed in the underground mines): the SAGES number was feature-engineered and merged into the dataset, and since the SAGES numbers were observed not to be predominantly correlated with the other features, it was concluded that separate models for each machine were unnecessary. The main features of importance are the setting load, dip value, face value, slice number, etc., which, when given as input, allow the model to retain the characteristics and properties of the deployment. By incorporating Artificial Intelligence along with IoT in SAGES in underground mines, a safe working environment can be ensured through timely decision making. SAGES can be preset with the specifications obtained through predictive analysis, which can further help in deducing other mechanical properties such as pressure, temperature, and setting load.