Evaluation of underground hard rock mine pillar stability using gene expression programming and decision tree-support vector machine models - Deep Underground Science

Abstract

Assessing the stability of pillars in underground mines (especially in deep underground mines) is a critical concern during both the design and the operational phases of a project. This study mainly focuses on developing two practical models to predict pillar stability status. For this purpose, two robust models were developed using a database including 236 case histories from seven underground hard rock mines, based on gene expression programming (GEP) and decision tree-support vector machine (DT-SVM) hybrid algorithms. The performance of the developed models was evaluated based on four common statistical criteria (sensitivity, specificity, Matthews correlation coefficient, and accuracy), receiver operating characteristic (ROC) curve, and testing data sets. The results showed that the GEP and DT-SVM models performed exceptionally well in assessing pillar stability, showing a high level of accuracy. The DT-SVM model, in particular, outperformed the GEP model (accuracy of 0.914, sensitivity of 0.842, specificity of 0.929, Matthews correlation coefficient of 0.767, and area under the ROC of 0.897 for the test data set). Furthermore, upon comparing the developed models with the previous ones, it was revealed that both models can effectively determine the condition of pillar stability with low uncertainty and acceptable accuracy. This suggests that these models could serve as dependable tools for project managers, aiding in the evaluation of pillar stability during the design and operational phases of mining projects, despite the inherent challenges in this domain.

Highlights

Stability assessment of pillars in deep underground mines is crucial in both design and operational phases.
Pillar stability in underground mines is evaluated using two robust data mining methods.
Both models showed high accuracy in evaluating pillar stability; however, the hybrid model performed better.
The presented models effectively determine pillar stability with low uncertainty and acceptable accuracy, compared to earlier models.
Despite field constraints, these models may help project managers evaluate pillar stability throughout mining project design and operation.

1 INTRODUCTION

Today, underground mining methods such as self-supported, supported, and caving methods are extensively used due to the increase in the depth of mineral deposits. Ground subsidence and severe damage to surface structures are among the downsides of using these methods. Hence, selecting an appropriate underground mining method to mitigate ground subsidence is crucial. Partial extraction methods, particularly among self-supported methods, have a substantial impact on subsidence control (Yu et al., 2018). Room and pillar mining is a partial extraction method suitable for shallow to deep and flat deposits containing resistant ores, including coal, metal, and building stone mines (Mallı et al., 2017; Zhang et al., 2016). In this method, a part of the ore is used as a pillar to support the hanging wall. Ensuring the stability of these pillars is a critical aspect of their design. Incidents in underground mines resulting in injuries, fatalities, equipment breakdowns, and loss of work hours are often linked to pillar instability (such as pillar failures and collapses) (Esterhuizen et al., 2008; Ghasemi et al., 2014; Wessels & Malan, 2023). Therefore, ensuring pillar stability is of great importance for the social and economic safety of underground mines.

A wide range of studies have been carried out on this issue over the past few years, and a number of strategies have been presented to assess pillar stability and mitigate the risks caused by instability. The first step in evaluating pillar stability is to evaluate stability in the design phase before starting mining. Therefore, it is important that the pillar stability be assessed at this stage for the implementation of appropriate control measures to increase safety in mining. Several methods have been developed to predict pillar stability, broadly classified into three categories: safety factor approach (González-Nicieza et al., 2006; Lunder, 1994), numerical simulation methods (Deng et al., 2003; Kim et al., 2019; Li et al., 2013, 2019; Liang et al., 2019; Seo et al., 2016), and machine learning methods (Liang et al., 2020; Zhou et al., 2011). The first two methods lack high reliability because of the simplifications they make, and applying them to other mines is challenging. Machine learning methods are used more often, primarily due to their reliance on databases from numerous projects, low cost, high speed, and ease of use. The use of machine learning methods to generate empirical models for predicting pillar stability has rapidly increased in recent years. These models make use of data obtained from one or multiple mining projects, taking into account different input parameters, in order to predict the status of pillar stability. Table 1 provides a summary of the utilization of machine learning models for predicting pillar stability.

Table 1. Application of machine learning models for prediction of pillar stability.

Source	Input parameters	Technique(s)	Database	Accuracy (%)
Tawadrous and Katsabanis (2007)	MT, D, H, h, w, L, SH, BH, OD	Artificial neural networks	230	93.00
Monjezi et al. (2011)	AG, SG, DP, DG, H	Artificial neural networks	149	98.80
Zhou et al. (2011)	w, h, UCS, S	Fisher discriminant analysis, Support vector machine	46	90.00–97.50
Wattimena (2014)	w, h, UCS, S	Multinomial logistic regression	178	79.21
Zhou et al. (2015)	w, h, UCS, S	Linear discriminant analysis, Multinomial logistic regression, Multilayer perceptron neural networks, Support vector machine, Random forest, Gradient boosting machine	251	75.00–82.00
Ghasemi et al. (2017)	w, h, UCS, S	J48, Support vector classification	178	74.00–81.00
Ding et al. (2018)	w, h, UCS, S	Stochastic gradient boosting	205	81.21
Liang et al. (2020)	w, h, UCS, S	Gradient boosting decision tree, Extreme gradient boosting, and Light gradient boosting machine	236	83.69
Ahmad et al. (2020)	w, h, UCS, S	Random trees, C4.5	46	83.00–100.00
Li et al. (2022)	w, h, UCS, S	Logistic model tree	178	93.80–94.10
Zhou et al. (2023)	AG, SG, DP, DG, H	Neural–metaheuristic paradigm	149	99.90
Li et al. (2023)	w, h, UCS, S	Support vector machine	306	79.30–91.30

Abbreviations: AG, angle of the goaf line; BH, backfill height; D, dip of the orebody; DG, distance of the point from the goaf line; DP, distance of the point from the center of the pillar; GSI, geological strength index; h, pillar height; H, overburden thickness; L, pillar length; MT, mine type; OD, overall rock mass density; RMR, rock mass rating; S, pillar stress; SG, specific gravity; SH, stope height; UCS, uniaxial compressive strength; w, pillar width.

In view of the above discussion, it is essential to use a method that provides accurate and reliable assessments of pillar stability without significant uncertainty while being easy to use in other mining projects. Previous studies have attempted to predict pillar stability using machine learning methods. Some of these studies have used black box methods, which cannot be easily used in other projects. Other studies have used methods that could not evaluate stability status with acceptable accuracy. This study is the first attempt to utilize gene expression programming (GEP) and a hybrid decision tree-support vector machine (DT-SVM) algorithm as two practical models for evaluating the stability of hard rock pillars. In various domains of mining and rock engineering, GEP, DT, and SVM have emerged as preferred machine learning methods for constructing predictive models. These methods are a suitable choice for constructing precise models in this domain because of their white (glass) box nature and the ease with which their results can be interpreted (Afrasiabian & Eftekhari, 2022; Cemiloglu et al., 2023; Ghasemi et al., 2017; Huang et al., 2023; Kadkhodaei & Ghasemi, 2019; Khandelwal & Monjezi, 2013; Raheel et al., 2023; Shamsi et al., 2022; Shirani Faradonbeh et al., 2016; Zalaghaie et al., 2023). By utilizing these models, this study aims to provide a more reliable and efficient approach for assessing pillar stability in mining projects. These models can be used to evaluate pillar stability in the design stage, before the start of the mining operation, or to monitor the status of the pillar (as a warning) after a mining operation. Figure 1 shows the flow chart of this research.

Figure 1. Flow chart of this research.

2 DATABASE DESCRIPTION

Two empirical models for predicting pillar stability status, one using the GEP technique and one using the DT-SVM hybrid algorithm, have been developed based on a database compiled from well-known studies. This database includes 236 case histories from seven underground hard rock mines: Selebi-Phikwe mine, Elliot Lake uranium mine, Open stope mine in Canada, Westmin Resources Ltd.'s H-W mine, Zinkgruvan mine, Stone mines in the United States, and Alicante marble mine (Liang et al., 2020). The database contains four input parameters that affect pillar stability, namely pillar width (w), pillar height (h), uniaxial compressive strength (UCS), and average pillar stress (S), and one output parameter. The output of this database is the pillar stability status, that is, stable (100 cases), unstable (53 cases), or failed (83 cases). In this paper, the labels 0, 1, and 2 are used to represent stable, unstable, and failed cases, respectively. The stable status indicates that the pillar shows no sign of stress-induced fracturing or only has minor spallings that do not affect pillar stability; the unstable status indicates that the pillar has partially failed and has prominent spallings, but still possesses a certain supporting capacity; and the failed status indicates that the pillar is crushed and has pronounced openings of joints, which represents a high risk of collapse (Liang et al., 2020).

Figure 2 displays a ridgeline plot with box plots illustrating the statistics of the input parameters and the output parameter. A ridgeline plot is a data visualization technique commonly used to display the distribution of a continuous variable across multiple categories. It is also referred to as a joy plot due to its resemblance to a landscape with peaks and valleys. In a ridgeline plot, several density curves are superimposed on top of one another, with each curve representing the distribution of the variable for a particular category or group. The ridges of each curve are aligned, resulting in a layered effect. The height of each ridge corresponds to the density of the data within that category, allowing for easy comparison of the distributions across categories (Wickham, 2016).

To create a predictive model for pillar stability using GEP, the database was split into two subsets: stable–unstable (S–U) and unstable–failed (U–F). The S–U subset includes 100 stable and 136 unstable cases (here the 53 unstable and 83 failed cases are grouped together as unstable), and the U–F subset includes 53 unstable cases and 83 failed cases. Generally, the process of model development involves partitioning the database into two subsets: training and testing. The training subset is utilized for model development, whereas the testing subset is used to evaluate the performance of the developed models. This technique is commonly used, and the percentage of partitioning depends on the size of the databases (Akbarzadeh et al., 2022; Guido et al., 2020; Shaffiee Haghshenas et al., 2023). To develop and validate the S–U and U–F models, each subset was randomly split into a training subset (85% of the cases) used to develop the model and a testing subset (15% of the cases) used to validate the model. For the DT-SVM model, the partitioning into training and testing subsets followed the same 85%:15% scheme, but it was carried out on the entire database (236 case histories). The allocation of cases into the training and testing data sets was performed randomly using MATLAB software.
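The random 85%:15% partitioning described above can be illustrated with a minimal sketch. The study performed the split in MATLAB; the Python function below, its name `split_cases`, the fixed seed, and the rounding rule are assumptions made only for illustration. With these assumptions, the testing subsets contain 35 of the 236 S–U cases and 20 of the 136 U–F cases.

```python
import random


def split_cases(n_cases, test_fraction=0.15, seed=42):
    """Randomly split case indices into (training, testing) lists.

    Hypothetical helper: the seed is fixed only so the sketch is repeatable.
    """
    indices = list(range(n_cases))
    rng = random.Random(seed)
    rng.shuffle(indices)
    n_test = round(n_cases * test_fraction)  # assumed rounding rule
    return sorted(indices[n_test:]), sorted(indices[:n_test])


# S-U subset (236 cases) and U-F subset (136 cases)
train_su, test_su = split_cases(236)
train_uf, test_uf = split_cases(136)
print(len(train_su), len(test_su))
print(len(train_uf), len(test_uf))
```

The returned index lists would then be used to select rows of the database for model development and validation.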

3 GEP

Ferreira (2001) was the first to introduce GEP as a method for constructing mathematical models through genetic and evolutionary computations. GEP is a hybrid algorithm that combines elements of a genetic algorithm (GA) and genetic programming (GP). It utilizes both tree structures (derived from GP) and linear structures (derived from GA) (Ferreira, 2001). The GEP algorithm has found widespread application in engineering research owing to its exceptional performance, ease of use, speed, and precision (Bastami et al., 2020; Hoang & Tien Bui, 2018; Jahed Armaghani et al., 2018; Kadkhodaei & Ghasemi, 2019; Ramesh et al., 2020). GEP builds a model by combining different mathematical operators, selected on the basis of genetic parameters. This approach allows for the optimization of solutions and the creation of efficient and effective mathematical models in engineering. The algorithm's structure commences with the generation of the initial population of chromosomes, as shown in Figure 3. The initial population is exposed to various genetic operators such as mutation, transposition, and recombination, which facilitate the formation of linear and tree structures. Each genetic operator plays an important role in passing chromosomes from one generation to the next. The iterative process of the algorithm continues until the best-performing structure is reached (Ferreira, 2006).

3.1 GEP model development

To develop models using GEP, w, h, UCS, and S are entered into the algorithm as input parameters, and the pillar stability status is entered as an output parameter. Therefore, two equations based on Equation (1) are developed to predict pillar stability status: S–U and U–F. This equation demonstrates that the GEP predictive models are dependent on the four input parameters of the database (w, h, UCS, and S), and they are developed accordingly.

$$Y_{S-U},\ Y_{U-F} = f(w, h, UCS, S). \tag{1}$$

The GEP models were developed using GeneXpro Tools 5.0 software (GEPSOFT, 2014), based on the training subset. The software uses a five-step process, which includes determining the mathematical functions, fitness function, genetic operators, chromosome properties, and linking function (Ferreira, 2001). Table 2 provides an overview of the parameters used in the GEP model. The selection of parameters in the GEP model is left to the discretion of the user; the optimal model structure is reached through trial and error by adjusting these parameters, which also affect the time needed to reach the optimal model. Nonetheless, researchers have suggested different combinations of these parameters that could facilitate their implementation and speed up the process of reaching an optimum model (Abbaszadeh Shahri et al., 2021; Dehghani et al., 2021; Kadkhodaei et al., 2022; Lawal et al., 2021; Mousavi et al., 2012; Sadrossadat et al., 2018; Zare Naghadehi et al., 2018).

Table 2. Parameters used in the gene expression programming models.

Steps	Description	Selected parameters	Stable–unstable model	Unstable–failed model
1	Fitness function	Root-mean-square deviation with rounding threshold = 0
2	Functions	Addition (+), Subtraction (–), Multiplication (*), Division (/)
3	Characteristics of chromosomes	Head size	9	8
		Number of gene(s)	3	3
		Number of chromosomes	90	90
4	Genetic operators	Mutation rate	0.044	0.044
		Inversion rate	0.1	0.1
		Insertion sequence transposition	0.1	0.1
		RIS transposition	0.1	0.1
		Gene transposition	0.1	0.1
		One-point recombination	0.3	0.3
		Two-point recombination	0.2	0.2
		Gene recombination	0.2	0.2
5	Linking function	Addition (+)

Following the steps described earlier, the GEP algorithm was executed (10 000 iterations) to construct the models. The expression trees generated for each gene of the S–U and U–F models are displayed in Figures 4 and 5, respectively. The mathematical equation for each gene can be extracted from its corresponding tree structure (Equations 2–4 for the S–U model and Equations 5–7 for the U–F model). It is important to emphasize that the process of deriving these equations initiates at the lowest node of each expression tree and advances upward, proceeding from left to right. The final step involved using the linking function (+) to define two indexes for evaluating the pillar stability status (SU and UF indexes) according to Equations (8) and (9), respectively. To determine the pillar stability status, a threshold value of 0 was applied. If SU ≥ 0, the pillar is considered unstable, while if it is less than 0, the pillar is considered stable. Similarly, if UF ≥ 0, the pillar is deemed to have failed, whereas if it is less than 0, the pillar is unstable.

$$\text{Sub-ET}1_{S-U} = \frac{UCS \times w^{3} (S - 5.39)}{w - S}, \tag{2}$$

$$\text{Sub-ET}2_{S-U} = UCS \times [h \times (S - UCS - w) - h], \tag{3}$$

$$\text{Sub-ET}3_{S-U} = \frac{UCS \times (h + 21.72 S) \times h^{2}}{UCS - 7.14}, \tag{4}$$

$$\text{Sub-ET}1_{U-F} = UCS \times [(w + 1.17) \times (S - UCS) + UCS \times h], \tag{5}$$

$$\text{Sub-ET}2_{U-F} = [5.84 (h - UCS) + S \times h] \times \left(\frac{UCS}{S} + 8.99\right), \tag{6}$$

$$\text{Sub-ET}3_{U-F} = [(UCS - h) w - 2 S] \times (-9.54 + h) \times (7.9 - w), \tag{7}$$

$$SU = \frac{UCS \times w^{3} (S - 5.39)}{w - S} + UCS \times [h \times (S - UCS - w) - h] + \frac{UCS \times (h + 21.72 S) \times h^{2}}{UCS - 7.14}, \tag{8}$$

$$UF = UCS \times [(w + 1.17) \times (S - UCS) + UCS \times h] + [5.84 (h - UCS) + S \times h] \times \left(\frac{UCS}{S} + 8.99\right) + [(UCS - h) w - 2 S] \times (-9.54 + h) \times (7.9 - w). \tag{9}$$
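The zero-threshold rule applied to the SU and UF indexes can be sketched as follows. This is only an illustration: the function names are hypothetical, the index values are taken as given inputs (the sample values are those listed in Tables 5 and 6), and chaining the two checks into a single three-class label is an assumed way of combining them.

```python
STABLE, UNSTABLE, FAILED = 0, 1, 2  # class labels used in this paper


def su_status(su_index):
    """S-U check: SU >= 0 means unstable, SU < 0 means stable."""
    return UNSTABLE if su_index >= 0 else STABLE


def uf_status(uf_index):
    """U-F check: UF >= 0 means failed, UF < 0 means unstable."""
    return FAILED if uf_index >= 0 else UNSTABLE


def classify_pillar(su_index, uf_index):
    """Assumed cascade: run the U-F check only if the S-U check is not stable."""
    if su_status(su_index) == STABLE:
        return STABLE
    return uf_status(uf_index)


print(su_status(-67842.94))  # SU value of case 1 in Table 5
print(su_status(211870.53))  # SU value of case 2 in Table 5
print(uf_status(2717.67))    # UF value of case 1 in Table 6
print(uf_status(-6254.22))   # UF value of case 4 in Table 6
```

For the four sample values, the returned labels agree with the predicted classes listed in the tables (0, 1, 2, and 1, respectively).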

3.2 GEP models' assessment

Four assessment criteria were used to evaluate the performance of the GEP models: accuracy (AC), sensitivity (SE), specificity (SP), and Matthews correlation coefficient (MCC). Additionally, the receiver operating characteristic (ROC) curve was used. Table 3 describes the AC, SE, SP, and MCC criteria (Ghasemi & Gholizadeh, 2018). All of these criteria can be computed from the confusion matrix generated by the developed models. A confusion matrix summarizes the predictions in tabular form: it gives the number of correct and incorrect predictions for each category and thus shows which classes the model mistakes for one another (Tiwari, 2022). Table 4 displays a standard confusion matrix for binary predictions.

The ROC curve is an essential tool for evaluating the performance of classification models; it plots the TP rate against the FP rate. The ROC curve is widely utilized for comparing different models, and an indicator known as the area under the ROC curve (AUC) serves this purpose. AUC values range from 0 to 1, and the closer the value is to 1, the better the model's performance (Han et al., 2011).

To assess the performance of the developed models, 15% of the database (35 cases for the S–U model and 20 cases for the U–F model) was randomly selected as the testing subset. The predictions of the S–U and U–F models for these cases are displayed in Tables 5 and 6, respectively, and the confusion matrices for the predicted cases are presented in Figure 6. Based on the confusion matrix results, the values of MCC, SE, AC, and SP are 0.799, 0.908, 0.901, and 0.895 for the S–U training subset; 0.765, 0.692, 0.888, and 1 for the S–U testing subset; 0.761, 0.841, 0.888, and 0.917 for the U–F training subset; and 0.600, 0.667, 0.800, and 0.909 for the U–F testing subset. Furthermore, Figure 7 shows the ROC curves of the developed GEP models. Considering the performance evaluation criteria, which include MCC, SE, AC, SP, and the ROC curve, it can be concluded that the models developed using GEP perform well in predicting the pillar stability status.
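As an illustration of how an AUC value can be obtained, the sketch below uses the rank-based (pairwise) definition: the AUC equals the fraction of positive–negative pairs in which the positive case receives the higher score, with ties counted as one half. The function name and the toy labels and scores are hypothetical and are not taken from the database; the paper does not state how its AUC values were computed.

```python
def auc_from_scores(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    labels: 1 for the positive class, 0 for the negative class.
    scores: model outputs, where a higher score means "more positive".
    """
    positives = [s for lab, s in zip(labels, scores) if lab == 1]
    negatives = [s for lab, s in zip(labels, scores) if lab == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: three of the four pairs are ranked correctly.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(auc_from_scores(toy_labels, toy_scores))  # 0.75
```

A value of 1 corresponds to perfect separation of the two classes, and 0.5 to a ranking no better than chance.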

Table 3. Description of performance evaluation criteria (Ghasemi & Gholizadeh, 2018).

Criterion	Range^a	Description^b
Accuracy	0 to 1	It represents the percentage of data correctly assigned to the respective class. $A C = \frac{T P + T N}{T P + F P + T N + F N}$
Sensitivity	0 to 1	It represents the ratio of the number of data correctly set to the first class to total data in the first class. $S E = \frac{T P}{T P + F N}$
Specificity	0 to 1	It represents the ratio of the number of data correctly set to the second class to the total data in the second class. $S P = \frac{T N}{F P + T N}$
Matthews correlation coefficient	−1 to 1	It represents the correlation between the observed and predicted classifications. $M C C = \frac{(T P \times T N) - (F P \times F N)}{\sqrt{(T P + F N) \times (T P + F P) \times (T N + F P) \times (T N + F N)}}$

a The closer they are to the value of 1, the more accurate and reliable the developed model.
b TP is true positive (the model correctly predicts the positive class), FP is false positive (the model incorrectly predicts the positive class), TN is true negative (the model correctly predicts the negative class), and FN is false negative (the model incorrectly predicts the negative class).

Table 4. A typical confusion matrix.

Actual class	Predicted class
	First class	Second class
First class	True positive	False negative
Second class	False positive	True negative

Table 5. Prediction results of the stable–unstable (S–U) model based on the testing subset.

Case number	w (m)	h (m)	UCS (MPa)	S (MPa)	Actual status	S–U value	Predicted status
1	3.00	3.00	210.00	34.50	0	−67 842.94	0
2	10.50	6.00	240.00	92.00	1	211 870.53	1
3	5.30	3.80	94.00	55.00	1	43 591.64	1
4	15.00	30.00	100.00	38.00	1	3 422 759.29	1
5	4.60	2.70	210.00	72.40	0	−19 718.64	0
6	14.00	13.25	61.00	1.43	0	−90 113.06	0
7	12.20	27.40	150.00	17.20	1	939 331.63	1
8	4.70	5.00	172.00	93.50	1	284 620.92	1
9	6.60	3.80	94.00	40.00	1	3643.13	1
10	17.00	14.00	61.00	6.05	0	55 050.94	1
11	6.10	3.80	94.00	35.00	0	1480.31	1
12	4.60	3.80	94.00	56.00	1	50 498.04	1
13	5.40	5.00	172.00	93.50	1	274 651.47	1
14	1.90	3.80	94.00	58.00	1	63 259.50	1
15	6.10	5.50	210.00	27.60	0	−105 699.95	0
16	6.00	6.00	240.00	84.00	1	385 769.89	1
17	12.20	15.80	165.00	8.40	1	170 023.51	1
18	27.00	13.50	61.00	8.97	0	307 440.31	1
19	3.00	2.70	210.00	104.80	1	50 765.55	1
20	8.70	5.00	172.00	85.00	1	139 365.08	1
21	6.80	3.80	94.00	53.00	1	23 174.73	1
22	5.70	4.00	172.00	105.40	1	173 454.38	1
23	5.50	7.30	160.00	27.00	1	42 117.06	1
24	5.50	4.00	172.00	93.50	1	139 827.72	1
25	3.00	3.80	94.00	54.00	1	54 264.98	1
26	13.00	16.50	61.00	5.11	1	43 835.39	1
27	6.10	2.40	210.00	54.50	0	−84 780.45	0
28	16.25	11.00	104.00	4.27	0	−113 440.96	0
29	3.00	2.70	210.00	108.30	1	56 693.07	1
30	15.00	16.00	61.00	4.79	0	22 239.43	1
31	3.80	3.80	94.00	58.00	1	58 179.31	1
32	15.00	12.00	176.00	37.00	0	−198 640.27	0
33	7.20	4.00	172.00	56.70	0	−13 848.79	0
34	24.00	18.00	72.00	36.00	0	−1 536 551.64	0
35	7.30	6.00	240.00	67.00	1	179 214.68	1

Note: Code 0 indicates stable status and code 1 indicates unstable status.

Table 6. Prediction results of the unstable–failed (U–F) model based on the testing subset.

Case number	w (m)	h (m)	UCS (MPa)	S (MPa)	Actual status	U–F value	Predicted status
1	4.60	3.80	94.00	55.00	2	2717.67	2
2	12.50	15.20	160.00	17.80	2	20 691.22	2
3	3.80	3.80	94.00	59.00	2	8231.69	2
4	3.80	3.80	94.00	34.00	1	−6254.22	1
5	5.70	3.80	94.00	47.00	1	−6491.91	1
6	15.20	27.40	153.00	12.60	2	35 548.23	2
7	4.60	4.50	172.00	93.50	2	38 534.15	2
8	4.70	3.80	94.00	34.00	1	−11 405.37	1
9	16.00	16.25	61.00	6.32	1	−38 200.31	1
10	5.70	3.80	94.00	63.00	2	5160.06	2
11	5.60	4.00	172.00	93.50	2	9714.81	2
12	4.80	4.00	172.00	77.90	1	1905.46	2
13	14.00	28.00	90.00	49.00	2	96 759.21	2
14	15.00	27.00	176.00	28.00	1	143 369.12	2
15	14.25	18.00	104.00	5.33	1	−40 966.70	1
16	3.50	3.80	94.00	55.00	1	7335.35	2
17	4.70	5.00	172.00	93.50	2	53 612.83	2
18	27.00	40.00	176.00	38.00	2	−1 525 279.68	1
19	16.00	24.00	61.00	5.55	1	−38 533.38	1
20	10.70	18.30	215.00	9.40	2	237 421.98	2

Note: Code 1 indicates unstable status and code 2 indicates failed status.
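The criteria of Table 3 can be computed directly from confusion matrix counts, as in the following sketch. The counts used here were tallied by hand from Table 6, taking unstable (code 1) as the first class; this class order is an assumption, and the function name is hypothetical. With these counts, the results appear consistent with the U–F testing values reported in Section 3.2.

```python
import math


def binary_metrics(tp, fn, fp, tn):
    """Accuracy, sensitivity, specificity, and MCC as defined in Table 3."""
    ac = (tp + tn) / (tp + fp + tn + fn)
    se = tp / (tp + fn)
    sp = tn / (fp + tn)
    denominator = math.sqrt((tp + fn) * (tp + fp) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return {"AC": ac, "SE": se, "SP": sp, "MCC": mcc}


# Counts tallied from Table 6 (first class = unstable, second class = failed):
# 6 unstable cases predicted unstable, 3 unstable predicted failed,
# 1 failed predicted unstable, 10 failed predicted failed.
uf_test = binary_metrics(tp=6, fn=3, fp=1, tn=10)
for name, value in uf_test.items():
    print(name, round(value, 3))
```

The same function can be applied to the counts of any other two-class confusion matrix, such as the one for the S–U testing subset.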

3.3 Effect of input parameters on pillar stability

Figure 8 shows the importance of input parameters on the pillar stability status as determined by the developed models. The calculation logic is founded on assessing the impact of each parameter during the execution of genetic operators to achieve consistent fitness. GEP uses a stochastic method to calculate the importance of every variable in a model. This involves computing the importance of each model variable by randomizing its input values and subsequently determining the decrease in the R-square between the model output and the target. Following this, the results for all variables are normalized so that they add up to 1. As can be seen in Figure 8, w (pillar width) has the most significant impact on the pillar stability status in both GEP models compared to other input parameters. Furthermore, a sensitivity analysis based on the developed models was used to investigate the impact of parameters w, h, UCS, and S on the pillar stability status. This was achieved by varying one input parameter between the minimum and maximum values found in the database while keeping all other parameters constant at their average values from the database. The sensitivity analysis results of w, UCS, h, and S on pillar stability are shown in Figure 9. These results show that input parameters w and UCS are directly related to pillar stability. Conversely, the input parameters h and S display an inverse relationship with pillar stability. This means that on increasing w and UCS and reducing S and h, the stability of the pillar increases, which is consistent with engineering logic.
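The randomization-based importance measure described above can be sketched as follows. This is not the implementation used by the GEP software: the function names, the shuffling scheme, the clipping of negative drops to zero, and the toy model are all assumptions introduced for illustration.

```python
import random


def r_squared(targets, predictions):
    """Coefficient of determination between targets and predictions."""
    mean = sum(targets) / len(targets)
    ss_tot = sum((t - mean) ** 2 for t in targets)
    ss_res = sum((t - p) ** 2 for t, p in zip(targets, predictions))
    return 1.0 - ss_res / ss_tot


def permutation_importance(model, rows, targets, n_features, seed=0):
    """Normalized drop in R-square when each input column is shuffled."""
    rng = random.Random(seed)
    baseline = r_squared(targets, [model(r) for r in rows])
    drops = []
    for j in range(n_features):
        column = [r[j] for r in rows]
        rng.shuffle(column)  # randomize the values of feature j only
        shuffled = [r[:j] + (v,) + r[j + 1:] for r, v in zip(rows, column)]
        drop = baseline - r_squared(targets, [model(r) for r in shuffled])
        drops.append(max(drop, 0.0))  # assumed: negative drops are clipped
    total = sum(drops)
    return [d / total if total else 0.0 for d in drops]


# Hypothetical toy model that depends on the first input only.
def toy_model(row):
    return 3.0 * row[0]


toy_rows = [(float(i), float(i % 3)) for i in range(20)]
toy_targets = [toy_model(r) for r in toy_rows]
importances = permutation_importance(toy_model, toy_rows, toy_targets, 2)
print(importances)
```

In this toy case, the whole importance is assigned to the first input, because shuffling the second input leaves the model output unchanged.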

4 DECISION TREE-SUPPORT VECTOR MACHINE (DT-SVM) HYBRID ALGORITHM

SVM was originally developed to address classification problems where the primary objective is to identify an optimal hyperplane that effectively separates two classes. However, in reality, most problems involve multiple categories, which makes the traditional two-class support vector classification (SVC) unsuitable for multiclassification problems. To overcome this limitation, the multiclass problem is decomposed into multiple two-class SVCs. These decomposition methods fall into two groups: one-against-one (OAO) and one-against-all (OAA) algorithms (Abbaszadeh Shahri et al., 2022; Huang, Sun, et al., 2022; Ma & Guo, 2014).
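The one-against-one idea can be illustrated with a minimal voting sketch: one binary classifier is built for every pair of classes, and the class that collects the most votes is returned. The pairwise classifiers below are hypothetical threshold rules on a single made-up score and stand in for trained two-class SVCs; the source does not state which decomposition was used in this study.

```python
from collections import Counter
from itertools import combinations


def one_against_one_predict(x, classes, pairwise_classifiers):
    """Majority vote over all pairwise (two-class) classifiers.

    pairwise_classifiers maps a class pair (a, b) to a function that
    returns either a or b for the input x.
    """
    votes = Counter()
    for a, b in combinations(classes, 2):
        votes[pairwise_classifiers[(a, b)](x)] += 1
    # A tie is resolved in favor of the class that received a vote first.
    return votes.most_common(1)[0][0]


# Hypothetical stand-ins for trained two-class SVCs (0: stable,
# 1: unstable, 2: failed), each thresholding a single made-up score.
toy_classifiers = {
    (0, 1): lambda x: 0 if x < 1.0 else 1,
    (0, 2): lambda x: 0 if x < 1.5 else 2,
    (1, 2): lambda x: 1 if x < 2.0 else 2,
}

for score in (0.5, 1.2, 3.0):
    print(score, one_against_one_predict(score, [0, 1, 2], toy_classifiers))
```

For three classes, OAO requires three pairwise classifiers, whereas OAA would train one classifier per class against the remaining two.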

The decision tree is a tree-like model that is used to make decisions or predictions. It is a type of supervised learning algorithm that is mostly used for classification problems, but can also be used for regression problems. In a decision tree, every internal node signifies a feature or attribute, while each leaf node signifies a class label. The tree is constructed by recursively splitting the data into subsets based on the value of the features until the leaf nodes are pure (i.e., all data points in the leaf node belong to the same class) or a stopping criterion is met. After constructing the decision tree, it can be utilized to predict the class label of a new data point by following a path from the root node to a leaf node, guided by the values of its features (Tien Bui et al., 2014).

The DT-SVM hybrid algorithm is a machine learning approach that combines the SVM and DT algorithms to improve the accuracy and robustness of classification tasks (Chen et al., 2011; Li, 2023). In this algorithm, DT is used to identify the most important features and to partition the data into subsets based on their values. Then, SVM is used to build a model for each subset to improve the classification accuracy. The final prediction is made by combining the results from all the SVM models. The main advantage of this hybrid approach is that it combines the strengths of both DT and SVM. DT is effective in handling large and complex data sets, identifying important features, and creating rules for classification. On the other hand, SVM is known for its high accuracy and robustness to noise and outliers. By combining the two algorithms, the DT-SVM hybrid algorithm can produce a more accurate and robust model. Currently, hybrid algorithms are increasingly being used to create highly accurate and efficient models, reducing the influence of typical uncertainties in model creation (Huang et al., 2021; Huang, Zhou, et al., 2022; Sun, Li, Zhang, Huang, 2021; Sun, Li, Zhang, et al., 2021).

4.1 DT-SVM model development

To develop the DT-SVM model, w, h, UCS, and S are entered into the algorithm as input parameters and the pillar stability status is entered as an output parameter. The algorithm was implemented in the MATLAB 2020 software environment and consists of four steps as follows (Chen et al., 2011):

1. Load and partition the data: In this step, the database comprising four input parameters and one output parameter is entered, after which it is randomly divided into training and test data sets in a ratio of 85%:15%. The output parameter is divided into three classes: 0 (stable), 1 (unstable), and 2 (failed).
2. Train the DT: During this step, the DT algorithm is applied to the training data. The algorithm recursively partitions the data based on the most informative feature at each node, resulting in a tree-like structure that can be used for making predictions. Figure 10 shows the structure of the developed DT for the evaluation of pillar stability status.
3. Train the SVM: In this step, the SVM classifier algorithm with a linear kernel is applied to the training data. The SVM algorithm aims to find the hyperplane that maximally separates the data points of different classes in the feature space. With a linear kernel, the kernel function is simply the inner product of the input vectors, so no nonlinear mapping to a higher-dimensional space is applied and the decision boundary is a hyperplane in the input space. A linear kernel can therefore effectively classify data that are linearly separable, or nearly so, in the input space; when the classes overlap, a soft margin allows some training points to be misclassified (Shawe-Taylor & Sun, 2011). As an example, Figure 11 shows the 2D structure of the developed SVM based on w and h for the evaluation of pillar stability status.
4. Hybrid model: The DT and SVM models are combined to create an ensemble model that takes advantage of the strengths of both models. For this purpose, DT is used to train SVM, which allows the SVM to make more informed decisions.

4.2 DT-SVM model assessment

To assess the performance of the DT-SVM model, the same four evaluation criteria as in Section 3.2 (MCC, SE, AC, and SP) were used together with the AUC. The four criteria were calculated using the confusion matrix of the developed model (Figure 12) for both training and testing subsets. It should be noted that, to evaluate the performance of the developed DT-SVM model, 15% of the database was randomly selected by MATLAB software. Figure 13 shows the ROC curve of the developed model. According to the confusion matrix results, the overall MCC, SE, AC, and SP for the DT-SVM training subset are 0.992, 0.993, 0.996, and 0.997, respectively, and those for the testing subset are 0.767, 0.842, 0.914, and 0.929, respectively. Based on the evaluation criteria for performance including MCC, SE, AC, SP, and AUC, the model developed using DT-SVM performs very well in predicting pillar stability status.

5 COMPARISON WITH OTHER MODELS

Finally, the accuracy of the developed GEP and DT-SVM models is compared with that of other models in this field (Table 1). The comparison considers both the size of the database used and the accuracy of each model, which is evaluated on the specific database utilized in each study. Figure 14 compares the accuracy of the GEP and DT-SVM models with that of some previously developed models. As shown in the figure, the developed GEP and DT-SVM models surpass their counterparts in terms of accuracy and database size, with overall accuracies of 0.894 and 0.940, respectively. Some models, such as those of Tawadrous and Katsabanis (2007) and Monjezi et al. (2011), combine a large database with high accuracy, but because they rely on black box methods, they cannot readily be used in different projects. Other models have high accuracy but are limited in scope by a small database (such as Zhou et al. [2011] and Ahmad et al. [2020]), which results in high uncertainty and makes them unsuitable for use in projects. Therefore, the GEP and DT-SVM models, with their comprehensive database and white box formulation, have the highest accuracy compared to other models and can be utilized in various projects. Furthermore, Figure 15 compares the actual class with the predictions made by the developed models. According to this figure, the DT-SVM model assesses pillar stability more accurately and with fewer errors, whereas the GEP model tends to predict instability (unstable or failed) conservatively, resulting in a higher rate of incorrect predictions.

There are two different phases for using the developed GEP and DT-SVM models:

1.
Operational phase:
During this phase, the mining operation is carried out with known pillar dimensions. Equation (8) of the GEP model is therefore used to verify the stability status. If the pillar is found to be unstable, Equation (9) of the GEP model is used to check whether it is unstable or failed, and appropriate support measures are implemented accordingly. For a more comprehensive investigation, the DT-SVM model is also used to determine the class of the pillar, and the final decision is made based on the three defined classes (stable, unstable, failed).
2.
Design phase:
In this phase, the minimum stable dimensions of the pillar are determined using Equation (8) of the GEP model. The dimensions are chosen such that the value of the equation becomes negative, indicating stability. The obtained dimensions are then fed into the DT-SVM model to assess the pillar condition. If the DT-SVM model also classifies the pillar as stable, these dimensions are accepted as the stable dimensions of the pillar. In other words, the pillar dimensions should be chosen so that both models indicate a stable state.
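
The two phases above can be sketched as a small decision routine. This is only an outline under stated assumptions: `gep_su`, `gep_uf`, and `dtsvm_classify` are hypothetical callables standing in for Equation (8), Equation (9), and the trained DT-SVM model, and the toy functions are invented purely to make the example runnable. The sign convention for Equation (9) (negative meaning unstable rather than failed) is also an assumption.

```python
def operational_check(gep_su, gep_uf, dtsvm_classify, w, h, ucs, s):
    """Return (GEP status, DT-SVM class) for an existing pillar."""
    if gep_su(w, h, ucs, s) < 0:        # Equation (8): negative means stable
        gep_status = "stable"
    elif gep_uf(w, h, ucs, s) < 0:      # Equation (9): sign convention assumed
        gep_status = "unstable"
    else:
        gep_status = "failed"
    return gep_status, dtsvm_classify(w, h, ucs, s)


def design_min_width(gep_su, dtsvm_classify, h, ucs, s, candidate_widths):
    """Return the smallest candidate width that both models call stable."""
    for w in sorted(candidate_widths):
        if gep_su(w, h, ucs, s) < 0 and dtsvm_classify(w, h, ucs, s) == "stable":
            return w
    return None  # no candidate satisfies both models


# Toy stand-ins (NOT the paper's equations or trained model).
def toy_su(w, h, ucs, s):
    return s / ucs - 0.3 * w / h


def toy_uf(w, h, ucs, s):
    return s / ucs - 0.6 * w / h


def toy_dtsvm(w, h, ucs, s):
    ratio = w / h
    if ratio >= 1.5:
        return "stable"
    return "unstable" if ratio >= 0.8 else "failed"


print(operational_check(toy_su, toy_uf, toy_dtsvm, w=4, h=4, ucs=100, s=40))
print(design_min_width(toy_su, toy_dtsvm, h=4, ucs=100, s=40,
                       candidate_widths=[3, 4, 5, 6, 7, 8]))
```

Requiring agreement between both models makes the design check conservative: a width is accepted only if neither model flags instability.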

6 ADVANTAGES AND LIMITATIONS OF THIS STUDY

This study attempts to develop two robust models to predict pillar stability in underground hard rock mines. It is based on a comprehensive database comprising 236 case histories from seven underground hard rock mines. The database was used to implement two methods widely used in engineering research, namely, GEP and DT-SVM, in order to develop predictive models for assessing pillar stability status. To evaluate the performance of the developed models, 15% of the primary database was designated as a testing subset. This subset is unseen by the models, as it was not involved in the training process. The performance evaluation results indicated superior performance compared to other existing models. Consequently, project managers can effectively utilize the models developed in this study during both the design and operational phases. Nevertheless, despite their superior accuracy and performance, the developed models still have certain limitations. The models were developed using a limited database and should thus only be used within the specified range of input parameters. Utilizing them for projects that fall outside the parameter ranges of this study might lead to errors. The models will become more comprehensive as further data are incorporated. If the models are applied to conditions that are not represented in the database, their output should be treated as an initial estimate only; the new conditions should then be added to the database and the models updated. Other variables that affect pillar stability, such as underground water conditions, excavation parameters, and discontinuities, were not considered. Collecting and integrating these parameters would bring the models closer to reality.
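
One simple way to respect this limitation in practice is to flag inputs that fall outside the range covered by the training database before using either model. The sketch below is illustrative only: the function name is an assumption, and the numeric ranges are hypothetical placeholders, not the ranges of the database used in this study.

```python
def out_of_range(inputs, ranges):
    """Return the names of inputs that lie outside their (min, max) range."""
    return [name for name, value in inputs.items()
            if not (ranges[name][0] <= value <= ranges[name][1])]


# Hypothetical placeholder ranges; replace with the actual database limits.
HYPOTHETICAL_RANGES = {
    "w": (2.0, 45.0),       # pillar width
    "h": (2.0, 60.0),       # pillar height
    "UCS": (60.0, 320.0),   # pillar strength
    "S": (10.0, 130.0),     # pillar stress
}

pillar = {"w": 10.0, "h": 5.0, "UCS": 400.0, "S": 50.0}
flagged = out_of_range(pillar, HYPOTHETICAL_RANGES)
print(flagged)  # any name listed here means the prediction is an extrapolation
```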

7 SUMMARY AND CONCLUSIONS

This study introduces a novel approach to evaluating the stability of underground mine hard rock pillars by using the GEP and DT-SVM techniques. This is the first attempt to use these methods for the initial status estimation of mine pillars. A database comprising 236 case histories from various underground hard rock mine projects was utilized in developing the GEP and DT-SVM hybrid predictive models. To assess the performance of the developed models, four performance evaluation criteria (MCC, AC, SE, and SP) were used with the test database. Additionally, the ROC curve was utilized, and the accuracy of the developed models was compared with that of some previous models. The most significant findings of this study are as follows:

1.
The outcomes of the performance criteria (MCC, AC, SE, and SP) on the test data set indicate that the developed models show satisfactory performance. The training-stage accuracy is also high (90.05% for the S–U GEP model, 88.79% for the U–F GEP model, and 99.6% for the DT-SVM model). Consequently, the models created for assessing pillar stability through the GEP and DT-SVM techniques have demonstrated outstanding performance. They can be used during the mine design phase before excavation and throughout the mining operation to identify and provide early warnings of potential pillar instability or failure.
2.
In the comparison with other models, the models created in this study outperformed their counterparts. This superiority can be attributed to the comprehensiveness of the database, the white box nature of the models, low uncertainty, and high accuracy. As a result, project managers can easily utilize these models to assess the pillar stability status during both the mine design stage and mine operation.
3.
The sensitivity analysis of input parameters based on the GEP model showed that increasing the pillar width (w) and pillar strength (UCS) and reducing the pillar height (h) and pillar stress (S) lead to an increase in pillar stability. In simpler terms, w and UCS have a direct relationship and h and S have an inverse relationship with pillar stability.

Using the developed GEP and DT-SVM models, it is possible to assess the stability of the pillar during mining operations. In the event of instability, appropriate support system measures can be implemented based on the identified instability status. Furthermore, during the pillar design phase, the two developed models can be used to determine the minimum stable dimensions of the pillar. The aim is to find dimensions for the pillar that satisfy the stability criteria of both models, demonstrating a stable state of the pillar according to both the GEP and DT-SVM models.

AUTHOR CONTRIBUTIONS

Mohammad H. Kadkhodaei: Writing—original draft; methodology; validation; software; visualization. Ebrahim Ghasemi: Conceptualization; supervision; review and editing; formal analysis. Jian Zhou: Data collection; discussion; formal analysis; review and editing. Melika Zahraei: Data preparation; software; writing—original draft. All authors read and approved the final manuscript.

ACKNOWLEDGMENTS

The authors have no funding to report.

CONFLICT OF INTEREST STATEMENT

The authors declare no conflict of interest.

Biography

Mohammad H. Kadkhodaei is a PhD candidate at Isfahan University of Technology, Iran. He obtained his BS degree in mining engineering from Isfahan University of Technology in 2017 and his MS in mine exploitation from the same university in 2019. His research interests lie in the fields related to mining and underground excavation, geotechnical risk assessment, and data mining. During the course of his PhD research, he conducted various research projects in the field of underground mining and published several articles. Additionally, during this period, he served as a reviewer for research papers related to his field of interest in various journals.