Real-time lithology identification while drilling based on drill cuttings image analysis with ensemble learning


Abstract

Accurate lithology identification through geological exploration is crucial for hazard risk management during deep underground operations. Artificial intelligence has advanced in image recognition but using it to analyze underground drill cuttings for accurate lithology remains challenging. Issues include imprecise sampling control, harsh environments, and inconsistent image acquisition procedures, all leading to poor image quality. To address these issues, a lithology identification while drilling method was proposed. A cuttings sampling, testing, and transporting system was developed and deeply integrated with the drilling rig, achieving automation in cuttings sampling operations while standardizing the timing, procedures, and environment for sampling. A cuttings image preprocessing method was proposed, which meets the requirements of machine learning for image dimensions while enabling the automatic calculation of the proportions of different lithological particles. This is highly significant for accurately determining stratigraphic interfaces. An ensemble learning method was applied to enhance the identification accuracy. Underground trials were conducted at a coal mine in Huainan, China, involving the construction of four boreholes and the acquisition of more than a thousand cuttings images. During the trials, the system cooperated with the drilling rig to realize the accurate identification of lithology information during drilling, with an accuracy of 97.42% and an average processing time of less than 0.11 s per image. The results showed that the proposed lithology identification method can accurately obtain formation lithology in real time during drilling. This study guides drilling operations, ensuring target area coverage, effective hazard management, and supporting unmanned drilling technology development.

Highlights


  • Developed a complete system using ensemble learning to analyze drill cuttings images and provide real-time lithology information during drilling.

  • The cuttings sampling, testing, and transporting system ensures standardized cuttings sampling and high-quality image acquisition.

  • Provided an image preprocessing method that meets machine learning requirements, enabling automated calculation of lithology particle percentages.

  • Successfully applied in underground drilling, enhancing efficiency.


1 INTRODUCTION

Drilling is a primary technical method for deep underground hazard management, mineral resources survey, and exploration. With the advancement of drilling equipment and construction technologies, underground drilling operations have achieved basic automation and are progressing toward higher levels of automation and intelligence (Gao, 2023; Li et al., 2021). Automated leveling and stabilization, automatic rod handling, and autonomous navigation-based drilling technologies are increasingly being applied in actual production. These technologies significantly reduce worker labor intensity, decrease the number of personnel required for drilling operations, and improve operational safety. In the pursuit of higher levels of automation and intelligence in drilling operations, automated sensing technologies, particularly formation lithology identification, remain critical challenges that must be addressed (Zhang et al., 2021, 2022). Automated and accurate identification of formation lithology not only helps determine the distribution of resources and potential geological anomalies but also provides data to optimize drilling parameters for different formations during automated construction (Nautiyal & Mishra, 2023). This optimization enhances efficiency and reduces the risks of borehole hazards such as borehole collapse and drill jamming.

Researchers have conducted in-depth investigations into rapid lithology identification during drilling operations, focusing on gathering and analyzing various types of data produced during borehole drilling to extract information about the geological formations encountered (Ge et al., 2023; Yue & Yang, 2022). The mechanical characteristics and drillability of different geological layers vary significantly, leading to distinct patterns in the drilling data associated with different formations (Che et al., 2016; Yu et al., 2021). By utilizing an integrated measurement while drilling (MWD) system on the rig, it becomes possible to continuously monitor drilling data in real time, which facilitates the assessment of formation drillability (Hansen et al., 2024; Shi et al., 2020; Yang et al., 2020), the specific energy required for rock breaking (Lakshminarayana et al., 2021; Liang et al., 2022), the rock quality index (Guo & Gu, 2024; Yin & Liu, 2001), and other relevant parameters. These metrics can be effectively employed to infer formation characteristics. By capturing and analyzing the vibration signals from drilling tools, considering their time-domain, frequency-domain, and time-frequency characteristics, researchers have shown that identifying lithology is feasible under controlled lab conditions (Fang et al., 2024; Qin et al., 2018; Wang et al., 2023). Similarly, by using acoustic sensors to record sounds generated as the drill bit cuts through different seams and analyzing the spectral features of these signals, real-time lithology identification can be achieved, particularly in specific cases (Du et al., 2023; Khoshouei et al., 2022). Moreover, collecting core sample images during drilling and extracting their visual features such as color, shape, and texture has proven to be another effective method for lithology identification (Hou et al., 2023; Lin et al., 2023; Lu et al., 2023). 

With the development of computer technology, machine learning and deep learning are being increasingly used in industrial production. Kadkhodaie-Ilkhchi et al. (2010) demonstrated the applicability of boosting, neural networks, and fuzzy systems to predict rock types from MWD data. Akyildiz et al. (2023) developed machine learning models using MWD data to predict marble quality classes. Fernández et al. (2023) used drill monitoring technology, discontinuity index, and machine learning techniques to recognize rock mass structures in underground mining, improving predictive models for discontinuities and rock quality characterization. Although automated real-time lithology identification based on drilling data, drill string vibrations, and acoustic signals has been shown to reduce human error and improve identification efficiency, it faces many challenges when applied deep underground in coal mines. Drilling data are easily influenced by factors such as the mechanical properties of the rock, formation stress, and drill string friction resistance. Noise from the environment and the rig interferes with the vibration and sound signals being collected. Extracting useful data under such high interference and correlating it with formation information is challenging, which makes accurate analysis of formation data difficult in practical applications.

Automatic lithology identification based on image recognition has distinct advantages in certain situations, as it is not affected by changes in strata stress or other variable conditions. Li et al. (2025) developed a data-driven lithology classification model for New Austrian Tunneling Method tunnels by integrating borehole imaging with MWD parameters, employing synthetic sampling, grid search hyperparameter optimization, and extreme gradient boosting to address multidimensional imbalance while leveraging borehole imaging for ground-truth validation of rock types. Fernández et al. (2024) proposed an image-augmented machine learning framework for geological interpretation, combining hyperspectral imaging with MWD data analysis to resolve class imbalance issues in automated lithology identification systems. However, most image-based studies of stratigraphic information are still limited to laboratory environments or noncoal mine drilling scenarios, focus on offline analysis of the collected data, and have not yet realized real-time identification of geological formations during deep underground drilling in coal mines (Grossi et al., 2024; Singh et al., 2023). In addition, existing identification systems have not established data communication with the drilling rig to provide real-time guidance during drilling operations. Moreover, in underground coal mines, low lighting, dust pollution, and variations in image capture timing, environment, procedures, and camera parameters can reduce identification accuracy and lead to poor model generalization.

In this study, we focused on lithology identification based on drill cuttings images and deep learning and proposed a real-time lithology identification method for use during the drilling process. The cuttings sampling, testing, and transporting system (CSTT system) was developed, achieving automatic image sampling and lithology analysis of drill cuttings. The image capture procedures, parameters, and lighting conditions were standardized to reduce the impact of image quality on model identification accuracy. A data exchange channel was established between the CSTT system and the drilling rig, integrating drill cuttings sampling into the drilling process and ensuring consistent sampling times and constant intervals between sampling locations. A cuttings image preprocessing method was proposed, which meets the requirements of machine learning for image dimensions while enabling the automatic calculation of the proportions of different lithological particles. The methods and supporting equipment presented in this study were tested in underground coal mines, demonstrating their ability to meet the requirements for accurately identifying stratigraphic information in real time. The development and industrial application of this system not only replace traditional manual lithology identification methods and enhance accuracy but also allow smart drilling rigs to detect stratigraphic lithology information. This advancement paves the way for the development of additional automation features.

2 METHODOLOGY

2.1 Overview

The study proposes a real-time lithology identification method and develops the CSTT system and smart drilling rig to support this approach, as shown in Figure 1. The CSTT system features automatic cuttings sampling, image acquisition, analysis, and transport functions, and it works in coordination with the smart drilling rig. It collects the cuttings and analyzes their lithology to identify formation information. The smart drilling rig performs automated drilling operations and, at specific stages of the drilling process, sends sampling commands to the CSTT system according to preprogrammed procedures. After the identification is completed, the CSTT system transmits the identification results to the smart drilling rig, where they are displayed on the rig's control system screens and used as the basis for the next stage of drilling.

Figure 1. Lithology identification while drilling based on drill cuttings analysis.

In the initial phase, several demonstration boreholes need to be drilled to collect sufficient data for training the machine learning model. The trajectory planning of the demonstration boreholes should cover typical formations and potential geological structures that need to be identified later in the mining area. During drilling, the smart drill rig coordinates with the CSTT system to collect cuttings image data at specific stages according to a predesigned frequency. The raw images of collected drill cuttings were preprocessed using a preprocessing method for drill cuttings particle images proposed in this paper. The preprocessed drill cuttings images are then labeled based on borehole resurvey data and logging data. Using transfer learning, the pretrained deep learning network is further trained to build a lithology identification model. It then undergoes an iterative process of training, adjustment, and evaluation to ensure accurate classification of the drill cuttings image samples. Logging and borehole re-survey data are used to validate and optimize the model. At this stage, a soft voting ensemble learning method based on the multitraining model is applied to enhance the model's feature extraction capability from the cuttings, ensuring high identification accuracy.

After completing the demonstration boreholes and the testing and optimization of the model, it will be integrated into the control system of a smart drilling rig. Underground drilling tests will then be conducted to fully verify the reliability and accuracy of the system. Finally, the qualified model will be scaled and integrated into multiple smart drilling rigs to enable real-time identification and recording of formation changes during drilling operations. The data flow of the lithology identification method is shown in Figure 2. The functions of the main equipment in the system are shown in Figure 3.

Figure 2. The data flow of the lithology identification method. (a) Model training phase and (b) model application phase.
Figure 3. The functions of the main equipment in the system.

2.2 The CSTT system

Analyzing the shape, color, composition, and weight of drill cuttings is one of the most direct and effective ways to obtain formation information. Automating the sampling and analysis of drill cuttings raises the automation level of the operations that support drilling and improves the timeliness and accuracy of stratigraphic information acquisition.

2.2.1 Automated sampling, testing, and transporting of drill cuttings

In this study, a CSTT system was developed, as shown in Figure 4. The system can be deeply integrated and works in coordination with the smart drilling rig to realize the collection of drill cuttings, automatic sampling, weighing, image acquisition, and lithology analysis. Additionally, it can help with long-distance transportation of drill cuttings.

Figure 4. Schematic structure of the drill cuttings detection and transport device.

The CSTT system consists of a flow meter, a drill cuttings collection device, an image capture device, a sampling robotic arm, a weighing device, and a drill cuttings conveyor. The flow meter is installed on the drill rig to monitor the flow rate of the drilling fluid entering the borehole. The drill cuttings collection device is installed at the head of the borehole to collect the drilling fluid and cuttings flowing out of the borehole, which are then transported through pipes to the drill cuttings detection and transport device. A sampling robot arm collects the drill cuttings samples and sends them to an image acquisition device for image sampling. The captured images are sent to a data processing unit to analyze their lithology and store them. The weighing device is used to obtain drill cuttings weight information. The drill cuttings conveyor is responsible for the conveying of drill cuttings.

2.2.2 Consistency in sampling timing

Accurate control of drill cuttings sampling locations is the foundation for spatial correspondence between lithology identification results and sampling locations. The precise control of sampling position and sampling timing is realized by deeply integrating the CSTT system and the smart drilling rig. The workflow of the CSTT system is illustrated in Figure 5. When the system is activated, the CSTT system first establishes communication with the smart drill rig control system. The drilling process detection system consists of various sensors responsible for monitoring the drill rig's operational status and determining when drill cuttings sampling is needed.

Figure 5. The work process of the cuttings sampling, testing, and transporting system.

2.2.3 Consistency of the image acquisition environment

The high humidity, dust, and darkness of the deep underground environment can degrade the quality of acquired drill cuttings images. To ensure consistency in the imaging environment, an image capture device was developed, as shown in Figure 6. This device creates a relatively enclosed environment between the camera and the sampling tray and provides independent lighting to eliminate the impact of ambient light. In addition, the camera uses the same shooting parameters for each capture, such as the shooting environment, shooting angle, camera settings, and exposure time, further ensuring consistency in the drill cuttings image sampling environment.

Figure 6. The image acquisition device.

2.2.4 Methods to prevent mixing of drill cuttings from different formations

To minimize the impact of mixed cuttings from different formations on identification accuracy during drill cuttings image collection, this study proposes the following methods. (1) After each drill rod has been completely drilled into the strata, the drilling rig is kept rotating for a period to flush the borehole thoroughly with drilling fluid, expelling any residual cuttings inside the borehole. (2) To avoid the mixing of cuttings from different layers caused by the on-off water operations during the installation of new drill rods, cuttings are sampled when half the length of each drill rod has been drilled into the borehole. (3) After the image of the cuttings in the sampling tray has been captured, and before the next sampling process begins, the tray is flipped over by the flipping device, as shown in Figure 7, so that the cuttings in the tray are poured into the weighing device below. Even if a small amount of residual drill cuttings remains in the sampling tray, it will be covered by the new inflow of drill cuttings and will not be captured by the image acquisition device.

Figure 7. Functions of the sampling robot arm in each position. (a) Drill cuttings sampling position, (b) drill cuttings dumping position, (c) image acquisition position, and (d) next drill cuttings sampling position.
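
The half-rod sampling rule described in this section can be illustrated with a short sketch. It assumes that every rod has the same length and that sampling is triggered exactly at the midpoint of each rod. The function name and the 0.75 m rod length are hypothetical values chosen only for illustration.

```python
def sampling_depths(rod_length_m, n_rods):
    """Return the borehole depths (m) at which cuttings would be sampled.

    Assumption: rod k (1-based) is sampled when half of it has been drilled,
    i.e. at a depth of (k - 0.5) * rod_length_m.
    """
    if rod_length_m <= 0 or n_rods < 0:
        raise ValueError("rod length must be positive and rod count non-negative")
    return [(k - 0.5) * rod_length_m for k in range(1, n_rods + 1)]


# Hypothetical example: four rods of 0.75 m each.
depths = sampling_depths(0.75, 4)
print(depths)  # [0.375, 1.125, 1.875, 2.625]
```

Under these assumptions, consecutive sampling locations are always one rod length apart, which is what gives a constant interval between sampling positions.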

2.3 Preprocessing method for drill cuttings particle images

The raw data collected are high-resolution images of drill cuttings. Traditional methods, when using transfer learning to train a pretrained network, require resizing the images to the desired dimensions, such as 224 pixels by 224 pixels. This resizing can lead to a significant loss of valuable information. Since drill cuttings are usually a mixture of crushed particles of different materials, knowing the proportion of different materials in the drill cuttings is more meaningful for obtaining the exact location of the stratigraphic interface and guiding the drilling operation.

2.3.1 Preprocessing methods for model training data

To meet the image size requirements of the machine learning model and to capture the proportion of different lithology samples in the drill cuttings, the original drill cuttings image will be cropped in the preprocessing stage to divide it into multiple blocks of the desired size for the model while removing extraneous information such as container edges. Each block after cropping will be numbered sequentially, as shown in Figure 8.

Figure 8. Cropping of drill cuttings images.
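
The cropping step can be sketched as a computation of block coordinates. This is a minimal illustration, not the authors' implementation: the image size, the margin used to discard container edges, and the row-by-row numbering order are all assumptions.

```python
def crop_boxes(width, height, block=224, margin=0):
    """Return (index, left, top, right, bottom) for every full block.

    Assumptions: `margin` pixels are discarded on every side (for example,
    container edges), partial blocks at the right and bottom are dropped,
    and blocks are numbered row by row starting from 1.
    """
    cols = max(0, width - 2 * margin) // block
    rows = max(0, height - 2 * margin) // block
    boxes = []
    for r in range(rows):
        for c in range(cols):
            left = margin + c * block
            top = margin + r * block
            boxes.append((r * cols + c + 1, left, top, left + block, top + block))
    return boxes


# Hypothetical 1000 x 700 pixel image with a 50-pixel border removed.
boxes = crop_boxes(width=1000, height=700, block=224, margin=50)
print(len(boxes))  # 8 blocks: 4 columns x 2 rows
print(boxes[0])    # (1, 50, 50, 274, 274)
```

Each tuple can be passed to any image library's crop routine. Because every block keeps its native resolution, no downscaling of the raw image is needed.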

After cropping the images, the next step is to label the lithology of each image. In this step, we use specialized borehole measurement equipment to resurvey the boreholes. These results will be used in conjunction with borehole logging data and known geologic information to obtain formation lithology information at the location where each drill cuttings image was acquired. This lithology information will be used to label the lithology of the image blocks, as shown in Figure 9. The image blocks are then divided into different groups based on lithology to form the model training data set.

Figure 9. Labeling of drill cuttings images.

Notably, before cropping and labeling the original images, drill cuttings image samples collected from the interface of two geological layers were excluded from the training data set. This step was taken to reduce the impact of mixed cuttings on the lithology labeling accuracy of the training samples.

There is significant variation in the size and orientation of rock particles within cuttings images. It is impossible for the training image set to cover all possible scenarios. Therefore, data augmentation (including image enhancement, image transformation, etc.) is commonly used to increase the number of training samples. By applying various transformations to the images, we can simulate changes in the field environment and expand the recognition model's adaptability. To avoid increasing the storage space required for the data set, runtime augmentation is typically employed. This approach involves applying a certain transformation or combination of transformations to an original training image with a given probability when it is read. The transformed image is then added to the current training batch for model training. The transformed image is not saved after the training session concludes.

Based on the characteristics of drill cuttings images, it is essential to ensure that the transformed images still retain the fundamental features of the drill cuttings. Therefore, only a subset of effective transformation functions is selected. With an 80% probability, an image undergoes one or more of the following geometric transformations: horizontal flip, vertical flip, rotation, rescaling, and translation. The parameters for these transformations are randomly selected within a specified range, as shown in Table 1.

Table 1. Augmentation options.
Method                                    Value
Random reflection axis                    X and Y
Random rotation (°)                       Min: −180; Max: 180
Random rescaling                          Min: 0.5; Max: 2.0
Random horizontal translation (pixels)    Min: −100; Max: 100
Random vertical translation (pixels)      Min: −100; Max: 100
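
The runtime augmentation policy can be sketched as a parameter sampler. The ranges are copied from Table 1 and the 80% probability from the text. The way transformations are combined (a uniformly sized random subset) and all names are assumptions made for illustration.

```python
import random

# Parameter ranges copied from Table 1. How the transforms are combined is an
# assumption for illustration, not the authors' implementation.
SAMPLERS = {
    "reflect_x": lambda rng: True,
    "reflect_y": lambda rng: True,
    "rotation_deg": lambda rng: rng.uniform(-180.0, 180.0),
    "scale": lambda rng: rng.uniform(0.5, 2.0),
    "shift_x_px": lambda rng: rng.randint(-100, 100),
    "shift_y_px": lambda rng: rng.randint(-100, 100),
}


def sample_augmentation(rng, p_apply=0.8):
    """Draw the transformations for one image read.

    Returns an empty dict (image used unchanged) with probability
    1 - p_apply; otherwise a non-empty random subset of the transforms,
    each with a parameter drawn from its Table 1 range.
    """
    if rng.random() >= p_apply:
        return {}
    names = list(SAMPLERS)
    chosen = rng.sample(names, rng.randint(1, len(names)))
    return {name: SAMPLERS[name](rng) for name in chosen}


rng = random.Random(0)  # fixed seed only to make the sketch repeatable
draws = [sample_augmentation(rng) for _ in range(2000)]
applied_fraction = sum(1 for d in draws if d) / len(draws)
print(round(applied_fraction, 2))  # close to 0.8
```

The sampled parameters would be applied to an image as it is read into a training batch and then discarded, so no augmented copies are written to disk.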

2.3.2 Preprocessing methods for normal construction data

After the model is trained, optimized, and integrated into the drilling rig system, the preprocessing method for raw drill cuttings images is slightly different during normal drilling operations. The drill cuttings samples consist of irregular rock particles with randomly distributed orientations. To enable the recognition model to comprehensively extract features from the drill cuttings images, the collected image data are cropped, as shown in Figure 10. Each image block is randomly rotated, creating multiple new image blocks. These newly generated image blocks, along with the original image block, form the lithology testing data set for that specific image block.

Figure 10. Preprocessing methods for normal construction data.

The lithology identification result for each image block was determined by selecting the class with the highest weighted average probability in the test data set. The proportion of image blocks with the same lithology type, relative to the total number of blocks, was used to determine the distribution ratio of that lithology within the drill cuttings.
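
The block-level decision and the proportion calculation can be sketched as follows. The probabilities are invented for illustration. Equal weights across the original block and its rotated copies are an assumption, as are the function names.

```python
from collections import Counter


def classify_block(prob_rows, weights=None):
    """Pick the class with the highest weighted average probability.

    prob_rows holds one {class: probability} dict per copy of the block
    (the original plus its rotated versions). Equal weights are assumed
    when none are given.
    """
    n = len(prob_rows)
    weights = weights or [1.0 / n] * n
    classes = list(prob_rows[0])
    avg = {c: sum(w * row[c] for w, row in zip(weights, prob_rows)) for c in classes}
    return max(avg, key=avg.get)


def lithology_proportions(block_labels):
    """Share of image blocks assigned to each lithology."""
    counts = Counter(block_labels)
    return {label: n / len(block_labels) for label, n in counts.items()}


# Hypothetical image cropped into four blocks, each with one rotated copy.
blocks = [
    [{"coal": 0.90, "rock": 0.10}, {"coal": 0.70, "rock": 0.30}],
    [{"coal": 0.40, "rock": 0.60}, {"coal": 0.20, "rock": 0.80}],
    [{"coal": 0.55, "rock": 0.45}, {"coal": 0.30, "rock": 0.70}],
    [{"coal": 0.10, "rock": 0.90}, {"coal": 0.20, "rock": 0.80}],
]
labels = [classify_block(rows) for rows in blocks]
proportions = lithology_proportions(labels)
print(labels)       # ['coal', 'rock', 'rock', 'rock']
print(proportions)  # {'coal': 0.25, 'rock': 0.75}
```

In the third block the unrotated copy alone would favor coal, but the average over both copies favors rock, which is the effect the rotated copies are meant to provide.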

2.4 Soft voting ensemble learning method based on multitraining models

Training machine learning models is a stochastic process, which means that each training run may yield a different model. This randomness causes variations in identification accuracy when applied to different test datasets. Some models may excel in identifying certain features of the drill cuttings images, while others may struggle. To further enhance identification accuracy, this study utilizes the soft voting ensemble learning algorithm to create a multiclassifier system, as depicted in Figure 11.

Figure 11. Schematic diagram of the ensemble learning model for lithology identification.

2.4.1 Soft voting ensemble learning

Soft voting is a widely utilized ensemble learning technique, especially for classification tasks (Khan et al., 2023; Tavana et al., 2023). It combines the predicted probabilities from several models and selects the class with the highest weighted average probability as the final output. Soft voting's main advantage lies in its flexibility and editability, as each model can be separately trained, optimized, and assigned distinct weights based on performance, ultimately enhancing overall classification accuracy.

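We denote the prediction output of individual learner $h_i$ on sample $x$ as an $N$-dimensional vector $\left(h_i^1(x), h_i^2(x), \ldots, h_i^N(x)\right)^{\top}$. The final output result based on the weighted soft voting ensemble learning method can be expressed as
$$H(x) = c_{\arg\max_{j} \sum_{i=1}^{T} w_i h_i^j(x)} \tag{1}$$
where $h_i^j(x)$ is the output of $h_i$ for the class label $c_j$, $T$ is the number of individual learners, and $w_i$ is the weight of $h_i$, typically $w_i \ge 0$, $\sum_{i=1}^{T} w_i = 1$.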
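
A weighted soft vote of this kind can be sketched in a few lines. The class probabilities and the weights below are hypothetical and are not values reported in this study.

```python
def soft_vote(model_probs, weights):
    """Weighted soft voting over per-model class probability vectors.

    model_probs: one list of class probabilities per model.
    weights: non-negative model weights that sum to 1.
    Returns (index of the winning class, weighted score of every class).
    """
    if len(model_probs) != len(weights):
        raise ValueError("one weight per model is required")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    n_classes = len(model_probs[0])
    scores = [
        sum(w * probs[j] for w, probs in zip(weights, model_probs))
        for j in range(n_classes)
    ]
    best = max(range(n_classes), key=scores.__getitem__)
    return best, scores


classes = ["coal", "rock"]
# Hypothetical outputs of three models for one image block.
model_probs = [[0.90, 0.10], [0.45, 0.55], [0.40, 0.60]]
weights = [0.4, 0.3, 0.3]  # hypothetical weights
best, scores = soft_vote(model_probs, weights)
print(classes[best])  # coal
```

In this example two of the three models lean toward rock, so a simple majority vote would return rock. Soft voting returns coal because the first model is far more confident, which shows how predicted probabilities, not only predicted labels, enter the decision.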

Model diversity remains essential for achieving better generalization. If the individual models are too similar, the benefits of using an ensemble approach may be diminished. Thus, ensuring diversity among the models, while simultaneously optimizing each individual learner, can significantly boost the stability and robustness of the ensemble model, leading to better performance across a wider range of data.

2.4.2 Diversification of individual learners

As an example, we take the coal-rock identification problem, which is a typical binary classification problem, $y \in \{-1, +1\}$. Let $f$ be the true function and assume that the error rate of each individual classifier is $\epsilon$; that is, for each individual classifier $h_i$,
$$P\left(h_i(x) \neq f(x)\right) = \epsilon \tag{2}$$
Suppose the ensemble combines $T$ individual classifiers through a voting method, and if more than half of the individual classifiers are correct, then the ensemble classification is correct. The result of the ensemble can be written as follows:
$$H(x) = \operatorname{sign}\left(\sum_{i=1}^{T} h_i(x)\right) \tag{3}$$
Assuming that the errors of the individual classifiers are independent, the Hoeffding inequality shows that the ensemble's error rate is
$$P\left(H(x) \neq f(x)\right) = \sum_{k=0}^{\lfloor T/2 \rfloor} \binom{T}{k} (1-\epsilon)^{k} \epsilon^{T-k} \le \exp\left(-\frac{1}{2} T (1-2\epsilon)^{2}\right) \tag{4}$$

As T increases, the ensemble's error rate decreases exponentially. However, this analysis is based on the premise that the errors of the individual learners are independent of each other. For the problem we are solving, all individual learners are trained to solve the same problem, making independence impossible. Therefore, selecting different types of machine learning algorithms as individual learners to construct a heterogeneous ensemble learning system enhances ensemble diversity and contributes to improved system performance.
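
The effect of the number of classifiers on the error rate can be checked numerically. The sketch below assumes the standard textbook form of this result: the ensemble errs when at most half of T independent classifiers are correct, and that probability is bounded by exp(−T(1 − 2ε)²/2). The individual error rate of 0.3 is a hypothetical value.

```python
from math import comb, exp


def ensemble_error(eps, T):
    """Exact majority-vote error for T independent classifiers.

    The ensemble is wrong when at most floor(T/2) classifiers are correct;
    k counts the correct classifiers and eps is the individual error rate.
    """
    return sum(
        comb(T, k) * (1 - eps) ** k * eps ** (T - k) for k in range(T // 2 + 1)
    )


def hoeffding_bound(eps, T):
    """Upper bound exp(-T * (1 - 2*eps)**2 / 2) on the ensemble error."""
    return exp(-0.5 * T * (1 - 2 * eps) ** 2)


eps = 0.3  # hypothetical individual error rate
for T in (3, 11, 51):
    print(T, round(ensemble_error(eps, T), 4), round(hoeffding_bound(eps, T), 4))
```

For T = 3 the exact error is 0.216, already below the individual error of 0.3, and both the exact value and the bound shrink as T grows. The calculation holds only under the independence assumption, which trained models sharing the same data do not satisfy.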

This study utilizes transfer learning to train three kinds of deep learning models: GoogLeNet, NasNet-Mobile, and Inception-v3. Each model is selected for its unique characteristics and advantages in image recognition tasks. GoogLeNet incorporates Inception modules that use parallel convolution and pooling operations to extract multiscale features efficiently. Despite its 22-layer depth, the Inception module's design minimizes computational complexity and parameter count, ensuring high performance and resource efficiency. NasNet-Mobile is a lightweight convolutional neural network optimized for mobile and embedded devices. It maintains high accuracy while significantly reducing computational requirements and parameter count, making it suitable for resource-constrained environments. Inception-v3 introduces factorized convolutions to reduce computational complexity, incorporates batch normalization for accelerated training, and utilizes auxiliary classifiers to mitigate gradient vanishing issues. It strikes a balance between accuracy and computational efficiency and is widely used in image classification tasks. The primary differences among these models lie in their design focus: GoogLeNet emphasizes multiscale feature extraction through Inception modules, Inception-v3 builds on this by optimizing efficiency and training speed, while NasNet-Mobile is specifically designed for deployment in resource-limited scenarios.

3 RESULTS

3.1 The CSTT system test results

In March 2024, a field test of the system was conducted at an underground coal mine in Huainan, China. The borehole design and geological information of the construction area are shown in Figure 12. The test site for the system is shown in Figure 13. The boreholes were drilled from the gas control roadway, passing through multiple strata to extract gas from the bottom 11-3 and 11-2 coal seams and adjacent layers.

Figure 12. Borehole design and geological histograms.
Figure 13. Underground testing of the system.

The construction site was on a roadway above the 11-3 coal seam. The normal distance from the roadway floor to the top of the 11-2 coal seam was approximately 21.4 m. The strata above the 11-2 coal seam included, sequentially, a sandy mudstone layer approximately 9.4 m thick, a carbonaceous mudstone layer approximately 0.2 m thick, a mudstone layer approximately 0.9 m thick, the 11-3 coal seam approximately 0.2 m thick, and another sandy mudstone layer approximately 10.7 m thick. During the trial, four boreholes were drilled, with boreholes No. 1 to No. 3 designated for data collection and borehole No. 4 used as a test borehole. The equipment used during the test is shown in Table 2.

Table 2. Equipment and parameters.
Type             Model
Drilling rig     ZDY6500LFK smart drilling rig
Drill rod        Φ73 mm high-strength drill rods
Drill bit        Φ94 mm PDC drill bit
Other devices    CSTT system; YZD12-type borehole multiparameter visualization measurement instrument

During the construction of boreholes No. 1 to No. 3, under the unified control of the rig controller, the smart drilling rig and the CSTT system collaborated to collect drill cuttings samples every 0.75 m, and a total of 294 drill cuttings images were collected. The acquired images, as shown in Figure 14, are transmitted to the drilling rig control center for unified numbering and preprocessing. The numbering of the images includes three parts: image acquisition time, borehole number, and image sequence number.

Figure 14. Example of images of drill cuttings samples.

Observation of the collected cuttings samples revealed distinct visual differences between sandy mudstone-dominated rock cuttings and coal cuttings. The sandy mudstone wetted by drilling fluid exhibits a dark red color, which contrasts sharply with the bright black coal and provides favorable conditions for accurate computer classification. Moreover, because the method proposed in this study avoids mixing cuttings from different strata, the lithology of the cuttings accurately reflects the borehole position. When the borehole remains within a single stratum, the cuttings display uniform lithological characteristics. As the borehole approaches the interface between two strata, the cuttings gradually show a distinct mixture of lithologies from both formations. Once the borehole crosses the interface and stabilizes within the new stratum, residual cuttings from the previous layer may still influence the sample. However, the amount of these mixed particles is minimal (<2% volumetric content) and does not significantly affect the overall lithological composition of the current cuttings.

Following the drill cuttings particle image preprocessing method, high-resolution raw images collected from Boreholes No. 1–3 were processed. Stratigraphic variations along the borehole trajectory were identified based on borehole survey and logging data, allowing drill cuttings images to be classified by lithology. Since this experiment focused on the construction of cross-measure boreholes mainly within sandstone and coal seams, there were insufficient samples of mudstone and carbonaceous mudstone. For simplicity, samples were categorized into two classes: coal and rock. Images of drill cuttings at coal-rock interfaces were removed to improve lithological labeling accuracy. Images were then cropped to meet the input size requirements of each pretrained model, yielding images of 224 × 224 pixels. This produced a training data set of 25 543 image blocks.

All models are trained on this data set using the Stochastic Gradient Descent with Momentum (SGDM) optimization algorithm. The initial learning rate is set to 0.01, with a mini-batch size of 128, a maximum of 30 epochs, and a validation frequency of 50. Thirty percent of the data set is held out for model validation, while the remaining 70% is used for actual training. Figure 15 illustrates the training accuracy and loss trends for the three models, showing rapid convergence within fewer than 60 iterations and stabilization at 100% accuracy and low loss values.
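To make the training setup concrete, the sketch below works out the split sizes implied by the 70/30 hold-out and shows a single SGDM update on a toy objective. It is illustrative only: the momentum coefficient of 0.9 is an assumed value (only the learning rate is given in the text), and `split_sizes` and `sgdm_step` are hypothetical helper names.

```python
def split_sizes(n_samples, val_fraction=0.3):
    """Return (n_train, n_val) for a hold-out split."""
    n_val = round(n_samples * val_fraction)
    return n_samples - n_val, n_val


def sgdm_step(weight, velocity, grad, lr=0.01, momentum=0.9):
    """One SGD-with-momentum update for a single scalar parameter.

    momentum=0.9 is an assumed value; the text only gives lr=0.01.
    """
    velocity = momentum * velocity - lr * grad
    return weight + velocity, velocity


n_train, n_val = split_sizes(25543)
iters_per_epoch = n_train // 128  # full mini-batches per epoch
print(n_train, n_val, iters_per_epoch)  # 17880 7663 139

# Toy objective f(w) = w**2 with gradient 2*w, minimised from w = 1.0.
w, v = 1.0, 0.0
for _ in range(200):
    w, v = sgdm_step(w, v, grad=2 * w)
```

Under these assumptions one epoch spans about 139 full mini-batches, so convergence within fewer than 60 iterations would fall inside the first epoch.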

Training accuracy and training loss for three models. (a) NasNet-Mobile, (b) GoogLeNet, (c) Inception-v3, and (d) average.

In this task, the NasNet-Mobile and Inception-v3 models rapidly achieved nearly 100% accuracy within the first 20 iterations, with the training loss quickly dropping to near zero, demonstrating fast and stable convergence. This performance is primarily due to their optimized architectures, efficient parameter settings, and well-initialized weights. In contrast, the GoogLeNet model showed significant fluctuations in accuracy and loss during the initial 20 iterations before gradually stabilizing and reaching high accuracy. This behavior is likely due to its more complex structure, requiring more time to adjust weights and parameters effectively. Overall, the differences in model architectures, initial parameter choices, learning rates, and data preprocessing methods are the main factors contributing to the varied training performances observed among these models.

3.2 Lithology identification generalization test results

To further test the model's accuracy and generalization ability, a new data set was sourced from borehole No. 4, which was drilled to a total depth of 81.75 m. Every 0.75 m, the CSTT system automatically captured one drill cuttings image, resulting in a collection of 109 images.

After cropping, numbering, and initial screening, 9811 usable test image samples were obtained. To accurately label the lithology of each test sample, this study utilized the YZD12 type borehole multiparameter visualization instrument. This instrument conducted video imaging of the borehole, providing precise information about coal-rock variations along the depth of the borehole, as shown in Table 3.
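The sample counts above can be checked with a few lines of arithmetic. This is a sketch under the assumption that the first image is taken at 0.75 m and the last at total depth. The variable names are hypothetical.

```python
interval_m = 0.75      # sampling interval along the borehole
total_depth_m = 81.75  # total depth of borehole No. 4

# One image per sampling interval.
n_images = round(total_depth_m / interval_m)
sample_depths = [interval_m * i for i in range(1, n_images + 1)]

# Average number of usable 224 x 224 blocks per raw image after screening.
blocks_per_image = 9811 / n_images
print(n_images, sample_depths[-1], round(blocks_per_image))  # 109 81.75 90
```

On average, each raw image therefore contributes about 90 test blocks.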

Table 3. Information on stratigraphic changes in the direction of the borehole trajectory.
| Data type | Azimuth (°) | Inclination (°) | Seam | Borehole depth at the start of the coal seam (m) | Borehole depth at the end of the coal seam (m) | Borehole depth (m) |
|---|---|---|---|---|---|---|
| Design data | 23.0 | −30.0 | 11-3 Coal seam | 30.9 | 31.8 | 69.70 |
| Design data | 23.0 | −30.0 | 11-2 Coal seam | 49.4 | 53.1 | 69.70 |
| Remeasured data | 23.0 | −30.3 | 11-3 Coal seam | 37.5 | 38.3 | 81.75 |
| Remeasured data | 23.0 | −30.3 | 11-2 Coal seam | 52.5 | 66.2 | 81.75 |
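Labeling test samples from the borehole imaging results amounts to a depth lookup against the remeasured coal intervals in Table 3. A minimal sketch follows. `COAL_INTERVALS` and `label_for_depth` are hypothetical names, and treating the interval endpoints as coal is an assumption.

```python
# Remeasured coal intervals along the borehole (m), taken from Table 3.
COAL_INTERVALS = [(37.5, 38.3), (52.5, 66.2)]


def label_for_depth(depth_m, intervals=COAL_INTERVALS):
    """Return "coal" if depth_m lies in a coal interval, else "rock"."""
    for start, end in intervals:
        if start <= depth_m <= end:
            return "coal"
    return "rock"


for depth in (36.75, 37.5, 60.0, 70.5):
    print(depth, label_for_depth(depth))
```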

The raw drill cuttings images were processed using the preprocessing method described previously and then categorized by lithology, yielding 2028 images labeled as coal cuttings samples and 7783 images labeled as rock cuttings samples. The identification results of the three models on the test data set are shown as confusion matrices in Figure 16. Each model rapidly provides lithology identification results and accuracy for the drill cuttings image samples. Notably, GoogLeNet has the highest accuracy of the three single identification models in recognizing both coal and rock samples.

Confusion matrix for each model. (a) GoogLeNet, (b) Inception-v3, and (c) NasNet-Mobile.

Coal-rock real-time recognition falls under the realm of pattern recognition and constitutes a binary classification task. Evaluation metrics employed include accuracy, precision, recall, and F1 score. Based on the correspondence between actual results and predicted results, the recognition accuracy of each model can be calculated, as shown in Table 4. The table also records the time taken by each model to complete the recognition task for all image samples. It is evident from the table that among the three models, GoogLeNet achieves the highest recognition speed and accuracy, with an average image recognition time of approximately 0.02 s per image.

Table 4. Model accuracy assessment.
| Model | Accuracy | Precision | Recall | F-measure | Time consumed (s) |
|---|---|---|---|---|---|
| GoogLeNet | 0.9718 | 0.9981 | 0.9662 | 0.9819 | 220 |
| NasNet-Mobile | 0.9621 | 0.9937 | 0.9582 | 0.9756 | 910 |
| Inception-v3 | 0.9690 | 0.9975 | 0.9633 | 0.9800 | 335 |
| Ensemble learning | 0.9742 | 0.9985 | 0.9701 | 0.9841 | 1116 |
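For reference, the four metrics follow the standard binary-classification definitions. The sketch below computes them from confusion-matrix counts and cross-checks two F-measure values and the per-image times against Table 4. The function names and the counts used in the last line are hypothetical, since the per-model confusion counts are not reproduced in the text.

```python
def binary_metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall, f1) from confusion counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1


def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def seconds_per_image(total_seconds, n_images=9811):
    """Average recognition time over the 9811 test samples."""
    return total_seconds / n_images


print(round(f1_from(0.9981, 0.9662), 4))  # GoogLeNet: 0.9819
print(round(f1_from(0.9985, 0.9701), 4))  # Ensemble: 0.9841
print(round(seconds_per_image(220), 3))   # GoogLeNet: 0.022
print(round(seconds_per_image(1116), 3))  # Ensemble: 0.114
print(binary_metrics(tp=90, fp=10, fn=5, tn=95))  # hypothetical counts
```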

To determine the impact of the weights of the three classification models on the accuracy of the ensemble model, we analyzed the identification results of each model. Because the models differ in their identification results for the test samples, the ensemble accuracy varies with the assigned weights, as shown in Figure 17. The ensemble learning model achieved its highest recognition accuracy of 97.42% when the weights for GoogLeNet, Inception-v3, and NasNet-Mobile were set to 0.47, 0.31, and 0.22, respectively. Its recognition time was longer than that of the individual models, at approximately 0.11 s per image.
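A weighted soft vote combines the class probabilities of the individual models rather than their hard decisions. The sketch below is one common formulation, a weighted average of probability vectors followed by an argmax, using the weights reported above. The exact combination rule used in the study is not spelled out, so this form is an assumption, and the probabilities in the example are hypothetical.

```python
WEIGHTS = {"GoogLeNet": 0.47, "Inception-v3": 0.31, "NasNet-Mobile": 0.22}
CLASSES = ("coal", "rock")


def soft_vote(probs_by_model, weights=WEIGHTS):
    """Return (label, scores) from per-model class probability vectors."""
    total = sum(weights[name] for name in probs_by_model)
    scores = [
        sum(weights[name] * probs[i] for name, probs in probs_by_model.items())
        / total
        for i in range(len(CLASSES))
    ]
    best = max(range(len(CLASSES)), key=scores.__getitem__)
    return CLASSES[best], scores


# Hypothetical probabilities (coal, rock) for one image block.
label, scores = soft_vote({
    "GoogLeNet": (0.1, 0.9),
    "Inception-v3": (0.6, 0.4),
    "NasNet-Mobile": (0.6, 0.4),
})
print(label, [round(s, 3) for s in scores])  # rock [0.365, 0.635]
```

In this example a simple majority vote would return coal, because two of the three models lean that way, while the weighted soft vote returns rock because the most heavily weighted model is confident.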

Variation of ensemble accuracy with weights.

To analyze the recognition results of each model and determine the lithology and coal-rock ratio of each complete original drill cuttings image, the coal-rock ratio is computed as follows:

$$P_A = \frac{N_A}{N_A + N_B} \tag{5}$$

where $P_A$ is the proportion of output result A; $N_A$ is the number of samples with output result A; and $N_B$ is the number of samples with output result B.
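The proportion defined above, read here as N_A / (N_A + N_B), can be computed directly from the per-block predictions of one raw image. This is a minimal sketch. The function names are hypothetical, and taking the majority class as the image-level lithology is an assumption.

```python
from collections import Counter


def class_proportions(block_labels):
    """Return {label: share of blocks with that label} for one raw image."""
    if not block_labels:
        raise ValueError("at least one block label is required")
    counts = Counter(block_labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def image_lithology(block_labels):
    """Return (majority label, its proportion) for one raw image."""
    proportions = class_proportions(block_labels)
    label = max(proportions, key=proportions.get)
    return label, proportions[label]


# Hypothetical image split into 90 blocks: 27 predicted coal, 63 predicted rock.
labels = ["coal"] * 27 + ["rock"] * 63
print(class_proportions(labels))
print(image_lithology(labels))
```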

Figure 18 presents the computed results. In the figure, bar charts are used with different fill colors to represent the current identification results. Above each bar, a curve indicates the probability that the recognition result corresponds to the current value, ranging from 0.5 to 1.0.

Results of each model.

From the results, each model successfully detected stratigraphic changes along the borehole trajectories. The ensemble learning method, which combines multiple trained models, produced identification results closest to the actual observations presented in Table 3. Since the drill cuttings were sampled every 0.75 m, while the data measured by the YZD12 borehole multiparameter visualization instrument was continuous, there was some discrepancy between the formation information detected by the CSTT system and the actual data. Significant variations in the proportion of coal and rock fragments were observed at coal-rock interfaces. This indicates that despite taking measures, the mixing of cuttings is difficult to avoid in the initial period after entering a new formation. As a result, the starting point of a coal seam should be identified by the appearance of coal fragments in the cuttings, while the transition to a new rock layer should be marked by the appearance of rock fragments in the coal cuttings.
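The interface rule stated above (a seam starts where coal fragments first appear and ends where rock fragments first appear in the coal cuttings) can be sketched as a small state machine over the per-image coal proportion. The threshold `eps`, the function name, and the sample values are hypothetical. This is an illustration of the rule, not the procedure used in the study.

```python
def find_coal_seams(samples, eps=0.02):
    """Return [(start_depth, end_depth), ...] from (depth, coal_fraction) pairs.

    eps is a hypothetical noise floor: fractions within eps of 0 or 1 are
    treated as pure rock or pure coal. end_depth is None for an open seam.
    """
    seams, start, state = [], None, "rock"
    for depth, coal in samples:
        if state == "rock" and coal > eps:
            # Coal fragments appear: mark the seam start.
            start = depth
            state = "coal" if coal >= 1 - eps else "entering"
        elif state == "entering":
            if coal >= 1 - eps:
                state = "coal"
            elif coal <= eps:
                # Thin seam: coal vanished before the cuttings became pure coal.
                seams.append((start, depth))
                state = "rock"
        elif state == "coal" and coal < 1 - eps:
            # Rock fragments appear in coal cuttings: mark the seam end.
            seams.append((start, depth))
            state = "rock" if coal <= eps else "leaving"
        elif state == "leaving" and coal <= eps:
            state = "rock"
    if state in ("entering", "coal"):
        seams.append((start, None))
    return seams


# Hypothetical (depth in m, coal fraction) series sampled every 0.75 m.
series = [(36.75, 0.0), (37.5, 0.4), (38.25, 1.0), (39.0, 0.6),
          (39.75, 0.0), (52.5, 0.3), (53.25, 1.0), (54.0, 1.0)]
print(find_coal_seams(series))  # [(37.5, 39.0), (52.5, None)]
```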

The model exhibited fluctuations in recognition results during drilling at depths of 8.25–9.75 m and 75.00–76.50 m. Investigation using drilling video footage revealed the presence of previously unidentified carbonaceous mudstone layers at these depths. The similarity in color and shape between carbonaceous mudstone fragments and coal fragments influenced the recognition results. In addition, detecting stratigraphic changes from drill cuttings image samples involves a time delay, which is attributed to the time required to transport the cuttings and increases with the depth of the borehole. These observations highlight both the effectiveness of the models in identifying stratigraphic variations along borehole trajectories and the challenges that unexpected lithological features pose to model performance.
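Why the delay grows with depth can be seen from a simple lag model. The sketch below assumes a constant cuttings return velocity and a constant rate of penetration, neither of which is reported in the text. All names and numbers are hypothetical.

```python
def lag_time_s(cut_depth_m, return_velocity_m_s):
    """Time for cuttings cut at cut_depth_m to reach the borehole collar."""
    return cut_depth_m / return_velocity_m_s


def cut_depth_m(bit_depth_m, rop_m_s, return_velocity_m_s):
    """Depth at which the cuttings arriving now were cut.

    While the cuttings travel back, the bit advances by rop * lag, so
    bit_depth = cut_depth * (1 + rop / return_velocity).
    """
    return bit_depth_m / (1 + rop_m_s / return_velocity_m_s)


# Hypothetical values: bit at 60.6 m, ROP 0.01 m/s, return velocity 1.0 m/s.
depth = cut_depth_m(60.6, 0.01, 1.0)
print(round(depth, 2), round(lag_time_s(depth, 1.0), 1))  # 60.0 60.0
```

Under these assumptions the lag is proportional to the depth at which the cuttings were produced, which matches the observation that the delay increases as the borehole deepens.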

3.3 Real-time field application

In the construction of an additional seven boreholes within the same roadway, the lithology identification method and full equipment suite proposed in this study were applied. During drilling, the CSTT system automatically collected drill cuttings images every 0.75 m and fed them into the lithology identification model for stratigraphic recognition. The predicted lithology was subsequently compared and validated against re-survey results, achieving an overall model accuracy of 92.35%.

Based on these predictions, drilling parameters were adjusted upon detecting lithology changes. For rock seams, high rotational speeds (160–200 r/min) and relatively low thrust (40–55 kN) were employed: the low thrust prevents excessive weight on bit (WOB) from buckling the drill string, while the high rotational speed raises the cutting frequency and thus the drilling efficiency. For coal seams, a higher thrust (90–110 kN) and lower rotational speeds (80–100 r/min) were used to minimize the effect of rotational oscillation on borehole wall stability, reducing collapse incidents while increasing the penetration depth per rotation to maintain operational efficiency. Under these optimized parameters, no collapses occurred, and the average rate of penetration (ROP) in these boreholes exceeded that of a neighboring borehole of the same depth by 21.32%.
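The parameter adjustment described above is effectively a lookup from the predicted lithology to a parameter window. A minimal sketch follows. The dictionary layout and function name are hypothetical, and the coal thrust range is read as 90–110 kN, which is an assumption about the intended values.

```python
# Parameter windows per lithology: (minimum, maximum).
DRILLING_PARAMETERS = {
    "rock": {"rotation_r_min": (160, 200), "thrust_kN": (40, 55)},
    # Assumption: coal thrust window read as 90-110 kN.
    "coal": {"rotation_r_min": (80, 100), "thrust_kN": (90, 110)},
}


def select_parameters(lithology):
    """Return the parameter window for the predicted lithology."""
    try:
        return DRILLING_PARAMETERS[lithology]
    except KeyError:
        raise ValueError(f"unknown lithology: {lithology!r}") from None


print(select_parameters("coal"))
```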

It is acknowledged that the proposed model has certain limitations. Since the model was developed using data specific to the Huainan coalfield, its performance may vary in different geological formations. Thus, further research is needed to examine the model's generalizability across diverse geological environments.

4 DISCUSSION

This study explores real-time lithology identification during drilling through machine learning analysis of drill cuttings. While prior research has significantly advanced lithology identification technologies, it has not sufficiently addressed critical challenges such as imprecise sampling control, harsh environments, and inconsistent image acquisition methods, all of which impact image quality. These unresolved issues have hindered the effective implementation of machine learning methods in practical deep underground drilling operations.

The CSTT system developed in this study, along with its image preprocessing method, replaces manual sample collection with automated machine sampling. It ensures consistency in image acquisition by maintaining constant sampling environments, timing, and parameters. Through deep integration of sampling analysis processes and automated drilling construction workflows, the study achieves precise correlation between lithology identification results and their respective sampling locations. Additionally, this study segments high-resolution images into smaller blocks, preventing the loss of effective data and addressing data scarcity issues. More importantly, it enables the model to accurately identify the coal-rock proportions within drill cuttings. It is worth noting that the ensemble learning model constructed in this study achieved higher identification accuracy compared to individual classification models but required the longest processing time. Among the individual classification models, GoogLeNet completed the identification task the fastest while achieving accuracy comparable to the ensemble learning model. These results suggest that in scenarios where computational resources are limited or real-time performance is a priority, using a single classifier like GoogLeNet for identification tasks can be a better choice.

It must be acknowledged that the lithology identification model developed in this study has certain limitations. When processing cuttings images with similar visual characteristics, the identification accuracy decreases significantly because the model relies on a single data type. Cross-validating the image-based results against drilling parameters, drilling vibration signals, or other technical approaches could enhance the model's ability to distinguish similar cuttings. Additionally, the time delay caused by the movement of cuttings within the borehole may delay the recognition results. Both issues warrant further investigation.

5 CONCLUSION

This study proposed a lithology identification while drilling method based on automatic drill cuttings sampling and analysis, facilitating rapid and accurate stratigraphic lithology identification during drilling operations. The conclusions are as follows:
  1. By integrating a smart drilling rig with the CSTT system, a complete framework was established to automate drill cuttings sampling and lithology analysis throughout the drilling process. The successful development and industrial application of this system enhance identification accuracy, replacing traditional methods while enabling smart drilling rigs to detect formation lithology information, paving the way for further intelligent functionality.

  2. The CSTT system, developed for automatic drill cuttings sampling and analysis, ensures consistent imaging conditions through an integrated image acquisition device, which stabilizes environmental factors and camera parameters to improve identification accuracy. Through deep integration with the smart drilling rig control system, the CSTT system ensured precise image capture timing, maintaining spatiotemporal alignment between identification results and sampling locations. This setup also reduced the impact of mixed cuttings from different formations, thereby enhancing the robustness of the system. Additionally, the preprocessing method for drill cuttings particle images allows the model to identify the lithology of drill cuttings while also determining the coal-rock ratio within the sample.

  3. A lithology identification model based on soft voting ensemble learning was applied, leveraging multiple trained classifiers to extract features from drill cuttings images, achieving a recognition accuracy of 97.42% with a processing time under 0.11 s per image. Field experiments confirmed the model's high accuracy and generalization capability, validating its suitability for practical applications.

While this method and its supporting equipment maintain consistency in the imaging environment, parameters, and timing, accurately distinguishing between different lithology drill cuttings with similar shapes and colors remains challenging during practical drilling. Additionally, further improvements are needed to address recognition delays caused by the movement of drill cuttings.

ACKNOWLEDGMENTS

The first author would like to express sincere appreciation for the scholarship provided by China Coal Technology and Engineering Group and the University of Wollongong. This study is financially supported by the Australian Research Council Linkage Program (LP200301404) and the ARC Training Centre for Innovative Composites for the Future of Sustainable Mining Equipment (IC220100028). The authors thank the technical staff of CCTEG Xi'an Research Institute (Group) Co. Ltd., Guangyu Peng and Zeng Meng, for their support during the experimental process.

    CONFLICT OF INTEREST STATEMENT

    The authors declare no conflict of interest.

    Biographies

    • Kun Li is an associate researcher, a PhD candidate at the University of Wollongong, and a young high-potential talent of China Coal Technology and Engineering Group. He has more than 10 years of experience in the research and development of coal mine underground drilling technology and equipment, has completed more than 20 scientific research projects, and has presided over the development of three generations of the Smart Drilling Rig. More than 20 types of drilling rigs have been successfully developed through his efforts. He has published 22 papers and has applied for more than 40 invention patents.

    • Ting Ren is a professor and PhD supervisor at the School of Civil, Mining, Environmental and Architectural Engineering, University of Wollongong. He is also the deputy director of the ARC Training Centre for Innovative Composites for Future Mining Equipment. With over 30 years of experience in mining engineering and mine safety, he specializes in dust control, gas management, and fire prevention in coal mines. He has led major research projects funded by ARC, ACARP, and the Coal Services Health & Safety Trust, collaborating with CCTEG, BHP Billiton Mitsubishi Alliance, and Anglo American. Previously, he was a senior research engineer at CSIRO for over 6 years. A Chartered Engineer (UK), he has published over 300 papers. His contributions have earned him the BMA Health and Safety Award and the Australian Bulk Handling Review Award.



