Development of an optimization model for a monitoring point in tunnel stress deduction using a machine learning algorithm


Highlights


  • Scientific guidance for monitoring scheme optimization is provided.

  • The stress distribution of the overall section is characterized using limited numbers of monitoring points.

  • Interdisciplinary application of machine learning helps in solving practical engineering problems.


1 INTRODUCTION

Assessments of stress patterns in tunnel engineering are invaluable, as they directly impact the safety, efficiency, and success of tunnel projects, making rigorous stress analysis an indispensable component of tunnel engineering practice. However, accurately deducing the stress distribution in tunnel engineering is equally challenging. In addition to the complex nature of subsurface conditions and the intricate interplay of geology, the geometric intricacies of tunnel design add complexity. In recent years, structural health monitoring (SHM) has become an essential technology to determine the stress distribution and ensure the long-term stability of underwater shield tunnels (Cunha, 2022; Memisoglu Apaydin et al., 2022). Even so, it is difficult to capture the mechanical behaviors of the whole tunnel structure. On the one hand, the number of monitoring points and sensors used in practical engineering is limited, making it difficult to monitor the whole section of the tunnel (Tan et al., 2021). Moreover, some data recorded by sensors are invalid and cannot be used for analysis (He et al., 2022). On the other hand, an optimized monitoring scheme is essential to obtain useful information about the structure. However, in practical engineering, the locations of monitoring sections and the number and distribution of monitoring points are mostly determined by empirical or semi-empirical methods (Ariznavarreta-Fernández et al., 2016; Mghazli et al., 2023; Wei et al., 2023). A common problem is that an abnormal response of the structure cannot be captured if it occurs at a point without sensors (Li et al., 2020). To solve this problem, an optimal sensor installation scheme is needed. Therefore, this study focused on developing an optimization model for monitoring point selection by using a machine learning algorithm and characterizing the mechanical behavior of the whole section based on data from limited sensors.

The purpose of SHM is to obtain the dynamic variation information of the structure, identify abnormal behavior in advance, and prevent the occurrence of disasters (Gómez et al., 2020; Tan et al., 2023). SHM systems have already been installed in many well-known tunnels in China and abroad, such as the Channel Tunnel in Europe, the Seikan Tonneru in Japan, and the Botlek tunnel in the Netherlands, as well as the Wuhan Yangtze River tunnel, the Nanjing Yangtze River tunnel, and the Xiang'an tunnel in China (Ikuma, 2003; Li et al., 2020; Tan, Chen, Wu, et al., 2020). Although these systems provide important data for the analysis of structural mechanical behavior, it is impossible to evaluate whether the locations of the monitoring points are reasonable and whether the number of monitoring sensors really meets the engineering needs (Soleimani-Babakamali et al., 2023; Zhang et al., 2022). In addition, there are few reports about optimization of the tunnel monitoring scheme in previous studies, and no unified specifications or standards have been published as guidance for monitoring scheme optimization.

Previous engineering cases show that complex topographic conditions and emergencies may trigger abnormal structural behaviors at locations without any monitoring points (Xu et al., 2022). Accordingly, experts and scholars are increasingly realizing that SHM is insufficient in that it relies only on limited points to determine the mechanical behaviors of tunnels (Du, Zhou, et al., 2021; Maes et al., 2022). Currently, studies on sensing the mechanical behavior of the entire tunnel surface can be divided into two categories: physics-guided models and data-driven models. Numerical simulation is the most widely used physics-guided method (Yang et al., 2023), and includes the finite element method, the discrete element method, and their corresponding variations (Liao et al., 2023; Wang et al., 2022). Numerical results are of great importance and can be used to determine structurally sensitive positions of abnormalities and structural variation (overall changing trends), but the absolute value or amplitude of the simulation results often deviates from the actual situation, which makes them difficult to apply in practical engineering. In recent years, with the rapid development of computer science and big data technology, data-driven models used to deduce the mechanical behavior of the whole section of a tunnel have attracted considerable attention (Salimi et al., 2019; Sun et al., 2022; Zhu et al., 2020). For example, based on the compressive sensing method, Bao et al. (2015) proposed a wireless sensing technique that can recover lost data, and Wan and Ni (2019) proposed a Bayesian multitask learning methodology to reconstruct lost temperature data. Compared with other areas, the application of data-driven models is quite limited in civil engineering, especially in tunnel engineering. The use of data-driven models to perform minimal unbiased estimations has become an important solution for reconstructing structural responses (Xie et al., 2022; Zhang & Wu, 2019). However, the validity of a data-driven model and the accuracy of the reconstruction results are considerably influenced by the monitoring data used in the data experiment (Ángela et al., 2022).

Based on available research and practical engineering needs, this study proposes a spatial deduction model for monitoring scheme optimization and stress distribution characterization using a machine learning algorithm. The Nanjing Yangtze River tunnel was selected as a case study. First, a simulation model was developed to generate a data set that reflected the variation characteristics of tunnel mechanical behaviors. Then, clustering analysis was performed on the data set to determine the key points for monitoring. Lastly, on the basis of the monitoring data, a spatial deduction model was developed and data at the positions without sensors were derived. As an important application, the presented model was applied in the case study to characterize the stress distribution of the whole section.

2 METHODOLOGY

In this section, the framework of monitoring scheme optimization is introduced, including the methodology to determine the typical monitoring points and derive the mechanical behaviors of overall structures based on sparse monitoring data.

2.1 The framework of monitoring scheme optimization

The framework of monitoring scheme optimization is composed of three modules: (1) numerical simulation, (2) clustering analysis to determine typical positions, and (3) deduction of the overall stress distribution. In fact, the clustering and deduction analysis are processes of data clustering–reconstruction. The flowchart of the proposed model is shown in Figure 1.

The typical positions are defined as the points that can be used to describe the mechanical states of any position; thus, it is essential to obtain the information of the whole section. To this end, a numerical model is first developed to build the data set of the mechanical behavior of the overall tunnel section. The tunnel section is divided into a number of elements, whose mechanical information is recorded for feature analysis by machine learning. Then, this data set is fed into the clustering model to determine the optimized positions of sensors. The clustering algorithm is well suited to determining the key points of the mechanical behavior of the entire data set. By learning the variation features of the time series of each element, the clustering model divides all of the elements into different clusters. The mechanical behavior of the elements in one cluster can be characterized by only one element, that is, the clustering centroid. Accordingly, the clustering centroids are taken as the typical positions of monitoring points. Subsequently, the data reconstruction analysis is conducted using supervised learning models to derive the mechanical behavior of the whole tunnel section based on the limited data from the typical monitoring positions. By evaluating the characterization results derived from the data at different points, the optimal positions and number of points can be obtained. Finally, the optimal monitoring scheme is determined.

2.2 Data clustering–reconstruction method

The spectral algorithm is a clustering algorithm developed on the basis of graph theory. This algorithm has been widely used as it does not require many assumptions about the data structure (Alshammari et al., 2021). In addition, considering that the responses of the tunnel structure have strong spatial and temporal correlations, and linear regression (LR) is a classical model used to describe the relationship between time series (Tan, Chen, Wang, et al., 2020), the spectral algorithm and the LR model are used in this study for data clustering–reconstruction analysis. Specifically, the spectral algorithm is used to determine the typical positions for monitoring, and the LR model is used to derive the information at the points without monitoring.

In the spectral model, all data are regarded as nodes in a graph, and the similarity between every two nodes is represented by an edge weight, where a higher edge weight indicates stronger similarity. By properly partitioning the graph into different groups, a partition can be obtained in which the edges between different groups have very low weights and the edges within a group have high weights. On this basis, data clustering can be achieved (Alshammari et al., 2021). Essentially, spectral clustering makes use of the spectrum (eigenvalues) of the similarity matrix of the data to perform dimension reduction and then clusters the data in the lower dimension using conventional clustering methods such as K-means. More specifically, for a data set $X = \{x_1, x_2, \ldots, x_n\}$ of $n$ elements, $x_i$ denotes the simulated data of element $i$, $w_{ij}$ is the edge weight between $x_i$ and $x_j$, and $d_i$ is defined as the sum of the weights of all the edges that are connected to $x_i$, which can be expressed as follows:
$$d_i = \sum_{j=1}^{n} w_{ij} \tag{1}$$
For points connected by an edge, $w_{ij} > 0$; otherwise, $w_{ij} = 0$. The K-means algorithm is usually adopted to calculate the weight $w_{ij}$, and the radial basis function (RBF) is selected as the kernel function; thus, the edge weight can be defined as follows:
$$w_{ij} = K(x_i, x_j) = \exp\left(-\frac{\lVert x_i - x_j \rVert^2}{2\sigma^2}\right) \tag{2}$$
where $K(\cdot,\cdot)$ is the RBF kernel function and $\sigma$ is its bandwidth.
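The edge weights and node degrees described above can be illustrated with a small sketch. It assumes a fully connected graph with the standard RBF kernel; the function names `rbf_affinity` and `degrees`, the bandwidth `sigma`, and the toy stress histories are hypothetical and not taken from the study.

```python
import numpy as np

def rbf_affinity(series, sigma):
    """Edge weights w_ij = exp(-||x_i - x_j||^2 / (2 sigma^2)) between time series.

    series: array of shape (n_elements, n_times); sigma is an assumed bandwidth.
    """
    x = np.asarray(series, dtype=float)
    sq_norm = np.sum(x ** 2, axis=1)
    # Pairwise squared distances via the Gram matrix, clipped at zero for round-off.
    sq_dist = np.maximum(sq_norm[:, None] + sq_norm[None, :] - 2.0 * x @ x.T, 0.0)
    w = np.exp(-sq_dist / (2.0 * sigma ** 2))
    np.fill_diagonal(w, 0.0)  # no self-loops in the graph
    return w

def degrees(w):
    """Degree d_i: sum of the weights of all edges connected to node i."""
    return w.sum(axis=1)

# Three toy "elements", each with a 4-step stress history (illustrative values only).
toy = np.array([[1.0, 1.1, 1.2, 1.1],
                [1.0, 1.1, 1.2, 1.2],
                [3.0, 2.9, 3.1, 3.0]])
W = rbf_affinity(toy, sigma=0.5)
d = degrees(W)
```

In this toy case the first two elements behave almost identically, so their edge weight is close to 1, while the third element is only weakly connected to both.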
Based on Equations (1) and (2), all clustering centers can be obtained. Assume that $C = \{c_1, c_2, \ldots, c_k\}$ is the set of clustering centers of the simulated data and that only the data at these central points are known. These data are then fed into the LR model for reconstruction. In the LR model, the data at the cluster centers and the data at locations without measurement points are assumed to be linearly related, and the model is trained to minimize the error between the reconstruction results and the simulated data. The information at any position can be expressed by the data of the central points as follows (Yang et al., 2018):
$$\hat{x}_i = \sum_{j=1}^{k} \beta_{ij} c_j \tag{3}$$
where $\hat{x}_i$ is the reconstruction result of element $i$; $k$ denotes the number of clustering centers; and $\beta_{ij}$ is the regression parameter.
In addition, the objective function of LR model training, taken over all $n$ elements, is expressed as
$$\min_{\beta} \sum_{i=1}^{n} \lVert x_i - \hat{x}_i \rVert^2 \tag{4}$$
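A minimal sketch of the reconstruction step follows, assuming an ordinary least-squares fit without an intercept term. The helper names `fit_reconstruction` and `reconstruct` and the synthetic signals are hypothetical; the study's own training setup may differ.

```python
import numpy as np

def fit_reconstruction(data, center_idx):
    """Least-squares coefficients mapping the center series to every element.

    data: array (n_elements, n_times) of simulated stresses.
    center_idx: indices of the k elements treated as monitored cluster centers.
    Returns beta with shape (k, n_elements).
    """
    centers = data[center_idx].T   # (n_times, k): one column per monitored element
    targets = data.T               # (n_times, n_elements)
    beta, *_ = np.linalg.lstsq(centers, targets, rcond=None)
    return beta

def reconstruct(center_series, beta):
    """center_series: (k, n_times) -> reconstructed stresses (n_elements, n_times)."""
    return (center_series.T @ beta).T

# Synthetic data: four elements that are linear mixes of two base signals.
t = np.linspace(0.0, 2.0 * np.pi, 50)
base = np.vstack([np.sin(t), np.cos(t)])
mix = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [2.0, -1.0]])
data = mix @ base

beta = fit_reconstruction(data, center_idx=[0, 1])
rebuilt = reconstruct(data[[0, 1]], beta)
```

Because the synthetic elements are exact linear combinations of the two monitored ones, the reconstruction here is exact. In practice the coefficients would be fitted on the simulated data set and then applied to measured data at the monitored positions.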

2.3 Evaluation indicators for model performance

The performance of the monitoring scheme can be evaluated by calculating the characterization error of a model on the overall section. Three evaluation indicators are adopted in this study, namely, root mean square error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE) (Du, Li, et al., 2021; Du, Zhou, et al., 2021). The detailed descriptions of these three indicators are as follows.

RMSE is a widely used indicator in machine learning that describes the deviation between the deduction result and the ground truth, which is taken to be the numerical simulation result in this study. Given the deduction result $\hat{y}_i$ and the numerical simulation result $y_i$, the deviation between the deduction value and the numerical simulation data is calculated as follows:
$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left(\hat{y}_i - y_i\right)^2} \tag{5}$$
where $N$ is the number of data points compared.
MAE is an indicator that describes the absolute error between the deduction result and the numerical result, which can directly reflect the error. It can be calculated as follows:
$$\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} \left|\hat{y}_i - y_i\right| \tag{6}$$
MAPE is a percentage indicator that describes the absolute error between the deduction result and the numerical result. This indicator can avoid the effect of data size, and the recommended form is expressed as follows:
$$\mathrm{MAPE} = \frac{100\%}{N} \sum_{i=1}^{N} \left|\frac{\hat{y}_i - y_i}{y_i}\right| \tag{7}$$
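The three indicators can be computed as in the sketch below, which uses their standard definitions. The function names and the sample values are illustrative only and are not results from the study.

```python
import math

def rmse(pred, true):
    """Root mean square error between deduced and reference values."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)) / len(true))

def mae(pred, true):
    """Mean absolute error between deduced and reference values."""
    return sum(abs(p - t) for p, t in zip(pred, true)) / len(true)

def mape(pred, true):
    """Mean absolute percentage error in percent; reference values must be nonzero."""
    return 100.0 * sum(abs((p - t) / t) for p, t in zip(pred, true)) / len(true)

# Illustrative values only.
deduced = [10.0, 21.0, 29.0]
simulated = [10.0, 20.0, 30.0]

scores = {
    "RMSE": rmse(deduced, simulated),
    "MAE": mae(deduced, simulated),
    "MAPE": mape(deduced, simulated),
}
```

RMSE and MAE carry the unit of the stress data, whereas MAPE is dimensionless, which is why it is convenient for comparing positions with very different stress levels.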

3 NUMERICAL SIMULATION OF AN UNDERWATER SHIELD TUNNEL

According to the flowchart of monitoring scheme optimization, the numerical simulation was conducted first. The Nanjing Yangtze River tunnel was selected as the case study, and a numerical model was developed according to the geological condition and tunnel structure to obtain the data of mechanical behaviors.

3.1 Background for the case study

The Nanjing Yangtze River tunnel is located in Jiangsu, China, and became operational in January 2016. The tunnel is 7014 m long, including a shield-driven portion of 3537 m. It is constructed with a prefabricated lining, and each lining ring consists of 10 segments, namely, one key segment, two adjacent segments, and seven standard segments, as shown in Figure 2a. The external diameter of the tunnel segment is 14.5 m, and the width and thickness of the segment are 2.0 and 0.6 m, respectively. Consisting of two lines, the tunnel is designed in the form of a two-tube, dual eight-lane expressway. The internal structure of the tunnel is shown in Figure 2b. The main geological layers to which the tunnel boring machine (TBM) is exposed are mucky silty clay, silty clay with silty sand, silty-fine sand, medium-coarse sand, gravelly sand, siltstone, and pebble. The highest water pressure to which this tunnel is subjected is nearly 0.7 MPa. It is one of the biggest underwater shield tunnels in the world. In addition, the geological conditions to which this tunnel is exposed are complicated compared to those of other such tunnels worldwide. More detailed information about this case study can be found in our previous studies (Tan, Chen, Wang, et al., 2020).


3.2 Numerical modeling

To determine the mechanical behaviors of a certain section of the tunnel structure, the finite element method, implemented through secondary development, was used for numerical modeling. The geological layers of this section are shown in Figure 3a, and the corresponding mechanical property parameters are presented in Table 1. On this basis, a numerical model was developed, as shown in Figure 3b. The numerical model of the tunnel segment was divided into 100 elements in the circumferential direction and eight elements in the longitudinal direction, namely, 800 elements in total.

Table 1. Parameters of different geological layers.
Geological layer     Unit weight (kN/m³)   Ground resistance coefficient (MPa/m)   Lateral pressure coefficient
Silt                 19.4                  5                                       0.43
Silty fine sand-1    19.3                  50                                      0.40
Silty fine sand-3    20.2                  35                                      0.37
Silty clay           18.6                  12                                      0.65
Gravel               20.3                  45                                      0.30
More specifically, the beam-spring model was adopted to simulate the interaction between the tunnel lining and surrounding rock (Koizumi et al., 1988), where a number of nonlinear springs were arranged around the external surface of the tunnel lining to simulate the ground resistance. The stiffness properties of the nonlinear springs are shown in Figure 4 and can be calculated using Equation (8). The stiffness of a nonlinear spring is almost 0 when the spring is under tension and is related to the mechanical parameters of the geological layers when the spring is compressed.
$$k = \frac{L}{n} K \tag{8}$$
where $L$ denotes the length of the tunnel edge surrounded by springs; $n$ is the number of springs around the tunnel edge; and $K$ is the ground resistance coefficient (Table 1).
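One plausible reading of the spring law is sketched below. It assumes that the compressive stiffness equals the ground resistance coefficient times the edge length shared equally among the springs, and it idealizes the "almost 0" tensile stiffness as exactly zero. The function name, the choice of 100 springs, and the use of the full ring circumference are hypothetical.

```python
import math

def spring_stiffness(resistance_coeff, edge_length, n_springs, compressed):
    """Tension-free ground spring (assumed form).

    resistance_coeff: ground resistance coefficient of the layer (e.g. MPa/m).
    edge_length: length of the tunnel edge supported by the springs (m).
    n_springs: number of springs sharing that edge.
    compressed: True if the spring is in compression, False if in tension.
    """
    if not compressed:
        return 0.0  # idealization of the "almost 0" tensile stiffness
    return resistance_coeff * edge_length / n_springs

# Hypothetical example: silt layer (5 MPa/m) over the full ring of a 14.5 m
# diameter lining, with one spring per circumferential element (100 springs).
circumference = math.pi * 14.5
k_compression = spring_stiffness(5.0, circumference, 100, compressed=True)
k_tension = spring_stiffness(5.0, circumference, 100, compressed=False)
```

In a real model each spring would take the coefficient of the layer it sits in, so the stiffness varies around the ring according to the stratigraphy.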
The material of the tunnel segment is concrete (Type: C60), whose elastic modulus is 36 GPa, Poisson's ratio is 0.2, and unit weight is 24 kN/m³. Considering the influence of segment joints on the stiffness of the shield tunnel, the elastic modulus of concrete is multiplied by 0.8 (Chen et al., 2015). In addition, the boundary conditions of the numerical model are shown in Figure 5. The soil pressure and water pressure are applied separately on the tunnel because sand is the main geological layer to which the tunnel is exposed. The water pressure can be calculated as follows:
$$P_w = \gamma_w h_w \tag{9}$$
where $\gamma_w$ denotes the unit weight of water and $h_w$ is the water head at the point considered.
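The soil pressure applied to the tunnel structure includes the overlying soil pressure $P_1$, the foundation soil pressure $P_2$, and the lateral soil pressure $P_3$. If the center point of the tunnel is considered as the origin of coordinates and the vertical direction as the y-axis, the soil pressure can be calculated as follows:
$$P_1 = \sum_{i=1}^{m} \gamma_i h_i \tag{10}$$
$$P_2 = P_1 + \frac{G - F}{D} \tag{11}$$
$$P_3 = \lambda \left[ P_1 + \gamma \left( \frac{D}{2} - y \right) \right] \tag{12}$$
where $m$ is the number of overlying soil layers; $h_i$ and $\gamma_i$ are the thickness and unit weight of soil layer $i$, respectively, with $\gamma$ in Equation (12) being the unit weight of the layer at elevation $y$; $\lambda$ is the earth pressure coefficient; $D$ represents the external diameter of the tunnel segment; and $G$ and $F$ are the gravity and buoyancy to which the tunnel is subjected.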
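As a small worked example of the overlying soil pressure, the sketch below sums unit weight times thickness over the layers above the crown. The unit weights are taken from Table 1, but the layer thicknesses are hypothetical, and the function name `overburden_pressure` is an assumption for illustration.

```python
def overburden_pressure(layers):
    """Overlying soil pressure as the sum of thickness times unit weight.

    layers: iterable of (thickness_m, unit_weight_kN_per_m3) for each layer
    above the tunnel crown. Returns the pressure in kPa (kN/m^2).
    """
    return sum(thickness * unit_weight for thickness, unit_weight in layers)

# Unit weights from Table 1 (silt, silty fine sand-1); thicknesses are hypothetical.
layers_above_crown = [(5.0, 19.4), (8.0, 19.3)]
p1_kpa = overburden_pressure(layers_above_crown)  # 5*19.4 + 8*19.3 = 251.4 kPa
```

Because the soil and water pressures are applied separately in the model, a real calculation would need to decide whether total or buoyant unit weights apply below the water table; that choice is not made here.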

3.3 Numerical results of tunnel lining stress

The water pressure applied to the tunnel varies periodically from year to year. It consists of two components: initial pressure and dynamic pressure. The former is determined by construction investigations, and is 0.437 MPa for this section, calculated according to Figure 3a. The latter varies with the season and is recorded by the Changjiang Maritime Safety Administration (specific water level data are available from the website https://cj.msa.gov.cn/). As an example, the variation of the water level of the Yangtze River tunnel over 1 year (366 days) was determined to calculate the external load applied on the tunnel structure, as shown in Figure 6. Therefore, the actual water pressure applied on the model is calculated by the sum of the dynamic pressure and the initial water pressure.
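The superposition of initial and dynamic water pressure can be sketched as follows. The sketch assumes that the dynamic component is hydrostatic, that is, proportional to the change in river level relative to the level at which the initial pressure was determined. The unit weight of water, the function name, and the sample level changes are assumptions.

```python
GAMMA_W_MPA_PER_M = 9.81e-3   # assumed unit weight of water, 9.81 kN/m^3 in MPa per metre
INITIAL_PRESSURE_MPA = 0.437  # initial water pressure for this section (from the text)

def total_water_pressure(level_change_m, initial_mpa=INITIAL_PRESSURE_MPA):
    """Initial pressure plus the hydrostatic change due to a water-level change.

    level_change_m: river level relative to the reference level (m), positive upward.
    """
    return initial_mpa + GAMMA_W_MPA_PER_M * level_change_m

# Hypothetical daily level changes (m); a full year would contain 366 values.
level_changes = [0.0, 1.5, -0.8]
pressures = [total_water_pressure(dh) for dh in level_changes]
```

With these inputs a 1.5 m rise adds about 0.015 MPa to the initial 0.437 MPa, which illustrates why seasonal level changes of several metres produce a clearly periodic load.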


Figure 6. Water pressure applied in the numerical model.

The pressure applied to the tunnel can be calculated using Equations (9)–(12), so the corresponding mechanical behavior of the tunnel can also be obtained. Each element records the stress variation over 1 year, so each element yields a time series of 366 data points, expressed as $x_i = \{x_i^{(1)}, x_i^{(2)}, \ldots, x_i^{(366)}\}$, and a data set containing the 800 time series of all elements was prepared. For example, Figure 7 shows some numerical results of stress variation at the arch crown, hance, and spandrel.

Figure 7. Numerical results of segment stress at some typical positions: (a) arch crown, (b) spandrel, and (c) hance.

4 DETERMINATION FOR THE MONITORING SCHEME OF A TUNNEL SEGMENT

Based on the data set prepared by numerical simulation, data clustering and deduction model are instantiated in this section to obtain the optimal monitoring scheme. In addition, a series of comparison experiments are conducted to verify the optimization results.

4.1 Model instantiation on data set

First, the clustering experiment was conducted. The number of clustering centroids $k$ was set to 4, 8, 10, and 20 in sequence to determine a reasonable number of monitoring points and the typical positions of segment stress variation. Based on the numerical results obtained for the tunnel, the cluster centroids were obtained using the spectral clustering algorithm, as shown in Figure 8. For comparison, the monitoring scheme with different numbers of points was uniformly distributed on a tunnel surface. The red points represent the positions of the clustering centroids when $k = 4$, while the blue points represent the positions of the four clustering centers added to the first four when $k = 8$; the purple and black points have similar meanings. Newly added clustering centroids and previous ones are separated by a circle. The place of a monitoring point in the segment is represented by the distance from the clustering centroid to the center of the circle. Specifically, the smaller the distance, the closer the point is to the inside of the segment.
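The clustering experiment can be sketched end to end as below. This is a minimal illustration rather than the study's implementation: it assumes a fully connected RBF graph, the symmetric normalized Laplacian, a simple deterministic K-means on the spectral embedding, and it takes as "centroid" the member whose time series is closest to its cluster mean. All names, the bandwidth `sigma`, and the toy data are hypothetical.

```python
import numpy as np

def spectral_representatives(data, k, sigma, n_iter=50):
    """Cluster time series spectrally and return (labels, representative indices)."""
    x = np.asarray(data, dtype=float)
    sq = np.sum(x ** 2, axis=1)
    dist2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * x @ x.T, 0.0)
    w = np.exp(-dist2 / (2.0 * sigma ** 2))      # RBF edge weights
    np.fill_diagonal(w, 0.0)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(w.sum(axis=1), 1e-12))
    # Symmetric normalized Laplacian L = I - D^{-1/2} W D^{-1/2}.
    lap = np.eye(len(x)) - d_inv_sqrt[:, None] * w * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(lap)                # eigenvalues in ascending order
    emb = vecs[:, :k]                            # k smallest eigenvectors
    emb = emb / np.maximum(np.linalg.norm(emb, axis=1, keepdims=True), 1e-12)

    # K-means on the embedding with farthest-point initialization (deterministic).
    centers = [emb[0]]
    for _ in range(1, k):
        gap = np.min([np.sum((emb - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(emb[np.argmax(gap)])
    centers = np.array(centers)
    for _ in range(n_iter):
        d2 = ((emb[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = np.argmin(d2, axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = emb[labels == j].mean(axis=0)

    # Representative of each cluster: member closest to the cluster mean series.
    reps = []
    for j in range(k):
        members = np.flatnonzero(labels == j)
        if members.size == 0:
            continue
        mean_series = x[members].mean(axis=0)
        closest = np.argmin(np.sum((x[members] - mean_series) ** 2, axis=1))
        reps.append(int(members[closest]))
    return labels, reps

# Toy data: two groups of three elements with clearly different stress histories.
t = np.linspace(0.0, 2.0 * np.pi, 30)
group_a = [np.sin(t) + 0.05 * i for i in range(3)]
group_b = [5.0 - np.sin(t) + 0.05 * i for i in range(3)]
toy = np.array(group_a + group_b)
labels, reps = spectral_representatives(toy, k=2, sigma=1.0)
```

For the experiment described above, the same function would be called with `k` set to 4, 8, 10, and 20 on the 800 simulated time series, and the returned indices mapped back to positions on the lining ring.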


Figure 8. Distribution of typical monitoring points obtained from the spectral model.

Based on the clustering analysis results, the data reconstruction experiment was conducted. The data at the clustering centroids are regarded as the input to be fed into the LR model to derive the stress information of all sections. As a result, the characterization ability of the model under different monitoring conditions was evaluated by three indicators, as shown in Figure 9.


Figure 9. Characterization ability of a model with different numbers of monitoring points. (a) Root mean square error (RMSE), (b) mean absolute error (MAE), and (c) mean absolute percentage error (MAPE).

4.2 Discussion of experimental results

According to the monitoring scheme shown in Figure 8, when $k = 4$, the clustering centroids are mainly distributed in the spandrel and hance of the tunnel. With increasing monitoring points, some clustering centroids are added without changing the previous ones and are distributed close to the previous ones in space. Furthermore, if there are too many additional measurement points, the monitoring points show a clustered distribution, such as the distribution when $k = 20$. Based on the analysis of the model reconstruction results shown in Figure 9, the characterization ability of the model is not directly proportional to the number of clustering centers. For example, in contrast to the condition with four monitoring points, although the number of clustering centers under the condition with eight monitoring points is doubled, the characterization ability of the model is not significantly enhanced. In addition, Figure 9 shows that a condition with fewer monitoring points can give the LR model stronger characterization ability than a condition with more points. Therefore, a larger number of monitoring points in practical engineering does not mean a better choice, because too many monitoring points can result in information redundancy and resource wastage.

To this end, for the case study in this paper, eight monitoring points are suggested in the field to capture the stress variations of the whole tunnel section, after considering the influence of equipment life and other factors on data quality. In general, the spandrel and hance are critically important locations for sensing the behavior of the whole tunnel surface. Some monitoring points can then be added at the arch crown and inverted arch to monitor more mechanical behaviors. Based on the optimized monitoring scheme, the stresses of elements randomly selected from the arch crown, spandrel, and hance are derived as examples, as shown in Figure 10. The deduction results are in good agreement with the numerical results. This workflow can deduce the mechanical behavior of the overall section with limited sensors and provides a deduction accuracy of approximately 98.9% (100% − 1.091%). Therefore, the optimal monitoring scheme is reasonable.


Figure 10. Comparison of numerical results with deduction results. (a) Arch crown, (b) hance, and (c) spandrel.

4.3 Verification of the proposed model

In order to verify the reliability of the proposed monitoring scheme optimization model, other widely used methods were adopted to conduct data clustering and reconstruction experiments. The clustering algorithms used for comparison included the density peaks clustering algorithm (DPCA), K-means, and fuzzy C-means (FCM). In addition, the support vector machine (SVM) model was adopted for data reconstruction experiments (Alex & Alessandro, 2014; Gao et al., 2017; Tan et al., 2022). These clustering models were instantiated on the prepared numerical data set, and the key points of segment stress variation for a given number of clusters were determined, as shown in Figure 11.
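As an illustration of one of the baselines, the sketch below applies plain K-means directly to the raw time series and takes the element nearest to each centroid as a key point. It is a simplified stand-in, assuming Lloyd's algorithm with a deterministic farthest-point start; the function name and toy data are hypothetical, and the DPCA and FCM baselines are not shown.

```python
import numpy as np

def kmeans_representatives(data, k, n_iter=100):
    """K-means on raw time series; returns (labels, index of element nearest each centroid)."""
    x = np.asarray(data, dtype=float)
    # Farthest-point initialization keeps the sketch deterministic.
    centers = [x[0]]
    for _ in range(1, k):
        gap = np.min([np.sum((x - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(x[np.argmax(gap)])
    centers = np.array(centers)

    for _ in range(n_iter):
        dist2 = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = np.argmin(dist2, axis=1)
        new_centers = np.array([
            x[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers

    # Recompute assignments against the final centroids before picking key points.
    dist2 = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    labels = np.argmin(dist2, axis=1)
    reps = [int(np.argmin(dist2[:, j])) for j in range(k)]
    return labels, reps

# Toy data: two groups of three elements with clearly different stress histories.
t = np.linspace(0.0, 2.0 * np.pi, 30)
group_a = [np.sin(t) + 0.05 * i for i in range(3)]
group_b = [5.0 - np.sin(t) + 0.05 * i for i in range(3)]
toy = np.array(group_a + group_b)
km_labels, km_reps = kmeans_representatives(toy, k=2)
```

Unlike the spectral approach, this baseline measures similarity in the original data space, so elements with similar stress magnitudes are grouped even when the graph structure of the data would suggest otherwise.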


Figure 11. Key points of segment stress determined by other baselines: (a) Density peaks clustering algorithm, (b) K-means, and (c) fuzzy C-means.

As shown in Figure 11, although there are some differences in the distribution of the clustering centers determined by different methods, the distribution positions are similar, that is, the spandrel, hance, inverted arch, and arch crown are key positions of stress variation. Furthermore, the data at the key positions determined by the different clustering models were fed into the SVM model and the LR model to characterize the stress distribution of the overall section. The deduction ability of the different models was evaluated, as shown in Figure 12.


Figure 12. Comparison of the deduction capability of different models. (a) Root mean square error (RMSE), (b) mean absolute error (MAE), and (c) mean absolute percentage error (MAPE). DPCA, density peaks clustering algorithm; FCM, fuzzy C-means; LR, linear regression; SVM, support vector machine.

The performances of the LR model and the SVM model are compared in terms of three indicators: RMSE, MAE, and MAPE. Comparison results show that the values of all three evaluation indicators of the LR model are smaller than those of the SVM model, indicating that the LR model is superior to the SVM model in terms of tunnel mechanical behavior deduction. In addition, among all clustering algorithms, the spectral algorithm is the most suitable one to cluster the data of tunnel mechanical behaviors. The error between the numerical simulation results and the deduction results obtained from the combined application of spectral and LR models is the smallest among all models. To this end, it can be concluded that the combined application of spectral and LR models is reliable to characterize the stress distribution of the overall tunnel section. The proposed model of monitoring scheme optimization is reliable.

5 APPLICATION IN THE NANJING YANGTZE RIVER TUNNEL

In order to ensure the safe operation of the Nanjing Yangtze River tunnel, some sensors have been installed in the field to determine the mechanical behaviors of tunnel segments. For each monitoring section, a total of 20 stress sensors have been installed, where each segment is equipped with two sensors, one inside and the other outside, as shown in Figure 13. In order to characterize the stress distribution of the overall section, the proposed model was applied in this case study, and the characterization results of one section were derived as an example.


Figure 13. Monitoring scheme of stress sensors at the site.

Since the sensor is embedded inside the segment, its position cannot be changed. Accordingly, the data of the eight sensors closest to the optimization positions determined by the spectral model were selected for application. At certain time points, the stress distribution of this section was derived according to the monitoring data recorded by these 8 sensors, as shown in Figure 14. It can be found that at a specific time point, the maximum stress is located at the arch crown, and the stress inside the segment is significantly greater than that outside the segment.
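The selection of installed sensors closest to the optimized positions can be sketched as a nearest-neighbor match around the ring. The sketch assumes that each position is described by a single angle in degrees and ignores whether a sensor sits inside or outside the segment; the greedy matching, the function names, and all angles are hypothetical.

```python
def angular_gap(a_deg, b_deg):
    """Smallest angle (degrees) between two positions on the ring."""
    diff = abs(a_deg - b_deg) % 360.0
    return min(diff, 360.0 - diff)

def nearest_sensors(optimal_deg, installed_deg):
    """For each optimized position, pick the closest installed sensor not yet used."""
    chosen = []
    for target in optimal_deg:
        free = [i for i in range(len(installed_deg)) if i not in chosen]
        chosen.append(min(free, key=lambda i: angular_gap(target, installed_deg[i])))
    return chosen

# Hypothetical layout: one sensor at the middle of each of 10 equal segments.
installed = [18.0 + 36.0 * i for i in range(10)]
# Hypothetical optimized positions from the clustering step.
optimal = [40.0, 135.0, 350.0]
selected = nearest_sensors(optimal, installed)
```

A greedy match is used only for brevity; if two optimized positions compete for the same sensor, an optimal assignment method could be substituted.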


Figure 14. Spatial deduction results for the real-time variation of segment stress.

In addition, the stress evolution trend of some typical positions, such as the arch crown, spandrel, and hance, was also determined using the proposed model, as shown in Figures 15 and 16. The results show that the tunnel segment stress varied periodically in this period, and the evolution trend is basically the same, but there are certain differences in the variation amplitude between different positions. For example, the variation amplitude of the arch crown is significantly higher than that of other positions. The deduction results for the real-time distribution of the overall section and its evolution trend can provide an important scientific basis for the prevention of disasters.

Figure 15. Comparison of the spatial deduction model and monitoring data at some positions: (a) arch crown, (b) hance, and (c) spandrel.
Figure 16. Spatial deduction results for segment stress at some positions without sensors: (a) hance, (b) spandrel, and (c) arch crown.

6 CONCLUSIONS

In this study, a monitoring scheme optimization method for a tunnel structure was proposed to characterize the stress distribution of the overall tunnel section. This provided scientific guidance for monitoring scheme optimization and solved the problem of insufficient monitoring for the overall tunnel structure using limited sensors. Some specific conclusions were drawn as follows.

The monitoring scheme optimization model was developed on the basis of the combined application of spectral and LR models. The spectral model was applied to determine the typical positions of monitoring points, and the LR model was used to reconstruct the data of the overall section using the data at the typical positions determined by the spectral model. The results show that sensors should be placed at the spandrel and hance of the tunnel, and can be placed at the inverted arch and arch crown if necessary. However, too many monitoring points can result in information redundancy, which consequently decreases the characterization ability of the model.

The reliability of the presented model was verified by comparative analysis with some widely used models. Three indicators, namely, RMSE, MAE, and MAPE, were used to evaluate the characterization ability of the proposed model. It was found that the combined application of spectral and LR models outperforms all other models, and the optimized monitoring scheme is reliable. The accuracy of the presented deduction model is approximately 98.9%, which means that the deduction model is reasonable. As an important application, the proposed model was applied in the Nanjing Yangtze River tunnel to derive the stress distribution of the overall section and its evolution trend, which provides an important scientific basis for disaster prevention.

ACKNOWLEDGMENTS

This work was supported by the Key project in Hubei Province under grant number 2023BCB048, the National Key R&D Program of China under grant number 2021YFC3100805, the National Natural Science Foundation of China under grant numbers 42293355 and 51991392, and the Project for Research Assistant of Chinese Academy of Sciences.

CONFLICT OF INTEREST STATEMENT

The authors declare no conflict of interest.

Biography

  • Weizhong Chen, PhD, is a professor and chief engineer at the Institute of Rock and Soil Mechanics. With a background in large-scale hydropower engineering, transportation, and energy projects in China, his research is focused on (1) underground engineering stability analysis and control, (2) technology for disaster control of underground engineering with large deformations of soft rock, and (3) online monitoring, big data analysis, and early warning system for major underground engineering.