1 Introduction
Inverse problem-solving methods use numerical calculations to infer unknown structural parameters or system states from known measured data and physical models [1–4]. They are significant research tools in fields such as structural mechanics [5–8], material science [9,10], computer vision [11,12], geophysics [13,14], and medical health [15,16]. Inverse problem-solving methods also have important applications in structural health monitoring, where they can provide information about the structural health status, including the shape, size, and location of damage, based on the acquired data [17,18]. Structural health monitoring systems typically comprise three main components: a sensor system, a data processing system, and a health assessment system [19]. The sensor system converts the mechanical response of the structure into an image or electrical signal, and the data processing system further processes the signal into data such as displacement or strain. The health assessment system then obtains the health status of the structure from the displacement or strain input. In engineering practice, the sensor system may become damaged over time or due to harsh working environments, leading to the loss of displacement or strain data. When the health assessment system lacks sufficient input signals of displacement or strain, accurately predicting the structural status can pose a significant challenge. Essentially, overcoming this engineering challenge involves solving the inverse mechanical problem of determining structural parameters in the absence of certain observed information.
Researchers have used many analytical methods to solve these inverse mechanical problems, such as the classical Tikhonov regularization [20] and Total Variation regularization [21] methods, which construct functional models to tackle the inverse problems. These traditional methods often require knowledge of specific parameters and boundary conditions of the physical model, and involve numerous iterative calculations to ensure the uniqueness and stability of the solution. However, even minor deficiencies in input information can result in a significant increase in errors during the calculation [22]. Optimization algorithms, such as the Genetic Algorithm (GA), can also be used to retrieve structural parameters from observed variables, but their solution efficiency and accuracy are low [23]. Thus, solving inverse problems with incomplete input information remains challenging.
In recent years, the advancement of data acquisition and computing capabilities has made data-driven solutions a prominent subject of research across various fields, offering possibilities for the development of high-precision algorithms for predicting structural parameters from observed data [24–26]. Data-driven methods and deep neural networks have also been proven to be effective in solving thin plate bending problems and partial differential equations [27–29]. In addition, generative models have gained increasing relevance in addressing challenges within the field of mechanics, especially in handling complex problems that involve incomplete input information [30–32]. In general, the process of solving inverse mechanical problems with incomplete information can be divided into two steps, which have been validated in Ref. [33]. First, an information completion mapping is designed using analytical or numerical methods to reconstruct the lost observation data. Second, a structure perception mapping is designed using iterative methods or optimization algorithms to map the complete observation data to the structural parameters. By utilizing the proposed Convolutional-Generative Adversarial Network (CGAN) method, Ref. [33] demonstrated that machine learning models can precisely evaluate the structural parameters even when a large quantity of sampled strains is randomly missing in the space domain. Beyond the CGAN model, a machine learning model with high precision, high efficiency, and rapid convergence is still needed for solving inverse mechanical problems.
In this paper, a lightweight machine learning model is proposed based on step-by-step completion of displacement information and prediction of structural parameters as shown in Fig.1. All displacement information can be divided into two groups and completed by using Fully Connected Neural Networks (FCNNs) successively. By combining advanced neural network architectures and intelligent completion techniques, the limitations posed by incomplete observation data can be overcome. The results obtained from this study have the potential to significantly contribute to the field of solving mechanical inverse problems, paving the way for improved accuracy and efficiency in engineering applications.
2 Method
2.1 Establishing displacement-digital structural parameter database
Based on the finite element method, a program is developed using MATLAB to calculate the displacement field of supercells. The process involves three steps. First, a two-dimensional square model with a size of 10 × 10 unit cells is established based on the concept of digital material. Due to its simple composition and programmable characteristics, digital material is widely used in fine-tuning mechanical properties, elastic waves, and so on [34–36]. Digital materials can be formed by combining two types of unit cells with significant differences in equivalent properties. Here, each unit cell is randomly assigned a value of 1 or 0, representing a Young’s modulus of 1 MPa (for hard material) or 0 MPa (for soft material). To avoid stiffness matrix singularity, a small value (ranging from 10⁻⁶ to 10⁻¹⁰) is used instead of 0 MPa. The Poisson’s ratios (ν) for both materials are set to 0.3. Second, planar four-node elements are utilized to divide each unit cell into a grid of 10 × 10 nodes. As shown in Fig.2, the nodes along the left boundary of the model are randomly selected and assigned fixed boundary conditions. To prevent rigid body displacement or rotation, the number of fixed nodes must be greater than or equal to 2. On the right boundary, all nodes have prescribed displacement values in the x-direction within the range of [0, 0.1]. The vertical displacements of all nodes in the model are set to 0. Finally, the finite element program calculates the displacements in the x-direction for all nodes. This process generates data sets that consist of the displacement field and the corresponding structural parameters (ordered as 0 and 1) for training a neural network. In this study, 411029 and 82801 sets of data are generated for training and validating the machine learning model, respectively.
2.2 Building displacement information completion model based on fully connected neural network
The input of the model is the displacement field with missing information, which corresponds to specific grid nodes marked as “None” points. For example, as shown in Fig.3, each row has five missing points. The network’s output is the displacement values of the missing points, numbered in Fig.3.
To complete the displacement information, a step-by-step displacement completion method is proposed in this study. The displacement field is first divided into rows/columns along the loading direction. If the loading direction is along the x-axis, the field is divided into rows. If it is along the y-axis, the field is divided into columns. Then, numbers are assigned to the rows or columns containing the points with missing displacement information. Subsequently, FCNN1 is utilized to complete the odd-numbered rows or columns, providing the missing displacement values. By integrating the output of FCNN1 with the original displacement data, this combined information is then input into FCNN2. The displacement values for the even-numbered rows or columns that contain missing data can be obtained by this process. Ultimately, this results in the creation of a complete displacement information field. The above approach capitalizes on the independent yet interconnected nature of the displacement field, allowing establishment of mapping relationships for displacement information completion in different regions. As a result, it reduces the data output volume of individual models and alleviates the challenges of model training.
As an example, a field of 96 known displacement values needs to be completed to a full field of 121 displacement values. As shown in Fig.3(a), the rows of the displacement field that contain missing information are identified according to the loading direction and numbered from 1 to 5. The missing displacement information of the odd-numbered rows (Rows 1, 3, 5) is first completed using FCNN1, resulting in the displacement field shown in Fig.3(b). Subsequently, the missing displacement information of the even-numbered rows (Rows 2, 4) is completed using FCNN2, leading to the final complete displacement field shown in Fig.3(c).
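The bookkeeping of this two-stage completion (96 → 111 → 121 values) can be illustrated with a short sketch. The trained FCNNs are replaced here by a hypothetical stand-in predictor that returns the column mean of the known values, and the positions of the missing points are assumed for illustration; only the data flow follows the text.

```python
def count_known(field):
    """Number of entries that are not missing (None)."""
    return sum(v is not None for row in field for v in row)


def rows_with_missing(field):
    """Indices of rows containing at least one missing value, top to bottom."""
    return [r for r, row in enumerate(field) if any(v is None for v in row)]


def column_mean_predictor(field, targets):
    """Hypothetical stand-in for a trained FCNN: predict each target as the
    mean of the values currently known in the same column."""
    preds = {}
    for r, c in targets:
        known = [row[c] for row in field if row[c] is not None]
        preds[(r, c)] = sum(known) / len(known)
    return preds


def fill_rows(field, rows, predictor):
    """Return a copy of `field` with the missing values of `rows` filled in."""
    result = [list(row) for row in field]
    targets = [(r, c) for r in rows for c, v in enumerate(result[r]) if v is None]
    for (r, c), value in predictor(result, targets).items():
        result[r][c] = value
    return result


# 11 x 11 toy field; five rows lose five values each (assumed positions).
field = [[0.01 * c for c in range(11)] for _ in range(11)]
for r in (1, 3, 5, 7, 9):
    for c in (1, 3, 5, 7, 9):
        field[r][c] = None

numbered = rows_with_missing(field)           # rows numbered 1..5
stage1 = fill_rows(field, numbered[0::2], column_mean_predictor)   # Rows 1, 3, 5
stage2 = fill_rows(stage1, numbered[1::2], column_mean_predictor)  # Rows 2, 4
print(count_known(field), count_known(stage1), count_known(stage2))  # 96 111 121
```

The second stage receives the output of the first stage together with the original data, which mirrors how the input of FCNN2 is assembled.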
The displacement field with missing information, shown in Fig.3(a), is flattened into a vector with a size of 1 × 96, which serves as the input of the FCNN1 model. The FCNN is a type of neural network based on a multi-layer perceptron. It is composed of multiple fully connected layers, with each layer containing multiple neurons and each neuron connected to all neurons in the previous layer [37]. The output of FCNN1 is the 15 displacement values in the odd-numbered rows in Fig.3(a). FCNN1 consists of four fully connected layers with 256, 128, 64, and 5 neurons, respectively. The Parametric Rectified Linear Unit (PReLU) is used as the activation function in all layers except the output layer. PReLU is an improved version of ReLU that has a small slope of α = 0.3 in the negative region (Fig.4(a)), which helps to avoid the inactivation problem of ReLU in the negative range. The mean square error (MSE) is chosen as the loss function, measuring the difference between the output displacement values and the correct displacement values. The learning rate follows a decay schedule (Fig.4(b)) defined by max_lr, the maximum value of the learning rate (1 × 10⁻³), min_lr, the minimum value (1 × 10⁻⁴), a decay rate of 2000, and the training step i. The Adam optimizer is utilized to minimize the loss. The average relative error is defined as the absolute percentage error between the real displacement and the predicted displacement in a training batch. For FCNN1, a parallel network structure (Fig.5) is designed to complete the missing displacement information of the odd-numbered rows.
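The training ingredients named above can be written out as plain Python functions. This is a minimal sketch: PReLU, MSE, and the relative error follow their standard definitions, whereas the exponential form of the learning-rate schedule is an assumption chosen to be consistent with the stated parameters (max_lr, min_lr, decay rate, step i), since the exact expression is not given here.

```python
import math


def prelu(x, alpha=0.3):
    """PReLU with negative-region slope alpha (0.3 as stated in the text)."""
    return x if x >= 0 else alpha * x


def mse(pred, true):
    """Mean square error between predicted and reference displacements."""
    return sum((p - t) ** 2 for p, t in zip(pred, true)) / len(true)


def mean_relative_error(pred, true):
    """Average absolute percentage error over a batch (reference values must be nonzero)."""
    return 100.0 * sum(abs(p - t) / abs(t) for p, t in zip(pred, true)) / len(true)


def learning_rate(i, max_lr=1e-3, min_lr=1e-4, decay=2000):
    """ASSUMED schedule: exponential decay from max_lr towards min_lr.
    Only the parameter values are taken from the text, not this formula."""
    return min_lr + (max_lr - min_lr) * math.exp(-i / decay)


print(prelu(-1.0), prelu(2.0))          # -0.3 2.0
print(mse([1.0, 2.0], [1.0, 4.0]))      # 2.0
```

Under this assumed form the rate starts at max_lr for i = 0 and approaches min_lr as the training step grows.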
Subsequently, combining the output result of FCNN1 (Fig.3(b)) with the original known displacement data, the 111 displacements are flattened into a vector with a size of 1 × 111. This serves as the input variable for the FCNN2 model. The output of FCNN2 is the 10 displacement values in even-numbered rows in Fig.3(b). Similar to FCNN1, the parallel network structure is used in FCNN2 to map the displacement information. The hyperparameters of the network structure remain the same as those of FCNN1.
Finally, by combining the results from FCNN1 and FCNN2, the completion from 96 to 121 displacement values is achieved, which provides richer input information for the subsequent prediction of digital structural parameters. The details of the machine learning models are shown in Table A1 in the Appendix.
The hyperparameter tuning of the machine learning model is essential. In general, maximizing the model’s complexity is an effective method for achieving outstanding performance on the training set. If an overfitting issue occurs, it can be mitigated by expanding the data set or by implementing techniques such as batch normalization, dropout, pooling, and early stopping.
2.3 Building structural parameters prediction model based on convolution neural network
A CNN is used in this section to establish a mapping between the displacement field and the structural parameters. The completed displacement information from the FCNNs serves as the input of the CNN. As shown in Fig.6, the displacement field and structural parameters are treated as two-dimensional image data. The CNN architecture depicted in Fig.6 consists of four convolutional layers and two fully connected layers. The convolutional layers have feature mapping sizes of 128, 256, 512, and 1024, respectively, with the convolution kernel size set as 3 × 3. The activation function used in the CNN is “Leaky ReLU”. To preserve the original information during the convolution operation, “Same” padding is applied to ensure that the data dimensions remain unchanged. The convolutional layers are succeeded by the fully connected layers consisting of 512 and 100 neurons, respectively. To enhance the convergence speed of the model, the weights are initialized using the “Xavier” method, and the biases are initialized to a constant value of 0.1. The loss function is defined as the MSE between the real and predicted values of the structural parameters. To make the output conform to the expression of the digital material, a Sigmoid activation function is used to compress the output values into the range from 0 to 1. A threshold of 0.5 is further applied: values greater than or equal to 0.5 are changed to 1, while values less than 0.5 are changed to 0.
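The output post-processing can be sketched as follows. The sigmoid and the 0.5 threshold follow the description above; the accuracy measure, taken here as the percentage of unit cells whose predicted value matches the reference, is an assumption about how prediction accuracy is scored, and the function names are illustrative.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function mapping any real z into (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def binarize(raw_outputs, threshold=0.5):
    """Apply the sigmoid, then map values >= threshold to 1 and the rest to 0."""
    return [1 if sigmoid(z) >= threshold else 0 for z in raw_outputs]


def cell_accuracy(pred, true):
    """ASSUMED metric: percentage of unit cells predicted correctly."""
    return 100.0 * sum(p == t for p, t in zip(pred, true)) / len(true)


predicted = binarize([-2.0, 0.0, 3.0, -0.1])
print(predicted)                                  # [0, 1, 1, 0]
print(cell_accuracy(predicted, [0, 1, 0, 0]))     # 75.0
```

A raw output of exactly 0 maps to a sigmoid value of 0.5 and is therefore assigned to the hard material (1), in line with the "greater than or equal to" rule.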
The TensorFlow deep learning framework [38] is utilized, and the training process of the model is accelerated using an NVIDIA GTX-1060 6G graphics card. The details of the software and hardware are shown in Tab.1, providing the necessary computational capabilities for training the model effectively.
2.4 Inverse mechanical problem-solving method based on genetic algorithm
In the field of solving inverse mechanical problems, analytical methods require high accuracy in model parameters and initial boundary conditions, while feedback iterative methods are more feasible when input information is lacking. Here, a feedback iterative method is employed to solve the inverse mechanical problem with incomplete input information. By comparing the error between the output parameters and the target parameters, the design variables are updated iteratively. This iteration continues until the error is reduced to an acceptable value. One specific type of feedback iterative method is the GA, which was introduced by Holland in 1992 [39]. The GA requires only the determination of variable coding and the definition of a corresponding fitness function, which makes it particularly suitable for addressing inverse mechanical problems involving the prediction of structural parameters based on incomplete displacement field information. The known condition is an input represented by 96 displacements (Fig.3(a)) or 121 displacements (Fig.3(c)), while the target output is the corresponding 0/1 sequence of digital structural parameters. Utilizing the GA toolbox integrated into MATLAB 2018b, the function X = GA(FITNESSFCN, NVARS, [], lb, ub, INTCON) is employed to address the aforementioned inverse mechanical problem. This statement uses the “GA” function to find the X that minimizes FITNESSFCN subject to integer constraints. NVARS is the number of design variables, lb and ub are the lower bound and upper bound of X, respectively, and INTCON specifies which variables are constrained to integer values. Through iterative solving of the above function, the GA aims to identify the set of structural parameters that most accurately corresponds to the given displacement field, ultimately solving the mechanical inverse problem.
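To make the feedback iteration concrete, the following sketch implements a small binary GA in plain Python. It is not the MATLAB toolbox routine used in the study: the operators (tournament selection, one-point crossover, bit-flip mutation, elitism), all parameter values, and the one-dimensional "springs in series" forward model are hypothetical stand-ins for the finite element model. The fitness is the squared mismatch at the measured nodes only, so missing readings simply drop out of the sum.

```python
import random


def forward(bits, soft=0.1):
    """Hypothetical 1D forward model: unit-loaded springs in series, where a
    hard cell (1) has stiffness 1 and a soft cell (0) has stiffness `soft`."""
    disp, total = [], 0.0
    for b in bits:
        total += 1.0 / (1.0 if b else soft)
        disp.append(total)
    return disp


def make_fitness(observed):
    """`observed` maps node index -> measured displacement (missing nodes absent)."""
    def fitness(bits):
        model = forward(bits)
        return sum((model[i] - u) ** 2 for i, u in observed.items())
    return fitness


def ga_binary(fitness, nvars, pop_size=40, generations=60, p_mut=0.05, seed=0):
    """Minimize `fitness` over 0/1 vectors of length nvars (nvars >= 2)."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(nvars)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=fitness)
        history.append(fitness(pop[0]))
        children = [pop[0][:]]                          # elitism: keep the best
        while len(children) < pop_size:
            a = min(rng.sample(pop, 3), key=fitness)    # tournament selection
            b = min(rng.sample(pop, 3), key=fitness)
            cut = rng.randrange(1, nvars)               # one-point crossover
            child = a[:cut] + b[cut:]
            child = [1 - g if rng.random() < p_mut else g for g in child]  # mutation
            children.append(child)
        pop = children
    return min(pop, key=fitness), history


true_bits = [1, 0, 1, 1, 0, 0, 1, 0]
full = forward(true_bits)
observed = {i: u for i, u in enumerate(full) if i not in (2, 5)}  # two readings missing
fit = make_fitness(observed)
best, history = ga_binary(fit, nvars=len(true_bits))
```

Because some readings are missing, several bit patterns may fit the remaining data equally well, which is one way the non-uniqueness of the inverse problem shows up in such a search.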
The data generation, FCNN, CNN, and GA models in this section are presented in a data flow chart in Fig. A1 in the Appendix. The hyperparameter settings of the machine learning algorithms and the GA are provided in Table A1 in the Appendix.
3 Results
3.1 Training and verification of displacement information completion model
The training set consists of 411029 sets of displacement-digital structural parameter data, and the verification set comprises 82801 sets. Fig.7 illustrates the training progress of the FCNN1 model, which takes approximately 1.5 h for 145 epochs. The loss function decreases to 4.01 × 10⁻⁵, and the average relative error decreases to 2.01% on the verification set. The FCNN2 model is also trained for 145 epochs, with a training time of approximately 2.2 h. Fig.8 illustrates the training process for FCNN2, where the loss function decreases to 7.74 × 10⁻⁵ and the mean relative error decreases to 3.25% on the verification set. Therefore, the well-trained FCNN models can be applied for displacement information completion.
Evaluations were conducted on the well-trained FCNN1 and FCNN2 models using a test set consisting of 5000 data samples. The mean relative errors on the test set were calculated to assess the performance of the models. Fig.9(a) shows the results for the FCNN1 model. FCNN1 performs well in completing the missing displacement values, with over 99.9% of the displacement values having a mean relative error of less than 5%. Similarly, Fig.9(b) presents the results for the FCNN2 model, for which over 84.9% of the displacement values have a mean relative error of less than 5%. These findings indicate that both FCNN1 and FCNN2 exhibit good performance in accurately completing the missing displacement information.
3.2 Training and verification of structural parameters prediction model
A total of 411029 sets of data are used as the training set, and 82801 sets of data are used as the verification set of the CNN model. To compare the influence of displacement information completion on the prediction of structural parameters by data-driven method, two models with the same network structure are trained: CNN1 and CNN2. CNN1 establishes the mapping between 96 displacements and structural parameters, while CNN2 establishes the mapping between 121 displacements and structural parameters.
For CNN1, the training time for a single epoch is approximately 110 s. The changing trend of the average accuracy curve with epoch is shown in Fig.10(a). After 212 epochs of training, the accuracy of the model on the verification set reaches 77.62%. On the other hand, CNN2 is trained to establish the relationship between 121 displacement values and structural parameters. It should be noted that the input displacement field for CNN2 is completed by FCNNs. The training time for a single epoch of CNN2 is around 173 s. After 149 epochs of training, the accuracy of CNN2 on the verification set reaches 95.15%. The changing trend of the accuracy curve with epoch is illustrated in Fig.10(b).
Based on the training results, the average accuracy of CNN2 on the verification set is 17.53 percentage points higher than that of CNN1 (95.15% versus 77.62%), which indicates that the step-by-step displacement completion method is effective in improving the prediction accuracy of digital structural parameters.
3.3 Comparison of prediction accuracy between data-driven method and genetic algorithm
The proposed method, denoted as ML-121 in this study, involves using FCNNs to complete the displacement from 96 to 121 points, and then using CNN2 to predict the structural parameters. For comparison, CNN1 is used to predict structural parameters based on 96 displacements (for which the method is denoted as ML-96). Fig.11(a) illustrates the displacement completion results of FCNNs, while Fig.11(b) illustrates the structures ① and ② predicted from ML-96 and ML-121, respectively.
The accuracy and efficiency of individual examples are compared first. As shown in Fig.12(a), the mean relative error of FCNNs in completing the displacement information for 20 points is 3.11%. Fig.12(b) presents the accuracy of predicting structural parameters using the CNN model based on 96 displacements (79%) and 121 displacements (100%). The results demonstrate that the displacement information completion can improve the prediction accuracy of structural parameters and is a crucial step in the data-driven method for solving mechanical inverse problems with missing input information.
For comparison, GA is used to predict structural parameters based on 96 displacements (GA-96) and 121 displacements (GA-121). Fig.11(c) demonstrates the structures ③ and ④ predicted from GA-96 and GA-121, respectively. The structural parameter optimization accuracies of the GA based on 96 and 121 displacements are 68% and 69%, respectively. Compared with the data-driven method mentioned above, the GA’s accuracy is lower, and the completion of displacement information does not significantly improve the accuracy of prediction of structural parameters. Additionally, the data-driven method takes approximately 0.0129 s to predict a single structural parameter, while the GA takes around 10 s, making the latter 770 times slower than the former. The data-driven method excels in real-time prediction and perception of structural parameters, making it suitable for applications in structural health monitoring.
To further evaluate the accuracy of each method, a test set containing 5000 samples is used to predict structural parameters using the ML-96, ML-121, GA-96, and GA-121 models, respectively. It is noted that the 121 displacements are predicted by the trained FCNNs. The histograms in the orange areas of Fig.13(a) and Fig.13(b) show the distribution of structural prediction accuracy for the GA, while the accuracy of structural parameter prediction using the CNN models is shown in the blue areas of Fig.13(a) and Fig.13(b).
The average accuracy of optimizing the structure from 96 displacements using the GA is 60.98%, while the average accuracy of optimizing the structure from 121 displacements is 62.42%. Although the number of displacement values has been increased from 96 to 121, the improvement in accuracy for the GA is only 1.44 percentage points. These results indicate that when predicting the structural parameters, the information carried by the displacement field has not been fully utilized by the GA. As a comparison, the average accuracy of predicting the structure from 96 displacements using CNN1 is 77.04%, while the average accuracy of predicting the structure from 121 displacements using CNN2 is 86.25%. The average accuracies using the CNN models from 96 and 121 displacements are 16.07 and 23.83 percentage points higher than those from the GA, respectively.
The above results demonstrate that the data-driven method has a stronger ability than GA to extract features from the displacement field, whether using the displacements with missing information or the complete displacements. By considering the displacements as a matrix and preserving the spatial relationship between the measuring points, the data-driven method fully utilizes the information carried by the displacements. By completing the displacements step by step, this approach achieves a balance between training efficiency and accuracy, and has a positive impact on the subsequent prediction of structural parameters. Conversely, the GA treats the displacement information as independent variables, disregarding the spatial relationship of displacements and treating the “two-dimensional” data as “zero-dimensional” data. Therefore, the data-driven method proposed in this paper exhibits higher accuracy in predicting structural parameters, and provides a new approach for solving mechanical inverse problems with missing displacement information.
4 Conclusions
In this study, a comprehensive process for solving inverse mechanical problems with missing input information is proposed based on the data-driven method. The process uses FCNNs to complete the missing displacement values and a CNN to construct a mapping relationship between the displacement values and the structural parameters. This method achieves a structural parameter prediction accuracy of 95.15% on the verification set when 20% of the displacement information is missing. On the test set, the data-driven method is 23.83 percentage points higher in accuracy than the GA in predicting structural parameters. These findings highlight the advantages in accuracy and efficiency offered by the data-driven method, and present a new approach for addressing inverse mechanical problems with missing displacement information.
5 Appendix
As illustrated in Fig. A1, the data generation, FCNNs, CNNs, and GA models are presented in a flow chart format. The original figures in the manuscript are accompanied by annotations.