1 INTRODUCTION
The identification and classification of rock samples are important for oil and gas exploration, mineral resource exploration, and geological analysis. At present, rock sample identification mainly relies on gravity and magnetic, logging, seismic, remote sensing, electromagnetic, geochemical, hand specimen, and thin section analysis methods (Anonymous, 2017). Using deep learning on images to build an automatic identification and classification model for rock samples has great development potential. Image-based deep learning can reduce the dependence on experimental equipment and professional knowledge, and achieves automatic identification and classification of rock samples starting from image recognition. It also has clear practical significance for the research, exploration, and development of oil and gas fields and mineral resources.
Scholars at home and abroad have conducted in-depth research on the classification of rock images using image recognition technology. Hossein Izadi et al. (2017) developed an ore flake recognition system that analyzes the color and texture characteristics of ore flakes and achieved good practical results with neural network classification. Hong et al. (2017) proposed a method based on image processing, fractal theory, and artificial neural networks that uses images of rock joint surfaces to quantitatively determine geological strength indicators. Yadigar (2019) proposed an effective well geological facies classification model based on deep learning. Domestic researchers have also made great progress in rock image recognition. Liu Juexian et al. (2016) used texture, shape, and spatial features as feature parameters for classification with support vector machines and achieved good results. Cheng Guojian et al. (2015) analyzed the color, texture, shape, and spatial characteristics of rock slice images and used these as feature parameters for classification with support vector machines, also achieving good results. Based on the deep learning system TensorFlow, Xu Shuteng (2018) designed a targeted Unet convolutional neural network model that automatically extracts the deep feature information of ore minerals under the mineral phase microscope to realize intelligent identification and classification of ore minerals. Given the current breakthroughs of deep learning in image processing, exploiting its advantages is expected to significantly improve the image recognition of cuttings.
Based on a deep convolutional network and the transfer learning method, this paper uses an image sample set of black coal, gray-black mudstone, and gray fine sandstone to establish the corresponding automatic recognition and classification model. The test results show that the model has good recognition ability.
2 INTRODUCTION OF RELATED THEORIES AND MODELS
In the task of rock image classification, first, experts need to manually label the rock names in the rock sample images and make network training sample files based on the color, texture, and particle size of the rock as the basis for discrimination. This process is difficult, time-consuming, and laborious. The training set of labeled rock samples is very limited. Therefore, for the classification of rock images, which is a small sample data set, it is more important to select a network structure with strong generalization ability and fewer training parameters.
The GoogLeNet network clusters sparse matrices into denser sub-matrices to improve computing performance while improving network performance. The “Inception” module serves as the basic unit of the GoogLeNet network and increases the width and depth of the network. Multiple Inception modules are stacked together with other structures to build a 22-layer network with high sparsity and high computational performance. In addition, the GoogLeNet network has achieved a top-5 accuracy of more than 93% on the ImageNet large-scale data set. Therefore, considering these advantages, namely a small number of training parameters, good versatility, and high accuracy, this paper proposes to use GoogLeNet Inception-v3 to classify rock samples based on the idea of transfer learning of trained parameters (Hu, Yan & Xia, 2017; Szegedy, Vanhoucke, Ioffe, Shlens & Wojna, 2016). The schematic diagram of the GoogLeNet network structure is shown in Figure 1, and the basic Inception module is shown in Figure 2.
The Inception-v3 network expands the network without increasing the computational cost of the Inception infrastructure, extracts more subtle features under the same computing power, and improves the training effect. Its design is mainly based on the Hebbian principle and the intuition of multi-scale processing to increase the depth and width of the network (Wang & Fujimoto, 2018).
In the basic Inception module, \(1 \times 1\) convolution is used to reduce the dimensionality of the input and the computational cost. The Inception-v3 structure replaces the \(5 \times 5\) convolution with two consecutive \(3 \times 3\) convolutions, which further reduces the computational cost. Instead of simply stacking convolution layers, it uses convolution kernels of different sizes so that the receptive field is progressively enlarged, and finally concatenates the outputs to fuse features of different scales. The most notable feature of Inception-v3 is the expansion of the convolution calculation between layers.
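The saving obtained by replacing one \(5 \times 5\) convolution with two stacked \(3 \times 3\) convolutions can be checked by counting weights. The sketch below is only an illustration: it assumes equal input and output channel counts, stride 1, and no biases, and the function names and the channel count of 64 are hypothetical choices rather than values from this paper.

```python
def conv_weights(kernel_size, in_channels, out_channels):
    """Number of weights in one 2-D convolution layer (biases ignored)."""
    return kernel_size * kernel_size * in_channels * out_channels


def stacked_receptive_field(kernel_size, num_layers):
    """Receptive field of num_layers stacked stride-1 convolutions."""
    return num_layers * (kernel_size - 1) + 1


channels = 64  # hypothetical channel count, chosen only for illustration

single_5x5 = conv_weights(5, channels, channels)
double_3x3 = 2 * conv_weights(3, channels, channels)

print(stacked_receptive_field(3, 2))  # 5, the same coverage as one 5x5 kernel
print(single_5x5, double_3x3)         # 102400 73728
print(1 - double_3x3 / single_5x5)    # about 0.28, i.e. roughly 28% fewer weights
```

Under these assumptions the ratio is always 18/25 regardless of the channel count, so the two stacked layers cover the same \(5 \times 5\) region with about 28% fewer weights.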
3 EXPERIMENTAL DESIGN
3.1 Sample data and data enhancement
The rock image samples used in the experiment come from the official data set of the 9th “Teddy Cup” Data Mining Challenge. Rock images of black coal, gray fine sandstone, and gray-black mudstone are selected for training and recognition. All image data are divided into a training set, a validation set, and a test set in the proportions of 60%, 20%, and 20%. The training set is used to train the deep convolutional neural network model after data enhancement; the validation set is used to select the optimal model; the test set is used to test the accuracy of the optimal model.
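A 60/20/20 split of this kind can be sketched as follows. The paper does not state how the split was drawn, so the random shuffle, the fixed seed, the function name `split_dataset`, and the file names are all assumptions made for illustration.

```python
import random


def split_dataset(items, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle items and split them into training, validation, and test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for a reproducible split
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # the remainder becomes the test set
    return train, val, test


# Hypothetical file names standing in for the rock images.
image_paths = [f"rock_{i:03d}.jpg" for i in range(100)]
train, val, test = split_dataset(image_paths)
print(len(train), len(val), len(test))  # 60 20 20
```

With three classes of unequal size, applying the same function to each class separately would keep the class proportions equal across the three subsets.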
A deep convolutional neural network for image classification requires a large number of training samples to obtain the optimal model parameters, but the samples collected in this article are far fewer than the hundreds of thousands of samples available in other fields. It is therefore necessary to use image processing techniques such as image flip, rotation, color contrast and saturation transformation, and random noise to amplify the original data set. On the one hand, a larger data set can be obtained; on the other hand, enhancing the data set helps prevent the neural network from learning irrelevant patterns, which improves the overall performance of the network. Because some of these techniques perform complex processing on the image, they can change the relevant characteristics of the image. In order to improve the efficiency of modeling, this article uses image flip and translation to quickly expand the original data, since flipping and translating an image does not change the main characteristics of the rock image (Wu & Xiao, 2019). This article focuses on the classification of rock sample images. The enhancement operation is shown in Figure 3.
As shown in Figure 3, due to the texture characteristics of the rock itself, the rock varies in shape and size and may be located anywhere in the image, which makes it difficult for the network to train and learn uniformly. Therefore, the sample images are translated by different distances and rotated by different angles. With these operations, the data volume of the rock image training samples can be expanded to about 3 times that of the original training set, which can improve both the generalization ability of the model and the accuracy of the rock sample classification network.
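The flip and translation operations can be sketched with NumPy as follows. This is a minimal illustration rather than the pipeline used in this paper: the function names, the zero-filling of the uncovered border, and the default shift of 20 pixels are assumptions, and rotation is omitted for brevity.

```python
import numpy as np


def flip_horizontal(image):
    """Mirror an (H, W, C) image left to right."""
    return image[:, ::-1, :]


def translate(image, dx, dy):
    """Shift an (H, W, C) image by dx columns and dy rows, zero-filling the gap."""
    h, w = image.shape[:2]
    shifted = np.zeros_like(image)
    if abs(dx) >= w or abs(dy) >= h:
        return shifted  # the image has been shifted completely out of view
    src_rows = slice(max(0, -dy), min(h, h - dy))
    dst_rows = slice(max(0, dy), min(h, h + dy))
    src_cols = slice(max(0, -dx), min(w, w - dx))
    dst_cols = slice(max(0, dx), min(w, w + dx))
    shifted[dst_rows, dst_cols] = image[src_rows, src_cols]
    return shifted


def augment(image, dx=20, dy=20):
    """Return the original image plus one flipped and one translated copy."""
    return [image, flip_horizontal(image), translate(image, dx, dy)]


# A tiny dummy image; real rock images would be loaded from disk instead.
image = np.arange(4 * 4 * 3, dtype=np.uint8).reshape(4, 4, 3)
augmented = augment(image, dx=1, dy=0)
print(len(augmented))  # 3
```

Producing one flipped and one translated copy per image triples the number of training samples, which is consistent with the roughly threefold expansion reported above.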
3.2 Realization of rock classification
The operating system used in this experiment is Windows 10, and the environment is TensorFlow 2.3.0. In the TensorFlow environment, the training and testing processes based on Inception-v3 are shown in Figure 4 and Figure 5, respectively.
The flow of data when the Inception-v3 model is used for transfer learning training is as follows. First, a rock image is input and processed by the feature extraction model, which consists of the convolutional and pooling layers of the Inception-v3 model. These migrated layers compute the image features of the rock image, which are represented as 2048-dimensional vectors and saved in a buffer. Starting from the input side, the network consists of three convolutional layers followed by one pooling layer, then two convolutional layers followed by one pooling layer, and finally 11 mixed layers. The original model also includes a Dropout layer, a fully connected layer, and a Softmax layer. For the rock image recognition in this paper, these three layers need to be retrained.
3.2.1 Feature extraction
This step freezes the weights of the model pre-trained in the source domain and transfers them to training on the target data set. In this way, the training of small sample images can be completed by modifying only the final classifier of the network (Zhang, Li & Han, 2018). The purpose of feature extraction in transfer learning is to apply the features extracted from the source domain to the small sample data of the target domain, which not only simplifies the feature extraction steps but also yields a training model with better performance. The detailed steps of feature extraction are as follows:
(1) In order to prevent the basic weight information in the pre-trained model from being modified during training, the weights pre-trained by Inception-v3 on the ImageNet data set are used as the starting point for the target data set, and the pre-trained layers of the model are frozen to establish the basic model.
(2) Online data enhancement is performed on the small sample data set to obtain more details of the target data set and increase the generalization ability of the model.
(3) Set the image input size to (299, 299, 3), use normalization to rescale the image pixels from \([0, 255]\) to \([-1, 1]\), and build a feature extractor.
(4) Connect the defined basic model and the feature extractor to build a model, and perform global average pooling on the acquired feature vectors.
(5) In order to further reduce the amount of calculation, the Dropout rate is set to 0.3, a classification layer is added to classify the rock sample images, and the image classification model is constructed.
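The pixel normalization in step (3) amounts to a linear rescaling. The sketch below shows one way to write it with NumPy; the function name `normalize_pixels` and the random dummy image are assumptions made for illustration.

```python
import numpy as np


def normalize_pixels(image):
    """Map pixel values from [0, 255] to [-1, 1]."""
    return image.astype(np.float32) / 127.5 - 1.0


# A dummy image with the input size used in step (3).
image = np.random.randint(0, 256, size=(299, 299, 3), dtype=np.uint8)
scaled = normalize_pixels(image)
print(scaled.shape, scaled.min() >= -1.0, scaled.max() <= 1.0)
```

This is the same mapping that `tf.keras.applications.inception_v3.preprocess_input` applies, so either form could be used in front of the network.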
In feature extraction, the final model parameters are as follows: the learning rate is 0.001; the batch size is 32; the number of iterations is 50; the optimizer is Adam, a method that computes an adaptive learning rate for each parameter, is easy to implement, has low memory requirements, and is computationally efficient; the classifier is Softmax.
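Steps (1) to (5) and the parameters above could be assembled into a Keras model roughly as follows. This is a minimal sketch under stated assumptions, not the code used in this paper: the function name `build_feature_extraction_model`, the use of `tf.keras.applications.InceptionV3`, and the sparse categorical cross-entropy loss are assumptions; the inputs are assumed to be already scaled to \([-1, 1]\); and "50 iterations" is read here as 50 epochs.

```python
import tensorflow as tf


def build_feature_extraction_model(num_classes=3, weights="imagenet",
                                   dropout_rate=0.3, learning_rate=0.001):
    """Frozen Inception-v3 base with a new pooling, Dropout, and Softmax head."""
    # Pre-trained Inception-v3 without its original classification layers.
    base_model = tf.keras.applications.InceptionV3(
        include_top=False, weights=weights, input_shape=(299, 299, 3))
    base_model.trainable = False  # freeze the transferred weights

    inputs = tf.keras.Input(shape=(299, 299, 3))     # pixels already in [-1, 1]
    x = base_model(inputs, training=False)           # batch norm stays in inference mode
    x = tf.keras.layers.GlobalAveragePooling2D()(x)  # 2048-dimensional feature vector
    x = tf.keras.layers.Dropout(dropout_rate)(x)
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(x)

    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=learning_rate),
        loss="sparse_categorical_crossentropy",  # assumed: integer class labels
        metrics=["accuracy"])
    return model


# Example training call, where train_ds and val_ds are hypothetical
# tf.data datasets batched with a batch size of 32:
#   model = build_feature_extraction_model()
#   model.fit(train_ds, validation_data=val_ds, epochs=50)
```

Because the base is frozen, only the weights of the final Dense layer are updated in this stage, which keeps training feasible on a small sample set.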
Finally, the evolution of the rock features extracted in some layers during model training is shown in Figure 6.
3.2.2 Fine tuning
After the network model has been trained on the training set and validated on the validation set, and the results on both have converged, the model needs to be fine-tuned by unfreezing all or part of the basic model so that it adapts better to the target domain. Generally speaking, for image classification tasks, the bottom layers of the model capture the most basic features of an image, which apply to almost any type of image. Therefore, the bottom layers of the model do not need to be adjusted and remain frozen. Only the top part of the model is adjusted: training continues with the top part of the model and the added classification layer on the small sample target domain data set, which adjusts the pre-trained network weights to better fit the small sample target data and finally improves the classification accuracy of the model.
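In Keras, this kind of partial unfreezing could look like the sketch below. The function name `prepare_for_fine_tuning`, the layer index `fine_tune_at`, the reduced learning rate of 1e-5, and the loss are assumptions; the paper does not report which layers were unfrozen or which learning rate was used during fine tuning.

```python
import tensorflow as tf


def prepare_for_fine_tuning(model, base_model, fine_tune_at, learning_rate=1e-5):
    """Unfreeze base_model from layer index fine_tune_at upward and recompile."""
    base_model.trainable = True
    for layer in base_model.layers[:fine_tune_at]:
        layer.trainable = False  # bottom layers stay frozen
    # Recompiling is required for the changed trainable flags to take effect.
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=learning_rate),
        loss="sparse_categorical_crossentropy",  # assumed: integer class labels
        metrics=["accuracy"])
    return model
```

A smaller learning rate than in the feature extraction stage is a common precaution here, since large updates could quickly overwrite the pre-trained weights on a small data set.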
3.3 Training results and model evaluation
Since the models built through deep learning are end-to-end models, there is no need to select features manually; it is enough to include as many different types of rock images as possible in the data set. Given the original data as input, the model automatically extracts the features of each category. During training, each step randomly selects rock images from the validation set to evaluate the model: their feature vectors are input, predictions are made, and the results are compared with the actual category labels, while the weight parameters of the model are updated through backpropagation. The accuracy of the model continues to increase with the number of steps in each iteration. Although the process of model feature selection cannot be observed directly, feature selection and model results can still be evaluated through the training accuracy, the test accuracy, and the cross-entropy value. Based on the log files from the training of the rock sample classification network, the accuracy and loss function curves of the Inception-v3 network for the three kinds of rock sample images are shown in Figure 7 and Figure 8, where the blue curve represents the training set and the orange curve represents the validation set.
The training accuracy refers to the percentage of accurately classified images of the current training, and the test accuracy refers to the percentage of accurately classified images randomly selected. Cross-entropy shows the learning effect during the model training process. The smaller the value, the better the learning effect. The predicted value of each training is compared with the actual value and the weight of the last layer is changed through backpropagation.
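Both quantities can be computed directly from the Softmax outputs. The sketch below is an illustration with made-up probabilities, not data from this paper; the function names and the clipping constant `eps` are assumptions.

```python
import numpy as np


def accuracy(probabilities, labels):
    """Fraction of images whose most probable class equals the true label."""
    return float(np.mean(np.argmax(probabilities, axis=1) == labels))


def cross_entropy(probabilities, labels, eps=1e-12):
    """Mean negative log-probability assigned to the true class."""
    true_class_prob = probabilities[np.arange(len(labels)), labels]
    # Clip to avoid log(0) when a true class receives zero probability.
    return float(-np.mean(np.log(np.clip(true_class_prob, eps, 1.0))))


# Made-up predictions for four images and three classes (illustration only).
probabilities = np.array([
    [0.9, 0.05, 0.05],
    [0.2, 0.7, 0.1],
    [0.3, 0.3, 0.4],
    [0.6, 0.3, 0.1],
])
labels = np.array([0, 1, 2, 1])

print(accuracy(probabilities, labels))  # 0.75
print(cross_entropy(probabilities, labels))
```

The cross-entropy falls toward zero as the probability assigned to the correct class approaches 1, which is why a smaller value indicates a better learning effect.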
Figure 7. Model classification accuracy curve
It can be seen from Figure 7 and Figure 8 that the accuracy on the training set and the validation set improves rapidly and finally stabilizes at more than 80%, while the cross-entropy decreases significantly and finally stabilizes at a low value. This shows that the training effect of the model is good.
3.4 Model test
Test set images that were not involved in training are used for identification and analysis, and the accuracy on these images verifies the generalization ability of the model, that is, whether the model can achieve good recognition and classification on images it has not seen during training. The test set contains 5 black coal images, 7 gray fine sandstone images, and 7 gray-black mudstone images. The recognition and classification results of rock sample images are given in the form of probabilities. Each image corresponds to three probabilities, and the rock type with the highest probability is taken as the rock type in the image (Hu & Wu, 2021). The black coal test results are shown in Table 1, the gray fine sandstone test results in Table 2, and the gray-black mudstone test results in Table 3.
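The decision rule of taking the class with the highest probability can be sketched as follows. The class order in `CLASS_NAMES` and the example probabilities are assumptions made for illustration; in practice the order must match the label encoding used during training.

```python
# Class order is an assumption; it must match the order used during training.
CLASS_NAMES = ["black coal", "gray fine sandstone", "gray-black mudstone"]


def predict_rock_type(probabilities, class_names=CLASS_NAMES):
    """Return the class with the highest probability and that probability."""
    best_index = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return class_names[best_index], probabilities[best_index]


# Made-up Softmax output for one test image (illustration only).
label, confidence = predict_rock_type([0.03, 0.12, 0.85])
print(label, confidence)  # gray-black mudstone 0.85
```

Returning the winning probability alongside the label makes it possible to report both whether a test image is classified correctly and how confident the model is.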
It can be seen from Table 1 that the model recognizes and classifies the five black coal test images correctly. The recognition and classification probabilities are all above 90%, and most reach above 95%, indicating that the recognition model has effectively extracted the characteristics of black coal and can accurately identify the black coal category.
It can be seen from Table 2 that the model recognizes and classifies the seven gray fine sandstone test images correctly. The recognition and classification probabilities are all above 83%, and most are above 90%, indicating that the model can also accurately identify the gray fine sandstone category.
It can be seen from Table 3 that the model’s recognition and classification probability for gray-black mudstone is not stable. The model nevertheless recognizes and classifies the 7 gray-black mudstone test images correctly, indicating that it can accurately identify the gray-black mudstone category, although the recognition probability for this category needs to be improved. A possible reason is that the gray-black mudstone training set has fewer samples and the sample features are not distinctive enough, which affects the model’s recognition probability for this category.
Since the recognition and classification results of a rock sample image are given in the form of probabilities, the rock type with the highest probability is taken as the rock type in the image. Judged by this criterion, the recognition and classification results of the model for all three types of rocks are correct, but the recognition probability of some images is low. The low probability may be due to the limited training set, which contains few or no rock images similar to those in the test set, so that the characteristics of the rocks in these images are not fully extracted. In general, the model in this article recognizes and classifies the test images of the three types of rocks correctly, which has a certain reference value for future research on and application of rock sample classification.
4 CONCLUSION
In this paper, a deep learning classification model for rock images based on Inception-v3 and transfer learning is established, which realizes the effective recognition of black coal, gray fine sandstone, and gray-black mudstone. The model learns features independently without manual operation, which reduces the influence of subjective factors. Moreover, the training process places low requirements on the size and brightness of the rock images. The model was tested separately using the test set images and made no errors, indicating that the model has strong robustness and generalization ability and can effectively identify the characteristics of the rock in the image. Due to the currently limited data sources, the model does not yet meet the standards of actual engineering application. In the future, more data can be collected to improve the accuracy of the model, so as to meet the needs of actual engineering.