1 Introduction
Ever since the successful isolation of monolayer graphene by Geim et al. [1], 2D materials have emerged as focal points in physics, chemistry, and other disciplines. 2D materials typically exhibit outstanding optical [2-6], electronic [7-11], thermal [12, 13], magnetic [14-16], and mechanical [17, 18] properties, and thus they have significant application prospects in photovoltaics [19-23], light-emitting devices [24-28], electronic devices [29-35], energy storage [36-41], catalysis [42-47], biomedicine [48-54], and sensors [55-61]. At present, research on 2D materials is in a stage of rapid development. A large number of new 2D materials, including noble-transition-metal dichalcogenides (PdSe₂, PtSe₂, PdS₂, PtS₂, etc.) [62-73], emerging single-element materials known as Xenes (tellurene, selenene, etc.) [74-78], and perovskite oxides [79-81], have been synthesized successively, greatly enriching the quantity and variety of 2D materials. According to recent high-throughput computations based on density functional theory (DFT) [82] by Mounet et al. [83], over 5000 layered materials may exist, leaving huge space for the exploration of 2D materials. In addition, stacking different 2D material layers into van der Waals heterostructures [84-88] not only provides new avenues for the development of novel materials but also opens the way to the exploration of a large number of new properties.
Although research on 2D materials has developed rapidly over the past two decades, it has also become apparent that traditional experimental and computational methods struggle to meet the growing needs of the field. High-precision, high-throughput, and high-efficiency experimental and computational techniques are necessary for gaining an in-depth understanding of 2D materials. However, conventional experimental methods face challenges of insufficient measurement precision and cumbersome processes. Similarly, traditional computational methods often involve a huge amount of computation and complex computational tasks, requiring substantial time and computational cost. For instance, first-principles DFT calculations demand highly optimized computational methods and powerful computational resources, and the computational cost increases rapidly with the number of atoms, rendering such calculations very expensive for large-scale material systems. In recent years, the vigorous advancement of deep learning has brought transformative impacts across various fields [89-93]. Deep learning has also been widely used in 2D materials research, providing an effective solution to overcome the limitations of traditional experimental and computational methods. Deep learning [91, 94, 95] is a machine learning method based on multi-layered neural networks, which can automatically, efficiently, and accurately learn the features and patterns of data. In the realm of 2D materials, deep learning has been used to reveal the hidden complex relationships among material formation mechanisms, atomic structures, and material properties, becoming a powerful tool for exploring 2D materials. This review focuses on the applications of deep learning in the domain of 2D materials. First, the basic concepts of deep learning will be introduced, and several deep learning architectures commonly used in the 2D materials domain are listed. Then, we will present the applications of deep learning in 2D materials characterization, encompassing defect and material identification as well as thickness characterization. On this basis, the utilization of deep learning in predicting the physical and chemical properties of 2D materials and designing new 2D materials will be briefly given. Finally, the challenges and opportunities of deep learning in future research on 2D materials are discussed.
2 Brief introduction to deep learning
Since the 1950s, Artificial Intelligence (AI) has been an important research field in computer science, aiming to develop computer systems capable of emulating human intelligent behavior. Machine learning [96], a significant branch of AI, has made substantial advancements in recent decades, leading to revolutionary advances across numerous application domains. Its primary objective is to explore the use of statistical learning techniques and optimization algorithms to train models, enabling them to learn patterns and rules from data for autonomous decision-making and prediction. As shown in Fig.1(a), the neural network [97-99], a subfield of machine learning, is a computational model inspired by biological systems. It aims to replicate the intricate structure and functionality of biological neural networks to achieve pattern recognition, classification, regression, and decision-making on non-linear data. Deep learning emerges from the foundation of neural networks, using multiple layers of neural networks to learn feature representations automatically from large amounts of data. It is mainly characterized by a deeper network structure and more powerful computational capabilities. Whereas traditional machine learning methods require a manual feature extraction process, deep learning automatically uses a hierarchical structure to extract high-level abstract features from raw data. It has more robust data-driven capabilities that significantly improve classification, recognition, and prediction accuracy. Thus, deep learning is widely used in feature extraction and property prediction for 2D materials. In this section, we will briefly introduce deep learning architectures commonly employed in the field of 2D materials research.
2.1 Convolutional neural networks
The convolutional neural network (CNN) [100] was initially proposed by Fukushima in 1988. LeCun et al. [101] pioneered the application of a gradient-based learning algorithm to CNNs, achieving notable success in handwritten digit classification. Since then, CNNs have been widely used in many fields. A CNN primarily comprises two components, a feature extractor and a classifier, as depicted in Fig.1(b). The feature extractor consists of several convolutional and pooling layers, which are mainly used to extract high-level features from the input data, enabling the subsequent classifier to achieve more accurate classification. The classifier consists of fully connected layers, receiving the high-level abstract features extracted by the feature extractor and mapping them to specific output classes. The convolutional layer is a fundamental component of the CNN and comprises multiple convolutional kernels. Convolutional kernels are weight matrices describing the importance of the input content to the output content at a particular location, with the purpose of capturing specific local patterns or features in the input. The parameters of the kernels are learned during training, which allows the network to adapt to different tasks. The operation of sliding a kernel across the entire input to generate a matrix is termed convolution. After all convolutional kernels have been applied, an equal number of matrices, termed feature maps, are produced. These feature maps represent the features extracted by the convolutional layer. Pooling layers typically follow the convolutional layers and reduce the spatial dimensions of the feature maps produced by them, thereby decreasing the number of parameters and computations. Finally, the fully connected layers map the feature maps obtained after convolution and pooling to specific output classes, yielding the final prediction.
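To make this data flow concrete, the following minimal PyTorch sketch wires the two components together; the layer sizes, input resolution, and two-class output are illustrative assumptions, not taken from any particular 2D-materials study.

```python
# Minimal CNN sketch (PyTorch): a feature extractor (conv + pooling)
# followed by a fully connected classifier. All layer sizes are
# illustrative choices, not taken from any specific published model.
import torch
import torch.nn as nn

class SimpleCNN(nn.Module):
    def __init__(self, num_classes: int = 2):
        super().__init__()
        # Feature extractor: each conv layer holds a bank of learnable
        # kernels; pooling halves the spatial resolution.
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                  # 64x64 -> 32x32
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                  # 32x32 -> 16x16
        )
        # Classifier: fully connected layers map the flattened feature
        # maps to class scores.
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * 16 * 16, 64), nn.ReLU(),
            nn.Linear(64, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))

x = torch.randn(8, 1, 64, 64)     # batch of 8 grayscale images
logits = SimpleCNN()(x)           # -> shape (8, 2)
```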
2.2 Generative adversarial networks
The generative adversarial network (GAN) is a deep learning model introduced by Goodfellow et al. in 2014 [102, 103]. As illustrated in Fig.1(c), a GAN consists of a generator G and a discriminator D. The generator is responsible for generating synthetic data samples that approximate the underlying distribution of the real data, while the discriminator is responsible for distinguishing real from generated data. The generator usually takes random noise z as input and outputs a generated sample G(z) by mapping the noise into a new data space. The discriminator, in turn, receives both real samples from the actual dataset and generated samples from the generator as input. Its output is a probability score indicating the probability that the input data is real. The training process of a GAN can be summarized as an adversarial game between the generator and discriminator. Unlike the training of a single network, GAN training employs an alternating iterative scheme. While the generator is being trained, the discriminator remains fixed: the generator constantly produces new samples and adjusts its parameters based on the feedback from the discriminator, aiming to produce samples that closely resemble the real data. While the discriminator is being trained, the generator remains fixed: the discriminator refines its parameters by continuously judging real and generated samples, becoming more accurate at distinguishing between them. Through continuous iterations, the generator learns to produce samples increasingly similar to the real data, while the discriminator becomes more accurate at telling real and generated samples apart. The network reaches an optimal state when the discriminator cannot discern whether the data come from the real dataset or the generator.
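A minimal sketch of this alternating scheme is given below (PyTorch); the toy generator, discriminator, and "real" data distribution are placeholders chosen only to make the loop runnable.

```python
# Sketch of the alternating GAN update described above (PyTorch).
# Architectures and hyperparameters are placeholders, not from any paper.
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2))  # noise -> sample
D = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1))   # sample -> logit

opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

for step in range(1000):
    real = torch.randn(64, 2) * 0.5 + 1.0     # stand-in "real" data
    z = torch.randn(64, 16)                   # random noise input

    # Discriminator step: generator held fixed (detach freezes G).
    fake = G(z).detach()
    loss_d = bce(D(real), torch.ones(64, 1)) + \
             bce(D(fake), torch.zeros(64, 1))
    opt_d.zero_grad()
    loss_d.backward()
    opt_d.step()

    # Generator step: discriminator held fixed; G tries to fool D.
    loss_g = bce(D(G(z)), torch.ones(64, 1))
    opt_g.zero_grad()
    loss_g.backward()
    opt_g.step()
```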
2.3 U-net
The U-net was proposed by Ronneberger et al. [104] in 2015 for image segmentation, and its network architecture is shown in Fig.1(d). Generally, the U-net is composed of an encoder, a decoder, and skip connections. The encoder employs convolutional blocks, each followed by a pooling layer for downsampling, to extract feature representations at different hierarchical levels from the input image. Each encoder layer consists of two successive 3 × 3 convolutional layers, rectified linear unit (ReLU) activations, and a 2 × 2 max pooling layer, ultimately reducing the image to a compact feature map. The purpose of the decoder is to semantically project the lower-resolution features obtained from the encoder into a higher-resolution pixel space, ultimately achieving accurate segmentation. The decoder first upsamples the feature map at each layer by a 2 × 2 up-convolution. Subsequently, skip connections concatenate the feature maps from the corresponding layers of the encoder and the decoder. The combined effect of the encoder and decoder enables the model to retain both the global features acquired along the downsampling path and the local features learned along the upsampling path, thereby enhancing the precision of image segmentation. Following the fusion of features, two consecutive 3 × 3 convolutional layers with ReLU activations are employed. In the last layer of the decoder, an additional 1 × 1 convolution is used to reduce the feature map to the desired number of channels and generate the segmented image. The U-net architecture features a distinct U-shaped pattern, facilitating the propagation of contextual information throughout the network. The architecture efficiently exploits contextual information from large, overlapping regions, resulting in more accurate segmentation. U-net has made significant achievements in medical image segmentation and is one of the classic models in the area of image segmentation.
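The sketch below captures the essential encoder-skip-decoder wiring at a single scale (PyTorch); a real U-net stacks four or five such levels, and all channel counts here are illustrative.

```python
# Minimal U-net-style sketch (PyTorch): one encoder level, one decoder
# level, and a skip connection. Channel counts are illustrative.
import torch
import torch.nn as nn

def double_conv(c_in, c_out):
    # Two successive 3x3 convolutions, each followed by ReLU.
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU(),
        nn.Conv2d(c_out, c_out, 3, padding=1), nn.ReLU(),
    )

class TinyUNet(nn.Module):
    def __init__(self, n_classes: int = 2):
        super().__init__()
        self.enc = double_conv(1, 16)
        self.pool = nn.MaxPool2d(2)                         # 2x2 downsampling
        self.bottom = double_conv(16, 32)
        self.up = nn.ConvTranspose2d(32, 16, 2, stride=2)   # 2x2 up-convolution
        self.dec = double_conv(32, 16)      # 32 = 16 (skip) + 16 (upsampled)
        self.head = nn.Conv2d(16, n_classes, 1)             # final 1x1 convolution

    def forward(self, x):
        e = self.enc(x)
        b = self.bottom(self.pool(e))
        u = self.up(b)
        u = torch.cat([e, u], dim=1)   # skip connection: concat encoder features
        return self.head(self.dec(u))

mask_logits = TinyUNet()(torch.randn(1, 1, 64, 64))  # per-pixel class scores
```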
3 Deep learning in the characterization of 2D materials
The structural information of 2D materials, including defects, thickness, and morphology, is of great significance for understanding their physical and chemical attributes, as well as for broadening related applications. Both experimental and theoretical approaches have been employed to characterize the structures of 2D materials. The experimental techniques include optical microscopy (OM) [105-107], transmission electron microscopy (TEM) [108, 109], scanning transmission electron microscopy (STEM) [110, 111], atomic force microscopy (AFM) [112-114], Raman spectroscopy [115-117], and reflection contrast spectroscopy [118, 119]. Theoretical methods comprise molecular dynamics (MD) simulations and DFT calculations. However, these traditional characterization methods have certain limitations, such as expensive computational resources and biases introduced by manual analysis. Combining deep learning with traditional characterization methods can potentially overcome their inherent limitations and address some bottlenecks. In this section, we will introduce the research progress of deep learning in defect identification, material identification, and thickness characterization of 2D materials.
3.1 Defect identification
Defects in 2D materials, including vacancies, doping, and edge defects, have an essential effect on the properties of the materials [120]. Precisely regulating defects in lattice structures can manipulate the electronic, mechanical, and magnetic properties of 2D materials. Hence, accurately identifying defects at the atomic level in 2D materials is desirable. In recent years, with the continuous advancement of high-resolution imaging and characterization techniques (e.g., AFM and TEM), it has become possible to obtain microscopic information about materials, such as the distribution and types of defects. These characterizations can also uncover the dynamics related to material properties at atomic-level spatial resolution and sub-second temporal resolution. Although the capacity for gathering material data with high spatiotemporal resolution has improved, the properties of 2D materials inferred from these advanced characterization images remain constrained by manual data analysis. As a result, such high-quality data are often used only for qualitative research. Relying solely on manual analysis makes it difficult to promptly and precisely extract all information from the images, so a large amount of data is discarded and wasted. It is evident that the constraints of manual analysis hinder the in-depth application of advanced characterization techniques. Thus, there is a pressing need to automatically and intelligently extract more information concerning the dynamics and interactions of individual defects from high-quality images.
In 2017, Ziatdinov et al. [121] developed a deep neural network-based workflow for atomic-resolution STEM images. As illustrated in Fig.2(a), the network is based on a fully convolutional network (FCN). The network can effectively use the limited prior information about possible defect types to extract the coordinates of all atomic species in the image. These coordinates are then employed to identify various defect structures that were not in the training set. As shown in Fig.2(b), the trained network first outputs a map of the probability of each pixel being an atom. The map is then thresholded at a specific value to produce a binary image. Finally, the Laplacian of Gaussian algorithm is applied to calculate the coordinates of the atomic centers in the image. It is worth noting that the model was trained on simulated image data, yet the trained model can effectively process experimental images. Furthermore, they combined deep learning, the Laplacian of Gaussian algorithm, and simple graph representations to extract relevant structural information, such as bond lengths and angles, and to classify defects based on chemical structures.
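This post-processing chain (probability map, threshold, Laplacian of Gaussian) can be sketched with standard tools, as below; the synthetic probability map merely stands in for the FCN output, and all thresholds are illustrative.

```python
# Post-processing sketch for an FCN atom-probability map, following the
# threshold + Laplacian-of-Gaussian recipe described above. `prob` would
# come from the trained network; here it is synthesized for illustration.
import numpy as np
from skimage.feature import blob_log

rng = np.random.default_rng(0)
prob = rng.random((128, 128)) * 0.1          # low-level background
yy, xx = np.mgrid[0:128, 0:128]
for y, x in [(32, 40), (64, 64), (96, 80)]:  # three fake "atoms"
    prob += np.exp(-((yy - y)**2 + (xx - x)**2) / (2 * 2.0**2))

binary = np.where(prob > 0.5, prob, 0.0)     # threshold the probability map

# Laplacian of Gaussian finds blob centers; rows are (row, col, sigma).
blobs = blob_log(binary, min_sigma=1, max_sigma=4, threshold=0.1)
atom_coords = blobs[:, :2]
print(atom_coords)
```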
Later, Madsen et al. [122] proposed a CNN-based method for recognizing local atomic structures in TEM images. This network, likewise trained on simulated images, demonstrated the capability to identify local atomic structures in TEM images and also provides reliable predictions for experimental images. They conducted validation tests on atomic-resolution TEM images of single-layer defective graphene and metallic nanoparticles. The results demonstrated that the network can accurately identify atoms under diverse microscope conditions and is robust to variations in microscope parameters. Furthermore, this work suggests the potential of neural networks for classifying atomic column heights in TEM images, although, owing to the large amount of high-quality experimental data required for validation, this capability has not yet been validated on experimental images. On the other hand, Maksov et al. [123] developed a deep learning framework based on dynamic STEM imaging, aiming to identify and track the evolving defect structures and the phase transition in Mo-doped layered WS₂. The network is trained using only the first frame of the image sequence obtained from STEM. The dataset preparation process is illustrated in Fig.2(c). A fast Fourier transform is initially applied to the original experimental image. Subsequently, a high-pass filter is used to eliminate nonperiodic components of the lattice. The resulting image then undergoes an inverse fast Fourier transform to obtain the periodic image. After this processing, the original image is subtracted from the periodic image, revealing deviations from the ideal periodic lattice. Finally, the difference image is thresholded to identify the locations of individual defects, which serve as the ground truth (see the sketch following this paragraph). The trained framework can extract thousands of lattice defects from the original STEM image sequence within seconds and can be extended to the remaining frames. The extracted defects were further grouped into five clusters by a Gaussian mixture model, as shown in Fig.2(d). The authors further analyzed the distribution and spatiotemporal trajectories of these defects and categorized the dynamic changes of defects into three types: weakly mobile trajectories, strong diffusion, and unrelated events. The results suggest that two defect types associated with Mo doping [Classes 1 and 3 in Fig.2(d)] can switch reversibly along their movement trajectories. These two defects were further divided into four subclasses: (Mo_W + V_S) complexes and Mo_W defects not coupled to S vacancies (where Mo_W denotes Mo substituting at a W site and V_S an S vacancy). The Markov matrices of the four subclasses indicate a possible coupling between Mo_W defects and S vacancies. Maksov et al. [123] proposed that this coupling may be attributed to two factors: the low diffusion barrier of S vacancies and the higher likelihood of S sublattice atoms being ejected during electron beam irradiation. This suggests the capture of S vacancies near the Mo dopant, resulting in transitions between the Mo_W and (Mo_W + V_S) defect types. Additionally, this work also investigated the diffusion characteristics of S vacancy defects [class 5 in Fig.2(d)], revealing diffusion coefficients in the range of 3 × 10⁻⁴ nm²/s to 6 × 10⁻⁴ nm²/s. It is evident that deep learning not only enables the identification of point defects in 2D materials and the analysis of defect diffusion coefficients but also provides insights into the transformation pathways and probabilities of defect complexes. In 2023, Yang et al. [124] introduced a dual-mode CNN platform named 2DIP-Net to classify defects in monolayer 2H-MoTe₂. In the initial stage, they utilized the Faster R-CNN model to detect hexagonal cells and cropped the detected cells into unit cells. Subsequently, further segregating the unit cells into Te₂/Mo column parts significantly enhanced the accuracy of defect classification. The proposed model achieved an accuracy of over 97.87% for the classification of various defects.
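A numpy sketch of this ground-truth pipeline is given below. Isolating the periodic lattice is done here by keeping only the strongest Fourier amplitudes; the exact filter in the cited work may differ, and all thresholds are illustrative.

```python
# Sketch of the ground-truth generation described above: FFT -> keep
# the strong (periodic) components -> inverse FFT -> subtract from the
# original -> threshold the difference. `img` stands in for a raw STEM
# frame; `keep_frac` and `thresh` are illustrative parameters.
import numpy as np

def defect_mask(img: np.ndarray, keep_frac: float = 0.001,
                thresh: float = 3.0) -> np.ndarray:
    f = np.fft.fft2(img)
    # Keep only the strongest Fourier amplitudes (the periodic lattice);
    # zero everything else.
    cutoff = np.quantile(np.abs(f), 1.0 - keep_frac)
    f_periodic = np.where(np.abs(f) >= cutoff, f, 0.0)
    periodic = np.fft.ifft2(f_periodic).real
    diff = img - periodic                    # deviations from ideal lattice
    # Threshold the difference image to mark candidate defect pixels.
    return np.abs(diff) > thresh * diff.std()

img = np.random.default_rng(1).normal(size=(256, 256))
mask = defect_mask(img)
print(mask.sum(), "candidate defect pixels")
```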
In the quantitative study of doping and defects in 2D transition metal dichalcogenides (TMDs), STEM plays a pivotal role. Specifically, the annular dark-field (ADF) imaging mode of STEM can offer more detailed information, enabling quantitative analysis of minute structures. To achieve more efficient quantitative analysis of ADF images, Yang et al. [125] proposed an automated method based on the U-net. This method enables reliable quantitative analysis of dopants and defects in TMDs with single-atom accuracy. The approach exhibits a measurement accuracy of up to 98% and a detection limit of 1 × 10¹² cm⁻². Regarding efficiency, the proposed model requires only 3 seconds for the quantitative analysis of a 1024 × 1024 pixel STEM image, while accomplishing the same task takes a skilled researcher approximately 1 hour; the model is thus about 1200 times more efficient than current human analysis. Additionally, the automated method shows excellent performance in reducing noise in STEM images and efficiently processing large numbers of images. Using this method, they further investigated the dynamic evolution of TMD structures under electron beam irradiation. This work also reveals the spatial distribution, temporal variations, and electron beam irradiation tolerance of point defects and dopants in WSe₂, MoS₂, V-doped WSe₂, and V-doped MoS₂ under electron beam irradiation. Fig.2(e) presents the defect evolution of V-doped monolayer WSe₂, revealing the dynamic response of vacancies to the electron beam stimulus. It is worth noting that, in order to reduce the impact of high statistical noise on quantification accuracy in high-speed imaging, a CNN was employed to denoise the images beforehand. This network reduces the statistical noise and enhances the atomic contrast; compared to the raw images, the signal-to-noise ratio is improved by ~14.6 times. The denoising step thus significantly enhances the accuracy of the subsequent quantitative analysis for classifying and labeling atomic sites.
While the spatial resolution of STEM has reached the atomic level, the precision of individual atomic structure analysis is limited to the picometer scale. The local strains induced by point-defect substitution and the associated long-range strain fields operate at the sub-picometer level, which lies below the detection limit of STEM. Thus, conducting sub-picometer defect detection through high-resolution experimental characterization remains challenging. In 2020, Lee et al. [126] developed a deep learning model based on the FCN architecture to process STEM images for the localization and classification of single-atom defects in WSe₂₋₂ₓTe₂ₓ. The atomic sites were classified into five categories: a chalcogen column may hold two Se atoms (no defect), one or two Te substitutions, or one or two Se vacancies. Based on the different classes of defective images output by the model, they generated high signal-to-noise ratio class-averaged images of each defect type. Experimental results showed that class-averaged images allow sub-picometer-precision measurements of atomic spacings and local strains, achieving a precision of up to 0.2 picometers, which cannot be attained with individual images alone.
These advancements in deep learning-based electron microscopy have facilitated automated analysis. However, models trained on simulated images exhibit poor performance when applied to low-quality experimental images, which may involve background noise, aberrations, contamination, poor contrast, and so on. To narrow the gap between simulation-based training and experimental testing, Chu et al. [127] thoroughly considered various interference factors, such as different noise levels, carbon contamination, and high-order aberrations, when constructing simulated STEM images. They trained two U²-net models using low-quality simulated and experimental Ti₃C₂Tₓ STEM images, effectively reducing the model's reliance on input image resolution. The trained models exhibited excellent performance in identifying experimental STEM images, achieving an overall accuracy of 96.8% in identifying vacancy defects and 99.4% in identifying single atoms. In addition to high recognition accuracy, the model also exhibits impressive identification efficiency, processing approximately 45 images per second. To summarize, deep learning models can perform efficient and precise automatic analysis of extensive datasets. Compared to traditional manual analysis, deep learning methods can reduce evaluation biases, information loss in feature extraction, and confidence bias in labeling. These methods significantly enhance the efficiency and accuracy of data analysis, thereby driving scientific research and applications in the field of 2D materials.
3.2 Material identification and thickness characterization
2D materials are typically prepared by methods such as chemical vapor deposition (CVD) or mechanical exfoliation. During preparation, 2D sheets of varying thickness are randomly deposited onto a substrate. However, 2D materials with different atomic layer thicknesses exhibit significant differences in optical, physical, chemical, thermal, and electrical properties. Therefore, accurately and efficiently identifying 2D materials and characterizing their thickness is crucial for both scientific research and industrial applications. Various spectroscopic and microscopic techniques are commonly employed to characterize the atomic layer thickness of 2D materials, including Raman spectroscopy, photoluminescence spectroscopy, reflectance/transmittance spectroscopy, scanning tunneling microscopy, and OM. However, these characterization techniques have inherent limitations, and the lack of high-performance, large-scale characterization methods has consistently posed a primary obstacle to the application of 2D materials. Based on the optical contrast between atomic layers and the substrate, OM provides a simple and cost-effective method for measuring the thickness of 2D materials. However, this technique requires manual operation and processing, is sensitive to substrate and illumination conditions, and relies on calibrated illumination. The application of deep learning in image and visual recognition has become highly mature. Applying deep learning to OM enables the automatic extraction of detailed information from microscopy images, facilitating large-scale characterization of 2D materials. Furthermore, it is cost-effective and highly adaptable, requires no expensive experimental equipment, and demonstrates exceptional scalability.
In 2019, Wu et al. [128] improved the SegNet network for recognizing 2D materials, achieving an average segmentation accuracy of 97.17%, a significant improvement over the original SegNet network's average segmentation accuracy of 92.04%. Yu et al. [129] reported a neural network based on the U-net to identify the thickness of mechanically exfoliated MoS₂ and graphene on SiO₂/Si substrates; the processing pipeline is depicted in Fig.3(a). Under bright-field microscopy at 100× magnification, a set of 24 images of MoS₂ and 30 images of graphene was acquired to serve as the training set. The MoS₂ training set was expanded to 960 images by augmentation, including random cropping, flipping, rotating, and color alteration of the original images (see the augmentation sketch below). Although the model was trained on a limited amount of real data, the U-net with a weighted loss achieved a cross-validation score and accuracy of around 70%-80%. The results demonstrated that the proposed model could successfully distinguish between monolayers and bilayers, making it suitable for initial screening. In 2020, Han et al. [130] introduced an optical identification neural network (2DMOINet) for 2D materials based on an encoder-decoder architecture, as illustrated in Fig.3(c). They applied this network to 13 typical 2D materials, including mechanically exfoliated graphene, h-BN, 2H-phase semiconducting TMDs, 2H-phase metallic TMDs, 1T-phase TMDs, black phosphorus, metal trihalides, and quasi-one-dimensional crystals, as shown in Fig.3(b). The network can identify materials in OM images in real time and is robust to changes in brightness, contrast, white balance, and light-field inhomogeneity. Compared to traditional identification methods constrained by specific crystal types and imaging conditions, the deep-learning-based 2DMOINet displays greater generality. Furthermore, the authors classified material thickness into monolayer, 2-6 layers, and more than 6 layers, as depicted in Fig.3(d). The model achieved a classification accuracy of mostly over 70% for materials and thicknesses on the test dataset, with an average classification accuracy of 79.78%. The authors also compared two training methods for new identification tasks: using a pre-trained 2DMOINet as the initialization, and starting from a random initialization. With the pre-training approach, only 60 images of CVD-grown graphene were required to attain a global accuracy of 67%, whereas random initialization required 240 images to reach a comparable level. This result demonstrated that, through transfer learning, the pre-trained 2DMOINet can adapt to new optical identification problems with minimal additional training.
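An augmentation pipeline of the kind used to expand such small training sets can be written in a few lines with torchvision; the specific transform parameters below are illustrative assumptions, not those of the cited work.

```python
# Augmentation sketch mirroring the expansion described above (random
# cropping, flipping, rotation, color alteration). Parameters are
# illustrative, not taken from the cited study.
from torchvision import transforms

augment = transforms.Compose([
    transforms.RandomResizedCrop(224, scale=(0.6, 1.0)),
    transforms.RandomHorizontalFlip(),
    transforms.RandomVerticalFlip(),
    transforms.RandomRotation(degrees=90),
    transforms.ColorJitter(brightness=0.2, contrast=0.2,
                           saturation=0.2, hue=0.05),
    transforms.ToTensor(),
])

# Applying `augment` to each PIL microscope image N times yields an
# N-fold larger training set (e.g., 24 images -> 960 with N = 40).
```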
Meanwhile, Masubuchi et al. [131] reported a neural network using the instance segmentation model Mask R-CNN to identify the thickness of four 2D materials: graphene, h-BN, WTe₂, and MoS₂. They implemented the network on an automated optical microscope to search for 2D materials on SiO₂/Si substrates and identify their thickness in real time. They established three classes: "monolayer" (1 layer), "few layers" (2-10 layers), and "thick layers" (10-40 layers). The output of the network comprises detection bounding boxes, class labels, confidence scores, and segmentation masks, as shown in Fig.3(i) and (j). During training, the weights of the network heads are initialized with pre-trained weights obtained on a large-scale object segmentation dataset (the MS-COCO dataset [132]), while the remaining network weights are initialized randomly. To improve the generalization ability of the model, it is first pre-trained on a mixed dataset containing the four types of 2D materials. The weights obtained from this pre-training are then used as the starting point for transfer learning on each material subset to accomplish thickness classification (see the fine-tuning sketch below). The experimental results further showed that, compared to models pre-trained exclusively on MS-COCO, the model pre-trained on the mixed 2D-material training set exhibited a swifter reduction in test loss and attained a lower minimum loss. In 2023, Mahjoubi et al. [133] proposed a deep learning method based on hierarchical deep convolutional neural networks to automatically identify and classify mechanically exfoliated graphene flakes on Si/SiO₂ substrates. They employed AFM and Raman spectroscopy to characterize the thickness of the captured optical images and generated pixel-wise thickness maps as the ground truth. The experimental results indicate the model's robustness to the background color, brightness, and resolution of microscopic images, which is attributed to the optimized adaptive gamma correction method used to enhance image quality before training. The model achieves a pixel classification accuracy of over 99%, with a minimum IoU value exceeding 56% and an MIoU of 59%.
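The two-stage initialization described above follows the common fine-tuning pattern sketched below, here using torchvision's off-the-shelf Mask R-CNN rather than the authors' code; the class list mirrors the three thickness categories plus background.

```python
# Transfer-learning sketch: start Mask R-CNN from MS-COCO pre-trained
# weights, then re-initialize the heads for the flake classes. This is
# the standard torchvision fine-tuning pattern, not the authors' code.
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
from torchvision.models.detection.mask_rcnn import MaskRCNNPredictor

num_classes = 4  # background + monolayer + few layers + thick layers
model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")

# Replace the box and mask heads; their weights start random, while the
# backbone keeps its pre-trained (transferable) features.
in_feat = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_feat, num_classes)
in_feat_mask = model.roi_heads.mask_predictor.conv5_mask.in_channels
model.roi_heads.mask_predictor = MaskRCNNPredictor(in_feat_mask, 256, num_classes)
# Fine-tuning then proceeds per material subset, as described in the text.
```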
In 2022, Zhang and colleagues [134] analyzed 16 semantic segmentation models that perform well on public datasets and applied them to layer identification and segmentation for graphene and MoS₂. The assessment included the models' complexity, size, classification accuracy, and segmentation performance, evaluated with a range of metrics: giga floating-point operations (GFLOPs), parameter count (Params), accuracy, Kappa coefficient, Dice coefficient, and mean intersection over union (MIoU). Based on the open-source dataset provided by Saito et al. [129], they divided the labeled images randomly into training and testing sets in a ratio of 8:2. Inspired by PSPNet [135], they improved U²-net† [136] by adding a pyramid pooling module to the encoder output of the first nested residual U-block, a modification aimed at fusing multi-scale and contextual information. The improved model is called 2DU²-net†. Comparing the input images with the labeled images output by the models shows that 2DU²-net† gives the best segmentation results in distinguishing the detailed, dispersed, and edge regions of the 2D material, exhibiting finer contour lines than the other models. 2DU²-net† achieves an accuracy of 99.03%, a Kappa coefficient of 95.72%, a Dice coefficient of 96.97%, and an MIoU of 94.18%; compared to U²-net†, most metrics exhibit a significant improvement. Although U²-net† exhibits some defects and errors in edge segmentation, it has the fewest parameters, and in terms of computation, compatibility, and training deployment it presents more advantages than the other models. Furthermore, the experimental results indicated that models with backbone networks tend to have larger GFLOPs and parameter counts. In contrast, lightweight networks based on encoder-decoder structures, such as 2DU²-net† and U²-net†, demonstrate superior performance and achieve a lightweight computational level in terms of computation, parameters, and inference speed.
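For reference, the overlap metrics quoted in this subsection can be computed directly from label maps; a minimal numpy version follows (the cited work's exact per-class averaging conventions for Kappa and Dice may differ).

```python
# Reference implementations of per-class Dice and mean IoU (MIoU),
# assuming integer label maps `pred` and `gt` of identical shape.
import numpy as np

def dice(pred, gt, cls):
    p, g = pred == cls, gt == cls
    denom = p.sum() + g.sum()
    return 2.0 * np.logical_and(p, g).sum() / denom if denom else 1.0

def miou(pred, gt, n_classes):
    ious = []
    for c in range(n_classes):
        p, g = pred == c, gt == c
        union = np.logical_or(p, g).sum()
        if union:  # skip classes absent from both maps
            ious.append(np.logical_and(p, g).sum() / union)
    return float(np.mean(ious))

pred = np.random.default_rng(0).integers(0, 3, size=(64, 64))
gt = np.random.default_rng(1).integers(0, 3, size=(64, 64))
print(dice(pred, gt, 1), miou(pred, gt, 3))
```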
To address the image degradation caused by defocus, Dong et al. [137] improved the loss function of the GAN and proposed a microscopy image deblurring model. This model can restore the structure and color information of out-of-focus, low-quality microscopic images of CVD-grown MoS₂. Furthermore, they utilized an optimized U-net for segmentation and layer identification on the restored images. As depicted in Fig.3(e), the experimental results indicate significantly improved segmentation accuracy on the restored images compared to the out-of-focus originals. In the same year, Zhu et al. [138] reported a pixel-based supervised artificial neural network (ANN) model, which uses six primary color channels from OM images as input, namely red, green, and blue (RGB) plus hue, saturation, and value (HSV), to distinguish and characterize various 2D materials. The model identified 8 types of monolayer and bilayer 2D materials across different imaging conditions with more than 90% accuracy, compared to an average accuracy of 49% for identification by trained researchers. In addition, as shown in Fig.3(f)-(h), the model can identify the chemical compositions and interface distributions of CVD-grown MoS₂/WS₂ van der Waals heterostructures. Compared with Raman-spectroscopy mapping, which takes several hours, this method dramatically improves characterization efficiency. Combined with a ternary regression model, the model was also able to identify sulfur vacancy defect concentrations in CVD-grown MoS₂. Compared to traditional optical microscopy based on RGB color information, multispectral imaging microscopy can acquire much richer spectral information. In 2023, Dong et al. [139] developed a multispectral microscopy framework to characterize the thickness of ultra-thin atomic crystals. Similar to the previous study [137], the framework includes image acquisition, restoration of out-of-focus images, and segmentation of the restored images. The Dice coefficients of the well-trained model surpass 80% for all categories.
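The six-channel pixel representation at the heart of the RGB+HSV approach is straightforward to construct; in the sketch below, the image, labels, and the small scikit-learn classifier are all placeholders for the published model.

```python
# Pixel-feature sketch for the RGB+HSV approach described above: each
# pixel becomes a six-component vector feeding a per-pixel classifier.
import numpy as np
from skimage.color import rgb2hsv

rgb = np.random.default_rng(0).random((100, 100, 3))  # stand-in OM image
hsv = rgb2hsv(rgb)
features = np.concatenate([rgb, hsv], axis=-1).reshape(-1, 6)  # (N_pixels, 6)

# A small supervised network (here scikit-learn's MLPClassifier as a
# placeholder) maps the 6-vectors to material/thickness labels.
from sklearn.neural_network import MLPClassifier
labels = (rgb[..., 0] > 0.5).astype(int).reshape(-1)  # toy pixel labels
clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=200).fit(features, labels)
```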
4 Deep learning in the prediction of 2D materials
In contrast to three-dimensional bulk materials, 2D materials are composed of only one or a few layers of atoms. Their physical and chemical properties can be precisely tuned by compositional adjustment, defect engineering, surface doping, phase transitions, thickness modulation, and chemical modification. 2D materials therefore offer a vast space for exploration. Employing deep learning methods to predict the properties of 2D materials can significantly accelerate research progress and reduce research costs. Below, we introduce the research progress in combining deep learning with computational simulation methods to predict the electronic, mechanical, and thermal properties of 2D materials.
4.1 Electronic properties
The applications of 2D materials are closely tied to their electronic properties, with the band gap playing a central role. For instance, graphene has excellent electrical conductivity but a zero bandgap, leading to a poor switching function for circuit control. This shortcoming hinders the application of graphene in digital circuits, semiconductor devices, and optoelectronic devices. In contrast, MoS₂, with a direct bandgap of about 1.89 eV, is an ideal 2D semiconductor. With a bandgap of about 5.9 eV, h-BN is a typical wide-bandgap material and is thus widely used as an insulating layer in semiconductor devices. Therefore, fast and accurate prediction of the bandgap of 2D materials holds great importance for their application across various fields. Combining deep learning with DFT computational methods can realize low-cost and high-precision bandgap prediction.
In 2019, Nemnes et al. [140] combined DFT calculations and ANNs to predict the bandgap of hybrid h-BN graphene. They generated 900 non-equivalent h-BN graphene structures, each containing 200 C, N, B, and H atoms with a fixed 34 hydrogen atoms per system, as illustrated in Fig.4(a). The first ANN model had 166 input neurons, corresponding to the 166 non-hydrogen (C, N, or B) sites in each structure. The second model instead considered the chemical neighborhoods of specific atom types, reducing the number of input neurons from 200 to 20: four neurons represent the proportions of C, B, N, and H atoms, while the remaining 16 represent the normalized counts of quadruplet atom combinations (Xᵢ, Y₁, Y₂, Y₃), where Xᵢ = C, B, or N and Y₁, Y₂, Y₃ are the three atoms closest to Xᵢ (see the descriptor sketch below). This design makes the input independent of the structure size and thus allows structures of different dimensions to be processed. Both trained ANN models performed well in predicting bandgaps: the determination coefficients for the two models were 99.7% and 97.5% on the training set and 95% and 88.8% on the test set, respectively, as depicted in Fig.4(b) and (c). In 2020, Dong et al. [141] proposed three neural networks for predicting the bandgaps of hybrid h-BN graphene with randomly configured supercells. Different configurations generated by geometric methods, together with the corresponding bandgaps calculated by DFT, were used as the training dataset. Initially, a VGG16 convolutional network was constructed, incorporating 12 convolutional layers, a global-flatten layer, three fully connected layers, and an output layer. However, this network saturated rapidly and its accuracy subsequently degraded. They therefore introduced residual convolutional networks (RCN) and concatenate convolutional networks (CCN); the structure of RCN is similar to ResNet50, and concatenation blocks are likewise introduced into CCN. The DFT results serve as ground truth for evaluating the bandgaps of the 4 × 4 supercell system predicted by the different models. The experiments demonstrate that the three proposed models can predict the bandgap with a relative error of less than 10% in over 90% of cases. In contrast, the predictions of the support vector machine (SVM) [142] deviated significantly from the true values, with errors exceeding 20% in over half of the cases, as shown in Fig.4(d). Meanwhile, the values predicted by the three networks show a strong linear correlation with the true values, whereas the correlation for the SVM is very weak, as shown in Fig.4(e).
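A size-independent descriptor of this kind can be sketched as follows; here every (center, sorted-neighbor-triple) class is enumerated rather than the 16 specific combinations used in the cited work, so the vector length differs, but the construction logic is the same.

```python
# Sketch of a size-independent input descriptor in the spirit of the
# second ANN above: composition fractions plus normalized counts of
# (center atom, three nearest neighbors) quadruplets. All possible
# quadruplet classes are enumerated here, unlike the 16 classes of the
# cited work, so the vector length differs.
from collections import Counter
from itertools import combinations_with_replacement
import numpy as np

SPECIES = ("B", "C", "H", "N")          # alphabetical, for canonical sorting
CENTERS = ("B", "C", "N")               # H never acts as a quadruplet center
TRIPLES = list(combinations_with_replacement(SPECIES, 3))
CLASSES = [(c, t) for c in CENTERS for t in TRIPLES]   # fixed ordering

def descriptor(species, neighbor_ids):
    """species: atom symbols; neighbor_ids[i]: the three nearest
    neighbors of atom i (e.g., from a distance cutoff)."""
    n = len(species)
    comp = [species.count(s) / n for s in ("C", "B", "N", "H")]
    quads = Counter(
        (species[i], tuple(sorted(species[j] for j in neighbor_ids[i])))
        for i in range(n) if species[i] in CENTERS
    )
    total = sum(quads.values()) or 1
    return np.array(comp + [quads.get(cls, 0) / total for cls in CLASSES])

# Toy 4-atom cluster: each atom's "three neighbors" are the other three.
spec = ["C", "B", "N", "H"]
nbrs = [[1, 2, 3], [0, 2, 3], [0, 1, 3], [0, 1, 2]]
print(descriptor(spec, nbrs).shape)     # (4 + 60,) in this enumeration
```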
In 2022, Ma et al. [143] employed a CNN to predict the formation energy of defective graphene. To precisely describe the structures and distributions of various defects, they proposed a descriptor in the form of a three-dimensional matrix constructed by voxelization, taking the chemical bonds between atoms as the description unit. The chemical construction parameter matrix comprises a bond position matrix, a bond length matrix, and a bond angle matrix. The dataset covers a range of defect types, including single vacancies (SV), Stone-Wales defects (SW), and double vacancies (DV). During dataset preparation, as shown in Fig.4(f), the distance between defects in distinct cells must be kept greater than 15 Å to avoid interactions between defects affecting the formation energy. Diverse defect types are then randomly selected and spliced together to generate different structures, whose formation energy can be approximated as the sum of the formation energies of the individual defects, each calculated by DFT as the ground truth. The dataset is augmented by translating and rotating the structure description matrix along a lattice axis. The trained model performs well in predicting the formation energy of defective graphene, with a coefficient of determination of 0.998 and a mean absolute error of 0.51 eV. Furthermore, the sensitivity of the prediction accuracy to defect type and interaction distance was also investigated. Among the defect types, the double vacancy shows the largest prediction error, differing from the DFT value by 0.12 eV. As shown in Fig.4(g), across diverse defect distances, the model accurately predicts the total energy of different defect combinations, with a prediction error of less than 0.3 eV. Finally, they investigated the generalization of the model to unknown defects and extended the proposed three-layer descriptor to MoS₂. The mean absolute error of the model's predictions is 53 meV per 1000 atoms, close to the performance on graphene, demonstrating that the proposed multi-layer descriptor and CNN model generalize well.
4.2 Mechanical properties
Unlike three-dimensional bulk materials, 2D materials demonstrate exceptional flexibility due to their atomic-level thickness. They find extensive applications in flexible electronic devices, where properties such as Young's modulus and fracture behavior are of great importance, directly influencing the breadth of applications and the durability of 2D materials. In addition, 2D materials can withstand large mechanical strains, and strain engineering has been shown, both theoretically and experimentally, to modify the band structure as well as the electronic and photoelectric properties of 2D materials. Consequently, it is crucial to evaluate the mechanical properties of 2D materials. Experimentally, in situ microscopies such as AFM, scanning electron microscopy, and TEM are commonly employed to characterize the mechanical properties of 2D materials, with the AFM nanoindentation test being the most widely used method. During characterization, the AFM tip is pressed onto the 2D material surface, and a force-displacement curve is obtained by recording the displacement of the tip and the applied force. From the force-displacement curve, mechanical properties such as Young's modulus, tensile strength, fracture strength, fracture strain, and friction coefficient can be determined. However, the experimental process is cumbersome and repetitive. In addition, currently available nanoindentation devices struggle to maintain minimal penetration depths and may introduce unavoidable errors, and thus cannot reveal the intrinsic mechanical properties. Besides experimental methods, mechanical properties can also be characterized by indentation and tensile calculations using DFT and MD simulations, but the main obstacles are expensive computational resources and high time consumption. Employing deep learning to assist in predicting the mechanical properties of 2D materials can not only overcome the limitations of both direct experimental measurement and computational simulation, but also effectively, economically, and accurately predict the mechanical properties of a large number of 2D materials.
In 2020, Dewapriya et al. [144] employed shallow and deep neural networks to predict the fracture stress of defective graphene under various conditions. The training data for the shallow network were partly obtained from analytical solutions of the Bailey durability criterion and the Arrhenius equation, whereas the dataset for training the deep CNN was obtained from MD simulations. According to the experimental results, the shallow network performs well when the number of training samples is limited and can accurately predict the fracture stress of graphene with randomly distributed single vacancies. Deep networks require larger training sets but can effectively solve more complex problems, such as the effect of defect distribution on the fracture stress of graphene. On the other hand, understanding crack propagation behavior is of great importance in science and industry, being essential for extending the lifespan of industrial products. For this purpose, Hsu et al. [145] introduced a model based on convolutional long short-term memory (ConvLSTM) networks to predict crack propagation paths in crystalline Lennard-Jones materials. Building on this work, Lew et al. [146] applied the ConvLSTM model to predict the fracture behavior of graphene. They investigated the parameter calibration needed to obtain meaningful fracture predictions consistent with MD simulations. When generating the dataset with MD simulations, the influence of graphene orientation on fracture behavior was taken into account: the loading direction was varied from the armchair direction to the zigzag direction in increments of 10°. Each 160 × 120 pixel fracture-path image obtained from MD simulations was segmented into sequential input and output matrix pairs, yielding 5544 matrix pairs as the training dataset. After the convolutional layers learn the geometric features, the long short-term memory layer focuses on the sequential relationships along the crack propagation, and a final dense layer performs the classification, as shown in Fig.5(a). The effects of the input and output widths on the prediction results were analyzed. The experiments indicated that when the pixel width of the input matrix is increased to 32, the model sees a sufficient amount of fracture history, and the predictions agree best with the MD simulations. Based on this analysis and the length of the graphene armchair unit cell, input and output matrices with widths of 32 and 2 pixels, respectively, were employed to calibrate the proposed model. The calibrated model demonstrates a prediction accuracy of up to 95% for the graphene crack path; the comparison between the predictions and the MD results is illustrated in Fig.5(b). To prove the generalization ability of the model, the calibrated model was also applied to predict the crack paths of bicrystal graphene and arbitrarily oriented graphene, and the predictions closely matched the MD results. In addition, the effect of point defects on the fracture path of graphene was studied; the comparison showed a significant deviation in the crack path once the defect size increases to 3.2 nm, consistent with the defect tolerance threshold of nanocrystalline graphene in Ref. [147].
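A minimal version of such a next-strip predictor can be assembled with an off-the-shelf ConvLSTM layer; the tf.keras stand-in below uses the 120 × 32 input and 120 × 2 output strip sizes from the text, with everything else (filters, optimizer, toy data) being illustrative assumptions.

```python
# Minimal next-strip crack predictor with an off-the-shelf ConvLSTM
# layer (tf.keras). Strip sizes follow the text; the rest is illustrative.
import numpy as np
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.ConvLSTM2D(16, kernel_size=(3, 3), padding="same",
                               return_sequences=False,
                               input_shape=(None, 120, 32, 1)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(120 * 2, activation="sigmoid"),
    tf.keras.layers.Reshape((120, 2)),      # next 2-pixel-wide crack strip
])
model.compile(optimizer="adam", loss="binary_crossentropy")

# Toy batch: 4 sequences of 5 input strips -> the following strip.
x = np.random.rand(4, 5, 120, 32, 1).astype("float32")
y = (np.random.rand(4, 120, 2) > 0.5).astype("float32")
model.fit(x, y, epochs=1, verbose=0)
```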
In 2022, Yu et al. [148] improved the ConvLSTM model by introducing 2D convolutional long short-term memory layers to extract more spatial features, and employed the improved model to predict the fracture paths of graphene containing various defects. This work demonstrated a prediction accuracy of 99% for graphene with diverse orientations and 98% for graphene with different defects, showcasing outstanding generalizability and transferability. In the same year, Elapolu et al. [149] introduced a deep learning model based on a CNN and bidirectional recurrent neural networks (Bi-RNNs) for predicting crack propagation paths in polycrystalline graphene under tensile loading. The CNN automatically extracts crucial features such as grain orientation and grain boundary positions, while the Bi-RNNs propagate sequential information about crack positions and microstructure details to forecast the crack propagation path. They employed algorithms proposed in previous studies [150-152] to generate 20 nm × 40 nm polycrystalline graphene sheets with effective grain sizes spanning 3-9 nm and grain orientation angles ranging from 0° to 60°, with grain boundaries composed of pentagon-heptagon defects. Subsequently, MD simulations were conducted to stretch 700 distinct sheets along the y-direction to prepare a dataset of polycrystalline graphene images, as shown in Fig.5(c). Fig.5(d) presents the comparison between the model output and the MD simulation results. The fully grown crack length is used to evaluate the quality of the model predictions; the crack lengths predicted by the model were basically consistent with the MD simulations. As the grain size increased, the difference in crack length between the MD simulations and the model decreased, as illustrated in Fig.5(e). This improvement can be attributed to the sparser distribution of grain boundaries within the domain, resulting in less kinking along the crack propagation path. In addition, for grain orientations close to 30° there are two zigzag directions for crack growth, so the growth path is not unique and the prediction accuracy of the model is relatively lower.
Later, Shishir et al. [153] proposed a deep CNN model to extract the average grain size of polycrystalline graphene sheets and predict their Young's modulus and fracture stress. The centroidal Voronoi tessellation method was used to generate initial structures of polycrystalline graphene sheets close to realistic materials, giving 50 polycrystalline graphene sheets with varying average grain sizes. To account for randomness in the atomic structure and grain boundary orientation, each grain size was realized with 10 differently oriented atomic structures, resulting in 500 polycrystalline graphene sheets; after data augmentation, the dataset contains a total of 2000 input images. Uniaxial tensile MD simulations were conducted on the polycrystalline graphene sheets to obtain stress-strain curves, from which the average fracture stress and Young's modulus were calculated. Fig.5(f) shows that the trained model accurately extracts the average grain size from the polycrystalline graphene images, with a coefficient of determination exceeding 0.98 on both the training and testing sets. The root mean square errors of Young's modulus and fracture stress on the testing set are 21.573 GPa and 2.8101 GPa, respectively, while the standard deviations of the two quantities range from 13.319 to 36.064 GPa and from 1.779 to 6.536 GPa, respectively, as depicted in Fig.5(g). The model's error thus falls within the standard deviation range of the MD simulations. These results prove that the model can accurately predict the Young's modulus and fracture stress of polycrystalline graphene, avoiding the time and economic costs required by MD simulations. In 2023, Shen et al. [154] utilized a deep convolutional neural network to predict the Young's modulus and tensile strength of defective h-BN containing coexisting defect types. The trained model demonstrates an R² value of 0.986 for predicting Young's modulus and 0.894 for predicting tensile strength. The proposed model demonstrates high predictive accuracy, making it promising for assisting in the defect engineering design of h-BN. The model performs better for Young's modulus, which may be attributed to defects coupling to tensile strength through different mechanisms than to Young's modulus; adjusting the network structure may further improve the performance in predicting tensile strength.
4.3 Thermodynamic properties
The thermodynamic properties of 2D materials are fundamental intrinsic characteristics for their applications. Thermal conductivity is a critical parameter in energy engineering and thermal management. In electronic and energy storage systems, materials with high thermal conductivity are required to enhance heat dissipation and curb overheating problems, aiming to reduce the demand for complex and expensive thermal management systems. Conversely, in thermal insulation and thermoelectric materials, materials with low thermal conductivity are more suitable for reducing energy loss and improving the efficiency of thermoelectric conversion. Hence, a precise evaluation of the thermal conductivity of 2D materials is a fundamental issue for matching target applications.
In 2018, Yang et al. [155] trained four supervised machine learning models (linear regression, polynomial regression, decision tree, and random forest) as well as four ANN models to predict thermal properties. Two of the ANN models have one hidden layer containing 10 (ANN-10) or 20 neurons (ANN-20), and the remaining two have two hidden layers with 10 (DNN-10-10) or 20 neurons (DNN-20-20) per layer. The trained models predict the interfacial thermal resistance (R) between graphene and h-BN, given the system temperature, coupling strength, and tensile strain as input parameters. The prediction results of each model are shown in Fig.6(a)-(h). The results indicate that the ANNs outperform most machine learning models, and all ANNs perform significantly better than the linear and polynomial regression models, while the prediction accuracy of ANN-10 and ANN-20 is comparable to that of the decision tree and random forest models. Among all the models, the two-hidden-layer deep neural networks demonstrate the best performance, with mean square errors of 0.055 × 10⁻⁷ and 0.045 × 10⁻⁷ K∙m²∙W⁻¹ for 10 and 20 neurons per layer, respectively, whereas the minimum root mean square error among the machine learning models was 0.059 × 10⁻⁷ K∙m²∙W⁻¹.
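The regression setup is simple enough to state in full; the PyTorch sketch below mirrors the two-hidden-layer DNN-20-20 configuration, with synthetic stand-in data since the MD training set is not reproduced here.

```python
# PyTorch sketch of the DNN-20-20 setup above: (temperature, coupling
# strength, tensile strain) -> interfacial thermal resistance R.
# The synthetic data stand in for the MD training set.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(3, 20), nn.Tanh(),
    nn.Linear(20, 20), nn.Tanh(),
    nn.Linear(20, 1),                       # predicted R
)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

x = torch.rand(256, 3)                      # normalized (T, coupling, strain)
y = 0.5 + 0.3 * x[:, :1] - 0.2 * x[:, 1:2]  # synthetic target R
for _ in range(500):
    opt.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    opt.step()
print(f"final training MSE: {loss.item():.4f}")
```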
The high thermal conductivity of graphene limits its application in certain semiconductor devices. Previous works have shown that the thermal conductivity of porous graphene is much lower than that of pristine graphene, suggesting that the spatial distribution and density of holes play a crucial role in reducing thermal conductivity. However, as the number of pores increases, the design complexity grows dramatically, making it difficult to determine the optimal hole distribution that maximizes or minimizes the thermal conductivity. In 2020, Wan et al. [156] predicted the thermal conductivity of porous graphene using a CNN and subsequently applied this approach to efficiently search, in an inverse-design fashion, for the porous graphene structure with the lowest thermal conductivity. The investigation employed a porous graphene structure with dimensions of 160 Å × 45 Å, as depicted in Fig.6(i). The central region of the structure is the porous area, with holes of about 8.8 Å in size; the clusters of atoms shown in blue represent candidate sites for hole formation. The dataset consists of 54 × 50 pixel gray-scale images of porous graphene, with the corresponding thermal conductivities obtained by MD simulations. The coefficient of determination of the trained model on the test set is 0.96, and the root mean square error is 1.09 W/(mK), close to the result of the MD simulations themselves [root mean square error of 0.74 W/(mK)], indicating that the model can accurately predict the thermal conductivity of porous graphene.
In 2021, Liu et al. [157] proposed a deep neural network capable of quickly predicting the thermal conductivity of piled graphene with various geometric parameters and sizes under external mechanical loading. The characterization of piled graphene is illustrated in Fig.6(j) and (k). First, the piled graphene is projected onto the x−y plane and uniformly discretized into subregions described by a 2D matrix. The total area of the stacked graphene (β) is then divided by the number of graphene sheets (g_ij) contained within each discrete subregion to obtain the physical information pixel value (v_ij). Finally, the physical information of each pixel is rendered as a gray-scale image, referred to as a fingerprint. These fingerprints, together with the corresponding thermal conductivity values obtained from MD simulations, serve as the training dataset. The final results demonstrate that the predictions of the model are consistent with the MD simulations: the coefficients of determination of the training, validation, and test sets are 0.9787, 0.935, and 0.925, and the root mean square errors are 0.3220, 0.6319, and 0.6024 W/(m·K), respectively. These results show that the deep neural network, trained without overfitting, can predict the thermal conductivity of piled graphene with very high accuracy. Furthermore, the authors constructed a comprehensive databank of piled graphene structures and the corresponding thermal conductivities obtained from MD simulations and deep neural network predictions. This databank stores key geometric characteristics of piled graphene, such as the design domain size (l × w), the number of graphene sheets (n_s), and the total area of graphene (A_g), as shown in Fig.6(l). It can be used to search for piled graphene structures and their corresponding thermal conductivities, guiding the design of structures with specific thermal conductivities.
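To make the fingerprint idea concrete, the sketch below discretizes a stack of sheets into a pixel grid. The v_ij = β/g_ij recipe is paraphrased from the description above, and the grid size, cell area, and boolean coverage masks are illustrative assumptions rather than the authors' exact procedure.

```python
# Minimal sketch of building a piled-graphene "fingerprint" (assumptions:
# the beta / g_ij pixel formula follows the text above; inputs are toy data).
import numpy as np

def fingerprint(coverage_masks, cell_area=1.0):
    """coverage_masks: list of (nx, ny) boolean arrays, one per graphene
    sheet, marking which x-y subregions that sheet covers."""
    g = np.sum([m.astype(float) for m in coverage_masks], axis=0)   # g_ij
    beta = g.sum() * cell_area          # total stacked area of graphene
    v = np.divide(beta, g, out=np.zeros_like(g), where=g > 0)       # v_ij
    return v / v.max() if v.max() > 0 else v  # normalize to gray-scale

# Illustrative usage: two overlapping sheets on a 4 x 4 grid
m1 = np.zeros((4, 4), bool); m1[:2, :] = True
m2 = np.zeros((4, 4), bool); m2[1:3, :] = True
img = fingerprint([m1, m2])             # gray-scale fingerprint image
```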
5 Deep learning in the design of 2D materials
Materials can be defined by the type and number of constituent atoms, the stoichiometric or non-stoichiometric ratios of elements, and structural characteristics such as crystallography, nanostructure, and microstructure. Owing to their varying atomic compositions, stoichiometries, and structures, different materials exhibit distinct optical, mechanical, and electronic properties. Traditional material design is a forward mapping process from material parameters to target properties. The forward design of new materials typically involves a series of steps, including molecular design, performance prediction, chemical synthesis, and experimental evaluation. These steps require a large number of experiments and simulations, resulting in huge time and resource consumption as well as high trial-and-error costs. High-throughput computational screening [158-162] overcomes these challenges by employing first-principles calculations [163]. It operates within a virtual chemical library constructed through combinatorial enumeration, facilitating the efficient screening of potential candidates for subsequent chemical synthesis, and greatly enhances the efficiency of material design and discovery. However, when tackling large-scale problems, the immense size of the chemical search space leads to exponential growth in time and computational costs. Moreover, constructing a virtual chemical library [164-167] relies heavily on the experience and intuition of materials scientists. Conversely, reverse design starts from a material possessing specific desired functionalities and works backward to deduce its chemical composition and structure. This process, which determines optimal material design parameters from a target property, is a reverse mapping from target performance to design parameters. In recent years, the increasing amount of experimental and simulation data has made data-driven deep learning the fourth paradigm of materials science, as shown in Fig.7(a) [168]. Notably, deep generative models have found widespread application in reverse design. They acquire material design knowledge and principles hidden within high-dimensional data and generate new materials with specific functionalities, no longer depending on the experience or intuition of researchers. Among these models, variational autoencoders (VAE), GAN, and reinforcement learning have achieved noteworthy advances in molecular design, facilitating the design of material structures from desired material performance. Below, we present the latest advances in utilizing deep learning to design functional 2D materials.
In 2020, Dong et al. [169] introduced a reverse design framework utilizing regression and conditional generative adversarial networks (RCGAN) to generate hybrid 2D structures of graphene and h-BN with specific bandgap values, as depicted in Fig.7(b). A conventional GAN struggles to generate data conditioned on continuous, quantitative labels; the proposed RCGAN overcomes this limitation by incorporating a supervised regression network. The work employed DFT calculations to obtain bandgap values for various graphene/h-BN composite structures and built the training dataset from the calculated results. Taking a 4 × 4 supercell structure as an example, as shown in Fig.7(c) and (d), the findings indicate a favorable linear correlation between the desired bandgap and the bandgap of the generated structures, with a correlation factor of 0.87; 64% of the generated structures exhibit a relative bandgap error within 10%, and the mean absolute error is 9.45%. These outcomes underscore the high precision of the model. Additionally, the work evaluated the diversity of the generated structures. For 4 × 4 and 5 × 5 supercells, generated structures equivalent to real structures in the training set accounted for only 12% and 0.5% of all generated structures, respectively, and among the generated 6 × 6 supercell structures there were no instances of equivalence with real structures. These findings suggest that the RCGAN was trained successfully, without common training issues such as mode collapse.
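The key ingredient, conditioning the generator on a continuous bandgap label, can be sketched in a few lines. The following is a minimal illustration in PyTorch; the 16-site encoding of a 4 × 4 supercell, the network widths, and the plain concatenation-based conditioning are assumptions for illustration, not the RCGAN architecture itself.

```python
# Minimal sketch of a generator conditioned on a continuous band-gap label
# (assumptions: encoding, widths, and conditioning scheme are illustrative).
import torch
import torch.nn as nn

N_SITES = 16  # lattice sites of a 4 x 4 supercell: 1 = C pair, 0 = B-N pair

class Generator(nn.Module):
    def __init__(self, z_dim=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(z_dim + 1, 128), nn.ReLU(),    # +1 for the band-gap label
            nn.Linear(128, N_SITES), nn.Sigmoid(),   # site occupation probabilities
        )

    def forward(self, z, target_gap):
        # Concatenate noise with the continuous, quantitative label
        return self.net(torch.cat([z, target_gap], dim=1))

g = Generator()
z = torch.randn(8, 32)
gap = torch.full((8, 1), 1.5)    # request structures with a 1.5 eV gap
structures = g(z, gap).round()   # threshold to a discrete C / BN pattern
```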
While VAE and GAN have made significant progress in reverse design, they are susceptible to mode collapse during training, which may lead to outright failure. The intrinsic reversibility of invertible neural networks (INN) confers potential advantages in stability and performance, partly alleviating the mode collapse encountered when training VAE and GAN. In 2021, Fung et al. [170] introduced an INN-based reverse design framework named Materials Design with Invertible Neural Networks (MatDesINNe). The framework enables thorough and efficient sampling of the entire design space, facilitating both forward and reverse mappings between material parameters and target properties, and thereby generates material candidates with the desired property. They applied this framework to the bandgap engineering of MoS₂. Within a design space of applied strains and external electric fields, the framework generated new material candidates with high fidelity, accuracy, and diversity. The study characterized applied strain through variations in the equilibrium lattice constants (a, b, c, α, β, γ) and added one further dimension to the design space, an electric field perpendicular to the monolayer. The lattice parameters were sampled within ±20% of their equilibrium values, and the applied electric field was sampled from −1 V/Å to 1 V/Å. The entire design parameter space is depicted in Fig.7(e). Sampling across the whole design space generated a dataset of 11 000 samples. For target bandgaps of 0 eV, 0.5 eV, and 1 eV, with DFT-computed bandgap values as the ground truth, the study conducted tests on 10 000 samples and compared the proposed model with a mixture density network (MDN) and a conditional variational autoencoder (cVAE). For the target bandgap of 0 eV, all models performed well, as the majority of samples have zero bandgap. For the non-zero target bandgaps of 0.5 eV and 1 eV, the mean absolute errors increased significantly, to 0.421 eV and 0.461 eV for MDN and 0.840 eV and 0.973 eV for cVAE, signifying a marked decline in performance. The mean absolute errors of a conditional invertible neural network alone were 0.219 eV and 0.193 eV, indicating reasonably good performance but still short of the precision requirements. The best-performing model was the conditional invertible neural network within the MatDesINNe framework (MatDesINNe-cINN), with mean absolute errors of 0.013 eV and 0.015 eV. Subsequently, the authors validated the performance of the MatDesINNe-cINN model on 200 samples using DFT calculations; the results for the target bandgap of 0.5 eV are depicted in Fig.7(f) and (g). For the three target bandgap values, the model exhibited a mean absolute error of approximately 0.1 eV, indicating that it can generate samples with specific bandgaps with high accuracy.
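The property that makes INNs attractive here, an exactly invertible forward map, can be illustrated with a single affine coupling block of the RealNVP type. This is a minimal sketch under that assumption; MatDesINNe's actual architecture, conditioning, and training objective are more involved.

```python
# Minimal sketch of an invertible (RealNVP-style) affine coupling block
# (assumptions: dimensions and subnet widths are illustrative).
import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    """Splits the input in half; one half parameterizes an exactly
    invertible affine transform of the other."""
    def __init__(self, dim=8):
        super().__init__()
        self.half = dim // 2
        self.subnet = nn.Sequential(
            nn.Linear(self.half, 64), nn.ReLU(),
            nn.Linear(64, 2 * self.half),
        )

    def forward(self, x):
        x1, x2 = x[:, :self.half], x[:, self.half:]
        s, t = self.subnet(x1).chunk(2, dim=1)
        return torch.cat([x1, x2 * torch.exp(s) + t], dim=1)

    def inverse(self, y):
        y1, y2 = y[:, :self.half], y[:, self.half:]
        s, t = self.subnet(y1).chunk(2, dim=1)
        return torch.cat([y1, (y2 - t) * torch.exp(-s)], dim=1)

layer = AffineCoupling()
x = torch.randn(4, 8)   # e.g., strain + field design variables
assert torch.allclose(layer.inverse(layer(x)), x, atol=1e-4)  # exact round trip
```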
On the other hand, Wan et al. [156] introduced a CNN-based reverse design scheme that efficiently searches for porous graphene structures with minimal thermal conductivity, revealing how hole distribution and hole density correlate with the reduction of thermal conductivity in porous graphene. This approach requires only 1000 MD simulations to select the optimal solution from a design space of roughly one million structures, a vast improvement in efficiency over brute-force MD screening of the entire design space. The reverse design process with the CNN proceeds as follows. First, 100 structures are randomly chosen and used to train the first generation of the CNN. This network is then employed to predict the thermal conductivity of the remaining structures in the design space and to screen out those with the lowest predicted conductivity. These new structures, together with their MD-calculated thermal conductivities, are added to the training set, augmenting the training data for the next generation of the CNN. Through this iteration, the graphene structures with the lowest thermal conductivity can be found. A comparison between the proposed scheme and a random search shows that the random search converges slowly, whereas the CNN-based reverse design converges swiftly, screening out the top 100 lowest-conductivity structures by the 7th generation. With 24 potential hole sites and a porosity of 0.5, they further verified the performance of the model in a vast design space of 2 704 156 possible structures. The results show that in the 8th generation's training set, the mean thermal conductivity of the top 100 structures is 14.99 W·m⁻¹·K⁻¹. These results validate the model's capacity to rapidly screen porous graphene structures with low thermal conductivity.
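The generation-by-generation loop is essentially a surrogate-assisted (active learning) search. Below is a minimal sketch of that loop; md_conductivity and train_cnn are hypothetical stand-ins for the MD simulation and the CNN training step, replaced here by trivial placeholders so the sketch runs end to end.

```python
# Minimal sketch of the iterative CNN-guided search (assumptions: the two
# helpers below are placeholders, not the authors' MD or CNN code).
import numpy as np

def md_conductivity(structure):
    # Placeholder for an expensive MD simulation of one structure
    return float(structure.mean())

def train_cnn(X, y):
    # Linear least-squares stand-in for training the CNN surrogate
    w = np.linalg.lstsq(np.stack([x.ravel() for x in X]),
                        np.asarray(y), rcond=None)[0]
    return lambda S: np.array([s.ravel() @ w for s in S])

def reverse_design(design_space, n_init=100, n_per_gen=100, n_gens=8):
    rng = np.random.default_rng(0)
    idx = set(rng.choice(len(design_space), n_init, replace=False).tolist())
    X = [design_space[i] for i in idx]
    y = [md_conductivity(s) for s in X]           # expensive labels
    for gen in range(n_gens):
        predict = train_cnn(X, y)                 # this generation's surrogate
        order = np.argsort(predict(design_space)) # lowest predicted kappa first
        new = [i for i in order if i not in idx][:n_per_gen]
        for i in new:                             # label only the selected few
            idx.add(i)
            X.append(design_space[i])
            y.append(md_conductivity(design_space[i]))
    return X, y

# Illustrative usage on a toy design space of 54 x 50 binary "images"
space = np.random.default_rng(1).integers(0, 2, (2000, 54, 50)).astype(float)
best_X, best_y = reverse_design(space)
```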
2D materials have shown high activity in catalytic reactions, and numerous experimental and high-throughput computational studies have reported their use as hydrogen evolution reaction catalysts. However, owing to long experimental cycles and the massive cost of high-throughput adsorption-energy calculations, the rapid discovery of high-performance 2D hydrogen evolution reaction catalysts remains a significant challenge. In 2023, Wu et al. [171] utilized crystal graph convolutional neural networks (CGCNN) to screen high-performance 2D hydrogen evolution reaction catalysts from a 2D materials database. The trained model can sift high-performance 2D catalysts from 3401 composite structures with different active sites in just a few hours, achieving a prediction accuracy of 95.2%. By contrast, the average time for calculating the adsorption energy of the hydrogen evolution reaction using DFT (represented by the dotted line in Fig.7(h)) is 94 528 seconds, so covering all 3401 active sites with DFT would take 94 528 × 3401 seconds ≈ 10.19 years. Clearly, the model is markedly more efficient than a decade-long DFT campaign, demonstrating the capability of CGCNN to efficiently discover high-performance new structures over a large 2D materials space.
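For orientation, the core of a CGCNN is a gated graph convolution over the crystal graph, in which each atom's feature vector is updated from its neighbors and the connecting bond features. The sketch below follows the published CGCNN update rule in spirit; the feature dimensions and the toy usage are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of one crystal-graph convolution step in the spirit of
# CGCNN (assumptions: feature sizes and toy inputs are illustrative).
import torch
import torch.nn as nn
import torch.nn.functional as F

class CrystalGraphConv(nn.Module):
    def __init__(self, node_dim=64, edge_dim=32):
        super().__init__()
        z_dim = 2 * node_dim + edge_dim            # [v_i, v_j, u_ij]
        self.gate = nn.Linear(z_dim, node_dim)     # sigmoid "filter"
        self.core = nn.Linear(z_dim, node_dim)     # softplus "core"

    def forward(self, v, edge_index, u):
        # v: (n_atoms, node_dim); u: (n_edges, edge_dim)
        # edge_index: (2, n_edges) pairs of (source i, neighbor j)
        i, j = edge_index
        z = torch.cat([v[i], v[j], u], dim=1)
        msg = torch.sigmoid(self.gate(z)) * F.softplus(self.core(z))
        out = v.clone()
        out.index_add_(0, i, msg)   # v_i' = v_i + sum over neighbors of gated message
        return out

conv = CrystalGraphConv()
v = torch.randn(5, 64)                     # 5 atoms
u = torch.randn(12, 32)                    # 12 directed edges (bonds)
edge_index = torch.randint(0, 5, (2, 12))  # (source, neighbor) atom indices
v_new = conv(v, edge_index, u)             # updated atom features, shape (5, 64)
```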
6 Summary and outlook
Over the past decade, research in the field of 2D materials has grown significantly, with an ever-increasing abundance of new 2D materials and heterostructures. This development has far outpaced the processing capacity of conventional experimental and computational approaches. In recent years, the emergence of deep learning has brought unprecedented opportunities for investigating 2D materials. Training datasets can be collected from material databases, experimental results, and simulations [172]; these datasets are used to train various deep learning models, establishing a mapping between input features and target outputs. The trained models can then perform structure characterization, property prediction, and reverse design for 2D materials, overcoming, to a certain extent, the limitations of traditional experimental and computational methods and greatly improving the efficiency of 2D materials research.
This review has discussed recent advances in applying deep learning to structure characterization (defect identification, materials identification, and thickness characterization), property prediction (electronic, mechanical, and thermodynamic properties), and reverse design of 2D materials; a summarized overview is given in Tab.1. Deep learning models can accurately identify and quantitatively analyze doping and defects in 2D materials with single-atom precision, and integrating deep learning with material characterization techniques will expedite high-precision, large-scale characterization of 2D materials. Combining deep learning with theoretical calculations not only enables high-precision performance prediction at a fraction of the computational cost but also aids in exploring the behavior of 2D materials under complex conditions; integrating deep learning with experimental results will therefore be important for practical property prediction in future research. Furthermore, deep learning is essential for reverse design, as it allows optimal material design parameters to be determined from the properties of the desired material, improving design efficiency by freeing researchers from the constraints of traditional trial-and-error approaches. Several research efforts have demonstrated the feasibility of rapidly designing 2D materials with target properties. It is foreseeable that deep learning methods will continue to unleash their substantial potential, offering further support for the study and application of 2D materials. Nevertheless, it is also essential to be mindful of the challenges and limitations inherent to deep learning.
Firstly, preparing 2D materials experimentally demands a high level of expertise and complex instruments, while computational simulations require expensive computing resources and substantial time. Consequently, existing 2D materials datasets are often insufficient and fail to represent the characteristics of 2D materials comprehensively and objectively [173]. This shortcoming can prevent a model from learning crucial features, resulting in poor performance and issues such as underfitting, overfitting, and reduced generalization capability [174]. Transfer learning offers a solution to the issue of limited data: knowledge learned from related tasks is transferred to new tasks by fine-tuning the parameters of a model pre-trained on large-scale datasets, allowing the model to adapt to new datasets and tasks. Transfer learning can thus improve generalization and diminish the risk of overfitting. Secondly, the quality of 2D materials data is another pivotal factor affecting model performance. On one hand, datasets should be constructed with diversity and representativeness in mind, and repeated experiments or simulations should be conducted to ensure the reliability and precision of the data. On the other hand, establishing sizable, high-quality, openly accessible datasets would offer crucial support for deep learning research on 2D materials, since aggregating and sharing large volumes of 2D materials data facilitates the training of more precise and efficient models. Furthermore, most of the training datasets in the works discussed in this review were generated by simulation, which deviates somewhat from real physical experiments, so special attention must be paid to the reliability of models applied to experimental data. Finally, the interpretability of deep learning methods still needs to be addressed. Deep learning methods are widely regarded as black-box algorithms: their predictions lack interpretability, and their internal mechanisms and logic are inscrutable, which limits the reliability and credibility of deep learning in practical applications. When designing deep learning models and loss functions, it is imperative to follow the fundamental principles of physics and chemistry; incorporating prior knowledge and constraints into the model can improve reliability and interpretability, enabling the model to reflect real-world patterns more accurately and attain more theoretically reasonable predictions.
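As a concrete illustration of the fine-tuning recipe, the sketch below freezes a backbone pre-trained on a large generic dataset and retrains only a new regression head on a small target dataset. The choice of ResNet-18 and the single-output head are illustrative assumptions, not a specific model from the 2D materials literature.

```python
# Minimal transfer-learning sketch (assumptions: generic ImageNet backbone;
# the regression target, e.g., a material property, is illustrative).
import torch
import torch.nn as nn
from torchvision import models

model = models.resnet18(weights="IMAGENET1K_V1")   # pre-trained on a large dataset
for p in model.parameters():
    p.requires_grad = False                        # freeze the learned features
model.fc = nn.Linear(model.fc.in_features, 1)      # new head for the small new task
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-4)  # fine-tune head only
```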
In the future, on the one hand, there is potential to integrate deep learning with robotics to build automated systems for the efficient preparation of various 2D materials and complex heterostructures, enabling intelligent synthesis of 2D materials and device design [175]. Current studies in the field are mostly semi-autonomous, and establishing a closed-loop autonomous materials experimentation process remains a significant technical challenge; the emergence of autonomous robotic scientists would substantially change the existing mode of human-machine collaboration. On the other hand, there is still huge room for applying deep learning to the study of 2D materials. In the reverse design of 2D materials, the generation of materials with specific thermal conductivities and mechanical properties, beyond the materials with desired bandgaps discussed above, is worth further investigation. Regarding property prediction, properties beyond the electronic, mechanical, and thermal, such as optical properties, superconductivity, and toxicity, need further study [174]. The integration of deep learning into the study of 2D materials advances 2D materials science by overcoming inherent limitations of traditional experimental and computational methods. While notable progress has been made, numerous challenges remain for the future.