1 Introduction
Owing to the unique electromagnetic properties of nanoscale photonic structures, light–matter interactions can be precisely engineered at the subwavelength scale [
1−
17]. Representative examples include localized surface plasmon resonances that confine light to deep-subwavelength volumes [
18−
20], the emergence of optical magnetic resonances in intrinsically non-magnetic materials [
20−
24], the tailored manipulation of optical near-fields through subwavelength architectures [
25−
27], and the occurrence of various nonlinear optical phenomena [
28,
29]. These nanoscale optical effects can have a wide range of applications, such as invisibility cloaks [
30−
32], analog computing [
33−
36], wireless communications [
37−
44], integrated quantum optics [
45−
47], metamaterials [
48−
plasmonics [
55−
60], photonic crystals [
45,
61−
63], and metasurfaces that can manipulate the amplitude, phase, and polarization of electromagnetic waves [
16,
64−
82], thereby providing control over light propagation. Reconfigurable metasurfaces represent a groundbreaking class of structures capable of dynamically manipulating electromagnetic waves through tuning mechanisms that utilize various approaches [
83−
94] or different materials [
95−
108] depending on the operating frequency.
Although optical devices such as near-field microscopes have advanced considerably since their invention [
109–
111], numerically modeling many problems associated with these devices remains challenging [
112,
113]. Similar difficulties also arise in the design of nano-optical devices, where achieving specific functionalities often requires intensive simulations and parameter sweeps. These limitations highlight the need for more efficient strategies. Owing to its unique advantages, artificial intelligence [
114−
119] has found widespread applications across various fields [
120−
175], and it can likewise be leveraged to address the challenges encountered in nanophotonics. For instance, machine learning techniques such as genetic algorithms [
141,
176] have been applied to optimize metasurface unit cells, while Bayesian optimization [
177−
180] has been employed to accelerate parameter tuning with reduced computational cost. In addition, deep learning methods [
180−
202] — including convolutional neural networks (CNNs), generative adversarial networks (GANs) [
203,
204], and conditional generative models — have been successfully used for the inverse design of photonic structures, enabling direct prediction of geometry from target optical responses. These approaches not only reduce the design cycle from days to minutes but also unlock innovative device functionalities and high-performance metasurfaces.
Over the past decade, the convergence of nanophotonics and machine learning has emerged as a prominent and rapidly evolving research frontier. Early applications of machine learning in photonics include contributions to inverse quantum design [
205−
207], computational microscopy [
208], sensing [
209,
210], metrology [
211], optical neural networks (ONNs) [
212−
214], and other areas. Among the various branches of machine learning, deep learning stands out as a transformative force driving the rapid advancement of artificial intelligence. Built upon neural network architectures, deep learning has become a dominant framework in modern machine learning research and applications. Its algorithms, inspired by the human learning process, have achieved remarkable success in numerous domains, including feature detection [
135,
215−
218], speech recognition [
122,
130,
219−
233], image processing [
136,
137,
203,
234−
274] and object classification [
176,
275−
288].
The inverse design process based on deep learning methods [
134,
156,
289−
300] primarily involves two components: the forward design stage of the neural network and the inverse design stage. At the core of deep learning technology lies the construction of artificial neural networks (ANNs). ANNs are inspired by the working principles of biological neurons, in which the fundamental unit of information processing is the artificial neuron or node. Each neuron receives multiple inputs, multiplies them by corresponding weights, adds a bias term, and then passes the result through a nonlinear activation function. This mechanism is analogous to the synaptic weights and soma processing in biological neurons, allowing neurons to flexibly determine whether to “activate” based on the input signal and transmit information to the next layer. Raw data are received by the input layer, while the output layer generates the final results. The hidden layers progressively extract abstract features through layer-by-layer nonlinear mapping, enabling deep learning to automatically learn complex functional representations. To train ANNs, electromagnetic simulations are typically used to generate large sets of training samples. Starting from a randomly initialized network, the output is computed through forward propagation. The weights are then iteratively optimized using the backpropagation algorithm [
301−
303], guided by a loss function (such as mean squared error (MSE), cross-entropy, or mean absolute error), until the model accurately reproduces the input–output mapping of the training set.
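As a concrete illustration of the forward and backward passes described above, the following minimal sketch trains a one-hidden-layer network with an MSE loss on synthetic data; the toy data generator, layer sizes, and learning rate are illustrative assumptions rather than values taken from any cited work.

import numpy as np

rng = np.random.default_rng(0)

# Toy data: four "structural parameters" mapped to a 16-point "spectrum".
# The generating function below is an arbitrary stand-in for an EM solver.
X = rng.uniform(-1.0, 1.0, size=(512, 4))
Y = np.sin(X @ rng.normal(size=(4, 16)))

# One hidden layer: y_hat = W2 * tanh(W1 x + b1) + b2
W1 = rng.normal(scale=0.5, size=(4, 32)); b1 = np.zeros(32)
W2 = rng.normal(scale=0.5, size=(32, 16)); b2 = np.zeros(16)
lr = 0.5

for epoch in range(2001):
    # Forward propagation
    H = np.tanh(X @ W1 + b1)                 # hidden activations
    Y_hat = H @ W2 + b2                      # network output
    loss = np.mean((Y_hat - Y) ** 2)         # MSE loss

    # Backpropagation of the MSE loss
    dY = 2.0 * (Y_hat - Y) / Y.size
    dW2 = H.T @ dY; db2 = dY.sum(axis=0)
    dH = (dY @ W2.T) * (1.0 - H ** 2)        # tanh'(z) = 1 - tanh(z)^2
    dW1 = X.T @ dH; db1 = dH.sum(axis=0)

    # Gradient-descent weight update
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

    if epoch % 500 == 0:
        print(f"epoch {epoch:4d}  MSE = {loss:.5f}")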
In neural-network modeling of nanophotonic devices, input labels x denote structural parameters (geometry, materials, excitation) and output labels y denote device responses (spectra). Electromagnetics makes the forward mapping x→y deterministic and single-valued, so forward networks can learn this mapping accurately and act as fast surrogates for costly simulations. The inverse mapping y→x is inherently one-to-many and multimodal — a single response y can correspond to many different designs x — so inverse networks struggle to represent this ambiguity and often converge to averaged solutions that miss valid design diversity. Even cascade or tandem schemes that couple inverse and forward models to enforce physical consistency typically produce a single feasible design rather than enumerating alternative viable structures.
Compared with traditional models, the tandem neural network (TNN) [
64,
304−
306] integrates forward and inverse networks into a unified framework to efficiently address the one-to-many ambiguity problem in inverse design. In this approach, a forward model is first trained to predict the device’s physical response
y from its physical parameters
x, ensuring that the predictions remain consistent with physical laws. Building upon this, an inverse model is then trained, which takes the target response
y as input and attempts to generate the corresponding parameters
x. The generated
x is subsequently fed back into the forward model for verification, thereby constraining the inverse output to ensure its validity and physical consistency. Because the loss is evaluated on the predicted response rather than on the design parameters themselves, the tandem architecture remains stable even when several designs share the same response, making it an effective strategy for handling the multi-solution ambiguity in complex photonic device design. In recent years, supported by deep learning and open-source frameworks such as TensorFlow [
307,
308], the application of neural networks in photonics has expanded rapidly and achieved notable practical improvements.
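The tandem scheme can be sketched in a few lines of code (PyTorch is used here purely for brevity; the same logic can be written in TensorFlow). A forward surrogate is first fitted to simulated (x, y) pairs and then frozen, after which the inverse network is trained so that the prediction of the forward model for the generated design matches the target response. The placeholder simulator, layer widths, and training settings are illustrative assumptions.

import torch
import torch.nn as nn

torch.manual_seed(0)

def mlp(n_in, n_out):
    return nn.Sequential(nn.Linear(n_in, 64), nn.Tanh(), nn.Linear(64, n_out))

# Placeholder "simulator": 4 structure parameters -> 16-point response.
A = torch.randn(4, 16)
def simulate(x):                 # stand-in for an electromagnetic solver
    return torch.sin(x @ A)

x_train = torch.rand(1024, 4) * 2 - 1
y_train = simulate(x_train)

# 1) Pretrain the forward surrogate f: x -> y.
forward_net = mlp(4, 16)
opt_f = torch.optim.Adam(forward_net.parameters(), lr=1e-3)
for _ in range(2000):
    opt_f.zero_grad()
    loss = nn.functional.mse_loss(forward_net(x_train), y_train)
    loss.backward(); opt_f.step()

# 2) Freeze f and train the inverse network g: y -> x,
#    with the loss evaluated in response space through f.
for p in forward_net.parameters():
    p.requires_grad_(False)

inverse_net = mlp(16, 4)
opt_g = torch.optim.Adam(inverse_net.parameters(), lr=1e-3)
for _ in range(2000):
    opt_g.zero_grad()
    x_hat = inverse_net(y_train)            # candidate designs
    loss = nn.functional.mse_loss(forward_net(x_hat), y_train)
    loss.backward(); opt_g.step()

# A retrieved design is checked against the (placeholder) simulator.
y_target = y_train[:1]
x_design = inverse_net(y_target)
print("response error:", nn.functional.mse_loss(simulate(x_design), y_target).item())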
Overall, this review systematically summarizes recent progress in applying artificial intelligence techniques to nanophotonic devices. As illustrated in Fig. 1, this review primarily centers on photonic devices, with particular emphasis on meta-lens, meta-grating, and beam splitters. From an algorithmic perspective, it addresses not only deep learning methods, including neural networks, but also a variety of optimization strategies, such as topology optimization (TO), genetic algorithms, and particle swarm optimization. Furthermore, Table 1 provides a comparative overview of commonly used machine learning algorithms, highlighting their respective advantages and limitations.
2 Nanophotonic devices
Photonics is the discipline that studies how light is generated, manipulated, and detected, and it plays a pivotal role in advancing modern technology. In 1960, Maiman demonstrated the first working laser [
309,
310], marking the beginning of a new era in photonics. During the 1970s, the advent of low-loss optical fibers [
311,
312] and the rise of fiber-optic communications [
313,
314] enabled information to be transmitted at the speed of light; in particular, the development of sub-20 dB/km optical fibers by Corning Laboratory in 1970 became a landmark achievement [
315−
317]. In the 1980s, breakthroughs in nonlinear optics — such as frequency conversion technologies [
318−
324] — expanded the application of lasers to a wider range of fields. The 1990s witnessed the emergence of photonic crystals, which further accelerated the development of novel optical devices.
In the early 21st century, super-resolution imaging technologies overcame the diffraction limit, opening new avenues for optical microscopy and imaging. In the 2010s, the development of photonic integrated circuits (PICs) [
325−
329] enabled the integration of multiple optical components onto a single chip, thereby improving both speed and energy efficiency [
330]. Entering the 2020s, the convergence of artificial intelligence and photonics has further accelerated the field, with AI being widely applied to optical design optimization, complex data processing, and intelligent sensing systems [
177,
237,
331−
343].
2.1 Meta-lens
The utilization of meta-lens is on the rise [
344,
345], owing to their capacity to regulate phase distributions. Their functional capabilities have now surpassed those of conventional lenses. In Ref. [
346], as shown in Fig. 2(a), Fan
et al. proposed an improved genetic algorithm (GA) for synthesizing one-dimensional geometric phase-controlled dielectric metasurfaces by optimizing the target phase pattern, in contrast to earlier approaches that relied on amplitude control. The proposed GA workflow consists of initialization, random mutation, evolutionary screening, and replication. In this method, random mutations are introduced by altering the phase of individual unit cells, while a multi-step hierarchical mutation strategy is employed to overcome local optima, thereby enhancing the GA optimization process. The effectiveness of this approach is validated by generating a longitudinal optical pattern — specifically, a non-diffractive light sheet — using a one-dimensional geometric phase dielectric metasurface with a non-analytical, counterintuitive phase distribution. This strategy demonstrates the ability of the optimized GA to address complex metasurface design problems and to realize customized light-sheet generation. Reference [
347], as shown in Fig. 2(b), presented an inverse-designed meta-optical element with a substantial numerical aperture of ~0.44 and an extended depth of field (EDOF), characterized by a lens-like point spread function (PSF). This EDOF device maintains the same focusing efficiency as a hyperbolic meta-lens throughout its entire focal range. By leveraging the extended focal range in combination with computational post-processing, the authors demonstrated achromatic broadband imaging across the entire visible spectrum using a compact 1 mm meta-optical system. Ref. [
348], as shown in Fig. 2(c), also adopts an inverse design approach, demonstrating that axisymmetric structures based on full-wave Maxwell equations can address emerging challenges, such as large wavelength separations and the integration of active materials. This method enables the creation of fundamentally novel solutions distinct from conventional lenses. These solutions include reconfigurable optics that shift multi-frequency nodal points in response to variations in the refractive index, as well as multispectral lenses with significant spectral separation (
λ = 1 μm and 10 μm). The inverse-designed metasurface achieved near-diffraction-limited focusing at the 1550 nm telecommunications wavelength, with a numerical aperture of 0.4. Due to axisymmetry, the full-wave Maxwell solver can be applied to structures with diameters spanning hundreds to thousands of wavelengths before domain decomposition becomes necessary. Moreover, multilayer TO with approximately 10⁵ degrees of freedom enables the solution of highly complex design problems, even within the constraints of axisymmetric geometries.
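A skeletal version of the GA workflow described for Ref. [346] (initialization, random mutation of individual unit-cell phases, evolutionary screening, and replication, with a coarse-to-fine mutation schedule standing in for the multi-step hierarchical strategy) is sketched below. The figure of merit is a placeholder to be replaced by a field calculation of the desired light sheet, and the population size, phase quantization, and mutation schedule are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(1)

N_CELLS, POP, GENERATIONS = 200, 60, 300
PHASE_LEVELS = np.linspace(0, 2 * np.pi, 8, endpoint=False)   # discrete geometric phases

def figure_of_merit(phase_profile):
    # Placeholder FOM. In practice this would evaluate the diffracted field
    # (e.g., the intensity along the target light sheet) for the 1D profile.
    target = np.linspace(0, 2 * np.pi, N_CELLS) % (2 * np.pi)
    return float(np.mean(np.cos(phase_profile - target)))

def mutate(profile, n_flips):
    # Random mutation: reassign the phase of n_flips randomly chosen unit cells.
    child = profile.copy()
    idx = rng.choice(N_CELLS, size=n_flips, replace=False)
    child[idx] = rng.choice(PHASE_LEVELS, size=n_flips)
    return child

# Initialization: a population of random discrete phase profiles.
population = [rng.choice(PHASE_LEVELS, size=N_CELLS) for _ in range(POP)]

for gen in range(GENERATIONS):
    # Coarse-to-fine mutation: many cells are flipped early, few later,
    # which helps the search escape local optima.
    n_flips = max(1, int(N_CELLS * 0.1 * (1 - gen / GENERATIONS)))
    children = [mutate(p, n_flips) for p in population]
    # Evolutionary screening: keep the fittest half of parents and children ...
    merged = sorted(population + children, key=figure_of_merit, reverse=True)
    survivors = merged[:POP // 2]
    # ... and replication: duplicate the survivors to restore the population size.
    population = survivors + [s.copy() for s in survivors]

best = max(population, key=figure_of_merit)
print("best figure of merit:", figure_of_merit(best))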
Another algorithm well-suited for meta-lens design is TO. As shown in Fig. 3(a), Lin
et al. [
349] proposed a fully Maxwell-based topological optimization of a single multilayer meta-lens with a thickness of approximately 10 wavelengths. This meta-lens achieves simultaneous focusing across a 60° angular range and a 23% spectral bandwidth without chromatic or angular aberrations, thereby demonstrating “flat-field achromatism”. The system exhibits diffraction-limited performance and maintains an absolute focusing efficiency exceeding 50% across the range of incident angles and operating frequencies. Similarly, as shown in Fig. 3(b), Christiansen
et al. [
350] benchmarked TO against a genetic-algorithm-based approach. Their study presents a concise code to illustrate the use of TO as an inverse design tool in photonics, exemplified through the design of two-dimensional dielectric meta-lens and metallic reflectors. The results further show that gradient-based optimization methods outperform genetic algorithm-based approaches for photonic inverse design, achieving a figure of merit (FOM) approximately 18% higher than that obtained with the genetic algorithm.
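For reference, the inner loop of a density-based, gradient-driven TO run of the kind used in Refs. [349, 350] typically alternates a density filter, a smoothed projection, and a gradient-ascent update. In the sketch below the Maxwell solve and its adjoint gradient are replaced by a simple analytic placeholder so that the loop runs end to end; the filter radius, projection sharpness, and step size are illustrative assumptions.

import numpy as np
from scipy.ndimage import uniform_filter

rng = np.random.default_rng(2)
BETA, ETA = 8.0, 0.5           # projection sharpness and threshold
RADIUS, STEP = 3, 0.05         # filter radius (pixels) and ascent step

def project(rho_f):
    # Smoothed threshold pushing intermediate densities toward 0 or 1.
    return (np.tanh(BETA * ETA) + np.tanh(BETA * (rho_f - ETA))) / (
        np.tanh(BETA * ETA) + np.tanh(BETA * (1.0 - ETA)))

def fom_and_gradient(rho_p):
    # Placeholder for a Maxwell solve plus adjoint gradient: here the FOM
    # simply rewards material placed inside a circular "lens" region.
    n = rho_p.shape[0]
    y, x = np.mgrid[:n, :n]
    mask = ((x - n / 2) ** 2 + (y - n / 2) ** 2) < (n / 3) ** 2
    fom = np.sum(rho_p * mask) - 0.1 * np.sum(rho_p * ~mask)
    grad = mask.astype(float) - 0.1 * (~mask).astype(float)
    return fom, grad

rho = rng.uniform(0.4, 0.6, size=(64, 64))     # design density in [0, 1]

for it in range(200):
    rho_f = uniform_filter(rho, size=2 * RADIUS + 1)   # density filter (feature-size control)
    rho_p = project(rho_f)                             # projection toward a binary design
    fom, dF_drho_p = fom_and_gradient(rho_p)
    # Chain rule through projection and filter (treating the filter as self-adjoint).
    dproj = BETA * (1.0 - np.tanh(BETA * (rho_f - ETA)) ** 2) / (
        np.tanh(BETA * ETA) + np.tanh(BETA * (1.0 - ETA)))
    grad_rho = uniform_filter(dF_drho_p * dproj, size=2 * RADIUS + 1)
    rho = np.clip(rho + STEP * grad_rho, 0.0, 1.0)     # gradient ascent on the FOM

print("final FOM:", fom)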
TO has also been extended to the design of large-area meta-lens. As shown in Fig. 3(c), Phan
et al. [
65] proposed a computationally efficient strategy in which individually optimized segments of a metasurface are stitched together, reducing the optimization complexity from a high-order polynomial to a linear scale. Using this approach, they designed and fabricated a large-area, high-numerical aperture silicon meta-lens that achieved focusing efficiencies exceeding 0.8. Similarly, as shown in Fig. 3(d), Lin
et al. [
351] proposed an overlapped-domain strategy for designing wide-area metasurfaces, in which each simulation unit includes a unit cell structure, overlapping regions from adjacent cells, and a perfectly matched layer (PML) absorber. Within this framework, they applied a topology optimization approach to design a large-area (200
λ), high-numerical aperture (0.71) polychromatic, omnidirectional achromatic lens with a focusing efficiency of about 50%. The design incorporated a large number of degrees of freedom and was optimized to maximize the minimum focal intensity at ten evenly spaced wavelengths spanning the visible spectrum (480–700 nm).
Meta-lens designed using deep learning methodologies have attracted growing attention due to their potential for efficient and high-performance device optimization. As shown in Fig. 3(e), Pestourie
et al. [
68] proposed an active learning framework tailored for composite materials, which substantially accelerates the training of surrogate models representing the physical response of nanophotonic structures, reducing training time by at least an order of magnitude compared with conventional approaches. By replacing direct numerical solutions of partial differential equations (PDEs) with these surrogate models, the method achieves more than a two-order-of-magnitude reduction in simulation time. The study demonstrated a ten-layer metasurface consisting of 100 unit cells with a periodicity of 400 nm, capable of simultaneously focusing incident light at three discrete wavelengths (405 nm, 540 nm and 810 nm) onto three separate focal points. This multi-wavelength focusing performance highlights the versatility of the approach. Moreover, the authors emphasized that integrating active learning with surrogate modeling holds significant promise for accelerating the optimization of large-scale photonic devices, potentially enabling rapid design cycles and enhanced device functionalities. Similarly, as shown in Fig. 3(f), Seo
et al. [
117] proposed a deep learning–based image restoration framework for end-to-end meta-lens imaging. This approach effectively mitigates the severe chromatic and angular aberrations inherent in large-area broadband meta-lens, enabling aberration-free full-color imaging with mass-produced meta-lens of 10 mm in diameter. The proposed framework not only provides an effective strategy to overcome the intrinsic limitations of conventional optical systems but also establishes a foundation for the development of compact and high-efficiency imaging technologies. With neural network assistance, meta-lens imaging achieves resolutions comparable to ground-truth images.
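The active-learning strategy of Ref. [68] can be condensed to: train cheap surrogates, query the expensive solver only where the surrogates disagree most, and retrain. Below is a minimal query-by-disagreement loop in which simulate() is a stand-in for the PDE solver; the random-feature surrogate, ensemble size, and batch sizes are illustrative assumptions rather than the authors' settings.

import numpy as np

rng = np.random.default_rng(3)

def simulate(x):
    # Placeholder for an expensive Maxwell/PDE solve: parameters -> scalar response.
    return np.sin(3 * x[:, 0]) * np.cos(2 * x[:, 1])

# Random-feature ridge regressors serve as cheap surrogate models.
D = 200
W = rng.normal(size=(2, D))
b = rng.uniform(0, 2 * np.pi, D)
feats = lambda x: np.cos(x @ W + b)

def fit(x, y):
    Phi = feats(x)
    return np.linalg.solve(Phi.T @ Phi + 1e-3 * np.eye(D), Phi.T @ y)

# Start from a small random training set.
x_train = rng.uniform(-1, 1, size=(20, 2))
y_train = simulate(x_train)

for _ in range(5):
    # Train an ensemble of surrogates on bootstrap resamples of the current data.
    coefs = []
    for _k in range(8):
        idx = rng.integers(0, len(x_train), len(x_train))
        coefs.append(fit(x_train[idx], y_train[idx]))
    # Score a large pool of candidate designs by ensemble disagreement.
    pool = rng.uniform(-1, 1, size=(2000, 2))
    preds = np.stack([feats(pool) @ c for c in coefs])
    uncertainty = preds.std(axis=0)
    # Query the expensive solver only at the most uncertain candidates.
    query = pool[np.argsort(uncertainty)[-10:]]
    x_train = np.vstack([x_train, query])
    y_train = np.concatenate([y_train, simulate(query)])

# Final surrogate trained on all acquired data.
coef = fit(x_train, y_train)
x_test = rng.uniform(-1, 1, size=(500, 2))
print("surrogate test MSE:", np.mean((feats(x_test) @ coef - simulate(x_test)) ** 2))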
2.2 Meta-grating
A meta-grating is formed by arranging meta-atoms periodically within a metasurface. Such a meta-grating allows precise control over the deflection of incident light into specific diffraction orders, enabling effective modulation of anomalous diffraction effects. Notably, this design not only improves diffraction efficiency but also achieves an abrupt phase shift covering the full 2π range at the interface. Furthermore, TO algorithms have been introduced into the design process of such meta-gratings. Jiang
et al. [
352] demonstrated that with training on visual data of periodic, topologically enhanced meta-gratings, generative neural networks can produce efficient, topologically intricate devices that can operate across a broad range of diffraction conditions. Subsequent iterative analysis further improves device robustness and efficiency, and these enhanced designs can serve as additional training data to refine the neural network model. As can be seen from Fig. 4(a), at short operating wavelengths, the efficiency histogram produced by the neural network method is similar to the efficiency histogram of the device with only iterative optimization. Nevertheless, GAN-based approaches have shown limited performance when compared to iterative optimization methods, indicating considerable room for improvement. Wen
et al. [
353] proposed a novel GAN training framework and network architecture specifically designed for meta-gratings with complex topologies. Their model facilitates the generation of highly efficient and robust free-form metasurfaces, achieving performance on par with or even exceeding that of gradient-based TO — all at a low computational cost. As illustrated in Fig. 4(b), the progressive growing GAN (PGGAN) integrates three major innovations: progressive network growth to improve the learning of local topological features; a self-attention mechanism for better modeling of global spatial dependencies within device patterns; and incremental expansion of the training dataset to more effectively represent efficient topological variations across the design space.
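The generative approaches above build on the standard conditional GAN training loop, which the minimal PyTorch sketch below illustrates on toy one-dimensional "device patterns" (without the progressive growing or self-attention refinements of PGGAN). The dataset, network sizes, and hyperparameters are placeholders, not values from the cited works.

import torch
import torch.nn as nn

torch.manual_seed(0)
N_PIX, Z_DIM = 64, 16

# Toy "training set": 1D device patterns conditioned on a deflection angle.
angles = torch.rand(2048, 1)
patterns = (torch.sin(torch.linspace(0, 6.28, N_PIX) * (1 + 4 * angles)) > 0).float()

G = nn.Sequential(nn.Linear(Z_DIM + 1, 128), nn.ReLU(), nn.Linear(128, N_PIX), nn.Sigmoid())
D = nn.Sequential(nn.Linear(N_PIX + 1, 128), nn.LeakyReLU(0.2), nn.Linear(128, 1))

opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

for step in range(2000):
    idx = torch.randint(0, len(patterns), (64,))
    real, cond = patterns[idx], angles[idx]
    z = torch.randn(64, Z_DIM)
    fake = G(torch.cat([z, cond], dim=1))

    # Discriminator update: distinguish real patterns from generated ones.
    opt_d.zero_grad()
    d_loss = bce(D(torch.cat([real, cond], 1)), torch.ones(64, 1)) + \
             bce(D(torch.cat([fake.detach(), cond], 1)), torch.zeros(64, 1))
    d_loss.backward(); opt_d.step()

    # Generator update: fool the discriminator.
    opt_g.zero_grad()
    g_loss = bce(D(torch.cat([fake, cond], 1)), torch.ones(64, 1))
    g_loss.backward(); opt_g.step()

# Generate candidate layouts for a new target condition.
z = torch.randn(8, Z_DIM)
candidates = G(torch.cat([z, torch.full((8, 1), 0.3)], dim=1))
print(candidates.shape)  # 8 candidate 64-pixel patterns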
Inverse design algorithms play a crucial role in realizing high-performance, free-form nanophotonic devices. Chen
et al. [
354] revealed that redefining the design space itself can impose fundamental constraints on the development of inversely designed devices. The devices are first conceptualized in an unconstrained latent space and then translated into real physical space, where practical geometric restrictions can be efficiently enforced. Physical modifications to the device are mapped back into the latent representation via backpropagation. As a feasibility demonstration, the authors introduced a parameterization strategy to design efficient meta-gratings by applying strict minimum feature size constraints — using variable-based methods for localization and Global Topology Optimization Networks (GLOnets) for globalization. As illustrated in Fig. 4(c), which showcases the reparameterized design process of a meta-grating, this study utilizes robust, fixed-topology, parametric GLOnets to achieve global top-level optimization of a meta-grating under a 30 nm minimum feature size and 10 nm width variation constraint, ultimately reaching a relatively high efficiency exceeding 90%.
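The reparameterization idea, optimizing in an unconstrained latent space while the physical pattern is produced through a transform that cannot generate sub-resolution features, can be emulated with a filter-and-project mapping as sketched below. This is a generic stand-in for the constraint scheme of Ref. [354]; the filter radius, projection sharpness, and toy objective are illustrative assumptions.

import numpy as np
from scipy.ndimage import gaussian_filter

rng = np.random.default_rng(4)

SIGMA = 4.0   # filter radius in pixels; sets the approximate minimum feature size
BETA = 10.0   # projection sharpness

def latent_to_physical(z):
    # Map unconstrained latent variables to a near-binary physical pattern
    # whose features cannot be smaller than roughly the filter radius.
    smooth = gaussian_filter(z, sigma=SIGMA)          # removes sub-resolution features
    return 1.0 / (1.0 + np.exp(-BETA * smooth))       # soft thresholding toward {0, 1}

def latent_gradient(z, grad_physical):
    # Back-propagate a gradient from physical space to latent space
    # (Gaussian filtering is self-adjoint, so the chain rule is simple).
    rho = latent_to_physical(z)
    return gaussian_filter(grad_physical * BETA * rho * (1.0 - rho), sigma=SIGMA)

# Example: nudge the latent design so the physical pattern matches a target layout.
z = rng.normal(size=(128, 128))
target = np.zeros((128, 128)); target[32:96, 32:96] = 1.0
for _ in range(100):
    rho = latent_to_physical(z)
    grad_rho = -(rho - target)              # gradient of -0.5*||rho - target||^2
    z += 0.5 * latent_gradient(z, grad_rho) # ascent step in the latent space
print("remaining mismatch:", np.mean((latent_to_physical(z) - target) ** 2))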
An important ingredient of inverse design is the rapid prediction of the optical responses corresponding to different combinations of structural parameters. Inampudi
et al. [
355] proposed a method that utilizes ANNs to rapidly establish a numerical mapping between the unit-cell geometry and its diffraction efficiencies. This study employs a metasurface grating composed of periodic dielectric structures with arbitrary shapes. When incident plane waves illuminate the structure at normal incidence, the light is diffracted into 13 distinct orders. Thus, it is necessary to calculate the diffraction efficiency (DE) for each order generated by the arbitrary-shaped unit. Under the specific parameter combinations selected, the mean prediction error of the network remains below 0.04. As illustrated in Fig. 4(d), which shows a representative unit of the meta-grating, the approach demonstrates high predictive accuracy.
Recently, hybrid optimization strategies for customizing meta-gratings have garnered growing interest. As shown in Fig. 4(e), Elsawy
et al. [
356] employed a combination of statistical learning and evolutionary optimization techniques, integrated with a full-wave time-domain discontinuous Galerkin (DGtD) solver, to maximize the performance of phase-gradient metasurfaces. Their study focused on the optimal design of a GaN-based phase-gradient metasurface operating at visible wavelengths. Numerical results demonstrated that with only 150 full-wave simulations, arrays of both rectangular and cylindrical nanopillars achieved diffraction efficiencies exceeding 88% for TM polarization and over 85% for TE polarization, respectively.
2.3 Waveguide-based coupler
Free-space and integrated nanophotonic platforms have become foundational for photon manipulation, guidance, filtering, and transmission via subwavelength-engineered structures (see Refs. [
357−
367]). In response to the increasing demand for highly integrated optical circuits, a wide range of computational optimization techniques have been developed to reduce device footprints while enhancing overall performance. These include inverse design methods [
368], adaptive gradient and adjoint-based optimization [
167,
171,
222,
369−
371], multi-objective optimization algorithms [
372,
373], and multiskilled diffractive neural networks (MDNNs) [
374].
The inverse-designed optical coupler represents a critical component within integrated photonic platforms. Du
et al. [
375] realized an ultra-compact multifunctional integrated photonic platform with a footprint of only 3 mm × 0.2 mm, integrating 86 micro-couplers and 91 phase shifters. The functionality of the platform was experimentally validated through quantum state evolution and photonic neural network tasks — such as handwritten digit recognition. As illustrated in Fig. 5(a), which shows a schematic of the platform, these experiments demonstrated high operational fidelity and achieved an on-chip test accuracy of 87%.
Similar to other nanophotonic interfaces, as shown in Fig. 5(b), Dory
et al. [
376] proposed an optimization-driven inverse design approach to overcome the fabrication constraints of diamond nanostructures and develop functional devices for diamond-based nanophotonic circuits. They demonstrated a grating coupler integrated with a diamond optical waveguide that enables interference between two nanobeam photonic crystal resonators via an inverse-designed waveguide splitter. This design achieved a simulated coupling efficiency of 95%, marking a significant advancement in the development of integrated quantum photonic circuits.
As shown in Fig. 5(c), Gostimirovic
et al. [
377] developed an ANN model to accelerate the design of polarization-insensitive subwavelength grating (SWG) couplers on silicon photonic integrated platforms. The model optimizes SWG-based grating couplers by incorporating inverse design at the structural level to achieve either single- or dual-polarization operation. By mapping the nonlinear input–output behavior of the device into a learnable weight matrix, this method enables rapid prediction of device performance through straightforward linear algebraic operations and nonlinear activation functions. It achieves a computational speed approximately 1830 times faster than conventional numerical simulations while maintaining 93.2% accuracy compared to simulated results.
Beyond the aforementioned studies, additional research efforts have focused on optimizing other types of couplers. Jin
et al. [
378] proposed a scalable computational approach for designing highly efficient and compact on-chip devices capable of coupling light at multiple wavelengths. The designed device enables efficient broadband proximity coupling, covering six distinct wavelengths ranging from 560 nm to 1500 nm. As illustrated in Fig. 5(d), which shows a schematic of the waveguide coupler, this structure achieves high performance across the entire targeted spectrum.
2.4 Beam splitter
A fundamental application of photonics lies in wavelength division multiplexing (WDM) [
379−
382], which increases the data transmission capacity of a single optical waveguide or fiber by employing multiple distinct wavelength channels. Piggott
et al. [
383] developed an inverse design algorithm capable of automatically generating various linear optical components. Users can specify target device functionalities, and the algorithm identifies structural designs that meet those requirements. Based on a local convex optimization framework, the method determines the distribution of dielectric constants and electric fields that satisfy both physical constraints and device performance targets. As a demonstration, a compact wavelength multiplexer, as shown in Fig. 6(a), was designed to separate 1300 nm and 1550 nm light from an input waveguide into two output waveguides. The device occupies an area of only 2.8 μm × 2.8 μm.
Building on this work, Piggott
et al. [
384] proposed an inverse design methodology that incorporates fabrication constraints. This approach was used to design several nanophotonic devices, including a 1 × 3 power splitter, a spatial-mode demultiplexer, a wavelength demultiplexer, and a two-port directional coupler. Experimental results of the 1 × 3 power splitter, as illustrated in Fig. 6(b), showed an insertion loss of 0.642 ± 0.057 dB and excellent power uniformity of 0.641 ± 0.054 dB, indicating low loss and high uniformity. The complex structural features of the device demonstrate the potential of inverse design for fabricating practical components in integrated photonic systems. Building on this foundation, the team subsequently proposed a novel one-dimensional photonic crystal split-beam [
385] nanocavity, optimized through the integration of a deterministic design method with the hill-climbing algorithm, and experimentally demonstrated to achieve an ultra-high quality factor.
Su
et al. [
386] developed the Stanford photonic inverse design (SPIN) algorithm for nonlinear nanophotonic inverse design. By modularizing the inverse design workflow into interchangeable subcomponents, SPIN enables systematic and flexible exploration of diverse design strategies. The authors applied inverse design to generate hundreds of 3D wavelength demultiplexer designs and extensively investigated local minima in the design space, as illustrated in Fig. 6(c), thereby providing critical insights into the impact of initial conditions on device performance.
2.5 Light sensor
The ability to visually adapt and autonomously adjust responses to incoming light is a fundamental characteristic of intelligent biomimetic artificial vision systems. Various types of neuromorphic photocells have been developed, primarily based on low-dimensional materials [
387], metal oxides [
388,
389], ferroelectric materials [
390], organic compounds [
391−
393], and perovskite photoactive materials [
394,
395]. Thanks to their unique photosensitivity, these materials effectively utilize their photoconductive properties to detect and collect information from external optical stimuli. This functionality allows them to be integrated into advanced sensing and imaging systems that require precise and rapid optical signal acquisition.
As shown in Fig. 7(a), Lee
et al. [
396] developed a multidimensional photodetector based on an array of programmable light-responsive transistors (PLRTs) fabricated from black phosphorus (bP). The conductivity and photoresponsivity of these bP-PLRT devices can be dynamically programmed with 5-bit precision, both spatially and temporally, by modulating charge within the gate dielectric layer through combined electrical and optical control. This programmability enables the implementation of an onboard CNN. The sensor array is capable of simultaneously capturing optical images over a broad infrared spectral bandwidth and performing on-chip inference tasks with 92% image recognition accuracy, making it suitable for constructing more advanced visual perception neural networks.
As shown in Fig. 7(b), Zhu
et al. [
397] designed a novel flexible photoelectric sensor array comprising one thousand and twenty-four pixels, which utilizes a hybrid active material system of carbon nanotubes and perovskite quantum dots (CsPbBr
3-QDs) to enable efficient neuromorphic vision sensing. The device exhibits exceptional photosensitivity, with a measured responsivity of 5.1 × 10
7 A/W and a specific detectivity of 2 × 10
16 Jones. Furthermore, the sensor array was successfully trained via neuromorphic reinforcement learning using weak light pulses at an intensity of 1 μW/cm
2. These results underscore its potential to emulate the flexibility, complexity, and adaptability of biological visual processing, providing a promising platform for artificial neuromorphic vision systems. In a related effort, Luo
et al. [
374] proposed a multifunctional diffractive neuromorphic network implemented through an on-chip metasurface device, as shown in Fig. 7(c). By integrating a well-established complementary metal-oxide-semiconductor (CMOS) camera detector with the metasurface, they constructed a chip-scale architecture capable of processing images directly at the physical level. This system enables ultra-fast and highly energy-efficient image processing, with compelling applications in machine perception, autonomous vehicles, and advanced medical diagnostics.
Fiber-optic sensing, enhanced by artificial intelligence, now achieves higher accuracy and adaptability, enabling multifunctional signal detection and improved robustness against environmental disturbances. Wang
et al. [
398] proposed a pattern recognition strategy combining CNNs and LSTMs with phase-sensitive optical time-domain reflectometry for anomaly detection, achieving superior results on benchmark datasets. Expanding on this concept, Sabih and Vishwakarma [
399] applied CNNs with bidirectional LSTMs to optical flow analysis, integrating domain knowledge to further improve classification accuracy. Leveraging these deep learning techniques in biomedical sensing, a mesh microbend fiber-optic sensor [
400] was developed for perioperative monitoring of heart rate and respiratory rate, where an EMD-LSTM-CNN (ELC) model outperformed traditional fast Fourier transform (FFT) and wavelet transform (WT) methods, offering enhanced feature extraction and strong clinical potential.
2.6 Optical interference unit
Recently, owing to the unique advantages of nano-optics, such as inherent parallelism, low power consumption, and propagation at the speed of light, there has been growing interest in using it to perform specific machine learning tasks [
401,
402]. In such systems, optical interference and diffraction are harnessed to carry out the underlying computations. The next two subsections introduce optical interference neural networks and optical diffractive neural networks, respectively.
As shown in Fig. 8(a), Shen
et al. [
229] introduced a novel all-optical neural network architecture that harnesses the inherent advantages of optical computing to achieve a computational throughput at least two orders of magnitude higher than existing technologies, along with a three-order-of-magnitude improvement in power efficiency for conventional learning tasks. The architecture was experimentally realized using a cascaded array of Mach–Zehnder interferometers (MZIs), achieving a simulation accuracy of 91.7% in vowel recognition tasks, thereby demonstrating its practical applicability.
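The building block of such interferometer meshes is the 2 × 2 MZI, modeled below as two 50:50 couplers and two tunable phase shifters; embedding these blocks on neighboring mode pairs composes an N × N unitary, which is how the mesh performs the matrix multiplications of a network layer in the optical domain. The 4-mode example and the random phase settings are arbitrary illustrations.

import numpy as np

def mzi(theta, phi):
    # 2x2 MZI: phase shifter, 50:50 coupler, phase shifter, 50:50 coupler.
    bs = np.array([[1, 1j], [1j, 1]]) / np.sqrt(2)      # 50:50 directional coupler
    p = lambda a: np.diag([np.exp(1j * a), 1.0])        # single-arm phase shifter
    return bs @ p(theta) @ bs @ p(phi)

def embed(u2, i, n):
    # Place a 2x2 block on modes (i, i+1) of an n-mode identity.
    U = np.eye(n, dtype=complex)
    U[i:i + 2, i:i + 2] = u2
    return U

# A small 4-mode mesh: a sequence of MZIs acting on alternating mode pairs.
n = 4
rng = np.random.default_rng(5)
mesh = np.eye(n, dtype=complex)
for pairs in [(0, 2), (1,), (0, 2), (1,)]:
    for i in pairs:
        mesh = embed(mzi(*rng.uniform(0, 2 * np.pi, 2)), i, n) @ mesh

# The mesh is unitary (lossless), so it implements a matrix-vector product optically.
print("unitary check:", np.allclose(mesh.conj().T @ mesh, np.eye(n)))
x_in = rng.normal(size=n) + 1j * rng.normal(size=n)     # input optical amplitudes
print("output amplitudes:", mesh @ x_in)

In a trained network, the phase settings theta and phi of each MZI would be the adjustable weights of the layer.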
However, such optical networks often lack efficient training protocols and their physical implementation remains dependent on models simulated in electronic computers [
403]. To facilitate on-chip training of ONNs, Hughes
et al. [
403] employed the adjoint variable method (AVM) to develop a photonic analogue of the backpropagation algorithm [
404]. As shown in Fig. 8(b), they demonstrated the training process using a digitally simulated photonic ANN. In their work, the trained optical neural network was used to implement a logical XOR gate, successfully learning the gate function after approximately 400 iterations. The proposed method is also applicable to more complex computational tasks.
In addition to optical interference units (OIUs), a variety of functional computing units have been conceptualized and established as foundational components of ONNs [
405−
408]. As shown in Fig. 8(c), Estakhri
et al. [
407] proposed a metamaterial-based analog computing platform capable of solving integral equations through interaction with incident electromagnetic waves. When arbitrary input waveforms serve as functions corresponding to specific integral operators, the system yields an electromagnetic field that represents the solution. The performance of the metastructure was verified through design, numerical simulation, and experimental validation at microwave frequencies. Experimental results indicated that the system reaches a steady-state resonant response within fewer than 300 cycles of a monochromatic input wave. Based on these observations, the authors estimated that the solution time reaches nanoseconds at microwave frequencies and picoseconds in the visible light regime. This approach offers a promising route toward compact, high-speed, and integrated on-chip analog computing systems.
Khoram
et al. [
409] employed a nanophotonic neural medium (NNM) to demonstrate and execute computer vision tasks. During training, a 20 × 20-pixel image is converted into a vector and encoded into the spatial intensity profile of the input light. Within the NNM, the nanostructure induces strong optical interference that modulates the propagation of light, directing it toward ten output ports corresponding to different classes. The port with the highest optical energy intensity is identified as the predicted category. The iterative training process, which utilizes mini-batch stochastic gradient descent, is illustrated in Fig. 8(d).
2.7 Optical diffractive neural network
Optical diffraction-based devices are also capable of performing a wide range of machine learning tasks [
410]. Lin
et al. [
215] proposed a novel machine learning approach through an all-optical diffractive deep neural network (D
2NN) architecture. This system leverages passive diffraction layers to computationally implement various functions through deep learning. A D
2NN comprises a series of engineered diffractive layers, each designed to modulate incident light. Every point on these layers acts as an optical neuron, serving as a secondary source that re-radiates light. The properties of the emitted wave — both amplitude and phase — are determined by the interaction between the incoming optical field and the locally defined complex transmission or reflection coefficients. Through the cumulative modulation across multiple layers, D
2NNs achieve task-specific optical transformations without relying on electronic processing. Connections between neurons in successive layers are established through secondary wave interference, modulated by prior layer diffraction patterns and local optical coefficients. The transmission or reflection coefficient at each point functions as a multiplicative bias term, which is optimized iteratively during training. As illustrated in Fig. 9(a), which depicts the network architecture and experimental outcomes, the cognitive capability of the D
2NN was experimentally validated using a 3D-printed handwritten digit classifier. Numerical simulations of a five-layer D
2NN achieved a classification accuracy of 91.75% on a test set of approximately 10,000 images. In contrast to earlier optoelectronic learning systems [
229,
411−
413], the D
2NN framework provides a fully optical deep learning platform based on passive diffractive components, operating at the speed of light with high energy efficiency.
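The optical mechanics of a D²NN can be emulated numerically as alternating free-space propagation (here via the angular spectrum method) and multiplication by each layer's complex transmission coefficients; in a trained network those coefficients would be optimized by error backpropagation. The grid size, wavelength, layer spacing, number of layers, and the random phase masks below are illustrative placeholders.

import numpy as np

rng = np.random.default_rng(6)

N, DX = 128, 4e-7        # 128x128 grid, 0.4 um pixel pitch (illustrative)
WAVELENGTH = 5.32e-7     # 532 nm
DZ = 5e-5                # 50 um spacing between diffractive layers

def propagate(field, dz):
    # Angular-spectrum propagation of a complex field over distance dz.
    fx = np.fft.fftfreq(N, d=DX)
    FX, FY = np.meshgrid(fx, fx, indexing="ij")
    kz_sq = (1.0 / WAVELENGTH) ** 2 - FX ** 2 - FY ** 2
    kz = 2 * np.pi * np.sqrt(np.maximum(kz_sq, 0.0))     # drop evanescent components
    H = np.exp(1j * kz * dz) * (kz_sq > 0)
    return np.fft.ifft2(np.fft.fft2(field) * H)

# Five passive diffractive layers; each pixel applies a phase-only
# transmission coefficient (random here, trainable in a real D2NN).
layers = [np.exp(1j * rng.uniform(0, 2 * np.pi, (N, N))) for _ in range(5)]

# Input: an amplitude-encoded "image" illuminated by a plane wave.
field = np.zeros((N, N), dtype=complex)
field[48:80, 48:80] = 1.0

for phase_mask in layers:
    field = propagate(field, DZ)      # secondary-wave interference between layers
    field = field * phase_mask        # neuron-wise complex modulation

field = propagate(field, DZ)          # propagate to the detector plane
intensity = np.abs(field) ** 2

# Read out ten detector regions and pick the brightest as the predicted class.
regions = np.array_split(np.arange(N), 10)
scores = [intensity[:, cols].sum() for cols in regions]
print("predicted class:", int(np.argmax(scores)))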
Several studies have investigated the use of ONNs operating in Fourier space for image processing applications. For example, Yan
et al. [
414] proposed a Fourier-space diffractive all-optical neural network (F-D
2NN) situated at the Fourier plane within an optical imaging system. This configuration allows direct measurement of salient object features at the sensor. The study demonstrated the network’s capability for all-optical visual saliency detection through its application in cell segmentation. As shown in the right panel of Fig. 9(b), the saliency detection results clearly outline cell boundaries, accurately capturing their morphology and spatial distribution. The proposed architecture enables high-speed image processing and computer vision tasks at the speed of light, with its performance in high-accuracy visual saliency detection and object classification thoroughly validated through extensive numerical simulations.
More extensive neural network-like (NN-like) diffractive optical systems have been explored in recent literature, including reconfigurable diffractive processing units (DPUs), ensemble learning in diffractive optical networks [
415], and broadband diffractive neural networks [
416]. Zhou
et al. [
417] developed a reconfigurable DPU that can be programmed to implement various neural network architectures, facilitating large-scale neuromorphic optoelectronic computation. In their work, classification tasks were conducted on both handwritten digit datasets and human motion datasets. After adaptive training, the network achieved accuracies of 96.0% and 96.3%, respectively, resulting in highly accurate performance for rapid image and video recognition across multiple benchmark datasets. As illustrated in Fig. 9(c), the system demonstrates exceptional experimental accuracy. Furthermore, its computational capability outperforms that of state-of-the-art electronic computing platforms.
2.8 Other applications
The capability of AI algorithms to autonomously learn the nonlinear relationships between meta-devices and their electromagnetic responses has established them as an indispensable tool for designing high-performance optical devices. In addition to the optical components previously discussed, AI techniques have been extensively applied to a growing range of nanophotonic systems — including Fano resonators [
418], noisy intermediate-scale quantum (NISQ) devices [
419], photonic crystals [
420], photon extractors [
368], and particle accelerators [
421] — significantly broadening the scope of nanodevice applications. In this section, we will further introduce applications in holographic imaging, nonlinear optical fibers, and optical information storage. These three examples illustrate the profound synergy between nanophotonic devices and artificial intelligence.
Optical holography enables the simultaneous acquisition of both phase and amplitude information of an object [
69], and can even accurately reconstruct arbitrary three-dimensional vector field distributions of a wavefront [
422]. Early research primarily focused on metasurface-based display and encryption of single holographic images [
423,
424]. Recently, multiplexing has emerged as a superior alternative [
425−
430]. Wei
et al. [
67] established a spin-multiplexed metasurface inverse design platform based on a bidirectional deep neural network (Bi-DNN) model for rapidly generating high-pixel-density metasurfaces. As shown in Fig. 10(a), the meta-lens designed using this platform achieved focusing efficiencies of 54.06% and 50.49% in the two spin channels, respectively, with the near-field reconstructed holographic image closely matching the target image. Furthermore, by integrating the Bi-DNN model with advanced optimization algorithms, both electromagnetic response prediction and structural design can be enhanced, thereby supporting the development of high-performance photonic devices.
Teğin
et al. [
431] proposed and experimentally demonstrated an optical computing architecture termed the scalable optical learning operator (SOLO). This framework integrates a fixed nonlinear multimode fiber (MMF) mapping stage with a single-layer digital neural network to form a reconfigurable processing system. After training on a large dataset, the network can accurately identify the corresponding outputs, which are then recorded via a camera. The study conducted classification and prediction across multiple datasets, achieving an accuracy of 83% in diagnosing COVID-19 from lung X-ray images. As illustrated in Fig. 10(b), the experimental setup confirms the high stability of the SOLO framework.
Optical information storage [
432−
435] demonstrates broad application potential in meeting the demands of massive data storage owing to its high capacity, exceptional stability, and long lifespan. However, its further advancement remains challenged by the need to balance readout quality and error rates. In recent years, the integration of optical information storage with deep learning has opened new pathways toward high-fidelity data storage. Wang
et al. [
436] proposed a novel deep learning-based framework for optical storage, which converts information retrieved from a birefringence measurement system into binary data and feeds it into a neural network for classification, achieving nearly 100% accuracy. As illustrated in Fig. 10(c), experiments further confirmed that the deep learning model enables error-free information extraction even under external environmental interference. This experimental optical storage architecture lays the groundwork for realizing long-lifetime, highly stable optical storage systems capable of performing reliably in dynamic conditions, showcasing promising application prospects.
Nanophotonics represents an emerging frontier focused on manipulating light–matter interactions at the nanoscale, where the design of ultra-compact and high-performance devices remains a central challenge. Traditional empirical design methods are often computationally intensive and offer limited flexibility. In response, deep learning has enabled powerful inverse design strategies that begin with target optical functionalities and efficiently generate corresponding nanophotonic structures. This review has discussed how machine learning [
163,
437−
442] facilitates accurate modeling and optimization of nanostructures. Advanced algorithms, including federated learning [
443,
444], transfer learning [
445−
447], self-adaptive algorithms [
448,
449], and distributed learning techniques [
287,
450−
453], have been successfully applied to the design of meta-lens, meta-gratings, beam splitters, and other key photonic components. In addition, hybrid optimization strategies [
176,
284,
454,
455] and other optimization algorithms [
456−
463] have also garnered growing interest. These approaches significantly improve design efficiency, overcome local minima, and broaden solution space coverage.
3 Perspective and conclusion
Overall, AI has demonstrated considerable advantages in nanophotonic design, such as generating performance-targeted device designs within milliseconds and escaping local minima commonly encountered in traditional optimization processes. However, these methods also face several challenges: training neural networks requires large amounts of simulation data, and they remain sensitive to ill-posed inverse problems with multiple solutions, often resulting in poor convergence and limited coverage of the solution space. Future research may focus on physics-informed neural networks and reversible neural architectures. The former incorporates physical constraints such as Maxwell’s equations to reduce the demand for training data, while the latter addresses multi-solution issues through invertible network design. Combining AI with metaheuristic or gradient-based optimization is expected to further enhance design efficiency. Meanwhile, emerging photonic technologies such as ONNs, PICs, and quantum photonics provide new pathways toward all-optical AI hardware, enabling high-speed, low-power optical computing and real-time information processing. Although the integration of deep learning and nanophotonic device design still encounters bottlenecks — including limited algorithm generalizability and manufacturability constraints — breakthroughs in key technologies could unlock significant potential for developing novel photonic devices.