Nanophotonics Laboratory, School of Optical and Electronic Information, Huazhong University of Science and Technology, Wuhan 430074, China
zhaoming@mail.hust.edu.cn
Received 2025-08-07; Accepted 2026-01-25; Published Online 2026-02-11
Abstract
The optical diffractive neural network (ODNN), based on the free-space propagation of light waves, exhibits significant advantages, including ultra-high speed, low power consumption, and parallel computation. However, this technology faces challenges in practical applications, particularly concerning fabrication and alignment accuracy, with stringent requirements on manufacturing processes. In this paper, a class of hybrid optical diffractive neural networks (H-ODNNs) is designed by constructing continuous passive phase modulation layers using diffraction neurons of varying sizes. Three representative tasks (digit recognition, image processing, and wavelength multiplexing) substantiate its superior performance in enhancing robustness. Compared with a vaccinated network (vaccination is a common method for enhancing robustness), the H-ODNN requires no vaccination training, reduces the average training time by approximately 50%, and even achieves superior performance. Additionally, owing to the larger size of some diffraction neurons, the H-ODNN reduces fabrication complexity and improves manufacturing yield. This work provides a new concept for the design of ODNNs.
In recent years, the rapid advancement of computer technology and hardware has significantly accelerated the progress of deep learning [1,2]. Although neural networks exhibit outstanding performance, their computational speed heavily relies on expensive GPUs, which hinders integration and miniaturization. All-optical computing refers to systems that utilize light as both the information carrier and processing medium, leveraging optical properties such as intensity, phase, frequency, and polarization to perform information processing and computation [3–6]. Compared to conventional computing platforms, optical computing offers notable advantages, including low power consumption, ultra-high speed, and large-scale parallelism. Building on research in computer-based deep neural networks and all-optical computing, researchers have proposed a novel optical computing platform known as the optical diffractive neural network (ODNN) [7–14]. This platform operates on the principle of free-space light propagation, performing computational tasks at near-light-speed through a series of diffraction layers. The framework of ODNN offers greater design flexibility, enabling it to tackle more complex tasks. With the continuous efforts of researchers, ODNN has achieved significant progress across various fields, including classification [15–21], image processing [22–24], and matrix computation [25].
Although ODNNs offer significant advantages in terms of speed and power efficiency, they face substantial challenges in fabrication and alignment. The neuron size of an ODNN is closely related to its operating wavelength. As the wavelength decreases, the supported neuron size becomes smaller, leading to higher network integration, but also to greater challenges in fabrication and alignment. For networks operating in the visible-light band with neuron sizes of about 500 nm, extreme sensitivity to fabrication and alignment errors becomes a critical limitation that must be addressed to achieve ultra-high integration and miniaturization. As shown in Fig. 1a, for the traditional network structure (with a fixed diffraction layer size), a large number of neurons yields high ideal performance but poor robustness, whereas a small number of neurons yields poor ideal performance but good robustness. Ideal performance and robustness to errors are therefore mutually restrictive. Currently, researchers have attempted to achieve both high robustness and high accuracy by introducing vaccination training into conventional networks with a large number of neurons [22,26,27]. However, this approach presents two challenges. On the one hand, it requires comprehensive consideration of various errors, and the training performance is highly dependent on the specific conditions. On the other hand, incorporating vaccination training increases the complexity of network training, significantly lengthening training time and degrading overall network performance (Fig. 1b).
In this study, we propose a class of hybrid optical diffractive neural networks (H-ODNNs) with high robustness, inspired by the variable hidden layers of computer-based neural networks. Unlike conventional ODNNs, H-ODNNs incorporate diffractive layers with variable neuron sizes, as illustrated in Fig. 1a. The performance of 20 distinct network configurations (including various hybrid designs and a vaccinated network) under different axial and lateral misalignment conditions was evaluated through three representative tasks (digit recognition, image processing, and wavelength multiplexing). The results reveal that an H-ODNN can simultaneously possess high performance and high robustness. Notably, one H-ODNN configuration demonstrates performance comparable or superior to the vaccinated network across all misalignment conditions; since it eliminates the need for additional resilience training, it significantly shortens the overall training time compared with the vaccinated network. Crucially, owing to the large size of the neurons in some diffractive layers, the H-ODNN imposes relatively lower fabrication requirements, which can bring a higher yield. This research provides a novel approach to designing ODNNs, offering advantages in both fabrication and practical applications.
2 Methods and principles
2.1 Design of H-ODNN
As shown in Fig. 2, the H-ODNN consists of two phase modulation layers of identical size (W = 100). To facilitate performance comparison among the various network structures, we fixed the number of neurons in one diffraction layer at 100 × 100, while the other diffraction layer contained m × m neurons (m = 10, 20, …, 90), forming hybrid structural configurations. The networks were named according to the position of the diffraction layer with fewer neurons, where "−" indicates that this layer is positioned as the first diffraction layer and "+" indicates placement as the second diffraction layer. When both diffraction layers have 100 × 100 neurons, the configuration corresponds to the conventional structure. For the 19 network structures considered in total (18 hybrid structures and one conventional structure), the spacing between layers was consistently set at 60, and the distance from the second diffraction layer to the output plane was fixed at 160.
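As a sketch, the 19 structural configurations described above can be enumerated programmatically; the "−m"/"+m" labels below are illustrative stand-ins for the paper's network names:

```python
# Enumerate the 19 network structures: one conventional network
# (100 x 100 neurons in both layers) plus 18 hybrids in which one
# layer is reduced to m x m neurons and placed first ("-") or
# second ("+"). Each tuple: (name, layer-1 width, layer-2 width).
configs = [("conventional", 100, 100)]
for m in range(10, 100, 10):           # m = 10, 20, ..., 90
    configs.append((f"-{m}", m, 100))  # reduced layer first
    configs.append((f"+{m}", 100, m))  # reduced layer second

print(len(configs))  # 19
```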
The field transmission function of each layer is defined as t_l(x, y) = exp[iφ_l(x, y)], where φ_l(x, y) is the phase distribution of the l-th diffraction layer. In the hybrid network, the phase distribution can be expressed as:

φ_l = Φ_l ⊗ J_{p×p},

where Φ_l is the randomly initialized, trainable phase matrix of the diffraction layer, J_{p×p} is a p × p matrix consisting entirely of ones, and ⊗ is the Kronecker product symbol. The gradient of the loss function with respect to the phase parameters is calculated, and the parameters are updated iteratively using the gradient descent method to ensure the convergence of the loss function. The details of forward propagation and training can be found in the Supplemental Material.
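The Kronecker-product construction above can be sketched numerically; the function name and the random initialization below are illustrative assumptions, not the paper's code:

```python
import numpy as np

W = 100  # full diffraction layer width in neurons (from the paper)

def hybrid_phase(m: int, seed: int = 0) -> np.ndarray:
    """Expand an m x m trainable phase matrix to the full W x W layer
    via a Kronecker product with an all-ones block: phi = Phi (x) J_pxp."""
    assert W % m == 0, "m must divide the layer width"
    p = W // m                        # scaling factor: neuron-size multiplier
    rng = np.random.default_rng(seed)
    Phi = rng.uniform(0.0, 2 * np.pi, size=(m, m))  # trainable phase matrix
    return np.kron(Phi, np.ones((p, p)))            # full W x W phase map

phi = hybrid_phase(20)   # hybrid layer with 20 x 20 effective neurons (p = 5)
t = np.exp(1j * phi)     # field transmission function t = exp(i * phi)
print(phi.shape)         # (100, 100)
```

Each 5 × 5 block of `phi` shares one trainable value, which is exactly what makes the effective neuron larger.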
ODNNs are complex multi-layer diffractive structures, in which fabrication and assembly errors between layers are critical parameters influencing network performance. In simulations, the layers of the network are perfectly aligned according to predefined configurations; in physical implementations, however, the actual layers often deviate from their ideal positions. For ODNNs operating in the terahertz band, the required absolute alignment precision is relatively low because of the longer operating wavelengths and larger neuron sizes. In contrast, for ODNNs working in the visible spectrum, the shorter wavelengths result in significantly smaller diffractive neuron dimensions (approximately 500 nm), posing substantial experimental challenges and considerably increasing fabrication costs. This study primarily investigates two common types of misalignment error, lateral and axial; the way each error is introduced is shown in Fig. 2. The lateral misalignment is set as the offset distance in the y-direction of the second layer; the axial misalignment is set as the offset distance in the z-direction between the first layer and the second layer.
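A minimal sketch of how the two misalignments can be injected into a two-layer forward pass, assuming angular-spectrum propagation and a wrap-around (`np.roll`) lateral shift as simplifications; all parameter names are our own:

```python
import numpy as np

def angular_spectrum(u, z, wavelength, dx):
    """Propagate field u a distance z via the angular-spectrum method
    (evanescent components are clipped to zero phase)."""
    n = u.shape[0]
    fx = np.fft.fftfreq(n, d=dx)
    FX, FY = np.meshgrid(fx, fx)
    arg = np.maximum(1 - (wavelength * FX) ** 2 - (wavelength * FY) ** 2, 0)
    H = np.exp(1j * 2 * np.pi / wavelength * z * np.sqrt(arg))
    return np.fft.ifft2(np.fft.fft2(u) * H)

def forward(u_in, phi1, phi2, z1, z2, wl, dx, dy_pix=0, dz=0.0):
    """Two-layer diffractive forward pass with optional misalignments:
    dy_pix shifts layer 2 laterally (in pixels, wrap-around);
    dz perturbs the layer-1 to layer-2 spacing (axial error)."""
    u = angular_spectrum(u_in * np.exp(1j * phi1), z1 + dz, wl, dx)
    u = u * np.exp(1j * np.roll(phi2, dy_pix, axis=0))   # misaligned layer 2
    u = angular_spectrum(u, z2, wl, dx)
    return np.abs(u) ** 2                                # detected intensity

# Example: ideal vs. laterally misaligned output for random phase layers
rng = np.random.default_rng(0)
phi1 = rng.uniform(0, 2 * np.pi, (64, 64))
phi2 = rng.uniform(0, 2 * np.pi, (64, 64))
u0 = np.ones((64, 64))
I_ideal = forward(u0, phi1, phi2, 30e-6, 30e-6, 500e-9, 500e-9)
I_shift = forward(u0, phi1, phi2, 30e-6, 30e-6, 500e-9, 500e-9, dy_pix=1)
```

Since |H| = 1 everywhere, propagation conserves energy, so the total intensity on the output plane equals that of the input field.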
2.2 Principle of robustness
The output light field of a plane coherent wave after modulation by an arbitrary diffraction layer can be expressed as:

U(x, y) = T(x/p, y/p) ∗ h(x, y; z),

where U(x, y) is the light field distribution at the output plane, T is the complex transmittance distribution of the layer, p is the scaling factor related to the size of the neuron, h is the diffraction convolution kernel (∗ denotes convolution), and z is the diffraction propagation distance.

When there is a lateral misalignment in the x or y direction in the ODNN (taking the y direction as an example), the output light field can be expressed as:

U_Δy(x, y) = T(x/p, (y − Δy)/p) ∗ h(x, y; z),

where Δy is the deviation in the y direction. The difference between this output light field and the ideal light field can be expressed as:

ΔU(x, y) = [T(x/p, (y − Δy)/p) − T(x/p, y/p)] ∗ h(x, y; z).

Expanding the above equation and retaining the first-order term, we get:

ΔU(x, y) ≈ −(Δy/p) ∂T/∂y|_(x/p, y/p) ∗ h(x, y; z).

From this expression, it can be seen that as the value of p increases (i.e., when a layer with a larger neuron size is used), the factor Δy/p approaches 0. Therefore, the difference ΔU in the output light field also approaches 0, implying that the impact of the lateral misalignment is minimized.
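This 1/p scaling can be checked numerically: for a fixed one-pixel shift, the relative change of a Kronecker-expanded transmittance shrinks as the neuron size p grows. This is a toy check under assumed random phases, not the paper's simulation:

```python
import numpy as np

rng = np.random.default_rng(1)
W, shift = 120, 1          # layer width in pixels; fixed 1-pixel lateral offset

def shift_error(p):
    """Relative change of a Kronecker-expanded transmittance under a
    fixed 1-pixel shift; only block boundaries change, so the error
    shrinks roughly like 1/sqrt(p) in norm (like 1/p in changed area)."""
    m = W // p
    Phi = rng.uniform(0, 2 * np.pi, (m, m))
    T = np.exp(1j * np.kron(Phi, np.ones((p, p))))
    dT = np.roll(T, shift, axis=0) - T
    return np.linalg.norm(dT) / np.linalg.norm(T)

errs = [shift_error(p) for p in (1, 2, 4, 8)]
print(errs)   # strictly decreasing with p
```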
When there is an axial misalignment in the ODNN, the output light field can be expressed as:

U_Δz(x, y) = T(x/p, y/p) ∗ h(x, y; z + Δz),

where Δz is the deviation in the interlayer distance. The difference between this output light field and the ideal light field can be expressed as:

ΔU(x, y) = T(x/p, y/p) ∗ [h(x, y; z + Δz) − h(x, y; z)].

By differentiating this equation with respect to Δz, we get:

∂ΔU/∂Δz = T(x/p, y/p) ∗ ∂h/∂z|_(z + Δz).

From this expression, it can be seen that as the value of p increases, T is confined to lower spatial frequencies, for which the kernel h varies slowly with z, so the derivative of the output light field difference approaches 0. This implies that the output field remains nearly unchanged under the spacing deviation, and thus the effect of the axial misalignment is minimized.
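The axial argument can likewise be probed numerically: under angular-spectrum propagation, a layer with larger neurons (spectrum confined to low spatial frequencies) produces an output field that changes far less for the same spacing deviation Δz. The grid and wavelength values below are illustrative assumptions:

```python
import numpy as np

def propagate(u, z, wl, dx):
    """Angular-spectrum propagation over distance z (evanescent clipped)."""
    n = u.shape[0]
    fx = np.fft.fftfreq(n, d=dx)
    FX, FY = np.meshgrid(fx, fx)
    arg = np.maximum(1 - (wl * FX) ** 2 - (wl * FY) ** 2, 0)
    H = np.exp(1j * 2 * np.pi / wl * z * np.sqrt(arg))
    return np.fft.ifft2(np.fft.fft2(u) * H)

def axial_sensitivity(p, dz, W=128, wl=500e-9, dx=500e-9, z=30e-6, seed=2):
    """Relative change of the propagated field when z is perturbed by dz,
    for a transmittance with effective neuron size p."""
    rng = np.random.default_rng(seed)
    m = W // p
    T = np.exp(1j * np.kron(rng.uniform(0, 2 * np.pi, (m, m)),
                            np.ones((p, p))))
    u1 = propagate(T, z, wl, dx)
    u2 = propagate(T, z + dz, wl, dx)
    return np.linalg.norm(u2 - u1) / np.linalg.norm(u1)

s_small = axial_sensitivity(p=1, dz=1e-6)   # 500 nm neurons
s_large = axial_sensitivity(p=8, dz=1e-6)   # 4 um neurons
print(s_small, s_large)   # large-neuron layer is far less sensitive
```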
2.3 Principle of network vaccination
In neural network training, vaccination is a technique that enhances a model's robustness to noise or outliers in the input data by deliberately introducing controlled noise or misalignments. The vaccination technique has also been widely applied in the training of ODNNs. Specifically, different types of errors are introduced into the training process. Random lateral and axial misalignments, Δd_xy and Δd_z, are applied to the diffraction layer for vaccination:

Δd_xy ~ U(−Δ_xy, Δ_xy),  Δd_z ~ U(−Δ_z, Δ_z),

where U(·) denotes a random uniform distribution, and Δd_xy and Δd_z represent the lateral and axial displacement errors of the diffraction layer, respectively. Additionally, Δ_xy and Δ_z denote the maximum error magnitudes for the respective parameters. In this paper, we conduct vaccination training on the conventional network; the trained network is referred to as the vaccinated network.
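A minimal sketch of drawing one set of vaccination errors per training step; the bound values are illustrative assumptions, not the paper's training settings:

```python
import numpy as np

# Assumed maximum error magnitudes (illustrative, not from the paper)
DELTA_XY_MAX = 2     # lateral shift bound, in pixels
DELTA_Z_MAX = 5e-6   # axial spacing error bound, in metres

def sample_vaccination_errors(rng):
    """Draw one set of random misalignments, uniformly within the
    allowed error bounds, to apply to the diffraction layer for
    this training step."""
    dy = int(rng.integers(-DELTA_XY_MAX, DELTA_XY_MAX + 1))  # lateral, pixels
    dz = float(rng.uniform(-DELTA_Z_MAX, DELTA_Z_MAX))       # axial, metres
    return dy, dz

rng = np.random.default_rng(0)
draws = [sample_vaccination_errors(rng) for _ in range(1000)]
```

Resampling the errors at every step is what forces the trained phases to tolerate the whole misalignment range rather than one fixed offset.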
3 Simulations and verifications
To systematically validate and evaluate the performance advantages and robustness of the proposed H-ODNNs under physical misalignments, three representative optical computing tasks are designed in this section: digit recognition, image denoising, and wavelength classification. For each task, we construct and train 18 different H-ODNNs, 1 conventional ODNN, and 1 vaccinated ODNN. The performance of these 20 networks under lateral and axial misalignments is compared to assess their robustness. Additionally, the training processes of 4 typical networks (two hybrid structures, the conventional structure, and the vaccinated structure) are compared in terms of per-epoch training time, convergence epochs, and loss values, to evaluate the optimization difficulty and computational resource requirements of the different structures. Detailed training configurations for all network structures corresponding to each task are provided in the Supplementary Material.
3.1 Digit recognition
Image classification, such as handwritten digit recognition, is one of the typical applications of ODNNs [15–21], attracting extensive research interest in both model design and physical implementation. In this section, we design a class of ODNN for digit image classification (digits 0 to 5), as shown in Fig. 3a.
The accuracy on the testing set of the 20 trained networks under lateral and axial misalignments is presented in Fig. 3b and c, respectively. Under ideal conditions, the classification accuracies of the 4 selected networks are 88.58%, 90.33%, 91.92%, and 81.75%, respectively. When subjected to lateral misalignments, the 4 networks experience accuracy declines of 1.06%, 12.89%, 30.91%, and 0.23% on average, respectively. Under axial misalignments, their accuracies decline by an average of 0.58%, 4.67%, 11.05%, and 0.34%, respectively. Here, the average performance decline is defined as the average difference between the central (ideal) performance and the performance under the various misalignment conditions. Under all tested misalignment conditions, the performance of the selected hybrid network consistently surpasses that of the vaccinated network. For these 4 networks, the per-epoch training time, convergence epoch number, and loss values are shown in Fig. 3d–f, respectively. The per-epoch training time of the hybrid network is 24.5% shorter than that of the vaccinated network, while its convergence epochs and loss values are comparable to those of the conventional network. The complete simulation results for all 20 networks can be found in the Supplemental Material.
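The average-decline metric defined above reduces to a one-line computation (the accuracy values below are toy numbers, not the paper's data):

```python
def average_decline(central, misaligned):
    """Average performance decline: mean difference between the ideal
    (central) value and each measurement under misalignment."""
    return sum(central - v for v in misaligned) / len(misaligned)

# Illustrative accuracies (percent) under several lateral offsets
print(average_decline(88.58, [88.0, 87.5, 87.2]))
```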
3.2 Image denoising
Image processing is a promising research direction for ODNNs. These networks offer significant advantages in computational power consumption and processing speed compared to traditional image processing methods [22–24]. In this section, we design a class of ODNN for motion-blurred image restoration, as shown in Fig. 4a.
The peak signal-to-noise ratio (PSNR) variations on the testing set for the 20 trained networks under lateral and axial misalignments are presented in Fig. 4b and c. Under ideal conditions, the PSNR values of the 4 selected networks are 17.39 dB, 18.86 dB, 19.18 dB, and 16.73 dB, respectively. When subjected to lateral misalignments, the PSNR declines by an average of 0.86 dB, 4.10 dB, 4.47 dB, and 0.09 dB, respectively. Under axial misalignments, the PSNR declines by an average of 1.16 dB, 2.54 dB, 2.94 dB, and 0.43 dB, respectively. Under small misalignment levels, the performance of the hybrid network surpasses that of the vaccinated network. As the misalignment increases, the performance of the hybrid network slightly declines but remains comparable to that of the vaccinated network overall.
Additionally, we randomly select two digits, "0" and "9", from the test set to demonstrate the image restoration effects of the 4 networks under lateral and axial misalignments. The results are shown in Fig. 4d and e. The per-epoch training time, convergence epochs, and loss values for these 4 networks are presented in Fig. 4f–h, respectively. The training time of the hybrid network is 73.3% shorter than that of the vaccinated network, while their convergence epochs and loss values remain similar. The complete simulation results for all 20 networks can be found in the Supplemental Material.
3.3 Wavelength classification
Wavelength multiplexing is an important research direction in the field of diffraction optics. ODNNs for wavelength multiplexing have attracted significant attention in recent years [28,29]. In this section, we design a class of ODNN for two-wavelength classification, as illustrated in Fig. 5a. Figure 5b shows the variations in the proportion of intensity within the target region for the 20 trained networks under lateral and axial misalignments. The proportion of target-area intensity at wavelength λ is defined as:

η_λ = I_λ / I_total,

where I_total is the total light intensity on the output plane and I_λ is the light intensity within the target area of wavelength λ. For the 4 selected networks under ideal conditions, the η_λ1 values are 91.32%, 94.75%, 94.01%, and 73.67%, respectively, and the η_λ2 values are 88.27%, 93.34%, 94.11%, and 72.82%, respectively. Under lateral misalignments, η_λ1 declines by an average of 3.98%, 33.40%, 39.05%, and 2.37%, respectively, and η_λ2 declines by an average of 4.75%, 34.62%, 38.00%, and 4.05%, respectively. Under axial misalignments, η_λ1 declines by an average of 1.20%, 4.87%, 22.79%, and 5.90%, respectively, and η_λ2 declines by an average of 1.76%, 9.64%, 26.25%, and 8.02%, respectively.
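The intensity-proportion metric can be sketched as a masked sum over the simulated output intensity; the uniform toy intensity and mask below are illustrative:

```python
import numpy as np

def intensity_proportion(I_out, target_mask):
    """Proportion of total output-plane intensity falling inside the
    target area assigned to one wavelength (eta = I_target / I_total)."""
    return float(I_out[target_mask].sum() / I_out.sum())

I = np.ones((100, 100))             # toy uniform output intensity
mask = np.zeros((100, 100), bool)
mask[40:60, 40:60] = True           # 20 x 20 target region
eta = intensity_proportion(I, mask) # 400 / 10000 = 0.04
```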
It is worth noting that under all tested misalignments, the performance of the hybrid network consistently surpasses that of the vaccinated network. Compared with the previous tasks of digit recognition and image denoising, the vaccinated network shows poorer noise resistance in wavelength classification. This is because diffraction propagation is wavelength-dependent, which increases the complexity of the vaccination training process. Figure 5c–e show the per-epoch training time, convergence epochs, and loss values of the 4 networks, respectively. The training time of the hybrid network is 56.9% shorter than that of the vaccinated network, and the loss value of the vaccinated network is significantly higher than that of the hybrid network.
4 Discussion
In the design of ODNNs, network performance is often interrelated with robustness. Under the constraint of a fixed diffraction layer size, smaller diffraction neurons allow for a larger number of trainable units, which is theoretically beneficial for the overall performance of the network. In practical applications, however, various non-negligible errors occur. Smaller diffraction neurons make the network more sensitive to these errors, which severely impact the ODNN's performance and hinder its practical application. Although using larger diffraction neurons enhances the network's robustness, it drastically reduces the number of diffraction units, significantly lowering the network's performance.
In this study, we draw inspiration from the hidden layers of computational neural networks, which contain different numbers of neurons. By varying the number and size of diffraction neurons, we design a class of H-ODNNs with enhanced robustness. These networks are applied to digit recognition, image denoising, and wavelength classification tasks. We simulate the performance variations of 20 different networks, including 18 hybrid networks, 1 conventional network, and 1 vaccinated network, under various lateral and axial misalignments across the three tasks. The simulation results consistently show that introducing a hybrid structure significantly improves the robustness of ODNNs. In addition, compared with the conventional vaccination method, the hybrid structure provides better central performance and faster training.
It is noteworthy that as the number of diffraction layers increases, ODNNs will also need to consider combinations of different hybrid structures across layers. This aspect should be explored further in dedicated studies.
Importantly, the hybrid structure we propose can be applied to the design of any multi-layer diffraction structure with the aim of improving its robustness. The hybrid structure provides a new design concept to overcome the limitations imposed by diffraction neuron sizes, especially in visible-light diffractive networks, where these limitations are severe. This approach holds great value for the fabrication and practical application of multi-layer diffraction structures.
References
[1] Zhao, Z.Q., Zheng, P., Xu, S.T., Wu, X.: Object detection with deep learning: a review. IEEE Trans. Neural Netw. Learn. Syst. 30(11), 3212–3232 (2019)
[2] Krizhevsky, A., Sutskever, I., Hinton, G.: ImageNet classification with deep convolutional neural networks. Commun. ACM 60(6), 84–90 (2017)
[3] Wetzstein, G., Ozcan, A., Gigan, S., Fan, S., Englund, D., Soljačić, M., Denz, C., Miller, D.A.B., Psaltis, D.: Inference in artificial intelligence with deep optics and photonics. Nature 588(7836), 39–47 (2020)
[4] Badloe, T., Lee, S., Rho, J.: Computation at the speed of light: metamaterials for all-optical calculations and neural networks. Adv. Photonics 4(6), 064002 (2022)
[5] He, S., Wang, R., Luo, H.: Computing metasurfaces for all-optical image processing: a brief review. Nanophotonics 11(6), 1083–1108 (2022)
[6] Xiang, J., Zhang, Y., Zhao, Y., Guo, X., Su, Y.: All-optical silicon microring spiking neuron. Photon. Res. 10(4), 939–946 (2022)
[7] Hu, J., Mengu, D., Tzarouchis, D.C., Edwards, B., Engheta, N., Ozcan, A.: Diffractive optical computing in free space. Nat. Commun. 15(1), 1525 (2024)
[8] Lin, X., Rivenson, Y., Yardimci, N.T., Veli, M., Luo, Y., Jarrahi, M., Ozcan, A.: All-optical machine learning using diffractive deep neural networks. Science 361(6406), 1004–1008 (2018)
[9] Gao, S., Chen, H., Wang, Y., Duan, Z., Zhang, H., Sun, Z., Shen, Y., Lin, X.: Super-resolution diffractive neural network for all-optical direction of arrival estimation beyond diffraction limits. Light Sci. Appl. 13(1), 161 (2024)
[10] Zhang, Y., Zhu, S., Hu, J., Gu, M.: Femtosecond laser direct nanolithography of perovskite hydration for temporally programmable holograms. Nat. Commun. 15(1), 6661 (2024)
[11] Wang, J., Wang, K., Cai, C., Fu, T., Wang, J.: Diffractive neural network on a 3D photonic device for spatial mode bases mapping. Laser Photonics Rev. 18(12), 2400634 (2024)
[12] Xia, R., Wu, L., Tao, J., Zhao, M., Yang, Z.: Monolayer directional metasurface for all-optical image classifier doublet. Opt. Lett. 49(9), 2505–2508 (2024)
[13] Fang, X., Hu, X., Li, B., Su, H., Cheng, K., Luan, H., Gu, M.: Orbital angular momentum-mediated machine learning for high-accuracy mode-feature encoding. Light Sci. Appl. 13(1), 49 (2024)
[14] Li, Y., Luo, Y., Mengu, D., Bai, B., Ozcan, A.: Quantitative phase imaging (QPI) through random diffusers using a diffractive optical network. Light Adv. Manuf. 4, 17 (2023)
[15] Duan, Z., Chen, H., Lin, X.: Optical multi-task learning using multi-wavelength diffractive deep neural networks. Nanophotonics 12(5), 893–903 (2023)
[16] He, C., Zhao, D., Fan, F., Zhou, H., Li, X., Li, Y., Li, J., Dong, F., Miao, Y., Wang, Y., Huang, L.: Pluggable multitask diffractive neural networks based on cascaded metasurfaces. Opto-Electron. Adv. 7(2), 230005 (2024)
[17] Luo, X., Dong, S., Wei, Z., Wang, Z., Hu, Y., Duan, H., Cheng, X.: Full-Fourier-component tailorable optical neural meta-transformer. Laser Photonics Rev. 17(12), 2300272 (2023)
[18] Xiang, C., Qiu, J., Liu, Q., Xiao, S., Liu, T.: Multiplexed metasurfaces for diffractive optics via a phase correlation method. Opt. Lett. 50(6), 1989–1992 (2025)
[19] Léonard, F., Backer, A., Fuller, E., Teeter, C., Vineyard, C.: Co-design of free-space metasurface optical neuromorphic classifiers for high performance. ACS Photonics 8(7), 2103–2111 (2021)
[20] Léonard, F., Fuller, E.J., Teeter, C.M., Vineyard, C.M.: High accuracy single-layer free-space diffractive neuromorphic classifiers for spatially incoherent light. Opt. Express 30(8), 12510–12520 (2022)
[21] Léonard, F., Fuller, E.J., Teeter, C.M., Vineyard, C.M.: Role of depth in optical diffractive neural networks. Opt. Express 32(13), 23125–23133 (2024)
[22] Işıl, Ç., Gan, T., Ardic, F.O., Mentesoglu, K., Digani, J., Karaca, H., Chen, H., Li, J., Mengu, D., Jarrahi, M., Akşit, K., Ozcan, A.: All-optical image denoising using a diffractive visual processor. Light Sci. Appl. 13(1), 43 (2024)
[23] Yang, X., Rahman, M., Bai, B., Li, J., Ozcan, A.: Complex-valued universal linear transformations and image encryption using spatially incoherent diffractive networks. Adv. Photonics Nexus 3(1), 76–85 (2024)
[24] Zhu, H., Yin, R., Hu, T., Xia, R., Li, M., Zhao, M., Yang, Z.: Restoration of motion-blurred numeral image using a complex-amplitude diffractive processor. Opt. Lett. 49(17), 4914–4917 (2024)
[25] Kulce, O., Mengu, D., Rivenson, Y., Ozcan, A.: All-optical synthesis of an arbitrary linear transformation using diffractive surfaces. Light Sci. Appl. 10(1), 196 (2021)
[26] Bai, B., Yang, X., Gan, T., Li, J., Mengu, D., Jarrahi, M., Ozcan, A.: Pyramid diffractive optical networks for unidirectional image magnification and demagnification. Light Sci. Appl. 13(1), 178 (2024)
[27] Ma, G., Yang, X., Bai, B., Li, J., Li, Y., Gan, T., Shen, C., Zhang, Y., Li, Y., Işıl, Ç., Jarrahi, M., Ozcan, A.: Multiplexed all-optical permutation operations using a reconfigurable diffractive optical network. Laser Photonics Rev. 18(11), 2400238 (2024)
[28] Feng, J., Chen, H., Yang, D., Hao, J., Lin, J., Jin, P.: Multi-wavelength diffractive neural network with the weighting method. Opt. Express 31(20), 33113–33122 (2023)
[29] Cheong, Y., Thekkekara, L., Bhaskaran, M., Rosal, B., Sriram, S.: Broadband diffractive neural networks enabling classification of visible wavelengths. Adv. Photon. Res. 5(6), 2300310 (2024)