A fully real-valued end-to-end optical neural network for generative model

Shan Jiang , Bo Wu , Qixiang Cheng , Jianji Dong

Front. Optoelectron. ›› 2026, Vol. 19 ›› Issue (1) : 4

Front. Optoelectron. ›› 2026, Vol. 19 ›› Issue (1) :4 DOI: 10.2738/foe.2026.0004
RESEARCH ARTICLE

Abstract

Optical neural networks (ONNs) hold great promise for low-latency, energy-efficient inference. However, the absence of a fully real-valued end-to-end ONN, in which the inputs, weight matrices, and nonlinear activations are all represented in the real-number domain and can be optically cascaded, remains a key bottleneck. Existing approaches either rely on electrical post-processing of photodetector outputs to extend the number field in the linear layers, which breaks optical cascadability, or employ photodiode–driven micro-ring modulators (MRMs) to implement nonlinearities, constraining subsequent-layer inputs to the nonnegative domain and thereby limiting network expressivity and architectural flexibility. Here, we employ two MRMs biased at different resonance wavelengths to achieve real-valued optical encoding, together with a dual-MRM activation element driven by the differential photocurrent of photodiodes, which provides optically cascadable real-valued nonlinear activation. Combined with a real-valued Mach–Zehnder interferometer mesh for matrix computation, this architecture realizes a fully real-valued end-to-end ONN. We experimentally demonstrate a tanh-like nonlinear activation function and validate it on an iris classification task, achieving an accuracy of 98%. We further model the generator of a generative adversarial network based on this structure, in which the nonlinear activation is based on the experimentally measured nonlinear transfer curve. The generator can use natural optical noise as its input, thereby eliminating electro-optic conversion and digital-to-analog conversion at the input stage. With the above merits, the proposed ONN achieves successful optical-to-optical on-chip image generation, validating the superiority of optical computing.

Keywords

Photonic neural networks / Optical nonlinear activation function / Photonic integrated circuit / Optical computing

Cite this article

Shan Jiang, Bo Wu, Qixiang Cheng, Jianji Dong. A fully real-valued end-to-end optical neural network for generative model. Front. Optoelectron., 2026, 19(1): 4 DOI:10.2738/foe.2026.0004


1 Introduction

With the rapid advancement of artificial intelligence (AI), the computational demands of neural networks have become increasingly diverse and complex. In particular, optical neural networks (ONNs), which leverage the inherent low latency and energy efficiency of photonic systems, have attracted growing attention as promising candidates for next-generation AI accelerators. However, most existing ONN architectures lack the capability to support fully real-valued, end-to-end computation; that is, they cannot realize operation in the real-valued domain across all layers, including the inputs, weight matrices, and nonlinear activations. Although some ONNs have achieved real-valued weight representations, they still operate under constraints of nonnegative input encoding and nonnegative nonlinear activation functions, such as ReLU or other irregular mappings defined only on the positive domain [1–11]. This often destroys the zero-mean property of data distributions that naturally arises for real-valued samples, leading to unstable gradient descent and slower training convergence. Moreover, almost all large language models rely on GELU or similar nonlinearities to effectively distinguish whether two tokens are correlated, anti-correlated, or unrelated. When the activation function is restricted to nonnegative outputs, the representational capacity of Transformer architectures is therefore severely limited. Consequently, extending ONNs to be fully real-valued end-to-end is crucial for enhancing their expressivity. In the following, we review the commonly adopted schemes for the linear and nonlinear components in current ONN implementations and discuss the challenges they face.

Within the linear computation of ONNs, a basic strategy for implementing real-valued vector–matrix multiplication is to separate the positive and negative entries of the weight matrix W and the input vector x and compute them independently [12], effectively representing the real-valued product Wx as $(W^{+} - |W^{-}|)(x^{+} - |x^{-}|)$, where the superscripts + and − denote the positive and negative entries, respectively. This approach requires multiple passes through the photonic chip and relies on subsequent electronic post-processing to recombine the results. Another approach is to set a bias point and a scaling factor, and realize a real-valued W via linear scaling [13,14]. An alternative is to introduce a “negative” port and realize real-valued weights by subtracting the optical intensities measured at the positive and negative ports [15–17]. However, these operations still depend on electronic post-processing, and the input vector x remains restricted to nonnegative values.
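The positive/negative separation described above can be sketched numerically. The following is a minimal illustration, not the implementation of [12]: it assumes the signed product is recovered from four nonnegative products, and the helper names `split_signed` and `signed_matvec` are hypothetical.

```python
import numpy as np

def split_signed(a):
    """Split a real array into two nonnegative arrays: positive part and magnitude of negative part."""
    pos = np.maximum(a, 0.0)
    neg = np.maximum(-a, 0.0)
    return pos, neg

def signed_matvec(W, x):
    """Recover the real-valued product W @ x from four nonnegative products.

    Each nonnegative product stands for one pass through an intensity-only
    photonic processor; the final subtraction stands for electronic post-processing.
    """
    W_pos, W_neg = split_signed(W)
    x_pos, x_neg = split_signed(x)
    positive_terms = W_pos @ x_pos + W_neg @ x_neg
    negative_terms = W_pos @ x_neg + W_neg @ x_pos
    return positive_terms - negative_terms

rng = np.random.default_rng(0)
W = rng.uniform(-1.0, 1.0, size=(4, 4))
x = rng.uniform(-1.0, 1.0, size=4)
y = signed_matvec(W, x)
```

The sketch makes the cost of this strategy visible: every operand fed to the hardware is nonnegative, but several passes and an off-chip subtraction are needed, which is what prevents direct optical cascading.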

For the nonlinear activation in ONNs, current implementations are generally limited to the nonnegative domain, and existing approaches typically fall into two categories: all-optical schemes that rely on specialized materials [8,13,18–22] or nonlinear mechanisms [6,7,23–25], and O–E–O schemes involving optical–electrical–optical conversion [9,10,26]. All-optical schemes often suffer from accumulated optical attenuation, which is detrimental to deep cascades. In O–E–O schemes, a commonly used approach is to employ a photodetector (PD) to convert optical power into photocurrent that drives the micro-ring modulators (MRMs) of the subsequent layer [27]. Since the MRMs are supplied with optical power, the optical intensity does not attenuate as in all-optical schemes [2,28]. This method has already demonstrated promising performance and, when combined with a “negative” port in the linear layer, can realize an on-chip real-valued weight matrix together with a cascadable nonlinear activation [3,29]. However, the input and the nonlinear activation remain restricted to the positive domain, as illustrated in Fig. 1a.

To address these limitations, we propose a fully real-valued, end-to-end optical neural network that simultaneously supports a real-valued input vector x, weight matrix W, and nonlinear activation functions (NAFs) and, importantly, remains optically cascadable, as shown in Fig. 1b. We experimentally implement the proposed architecture on a photonic chip and perform an iris classification benchmark task, thereby verifying its convergence performance. Furthermore, we model the generator of a generative adversarial network (GAN) based on this structure, which is required to operate in the real domain. In this context, the required noise input is supplied by the random fluctuations of partially coherent light, which can potentially serve as an on-chip optical noise source in future implementations, eliminating the electro-optic conversion and digital-to-analog conversion at the input stage. The results demonstrate that real-valued inputs and activation functions yield the best image generation quality among the evaluated configurations.

2 Structure and operation principle

Figure 2a illustrates the overall architecture of the proposed ONN. In the input part, two columns of MRMs, corresponding to two sets of operating wavelengths ($\lambda_n^{\pm}$, $n = 1, 2, 3, 4$), are employed: the reddish column, with operating wavelengths $\lambda_n^{+}$, loads the positive elements of the input vector, while the blue column, with operating wavelengths $\lambda_n^{-}$, loads the negative elements. As an example, after normalizing the real-valued data set into the interval $[-1, 1]$, consider the case where the input signal at the first port is $-1$. The reddish MRM is biased to be resonant at $\lambda_1^{+}$, so that the optical power at $\lambda_1^{+}$ is completely dropped at the drop port and does not enter the Mach–Zehnder interferometer (MZI) mesh. In contrast, the blue MRM is detuned from its resonance wavelength $\lambda_1^{-}$, such that the optical power at $\lambda_1^{-}$ is transmitted with negligible loss through the through port into the MZI mesh. After encoding, the optical signals are processed by an MZI mesh, which performs a real-domain matrix–vector multiplication. The MZI mesh has been demonstrated in prior work [17] to represent an arbitrary real matrix by introducing a “negative” port and performing the subtraction operation in post-processing, as shown on the left of Fig. 2b. To integrate the NAF and the MZI mesh, the work [17] further proposed subtracting the optical intensities at two adjacent output ports, rather than subtracting a single “negative” port from all other ports, as illustrated in the right panel of Fig. 2b. Conceptually, this operation is equivalent to left-multiplying the original weight matrix by an invertible matrix and therefore preserves the ability to represent an arbitrary real-valued matrix.
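The dual-wavelength input encoding can be sketched as follows. This is a minimal illustration under a stated assumption: the text only gives the example of an input equal to the lower bound of the normalized interval, so the mapping of intermediate values to transmitted power fractions (positive part on the positive wavelength, magnitude of the negative part on the negative wavelength) is an assumption, and the function names are hypothetical.

```python
import numpy as np

def encode_dual_wavelength(x):
    """Map real inputs in [-1, 1] to two nonnegative power fractions per port.

    p_plus  : fraction of power transmitted at the positive wavelength
    p_minus : fraction of power transmitted at the negative wavelength
    (Assumed mapping; only the end-point example is given in the text.)
    """
    x = np.asarray(x, dtype=float)
    if np.any(np.abs(x) > 1.0):
        raise ValueError("inputs must be normalized to [-1, 1]")
    p_plus = np.maximum(x, 0.0)
    p_minus = np.maximum(-x, 0.0)
    return p_plus, p_minus

def decode_dual_wavelength(p_plus, p_minus):
    """Recover the signed value as the difference of the two wavelength channels."""
    return p_plus - p_minus

x = np.array([-1.0, 0.25, 0.0, 1.0])
p_plus, p_minus = encode_dual_wavelength(x)
```

For the first port, the sketch gives no power at the positive wavelength and full power at the negative wavelength, which matches the resonant/detuned MRM states described in the example above.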

In the first nonlinear layer of Fig. 2a, the five output ports of the MZI mesh are connected to 1 × 2 multimode interferometers (MMIs), because the optical signal at each intermediate port (except for the top and the bottom) must be split into two paths to perform subtraction with its upper and lower neighboring ports, respectively. Specifically, the output optical intensities are processed as $i_1 - i_2$, $i_2 - i_3$, $i_3 - i_4$, $i_4 - i_5$, where $i_n$ denotes half of the optical intensity at the $n$th output port of the MZI mesh.
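The statement that adjacent-port subtraction is equivalent to left-multiplying by an invertible matrix can be checked with a small numerical sketch. It assumes, for illustration only, that the single "negative" port of the earlier scheme is the fifth port, and it ignores the common factor of one half introduced by the splitters; the matrix names are hypothetical.

```python
import numpy as np

n_ports = 5

# Adjacent-port scheme: z_k = I_k - I_{k+1}, k = 1..4
D_adjacent = np.eye(n_ports - 1, n_ports) - np.eye(n_ports - 1, n_ports, k=1)

# Single negative-port scheme (port 5 assumed to be the negative port): y_k = I_k - I_5
D_single = np.hstack([np.eye(n_ports - 1), -np.ones((n_ports - 1, 1))])

# Upper-bidiagonal matrix linking the two schemes: z = B @ y
B = np.eye(n_ports - 1) - np.eye(n_ports - 1, k=1)

rng = np.random.default_rng(1)
intensities = rng.uniform(0.0, 1.0, size=n_ports)  # nonnegative port intensities
z = D_adjacent @ intensities
y = D_single @ intensities
```

Because `B` is triangular with ones on its diagonal, its determinant is 1, so any real matrix reachable with the single negative-port readout remains reachable with adjacent differencing.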

Because each output port of the MZI mesh carries a superposition of optical signals at positive and negative wavelengths ($\lambda_1^{\pm}$, $\lambda_2^{\pm}$, $\lambda_3^{\pm}$, $\lambda_4^{\pm}$), the theoretical output of the first-layer matrix multiplication is given by the intensity difference between the positive and negative components, i.e., $i_n = i_n^{+} - i_n^{-}$, where $i_n^{+}$ and $i_n^{-}$ represent the summed optical intensities of the positive and negative wavelength components at the $n$th output port of the MZI mesh, respectively. To physically separate these positive and negative components, we then feed the optical signals into unbalanced Mach–Zehnder interferometers (UMZIs). By properly designing the arm-length difference between the two arms of the UMZI, the free spectral range (FSR) can be engineered to the desired value. In our implementation, the FSR is approximately 4.5 nm. The operating wavelengths assigned to the input signals are then carefully chosen to match the spectral transmission of the UMZI, with each pair of adjacent positive and negative wavelengths separated by half of the FSR, such that the upper output port of the UMZI selectively transmits the wavelengths encoding positive values, whereas the lower port passes those encoding negative values. As illustrated in Fig. 2c, the UMZI thus separates these positive and negative wavelength components. Finally, the differential outputs can be expressed as $(i_1^{+} - i_1^{-}) - (i_2^{+} - i_2^{-})$, $(i_2^{+} - i_2^{-}) - (i_3^{+} - i_3^{-})$, $(i_3^{+} - i_3^{-}) - (i_4^{+} - i_4^{-})$, $(i_4^{+} - i_4^{-}) - (i_5^{+} - i_5^{-})$, which can equivalently be rewritten as $(i_1^{+} + i_2^{-}) - (i_1^{-} + i_2^{+})$, $(i_2^{+} + i_3^{-}) - (i_2^{-} + i_3^{+})$, $(i_3^{+} + i_4^{-}) - (i_3^{-} + i_4^{+})$, $(i_4^{+} + i_5^{-}) - (i_4^{-} + i_5^{+})$.
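The wavelength separation performed by the UMZI can be illustrated with an idealized transfer function. The sketch below assumes a lossless UMZI with a perfectly sinusoidal response and an FSR that is constant over the band; the 4.5 nm FSR is taken from the text, whereas the reference wavelength of 1550 nm and the specific channel grid are hypothetical.

```python
import numpy as np

FSR_NM = 4.5          # free spectral range stated in the text
LAMBDA_0_NM = 1550.0  # hypothetical wavelength of an upper-port transmission peak

def umzi_upper(wavelength_nm):
    """Idealized power transmission to the upper output port."""
    return np.cos(np.pi * (wavelength_nm - LAMBDA_0_NM) / FSR_NM) ** 2

def umzi_lower(wavelength_nm):
    """Idealized power transmission to the lower output port (lossless device)."""
    return 1.0 - umzi_upper(wavelength_nm)

# Hypothetical channel plan: positive channels on upper-port peaks,
# negative channels offset by half an FSR.
lambda_plus = LAMBDA_0_NM + FSR_NM * np.arange(4)
lambda_minus = lambda_plus + FSR_NM / 2.0

upper_plus = umzi_upper(lambda_plus)
upper_minus = umzi_upper(lambda_minus)
```

In this idealization, channels offset by half an FSR leave through opposite ports, which is the property the channel selection relies on. A fabricated device would show finite extinction and would need a bias adjustment to align its peaks with the channels.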

Next, the key question is how to implement this differential expression and the associated real-domain nonlinear activation on chip, and how to map the resulting signal to the input of the next layer. Taking $(i_1^{+} + i_2^{-}) - (i_1^{-} + i_2^{+})$ as an example, Fig. 2d illustrates the core idea of our scheme. The optical components $i_1^{+}$ and $i_2^{-}$ are routed to one Ge photodetector (Ge PD), whereas $i_1^{-}$ and $i_2^{+}$ are routed to another Ge PD, so that the two PDs generate photocurrents $I_1^{+} + I_2^{-}$ and $I_1^{-} + I_2^{+}$, respectively. The two PDs are connected in series, with the two terminals of the series pair biased at equal magnitude but opposite polarity (one at a negative voltage and the other at a positive voltage). According to Kirchhoff’s current law, the differential current, $I_{\mathrm{diff}} = (I_1^{+} + I_2^{-}) - (I_1^{-} + I_2^{+})$, is injected into the MRMs of the next layer on the right. The PN-junction polarities in the regions contacted by the upper and lower metal electrodes are opposite for the two MRMs: in the left MRM the region under the upper metal electrode is p-type and that under the lower electrode is n-type, whereas in the right MRM the region under the upper electrode is n-type and that under the lower electrode is p-type. The two MRMs can therefore be regarded as two PN junctions connected in parallel with opposite polarity, with their common lower node held at 0 V.
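The current subtraction at the node between the two series-connected PDs can be checked with a short sketch. It assumes that both PDs have the same responsivity and respond linearly; the function name and the default responsivity value are hypothetical.

```python
import math

def differential_photocurrent(i1_plus, i1_minus, i2_plus, i2_minus, responsivity=1.0):
    """Net current leaving the node between two series-connected PDs.

    One PD receives the optical powers i1_plus and i2_minus, the other receives
    i1_minus and i2_plus. By Kirchhoff's current law, the current delivered to
    the next-layer MRMs is the difference of the two photocurrents.
    responsivity is in A/W and is assumed identical for both PDs.
    """
    first_pd = responsivity * (i1_plus + i2_minus)
    second_pd = responsivity * (i1_minus + i2_plus)
    return first_pd - second_pd

# Example optical powers in arbitrary units
i_diff = differential_photocurrent(0.30, 0.10, 0.05, 0.20)
expected = (0.30 - 0.10) - (0.05 - 0.20)  # difference of the two signed port values
```

The regrouped expression therefore yields the difference between the signed values of two adjacent ports while each PD only ever sees a nonnegative optical power.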

To conserve wavelength resources, these MRMs are supplied by a light source at the same wavelengths as those used in the input layer and are thermally pre-tuned to be resonant at these wavelengths. Depending on the sign of $I_{\mathrm{diff}}$, two cases arise, as shown in Fig. 2e, where the yellow arrows indicate the path and direction of the photocurrent flow. When $I_{\mathrm{diff}}$ is negative, the PN junction of the right MRM is forward-biased, while no current flows through the left one. Consequently, the carrier concentration in the right MRM increases, which in turn decreases its refractive index and blue-shifts its resonance. The inputs of the second layer and the outputs of the first layer thereby establish a mapping, as illustrated by the curve in the coordinate plane. Since the corresponding output wavelength represents a negative value, the mapping curve is plotted in the third quadrant. When $I_{\mathrm{diff}}$ is positive, the situation is reversed: the PN junction of the left MRM becomes forward-biased, and no current flows through the right one. Taken together, the inputs to the second layer and the outputs from the first layer establish a mapping that follows a tanh-like curve. After this nonlinear activation, the optical signals are fed into the next linear layer and then into the subsequent nonlinear layer, where the same procedure is repeated, thereby realizing a fully real-valued, end-to-end ONN.
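The sign-dependent behavior of the dual-MRM activation can be illustrated with a toy model. This is a rough sketch, not the measured transfer curve: it assumes a Lorentzian through-port response, a resonance shift proportional to the forward current, and a hypothetical scale factor `SHIFT_PER_UNIT`; the 10 dB extinction ratio is the value reported for the chip.

```python
import numpy as np

EXTINCTION_DB = 10.0                     # extinction ratio reported for the chip
T_MIN = 10.0 ** (-EXTINCTION_DB / 10.0)  # on-resonance through-port transmission
SHIFT_PER_UNIT = 3.0                     # hypothetical detuning (in half-linewidths) per unit drive

def through_port(detuning):
    """Lorentzian through-port transmission; detuning is in half-linewidths."""
    return 1.0 - (1.0 - T_MIN) / (1.0 + detuning ** 2)

def dual_mrm_activation(i_diff):
    """Signed output: positive-channel power minus negative-channel power.

    Only the MRM whose junction is forward-biased is detuned; the other one
    stays on resonance and transmits T_MIN.
    """
    i_diff = np.asarray(i_diff, dtype=float)
    left = through_port(SHIFT_PER_UNIT * np.maximum(i_diff, 0.0))    # driven when i_diff > 0
    right = through_port(SHIFT_PER_UNIT * np.maximum(-i_diff, 0.0))  # driven when i_diff < 0
    return left - right

drive = np.linspace(-2.0, 2.0, 401)
response = dual_mrm_activation(drive)
```

The toy curve is odd, monotonic, and saturating, which is what "tanh-like" refers to here. Its slope near the origin is flatter than that of a true tanh because of the Lorentzian assumption, so it should not be read as a fit to the experimental data.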

3 Chip fabrication and characterization

The chip was fabricated using a standard silicon photonics foundry process. Figure 3a shows the microscope image of the chip architecture. It implements a compact end-to-end ONN architecture comprising four input ports, a 4×4 linear layer, a nonlinear activation layer, a 2×4 linear layer, and two output ports. In Fig. 3b, we show the measured transmission spectra at the upper output port of a UMZI under a voltage sweep at different operating wavelengths. By carefully selecting a suitable bias voltage, the UMZI can effectively separate the positive and negative wavelengths. Figure 3c shows the experimentally measured profile of the nonlinear activation function, in good agreement with the earlier prediction. The horizontal axis represents the differential optical power input to the PDs, which is proportional to the generated photocurrent, while the vertical axis represents the differential optical power from the through ports of the two driven MRMs, which serves as the input to the next layer. The output intensity tends to saturate when the absolute value of the differential optical power injected into the PD exceeds 0.2 mW. The extinction ratio is 10 dB, which meets the requirements for computation. In this chip, we use the same wavelength set ($\lambda_n^{\pm}$, $n = 1, 2, 3, 4$) for both the input-layer and activation-layer MRMs, as mentioned above. As a result, the corresponding ports can share a single grating coupler: the laser source is coupled into the chip through it and then distributed to the corresponding ports of two layers via a set of 1×2 power splitters.

4 Experiment results

To assess the chip’s convergence capability, we conducted an experimental classification task on the Iris data set. The Iris data set contains 3 classes of 50 instances each, of which 35 were used for training and 15 for testing. The four input ports of the optical chip correspond to the four features of the data set—sepal length, sepal width, petal length, and petal width. For the entire data set, we first perform a binary classification: one class is Versicolor, and the other class combines the remaining two species. After completing this categorization, we further perform binary classification on the remaining two categories. By performing this one-versus-rest method [3], the chip—despite having only two output ports—implements three-class classification, as illustrated in Fig. 4a. The output of each training run is post-processed using a softmax function to evaluate the probabilities of the iris species. The cross-entropy loss of each run is then fed to the Adam optimization algorithm, which updates the heater voltages accordingly.
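The two-stage decision and the loss evaluation described above can be sketched as follows. This is an illustration only: the assignment of output ports to classes and the function names are assumptions, and the sketch covers the electronic post-processing, not the optical forward pass or the heater-voltage update.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the two output-port readings."""
    shifted = np.asarray(logits, dtype=float) - np.max(logits)
    exponentials = np.exp(shifted)
    return exponentials / exponentials.sum()

def cross_entropy(probabilities, label):
    """Cross-entropy loss for a single sample with an integer class label."""
    return -np.log(probabilities[label])

def classify_iris(stage1_outputs, stage2_outputs):
    """Two cascaded binary decisions yielding three classes.

    stage1_outputs: port readings for [Versicolor, rest]   (assumed port order)
    stage2_outputs: port readings for [Setosa, Virginica]  (assumed port order)
    """
    if np.argmax(softmax(stage1_outputs)) == 0:
        return "Versicolor"
    second_stage = ("Setosa", "Virginica")
    return second_stage[int(np.argmax(softmax(stage2_outputs)))]

prediction_a = classify_iris([2.0, 0.1], [0.0, 1.0])
prediction_b = classify_iris([0.1, 2.0], [0.3, 1.2])
loss_uniform = cross_entropy(softmax([0.0, 0.0]), 0)
```

In the experiment, a loss of this kind is what the Adam optimizer receives after each run; the second stage is only consulted for samples that the first stage assigns to the combined class.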

The loss curve and accuracy curve for the Setosa vs. Virginica classification are shown in Fig. 4b. The loss drops significantly during the first 20 iterations and then gradually stabilizes. Meanwhile, the classification accuracy on the test set approaches 100% within the first 10 iterations. The confusion matrix for the entire training data set is shown in Fig. 4c, indicating an overall classification accuracy of 98%.

Due to the limited scale of the current chip, we implement only a simple Iris classification experiment on hardware. To further evaluate the advantages of our fully real-valued architecture, we extract the nonlinear activation function realized on chip and use it in simulations of a GAN generator, which typically operates in the real domain. Compared with conventional discriminative networks, a GAN must not only fit an input–output mapping, but also construct a well-structured probabilistic geometry in the latent space to enable smooth interpolation. Consequently, GANs tend to benefit from symmetric real-valued latent-variable distributions and activation functions in the intermediate layers that are approximately zero-centered or exhibit responses over both positive and negative domains, thereby yielding a more meaningful latent geometry and more stable adversarial training.

The training of a GAN follows a game-theoretic paradigm: the discriminator learns to distinguish between real and generated images, while the generator learns to synthesize increasingly realistic outputs to deceive the discriminator. This approach can be used to effectively enlarge the data set in certain training tasks. Through alternating optimization, both networks evolve iteratively, and eventually, the generator is capable of producing visually convincing results.

The GAN architecture adopted in our study is illustrated in Fig. 5a. The model was trained using a specific subset of the MNIST data set consisting of the digit “7”. These original grayscale images were downsampled and resized to 8 × 8. The generator is a fully connected network with 4 input neurons, a hidden layer of 4 neurons, and 64 output neurons. It takes real-valued random vectors as input, and both the hidden and output layers use a tanh activation function. The discriminator receives both real and generated samples and outputs the probability of a sample being real. The generator and discriminator are trained adversarially using a cross-entropy loss, and their parameters are updated with the Adam optimizer.
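The generator's forward pass can be sketched in a few lines. The layer sizes (4 inputs, 4 hidden neurons, 64 outputs) and the tanh activations follow the description above; the absence of bias terms, the random weight initialization, and the function name are assumptions made for illustration, and training is not shown.

```python
import numpy as np

def generator(z, W1, W2, activation=np.tanh):
    """Forward pass of a 4 -> 4 -> 64 fully connected generator.

    z  : real-valued noise vector of length 4
    W1 : 4 x 4 hidden-layer weights
    W2 : 64 x 4 output-layer weights
    Bias terms are omitted (assumption). Returns an 8 x 8 image in [-1, 1].
    """
    hidden = activation(W1 @ z)
    output = activation(W2 @ hidden)
    return output.reshape(8, 8)

rng = np.random.default_rng(0)
W1 = rng.normal(0.0, 0.5, size=(4, 4))   # untrained placeholder weights
W2 = rng.normal(0.0, 0.5, size=(64, 4))
z = rng.uniform(-1.0, 1.0, size=4)       # real-valued noise input
image = generator(z, W1, W2)
```

Passing a different callable as `activation`, for example an interpolation of a measured transfer curve, is one way the comparison between activation choices could be set up in simulation.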

In the simulation experiments, we varied the generator’s input domain and nonlinear activation function to evaluate image generation performance under three different configurations in Fig. 5b. The first configuration uses real-valued inputs with a real-valued nonlinear activation; the second employs nonnegative inputs with a real-valued nonlinear activation; and the third adopts nonnegative inputs with a ReLU activation function. In all cases involving real-valued nonlinear activation, the activation function is obtained from the chip and corresponds to the curve shown in Fig. 3c. The results demonstrate that when both the input and the nonlinear activation are real-valued, the generator achieves the best image generation performance.

As shown in Fig. 6a, the random input does not rely on pseudo-random numbers generated by electronic computers; instead, it is derived from the noise of a partially coherent source. Specifically, the broadband output of an amplified spontaneous emission (ASE) light source is filtered by a programmable optical filter to obtain a narrowband optical signal with a 3-dB bandwidth of 0.1 nm, as shown in the spectrum in Fig. 6b. This signal is partially coherent. Because the PD bandwidth exceeds the optical bandwidth, the PD captures the random phase and intensity noise of the signal. As a result, the voltage waveform displayed on the oscilloscope exhibits random fluctuations, as illustrated in Fig. 6c. In our simulations, we sample the detected noise as the input. However, if a GAN were implemented directly on a photonic chip, a partially coherent light source could serve as the stochastic input, eliminating electronic pseudo-random number generation and digital-to-analog conversion, thereby substantially simplifying the practical system. The resulting probability density functions (PDFs) of the noise at different times, as shown in Fig. 6d, are highly consistent. The observed right-skewed shape of the distribution is a characteristic feature resulting from the superposition of the electronic noise floor and optical chaotic noise. The mean and variance of the output distribution can be precisely controlled through amplitude modulation [30]. These results demonstrate the stability, controllability, and reproducibility of this noise source.

5 Discussion and conclusion

There remains room for further enhancement. By introducing a nonzero bias voltage at the shared lower node of the two parallel MRMs, rather than fixing it at 0 V, the activation response can be tuned to exhibit Leaky-ReLU-like characteristics. Furthermore, cascading additional MRMs or other functional components [1,31] could enable the realization of more general nonlinear activation curves, thereby allowing the framework to adapt to more flexible network architectures [32]. In addition, since the supply light can be shared across different layers, the maximum number of required wavelength channels is independent of the network depth and instead relates to the network width, specifically depending on the layer with the largest number of neurons. In this work, different input ports use different wavelengths of light to satisfy the incoherent-input condition of the real-valued MZI mesh. However, the number of required input wavelengths can be greatly reduced because this requirement can also be met by employing partially coherent light as the working optical source. All ports can share only two wavelengths—one encoding the positive element and the other encoding the negative element. In contrast to the partially coherent optical noise used as the input for the GAN, by choosing a narrowband source whose optical bandwidth exceeds the PD detection bandwidth, effective optical-intensity detection rather than noise-dominated detection can be achieved [29]. Moreover, compared with a conventional MZI mesh that performs only matrix operations, our architecture introduces UMZIs at each output port for wavelength separation, which incurs extra footprint. In the future, by leveraging a Si/SiN dual-layer platform, photonic components could be vertically stacked, thereby substantially increasing areal density.

The system’s total power consumption comprises several key components. The laser source contributes 0.62 W to the overall power budget. The photonic chip integrates 42 phase shifters, each consuming about 20 mW to achieve a π phase shift, resulting in a total of 0.84 W. The NAF operates with an MRM driving current of 0.1 mA and a PD bias voltage of 3 V, yielding a total power consumption of 2.4 mW. Additionally, the external electrical circuits consume a total of 70.57 mW, comprising 170 µW for the DACs, 48.8 mW for the transmitters, and 21.6 mW for the receivers. In total, the power consumption of the experimental system is 1.53 W. In terms of computational operations, a dot product entails 2N operations (OPs), where N denotes the vector size, and each NAF corresponds to three extra operations. Thus, the system performs 72 OPs per inference. The latency of the inference process is primarily determined by the O–E–O conversion time, which is dominated by the RC time constant and amounts to approximately 1.76 ns [3]. Accordingly, the energy per operation is calculated as 1.53 W × 1.76 ns / 72 OPs ≈ 37 pJ/OP. Both the latency and energy consumption remain competitive with prior works that also rely on O–E–O-based nonlinearities.
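The power budget and the energy-per-operation figure can be reproduced with a few lines of arithmetic. All numbers are taken from the paragraph above; the count of 72 OPs per inference is used as stated rather than re-derived, and the variable names are only illustrative.

```python
# Power contributions in watts, as listed in the text
laser_w = 0.62
phase_shifters_w = 42 * 20e-3                 # 42 heaters at about 20 mW each
naf_w = 2.4e-3                                # nonlinear activation function
electronics_w = 170e-6 + 48.8e-3 + 21.6e-3    # DACs + transmitters + receivers

total_w = laser_w + phase_shifters_w + naf_w + electronics_w

latency_s = 1.76e-9      # O-E-O conversion time
ops_per_inference = 72   # value stated in the text

energy_per_inference_j = total_w * latency_s
energy_per_op_pj = energy_per_inference_j / ops_per_inference * 1e12

print(f"total power: {total_w:.2f} W")
print(f"energy per operation: {energy_per_op_pj:.0f} pJ/OP")
```

The breakdown shows that the laser and the thermal phase shifters account for nearly all of the budget, while the activation circuit and the external electronics together contribute less than 0.1 W.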

In summary, we have proposed a scheme for real-valued input encoding by modulating two MRMs biased at different resonance wavelengths, and developed a differential-photocurrent-driven dual-MRM activation framework that realizes cascadable real-valued nonlinear activation. Combined with previously reported schemes for real-valued weight matrices, these advances enable a fully real-valued, end-to-end optical neural network. Experimental results confirm that the fabricated chip achieves effective convergence on classification tasks. Moreover, simulations demonstrate that the proposed real-valued architecture outperforms alternative approaches in applications that require real-domain implementations, underscoring its strong potential for future photonic neural network systems.

References

[1] Chen, B., Xiong, X., Zhang, R., Dai, Y., Yang, J., Bai, J., Li, W., Zhu, N., Li, M.: A fully reconfigurable all-optical integrated nonlinear activator. arXiv:2503.11141 (2025)

[2] Shi, Y., Ren, J., Chen, G., Liu, W., Jin, C., Guo, X., Yu, Y., Zhang, X.: Nonlinear germanium-silicon photodiode for activation and monitoring in photonic neuromorphic networks. Nat. Commun. 13(1), 6048 (2022)

[3] Wu, B., Zhou, H., Cheng, J., Zhang, W., Zhang, S., Huang, C., Huang, D., Zhou, H., Dong, J., Zhang, X.: Monolithically integrated asynchronous optical recurrent accelerator. eLight 5, 7 (2025)

[4] Liao, K., Lian, Y., Yu, M., Du, Z., Dai, T., Wang, Y., Yan, H., Wang, S., Lu, C., Chan, C.T., Zhu, R., Di, D., Hu, X., Gong, Q.: Hetero-integrated perovskite/Si3N4 on-chip photonic system. Nat. Photonics 19(4), 358–368 (2025)

[5] Lim, G.K., Chen, Z.L., Clark, J., Goh, R.G.S., Ng, W.H., Tan, H.W., Friend, R.H., Ho, P.K.H., Chua, L.L.: Giant broadband nonlinear optical absorption response in dispersed graphene single sheets. Nat. Photonics 5(9), 554–560 (2011)

[6] Wu, T., Li, Y., Ge, L., Feng, L.: Field-programmable photonic nonlinearity. Nat. Photonics 19(7), 725–732 (2025)

[7] Li, G.H.Y., Sekine, R., Nehra, R., Gray, R.M., Ledezma, L., Guo, Q., Marandi, A.: All-optical ultrafast ReLU function for energy-efficient nanophotonic deep learning. Nanophotonics 12(5), 847–855 (2023)

[8] Feldmann, J., Youngblood, N., Wright, C.D., Bhaskaran, H., Pernice, W.H.P.: All-optical spiking neurosynaptic networks with self-learning capabilities. Nature 569(7755), 208–214 (2019)

[9] Pour Fard, M.M., Williamson, I.A.D., Edwards, M., Liu, K., Pai, S., Bartlett, B., Minkov, M., Hughes, T.W., Fan, S., Nguyen, T.A.: Experimental realization of arbitrary activation functions for optical neural networks. Opt. Express 28(8), 12138 (2020)

[10] Wang, T., Sohoni, M.M., Wright, L.G., Stein, M.M., Ma, S.Y., Onodera, T., Anderson, M.G., McMahon, P.L.: Image sensing with multilayer nonlinear optical neural networks. Nat. Photonics 17(5), 408–415 (2023)

[11] Khoo, I.C., Wood, M., Shih, M.Y., Chen, P.: Extremely nonlinear photosensitive liquid crystals for image sensing and sensor protection. Opt. Express 4(11), 432–442 (1999)

[12] Zhang, S., Zhou, H., Wu, B., Jiang, X., Gao, D., Xu, J., Dong, J.: Redundancy-free integrated optical convolver for optical neural networks based on arrayed waveguide grating. Nanophotonics 13(1), 19–28 (2024)

[13] Feldmann, J., Youngblood, N., Karpov, M., Gehring, H., Li, X., Stappers, M., Le Gallo, M., Fu, X., Lukashchuk, A., Raja, A.S., Liu, J., Wright, C.D., Sebastian, A., Kippenberg, T.J., Pernice, W.H.P., Bhaskaran, H.: Parallel convolutional processing using an integrated photonic tensor core. Nature 589(7840), 52–58 (2021)

[14] Liu, S., Xu, T., Wang, B., Wang, D., Xiao, Q., Fan, L., Huang, C.: Calibration-free and precise programming of large-scale ring resonator circuits. Optica 12(7), 1113 (2025)

[15] Yi, D., Zhao, C., Zhang, Z., Xu, H., Tsang, H.K.: Accelerating convolutional processing by harnessing channel shifts in arrayed waveguide gratings. Laser Photonics Rev. 19(1), 2400435 (2025)

[16] Cheng, J., Zhao, Y., Zhang, W., Zhou, H., Huang, D., Zhu, Q., Guo, Y., Xu, B., Dong, J., Zhang, X.: A small microring array that performs large complex-valued matrix-vector multiplication. Front. Optoelectron. 15(1), 15 (2022)

[17] Wu, B., Liu, S., Cheng, J., Dong, W., Zhou, H., Dong, J., Li, M., Zhang, X.: Real-Valued Optical Matrix Computing with Simplified MZI Mesh. Intell. Comput. 2, 0047 (2023)

[18] Teo, T.Y., Ma, X., Pastor, E., Wang, H., George, J.K., Yang, J.K.W., Wall, S., Miscuglio, M., Simpson, R.E., Sorger, V.J.: Programmable chalcogenide-based all-optical deep neural networks. Nanophotonics 11(17), 4073–4088 (2022)

[19] Cheng, Z., Ríos, C., Pernice, W.H.P., Wright, C.D., Bhaskaran, H.: On-chip photonic synapse. Sci. Adv. 3(9), e1700160 (2017)

[20] Brückerhoff-Plückelmann, F., Bente, I., Becker, M., Vollmar, N., Farmakidis, N., Lomonte, E., Lenzini, F., Wright, C.D., Bhaskaran, H., Salinga, M., Risse, B., Pernice, W.H.P.: Event-driven adaptive optical neural network. Sci. Adv. 9(42), eadi9127 (2023)

[21] Chen, C., Yang, Z., Wang, T., Wang, Y., Gao, K., Wu, J., Wang, J., Qiu, J., Tan, D.: Ultra-broadband all-optical nonlinear activation function enabled by MoTe2/optical waveguide integrated devices. Nat. Commun. 15(1), 9047 (2024)

[22] Zhou, Z., Liu, C., Zhao, W., Liu, J., Jiang, T., Peng, W., Xiong, J., Wu, H., Zhang, C., Ding, Y., Da Ros, F., Xu, X., Xu, K., Yan, S., Tang, M.: Ultrafast silicon/graphene optical nonlinear activator for neuromorphic computing. Adv. Opt. Mater. 12(34), 2401686 (2024)

[23] Mourgias-Alexandris, G., Tsakyridis, A., Passalis, N., Tefas, A., Vyrsokinos, K., Pleros, N.: An all-optical neuron with sigmoid activation function. Opt. Express 27(7), 9620–9630 (2019)

[24] Jha, A., Huang, C., Prucnal, P.R.: Reconfigurable all-optical nonlinear activation functions for neuromorphic photonics. Opt. Lett. 45(17), 4819–4822 (2020)

[25] Wu, B., Li, H., Tong, W., Dong, J., Zhang, X.: Low-threshold all-optical nonlinear activation function based on a Ge/Si hybrid structure in a microring resonator. Opt. Mater. Express 12(3), 970–980 (2022)

[26] Nezami, M.S., de Lima, T.F., Mitchell, M., Yu, S., Wang, J., Bilodeau, S., Zhang, W., Al-Qadasi, M., Taghavi, I., Tofini, A., Lin, S., Shastri, B.J., Prucnal, P.R., Chrostowski, L., Shekhar, S.: Packaging and interconnect considerations in neuromorphic photonic accelerators. IEEE J. Sel. Top. Quantum Electron. 29(2: Optical Computing), 1–11 (2023)

[27] Ashtiani, F., Geers, A.J., Aflatouni, F.: An on-chip photonic deep neural network for image classification. Nature 606(7914), 501–506 (2022)

[28] Zhang, J., Wu, B., Zhang, S., Cheng, J., Wang, Y., Zhou, H., Dong, J., Zhang, X.: Highly integrated all-optical nonlinear deep neural network for multi-thread processing. Adv. Photonics 7(4), 046003 (2025)

[29] Wu, B., Huang, C., Zhang, J., Zhou, H., Wang, Y., Dong, J., Zhang, X.: Scaling up for end-to-end on-chip photonic neural network inference. Light Sci. Appl. 14(1), 328 (2025)

[30] Brückerhoff-Plückelmann, F., Borras, H., Klein, B., Varri, A., Becker, M., Dijkstra, J., Brückerhoff, M., Wright, C.D., Salinga, M., Bhaskaran, H., Risse, B., Fröning, H., Pernice, W.: Probabilistic photonic computing with chaotic light. Nat. Commun. 15(1), 10445 (2024)

[31] Xu, J., Dong, W., Huang, Q., Zhang, Y., Yin, Y., Zhao, Z., Zeng, D., Gao, X., Gu, W., Yang, Z., Li, H., Han, X., Geng, Y., Zhai, K., Chen, B., Fu, X., Lei, L., Wu, X., Dong, J., Su, Y., Li, M., Liu, J., Zhu, N., Guo, X., Zhou, H., Wen, H., Qiu, K., Zhang, X.: Progress in silicon-based reconfigurable and programmable all-optical signal processing chips. Front. Optoelectron. 18(1), 10 (2025)

[32] Huang, T., Chen, L., Lu, M., Pan, J., Xu, C., Wang, P., Shum, P.P.: Rapid prediction of complex nonlinear dynamics in Kerr resonators using the recurrent neural network. Front. Optoelectron. 18(1), 19 (2025)

RIGHTS & PERMISSIONS

The author(s)
