Machine learning-based direct solver for one-to-many problems on temporal shaping of relativistic electron beams

Jinyu Wan; Yi Jiao

doi:10.1007/s11467-022-1205-y


Front. Phys. ›› 2022, Vol. 17 ›› Issue (6) : 64601. DOI: 10.1007/s11467-022-1205-y

RESEARCH ARTICLE


Abstract

To control the temporal profile of a relativistic electron beam to meet the requirements of various advanced scientific applications such as free-electron lasers and plasma wakefield acceleration, a widely used technique is to manipulate the dispersion terms, which turns out to be a one-to-many problem. Due to this intrinsic one-to-many property, the currently popular stochastic optimization approaches to temporal shaping may suffer from long computing times or may suggest only one solution. Here we propose a real-time solver for one-to-many problems of temporal shaping, with the aid of a semi-supervised machine learning method, the conditional generative adversarial network (CGAN). We demonstrate that the CGAN solver can learn the one-to-many dynamics and is able to accurately and quickly predict the required dispersion terms for different custom temporal profiles. This machine learning-based solver is expected to have the potential for wide application to one-to-many problems in other scientific fields.


Keywords

beam shaping / one-to-many problem / machine learning

Cite this article


Jinyu Wan, Yi Jiao. Machine learning-based direct solver for one-to-many problems on temporal shaping of relativistic electron beams. Front. Phys., 2022, 17(6): 64601 https://doi.org/10.1007/s11467-022-1205-y

1 Introduction

Interest in advanced scientific applications of particle accelerator user facilities such as free-electron lasers (FELs) [1-3], terahertz radiation [4,5] and plasma wakefield accelerators (PWFAs) [6,7] has grown considerably in the past decades. FELs and terahertz radiation are emerging as powerful imaging tools in various fields [8-11] such as physics, chemistry, biology and materials science. PWFAs have the potential to provide accelerating gradients up to the multi-GV/m level for high energy physics and photon science [12,13]. Such a gradient is much higher than that obtained with conventional radio-frequency-based accelerators. To realize these advanced accelerator applications, a critical issue is to provide relativistic electron beams of particular temporal shapes that improve the quality of the electron or photon beam, namely temporal shaping of electron beams [14-19]. For example, a linearly ramped beam is required by PWFAs to supply a high transformer ratio of the wakefield [14,15], and beams with flat-top temporal profiles are desired in FELs to obtain high radiation performance [16,17].

One widely used temporal shaping method is to let an electron bunch with an inhomogeneous energy distribution pass through an energy dispersion section, such as a bunch compressor consisting of several bending magnets, to realize various temporal profiles [20]. For beams with small transverse emittances and large energy spread, the contribution of the geometrical terms relating to the transverse emittances can be neglected in the transfer map, and the dispersion terms that couple to the energy spread tend to dominate the process [21]. Hence, for an initial beam having an inhomogeneous energy chirp, a desired custom temporal profile can be achieved by finding an appropriate combination of dispersion terms (see details in Section 2). Because the process is highly nonlinear, different combinations of dispersion terms may realize the same target profile, i.e., temporal shaping is intrinsically a one-to-many problem. Exploring multiple solutions instead of only one is beneficial for various scientific applications, e.g., beams with the same temporal profile but opposite energy chirp along the bunch can provide different benefits in FELs [22,23]. In addition, not all potential solutions are equally feasible to implement in practical experiments, e.g., one solution may require infeasibly stronger magnets than the others. Therefore, it is beneficial to derive multiple, if not all, potential solutions that result in almost the same target profile.

Grid scan [24] was first used to solve such a temporal shaping problem, realizing a ramped profile with a second-order approximation. However, because the process is highly nonlinear, the second-order approximation can be insufficient, and when higher-order longitudinal dispersion terms are included in the scan, the time cost of the grid scan grows exponentially with the number of terms. Later, it was shown [25-28] that such one-to-many problems can be solved more efficiently with stochastic optimization methods such as the genetic algorithm (GA) [29,30], particle swarm optimization (PSO) [31] and extremum seeking (ES) [32]. Nevertheless, the stochastic optimization process is still indirect and some challenging problems remain. For a single-objective optimization problem that has multiple potential solutions, the optimization sometimes stops when the first solution is found if the population size is not sufficiently large, resulting in the omission of other potential solutions that may be more feasible. Although one can implement additional constraints to check whether an obtained solution is feasible, the constrained optimization can be more difficult to solve and more computationally expensive [33]. In this case, the final results can depend strongly on the choice of initial solutions, so a warm start is critical for most of these optimization algorithms.

In recent years, machine learning (ML) has attracted increasing interest from accelerator experts as a powerful tool to reveal the complicated correlations between various accelerator parameters [34-42]. However, most of these ML applications are based on supervised ML models, which are only suited to capturing the map of one-to-one problems where one feature vector X has only one definite label vector Y. When a supervised ML model is trained with data samples of a one-to-many problem that has different labels for the same input feature vector, it tends to output the mean label value of the samples rather than the respective label value of each sample. For example, a supervised ML model is able to predict the temporal profile of an electron bunch from known accelerator settings [36]. For the inverse problem, namely predicting the accelerator settings for a desired custom temporal profile, supervised ML methods such as the multilayer perceptron may fail to give the right answer because multiple solutions possibly exist.
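The averaging behaviour described above can be illustrated with a minimal sketch. The forward map y = x² used here is a toy stand-in chosen only because it has two inverses per target; it is an assumption for illustration, not the beamline model, and all variable names are hypothetical.

```python
import numpy as np

# Toy one-to-many inverse problem (illustrative assumption, not the
# beamline model): the forward map is y = x**2, so each target y > 0
# has two valid inverses, x = +sqrt(y) and x = -sqrt(y).
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=20_000)
y = x**2

# For a narrow band of targets around y0, the single prediction that
# minimizes the mean squared error over the matching samples is the
# sample mean of their labels.
y0 = 0.49
band = np.abs(y - y0) < 0.01
mse_optimal_prediction = x[band].mean()

true_solutions = np.array([-np.sqrt(y0), np.sqrt(y0)])
forward_error = abs(mse_optimal_prediction**2 - y0)

print("MSE-optimal prediction:", mse_optimal_prediction)
print("true solutions:", true_solutions)
print("forward error of the averaged prediction:", forward_error)
```

The averaged prediction lies near zero, between the two valid inverses, and does not reproduce the target when passed back through the forward map.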

In this paper, we propose a direct solver for one-to-many problems of beam profile shaping with the aid of a special ML method, the conditional generative adversarial network (CGAN) [43]. Generative adversarial networks (GANs) [44] are emerging techniques of unsupervised and semi-supervised ML that have the potential to handle one-to-many problems [45,46]. GANs have become one of the state-of-the-art techniques for image synthesis [47], style transfer [48], and image super-resolution [49]. The purpose of a GAN is to train a generative model to discover and learn the regularities or patterns in the training data so that the model can be used to generate new samples that appear to be drawn from the training dataset. Instead of minimizing the mean error as in most supervised ML methods, GANs take a different approach to learning the correlations hidden in the training data, via the competition of a pair of neural networks, the generator and the discriminator. The generator is trained to create fake data samples that are as authentic as possible in order to fool the discriminator. The discriminator is trained to output the probability that an input sample is from the original training dataset rather than generated by the generator. The two neural networks are trained together as adversaries in a zero-sum game. Successful training of a GAN usually ends with the discriminator outputting the same probability for both real and fake samples, meaning that the generator is producing plausible examples. After training, a fixed-length random vector is drawn from a Gaussian distribution and used to seed the generative process. Each random vector corresponds to a point in the problem domain, so that the space of random vectors forms a compressed representation of the data distribution. For one-to-many problems, different solutions can correspond to different input vectors, such that a new random input vector provided to the generator can produce a new and different output sample.
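The adversarial balance described above can be made concrete with a small numerical sketch of the two losses. It assumes the common binary cross-entropy form with a non-saturating generator loss; the text does not state which loss variant is used in this work, and the function names are hypothetical.

```python
import numpy as np

def bce(p, label, eps=1e-12):
    """Mean binary cross-entropy of probabilities p against a constant label."""
    p = np.clip(p, eps, 1.0 - eps)
    return float(np.mean(-(label * np.log(p) + (1 - label) * np.log(1 - p))))

def discriminator_loss(d_real, d_fake):
    # D wants real samples scored as 1 and generated samples scored as 0.
    return bce(d_real, 1.0) + bce(d_fake, 0.0)

def generator_loss(d_fake):
    # Non-saturating form (an assumption): G wants its samples scored as 1.
    return bce(d_fake, 1.0)

# Early training: D separates real from fake easily, so its loss is small.
early = discriminator_loss(np.full(64, 0.95), np.full(64, 0.05))
# Balanced end state: D outputs 0.5 for every sample, real or fake.
balanced = discriminator_loss(np.full(64, 0.5), np.full(64, 0.5))

print("early D loss:", early)
print("balanced D loss:", balanced, "= 2 ln 2 =", 2 * np.log(2))
print("balanced G loss:", generator_loss(np.full(64, 0.5)))
```

In the balanced state the discriminator loss equals 2 ln 2, which is the value reached when it can no longer tell real from generated samples.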

CGAN is an important extension to the GANs. Compared to GANs that generate new samples from a random input noise vector, a CGAN can generate samples of a given type by making both the generator and the discriminator class-conditional. With an additional label added to the input data, both the generator and the discriminator receive additional information about the correlation of the samples and the given label so that the networks can synthesize samples with user specified content. This extended approach allowing one to direct the data generation process has proven to be effective to create images with a target class [50,51]. The power of CGAN to generate data samples for a given class implies the potential of solving inverse design problems like temporal profile shaping of electron beams.
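For orientation, the class-conditional game is commonly written as the value function below, following the original CGAN formulation [43]. The mapping of symbols onto this work is an interpretation rather than a formula stated in the text: x would stand for a set of dispersion terms, y for the temporal profile used as the condition, and z for the noise vector.

\begin{aligned} \min_{G} \max_{D} V(D, G) = {} & \mathbb{E}_{x \sim p_{\mathrm{data}}(x)} \left[ \log D(x \mid y) \right] \\ & + \mathbb{E}_{z \sim p_{z}(z)} \left[ \log \left( 1 - D(G(z \mid y)) \right) \right]. \end{aligned}

Because both networks see the same condition y, the generator is pushed to produce samples that are plausible for that particular condition, while different draws of z can select different plausible samples.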

The work is structured as follows. In Section 2, we review the temporal shaping problem and describe the CGAN solver. In Section 3, we present the results of using the CGAN solver to realize two typical temporal profiles and two other temporal profiles with greater scientific merit. The application of the CGAN solver to a more complicated situation in which the coherent synchrotron radiation (CSR) effect is considered is also investigated. Finally, conclusions are summarized in Section 4.

2 Methods

In a system of magnetic elements, a transfer map $M$ describes the relation of initial conditions $ζ^{i}$ and final conditions $ζ^{f}$, which can be symbolically written in the form $ζ^{f} = M ζ^{i}$. A Taylor map [21] that represents the final conditions as a Taylor series of the initial conditions can be described as

\begin{aligned} ζ_{j}^{f} = \sum_{k} R_{j k} ζ_{k}^{i} + \sum_{k l} T_{j k l} ζ_{k}^{i} ζ_{l}^{i} + \sum_{k l m} U_{j k l m} ζ_{k}^{i} ζ_{l}^{i} ζ_{m}^{i} + \cdots, \end{aligned} \qquad (1)

where R, T and U are the first-, second- and third-order transfer matrices, and j, k, l and m are the element indices of the coordinate.

It is empirically found that the CGAN is more effective in dealing with low-dimensional problems. Therefore, the transfer map here needs to be sparse. Fortunately, for beams with small transverse emittance and large energy spread, the contribution of geometrical terms can be neglected and the dispersion terms tend to dominate. A Taylor map that represents the final longitudinal coordinate $q_{z, f}$ of a charged particle as a Taylor series of the initial coordinates can be simplified as

\begin{aligned} q_{z, f} = {} & q_{z, 0} + R_{56} δ (q_{z, 0}) + T_{566} δ (q_{z, 0})^{2} \\ & + U_{5666} δ (q_{z, 0})^{3} + \cdots, \end{aligned} \qquad (2)

\begin{aligned} δ (q_{z, 0}) = h_{1} q_{z, 0} + h_{2} q_{z, 0}^{2} + h_{3} q_{z, 0}^{3} + \cdots, \end{aligned} \qquad (3)

where $q_{z, 0}$ is the initial longitudinal coordinate with respect to the bunch center, $R_{56}$, $T_{566}$ and $U_{5666}$ are the first-, second- and third-order longitudinal dispersion terms respectively, $δ = Δ E / E_{0}$ represents the energy deviation relative to the nominal beam energy $E_{0}$, and $h_{1}$, $h_{2}$, $h_{3}$, ... are the first-, second-, third- and higher-order energy chirps respectively. From Eq. (2), for an initial beam with a specific energy chirp, a desired custom temporal profile can be achieved at the exit of the bunch compressor by using an appropriate combination of longitudinal dispersion terms.
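As a worked illustration of why several dispersion settings can give the same profile, consider Eqs. (2) and (3) truncated to first order. The symbols $σ_{0}$, $σ_{f}$ (initial and final rms bunch lengths) and $C$ (compression ratio) are introduced only for this sketch, and the uncorrelated energy spread is ignored.

\begin{aligned} q_{z, f} & \approx (1 + h_{1} R_{56}) \, q_{z, 0}, \qquad σ_{f} = \left| 1 + h_{1} R_{56} \right| σ_{0}, \\ C & \equiv \frac{σ_{0}}{σ_{f}} \;\Rightarrow\; 1 + h_{1} R_{56} = \pm \frac{1}{C} \;\Rightarrow\; R_{56} = \frac{-1 \pm 1/C}{h_{1}}. \end{aligned}

The plus sign preserves the ordering of the particles (under-compression), while the minus sign reverses head and tail (over-compression). Both choices give the same bunch length, so even the linear problem admits two solutions for a given compression ratio.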

In our temporal shaping scheme (see Fig.1), the first-, second- and third-order longitudinal dispersion terms, i.e., $R_{56}$, $T_{566}$ and $U_{5666}$, labeled with the corresponding temporal profile, are taken as training data to train a CGAN. Here we consider the third-order approximation because the third-order longitudinal dispersion term was shown to be important in previous studies and it is rarely necessary to consider even higher-order terms for single-pass transport [52]. The CGAN used in this study consists of a pair of neural networks, the generator G and the discriminator D. The generator G is trained as a solver that can produce fake samples of longitudinal dispersion terms to realize new temporal profiles. Meanwhile, the discriminator D is trained to judge whether the given dispersion terms and temporal profile are drawn from the original training data or generated by G. D competes with G to force the fake samples generated by G to satisfy the distribution learned from the training data. By feeding the trained solver with the target temporal profile and a noise component, the solver is expected to predict different potential solutions to realize the target when multiple solutions exist.

Fig.1 Schematic diagram of the CGAN solver in this study. The color of the initial beam from black to white represents the charge density from low to high. The compressed temporal profile is obtained by letting the initial beam pass through a chicane having specific $R_{56}$, $T_{566}$ and $U_{5666}$. By feeding the custom temporal profiles and noise components to the trained generator of the CGAN, the CGAN solver is able to predict the required dispersion terms to realize the input target temporal profiles.


In this study, we use an initial electron beam (see Fig.1) with a Gaussian charge-density distribution and an inhomogeneous energy chirp, with the same beam parameters as in Ref. [17]. The initial beam is sent to a chicane-type magnetic compressor that has specific $R_{56}$, $T_{566}$ and $U_{5666}$ values, and the desired custom temporal profiles are expected to be obtained at the exit of the chicane. The layout of the chicane is shown in Fig.1. It is chosen such that the $R_{56}$, $T_{566}$ and $U_{5666}$ of the chicane can be adjusted over a large range by tuning the strengths of the magnets with a direct search method [53].

To produce training data for the CGAN, 10 000 sets of $R_{56}$, $T_{566}$ and $U_{5666}$ samples are stochastically generated within an empirically large enough range. The initial beam is sent to chicanes that have different stochastic $R_{56}$, $T_{566}$ and $U_{5666}$ values, and the final longitudinal coordinates of the beam at the exit of the chicanes are calculated with the accelerator simulation code Accelerator Toolbox [54]. The final temporal profile is then obtained and converted to a vector of length 200. The dispersion terms and the corresponding temporal profile are treated as the feature and the label of the real data samples, respectively. A uniform distribution with dimensionality 100 is defined, from which a noise component is randomly drawn as the noise vector and fed to G.
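The data-generation procedure above can be sketched as follows. This is a simplified stand-in that applies Eqs. (2) and (3) directly instead of tracking with Accelerator Toolbox. The beam distribution, chirp coefficients, sampling ranges, histogram window and noise range are all placeholder assumptions in arbitrary units, not the values used in this work, and the function names are hypothetical.

```python
import numpy as np

def final_coordinates(q0, chirp, r56, t566, u5666):
    """Eqs. (2)-(3): map initial to final longitudinal coordinates."""
    h1, h2, h3 = chirp
    delta = h1 * q0 + h2 * q0**2 + h3 * q0**3
    return q0 + r56 * delta + t566 * delta**2 + u5666 * delta**3

def temporal_profile(qf, n_bins=200, window=(-3.0, 3.0)):
    """Histogram final coordinates into a fixed-length, unit-sum profile."""
    counts, _ = np.histogram(qf, bins=n_bins, range=window)
    return counts / max(counts.sum(), 1)

rng = np.random.default_rng(1)

# Placeholder Gaussian beam and chirp (NOT the parameters of Ref. [17]).
q0 = rng.normal(0.0, 1.0, size=10_000)
chirp = (-1.0, 0.1, 0.05)

# Placeholder sampling ranges for (R56, T566, U5666).
n_samples = 500  # the text uses 10 000 sets
low = np.array([0.0, -0.5, -0.5])
high = np.array([2.0, 0.5, 0.5])
features = rng.uniform(low, high, size=(n_samples, 3))

# One length-200 profile (label) per set of dispersion terms (feature).
labels = np.stack([
    temporal_profile(final_coordinates(q0, chirp, *f)) for f in features
])

# Noise vectors for G: uniform with dimensionality 100; the range
# [-1, 1] is an assumption.
noise = rng.uniform(-1.0, 1.0, size=(n_samples, 100))

print(features.shape, labels.shape, noise.shape)
```

Each row of `features` paired with the matching row of `labels` plays the role of one real data sample, and a row of `noise` is what the generator would receive together with a target profile.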

The training of a CGAN is difficult because one has to maintain the balance between G and D. When the loss of either G or D converges quickly to zero while the other does not, the training fails [55]. For this particular case, it is found that D is much easier to train than G. To slow down the training of D, the learning rate of D is set to one tenth of that of G, i.e., 0.0001 and 0.001, respectively. The networks are trained with the Adam optimizer [56] for 10 000 epochs using the open-source ML library TensorFlow [57]. The training takes about 20 minutes on a personal computer with a GeForce RTX 2070 Super graphics card.

3 Beam temporal shaping results and discussion

3.1 Realization of two typical temporal profiles

Two temporal profiles that are common in bunch compression, i.e., a cusp-shaped profile and a double-horn profile [see Fig.2(c, d)], are used as target profiles to test the performance of our CGAN solver. To look into the details of how the CGAN solver solves the temporal shaping problems, a grid scan is first performed in the $R_{56}$, $T_{566}$ and $U_{5666}$ space to find all potential solutions for each of the two test target profiles. According to the $100 \times 100 \times 100$ grid scan within an empirically large range [see Fig.2(a, b)], at least two potential solutions that have almost the same highest objective performance are found for each target profile, which suggests that both temporal shaping problems are one-to-many. Besides the potential solutions, plenty of local optima are also observed in the variable space of the double-horn profile, indicating that the bunch compression process for the double-horn profile has a stronger nonlinearity.

Fig.2 Using stochastic optimization methods to solve temporal shaping problems. (a) and (b) show the grid scan results and the evolutionary trajectories of the PSO and the surrogate model-based PSO in the variable space for a cusp-shaped profile and a double-horn profile, respectively. The z-axis in (a) and (b) is −log(objective function), representing the objective performance. The contour maps of the grid scan results are plotted at the bottom, where the color from blue to yellow represents the objective performance from low to high. For each $R_{56}$-$T_{566}$ grid point in (a) and (b), only the $U_{5666}$ with the highest objective performance is plotted. Two separate predictions of the CGAN solver are also shown in (a) and (b) for comparison. (c) and (d) are the final temporal profiles obtained with the optimization methods for the cusp-shaped profile and the double-horn profile, respectively. $R^{2}$ is the coefficient of determination with respect to the target temporal profile.


Before implementing the CGAN solver, we first use the current state-of-the-art approaches for temporal shaping, the stochastic optimization methods, to realize the two test target profiles. After comparing various optimization methods, the PSO, which shows the best performance, is used to solve the two test problems. The objective is to minimize the mean square error with respect to the target profile, and the free variables in this optimization are the $R_{56}$, $T_{566}$ and $U_{5666}$ of the bunch compressor. All the optimized variables are normalized to the range [0, 1]. The initial population consists of 300 solutions randomly generated from a uniform distribution within the variable range. The velocity weight factor is set to 0.4, and the acceleration coefficients of the group best experience and the personal experience are both set to 1. In addition, considering that the population size can significantly affect the optimization performance, we also use a surrogate model-based PSO similar to the method proposed in Ref. [41] to optimize the test problems with a larger population size. In this method, an ML surrogate model is trained to evaluate the candidate solutions several times faster than numerical simulation. With the aid of the surrogate model, the population size can be extended to ten times that of the standard PSO (i.e., 3000) while the computing time remains at the same order of magnitude. To reduce the influence of randomness, the optimization is repeated five times, and only the best solutions obtained among the repeated tests are selected.
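The PSO settings listed above (300 particles, variables in [0, 1], velocity weight 0.4, both acceleration coefficients equal to 1) can be sketched as a minimal global-best PSO. The objective here is a toy mean square error against a hypothetical parameter vector, standing in for the mean square error between simulated and target profiles. The iteration count, zero initial velocities and clipping at the bounds are assumptions that the text does not specify.

```python
import numpy as np

def pso(objective, dim=3, n_particles=300, n_iter=100,
        w=0.4, c1=1.0, c2=1.0, seed=0):
    """Minimal global-best PSO on the unit cube [0, 1]^dim."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.0, 1.0, size=(n_particles, dim))  # initial population
    v = np.zeros_like(x)                                # assumed zero start
    pbest_x = x.copy()
    pbest_f = np.array([objective(p) for p in x])
    g = int(np.argmin(pbest_f))
    gbest_x, gbest_f = pbest_x[g].copy(), float(pbest_f[g])
    history = [gbest_f]
    for _ in range(n_iter):
        r1 = rng.random(x.shape)
        r2 = rng.random(x.shape)
        # Inertia + pull toward personal best + pull toward group best.
        v = w * v + c1 * r1 * (pbest_x - x) + c2 * r2 * (gbest_x - x)
        x = np.clip(x + v, 0.0, 1.0)  # keep variables in the normalized range
        f = np.array([objective(p) for p in x])
        improved = f < pbest_f
        pbest_x[improved] = x[improved]
        pbest_f[improved] = f[improved]
        g = int(np.argmin(pbest_f))
        if pbest_f[g] < gbest_f:
            gbest_x, gbest_f = pbest_x[g].copy(), float(pbest_f[g])
        history.append(gbest_f)
    return gbest_x, gbest_f, history

# Toy objective: hypothetical optimum in normalized (R56, T566, U5666).
target = np.array([0.3, 0.6, 0.5])

def toy_objective(p):
    return float(np.mean((p - target) ** 2))

best_x, best_f, history = pso(toy_objective)
print("best variables:", best_x, "best objective:", best_f)
```

This toy objective has a single optimum, so the swarm has nothing to miss. On a landscape with several near-equal optima, a single global best attracts the whole swarm, which is one way to picture how a run can end at one solution.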

The evolutionary trajectories of the optimization are plotted in the variable space [see Fig.2(a, b)], and the final temporal profiles obtained with the optimization are shown in Fig.2(c, d). The results in Fig.2(a) show that for the cusp-shaped profile, the PSO can find the two potential solutions that realize the target profile with high fitness. For the more complicated double-horn profile, however, the PSO finds only one potential solution and misses the other. When the population size is extended to 3000 with the aid of the surrogate model, both potential solutions for the double-horn profile are found. The results indicate that for a one-to-many problem, if the population size is not sufficiently large, the stochastic optimization methods may find one potential solution while missing others.

Then the CGAN solver is applied to the two test problems. The target profiles and the noise components are simultaneously fed to the trained solver, which produces multiple sets of fake $R_{56}$, $T_{566}$ and $U_{5666}$ samples. It is found that the fake samples converge to several points in the variable space, which represent multiple solutions of a temporal shaping problem. For each of the cusp-shaped and double-horn profiles, two separate solutions obtained with the CGAN solver are shown in the variable space in Fig.2(a, b). For the cusp-shaped profile, one CGAN solution is almost the same as the best solution obtained with the grid scan, and the other, which has slightly lower objective performance, is close to a solution obtained with the PSO. For the double-horn profile, the two CGAN predictions are close to the two highest peaks in the grid scan map.
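One possible way to turn a cloud of generated samples into a short list of distinct solutions is a simple greedy grouping, sketched below. The text does not describe how the separate solutions are extracted, so this procedure, the tolerance, and the synthetic samples standing in for generator outputs are all assumptions for illustration.

```python
import numpy as np

def distinct_solutions(samples, tol=0.1):
    """Greedy grouping: a sample joins the first group whose running
    mean lies within tol; otherwise it starts a new group."""
    centers, members = [], []
    for s in samples:
        for i, c in enumerate(centers):
            if np.linalg.norm(s - c) < tol:
                members[i].append(s)
                centers[i] = np.mean(members[i], axis=0)
                break
        else:
            centers.append(s.copy())
            members.append([s])
    return np.array(centers), [len(m) for m in members]

rng = np.random.default_rng(2)

# Synthetic stand-in for generator outputs in normalized
# (R56, T566, U5666): two hypothetical solutions plus small scatter.
centers_true = np.array([[0.30, 0.60, 0.50],
                         [0.70, 0.20, 0.40]])
which = rng.integers(0, 2, size=200)
fake = centers_true[which] + rng.normal(0.0, 0.005, size=(200, 3))

centers, counts = distinct_solutions(fake)
print("distinct solutions found:", len(centers))
print(centers, counts)
```

Each recovered center could then be checked with a forward simulation before it is accepted as a solution.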

The longitudinal phase space distributions resulting from the CGAN predictions are shown in Fig.3. It is found that the beams in Fig.3(b, d) are over-compressed, i.e., the head and tail of the beam are reversed. The over-compressed beam finally results in almost the same temporal profile as the under-compressed beam in Fig.3(a, c), with a determination coefficient close to 1.
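The determination coefficient used to compare an obtained profile with its target can be computed as in the sketch below. It assumes the standard definition R² = 1 − SS_res/SS_tot with the target profile as the reference data; the exact definition used for the figures is not given in the text, and the example profiles are synthetic.

```python
import numpy as np

def determination_coefficient(profile, target):
    """R^2 = 1 - SS_res / SS_tot, with the target profile as reference."""
    profile = np.asarray(profile, dtype=float)
    target = np.asarray(target, dtype=float)
    ss_res = np.sum((target - profile) ** 2)
    ss_tot = np.sum((target - target.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Synthetic length-200 profiles (placeholders, not simulation output).
t = np.linspace(-5.0, 5.0, 200)
target = np.exp(-t**2 / 2.0)
shifted = np.exp(-(t - 0.5) ** 2 / 2.0)

r2_same = determination_coefficient(target, target)
r2_shifted = determination_coefficient(shifted, target)
print("identical profile:", r2_same)
print("shifted profile:", r2_shifted)
```

A value of 1 means the obtained profile coincides with the target, and any mismatch lowers the value.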

Fig.3 Longitudinal phase space distribution and temporal profiles of two separate CGAN predictions. The left two columns represent the cusp-shaped profile, and the right two columns represent the double-horn profile, respectively.


The results in Fig.2 and Fig.3 indicate that the CGAN solver can predict the longitudinal dispersion terms to realize the desired custom temporal profiles with high accuracy. Furthermore, the CGAN solver is able to give different solutions for the same input temporal profile when multiple solutions exist, whereas stochastic optimization methods sometimes return only one solution when the population size is insufficiently large. Obtaining the multiple solutions in Fig.3 is crucial for temporal shaping since an under-compressed beam and an over-compressed beam can provide different benefits in scientific applications. For example, an over-compressed scheme has the potential to provide larger-bandwidth FEL radiation [23]. However, compared to the under-compressed beam, the over-compressed beam also leads to significant coherent synchrotron radiation (CSR) that can degrade the slice alignment and spoil the transverse emittance of the electron beam [22]. Besides, experienced operators or an additional evaluator can select from the obtained multiple solutions the one that is more feasible to implement in practical experiments. For instance, it is found that to realize the cusp-shaped profile, the octupole strength required to achieve the longitudinal dispersion terms of one obtained solution is significantly higher than that of the other solution ($-15.7\ \mathrm{m}^{-3}$ and $0.4\ \mathrm{m}^{-3}$, respectively). Such strong octupoles are not trivial to build and make the beamline highly sensitive during optimization and operation.

Fig.4 shows the average solving time for the different methods. For any target profile, it takes dozens of minutes to perform the optimization with the stochastic optimization methods, and the optimization results are not reusable for a new target profile. This optimization time can be much longer when complicated effects such as space charge and CSR are considered. In contrast, once the CGAN is trained, it needs only a fraction of a second to directly predict the dispersion terms for any new temporal profile, which can be several orders of magnitude faster than the stochastic optimization methods.

Fig.4 The average solving time with the CGAN solver, the PSO and the surrogate model-based PSO for five repeated tests.


3.2 Consideration of CSR effect

When a high-energy electron beam is compressed to be denser and shorter, the CSR effect becomes significant and plays an important role in the beam dynamics. To explore the effectiveness of the CGAN solver in shaping the beam profile under a strong CSR effect, the dispersion terms and corresponding lattice settings used above are converted to ELEGANT [58] lattice files with the 1D CSR effect taken into account. The charge of the beam is set to 500 pC and the central energy is 1 GeV. It takes several hours to generate about 10 000 data samples on a personal workstation. As in the work above, we train a new CGAN (with the same hyperparameter settings) with the new tracking results as training data.

The same target profiles, i.e., the cusp-shaped and double-horn profiles, are used to test the performance of the CGAN solver. As Fig.5 shows, for the cusp-shaped profile, the beam profile obtained with the CGAN solution generally matches the target profile but is slightly wider than the target. For the double-horn profile, the head of the beam matches the target profile but a mismatch is observed at the tail of the beam. This may be because the target profiles are generated by non-CSR tracking, and it may be difficult to produce the same beam profile when the CSR effect is considered. In our experience of using CGAN, when an unattainable target is given as the input label, the CGAN generates samples that match the target as closely as possible, like the results in Fig.5. The tolerance of the CGAN to the realizability of the given target might vary from case to case and is challenging to measure since the neural network itself is a “black box”. Considering this challenge, in exploratory research, stochastic optimization methods like PSO remain a general solution that cannot simply be replaced by the CGAN solver. In applying the CGAN solver, it is recommended, if possible, that an expert in the application domain ensure that the given targets are not too far from reality.

Fig.5 Beam profiles obtained with CGAN solutions for two target profiles with the CSR effect taken into account. (a) and (b) represent a cusp-shaped target and a double-horn target, respectively.


3.3 Realization of two temporal profiles with scientific merits

In addition to the above two common temporal profiles, two other temporal profiles, namely the flat-top profile and the linearly ramped profile (see Fig.6), are also studied. These two additional temporal profiles are more frequently used in various scientific applications and have greater scientific merit. The flat-top temporal profile is desired in FELs to reduce the third-order curvature in the time-energy correlation due to wakefields, so as to obtain better FEL performance with improved pulse energy, peak power and bandwidth control [16]. The linearly ramped temporal profile is regarded as the optimal shape of the drive beam in PWFAs to supply a high transformer ratio because it maximizes the energy that can be gained by a trailing particle accelerated in its wakefield [14].

Fig.6 The temporal profiles obtained with the CGAN solver for the flat-top profile (a) and the linearly ramped profile (b). The red/blue solid lines represent the results predicted by the CGAN model without/with the CSR effect considered in the training.


The two additional test temporal profiles are fed to the same trained CGAN solvers (the solvers trained with and without the CSR effect are used separately) to predict multiple fake samples of longitudinal dispersion terms. The final temporal profiles resulting from the CGAN solutions are illustrated in Fig.6. When the CSR effect is not considered, the CGAN solver can predict the longitudinal dispersion terms with high accuracy, resulting in almost the same temporal profiles as the target profiles, especially for the linearly ramped profile. For the flat-top profile, horns occur at the head and the tail of the compressed beam, which cannot be completely eliminated due to the nature of bunch compression with a single chicane compressor. The horns may be further flattened with an additional bunch compressor [59], which is, however, beyond the scope of this study. Nevertheless, the FWHM of the horns is very narrow compared with the flat part of the bunch when the CSR effect is not considered. When the CSR effect is taken into account, the CGAN predictions result in compressed beams with durations close to those of the target profiles. However, it seems difficult to shape the beam into the same regular shapes as the target profiles with the CGAN solver. One explanation might be that under a strong CSR effect, the beam itself cannot form these regular target profiles after travelling through such a simple chicane compressor. Nevertheless, if necessary, the CGAN solutions can also be used as good initial solutions for further exploration of the potential solutions for the target profiles.

4 Conclusion

We have proposed a CGAN solver for one-to-many problems of temporal shaping of electron beams. By learning from the stochastically generated data, a trained CGAN solver can quickly and accurately predict available combinations of the dispersion terms up to the third order to realize desired custom temporal profiles. For one-to-many problems, the CGAN can predict many, if not all, of the potential solutions simultaneously by receiving different input noise vectors. Once the CGAN is trained, it can produce the required dispersion terms for a new target profile within a fraction of a second, which is orders of magnitude faster than the conventional stochastic optimization methods.

This method is readily transferable to other similar problems, for instance, photon pulse shaping and transverse phase space manipulation of an electron bunch. We expect that the CGAN solver can serve as a direct and real-time method to solve one-to-many problems in more scientific applications.

References


[1]	P. Emma, R. Akre, J. Arthur, R. Bionta, C. Bostedt, J. Bozek, A. Brachmann, P. Bucksbaum, R. Coffee, F. J. Decker, Y. Ding, D. Dowell, S. Edstrom, A. Fisher, J. Frisch, S. Gilevich, J. Hastings, G. Hays, Ph. Hering, Z. Huang, R. Iverson, H. Loos, M. Messerschmidt, A. Miahnahri, S. Moeller, H. D. Nuhn, G. Pile, D. Ratner, J. Rzepiela, D. Schultz, T. Smith, P. Stefan, H. Tompkins, J. Turner, J. Welch, W. White, J. Wu, G. Yocky, J. Galayda. First lasing and operation of an ångstrom-wavelength free-electron laser. Nat. Photonics, 2010, 4(9): 641

[2]	E. Allaria, D. Castronovo, P. Cinquegrana, P. Craievich, M. Dal Forno, M. B. Danailov, G. D’Auria, A. Demidovich, G. De Ninno, S. Di Mitri, B. Diviacco, W. M. Fawley, M. Ferianis, E. Ferrari, L. Froehlich, G. Gaio, D. Gauthier, L. Giannessi, R. Ivanov, B. Mahieu, N. Mahne, I. Nikolov, F. Parmigiani, G. Penco, L. Raimondi, C. Scafuri, C. Serpico, P. Sigalotti, S. Spampinati, C. Spezzani, M. Svandrlik, C. Svetina, M. Trovo, M. Veronese, D. Zangrando, M. Zangrando. Two-stage seeded soft-X-ray free-electron laser. Nat. Photonics, 2013, 7(11): 913

[3]	N. Huang, H. Deng, B. Liu, D. Wang, Z. Zhao. Features and futures of X-ray free-electron lasers. The Innovation, 2021, 2: 100097

[4]	S. Bielawski, C. Evain, T. Hara, M. Hosaka, M. Katoh, S. Kimura, A. Mochihashi, M. Shimada, C. Szwaj, T. Takahashi, Y. Takashima. Tunable narrowband terahertz emission from mastered laser–electron beam interaction. Nat. Phys., 2008, 4(5): 390

[5]	H. Tang, L. Zhao, P. Zhu, X. Zou, J. Qi, Y. Cheng, J. Qiu, X. Hu, W. Song, D. Xiang, J. Zhang. Stable and scalable multistage terahertz-driven particle accelerator. Phys. Rev. Lett., 2021, 127(7): 074801

[6]	X. Q. Yan, C. Lin, H. Y. Lu, K. Zhu, Y. B. Zou, H. Y. Wang, B. Liu, S. Zhao, J. Zhu, Y. X. Geng, H. Z. Fu, Y. Shang, C. Cao, Y. R. Shou, W. Song, Y. R. Lu, Z. X. Yuan, Z. Y. Guo, X. T. He, J. E. Chen. Recent progress of laser driven particle acceleration at Peking University. Front. Phys., 2013, 8(5): 577

[7]	E. Gschwendtner, P. Muggli. Plasma wakefield accelerators. Nat. Rev. Phys., 2019, 1(4): 246

[8]	B. E. Cole, J. B. Williams, B. T. King, M. S. Sherwin, C. R. Stanley. Coherent manipulation of semiconductor quantum bits with terahertz radiation. Nature, 2001, 410(6824): 60

[9]	S. P. Hau-Riege, R. A. London, R. M. Bionta, M. A. McKernan, S. L. Baker, J. Krzywinski, R. Sobierajski, R. Nietubyc, J. B. Pelka, M. Jurek, L. Juha, J. Chalupský, J. Cihelka, V. Hájková, A. Velyhan, J. Krása, J. Kuba, K. Tiedtke, S. Toleikis, T. Tschentscher, H. Wabnitz, M. Bergh, C. Caleman, K. Sokolowski-Tinten, N. Stojanovic, U. Zastrau. Damage threshold of inorganic solids under free-electron-laser irradiation at 32.5 nm wavelength. Appl. Phys. Lett., 2007, 90(17): 173128

[10]	G. J. Wilmink, J. E. Grundt. Current state of research on biological effects of terahertz radiation. Int. J. Infrared Millim. Terahertz Waves, 2011, 32(10): 1074

[11]	M. L. Grünbein, J. Bielecki, A. Gorel, M. Stricker, R. Bean, M. Cammarata, K. Dörner, L. Fröhlich, E. Hartmann, S. Hauf, M. Hilpert, Y. Kim, M. Kloos, R. Letrun, M. Messerschmidt, G. Mills, G. N. Kovacs, M. Ramilli, C. M. Roome, T. Sato, M. Scholz, M. Sliwa, J. Sztuk-Dambietz, M. Weik, B. Weinhausen, N. Al-Qudami, D. Boukhelef, S. Brockhauser, W. Ehsan, M. Emons, S. Esenov, H. Fangohr, A. Kaukher, T. Kluyver, M. Lederer, L. Maia, M. Manetti, T. Michelat, A. Münnich, F. Pallas, G. Palmer, G. Previtali, N. Raab, A. Silenzi, J. Szuba, S. Venkatesan, K. Wrona, J. Zhu, R. B. Doak, R. L. Shoeman, L. Foucar, J.-P. Colletier, A. P. Mancuso, T. R. M. Barends, C. A. Stan, I. Schlichting. Megahertz data collection from protein microcrystals at an X-ray free-electron laser. Nat. Commun., 2018, 9: 3487

[12]	M. Litos, E. Adli, W. An, C. I. Clarke, C. E. Clayton, S. Corde, J. P. Delahaye, R. J. England, A. S. Fisher, J. Frederico, S. Gessner, S. Z. Green, M. J. Hogan, C. Joshi, W. Lu, K. A. Marsh, W. B. Mori, P. Muggli, N. Vafaei-Najafabadi, D. Walz, G. White, Z. Wu, V. Yakimenko, G. Yocky. High-efficiency acceleration of an electron beam in a plasma wakefield accelerator. Nature, 2014, 515(7525): 92

[13]	D. P. Anderle, V. Bertone, X. Cao, L. Chang, N. Chang, G. Chen, X. Chen, Z. Chen, Z. Cui, L. Dai, W. Deng, M. Ding, X. Feng, C. Gong, L. Gui, F. K. Guo, C. Han, J. He, T. J. Hou, H. Huang, Y. Huang, K. I. Kumerički, L. P. Kaptari, D. Li, H. Li, M. Li, X. Li, Y. Liang, Z. Liang, C. Liu, C. Liu, G. Liu, J. Liu, L. Liu, X. Liu, T. Liu, X. Luo, Z. Lyu, B. Ma, F. Ma, J. Ma, Y. Ma, L. Mao, C. Mezrag, H. Moutarde, J. Ping, S. Qin, H. Ren, C. D. Roberts, J. Rojo, G. Shen, C. Shi, Q. Song, H. Sun, P. Sznajder, E. Wang, F. Wang, Q. Wang, R. Wang, R. Wang, T. Wang, W. Wang, X. Wang, X. Wang, J. Wu, X. Wu, L. Xia, B. Xiao, G. Xiao, J. J. Xie, Y. Xie, H. Xing, H. Xu, N. Xu, S. Xu, M. Yan, W. Yan, W. Yan, X. Yan, J. Yang, Y. B. Yang, Z. Yang, D. Yao, Z. Ye, P. Yin, C. P. Yuan, W. Zhan, J. Zhang, J. Zhang, P. Zhang, Y. Zhang, C. H. Chang, Z. Zhang, H. Zhao, K. T. Chao, Q. Zhao, Y. Zhao, Z. Zhao, L. Zheng, J. Zhou, X. Zhou, X. Zhou, B. Zou, L. Zou. Electron-ion collider in China. Front. Phys., 2021, 16(6): 64701

[14]	K. L. F. Bane, P. Chen, P. B. Wilson. On collinear wake field acceleration. IEEE Trans. Nucl. Sci., 1985, 32(5): 3524

[15]	R. J. England, J. B. Rosenzweig, G. Travish. Generation and measurement of relativistic electron bunches characterized by a linearly ramped current profile. Phys. Rev. Lett., 2008, 100(21): 214802

[16]	Y. Ding, K. L. F. Bane, W. Colocho, F. J. Decker, P. Emma, J. Frisch, M. W. Guetg, Z. Huang, R. Iverson, J. Krzywinski, H. Loos, A. Lutman, T. J. Maxwell, H. D. Nuhn, D. Ratner, J. Turner, J. Welch, F. Zhou. Beam shaping to improve the free-electron laser performance at the Linac coherent light source. Phys. Rev. Accel. Beams, 2016, 19(10): 100703

[17]	T. K. Charles, D. M. Paganin, M. J. Boland, R. T. Dowd, in: Proceedings of 8th International Particle Accelerator Conference, Copenhagen, Denmark, May 2017, paper MOPIK055, pp 644–647

[18]	N. Sudar, P. Musumeci, I. Gadjev, Y. Sakai, S. Fabbri, M. Polyanskiy, I. Pogorelsky, M. Fedurin, C. Swinson, K. Kusche, M. Babzien, M. Palmer. Demonstration of cascaded modulator-chicane microbunching of a relativistic electron beam. Phys. Rev. Lett., 2018, 120(11): 114802

[19]	V. Shpakov, M. P. Anania, M. Bellaveglia, A. Biagioni, F. Bisesto, F. Cardelli, M. Cesarini, E. Chiadroni, A. Cianchi, G. Costa, M. Croia, A. Del Dotto, D. Di Giovenale, M. Diomede, M. Ferrario, F. Filippi, A. Giribono, V. Lollo, M. Marongiu, V. Martinelli, A. Mostacci, L. Piersanti, G. Di Pirro, R. Pompili, S. Romeo, J. Scifo, C. Vaccarezza, F. Villa, A. Zigler. Longitudinal phase-space manipulation with beam-driven plasma wakefields. Phys. Rev. Lett., 2019, 122(11): 114801

[20]	E. L. Saldin, E. A. Schneidmiller, M. V. Yurkov. An analytical description of longitudinal phase space distortions in magnetic bunch compressors. Nucl. Instrum. Methods Phys. Res. Sect. A, 2002, 483: 516

[21]	K. L. Brown, Stanford Linear Accelerator Center: SLAC report No. 75, 1971

[22]	M. W. Guetg, B. Beutner, E. Prat, S. Reiche. Optimization of free electron laser performance by dispersion-based beam-tilt correction. Phys. Rev. Spec. Top. Accel. Beams, 2015, 18(3): 030701

[23]	E. Prat, P. Dijkstal, E. Ferrari, S. Reiche. Demonstration of large bandwidth hard X-ray free-electron laser pulses at SwissFEL. Phys. Rev. Lett., 2020, 124(7): 074801

[24]	P. Piot, C. Behrens, C. Gerth, M. Dohlus, F. Lemery, D. Mihalcea, P. Stoltz, M. Vogt. Generation and characterization of electron bunches with ramped current profiles in a dual-frequency superconducting linear accelerator. Phys. Rev. Lett., 2012, 108(3): 034801

[25]	T. K. Charles, D. M. Paganin, R. T. Dowd. Caustic-based approach to understanding bunching dynamics and current spike formation in particle bunches. Phys. Rev. Accel. Beams, 2016, 19(10): 104402

[26]	D. J. Dunning, J. K. Jones, H. M. Castaneda Cortés, in: Proceedings of the 39th Free Electron Laser Conference, Hamburg, Germany, August 2019, pp 711–714

[27]	Y. Ding, K. L. F. Bane, Y. M. Nosochkov, in: Proceedings of the 39th Free Electron Laser Conference, Hamburg, Germany, August 2019, pp 661–664

[28]	F. Mayet, R. Assmann, F. Lemery. Longitudinal phase space synthesis with tailored 3D-printable dielectric-lined waveguides. Phys. Rev. Accel. Beams, 2020, 23(12): 121302

[29]	L. Yang, D. Robin, F. Sannibale, C. Steier, W. Wan. Global optimization of an accelerator lattice using multiobjective genetic algorithms. Nucl. Instrum. Methods Phys. Res. Sect. A, 2009, 609: 50

[30]	J. Wu, N. Hu, H. Setiawan, X. Huang, T. O. Raubenheimer, Y. Jiao, G. Yu, A. Mandlekar, S. Spampinati, K. Fang, C. Chu, J. Qiang. Multi-dimensional optimization of a terawatt seeded tapered free electron laser with a multi-objective genetic algorithm. Nucl. Instrum. Methods Phys. Res. Sect. A, 2017, 846: 56

[31]	X. Huang, J. Safranek. Nonlinear dynamics optimization with particle swarm and genetic algorithms for SPEAR3 emittance upgrade. Nucl. Instrum. Methods Phys. Res. Sect. A, 2014, 757: 48

[32]	N. Bruchon, G. Fenu, G. Gaio, M. Lonza, F. A. Pellegrino, L. Saule. Free-electron laser spectrum evaluation and automatic optimization. Nucl. Instrum. Methods Phys. Res. Sect. A, 2017, 871: 20

[33]	T. Takahama, S. Sakai, in: Proceedings of 2006 IEEE International Conference on Evolutionary Computation, 2006, pp 1–8

[34]	M. Wielgosz, A. Skoczeń, M. Mertik. Using LSTM recurrent neural networks for monitoring the LHC superconducting magnets. Nucl. Instrum. Methods Phys. Res. Sect. A, 2017, 867: 40

[35]	Y. Li, W. Cheng, H. Y. Li, R. Rainer. Genetic algorithm enhanced by machine learning in dynamic aperture optimization. Phys. Rev. Accel. Beams, 2018, 21(5): 054601

[36]	C. Emma, A. Edelen, M. J. Hogan, B. O’Shea, G. White, V. Yakimenko. Machine learning-based longitudinal phase space prediction of particle accelerators. Phys. Rev. Accel. Beams, 2018, 21(11): 112802

[37]	A. Scheinker, A. Edelen, D. Bohler, C. Emma, A. Lutman. Demonstration of model-independent control of the longitudinal phase space of electron beams in the Linac-coherent light source with femtosecond resolution. Phys. Rev. Lett., 2018, 121(4): 044801

[38]	J. Wan, P. Chu, Y. Jiao, Y. Li. Improvement of machine learning enhanced genetic algorithm for nonlinear beam dynamics optimization. Nucl. Instrum. Methods Phys. Res. Sect. A, 2019, 946: 162683

[39]	S. C. Leemann, S. Liu, A. Hexemer, M. A. Marcus, C. N. Melton, H. Nishimura, C. Sun. Demonstration of machine learning-based model-independent stabilization of source properties in synchrotron light sources. Phys. Rev. Lett., 2019, 123(19): 194801

[40]	X. Xu, Y. Zhou, Y. Leng. Machine learning based image processing technology application in bunch longitudinal phase information extraction. Phys. Rev. Accel. Beams, 2020, 23(3): 032805

[41]	A. Edelen, N. Neveu, M. Frey, Y. Huber, C. Mayes, A. Adelmann. Machine learning for orders of magnitude speedup in multiobjective optimization of particle accelerator systems. Phys. Rev. Accel. Beams, 2020, 23(4): 044601

[42]	J. Wan, P. Chu, Y. Jiao. Neural network-based multiobjective optimization algorithm for nonlinear beam dynamics. Phys. Rev. Accel. Beams, 2020, 23(8): 081601

[43]	M. Mirza, S. Osindero, Conditional generative adversarial nets, arXiv: 1411.1784 (2014)

[44]	I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, Y. Bengio, Generative adversarial networks, arXiv: 1406.2661 (2014)

[45]	Y. L. Tuan, H. Y. Lee. Improving conditional sequence generative adversarial networks by stepwise evaluation. IEEE/ACM Trans. Audio Speech Lang. Process., 2019, 27(4): 788

[46]	V. Sandfort, K. Yan, P. J. Pickhardt, R. M. Summers. Data augmentation using generative adversarial networks (CycleGAN) to improve generalizability in CT segmentation tasks. Sci. Rep., 2019, 9(1): 16884

[47]	E. L. Denton, S. Chintala, A. Szlam, R. Fergus, Deep generative image models using a Laplacian pyramid of adversarial networks, arXiv: 1506.05751 (2015)

[48]	J. Y. Zhu, T. Park, P. Isola, A. A. Efros, in: Proceedings of 2017 IEEE International Conference on Computer Vision, 2017, pp 2242–2251

[49]	C. Ledig, L. Theis, F. Huszár, J. Caballero, A. Cunningham, A. Acosta, A. Aitken, A. Tejani, J. Totz, Z. Wang, W. Shi, in: Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp 105–114

[50]	S. Gurumurthy, R. K. Sarvadevabhatla, V. B. Radhakrishnan, in: Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp 4941–4949

[51]	P. Isola, J. Y. Zhu, T. Zhou, A. A. Efros, Image-to-image translation with conditional adversarial networks, in: Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp 5967–5976

[52]	R. J. England, J. B. Rosenzweig, G. Andonian, P. Musumeci, G. Travish, R. Yoder. Sextupole correction of the longitudinal transport of relativistic beams in dispersionless translating sections. Phys. Rev. Spec. Top. Accel. Beams, 2005, 8(1): 012801

[53]	J. C. Lagarias, J. A. Reeds, M. H. Wright, P. E. Wright. Convergence properties of the Nelder-Mead simplex method in low dimensions. SIAM J. Optim., 1998, 9(1): 112

[54]	A. Terebilo, in: Proceedings of the 2001 Particle Accelerator Conference, Vol. 4, 2001, pp 3203–3205

[55]	G. Rizzo, T. H. M. Van. Adversarial text generation with context adapted global knowledge and a self-attentive discriminator. Inf. Process. Manage., 2020, 57(6): 102217

[56]	D. P. Kingma, J. Ba, Adam: A method for stochastic optimization, arXiv: 1412.6980 (2014)

[57]	M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. S. Corrado, A. Davis, J. Dean, M. Devin, S. Ghemawat, I. Goodfellow, A. Harp, G. Irving, M. Isard, Y. Jia, R. Jozefowicz, L. Kaiser, M. Kudlur, J. Levenberg, D. Mane, R. Monga, S. Moore, D. Murray, C. Olah, M. Schuster, J. Shlens, B. Steiner, I. Sutskever, K. Talwar, P. Tucker, V. Vanhoucke, V. Vasudevan, F. Viegas, O. Vinyals, P. Warden, M. Wattenberg, M. Wicke, Y. Yu, X. Zheng, TensorFlow: Large-scale machine learning on heterogeneous distributed systems, arXiv: 1603.04467 (2016)

[58]	M. Borland, Technical Report No. LS-287, Argonne National Laboratory, 2000

[59]	T. K. Charles, D. M. Paganin, A. Latina, M. J. Boland, R. T. Dowd. Current-horn suppression for reduced coherent-synchrotron-radiation-induced emittance growth in strong bunch compression. Phys. Rev. Accel. Beams, 2017, 20(3): 030705

Acknowledgements

The authors thank Dr. Juhao Wu for helpful discussions and suggestions. This work was supported by the National Natural Science Foundation of China (No. 11922512), the Youth Innovation Promotion Association of the Chinese Academy of Sciences (No. Y201904), and the National Key R&D Program of China (No. 2016YFA0401900).