Bayesian method for fitting the low-energy constants in chiral perturbation theory

Hao-Xiang Pan, De-Kai Kong, Qiao-Yi Wen, Shao-Zhou Jiang

Front. Phys. ›› 2024, Vol. 19 ›› Issue (6) : 64203.

Front. Phys. ›› 2024, Vol. 19 ›› Issue (6) : 64203. DOI: 10.1007/s11467-024-1430-7
RESEARCH ARTICLE


Abstract

The values of the low-energy constants (LECs) are very important in chiral perturbation theory. This paper adopts a Bayesian method with truncation errors to globally fit eight next-to-leading order (NLO) LECs $L_i^r$ and the next-to-next-to-leading order (NNLO) LECs $C_i^r$. With the truncation errors estimated, the values of $L_i^r$ fitted at NLO and at NNLO are very close. The posterior distributions of $C_i^r$ indicate the boundary dependence of these $C_i^r$. Ten $C_i^r$ depend only weakly on the boundaries, and their values are reliable. The other $C_i^r$ require more experimental data to constrain their boundaries. Some linear combinations of $C_i^r$ are also fitted, with more reliable posterior distributions. If more precise values of some $C_i^r$ become known, some other $C_i^r$ can be obtained from them. With these fitted LECs, most observables show good convergence, except for the $\pi K$ scattering lengths $a_0^{3/2}$ and $a_0^{1/2}$. An example is also introduced to test the improvement of the method. All the computations indicate that considering the truncation errors greatly improves the global fit, and that more prior information yields better fitting results. This fitting method can be extended to other effective field theories and to perturbation theory.


Keywords

chiral perturbation theory / low-energy constants / Bayesian statistics

Cite this article

Hao-Xiang Pan, De-Kai Kong, Qiao-Yi Wen, Shao-Zhou Jiang. Bayesian method for fitting the low-energy constants in chiral perturbation theory. Front. Phys., 2024, 19(6): 64203 https://doi.org/10.1007/s11467-024-1430-7

1 Introduction

Effective field theory (EFT) is an important tool for describing interactions between particles at low energy scales. Chiral perturbation theory (ChPT) is one such EFT. It first focused on the low-energy strong interactions among the low-energy pseudoscalar mesons and was later extended to baryons and other mesons. ChPT is based on the $SU(3)_L\times SU(3)_R$ flavor symmetry in the chiral limit, in which the three lightest quarks are considered massless. The only constraints on the chiral Lagrangian are symmetries, such as charge conjugation symmetry, parity symmetry, and chiral symmetry. However, infinitely many independent terms satisfy these symmetries. The Weinberg power-counting scheme orders these terms by powers of the mesonic momentum ($p$) [1]. The leading-order (LO, $O(p^2)$) terms give the largest contributions, and they are considered first. For higher precision, the terms at the next-to-leading order (NLO, $O(p^4)$), the next-to-next-to-leading order (NNLO, $O(p^6)$), etc., are included successively. Each term contains a corresponding unknown parameter, called a low-energy constant (LEC), which encodes information about the effective strong interactions. For three-flavor ChPT, there are 2, 10+2, 90+4, and 1233+21 LECs at LO, NLO, NNLO and next-to-next-to-next-to-leading order ($O(p^8)$) [2–5], respectively. If all these LECs were known, all theoretical calculations could be evaluated numerically. However, the number of these LECs is too large, especially at high orders. Besides, the LECs cannot be fixed within ChPT itself. They are usually determined by other approaches, such as global fits [6–8], lattice QCD [9–11], the chiral quark model [12–15], resonance chiral theory [13–17], sum rules [18], holographic QCD [19], and dispersion relations [20–22]. Each method has its advantages and range of applicability. Until now, no approach can determine the exact values of these LECs.
This paper only focuses on the global fit method.
There has been a lot of research based on global fits, and some LECs up to NNLO have been fitted. Ref. [23] fits the $K_{\ell 4}$ form factors and the $\pi\pi$ scattering lengths to get the values of $L_1^r$, $L_2^r$ and $L_3^r$. Six years later, $L_i^r$ ($i=1,2,3,5,7,8$) are determined by fitting the quark mass ratio $m_s/\hat{m}$, the decay constant ratio $F_K/F_\pi$ and the $K_{\ell 4}$ form factor [24]. Another eleven years later, a new global fit appears, adding the $\pi\pi$ scattering lengths ($a_0^0$ and $a_0^2$), the $\pi K$ scattering lengths ($a_0^{1/2}$ and $a_0^{3/2}$) and the scalar form factor threshold parameters ($\langle r^2\rangle_S^\pi$ and $c_S^\pi$). $L_1^r$ to $L_8^r$ and some $C_i^r$ are obtained [6]. Ref. [7] adds some two-flavor LECs and updates the values of the LECs fitted in Ref. [6]. The last two references not only fit the NLO LECs $L_1^r$ to $L_8^r$, but also estimate a part of the NNLO LECs $C_i^r$. However, both of them ignore the higher-order truncated contributions. Ref. [8] proposes a geometric sequence model to introduce the higher-order truncated contributions. Its NLO fitting values of $L_i^r$ are very close to the NNLO fitting values in Ref. [7]. This is because a physical quantity contains not only the sum of the LO and NLO theoretical values, but also the sum of the higher-order contributions, which sometimes cannot be ignored when compared to the NLO contributions. If the NLO fit includes the higher-order contributions, $L_i^r$ will be closer to the true values. Hence, fitting $L_i^r$ at NLO and at NNLO yields closer results. This shows that the higher-order contributions indeed cannot simply be ignored in the ChPT fit. All the above references adopt a classical statistical method to fit the LECs. Theoretically, the precision of the fitting results depends on the amount and precision of the experimental data. In other words, more data and more precise data lead to more precise LECs. However, classical statistics has some problems, and some improvements are needed.
i) The geometric sequence model in Ref. [8] is too simple. The contributions at successive orders need not, in fact, form a geometric sequence. In addition, to estimate the NNLO contribution, the geometric sequence requires the LO and NLO contributions. However, the LO contribution is sometimes zero, so the NNLO contribution cannot be estimated. In some special cases, the NNLO contribution may be larger than the NLO one, as for the $\pi K$ scattering lengths $a_0^{1/2}$ and $a_0^{3/2}$ [6, 7]. Hence, Ref. [8] adopts a special approach to deal with this problem. In most cases, the two two-flavor NLO LECs $\bar{l}_{2,3}$ converge poorly. It takes a long time to fit them, about one day with 20 cores of an Intel Xeon Gold 6230 CPU. In addition, how to determine the sign of the NNLO contribution is also a problem. As a result, the model is not consistent across all physical quantities, and it is not a universal approach.
ii) The number of $C_i^r$ is much larger than the number of input experimental data, so there is an overfitting problem in the NNLO fit. Refs. [6, 7] adopt a random walk algorithm, but the result is boundary-dependent. A Monte Carlo method is used to fit the LECs in Ref. [8], but its efficiency is low. Moreover, the complicated errors of $C_i^r$ are hard to estimate, and they usually do not follow a normal distribution.
iii) Although the geometric sequence model gives a reasonable result in Ref. [8], it is hard to extend to other EFTs, for the reasons discussed above. Furthermore, it is also hard to compare different models in order to select the best one, because $\chi^2/\mathrm{d.o.f.}$ (degrees of freedom) is too small and an overfitting problem exists. It is hard to select the best model from several overfitted models by $\chi^2/\mathrm{d.o.f.}$ A more universal method requires a credible quantitative index, by which the best model can be selected.
iv) Refs. [6–8] treat the two-flavor NLO LECs $\bar{l}_i$ as independent input experimental data, but some $\bar{l}_i$ may depend on other experimental quantities. In fact, the $\pi\pi$ scattering lengths $a_0^0$ and $a_0^2$ depend on $\bar{l}_1$, $\bar{l}_2$ and $\bar{l}_4$ [25]. Hence, their covariance matrix needs to be considered.
v) Most importantly, before a global fit one already knows something about ChPT and the experimental data to be fitted, but this information is not explicitly embodied in the fit. For example, in the NLO fit, although the truncation errors are not known, other references have given some approximate values of the NNLO LECs. With these NNLO LECs, one can roughly obtain the signs of the truncation errors even in the NLO fit. If these signs are introduced into the NLO fit, the results may be more precise. Furthermore, ChPT assumes that the orders of magnitude of the LECs at a given chiral order are nearly the same. If this information is considered in the global fit, the range of the unknown LECs, and even the NNLO contribution, can be estimated. Simply speaking, more information may lead to a more precise result.
In addition to classical statistics, Bayesian statistics, which has been successful in artificial intelligence, can play a better role in the global fit of EFTs. Bayesian statistics makes good use of known information to give a more reasonable result. Even when the amount of data is small, Bayesian statistics can perform better than classical statistics. Ref. [26] applied Bayesian statistics to EFTs. It proposes two toy models and compares the results obtained by Bayesian and classical statistics, demonstrating the advantages of Bayesian statistics in EFTs. Later, Ref. [27] introduced Bayesian statistics into nuclear physics. A year later, a specific framework for using Bayesian statistics in EFTs appeared [28]. Subsequently, Refs. [29–57] used Bayesian statistics to calculate the magnitudes of truncation errors in different EFTs. This paper improves the approach in Ref. [8]. The new approach combines the framework of Bayesian statistics with Markov Chain Monte Carlo (MCMC). Some MCMC algorithms, such as the Metropolis-Hastings algorithm [58, 59], the Hamiltonian Monte Carlo algorithm [60] and the No-U-Turn Sampler algorithm [61], are used to fit the LECs with the help of the PyMC3 package [62]. The major improvements of the new approach and the motivations of this paper are as follows.
i) The geometric sequence is not required in the fit. It is replaced by a Bayesian method. Generally, the new method does not require assumptions about how ChPT converges.
ii) The approach is more general. Some examples are carried out to check whether the approach works well. The parameters in the examples are completely random. Hence, this approach can be used not only to fit the LECs in ChPT, but also in other EFTs and in perturbation theory.
iii) The time cost of this approach is greatly reduced with the help of MCMC. A better result is obtained within ten minutes.
iv) The covariance matrix given in Ref. [25] is included in the fit, so the correlations among the inputs are preserved.
v) The Bayesian method is applied fully in the fit. More information is included under reasonable assumptions where possible, such as assumptions about the signs and the orders of magnitude of the truncation errors.
vi) Although the number of input values is not large enough, clearer distributions of $C_i^r$ and more precise values of $L_i^r$ are obtained. In addition, the boundary dependence of $C_i^r$ can be seen more clearly.
This paper is organized as follows. Section 2 gives a brief introduction to Bayesian statistics and MCMC. In Section 3, two Bayesian models and some evaluation criteria are introduced. One model contains truncation errors, and the other does not. Some details of the calculation are also discussed. One example is studied in Section 4 in order to evaluate the above models. The input physical observables in ChPT are given in Section 5. In Section 6, some NLO and NNLO LECs are fitted by the above models, and a new set of LECs is obtained. Section 7 gives a summary and some discussion.

2 Bayesian statistics and MCMC

This section provides a brief introduction to fitting data with Bayesian statistics and MCMC. More details can be found in Refs. [26, 27]. Some of the content is very basic and can also be found in textbooks on probability theory and Bayesian analysis. For convenience, some parameters are given meanings in ChPT, but the approach has a much wider scope of application. They can be any parameters to be fitted in a problem.
Consider a general case in which some parameters need to be fitted from a set of data. $D=(D_1,D_2,D_3,\ldots,D_m)$ denotes a set of known input data. In physics, these are usually experimental data or physical constants. The $D_i$ are not assumed to be independent. $a=(a_1,a_2,a_3,\ldots,a_n)$ is a parameter vector. In physics, its components are usually the parameters to be fitted. In this paper, $a$ means the LECs. The rest of this section introduces an approach to fit $a$ by Bayesian statistics and MCMC. This approach is faster than Bayesian statistics without MCMC.
The core of Bayesian statistics is Bayes' formula
$$\mathrm{pr}(a|D)=\frac{\mathrm{pr}(D|a)\,\mathrm{pr}(a)}{\mathrm{pr}(D)}.\tag{1}$$
The terms in Eq. (1) have the following meanings.
i) $\mathrm{pr}(a)$ is the prior probability distribution function (PDF). It reflects the knowledge of $a$ before $D$ is observed. If one does not know anything about $a$, $\mathrm{pr}(a)$ is usually set to a uniform distribution. Usually, experiment and/or theory can give an approximate value. In most cases, at least the order of magnitude is known before fitting. Because of the introduction of $\mathrm{pr}(a)$, one might argue that Bayesian statistics is subjective. However, $\mathrm{pr}(a)$ is nothing more than a set of assumptions in the construction of a model. This is similar to a $\chi^2$ fit, which usually needs an initial value in a reasonable range.
ii) $\mathrm{pr}(D|a)$ is the likelihood function. It is related to $D$ and reflects the confidence in $D$ for a given $a$. It can be expressed as
$$\mathrm{pr}(D|a)=\exp\left\{-\frac{1}{2}\left[\mu_{\mathrm{th}}(a)-\bar{D}\right]^{T}(\Sigma_D)^{-1}\left[\mu_{\mathrm{th}}(a)-\bar{D}\right]\right\},\tag{2}$$
where $\mu_{\mathrm{th}}(a)$ is the theoretical expected value of the data, which depends on $a$. $\bar{D}$ is the expected value of the data, i.e., the experimental central value. $\Sigma_D$ is the covariance matrix of $D$. The errors and the correlation information of $D$ are contained in $\Sigma_D$.
iii) $\mathrm{pr}(a|D)$ is the posterior PDF. It is the result of the Bayesian analysis. It also reflects the full knowledge gained from $D$ within a fitting model. $\mathrm{pr}(a|D)$ gives the PDFs of $a$, not just some expected values. $\mathrm{pr}(a|D)$ can be viewed as an update of $\mathrm{pr}(a)$ after $D$ has been observed. In addition, $\mathrm{pr}(a|D)$ from one fit can be used as $\mathrm{pr}(a)$ in another fit after some new $D$ are appended.
iv) $\mathrm{pr}(D)$ is called the Bayesian evidence. It is also known as the marginal likelihood PDF. It means the average probability of $D$ in the fitting model. In addition, it can simply be treated as a normalization coefficient. Because a fit is concerned with the relative PDFs of $a$ rather than their absolute PDFs, this normalization coefficient does not play an important role in the fit. Ignoring $\mathrm{pr}(D)$, Bayes' formula can be expressed in a proportional form
$$\mathrm{pr}(a|D)\propto\mathrm{pr}(D|a)\,\mathrm{pr}(a).\tag{3}$$
Hence, $\mathrm{pr}(D|a)\,\mathrm{pr}(a)$ is also called the kernel of the posterior PDF.
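The Gaussian likelihood and the proportional form of Bayes' formula described above can be illustrated with a minimal sketch. It works with logarithms for numerical stability. The toy model `toy_model`, the standard normal prior, and all numerical values are assumptions chosen only for illustration and are not taken from the paper.

```python
import numpy as np

def log_likelihood(mu_th, d_bar, sigma_d):
    """Log of the Gaussian likelihood: -1/2 r^T Sigma_D^{-1} r, r = mu_th - d_bar."""
    r = np.asarray(mu_th, dtype=float) - np.asarray(d_bar, dtype=float)
    return -0.5 * r @ np.linalg.solve(np.asarray(sigma_d, dtype=float), r)

def log_prior(a):
    """Illustrative prior (assumption): independent standard normals, up to a constant."""
    a = np.asarray(a, dtype=float)
    return -0.5 * np.sum(a**2)

def log_posterior_kernel(a, model, d_bar, sigma_d):
    """Log of pr(D|a) pr(a); the evidence pr(D) is not needed."""
    return log_likelihood(model(a), d_bar, sigma_d) + log_prior(a)

def toy_model(a):
    """Hypothetical linear map from two parameters to two observables."""
    a = np.asarray(a, dtype=float)
    return np.array([1.0 + 0.5 * a[0], 2.0 + 0.5 * a[0] - 0.3 * a[1]])

d_bar = np.array([1.2, 2.1])                       # central values (made up)
sigma_d = np.array([[0.04, 0.01], [0.01, 0.09]])   # correlated errors (made up)
value = log_posterior_kernel([0.4, 0.2], toy_model, d_bar, sigma_d)
print(value)
```

Only differences of this quantity between parameter points matter for sampling, which is why the normalization can be dropped.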
There are several methods to determine $\mathrm{pr}(a|D)$ without $\mathrm{pr}(D)$, such as MCMC. We have tried three algorithms to generate the Markov chain, i.e., the Metropolis-Hastings algorithm [58, 59, 63], the Hamiltonian Monte Carlo algorithm [60] and the No-U-Turn Sampler algorithm [61]. The last two algorithms are a bit more complicated, but they are computationally more efficient. The details can be found in the above references. We have checked that all these algorithms give almost the same distribution. The No-U-Turn Sampler algorithm is the fastest one. It takes about half the time of the Metropolis-Hastings algorithm.
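As an illustration of how a Markov chain can sample the posterior without the evidence, the following is a minimal random-walk Metropolis-Hastings sketch for a single parameter. It is not the PyMC3 setup used in the paper. The target (one datum $D=1.0\pm0.5$ with $\mu_{\mathrm{th}}(a)=a$ and a standard normal prior), the step size, and the chain length are assumptions chosen so that the exact posterior, $N(0.8, 0.2)$, is known.

```python
import numpy as np

def log_kernel(a, d=1.0, sigma_d=0.5):
    # Gaussian likelihood with mu_th(a) = a, times a standard normal prior.
    return -0.5 * ((a - d) / sigma_d) ** 2 - 0.5 * a**2

def metropolis_hastings(log_target, start, n_steps, step_size, rng):
    """Random-walk Metropolis-Hastings with a symmetric Gaussian proposal."""
    chain = np.empty(n_steps)
    current, current_lp = start, log_target(start)
    for i in range(n_steps):
        proposal = current + step_size * rng.normal()
        proposal_lp = log_target(proposal)
        # Accept with probability min(1, target(proposal) / target(current)).
        if np.log(rng.uniform()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
        chain[i] = current
    return chain

rng = np.random.default_rng(0)
chain = metropolis_hastings(log_kernel, 0.0, 20000, 1.0, rng)
samples = chain[2000:]  # discard burn-in
print(samples.mean(), samples.std())
```

The acceptance test uses only the ratio of kernel values, so the normalization never enters. Gradient-based samplers such as Hamiltonian Monte Carlo replace the random-walk proposal but target the same distribution.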

3 Models and details

3.1 Preparation

The above section gives a general approach to fit the parameters $a$ in a known analytical relationship $\mu_{\mathrm{th}}(a)$ by Bayesian statistics and MCMC. However, in ChPT, this approach cannot be adopted directly, because the exact theoretical relationship $\mu_{\mathrm{th}}(a)$ is hard to obtain. It is usually calculated order by order,
$$\mu_{\mathrm{th}}(a)=\mu^{\mathrm{LO}}(a^{\mathrm{LO}})+\mu^{\mathrm{NLO}}(a^{\mathrm{LO}},a^{\mathrm{NLO}})+\mu^{\mathrm{NNLO}}(a^{\mathrm{LO}},a^{\mathrm{NLO}},a^{\mathrm{NNLO}})+\cdots,\tag{4}$$
where $\mu^{\mathrm{LO}}$, $\mu^{\mathrm{NLO}}$ and $\mu^{\mathrm{NNLO}}$ are the theoretical chiral expansions of $\mu_{\mathrm{th}}(a)$ at LO, NLO and NNLO, respectively. $a^{\mathrm{LO}}$, $a^{\mathrm{NLO}}$ and $a^{\mathrm{NNLO}}$ are the LO, NLO and NNLO LECs, respectively, such as $L_i^r$ and $C_i^r$. At present, the higher-order relationship $\mu^{\mathrm{HO}}(a)$ (i.e., the truncation error) is lacking, so this paper only considers the expansion up to NNLO. As discussed in the introduction, $\mu^{\mathrm{HO}}(a)$ may have a great impact on the results. Hence, it should be considered in the fit.
The introduction mentions that many references have discussed how to estimate the truncation errors, such as Ref. [29]. However, that approach cannot be used directly in the present case, because of some serious problems. Ref. [29] takes $\mu^{\mathrm{LO}}$, $\mu^{\mathrm{NLO}}$ and $\mu^{\mathrm{NNLO}}$ as known without errors and uses them to estimate the distribution of $\mu_{\mathrm{th}}$. In the present case, however, $\mu_{\mathrm{th}}$ with systematic errors and the analytical expressions of $\mu^{\mathrm{LO}}(a^{\mathrm{LO}})$, $\mu^{\mathrm{NLO}}(a^{\mathrm{LO}},a^{\mathrm{NLO}})$ and $\mu^{\mathrm{NNLO}}(a^{\mathrm{LO}},a^{\mathrm{NLO}},a^{\mathrm{NNLO}})$ are known, whereas $\mu^{\mathrm{LO}}$, $\mu^{\mathrm{NLO}}$, $\mu^{\mathrm{NNLO}}$, $a^{\mathrm{NLO}}$, $a^{\mathrm{NNLO}}$ and their distributions need to be fitted from $D$ and $\Sigma_D$. Ref. [29] computes the Bayesian evidence by a multidimensional integral (Eq. (8) in Ref. [29]). In several special cases, the Bayesian evidence can be integrated analytically, but it usually needs to be integrated numerically. A multidimensional numerical integral is usually hard to perform and may cost a lot of time. The MCMC approach, however, avoids determining the Bayesian evidence, and it is computationally faster. In addition, Ref. [29] requires Eq. (4) to converge order by order, but Refs. [6, 8] have already indicated that the NNLO contribution is larger than the NLO one, $\mu^{\mathrm{NLO}}<\mu^{\mathrm{NNLO}}$ in magnitude, for some physical quantities. Hence, a new approach is needed.
Generally, in an actual fit, some of $a^{\mathrm{LO}}$, $a^{\mathrm{NLO}}$ and $a^{\mathrm{NNLO}}$ may carry dimensions, and their values may be very small or very large. For example, the NNLO LECs $C_i$ are about $10^{-3}\,\mathrm{GeV}^{-2}$. For convenience, their dimensions are removed first. For example, most of the literature provides the dimensionless $C_i^r$ (defined in Ref. [64]) rather than $C_i$. Moreover, very small or very large values may lead to numerical errors. Hence, all LECs are divided by their order of magnitude, so that they become roughly 1. This can be done in an actual fit. For example, both experiment and theory indicate that $C_i^r$ is about $10^{-6}$. The order of magnitude of the LECs is regarded as a prior of the LECs in this paper. For convenience, all the quantities in this section are assumed to be dimensionless, and all $a^{\mathrm{LO}}$, $a^{\mathrm{NLO}}$ and $a^{\mathrm{NNLO}}$ are assumed to be roughly 1. In fact, the value 1 is not strict. As long as the value is not very large, the fit also works well. For convenience, $a^{\mathrm{LO}}$ is assumed to be known, and it does not need to be fitted in this section. If one wants to fit $a^{\mathrm{LO}}$, the procedure is no different from fitting $a^{\mathrm{NLO}}$ and $a^{\mathrm{NNLO}}$.
In the actual ChPT fit in this paper, the number of $D_i$ is less than the total number of $a^{\mathrm{LO}}$, $a^{\mathrm{NLO}}$ and $a^{\mathrm{NNLO}}$, so overfitting exists. Hence, some constraint conditions are introduced to reduce the parameter space. In order to account for the convergence of ChPT, some information about the higher orders needs to be introduced into Eq. (2). If one has no more information about the higher orders, in this paper, the quantities in Eq. (2) are modified to
$$\mu_{\mathrm{th}}(a)\to\begin{pmatrix}\mu_{\mathrm{th}}(a)\\ |\mu^{\mathrm{NLO}}/\mu^{\mathrm{LO}}|\\ |\mu^{\mathrm{NNLO}}/\mu^{\mathrm{LO}}|\end{pmatrix},\tag{5}$$
$$\bar{D}\to\begin{pmatrix}\bar{D}\\ 0.2\,I\\ 0.05\,I\end{pmatrix},\tag{6}$$
$$\Sigma_D\to\begin{pmatrix}\Sigma_D & & \\ & 0.2^2\,I & \\ & & 0.05^2\,I\end{pmatrix},\tag{7}$$
where $I$ means an identity matrix with a suitable dimension. These changes assume that $|\mu_i^{\mathrm{NLO}}/\mu_i^{\mathrm{LO}}|$ follows a normal distribution $N(0.2,0.2^2)$ ($\mu=0.2$, $\sigma^2=0.2^2$) for any $i$ (ignoring the negative part), and $|\mu_i^{\mathrm{NNLO}}/\mu_i^{\mathrm{LO}}|$ is treated similarly. The values come from the convergence hypothesis of ChPT. According to ChPT, $|\mu^{\mathrm{NLO}}/\mu^{\mathrm{LO}}|$ is about 0.1–0.3, and $|\mu^{\mathrm{NNLO}}/\mu^{\mathrm{NLO}}|$ is also about 0.1–0.3. Both 0.2 and 0.05 are near the central values. The standard deviations are chosen to be the same as the expected values, in order to give a large enough probability over a wide range, because the estimation may not be very exact. In order to make the model universal, we choose the relative difference rather than the absolute value. This is because EFT/ChPT can provide an approximate ratio between two orders, but not their absolute values. Of course, if one knows an approximate absolute value of a particular quantity at a given order, Eq. (5) can be replaced with this absolute value. Some similar constraints on the truncation errors will be discussed in Section 3.3. Of course, these constraints can be modified to different values if one has a better understanding of some physical quantities.
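The augmentation of the data vector, covariance matrix, and theory vector by the convergence constraints can be sketched as follows. This is a minimal sketch that assumes one NLO/LO ratio and one NNLO/LO ratio per datum and a nonzero LO value for every quantity; the function names and the numbers in the usage lines are hypothetical.

```python
import numpy as np

def augment_data(d_bar, sigma_d, nlo_ratio=0.2, nnlo_ratio=0.05):
    """Append pseudo-data for the convergence constraints to D-bar and Sigma_D."""
    d_bar = np.asarray(d_bar, dtype=float)
    sigma_d = np.asarray(sigma_d, dtype=float)
    m = d_bar.size
    d_aug = np.concatenate([d_bar, np.full(m, nlo_ratio), np.full(m, nnlo_ratio)])
    # Block-diagonal covariance: data block, then the two constraint blocks.
    sigma_aug = np.zeros((3 * m, 3 * m))
    sigma_aug[:m, :m] = sigma_d
    sigma_aug[m:2 * m, m:2 * m] = nlo_ratio**2 * np.eye(m)
    sigma_aug[2 * m:, 2 * m:] = nnlo_ratio**2 * np.eye(m)
    return d_aug, sigma_aug

def augment_theory(mu_lo, mu_nlo, mu_nnlo):
    """Stack the full prediction with the ratios |NLO/LO| and |NNLO/LO|."""
    mu_lo, mu_nlo, mu_nnlo = (np.asarray(x, dtype=float) for x in (mu_lo, mu_nlo, mu_nnlo))
    mu_th = mu_lo + mu_nlo + mu_nnlo
    return np.concatenate([mu_th, np.abs(mu_nlo / mu_lo), np.abs(mu_nnlo / mu_lo)])

# Two made-up observables with uncorrelated errors.
d_aug, sigma_aug = augment_data([1.0, 2.0], [[0.01, 0.0], [0.0, 0.04]])
mu_aug = augment_theory([1.0, 2.0], [0.2, -0.5], [0.05, 0.1])
print(d_aug, mu_aug)
```

With these augmented quantities, the same Gaussian likelihood can be evaluated unchanged, so the constraints act as additional soft data points.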
For convenience, only a single input datum $D$, or one component $D_i$, is discussed below. The discussion also applies when more than one datum is considered.

3.2 Model A

First, the truncation error is not considered in the fit. This is called Model A. Consider a physical quantity with an experimental value $D\pm\sigma_D$ and a theoretical value $\mu\pm\sigma$. The theoretical values up to NLO and NNLO without errors are
$$\mu_A^{(\mathrm{NLO})}=\mu_A^{\mathrm{LO}}(a^{\mathrm{LO}})+\mu_A^{\mathrm{NLO}}(a^{\mathrm{LO}},a^{\mathrm{NLO}}),\tag{8}$$
$$\mu_A^{(\mathrm{NNLO})}=\mu_A^{\mathrm{LO}}(a^{\mathrm{LO}})+\mu_A^{\mathrm{NLO}}(a^{\mathrm{LO}},a^{\mathrm{NLO}})+\mu_A^{\mathrm{NNLO}}(a^{\mathrm{LO}},a^{\mathrm{NLO}},a^{\mathrm{NNLO}}),\tag{9}$$
respectively. A term whose superscript is not enclosed in parentheses means the theoretical value at this order only. For example, $\mu_A^{\mathrm{NLO}}$ means the NLO theoretical value of $\mu$. Eqs. (8) and (9) are applied in the NLO and the NNLO fit, respectively.
This model assumes that $\mathrm{pr}(a_i^{\mathrm{NLO}})$ is the standard normal distribution, because the magnitudes of all $a^{\mathrm{NLO}}$ and $a^{\mathrm{NNLO}}$ are already normalized to roughly 1. In other words, one only introduces the information about the rough magnitudes of $a^{\mathrm{NLO}}$ and $a^{\mathrm{NNLO}}$, and no more information is considered at present. The advantage of this assumption is that more information in $\mathrm{pr}(a|D)$ is derived from the experimental data $D$ themselves, which reduces the subjectivity. In addition, Eq. (2) is adopted in the fit, rather than Eqs. (5)–(7). We have checked that $a^{\mathrm{NLO}}$ and $a^{\mathrm{NNLO}}$ need not be very close to 1. The results change only slightly, as long as their values are not very large. This is because the standard normal distribution has a non-negligible probability over a wide range. The same conclusion holds for the model below.
In order to improve Model A, more information is appended. The result is called Model B.

3.3 Model B

Generally, the truncation error can simply be modeled by a normal distribution whose parameters are based on the known information from the theory. However, in some special cases, the sign of the truncation error is known, or the probability of the sign is known. This information about the sign is considered separately. Eqs. (8) and (9) are improved to
$$\mu_B^{(\mathrm{NLO})}=\mu_B^{\mathrm{LO}}(a^{\mathrm{LO}})+\mu_B^{\mathrm{NLO}}(a^{\mathrm{LO}},a^{\mathrm{NLO}})+(2s-1)\,e\,\mu_B^{\mathrm{LO}},\tag{10}$$
$$\mu_B^{(\mathrm{NNLO})}=\mu_B^{\mathrm{LO}}(a^{\mathrm{LO}})+\mu_B^{\mathrm{NLO}}(a^{\mathrm{LO}},a^{\mathrm{NLO}})+\mu_B^{\mathrm{NNLO}}(a^{\mathrm{LO}},a^{\mathrm{NLO}},a^{\mathrm{NNLO}})+(2s-1)\,e\,\mu_B^{\mathrm{LO}},\tag{11}$$
respectively. The last terms on the right-hand sides of Eqs. (10) and (11) represent the higher-order (HO) truncation error $\mu_B^{\mathrm{HO}}=(2s-1)\,e\,\mu_B^{\mathrm{LO}}$. For $\mu_B^{(\mathrm{NLO})}$ and $\mu_B^{(\mathrm{NNLO})}$, it means the contribution beyond NLO and NNLO, respectively. The parameter $s$ relates to the sign of the truncation error. It is assumed to be a Bernoulli random variable with parameter $p$,
$$\mathrm{pr}\{s=k\}=p^{k}(1-p)^{1-k},\quad k=0,1.\tag{12}$$
$p$ is the probability of $s=1$. If one has no information about the sign, $p=0.5$. The parameter $s$ allows the truncation error to take the correct sign. If the estimated truncation error gives a narrow range with a wrong sign, the theoretical values will be far from the true value and the fit will be bad. The parameter $s$ is introduced to solve this problem, because it can change the wrong sign into the correct one. Conversely, if the estimated truncation error has the correct sign, or its range is wide enough to cover the true truncation error, $s$ has no impact. The parameter $e$ reflects the magnitude of the truncation error relative to $\mu_B^{\mathrm{LO}}$. One need not know the absolute magnitude of the truncation error. However, if the EFT holds, the relative magnitude at each order can be estimated. For example, the ratio between two adjacent orders is about $p/\Lambda$, where $p$ here is the momentum of the low-energy particles and $\Lambda$ is the scale of the EFT. In ChPT, $|\mu_B^{\mathrm{NLO}}/\mu_B^{\mathrm{LO}}|$ is about 0.1–0.3, $|\mu_B^{\mathrm{NNLO}}/\mu_B^{\mathrm{NLO}}|$ is also about 0.1–0.3, and so on. Therefore, $|\mu_B^{\mathrm{HO}}/\mu_B^{\mathrm{LO}}|$ can be taken to be about 5% (2%) for $\mu_B^{(\mathrm{NLO})}$ ($\mu_B^{(\mathrm{NNLO})}$). Hence, the parameter $e$ is assumed to be a Gaussian random variable
$$\mathrm{pr}(e)=N(\mu_e,\sigma_e^2),\tag{13}$$
where $\mu_e$ is the expected magnitude of $\mu_B^{\mathrm{HO}}/\mu_B^{\mathrm{LO}}$, and $\sigma_e$ is its standard deviation. If one has no more information about the truncation error, a possible and reasonable choice is $\mu_e=\sigma_e=0.05$ (0.02) for $\mu_B^{(\mathrm{NLO})}$ ($\mu_B^{(\mathrm{NNLO})}$).
The parameters $p$, $\mu_e$, $\sigma_e$, $a^{\mathrm{NLO}}$ and $a^{\mathrm{NNLO}}$ can sometimes be estimated from the information in the data. Hence, they can be set to other values, and their prior PDFs can even be set to other forms, as long as the information is accurate enough.
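To make the prior on the truncation term concrete, the following sketch draws samples of $\mu_B^{\mathrm{HO}}=(2s-1)\,e\,\mu_B^{\mathrm{LO}}$ from the Bernoulli and Gaussian priors described above. The function name, the LO value of 1.0, and the sample sizes are assumptions for illustration. In an actual fit $s$ and $e$ would be sampled jointly with the LECs rather than drawn from the prior alone.

```python
import numpy as np

def sample_truncation_term(mu_lo, p, mu_e, sigma_e, size, rng):
    """Prior draws of the higher-order term (2s - 1) * e * mu_LO."""
    s = rng.binomial(1, p, size=size)          # Bernoulli sign variable, s in {0, 1}
    e = rng.normal(mu_e, sigma_e, size=size)   # relative magnitude of the truncation error
    return (2 * s - 1) * e * mu_lo

rng = np.random.default_rng(1)
# Unknown sign (p = 0.5) with the default NLO choice mu_e = sigma_e = 0.05.
draws = sample_truncation_term(1.0, 0.5, 0.05, 0.05, 100000, rng)
# Sign assumed known to be positive (p = 1): the factor (2s - 1) is always +1.
positive = sample_truncation_term(1.0, 1.0, 0.05, 0.05, 100000, rng)
print(draws.mean(), draws.std(), positive.mean())
```

With $p=0.5$ the prior is symmetric about zero and bimodal, whereas $p=1$ concentrates it around $+\mu_e\,\mu^{\mathrm{LO}}$, which is how sign information tightens the fit.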
There are two extreme cases of Model B, which are adopted only for model evaluation in Section 4. These two cases are called Model B1 and Model B2, respectively.
Model B1. In this case, one knows nothing specific about $\mu_B^{\mathrm{HO}}$, such as its sign or its rough magnitude. Only the approximate order of magnitude of $\mu_B^{\mathrm{HO}}$ is known from ChPT, such as about 5% of LO in the NLO fit. As discussed above, for all quantities, we set $\mu_e=\sigma_e=0.05$ (0.02) and $p=0.5$ for the NLO (NNLO) fit. At present, we do not consider more information about $a^{\mathrm{NLO}}$ and $a^{\mathrm{NNLO}}$. Hence, the priors of $a^{\mathrm{NLO}}$ and $a^{\mathrm{NNLO}}$ are set to the standard normal distribution $N(0,1)$. The convergence constraints are the same as Eqs. (5)–(7).
Model B2. In this model, the magnitude of each $\mu_B^{\mathrm{HO}}$ is understood to some extent. Hence, one can set different prior PDFs for different $\mu_B^{\mathrm{HO}}$ separately. The parameters $\mu_e$, $\sigma_e$ and $p$ for different quantities can be set to different values. For example, if one knows that the sign is positive, $p$ is set to 1. The priors for $\mu_e$ and $\sigma_e$ are set as
$$\mu_e^{\mathrm{NLO}}=|\mu_{\mathrm{tr}}^{\mathrm{NLO}}/\mu_{\mathrm{tr}}^{\mathrm{LO}}|,\quad \sigma_e^{\mathrm{NLO}}=\max(0.3\mu_e,0.05),\quad \mu_e^{\mathrm{NNLO}}=|\mu_{\mathrm{tr}}^{\mathrm{NNLO}}/\mu_{\mathrm{tr}}^{\mathrm{LO}}|,\quad \sigma_e^{\mathrm{NNLO}}=\max(0.3\mu_e,0.02),\tag{14}$$
where the superscript NLO (NNLO) refers to the NLO (NNLO) fit, and the subscript "tr" means the true value. Because we adopt this model only for the example in Section 4 to evaluate the models, all the true values are known. Similarly, the true ranges of $a^{\mathrm{NLO}}$ and $a^{\mathrm{NNLO}}$ are generated from some given parameters, so their true ranges are also known. Therefore, their prior ranges are set to be the same as their true ranges. In addition, the constraints can be set to different values for different physical quantities.
Models B1 and B2 adopt two extreme priors, and they are only used to fit the example in Section 4. Because this example is artificial and the true values are known, we can include none or all of the prior information in the fit. For actual experimental data, the known prior information lies between Model B1 and Model B2. For example, one may have some information about a part of $D$, for which the signs and the approximate magnitudes of $\mu_B^{\mathrm{HO}}$ can be given as in Model B2. However, for another part of $D$, one may have no information about their $\mu_B^{\mathrm{HO}}$, because the current theory and/or experiment is lacking. For this part of $D$, one can only set the prior PDFs as in Model B1. Besides these two cases, it is more likely that one knows some partial information about $\mu_B^{\mathrm{HO}}$. For example, $\mu_B^{\mathrm{HO}}$ is more likely to be positive, or its value is possibly around 1 or 2. The prior PDF can be set according to this information. The fitting method of Models B, B1 and B2 is the same, and only the prior PDFs differ. It can be expected that the general Model B is better than Model B1, but worse than Model B2. Therefore, in Section 6, we uniformly use Model B to denote the new model proposed in this paper.

3.4 Calculation details

This section discusses some special cases in the fit.
Sometimes, one needs to fit the derivative of $\mu(t)$ numerically, such as $f_s'$ and $g_p'$ in Section 5. The numerical difference $\Delta\mu=\mu(t+\Delta t)-\mu(t)$ requires the difference between the two quantities $\mu(t+\Delta t)$ and $\mu(t)$, each of which has an error. If one adopts Eq. (10) or (11) to determine $\mu(t+\Delta t)$ and $\mu(t)$, the estimated truncation error of $\mu'(t)$ will contain both of these errors and become large. Therefore, the truncation error of $\mu'(t)$ is estimated from $\mu^{\prime\,\mathrm{LO}}$, $\mu^{\prime\,\mathrm{NLO}}$ and $\mu^{\prime\,\mathrm{NNLO}}$, rather than from the difference of Eq. (10) or (11). In other words, $\mu'(t)$ is treated as one quantity, not as a difference. However, for physical quantities defined through derivatives, such as $\langle r^2\rangle_S^\pi$ and $c_S^\pi$, we place the HO terms in the denominator, which absorbs the effects of higher-order errors well.
Sometimes, in the NNLO fit, the number of $a^{\mathrm{NNLO}}$ is much larger than the number of $D$, but the total number of $a^{\mathrm{LO}}$ and $a^{\mathrm{NLO}}$ is less than the number of input $D$. The NNLO fit in ChPT is in this situation. All $a^{\mathrm{LO}}$, $a^{\mathrm{NLO}}$ and $a^{\mathrm{NNLO}}$ are fitted as follows.
i) All $a^{\mathrm{NNLO}}$ are first linearly combined into some linearly independent $\tilde{a}$. The number of $\tilde{a}$ is equal to the number of $D$, and each $\tilde{a}_i$ is related to only one $D_i$. This is also reasonable in ChPT, because the NNLO fit only contains linear combinations of $C_i^r$, which can be combined into linearly independent ones.
ii) $a^{\mathrm{LO}}$ and $a^{\mathrm{NLO}}$ are first fitted at NLO by Model B. The results are denoted by $\hat{a}^{\mathrm{LO}}\pm\hat{\sigma}^{\mathrm{LO}}$ and $\hat{a}^{\mathrm{NLO}}\pm\hat{\sigma}^{\mathrm{NLO}}$. This is called the NLO fit.
iii) In the NNLO fit, $a^{\mathrm{LO}}$, $a^{\mathrm{NLO}}$ and $\tilde{a}$ are fitted simultaneously. If no more information is known, the NNLO priors of $a^{\mathrm{LO}}$ and $a^{\mathrm{NLO}}$ are set to suitable normal distributions $N(\mu^{\mathrm{LO(NLO)}},(\sigma^{\mathrm{LO(NLO)}})^2)$, where
$$\mu^{\mathrm{LO(NLO)}}=\hat{a}^{\mathrm{LO(NLO)}},\quad \sigma^{\mathrm{LO(NLO)}}=\max\left(\hat{a}^{\mathrm{LO(NLO)}}/2,\ \hat{\sigma}^{\mathrm{LO(NLO)}}\right).\tag{15}$$
The definition of $\sigma^{\mathrm{LO(NLO)}}$ takes the maximum of the two parameters $\hat{a}^{\mathrm{LO(NLO)}}/2$ and $\hat{\sigma}^{\mathrm{LO(NLO)}}$. Because either of them may be very small, this definition enlarges the prior ranges of $N(\mu^{\mathrm{LO(NLO)}},(\sigma^{\mathrm{LO(NLO)}})^2)$ in order to improve performance. The prior PDFs of $\tilde{a}$ are set to the standard normal distribution if one knows nothing about $\tilde{a}$. Otherwise, more reasonable prior PDFs can be set according to the known information.
The prior PDFs of $a^{\mathrm{LO}}$, $a^{\mathrm{NLO}}$ and $\tilde{a}$ not only make good use of the information from the NLO fit, but also leave some freedom for these parameters. Because the NLO-fitted $a^{\mathrm{LO}}$ and $a^{\mathrm{NLO}}$ give a reasonable order of magnitude in most cases, the NNLO fit uses the NLO posterior PDFs to construct the NNLO prior PDFs. In addition, the new parameters $\tilde{a}$ are introduced in the NNLO fit. Hence, the NNLO fit is not a repeated fit to the data, even though some of the NLO posterior information is used. For the example in Section 4, we have also tried the NNLO fit without the posterior PDFs from the NLO fit, setting the priors of $a^{\mathrm{LO}}$, $a^{\mathrm{NLO}}$ and $a^{\mathrm{NNLO}}$ uniformly to the standard normal distribution. However, this gives very poor results, which can deviate very far from the true values. Therefore, it is necessary to use some sensible information about the LECs as a prior in the NNLO fit.
Of course, if some information about $a^{\mathrm{LO}}$ and $a^{\mathrm{NLO}}$ is known, one can set other sensible prior PDFs.
iv) Finally, all $a^{\mathrm{NNLO}}$ are fitted to the posterior $\tilde{a}$ obtained above, with some appropriate uniform prior distributions. The boundaries of the uniform distributions depend on the approximate orders of magnitude of the truncation errors. This is because the NLO has usually been studied widely, and more information is known, whereas NNLO studies are usually lacking, and the values of $a^{\mathrm{NNLO}}$ are not well established. Hence, a uniform distribution is used, which gives a larger probability near the boundaries, in order to study the boundary dependence. After the fit, the posterior PDFs of the truncation errors are updated to better ones.
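The construction of the NNLO priors from the NLO posteriors in step iii) can be sketched in a few lines. The function name and the example numbers are hypothetical. The absolute value in the width is an assumption added here so that a negative central value cannot produce a negative width; Eq. (15) is written without it.

```python
def nnlo_prior_from_nlo(a_hat, sigma_hat):
    """Return (mean, width) of the normal NNLO prior built from an NLO posterior.

    a_hat: NLO posterior central value of an LO or NLO LEC.
    sigma_hat: NLO posterior standard deviation of the same LEC.
    """
    mu = a_hat
    # The larger of the two widths is used so the prior is not too narrow.
    # abs() is an assumption of this sketch, see the lead-in.
    sigma = max(abs(a_hat) / 2.0, sigma_hat)
    return mu, sigma

# A well-determined LEC: the width is set by half the central value.
print(nnlo_prior_from_nlo(1.0, 0.1))
# A LEC compatible with zero: the width is set by the NLO posterior error.
print(nnlo_prior_from_nlo(0.02, 0.3))
```

Either choice keeps the NNLO prior centered on the NLO result while leaving room for the value to shift once the NNLO terms are included.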
Model B is very efficient. For the actual fit in ChPT, which will be discussed below, a personal computer with an Intel i3-10105 CPU takes only about ten minutes with 4 cores. This greatly reduces the time compared with the method in Ref. [8], which takes about one day with a 20-core Intel Xeon Gold 6230 CPU.
All the numerical results are represented by the highest posterior density (HPD). The HPD is the minimum interval containing a certain proportion of the probability. The most common choices are the 95% HPD or the 98% HPD, but we have chosen the 68% HPD, because it is similar to the $1\sigma$ interval in classical statistics [28], such as in the minimum $\chi^2$ method. All the results in this paper have been compared. The comparison indicates that the difference between the 68% HPD and the $1\sigma$ interval is very small: in most cases the last significant digits show no difference or a difference of 1 or 2. Only very few of them differ by 3 or 4, and none differs by more than 4. Hence, we sometimes do not distinguish them in this paper.
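For a unimodal posterior, the HPD interval can be estimated from MCMC samples by finding the shortest window that contains the required fraction of the sorted samples. The following is a minimal sketch under that unimodality assumption; the function name and the standard normal test sample are illustrative only.

```python
import numpy as np

def hpd_interval(samples, mass=0.68):
    """Shortest interval containing a fraction `mass` of the samples."""
    x = np.sort(np.asarray(samples, dtype=float))
    n = x.size
    k = int(np.ceil(mass * n))            # number of samples inside the interval
    widths = x[k - 1:] - x[: n - k + 1]   # width of every window of k consecutive samples
    i = int(np.argmin(widths))            # the narrowest window is the HPD estimate
    return x[i], x[i + k - 1]

rng = np.random.default_rng(2)
low, high = hpd_interval(rng.normal(0.0, 1.0, size=200000))
print(low, high)
```

For a symmetric Gaussian posterior the 68% HPD nearly coincides with the central $1\sigma$ interval, while for a skewed posterior the HPD shifts toward the mode.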

3.5 Evaluation criteria

To decide which model is best, a quantitative evaluation criterion is needed, so that different models can be ranked by a single index. Bayesian evidence is one possible criterion, but it is too simple. The widely applicable information criterion (WAIC) and leave-one-out cross-validation (LOOCV) have been introduced in recent years. WAIC measures how well the model fits the data and also penalizes complex models. LOOCV repeatedly splits the data into a training set and a validation set to evaluate the model. Their definitions involve several related concepts and formulas that would need a long discussion; they can be found, together with a more detailed explanation, in Refs. [65, 66]. Simply put, if Model B has larger WAIC and LOOCV values than Model A, Model B is considered better than Model A. The values for a single model are meaningless on their own, because there is no absolute scale that says how large is large enough. They are only meaningful for comparing different models.
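To make the idea behind WAIC concrete, the sketch below evaluates the standard estimator on the log-score scale, where larger values are better (the same sign convention as in the text). For each data point it takes the log of the posterior-averaged likelihood and subtracts the posterior variance of the log-likelihood as a complexity penalty. Whether the authors used exactly this estimator is an assumption; the function name `elpd_waic` and the toy data are hypothetical.

```python
import numpy as np

def elpd_waic(log_lik):
    """log_lik[s, i] = log p(y_i | theta_s) for posterior draw s and data point i."""
    log_lik = np.asarray(log_lik, dtype=float)
    n_draws = log_lik.shape[0]
    # Log pointwise predictive density via a numerically stable log-mean-exp.
    m = log_lik.max(axis=0)
    lppd = m + np.log(np.exp(log_lik - m).sum(axis=0)) - np.log(n_draws)
    # Complexity penalty: posterior variance of the log-likelihood of each point.
    p_waic = log_lik.var(axis=0, ddof=1)
    return float(np.sum(lppd - p_waic))

# Toy setup (hypothetical): three measurements with unit errors and
# 4000 posterior draws of a single parameter theta.
rng = np.random.default_rng(0)
y = np.array([0.9, 1.1, 1.3])
theta = rng.normal(loc=1.1, scale=0.3, size=4000)
log_lik = -0.5 * (y[None, :] - theta[:, None]) ** 2 - 0.5 * np.log(2.0 * np.pi)
print("elpd_waic =", elpd_waic(log_lik))
```

Only differences of this number between models fitted to the same data carry meaning, in line with the remark above.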
For the example in Section 4, the true values $a_{i,\mathrm{tr}}$ of the parameters are known. In addition to WAIC and LOOCV, the fitting results $a_{i,\mathrm{model}}$ can therefore be compared with the true values directly, which shows more intuitively how good the fit is. For example, $a_{i,\mathrm{A}}$ denotes the expected value of $a_i$ fitted by Model A. Hence, we define the following two quantities as criteria:
$$\mathrm{Pct}_{\mathrm{model}}=\frac{a_{i,\mathrm{model}}-a_{i,\mathrm{tr}}}{a_{i,\mathrm{tr}}}\times 100\%, \tag{16}$$
$$\mathrm{Pct}_{\sigma,\mathrm{model}}=\frac{a_{i,\mathrm{model}}-a_{i,\mathrm{tr}}}{\sigma_{i,\mathrm{model}}}. \tag{17}$$
$\mathrm{Pct}_{\mathrm{model}}$ is the relative error between the fitting value $a_{i,\mathrm{model}}$ and the true value $a_{i,\mathrm{tr}}$. It indicates how good the fitted expected value is. $\mathrm{Pct}_{\sigma,\mathrm{model}}$ is the ratio of the difference between the fitting value and the true value to the fitting standard error $\sigma_{i,\mathrm{model}}$. It indicates how good the fitted error is. The smaller the absolute values of these two quantities are, the better the model is. These two criteria are only used for the example in Section 4, because the true values are unknown in the actual fit.
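The two criteria can be checked by hand. The short sketch below computes the relative error with respect to the true value and the deviation in units of the fitted standard error, using the $i=3$ entry of the NLO fit by Model A in Tab.1 (true value −3.07, fitted value −3.671, fitted error 0.135). The function names are chosen for this illustration only.

```python
def pct(a_model, a_true):
    """Relative error of the fitted value with respect to the true value, in percent."""
    return (a_model - a_true) / a_true * 100.0

def pct_sigma(a_model, a_true, sigma_model):
    """Deviation from the true value in units of the fitted standard error."""
    return (a_model - a_true) / sigma_model

# Model A, NLO fit, i = 3 in Tab.1
print(round(pct(-3.671, -3.07), 1), round(pct_sigma(-3.671, -3.07, 0.135), 1))
# prints: 19.6 -4.5
```

The output reproduces the entries 19.6% and −4.5 listed for this parameter in Tab.1.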
In order to clarify the convergence of $\mu$, the percentages at each order are defined as in Ref. [8],
$$\mathrm{Pct}_{\mathrm{order}}=\frac{\bar\mu_{\mathrm{model}}^{\mathrm{order}}}{\bar\mu_{\mathrm{model}}}\times 100\%, \tag{18}$$
where $\bar\mu_{\mathrm{model}}^{\mathrm{order}}$ is defined in Eqs. (8)–(11), and $\bar\mu_{\mathrm{model}}$ is the fitting value obtained by a given model, containing all orders. The bar denotes the expected value. For example, $\bar\mu_{\mathrm{A}}^{\mathrm{NLO}}$ is the expected NLO contribution obtained by Model A, and $\bar\mu_{\mathrm{A}}$ is the expected value containing all orders obtained by Model A.
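A small numerical sketch of this definition is given below. It uses the order-by-order contributions to $10\,a_0^{3/2}m_\pi$ from Tab.8, with signs assigned so that they agree with the percentages listed there; that sign assignment and the helper name `order_percentages` are assumptions of this illustration.

```python
def order_percentages(contributions):
    """Share of each order in the sum over all orders, in percent."""
    total = sum(contributions.values())
    return {order: value / total * 100.0 for order, value in contributions.items()}

# Contributions to 10 * a_0^{3/2} * m_pi (signs inferred from the percentages in Tab.8).
pik = {"LO": -0.709, "NLO": 0.084, "HO": 0.177}
shares = order_percentages(pik)
for order, share in shares.items():
    print(f"{order}: {share:.1f}%")
```

Because the NLO and HO terms have the opposite sign to the total, their percentages are negative and the LO share exceeds 100%. This is how the entries above 100% and below 0% in Tab.8 arise.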
For the NNLO fit, the differences in WAIC, LOOCV, $\mathrm{Pct}_{\mathrm{model}}$ and $\mathrm{Pct}_{\sigma,\mathrm{model}}$ among different models are small. It is more important to evaluate how well all $a_i^{\mathrm{NNLO}}$ are fitted, because the NNLO-fitted $a_i^{\mathrm{NLO}}$ are usually precise enough, whereas $a_i^{\mathrm{NNLO}}$ usually have large errors. For the example in Section 4, the true values of $a_i^{\mathrm{NNLO}}$ are known and $a_i^{\mathrm{NNLO}}=\tilde a_i$, so the fitting values can also be compared with the true values directly. Usually, as in ChPT, the contributions of $a_i^{\mathrm{NNLO}}$ do not mix with those of $a_i^{\mathrm{LO}}$ and $a_i^{\mathrm{NLO}}$. They can therefore be separated and are denoted by $\mu_{a_i}^{\mathrm{NNLO}}$. In order to see how good the fitted $a_i^{\mathrm{NNLO}}$ are, we define
$$\mathrm{PM}_{\mathrm{model}}=\sum_{i=1}^{n}\left(\frac{a_{i,\mathrm{tr}}^{\mathrm{NNLO}}-\bar a_{i,\mathrm{model}}^{\mathrm{NNLO}}}{a_{i,\mathrm{tr}}^{\mathrm{NNLO}}}\,\frac{\bar\mu_{a_i,\mathrm{model}}^{\mathrm{NNLO}}}{\mu_{i,\mathrm{tr}}}\right)^{2}\Big/\,n. \tag{19}$$
The subscript “tr” denotes the true values, the subscript “model” denotes the model that is adopted, and $\mu_{i,\mathrm{tr}}$ is the true value of the $i$-th physical quantity. The bar denotes the expected value, and $n$ is the number of physical quantities; in this paper $n=17$. For example, $a_{i,\mathrm{A}}^{\mathrm{NNLO}}$ means that $a_i^{\mathrm{NNLO}}$ is fitted by Model A, and $\mu_{a_i,\mathrm{A}}^{\mathrm{NNLO}}$ is the contribution from $a_i^{\mathrm{NNLO}}$ alone in Model A. The first fraction on the right-hand side of Eq. (19) is the relative error of $a_i^{\mathrm{NNLO}}$, while the second fraction is treated as its weight. The weight represents the share of $\bar\mu_{a_i,\mathrm{model}}^{\mathrm{NNLO}}$ in $\mu_{i,\mathrm{tr}}$. The smaller the PM value is, the better the result is. A larger weight requires a more precise $\bar a_{i,\mathrm{model}}^{\mathrm{NNLO}}$ to reduce the PM value. The PM value is only used for the example in Section 4, because the true values are known there but not in the actual case.
The next section will evaluate the above models by these evaluation criteria.

4 Model evaluation

In order to demonstrate quantitatively the advantage of Model B based on Bayesian statistics, this section fits an example whose parameters are similar to LECs. As in the actual fit of the LECs in Section 6, a group of functions is generated randomly, each group containing 17 different quantities $O_i$. They are shown in Eq. (A1) in Appendix A. The power of $t$ plays the role of the chiral dimension in ChPT. By Taylor expanding these functions in $t$, the analytical results at each order can be obtained. The $t$, $t^2$ and $t^3$ orders correspond to LO, NLO and NNLO in ChPT, respectively. After the expansion, $t$ is set to 1. $a_i^{\mathrm{LO}}$, $a_i^{\mathrm{NLO}}$ and $a_i^{\mathrm{NNLO}}$ are similar to the LO, NLO ($L_i^r$) and NNLO ($C_i^r$) LECs in ChPT, respectively. $b_i$ are known constants, which are introduced to adjust the convergence of these Taylor series. All parameters $a_i^{\mathrm{LO}}$, $a_i^{\mathrm{NLO}}$, $a_i^{\mathrm{NNLO}}$ and $b_i$ are generated randomly and independently. For convenience, the parameters in each function are different, although they have the same name. For example, $b_1$ in $O_1$ and $b_1$ in $O_2$ are different. The values of $b_i$ and $a_i^{\mathrm{LO}}$ in the example can be found in Appendix A. The LO LECs do not appear in the actual ChPT fit in this paper, so we treat $a_i^{\mathrm{LO}}$ as known constants and do not fit them. This section only discusses the impact of truncation errors and does not address overfitting. Hence, each $O_i$ contains only one $a_i^{\mathrm{NNLO}}$, i.e., $\tilde a_i^{\mathrm{NNLO}}=a_i^{\mathrm{NNLO}}$.
Since all the parameters $b_i$, $a_i^{\mathrm{LO}}$, $a_i^{\mathrm{NLO}}$ and $a_i^{\mathrm{NNLO}}$ in this example are known, all the analytical results $O_i$ can be calculated from them directly. In this section, the known values of these parameters are called true values. The values obtained by fitting with the models in Section 3 are called theoretical values. In order to distinguish these two types of values, all the true values are marked by a subscript “tr”, such as $a_{i,\mathrm{tr}}^{\mathrm{NLO}}$, and all the theoretical values are marked by the model name, such as $a_{i,\mathrm{A}}^{\mathrm{NLO}}$.
In order to imitate a realistic experiment, the fit does not use the true values directly but pseudo-data with experimental errors $\sigma_i$. The imitative experimental data are drawn from the distribution $N(O_{i,\mathrm{tr}},\sigma_i^2)$ with $\sigma_i/O_{i,\mathrm{tr}}=0.02$ in the example [26]. For brevity, these imitative experimental data are simply called experimental data. Their values are listed in the third column of Tab.2 with a subscript “exp”.
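The generation of the imitative data can be sketched as follows. The helper `make_pseudo_data`, the random seed and the use of the absolute value to keep $\sigma_i$ positive for negative $O_{i,\mathrm{tr}}$ are assumptions of this illustration. The three true values are the first rows of Tab.2 (in units of $10^{-2}$).

```python
import numpy as np

def make_pseudo_data(o_true, rel_err=0.02, seed=0):
    """Draw one pseudo-measurement per observable from N(O_true, sigma^2)."""
    o_true = np.asarray(o_true, dtype=float)
    sigma = rel_err * np.abs(o_true)   # 2% error; abs() keeps sigma positive (assumption)
    rng = np.random.default_rng(seed)
    o_exp = rng.normal(loc=o_true, scale=sigma)
    return o_exp, sigma

o_tr = np.array([-35.800, 0.173, -0.276])   # 10^2 * O_{i,tr}, rows 1 to 3 of Tab.2
o_exp, sigma = make_pseudo_data(o_tr)
for value, err in zip(o_exp, sigma):
    print(f"{value:.3f} +/- {err:.3f}")
```

With a 2% relative error, the first observable gets $\sigma_1 = 0.716$ in these units, the same error as quoted for $O_{1,\mathrm{exp}}$ in Tab.2. The sampled central values depend on the seed and will not reproduce the table.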
Because the above true values are known, the true values of $\mu^{\mathrm{LO}}$, $\mu^{\mathrm{NLO}}$ and $\mu^{\mathrm{NNLO}}$ can also be calculated analytically. The parameters of the truncation errors in Model B2 are set according to Eq. (14) and the description above it. The values of $p$, $\mu_e$ and $\sigma_e$ are given in Appendix B. Similarly, since the true values of the LECs are known, their prior distributions are set to the normal distribution $N(\mu_{a_i},\sigma_{a_i}^2)$, where
$$\mu_{a_i}^{\mathrm{NLO/NNLO}}\sim N\!\left(a_{i,\mathrm{tr}}^{\mathrm{NLO/NNLO}},\,(0.1\,a_{i,\mathrm{tr}}^{\mathrm{NLO/NNLO}})^2\right),\qquad \sigma_{a_i}^{\mathrm{NLO/NNLO}}=0.5\,\mu_{a_i}^{\mathrm{NLO/NNLO}}.$$
We have deliberately shifted $\mu_{a_i}$ away from the true value, so that the prior is not centered exactly on the true value. The distribution parameters $\mu_{a_i}$ and $\sigma_{a_i}$ at each order are given in Appendix B.
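The construction of these prior parameters can be sketched as below: the prior center is drawn once from a normal distribution around the true value with a 10% relative width, and the prior width is then half of that center. The function name, the seed and the absolute values (used so that the widths stay positive for negative parameters) are assumptions of this illustration. The true values are those of $a_{i,\mathrm{tr}}^{\mathrm{NLO}}$ in Tab.1.

```python
import numpy as np

def draw_prior_parameters(a_true, seed=1):
    """Return (mu, sigma) of the normal prior N(mu, sigma^2) for each parameter."""
    a_true = np.asarray(a_true, dtype=float)
    rng = np.random.default_rng(seed)
    # Prior center: shifted off the true value by a random amount of about 10%.
    mu = rng.normal(loc=a_true, scale=0.1 * np.abs(a_true))
    # Prior width: half of the prior center (abs() is an assumption, see lead-in).
    sigma = 0.5 * np.abs(mu)
    return mu, sigma

a_tr_nlo = np.array([0.53, 0.80, -3.07, 0.3, 1.01, 0.14, -0.34, 0.47])  # Tab.1
mu, sigma = draw_prior_parameters(a_tr_nlo)
for m, s in zip(mu, sigma):
    print(f"N({m:.3f}, {s:.3f}^2)")
```

The resulting priors are broad (50% relative width) and slightly off-center, so a good fit has to be driven by the data rather than by a prior sitting on the true value.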

4.1 The NLO fit of the example

The input parameters in Model B2 are given in Columns 2 to 6 of the corresponding table in Appendix B. After the NLO fit, we have checked that the obtained Markov chain satisfies the detailed balance condition, so the results are reliable. All the other fits in this paper lead to the same conclusion.
Fig.1 illustrates the distributions obtained by Models A and B2. The shapes of the curves are similar to normal distributions, although they differ slightly in detail. We have checked that the boundaries of the 68% HPD are almost the same as the 1σ boundaries of a normal distribution. Hence, we sometimes do not distinguish them. It can be seen that the central values of Model B2 are closer to the true values. However, the errors of Model B2 are larger than those of Model A. This is because Model B2 takes the uncertainties of the truncation errors into account, but Model A does not.
Fig.1 The NLO fitting posterior PDFs of $a_i^{\mathrm{NLO}}$. The red lines and the light red areas are obtained by Model A. The blue lines and the light blue areas are obtained by Model B2. The lines are the distribution curves of $a_i^{\mathrm{NLO}}$. The light-colored areas depict the 68% HPDs. The green lines denote the true values.

The numerical posterior information of $a_i^{\mathrm{NLO}}$ is listed in Tab.1. The WAIC and LOOCV of Model B2 are the largest, and those of Model A are the smallest. The WAIC and LOOCV of Model B1 are a bit smaller than those of Model B2, but much larger than those of Model A. This means that Model B2 gives the best results and Model A the worst. Model B1 clearly improves on Model A, but is a bit weaker than Model B2. This conclusion can also be seen from $\mathrm{Pct}_{\mathrm{A,B1,B2}}$. However, most $|\mathrm{Pct}_{\sigma,\mathrm{B2}}|$ are still a bit larger than $|\mathrm{Pct}_{\sigma,\mathrm{B1}}|$. This is because the errors of $a_{i,\mathrm{B2}}^{\mathrm{NLO}}$ are about half those of $a_{i,\mathrm{B1}}^{\mathrm{NLO}}$. Overall, $a_{i,\mathrm{B2}}^{\mathrm{NLO}}$ is closer to the true value.
Tab.1 The NLO and the NNLO fitting results of $a_i^{\mathrm{NLO}}$ in the example. Row 2 is the true value of $a_i^{\mathrm{NLO}}$. Rows 3, 6 and 9 are the NLO fitting results of Models A, B1 and B2, respectively. Rows 12, 15 and 18 are the NNLO fitting results of Models A, B1 and B2, respectively. The percentage $\mathrm{Pct}_{\mathrm{A,B1,B2}}$ is defined in Eq. (16), and the ratio $\mathrm{Pct}_{\sigma,\mathrm{A,B1,B2}}$ is defined in Eq. (17).
i 1 2 3 4 5 6 7 8 WAIC LOO
ai,trNLO 0.53 0.80 −3.07 0.3 1.01 0.14 −0.34 0.47
NLO
ai,ANLO 0.541 (35) 0.914 (30) −3.671 (135) 0.366 (18) 1.021 (28) 0.118 (22) −0.413 (14) 0.545 (18) −49.130 −56.310
PctA 2.1% 14.3% 19.6% 22.0% 1.1% −15.7% 21.5% 16.0%
PctσA 0.3 3.8 −4.5 3.7 0.4 −1.0 −5.2 4.2
ai,B1NLO 0.550 (121) 0.842 (79) −3.192 (335) 0.324 (46) 0.972 (68) 0.110 (53) −0.361 (34) 0.503 (43) 14.964 7.889
PctB1 3.8% 5.2% 4.0% 8.0% −3.8% −21.4% 6.2% 7.0%
PctσB1 0.2 0.5 −0.4 0.5 −0.6 −0.6 −0.6 0.8
ai,B2NLO 0.539 (41) 0.860 (43) −3.252 (175) 0.314 (28) 1.027 (38) 0.149 (28) −0.359 (18) 0.475 (23) 27.307 23.713
PctB2 1.7% 7.5% 5.9% 4.7% 1.7% 6.4% 5.6% 1.1%
PctσB2 0.2 1.4 −1.0 0.5 0.4 0.3 −1.1 0.2
NNLO
ai,ANLO 0.924 (574) 0.603 (146) −1.984 (457) 0.416 (383) 0.553 (357) 0.084 (383) −0.243 (53) 0.510 (316) 14.364 7.782
PctA 74.34% −24.63% −35.37% 38.67% −45.25% −40.00% −28.53% 8.51%
PctσA 0.69 −1.35 2.38 0.30 −1.28 −0.15 1.83 0.13
ai,B1NLO 0.534 (116) 0.831 (74) −3.042 (26) 0.316 (43) 0.968 (62) 0.116 (40) −0.346 (27) 0.491 (42) 41.143 32.782
PctB1 0.75% 3.87% −0.91% 5.33% −4.16% −17.14% 1.76% 4.47%
PctσB1 0.03 0.42 0.11 0.37 −0.68 −0.60 −0.22 0.50
ai,B2NLO 0.525 (81) 0.808 (35) −3.195 (111) 0.319 (18) 0.995 (37) 0.138 (32) −0.354 (12) 0.474 (24) 56.730 53.277
PctB2 −0.9% 1.0% 4.1% 6.3% −1.5% −1.4% 4.1% 0.9%
PctσB2 −0.06 0.23 −1.13 1.06 −0.41 −0.06 −1.17 0.17
Fig.2(a) illustrates the proportions of $O_i$ at each order. The contributions at NLO and HO from Model B2 are closer to the true values than those from Model B1, because Model B2 uses more information than Model B1. Despite using less information, Model B1 still satisfies the convergence well. However, there are noticeable differences between Models B1 and B2 at HO, because some truncation errors are not estimated accurately. Nevertheless, these discrepancies have a minimal impact on the results of $a_i^{\mathrm{NLO}}$. Therefore, the results of both Model B1 and Model B2 are close to the true values. This indicates that even without complete knowledge of the truncation errors of all physical quantities, Model B1 still yields better results than Model A.
Fig.2 The proportions of $O_i$ at each order for the example. The red, green and blue strips represent the true values and the values obtained by Models B1 and B2, respectively. The lightest and the second lightest colors are the proportions [defined in Eq. (18)] of LO and NLO, respectively. (a) The NLO fit. The darkest color is the proportion of HO. (b) The NNLO fit. The darkest color and the dark gray are the proportions of NNLO and HO, respectively. To avoid layer masking, the colors of the NNLO and the HO true values of $O_7$, $O_{11}$ and $O_{14}$ are interchanged. Similarly, the colors of $O_{11}$, $O_{12}$ and $O_{13}$ of Model B2 are also interchanged.

Tab.2 shows the comparison of the true values, the experimental values and the fitting results from Models B1 and B2. It can be seen that the theoretical values from both Model B1 and Model B2 are not obviously different from the experimental values and the true values. In particular, the theoretical values obtained by Model B2 are closer to the true value than those obtained by Model B1. The 1σ errors from Model B1 and Model B2 are roughly equal to the experimental data, but Model B2 has smaller errors. Most true values fall within 1σ intervals of the theoretical values. A few true values are in the 1σ to 2σ intervals. No true values exceed the 2σ intervals. Tab.2 also indicates that more information leads to a better result.
Tab.2 The comparison of the NLO fitting values for the example. The subscripts tr, exp, B1 and B2 in the first row denote the true values, the experimental values, and the theoretical values from Model B1 and Model B2, respectively. The experimental values in the third column are sampled from the true values. $O_i$ is defined in Eq. (A1).
$i$ $10^2O_{i,\mathrm{tr}}$ $10^2O_{i,\mathrm{exp}}$ $10^2O_{i,\mathrm{B1}}$ $10^2O_{i,\mathrm{B2}}$
1 −35.800 −34.637 ± 0.716 −33.778 ± 1.786 −35.023 ± 1.129
2 0.173 0.171 ± 0.003 0.168 ± 0.005 0.172 ± 0.004
3 −0.276 −0.279 ± 0.006 −0.279 ± 0.031 −0.279 ± 0.012
4 0.603 0.590 ± 0.012 0.584 ± 0.013 0.595 ± 0.008
5 27.281 27.753 ± 0.546 27.124 ± 0.756 27.600 ± 0.535
6 −0.524 −0.548 ± 0.010 −0.554 ± 0.020 −0.541 ± 0.014
7 −1.486 −1.434 ± 0.030 −1.403 ± 0.028 −1.459 ± 0.023
8 −0.955 −0.970 ± 0.019 −0.954 ± 0.414 −0.965 ± 0.219
9 −0.227 −0.226 ± 0.005 −0.231 ± 0.008 −0.227 ± 0.004
10 −52.511 −52.773 ± 1.050 −54.107 ± 3.062 −52.493 ± 2.069
11 44.936 46.250 ± 0.899 47.299 ± 2.319 45.728 ± 1.351
12 −4.223 −4.397 ± 0.084 −4.427 ± 0.673 −4.393 ± 0.374
13 −14.674 −14.769 ± 0.293 −14.574 ± 2.295 −14.782 ± 1.325
14 −24.577 −24.765 ± 0.492 −24.351 ± 0.492 −24.755 ± 0.419
15 −15.864 −15.505 ± 0.317 −15.974 ± 1.741 −15.435 ± 0.893
16 3.831 3.746 ± 0.077 3.847 ± 0.145 3.774 ± 0.069
17 −4.193 −4.208 ± 0.084 −4.110 ± 0.146 −4.203 ± 0.086
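The agreement between Model B2 and the true values in Tab.2 can be quantified with the pull, i.e., the absolute deviation of the theoretical value from the true value in units of the quoted 1σ error. The sketch below copies the Model B2 column of Tab.2; the variable names are chosen for this illustration only.

```python
import numpy as np

# 10^2 * O_i from Tab.2: true values, Model B2 central values and Model B2 errors.
o_tr = np.array([-35.800, 0.173, -0.276, 0.603, 27.281, -0.524, -1.486, -0.955,
                 -0.227, -52.511, 44.936, -4.223, -14.674, -24.577, -15.864,
                 3.831, -4.193])
o_b2 = np.array([-35.023, 0.172, -0.279, 0.595, 27.600, -0.541, -1.459, -0.965,
                 -0.227, -52.493, 45.728, -4.393, -14.782, -24.755, -15.435,
                 3.774, -4.203])
err_b2 = np.array([1.129, 0.004, 0.012, 0.008, 0.535, 0.014, 0.023, 0.219,
                   0.004, 2.069, 1.351, 0.374, 1.325, 0.419, 0.893, 0.069, 0.086])

pulls = np.abs(o_b2 - o_tr) / err_b2   # deviation in units of the 1-sigma error
print("largest pull:", float(pulls.max()))
print("observables beyond 2 sigma:", int(np.sum(pulls > 2.0)))
```

For Model B2 every pull stays below 2, in line with the statement that no true value lies outside the 2σ interval.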

4.2 The NNLO fit of the example

In the NNLO fit, the priors of $a_i^{\mathrm{NLO}}$ and $a_i^{\mathrm{NNLO}}$ in Models A and B1 are the same as those discussed in Sections 3.2 and 3.3. The priors in Model B2 follow Eq. (15), and the parameters are given in Columns 7 to 11 of the corresponding table in Appendix B.
The numerical NNLO fitting results of $a_i^{\mathrm{NLO}}$ obtained by Models A, B1 and B2 are shown in Rows 12 to 20 of Tab.1. The NNLO fitting results of $a_i^{\mathrm{NNLO}}$ obtained by Models A, B1 and B2 are shown in Tab.3. Besides WAIC and LOO, the last row also gives the PM value defined in Eq. (19).
Tab.3 The NNLO fitting results of the example. Column 2 is the true value of $a_i^{\mathrm{NNLO}}$. Columns 3, 6 and 9 are the results of Models A, B1 and B2, respectively. The percentage $\mathrm{Pct}_{\mathrm{A,B1,B2}}$ is defined in Eq. (16), and the ratio $\mathrm{Pct}_{\sigma,\mathrm{A,B1,B2}}$ is defined in Eq. (17). PM is defined in Eq. (19).
i ai,trNNLO ai,ANNLO PctA PctσA ai,B1NNLO PctB1 PctσB1 ai,B2NNLO PctB2 PctσB2
1 0.02 0.176 (298) 780.0% 0.5 0.013 (29) −35.0% −0.2 0.017 (10) −15.0% −0.3
2 0.19 0.060 (293) −68.4% −0.4 0.102 (46) −46.3% −1.9 0.177 (31) −6.8% −0.4
3 −0.72 0.351 (692) −148.8% 1.5 −0.073 (264) −89.9% 2.5 −0.703 (209) −2.4% 0.1
4 0.22 −0.682 (917) −410.0% −1.0 0.917 (735) 316.8% 0.9 0.203 (96) −7.7% −0.2
5 −0.16 0.018 (465) −111.3% 0.4 −0.090 (60) −43.8% 1.2 −0.137 (43) −14.4% 0.5
6 0.26 0.035 (485) −86.5% −0.5 0.189 (71) −27.3% −1.0 0.192 (58) −26.2% −1.2
7 −0.42 0.088 (645) −121.0% 0.8 −0.209 (520) −50.2% 0.4 −0.413 (165) −1.7% 0.0
8 −0.45 0.016 (1005) −103.6% 0.5 −0.136 (188) −69.8% 1.7 −0.472 (118) 4.9% −0.2
9 −0.99 −0.822 (525) −17.0% 0.3 −0.261 (200) −73.6% 3.6 −0.966 (208) −2.4% 0.1
10 −0.06 −0.415 (670) 591.7% −0.5 −0.076 (59) 26.7% −0.3 −0.083 (24) 38.3% −1.0
11 0.24 0.005 (993) −97.9% −0.2 0.163 (646) −32.1% −0.1 0.254 (132) 5.8% 0.1
12 −0.18 −0.182 (605) 1.1% 0.0 −0.194 (85) 7.8% −0.2 −0.219 (51) 21.7% −0.8
13 1.02 0.342 (706) −66.5% −1.0 1.011 (71) −0.9% −0.1 0.997 (57) −2.3% −0.4
14 0.29 −0.226 (181) −177.9% −2.9 0.140 (132) −51.7% −1.1 0.265 (90) −8.6% −0.3
15 −0.11 −0.297 (427) 170.0% −0.4 −0.087 (62) −20.9% 0.4 −0.110 (35) 0.0% 0.0
16 −0.56 0.095 (707) −117.0% 0.9 −0.870 (394) 55.4% −0.8 −0.567 (218) 1.2% 0.0
17 0.19 0.247 (714) 30.0% 0.1 0.188 (112) −1.1% 0.0 0.187 (67) −1.6% 0.0
WAIC 14.364 41.143 56.730
LOO 7.782 32.782 53.277
PM 0.2650 0.0510 0.0177
Tab.1 shows that the best results of $a_i^{\mathrm{NLO}}$ are obtained by Model B2. Comparing the NNLO values of $\mathrm{Pct}_{\mathrm{B1}}$ ($\mathrm{Pct}_{\sigma,\mathrm{B1}}$) and $\mathrm{Pct}_{\mathrm{B2}}$ ($\mathrm{Pct}_{\sigma,\mathrm{B2}}$) with their NLO values shows that most of the results are improved. There is a significant difference between the NLO and the NNLO results. This indicates that even though the NNLO prior PDFs are calculated from the NLO posterior PDFs, the NNLO-fitted $a_i^{\mathrm{NLO}}$ do not stay at the prior PDFs but can move to other ranges. In other words, the NNLO fit is not a repeated NLO fit.
Tab.3 shows that there are significant differences between $\mathrm{Pct}_{\mathrm{A}}$ ($\mathrm{Pct}_{\sigma,\mathrm{A}}$) and $\mathrm{Pct}_{\mathrm{B1}}$ ($\mathrm{Pct}_{\sigma,\mathrm{B1}}$) for $a_i^{\mathrm{NNLO}}$. Although a few $|\mathrm{Pct}_{\mathrm{B1}}|$ are large (the largest is 316.8%), and several $|\mathrm{Pct}_{\sigma,\mathrm{B1}}|$ are also large, Model B1 is still a significant improvement over Model A. This can also be seen from their PM values, which change significantly. Model B2 improves the results even further: most $\mathrm{Pct}_{\mathrm{B2}}$ and $\mathrm{Pct}_{\sigma,\mathrm{B2}}$ are smaller than those from Models A and B1. Thus, for the NNLO fit, the more useful information is known, the better the fitting results are.
Fig.2(b) illustrates the proportions obtained by Models B1 and B2 at each order. Tab.4 compares the true values, the experimental values and the fitting results from Models B1 and B2. Both lead to the same conclusion as the NLO fit: Model B2 gives better predictions of the truncation errors and the theoretical values.
Tab.4 The comparison of the NNLO fitting values for the example. The subscripts tr, exp, B1 and B2 in the first row denote the true values, the experimental values, and the theoretical values from Model B1 and Model B2, respectively. The experimental values in the third column are sampled from the true values. $O_i$ is defined in Eq. (A1).
$i$ $10^2O_{i,\mathrm{tr}}$ $10^2O_{i,\mathrm{exp}}$ $10^2O_{i,\mathrm{B1}}$ $10^2O_{i,\mathrm{B2}}$
1 −35.800 −34.637 ± 0.716 −34.219 ± 2.034 −35.189 ± 0.910
2 0.173 0.171 ± 0.003 0.170 ± 0.007 0.172 ± 0.005
3 −0.276 −0.279 ± 0.006 −0.279 ± 0.049 −0.279 ± 0.036
4 0.603 0.590 ± 0.012 0.584 ± 0.032 0.590 ± 0.015
5 27.281 27.753 ± 0.546 27.091 ± 0.843 27.604 ± 0.675
6 −0.524 −0.548 ± 0.010 −0.550 ± 0.028 −0.547 ± 0.023
7 −1.486 −1.434 ± 0.030 −1.434 ± 0.031 −1.469 ± 0.020
8 −0.955 −0.970 ± 0.019 −0.968 ± 0.327 −0.971 ± 0.144
9 −0.227 −0.226 ± 0.005 −0.226 ± 0.008 −0.225 ± 0.010
10 −52.511 −52.773 ± 1.050 −53.136 ± 4.369 −52.212 ± 2.036
11 44.936 46.250 ± 0.899 46.466 ± 2.628 45.904 ± 1.176
12 −4.223 −4.397 ± 0.084 −4.406 ± 0.736 −4.390 ± 0.402
13 −14.674 −14.769 ± 0.293 −14.713 ± 2.670 −14.760 ± 1.834
14 −24.577 −24.765 ± 0.492 −24.304 ± 0.614 −24.692 ± 0.496
15 −15.864 −15.505 ± 0.317 −15.609 ± 1.776 −15.541 ± 1.006
16 3.831 3.746 ± 0.077 3.757 ± 0.154 3.761 ± 0.098
17 −4.193 −4.208 ± 0.084 −4.136 ± 0.134 −4.212 ± 0.093

4.3 Discussion

In the NLO fit, we have also removed one $O_i$ and fitted the rest. The results are almost the same as those of the 17-input fit. Moreover, the 16-input fit predicts the removed 17th quantity well. This shows that our model has good predictive ability.
We have fitted other examples and obtained the same conclusion. If an example converges faster than the one in this paper, so that the experimental errors and the NNLO contributions are of the same order, the experimental errors affect the HO values and the NNLO fitting results become a little worse. An example of this type can be downloaded from the source file of the arXiv version of this paper (arXiv: 2311.10423).

5 Observables and inputs

In order to fit the actual data in ChPT and compare the results of different methods, almost the same physical quantities are chosen as in Refs. [7, 8]. The only difference is that the covariance matrix of the ππ scattering lengths $a_0^0$, $a_0^2$ and the two-flavor LECs $\bar l_1$, $\bar l_2$ and $\bar l_4$ is also taken into account.
In Refs. [7, 8], 12 input values are used in the NLO fit: the quark mass ratio $m_s/\hat m$ [6, 24, 67, 68], the ratio of the decay constants of the $K$ meson and the $\pi$ meson $F_K/F_\pi$ [6, 7, 67, 68], the values at threshold $f_s$ and $g_p$ and the slopes $f_s'$ and $g_p'$ of the $K_{\ell4}$ form factors $F$ and $G$ [24], the ππ scattering lengths $a_0^0$ and $a_0^2$ [25], the πK scattering lengths $a_0^{1/2}$ and $a_0^{3/2}$ [69], and the pion scalar radius $\langle r^2\rangle_S^\pi$ in the form factor $F_S^\pi(t)$. In addition, 5 more input values are added for the NNLO fit: the curvature $c_S^\pi$ of the pion scalar form factor [69] and the four two-flavor LECs $\bar l_i$ ($i=1,\ldots,4$) [25, 70]. The values of these 17 physical quantities are listed below. In this paper, both 12 and 17 inputs are considered in the NLO fit for comparison.
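The values of $m_s/\hat m$ and $F_K/F_\pi$ are
$$\frac{m_s}{\hat m}=27.3^{+0.7}_{-1.3},\qquad \frac{F_K}{F_\pi}=1.199\pm0.003.$$
The values of $f_s$, $f_s'$, $g_p$ and $g_p'$ are
$$f_s=5.712\pm0.032,\qquad f_s'=0.868\pm0.049,\qquad g_p=4.958\pm0.085,\qquad g_p'=0.508\pm0.122.$$
The values of the ππ scattering lengths $a_0^0$, $a_0^2$ and the three relevant two-flavor LECs are
$$a_0^0=0.220\pm0.005,\qquad a_0^2=-0.0444\pm0.0010,\qquad \bar l_1=-0.4\pm0.6,\qquad \bar l_2=4.3\pm0.1,\qquad \bar l_4=4.4\pm0.2.$$
The covariance matrix of $a_0^0$, $a_0^2$ and $\bar l_1$, $\bar l_2$, $\bar l_4$ is listed in Tab.5.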
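Tab.5 The covariance matrix of $a_0^0$, $a_0^2$ and $\bar l_1$, $\bar l_2$, $\bar l_4$ [25]. The matrix is symmetric, so only the upper triangle is given.
$\Delta a_0^0$ $\Delta a_0^2$ $\Delta\bar l_1$ $\Delta\bar l_2$ $\Delta\bar l_4$
$\Delta a_0^0$ $2.0\times10^{-5}$ $3.2\times10^{-6}$ $1.9\times10^{-4}$ $1.7\times10^{-5}$ $4.2\times10^{-4}$
$\Delta a_0^2$ $9.7\times10^{-7}$ $1.6\times10^{-4}$ $1.2\times10^{-5}$ $4.2\times10^{-6}$
$\Delta\bar l_1$ $3.5\times10^{-1}$ $3.3\times10^{-2}$ $6.7\times10^{-2}$
$\Delta\bar l_2$ $1.2\times10^{-2}$ $7.2\times10^{-3}$
$\Delta\bar l_4$ $4.8\times10^{-2}$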
We have tested the fit with and without the covariance matrix. It has only a slight impact on the final fitting results, because the errors of $\bar l_i$ themselves are very large. Nevertheless, in order to make the results statistically more sound, the covariance matrix is included in the global fit.
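How a covariance matrix enters a fit can be sketched with the correlated $\chi^2$, $\chi^2=d^{T}C^{-1}d$, where $d$ is the vector of differences between theory and data. The example below uses only the $a_0^0$, $a_0^2$ block of Tab.5 and assumes negative powers of ten and a positive off-diagonal entry, since the signs are not recoverable from the extracted table. The theory values are hypothetical, and the function name is chosen for this illustration.

```python
import numpy as np

def correlated_chi2(theory, data, cov):
    """Chi-square d^T C^{-1} d for correlated measurements."""
    d = np.asarray(theory, dtype=float) - np.asarray(data, dtype=float)
    # Solve C x = d instead of inverting C explicitly (numerically safer).
    return float(d @ np.linalg.solve(np.asarray(cov, dtype=float), d))

# (a_0^0, a_0^2) block of Tab.5; exponents and off-diagonal sign are assumptions.
cov = np.array([[2.0e-5, 3.2e-6],
                [3.2e-6, 9.7e-7]])
data = np.array([0.220, -0.0444])     # experimental inputs quoted in Section 5
theory = np.array([0.2197, -0.0437])  # hypothetical theory values for illustration

chi2_corr = correlated_chi2(theory, data, cov)
chi2_diag = correlated_chi2(theory, data, np.diag(np.diag(cov)))
print("with correlations:   ", chi2_corr)
print("without correlations:", chi2_diag)
```

Comparing the two numbers shows directly how much the off-diagonal elements change the weight of a given deviation. In a Bayesian fit the same quadratic form appears in the exponent of a multivariate normal likelihood.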
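The experimental values of the πK scattering lengths $a_0^{1/2}m_\pi$ and $a_0^{3/2}m_\pi$ are
$$a_0^{1/2}m_\pi=0.224\pm0.022,\qquad a_0^{3/2}m_\pi=-0.0448\pm0.0077.$$
The experimental values of the pion scalar radius $\langle r^2\rangle_S^\pi$ and the curvature $c_S^\pi$ of the pion scalar form factor are
$$\langle r^2\rangle_S^\pi=0.61\pm0.04\ \mathrm{fm}^2,\qquad c_S^\pi=11\pm1\ \mathrm{GeV}^{-4}.$$
For $\bar l_3$, the following result is adopted [70]:
$$\bar l_3=3.2\pm0.7.$$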

6 Fitting the LECs in ChPT

This section adopts the Bayesian Model B described in Section 3.3 to perform a global fit, in order to obtain a new set of NLO and NNLO LECs. The truncation errors are considered in the fit. Most references in this paper indicate that all $L_i^r$ ($C_i^r$) are of order $10^{-3}$ ($10^{-6}$). Following the preparation in Section 3.1, they are first normalized by multiplying by a factor of $10^{3}$ ($10^{6}$), respectively.

6.1 The NLO fit of $L_i^r$ by Model A

Although this paper does not adopt the minimum $\chi^2$ method [68] to fit $L_i^r$, the NLO fit by Model A still gives similar results. In order to compare with the results in Ref. [7], this fit neither includes the covariance matrix nor considers the truncation errors. The fitting results with the first 12 inputs in Section 5 are shown in Tab.6. For comparison, the results in Ref. [7] are also given. “Free fit” means that no assumptions are made in the fit; in the other fits, $L_4^r$ is fixed to a given value. It can be seen that the two approaches indeed give very close results, so classical statistics and Bayesian statistics behave very similarly here. The slight differences come from the prior of $L_i^r$. This indirectly confirms that the two approaches are equivalent. However, it is easier to introduce extra information in Bayesian statistics. The minimum $\chi^2$ method can also add some constraints to the definition of $\chi^2$ [68], but this information is restricted. For example, the prior PDFs of the LECs cannot be incorporated. In addition, the modified $\chi^2$ departs from the original definition of $\chi^2$. In other words, the new $\chi^2$ may in fact not follow a $\chi^2$ distribution.
Tab.6 The NLO fit by Model A with different choices of $L_4^r$. Columns 2, 4, 6 and 8 are the results for free $L_4^r$, $10^3L_4^r\equiv0$, $10^3L_4^r\equiv0.3$ and $10^3L_4^r\equiv-0.3$, respectively. Columns 3, 5, 7 and 9 are the results in Ref. [7] for comparison.
LECs Free fit Free fit [7] $10^3L_4^r\equiv0$ $10^3L_4^r\equiv0$ [7] $10^3L_4^r\equiv0.3$ $10^3L_4^r\equiv0.3$ [7] $10^3L_4^r\equiv-0.3$ $10^3L_4^r\equiv-0.3$ [7]
$10^3L_1^r$ 1.04(09) 1.11(10) 0.90(09) 0.98(09) 0.92(09) 1.00(09) 0.88(09) 0.95(09)
$10^3L_2^r$ 1.00(11) 1.05(17) 1.49(08) 1.56(09) 1.41(08) 1.48(09) 1.57(08) 1.64(09)
$10^3L_3^r$ −3.52(28) −3.82(30) −3.52(28) −3.82(30) −3.52(28) −3.82(30) −3.52(28) −3.82(30)
$10^3L_4^r$ 1.82(25) 1.87(53) 0 0 0.3 0.3 −0.3 −0.3
$10^3L_5^r$ 1.24(03) 1.22(06) 1.25(03) 1.23(06) 1.24(03) 1.23(06) 1.25(03) 1.23(06)
$10^3L_6^r$ 1.46(25) 1.46(46) −0.12(05) −0.11(05) 0.13(06) 0.14(06) −0.37(04) −0.36(05)
$10^3L_7^r$ −0.40(14) −0.39(08) −0.19(14) −0.24(15) −0.23(14) −0.27(14) −0.16(14) −0.21(17)
$10^3L_8^r$ 0.60(12) 0.65(07) 0.51(12) 0.53(13) 0.53(12) 0.55(12) 0.50(12) 0.50(14)

6.2 The NLO fit of $L_i^r$

In order to fit $L_i^r$, an approach similar to that of the example in Section 4 is adopted, but the parameters at HO are slightly different from those in the example. $m_s/\hat m|_1$, $m_s/\hat m|_2$, $F_K/F_\pi$, $f_s$, $g_p$, $a_0^0$, $a_0^2$, $a_0^{1/2}m_\pi$, $a_0^{3/2}m_\pi$, $\bar l_1$, $\bar l_2$, $\bar l_3$ and $\bar l_4$ are expanded as in Eq. (10). $f_s'$, $g_p'$, $\langle r^2\rangle_S^\pi$ and $c_S^\pi$ involve a numerical differentiation and are estimated with the method in Section 3.4. Since some information about the higher-order contributions and the ranges of the LECs is available, the parameters are set in a way that lies between Model B1 and Model B2. Therefore, from here on, all data are fitted using Model B. In this subsection, besides fitting all 17 inputs (Model B$^{17}$), we also fit the first 12 inputs in Section 5 (Model B$^{12}$) for comparison with Refs. [7, 8].
The parameter settings can be found in Columns 2 to 7 of the corresponding table in Appendix B. The parameters for $a_0^{1/2}m_\pi$ and $a_0^{3/2}m_\pi$ are taken from Ref. [7], which indicates that their convergence is broken. The values for $f_s$ and $a_0^0$ are given by their NNLO distributions, which are obtained statistically from the ranges of $L_i^r$ and $C_i^r$ collected in Refs. [7, 8, 71] and the references therein. The other parameters are set in the same way as in Model B1. The priors of $L_i^r$ are given in Columns 2 and 5 of the corresponding table in Appendix B. They are based on the ranges of $L_i^r$ given in Refs. [7, 8, 71] and the references therein. Because the values in the different references are not very close, the prior ranges are chosen wide enough to cover all possible ranges.
The numerical results of both fits can be found in Tab.7. The results obtained by both Models B$^{12}$ and B$^{17}$ are close to the NNLO results in Refs. [7, 8]. Moreover, both of them satisfy the large-$N_c$ limit, i.e., $2L_1^r-L_2^r$, $L_4^r$ and $L_6^r$ are close to zero, although no strong prior is imposed on $L_4^r$. This shows that the contributions from truncation errors have a great impact on the NLO fit. It is also very possible that the truncation errors cannot be ignored in the NNLO fit. In addition, all theoretical errors from Model B are slightly larger than those in Refs. [7, 8]. This is because Ref. [8] does not consider the uncertainties of the truncation errors, and Ref. [7] does not consider the truncation errors at all. Model B not only estimates these truncation errors, but also takes their PDFs into account. These PDFs make the fitting errors slightly larger than those in Refs. [7, 8]. However, the difference is not very large, because the truncation errors are not very large. The change between 12 and 17 inputs is not very large either: the relative difference does not exceed 20%. Since more inputs are added, all theoretical errors become smaller. Because Model B$^{12}$ and Model B$^{17}$ do not use the same inputs, WAIC and LOOCV cannot serve as model evaluation criteria here, so we do not give these two values. The following discussion is based on the results of Model B$^{17}$, because the fit becomes more accurate as the number of inputs increases. The red part of Fig.3 is the corner plot of $L_i^r$ with 17 inputs, from which one can see both the distributions and the potential correlations between the $L_i^r$.
Fig.3 The corner plot of the 17-input fit of $L_i^r$. The red and blue colors denote the NLO and NNLO fits, respectively. The small and large loops denote the 68% HPD and the 95% HPD, respectively. The light-colored areas are the 68% HPD.

Tab.7 The fitting results of $L_i^r$. The superscripts indicate the number of inputs in the fit. Columns 5 to 8 are the NLO and the NNLO fitting results in Refs. [7, 8], respectively.
LECs NLO B$^{12}$ NLO B$^{17}$ NNLO B$^{17}$ NLO fit [7] NNLO fit [7] NLO fit 2 [8] NNLO fit 2 [8]
$10^3L_1^r$ 0.51(15) 0.46(14) 0.43(12) 1.00(09) 0.53(06) 0.44(05) 0.43(05)
$10^3L_2^r$ 1.08(22) 0.88(18) 0.83(15) 1.48(09) 0.81(04) 0.84(10) 0.74(04)
$10^3L_3^r$ −3.36(61) −2.94(49) −2.64(44) −3.82(30) −3.07(20) −2.84(16) −2.74(17)
$10^3L_4^r$ 0.19(18) 0.22(16) 0.26(11) 0.3 0.3 0.30(33) 0.33(08)
$10^3L_5^r$ 1.10(37) 1.10(34) 1.21(27) 1.23(06) 1.01(06) 0.92(02) 0.95(04)
$10^3L_6^r$ 0.05(22) 0.08(13) 0.12(11) 0.14(06) 0.14(05) 0.22(08) 0.20(03)
$10^3L_7^r$ −0.26(17) −0.34(18) −0.33(13) −0.27(14) −0.34(09) −0.23(12) −0.23(08)
$10^3L_8^r$ 0.51(22) 0.59(21) 0.60(15) 0.55(12) 0.47(10) 0.44(10) 0.42(09)
$\chi^2$ (d.o.f.) 1.0(9) 4.2(4) 4.3(9)
Tab.8 lists the 17-input theoretical contributions at each order. $\bar l_i$ is replaced by $l_i^r$ [2], because $l_i^r$ has, theoretically, a better convergence. It can be seen that most expansions conform to the convergence hypothesis very well. Most LO values contribute more than 70%, most NLO values contribute between 10% and 23%, and most HO values contribute less than 10%. None of these percentages is too large or too small. All theoretical results agree well with the experimental data. The ratios of two adjacent orders are about 0.2, except for $a_0^{1/2}$ and $a_0^{3/2}$, whose HO contributions are larger than the NLO ones. This situation also exists in Refs. [7, 8]. There are two reasons. The first is that the experimental values of both $a_0^{1/2}$ and $a_0^{3/2}$ are not very precise. Compared with $a_0^0$ and $a_0^2$, their errors are too large and the estimated truncation errors are not so precise, which may lead to poor convergence. The second is that the expansions of $a_0^{1/2}$ and $a_0^{3/2}$ do indeed have a convergence problem. Both issues call for more precise experiments and theoretical calculations, and we do not discuss them further in this paper. Although the NLO fitting results of $a_0^{1/2}$ and $a_0^{3/2}$ in Ref. [8] converge, that fit assumes a geometric-sequence model. The same problem exists in Ref. [7]. Model B, in contrast, introduces the priors and has a wider scope of application. A better prior can predict the theoretical value within a more reasonable range. In addition, the total contribution of $l_3^r$ is almost entirely given by the NLO, and its HO value tends towards 0. This is because the error of $l_3^r$ itself is very large, about 3.7 times its expected value. Therefore, the contribution of $l_3^r$ to the fit becomes very small, and the fitted expected value can be far from the experimental expected value. Hence, using the experimental value of $l_3^r$ as a constraint on the LECs, as in Ref. [8], does not seem particularly useful.
Model B adopts both the convergence assumption and the prior PDFs. It can handle most of the precise data, so most results conform to the convergence assumption very well. Only a few results show poor convergence, because of the problem itself or the large experimental errors.
Tab.8 The convergences of 17 inputs. The LECs are adopted from the 17-input NLO fitting results obtained by Model B in Tab.7. The second to the fourth columns are the contributions at the LO, NLO and HO, respectively. The percentage PctLO,NLO,HO is defined in Eq. (18). The last two columns are the theoretical estimation and the experimental inputs, respectively.
Observables LO|PctLO NLO|PctNLO HO|PctHO Theory Experiment
ms/m^|1 25.84 (96.2%) 0.84 (3.1%) 0.19 (0.7%) 26.9±3.1 27.3^{+0.7}_{−1.3}
ms/m^|2 24.21 (88.4%) 3.23 (11.8%) 0.06 (−0.2%) 27.4±6.3 27.3^{+0.7}_{−1.3}
FK/Fπ 1.000 (84.1%) 0.184 (15.5%) 0.004 (0.3%) 1.188±0.036 1.199 ± 0.003
fs 3.782 (66.2%) 1.267 (22.2%) 0.660 (11.6%) 5.709±0.347 5.712±0.032
gp 3.782 (77.9%) 0.915 (18.8%) 0.159 (3.3%) 4.856±0.191 4.958±0.085
a00 0.159 (72.5%) 0.044 (20.2%) 0.016 (7.4%) 0.2197±0.005 0.2196±0.0034
10a02 0.455 (104.0%) 0.019 (−4.4%) 0.002 (0.4%) 0.437±0.015 0.444±0.012
a01/2mπ 0.142 (63.3%) 0.033 (14.6%) 0.049 (22.1%) 0.224±0.014 0.224±0.022
10a03/2mπ 0.709 (158.3%) 0.084 (−18.8%) 0.177 (−39.5%) 0.448±0.090 0.448±0.077
10^3l1r 0 (0.0%) 4.07 (98.6%) 0.06 (1.4%) 4.1±1.1 4.0±0.6
10^3l2r 0 (0.0%) 3.50 (160.2%) 1.32 (−60.2%) 2.2±0.9 1.9±0.2
10^3l3r 0 (0.0%) 0.18 (104.0%) 0.01 (−4.0%) 0.2±3.1 0.3±1.1
10^3l4r 0 (0.0%) 6.07 (99.9%) 0.01 (0.1%) 6.1±2.0 6.2±1.3
fs 0.531±0.322 0.868±0.049
g 0.368±0.036 0.508±0.122
r2Sπ 0.60±0.13 0.61±0.04
cSπ 10±2 11±1
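The percentages in Tab.8 can be checked with a few lines of Python. This is only a sketch: Eq. (18) is not reproduced in this section, so the code assumes that each percentage is the share of one order in the summed contributions. The function name `order_percentages` is hypothetical.

```python
def order_percentages(contributions):
    """Return the share of each order in the summed contributions, in percent.

    Assumption: Pct_k = O_k / sum_j O_j, which is what the rows of Tab.8 suggest.
    """
    total = sum(contributions)
    if total == 0:
        raise ValueError("total contribution is zero; percentages are undefined")
    return [100.0 * c / total for c in contributions]


# f_s row of Tab.8: LO, NLO and HO contributions
fs = [3.782, 1.267, 0.660]
pct = order_percentages(fs)
print([round(p, 1) for p in pct])  # [66.2, 22.2, 11.6], as listed for f_s
```

Under this assumption the f_s row is reproduced to the quoted precision, which supports reading the percentages as plain shares of the total.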
Refs. [7, 8] fit the NLO LECs with only the first 12 inputs in Section 5, because the remaining five physical quantities vanish at the LO. Therefore, if the truncation errors are not considered, the NLO fit of these quantities contains no NNLO contribution, and the results could deviate substantially, because the HO contributions may be large. Although Ref. [8] can estimate the truncation errors, its geometric-sequence model requires values at two or more orders, so it cannot be applied to these five quantities. In contrast, the Bayesian method needs values at only one order to estimate the truncation errors. In other words, with Model B, even physical quantities with a vanishing LO can be included in the NLO fit. Therefore, we also perform a full fit of all 17 physical quantities at the NLO.
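To see why a geometric-sequence estimate needs two orders, consider the following sketch. It is not the exact model of Ref. [8]; it simply assumes that successive orders shrink by the constant ratio r = NLO/LO and sums the omitted terms. The function name `geometric_truncation_error` is hypothetical.

```python
def geometric_truncation_error(lo, nlo):
    """Estimate the sum of all omitted orders, assuming a constant ratio r = nlo / lo.

    Illustrative assumption only; the truncation-error model of Ref. [8] may differ in detail.
    """
    if lo == 0:
        # With a vanishing LO term there is no ratio to extrapolate from.
        raise ValueError("ratio NLO/LO is undefined when the LO term vanishes")
    r = nlo / lo
    if abs(r) >= 1:
        raise ValueError("the assumed geometric series does not converge for |r| >= 1")
    # NNLO + N3LO + ... = nlo * (r + r**2 + ...) = nlo * r / (1 - r)
    return nlo * r / (1.0 - r)


# a00 row of Tab.8: LO = 0.159, NLO = 0.044
print(geometric_truncation_error(0.159, 0.044))
```

For the a00 row the estimate is roughly 0.017, close to the HO value 0.016 in Tab.8. For a quantity whose LO term is zero, the ratio is undefined and the function raises an error, which mirrors the limitation described above.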

6.3 The NNLO fitting Lir and C~i

The Cir to be fitted at the NNLO in this paper are the same as those in Ref. [8]. There are 38 Cir but only 17 observables. Hence, these 38 Cir are combined into 17 linearly independent C~i before the fit. The definitions of C~i are given in Appendix A of Ref. [8]. In the NNLO fit, Lir and C~i are fitted simultaneously using the approach described in Section 3.4.
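The counting argument can be illustrated numerically. The sketch below uses a random 17 × 38 coefficient matrix as a stand-in for the real ChPT coefficients, which are not given here, and uses an SVD rather than the explicit definitions in Appendix A of Ref. [8]. All names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_obs, n_lec = 17, 38
# Hypothetical coefficients: observable_j = sum_i A[j, i] * C_i (not the real ChPT values)
A = rng.normal(size=(n_obs, n_lec))

rank = np.linalg.matrix_rank(A)
_, s, Vt = np.linalg.svd(A, full_matrices=True)
constrained = Vt[:rank]      # combinations of C_i that the observables can determine
unconstrained = Vt[rank:]    # combinations that leave every observable unchanged

# Moving C along an unconstrained direction does not change any observable.
C = rng.normal(size=n_lec)
shift = unconstrained.T @ rng.normal(size=n_lec - rank)
print(rank, np.allclose(A @ C, A @ (C + shift)))
```

With 17 observables, at most 17 independent combinations can be fixed by the data. The remaining directions must be controlled by priors or boundaries, which is why the individual Cir are treated separately from the C~i.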
The parameter settings are listed in Columns 8 to 10 of Table A4 in Appendix B. All the parameters are set as in Model B1, because nothing is known about the truncation errors. The priors of C~i can be found in Columns 6 and 7 of Table A5 in Appendix B; they are based on the Cir ranges given in Tab.9 of Ref. [8]. The blue part of Fig.3 shows the corner plot of Lir from the NNLO fit, and Column 4 of Tab.7 lists the corresponding results. Both Fig.3 and Tab.7 indicate that the theoretical expected values do not change significantly between the NLO and the NNLO fits. In addition, the NNLO fit gives Lir and their correlations with smaller theoretical errors, because the NNLO contributions are included. Tab.7 also indicates that the differences between the 17-input NNLO fit and the 12- or 17-input NLO fits are not very large, all within 20%. This indicates that the method is stable and does not cause an obvious change of Lir as the order increases, which is exactly one of the motivations of this paper.
Tab.9 The values and the errors of C~i, compared with the results in Ref. [8].
C~i Results Ref. [8] C~i Results Ref. [8]
C~1 0.11(22) 0.02(12) 10C~10 0.22(32) 0.06(13)
C~2 0.09(58) 0.19(34) C~11 0.22(07) 0.24(02)
10^2C~3 1.29(77) 0.72(42) 10^3C~12 0.02(05) 0.18(01)
10^2C~4 0.05(08) 0.22(03) 10^3C~13 0.13(35) 1.02(44)
10C~5 0.08(04) 0.16(02) 10^4C~14 0.38(27) 0.29(06)
10^3C~6 0.76(151) 0.26(13) 10^3C~15 0.12(03) 0.11(01)
10^2C~7 0.70(63) 0.42(12) 10^4C~16 0.46(35) 0.56(06)
10C~8 0.08(22) 0.45(09) 10^4C~17 0.26(22) 0.19(16)
10^2C~9 0.46(43) 0.99(11)
Fig. A1 in Appendix B gives the posterior distributions of C~i. The constraints in Eqs. (5)−(7) cause some C~i to deviate from normal distributions, but not seriously. Tab.9 shows the numerical results of C~i. Compared with the results in Ref. [8], all standard deviations are slightly larger, because Eq. (11) accounts for the uncertainties of the truncation errors, which enlarges the theoretical errors.
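The 68% HPD regions shaded in the posterior figures can be estimated directly from MCMC samples. The following is a minimal sketch, assuming a unimodal posterior and taking the HPD region as the shortest interval that contains the requested fraction of samples. The function name `hpd_interval` is hypothetical and is not taken from the paper.

```python
import numpy as np


def hpd_interval(samples, mass=0.68):
    """Shortest interval containing a fraction `mass` of the samples (unimodal case)."""
    x = np.sort(np.asarray(samples, dtype=float))
    n = x.size
    k = int(np.ceil(mass * n))            # number of samples inside the interval
    if k < 2 or k > n:
        raise ValueError("not enough samples for the requested mass")
    # Width of every window of k consecutive sorted samples
    widths = x[k - 1:] - x[: n - k + 1]
    i = int(np.argmin(widths))
    return x[i], x[i + k - 1]


# Usage on synthetic draws from a standard normal distribution
rng = np.random.default_rng(1)
draws = rng.normal(size=100_000)
lo, hi = hpd_interval(draws)
print(lo, hi)  # both ends lie near -1 and +1 for a standard normal
```

For a skewed or boundary-limited posterior the HPD interval is not symmetric about the mean, which is why it is more informative than a plain standard deviation when the constraints distort the distribution.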
Tab.10 gives the theoretical contributions at each order for the NNLO fit. Most physical quantities satisfy the chiral convergence very well, except for a01/2, 10a03/2, l2r and l3r. This situation also exists in the NLO fit and in Ref. [8]. The reason has been discussed in Section 6.2. It also leads to a large theoretical error of l3r. If more precise experimental data are introduced, this problem may disappear.
Tab.10 Same as Tab.8, except for the NNLO fit.
Observables LO|PctLO NLO|PctNLO NNLO|PctNNLO HO|PctHO Theory Experiment
ms/m^|1 25.84 (95.1%) 1.45 (5.3%) 0.12 (−0.5%) 0.003 (0.01%) 27.2±4.1 27.3^{+0.7}_{−1.3}
ms/m^|2 24.21 (88.4%) 3.60 (13.1%) 0.38 (−1.4%) 0.020 (−0.07%) 27.4±10.7 27.3^{+0.7}_{−1.3}
FK/Fπ 1.000 (83.2%) 0.197 (16.4%) 0.007 (0.6%) 0.002 (−0.12%) 1.202±0.050 1.199±0.003
fs 3.782 (66.7%) 1.342 (23.7%) 0.494 (8.7%) 0.050 (0.88%) 5.668±0.351 5.712±0.032
gp 3.782 (76.9%) 0.834 (17.0%) 0.284 (5.8%) 0.018 (0.37%) 4.918±0.080 4.958±0.085
a00 0.159 (72.2%) 0.045 (20.6%) 0.015 (6.8%) 0.001 (0.34%) 0.2204±0.004 0.2196±0.0034
10a02 0.455 (104.1%) 0.020 (4.6%) 0.003 (−0.7%) 0.005 (1.20%) 0.437±0.017 0.444±0.012
a01/2mπ 0.142 (63.2%) 0.034 (15.2%) 0.049 (21.6%) 0.000 (0.00%) 0.225±0.013 0.224±0.022
10a03/2mπ 0.709 (161.9%) 0.093 (−21.2%) 0.179 (−40.8%) 0.000 (0.04%) 0.438±0.061 0.448±0.077
10^3l1r 0 (0.0%) 3.57 (90.4%) 0.38 (9.7%) 0.004 (−0.11%) 4.0±0.94 4.0±0.6
10^3l2r 0 (0.0%) 3.32 (174.8%) 1.36 (−71.8%) 0.057 (−3.01%) 1.9±0.82 1.9±0.2
10^3l3r 0 (0.0%) 0.25 (−115.1%) 0.46 (214.6%) 0.001 (0.48%) 0.2±3.01 0.3±1.1
10^3l4r 0 (0.0%) 6.85 (104.8%) 0.27 (−4.1%) 0.044 (−0.67%) 6.5±1.91 6.2±1.3
fs 0.472±0.461 0.868±0.049
g 0.508±0.029 0.508±0.122
r2Sπ 0.59±0.07 0.61±0.04
cSπ 11±1 11±1

6.4 The NNLO fitting Cir

This section discusses the fit of Cir. The C~i determined in Tab.9 are linear combinations of Cir. Although there are fewer C~i than Cir, three Cir can be determined by solving the linear equations [8]. However, some of these values are one to two orders of magnitude larger than those in other references, so some constraints need to be introduced. For the remaining Cir, which cannot be solved for, the distributions are obtained by the Monte Carlo method [8]. The approach in Ref. [8] can solve this problem, but its efficiency is very low and it takes a lot of time. Therefore, this paper adopts the MCMC algorithms in Section 2. We have repeated the computation many times and obtained similar results each time, so randomness does not affect the results of this method. The difference is that the prior PDFs of all Cir are set to uniform distributions, each with its own boundaries. These boundaries are the same as in Eq. (38) of Ref. [8]. Uniform priors are used instead of normal priors because we want to explore the boundary dependence of each Cir in this overfitting problem. Normal distributions would generate fewer samples near the boundaries, so the efficiency would be low.
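The role of uniform priors in an underdetermined fit can be sketched with a small random-walk Metropolis sampler. This is an illustration only: the MCMC algorithms of Section 2 are not reproduced here, and the two-parameter toy likelihood, the function name `metropolis_box` and all numerical settings are assumptions.

```python
import numpy as np


def metropolis_box(log_like, lower, upper, n_steps=20000, step=0.2, seed=0):
    """Random-walk Metropolis with independent uniform priors on [lower, upper]."""
    rng = np.random.default_rng(seed)
    lower, upper = np.asarray(lower, float), np.asarray(upper, float)
    x = 0.5 * (lower + upper)                 # start at the centre of the box
    logp = log_like(x)
    chain = np.empty((n_steps, x.size))
    for i in range(n_steps):
        prop = x + step * (upper - lower) * rng.normal(size=x.size)
        # A uniform prior is constant inside the box and zero outside it,
        # so proposals outside the box are rejected outright.
        if np.all((prop >= lower) & (prop <= upper)):
            logp_prop = log_like(prop)
            if np.log(rng.uniform()) < logp_prop - logp:
                x, logp = prop, logp_prop
        chain[i] = x
    return chain


# Toy underdetermined problem: only the combination c1 + c2 = 1.0 +/- 0.1 is measured.
def log_like(c):
    return -0.5 * ((c[0] + c[1] - 1.0) / 0.1) ** 2


chain = metropolis_box(log_like, lower=[-2, -2], upper=[2, 2])
post = chain[5000:]                           # drop the burn-in part
combo = post[:, 0] + post[:, 1]
print(combo.mean(), combo.std())              # the measured combination is well constrained
print(post[:, 0].std())                       # a single parameter is limited mainly by the box
```

In this toy case the combination plays the role of a C~i and the single parameters play the role of boundary-dependent Cir: the data fix the sum, while the width of each individual posterior is set mostly by the prior boundaries.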
Fig. A2 in Appendix B illustrates the posterior distributions of Cir. Different Cir have differently shaped distributions. Cir (i = 3, 7, 8, 10, 16, 17, 18, 20, 22, 23, 28, 30, 32, 33, 36, 63, 66, 69, 83, 88, 90) have a large probability near both boundaries, so their posterior PDFs depend on both boundaries. In addition, Cir (i = 2, 6, 26, 29, 34) depend on only one boundary, as can also be seen from their posterior distributions: one side has a shape similar to a half-Gaussian distribution. The constraint on these Cir at this side is reliable, but the other side gives no constraint. Finally, the twelve Cir (i = 1, 4, 5, 11, 12, 13, 14, 15, 19, 21, 25, 31) have Gaussian-like posterior PDFs, so these twelve results have higher credibility. Of course, 17 data are far from adequate to fit 38 Cir, so there is an overfitting problem and some Cir are boundary-dependent. This property is similar to that in Ref. [8].
Tab.11 gives the fitted values of Cir and compares them with the results in Refs. [7, 8, 71]. The brackets “[” and “]” denote that the results depend strongly on the lower and the upper boundaries, respectively. The parentheses “(” and “)” denote that the results depend weakly on the lower and the upper boundaries, respectively. We have tried doubling the boundaries: the results that depend strongly on a boundary deviate considerably from the original values, while those that depend weakly change only slightly. The boundaries chosen in Ref. [8] are wide enough to cover almost all results in the other references [6, 7, 19, 71–80]. Hence, the true values lie within the intervals in Tab.11 with a large probability.
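The boundary-doubling check can be illustrated in one dimension. The sketch below is not the multi-parameter fit of the paper: it assumes a single parameter with a Gaussian likelihood and a uniform prior, and compares the posterior mean before and after both boundaries are doubled. The function names and all numbers are hypothetical.

```python
import numpy as np


def posterior_mean(mu, sigma, lo, hi, n=20001):
    """Mean of a Gaussian likelihood N(mu, sigma) multiplied by a uniform prior on [lo, hi]."""
    x = np.linspace(lo, hi, n)
    w = np.exp(-0.5 * ((x - mu) / sigma) ** 2)
    return float(np.sum(x * w) / np.sum(w))


def boundary_shift(mu, sigma, lo, hi):
    """Change of the posterior mean when both boundaries are doubled."""
    return posterior_mean(mu, sigma, 2 * lo, 2 * hi) - posterior_mean(mu, sigma, lo, hi)


# Weak dependence: the likelihood peak lies well inside the box.
weak = boundary_shift(mu=1.0, sigma=1.0, lo=-5.0, hi=5.0)
# Strong dependence: the likelihood peak lies outside the original box.
strong = boundary_shift(mu=8.0, sigma=3.0, lo=-5.0, hi=5.0)
print(weak, strong)
```

In the first case the shift is negligible, while in the second the mean moves by several units, which is the qualitative behaviour that the round and square brackets in Tab.11 are meant to encode.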
Tab.11 The values of Cir in units of 10^{−6}. The brackets “[” and “]” represent strong dependence on the lower and the upper boundaries, respectively. “(” and “)” represent weak dependence on the lower and the upper boundaries, respectively. The results with an asterisk mean that the input boundaries on the website [68] are very close to those in Ref. [7] (less than 10^{−10}). The symbol “0” for the results in Ref. [71] means that these values are zero in the large-NC limit.
LECs Results Ref. [8] Ref. [7] Ref. [71] LECs Results Ref. [8] Ref. [7] Ref. [71]
C1r 14.82 (41.49) 14[37] 12 25.33^{+0.60}_{−1.11} C21r −0.41 (0.82) 0.28(0.56) −0.48 0.51^{+0.09}_{−0.09}
C2r 3.48 (8.98] 16(1] 3.0 0 C22r 5.88 [15.71] 14(13] 9.0 2.98^{+1.70}_{−2.21}
C3r 1.70 [6.05] 2.9[6.0] 4.0 0.43^{+0.09}_{−0.09} C23r 0.92 [3.52] 5.6(0.9] −1.0 0
C4r 18.54 (29.94) 26[16) 15 18.11^{+0.51}_{−0.85} C25r −21.17 (58.67) 34(33) −11 25.76^{+5.02}_{−3.49}
C5r −3.62 (19.23) 31[7) −4.0 10.88^{+0.85}_{−1.11} C26r −4.30 [42.04) 31(36] 10 23.04^{+2.98}_{−4.59}
C6r −3.43 [4.22) 7.9[1.8) −4.0 0 C28r 0.45 [3.95] 4.9[0.9) −2.0 1.53^{+0.00}_{−0.09}
C7r 1.00 [6.26] 2.4[6.1] 5.0 0 C29r −23.20 [24.09) 49[11) −20 8.42^{+2.04}_{−1.79}
C8r 10.52 [16.84] 15[16] 19 17.85^{+1.36}_{−1.28} C30r 3.44 [4.61] 9.0(1.9] 3.0 3.15^{+0.09}_{−0.17}
C10r 3.51 [13.75] 13(6] −0.25 5.53^{+0.43}_{−0.51} C31r 2.15 (9.56) 0.71(6.70) 2.0 3.91^{+0.60}_{−1.11}
C11r −2.90 (4.17) 2.6(1.8) −4.0 0 C32r 1.79 [3.63] 5.6(1.9] 1.7 1.45^{+0.26}_{−0.17}
C12r −6.02 (5.57) 18(2) −2.8 2.89^{+0.09}_{−0.09} C33r −0.01 [3.58] 0.69[3.12) 0.82 0.43^{+0.43}_{−0.17}
C13r 1.74 (2.06) 2.2(0.9) 1.5 0 C34r 9.27 (10.48] 0.68(4.67) 7.0 5.61^{+2.47}_{−1.53}
C14r −3.32 (3.15) 4.2(1.2) −1.0 7.40^{+1.19}_{−1.79} C36r 1.27 [5.12] 4.1(4.3] 2.0 0
C15r 1.30 (1.13) 1.2(1.0) −3.0 0 C63r 11.09 [22.82] 6.6[16.8) n/a 21.08^{+2.13}_{−1.79}
C16r 1.10 [4.20] 0.81(1.34) 3.2 0 C66r 3.90 [26.83] 6.5[25.4] n/a 6.80^{+0.34}_{−0.60}
C17r 0.27 [2.62] 3.6(1.6] −1.0 1.45^{+0.09}_{−0.34} C69r −1.16 [19.86] 4.6[19.0] n/a 4.42^{+0.00}_{−0.09}
C18r −1.46 [5.45] 1.1[5.4] 0.63 5.10^{+0.60}_{−0.77} C83r −1.11 [20.31] 14(16] n/a 14.79^{+1.45}_{−1.87}
C19r −5.42 (5.22) 5.3(2.8) −4.0 2.30^{+0.77}_{−1.11} C88r −24.09 [63.17] 38[59] n/a 14.37^{+7.91}_{−5.78}
C20r 0.53 [3.45] 2.9[2.3) 1.0 1.45^{+0.26}_{−0.17} C90r 20.83 [62.90] 35[44) n/a 19.72^{+4.68}_{−3.74}

7 Discussion and summary

This paper proposes a more general Bayesian model (Model B) with truncation errors. The model is based on the idea of a simple truncation-error model [8] and the Bayesian model framework [28]. Compared with Refs. [7, 8], Model B has several advantages.
i) This model can turn the understanding of ChPT into prior knowledge for the fit, including information about the LECs, the convergence of ChPT and the truncation errors. The prior information can be conveniently introduced through Eqs. (5)−(7). No other assumptions are needed, such as the geometric-sequence assumption in Ref. [8]. The model can also give a set of more precise NLO fitting LECs. A similar result is obtained in the NNLO fit of Ref. [7]; see Tab.7. Hence, there are good reasons to believe that the NNLO fitting LECs are also more precise, although no higher-order fit is available for comparison.
ii) With the help of the MCMC method, the distributions of the LECs can be obtained, and the computation is faster: the computational time of Model B is the shortest. The Bayesian method has another inherent advantage. Clear distribution figures of the LECs can be obtained, because Bayesian statistics can generate more points in a given time. Therefore, one can obtain not only the expected values and errors of the LECs, but also their distributions. Refs. [7, 8] give the errors of the LECs but not their distributions.
iii) Model B gives a general fitting method that can be applied to other problems. The two extremes of this model (Models B1 and B2) have been evaluated with a toy example in Section 4, which confirms that more prior information indeed gives more precise results. With the quantified evaluation criteria in Section 3.5, the improvement due to the prior information can be seen more clearly. The actual ChPT fit lies between the two extremes. It is better than Model A, which gives a result similar to that of the minimum χ2 method in Ref. [7].
iv) For the NNLO LECs Cir, smoother PDFs are obtained than in Ref. [8] (Ref. [7] does not give PDFs). With these PDFs, one can see more intuitively how the Cir depend on the boundaries.
v) This paper also contains some minor improvements. The covariance matrix given in Ref. [25] is taken into account. Compared with Ref. [7], the results are insensitive to the initial conditions.
To test the effectiveness of the model, one example is randomly generated to imitate the actual ChPT. Some parameters ai and some quantities Oi are introduced, which imitate the LECs and the experimental data, respectively. The exact values of ai and Oi are known and are treated as the true values. Model A, which does not consider the truncation errors, is also introduced for comparison with two ideal cases of Model B. One case knows nothing about the truncation errors except their orders of magnitude. The other knows the distributions of the truncation errors. The fitting results indicate that prior information about the truncation errors can improve the fit greatly, even if this information is not very precise. Hence, Model B is adopted to fit the actual ChPT data.
The actual ChPT fit indicates that the Bayesian method without truncation errors gives results similar to those of classical statistics. In other words, classical statistics can be treated as a special case of Bayesian statistics, but Bayesian statistics can be applied more widely. With the help of Model B, some Lir and Cir (defined in Ref. [64]) are fitted at the NLO and the NNLO. The fitted Lir are almost unchanged between the NLO and the NNLO fits. The change between 12 and 17 input data is also small, but all the theoretical errors decrease for the 17 inputs, because the truncation errors are estimated more precisely. Model B also solves a problem of the free fit, in which L4r and L6r become very large although they vanish in the large-NC limit. Because the number of Cir to be fitted is larger than the number of experimental data, some independent C~ir, which are linear combinations of the Cir, are fitted first. From the posterior PDFs of Cir, reliable intervals are obtained for twelve Cir, five Cir are constrained only by the upper or the lower boundary of their intervals, and the other 21 Cir depend strongly on both boundaries. More experimental data are needed to pin down these uncertain Cir. Because the C~ir do not suffer from overfitting, they are more precise than the Cir. If more values of some Cir become known, other Cir can be constrained through these C~ir. For the physical quantities to be fitted, most theoretical contributions converge well, except for a01/2 and a03/2. This possibly comes from the large experimental errors, or some of these quantities are indeed not convergent. Clarifying this requires more precise experimental data and theoretical calculations in the future. Overall, Model B can estimate the truncation errors very well.
Some input parameters are very rough, such as those in Eqs. (6) and (7). A more precise estimation beyond the simple convergence assumption will be studied in future work. In addition, if more analytical and experimental results are introduced, the results should become more precise. However, the NNLO theoretical calculation is complicated and needs to be studied in the future. This approach can also be used to fit other LECs, such as the pion-nucleon and meson-baryon chiral LECs. However, both the experimental data and the theoretical results for them are currently scarcer than for the mesonic LECs.
In conclusion, truncation errors usually cannot be ignored in the global fit, and some prior information can improve the fit greatly, even if this information is sometimes not very exact. Model B provides a feasible implementation scheme. A new set of more reliable Lir and Cir is fitted by Model B. This model can fit not only the LECs in ChPT, but also the parameters in other EFTs and in perturbation theory.

8 Appendix A: One testing example

Eq. (A1) gives the functions of the example in Section 4. For convenience, parameters with the same name in different functions are different. The values of bi and aiLO can be found in Table A1. The values of aiNLO and aiNNLO are given in the second row in Tab.1 and the second column III, respectively, which are marked by a subscript “tr”.
O1 = b1exp(a1NNLOt3+a4NLOb2t2+a7NLOb3t2a1LOt)b1,
O2 = b1sin(b2exp(b5))b1sin(b2exp(a2NNLOb3t3a8NLOb4t2+b5exp(a1NLOb6t2)a2LOt)),
O3 = b1ln(a3NNLOb3t3a1NLOb2t2a6NLOb4t2a3LOt+1),
O4 = b1exp(1b2)b1exp(a4NNLOb7t3+a4NNLOt3a1NLOb6t2+a3NLOb5t2a4NLOb4t2b2cos(a3NLOb3t2)+a4LOt+1),
O5 = b1ln(a5NNLOb4t3a5NNLOb6t3a6NLOb2b3t2a8NLOb5t2a5LOt+1),
O6 = b1ln(b2ln(a6NNLOb4t3+a2NLOb3t2+a6NLOb5t2+a6LOt+1)+1),
O7 = b1exp(b2)+b1exp(a1NLOb6t2+b2exp(a7NNLOt3+a5NLOb3t2+a5NLOt2+a8NLOt2)+b4sin(a4NLOb5t2)+a7LOt),
O8 = b1exp(b2exp(a8NNLOb3t3+a3NLOb4t2+a7NLOt2b5a8LOt))b1exp(b2),
O9 = b1ln(b2+1)b1ln(a9NNLOb4t3+a9NNLOb5t3+a9NNLOb8t3+b2exp(a5NLOb3t2)b6sin(a9NNLOb7t3)+a9LOt+1),
O10 = b1sin(b2+b4sin(b5))+b1sin(a10NNLOt3a4NLOt2+b2exp(a5NLOb3t2)+b4sin(b5exp(a2NLOb6t2))+a10LOt),
O11 = b1ln(b2sin(a11NNLOb6t3+a2NLOb3t2a3NLOb4t2+a7NLOb5t2a11LOt)+1),
O12 = b1ln(b2sin(a12NNLOb6t3+a12NNLOt3+a2NLOb3t2+a4NLOt2+a5NLOt2+a6NLOb4b5t2+a12LOt)+1),
O13 = b1exp(a8NLOb8t2b2sin(a4NLOb3t2)b4exp(a13NNLOb6t3a5NLOb5t2+a6NLOb7t2+a13LOt)+a13LOt)b1exp(b4),
O14 = b1ln(a14NNLOb4t3b2ln(a14NNLOb3t3+1)b5sin(a2NLOb6t2)+a14LOt+1),
O15 = b1exp(b3)+b1exp(a15NNLOb2t3a7NLOb5t2a7NLOb7t2a8NLOb6t2+b3cos(a7NLOb4t2)a15LOt),
O16 = b1sin(ln(b6+1)+1)b1sin(a16NNLOb5t3+a3NLOb2t2+a5NLOb4t2+a16LOtln(b6exp(a1NLOb7t2)+1)exp(a3NLOb3t2)),
O17 = b1ln(b2exp(b3)b5sin(b6)+1)b1ln(b2exp(b3exp(a8NLOb4t2))b5sin(b6exp(C17b7t3))+a17LOt+1).
Table A1 The values of parameters bi and aiLO in Eq. (A1). Because the values are exact, more significant digits are given.
10^2b1 10^2b2 10^2b3 10^2b4 10^2b5 10^2b6 10^2b7 10^2b8 10^2aiLO
O1 −50.00000 −50.00000 −50.00000 −50.00000
O2 10.00000 10.00000 10.00000 10.00000 10.00000 10.00000 10.00000
O3 −0.27574 −81.82470 −20.69689 −95.19898 −130.30649
O4 −0.15389 55.22236 −22.53571 10.47685 14.87959 10.84243 85.28317 164.01310
O5 32.80394 52.16045 52.16045 42.66749 12.43968 77.33250 52.35691
O6 −10.25934 −10.95043 −11.71888 −28.67356 −26.84030 −32.24206
O7 −2.39804 2.37745 −9.62388 9.39379 9.39265 0.57071 42.90526
O8 −24.83947 69.61634 −2.52600 −10.10231 7.19485 99.63693
O9 0.51431 99.49478 30.47334 32.08646 32.08646 113.26784 96.25052 32.08646 77.78096
O10 −69.04271 −69.23904 38.65428 20.36290 10.78947 10.06250 82.44726
O11 −62.98961 −66.28358 42.37512 −8.89931 13.20056 6.67356 91.91503
O12 50.00000 10.00000 −135.00000 10.00000 10.00000 −100.00000
O13 74.39589 103.26600 103.32687 156.81033 95.54550 83.85802 100.61745 105.09613 107.97725
O14 −33.59097 −54.18416 −54.20951 −41.78655 −28.19762 −28.53647 −48.27921
O15 28.77264 26.35989 40.37381 49.81383 40.01344 63.80493 40.01344 45.46315
O16 −9.19703 3.17944 1.51912 5.66408 −5.66251 −8.10996 10.10760 58.11734
O17 −29.07789 −62.10099 −44.02866 −48.43771 −31.06314 −34.52566 −40.99873 −12.31774

9 Appendix B: Some tables and figures for the fits

Table A2 The fitting parameters and the priors in Model B2. The subscripts NLO and NNLO represent the NLO and NNLO fit, respectively. The definitions of these parameters are in Eqs. (10)−(12) and the text below them.
i μe,NLO σe,NLO pNLO μaiNLO σaiNLO μe,NNLO σe,NNLO pNNLO μaiNNLO σaiNNLO
1 0.140 0.050 1 0.616 0.308 0.040 0.020 1 0.023 0.012
2 0.100 0.050 1 0.751 0.376 0.040 0.020 0 0.178 0.089
3 0.020 0.050 0 3.232 1.616 0.100 0.030 1 0.758 0.379
4 0.040 0.050 1 0.268 0.134 0.120 0.036 1 0.196 0.098
5 0.140 0.050 1 1.097 0.549 0.060 0.020 1 0.146 0.073
6 0.110 0.050 0 0.108 0.054 0.020 0.020 0 0.200 0.100
7 0.110 0.050 1 0.281 0.140 0.060 0.020 1 0.347 0.173
8 0.140 0.050 1 0.434 0.217 0.010 0.020 0 0.484 0.242
9 0.070 0.050 0 n/a n/a 0.130 0.039 0 0.958 0.479
10 0.120 0.050 0 n/a n/a 0.050 0.020 0 0.061 0.031
11 0.110 0.050 0 n/a n/a 0.060 0.020 0 0.275 0.138
12 0.010 0.050 1 n/a n/a 0.040 0.020 1 0.217 0.109
13 0.140 0.050 1 n/a n/a 0.060 0.020 1 0.987 0.494
14 0.140 0.050 1 n/a n/a 0.070 0.021 1 0.279 0.139
15 0.020 0.050 0 n/a n/a 0.030 0.020 1 0.098 0.049
16 0.060 0.050 0 n/a n/a 0.050 0.020 1 0.622 0.311
17 0.140 0.050 1 n/a n/a 0.050 0.020 1 0.187 0.093
Table A3 The prior of the LECs in Model B2. The subscripts NLO and NNLO represent the NLO and NNLO fit, respectively.
i μaiNLO σaiNLO μaiNNLO σaiNNLO
1 0.616 0.308 0.023 0.012
2 0.751 0.376 0.178 0.089
3 3.232 1.616 0.758 0.379
4 0.268 0.134 0.196 0.098
5 1.097 0.549 0.146 0.073
6 0.108 0.054 0.200 0.100
7 0.281 0.140 0.347 0.173
8 0.434 0.217 0.484 0.242
9 n/a n/a 0.958 0.479
10 n/a n/a 0.061 0.031
11 n/a n/a 0.275 0.138
12 n/a n/a 0.217 0.109
13 n/a n/a 0.987 0.494
14 n/a n/a 0.279 0.139
15 n/a n/a 0.098 0.049
16 n/a n/a 0.622 0.311
17 n/a n/a 0.187 0.093
Table A4 The parameters for fitting Lir and C~i. The superscripts 12 and 17 represent the fit with 12 and 17 inputs, respectively. The subscripts NLO and NNLO represent the NLO and NNLO fitting, respectively. The definitions of these parameters are in Eqs. (10)−(12) and the text below them.
Quantity μe,NLO12 σe,NLO12 pNLO12 μe,NLO17 σe,NLO17 pNLO17 μe,NNLO17 σe,NNLO17 pNNLO17
ms/m^|1 0.050 0.050 0.5 0.050 0.050 0.5 0.020 0.020 0.5
ms/m^|2 0.050 0.050 0.5 0.050 0.050 0.5 0.020 0.020 0.5
FK/Fπ 0.050 0.050 0.5 0.050 0.050 0.5 0.020 0.020 0.5
fs 0.150 0.050 1 0.150 0.050 1 0.020 0.020 0.5
gp 0.050 0.050 0.5 0.050 0.050 0.5 0.020 0.020 0.5
fs 0.050 0.050 0.5 0.050 0.050 0.5 0.020 0.020 0.5
g 0.050 0.050 0.5 0.050 0.050 0.5 0.020 0.020 0.5
a00 0.100 0.050 1 0.100 0.050 1 0.020 0.020 0.5
10a02 0.050 0.050 0.5 0.050 0.050 0.5 0.020 0.020 0.5
a01/2mπ 0.350 0.105 1 0.350 0.105 1 0.020 0.020 0.5
10a03/2mπ 0.250 0.075 0 0.250 0.075 0 0.020 0.020 0.5
r2Sπ 0.200 0.060 0.5 0.200 0.060 0.5 0.020 0.020 0.5
cSπ n/a n/a n/a 0.200 0.060 0.5 0.020 0.020 0.5
l¯1 n/a n/a n/a 0.200 0.060 0.5 0.050 0.050 0.5
l¯2 n/a n/a n/a 0.200 0.060 0.5 0.050 0.050 0.5
l¯3 n/a n/a n/a 0.200 0.060 0.5 0.050 0.050 0.5
l¯4 n/a n/a n/a 0.200 0.060 0.5 0.050 0.050 0.5
Table A5 The parameters of the priors of Lir and C~i. Their definition is above Eq. (20). The superscripts 12 and 17 represent the fit with 12 and 17 inputs, respectively. The subscripts NLO and NNLO represent the NLO and the NNLO fits, respectively.
i μLir,NLO12 σLir,NLO12 μLir,NLO17 σLir,NLO17 μC~i,NNLO17 σC~i,NNLO17
1 0.500 0.300 0.500 0.300 0.077 0.989
2 1.000 0.500 1.000 0.500 0.190 3.102
3 3.000 1.000 3.000 1.000 1.073 0.007
4 0.200 0.200 0.200 0.200 0.068 0.095
5 1.000 0.500 1.000 0.500 0.040 0.071
6 0.000 0.300 0.000 0.300 0.806 1.772
7 0.300 0.300 0.300 0.300 0.561 1.581
8 0.500 0.500 0.500 0.500 0.043 0.250
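9 n/a n/a n/a n/a 0.173 0.497
10 n/a n/a n/a n/a 0.405 1.819
11 n/a n/a n/a n/a 0.066 0.227
12 n/a n/a n/a n/a 0.013 0.053
13 n/a n/a n/a n/a 0.244 0.388
14 n/a n/a n/a n/a 0.007 0.693
15 n/a n/a n/a n/a 0.022 0.072
16 n/a n/a n/a n/a 0.214 1.122
17 n/a n/a n/a n/a 0.207 0.222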
Fig.A1 The posterior distributions of the NNLO fitting C~i. The vertical coordinate is the posterior PDF and the horizontal coordinate is the value of C~i. The pink shaded area depicts the 68% HPD. The blue line is the distribution curve of Lir.


Fig. A2 The posterior distributions of Cir. The horizontal axis represents the value of Cir, and the upper and the lower boundaries are given in Eq. (38) in Ref. [8]. The vertical coordinate is the posterior PDF. The pink shaded area depicts the 68% HPD. The blue line is the distribution curve of Cir.


References

[1]
S. Weinberg, Phenomenological Lagrangians, Physica A 96(1−2), 327 (1979)
[2]
J. Gasser and H. Leutwyler, Chiral perturbation theory to one loop, Ann. Phys. 158(1), 142 (1984)
[3]
J. Gasser and H. Leutwyler, Chiral perturbation theory: Expansions in the mass of the strange quark, Nucl. Phys. B 250(1−4), 465 (1985)
[4]
J. Bijnens, G. Colangelo, and G. Ecker, The mesonic chiral Lagrangian of order p6, J. High Energy Phys. 02, 020 (1999)
[5]
J. Bijnens, N. Hermansson-Truedsson, and S. Wang, The order p8 mesonic chiral Lagrangian, J. High Energy Phys. 01(1), 102 (2019)
[6]
J. Bijnens and I. Jemos, A new global fit of the Lr at next-to-next-to-leading order in chiral perturbation theory, Nucl. Phys. B 854(3), 631 (2012)
[7]
J. Bijnens and G. Ecker, Mesonic low-energy constants, Annu. Rev. Nucl. Part. Sci. 64(1), 149 (2014)
[8]
Q. H. Yang, W. Guo, F. J. Ge, B. Huang, H. Liu, and S. Z. Jiang, New method for fitting the low-energy constants in chiral perturbation theory, Phys. Rev. D 102(9), 094009 (2020)
[9]
K. U. Can, G. Erkol, M. Oka, and T. T. Takahashi, Look inside charmed-strange baryons from lattice QCD, Phys. Rev. D 92(11), 114515 (2015)
[10]
K. U. Can, G. Erkol, B. Isildak, M. Oka, and T. T. Takahashi, Electromagnetic structure of charmed baryons in lattice QCD, J. High Energy Phys. 05(5), 125 (2014)
[11]
H. Bahtiyar, K. U. Can, G. Erkol, M. Oka, and T. T. Takahashi, Ξ → Ξc transition in lattice QCD, Phys. Lett. B 772, 121 (2017)
[12]
T. M. Yan, H. Y. Cheng, C. Y. Cheung, G. L. Lin, Y. C. Lin, and H. L. Yu, Heavy quark symmetry and chiral dynamics, Phys. Rev. D 46(3), 1148 (1992) [Erratum: Phys. Rev. D 55, 5851 (1997)]
[13]
R. J. Dowdall, C. T. H. Davies, G. P. Lepage, and C. McNeile, Vus from π and K decay constants in full lattice QCD with physical u, d, s and c quarks, Phys. Rev. D 88, 074504 (2013)
[14]
A. Bazavov et al. (MILC), Results for light pseudoscalar mesons, PoS LATTICE 2010, 074 (2010)
[15]
V. Bernard and E. Passemar, Chiral extrapolation of the strangeness changing form factor, J. High Energy Phys. 04, 001 (2010)
[16]
A. Bazavov et al. (MILC), MILC results for light pseudoscalars, in: Proceedings of 6th International Workshop on Chiral dynamics: Bern, Switzerland, July 6–10, 2009, PoS CD09, 007 (2009)
[17]
A. Bazavov, D. Toussaint, C. Bernard, J. Laiho, C. DeTar, L. Levkova, M. B. Oktay, S. Gottlieb, U. M. Heller, J. E. Hetrick, P. B. Mackenzie, R. Sugar, and R. S. Van de Water, Nonperturbative QCD simulations with 2+1 flavors of improved staggered quarks, Rev. Mod. Phys. 82(2), 1349 (2010)
[18]
M. Golterman, K. Maltman, and S. Peris, NNLO low-energy constants from flavor-breaking chiral sum rules based on hadronic τ-decay data, Phys. Rev. D 89(5), 054036 (2014)
[19]
P. Colangelo, J. J. Sanz-Cillero, and F. Zuo, Holography, chiral Lagrangian and form factor relations, J. High Energy Phys. 11, 012 (2012)
[20]
Z. H. Guo, J. J. Sanz Cillero, and H. Q. Zheng, Partial waves and large NC resonance sum rules, J. High Energy Phys. 06, 030 (2007)
[21]
Z. H. Guo, J. J. Sanz-Cillero, and H. Q. Zheng, O(p6) extension of the large-NC partial wave dispersion relations, Phys. Lett. B 661, 342 (2008)
[22]
Z. H. Guo and J. J. Sanz-Cillero, ππ-scattering lengths at O(p6) revisited, Phys. Rev. D 79, 096006 (2009)
[23]
J. Bijnens, G. Colangelo, and J. Gasser, Kl4 decays beyond one loop, Nucl. Phys. B 427(3), 427 (1994)
[24]
G. Amorós, J. Bijnens, and P. Talavera, K4 form-factors and ππ scattering, Nucl. Phys. B 585, 293 (2000) [Erratum: Nucl. Phys. B 598, 665 (2001)]
[25]
G. Colangelo, J. Gasser, and H. Leutwyler, ππ scattering, Nucl. Phys. B 603(1–2), 125 (2001)
[26]
M. R. Schindler and D. R. Phillips, Bayesian methods for parameter estimation in effective field theories, Ann. Phys. 324, 682 (2009) [Erratum: Ann. Phys. 324, 2051 (2009)]
[27]
R. J. Furnstahl, D. R. Phillips, and S. Wesolowski, A recipe for EFT uncertainty quantification in nuclear physics, J. Phys. G 42(3), 034028 (2015)
[28]
S. Wesolowski, N. Klco, R. J. Furnstahl, D. R. Phillips, and A. Thapaliya, Bayesian parameter estimation for effective field theories, J. Phys. G 43(7), 074001 (2016)
[29]
J. A. Melendez, S. Wesolowski, and R. J. Furnstahl, Bayesian truncation errors in chiral effective field theory: Nucleon‒nucleon observables, Phys. Rev. C 96(2), 024003 (2017)
[30]
I. Svensson, A. Ekström, and C. Forssén, Bayesian parameter estimation in chiral effective field theory using the Hamiltonian Monte Carlo method, Phys. Rev. C 105(1), 014004 (2022)
[31]
A. Ekström, C. Forssén, C. Dimitrakakis, D. Dubhashi, H. T. Johansson, A. S. Muhammad, H. Salomonsson, and A. Schliep, Bayesian optimization in ab initio nuclear physics, J. Phys. G 46(9), 095101 (2019)
[32]
S. Wesolowski, R. J. Furnstahl, J. A. Melendez, and D. R. Phillips, Exploring Bayesian parameter estimation for chiral effective field theory using nucleon–nucleon phase shifts, J. Phys. G 46(4), 045102 (2019)
[33]
I. K. Alnamlah, E. A. C. Pérez, and D. R. Phillips, Effective field theory approach to rotational bands in odd-mass nuclei, Phys. Rev. C 104(6), 064311 (2021)
[34]
C. J. Yang, A. Ekström, C. Forssén, and G. Hagen, Power counting in chiral effective field theory and nuclear binding, Phys. Rev. C 103(5), 054304 (2021)
[35]
A. E. Lovell, F. M. Nunes, M. Catacora-Rios, and G. B. King, Recent advances in the quantification of uncertainties in reaction theory, J. Phys. G 48(1), 014001 (2020)
[36]
D. R. Phillips, R. J. Furnstahl, U. Heinz, T. Maiti, W. Nazarewicz, F. M. Nunes, M. Plumlee, M. T. Pratola, S. Pratt, F. G. Viens, and S. M. Wild, Get on the BAND Wagon: A Bayesian framework for quantifying model uncertainties in nuclear dynamics, J. Phys. G 48(7), 072001 (2021)
[37]
P. Bedaque, A. Boehnlein, M. Cromaz, M. Diefenthaler, L. Elouadrhiri, T. Horn, M. Kuchera, D. Lawrence, D. Lee, S. Lidia, R. McKeown, W. Melnitchouk, W. Nazarewicz, K. Orginos, Y. Roblin, M. Scott Smith, M. Schram, and X. N. Wang, A. I. for nuclear physics, Eur. Phys. J. A 57(3), 100 (2021)
[38]
S. Wesolowski, I. Svensson, A. Ekström, C. Forssén, R. J. Furnstahl, J. A. Melendez, and D. R. Phillips, Rigorous constraints on three-nucleon forces in chiral effective field theory from fast and accurate calculations of few-body observables, Phys. Rev. C 104(6), 064001 (2021)
[39]
M. A. Connell, I. Billig, and D. R. Phillips, Does Bayesian model averaging improve polynomial extrapolations? Two toy problems as tests, J. Phys. G 48(10), 104001 (2021)
[40]
Y. H. Lin, H. W. Hammer, and U. G. Meißner, Dispersion-theoretical analysis of the electromagnetic form factors of the nucleon: Past, present and future, Eur. Phys. J. A 57(8), 255 (2021)
[41]
T. Djärv, A. Ekström, C. Forssén, and H. T. Johansson, Bayesian predictions for A = 6 nuclei using eigenvector continuation emulators, Phys. Rev. C 105(1), 014005 (2022)
[42]
B. Acharya and S. Bacca, Gaussian process error modeling for chiral effective-field-theory calculations of np at low energies, Phys. Lett. B 827, 137011 (2022)
[43]
D. Odell, C. R. Brune, D. R. Phillips, R. J. deBoer, and S. N. Paneru, Performing Bayesian analyses with AZURE2 using BRICK: An application to the 7Be system, Front. Phys. (Lausanne) 10, 888476 (2022)
[44]
A. E. Lovell, A. T. Mohan, T. M. Sprouse, and M. R. Mumpower, Nuclear masses learned from a probabilistic neural network, Phys. Rev. C 106(1), 014305 (2022)
[45]
G. Hagen, S. J. Novario, Z. H. Sun, T. Papenbrock, G. R. Jansen, J. G. Lietz, T. Duguet, and A. Tichai, Angular-momentum projection in coupled-cluster theory: Structure of 34Mg, Phys. Rev. C 105(6), 064311 (2022)
[46]
T. Papenbrock, Effective field theory of pairing rotations, Phys. Rev. C 105(4), 044322 (2022)
[47]
S. S. Li Muli, B. Acharya, O. J. Hernandez, and S. Bacca, Bayesian analysis of nuclear polarizability corrections to the Lamb shift of muonic H-atoms and He-ions, J. Phys. G 49(10), 105101 (2022)
[48]
Q. Y. Zhai, M. Z. Liu, J. X. Lu, and L. S. Geng, Zcs(3985) in next-to-leading-order chiral effective field theory: The first truncation uncertainty analysis, Phys. Rev. D 106(3), 034026 (2022)
[49]
K. Fraboulet and J. P. Ebran, Addressing energy density functionals in the language of path-integrals I: Comparative study of diagrammatic techniques applied to the (0+0)D O(N)-symmetric φ4-theory, Eur. Phys. J. A 59(4), 91 (2023)
[50]
W. Jiang and C. Forssén, Bayesian probability updates using sampling/importance resampling: Applications in nuclear theory, Front. Phys. (Lausanne) 10, 1058809 (2022)
[51]
A. Ekström, C. Forssén, G. Hagen, G. R. Jansen, W. Jiang, and T. Papenbrock, What is ab initio in nuclear theory, Front. Phys. (Lausanne) 11, 1129094 (2023)
[52]
W. I. Jay and E. T. Neil, Bayesian model averaging for analysis of lattice field theory results, Phys. Rev. D 103(11), 114502 (2021)
[53]
M. Catacora-Rios, G. B. King, A. E. Lovell, and F. M. Nunes, Exploring experimental conditions to reduce uncertainties in the optical potential, Phys. Rev. C 100(6), 064615 (2019)
[54]
A. Ekström and G. Hagen, Global sensitivity analysis of bulk properties of an atomic nucleus, Phys. Rev. Lett. 123(25), 252501 (2019)
[55]
X. Zhang, K. M. Nollett, and D. R. Phillips, S-factor and scattering-parameter extractions from 3He + 4He → 7Be + γ, J. Phys. G 47, 054002 (2020)
[56]
B. K. Luna and T. Papenbrock, Low-energy bound states, resonances, and scattering of light ions, Phys. Rev. C 100(5), 054307 (2019)
[57]
E. Epelbaum, J. Golak, K. Hebeler, H. Kamada, H. Krebs, U. G. Meißner, A. Nogga, P. Reinert, R. Skibiński, K. Topolnicki, Y. Volkotrub, and H. Witała, Towards high-order calculations of three-nucleon scattering in chiral effective field theory, Eur. Phys. J. A 56(3), 92 (2020)
[58]
N. Metropolis, A. W. Rosenbluth, M. N. Rosenbluth, A. H. Teller, and E. Teller, Equation of state calculations by fast computing machines, J. Chem. Phys. 21(6), 1087 (1953)
[59]
W. K. Hastings, Monte Carlo sampling methods using Markov chains and their applications, Biometrika 57(1), 97 (1970)
[60]
S. Duane, A. Kennedy, B. J. Pendleton, and D. Roweth, Hybrid Monte Carlo, Phys. Lett. B 195(2), 216 (1987)
[61]
M. D. Hoffman and A. Gelman, The No-U-turn sampler: Adaptively setting path lengths in Hamiltonian Monte Carlo, J. Mach. Learn. Res. 15, 1593 (2014)
[62]
J. Salvatier, T. V. Wiecki, and C. Fonnesbeck, Probabilistic programming in python using PyMC3, PeerJ Comput. Sci. 2, e55 (2016)
[63]
P. Gregory, Bayesian Logical Data Analysis for the Physical Sciences, Cambridge: Cambridge University Press, 2005
[64]
J. Bijnens, G. Colangelo, and G. Ecker, Renormalization of chiral perturbation theory to order p6, Ann. Phys. 280(1), 100 (2000)
[65]
A. Gelman, J. B. Carlin, H. S. Stern, D. B. Dunson, A. Vehtari, and D. B. Rubin, Bayesian Data Analysis, 3rd Ed., Boca Raton: CRC Press, 2013
[66]
A. Vehtari, A. Gelman, and J. Gabry, Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC, Stat. Comput. 27(5), 1413 (2016)
[67]
G. Amorós, J. Bijnens, and P. Talavera, Two-point functions at two loops in three flavor chiral perturbation theory, Nucl. Phys. B 568(1–2), 319 (2000)
[68]
J. Bijnens, Chiral perturbation theory, URL: home.thep.lu.se/~bijnens/chpt/ (2019)
[69]
J. Bijnens and P. Dhonte, Scalar form-factors in SU(3) chiral perturbation theory, J. High Energy Phys. 10, 061 (2003)
[70]
J. Gasser, C. Haefeli, M. A. Ivanov, and M. Schmid, Integrating out strange quarks in ChPT, Phys. Lett. B 652(1), 21 (2007)
[71]
S. Z. Jiang, Z. L. Wei, Q. S. Chen, and Q. Wang, Computation of the O(p6) order low-energy constants: An update, Phys. Rev. D 92(2), 025014 (2015)
[72]
S. Z. Jiang, Y. Zhang, C. Li, and Q. Wang, Computation of the p6 order chiral Lagrangian coefficients, Phys. Rev. D 81(1), 014001 (2010)
[73]
K. Kampf and B. Moussallam, Tests of the naturalness of the coupling constants in ChPT at order p6, Eur. Phys. J. C 47(3), 723 (2006)
[74]
M. Jamin, J. A. Oller, and A. Pich, Order p6 chiral couplings from the scalar form-factor, J. High Energy Phys. 02, 047 (2004)
[75]
J. Bijnens and P. Talavera, Kℓ3 decays in chiral perturbation theory, Nucl. Phys. B 669(1–2), 341 (2003)
[76]
V. Cirigliano, G. Ecker, M. Eidemüller, R. Kaiser, A. Pich, and J. Portolés, The ⟨SPP⟩ Green function and SU(3) breaking in Kℓ3 decays, J. High Energy Phys. 04, 006 (2005)
[77]
R. Unterdorfer and H. Pichl, On the radiative pion decay, Eur. Phys. J. C 55(2), 273 (2008)
[78]
V. Cirigliano, G. Ecker, M. Eidemüller, R. Kaiser, A. Pich, and J. Portolés, Towards a consistent estimate of the chiral low-energy constants, Nucl. Phys. B 753(1–2), 139 (2006)
[79]
V. Bernard and E. Passemar, Matching chiral perturbation theory and the dispersive representation of the scalar form-factor, Phys. Lett. B 661(2–3), 95 (2008)
[80]
B. Moussallam, Flavor stability of the chiral vacuum and scalar meson dynamics, J. High Energy Phys. 08, 005 (2000)

Declarations

The authors declare that they have no competing interests and there are no conflicts.

Acknowledgements

H. X. Pan thanks Qin-He Yang for providing the original program. This project was supported by the Guangxi Science Foundation under Grant No. 2022GXNSFAA035489.

RIGHTS & PERMISSIONS

2024 The Authors