1 Introduction
Effective field theory (EFT) is an important framework for describing interactions between particles at low energies. Chiral perturbation theory (ChPT) is one such EFT. It was first formulated for the low-energy strong interactions among the light pseudoscalar mesons and was later extended to baryons and other mesons. ChPT is based on the
flavor symmetry in the chiral limit, in which the three lightest quarks are considered massless. The only constraints on the chiral Lagrangian are symmetries, such as charge conjugation, parity, and chiral symmetry. However, infinitely many independent terms satisfy these symmetries. The Weinberg power-counting scheme organizes these terms in powers of the mesonic momentum ($p$) [1]. The leading-order (LO, $\mathcal{O}(p^2)$) terms give the largest contributions and are considered first. If higher precision is required, the terms at next-to-leading order (NLO, $\mathcal{O}(p^4)$), next-to-next-to-leading order (NNLO, $\mathcal{O}(p^6)$), etc., are included successively. Each term carries an unknown parameter, called a low-energy constant (LEC), which encodes information about the effective strong interactions. For three-flavor ChPT, there are 2, 10+2, 90+4, and 1233+21 LECs at LO, NLO, NNLO and next-to-next-to-next-to-leading order ($\mathcal{O}(p^8)$) [2–5], respectively. If all these LECs were known, every theoretical calculation could be evaluated numerically. However, the number of LECs is very large, especially at high orders. Moreover, ChPT itself cannot fix these LECs. They are usually determined by other approaches, such as global fits [6–8], lattice QCD [9–11], the chiral quark model [12–15], resonance chiral theory [13–17], sum rules [18], holographic QCD [19], and dispersion relations [20–22]. Each method has its own advantages and range of applicability. So far, no approach can determine the exact values of these LECs. This paper focuses only on the global fit method.
There has been a lot of research based on global fits. Some LECs up to NNLO have been fitted. Ref. [
23] fits
form factors and
scattering lengths to obtain the values of
,
and
. Six years later,
is determined by fitting the quark mass ratio
, the decay constant ratio
and the
form factor [
24]. Another eleven years later, a new global fit appears, adding
scattering lengths (
and
),
scattering lengths (
and
) and scalar form factor threshold parameters (
and
).
and some
are obtained [
6]. Ref. [
7] adds some two-flavor LECs and updates the values of LECs fitted in Ref. [
6]. The last two references not only fit the LECs
at the NLO, but also estimate a part of the NNLO LECs
. However, both of them ignore the higher-order truncated contributions. Ref. [
8] proposes a geometric sequence model to introduce the higher-order truncated contributions. Its NLO fitting values of
are very close to the NNLO fitting values in Ref. [
7]. This is because a physical quantity contains not only the sum of LO and NLO theoretical values, but also the sum of higher-order contributions, which sometimes cannot be ignored when compared to the NLO contributions. If the NLO fit includes the higher-order contributions,
will be closer to the true values. Hence, fitting
at NLO and at NNLO yields similar results. This shows that the higher-order contributions cannot simply be ignored in a ChPT fit. All of the above references adopt classical statistical methods to fit the LECs. In principle, the precision of the fitted results depends on the amount and precision of the experimental data: more data and more precise data lead to more precise LECs. However, classical statistics has several problems here, and some improvements are needed.
i) The geometric sequence model in Ref. [
8] is too simple. The contributions at successive orders need not, in fact, form a geometric sequence. In addition, to estimate the NNLO contribution, the geometric sequence itself requires the LO and NLO contributions. However, the LO contribution is sometimes zero, in which case the NNLO contribution cannot be estimated. In some special cases, the NNLO contribution may be larger than the NLO one, such as
scattering lengths
and
[
6,
7]. Hence, Ref. [
8] adopts a special approach to deal with this problem. In most cases, two two-flavor NLO LECs
converge poorly. Fitting them takes a long time, about one day on 20 cores of an Intel Xeon Gold 6230 CPU. In addition, how to determine the sign of the NNLO contribution is also a problem. As a result, the model is not consistent across all physical quantities and is not a universal approach.
ii) The number of
is much larger than the number of input experimental data, so the NNLO fit suffers from overfitting. Refs. [
6,
7] adopt a random-walk algorithm, but the result is boundary-dependent. A Monte Carlo method is used to fit the LECs in Ref. [
8], but its efficiency is low. Moreover, the complicated errors of
are hard to estimate. They usually do not follow a normal distribution.
iii) Although the geometric sequence model gives a reasonable result in Ref. [
8], this model is hard to extend to other EFTs, for the reasons discussed above. Furthermore, it is also hard to evaluate different models in order to select the best one, because
(degrees of freedom) is too small and an overfitting problem exists. It is hard to select the best model from some overfitting models by
A more universal method requires a credible quantified index. The best model can be selected by this index.
iv) Refs. [
6–
8] treat the two-flavor NLO LECs
as the independent input experimental data, but some
are possibly dependent on other experimental quantities. In fact, the
scattering lengths
and
are dependent on
,
and
[
25]. Hence, their covariance matrix needs to be considered.
v) Most importantly, before a global fit one already knows something about ChPT and about the experimental data being fitted, but this information is not explicitly incorporated in the fit. For example, in the NLO fit, although the truncation errors are not known, other references have given approximate values of the NNLO LECs. With these NNLO LECs, one can roughly obtain the signs of the truncation errors even in the NLO fit. If these signs are introduced into the NLO fit, the results may become more precise. Furthermore, ChPT assumes that the LECs at a given chiral order have nearly the same order of magnitude. If this information is taken into account in the global fit, the range of the unknown LECs, and even of the NNLO contribution, can be estimated. Simply put, more information may lead to a more precise result.
As an alternative to classical statistics, Bayesian statistics, which has been successful in artificial intelligence, is well suited to the global fit of EFTs. Bayesian statistics makes good use of known information to give a more reasonable result. Even when the amount of data is small, Bayesian statistics can perform better than classical statistics. Ref. [
26] has applied Bayesian statistics to EFTs. It proposes two toy models and compares the results obtained by Bayesian and classical statistics. The advantages of Bayesian statistics in EFTs have been demonstrated. Later, Ref. [
27] introduces Bayesian statistics into nuclear physics. A year later, a specific framework for using Bayesian statistics in EFTs appears [
28]. Subsequently, Refs. [
29–
57] use Bayesian statistics to calculate the magnitudes of truncation errors in the different EFTs. This paper will improve the approach in Ref. [
8]. The new approach contains the framework of Bayesian statistics and the application of Markov Chain Monte Carlo (MCMC). Some MCMC algorithms, such as the Metropolis-Hastings algorithm [
58,
59], Hamiltonian Monte Carlo algorithm [
60] and No-U-turn Sampler algorithm [
61], will be used to fit the LECs with the help of the PyMC3 package [
62]. The major improvements of the new approach and the motivations of this paper are as follows.
i) The geometric sequence is not required in the fit. It is replaced by a Bayesian method. In general, the new method requires no assumption about how ChPT converges.
ii) The approach is more general. Some examples are carried out to check whether the approach works well. The parameters in the examples are completely random. Hence, this approach can be used not only to fit the LECs in ChPT, but also in other EFTs and in perturbation theory.
iii) The computing time is greatly reduced with the help of MCMC. A better result is obtained within about ten minutes.
iv) The covariance matrix given in Ref. [25] is included in the fit, so the correlations it describes are retained.
v) The Bayesian method is applied throughout the fit. Additional information is included wherever reasonable assumptions allow, such as assumptions about the signs and the orders of magnitude of the truncation errors.
vi) Although the number of input values is not large enough, some clearer distribution of and some more precise values of will be obtained. In addition, the boundary dependence of can be seen more clearly.
This paper is organized as follows. Section 2 gives a brief introduction to Bayesian statistics and MCMC. In Section 3, two Bayesian models and some evaluation criteria are introduced; one model contains truncation errors and the other does not. Some details of the calculation are also discussed. An example is studied in Section 4 in order to evaluate these models. The input physical observables from ChPT are given in Section 5. In Section 6, some NLO and NNLO LECs are fitted with these models, and a new set of LECs is obtained. Section 7 gives a summary and some discussion.
2 Bayesian statistics and MCMC
This section briefly introduces how to fit data with Bayesian statistics and MCMC. More details can be found in Refs. [26, 27]. Some of the content is basic and can also be found in textbooks on probability theory and Bayesian analysis. For convenience, some parameters are given their meanings in ChPT, but the approach has a much wider scope of application: they can be any parameters to be fitted in a problem.
Considering a general case, some parameters need to be fitted from a set of data. denotes a set of known input data. In physics, it is usually experimental data or physical constant quantities. All are not assumed independent. is a parameter vector. In physics, its components are usually some parameters needed to be fitted. In this paper, means LECs. The rest of this section will introduce an approach to fit by Bayesian statistics and MCMC. This approach is faster than only the Bayesian statistics without MCMC.
The core of Bayesian statistics is Bayes’ formula
The terms in Eq. (1) have the following meanings.
i) is the prior probability distribution function (PDF). It reflects the knowledge of before is observed. If one does not know anything about , is usually set to a uniform distribution. Usually, experiment or/and theory can give an approximated value. At least the order of magnitude is known before fitting in most cases. Due to the introduction of , one would argue that Bayesian statistics are subjective. However, is nothing more than some assumptions in the construction of a model. This is similar to the fit usually needing an initial value of a reasonable range.
ii) is the likelihood function. It is related to and reflects the confidence of under the given . It can be expressed as
where is the theoretical expected value of the data, which is dependent on . is the expected value of the data, i.e., the experimental central value. is the covariance matrix of . The errors and the correlation information of are contained in .
iii) is the posterior PDF. It is the result of Bayesian analysis. It also reflects the full knowledge of from a fitting model. is the PDFs of , but not only some expected values. can be viewed as an update of after have been observed. In addition, in one fit can be regarded as in another fit after appending some new .
iv) is called Bayesian evidence. It is known as the marginal likelihood PDF. It means the average probability of in the fitting model. In addition, it can also be simply treated as a normalization coefficient. Because a fit is concerned with the relative PDFs of rather than their absolute PDFs, this normalization coefficient does not play an important role in the fit. Ignoring , Bayes’ formula can be expressed in a proportional form
Hence, is also called the core of the posterior PDF.
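The proportional form of Bayes' formula can be illustrated with a short numerical sketch. It assumes a multivariate Gaussian likelihood built from a covariance matrix and a standard-normal prior on rescaled parameters; the function names, the toy linear model and all numbers are hypothetical and serve only as an illustration, not as the fit performed in this paper.

```python
import numpy as np

def log_likelihood(theta, data, cov, model):
    """Gaussian log-likelihood up to an additive constant (assumed form)."""
    resid = data - model(theta)
    return -0.5 * resid @ np.linalg.solve(cov, resid)

def log_prior(theta):
    """Standard-normal prior on each rescaled parameter, up to a constant."""
    return -0.5 * np.sum(np.asarray(theta) ** 2)

def log_posterior(theta, data, cov, model):
    """Unnormalized log-posterior: the Bayesian evidence is never needed."""
    return log_prior(theta) + log_likelihood(theta, data, cov, model)

def toy_model(theta):
    """Hypothetical theory prediction for two correlated observables."""
    return np.array([theta[0] + theta[1], theta[0] - theta[1]])

data = np.array([1.0, 0.2])                   # made-up central values
cov = np.array([[0.04, 0.01], [0.01, 0.09]])  # made-up covariance matrix

lp_good = log_posterior(np.array([0.6, 0.4]), data, cov, toy_model)
lp_bad = log_posterior(np.array([-1.0, 2.0]), data, cov, toy_model)
print(lp_good, lp_bad)
```

Only differences of such log-posterior values enter an MCMC sampler, which is why the normalization can be dropped.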
There are some different methods to determine
without
, such as MCMC. We have tried three algorithms to generate the Markov chain, namely the Metropolis-Hastings algorithm [
58,
59,
63], Hamiltonian Monte Carlo algorithm [
60] and No-U-turn Sampler algorithm [
61]. The last two algorithms are somewhat more complicated, but they are computationally more efficient. The details can be found in the above references. We have checked that all these algorithms yield almost the same distribution. The No-U-turn Sampler algorithm is the fastest, taking about half the time of the Metropolis-Hastings algorithm.
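As an illustration of how a Markov chain is generated from an unnormalized posterior, the following is a minimal random-walk Metropolis sketch (the symmetric-proposal special case of the Metropolis-Hastings algorithm). It is written with NumPy only and is not the PyMC3 implementation used in this paper; the target distribution, step size and chain length are hypothetical choices.

```python
import numpy as np

def metropolis(log_post, start, n_steps, step_size, rng):
    """Random-walk Metropolis sampler with a symmetric Gaussian proposal."""
    current = np.asarray(start, dtype=float)
    current_lp = log_post(current)
    chain = np.empty((n_steps, current.size))
    accepted = 0
    for i in range(n_steps):
        proposal = current + step_size * rng.standard_normal(current.shape)
        proposal_lp = log_post(proposal)
        # Accept with probability min(1, posterior ratio); no evidence needed.
        if np.log(rng.random()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
            accepted += 1
        chain[i] = current
    return chain, accepted / n_steps

def toy_log_post(theta):
    """Hypothetical target: a normal distribution with mean 1 and width 0.5."""
    return -0.5 * np.sum((theta - 1.0) ** 2 / 0.25)

rng = np.random.default_rng(0)
chain, acc_rate = metropolis(toy_log_post, [0.0], 20000, 1.0, rng)
samples = chain[2000:, 0]  # discard the first part of the chain as burn-in
print(samples.mean(), samples.std(), acc_rate)
```

Hamiltonian Monte Carlo and the No-U-turn Sampler replace the random-walk proposal with gradient-guided trajectories, which is the usual reason for their higher efficiency.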
3 Models and details
3.1 Preparation
The previous section gives a general approach to fitting the parameter in the known analytical relationship by Bayesian statistics and MCMC. However, in ChPT, this approach cannot be adopted directly, because the exact theoretical relationship is hard to obtain. It is usually calculated order by order,
where , and are the theoretical chiral expansion of at the LO, NLO and NNLO, respectively. , and are the LO, NLO and NNLO LECs, respectively, such as and . At present, the higher-order relationship (i.e., truncation error) is lacking, so this paper only considers the expansion up to the NNLO. As discussed in the introduction, may make a great impact on the results. Hence, it should be considered in the fit.
The introduction mentions that many references have discussed how to estimate the truncation errors, such as Ref. [
29]. However, that approach cannot be used directly in the present case. There exist some serious problems. Ref. [
29] knows
,
and
without errors to estimate the distribution of
. However, in the present case,
with systematical errors and the analytical expressions of
,
and
are known, but
,
,
,
,
and their distributions are needed to be fitted by
and
. Ref. [
29] computes the Bayesian evidence by a multidimensional integral (Eq. (8) in Ref. [
29]). In a few special cases, the Bayesian evidence can be integrated analytically, but it usually has to be integrated numerically. A multidimensional numerical integral is usually difficult and time-consuming. The MCMC approach, however, avoids computing the Bayesian evidence, so the computation is faster. In addition, Ref. [
29] requires Eq. (4) to be convergent order by order, but Refs. [
6,
8] have already indicated
for some physical quantities. Hence, a new approach is needed.
Generally, in an actual fit, some of
,
and
may have dimensions, and their values may be very small or very large. For example, the NNLO LECs
is about
. For convenience, they are first made dimensionless. For example, most of the literature provides
(defined in Ref. [
64]) without dimension, but not
. Moreover, very small or very large values may lead to numerical errors. Hence, all LECs are divided by their order of magnitude so that they become roughly 1. This can be done in an actual fit. For example, both experiment and theory can estimate that
is about
. The order of magnitude of LECs is regarded as a prior of LECs in this paper. For convenience, all the quantities in this section are assumed to be dimensionless, and all
,
and
are assumed to be roughly 1. In fact, the value 1 is not a strict requirement: as long as the values are not very large, the fit still works well. For convenience,
is assumed to be known, and it does not need to be fitted in this section. If one wants to fit
, there is no difference from fitting
and
.
In the actual ChPT fit in this paper, the number of is less than the total number of , and . The fit is therefore overfitted. Hence, some constraints are introduced to reduce the parameter space. To account for the convergence of ChPT, some information about the higher orders needs to be introduced into Eq. (2). If no further information about the higher orders is available, the parameters in Eq. (2) are modified in this paper to
where means an identity matrix with a suitable dimension. These changes assume that satisfies a normal distribution () for any (ignore the negative part), and has a similar meaning. The values come from the convergence hypothesis of ChPT. According to ChPT, is about 0.1−0.3, and is also about 0.1−0.3. Both 0.2 and 0.05 are near the central values. The standard deviations are chosen the same as the expected values, in order to give a large enough possibility at a wide range, because the estimation may not be very exact. In order to make the model universal, we choose the relative difference, but not the absolute value. This is because EFT/ChPT can provide us an approximate ratio between two orders, but not their absolute values. Of course, if one knows an approximate absolute value of a special quantity at a given order, Eq. (5) can be replaced with this absolute value. Some similar constraints about the truncation errors will be discussed in Section 3.3. Of course, these constraints can be correspondingly modified to different values, if one has a better understanding about some physical quantities.
For convenience, only one input datum or a component form is discussed. If one wants to consider more than one datum , the discussion also works.
3.2 Model A
First, the truncation error is not considered in the fit, which is called Model A. Considering a physical quantity with an experimental value , its theoretical value is . The theoretical values to the NLO and NNLO without errors are
respectively. The term with a superscript without a couple of parentheses means the theoretical value only at this order. For example, means the NLO theoretical value of . Eqs. (8) and (9) are applied in the NLO and the NNLO fit, respectively.
This model assumes is the standard normal distribution, because the magnitudes of all and are already normalized to roughly 1. In other words, one only introduces the information about the rough magnitudes of and , but no more information is considered at present. The advantage of this assumption is that more information of can be derived from the experimental data themselves, in order to reduce the subjectivity. In addition, Eq. (2) is adopted in the fit, but not Eqs. (5)−(7). We have checked that both and need not be very close to 1. The results change slightly, as long as their values are not very large. This is because the standard normal distribution has a not very small possibility in a wide range. The same conclusion is true for the below model.
In order to improve Model A, more information is appended. It is called Model B.
3.3 Model B
Generally, the truncation error can be simply considered as a normal distribution, and the parameters of the normal distribution are based on the known information from the knowledge of the theory. However, in some special cases, the sign of the truncation error is known, or the probability of the sign is known. This information from the sign is considered separately. Eqs. (8) and (9) are improved to
respectively. The last terms on the right-hand side of Eqs. (10) and (11) represent the higher-order (HO) truncation error . For and , it means the contribution higher than the NLO and NNLO, respectively. The parameter relates to the sign of the truncation error. It is assumed to be a Bernoulli random variable with parameters 1,
is the probability for . If one does not know the information of the sign, . parameter can give a correct sign of the truncation error. If the estimating truncation error gives a narrow range with a wrong sign, the theoretical values will be far from the true value and the fit will be bad. The parameter is introduced to solve this problem. can change the wrong sign into a correct one. Oppositely, if the estimating truncation error gives a correct sign, or the range is too wide to cover the true truncation error, will have no impact on this case. The parameter reflects the relative magnitude of the truncation error, relative to . One needs not know the absolute magnitude of the truncation error. However, if the EFT is satisfied, the relative magnitude at each order can be estimated. For example, the ratio between two adjacent orders is about , where is the momentum of the low-energy particles and is the scale of the EFT. In ChPT, is about 0.1−0.3, and is also about 0.1−0.3, and so on. Therefore, it can be considered that is about 5% (2%) for (). Hence, the parameter is assumed to be a Gaussian random variable
where is the expected magnitude of , and is its standard deviation. If one does not know more information about the truncation error, a possible and reasonable choice is (0.02) for ().
The parameters , , , and can sometimes be estimated from information in the data. Hence, they can be set to other values, and their prior PDFs can even be given another form, as long as the information is accurate enough.
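The structure of the truncation-error prior described above can be sketched by sampling from it directly. The sketch assumes that the truncation term is the product of a sign taking the values +1 or -1, a Gaussian relative magnitude, and a reference value; this product form, the function name and the standard deviation used below are illustrative assumptions rather than the exact expressions of Eqs. (10) and (11).

```python
import numpy as np

def sample_truncation_term(reference, p_plus, mu, sigma, size, rng):
    """Draw prior samples of a higher-order truncation term (assumed form).

    reference : value the truncation error is measured relative to
    p_plus    : prior probability that the sign is +1
    mu, sigma : mean and standard deviation of the Gaussian relative magnitude
    """
    sign = np.where(rng.random(size) < p_plus, 1.0, -1.0)  # sign is +1 or -1
    magnitude = rng.normal(mu, sigma, size)                # relative size
    return sign * magnitude * reference

rng = np.random.default_rng(1)
# Unknown sign: both signs equally likely, relative magnitude around 5%.
unknown_sign = sample_truncation_term(1.0, 0.5, 0.05, 0.05, 100_000, rng)
# Sign known to be positive: the sign probability is set to 1.
positive_sign = sample_truncation_term(1.0, 1.0, 0.05, 0.05, 100_000, rng)
print(unknown_sign.mean(), positive_sign.mean())
```

With an unknown sign the prior is symmetric around zero, while fixing the sign shifts the prior mean to the expected relative magnitude, which is how sign information narrows the fit.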
There are two extreme cases in Model B, which will be adopted only for model evaluation in Section 4. These two cases are called Model B1 and Model B2, respectively.
Model . In this case, one knows nothing about , such as the sign and the rough magnitude. Only the approximate order of magnitude of is known from ChPT, such as about 5% of LO at the NLO fit. As in the discussion above, for all quantities, we set (0.02) and for the NLO (NNLO) fit. At present, we do not consider more information about and . Hence, and are set to the standard normal distribution . The convergence constraints are the same as Eqs. (5)−(7).
Model . In this model, the magnitudes of each all have a certain understanding. Hence, one can set different prior PDFs to different , separately. The parameters , and from different quantities can be set to different values. For example, if one knows the sign is positive, is set to 1. The priors for and are set as
where the superscripts NLO (NNLO) represent the NLO (NNLO) fit, the subscript “tr” means true value. Because we have only adopted this model for the example in Section 4 to evaluate the models, all the true values are known. Similarly, the true ranges of and are generated by some given parameters. Their true ranges are also known. Therefore, their prior ranges are given the same as their true ranges. In addition, the constraints can be set to different values for the different physical quantities.
Models B1 and B2 adopt two extreme priors and are only used to fit the example in Section 4. Because this example is artificial and the true values are known, we can use none or all of the prior information in the fit. For actual experimental data, the known prior information lies between Model B1 and Model B2. For example, one may have some information about a part of , and the signs and the approximate magnitudes of can be given as Model B2. However, for another part of , one may have no information about their , because of the lack of the current theory and/or experiment. For this part of , one can only give the prior PDFs as those in Model B1. Besides these two cases, one may more possibly know some information of . For example, is more likely to be positive, or its value is possible around or . The prior PDF can be set according to this information. The fitting methods of Models B, B1 and B2 are the same; only the prior PDFs differ. It can be expected that the general Model B is better than Model B1, but worse than Model B2. Therefore, in Section 6, we uniformly use Model B to denote the new model proposed in this paper.
3.4 Calculation details
This section discusses some special cases in the fit.
Sometimes, one needs to fit the differentiation of numerically, such as and in Section 5. The numerical deviation needs to calculate the difference between the two quantities and , but each quantity has an error. If one adopts Eq. (10) or (11) to determine and , the estimating truncation error of will contain the above two errors and become large. Therefore, the truncation error of is estimated from , and , but not the difference of Eq. (10) or (11). In other words, is treated as one quantity, but not a difference. However, for physical quantities with derivative values such as and , we place the HO terms in the denominator, which absorbs the effects of higher-order errors well.
Sometimes, in the NNLO fit, the amount of is much larger than the number of , but the total number of and is less than the number of input . The NNLO fit in ChPT is in this situation. All , and are fitted as follows.
i) All first linearly combine into some linearly independent . The number of is equal to the number of , and one only correlates to one . This is also reasonable in ChPT, because the NNLO fit only contains the linear combinations of . One can combine them to the linearly independent ones.
ii) and are first fitted at the NLO by Model B. The results denote to and . This is called the NLO fit.
iii) In the NNLO fit, , and are fitted simultaneously. If no more information is known, the NNLO priors of and are set to some suitable normal distributions , where
The definition of chooses the maximum of the two parameters and . This is because either of them may be very small, this definition enlarges the prior ranges of , in order to improve performance. The prior PDFs of is set to the standard normal distribution, if one knows nothing about . Otherwise, some more reasonable prior PDFs can be set according to the known information.
The prior PDFs of , and not only make good use of the information from the NLO fit, but also allow some free spaces for these parameters. Because the NLO fitting and can give a reasonable order of magnitude in most cases, the NNLO fit also selects the NLO posterior PDFs to calculate the NNLO prior PDFs. In addition, the new parameter is also introduced in the NNLO fit. Hence, the NNLO fit is not a repeated fit to the data, even if some of the NLO posterior information is used. We have also tried to do the NNLO fit without the posterior PDFs from the NLO fit for the example in Section 4, and set the prior of , , and uniformly to the standard normal distribution. However, this gives very poor results, which can deviate very far from the true values. Therefore, it is necessary to use some sensible information about LECs as a prior in the NNLO fit.
Of course, if some information about and is known, one can set another sensible prior PDFs.
iv) Finally, all are fitted with the posterior obtained above, with some appropriate uniform distributions. The boundaries of the uniform distribution are dependent on the approximate order of magnitudes of the truncation errors. This is because the NLO research has usually been studied widely, and more information is known. However, the NNLO research is usually lacking, and the values of are not quite sure. Hence, a uniform distribution can give a larger probability near the boundaries, in order to study the boundary-dependent property. After the fit, the posterior PDFs of the truncation error will be changed into better ones.
Model B is very efficient. For the actual ChPT fit discussed below, a personal computer with an Intel i3-10105 CPU needs only about ten minutes on 4 cores. This is far less than the time required by the method in Ref. [8], which takes about one day on a 20-core Intel Xeon Gold 6230 CPU.
All numerical results are reported as highest posterior density (HPD) intervals. The HPD interval is the shortest interval containing a given proportion of the posterior probability. The most common choices are the 95% HPD or the 98% HPD, but we have chosen the 68% HPD, because it is similar to the $1\sigma$ interval in classical statistics [28], such as that from the minimum $\chi^2$ method. All the results in this paper have been compared in both ways. The difference between the 68% HPD and the $1\sigma$ interval is very small: for most results the last significant digit is identical or differs by 1 or 2, only very few differ by 3 or 4, and none differs by more than 4. Hence, we sometimes do not distinguish between them in this paper.
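For a set of posterior samples, the HPD interval can be estimated as the shortest window that contains the required fraction of the sorted samples. The following sketch assumes a unimodal posterior; the function name and the test samples are hypothetical.

```python
import numpy as np

def hpd_interval(samples, prob=0.68):
    """Shortest interval containing a fraction `prob` of the samples."""
    x = np.sort(np.asarray(samples, dtype=float))
    n = x.size
    k = int(np.ceil(prob * n))  # number of samples inside the interval
    if k < 1 or k > n:
        raise ValueError("prob must lie in (0, 1]")
    # Width of every window of k consecutive sorted samples.
    widths = x[k - 1:] - x[: n - k + 1]
    i = int(np.argmin(widths))
    return x[i], x[i + k - 1]

rng = np.random.default_rng(2)
lo, hi = hpd_interval(rng.standard_normal(200_000), 0.68)
print(lo, hi)  # close to the one-sigma interval of a standard normal
```

For a symmetric, unimodal posterior the result coincides with the central interval, which is consistent with the small differences between the 68% HPD and the one-sigma interval noted above.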
3.5 Evaluation criteria
To decide which model is best, a quantitative evaluation criterion is needed, so that different models can be compared by a single index. The Bayesian evidence is one possible criterion, but it is too simple. The widely applicable information criterion (WAIC) and leave-one-out cross-validation (LOOCV) have been introduced in recent years. WAIC measures how well the model fits the data and also penalizes model complexity. LOOCV repeatedly splits the data into a training set and a validation set, leaving out one data point each time, to evaluate the model. The definitions of WAIC and LOOCV involve several related concepts and formulas that would require a long discussion; their definitions and a more detailed explanation can be found in Refs. [
65,
66]. Simply put, if Model B has larger values of WAIC and LOOCV than Model A, Model B is considered better than Model A. Of course, the values for a single model alone are meaningless, because one does not know how large is large enough; they are meaningful only for comparing different models.
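As a rough illustration of how such an index is computed, the sketch below evaluates WAIC on the log-score scale, where larger values are better, from a matrix of pointwise log-likelihoods. It assumes the commonly used definition (log pointwise predictive density minus the summed posterior variance of the log-likelihood); the function name and the input matrix are hypothetical, and the convention may differ in detail from the one in Refs. [65, 66].

```python
import numpy as np

def waic_elpd(log_lik):
    """WAIC on the log-score scale from pointwise log-likelihoods.

    log_lik has shape (n_draws, n_data): entry [s, i] is the log-likelihood
    of datum i evaluated at posterior draw s.
    """
    log_lik = np.asarray(log_lik, dtype=float)
    # Log pointwise predictive density, computed with a stable log-mean-exp.
    m = log_lik.max(axis=0)
    lppd = np.sum(m + np.log(np.mean(np.exp(log_lik - m), axis=0)))
    # Effective number of parameters: penalizes model complexity.
    p_waic = np.sum(np.var(log_lik, axis=0, ddof=1))
    return lppd - p_waic, p_waic

rng = np.random.default_rng(3)
toy_log_lik = -1.0 + 0.1 * rng.standard_normal((1000, 17))  # made-up values
elpd, p_eff = waic_elpd(toy_log_lik)
print(elpd, p_eff)
```

Only the difference of this number between two models fitted to the same data is meaningful, in line with the remark above.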
For the example in Section 4, the true values of parameters are known. In addition to both WAIC and LOOCV, the fitting results can be compared to the true values directly. For example, means the expected value of is fitted by Model A. It is more intuitive to see how well the fit is. Hence, we define the following two quantities as criteria.
is the relative error between the fitting value and the true value . It indicates how well the fitting expected value is. is the ratio of the difference between the true value and the fitting value to the fitting standard error . It indicates how well the fitting error is. The smaller these two values are, the better the model is. These two criteria are only used for the example in Section 4, because one does not know the true values in the actual fit.
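The two criteria can be written as simple functions. The sketch assumes that both are taken as absolute values, which is consistent with the statement that smaller values indicate a better model; the function names and the numbers are hypothetical.

```python
def relative_error(fit_mean, true_value):
    """Relative deviation of the fitted expected value from the true value."""
    return abs(fit_mean - true_value) / abs(true_value)

def pull(fit_mean, fit_std, true_value):
    """Deviation from the true value in units of the fitted standard error."""
    return abs(true_value - fit_mean) / fit_std

# Made-up example: fitted value 1.1 with standard error 0.2, true value 1.0.
rel = relative_error(1.1, 1.0)
dev = pull(1.1, 0.2, 1.0)
print(rel, dev)
```

A small relative error with a large second value would indicate an underestimated uncertainty, so the two numbers are best read together.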
In order to clarify the convergence of
, the percentages at each order are defined as Ref. [
8],
where is defined in Eqs. (8)–(11). means the fitting value obtained by a special model, containing all orders. The notation bar means the expected value. For example, means the NLO expected contribution obtained by Model A, and is the expected value containing all orders obtained by Model A.
For the NNLO fit, the differences among WAIC, LOOCV, and among different models are small. It is more important to evaluate how well all are fitted, because the NNLO fitting are usually precise enough, but usually have large errors. For the example in Section 4, the true values of are known, and , the fitting values can also compare to the true values directly. Usually, the contributions of do not mix with and , such as ChPT. The contributions of can be separated, called . In order to see how well the fitting are, we defined
The subscript “tr” means the true values, the subscript “model” means the model which are adopted, and is the true value of the -th physical quantity. The notation bar means the expected value. is the number of physical quantities. In this paper . For example, means is fitted by Model A, means only the contribution from by Model A. The first fraction on the right side of Eq. (19) is the relative error of , while the second fraction is treated as its weight. The weight represents the contribution of in . The smaller the PM value is, the better the result is. A larger weight needs a more precise to reduce the PM value. PM value is only used in the example in Section 4, because the true values of this example are known, but in the actual case, the true values are not known.
The next section will evaluate the above models by these evaluation criteria.
4 Model evaluation
In order to quantitatively demonstrate the advantage of Model B based on Bayesian statistics, this section gives an example to fit the parameters similar to LECs. The same as the actual fit of the LECs in Section 6, a group of functions is generated randomly, each group containing 17 different quantities . They are shown in Eq. (A1) in Appendix A. The power of is similar to the chiral dimension in ChPT. Taylor expanding these functions about , the analytical results at each order can be obtained. The , and orders correspond to LO, NLO and NNLO in ChPT, respectively. After the expansion, . , and are similar to LO, NLO () and NNLO () LECs in ChPT, respectively. are some known constants, which are introduced to adjust the convergences of these Taylor series. All parameters , , and are generated randomly and independently. For convenience, the parameters in each function are different, although they have the same name. For example, in and are different. The values of and in the example can be found in in Appendix A. In fact, the LO LECs do not appear in the actual ChPT fit in this paper. Hence, we treat them as known constants and do not fit them. This section only discusses the impact from truncation errors, but it does not mention overfitting. Hence, each only contains one , i.e., .
Since all the parameters , , and in this example are known, all the analytical results can be calculated by these parameters directly. In this section, we define all the known values of these parameters as true values. The fitting values of these parameters are called theoretical values, which are fitted by the models in Section 3. In order to distinguish these two types of values, all the true values are marked by a subscript “tr”, such as , and all the theoretical values are marked by the model name, such as .
To imitate a realistic experiment, the fitting data are not the true values themselves but include some experimental errors. The imitative experimental data are sampled from a distribution around the true values in the example [26]. For convenience, these imitative experimental data are simply called experimental data. Their values are listed in the third column of Tab.2 with the subscript “exp”.
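The generation of such imitative data can be sketched as follows. The sketch assumes a normal distribution centered on the true value with a relative standard deviation of 2%; this number is inferred from the ratios of the experimental errors to the true values in Tab.2 and is an assumption. The function name and the seed are arbitrary.

```python
import random


def imitate_experiment(true_values, rel_sigma=0.02, seed=0):
    """Draw one pseudo-measurement per true value.

    Each entry is sampled from N(y_tr, (rel_sigma * |y_tr|)^2) and is
    returned together with its assumed experimental error.
    """
    rng = random.Random(seed)
    data = []
    for y_tr in true_values:
        sigma = rel_sigma * abs(y_tr)
        data.append((rng.gauss(y_tr, sigma), sigma))
    return data


# First three true values of Tab.2.
for value, sigma in imitate_experiment([-35.800, 0.173, -0.276]):
    print(f"{value:.3f} +/- {sigma:.3f}")
```

The sampled central values depend on the seed, while the errors are fixed by the true values; for the first quantity the error is 0.716, as in Tab.2.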
Because the above true values are known, the true values of , and can be also calculated analytically. The parameters of truncation errors in Model B2 are set as Eq. (14) and the description above it. The values of , and are given in in Appendix B. Similarly, the true values of the LECs are also known, so their prior distribution are set to the normal distribution , where
We have deliberately given a deviation from the true value, in order to avoid fit at the true value. The distribution parameters of and at each order are given in in Appendix B.
4.1 The NLO fit of the example
The input parameters in Model B2 are given in Columns 2 to 6 of in Appendix B. After the NLO fit, we have checked that the obtained Markov chain satisfies the assumption of the detailed balance condition, and the results are reliable. All the other fits in this paper have the same conclusion.
Fig.1 illustrates the distributions obtained by Models A and B2. The shapes of the curves are similar to normal distributions, although they differ slightly in detail. We have checked that the boundaries of the 68% HPD are almost the same as the 1σ boundaries of a normal distribution. Hence, we sometimes do not distinguish them. It can be seen that the central values of Model B2 are closer to the true values. However, the errors of Model B2 are larger than those of Model A. This is because Model B2 accounts for the uncertainties of the truncation errors, whereas Model A does not.
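For reference, a 68% HPD interval can be estimated from MCMC samples as the shortest interval that contains 68% of them. The following Python sketch shows this standard construction for a unimodal posterior; it is not necessarily the implementation used for the figures, and the sample list is hypothetical.

```python
import math


def hpd_interval(samples, mass=0.68):
    """Shortest interval containing a fraction `mass` of the samples."""
    if not samples or not 0.0 < mass <= 1.0:
        raise ValueError("need samples and 0 < mass <= 1")
    ordered = sorted(samples)
    n = len(ordered)
    k = min(n, max(1, math.ceil(mass * n)))  # number of points inside
    # Slide a window of k consecutive points and keep the narrowest one.
    best_start = min(range(n - k + 1),
                     key=lambda i: ordered[i + k - 1] - ordered[i])
    return ordered[best_start], ordered[best_start + k - 1]


# Hypothetical samples with a tight cluster around 10.
chain = [0.0, 10.0, 10.1, 10.2, 10.3, 10.4, 10.5, 10.6, 20.0, 30.0]
print(hpd_interval(chain))
```

For a symmetric, Gaussian-like posterior this interval approaches the mean plus or minus one standard deviation, which is why the two are sometimes not distinguished.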
Fig.1 The NLO fitting posterior PDFs of . The red lines and the light red areas are obtained by Model A. The blue lines and the light blue areas are obtained by Model B2. The lines are the distribution curves of . The light-colored areas depict the 68% HPDs. The green lines denote the true values. |
The numerical posterior information of is listed in Tab.1. The WAIC and LOOCV of Model B2 are the largest, but these values of Model A are the smallest. The WAIC and the LOOCV of Model B1 are a bit smaller than those of Model B2, but much larger than those of Model A. This means that Model B2 gives the best results, but Model A is the worst. Model B1 obviously improves the results of Model A, but a bit weaker than Model B2. This conclusion can also be seen from . However, most are still a bit larger than . This is because the errors of are about half . Overall, is closer to the true value.
Tab.1 The NLO and the NNLO fitting results of in the example. Row 2 is the true value of . Rows 3, 6 and 9 are the NLO fitting results of Models A, B1 and B2, respectively. Rows 12, 15 and 18 are the NNLO fitting results of Models A, B1 and B2, respectively. The percentage is defined in Eq. (16), and the ratio is defined in Eq. (17). |
| 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | WAIC | LOO |
|
| 0.53 | 0.80 | −3.07 | 0.3 | 1.01 | 0.14 | −0.34 | 0.47 | | |
|
| | | | | NLO | | | | | |
| 0.541 (35) | 0.914 (30) | −3.671 (135) | 0.366 (18) | 1.021 (28) | 0.118 (22) | −0.413 (14) | 0.545 (18) | −49.130 | −56.310 |
| 2.1% | 14.3% | 19.6% | 22.0% | 1.1% | −15.7% | 21.5% | 16.0% | | |
| 0.3 | 3.8 | −4.5 | 3.7 | 0.4 | −1.0 | −5.2 | 4.2 | | |
| 0.550 (121) | 0.842 (79) | −3.192 (335) | 0.324 (46) | 0.972 (68) | 0.110 (53) | −0.361 (34) | 0.503 (43) | 14.964 | 7.889 |
| 3.8% | 5.2% | 4.0% | 8.0% | −3.8% | −21.4% | 6.2% | 7.0% | | |
| 0.2 | 0.5 | −0.4 | 0.5 | −0.6 | −0.6 | −0.6 | 0.8 | | |
| 0.539 (41) | 0.860 (43) | −3.252 (175) | 0.314 (28) | 1.027 (38) | 0.149 (28) | −0.359 (18) | 0.475 (23) | 27.307 | 23.713 |
| 1.7% | 7.5% | 5.9% | 4.7% | 1.7% | 6.4% | 5.6% | 1.1% | | |
| 0.2 | 1.4 | −1.0 | 0.5 | 0.4 | 0.3 | −1.1 | 0.2 | | |
| | | | | NNLO | | | | | |
| 0.924 (574) | 0.603 (146) | −1.984 (457) | 0.416 (383) | 0.553 (357) | 0.084 (383) | −0.243 (53) | 0.510 (316) | 14.364 | 7.782 |
| 74.34% | −24.63% | −35.37% | 38.67% | −45.25% | −40.00% | −28.53% | 8.51% | | |
| 0.69 | −1.35 | 2.38 | 0.30 | −1.28 | −0.15 | 1.83 | 0.13 | | |
| 0.534 (116) | 0.831 (74) | −3.042 (26) | 0.316 (43) | 0.968 (62) | 0.116 (40) | −0.346 (27) | 0.491 (42) | 41.143 | 32.782 |
| 0.75% | 3.87% | −0.91% | 5.33% | −4.16% | −17.14% | 1.76% | 4.47% | | |
| 0.03 | 0.42 | 0.11 | 0.37 | −0.68 | −0.60 | −0.22 | 0.50 | | |
| 0.525 (81) | 0.808 (35) | −3.195 (111) | 0.319 (18) | 0.995 (37) | 0.138 (32) | −0.354 (12) | 0.474 (24) | 56.730 | 53.277 |
| −0.9% | 1.0% | 4.1% | 6.3% | −1.5% | −1.4% | 4.1% | 0.9% | | |
| −0.06 | 0.23 | −1.13 | 1.06 | −0.41 | −0.06 | −1.17 | 0.17 | | |
Fig.2(a) illustrates the proportions of at each order. The contributions at NLO and HO from Model B2 are closer to the true values than those from Model B1. This is because Model B2 has utilized more information compared to Model B1. Despite adopting relatively less information, Model B1 still satisfies convergence well in its results. However, there are noticeable differences between Models B1 and B2 at the HO due to some truncation errors not being accurately estimated. Nevertheless, these discrepancies have a minimal impact on the results of . Therefore, whether Model B1 or Model B2, their results closely approximate the true values. This indicates that even if one does not possess complete knowledge about all physical quantities’ truncation errors, Model B1 still yields better results compared to Model A.
Fig.2 The proportions of at each order for the example. The red, green and blue strips in the figure represent the true values, the values obtained by Models B1 and B2, respectively. The lightest and the second lightest colors are the proportions [defined in Eq. (18)] of LO and NLO, respectively. (a) The NLO fit. The darkest color is the proportion of HO. (b) The NNLO fit. The darkest color and the dark gray are the proportions of NNLO and HO, respectively. To avoid layer masking, the colors of the NNLO and the HO true values of , and are interchanged. Similarly, the colors of , and of Model B2 are also interchanged. |
Tab.2 compares the true values, the experimental values and the fitting results from Models B1 and B2. The theoretical values from both Model B1 and Model B2 do not differ obviously from the experimental values or the true values. In particular, the theoretical values obtained by Model B2 are closer to the true values than those obtained by Model B1. The 1σ errors from Model B1 and Model B2 are roughly equal to those of the experimental data, but Model B2 has smaller errors. Most true values fall within the 1σ intervals of the theoretical values, a few lie outside them, and none lies beyond the 3σ intervals. Tab.2 also indicates that more information leads to a better result.
Tab.2 The comparison of the NLO fitting values for the example. The subscripts tr, exp, B1 and B2 in the first row represent the true values, the experimental values, and the theoretical values from Model B1 and Model B2, respectively. The experimental values in the third column are sampled from the true values. is defined in Eq. (A1). |
| | | | |
|
1 | −35.800 | −34.637 ± 0.716 | −33.778 ± 1.786 | −35.023 ± 1.129 |
2 | 0.173 | 0.171 ± 0.003 | 0.168 ± 0.005 | 0.172 ± 0.004 |
3 | −0.276 | −0.279 ± 0.006 | −0.279 ± 0.031 | −0.279 ± 0.012 |
4 | 0.603 | 0.590 ± 0.012 | 0.584 ± 0.013 | 0.595 ± 0.008 |
5 | 27.281 | 27.753 ± 0.546 | 27.124 ± 0.756 | 27.600 ± 0.535 |
6 | −0.524 | −0.548 ± 0.010 | −0.554 ± 0.020 | −0.541 ± 0.014 |
7 | −1.486 | −1.434 ± 0.030 | −1.403 ± 0.028 | −1.459 ± 0.023 |
8 | −0.955 | −0.970 ± 0.019 | −0.954 ± 0.414 | −0.965 ± 0.219 |
9 | −0.227 | −0.226 ± 0.005 | −0.231 ± 0.008 | −0.227 ± 0.004 |
10 | −52.511 | −52.773 ± 1.050 | −54.107 ± 3.062 | −52.493 ± 2.069 |
11 | 44.936 | 46.250 ± 0.899 | 47.299 ± 2.319 | 45.728 ± 1.351 |
12 | −4.223 | −4.397 ± 0.084 | −4.427 ± 0.673 | −4.393 ± 0.374 |
13 | −14.674 | −14.769 ± 0.293 | −14.574 ± 2.295 | −14.782 ± 1.325 |
14 | −24.577 | −24.765 ± 0.492 | −24.351 ± 0.492 | −24.755 ± 0.419 |
15 | −15.864 | −15.505 ± 0.317 | −15.974 ± 1.741 | −15.435 ± 0.893 |
16 | 3.831 | 3.746 ± 0.077 | 3.847 ± 0.145 | 3.774 ± 0.069 |
17 | −4.193 | −4.208 ± 0.084 | −4.110 ± 0.146 | −4.203 ± 0.086 |
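The coverage of the true values discussed above can be checked directly from Tab.2. The short Python sketch below computes, for the Model B2 column, how many standard errors each true value lies from the fitted value, and counts the entries inside a few multiples of the error. The helper name is illustrative.

```python
# (true value, Model B2 value, Model B2 error) for the 17 rows of Tab.2.
TAB2_B2 = [
    (-35.800, -35.023, 1.129), (0.173, 0.172, 0.004),
    (-0.276, -0.279, 0.012), (0.603, 0.595, 0.008),
    (27.281, 27.600, 0.535), (-0.524, -0.541, 0.014),
    (-1.486, -1.459, 0.023), (-0.955, -0.965, 0.219),
    (-0.227, -0.227, 0.004), (-52.511, -52.493, 2.069),
    (44.936, 45.728, 1.351), (-4.223, -4.393, 0.374),
    (-14.674, -14.782, 1.325), (-24.577, -24.755, 0.419),
    (-15.864, -15.435, 0.893), (3.831, 3.774, 0.069),
    (-4.193, -4.203, 0.086),
]


def distances_in_sigma(rows):
    """Absolute deviation of each fitted value from the truth, in errors."""
    return [abs(fit - true) / err for true, fit, err in rows]


distances = distances_in_sigma(TAB2_B2)
for level in (1, 2, 3):
    inside = sum(d <= level for d in distances)
    print(f"within {level} sigma: {inside} of {len(distances)}")
print(f"largest deviation: {max(distances):.2f} sigma")
```

The same check can be repeated for the Model B1 column by replacing the second and third entries of each tuple.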
4.2 The NNLO fit of the example
In the NNLO fit, the priors of and in Models A and B1 are the same as those discussed in Sections 3.2 and 3.3. The priors in Model B2 adopt Eq. (15), and the parameters are given in Columns 7 to 11 of in Appendix B.
The numerical NNLO fitting results of obtained by Models A, B1 and B2 are shown in Rows 12 to 20 of Tab.1. The NNLO fitting results of obtained by Models A, B1 and B2 are shown in Tab.3. Besides WAIC and LOO, the last row also gives the PM value defined in Eq. (19).
Tab.3 The NNLO fitting results of the example. Column 2 is the true value of . Columns 3, 6 and 9 are the results of Models A, B1 and B2, respectively. The percentage is defined in Eq. (16), and the ratio is defined in Eq. (17). PM is defined in Eq. (19). |
| | | | | | | | | | |
|
1 | 0.02 | 0.176 (298) | 780.0% | 0.5 | 0.013 (29) | −35.0% | −0.2 | 0.017 (10) | −15.0% | −0.3 |
2 | 0.19 | 0.060 (293) | −68.4% | −0.4 | 0.102 (46) | −46.3% | −1.9 | 0.177 (31) | −6.8% | −0.4 |
3 | −0.72 | 0.351 (692) | −148.8% | 1.5 | −0.073 (264) | −89.9% | 2.5 | −0.703 (209) | −2.4% | 0.1 |
4 | 0.22 | −0.682 (917) | −410.0% | −1.0 | 0.917 (735) | 316.8% | 0.9 | 0.203 (96) | −7.7% | −0.2 |
5 | −0.16 | 0.018 (465) | −111.3% | 0.4 | −0.090 (60) | −43.8% | 1.2 | −0.137 (43) | −14.4% | 0.5 |
6 | 0.26 | 0.035 (485) | −86.5% | −0.5 | 0.189 (71) | −27.3% | −1.0 | 0.192 (58) | −26.2% | −1.2 |
7 | −0.42 | 0.088 (645) | −121.0% | 0.8 | −0.209 (520) | −50.2% | 0.4 | −0.413 (165) | −1.7% | 0.0 |
8 | −0.45 | 0.016 (1005) | −103.6% | 0.5 | −0.136 (188) | −69.8% | 1.7 | −0.472 (118) | 4.9% | −0.2 |
9 | −0.99 | −0.822 (525) | −17.0% | 0.3 | −0.261 (200) | −73.6% | 3.6 | −0.966 (208) | −2.4% | 0.1 |
10 | −0.06 | −0.415 (670) | 591.7% | −0.5 | −0.076 (59) | 26.7% | −0.3 | −0.083 (24) | 38.3% | −1.0 |
11 | 0.24 | 0.005 (993) | −97.9% | −0.2 | 0.163 (646) | −32.1% | −0.1 | 0.254 (132) | 5.8% | 0.1 |
12 | −0.18 | −0.182 (605) | 1.1% | 0.0 | −0.194 (85) | 7.8% | −0.2 | −0.219 (51) | 21.7% | −0.8 |
13 | 1.02 | 0.342 (706) | −66.5% | −1.0 | 1.011 (71) | −0.9% | −0.1 | 0.997 (57) | −2.3% | −0.4 |
14 | 0.29 | −0.226 (181) | −177.9% | −2.9 | 0.140 (132) | −51.7% | −1.1 | 0.265 (90) | −8.6% | −0.3 |
15 | −0.11 | −0.297 (427) | 170.0% | −0.4 | −0.087 (62) | −20.9% | 0.4 | −0.110 (35) | 0.0% | 0.0 |
16 | −0.56 | 0.095 (707) | −117.0% | 0.9 | −0.870 (394) | 55.4% | −0.8 | −0.567 (218) | 1.2% | 0.0 |
17 | 0.19 | 0.247 (714) | 30.0% | 0.1 | 0.188 (112) | −1.1% | 0.0 | 0.187 (67) | −1.6% | 0.0 |
WAIC | − | 14.364 | − | − | 41.143 | − | − | 56.730 | − | − |
LOO | − | 7.782 | − | − | 32.782 | − | − | 53.277 | − | − |
PM | − | 0.2650 | − | − | 0.0510 | − | − | 0.0177 | − | − |
Tab.1 shows that the best results of are obtained by Model B2. The NNLO (), () and their NLO values show that most of the results are improved. There exists a significant difference between the NLO and the NNLO results. This indicates that even though the NNLO prior PDFs are calculated from the NLO posterior PDFs, the NNLO fitting does not stay at the prior PDFs, as it can change to the other ranges. In other words, the NNLO fit is not a repeated NLO fit.
Tab.3 shows that there are significant differences between () and () for . Although a few have large values (the largest is 316.8%), and several also have large values, Model B1 still has a significant improvement over Model A. This can also be noticed from their PM values, which change significantly. Similarly, Model B2 also shows a more significant improvement in the results. Most and are smaller than those from Models A and B1. It can be seen that for the NNLO fit, the more useful information is known, the better the fitting results are.
Fig.2(b) illustrates the proportions at each order obtained by Models B1 and B2 in the NNLO fit. Tab.4 compares the true values, the experimental values and the fitting results from Models B1 and B2. Both lead to the same conclusion as the NLO fit: Model B2 gives better predictions of the truncation errors and the theoretical values.
Tab.4 The comparison of the NNLO fitting values for the example. The subscripts tr, exp, B1 and B2 in the first row represent the true values, the experimental values, and the theoretical values from Model B1 and Model B2, respectively. The experimental values in the third column are sampled from the true values. is defined in Eq. (A1). |
| | | | |
|
1 | −35.800 | −34.637 ± 0.716 | −34.219 ± 2.034 | −35.189 ± 0.910 |
2 | 0.173 | 0.171 ± 0.003 | 0.170 ± 0.007 | 0.172 ± 0.005 |
3 | −0.276 | −0.279 ± 0.006 | −0.279 ± 0.049 | −0.279 ± 0.036 |
4 | 0.603 | 0.590 ± 0.012 | 0.584 ± 0.032 | 0.590 ± 0.015 |
5 | 27.281 | 27.753 ± 0.546 | 27.091 ± 0.843 | 27.604 ± 0.675 |
6 | −0.524 | −0.548 ± 0.010 | −0.550 ± 0.028 | −0.547 ± 0.023 |
7 | −1.486 | −1.434 ± 0.030 | −1.434 ± 0.031 | −1.469 ± 0.020 |
8 | −0.955 | −0.970 ± 0.019 | −0.968 ± 0.327 | −0.971 ± 0.144 |
9 | −0.227 | −0.226 ± 0.005 | −0.226 ± 0.008 | −0.225 ± 0.010 |
10 | −52.511 | −52.773 ± 1.050 | −53.136 ± 4.369 | −52.212 ± 2.036 |
11 | 44.936 | 46.250 ± 0.899 | 46.466 ± 2.628 | 45.904 ± 1.176 |
12 | −4.223 | −4.397 ± 0.084 | −4.406 ± 0.736 | −4.390 ± 0.402 |
13 | −14.674 | −14.769 ± 0.293 | −14.713 ± 2.670 | −14.760 ± 1.834 |
14 | −24.577 | −24.765 ± 0.492 | −24.304 ± 0.614 | −24.692 ± 0.496 |
15 | −15.864 | −15.505 ± 0.317 | −15.609 ± 1.776 | −15.541 ± 1.006 |
16 | 3.831 | 3.746 ± 0.077 | 3.757 ± 0.154 | 3.761 ± 0.098 |
17 | −4.193 | −4.208 ± 0.084 | −4.136 ± 0.134 | −4.212 ± 0.093 |
4.3 Discussion
In the NLO fit, we have also removed one input quantity and fitted the remaining 16. The results are almost the same as those of the 17-input fit. Moreover, the 16-input fit predicts the removed quantity well. This also shows that our model has good predictive ability.
We have fitted other examples and reached the same conclusion. If an example converges faster than the example in this paper, but the experimental errors and the NNLO contributions are of the same order, the experimental errors have an impact on the HO values, and the NNLO fitting results are slightly worse. An example of this type can be downloaded from the source file in the arXiv version of this paper (arXiv: 2311.10423).
5 Observables and inputs
To fit the actual data in ChPT and to compare the results obtained by different methods, almost the same physical quantities are chosen as in Refs. [7, 8]. In addition, the covariance matrix of two scattering lengths and three two-flavor LECs is considered.
In Refs. [7, 8], 12 input values are used in the NLO fit: the quark mass ratio [6, 24, 67, 68]; the ratio of the decay constants of the K meson and the π meson [6, 7, 67, 68]; the shape factors at threshold and their slopes for the K_ℓ4 form factors [24]; the ππ scattering lengths [25]; the πK scattering lengths [69]; and the pion scalar radius in the pion scalar form factor. Five more input values are added for the NNLO fit: the pion scalar curvature of the pion scalar form factor [69] and four two-flavor LECs [25, 70]. The values of these 17 physical quantities are listed below. In this paper, both 12 and 17 inputs are considered in the NLO fit for comparison.
The values of and are
The values of , , and are
The values of scattering lengths , and the three relevant two-flavor LECs are
The covariance matrix of , and , , is listed in Tab.5.
Tab.5 The covariance matrix of , and , , . This is a symmetric matrix, only the values in the upper right corner of the matrix are given [25]. |
We have tested that including or omitting the covariance matrix has only a slight impact on the final fitting results, because the errors of these quantities themselves are very large. Nevertheless, to make the results statistically more sound, the covariance matrix is included in the global fit.
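To illustrate how a covariance matrix enters a Gaussian likelihood, the following Python sketch evaluates the log-likelihood of two correlated inputs and compares it with the uncorrelated case. It is a generic two-dimensional example with hypothetical numbers, not the likelihood used in this paper.

```python
import math


def log_likelihood_2d(residuals, sigmas, rho=0.0):
    """Log of a bivariate normal density for two residuals.

    residuals : (theory - experiment) for the two inputs
    sigmas    : their experimental standard errors
    rho       : correlation coefficient, with |rho| < 1
    """
    if abs(rho) >= 1.0:
        raise ValueError("|rho| must be smaller than 1")
    z1 = residuals[0] / sigmas[0]
    z2 = residuals[1] / sigmas[1]
    quad = (z1 * z1 - 2.0 * rho * z1 * z2 + z2 * z2) / (1.0 - rho * rho)
    log_norm = math.log(2.0 * math.pi * sigmas[0] * sigmas[1]
                        * math.sqrt(1.0 - rho * rho))
    return -0.5 * quad - log_norm


# Hypothetical residuals and errors, without and with correlation.
print(log_likelihood_2d((0.1, -0.2), (0.5, 0.4), rho=0.0))
print(log_likelihood_2d((0.1, -0.2), (0.5, 0.4), rho=0.6))
```

When the residuals are small compared with the errors, as in this example, the two values differ only mildly, which is consistent with the weak sensitivity to the covariance matrix reported above.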
The experimental values of scattering lengths and are
The experimental values of the scalar radius and the pion scalar form factor are
For the remaining two-flavor LEC, the following result is adopted [70]:
6 Fitting the LECs in ChPT
This section adopts the Bayesian Model B mentioned in Section 3.3 to perform a global fit, in order to obtain a new set of some NLO and NNLO LECs. The truncation errors are considered in the fit. Most references in this paper indicate that all () are at the order about (). Following the preparation in Section 3.1, they need to be first normalized by multiplying a factor (), respectively.
6.1 The NLO fitting by Model A
Although this paper does not adopt the minimum-χ² method [6–8] to fit the NLO LECs, similar results can still be obtained from the NLO fit by Model A. To allow a comparison with Ref. [7], this fit neither includes the covariance matrix nor considers the truncation errors. The fitting results with the first 12 inputs in Section 5 are shown in Tab.6, together with the results of Ref. [7]. “Free fit” means that no assumption is made in the fit; otherwise, the corresponding LECs are fixed to assumed values. The two approaches indeed give very close results, i.e., classical statistics and Bayesian statistics behave very similarly here. The slight differences come from the priors of the LECs. This indirectly shows that the two approaches are equivalent. However, it is easier to introduce extra information in Bayesian statistics. The minimum-χ² method can also add some constraints to the definition of χ² [6–8], but the information that can be added in this way is restricted. For example, the prior PDFs of the LECs cannot be embodied in it. In addition, the modified χ² departs from the original definition of χ²; in other words, the new χ² may not actually follow a χ² distribution.
Tab.6 The NLO fit by Model A, of which some different choices of . Columns 2, 4, 6 and 8 are the results from free , , and , respectively. Columns 3, 5, 7 and 9 are the results in Ref. [7] for comparison. |
LECs | Free fit | Free fit [7] | | [7] | | [7] | | [7] |
|
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
6.2 The NLO fitting
To fit the NLO LECs, an approach similar to that of the example in Section 4 is adopted, but the HO parameters are slightly different from those in the example. Most of the 17 quantities have the same expansion as in Eq. (10); the remaining ones involve numerical differentiation and are estimated with the method in Section 3.4. We have some information about the higher-order experimental data and the ranges of the LECs, so the parameters are set in a way that lies between Model B1 and Model B2. Therefore, from here on, all data are fitted using Model B. In this subsection, besides fitting all 17 inputs (Model B17), we also fit the first 12 inputs in Section 5 (Model B12) for comparison with Refs. [7, 8].
The setting parameters can be found in Columns 2 to 7 of the corresponding table in Appendix B. The parameters of two quantities are taken from Ref. [7], which indicates that their convergence is broken. The values for two quantities are given by their NNLO distributions, which are obtained statistically from the ranges of the LECs collected in Refs. [7, 8, 71] and the references therein. The other parameters are set in the same way as in Model B1. The priors of the fitted LECs are given in Columns 2 and 5 of the corresponding table in Appendix B. They refer to the ranges given in Refs. [7, 8, 71] and the references therein. Because the values in different references are not very close, the prior ranges are chosen wide enough to cover all possible ranges.
The numerical results of both fits can be found in Tab.7. The results obtained by both Models B12 and B17 are close to the NNLO results in Refs. [7, 8]. Moreover, both of them also satisfy the large-N_C limit, i.e., the LECs suppressed in this limit are close to zero, although no strong prior is imposed on them. This shows that the contributions from truncation errors have a great impact on the NLO fit. It is also quite possible that the truncation errors cannot be ignored in the NNLO fit. In addition, all theoretical errors from Model B are slightly larger than those in Refs. [7, 8]. This is because Ref. [8] does not consider the uncertainties of the truncation errors, and Ref. [7] does not consider the truncation errors at all. Model B not only estimates these truncation errors but also considers their PDFs, which makes the fitting errors slightly larger than those in Refs. [7, 8]. However, the difference is not very large, because the truncation errors are not very large. Tab.7 also shows that the change between 12 and 17 inputs is not very large: the relative difference does not exceed 20%. However, since more inputs are added, all theoretical errors become smaller. In addition, since Models B12 and B17 do not adopt the same inputs, WAIC and LOOCV cannot be used as model evaluation criteria, and we do not give these two values. The following discussion is based on the results of Model B17, because the fit becomes more accurate as the number of inputs increases. The red part in Fig.3 is the corner plot of the NLO LECs with 17 inputs, from which one can see both the distributions and the potential correlations among them.
Fig.3 The corner plot of the 17-input fitting . The red and blue colors mean the NLO and NNLO fit, respectively. The small and large loops mean the 68% HPD and the 95% HPD, respectively. The light-colored areas are the 68% HPD. |
Tab.7 The fitting results of . The superscripts indicate the input number in the fit. Columns 5 to 8 are the NLO and the NNLO fitting results in Refs. [7, 8], respectively. |
LECs | NLO B12 | NLO B17 | NNLO B17 | NLO fit [7] | NNLO fit [7] | NLO fit 2 [8] | NNLO fit 2 [8] |
|
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
(d.o.f.) | – | – | – | – | | | |
Tab.8 lists the theoretical contributions at each order for the 17-input fit. One quantity is replaced by a related one [2], because the latter theoretically has a better convergence. Most expansions conform to the convergence hypothesis very well: most LO values contribute more than 70%, most NLO values contribute between 10% and 23%, and most HO values contribute less than 10%. These percentages are neither too large nor too small, and all theoretical results agree well with the experimental data. The ratios of two adjacent orders are about 0.2, except for two quantities whose HO contributions are larger than the NLO ones. This situation also exists in Refs. [7, 8]. There are two possible reasons. First, the experimental values of these two quantities are not very precise; compared with other inputs, their errors are large, so the estimated truncation errors are not very precise either, which may lead to poor convergence. Second, the expansions of these two quantities may indeed have a broken convergence. Both issues require more precise experiments and theoretical calculations, and we do not discuss them further in this paper. Although the NLO fitting results for these two quantities in Ref. [8] converge, that analysis assumes a geometric-sequence model; Ref. [7] has the same problem. Model B instead introduces priors and has a wider scope of application, and a better prior can confine the theoretical value to a more reasonable range. In addition, the total contribution of one input is almost entirely given by the NLO, and its HO value tends towards 0. This is because the error of this input is very large, about 3.7 times its expected value. Therefore, its weight in the fit is very small, and the fitted expected value can be far from the experimental expected value. Hence, using the experimental value of this input to constrain the LECs, as in Ref. [8], does not seem particularly useful. Model B adopts both the convergence assumption and the prior PDFs. It can handle most precise data, so most results conform to the convergence assumption very well. Only a few results show poor convergence, because of either the problem itself or the large experimental errors.
Tab.8 The convergences of 17 inputs. The LECs are adopted from the 17-input NLO fitting results obtained by Model B in Tab.7. The second to the fourth columns are the contributions at the LO, NLO and HO, respectively. The percentage is defined in Eq. (18). The last two columns are the theoretical estimation and the experimental inputs, respectively. |
Observables | LO | NLO | HO | Theory | Experiment |
|
| (96.2%) | (3.1%) | (0.7%) | | |
| (88.4%) | (11.8%) | (−0.2%) | | |
| (84.1%) | (15.5%) | (0.3%) | | 1.199 ± 0.003 |
| (66.2%) | (22.2%) | (11.6%) | | |
| (77.9%) | (18.8%) | (3.3%) | | |
| (72.5%) | (20.2%) | (7.4%) | | |
| (104.0%) | (−4.4%) | (0.4%) | | |
| (63.3%) | (14.6%) | (22.1%) | | |
| (158.3%) | (−18.8%) | (−39.5%) | | |
| (0.0%) | (98.6%) | (1.4%) | | |
| (0.0%) | (160.2%) | (−60.2%) | | |
| (0.0%) | (104.0%) | (−4.0%) | | |
| (0.0%) | (99.9%) | (0.1%) | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
Refs. [7, 8] fit the NLO LECs only with the first 12 inputs in Section 5, because the remaining five physical quantities vanish at the LO. For these quantities, if the truncation errors are not considered, the NLO fit does not contain the NNLO contribution, and the results would exhibit a large deviation, because the HO contributions may have a large influence. Although Ref. [8] can estimate the truncation errors, its geometric-sequence model requires the values of at least two orders. Hence, that model cannot work for these five physical quantities. In contrast, the Bayesian method requires the value of only one order to estimate the truncation errors. In other words, with Model B, even physical quantities with a vanishing LO can be used as data in the NLO fit. Therefore, we also perform a full fit of all 17 physical quantities at the NLO.
6.3 The NNLO fitting and
The NNLO LECs to be fitted in this paper are the same as those in Ref. [8]. There are 38 of them, while the number of observables is 17. Hence, these 38 LECs are combined into 17 linearly independent combinations before the fit; the definitions of these combinations are given in Appendix A of Ref. [8]. In the NNLO fit, the NLO LECs and these combinations are fitted simultaneously using the approach described in Section 3.4.
The setting parameters are given in Columns 8 to 10 of the corresponding table in Appendix B. All the parameters are set as in Model B1, because nothing is known about the truncation errors. The priors of the fitted NNLO combinations can be found in Columns 6 and 7 of the corresponding table in Appendix B. They refer to the ranges given in Tab.9 of Ref. [8]. The blue part in Fig.3 shows the corner plot of the NLO LECs from the NNLO fit, and Column 4 in Tab.7 lists their NNLO fitting results. Both Fig.3 and Tab.7 indicate that there is no significant change of the theoretical expected values between the NLO and the NNLO fit. In addition, the NNLO fit gives the NLO LECs and their correlations with smaller theoretical errors, owing to the introduction of the NNLO contributions. Tab.7 also indicates that the differences between the 17-input fit at NNLO and the 12- or 17-input fits at NLO are not very large, all within 20%. This indicates that the method is stable and does not cause an obvious change of the NLO LECs as the order increases. This is exactly one of the motivations of this paper.
Tab.9 The values and the errors of , comparing with the results in Ref. [8]. |
| Results | Ref. [8] | | | Results | Ref. [8] |
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
A figure in Appendix B gives the posterior distributions of the fitted NNLO combinations. The constraints in Eqs. (5)−(7) cause some of them to deviate from normal distributions, but not seriously. Tab.9 shows the numerical results. Compared with the results in Ref. [8], all standard deviations are slightly larger, because Eq. (11) takes the uncertainties of the truncation errors into account and thus enlarges the theoretical errors.
Tab.10 gives the theoretical contributions at each order for the NNLO fit. Most physical quantities satisfy the chiral convergence very well, with four exceptions. This situation also exists in the NLO fit and in Ref. [8]; the reason has been discussed in Section 6.2. It also leads to a large theoretical error for one of these quantities. If a set of more precise experimental data is introduced, this problem may disappear.
Tab.10 Same as Tab.8, except for the NNLO fit. |
Observables | LO | NLO | NNLO | HO | Theory | Experiment |
|
| (95.1%) | (5.3%) | (−0.5%) | (0.01%) | | |
| (88.4%) | (13.1%) | (−1.4%) | (−0.07%) | | |
| (83.2%) | (16.4%) | (0.6%) | (−0.12%) | | |
| (66.7%) | (23.7%) | (8.7%) | (0.88%) | | |
| (76.9%) | (17.0%) | (5.8%) | (0.37%) | | |
| (72.2%) | (20.6%) | (6.8%) | (0.34%) | | |
| (104.1%) | (4.6%) | (−0.7%) | (1.20%) | | |
| (63.2%) | (15.2%) | (21.6%) | (0.00%) | | |
| (161.9%) | (−21.2%) | (−40.8%) | (0.04%) | | |
| (0.0%) | (90.4%) | (9.7%) | (−0.11%) | | |
| (0.0%) | (174.8%) | (−71.8%) | (−3.01%) | | |
| (0.0%) | (−115.1%) | (214.6%) | (0.48%) | | |
| (0.0%) | (104.8%) | (−4.1%) | (−0.67%) | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
6.4 The NNLO fitting
This section discusses the fit about
.
, which have been determined in Tab.9, are linear combinations of
. Although the number of
is less than the number of
, three
can be determined by solving the linear equations [
8]. However, some of these values will be one to two orders of magnitude times larger than those in the other references. Some constraints are required to be introduced. For the other unsolvable
, their distribution is solved by the Monte Carlo method [
8]. Although the approach in Ref. [
8] can solve this problem, its efficiency is very low, and it needs to take a lot of time. Therefore, this paper adopts the MCMC algorithms in Section 2. We have repeated the computation many times, and some similar results are obtained. Randomness does not affect the results obtained by this method. The difference is that all the prior PDFs of parameters
are all set to the different uniform distributions. The boundaries of these prior uniform distributions are the same as Eq. (38) in Ref. [
8]. Uniform priors are used instead of normal priors because we want to explore the boundary dependence of each
in this overfitting problem. Normal priors would generate fewer samples near the boundaries, which lowers the efficiency.
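The sampling setup described above can be illustrated with a minimal random-walk Metropolis sketch. It is not the paper's code: the matrix `A`, the data `y`, the errors `sigma`, the box boundaries, and the function names are all hypothetical, and a plain Metropolis step stands in for the MCMC algorithms of Section 2. The sketch only shows how a uniform prior inside fixed boundaries behaves when there are more parameters than data.

```python
import numpy as np

def log_posterior(c, A, y, sigma, lower, upper):
    """Gaussian log-likelihood plus a uniform (box) prior on c."""
    if np.any(c < lower) or np.any(c > upper):
        return -np.inf  # outside the prior box: zero prior probability
    resid = (A @ c - y) / sigma
    return -0.5 * np.sum(resid**2)

def metropolis(A, y, sigma, lower, upper, n_steps=20000, step=0.1, seed=0):
    """Random-walk Metropolis sampler; returns an (n_steps, n_params) chain."""
    rng = np.random.default_rng(seed)
    c = 0.5 * (lower + upper)  # start at the centre of the prior box
    logp = log_posterior(c, A, y, sigma, lower, upper)
    chain = np.empty((n_steps, c.size))
    for i in range(n_steps):
        proposal = c + step * rng.normal(size=c.size)
        logp_new = log_posterior(proposal, A, y, sigma, lower, upper)
        # Accept with probability min(1, exp(logp_new - logp)).
        if np.log(rng.uniform()) < logp_new - logp:
            c, logp = proposal, logp_new
        chain[i] = c
    return chain

# Hypothetical toy problem: 2 data points, 3 parameters (underdetermined).
A = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])
y = np.array([1.0, 2.0])
sigma = np.array([0.1, 0.1])
lower = np.full(3, -5.0)
upper = np.full(3, 5.0)

chain = metropolis(A, y, sigma, lower, upper)
```

In this toy problem the data fix only two linear combinations of the three parameters, so the chain drifts along the unconstrained direction until it meets the box, which is the kind of boundary dependence the uniform prior is meant to expose.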
Figure 5 in Appendix B illustrates the posterior distributions of
. It can be seen that different
have different shapes.
have a large probability near both boundaries. Their posterior PDFs are dependent on both sides. In addition,
depend on only one side. This can also be seen from their posterior distributions: one side has a shape similar to a half-Gaussian distribution. The constraint on these
on this side is reliable, but the other side gives no constraint on these
. Finally, these twelve
give Gaussian-like posterior PDFs, so these twelve results are more credible. Of course, using 17 data to fit 38
is far from adequate, so there is an overfitting problem. Hence, some
are boundary-dependent. This property is similar to that found in Ref. [
8].
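The classification of posterior shapes described above can be sketched from MCMC samples as follows. This is an illustrative assumption, not the criterion used in the paper: the function names, the number of bins, and the threshold `ratio` are hypothetical. The first function computes a highest-posterior-density (HPD) interval as the shortest window holding a given fraction of the samples; the second flags a boundary when the histogram bin at that edge is still a sizeable fraction of the peak.

```python
import numpy as np

def hpd_interval(samples, mass=0.68):
    """Shortest interval containing a fraction `mass` of the samples."""
    x = np.sort(np.asarray(samples, dtype=float))
    n = x.size
    k = int(np.ceil(mass * n))            # number of samples inside the interval
    widths = x[k - 1:] - x[: n - k + 1]   # width of every window of k samples
    i = int(np.argmin(widths))
    return x[i], x[i + k - 1]

def boundary_dependence(samples, lower, upper, bins=20, ratio=0.5):
    """Return (lower_flag, upper_flag): True if the posterior is still
    large in the histogram bin touching that boundary.
    `ratio` is a hypothetical threshold relative to the peak bin."""
    counts, _ = np.histogram(samples, bins=bins, range=(lower, upper))
    peak = counts.max()
    return bool(counts[0] > ratio * peak), bool(counts[-1] > ratio * peak)

rng = np.random.default_rng(0)

# Gaussian-like posterior well inside wide boundaries: no boundary dependence.
gauss = rng.normal(0.0, 1.0, size=50000)
lo, hi = hpd_interval(gauss)
gauss_flags = boundary_dependence(gauss, -10.0, 10.0)

# Flat posterior filling the whole prior box: dependent on both boundaries.
flat = rng.uniform(0.0, 1.0, size=50000)
flat_flags = boundary_dependence(flat, 0.0, 1.0)
```

A one-sided, half-Gaussian-like posterior would give one flag set and the other not, matching the third category discussed in the text.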
Tab.11 gives the fitting values of
and compares them with the results in Refs. [
7,
8,
71]. The brackets “[” and “]” denote that the results are strongly dependent on the lower and the upper boundaries, respectively. The parentheses “(” and “)” denote that the results are weakly dependent on the lower and the upper boundaries, respectively. We have tried doubling the boundaries: the strongly boundary-dependent results deviate considerably from the original values, while the weakly boundary-dependent results change only slightly. The boundaries chosen in Ref. [
8] are wide enough to cover almost all results in the other references [
6,
7,
19,
71–
80]. Hence, the true values lie within the intervals in Tab.11 with a large probability.
Tab.11 The values of are in units of . The brackets “[” and “]” represent strong dependence on the lower and the upper boundaries, respectively. “(” and “)” represent weak dependence on the lower and the upper boundaries, respectively. The results with an asterisk mean the input boundaries on the website [68] are very close to those in Ref. [7] (less than ). The symbol “” for the results in Ref. [71] means these values are zeros in the large- limits. |
LECs | Results | Ref. [8] | Ref. [7] | Ref. [71] | | LECs | Results | Ref. [8] | Ref. [7] | Ref. [71] |
|
| 14.82 (41.49) | | 12 | | | | −0.41 (0.82) | | −0.48 | |
| 3.48 (8.98] | | 3.0 | | | | 5.88 [15.71] | | 9.0 | |
| 1.70 [6.05] | | 4.0 | | | | 0.92 [3.52] | | −1.0 | |
| 18.54 (29.94) | | 15 | | | | −21.17 (58.67) | | −11 | |
| −3.62 (19.23) | | −4.0 | | | | −4.30 [42.04) | | 10 | |
| −3.43 [4.22) | | −4.0 | | | | 0.45 [3.95] | | −2.0 | |
| 1.00 [6.26] | | 5.0 | | | | −23.20 [24.09) | | −20 | |
| 10.52 [16.84] | | 19 | | | | 3.44 [4.61] | | 3.0 | |
| | | −0.25 | | | | 2.15 (9.56) | | 2.0 | |
| −2.90 (4.17) | | −4.0 | | | | 1.79 [3.63] | | 1.7 | |
| −6.02 (5.57) | | −2.8 | | | | −0.01 [3.58] | | 0.82 | |
| 1.74 (2.06) | | 1.5 | | | | 9.27 (10.48] | | 7.0 | |
| −3.32 (3.15) | | −1.0 | | | | 1.27 [5.12] | | 2.0 | |
| (1.13) | | −3.0 | | | | 11.09 [22.82] | | – | |
| 1.10 [4.20] | | 3.2 | | | | 3.90 [26.83] | | – | |
| 0.27 [2.62] | | −1.0 | | | | −1.16 [19.86] | | – | |
| −1.46 [5.45] | | 0.63 | | | | −1.11 [20.31] | | – | |
| −5.42 (5.22) | | −4.0 | | | | −24.09 [63.17] | | – | |
| 0.53 [3.45] | | 1.0 | | | | 20.83 [62.90] | | – | |
7 Discussion and summary
This paper proposes a more general Bayesian model (Model B) that includes truncation errors. This model is based on the idea of a simple truncation-error model [
8] and the Bayesian model framework [
28]. Compared to Refs. [
7,
8], Model B has several advantages.
i) This model turns the understanding of ChPT into prior knowledge for the fitting process, including information on the LECs, the convergence of ChPT, and the truncation errors. The prior information can be conveniently introduced through Eqs. (5)−(7). No other assumptions are needed, such as the geometric-sequence assumption in Ref. [
8]. It also gives a set of more precise NLO fitted LECs. A similar result is obtained in the NNLO fit in Ref. [
7] (see Tab.7). Hence, there are good reasons to believe that the NNLO fitted LECs are also more precise, although no higher-order fit is available for comparison.
ii) With the help of the MCMC method, the distributions of the LECs can be obtained, and the computation is faster; the computational time of Model B is the shortest. The Bayesian method has another inherent advantage: clear distribution figures of the LECs can be obtained, because Bayesian statistics can give more points in a given time. Therefore, one can obtain not only the expected values and errors of the LECs, but also their distributions. Refs. [
7,
8] cannot give the distributions of LECs, although they can give the errors.
iii) Model B gives a general fitting method that can also be applied to other problems. The two extremes of this model (Models B
1 and B
2) have been evaluated with a toy example in Section 4, which confirms that more prior information indeed gives more precise results. With the quantitative evaluation criteria in Section 3.5, the improvement due to the prior information can be seen more clearly. The actual ChPT fit lies between the two extremes and is better than Model A. However, Model A gives a result similar to that of the minimum
method in Ref. [
7].
iv) For the NNLO LECs
, smoother PDFs are given than in Ref. [
8] (Ref. [
7] does not give PDFs). With these PDFs, one can see how the
depend on the boundaries more intuitively.
v) This paper also contains some minor improvements. The covariance matrix given in Ref. [
25] is taken into account. The results are insensitive to the initial conditions, in contrast to Ref. [
7].
To test the effectiveness of the model, one example is randomly generated to imitate the actual ChPT. Some parameters and some quantities are introduced, which imitate the LECs and the experimental data, respectively. The exact values of these parameters and quantities are known, and they are treated as the true values. Model A, which does not consider the truncation errors, is also introduced for comparison with two ideal cases of Model B. One case knows nothing about the truncation errors except their orders of magnitude. The other knows the distributions of the truncation errors. The fitting results indicate that prior information on the truncation errors can improve the fit greatly, even when this information is not very precise. Hence, Model B is adopted to fit the actual ChPT data.
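The idea behind this test can be sketched with a much simpler toy than the one in Appendix A. Everything below is a hypothetical stand-in: the quadratic "true" function, the values of `a_true`, `b_true`, `c_true`, and the choice of adding a truncation error in quadrature to the experimental error (the paper's Model B treats the truncation errors through priors instead). The sketch only illustrates why ignoring truncation errors makes the reported parameter errors too small.

```python
import numpy as np

def weighted_linear_fit(X, y, var):
    """Weighted least squares: best-fit parameters and their 1-sigma errors."""
    W = 1.0 / var
    cov = np.linalg.inv(X.T @ (W[:, None] * X))
    theta = cov @ (X.T @ (W * y))
    return theta, np.sqrt(np.diag(cov))

rng = np.random.default_rng(1)
x = np.linspace(0.1, 0.5, 12)            # hypothetical expansion parameter
a_true, b_true, c_true = 1.0, 2.0, 1.5   # hypothetical "true" values
sigma_exp = 0.01                         # hypothetical experimental error

# Pseudo-data generated from the full function, including the x**2 term.
y = a_true + b_true * x + c_true * x**2 + rng.normal(0.0, sigma_exp, x.size)

# Truncated fit function: a + b*x only (the x**2 term is "higher order").
X = np.column_stack([np.ones_like(x), x])

# Fit in the spirit of Model A: experimental errors only.
theta_A, err_A = weighted_linear_fit(X, y, np.full(x.size, sigma_exp**2))

# Fit with an order-of-magnitude truncation error ~ x**2 added in quadrature.
sigma_trunc = 1.0 * x**2
theta_B, err_B = weighted_linear_fit(X, y, sigma_exp**2 + sigma_trunc**2)
```

Comparing `theta_A - [a_true, b_true]` with `err_A`, and likewise for the second fit, shows how far each fit lies from the known true values in units of its own reported error; how well the second fit covers the bias depends on how good the rough estimate `sigma_trunc` is.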
The actual ChPT fit indicates that the Bayesian method without truncation errors gives results similar to those of classical statistics. In other words, classical statistics can be treated as a special case of Bayesian statistics, although Bayesian statistics can be applied more widely. With the help of Model B, some
and
(defined in Ref. [
64]) are fitted at the NLO and the NNLO. The fitted
are almost unchanged between the NLO and the NNLO fits. The changes between 12 and 17 input data are also small, but all the theoretical errors decrease for the 17 inputs because of the more precise estimation of the truncation errors. Model B also solves a problem in the free fit, which leads to
and
being very large, although they are zero in the large-
limit. Because the number of
to be fitted is larger than the number of the experimental data, some independent
are fitted first, which are the linear combinations of the
to be fitted. From the posterior PDFs of
, the reliable intervals of twelve
are obtained, and five
are constrained only at the upper or the lower boundary of the intervals, and the other 21
are strongly dependent on both boundaries. More experimental data are needed to confirm these uncertain
. Because all the
do not suffer from overfitting, they are more precise than
. If one knows some more values of these
, some other
can be constrained by these
. For the physical quantities to be fitted, most theoretical contributions are well convergent, except
and
. This possibly comes from the large experimental errors, or some of these quantities may indeed not converge. More precise experimental data and theoretical calculations are needed in the future. It can be seen that Model B can estimate the truncation errors very well.
Some input parameters are very rough, such as those in Eqs. (6) and (7). A more precise estimation beyond the simple convergence assumption will be studied in future work. In addition, if more analytical and experimental results are introduced, the results should be more precise. However, the NNLO theoretical calculation is complicated and needs further study. This approach can also be used to fit other LECs, such as the pion-nucleon and meson-baryon chiral LECs, although fewer experimental data and theoretical results are available for them than for the mesonic LECs at present.
In conclusion, truncation errors usually cannot be ignored in the global fit, and some prior information can improve the fit greatly, even when this information is not very exact. Model B provides a feasible implementation scheme. A new set of more reliable NLO and NNLO LECs is fitted by Model B. This model can fit not only the LECs in ChPT, but also parameters in other EFTs and in perturbation theory.
8 Appendix A: One testing example
Eq. (A1) gives the functions of the example in Section 3. For convenience, parameters with the same name in different functions are different parameters. The values of and can be found in Table A1. The values of and are given in the second row in Tab.1 and the second column III, respectively, which are marked by a subscript “tr”.
Table A1 The values of parameters and in Eq. (A1). Because the values are exact, more significant digits are given. |
| | | | | | | | | |
|
| −50.00000 | −50.00000 | −50.00000 | | | | | | −50.00000 |
| 10.00000 | 10.00000 | 10.00000 | 10.00000 | 10.00000 | 10.00000 | | | 10.00000 |
| −0.27574 | −81.82470 | −20.69689 | −95.19898 | | | | | −130.30649 |
| −0.15389 | 55.22236 | −22.53571 | 10.47685 | 14.87959 | 10.84243 | 85.28317 | | 164.01310 |
| 32.80394 | 52.16045 | 52.16045 | 42.66749 | 12.43968 | 77.33250 | | | 52.35691 |
| −10.25934 | −10.95043 | −11.71888 | −28.67356 | −26.84030 | | | | −32.24206 |
| −2.39804 | 2.37745 | −9.62388 | 9.39379 | 9.39265 | 0.57071 | | | 42.90526 |
| −24.83947 | 69.61634 | −2.52600 | −10.10231 | 7.19485 | | | | 99.63693 |
| 0.51431 | 99.49478 | 30.47334 | 32.08646 | 32.08646 | 113.26784 | 96.25052 | 32.08646 | 77.78096 |
| −69.04271 | −69.23904 | 38.65428 | 20.36290 | 10.78947 | 10.06250 | | | 82.44726 |
| −62.98961 | −66.28358 | 42.37512 | −8.89931 | 13.20056 | 6.67356 | | | 91.91503 |
| 50.00000 | 10.00000 | −135.00000 | 10.00000 | 10.00000 | | | | −100.00000 |
| 74.39589 | 103.26600 | 103.32687 | 156.81033 | 95.54550 | 83.85802 | 100.61745 | 105.09613 | 107.97725 |
| −33.59097 | −54.18416 | −54.20951 | −41.78655 | −28.19762 | −28.53647 | | | −48.27921 |
| 28.77264 | 26.35989 | 40.37381 | 49.81383 | 40.01344 | 63.80493 | 40.01344 | | 45.46315 |
| −9.19703 | 3.17944 | 1.51912 | 5.66408 | −5.66251 | −8.10996 | 10.10760 | | 58.11734 |
| −29.07789 | −62.10099 | −44.02866 | −48.43771 | −31.06314 | −34.52566 | −40.99873 | | −12.31774 |
9 Appendix B: Some tables and figures for the fits
Table A2 The fitting parameters and the priors in Model B2. The subscripts NLO and NNLO represent the NLO and NNLO fit, respectively. The definitions of these parameters are in Eqs. (10)−(12) and the text below them. |
| | | | | | | | | | | |
|
1 | 0.140 | 0.050 | 1 | 0.616 | 0.308 | | 0.040 | 0.020 | 1 | 0.023 | 0.012 |
2 | 0.100 | 0.050 | 1 | 0.751 | 0.376 | | 0.040 | 0.020 | 0 | 0.178 | 0.089 |
3 | 0.020 | 0.050 | 0 | 3.232 | 1.616 | | 0.100 | 0.030 | 1 | 0.758 | 0.379 |
4 | 0.040 | 0.050 | 1 | 0.268 | 0.134 | | 0.120 | 0.036 | 1 | 0.196 | 0.098 |
5 | 0.140 | 0.050 | 1 | 1.097 | 0.549 | | 0.060 | 0.020 | 1 | 0.146 | 0.073 |
6 | 0.110 | 0.050 | 0 | 0.108 | 0.054 | | 0.020 | 0.020 | 0 | 0.200 | 0.100 |
7 | 0.110 | 0.050 | 1 | 0.281 | 0.140 | | 0.060 | 0.020 | 1 | 0.347 | 0.173 |
8 | 0.140 | 0.050 | 1 | 0.434 | 0.217 | | 0.010 | 0.020 | 0 | 0.484 | 0.242 |
9 | 0.070 | 0.050 | 0 | | | | 0.130 | 0.039 | 0 | 0.958 | 0.479 |
10 | 0.120 | 0.050 | 0 | | | | 0.050 | 0.020 | 0 | 0.061 | 0.031 |
11 | 0.110 | 0.050 | 0 | | | | 0.060 | 0.020 | 0 | 0.275 | 0.138 |
12 | 0.010 | 0.050 | 1 | | | | 0.040 | 0.020 | 1 | 0.217 | 0.109 |
13 | 0.140 | 0.050 | 1 | | | | 0.060 | 0.020 | 1 | 0.987 | 0.494 |
14 | 0.140 | 0.050 | 1 | | | | 0.070 | 0.021 | 1 | 0.279 | 0.139 |
15 | 0.020 | 0.050 | 0 | | | | 0.030 | 0.020 | 1 | 0.098 | 0.049 |
16 | 0.060 | 0.050 | 0 | | | | 0.050 | 0.020 | 1 | 0.622 | 0.311 |
17 | 0.140 | 0.050 | 1 | | | | 0.050 | 0.020 | 1 | 0.187 | 0.093 |
Table A3 The prior of the LECs in Model B2. The subscripts NLO and NNLO represent the NLO and NNLO fit, respectively. |
| | | | |
|
1 | 0.616 | 0.308 | 0.023 | 0.012 |
2 | 0.751 | 0.376 | 0.178 | 0.089 |
3 | 3.232 | 1.616 | 0.758 | 0.379 |
4 | 0.268 | 0.134 | 0.196 | 0.098 |
5 | 1.097 | 0.549 | 0.146 | 0.073 |
6 | 0.108 | 0.054 | 0.200 | 0.100 |
7 | 0.281 | 0.140 | 0.347 | 0.173 |
8 | 0.434 | 0.217 | 0.484 | 0.242 |
9 | | | 0.958 | 0.479 |
10 | | | 0.061 | 0.031 |
11 | | | 0.275 | 0.138 |
12 | | | 0.217 | 0.109 |
13 | | | 0.987 | 0.494 |
14 | | | 0.279 | 0.139 |
15 | | | 0.098 | 0.049 |
16 | | | 0.622 | 0.311 |
17 | | | 0.187 | 0.093 |
Table A4 The parameters for fitting and . The superscripts 12 and 17 represent the fit with 12 and 17 inputs, respectively. The subscripts NLO and NNLO represent the NLO and NNLO fitting, respectively. The definitions of these parameters are in Eqs. (10)−(12) and the text below them. |
Quantity | | | | | | | | | |
|
| 0.050 | 0.050 | 0.5 | 0.050 | 0.050 | 0.5 | 0.020 | 0.020 | 0.5 |
| 0.050 | 0.050 | 0.5 | 0.050 | 0.050 | 0.5 | 0.020 | 0.020 | 0.5 |
| 0.050 | 0.050 | 0.5 | 0.050 | 0.050 | 0.5 | 0.020 | 0.020 | 0.5 |
| 0.150 | 0.050 | 1 | 0.150 | 0.050 | 1 | 0.020 | 0.020 | 0.5 |
| 0.050 | 0.050 | 0.5 | 0.050 | 0.050 | 0.5 | 0.020 | 0.020 | 0.5 |
| 0.050 | 0.050 | 0.5 | 0.050 | 0.050 | 0.5 | 0.020 | 0.020 | 0.5 |
| 0.050 | 0.050 | 0.5 | 0.050 | 0.050 | 0.5 | 0.020 | 0.020 | 0.5 |
| 0.100 | 0.050 | 1 | 0.100 | 0.050 | 1 | 0.020 | 0.020 | 0.5 |
| 0.050 | 0.050 | 0.5 | 0.050 | 0.050 | 0.5 | 0.020 | 0.020 | 0.5 |
| 0.350 | 0.105 | 1 | 0.350 | 0.105 | 1 | 0.020 | 0.020 | 0.5 |
| 0.250 | 0.075 | 0 | 0.250 | 0.075 | 0 | 0.020 | 0.020 | 0.5 |
| 0.200 | 0.060 | 0.5 | 0.200 | 0.060 | 0.5 | 0.020 | 0.020 | 0.5 |
| | | | 0.200 | 0.060 | 0.5 | 0.020 | 0.020 | 0.5 |
| | | | 0.200 | 0.060 | 0.5 | 0.050 | 0.050 | 0.5 |
| | | | 0.200 | 0.060 | 0.5 | 0.050 | 0.050 | 0.5 |
| | | | 0.200 | 0.060 | 0.5 | 0.050 | 0.050 | 0.5 |
| | | | 0.200 | 0.060 | 0.5 | 0.050 | 0.050 | 0.5 |
Table A5 The parameters of the priors of and . Their definition is above Eq. (20). The superscripts 12 and 17 represent the fit with 12 and 17 inputs, respectively. The subscripts NLO and NNLO represent the NLO and the NNLO fits, respectively. |
i | | | | | | |
|
1 | 0.500 | 0.300 | 0.500 | 0.300 | 0.077 | 0.989 |
2 | 1.000 | 0.500 | 1.000 | 0.500 | 0.190 | 3.102 |
3 | 3.000 | 1.000 | 3.000 | 1.000 | 1.073 | 0.007 |
4 | 0.200 | 0.200 | 0.200 | 0.200 | 0.068 | 0.095 |
5 | 1.000 | 0.500 | 1.000 | 0.500 | 0.040 | 0.071 |
6 | 0.000 | 0.300 | 0.000 | 0.300 | 0.806 | 1.772 |
7 | 0.300 | 0.300 | 0.300 | 0.300 | 0.561 | 1.581 |
8 | 0.500 | 0.500 | 0.500 | 0.500 | 0.043 | 0.250 |
9 | | | | | 0.173 | 0.497 |
10 | | | | | 0.405 | 1.819 |
11 | | | | | 0.066 | 0.227 |
12 | | | | | 0.013 | 0.053 |
13 | | | | | 0.244 | 0.388 |
14 | | | | | 0.007 | 0.693 |
15 | | | | | 0.022 | 0.072 |
16 | | | | | 0.214 | 1.122 |
17 | | | | | 0.207 | 0.222 |
Fig.A1 The posterior distributions of the NNLO fitting . The vertical coordinate is the posterior PDF and the horizontal coordinate is the value of . The pink shaded area depicts the 68% HPD. The blue line is the distribution curve of . |
Fig. A2 The posterior distributions of . The horizontal axis represents the value of , and the upper and the lower boundaries are given in Eq. (38) in Ref. [8]. The vertical coordinate is the posterior PDF. The pink shaded area depicts the 68% HPD. The blue line is the distribution curve of . |