1. School of Mathematics, Liaoning Normal University, Dalian 116029, China
2. School of Mathematical Sciences, Dalian University of Technology, Dalian 116024, China
tt010725@163.com
Abstract
By utilizing the improvement function, we transform the nonsmooth convex constrained optimization problem into an unconstrained one and construct an infeasible quasi-Newton bundle method in proximal form. It should be noted that the objective function minimized in the unconstrained subproblem may vary along the iterations (it does not change if a null step is made; otherwise it is updated to a new function), so some adjustments are necessary in order to obtain the convergence result. We employ the main idea of the infeasible bundle method of Sagastizábal and Solodov and, under the circumstance that each iteration point may be infeasible for the primal problem, we prove that every cluster point of the sequence generated by the proposed algorithm is an optimal solution of the original problem. Furthermore, for the BFGS quasi-Newton algorithm with a strongly convex objective function, we obtain conditions which guarantee the boundedness of the quasi-Newton matrices and the R-linear convergence of the iteration points.
In the field of operations research, it is often necessary to minimize a function that has no derivative at certain points; such problems are called non-differentiable optimization (NDO) or nonsmooth optimization (NSO) problems. Many practical problems are NSO problems. For example, let the nonnegative variable represent income. In many economic systems, income corresponds to a tax function , which usually has discontinuous derivatives (a small numerical sketch of such a tax function is given after the problem statement below). Given some thresholds and tax rates, is given in the following form: , where . It is easy to verify that if , then is a continuous, increasing convex function. Further, it is easy to verify that has the following form: . One usually wants to minimize the tax paid, and thus the following optimization problem arises: , where the objective is a convex function and the constraint set is convex; this is a convex program, and its more general form is
where the functions are convex and, in general, not differentiable. Suppose that the Slater constraint qualification holds for Problem (1), i.e., there exists such that . We also assume that there exists an oracle which, for any given point, can compute the function value and one subgradient of each function, i.e., and .
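As a concrete illustration of such a nonsmooth convex tax function, the following sketch builds a hypothetical bracketed schedule as a pointwise maximum of affine pieces; all thresholds and rates are invented for illustration and are not the quantities referred to above.

```python
import numpy as np

# Hypothetical brackets: thresholds b_i and marginal rates r_i.  With
# nondecreasing rates, the tax T is convex, piecewise linear, and
# nondifferentiable exactly at the thresholds.
thresholds = np.array([0.0, 10_000.0, 40_000.0, 85_000.0])
rates = np.array([0.10, 0.20, 0.30, 0.40])          # nondecreasing => convex

# Tax accumulated up to each threshold; every bracket then yields an affine
# piece T_i(x) = base_i + r_i * (x - b_i), and T is their pointwise maximum.
bases = np.concatenate(([0.0], np.cumsum(rates[:-1] * np.diff(thresholds))))

def tax(x):
    """Convex piecewise-linear tax: pointwise maximum of the affine pieces."""
    return float(np.max(bases + rates * (x - thresholds)))

def tax_subgradient(x):
    """One subgradient at x: the slope of any piece attaining the maximum."""
    return float(rates[np.argmax(bases + rates * (x - thresholds))])

if __name__ == "__main__":
    for income in (5_000.0, 40_000.0, 120_000.0):
        print(income, tax(income), tax_subgradient(income))
```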
In general, NSO problems are difficult to solve. Among the many solution methods, the best known are the subgradient method, the cutting plane method, the analytic center cutting plane method, and the bundle method; the latter two are currently considered the most stable and reliable methods for solving NSO problems. Constrained NSO problems are even more complicated [4]. For general nonsmooth convex constrained problems, one approach is to solve an equivalent unconstrained problem built with an exact penalty function [7]. However, the use of penalty functions inevitably raises the issue of choosing the penalty parameter: if a large parameter value is required to guarantee the exactness of the penalty function, the numerical computation becomes more difficult. There are also bundle methods that do not use the idea of penalty functions [12]. However, all descent steps in these methods, including the initial point, are required to be feasible, and satisfying this requirement is likely to be as difficult as solving Problem (1) itself; as a result, the overall computational burden increases substantially. Other work on bundle methods for constrained optimization can be found in [5, 8, 9, 14]. Among them, [14] gives a doubly stabilized bundle method obtained by combining the proximal bundle method and the level bundle method, which enjoys the advantages of both and also proposes an effective stopping criterion. For nonconvex objective functions, [5] constructs a redistributed bundle method. Using only approximate information about the objective and constraint functions, [8] gives a proximal bundle method in which the errors between the approximate and exact values must be bounded, though the bound may be unknown, and finally obtains an approximate optimal solution of the problem.
We give the improvement function related to Problem (1): for a given , let
It is called the improvement function of Problem (1). A point solves Problem (1) if and only if it is an optimal solution of the unconstrained optimization problem ; see Theorem 2.1 in the next section. As a theoretical tool in the convergence analysis of bundle methods, the use of the improvement function dates back to [11]. However, that method requires the generated sequence of descent steps to be feasible. Piecewise linear models of the improvement function have been used in feasible direction methods for smooth problems [15], but these methods still require the descent steps to be feasible, and the improvement function itself does not appear in the algorithm. Infeasible bundle methods are less common; among them are [6] and the "stage I–stage II" improvements of earlier feasible constrained bundle methods [3]. In these methods it is necessary to predict values of the objective function and of certain parameters, which is obviously impractical for general functions. In this paper, the infeasible bundle method of Sagastizábal and Solodov [16] and the Moreau-Yosida regularization idea [13] are employed; here the objective function changes whenever a descent step is generated, but the adjusted model is still an upper and lower approximation of the updated objective function.
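A common form of the improvement function, used in [16] for the problem min f(x) subject to c(x) <= 0, is H_y(x) = max{f(x) - f(y), c(x)}. The following minimal sketch assumes this form; the concrete f and c below are illustrative stand-ins, not the functions of Problem (1).

```python
import numpy as np

# Illustrative stand-ins for the objective f and the constraint c(x) <= 0 of
# Problem (1); both are convex and nonsmooth.
def f(x):
    return abs(x[0]) + x[1] ** 2

def f_subgrad(x):
    return np.array([np.sign(x[0]), 2.0 * x[1]])      # a valid subgradient of f

def c(x):
    return float(np.max(np.abs(x)) - 1.0)              # feasible iff c(x) <= 0

def c_subgrad(x):
    g = np.zeros_like(x)
    i = int(np.argmax(np.abs(x)))
    g[i] = np.sign(x[i])
    return g

def improvement(x, y):
    """H_y(x) = max{ f(x) - f(y), c(x) }, the form used in [16]."""
    return max(f(x) - f(y), c(x))

def improvement_subgrad(x, y):
    """One subgradient of H_y at x: a subgradient of whichever branch is active."""
    return f_subgrad(x) if f(x) - f(y) >= c(x) else c_subgrad(x)

if __name__ == "__main__":
    y = np.array([0.5, 0.0])        # current (possibly infeasible) point
    x = np.array([2.0, 1.0])
    print(improvement(x, y), improvement_subgrad(x, y))
```

In the standard formulation of [16], a point solves the constrained problem precisely when it minimizes its own improvement function (cf. Theorem 2.1), which is what the proposed algorithm exploits.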
The rest of the paper is structured as follows. Section 2 performs the problem transformation and gives the upper and lower approximation models of the objective function, constructed using the bundle idea, together with their related properties. Section 3 gives the specific infeasible quasi-Newton bundle method for solving the convex constrained optimization problem. Section 4 presents the global convergence analysis of the algorithm. Section 5 gives the convergence results for the BFGS bundle method. The last section contains the conclusions.
Throughout the paper, the following standard notation is used: ⟨·, ·⟩ denotes the inner product in Euclidean space, with corresponding norm ‖·‖; for a subset of Euclidean space, conv denotes its convex hull; and ∂ denotes the subdifferential of a convex function at a given point.
2 Bundle method
Theorem 2.1 [6] Assume that Problem (1) satisfies the Slater constraint qualification. Then the following statements are equivalent:
(1) is the solution of Problem (1);
(2) ;
(3)
In view of the conclusion of Theorem 2.1, for a given , consider the following problem:
According to the Moreau-Yosida regularization conclusion, the above problem is equivalent to
where
is a symmetric positive definite matrix, . Problem (4) is called the Moreau-Yosida regularization of associated with . It is well known that is a continuously differentiable convex function defined on , although may be nondifferentiable. The derivative of at a point is , where is the unique optimal solution of Problem (4). Further, is globally Lipschitz continuous with modulus . is a minimizer of if and only if and . Let . Then (4) is equivalent to
We now consider approximating by the bundle method. Assume that the information is already available in the bundle . The linearization errors of and are defined as and ; the bundle then satisfies .
From Lemma 2.1, we know that . Let . Then . Furthermore, define
Let be the optimal solution of (6). Let . Then . Let . It is an approximation of the real solution of (5). Then let . For the upper and lower approximation functions of , the following conclusion holds.
Lemma 2.2 :
(i) ;
(ii) .
Proof (i) Since for , we have . By the definition of and the fact that is the optimal solution of (6), we have . Thus, .
(ii) Apparently holds.□
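To make the proximal subproblem concrete, the following sketch approximates the proximal point of a convex function F with metric M by replacing F with a cutting-plane model built from a bundle of subgradients and linearization errors, and solves the resulting problem in epigraph form. The helper name prox_via_bundle and the generic SLSQP call are ours; a dedicated QP solver would normally be used instead.

```python
import numpy as np
from scipy.optimize import minimize

def prox_via_bundle(x, F_x, G, E, M):
    """Approximate prox point: minimize over z the cutting-plane model
        phi(z) = max_j { F_x - E[j] + <G[j], z - x> }
    plus the proximal term 0.5 * (z - x)^T M (z - x).  G is an (m, n) array of
    bundle subgradients and E the corresponding linearization errors at x.
    The problem is solved in epigraph form with variables (z, v), v >= phi(z)."""
    n = x.size
    w0 = np.concatenate([x, [F_x]])                    # start at (z, v) = (x, F(x))

    def objective(w):
        z, v = w[:n], w[n]
        d = z - x
        return v + 0.5 * d @ M @ d

    def cuts(w):                                        # require v >= each cut at z
        z, v = w[:n], w[n]
        return v - (F_x - E + G @ (z - x))

    res = minimize(objective, w0, method="SLSQP",
                   constraints=[{"type": "ineq", "fun": cuts}])
    z_bar, v_bar = res.x[:n], res.x[n]                  # candidate and model value
    return z_bar, v_bar

if __name__ == "__main__":
    # Tiny example: F(z) = |z_1| + |z_2|, current point x = (1, -2).
    x, F_x, M = np.array([1.0, -2.0]), 3.0, np.eye(2)
    G = np.array([[1.0, -1.0],                          # subgradient of F at x
                  [1.0,  1.0]])                         # subgradient of F at (0, 0)
    E = np.array([0.0, 4.0])                            # linearization errors at x
    print(prox_via_bundle(x, F_x, G, E, M))
```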
Let . Then . Let be a given positive number and be a given positive number during each inner loop iteration. If
then is regarded as an approximation of . If (7) does not hold, let and compute , and then . According to Lemma 2.1, the information of the new trial point is added to the bundle, a new approximate model of is constructed, (6) is solved again to obtain a new , and (7) is checked again. If condition (7) never holds, the following two lemmas can be obtained by arguments similar to those in [13].
Lemma 2.3 Assuming that is not an optimal solution for , and (7) is never satisfied in the sub-algorithm, then .
Lemma 2.4 Let . Then
The following lemma shows that each inner loop iteration must stop in a finite number of steps, thus ensuring the smooth execution of the entire algorithm.
Lemma 2.5 If is not an optimal solution of , then after a finite number of computational iterations of the subproblem (6), it is always possible to find a solution of the subproblem such that (7) holds.
Proof Suppose that the conclusion does not hold. Then information of new trial points is added to the bundle indefinitely, leading to . By Lemma 2.3, we have . By Lemma 2.4, . Since is not an optimal solution of , there exist a positive and a positive integer such that . This holds for all . By and the fact that (7) does not hold, there is , which contradicts . □
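Lemma 2.5 guarantees that the inner loop terminates. The sketch below shows this null-step loop generically; because the specific test (7) is not spelled out here, the implementable gap test "true function value minus model value at the candidate is small", in the spirit of [13], is used as a stand-in. The subproblem solver is passed in as an argument (for instance, the prox_via_bundle helper sketched after Lemma 2.2), and oracle(z) is assumed to return the function value and one subgradient at z; all names are ours.

```python
import numpy as np

def approximate_prox(x, oracle, solve_subproblem, M, tol=1e-6, max_cuts=100):
    """Null-step loop: enrich the cutting-plane model until the candidate
    point is an acceptable approximation of the proximal point.

    oracle(z)                          -> (F(z), one subgradient of F at z)
    solve_subproblem(x, F_x, G, E, M)  -> (z_bar, v_bar), e.g. prox_via_bundle
    The acceptance test below is a stand-in for condition (7)."""
    F_x, g_x = oracle(x)
    G, E = [g_x], [0.0]                       # initial bundle: the cut at x itself
    z_bar, F_bar = x, F_x
    for _ in range(max_cuts):
        z_bar, v_bar = solve_subproblem(x, F_x, np.asarray(G), np.asarray(E), M)
        F_bar, g_bar = oracle(z_bar)
        if F_bar - v_bar <= tol:              # model accurate enough: accept candidate
            break
        # Null step: append the new cut (cf. Lemma 2.1) and solve (6) again.
        e_new = F_x - (F_bar + g_bar @ (x - z_bar))   # linearization error at x
        G.append(g_bar)
        E.append(e_new)
    return z_bar, F_bar
```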
For the descent-step iteration point generated by the algorithm, the following lemma ensures that, after the objective function in (6) is updated from to , the new model is still a valid lower approximation of the new objective function, thus ensuring that the next iteration proceeds smoothly.
Lemma 2.6 [16] Assume that the algorithm produces descent step iteration point , its linearization error associated with is defined as follows:
Then , where , and is obtained by converting to the corresponding in the result of Lemma 2.1.
Step 1 (Initialization) Let be given constants and let be a sequence of positive numbers satisfying . Let be the initial estimate of the solution and a symmetric positive definite matrix. Let , and choose such that . Start the iterative process of the subproblem with , .
Step 2 (Compute the search direction) If , the algorithm stops and is an optimal solution. Otherwise, calculate
Step 3 (Line search) Starting from , let be the smallest non-negative integer such that
where satisfies
Set .
Step 4 (Correct the quasi-Newton matrix) Let . If , correct to so that is a symmetric positive definite matrix satisfying the quasi-Newton equation . Otherwise, set . Let , and go back to Step 2.
Note 1 Take note of the stopping criterion in Step 2. If , then we can infer that from , and from (7) we know that . Then we get and from (9). From Theorem 2.1, we know that is the optimal solution to the original Problem (1).
Note 2 In Step 3, the process of finding requires repeatedly solving the subproblem (6) in order to obtain a descent-step iteration point that satisfies conditions (11) and (12). During this process, the objective function of the subproblem (6) does not change (Step 1). Once is found, Step 3 ends and the next iteration is performed. At this point the objective function of (6) changes (the so-called descent step), and the original approximate model of becomes an approximate model of ; that is, not only does the approximate model change, but the objective function being approximated changes as well. According to Lemma 2.1 and Lemma 2.6, is still a valid approximation of , which ensures that the algorithm iterates smoothly. This is different from previous bundle methods, in which the objective function of the bundle subproblem is fixed and does not change with the generation of descent steps.
Note 3 As the iteration proceeds, the number of elements in the bundle increases, and it is necessary to selectively delete and compress the bundle when it becomes large. Let the optimal solution of the dual problem of Problem (6) be , with . It is well known that the optimal solution of Problem (6) can be expressed in terms of the optimal solution of its dual problem. However, in this representation, only the subgradients corresponding to take effect, and the same is true for the linearization errors . Thus, when , we perform the following selection, deletion and aggregation techniques, where denotes the upper limit on the number of elements that can be stored in the bundle:
— Select the point pair corresponding to , which can be removed directly from the bundle.
— If there are still too many remaining point pairs corresponding to , their information is compressed into a single point pair in the form of a convex combination (with as the convex combination coefficients). A new affine function is constructed from this aggregated pair and inserted into the model. After that, the point pairs corresponding to can be selected arbitrarily until the number of elements in the bundle is less than . This technique both preserves the information of the point pairs with positive multipliers and ensures that the number of elements in the bundle stays within the specified range (a simplified sketch of this procedure is given below).
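A simplified sketch of this selection-and-aggregation rule, assuming the dual multipliers of subproblem (6) are available from the QP solver: cuts with zero multiplier are dropped, and if the bundle is still too large the remaining active cuts are merged into a single aggregate cut via the convex combination with weights given by the multipliers. The function name and the size threshold are ours.

```python
import numpy as np

def compress_bundle(G, E, alpha, max_size, zero_tol=1e-12):
    """Selection and aggregation of the bundle (G: subgradients, E: linearization
    errors) driven by the dual multipliers alpha of subproblem (6)."""
    G = np.asarray(G, float)
    E = np.asarray(E, float)
    alpha = np.asarray(alpha, float)
    if len(E) <= max_size:
        return G, E
    # 1. Drop cuts whose multiplier is (numerically) zero: they are inactive.
    keep = alpha > zero_tol
    G, E, alpha = G[keep], E[keep], alpha[keep]
    if len(E) <= max_size:
        return G, E
    # 2. Aggregate all remaining active cuts into one cut via the convex
    #    combination with weights alpha / sum(alpha); this preserves their
    #    information in a single affine function.
    w = alpha / alpha.sum()
    g_agg = w @ G
    e_agg = float(w @ E)
    return g_agg[None, :], np.array([e_agg])
```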
4 Global convergence analysis
The following theorem justifies the existence of in each iteration of the algorithm.
Theorem 4.1 If is not the minimal value point of , then there exists such that
holds for all , where satisfies
Proof Since is not a minimal point of , there exists a positive number such that is not a minimal point of either. By Lemma 2.5, for each , there is always a such that (14) holds. If , then by Lemma 2.2 we have , so . Since is not an optimal solution, (10) implies that . Also, since is continuously differentiable and , there is a positive number such that . Meanwhile, we have , which implies that (13) holds. If , then there also exists a (sufficiently small) such that , and thus (13) also holds. □
Since it is assumed that , there exists a constant such that
Introducing the notation , if we assume that both and are lower bounded functions, then is bounded.
Lemma 4.1 For all , we have
Proof According to the algorithm criterion, for all , there is
Thus, for all , we have . Since , it follows that . The conclusion is proved. □
The global convergence results of the algorithm are given below.
Theorem 4.2 Assume that and are bounded from below and that there exist two positive constants and such that and hold for all . Then every cluster point of the sequence generated by the algorithm is a minimal point of Problem (1).
Proof By Lemma 4.1 and the assumption that and are lower bounded, combined with (15) and (16), there exists such that
Since , according to Lemma 2.1 and the algorithm criterion, we have and . Thus, according to the assumptions on ,
Let be any cluster point of , and let be a subsequence converging to . According to Lemma 2.3,
If , then according to (17) and (18), we have , which means that is the optimal solution to the original problem. Otherwise, . Assuming that , according to the line search criterion and Lemma 2.1, we have
Then, by (18), is bounded; together with the assumption that is bounded, this implies that is bounded. Passing to a further subsequence if necessary, we may assume . Since , we get . By and the assumption on , , which implies that as well as ; we finally get . □
Note 4 We will discuss the assumption of boundedness of and in Theorem 4.2 in the next section.
5 Convergence analysis of infeasible BFGS bundle methods
This section mainly considers the convergence of the infeasible BFGS bundle method.
The infeasible BFGS bundle method is obtained by taking in the initial step of Algorithm 3.1 and letting the positive sequence satisfy . The choice of is specified in Step 4, and the resulting infeasible BFGS bundle method is as follows.
Step 4 (Correction of the quasi-Newton matrix) Let . If there exist two constants such that the iteration of the algorithm satisfies the following two conditions:
then let ; otherwise, set . Let , and go back to Step 2.
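Step 4' relies on the standard BFGS correction. Below is a minimal sketch of that formula, which produces a symmetric positive definite matrix satisfying the secant (quasi-Newton) equation B_new s = u whenever the curvature condition u^T s > 0 holds; the curvature safeguard is used here only as a stand-in for the conditions (20)-(21) of the algorithm, and the function name is ours.

```python
import numpy as np

def bfgs_update(B, s, u, curvature_tol=1e-12):
    """Standard BFGS correction of a symmetric positive definite matrix B.

    s : step between consecutive iterates
    u : difference of (sub)gradient-type vectors used in the secant equation
    Returns B_new with B_new @ s = u when u^T s > 0; otherwise B is kept
    unchanged (a stand-in for the elided conditions (20)-(21))."""
    s, u = np.asarray(s, float), np.asarray(u, float)
    su = u @ s
    if su <= curvature_tol:                 # curvature condition fails: skip update
        return B
    Bs = B @ s
    return B - np.outer(Bs, Bs) / (s @ Bs) + np.outer(u, u) / su

if __name__ == "__main__":
    B = np.eye(2)
    s = np.array([1.0, 0.5])
    u = np.array([2.0, 1.5])
    B1 = bfgs_update(B, s, u)
    print(np.allclose(B1 @ s, u))           # secant equation holds: True
```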
From Step 4, if (20) holds, then ; if is a positive definite matrix, then is also a positive definite matrix and satisfies the quasi-Newton equation
As is well known, if is strongly convex on , then is also strongly convex on [10]. Now assume that is strongly convex on . Then has a unique minimizer on . Introduce the label or does not hold for . The following theorem shows that the boundedness assumptions on and in Theorem 4.2 hold under certain conditions.
Theorem 5.1 Suppose that is strongly convex on . Let the sequences and be generated by Algorithm 5.1, and suppose that and satisfy
where is some symmetric positive definite matrix and is a sequence of positive numbers satisfying . Then and are bounded.
Proof If (20) and (21) hold, we have
Rearranging, we get
In addition,
Rearranging again, we get
For , we have . Here is obtained by the BFGS correction formula. Assume that has infinitely many elements. For , where for a certain , conditions (20) and (21) hold, i.e.,
Then, by Lemma 2.4, (20) and (21), for all , we have
Then for all , where satisfies , we have
By the assumption , we have . Then, according to [2, Theorem 3.2], it follows that for all satisfying , and are bounded in terms of . Since for all , the sequences and are bounded. When the number of elements in is finite, a similar argument shows that and are bounded. □
The following result shows that, without the boundedness assumptions on and , it is still possible to prove that the iterative sequence generated by the algorithm converges to the optimal solution of Problem (1) for some special problems.
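For reference, a standard formulation of the R-linear convergence asserted in Theorem 5.2 below (our wording) is the following.

```latex
% R-linear convergence of the iterates $\{x_k\}$ to $x^*$: there exist
% constants $C > 0$ and $q \in (0,1)$ such that
\[
    \| x_k - x^* \| \le C\, q^{k} \qquad \text{for all } k,
\]
% i.e., the errors are bounded by a geometrically decreasing sequence
% (as opposed to Q-linear convergence, which bounds the ratio of
% consecutive errors).
```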
Theorem 5.2 Suppose is strongly convex on , and the sequence is generated by Algorithm 5.1 with , . Then converges R-linearly to the unique solution of Problem (1); furthermore,
Proof We distinguish the cases in which is a finite set and an infinite set. In either case, we can use the strong convexity of , the global Lipschitz continuity of with modulus , the positive definite correction criterion for , Lemma 2.2, and to obtain that there exist a constant and a positive integer such that for all , there is
By the strong convexity of and [1, Lemma 4.3], for all , there exists a constant such that
So, there is
The conclusion is proved. □
6 Conclusions
In this paper, we combine the infeasible bundle method of Sagastizábal and Solodov with the Moreau-Yosida regularization idea to construct a quasi-Newton bundle method for solving convex constrained optimization problems. The global convergence of the algorithm is established without requiring every iteration point to be feasible. Although the method is essentially similar to an unconstrained bundle method, it differs from previous methods in that the objective function of the subproblem solved at each iteration changes with the generation of descent steps, which requires some necessary adjustments to ensure the convergence of the algorithm. For the special case of the BFGS bundle method, we briefly discuss conditions that guarantee the boundedness of the quasi-Newton matrices.
[1]
Bonnans J F, Gilbert J Ch, Lemaréchal C, Sagastizábal C A. A family of variable metric proximal methods. Math Program 1995; 68(1/2/3): 15–47
[2]
Byrd R H, Nocedal J. A tool for the analysis of quasi-Newton methods with application to unconstrained minimization. SIAM J Numer Anal 1989; 26(3): 727–739
[3]
Fletcher R, Leyffer S. A bundle filter method for nonsmooth nonlinear optimization. Numerical Analysis Report NA/195, Dundee: Department of Mathematics, The University of Dundee, 1999
[4]
Frangioni A. Generalized bundle methods. SIAM J Optim 2002; 13(1): 117–156
[5]
Hare W, Sagastizábal C A. A redistributed proximal bundle method for nonconvex optimization. SIAM J Optim 2010; 20(5): 2442–2473
[6]
Kiwiel K C. Methods of Descent for Nondifferentiable Optimization. Lecture Notes in Math, Vol 1133. Berlin: Springer-Verlag, 1985
[7]
Kiwiel K C. Exact penalty functions in proximal bundle methods for constrained convex nondifferentiable minimization. Math Program 1991; 52(1/2/3): 285–302
[8]
Kiwiel K C. A proximal bundle method with approximate subgradient linearizations. SIAM J Optim 2006; 16(4): 1007–1023
[9]
Kiwiel K C. A proximal-projection bundle method for Lagrangian relaxation, including semidefinite programming. SIAM J Optim 2006; 17(4): 1015–1034
[10]
Lemaréchal C, Sagastizábal C A. Practical aspects of the Moreau-Yosida regularization: theoretical preliminaries. SIAM J Optim 1997; 7(2): 367–385
[11]
Mifflin R. An algorithm for constrained optimization with semismooth functions. Math Oper Res 1977; 2(2): 191–207
[12]
Mifflin R. A modification and an extension of Lemarechal's algorithm for nonsmooth minimization. In: Nondifferential and Variational Techniques in Optimization, Math Program Stud, Vol 17. Berlin: Springer-Verlag, 1982, 77–90
[13]
Mifflin R, Sun D F, Qi L Q. Quasi-Newton bundle type methods for nondifferentiable convex optimization. SIAM J Optim 1998; 8(2): 583–603
[14]
Oliveira W, Solodov M. A doubly stabilized bundle method for nonsmooth convex optimization. 2013
[15]
Pironneau O, Polak E. Rate of convergence of a class of methods of feasible directions. SIAM J Numer Anal 1973; 10(1): 161–174
[16]
Sagastizábal C A, Solodov M. An infeasible bundle method for nonsmooth convex constrained optimization without a penalty function or a filter. SIAM J Optim 2005; 16(1): 146–169