Convergence analysis of an infeasible quasi-Newton bundle method for nonsmooth convex programming

Jie SHEN , Fangfang GUO , Liping PANG

Front. Math. China, 2023, 18(5): 367–380. DOI: 10.3868/s140-DDD-023-0026-x

RESEARCH ARTICLE

Abstract

By utilizing the improvement function, we transform the nonsmooth convex constrained optimization problem into an unconstrained one and construct an infeasible quasi-Newton bundle method of proximal form. It should be noted that the objective function minimized in the unconstrained optimization subproblem may vary along the iterations (it does not change if a null step is made; otherwise it is updated to a new function), so some adjustments are necessary in order to obtain the convergence result. We employ the main idea of the infeasible bundle method of Sagastizábal and Solodov, and, under the circumstance that each iteration point may be infeasible for the primal problem, we prove that every cluster point of the sequence generated by the proposed algorithm is an optimal solution of the original problem. Furthermore, for the BFGS quasi-Newton algorithm with a strongly convex objective function, we obtain conditions which guarantee the boundedness of the quasi-Newton matrices and the R-linear convergence of the iteration points.

Keywords

Non-smooth optimization / convex constraint / improvement function / bundle method / quasi-Newton direction


1 Introduction

In the field of operations research, it is often necessary to minimize a function that has no derivative at some points; such problems are called non-differentiable optimization (NDO) or nonsmooth optimization (NSO) problems. Many practical problems are NSO problems. Let the nonnegative variable $x$ represent income. In many economic systems, income $x$ is subject to a tax $T(x)$, and the tax function usually has discontinuous derivatives. Given thresholds $0=a_0<a_1<\cdots<a_m=+\infty$ and tax rates $0=r_0,r_1,\ldots,r_{m-1}$, $T$ is given in the following form: $T(x)=T_i+r_ix$, $a_i\le x<a_{i+1}$, $i=0,1,\ldots,m-1$, where $T_0=0$, $T_i=T_{i-1}+a_i(r_{i-1}-r_i)$. It is easy to verify that if $|r_i|<1$ and $r_{i+1}-r_i\ge 0$, then $T$ is a continuous, increasing, convex function. Further, it is easy to verify that $T$ has the following form: $T(x)=\max\{T_i+r_ix\mid i=0,1,\ldots,m-1\}$. We usually want to minimize the tax paid, and thus the following optimization problem arises: $\min\{T(x)\mid x\ge 0\}$, where $T(x)$ is a convex function and the constraint set $\{x\in\mathbb{R}\mid x\ge 0\}$ is a convex set; this is a convex program, and its more general form is

$$\text{minimize } f(x)\quad\text{s.t. } c(x)\le 0,\tag{1}$$

where $f,c:\mathbb{R}^n\to\mathbb{R}$ are convex functions which are in general not differentiable. Suppose that the Slater constraint qualification holds for Problem (1), i.e., there exists $x\in\mathbb{R}^n$ such that $c(x)<0$. We also assume that there exists an oracle that, for any given $x\in\mathbb{R}^n$, can compute the function values $f(x),c(x)$ and one subgradient of each function, i.e., $g_f(x)\in\partial f(x)$ and $g_c(x)\in\partial c(x)$.
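As a concrete illustration of this setting (ours, not part of the original paper), the sketch below implements such an oracle for the introductory tax example, taking $f=T$ written as a finite maximum of affine functions and $c(x)=-x$ to encode $x\ge 0$; the numerical thresholds and rates are hypothetical, and a subgradient of a finite max is read off from any active piece.

```python
import numpy as np

# Hypothetical thresholds 0 = a_0 < a_1 < ... and rates 0 = r_0 <= r_1 <= ... for the tax example.
a = np.array([0.0, 1.0, 3.0, 6.0])
r = np.array([0.0, 0.1, 0.2, 0.3])
T = np.zeros_like(r)
for i in range(1, len(r)):
    T[i] = T[i - 1] + a[i] * (r[i - 1] - r[i])      # T_i = T_{i-1} + a_i (r_{i-1} - r_i)

def oracle_f(x):
    """Return f(x) = T(x) = max_i {T_i + r_i x} and one subgradient (slope of an active piece)."""
    vals = T + r * x
    i = int(np.argmax(vals))
    return vals[i], r[i]

def oracle_c(x):
    """Constraint c(x) = -x <= 0 encodes x >= 0; a subgradient is -1."""
    return -x, -1.0

if __name__ == "__main__":
    for x in (0.5, 2.0, 5.0):
        print(x, oracle_f(x), oracle_c(x))
```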

In general, NSO problems are difficult to solve. Among the many solution methods, the best known are the subgradient method, the cutting plane method, the analytic center cutting plane method, and the bundle method; the latter two are currently considered to be the most stable and reliable methods for solving NSO problems. NSO problems with constraints are more complicated [4]. For general nonsmooth convex constrained problems, one approach is to solve an equivalent unconstrained problem built with an exact penalty function [7]. However, the use of penalty functions inevitably involves the choice of penalty parameters, and if a large parameter value is required to guarantee the exactness of a given penalty function, the numerical computation becomes more difficult. There are also bundle methods that do not use the idea of penalty functions [12]. However, all descent steps in these methods, including the initial point, are required to be feasible, and we know that meeting this requirement can be as difficult as solving Problem (1) itself; as a result, the overall computational burden increases substantially. Other work on bundle methods for solving constrained optimization problems can be found in [5, 8, 9, 14]. Among them, [14] gives a doubly stabilized bundle method obtained by combining the proximal bundle method and the level bundle method, which inherits the advantages of both, and also proposes an effective stopping criterion. For the case where the objective function is nonconvex, [5] constructs a redistributed proximal bundle method. Using only approximate information of the objective and constraint functions, [8] gives a proximal bundle method in which the error between the approximate and true values must be bounded but may be unknown, and which finally obtains an approximate optimal solution of the problem.

We give the improvement function related to Problem (1): for a given $x\in\mathbb{R}^n$, let

$$h_x(y)=\max\{f(y)-f(x),\,c(y)\},\quad y\in\mathbb{R}^n.$$

It is called the improvement function of Problem (1). A point $\bar x$ is a solution of Problem (1) if and only if $\bar x$ is an optimal solution of the unconstrained optimization problem $\min h_{\bar x}(\cdot)$; see Theorem 2.1 in the next section. As a theoretical tool in the convergence analysis of bundle methods, the use of $h_{\bar x}(\cdot)$ dates back to [11]. However, that method requires the generated sequence of descent steps to be feasible. The piecewise linear model of the improvement function has also been used in feasible direction methods for solving smooth problems [15], but these methods still require the descent steps to be feasible, and the improvement function itself is not used in the algorithm. Infeasible bundle methods are less common; examples include [6], the "stage I–stage II" improvements of feasible methods in earlier constrained bundle methods, and [3]. In these methods it is necessary to estimate the values of the objective function and of certain parameters in advance, which is obviously not practicable for general functions. In this paper, the infeasible bundle method of Sagastizábal and Solodov [16] and the Moreau-Yosida regularization idea [13] are employed; here the objective function changes with the generation of descent steps, but the adjusted model is still an upper and lower approximation of the updated objective function.
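For illustration only (our sketch, with made-up data), $h_x(y)$ and one of its subgradients can be computed directly from the oracle by the usual max-rule: whichever of $f(\cdot)-f(x)$ and $c(\cdot)$ attains the maximum at $y$ supplies the subgradient.

```python
import numpy as np

# Hypothetical oracles: f(y) = |y_1| + |y_2|, c(y) = y_1 + y_2 - 1.
def oracle_f(y):
    return float(np.sum(np.abs(y))), np.sign(y)            # value and one subgradient

def oracle_c(y):
    return float(y[0] + y[1] - 1.0), np.array([1.0, 1.0])

def improvement(x, y):
    """h_x(y) = max{f(y) - f(x), c(y)} together with a subgradient chosen by the max-rule."""
    fx, _ = oracle_f(x)
    fy, gf = oracle_f(y)
    cy, gc = oracle_c(y)
    return (fy - fx, gf) if fy - fx >= cy else (cy, gc)

if __name__ == "__main__":
    print(improvement(np.array([0.8, 0.5]), np.array([0.2, 0.1])))
```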

The paper is organized as follows: Section 2 performs the problem transformation and gives the upper and lower approximation models of the objective function, constructed with the bundle idea, together with their related properties. Section 3 presents the specific infeasible quasi-Newton bundle method for solving the convex constrained optimization problem. Section 4 presents the global convergence analysis of the given algorithm. Section 5 gives the convergence results of the BFGS bundle method. The last section is the conclusion.

Throughout the paper, the following standard notation is used: $\langle x,y\rangle=\sum_{i=1}^{n}x_iy_i$ denotes the inner product in Euclidean space, with the corresponding norm denoted by $\|\cdot\|$; $x_+=\max\{x,0\}$; for a subset $X$ of $\mathbb{R}^n$, $\operatorname{conv}X$ represents its convex hull; and $\partial h(x)$ represents the subdifferential of the convex function $h$ at the point $x$.

2 Bundle method

Theorem 2.1 [6]  Assume that Problem (1) satisfies the Slater constraint qualification. Then the following statements are equivalent:

(1) $\bar x$ is a solution of Problem (1);

(2) $\min\{h_{\bar x}(y)\mid y\in\mathbb{R}^n\}=h_{\bar x}(\bar x)=0$;

(3) $0\in\partial h_{\bar x}(\bar x)$.

In view of the conclusion of Theorem 2.1, for a given $x\in\mathbb{R}^n$, consider the following problem:

$$\min\{h_x(y)\mid y\in\mathbb{R}^n\}.$$

According to the theory of Moreau-Yosida regularization, the above problem is equivalent to

$$\min\{H_x^M(x)\mid x\in\mathbb{R}^n\},$$

where

$$H_x^M(x)=\min\Big\{h_x(y)+\frac12\|y-x\|_M^2\ \Big|\ y\in\mathbb{R}^n\Big\},\tag{4}$$

$M$ is an $n\times n$ symmetric positive definite matrix and $\|\cdot\|_M=\sqrt{\langle\cdot,M\cdot\rangle}$. Problem (4) is called the Moreau-Yosida regularization of $h_x(y)$ associated with $M$. It is well known that $H^M(\cdot)$ is a continuously differentiable convex function defined on $\mathbb{R}^n$, although $h_x(\cdot)$ may be non-differentiable. The derivative of $H^M(\cdot)$ at the point $x$ is $G_x^M(x):=\nabla H_x^M(x)=M(x-p(x))\in\partial h_x(p(x))$, where $p(x)$ is the unique optimal solution of Problem (4). Further, $G_x^M(x)$ is globally Lipschitz continuous with modulus $\|M\|$, and $\bar x$ is a minimal point of $h_x(\cdot)$ if and only if $G_x^M(\bar x)=0$ and $\bar x=p(\bar x)$. Let $y=x+d$. Then (4) is equivalent to

$$H_x^M(x)=\min\Big\{h_x(x+d)+\frac12\|d\|_M^2\ \Big|\ d\in\mathbb{R}^n\Big\}.\tag{5}$$
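To make the regularization tangible, the sketch below (our own illustration with an arbitrary piecewise-affine $h$ and diagonal $M$, not the paper's procedure) evaluates $H^M$, $p(x)$ and $G^M(x)=M(x-p(x))$ by numerically minimizing the $d$-form (5) with a derivative-free SciPy routine; in the algorithm itself this exact computation is never carried out, and only the bundle approximation described next is used.

```python
import numpy as np
from scipy.optimize import minimize

# Illustrative nonsmooth convex h(y) = max_i {<G_i, y> + b_i} and an SPD matrix M.
Gm = np.array([[1.0, 0.0], [-1.0, 2.0], [0.5, -1.0]])
b = np.array([0.0, -1.0, 0.5])
h = lambda y: float(np.max(Gm @ y + b))
M = np.diag([2.0, 0.5])

def moreau_yosida(x):
    """Return (H^M(x), p(x), G^M(x)) by minimizing h(x+d) + 0.5 * d^T M d over d."""
    obj = lambda d: h(x + d) + 0.5 * d @ M @ d
    res = minimize(obj, np.zeros_like(x), method="Nelder-Mead",
                   options={"xatol": 1e-10, "fatol": 1e-12, "maxiter": 20000})
    d = res.x
    return res.fun, x + d, -M @ d          # G^M(x) = M(x - p(x)) = -M d

if __name__ == "__main__":
    print(moreau_yosida(np.array([1.0, 1.0])))
```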

The following considers the approximation of $h_x(x+d)$ by the bundle method. Assume that the following information is already available in the bundle: $e_f^i$, $e_c^i$, $g_f^i\in\partial_{e_f^i}f(x)$, $g_c^i\in\partial_{e_c^i}c(x)$. The linearization errors $e_f^i$ and $e_c^i$ are defined as $e_f^i=f(x)-f(y^i)-\langle g_f^i,x-y^i\rangle$ and $e_c^i=c(x)-c(y^i)-\langle g_c^i,x-y^i\rangle$, so that the bundle satisfies $\mathcal{B}^{\mathrm{oracle}}\supseteq\bigcup_{i<l}\{(f^i,c^i,e_f^i,e_c^i,g_f^i,g_c^i)\}$, where $f^i=f(y^i)$ and $c^i=c(y^i)$.

Lemma 2.1 [16]  For each $i\in\mathcal{B}^{\mathrm{oracle}}$, define

$$e^i:=e_f^i+c_+(x),\quad g_{h_x}^i:=g_f^i,\qquad \text{if } f(y^i)-f(x)\ge c(y^i),$$
$$e^i:=e_c^i+c_+(x)-c(x),\quad g_{h_x}^i:=g_c^i,\qquad \text{if } f(y^i)-f(x)<c(y^i).$$

Then $e^i\ge 0$ and $g_{h_x}^i\in\partial_{e^i}h_x(x)$.
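In implementation terms, Lemma 2.1 is the bookkeeping that turns the raw bundle data recorded at a trial point $y^i$ into an approximate subgradient of $h_x$ at the center $x$; a minimal sketch (our own, with hypothetical argument names) is:

```python
def to_improvement_pair(f_yi, c_yi, f_x, c_x, e_f, e_c, g_f, g_c):
    """Lemma 2.1: return (e_i, g_i) with g_i in the e_i-subdifferential of h_x at x."""
    c_plus = max(c_x, 0.0)
    if f_yi - f_x >= c_yi:                       # f-branch active at the trial point
        return e_f + c_plus, g_f
    return e_c + c_plus - c_x, g_c               # c-branch active at the trial point
```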

From Lemma 2.1, we know that $h_x(y)\ge h_x(x)+\langle g_{h_x}^i,y-x\rangle-e^i$ for all $y\in\mathbb{R}^n$. Let $y=x+d$. Then, since $h_x(x)=c_+(x)$, $h_x(x+d)\ge c_+(x)+\max\{\langle g_{h_x}^i,d\rangle-e^i\mid i\in\mathcal{B}^{\mathrm{oracle}}\}=:\check h_x(x+d)$. Furthermore, define

$$\check H_x^M(x)=\min\Big\{\check h_x(x+d)+\frac12\|d\|_M^2\ \Big|\ d\in\mathbb{R}^n\Big\}=c_+(x)+\min\Big\{\max_{i\in\mathcal{B}^{\mathrm{oracle}}}\big\{-e^i+\langle g_{h_x}^i,d\rangle\big\}+\frac12 d^TMd\ \Big|\ d\in\mathbb{R}^n\Big\}.\tag{6}$$

Let $d(x)$ be the optimal solution of (6) and set $v(x)=\max\{-e^i+\langle g_{h_x}^i,d(x)\rangle\mid i\in\mathcal{B}^{\mathrm{oracle}}\}$. Then $\check H_x^M(x)=c_+(x)+v(x)+\frac12 d(x)^TMd(x)$. Let $a(x)=x+d(x)$; it is an approximation of the true solution $p(x)$ of (5). Then let $\hat H_x^M(x)=h_x(a(x))+\frac12 d(x)^TMd(x)$, $x\in\mathbb{R}^n$. For the upper and lower approximations of $H_x^M(x)$, the following conclusion holds.
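Subproblem (6) is a small quadratic program. One standard way to solve it (a sketch under our own choices, not necessarily the authors' implementation) is through its dual over the unit simplex: minimize $\frac12(\sum_i\alpha_ig_{h_x}^i)^TM^{-1}(\sum_i\alpha_ig_{h_x}^i)+\sum_i\alpha_ie^i$ subject to $\alpha_i\ge 0$, $\sum_i\alpha_i=1$, and then recover $d(x)=-M^{-1}\sum_i\alpha_ig_{h_x}^i$.

```python
import numpy as np
from scipy.optimize import minimize

def solve_bundle_subproblem(gs, es, M):
    """Solve min_d max_i{<g_i, d> - e_i} + 0.5 d^T M d through its dual over the simplex.

    gs: (m, n) array of aggregated subgradients g_{h_x}^i; es: (m,) errors e^i >= 0.
    Returns d(x), v(x) and the dual weights alpha (reused for bundle compression, see Note 3).
    """
    m = len(es)
    Minv = np.linalg.inv(M)

    def dual(alpha):
        gbar = gs.T @ alpha
        return 0.5 * gbar @ Minv @ gbar + alpha @ es

    res = minimize(dual, np.full(m, 1.0 / m), method="SLSQP",
                   bounds=[(0.0, 1.0)] * m,
                   constraints=[{"type": "eq", "fun": lambda a: np.sum(a) - 1.0}])
    alpha = res.x
    d = -Minv @ (gs.T @ alpha)
    v = float(np.max(gs @ d - es))            # v(x) = max_i {<g_i, d(x)> - e^i}
    return d, v, alpha

if __name__ == "__main__":
    d, v, alpha = solve_bundle_subproblem(
        np.array([[1.0, 0.0], [-0.5, 1.0], [0.0, -1.0]]), np.array([0.0, 0.1, 0.2]), np.eye(2))
    print(d, v, alpha)
```

Given $d(x)$ and $v(x)$, the quantities $\check H_x^M(x)=c_+(x)+v(x)+\frac12 d(x)^TMd(x)$ and $\hat H_x^M(x)=h_x(x+d(x))+\frac12 d(x)^TMd(x)$ follow directly from the formulas above.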

Lemma 2.2  For all $x\in\mathbb{R}^n$:

(i) $\hat H_x^M(x)\ge H_x^M(x)\ge\check H_x^M(x)$;

(ii) $\hat H_x^M(x)=H_x^M(x)\iff a(x)=p(x)$.

Proof (i) Since $h_x(x+d)\ge\check h_x(x+d)$ for all $x,d\in\mathbb{R}^n$, we have $H_x^M(x)\ge\check H_x^M(x)$. By the definition of $a(x)$, $d(x)$ is a feasible point of (5), hence $\hat H_x^M(x)=h_x(a(x))+\frac12 d(x)^TMd(x)\ge H_x^M(x)$.

(ii) This follows directly from the definitions. □

Let $\varepsilon(x)=\hat H_x^M(x)-\check H_x^M(x)$. Then $\varepsilon(x)\ge 0$. Let $C$ be a given positive number and let $\delta(x)$ be a given positive number for each inner loop iteration. If

$$\varepsilon(x)\le\delta(x)\min\{d(x)^TMd(x),\,C\},\tag{7}$$

then $a(x)$ is accepted as an approximation of $p(x)$. If (7) does not hold, let $y^{i+1}=x+d(x)$, compute $e_f^{i+1},e_c^{i+1},f^{i+1},c^{i+1},g_f^{i+1},g_c^{i+1}$, and then $e^{i+1},g_{h_x}^{i+1}$ according to Lemma 2.1; add the information of the new trial point $y^{i+1}$ to the bundle, construct a new approximate model of $h_x(x+d)$, re-solve (6) to obtain new $d(x)$ and $\varepsilon(x)$, and check (7) again. Using arguments similar to those in [13], the following two lemmas can be obtained.
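The inner (null-step) loop just described can be sketched as follows; this is our condensed outline, which cuts directly on $h_x$ instead of keeping separate $f$- and $c$-bundles as the paper does, and it assumes the improvement-function oracle and the helper solve_bundle_subproblem from the earlier sketches.

```python
import numpy as np

def inner_loop(h_oracle, x, bundle, M, delta, C, max_iter=100):
    """Refine the cutting-plane model of h_x until acceptance test (7) holds.

    h_oracle(center, y) -> (h_center(y), subgradient);  bundle: list of (e_i, g_i) pairs.
    Returns (d, a, eps) with a = x + d accepted as an approximation of p(x).
    """
    h_x_x, _ = h_oracle(x, x)                            # h_x(x) = c_+(x)
    for _ in range(max_iter):
        gs = np.array([g for _, g in bundle])
        es = np.array([e for e, _ in bundle])
        d, v, _ = solve_bundle_subproblem(gs, es, M)     # from the previous sketch
        H_check = h_x_x + v + 0.5 * d @ M @ d
        h_a, g_a = h_oracle(x, x + d)
        H_hat = h_a + 0.5 * d @ M @ d
        eps = H_hat - H_check
        if eps <= delta * min(d @ M @ d, C):             # test (7): accept a(x) = x + d
            return d, x + d, eps
        # Null step: append the cut generated at the new trial point y^{i+1} = x + d.
        e_new = h_x_x - (h_a + g_a @ (x - (x + d)))      # linearization error at the center
        bundle.append((max(e_new, 0.0), g_a))
    return d, x + d, eps
```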

Lemma 2.3  Assume that $x$ is not an optimal solution of $h_x(\cdot)$ and that (7) is never satisfied in the sub-algorithm. Then $\varepsilon(x)\to 0$.

Lemma 2.4  Let $\tilde G_x^M(x)=M(x-a(x))=-Md(x)$. Then

$$\|G_x^M(x)-\tilde G_x^M(x)\|_{M^{-1}}=\|p(x)-a(x)\|_M\le\sqrt{2\varepsilon(x)},$$

$$\|G_x^M(x)-\tilde G_x^M(x)\|\le\sqrt{2\varepsilon(x)\|M\|}.$$

The following lemma shows that each inner loop iteration must stop in a finite number of steps, thus ensuring the smooth execution of the entire algorithm.

Lemma 2.5  If x is not an optimal solution of hx(), then after a finite number of computational iterations of the sub-problem (6), it is always possible to find a solution d(x) of the sub-problem such that (7) holds.

Proof  Assume that the conclusion does not hold. Then we keep adding the information of new trial points $y^{i+1}$ to the bundle, so that $i\to\infty$. By Lemma 2.3 we have $\varepsilon(x)\to 0$, and by Lemma 2.4, $\|G_x^M(x)-\tilde G_x^M(x)\|\to 0$. Since $x$ is not an optimal solution of $h_x(\cdot)$, there exist a positive number $\delta_0$ and a positive integer $i_0$ such that $\|\tilde G_x^M(x)\|\ge\delta_0$ for all $i\ge i_0$. From $d(x)^TMd(x)=(\tilde G_x^M(x))^TM^{-1}\tilde G_x^M(x)\ge\delta_0^2/\|M\|$ and the fact that (7) does not hold, we get $\varepsilon(x)>\delta(x)\min\{\delta_0^2/\|M\|,\,C\}$ for all $i\ge i_0$, which contradicts $\varepsilon(x)\to 0$. □

For the descent-step iteration point $x_{k+1}$ generated by the algorithm, the following lemma ensures that the new model $\check h_{x_{k+1}}(\cdot)$ is still a valid lower approximation of the new objective function $h_{x_{k+1}}(\cdot)$ after the objective function in (6) is updated from $h_{x_k}(\cdot)$ to $h_{x_{k+1}}(\cdot)$, thus ensuring that the next iteration proceeds smoothly.

Lemma 2.6 [16]  Assume that the algorithm produces a descent-step iteration point $x_{k+1}$, and define the linearization errors associated with $x_{k+1}$ as follows:

$$e_f^{i,k+1}=e_f^{i,k}+f(x_{k+1})-f(x_k)+\langle g_f^i,x_k-x_{k+1}\rangle,\qquad e_c^{i,k+1}=e_c^{i,k}+c(x_{k+1})-c(x_k)+\langle g_c^i,x_k-x_{k+1}\rangle.$$

Then $g_{h_{k+1}}^i\in\partial_{e^{i,k+1}}h_{x_{k+1}}(x_{k+1})$, where $e^{i,k+1}\ge 0$ and $g_{h_{k+1}}^i$ is obtained by replacing $x$ with $x_{k+1}$ in the construction of Lemma 2.1.
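After a descent step $x_k\to x_{k+1}$, Lemma 2.6 lets the stored linearization errors be shifted to the new center without re-calling the oracle at old trial points; a minimal sketch (illustrative names, numpy arrays assumed) is:

```python
def shift_errors(e_f, e_c, g_f, g_c, f_old, f_new, c_old, c_new, x_old, x_new):
    """Lemma 2.6: recenter the linearization errors from x_k to x_{k+1}."""
    e_f_new = e_f + (f_new - f_old) + g_f @ (x_old - x_new)
    e_c_new = e_c + (c_new - c_old) + g_c @ (x_old - x_new)
    return e_f_new, e_c_new
```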

3 Specific algorithms

Algorithm 3.1 Infeasible quasi-Newton bundle methods for convex constrained optimization problems.

Step 1 (Initial step)  Let $\sigma,\rho,C$ be given constants with $\sigma<\frac12$ and $\rho<1$, and let $\{\delta_k\}_{k=0}^{\infty}$ be a sequence of positive numbers satisfying $\sum_{k=0}^{\infty}\delta_k<+\infty$. Let $x_0$ be the initial estimate of the solution and $B_0$ an $n\times n$ symmetric positive definite matrix. Set $k=0$ and choose $d_0,\varepsilon_0$ such that $\varepsilon_0\le\delta_0\min\{(d_0)^TMd_0,\,C\}$. Start the iterative process of the sub-problem with $i=1$, $y^1=x_0$.

Step 2 (Compute the search direction)  If $\tilde G_{x_k}^M(x_k)=0$, the algorithm stops and $x_k$ is an optimal solution. Otherwise, compute

$$s_k=-B_k^{-1}\tilde G_{x_k}^M(x_k).\tag{10}$$

Step 3 (Line search)  Starting from $l=0$, let $j_k$ be the smallest non-negative integer $l$ such that

$$\check H_{x_k+\rho^ls_k}^M(x_k+\rho^ls_k)\le\hat H_{x_k}^M(x_k)+\sigma\rho^l(s_k)^T\tilde G_{x_k}^M(x_k),\tag{11}$$

where $\check H_{x_k+\rho^ls_k}^M(x_k+\rho^ls_k)$ satisfies

$$\hat H_{x_k+\rho^ls_k}^M(x_k+\rho^ls_k)-\check H_{x_k+\rho^ls_k}^M(x_k+\rho^ls_k)\le\delta_{k+1}\min\{(d(x_k+\rho^ls_k))^TMd(x_k+\rho^ls_k),\,C\}.\tag{12}$$

Set $\tau_k:=\rho^{j_k}$, $x_{k+1}:=x_k+\tau_ks_k$.

Step 4 (Correct the quasi-Newton matrix)  Let $\Delta x_k=x_{k+1}-x_k$, $\Delta y_k=\tilde G_{x_{k+1}}^M(x_{k+1})-\tilde G_{x_k}^M(x_k)$. If $(\Delta x_k)^T\Delta y_k>0$, correct $B_k$ to $B_{k+1}$ so that $B_{k+1}$ is a symmetric positive definite matrix satisfying the quasi-Newton equation $B_{k+1}\Delta x_k=\Delta y_k$; otherwise, set $B_{k+1}=M$. Let $k=k+1$ and go back to Step 2.
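A condensed skeleton of Algorithm 3.1 (our own sketch, reusing inner_loop from the sketch following (7); it restarts the bundle at every trial center instead of recentering it via Lemma 2.6, omits compression, and uses illustrative parameter values; deltas is assumed to be a summable positive sequence with at least max_outer + 1 entries) might look as follows.

```python
import numpy as np

def model_values(h_oracle, center, M, delta, C):
    """Run the inner loop at `center`; return (H_hat, H_check, d, G_tilde) there."""
    bundle = [(0.0, h_oracle(center, center)[1])]        # seed with an exact cut at the center
    d, a, eps = inner_loop(h_oracle, center, bundle, M, delta, C)
    H_hat = h_oracle(center, a)[0] + 0.5 * d @ M @ d
    return H_hat, H_hat - eps, d, -M @ d                 # ~G^M(center) = M(center - a) = -M d

def quasi_newton_bundle(h_oracle, x0, M, deltas, sigma=0.4, rho=0.5, C=1.0,
                        tol=1e-8, max_outer=50, max_ls=30):
    """Sketch of Algorithm 3.1: Steps 1-4 of the infeasible quasi-Newton bundle method."""
    x, B = np.asarray(x0, float), M.copy()
    for k in range(max_outer):
        H_hat, _, _, G_tilde = model_values(h_oracle, x, M, deltas[k], C)
        if np.linalg.norm(G_tilde) <= tol:               # Step 2: stopping criterion
            return x
        s = -np.linalg.solve(B, G_tilde)                 # Step 2: quasi-Newton direction (10)
        for l in range(max_ls):                          # Step 3: line search on test (11)
            t = rho ** l
            _, H_check_t, _, G_trial = model_values(h_oracle, x + t * s, M, deltas[k + 1], C)
            if H_check_t <= H_hat + sigma * t * (s @ G_tilde):
                break
        x_new = x + t * s
        dx, dy = x_new - x, G_trial - G_tilde            # Step 4: quasi-Newton data
        if dx @ dy > 0:                                  # e.g. a BFGS correction (Section 5)
            B = B - np.outer(B @ dx, B @ dx) / (dx @ B @ dx) + np.outer(dy, dy) / (dx @ dy)
        else:
            B = M.copy()
        x = x_new
    return x
```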

Note 1  Note the stopping criterion in Step 2. If $\tilde G_{x_k}^M(x_k)=0$, then $d(x_k)=0$ follows from $\tilde G_{x_k}^M(x_k)=-Md(x_k)$, and from (7) we know that $\varepsilon(x_k)=0$. Then, by Lemma 2.4, we get $G_{x_k}^M(x_k)=0$ and $x_k=p(x_k)$, hence $0\in\partial h_{x_k}(x_k)$. From Theorem 2.1, we know that $x_k$ is an optimal solution of the original Problem (1).

Note 2  In Step 3, the process of finding $j_k$ requires repeatedly solving the sub-problem (6) in order to find a descent-step iteration point $x_{k+1}$ satisfying conditions (11) and (12). During this process, the objective function of the sub-problem (6) does not change (Step 1). Once $j_k$ is found, Step 3 terminates and the next iteration is performed. At this point the objective function of (6) changes (this is the so-called descent step): the original approximate model $\check h_{x_k}(\cdot)$ of $h_{x_k}(\cdot)$ becomes the approximate model $\check h_{x_{k+1}}(\cdot)$ of $h_{x_{k+1}}(\cdot)$, i.e., not only the approximate model changes, but so does the objective function being approximated. According to Lemmas 2.1 and 2.6, $\check h_{x_{k+1}}(\cdot)$ is still a valid approximation of $h_{x_{k+1}}(\cdot)$, which ensures the smooth iteration of the algorithm. This differs from previous bundle methods, in which the objective function of the bundle sub-problem is fixed and does not change with the generation of descent steps.

Note 3  As the iteration proceeds, the number of elements in the bundle $\mathcal{B}^{\mathrm{oracle}}$ increases, and it is necessary to selectively delete and compress the bundle when it becomes large. Let the optimal solution of the dual problem of Problem (6) be $\alpha\in\mathbb{R}^{|\mathcal{B}^{\mathrm{oracle}}|}$ with $\alpha_i\ge 0$, $\sum_{i\in\mathcal{B}^{\mathrm{oracle}}}\alpha_i=1$. It is well known that the optimal solution of Problem (6) can be expressed in terms of the optimal solution $\alpha$ of its dual problem. In this representation, only the subgradients $g_{h_x}^i$ corresponding to $\alpha_i>0$ take effect, and the same is true for the linearization errors $e^i$. Thus, when $|\mathcal{B}^{\mathrm{oracle}}|=n_{\max}$, we perform the following selection-deletion and aggregation techniques, where $n_{\max}$ represents the upper limit of the number of elements that can be stored in the bundle:

— The pairs $(e^i,g_{h_x}^i)$ corresponding to $\alpha_i=0$ can be removed directly from the bundle.

— If there are still too many remaining pairs corresponding to $\alpha_i>0$, their information is compressed into a single pair formed as a convex combination (with the $\alpha_i$ as convex combination coefficients). A new affine function is constructed from this pair and inserted into the model; after that, pairs corresponding to $\alpha_i>0$ can be removed arbitrarily until the number of elements in the bundle $\mathcal{B}^{\mathrm{oracle}}$ is less than $n_{\max}$. This technique both preserves the information of the active pairs in the bundle and ensures that the number of elements in the bundle stays within the specified range; a sketch of this compression is given below.
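A minimal sketch of the selection-deletion and aggregation strategy of Note 3 (our own illustration): pairs with $\alpha_i=0$ are dropped, and the remaining active pairs are merged into one synthetic cut given by their convex combination.

```python
import numpy as np

def compress_bundle(bundle, alpha, n_max, tol=1e-12):
    """Keep the bundle within n_max entries using the dual weights alpha of subproblem (6).

    bundle: list of (e_i, g_i) pairs; alpha: dual weights with alpha_i >= 0 and sum 1.
    """
    if len(bundle) < n_max:
        return bundle
    active = [i for i, w in enumerate(alpha) if w > tol]     # drop pairs with alpha_i = 0
    kept = [bundle[i] for i in active]
    if len(kept) >= n_max:
        w = np.array([alpha[i] for i in active])
        w = w / w.sum()
        e_agg = sum(wi * bundle[i][0] for wi, i in zip(w, active))
        g_agg = sum(wi * bundle[i][1] for wi, i in zip(w, active))
        kept = [(e_agg, g_agg)]                              # one aggregated synthetic cut
    return kept
```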

4 Global convergence analysis

The following theorem provides a justification for the existence of $j_k$ in each iteration of the algorithm.

Theorem 4.1  If $x_k$ is not a minimal point of $h_{x_k}(\cdot)$, then there exists $\bar\tau_k>0$ such that

$$\check H_{x_k+\tau s_k}^M(x_k+\tau s_k)\le\hat H_{x_k}^M(x_k)+\sigma\tau(s_k)^T\tilde G_{x_k}^M(x_k)\tag{13}$$

holds for all $\tau\in(0,\bar\tau_k)$, where $\check H_{x_k+\tau s_k}^M(x_k+\tau s_k)$ satisfies

$$\hat H_{x_k+\tau s_k}^M(x_k+\tau s_k)-\check H_{x_k+\tau s_k}^M(x_k+\tau s_k)\le\delta_{k+1}\min\{(d(x_k+\tau s_k))^TMd(x_k+\tau s_k),\,C\}.\tag{14}$$

Proof  Since $x_k$ is not a minimal point of $h_{x_k}(\cdot)$, there exists a positive number $\tilde\tau_k>0$ such that for every $\tau\in(0,\tilde\tau_k]$, $x_k+\tau s_k$ is not a minimal point of $H^M(\cdot)$ either. By Lemma 2.5, for each $\tau\in(0,\tilde\tau_k]$ there is always a $d(x_k+\tau s_k)$ such that (14) holds. If $\hat H_{x_k}^M(x_k)=H_{x_k}^M(x_k)$, then by Lemma 2.2 we have $a(x_k)=p(x_k)$, so $\tilde G_{x_k}^M(x_k)=M(x_k-a(x_k))=M(x_k-p(x_k))=G_{x_k}^M(x_k)$. Since $x_k$ is not an optimal solution, (10) implies that $(s_k)^TG_{x_k}^M(x_k)<0$. Moreover, since $H^M(\cdot)$ is continuously differentiable and $\sigma<1$, there is a positive number $\bar\tau_k\le\tilde\tau_k$ such that for all $\tau\in(0,\bar\tau_k]$ we have $H_{x_k+\tau s_k}^M(x_k+\tau s_k)\le H_{x_k}^M(x_k)+\sigma\tau(s_k)^TG_{x_k}^M(x_k)$, which, together with Lemma 2.2(i), implies that (13) holds. If $\hat H_{x_k}^M(x_k)>H_{x_k}^M(x_k)$, then there also exists a sufficiently small $\bar\tau_k>0$ such that

$$\check H_{x_k+\tau s_k}^M(x_k+\tau s_k)\le H_{x_k+\tau s_k}^M(x_k+\tau s_k)\xrightarrow[\tau\to 0]{}H_{x_k}^M(x_k)<H_{x_k}^M(x_k)+\frac12\big(\hat H_{x_k}^M(x_k)-H_{x_k}^M(x_k)\big)\le\hat H_{x_k}^M(x_k)+\sigma\tau(s_k)^T\tilde G_{x_k}^M(x_k),\quad\tau\in(0,\bar\tau_k],$$

and thus (13) also holds. □

Since it is assumed that $\sum_{k=0}^{\infty}\delta_k<+\infty$, there exists a constant $A>0$ such that

$$\sum_{k=0}^{\infty}\delta_k\le A.\tag{15}$$

Introduce the notation $D_k:=\{x\in\mathbb{R}^n\mid H_{x_k}^M(x)\le H_{x_0}^M(x_0)+2AC\}$. If we assume that both $f$ and $c$ are bounded below, then $\{D_k\}_{k\ge 0}$ is bounded.

Lemma 4.1  For all $k\ge 0$, we have

$$H_{x_{k+1}}^M(x_{k+1})\le H_{x_k}^M(x_k)+C(\delta_k+\delta_{k+1}),\qquad x_k\in D_k.\tag{16}$$

Proof  According to the criteria of the algorithm, for all $k\ge 0$ we have

$$H_{x_{k+1}}^M(x_{k+1})\le\check H_{x_{k+1}}^M(x_{k+1})+C\delta_{k+1}\le\hat H_{x_k}^M(x_k)+\sigma\rho^{j_k}(s_k)^T\tilde G_{x_k}^M(x_k)+C\delta_{k+1}=\hat H_{x_k}^M(x_k)-\sigma\rho^{j_k}(\tilde G_{x_k}^M(x_k))^TB_k^{-1}\tilde G_{x_k}^M(x_k)+C\delta_{k+1}\le\hat H_{x_k}^M(x_k)+C\delta_{k+1}\le\check H_{x_k}^M(x_k)+C\delta_k+C\delta_{k+1}.$$

Since $\check H_{x_k}^M(x_k)\le H_{x_k}^M(x_k)$, inequality (16) follows. Thus, for all $k\ge 0$, $H_{x_{k+1}}^M(x_{k+1})\le H_{x_0}^M(x_0)+2AC$. Since $x_0\in D_0$, it follows that $x_k\in D_k$ for all $k\ge 0$. The conclusion is confirmed. □

The global convergence results of the algorithm are given below.

Theorem 4.2  Assume that $f$ and $c$ are bounded below and that there exist two positive constants $a_1$ and $a_2$ such that $\|B_k\|\le a_1$ and $\|B_k^{-1}\|\le a_2$ for all $k\ge 0$. Then every cluster point of the sequence $\{x_k\}$ produced by the algorithm is a minimal point of Problem (1).

Proof  By Lemma 4.1 and the assumption that $f$ and $c$ are bounded below, combined with (15) and (16), there exists $H^M$ such that

$$\lim_{k\to\infty}H_{x_k}^M(x_k)=H^M.\tag{17}$$

Since $\delta_k\to 0$, according to Lemma 2.1 and the criteria of the algorithm, we have $\varepsilon_k\to 0$ and $\lim_{k\to\infty}\hat H_{x_k}^M(x_k)=\lim_{k\to\infty}\check H_{x_k}^M(x_k)=H^M$. Thus, by the assumptions on $B_k$,

$$\lim_{k\to\infty}\tau_k\|\tilde G_{x_k}^M(x_k)\|^2=0.\tag{18}$$

Let $\bar x$ be any cluster point of $\{x_k\}$ and let $\{x_k\}_{k\in K}$ be a sub-sequence converging to $\bar x$. By Lemma 2.4 and $\varepsilon_k\to 0$,

$$\lim_{k\to\infty,\,k\in K}\tilde G_{x_k}^M(x_k)=G_{\bar x}^M(\bar x).$$

If $\liminf_{k\to\infty,k\in K}\tau_k>0$, then according to (17) and (18) we have $G_{\bar x}^M(\bar x)=0$, which means that $\bar x$ is an optimal solution of the original problem. Otherwise, $\liminf_{k\to\infty,k\in K}\tau_k=0$; assume that $\tau_k\to 0$ ($k\to\infty$, $k\in K$). According to the line search criterion and Lemma 2.2, we have

$$\frac{H_{x_k+\rho^{j_k-1}s_k}^M(x_k+\rho^{j_k-1}s_k)-H_{x_k}^M(x_k)}{\rho^{j_k-1}}>\sigma(s_k)^T\tilde G_{x_k}^M(x_k).$$

Since $\{\tilde G_{x_k}^M(x_k)\}_{k\in K}$ converges, it is bounded; together with the boundedness assumption on $B_k^{-1}$, we get that $\{s_k\}_{k\in K}$ is bounded, so we may assume that $\lim_{k\in K,k\to\infty}s_k=\bar s$. Since $\rho^{j_k-1}\to 0$ as $k\to\infty$, $k\in K$, passing to the limit gives $\bar s^TG_{\bar x}^M(\bar x)\ge\sigma\bar s^TG_{\bar x}^M(\bar x)$. By $\sigma<\frac12$ this yields $\bar s^TG_{\bar x}^M(\bar x)\ge 0$; on the other hand, the assumption on $B_k$ gives $\bar s^TG_{\bar x}^M(\bar x)\le-\frac{1}{a_2}\|\bar s\|^2$. This implies $\bar s^TG_{\bar x}^M(\bar x)=0$ and $\bar s=0$, and since $\tilde G_{x_k}^M(x_k)=-B_ks_k$ with $\|B_k\|\le a_1$, we finally get $G_{\bar x}^M(\bar x)=0$. □

Note 4  We will discuss the boundedness assumption on $\{B_k\}$ and $\{B_k^{-1}\}$ in Theorem 4.2 in the next section.

5 Convergence analysis of infeasible BFGS bundle methods

This section mainly considers the convergence of the infeasible BFGS bundle method.

The infeasible BFGS bundle method is obtained by taking $B_0=M$ in the initial step of Algorithm 3.1 and requiring the positive sequence $\{\delta_k\}_{k=0}^{\infty}$ to satisfy $\sum_{k=0}^{\infty}\delta_k^{1/3}<\infty$. The choice of $B_k$ is specified in Step 4 below, and the infeasible BFGS bundle method reads as follows.

Algorithm 5.1 Infeasible BFGS bundle method for convex constrained optimization problem.

Step 4 (Correction of the quasi-Newton matrix)  Let $\Delta x_k=x_{k+1}-x_k$, $\Delta y_k=\tilde G_{x_{k+1}}^M(x_{k+1})-\tilde G_{x_k}^M(x_k)$. If there exist two constants $a_3>0$, $a_4\in(0,1)$ such that the iterates satisfy the following two conditions:

$$\|\Delta x_k\|_M\big(\sqrt{2\varepsilon_k}+\sqrt{2\varepsilon_{k+1}}\big)\le a_3(\Delta x_k)^T\Delta y_k,\tag{20}$$

$$2\|\Delta y_k\|_M\big(\sqrt{2\varepsilon_k}+\sqrt{2\varepsilon_{k+1}}\big)\le\min\{a_4,\,\delta_k^{1/3}+\delta_{k+1}^{1/3}\}\|\Delta y_k\|^2,\tag{21}$$

then let $B_{k+1}=B_k-\dfrac{B_k\Delta x_k(\Delta x_k)^TB_k}{(\Delta x_k)^TB_k\Delta x_k}+\dfrac{\Delta y_k(\Delta y_k)^T}{(\Delta x_k)^T\Delta y_k}$; otherwise set $B_{k+1}=M$. Let $k=k+1$ and go back to Step 2.

From Step 4, if (20) holds, then $(\Delta x_k)^T\Delta y_k>0$; hence, if $B_k$ is a positive definite matrix, then $B_{k+1}$ is also positive definite and satisfies the quasi-Newton equation

$$B_{k+1}\Delta x_k=B_k\Delta x_k-\frac{B_k\Delta x_k(\Delta x_k)^TB_k\Delta x_k}{(\Delta x_k)^TB_k\Delta x_k}+\frac{\Delta y_k(\Delta y_k)^T\Delta x_k}{(\Delta x_k)^T\Delta y_k}=\Delta y_k.$$
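With conditions (20)–(21) written as above, the safeguarded correction of Step 4 can be sketched as follows (our illustration; eps_k and eps_k1 stand for $\varepsilon_k,\varepsilon_{k+1}$ returned by the inner loops, and a3, a4, delta_k, delta_k1 are the constants and tolerances fixed in the initial step).

```python
import numpy as np

def bfgs_correct(B, M, dx, dy, eps_k, eps_k1, a3, a4, delta_k, delta_k1):
    """Step 4 of Algorithm 5.1: BFGS update if tests (20)-(21) hold, otherwise reset to M."""
    slack = np.sqrt(2.0 * eps_k) + np.sqrt(2.0 * eps_k1)
    dxM = np.sqrt(dx @ M @ dx)                                   # ||dx||_M
    dyM = np.sqrt(dy @ M @ dy)                                   # ||dy||_M
    ok20 = dxM * slack <= a3 * (dx @ dy)
    ok21 = 2.0 * dyM * slack <= min(a4, delta_k ** (1 / 3) + delta_k1 ** (1 / 3)) * (dy @ dy)
    if ok20 and ok21:
        return (B - np.outer(B @ dx, B @ dx) / (dx @ B @ dx)
                + np.outer(dy, dy) / (dx @ dy))
    return M.copy()
```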

As is well known, if $h_x(\cdot)$ is a strongly convex function on $\mathbb{R}^n$, then $H^M(\cdot)$ is also strongly convex on $\mathbb{R}^n$ [10]. Now assume that $H^M(\cdot)$ is strongly convex on $\{D_k\}_{k\ge 0}$. Then $h_x(\cdot)$ has a unique minimizer $\bar x$ on $\{D_k\}_{k\ge 0}$. Introduce the index set $K:=\{0\}\cup\{i\mid(20)\text{ or }(21)\text{ does not hold for }k=i-1\}=\{k_0,k_1,\ldots,k_j,\ldots\}$. The following theorem shows that the boundedness assumption on $\{B_k\}$ and $\{B_k^{-1}\}$ in Theorem 4.2 holds under certain conditions.

Theorem 5.1  Suppose that $H^M(\cdot)$ is strongly convex on $\{D_k\}_{k\ge 0}$ and that the sequences $\{x_k\}$ and $\{B_k\}$ are generated by Algorithm 5.1. Suppose further that, for all $k\ge 0$, $\Delta x_k$ and $\Delta\bar y_k$ satisfy

$$\frac{\|\Delta\bar y_k-M\Delta x_k\|}{\|\Delta x_k\|}\le\delta_k,$$

where $\Delta\bar y_k=G_{x_{k+1}}^M(x_{k+1})-G_{x_k}^M(x_k)$, $M$ is some symmetric positive definite matrix, and $\{\delta_k\}_{k=0}^{\infty}$ is a sequence of positive numbers satisfying $\sum_{k=0}^{\infty}\delta_k<\infty$. Then $\{B_k\}$ and $\{B_k^{-1}\}$ are bounded.

Proof  If (20) and (21) hold, we have

$$(\Delta x_k)^T\Delta y_k\ge(\Delta x_k)^T\Delta\bar y_k-\|\Delta x_k\|_M\|\Delta y_k-\Delta\bar y_k\|_{M^{-1}}\ge(\Delta x_k)^T\Delta\bar y_k-\|\Delta x_k\|_M\big(\|\tilde G_{x_k}^M(x_k)-G_{x_k}^M(x_k)\|_{M^{-1}}+\|\tilde G_{x_{k+1}}^M(x_{k+1})-G_{x_{k+1}}^M(x_{k+1})\|_{M^{-1}}\big)\ge(\Delta x_k)^T\Delta\bar y_k-\|\Delta x_k\|_M\big(\sqrt{2\varepsilon_k}+\sqrt{2\varepsilon_{k+1}}\big).$$

Rearranging and using (20), we get

$$(\Delta x_k)^T\Delta y_k\ge\frac{1}{1+a_3}(\Delta x_k)^T\Delta\bar y_k.$$

In addition,

$$\|\Delta\bar y_k\|^2\ge\|\Delta y_k\|^2-2\|\Delta y_k\|_M\|\Delta\bar y_k-\Delta y_k\|_{M^{-1}}\ge\|\Delta y_k\|^2-2\|\Delta y_k\|_M\big(\sqrt{2\varepsilon_k}+\sqrt{2\varepsilon_{k+1}}\big).$$

Rearranging and using (21), we get

$$\|\Delta\bar y_k\|^2\ge(1-a_4)\|\Delta y_k\|^2.$$

For $i\in K$ we have $B_i=M$; for $i\notin K$, $B_i$ is obtained from $B_{i-1}$ by the BFGS correction formula. Assume first that $K$ has an infinite number of elements. For $k_{j-1}\le k<k_j-1$, where $j\ge 1$ and $k_{j-1},k_j\in K$, conditions (20) and (21) hold, i.e.,

$$(\Delta x_k)^T\Delta y_k\ge\frac{1}{1+a_3}(\Delta x_k)^T\Delta\bar y_k>0.$$

Then, by Lemma 2.4, (20) and (21), for all $k_{j-1}\le k<k_j-1$ we have

$$\frac{\|\Delta y_k-M\Delta x_k\|}{\|\Delta x_k\|}\le\delta_k+\frac{\|\Delta y_k-\Delta\bar y_k\|}{\|\Delta x_k\|}\le\delta_k+\frac{\sqrt{\|M\|}\big(\sqrt{2\varepsilon_k}+\sqrt{2\varepsilon_{k+1}}\big)}{\|\Delta x_k\|}\le\delta_k+\frac{\sqrt{\|M\|}\,\|M^{-1}\|^{\frac12}}{2\sqrt{1-a_4}}\big(\delta_k^{\frac13}+\delta_{k+1}^{\frac13}\big)\frac{\|\Delta\bar y_k\|}{\|\Delta x_k\|}\le\delta_k+\frac{\sqrt{\|M\|}\,\|M^{-1}\|^{\frac12}\big(\|M\|+\delta_k\big)}{2\sqrt{1-a_4}}\big(\delta_k^{\frac13}+\delta_{k+1}^{\frac13}\big)=:\bar\delta_k.$$

Hence, for all $k$ satisfying $k_{j-1}\le k<k_j-1$, we have

$$\frac{\|\Delta y_k-M\Delta x_k\|}{\|\Delta x_k\|}\le\bar\delta_k.$$

By the assumptions $\sum_{k=0}^{\infty}\delta_k<\infty$ and $\sum_{k=0}^{\infty}\delta_k^{1/3}<\infty$, we have $\sum_{k=0}^{\infty}\bar\delta_k<\infty$. Then, according to [2, Theorem 3.2], for all $k$ satisfying $k_{j-1}\le k<k_j-1$, $\|B_k\|$ and $\|B_k^{-1}\|$ are bounded, with bounds depending only on $B_{k_{j-1}}$. Since $B_{k_j}=M$ for all $j\ge 0$, the sequences $\{B_k\}$ and $\{B_k^{-1}\}$ are bounded. When the number of elements in $K$ is finite, a similar argument shows that $\{B_k\}$ and $\{B_k^{-1}\}$ are bounded. □

The following result shows that, without the boundedness assumption on $B_k$ and $B_k^{-1}$, it is still possible to prove that the iterative sequence generated by the algorithm converges to the optimal solution of Problem (1) for some special problems.

Theorem 5.2  Suppose that $H^M(\cdot)$ is strongly convex on $\{D_k\}_{k\ge 0}$ and that the sequences $\{x_k\}$, $\{B_k\}$ are generated by Algorithm 5.1 with $x_k\ne\bar x$ for all $k\ge 0$. Then $\{x_k\}$ converges R-linearly to the unique solution $\bar x$ of Problem (1); furthermore,

$$\sum_{k=0}^{\infty}\|x_k-\bar x\|<+\infty.$$

Proof  The index set $K$ is either finite or infinite. In either case, using the strong convexity of $H^M(\cdot)$, the global Lipschitz continuity of $G^M(\cdot)$ with modulus $\|M\|$, the positive definite correction criterion for $B_k$, Lemma 2.2, and $\sum_{k=0}^{\infty}\delta_k<\infty$, one can show that there exist constants $\theta\in(0,1)$, $\bar N\in(0,\infty)$ and a positive integer $\bar k$ such that for all $k\ge\bar k$,

$$H_{x_{k+1}}^M(x_{k+1})-H_{\bar x}^M(\bar x)\le\bar N\big(\theta^{\frac12}\big)^{k-\bar k+1}\big(H_{x_k}^M(x_k)-H_{\bar x}^M(\bar x)\big).$$

By the strong convexity of $H^M(\cdot)$ and [1, Lemma 4.3], there exists a constant $\alpha>0$ such that, for all $k\ge 0$,

$$\frac{\alpha}{2}\|x_k-\bar x\|^2\le H_{x_k}^M(x_k)-H_{\bar x}^M(\bar x)\le\frac{1}{2\alpha}\|G_{x_k}^M(x_k)\|^2.$$

Therefore,

$$\sum_{k=\bar k}^{\infty}\|x_k-\bar x\|\le\Big(\frac{2}{\alpha}\Big)^{\frac12}\sum_{k=\bar k}^{\infty}\big(H_{x_k}^M(x_k)-H_{\bar x}^M(\bar x)\big)^{\frac12}\le\Big[\frac{2\bar N\big(H_{x_{\bar k}}^M(x_{\bar k})-H_{\bar x}^M(\bar x)\big)}{\alpha}\Big]^{\frac12}\sum_{k=\bar k}^{\infty}\big(\theta^{\frac14}\big)^{k-\bar k}.$$

This completes the proof. □

6 Conclusions

In this paper, we combine the infeasible bundle method of Sagastizábal and Solodov with the Moreau-Yosida regularization idea to construct a quasi-Newton bundle method for solving convex constrained optimization problems. The global convergence result of the algorithm is derived without requiring that every iteration point be feasible. Although the method is essentially similar to an unconstrained bundle method, it differs from previous methods in that the objective function of the subproblem solved at each iteration changes with the generation of descent steps, which requires some necessary adjustments to ensure the convergence of the algorithm. For the special BFGS bundle method, we briefly discuss the conditions that guarantee the boundedness of the quasi-Newton matrices.

References

[1]

Bonnans J F, Gilbert J Ch, Lemaréchal C, Sagastizábal C A. A family of variable metric proximal methods. Math Program 1995; 68(1/2/3): 15–47

[2]

Byrd R H, Nocedal J. A tool for the analysis of quasi-Newton methods with application to unconstrained minimization. SIAM J Numer Anal 1989; 26(3): 727–739

[3]

Fletcher R, Leyffer S. A bundle filter method for nonsmooth nonlinear optimization. Numerical Analysis Report NA/195. Dundee: Department of Mathematics, The University of Dundee, 1999

[4]

Frangioni A. Generalized bundle methods. SIAM J Optim 2002; 13(1): 117–156

[5]

Hare W, Sagastizábal C A. A redistributed proximal bundle method for nonconvex optimization. SIAM J Optim 2010; 20(5): 2442–2473

[6]

Kiwiel K C. Methods of Descent for Nondifferentiable Optimization. Lecture Notes in Math, Vol 1133. Berlin: Springer-Verlag, 1985

[7]

Kiwiel K C. Exact penalty functions in proximal bundle methods for constrained convex nondifferentiable minimization. Math Program 1991; 52(1/2/3): 285–302

[8]

Kiwiel K C. A proximal bundle method with approximate subgradient linearizations. SIAM J Optim 2006; 16(4): 1007–1023

[9]

Kiwiel K C. A proximal-projection bundle method for Lagrangian relaxation, including semidefinite programming. SIAM J Optim 2006; 17(4): 1015–1034

[10]

Lemaréchal C, Sagastizábal C A. Practical aspects of the Moreau-Yosida regularization: theoretical preliminaries. SIAM J Optim 1997; 7(2): 367–385

[11]

Mifflin R. An algorithm for constrained optimization with semismooth functions. Math Oper Res 1977; 2(2): 191–207

[12]

Mifflin R. A modification and an extension of Lemarechal's algorithm for nonsmooth minimization. In: Nondifferential and Variational Techniques in Optimization, Math Program Stud, Vol 17. Berlin: Springer-Verlag, 1982, 77–90

[13]

Mifflin R, Sun D F, Qi L Q. Quasi-Newton bundle type methods for nondifferentiable convex optimization. SIAM J Optim 1998; 8(2): 583–603

[14]

Oliveira W, Solodov M. A doubly stabilized bundle method for nonsmooth convex optimization. 2013

[15]

Pironneau O, Polak E. Rate of convergence of a class of methods of feasible directions. SIAM J Numer Anal 1973; 10(1): 161–174

[16]

Sagastizábal C A, Solodov M. An infeasible bundle method for nonsmooth convex constrained optimization without a penalty function or a filter. SIAM J Optim 2005; 16(1): 146–169
