Optimum Design Concepts

Jasbir S. Arora, in Introduction to Optimum Design (Second Edition), 2004

4.6.5 Sufficient Conditions for Convex Programming Problems

Theorem 4.11 Sufficient Condition for Convex Programming Problem If f(x) is a convex cost function defined on a convex feasible set, then the first-order KKT conditions are necessary as well as sufficient for a global minimum.

Thus, if we can show convexity of a problem, any solution of the necessary conditions will automatically satisfy sufficient conditions (see Example 4.42). In addition, the solution will be a global minimum. Following the procedure of Section 4.4, we consider various cases defined by the switching conditions of Eq. (4.50) until a solution is found. We can stop there as the solution is a global optimum design.

EXAMPLE 4.42

Check for Convexity of a Problem

Let us consider Example 4.29 again and check for its convexity. Minimize f(x) = (x1 − 1.5)² + (x2 − 1.5)² subject to g(x) = x1 + x2 − 2 ≤ 0.

Solution. The KKT necessary conditions give the candidate local minimum as x1* = 1, x2* = 1, and u* = 1. The constraint function g(x) is linear, so it is convex. Since the inequality constraint function is convex and there is no equality constraint, the feasible set S is convex. The Hessian matrix for the cost function is

H = \begin{bmatrix} 2 & 0 \\ 0 & 2 \end{bmatrix}

Since H is positive definite everywhere by Theorem 4.2 or Theorem 4.3, the cost function f(x) is strictly convex by Theorem 4.8. Therefore, the problem is convex and the solution x1* = 1, x2* = 1 satisfies the sufficiency condition of Theorem 4.11. It is a strict global minimum point for the problem.
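These checks can be reproduced numerically. A small sketch (the helper names are mine, not from the text): it confirms that H is positive definite, that the KKT point is stationary for the Lagrangian, and that the constraint is active there.

```python
import numpy as np

# Example 4.42 data: f(x) = (x1 - 1.5)^2 + (x2 - 1.5)^2, g(x) = x1 + x2 - 2 <= 0
def grad_f(x):
    return np.array([2 * (x[0] - 1.5), 2 * (x[1] - 1.5)])

grad_g = np.array([1.0, 1.0])            # g is linear, so its gradient is constant
H = np.array([[2.0, 0.0], [0.0, 2.0]])   # Hessian of the cost function

x_star, u_star = np.array([1.0, 1.0]), 1.0

eigenvalues = np.linalg.eigvalsh(H)      # both positive: f is strictly convex
kkt_residual = grad_f(x_star) + u_star * grad_g   # stationarity of the Lagrangian
g_value = x_star[0] + x_star[1] - 2.0    # 0: the inequality is active
```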

The convexity results are summarized in Table 4-3.

TABLE 4-3. Convex Programming Problem—Summary of Results

The problem must be written in the standard form: Minimize f(x) subject to h i (x) = 0, g j(x) ≤ 0
1.

Convex set. The geometrical condition, that the line segment joining any two points in the set lies entirely in the set, is an "if and only if" condition for convexity of the set.

2.

Convexity of feasible set S. All the constraint functions should be convex. This condition is only sufficient but not necessary; i.e., functions failing the convexity check may still define convex sets.

Nonlinear equality constraints always give nonconvex sets.

Linear equalities or inequalities always give convex sets.

3.

Convex functions. A function is convex if and only if its Hessian is at least positive semidefinite everywhere.

A function is strictly convex if its Hessian is positive definite everywhere. The converse is not true: a strictly convex function need not have a positive definite Hessian everywhere. Thus this condition is only sufficient but not necessary.

4.

Form of constraint function. Changing the form of a constraint function can result in failure of the convexity check for the new constraint or vice versa.

5.

Convex programming problem. f(x) is convex over the convex feasible set S.

KKT first-order conditions are necessary as well as sufficient for a global minimum

Any local minimum point is also a global minimum point

6.

Nonconvex programming problem. If a problem fails a convexity check, it does not imply that there is no global minimum for the problem. It could still have only one local minimum in the feasible set S, which would then be a global minimum as well.

URL: https://www.sciencedirect.com/science/article/pii/B9780120641550500045

More on Optimum Design Concepts

Jasbir S. Arora, in Introduction to Optimum Design (Third Edition), 2012

5.4 Second-Order Conditions for the Rectangular Beam Design Problem

The rectangular beam problem was formulated and solved graphically in Section 3.8. The KKT necessary conditions were written and solved in Section 4.9.2, where several points that satisfy those conditions were obtained. It is seen from the graphical representation of the problem that all of these points are global minima; however, none of them is an isolated minimum. Let us show that the sufficiency condition is not satisfied at any of these points.

Cases 3, 5, and 6 in Section 4.9.2 gave solutions that satisfy the KKT conditions. Cases 5 and 6 had two active constraints; however, only the constraint with a positive multiplier needs to be considered in Eq. (5.11). The sufficiency theorem requires only constraints with ui > 0 to be considered in calculating the feasible directions for use in Eq. (5.12). Therefore, only the g2 constraint needs to be included in the check for sufficiency conditions. Thus, all three cases have the same sufficiency check.

We need to calculate Hessians of the cost function and the second constraint:

(a) \nabla^2 f = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \quad \nabla^2 g_2 = \frac{2.25 \times 10^5}{b^3 d^3} \begin{bmatrix} 2d^2 & bd \\ bd & 2b^2 \end{bmatrix}

Since bd = 1.125×10⁵, ∇²g2 becomes

(b) \nabla^2 g_2 = 2 \begin{bmatrix} 2/b^2 & (1.125 \times 10^5)^{-1} \\ (1.125 \times 10^5)^{-1} & 2/d^2 \end{bmatrix}

The Hessian of the Lagrangian is given as

(c) \nabla^2 L = \nabla^2 f + u_2 \nabla^2 g_2 = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} + 2(56{,}250) \begin{bmatrix} 2/b^2 & (1.125 \times 10^5)^{-1} \\ (1.125 \times 10^5)^{-1} & 2/d^2 \end{bmatrix}

(d) \nabla^2 L = \begin{bmatrix} (2.25 \times 10^5)/b^2 & 2 \\ 2 & (2.25 \times 10^5)/d^2 \end{bmatrix}

The determinant of ∇²L is 0 for bd = 1.125×10⁵; the matrix is only positive semidefinite. Therefore, the Strong Sufficiency Theorem 5.3 cannot be used to show the sufficiency of x*. We must check the sufficiency condition of Eq. (5.12). In order to do that, we must find directions y satisfying Eq. (5.11). The gradient of g2 is given as

(e) \nabla g_2 = \left[ -\frac{2.25 \times 10^5}{b^2 d}, \; -\frac{2.25 \times 10^5}{b d^2} \right]

The feasible directions y are given by ∇g2ᵀy = 0, as

(f) \frac{1}{b} y_1 + \frac{1}{d} y_2 = 0, \quad \text{or} \quad y_2 = -\frac{d}{b} y_1

Therefore, vector y is given as y = c(1, −d/b), where c = y1 is any constant. Using ∇²L and y, Q of Eq. (5.12) is given as

(g) Q = y^T \nabla^2 L \, y = 0

Thus, the sufficiency condition of Theorem 5.2 is not satisfied. The points satisfying bd = 1.125×10⁵ are not isolated minimum points. This is, of course, true from Figure 3.11. Note, however, that since Q = 0, the second-order necessary condition of Theorem 5.1 is satisfied for Case 3. Theorem 5.2 cannot be used for the solutions to Cases 5 and 6 since there are two active constraints for this two-variable problem; therefore, there are no nonzero d vectors.
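These computations are easy to verify numerically at any point on the solution curve; a sketch (the particular values of b and d are my choice, subject only to bd = 1.125×10⁵):

```python
import numpy as np

b, d = 300.0, 375.0                     # any point with b*d = 1.125e5

# Hessian of the Lagrangian, Eq. (d)
L2 = np.array([[2.25e5 / b**2, 2.0],
               [2.0, 2.25e5 / d**2]])
det_L2 = np.linalg.det(L2)              # 0: L2 is only positive semidefinite

# Direction in the tangent plane of g2, from Eq. (f): y = (1, -d/b)
y = np.array([1.0, -d / b])
Q = y @ L2 @ y                          # Eq. (g): Q = 0
```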

It is important to note that this problem does not satisfy the conditions for a convex programming problem, and none of the points satisfying the KKT conditions satisfies the sufficiency condition for an isolated local minimum. Yet all of the points are actually global minimum points. Two conclusions can be drawn from this example:

1.

Global minimum points can be obtained for problems that cannot be classified as convex programming problems. We cannot show global optimality unless we find all of the local minimum points in the closed and bounded set (the Weierstrass Theorem 4.1).

2.

If sufficiency conditions are not satisfied, the only conclusion we can draw is that the candidate point need not be an isolated minimum. It may have many local optima in the neighborhood, and they may all be actually global minimum points.

URL: https://www.sciencedirect.com/science/article/pii/B978012381375600005X

Optimum Design Concepts

Jasbir S. Arora, in Introduction to Optimum Design (Third Edition), 2012

4.8.5 Sufficient Conditions for Convex Programming Problems

If we can show convexity of a problem, any solution to the necessary conditions will automatically satisfy the sufficient conditions (see Example 4.42). In addition, the solution will be a global minimum. Following the procedure of Section 4.4, we consider various cases defined by the switching conditions of Eq. (4.51) until a solution is found. We can stop there, as the solution is a global optimum design.

Theorem 4.11

Sufficient Conditions for Convex Programming Problems If f(x) is a convex cost function defined on a convex feasible set, then the first-order KKT conditions are necessary as well as sufficient for a global minimum.

Example 4.42

Check for the Convexity of a Problem

Let us consider Example 4.29 again and check for its convexity:

Minimize

(a) f(x) = (x_1 - 1.5)^2 + (x_2 - 1.5)^2

subject to

(b) g(x) = x_1 + x_2 - 2 \le 0

Solution

The KKT necessary conditions give the candidate local minimum as x1* = 1, x2* = 1, and u* = 1. The constraint function g(x) is linear, so it is convex. Since the inequality constraint function is convex and there is no equality constraint, the feasible set S is convex. The Hessian matrix for the cost function is

(c) H = \begin{bmatrix} 2 & 0 \\ 0 & 2 \end{bmatrix}

Since H is positive definite everywhere by Theorem 4.2 or Theorem 4.3, the cost function f(x) is strictly convex by Theorem 4.8. Therefore, the problem is convex and the solution x1* = 1, x2* = 1 satisfies the sufficiency condition of Theorem 4.11. It is a strict global minimum point for the problem.

The convexity results are summarized in Table 4.3.

Table 4.3. Convex programming problem—summary of results

Problem must be written in standard form: Minimize f(x), subject to hi(x)=0, gj(x)≤0
1.

Convex set

The geometrical condition, that a line joining two points in the set is to be in the set, is an "if-and-only-if" condition for the convexity of the set.
2.

Convexity of feasible set S

All of the constraint functions should be convex. This condition is sufficient but not necessary; that is, functions failing the convexity check may still define convex sets.

Nonlinear equality constraints always give nonconvex sets.

Linear equalities or inequalities always give convex sets.

3.

Convex functions

A function is convex if and only if its Hessian is at least positive semidefinite everywhere. A function is strictly convex if its Hessian is positive definite everywhere; however, the converse is not true: A strictly convex function may not have a positive definite Hessian everywhere. Thus this condition is sufficient but not necessary.
4.

Form of constraint function

Changing the form of a constraint function can result in failure of the convexity check for the new constraint or vice versa.
5.

Convex programming problem

f(x) is convex over the convex feasible set S.

KKT first-order conditions are necessary as well as sufficient for a global minimum.

Any local minimum point is also a global minimum point.

6.

Nonconvex programming problem

If a problem fails a convexity check, it does not imply that there is no global minimum for the problem. It could have only one local minimum in the feasible set S, which would then be a global minimum as well.

URL: https://www.sciencedirect.com/science/article/pii/B9780123813756000048

More on Numerical Methods for Unconstrained Optimum Design

Jasbir S. Arora, in Introduction to Optimum Design (Second Edition), 2004

9.4 Search Direction Determination: Newton's Method

With the steepest descent method, only first-order derivative information is used to determine the search direction. If second-order derivatives were available, we could use them to represent the cost surface more accurately, and a better search direction could be found. With the inclusion of second-order information, we could expect a better rate of convergence. For example, Newton's method, which uses the Hessian of the function in the calculation of the search direction, has a quadratic rate of convergence (meaning that it converges very rapidly when the design point is within a certain radius of the minimum point). For any positive definite quadratic function, the method converges in just one iteration with a step size of one.

9.4.1 Classical Newton's Method

The basic idea of Newton's method is to use a second-order Taylor's expansion of the function about the current design point. This gives a quadratic expression for the change in design Δx. The necessary condition for minimization of this function then gives an explicit calculation for the design change. In the following, we shall omit the argument x (k) from all functions because the derivation applies to any design iteration. Using a second-order Taylor's expansion for the function f(x), we obtain

(9.7) f(x + \Delta x) = f(x) + c^T \Delta x + 0.5\, \Delta x^T H \Delta x

where Δx is a small change in design and H is the Hessian of f at the point x (sometimes denoted ∇²f). Equation (9.7) is a quadratic function in terms of Δx. The theory of convex programming problems in Chapter 4 guarantees that if H is positive semidefinite, then there is a Δx that gives a global minimum for the function of Eq. (9.7). In addition, if H is positive definite, then the minimum of Eq. (9.7) is unique. Writing the optimality condition [∂f/∂(Δx) = 0] for the function of Eq. (9.7), we obtain

(9.8) c + H \Delta x = 0

Assuming H to be nonsingular, we get an expression for Δx as

(9.9) \Delta x = -H^{-1} c

Using this value for Δx, the design is updated as

(9.10) x^{(1)} = x^{(0)} + \Delta x

Since Eq. (9.7) is just an approximation for f at the point x (0), x (1) will probably not be the precise minimum point of f(x). Therefore, the process will have to be repeated to obtain improved estimates until the minimum is reached. Each iteration of Newton's method requires computation of the Hessian of the cost function. Since it is a symmetric matrix, it needs computation of n(n + 1)/2 second-order derivatives of f(x) (recall that n is the number of design variables). This can require considerable computational effort.
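For a quadratic f with positive definite H, the single update Δx = −H⁻¹c lands exactly on the minimum, which is why the method converges in one iteration for such functions. A sketch with quadratic data of my own choosing:

```python
import numpy as np

# f(x) = 0.5 x^T A x + b^T x with A positive definite (illustrative data)
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, -2.0])

def gradient(x):
    return A @ x + b        # c in Eq. (9.7); the Hessian of f is simply A

x0 = np.array([10.0, -7.0])
dx = np.linalg.solve(A, -gradient(x0))   # solve H dx = -c, avoiding the inverse
x1 = x0 + dx                             # one Newton step: gradient(x1) = 0
```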

9.4.2 Modified Newton's Method

Note that the classical Newton's method does not have a step size associated with the calculation of the design change Δx in Eq. (9.9); i.e., the step size is taken as one (a step of length one is called an ideal step size or the Newton step). Therefore, there is no way to ensure that the cost function will be reduced at each iteration, i.e., to ensure that f(x (k+1)) < f(x (k)). Thus, the method is not guaranteed to converge to a local minimum point even with the use of second-order information, which requires substantial computation. This situation can be corrected if we incorporate a step size in the calculation of the design change Δx. In other words, we treat the solution of Eq. (9.9) as the search direction and use any of the one-dimensional search methods to calculate the step size in the search direction. This is called the modified Newton's method and is stated as follows.

Step 1. Make an engineering estimate for a starting design x (0). Set the iteration counter k = 0. Select a tolerance ε for the stopping criterion.

Step 2. Calculate ci (k) = ∂f(x (k))/∂xi for i = 1 to n. If ||c (k)|| < ε, stop the iterative process. Otherwise, continue.

Step 3. Calculate the Hessian matrix H (k) at the current point x (k).

Step 4. Calculate the search direction by solving Eq. (9.9) as

(9.11) d^{(k)} = -[H^{(k)}]^{-1} c^{(k)}

Note that the calculation of d (k) in the above equation is symbolic. For computational efficiency, the linear equation H (k) d (k) = −c (k) is solved directly instead of evaluating the inverse of the Hessian matrix.

Step 5. Update the design as x (k+1) = x (k) + αk d (k), where αk is calculated to minimize f(x (k) + αd (k)). Any one-dimensional search procedure may be used to calculate α.

Step 6. Set k = k + 1 and go to Step 2.

It is important to note here that unless H is positive definite, the direction d (k) determined from Eq. (9.11) may not be that of descent for the cost function. To see this, we substitute d (k) from Eq. (9.11) into the descent condition of Eq. (8.8) to obtain

(9.12) c^{(k)T} d^{(k)} = -c^{(k)T} [H^{(k)}]^{-1} c^{(k)} < 0

The foregoing condition is always satisfied if H is positive definite. If H is negative definite or negative semidefinite, the condition is always violated. With H indefinite or positive semidefinite, the condition may or may not be satisfied, so we must check for it. If the direction obtained in Step 4 is not that of descent for the cost function, then we should stop there because a positive step size cannot be determined. Based on the foregoing discussion, it is suggested that the descent condition of Eq. (8.8) be checked for Newton's search direction at each iteration before the step size is calculated. Examples 9.6 and 9.7 demonstrate the use of the modified Newton's method.
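The steps above can be sketched in code. The one-dimensional search is a simple backtracking loop here (a stand-in for golden section or any other line search), the descent condition is checked before the step size is computed, and the demo quadratic is my own choice:

```python
import numpy as np

def modified_newton(f, grad, hess, x0, eps=1e-6, max_iter=100):
    """Sketch of Steps 1-6; backtracking replaces the exact line search."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        c = grad(x)                          # Step 2
        if np.linalg.norm(c) < eps:
            break
        H = hess(x)                          # Step 3
        d = np.linalg.solve(H, -c)           # Step 4: solve H d = -c
        if c @ d >= 0:                       # descent check of Eq. (8.8)
            raise RuntimeError("Newton direction is not a descent direction")
        alpha, fx = 1.0, f(x)                # Step 5: start from the Newton step
        while f(x + alpha * d) >= fx and alpha > 1e-12:
            alpha *= 0.5
        x = x + alpha * d                    # Step 6 loops back to Step 2
    return x

# Demo on a convex quadratic (my own choice); the minimum is at the origin
f = lambda x: 3 * x[0]**2 + 2 * x[0] * x[1] + 2 * x[1]**2 + 7
grad = lambda x: np.array([6 * x[0] + 2 * x[1], 2 * x[0] + 4 * x[1]])
hess = lambda x: np.array([[6.0, 2.0], [2.0, 4.0]])
x_min = modified_newton(f, grad, hess, [5.0, 10.0])
```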

EXAMPLE 9.6

Use of Modified Newton's Method

(a) \text{Minimize } f(x) = 3x_1^2 + 2x_1 x_2 + 2x_2^2 + 7

using the modified Newton's algorithm starting from the point (5, 10). Use ε = 0.0001 as the stopping criterion.

Solution. We will follow the steps of the modified Newton's method.

1.

x (0) is given as (5, 10).

2.

The gradient vector c (0) at the point (5, 10) is given as

(b) c^{(0)} = (6x_1 + 2x_2, \; 2x_1 + 4x_2) = (50, 50)

(c) \| c^{(0)} \| = \sqrt{50^2 + 50^2} = 50\sqrt{2} > \varepsilon

Therefore, the convergence criterion is not satisfied.

3.

The Hessian matrix at the point (5, 10) is given as

(d) H^{(0)} = \begin{bmatrix} 6 & 2 \\ 2 & 4 \end{bmatrix}

Note that the Hessian does not depend on design variables and is positive definite (since its eigenvalues are 7.24 and 2.76). Therefore Newton's direction satisfies the descent condition at each iteration.

4.

The direction of design change is

(e) d^{(0)} = -H^{-1} c^{(0)} = -\frac{1}{20} \begin{bmatrix} 4 & -2 \\ -2 & 6 \end{bmatrix} \begin{bmatrix} 50 \\ 50 \end{bmatrix} = \begin{bmatrix} -5 \\ -10 \end{bmatrix}

5.

Step size α is calculated to minimize f(x (0) + αd (0)):

(f) x^{(1)} = x^{(0)} + \alpha d^{(0)} = \begin{bmatrix} 5 \\ 10 \end{bmatrix} + \alpha \begin{bmatrix} -5 \\ -10 \end{bmatrix} = \begin{bmatrix} 5 - 5\alpha \\ 10 - 10\alpha \end{bmatrix}

(g) \frac{\partial f}{\partial \alpha} = 0; \quad \text{or} \quad \nabla f(x^{(1)})^T d^{(0)} = 0

Using the Step 2 calculations, ∇f(x (1)) and the dot product ∇f(x (1))·d (0) are calculated as

(h) \nabla f(x^{(1)}) = \begin{bmatrix} 6(5 - 5\alpha) + 2(10 - 10\alpha) \\ 2(5 - 5\alpha) + 4(10 - 10\alpha) \end{bmatrix} = \begin{bmatrix} 50 - 50\alpha \\ 50 - 50\alpha \end{bmatrix}

(i) \nabla f(x^{(1)}) \cdot d^{(0)} = (50 - 50\alpha, \; 50 - 50\alpha) \begin{bmatrix} -5 \\ -10 \end{bmatrix} = 0

(j) \text{or, } -5(50 - 50\alpha) - 10(50 - 50\alpha) = 0

Solving the preceding equation, we get α = 1. Note that the golden section search also gives α = 1. Therefore,

(k) x^{(1)} = \begin{bmatrix} 5 - 5\alpha \\ 10 - 10\alpha \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}

The gradient of the cost function at x (1) is calculated as

(l) c^{(1)} = \begin{bmatrix} 50 - 50\alpha \\ 50 - 50\alpha \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}

Since ||c (1)|| < ε, Newton's method has given the solution in just one iteration. This is because the function is a positive definite quadratic form (the Hessian of f is positive definite everywhere). Note that the condition number of the Hessian is not 1; therefore the steepest descent method will not converge in one iteration, as was the case in Examples 9.4 and 9.5.
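The hand calculations of this example are easy to check with a few lines of linear algebra:

```python
import numpy as np

x0 = np.array([5.0, 10.0])
H = np.array([[6.0, 2.0], [2.0, 4.0]])
c0 = np.array([6 * x0[0] + 2 * x0[1], 2 * x0[0] + 4 * x0[1]])   # (50, 50)

d0 = np.linalg.solve(H, -c0)     # Newton direction, Eq. (e): (-5, -10)
x1 = x0 + 1.0 * d0               # alpha = 1 from the exact line search
c1 = np.array([6 * x1[0] + 2 * x1[1], 2 * x1[0] + 4 * x1[1]])   # zero gradient
```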

EXAMPLE 9.7

Use of Modified Newton's Method

(a) \text{Minimize } f(x) = 10x_1^4 - 20x_1^2 x_2 + 10x_2^2 + x_1^2 - 2x_1 + 5

using the computer program for the modified Newton's method given in Appendix D from the point (-1, 3). Golden section search may be used for step size determination with δ = 0.05 and line search accuracy equal to 0.0001. For the stopping criterion, use ε = 0.005.

Solution. Note that f(x) is not a quadratic function in terms of the design variables. Thus, we cannot expect Newton's method to converge in one iteration. The gradient of f(x) is given as

(b) c = \nabla f(x) = (40x_1^3 - 40x_1 x_2 + 2x_1 - 2, \; -20x_1^2 + 20x_2)

and the Hessian matrix of f(x) is

(c) H = \nabla^2 f(x) = \begin{bmatrix} 120x_1^2 - 40x_2 + 2 & -40x_1 \\ -40x_1 & 20 \end{bmatrix}

Results with the modified Newton's method for the problem are given in Table 9-2. The optimum point is (1, 1) and the optimum value of f(x) is 4.0. Newton's method converged to the optimum solution in eight iterations. Figure 9-7 shows the contours of the function and the progress of the method from the starting design (−1, 3). It is noted that the step size was approximately equal to one in the last phase of the iterative process. This is because the function resembles a quadratic function sufficiently closely near the optimum point, and the step size is unity for a quadratic function.
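The iteration history in Table 9-2 comes from the Appendix D program, but the final answer can be reproduced with a self-contained Newton loop; the backtracking line search and steepest-descent fallback below are my own safeguards, not the golden section procedure the text uses:

```python
import numpy as np

# Example 9.7: cost function, gradient (b), and Hessian (c)
f = lambda x: 10 * x[0]**4 - 20 * x[0]**2 * x[1] + 10 * x[1]**2 + x[0]**2 - 2 * x[0] + 5
grad = lambda x: np.array([40 * x[0]**3 - 40 * x[0] * x[1] + 2 * x[0] - 2,
                           -20 * x[0]**2 + 20 * x[1]])
hess = lambda x: np.array([[120 * x[0]**2 - 40 * x[1] + 2, -40 * x[0]],
                           [-40 * x[0], 20.0]])

x = np.array([-1.0, 3.0])                    # starting design of the example
for _ in range(2000):
    c = grad(x)
    if np.linalg.norm(c) < 1e-8:
        break
    d = np.linalg.solve(hess(x), -c)
    if c @ d >= 0:                           # Hessian indefinite away from optimum:
        d = -c                               # fall back to steepest descent
    alpha, fx = 1.0, f(x)
    while f(x + alpha * d) >= fx and alpha > 1e-14:
        alpha *= 0.5
    x = x + alpha * d
# x is now near the reported optimum (1, 1), where f = 4.0
```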

A computer program based on the modified Newton's method is given in Appendix D, which needs three user-supplied subroutines FUNCT, GRAD, and HASN. These subroutines evaluate cost function, the gradient, and the Hessian matrix of the cost function, respectively. The program is used to solve the problem of Example 9.7.

The drawbacks of the modified Newton's method for general applications are:

1.

It requires calculations of second-order derivatives at each iteration, which is usually quite time consuming. In some applications it may not even be possible to calculate such derivatives. Also, a linear system of equations in Eq. (9.11) needs to be solved. Therefore, each iteration of the method requires substantially more calculations compared with the steepest descent or conjugate gradient method.

2.

The Hessian of the cost function may be singular at some iterations. Thus, Eq. (9.11) cannot be used to compute the search direction. Also, unless the Hessian is positive definite, the search direction cannot be guaranteed to be that of descent for the cost function, as discussed earlier.

3.

The method is not convergent unless the Hessian remains positive definite and a step size is calculated along the search direction to update design. However, the method has a quadratic rate of convergence when it converges. For a strictly convex quadratic function, the method converges in just one iteration from any starting design.

A comparison of steepest descent, conjugate gradient, and modified Newton methods is presented in Example 9.8.

EXAMPLE 9.8

Comparison of Steepest Descent, Conjugate Gradient, and Modified Newton Methods

Minimize f(x) = 50(x2 − x1²)² + (2 − x1)² starting from the point (5, −5). Use the steepest descent, Newton, and conjugate gradient methods, and compare their performance.

Solution. The minimum point for the function is known to be (2, 4) with f(2, 4) = 0. We use exact gradient expressions and ε = 0.005 to solve the problem using the steepest descent and Newton's method programs given in Appendix D and the conjugate gradient method available in IDESIGN. Table 9-3 summarizes the final results with the three methods. For the steepest descent method, δ0 = 0.05 and a line search termination criterion of 0.00001 are used. For Newton's method, they are 0.05 and 0.0001, respectively. The golden section search is used with both methods. It can be observed again that for the present example the steepest descent method is the most inefficient and the conjugate gradient method is the most efficient. Therefore, the conjugate gradient method is recommended for general applications.

9.4.3 Marquardt Modification

As noted before, the modified Newton's method has several drawbacks that can cause numerical difficulties. For example, if the Hessian H of the cost function is not positive definite, the direction found from Eq. (9.11) may not be that of descent for the cost function. In that case, a step cannot be executed along the direction. Marquardt (1963) suggested a modification to the direction-finding process that has the desirable features of both the steepest descent and Newton's methods. It turns out that far away from the solution point, the method behaves like the steepest descent method, which is quite good there. Near the solution point, it behaves like Newton's method, which is very effective there. In the modified procedure, the Hessian is modified as (H + λI), where λ is a positive constant. λ is initially selected as a large number that is reduced as the iterations progress. The search direction is computed from Eq. (9.11) as

(9.13) d^{(k)} = -[H^{(k)} + \lambda_k I]^{-1} c^{(k)}

Note that when λ is large, the effect of H essentially gets neglected and d (k) is essentially −(1/λ)c (k), which is the steepest descent direction with 1/λ as the step size. As the algorithm proceeds, λ is reduced (i.e., step size is increased). When λ becomes sufficiently small, then the effect of λI is essentially neglected and the Newton direction is obtained from Eq. (9.13). If the direction d (k) of Eq. (9.13) does not reduce the cost function, then λ is increased (step size is reduced) and the search direction is recomputed. Marquardt's algorithm is summarized in the following steps.

Step 1. Make an engineering estimate for a starting design x (0). Set the iteration counter k = 0. Select a tolerance ε as the stopping criterion and λ0 as a large constant (say 1000).

Step 2. Calculate ci (k) = ∂f(x (k))/∂xi for i = 1 to n. If ||c (k)|| < ε, stop. Otherwise, continue.

Step 3. Calculate the Hessian matrix H(x (k)).

Step 4. Calculate the search direction by solving Eq. (9.13).

Step 5. If f(x (k) + d (k)) < f(x (k)), then continue. Otherwise, increase λk (to, say, 2λk) and go to Step 4.

Step 6. Reduce λk, say, λk+1 = 0.5λk. Set k = k + 1 and go to Step 2.
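These steps translate directly into code; a sketch (the cap on λ in the inner loop is my addition, and the demo function is taken from Example 9.8):

```python
import numpy as np

def marquardt(f, grad, hess, x0, eps=1e-6, lam=1000.0, max_iter=500):
    """Sketch of Marquardt's modification, Steps 1-6."""
    x = np.asarray(x0, dtype=float)
    n = x.size
    for _ in range(max_iter):
        c = grad(x)                                        # Step 2
        if np.linalg.norm(c) < eps:
            break
        H = hess(x)                                        # Step 3
        d = np.linalg.solve(H + lam * np.eye(n), -c)       # Step 4, Eq. (9.13)
        while f(x + d) >= f(x) and lam < 1e15:             # Step 5: no decrease,
            lam *= 2.0                                     # so increase lambda
            d = np.linalg.solve(H + lam * np.eye(n), -c)
        x = x + d
        lam *= 0.5                                         # Step 6: reduce lambda
    return x

# Demo on f(x) = 50(x2 - x1^2)^2 + (2 - x1)^2 of Example 9.8; minimum at (2, 4)
f = lambda x: 50 * (x[1] - x[0]**2)**2 + (2 - x[0])**2
grad = lambda x: np.array([-200 * x[0] * (x[1] - x[0]**2) - 2 * (2 - x[0]),
                           100 * (x[1] - x[0]**2)])
hess = lambda x: np.array([[-200 * (x[1] - x[0]**2) + 400 * x[0]**2 + 2, -200 * x[0]],
                           [-200 * x[0], 100.0]])
x_min = marquardt(f, grad, hess, [5.0, -5.0])
```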

URL: https://www.sciencedirect.com/science/article/pii/B9780120641550500094

More on Optimum Design Concepts

Jasbir S. Arora, in Introduction to Optimum Design (Second Edition), 2004

5.3 Second-Order Conditions for Constrained Optimization

Solutions of the necessary conditions are candidate local minimum designs. In this section, we shall discuss second-order necessary and sufficient conditions for constrained optimization problems. As in the unconstrained case, second-order information about the functions at the candidate point x * will be used to determine whether it is indeed a local minimum. Recall for the unconstrained problem that the local sufficiency of Theorem 4.4 requires the quadratic part of the Taylor's expansion for the function at x * to be positive for all nonzero changes d. In the constrained case, we must also consider active constraints at x * to determine feasible changes d. We will consider only the points x = x * + d in the neighborhood of x * that satisfy the active constraint equations. Any d ≠ 0 satisfying the active constraints to the first order must lie in the constraint tangent hyperplane (Fig. 5-2). Such d's are then orthogonal to the gradients of the active constraints, since the constraint gradients are normal to the constraint tangent hyperplane. Therefore, the dot product of d with each of the constraint gradients ∇hi and ∇gi must be zero; i.e., ∇hiᵀd = 0 and ∇giᵀd = 0. These equations are used to determine directions d that define a feasible region around the point x *. Note that only active inequality constraints (gi = 0) are used in determining d. The situation is depicted in Fig. 5-2 for one inequality constraint.

FIGURE 5-2. Directions d used in constrained sufficiency conditions.

To derive the second-order conditions, we write Taylor's expansion of the Lagrange function and consider only those d that satisfy the preceding conditions. x * is then a local minimum point if the second-order term of Taylor's expansion is positive for all d in the constraint tangent hyperplane. This is then the sufficient condition for an isolated local minimum point. As a necessary condition the second-order term must be nonnegative. We summarize these results in Theorems 5.1 and 5.2.

Theorem 5.1 Second-Order Necessary Condition for General Constrained Problems Let x * satisfy the first-order KKT necessary conditions for the general optimum design problem. Define the Hessian of the Lagrange function L at x * as

(5.6) \nabla^2 L = \nabla^2 f + \sum_{i=1}^{p} v_i^* \nabla^2 h_i + \sum_{i=1}^{m} u_i^* \nabla^2 g_i

Let there be nonzero feasible directions, d ≠ 0, satisfying the following linear systems at the point x *:

(5.7) \nabla h_i^T d = 0; \quad i = 1 \text{ to } p

(5.8) \nabla g_i^T d = 0 \quad \text{for all active inequalities (i.e., for those } i \text{ with } g_i(x^*) = 0\text{)}

Then if x * is a local minimum point for the optimum design problem, it must be true that

(5.9) Q \ge 0, \quad \text{where } Q = d^T \nabla^2 L(x^*)\, d

Note that any point that does not satisfy the second-order necessary conditions cannot be a local minimum point.

Theorem 5.2 Sufficient Conditions for General Constrained Problems Let x * satisfy the first-order KKT necessary conditions for the general optimum design problem. Define the Hessian of the Lagrange function L at x * as in Eq. (5.6). Define nonzero feasible directions, d ≠ 0, as solutions of the linear systems

(5.10) \nabla h_i^T d = 0; \quad i = 1 \text{ to } p

(5.11) \nabla g_i^T d = 0 \quad \text{for all active inequalities with } u_i^* > 0

Also let ∇giᵀd ≤ 0 for those active inequalities with ui* = 0. If

(5.12) Q > 0, \quad \text{where } Q = d^T \nabla^2 L(x^*)\, d

then x * is an isolated local minimum point (isolated means that there are no other local minimum points in the neighborhood of x *).

Note first the difference in the conditions on the directions d in Eq. (5.8) for the necessary condition and Eq. (5.11) for the sufficient condition. In Eq. (5.8) all active inequalities with nonnegative multipliers are included, whereas in Eq. (5.11) only those active inequalities with a positive multiplier are included. Equations (5.10) and (5.11) simply say that the dot product of d with each ∇hi, and with each ∇gi having ui* > 0, should be zero. So only the d orthogonal to the gradients of the equality constraints and the active inequality constraints with ui* > 0 are considered. Or, stated differently, only d in the tangent hyperplane to the active constraints at the candidate minimum point are considered. Equation (5.12) says that the Hessian of the Lagrangian must be positive definite for all d lying in the constraint tangent hyperplane. Note that ∇hi, ∇gi, and ∇²L are calculated at the candidate local minimum points x * satisfying the KKT necessary conditions.

It is important to note that if the matrix ∇²L(x *) is negative definite or negative semidefinite, then the second-order necessary condition for a local minimum is violated and x * cannot be a local minimum point. Also, if ∇²L(x *) is positive definite, i.e., Q in Eq. (5.12) is positive for any d ≠ 0, then x * satisfies the sufficiency condition for an isolated local minimum and no further checks are needed. The reason is that if ∇²L(x *) is positive definite, then Q > 0 in particular for those d that satisfy Eqs. (5.10) and (5.11). However, if ∇²L(x *) is not positive definite, then we cannot conclude that x * is not an isolated local minimum. We must calculate d satisfying Eqs. (5.10) and (5.11) and carry out the sufficiency test given in Theorem 5.2. This result is summarized in Theorem 5.3.

Theorem 5.3 Strong Sufficient Condition Let x * satisfy the first-order KKT necessary conditions for the general optimum design problem. Define the Hessian ∇²L(x *) of the Lagrange function at x * as in Eq. (5.6). Then, if ∇²L(x *) is positive definite, x * is an isolated local minimum point.

It should also be emphasized that if the inequality in Eq. (5.12) is not satisfied, we cannot conclude that x * is not a local minimum. It may still be a local minimum but not an isolated one. Note also that the theorem cannot be used for any x * if its assumptions are not satisfied. In that case, we cannot draw any conclusions for the point x *.

One case arising in some applications needs special mention. This occurs when the total number of active constraints (with at least one inequality) at the candidate minimum point x * is equal to the number of independent design variables; that is, there are no design degrees of freedom. Since x * satisfies KKT conditions, gradients of all the active constraints are linearly independent. Thus, the only solution for the system of Eqs. (5.10) and (5.11) is d = 0 and Theorem 5.2 cannot be used. However, since d = 0 is the only solution, there are no feasible directions in the neighborhood that can reduce the cost function any further. Thus, the point x * is indeed a local minimum for the cost function (see also the definition of a local minimum in Section 4.1.1). We consider Examples 5.4 to 5.6 to illustrate the use of sufficient conditions of optimality.
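This procedure can be organized numerically: collect the gradients of the equality and u > 0 active constraints as rows of a matrix, take its null space (via an SVD) to obtain the tangent-hyperplane directions, and test Q on each. A sketch on a hypothetical problem of my own (minimize x1² + x2² subject to h = x1 + x2 − 2 = 0, whose KKT point is x* = (1, 1) with v* = −2):

```python
import numpy as np

# Hypothetical problem: min x1^2 + x2^2  subject to  h(x) = x1 + x2 - 2 = 0
x_star, v_star = np.array([1.0, 1.0]), -2.0
A = np.array([[1.0, 1.0]])     # rows: gradients of equality/active constraints
L2 = 2.0 * np.eye(2)           # Hessian of the Lagrangian: grad^2 f + v* grad^2 h

# Stationarity check: grad f + v* grad h = 0 at x*
kkt_residual = 2.0 * x_star + v_star * A[0]

# Tangent hyperplane = null space of A, computed from the SVD
_, s, vt = np.linalg.svd(A)
rank = int(np.sum(s > 1e-12))
tangent = vt[rank:]            # rows d with A @ d = 0

# Sufficiency check, Eq. (5.12): Q = d^T L2 d > 0 for every tangent direction
Q_values = [d @ L2 @ d for d in tangent]
```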

EXAMPLE 5.4

Check for Sufficient Conditions

Check the sufficiency condition for Example 4.30: Minimize f(x) = (1/3)x³ − (1/2)(b + c)x² + bcx + f0 subject to a ≤ x ≤ d, where 0 < a < b < c < d and f0 are specified constants.

Solution. There is only one constrained candidate local minimum point, x = a. Since there is only one design variable and one active constraint, the condition ∇g1ᵀd̄ = 0 of Eq. (5.11) gives d̄ = 0 as the only solution (note that d̄ is used as the direction for the sufficiency check since d is used as a constant in this example). Therefore, Theorem 5.2 cannot be used for a sufficiency check. Also note that at x = a, d²L/dx² = 2a − b − c, which can be positive, negative, or zero depending on the values of a, b, and c. So we cannot use the curvature of the Hessian to check the sufficiency condition (Strong Sufficient Theorem 5.3). However, from Fig. 4-20 we observe that x = a is indeed an isolated local minimum point. From this example we can conclude that if the number of active inequality constraints equals the number of independent design variables and all other KKT conditions are satisfied, then the candidate point is indeed a local minimum design.

EXAMPLE 5.5

Check for Sufficient Conditions

Consider the optimization problem of Example 4.31: Minimize f(x) = x₁² + x₂² − 3x₁x₂ subject to g(x) = x₁² + x₂² − 6 ≤ 0. Check the sufficient conditions for the candidate minimum points.

Solution. The points satisfying KKT necessary conditions are

(a) (i) x* = (0, 0), u* = 0; (ii) x* = (√3, √3), u* = 1/2; (iii) x* = (−√3, −√3), u* = 1/2

It was previously observed in Example 4.31 and Fig. 4-21 that the point (0, 0) does not satisfy the sufficiency condition, while the other two points do. These geometrical observations will now be verified mathematically using the sufficiency theorems of optimality. The Hessian matrices of the cost and constraint functions are

(b) ∇²f = [2 −3; −3 2], ∇²g = [2 0; 0 2]

By the method of Appendix B, the eigenvalues of ∇²g are λ₁ = 2 and λ₂ = 2. Since both eigenvalues are positive, the function g is convex, and so the feasible set defined by g(x) ≤ 0 is convex by Theorem 4.9. However, since the eigenvalues of ∇²f are −1 and 5, f is not convex. Therefore, the problem cannot be classified as a convex programming problem and sufficiency cannot be shown by Theorem 4.11. We must resort to the general sufficiency Theorem 5.2. The Hessian of the Lagrangian function is given as

(c) ∇²L = ∇²f + u∇²g = [2 + 2u  −3; −3  2 + 2u]

For the first point x* = (0, 0), u* = 0, ∇²L becomes ∇²f (the constraint g(x) ≤ 0 is inactive). In this case the problem is unconstrained, and local sufficiency requires dᵀ∇²f(x*)d > 0 for all d; that is, ∇²f should be positive definite at x*. Since both eigenvalues of ∇²f are not positive, this condition is not satisfied. Therefore, x* = (0, 0) does not satisfy the second-order sufficiency condition. Note that since λ₁ = −1 and λ₂ = 5, the matrix ∇²f is indefinite at x*. Therefore the point x* = (0, 0) violates the second-order necessary condition of Theorem 4.4, which requires ∇²f to be positive semidefinite or definite at a candidate local minimum point. Thus, x* = (0, 0) cannot be a local minimum point. This agrees with the graphical observation made in Example 4.31.

At the points x* = (√3, √3), u* = 1/2 and x* = (−√3, −√3), u* = 1/2,

(d) ∇²L = ∇²f + u*∇²g = [2 + 2u*  −3; −3  2 + 2u*] = [3  −3; −3  3]

(e) ∇g = ±(2√3, 2√3) = ±2√3(1, 1)

It may be checked that ∇²L is not positive definite at either of the two points. Therefore, we cannot use Theorem 5.3 to conclude that x* is a minimum point. We must find d satisfying Eqs. (5.10) and (5.11). If we let d = (d₁, d₂), then ∇gᵀd = 0 gives

(f) ±2√3 [1  1] [d₁; d₂] = 0; or d₁ + d₂ = 0

Thus, d₁ = −d₂ = c, where c ≠ 0 is an arbitrary constant, and a d ≠ 0 satisfying ∇gᵀd = 0 is given as d = c(1, −1). The sufficiency condition of Eq. (5.12) gives

(g) Q = dᵀ(∇²L)d = c[1  −1] [3  −3; −3  3] c[1; −1] = 12c² > 0 for c ≠ 0

The points x* = (√3, √3) and x* = (−√3, −√3) satisfy the sufficiency conditions. They are therefore isolated local minimum points, as was observed graphically in Example 4.31 and Fig. 4-21. We see for this example that ∇²L is not positive definite, but x* is still an isolated minimum point.

Note that since f is continuous and the feasible set is closed and bounded, existence of a global minimum is guaranteed by the Weierstrass Theorem 4.1. Also, we have examined every possible point satisfying the necessary conditions. Therefore, we must conclude by elimination that x* = (√3, √3) and x* = (−√3, −√3) are global minimum points. The value of the cost function at both points is f(x*) = −3.
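These computations are easy to reproduce numerically. A small sketch (NumPy assumed) at x* = (√3, √3), u* = 1/2 confirms that ∇²L is only positive semidefinite, that d = (1, −1) is tangent to the active constraint, and that Q = 12 on that subspace:

```python
import numpy as np

# Numerical verification of Example 5.5 at x* = (sqrt(3), sqrt(3)), u* = 1/2
# (the point (-sqrt(3), -sqrt(3)) is symmetric).
s = np.sqrt(3.0)
x_star, u_star = np.array([s, s]), 0.5

Hf = np.array([[2.0, -3.0], [-3.0, 2.0]])   # Hessian of the cost f
Hg = 2.0 * np.eye(2)                        # Hessian of the constraint g
HL = Hf + u_star * Hg                       # Hessian of the Lagrangian

# HL = [[3, -3], [-3, 3]] is only positive semidefinite (eigenvalues 0 and 6),
# so Theorem 5.3 does not apply.
eigs = np.linalg.eigvalsh(HL)

# Directions tangent to the active constraint: grad g . d = 0 gives d = c(1, -1)
grad_g = 2.0 * x_star
dvec = np.array([1.0, -1.0])                # take c = 1

Q = dvec @ HL @ dvec                        # Eq. (5.12): Q = 12 c^2 > 0
f_val = x_star @ x_star - 3.0 * x_star[0] * x_star[1]
print(eigs, grad_g @ dvec, Q, f_val)        # Q = 12.0, f(x*) ≈ -3
```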

EXAMPLE 5.6

Check for Sufficient Conditions

Consider Example 4.32: Minimize f(x₁, x₂) = x₁² + x₂² − 2x₁ − 2x₂ + 2 subject to g₁ = −2x₁ − x₂ + 4 ≤ 0, g₂ = −x₁ − 2x₂ + 4 ≤ 0. Check the sufficiency condition for the candidate minimum point.

Solution. The KKT necessary conditions are satisfied for the point

(a) x₁* = 4/3, x₂* = 4/3, u₁* = 2/9, u₂* = 2/9

Since all the constraint functions are linear, the feasible set S is convex. The Hessian of the cost function is positive definite, so the cost function is strictly convex and the problem is convex. By Theorem 4.11, the point x₁* = 4/3, x₂* = 4/3 satisfies the sufficiency conditions for a global minimum, with cost function value f(x*) = 2/9.

Note that local sufficiency cannot be shown by the method of Theorem 5.2. The reason is that the conditions of Eq. (5.11) give two equations in two unknowns:

(b) −2d₁ − d₂ = 0, −d₁ − 2d₂ = 0

This is a homogeneous system of equations with a nonsingular coefficient matrix. Therefore, its only solution is d₁ = d₂ = 0. Thus, we cannot find a d ≠ 0 for use in the condition of Eq. (5.12), and Theorem 5.2 cannot be used. However, we have seen in the foregoing and in Fig. 4-22 that the point is actually an isolated global minimum point. Since this is a two-variable problem with two active inequality constraints at the KKT point, the condition for a local minimum discussed earlier (no remaining design degrees of freedom) is satisfied.
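Both conclusions, that the KKT conditions hold at x* with both constraints active and that the coefficient matrix in (b) is nonsingular, can be confirmed numerically. A small sketch (NumPy assumed):

```python
import numpy as np

# Numerical verification of Example 5.6 at x* = (4/3, 4/3), u* = (2/9, 2/9).
x_star = np.array([4/3, 4/3])
u_star = np.array([2/9, 2/9])

f = lambda x: x[0]**2 + x[1]**2 - 2*x[0] - 2*x[1] + 2
grad_f = 2*x_star - 2
A = np.array([[-2.0, -1.0],                 # gradient of g1
              [-1.0, -2.0]])                # gradient of g2
g_vals = A @ x_star + 4                     # g1(x*), g2(x*): both zero (active)

# KKT stationarity: grad f + u1 grad g1 + u2 grad g2 = 0
residual = grad_f + A.T @ u_star

# Eq. (5.11) asks for d with A d = 0; A is nonsingular, so only d = 0 exists
detA = np.linalg.det(A)
print(g_vals, residual, f(x_star), detA)
```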


https://www.sciencedirect.com/science/article/pii/B9780120641550500057

The radius of robust feasibility of uncertain mathematical programs: A Survey and recent developments

M.A. Goberna , ... J. Vicente-Pérez , in European Journal of Operational Research, 2022

5 RRF of uncertain convexly constrained programs under affine perturbations

We now consider, as in uncertain CP, a convex constraint system posed in Rⁿ of the form

σ_CP = {g_i(x) ≤ 0, i = 1, …, m},

where g_i : Rⁿ → R is an uncertain convex function for i ∈ I = {1, …, m}. We also assume that the pattern-set U ⊂ (Rⁿ⁺¹)ᵐ is the Cartesian product ∏_{i∈I} U_i of m convex sets U_i ⊂ Rⁿ⁺¹ such that U_i ≠ ∅ for all i ∈ I. We denote by u_i the i-th component of u ∈ U, i.e., u = (u₁, …, u_m). Regarding g, whose i-th component is g_i, we assume the existence of a convex function ḡ_i : Rⁿ → R (the nominal i-th constraint function) such that the uncertainty of g_i is captured by the expression

(18) g_i(x, u) := ḡ_i(x) + u_iᵀ(x, −1), u_i ∈ U_i, x ∈ Rⁿ.

For any i ∈ I we can pick a point (ā_i, b̄_i) ∈ int U_i. Defining V_i = U_i − (ā_i, b̄_i), (18) becomes

g_i(x, u) := h̄_i(x) + v_iᵀ(x, −1), v_i ∈ V_i, x ∈ Rⁿ,

where the function h̄_i(x) := ḡ_i(x) + ā_iᵀx − b̄_i is convex and 0_{n+1} ∈ V_i for all i ∈ I. The formulas for the RRF below involve the epigraphs of the conjugate functions of the constraints. Recall that ḡ_i*(x*) := sup_{x∈Rⁿ} {(x*)ᵀx − ḡ_i(x)} is the conjugate of ḡ_i, while epi ḡ_i* = {(x*, r) ∈ Rⁿ⁺¹ : ḡ_i*(x*) ≤ r} is its epigraph. Accordingly, h̄_i*(x*) = b̄_i + ḡ_i*(x* − ā_i) and epi h̄_i* = epi ḡ_i* + (ā_i, b̄_i).

Hence, we can assume without loss of generality that (18) holds with 0_{n+1} ∈ U_i for all i ∈ I. In the same way, if int U_i ≠ ∅ for all i ∈ I, we can assume without loss of generality that (18) holds with 0_{n+1} ∈ int U_i for all i ∈ I.
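The epigraph identity epi h̄_i* = epi ḡ_i* + (ā_i, b̄_i) can be sanity-checked numerically. The sketch below uses a hypothetical one-dimensional nominal constraint ḡ(x) = x², whose conjugate is ḡ*(y) = y²/4, and an arbitrary shift (ā, b̄) = (0.7, 0.3); it verifies the equivalent pointwise identity h̄*(y) = ḡ*(y − ā) + b̄ by brute-force grid maximization.

```python
import numpy as np

# Check h*(y) = g*(y - a) + b for g(x) = x^2, h(x) = g(x) + a*x - b,
# with hypothetical shift a = 0.7, b = 0.3. This is the pointwise form of
# epi h* = epi g* + (a, b).
a_bar, b_bar = 0.7, 0.3

def conjugate(fun, y, grid=np.linspace(-50.0, 50.0, 200001)):
    # numerical conjugate: sup_x { y*x - fun(x) } over a fine grid
    return np.max(y * grid - fun(grid))

g = lambda x: x**2
h = lambda x: g(x) + a_bar * x - b_bar

max_err = 0.0
for y in [-2.0, -0.5, 0.0, 1.3, 3.0]:
    lhs = conjugate(h, y)
    rhs = conjugate(g, y - a_bar) + b_bar   # i.e. g*(y - a) + b
    max_err = max(max_err, abs(lhs - rhs))
print(max_err)   # tiny discretization error
```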

Thus, the parameterized robust counterpart of σ C P is the convex system posed in R n

σ_CP^α := {ḡ_i(x) + a_iᵀx ≤ b_i, (a_i, b_i) ∈ αU_i, i ∈ I},

whose solution set is

F_CP^α = {x ∈ Rⁿ : ḡ_i(x) + a_iᵀx − b_i ≤ 0 for all (a_i, b_i) ∈ αU_i, i ∈ I}.

Since epi(ḡ_i + a_iᵀ(·) − b_i)* = epi ḡ_i* + (a_i, b_i), by Dinh, Goberna, and López (2006, Theorem 3.1),

(19) F_CP^α ≠ ∅ ⟺ (0_n, −1) ∉ cl cone{∪_{i∈I} (epi ḡ_i* + αU_i)}.

We assume F_CP^0 ≠ ∅, that is, (0_n, −1) ∉ cl cone{∪_{i∈I} epi ḡ_i*}.

The RRF of σ C P is

ρ_CP := sup{α ∈ R₊ : F_CP^α ≠ ∅}.

We associate with σ C P satisfying ( A 1 ) the epigraphical set

E ( g ¯ ) : = conv ( i I epi g ¯ i * ) ,

where g ¯ : = ( g ¯ 1 , , g ¯ m ) .

If ḡ_i(x) = ā_iᵀx − b̄_i for all i ∈ I, since ḡ_i* = b̄_i + δ_{ā_i}, where δ_{ā_i} denotes the indicator function of {ā_i} (i.e., δ_{ā_i}(x) = 0 if x = ā_i and +∞ otherwise), one has

E(ḡ) = conv(∪_{i∈I} {ā_i} × [b̄_i, +∞[) = E(ā, b̄),

the epigraphical set defined in (5).

We first consider the RRF of σ C P under the interiority assumption ( A 1 ) .

Proposition 17

(Two exact formulas for ρ_CP under (A₁)) ((Chen et al., 2020, Corollaries 3.1 and 3.2) and (Li & Wang, 2018, Theorem 3.1, Corollary 3.2)). Under (A₁),

ρ_CP = inf_{(a,b)∈E(ḡ)} φ_Z(a, b).

If, additionally, Z is symmetric,

(20) ρ_CP = dist(0_{n+1}, E(ḡ)) = inf_{δ∈Δ_{n+2}, (a^k,b^k)∈∪_{i∈I} epi ḡ_i*} φ_Z(∑_{k=1}^{n+2} δ_k a^k, ∑_{k=1}^{n+2} δ_k b^k).

In some simple cases, (20) allows one to compute ρ_CP by solving optimization problems:

If Z = {z ∈ Rⁿ⁺¹ : zᵀM⁻¹z ≤ 1}, with M a positive definite symmetric (n+1) × (n+1) matrix, then, by Chen et al. (2020, Corollary 3.3(i)),

(21) ρ_CP = inf_{(a,b)∈E(ḡ)} √((a, b)ᵀ M⁻¹ (a, b)).

If Z = B_{n+1}, the Euclidean unit ball, then by (21),

(22) ρ_CP = inf_{(a,b)∈E(ḡ)} √(b² + ∑_{k=1}^n a_k²).

If Z is the ℓ₁ unit ball {z ∈ Rⁿ⁺¹ : ∑_{i=1}^{n+1} |z_i| ≤ 1}, by Chen et al. (2020, Corollary 3.3(iii)),

(23) ρ_CP = inf_{(a,b)∈E(ḡ)} {|b| + ∑_{k=1}^n |a_k|}.

If Z is the ℓ∞ unit ball {z ∈ Rⁿ⁺¹ : |z_i| ≤ 1, i = 1, …, n+1}, by Chen et al. (2020, Corollary 3.3(iv)),

(24) ρ_CP = inf_{(a,b)∈E(ḡ)} max{|b|, |a_k| : k = 1, …, n}.

However, the above formulas (22)–(24) do not provide tractable optimization problems for ρ_CP when not all constraints are linear, because E(ḡ) is seldom polyhedral.
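In the linear case, by contrast, the infima are simple. A toy sketch with a single hypothetical linear nominal constraint ḡ(x) = āᵀx − b̄, ā = (3, 4), b̄ = 2: here epi ḡ* = {ā} × [b̄, +∞[, so E(ḡ) is a ray and each of (22)–(24) reduces to a one-dimensional minimization.

```python
import numpy as np

# Toy illustration of (22)-(24) for one linear nominal constraint with
# hypothetical data a = (3, 4), b = 2: E(g) = {(3, 4, t) : t >= 2}.
a_bar = np.array([3.0, 4.0])
b_bar = 2.0
ts = np.linspace(b_bar, 50.0, 500001)            # discretized ray (t >= b)

rho_l2   = np.min(np.sqrt(ts**2 + np.sum(a_bar**2)))              # formula (22)
rho_l1   = np.min(np.abs(ts) + np.sum(np.abs(a_bar)))             # formula (23)
rho_linf = np.min(np.maximum(np.abs(ts), np.max(np.abs(a_bar))))  # formula (24)

print(rho_l2, rho_l1, rho_linf)   # sqrt(29) ≈ 5.385, 9.0, 4.0
```

All three infima are attained at the ray's endpoint t = b̄ = 2, as the discretized minimization confirms.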

In order to check the positivity of ρ_CP, one has to decide whether or not the nominal convex constraint system satisfies the Slater condition. This can be done by maximizing a linear function under convex constraints, i.e., by solving a certain CP problem.

Proposition 18

(Positivity of ρ_CP under (A₁)) ((Chen et al., 2020, Proposition 3.1) and (Li & Wang, 2018, Theorem 3.5)). Assume that (A₁) holds. Then,

ρ_CP > 0 ⟺ sup_{x∈Rⁿ} min_{i∈I} (−ḡ_i(x)) > 0 ⟺ sup_{x∈Rⁿ, y∈R} {y : ḡ_i(x) + y ≤ 0, i ∈ I} > 0.
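This Slater-type criterion is easy to approximate numerically. The sketch below uses hypothetical nominal constraints ḡ₁(x) = x₁² + x₂² − 4 and ḡ₂(x) = x₁ + x₂ − 1 and a coarse grid search in place of a convex solver; a positive value of sup_x min_i(−ḡ_i(x)) certifies ρ_CP > 0.

```python
import numpy as np

# Slater check for a hypothetical nominal system g1(x) = x1^2 + x2^2 - 4,
# g2(x) = x1 + x2 - 1, via a grid approximation of sup_x min_i(-g_i(x)).
xs = np.linspace(-3.0, 3.0, 601)
X1, X2 = np.meshgrid(xs, xs)
g1 = X1**2 + X2**2 - 4.0
g2 = X1 + X2 - 1.0

margin = np.max(np.minimum(-g1, -g2))   # approximate Slater margin
print(margin > 0)                       # True -> Slater holds, so rho_CP > 0
```

In practice one would solve the CP sup{y : ḡ_i(x) + y ≤ 0, i ∈ I} with a convex solver; the grid search is only a self-contained illustration.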

The next result combines (19) and (20).

Proposition 19

(Attainment of ρ C P under ( A 1 ) ). Assume that ( A 1 ) holds with Z being symmetric. Then, ρ C P is attained if and only if

(0_n, −1) ∉ cl cone{(∪_{i∈I} epi ḡ_i*) + dist(0_{n+1}, E(ḡ)) Z}.

As in previous sections, we consider scaled CP, i.e., the counterparts of the above propositions when ( A 2 ) holds instead of ( A 1 ) . The parameterized scaled robust solution set is now

F̃_CP^α = {x ∈ Rⁿ : ḡ_i(x) + a_iᵀx − b_i ≤ 0 for all (a_i, b_i) ∈ αλ_iZ, i ∈ I} = {x ∈ Rⁿ : λ_i⁻¹ḡ_i(x) + c_iᵀx − d_i ≤ 0 for all (c_i, d_i) ∈ αZ, i ∈ I}.

We associate with σ C P satisfying ( A 2 ) the epigraphical set

E ( g ¯ , λ ) : = conv ( i I epi ( λ i 1 g ¯ i ) * ) .

It is easy to see that (λ_i⁻¹ḡ_i)*(x*) = λ_i⁻¹ḡ_i*(λ_i x*), so that epi(λ_i⁻¹ḡ_i)* = λ_i⁻¹ epi ḡ_i*. Thus, we can write

E(ḡ, λ) = conv(∪_{i∈I} λ_i⁻¹ epi ḡ_i*).

Corollary 20

(Two formulas for ρ C P under ( A 2 ) ). If ( A 2 ) holds, the RRF of σ C P is

ρ C P = inf ( a , b ) E ( g ¯ , λ ) ϕ Z ( a , b ) .

If, additionally, Z is symmetric, then ρ C P = dist ( 0 n + 1 , E ( g ¯ , λ ) ) .

Since the Slater condition holds for the nominal system {ḡ_i(x) ≤ 0, i ∈ I} if and only if it holds for {λ_i⁻¹ḡ_i(x) ≤ 0, i ∈ I}, Proposition 18 remains valid under assumption (A₂).

Corollary 21

(Attainment of ρ C P under ( A 2 ) ). Assume that ( A 2 ) holds with Z being symmetric. Then, ρ C P is attained if and only if

(0_n, −1) ∉ cl cone{(∪_{i∈I} λ_i⁻¹ epi ḡ_i*) + dist(0_{n+1}, E(ḡ, λ)) Z}.

The following assumption is an extension of the interiority assumption ( A 2 ) introduced in Section 2 as the uncertainty sets U i are no longer required to be coincident up to scaling:

( A 4 )

For each i I there exists a compact convex set Z i R n + 1 such that 0 n + 1 int Z i and U i = Z i .

Assumption ( A 4 ) holds in convex programs with deterministic objective function and uncertain constraints whose uncertainty sets are closed balls for different norms, e.g., polyhedral and non-polyhedral balls, in which case they cannot be nonnegative multiples of a unique convex body.

Proposition 22

(Lower and upper bounds for ρ_CP under (A₄)) (Chen et al., 2020, Theorem 3.1). Under (A₄), the following inequalities hold:

inf_{(a,b)∈E(ḡ)} inf_{i∈I} φ_{Z_i}(a, b) ≤ ρ_CP ≤ inf_{(a,b)∈E(ḡ)} sup_{i∈I} φ_{Z_i}(a, b).

Remark 23

The admissible perturbations of each constraint function ḡ_i in σ_CP are not linear in Goberna et al. (2016), so they are not covered by the above results. More precisely, the admissible perturbations of ḡ_i are sums of nonnegative combinations of the m constraint functions with affine functions, so that each perturbed function is convex. Even though the interior of the uncertainty pattern-set U does not contain the zero vector, in the same vein as (A₃), Goberna et al. (2016, Theorem 3.1) provides an exact formula for ρ_CP computable by solving a suitable tractable optimization problem.


https://www.sciencedirect.com/science/article/pii/S0377221721003659