


Unconstrained Optimization: Numerical Methods

We've been investigating analytical techniques to solve unconstrained multivariable optimization problems. In many problems (most real problems!), it is quite difficult to find the stationary or critical points, and then use them to determine the nature of the stationary point. In this section, we'll discuss numerical techniques to maximize or minimize a multivariable function.

Gradient Search Methods

We want to solve the unconstrained nonlinear programming problem (NLP) given in Equation 4.1 above. Calculus tells us that if a function f is concave, then the optimal solution, if there is one, will occur at a stationary point x*; i.e., at a point x* where ∇f(x*) = 0. Finding the stationary points in many problems is quite difficult. The methods of Steepest Ascent (maximization) and Steepest Descent (minimization) offer an alternative by finding an approximation to a stationary point. We'll focus on the gradient method, Steepest Ascent. Examine the function shown in Figure 4.9.

FIGURE 4.9: A Surface Defined by a Function

We want to find the maximum point on the surface. If we started at the bottom of the hill, we might proceed by finding the gradient vector, since the gradient is the vector that points "up the hill" in the direction of maximum increase. The gradient vector is defined as

∇f(x) = (∂f/∂x₁, ∂f/∂x₂, …, ∂f/∂xₙ).
(The symbol ∇ is called "del" or "nabla".[1]) If we were lucky, the gradient would point all the way to the maximum of the function, but the contours of functions do not always cooperate; actually, they rarely do. The gradient "points uphill," but for how far? We need to find the distance to travel along the line given by the gradient that maximizes the height of the function in that direction. From that new point, we compute a new gradient vector to find a new direction that "points uphill" in the direction of maximum increase. Continue this method until reaching the "top of the hill," the maximum of f. To summarize: from a given starting point, we move in the direction of the gradient as long as f's value continues to increase. At that point, recalculate the gradient and move in the new direction as far as f continues to improve. This process continues until we achieve a maximum value within some specified tolerance (margin of acceptable error). The algorithm for the Method of Steepest Ascent using the gradient is:

Method of Steepest Ascent Algorithm

Find the unconstrained maximum of the function f : ℝⁿ → ℝ.

INPUTS: Function: f; Starting point: x₀; Tolerance: ε
OUTPUTS: Maximal point x* and maximum value f(x*)

Step 1. Initialize: Set x = x₀.
Step 2. Calculate the gradient g = ∇f(x).
Step 3. Compute the maximum t* of the one-variable function φ(t) = f(x + t·g).
Step 4. Find the new point x_new = x + t*·g = x + t*·∇f(x).
Step 5. If ‖x − x_new‖ < ε OR ‖∇f(x)‖ < ε, then STOP and return the estimates x* = x_new and f(x_new). Otherwise, set x = x_new and return to Step 2.

Remember that ‖y‖ = ‖(y₁, y₂, …, yₙ)‖ = √(y₁² + y₂² + ⋯ + yₙ²).

It's time for several examples using the Method of Steepest Ascent.

Example 4.13. Steepest Ascent Example, I.

Maximize f(x) = 2x₁x₂ + 2x₂ − x₁² − 2x₂² to within ε = 0.01. We start with x₀ = (0, 0).

Iteration 1. The gradient of f(x₁, x₂), ∇f, is found using the partial derivatives; f's gradient is the vector ∇f(x₁, x₂) = (2x₂ − 2x₁, 2x₁ + 2 − 4x₂).
Then ∇f(0,0) = (0, 2). From (0,0), we move along (up) the x₂-axis in the direction of (0, 2); that is, along the line L(t) = x₀ + ∇f(x₀)·t = (0,0) + t·(0,2) = (0, 2t). How far do we go? We need to maximize the function φ(t) = f(0, 2t) = 4t − 8t², starting at t = 0, to find how far along the line L(t) to step. This function can be maximized by using any of the one-dimensional search techniques for single-variable optimization from Chapter 3, or by simple calculus: φ′(t) = 4 − 16t = 0 gives t* = 0.25.
Then L(t*) gives the new point x₁ = L(0.25) = (0, 0.5). The magnitude of the difference ‖x₀ − x₁‖ is 0.5, which is not less than our desired tolerance of 0.01 (chosen arbitrarily at the beginning of the solution). Since we are not optimal, we continue. Repeat the calculations from the new point (0, 0.5).

Iteration 2. The gradient vector g = ∇f(0, 0.5) = (1, 0). Now L(t) = (0, 0.5) + t·(1, 0). Again, how far do we travel along L to maximize φ(t) = f(t, 0.5) = −t² + t + 0.5? Simple calculus gives φ′(t) = −2t + 1 = 0, so t* = 0.5.
Then the new point is x₂ = L(0.5) = (0.5, 0.5). The magnitude of the difference ‖x₁ − x₂‖ is 0.5, which is still not less than our desired tolerance of 0.01. The magnitude ‖∇f(x₁)‖ = 1, which is also not within our tolerance of 0.01. We continue to iterate. Maple will continue the iterations for us using the function SteepestAscent, which is in the book's PSMv2 package. Load the package via with(PSMv2). Define the function using vectors.
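For readers who want to see the whole algorithm in one place before handing the iterations to Maple, here is a sketch in Python. The function and parameter names are our own (not the book's PSMv2 routines), and a golden-section search stands in for the calculus line-search step; it assumes φ(t) has a maximizer along each search line.

```python
import math

def steepest_ascent(f, grad, x0, tol=0.01, max_iter=100):
    """Method of Steepest Ascent with a golden-section line search
    (a sketch; the book's Maple routine is SteepestAscent in PSMv2)."""
    x = list(x0)
    for _ in range(max_iter):
        g = grad(x)
        if math.hypot(*g) < tol:                  # ||grad f(x)|| < tolerance
            break
        # Step 3: maximize phi(t) = f(x + t*g); assumes phi is unimodal
        phi = lambda t: f([xi + t * gi for xi, gi in zip(x, g)])
        b = 1.0
        while phi(2 * b) > phi(b):                # bracket the maximizer
            b *= 2
        a, b = 0.0, 2 * b
        invphi = (math.sqrt(5) - 1) / 2           # 1/golden ratio, ~0.618
        while b - a > 1e-8:                       # golden-section search
            c = b - invphi * (b - a)
            d = a + invphi * (b - a)
            if phi(c) < phi(d):
                a = c                             # maximum lies in [c, b]
            else:
                b = d                             # maximum lies in [a, d]
        t = (a + b) / 2
        x_new = [xi + t * gi for xi, gi in zip(x, g)]
        step = math.hypot(*(xn - xo for xn, xo in zip(x_new, x)))
        x = x_new
        if step < tol:                            # ||x - x_new|| < tolerance
            break
    return x

# Example 4.13: f(x) = 2*x1*x2 + 2*x2 - x1^2 - 2*x2^2, starting at (0, 0)
f = lambda x: 2*x[0]*x[1] + 2*x[1] - x[0]**2 - 2*x[1]**2
grad = lambda x: [2*x[1] - 2*x[0], 2*x[0] + 2 - 4*x[1]]
xstar = steepest_ascent(f, grad, [0.0, 0.0])
print(xstar, f(xstar))   # approaches (1, 1) with f near 1
```

The first two iterations reproduce the hand computations above: t* = 0.25 takes (0, 0) to (0, 0.5), and t* = 0.5 takes (0, 0.5) to (0.5, 0.5).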
The syntax of our SteepestAscent function is
Adding a third argument tells SteepestAscent to produce a graph of the path x₀, x₁, ..., x_n. The output of SteepestAscent is a DataFrame containing the generated points, gradients, function values, and step sizes. To see the list of points generated, use SAf['Pts']; to see just the last point and its function value, use SAf['Pts'][-1], SAf['Fcn'][-1]. The multivariable calculus solution is straightforward to compute by solving the system {∂f/∂x₁ = 0, ∂f/∂x₂ = 0} and checking f's Hessian at the critical point.
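For this f, the two stationarity equations are linear, so the calculus solution is easy to check by hand or by machine. A quick sketch (Cramer's rule for the 2×2 system, plus the leading-minor test for negative definiteness):

```python
# Critical point of f(x) = 2*x1*x2 + 2*x2 - x1^2 - 2*x2^2:
# grad f = 0 gives the linear system  -2*x1 + 2*x2 = 0,  2*x1 - 4*x2 = -2
a11, a12, b1 = -2.0, 2.0, 0.0
a21, a22, b2 = 2.0, -4.0, -2.0
det = a11*a22 - a12*a21                  # = 4
x1 = (b1*a22 - a12*b2) / det             # Cramer's rule: x1 = 1
x2 = (a11*b2 - b1*a21) / det             # Cramer's rule: x2 = 1
# The Hessian of f is constant: [[-2, 2], [2, -4]]
h11, h12, h21, h22 = -2.0, 2.0, 2.0, -4.0
neg_definite = h11 < 0 and (h11*h22 - h12*h21) > 0   # leading-minor test
print((x1, x2), neg_definite)            # (1.0, 1.0) True
```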
The Hessian is negative definite, so the point x* = (1, 1) is a maximum with f(x*) = 1. To get a closer approximation with the steepest ascent method, we would make our tolerance smaller. A look at f's contour plot confirms a hill at approximately (1, 1) in Figure 4.10.

FIGURE 4.10: Contour Plot of f

Example 4.14. Steepest Ascent Example, II.

Maximize g(x) = 55x₁ − 4x₁² + 135x₂ − 15x₂² − 100 using the Steepest Ascent method starting at the point (1, 1) to within a tolerance of ε = 0.01. Figure 4.11 provides a visual reference for the maximum.

FIGURE 4.11: Surface Plot of g

The gradient is ∇g(x) = (55 − 8x₁, 135 − 30x₂).

Iteration 1. We begin with x₀ = (1, 1). Then ∇g(x₀) = (47, 105). From (1, 1), we move in the direction of (47, 105). How far do we go along the line L(t) = (1, 1) + t·(47, 105)? We need to maximize the function φ(t) = g(1 + 47t, 1 + 105t).
This function can also be maximized by simple single-variable calculus: φ′(t) = 13234 − 348422t = 0, giving t* ≈ 0.038.
The new point x₁ is found by evaluating L(t*) = x₀ + t*·∇g(x₀): x₁ = (1 + 47t*, 1 + 105t*) ≈ (2.785, 4.988).
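Because g is quadratic, φ(t) = g(1 + 47t, 1 + 105t) is a downward parabola, so t* is just its vertex. A quick check of iteration 1's arithmetic in Python (the expansion coefficients A and B are worked out in the comments):

```python
def g(x1, x2):
    return 55*x1 - 4*x1**2 + 135*x2 - 15*x2**2 - 100

# phi(t) = g(1 + 47t, 1 + 105t) expands to
#   phi(t) = g(1, 1) + (47^2 + 105^2) t - (4*47^2 + 15*105^2) t^2,
# a downward parabola whose vertex is t* = B / (2A).
B = 47**2 + 105**2            # = 13234, which is ||grad g(1,1)||^2
A = 4*47**2 + 15*105**2       # = 174211
t_star = B / (2*A)
x_new = (1 + 47*t_star, 1 + 105*t_star)
print(round(t_star, 4), tuple(round(v, 3) for v in x_new))
# t* ~ 0.038, new point ~ (2.785, 4.988)
```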
Since ‖x₁ − x₀‖ > 0.01, we iterate again.

Iteration 2. Now compute ∇g(x₁) = (32.719, −14.645). We move from (2.785, 4.988) in the direction of (32.719, −14.645). How far do we go along the line L(t) = (2.785, 4.988) + t·(32.719, −14.645)? We need to maximize the function φ(t) = g(2.785 + 32.719t, 4.988 − 14.645t).
This function can also be maximized by simple single-variable calculus: φ′(t) = 1284.9 − 14998.6t = 0, giving t* ≈ 0.0857.
The new point x₂ is found by evaluating L(t*) = x₁ + t*·∇g(x₁): x₂ ≈ (5.589, 3.733).
Since ‖x₂ − x₁‖ > 0.01, we must iterate again. Use our SteepestAscent function to have Maple finish the iterations. Define g as a function of the vector x. The final point x_n is approximately (6.875, 4.500), where g's maximum value is about 392.9. This time, the solution process zigzagged and converged more slowly to an optimal solution, as illustrated in Figure 4.12.

FIGURE 4.12: Zig-Zagging to a Solution

We'll now look at maximizing a transcendental multivariable function where the critical points cannot be found analytically in closed form, but must be approximated numerically.

Example 4.15. Steepest Ascent Example, III.

Maximize h(x) = 2x₁x₂ + 2x₂ − e^x₁ − e^x₂ + 10 using the Steepest Ascent method starting at the point (0.5, 0.5) to within a tolerance of ε = 0.01. Verify that h has a local maximum near (1, 1.25) by looking at a graph. The gradient of h is ∇h(x) = (2x₂ − e^x₁, 2x₁ + 2 − e^x₂).
Iteration 1. We begin with x₀ = (0.5, 0.5). Then ∇h(x₀) = (−0.6487, 1.3513). From (0.5, 0.5), we move in the direction of (−0.6487, 1.3513). How far do we go along the line L(t) = (0.5, 0.5) + t·(−0.6487, 1.3513)? We need to maximize the function φ(t) = h(0.5 − 0.6487t, 0.5 + 1.3513t). Here φ′(t) = 0 cannot be solved in closed form; Maple's fsolve gives t* = 0.2875. Therefore x₁ = L(0.2875) ≈ (0.3135, 0.8885).
Since ‖x₁ − x₀‖ > 0.01, we continue. But let Maple do the work.
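Because h is transcendental, the line search itself must be numerical. The sketch below mimics fsolve by bisecting φ′(t) = ∇h(L(t))·∇h(x₀); bisection and the bracket [0, 1] are our own choices for illustration, relying on φ′ changing sign once on that interval (which holds here since φ is strictly concave).

```python
import math

def h(x1, x2):
    return 2*x1*x2 + 2*x2 - math.exp(x1) - math.exp(x2) + 10

def grad_h(x1, x2):
    return (2*x2 - math.exp(x1), 2*x1 + 2 - math.exp(x2))

# Iteration 1 line search along L(t) = (0.5, 0.5) + t*(g1, g2)
g1, g2 = grad_h(0.5, 0.5)              # approximately (-0.6487, 1.3513)

def dphi(t):
    p1, p2 = grad_h(0.5 + t*g1, 0.5 + t*g2)
    return p1*g1 + p2*g2               # chain rule: phi'(t)

a, b = 0.0, 1.0                        # dphi(0) > 0 > dphi(1)
for _ in range(60):                    # bisection on phi'(t) = 0
    m = (a + b) / 2
    if dphi(m) > 0:
        a = m
    else:
        b = m
t_star = (a + b) / 2
print(round(t_star, 4))                # close to the book's fsolve value 0.2875
```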
Modified Newton's Method

The zigzag pattern we saw in Figure 4.12 shows that steepest ascent doesn't always go directly to an optimum value. The Newton-Raphson[2] iterative root-finding technique, applied to the partial derivatives of the function, provides an alternative numerical search method for an optimum when modified appropriately. Given the right conditions, this numerical method is more efficient and converges faster to an approximate optimal solution: quadratic convergence versus the linear convergence of steepest ascent. Newton's Method for multivariable optimization searches is based on the single-variable root-finding algorithm. Modify the procedure to look for roots of the first derivative rather than roots of the original function:

x_{n+1} = x_n − f′(x_n)/f″(x_n),
and iterate until |x_{n+1} − x_n| < ε for our chosen tolerance ε > 0. Extend the Modified Newton's Method to several variables by using gradients and the Hessian, the matrix of second partial derivatives:

x_{n+1} = x_n − H(x_n)⁻¹ ∇f(x_n),
now iterating until ‖x_{n+1} − x_n‖ < ε for our chosen tolerance ε > 0. Applying a little bit of linear algebra (Cramer's rule) gives us the method as an algorithm.

Modified Newton's Method Algorithm

Find the unconstrained maximum of the function f : ℝ² → ℝ.

INPUTS: Function: f; Starting point: x₀; Tolerance and maximum iterations: ε, N
OUTPUTS: Maximal point x* and maximum value f(x*)

Step 1. Initialize: Set x = x₀.
Step 2. Calculate the gradient g = ∇f(x) and H = Hessian(x).
Step 3. Compute d = |H|, and create new matrices
  H₁ = substitute g for column 1 of H
  H₂ = substitute g for column 2 of H
Step 4. Compute Δx₁ = −|H₁|/d and Δx₂ = −|H₂|/d.
Step 5. Find the new point x_new = (x₁ + Δx₁, x₂ + Δx₂).
Step 6. If ‖(Δx₁, Δx₂)‖ < ε, then STOP and return the estimates x* = x_new and f(x_new). Otherwise, set x = x_new and return to Step 2.

Again, remember that ‖y‖ = ‖(y₁, y₂)‖ = √(y₁² + y₂²).

Let's repeat our examples for the steepest ascent method using Newton's method. We'll use the function ModifiedNewtonMethod, which is in the book's PSMv2 package. Load the package via with(PSMv2). The syntax of the ModifiedNewtonMethod function is ModifiedNewtonMethod(f, (x₀, y₀), ε, N), where f is the function, (x₀, y₀) is the starting vector, ε is the tolerance, and N is the maximum number of iterations allowed. The output of ModifiedNewtonMethod is a DataFrame containing the generated points, function values, Hessians, and the definiteness of the Hessian.

Example 4.16. Modified Newton's Method Example, I.

Maximize f(x) = 2x₁x₂ + 2x₂ − x₁² − 2x₂² to within ε = 0.01 starting at x₀ = (0, 0). Limit the number of iterations to N = 20.
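The two-variable algorithm above can be sketched directly in Python; `modified_newton` and its argument names are ours, not the book's Maple routine. Solving H·Δx = −g by Cramer's rule matches Steps 3 and 4.

```python
def modified_newton(grad, hess, x0, eps=0.01, N=20):
    """Modified Newton's Method for two variables via Cramer's rule
    (a sketch; the book's Maple routine is ModifiedNewtonMethod)."""
    x1, x2 = x0
    for _ in range(N):
        g1, g2 = grad(x1, x2)
        (h11, h12), (h21, h22) = hess(x1, x2)
        d = h11*h22 - h12*h21          # |H|
        d1 = g1*h22 - h12*g2           # |H with g in column 1|
        d2 = h11*g2 - g1*h21           # |H with g in column 2|
        dx1, dx2 = -d1/d, -d2/d        # solves H (dx1, dx2)^T = -g
        x1, x2 = x1 + dx1, x2 + dx2
        if (dx1*dx1 + dx2*dx2) ** 0.5 < eps:
            break
    return x1, x2

# Example 4.16: f(x) = 2*x1*x2 + 2*x2 - x1^2 - 2*x2^2, starting at (0, 0)
grad_f = lambda x1, x2: (2*x2 - 2*x1, 2*x1 + 2 - 4*x2)
hess_f = lambda x1, x2: ((-2, 2), (2, -4))
print(modified_newton(grad_f, hess_f, (0.0, 0.0)))   # (1.0, 1.0)
```

Since f is quadratic, a single Newton step lands exactly on the maximizer (1, 1); the second pass through the loop only confirms that the gradient has vanished.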
We have a maximum of f of 1 at (1, 1), since the Hessian is negative definite there. On to the second example.

Example 4.17. Modified Newton's Method Example, II.

Maximize g(x) = 55x₁ − 4x₁² + 135x₂ − 15x₂² − 100 starting at the point (1, 1) to within a tolerance of ε = 0.01, limiting iterations to N = 20. We have a maximum of g of 392.9 at (6.875, 4.500), since the Hessian is negative definite there. Now, for the third example.

Example 4.18. Modified Newton's Method Example, III.

Maximize h(x) = 2x₁x₂ + 2x₂ − e^x₁ − e^x₂ + 10 starting at the point (0.8, 0.8) to within a tolerance of ε = 0.01 with no more than N = 20 iterations. We have a maximum of h of 8.827 at (1.031, 1.402), since the Hessian is negative definite there.

Even though Newton's method is faster and more direct, we must be cautious. The method requires a relatively good starting point. Try searching for a maximum of h starting at (0.5, 0.5) and looking at just the last entry in the DataFrame report. We started at (0.5, 0.5), not far from our original (0.8, 0.8). But the Hessian being indefinite tells us that we've found an approximate saddle point, not a maximum. Modified Newton's method is very sensitive to the initial value chosen.

Comparisons of Methods

We compared these two routines, finding that Modified Newton's Method converges faster than the gradient method. Table 4.4 shows the comparison.

TABLE 4.4: Comparison of the Steepest Ascent and Modified Newton's Methods
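The sensitivity to the starting point is easy to demonstrate. The sketch below (a compact 2×2 Cramer-solve Newton iteration; the names are our own) runs Example 4.18's h from both starting points and classifies the Hessian at the point found:

```python
import math

h = lambda x1, x2: 2*x1*x2 + 2*x2 - math.exp(x1) - math.exp(x2) + 10
grad_h = lambda x1, x2: (2*x2 - math.exp(x1), 2*x1 + 2 - math.exp(x2))
hess_h = lambda x1, x2: ((-math.exp(x1), 2), (2, -math.exp(x2)))

def newton(x1, x2, eps=0.01, N=20):
    """Compact modified Newton iteration: x <- x - H^(-1) grad h."""
    for _ in range(N):
        g1, g2 = grad_h(x1, x2)
        (a, b), (c, d) = hess_h(x1, x2)
        det = a*d - b*c
        dx1 = -(g1*d - b*g2) / det     # Cramer's rule on H dx = -g
        dx2 = -(a*g2 - g1*c) / det
        x1, x2 = x1 + dx1, x2 + dx2
        if math.hypot(dx1, dx2) < eps:
            break
    return x1, x2

for start in [(0.8, 0.8), (0.5, 0.5)]:
    p = newton(*start)
    (a, b), (c, d) = hess_h(*p)
    kind = "negative definite" if a < 0 and a*d - b*c > 0 else "indefinite"
    print(start, "->", (round(p[0], 3), round(p[1], 3)), kind)
# From (0.8, 0.8) the iteration settles near the maximum (1.031, 1.402);
# from (0.5, 0.5) it converges to a stationary point with an indefinite
# Hessian -- a saddle, as the text warns.
```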
As a final comparison, add the modified Newton's method points to the contour map showing the steepest ascent points for g of Figure 4.12 to obtain Figure 4.13 below.

FIGURE 4.13: Steepest Ascent and Modified Newton's Method Solutions

When given a good starting value, a Modified Newton's method search is much faster and more direct than a steepest ascent gradient search.

Exercises

1. Maximize f(x) = 2x₁x₂ − 2x₁² − 2x₂² to within a tolerance of 0.1.
a. Start at the point x = (1, 1). Perform 2 complete iterations of the steepest ascent gradient search. For each iteration, clearly show x_n, x_{n+1}, ∇f(x_n), and t*. Justify that the process will eventually find the approximate maximum.
b. Use Newton's method to find the maximum, starting at x = (1, 1). Clearly show x_n, x_{n+1}, ∇f(x_n), and the Hessian for each iteration. Indicate when the stopping criterion is achieved.

2. Maximize f(x) = 3x₁x₂ − 4x₁² − 4x₂² to within a tolerance of 0.1.
a. Start at the point x = (1, 1). Perform 2 complete iterations of the steepest ascent gradient search. For each iteration, clearly show x_n, x_{n+1}, ∇f(x_n), and t*. Justify that the process will eventually find the approximate maximum.
b. Use Newton's method to find the maximum, starting at x = (1, 1). Clearly show x_n, x_{n+1}, ∇f(x_n), and the Hessian for each iteration. Indicate when the stopping criterion is achieved.

3. Apply the modified Newton's method to find the following:
a. Maximize f(x, y) = −x³ + 3x + 84y − 6y² starting with the initial value (1, 1). Why can't we start at (0, 0)?
b. Minimize f(x, y) = −4x + 4x² − 3y + y² starting at (0, 0).
c. Perform 3 modified Newton's method iterations to minimize f(x, y) = (x − 2)⁴ + (x − 2y)² starting at (0, 0). Why is this problem not converging as quickly as b?

4. Use a gradient search to approximate the minimum of f(x) = (x₁ − 2)² + x₁ + x₂². Start at (2.5, 1.5).

Projects

Project 4.1.
Modify the Steepest Ascent Algorithm (page 147) to approximate the minimum of a function. This technique is called the Steepest Descent Algorithm.

Project 4.2. Write a program in Maple that uses the one-dimensional Golden Section search algorithm, instead of calculus, to perform the iterations of a gradient search. Use your code to find the maximum of
Project 4.3. Write a program in Maple that uses the one-dimensional Fibonacci search algorithm, instead of calculus, to perform the iterations of a gradient search. Use your code to find the maximum of

References and Further Reading

[B2016] William C. Bauldry, Introduction to Computation, 2016.
[BSS2013] Mokhtar S. Bazaraa, Hanif D. Sherali, and Chitharanjan M. Shetty, Nonlinear Programming: Theory and Algorithms, Wiley, 2013.
[F1992] William P. Fox, "Teaching nonlinear programming with Minitab," COED Journal, Vol. II (1992), no. 1, 80–84.
[F1993] William P. Fox, "Using microcomputers in undergraduate nonlinear optimization," Collegiate Microcomputer, Vol. XI (1993), no. 3, 214–218.
[FGMW1987] William P. Fox, Frank Giordano, S. Maddox, and Maurice D. Weir, Mathematical Modeling with Minitab, Brooks/Cole, 1987.
[FR2000] William P. Fox and William Richardson, "Mathematical modeling with least squares using Maple," Maple Application Center, Nonlinear Mathematics, 2000.
[FGW1997] William P. Fox, Frank Giordano, and Maurice Weir, A First Course in Mathematical Modeling, 2nd Ed., Brooks/Cole, 1997.
[FW2001] William P. Fox and Margie Witherspoon, "Single variable optimization when calculus fails: Golden section search methods in nonlinear optimization using Maple," COED Journal, Vol. XI (2001), no. 2, 50–56.
[M2013] Mark M. Meerschaert, Mathematical Modeling, Academic Press, 2013.
[PRS1976] D. T. Phillips, A. Ravindran, and J. Solberg, Operations Research, Wiley, 1976.
[PFTV2007] William H. Press, Brian P. Flannery, Saul A. Teukolsky, and William T. Vetterling, Numerical Recipes, Cambridge U. Press, 2007.
[R1979] S. S. Rao, Optimization: Theory and Applications, Wiley Eastern Ltd., 1979.
