# Why does steepest descent give a wrong search direction?

ymhiq
In summary: with the corrected function, the solution of ##\phi'(\alpha) = -5 + 30\alpha - 24\alpha^2 = 0## gives the extreme points ##\alpha_1 \approx 0.198## and ##\alpha_2 \approx 1.052##, and since ##\phi''(\alpha) = 30 - 48\alpha## is positive only at ##\alpha_1##, the minimum along the search direction is at ##\alpha_1 \approx 0.198##. I am sorry for all the trouble.
ymhiq
1. Homework Statement
I have to minimize the function ##f(x_1,x_2) = (x_1-1)^2 + x_2^3 + x_1 x_2## by the steepest descent method. The initial point is ##[1,\ 1]^T##.

## The Attempt at a Solution

The gradient of this function is ##\nabla f(x_1,x_2) = [2(x_1-1)-x_2,\ 3x_2^2-x_1]##. Evaluated at the initial point, ##\nabla f(1,1) = [-1,\ 2]##. Following the steepest descent method, I have to minimize ##\phi(\alpha) = f(x_0-\alpha\nabla f(x_0))## in order to find the value of ##\alpha##. So ##\phi(\alpha) = -5\alpha + 15\alpha^2 - 8\alpha^3## and ##\phi'(\alpha) = -5 + 30\alpha - 24\alpha^2##. This function has extreme points at ##\alpha_1 = 0.95061## and ##\alpha_2 = 5.094##. For one of them to be a minimum along this direction, ##\phi''(\alpha) = 30 - 48\alpha## has to be positive there. This is my problem: ##\phi''(\alpha)## evaluated at both ##\alpha## values is negative, so neither minimizes along the direction. So what am I doing wrong?
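As a quick sanity check (my own sketch, not part of the thread), one can code the gradient of the function exactly as stated, with a plus sign on the ##x_1 x_2## term:

```python
# Gradient check for f as originally stated:
# f(x1, x2) = (x1 - 1)**2 + x2**3 + x1*x2  (note the PLUS on x1*x2)

def f(x1, x2):
    return (x1 - 1) ** 2 + x2 ** 3 + x1 * x2

def grad_f(x1, x2):
    # partial derivatives computed by hand from f above
    return (2 * (x1 - 1) + x2, 3 * x2 ** 2 + x1)

print(grad_f(1.0, 1.0))  # (1.0, 4.0), not (-1, 2)
```

With the plus sign, the gradient at ##[1,\ 1]^T## comes out ##[1,\ 4]##; the posted ##[-1,\ 2]## only follows if the last term is ##-x_1 x_2##.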

ymhiq said:
The gradient of this function is ##\nabla f(x_1,x_2) = [2(x_1-1)-x_2,\ 3x_2^2-x_1]##. Evaluated at the initial point, ##\nabla f(1,1) = [-1,\ 2]##.

Just inspecting the gradient of the original function f(x1, x2), something doesn't look right.

If you take ##\partial f/\partial x_1##, how did you obtain ##2(x_1-1)-x_2##, specifically the ##-x_2## part? I'm confused, because there are no negative signs between terms in the original definition of ##f(x_1,x_2)##. A similar question arises for what you show as ##\partial f/\partial x_2##.

ymhiq
ymhiq said:
The gradient of this function is ##\nabla f(x_1,x_2) = [2(x_1-1)-x_2,\ 3x_2^2-x_1]##. [...] So what am I doing wrong?

As SteamKing has pointed out, your gradient formula is incorrect, and your initial steepest-descent direction is wrong. However, when you correct these errors, you will obtain a function ##\phi(\alpha) = f(x_0 - \alpha \nabla f(x_0))## that has no stationary points at all. What does that tell you?
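Ray's remark can be verified with a quick check (my own expansion, not from the thread). With the plus sign on ##x_1 x_2##, the gradient at ##[1,\ 1]^T## is ##[1,\ 4]##, and expanding ##\phi(\alpha) = f(1-\alpha,\ 1-4\alpha)## by hand gives ##\phi(\alpha) = 2 - 17\alpha + 53\alpha^2 - 64\alpha^3##, so ##\phi'(\alpha) = -17 + 106\alpha - 192\alpha^2##, a quadratic with negative discriminant:

```python
# discriminant of phi'(alpha) = -17 + 106*alpha - 192*alpha**2
a, b, c = -192.0, 106.0, -17.0
disc = b ** 2 - 4 * a * c
print(disc)  # -1820.0: no real roots, hence no stationary points
```

Since ##\phi'## never vanishes, the line search has no interior minimizer, which is another sign that the stated problem (or the gradient) had a sign error.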

ymhiq
Oh! Excuse me, you are right! I made a mistake when I wrote the original problem. Let me write it again: I have to minimize the function ##f(x_1,x_2) = (x_1-1)^2 + x_2^3 - x_1 x_2##. The initial point is ##[1,\ 1]^T##.

Excuse me, all of you. I finally found the mistake I made: I had solved ##\phi'(\alpha) = -5 + 30\alpha - 24\alpha^2 = 0## incorrectly.
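For the record, here is a small numerical check (mine, not from the thread) of that quadratic's roots together with the second-derivative test:

```python
import math

# phi'(alpha) = -5 + 30*alpha - 24*alpha**2 = 0, i.e. 24a^2 - 30a + 5 = 0
disc = math.sqrt(30 ** 2 - 4 * 24 * 5)   # sqrt(420)
alpha1 = (30 - disc) / (2 * 24)          # ~0.1980
alpha2 = (30 + disc) / (2 * 24)          # ~1.0520
print(alpha1, alpha2)

# second-derivative test: phi''(alpha) = 30 - 48*alpha
print(30 - 48 * alpha1 > 0)   # True  -> alpha1 is the minimizer
print(30 - 48 * alpha2 > 0)   # False -> alpha2 is a maximum
```

The minimizer along the first steepest-descent direction is therefore ##\alpha \approx 0.198##.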

## 1. Why does steepest descent give a wrong search direction?

Steepest descent is an iterative optimization algorithm that searches for a minimum of a function by repeatedly stepping in the direction of the negative gradient. It can appear to give a wrong search direction because it relies entirely on the gradient, i.e. the local slope, at the current point. Where the gradient is small or zero the method stalls, so on functions with several valleys it may get stuck in a local minimum instead of finding the global minimum.
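To illustrate (a toy example of mine, not from the thread): fixed-step gradient descent on the non-convex function ##f(x) = x^4 - x^3 - 2x^2## settles into whichever valley it starts in:

```python
def df(x):
    # derivative of the toy function f(x) = x**4 - x**3 - 2*x**2
    return 4 * x ** 3 - 3 * x ** 2 - 4 * x

x = -1.0                 # start in the basin of the local minimum
for _ in range(1000):
    x -= 0.01 * df(x)    # fixed-step steepest descent
print(round(x, 2))       # -0.69: the local minimum, not the global one near 1.44
```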

## 2. How does steepest descent handle non-convex functions?

Steepest descent is not well suited to non-convex functions because it can easily get stuck in local minima. The algorithm always moves in the locally steepest downhill direction, which does not necessarily lead toward the global minimum of a non-convex function.
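A quick illustration of that dependence on the starting point (my own toy non-convex function, not from the thread): two different initializations end in two different minima:

```python
def df(x):
    # derivative of the toy non-convex function f(x) = x**4 - x**3 - 2*x**2
    return 4 * x ** 3 - 3 * x ** 2 - 4 * x

def descend(x, step=0.01, iters=2000):
    for _ in range(iters):
        x -= step * df(x)
    return x

# the outcome depends entirely on the starting point
print(round(descend(-1.0), 2), round(descend(1.0), 2))  # -0.69 1.44
```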

## 3. What is the alternative to steepest descent for solving non-convex functions?

One alternative to steepest descent for non-convex functions is Newton's method. It takes into account not only the gradient but also the curvature of the function, which lets it take better-scaled steps through difficult landscapes. Note, however, that Newton's method also only finds stationary points, so it does not guarantee the global minimum either.
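A one-dimensional sketch of the Newton update ##x \leftarrow x - f'(x)/f''(x)##, on a toy function of my choosing:

```python
def df(x):
    return 4 * x ** 3 - 3 * x ** 2 - 4 * x    # f'(x) for f(x) = x**4 - x**3 - 2*x**2

def d2f(x):
    return 12 * x ** 2 - 6 * x - 4            # f''(x), the curvature

x = 1.0
for _ in range(50):
    x -= df(x) / d2f(x)    # Newton step: uses curvature, not just slope
print(round(x, 3))         # 1.443, a stationary point where d2f > 0 (a minimum)
```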

## 4. Can steepest descent be improved to handle non-convex functions?

Yes, steepest descent can be improved with a technique called momentum. A momentum term, a decaying accumulation of past gradients, is added to the gradient descent update; it smooths the trajectory and can carry the iterate through shallow regions and past small local minima while the search for a better minimum continues.
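A minimal sketch of the momentum update (the hyper-parameters ##\beta = 0.9## and the step size are illustrative choices of mine, not values from the thread):

```python
def momentum_step(x, v, grad, lr=0.1, beta=0.9):
    # velocity v accumulates a decaying sum of past gradients
    v = beta * v - lr * grad(x)
    return x + v, v

# usage on f(x) = x**2 (gradient 2x): the iterate oscillates but settles at 0
x, v = 5.0, 0.0
for _ in range(500):
    x, v = momentum_step(x, v, lambda t: 2 * t)
print(abs(x) < 1e-6)  # True
```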

## 5. Is steepest descent still useful despite its limitations?

Yes, steepest descent is still a useful optimization algorithm, especially for convex functions. It is relatively simple to implement and computationally efficient. However, for more complex and non-convex functions, other algorithms such as Newton's method or stochastic gradient descent may be more appropriate.
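As a closing illustration of that simplicity, a fixed-step steepest-descent loop on a convex quadratic of my choosing (not the thread's function) converges in a few lines of code:

```python
def grad(x, y):
    # gradient of the convex quadratic f(x, y) = (x - 1)**2 + 2*y**2
    return 2 * (x - 1), 4 * y

x, y = 5.0, 5.0
for _ in range(500):
    gx, gy = grad(x, y)
    x, y = x - 0.1 * gx, y - 0.1 * gy
print(round(x, 6), round(y, 6))  # 1.0 0.0 -- the unique minimum
```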
