Sign confusion when taking gradient (Newton's Method)

In summary, the professor's omission of the negative sign in the gradient is harmless because the same sign is also omitted from the Hesse Matrix. The Newton step uses the product of the inverse Hessian and the gradient, and negating both at once leaves that product unchanged, so the two dropped minus signs cancel. Keeping the negative sign in the gradient alone, without also keeping it in the Hessian, would introduce a genuine sign error.
  • #1
zmalone
I'm watching a lecture on Newton's method in n dimensions, but I'm hung up on why the professor did not use the negative sign when taking the first gradient. Is there a rule that explains this, or something I'm forgetting? The rest makes sense; the part highlighted in red is what I'm confused about. If anyone can clear that up I'd appreciate it, thanks!

Where g(x,y) = 1-(x-1)^4-(y-1)^4

local maximum at (1,1) ; critical point at (1,1)

Gradient of g(x,y):

F(x,y) = [Dg(x,y)]^T = [4(x-1)^3  4(y-1)^3]^T
Why not [-4(x-1)^3  -4(y-1)^3]^T?

Gradient of F(x,y):

DF(x,y) =
[ 12(x-1)^2      0      ]
[     0      12(y-1)^2  ]

Screen shot which is probably easier to read:

Attachments

  • NewtonMethodQuestion.jpg
  • #2
The gradient in Cartesian components is, of course, given by
[tex]\vec{\nabla} g=(\partial_x g,\partial_y g)=(-4(x-1)^3,-4 (y-1)^3)[/tex]
and the Hesse Matrix by
[tex]H_{ij}=\partial_i \partial_j g=\mathrm{diag}(-12(x-1)^2,-12(y-1)^2).[/tex]
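A minimal numerical sketch of the point at issue, using the thread's own g(x,y) = 1-(x-1)^4-(y-1)^4: the Newton step is x_{k+1} = x_k - H^{-1}∇g, so negating both the gradient and the Hessian (as the professor's sign-free versions effectively do) produces exactly the same iterates.

```python
# Newton's method for the critical point of g(x, y) = 1 - (x-1)^4 - (y-1)^4.
# sign=+1 uses the true gradient/Hessian (with minus signs); sign=-1 negates
# both, which is equivalent to the professor's sign-free version.

def newton_step(x, y, sign=1.0):
    gx, gy = sign * (-4*(x-1)**3), sign * (-4*(y-1)**3)      # gradient of g
    hxx, hyy = sign * (-12*(x-1)**2), sign * (-12*(y-1)**2)  # diagonal Hessian
    # x_{k+1} = x_k - H^{-1} grad g  (H is diagonal here, so solve componentwise)
    return x - gx/hxx, y - gy/hyy

def newton(x, y, sign=1.0, iters=60):
    for _ in range(iters):
        x, y = newton_step(x, y, sign)
    return x, y

a = newton(2.0, 3.0, sign=1.0)
b = newton(2.0, 3.0, sign=-1.0)  # identical iterates: the minus signs cancel
```

Because the quartic's critical point is degenerate (the Hessian vanishes at (1,1)), convergence here is linear rather than Newton's usual quadratic rate, but both sign conventions still agree step for step.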
 

1. What is sign confusion in gradient descent?

Sign confusion in gradient descent refers to a sign error in the update step, such as adding the gradient instead of subtracting it, which makes the algorithm move away from the intended minimum rather than toward it. An apparent sign flip can also arise when the step size is too large, so that an update overshoots the minimum and lands where the gradient points back the other way.

2. How does sign confusion affect Newton's Method?

Sign confusion can significantly impact the effectiveness of Newton's Method, because its update is the product of the inverse Hessian and the gradient. Flipping the sign of the gradient alone reverses the step, so the iteration moves away from the critical point or diverges; flipping the sign of both the gradient and the Hessian, as in the thread above, leaves the step unchanged.

3. What causes sign confusion in gradient descent?

Sign confusion can occur due to various reasons, such as a poorly chosen step size, an ill-conditioned Hessian matrix, or a function with multiple local minima. It can also be caused by rounding errors in numerical calculations.

4. How can sign confusion be avoided in gradient descent?

To avoid sign confusion, it is essential to carefully choose the step size and to monitor the objective value to ensure it is consistently decreasing toward the minimum. Additionally, using a line search algorithm or adding momentum to the gradient descent update can help prevent sign confusion.
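One of the safeguards mentioned above, a backtracking (Armijo) line search, can be sketched as follows; the 1-D example function is an assumption for illustration, not from the thread. The search shrinks the step until the function value actually decreases, so an oversized step cannot overshoot and effectively reverse the descent direction.

```python
# Gradient descent with a backtracking (Armijo) line search, 1-D sketch.
# The step t is halved until f decreases by at least c * t * |grad|^2.

def backtracking_descent(f, grad_f, x, step=1.0, beta=0.5, c=1e-4, iters=100):
    for _ in range(iters):
        g = grad_f(x)
        t = step
        # Armijo condition: shrink t until the decrease is sufficient
        while f(x - t*g) > f(x) - c * t * g * g:
            t *= beta
        x = x - t*g
    return x

# Example: f(x) = (x-3)^2 has its minimum at x = 3
xmin = backtracking_descent(lambda x: (x - 3)**2, lambda x: 2*(x - 3), x=10.0)
```

Note that with the full step t = 1 this example would oscillate forever between x and 6 - x; the Armijo test rejects that step and halves it, after which the iterate lands on the minimum.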

5. Can sign confusion be solved in Newton's Method?

While sign confusion cannot be entirely eliminated in Newton's Method, it can be mitigated by using techniques such as a line search or backtracking to adjust the step size and ensure the algorithm is moving in the correct direction. Additionally, switching to a more robust optimization algorithm, such as BFGS, can also help overcome sign confusion.
