Gradient Ascent: Finding the Maximum of a Scalar Function

In summary: on an entrance exam the poster was asked to show that, for a scalar function whose gradient is [tex]\nabla f=\left(\frac {\partial f \left(x\right)} {\partial x_1},\frac {\partial f \left(x\right)} {\partial x_2},...,\frac {\partial f \left(x\right)} {\partial x_n}\right),[/tex] the change [tex]f\left(x+\Delta x\right)[/tex] produced by a small step [tex]\Delta x[/tex] is largest when [tex]\Delta x[/tex] lies along [tex]\nabla f\left(x\right)[/tex]. The poster could not complete the proof and asks for help here.
  • #1
gaganaut
hi all,

A couple of months ago I took an entrance exam in which this problem appeared (I hope I have remembered it correctly).

For a scalar function [tex]f\left(x\right)=f\left(x_{1},x_{2},...,x_{n}\right)[/tex] the gradient is given as
[tex]\nabla f=\left(\frac {\partial f \left(x\right)} {\partial x_1},\frac {\partial f \left(x\right)} {\partial x_2},...,\frac {\partial f \left(x\right)} {\partial x_n}\right)[/tex]
Then show that for any small change [tex]\Delta x[/tex], [tex]f\left(x+\Delta x\right)[/tex] is maximum if [tex]\Delta x[/tex] lies along [tex]\nabla f\left(x\right)[/tex].

Frankly, I did not understand the question at the time, so I only made a few preliminary steps, as follows.
[tex]f\left(x+\Delta x\right)=f\left(x\right)+\nabla f\left(x\right)\cdot\Delta x + O\left(\left(\Delta x\right)^2\right)[/tex]

Hence,
[tex]\nabla f\left(x+\Delta x\right)=\nabla f\left(x\right)+\nabla \left(\nabla f\left(x\right)\cdot\Delta x\right) [/tex]

For [tex]f\left(x+\Delta x\right)[/tex] to be a maximum, I required [tex]\nabla f\left(x+\Delta x\right)=0[/tex].

Hence,
[tex]\nabla f\left(x\right)=-\nabla \left(\nabla f\left(x\right)\cdot\Delta x\right) [/tex]

This is where I gave up. By intuition I could see that the dot product would be largest if [tex]\Delta x[/tex] lies along [tex]\nabla f\left(x\right)[/tex], but I could not prove it.

So can somebody please help me with this? The question has been haunting me for the last two months.

Thanks in advance.
 
  • #2
Let [tex]g_{n}(t)=f(x_{1}+tn_{1},x_{2}+tn_{2},...,x_{n}+tn_{n})[/tex]
where [tex]\vec{n}=(n_{1},n_{2},...,n_{n})[/tex] is an arbitrary unit vector.

Thus, [itex]g_{n}(t)[/itex] measures how f changes as we move along the direction [itex]\vec{n}[/itex].

At t=0, we are at the point [tex]\vec{x}_{0}=(x_{1},x_{2}...,x_{n})[/tex]

Now, finding the direction of maximal growth of f at [itex]\vec{x}_{0}[/itex] amounts to maximizing [tex]\frac{dg_{n}}{dt}\Big|_{t=0},[/tex] considered as a function of [itex]\vec{n}[/itex].

By the chain rule, this transforms to finding the maximum of:
[tex]\frac{dg_{n}}{dt}\Big|_{t=0}=\sum_{i=1}^{n}\frac{\partial f}{\partial x_{i}}\left(\vec{x}_{0}\right)n_{i}=\nabla f\left(\vec{x}_{0}\right)\cdot\vec{n}[/tex]

By the Cauchy-Schwarz inequality, [itex]\nabla f\left(\vec{x}_{0}\right)\cdot\vec{n}\le\left\|\nabla f\left(\vec{x}_{0}\right)\right\|[/itex] for every unit vector [itex]\vec{n}[/itex], with equality exactly when [itex]\vec{n}[/itex] points along the gradient. Hence [itex]\vec{n}[/itex] must be parallel to the gradient in order to maximize the expression, which proves the result.
 
  • #3
Thanks for the reply

Thank you so much for the reply; I really appreciate it. We can now consider this question resolved.
 

1. What is Gradient Ascent?

Gradient Ascent is an optimization algorithm used in machine learning and artificial intelligence to find the maximum value of a mathematical function by following the direction of the steepest ascent.

2. How does Gradient Ascent work?

Gradient Ascent starts with an initial guess for the point that maximizes the function and then iteratively updates that guess by moving in the direction of the gradient (the direction of steepest ascent) at the current point. This process is repeated until the updates become negligibly small and the algorithm has converged to a (local) maximum, as in the sketch below.
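
To make the update rule concrete, here is a minimal sketch of gradient ascent in Python. The objective f, its gradient, the starting point, and the step size are all made up for illustration; they are not taken from the thread above.

[code]
import numpy as np

def f(v):
    # Example objective: a downward-opening paraboloid with its maximum 4.0 at the origin.
    x, y = v
    return -(x**2 + y**2) + 4.0

def grad_f(v):
    # Analytic gradient of f.
    x, y = v
    return np.array([-2.0 * x, -2.0 * y])

def gradient_ascent(grad, x0, learning_rate=0.1, tol=1e-8, max_iters=10000):
    # Repeatedly step in the direction of the gradient until the step is negligibly small.
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iters):
        step = learning_rate * grad(x)
        x = x + step
        if np.linalg.norm(step) < tol:
            break
    return x

x_max = gradient_ascent(grad_f, x0=[3.0, -2.0])
print(x_max, f(x_max))  # approaches (0, 0) and 4.0
[/code]

The step size (learning rate) controls how far each update moves: too large and the iterates can overshoot or diverge, too small and convergence becomes slow.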

3. What is the difference between Gradient Ascent and Gradient Descent?

The main difference between Gradient Ascent and Gradient Descent is the direction of movement. Gradient Ascent moves in the direction of the steepest ascent, while Gradient Descent moves in the direction of the steepest descent. In other words, Gradient Ascent finds the maximum value of a function, while Gradient Descent finds the minimum value.
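
In code, the only change is the sign of the update step. A minimal illustration (the names lr and grad are placeholders, not from any particular library):

[code]
def ascent_step(x, grad, lr=0.1):
    # One gradient-ascent update: move uphill, toward a maximum.
    return x + lr * grad(x)

def descent_step(x, grad, lr=0.1):
    # One gradient-descent update: move downhill, toward a minimum.
    return x - lr * grad(x)
[/code]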

4. When is Gradient Ascent used?

Gradient Ascent is commonly used in machine learning and artificial intelligence for tasks such as optimizing neural networks and finding the maximum likelihood estimate for a probabilistic model.

5. What are some potential challenges of using Gradient Ascent?

One potential challenge of using Gradient Ascent is that it can get stuck in a local maximum, where the algorithm converges to a peak that is not the global maximum (see the sketch below). Additionally, the algorithm may struggle with high-dimensional data and require many iterations to converge.
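
As a rough illustration of the local-maximum issue, the following sketch uses a made-up one-dimensional function with two peaks; which peak gradient ascent reaches depends entirely on the starting point.

[code]
def g(x):
    # Two-peaked example: local maximum near x = -0.69, global maximum near x = 1.44.
    return -x**4 + x**3 + 2 * x**2

def grad_g(x):
    return -4 * x**3 + 3 * x**2 + 4 * x

def ascend(x, lr=0.01, steps=2000):
    for _ in range(steps):
        x = x + lr * grad_g(x)
    return x

print(ascend(-2.0))  # converges to the smaller, local peak near -0.69
print(ascend(2.0))   # converges to the global peak near 1.44
[/code]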
