## Main Question or Discussion Point

Can someone explain why the gradient of a function is just a vector made up of partial derivatives of the function?

- Thread starter I_laff

- #2

haushofer

Science Advisor

- #3

The main idea comes from functions of one variable, ##f(x)##, where the derivative at the point ##x##, that is ##f'(x)##, is the slope of the tangent line to the graph of ##f## at the point ##(x,f(x))##.

So, suppose we want an operator that takes as input a function ##f(x)## and gives as output a vector such that:

1) the component of the output vector is the slope of the function with respect to that variable (so its magnitude is the absolute value of the slope),

2) the direction of the output vector is along the axis of the variable (in one variable, or one dimension, we have just a single direction, a single line).

Then the operator is just ##\vec{i}\frac{\partial}{\partial x}##.

The most straightforward generalization of this operator to 3 dimensions is the gradient operator, ##\nabla = \vec{i}\frac{\partial}{\partial x} + \vec{j}\frac{\partial}{\partial y} + \vec{k}\frac{\partial}{\partial z}##. Each component of the output vector is now the slope of ##f(x,y,z)## with respect to one variable: the ##\vec{i}## component gives the slope of ##f## with respect to ##x##, the ##\vec{j}## component gives the slope with respect to ##y##, and the ##\vec{k}## component gives the slope with respect to ##z##.
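As a rough numerical illustration of this component-by-component picture, here is a minimal Python sketch. The example function and the helper names `partial` and `gradient` are made up for illustration, and the slopes are only estimated with central differences rather than computed exactly.

```python
import math

def partial(f, point, i, eps=1e-6):
    """Central-difference estimate of the slope of f along coordinate i."""
    up = list(point)
    down = list(point)
    up[i] += eps
    down[i] -= eps
    return (f(*up) - f(*down)) / (2 * eps)

def gradient(f, point):
    """Collect the slope along each coordinate axis into one vector."""
    return [partial(f, point, i) for i in range(len(point))]

# Hypothetical example: f(x, y, z) = x^2 + 3y + sin(z)
def f(x, y, z):
    return x**2 + 3 * y + math.sin(z)

grad = gradient(f, (1.0, 2.0, 0.0))
print(grad)  # close to [2.0, 3.0, 1.0], i.e. (2x, 3, cos z) at the point
```

Each entry is obtained by moving along one axis only, which is exactly the "slope with respect to that variable" described above.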

- #4

fresh_42

Mentor

I think it helps to consider the partial derivatives as the basis of the tangent space. The gradient is then a certain derivative (and derivatives are always directional), expressed in this basis.

- #5

It is a straightforward generalization of the single-variable derivative to a multivariable function. Recall that if ##f:\mathbb{R}\rightarrow\mathbb{R}##, then

$$f'(x) = \lim_{h\to 0} \frac{f(x+h) - f(x)}{h}$$

If we try to generalize this to two-variable functions, we have a problem with the denominator:

$$f'(x,y) = \lim_{(h,k)\to 0} \frac{f(x+h, y+k) - f(x,y)}{(h,k)}$$

The denominator is a displacement vector, and there is no consistent way to divide by a vector. However, what we want to do is divide by the length of the displacement vector, which is what we really mean when we divide by h in 1 dimension too. So let's do that instead:

$$f'(x,y) = \lim_{(h,k)\to 0} \frac{f(x+h, y+k) - f(x,y)}{||(h,k)||}$$

This looks like a derivative. But recall that in 1-dimension, the derivative was defined so that it provided the best linear approximation to f around the point at which it was taken. It was not really a number: it was a tool for finding the correct tangent line.

That is what we want in a multivariable derivative: a tool for finding the best linear approximation to f(x,y) at the point the derivative is applied. In 1 dimension, the derivative was applied to the displacement h to get the linear approximation: ##f(x + h) \approx f(x) + f'(x)\cdot h##.

A linear 2-variable approximation based on the displacement vector (h, k) would then be ##f(x+h, y+k) \approx f(x, y) + u\cdot h + v\cdot k##, where u and v are numbers that depend only on the point (x, y), not on the displacement.

So ##f'(x,y)## somehow has to give us two numbers, u and v, to apply to the displacement vector (h,k) in order to get our linear approximation. We know the dot product gives us that expression, so f'(x,y) must be the vector (u, v). Applying f'(x,y) to a particular displacement vector (h,k) must then be done with the dot product.

But hold on. Our definition doesn't look like it is going to give us a vector:

$$f'(x,y) = \lim_{(h,k)\to 0} \frac{f(x+h, y+k) - f(x,y)}{||(h,k)||}$$

If ##f(x,y)## is a scalar-valued function, then the right side is a single number, not a 2-component vector. So whatever that is, it is not what we want.

To get what we want, mathematicians made a slight adjustment to the definition. Recalling that for single variable functions, we have:

$$f'(x) = \lim_{h\to 0} \frac{f(x+h) - f(x)}{h}$$

we can do some sleight of hand to bring everything onto one side. If ##f'(x)## exists, then ##\lim_{h\to 0} f'(x) = f'(x)##, so

$$\lim_{h\to 0} f'(x) = \lim_{h\to 0} \frac{f(x+h) - f(x)}{h}$$

These two sides are just numbers, so we can do algebra:

$$0 = \lim_{h\to 0} \frac{f(x+h) - f(x)}{h} - \lim_{h\to 0} f'(x)$$

$$0 = \lim_{h\to 0} \frac{f(x+h) - f(x)}{h} - \lim_{h\to 0} \frac{f'(x)\cdot h}{h}$$

$$0 = \lim_{h\to 0} \frac{f(x+h) - f(x) - f'(x)\cdot h}{h}$$

It follows that this is an equivalent definition of the derivative, where we can see the linear approximation that it provides directly in the numerator. This motivates us to try the following definition for multivariable derivatives: ##f'(x,y)## is, at each point ##(x,y)## the unique linear function ##L(h,k)## such that:

$$0 = \lim_{(h,k)\to 0} \frac{f(x+h, y+k) - f(x,y) - L(h,k)}{||(h,k)||}$$

That is, the derivative at each point is the unique linear function whose error, ##f(x+h, y+k) - f(x,y) - L(h,k)##, goes to zero faster than the length of the displacement vector. Recalling that L is a linear function of the displacement vector (h,k), it must be the case that ##L(h,k) = u\cdot h + v\cdot k##.

Use this fact in the limit, and find out what u and v must be equal to. You will answer your own question then. :-)
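If you want to check your answer numerically, here is a minimal Python sketch under some assumptions: the example function, the point, the approach direction and the helper name `error_ratio` are all made up for illustration. It evaluates the quotient from the limit above for shrinking displacements, once with (u, v) set to the hand-computed partial derivatives and once with a deliberately wrong u.

```python
import math

# Hypothetical example function, chosen only for illustration.
def f(x, y):
    return x**2 * y + math.sin(y)

def error_ratio(u, v, x, y, t, direction=(0.6, 0.8)):
    """|f(x+h, y+k) - f(x, y) - (u*h + v*k)| / ||(h, k)|| with (h, k) = t * direction."""
    h, k = t * direction[0], t * direction[1]
    return abs(f(x + h, y + k) - f(x, y) - (u * h + v * k)) / math.hypot(h, k)

x0, y0 = 1.0, 2.0
fx = 2 * x0 * y0             # hand-computed df/dx = 2xy
fy = x0**2 + math.cos(y0)    # hand-computed df/dy = x^2 + cos(y)

for t in (1e-1, 1e-2, 1e-3):
    good = error_ratio(fx, fy, x0, y0, t)       # shrinks roughly in proportion to t
    bad = error_ratio(fx + 0.5, fy, x0, y0, t)  # stalls near 0.5 * 0.6 = 0.3
    print(t, good, bad)
```

The reason behind it: taking k = 0 reduces the two-variable limit to the one-variable one in x, which forces u to be the partial derivative with respect to x, and likewise h = 0 forces v.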

- #6

The gradient vector of a function just gives us information about each of the first partial derivatives: each component is the first partial derivative with respect to the corresponding variable. The direction of the gradient tells us which partial derivatives dominate. For example, if the partial derivative with respect to x is much greater in magnitude than the partial derivatives with respect to y and z, then the gradient vector will point nearly along the x-axis.

- #7

So if ##y=f(x)## is a single-variable function, then ##dy = f'(x)dx##, and ##f'(x)## acts as the simple 1-dimensional linear operator we call "multiplication by a number".

But in higher dimensions, where ##\mathbf{x}## is a vector, say ##\mathbf{x}=\langle x,y,z\rangle## and we have a scalar valued function of this vector:

## u = f(\mathbf{x}) = f(x,y,z)##

the derivative must map a vector differential to a scalar differential:

$$du = f'(\mathbf{x})[d\mathbf{x}]$$

(I'm using square brackets here as function notation for a specifically linear function. Note we have a distinct linear function ##f'(\mathbf{x})## at each value of ##\mathbf{x}## which then acts on the differential ##d\mathbf{x}##.)

In this case the derivative must be something called a linear functional, and for those we have the following result:

"For ##\phi## a linear functional there exists some vector ##\mathbf{v}## such that ##\phi[\mathbf{x}] = \mathbf{v}\bullet \mathbf{x}##."

In the context of defining the derivative, it means we can define the derivative functional as the dot product with some vector:

$$du = f'(\mathbf{x})[d\mathbf{x}] \equiv \nabla f (\mathbf{x})\bullet d\mathbf{x}$$

(Edit: in short, ##f'(\mathbf{x}) = \nabla f(\mathbf{x})\bullet##.)

This vector ##\nabla f##, used to express the derivative, is defined as the gradient vector. It is not ideal, because we don't always have a dot product (or just one of them) defined on our space, and the gradient depends on the definition of the dot product while this generalized derivative does not. But it is very convenient if you don't want to whip out the full toolbox of linear algebra with its dual spaces and such.

Do note that the gradient

Now as to why the gradient is just the vector of partial derivatives, this is not always the case and depends specifically on the fact that you are working with rectilinear coordinates where the coordinate basis is ortho-normal. You'll find, for example if you re-express a quantity in polar coordinates, its gradient vector is no longer simply the vector of partial derivatives. However it does obey a very simple "chain rule" so you can work out gradients in any coordinate system simply by remembering:

$$\nabla f(u,v,\ldots) = \frac{\partial f}{\partial u}\nabla u + \frac{\partial f}{\partial v}\nabla v + \ldots$$
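A minimal Python sketch of this chain rule in polar coordinates, under some assumptions: the function ##f(r,\theta) = r^2\cos\theta## and the test point are made up for illustration, and the reference gradient is only a central-difference estimate in Cartesian coordinates.

```python
import math

# Hypothetical function given in polar coordinates: f(r, theta) = r^2 cos(theta)
def f_polar(r, theta):
    return r**2 * math.cos(theta)

def F(x, y):
    """The same function re-expressed in Cartesian coordinates."""
    return f_polar(math.hypot(x, y), math.atan2(y, x))

x, y = 1.0, 2.0
r, theta = math.hypot(x, y), math.atan2(y, x)

# Partial derivatives with respect to the polar coordinates (hand-computed).
df_dr = 2 * r * math.cos(theta)
df_dtheta = -r**2 * math.sin(theta)

# Cartesian gradients of the coordinate functions r(x, y) and theta(x, y).
grad_r = (x / r, y / r)
grad_theta = (-y / r**2, x / r**2)

# Chain rule: grad f = f_r * grad r + f_theta * grad theta
chain = (df_dr * grad_r[0] + df_dtheta * grad_theta[0],
         df_dr * grad_r[1] + df_dtheta * grad_theta[1])

# Reference: central-difference gradient taken directly in Cartesian coordinates.
eps = 1e-6
numeric = ((F(x + eps, y) - F(x - eps, y)) / (2 * eps),
           (F(x, y + eps) - F(x, y - eps)) / (2 * eps))

naive = (df_dr, df_dtheta)  # "vector of partials" in polar coordinates

print(chain, numeric, naive)
```

Here `chain` and `numeric` agree, while `naive` does not, which is the point about the vector of partials being tied to an orthonormal rectilinear coordinate basis.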

- #8

lavinia

Science Advisor

Gold Member

Assuming you are asking this question for Euclidean 3 space, how do you define the gradient of a function if not the vector of partials with respect to the Euclidean coordinate axes? Are you using some other definition?

If you are thinking of a gradient as the direction and rate at which a physical quantity ##f(x,y,z)## changes most rapidly, then this turns out to be the vector of partial derivatives, ##(\partial f/\partial x, \partial f/\partial y, \partial f/\partial z)##.

To show this one needs to show:

1) The function changes most rapidly in the normal direction to the surfaces ##f(x,y,z)=k## on which ##f## is constant.

2) ##\nabla f=(\partial f/\partial x, \partial f/\partial y, \partial f/\partial z)## is normal to the surface ##f(x,y,z)=k##.

3) The directional derivative of ##f## in the direction of ##\nabla f## is the length of ##\nabla f##.

The last two properties follow from the Chain Rule.
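These properties can be spot-checked numerically. The following is a minimal Python sketch, not a proof: the example function, the point and the helper name `directional_derivative` are assumptions made up for illustration. It sweeps unit directions on a 2-degree grid and compares the steepest one found with the vector of partials.

```python
import math

# Hypothetical example: f(x, y, z) = x^2 + xy + 3z
def f(x, y, z):
    return x**2 + x * y + 3 * z

p = (1.0, 2.0, 0.5)
grad = (2 * p[0] + p[1], p[0], 3.0)   # hand-computed partials (2x + y, x, 3)
grad_len = math.sqrt(sum(g * g for g in grad))
unit_grad = tuple(g / grad_len for g in grad)

def directional_derivative(u, eps=1e-6):
    """Central-difference rate of change of f at p along the unit vector u."""
    ahead = f(*(c + eps * d for c, d in zip(p, u)))
    behind = f(*(c - eps * d for c, d in zip(p, u)))
    return (ahead - behind) / (2 * eps)

# Sweep unit vectors over a grid of spherical angles and keep the steepest one.
best_rate, best_dir = -math.inf, None
n = 90
for i in range(n + 1):
    polar = math.pi * i / n
    for j in range(2 * n):
        azim = math.pi * j / n
        u = (math.sin(polar) * math.cos(azim),
             math.sin(polar) * math.sin(azim),
             math.cos(polar))
        rate = directional_derivative(u)
        if rate > best_rate:
            best_rate, best_dir = rate, u

alignment = sum(a * b for a, b in zip(best_dir, unit_grad))
print(best_rate, grad_len, alignment)
```

Up to the grid resolution, the steepest direction found lines up with the vector of partials (alignment close to 1), and the rate along it matches the length of that vector.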
