MHB Deriving Extrema of Homogeneous Functions: A Chain Rule Comparison

  • Thread starter: mathmari
  • Tags: Form
mathmari
Hey! :o

Let $f:\mathbb{R}^n\rightarrow \mathbb{R}$ be twice differentiable and homogeneous of degree $2$.

To show that the function has its possible local extrema at its roots, do we have to show that the first derivative, i.e. the gradient, is equal to $0$ if the function is equal to $0$?

Also how can we show that $f$ is in the form $f(x)=\frac{1}{2}x^T\cdot H_f(0)\cdot x$, where $H_f(0)$ is the Hessian Matrix of $f$ at $0$ ? Could you give me a hint?

(Wondering)
 
Hey mathmari!

Suppose we expand f(x) as a Taylor series.
What will we get when we check the property of homogeneity? (Wondering)
 
mathmari
We have that $$T_2(x)=f(a)+(x-a)^T\nabla f(a)+\frac{1}{2!}(x-a)^TH(a)(x-a)$$ For $a=0$ we get $$T_2(x)=f(0)+x^T\nabla f(0)+\frac{1}{2!}x^TH(0)x$$

Do we now substitute $tx$ for $x$? Or how can we use the fact that $f$ is homogeneous of degree $2$?
 
I like Serena
Let's use the full expansion.
That is, we have:
$$f(x) = f(0) + Df(0)x + \frac 12 D^2f(0)x^2 + \frac 1{3!} D^3f(0) x^3 + ...$$

Now we do indeed substitute $tx$ for $x$ and use that $f(tx)=t^2f(x)$.
For the condition to hold, every coefficient in both expansions must be the same.
That is because a Taylor expansion is unique. (Thinking)
 
mathmari
\begin{align*}&f(tx) = f(0) + Df(0)tx + \frac 12 D^2f(0)t^2x^2 + \frac 1{3!} D^3f(0) t^3x^3 + ...\\ & \Rightarrow t^2f(x)=f(0) + Df(0)tx + \frac 12 D^2f(0)t^2x^2 + \frac 1{3!} D^3f(0) t^3x^3 + ...\\ &\Rightarrow f(x)=\frac{1}{t^2}f(0) + Df(0)\frac{x}{t} + \frac 12 D^2f(0)x^2 + \frac 1{3!} D^3f(0) tx^3 + ...\end{align*}

So we have that $$f(x) = f(0) + Df(0)x + \frac 12 D^2f(0)x^2 + \frac 1{3!} D^3f(0) x^3 + ...$$ and $$f(x)=\frac{1}{t^2}f(0) + Df(0)\frac{x}{t} + \frac 12 D^2f(0)x^2 + \frac 1{3!} D^3f(0) tx^3 + ...$$ That means that $1=\frac{1}{t^2}$, or not? (Wondering)
 
Good! (Happy)

mathmari said:
That means that $1=\frac{1}{t^2}$, or not?

Not quite.
The property must hold for any $t$, doesn't it?
It means that for instance $f(0)$ must be $0$... (Thinking)
 
mathmari
I got stuck right now. What do you mean? (Wondering)
 
I like Serena
Isn't a homogeneous function $f$ of degree 2 one that satisfies $f(t\mathbf x)=t^2 f(\mathbf x)$ for all $t>0$ and all $\mathbf x$?
Or is it different? (Wondering)

Let's pick some $\mathbf x \ne \mathbf 0$ and compare $t=1$ with $t=2$.
Then we must have $\frac 1{1^2} f(0)=\frac 1{2^2} f(0)$.
This can only be true if $f(0)=0$, can't it? (Wondering)
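
A shorter route to the same fact, as a sketch that uses only the definition and not the expansion: put $\mathbf x=\mathbf 0$ into $f(t\mathbf x)=t^2f(\mathbf x)$ with $t=2$. Then
$$f(\mathbf 0)=f(2\cdot\mathbf 0)=2^2 f(\mathbf 0)=4f(\mathbf 0)\quad\Longrightarrow\quad 3f(\mathbf 0)=0\quad\Longrightarrow\quad f(\mathbf 0)=0.$$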
 
mathmari
Ah ok!

We have that $$f(x) = f(0) + Df(0)x + \frac 12 D^2f(0)x^2 + \frac 1{3!} D^3f(0) x^3 + ...$$ and $$f(x)=\frac{1}{t^2}f(0) + \frac{Df(0)}{t}x + \frac 12 D^2f(0)x^2 + \frac 1{3!} D^3f(0) tx^3 + ...$$

$\frac{1}{t^2}f(0) =f(0), \forall t$ holds only when $f(0)=0$.

$\frac{Df(0)}{t}=Df(0), \forall t$ holds only when $Df(0)=0$

$\frac 1{3!} D^3f(0) =\frac 1{3!} D^3f(0) t, \forall t$ holds only when $D^3f(0)=0$

The same argument gives $D^kf(0)=0$ for every $k\geq 4$.

In that way we get that $f(x)=\frac 12 D^2f(0)x^2$. This is the same as $\frac{1}{2}x^T\cdot H_f(0)\cdot x$, or not? (Wondering)

And for the first question, that the function has its local extrema at its roots: do we get that from the fact that $f(0)=Df(0)=0$? (Wondering)
 
  • #10
I like Serena
Yes to both. (Nod)

So we have proven the statements if $f$ is infinitely differentiable.
However, I have just realized that this is not given.
We only have that $f$ is differentiable twice, and the second derivative does not even have to be continuous. (Worried)
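
To see the formula at work in a case where smoothness is not an issue, here is a small check on a polynomial. The function $f(x,y)=x^2+3xy-y^2$ is my own choice of example, not part of the original problem. It is homogeneous of degree $2$, and
\begin{align*} H_f(0) &= \begin{pmatrix} 2 & 3 \\ 3 & -2 \end{pmatrix}, \\ \frac 12 \begin{pmatrix} x & y \end{pmatrix} H_f(0) \begin{pmatrix} x \\ y \end{pmatrix} &= \frac 12\left(2x^2+6xy-2y^2\right) = x^2+3xy-y^2 = f(x,y). \end{align*}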
 
  • #11
mathmari
I just saw that I forgot the word "continuously" in my initial post, so $f$ is twice continuously differentiable. (Blush)

But having $f(0)=Df(0)=0$ means that the function and the gradient are both $0$ at the point $0$. Does this mean that the function can have its extrema at the same points as its roots? (Wondering)
 
  • #12
Or do we have to do the same for a general point instead of $0$ ? (Wondering)
 
  • #13
I am stuck now. (Worried)
 
  • #14
Thanks to Euge I have a solution to your problem now. (Happy)

From the equation $f(t\mathbf{x}) = t^2 f(\mathbf{x})$, take the derivative with respect to $t$ twice on both sides to obtain $$\sum\limits_{i,\,j} x^ix^j\frac{\partial^2 f}{\partial x^i \partial x^j}(t\mathbf{x}) = 2f(\mathbf{x})$$ where $\mathbf{x} = (x^1,\ldots, x^n)$. Since $f$ is twice continuously differentiable, taking limits as $t \to 0^+$ results in (after dividing by $2$) $$f(\mathbf{x}) = \frac{1}{2}\sum\limits_{i,\,j} x^i\frac{\partial^2 f}{\partial x^i \partial x^j}(\mathbf{0})\,x^j = \frac{1}{2}\mathbf{x}^TH_f(\mathbf{0})\,\mathbf{x}$$ as desired. (Nerd)

As for the other statement, take the first derivative with respect to $t$ on both sides of the equation $f(t\mathbf{x}) = t^2 f(\mathbf{x})$ to get $\mathbf{x}\cdot Df(t\mathbf{x}) = 2tf(\mathbf{x})$. Evaluating at $t = 1$ yields $$f(\mathbf{x}) = \frac{1}{2}\mathbf{x}\cdot Df(\mathbf{x})$$ If $f$ has a critical point at $\mathbf{x} = \mathbf{c}$, then $Df(\mathbf{c}) = \mathbf{0}$; in light of the above equation $f(\mathbf{c}) = \frac{1}{2}\mathbf{c}\cdot \mathbf{0} = 0$, that is, $\mathbf{c}$ is a zero of $f$. (Thinking)
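
As a sanity check of the identity $f(\mathbf{x}) = \frac{1}{2}\mathbf{x}\cdot Df(\mathbf{x})$, take the example $f(x,y)=x^2-y^2$ (chosen here only for illustration):
$$Df(x,y)=(2x,\,-2y),\qquad \frac 12\,(x,y)\cdot(2x,-2y)=x^2-y^2=f(x,y).$$
The only critical point is $(0,0)$, and indeed $f(0,0)=0$. Note that it is a saddle, and that the zeros $(s,\pm s)$ with $s\neq 0$ are not critical points, so the result only says where extrema can possibly occur: at roots, but not at every root.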
 
  • #15
I like Serena said:
From the equation $f(t\mathbf{x}) = t^2 f(\mathbf{x})$, take the derivative with respect to $t$ twice on both sides to obtain $$\sum\limits_{i,\,j} x^ix^j\frac{\partial^2 f}{\partial x^i \partial x^j}(t\mathbf{x}) = 2f(\mathbf{x})$$ where $\mathbf{x} = (x^1,\ldots, x^n)$.

We have the equation $f(t\mathbf{x}) = t^2 f(\mathbf{x})$. When we take the second derivative with respect to $t$ on the left side we get the following using the chain rule:
\begin{align*}\frac{d}{dt}f(t\mathbf{x})&=\frac{d}{dt}f(t(x_1, x_2, \ldots , x_n))=\frac{d}{dt}f(tx_1, tx_2, \ldots , tx_n)=\sum_{i=1}^n\frac{\partial{f}}{\partial{x_i}}\cdot \frac{d(tx_i)}{dt}\\ & =\sum_{i=1}^n\frac{\partial{f}}{\partial{x_i}}\cdot x_i\end{align*}
\begin{align*}\frac{d^2}{dt^2}f(t\mathbf{x})&=\frac{d}{dt}\left (\sum_{i=1}^n\frac{\partial{f}}{\partial{x_i}}(t(x_1, x_2, \ldots , x_n))\cdot x_i\right )=\frac{d}{dt}\left (\sum_{i=1}^nf_{x_i}(t(x_1, x_2, \ldots , x_n))\cdot x_i\right )\\ & =\sum_{i=1}^nx_i\cdot \frac{d}{dt}f_{x_i}(t(x_1, x_2, \ldots , x_n)) = \sum_{i=1}^nx_i\cdot \sum_{j=1}^n\frac{\partial{f_{x_i}}}{\partial{x_j}}\cdot \frac{d(tx_j)}{dt}\\ & =\sum_{i,j=1}^nx_i\cdot x_j\cdot \frac{\partial{f_{x_i}}}{\partial{x_j}}=\sum_{i,j=1}^nx_i\cdot x_j\cdot \frac{\partial^2{f}}{\partial{x_i}\partial{x_j}}\end{align*}
Is everything correct? Could we improve something? (Wondering)
 
  • #16
I like Serena
All derivatives should be partial derivatives.
That is, we assume that the $x_i$ do not depend on $t$, which is what a partial derivative means.
Otherwise we would have for instance:
$$\frac{d(tx_i)}{dt} = x_i + t\frac{dx_i}{dt}$$

Everything else is correct. (Happy)
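
For completeness, the right side of $f(t\mathbf{x}) = t^2 f(\mathbf{x})$ is easier, since $f(\mathbf x)$ does not depend on $t$. A short sketch:
$$\frac{\partial}{\partial t}\left(t^2 f(\mathbf{x})\right)=2t\,f(\mathbf{x}),\qquad \frac{\partial^2}{\partial t^2}\left(t^2 f(\mathbf{x})\right)=2f(\mathbf{x}).$$
Setting this equal to your double sum, with the second partial derivatives evaluated at $t\mathbf x$, gives $\sum_{i,j=1}^n x_i x_j \frac{\partial^2 f}{\partial x_i\partial x_j}(t\mathbf x)=2f(\mathbf x)$.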
 
  • #17
mathmari
You mean the following:
\begin{align*}\frac{d}{dt}f(t\mathbf{x})&=\frac{d}{dt}f(t(x_1, x_2, \ldots , x_n))=\frac{d}{dt}f(tx_1, tx_2, \ldots , tx_n)=\sum_{i=1}^n\frac{\partial{f}}{\partial{x_i}}\cdot \frac{\partial{(tx_i)}}{\partial{t}}\\ & =\sum_{i=1}^n\frac{\partial{f}}{\partial{x_i}}\cdot x_i\end{align*}
\begin{align*}\frac{d^2}{dt^2}f(t\mathbf{x})&=\frac{d}{dt}\left (\sum_{i=1}^n\frac{\partial{f}}{\partial{x_i}}(t(x_1, x_2, \ldots , x_n))\cdot x_i\right )=\frac{d}{dt}\left (\sum_{i=1}^nf_{x_i}(t(x_1, x_2, \ldots , x_n))\cdot x_i\right )\\ & =\sum_{i=1}^nx_i\cdot \frac{d}{dt}f_{x_i}(t(x_1, x_2, \ldots , x_n)) = \sum_{i=1}^nx_i\cdot \sum_{j=1}^n\frac{\partial{f_{x_i}}}{\partial{x_j}}\cdot \frac{\partial{(tx_j)}}{\partial{t}}\\ & =\sum_{i,j=1}^nx_i\cdot x_j\cdot \frac{\partial{f_{x_i}}}{\partial{x_j}}=\sum_{i,j=1}^nx_i\cdot x_j\cdot \frac{\partial^2{f}}{\partial{x_i}\partial{x_j}}\end{align*} or not? (Wondering)
 
  • #18
I like Serena
I meant everywhere. (Nerd)

Consider that the application of the chain rule for a regular derivative is:
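$$\frac{d}{dt} f(tx,ty) = \frac{\partial f}{\partial x} \cdot\frac{d(tx)}{dt} + \frac{\partial f}{\partial y} \cdot\frac{d(ty)}{dt} = \frac{\partial f}{\partial x} \cdot\left(x+t\frac{dx}{dt}\right) + \frac{\partial f}{\partial y}\cdot\left(y+t\frac{dy}{dt}\right)$$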
While the application of the chain rule for a partial derivative is:
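$$\frac{\partial}{\partial t} f(tx,ty) = \frac{\partial f}{\partial x} \cdot\frac{\partial(tx)}{\partial t} + \frac{\partial f}{\partial y} \cdot\frac{\partial(ty)}{\partial t} = \frac{\partial f}{\partial x} \cdot x + \frac{\partial f}{\partial y}\cdot y$$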

We want the latter, don't we? (Wondering)
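
To make the difference concrete, here is a sketch with $f(u,v)=u^2+v^2$ and the hypothetical assumption that $x=x(t)$ depends on $t$ while $y$ is constant (both chosen only for illustration). Then $f(tx,ty)=t^2\left(x^2+y^2\right)$ and
\begin{align*}\frac{\partial}{\partial t} f(tx,ty) &= 2t\left(x^2+y^2\right) = 2t\,f(x,y),\\ \frac{d}{dt} f(tx,ty) &= 2t\left(x^2+y^2\right) + 2t^2\,x\,\frac{dx}{dt}.\end{align*}
Only the first one matches the derivative of $t^2f(x,y)$ with $x$ and $y$ held fixed.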
 
  • #19
mathmari
Ah ok! Thank you very much! (Smile)
 
