MHB Deriving Extrema of Homogeneous Functions: A Chain Rule Comparison

  • Thread starter mathmari
The discussion concerns a twice continuously differentiable function $f:\mathbb{R}^n\rightarrow\mathbb{R}$ that is homogeneous of degree 2. The participants first compare the Taylor expansion of $f(tx)$ with $t^2f(x)$, which gives $f(0)=0$ and $Df(0)=0$ and suggests that $f$ is determined by its Hessian at the origin. Because that argument needs more smoothness than is given, the final solution differentiates $f(t\mathbf{x})=t^2f(\mathbf{x})$ with respect to $t$ instead: two derivatives yield $f(\mathbf{x})=\frac{1}{2}\mathbf{x}^TH_f(\mathbf{0})\,\mathbf{x}$, and one derivative shows that every critical point of $f$, and hence every possible local extremum, is a root of $f$. The thread closes with a comparison of the chain rule for total and partial derivatives.
mathmari
Hey! :o

Let $f:\mathbb{R}^n\rightarrow \mathbb{R}$ be twice differentiable and homogeneous of degree $2$.

To show that the function has its possible local extrema at its roots, do we have to show that the first derivative, i.e. the gradient, is equal to $0$ if the function is equal to $0$?

Also how can we show that $f$ is in the form $f(x)=\frac{1}{2}x^T\cdot H_f(0)\cdot x$, where $H_f(0)$ is the Hessian Matrix of $f$ at $0$ ? Could you give me a hint?

(Wondering)
 
Hey mathmari!

Suppose we expand f(x) as a Taylor series.
What will we get when we check the property of homogeneity? (Wondering)
 

We have that $$T_2(x)=f(a)+(x-a)^T\nabla f(a)+\frac{1}{2!}(x-a)^TH(a)(x-a)$$ For $a=0$ we get $$T_2(x)=f(0)+x^T\nabla f(0)+\frac{1}{2!}x^TH(0)x$$

Do we now substitute $tx$ for $x$? Or how can we use the fact that $f$ is homogeneous of degree $2$?
 

Let's use the full expansion.
That is, we have:
$$f(x) = f(0) + Df(0)x + \frac 12 D^2f(0)x^2 + \frac 1{3!} D^3f(0) x^3 + ...$$

Now we indeed substitute $tx$ and use that $f(tx)=t^2f(x)$.
For the condition to hold, the coefficients of corresponding terms in both expansions must be the same.
That is because a Taylor expansion is unique. (Thinking)
 
\begin{align*}&f(tx) = f(0) + Df(0)tx + \frac 12 D^2f(0)t^2x^2 + \frac 1{3!} D^3f(0) t^3x^3 + ...\\ & \Rightarrow t^2f(x)=f(0) + Df(0)tx + \frac 12 D^2f(0)t^2x^2 + \frac 1{3!} D^3f(0) t^3x^3 + ...\\ &\Rightarrow f(x)=\frac{1}{t^2}f(0) + Df(0)\frac{x}{t} + \frac 12 D^2f(0)x^2 + \frac 1{3!} D^3f(0) tx^3 + ...\end{align*}

So we have that $$f(x) = f(0) + Df(0)x + \frac 12 D^2f(0)x^2 + \frac 1{3!} D^3f(0) x^3 + ...$$ and $$f(x)=\frac{1}{t^2}f(0) + Df(0)\frac{x}{t} + \frac 12 D^2f(0)x^2 + \frac 1{3!} D^3f(0) tx^3 + ...$$ That means that $1=\frac{1}{t^2}$, or not? (Wondering)
 
Good! (Happy)

mathmari said:
That means that $1=\frac{1}{t^2}$, or not?

Not quite.
The property must hold for any $t$, mustn't it?
It means that for instance $f(0)$ must be $0$... (Thinking)
 

I got stuck right now. What do you mean? (Wondering)
 
mathmari said:
I got stuck right now. What do you mean?

Isn't a homogeneous function $f$ of degree 2 one that satisfies $f(t\mathbf x)=t^2 f(\mathbf x)$ for all $t>0$ and all $\mathbf x$?
Or is it different? (Wondering)

Let's pick some $\mathbf x \ne \mathbf 0$ and compare $t=1$ with $t=2$.
Then we must have $\frac 1{1^2} f(0)=\frac 1{2^2} f(0)$.
This can only be true if $f(0)=0$, can't it? (Wondering)
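
The same conclusion can also be read off without any expansion. A short sketch, using only the definition $f(t\mathbf x)=t^2f(\mathbf x)$ for $t>0$, applied at the point $\mathbf x=\mathbf 0$ with $t=2$:
$$f(\mathbf 0)=f(2\cdot\mathbf 0)=2^2f(\mathbf 0)=4f(\mathbf 0)\quad\Rightarrow\quad 3f(\mathbf 0)=0\quad\Rightarrow\quad f(\mathbf 0)=0$$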
 
Ah ok!

We have that $$f(x) = f(0) + Df(0)x + \frac 12 D^2f(0)x^2 + \frac 1{3!} D^3f(0) x^3 + ...$$ and $$f(x)=\frac{1}{t^2}f(0) + \frac{Df(0)}{t}x + \frac 12 D^2f(0)x^2 + \frac 1{3!} D^3f(0) tx^3 + ...$$

$\frac{1}{t^2}f(0) =f(0), \forall t$ holds only when $f(0)=0$.

$\frac{Df(0)}{t}=Df(0), \forall t$ holds only when $Df(0)=0$

$\frac 1{3!} D^3f(0) =\frac 1{3!} D^3f(0) t, \forall t$ holds only when $D^3f(0)=0$

The same holds for every $k\geq 4$: $D^kf(0)=0$.

In that way we get that $f(x)=\frac 12 D^2f(0)x^2$. This is the same as $\frac{1}{2}x^T\cdot H_f(0)\cdot x$, or not? (Wondering)

And the first question, that the function has its local extrema at its roots: do we get that from the fact that $f(0)=Df(0)=0$? (Wondering)
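
A compact way to write the coefficient comparison above, as a sketch that assumes $f$ equals its Taylor series around $0$ (the index $k$ for the order of a term is introduced here only for illustration): dividing the expansion of $f(tx)$ by $t^2$ gives
\begin{align*}f(x)=\sum_{k\geq 0}\frac{t^{k-2}}{k!}D^kf(0)x^k \quad \text{for all } t>0\end{align*}
Since the left side does not depend on $t$, every term with $k\neq 2$ must vanish, leaving only $\frac 12 D^2f(0)x^2$. In coordinates, that remaining term reads
$$\frac 12 D^2f(0)x^2=\frac 12\sum_{i,j=1}^n\frac{\partial^2 f}{\partial x_i\partial x_j}(0)\,x_ix_j=\frac 12 x^T H_f(0)\, x$$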
 
  • #10
mathmari said:
In that way we get that $f(x)=\frac 12 D^2f(0)x^2$. This is the same as $\frac{1}{2}x^T\cdot H_f(0)\cdot x$, or not?

And the first question, that the function has its local extrema at its roots: do we get that from the fact that $f(0)=Df(0)=0$ ?

Yes to both. (Nod)

So we have proven the statements if $f$ is analytic, that is, equal to its Taylor series around $0$ (being infinitely differentiable alone would not guarantee this).
However, I have just realized that this is not given.
We only have that $f$ is differentiable twice, and the second derivative does not even have to be continuous. (Worried)
 
  • #11
I just saw that I forgot the word "continuously" in my initial post, so $f$ is twice continuously differentiable. (Blush)

But having that $f(0)=Df(0)=0$ means that the function and the gradient are $0$ at the point $0$. Does this mean that the function can have its extrema at the same points as its roots? (Wondering)
 
  • #12
Or do we have to do the same for a general point instead of $0$ ? (Wondering)
 
  • #13
I am stuck now. (Worried)
 
  • #14
Thanks to Euge I have a solution to your problem now. (Happy)

From the equation $f(t\mathbf{x}) = t^2 f(\mathbf{x})$, take the derivative with respect to $t$ twice on both sides to obtain $$\sum\limits_{i,\,j} x^ix^j\frac{\partial^2 f}{\partial x^i \partial x^j}(t\mathbf{x}) = 2f(\mathbf{x})$$ where $\mathbf{x} = (x^1,\ldots, x^n)$. Since $f$ is twice continuously differentiable, taking limits as $t \to 0^+$ results in (after dividing by $2$) $$f(\mathbf{x}) = \frac{1}{2}\sum\limits_{i,\,j} x^i\frac{\partial^2 f}{\partial x^i \partial x^j}(\mathbf{0})\,x^j = \frac{1}{2}\mathbf{x}^TH_f(\mathbf{0})\,\mathbf{x}$$ as desired. (Nerd)

As for the other statement, take the first derivative with respect to $t$ on both sides of the equation $f(t\mathbf{x}) = t^2 f(\mathbf{x})$ to get $\mathbf{x}\cdot Df(t\mathbf{x}) = 2tf(\mathbf{x})$. Evaluating at $t = 1$ yields $$f(\mathbf{x}) = \frac{1}{2}\mathbf{x}\cdot Df(\mathbf{x})$$ If $f$ has a critical point at $\mathbf{x} = \mathbf{c}$, then $Df(\mathbf{c}) = \mathbf{0}$; in light of the above equation $f(\mathbf{c}) = \frac{1}{2}\mathbf{c}\cdot \mathbf{0} = 0$, that is, $\mathbf{c}$ is a zero of $f$. Since every local extremum of a differentiable function is a critical point, all possible local extrema of $f$ lie at its roots. (Thinking)
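
As a quick sanity check of both identities, here is a worked example. The function $g(x,y)=x^2+3xy-y^2$ is chosen purely for illustration and does not come from the thread. It is homogeneous of degree $2$, and
\begin{align*}
\nabla g(x,y)&=(2x+3y,\ 3x-2y), \qquad H_g(0,0)=\begin{pmatrix}2&3\\3&-2\end{pmatrix}\\
\frac 12\,(x,y)\cdot\nabla g(x,y)&=\frac 12\left(2x^2+3xy+3xy-2y^2\right)=x^2+3xy-y^2=g(x,y)\\
\frac 12\,(x,y)\,H_g(0,0)\begin{pmatrix}x\\y\end{pmatrix}&=\frac 12\left(2x^2+6xy-2y^2\right)=g(x,y)
\end{align*}
The only solution of $\nabla g=\mathbf 0$ is $(0,0)$, where $g(0,0)=0$, so the critical point is indeed a root. In this example it is a saddle point (since $\det H_g(0,0)=-13<0$), which illustrates why the roots are only the possible locations of local extrema.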
 
  • #15
I like Serena said:
From the equation $f(t\mathbf{x}) = t^2 f(\mathbf{x})$, take the derivative with respect to $t$ twice on both sides to obtain $$\sum\limits_{i,\,j} x^ix^j\frac{\partial^2 f}{\partial x^i \partial x^j}(t\mathbf{x}) = 2f(\mathbf{x})$$ where $\mathbf{x} = (x^1,\ldots, x^n)$.

We have the equation $f(t\mathbf{x}) = t^2 f(\mathbf{x})$. When we take the second derivative with respect to $t$ on the left side we get the following using the chain rule:
\begin{align*}\frac{d}{dt}f(t\mathbf{x})&=\frac{d}{dt}f(t(x_1, x_2, \ldots , x_n))=\frac{d}{dt}f(tx_1, tx_2, \ldots , tx_n)=\sum_{i=1}^n\frac{\partial{f}}{\partial{x_i}}\cdot \frac{d(tx_i)}{dt}\\ & =\sum_{i=1}^n\frac{\partial{f}}{\partial{x_i}}\cdot x_i\end{align*}
\begin{align*}\frac{d^2}{dt^2}f(t\mathbf{x})&=\frac{d}{dt}\left (\sum_{i=1}^n\frac{\partial{f}}{\partial{x_i}}(t(x_1, x_2, \ldots , x_n))\cdot x_i\right )=\frac{d}{dt}\left (\sum_{i=1}^nf_{x_i}(t(x_1, x_2, \ldots , x_n))\cdot x_i\right )\\ & =\sum_{i=1}^nx_i\cdot \frac{d}{dt}f_{x_i}(t(x_1, x_2, \ldots , x_n)) = \sum_{i=1}^nx_i\cdot \sum_{j=1}^n\frac{\partial{f_{x_i}}}{\partial{x_j}}\cdot \frac{d(tx_j)}{dt}\\ & =\sum_{i,j=1}^nx_i\cdot x_j\cdot \frac{\partial{f_{x_i}}}{\partial{x_j}}=\sum_{i,j=1}^nx_i\cdot x_j\cdot \frac{\partial^2{f}}{\partial{x_i}\partial{x_j}}\end{align*}
Is everything correct? Could we improve something? (Wondering)
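
For comparison, a short sketch of the right-hand side of $f(t\mathbf{x}) = t^2 f(\mathbf{x})$, where $f(\mathbf{x})$ does not depend on $t$:
\begin{align*}\frac{d}{dt}\left(t^2f(\mathbf{x})\right)=2tf(\mathbf{x}), \qquad \frac{d^2}{dt^2}\left(t^2f(\mathbf{x})\right)=2f(\mathbf{x})\end{align*}
Equating this with the left-hand side computed above, with the second partial derivatives evaluated at $t\mathbf{x}$, gives $\sum_{i,j=1}^nx_ix_j\frac{\partial^2 f}{\partial x_i\partial x_j}(t\mathbf{x})=2f(\mathbf{x})$.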
 
  • #16

All derivatives should be partial derivatives.
That is, we assume that the $x_i$ do not depend on $t$, which is what a partial derivative means.
Otherwise we would have for instance:
$$\frac{d(tx_i)}{dt} = x_i + t\frac{dx_i}{dt}$$

Everything else is correct. (Happy)
 
  • #17

You mean the following:
\begin{align*}\frac{d}{dt}f(t\mathbf{x})&=\frac{d}{dt}f(t(x_1, x_2, \ldots , x_n))=\frac{d}{dt}f(tx_1, tx_2, \ldots , tx_n)=\sum_{i=1}^n\frac{\partial{f}}{\partial{x_i}}\cdot \frac{\partial{(tx_i)}}{\partial{t}}\\ & =\sum_{i=1}^n\frac{\partial{f}}{\partial{x_i}}\cdot x_i\end{align*}
\begin{align*}\frac{d^2}{dt^2}f(t\mathbf{x})&=\frac{d}{dt}\left (\sum_{i=1}^n\frac{\partial{f}}{\partial{x_i}}(t(x_1, x_2, \ldots , x_n))\cdot x_i\right )=\frac{d}{dt}\left (\sum_{i=1}^nf_{x_i}(t(x_1, x_2, \ldots , x_n))\cdot x_i\right )\\ & =\sum_{i=1}^nx_i\cdot \frac{d}{dt}f_{x_i}(t(x_1, x_2, \ldots , x_n)) = \sum_{i=1}^nx_i\cdot \sum_{j=1}^n\frac{\partial{f_{x_i}}}{\partial{x_j}}\cdot \frac{\partial{(tx_j)}}{\partial{t}}\\ & =\sum_{i,j=1}^nx_i\cdot x_j\cdot \frac{\partial{f_{x_i}}}{\partial{x_j}}=\sum_{i,j=1}^nx_i\cdot x_j\cdot \frac{\partial^2{f}}{\partial{x_i}\partial{x_j}}\end{align*} or not? (Wondering)
 
  • #18

I meant everywhere. (Nerd)

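Consider that the application of the chain rule for a regular derivative is:
$$\frac{d}{dt} f(tx,ty) = \frac{\partial f}{\partial x} \cdot\frac{d(tx)}{dt} + \frac{\partial f}{\partial y} \cdot\frac{d(ty)}{dt} = \frac{\partial f}{\partial x} \cdot\left(x+t\frac{dx}{dt}\right) + \frac{\partial f}{\partial y}\cdot\left(y+t\frac{dy}{dt}\right)$$
While the application of the chain rule for a partial derivative is:
$$\frac{\partial}{\partial t} f(tx,ty) = \frac{\partial f}{\partial x} \cdot\frac{\partial(tx)}{\partial t} + \frac{\partial f}{\partial y} \cdot\frac{\partial(ty)}{\partial t} = \frac{\partial f}{\partial x} \cdot x + \frac{\partial f}{\partial y}\cdot y$$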

We want the latter, don't we? (Wondering)
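
To make the difference concrete, here is a small worked example. The function $g(u,v)=uv$ and the dependence $x(t)=t$ are assumptions chosen only for illustration. With $x$ and $y$ held fixed,
$$\frac{\partial}{\partial t}\,g(tx,ty)=\frac{\partial}{\partial t}\left(t^2xy\right)=2txy$$
which matches the partial chain rule: $(ty)\cdot x+(tx)\cdot y=2txy$. If instead $x$ depended on $t$, say $x(t)=t$ with $y$ fixed, then $g(tx,ty)=t^3y$ and
$$\frac{d}{dt}\,g(tx,ty)=3t^2y$$
which matches the regular chain rule: $(ty)\cdot\left(x+t\frac{dx}{dt}\right)+(tx)\cdot y=(ty)(2t)+t^2y=3t^2y$.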
 
  • #19

Ah ok! Thank you very much! (Smile)
 
