Deriving Extrema of Homogeneous Functions: A Chain Rule Comparison

SUMMARY

The discussion focuses on deriving the extrema of homogeneous functions, specifically functions that are twice continuously differentiable and homogeneous of degree 2. It establishes that such a function \( f:\mathbb{R}^n\rightarrow \mathbb{R} \) can have its possible local extrema only at its roots: at any critical point the gradient vanishes and, by homogeneity, the function value is zero there as well. The participants confirm that the function can be expressed in the form \( f(x)=\frac{1}{2}x^T\cdot H_f(0)\cdot x \), where \( H_f(0) \) is the Hessian matrix at the origin, first via a Taylor series argument and then, under the stated hypotheses, by differentiating the homogeneity relation \( f(tx)=t^2f(x) \) with respect to \( t \).

PREREQUISITES
  • Understanding of homogeneous functions and their properties
  • Familiarity with Taylor series expansions
  • Knowledge of gradient and Hessian matrices
  • Concept of critical points in multivariable calculus
NEXT STEPS
  • Study the properties of homogeneous functions in detail
  • Learn about Taylor series and their applications in multivariable calculus
  • Explore the role of Hessian matrices in determining local extrema
  • Investigate critical points and their significance in optimization problems
USEFUL FOR

Mathematicians, students of advanced calculus, and anyone interested in optimization and the behavior of multivariable functions will benefit from this discussion.

mathmari
Hey! :o

Let $f:\mathbb{R}^n\rightarrow \mathbb{R}$ be twice differentiable and homogeneous of degree $2$.

To show that the function has its possible local extrema at its roots, do we have to show that the first derivative, i.e. the gradient, is equal to $0$ wherever the function is equal to $0$?

Also, how can we show that $f$ is of the form $f(x)=\frac{1}{2}x^T\cdot H_f(0)\cdot x$, where $H_f(0)$ is the Hessian matrix of $f$ at $0$? Could you give me a hint?

(Wondering)
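To get a feel for the claim on a concrete example, here is a minimal sympy sketch that checks $f(x)=\frac{1}{2}x^T H_f(0)\, x$ for a sample quadratic form; the choice $f(x_1,x_2)=x_1^2+3x_1x_2-x_2^2$ is only an illustrative assumption, not part of the exercise.

```python
# Sketch: check f(x) = (1/2) x^T H_f(0) x on an assumed sample function
# that is homogeneous of degree 2.
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
f = x1**2 + 3*x1*x2 - x2**2                        # sample, homogeneous of degree 2

H0 = sp.hessian(f, (x1, x2)).subs({x1: 0, x2: 0})  # Hessian of f at the origin
xvec = sp.Matrix([x1, x2])
quad = sp.Rational(1, 2) * (xvec.T * H0 * xvec)[0, 0]

print(sp.simplify(f - quad))                       # prints 0
```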
 
Hey mathmari!

Suppose we expand f(x) as a Taylor series.
What will we get when we check the property of homogeneity? (Wondering)
 
I like Serena said:
Suppose we expand f(x) as a Taylor series.
What will we get when we check the property of homogeneity? (Wondering)

We have that $$T_2(x)=f(a)+(x-a)^T\nabla f(a)+\frac{1}{2!}(x-a)^TH(a)(x-a)$$ For $a=0$ we get $$T_2(x)=f(0)+x^T\nabla f(0)+\frac{1}{2!}x^TH(0)x$$

Do we set now $tx$ instead of $x$ ? Or how can we use the fact that $f$ is homogeneous of degree $2$ ?
 
mathmari said:
We have that $$T_2(x)=f(a)+(x-a)^T\nabla f(a)+\frac{1}{2!}(x-a)^TH(a)(x-a)$$ For $a=0$ we get $$T_2(x)=f(0)+x^T\nabla f(0)+\frac{1}{2!}x^TH(0)x$$

Do we set now $tx$ instead of $x$ ? Or how can we use the fact that $f$ is homogeneous of degree $2$ ?

Let's use the full expansion.
That is, we have:
$$f(x) = f(0) + Df(0)x + \frac 12 D^2f(0)x^2 + \frac 1{3!} D^3f(0) x^3 + ...$$

Now we indeed substitute $tx$ and use that $f(tx)=t^2f(x)$.
For the condition to hold, every coefficient in the two expansions must be the same.
That is because a Taylor expansion is unique. (Thinking)
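For instance, a sympy sketch can display this expansion of $f(tx)$ in powers of $t$ for an assumed sample function (here $f(x_1,x_2)=e^{x_1}\sin x_2$, chosen only for illustration); the coefficient of $t^k$ collects the degree-$k$ terms of the Taylor expansion of $f$ at $0$.

```python
# Expand f(t x) as a series in t for an assumed smooth sample function;
# the t^k coefficient is the k-th order term of the Taylor expansion of f at 0.
import sympy as sp

t, x1, x2 = sp.symbols('t x1 x2')
f = sp.exp(x1) * sp.sin(x2)              # sample, assumed for illustration

g = f.subs({x1: t*x1, x2: t*x2})         # g(t) = f(t x)
print(sp.series(g, t, 0, 4))             # terms up to t^3
```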
 
I like Serena said:
Let's use the full expansion.
That is, we have:
$$f(x) = f(0) + Df(0)x + \frac 12 D^2f(0)x^2 + \frac 1{3!} D^3f(0) x^3 + ...$$

Now we indeed substitute $tx$ and use that $f(tx)=t^2f(x)$.
For the condition to hold, every coefficient in the two expansions must be the same.
That is because a Taylor expansion is unique. (Thinking)
\begin{align*}&f(tx) = f(0) + Df(0)tx + \frac 12 D^2f(0)t^2x^2 + \frac 1{3!} D^3f(0) t^3x^3 + ...\\ & \Rightarrow t^2f(x)=f(0) + Df(0)tx + \frac 12 D^2f(0)t^2x^2 + \frac 1{3!} D^3f(0) t^3x^3 + ...\\ &\Rightarrow f(x)=\frac{1}{t^2}f(0) + Df(0)\frac{x}{t} + \frac 12 D^2f(0)x^2 + \frac 1{3!} D^3f(0) tx^3 + ...\end{align*}

So we have that $$f(x) = f(0) + Df(0)x + \frac 12 D^2f(0)x^2 + \frac 1{3!} D^3f(0) x^3 + ...$$ and $$f(x)=\frac{1}{t^2}f(0) + Df(0)\frac{x}{t} + \frac 12 D^2f(0)x^2 + \frac 1{3!} D^3f(0) tx^3 + ...$$ That means that $1=\frac{1}{t^2}$, or not? (Wondering)
 
Good! (Happy)

mathmari said:
That means that $1=\frac{1}{t^2}$, or not?

Not quite.
The property must hold for any $t$, doesn't it?
It means that for instance $f(0)$ must be $0$... (Thinking)
 
I like Serena said:
Not quite.
The property must hold for any $t$, doesn't it?
It means that for instance $f(0)$ must be $0$... (Thinking)

I'm stuck now. What do you mean? (Wondering)
 
mathmari said:
I'm stuck now. What do you mean?

Isn't a homogeneous function $f$ of degree 2 one such that $f(t\mathbf x)=t^2 f(\mathbf x)$ for all $t>0$ and all $\mathbf x$?
Or is it different? (Wondering)

Let's pick some $\mathbf x \ne \mathbf 0$ and try $t=1$ and then $t=2$.
Then we must have $\frac 1{1^2} f(0)=\frac 1{2^2} f(0)$.
This can only be true if $f(0)=0$, can't it? (Wondering)
 
I like Serena said:
Isn't a homogeneous function $f$ of degree 2 one such that $f(t\mathbf x)=t^2 f(\mathbf x)$ for all $t>0$ and all $\mathbf x$?
Or is it different? (Wondering)

Let's pick some $\mathbf x \ne \mathbf 0$ and try $t=1$ and then $t=2$.
Then we must have $\frac 1{1^2} f(0)=\frac 1{2^2} f(0)$.
This can only be true if $f(0)=0$, can't it? (Wondering)
Ah ok!

We have that $$f(x) = f(0) + Df(0)x + \frac 12 D^2f(0)x^2 + \frac 1{3!} D^3f(0) x^3 + ...$$ and $$f(x)=\frac{1}{t^2}f(0) + \frac{Df(0)}{t}x + \frac 12 D^2f(0)x^2 + \frac 1{3!} D^3f(0) tx^3 + ...$$

$\frac{1}{t^2}f(0) =f(0), \forall t$ holds only when $f(0)=0$.

$\frac{Df(0)}{t}=Df(0), \forall t$ holds only when $Df(0)=0$

$\frac 1{3!} D^3f(0) =\frac 1{3!} D^3f(0) t, \forall t$ holds only when $D^3f(0)=0$

The same holds for every $k\geq 4$: $D^kf(0)=0$.

In that way we get that $f(x)=\frac 12 D^2f(0)x^2$. This is the same as $\frac{1}{2}x^T\cdot H_f(0)\cdot x$, or not? (Wondering)

And for the first question, that the function has its possible local extrema at its roots, do we get that from the fact that $f(0)=Df(0)=0$? (Wondering)
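To see the coefficient comparison concretely, here is a small sympy sketch; the sample function below is an assumption and is deliberately not homogeneous, so the surviving coefficients show exactly which terms would have to vanish for $f(tx)=t^2f(x)$ to hold for all $t$.

```python
# Sketch of the coefficient comparison: the assumed sample f below is
# deliberately NOT homogeneous, so f(tx) - t^2 f(x) has surviving terms.
import sympy as sp

t, x1, x2 = sp.symbols('t x1 x2')
f = lambda a, b: 5 + 2*a + a**2 + 3*a*b + a**3   # constant, linear, quadratic, cubic parts

diff_expr = sp.expand(f(t*x1, t*x2) - t**2 * f(x1, x2))
for k in range(int(sp.degree(diff_expr, t)) + 1):
    print(f"t^{k}:", diff_expr.coeff(t, k))
# Only the constant, linear and cubic parts of f survive; each of these
# coefficients would have to be zero for f(tx) = t^2 f(x) to hold for all t.
```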
 
  • #10
mathmari said:
In that way we get that $f(x)=\frac 12 D^2f(0)x^2$. This is the same as $\frac{1}{2}x^T\cdot H_f(0)\cdot x$, or not?

And for the first question, that the function has its possible local extrema at its roots, do we get that from the fact that $f(0)=Df(0)=0$?

Yes to both. (Nod)

So we have proven the statements if $f$ is infinitely differentiable.
However, I have just realized that this is not given.
We only have that $f$ is differentiable twice, and the second derivative does not even have to be continuous. (Worried)
 
  • #11
I like Serena said:
Yes to both. (Nod)

So we have proven the statements if $f$ is infinitely differentiable.
However, I have just realized that this is not given.
We only have that $f$ is differentiable twice, and the second derivative does not even have to be continuous. (Worried)
I just saw that I forgot the word "continuously" in my initial post, so $f$ is twice continuously differentiable. (Blush)

But having that $f(0)=Df(0)=0$ means that the function and the gradient are $0$ at the point $0$. Does this mean that the function can have its extrema only at the same points as its roots? (Wondering)
 
  • #12
Or do we have to do the same for a general point instead of $0$ ? (Wondering)
 
  • #13
I am stuck now. (Worried)
 
  • #14
Thanks to Euge I have a solution to your problem now. (Happy)

From the equation $f(t\mathbf{x}) = t^2 f(\mathbf{x})$, take the derivative with respect to $t$ twice on both sides to obtain $$\sum\limits_{i,\,j} x^ix^j\frac{\partial^2 f}{\partial x^i \partial x^j}(t\mathbf{x}) = 2f(\mathbf{x})$$ where $\mathbf{x} = (x^1,\ldots, x^n)$. Since $f$ is twice continuously differentiable, taking limits as $t \to 0$ results in (after dividing by $2$) $$f(\mathbf{x}) = \frac{1}{2}\sum\limits_{i,\,j} x^i\frac{\partial^2 f}{\partial x^i \partial x^j}(\mathbf{0})\,x^j = \frac{1}{2}\mathbf{x}^TH_f(\mathbf{0})\,\mathbf{x}$$ as desired. (Nerd)

As for the other statement, take the first derivative with respect to $t$ on both sides of the equation $f(t\mathbf{x}) = t^2 f(\mathbf{x})$ to get $\mathbf{x}\cdot Df(t\mathbf{x}) = 2tf(\mathbf{x})$. Evaluating at $t = 1$ yields $$f(\mathbf{x}) = \frac{1}{2}\mathbf{x}\cdot Df(\mathbf{x})$$ If $f$ has a critical point at $\mathbf{x} = \mathbf{c}$, then $Df(\mathbf{c}) = \mathbf{0}$; in light of the above equation, $f(\mathbf{c}) = \frac{1}{2}\mathbf{c}\cdot \mathbf{0} = 0$, that is, $\mathbf{c}$ is a zero of $f$. (Thinking)
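Both identities can be checked symbolically on an example; the sample function in the sketch below is an assumption chosen only for illustration.

```python
# Check the two identities above on an assumed sample function
# that is homogeneous of degree 2.
import sympy as sp

t, x1, x2 = sp.symbols('t x1 x2')
f = lambda a, b: a**2 + 3*a*b - b**2             # sample with f(tx) = t^2 f(x)

# d^2/dt^2 f(tx) = 2 f(x):
print(sp.simplify(sp.diff(f(t*x1, t*x2), t, 2) - 2*f(x1, x2)))            # 0

# d/dt f(tx) at t = 1 gives x . Df(x) = 2 f(x):
print(sp.simplify(sp.diff(f(t*x1, t*x2), t).subs(t, 1) - 2*f(x1, x2)))    # 0
```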
 
  • #15
I like Serena said:
From the equation $f(t\mathbf{x}) = t^2 f(\mathbf{x})$, take the derivative with respect to $t$ twice on both sides to obtain $$\sum\limits_{i,\,j} x^ix^j\frac{\partial^2 f}{\partial x^i \partial x^j}(t\mathbf{x}) = 2f(\mathbf{x})$$ where $\mathbf{x} = (x^1,\ldots, x^n)$.

We have the equation $f(t\mathbf{x}) = t^2 f(\mathbf{x})$. When we take the second derivative with respect to $t$ on the left side we get the following using the chain rule:
\begin{align*}\frac{d}{dt}f(t\mathbf{x})&=\frac{d}{dt}f(t(x_1, x_2, \ldots , x_n))=\frac{d}{dt}f(tx_1, tx_2, \ldots , tx_n)=\sum_{i=1}^n\frac{\partial{f}}{\partial{x_i}}\cdot \frac{d(tx_i)}{dt}\\ & =\sum_{i=1}^n\frac{\partial{f}}{\partial{x_i}}\cdot x_i\end{align*}
\begin{align*}\frac{d^2}{dt^2}f(t\mathbf{x})&=\frac{d}{dt}\left (\sum_{i=1}^n\frac{\partial{f}}{\partial{x_i}}(t(x_1, x_2, \ldots , x_n))\cdot x_i\right )=\frac{d}{dt}\left (\sum_{i=1}^nf_{x_i}(t(x_1, x_2, \ldots , x_n))\cdot x_i\right )\\ & =\sum_{i=1}^nx_i\cdot \frac{d}{dt}f_{x_i}(t(x_1, x_2, \ldots , x_n)) = \sum_{i=1}^nx_i\cdot \sum_{j=1}^n\frac{\partial{f_{x_i}}}{\partial{x_j}}\cdot \frac{d(tx_j)}{dt}\\ & =\sum_{i,j=1}^nx_i\cdot x_j\cdot \frac{\partial{f_{x_i}}}{\partial{x_j}}=\sum_{i,j=1}^nx_i\cdot x_j\cdot \frac{\partial^2{f}}{\partial{x_i}\partial{x_j}}\end{align*}
Is everything correct? Could we improve something? (Wondering)
 
  • #16
mathmari said:
We have the equation $f(t\mathbf{x}) = t^2 f(\mathbf{x})$. When we take the second derivative with respect to $t$ on the left side we get the following using the chain rule:
\begin{align*}\frac{d}{dt}f(t\mathbf{x})&=\frac{d}{dt}f(t(x_1, x_2, \ldots , x_n))=\frac{d}{dt}f(tx_1, tx_2, \ldots , tx_n)=\sum_{i=1}^n\frac{\partial{f}}{\partial{x_i}}\cdot \frac{d(tx_i)}{dt}\\ & =\sum_{i=1}^n\frac{\partial{f}}{\partial{x_i}}\cdot x_i\end{align*}
\begin{align*}\frac{d^2}{dt^2}f(t\mathbf{x})&=\frac{d}{dt}\left (\sum_{i=1}^n\frac{\partial{f}}{\partial{x_i}}(t(x_1, x_2, \ldots , x_n))\cdot x_i\right )=\frac{d}{dt}\left (\sum_{i=1}^nf_{x_i}(t(x_1, x_2, \ldots , x_n))\cdot x_i\right )\\ & =\sum_{i=1}^nx_i\cdot \frac{d}{dt}f_{x_i}(t(x_1, x_2, \ldots , x_n)) = \sum_{i=1}^nx_i\cdot \sum_{j=1}^n\frac{\partial{f_{x_i}}}{\partial{x_j}}\cdot \frac{d(tx_j)}{dt}\\ & =\sum_{i,j=1}^nx_i\cdot x_j\cdot \frac{\partial{f_{x_i}}}{\partial{x_j}}=\sum_{i,j=1}^nx_i\cdot x_j\cdot \frac{\partial^2{f}}{\partial{x_i}\partial{x_j}}\end{align*}
Is everything correct? Could we improve something? (Wondering)

All derivatives should be partial derivatives.
That is, we assume that the $x_i$ do not depend on $t$, which is what a partial derivative means.
Otherwise we would have for instance:
$$\frac{d(tx_i)}{dt} = x_i + t\frac{dx_i}{dt}$$

Everything else is correct. (Happy)
 
  • #17
I like Serena said:
All derivatives should be partial derivatives.
That is, we assume that the $x_i$ do not depend on $t$, which is what a partial derivative means.
Otherwise we would have for instance:
$$\frac{d(tx_i)}{dt} = x_i + t\frac{dx_i}{dt}$$

Everything else is correct. (Happy)

You mean the following:
\begin{align*}\frac{d}{dt}f(t\mathbf{x})&=\frac{d}{dt}f(t(x_1, x_2, \ldots , x_n))=\frac{d}{dt}f(tx_1, tx_2, \ldots , tx_n)=\sum_{i=1}^n\frac{\partial{f}}{\partial{x_i}}\cdot \frac{\partial{(tx_i)}}{\partial{t}}\\ & =\sum_{i=1}^n\frac{\partial{f}}{\partial{x_i}}\cdot x_i\end{align*}
\begin{align*}\frac{d^2}{dt^2}f(t\mathbf{x})&=\frac{d}{dt}\left (\sum_{i=1}^n\frac{\partial{f}}{\partial{x_i}}(t(x_1, x_2, \ldots , x_n))\cdot x_i\right )=\frac{d}{dt}\left (\sum_{i=1}^nf_{x_i}(t(x_1, x_2, \ldots , x_n))\cdot x_i\right )\\ & =\sum_{i=1}^nx_i\cdot \frac{d}{dt}f_{x_i}(t(x_1, x_2, \ldots , x_n)) = \sum_{i=1}^nx_i\cdot \sum_{j=1}^n\frac{\partial{f_{x_i}}}{\partial{x_j}}\cdot \frac{\partial{(tx_j)}}{\partial{t}}\\ & =\sum_{i,j=1}^nx_i\cdot x_j\cdot \frac{\partial{f_{x_i}}}{\partial{x_j}}=\sum_{i,j=1}^nx_i\cdot x_j\cdot \frac{\partial^2{f}}{\partial{x_i}\partial{x_j}}\end{align*} or not? (Wondering)
 
  • #18
mathmari said:
You mean the following:
\begin{align*}\frac{d}{dt}f(t\mathbf{x})&=\frac{d}{dt}f(t(x_1, x_2, \ldots , x_n))=\frac{d}{dt}f(tx_1, tx_2, \ldots , tx_n)=\sum_{i=1}^n\frac{\partial{f}}{\partial{x_i}}\cdot \frac{\partial{(tx_i)}}{\partial{t}}\\ & =\sum_{i=1}^n\frac{\partial{f}}{\partial{x_i}}\cdot x_i\end{align*}
\begin{align*}\frac{d^2}{dt^2}f(t\mathbf{x})&=\frac{d}{dt}\left (\sum_{i=1}^n\frac{\partial{f}}{\partial{x_i}}(t(x_1, x_2, \ldots , x_n))\cdot x_i\right )=\frac{d}{dt}\left (\sum_{i=1}^nf_{x_i}(t(x_1, x_2, \ldots , x_n))\cdot x_i\right )\\ & =\sum_{i=1}^nx_i\cdot \frac{d}{dt}f_{x_i}(t(x_1, x_2, \ldots , x_n)) = \sum_{i=1}^nx_i\cdot \sum_{j=1}^n\frac{\partial{f_{x_i}}}{\partial{x_j}}\cdot \frac{\partial{(tx_j)}}{\partial{t}}\\ & =\sum_{i,j=1}^nx_i\cdot x_j\cdot \frac{\partial{f_{x_i}}}{\partial{x_j}}=\sum_{i,j=1}^nx_i\cdot x_j\cdot \frac{\partial^2{f}}{\partial{x_i}\partial{x_j}}\end{align*} or not? (Wondering)

I meant everywhere. (Nerd)

Consider that the application of the chain rule for a regular derivative is:
$$\frac{d}{dt} f(tx,ty) = \frac{\partial f}{\partial x}\cdot\frac{d(tx)}{dt} + \frac{\partial f}{\partial y}\cdot\frac{d(ty)}{dt} = \frac{\partial f}{\partial x}\cdot\left(x+t\frac{dx}{dt}\right) + \frac{\partial f}{\partial y}\cdot\left(y+t\frac{dy}{dt}\right)$$
While the application of the chain rule for a partial derivative is:
$$\frac{\partial}{\partial t} f(tx,ty) = \frac{\partial f}{\partial x}\cdot\frac{\partial(tx)}{\partial t} + \frac{\partial f}{\partial y}\cdot\frac{\partial(ty)}{\partial t} = \frac{\partial f}{\partial x}\cdot x + \frac{\partial f}{\partial y}\cdot y$$

We want the latter, don't we? (Wondering)
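The difference can also be seen symbolically: with $x,y$ as plain symbols (independent of $t$) the derivative of $f(tx,ty)$ collapses to $x\,f_x+y\,f_y$ evaluated along $(tx,ty)$, while declaring them as functions of $t$ brings in the extra terms. The sample function below is an assumption for illustration only.

```python
# Whether x, y depend on t changes the chain rule; the sample f is assumed.
import sympy as sp

t = sp.symbols('t')
f = lambda a, b: a**2 + a*b

# x, y independent of t: d/dt f(tx, ty) = x*f_x + y*f_y along (tx, ty)
x, y = sp.symbols('x y')
print(sp.expand(sp.diff(f(t*x, t*y), t)))        # 2*t*x**2 + 2*t*x*y

# x, y functions of t: extra terms with dx/dt and dy/dt appear
X, Y = sp.Function('x')(t), sp.Function('y')(t)
print(sp.expand(sp.diff(f(t*X, t*Y), t)))
```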
 
  • #19
I like Serena said:
I meant everywhere. (Nerd)

Consider that the application of the chain rule for a regular derivative is:
$$\frac{d}{dt} f(tx,ty) = \frac{\partial f}{\partial x}\cdot\frac{d(tx)}{dt} + \frac{\partial f}{\partial y}\cdot\frac{d(ty)}{dt} = \frac{\partial f}{\partial x}\cdot\left(x+t\frac{dx}{dt}\right) + \frac{\partial f}{\partial y}\cdot\left(y+t\frac{dy}{dt}\right)$$
While the application of the chain rule for a partial derivative is:
$$\frac{\partial}{\partial t} f(tx,ty) = \frac{\partial f}{\partial x}\cdot\frac{\partial(tx)}{\partial t} + \frac{\partial f}{\partial y}\cdot\frac{\partial(ty)}{\partial t} = \frac{\partial f}{\partial x}\cdot x + \frac{\partial f}{\partial y}\cdot y$$

We want the latter, don't we? (Wondering)

Ah ok! Thank you very much! (Smile)
 
