Fermat's theorem (stationary points) of higher dimensions

Discussion Overview

The discussion revolves around extending Fermat's theorem on stationary points to higher dimensions. Participants explore how to adapt existing proofs and the implications of differentiability in this context, focusing on theoretical aspects and mathematical reasoning.

Discussion Character

  • Technical explanation
  • Mathematical reasoning
  • Debate/contested

Main Points Raised

  • Some participants suggest substituting the derivative in the one-dimensional case with the gradient in higher dimensions, defined using partial derivatives.
  • There is a question about the values of the partial derivatives at a local maximum, with some participants proposing that they equal zero.
  • One participant seeks to prove that if a function has a local maximum at a point, it must either be differentiable at that point with a zero derivative or not differentiable at all.
  • A later reply discusses the challenge of reducing the problem to the one-dimensional case and the implications of differentiability in higher dimensions, particularly regarding the relationship between column and row vectors in the context of derivatives.

Areas of Agreement / Disagreement

Participants express differing views on how to approach the proof in higher dimensions, and there is no consensus on the best method or the implications of differentiability at local maxima.

Contextual Notes

The discussion highlights the complexity of extending concepts from one dimension to higher dimensions, particularly regarding the definitions and roles of derivatives and gradients. There are unresolved questions about the assumptions underlying the proofs and the mathematical steps involved.

ianchenmu said:
Look at this page and the Proof part,

Fermat's theorem (stationary points) - Wikipedia, the free encyclopedia

How to change the proof 2 into a proof of higher dimensions or can you give a proof of Fermat's theorem of higher dimensions?


In the case of higher dimensions you can substitute the derivative $\displaystyle f^{\ '}(x)= \frac{d f(x)}{d x}$ with the gradient, defined as... $\displaystyle \nabla f(x_{1},\ x_{2}, ...,\ x_{n}) = \frac{ \partial f}{\partial x_{1}}\ \overrightarrow {e}_{1} + \frac{ \partial f}{\partial x_{2}}\ \overrightarrow {e}_{2} + ... + \frac{ \partial f}{\partial x_{n}}\ \overrightarrow {e}_{n}$ (1)

Kind regards

$\chi$ $\sigma$
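As a quick numerical illustration of definition (1) (not part of any proof, and using an example function chosen here for demonstration only), one can check that a central-difference approximation of the gradient vanishes at a local maximum:

```python
import numpy as np

def grad(f, x, h=1e-6):
    """Central-difference approximation of the gradient of f at x."""
    x = np.asarray(x, dtype=float)
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = h
        # Partial derivative in the i-th coordinate direction.
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

# f(x, y) = -(x - 1)^2 - (y + 2)^2 has its global maximum at (1, -2).
f = lambda x: -(x[0] - 1) ** 2 - (x[1] + 2) ** 2

print(grad(f, [1.0, -2.0]))  # ≈ [0, 0]
```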
 
chisigma said:
In the case of higher dimensions you can substitute the derivative $\displaystyle f^{\ '}(x)= \frac{d f(x)}{d x}$ with the gradient, defined as... $\displaystyle \nabla f(x_{1},\ x_{2}, ...,\ x_{n}) = \frac{ \partial f}{\partial x_{1}}\ \overrightarrow {e}_{1} + \frac{ \partial f}{\partial x_{2}}\ \overrightarrow {e}_{2} + ... + \frac{ \partial f}{\partial x_{n}}\ \overrightarrow {e}_{n}$ (1)

Kind regards

$\chi$ $\sigma$

Thanks. But what about $f:\mathbb{R}^n\rightarrow \mathbb{R}$?
 
chisigma said:
In the case of higher dimensions you can substitute the derivative $\displaystyle f^{\ '}(x)= \frac{d f(x)}{d x}$ with the gradient, defined as... $\displaystyle \nabla f(x_{1},\ x_{2}, ...,\ x_{n}) = \frac{ \partial f}{\partial x_{1}}\ \overrightarrow {e}_{1} + \frac{ \partial f}{\partial x_{2}}\ \overrightarrow {e}_{2} + ... + \frac{ \partial f}{\partial x_{n}}\ \overrightarrow {e}_{n}$ (1)

Kind regards

$\chi$ $\sigma$

But what then? What do $\frac{ \partial f}{\partial x_{1}},\frac{ \partial f}{\partial x_{2}},...,\frac{ \partial f}{\partial x_{n}}$ equal?

(I mean, is it true that $\frac{ \partial f}{\partial x_{1}}(a)=0,\ \frac{ \partial f}{\partial x_{2}}(a)=0,\ ...,\ \frac{ \partial f}{\partial x_{n}}(a)=0$, where $a$ is a local maximum? If so, why?)
 
ianchenmu said:
But what then? What do $\frac{ \partial f}{\partial x_{1}},\frac{ \partial f}{\partial x_{2}},...,\frac{ \partial f}{\partial x_{n}}$ equal?

If you write $\displaystyle \overrightarrow {x}$ for a generic vector of dimension $n$ and $\displaystyle \overrightarrow {0}$ for the null vector of dimension $n$, then $\displaystyle \overrightarrow {x}_{0}$ is a relative maximum or minimum only if...

$\displaystyle \nabla f (\overrightarrow {x}_{0}) = \overrightarrow {0}$ (1)

Kind regards

$\chi$ $\sigma$
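A sketch of why (1) holds, by reducing to the one-dimensional Fermat theorem one coordinate at a time (assuming each partial derivative exists at the maximum point $a$):

```latex
% Fix i and freeze all other coordinates of the maximum point a:
%   g_i(t) = f(a_1, \dots, a_{i-1}, t, a_{i+1}, \dots, a_n).
% Since f has a local maximum at a, g_i has a local maximum at t = a_i,
% so the one-dimensional Fermat theorem gives
g_i'(a_i) \;=\; \frac{\partial f}{\partial x_i}(a) \;=\; 0,
\qquad i = 1, \dots, n,
% and hence \nabla f(a) = \overrightarrow{0}.
```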
 
chisigma said:
If you write $\displaystyle \overrightarrow {x}$ for a generic vector of dimension $n$ and $\displaystyle \overrightarrow {0}$ for the null vector of dimension $n$, then $\displaystyle \overrightarrow {x}_{0}$ is a relative maximum or minimum only if...

$\displaystyle \nabla f (\overrightarrow {x}_{0}) = \overrightarrow {0}$ (1)

Kind regards

$\chi$ $\sigma$
But this is what I need to prove. To clarify, I need to prove this:
Let $E\subset \mathbb{R}^n$ and $f:E\rightarrow\mathbb{R}$ be a continuous function. Prove that if $a$ is a local maximum point for $f$, then either $f$ is differentiable at $x=a$ with $Df(a)=0$ or $f$ is not differentiable at $a$.
 
ianchenmu said:
But this is what I need to prove. To clarify, I need to prove this:
Let $E\subset \mathbb{R}^n$ and $f:E\rightarrow\mathbb{R}$ be a continuous function. Prove that if $a$ is a local maximum point for $f$, then either $f$ is differentiable at $x=a$ with $Df(a)=0$ or $f$ is not differentiable at $a$.
When I saw this problem, I thought that it would be easy to tackle it by reducing it to the one-dimensional case. In fact, let $b\in\mathbb{R}^n$. If $f$ has a local maximum at $a$, then the function $g:\mathbb{R}\to\mathbb{R}$ defined by $g(t) = f(a+tb)$ must have a maximum at $t=0$. What we need to do here is to choose the vector $b$ suitably. Then, provided that $f$ is differentiable at $a$, we can use the fact that $g'(0) = 0$ to deduce that $Df(a) = 0.$

But that turns out to be a bit tricky. The reason is that if $a\in\mathbb{R}^n$ then the derivative $Df(a)$ belongs to the dual space of $\mathbb{R}^n$. In other words, if you think of $a$ as a column vector, then $Df(a)$ will be a row vector. So suppose we take $b=(Df(a))^{\text{T}}$, the transpose of $Df(a)$. By the higher-dimensional chain rule, $g'(0) = Df(a)\, b = Df(a)\, (Df(a))^{\text{T}} = \bigl|Df(a)\bigr|^2.$ But since $0$ is a local maximum for $g$, it follows that $g'(0) = 0$ and hence $Df(a) = 0.$

If you really want to get to grips with duality, and the reasons for distinguishing between row vectors and column vectors, then you will have to come to terms with covariance and contravariance.
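The chain-rule step above can be checked numerically. The sketch below (using an example function and a hand-computed derivative chosen only for illustration, at a point that is not a maximum) confirms that for $g(t) = f(a + tb)$ with $b = (Df(a))^{\text{T}}$, the derivative $g'(0)$ equals $|Df(a)|^2$:

```python
import numpy as np

# Example function and its row-vector derivative (computed by hand).
def f(x):
    return x[0] ** 2 + 3 * x[1] ** 2

def Df(x):
    return np.array([2 * x[0], 6 * x[1]])

a = np.array([1.0, 2.0])
b = Df(a)  # the transpose of the row vector Df(a), viewed as a column vector

# g(t) = f(a + t*b); approximate g'(0) by a central difference.
g = lambda t: f(a + t * b)
h = 1e-6
g_prime_0 = (g(h) - g(-h)) / (2 * h)

# Both quantities equal |Df(a)|^2 = 2^2 + 12^2 = 148.
print(g_prime_0, np.dot(Df(a), b))
```

At an actual local maximum the same identity forces $|Df(a)|^2 = g'(0) = 0$, which is the point of the argument.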
 