Confused with this proof for the Cauchy Schwarz inequality

AI Thread Summary
The discussion centers on the confusion regarding the proof of the Cauchy-Schwarz inequality, particularly in determining whether the critical point found by setting the derivative with respect to lambda is a minimum. Participants note that the behavior of the function for large lambda indicates it cannot be a maximum. The importance of minimizing the inner product is emphasized, with some arguing that the proof's steps are sketchy and lack justification. A geometrical interpretation related to the Gram-Schmidt process is suggested as a way to understand why the chosen lambda minimizes the expression. Ultimately, the conversation highlights the need for clarity in the proof's reasoning and the role of lambda in achieving the desired result.
jaded2112
Homework Statement
This might be a silly question but I'm honestly confused. In this proof, how was it concluded that the critical point found by setting the derivative with respect to lambda to zero is a minimum?
Relevant Equations
https://ibb.co/BnFGKwX
I'm confused because finding the minimising value of lambda is an important part of the proof, but it isn't clear to me that the critical point is a minimum.
 
jaded2112 said:
Homework Statement:: This might be a silly question but I am honestly confused. In this proof, how was it concluded that the critical point found by setting the derivative with respect to lambda to zero is a minimum?
Relevant Equations:: https://ibb.co/BnFGKwX

I'm confused because finding the minimising value of lambda is an important part of the proof, but it isn't clear to me that the critical point is a minimum.
It can't be a maximum, due to the behaviour of the function for large ##\lambda##.

That said, treating ##\lambda## and ##\lambda^*## as independent variables needs a bit of explanation!
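A sketch of why that works (an editorial aside, assuming the expansion of ##(\phi-\lambda\psi,\phi-\lambda\psi)\geq 0## and treating ##\lambda## and ##\lambda^*## as independent variables, Wirtinger-style):
$$\frac{\partial}{\partial \lambda^*}\Bigl[(\phi,\phi)-\lambda(\phi,\psi)-\lambda^*(\psi,\phi)+\lambda\lambda^*(\psi,\psi)\Bigr] = -(\psi,\phi)+\lambda(\psi,\psi) = 0 \quad\Longrightarrow\quad \lambda = \frac{(\psi,\phi)}{(\psi,\psi)}.$$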
 
PeroK said:
It can't be a maximum, due to the behaviour of the function for large ##\lambda##.

That said, treating ##\lambda## and ##\lambda^*## as independent variables needs a bit of explanation!
Thanks for the reply.
Can you elaborate on why the critical point can't be a maximum?
Also, I think lambda is the independent variable because it can be any complex-valued number, and minimising the inner product is an important part of the proof.
 
It doesn't matter whether it is a minimum or a maximum.

We have this inequality, which holds for any ##\lambda## (it is just ##(\phi-\lambda\psi,\phi-\lambda\psi)\geq 0## expanded):
$$0\leq (\phi,\phi)-\lambda(\phi,\psi)-\lambda^*(\psi,\phi)+\lambda\lambda^*(\psi,\psi)$$
Now suppose that I tell you to apply this inequality for ##\lambda=\frac{(\psi,\phi)}{(\psi,\psi)}##.
You will then be able to infer what is asked for, namely the Cauchy-Schwarz inequality for the inner product, won't you?
But then you are going to ask me how I came up with this value of ##\lambda##, won't you?
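For completeness, here is that substitution written out (assuming the physics convention, antilinear in the first slot, so ##(\phi,\psi)=(\psi,\phi)^*##). With ##\lambda = \frac{(\psi,\phi)}{(\psi,\psi)}##, each of the three ##\lambda##-terms in the expansion of ##(\phi-\lambda\psi,\phi-\lambda\psi)\geq 0## equals ##\frac{|(\psi,\phi)|^2}{(\psi,\psi)}## up to sign, and the sum collapses:
$$0 \le (\phi,\phi) - \frac{|(\psi,\phi)|^2}{(\psi,\psi)} - \frac{|(\psi,\phi)|^2}{(\psi,\psi)} + \frac{|(\psi,\phi)|^2}{(\psi,\psi)} = (\phi,\phi) - \frac{|(\psi,\phi)|^2}{(\psi,\psi)},$$
hence ##|(\psi,\phi)|^2 \le (\phi,\phi)(\psi,\psi)##.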
 
jaded2112 said:
Thanks for the reply.
Can you elaborate on why lambda can't be a maximum?
Also, i think lambda is the independent variable because it can be any complex valued number and minimising the inner product is an important part of the proof.
There is no maximum as the expression has no upper bound as ##|\lambda| \rightarrow \infty##. That's supposed to be so obvious that the author doesn't even mention it.

You're saying that ##\lambda^*## is independent of ##\lambda##? In the same way that ##x## is independent of ##y##. You can vary ##\lambda^*## without changing ##\lambda##?

There is a reason that works, but it definitely needs some explanation and without it the proof is very sketchy.

Note that, in general, just because a set of steps comes out with the right answer doesn't mean the steps are right.
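Both points can be checked numerically. The following is a minimal sketch (not from the thread; the vectors and the function `f` are illustrative, and NumPy's `vdot` conjugates its first argument, matching the ##(\cdot,\cdot)## convention used here):

```python
import numpy as np

rng = np.random.default_rng(0)
# Two random complex vectors; np.vdot conjugates its first argument,
# matching the physics inner-product convention in the thread.
phi = rng.normal(size=4) + 1j * rng.normal(size=4)
psi = rng.normal(size=4) + 1j * rng.normal(size=4)

def f(lam):
    """f(lam) = (phi - lam*psi, phi - lam*psi) = ||phi - lam*psi||^2 >= 0."""
    return np.vdot(phi - lam * psi, phi - lam * psi).real

# No maximum: f is unbounded as |lam| grows.
assert f(1000.0) > f(10.0)

# The critical point is a minimum: any perturbation strictly increases f,
# since f(lam0 + eps) - f(lam0) = |eps|^2 * ||psi||^2 exactly.
lam0 = np.vdot(psi, phi) / np.vdot(psi, psi)
for eps in (0.1, 0.1j, -0.2, -0.3j):
    assert f(lam0 + eps) > f(lam0)
```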
 
@PeroK, @jaded2112 where do we use the fact that it is a minimum? Sorry, I simply don't understand; it seems to me that all we need is a suitable value of ##\lambda## at which to apply the inequality.
 
Delta2 said:
@PeroK, @jaded2112 where do we use the fact that it is a minimum? Sorry, I simply don't understand; it seems to me that all we need is a suitable value of ##\lambda## at which to apply the inequality.
You're right. To get the required result, you just need to set ##\lambda## equal to the required expression.
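A quick numerical sanity check of this (a sketch; the vectors are illustrative, and `np.vdot` conjugates its first argument, matching the ##(\cdot,\cdot)## convention above):

```python
import numpy as np

rng = np.random.default_rng(1)
phi = rng.normal(size=5) + 1j * rng.normal(size=5)
psi = rng.normal(size=5) + 1j * rng.normal(size=5)

ip = np.vdot  # antilinear in the first argument

# Set lambda equal to the required expression...
lam = ip(psi, phi) / ip(psi, psi)

# ...then (phi - lam*psi, phi - lam*psi) >= 0 collapses to
# (phi,phi) - |(psi,phi)|^2/(psi,psi) >= 0, which is Cauchy-Schwarz rearranged.
residual = ip(phi - lam * psi, phi - lam * psi).real
target = ip(phi, phi).real - abs(ip(psi, phi)) ** 2 / ip(psi, psi).real
assert np.isclose(residual, target)
assert residual >= 0
assert abs(ip(psi, phi)) ** 2 <= ip(phi, phi).real * ip(psi, psi).real
```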
 
PeroK said:
You're right. To get the required result, you just need to set ##\lambda## equal to the required expression.
But then again, why is the value of ##\lambda## that makes that expression a minimum (or maximum) with respect to ##\lambda^*## the one that works? Sorry, but I just don't get this.
 
Delta2 said:
But then again, why is the value of ##\lambda## that makes that expression a minimum (or maximum) with respect to ##\lambda^*## the one that works? Sorry, but I just don't get this.
See proof 3.

https://en.wikipedia.org/wiki/Cauchy–Schwarz_inequality

There's a neat geometrical view of why that ##\lambda## minimises the expression, which is related to the Gram-Schmidt process. You're taking off the amount of one vector that is contained in the other. That's not needed for the proof. That's another reason to distrust this proof as a bit sketchy - it gets the right answer, but what's being done is not justified.

PS I'm sure I've seen that minimisation argument used somewhere, but I can't remember where. Something more general than the C-S inequality.
 
PeroK said:
See proof 3.

https://en.wikipedia.org/wiki/Cauchy–Schwarz_inequality

There's a neat geometrical view of why that ##\lambda## minimises the expression, which is related to the Gram-Schmidt process. You're taking off the amount of one vector that is contained in the other. That's not needed for the proof. That's another reason to distrust this proof as a bit sketchy - it gets the right answer, but what's being done is not justified.

PS I'm sure I've seen that minimisation argument used somewhere, but I can't remember where. Something more general than the C-S inequality.
Sorry, proof 3 in that Wikipedia link just magically comes up with the value of lambda. Perhaps you meant to post another link? The Gram-Schmidt process? How does that relate here?
 
Delta2 said:
Sorry, proof 3 in that Wikipedia link just magically comes up with the value of lambda. Perhaps you meant to post another link? The Gram-Schmidt process? How does that relate here?
You should be able to draw a diagram and see this for yourself (using real vectors for simplicity):

Let's start with two linearly independent vectors ##\vec u, \vec v##. You could draw ##\vec u## along the x-axis, say, and ##\vec v## somewhere in the first quadrant.

To minimise ##|\vec w| = |\vec u - \lambda \vec v|##, we want the vector ##\vec w## to be orthogonal to the vector ##\vec v##. This is exactly what we do in the Gram-Schmidt process: we want a linear combination of ##\vec u, \vec v## that is orthogonal to ##\vec v##.

And, to find ##\lambda## we can use: $$\vec v \cdot \vec w = \vec v \cdot (\vec u - \lambda \vec v) = \vec v \cdot \vec u - \lambda |\vec v|^2$$ And, if ## \vec v \cdot \vec w = 0##, then $$\lambda = \frac{\vec v \cdot \vec u}{|\vec v|^2}$$ And C-S falls out of this if we look at ##|\vec w|^2##, with the added result that equality occurs only if ##\vec u, \vec v## are linearly dependent.
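The picture above can be checked with a couple of concrete vectors (illustrative values, not from the thread):

```python
import numpy as np

# Real-vector sketch of the Gram-Schmidt picture described above.
u = np.array([3.0, 1.0, 2.0])
v = np.array([1.0, 2.0, 0.5])

lam = np.dot(v, u) / np.dot(v, v)   # lambda = (v.u)/|v|^2
w = u - lam * v                     # residual after removing the v-component of u

# w is orthogonal to v by construction...
assert np.isclose(np.dot(v, w), 0.0)
# ...and |w|^2 = |u|^2 - (v.u)^2/|v|^2 >= 0 rearranges to Cauchy-Schwarz:
assert np.isclose(np.dot(w, w), np.dot(u, u) - np.dot(v, u) ** 2 / np.dot(v, v))
assert np.dot(v, u) ** 2 <= np.dot(u, u) * np.dot(v, v)
```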
 
