# Procrustes Analysis and Lagrange Multipliers

1. Dec 23, 2011

### AntsyPants

The problem:

Minimize $\mathrm{tr}\{R_{yx}R\}$ subject to $R^TR = I$

This problem is known as Procrustes analysis and can be solved using Lagrange multipliers, so it is tempting to write the following function:

$L(R) = \mathrm{tr}\{R_{yx}R\} - \Lambda(R^TR - I)$,

where $\Lambda$ is a matrix of Lagrange multipliers.

However, there is an obvious flaw in this: the first term is a scalar, while the second is a square matrix.

So the question is: what is the correct form of the Lagrangian? I'm familiar with this subject, but this example baffles me. I've searched in some mathematics books, but they only cover scalar fields subject to simpler constraints.

If there's any problem with this post, let me know, please.

Last edited: Dec 23, 2011
2. Dec 23, 2011

### I like Serena

Welcome to PF, AntsyPants!

It looks like you need an alternative matrix multiplication.
You need to treat $\Lambda$ as if it were a vector and take its dot product with the other matrix, also treated as a vector.

For reference, this type of multiplication is called the Frobenius inner product.
See http://en.wikipedia.org/wiki/Matrix_multiplication#Frobenius_inner_product
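As a concrete illustration (not from the thread, just a sketch), the Frobenius inner product $A:B = \mathrm{tr}\{A^TB\}$ is simply the sum of elementwise products, which a short NumPy snippet can verify:

```python
import numpy as np

# Frobenius inner product: A:B = tr(A^T B) = sum of elementwise products.
A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[5.0, 6.0], [7.0, 8.0]])

frob = np.trace(A.T @ B)      # tr(A^T B)
elementwise = np.sum(A * B)   # equivalent elementwise-sum form

print(frob, elementwise)  # both equal 70.0
```

Seen this way, the constraint term really is "one multiplier per matrix entry, all summed", which is exactly what the multiple-constraints form of Lagrange multipliers requires.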

3. Dec 23, 2011

### AntsyPants

I'll try this tomorrow, but it will most likely give the desired result. However, why do we have the freedom to define a new matrix product that happens to suit the problem? Or must this alternative product always be used for these types of constraints?

Do you know of any references which I can study regarding this?

4. Dec 23, 2011

### I like Serena

The only reference I have available is the wikipedia article on Lagrange multipliers.
It has a section on "Multiple constraints".

I think it's not so much that we take the freedom of defining a new matrix product, but rather that we need a way to write down that each equation (each matrix entry) in the constraint is multiplied by its own separate multiplier and then everything is summed.

As you can see, in your equation $\Lambda$ is not really multiplied; it is applied as a function.
The text should explain how this is intended.

Last edited: Dec 23, 2011
5. Dec 24, 2011

### AntsyPants

OK I solved this, so I might as well share it with everyone else.

First of all, for convenience, I changed the constraint to $RR^T = I$, which is completely equivalent.

So the problem is as follows:

Maximize $\mathrm{tr}\{R_{xy}R^T\}$ subject to $RR^T = I$

The solution:

First of all, we can write the Lagrangian as

$L(R) = \mathrm{tr}\{R_{xy}R^T\} + \Lambda:(RR^T - I)$ (1),

where $\Lambda$ is a matrix of Lagrange multipliers. Since each matrix entry is a separate constraint, we multiply it by the corresponding Lagrange multiplier and sum over all entries, which is exactly what the Frobenius inner product, denoted ':', does. Keeping in mind that $A:B = \mathrm{tr}\{A^TB\}$, we can rewrite (1) as

$L(R) = \mathrm{tr}\{R_{xy}R^T\} + \mathrm{tr}\{\Lambda^T(RR^T - I)\}$ (2).

Differentiating this with respect to R and setting it to zero, we get

$R_{xy} + (\Lambda + \Lambda^T)R = 0$
$R_{xy} = -(\Lambda + \Lambda^T)R$ (3)
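This gradient can be sanity-checked numerically. The sketch below (with randomly generated $R_{xy}$, $\Lambda$, and $R$; all names are my own, not from the thread) compares the closed-form gradient $R_{xy} + (\Lambda + \Lambda^T)R$ against central finite differences of the Lagrangian in (2):

```python
import numpy as np

rng = np.random.default_rng(0)
Rxy = rng.standard_normal((3, 3))
Lam = rng.standard_normal((3, 3))
R = rng.standard_normal((3, 3))

def L(R):
    # Lagrangian (2): tr{R_xy R^T} + tr{Lam^T (R R^T - I)}
    return np.trace(Rxy @ R.T) + np.trace(Lam.T @ (R @ R.T - np.eye(3)))

# Closed-form gradient from the derivation: R_xy + (Lam + Lam^T) R
grad_analytic = Rxy + (Lam + Lam.T) @ R

# Central finite-difference gradient, entry by entry
eps = 1e-6
grad_fd = np.zeros_like(R)
for i in range(3):
    for j in range(3):
        E = np.zeros((3, 3))
        E[i, j] = eps
        grad_fd[i, j] = (L(R + E) - L(R - E)) / (2 * eps)

err = np.max(np.abs(grad_analytic - grad_fd))
print(err)  # tiny: L is quadratic in R, so central differences are near-exact
```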

$-(\Lambda + \Lambda^T)$ forms a new set of multipliers, which we can define as $\Phi$, so

$R_{xy} = \Phi R$ (4).

Therefore,

$R_{xy}R_{xy}^T = \Phi RR^T\Phi^T = \Phi\Phi^T$ (5),

since $R$ is now constrained to be orthogonal ($RR^T = I$).

We can now apply the SVD to $R_{xy}$:

$R_{xy} = UDV^T$ (6),

where $U$ and $V$ are orthogonal matrices and $D$ is a diagonal matrix. Using (5), we know that

$\Phi\Phi^T = UD^2U^T$ (7).

Since $\Phi = -(\Lambda + \Lambda^T)$ is symmetric, (7) gives $\Phi = UDU^T$. Substituting this into (4), we obtain

$UDV^T = UDU^TR$, and therefore

$R = UV^T$ (8).

It's now clear that $R$ is the orthogonal matrix closest to $R_{xy}$, because (8) is exactly the solution that minimizes the Frobenius norm of $R - R_{xy}$.
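The whole solution is easy to sketch in NumPy (a hedged illustration with a randomly generated square $R_{xy}$, not the thread's data; SciPy's `scipy.linalg.orthogonal_procrustes` implements the same idea):

```python
import numpy as np

# Sketch of the thread's result: the orthogonal matrix closest to Rxy
# (in Frobenius norm) is R = U V^T, where Rxy = U D V^T is the SVD.
rng = np.random.default_rng(1)
Rxy = rng.standard_normal((4, 4))   # hypothetical data matrix

U, D, Vt = np.linalg.svd(Rxy)       # D holds the singular values
R = U @ Vt                          # equation (8)

# R satisfies the constraint R R^T = I ...
assert np.allclose(R @ R.T, np.eye(4))

# ... and a spot check against random orthogonal Q finds nothing closer:
best = np.linalg.norm(R - Rxy)
for _ in range(100):
    Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))
    assert np.linalg.norm(Q - Rxy) >= best - 1e-9

print(best)
```

Note that the minimal distance itself comes out as $\|I - D\|_F$, since $\|UV^T - UDV^T\|_F = \|I - D\|_F$ by orthogonal invariance of the Frobenius norm.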