Procrustes Analysis and Lagrange Multipliers

AntsyPants · Dec 23, 2011

The problem:

Minimize tr{R_yxR} subject to R^TR=I

This problem is known as Procruses Analysis and can be solved using Lagrange Multipliers, so there's a tendency to write the following function:

L(R) = tr{R_yxR} - \Lambda(R^TR-I),

where \Lambda is a matrix of Lagrange Multipliers

However, there is obviously a flaw in this, because the first parcel is a scalar, whilst the second is a square matrix.

So the question is: What is the correct form of the Lagrange equation? I'm familiar with this subject, but this example baffles me. I've searched in some mathematics books, but they only explain for scalar field subject to simpler equations.

If there's any problem with this post, let me know, please.

I like Serena · Dec 23, 2011

Welcome to PF, AntsyPants!

It looks like you need an alternative matrix multiplication.
You need to multiply ##\Lambda## as if it's a vector with which you calculate a dot product with another matrix that is also treated as a vector.

For reference, this type of multiplication is called the Frobenius inner product.
See http://en.wikipedia.org/wiki/Matrix_multiplication#Frobenius_inner_product

AntsyPants · Dec 23, 2011

I'll try this tomorrow, but it will most likely give the desired result. However, why are we given such freedom of defining a new matrix product that's more suitable to the problem? Or is it that in these types of constraints this alternative product must be used?

Do you know of any references which I can study regarding this?

I like Serena · Dec 23, 2011

The only reference I have available is the wikipedia article on Lagrange multipliers.
Is has a section on "Multiple constraints".

I think it's not so much that freedom is taken of defining a new matrix product, but rather that a way is needed to write down that each equation (each matrix entry) in the constraint must be summed while multiplied with a separate multiplier.

As you can see in your equation ##\Lambda## is not multiplied, but it is considered a function.
In the text should be explained how this is intended.

AntsyPants · Dec 24, 2011

OK I solved this, so I might as well share it with everyone else.

First of all, for convenience, I changed the constraint to R R^T=I, which is completely equivalent.

So the problem is as follows:

Maximize tr{R_{xy}R^{T}} subject to R R^{T}=I

The solution:

First of all, we can write the lagrangean function as

L(R) = tr{R_{xy}R^{T}} + \Lambda:(R R^{T}-I) (1),

where \Lambda is a matrix of Lagrange Multipliers. Since each matrix entry is a constraint, we need to multiply it by the corresponding lagrange multiplier and sum it at the end, which is done by the Frobenius Inner Product, :. Keeping in mind that A:B = tr{A^{T}B}, we can rewrite (1) as

L(R) = tr{R_{xy}R^{T}} + tr{\Lambda^{T}(R R^{T}-I)} (2).

Differentiating this with respect to R and setting it to zero, we get

R_{xy} + (\Lambda+\Lambda^{T})R=0
R_{xy} = - (\Lambda+\Lambda^{T})R (3)

- (\Lambda+\Lambda^{T}) forms a new set of multipliers, which we can define as \Phi so,

R_{xy} = \Phi R (4).

Therefore,

R_{xy}R^{T}_{xy} = \Phi R R^{T}\Phi^{T} = \Phi \Phi^{T} (5),

seeing it is now imposed R is orthogonal (R R^{T}=I).

We can now use SVD decomposition on R_{xy}:

R_{xy} = U D VR^{T} (6),

where U and V are orthogonal matrixes and D is a diagonal matrix. Using (5) we know that

\Phi \Phi^{T} = U D^{2}U^{T} (7).

This means that \Phi = U D U^{T}. Replacing this in (4), we obtain

U D V^{T} = U D U^{T} R. Therefore,

R = U V^{T} (8).

It's now clear that R is the Orthogonal Matrix closest to R_{xy}, because (8) is the equation which minimizes the Frobenius Norm of R-R_{xy}

Procrustes Analysis and Lagrange Multipliers

Thread 'How does time derivative commute from one variable to another?'

Similar threads

Hot Threads

I Algebraic property of real numbers

I Problem in understanding instantaneous velocity

I How to find the path if we only know the velocity (without common formulas)?

I Harmonic series Ʃ1/n diverges but p-series Ʃ(1/n)^p diverges?

I Explicit logical justification for last step in epsilon/delta proof?

Recent Insights

Insights Quantum Entanglement is a Kinematic Fact, not a Dynamical Effect

Insights What Exactly is Dirac’s Delta Function? - Insight

Insights Relativator (Circular Slide-Rule): Simulated with Desmos - Insight

Insights Fixing Things Which Can Go Wrong With Complex Numbers

Insights Fermat's Last Theorem

Insights Why Vector Spaces Explain The World: A Historical Perspective