How Can I Simplify This Linear Algebra Expression for Differentiation?

GeoffO
I am trying to simplify the following, so that I can differentiate it (with respect to X). Ideally I'll have everything in terms of (XX^T - YY^T).

\mathrm{trace}[(AX(AX)^T)((BX)(BX)^T)^{-1}] - \mathrm{trace}[(AY(AY)^T)((BY)(BY)^T)^{-1}]

Where X and Y are N x 3 (so that AX and BX are defined) and A and B are N x N. A is symmetric, B is anti-symmetric (or skew-symmetric).

Some useful properties:
\mathrm{trace}(UV) = \mathrm{trace}(VU)
\mathrm{trace}(U)+\mathrm{trace}(V) = \mathrm{trace}(U+V)

I can't figure this out and have spent a long time working on it.
A is a band diagonal matrix where each row is a shifted version of (1 -2 1) and B is similar with a stencil of (-1 0 1).
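For anyone who wants to experiment, here is a minimal sketch in plain Python of the two band matrices described above. The helper name `band_matrix`, the size `N = 6`, and the way the stencil is simply truncated in the first and last rows are assumptions for illustration, not something stated in the post.

```python
def band_matrix(n, stencil):
    """Build an n x n matrix whose rows are shifted copies of a 3-point stencil.

    stencil = (lower, diag, upper). Boundary rows are truncated (an assumption).
    """
    lower, diag, upper = stencil
    m = [[0] * n for _ in range(n)]
    for i in range(n):
        if i > 0:
            m[i][i - 1] = lower
        m[i][i] = diag
        if i < n - 1:
            m[i][i + 1] = upper
    return m


def transpose(m):
    return [list(row) for row in zip(*m)]


def negate(m):
    return [[-v for v in row] for row in m]


N = 6  # hypothetical size, chosen only for the demo
A = band_matrix(N, (1, -2, 1))   # stencil (1 -2 1)
B = band_matrix(N, (-1, 0, 1))   # stencil (-1 0 1)

is_A_symmetric = transpose(A) == A
is_B_antisymmetric = transpose(B) == negate(B)
print(is_A_symmetric, is_B_antisymmetric)
```

With this construction the symmetry of A and the anti-symmetry of B follow directly from the stencils.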

Any ideas?
 
Welcome to PF!

Hi GeoffO! Welcome to PF! :smile:

Hint: (PQ)^T = Q^T P^T.

And XX^T is an n x n matrix.

So just change the order of everything. :wink:
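Spelled out, the hint applies to each product in the original expression as follows. This is only a sketch of the first step; it does not touch the inverse.

```latex
(AX)(AX)^T = A\,X X^T A^T, \qquad (BX)(BX)^T = B\,X X^T B^T
(AY)(AY)^T = A\,Y Y^T A^T, \qquad (BY)(BY)^T = B\,Y Y^T B^T
```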
 
Thanks for the reply. I have used this property and end up with
\mathrm{trace}[(B^{-1}A)^T (XX^T(B^{-1}A)(X^TX)^{-1} - YY^T(B^{-1}A)(Y^TY)^{-1})]

Which is okay, but I'd really like to get that common (B^{-1}A) term out of there.

We can rewrite this as
(Q^T)(R^T Q R^{-1} - S^T Q S^{-1})
Where R and S are symmetric (because they're squared X and Y from above). This looks so simple, but I cannot find a way to further simplify.
 
GeoffO said:
Thanks for the reply. I have used this property and end up with
\mathrm{trace}[(B^{-1}A)^T (XX^T(B^{-1}A)(X^TX)^{-1} - YY^T(B^{-1}A)(Y^TY)^{-1})]

Now change the order of everything …

put all the As and Bs on the left, and the XX^T or YY^T on the right. :smile:
 
tiny-tim said:
Now change the order of everything …

put all the As and Bs on the left, and the XX^T or YY^T on the right. :smile:

What property allows me to reorder like that? I only know of the property
\mathrm{trace}(AB) = \mathrm{trace}(BA)
(Again, thank you.)
 
GeoffO said:
What property allows me to reorder like that? I only know of the property
\mathrm{trace}(AB) = \mathrm{trace}(BA)
(Again, thank you.)

Yes, and it works for any square matrices …

and XX^T is square, so you can shove it off to one end. :smile:
 
tiny-tim said:
XX^T is square, so you can shove it off to one end. :smile:

I see. You mean the following, right?
\mathrm{trace}[(B^{-1}A)(X^TX)^{-1}(B^{-1}A)^T (XX^T) - (B^{-1}A)(Y^TY)^{-1}(B^{-1}A)^T (YY^T)]

I would like to factor out the (B^{-1}A) terms, if possible. (Thanks!)
 
GeoffO said:
I see. You mean the following, right?
\mathrm{trace}[(B^{-1}A)(X^TX)^{-1}(B^{-1}A)^T (XX^T) - (B^{-1}A)(Y^TY)^{-1}(B^{-1}A)^T (YY^T)]

No, I mean
Tr[(B^{-1}A)(B^{-1}A)^T] Tr[(XX^T)(X^TX)^{-1}] - Tr[(B^{-1}A)(B^{-1}A)^T] Tr[(YY^T)(Y^TY)^{-1}]
 
tiny-tim said:
No, I mean
Tr[(B^{-1}A)(B^{-1}A)^T] Tr[(XX^T)(X^TX)^{-1}] - Tr[(B^{-1}A)(B^{-1}A)^T] Tr[(YY^T)(Y^TY)^{-1}]

Hmm... somewhere I have made an error. (X^TX)^{-1} is a 3x3, so this doesn't make sense... back in a minute after I figure out my error.

In the meantime how did you introduce multiplication of traces? What property am I missing? This looks promising!
 
  • #10
GeoffO said:
In the meantime how did you introduce multiplication of traces? What property am I missing? This looks promising!

oops … I got carried away :redface: … I meant

Tr[(B^{-1}A)(B^{-1}A)^T (XX^T)(X^TX)^{-1}] - Tr[(B^{-1}A)(B^{-1}A)^T (YY^T)(Y^TY)^{-1}]
 
  • #11
Yeah, sorry, the inverted terms should have been (XX^T)^{-1} and (YY^T)^{-1}. This would result in all of the X and Y terms canceling out in your reduction. I'm excited to learn what property allows for this step!
 
  • #12
tiny-tim said:
oops … I got carried away :redface: … I meant

Tr[(B^{-1}A)(B^{-1}A)^T (XX^T)(X^TX)^{-1}] - Tr[(B^{-1}A)(B^{-1}A)^T (YY^T)(Y^TY)^{-1}]

Even still, what allows you to go from Tr(M N M^T N^{-1}) to Tr(M M^T N N^{-1})?

This is not just a rotation, it's a reordering, right?
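A small exact check suggests the concern is justified: for a general M the two traces differ, even when N is symmetric. The sketch below uses plain Python with `fractions.Fraction` so that no rounding is involved. The particular 2 x 2 matrices and the helper names (`matmul`, `chain`, `trace`) are made up for illustration.

```python
from fractions import Fraction


def matmul(p, q):
    """Multiply two matrices given as lists of rows."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))]
            for i in range(len(p))]


def chain(*mats):
    """Multiply the matrices left to right."""
    out = mats[0]
    for m in mats[1:]:
        out = matmul(out, m)
    return out


def transpose(m):
    return [list(row) for row in zip(*m)]


def trace(m):
    return sum(m[i][i] for i in range(len(m)))


# Hypothetical example: M is not symmetric, N is symmetric and invertible.
M = [[Fraction(1), Fraction(1)],
     [Fraction(0), Fraction(1)]]
N = [[Fraction(1), Fraction(0)],
     [Fraction(0), Fraction(2)]]
N_inv = [[Fraction(1), Fraction(0)],
         [Fraction(0), Fraction(1, 2)]]

lhs = trace(chain(M, N, transpose(M), N_inv))   # Tr(M N M^T N^{-1})
rhs = trace(chain(M, transpose(M), N, N_inv))   # Tr(M M^T N N^{-1})
print(lhs, rhs)
```

Here the two values come out different, so the step is a genuine reordering and needs more than Tr(AB) = Tr(BA).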
 
  • #13
** solved **

I see how you did it, for any three square **SYMMETRIC** (or anti-symmetric) matrices you can reorder at will because of these properties.

The first Tr(AB) = Tr(BA) allows us to write Tr(ABC) = Tr(BCA) and the like.

The second is Tr(A) = Tr(A^T).

With these we can permute any three square **SYMMETRIC** (or anti-symmetric) matrices. Here is a proof:
Tr(ABC) = Tr(A^T B^T C^T) = Tr((CBA)^T) = Tr(CBA) = Tr(ACB) = Tr(BAC)
Where the last two steps are just an application of the first property.
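The proof above can be sanity-checked on small integer matrices. This is a minimal sketch in plain Python; the matrices `A`, `B`, `C` (all symmetric) and `P`, `Q`, `R` (not symmetric) are arbitrary examples chosen for illustration, and `traces_of_all_orderings` is a hypothetical helper name.

```python
from itertools import permutations


def matmul(p, q):
    """Multiply two matrices given as lists of rows."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))]
            for i in range(len(p))]


def trace(m):
    return sum(m[i][i] for i in range(len(m)))


def traces_of_all_orderings(mats):
    """Return the set of traces of the product over every ordering of mats."""
    results = set()
    for order in permutations(mats):
        product = order[0]
        for m in order[1:]:
            product = matmul(product, m)
        results.add(trace(product))
    return results


# Three symmetric matrices: every ordering should give the same trace.
A = [[1, 2], [2, 3]]
B = [[0, 1], [1, 4]]
C = [[2, -1], [-1, 1]]
sym_traces = traces_of_all_orderings([A, B, C])

# Three non-symmetric matrices: only cyclic rotations are guaranteed to agree.
P = [[0, 1], [0, 0]]
Q = [[0, 0], [1, 0]]
R = [[1, 0], [0, 0]]
generic_traces = traces_of_all_orderings([P, Q, R])

print(sym_traces, generic_traces)
```

The symmetric set collapses to a single value, while the non-symmetric set contains two values, one for each cyclic class of orderings.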

So, we can say A= MN, B=M^T and C=N^{-1} which allows us to get the result you indicate.

Thank you very much, you have solved my problem!

As a side note X and Y are not invertible, but using the pseudoinverse allows me to approximate the least squares solution that I am after.

Thanks, I hope to give back a bit to the forum in the future.
 
  • #14
GeoffO said:
I see how you did it, for any three square matrices you can reorder at will because of these properties.

With these we can permute any three square matrices. Here is a proof:
Tr(ABC) = Tr(A^T B^T C^T) = Tr((CBA)^T) = Tr(CBA) = Tr(ACB) = Tr(BAC)

That's it! :biggrin:

Except that I think it only works for three symmetric and/or anti-symmetric square matrices, because of the step Tr(ABC) = Tr(A^T B^T C^T) …

which is ok in this case because XX^T is symmetric. :wink:
 
  • #15
tiny-tim said:
Except that I think it only works for three symmetric and/or anti-symmetric square matrices

Good point... It's careful surgery. I'll have to double check, not everything here is symmetrical, so being sure to associate carefully is very important.

(I edited the post above lest I confuse any future reader.)
 
  • #16
GeoffO said:
(I edited the post above lest I confuse any future reader.)

You can put and/or anti-symmetric in also, even for an odd number …

if A and C are symmetric and B is anti-symmetric, then ABC is anti-symmetric, so you get a -1 for B^T and for (ABC)^T, so it's still +1 in the end. :smile:

EDIT: oooh no, that's rubbish … if A and B are both symmetric, then AB needn't be, although AB + BA will be … and if A is symmetric and B is anti-symmetric, then AB needn't be anti-symmetric, although AB - BA will be.

But the traces still work for a symmetric A and C and an anti-symmetric B …

Tr(ABC) = -Tr(BAC)

though not for three symmetric and one anti-symmetric …

Tr(ABCD) = -Tr(BADC), not -Tr(BACD)
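For reference, both sign rules follow from Tr(M) = Tr(M^T) plus cyclic rotation. The sketch below assumes A, C and D are symmetric and B is anti-symmetric; D is the extra symmetric matrix from the four-factor case.

```latex
% Assumptions: A^T = A, \; C^T = C, \; D^T = D, \; B^T = -B
\mathrm{Tr}(ABC) = \mathrm{Tr}\big((ABC)^T\big) = \mathrm{Tr}(C^T B^T A^T) = -\mathrm{Tr}(CBA) = -\mathrm{Tr}(BAC)
\mathrm{Tr}(ABCD) = \mathrm{Tr}\big((ABCD)^T\big) = \mathrm{Tr}(D^T C^T B^T A^T) = -\mathrm{Tr}(DCBA) = -\mathrm{Tr}(BADC)
```

The last equality in each line is a cyclic rotation only, which is why the four-factor result lands on BADC and cannot be turned into BACD.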

And B^{-1}A isn't symmetric or anti-symmetric anyway, so I think it's back to square one … :frown:
 