How Can I Simplify This Linear Algebra Expression for Differentiation?

Discussion Overview

The discussion revolves around simplifying a linear algebra expression for differentiation, specifically focusing on the trace of products involving matrices. Participants explore properties of traces and matrix operations to achieve a desired form, with the goal of expressing everything in terms of (XX^T - YY^T).

Discussion Character

  • Technical explanation
  • Mathematical reasoning
  • Debate/contested

Main Points Raised

  • One participant presents a complex expression involving traces and matrix products, seeking simplification for differentiation.
  • Another participant suggests using properties of transposes and traces to reorder terms in the expression.
  • Several participants discuss the implications of reordering matrices, particularly focusing on symmetric and anti-symmetric properties.
  • There is a back-and-forth regarding the correct application of trace properties, with some participants expressing uncertainty about the conditions under which these properties hold.
  • A later reply indicates that the simplification process leads to a realization about the conditions necessary for reordering matrices, specifically mentioning symmetric and anti-symmetric matrices.
  • One participant acknowledges an error in their understanding of matrix inverses and traces, prompting further clarification on the properties of traces.
  • Another participant concludes that the problem is solved by recognizing the conditions under which traces can be permuted, while also noting the use of pseudoinverses for non-invertible matrices.
  • There is a discussion about the limitations of the properties when applied to non-symmetric matrices, with participants expressing caution in their application.

Areas of Agreement / Disagreement

Participants express varying levels of agreement on the properties of traces and the conditions for reordering matrices. Some participants are confident in the properties discussed, while others remain uncertain or challenge the applicability of certain properties to their specific case.

Contextual Notes

Participants note that the matrices involved have specific structures (symmetric and anti-symmetric) which influence the validity of the properties being applied. There is also mention of the use of pseudoinverses, indicating that the discussion is context-dependent and may not apply universally.

Who May Find This Useful

This discussion may be useful for individuals interested in advanced linear algebra, particularly those dealing with matrix calculus, trace properties, and the simplification of expressions for differentiation in mathematical contexts.

GeoffO
I am trying to simplify the following, so that I can differentiate it (with respect to X). Ideally I'll have everything in terms of ([tex]XX^T - YY^T[/tex]).

[tex] \mathrm{trace}[(AX(AX)^T)((BX)(BX)^T)^{-1}] - \mathrm{trace}[(AY(AY)^T)((BY)(BY)^T)^{-1}][/tex]

Where X and Y are N x 3 (so that AX is defined and [tex]X^TX[/tex] is 3 x 3) and A and B are N x N. A is symmetric, B is anti-symmetric (or skew-symmetric).

Some useful properties:
[tex] \mathrm{trace}(UV) = \mathrm{trace}(VU)[/tex]
[tex] \mathrm{trace}(U)+\mathrm{trace}(V) = \mathrm{trace}(U+V)[/tex]

I can't figure this out and have spent a long time working on it.
A is a band diagonal matrix where each row is a shifted version of (1 -2 1) and B is similar with a stencil of (-1 0 1).

Any ideas?
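
For readers who want to experiment, here is a minimal sketch of the two band matrices and the two trace properties listed above. It is an illustration only: the size `n = 5`, the helper names, and the choice to simply truncate the stencil at the first and last rows are assumptions, since the post does not say how the boundaries are handled.

```python
def band_matrix(n, lower, diag, upper):
    """Build an n x n tridiagonal matrix with constant bands (truncated at the edges)."""
    def entry(i, j):
        if j == i - 1:
            return lower
        if j == i:
            return diag
        if j == i + 1:
            return upper
        return 0
    return [[entry(i, j) for j in range(n)] for i in range(n)]

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(p, q):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*q)] for row in p]

def trace(m):
    return sum(m[i][i] for i in range(len(m)))

n = 5  # hypothetical size, chosen only for illustration
A = band_matrix(n, 1, -2, 1)   # stencil (1 -2 1)
B = band_matrix(n, -1, 0, 1)   # stencil (-1 0 1)

# Structure claimed in the post: A symmetric, B anti-symmetric.
is_symmetric = transpose(A) == A
is_antisymmetric = transpose(B) == [[-x for x in row] for row in B]

# trace(UV) = trace(VU), checked on this particular pair.
trace_AB = trace(matmul(A, B))
trace_BA = trace(matmul(B, A))

# trace(U) + trace(V) = trace(U + V)
A_plus_B = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]
trace_sum_matches = trace(A) + trace(B) == trace(A_plus_B)
```

With these truncated boundaries, A comes out symmetric and B anti-symmetric, matching the description in the post.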
 
Welcome to PF!

Hi GeoffO! Welcome to PF! :smile:

Hint: [tex](PQ)^T = Q^T P^T[/tex].

And [tex]XX^T[/tex] is an n x n matrix.

So just change the order of everything. :wink:
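
As a worked step, taking the expression exactly as written in the opening post, applying the hint to the two factors of the first term gives:

[tex](AX)(AX)^T = A\,XX^T A^T, \qquad (BX)(BX)^T = B\,XX^T B^T[/tex]

so the first trace becomes

[tex]\mathrm{trace}[(A\,XX^T A^T)(B\,XX^T B^T)^{-1}][/tex]

and likewise for the Y term. Splitting the remaining inverse into separate factors is only valid when each factor is itself square and invertible.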
 
Thanks for the reply. I have used this property and end up with
[tex] \mathrm{trace}[(B^{-1}A)^T (XX^T(B^{-1}A)(X^TX)^{-1} - YY^T(B^{-1}A)(Y^TY)^{-1})][/tex]

Which is okay, but I'd really like to get that common [tex](B^{-1}A)[/tex] term out of there.

We can rewrite this as
[tex] (Q^T)(R^T Q R^{-1} - S^T Q S^{-1})[/tex]
Where R and S are symmetric (because they're squared X and Y from above). This looks so simple, but I cannot find a way to simplify it further.
 
GeoffO said:
Thanks for the reply. I have used this property and end up with
[tex] \mathrm{trace}[(B^{-1}A)^T (XX^T(B^{-1}A)(X^TX)^{-1} - YY^T(B^{-1}A)(Y^TY)^{-1})][/tex]

Now change the order of everything …

put all the As and Bs on the left, and the [tex]XX^T[/tex] or [tex]YY^T[/tex] on the right. :smile:
 
tiny-tim said:
Now change the order of everything …

put all the As and Bs on the left, and the [tex]XX^T[/tex] or [tex]YY^T[/tex] on the right. :smile:

What property allows me to reorder like that? I only know of the property
[tex]\mathrm{trace}(AB) = \mathrm{trace}(BA)[/tex]
(Again, thank you.)
 
GeoffO said:
What property allows me to reorder like that? I only know of the property
[tex]\mathrm{trace}(AB) = \mathrm{trace}(BA)[/tex]
(Again, thank you.)

Yes, and it works for any square matrices …

and [tex]XX^T[/tex] is square, so you can shove it off to one end. :smile:
 
tiny-tim said:
[tex]XX^T[/tex] is square, so you can shove it off to one end. :smile:

I see. You mean the following, right?
[tex] \mathrm{trace}[(B^{-1}A)(X^TX)^{-1}(B^{-1}A)^T (XX^T) - (B^{-1}A)(Y^TY)^{-1}(B^{-1}A)^T (YY^T)][/tex]

I would like to factor out the [tex](B^{-1}A)[/tex] terms, if possible. (Thanks!)
 
GeoffO said:
I see. You mean the following, right?
[tex] \mathrm{trace}[(B^{-1}A)(X^TX)^{-1}(B^{-1}A)^T (XX^T) - (B^{-1}A)(Y^TY)^{-1}(B^{-1}A)^T (YY^T)][/tex]

No, I mean
[tex]Tr(B^{-1}A)(B^{-1}A)^T Tr(XX^T)(X^TX)^{-1} - Tr(B^{-1}A)(B^{-1}A)^T Tr(YY^T)(Y^TY)^{-1}[/tex]
 
tiny-tim said:
No, I mean
[tex]Tr(B^{-1}A)(B^{-1}A)^T Tr(XX^T)(X^TX)^{-1} - Tr(B^{-1}A)(B^{-1}A)^T Tr(YY^T)(Y^TY)^{-1}[/tex]

Hmm... somewhere I have made an error. [tex](X^TX)^{-1}[/tex] is a 3x3, so this doesn't make sense... back in a minute after I figure out my error.

In the meantime how did you introduce multiplication of traces? What property am I missing? This looks promising!
 
  • #10
GeoffO said:
In the meantime how did you introduce multiplication of traces? What property am I missing? This looks promising!

oops … I got carried away :redface: … I meant

[tex]Tr(B^{-1}A)(B^{-1}A)^T (XX^T)(X^TX)^{-1} - Tr(B^{-1}A)(B^{-1}A)^T (YY^T)(Y^TY)^{-1}[/tex]
 
  • #11
Yeah, sorry the inverted terms should have been [tex](XX^T)^{-1}[/tex] and [tex](YY^T)^{-1}[/tex]. This would result in all of the X and Y terms canceling out in your reduction. I'm excited to learn what property allows for this step!
 
  • #12
tiny-tim said:
oops … I got carried away :redface: … I meant

[tex]Tr(B^{-1}A)(B^{-1}A)^T (XX^T)(X^TX)^{-1} - Tr(B^{-1}A)(B^{-1}A)^T (YY^T)(Y^TY)^{-1}[/tex]

Even still, what allows you to go from [tex]Tr(M N M^T N^{-1})[/tex] to [tex]Tr(M M^T N N^{-1})[/tex]?

This is not just a rotation, it's a reordering, right?
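
The distinction between a rotation and a reordering can be checked on a small case. The sketch below uses hypothetical 2 x 2 matrices M and N (not taken from the thread), with exact fractions so that no rounding is involved. It compares the original product, a cyclic rotation of it, and the reordered product.

```python
from fractions import Fraction

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(p, q):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*q)] for row in p]

def chain(*matrices):
    """Multiply the given matrices left to right."""
    result = matrices[0]
    for m in matrices[1:]:
        result = matmul(result, m)
    return result

def trace(m):
    return sum(m[i][i] for i in range(len(m)))

# Hypothetical example matrices: M is not symmetric, N is symmetric (diagonal).
M = [[1, 1], [0, 1]]
N = [[1, 0], [0, 2]]
N_inv = [[1, 0], [0, Fraction(1, 2)]]  # inverse of N, written out by hand
Mt = transpose(M)

original = trace(chain(M, N, Mt, N_inv))   # Tr(M N M^T N^-1)
rotated = trace(chain(N, Mt, N_inv, M))    # cyclic rotation of the same product
reordered = trace(chain(M, Mt, N, N_inv))  # Tr(M M^T N N^-1)
```

For this pair the rotation leaves the trace unchanged, while the reordering changes it, so the step needs more than Tr(UV) = Tr(VU) in general.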
 
  • #13
** solved **

I see how you did it, for any three square **SYMMETRIC** (or anti-symmetric) matrices you can reorder at will because of these properties.

The first [tex]Tr(AB) = Tr(BA)[/tex] allows us to write [tex]Tr(ABC) = Tr(BCA)[/tex] and the like.

The second is [tex]Tr(A) = Tr(A^T)[/tex].

With these we can permute any three square **SYMMETRIC** (or anti-symmetric) matrices. Here is a proof:
[tex]Tr(ABC) = Tr(A^T B^T C^T) = Tr((CBA)^T) = Tr(CBA) = Tr(ACB) = Tr(BAC)[/tex]
Where the last two steps are just an application of the first property.

So, we can say [tex]A= MN[/tex], [tex]B=M^T[/tex] and [tex]C=N^{-1}[/tex] which allows us to get the result you indicate.

Thank you very much, you have solved my problem!

As a side note X and Y are not invertible, but using the pseudoinverse allows me to approximate the least squares solution that I am after.

Thanks, I hope to give back a bit to the forum in the future.
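
The permutation claim for three symmetric matrices can be spot-checked numerically. This is only a sketch with hypothetical 2 x 2 integer matrices: it confirms the identity on one symmetric triple and shows one non-symmetric triple where swapping the first two factors changes the trace.

```python
def matmul(p, q):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*q)] for row in p]

def trace3(p, q, r):
    """Trace of the triple product p q r."""
    prod = matmul(matmul(p, q), r)
    return sum(prod[i][i] for i in range(len(prod)))

# Hypothetical symmetric matrices.
A = [[1, 2], [2, 3]]
B = [[0, 1], [1, 4]]
C = [[2, -1], [-1, 5]]

# All six orderings for the symmetric triple.
symmetric_traces = {
    "ABC": trace3(A, B, C), "BCA": trace3(B, C, A), "CAB": trace3(C, A, B),
    "CBA": trace3(C, B, A), "ACB": trace3(A, C, B), "BAC": trace3(B, A, C),
}
all_equal = len(set(symmetric_traces.values())) == 1

# Hypothetical non-symmetric matrices: the swap ABC -> BAC is no longer safe.
P = [[0, 1], [0, 0]]
Q = [[0, 0], [1, 0]]
R = [[1, 0], [0, 0]]
trace_PQR = trace3(P, Q, R)
trace_QPR = trace3(Q, P, R)
```

Note that the proof relies on every one of the three factors being symmetric, so it is worth checking whether MN, M^T and N^{-1} individually satisfy that before applying it.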
 
  • #14
GeoffO said:
I see how you did it, for any three square matrices you can reorder at will because of these properties.

With these we can permute any three square matrices. Here is a proof:
[tex]Tr(ABC) = Tr(A^T B^T C^T) = Tr((CBA)^T) = Tr(CBA) = Tr(ACB) = Tr(BAC)[/tex]

That's it! :biggrin:

Except that I think it only works for three symmetric and/or anti-symmetric square matrices, because of the step [tex]Tr(ABC) = Tr(A^T B^T C^T)[/tex] …

which is ok in this case because [tex]XX^T[/tex] is symmetric. :wink:
 
  • #15
tiny-tim said:
Except that I think it only works for three symmetric and/or anti-symmetric square matrices

Good point... It's careful surgery. I'll have to double check, not everything here is symmetrical, so being sure to associate carefully is very important.

(I edited the post above lest I confuse any future reader.)
 
  • #16
GeoffO said:
(I edited the post above lest I confuse any future reader.)

You can put and/or anti-symmetric in also, even for an odd number …

if A and C are symmetric and B is anti-symmetric, then ABC is anti-symmetric, so you get a -1 for [tex]B^T[/tex] and for [tex](ABC)^T[/tex], so it's still +1 in the end. :smile:

EDIT: oooh no, that's rubbish … if A and B are both symmetric, then AB needn't be, although AB + BA will be … and if A is symmetric and B is anti-symmetric, then AB needn't be anti-symmetric, although AB - BA will be.

But the traces still work for a symmetric A and C and an anti-symmetric B …

Tr(ABC) = -Tr(BAC)

though not for three symmetric and one anti-symmetric …

Tr(ABCD) = -Tr(BADC), not (BACD)

And [tex]B^{-1}A[/tex] isn't symmetric or anti-symmetric anyway, so I think it's back to square one … :frown:
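
The two sign identities in this post can be spot-checked the same way. The sketch below uses hypothetical 2 x 2 matrices, with A, C and D symmetric and B anti-symmetric. It is a check on one example, not a proof.

```python
def matmul(p, q):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*q)] for row in p]

def trace_of(*matrices):
    """Trace of the product of the given matrices, multiplied left to right."""
    prod = matrices[0]
    for m in matrices[1:]:
        prod = matmul(prod, m)
    return sum(prod[i][i] for i in range(len(prod)))

# Hypothetical example matrices.
A = [[1, 2], [2, 3]]     # symmetric
B = [[0, 1], [-1, 0]]    # anti-symmetric
C = [[2, -1], [-1, 5]]   # symmetric
D = [[1, 0], [0, 2]]     # symmetric

# Three factors: Tr(ABC) = -Tr(BAC)
tr_ABC = trace_of(A, B, C)
tr_BAC = trace_of(B, A, C)

# Four factors: Tr(ABCD) = -Tr(BADC), which differs from -Tr(BACD) here
tr_ABCD = trace_of(A, B, C, D)
tr_BADC = trace_of(B, A, D, C)
tr_BACD = trace_of(B, A, C, D)
```

Both identities follow from taking the transpose inside the trace, which reverses the product and contributes one factor of -1 from B, followed by a cyclic rotation.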
 