Inner product space of matrices

1. Dec 8, 2008

turin

I want to consider the space of NxN real matrices as a vector space in which any given NxN matrix can be given as a real weighted sum of at most N^2 basis matrices. I already know how this works if I assume the form for the inner product (eg. (1/N)Tr(matrix product)).

However, here's the catch: I want to be able to first specify the N^2 basis matrices arbitrarily, and then define the inner product that will make this basis orthogonal. (I don't care how the inner product normalizes them.) I'm thinking that this would be analogous to, for example, starting with the 2-D basis $\left\{\hat{X},\hat{X}+\hat{Y}\right\}$, and then defining a dot-product such that $\hat{X}\cdot\left(\hat{X}+\hat{Y}\right)=0$.

I'm not sure how many restrictions I can allow on the basis, but I know that there are at least two restrictions that I don't care about: 1) no two distinct basis matrices are scalar multiples of each other, and 2) every basis matrix has a nonvanishing square. If either or both of these restrictions make the problem easier, then great. But I don't care if the basis violates them either.

I do need to make one condition on the inner product, however. I don't know if this is always true of inner products, but I need mine to be bilinear. The reason for this is, of course, that I want to use it to pull out "components" in the basis directions.

Now I will try to say this as mathematically as I know how:

Given a set of m matrices,
$$S_A=\left\{A_1,A_2,\ldots,A_m\right\}$$
s.t.
$$A_j\neq{}rA_i$$
for any
$$r\in\mathbb{R}$$
unless i=j, and
$$A_i^2\neq0\forall{}i$$
I want to find a function
$$g:S_A\times{}S_A\rightarrow\mathbb{R}$$
s.t.
$$g\left(aA_i+A_j,bA_k+A_l\right)=ar_i\left(b\delta_{i,k}+\delta_{i,l}\right)+r_j\left(b\delta_{j,k}+\delta_{j,l}\right)$$
for any real a and b and for some real r_i and r_j

The specific situation that sparked my interest in this was the recursion relations for the powers of the rotation generators. And, the specific example that I have in mind would have a basis set of four 4x4 matrices, where
$$A_{m\leq4}=A_1^m$$
and
(EDIT: I knew I had it right the first time. Removed square root from the denominator.)
$$A_1^{m>4}=\sum_{n=1}^4\frac{g\left(A_1^m,A_1^n\right)}{g\left(A_1^n,A_1^n\right)}A_1^n$$
(So this is actually for a 4-D subspace of the 16-D space of all 4x4 real matrices.)

I reiterate that I am not asking about the Gramm-Schmidt procedure, which assumes a known g and finds a new basis of A_i's. I want to keep my original A_i's, and make them orthogonal by choosing the definition of the inner product.

Last edited: Dec 9, 2008
2. Dec 8, 2008

D H

Staff Emeritus
Have you tried the Frobenious inner product:

$$g(A,B) = \sum_{i,j}A_{ij}B_{ij} = \textrm{trace}(A^TB)$$

3. Dec 8, 2008

turin

This will not work. For a 2x2 counterexample, if I choose a basis:
$$A_1=\left(\begin{array}{cc}1&1\\0&1\end{array}\right) \qquad A_2=\left(\begin{array}{cc}1&2\\0&1\end{array}\right)$$
then
$$g(A_1,A_2)=4\neq0$$
so A_1 and A_2 are not orthogonal according to this inner product.

4. Dec 8, 2008

Ben Niehoff

Here's a random guess:

Consider a general linear transformation of matrices:

$$A'_{ij} = M_{ij}^{kl} A_{kl}$$

The inner product on matrices is given by

$$(A, B) = A_{ij} B^{ji}$$

Therefore, if we transform both A and B using M, we get:

$$(MA, MB) = M_{ij}^{kl} A_{kl} M_{rs}^{ij} B^{sr}$$

and we see that the inner product will be invariant under the transformation M given the condition

$$M_{ij}^{kl} M_{rs}^{ij} = \delta_{rs}^{kl}$$

so, M is something like an "orthogonal matrix", except that it is a 4-dimensional array of numbers.

Using this, the next task is to find an M which relates all of your vectors to the standard basis vectors. Then you can define

$$(A, B) = \textrm{trace} \; (M^{-1}A)^T (M^{-1}B)$$

for two of your basis vectors A, B.

Stated another way, there is some bilinear form G such that

$$(A, B) = A^{ij} G_{ijkl} B^{kl}$$

gives the inner product you want. To find G, write for each of your n^2 basis vectors $E_{mn}$,

$$(E_{mn})^{ij} G_{ijkl} (E_{rs})^{kl} = \delta^{mn}_{rs}$$

and then solve these n^4 simultaneous linear equations for the n^4 elements of G.

Last edited: Dec 8, 2008
5. Dec 8, 2008

turin

Ben, I think that is in the right direction. Thanks to both of y'all, I now have an idea.
$$g(A,B)=Tr\left\{M_1AM_2B\right\}$$
or maybe
$$g(A,B)=Tr\left\{M_1A^TM_2B\right\}$$
where M_1 and M_2 are determined in terms of the basis. I'll think about this, and, if I come up with something, I will post it. Thanks again.

6. Dec 8, 2008

Hurkyl

Staff Emeritus
I think you're overthinking it. What if I asked you the following question:

Let V be a finite-dimensional vector space.
Let B be a basis for V.
Find an inner product on V for which B is orthogonal.

or even

Consider the vector space Rm
Consider the standard basis for Rm
Find an inner product for which the standard basis is orthogonal.

Your question as well as these two are all the same question....

You should care. (1) is (essentially) part of the definition of "basis".

That is the key part of the very definition of "inner product".

Last edited: Dec 8, 2008
7. Dec 8, 2008

Hurkyl

Staff Emeritus
Oh, was there another, important condition that you didn't state? Something along the lines of wanting an efficient algorithm to compute the inner product?

Along that lines -- do you really care about the inner product at all? If your goal is simply to have an algorithm to take a matrix and write it as a linear combination of your basis matrices, then would you be content with an algorithm that didn't make use of inner products?

8. Dec 8, 2008

turin

I suspected as much. I think that is my biggest problem with math. It seems like every time I think I know the rule, it turns out that there is a very important exception, and every time I try to be careful not to miss any exception, I get bogged down in formalism and can't break through to the solution.

Yes, this is basically my question. I wouldn't have any idea how to do this in such abstract terms.

But my main point is that I don't want to consider the standard basis with the standard inner product (dot product here?); I want to consider an arbitrary basis that is not orthogonal under the standard inner product, and then find an inner product (in terms of that basis), that makes it orthogonal. By "find" or "determine", I mean that I want to have some concrete (i.e. closed form) expression or rule that I can plug into to get the inner product for any two matrices in the relevant space of matrices.

I humbly (and naively) disagree. Can you explain that?

I suspected as much, but I didn't know if there was some more abstract notion of linear superposition that I might be neglecting, like some other way to "add" matrices.

I thought the key part was
$$g:V\times{}V\rightarrow\mathbb{C}$$
no? This doesn't say anything about linearity. In particular, are you saying that I cannot define the inner product as, eg.
$$g(v,w)=\left(\sum_{i=1}^nv_iw_i\right)^2$$
Or am I misunderstanding the concept of bilinearity? Is there more than one possible definition of inner product.

I just checked one of my math books (that I never understood very well), Ivar Stakgold, Green's Functions and Boundary Value Problems, and on p. 284, eq. (5.1), Ivar agrees with your claim.

Last edited: Dec 8, 2008
9. Dec 8, 2008

turin

I'm intrigued.

For my original purpose, absolutely. However, I am now curious how I would resolve the issue that I tried to convey in my first post in this thread. I am in disbelief (again naively) that such a thing can be done without inner products.

Last edited: Dec 8, 2008
10. Dec 8, 2008

Hurkyl

Staff Emeritus
One of the important ideas about a basis is that it defines a system of coordinates, in which the coordinate representation of the k-th basis vector is precisely the k-th standard basis vector.

Reorganizing the information slightly -- an ordered basis (b1, b2, .., bm) for a vector space V is essentially the same thing as the invertible linear transformation
B : Rm --> V.​
The correspondence between the two ideas is to set $B(\hat{e}_k) = b_k$. ($\hat{e}_k$ is the k-th standard basis vector)

11. Dec 8, 2008

turin

If I were to put this into my own words, I would say that the task of finding an inner product that renders the basis orthogonal is tantamount to finding B. Does that sound correct? If this is indeed the case, then I would say by question is: how do I find B?

12. Dec 8, 2008

Hurkyl

Staff Emeritus
Finding B is easy: you just write it down. It's as I said: $B(\hat{e}_k) = b_k$.

I guess maybe that's not immediately obvious? No matter -- it's easy to see if you work through an example. Let V=R² and pick two elements of V to be your B1 and B2. You can solve the equation I wrote to find the four elements of B. (There are two, two-dimensional linear equations, enough to solve for all four unknowns)

The trick is that B points in the wrong direction. What B does is, if you're given the weights, it tells you what the weighted sum of your basis vectors are. But you want the opposite of that....

13. Dec 9, 2008

turin

I will choose a 2-D vector space, but I will represent it with matrices, where vector addition is represented as element-wise addition and scalar multiplication is represented as element-wise scalar multiplication (i.e. the standard way to use matrices as vectors?). Anyway, so I will choose the same basis as I used in one of my previous posts:
$$b_1\rightarrow\left(\begin{array}{cc}1&1\\0&1\end{array}\right) \qquad b_2\rightarrow\left(\begin{array}{cc}1&2\\0&1\end{array}\right)$$
So, I have
$$B\left(\hat{e}_1\right)=\left(\begin{array}{cc}1&1\\0&1\end{array}\right) \qquad B\left(\hat{e}_2\right)=\left(\begin{array}{cc}1&2\\0&1\end{array}\right)$$
Then, if I could find B, I would have
$$\hat{e}_1=B^{-1}\left(\left(\begin{array}{cc}1&1\\0&1\end{array}\right)\right) \qquad \hat{e}_2=B^{-1}\left(\left(\begin{array}{cc}1&2\\0&1\end{array}\right)\right)$$
Then, (by definition?),
$$\hat{e}_1\cdot\hat{e}_2=0$$
So, the inner product acting on the matrices would be
$$g'(b_1,b_2)\equiv{}B^{-1}(b_1)\cdot{}B^{-1}(b_2)=0$$
But I still don't know what B is, and I don't see how the abstract equation gives it to me.

Maybe this just emphasizes your point about doing this in principle vs. finding "an efficient algorithm"? I see that the problem is solved here, in principle, but it doesn't add any benefit to my calculation if I have to decompose the vectors that I feed the inner product into a weighted sum of basis vectors in order to determine the action of B^-1 on them.

Some examples that I think I understand:
3-D, B is the inverse of the gradient, the basis is {x,y,z}

Last edited: Dec 9, 2008
14. Dec 9, 2008

Hurkyl

Staff Emeritus
The only matrix spaces that are two-dimensional are $M_{2,1}$ and $M_{1,2}$. (the set of 2x1 and 1x2 matrices, respectively) The vector space you wrote down, $M_{2,2}$, is four-dimensional.

Incidentally, I think you're confusing yourself by actually writing the components of your matrices in a square array. (This is common, and I do it too now and then) While it technically doesn't matter how you arrange them, writing them as a square array tends to trigger the part of your brain used to dealing with linear transformations and matrix arithmetic, but that's not how you're using them! Instead, try writing the four components as you normally would when working with vectors in a vector space: as a column of four numbers (or as a row of four numbers). This should help trigger the part of your brain that's used to working with vectors in vector spaces.

I sometimes see the matrix spaces written as $$\mathbb{R}^{m \times n}$$ just to emphasize that it's really the same space as $$\mathbb{R}^{mn}$$ -- we're just writing the components in a funny order. (In a square-shape, instead of in a row or column shape)

Last edited: Dec 9, 2008
15. Dec 9, 2008

Hurkyl

Staff Emeritus
The action of B^-1 is the calculation that decomposes the vectors. So once you have B, you can compute B^-1, and you're all set!

16. Dec 9, 2008

Hurkyl

Staff Emeritus
Incidentally, despite the problems I mentioned before, we can still use the above to define a map from R^2 to M2,2. B has 8 components to it, because it's mapping from a two-dimensional space to a four-dimensional space. You wrote down two equations above -- and each of those equations have 4 components. You can solve for them!

Or, to think about it differently, can you tell me what $B(c \hat{e}_1 + d \hat{e}_2)$ is?

Last edited: Dec 9, 2008
17. Dec 9, 2008

turin

I'm using the subspace that is spanned by the two matrices that I gave as a 2-D space, since I want to apply this to square matrices, but I also want to babystep with a 2-D example. I do realize that the space of 2x2 real matrices forms a 4-D vector space, and I don't think I'm having that brain trigger problem (but introspection is certainly not 20 20). As far as I know, there is nothing inconsistent with treating a 2-D subspace of a 4-D space as a vector space. Is there?

Anyway, I think of these 2x2 martix problems more along the lines of
$$M_{2\times{}2}=v^0+\vec{v}\cdot\vec{\sigma}$$
so my two basis matrices would also be like
$$b_1\rightarrow\left(1,\frac{1}{2}\hat{x}+i\frac{1}{2}\hat{y}\right) \qquad b_2\rightarrow\left(1,\hat{x}+i\hat{y}\right)$$
which would span something like the t,x+iy plane, right?
I'm throwing an "i" around, even though I want to talk about real matrices, but as long as I don't throw any other complex phases in there, then this will also act like a real vector space, right? Or, I could have just used $i\sigma_y$.

18. Dec 9, 2008

Hurkyl

Staff Emeritus
I'm tired, so I'm going to say one last thing for the night.

The main point is that whatever space you're working with, there should be some basis into which you can decompose vectors easily, even if it's not the one you're really interested in. For example, if you're working with $M_{2,2}$, the obvious decomposition is to just take the four components of the matrix in some order. (The corresponding basis is the matrices whose entires are all zero, except for a single one)

Now, the easy thing to compute from a basis is this: if you're given the decomposition of a vector, it's easy to reconstruct the vector. This is a linear transformation.

Note that we're working with two decompositions here -- there's the one decomposition that you really want, but is not easy to compute from a given vector. There's the other decomposition that is easy to compute from a given vector, but you don't really want.

But it's always easy to convert from a decomposition into a vector, so the following linear transformation is easy:
{ interesting decompositions } --> { unwanted decompositions}
which can be computed by
{ interesting decompositions } --> vectors --> { unwanted decompositions}

It should be easy to write that as a matrix, and so you can compute its inverse. Then, that lets you compute

vectors --> {unwanted decompositions } --> {interesting decompositions}

19. Dec 9, 2008

turin

Since you say that B should be linear (though I don't understand why this must be, I agree that it is probably a useful property for B to have)
$$B(c \hat{e}_1 + d \hat{e}_2) = \left(\begin{array}{cc}c+d&c+2d\\0&c+d\end{array}\right)$$
I see what you're saying about the 8-component object thing.
However, could I also determine B^-1 as something like
$$B^{-1}\left(M\right) = \sum_i\hat{e}_i\langle{}i|M|i\rangle{}$$
where |i> is an eigenvector or something?

Anyway, thanks a bunch, Hurkyl. That gives me a lot to think about. I will be checking this thread for any other comments that you feel need to be made.