Dismiss Notice
Join Physics Forums Today!
The friendliest, high quality science and math community on the planet! Everyone who loves science is here!

Least squares for matrix?

  1. Aug 5, 2011 #1
    Hi, as no unique unique analytical solution exists to my problem (as another poster pointed out), I have taken to solving it through a least squares method. My equation is as follows:

    (s1x1 + s2x2 - I).^2 = min.

    Here s1 and s2 are shift matrices (I know them), and I is a matrix of size nxm, also known. x1 and x2 are two unknown matrices (obviously of size nxm). The .^2 is my way (and MATLAB's) of expressing every element in the matrix is multiplied by itself (similar to least squares). Now I could do this iteratively but the matrices are very large and I'm sure it would require several iterations to converge and thus it becomes very computationally intensive. Thus I've resorted to try to work this out in a least squares method. I understand least squares for a set of values, however I'm not sure how to extend it to matrices (if possible).

    Again, if anyone has any guidance in this it would be greatly appreciated! Thanks very much!
  2. jcsd
  3. Aug 5, 2011 #2
    Could you provide a link where some previous poster had pointed out that your original is not solvable analytically.

    Because, your matrix I is nxm and x1 and x2 are nxm, it means that s1 and s2 ought to be nxn. But the whole result under the square is an nxm matrix, which cannot be squared.
  4. Aug 5, 2011 #3
    Here is the link:


    I do not mean to square the matrix, I used MATLAB's syntax. What I mean is, say
    a = [1 2 3; 4 5 6] then a.^2 = [1 4 9; 16 25 36]

    Still perserving it's size of 3x2. I put the square in to ensure that x1 and x2 do not come back filled with negative infinities.
  5. Aug 5, 2011 #4
    What you have is more unknowns than equations. Your problem does not have a unique solution. Not just analytic, but in any way. Your two matrices x1 and x2 can be stacked together into one bigger matrix that is 2n x m. Then your equations would look like:

    S_{1} & S_{2}
    \end{array}\right]_{n \times 2 n} \cdot \left[\begin{array}{c}
    x_{1} \\
    \end{array}\right]_{2n \times m} = \left[\begin{array}{c}
    \end{array}\right]_{n \times m}
  6. Aug 5, 2011 #5
    Ah, I see. My linear algebra is very rusty and I'm really trying to wrap my head around this and solve my problem. I can think of a physical constraint that the sum of all the values in x1 plus the sum of all the vaules in x2 should be equal to the sum of the values in I, but that's not sufficient for a unique solution.

    Hmm, I'll have to think about this more. I must not be writing down the mathematics of my system properly. The reason why I know that there must be a solution is my problem is physical in nature. Think multi-sources with a single detector, leading to the overlap of data. I have to disentangle the data to process it... hmm!
  7. Aug 5, 2011 #6
    But, if I understood well, you're given a map A:V-->W between normed spaces; then the matrix least squares gives you the _best_ solution x^ for Ax=b, in the sense that ||Ax^-b|| < ||Ax-b|| for all x in V. Specifically, x^ will be the orthogonal projection of b into the subspace A(V) of W . As an example, given A:R^2-->R^2 ( both as vector spaces over R ), with A( 1,0)=(1,0) and A((0,1))=0 (and extend linearly) ,, and you want to find the solution for Av=(0,2); here the vector (0,2) will not be in the range of A (since the range of A is a copy of R in R^2 ), _but_ you can then find the least-squares, which gives you the point in the range which is closest to (1,2) , i.e., least sqaures will give you a vector v^ with ||Av^-(1,2)|| <||Av-(1,2)|| ; specifically here, v^ is the orthogonal projection of (1,2) into the image A(cv)=(c,0).

    The whole thing generalizes to statistical least-squares, tho there the problem is sort of the reverse one: instead of having a subspace given and then finding the points closest to Ax=b, you are given a collection of points , and then you want to find the subspace that is closest to a collection of given points (and then also you need to worry and do other tests to see whether a linear relationship between the variables exists; e.g., by calculating r^2 , or by doing inference on regression with the assumption r=0, etc.)
    Last edited: Aug 5, 2011
  8. Aug 5, 2011 #7
    So if I understand correctly then you think there does exist a matrix x^ which is the best solution. Unfortunately I am still very unclear as to how to construct this matrix beyond iterative (and too bulky) means.
  9. Aug 6, 2011 #8
    Sorry, Klandhee, I did not think it thru carefully-enough. Let me give it some more thought.
  10. Aug 8, 2011 #9
    OK so here is the latest on this problem. I agree with any poster who said there were an infinite number of solutions, whereas the physical limits of the problem actaully permit one.

    Say each matrix x1, x2, and I are nxm matrices. Then since x1 and x2 are both nxm there would be 2*n*m unknown values, with n*m equations. However, the physical limits require the matrix to be, excuse me if I use the wrong terminology, symmetric about x. That is to say that if the matrix is [a1,1 a2,1; a1,2 a2,2] then the we have that a1,1 = a1,2 and a2,1 = a2,2. So this reduces the number of unknowns by 1/2. Thus we have n*m unknowns with n*m equations, making this potentially solvable, I believe. Now the question still remains of the way to do this, which is what I am not sure about but if anyone has any ideas (or sees an error in what I've said here) please let me know! Again thanks for the assistance!
Share this great discussion with others via Reddit, Google+, Twitter, or Facebook