Optimizing Bi-Linear Objective Function for Vector Fitting in Flat Space

sunjin09 · Mar 28, 2012

I have two matrices, A and D, with the same number of rows and different numbers of columns (A has many more columns than D). I want to find x and y such that ||Ax-Dy||_2 is minimized, i.e., I want to find the closest vectors in span{A} and span{D}. It seems like a simple problem, but I couldn't figure it out. Any suggestions? (The columns of A and D are jointly linearly independent, so span{A} and span{D} intersect only in the zero vector.)

Stephen Tashi · Mar 28, 2012

First, let's try to state the problem clearly. Your statement about finding 'x' and 'y' isn't clear because it isn't clear whether "Ax" is supposed to represent a column vector or whether it represents the matrix "A" times a vector "x".

We could try it this way first:

I have two sets of n-dimensional vectors, A and D. Set A has greater cardinality than set D. The span of set A and the span of set D are vector spaces whose only intersection is the zero vector. How do I find vectors x and y such that x is in the span of A, y is in the span of D, and the distance between x and y (i.e. [itex] || x- y||_2 [/itex]) is minimal?

The answer, of course, is to set both x and y equal to the zero vector. Assuming that's not what you want to do, how do we modify the statement of the problem to say what you want?

sunjin09 · Mar 29, 2012

Thank you for correcting the problem statement. Following your statement, what I want to minimize is the angle between x and y, i.e., maximize [itex]\frac{<x,y>}{||x||_2||y||_2}[/itex], and I don't want the trivial 0 solution. Where do I go from here?

Stephen Tashi · Mar 30, 2012

sunjin09 said:

maximize [itex]\frac{<x,y>}{||x||_2||y||_2}[/itex],

I don't know an easy way to do this. You may as well consider only unit vectors, so the problem becomes to maximize [itex] <\hat{x},\hat{y}> [/itex]. As far as I can see this problem falls under the heading of a "bilinear optimization problem" or, more generally, a "multilinear optimization problem".

My intuition is that if you have two vector subspaces that only intersect at the zero vector, then you should be able to find a set of vectors [itex] \{e_1,e_2,\ldots,e_n, f_1,f_2,\ldots,f_m\} [/itex] such that this set is a (non-orthogonal) basis for the parent n+m dimensional space, the [itex]e_i [/itex] are an orthonormal basis for the first subspace and the [itex] f_j [/itex] are an orthonormal basis for the second subspace.

If that intuition is correct, then let [itex]\hat{x} = \sum_{i=1}^n \alpha_i e_i [/itex] and [itex] \hat{y} = \sum_{j=1}^m \beta_j f_j [/itex]. Let [itex] c_{i,j} = <e_i, f_j> [/itex].

The problem is to maximize the function [itex] \sum_{i=1}^n \sum_{j=1}^m c_{i,j} \alpha_i \beta_j [/itex] subject to the constraints [itex] \sum_{i=1}^n \alpha_i^2 = 1 [/itex] and [itex] \sum_{j=1}^m \beta_j^2 = 1 [/itex].

I wonder if there is a simpler formulation.
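
One way to explore the constrained problem above numerically is alternating maximization, which is not something proposed in this thread but follows from the bilinear structure: for fixed [itex]\beta[/itex] the objective is linear in [itex]\alpha[/itex], so by Cauchy-Schwarz the best unit [itex]\alpha[/itex] is [itex]C\beta[/itex] normalized, and vice versa. The sketch below assumes the coefficients [itex]c_{i,j}[/itex] are collected in a NumPy array `C`; the function name `alternating_maximize` and the example values are made up for illustration, and for a generic starting point the iteration typically (not provably in every case) reaches a maximizer.

```python
import numpy as np

def alternating_maximize(C, iters=200, seed=0):
    """Maximize alpha' C beta over unit vectors alpha, beta by alternating updates.

    Hypothetical helper for illustration; assumes C is nonzero.
    """
    rng = np.random.default_rng(seed)
    beta = rng.standard_normal(C.shape[1])
    beta /= np.linalg.norm(beta)
    for _ in range(iters):
        # For fixed beta, the best unit alpha points along C @ beta.
        alpha = C @ beta
        alpha /= np.linalg.norm(alpha)
        # For fixed alpha, the best unit beta points along C.T @ alpha.
        beta = C.T @ alpha
        beta /= np.linalg.norm(beta)
    return alpha, beta, alpha @ C @ beta

# Made-up coefficients c_{i,j} with n = 3 and m = 2.
C = np.array([[0.6, 0.0], [0.0, 0.2], [0.0, 0.0]])
alpha, beta, value = alternating_maximize(C)
print(value)
```

Each half-step can only increase the objective, which is why the scheme is a natural first thing to try on a bilinear problem with unit-norm constraints.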

sunjin09 · Mar 30, 2012

It seems that
[tex]
<x,y>=(Aa)'(Db)=a'A'Db=a'U'SVb=(Ua)'S(Vb)
[/tex]
where [itex] A'D=U'SV[/itex] is the SVD. Since both U and V are orthogonal, the minimum angle occurs at the largest singular value in S. Does that sound right?
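
A minimal NumPy sketch of this idea, under the assumption that A and D have full column rank: the columns are first orthonormalized with a QR factorization so that unit coefficient vectors give unit x and y, and the largest singular value of the resulting cross-product matrix is then the cosine of the smallest angle. The function name `smallest_angle` is hypothetical. Note that NumPy factors a matrix as `U @ diag(s) @ Vt`, so its `U` plays the role of U' in the notation above.

```python
import numpy as np

def smallest_angle(A, D):
    """Return (cos_theta, x, y): the closest unit vectors x in span(A), y in span(D).

    Hypothetical helper; assumes A and D have full column rank.
    """
    # Orthonormal bases for the two column spaces.
    Qa, _ = np.linalg.qr(A)
    Qd, _ = np.linalg.qr(D)
    # Singular values are returned in descending order.
    U, s, Vt = np.linalg.svd(Qa.T @ Qd)
    x = Qa @ U[:, 0]   # unit vector in span(A)
    y = Qd @ Vt[0, :]  # unit vector in span(D)
    return s[0], x, y

A = np.array([[1.0], [1.0], [1.0]])
D = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
cos_theta, x, y = smallest_angle(A, D)
print(cos_theta)  # approximately 0.8165, i.e. sqrt(2/3)
```

The QR step matters when the columns of A or D are not already orthonormal; without it, a'A'Db is not the cosine of an angle.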

chiro · Mar 30, 2012

Are you familiar with this area?

http://en.wikipedia.org/wiki/Linear_programming

sunjin09 · Mar 30, 2012

chiro said:

Are you familiar with this area?

http://en.wikipedia.org/wiki/Linear_programming

Yes but never needed to implement the details. But my problem does not seem to formulate as an LP problem, does it? It's quadratic.

Number Nine · Mar 30, 2012

sunjin09 said:

Yes but never needed to implement the details. But my problem does not seem to formulate as an LP problem, does it? It's quadratic.

How is it quadratic?

sunjin09 · Mar 30, 2012

Number Nine said:

How is it quadratic?

The objective function is the dot product of two unknown vectors x and y.

Stephen Tashi · Mar 31, 2012

sunjin09 said:

It seems that
[tex]
<x,y>=(Aa)'(Db)=a'A'Db=a'U'SVb=(Ua)'S(Vb)
[/tex]
where [itex] A'D=U'SV[/itex] is the SVD. Since both U and V are orthogonal, the minimum angle occurs at the largest singular value in S. Does that sound right?

What do you mean by "at" the largest singular value? Do you mean we set all the entries of vector [itex] a [/itex] equal to zero except for one of them?

If [itex] A = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} [/itex], [itex] D = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{pmatrix} [/itex] then [itex] A'D = \begin{pmatrix} 1 & 1 \end{pmatrix} [/itex].

[itex] A'D [/itex] is equal to the same thing if [itex] A = \begin{pmatrix} 1 \\ 1 \\ 2 \end{pmatrix} [/itex] and [itex] D = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{pmatrix} [/itex].

sunjin09 · Mar 31, 2012

Stephen Tashi said:

What do you mean by "at" the largest singular value? Do you mean we set all the entries of vector [itex] a [/itex] equal to zero except for one of them?

If [itex] A = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} [/itex], [itex] D = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{pmatrix} [/itex] then [itex] A'D = \begin{pmatrix} 1 & 1 \end{pmatrix} [/itex].

[itex] A'D [/itex] is equal to the same thing if [itex] A = \begin{pmatrix} 1 \\ 1 \\ 2 \end{pmatrix} [/itex] and [itex] D = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{pmatrix} [/itex].

Assuming A and D both have orthonormal columns, unit coefficient vectors a and b give unit vectors x and y. Maximizing <x,y> then corresponds to minimizing the angle between x and y. In order to maximize <x,y>=(Ua)'*S*(Vb), given that ||Ua||=1 and ||Vb||=1, I want to choose Ua and Vb to be 1 at the largest singular values and 0 elsewhere, not that a and b are 1 at one place and 0 everywhere else. Is the logic correct?

Stephen Tashi · Mar 31, 2012

sunjin09 said:

I want to choose Ua and Vb to be 1 at the largest singular values and 0 elsewhere

Are you saying that vector 'a' will be chosen so that the vector Ua will be 1 at the jth component iff the largest singular value occurs in S at location S[j][j] and the vector Ua will be zero elsewhere?

In these two examples, do we have the same matrix for A'D but different answers for the minimum angle? (My 4-D intuition isn't good, so I'm not sure.)

Example 1: [itex] A = \begin{pmatrix} \frac{1}{\sqrt{3}} \\ \frac{1}{\sqrt{3}}\\ \frac{2}{\sqrt{15}}\\ \frac{1}{\sqrt{15}} \end{pmatrix} ,\ D = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \\ 0 & 0 \end{pmatrix} [/itex]

Example 2: [itex] A = \begin{pmatrix} \frac{1}{\sqrt{3}} \\ \frac{1}{\sqrt{3}}\\ \frac{1}{\sqrt{6}}\\ \frac{1}{\sqrt{6}} \end{pmatrix} ,\ D = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \\ 0 & 0 \end{pmatrix} [/itex]

sunjin09 · Apr 1, 2012

Stephen Tashi said:

Are you saying that vector 'a' will be chosen so that the vector Ua will be 1 at the jth component iff the largest singular value occurs in S at location S[j][j] and the vector Ua will be zero elsewhere?

In these two examples, do we have the same matrix for A'D but different answers for the minimum angle? (My 4-D intuition isn't good, so I'm not sure.)

Example 1: [itex] A = \begin{pmatrix} \frac{1}{\sqrt{3}} \\ \frac{1}{\sqrt{3}}\\ \frac{2}{\sqrt{15}}\\ \frac{1}{\sqrt{15}} \end{pmatrix} ,\ D = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \\ 0 & 0 \end{pmatrix} [/itex]

Example 2: [itex] A = \begin{pmatrix} \frac{1}{\sqrt{3}} \\ \frac{1}{\sqrt{3}}\\ \frac{1}{\sqrt{6}}\\ \frac{1}{\sqrt{6}} \end{pmatrix} ,\ D = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \\ 0 & 0 \end{pmatrix} [/itex]

As I worked this example, since A'*D is the same in both cases, so are a and b: a=1 and b=[1/sqrt(2),1/sqrt(2)]. The vectors differ, x1=A1≠x2=A2, but the angle is the same, since <x,y>=(Ua)'*S*(Vb) is completely determined by A'*D. Seems logical.
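
The claim can be checked numerically. The sketch below uses the two example matrices from the quoted post and NumPy's SVD (which factors as `U @ diag(s) @ Vt`, so signs of a and b may differ from the values quoted here while the product stays the same); variable names such as `results` are illustrative only.

```python
import numpy as np

D = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0], [0.0, 0.0]])
A1 = np.array([[1 / np.sqrt(3)], [1 / np.sqrt(3)], [2 / np.sqrt(15)], [1 / np.sqrt(15)]])
A2 = np.array([[1 / np.sqrt(3)], [1 / np.sqrt(3)], [1 / np.sqrt(6)], [1 / np.sqrt(6)]])

results = []
for A in (A1, A2):
    M = A.T @ D                      # the 1 x 2 matrix A'D
    U, s, Vt = np.linalg.svd(M)
    a, b = U[:, 0], Vt[0, :]         # coefficient vectors for the top singular value
    x, y = A @ a, D @ b              # the corresponding unit vectors
    results.append((M, s[0], x @ y))

for M, sigma, inner in results:
    # Both examples give the same A'D, the same singular value, and the same <x, y>.
    print(M, sigma, inner, np.degrees(np.arccos(inner)))
```

Both passes report a cosine of sqrt(2/3), about 0.8165, which corresponds to an angle of roughly 35.26 degrees, even though the two x vectors differ.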

Stephen Tashi · Apr 3, 2012

sunjin09 said:

As I worked this example, since A'*D is the same in both cases, so are a and b: a=1 and b=[1/sqrt(2),1/sqrt(2)]. The vectors differ, x1=A1≠x2=A2, but the angle is the same, since <x,y>=(Ua)'*S*(Vb) is completely determined by A'*D. Seems logical.

I see what you mean. [itex] A'D = \begin{pmatrix} \frac{1}{\sqrt{3}} & \frac{1}{\sqrt{3}} \end{pmatrix} [/itex]

[itex] A'D = \begin{pmatrix} 1 \end{pmatrix} \begin{pmatrix}\frac{\sqrt{2}}{\sqrt{3}} & 0 \end{pmatrix} \begin{pmatrix}\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}}\\ \frac{1}{\sqrt{2}} & \frac{-1}{\sqrt{2}} \end{pmatrix} [/itex]

You have shown that <x,y> is determined by A'D. This isn't a result I have seen before.

You haven't explained why the maximum possible <x,y> is equal to the largest singular value or why vectors a and b must exist that produce this value. (Are we maximizing <x,y> or maximizing the absolute value of <x,y>?)
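
One possible way to fill this gap, sketched under the assumptions already used in the thread (A and D have orthonormal columns, a and b are unit vectors, and [itex] A'D = U'SV [/itex] with U and V orthogonal): write [itex] p = Ua [/itex] and [itex] q = Vb [/itex], which are then unit vectors, and let [itex] \sigma_1 \ge \sigma_2 \ge \dots \ge 0 [/itex] be the diagonal entries of S. The symbols p, q and [itex]\sigma_i[/itex] are introduced here for illustration.

[tex]
\langle x, y \rangle = p' S q = \sum_i \sigma_i p_i q_i \le \sigma_1 \sum_i |p_i|\,|q_i| \le \sigma_1 \|p\|_2 \|q\|_2 = \sigma_1
[/tex]

The last inequality is Cauchy-Schwarz. Equality holds for [itex] p = q = e_1 [/itex], and since U and V are invertible the choices [itex] a = U'e_1 [/itex] and [itex] b = V'e_1 [/itex] always exist and produce this value. Because replacing a by -a flips the sign of <x,y> and the singular values are nonnegative, maximizing <x,y> and maximizing its absolute value give the same answer, [itex] \sigma_1 [/itex].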

chiro · Apr 3, 2012

sunjin09 said:

The objective function is the dot product of two unknown vectors x and y.

All valid dot, inner, or scalar products (in this context) are bilinear. If you are trying to maximize only an inner product, especially in a flat space, you will always get a bilinear problem. I'm assuming it's flat because you mentioned the dot product, which is usually associated with Cartesian space; if it's not, then please post your inner product definition.

Since you want to minimize ||Ax-Dy||_2 then just minimize <Ax-Dy,Ax-Dy> which is

<Ax-Dy,Ax-Dy> = <Ax,Ax> - 2<Ax,Dy> + <Dy,Dy>

Now if x and y are vectors, Ax is linear in the components of x and Dy is linear in the components of y, so the cross term <Ax,Dy> is bilinear in (x, y) and the whole thing is a multilinear (here quadratic) expression.

Also, minimizing the square of the norm is equivalent to minimizing the norm itself, because the norm is always greater than or equal to zero and squaring is monotonically increasing on nonnegative numbers.
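
The expansion above, and its link to the angle formulation earlier in the thread, can be sanity-checked numerically. The sketch below uses arbitrary random matrices (an assumption made purely for illustration) and also shows that once Ax and Dy are scaled to unit length, the squared distance equals 2 - 2<Ax,Dy>, so minimizing the distance and maximizing the inner product coincide. Without such a normalization the minimum is simply zero at x = y = 0, as noted earlier.

```python
import numpy as np

rng = np.random.default_rng(0)        # arbitrary illustrative data
A = rng.standard_normal((6, 3))
D = rng.standard_normal((6, 2))
x = rng.standard_normal(3)
y = rng.standard_normal(2)

u, v = A @ x, D @ y

# <Ax-Dy, Ax-Dy> = <Ax,Ax> - 2<Ax,Dy> + <Dy,Dy>
lhs = (u - v) @ (u - v)
rhs = u @ u - 2 * (u @ v) + v @ v

# After rescaling both vectors to unit length, the squared distance is 2 - 2<u, v>.
u_hat = u / np.linalg.norm(u)
v_hat = v / np.linalg.norm(v)
unit_dist_sq = (u_hat - v_hat) @ (u_hat - v_hat)
unit_formula = 2.0 - 2.0 * (u_hat @ v_hat)

print(np.isclose(lhs, rhs), np.isclose(unit_dist_sq, unit_formula))
```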
