Rearranging a formula (Multivariate Gaussian function)

Pi-Bond

Homework Statement


See image, p(y|θ) is the Likelihood function which has to be rearranged in the form of equation (3). θ is a vector variable.
[Attached image: 1zobx9u.png]


Homework Equations


None?

The Attempt at a Solution


I first expanded the exponent in the original function, equation (2).

(b-A\theta)^T(b-A\theta)=b^Tb - b^TA\theta - \theta^T A^T b + \theta^T A^T A \theta

Now suppose I can write the function equivalently as

C \exp\left( -\frac{1}{2} (\theta - \theta_0)^T L (\theta - \theta_0) \right)

where C represents the same constant multiplying with exp in equation (2). In this case, the exponential parts must be the same. So:

b^Tb - b^TA\theta - \theta^T A^T b + \theta^T A^T A \theta = \theta^TL\theta - \theta^TL\theta_0 - \theta_0^TL\theta + \theta_0^TL\theta_0

If this expression is to hold for all θ, then the coefficients of the terms ##(\dots)\theta##, ##\theta^T(\dots)##, ##\theta^T(\dots)\theta## and the constants must match.

So ##L = A^TA## and ##\theta_0 = L^{-1}A^Tb##.
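
Spelled out, the matching goes term by term. This is only a sketch, and it assumes ##A^TA## is invertible, which the thread does not state explicitly:

```latex
\begin{aligned}
\theta^T(\cdots)\theta:&\quad \theta^T A^T A\,\theta = \theta^T L\,\theta
  &&\Rightarrow\; L = A^T A \\
\theta^T(\cdots):&\quad \theta^T A^T b = \theta^T L\,\theta_0
  &&\Rightarrow\; L\theta_0 = A^T b \;\Rightarrow\; \theta_0 = L^{-1}A^T b \\
(\cdots)\theta:&\quad b^T A\,\theta = \theta_0^T L\,\theta
  &&\text{(the transpose of the previous line, since } L^T = L\text{)}
\end{aligned}
```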

Now I don't know how to get ##\mathcal{L}_0## (equation (5)). I only have this condition left from my assumption about the constants above:

b^Tb = \theta_0^T L \theta_0

There just don't seem to be enough terms in either exponent to allow the exponential part of ##\mathcal{L}_0##. I don't think I can add and subtract anything either...

Any ideas?
 
Try to show that
(b-A\theta)^T(b-A\theta) = (\theta - \theta_0)^T L (\theta - \theta_0) + K for some matrix L and some constant K, and for ##\theta_0## as given in the question.
 
Ok.

(b-A\theta)^T(b-A\theta)=b^Tb - b^TA\theta - \theta^T A^T b + \theta^T A^T A \theta
= b^Tb - (L L^{-1}A^T b)^T\theta - \theta^T L( L^{-1}A^T b) + \theta^T L \theta
= b^Tb - \theta_0^T L \theta - \theta^T L \theta_0 + \theta^T L \theta
= b^Tb + (\theta-\theta_0)^T L \theta - \theta^T L \theta_0
= b^Tb + (\theta-\theta_0)^T L \theta - (\theta-\theta_0)^T L \theta_0 -\theta_0^T L \theta_0
= b^Tb + (\theta-\theta_0)^T L (\theta-\theta_0)-\theta_0^T L \theta_0

I used the fact that ##L^T = L = A^TA##.

On the basis of this it seems ##K = b^Tb - \theta_0^T L \theta_0##.

I still can't see the origin of ##\mathcal{L}_0## here though...
 
Bump. I haven't been able to find any leads. Anyone have an idea?
 
I'm somewhat confused by the fact that if L = A^TA and \theta_0 = L^{-1}A^Tb then L^{-1} = A^{-1}(A^T)^{-1} so that
A\theta_0 = AL^{-1}A^Tb = A(A^{-1}(A^T)^{-1})A^Tb = b
which means that b - A\theta_0 = b - b = 0. So I'm at a loss to explain why the author has included the exponential in \mathcal{L}_0, when its value appears to be \exp(0) = 1.

On the other hand, I was able to show (using the additional fact that L is symmetric so that (L^T)^{-1} = (L^{-1})^T = L^{-1}) that
(\theta - \theta_0)^TL(\theta - \theta_0) = (A\theta - b)^T(A\theta - b)
= (b - A\theta)^T(b - A\theta)
It's just a case of expanding the left hand side, substituting the definitions of L and \theta_0 and simplifying.
 
pasmith said:
I'm somewhat confused by the fact that if L = A^TA and \theta_0 = L^{-1}A^Tb then L^{-1} = A^{-1}(A^T)^{-1}
Ignore that: A is not square, so cannot be invertible.

The aim is to show that (b - A\theta)^T(b - A\theta) = (b - A\theta_0)^T(b - A\theta_0) + (\theta - \theta_0)^TL(\theta - \theta_0)

Now the left hand side is
b^Tb - b^TA\theta - \theta^TA^Tb + \theta^TA^TA\theta

The right hand side is
b^Tb - b^TA\theta_0 - \theta_0^TA^Tb + \theta_0^TA^TA\theta_0 + \theta^TL\theta - \theta^TL\theta_0 - \theta_0^TL\theta + \theta_0^TL\theta_0
= b^Tb - (b^TA\theta_0 + \theta_0^TA^Tb) + 2\theta_0^TL\theta_0 + \theta^TL\theta - \theta^TL\theta_0 - \theta_0^TL\theta
= b^Tb - 2b^TAL^{-1}A^Tb + 2b^TAL^{-1}A^Tb + \theta^TA^TA\theta - \theta^TA^Tb - b^TA\theta
= b^Tb - b^TA\theta - \theta^TA^Tb + \theta^TA^TA\theta
as required.
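
The constant term can also be checked on its own. A short sketch, using only ##A^Tb = L\theta_0## (equivalent to ##\theta_0 = L^{-1}A^Tb##) and ##L = A^TA##:

```latex
\begin{aligned}
(b - A\theta_0)^T(b - A\theta_0)
 &= b^Tb - b^TA\theta_0 - \theta_0^TA^Tb + \theta_0^TA^TA\theta_0 \\
 &= b^Tb - \theta_0^TL\theta_0 - \theta_0^TL\theta_0 + \theta_0^TL\theta_0 \\
 &= b^Tb - \theta_0^TL\theta_0
\end{aligned}
```

This is the same constant ##K = b^Tb - \theta_0^TL\theta_0## obtained earlier in the thread by completing the square, so the two approaches agree.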
 
Edit: there seems to be a mistake between the second and third lines. You go from ##-(b^TA\theta_0 + \theta_0^TA^Tb)## to ##-2b^T(...) + 2b^T(...)##.

The plus sign should be minus, and I'm not sure of where the factor of 2 comes from. Anyway I will investigate your approach.
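
One way to investigate the questioned step is a small numerical check. The sketch below is not from the thread: the matrix `A`, the vector `b` and all helper names are hypothetical, chosen only so that `A` is non-square (3x2) and ##A^TA## is invertible. It uses exact fractions, so equality tests are not affected by rounding.

```python
from fractions import Fraction as F

def dot(u, v):
    """Inner product u^T v of two vectors given as lists."""
    return sum(x * y for x, y in zip(u, v))

def matvec(M, v):
    """Matrix-vector product M v."""
    return [dot(row, v) for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

# Hypothetical data (an assumption, not taken from the thread):
# A is 3x2, so it is not square and has no inverse.
A = [[F(1), F(0)],
     [F(1), F(1)],
     [F(1), F(2)]]
b = [F(1), F(0), F(2)]

At = transpose(A)
L = [[dot(r, c) for c in At] for r in At]   # L = A^T A (2x2, symmetric)
Atb = matvec(At, b)                          # A^T b

# Solve L theta0 = A^T b by Cramer's rule (valid for this 2x2 case only).
det = L[0][0] * L[1][1] - L[0][1] * L[1][0]
theta0 = [(Atb[0] * L[1][1] - L[0][1] * Atb[1]) / det,
          (L[0][0] * Atb[1] - L[1][0] * Atb[0]) / det]

def sq_norm(v):
    return dot(v, v)

def residual(theta):
    """Return b - A theta."""
    return [bi - ai for bi, ai in zip(b, matvec(A, theta))]

def quad_form(theta):
    """Return (theta - theta0)^T L (theta - theta0)."""
    d = [t - t0 for t, t0 in zip(theta, theta0)]
    return dot(d, matvec(L, d))

# The terms questioned in the thread:
# -(b^T A theta0 + theta0^T A^T b) + 2 theta0^T L theta0
cross = (-(dot(b, matvec(A, theta0)) + dot(theta0, Atb))
         + 2 * dot(theta0, matvec(L, theta0)))

# The constant from completing the square: K = b^T b - theta0^T L theta0
K = sq_norm(b) - dot(theta0, matvec(L, theta0))

def both_sides(theta):
    """Return (lhs, rhs) of the identity being proved."""
    lhs = sq_norm(residual(theta))
    rhs = sq_norm(residual(theta0)) + quad_form(theta)
    return lhs, rhs

print("theta0 =", theta0)
print("cross terms =", cross)
print("K =", K, " |b - A theta0|^2 =", sq_norm(residual(theta0)))
print("identity at theta = (2, -1):", both_sides([F(2), F(-1)]))
```

For this example the cross terms cancel exactly and `K` equals the squared residual at `theta0`, which is consistent with the third line of the derivation above. A single example is not a proof, but it suggests where to look when redoing the algebra by hand.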
 