Least Squares Estimation for two parameters

In summary: the problem involves finding the least-squares estimators for the parameters alpha and beta from a set of n independent Gaussian measurements, each with standard deviation sigma_i. The expectation value of y is given by a function of x, alpha, and beta. The log-likelihood for these parameters is built from the product of n Gaussian terms, and the least-squares estimators follow from minimizing the chi-squared function: setting its partial derivatives with respect to alpha and beta to zero yields normal equations equivalent to the 2x2 matrix equation given in the problem.
  • #1
Dixanadu

Homework Statement


Hi guys,

so the problem is as follows:

A set of n independent measurements [itex]y_{i}, i=1...n[/itex] are treated as Gaussian, each with standard deviation [itex]\sigma_{i}[/itex]. Each measurement corresponds to a value of a control variable [itex]x_{i}[/itex]. The expectation value of [itex]y[/itex] is given by
[itex]f(x;\alpha,\beta)=\alpha x +\beta x^{2}[/itex].

1) Find the log-likelihood function for the parameters [itex]\alpha,\beta[/itex].

2) Show that the least-squares estimators for [itex]\alpha,\beta[/itex] can be found from the solution of a system of equations as follows:

[itex] \begin{pmatrix}
a & b \\
c & d
\end{pmatrix}
\left( \begin{array}{c}
\alpha \\
\beta
\end{array}\right) =
\left( \begin{array}{c}
e \\
f
\end{array} \right)[/itex]

and find the quantities a,b,c,d,e and f as functions of [itex]x_{i}, y_{i}, \sigma_{i}[/itex].

Homework Equations


The least-squares estimators minimize the chi-squared function
[itex]\chi^{2}(\alpha,\beta)=\sum_{i=1}^{n}\frac{1}{\sigma_{i}^{2}}\left(y_{i}-f(x_{i};\alpha,\beta)\right)^{2}[/itex]
If the measurements are not independent, then given the covariance matrix [itex]V[/itex], the least-squares estimators minimize
[itex]\chi^{2}(\vec{\theta})=\sum_{i,j=1}^{n}(y_{i}-f(x_{i};\vec{\theta}))(V^{-1})_{ij}(y_{j}-f(x_{j};\vec{\theta}))[/itex]
where [itex]\vec{\theta}[/itex] is the vector of parameters we wish to estimate. (For independent measurements, [itex]V[/itex] is diagonal with [itex]V_{ii}=\sigma_{i}^{2}[/itex], and this reduces to the expression above.)

The Attempt at a Solution


Right so I'm pretty sure I've solved the first part:
1)
[Attached image: derivation of the log-likelihood function]

2)
This is where I get stuck. To find the least-squares estimators from the chi-squared thing, I have to put it in matrix form, differentiate, set the derivatives equal to 0, and solve the resulting system of equations. So in matrix form, since our measurements are all independent, we have

[itex]\chi^{2}(\alpha,\beta)=(\vec{y}-A\vec{\theta})^{T}V^{-1}(\vec{y}-A\vec{\theta})[/itex]

where [itex]A_{ij}[/itex] is given by [itex]f(x_{i};\vec{\theta})=\sum_{j=1}^{m}a_{j}(x_{i})\theta_{j}=\sum_{j=1}^{m}A_{ij}\theta_{j}[/itex]
However, in our case, we already have this quantity because

[itex]\sum_{j=1}^{m}a_{j}(x_{i})\theta_{j}=\alpha x_{i} +\beta x_{i}^{2}[/itex]

aaaand this is my problem - I have no idea how to extract the [itex]A_{ij}[/itex] matrix out of this. Even more confusing: how is it square? If the i index runs from 1...n (the measurements) and j runs over 1, 2 (the number of parameters), how am I supposed to cast this into the square matrix equation above?

Anyway, I did differentiate the chi-squared thing and set it equal to 0, which gives me

[itex]A\vec{\theta}=\vec{y}[/itex]

Which fits the system of equations provided that A is square...I don't see how this works...please help!
 
  • #2
Dixanadu said:
[quotes post #1 in full]

I don't know much about this, but I don't think your answer to 1) is correct.

Isn't the likelihood function given by:

$$L(\alpha, \beta; x_1, \dots, x_n) = \prod_{j=1}^n f(x_j; \alpha, \beta) = \prod_{j=1}^n \left(\alpha x_j +\beta x^2_j\right)$$
 
  • #3
The likelihood you wrote down cannot be correct: maximising the likelihood should be the same as minimising the chi-squared function, so the log-likelihood must agree with the chi-squared up to a multiplicative constant (plus terms that don't depend on the parameters). I find that constant to be -1/2.
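Explicitly, for independent Gaussian measurements the log-likelihood should come out as

$$\ln L(\alpha,\beta) = -\frac{1}{2}\sum_{i=1}^{n}\frac{\left(y_{i}-\alpha x_{i}-\beta x_{i}^{2}\right)^{2}}{\sigma_{i}^{2}} + \text{const} = -\frac{1}{2}\chi^{2}(\alpha,\beta) + \text{const},$$

so maximising ##\ln L## is the same as minimising ##\chi^{2}##.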
 
  • #4
So you want to minimize ##\chi^2## or maximize ##- \chi^2##. I agree with your work then except for some subscripts.

The second question indeed amounts to the normal equations: take the partials of ##\chi^{2}## with respect to ##\alpha## and ##\beta##, set them equal to zero, and solve the resulting pair of equations.
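As for the square-matrix worry: the design matrix ##A## is ##n \times 2## (its columns are ##x_i## and ##x_i^2##), so ##A## itself is not square. But the normal equations in matrix form are

$$A^{T}V^{-1}A\,\vec{\theta} = A^{T}V^{-1}\vec{y},$$

and ##A^{T}V^{-1}A## is ##2 \times 2## - that square matrix is the one the problem statement is asking about.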
 
  • #5
Hmm okay thank you. So doing that, I get these two equations:

[itex]\frac{\partial\chi^{2}}{\partial \alpha}=2x\sum_{i=1}^{n}\frac{1}{\sigma_{i}^{2}}(\alpha x +\beta x^{2} - y_{i})=0[/itex]
and
[itex]\frac{\partial\chi^{2}}{\partial \beta}=4\beta x\sum_{i=1}^{n}\frac{1}{\sigma_{i}^{2}}(\alpha x +\beta x^{2} - y_{i})=0[/itex]

How does one solve these? I thought about setting the summand to 0 since the sum is 0 (although I'm not sure that's allowed), but then I get two identical equations - what should I do?
 
  • #6
Dixanadu said:
[quotes post #5 in full]

Your derivatives aren't right: the ##x## factors should carry the subscript ##i## and stay inside the sums. They should read:

$$\frac{\partial\chi^{2}}{\partial \alpha}= - 2 \sum_{i=1}^{n} \frac{x_i}{\sigma_i^2}\left(y_i - \alpha x_i - \beta x_i^2\right) = 0$$

$$\frac{\partial\chi^{2}}{\partial \beta} = - 2 \sum_{i=1}^{n} \frac{x_i^2}{\sigma_i^2}\left(y_i - \alpha x_i - \beta x_i^2\right) = 0$$

Now distribute the sums over the terms, re-arrange a little, and presto: a set of normal equations results. These can be solved for ##\alpha## and ##\beta##, with coefficients that are exactly the functions of ##x_i, y_i, \sigma_i## the question asks for. A quick numerical check is sketched below.
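Here is a minimal Python sketch of that check (the data values, true parameters, and variable names are invented for illustration only, not part of the problem):

[code]
import numpy as np

# Synthetic example: n measurements y_i = alpha*x_i + beta*x_i^2 + noise,
# each with its own (made-up) standard deviation sigma_i.
rng = np.random.default_rng(0)
alpha_true, beta_true = 1.5, -0.3
x = np.linspace(1.0, 10.0, 20)
sigma = 0.1 + 0.05 * x                      # heteroscedastic errors
y = alpha_true * x + beta_true * x**2 + rng.normal(0.0, sigma)

# Weighted sums appearing in the normal equations.
w = 1.0 / sigma**2
a = np.sum(w * x**2)   # coefficient of alpha in the first equation
b = np.sum(w * x**3)   # coefficient of beta in the first equation
c = b                  # coefficient of alpha in the second equation
d = np.sum(w * x**4)   # coefficient of beta in the second equation
e = np.sum(w * x * y)
f = np.sum(w * x**2 * y)

# Solve the 2x2 system for the least-squares estimators.
alpha_hat, beta_hat = np.linalg.solve([[a, b], [c, d]], [e, f])
print(alpha_hat, beta_hat)   # should land close to 1.5 and -0.3
[/code]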
 
  • #7
Yeah, I did mess up the derivatives, and my subscripts were also wrong - thank you for clarifying :) I rearranged and got something like this for the matrix:
[Attached image: the resulting 2x2 matrix equation]
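For the record, carrying through the rearrangement above, the system should read

$$\begin{pmatrix} \sum_{i} x_i^2/\sigma_i^2 & \sum_{i} x_i^3/\sigma_i^2 \\ \sum_{i} x_i^3/\sigma_i^2 & \sum_{i} x_i^4/\sigma_i^2 \end{pmatrix} \left( \begin{array}{c} \alpha \\ \beta \end{array}\right) = \left( \begin{array}{c} \sum_{i} x_i y_i/\sigma_i^2 \\ \sum_{i} x_i^2 y_i/\sigma_i^2 \end{array} \right),$$

which identifies the quantities a, b, c, d, e and f.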
 
  • #8
Looks correct at a first glance.
 
  • #9
Thank you very much :D Now I know who to ask for help with stats! Thanks :)
 

What is Least Squares Estimation for two parameters?

Least Squares Estimation is a method used in statistics to find the best fitting line for a set of data points. It is used to estimate the values of two parameters, typically the slope and intercept, that best describe the relationship between two variables.

How does Least Squares Estimation work?

In Least Squares Estimation, the sum of the squared differences between the actual data points and the predicted values from the estimated line is minimized. This is done by finding the values of the two parameters that minimize the sum of the squared differences, also known as the sum of squared residuals.
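For instance, here is a minimal Python sketch (the data values are invented purely for illustration):

[code]
import numpy as np

# Made-up data roughly following y = 2x: fit y = m*x + c by least squares.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# np.polyfit minimizes the sum of squared residuals; degree 1 fits a line
# and returns the coefficients highest power first: slope, then intercept.
m, c = np.polyfit(x, y, 1)

residuals = y - (m * x + c)
print(m, c, np.sum(residuals**2))  # slope ~2, intercept ~0, small residual sum
[/code]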

What is the purpose of using Least Squares Estimation?

The purpose of using Least Squares Estimation is to find the best fitting line for a set of data points. This allows us to make predictions and better understand the relationship between two variables. It is commonly used in regression analysis to model and analyze data.

What are the assumptions of Least Squares Estimation?

There are several assumptions behind Least Squares Estimation: the model is linear in the parameters, the errors are independent and normally distributed, and the variance of the errors is constant (or, as in this problem, known for each measurement).

What are the limitations of Least Squares Estimation?

One limitation of Least Squares Estimation is that it assumes a model that is linear in the parameters, which may not fit the data. It also assumes that the errors are normally distributed and independent, which may not hold in practice. Additionally, it is sensitive to outliers, which can strongly affect the estimated line and parameters.
