Least Squares Estimation for Two Parameters

Summary
The discussion revolves around finding the least squares estimators for parameters α and β in a quadratic model based on independent Gaussian measurements. The participants clarify the log-likelihood function and its relationship to the chi-squared function, emphasizing that maximizing the likelihood corresponds to minimizing the chi-squared value. Key issues include deriving the correct form of the matrix A and ensuring it is square, which is essential for solving the system of equations. Participants also correct each other's derivatives related to the chi-squared function, leading to the formulation of normal equations for α and β. The conversation concludes with a clearer understanding of the mathematical approach needed to solve for the parameters.
Dixanadu

Homework Statement


Hi guys,

so the problem is as follows:

A set of n independent measurements y_{i}, i=1...n are treated as Gaussian, each with standard deviation \sigma_{i}. Each measurement corresponds to a value of a control variable x_{i}. The expectation value of y is given by
f(x;\alpha,\beta)=\alpha x +\beta x^{2}.

1) Find the log-likelihood function for the parameters \alpha,\beta.

2) Show that the least-squares estimators for \alpha,\beta can be found from the solution of a system of equations as follows:

\begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} \alpha \\ \beta \end{pmatrix} = \begin{pmatrix} e \\ f \end{pmatrix}

and find the quantities a,b,c,d,e and f as functions of x_{i}, y_{i}, \sigma_{i}.

Homework Equations


The least-squares estimators minimise
\chi^{2}(\alpha,\beta)=\sum_{i=1}^{n}\frac{1}{\sigma_{i}^{2}}\left(y_{i}-f(x_{i};\alpha,\beta)\right)^{2}
If the measurements are not independent, then given the covariance matrix V, the least-squares estimators minimise
\chi^{2}(\vec{\theta})=\sum_{i,j=1}^{n}(y_{i}-f(x_{i};\vec{\theta}))(V^{-1})_{ij}(y_{j}-f(x_{j};\vec{\theta}))
where \vec{\theta} is the vector of parameters we wish to estimate.
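For independent measurements the covariance matrix is diagonal, so the second expression reduces to the first:

V=\mathrm{diag}(\sigma_{1}^{2},\dots,\sigma_{n}^{2}),\qquad (V^{-1})_{ij}=\frac{\delta_{ij}}{\sigma_{i}^{2}}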

The Attempt at a Solution


Right so I'm pretty sure I've solved the first part:
1)
[attached image: working for part 1]
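For reference, a sketch of the form the answer should take (assuming independent Gaussian errors and dropping terms constant in \alpha and \beta):

\ln L(\alpha,\beta)=-\frac{1}{2}\sum_{i=1}^{n}\frac{(y_{i}-\alpha x_{i}-\beta x_{i}^{2})^{2}}{\sigma_{i}^{2}}+\mathrm{const}=-\frac{1}{2}\chi^{2}(\alpha,\beta)+\mathrm{const}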

2)
This is where I get stuck. To find the least-squares estimators from the chi-squared function, I have to put it in matrix form, differentiate, set the result equal to zero, and solve the resulting system of equations. So in matrix form, since our measurements are all independent, we have

\chi^{2}(\vec{\theta})=(\vec{y}-A\vec{\theta})^{T}V^{-1}(\vec{y}-A\vec{\theta})

where A_{ij} is given by f(x_{i};\vec{\theta})=\sum_{j=1}^{m}a_{j}(x_{i})\theta_{j}=\sum_{j=1}^{m}A_{ij}\theta_{j}
However, in our case, we already have this quantity because

\sum_{j=1}^{m}a_{j}(x_{i})\theta_{j}=\alpha x_{i} +\beta x_{i}^{2}

And this is my problem - I have no idea how to extract the A_{ij} matrix out of this, and even more confusing: how is it square? If the index i runs from 1...n (the measurements) and j runs over 1, 2 (the number of parameters), then how am I supposed to cast this into the square matrix equation above?

Anyway, I did differentiate the chi-squared expression and set it equal to 0, which gives me

A\vec{\theta}=\vec{y}

Which fits the system of equations provided that A is square...I don't see how this works...please help!
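For what it's worth, A itself need not be square: it is n×2 with A_{ij}=a_{j}(x_{i}), i.e. its columns are x_{i} and x_{i}^{2}. It is the combination A^{T}V^{-1}A in the normal equations A^{T}V^{-1}A\vec{\theta}=A^{T}V^{-1}\vec{y} that is 2×2. A minimal NumPy sketch with made-up data (names and numbers here are illustrative, not from the problem) shows this:

[CODE]
# A sketch of why the system ends up 2x2 even though A is n x 2.
import numpy as np

rng = np.random.default_rng(0)
n = 10
x = np.linspace(1.0, 5.0, n)                 # control variable x_i
sigma = 0.1 + 0.05 * rng.random(n)           # standard deviations sigma_i
y = 2.0 * x + 0.5 * x**2 + sigma * rng.standard_normal(n)

# Design matrix A is n x 2, NOT square: row i holds (x_i, x_i^2).
A = np.column_stack([x, x**2])
W = np.diag(1.0 / sigma**2)                  # V^{-1} for independent data

# Minimising chi^2 gives (A^T V^-1 A) theta = A^T V^-1 y;
# A^T V^-1 A is (2 x n)(n x n)(n x 2) = 2 x 2, square even though A is not.
lhs = A.T @ W @ A
rhs = A.T @ W @ y
alpha_hat, beta_hat = np.linalg.solve(lhs, rhs)
print(alpha_hat, beta_hat)                   # close to 2.0 and 0.5
[/CODE]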
 
Dixanadu said:
[full post quoted above]
I don't know much about this, but I don't think your answer to 1) is correct.

Isn't the likelihood function given by:

$$L(\alpha, \beta; x_1, \dots, x_n) = \prod_{j=1}^n f(x_j; \alpha, \beta) = \prod_{j=1}^n \left(\alpha x_j + \beta x_j^2\right)$$
 
The likelihood you wrote down cannot be correct, because maximising the likelihood should be the same as minimising the chi-squared function, so the log-likelihood and the chi-squared must be the same up to a multiplicative constant, which I have found to be -1/2.
 
So you want to minimize ##\chi^2## or maximize ##- \chi^2##. I agree with your work then except for some subscripts.

The second question indeed amounts to solving the normal equations by taking partials with respect to ##\alpha## and ##\beta##, setting them equal to zero and then solving.
 
Hmm okay thank you. So doing that, I get these two equations:

\frac{\partial\chi^{2}}{\partial \alpha}=2x\sum_{i=1}^{n}\frac{1}{\sigma_{i}^{2}}(\alpha x +\beta x^{2} - y_{i})=0
and
\frac{\partial\chi^{2}}{\partial \beta}=4\beta x\sum_{i=1}^{n}\frac{1}{\sigma_{i}^{2}}(\alpha x +\beta x^{2} - y_{i})=0

How does one solve these? I thought about setting the summand to 0 since the sum is 0 (although I'm not sure you can do that) but then I get two identical equations - what should I do?
 
Dixanadu said:
[quoted the previous post]

Your derivatives look wrong for some reason. They should read:

$$\frac{\partial\chi^{2}}{\partial \alpha}= - 2 \displaystyle \sum_i^n \frac{x_i}{\sigma_i^2}(y_i - \alpha x_i - \beta x_i^2) = 0$$

$$\frac{\partial\chi^{2}}{\partial \beta} = - 2 \displaystyle \sum_i^n \left(\frac{x_i}{\sigma_i}\right)^2(y_i - \alpha x_i - \beta x_i^2) = 0$$

Now distribute the sum to the terms... do a little bit of re-arranging and, presto, a set of normal equations will result. These normal equations can be solved for ##\alpha## and ##\beta##, whose estimators come out as functions of the variables mentioned in the question.
 
Yeah, I did mess up the derivatives, and my subscripts were also wrong - thank you for clarifying :) I rearranged and got this for the matrix:
[attached image: the resulting matrix equation]
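Distributing the sums in the two derivatives above, the system should read:

\begin{pmatrix} \sum_{i}\frac{x_{i}^{2}}{\sigma_{i}^{2}} & \sum_{i}\frac{x_{i}^{3}}{\sigma_{i}^{2}} \\ \sum_{i}\frac{x_{i}^{3}}{\sigma_{i}^{2}} & \sum_{i}\frac{x_{i}^{4}}{\sigma_{i}^{2}} \end{pmatrix} \begin{pmatrix} \alpha \\ \beta \end{pmatrix} = \begin{pmatrix} \sum_{i}\frac{x_{i}y_{i}}{\sigma_{i}^{2}} \\ \sum_{i}\frac{x_{i}^{2}y_{i}}{\sigma_{i}^{2}} \end{pmatrix}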
 
Looks correct at a first glance.
 
Thank you very much :D Now I know who to ask for help with stats! Thanks :)
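For anyone checking this numerically, a short sketch (synthetic, illustrative data) comparing the explicit 2×2 system above against a generic weighted least-squares solve:

[CODE]
# Cross-check the explicit 2x2 normal equations against a generic
# weighted least-squares solve (synthetic data, not from the thread).
import numpy as np

rng = np.random.default_rng(1)
n = 50
x = np.linspace(0.5, 4.0, n)
sigma = np.full(n, 0.2)
y = 1.5 * x - 0.3 * x**2 + sigma * rng.standard_normal(n)

w = 1.0 / sigma**2
# Explicit system from the thread:
M = np.array([[np.sum(w * x**2), np.sum(w * x**3)],
              [np.sum(w * x**3), np.sum(w * x**4)]])
v = np.array([np.sum(w * x * y), np.sum(w * x**2 * y)])
alpha_hat, beta_hat = np.linalg.solve(M, v)

# Generic solve: scale each row by 1/sigma_i, then ordinary lstsq.
A = np.column_stack([x, x**2])
theta, *_ = np.linalg.lstsq(A / sigma[:, None], y / sigma, rcond=None)
print(alpha_hat, beta_hat)  # matches theta to rounding error
print(theta)
[/CODE]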
 
