Least Squares Estimation for two parameters

In summary: the problem involves finding the least-squares estimators for the parameters alpha and beta from a set of n independent Gaussian measurements, each with standard deviation sigma_i. The expectation value of y is given by a function of x, alpha, and beta. The log-likelihood for these parameters is built from the product of n Gaussian terms, and the least-squares estimators follow from minimizing the chi-squared function: setting its partial derivatives with respect to alpha and beta to zero yields normal equations equivalent to the 2x2 matrix equation given in the problem.
  • #1
Dixanadu

Homework Statement


Hi guys,

so the problem is as follows:

A set of n independent measurements [itex]y_{i}, i=1...n[/itex] are treated as Gaussian, each with standard deviation [itex]\sigma_{i}[/itex]. Each measurement corresponds to a value of a control variable [itex]x_{i}[/itex]. The expectation value of [itex]y[/itex] is given by
[itex]f(x;\alpha,\beta)=\alpha x +\beta x^{2}[/itex].

1) Find the log-likelihood function for the parameters [itex]\alpha,\beta[/itex].

2) Show that the least-squares estimators for [itex]\alpha,\beta[/itex] can be found from the solution of a system of equations as follows:

[itex] \begin{pmatrix}
a & b \\
c & d
\end{pmatrix}
\left( \begin{array}{c}
\alpha \\
\beta
\end{array}\right) =
\left( \begin{array}{c}
e \\
f
\end{array} \right)[/itex]

and find the quantities a,b,c,d,e and f as functions of [itex]x_{i}, y_{i}, \sigma_{i}[/itex].

Homework Equations


The least-squares estimators minimize the chi-squared function
[itex]\chi^{2}(\alpha,\beta)=\sum_{i=1}^{n}\frac{1}{\sigma_{i}^{2}}\left(y_{i}-f(x_{i};\alpha,\beta)\right)^{2}[/itex]
If the measurements are not independent, then given the covariance matrix [itex]V[/itex], the least-squares estimators minimize
[itex]\chi^{2}(\vec{\theta})=\sum_{i,j=1}^{n}(y_{i}-f(x_{i};\vec{\theta}))(V^{-1})_{ij}(y_{j}-f(x_{j};\vec{\theta}))[/itex]
where [itex]\vec{\theta}[/itex] is the vector of parameters we wish to estimate. (For independent measurements, [itex]V[/itex] is diagonal with [itex]V_{ii}=\sigma_{i}^{2}[/itex], and this reduces to the expression above.)

The Attempt at a Solution


Right so I'm pretty sure I've solved the first part:
1)
[Attached image: derivation of the log-likelihood function]

2)
This is where I get stuck. To find the least-squares estimators from the chi-squared thing, I have to put it in matrix form, differentiate, set the derivatives equal to 0, and solve the resulting system of equations. So in matrix form, since our measurements are all independent, we have

[itex]\chi^{2}(\alpha,\beta)=(\vec{y}-A\vec{\theta})^{T}V^{-1}(\vec{y}-A\vec{\theta})[/itex]

where [itex]A_{ij}[/itex] is given by [itex]f(x_{i};\vec{\theta})=\sum_{j=1}^{m}a_{j}(x_{i})\theta_{j}=\sum_{j=1}^{m}A_{ij}\theta_{j}[/itex]
However, in our case, we already have this quantity because

[itex]\sum_{j=1}^{m}a_{j}(x_{i})\theta_{j}=\alpha x_{i} +\beta x_{i}^{2}[/itex]

aaaand this is my problem - I have no idea how to extract the [itex]A_{ij}[/itex] matrix out of this. Even more confusing: how is it square? If the i index runs from 1...n (the measurements) and j runs over 1, 2 (the number of parameters), how am I supposed to cast this into the square matrix equation above?

Anyway, I did differentiate the chi-squared thing and set it equal to 0, which gives me

[itex]A\vec{\theta}=\vec{y}[/itex]

Which fits the system of equations provided that A is square...I don't see how this works...please help!
 
  • #2
Dixanadu said:
[quotes post #1 in full]

I don't know much about this, but I don't think your answer to 1) is correct.

Isn't the likelihood function given by:

$$L(\alpha, \beta; x_1, \dots, x_n) = \prod_{j=1}^n f(x_j; \alpha, \beta) = \prod_{j=1}^n \left(\alpha x_j +\beta x^2_j\right)$$
 
  • #3
The likelihood you wrote down cannot be correct: maximising the likelihood should be the same as minimising the chi-squared function, so the log-likelihood must agree with the chi-squared up to a multiplicative constant (plus terms that don't depend on the parameters). I find that constant to be -1/2.
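Explicitly, for independent Gaussian measurements the log-likelihood should come out as

$$\ln L(\alpha,\beta) = -\frac{1}{2}\sum_{i=1}^{n}\frac{\left(y_{i}-\alpha x_{i}-\beta x_{i}^{2}\right)^{2}}{\sigma_{i}^{2}} + \text{const} = -\frac{1}{2}\chi^{2}(\alpha,\beta) + \text{const},$$

so maximising ##\ln L## is the same as minimising ##\chi^{2}##.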
 
  • #4
So you want to minimize ##\chi^2## or maximize ##- \chi^2##. I agree with your work then except for some subscripts.

The second question indeed amounts to the normal equations: take the partials of ##\chi^{2}## with respect to ##\alpha## and ##\beta##, set them equal to zero, and solve the resulting pair of equations.
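As for the square-matrix worry: the design matrix ##A## is ##n \times 2## (its columns are ##x_i## and ##x_i^2##), so ##A## itself is not square. But the normal equations in matrix form are

$$A^{T}V^{-1}A\,\vec{\theta} = A^{T}V^{-1}\vec{y},$$

and ##A^{T}V^{-1}A## is ##2 \times 2## - that square matrix is the one the problem statement is asking about.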
 
  • #5
Hmm okay thank you. So doing that, I get these two equations:

[itex]\frac{\partial\chi^{2}}{\partial \alpha}=2x\sum_{i=1}^{n}\frac{1}{\sigma_{i}^{2}}(\alpha x +\beta x^{2} - y_{i})=0[/itex]
and
[itex]\frac{\partial\chi^{2}}{\partial \beta}=4\beta x\sum_{i=1}^{n}\frac{1}{\sigma_{i}^{2}}(\alpha x +\beta x^{2} - y_{i})=0[/itex]

How does one solve these? I thought about setting the summand to 0 since the sum is 0 (although I'm not sure that's allowed), but then I get two identical equations - what should I do?
 
  • #6
Dixanadu said:
[quotes post #5 in full]

Your derivatives aren't right: the ##x## factors should carry the subscript ##i## and stay inside the sums. They should read:

$$\frac{\partial\chi^{2}}{\partial \alpha}= - 2 \sum_{i=1}^{n} \frac{x_i}{\sigma_i^2}\left(y_i - \alpha x_i - \beta x_i^2\right) = 0$$

$$\frac{\partial\chi^{2}}{\partial \beta} = - 2 \sum_{i=1}^{n} \frac{x_i^2}{\sigma_i^2}\left(y_i - \alpha x_i - \beta x_i^2\right) = 0$$

Now distribute the sums over the terms, re-arrange a little, and presto: a set of normal equations results. These can be solved for ##\alpha## and ##\beta##, with coefficients that are exactly the functions of ##x_i, y_i, \sigma_i## the question asks for. A quick numerical check is sketched below.
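Here is a minimal Python sketch of that check (the data values, true parameters, and variable names are invented for illustration only, not part of the problem):

[code]
import numpy as np

# Synthetic example: n measurements y_i = alpha*x_i + beta*x_i^2 + noise,
# each with its own (made-up) standard deviation sigma_i.
rng = np.random.default_rng(0)
alpha_true, beta_true = 1.5, -0.3
x = np.linspace(1.0, 10.0, 20)
sigma = 0.1 + 0.05 * x                      # heteroscedastic errors
y = alpha_true * x + beta_true * x**2 + rng.normal(0.0, sigma)

# Weighted sums appearing in the normal equations.
w = 1.0 / sigma**2
a = np.sum(w * x**2)   # coefficient of alpha in the first equation
b = np.sum(w * x**3)   # coefficient of beta in the first equation
c = b                  # coefficient of alpha in the second equation
d = np.sum(w * x**4)   # coefficient of beta in the second equation
e = np.sum(w * x * y)
f = np.sum(w * x**2 * y)

# Solve the 2x2 system for the least-squares estimators.
alpha_hat, beta_hat = np.linalg.solve([[a, b], [c, d]], [e, f])
print(alpha_hat, beta_hat)   # should land close to 1.5 and -0.3
[/code]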
 
  • #7
Yeah, I did mess up the derivatives, and my subscripts were also wrong - thank you for clarifying :) I rearranged and got something like this for the matrix:
[Attached image: the resulting 2x2 matrix equation]
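For the record, carrying through the rearrangement above, the system should read

$$\begin{pmatrix} \sum_{i} x_i^2/\sigma_i^2 & \sum_{i} x_i^3/\sigma_i^2 \\ \sum_{i} x_i^3/\sigma_i^2 & \sum_{i} x_i^4/\sigma_i^2 \end{pmatrix} \left( \begin{array}{c} \alpha \\ \beta \end{array}\right) = \left( \begin{array}{c} \sum_{i} x_i y_i/\sigma_i^2 \\ \sum_{i} x_i^2 y_i/\sigma_i^2 \end{array} \right),$$

which identifies the quantities a, b, c, d, e and f.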
 
  • #8
Looks correct at a first glance.
 
  • #9
Thank you very much :D Now I know who to ask for help with stats! Thanks :)
 

What is Least Squares Estimation for two parameters?

Least Squares Estimation is a method used in statistics to find the best fitting line for a set of data points. It is used to estimate the values of two parameters, typically the slope and intercept, that best describe the relationship between two variables.

How does Least Squares Estimation work?

In Least Squares Estimation, the sum of the squared differences between the actual data points and the predicted values from the estimated line is minimized. This is done by finding the values of the two parameters that minimize the sum of the squared differences, also known as the sum of squared residuals.
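For instance, here is a minimal Python sketch (the data values are invented purely for illustration):

[code]
import numpy as np

# Made-up data roughly following y = 2x: fit y = m*x + c by least squares.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# np.polyfit minimizes the sum of squared residuals; degree 1 fits a line
# and returns the coefficients highest power first: slope, then intercept.
m, c = np.polyfit(x, y, 1)

residuals = y - (m * x + c)
print(m, c, np.sum(residuals**2))  # slope ~2, intercept ~0, small residual sum
[/code]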

What is the purpose of using Least Squares Estimation?

The purpose of using Least Squares Estimation is to find the best fitting line for a set of data points. This allows us to make predictions and better understand the relationship between two variables. It is commonly used in regression analysis to model and analyze data.

What are the assumptions of Least Squares Estimation?

There are several assumptions behind Least Squares Estimation: the model is linear in the parameters, the errors are independent and normally distributed, and the variance of the errors is constant (or, as in this problem, known for each measurement).

What are the limitations of Least Squares Estimation?

One limitation of Least Squares Estimation is that it assumes a model that is linear in the parameters, which may not fit the data. It also assumes that the errors are normally distributed and independent, which may not hold in practice. Additionally, it is sensitive to outliers, which can strongly affect the estimated line and parameters.
