Least Squares Estimation for two parameters

Discussion Overview

The discussion revolves around the least squares estimation for two parameters, α and β, in the context of a set of independent Gaussian measurements. Participants are tasked with finding the log-likelihood function and demonstrating how the least squares estimators can be derived from a system of equations. The scope includes theoretical derivation and mathematical reasoning related to statistical estimation techniques.

Discussion Character

  • Homework-related
  • Mathematical reasoning
  • Debate/contested

Main Points Raised

  • One participant presents the problem statement and attempts to derive the log-likelihood function and the least squares estimators from the chi-squared function.
  • Another participant questions the correctness of the likelihood function presented, suggesting it should be a product of the expectation values.
  • A different participant asserts that maximizing the likelihood should align with minimizing the chi-squared function, indicating a relationship between the two.
  • Participants discuss the differentiation of the chi-squared function with respect to α and β, leading to a set of equations that need to be solved.
  • One participant expresses confusion over the resulting equations and the process of solving them, while another corrects their derivatives and provides guidance on how to rearrange the equations to derive normal equations.
  • There is acknowledgment of errors in derivatives and subscripts, with participants collaboratively refining their understanding of the mathematical formulation.

Areas of Agreement / Disagreement

Participants do not reach a consensus on the correct formulation of the likelihood function, and there are differing views on the differentiation process and the resulting equations. The discussion remains unresolved regarding the exact steps to derive the least squares estimators.

Contextual Notes

Some participants note potential errors in the differentiation process and the formulation of the equations, indicating that the discussion is still exploring the correct approach to solving the problem.

Dixanadu

Homework Statement


Hi guys,

so the problem is as follows:

A set of n independent measurements y_{i}, i=1...n are treated as Gaussian, each with standard deviation \sigma_{i}. Each measurement corresponds to a value of a control variable x_{i}. The expectation value of y is given by
f(x;\alpha,\beta)=\alpha x +\beta x^{2}.

1) Find the log-likelihood function for the parameters \alpha,\beta.

2) Show that the least-squares estimators for \alpha,\beta can be found from the solution of a system of equations as follows:

\begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} \alpha \\ \beta \end{pmatrix} = \begin{pmatrix} e \\ f \end{pmatrix}

and find the quantities a,b,c,d,e and f as functions of x_{i}, y_{i}, \sigma_{i}.

Homework Equations


the least-squares estimators minimise
\chi^{2}(\alpha,\beta)=\sum_{i=1}^{n}\frac{1}{\sigma_{i}^{2}}\left(y_{i}-f(x_{i};\alpha,\beta)\right)^{2}
if the measurements are not independent, then given the covariance matrix V, the least-squares estimators minimise
\chi^{2}(\vec{\theta})=\sum_{i,j=1}^{n}(y_{i}-f(x_{i};\vec{\theta}))(V^{-1})_{ij}(y_{j}-f(x_{j};\vec{\theta}))
where the \vec{\theta} is the vector of parameters we wish to estimate.
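For concreteness, the independent-measurement \chi^{2} above can be written out in code. This is just a minimal sketch, assuming the quadratic model f(x;\alpha,\beta)=\alpha x+\beta x^{2} from the problem statement; the function and variable names are my own:

```python
import numpy as np

def chi_squared(alpha, beta, x, y, sigma):
    """Chi-squared for independent Gaussian measurements y_i with
    standard deviations sigma_i and model f(x) = alpha*x + beta*x**2."""
    f = alpha * x + beta * x**2
    return np.sum(((y - f) / sigma) ** 2)

# Toy data generated exactly from alpha = 2, beta = 0.5 (no noise),
# so the true parameters give chi-squared = 0.
x = np.array([1.0, 2.0, 3.0])
y = 2.0 * x + 0.5 * x**2
sigma = np.ones_like(x)
print(chi_squared(2.0, 0.5, x, y, sigma))  # 0.0
```

Any other (\alpha, \beta) gives a strictly larger value on this toy data, which is the sense in which the least-squares estimators minimise \chi^{2}.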

The Attempt at a Solution


Right so I'm pretty sure I've solved the first part:
1)
[attached image: working for the log-likelihood in part 1]

2)
This is where I get stuck. To find the least squares estimators from the chi-squared function, I have to put it in matrix form, differentiate, set the derivatives equal to 0 and solve the resulting system of equations. So in matrix form, since our measurements are all independent, we have

\chi^{2}(\alpha,\beta)=(\vec{y}-A\vec{\theta})^{T}V^{-1}(\vec{y}-A\vec{\theta})

where A_{ij} is given by f(x_{i};\vec{\theta})=\sum_{j=1}^{m}a_{j}(x_{i})\theta_{j}=\sum_{j=1}^{m}A_{ij}\theta_{j}
However, in our case, we already have this quantity because

\sum_{j=1}^{m}a_{j}(x_{i})\theta_{j}=\alpha x_{i} +\beta x_{i}^{2}

aaaand this is my problem - I have no idea how to extract the A_{ij} matrix out of this, and even more confusing is: how is it square? if the i index runs from 1...n (the measurements) and j runs from 1,2 (the number of parameters) then how am I supposed to cast this into the square matrix equation above?

Anyway, I did differentiate the chi-squared thing and set it equal to 0, which gives me

A\vec{\theta}=\vec{y}

Which fits the system of equations provided that A is square...I don't see how this works...please help!
 
Dixanadu said:
[original post quoted in full above]

I don't know much about this, but I don't think your answer to 1) is correct.

Isn't the likelihood function given by:

$$L(\alpha, \beta; x_1, ..., x_n) = \prod_{j=1}^n f(x_j; \alpha, \beta) = \prod_{j=1}^n \left(\alpha x_j +\beta x^2_j\right)$$
 
The likelihood you wrote down cannot be correct, because maximising the likelihood should be the same as minimising the chi-squared function, so they must be the same up to some constant, which I have found to be -1/2.
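To spell that out: for independent Gaussian measurements the likelihood is a product of Gaussian densities centred on ##f(x_i;\alpha,\beta)##, so

$$\ln L(\alpha,\beta) = -\frac{1}{2}\sum_{i=1}^{n}\frac{\left(y_{i}-f(x_{i};\alpha,\beta)\right)^{2}}{\sigma_{i}^{2}} - \sum_{i=1}^{n}\ln\left(\sigma_{i}\sqrt{2\pi}\right) = -\frac{1}{2}\chi^{2}(\alpha,\beta) + \text{const}$$

and the second sum does not depend on ##\alpha## or ##\beta##, which is why maximising ##\ln L## is the same as minimising ##\chi^{2}##.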
 
So you want to minimize ##\chi^2## or maximize ##- \chi^2##. I agree with your work then except for some subscripts.

The second question indeed amounts to solving the normal equations by taking partials with respect to ##\alpha## and ##\beta##, setting them equal to zero and then solving.
 
Hmm okay thank you. So doing that, I get these two equations:

\frac{\partial\chi^{2}}{\partial \alpha}=2x\sum_{i=1}^{n}\frac{1}{\sigma_{i}^{2}}(\alpha x +\beta x^{2} - y_{i})=0
and
\frac{\partial\chi^{2}}{\partial \beta}=4\beta x\sum_{i=1}^{n}\frac{1}{\sigma_{i}^{2}}(\alpha x +\beta x^{2} - y_{i})=0

How does one solve these? I thought about setting the summand to 0 since the sum is 0 (although I'm not sure you can do that) but then I get two identical equations - what should I do?
 
Dixanadu said:
[previous post quoted in full above]

Your derivatives look wrong for some reason. They should read:

$$\frac{\partial\chi^{2}}{\partial \alpha}= - 2 \displaystyle \sum_i^n \frac{x_i}{\sigma_i^2}(y_i - \alpha x_i - \beta x_i^2) = 0$$

$$\frac{\partial\chi^{2}}{\partial \beta} = - 2 \displaystyle \sum_i^n \left(\frac{x_i}{\sigma_i}\right)^2(y_i - \alpha x_i - \beta x_i^2) = 0$$

Now distribute the sum to the terms... do a little bit of re-arranging and presto a set of normal equations will result. These normal equations can be solved for ##\alpha## and ##\beta## and happen to be functions of the aforementioned variables in the question.
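As a numerical sanity check (a sketch with made-up data, not part of the original thread): distributing the sums in the two derivatives above gives the 2x2 system the question asks for, with a, b, c, d, e, f as weighted sums over the data. Solving it on noise-free data should recover the true parameters exactly:

```python
import numpy as np

# Made-up data drawn from alpha = 2, beta = 0.5 (noise-free, so the
# estimators should recover the true values exactly).
x = np.array([1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 0.5 * x**2
sigma = np.array([0.1, 0.2, 0.1, 0.3])

w = 1.0 / sigma**2  # weights 1/sigma_i^2

# Normal equations: M @ (alpha, beta) = v, with the entries written
# as the weighted sums that fall out of the two derivatives.
a = np.sum(w * x**2)
b = np.sum(w * x**3)
c = b
d = np.sum(w * x**4)
e = np.sum(w * x * y)
f = np.sum(w * x**2 * y)

M = np.array([[a, b], [c, d]])
v = np.array([e, f])
alpha_hat, beta_hat = np.linalg.solve(M, v)
print(alpha_hat, beta_hat)  # -> 2.0 0.5 (up to floating point)
```

Note that the matrix is symmetric (c = b), which is what you would expect since it comes from the second derivatives of ##\chi^2##.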
 
Yeah, I did mess up the derivatives and my subscripts were also wrong - thank you for clarifying :) I rearranged and got something like this for the matrix:
[attached image: the 2x2 matrix of normal equations]
 
Looks correct at a first glance.
 
Thank you very much :D now I know who to ask for help with stats! Thanks :)
 
