I Maximum likelihood to fit a parameter of this model

AI Thread Summary
The discussion focuses on using maximum likelihood estimation (MLE) to estimate the parameter ##\lambda## in an exponential weighted moving average (EWMA) model for time series data. The model assumes that the data follows a specific structure where the variance is updated recursively. To estimate ##\lambda##, one must first define a probability distribution for the squared observations and then optimize the likelihood function based on the observed data. The likelihood function incorporates the density function of the squared observations, allowing for the estimation of both the parameters of the distribution and ##\lambda##. The process involves differentiating the likelihood with respect to the unknown parameters and solving the resulting equations.
member 428835
Hi PF!

Given random time series data ##y_i##, we assume the data follow an EWMA (exponentially weighted moving average) model: ##\sigma_t^2 = \lambda\sigma_{t-1}^2 + (1-\lambda)y_{t-1}^2## for ##t > 250##, where ##\sigma_t## is the standard deviation, initialized with ##\sigma_{250}^2 = \sum_{i=1}^{250}y_i^2/250##. How would we use maximum likelihood to estimate ##\lambda##?
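For concreteness, here is a minimal NumPy sketch of that recursion (the function name and defaults are my own choice, not from the thread):

```python
import numpy as np

def ewma_variance(y, lam, m=250):
    # sigma_m^2 = (1/m) * sum of the first m squared observations
    sigma2 = [np.mean(np.asarray(y[:m], dtype=float) ** 2)]
    # sigma_t^2 = lam*sigma_{t-1}^2 + (1 - lam)*y_{t-1}^2 for t = m+1, ..., n
    for y_prev in y[m - 1 : len(y) - 1]:      # y_m, ..., y_{n-1} in 1-based terms
        sigma2.append(lam * sigma2[-1] + (1 - lam) * y_prev ** 2)
    return np.array(sigma2)                   # sigma_m^2, ..., sigma_n^2
```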

In general, the principle seems to be: first choose a distribution ##P(y_i)## the data likely came from (e.g. a Bernoulli variable for flipping a coin and estimating the probability of heads ##p##, or a normal distribution if we've been given a sample of people's heights and want to estimate the mean and standard deviation). Next, since the data are i.i.d. (we assume this is true), we maximize ##\prod_i P(y_i)## with respect to the parameter we seek (##p## or ##\mu## in the previous examples; here it would be ##\lambda##). I'm just confused about how the assumed model with ##\sigma## plays a role. Any help is greatly appreciated.
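As a toy illustration of that recipe (made-up coin-flip data; the closed-form answer ##\hat p = \bar y## is recovered by maximizing the product numerically):

```python
import numpy as np
from scipy.optimize import minimize_scalar

flips = np.array([1, 0, 1, 1, 0, 1, 0, 1, 1, 1])  # hypothetical coin-flip data

def neg_log_lik(p):
    # -log prod_i P(y_i) for Bernoulli(p): P(y) = p^y * (1-p)^(1-y)
    return -np.sum(flips * np.log(p) + (1 - flips) * np.log(1 - p))

res = minimize_scalar(neg_log_lik, bounds=(1e-6, 1 - 1e-6), method="bounded")
print(res.x, flips.mean())   # numerical maximizer matches the sample mean
```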
 
The squaring just adds unnecessary superscripts for this exercise, so let's write ##S_i## for ##\sigma^2_i## and ##X_i## for ##y_i^2##.

Typically we assume the ##X_i## are iid. Say the distribution of ##X_i## has parameters ##\mathbf \beta=\langle \beta_1,...,\beta_K\rangle##, and the probability density function of ##X_i## is ##f_{\mathbf \beta}##. We need to estimate ##\mathbf\beta## and ##\lambda## given observations ##s_1, ..., s_N## of the random variables ##S_1, ..., S_N##.

Given the equations
$$S_t = \lambda S_{t-1} + (1-\lambda)X_{t-1}$$
for ##t=2,\dots,N##
and the initialising equation ##S_1=X_1##,
we can write the realized values ##x_1,\dots, x_N## of the random variables ##X_1,\dots, X_N## in terms of just ##\lambda## by inserting the observed values of ##S_1, \dots, S_N##. Write these as ##x_1(\lambda),\dots, x_N(\lambda)## to emphasise this dependence.
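As a small illustration of that inversion (Python/NumPy is my choice here, and the function name is made up):

```python
import numpy as np

def x_of_lambda(s, lam):
    # Invert S_t = lam*S_{t-1} + (1 - lam)*X_{t-1} for the realized values:
    #   x_{t-1} = (s_t - lam*s_{t-1}) / (1 - lam),  t = 2, ..., N.
    # The initialization S_1 = X_1 additionally pins down x_1 = s_1, which
    # agrees with the first entry below when lam is the true parameter.
    s = np.asarray(s, dtype=float)
    return (s[1:] - lam * s[:-1]) / (1.0 - lam)   # x_1(lam), ..., x_{N-1}(lam)
```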

The likelihood of the observed data given ##\mathbf \beta,\lambda## is
$$\mathscr L (\mathbf \beta,\lambda)=
\prod_{i=1}^N
f_{\mathbf\beta}(x_i(\lambda))$$

This expression has ##K+1## unknowns: ##\beta_1, ..., \beta_K, \lambda##. We partially differentiate it with respect to each of those unknowns in turn and set the result equal to zero, giving ##K+1## equations, the same number as we have unknowns. Solving those equations yields the ML estimators of those unknowns. (In practice it is usually easier to differentiate ##\log \mathscr L##, which has the same maximizer.)

Note how we needed the density function of ##X_i## to form the expression for ##\mathscr L##.
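To connect this back to the original ##y_t,\sigma_t## notation: one common concrete instantiation in the volatility-modelling setting (an assumption on my part, not something fixed by this thread) is to take ##y_t## conditionally normal given the past, ##y_t \sim N(0, \sigma_t^2)##, so the density of each observation depends on ##\lambda## through the recursion, and to maximize the log-likelihood numerically rather than solving the score equations by hand. A minimal Python sketch under those assumptions:

```python
import numpy as np
from scipy.optimize import minimize_scalar

def ewma_neg_loglik(lam, y, m=250):
    # Negative log-likelihood under the (assumed) model y_t | past ~ N(0, sigma_t^2),
    # with sigma_t^2 = lam*sigma_{t-1}^2 + (1 - lam)*y_{t-1}^2 for t > m
    # and sigma_m^2 = mean(y_1^2, ..., y_m^2) as the initialization.
    sigma2 = np.mean(y[:m] ** 2)
    nll = 0.0
    for t in range(m, len(y)):                # 0-based: y[t-1] drives sigma_t^2
        sigma2 = lam * sigma2 + (1 - lam) * y[t - 1] ** 2
        nll += 0.5 * (np.log(2 * np.pi * sigma2) + y[t] ** 2 / sigma2)
    return nll

# Synthetic data just to exercise the code (lam_true is made up for the demo).
rng = np.random.default_rng(0)
n, lam_true = 2000, 0.94
y, sigma2 = np.empty(n), 1.0
for t in range(n):
    y[t] = rng.normal(scale=np.sqrt(sigma2))
    sigma2 = lam_true * sigma2 + (1 - lam_true) * y[t] ** 2

res = minimize_scalar(ewma_neg_loglik, bounds=(0.01, 0.999),
                      args=(y,), method="bounded")
print(res.x)   # ML estimate of lambda; should land near lam_true here
```

Here ##\sigma_t## enters exactly through the density of each ##y_t##, which is the role the original question was asking about.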
 