Auto Regressive Moving Average Model (ARMA) Ljung-Box Test

  • #1
mertcan
Hi, for an ARMA model it is said that, in order to check that the error terms are white noise, the Ljung-Box test is applied, involving a sum of squared autocorrelations of the errors at the relevant lags. In short, this sum is chi-square distributed, but with n-p-q degrees of freedom when we have an ARMA(p,0,q) model. My question: is there a mathematical proof of why we subtract p+q from n?

By the way, I know the proof of why we do the similar subtraction in multilinear regression; the mathematical proof of it is displayed in (https://stats.stackexchange.com/que...sion/400261?noredirect=1#comment749409_400261). But even though there is a similarity, I cannot derive it for the ARMA model.
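For reference, writing ##m## for the number of observations and ##h## for the number of residual autocorrelations ##\hat\rho_k## included in the sum, the usual form of the statistic is
$$Q = m(m+2)\sum_{k=1}^{h}\frac{\hat\rho_k^2}{m-k},$$
which, under the null hypothesis of white-noise errors, is approximately chi-square distributed with ##h-p-q## degrees of freedom once the ARMA(p,0,q) parameters have been estimated from the data.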
 
  • #2
mertcan said:
In short, this sum is chi-square distributed, but with n-p-q degrees of freedom when we have an ARMA(p,0,q) model. My question: is there a mathematical proof of why we subtract p+q from n?
No response so far, so I'll give it a try:
Degrees of freedom is the number of points minus the number of parameters in the model that are derived from the data. ARMA(p,0,q) has p+q parameters derived from the data. (I'd say p+q+1 because of the average, but apparently the 0 in there means the average of 0 is hypothesized.)

Not really a proof, more of an explanation...
 
  • #3
BvU said:
No response so far, so I'll give it a try:
Degrees of freedom is the number of points minus the number of parameters in the model that are derived from the data. ARMA(p,0,q) has p+q parameters derived from the data. (I'd say p+q+1 because of the average, but apparently the 0 in there means the average of 0 is hypothesized.)

Not really a proof, more of an explanation...
Thanks for the reply. According to the definition: in statistics, the number of degrees of freedom is the number of values in the final calculation of a statistic that are free to vary. Could you show me, maybe using some mathematical demonstration, how p+q+1 of the values are not free to vary in ARMA? Could you help me picture the case?
 
  • #4
BvU said:
Degrees of freedom is the number of points minus the number of parameters in the model that are derived from the data. ARMA(p,0,q) has p+q parameters derived from the data.
I don't feel like a genuine expert on ARMA (had to look it up for this thread). But I can try to answer
mertcan said:
Could you help me imagine the case?
Simple rule that works for me: take a simple example!
e.g. ARMA(1,0) = AR(1) for two data points has no degrees of freedom: ##c## and ##\phi_1## are fully determined.

In general math: The ##n+1## coefficients for a polynomial of order ##n## through ##n+1## points can be calculated ##\ \Rightarrow\ ## no degrees of freedom

See also here: if you need to establish the quality of the model, the noise level comes in as well.
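To make the polynomial analogy concrete, here is a minimal numerical sketch (Python/NumPy; the numbers are invented purely for illustration): a polynomial of order ##n## fitted to ##n+1## points reproduces the data exactly, so the residuals are forced to zero and no degrees of freedom remain.

```python
import numpy as np

# A degree-n polynomial through n+1 points is fully determined.
n = 3
x = np.arange(n + 1)                    # n+1 = 4 sample points
y = np.array([2.0, -1.0, 0.5, 4.0])     # arbitrary illustrative data

coeffs = np.polyfit(x, y, deg=n)        # n+1 coefficients estimated from n+1 points
residuals = y - np.polyval(coeffs, x)   # the fit reproduces the data exactly

print(residuals)                        # ~[0, 0, 0, 0]: nothing left to vary
```

The same counting is behind the ARMA case: once as many parameters as effective data constraints have been estimated, the corresponding residuals are determined rather than free.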
 
  • #5
BvU said:
I don't feel like a genuine expert on ARMA (had to look it up for this thread). But I can try to answer

Simple rule that works for me: take a simple example !
e.g. ARMA(1,0) = AR(1) for two data points has no degrees of freedom: ##c## and ##\phi_1## are fully determined.

In general math: The ##n+1## coefficients for a polynomial of order ##n## through ##n+1## points can be calculated ##\ \Rightarrow\ ## no degrees of freedom

See also here: if you need to establish the quality of the model, the noise level comes in as well.
Thanks for the reply. What do you say about the following:

Let's say we have an ARMA(4,0,0) process:

$$y_t=\phi_1*y_{t-1}+\phi_2*y_{t-2}+\phi_3*y_{t-3}+\phi_4*y_{t-4}+error_t$$

As you can see EXPECTATION OF $$y_1=\phi_1*y_{0}+\phi_2*y_{-1}+\phi_3*y_{-2}+\phi_4*y_{-3}$$
EXPECTATION OF $$y_2=\phi_1*y_{1}+\phi_2*y_{0}+\phi_3*y_{-1}+\phi_4*y_{-2}$$
EXPECTATION OF $$y_3=\phi_1*y_{2}+\phi_2*y_{1}+\phi_3*y_{0}+\phi_4*y_{-1}$$
EXPECTATION OF $$y_4=\phi_1*y_{3}+\phi_2*y_{2}+\phi_3*y_{1}+\phi_4*y_{0}$$

By the way, we do not know $$y_{0},y_{-1},y_{-2},y_{-3}$$, so maybe we can write
$$y_1=error_1$$
$$y_2=\phi_1*y_{1}+error_2$$
$$y_3=\phi_1*y_{2}+\phi_2*y_{1}+error_3$$
$$y_4=\phi_1*y_{3}+\phi_2*y_{2}+\phi_3*y_{1}+error_4$$
$$error_1=\phi_1*y_{0}+\phi_2*y_{-1}+\phi_3*y_{-2}+\phi_4*y_{-3}$$
$$error_2=\phi_2*y_{0}+\phi_3*y_{-1}+\phi_4*y_{-2}$$
$$error_3=\phi_3*y_{0}+\phi_4*y_{-1}$$
$$error_4=\phi_4*y_{0}$$
In short we have 8 equations and 8 unknowns: $$error_1, error_2, error_3, error_4, y_{0}, y_{-1}, y_{-2}, y_{-3}$$
So error terms 1 to 4 cannot be random, because we solved for them in the linear system; they are not free in the way that $$y_5,y_6,...$$ are, so we have n-4 degrees of freedom. What do you say?
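To make that concrete, here is a small numerical sketch of the 8-equation/8-unknown argument (Python/NumPy; the coefficient and data values are invented purely for illustration and are not from the thread): given the data and the ##\phi_i##, the first four equations pin down the errors, and the last four pin down the presample values by back-substitution.

```python
import numpy as np

# Hypothetical AR(4) coefficients and four observed data points (illustrative only).
phi = np.array([0.5, -0.3, 0.2, 0.1])   # phi_1 .. phi_4
y = np.array([1.0, 0.4, -0.2, 0.9])     # y_1 .. y_4

# Equations 1-4 of the post: the first four errors follow directly from the data.
e = np.empty(4)
e[0] = y[0]
e[1] = y[1] - phi[0]*y[0]
e[2] = y[2] - phi[0]*y[1] - phi[1]*y[0]
e[3] = y[3] - phi[0]*y[2] - phi[1]*y[1] - phi[2]*y[0]

# Equations 5-8: back-substitute to recover the presample values y_0, y_{-1}, y_{-2}, y_{-3}.
y0  = e[3] / phi[3]
ym1 = (e[2] - phi[2]*y0) / phi[3]
ym2 = (e[1] - phi[1]*y0 - phi[2]*ym1) / phi[3]
ym3 = (e[0] - phi[0]*y0 - phi[1]*ym1 - phi[2]*ym2) / phi[3]

# All 8 unknowns are pinned down by the 8 equations -- none of them is "free",
# which is the post's argument for losing p = 4 degrees of freedom.
print(e, y0, ym1, ym2, ym3)
```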
 
  • #6
mertcan said:
In short we have 8 equations and 8 unknowns: $$error_1, error_2, error_3, error_4, y_{0}, y_{-1}, y_{-2}, y_{-3}$$
So error terms 1 to 4 cannot be random, because we solved for them in the linear system; they are not free in the way that $$y_5,y_6,...$$ are, so we have n-4 degrees of freedom. What do you say?
That looks good to me. I think that your example is a higher-dimensional example of the very simple one that @BvU gave. I like the simple example to make the point and I like your higher-dimensional example to show the generalization.
 
  • #7
@FactChecker and @BvU, could you help me with the case ARMA(0,0,q)?
Actually, I converted the MA model to an infinite AR process but cannot get a proper result. I think the derivation is different from that for AR processes?
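For the simplest case, MA(1), converting to an infinite AR process (assuming invertibility, ##|\theta_1|<1##) looks like
$$y_t = error_t + \theta_1\, error_{t-1} \quad\Longrightarrow\quad error_t = \sum_{j=0}^{\infty} (-\theta_1)^j\, y_{t-j},$$
so although there are infinitely many AR coefficients, they are all functions of the single parameter ##\theta_1##, and only one parameter is actually estimated from the data.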
 
  • #8
I think I worked it out, but I would also like your reply as a cross-check...?
 
  • #9
FactChecker said:
That looks good to me. I think that your example is a higher-dimensional example of the very simple one that @BvU gave. I like the simple example to make the point and I like your higher-dimensional example to show the generalization.
BvU said:
I don't feel like a genuine expert on ARMA (had to look it up for this thread). But I can try to answer

Simple rule that works for me: take a simple example !
e.g. ARMA(1,0) = AR(1) for two data points has no degrees of freedom: ##c## and ##\phi_1## are fully determined.

In general math: The ##n+1## coefficients for a polynomial of order ##n## through ##n+1## points can be calculated ##\ \Rightarrow\ ## no degrees of freedom

See also here: if you need to establish the quality of the model, the noise level comes in as well.
I also tried to set up some equations for the ARMA(3,0,2) model. As you know, we lose p+q=5 degrees of freedom, which means we put constraints on 5 error terms. So could you check my proof of it?

$$y_1=error_1$$
$$y_2=\phi_1*y_{1}+error_2$$
$$y_3=\phi_1*y_{2}+\phi_2*y_{1}+\theta_1*error_2+\theta_2*error_1+error_3$$
$$error_1=\phi_1*y_{0}+\phi_2*y_{-1}+\phi_3*y_{-2}+\theta_1*error_0+\theta_2*error_{-1}$$
$$error_2=\phi_2*y_{0}+\phi_3*y_{-1}+\theta_2*error_0$$
$$error_3=\phi_3*y_{0}$$
$$y_4=\phi_1*y_{3}+\phi_2*y_{2}+\phi_3*y_{1}+\theta_1*error_3+\theta_2*error_2+error_4$$
$$y_5=\phi_1*y_{4}+\phi_2*y_{3}+\phi_3*y_{2}+\theta_1*error_4+\theta_2*error_3+error_5$$
If we set error_4 and error_5 to zero, then we have 8 equations and 8 unknowns:

$$error_1,error_2,error_3,error_0,error_{-1},error_{-2},y_0,y_{-1},y_{-2}$$

In short,

$$error_1,error_2,error_3,error_0,error_{-1},error_4,error_5$$

have been set without considering their randomness. What do you think about that?
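For readers cross-checking the system above, the underlying ARMA(3,0,2) recursion that these equations rearrange (in the same notation as the rest of the thread) is
$$y_t=\phi_1*y_{t-1}+\phi_2*y_{t-2}+\phi_3*y_{t-3}+\theta_1*error_{t-1}+\theta_2*error_{t-2}+error_t.$$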
 

What is an Auto Regressive Moving Average Model (ARMA)?

An Auto Regressive Moving Average Model (ARMA) is a statistical model used to analyze and forecast time series data. It combines both the autoregressive (AR) model, which takes into account the relationship between a variable and its own past values, and the moving average (MA) model, which takes into account the relationship between a variable and its own past errors.
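In symbols, an ARMA(p, q) model can be written as
$$y_t = c + \sum_{i=1}^{p}\phi_i\, y_{t-i} + \sum_{j=1}^{q}\theta_j\, \varepsilon_{t-j} + \varepsilon_t,$$
where ##\varepsilon_t## is a white-noise error term, the ##\phi_i## are the autoregressive coefficients, and the ##\theta_j## are the moving-average coefficients.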

What is the Ljung-Box Test?

The Ljung-Box Test is a statistical test used to determine if a time series data set is random or not. It examines whether there is any significant autocorrelation in the data, which is when a variable is correlated with its own past values. The test calculates a p-value, and if the p-value is less than a chosen significance level, it indicates that there is significant autocorrelation in the data.

How is the Ljung-Box Test used in ARMA models?

The Ljung-Box Test is used in ARMA models to determine if the model adequately captures the autocorrelation in the data. The test is performed on the residuals of the ARMA model, which are the differences between the actual values and the predicted values. If the p-value is greater than the chosen significance level, it indicates that the model is a good fit for the data. However, if the p-value is less than the significance level, it suggests that the model may need to be improved.
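As a rough illustration of this workflow, here is a minimal sketch assuming the Python statsmodels package and its ARIMA and acorr_ljungbox interfaces (including the model_df argument); verify the details against your installed version.

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.stats.diagnostic import acorr_ljungbox

# Simulate an AR(1) series just to have something to fit (illustrative only).
rng = np.random.default_rng(0)
n = 500
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.6 * y[t - 1] + rng.standard_normal()

p, q = 1, 0
res = ARIMA(y, order=(p, 0, q)).fit()   # fit an ARMA(p, 0, q) model

# Ljung-Box on the residuals; model_df = p + q removes the p + q estimated
# coefficients from the degrees of freedom, i.e. df = h - p - q at lag h.
lb = acorr_ljungbox(res.resid, lags=[10], model_df=p + q)
print(lb)   # a large p-value -> no evidence the residuals are not white noise
```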

What is the significance level in the Ljung-Box Test?

The significance level in the Ljung-Box Test is the threshold used to determine if the autocorrelation in the data is statistically significant. It is typically set at 0.05 or 0.01, depending on the level of confidence desired. If the p-value is less than the significance level, it indicates that there is significant autocorrelation in the data and the null hypothesis (that there is no autocorrelation) can be rejected.

What are the assumptions of the Ljung-Box Test?

The Ljung-Box Test assumes that the data is stationary, meaning that the mean and variance of the data do not change over time. Under the null hypothesis, the data is independent and identically distributed (iid), meaning that each data point is independent of the others and follows the same probability distribution; equivalently, there is no autocorrelation at the lags being tested.
