How do you implement the Dickey-Fuller test?

In summary: the poster wants to run a Dickey-Fuller test on a time series modeled as an AR(1) process, to decide whether the process has a unit root (the test's null hypothesis is that it does). They found few resources that walk through the procedure step by step. The replies suggest estimating the AR(1) coefficient by least squares and discuss what the "standard error of A" in the test statistic means.
  • #1
tomizzo
Hi there,

I've recently started learning methods for determining whether or not a time series is stationary. The first method I'm trying to learn is the 'Dickey-Fuller Test'. This test uses a time series modeled by an AR(1) process. The key is to find whether or not this process contains a unit root. If it contains a unit root, the series is said to be non-stationary.

While I'm understanding most of the derived equations, I'm inexperienced in hypothesis testing. Thus I'm struggling with the part when we actually implement the Dickey-Fuller test. There do not seem to be many resources that outline the iterative process for conducting the test.

I've outlined my question in full here:

http://imgur.com/HMWtn59

I appreciate any help!
 
  • #2
I commiserate with you about how sketchy explanations of the test on the web are. I don't know the answer, but I'll make a guess. Let's say your time series data is ## x_1, x_2, x_3,...## and we are assuming an ##AR(1)## model. I think you do a least squares fit of an equation of the form ## y = A w ## to your data ## (y_t, w_t ) ## where ##y_t = x_t - x_{t-1}## and ##w_t =x_{t-1}##. This isn't the usual kind of linear least squares fit because the equation being fit isn't ##y = Aw + B##.

The Dickey-Fuller statistic is usually quoted as ##(A-1)## divided by "##se(A)##", which matches a fit of ##x_t## against ##x_{t-1}##; for the differenced fit above, where the slope already estimates the coefficient minus 1, the corresponding statistic would be ##A## divided by ##se(A)##. It seems "##se(A)##" abbreviates "the standard error of ##A##". The true ##A## is a constant, but the fitted value is computed from noisy data, so I take ##se(A)## to be an estimate of the standard deviation of that fitted value, built from the residuals ##( (x_t -x_{t-1}) - Ax_{t-1})##. That raises the question of which convention the term "standard error" implies for the residual variance, e.g. divide by N or divide by N-1. You'll have to check that against whichever reference you follow.
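
A minimal Python sketch of the calculation discussed here, under stated assumptions rather than as a definitive implementation. It fits ##y_t = A w_t## with no intercept, takes ##se(A)## to be the ordinary least squares standard error of the fitted slope (residual variance divided by N-1, since one parameter is fitted), and forms the statistic as ##A / se(A)## because the differenced regression is used. The function name `dickey_fuller_stat` and the simulated series are hypothetical, chosen only for illustration.

```python
import math
import random


def dickey_fuller_stat(x):
    """Return (a_hat, se_a, tau) for the no-intercept regression
    (x[t] - x[t-1]) = A * x[t-1] + error."""
    w = x[:-1]                                   # lagged levels x[t-1]
    y = [b - a for a, b in zip(x[:-1], x[1:])]   # first differences x[t] - x[t-1]
    n = len(y)
    sww = sum(wi * wi for wi in w)
    # Least squares slope for a line forced through the origin.
    a_hat = sum(wi * yi for wi, yi in zip(w, y)) / sww
    rss = sum((yi - a_hat * wi) ** 2 for wi, yi in zip(w, y))
    s2 = rss / (n - 1)            # assumed convention: one fitted parameter
    se_a = math.sqrt(s2 / sww)    # standard error of the slope estimate
    return a_hat, se_a, a_hat / se_a


# Hypothetical data: a stationary AR(1) series with coefficient 0.5.
random.seed(0)
series = [0.0]
for _ in range(500):
    series.append(0.5 * series[-1] + random.gauss(0.0, 1.0))

a_hat, se_a, tau = dickey_fuller_stat(series)
print(a_hat, se_a, tau)
```

The resulting `tau` is compared against Dickey-Fuller critical values, not against the usual t table.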
 
  • #3
Replying to Stephen Tashi (post #2):

Hi Stephen,

I appreciate the response! After further investigation, I believe you are correct that the parameter must be found via a least squares estimate. I am currently trying to find a method for fitting the polynomial y(x) = a0 + a1*x under the assumption a0 = 0. However, I'm having a tough time working out how to solve this least squares problem with a0 fixed at 0...
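
As a sketch of the constrained fit being asked about here, assuming data pairs ##(x_i, y_i)##, ##i = 1, \dots, n## (notation introduced only for this illustration), fixing ##a_0 = 0## leaves a one-parameter least squares problem with a closed-form solution:

$$S(a_1) = \sum_{i=1}^{n} (y_i - a_1 x_i)^2, \qquad \frac{dS}{da_1} = -2 \sum_{i=1}^{n} x_i (y_i - a_1 x_i) = 0 \quad\Longrightarrow\quad a_1 = \frac{\sum_{i=1}^{n} x_i y_i}{\sum_{i=1}^{n} x_i^2}$$

For the Dickey-Fuller regression, the response would be the difference ##x_t - x_{t-1}## and the regressor would be the lagged value ##x_{t-1}##.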

I will let you know if I find out anything else!
 
  • #4
Hey tomizzo.

Is there any reason you can't fit the data and do an inference on a0?

For a simple linear regression you can set a0 = 0 and see what effect that constraint has on the fit (the fitted line is then forced through the origin).
 
  • #5
Also - with regard to your AR(1) time series - have you looked at results for AR(x) in terms of finding a distribution that has a fixed variance/mean?

The AR(1) is the simplest one and you can find a recurrence relation (this one is very simple) that gives you a constraint on whether it will "converge" or not.

You don't need all of the time series stuff with shifting operators and doing all the operator algebras (which you do when you have an arbitrary time series) - you can write it as a sum once you expand it and then look for the condition for your coefficient so that it will converge properly.

If you expand the recurrence relation out to an explicit form it should become clearer with regards to what I'm talking about.
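
A sketch of the expansion described above, assuming the AR(1) model is written ##x_t = \rho x_{t-1} + \epsilon_t## with independent zero-mean errors of variance ##\sigma^2## (the symbols ##\rho##, ##\epsilon_t## and ##\sigma^2## are notation chosen for this illustration, and ##x_0## is treated as fixed):

$$x_t = \rho^t x_0 + \sum_{k=0}^{t-1} \rho^k \epsilon_{t-k}, \qquad \operatorname{Var}(x_t) = \sigma^2 \sum_{k=0}^{t-1} \rho^{2k}$$

For ##|\rho| < 1## the geometric sum converges to ##\sigma^2 / (1 - \rho^2)## and the influence of ##x_0## dies out, so the mean and variance settle to fixed values. For ##\rho = 1## (the unit root case) the variance is ##t \sigma^2##, which grows without bound.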
 

1. What is the purpose of the Dickey-Fuller test?

The Dickey-Fuller test checks whether a time series has a unit root, which is one common way a series can be non-stationary. Stationarity is an important assumption in many time series models, and the test helps to assess whether that assumption is reasonable.

2. How do you perform the Dickey-Fuller test?

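To perform the Dickey-Fuller test, you first state the null hypothesis, which is that the series has a unit root and is therefore non-stationary. You then estimate the AR(1) coefficient by least squares, form the test statistic from the estimate and its standard error, and compare it with a critical value from a Dickey-Fuller table (under the null hypothesis the statistic does not follow the usual t distribution). If the test statistic is less than (more negative than) the critical value, the null hypothesis is rejected, which is evidence that the series is stationary.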
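
The comparison step can be sketched in Python as below. The critical values are the commonly tabulated approximate large-sample values for the regression with no constant and no trend; they are an assumption here and should be checked against a Dickey-Fuller table for your sample size and regression form. The names `DF_CRITICAL_NO_CONSTANT` and `reject_unit_root` are hypothetical.

```python
# Approximate large-sample Dickey-Fuller critical values for the regression
# without constant or trend (assumed; verify against a published table).
DF_CRITICAL_NO_CONSTANT = {"1%": -2.58, "5%": -1.95, "10%": -1.62}


def reject_unit_root(tau, level="5%", critical=DF_CRITICAL_NO_CONSTANT):
    """Return True when tau is below (more negative than) the critical value,
    i.e. when the unit root null hypothesis is rejected at the given level."""
    return tau < critical[level]


# Example statistics chosen by hand for illustration.
for tau in (-3.1, -1.0):
    print(tau, reject_unit_root(tau))
```

The test is one-sided: only sufficiently negative statistics count as evidence against a unit root.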

3. What are the assumptions of the Dickey-Fuller test?

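The basic Dickey-Fuller test assumes that the series follows an AR(1) model whose error terms are serially uncorrelated, that is, the errors (not the observations themselves) are unrelated across time points. When the errors are autocorrelated, the augmented Dickey-Fuller test, which adds lagged differences to the regression, is normally used instead. The random walk, in which the difference between two successive observations is a random value with a mean of zero, is the null hypothesis being tested rather than an assumption about the data.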

4. What are the limitations of the Dickey-Fuller test?

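The Dickey-Fuller test has a few limitations. First, the basic version relies on an AR(1) model with uncorrelated errors, which may not hold. Second, it has low power in small samples and when the AR coefficient is close to, but below, 1, so it can fail to reject a unit root for a series that is actually stationary. Additionally, the result depends on whether a constant or a linear time trend is included in the regression, and each choice has its own table of critical values.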

5. When should I use the Dickey-Fuller test?

The Dickey-Fuller test is commonly used in econometrics and finance to analyze time series data. It is typically applied before model fitting to decide whether a series should be differenced, often alongside other diagnostics such as an inspection of the autocorrelation function. It tests for a unit root specifically; it is not a tool for identifying seasonality.
