Stationarity of Time Series: Tests

kimberley
This post may seem a bit meandering, but the background is needed to communicate my thoughts and frame my ultimate questions.

Very little of the literature on Time Series Models makes reference to what is sparsely referred to elsewhere as the "Coefficient of Error", or even to R-squared/Adjusted R-squared, as tests of an n-period window of a data series for stationarity of the Mean and/or Standard Deviation/Variance. Intuitively, these two statistics seem like the most parsimonious way to accomplish this end, but I suspect there are various arguments that militate against this conclusion, since these tests seem to be rarely, if ever, used for the purpose. I'd be interested in your views.

Along these lines, consider a basic, non-linear Simple Moving Average ("SMA") Time Series, where your X axis is simply consecutive numbers representing Time (i.e., 1 = Day 1, 2 = Day 2, 3 = Day 3, etc.), and your Y axis is the residuals for those days. The three primary goals in constructing an SMA Model are stationarity of the Mean, stationarity of the Standard Deviation/Variance, and Normality.

Now, follow me through this tangential reasoning in framing my ultimate questions. We know that the "Coefficient of Variation" is used to measure the relative dispersion of data sets by dividing each data set's Standard Deviation by its Mean. The simplest and most common example given for the use of the "Coefficient of Variation" seems to be with regard to stock prices and risk. That is, if you have "Stock A" with a Mean price of $50 over the last 30 days, and a Standard Deviation of $5 over the last 30 days, its Coefficient of Variation is 0.1 over the last 30 days; and if you have "Stock B" with a Mean price of $50 and a Standard Deviation of $2 over the last 30 days, its Coefficient of Variation would be 0.04 over the last 30 days. We conclude, therefore, that the relative dispersion of Stock B is less than that of Stock A over the last 30 days. In other words, Stock B has been less risky over the last 30 days because the ratio of its Standard Deviation to its Mean is smaller than that of Stock A. NOTE: COEFFICIENT OF VARIATION IS TYPICALLY USED IN THE LITERATURE WHERE YOU ARE COMPARING DISPERSION BETWEEN TWO DATA SETS (i.e., stocks in the above case) OVER TIME PERIODS OF EQUAL LENGTH.
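As a quick numerical sketch of the calculation above (Python; the means and standard deviations are the ones from the stock example, not real data):

```python
def coefficient_of_variation(mean, std):
    """CV = standard deviation / mean: relative dispersion of a data set."""
    return std / mean

# Stock A: mean $50, std $5 over 30 days; Stock B: mean $50, std $2.
cv_a = coefficient_of_variation(50.0, 5.0)  # 0.1
cv_b = coefficient_of_variation(50.0, 2.0)  # 0.04

# Stock B has the smaller CV, i.e. less relative dispersion (less "risky").
print(cv_a, cv_b)
```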

With the above example in mind, couldn't we use a modified version of the Coefficient of Variation, the so-called "Coefficient of Error", to determine the relative stationarity of nested Moving Averages, of unequal length, in the same overall Time Series? (i.e., comparing the stationarity of the 89 day Simple Moving Average to the stationarity of the 58 day Simple Moving Average, as of today). Toward this end, as I understand it, the Coefficient of Error is really just a combination of the Standard Error of the Mean and the Coefficient of Variation. To wit, the Coefficient of Error of a data set is calculated by multiplying the Standard Deviation of the data set by the square root of 1/n (equivalently, dividing it by the square root of n) and then dividing that product by the Mean. In other words, it uses the same building block as a confidence interval (the Standard Error of the Mean), but it uses the standard deviation of the sample rather than that of the whole population.
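A minimal sketch of that comparison (Python/NumPy; the price series is simulated purely for illustration, and the formula is the one described above, CoE = (s / sqrt(n)) / mean):

```python
import numpy as np

def coefficient_of_error(window):
    """CoE = (sample std / sqrt(n)) / mean, i.e. the Standard Error of the
    Mean expressed relative to the Mean."""
    x = np.asarray(window, dtype=float)
    n = len(x)
    return (np.std(x, ddof=1) / np.sqrt(n)) / np.mean(x)

# Simulated price series (assumed data); compare the trailing 89-day and
# 58-day windows ending "today" (the last observation).
rng = np.random.default_rng(1)
prices = 50 + np.cumsum(rng.normal(0, 0.5, 200))

coe_89 = coefficient_of_error(prices[-89:])
coe_58 = coefficient_of_error(prices[-58:])
print(coe_89, coe_58)  # smaller CoE = narrower interval around the Mean
```

Note that a perfectly constant window has a CoE of exactly zero, which matches the intuition that it is maximally "stationary" in this sense.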

QUESTIONS:

1. At the end of each day, when analyzing a Simple Moving Average ("SMA") Time Series, wouldn't the most stationary SMA be the n-period with the narrowest confidence interval (i.e., the smallest Coefficient of Error)?

2. Alternatively, in an SMA Time Series, isn't the n-period with the lowest R-squared/Adjusted R-squared statistic the most stationary (i.e., showing no linear trend)? For instance, if over the last 98 days the R-squared statistic is 0, and that is the lowest R-squared reading of any n-period, wouldn't that also be an indication of stationarity?
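For concreteness, here is a sketch of the R-squared calculation implied by Q.2: regress the series on time t = 1..n and take the R-squared of that linear fit (Python/NumPy; the two series are simulated for illustration). A caveat: a near-zero R-squared here only indicates the absence of a linear trend, not stationarity in the full technical sense.

```python
import numpy as np

def r_squared_vs_time(y):
    """OLS fit of y on t = 1..n; returns the R-squared of the linear trend."""
    y = np.asarray(y, dtype=float)
    t = np.arange(1, len(y) + 1)
    slope, intercept = np.polyfit(t, y, 1)
    fitted = slope * t + intercept
    ss_res = np.sum((y - fitted) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1 - ss_res / ss_tot

# A flat (trendless) 98-day window should give R^2 near 0;
# a strongly trending one should give R^2 near 1.
rng = np.random.default_rng(2)
flat = 50 + rng.normal(0, 1, 98)
trending = 50 + 0.5 * np.arange(98) + rng.normal(0, 1, 98)
print(r_squared_vs_time(flat), r_squared_vs_time(trending))
```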
 
With respect to Q.2, what is the regression model you have in mind? (See below.)

With respect to Q.1: on the face of it, the CoE appears to be a neat way to introduce the sample size into the formula, enabling the researcher to compare models with different sample sizes (which the CoV does not allow). However, a visual comparison of two or more CoEs is not a statistical test and cannot be used for "statistical inference" in the technical sense.

Measures of risk (e.g. financial) usually assume some kind of structural model, even a naive one. The CAPM regression model used in financial theory is a simple and commonly accepted way of evaluating the riskiness of stocks (relative to the market). Using a simple CAPM regression, it is possible to formally test the relative riskiness of two stocks in a way that a visual comparison of two or more coefficients does not allow.

Also, a structural model such as the CAPM provides a concrete statistical basis to define terms such as "coefficient of determination" (commonly referred to as the R-squared statistic).
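To illustrate, a minimal sketch of the simple CAPM regression mentioned above (Python/NumPy; the return series are simulated with an assumed beta of 1.3, purely to show where beta and R-squared come from):

```python
import numpy as np

def capm_beta(stock_ret, market_ret):
    """OLS regression of stock returns on market returns.
    Returns (beta, R-squared): beta is the slope (relative riskiness),
    R-squared is the coefficient of determination of the fit."""
    x = np.asarray(market_ret, dtype=float)
    y = np.asarray(stock_ret, dtype=float)
    beta, alpha = np.polyfit(x, y, 1)
    fitted = beta * x + alpha
    r2 = 1 - np.sum((y - fitted) ** 2) / np.sum((y - np.mean(y)) ** 2)
    return beta, r2

# Simulated daily returns: stock = alpha + beta * market + noise.
rng = np.random.default_rng(3)
market = rng.normal(0.0005, 0.01, 250)
stock = 0.0001 + 1.3 * market + rng.normal(0, 0.005, 250)

beta, r2 = capm_beta(stock, market)
print(beta, r2)  # beta should recover roughly 1.3
```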
 