Stationarity of Time Series: Tests

AI Thread Summary
The discussion centers on the application of the Coefficient of Error and R-squared in assessing the stationarity of time series data, particularly in Simple Moving Average (SMA) models. It argues that these statistical measures could provide insights into the stationarity of mean and variance over different periods. The Coefficient of Error, a combination of standard error and coefficient of variation, is proposed as a tool for comparing the stationarity of nested SMAs of unequal lengths. The conversation also touches on the limitations of visual comparisons of coefficients for statistical inference and emphasizes the importance of structural models like CAPM for evaluating risk in financial contexts. Ultimately, the thread raises questions about the effectiveness of these measures in determining the most stationary SMA and the implications for regression modeling.
kimberley
This post may seem a bit meandering, but I hope it fully communicates my thoughts and ultimate questions.

Very little of the literature on Time Series Models makes reference to what is sparsely referred to elsewhere as the "Coefficient of Error", or even R-squared/Adjusted R-squared, in testing n-periods in a data series for stationarity of Mean and/or Standard Deviation/Variance. Viscerally, it seems to me that these two statistical tests would be the most parsimonious in accomplishing this end, but I suspect that there are various arguments that militate against this conclusion since these tests seem to be rarely, if ever, used. I'd be interested in your views.

Along these lines, consider a basic, Non-Linear, Simple Moving Average ("SMA") Time Series where your X axis is simply consecutive numbers representing Time (i.e., 1 = Day 1, 2 = Day 2, 3 = Day 3, etc.), and your Y axis is your residuals for those days. The three primary goals in constructing an SMA Model are stationarity of Mean, stationarity of Standard Deviation/Variance, and Normality.

Now, follow me through this tangential reasoning in framing my ultimate questions. We know that the "Coefficient of Variation" is used to measure the relative dispersion of data sets by dividing the standard deviation of each data set by its Mean. The simplest and most common example given for the use of the "Coefficient of Variation" seems to be with regard to stock prices and risk. That is, if you have "Stock A" with a Mean price of $50 and a Standard Deviation of $5 over the last 30 days, its Coefficient of Variation is 0.1 over the last 30 days, and if you have "Stock B" with a Mean price of $50 and a Standard Deviation of $2 over the last 30 days, its Coefficient of Variation is 0.04 over the last 30 days. We conclude, therefore, that the relative dispersion of Stock B is less than that of Stock A over the last 30 days. In other words, Stock B has been less risky over the last 30 days because the ratio of its Standard Deviation to its Mean is smaller than that of Stock A. NOTE: the Coefficient of Variation is typically used in the literature where you are comparing dispersion between two data sets (i.e., stocks in the above case) over time periods of equal length.
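The arithmetic in the stock example can be checked with a few lines of Python. This is only a sketch: the function name `coefficient_of_variation` is made up for illustration, and the inputs are the Mean and Standard Deviation figures quoted above.

```python
def coefficient_of_variation(std_dev, mean):
    """Relative dispersion: standard deviation divided by the mean."""
    return std_dev / mean

# Figures from the post: both stocks have a 30-day mean price of $50.
cv_a = coefficient_of_variation(5.0, 50.0)  # Stock A, standard deviation $5
cv_b = coefficient_of_variation(2.0, 50.0)  # Stock B, standard deviation $2

# Stock B shows the smaller relative dispersion over the same 30 days.
print(cv_a, cv_b, cv_b < cv_a)
```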

With the above example in mind, couldn't we use a modified version of the Coefficient of Variation, the so-called "Coefficient of Error", to determine the relative stationarity of nested Moving Averages of unequal length in the same overall Time Series (e.g., comparing the stationarity of the 89-day Simple Moving Average to that of the 58-day Simple Moving Average, as of today)? Toward this end, as I understand it, the Coefficient of Error is really just a combination of the Standard Error of the Mean and the Coefficient of Variation. To wit, the Coefficient of Error of a data set is calculated by multiplying the sample standard deviation of the data set by the square root of 1/n and then dividing that product by the Mean. In other words, it is the standard error of the mean (the same quantity that sets the width of a confidence interval for the Mean, estimated from the sample rather than the whole population) expressed as a fraction of the Mean.
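As a rough sketch of the calculation just described, the Coefficient of Error of the trailing n observations could be computed as below. The helper names `coefficient_of_error` and `trailing_coe` and the short `prices` list are hypothetical, chosen only for illustration; the sample standard deviation (n - 1 denominator) is assumed.

```python
import math
import statistics

def coefficient_of_error(values):
    """Sample standard deviation times sqrt(1/n), divided by the mean."""
    n = len(values)
    s = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    return s * math.sqrt(1.0 / n) / statistics.mean(values)

def trailing_coe(series, window):
    """Coefficient of Error of the most recent `window` observations."""
    return coefficient_of_error(series[-window:])

# Made-up prices, for illustration only.
prices = [49.0, 51.0, 50.5, 48.0, 50.0, 52.0]

# Nested windows of unequal length, both ending "today".
coe_long = trailing_coe(prices, 6)
coe_short = trailing_coe(prices, 4)
print(coe_long, coe_short)
```

Because the result is a ratio to the Mean, it becomes unstable when the Mean is close to zero, which is worth keeping in mind if the series being analyzed consists of residuals.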

QUESTIONS:

1. At the end of each day, when analyzing a Simple Moving Average ("SMA") Time Series, wouldn't the most stationary SMA be the n-period with the narrowest confidence interval (i.e. smallest Coefficient of Error)?

2. Alternatively, in a SMA Time Series, isn't the n-period with the lowest R-squared/Adjusted R-squared statistic the most stationary (no linear trend)? For instance, if over the last 98 days the R-squared statistic is 0, and is the lowest R-squared reading of any n-period, wouldn't that be an indication of stationarity as well?
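Question 2 does not specify a regression model. Assuming it means an ordinary least-squares fit of the observations against the day index (1, 2, 3, ...), the R-squared for one n-period could be sketched as follows. The function name `r_squared_on_time` is hypothetical.

```python
def r_squared_on_time(values):
    """R-squared of an OLS line fitted to values against t = 1, 2, ..., n."""
    n = len(values)
    t = range(1, n + 1)
    t_mean = (n + 1) / 2
    y_mean = sum(values) / n
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, values))
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    syy = sum((yi - y_mean) ** 2 for yi in values)
    if syy == 0:
        return 0.0  # a constant series has no variation to explain
    return (sxy * sxy) / (sxx * syy)

# A steady upward drift gives R-squared of 1.
trending = r_squared_on_time([1.0, 2.0, 3.0, 4.0])

# A hump-shaped series has no linear trend, so R-squared is 0,
# even though its level clearly changes within the window.
humped = r_squared_on_time([1.0, 2.0, 2.0, 1.0])
print(trending, humped)
```

The second case illustrates that an R-squared of 0 only indicates the absence of a linear trend over the window.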
 
With respect to Q.2, what is the regression model you have in mind? (See below.)

With respect to Q.1: on the face of it, CoE appears as a neat way to introduce the sample size into the formula, thus enabling the researcher to compare models with different sample sizes (which the CoV does not). However, even a visual comparison of two or more CoEs is not a statistical test and cannot be used for "statistical inference" in a technical sense.

Measures of risk (e.g. financial) usually assume some kind of structural model, even a naive one. The CAPM regression model used in financial theory is a simple and commonly accepted model of evaluating riskiness of stocks (relative to the market). Using a simple CAPM regression model, it is possible to test the relative riskiness of two stocks in a way that a visual comparison of two or more coefficients does not allow.

Also, a structural model such as the CAPM provides a concrete statistical basis to define terms such as "coefficient of determination" (commonly referred to as the R-squared statistic).
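A minimal sketch of the kind of CAPM regression mentioned above: each stock's excess return is regressed on the market's excess return, and the slope (beta) measures riskiness relative to the market. The function name `ols_fit` and all return figures are made up for illustration. A formal test of relative riskiness would also need standard errors for the estimated betas, which this sketch does not compute.

```python
def ols_fit(x, y):
    """Simple OLS of y on x; returns (alpha, beta, r_squared)."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    syy = sum((yi - y_mean) ** 2 for yi in y)
    beta = sxy / sxx
    alpha = y_mean - beta * x_mean
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0 else 0.0
    return alpha, beta, r_squared

# Hypothetical excess returns (return minus risk-free rate), noise-free
# by construction so that the fitted betas are easy to verify.
market_excess = [0.010, -0.020, 0.030, 0.000, 0.015]
stock_a_excess = [0.001 + 1.5 * m for m in market_excess]
stock_b_excess = [0.001 + 0.5 * m for m in market_excess]

alpha_a, beta_a, r2_a = ols_fit(market_excess, stock_a_excess)
alpha_b, beta_b, r2_b = ols_fit(market_excess, stock_b_excess)

# The stock with the larger beta moves more with the market.
print(beta_a, beta_b, r2_a, r2_b)
```

In this setting the R-squared has a concrete meaning: it is the share of the variation in a stock's excess return that is explained by the market's excess return.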
 