Time series analysis and data transformation

  • Context: Undergrad 
  • Thread starter fog37

Discussion Overview

The discussion revolves around time series analysis, specifically focusing on the necessity of data transformations to achieve stationarity for various forecasting models such as AR, ARMA, ARIMA, and SARIMA. Participants explore the implications of using transformed data versus original data in modeling and forecasting.

Discussion Character

  • Exploratory
  • Technical explanation
  • Debate/contested
  • Mathematical reasoning

Main Points Raised

  • Some participants express concern that forecasting models should ideally work with data resembling the original dataset rather than transformed data.
  • It is noted that as long as an inverse transformation exists, predictions can be converted back to the original scale, but the properties of residuals may differ after transformation.
  • There is a discussion on the various definitions of "stationary" and how ARIMA and SARIMA models may not be considered stationary in a strict sense.
  • One participant suggests that time-series models like AR and MA can be viewed as discrete time ODEs, questioning why these models do not yield a final solution for the variable being predicted.
  • Another participant explains that solving for the variable directly would involve cumulative random terms, which could introduce significant variance, thus justifying the use of previous values instead.
  • Concerns are raised about the need for manual removal of seasonality when using ARIMA, as differencing alone may not suffice.
  • Some participants advocate for the use of SARIMA, arguing it simplifies the process by automatically handling trend and seasonality without requiring manual transformations.
  • There is a caution against using a "throw everything at the wall" approach when selecting model terms, emphasizing the importance of statistical significance and subject-matter reasoning in model selection.

Areas of Agreement / Disagreement

Participants express differing views on the necessity and implications of data transformations in time series analysis. There is no consensus on the best approach, with ongoing debate about the effectiveness and appropriateness of various models and methods.

Contextual Notes

Participants highlight limitations regarding the assumptions of stationarity and the impact of transformations on residual properties. The discussion reflects varying interpretations of model requirements and the handling of seasonality.

fog37
TL;DR: time series analysis and transformations
Hello,
Many time-series forecasting models (AR, ARMA, ARIMA, SARIMA, etc.) require the time series data to be stationary.

But often, due to seasonality, trend, etc., we start with an observed time series that is not stationary. So we apply transformations to the data so that it becomes stationary. Essentially, we get a new, stationary time series which we use to create the model (AR, ARMA, etc.). But the transformed data is very different from the original data... Isn't the model supposed to work with data like the original data, i.e. isn't the goal to build a model that describes, and can forecast, data that looks like the original data, not like the transformed data?

Thanks!
 
fog37 said:
TL;DR Summary: time series analysis and transformations

Isn't the model supposed to work with data like the original data, i.e. isn't the goal to build a model that describes, and can forecast, data that looks like the original data, not like the transformed data?
As long as there is an inverse transform then you can get back to the original scale.

The usual problem with computing the statistics on the transformed data is that the residuals usually have different properties. Assumptions on the residual distribution hold on the transformed scale, and when inverse transformed they may be quite different.
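
To make the point about residuals concrete, here is a minimal sketch that assumes the transform is a natural log. The forecast value and band width are made-up numbers for illustration, not anything from this thread. An error band that is symmetric on the log scale becomes asymmetric once it is mapped back with `exp`:

```python
import math

# Hypothetical forecast on the log scale (illustrative numbers only).
log_forecast = math.log(100.0)  # point forecast of log(y)
half_width = 0.5                # symmetric +/- error band on the log scale

# Inverse transform: map each value back to the original scale with exp.
lower = math.exp(log_forecast - half_width)
point = math.exp(log_forecast)
upper = math.exp(log_forecast + half_width)

# The band is no longer symmetric around the point forecast.
distance_below = point - lower
distance_above = upper - point

print(f"lower={lower:.1f} point={point:.1f} upper={upper:.1f}")
print(f"below={distance_below:.1f} above={distance_above:.1f}")
```

If the log-scale errors are normal, `exp` of the log-scale point forecast is the median on the original scale rather than the mean, which is one reason back-transformed forecasts are sometimes bias-corrected.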
 
There are many levels and definitions of "stationary". See Stationary process. A lot of people would not consider an ARIMA or SARIMA process to be stationary in the simplest sense.
 
Dale said:
As long as there is an inverse transform then you can get back to the original scale.

The usual problem with computing the statistics on the transformed data is that the residuals usually have different properties. Assumptions on the residual distribution hold on the transformed scale, and when inverse transformed they may be quite different.
Ok, I guess the key phrase is "inverse transformation". We convert the original signal into a new signal, create a model for the new signal, make predictions, and finally apply an inverse transformation to the predictions, which then make sense for the original data...
It is the same thing as when we convert a time-domain signal ##f(t)## into its frequency version ##F(\omega)##, solve the problem in the frequency domain, get a frequency-domain solution, and convert that solution back to the time domain...
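
The transform, model, inverse-transform loop described above can be sketched as follows, assuming the transform is first differencing. The "model" here is a deliberately trivial stand-in (predict each future difference as the mean of the past differences), and the function names and data are hypothetical; a real workflow would fit an AR/ARMA model to the differences instead.

```python
from itertools import accumulate

def difference(series):
    """First difference: d[i] = series[i + 1] - series[i]."""
    return [b - a for a, b in zip(series, series[1:])]

def undifference(start_value, diffs):
    """Inverse of `difference`: cumulative sum starting from a known level."""
    return list(accumulate(diffs, initial=start_value))

def forecast_original_scale(series, steps):
    """Forecast on the differenced scale, then map back to the original scale."""
    diffs = difference(series)
    # Stand-in model (an assumption): every future difference equals the mean past difference.
    mean_diff = sum(diffs) / len(diffs)
    predicted_diffs = [mean_diff] * steps
    # Inverse transform: accumulate the predicted differences on top of the last observation.
    return undifference(series[-1], predicted_diffs)[1:]

y = [10.0, 12.0, 14.5, 16.0, 18.5]  # illustrative data with an upward trend
round_trip = undifference(y[0], difference(y))
forecast = forecast_original_scale(y, steps=3)

print(round_trip)
print(forecast)
```

The round trip returns the original series, which is the sense in which nothing is lost by modelling the differenced data: the forecasts come back on the original scale.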
 
Yes, that is a good example
 
One realization I just had is that time-series models like AR, MA, ARMA, etc. seem to just be discrete-time ODEs, i.e. difference equations... But these linear models are generally used to make predictions/extrapolations of unknown values of ##y_t## without reaching a final solution, ##y=f(t)##, correct? Why not?

For example, a fitted AR(1) model is something like this: $$y_t = a y_{t-1}$$ which, approximating the derivative by the backward difference ##y' \approx y_t - y_{t-1}##, can be converted to the ODE model $$y_t = \frac{a}{a-1} y'$$

Why not solve for ##y_t## instead of keeping it as ##y_t = a y_{t-1}##?
 
fog37 said:
Why not solve for ##y_t## instead of keeping it as ##y_t = a y_{t-1}##?
Because the direct solution for ##y_t## includes the cumulative random terms of all the preceding time steps. That can have a huge random variance. On the other hand, if you know the value of ##y_{t-1}##, why not use it? The random variance then comes from just one time step and is relatively small.
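
A short worked version of this argument, under the assumption that the AR(1) model carries an explicit noise term ##\varepsilon_t## (independent, mean zero, variance ##\sigma^2##), which the equation quoted above leaves out: $$y_t = a y_{t-1} + \varepsilon_t$$ Unrolling the recursion back to a starting value ##y_0## gives the "solved" form $$y_t = a^t y_0 + \sum_{k=0}^{t-1} a^k \varepsilon_{t-k}$$ so the random part has variance $$\sigma^2 \sum_{k=0}^{t-1} a^{2k} = \sigma^2 \frac{1 - a^{2t}}{1 - a^2} \quad (|a| \neq 1)$$ which grows with ##t## (and equals ##t \sigma^2## when ##|a| = 1##). Conditioning on the known value ##y_{t-1}## instead leaves only the single term ##\varepsilon_t##, with variance ##\sigma^2##.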
 
I was thinking the following in regards to transformations, inverse transformations, ARMA, ARIMA and SARIMA.

ARMA is meant to model time-series that are weakly stationary (constant mean, variance, autocorrelation). To train an ARMA model, the training signal ##y(t)## must therefore be stationary. If it is not, we need to apply transformations to make it so and apply inverse transformations at the very end.

With ARIMA, we avoid manually doing the stationarizing step, since the ##I(d)## part of ARIMA automatically makes our input signal with trend and seasonality stationary, if it is not, by taking the difference transform...
But I guess differencing does not remove the seasonal component from ##y(t)##? Does that mean that we would need to remove seasonality manually before using ARIMA?

The best solution then seems to be SARIMA, which does not care if the training signal has trend and/or seasonality because it takes care of them internally: we don't need to manually apply any transformations to the raw time series ##y(t)## or inverse transformations to the prediction outputs of the SARIMA model...

Any mistake in my understanding? I would definitely choose SARIMA, more convenient, since we can skip all those preprocessing transformations to make ##y(t)## stationary and inverse transformations after the forecasting...
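
A small check of the seasonality question, on a synthetic noise-free series (linear trend plus a sine with an assumed period of 12; the names and numbers are illustrative only). Ordinary first differencing flattens the linear trend but leaves the periodic pattern, while differencing at the seasonal lag removes the periodic pattern. The latter is the kind of seasonal differencing that SARIMA adds on top of ARIMA's ordinary differencing.

```python
import math

PERIOD = 12  # assumed season length, e.g. monthly data with a yearly cycle

def lagged_difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t >= lag."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

def spread(values):
    """Range of the values; close to zero means the series is essentially constant."""
    return max(values) - min(values)

# Synthetic series: linear trend plus a period-12 seasonal pattern, no noise.
y = [0.5 * t + 10.0 * math.sin(2 * math.pi * t / PERIOD) for t in range(48)]

# Lag-1 difference: the trend becomes a constant, the seasonal oscillation remains.
first_diff = lagged_difference(y, lag=1)
# Lag-12 difference: the seasonal pattern cancels, the trend becomes the constant 0.5 * PERIOD.
seasonal_diff = lagged_difference(y, lag=PERIOD)

print("spread after first difference:   ", round(spread(first_diff), 3))
print("spread after seasonal difference:", round(spread(seasonal_diff), 3))
```

On this toy series the lag-1 differences still oscillate with the season, whereas the lag-12 differences are constant up to floating-point error.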
 
fog37 said:
Any mistake in my understanding? I would definitely choose SARIMA, more convenient, since we can skip all those preprocessing transformations to make ##y(t)## stationary and inverse transformations after the forecasting...
That is a natural thought. But you should avoid anything that would be like "throwing everything at the wall to see what sticks". A time series analysis tool-set might allow you to automate finding a SARIMA solution that only includes terms that are statistically significant. But you should have some subject-matter reason to include the trend and seasonal terms. A good tool-set should allow you to prevent the inclusion of terms that do not make sense.
 
