Is My Panel Data Sample Size Adequate and Modeling Approach Correct?


Discussion Overview

The discussion revolves around the adequacy of a panel data sample size and the appropriateness of the modeling approach for estimating a regression model focused on expenditure on cars across 30 countries over 25 years. Participants explore various aspects of econometric modeling, including sample size considerations, model selection, and the distinction between prediction and explanation of expenditures.

Discussion Character

  • Exploratory
  • Technical explanation
  • Debate/contested
  • Mathematical reasoning

Main Points Raised

  • One participant questions whether a sample of 30 countries over 25 years is adequate, particularly whether the number of years should exceed the number of countries.
  • Another participant inquires about the type of regression model being used, suggesting that if it is linear, it may only capture trends rather than cyclic phenomena.
  • Concerns are raised about the goal of the analysis, whether it is to predict future expenditures or to explain past expenditures, with implications for model selection.
  • A participant mentions using a linear model (OLS regression) and a fixed effect model based on a Hausman test, expressing uncertainty about the concept of full interactions between variables.
  • Discussion includes the challenge of predicting growth rates in expenditures, particularly in light of unusual events that may affect data, such as tax reforms.
  • Another participant emphasizes the conflict between using past values as predictors and the desire to explain underlying causes of expenditure changes.
  • Personal experiences are shared regarding the limitations of using past purchases as predictors, particularly in relation to the size of the purchase.

Areas of Agreement / Disagreement

Participants express differing views on the adequacy of the sample size and the modeling approach. There is no consensus on whether the current model is appropriate or whether the goals of prediction and explanation can be reconciled effectively.

Contextual Notes

Participants highlight limitations related to the complexity of econometric modeling, the influence of cultural and practical differences between countries, and the potential impact of unobserved variables on the analysis.

beaf123
I am using panel data for 30 countries over 25 years to estimate a regression model. Expenditure on cars is my dependent variable, and I use economic theory to find some explanatory variables.
First, is 30 countries and 25 years an ok sample? Or should years > countries?

Second, is it an ok approach to start with Norway (my home country, and country of focus) and our neighboring countries and then expand the model to more countries that seem similar to the Scandinavian countries? Or should I start by including all my countries in the model?

There is so much literature on econometrics and so many rules that I don't dare to do anything!

Thanks,
 
What sort of regression model are you trying to estimate? If it's linear then presumably you are using year as a continuous variable, in which case it will only be able to find trends, not cyclic phenomena.
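To illustrate the point about trends versus cycles, here is a minimal sketch using made-up numbers (not the poster's data). It assumes a hypothetical 25-year series built from a linear trend plus an 8-year cycle, fits a straight line in year, and checks how much of the cycle is left in the residuals.

```python
import numpy as np

# Hypothetical yearly series: linear trend plus an 8-year cycle (made-up numbers).
years = np.arange(25)
cycle = 3.0 * np.sin(2 * np.pi * years / 8)
y = 0.5 * years + cycle

# OLS fit of y on an intercept and year as a continuous regressor.
X = np.column_stack([np.ones_like(years, dtype=float), years])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ beta

# The line can only absorb the trend; the cycle stays in the residuals.
corr_resid_cycle = np.corrcoef(residuals, cycle)[0, 1]
print("estimated slope:", beta[1])
print("correlation of residuals with the cycle:", corr_resid_cycle)
```

In this toy setup the residuals are almost perfectly correlated with the cycle, which is one way to see that a model linear in year leaves cyclic behavior unexplained.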

Are you allowing full interactions between variables? If so, the coefficient of the year variable will vary between countries, and the model will be equivalent to a collection of separate linear models, one for each country.
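The equivalence described above can be checked numerically. The sketch below uses simulated data for three hypothetical countries (all numbers are assumptions for illustration) and compares a single pooled regression with country dummies and country-by-year interactions against separate straight-line fits per country.

```python
import numpy as np

rng = np.random.default_rng(0)
n_countries, n_years = 3, 25
years = np.arange(n_years, dtype=float)

# Simulated data: each hypothetical country has its own intercept and slope.
true_a = np.array([1.0, 2.0, 3.0])
true_b = np.array([0.5, -0.2, 0.1])
noise = rng.normal(0, 0.3, (n_countries, n_years))
y = true_a[:, None] + true_b[:, None] * years + noise

# (a) Separate OLS line per country; polyfit returns [slope, intercept].
separate = np.array([np.polyfit(years, y[c], 1) for c in range(n_countries)])

# (b) One pooled regression with country dummies and country-by-year interactions.
country = np.repeat(np.arange(n_countries), n_years)
year_long = np.tile(years, n_countries)
dummies = (country[:, None] == np.arange(n_countries)).astype(float)
X = np.hstack([dummies, dummies * year_long[:, None]])
coef, *_ = np.linalg.lstsq(X, y.ravel(), rcond=None)
pooled_intercepts, pooled_slopes = coef[:n_countries], coef[n_countries:]

# Both approaches should give the same intercepts and slopes.
print(np.allclose(pooled_slopes, separate[:, 0]))
print(np.allclose(pooled_intercepts, separate[:, 1]))
```

With full interactions nothing is shared across countries, so the pooled fit carries no more information than the separate fits.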

I suggest that you try to clearly set out what you want the model to achieve and what data you have, as those two are the crucial factors in determining what an appropriate model would be.
 
Is your goal to predict future expenditures or is it to explain the past expenditures? The reason I ask is that the best predictor will be the past value, but that may not explain either the past or predicted value. There are many cultural and practical differences between countries -- population density, the quality of public transportation, the practicality of alternatives like bicycles, personal wealth, etc. If you only want to predict future expenditures, then the past expenditures are the best predictor and indirectly take all the country differences into account. But if you want to explain the expenditures, then using past expenditures will not help and will hide the influences that you are looking for.
 
Thank you @andrew. Good tips. At the moment I look at the growth rate in expenditure as my dependent variable. And I use a linear model, if you mean OLS regression.
I don't know what you mean by full interaction. I did a Hausman test and, based on that, I use a fixed effects model with a unique intercept for each country.
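For readers unfamiliar with a fixed effects model with one intercept per country, here is a minimal sketch on simulated data with the same panel dimensions (30 countries, 25 years). The regressor, the coefficient 0.7 and the noise levels are all assumptions for illustration and are not the model estimated in this thread. It shows that demeaning within each country gives the same slope as including one dummy per country, and that pooled OLS without country intercepts can be biased when the country effects are correlated with the regressor.

```python
import numpy as np

rng = np.random.default_rng(1)
n_countries, n_years = 30, 25

# Simulated panel: country effects alpha are correlated with the regressor x.
alpha = rng.normal(0, 2, n_countries)
x = alpha[:, None] + rng.normal(0, 1, (n_countries, n_years))
true_beta = 0.7  # assumed slope, for illustration only
y = alpha[:, None] + true_beta * x + rng.normal(0, 0.5, (n_countries, n_years))

# Within estimator: subtract each country's own mean, then OLS through the origin.
x_w = x - x.mean(axis=1, keepdims=True)
y_w = y - y.mean(axis=1, keepdims=True)
beta_within = (x_w * y_w).sum() / (x_w ** 2).sum()

# LSDV: one dummy (intercept) per country plus x.
country = np.repeat(np.arange(n_countries), n_years)
dummies = (country[:, None] == np.arange(n_countries)).astype(float)
X = np.hstack([dummies, x.reshape(-1, 1)])
coef, *_ = np.linalg.lstsq(X, y.ravel(), rcond=None)
beta_lsdv = coef[-1]

# Pooled OLS with a single common intercept, for comparison.
Xp = np.column_stack([np.ones(x.size), x.ravel()])
beta_pooled = np.linalg.lstsq(Xp, y.ravel(), rcond=None)[0][1]

# Residual degrees of freedom: observations minus country intercepts minus one slope.
dof = n_countries * n_years - n_countries - 1
print(beta_within, beta_lsdv, beta_pooled, dof)
```

With 30 x 25 = 750 observations, 30 intercepts and one slope, 719 residual degrees of freedom remain in this sketch; each additional explanatory variable costs one more.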
What I want to achieve is to explain differences in expenditure on cars or in the growth rate in expenditure on cars between countries and years in the best way possible.
@Factchecker. I want to find the explanations of changes in the past so that I can predict the future!

I am experimenting with different models a fair bit, and although my R^2 is low (26%) I have tried some predictions. Do any of you have any comments on these two predictions of past values for the growth rate in expenditure in France and Norway?:

Norway: [image: upload_2017-9-21_17-51-32.png]

France: [image: upload_2017-9-21_17-52-30.png]
I mean, it's hard to predict growth rates, because something strange happened in France around 1995 (maybe the data, or maybe a tax reform or something else that is impossible to catch). But are the predictions as horrible as they seem?
 
beaf123 said:
@Factchecker. I want to find the explanations of changes in the past so that I can predict the future!
I think that I may not have been clear. Those two goals are often somewhat in conflict. The best predictor of the future in a time series is usually the past values of the same variable. They take into account any influences you have thought of and also any that you have not thought of. Unfortunately, that hides the underlying causes of the trend. Suppose the past values are highly correlated with the underlying causes. Then once the influence of past values has been removed, the remaining influence of the underlying causes is greatly reduced. Their residual statistical significance may be too small to justify their use in the model.
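The conflict between prediction and explanation can be seen in a small simulation. This is only a sketch under assumed parameters: a hypothetical persistent cause x and an outcome y that depends on its own past value and on x. None of the numbers come from the car expenditure data.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 2000
rho, b, phi = 0.8, 0.2, 0.9  # assumed: persistence of y, effect of x, persistence of x

# Simulate a persistent cause x and an outcome y driven by its own past and by x.
x = np.zeros(n)
y = np.zeros(n)
for t in range(1, n):
    x[t] = phi * x[t - 1] + rng.normal()
    y[t] = rho * y[t - 1] + b * x[t] + rng.normal(0, 0.5)

def ols(X, target):
    """Return OLS coefficients (intercept first) and R^2 for target regressed on X."""
    X = np.column_stack([np.ones(len(target)), X])
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ coef
    return coef, 1 - resid.var() / target.var()

coef_x, r2_x = ols(x[1:], y[1:])                                    # cause only
coef_lag, r2_lag = ols(y[:-1], y[1:])                               # past value only
coef_both, r2_both = ols(np.column_stack([x[1:], y[:-1]]), y[1:])   # both

print("x coefficient without lag:", coef_x[1], "R^2:", r2_x)
print("R^2 using only the past value:", r2_lag)
print("x coefficient with lag:", coef_both[1], "R^2:", r2_both)
```

In this setup the past value alone fits better than the cause alone, and once the past value is included the coefficient on x shrinks sharply while R^2 barely improves. Whether the same happens with real data depends on how persistent the series and its drivers are.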

PS. I have been in exactly that predicament: trying to explain to my boss that past buys are by far the best predictor of future buys, but that once past buys are put in the model, nothing else is statistically significant. He was very disappointed and I'm not sure he ever really accepted it.

PPS. I think that the extent of the problem depends on the size of the purchase. If the purchase is very large (for that person), then the decision depends more on economics of prior years in addition to the current year. That means that the prior year purchases are more of a predictor. If the purchase is small, then the decision depends mostly on the current year economics. That means that the prior year purchases are not a good predictor. My experience was regarding very large purchases.
 