Graduate "Population-averaged" regression on panel data using Stata

Stata's "population-averaged" regression does not report R-squared because, as the reply in this thread argues, the approach does not produce a model that explains individual data points. The population-averaged method describes average outcomes rather than individual observations, so the usual R^2, which measures the share of variance in the individual data points explained by the model, has nothing to be measured against. Regular linear regression does report R^2 because it models the individual observations. For that reason, substituting the R^2 from a regular regression for a population-averaged analysis is not recommended, as it may not reflect that model's explanatory power.
monsmatglad
TL;DR
Using population-averaged as the regression approach on panel data in Stata
Hey. I am running regressions on panel data and testing different approaches in Stata. When I use "population-averaged", no R-squared measures are reported. The approach is equal to running a regular linear regression on the panel data, and according to my professor, an R-squared is statistically "allowed." When I run a regular linear regression on the data, the coefficients and significance levels are almost completely identical to those from "population-averaged", but an R-squared and an adjusted R-squared are reported. Is there a reason why Stata does not provide an R-squared estimate (within, between, overall) when applying "population-averaged"? Is there a way to make it report such a measure? And if not, can I use the R-squared from a regular linear regression as a "substitute"?

Mons
 
I am not sure that R^2 makes sense for a population-averaged analysis. In general, R^2 measures the proportion of the variance in the data that is explained by fitting the model to the data. However, a population-averaged analysis doesn't really produce a model that explains the individual data points at all, so there isn't anything against which to measure the variance.
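To make the definition concrete, here is a minimal sketch of R^2 for an ordinary one-predictor linear regression, computed as one minus the residual sum of squares over the total sum of squares. The function name `r_squared` and the five data points are made up for illustration; nothing here comes from Stata or from the original poster's data.

```python
# Minimal sketch: R^2 as the share of outcome variance explained by a fitted line.
# The data below are invented purely for illustration.

def r_squared(x, y):
    """Fit y = a + b*x by ordinary least squares and return (a, b, R^2)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx                # slope
    a = mean_y - b * mean_x      # intercept
    # Residual variation: what the fitted line fails to explain, point by point.
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    # Total variation of the individual outcomes around their mean.
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return a, b, 1.0 - ss_res / ss_tot


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
intercept, slope, r2 = r_squared(x, y)
print(f"intercept={intercept:.3f} slope={slope:.3f} R^2={r2:.4f}")
```

Note that both sums run over individual observations and their individual fitted values, which is the ingredient the paragraph above says a population-averaged analysis is not designed to provide.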

For example, suppose you have a control group and a treatment group of seeds, the seeds have several different characteristics, your outcome is sprouting or not sprouting, and you are doing a logit regression. A normal regression will give you the odds of a given control seed sprouting vs the odds of that same seed sprouting under the treatment. So it is a statement about that individual seed and can be used to explain the actual outcome of that specific data point. In contrast, the population-averaged regression will give you the odds of an average control seed sprouting vs the odds of an average treatment seed sprouting. It does not explain any of the individual data points, and if your experimental assignment is not random, the comparison can be biased by differences between the two populations.
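The difference between "a given seed" and "an average seed" can be shown numerically. The sketch below assumes, purely for illustration, that each seed has its own unobserved baseline tendency to sprout, drawn from a normal distribution on the log-odds scale, and that treatment shifts every seed's log-odds by the same amount. None of these numbers or names (`BASE_LOGODDS`, `TREATMENT_EFFECT`, `SEED_SD`, `marginal_prob`) come from the thread; they are hypothetical.

```python
import math


def expit(z):
    """Inverse logit: convert log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def odds(p):
    return p / (1.0 - p)


def marginal_prob(fixed_logodds, sigma, n_grid=2001, width=8.0):
    """Average expit(fixed_logodds + b) over b ~ Normal(0, sigma^2).

    Uses a plain trapezoid rule on a symmetric grid of +/- width*sigma.
    """
    lo = -width * sigma
    step = 2.0 * width * sigma / (n_grid - 1)
    total = 0.0
    weight_sum = 0.0
    for i in range(n_grid):
        b = lo + i * step
        w = math.exp(-0.5 * (b / sigma) ** 2)  # unnormalised normal density
        if i == 0 or i == n_grid - 1:
            w *= 0.5                           # trapezoid end-point weights
        total += w * expit(fixed_logodds + b)
        weight_sum += w
    return total / weight_sum


# Hypothetical values, chosen only to make the contrast visible.
BASE_LOGODDS = -0.5     # log-odds of sprouting for a typical control seed
TREATMENT_EFFECT = 1.0  # shift in log-odds for any one seed under treatment
SEED_SD = 2.0           # spread of seed-to-seed baseline differences

# Odds ratio for one specific seed: same seed, treated vs untreated.
conditional_or = math.exp(TREATMENT_EFFECT)

# Odds ratio comparing the average treated seed with the average control seed.
p_control = marginal_prob(BASE_LOGODDS, SEED_SD)
p_treated = marginal_prob(BASE_LOGODDS + TREATMENT_EFFECT, SEED_SD)
marginal_or = odds(p_treated) / odds(p_control)

print(f"odds ratio for a given seed:       {conditional_or:.3f}")
print(f"odds ratio for the 'average' seed: {marginal_or:.3f}")
```

Under these assumptions the population-averaged odds ratio comes out closer to 1 than the seed-specific one, even though every seed responds identically on the log-odds scale. The two numbers answer different questions, which is the distinction drawn above.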

I think that if you want an R^2 value you should not use a population-averaged regression. Reporting R^2 for it just doesn't seem to make sense to me.
 