Independent variable that is a distribution?

In summary, the poster is attempting a logistic regression with a binary response, using explanatory variables measured at the individual level and others measured only within subgroups of the observations (nests). For each nest, the income distribution is known as a histogram, but individual incomes are not. The poster could enter each histogram value as a separate variable but would prefer to fit a distribution to each nest's histogram and integrate over it when estimating the model. The aim is to partially overcome the ecological fallacy. The poster has not come across this approach, asks whether it is a known method, and expects it to be more computationally intensive than a random effects model. The poster also asks for other ways to overcome the ecological fallacy.
  • #1
wvguy8258
Hi,

I am attempting a logistic regression analysis with a binary response, with explanatory variables measured at the individual level and also some measured within subgroups of the observations (nests). For each nest containing individual observations I know the distribution of income in the form of a histogram (but not the individual incomes). I could enter each histogram value as a separate variable (e.g. percent in class 1), but I would rather fit a distribution to each histogram for each nest.

Then, similar to a regression with random effects (but in a sense the opposite), a parameter estimate would be assigned to this variable. In a random effects model the likelihood is integrated over an estimated distribution of the random effect; here the likelihood would instead be integrated over the known income distribution for each nest while estimating the parameter attached to income and those attached to the other independent variables.

I want to do this because I believe that income interacts with the relationships between the other independent variables and my dependent variable, but I do not know the actual incomes, only the distribution over a subset of observations. In doing so, I hope to partially overcome what is called the ecological fallacy.

Is this approach a known method? I have not run across it. It seems more computationally intensive than the random effects model, as you would have to estimate all parameters in the model while integrating over the pre-specified distribution for each nest. At least the distribution itself would not also have to be estimated. Any food for thought on other ways to overcome the ecological fallacy (correlation)? Thanks. -Seth
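The integration described in the post can be written down as a marginal likelihood: for each individual, the logistic likelihood is averaged over the income bins of that individual's nest, weighted by the bin proportions. The sketch below is only an illustration of that idea, not an established named method. It assumes that income is independent of the individual-level covariate within a nest, that bin midpoints stand in for incomes, and that a single covariate `x` interacts with income. All names (`marginal_loglik`, `props`, `midpoints`) and the toy data are made up for the example.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def marginal_loglik(beta, y, x, nest, midpoints, props):
    """Log-likelihood with the unobserved income summed out per nest.

    beta      : (b0, b_x, b_inc, b_int), hypothetical coefficients for the
                intercept, x, income, and the x-by-income interaction
    y, x      : per-individual binary response and individual-level covariate
    nest      : per-individual nest index
    midpoints : income bin midpoints (assumed shared by all nests)
    props     : props[j][k] = share of nest j in income bin k (rows sum to 1)
    """
    b0, bx, binc, bint = beta
    total = 0.0
    for yi, xi, j in zip(y, x, nest):
        lik = 0.0
        # Average the Bernoulli likelihood over the nest's income histogram
        for m, p in zip(midpoints, props[j]):
            pr = sigmoid(b0 + bx * xi + binc * m + bint * xi * m)
            lik += p * (pr if yi == 1 else 1.0 - pr)
        total += math.log(lik)
    return total

# Toy data, invented for illustration: 2 nests, 3 income bins
midpoints = [1.0, 2.0, 3.0]
props = [[0.6, 0.3, 0.1], [0.1, 0.3, 0.6]]
y = [0, 1, 0, 1, 1, 0]
x = [0.2, 1.5, -0.3, 0.8, 1.1, -1.0]
nest = [0, 0, 0, 1, 1, 1]

print(marginal_loglik((0.0, 0.5, 0.1, 0.2), y, x, nest, midpoints, props))
```

A function like this could be handed to a general-purpose numerical optimiser to obtain estimates. Because the histogram is fixed in advance, only the regression coefficients are estimated, which matches the remark in the post that the distribution itself does not have to be estimated.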
 

1. What is an independent variable that is a distribution?

In the sense used in this thread, an independent variable that is a distribution is a predictor whose value is not observed for each individual. Instead, only its distribution within each group (nest) is known, for example as a histogram of income. It is observed rather than manipulated by the researcher, and the histogram bins group the underlying values into ranges.

2. How is an independent variable that is a distribution different from other types of independent variables?

An ordinary independent variable has one observed value per observation, whether that value was set by the researcher or simply recorded. A variable known only as a distribution has no single value per observation, so it must either be summarised (for example by bin proportions or a mean) or averaged over when the likelihood is computed.

3. What is the purpose of using an independent variable that is a distribution in research?

The purpose is to make use of group-level information about a covariate that was not recorded for individuals, so that patterns or relationships with the dependent variable can still be examined. Care is needed, because inferring individual-level relationships from group-level summaries is what the ecological fallacy warns against.

4. Can an independent variable that is a distribution be measured quantitatively?

Yes. A histogram is quantitative: each bin covers a numerical range and has a count or proportion. These can be entered as numerical values directly, or summarised by quantities such as the mean and standard deviation.
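As a small illustration of turning a histogram into numbers, the sketch below converts bin counts into proportions and into an approximate mean and standard deviation based on bin midpoints. The function name and the example counts are hypothetical. If the proportions are entered as separate variables in a model with an intercept, one of them has to be left out, because they sum to 1.

```python
def histogram_summaries(counts, midpoints):
    """Return bin proportions plus midpoint-based mean and standard deviation."""
    total = sum(counts)
    props = [c / total for c in counts]
    mean = sum(p * m for p, m in zip(props, midpoints))
    var = sum(p * (m - mean) ** 2 for p, m in zip(props, midpoints))
    return props, mean, var ** 0.5

# Invented example: one nest, three income classes
props, mean, sd = histogram_summaries([2, 6, 2], [10.0, 20.0, 30.0])
print(props, mean, sd)
```

The mean and standard deviation here are approximations, since every value in a bin is treated as if it sat at the bin midpoint.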

5. How should an independent variable that is a distribution be selected for a research study?

The choice should be driven by the research question and the purpose of the study, and the variable should be relevant to the data being collected and analysed. The distributions should also differ enough between groups to allow meaningful analysis and interpretation.
