Grouping Non-Continuous Variables

WWGD
Hi All,
Is there a technique other than PCA (Principal Component Analysis) for deciding whether it is reasonable to group together, aka "collapse", several non-continuous variables (categorical, Likert, ordinal, etc.) into a single one? The idea is, of course, to lose only a negligible amount of explanatory/predictive power by doing this. PCA (and possibly Latent Class Analysis, LCA, as well) forms its groups through the use of the covariance matrix.
Questions:
1) Are there other bases/justifications for collapsing several such variables into one?
2) To what extent does PCA generalize to non-continuous variables?
Thanks.
 
There is a statistical field called "analysis of variance" that can be used with discrete and non-ordered variables. With it, you can analyze which variables do the most to explain the variance of the data and which add little beyond what the important variables already explain.
 
CORRECTION: I'm sorry. The Analysis of Variance (ANOVA) that I know of is used to explain the variance of a continuous dependent variable. The independent variables do not have to be continuous. Maybe there are also ways to use it with a discrete or non-ordered dependent variable, but I am not familiar with them. And I don't think it is a good replacement for cluster analysis (which may be what you are looking for, though I am not familiar with that either).

Therefore, I cannot help and will bow out of this thread.
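
For readers who want to see the setup described above (categorical independent variable, continuous dependent variable), here is a minimal sketch of a one-way ANOVA in plain Python. The data, the variable names (`colour`, `y`), and the helper `one_way_anova` are made up for illustration. The second call merges two levels of the factor and compares the share of variance explained, which is one rough way to check how much is lost by collapsing levels; it is not a substitute for a proper model comparison.

```python
from collections import defaultdict

def one_way_anova(levels, response):
    """One-way ANOVA of a continuous response on one categorical factor.

    Returns (F statistic, eta squared), where eta squared is the share of
    the total sum of squares explained by the factor.
    """
    by_level = defaultdict(list)
    for level, y_val in zip(levels, response):
        by_level[level].append(y_val)

    n, k = len(response), len(by_level)
    grand_mean = sum(response) / n
    means = {lv: sum(ys) / len(ys) for lv, ys in by_level.items()}

    # Variation of the level means around the grand mean.
    ss_between = sum(len(ys) * (means[lv] - grand_mean) ** 2
                     for lv, ys in by_level.items())
    # Variation of observations around their own level mean.
    ss_within = sum((y_val - means[lv]) ** 2
                    for lv, ys in by_level.items() for y_val in ys)

    f_stat = (ss_between / (k - 1)) / (ss_within / (n - k))
    eta_sq = ss_between / (ss_between + ss_within)
    return f_stat, eta_sq

# Made-up data: a three-level factor and a continuous response.
colour = ["red"] * 3 + ["blue"] * 3 + ["green"] * 3
y = [1.0, 2.0, 3.0, 2.0, 3.0, 4.0, 7.0, 8.0, 9.0]
f_full, eta_full = one_way_anova(colour, y)

# Collapse "red" and "blue" into one level and compare explained variance.
collapsed = ["red/blue" if c != "green" else c for c in colour]
f_coll, eta_coll = one_way_anova(collapsed, y)
print(eta_full, eta_coll)
```

If the explained share barely drops after merging, the merged levels were carrying little separate information about the response in this sample.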
 
Thanks. If you're interested, Latent Class Analysis does some of this.
 
Thanks. I'll check it out.
 
No problem. It is actually pretty interesting stuff IMO. As I understand it, it is the study of (quantitative) qualities that are not directly observable, like depression or intelligence: you don't measure them directly, but you observe signs/evidence of these traits and infer from them the existence of the unobservables. It ultimately seems to come down to using some version of PCA and seeing whether the variances line up the right way.
 
A few thoughts:

If you get tired of regular PCA, you may enjoy mixing in kernels and doing kernel PCA.
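
As a rough illustration of what kernel PCA involves, here is a minimal NumPy sketch: build a kernel matrix, center it in feature space, and take the leading eigenvectors. The RBF kernel, the `gamma` value, the random data, and the function name `rbf_kernel_pca` are all assumptions chosen for the example, not something prescribed in this thread. For categorical inputs one would first need an encoding or a kernel suited to them.

```python
import numpy as np

def rbf_kernel_pca(X, n_components=2, gamma=1.0):
    """Project the rows of X onto the leading kernel principal components."""
    X = np.asarray(X, dtype=float)

    # Pairwise squared Euclidean distances, then the RBF kernel matrix.
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    K = np.exp(-gamma * np.maximum(sq_dists, 0.0))

    # Center the kernel matrix, i.e. center the data in feature space.
    n = K.shape[0]
    one_n = np.full((n, n), 1.0 / n)
    K_centered = K - one_n @ K - K @ one_n + one_n @ K @ one_n

    # eigh returns eigenvalues in ascending order; keep the largest ones.
    eigvals, eigvecs = np.linalg.eigh(K_centered)
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Scores of the training points: unit eigenvector scaled by sqrt(eigenvalue).
    scores = eigvecs * np.sqrt(np.maximum(eigvals, 0.0))
    return scores, eigvals

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(20, 3))  # made-up continuous data
Z, lam = rbf_kernel_pca(X_demo, n_components=2, gamma=0.5)
print(Z.shape)
```

With a linear kernel in place of the RBF kernel, the same steps give ordinary PCA scores, which is why this counts as a generalization of PCA.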

You may also check out auto-encoders -- basically neural nets meet PCA.

In general, discrete optimization problems have a habit of being NP-hard, and continuity (or something 'close') turns out to be a very helpful relaxation. Coming at this from a different direction, consider the use of the Fiedler vector in the max cut problem.
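
To make the relaxation idea concrete, here is a small NumPy sketch that computes the Fiedler vector (the eigenvector for the second-smallest eigenvalue of the graph Laplacian) and splits the nodes by its sign. Note the assumption: sign-splitting on the Fiedler vector is the usual spectral heuristic for a balanced partition with few crossing edges; as far as I know, the analogous relaxation for max cut uses the eigenvector of the largest Laplacian eigenvalue instead. The example graph and the name `fiedler_partition` are made up for illustration.

```python
import numpy as np

def fiedler_partition(adjacency):
    """Split a connected undirected graph in two by the sign of its Fiedler vector."""
    A = np.asarray(adjacency, dtype=float)
    laplacian = np.diag(A.sum(axis=1)) - A

    # eigh returns eigenvalues in ascending order; index 0 is the trivial
    # eigenvalue 0, index 1 is the algebraic connectivity.
    eigvals, eigvecs = np.linalg.eigh(laplacian)
    fiedler = eigvecs[:, 1]
    return fiedler >= 0, eigvals[1]

# Two triangles (nodes 0-2 and 3-5) joined by the single edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A_demo = np.zeros((6, 6))
for i, j in edges:
    A_demo[i, j] = A_demo[j, i] = 1.0

side, lam2 = fiedler_partition(A_demo)
print(side, lam2)
```

The discrete problem asks for a vector of +1/-1 labels; the eigenvector problem lets the labels be real numbers, which is solvable with standard linear algebra, and the signs are then read off as a rounding step.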
 
Thanks. Congrats on the SA Badge.
 