Are there Issues with Separation of Values in Ordinal Logistic Regression?

WWGD
Hi all, just curious if someone knows of any issues with separation of points in ordinal 3-valued logistic regression. I think I have an idea of why there are issues with separation in binary logistic regression: when the classes are perfectly separated, the S-curve needs to drop to 0 (or rise to 1) arbitrarily quickly, which drives the coefficient estimates, including the β0 term, off to infinity. Are there similar issues with 3-valued (or higher-valued) logistic regression?
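A quick numerical sketch of that divergence in the binary case (hypothetical toy data, plain NumPy rather than any particular package): with perfectly separated data, plain gradient ascent on the log-likelihood never settles, and the slope just keeps growing the longer you run it, because the likelihood has no finite maximizer.

```python
import numpy as np

# Hypothetical toy data: every x <= 0 is a failure, every x > 0 a success
# (complete separation at x = 0).
x = np.array([-2.0, -1.0, -0.5, 0.5, 1.0, 2.0])
y = np.array([0, 0, 0, 1, 1, 1])

def fit_logistic(x, y, steps, lr=0.5):
    """Plain gradient ascent on the Bernoulli log-likelihood (no penalty)."""
    b0, b1 = 0.0, 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(b0 + b1 * x)))
        b0 += lr * np.sum(y - p)        # gradient w.r.t. the intercept
        b1 += lr * np.sum((y - p) * x)  # gradient w.r.t. the slope
    return b0, b1

_, b1_short = fit_logistic(x, y, steps=200)
_, b1_long = fit_logistic(x, y, steps=5000)
# The slope never converges: running longer just makes it larger,
# which is the "MLE does not exist" symptom of separation.
print(b1_short, b1_long)
```

With non-separated data the same loop would converge to a finite slope; separation is exactly the case where it doesn't.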
 
I'm not entirely clear what you mean by "Separation of Points". Whenever I hear "separation" with regard to logistic regression, it refers to complete separation or quasi-separation, which tends to occur with small or miscoded datasets. The underlying problem (the MLE not existing) doesn't disappear in more general cases.

There are ways around that (sometimes), but I feel that we may be talking about two different things.
 
MarneMath said:
I'm not entirely clear what you mean by "Separation of Points". ...
Hi, thanks for replying. Separation happens when there is a value X0 of the independent variable (obviously this applies to cases with numerical variables) such that for all X > X0 all trials (Bernoulli or multinomial) are failures, or all are successes. E.g., if the dependent Y is "has cancer" and X is the number of cigarettes smoked per week, then the data are separated at X0 = 10 if, say, for all X > 10 every trial is a success, i.e., everyone who smoked more than 10 cigarettes per week got cancer.
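That definition can be checked mechanically: sort the observations by X and see whether the 0/1 labels are monotone. A minimal sketch (hypothetical helper, binary labels only; ties at the threshold would make this quasi- rather than complete separation, a subtlety the sketch ignores):

```python
def completely_separated(x, y):
    """True if some threshold on x perfectly splits the 0/1 labels."""
    # Sort labels by their x value; perfect separation means the
    # sorted label sequence is all 0s then all 1s (or the reverse).
    labels = [yi for _, yi in sorted(zip(x, y))]
    return labels == sorted(labels) or labels == sorted(labels, reverse=True)

# The cigarettes example: everyone above the cutoff is a "success".
cigs = [2, 5, 8, 12, 15, 20]
cancer = [0, 0, 0, 1, 1, 1]
print(completely_separated(cigs, cancer))  # True
```

Most statistics packages run a check along these lines (in all predictors jointly, not just one) and warn when the fit fails to converge.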
 
Ok, then I think we are talking about the same thing. Then yes, separation is a problem for higher-order models too. Most statistical packages are good at notifying you when this happens. One way around it is to penalize the maximum likelihood estimator. I'm personally a fan of using a hidden logistic model to overcome this when necessary.
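The usual penalized-likelihood fix in the literature is Firth's method; a plain L2 (ridge) penalty, which also restores a finite maximizer, shows the same mechanism in a few lines. A sketch on assumed toy data (same perfectly separated setup as above):

```python
import numpy as np

# Completely separated toy data: no finite unpenalized MLE exists.
x = np.array([-2.0, -1.0, -0.5, 0.5, 1.0, 2.0])
y = np.array([0, 0, 0, 1, 1, 1])

def fit_ridge_logistic(x, y, lam, steps=20000, lr=0.1):
    """Gradient ascent on log-likelihood minus (lam/2) * b1**2."""
    b0, b1 = 0.0, 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(b0 + b1 * x)))
        b0 += lr * np.sum(y - p)
        # The -lam * b1 term pulls the slope back, so a finite
        # maximizer exists even under complete separation.
        b1 += lr * (np.sum((y - p) * x) - lam * b1)
    return b0, b1

b0, b1 = fit_ridge_logistic(x, y, lam=0.1)
print(b0, b1)  # finite slope, unlike the unpenalized fit
```

Firth's penalty has the nicer property of reducing small-sample bias, but any proper penalty kills the divergence the same way.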
 
Just a followup on this: would it be reasonable, in the sense of not affecting the "intrinsic" properties of a smallish data set with separation of values, say in the range [0, 5], to slightly alter (increase/decrease) some of the data values so as to overcome this issue, i.e., so that the outcomes beyond a certain value are not all of one class? Say my cutoff point for this data set within the [0, 5] range is 3 and I have several points with value 3. Could I change the data set to replace 3 by 3.02 in some cases and by, say, 2.98 in others, in order to avoid this problem? I just want to be able to model the probability of success; I would think most of the properties of the data would be preserved by doing this?
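For concreteness, a sketch of that jittering idea on hypothetical data (the values, cutoff, and ±0.02 magnitude are the ones proposed above, not from any real data set): perturb only the observations tied at the cutoff, by less than the spacing of the data, and then re-examine whether the labels are still monotone in X.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: several observations tied at the cutoff value 3,
# with mixed outcomes at the tie (quasi-separation).
x = np.array([1.0, 2.0, 3.0, 3.0, 3.0, 4.0, 5.0])
y = np.array([0, 0, 0, 1, 1, 1, 1])

# Jitter only the tied values by a small random amount in [-0.02, 0.02];
# everything else is left untouched.
jittered = x.copy()
ties = x == 3.0
jittered[ties] += rng.uniform(-0.02, 0.02, size=ties.sum())

# Check whether the labels are still monotone in x after jittering.
order = np.argsort(jittered)
still_monotone = bool(np.all(np.diff(y[order]) >= 0))
print(jittered, still_monotone)
```

One caution worth noting: a random jitter can land the failure at 3 below the successes at 3, turning quasi-separation into complete separation rather than removing it, so the direction of each perturbation matters, and the result depends on which points you move which way.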
 