Are there Issues with Separation of Values in Ordinal Logistic Regression?

1. Aug 25, 2016

WWGD

Hi all, just curious if someone knows of any issues with separation of values in ordinal 3-valued
logistic regression. I think I have an idea of why separation is a problem in binary logistic
regression -- for the fitted S-curve to jump from 0 to 1 at the separating value, the intercept B0
(and the slope) must go off to infinity, so the MLE does not exist. Are there similar issues with
3-valued (or higher-valued) logistic regression?
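For the binary case, the divergence is easy to see numerically. Below is a minimal pure-Python sketch (no stats packages; the function name `fit_logistic` and the data are mine for illustration) that runs gradient ascent on the log-likelihood for a perfectly separated data set: the longer you iterate, the larger the slope gets, because the supremum of the likelihood is approached but never attained.

```python
import math

def fit_logistic(xs, ys, iters, lr=0.1):
    """Fit P(y=1) = 1/(1+exp(-(b0 + b1*x))) by gradient ascent on the log-likelihood."""
    b0 = b1 = 0.0
    for _ in range(iters):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            g0 += y - p          # d(log-lik)/d(b0)
            g1 += (y - p) * x    # d(log-lik)/d(b1)
        b0 += lr * g0
        b1 += lr * g1
    return b0, b1

# Perfectly separated data: every x > 1.5 is a success, every x < 1.5 a failure.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0, 0, 1, 1]

b_short = fit_logistic(xs, ys, 500)
b_long = fit_logistic(xs, ys, 5000)
# The slope keeps growing with more iterations: the MLE does not exist.
print(b_short, b_long)
```

The same mechanism carries over to ordinal/multinomial models: if the linear predictor can perfectly separate the outcome categories, the likelihood keeps improving as the coefficients grow without bound.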

2. Aug 30, 2016

Greg Bernhardt

Thanks for the thread! This is an automated courtesy bump. Sorry you aren't generating responses at the moment. Do you have any further information, come to any new conclusions or is it possible to reword the post? The more details the better.

3. Aug 30, 2016

MarneMath

I'm not entirely clear what you mean by "Separation of Points". Whenever I hear "separation" with regard to logistic regression, it refers to complete separation or quasi-complete separation, which tends to occur with small or miscoded datasets. The underlying problem it causes (the MLE not existing) doesn't disappear in the more general ordinal or multinomial case.

There are ways around that (sometimes), but I feel that we may be talking about two different things.

4. Aug 30, 2016

WWGD

Hi, thanks for replying. Separation happens when there is a value X0 of the independent variable (this applies to cases with numerical variables) such that for all X > X0, all trials (Bernoulli or multinomial) are failures, or all are successes. E.g., if the dependent variable Y is "has cancer" and X is the number of cigarettes smoked per week, then X is separated if, say, for all X > 10 every trial is a success, i.e., everyone who smoked more than 10 cigarettes per week got cancer.
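That definition can be checked mechanically for a single numeric predictor: the data are completely separated exactly when every failure lies strictly on one side of every success. A small sketch (the function name and the data are mine; the cigarette numbers are hypothetical):

```python
def completely_separated(xs, ys):
    """True if some threshold on x puts all failures strictly on one side
    and all successes strictly on the other (complete separation)."""
    fails = [x for x, y in zip(xs, ys) if y == 0]
    succs = [x for x, y in zip(xs, ys) if y == 1]
    if not fails or not succs:
        return True  # only one outcome observed at all
    return max(fails) < min(succs) or max(succs) < min(fails)

# Hypothetical cigarettes-per-week data: everyone above 10 got cancer.
x = [2, 5, 8, 11, 14, 20]
y = [0, 0, 0, 1, 1, 1]
print(completely_separated(x, y))              # True: the MLE will not exist

# A single failure above a success breaks the separation.
print(completely_separated([2, 5, 12, 11, 14, 20], y))  # False
```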

5. Aug 30, 2016

MarneMath

OK, then I think we are talking about the same thing. Then yes, separation is a problem for higher-order (ordinal/multinomial) models too. Most statistical packages are good at notifying you when this happens. One way around it is to use a penalized maximum likelihood estimator (Firth's method is the classic choice). I'm personally a fan of using a hidden logistic model to overcome this when necessary.
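As a sketch of the penalized idea: adding even a small L2 (ridge) penalty to the log-likelihood makes the objective strictly concave, so the estimate stays finite even under complete separation. This is a plain ridge penalty rather than Firth's correction, and the function name and data are mine for illustration:

```python
import math

def fit_ridge_logistic(xs, ys, lam, iters=20000, lr=0.05):
    """Gradient ascent on the L2-penalized log-likelihood:
    sum_i log P(y_i | x_i) - (lam/2) * (b0**2 + b1**2)."""
    b0 = b1 = 0.0
    for _ in range(iters):
        g0 = -lam * b0           # gradient of the penalty term
        g1 = -lam * b1
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            g0 += y - p
            g1 += (y - p) * x
        b0 += lr * g0
        b1 += lr * g1
    return b0, b1

# Perfectly separated data, where the unpenalized MLE does not exist.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0, 0, 1, 1]

b0_a, b1_a = fit_ridge_logistic(xs, ys, lam=0.1, iters=20000)
b0_b, b1_b = fit_ridge_logistic(xs, ys, lam=0.1, iters=40000)
# Unlike the unpenalized fit, the estimate converges to a finite value:
# doubling the iterations barely changes it.
print((b0_a, b1_a), (b0_b, b1_b))
```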

6. Jun 21, 2017

WWGD

Just a followup on this: would it be reasonable, in the sense of not affecting the "intrinsic" properties of a data set with separation of values, to slightly alter (increase/decrease) some of the data values so as to overcome this issue, i.e., so that the values beyond a certain point are not all of one outcome? Say the values are smallish, in the range [0, 5], the cutoff for this data set is 3, and I have several points with value exactly 3. Then I could change the data set to replace 3 by 3.02 in some cases and by 2.98 in others, in order to avoid the problem. I just want to be able to model the probability of success; I would think most of the properties of the data would be preserved by doing this?
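One caveat with that approach: the perturbation only removes separation if it actually creates overlap, i.e. it must put at least one failure above some success (or vice versa); nudging the tied 3s in the "wrong" directions leaves the data separated. A small sketch of checking before and after such a jitter (the function name and data are mine for illustration):

```python
def separated(xs, ys):
    """True if a single threshold on x gives complete or quasi-complete separation."""
    fails = [x for x, y in zip(xs, ys) if y == 0]
    succs = [x for x, y in zip(xs, ys) if y == 1]
    if not fails or not succs:
        return True
    return max(fails) <= min(succs) or max(succs) <= min(fails)

# Hypothetical data in [0, 5] with ties at the cutoff 3:
# one failure and one success both sit at exactly 3.
xs = [1.0, 2.0, 3.0, 3.0, 4.0, 5.0]
ys = [0,   0,   0,   1,   1,   1]
print(separated(xs, ys))            # quasi-complete separation: True

# Jitter the tied values so the failure lands above the success.
xs_jittered = [1.0, 2.0, 3.02, 2.98, 4.0, 5.0]
print(separated(xs_jittered, ys))   # overlap created: False
```

That said, this does alter the data; a penalized fit gives finite estimates without changing any values, which may be easier to defend.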

Last edited: Jun 22, 2017