Are there Issues with Separation of Values in Ordinal Logistic Regression?

WWGD
Hi all, just curious if someone knows of any issues with separation of points in ordinal 3-valued logistic regression. I think I have an idea of why there are issues with separation in binary logistic regression: when the classes are perfectly separated, the S-curve needs to drop to 0 (or rise to 1) arbitrarily quickly, which drives the coefficient estimates, including the β0 term, off to infinity. Are there similar issues with 3-valued (or higher-valued) logistic regression?
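A quick numerical sketch of that divergence in the binary case (hypothetical toy data, plain NumPy rather than any particular package): with perfectly separated data, plain gradient ascent on the log-likelihood never settles, and the slope just keeps growing the longer you run it, because the likelihood has no finite maximizer.

```python
import numpy as np

# Hypothetical toy data: every x <= 0 is a failure, every x > 0 a success
# (complete separation at x = 0).
x = np.array([-2.0, -1.0, -0.5, 0.5, 1.0, 2.0])
y = np.array([0, 0, 0, 1, 1, 1])

def fit_logistic(x, y, steps, lr=0.5):
    """Plain gradient ascent on the Bernoulli log-likelihood (no penalty)."""
    b0, b1 = 0.0, 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(b0 + b1 * x)))
        b0 += lr * np.sum(y - p)        # gradient w.r.t. the intercept
        b1 += lr * np.sum((y - p) * x)  # gradient w.r.t. the slope
    return b0, b1

_, b1_short = fit_logistic(x, y, steps=200)
_, b1_long = fit_logistic(x, y, steps=5000)
# The slope never converges: running longer just makes it larger,
# which is the "MLE does not exist" symptom of separation.
print(b1_short, b1_long)
```

With non-separated data the same loop would converge to a finite slope; separation is exactly the case where it doesn't.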
 
I'm not entirely clear what you mean by "Separation of Points". Whenever I hear "separation" with regard to logistic regression, it refers to complete separation or quasi-separation, which tends to occur with small or miscoded datasets. The underlying problem (the MLE not existing) doesn't disappear in more general cases.

There are ways around that (sometimes), but I feel that we may be talking about two different things.
 
MarneMath said:
I'm not entirely clear what you mean by "Separation of Points". ...
Hi, thanks for replying. Separation happens when there is a value X0 of the independent variable (obviously this applies to cases with numerical variables) such that for all X > X0 all trials (Bernoulli or multinomial) are failures, or all are successes. E.g., if the dependent Y is "has cancer" and X is the number of cigarettes smoked per week, then the data are separated at X0 = 10 if, say, for all X > 10 every trial is a success, i.e., everyone who smoked more than 10 cigarettes per week got cancer.
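That definition can be checked mechanically: sort the observations by X and see whether the 0/1 labels are monotone. A minimal sketch (hypothetical helper, binary labels only; ties at the threshold would make this quasi- rather than complete separation, a subtlety the sketch ignores):

```python
def completely_separated(x, y):
    """True if some threshold on x perfectly splits the 0/1 labels."""
    # Sort labels by their x value; perfect separation means the
    # sorted label sequence is all 0s then all 1s (or the reverse).
    labels = [yi for _, yi in sorted(zip(x, y))]
    return labels == sorted(labels) or labels == sorted(labels, reverse=True)

# The cigarettes example: everyone above the cutoff is a "success".
cigs = [2, 5, 8, 12, 15, 20]
cancer = [0, 0, 0, 1, 1, 1]
print(completely_separated(cigs, cancer))  # True
```

Most statistics packages run a check along these lines (in all predictors jointly, not just one) and warn when the fit fails to converge.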
 
Ok, then I think we are talking about the same thing. Then yes, separation is a problem for higher-order models too. Most statistical packages are good at notifying you when this happens. One way around it is to penalize the maximum likelihood estimator. I'm personally a fan of using a hidden logistic model to overcome this when necessary.
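The usual penalized-likelihood fix in the literature is Firth's method; a plain L2 (ridge) penalty, which also restores a finite maximizer, shows the same mechanism in a few lines. A sketch on assumed toy data (same perfectly separated setup as above):

```python
import numpy as np

# Completely separated toy data: no finite unpenalized MLE exists.
x = np.array([-2.0, -1.0, -0.5, 0.5, 1.0, 2.0])
y = np.array([0, 0, 0, 1, 1, 1])

def fit_ridge_logistic(x, y, lam, steps=20000, lr=0.1):
    """Gradient ascent on log-likelihood minus (lam/2) * b1**2."""
    b0, b1 = 0.0, 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(b0 + b1 * x)))
        b0 += lr * np.sum(y - p)
        # The -lam * b1 term pulls the slope back, so a finite
        # maximizer exists even under complete separation.
        b1 += lr * (np.sum((y - p) * x) - lam * b1)
    return b0, b1

b0, b1 = fit_ridge_logistic(x, y, lam=0.1)
print(b0, b1)  # finite slope, unlike the unpenalized fit
```

Firth's penalty has the nicer property of reducing small-sample bias, but any proper penalty kills the divergence the same way.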
 
Just a followup on this: would it be reasonable, in the sense of not affecting the "intrinsic" properties of a smallish data set with separation of values, say in the range [0, 5], to slightly alter (increase/decrease) some of the data values so as to overcome this issue, i.e., so that the outcomes beyond a certain value are not all of one class? Say my cutoff point for this data set within the [0, 5] range is 3 and I have several points with value 3. Could I change the data set to replace 3 by 3.02 in some cases and by, say, 2.98 in others, in order to avoid this problem? I just want to be able to model the probability of success; I would think most of the properties of the data would be preserved by doing this?
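For concreteness, a sketch of that jittering idea on hypothetical data (the values, cutoff, and ±0.02 magnitude are the ones proposed above, not from any real data set): perturb only the observations tied at the cutoff, by less than the spacing of the data, and then re-examine whether the labels are still monotone in X.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: several observations tied at the cutoff value 3,
# with mixed outcomes at the tie (quasi-separation).
x = np.array([1.0, 2.0, 3.0, 3.0, 3.0, 4.0, 5.0])
y = np.array([0, 0, 0, 1, 1, 1, 1])

# Jitter only the tied values by a small random amount in [-0.02, 0.02];
# everything else is left untouched.
jittered = x.copy()
ties = x == 3.0
jittered[ties] += rng.uniform(-0.02, 0.02, size=ties.sum())

# Check whether the labels are still monotone in x after jittering.
order = np.argsort(jittered)
still_monotone = bool(np.all(np.diff(y[order]) >= 0))
print(jittered, still_monotone)
```

One caution worth noting: a random jitter can land the failure at 3 below the successes at 3, turning quasi-separation into complete separation rather than removing it, so the direction of each perturbation matters, and the result depends on which points you move which way.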
 