What to do with an ordinal response variable?

FallenApple · Jun 23, 2017

What if my response variable, y, is on an ordinal scale, for example a ranking from 1 to 10? How would I transform the response to make linear regression work?

andrewkirk · Jun 23, 2017

Do a glm where the response variable is uniformly distributed on [0,1] and the link function is the cdf of the standard normal.
Code response variable values 1,...,10 as 0.05, 0.15,...,0.95.

FallenApple · Jun 23, 2017

andrewkirk said:

Do a glm where the response variable is uniformly distributed on [0,1] and the link function is the cdf of the standard normal.
Code response variable values 1,...,10 as 0.05, 0.15,...,0.95.

Oh ok, got it. But why the midpoint? Is it because the rankings are evenly spaced, so that 1 becomes 0.05?

But what if it's something else, like grades A, B, C, D, F? Would I split the scale into five, so 100/5 = 20, and then take the midpoints, so A=10, B=30, C=50, D=70, F=90?

So Prob( Z < transformed(y))= sum of regressors?

So in R it would be qnorm(new_y, 0, 1)~x1+x2+...+xp ?

andrewkirk · Jun 23, 2017

I think the code would be something like

Code:

# y holds the transformed response in (0, 1); "..." stands for the remaining predictors
glm(y ~ x1 + x2 + ... + xn, family = quasi(link = "probit", variance = "constant"))

But I am not completely sure about the variance argument. Unfortunately, the R documentation on the 'quasi' family is almost non-existent. Best to try it and see what happens.
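As a rough check that a call of this shape runs, here is a minimal sketch on simulated data. The names `x1`, `x2`, `latent`, and `rating`, the coefficients, and the sample size are all made up for illustration and do not come from the thread. Note that in R, `link = "probit"` means the link function is `qnorm`, and its inverse `pnorm` (##\Phi##) maps the linear predictor to the mean.

```r
# Simulated example (all data hypothetical): a 1-10 rating driven by two predictors
set.seed(1)
n.obs <- 200
x1 <- rnorm(n.obs)
x2 <- rnorm(n.obs)
latent <- 0.8 * x1 - 0.5 * x2 + rnorm(n.obs)

# Cut the latent score into 10 ordered categories at its deciles
breaks <- quantile(latent, probs = seq(0, 1, by = 0.1))
rating <- cut(latent, breaks = breaks, include.lowest = TRUE, labels = FALSE)

# Code ratings 1..10 as the midpoints 0.05, 0.15, ..., 0.95
y <- (2 * rating - 1) / (2 * 10)

# Quasi-likelihood fit with probit link and constant variance
fit <- glm(y ~ x1 + x2, family = quasi(link = "probit", variance = "constant"))
summary(fit)

# Fitted means are pnorm(linear predictor), so they stay inside (0, 1)
head(fitted(fit))
```

With constant variance this amounts to a nonlinear least-squares fit of the midpoints to ##\Phi## of the linear predictor. Whether that variance assumption is appropriate for a given data set is a separate question that this sketch does not settle.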

FallenApple said:

So Prob( Z < transformed(y))= sum of regressors?

We need to apply ##\Phi## to the sum of regressors.
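Written out, with ##y^*## denoting the transformed response and ##\beta_0, \dots, \beta_p## as assumed names for the regression coefficients, the model is

$$E[y^*] = \Phi(\beta_0 + \beta_1 x_1 + \dots + \beta_p x_p)$$

or equivalently

$$\Phi^{-1}\left(E[y^*]\right) = \beta_0 + \beta_1 x_1 + \dots + \beta_p x_p$$

So the probit link ##\Phi^{-1}## (R's `qnorm`) acts on the mean of the response, not on each observed value.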

If you have an ordered factor variable fac and the vector of corresponding integer values is fac.int then the transformation would be

Code:

n <- length(levels(fac))                      # number of ordered categories
y.transformed <- (2 * fac.int - 1) / (2 * n)  # category midpoints on (0, 1)
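For the letter-grade case, the same formula can be sketched as follows. The `grades` vector is hypothetical, and the level order F < D < C < B < A is an assumption about how the grades should be ranked. With five levels the midpoints are 0.1, 0.3, 0.5, 0.7, and 0.9.

```r
# Hypothetical grade data; the level order F < D < C < B < A is an assumption
grades <- c("A", "C", "F", "B", "D", "C")
fac <- factor(grades, levels = c("F", "D", "C", "B", "A"), ordered = TRUE)

fac.int <- as.integer(fac)                    # F = 1, D = 2, ..., A = 5
n <- length(levels(fac))                      # 5 categories
y.transformed <- (2 * fac.int - 1) / (2 * n)  # F = 0.1, ..., A = 0.9

print(data.frame(grade = grades, y.transformed))
```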

You could also try searching 'Ordinal regression', which is the term for what you are trying to do.
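One common implementation of ordinal regression in R is `polr()` from the MASS package. It is not mentioned in the thread, and the sketch below is only one possible way to fit such a model. The data are simulated, and every variable name and coefficient is an assumption made for illustration.

```r
library(MASS)  # provides polr(); a recommended package shipped with standard R installs

# Simulated example (all data hypothetical)
set.seed(2)
n.obs <- 300
x1 <- rnorm(n.obs)
x2 <- rnorm(n.obs)
latent <- 0.8 * x1 - 0.5 * x2 + rnorm(n.obs)

# Five ordered categories cut at the quintiles of the latent score
breaks <- quantile(latent, probs = seq(0, 1, by = 0.2))
fac <- cut(latent, breaks = breaks, include.lowest = TRUE,
           labels = c("F", "D", "C", "B", "A"), ordered_result = TRUE)

# Ordered probit: estimates the slopes plus one cut-point between each pair of levels
fit.ord <- polr(fac ~ x1 + x2, method = "probit", Hess = TRUE)
summary(fit.ord)
```

Unlike the midpoint coding, this approach does not assume the categories are evenly spaced; the cut-points between adjacent levels are estimated from the data.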

FallenApple · Jun 23, 2017

Oh ok. I'll look into the probit glm. What about dichotomizing it? Would that hurt?

For grades, I could code A-C as 1 for pass and D-F as 0 for fail. So it depends on context if I can do this?

andrewkirk · Jun 23, 2017

FallenApple said:

So it depends on context if I can do this?

Yes, it depends on what you are trying to achieve.

FallenApple · Jun 24, 2017

andrewkirk said:

Yes, it depends on what you are trying to achieve.

Well, mostly it's just to make it simpler. But what is the tradeoff if I dichotomize, say, grades into pass/fail? Would I lose power? Presumably, if x has an effect on grades from level to level, then there would be a corresponding effect on moving from fail to pass and vice versa.
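One way to explore this trade-off is to fit both models to the same data and compare the standard error of the slope. The sketch below does this for a single simulated data set; the data, the effect size, and the choice of `polr()` from MASS are all assumptions for illustration, and one data set is not a power analysis. In general, collapsing five categories into two discards information, so the dichotomized fit will typically estimate the effect less precisely.

```r
library(MASS)  # provides polr()

# Simulated example (all data hypothetical)
set.seed(3)
n.obs <- 300
x <- rnorm(n.obs)
latent <- 0.5 * x + rnorm(n.obs)

# Five ordered grades cut at the quintiles of the latent score
breaks <- quantile(latent, probs = seq(0, 1, by = 0.2))
grade <- cut(latent, breaks = breaks, include.lowest = TRUE,
             labels = c("F", "D", "C", "B", "A"), ordered_result = TRUE)

# Dichotomize: A-C = 1 (pass), D-F = 0 (fail)
pass <- as.integer(grade >= "C")

# Ordered probit on the full scale versus binary probit on pass/fail
fit.ord <- polr(grade ~ x, method = "probit", Hess = TRUE)
fit.bin <- glm(pass ~ x, family = binomial(link = "probit"))

# Slope estimate and standard error from each fit
coef(summary(fit.ord))["x", c("Value", "Std. Error")]
coef(summary(fit.bin))["x", c("Estimate", "Std. Error")]
```

Repeating this over many simulated data sets and counting how often each model detects the effect would give a rough estimate of the power lost by dichotomizing.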
