
A What to do with an ordinal response variable?

  1. Jun 23, 2017 #1
    So what if my response variable, y, is, say, a scale? For example, a ranking, something like 1-10. How would I transform the response to make linear regression work?
     
  2. jcsd
  3. Jun 23, 2017 #2

    andrewkirk

    Science Advisor
    Homework Helper
    Gold Member

    Do a GLM where the response variable is uniformly distributed on [0,1] and the link is the probit, so that the mean response is the cdf of the standard normal applied to the linear predictor.
    Code response variable values 1,...,10 as 0.05, 0.15,...,0.95.
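One way to read the midpoint coding above, assuming the K = 10 levels are treated as equal-width slices of [0,1]: level k covers the interval from (k-1)/K to k/K, and its midpoint is the coded value. The symbols k, K and y_k are introduced here only for illustration.

```latex
y_k = \frac{1}{2}\left(\frac{k-1}{K} + \frac{k}{K}\right) = \frac{2k-1}{2K},
\qquad
K = 10:\quad
y_1 = \tfrac{1}{20} = 0.05,\;
y_2 = \tfrac{3}{20} = 0.15,\;
\dots,\;
y_{10} = \tfrac{19}{20} = 0.95 .
```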
     
  4. Jun 23, 2017 #3

    Oh OK, got it. But why the midpoint? Is it because the rankings are evenly spaced, so that 1 becomes 0.05?

    But what if it's something else, like grades A, B, C, D, F? Would I split the range into five, so 100/5 = 20, and then take the midpoints, so A=10, B=30, C=50, D=70, F=90?

    So Prob( Z < transformed(y))= sum of regressors?

    So in R it would be qnorm(new_y, 0, 1) ~ x1 + x2 + ... + xp?
     
  5. Jun 23, 2017 #4

    andrewkirk

    Science Advisor
    Homework Helper
    Gold Member

    I think the code would be something like

    Code (Text):
    glm(y ~ x1 + x2 + ... + xn, family = quasi(link = "probit", variance = "constant"))
    But I am not completely sure about the variance argument. Unfortunately, the R documentation on the 'quasi' family is almost non-existent. Best to try it and see what happens.
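For anyone who wants to "try it and see", here is a minimal sketch of that call on simulated data. Everything about the data is an assumption made for illustration (the sample size, the two regressors x1 and x2, the latent variable and the quantile cut points); only the glm() call and the midpoint coding come from the thread.

```r
# Simulated stand-in data: a 1-10 ordinal score driven by two regressors.
set.seed(1)
N <- 200
x1 <- rnorm(N)
x2 <- rnorm(N)
latent <- 0.8 * x1 - 0.5 * x2 + rnorm(N)   # hypothetical underlying trait

# Cut the latent trait into K = 10 ordered levels of roughly equal size.
K <- 10
breaks <- quantile(latent, probs = seq(0, 1, length.out = K + 1))
fac <- cut(latent, breaks = breaks, include.lowest = TRUE, ordered_result = TRUE)
fac.int <- as.integer(fac)

# Midpoint coding: 1, ..., 10 becomes 0.05, 0.15, ..., 0.95.
y <- (2 * fac.int - 1) / (2 * K)

# The call suggested in the post: probit link, constant variance.
fit <- glm(y ~ x1 + x2,
           family = quasi(link = "probit", variance = "constant"))
print(summary(fit))
```

With a constant variance function this amounts to a least-squares fit of the probit curve to the coded scores, so the coefficients are on the probit scale rather than the 1-10 scale.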

    We need to apply ##\Phi## to the sum of regressors.
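Written out, and assuming coefficients named β_0, ..., β_n (notation not used in the post), the model puts Φ on the regressor side:

```latex
\mathbb{E}[\,y \mid x\,] = \Phi\!\left(\beta_0 + \beta_1 x_1 + \dots + \beta_n x_n\right)
\quad\Longleftrightarrow\quad
\Phi^{-1}\!\left(\mathbb{E}[\,y \mid x\,]\right) = \beta_0 + \beta_1 x_1 + \dots + \beta_n x_n .
```

This transforms the mean of y. It is in general not the same as regressing Φ^{-1}(y), i.e. qnorm(y) in R, on the regressors by ordinary least squares, because the expectation and Φ^{-1} do not commute.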

    If you have an ordered factor variable fac and the vector of corresponding integer values is fac.int then the transformation would be

    Code (Text):

    n <- length(levels(fac))
    y.transformed <- (2 * fac.int - 1) / (2 * n)
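As a small worked example of the transformation above, applied to the letter grades asked about earlier: the grade vector below is made up, and the level order (F lowest, A highest) is an assumption.

```r
# Hypothetical example data: letter grades as an ordered factor, F lowest, A highest.
grades <- c("F", "D", "C", "B", "A", "C")
fac <- factor(grades, levels = c("F", "D", "C", "B", "A"), ordered = TRUE)

# Integer codes follow the level order: F = 1, D = 2, C = 3, B = 4, A = 5.
fac.int <- as.integer(fac)

# Midpoint coding from the post.
n <- length(levels(fac))
y.transformed <- (2 * fac.int - 1) / (2 * n)

print(data.frame(grade = fac, fac.int = fac.int, y.transformed = y.transformed))
```

With five levels the coded values are 0.1, 0.3, 0.5, 0.7 and 0.9, that is, the midpoints 10, 30, 50, 70, 90 on a percentage scale, with F at the bottom and A at the top.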
     
    You could also try searching 'Ordinal regression', which is the term for what you are trying to do.
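For reference, one common implementation of ordinal regression in R is polr() from the MASS package. The sketch below is not from the thread; the data are simulated and the variable names (grade, x1, x2, dat) are hypothetical.

```r
library(MASS)  # provides polr(); MASS ships with standard R distributions

# Simulated stand-in data: five ordered grades driven by two regressors.
set.seed(2)
N <- 300
x1 <- rnorm(N)
x2 <- rnorm(N)
latent <- 0.8 * x1 - 0.5 * x2 + rnorm(N)   # hypothetical underlying trait
breaks <- quantile(latent, probs = seq(0, 1, by = 0.2))
grade <- cut(latent, breaks = breaks, labels = c("F", "D", "C", "B", "A"),
             include.lowest = TRUE, ordered_result = TRUE)
dat <- data.frame(grade, x1, x2)

# Ordered probit: the response stays an ordered factor, no numeric coding needed.
fit <- polr(grade ~ x1 + x2, data = dat, method = "probit", Hess = TRUE)

print(coef(fit))   # slopes on the latent (probit) scale
print(fit$zeta)    # estimated cut points between adjacent grades
```

Unlike the midpoint coding, this approach estimates the cut points between levels instead of assuming the levels are evenly spaced.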
     
    Last edited: Jun 23, 2017
  6. Jun 23, 2017 #5

    Oh OK. I'll look into the probit GLM.


    What about dichotomizing it? Would that hurt?

    For grades, I could code A-C as 1 for pass and D-F as 0 for fail. So whether I can do this depends on the context?
     
  7. Jun 23, 2017 #6

    andrewkirk

    Science Advisor
    Homework Helper
    Gold Member

    Yes, it depends on what you are trying to achieve.
     
  8. Jun 24, 2017 #7
    Well, mostly it's just to make it simpler. But what is the tradeoff if I dichotomize, say, grades into pass/fail? Would I lose power? Presumably, if x has an effect on grades from level to level, then there would be a corresponding effect on going from fail to pass and vice versa.
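The power question can be explored empirically with a small simulation. The sketch below is only one possible setup, and every number in it is an assumption (effect size, sample size, cut points, number of replications). It compares how often the effect of x is detected at the 5% level when all five grade levels are used (least squares on the probit-transformed midpoints) and when grades are collapsed to pass/fail (probit regression on the binary outcome).

```r
set.seed(42)
n_sims <- 200    # number of simulated data sets (assumed)
N <- 100         # observations per data set (assumed)
K <- 5           # grade levels, coded 1 = F, ..., 5 = A
beta <- 0.3      # assumed true effect of x on the latent scale

reject_full <- logical(n_sims)
reject_binary <- logical(n_sims)

for (s in seq_len(n_sims)) {
  x <- rnorm(N)
  latent <- beta * x + rnorm(N)
  # Fixed, hypothetical cut points between grades.
  grade <- cut(latent, breaks = c(-Inf, -1, -0.3, 0.3, 1, Inf), labels = FALSE)

  # Full ordinal information: midpoint coding, then probit transform.
  y_full <- (2 * grade - 1) / (2 * K)
  p_full <- summary(lm(qnorm(y_full) ~ x))$coefficients["x", "Pr(>|t|)"]

  # Dichotomized: C or better counts as a pass.
  pass <- as.integer(grade >= 3)
  fit_bin <- glm(pass ~ x, family = binomial(link = "probit"))
  p_bin <- summary(fit_bin)$coefficients["x", "Pr(>|z|)"]

  reject_full[s] <- p_full < 0.05
  reject_binary[s] <- p_bin < 0.05
}

power_full <- mean(reject_full)
power_binary <- mean(reject_binary)
print(c(full = power_full, dichotomized = power_binary))
```

Collapsing five levels into two discards the distinctions within pass and within fail, so some loss of information is expected. How much it costs in a given problem depends on the effect size, the sample size and where the pass/fail cut falls.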
     