When can Ordinal Variables be treated as Interval Variables?

Discussion Overview

The discussion revolves around the treatment of ordinal variables, such as those derived from Likert scales, as interval variables in statistical analysis. Participants explore the justifications for this practice, its implications in social science research, and the potential consequences of such treatment on data interpretation.

Discussion Character

  • Debate/contested
  • Conceptual clarification
  • Technical explanation

Main Points Raised

  • Some participants note that ordinal variables can be ranked but argue that the intervals between categories may not be equal, raising questions about the validity of treating them as interval variables.
  • Others suggest that the practice of treating ordinal variables as interval variables is common in published papers, though they acknowledge this is not a strong justification.
  • A participant emphasizes that treating an ordinal response variable as numerical is more problematic than treating an ordinal independent variable as interval.
  • Some argue that there should be a logical basis for the numerical values assigned to ordinal categories to justify their treatment as interval variables.
  • Concerns are raised about the loss of information when binning continuous data, with examples provided regarding the potential for misinterpretation of data distributions.
  • Participants discuss the challenges of working with latent variables in psychological data, noting that unknown binning may be necessary due to the unobservable nature of these variables.
  • Methods for analyzing latent variables are mentioned, including the use of cumulative models with logit or probit links, which assume a continuous underlying distribution.

Areas of Agreement / Disagreement

Participants express a range of opinions on the appropriateness of treating ordinal variables as interval variables, with no consensus reached. Some support the practice under certain conditions, while others challenge its validity and highlight the need for careful justification.

Contextual Notes

Limitations include the potential for unequal spacing in ordinal scales, the impact of binning on data interpretation, and the reliance on assumptions about latent variables. These factors contribute to the complexity of the discussion without resolving the underlying issues.

fog37
TL;DR: Understanding when it is OK to treat ordinal variables as interval variables
Hello,

Ordinal variables (see Likert scale) can be labelled using numbers and ranked by those numbers. However, the difference between category 2 and category 3 may not be exactly the same as the difference between categories 4 and 5. That said, I noticed that in social science, ordinal variables are sometimes treated approximately as numerical predictors when the ordinal variable has many levels. Is that a correct approach? What justifies that? I did some reading and found a variety of opinions on the topic.

Thanks
 
fog37 said:
TL;DR Summary: Understanding when it is OK to treat ordinal variables as interval variables

What justifies that?
People did it previously in published papers. Doing it currently doesn’t sink a paper.

It isn’t a great justification. As you say, there are a variety of opinions on the topic, including ones that are supportive of the practice.

So it will continue to be done for the time being. Most reviewers are statistically unsophisticated, and ordinal methods are less familiar and often less powerful.
 
Dale said:
People did it previously in published papers. Doing it currently doesn’t sink a paper.

It isn’t a great justification. As you say, there are a variety of opinions on the topic, including ones that are supportive of the practice.

So it will continue to be done for the time being. Most reviewers are statistically unsophisticated, and ordinal methods are less familiar and often less powerful.
It seems to me that the issue is more serious when an ordinal response/outcome variable is treated as numerical, and perhaps less serious when the ordinal variable is an independent variable that we treat as an interval variable.
 
fog37 said:
TL;DR Summary: Understanding when it is OK to treat ordinal variables as interval variables

Hello,

Ordinal variables (see Likert scale) can be labelled using numbers and ranked by those numbers. However, the difference between category 2 and category 3 may not be exactly the same as the difference between categories 4 and 5. That said, I noticed that in social science, ordinal variables are sometimes treated approximately as numerical predictors when the ordinal variable has many levels. Is that a correct approach? What justifies that? I did some reading and found a variety of opinions on the topic.

Thanks
"Is that a correct approach?"
No. The fact that something is [or has been] widely done does not make it valid.
 
IMO, there should be some subject-matter logic behind the relative numerical values in order to justify that approach. In the cases you refer to, you should base your evaluation on how well they justified the scaling. There may be very good reasons for unequal spacing, but there might not be. I would hope that any assignment of unequal spacing in a peer-reviewed publication was done for some subject-matter, logical reason.
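
To make the spacing point concrete, here is a minimal Python sketch. The response counts and both scoring schemes are invented for illustration and are not taken from any study. It shows that the ordering of two group means can reverse when the same ordered categories are given a different, still monotone, set of numerical scores, which is why the choice of scores needs a justification.

```python
def mean_score(counts, scores):
    """Mean response after mapping each ordered category to a numeric score."""
    total = sum(counts)
    return sum(c * s for c, s in zip(counts, scores)) / total

# Hypothetical counts per category (1..5) for two groups of 100 respondents.
group_a = [10, 20, 40, 20, 10]   # concentrated in the middle
group_b = [35, 5, 10, 5, 45]     # polarised toward the extremes

# Two monotone scorings of the same five ordered categories (both assumed).
equal_spacing = [1, 2, 3, 4, 5]
unequal_spacing = [1, 4, 5, 6, 7]   # big gap between categories 1 and 2

for name, scores in [("equal", equal_spacing), ("unequal", unequal_spacing)]:
    a = mean_score(group_a, scores)
    b = mean_score(group_b, scores)
    higher = "A" if a > b else "B"
    print(f"{name:8s} spacing: mean A = {a:.2f}, mean B = {b:.2f}, higher: {higher}")
```

With equal spacing group B has the higher mean; with the unequal scoring group A does. Rank-based summaries such as medians do not depend on the scores in this way.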
 
Just a point: binning continuous data can be a very bad thing to do because you lose information. Care has to be taken even in the best of situations. Imagine a data set that is actually bimodal (or multimodal): a histogram with too few bins probably won't detect it. Using income data rounded to tens of thousands can hide evidence of inflation that would be detected from the raw values.

Frank Harrell has a very good illustration of problems at the following link.

https://discourse.datamethods.org/t/categorizing-continuous-variables/3402
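
The histogram remark can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the two component distributions, the range, and the bin counts. The data are evenly spaced quantiles of two normal distributions rather than random draws, so the result is reproducible.

```python
from statistics import NormalDist

def quantile_sample(mu, sigma, n):
    """Deterministic 'sample': n evenly spaced quantiles of Normal(mu, sigma)."""
    dist = NormalDist(mu, sigma)
    return [dist.inv_cdf((i + 0.5) / n) for i in range(n)]

def histogram(data, lo, hi, bins):
    """Counts per equal-width bin on [lo, hi]; values outside are ignored."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for x in data:
        if lo <= x <= hi:
            idx = min(int((x - lo) / width), bins - 1)  # x == hi goes in last bin
            counts[idx] += 1
    return counts

def count_peaks(counts):
    """Number of bins strictly higher than both neighbours (edges compare to 0)."""
    padded = [0] + list(counts) + [0]
    return sum(
        1
        for i in range(1, len(padded) - 1)
        if padded[i] > padded[i - 1] and padded[i] > padded[i + 1]
    )

# Hypothetical bimodal data: two well-separated normal components.
data = quantile_sample(-1.5, 0.4, 500) + quantile_sample(1.5, 0.4, 500)

fine = histogram(data, -6.0, 6.0, 12)    # bin width 1
coarse = histogram(data, -6.0, 6.0, 3)   # bin width 4

print("12 bins:", fine, "peaks:", count_peaks(fine))
print(" 3 bins:", coarse, "peaks:", count_peaks(coarse))
```

With the narrow bins the two modes show up as two separate peaks; with three wide bins both modes fall into the middle bin and the histogram looks unimodal.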
 
statdad said:
binning continuous data can be a very bad thing to do because you lose information. Care has to be taken even in the best of situations. Imagine a data set that is actually bimodal (or multimodal): a histogram with too few bins probably won't detect it.
Binning can also produce a bimodal discrete distribution where the underlying continuous distribution is not bimodal.

However, very often with psychological data you are working with latent variables so you have no choice but to do an unknown binning on the unobservable latent scale.
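
As a rough illustration of the reverse effect, the sketch below bins a standard normal latent variable, which is unimodal, at unequally spaced cut points. The cut points are hypothetical, chosen only to make the effect visible.

```python
from statistics import NormalDist

def category_probs(cutpoints, latent=NormalDist()):
    """Probability of each ordered category when a continuous latent variable
    is cut at the given increasing cut points."""
    cdf_at_edges = [0.0] + [latent.cdf(c) for c in cutpoints] + [1.0]
    return [hi - lo for lo, hi in zip(cdf_at_edges, cdf_at_edges[1:])]

# Hypothetical cut points: a very narrow middle category, wide neighbours.
cutpoints = [-1.5, -0.1, 0.1, 1.5]
probs = category_probs(cutpoints)

for k, p in enumerate(probs, start=1):
    print(f"category {k}: {p:.3f} " + "#" * round(50 * p))
```

Categories 2 and 4 each receive far more mass than the narrow middle category 3, so the observed discrete distribution has two modes even though the latent distribution has one. With an unobservable latent scale the cut points are unknown, so the shape of the category counts alone says little about the shape of the latent distribution.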
 
Dale said:
Binning can also produce a bimodal discrete distribution where the underlying continuous distribution is not bimodal.

However, very often with psychological data you are working with latent variables so you have no choice but to do an unknown binning on the unobservable latent scale.
There are also methods for latent variables that assume the observed data originate from continuous, IIRC (WLOG) normal, variables. Let me see if I can find refs.
 
WWGD said:
There are also methods for latent variables that assume the observed data originate from continuous, IIRC (WLOG) normal, variables. Let me see if I can find refs.
Yes. I like the cumulative family with the logit or probit link in the brms package in R. With the probit link the latent variable is assumed to have a standard normal distribution.
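
For readers who want to see the idea behind a cumulative probit model without R, here is a minimal Python sketch using only the standard library. It is not brms and does no Bayesian fitting. It assumes the usual parameterization P(Y <= k) = Phi(tau_k - eta), with a standard normal latent variable, thresholds tau_k, and linear predictor eta. The function names and the counts are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # latent distribution implied by the probit link

def cumulative_probit_probs(thresholds, eta=0.0):
    """Category probabilities under P(Y <= k) = Phi(tau_k - eta)."""
    cumulative = [STD_NORMAL.cdf(t - eta) for t in thresholds] + [1.0]
    probs, previous = [], 0.0
    for c in cumulative:
        probs.append(c - previous)
        previous = c
    return probs

def thresholds_from_counts(counts):
    """Thresholds for a model with no predictors: tau_k is the standard normal
    quantile of the observed cumulative proportion up to category k.
    Assumes the first and last categories are non-empty; otherwise a cumulative
    proportion of 0 or 1 makes inv_cdf raise an error."""
    total = sum(counts)
    running, taus = 0, []
    for c in counts[:-1]:
        running += c
        taus.append(STD_NORMAL.inv_cdf(running / total))
    return taus

# Hypothetical counts for a 5-level Likert item.
counts = [10, 20, 40, 20, 10]
taus = thresholds_from_counts(counts)
print("thresholds:", [round(t, 3) for t in taus])
print("gaps:      ", [round(b - a, 3) for a, b in zip(taus, taus[1:])])

# A positive linear predictor shifts probability toward higher categories.
print("eta = 0:", [round(p, 3) for p in cumulative_probit_probs(taus, 0.0)])
print("eta = 1:", [round(p, 3) for p in cumulative_probit_probs(taus, 1.0)])
```

The gaps between the estimated thresholds need not be equal, which is the sense in which this kind of model lets the data determine the spacing instead of assuming that categories 1 to 5 are equally far apart.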
 