Lack of Fit in Ordinal Regression -- Analysis/Alternatives?

WWGD · Aug 25, 2016

Hi All,
I ran a binary logistic regression of Y on each of three numerical variables A, B, C. I am having a separation problem with all of them: there are values ##A_0, B_0, C_0## (different for each variable, of course) such that for ##A>A_0##, ##B>B_0##, ##C>C_0## all the responses are successes. (I guess this forces the slope estimate to diverge to infinity in absolute value, so that the curve can accommodate the abrupt change between 0 and 1.) Then I increased the number of response levels to three (high, medium and low) to use an ordinal regression. But now I have a significant lack of fit, with p --> 0 on the chi-squared test. How does one interpret lack of fit in a logistic regression? I know that lack of fit in a simple linear regression means that the data are not linear, but what does it mean for a logistic? Does it mean that (the log of) the data is not distributed like an S-curve ##e^L/(1+e^L)##, with ##L=\beta_0+ \beta_1 x+...##? If so, are there any standard (or any) alternatives, e.g. for a distribution for the data? Any ideas?
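
The separation effect described in this post can be illustrated with a minimal sketch. It assumes a single predictor, made-up data that are perfectly separated at a hypothetical threshold of 2.5, and a curve whose midpoint is pinned to that threshold; none of this comes from the actual dataset. For such data the log-likelihood keeps increasing as the slope grows, so no finite maximum likelihood estimate exists.

```python
import math

def log_sigmoid(z):
    """Numerically stable log(1 / (1 + exp(-z)))."""
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))

def log_likelihood(intercept, slope, xs, ys):
    """Bernoulli log-likelihood of a one-predictor logistic model."""
    total = 0.0
    for x, y in zip(xs, ys):
        z = intercept + slope * x
        # log P(Y=1) = log_sigmoid(z), log P(Y=0) = log_sigmoid(-z)
        total += log_sigmoid(z) if y == 1 else log_sigmoid(-z)
    return total

# Hypothetical, perfectly separated data: failures below 2.5, successes above.
xs = [0.5, 1.0, 1.5, 2.0, 3.0, 3.5, 4.0, 4.5]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
threshold = 2.5  # assumed separation point

slopes = [1.0, 5.0, 25.0, 125.0]
# Pin the curve's midpoint at the threshold: intercept = -slope * threshold.
log_liks = [log_likelihood(-k * threshold, k, xs, ys) for k in slopes]

for k, ll in zip(slopes, log_liks):
    print(f"slope={k:7.1f}  log-likelihood={ll:.3e}")
```

The printed log-likelihoods climb toward 0 as the slope increases, which is why fitting software typically reports huge coefficients and standard errors, or a warning, in this situation.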

micromass · Aug 25, 2016

What are your covariates? What is the nature of the covariates? Are they continuous? categorical?

WWGD · Aug 25, 2016

They are all continuous, thanks.

micromass · Aug 25, 2016

WWGD said:

They are all continuous, thanks.

That could be cause for problems in your hypothesis tests then. I don't know which test you used for lack of fit, but usually they don't work for continuous covariates.

WWGD · Aug 25, 2016

micromass said:

That could be cause for problems in your hypothesis tests then. I don't know which test you used for lack of fit, but usually they don't work for continuous covariates.

No, I had no problem with the chi-squared test, which AFAIK does not require discrete/categorical variables. I just got a pretty low p-value.

micromass · Aug 25, 2016

WWGD said:

No, I had no problem with the chi-squared test, which AFAIK does not require discrete/categorical variables. I just got a pretty low p-value.

I don't understand. What is chi-squared? There are many chi-square tests in regression.

WWGD · Aug 25, 2016

micromass said:

I don't understand. What is chi-squared? There are many chi-square tests in regression.

It is, I believe, the standard goodness-of-fit statistic, ##\sum (\text{observed} - \text{expected})^2/\text{expected}##. I am not aware of any other chi-square goodness-of-fit tests.

micromass · Aug 25, 2016

Are you talking about the Pearson residuals? In either case, that chi-square test in your post doesn't always work for continuous variables.
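
One commonly used workaround for continuous covariates, not mentioned by name in this thread, is a Hosmer-Lemeshow-style test: observations are grouped by fitted probability and observed counts are compared with expected counts per group. The sketch below is a minimal illustration under stated assumptions: the data are simulated, the function name `hosmer_lemeshow_statistic` is hypothetical, and the reference distribution (chi-square with groups minus 2 degrees of freedom) is the usual approximation for probabilities fitted on the same data.

```python
import random

def hosmer_lemeshow_statistic(probs, ys, n_groups=10):
    """Grouped Pearson-type statistic; returns (statistic, degrees of freedom)."""
    # Sort observations by fitted probability only (stable for ties).
    pairs = sorted(zip(probs, ys), key=lambda pair: pair[0])
    n = len(pairs)
    statistic = 0.0
    for g in range(n_groups):
        group = pairs[g * n // n_groups:(g + 1) * n // n_groups]
        if not group:
            continue
        size = len(group)
        expected = sum(p for p, _ in group)  # expected number of successes
        observed = sum(y for _, y in group)  # observed number of successes
        mean_p = expected / size
        statistic += (observed - expected) ** 2 / (size * mean_p * (1.0 - mean_p))
    return statistic, n_groups - 2

# Simulated data (an assumption for illustration): outcomes drawn from known probabilities.
rng = random.Random(0)
n = 500
true_probs = [0.05 + 0.9 * i / (n - 1) for i in range(n)]
ys = [1 if rng.random() < p else 0 for p in true_probs]

stat_good, df = hosmer_lemeshow_statistic(true_probs, ys)
# A deliberately wrong model: predicted probabilities reversed.
reversed_probs = [1.0 - p for p in true_probs]
stat_bad, _ = hosmer_lemeshow_statistic(reversed_probs, ys)

print(f"well calibrated:  {stat_good:.2f} on {df} df")
print(f"badly calibrated: {stat_bad:.2f} on {df} df")
```

A large statistic relative to the chi-square reference suggests that predicted and observed success rates disagree in some range of fitted probabilities; it does not say which modelling assumption failed.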

WWGD · Aug 25, 2016

Thanks, I'll look into it.

WWGD · Aug 27, 2016

Still, it would be nice if someone knew of a good interpretation of a lack of fit in ordinal logistic regression, other than the obvious ones involving collinearity, etc. Lack of fit for ordinary least squares means a line is not an effective way of describing the dataset, but it is not so clear for logistic. I have worked through the linearity of log(odds) = ##\beta_0 + \beta_1x_1+...##, and how ##\beta_0## shifts the S-curve while ##\beta_1## "speeds it up or slows it down", etc., but I am having trouble finding a clear understanding of the lack of fit.
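
The roles of ##\beta_0## and ##\beta_1## mentioned here can be checked numerically. This is a minimal sketch for one predictor; the coefficient values are arbitrary illustrations, and the helper names are hypothetical. The curve crosses probability 0.5 at ##x = -\beta_0/\beta_1##, and its slope there is ##\beta_1/4##.

```python
import math

def prob(x, beta0, beta1):
    """Logistic curve p(x) = 1 / (1 + exp(-(beta0 + beta1 * x)))."""
    return 1.0 / (1.0 + math.exp(-(beta0 + beta1 * x)))

def midpoint(beta0, beta1):
    """x at which the fitted probability equals 0.5."""
    return -beta0 / beta1

def slope_at_midpoint(beta1):
    """dp/dx at the midpoint: beta1 * p * (1 - p) with p = 0.5."""
    return beta1 / 4.0

# Arbitrary example coefficients: change beta0 only, then beta1 only.
for beta0, beta1 in [(-2.0, 1.0), (-4.0, 1.0), (-4.0, 2.0)]:
    m = midpoint(beta0, beta1)
    print(f"beta0={beta0}, beta1={beta1}: midpoint={m}, "
          f"p(midpoint)={prob(m, beta0, beta1)}, slope={slope_at_midpoint(beta1)}")
```

Changing ##\beta_0## alone moves the midpoint without changing the steepness; changing ##\beta_1## changes the steepness and, for fixed ##\beta_0##, also moves the midpoint.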

micromass · Aug 27, 2016

Is this a proportional odds model?
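
For reference, the proportional odds (cumulative logit) model for an ordinal response with ##J## ordered categories is usually written as below. The symbols ##\alpha_j##, ##J## and ##k## are notation assumed here, and the sign convention on the slopes differs between textbooks and software.

```latex
\operatorname{logit} P(Y \le j \mid x)
  = \log \frac{P(Y \le j \mid x)}{P(Y > j \mid x)}
  = \alpha_j - \beta_1 x_1 - \dots - \beta_k x_k,
  \qquad j = 1, \dots, J-1,
  \qquad \alpha_1 < \alpha_2 < \dots < \alpha_{J-1}.
```

With three levels (low, medium, high) there are two cutpoints ##\alpha_1, \alpha_2## but a single set of slopes shared by both cumulative splits. A lack of fit may therefore reflect a violation of this shared-slope assumption rather than a problem with the S-shape itself.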

Stephen Tashi · Aug 27, 2016

It's easier to analyze real life situations as real life situations rather than mathematical skeletons. What phenomenon do the data represent?

WWGD · Aug 27, 2016

Stephen Tashi said:

It's easier to analyze real life situations as real life situations rather than mathematical skeletons. What phenomenon do the data represent?

EDIT2: I did a regression of control vs. compliance/effectiveness. Specifically, control measures vs. the existence of Fraud (F), Error (E) and Waste (W). A linear regression for each separately produces the expected results: increased control leads to a decrease in each of F, E, W. I was trying to do a logit of Control vs. each, to get a measure of proportionality, to have some idea of the odds of a certain level of control leading above or below a cutoff point (selected as a standard level of 2.5 on a scale of 0 to 5) in each of the variables F, E, W. I got a horrible fit for the binary regressions with the chi-squared and Pearson goodness-of-fit methods, with a p of 0.00. (Actually, I had a separation-of-points issue since, beyond a certain level of control, all responses were successes.) I tried using a Likert scale to change the binary into an ordinal logistic, to see if I got a better fit, with no success. EDIT: (and low concordance, so low Kruskal's, etc.).

Stephen Tashi · Aug 27, 2016

WWGD said:

I did a regression of control vs. compliance/effectiveness. Specifically, control measures vs. the existence of Fraud (F), Error (E) and Waste (W).

An elementary question: Is each sample datum defined by a 4-tuple of numbers (c, f, e, w), so that all four values apply to a single "situation" that provides one sample?

WWGD · Aug 27, 2016

Stephen Tashi said:

An elementary question: Is each sample datum defined by a 4-tuple of numbers (c, f, e, w), so that all four values apply to a single "situation" that provides one sample?

Yes, for a certain fixed level of control we evaluate the associated levels of fraud, error and waste.

Stephen Tashi · Aug 27, 2016

WWGD said:

I was trying to do a logit of Control vs. each, to get a measure of proportionality, to have some idea of the odds of a certain level of control leading above or below a cutoff point (selected as a standard level of 2.5 on a scale of 0 to 5) in each of the variables F, E, W.

That's a hard sentence to parse. For example, "odds of" and "probability of" have different meanings. It's easier for me to think about probability than odds.

I don't understand what "proportionality" means in that context. I think of a "proportion" as a ratio of a part to a whole. So what quantity is "the part" and what quantity is "the whole"?

When you say "in each of the variables" , are you asking about all of them simultaneously? Or are you analyzing them individually ? For example if the level of control is (say) 8, are you asking something about the probability that a situation where the control is 8 will have less than a level of 2.5 in all three of F,E,W ?

WWGD · Aug 27, 2016

Stephen Tashi said:

That's a hard sentence to parse. For example, "odds of" and "probability of" have different meanings. It's easier for me to think about probability than odds.

I don't understand what "proportionality" means in that context. I think of a "proportion" as a ratio of a part to a whole. So what quantity is "the part" and what quantity is "the whole"?

When you say "in each of the variables" , are you asking about all of them simultaneously? Or are you analyzing them individually ? For example if the level of control is (say) 8, are you asking something about the probability that a situation where the control is 8 will have less than a level of 2.5 in all three of F,E,W ?

Hi, sorry for the mess, they were closing the coffee shop and I wrote things in a hurry.
1) I meant probability. I am new to logistic regression. As I understand it (please correct me if I am wrong) the input is a collection of Bernoulli trials (or at least their outcomes) and the outcome is a smooth family of Bernoulli distributions obtained through the use of maximum likelihood estimators for the collection of outcomes. In other words, our output is a PDF from the family of S-curves with parameters the dependent variables.

2) Re proportionality, I was being loose again. I meant a PDF relates the dependent variable to the independent ones, assigning a probability to input values for each independent variable.

3) Re "in each of the variables": both. Linearly, I regress C against each individually and then against all of them. (I ultimately do a "best subsets" analysis, considering all possible combinations of regressions, the best one being the one with the lowest Mallows' Cp and highest adjusted R^2; in case of a tie, select the model with the fewest variables. The 3-variable model was the best.) I also regressed each independent variable (i.e., F, E, W) logistically against Control. But I don't know how to do a logistic regression in the opposite sense, i.e., to have a control input and get probabilities for each of the 3 variables.

Stephen Tashi · Aug 28, 2016

WWGD said:

1) I meant probability. I am new to logistic regression. As I understand it (please correct me if I am wrong) the input is a collection of Bernoulli trials (or at least their outcomes) and the outcome is a smooth family of Bernoulli distributions obtained through the use of maximum likelihood estimators for the collection of outcomes.

Calling the outcome a "smooth family" of distributions is an interesting way to look at it. The outcome gives the parameter p of a Bernoulli distribution as a function of some independent variable x. The fitted equation p = f(x) implicitly defines a smooth family of Bernoulli distributions because for each given x we have a Bernoulli distribution with parameter p(x).
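
The idea that the fit produces a function p(x), and hence one Bernoulli distribution per x, can be made concrete with a minimal maximum likelihood sketch. It assumes one predictor, made-up data that are not separated, and plain Newton-Raphson iteration from a zero start; the function name `fit_logistic` is hypothetical and this is not meant to reproduce any particular package.

```python
import math

def fit_logistic(xs, ys, iterations=25):
    """Maximum likelihood fit of p(x) = 1 / (1 + exp(-(b0 + b1 * x)))."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = p * (1.0 - p)
            g0 += y - p            # score with respect to b0
            g1 += (y - p) * x      # score with respect to b1
            h00 += w               # entries of the information matrix
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (information) * delta = score.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

def p_of_x(x, b0, b1):
    """Bernoulli parameter for a given x under the fitted model."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))

# Hypothetical data with overlapping successes and failures (no separation).
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [0, 0, 1, 0, 1, 0, 1, 1]

b0, b1 = fit_logistic(xs, ys)
for x in (2.0, 4.5, 7.0):
    print(f"x={x}: Bernoulli parameter p(x)={p_of_x(x, b0, b1):.3f}")
```

With perfectly separated data the same iteration would not settle: the weights shrink toward zero and the coefficients keep growing, which is the separation problem raised at the start of the thread.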

In other words, our output is a PDF from the family of S -curves with parameters the dependent variables.

A "family" of distributions defines more than one PDF. Each member of the family has a PDF.

2)Re proportionality, I was being loose again. I meant a PDF relates the dependent variable to the independent ones, assigning a probability to input values for each independent variable.

We can investigate that concept, but it seems "backwards" to how the usual sort of analysis goes. When I think of "fraud" and "control" (like frequent audits), I think of a level of "control" causing (or allowing) a level of "fraud". I don't think of "fraud" as being what causes "control" (although I suppose one could look at it that way).

3) Re " In each of the variables" . Both, linearly I regress C against each individually and then against all of them (I ultimately do a "best subsets" analysis.

That would say that C is the dependent variable, so for logistic regression it has 2 possible outcomes , which I'll call "low control" and "high control". Each of the variables F,E,W is regarded as having a continuous range of values. Is that correct ?

WWGD · Aug 28, 2016

Stephen Tashi said:

Calling the outcome a "smooth family" of distributions is an interesting way to look at it. The outcome gives the parameter p of a Bernoulli distribution as a function of some independent variable x. The fitted equation p = f(x) implicitly defines a smooth family of Bernoulli distributions because for each given x we have a Bernoulli distribution with parameter p(x).

A "family" of distributions defines more than one PDF. Each member of the family has a PDF.
We can investigate that concept, but it seems "backwards" to how the usual sort of analysis goes. When I think of "fraud" and "control" (like frequent audits), I think of a level of "control" causing (or allowing) a level of "fraud". I don't think of "fraud" as being what causes "control" (although I suppose one could look at it that way).
That would say that C is the dependent variable, so for logistic regression it has 2 possible outcomes , which I'll call "low control" and "high control". Each of the variables F,E,W is regarded as having a continuous range of values. Is that correct ?

Yes, I wrote the variable order backwards. I was hoping to actually do a multi-valued regression, i.e., with control as the independent variable and a triple (F, E, W) of values as functions of control, assigning a probability triple for fixed values. Obviously these three, F, E, W, depend on C and not vice versa. Still, going back to the initial question: how do we interpret a lack of fit in this case (or, better, in general)?

WWGD · Sep 10, 2016

Just a follow-up: say we are working on the same situation as above: we have a logistic regression of control (C) vs. each of F, E, W (Fraud, Error, Waste).
Say we also assume each of F, E, W to have the same importance. It seems to make sense to logistically regress (binary) C against the arithmetic average:

D := (F+E+W)/3

Any caveats to consider? I am trying to take the cutoff point for this D to be the mean of the means of F, E, W, i.e., any value above this cutoff determines a case (and anything below determines a non-case; in the case of equality we can randomly decide yes or no).
Is this a meaningful and standard way of doing things?
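
The construction described in this post can be sketched as follows. The data values and the function name `composite_cases` are hypothetical; the sketch only shows the mechanics (average the three scores, cut at the mean of the three means, break ties at random) and takes no position on whether dichotomizing is advisable.

```python
import random
from statistics import mean

def composite_cases(f, e, w, rng=None):
    """Return (D values, cutoff, 0/1 case labels) for equally weighted F, E, W."""
    rng = rng or random.Random(0)
    d = [(fi + ei + wi) / 3 for fi, ei, wi in zip(f, e, w)]
    # Mean of the three variable means; equals mean(d) for equal-length inputs.
    cutoff = mean([mean(f), mean(e), mean(w)])
    labels = []
    for di in d:
        if di > cutoff:
            labels.append(1)                  # case
        elif di < cutoff:
            labels.append(0)                  # non-case
        else:
            labels.append(rng.randint(0, 1))  # tie: decide at random
    return d, cutoff, labels

# Hypothetical scores on a 0-to-5 scale for four observations.
f = [1.0, 2.0, 4.0, 3.0]
e = [0.5, 2.5, 3.5, 4.5]
w = [1.5, 1.0, 5.0, 2.5]

d, cutoff, labels = composite_cases(f, e, w)
print("D:", [round(x, 3) for x in d])
print("cutoff:", round(cutoff, 3))
print("labels:", labels)
```

Because the cutoff coincides with the sample mean of D, the split depends on the sample at hand, and two observations just above and just below the cutoff are treated as entirely different outcomes.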

Stephen Tashi · Sep 13, 2016

WWGD said:

i.e., this cutoff point determines a case (and anything below determines a non-case; in the case of an equality we can randomly decide a yes or no.)
Is this a meaningful way and a standard way of doing things?

The consensus in this thread was "no": https://www.physicsforums.com/threa...-regression-coefficients.876363/#post-5520330

Stephen Tashi · Sep 13, 2016

This is the way I visualize the model:

Plot F,E,W on the x,y,z axes. For simplicity, I'll imagine the values scaled so the data points fall inside the unit cube. At each data point (x,y,z) in space there is a value C, which I'll imagine as some sort of "density of matter" or "intensity" of something. You are interested in using the data to estimate the region in 3-D space where this density is "high".

The estimation is done by fitting a function C(x,y,z) to the data which predicts the density at each point in space. Describing the region where C(x,y,z) is "high" is done by picking a function g(x,y,z) that defines a boundary by a rule such as "If g(x,y,z) > 0 then C is "high"; otherwise C is "low"". For example, you might try g(x,y,z) = (x+y+z)/3 - .73, which would separate the "high" and "low" regions by a plane.
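
The planar rule in this example can be written out as a small sketch. The function g and the offset .73 are taken from the paragraph above; the data points, their observed labels, and the helper names are made up for illustration.

```python
def g(x, y, z, offset=0.73):
    """Boundary function from the example: positive on the 'high' side of the plane."""
    return (x + y + z) / 3 - offset

def predicted_level(x, y, z):
    """Apply the rule: g > 0 means 'high', otherwise 'low'."""
    return "high" if g(x, y, z) > 0 else "low"

def misclassification_rate(points, observed_levels):
    """Fraction of points whose observed level disagrees with the planar rule."""
    wrong = sum(
        1
        for (x, y, z), level in zip(points, observed_levels)
        if predicted_level(x, y, z) != level
    )
    return wrong / len(points)

# Hypothetical (F, E, W) points scaled into the unit cube, with observed C levels.
points = [(0.9, 0.8, 0.7), (0.2, 0.3, 0.4), (0.95, 0.9, 0.2), (0.6, 0.7, 0.8)]
observed = ["high", "low", "high", "low"]

rate = misclassification_rate(points, observed)
print("misclassification rate:", rate)
```

In this made-up sample the third point is "high" despite lying on the "low" side of the plane, the kind of disagreement that would show up as lack of fit if the true "high" region is not bounded by a plane.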

If you want to visualize the plausibility of various models, it would be helpful if you use some 3-D visualization software. There are all sorts of ways that data may fail to fit a particular model. For example, the "high" values of C might occur in isolated blobs that aren't well described by a volume with planar sides.
