Why are probits sometimes negative?

  • Thread starter isabella
  • Start date
  • Tags
    Negative
In summary, a probit model is used to estimate probabilities from a data set, but the probability value itself cannot be negative due to the nature of the normal distribution. This is in contrast to using OLS, which can result in negative probability estimates. The probit variable itself can be negative, but it is not a probability estimate. Instead, the probability estimate is the area under the bell curve of the normal distribution, which can never be negative.
  • #1
isabella
Can anyone here explain to me why a probit cannot be negative? Why are probits sometimes negative?
 
  • #2
The probability value obtained from a probit model cannot be negative. The probit variable (latent variable, usually denoted y) itself can well be negative.

The probability of an outcome as estimated with a probit model cannot be negative because it is derived from a proper distribution (Normal or Gaussian). By definition probabilities cannot be negative.

But suppose I want to estimate a "probability of success" from the data and I don't have the luxury of using a probit model (perhaps because it is too time-consuming and I only need a quick estimate). In that case I could apply OLS to my data and hope that it gives somewhat sensible results. The danger is that, since OLS is a linear model, nothing keeps the "probability estimate" from becoming negative (or from exceeding 1). But if all someone needs is a quick estimate, they must live with the consequences.

On the other hand, if someone cannot accept a "probability" being negative (contrary to everything that one is hopefully taught in a probability course), then they must go with the more complicated probit model. In contrast to OLS, the probit curve (the normal CDF, i.e. the cumulative area under the normal bell curve) has a suitable nonlinear S-shape that prevents negative probability estimates.
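To illustrate the contrast between a straight OLS line and the normal CDF, here is a minimal sketch in Python. The toy data set and the helper name `ols_fit` are hypothetical (they are not from this thread), and the second number is not a fitted probit model; it only shows that pushing a linear index through the normal CDF keeps the result between 0 and 1.

```python
from statistics import NormalDist

# Hypothetical toy data (not from the thread): dosage x and a 0/1 outcome.
xs = [0, 1, 2, 3, 4, 5, 6, 7]
ys = [0, 0, 0, 0, 1, 1, 1, 1]

def ols_fit(xs, ys):
    """Return (intercept, slope) of the least-squares line through the data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

intercept, slope = ols_fit(xs, ys)

# OLS "probability estimate" at the lowest dosage in the sample (x = 0).
ols_at_zero = intercept + slope * 0

# The same linear index pushed through the standard normal CDF.
# This is NOT a fitted probit model; it only shows that the CDF
# maps any real-valued index into the interval (0, 1).
cdf_at_zero = NormalDist().cdf(intercept + slope * 0)

print(ols_at_zero)   # negative, so not a usable probability
print(cdf_at_zero)   # strictly between 0 and 1
```

With this toy data the OLS line already dips below zero inside the sample range, which is exactly the danger described above.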

If you read Saint-Exupery's The Little Prince, you should remember the snake that swallowed an elephant; it looked like a cross-sectional hat or bell. A normal distribution also looks like that (it is also called the bell curve); see the left-hand side graph in the following link.

http://mathworld.wolfram.com/NormalDistribution.html

You should mentally "erase" the "x" on the horizontal axis in the graph and put a "y" in its place. The horizontal axis is the latent variable y = a + b1x1 + ... + u.

As you can see, the horizontal axis extends to both sides of zero (the vertical line in the middle). So, just as with OLS, the probit variable y can be negative. But y is not a probability estimate. The probability estimate is the area under the bell to the left of the predicted value. For example, if the model predicts y* = 0, then the probability is Prob(y < 0) = 0.5, because exactly half of the area under the bell lies to the left of zero (the origin). No matter which value y* your model predicts (depending on your x), the area under the bell curve up to y* will always be positive. For a very negative value (e.g. y* = -1,000,000), that probability will be very, very small, but still positive. As y* gets very positive (e.g. y* = +1,000,000), the probability will be practically all the area under the curve, which by definition (of a probability distribution) is equal to 1.

So a probit model predicts 0 < Prob(y < y*) < 1 for all y* [itex]( -\infty < y^* < +\infty )[/itex].
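The bound can be checked numerically. The sketch below uses only Python's standard library; the function name `normal_cdf` is my own label. One caveat: in floating-point arithmetic, extreme inputs such as ±1,000,000 round to exactly 0 or 1 even though the mathematical value lies strictly between them, so moderate values are used here.

```python
import math

def normal_cdf(y):
    """Standard normal CDF, Prob(Z < y), via the complementary error function."""
    return 0.5 * math.erfc(-y / math.sqrt(2))

# Probabilities for a few values of the latent variable y*.
probabilities = {y_star: normal_cdf(y_star) for y_star in (-5.0, -0.5, 0.0, 0.5, 5.0)}

for y_star, p in probabilities.items():
    # Negative y* gives p < 0.5, positive y* gives p > 0.5, never p < 0.
    print(y_star, p)
```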
 
  • #3
To continue,

For a picture of a snake swallowing an elephant whole and how it looks afterwards, click here. This picture looks less like a bell curve of the normal distribution, because the bell curve has only a single peak. In statistics, a distribution with a single peak is called a unimodal distribution (like the bell curve). A distribution that looks like this picture (having two peaks) would be called a bimodal distribution.

Here's a specific probit example:

Suppose in a medical trial for a new drug, the outcome is coded "1" for "success" and "0" for "failure." When the outcome variable was regressed against another variable showing the administered dosage level (x), the probit equation was estimated as y = -1.5 + 0.5 x, where y is the probit (latent) variable. At dosage level x* = 2, what is the predicted probability of success? What is the predicted outcome?

First calculate y* = -1.5 + 0.5 x* = -1.5 + 1 = -0.5.

Next, look up Prob(y < -0.5). You can either look it up in a standard normal distribution table or calculate it with software. For example, in Excel, go to a blank cell and type:

=NORMDIST(-0.5,0,1,TRUE)

then press enter. The cell value should display the value 0.30853753. So the probability of success for x* = 2 is approximately 0.31. Since this value is less than 0.5, the predicted outcome is "failure."
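The same calculation can be sketched in Python with the standard library's `statistics.NormalDist`. The function name `predicted_probability` is hypothetical, and classifying a probability of exactly 0.5 as "success" is my assumption, since the example above only covers values below 0.5.

```python
from statistics import NormalDist

def predicted_probability(x, intercept=-1.5, slope=0.5):
    """Probit prediction: standard normal CDF of the latent index y* = a + b*x."""
    y_star = intercept + slope * x
    return NormalDist().cdf(y_star)

p = predicted_probability(2)  # y* = -1.5 + 0.5 * 2 = -0.5

# Assumption: a probability of exactly 0.5 is classified as "success".
outcome = "success" if p >= 0.5 else "failure"

print(round(p, 4), outcome)  # 0.3085 failure
```

Note that the latent value y* is negative here while the probability is positive, which is the whole point of this thread.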
 

What is a probit?

A probit model is a type of statistical model used to analyze the relationship between a binary dependent variable and one or more independent variables. It is often used in the social sciences to predict the probability of a certain event or outcome based on a set of variables. The probit itself is the value on the latent scale: the inverse of the standard normal cumulative distribution function applied to that probability.

Why are probits sometimes negative?

A probit is the value of the inverse of the standard normal cumulative distribution function at a given probability, and that inverse takes negative values for probabilities below 0.5. In a probit model, this happens whenever the independent variables push the latent index below zero, which corresponds to a predicted probability of less than 0.5. The probability itself is never negative.

Are negative probits valid?

Yes, negative probits are valid. A negative probit corresponds to a probability below 0.5, and the more negative the probit, the lower the probability of the event occurring.
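As a rough illustration of how probits and probabilities map onto each other, the sketch below uses `statistics.NormalDist` from Python's standard library. The function name `probit` and the example probability 0.25 are my own choices for illustration.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

def probit(p):
    """Inverse of the standard normal CDF: the latent value whose CDF equals p."""
    return std_normal.inv_cdf(p)

z_low = probit(0.25)   # probability below 0.5, so the probit is negative
z_high = probit(0.75)  # probability above 0.5, so the probit is positive

# Converting the negative probit back recovers the original probability.
p_back = std_normal.cdf(z_low)

# Symmetry of the bell curve: flipping the sign of a probit turns p into 1 - p.
p_flipped = std_normal.cdf(-z_low)

print(z_low, z_high, p_back, p_flipped)
```

Because flipping the sign of a probit changes the implied probability from p to 1 - p, the sign carries real information and cannot simply be dropped.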

Can negative probits be transformed into positive values?

Not by taking the absolute value: that changes the meaning of the probit, because it turns a probability p below 0.5 into 1 - p. No transformation is necessary, since a negative probit is valid and meaningful as it stands.

What factors can cause negative probits?

A negative probit arises whenever the predicted probability is below 0.5. This can happen, for example, when an independent variable with a negative coefficient takes a large value, or when the event is rare so that most predicted probabilities are small. A negative probit is not in itself a sign of an error in data collection or analysis.
