The probability value obtained from a probit model cannot be negative. The probit variable (latent variable, usually denoted y) itself can well be negative.
The probability of an outcome as estimated with a probit model cannot be negative because it is computed from a proper probability distribution (the normal, also called Gaussian, distribution). By definition, probabilities cannot be negative.
But suppose I want to estimate a "probability of success" from the data and I don't have the luxury of using a probit model (perhaps because it is too time-consuming and I only need a quick estimate). In that case I could apply OLS to my data and hope that it gives me somewhat sensible results. The danger is that, since OLS fits a straight line, nothing keeps the "probability estimate" from becoming negative (or, for that matter, greater than 1). If all someone needs is a quick estimate, they must live with that consequence.
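To make the danger concrete, here is a minimal sketch of OLS applied to a 0/1 outcome. The data are made up purely for illustration, and the helper name `ols_fit` is my own; it uses the textbook one-regressor least-squares formulas rather than any particular statistics package.

```python
# Toy data (invented for illustration): one regressor x and a 0/1 outcome y.
xs = [0, 1, 2, 3, 4, 5, 6, 7]
ys = [0, 0, 0, 0, 1, 1, 1, 1]

def ols_fit(xs, ys):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

a, b = ols_fit(xs, ys)

# The fitted values are what one would read as "probability estimates".
fitted = [a + b * x for x in xs]
for x, p in zip(xs, fitted):
    print(f"x = {x}: fitted 'probability' = {p:+.3f}")
```

On this toy data the fitted line gives about -0.17 at x = 0 and about 1.17 at x = 7, so the quick OLS estimate leaves the [0, 1] range at both ends.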
On the other hand, if someone cannot accept a "probability" being negative (contrary to everything that one is hopefully taught in a probability course), then they must go with the more complicated probit model. Unlike OLS, the probit curve (the normal CDF, i.e. the cumulative area under the bell curve) has a suitable nonlinear S shape that prevents negative probability estimates.
If you have read Saint-Exupéry's The Little Prince, you should remember the boa constrictor that swallowed an elephant; in outline it looked like a hat or a bell. A normal distribution also looks like that (it is also called the bell curve); see the left-hand graph at the following link.
http://mathworld.wolfram.com/NormalDistribution.html
You should mentally "erase" the "x" on the horizontal axis in the graph and put a "y" in its place. The horizontal axis is the latent variable y = a + b_1 x_1 + ... + u.
As you can see, the horizontal axis extends to both sides of zero (at the vertical line in the middle). So, just like an OLS prediction, the probit variable y can be negative. But y is not a probability estimate. The probability estimate is the area under the bell. For example, if the predicted value is 0, then the probability is Prob(y < 0) = 0.5, because exactly half of the area under the bell lies to the left of zero (the origin). No matter which value your model predicts (depending on your x), the area under the bell curve up to that value (say, y*) will always be positive. For a very negative value (e.g. y* = -1,000,000), that probability will be very, very small, but still positive. As y* gets very positive (e.g. y* = +1,000,000), the probability will be practically all the area under the curve. By definition (of a probability distribution), that total area is equal to 1.
So a probit model predicts 0 < Prob(y < y*) < 1 for all y* (-∞ < y* < +∞).
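If you want to check the "area under the bell" numerically, here is a small sketch. It assumes the standard normal distribution (mean 0, variance 1), which is the usual probit convention, and computes the area to the left of a given value with Python's `math.erfc`. The function name `normal_cdf` and the sample values are my own choices for illustration.

```python
import math

def normal_cdf(y_star):
    """Area under the standard normal bell curve to the left of y_star."""
    return 0.5 * math.erfc(-y_star / math.sqrt(2.0))

# Evaluate the area at a few negative, zero, and positive values.
for y_star in [-5.0, -1.0, 0.0, 1.0, 5.0]:
    p = normal_cdf(y_star)
    print(f"y* = {y_star:+.1f}: Prob(y < y*) = {p:.7f}")
```

Every value printed lies strictly between 0 and 1, and the value at 0 is one half. One caveat: for truly extreme inputs such as -1,000,000, floating-point arithmetic rounds the result to exactly 0 (or to exactly 1 on the positive side), even though the mathematical probability is strictly inside the interval.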