Why don't the probabilities sum to one in this simple naive Bayes classification?

AI Thread Summary
In a naive Bayes classification example with three fruit classes (Orange, Banana, Other) and three features (Long, Sweet, Yellow), the probabilities P(Other|Long, Sweet, Yellow) and P(Banana|Long, Sweet, Yellow) do not sum to one, even when P(Orange|Long, Sweet, Yellow) is zero. This discrepancy arises from the way probabilities are calculated and the potential for rounding errors. It is crucial to clarify whether features are combined with "and" or "or," as this affects the calculations. The misunderstanding often stems from assumptions about the independence of features and their contribution to class probabilities. Ultimately, the probabilities may not sum to one due to the nature of the naive Bayes model and the specific data used.
Karagoz
Summary:: When we have only three classes (Orange, Banana and Other) and three features (Long, Sweet and Yellow), why is P(Other|Long, Sweet, Yellow) + P(Banana|Long, Sweet, Yellow) not equal to 1 when P(Orange|Long, Sweet, Yellow) = 0?

In this example:
https://towardsdatascience.com/all-about-naive-bayes-8e13cef044cf

The article works through a data set of fruits with different features and shows how to predict the probability that a fruit belongs to each class, given some features. There are similar guides online using similar examples.

There are only 3 classes of fruit: Banana, Orange and Other.
And we have only 3 features: long, sweet and yellow.

P(Orange|Long,Sweet,Yellow) = 0
The probability that the fruit is an orange is zero because the probability of Orange, given that the fruit is long, is zero.

P(Banana|Long, Sweet, Yellow) = 0.25 / 0.27

P(Other|Long, Sweet, Yellow) = 0.01 / 0.27

But: P(Other|Long, Sweet, Yellow) + P(Banana|Long, Sweet, Yellow) = 0.25/0.27 + 0.01/0.27 = 0.26/0.27 < 1

If the features are given as "long, sweet and yellow", the fruit cannot be an orange. It must be either a banana or "other".

But why is P(Other|Long, Sweet, Yellow) + P(Banana|Long, Sweet, Yellow) not equal to 1?
Shouldn't the two probabilities add up to exactly 1?
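
For reference, a sketch of why the posteriors have to sum to one, assuming the 0.27 in the denominators stands for the evidence P(Long, Sweet, Yellow) and that the commas mean "and". The symbols ##c## (a class) and ##x_i## (a feature) are notation introduced here, not taken from the linked article. Under the naive Bayes assumption:

$$P(c \mid x_1, x_2, x_3) = \frac{P(c)\prod_{i=1}^{3} P(x_i \mid c)}{\sum_{c'} P(c')\prod_{i=1}^{3} P(x_i \mid c')}$$

Summing over all classes, the numerators add up to the denominator:

$$\sum_{c} P(c \mid x_1, x_2, x_3) = \frac{\sum_{c} P(c)\prod_{i=1}^{3} P(x_i \mid c)}{\sum_{c'} P(c')\prod_{i=1}^{3} P(x_i \mid c')} = 1$$

Under that assumption the exact numerators for Banana, Orange and Other must add up to the exact denominator, so a total of 0.26/0.27 can only come from numbers that were shortened before being added.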

[Moderator's note: moved from a technical forum.]
 
Karagoz said:
Summary:: When we have only three classes (Orange, Banana and Other) and three features (Long, Sweet and Yellow), why is P(Other|Long, Sweet, Yellow) + P(Banana|Long, Sweet, Yellow) not equal to 1 when P(Orange|Long, Sweet, Yellow) = 0?

In this example:
https://towardsdatascience.com/all-about-naive-bayes-8e13cef044cf
First, do not use undefined commas in a probability like "Long, Sweet, Yellow". Do you mean "and" or "or"?
Second, P(Orange|Long or Sweet or Yellow) is not equal to 0.
 
That's just a rounding error.
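
A minimal sketch of how rounding alone could produce the gap, assuming the 0.27 is the evidence term obtained by summing the three class scores. The counts below are hypothetical: they are not given in this thread and were picked so that the shortened scores reproduce the 0.25, 0.01 and 0.27 quoted above. The helper names `joint_score` and `truncate` are illustrative only.

```python
import math

# Hypothetical training counts (an assumption, not taken from this thread).
counts = {
    "Banana": {"total": 500, "Long": 400, "Sweet": 350, "Yellow": 450},
    "Orange": {"total": 300, "Long": 0, "Sweet": 150, "Yellow": 300},
    "Other": {"total": 200, "Long": 100, "Sweet": 150, "Yellow": 50},
}
features = ("Long", "Sweet", "Yellow")
n_fruits = sum(c["total"] for c in counts.values())


def joint_score(cls):
    """P(class) times the product of P(feature | class), the naive Bayes numerator."""
    c = counts[cls]
    score = c["total"] / n_fruits
    for f in features:
        score *= c[f] / c["total"]
    return score


def truncate(x, places=2):
    """Cut x to a fixed number of decimals without rounding up."""
    factor = 10 ** places
    return math.floor(x * factor) / factor


# Exact numerators, and the evidence as their sum over all classes.
scores = {cls: joint_score(cls) for cls in counts}
evidence = sum(scores.values())
posteriors = {cls: s / evidence for cls, s in scores.items()}

# The same computation after shortening every number to two decimals.
truncated = {cls: truncate(s) for cls, s in scores.items()}
truncated_total = sum(truncated.values()) / truncate(evidence)

print("exact posteriors:", posteriors)
print("exact total:", sum(posteriors.values()))
print("two-decimal scores:", truncated, "evidence:", truncate(evidence))
print("two-decimal total:", truncated_total)
```

With these assumed counts the exact posteriors add up to 1, while the two-decimal version gives 0.26/0.27, the same shortfall as in the question. If the article computed its denominator some other way, for example as P(Long) P(Sweet) P(Yellow), the totals need not match and this sketch would not apply.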
 