Why doesn't the sum equal one in this simple naive Bayes classification?

  • #1
Karagoz
Summary: When we have only three classes (Orange, Banana and Other) and three features (Long, Sweet and Yellow), why is P(Other|Long, Sweet, Yellow) + P(Banana|Long, Sweet, Yellow) not equal to 1 when P(Orange|Long, Sweet, Yellow) = 0?

In this example:
https://towardsdatascience.com/all-about-naive-bayes-8e13cef044cf

There's an example of data on fruits with different features, showing how to predict the probability that a fruit belongs to a given class from its features. There are similar guides online using similar examples.

There are only 3 classes of fruits. Banana, Orange and Other.
And we have only 3 features; long, sweet and yellow.

P(Orange|Long,Sweet,Yellow) = 0
The probability that the fruit is an Orange is zero because no orange in the data is long: P(Long|Orange) = 0, which makes the whole product for the Orange class zero.

P(Banana|Long, Sweet, Yellow) = 0.25 / 0.27

P(Other|Long, Sweet, Yellow) = 0.01 / 0.27

But: P(Other|Long, Sweet, Yellow) + P(Banana|Long, Sweet, Yellow) = 0.25/0.27 + 0.01/0.27 = 0.26/0.27 < 1

If the features are given as "long, sweet and yellow", it's impossible for the fruit to be an orange; it must be either a banana or "other".

But why is P(Other|Long, Sweet, Yellow) + P(Banana|Long, Sweet, Yellow) not equal to 1?
Shouldn't the two remaining probabilities sum to 1, i.e. P(Other|Long, Sweet, Yellow) + P(Banana|Long, Sweet, Yellow) = 1?
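For concreteness, here is a minimal sketch using only the rounded scores quoted above, showing that dividing by the actual total of the scores (rather than the separately rounded 0.27) restores a proper distribution:

```python
# Rounded class scores from the question (the numerator of each quotient):
score_banana, score_other, score_orange = 0.25, 0.01, 0.0

# The question divides by 0.27, itself a rounded value, which is why
# 0.25/0.27 + 0.01/0.27 = 26/27 falls short of 1.
# Dividing by the actual total of the scores renormalizes them:
total = score_banana + score_other + score_orange  # 0.26
p_banana = score_banana / total
p_other = score_other / total

print(abs((p_banana + p_other) - 1.0) < 1e-12)  # True
```

Dividing each class score by the sum of all the scores is the standard normalization step; the quoted 0.27 denominator was itself rounded, which is where the gap comes from.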

[Moderator's note: moved from a technical forum.]
 
  • #2
Karagoz said:
Summary: When we have only three classes (Orange, Banana and Other) and three features (Long, Sweet and Yellow), why is P(Other|Long, Sweet, Yellow) + P(Banana|Long, Sweet, Yellow) not equal to 1 when P(Orange|Long, Sweet, Yellow) = 0?

In this example:
https://towardsdatascience.com/all-about-naive-bayes-8e13cef044cf
First, do not use undefined commas in a probability like P(Orange|Long, Sweet, Yellow). Do you mean "and" or "or"?
Second, P(Orange|Long or Sweet or Yellow) is not 0.
 
  • #3
That's just a rounding error.
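A minimal sketch with exact fractions makes the rounding visible. The likelihood table below is an assumption chosen so that the products reproduce the rounded numbers quoted in the thread (0.25, 0.01 and 0.27); the actual counts are in the linked article:

```python
from fractions import Fraction

# Assumed (illustrative) priors and per-feature likelihoods:
#   P(Banana)=1/2, P(Long|Banana)=4/5, P(Sweet|Banana)=7/10, P(Yellow|Banana)=9/10
#   P(Other)=1/5,  P(Long|Other)=1/2,  P(Sweet|Other)=3/4,   P(Yellow|Other)=1/4
#   P(Orange)=3/10 with P(Long|Orange)=0, so its score is 0
banana = Fraction(1, 2) * Fraction(4, 5) * Fraction(7, 10) * Fraction(9, 10)  # 63/250 = 0.252
other = Fraction(1, 5) * Fraction(1, 2) * Fraction(3, 4) * Fraction(1, 4)     # 3/160  = 0.01875
orange = Fraction(0)

evidence = banana + other + orange  # 1083/4000 = 0.27075

# With exact arithmetic the posteriors sum to exactly 1:
assert banana / evidence + other / evidence == 1

# Truncating every score to two decimals (0.25, 0.01, 0.27) breaks that:
def trunc2(x):
    return Fraction(int(x * 100), 100)

approx = trunc2(banana) / trunc2(evidence) + trunc2(other) / trunc2(evidence)
print(approx)  # 26/27 -- the "missing" 1/27 is pure rounding error
```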
 

1. Why is the sum of probabilities not equal to one in naive Bayes classification?

In naive Bayes, each class score is computed as P(class) · ∏ P(feature|class), relying on the naive assumption that the features are conditionally independent. These scores are only proportional to the true posterior probabilities: they must be divided by the evidence (the sum of the scores over all classes) to become probabilities. If the scores or the evidence are rounded along the way, or the independence assumption misestimates the joint likelihoods, the reported quotients can sum to less than one.

2. How does the naive assumption affect the sum of probabilities in naive Bayes classification?

The naive assumption causes the likelihood of each class to be computed as a product of per-feature probabilities rather than as the true joint probability of all the features. When the features are correlated, this product can over- or underestimate the joint likelihood for every class, so the unnormalized scores carry no guarantee of summing to one.
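As a toy illustration with made-up data (not from the article): within a single class, take two perfectly correlated binary features A and B. The naive per-feature product then underestimates the true joint frequency:

```python
# Toy data: within one class, feature A equals feature B on every sample,
# so P(A=1, B=1 | class) = 0.5, but the naive product is 0.5 * 0.5 = 0.25.
samples = [(1, 1), (1, 1), (0, 0), (0, 0)]  # (A, B) pairs within one class

p_a = sum(a for a, _ in samples) / len(samples)  # 0.5
p_b = sum(b for _, b in samples) / len(samples)  # 0.5
p_joint = sum(1 for a, b in samples if a == 1 and b == 1) / len(samples)  # 0.5

print(p_a * p_b)  # 0.25 -- naive independence estimate
print(p_joint)    # 0.5  -- true joint frequency
```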

3. Can the sum of probabilities be greater than one in naive Bayes classification?

In exact arithmetic, no: each class score is divided by the evidence, which is the sum of all the class scores, so the resulting quotients sum to exactly one and no single one can exceed it. Only rounding of intermediate values can push the reported sum slightly above (or, as in this thread, below) one.

4. How does the sum of probabilities affect the accuracy of naive Bayes classification?

A reported sum well below one usually signals rounding or an inconsistent normalization in the calculation rather than a property of the model itself. More broadly, naive Bayes can rank the classes correctly, and therefore classify accurately, even when the independence assumption fails and its probability estimates are poorly calibrated; a sum close to one mainly tells you that the arithmetic was carried out consistently.

5. Can the sum of probabilities be used to evaluate the performance of naive Bayes classification?

The sum of probabilities alone cannot be used to evaluate the performance of naive Bayes classification. It is important to consider other metrics such as accuracy, precision, and recall to fully evaluate the performance of the model. The sum of probabilities can provide insight into the validity of the naive assumption and help identify potential issues with the model, but it should not be the sole factor in evaluating performance.
