## Main Question or Discussion Point

I am trying to create a simple implementation of the Bayes decision rule with minimum error criterion and I am running into a problem. Specifically, if I have a data set consisting of a number of feature vectors stored in rows, how can I generate a probability density function from this data?

Also, how can I do this if some of the data is discrete, some is continuous, and some is missing? For example, let us assume each feature vector, x, has three elements.

x = [a, b, c]

where:

a is categorical data and will be an element of the set {0, 1, 2, 3}

b is continuous data and will be in the range [0,1]

c is also continuous data in the range [0,1], but may be missing for some feature vectors

I want to be able to calculate the likelihood of a feature vector, x, based on the total data set or given that x is from a subset, w, of the total data set.

p(x) = ? and p(x|w) = ?
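
To make the question concrete, here is a minimal sketch of one possible approach, not a definitive answer. It assumes that b and c are conditionally independent given a, so that p(x) = P(a) p(b|a) p(c|a), and that c is missing at random, so a missing c can simply be marginalized out by dropping its factor. The function names (`fit_density`, `density`, `make_kde`, `is_missing`), the bandwidth of 0.1, the Laplace smoothing of P(a), and the uniform fallback for empty categories are all hypothetical choices for illustration. A plain Gaussian kernel also leaks some probability mass outside [0,1], which this sketch ignores.

```python
import math

def is_missing(value):
    """Treat None or NaN as a missing entry (an assumed encoding)."""
    return value is None or (isinstance(value, float) and math.isnan(value))

def make_kde(samples, bandwidth):
    """1-D Gaussian kernel density estimate built from the given samples."""
    samples = list(samples)
    if not samples:
        # Assumption: with no samples, fall back to a uniform density on [0, 1].
        return lambda value: 1.0
    norm = len(samples) * bandwidth * math.sqrt(2.0 * math.pi)
    def pdf(value):
        return sum(math.exp(-0.5 * ((value - s) / bandwidth) ** 2)
                   for s in samples) / norm
    return pdf

def fit_density(rows, categories=(0, 1, 2, 3), bandwidth=0.1):
    """Fit P(a), p(b|a) and p(c|a) from rows of the form [a, b, c]."""
    rows = list(rows)
    model = {}
    for a in categories:
        group = [r for r in rows if r[0] == a]
        # Laplace-smoothed relative frequency of category a.
        prior = (len(group) + 1) / (len(rows) + len(categories))
        pdf_b = make_kde([r[1] for r in group], bandwidth)
        # Rows with a missing c still contribute to P(a) and p(b|a).
        pdf_c = make_kde([r[2] for r in group if not is_missing(r[2])], bandwidth)
        model[a] = (prior, pdf_b, pdf_c)
    return model

def density(model, x):
    """Evaluate P(a) * p(b|a) * p(c|a); the c factor is dropped if c is missing."""
    a, b, c = x
    prior, pdf_b, pdf_c = model[a]
    p = prior * pdf_b(b)
    if not is_missing(c):
        p *= pdf_c(c)
    return p

# Made-up example rows; None marks a missing c.
data = [[0, 0.20, 0.30], [0, 0.25, None], [1, 0.80, 0.70], [2, 0.50, 0.50]]
model = fit_density(data)
p_x = density(model, [0, 0.22, None])

# p(x|w): fit the same model on only the rows that belong to subset w.
subset_w = data[:2]
p_x_given_w = density(fit_density(subset_w), [0, 0.22, None])
```

Under these assumptions, p(x|w) is obtained by fitting the same model on the rows of w alone. Note that when c is missing, the returned value is a density over (a, b) only, so it is not directly comparable with values computed from complete vectors.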

I have also posted this on Stack Exchange Mathematics, here:

http://math.stackexchange.com/quest...sity-function-from-a-set-of-multivariate-data

I would really appreciate it if someone could help me out or point me in the right direction!
