I am trying to create a simple implementation of the Bayes decision rule with the minimum-error criterion, and I am running into a problem. Specifically, if I have a data set consisting of a number of feature vectors stored in rows, how can I estimate a probability density function from this data?
Also, how can I do this if some of the data is discrete, some is continuous, and some is missing? For example, let us assume each feature vector, x, has three elements.
x = [ a, b, c]
where:
a is categorical data and will be an element of the set {0, 1, 2, 3}
b is continuous data and will be in the range [0,1]
c is also continuous data in the range [0,1], but may be missing for some feature vectors
I want to be able to calculate the likelihood of a feature vector, x, based on the total data set or given that x is from a subset, w, of the total data set.
p(x) = ? and p(x|w) = ?
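As a rough sketch of one way this could be set up (not a definitive answer), the code below assumes the three features are independent, so that p(x) = P(a) * p(b) * p(c). P(a) is estimated from smoothed counts, and p(b) and p(c) from a 1-D Gaussian kernel density estimate. A missing c is represented as None and its factor is simply left out, which amounts to marginalizing over c and is only reasonable if values are missing at random. The function names, the bandwidth of 0.1, the smoothing constant alpha, and the sample data are all made up for illustration. Note also that Gaussian kernels put some probability mass outside [0,1].

```python
import math


def gaussian_kde_1d(samples, bandwidth):
    """Return a function that estimates a 1-D density with Gaussian kernels."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def pdf(v):
        return norm * sum(
            math.exp(-0.5 * ((v - s) / bandwidth) ** 2) for s in samples
        )

    return pdf


def fit_density(rows, categories=(0, 1, 2, 3), bandwidth=0.1, alpha=1.0):
    """Fit p(x) = P(a) * p(b) * p(c), assuming independent features.

    rows is a list of (a, b, c) tuples; c may be None when it is missing.
    """
    n = len(rows)
    counts = {k: 0 for k in categories}
    for a, _, _ in rows:
        counts[a] += 1
    # Additive (Laplace) smoothing so unseen categories keep nonzero mass.
    p_a = {k: (counts[k] + alpha) / (n + alpha * len(categories))
           for k in categories}

    pdf_b = gaussian_kde_1d([b for _, b, _ in rows], bandwidth)
    # Only rows where c was actually observed contribute to p(c).
    c_observed = [c for _, _, c in rows if c is not None]
    pdf_c = gaussian_kde_1d(c_observed, bandwidth) if c_observed else None

    def likelihood(x):
        a, b, c = x
        value = p_a[a] * pdf_b(b)
        # A missing c is marginalized out: its factor integrates to 1.
        if c is not None and pdf_c is not None:
            value *= pdf_c(c)
        return value

    return likelihood


# Made-up example data: (a, b, c) rows and a subset label for each row.
data = [(0, 0.20, 0.70), (1, 0.35, None), (0, 0.25, 0.65),
        (3, 0.90, 0.10), (2, 0.55, None), (1, 0.40, 0.50)]
labels = ["w1", "w2", "w1", "w2", "w2", "w1"]

p_x = fit_density(data)                       # p(x) from the whole data set
p_x_given_w1 = fit_density(                   # p(x|w) from the subset w1 only
    [row for row, w in zip(data, labels) if w == "w1"])

print(p_x((0, 0.22, 0.68)))
print(p_x_given_w1((0, 0.22, None)))
```

If the independence assumption is too strong, the same idea can be applied per category of a (one joint estimate of b and c for each value of a), at the cost of needing more data in each group.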
I have also posted this on Stack Exchange Mathematics, here:
http://math.stackexchange.com/quest...sity-function-from-a-set-of-multivariate-data
I would really appreciate it if someone could help me out or point me in the right direction!