- #1
demander
If this is in the wrong place, please move it; I was in doubt about where to post this.
My problem is the following.
I have to create a decision tree that classifies the rows of this table, where Y is the class and A, B, C are the attributes.
The tree must be built with the ID3 algorithm.
ID3 uses information theory to determine the most informative attribute to branch on at each step.
A |B |C |Y
0 |0 |0 |0
0 |1 |0 |1
1 |0 |0 |1
1 |1 |1 |0
The principle of the ID3 algorithm is that the root of the tree must be the attribute with the highest information gain, which is computed from the entropy. Here are the formulas needed:
http://img828.imageshack.us/img828/5199/entropyw.jpg
http://img217.imageshack.us/img217/9562/gain.jpg
Starting with the class entropy: there are only two possible classes, 0 and 1, and each occurs twice in the table, so the entropy is
E(C) = -(2/4) x log2(2/4) - (2/4) x log2(2/4) = -(1/2) x (-1) - (1/2) x (-1) = 1
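The entropy value above can be checked with a few lines of Python. This is only a minimal sketch; the names `entropy`, `y` and `class_entropy` are illustrative and not part of the exercise.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


y = [0, 1, 1, 0]  # the Y column of the table, top to bottom
class_entropy = entropy(y)
print(class_entropy)
```

With two classes that are equally frequent, the result is the maximum possible value for a binary class, which matches the hand calculation.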
Now calculating the gain for each possible root:
Gain(A) = 1 - (1/2 x 1 + 1/2 x 1) = 0
The 1/2 is there because A is 0 or 1 with probability one half each, and the 1 is there because the class entropy given A is at its maximum: if A is 0 the class can be 0 or 1, and if A is 1 the class can also be 0 or 1.
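Doing the same for B and C, I got:
Gain(A) = 0
Gain(B) = 0
Gain(C) = 1 - 0.69 ≈ 0.31, where 0.69 is the weighted entropy left after splitting on C (3/4 x 0.918 + 1/4 x 0). This is bigger because if C is 0 the class is more likely to be 1 than 0, and if C is 1 the class is always 0.
Since C has the highest gain, it becomes the root.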
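The three root gains can be recomputed directly from the table. The sketch below is an assumption-laden illustration, not a reference ID3 implementation: `ROWS`, `COLUMN`, `entropy`, `gain` and `root_gains` are names made up here. The gain is the class entropy minus the weighted entropy of the groups produced by the split; for C that weighted entropy is about 0.69.

```python
from collections import Counter, defaultdict
from math import log2

# The table from the exercise, one tuple per row: (A, B, C, Y)
ROWS = [
    (0, 0, 0, 0),
    (0, 1, 0, 1),
    (1, 0, 0, 1),
    (1, 1, 1, 0),
]
COLUMN = {"A": 0, "B": 1, "C": 2}  # attribute name -> position in a row


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def gain(rows, attribute):
    """Information gain of splitting `rows` on `attribute` (class is the last field)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[COLUMN[attribute]]].append(row[-1])
    # Weighted average entropy of the class inside each group after the split
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy([row[-1] for row in rows]) - remainder


root_gains = {name: gain(ROWS, name) for name in COLUMN}
for name, value in root_gains.items():
    print(f"Gain({name}) = {value:.3f}")
```

Taking `max(root_gains, key=root_gains.get)` then gives the attribute that would be placed at the root.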
So the tree should start like this under the ID3 algorithm:
http://img529.imageshack.us/img529/8645/initialtree.jpg
Here is where I got stuck. The second branch should be whichever of A or B has the higher information gain. But on the C = 0 side, if A (or B) is 1 then the class is 1, while if A (or B) is 0 the class can again be 0 or 1.
So for the second branch I get Gain(A) = Gain(B) = 0, i.e. no information gain.
So what should I choose for the second branch?
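The second-level numbers can be checked the same way, by restricting the table to the rows with C = 0 and recomputing the gain for A and B on that subset only. This is a sketch under the same assumptions as the hand calculation; `ROWS`, `COLUMN`, `entropy`, `gain`, `subset` and `second_gains` are illustrative names. On these three rows it reports equal gains of roughly 0.25 for A and B rather than 0, so the zero stated above may deserve a second look; the tie between A and B remains either way.

```python
from collections import Counter, defaultdict
from math import log2

# The table from the exercise, one tuple per row: (A, B, C, Y)
ROWS = [
    (0, 0, 0, 0),
    (0, 1, 0, 1),
    (1, 0, 0, 1),
    (1, 1, 1, 0),
]
COLUMN = {"A": 0, "B": 1, "C": 2}  # attribute name -> position in a row


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def gain(rows, attribute):
    """Information gain of splitting `rows` on `attribute` (class is the last field)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[COLUMN[attribute]]].append(row[-1])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy([row[-1] for row in rows]) - remainder


# Only the rows that reach the C = 0 branch of the root
subset = [row for row in ROWS if row[COLUMN["C"]] == 0]
second_gains = {name: gain(subset, name) for name in ("A", "B")}
for name, value in second_gains.items():
    print(f"Gain({name} | C=0) = {value:.3f}")
```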
I think the most likely outcome is that the algorithm picks the most probable class and cuts the tree at the second branch. Since class 1 is the more probable one there, maybe it finishes the decision tree like this:
http://img411.imageshack.us/img411/4914/finaltree.jpg
Am I right? Or, despite there being no gain, does ID3 still split on one of the two attributes, resulting in a complete tree like this?
http://img545.imageshack.us/img545/7097/completetree.jpg
So which tree is more plausible for the ID3 algorithm? Can someone help me with this exercise?
Thanks for any help.