Probably a silly question about tree diagrams

  • Context: High School 
  • Thread starter: etotheipi
  • Tags: Diagrams, Tree

Discussion Overview

The discussion revolves around the interpretation and representation of tree diagrams in probability theory, specifically focusing on the events represented by nodes and the implications of labeling in these diagrams. Participants explore the accuracy of different representations and the conventions used in probability notation, considering both theoretical and pedagogical aspects.

Discussion Character

  • Exploratory
  • Debate/contested
  • Conceptual clarification
  • Mathematical reasoning

Main Points Raised

  • One participant suggests that each node in a tree diagram corresponds to an event, with the probability at any node being the product of probabilities along the path from the root, but questions the accuracy of this for dependent events.
  • Another participant argues that the left tree diagram implies the same meaning as the right, suggesting that full expressions at every node are unnecessary for clarity.
  • Some participants express confusion over the labeling of nodes, particularly whether the top right node should represent the intersection of events or just one event.
  • There is a discussion about the appropriateness of using set notation versus event notation, with some preferring the latter for clarity in probability contexts.
  • One participant mentions the potential distinction between events as sets and events as occurrences, raising questions about the conventions used in probability theory.
  • Another participant provides an example involving coin tosses to illustrate the concept of events and their intersections, questioning the labeling of nodes in the diagrams.
  • Several participants acknowledge the complexity of defining events and outcomes within the framework of probability theory, noting the potential for confusion in notation.

Areas of Agreement / Disagreement

Participants express differing views on the interpretation of tree diagrams and the labeling of nodes, indicating that there is no consensus on the best approach or notation. The discussion remains unresolved regarding the clarity and correctness of the representations used.

Contextual Notes

Participants highlight the limitations of using simplified notations in tree diagrams, especially in cases involving dependent events, and the potential for confusion when distinguishing between events and sets.

etotheipi
Each node of a tree diagram corresponds to an event (i.e. a subset of the sample space), and the probability of the event at any node occurring equals the product of the probabilities along the path from the root to that node.

So considering the following two tree diagrams,

1582983830312.png


I would have thought the one on the right is correct, and the one on the left is wrong. Since on the left tree diagram, for instance, the top right node doesn't correspond to the event ##B## but the event ##A\cap B##. But I only ever see variants of the diagram on the left. My guess is that on the left only the last set in the compound event is written to save time? Or am I interpreting the tree structure in the wrong way?

On a slightly unrelated (more pedagogical) note, all of the undergrad level textbooks I've looked at don't have any tree diagrams. Is it the case that they're not used that much beyond school, since everything can be done more rigorously just with set notation?
 
I feel like the answer to your question will require the precise context where these diagrams came up.
 
The tree diagram on the left implies the same thing as on the right. Putting the full expression at every node would accumulate to a long expression in a deep tree. That is not necessary to convey the meaning.

Your statement: "the probability of the event at any node occurring equals the product of the probabilities along the path from the root to that node" is not always true in either diagram. If the events A and B are dependent, you would need to use conditional probabilities like P(B|A) at the second-level nodes to make that statement true on the left diagram. The statement is not appropriate at all on the right diagram.
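To make the point about dependence concrete, here is a minimal sketch. The urn with three red and two blue balls, drawn twice without replacement, is an assumed example and does not come from the diagrams in this thread. Here ##A## is "first ball is red" and ##B## is "second ball is red". Multiplying the unconditional probabilities gives the wrong value for the top node; multiplying by P(B|A) gives the right one.

```python
from fractions import Fraction
from itertools import permutations

# Assumed example: 3 red and 2 blue balls, two draws without replacement.
balls = ["R1", "R2", "R3", "B1", "B2"]
S = set(permutations(balls, 2))  # 20 equally likely ordered draws

def prob(event):
    """P(event) for an event given as a subset of S."""
    return Fraction(len(event), len(S))

def cond_prob(event, given):
    """P(event | given), computed by counting outcomes."""
    return Fraction(len(event & given), len(given))

A = {o for o in S if o[0].startswith("R")}  # first ball is red
B = {o for o in S if o[1].startswith("R")}  # second ball is red

print(prob(A & B))                # 3/10, the probability at the top node
print(prob(A) * cond_prob(B, A))  # 3/10, edges labelled P(A) and P(B|A)
print(prob(A) * prob(B))          # 9/25, edges labelled P(A) and P(B): wrong here
```

The last line only agrees with the first when ##A## and ##B## are independent.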
 
FactChecker said:
Your statement: "the probability of the event at any node occurring equals the product of the probabilities along the path from the root to that node" is not always true in either diagram. If the events A and B are dependent, you would need to use conditional probabilities like P(B|A) at the second-level nodes to make that statement true.

Yes, I should have worded that more carefully! I should have added that each edge represents a probability of the event in the subsequent node occurring given that all the previous events along the path have occurred.

FactChecker said:
The tree diagram on the left implies the same thing as on the right. Putting the full expression at every node would accumulate to a long expression in a deep tree. That is not necessary to convey the meaning.

Alright, so sort of what I hoped would be the case! So I assume then that we can take each node to represent the event which is the intersection of all of the events up to and including the event labelled on the node. A necessary simplification, especially for large trees. Thank you.
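Written out for a path of depth ##n##, this reading of the tree is the standard chain (multiplication) rule. The symbols ##A_1, \dots, A_n## for the labels along one path are notation assumed here, not taken from the diagrams, and the rule requires every conditioning event to have positive probability:

```latex
P(A_1 \cap A_2 \cap \dots \cap A_n)
  = P(A_1)\, P(A_2 \mid A_1)\, P(A_3 \mid A_1 \cap A_2) \cdots
    P(A_n \mid A_1 \cap \dots \cap A_{n-1})
```

The node labelled ##A_n## then stands for the intersection on the left, and each factor on the right is the probability written on one edge of the path.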
 
I don't see any difference between the two diagrams. The topmost node in both diagrams represents ##A \cap B##; i.e., to get to that node both A and B must be true. The next node down indicates that A is true but B is false.

Edit: I didn't see @FactChecker's reply when I posted this.
 
Mark44 said:
I don't see any difference between the two diagrams. The topmost node in both diagrams represents ##A \cap B##; i.e., to get to that node both A and B must be true. The next node down indicates that A is true but B is false.

Yeah, the logic remains the same in both cases. It's just that, like you said, the top right node represents the set ##A \cap B## and not the set ##B##, so it's a bit of a confusing shortcut to just write ##B##!
 
etotheipi said:
Yeah, the logic remains the same in both cases. It's just that, like you said, the top right node represents the set ##A \cap B## and not the set ##B##, so it's a bit of a confusing shortcut to just write ##B##!
It's not confusing to me at all, and IMO, the figure on the left is better. Each node in the tree represents one event. Traversing the edges between nodes gives the probability of the branch.

BTW, as long as we're discussing the probability of events, rather than sets, the notation ##A \wedge B## would be more suitable.
 
Mark44 said:
It's not confusing to me at all, and IMO, the figure on the left is better. Each node in the tree represents one event. Traversing the edges between nodes gives the probability of the branch.

BTW, as long as we're discussing the probability of events, rather than sets, the notation ##A \wedge B## would be more suitable.

I suppose there are two ways of looking at it.

If we use Kolmogorov's model of probability, then events are sets and I would be inclined to say that ##A \cap B## is a better fit for the top right node (i.e. the set of outcomes in A and B, which in this case happens to be an elementary set), as opposed to the set ##\overline{A} \cap B##.

If we discard the notion of sets and instead speak of events as "things that happen", then labelling the top right node as ##B## makes sense. Though I'm unaware of how often this latter convention is used, since my text puts forward all of the probability in terms of set theory and Kolmogorov's model.
 
I don't have Kolmogorov at hand, so I can't check. Set theory and probability are closely related, so perhaps he's distinguishing between a set A and the event that ##x \in A##.
 
  • #10
Mark44 said:
I don't have Kolmogorov at hand, so I can't check. Set theory and probability are closely related, so perhaps he's distinguishing between a set A and the event that ##x \in A##.

I could be very wrong (!) since I'm still not too familiar with the formalisation, however as far as I'm aware an "event" is defined as a subset of the sample space, whose elements are outcomes. So for two tosses of a coin, an event could be ##\{ (H,H), (H,T), (T,H) \}##. Though it could also be an elementary event, which has a size of one, like ##\{ (H,H) \}##.

So if ##A## is the event that the first toss is heads (the subset of ##S## containing the outcomes where this is true, that is, ##A = \{ (H,H), (H,T) \}##) and ##B## is the event that the second toss is heads (##B = \{ (H,H), (T,H) \}##), then ##A \cap B## is the event ##\{ (H,H) \}##. So in this sense, it doesn't seem right to label the top right node ##B## since it doesn't represent the whole set ##B##, only a subset which is the intersection with ##A##. The other part of ##B## would be contained within ##\overline{A} \cap B## which would be below.
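The two-toss example can be checked by enumerating the sample space. This is only a sketch: it assumes a fair coin, so all four outcomes are equally likely, and names such as `top_node` are illustrative.

```python
from fractions import Fraction
from itertools import product

# Sample space for two tosses; a fair coin is assumed.
S = set(product("HT", repeat=2))

def prob(event):
    """P(event) for an event given as a subset of S."""
    return Fraction(len(event), len(S))

A = {o for o in S if o[0] == "H"}  # first toss is heads
B = {o for o in S if o[1] == "H"}  # second toss is heads
not_A = S - A

top_node = A & B        # event reached via the A branch, then the B branch
lower_part = not_A & B  # the rest of B, reached via the not-A branch

print(top_node)                     # {('H', 'H')}
print(top_node | lower_part == B)   # True
print(prob(B), prob(top_node))      # 1/2 1/4
```

So the node written as ##B## under the ##A## branch carries probability 1/4, while the whole set ##B## has probability 1/2 and is split across two nodes of the tree.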
 
  • #11
That is, if we wrote the sets instead of the letters, we'd end up with something like this using the example of an experiment consisting of two flips of a coin:

1582998574763.png
 
  • #12
etotheipi said:
I could be very wrong (!) since I'm still not too familiar with the formalisation, however as far as I'm aware an "event" is defined as a subset of the sample space, whose elements are outcomes.

You are correct. Mathematical probability theory deals with assigning a probability to sets in a probability space. The notion of an event as a statement is used in applications of probability theory. An "outcome" in a probability space ##(\Omega, \mathcal{F}, P)## is defined to be an element of a certain set ##\Omega##. As a special case, one can define the members of ##\Omega## to be statements.

In that special case, for an ##A \subset \Omega##, we can identify the symbol "##A##" with a statement that describes the individual statements that constitute the set ##A##. (It is common in math to "abuse notation" by using the same symbol to mean two different things.) This leads to using notation like "P(A and B)" to mean ##P(A \cap B)##. This, in turn, leads to using the symbols employed in formal logic (##\land##, ##\lor##, ##\lnot##) to abbreviate "and", "or", "not".
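One way to see this correspondence is to map each statement to the set of outcomes for which it is true. The sketch below uses two coin tosses for ##\Omega##; the helper `event_of` is a hypothetical name introduced only for illustration.

```python
from itertools import product

# Outcomes of two coin tosses, used as the set Omega.
Omega = set(product("HT", repeat=2))

def event_of(statement):
    """The event (subset of Omega) on which a true/false statement holds."""
    return {w for w in Omega if statement(w)}

def first_heads(w):
    return w[0] == "H"

def second_heads(w):
    return w[1] == "H"

A = event_of(first_heads)
B = event_of(second_heads)

# "and", "or", "not" on statements match intersection, union, complement on sets.
print(event_of(lambda w: first_heads(w) and second_heads(w)) == A & B)  # True
print(event_of(lambda w: first_heads(w) or second_heads(w)) == A | B)   # True
print(event_of(lambda w: not first_heads(w)) == Omega - A)              # True
```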
 
  • #13
etotheipi said:
I could be very wrong (!) since I'm still not too familiar with the formalisation, however as far as I'm aware an "event" is defined as a subset of the sample space, whose elements are outcomes. So for two tosses of a coin, an event could be ##\{ (H,H), (H,T), (T,H) \}##. Though it could also be an elementary event, which has a size of one, like ##\{ (H,H) \}##.
Yes, this makes sense.
etotheipi said:
So if ##A## is the event that the first toss is heads (the subset of ##S## containing the outcomes where this is true, that is, ##A = \{ (H,H), (H,T) \}##) and ##B## is the event that the second toss is heads (##B = \{ (H,H), (T,H) \}##), then ##A \cap B## is the event ##\{ (H,H) \}##. So in this sense, it doesn't seem right to label the top right node ##B## since it doesn't represent the whole set ##B##, only a subset which is the intersection with ##A##. The other part of ##B## would be contained within ##\overline{A} \cap B## which would be below.
It seems to me to make more sense to label each node as H or T, as in this drawing. Labeling the nodes with A and B, etc., adds an unnecessary level of complexity, IMO.
tree.png


The top two edges represent the event (H, H). The upper branch represents the events (H, H) and (H, T). If I define A as the event "at least one head occurs in two throws of the coin", it can be represented as the set ##A = \{(H, H), (H, T), (T, H)\}##. As a compound event, this would be ##(H, H) \vee (H, T) \vee (T, H)##.
 
  • #14
We could perhaps also let ##H_{1}## be the event that a head occurs on the first throw, ##H_2## that a head occurs on the second throw, etc.
Stephen Tashi said:
In that special case, for an ##A \subset \Omega##, we can identify the symbol "##A##" with a statement that describes the individual statements that constitute the set ##A##.

With this, events like ##H_{1}## do a double duty as (formally) sets and (informally) statements / logical propositions. We can translate "the event that a head occurs on the first throw" as "the set of outcomes where a head occurs on the first throw". I might draw it like this,

1583005525965.png


I think there are lots of different ways of denoting the same maths; I seem to find it simpler to keep the idea of events as sets at the forefront, since it means that the notation meshes together quite nicely.

For instance, ##P(H_2 \cap T_1) = P(T_1) \times P(H_2 | T_1)##

which is consistent with

etotheipi said:
each edge represents a probability of the event in the subsequent node occurring given that all the previous events along the path have occurred.
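A rough sketch of that reading of the tree follows. Each edge stores the probability of its label given everything earlier on the path, and a node's probability is the product along the path. The nested-tuple layout, the fair-coin values and the name `path_probability` are assumptions made for illustration.

```python
from fractions import Fraction

half = Fraction(1, 2)  # fair coin assumed

# Each entry maps a label to (P(label | earlier labels on the path), subtree).
tree = {
    "H1": (half, {"H2": (half, {}), "T2": (half, {})}),
    "T1": (half, {"H2": (half, {}), "T2": (half, {})}),
}

def path_probability(tree, path):
    """Multiply the conditional edge probabilities from the root along path."""
    p = Fraction(1)
    node = tree
    for label in path:
        edge_prob, node = node[label]
        p *= edge_prob
    return p

# P(H2 and T1) = P(T1) * P(H2 | T1)
print(path_probability(tree, ["T1", "H2"]))  # 1/4
```

For dependent events, only the numbers stored on the second-level edges change; the traversal stays the same.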
 
  • #15
FactChecker said:
Your statement: "the probability of the event at any node occurring equals the product of the probabilities along the path from the root to that node" is not always true in either diagram ... The statement is not appropriate at all on the right diagram.

@FactChecker I'm having difficulty understanding this part, why would this not apply to the second diagram?

If the diagram on the left implies the one on the right, shouldn't they both operate in the same manner?
 
  • #16
etotheipi said:
@FactChecker I'm having difficulty understanding this part, why would this not apply to the second diagram?

If the diagram on the left implies the one on the right, shouldn't they both operate in the same manner?
Sorry if my statement was confusing. Both can be right if the probability calculation at the second level uses the probability conditional on the top level, like P(A)*P(B|A). I will leave it to you to decide if the two diagrams are equivalent in terms of how difficult it is to translate the diagram node labels to the correct probability calculation.
 
  • #17
Alright, awesome. To me at least it still makes more sense to draw the tree as branching into smaller and smaller subsets; i.e. if a node represents an event, the set must be unambiguously specified. And we might mentally substitute the intersection with all of the previous nodes (i.e. '##B##' ##\implies B\cap A##) if we do abbreviate the expression in the interest of clarity.

Although there is a close correspondence between set theory and logic theory (e.g. some have mentioned the ##\wedge## to ##\cap## isomorphism for acting on statements and events (sets) accordingly), it seems easier to treat probability theory entirely in terms of set theory, since this seems to be how it is formulated usually.
 
