Calculating the Probability of Two Boys for Math-Obsessed Friend

JeffJo · Feb 10, 2011

jarednjames said:

But the odds of having two boys are half of having only one boy, so there's no gain in advantage.

That's already accounted for. Assume the requirement R is "one is a boy" in the simpler problem, without "Tuesday":

P(two boys and R) = P(R|two boys)*P(two boys) = 1*(1/4) = 1/4
P(one boy and R) = P(R|one boy)*P(one boy) = 1*(1/2) = 1/2
P(no boys and R) = P(R|no boys)*P(no boys) = 0*(1/4) = 0

So the answer is (1/4)/(1/4+1/2+0)=1/3.

But if you add an additional requirement that any given boy has a small probability Q to satisfy, the two-boy family has a 2Q-Q^2 chance (there are two boys that could meet it, but you don't want to double-count the Q^2 chance that both do):

P(two boys and R) = P(R|two boys)*P(two boys) = (2Q-Q^2)*(1/4) = Q/2 - Q^2/4 = ~Q/2
P(one boy and R) = P(R|one boy)*P(one boy) = Q*(1/2) = Q/2
P(no boys and R) = P(R|no boys)*P(no boys) = 0*(1/4) = 0

The "~" means approximate. And now the answer is (~Q/2)/(~Q/2+Q/2+0)= ~1/2. The more accurate answer is (1-Q/2)/(2-Q/2). Which, if Q=1/7, is 13/27.
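A minimal sketch of the "requirement" arithmetic above, using exact fractions. The function name `p_two_boys_required` is only an illustrative label; the model is the one stated in the post (a boy meets the extra condition with probability Q, independently of his sibling).

```python
from fractions import Fraction

def p_two_boys_required(q):
    """P(two boys | the family was required to have a boy meeting a condition of probability q)."""
    q = Fraction(q)
    two_boys = (2 * q - q * q) * Fraction(1, 4)  # at least one of two boys qualifies
    one_boy = q * Fraction(1, 2)                 # the only boy qualifies
    no_boys = Fraction(0)                        # nobody can qualify
    return two_boys / (two_boys + one_boy + no_boys)

print(p_two_boys_required(1))               # plain "one is a boy": 1/3
print(p_two_boys_required(Fraction(1, 7)))  # "boy born on a Tuesday": 13/27
```

The result simplifies to (2-Q)/(4-Q), which is the same as (1-Q/2)/(2-Q/2).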

That is what happens when you require the information in the problem to be true before you select a family. If you merely observe it, represented by O,

P(two boys and O) = P(O|two boys)*P(two boys) = 1*(1/4) = 1/4
P(one boy and O) = P(O|one boy)*P(one boy) = (1/2)*(1/2) = 1/4
P(no boys and O) = P(O|no boys)*P(no boys) = 0*(1/4) = 0

And the answer is (1/4)/(1/4+1/4)=1/2. But now adding in Q when you observe one boy makes no difference, since there is still a Q chance the boy you observe meets it:

P(two boys and O) = P(O|two boys)*P(two boys) = Q*(1/4) = Q/4
P(one boy and O) = P(O|one boy)*P(one boy) = (Q/2)*(1/2) = Q/4
P(no boys and O) = P(O|no boys)*P(no boys) = 0*(1/4) = 0

And the answer is (Q/4)/(Q/4+Q/4)=1/2.
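The "observation" case can be checked by brute force. This is a sketch under one explicit assumption: the parent picks one of the two children at random and states that child's sex and birth day. The day index 2 standing for Tuesday is an arbitrary label, and the function name is hypothetical.

```python
from fractions import Fraction
from itertools import product

DAYS = range(7)
TUESDAY = 2  # arbitrary index for the observed day

def p_two_boys_observed(sex="B", day=TUESDAY):
    """P(two boys | a randomly chosen child of the family is described as (sex, day))."""
    children = list(product("BG", DAYS))  # 14 equally likely kinds of child
    joint_two_boys = Fraction(0)
    joint_any = Fraction(0)
    for first, second in product(children, repeat=2):  # 196 equally likely families
        for described in (first, second):              # each child described with chance 1/2
            if described == (sex, day):
                weight = Fraction(1, 196) * Fraction(1, 2)
                joint_any += weight
                if first[0] == "B" and second[0] == "B":
                    joint_two_boys += weight
    return joint_two_boys / joint_any

print(p_two_boys_observed())  # 1/2
```

Under this assumption the day drops out entirely: every choice of `day` gives the same result.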

davee123 · Feb 10, 2011

JeffJo said:

That is what happens when you require the information in the problem to be true before you select a family.

Maybe I missed something, but isn't that the point? If the information in the problem weren't true of the selected family, isn't that case necessarily not included in the probability?

DaveE

JeffJo · Feb 10, 2011

davee123 said:

Maybe I missed something, but isn't that the point? If the information in the problem weren't true of the selected family, isn't that case necessarily not included in the probability?

DaveE

Necessarily, yes. The information is true of the selected family. Sufficiently, no. The information can be true even if the parent says something else. That's the problem. And the real issue is that your arguments require it to be both necessary and sufficient.

The problem statement is that a parent says "I have a boy." The issue is whether this particular parent is a representative sample of "all parents of two including a boy," or a representative sample of "all parents of two who chose to tell you 'I have a boy.'" Since there is no requirement that a parent of a boy and a girl must say "I have a boy," some might say "I have a girl" in the same situation. In fact, we should expect half of them to do so.

The issue is 100% equivalent to the controversy in the Monty Hall Problem. If "Monty Hall reveals a goat behind door X" is equivalent to "There is a goat behind door X," then the other two doors remain equally likely to have the car. Although they use many different arguments, the people who recognize that Monty Hall had a choice of two doors in one case (the contestant's door has the car), but not in the other (the contestant's door has a goat) see that counting possible cases doesn't work. You need to sum the probability that these cases would produce the observed results.

Please, rather than resorting to superficial arguments, examine what I've argued here. The Law of Total Probability says that P(two boys|one boy)=sum[D=Sunday to Saturday, P(two boys|parent says "one boy born on D")*P(parent says "one boy born on D")]. Do you contest this? Because if you do, you need to revisit Probability 101. It's called a "law" for a reason.

Now, do you think that P(two boys|parent says "one boy born on D") could be different for any of the seven possible values of D? I certainly hope not. You'd need to provide some reason why one day is any different than another, and there is none.

Is there some reason why you think sum[D=Sunday to Saturday, P(parent says "one boy born on D")] is not 1 for our assumed situation? I agree the statement is not orthodox, but since we are given that a parent made a statement in this form, we must consider only situations where a statement in that form is made.

The end result is that P(two boys|parent says "one boy born on D") must equal P(two boys|parent says "one boy"). Under the "requirement" assumption, this is "1/3=13/27." Under the "observation" assumption, this is "1/2=1/2." Which do you prefer to defend?

Please, please, please, step back from your preconceived notions of what the answer should be. Look at what it has to be. As Fox & Levav (2004) (http://fox-lab.org/papers/Fox&Levav(2004).pdf) argue, you should not "partition the sample space into n interchangeable events, edit out events that can be eliminated on the basis of conditioning information, count remaining events, then report probabilities as a ratio of the number of focal to total events." You need to sum the probabilities that each possible event could produce the observed results, instead of merely counting them. Unless you can produce evidence that the "count" must be the same as my "sum," you are not answering these issues.

davee123 · Feb 10, 2011

JeffJo said:

The issue is 100% equivalent to the controversy in the Monty Hall Problem. If "Monty Hall reveals a goat behind door X" is equivalent to "There is a goat behind door X," then the other two doors remain equally likely to have the car.

I believe that's true in the case that Monty Hall doesn't know which door contains the car. The unstated assumption in the Monty Hall problem is that Monty knows perfectly damn well where the car is, and will intentionally show you the door that he knows contains the goat.

It seems like you might be implying something of similar distinction here, but I'm having a difficult time understanding how you might be applying that logic.

DaveE

Dadface · Feb 11, 2011

JeffJo said:

You are missing the point. In Gary Foshee's solution (the 13/27 one) to the problem, he is actually requiring a family to have a boy born on a Tuesday before they can be selected. Since a family with two boys is (almost) twice as likely to meet the requirement, the probability of two boys goes up from what it would be if you only required that there be a boy.

If, on the other hand, no restrictions are placed on who can be randomly selected, and then you make an observation about the family that could be any fact that is true, the answer is 1/2 whether or not "Tuesday" or any other subdivision is mentioned.

You are right that it should be rewritten. If the author wants the answer to be 1/3 (without "Tuesday") or 13/27 (with it), this requirement must be made explicit. Without that, the answer is 1/2.

I am aware of the requirements and am not missing the point (please read my posts). I'm saying that if the information is such that the time of birth can be pinned down to increasingly more precise values between limits (e.g. Tuesday as in the stated problem, then Tuesday morning, then Tuesday morning between 9 and 10, etc.), then the probabilities as calculated by the method used here get increasingly closer to 0.5.

D H · Feb 11, 2011

JeffJo does have a point here. Making the question short and story-like makes the intended answer (13/27) incorrect. The probability that both children are boys is indeed 13/27 given a family randomly drawn from the set of two child families with one son born on a Tuesday. However, that is not necessarily the right universe for the question "I have two children, one is a son born on a Tuesday. What is the probability my other son is a boy?"

Suppose Monty gets tired of giving away cars to smarty-alecky mathematicians. He asks the members of the audience to raise their hand if they have exactly two children, one a boy and the other a girl. He asks one such couple to come on down and asks the couple on what day of the week the boy was born. Monty now takes you out of a sound-proofed room and tells you "This fine young couple have two children. One is a boy born on a Tuesday. What is the probability the other child is a boy?"

Dadface · Feb 11, 2011

What about the fact that the solver can use additional knowledge, mainly common sense and general knowledge, in order to solve the problem? If it is the intention of the question compiler that any solver should not use this general knowledge, then in my opinion the boys problem is too artificial.

JeffJo · Feb 11, 2011

davee123 said:

I believe that's true in the case that Monty Hall doesn't know which door contains the car. The unstated assumption in the Monty Hall problem is that Monty knows perfectly damn well where the car is, and will intentionally show you the door that he knows contains the goat.

If Monty opens a door at random, and just happens to find a goat, the probability your door has the prize is 1/2. That's not what I'm talking about.

First, let me point out that you will always know what the door numbers are when you have to decide to switch. So let's specify that you choose Door #1, and Monty opened Door #3. What these numbers are shouldn't matter, but specifying them brings the comparison closer to the Two Child Problem, where "boy" is similarly specified.

If Monty Hall is biased, and is required to open Door #3 if there is a goat behind it and the contestant did not choose it, then the probability Door #1 has the car is 1/2. There are three possibilities, one is eliminated, and the other two have equal probability. But if Monty is free to choose a door to open, he will open Door #3 100% of the time the car is behind Door #2, 0% of the time it is behind Door #3, and 50% of the time it is behind Door #1. The proper solution does not count cases, it sums these probabilities: P(D1)=(50%)/(100%+0%+50%)=1/3.

If this parent is required to say he has a boy in all cases where he has one, then the probability he has two is 1/3. There are four possibilities, one is eliminated, and the other three have equal probability. And the answer changes if additional information about a boy is presented as a requirement, since a family of two boys is almost twice as likely to meet that requirement as a family with one boy.

But if this parent is free to choose any gender among his children, he will say "boy" 100% of the time he has two, 0% of the time he has none, and 50% of the time he has one. The proper solution does not count cases, it sums these probabilities: P(2Boys)=(100%)/(100%+0%+50%+50%)=1/2. And the answer can't change with additional information.
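The "sum the probabilities instead of counting cases" step for the parent can be sketched as a small Bayes calculation. The helper name `posterior` and the dictionary layout are illustrative choices; the 100% / 50% / 0% figures are the ones given in the post.

```python
from fractions import Fraction

def posterior(priors, likelihoods, target):
    """Weight each case by the chance that it produces the statement actually heard."""
    joint = {case: priors[case] * likelihoods[case] for case in priors}
    return joint[target] / sum(joint.values())

# Four equally likely families (elder child listed first).
family_prior = {case: Fraction(1, 4) for case in ("BB", "BG", "GB", "GG")}

# Chance that the parent says "boy" in each case.
required_to_say_boy = {"BB": Fraction(1), "BG": Fraction(1), "GB": Fraction(1), "GG": Fraction(0)}
free_to_choose = {"BB": Fraction(1), "BG": Fraction(1, 2), "GB": Fraction(1, 2), "GG": Fraction(0)}

print(posterior(family_prior, required_to_say_boy, "BB"))  # 1/3
print(posterior(family_prior, free_to_choose, "BB"))       # 1/2
```

Counting cases gives the right answer only when every surviving case has the same likelihood, which is what the "required" table encodes.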

My point is that the presence of one value in the problem statement, from what we should consider a set of N equally-likely possibilities, cannot be taken to mean that value was required. You know what door Monty Hall opened, but you can't assume he was required to open that door if it was possible to. You were told there is a boy in the family, but you similarly can't assume the person who told you was required to say "boy" if he could. A parent of a boy and a girl should have been equally likely to say "one is a girl (born on...)," and that makes the answer 1/2 regardless of what additional information is presented.

D H said:

JeffJo does have a point here. Making the question short and story-like makes the intended answer (13/27) incorrect. The probability that both children are boys is indeed 13/27 given a family randomly drawn from the set of two child families with one son born on a Tuesday. However, that is not necessarily the right universe for the question "I have two children, one is a son born on a Tuesday. What is the probability my other son is a boy?"

Remove the "necessarily" and this is correct. First off, as soon as he says "other", he has specified one child just as completely as he would have if he had said "My older child is a boy." The answer is 1/2. But let's assume you asked what you meant, "what is the probability I have two boys?"

Look back at Jimmy Snyder's grid in Post #23. In the universe that produces the answer 13/27, father b3b4 or b4b3 can't say "I have two children, one is a son born on a Wednesday." Assuming the father does in all other cases, the answer for the question "what is the probability I have two boys?" is probably 11/25, not 13/27. It could be something else, but it can't be 13/27. Since you can't postulate a universe where the answers are different for different days, you can't postulate the universe where the answer is 13/27. That's the point.
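The 11/25 figure can be checked by enumeration. This sketch assumes a 14 x 14 grid of equally likely (sex, day) pairs, that b3 and b4 stand for boys born on Tuesday and Wednesday, and that every father with a Tuesday boy is committed to the Tuesday statement. The indices and helper names are illustrative.

```python
from fractions import Fraction
from itertools import product

TUE, WED = 2, 3  # illustrative day indices

children = list(product("BG", range(7)))      # 14 kinds of child
families = list(product(children, repeat=2))  # 196 equally likely families

def p_two_boys(selected):
    """Fraction of the selected families in which both children are boys."""
    both = [f for f in selected if f[0][0] == "B" and f[1][0] == "B"]
    return Fraction(len(both), len(selected))

# Fathers with a Wednesday boy who are not already forced to say "Tuesday".
wednesday_sayers = [f for f in families if ("B", WED) in f and ("B", TUE) not in f]

print(len(wednesday_sayers))         # 25
print(p_two_boys(wednesday_sayers))  # 11/25
```

The two excluded families are exactly b3b4 and b4b3, which is why the count drops from 27 to 25 and the two-boy count from 13 to 11.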

D H said:

Suppose Monty gets tired of giving away cars to smarty-alecky mathematicians. He asks the members of the audience to raise their hand if they have exactly two children, one a boy and the other a girl. He asks one such couple to come on down and asks the couple on what day of the week the boy was born. Monty now takes you out of a sound-proofed room and tells you "This fine young couple have two children. One is a boy born on a Tuesday. What is the probability the other child is a boy?"

A poorly formulated question. A probability problem has to imply a random process somehow. If you don't say what that process is, the solver has to assume there is an equal probability for what appear to be equivalent outcomes. In this question, your process did not allow the "other" child to be a boy, so the actual answer is 0. It also pre-determined that the formulation of the question would include the word "boy," and you did not convey that information, either. The person in the booth has to assume any usage of gender had to allow either possibility. Specifically:

He can't assume you asked for parents of two, but the rest is independent of this so it doesn't matter. (And I think the fact that, when independence applies, it doesn't matter if the fact is a requirement or an observation, is what confuses people into believing it never matters.)
He can't assume you specified the gender make-up of the family. BB, GB, BG, and GG are all equally likely.
He can't assume you asked about the day a specific child was born. That fact could apply to any of the eight children that are possible.
He must assume that if the equivalent facts for both children are different, that each was equally likely to be presented to him.

So the best answer based on the information you presented to him is 1/2.

JeffJo · Feb 11, 2011

Dadface said:

I am aware of the requirements and am not missing the point (please read my posts). I'm saying that if the information is such that the time of birth can be pinned down to increasingly more precise values between limits (e.g. Tuesday as in the stated problem, then Tuesday morning, then Tuesday morning between 9 and 10, etc.), then the probabilities as calculated by the method used here get increasingly closer to 0.5.

And I'm saying that it is a result of requiring a boy to be born within an interval that causes this. A family of one boy has one chance only, and the probability is Q. The proportion of families that have one boy is not included in determining this Q; the two probabilities must be multiplied together to get the probability that both happen in the same family.

But a family of two boys has nearly twice the probability of having one that meets such a requirement; 2Q-Q^2, actually. This value is also multiplied by the proportion of families with two boys to get the probability of both happening together.

The result is that the list of possible families decreases as Q decreases, but the sub-list of two-boy families decreases at about half the rate of the one-boy families. So the ratio of two-boy families to all qualifying families approaches 1/2.
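To see the progression numerically, here is a short sketch that evaluates the "requirement" answer (2-Q)/(4-Q) for ever narrower birth intervals. The interval labels and the assumption that births are uniform over the week are illustrative only.

```python
from fractions import Fraction

def required_answer(q):
    """P(two boys) when the family was pre-selected to contain a boy born in an interval of probability q."""
    q = Fraction(q)
    return (2 - q) / (4 - q)

# Probability that a given boy falls in the interval, assuming uniform births over a week.
intervals = [
    ("any boy", Fraction(1)),
    ("a given day", Fraction(1, 7)),
    ("a given hour", Fraction(1, 7 * 24)),
    ("a given minute", Fraction(1, 7 * 24 * 60)),
]

for label, q in intervals:
    answer = required_answer(q)
    print(f"{label:>15}: {answer} = {float(answer):.6f}")
```

The values rise from 1/3 toward 1/2 but stay below it for every Q > 0.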

But if the facts we are given are treated as an observation of a random family, there are exactly two facts that apply to any particular family. We don't have to say "boy" or "girl", but pick an applicable gender based on the two children. We don't have to say "born between 9:13PM and 9:14PM on a Thursday", but pick an applicable minute based on a child of the applicable gender.

The chances of two boys, given that we observe "a boy born between 9:13PM and 9:14PM on a Thursday" this way, is exactly 1/2. It remains exactly 1/2 no matter what size of an interval you choose, because you can always isolate the births to such an interval.

Jimmy Snyder · Feb 11, 2011

I get it now. Here is a simplified version of the problem.
A mathematician says "I have one child. It is a boy. What is the probability that my child is a boy."
The common sense answer is 1. However, we don't know what the mathematician would have asked if their child was a girl.
Case 1: If they would not have asked you a question, then the answer to the question they didn't ask is 14/27.
Case 2: If they would have asked "I have one child. It is a girl. What is the probability that my child is a girl." Then the answer to the question is 1/2.

Of course, I'm just joking. Look at it this way. Imagine you are in a room with 196 other people, one of each type in the grid that I made. If 27 of them walk up to you and say "I have two children, one is a boy born on Tuesday and the other child is a {boy/girl}," then of the 27, 13 will say that the other child is a boy, and 14 will say that the other child is a girl. There's your 13/27.

If all 196 people walk up to you and say "I have two children, one is a {boy/girl} born on {S/M/T/W/T/F/S} and the other is a {boy/girl}," then 98 will say that the other child is a boy and 98 will say that the other child is a girl (it will be easier to see it if you assume that each will identify the eldest child first). There's your 1/2.
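Both head counts can be reproduced by listing the 196 people. A sketch, assuming the grid is every (elder, younger) combination of sex and weekday; the index 2 for Tuesday and the helper name `other_child_sex` are illustrative.

```python
from itertools import product

TUESDAY = 2  # arbitrary index for Tuesday

children = list(product("BG", range(7)))      # 14 kinds of child
families = list(product(children, repeat=2))  # 196 (elder, younger) families

def other_child_sex(family, described):
    """Sex of the child who is not the one being described."""
    elder, younger = family
    return younger[0] if described == elder else elder[0]

# First room: only the people who have a Tuesday boy speak up.
first_room = [other_child_sex(f, ("B", TUESDAY)) for f in families if ("B", TUESDAY) in f]
print(len(first_room), first_room.count("B"), first_room.count("G"))  # 27 13 14

# Second room: everybody speaks, describing the elder child first.
second_room = [younger[0] for elder, younger in families]
print(second_room.count("B"), second_room.count("G"))  # 98 98
```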

Taken in isolation, you can't tell which situation the OP is discussing. If a bunch of people ask the question, then you may get a feel for which of these two universes you live in.

D H · Feb 11, 2011

JeffJo said:

If Monty opens a door at random, and just happens to find a goat, the probability your door has the prize is 1/2. That's not what I'm talking about.

Good thing that, because if Monty opens a door at random my chance of winning the car is 2/3. Suppose Monty randomly opens door #3 and happens to show the car. In that case I am going to switch to door #3.

JeffJo · Feb 11, 2011

D H said:

Good thing that, because if Monty opens a door at random my chance of winning the car is 2/3. Suppose Monty randomly opens door #3 and happens to show the car. In that case I am going to switch to door #3.

Read what I said. When (not "if") he reveals a goat, as is explicit in the problem statement, the probability that your door has the car is 1/2 if he chose randomly among the doors you didn't choose, 1/3 if he chose randomly among the goat-doors you didn't choose, and either 1/2 or 0 (with the other unopened door then certain to have the car) if he prefers to open a specific door if he can. If he reveals a car, it is a different problem unrelated to anything we have talked about. We don't know if you would be offered the chance to switch.
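A sketch of how the host's strategy changes the answer, with the contestant on Door 1 and a goat revealed behind Door 3. The strategy names are illustrative, and each tuple lists the assumed chance that the host opens Door 3 and shows a goat when the car is behind Door 1, 2 or 3.

```python
from fractions import Fraction

def p_car_behind_door1(likelihoods):
    """Posterior that the contestant's Door 1 hides the car, given Door 3 was opened on a goat."""
    prior = Fraction(1, 3)
    joint = [prior * chance for chance in likelihoods]
    return joint[0] / sum(joint)

half, one, zero = Fraction(1, 2), Fraction(1), Fraction(0)

strategies = {
    "random unchosen door (might have shown the car)": (half, half, zero),
    "random goat door among the unchosen": (half, one, zero),
    "opens Door 3 whenever he can": (one, one, zero),
    "opens Door 2 whenever he can": (zero, one, zero),
}

for name, likelihoods in strategies.items():
    print(f"{name}: {p_car_behind_door1(likelihoods)}")
```

In the last strategy the host would have opened Door 2 unless the car were there, so seeing Door 3 opened puts the car behind Door 2 with certainty and leaves Door 1 with probability 0.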

JeffJo · Feb 11, 2011

Jimmy Snyder said:

I get it now. ...

Taken in isolation, you can't tell which situation the OP is discussing. If a bunch of people ask the question, then you may get a feel for which of these two universes you live in.

Exactly.

Finally, imagine that you leave after the first person who walks up to you says "I have two children. One is a boy born on a Tuesday." Can you rule out either scenario as being the reason he did this? No. Can you assume it was because he was required, as in your first scenario, to tell you exactly that fact, and no other? No. Can you assume it is a random occurrence that could equally-likely be any statement of that form? Yes. The answer is 1/2. And it is also 1/2 if he only says "One is a boy."

Both problems are really ambiguous; but then, so are most probability puzzles. They don't tell you things like "Assume 50% of children are boys, 50% girls, and gender is independent in all siblings." Because those assumptions are not true in the real world. I'm not criticizing the problems for that, I'm saying that assuming equiprobability is an inherent part of these problems. There could be reasons why the probability here is not 1/2, but unless they are explicit in the problem statement, the only possible answer is 1/2.

Dadface · Feb 12, 2011

JeffJo, or anybody else, will you clarify something please? The original question seems to impose certain limitations:

1. The question seems to imply that consideration be given only to a particular "universe".
2. The question further seems to imply that only the limited information given in the question itself should be used when solving the problem.

From my understanding of previous posts it seems to be agreed that working within the implied limitations gives a probability of 13/27. Is that correct? This thread, however, has gone further, with some suggesting/stating that the true probability is 1/2:

A. Following on from the above, I show that if one applies a similar methodology (which is not necessarily the best methodology) to that previously used, but also uses extra information drawn from general knowledge, then one obtains different answers. As the birth time is pinned down more accurately to reducing time intervals, the probability tends to 1/2 (actually reaching 1/2 if birth time can be defined as an instantaneous event). If I am correct, the probability of 1/2 applies to the universe described in the question and possibly, by extension, to all universes. (I have yet to check my maths.)

B. The same conclusion of a probability of 1/2 is reached, but differently, in that the event is analysed without restricting consideration to the defined universe.

Right or wrong, the conclusions reached by methods A and B seem to breach the requirements of the original question. The conclusion reached by my method (A) goes beyond limitation 2 (see above), and the conclusion reached by method B breaches limitations 1 and 2.

Have we moved too far away from the original question? Is the original question meaningless? Am I being daft? (The answer is yes to at least one of these questions.)

Jimmy Snyder · Feb 12, 2011

Dadface said:

1. The question seems to imply that consideration be given only to a particular "universe".

If two people were to ask you the same question and each one used a different day of the week, then you would know for sure which universe you were dealing with. If they asked you using the same sex and day of the week, then you could, with some confidence know which universe, but not be completely sure. However, if only a single person asks you, then you have no way of knowing which universe.

JeffJo · Feb 12, 2011

Dadface said:

The question further seems to imply that only the limited information given in the question itself should be used when solving the problem.

I honestly am not trying to be disrespectful of anyone, when I say the interpretation you express here is a common misconception held by many. They have been known to defend it voraciously because they "know" it produces the "right" answer to this problem. They are so convinced, that they won't look at arguments against it, since they already "know" those arguments must be wrong. This has been done by students, teachers, full professors, and even Nobel Prize winners for a different problem.

The correct approach is that the possible information is not limited to what is in the problem statement. The specific instance of a random process that the problem exemplifies is conditioned on that information, and there is a big difference. To condition a probability on another event, you need to first enumerate all the possibilities for the process. Then the conditional probability for event E, given event C, is the probability of E and C happening together, divided by the probability of event C happening regardless of E. And the issue here is that C is not "A family has a boy born on a Tuesday," it is "A parent states one fact, of what is likely two different but equivalent facts that apply."

The other problem I refer to is the Game Show Problem. The information given in it, is that there is one car and two goats, each behind one of three numbered doors. One door is chosen by the contestant, and a different one is revealed to have a goat. If we limit the solution to using this information alone, since the car was equally likely to have been placed behind either of the two remaining doors, it seems it now has a 1/2 chance to be behind either.

This solution has been accepted by students, teachers, full professors, and Nobel Prize winners. Thousands, including many holding doctorates in Math, wrote angry letters to Marilyn vos Savant when she said, in Parade Magazine, that the answer to the problem is that the chosen door has a 1/3 chance, and the unchosen, unopened door has a 2/3 chance. Yet that is what is accepted as the correct answer.

The correct solution (different from the explanations of how the game works in the long run which are usually given) is that if the contestant chose a door with a goat, only one door could have been opened. But if the chosen door had the car, then there were two doors that could have been opened. So even if the information in the problem tells us which door was opened, the possibility of a different door being opened - contradicting the problem statement the same way the possibility of a parent saying "I have a girl" contradicts ours - means that case's chances are cut in half. The same solution applies to the Two Child Problem, except that there are two cases where other possibilities exist, that contradict the specific information in the problem statement, but not the process the statement describes or implies which produced that information.

Dadface said:

From my understanding of previous posts it seems to be agreed that working within the implied limitations gives a probability of 13/27. Is that correct?

It is correct that some have accepted that answer, just like Nobel Prize winners have said the chances in the Game Show Problem are even. The answer itself is not correct.

Dadface said:

A. Following on from the above ... As the birth time is pinned down more accurately to reducing time intervals the probability tends to 1/2.

I haven't been very clear on addressing this because (honestly) I try to avoid my tendency to be long-winded. The issue is what you mean by "pinned down." In order for your 1/3 -> 1/2 progression to be correct, you have to have chosen the specific time interval you want before you chose the family you apply it to. A set of families that all meet that requirement has to be assembled, effectively, and you pick one of the qualified group. So if you decide "I'm looking for boys" and arrange to get a family with a boy, the answer is 1/3. If you then decide you want a Tuesday Boy, you need to assemble a different (smaller) group, pick a new family, and the answer is 13/27.

But if, as seems to be a closer interpretation of "pinned down," you mean that you pick a family first and then determine a gender that exists in that family that could be either "boy" or "girl" but in this case happens to be "boy," the answer is 1/2. If you then, through a repeated series of questions, narrow the interval that includes the birth of that boy? The answer stays at 1/2 throughout.

+++++
In 1959, Martin Gardner asked this in Scientific American:

Mr. Jones has two children. The older child is a girl. What is the probability that both children are girls?
Mr. Smith has two children. At least one of them is a boy. What is the probability that both children are boys?

He gave the answers as 1/2 and 1/3. But he retracted the second one six months later:

Readers were told that Mr. Smith had two children, at least one of whom was a boy, and were asked to calculate the probability that both were boys. Many readers correctly pointed out that the answer depends on the procedure by which the information "at least one is a boy" is obtained. If from all families with two children, at least one of whom is a boy, a family is chosen at random, then the answer is 1/3. But there is another procedure that leads to exactly the same statement of the problem. From families with two children, one family is selected at random. If both children are boys, the informant says "at least one is a boy." If both are girls, he says "at least one is a girl." And if both sexes are represented, he picks a child at random and says "at least one is a ..." naming the child picked. When this procedure is followed, the probability that both children are of the same sex is clearly 1/2. (This is easy to see because the informant makes a statement in each of the four cases -- BB, BG, GB, GG -- and in half of these cases both children are of the same sex.) That the best of mathematicians can overlook such ambiguities is indicated by the fact that this problem, in unanswerable form, appeared in one of the best of recent college textbooks on modern mathematics.
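Gardner's two procedures are easy to try as a simulation. This is only a rough Monte Carlo sketch: the function name, the trial count and the fixed seed are arbitrary choices, and naming a randomly picked child in every family is used as an equivalent shortcut for the informant's rule.

```python
import random

def simulate_gardner(trials=200_000, seed=0):
    """Estimate P(two boys) under Gardner's two procedures for 'at least one is a boy'."""
    rng = random.Random(seed)
    has_boy = has_boy_and_two = 0
    said_boy = said_boy_and_two = 0
    for _ in range(trials):
        family = (rng.choice("BG"), rng.choice("BG"))
        two_boys = family == ("B", "B")
        # Procedure 1: only families with at least one boy are eligible.
        if "B" in family:
            has_boy += 1
            has_boy_and_two += two_boys
        # Procedure 2: any family; the informant names the sex of a randomly picked child.
        if rng.choice(family) == "B":
            said_boy += 1
            said_boy_and_two += two_boys
    return has_boy_and_two / has_boy, said_boy_and_two / said_boy

selected, informed = simulate_gardner()
print(f"procedure 1: {selected:.3f} (theory 1/3)")
print(f"procedure 2: {informed:.3f} (theory 1/2)")
```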

By Gardner's standards, the question that started this thread is unanswerable since it does not tell us why we were told what we were told. If we assume there is enough information to answer it, we must assume that any similar statement could be given, and the answer is 1/2.

Jimmy Snyder · Feb 12, 2011

JeffJo said:

If we assume there is enough information to answer it, we must assume that any similar statement could be given, and the answer is 1/2.

This answer suffers from the same error as the 13/27 answer. That is, you have generalized Tuesday, but you have not generalized the number of children. What if the question in the OP is a particularization of the general question "I have n children. One is a {boy/girl} born on {S/M/T/W/T/F/S}, what is the probability I have n {boys/girls}."

Dadface · Feb 12, 2011

Jimmy Snyder said:

If two people were to ask you the same question and each one used a different day of the week, then you would know for sure which universe you were dealing with. If they asked you using the same sex and day of the week, then you could, with some confidence know which universe, but not be completely sure. However, if only a single person asks you, then you have no way of knowing which universe.

What is meant by a universe as used here? As I see it at present, the original question refers to a universe which is just a family with two children, and the other information (boy born on Tuesday) is not needed to get the real answer.

JeffJo · Feb 12, 2011

Jimmy Snyder said:

This answer suffers from the same error as the 13/27 answer. That is, you have generalized Tuesday, but you have not generalized the number of children. What if the question in the OP is a particularization of the general question "I have n children. One is a {boy/girl} born on {S/M/T/W/T/F/S}, what is the probability I have n {boys/girls}."

Then you need to know the probabilities P(n children) for all n>0 to answer. But the cases where you have n or m children, where n<>m, can not overlap and so can be treated independently. Which is what I did, and you are calling an "error." It is not.

The cases where the man says "I have two including a b3" and "I have two including a b4" can overlap, so treating them independently is an error.

+++++

There are many red herrings that can be chased in these problems. Other sized families. Why would a parent make this odd statement? More boys are born than girls, so the answer can't be based on P(boy)=1/2. There is a correlation between genders in siblings due to identical twins, so a far more complicated approach should be used. More children are born on weekdays because cesarean sections are scheduled, seldom on weekends. Some parents are more proud of one particular child of the two, so unknown biases exist.

What I find happens when people who want to defend the 1/3 or 13/27 answers can't find a hole in my arguments, is that they start chasing such red herrings. Please don't do that. There is a simple, incontrovertible demonstration of what the answer needs to be:

P(two boys) = sum(i=1 to 7, P(two boys|parent says "one is a boy born on DAY(i)") * P(parent says "one is a boy born on DAY(i)"))

where sum(i=1 to 7, j=1 to 2, P(parent says "one is a SEX(j) born on DAY(i)")) = 1

Yes, I did build "parent of two" into that; all the parents mentioned have two children. See above for justification. I also built in the fact that a parent of two says "one is a SEX(j) born on DAY(i)", so that the second equation holds. That is also reasonable, since we aren't interested in any parents who don't, and you can adjust the probabilities for any situation you think applies within that condition. And I left out the "j=1 to 2" part in the first equation, since the conditional probability of two boys is always 0 when the parent mentions a girl. Technically it should be in there, but it takes less space if I leave it out, and contributes nothing.

Do we all agree that P(two boys)=1/4? This is the unconditional probability (except the conditions that they have two children, and make the odd statement). In order for P(two boys|parent of two says "one is a BOY born on TUESDAY") to equal 13/27, you need to accept one of three absurdities. Either

1. P(two boys) is not 1/4
2. P(two boys|parent of two says "one is a BOY born on DAY(i)") is different for different values of i
3. P(parent of two says "one is a SEX(j) born on DAY(i)") is different for different values of i and j.

If #2 and #3 are false - meaning the respective probabilities are all equal - then P(two boys|parent of two says "one is a BOY born on DAY(i)")=13/27 for all i, and P(parent of two says "one is a SEX(j) born on DAY(i)")=1/14 for all i and j. That makes P(two boys)=7*(13/27)*(1/14)=13/54, not 1/4. But if any of those probabilities are different, the problem has to establish why in what is known as the a priori state. That's the state before the known condition "parent of two says 'one is a BOY born on TUESDAY'" is applied, which is the state these probabilities must be evaluated in. So you can't use the given values of i and j to deduce anything about the probabilities.
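The arithmetic in this argument can be checked by brute force. The following is only a sketch under stated assumptions: boys and girls are equally likely, all seven days are equally likely, and the two function names are made up for illustration. The "required" model keeps only families that contain the stated child; the "said" model assumes the parent describes one of the two children chosen at random.

```python
from fractions import Fraction
from itertools import product

SEXES = ("B", "G")
DAYS = range(7)  # which index stands for Tuesday is arbitrary
CHILDREN = list(product(SEXES, DAYS))          # 14 equally likely kinds of child
FAMILIES = list(product(CHILDREN, repeat=2))   # 196 equally likely (older, younger) pairs


def is_two_boys(family):
    return family[0][0] == "B" and family[1][0] == "B"


def p_two_boys_given_required(statement):
    """Selection model: only families that contain `statement` are considered."""
    hits = [f for f in FAMILIES if statement in f]
    return Fraction(sum(is_two_boys(f) for f in hits), len(hits))


def p_two_boys_given_said(statement):
    """Assumed model: the parent describes one of the two children,
    each chosen with probability 1/2."""
    said = Fraction(0)
    said_and_two_boys = Fraction(0)
    for f in FAMILIES:
        p_family = Fraction(1, len(FAMILIES))
        p_say = Fraction(sum(child == statement for child in f), 2)
        said += p_family * p_say
        if is_two_boys(f):
            said_and_two_boys += p_family * p_say
    return said_and_two_boys / said


tuesday_boy = ("B", 2)
required = p_two_boys_given_required(tuesday_boy)   # 13/27
said = p_two_boys_given_said(tuesday_boy)           # 1/2

# Law of total probability with every statement equally likely (1/14 each):
implied_from_required = sum(
    p_two_boys_given_required(("B", d)) * Fraction(1, 14) for d in DAYS
)  # 13/54, which disagrees with P(two boys) = 1/4
implied_from_said = sum(
    p_two_boys_given_said(("B", d)) * Fraction(1, 14) for d in DAYS
)  # 1/4

print(required, said, implied_from_required, implied_from_said)
```

Under these assumptions only the random-statement model adds back up to 1/4.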

BobG · Feb 12, 2011

JeffJo said:

I haven't been very clear on addressing this because (honestly) I try to avoid my tendency to be long-winded. The issue is what you mean by "pinned down." In order for your 1/3 -> 1/2 progression to be correct, you have to have chosen the specific time interval you want before you chose the family you apply it to. A set of families that all meet that requirement has to be assembled, effectively, and you pick one of the qualified group. So if you decide "I'm looking for boys" and arrange to get a family with a boy, the answer is 1/3. If you then decide you want a Tuesday Boy, you need to assemble a different (smaller) group, pick a new family, and the answer is 13/27.

But if, as seems to be a closer interpretation of "pinned down," you mean that you pick a family first and then determine a gender that exists in that family that could be either "boy" or "girl" but in this case happens to be "boy," the answer is 1/2. What if you then, through a repeated series of questions, narrow the interval that includes the birth of that boy? The answer stays at 1/2 throughout.

+++++
In 1959, Martin Gardner asked this in Scientific American:

He gave the answers as 1/2 and 1/3. But he retracted the second one six months later:

By Gardner's standards, the question that started this thread is unanswerable since it does not tell us why we were told what we were told. If we assume there is enough information to answer it, we must assume that any similar statement could be given, and the answer is 1/2.

I can agree with this point in theory, but Gardner's statement is slightly detached from reality.

I think it's safe to say that questions asked in brain teasers are asked in a "Clue" type environment in which you're looking to eliminate possibilities. In other words, if the person answered "I have a girl", you would have narrowed things down to a different probability than if the person answered "I have a boy" and would be on a different path. Only the path you're currently on is relevant.

Just like in the game where you have a person pick a number from 1 to 100 and guess his number in no more than seven tries. There may be a nearly 50/50 chance of getting higher or lower (allowing for 1 guess being correct), but each answer eliminates nearly 50% of the possibilities.

Actually, maybe the brain teaser is more detached from reality, since there's just no way you're going to get perfectly random answers in the real world. Trying to figure out how to test this to make my point kind of tilts me at least a little bit towards what Gardner said, since asking humans just isn't as easy as using cards (the truth of the Monty Hall problem is easily provable by simulating the exercise with 3 cards). You'd have to do something weird, such as having the parent flip a coin to decide which child's sex they'd tell me about to make the brain teaser probabilities work out. Otherwise you'd have to resort to saying that asking lousy subjects caused the results of your real world test to be unrealistic.

But brain teasers exist in a universe where native tribes can live happily with rampant infidelity in their tribe for generations until a missionary tells them at least one person is unfaithful and then the tribe has to respond with a mass murder of all the men. I think the original question can be safely answered assuming a perfectly logical, yet perfectly random universe.

JeffJo · Feb 12, 2011

BobG said:

I can agree with this point in theory, but Gardner's statement is slightly detached from reality.

No, it is the one that is grounded in reality. It is the interpretation that "We have knowledge of only one of two genders in this family, and it is male" is 100% equivalent to "this family has at least one boy" that is detached. Since, by definition, we know that our knowledge is incomplete, we have to consider that a family might have a boy, but we don't know it.

Again, look at my arguments using the law of total probability. I'll even express it a different way:

I have two children, and one is the gender I have written down and sealed in this envelope. What is the probability I have one boy and one girl (I'm reversing it for ease of comparison)? If I reveal the gender I wrote, and it is "Boy," what is the probability? If I reveal it, and it is "Girl," what is the probability?

I think everybody agrees the first probability I asked for is 1/2. The second is a conditional probability, which can be expressed P(I have one of each|I wrote down "boy"). The third is P(I have one of each|I wrote down "girl").

I think everybody will also agree that the last two probabilities have to be the same. Let's call that value Q.

One rule of probability, the multiplication rule that underlies Bayes' Rule, is that P(I have one of each|I wrote down "boy")*P(I wrote down "boy")=P(I have one of each and I wrote down "boy")=Q*P(I wrote down "boy"). Similarly P(I have one of each|I wrote down "girl")*P(I wrote down "girl")=P(I have one of each and I wrote down "girl")=Q*P(I wrote down "girl").

Adding these two together gives P(I have one of each and I wrote down "boy")+P(I have one of each and I wrote down "girl") = Q*[P(I wrote down "boy")+P(I wrote down "girl")]. But another rule, called the Law of Total Probability, says that P(I have one of each and I wrote down "boy")+P(I have one of each and I wrote down "girl")=P(I have one of each). The same rule says P(I wrote down "boy")+P(I wrote down "girl")=1. So P(I have one of each)=Q. But we know what P(I have one of each) is: 1/2. So Q=1/2.

That's a very long-winded way of saying that the second and third probabilities have to be the same, and that since there are only two possibilities for what I wrote down, the answer is the same regardless of what I wrote down. So the answer has to be the same for all three.
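The envelope argument can be sketched numerically. The parameter b below is an assumption introduced only for illustration: it is the chance that a parent with one boy and one girl writes "boy." The symmetric case is b = 1/2; b = 1 models a parent who always writes "boy" when possible.

```python
from fractions import Fraction


def envelope(b):
    """Return (P(mixed | wrote "boy"), P(mixed | wrote "girl"), P(mixed)).

    b is the assumed chance that a parent of one boy and one girl writes "boy".
    Parents of two boys always write "boy"; parents of two girls always write "girl".
    """
    quarter, half = Fraction(1, 4), Fraction(1, 2)
    p_wrote_boy = quarter + half * b
    p_wrote_girl = quarter + half * (1 - b)
    mixed_given_boy = half * b / p_wrote_boy
    mixed_given_girl = half * (1 - b) / p_wrote_girl
    # Law of total probability: the weighted sum must recover P(mixed) = 1/2.
    total = mixed_given_boy * p_wrote_boy + mixed_given_girl * p_wrote_girl
    return mixed_given_boy, mixed_given_girl, total


symmetric = envelope(Fraction(1, 2))  # both conditionals equal 1/2
always_boy = envelope(Fraction(1))    # 2/3 after "boy", 0 after "girl"
print(symmetric, always_boy)
```

In both cases the weighted total is 1/2. The two conditional answers can only differ from 1/2 if they also differ from each other, which requires a known bias.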

I've repeated this argument about five different ways now. Nobody has tried to address any of them, because they are all based on indisputable logic. So why can you not believe it is the right answer?

I think it's safe to say that questions asked in brain teasers are asked in a "Clue" type environment

Why? And please consider that you might think this because it gets the answer you want to be right, rather than deducing it first and using it to derive the answer. Also note that this issue is only important to probability problems (because you need to be concerned with what could have happened, but didn't) and even then, only when the possibilities overlap. So you can't compare it to most brain teasers you've seen.

In other words, if the person answered "I have a girl", you would have narrowed things down to a different probability than if the person answered "I have a boy" and would be on a different path. Only the path you're currently on is relevant.

I agree that if a person answered such a question, the answer is 1/3. I just see no reason, except to force the answer to be 1/3, to think they answered "Do you have a boy?" It is backwards to assume that the answer defines the unknown question that was asked. It is why Gardner said the question was ambiguous.

While we don't have to make brain teasers realistic, we do need to assume they are consistent. There are lots of questions we could assume were asked, but if we assume it was "Do you have a boy," then we also need to assume some people in the same universe are asked "do you have a girl?" So anybody who answered "no" is eliminated. But while none of the mixed families will answer "no," half will answer "yes" to a different question, and we still get 1/2 as the correct answer.

Jimmy Snyder · Feb 12, 2011

Dadface said:

What is meant by a universe as used here? As I see it at present, the original question refers to a universe which is just a family with two children, and the other information (boy born on Tuesday) is not needed to get the real answer.

There are two universes. One universe is a room A filled with fathers who have two children, one of which is a boy born on Tuesday. The other universe is a room B filled with fathers who have two children, but not necessarily a boy born on Tuesday. Note that some of the fathers in room B have a boy born on Tuesday, but not all of them do. Someone from room A or room B greets you. You don't know which room they came from. They ask you the question in the OP. You still don't know which room they came from.

Dadface · Feb 13, 2011

Jimmy Snyder said:

There are two universes. One universe is a room A filled with fathers who have two children, one of which is a boy born on Tuesday. The other universe is a room B filled with fathers who have two children, but not necessarily a boy born on Tuesday. Note that some of the fathers in room B have a boy born on Tuesday, but not all of them do. Someone from room A or room B greets you. You don't know which room they came from. They ask you the question in the OP. You still don't know which room they came from.

Ah, I get it. Thank you Jimmy. I don't think it makes a difference to the main point I am trying to put across.

JeffJo · Feb 13, 2011

Imagine a rather unique casino game. The players have two cards, one marked "Doubles" and one "Not Doubles" on the face-down side, but identical on the other. After the croupier rolls two standard dice behind a screen, he announces "One of the dice landed on a ..." and names a number. Each player can wager $1 on Doubles of this number, or Not Doubles, by pushing out a dollar and the appropriate face-down card. Then the dice are revealed, and all players' cards. If bets were placed on both options, the winners split the losers' money. If not, everybody takes back their bets.

Obviously, the croupier needs a strategy for deciding what number to announce when two different numbers come up. This strategy affects the probabilities after each number he announces. For example, if he always announces the highest number, the probability of doubles is 1/11 if he announces a six, but 1/1 if he announces a one.

Interestingly, if everybody else knows what that strategy is and bets accordingly (i.e., gambling on the longer odds a proportionate number of times), but you don’t, you still will break even by betting on doubles once every six games. And I can show you how to make an Excel spreadsheet that simulates it. This means the average probability of doubles is 1/6, even if you don't know the strategy. It may differ on individual rolls, but you can't know how, so your best - and correct - estimate is the average.
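In place of the spreadsheet, here is a small exact enumeration. It is only a sketch: the two strategy functions are illustrative assumptions (the second one, where the croupier names one of the two dice at random, is not described above), and it checks the probabilities rather than the betting payouts.

```python
from fractions import Fraction
from itertools import product

ROLLS = list(product(range(1, 7), repeat=2))  # 36 equally likely ordered rolls


def announce_highest(a, b):
    """Croupier always names the higher of the two numbers."""
    return {max(a, b): Fraction(1)}


def announce_random_die(a, b):
    """Assumed alternative: croupier names one of the two dice at random."""
    if a == b:
        return {a: Fraction(1)}
    return {a: Fraction(1, 2), b: Fraction(1, 2)}


def doubles_given_announcement(strategy):
    """Return (P(doubles | number announced), P(number announced)) as dicts."""
    p_announce = {k: Fraction(0) for k in range(1, 7)}
    p_doubles_and_announce = {k: Fraction(0) for k in range(1, 7)}
    for a, b in ROLLS:
        for k, p in strategy(a, b).items():
            p_announce[k] += p / 36
            if a == b:
                p_doubles_and_announce[k] += p / 36
    conditional = {
        k: p_doubles_and_announce[k] / p_announce[k]
        for k in p_announce
        if p_announce[k] > 0
    }
    return conditional, p_announce


def average_doubles(strategy):
    """Probability of doubles averaged over whatever gets announced."""
    conditional, p_announce = doubles_given_announcement(strategy)
    return sum(conditional[k] * p_announce[k] for k in conditional)


highest, _ = doubles_given_announcement(announce_highest)
random_die, _ = doubles_given_announcement(announce_random_die)
print(highest[6], highest[1], random_die[3])
print(average_doubles(announce_highest), average_doubles(announce_random_die))
```

With the "highest" strategy the conditional probability swings from 1/11 to 1, yet the average over announcements is 1/6 for both strategies.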

This strategy represents what you all have been calling a "universe." This example shows that you don’t need to know what the universe is, to answer the question "what is the probability that two '1 in N' chances match, if I tell you what one of them is, based on the knowledge you have." It is 1/N, not 1/(2*N-1). The only point to the "universe" is that if you know what it is, you can use that to refine your answer. But if you don't, you can't assume there is a non-standard one in a brain teaser.

If a man tells you "I have two children, and one of them is a boy," but you don’t know why he would tell you about one gender over another (i.e., what "universe" you are in), the probability of two boys is 1/2. Not 1/3. It is only if you know, for a fact, that he must tell you about a boy if he has one, that your probability is 1/3.

Dadface · Feb 14, 2011

I meant to write another post in this thread about the "Boys Puzzle" but I have been stopped in my tracks (temporarily, possibly) after reading more about the "Boy Girl Paradox" which was first published by Martin Gardner in 1959. The original question was phrased as follows.

Mr Smith has two children. At least one of them is a boy. What is the probability that both children are boys?

Apparently the paradox arises from the fact that the answer could be 1/2 or 1/3 depending on how it is found out that one child is a boy. Look at the information in the question:

1. Mr Smith has a son
2. Mr Smith has a second child

We know that, regardless of how the information was obtained, the son has one sibling. We further know that there is a fifty percent chance that the sibling is a boy and a fifty percent chance that it is a girl. The answer is 1/2. Where's the paradox?

(Considering age differences makes no difference because we know, without being told, that the son is either the oldest or the youngest child.)

JeffJo · Feb 14, 2011

Dadface said:

We know that, regardless of how the information was obtained, the son has one sibling. We further know that there is a fifty percent chance that the sibling is a boy and a fifty percent chance that it is a girl. The answer is 1/2. Where's the paradox?

If the family has two boys, which one is "the son" that I colored in red? If you can't identify him, you have to consider the chances of the two children together.

Essentially, you are treating the problem statement as "I know about one particular child of Mr. Smith's. That child is a boy." When that is the problem statement, the answer is indeed 1/2 because the un-pictured child has a 1/2 chance to be a boy, as you reasoned. But that is assuming information not included in the problem. You are assuming a specific child is identified.

Others treat it as "I know about both of Mr. Smith's children. The pair includes at least one boy." When that is the problem statement, the answer is 1/3 because 3/4 of all the possible families include a boy, and 1/4 include two, so the answer is (1/4)/(3/4)=1/3. But this also assumes information not included in the problem. It assumes you actually know both genders: you need to, in order to always find a boy when one is there. It is implicit that you can know about one boy, but not both. It also assumes the intent to look for a boy, something you can't deduce from the statement alone.

Most people who take the second approach accuse those who answer "1/2" of taking the first. That's why it is usually compared to the Mr. Jones question. While many do exactly that - as did you - that isn't the only way to get 1/2. I prefer to think of it as "I know a gender that exists in Mr. Smith's family." This assumes nothing, since all that is really implicit in the statement is that you know a gender. This is more complicated to solve, because you have to allow for the possibility that you might know "girl." In fact, the whole reason conditional probability is unintuitive is that you have to allow for things you know didn't happen. But the short of it is that in the 1/2 that have one boy and one girl, it is just as likely to know about the girl as the boy.

An example of "knowing a gender" might be if a new family moves into the three-bedroom house next door. You might deduce they have a boy by seeing a boy's bicycle in the driveway, but you can't associate it with a specific child.

+++++

But since you are interested in history, I have traced what may (it's speculation, but it seems plausible) be the history of the Tuesday Boy Problem, and how top-level experts can't always agree.

In his 1988 book "Innumeracy," Professor John Allen Paulos of Temple University included a variant of the Two Child Problem as an example, paraphrased: "If Myrtle is a girl from a family of two, what is the probability she has a brother?" His answer was 2/3 (note that he asks for the probability of a mixed family, not two girls).

J. L. Snell (Dartmouth) and R. Vanderbei (Princeton) pointed out in a 1995 article titled "Three Bewitching Paradoxes" that, by giving the girl an uncommon name, he inadvertently changed the problem. The probability Myrtle has a brother should be 2/(4-p), where p is the probability a girl is named Myrtle (this is the same formula that would be used for the Tuesday Boy question, if it asked for a mixed family). In his 2008 book "The Drunkard's Walk," Leonard Mlodinow (Stanford, and a frequent collaborator with no less than Stephen Hawking) used the same problem, about a girl named Florida.

Snell and Vanderbei ignored the illogic of having two girls named Myrtle in the same family, and included that case in their count. Mlodinow argued that the factor for it depends on p^2, which is negligibly small compared to the other kinds of families. Which is wrong: compared to the fraction of the families he is counting, it depends on p^1.

Giulio D'Agostini (Università “La Sapienza” and INFN, Rome, Italy) attempted to correct for that factor, but did it wrong by disallowing only two girls both named Myrtle/Florida. He still allowed two girls who share any other name. But at least he got the right answer, 1/2, which is quite trivial to prove by a different method!

Define the following events: M2 is the event where a family of 2 children includes a girl named Myrtle. MO is the event where she is the older sibling, and in MY she is the younger. Finally, MB is the event where she has a brother. Everybody will agree that the probability Myrtle has a brother, given that she is either the older or younger sibling, is 1/2. Which is the first statement in this progression:

1. P(MB|MO)=1/2
2. P(MB|MY)=1/2
3. P(MO|M2)=Q (Most will say it is actually 1/2. I use a variable because I don't need to know it, and the error I'll demonstrate makes it a little more than 1/2.)
4. P(MY|M2)=1-Q (That is, MO and MY represent all possibilities in M2, and do not overlap)
5. P(MB|MO)*P(MO|M2) = P(MB and MO|M2) = Q/2
6. P(MB|MY)*P(MY|M2) = P(MB and MY|M2) = (1-Q)/2
7. P(MB and MO|M2) + P(MB and MY|M2) = P(MB|M2) = Q/2 + (1-Q)/2 = 1/2.
QED.

The error that Snell, Vanderbei, and Mlodinow make is that by allowing two girls named Myrtle/Florida in the same family, MO and MY overlap. Both P(MO|M2) and P(MY|M2) are equal to Q=2/(4-p), which is greater than 1/2. But they still use equation #7, which is invalid if MO and MY overlap. As a result, they get P(MB|M2)=2/(4-p), or the probability Myrtle has a sister is (2-p)/(4-p).
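The 2/(4-p) figure can be reproduced by enumerating the model in which names are assigned independently, so that two girls in one family may both be named Myrtle. This is a sketch of that assumed model only (the function name is made up); it does not attempt the corrected naming model.

```python
from fractions import Fraction
from itertools import product


def p_brother_independent_names(p):
    """P(Myrtle has a brother | family of two includes a girl named Myrtle),
    assuming each child is independently a boy (1/2), a girl named Myrtle (p/2),
    or a girl with another name ((1-p)/2)."""
    kinds = {"boy": Fraction(1, 2), "myrtle": p / 2, "other_girl": (1 - p) / 2}
    p_m2 = Fraction(0)
    p_m2_and_brother = Fraction(0)
    for older, younger in product(kinds, repeat=2):
        weight = kinds[older] * kinds[younger]
        if "myrtle" in (older, younger):
            p_m2 += weight
            if "boy" in (older, younger):
                p_m2_and_brother += weight
    return p_m2_and_brother / p_m2


p = Fraction(1, 7)  # same rarity as "born on a Tuesday"
brother = p_brother_independent_names(p)
sister = 1 - brother
print(brother, sister)  # 14/27 and 13/27
```

With p = 1/7 this gives 14/27 for a brother and 13/27 for a sister, matching the Tuesday Boy numbers.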

And what is even worse, is that my derivation above is wrong. Specifically, equation #2 is wrong. #1 is right because the gender of a second child is independent of both the gender and name of the first child, but the name (not gender) of a second child will depend on the name of the first child, if the gender is the same. I won't go into details unless asked, but it turns out that P(Myrtle has a brother) is approximately equal to 1/2+[P(a girl receives the "average" name)-P(a girl gets named Myrtle)]/8. In other words, for uncommon names, the probability is greater than 1/2!

Dadface · Feb 14, 2011

JeffJo, thanks for your replies. Consider the following:

1. Mr Smith has a son.
2. Mr Smith has two children.

The two children must be siblings and therefore the Smith children must be either brother with brother or brother with sister. It follows that if there is a son the probability of the second child being a son is 1/2.

This is an attempt to structure my reasoning such that I don't need to identify any of the children.

JeffJo · Feb 14, 2011

Dadface said:

JeffJo, thanks for your replies. Consider the following:

1. Mr Smith has a son.
2. Mr Smith has two children.

The two children must be siblings and therefore the Smith children must be either brother with brother or brother with sister. It follows that if there is a son the probability of the second child being a son is 1/2.

This is an attempt to structure my reasoning such that I don't need to identify any of the children.

Correction: it must be brother with brother, older brother with younger sister, or younger brother with older sister. Each of those cases is equally likely to exist. The fact that you cannot specify the relative ages for the first group changes how you account for them. This is the point mathematicians are trying to demonstrate with this problem.

But they overlook that "girl" cannot apply to that case, but can apply to the others. That also changes how you account for them. The answer is 1/2, but for reasons different than you propose.

JeffJo · Feb 15, 2011

For some reason, I can't edit posts today. I misremembered where Leonard Mlodinow teaches. He's at Caltech, not Stanford (there is another professor who is part of the story - because he is one of the few who acknowledges the ambiguity in these problems - at Stanford).

Dadface · Feb 15, 2011

JeffJo said:

Correction: it must be brother with brother, older brother with younger sister, or younger brother with older sister. Each of those cases is equally likely to exist. The fact that you cannot specify the relative ages for the first group changes how you account for them. This is the point mathematicians are trying to demonstrate with this problem.

But they overlook that "girl" cannot apply to that case, but can apply to the others. That also changes how you account for them. The answer is 1/2, but for reasons different than you propose.

Thanks for your feedback, JeffJo. When I wrote post 62 I forgot that age differences need to be considered. What a dope I am. I wasn't familiar with this type of problem before but now I'm hooked.

Dadface · Feb 16, 2011

Can someone confirm, or otherwise, that with problems of this type we must use the information contained in the problem statement only, even though this might be ambiguous and open to different interpretations, and that we are not allowed to bring our own extra knowledge to the task? Consider the boy girl paradox. We don't need to be told that the two siblings have different ages; this is knowledge we bring to the task. We don't need to be told that if there are two boys they can be identified because they have numerous other differences in addition to having different ages, but this is knowledge it seems we are not allowed to bring to the task. The restrictions imposed on what we can and cannot use seem arbitrary and artificial. Consider this: we are required to suspend all knowledge that if there are two boys they will have different age-related bodily conditions, but retain the knowledge that they (siblings) have different ages.

JeffJo · Feb 16, 2011

Dadface said:

Can someone confirm, or otherwise that with problems of this type we must use the information contained in the problem statement only, ...

Of course. But like Inigo Montoya said, I don’t think that means what you think it means.

We know the family has two distinct children. That means that they can be differentiated from one another by somebody, but not necessarily by us. And we do know this, whether or not it is stated in the problem.

Age is a very convenient tool to use to express the difference in terms of an order. But it isn't the only one we could use. I prefer to alphabetize the first name of each child's best friend, or order them clockwise (relative to mother) as they sit around the dinner table. As long as it is unique (which I suppose friends' names might not be, but let's assume they are) and independent of gender, all that matters is that the order exists.

We don’t even have to know how the order applies to any family; we use it only to calculate the proportions of the various groups of families we need to keep track of. Here are some simpler examples:

A six-sided die has two red sides, and four white sides. The red sides are opposite each other. Other than that, no side can be distinguished from another. What is the probability that a red side will come up on a roll? Is it 1/2, because there are two colors? Or 2/6=1/3, because there are two red sides out of six, even though we can't distinguish them?
Two normal dice are completely indistinguishable. What is the probability their sum is 7 when you roll them? Is it 1/11, because there are 11 different totals that could come up? Or 3/21=1/7, because there are three unordered pairs of numbers that add up to 7, out of 21 possible unordered pairs? Or 6/36=1/6, because there are 36 possible ordered pairs, and 6 of them total 7?

The last answer is right for each, of course. The different groupings exist, and we know they exist, even though it might be ambiguous what group a particular roll belongs to. The problem with how this Paradox is normally presented is that it makes it sound like we ultimately need to know the ages, and we don't.
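Both dice examples can be checked by counting equally likely outcomes. A minimal sketch, assuming fair dice and taking the colored die to have two red faces and four white ones (the face order in the list is arbitrary):

```python
from fractions import Fraction
from itertools import product

# Colored die: six equally likely faces, two of them red.
faces = ["red", "white", "white", "red", "white", "white"]
p_red = Fraction(faces.count("red"), len(faces))

# Two ordinary dice: count ordered pairs, even if we cannot tell the dice apart.
rolls = list(product(range(1, 7), repeat=2))
p_sum_7 = Fraction(sum(a + b == 7 for a, b in rolls), len(rolls))

# The unordered pairs exist too, but they are not equally likely:
# (1, 6) can occur two ways, (1, 1) only one way.
unordered = {tuple(sorted(r)) for r in rolls}

print(p_red, p_sum_7, len(unordered))
```

The 21 unordered pairs are real groupings, but only the 36 ordered rolls carry equal weight, which is why the ordered count gives the right answer.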

+++++
But you can't add information to a problem if it isn’t there, or clearly implied. That is why the Two Child Problem is generally ambiguous, but the best answer is 1/2.

When a man walks up to you and says "Mr. Smith has two children, and one is a boy," do you know, for a fact, how he decided to tell you that? In particular, do you have any reason to believe one of these possible reasons (and there are others) is preferable to another? (Note that you have to assume he never intended to tell you all the information that applies to Mr. Smith, since there would be no probability problem then. That makes all possibilities a little unrealistic, so don’t be surprised by it.)

1. He knows about only one child of Mr. Smith's.
2. He picked a gender at random from what is either one, or two, that exist in the family.
3. He is predisposed to mention boys, and so he tells you about a boy if he can, and a girl if he can't.
4. He is predisposed to mention girls, and so he tells you about a girl if he can, and a boy if he can't.

He could say "one is a boy" with any of these, so you can't pick one that must be true. So most wordings of the problem are ambiguous. But most of the reasons that you can put in the list require adding some information. Does the problem say he knows about both of Mr. Smith's children (as #2, #3, and #4 require)? Some do, but the Mr. Smith one doesn't. Does it say he is predisposed toward one gender or the other? Very few do. And the 1/3 answer requires that the answer to both questions be "yes." Without both, the answer is 1/2.
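To compare these reasons side by side, each one can be written as the chance that the man says "one is a boy" for a given family. This is a sketch with assumed modelling choices: families are (older, younger) pairs with equal probability, and in the first reason the one child he knows about is equally likely to be either.

```python
from fractions import Fraction
from itertools import product

FAMILIES = list(product("BG", repeat=2))  # BB, BG, GB, GG, each with probability 1/4

# For each reason: P(he says "one is a boy" | family). Labels are illustrative.
REASONS = {
    "knows only one child": lambda f: Fraction(f.count("B"), 2),
    "random gender present": lambda f: Fraction(1, len(set(f))) if "B" in f else Fraction(0),
    "prefers boys": lambda f: Fraction(int("B" in f)),
    "prefers girls": lambda f: Fraction(int("G" not in f)),
}


def p_two_boys_given_says_boy(p_say_boy):
    """P(two boys | he says "one is a boy"); the common 1/4 prior cancels."""
    said = sum(p_say_boy(f) for f in FAMILIES)
    return p_say_boy(("B", "B")) / said


results = {name: p_two_boys_given_says_boy(rule) for name, rule in REASONS.items()}
for name, value in results.items():
    print(f"{name}: {value}")
```

Only the "prefers boys" model produces 1/3, and the opposite predisposition produces 1. The two models with no predisposition give 1/2.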
