Is Marilyn Vos Savant wrong on this probability question?

micromass · Feb 17, 2012

andrewr said:

Simple, you go get a dice. Roll it 20x, and fairly (use a can to shake it vigorously before dumping). Record the 20x results. Then ask me whether or not you rolled a sequence of repeating digits "11111" "22222" "33333" ... "66666" (20x), as opposed to what the dice rolled.

No, that's not what's going on here. The deal is: go get a dice and roll it 20x, then see whether you rolled the specific sequence 14325231542341632165. The answer will be no most likely.

Let's continue with the analysis. Let's write a computer program, do billions of dice rolls, and measure whether 14325231542341632165 or 11111111111111111111 is more likely. Are you willing to accept the answer of a computer simulation??

eg:
Let's actually test the GAME as Marilyn suggested, and see who is right statistically (eg: in a sample of 10 games.)

LOL, a sample of 10 games. You know very well that you need to roll it many more times to have something statistically significant.

But, ok, are you prepared to do the computer simulation I proposed?? I'll even code it for you.

lavinia · Feb 17, 2012

Hurkyl said:

What is "confidence"? Is it anything other than "I know the math says one thing, but I don't want to believe it"? (edit: I don't mean to be condescending, but it is really easy to try and rationalize one's intuition when faced with the fact it's simply wrong)
The mistake I mentioned earlier -- here is one way to make that mistake:

I'm going to invent a statistical test: my statistic T is the entropy distribution of observed frequencies. Computing T for 1111... gives a result less likely than computing T for 6623... Therefore, I infer that 6623... is what was rolled

Hurky I see your points and agree but something is bothering me that maybe you can explain.

If I take independent samples from a distribution with finitely many values then for a large sample wouldn't I expect the frequencies in the sample to be close to the frequencies in the distribution? So forgetting the order of the digits in the not all 1's sequence - wouldn't it be more expected since its frequencies are more like the underlying uniform distribution? And I guess it is being assumed that the distribution is uniform in this case or at least very far from constantly 1.

chiro · Feb 17, 2012

lavinia said:

Hurky I see your points and agree but something is bothering me that maybe you can explain.

If I take independent samples from a distribution with finitely many values then for a large sample wouldn't I expect the frequencies in the sample to be close to the frequencies in the distribution? So forgetting the order of the digits in the not all 1's sequence - wouldn't it be more expected since its frequencies are more like the underlying uniform distribution? And I guess it is being assumed that the distribution is uniform in this case or at least very far from constantly 1.

It depends on the specific probabilistic properties of the process.

If the process has very complex conditional probabilistic properties of any order that are known, then this information can be incorporated when you are trying to get likelihood information for a parameter.

This problem is essential in statistics. What we usually do is we assume that our data fits a specific model and then based on the data we find out how likely this really is.

Again with this kind of problem there are many perspectives you can take and a large amount of statistical work deals with the task of trying to get representative samples or design processes where a real representative sample can be obtained that 'represents' the real process in the best way possible (i.e. the distribution of the sample is a good representation of the underlying process distribution).

Statisticians have to do this all the time and consider the kinds of things that the OP has brought up. In situations like this we combine solid mathematical foundations in statistical theory with some kind of 'inner judgement', drawing on both general statistical understanding and domain-specific knowledge, to decide when we should 'repeat the experiment just to be sure' or, if we don't have the time or resources to do the experiment again, 'look at the data and process it further'.

andrewr · Feb 24, 2012

pwsnafu said:

Why?

Loren wrote: "Say you plan to roll a die 20 times." Clearly there has been no rolling done.

I fail to see how Marilyn's "game" is relevant to the question Loren posed.

Emphasis mine:

Oh come forth(right) and use an English grammar book.
Loren said "YOU" and she used the infinitive "to".
Therefore, there is a colloquial expression and a variable interpretation of the hypothetical question involved.

Marilyn has the right to use her own opinion(eg: the YOU) about how Marilyn would roll and when/how she would report the results.

Her reply has a conditional answer for a given variation of the original question's meaning.

But let’s say you tossed a die out of my view and then said

The colloquial expression "But ... you" is a hypothetical question, meaning "if you"; and notice, Marilyn casts it in the PAST tense instead of the equivocal infinitive.

Your failure includes mis-understanding the sphere of discourse problem Marilyn was confronted with in the "OP" (I still haven't and won't read the parade article itself before reading Hurkle's response.)

The infinitive does not strictly define "when" an event happens. Connotation is NOT the same as denotation.
http://en.wikipedia.org/wiki/Infinitive

They do not have tense, aspect, moods, and/or voice, or they are limited in the range of tenses, aspects, moods, and/or voices that they can use. (In languages where infinitives do not have moods at all, they are usually treated as being their own non-finite mood.)

I read several languages, and the question Loren asked is a trick question.

As you (pf...) falsify the antecedent of Marilyn's SECOND response (as you clearly do) then her consequent statement SHOULD NEVER HAVE BEEN DISCUSSED AT ALL by you. eg: Marilyn is thus *CORRECT* in her evaluation of your interpretation of Loren's question, (for her answer STOPS before the BUT can be evaluated as TRUE -- no "BUT" about it.)

Anyone who judges Marilyn according to the consequent by saying the antecedant of Marilyn's reply can only be true in one way, is making a psychological and logical error. (by a fallacy...!)

Again, I was asking Hurkle how he judged the antecedent of Marilyn's hypothetical as TRUE;
He might have a legitimate answer -- but YOU do not, so far!

As you persist in attacking Marilyn -- tell me, how do you show her antecedent *is* DEFINITELY True in order to evaluate the consequent as an error?

No court would vindicate a judgment of Marilyn based on the low IQ grammar understood by most people in this thread.

Marilyn scored high in English as well as math; Take it all into account!

micromass · Feb 24, 2012

Are you now making your case by using linguistics?? This is not good...

Bacle2 · Feb 24, 2012

Anyone who judges Marilyn according to the consequent by saying the antecedant of Marilyn's reply can only be true in one way, is making a psychological and logical error. (by a fallacy...!)

Listen, I usually make an effort not to carp on others' grammar unless it is egregiously wrong, given my own imperfections. Still, considering you're accusing us here of using "low IQ grammar" (ever heard of punctuating as low-IQ grammar, before chiding others' grammar?), an unclear term, I will make an exception and will carp on every small inaccuracy of yours. I like to do that with those who claim to be smarter than others.

1)"... by a fallacy"? Is that high-IQ grammar?

2) It is antecedent, not antecedant, mr high-IQ grammar. If you want to talk down to others you may want to spell-check before replying.

3)Learn the _actual names/handles_ of others : I, with my low-IQ can tell it is HURKYL.

4)How do you know the errors are of a psychological nature?

5) Do you have a copy of Marilyn's IQ test? I have asked her to support her claims of having the highest IQ, and she has not replied, either personally (I included my e-mail when I asked) or on her site. Moreover, none of the Guinness Book of Records editions of the last few years include her -- in any category. Still, VS repeatedly takes strong ethical positions, chiding others' behavior. Maybe she would care to live by the standards she wants to enforce on others.

Now, would you please include a copy , or at least tell us her score, and some details of her test?

6)"Marilyn scored high in English as well as math; Take it all into account!"

Besides the above point, _you_ may want to consider that Marilyn back-tracked in a very ungracious way when her claim about the proof of Fermat's Last Theorem was challenged.

And I doubt there is any relation between the level of math in an IQ test and advanced mathematics, tho..., maybe there is (sic) "by a fallacy"

Sorry for muh, rekuest, IQ majesty I is no have low IQ .

Bacle2 · Feb 24, 2012

micromass said:

Are you now making your case by using linguistics?? This is not good...

Don't forget his use of appeal to authority--a fallacy -- by his mention that he knows several languages.

pwsnafu · Feb 24, 2012

andrewr said:

Her reply has a conditional answer for a given variation of the original question's meaning.

Doesn't change the fact that she doesn't explain what her assumptions for the second half were. If you are going to change the intention of the question, then be clear in stating the assumptions. If you, andrewr, had read the first half of this thread, you would know that's what the bulk of the discussion boils down to.

The colloquial expression "But ... you" is a hypothetical question, meaning "if you"; and notice, Marilyn casts it in the PAST tense instead of the equivocal infinitive.

Your failure includes mis-understanding the sphere of discourse problem Marilyn was confronted with in the "OP" (I still haven't and won't read the parade article itself before reading Hurkle's response.)

The infinitive does not strictly define "when" an event happens. Connotation is NOT the same as denotation.
http://en.wikipedia.org/wiki/Infinitive

Yes, I understand all that; that's why I am able to make the claim that she shouldn't have done so in the first place.

I read several languages

As others have noted that's an appeal to authority, but I'll just say: so do I.

and the question Loren asked is a trick question.

Trick question (and I disagree on that) or not, she's still wrong.

Again, I was asking Hurkle how he judged the antecedent of Marilyn's hypothetical as TRUE;

That is why we have PMs on this forum.

He might have a legitimate answer -- but YOU do not, so far!

Apart from the fact that I'm not the only one arguing the irrelevance angle (see Fredrik's post #72), I already have given a criticism of Marilyn's second answer (see the end of post #84).

But because you clearly don't chase up references, to make this explicit (again): Marilyn is right when she claims that "It’s far more likely that the roll produced a mixed bunch of numbers than a series of 1’s." But she is wrong when she claims that 66234441536125563152 is a mixed bag of numbers. It is a very specific sequence. That's why it is equal odds.
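The distinction between a category of outcomes and one specific sequence can be checked with exact arithmetic. The following is a minimal Python sketch, assuming a fair six-sided die and independent rolls; the function names are made up for illustration and are not from any post in the thread.

```python
from fractions import Fraction

ROLLS = 20
SIDES = 6
TOTAL = SIDES ** ROLLS  # number of equally likely ordered outcomes for a fair die


def prob_exact_sequence(seq: str) -> Fraction:
    """Probability that 20 fair rolls produce exactly `seq`, in this order."""
    if len(seq) != ROLLS or any(ch not in "123456" for ch in seq):
        raise ValueError("expected a string of 20 digits, each 1-6")
    return Fraction(1, TOTAL)


def prob_not_all_ones() -> Fraction:
    """Probability of the whole category 'anything other than twenty 1's'."""
    return 1 - prob_exact_sequence("1" * ROLLS)


p_ones = prob_exact_sequence("1" * ROLLS)
p_specific = prob_exact_sequence("66234441536125563152")

print(p_ones == p_specific)        # True: two specific sequences, equal odds
print(float(prob_not_all_ones()))  # the category as a whole is almost certain
```

The category "not all 1's" is nearly certain only because it contains almost every sequence; any single member of it is exactly as rare as twenty 1's.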

No court would vindicate a judgment of Marilyn based on the low IQ grammar understood by most people in this thread.

What court? Courts are for legal issues.
Apart from being a backhanded argumentum ad hominem, the use of "vindicate" is an appeal to emotion. You are stooping low when you have to resort to these tactics.

Marilyn scored high in English as well as math

Clearly you have not.

Hurkyl · Feb 24, 2012

lavinia said:

Hurky I see your points and agree but something is bothering me that maybe you can explain.

If I take independent samples from a distribution with finitely many values then for a large sample wouldn't I expect the frequencies in the sample to be close to the frequencies in the distribution?

Yes. The set of sequences whose frequencies are flat*, for example, contains around 5 \cdot 10^{13} elements. Each element is just as unlikely as 11111111111111111111, but there are so many of them.

Of course, the odds of picking something from this set are still only 1 in 75...

*: Well, they can't be flat because 20 isn't divisible by 6, so I mean the frequencies are 333344

Let me repeat that, for emphasis. When picking the sequence of 20 digits at random, you have a 1-in-75 chance of getting the flat distribution. The reason is entirely because there are many sequences whose frequencies are flat. Each individual sequence with this property is just as unlikely as any other sequence -- do not get the idea that the individual sequences with this property are somehow more likely than any other sequence.
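The two figures quoted here (about 5 \cdot 10^{13} flat sequences, about 1 in 75) can be reproduced by counting. This is a small Python sketch, assuming "flat" means four faces appear 3 times and two faces appear 4 times, as in the footnote; the helper name is hypothetical.

```python
from math import comb, factorial


def sequences_with_counts(counts):
    """Number of ordered sequences in which face i appears counts[i] times
    (a multinomial coefficient)."""
    result = factorial(sum(counts))
    for c in counts:
        result //= factorial(c)  # each division is exact
    return result


# One fixed assignment of counts to faces, e.g. faces 1-4 three times, faces 5-6 four times.
per_assignment = sequences_with_counts([3, 3, 3, 3, 4, 4])

# Any two of the six faces may be the ones that appear four times.
flat_total = per_assignment * comb(6, 2)

all_sequences = 6 ** 20

print(flat_total)                  # on the order of 5e13
print(all_sequences / flat_total)  # a little under 75
```

Every one of those sequences still has probability 1 / 6^20; the 1-in-75 figure comes purely from how many of them there are.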

Hurkyl · Feb 24, 2012

andrewr said:

"111111111111" 20x times would certainly be rejected as a loaded dice;

Replace 11111111111111111111 with any 20-digit sequence -- chosen before the dice are rolled -- and the same is true.

If you prepare to roll a dice 20 times, and THEN (consequently) provide a sequence of all 1's vs a series of mixed numbers; which is more likely to be the true answer about what was rolled?

(what does "mixed" mean? every number appears at least once?)

Your premise is not clear. If I operated according to the procedure

Roll 20 dice and write down the sequence
Come up with some other sequence of 20 digits uniformly randomly
Present both sequences to you

then under the hypothesis that I present to you 11111111111111111111 and 66234441536125563152, the odds are 50% - 50% that the dice really did roll 20 1's in a row.

But if I operated according to the procedure

Roll 20 dice and write down the sequence
If the dice roll was not all 1's, write down 11111111111111111111, otherwise write down 66234441536125563152
Present both sequences to you.

then under the hypothesis that I present to you 11111111111111111111 and 66234441536125563152, the odds are still 50% - 50% that the dice really did roll 20 1's in a row.

Of course, if I presented you with 11111111111111111111 and 66234441536125563125, the odds are strictly 100% that the latter is what was actually rolled.

If I operated according to the procedure

Roll 20 dice and write down the sequence
If the dice roll was not all 1's, write down 11111111111111111111, otherwise select another sequence of 20 digits uniformly randomly
Present both sequences to you.

then under the hypothesis that I present to you 11111111111111111111 and 66234441536125563152, the odds that the latter is what was actually rolled are roughly 6^{20} \approx 3.7 \cdot 10^{15} to 1

If I operated according to the procedure

Roll 20 dice and write down the sequence
Think up* some other 20-digit sequence that contains every digit at least once
Present both sequences to you.

then under the hypothesis that I present to you 11111111111111111111 and 66234441536125563152, the odds are strictly 100% that the former is what's rolled.

*: The particular method doesn't matter, so long as it satisfies the given constraint
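The four procedures above differ only in how the second (decoy) sequence is produced, so each one is a Bayes' rule calculation with a uniform prior on the roll. The sketch below is one way to write that out in Python with exact fractions. The function name is hypothetical, and two details are assumptions: the "uniformly random" decoy is drawn from the other 6^20 - 1 sequences, and in the last procedure the chance of thinking up this particular decoy is set to an arbitrary positive value, since it cancels.

```python
from fractions import Fraction

N = 6 ** 20  # number of possible 20-roll sequences, all equally likely a priori


def posterior_ones(p_pair_given_ones, p_pair_given_mixed):
    """P(the roll was all 1's | the pair {all 1's, 6623...} was presented).

    Arguments are the probabilities that the procedure presents exactly this
    pair when the roll was all 1's, and when the roll was 6623..., respectively.
    """
    prior = Fraction(1, N)
    joint_ones = prior * p_pair_given_ones
    joint_mixed = prior * p_pair_given_mixed
    return joint_ones / (joint_ones + joint_mixed)


# Procedure 1: decoy drawn uniformly from the other N - 1 sequences (assumption).
proc1 = posterior_ones(Fraction(1, N - 1), Fraction(1, N - 1))

# Procedure 2: decoy is all 1's, unless all 1's was rolled; then it is 6623...
proc2 = posterior_ones(Fraction(1), Fraction(1))

# Procedure 3: decoy is all 1's, unless all 1's was rolled; then it is uniform.
proc3 = posterior_ones(Fraction(1, N - 1), Fraction(1))

# Procedure 4: decoy contains every digit, so all 1's can never be the decoy.
# Fraction(1, 2) is an arbitrary positive placeholder; its value cancels.
proc4 = posterior_ones(Fraction(1, 2), Fraction(0))

for name, p in [("1", proc1), ("2", proc2), ("3", proc3), ("4", proc4)]:
    print(f"procedure {name}: P(all 1's was rolled) = {float(p):.3g}")
```

Under these assumptions the first two procedures give 1/2, the third gives 1 in 6^20, and the fourth gives certainty, matching the answers in the post.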

andrewr · Mar 9, 2012

micromass said:

No, that's not what's going on here. The deal is: go get a dice and roll it 20x, then see whether you rolled the specific sequence 14325231542341632165. The answer will be no most likely.

Let's continue with the analysis. Let's write a computer program, do billions of dice rolls, and measure whether 14325231542341632165 or 11111111111111111111 is more likely. Are you willing to accept the answer of a computer simulation??

It was a computer simulation that taught me the three shell problem; And I did accept it although I disagreed with my room-mate before I tried the program.

LOL, a sample of 10 games. You know very well that you need to roll it many more times to have something statistically significant.

But, ok, are you prepared to do the computer simulation I proposed?? I'll even code it for you.

Thank you, yes I would like to see how you code the program and verify it is at least algorithmically correct. I had some minor trouble in mine; for much of the tests, it is indeed nearly impossible to get an answer in "10" tries and so it is *very* difficult to verify that I coded the success counting section correctly for a 20x dice (so, if it ever does succeed, the program might just crash -- but I'm generally pretty good at debugging...)

For the 3 shell game I described, 10 runs is sufficient to notice a bias in the randomness, if there is one. I got 50/50 on my first try using the digits of pi mod 2 to choose among the two remaining shells. Not exactly random, but a good enough test.

I include the 3 shell casino, just as an example of how I code a probability demonstration, and a little fun. Let's have everyone play... ! and gather cumulative statistics...
I don't know about the 20x dice throw; but it won't hurt for a few thousand people to see if they can manually outguess Python's well-tested shuffling randomizer. Mersenne Twister core, I believe -- but pretty good.

If you catch a bug, let me know where it is and why it is a bug in the code. :)
I'll fix it, if it is indeed a bug.

And, again -- Thank you for your offer to code something for me.
I love integrity, Micromass, it *always* impresses me; and it will save me some time.

I know C,C++,Java,Python,Fortran,Cobol,Snobol,assembly -- but here at the Farm (just a small one) we mostly have power processors free to do number crunching. Don't get me wrong, this isn't IBM's Haupage New York super-computer room; but I do have some spare computing...

However, I can't use x86 based binaries; I *do* need source code.

If you read my thread on converting a binomial/normal data distribution, you'll note that even at 500,000 data points, the Python Gaussian random number generator has an inexplicable defect near the mean value; it can be seen in all three graphs, although it is a very small bias.

I *do* believe this is a problem with the math co-processors on the Intel platform. I also had to borrow one to run a test of the casino under windows. Intel's fpu has a minor underflow problem in the log function, and when used to produce a univariate random variable by inversion (e**-0.5x**2) by anti/inverse -function-- the problem shows up in the graph.

I tried to work around that in the casino by using shuffling of an unbiased deck in my example program -- and I have commented lines that allow you to see the random numbers generated and verify they are reasonably "fair", or to even replace the random number generator with one of your own. (not that it's really important for a three shell game...)

But for the 20x dice, a bias in the random generator might be suspect, right?

I'm looking forward to your program... I'm sure to learn something about you from it.
:)

andrewr · Mar 9, 2012

Hurkyl said:

Replace 11111111111111111111 with any 20-digit sequence -- chosen before the dice are rolled -- and the same is true.

I already noted that in a previous post.
In fact, if the sequence mentioned in the OP were to come up at a casino -- I WOULD be checking for loaded dice; and I would be justified in doing so... DO you ever think I will?

(what does "mixed" mean? every number appears at least once?)

Your premise is not clear. If I operated according to the procedure

Roll 20 dice and write down the sequence

Come up with some other sequence of 20 digits uniformly randomly

Present both sequences to you

then under the hypothesis that I present to you 11111111111111111111 and 66234441536125563152, the odds are 50% - 50% that the dice really did roll 20 1's in a row.

That is the premise of "future" roll. I do include it in the casino... It is, as you say -- 50/50; even Marilyn agrees to that.

But if I operated according to the procedure

Roll 20 dice and write down the sequence

If the dice roll was not all 1's, write down 11111111111111111111, otherwise write down 66234441536125563152

Present both sequences to you.

then under the hypothesis that I present to you 11111111111111111111 and 66234441536125563152, the odds are still 50% - 50% that the dice really did roll 20 1's in a row.

Of course, if I presented you with 11111111111111111111 and 66234441536125563125, the odds are strictly 100% that the latter is what was actually rolled.

This is exactly what I was wondering about how you think. I don't care to judge the rightness or wrongness of your response -- I just wanted to know how *you* personally approached the problem.

If I operated according to the procedure

Roll 20 dice and write down the sequence

If the dice roll was not all 1's, write down 11111111111111111111, otherwise select another sequence of 20 digits uniformly randomly

Present both sequences to you.

then under the hypothesis that I present to you 11111111111111111111 and 66234441536125563152, the odds that the latter is what was actually rolled are roughly 6^{20} \approx 3.7 \cdot 10^{15} to 1

If I operated according to the procedure

Roll 20 dice and write down the sequence

Think up* some other 20-digit sequence that contains every digit at least once

Present both sequences to you.

then under the hypothesis that I present to you 11111111111111111111 and 66234441536125563152, the odds are strictly 100% that the former is what's rolled.

*: The particular method doesn't matter, so long as it satisfies the given constraint

Which constraint is that?
A child playing dice with a friend, say a cup rolling dice game, refuses to show the roll sequence to their mate; but claims, it is '1111111111'; So the father comes over to stop the fight, and looks in the cup which was bumped. He sees a sequence of numbers and says to the other child, "it is either 1111111111' or '5248232123'; Then the father says to the less favored child, they are "both" equally likely. Now, we don't know what happened -- but it isn't about the probability of '5248232123' being rolled in the future. It's about what happened in an actual roll of the dice in a past game -- and cheating is suspected.

What would the other child do? (It's fair, he got all ones and that was perfect to win the game?), or would the child say "Marilyn, suppose you decided to roll dice; and then you told me '1111111111' or '5248232123'; which would be more likely to be the true roll?")
Obviously, one of the rolls is a lie -- for a dice can't be both; and it was already rolled as far as the child is concerned.

Clearly, the first child "COULD" have cheated. The total probability of the problem includes the number of ways a child could cheat according to *any* algorithm that is reasonably possible. (Let's ignore space aliens, although they *ARE* theoretically possible, they are as unlikely as 11111111111111111...).

The issue in my mind is that a child could have asked the question to Marilyn through their parent in a NON-ACADEMIC way; EG: The supposed asker of the question to Marilyn hasn't told us publicly how she came up with the question. I rather wonder if you will appreciate it if she does...

I just wanted to know how you personally thought through to an answer.
I'm not saying you're wrong or anything, I don't know your IQ score in comparison to Marilyn anyway. Why should I believe you aren't equals?

Peace. --Andrew.

micromass · Mar 9, 2012

Mod note: Let's please keep this thread on-topic. The topic is a probability question. Off-topic posts will be deleted

micromass · Mar 9, 2012

andrewr said:

It was a computer simulation that taught me the three shell problem; And I did accept it although I disagreed with my room-mate before I tried the program.

Thank you, yes I would like to see how you code the program and verify it is at least algorithmically correct. I had some minor trouble in mine; for much of the tests, it is indeed nearly impossible to get an answer in "10" tries and so it is *very* difficult to verify that I coded the success counting section correctly for a 20x dice (so, if it ever does succeed, the program might just crash -- but I'm generally pretty good at debugging...)

For the 3 shell game I described, 10 runs is sufficient to notice a bias in the randomness, if there is one. I got 50/50 on my first try using the digits of pi mod 2 to choose among the two remaining shells. Not exactly random, but a good enough test.

I include the 3 shell casino, just as an example of how I code a probability demonstration, and a little fun. Let's have everyone play... ! and gather cumulative statistics...
I don't know about the 20x dice throw; but it won't hurt for a few thousand people to see if they can manually outguess Python's well-tested shuffling randomizer. Mersenne Twister core, I believe -- but pretty good.

If you catch a bug, let me know where it is and why it is a bug in the code. :)
I'll fix it, if it is indeed a bug.

And, again -- Thank you for your offer to code something for me.
I love integrity, Micromass, it *always* impresses me; and it will save me some time.

I know C,C++,Java,Python,Fortran,Cobol,Snobol,assembly -- but here at the Farm (just a small one) we mostly have power processors free to do number crunching. Don't get me wrong, this isn't IBM's Haupage New York super-computer room; but I do have some spare computing... However, I can't use x86 based binaries; I *do* need source code.

If you read my thread on converting a binomial/normal data distribution, you'll note that even at 500,000 data points, the Python Gaussian random number generator has an inexplicable defect near the mean value; it can be seen in all three graphs, although it is a very small bias.

I *do* believe this is a problem with the math co-processors on the Intel platform. I also had to borrow one to run a test of the casino under windows. Intel's fpu has a minor underflow problem in the log function, and when used to produce a univariate random variable by inversion (e**-0.5x**2) by anti/inverse -function-- the problem shows up in the graph.

I tried to work around that in the casino by using shuffling of an unbiased deck in my example program -- and I have commented lines that allow you to see the random numbers generated and verify they are reasonably "fair", or to even replace the random number generator with one of your own. (not that it's really important for a three shell game...)

But for the 20x dice, a bias in the random generator might be suspect, right?

I'm looking forward to your program... I'm sure to learn something about you from it.
:)

Can you post an outline of your program in pseudocode, please??

micromass · Mar 9, 2012

Firstly, my code written in Scheme:

Code:

(define (MakeRandomList)
  {local [(define (MakeRandomList-iter n)
            {local [(define x (+ (random 2) 1))]
              (if (= n 0)
                  (list)
                  (cons x (MakeRandomList-iter (- n 1))))})] 
    (MakeRandomList-iter 10)})

(define (ListEqual List1 List2)
  {local [(define (ListEqual-iter l1 l2)
            (if (empty? l1)
                true
                (and (= (car l1) (car l2)) (ListEqual-iter (cdr l1) (cdr l2)))))]
    (ListEqual-iter List1 List2)})

(define list1 (list 1 1 1 1 1 1 1 1 1 1))
(define list2 (list 1 2 1 2 1 1 1 2 1 2))

(define (Test n)
  {local [(define (Test-iter n amount1 amount2)
            {local [(define CurrentList (MakeRandomList))]
              (if (> n 0)
                  (if (ListEqual CurrentList list1)
                      (Test-iter (- n 1) (+ amount1 1) amount2)
                      (if (ListEqual CurrentList list2)
                          (Test-iter (- n 1) amount1 (+ amount2 1))
                          (Test-iter (- n 1) amount1 amount2)))
                  (list amount1 amount2))})]
    (Test-iter n 0 0)})

(Test 1000000)

A disclaimer first: the original post worked with "rolling the dice 20 times". This is infeasible to simulate. Therefore, I changed the problem to "flipping a coin 10 times".

I worked with the two sequences 1111111111 and the supposedly random sequence 1212111212.

Now, what I did was:
Each test, I flip a coin 10 times. If the result is not one of the two sequences above, I discard the test. If the result is one of the two sequences above, I add 1 to the amount of times I saw the sequence.
This I do a million times.

Why is this a good representation of the test?
The original test was that I flip a coin 10 times. Then I get a choice which one of the above sequences was rolled. Of course, to get that very choice, I actually need to get one of the sequences. This is why every experiment where I do NOT get one of the sequences, I discard it.

After I got one of the sequences, I can choose which one of the sequences I get. Adding 1 to the amount of times I saw sequence 1 corresponds to getting it right if you guessed 1. Adding 1 to the amount of times I saw sequence 2 corresponds to getting it right if you guessed 2.
Eventually, the two amounts correspond to the number of times you got it right.

So, after iterating it a million times, I get
Sequence 1: 948
Sequence 2: 995

A subsequent test yielded:
Sequence 1: 1015
Sequence 2: 1001

These two are so close together that it seems plausible that the actual rate at which you guess right is indeed 50-50. Running it more than 1000000 times will only reinforce this, but I don't have the time to do so.
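For readers without a Scheme environment, here is a rough Python sketch of the same experiment: flip a coin 10 times per trial, discard trials that match neither target, and count hits for each target. It is a loose port, not micromass's code; the function name, the seed, and the reduced trial count (200,000, to keep the run short) are all choices made for this illustration.

```python
import random

FLIPS = 10
TARGET_ONES = (1,) * FLIPS                     # 1111111111
TARGET_MIXED = (1, 2, 1, 2, 1, 1, 1, 2, 1, 2)  # 1212111212


def run_test(trials, rng):
    """Return (hits for 1111111111, hits for 1212111212) over `trials` trials."""
    hits_ones = 0
    hits_mixed = 0
    for _ in range(trials):
        flips = tuple(rng.choices((1, 2), k=FLIPS))  # one trial of 10 fair flips
        if flips == TARGET_ONES:
            hits_ones += 1
        elif flips == TARGET_MIXED:
            hits_mixed += 1
        # any other outcome is discarded, as in the description above
    return hits_ones, hits_mixed


TRIALS = 200_000
rng = random.Random(2012)  # arbitrary seed so the run is repeatable
ones, mixed = run_test(TRIALS, rng)

print("1111111111:", ones)
print("1212111212:", mixed)
print("expected for either:", TRIALS / 2 ** FLIPS)
```

With a million trials each target is expected about 1,000,000 / 1,024 ≈ 977 times, which is in line with the counts reported above.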

mathwonk · Mar 9, 2012

If you think 1,1,1,1,1,1,1 has essentially no chance of occurring as the winning numbers in a lottery, then you have just answered why the lottery is not a good bet. I.e. every other choice is just as unlikely as this one in a fair lottery.

It is ironic that Ms. Vos Savant would make this simple mistake since she rode to fame on a probability question that stumped some mathematicians (including me) as follows:

Suppose there are three doors and a prize lies behind one of them, and you have one choice. After you indicate your preferred choice the moderator opens another door with nothing behind it, leaving two doors still closed, yours and one other. Then you have the opportunity of keeping to your original choice or changing it.

What should you do, and why?
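The question can also be settled empirically, in the spirit of the simulations discussed earlier in the thread. The following is a minimal Python sketch, assuming the moderator knows where the prize is and always opens an empty door other than the player's; the function names and the seed are invented for this example.

```python
import random


def play(switch, rng):
    """Play one game; return True if the player ends up with the prize."""
    doors = [0, 1, 2]
    prize = rng.choice(doors)
    pick = rng.choice(doors)
    # The moderator opens a door that is neither the player's pick nor the prize.
    opened = rng.choice([d for d in doors if d != pick and d != prize])
    if switch:
        # Move to the one door that is neither the original pick nor the opened one.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == prize


def win_rate(switch, games, rng):
    """Fraction of `games` won when always switching (or always staying)."""
    return sum(play(switch, rng) for _ in range(games)) / games


rng = random.Random(357)  # arbitrary seed
stay_rate = win_rate(False, 20_000, rng)
switch_rate = win_rate(True, 20_000, rng)

print("stay:  ", stay_rate)
print("switch:", switch_rate)
```

Under that assumption about the moderator, staying should win about one game in three and switching about two in three.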

chiro · Mar 9, 2012

mathwonk said:

If you think 1,1,1,1,1,1,1 has essentially no chance of occurring as the winning numbers in a lottery, then you have just answered why the lottery is not a good bet. I.e. every other choice is just as unlikely as this one in a fair lottery.

It is ironic that Ms. Vos Savant would make this simple mistake since she rode to fame on a probability question that stumped some mathematicians (including me) as follows:

Suppose there are three doors and a prize lies behind one of them, and you have one choice. After you indicate your preferred choice the moderator opens another door with nothing behind it, leaving two doors still closed, yours and one other. Then you have the opportunity of keeping to your original choice or changing it.

What should you do, and why?

I've said this before, but I think it's important to bring this up.

The difference, IMO, that Ms. Vos Savant is talking about is between the underlying process itself and the estimation of process parameters using likelihood techniques based on existing data.

Hurkyl is right in saying that if the underlying process is random, then every combination will be as unlikely (or likely) as every other possibility. No argument there.

But an important thing that statisticians have to do is 'guess' the probabilistic properties of a stochastic process using data. For a binomial process we use things like maximum likelihood estimation (MLE), which gives the estimator t/n ± std, where t is the number of 'true' or 'heads' outcomes and n is the number of trials.

My guess is that Marilyn is talking about likelihood estimation in the very last statement as opposed to true underlying probabilistic properties that Hurkyl is referring to.

Again if the dice are really and truly from a purely random process then Hurkyl is right, but if we have to measure some kind of 'confidence' by taking existing data where we do not know the real underlying process and have to make a 'judgement' about the probabilistic properties of the process where we don't actually know them, then if a likelihood procedure was done on a space with 6 possibilities per trial with 20 trials and we get all 1's, then given this data we have to say that we are not 'confident' that this data comes from a process that is purely random.

It's important to see the distinction: the likelihood results do not say that the data doesn't come from a particular process; rather, they give evidence for or against its coming from a particular kind of process.
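
To make the 6-possibilities, 20-trials case concrete, here is a rough sketch that compares the log-likelihood of each string under a fair die with its log-likelihood under the best-fitting die (face probabilities set to the observed frequencies). The function names are illustrative, and this is only one of several ways such a likelihood comparison could be set up:

```python
import math
from collections import Counter

def log_lik_fair(seq):
    """Log-likelihood of the sequence if every face has probability 1/6."""
    return len(seq) * math.log(1 / 6)

def log_lik_fitted(seq):
    """Log-likelihood under the MLE die: each face gets its observed frequency."""
    n = len(seq)
    return sum(c * math.log(c / n) for c in Counter(seq).values())

gains = {}
for seq in ("11111111111111111111", "66234441536125563152"):
    # How much log-likelihood is gained by dropping the fair-die assumption.
    gains[seq] = log_lik_fitted(seq) - log_lik_fair(seq)
    print(seq, round(gains[seq], 3))
```

The all-1's string gains 20·ln 6 (about 35.8) in log-likelihood once the fair-die assumption is dropped, while the mixed string gains about 0.19. Under the fair die itself both strings have exactly the same likelihood.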

Statisticians have to do this kind of thing all the time: they get data and they have to try and extract the important properties of the underlying process itself. We don't often get the luxury of knowing the process in any great detail so what we do is we say 'this model looks good, let's try and estimate its parameters using the data'.

People have to remember that the probabilistic properties of the true underlying stochastic process that is known and the exercise of trying to measure distribution parameters for a process that is not known are two very different things.

One specifies properties for a process that is known and the other tries to 'figure out' using sound statistical theory 'what the specifics of the process should be given the data since we don't actually know the underlying process'.

Again, two very different things.

nucl34rgg · Mar 10, 2012

Both sequences are equally likely.

On a side note, if I roll a fair dice 999999999999 times and get 1 each time, and I roll it again, the probability of rolling a 1 is still 1/6. (Empirically, we might dispute that the dice was fair, however! ;))

Here is a nice quote from Feynman:

"You know, the most amazing thing happened to me tonight. I was coming here, on the way to the lecture, and I came in through the parking lot. And you won't believe what happened. I saw a car with the license plate ARW 357. Can you imagine? Of all the millions of license plates in the state, what was the chance that I would see that particular one tonight? Amazing!"

ParticleGrl · Mar 10, 2012

Everyone agrees the dice would roll both sequences with equal probability. That's not the question being addressed in the second example.

The question being addressed in the second example is 'presented with two numbers, one of which was generated by rolling dice, one of which was generated with a different unknown process, which was more likely to be generated by the dice?'

In this case, I think many approaches will suggest the string of 1s is less likely to be the dice, but with only one data point and no information about the process generating the non-dice number, the predicted probabilities will always be close to 1/2 for each.

Hurkyl · Mar 10, 2012

chiro said:

But an important thing that statisticians have to do is 'guess' the probabilistic properties of a stochastic process using data. For a binomial process we use things like maximum likelihood estimation (MLE), which gives the estimator t/n ± std, where t is the number of 'true' or 'heads' outcomes and n is the number of trials.

My guess is that Marilyn is talking about likelihood estimation in the very last statement as opposed to true underlying probabilistic properties that Hurkyl is referring to.

The central limit theorem (CLT) is a great technique for predicting the mean of a large sample. The fallacious gambler uses it to predict the next outcome after a losing streak. That the CLT is a good technique for one purpose doesn't mean it's a good idea for the gambler to use it in a vaguely related situation.

When presented with the knowledge

Exactly one of

(1) 11111111111111111111
(2) 66234441536125563152

is real, and the other is fake

(and given the assumption that the specific question being asked is independent of your strategy for responding to it) there is exactly one reason why you should predict that option (2) is the real one: you believe

(*) whatever process led to your being faced with this question would produce this pair with all 1's being fake more often than with all 1's being real.

(also assuming your goal is to be right as often as possible)

Any approach you have to the question that, in the end, isn't aimed specifically at deciding whether (*) is true or not is fundamentally misguided.

aside: if you believe the generation of the fake is independent of the generation of the real, then (*) simplifies to

(*) the process that generates the fake is more likely to produce all 1's than it is to produce 66234441536125563152
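
A small sketch of that simplified (*), assuming the real string comes from a fair die and the fake comes from an independent process with distribution q. The die gives both strings the same probability 6^-20, so it cancels, leaving P(mixed string is real | this pair) = q(all 1's) / (q(all 1's) + q(mixed)). The fake-generating models below are invented purely for illustration:

```python
from fractions import Fraction

N = 6 ** 20  # number of possible 20-roll sequences

def posterior_mixed_is_real(q_ones, q_mixed):
    """P(the mixed string is the real one), given the fake process's
    probabilities of writing all 1's (q_ones) and the mixed string (q_mixed)."""
    return q_ones / (q_ones + q_mixed)

# Hypothetical faker A: picks uniformly among all sequences.
uniform = posterior_mixed_is_real(Fraction(1, N), Fraction(1, N))

# Hypothetical faker B: 1% of the time writes one repeated digit,
# otherwise picks uniformly among the remaining sequences.
lazy = posterior_mixed_is_real(Fraction(1, 100) / 6, Fraction(99, 100) / (N - 6))

# Hypothetical faker C: never writes all 1's.
never_ones = posterior_mixed_is_real(Fraction(0), Fraction(1, N - 1))

print(uniform, float(lazy), never_ones)
```

Everything depends on q: faker A gives exactly 1/2, faker B makes the mixed string almost certainly the real one, and faker C makes the all-1's string certainly the real one.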

chiro · Mar 10, 2012

Hurkyl said:

The central limit theorem (CLT) is a great technique for predicting the mean of a large sample. The fallacious gambler uses it to predict the next outcome after a losing streak. That the CLT is a good technique for one purpose doesn't mean it's a good idea for the gambler to use it in a vaguely related situation.

When presented with the knowledge

Exactly one of

(1) 11111111111111111111

(2) 66234441536125563152

is real, and the other is fake

(and given the assumption that the specific question being asked is independent of your strategy for responding to it) there is exactly one reason why you should predict that option (2) is the real one: you believe

(*) whatever process led to your being faced with this question would produce this pair with all 1's being fake more often than with all 1's being real.
(also assuming your goal is to be right as often as possible)

Any approach you have to the question that, in the end, isn't aimed specifically at deciding whether (*) is true or not is fundamentally misguided.

aside: if you believe the generation of the fake is independent of the generation of the real, then (*) simplifies to

(*) the process that generates the fake is more likely to produce all 1's than it is to produce 66234441536125563152

Hurkyl, do you know what likelihood techniques and parameter estimation are all about?

Like I said above, they focus on completely different things. The likelihood procedures are used to gauge what the parameters are for an assumed model given the data: you don't do it the other way around.

Likelihood procedures aren't perfect of course, but their point, parameter estimation included, is an intuitive concept that anyone can appreciate, not just a statistician.

As you say, if the dice roll process is truly purely random then there is no reason why everything is not equally likely, but I'm afraid there is a huge caveat: we statisticians and scientists can't simply assume this.

We have to use statistics and its methods to see how our hypotheses are backed up by the evidence which translates into analyzing the actual data. We have to check that the evidence supports the notion that the dice or the coin or whatever is what it is: we can't just say 'it's going to be equally likely': we have to do the experiment, get the right data and process it to see whether the data backs up our intuition.

You don't need to bring in the Central Limit Theorem or anything else: the idea is very basic and can be understood by anyone, statistician or non-statistician in a very simple way.

nucl34rgg · Mar 10, 2012

ParticleGrl said:

Everyone agrees the dice would roll both sequences with equal probability. That's not the question being addressed in the second example.

The question being addressed in the second example is 'presented with two numbers, one of which was generated by rolling dice, one of which was generated with a different unknown process, which was more likely to be generated by the dice?'

In this case, I think many approaches will suggest the string of 1s is less likely to be the dice, but with only one data point and no information about the process generating the non-dice number, the predicted probabilities will always be close to 1/2 for each.

I don't understand how that question is the question that arose in the second part.

Here is what is being said:
"But let’s say you tossed a die out of my view and then said that the results were one of the above. Which series is more likely to be the one you threw? Because the roll has already occurred, the answer is (b). It’s far more likely that the roll produced a mixed bunch of numbers than a series of 1’s."

This does not relate to the first statement. The roll sequence is more likely to produce a string of mixed numbers. However, what we have here is a choice between two specific strings of numbers. Her conclusion, "thus, the answer is (b)" is false. Everything else that she said is technically fine, but largely irrelevant. The probability that the sequence is a mixed sequence of numbers is not the same thing as the probability that the sequence is a PARTICULAR mixed sequence of numbers.
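
The difference between the two probabilities can be worked out exactly. In this sketch, "mixed" is read as "not all 20 rolls show the same face", which is an assumption about what the column means:

```python
from fractions import Fraction

n_rolls = 20
total = 6 ** n_rolls                    # number of equally likely 20-roll sequences

p_all_ones = Fraction(1, total)         # the specific string 111...1
p_this_mixed = Fraction(1, total)       # the specific string 66234441536125563152
p_some_mixed = 1 - Fraction(6, total)   # ANY string that is not one repeated face

print("all 1's:            ", float(p_all_ones))
print("this mixed string:  ", float(p_this_mixed))
print("some mixed string:  ", float(p_some_mixed))
```

The event "some mixed string" is nearly certain, but it is a union of 6^20 - 6 strings, each of which is exactly as improbable as the all-1's string.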

ParticleGrl · Mar 10, 2012

"But let’s say you tossed a die out of my view and then said that the results were one of the above. Which series is more likely to be the one you threw? Because the roll has already occurred, the answer is (b). It’s far more likely that the roll produced a mixed bunch of numbers than a series of 1’s."

This does not relate to the first statement. The roll sequence is more likely to produce a string of mixed numbers. However, what we have here is a choice between two specific strings of numbers. Her conclusion, "thus, the answer is (b)" is false. Everything else that she said is technically fine, but largely irrelevant. The probability that the sequence is a mixed sequence of numbers is not the same thing as the probability that the sequence is a PARTICULAR mixed sequence of numbers.

Right, so whoever rolled the dice is presenting you with two choices a and b. One of them was generated by the dice roll, one was generated by an unknown process (you don't know where the alternative number came from).

So the question boils down to: given the strings of numbers 66234441536125563152 and 11111111111111111111, which one was more likely to have been generated by a dice roll?

This is related to a similar question: how many times in a row do you have to roll a 1 before you start to wonder if your dice is biased?

nucl34rgg · Mar 10, 2012

I think those two strings are equally likely to be produced by a sequence of dice rolls.

However, we could use a chi-square test to show that obtaining a string of 1's suggests with a high probability that the dice was loaded.
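
A rough sketch of such a test, computing the Pearson chi-square statistic by hand against a uniform die. The helper name is made up, and 11.07 is the usual 5% critical value for 5 degrees of freedom. With only 20 rolls the expected count per face is about 3.3, so the chi-square approximation is crude here:

```python
from collections import Counter

def chi_square_uniform(seq, faces="123456"):
    """Pearson chi-square statistic of the face counts against a fair die."""
    counts = Counter(seq)
    expected = len(seq) / len(faces)
    return sum((counts[f] - expected) ** 2 / expected for f in faces)

CRITICAL_5PCT_DF5 = 11.07  # chi-square critical value, 5 degrees of freedom

chi_all_ones = chi_square_uniform("11111111111111111111")
chi_mixed = chi_square_uniform("66234441536125563152")

for name, stat in (("all 1's", chi_all_ones), ("mixed", chi_mixed)):
    verdict = "reject fairness" if stat > CRITICAL_5PCT_DF5 else "no evidence of loading"
    print(f"{name}: chi2 = {stat:.2f} -> {verdict}")
```

The statistic comes out at about 100 for the all-1's string and about 0.4 for the mixed one. This speaks to whether the dice looks fair given the string, not to the probability of the string given a fair dice.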

ParticleGrl · Mar 10, 2012

I think those two strings are equally likely to be produced by a sequence of dice rolls.

Yes, literally every single person on this thread agrees with you. That's not what the second question is asking. It's asking

GIVEN one of these strings was produced by dice and one was not, which was more likely produced by dice?

However, we could use a chi-square test to show that obtaining a string of 1's suggests with a high probability that the dice was loaded.

Which is what I'm getting at: the string of 1s is less likely to be the dice-generated string.

Bacle2 · Mar 11, 2012

micromass said:
Firstly, my code written in Scheme:
Code:
(define (MakeRandomList)
  {local [(define (MakeRandomList-iter n)
            {local [(define x (+ (random 2) 1))]
              (if (= n 0)
                  (list)
                  (cons x (MakeRandomList-iter (- n 1))))})] 
    (MakeRandomList-iter 10)})

(define (ListEqual List1 List2)
  {local [(define (ListEqual-iter l1 l2)
            (if (empty? l1)
                true
                (and (= (car l1) (car l2)) (ListEqual-iter (cdr l1) (cdr l2)))))]
    (ListEqual-iter List1 List2)})

(define list1 (list 1 1 1 1 1 1 1 1 1 1))
(define list2 (list 1 2 1 2 1 1 1 2 1 2))


(define (Test n)
  {local [(define (Test-iter n amount1 amount2)
            {local [(define CurrentList (MakeRandomList))]
              (if (> n 0)
                  (if (ListEqual CurrentList list1)
                      (Test-iter (- n 1) (+ amount1 1) amount2)
                      (if (ListEqual CurrentList list2)
                          (Test-iter (- n 1) amount1 (+ amount2 1))
                          (Test-iter (- n 1) amount1 amount2)))
                  (list amount1 amount2))})]
    (Test-iter n 0 0)})

(Test 1000000)
A disclaimer first: the original post worked with "rolling the dice 20 times". This is infeasible to simulate. Therefore, I changed the problem to "flipping a coin 10 times".

I worked with the two sequences 1111111111 and the supposedly random sequence 1212111212.

Now, what I did was:
Each test, I flip a coin 10 times. If the result is not one of the two sequences above, I discard the test. If the result is one of the two sequences above, I add 1 to the amount of times I saw the sequence.
This I do a million times.

Why is this a good representation of the test?
The original test was that I flip a coin 10 times. Then I am asked which one of the above sequences was flipped. Of course, to be asked that very question, I actually need to get one of the sequences. This is why I discard every experiment where I do NOT get one of the sequences.

Once I have one of the sequences, I guess which one it was. Adding 1 to the amount of times I saw sequence 1 corresponds to getting it right if you guessed 1. Adding 1 to the amount of times I saw sequence 2 corresponds to getting it right if you guessed 2.
Eventually, the two amounts correspond to the number of times you got it right with each guess.

So, after iterating it a million times, I get
Sequence 1: 948
Sequence 2: 995

A subsequent test yielded:
Sequence 1: 1015
Sequence 2: 1001

These two are so close together that it seems plausible that the chance of guessing right is indeed 50-50 whichever sequence you pick. Running it more than 1000000 times should only reinforce this, but I don't have the time to do so.
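
For readers who do not use Scheme, the same procedure can be sketched in Python. This is a loose port rather than a line-by-line translation: the function name and the seed are arbitrary choices, and it runs 100,000 trials instead of a million to keep the run short:

```python
import random

SEQ1 = (1,) * 10                          # 1111111111
SEQ2 = (1, 2, 1, 2, 1, 1, 1, 2, 1, 2)     # 1212111212

def simulate(n_trials, seq1, seq2, rng):
    """Flip a coin len(seq1) times per trial and count exact matches
    with seq1 and with seq2; all other trials are discarded."""
    hits1 = hits2 = 0
    for _ in range(n_trials):
        flips = tuple(rng.randint(1, 2) for _ in range(len(seq1)))
        if flips == seq1:
            hits1 += 1
        elif flips == seq2:
            hits2 += 1
    return hits1, hits2

rng = random.Random(2012)   # fixed seed so a run can be repeated
print(simulate(100_000, SEQ1, SEQ2, rng))
```

Each sequence has probability 1/1024 per trial, so both counts should hover around 98 per 100,000 trials.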

If that's the way you choose to decide the issue, maybe you can run a significance test on each of the differences 995-948 and 1015-1001. I think both will pass, i.e., the hypothesis that the two sequences are equally likely will not be rejected at just about any conventional significance level.
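
One way to run that check, sketched under the assumption that, given a hit, each of the two sequences is equally likely (so the count for sequence 1 is binomial with p = 1/2), and using a normal approximation. The function names are made up for this example:

```python
import math

def two_count_z(a, b):
    """z-score for H0: each of the a + b hits was equally likely
    to be sequence 1 or sequence 2."""
    n = a + b
    return (a - n / 2) / math.sqrt(n / 4)

def two_sided_p(z):
    """Two-sided p-value from the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))

results = {}
for a, b in ((948, 995), (1015, 1001)):
    z = two_count_z(a, b)
    results[(a, b)] = (z, two_sided_p(z))
    print(f"{a} vs {b}: z = {z:.2f}, p = {results[(a, b)][1]:.2f}")
```

Both z-scores are well inside ±1.96, so neither run gives any reason to doubt the 50-50 split.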

Bacle2 · Mar 11, 2012

mathwonk said:

If you think 1,1,1,1,1,1,1 has essentially no chance of occurring as the winning numbers in a lottery, then you have just answered why the lottery is not a good bet. I.e. every other choice is just as unlikely as this one in a fair lottery.

It is ironic that Ms. Vos Savant would make this simple mistake since she rode to fame on a probability question that stumped some mathematicians (including me) as follows:

Suppose there are three doors and a prize lies behind one of them, and you have one choice. After you indicate your preferred choice the moderator opens another door with nothing behind it, leaving two doors still closed, yours and one other. Then you have the opportunity of keeping to your original choice or changing it.

What should you do, and why?

Mathwonk: a good point can be made that the reason the problem stumped a large number of people is that it was not well-posed, just like this last one. I think Vos Savant could have made more of an effort to avoid potential ambiguities in her description, i.e., to specify the setup in such a way that alternative readings of it are less likely.

andrewr · Mar 13, 2012

Bacle2 said:

Mathwonk: a good point can be made that the reason the problem stumped a large number of people is that it was not well-posed, just like this last one. I think Vos Savant could have made more of an effort to avoid potential ambiguities in her description, i.e., to specify the setup in such a way that alternative readings of it are less likely.

I agree with you; Marilyn's responses appear to be characteristically confusing; I find myself wondering if she is purposely trying to trip up certain intelligent people...

Of course, that wonder is just an automatic reaction of mine.. and not a considered opinion. Upon thinking about her response a bit more, I notice that Marilyn evokes thoughts (in my eyes) to normal but confusing "women" conversations.

I don't think it uncommon for men, like myself, to infer different priorities of meaning than the women actually involved in such men-exclusive conversations. (I note Loren hasn't yet responded again, and I am only noticing one other female respondent entering the melee... GOOD for her! )

I do agree that Marilyn has a command of the English language which makes her somewhat liable to judgment; eg: the IQ tests she took were heavily biased by men writers at the time...

However, I know that judging her wrong based on a manly interpretation (solely) is likely an injustice (which is why I don't personally care to do it).

Marilyn might be careless, tired, annoyed with a leading question, or something along those lines. However, if even the original auditor (?Loren?) really did not understand Marilyn's nuances, then Marilyn has made a true "faux pas" where she ought to know better *intuitively*.

andrewr · Mar 13, 2012

micromass said:

Can you post an outline of your program in pseudocode, please??

Python is often equated with pseudo-code.

The program itself is fairly lengthy because I tried to include several different interpretations of the question. (Liars included: I already have someone claiming to have gotten odds on the 20x dice throw by outguessing the Python random generator intuitively! It's possible, but how do I check if he told me the truth?)

Which sub-section do you want me to outline? I can edit the "if" statements out such that only your question is listed with comments / pseudo-code. I tried *VERY* hard to comment the program thoroughly (It is well over 50% comments...) Eg: Here's the tiny 3 shell game (edited only to improve my spelling and remove a game irrelevant print statement)

Code:

def ShellGame(_):
    """
    Play a "Three then eject one Spam can shell game!"
    Load one of three cans with spam on a table, then let user pick one.
    Then, Pythonically eject a different but empty shell off the table
    when the customer finishes choosing. (please miss the customer!)
    Apologize profusely, and do allow (subject to Marilyn shell game
    rules), the upset customer to re-choose among the two remaining shells.

    Marilyn maintains that the probability of getting a prize from the
    two remaining shells is not 50%/50% *depending* on your *a posteriori*
    choosing method.  That is the point of the test.

    This demonstrates how I learned about a posteriori probability changes
    over 18 years ago, when I lost a serious bet to a mate of mine.
    Learn from my mistakes (!) as the voice of casino experience says *ouch*

    :)
    """
    onTable = {1: "Nothing", 2: "Nothing", 3: "Nothing"}  # Set shells on table

    # dicePool, coinPool, DiceToShell, GetChoice and ReverseCoin are not shown
    # here; they are assumed to be defined elsewhere in the full program.
    global usedDice, usedCoins  # Casinos keep track of used dice and coins.
    whichShell, usedDice = DiceToShell(dicePool[usedDice]), usedDice + 1
    onTable[whichShell] = "SPAM AND SpAm sPaM SpAm"  # Fill one shell randomly

    def GetOnTable():  # Pick a shell subroutine
        menu = []
        print("Shell game Table menu: appetizer is in ONE of:")
        for i in onTable:
            print("shell and tin can *", i, "*, ", end=" ")
            menu.append(i)
        print("\n")
        choice, _ = GetChoice(menu)
        return choice

    choice = GetOnTable()  # First, let the customer calmly choose a shell

    # Flip a coin in preparation for the violent ejection
    headsOrTails, usedCoins = coinPool[usedCoins], usedCoins + 1

    # Quick a posteriori check: if the customer picked an empty shell, only
    # one other empty shell can be ejected, so the coin has no say.
    if onTable[choice] == "Nothing":
        headsOrTails = "Heads"

    # Iterate over a copy, because a shell is deleted inside the loop.
    for i, j in list(onTable.items()):
        if i == choice:
            continue  # Don't ever remove the user's choice!
        if j == "Nothing":  # This is empty, and thus not illegal to bomb
            if headsOrTails == "Heads":  # Randomly set off a shell
                del onTable[i]  # Shell is NOW GONE off table.
                print("\nI'm SO Sorry!")
                print("Can.an.d shell #", i, " Just BLEW off!")
                print("Thankfully it held no prize!")
                print("Whew, now, you may re-choose for prize")
                print()
                break
            else:
                # Tails spares this empty shell; the next one gets ejected.
                headsOrTails = ReverseCoin(headsOrTails)

    # There are now TWO shells left, and the prize is still available
    # Notice there was NO SWITCHEROO.  Just a safe pointless ejection.
    # Now the customer still doesn't KNOW where the item is, they guess ??

    choice = GetOnTable()
    print("You found ", onTable[choice], " for a prize")
    return onTable[choice] != "Nothing"  # Win Spam=True, Nothing=False
# END of shell game.

How can I improve the code to make it more readable for you?

I don't know Scheme off the top of my head, but I can attempt to translate (crudely) the part of the Casino you are interested in. Just let me know which part of MarilynCasinoPack.py, and give me some time to read the specifications of Scheme.
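
For readers who want to check the shell game result without the interactive prompts, here is a stripped-down, non-interactive sketch of the same three-shell game. It is not the code from MarilynCasinoPack.py: it uses Python's random module instead of the dice and coin pools, and the names play and win_rate are made up for this example:

```python
import random

def play(switch, rng):
    """Play one game; return True if the customer ends up with the prize."""
    prize = rng.randrange(3)     # shell holding the prize
    choice = rng.randrange(3)    # customer's first pick
    # Eject an empty shell that is not the customer's pick (random if two qualify).
    removable = [s for s in range(3) if s != choice and s != prize]
    removed = rng.choice(removable)
    if switch:
        # Move to the only shell that is neither the first pick nor ejected.
        choice = next(s for s in range(3) if s != choice and s != removed)
    return choice == prize

def win_rate(switch, n_games, rng):
    """Fraction of n_games won with a fixed stay-or-switch strategy."""
    return sum(play(switch, rng) for _ in range(n_games)) / n_games

rng = random.Random(357)   # arbitrary seed
print("stay:  ", win_rate(False, 20_000, rng))
print("switch:", win_rate(True, 20_000, rng))
```

Staying should win about one game in three and switching about two in three, so the two remaining shells are not a 50/50 proposition.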

Loren Booda · Mar 13, 2012

andrewr said:

(I note Loren hasn't yet responded again, and I am only noticing one other female respondent entering the melee... GOOD for her! )

And good for him -- Loren. One less woman.

I must have some kind of dyslexia in trying to respond to posts.

I don't always agree with Marilyn, and this puzzle's answer I also find non-intuitive -- but similar to the Monty Hall paradox.
