## Is Marilyn Vos Savant wrong on this probability question?

 Quote by Hurkyl Did you notice you've significantly changed the problem? I get the impression you've fixated on one method of approaching the problem so strongly that you're having trouble acknowledging any other aspects of the situation.
If I did that, it was completely intentional: as I said, I focused on what the quote literally said and interpreted it the way I described.

I already acknowledged that the other part of the question which has been addressed is fair: I agree with your stance on probabilities being equal and all the rest of that which has been discussed in depth.

Again, I'm not trying to hide anything: I just looked at the quote and interpreted it in the way that I described.

I thought I made it clear when I was talking about parameter estimation, but I think that perhaps I should have been clearer. I'll keep that in mind for future conversations.

 I need you to understand the following six problems are different problems:
 1. Here are two sequences, one real, one fake. The real one is generated by a fair die roll. The fake one is generated by the person asking the question. Which one is real?
 2. Here are two sequences. Given the hypothesis that one of them was generated by rolling a fair die, which one is more likely to be the one rolled?
 3. Here are two sequences. Which one is more likely to be generated by rolling a fair die?
 4. Here are two histograms. Which one is more likely to be generated by rolling a fair die?
 5. Here is a sequence. Was it generated by a fair die roll?
 6. Here is a sequence generated by die roll. Is the die fair?
For what I was talking about I was only concerned with the problems where a sequence was given. Again I thought I made that very clear. I am, as you have pointed out, addressing the last point in the list.

As for a sequence being generated by a non-die process (one that still has the same probability space), we can't really know that from Marilyn's scenario: we have assumed that someone else rolled a die, and we construct our constraints accordingly. Does that seem like a fair thing to do? If not, why not?

 You, I think, are trying to solve problem #4 too, but you're solving it by pretending it is two instances of problem #5, but the work you're describing is for solving problem #6.
I am specifically solving problem #6, yes, but I've outlined my reasoning above.

 That last thing is one of the things I'm criticizing. People make very serious blunders by pretending like that. There's one situation I recall vividly: there was a gaming community that was trying to test whether some character attribute had any effect on the proportion of success. They gathered data that supported the hypothesis with well over 99% confidence... but they spent years believing there was no effect because some vocal analysts made a substitution similar to what you did: "We want to test if proportion 1 is bigger than proportion 2, right? Well, let's estimate the two proportions. (Compute two confidence intervals.) The confidence intervals overlap, so the data isn't significant." Whereas if they had done a test that was actually designed to answer the question at hand (a difference between proportions test), they would have seen the result as very significant.
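To make the contrast in that anecdote concrete, here is a minimal sketch of the two approaches side by side. The counts are invented purely for illustration (they are not the gaming community's data), and the helper names `wald_interval` and `two_proportion_z` are hypothetical. It assumes simple normal-approximation intervals and a pooled one-sided z-test, which is one common form of a difference-between-proportions test.

```python
import math

# Hypothetical counts, invented for illustration only.
x1, n1 = 550, 1000   # successes / trials with the attribute
x2, n2 = 500, 1000   # successes / trials without it


def wald_interval(successes, trials, z=1.96):
    """Approximate 95% confidence interval for a single proportion."""
    p = successes / trials
    half_width = z * math.sqrt(p * (1 - p) / trials)
    return p - half_width, p + half_width


def two_proportion_z(x1, n1, x2, n2):
    """One-sided z-test of H0: p1 == p2 against H1: p1 > p2 (pooled estimate)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Upper-tail probability of a standard normal: 1 - Phi(z).
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_value


ci1 = wald_interval(x1, n1)
ci2 = wald_interval(x2, n2)
intervals_overlap = ci1[0] <= ci2[1] and ci2[0] <= ci1[1]
z, p_value = two_proportion_z(x1, n1, x2, n2)

print("CI 1:", ci1)
print("CI 2:", ci2)
print("intervals overlap:", intervals_overlap)
print("z =", z, "one-sided p =", p_value)
```

With these made-up counts the two intervals overlap, yet the test on the difference is significant at the 5% level, which is the trap described in the quote.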
Yes, I have found that statistics and probability have a habit of leading people into that trap, and it can still happen even to people who have been doing this for a long time. But with respect to the answer, I thought it was clear what I was saying.

 Problem #5 is of a typical philosophically interesting type, because we can't talk about the probability of the answer. We can't even give an answer of the sort "yes is more probable than no". We can, however, choose a strategy to answer the question such that if the true answer is "yes", then we will be correct over, e.g., 95% of the time.
I agree with you on this, but again I wasn't focusing on this.

 But all of that aside, the main thing you're missing about problem #1 (and problem #6) is what makes them very different from problems #2 through #5: we're not trying to answer questions about a single "process". We have two different processes, and we're trying to decide which process produced the output we have. True, it can be difficult to get precise or accurate information about one of the processes, but that doesn't change the form of the problem. (#6 and #1 are different because #6 has a single output and we're trying to guess which among many processes generated that output, and #1 has two processes with two outputs, and we're trying to say which one goes with which.)
I never argued about that part of the problem. You might want to look at the response I had for those parts of Marilyn's statement. You made a statement about this and I agreed with you: again I'm not focusing on that part and I made it clear before what my thoughts were.

 All that aside, if we tried to use your strategy to solve problem #1, you will have a low probability of success against many people: it is a well-known tendency for humans to generate fake data that is *too* uniform. For example, 66234441536125563152 is 1.5 standard deviations too uniform by the test I did. So, when you take the real and fake data, decide what bias is most likely on the die, and compare to fair, you will pick the overly uniform fake data over the randomly generated data most of the time. Any question of the form versus 11111111111111111111 is very unlikely to ever come up except against a human opponent who is likely to make that sort of bluff, so your mis-analysis won't cost you much in this case. However, it will cost you big-time by picking the overly-uniform data too much.
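The post does not say which test produced the "1.5 standard deviations" figure. As a sketch only, assuming it was something like a Pearson chi-square statistic against a fair die, compared with the large-sample mean (the degrees of freedom) and standard deviation (the square root of twice the degrees of freedom), the calculation could look like this. The function name `chi_square_uniformity` is made up for illustration.

```python
import math
from collections import Counter


def chi_square_uniformity(rolls, faces="123456"):
    """Pearson chi-square statistic of `rolls` against a fair die.

    Also returns how many standard deviations the statistic lies from its
    large-sample mean (negative means "more uniform than expected").
    The normal-style standardisation is an assumption, not the poster's method.
    """
    counts = Counter(rolls)
    expected = len(rolls) / len(faces)
    stat = sum((counts[face] - expected) ** 2 / expected for face in faces)
    df = len(faces) - 1
    z = (stat - df) / math.sqrt(2 * df)
    return stat, z


stat, z = chi_square_uniformity("66234441536125563152")
stat_ones, z_ones = chi_square_uniformity("1" * 20)

print("quoted sequence:", stat, z)
print("all ones:       ", stat_ones, z_ones)
```

The quoted sequence has face counts 3, 3, 3, 3, 4, 4, so its statistic is 0.4, roughly 1.45 standard deviations below the mean of 5 under this approximation. The all-ones sequence sits at the other extreme with a statistic of 100.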
Again, I agree that if a process has specific characteristics then regardless of what we 'think' it doesn't change the process. I didn't argue that and in fact I agreed with you if you go back a few pages in the thread. The process is what the process is.

The big thing I have learned from this is that in a conversation like this (and especially one this heated) we all need to be clear about what we are talking about. That includes me, but I think it also includes the other participants.

I will make the effort on my part to do this for future threads, especially ones of this type.

 Quote by SidBala I haven't read through the thread. But in short, she is right. Any one valid string of dice rolls is just as probable as any other. So what are people talking about for 9+ pages?
It's become a heated argument with a little bit of a misunderstanding on what other posters are specifically talking about thrown in for good measure :)
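SidBala's point can be checked directly. A minimal sketch, assuming independent rolls of a fair six-sided die; the function name `fair_die_probability` is made up for illustration.

```python
from fractions import Fraction


def fair_die_probability(sequence):
    """Exact probability that a fair die produces this specific sequence."""
    # Each roll is independent with probability 1/6, whatever the face shown.
    return Fraction(1, 6) ** len(sequence)


p_ones = fair_die_probability("11111111111111111111")
p_mixed = fair_die_probability("66234441536125563152")

print(p_ones == p_mixed)
print(p_ones)
```

The probability depends only on the length of the sequence, which is why this fact alone cannot settle the "which one is fake" versions of the question.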

 Here is Savant's latest on the subject, if anyone's interested: http://www.parade.com/askmarilyn/201...e-rolling.html She still isn't admitting defeat, and she's continuing to spread mathematical falsehoods.
 She seems a little too certain for someone who had to backpedal from her claims on the proof of Fermat's last theorem being flawed.
 Ah, she still doesn't get it. And I doubt she will, because she's in that situation where she has a correct conclusion with a terrible argument. Why do I say she has the right answer? Because I have incredibly high prior odds on her choosing 11111111111111111111 as the fake sequence -- much higher than the odds on her choosing 44132411666623551133 -- and so it's far more likely that 11111111111111111111 is the fake.
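The prior-odds argument in that post can be written as a small Bayes calculation for the "one real, one fake" version of the problem. This is a sketch under stated assumptions: either sequence is equally likely a priori to be the fake, the real one comes from a fair die, and the faker's probabilities below are invented for illustration. The function name `posterior_first_is_fake` is hypothetical.

```python
from fractions import Fraction


def posterior_first_is_fake(p_fake_first, p_fake_second, n_rolls=20):
    """Posterior probability that the first of two sequences is the fake.

    p_fake_first / p_fake_second: probability that the faker would write
    down the first / second sequence (assumed values, not measured ones).
    """
    p_die = Fraction(1, 6) ** n_rolls  # same for every specific sequence
    # Hypothesis A: first sequence faked, second rolled on the die.
    like_first_fake = p_fake_first * p_die
    # Hypothesis B: second sequence faked, first rolled on the die.
    like_second_fake = p_fake_second * p_die
    return like_first_fake / (like_first_fake + like_second_fake)


# Invented numbers: the faker writes all ones far more readily than any
# one particular mixed-looking sequence.
posterior = posterior_first_is_fake(Fraction(1, 1000), Fraction(1, 10 ** 12))
print(float(posterior))
```

The die's contribution cancels, so the answer is driven entirely by how likely the faker is to produce each sequence, which is the sense in which the conclusion can be right even though both sequences are equally probable for a fair die.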
 I think one of the posters there has a good point: Marilyn does not make any testable claims, nor calculations, which makes it (unnecessarily) hard to test her arguments.

 Quote by Bacle2 I think one of the posters there has a good point: Marilyn does not make any testable claims, nor calculations, which makes it (unnecessarily) hard to test her arguments.
Yes. I suppose we can all agree on that. If she would describe an experiment unambiguously, then it would be easily resolved what the correct answer was.

 "Yes. I suppose we can all agree on that. If she would describe an experiment unambiguously, then it would be easily resolved what the correct answer was." I think that depends on whether you are Bayesian or Frequentist. Maybe someone knows more about this.

 Quote by Hurkyl Ah, she still doesn't get it. And I doubt she will, because she's in that situation where she has a correct conclusion with a terrible argument. Why do I say she has the right answer? Because I have incredibly high prior odds on her choosing 11111111111111111111 as the fake sequence -- much higher than the odds on her choosing 44132411666623551133 -- and so it's far more likely that 11111111111111111111 is the fake.
Absolutely agree with your statement: the conclusion makes sense under one interpretation (which I have been debating for a while and only eventually clarified), but her argument about the past and future just doesn't make sense to me.

Remember folks, this is what you get when debates continue and consume people when the issue at hand is vaguely described or not really described at all!