# Precognition paper to be published in mainstream journal

P: 3,387
 Quote by collinsmark Yes, if you were to flip a fair coin ten times in a single experiment, the likelihood of the coin coming up all heads on a given experiment is 1/2^10, or about 1 chance in 1024.
Which is exactly the same odds of equal heads and tails coming up.

The test itself, as per the article had 50/50 odds of the test subject guessing correctly. So I don't see 53/47 as being statistically amazing.

EDIT: I'm talking in regards to prediction so far as the coin toss odds.

The 53% must be from another experiment. The first one in the article I believe.
P: 3,387
Perhaps I should elaborate. By always having a 50/50 chance of any outcome. No matter what you predict the odds of it occurring are the same. Any pattern you choose so far as a coin toss goes is equally likely to occur. So you really need to shift the odds to >70/30 to show strong predictability.

I'd prefer a test with smaller odds, say 1 in 6, of you guessing the result. That way you have significant odds against you simply guessing on each turn. By using 50/50 you are swinging the odds in favour of a guess. Even a roll of the dice, giving the 1 in 6 odds, gives an even chance of any pattern occurring. However, it does mean that there is a 5 in 6 chance you are wrong on each go, making a string of correct predictions far more spectacular and significantly less likely.
HW Helper
PF Gold
P: 1,963
Quote by jarednjames
 Quote by collinsmark Yes, if you were to flip a fair coin ten times in a single experiment, the likelihood of the coin coming up all heads on a given experiment is 1/2^10, or about 1 chance in 1024.
Which is exactly the same odds of equal heads and tails coming up.
Egads! don't say that!

It's not the same. Let's take a 2 coin toss experiment to start. There are four possibilities.

H H
H T *
T H *
T T

Only one possibility out of 4 gives you all heads. That's one chance in 4. But there are two possibilities that give you an equal number of heads and tails, H T and T H. So the probability of tossing an equal number of heads and tails is 50%, or one chance in two attempts.

Moving on to an experiment with 4 tosses,

H H H H
H H H T
H H T H
H H T T *
H T H H
H T H T *
H T T H *
H T T T
T H H H
T H H T *
T H T H *
T H T T
T T H H *
T T H T
T T T H
T T T T

There are 16 possible outcomes and only 1 with all heads. So there is one chance in 16 of getting all heads. But there are 6 ways of getting an equal number of heads and tails. So the probability of equal heads and tails is 6/16 = 37.5% or about one chance in 2.67 attempts.
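
For anyone who wants to check the counting above without writing out the tables by hand, here is a minimal Python sketch that brute-forces every sequence. The function name `count_outcomes` is just something made up for illustration.

```python
from itertools import product

def count_outcomes(n_tosses):
    """Enumerate every H/T sequence of length n_tosses and count
    (total sequences, all-heads sequences, equal heads/tails sequences)."""
    total = 0
    all_heads = 0
    equal_split = 0
    for seq in product("HT", repeat=n_tosses):
        total += 1
        heads = seq.count("H")
        if heads == n_tosses:
            all_heads += 1
        if 2 * heads == n_tosses:
            equal_split += 1
    return total, all_heads, equal_split

for n in (2, 4, 10):
    total, all_heads, equal_split = count_outcomes(n)
    print(f"{n} tosses: {total} outcomes, {all_heads} all heads, "
          f"{equal_split} with equal heads and tails")
```

For 2 and 4 tosses this reproduces the starred rows in the tables (2 of 4 and 6 of 16).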

It turns out that one can calculate the number of ways to produce a given outcome of the coin tosses using the binomial coefficient

$$\left( \begin{array}{c}n \\ x \end{array} \right) = \frac{n!}{x!(n-x)!}$$

where n is the number of tosses, and x is the number of heads (or tails).

So for a 10-toss experiment, the chance of getting all heads is 1 in 1024, but the chance of getting an equal number of heads and tails is 252/1024 = 24.6094%, or about 1 in 4.
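
The same numbers fall straight out of the formula. A short sketch using Python's built-in `math.comb` (the helper name `prob_heads` is only illustrative, and a fair coin is assumed):

```python
from math import comb

def prob_heads(n, x):
    """Probability of exactly x heads in n tosses of a fair coin:
    C(n, x) ways out of 2**n equally likely sequences."""
    return comb(n, x) / 2**n

print(comb(10, 5))          # number of sequences with 5 heads and 5 tails
print(prob_heads(10, 10))   # all heads
print(prob_heads(10, 5))    # equal heads and tails
```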

 By always having a 50/50 chance of any outcome. No matter what you predict the odds of it occurring are the same. Any pattern you choose so far as a coin toss goes is equally likely to occur. So you really need to shift the odds to >70/30 to show strong predictability.
Yes, I agree with that. For a particular pattern the odds are 1 in 1024 (10-toss coin experiment) for any specific pattern.

But if you don't care which coins come up heads as long as there is an even number of heads and tails, things are very different.

The experiments presented in the paper don't really care in which order the words are recalled, or which specific words happen to be in the "practice" or "control" set. The experiments are not looking for overly specific patterns; they are looking for sums of choices that are statistically unlikely when taken as a whole.
 I'd prefer a test with smaller odds, say 1 in 6, of you guessing the result. That way you have significant odds against you simply guessing on each turn. By using 50/50 you are swinging the odds in favour of a guess. Even a roll of the dice, giving the 1 in 6 odds, gives an even chance of any pattern occurring. However, it does mean that there is a 5 in 6 chance you are wrong on each go, making a string of correct predictions far more spectacular and significantly less likely.
Again, for a single roll of the die you are correct. For a single roll of the die, the probability distribution is uniform.

But that is not the case for rolling the die twice and taking the sum. Or, the same thing, guessing on the sum of two dice rolled together.

If you were to guess that the sum is 2 (snake eyes), you have 1 chance in 36.

On the other hand, if you were to guess that the sum is 7, your odds are incredibly better. There are 6 combinations that give you a score of 7. That makes your odds 6/36 = 16.6667% or 1 chance in 6.
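
A quick way to see the whole distribution of the two-dice sum is to tabulate all 36 combinations. This is only a sketch, assuming two fair six-sided dice; the variable name `ways` is arbitrary.

```python
from collections import Counter
from itertools import product

# Count how many of the 36 equally likely (die1, die2) pairs give each sum.
ways = Counter(a + b for a, b in product(range(1, 7), repeat=2))

for total in range(2, 13):
    print(f"sum {total:2d}: {ways[total]} of 36 = {ways[total] / 36:.4%}")
```

The counts rise from 1 (for a sum of 2) to 6 (for a sum of 7) and fall back to 1 (for a sum of 12), so the sum is far from uniform even though each single die is.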

[Edit: fixed a math/typo error.]

[Another edit: Sorry if this is a bit off topic, but this subject is fascinating. It's a curious aspect of nature that things tend to reach a state of equilibrium. At heart, this is because there are far more possible states that are roughly evenly distributed than there are states at the extremes. At sub-microscopic scales, there's really no such thing as friction, and all collisions are essentially elastic and reversible. But when considering groups of atoms and particles taken together, there are far more states that have a roughly equal distribution and far fewer in extreme situations, all else being the same (such as the total energy being the same in all possible states). It's this property that explains friction, inelastic collisions, non-conservative forces, and the second law of thermodynamics when scaled up to macroscopic scales. And perhaps most importantly, the reason that getting 5 heads in a 10-toss coin experiment is far more likely than getting 10 heads is essentially the same reason why my coffee cools down on its own instead of heating up spontaneously.]
P: 3,387
Yes, I was referring to predicting a specific pattern.
 The effects he recorded were small but statistically significant. In another test, for instance, volunteers were told that an erotic image was going to appear on a computer screen in one of two positions, and asked to guess in advance which position that would be. The image's eventual position was selected at random, but volunteers guessed correctly 53.1 per cent of the time. That may sound unimpressive – truly random guesses would have been right 50 per cent of the time, after all. But well-established phenomena such as the ability of low-dose aspirin to prevent heart attacks are based on similarly small effects, notes Melissa Burkley of Oklahoma State University in Stillwater, who has also blogged about Bem's work at Psychology Today.
This is the test I'm referring to.

As per another thread, probability isn't my strong suit. A very interesting post from you there and I thank you. Cleared up some other questions I had as well.
PF Gold
P: 739
 Quote by collinsmark (III) The experiments were somehow biased in ways not evident from the paper, or the data were manipulated or corrupted somehow.
No need to postulate malice where a simple mistake will suffice.

It's got to be this one (well reasoned opinion). Frankly, I think it's because the tests are fundamentally non-causal (i.e. don't take place during forward propagation on the positive t-axis). You can never remove the systematic bias from the test: the data point is always taken before the test is performed.

I don't mean that in a trivial "oh, that's neat" way. Seriously consider it. The data being taken in a "precognitive memorization test" is taken prior to the test being performed.

1) Memorize words
2) Recall words test
3) Record results
4) Perform typing test

So we have a fundamental problem. This is a situation in which one of the following two scenarios MUST be true:

1) Either the list of words to be typed during the typing test is generated PRIOR to the recall test, or
2) the list of words to be typed during the typing test is generated AFTER the recall test.

In the case of (1), it would be impossible to separate precognition from remote viewing. In the case of (2), there is a tiny chance that the event is actually causal (in that the generation process could be influenced by the results of the recalled word test).

(For the purposes of this problem description I am assuming that causal events are more likely than non-causal events.)
HW Helper
PF Gold
P: 1,963
Quote by jarednjames
 The effects he recorded were small but statistically significant. In another test, for instance, volunteers were told that an erotic image was going to appear on a computer screen in one of two positions, and asked to guess in advance which position that would be. The image's eventual position was selected at random, but volunteers guessed correctly 53.1 per cent of the time. That may sound unimpressive – truly random guesses would have been right 50 per cent of the time, after all. But well-established phenomena such as the ability of low-dose aspirin to prevent heart attacks are based on similarly small effects, notes Melissa Burkley of Oklahoma State University in Stillwater, who has also blogged about Bem's work at Psychology Today.
This is the test I'm referring to.
Okay, I hadn't looked at that experiment yet, but I'll look at it now.

The study paper says that in "Experiment 1: Precognitive Detection of Erotic Stimuli" there were 100 participants. 40 of the participants were each shown 12 erotic images (among other images), and the other 60 participants were each shown 18 erotic images (among others). That makes the total number of erotic images shown (40)(12) + (60)(18) = 480 + 1080 = 1560. The paper goes on to say,

"Across all 100 sessions, participants correctly identified the future position of the erotic
pictures significantly more frequently than the 50% hit rate expected by chance: 53.1%"
However, after reading that, it's not clear to me whether the 53.1% is the total hit rate averaged across all total erotic pictures from all participants, or whether that is the average erotic-image hit rate of each participant. I don't think it matters much, but I'm going to interpret it the former way, meaning a hit rate of 53.1% of the total 1560 erotic images shown.

So this is sort of like a 1560-toss coin experiment. 53.1% of 1560 is ~828. So I'm guessing that the average number of "correct" guesses is 828 out of 1560 (making the percentage more like 53.0769%).

We could use the binomial distribution

$$P(n|N) = \left( \begin{array}{c}N \\ n \end{array} \right) p^n (1-p)^{(N-n)} = \frac{N!}{n!(N-n)!} p^n (1-p)^{(N-n)}$$

Where N = 1560, n = 828, and p = 0.5. But that would give us the probability of getting exactly 828 heads out of 1560 coin tosses.

But we're really interested in finding the probability of getting 828 heads or greater, out of 1560 coin tosses. So we have to take that into consideration too, and our equation becomes,

$$P = \sum_{k = n}^N \left( \begin{array}{c}N \\ k \end{array} \right) p^k (1-p)^{(N-k)} = \sum_{k = n}^N \frac{N!}{k!(N-k)!} p^k (1-p)^{(N-k)}$$

Rather than break my calculator and sanity, I just plopped the following into WolframAlpha:
"sum(k=828 to 1560, binomial(1560,k)*0.5^k*(1-0.5)^(1560-k))"
Thank goodness for WolframAlpha. (http://www.wolframalpha.com)

The resulting probability is 0.00806697 (roughly 0.8%).

That means the probability of 53.1% heads or better in a 1560-toss coin experiment, merely by chance with a fair coin, is about 1 in 124. Similarly, the chance of the participants randomly choosing the "correct" side of the screen in the erotic image precognition test 53.1% of the time or better, on average, in the first experiment (with all 100 subjects choosing a side 12 or 18 times each), merely by chance, is about 1 in 124. I'd call that statistically significant.
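
If you'd rather not rely on WolframAlpha, the same sum can be done exactly with Python's arbitrary-precision integers. This is a minimal sketch under the same assumptions as above (1560 independent 50/50 trials, 828 or more hits); the function name `upper_tail` is just illustrative.

```python
from fractions import Fraction
from math import comb

def upper_tail(n_trials, n_hits):
    """Exact P(at least n_hits successes in n_trials fair 50/50 trials).

    With p = 0.5 every sequence has probability 1/2**n_trials, so the tail
    is (number of sequences with >= n_hits successes) / 2**n_trials.
    """
    favourable = sum(comb(n_trials, k) for k in range(n_hits, n_trials + 1))
    return Fraction(favourable, 2**n_trials)

p = upper_tail(1560, 828)
print(float(p))       # tail probability
print(1 / float(p))   # "1 chance in ..." form
```

Working in exact fractions and converting to a float only at the end avoids the underflow you would get from multiplying 0.5^1560 by huge binomial coefficients in floating point.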

 As per another thread, probability isn't my strong suit. A very interesting post from you there and I thank you. Cleared up some other questions I had as well.
I'm not very good at probability and statistics either. I used to know this stuff a long time ago, but I promptly forgot most of it. I had to re-teach myself much of it for this thread!
PF Gold
P: 739
 Quote by collinsmark That means the probability of 53.1% heads or better in 1560-toss coin experiment, merely by chance with a fair coin, is 1 in 124.
I could be wrong, but aren't we assuming something by using only the number of erotic images as tests? It implies that there was always an erotic image to be found, and that's not the impression I get from the test.

In fact, and I could be wrong, I understood it to mean that the options were always "left" or "right" but that not every left/right pair contained a possible correct answer.

I think I'll have to read again.
P: 271
A story on Daryl Bem's paper in the New York Times:

 One of psychology’s most respected journals has agreed to publish a paper presenting what its author describes as strong evidence for extrasensory perception, the ability to sense future events. The decision may delight believers in so-called paranormal events, but it is already mortifying scientists. Advance copies of the paper, [Mind Mysteries] to be published this year in The Journal of Personality and Social Psychology, have circulated widely among psychological researchers in recent weeks and have generated a mixture of amusement and scorn. Some scientists say the report deserves to be published, in the name of open inquiry; others insist that its acceptance only accentuates fundamental flaws in the evaluation and peer review of research in the social sciences. “It’s craziness, pure craziness. I can’t believe a major journal is allowing this work in,” Ray Hyman, an emeritus professor of psychology at the University of Oregon and longtime critic of ESP research, said. “I think it’s just an embarrassment for the entire field.”
http://www.nytimes.com/2011/01/06/sc...pagewanted=all

Another quote:
 In this case, the null hypothesis would be that ESP does not exist. Refusing to give that hypothesis weight makes no sense, these experts say; if ESP exists, why aren’t people getting rich by reliably predicting the movement of the stock market or the outcome of football games?
I wonder why people suddenly get such sloppy logic when the subject concerns ESP.
P: 2,284
 Quote by pftest A story on Daryl Bem's paper in the New York Times: http://www.nytimes.com/2011/01/06/sc...pagewanted=all Another quote: I wonder why people suddenly get such sloppy logic when the subject concerns ESP.
Yes, it's always good to move away from the paper itself, and instead read a reporter's personal take on it... why????

Forget the article and focus on the actual paper, which is a different matter. Beyond that, you need to learn what the scientific method is so you can understand when you posit that null hypothesis, and why. Nobody here should have to argue with you, just to realize that you need further education on the subject.

For instance, would it be logical to assume the existence (i.e. truth of hypothesis) of something, then go about proving your assumption? That's called... NOT SCIENCE... in fact it's enough to end your career regardless of the research subject. To pass off the results of a test designed to exploit a known neurological process is just... stupid. There's something to be examined here, but IF it's repeatable, then it doesn't sound ESPy to me at all. This is ESP in the same way that forgetting where your keys are, then suddenly having the idea that they're under the couch, is ESP! You must be psychic, and all because of your mindset while waiting for your search pattern to improve based on dim memory.
P: 271
 Quote by nismaratwork Yes, it's always good to move away from the paper itself, and instead read a reporter's personal take on it... why????
Perhaps you didn't read the article, but even the quote that I used states that this was the opinion of "experts". So it isn't the reporter's "personal take". I'm surprised that those experts use such sloppy logic. Perhaps the reporter didn't summarise the experts' views well.
P: 2,284
 Quote by pftest Perhaps you didn't read the article, but even the quote that I used states that this was the opinion of "experts". So it isn't the reporter's "personal take". I'm surprised that those experts use such sloppy logic. Perhaps the reporter didn't summarise the experts' views well.
Oh, in that case I'll have Flex do the same referring to ME as an "expert", and I'll call him a journalist. I can see that you really press the standards here when it comes to credulity.
P: 772
Here is a PDF of a response paper:

http://dl.dropbox.com/u/1018886/Bem6.pdf

It looks like there are some serious flaws with the ESP paper. The one I have the biggest problem with is coming up with a hypothesis from a set of data, and then using that same set of data to test the hypothesis. It's a version of the Texas Sharpshooter Fallacy.

Here's what the paper I linked has to say, in part, on this matter:

 The Bem experiments were at least partly exploratory. For instance, Bem’s Experiment tested not just erotic pictures, but also neutral pictures, negative pictures, positive pictures, and pictures that were romantic but non-erotic. Only the erotic pictures showed any evidence for precognition. But now suppose that the data would have turned out differently and instead of the erotic pictures, the positive pictures would have been the only ones to result in performance higher than chance. Or suppose the negative pictures would have resulted in performance lower than chance. It is possible that a new and different story would then have been constructed around these other results (Bem, 2003; Kerr, 1998). This means that Bem’s Experiment 1 was to some extent a fishing expedition, an expedition that should have been explicitly reported and should have resulted in a correction of the reported p-value.
I'm currently reading a book by Dr. Ben Goldacre called "Bad Science" where he goes over this exact sort of thing.
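
To put a rough number on the "fishing expedition" worry in that passage: if five picture categories are examined and a hit in any one of them would have been reported, the chance of at least one spurious "significant" result is well above the nominal level. A minimal sketch, assuming five independent tests each at the 0.05 level; the independence assumption and the Bonferroni-style threshold are my own choices for illustration, not something the response paper specifies.

```python
def familywise_error(alpha, n_tests):
    """Chance that at least one of n_tests independent tests comes out
    'significant' at level alpha when there is no real effect anywhere."""
    return 1 - (1 - alpha) ** n_tests

def bonferroni_threshold(alpha, n_tests):
    """Per-test threshold that keeps the overall false-positive rate
    at or below alpha (Bonferroni correction)."""
    return alpha / n_tests

# Five categories from the quoted passage: erotic, neutral, negative,
# positive, and romantic but non-erotic pictures.
print(familywise_error(0.05, 5))
print(bonferroni_threshold(0.05, 5))
```

Under those assumptions the chance of at least one false positive is roughly 23%, not 5%, which is why an uncorrected p-value from an exploratory analysis overstates the evidence.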
P: 2,284
 Quote by Jack21222 Here is a PDF of a response paper: http://dl.dropbox.com/u/1018886/Bem6.pdf It looks like there are some serious flaws with the ESP paper. The one I have the biggest problem with is coming up with a hypothesis from a set of data, and then using that same set of data to test the hypothesis. It's a version of the Texas Sharpshooter Fallacy. Here's what the paper I linked has to say, in part, on this matter: I'm currently reading a book by Dr. Ben Goldacre called "Bad Science" where he goes over this exact sort of thing.
I'd call it, "Good Fraud"... better 'atmospherics'.
Other Sci
P: 1,400
Perhaps this falls into the category of "journalism" that seems so despised in this discussion, but Jonah Lehrer wrote a nice article for The New Yorker that touches on issues relevant to the debate (similar to the points already brought up in the thread: that subtle biases in study design, analysis and interpretation can introduce significant biases and lead to erroneous results). In particular, he talks about some work done by Jonathan Schooler:
 In 2004, Schooler embarked on an ironic imitation of Rhine’s research: he tried to replicate this failure to replicate. In homage to Rhine’s interests, he decided to test for a parapsychological phenomenon known as precognition. The experiment itself was straightforward: he flashed a set of images to a subject and asked him or her to identify each one. Most of the time, the response was negative—the images were displayed too quickly to register. Then Schooler randomly selected half of the images to be shown again. What he wanted to know was whether the images that got a second showing were more likely to have been identified the first time around. Could subsequent exposure have somehow influenced the initial results? Could the effect become the cause? The craziness of the hypothesis was the point: Schooler knows that precognition lacks a scientific explanation. But he wasn’t testing extrasensory powers; he was testing the decline effect. “At first, the data looked amazing, just as we’d expected,” Schooler says. “I couldn’t believe the amount of precognition we were finding. But then, as we kept on running subjects, the effect size”—a standard statistical measure—“kept on getting smaller and smaller.” The scientists eventually tested more than two thousand undergraduates. “In the end, our results looked just like Rhine’s,” Schooler said. “We found this strong paranormal effect, but it disappeared on us.” The most likely explanation for the decline is an obvious one: regression to the mean. As the experiment is repeated, that is, an early statistical fluke gets cancelled out. The extrasensory powers of Schooler’s subjects didn’t decline—they were simply an illusion that vanished over time. And yet Schooler has noticed that many of the data sets that end up declining seem statistically solid—that is, they contain enough data that any regression to the mean shouldn’t be dramatic. “These are the results that pass all the tests,” he says. 
“The odds of them being random are typically quite remote, like one in a million. This means that the decline effect should almost never happen. But it happens all the time! Hell, it’s happened to me multiple times.”
http://www.newyorker.com/reporting/2...fa_fact_lehrer

In essence, Schooler replicated the results of the Bem paper but, after performing many more tests, showed that the results were nothing but a statistical anomaly. I'm not aware whether Schooler published these results.

This, especially in light of other such examples detailed in Lehrer's piece, is why I'm hesitant to trust findings based primarily on statistical data without a plausible, empirically-tested mechanism explaining the results.
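
The regression-to-the-mean explanation is easy to see in a toy simulation. This is only a sketch and not Schooler's procedure: it assumes 200 hypothetical studies of pure 50/50 guessing, picks whichever one looked most "psychic" in its first 100 trials, and then checks how that same study does on its remaining trials. All names and parameters are made up for illustration.

```python
import random

def simulate_study(rng, n_trials):
    """One study of pure 50/50 guessing: a list of hits (True) and misses."""
    return [rng.random() < 0.5 for _ in range(n_trials)]

def early_and_late_rates(n_studies=200, n_trials=2000, n_early=100, seed=1):
    """Return (early hit rate, later hit rate) for the study that had the
    best hit rate in its first n_early trials."""
    rng = random.Random(seed)
    studies = [simulate_study(rng, n_trials) for _ in range(n_studies)]
    # Select the study with the most hits early on (the "amazing" early data).
    best = max(studies, key=lambda hits: sum(hits[:n_early]))
    early = sum(best[:n_early]) / n_early
    late = sum(best[n_early:]) / (n_trials - n_early)
    return early, late

early, late = early_and_late_rates()
print(f"early hit rate: {early:.3f}, later hit rate: {late:.3f}")
```

The selected study shows a hit rate well above 50% in its early trials purely because it was selected for that, while its later trials sit close to 50%, which is the "effect that disappears" pattern with no real effect present at any point.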
P: 2,284
 Quote by Ygggdrasil Perhaps this falls into the category of "journalism" that seems so despised in this discussion, but Jonah Lehrer wrote a nice article for The New Yorker that touches on issues relevant to the debate (similar to the points already brought up in the thread: that subtle biases in study design, analysis and interpretation can introduce significant biases and lead to erroneous results). In particular, he talks about some work done by Jonathan Schooler: http://www.newyorker.com/reporting/2...fa_fact_lehrer In essence, Schooler replicated the results of the Bem paper but, after performing many more tests, showed that the results were nothing but a statistical anomaly. I'm not aware whether Schooler published these results. This, especially in light of other such examples detailed in Lehrer's piece, is why I'm hesitant to trust findings based primarily on statistical data without a plausible, empirically-tested mechanism explaining the results.
Nah, when you post journalism, it's OK... you're the world-tree after all. Plus, your article actually offers information rather than obscuring it when the original paper is available. Thank you.
P: 271
 Quote by nismaratwork Oh, in that case I'll have Flex do the same referring to ME as an "expert", and I'll call him a journalist. I can see that you really press the standards here when it comes to credulity.
The article I posted is about Bem's paper, as well as some of the replication efforts. It also has a "debate" section, or rather a criticism section, in which 9 different scientists give their opinions on it. The NYT does not invent its experts, sources or the many scientists it mentions, if that's what you are suggesting. Google them if you don't believe they exist. I was the one who posted Bem's original paper, btw.

Perhaps you didn't read it because it now requires a login (it didn't when I posted it yesterday), but registration is free.
P: 2,284
 Quote by pftest The article I posted is about Bem's paper, as well as some of the replication efforts. It also has a "debate" section, or rather a criticism section, in which 9 different scientists give their opinions on it. The NYT does not invent its experts, sources or the many scientists it mentions, if that's what you are suggesting. Google them if you don't believe they exist. I was the one who posted Bem's original paper, btw. Perhaps you didn't read it because it now requires a login (it didn't when I posted it yesterday), but registration is free.
Oh lord... listen pftest... the NYTimes isn't a peer-reviewed journal, so what you're talking about is the fallacy of an appeal to authority. I am also NOT suggesting anything about the NYTimes... I really know very little about them and don't use it for my news; I prefer more direct sources. I did read THIS, but the OPINIONS of 9 people are just that... and not scientific support. AGAIN, I don't believe you're familiar with standards like this, so you're running into trouble... again.
P: 3,188
 Quote by Ygggdrasil Perhaps this falls into the category of "journalism" that seems so despised in this discussion, but Jonah Lehrer wrote a nice article for The New Yorker that touches on issues relevant to the debate (similar to the points already brought up in the thread: that subtle biases in study design, analysis and interpretation can introduce significant biases and lead to erroneous results). In particular, he talks about some work done by Jonathan Schooler: http://www.newyorker.com/reporting/2...fa_fact_lehrer In essence, Schooler replicated the results of the Bem paper but, after performing many more tests, showed that the results were nothing but a statistical anomaly. I'm not aware whether Schooler published these results. This, especially in light of other such examples detailed in Lehrer's piece, is why I'm hesitant to trust findings based primarily on statistical data without a plausible, empirically-tested mechanism explaining the results.
Very interesting, thanks! Although they point more or less in the opposite direction from Bem's, I would say that Schooler's findings are almost as mind-boggling as Bem's... Perhaps worth a topic fork?

PS: as a personal anecdote, as a kid I once came across a "one-armed bandit" gambling machine with a group of guys around it. They had thrown a lot of false coins(!) into the machine, and one of them was about to throw in the last coin when he noticed me. After I confirmed to him that I had never gambled before, he asked me to throw it in, and I hit the jackpot for them, most of it consisting of their own false coins. I left the scene with mixed feelings, as they had robbed me of my chance at beginner's luck for myself...
