Can this be called 'a coincidence'?

In summary, the conversation discusses the relationship between two variables, A and B, and whether their correlation can be considered a coincidence or if there is a causal connection. It is mentioned that the variables have a random distribution and that when observed, they always produce the same outcome. The concept of correlation and independence is explored, with the conclusion being that dependence does not necessarily require a cause, and coincidence is reserved for completely random events.
  • #1
entropy1
Suppose we have two variables A and B. A has a truly random distribution over {0,1} with P(0)=P(1)=0.5 . B has the same distribution.

Now suppose that A and B always show both a 1 or both a 0. This would be a strong correlation between A and B.

Now could this be called 'a coincidence'? And if not, in what way does it differ from a coincidence?
 
  • #2
entropy1 said:
Suppose we have two variables A and B. A has a truly random distribution over {0,1} with P(0)=P(1)=0.5 . B has the same distribution.

Now suppose that A and B always show both a 1 or both a 0. This would be a strong correlation between A and B.

Now could this be called 'a coincidence'? And if not, in what way does it differ from a coincidence?

How do you define "coincidence"?
 
  • #3
In probability terms, the observation would be that A and B are correlated. They are not independent. The word "coincidence" is not used.

In colloquial terms, I guess a coincidence is an unlikely event that occurs. So if A and B were supposed to be independent, but in 100 trials they always showed the same number, you could state that an event with probability 0.5^100 has occurred, which is possible though unlikely. You still wouldn't use the term "coincidence" which has no mathematical meaning.

A Bayesian would say "based on experiment clearly A and B are not independent" and recalculate the joint probability distribution P(A, B) to reflect the observation.
 
  • #4
The way to judge whether it is a coincidence is to assume that the variables really are completely unrelated and calculate how improbable the results you got would be under that assumption. The assumption that they are unrelated is called the "null hypothesis". This is a typical example of the most common application of probability. There are standard probability thresholds that are typically required before one concludes (with the selected probability) that the null hypothesis is wrong: 0.05, 0.025, and 0.01, called "confidence levels" of 95%, 97.5%, and 99%, respectively. Sometimes an extremely strict level is used. For example, before a particle physicist can claim to have found a new particle, he must show that the probability of obtaining his results would be less than a 1 in 3,500,000 chance (probability ≈ 0.0000002857) if there were no new particle. This is called the 5-sigma criterion; 5 sigmas corresponds to about a 1 in 3,500,000 chance.

In your example, you say that A and B always show the same result. At the 0.05 level, it would require 5 trials, all with the same result, to conclude with 95% confidence that they are related (since 0.5^5 ≈ 0.031 < 0.05). Here are some other probability levels and the number of identical results that would be required:
0.05 → 5
0.025 → 6
0.01 → 7
1/3,500,000 ≈ 0.0000002857 → 22
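As an illustration (not from the original thread), the trial counts above can be reproduced with a short Python sketch; the function name `min_trials` is my own:

```python
def min_trials(alpha, p=0.5):
    # Smallest n with p**n <= alpha: the number of identical A/B
    # outcomes needed before the null hypothesis (two independent
    # fair coins) can be rejected at significance level alpha.
    n = 1
    while p ** n > alpha:
        n += 1
    return n

for alpha in (0.05, 0.025, 0.01, 1 / 3_500_000):
    print(alpha, min_trials(alpha))
```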
 
  • #5
Math_QED said:
How do you define "coincidence"?
I guess something like: that there is no relation between the outcomes of A and B although it may seem there is.
 
  • #6
entropy1 said:
I guess something like: that there is no relation between the outcomes of A and B although it may seem there is.

Two concepts come to mind when you say that:

- Correlation between random variables
- Independence between random variables

You might want to google these two terms and decide which one you are after :)
 
  • #7
Wikipedia said:
two random variables are independent if the realization of one does not affect the probability distribution of the other
link

So I guess, since in the example I gave A and B always yield identical outcomes, that ##P(B|A) = 1## and ##P(B|\bar{A}) = 0##, which means the probability distribution of B is affected by A's outcome.

In this way, dependent variables should show a different correlation than independent variables. So does this mean that there must be a cause causing the difference? If not, what is the reason for the difference to occur?

If nothing is causing a certain correlation, can one speak of dependence at all, or should one rather speak of coincidence?
 
  • #8
entropy1 said:
In this way, dependent variables should show a different correlation than independent variables. So does this mean that there must be a cause causing the difference? If not, what is the reason for the difference to occur?
There does not have to be a cause that you care about. For instance, the population of both the U.S. and England increases. So they are correlated just because both have the same trend in time. That is usually not considered a "cause" of the correlation.
entropy1 said:
If nothing is causing a certain correlation, can one speak of dependence at all, or should one rather speak of coincidence?
No. Even in the example of the populations of the U.S. and England, if I told you the population of one without telling you the year, that would help you to estimate the population of the other in that same year. I would not call that either a "cause" or a "pure coincidence" -- they are just correlated. I would reserve the term "pure coincidence" for things that are completely due to random luck. People generally use the term "pure coincidence" as just luck and the term "coincidence" as in "That is quite a coincidence" to indicate that they are suspicious of a cause-effect relationship.
 
  • #9
entropy1 said:
Now suppose that A and B always show both a 1 or both a 0.
Hi entropy:

Wikipedia gives the following definition.
https://en.wikipedia.org/wiki/Coincidence
A coincidence is a remarkable concurrence of events or circumstances that have no apparent causal connection with one another.​
An important word in the quote from your post is "always".

Now "always" suggests an infinite number of repeated events. Since it is impractical to observe an infinite number of events, I have to assume that you intended "always" to mean just an extremely large number of events, say one billion for example. If you observed one billion consecutive events in which A and B produced the same value, I would have to assume that this alone implies that there is some currently unknown causal connection, and that therefore this would not be a coincidence. If the number of events were small, then one could calculate that there is a "reasonable" probability that the sequence is coincidental.

Regards,
Buzz
 
  • #10
Buzz Bloom said:
If the number of events were small, then one could calculate that there is a "reasonable" probability that the sequence is coincidental.
I agree. And this entire subject should be discussed in terms of probability and confidence levels, not in vaguely defined terms like "coincidental".
 
  • #11
By "coincidental" I mean correlations without any relevant underlying cause, so luck, chance. Subsequently I wonder if the correlation shown by dependent random variables can be ascribed to luck. So could one speak of chance (coincidence) in the case of dependence? What then would cause the difference in correlation between dependent and independent variables, and the difference between one set of dependent variables and another?
 
  • #12
Dependence and zero correlation can coexist. Suppose one variable, X, is uniformly distributed on the set {0,1} and another variable, Y, is completely independent of X and uniformly distributed on the set {-1,1}. Then a third variable, Z = X*Y, is dependent on both X and Y, but uncorrelated with X and correlated with Y.
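This dependent-but-uncorrelated construction can be checked numerically. Here is a minimal simulation sketch (the variable names and sample size are mine):

```python
import random

random.seed(0)
N = 200_000
xs = [random.choice([0, 1]) for _ in range(N)]   # X uniform on {0, 1}
ys = [random.choice([-1, 1]) for _ in range(N)]  # Y uniform on {-1, 1}, independent of X
zs = [x * y for x, y in zip(xs, ys)]             # Z = X * Y

def pearson(a, b):
    # Sample Pearson correlation coefficient.
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b)) / n
    va = sum((u - ma) ** 2 for u in a) / n
    vb = sum((v - mb) ** 2 for v in b) / n
    return cov / (va * vb) ** 0.5

print(pearson(zs, xs))  # near 0: Z is uncorrelated with X
print(pearson(zs, ys))  # clearly positive: Z is correlated with Y
# Yet Z depends on X: whenever X = 0, Z must be 0.
```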
 
  • #13
What I mean is, if we have
  1. ##P(B|A)=0.8## and ##P(B|\bar{A})=0.2##, or:
  2. ##P(B|A)=0.6## and ##P(B|\bar{A})=0.4##,
then there must be a cause for at least the difference between (1) and (2)? If not, then (1) and (2) are both due to chance, and then I tend to call the dependence/correlation a coincidence. I even think it couldn't be called a dependence at all, for nothing would then be causing the dependence/correlation.
 
  • #14
You are better off using the traditional terms of probability so that everyone understands in great detail what it means and doesn't mean. In probability, saying that A and B are "dependent" only means that the probabilities of B change when we know A or not A. It does not mean that A causes B. It just means that the probability changes. The term "coincidence" does not have a well-defined meaning in probability.

Suppose you were throwing a uniformly distributed dart at the picture below. The events A and B have no "cause-effect" relationship. They may or may not be probabilistically dependent, depending only on the probabilities.
[Attached image: independentEvents.gif]
  • #15
I'm not sure what your (1) and (2) signify. I'm also not sure what you mean by probability.

entropy1 said:
##P(B|A)=0.8## and ##P(B|\bar{A})=0.2##

That says to me that the fixed probability that B occurs, in cases where A occurs, is 0.8; and that in cases where A does not occur, B occurs 0.2 of the time. These are constants of the population. By the law of large numbers, the larger the number of cases I test, the closer the observed fractions of occurrences of B will be to these two values.
entropy1 said:
or ##P(B|A)=0.6## and ##P(B|\bar{A})=0.4##

then there must be a cause causing at least the difference between (1) and (2)?

That's a completely different set of events with different probabilities. The cause of the difference between (1) and (2) is that you're clearly talking about different random variables in the two cases. If you meant those to describe the same events A and B, then I can't imagine what you mean by the probability changing. If I take a million outcomes where A happened, do you think I'm going to get closer to 0.8 or 0.6 of those where B also happened? It can't be both.
 
  • #16
RPinPA said:
I'm not sure what your (1) and (2) signify.
I interpret them as just two different examples where the probability of B changes depending on whether or not A occurs. Therefore, A and B are dependent in both cases.
 
  • #17
entropy1 said:
I think I have to ponder correlation vs. dependence some more, I realize... o0)
To start with, independent variables cannot be correlated, because their probabilities do not depend on each other. The converse is not true: one can rig up examples where two random variables are dependent (the value of one changes the probable results of the other) but their correlation is exactly zero. My example in post #12 is like that.
 
  • #18
Part of the difficulty I'm having is that I'm not sure of OP's definition of probability. Let's talk about this sentence in the original question.
entropy1 said:
Now suppose that A and B always show both a 1 or both a 0.

Does that mean you observed this in a small number of trials? Or do you mean that there is ZERO PROBABILITY that A and B are different? Those are quite different statements. One is about a sample, the other is about the entire population, the entire set of possible outcomes.

If we're really talking about what most of us mean by probability, then I'd say there has to be some underlying reason for dependence. And I think what OP is asking is, "is it possible for two random variables to be dependent without some kind of chain of causality connecting them?" Not necessarily A causing B or B causing A. So I believe my answer is "no". There is some underlying reason, though it may be inaccessible to us.

You definitely don't want to interpret A|B as B causing A. For instance, students are often given problems like this: Urn #1 contains 10 red balls and 1 blue ball. Urn #2 contains 2 red balls and 8 blue balls. If I draw a ball from a random urn and it is red, what is the probability it was urn #2?

You should find that it's a lot more likely that a red ball came from urn #1. P(Urn #2 | Red ball) < 0.5. That doesn't mean that a red ball caused you to pick urn #1.
 
  • #19
I thought correlation was measured by ##P(a_n=b_n)##, but I think that is not correct after all. I read about the Pearson correlation. What is the right measure of correlation for binary variables?
 
  • #20
The standard definition of the correlation is the expected value ##E\left( \frac{(X-\bar X)(Y-\bar Y)}{\sigma_X \sigma_Y} \right)##, where ##\bar X## and ##\bar Y## are the means of ##X## and ##Y##, respectively, and ##\sigma_X## and ##\sigma_Y## are their standard deviations.
If it is positive, ##X## being above its mean tends to indicate that ##Y## is also above its mean, and conversely ##X## being below its mean tends to indicate that ##Y## is also below its mean. The formula treats ##X## and ##Y## symmetrically, so the reverse tendencies hold as well (##Y## above/below its mean tends to indicate that ##X## is above/below its mean).
If the correlation is negative, the corresponding statements hold with the directions reversed: ##X## above its mean tends to imply ##Y## below its mean, and so on.
 
  • #21
entropy1 said:
By "coincidental" I mean correlations without any relevant underlying cause, so luck, chance. Subsequently I wonder if the correlation shown by dependent random variables can be ascribed to luck. So could one speak of chance (coincidence) in the case of dependence? What then would cause the difference in correlation between dependent and independent variables, and the difference between one set of dependent variables and another?
There is something similar to this in control charts, where there is variability attributable to noise (which is what I think you call 'coincidence') and variability explained through assignable causes, which are factors that can be identified and manipulated to eliminate (or at least lower) variability.
 
  • #22
entropy1 said:
By "coincidental" I mean correlations without any relevant underlying cause, so luck, chance. Subsequently I wonder if the correlation shown by dependent random variables can be ascribed to luck. So could one speak of chance (coincidence) in the case of dependence? What then would cause the difference in correlation between dependent and independent variables, and the difference between one set of dependent variables and another?
As has been said, "coincidence" is not a defined term in probability theory. In common parlance it has two meanings: it can just mean a correlation, for a known reason or otherwise, but more usually it implies there is no known reason. As far as I am aware it never implies there is definitely no reason.
Thus, drawing attention to the coincidence of two events often carries the implication that there is a hidden reason.
Sometimes someone may ask "do you believe in coincidence?". Again, there is an ambiguity: it may be facetious, suggesting there is a causal connection, or it may be more mystical, suggesting that causally unrelated coincidences happen more often than the laws of chance allow.
 
  • #23
The following occurred to me lately:
Suppose that we have two binary variables with a dependence of P(A,B)=0.5 . (1)
Suppose we make a batch of measurements that show P(A,B)=0.75 . (2)

Now, it could be that (1) is the rule and we have measured exception (2). In that case I would call (2) a coincidence. The probability of getting (1) is higher, the highest in fact, and we have measured a batch which has a lower probability of being measured, because it is not the rule.
It could also be the other way round: that (2) is the rule and (1) is the exception.* In that case, if the probability of getting a batch of (2) is higher, the highest in fact, then we have to adjust the dependence to (2).

So, a coincidence would be that we have measured an exception, a less probable batch; and it would have to have nonzero probability, because we will almost always measure some exception or other.

* It could also be that the actual dependence is something else altogether.
 
  • #25
FactChecker said:
That is vaguely correct.
Ok.
FactChecker said:
Your terminology is all wrong and vague. You should study or take a class in probability and statistics if you are interested in this.
I know. I don't know whether the term correlation applies, or the term dependence, or a combination of the two. I don't know the status of either. And English is not my native language. I will make an effort to study statistics if I get the opportunity. :smile:
FactChecker said:
You are asking basic questions about hypothesis testing. See https://en.wikipedia.org/wiki/Statistical_hypothesis_testing.
Ok.
 
  • #26
Specifically here, you may do a test for proportions. One usually deals with confidence intervals for proportions, using the sampling distribution of the sample proportion. If you want to test whether the proportion is, say, 0.5, you construct a confidence interval centered at the observed proportion and check whether 0.5 lies inside it.
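A minimal sketch of such a proportion test, using the normal (Wald) approximation; the trial counts below are hypothetical, and the function name is my own:

```python
import math

def proportion_ci(successes, n, z=1.96):
    # Normal-approximation (Wald) confidence interval for a
    # population proportion; z = 1.96 gives a 95% interval.
    p_hat = successes / n
    half = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return (p_hat - half, p_hat + half)

# Hypothetical data: A and B agreed in 75 of 100 trials.
lo, hi = proportion_ci(75, 100)
print(lo, hi)
# 0.5 lies outside the interval, so at the 95% level we would
# reject the hypothesis that agreement happens half the time.
```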
 
  • #27
[Attached image: Yogi Berra quote — "That's too coincidental to be a coincidence."]

1. What is a coincidence?

A coincidence is when two or more things happen at the same time or in a similar way, but are not directly connected or caused by each other.

2. How can we determine if something is a coincidence?

Determining if something is a coincidence can be subjective, but typically it involves evaluating the likelihood of the events occurring together by chance and considering any potential underlying causes or connections.

3. Can coincidences be explained by science?

Yes, coincidences can often be explained by scientific principles such as probability and statistical analysis. However, some coincidences may still remain unexplained.

4. Are all coincidences just random chance?

Not necessarily. While some coincidences may be attributed to random chance, others may have underlying causes or connections that are not immediately apparent.

5. Can coincidences have any significance or meaning?

This is a highly debated topic and largely depends on individual beliefs. Some people may attribute meaning to coincidences, while others may see them as purely random occurrences.
