Correlation limits for binary variates

Discussion Overview

The discussion revolves around the limits of correlation for binary variates, particularly in the context of detector coincidences. Participants explore theoretical calculations related to the maximum possible coincidences between binary sequences produced by random processes, as well as the implications of independence and correlation in statistical analysis.

Discussion Character

  • Technical explanation
  • Debate/contested
  • Mathematical reasoning

Main Points Raised

  • One participant presents a calculation for the maximum number of (0,0) and (1,1) coincidences based on probabilities of binary sequences, leading to a proposed correlation formula.
  • Another participant suggests defining a test statistic to evaluate the hypothesis of coincidences, emphasizing the importance of assessing independence first.
  • Some participants express skepticism about the initial claims, arguing that the possibility of unlikely events must be considered, which could lead to correlations outside the proposed limits.
  • There is a discussion about the implications of independence on correlation, with one participant noting that independent random variables can still yield sample correlations between -1 and 1.
  • One participant questions whether the derived correlation limit is valid, suggesting that it may only hold asymptotically and referencing the CHSH inequality as a true limit.

Areas of Agreement / Disagreement

Participants do not reach a consensus, as there are multiple competing views regarding the validity of the proposed correlation limits and the implications of independence on correlation. Some participants challenge the initial claims, while others defend them.

Contextual Notes

Participants highlight the need for careful consideration of assumptions regarding independence and the nature of random variables, as well as the potential for unlikely events to affect correlation outcomes.

Mentz114
I've been looking at detector coincidences and tried to find what general limits apply to them. I was surprised how simply the calculation works out. My question is whether it is correct, and where can I find similar material?

Consider two binary sequences produced by random processes where the probabilities of getting 1 are ##p_1## and ##p_2## respectively. Now assume that the number of 1's in stream ##n##, written ##1_n##, satisfies ##1_n \rightarrow Np_n## as ##N \rightarrow \infty##.

If we know the counts ##1_n,\ 0_n=N-1_n## in our sequences, then by a permutation argument it is clear that the number of (0,0) coincidences cannot be greater than the minimum of ##0_1=N(1-p_1)## and ##0_2=N(1-p_2)##. Similarly, the maximum possible number of (1,1) coincidences is the lesser of ##Np_1## and ##Np_2##. Assuming ##p_1<p_2##, the (0,0) maximum is ##N(1-p_2)## and the (1,1) maximum is ##Np_1##, which gives a total for the (0,0) and (1,1) coincidences of ##S_{12}=N(1-p_2+p_1)##. There are no permutations which give a greater total than this.

The maximum possible correlation between the streams is given by ##\mathcal{C}_{12}=(2S_{12}-N)/N## which gives ##1-2(p_2-p_1)##.
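The counting argument above can be sketched in a few lines of Python. This is only an illustration under the post's assumptions (the number of 1's in each stream is known exactly); the function names `max_agreements` and `max_correlation` are made up for this example.

```python
def max_agreements(ones_a, ones_b, n):
    """Largest possible number of (0,0) plus (1,1) coincidences between two
    binary streams of length n that contain ones_a and ones_b ones."""
    max_11 = min(ones_a, ones_b)          # cannot pair more 1's than the scarcer stream has
    max_00 = min(n - ones_a, n - ones_b)  # same argument for the 0's
    return max_00 + max_11


def max_correlation(ones_a, ones_b, n):
    """Correlation (2*S - N)/N evaluated at the maximum agreement count S."""
    s = max_agreements(ones_a, ones_b, n)
    return (2 * s - n) / n


# Example: N = 10 with 3 ones in one stream and 5 in the other (p_1 = 0.3, p_2 = 0.5).
print(max_agreements(3, 5, 10))   # S_12 = N(1 - p_2 + p_1) = 8
print(max_correlation(3, 5, 10))  # 1 - 2(p_2 - p_1) = 0.6
```

Sorting both streams so that their 0's come first is one arrangement that reaches this maximum.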

From this one can write, for the maximum possible correlations between 4 streams (assuming ##p_1\leq p_2 \leq p_3 \leq p_4##):

##|\mathcal{C}_{12}+\mathcal{C}_{23}+\mathcal{C}_{34}-\mathcal{C}_{41}| \leq 2##

the ##p_n## terms conveniently cancelling.
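For reference, the cancellation can be written out explicitly. This is a sketch that only substitutes the pairwise maxima ##\mathcal{C}_{ij}=1-2(p_j-p_i)## (for ##p_i \leq p_j##) into the left-hand side:

##\mathcal{C}_{12}+\mathcal{C}_{23}+\mathcal{C}_{34} = 3-2\left[(p_2-p_1)+(p_3-p_2)+(p_4-p_3)\right] = 3-2(p_4-p_1)##

##\mathcal{C}_{41} = 1-2(p_4-p_1)##

##\mathcal{C}_{12}+\mathcal{C}_{23}+\mathcal{C}_{34}-\mathcal{C}_{41} = 3-2(p_4-p_1)-1+2(p_4-p_1) = 2##

Note that this shows the expression equals exactly 2 when every pair sits at its maximum. On its own it does not establish the bound for arbitrary values of the four correlations, since ##\mathcal{C}_{41}## enters with a minus sign and may be smaller than its maximum.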
 
Hey Mentz114.

I think it would be useful to define a test statistic that can be decided on regarding whether co-incidences exist and then to use that to evaluate the hypothesis of co-incidences.

If you can do that then you will have a far better chance of understanding and estimating this attribute in your random sample.

Strictly speaking, the first thing to do would probably be to assess the sample for independence. Independence means that any conditional probability equals the unconditional probability of the original random variable (not the one being conditioned on).

There are statistical tests to do this - and I think one involves chi-square.

http://www.stat.wmich.edu/s216/book/node112.html
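As a rough illustration of the chi-square idea, the statistic for two binary streams can be computed by hand from the 2x2 table of coincidence counts. This is a minimal sketch, not necessarily the procedure in the linked notes; the function name `chi_square_2x2` is invented here, and the comparison value 3.841 is the standard 5% critical value for one degree of freedom.

```python
def chi_square_2x2(stream_a, stream_b):
    """Pearson chi-square statistic for independence of two binary streams,
    computed from the 2x2 table of (a, b) coincidence counts."""
    n = len(stream_a)
    counts = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 0}
    for x, y in zip(stream_a, stream_b):
        counts[(x, y)] += 1
    a, b = counts[(0, 0)], counts[(0, 1)]
    c, d = counts[(1, 0)], counts[(1, 1)]
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a stream is constant, so the statistic is undefined")
    # Shortcut formula for a 2x2 table: N (ad - bc)^2 / (product of the margins)
    return n * (a * d - b * c) ** 2 / denom


CRITICAL_5_PERCENT_1_DOF = 3.841  # reject independence above this value

identical = [0, 1] * 10
print(chi_square_2x2(identical, identical))        # 20.0, far above 3.841
print(chi_square_2x2([0, 0, 1, 1] * 5, [0, 1, 0, 1] * 5))  # 0.0, no evidence of dependence
```

The shortcut formula is only reliable when the expected counts in every cell are not too small.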

Basically, if correlation exists it can exist in many forms, but an independence test is the first step in finding evidence of whether hidden correlations may exist.

The other way is to partition the random variables and decompose them based on their correlation - something that happens in a Principal Component Analysis (or PCA). If information is independent then the decomposition should yield what was initially there to start off with and you won't be able to reduce the dimension of the system without significantly impacting its ability to capture variation.
 
chiro said:
Hey Mentz114.

I think it would be useful to define a test statistic that can be decided on regarding whether co-incidences exist and then to use that to evaluate the hypothesis of co-incidences.
[..]
Chiro,

thanks for the reply. I think you might be misunderstanding what I'm doing. The theoretical limits on correlations are not (it seems) a very interesting subject, but they do crop up; see here for instance: Bell notes and wiki CHSH.
I could be in the wrong sub-forum ...
 
Mentz114 said:
the maximum number of (0,0) coincidences can not be greater than the minimum of ##0_1=N(1-p_1)## and ##0_2=N(1-p_2)##.
Maybe I am misunderstanding you, but I would say this is wrong. You are ignoring the possibility of an unlikely event. Although it is unlikely, they can both be 0 for all N trials as long as there is any possibility of it (i.e. neither p1 nor p2 being 1).
If the random variables are independent, they have an actual correlation of 0. But even if they are independent, it is possible for a sample to have a correlation anywhere between -1 and 1, inclusive. As the sample size, N, gets large, the probability of sample correlations being far from 0 gets small. But it is always possible to get values anywhere between -1 and 1, inclusive.
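To put a number on the unlikely event, here is a small sketch assuming the two streams are independent and each trial is independent of the others. It uses the coincidence-based correlation ##(2S-N)/N## from the first post rather than the Pearson coefficient; both function names are made up for this example.

```python
def prob_both_all_zero(p1, p2, n):
    """Probability that two independent streams are both all-zero over n trials,
    where p1 and p2 are the per-trial probabilities of a 1."""
    return ((1 - p1) * (1 - p2)) ** n


def agreement_correlation(stream_a, stream_b):
    """Sample correlation (2*S - N)/N, with S the number of (0,0) and (1,1) coincidences."""
    n = len(stream_a)
    s = sum(1 for x, y in zip(stream_a, stream_b) if x == y)
    return (2 * s - n) / n


# With p1 = 0.3 and p2 = 0.5 the proposed limit is 1 - 2(p2 - p1) = 0.6,
# yet an all-zero sample has correlation 1 and nonzero probability.
print(agreement_correlation([0] * 10, [0] * 10))  # 1.0
print(prob_both_all_zero(0.3, 0.5, 10) > 0)       # True, although the value is tiny
```

The probability shrinks geometrically with N, which is consistent with the limit holding only asymptotically.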
 
FactChecker said:
Maybe I am misunderstanding you, but I would say this is wrong. You are ignoring the possibility of an unlikely event. Although it is unlikely, they can both be 0 for all N trials as long as there is any possibility of it (i.e. neither p1 nor p2 being 1).
If the random variables are independent, they have an actual correlation of 0. But even if they are independent, it is possible for a sample to have a correlation anywhere between -1 and 1, inclusive. As the sample size, N, gets large, the probability of sample correlations being far from 0 gets small. But it is always possible to get values anywhere between -1 and 1, inclusive.
Yes, this is true. I don't think ##1-2(p_2-p_1)## is a limit (except asymptotically) because we have probabilities in the expression.

In fact the first expression I worked out was the multi-stream limit where the probabilities cancel. This is the same as the CHSH inequality which is reckoned to be a true limit.
Can my logic for ##|\mathcal{C}_{12}+\mathcal{C}_{23}+\mathcal{C}_{34}-\mathcal{C}_{41}| \leq 2## be saved because it has no probabilities in it?

I think I'm assuming the same things as the derivation I've attached, which uses set logic.
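One way to probe the question numerically is an exhaustive check over short sequences. The sketch below assumes all four streams are recorded on the same N trials and uses the coincidence-based correlation ##(2S-N)/N##; the names `chsh_sum` and `worst_case` are invented for this illustration. Under that assumption each single trial contributes exactly ##\pm 2## to the sum (map 0 and 1 to ##\mp 1## and note ##ab+bc+cd-da = b(a+c)+d(c-a)##), so the average cannot exceed 2 in magnitude for any finite N.

```python
from fractions import Fraction
from itertools import product


def chsh_sum(s1, s2, s3, s4):
    """C_12 + C_23 + C_34 - C_41 with C_ij = (2*S_ij - N)/N, as exact fractions."""
    n = len(s1)

    def corr(x, y):
        agree = sum(1 for u, v in zip(x, y) if u == v)
        return Fraction(2 * agree - n, n)

    return corr(s1, s2) + corr(s2, s3) + corr(s3, s4) - corr(s4, s1)


def worst_case(n):
    """Largest |C_12 + C_23 + C_34 - C_41| over all four-stream samples of length n."""
    streams = list(product([0, 1], repeat=n))
    return max(abs(chsh_sum(a, b, c, d))
               for a, b, c, d in product(streams, repeat=4))


print(worst_case(3))  # 2
```

This says nothing about the case where the four correlations are estimated from different sets of trials, which is where the sampling objection raised above would apply.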
 
