Combination of two dependent discrete random variables

simcc
Hi,
I’m looking for a way to combine two discrete random variables (which I have as probability distributions). The combination should be the product (or other operation) of the two variables.
This would be easy if they were independent, but they’re not. There is a known correlation between the variables.

Question: how to combine two discrete random variables with correlation?
Given: The marginal probabilities of the two variables & a correlation function
Result: either the individual probabilities in a probability table or the complete probability distribution of the combination.

Simple example:
Variables A and B are the distributions:
PA(a=1, 4) = [0.75, 0.25]
PB(b=4, 8, 10) = [0.25, 0.25, 0.5]

Their joint probability function is shown in their joint probability table and joint value table:
P       B=4     B=8     B=10    P(A)
A=1     ?       ?       ?       0.75
A=4     ?       ?       ?       0.25
P(B)    0.25    0.25    0.5     1

value   B=4     B=8     B=10
A=1     4       8       10
A=4     16      32      40

(tables are clearer in attached file)

The correlation between the two variables is: b = 10 – 2/3*a

P(A*B)(4, 8, 10, 16, 32, 40) = ?
 

You've described an interesting type of problem. This general type of problem is "ill posed", meaning that some instances of it have infinitely many solutions. However, ill-posed problems arise in many real-world situations, such as the mathematics of reconstructing CAT scans and MRI scans, so you shouldn't let the ill-posed nature of the problem deter you from thinking about it if you find it interesting.

To solve for the joint probability distribution (or determine that there are no solutions or infinitely many solutions), set up the simultaneous equations that the entries in the joint probability table must satisfy. Each entry in the joint probability table is an unknown. The fact that each row sum is known gives you an equation for each row. Likewise the totals for each column give you an equation for each column.
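As a rough sketch of that setup for the example in the first post (using NumPy, with variable names of my own choosing), the row and column constraints can be written as a linear system M p = y, where p holds the six unknown table entries. The rank of M then tells you how many entries the marginals alone leave undetermined:

```python
import numpy as np

pa = np.array([0.75, 0.25])        # P(A=1), P(A=4)
pb = np.array([0.25, 0.25, 0.5])   # P(B=4), P(B=8), P(B=10)
n_a, n_b = len(pa), len(pb)

# Unknowns: p[i, j] = P(A=a_i, B=b_j), flattened row by row.
rows, rhs = [], []
for i in range(n_a):               # each row of the table sums to P(A=a_i)
    m = np.zeros((n_a, n_b))
    m[i, :] = 1
    rows.append(m.ravel())
    rhs.append(pa[i])
for j in range(n_b):               # each column of the table sums to P(B=b_j)
    m = np.zeros((n_a, n_b))
    m[:, j] = 1
    rows.append(m.ravel())
    rhs.append(pb[j])

M = np.array(rows)
y = np.array(rhs)

rank = np.linalg.matrix_rank(M)
free = n_a * n_b - rank            # unknowns not pinned down by the marginals
print("equations:", M.shape[0], "rank:", rank, "free parameters:", free)
```

One of the five equations is redundant because the row totals and the column totals both sum to 1, so the marginals by themselves leave two free parameters here. Any extra information about the dependence has to supply the remaining equations, and the entries must also stay between 0 and 1.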

You need to clarify what you mean by "the correlation function". If you mean the line computed by linear regression (a least-squares fit), there is some ambiguity about that line. The line computed by treating B as the independent variable is not the same as the line you get by treating A as the independent variable. There is also a method called "total least squares" that fits a regression line that may differ from both of those lines. (If you intended to say "correlation coefficient", that is a single number, not a line. Likewise, the "covariance" of A and B is not a line.)

How you define the "correlation function" will give you more equations for the unknowns in the joint probability table.

You may find that in some cases, the simultaneous equations have no solution and in some cases they may have infinitely many solutions.
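Here is a small consistency check on the example from the first post, under the assumption (mine, not confirmed by the poster) that b = 10 – 2/3*a is meant to be a least-squares line. An ordinary least-squares line with an intercept always passes through the point of means (E[A], E[B]), and so does a total-least-squares line, so the marginals alone already tell you whether such a line is possible:

```python
a_vals, pa = [1, 4], [0.75, 0.25]
b_vals, pb = [4, 8, 10], [0.25, 0.25, 0.5]

# Means follow from the marginals alone; no joint table is needed.
mean_a = sum(a * p for a, p in zip(a_vals, pa))
mean_b = sum(b * p for b, p in zip(b_vals, pb))

# Value of the stated line b = 10 - (2/3) * a at a = E[A].
line_at_mean_a = 10 - (2 / 3) * mean_a

# A least-squares line must satisfy line(E[A]) == E[B].
consistent = abs(line_at_mean_a - mean_b) < 1e-9
print("E[A] =", mean_a, "E[B] =", mean_b)
print("line at E[A] =", line_at_mean_a, "consistent:", consistent)
```

With these marginals the check fails, so under that reading no joint table can reproduce the stated line. The example would then fall into the "no solution" case unless the line or the marginals are adjusted, or the "correlation function" means something else.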

As to the probability distribution for the quantity AB, it would be defined by a table that gives all the possible values of AB and their probabilities. It would not list a value twice. So if your "joint value" table for AB had several entries all equal to the same number, the final table for the random variable AB would list that number as a value only once. The probability of that value would be the sum of all probabilities in the joint distribution table that correspond to that "joint value".
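A minimal sketch of that bookkeeping, assuming the joint table is already known. The function name is my own, and the joint table used below is only a placeholder built by assuming independence, which is not the dependent case asked about:

```python
from collections import defaultdict

def product_distribution(a_vals, b_vals, joint):
    """Collapse a joint table P(A=a_i, B=b_j) into the distribution of A*B.

    Cells that give the same product are merged by adding their probabilities.
    """
    dist = defaultdict(float)
    for i, a in enumerate(a_vals):
        for j, b in enumerate(b_vals):
            dist[a * b] += joint[i][j]
    return dict(sorted(dist.items()))

a_vals, pa = [1, 4], [0.75, 0.25]
b_vals, pb = [4, 8, 10], [0.25, 0.25, 0.5]

# Placeholder joint table (independence assumed purely for illustration).
joint = [[p * q for q in pb] for p in pa]

result = product_distribution(a_vals, b_vals, joint)
for value, prob in result.items():
    print(value, prob)
```

In this particular example all six products are distinct, so nothing gets merged. With values such as A in {1, 2} and B in {2, 4}, the product 4 occurs in two cells and their probabilities would be added.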
 
If you can use the normal approximation to your two distributions and know the correlation \rho, you should be able to use the characteristic function:

\phi(t_1,t_2)=\exp\left[i(t_1\mu_1+t_2\mu_2)-\tfrac{1}{2}\left(\sigma_1^{2}t_1^{2}+2\rho\,\sigma_1\sigma_2\,t_1 t_2+\sigma_2^{2}t_2^{2}\right)\right]
 