Graduate Convolution Questions: Expectation Value & PDF Method

Summary
The discussion centers on the application of expectation value and convolution in deriving the probability density function (PDF) for sums and products of random variables. It explores how convolution relates to the density of a sum of random variables and emphasizes the importance of understanding the grouping of outcomes in discrete distributions. The conversation also delves into the challenges of calculating the Jacobian in cases where the determinant is not easily computed, suggesting that convolution can provide insights into this process. Additionally, it highlights the necessity of using absolute values in the PDF method to ensure probabilities remain non-negative when transforming random variables. Overall, the thread provides a comprehensive examination of convolution's role in probability theory and its implications for random variable transformations.
rabbed
Hi

Two questions:

1)
I saw this definition of expectation value:
E[g(X)] = integral from -inf to inf of g(x)*f(x) dx
for some function g(x) of a random variable X and its density function f(x).

Can this be used to derive why convolution gives the density of a random
variable sum?

2)
In cases where the determinant cannot be calculated, does convolution give
any hints about the Jacobian in the PDF method formula,
Y_PDF(y) = X_PDF(f^-1(y)) / |f'(f^-1(y))| for some Y = f(X) of a random
variable X with known density X_PDF(x)?
 
Hey rabbed.

The convolution theorem has a multiplicative identity in the frequency (Fourier) domain, not in the time domain.

I think you should look at the convolution theorem from a probability perspective in the discrete domain and then apply limits to get the continuous identity (which is often how a lot of summation to integral identities actually happen).

When you are adding random variables you are matching up all the combinations of outcomes that produce each value of the sum.

I'll use a simple example.

X ~ U[0,2] with a state space {0,1,2} [Discrete Uniform]
Y ~ U[0,2] as well

You have the following combinations of values, with their frequencies (across all possible events):

0 corresponds to [0+0]
1 corresponds to [0+1,1+0]
2 corresponds to [0+2,1+1,2+0]
3 corresponds to [1+2,2+1]
4 corresponds to [2+2]

I did an explicit example to help you understand the grouping process. The formula you will derive for the distribution represents these frequencies normalized (known as probabilities), and it's best to write it as a summation that groups the probabilities mapping to each event.

If you do this you can derive the convolution theorem's connection to finding the probability density function (normalized frequency) of a sum of two random variables.
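
To make this concrete, here is a minimal Python sketch (my own illustration, assuming the U{0,1,2} example above) that does the grouping explicitly and checks it against the discrete convolution formula:

from collections import Counter
from fractions import Fraction

# PMF of a discrete uniform random variable on {0, 1, 2}
pmf = {k: Fraction(1, 3) for k in range(3)}

# Group every (x, y) pair by its sum and accumulate the probabilities
sum_pmf = Counter()
for x, px in pmf.items():
    for y, py in pmf.items():
        sum_pmf[x + y] += px * py

# Discrete convolution: Z_PMF(z) = sum over x of X_PMF(x) * Y_PMF(z - x)
conv_pmf = {z: sum(pmf[x] * pmf.get(z - x, Fraction(0)) for x in pmf)
            for z in range(5)}

print(dict(sum_pmf))  # probabilities 1/9, 2/9, 3/9, 2/9, 1/9 for z = 0..4
assert dict(sum_pmf) == conv_pmf

Refining the grid and normalizing turns this summation into the convolution integral for continuous densities.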
 
Thanks

I was able to derive the sum and difference density formulas. Now I'm looking at how to derive the product formula:
So let's say I throw two dice, which generate outcomes according to independent random variables X and Y.
X_PMF(x) = Y_PMF(y) = 1/6 (for 1 <= x <= 6 and 1 <= y <= 6)
I first want to derive the probability mass formula of a random variable product by calculating Z_PMF(z) for the random variable Z = X * Y.
I list all possible outcomes of Z and group all combinations of the two dice by the products of their numbers of dots:

01 - [1,1]
02 - [1,2], [2,1]
03 - [1,3], [3,1]
04 - [1,4], [2,2], [4,1]
05 - [1,5], [5,1]
06 - [1,6], [2,3], [3,2], [6,1]
07 -
08 - [2,4], [4,2]
09 - [3,3]
10 - [2,5], [5,2]
11 -
12 - [2,6], [3,4], [4,3], [6,2]
13 -
14 -
15 - [3,5], [5,3]
16 - [4,4]
17 -
18 - [3,6], [6,3]
19 -
20 - [4,5], [5,4]
21 -
22 -
23 -
24 - [4,6], [6,4]
25 - [5,5]
26 -
27 -
28 -
29 -
30 - [5,6], [6,5]
31 -
32 -
33 -
34 -
35 -
36 - [6,6]

For each outcome of Z, normalizing the number of dice combinations by the total number of combinations (36),
I then try to express the probabilities of the outcomes of Z in terms of the probability masses of X and Y (a verification sketch follows the table):

01 - 1/6*1/6
02 - 1/6*1/6 + 1/6*1/6
03 - 1/6*1/6 + 1/6*1/6
04 - 1/6*1/6 + 1/6*1/6 + 1/6*1/6
05 - 1/6*1/6 + 1/6*1/6
06 - 1/6*1/6 + 1/6*1/6 + 1/6*1/6 + 1/6*1/6
07 -
08 - 1/6*1/6 + 1/6*1/6
09 - 1/6*1/6
10 - 1/6*1/6 + 1/6*1/6
11 -
12 - 1/6*1/6 + 1/6*1/6 + 1/6*1/6 + 1/6*1/6
13 -
14 -
15 - 1/6*1/6 + 1/6*1/6
16 - 1/6*1/6
17 -
18 - 1/6*1/6 + 1/6*1/6
19 -
20 - 1/6*1/6 + 1/6*1/6
21 -
22 -
23 -
24 - 1/6*1/6 + 1/6*1/6
25 - 1/6*1/6
26 -
27 -
28 -
29 -
30 - 1/6*1/6 + 1/6*1/6
31 -
32 -
33 -
34 -
35 -
36 - 1/6*1/6
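
As a sanity check on the tables above, here is a brute-force enumeration (a minimal sketch of my own, not part of the derivation) that groups the 36 equally likely dice combinations by their product:

from collections import Counter
from fractions import Fraction

# Group the 36 equally likely (x, y) dice pairs by their product
product_pmf = Counter()
for x in range(1, 7):
    for y in range(1, 7):
        product_pmf[x * y] += Fraction(1, 36)

print(product_pmf[6])   # 4/36 = 1/9: from [1,6], [2,3], [3,2], [6,1]
print(product_pmf[7])   # 0: no combination gives a product of 7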

Then I try to find an algebraic formula (by iterating over x, setting Y_PMF(y) = Y_PMF(z/x), and using AND/OR logic)
which satisfies all outcomes of Z (the case for Z = 6 is shown below):

Z_PMF(06) =
X_PMF(1)*Y_PMF(06/1) +
X_PMF(2)*Y_PMF(06/2) +
X_PMF(3)*Y_PMF(06/3) +
X_PMF(4)*Y_PMF(06/4) +
X_PMF(5)*Y_PMF(06/5) +
X_PMF(6)*Y_PMF(06/6)

Z_PMF(06) =
X_PMF(1)*Y_PMF(6) +
X_PMF(2)*Y_PMF(3) +
X_PMF(3)*Y_PMF(2) +
X_PMF(4)*Y_PMF(1.5) +
X_PMF(5)*Y_PMF(1.2) +
X_PMF(6)*Y_PMF(1)

Z_PMF(06) =
1/6*1/6 +
1/6*1/6 +
1/6*1/6 +
1/6*0 +
1/6*0 +
1/6*1/6

Z_PMF(06) = 1/6*1/6 + 1/6*1/6 + 1/6*1/6 + 1/6*1/6

Is it so that Y_PMF(1.5) = Y_PMF(1.2) = 0 because a PMF is zero outside its integer support?

I get Z_PMF(z) = sum (over x from -inf to inf) of X_PMF(x) * Y_PMF(z/x).
I think the actual product formula for continuous densities is Z_PDF(z) = integral (over x from -inf to inf) of X_PDF(x) * Y_PDF(z/x) / |x| dx.
How does the |x| come in?
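
For reference, here is a quick check of the discrete formula (a sketch of my own, assuming the dice setup above); the non-integer arguments z/x contribute zero, and no |x| appears in the discrete case:

from fractions import Fraction

def die_pmf(v):
    # PMF of a fair six-sided die: 1/6 on {1,...,6}, zero elsewhere
    return Fraction(1, 6) if v in (1, 2, 3, 4, 5, 6) else Fraction(0)

def z_pmf(z):
    # Z_PMF(z) = sum over x of X_PMF(x) * Y_PMF(z/x)
    return sum(die_pmf(x) * die_pmf(Fraction(z, x)) for x in range(1, 7))

print(z_pmf(6))   # 1/9, i.e. 4/36: the terms for x = 1, 2, 3, 6 survive
print(z_pmf(7))   # 0: z/x is never in {1,...,6}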
 
What you are doing involves a substitution theorem for probability.

Have you looked at change of random variable density function results?

What you are looking at is the multi-variable version of that, applied to two variables, to go from a 2D random variable to a 1D one.
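
To spell that out (a sketch of the standard two-variable substitution, not quoted from any reference): map (X, Y) to (Z, W) = (XY, X), apply the 2D change-of-variables formula, and then marginalize out W. In LaTeX:

f_{Z,W}(z,w) = f_{X,Y}\!\left(w, \frac{z}{w}\right)\left|\det\frac{\partial(x,y)}{\partial(z,w)}\right|,
\qquad
\det\frac{\partial(x,y)}{\partial(z,w)} =
\begin{vmatrix} 0 & 1 \\ 1/w & -z/w^{2} \end{vmatrix} = -\frac{1}{w},

f_Z(z) = \int_{-\infty}^{\infty} f_X(w)\, f_Y\!\left(\frac{z}{w}\right)\frac{1}{|w|}\, dw

Renaming the integration variable w back to x gives exactly the 1/|x| in the product formula: it is the absolute value of the Jacobian determinant of the substitution.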
 
I think so, I started looking at transformations using the CDF and PDF method, but noticed there were gaps I needed to fill in.

The general formula for finding the PDF of a function of random variables (using the PDF method) is:
Y_PDF(y) = X_PDF(f^-1(y)) / |f'(f^-1(y))| for some Y = f(X) of a random variable X with known density X_PDF(x)
right?
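
As a numerical illustration of that 1D formula, here is a rough sketch (my own; Y = X^3 with X standard normal is just an arbitrary monotone example):

import numpy as np

def normal_pdf(x):
    # Standard normal density
    return np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)

# Y = f(X) = X**3, so f^-1(y) = cbrt(y) and f'(x) = 3*x**2
y = 2.0
x = np.cbrt(y)
pdf_method = normal_pdf(x) / abs(3 * x**2)

# Monte Carlo estimate of the density of Y near y
rng = np.random.default_rng(0)
samples = rng.standard_normal(200_000) ** 3
h = 0.05
empirical = np.mean(np.abs(samples - y) < h) / (2 * h)

print(pdf_method, empirical)  # the two values should roughly agree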

Generalizing this, for transformations where the number of source RVs/dimensions equals the number of destination RVs/dimensions,
|f'(f^-1(y))| turns into the absolute value of the Jacobian determinant, right?

But for cases where the dimensions are not equal, I need these convolution (and similar) formulas? I'm hoping to tie this together with the PDF method formula in some way.

Is "|x|" the absolute value of the transformation function derivative (sometimes the Jacobian)?
 
I'd have to look at the formula, but the derivative of the inverse function should become the Jacobian in a multi-dimensional substitution integral (that is the derivative term involved).
 
You should probably take a look at something like below and verify it in depth for yourself:

https://www.cl.cam.ac.uk/teaching/0708/Probabilty/prob11.pdf

They have a section explaining why the absolute value exists.

That would have to be generalized for your Jacobian so that the result makes sense, but essentially the Jacobian represents the expansion/contraction with respect to the gradient vector, and the "idea" itself is the same.

I recall seeing some proof in a standard undergraduate statistics book when I did this stuff ages ago so you could use those resources as well.
 
Hi again, a bit late :)

I failed to see how any of the constraints on page 11.3 in the linked document above have the consequence of requiring an absolute value.

But it should boil down to the chain rule. Could you say something like this?

Normally, when differentiating f(x) with respect to y where x = g(y), we use:
df/dx * dx/dy = (df/dx) / (dy/dx)
because we allow dy/dx to be negative, since each value of the derivative should represent
(the change in f wrt x) / (the change in y wrt x).
This way, we create something that can decrease the sum when integrating it.

But when calculating the PDF of Y (where f(x) is the CDF of X and g(y) is a one-to-one inverse
transformation giving x), we use:
df/dx * |dx/dy| = (df/dx) / |dy/dx|
because we do not allow dy/dx to be negative, since each value of a PDF should represent
(the probability (not the change in probability) of y) / (the _outcome length_ of y).
This way, we create something that can never decrease the sum when integrating it.
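
To test this numerically, here is a rough sketch (Y = -X with X exponential, an arbitrary decreasing example); omitting the absolute value would produce a negative "density":

import numpy as np

rng = np.random.default_rng(1)
x = rng.exponential(scale=1.0, size=200_000)
y = -x  # a decreasing transformation: dy/dx = -1

# PDF method: Y_PDF(y) = X_PDF(-y) / |dy/dx| = exp(y) for y < 0
point = -1.5
with_abs = np.exp(point)      # about 0.223
without_abs = -np.exp(point)  # what dropping |.| would give

h = 0.05
empirical = np.mean(np.abs(y - point) < h) / (2 * h)
print(with_abs, empirical, without_abs)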

Does it make sense? Any comments?
 
I think my point is: a PDF is not defined as the derivative of probability as a function of outcome, but rather as probability per outcome, and its integral (the CDF) becomes the sum of probabilities.
So differentiation and integration of a simple variable still work as usual, but differentiating the CDF of a variable that is a function of some other variable is different, and that's where the absolute value comes from?
 
