# Convolution questions

1. Jul 6, 2016

### rabbed

Hi

Two questions:

1)
I saw this definition of expectation value:
E[g(X)] = integral wrt x from -inf to inf of g(x)*f(x)*dx
for some function g(x) of a random variable X and its density function f(x).

Can this be used to derive why convolution gives the density of a random
variable sum?

2)
In cases where the determinant cannot be calculated, does convolution give
any hint about the Jacobian in the PDF method formula,
Y_PDF(y) = X_PDF(f^-1(y)) / |f'(f^-1(y))| for some Y = f(X) of a random
variable X with known density X_PDF(x)?
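
For question 1, here is a sketch of how the expectation formula can lead to the convolution. It assumes X and Y are independent (not stated in the question) and writes p_X, p_Y, p_Z for the densities, which is notation chosen here:

```latex
\begin{align*}
E[g(X+Y)] &= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} g(x+y)\, p_X(x)\, p_Y(y)\, dy\, dx \\
          &= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} g(z)\, p_X(x)\, p_Y(z-x)\, dz\, dx
             \qquad (z = x + y,\ dz = dy \text{ for fixed } x) \\
          &= \int_{-\infty}^{\infty} g(z) \left[ \int_{-\infty}^{\infty} p_X(x)\, p_Y(z-x)\, dx \right] dz .
\end{align*}
```

With Z = X + Y we also have E[g(Z)] = integral of g(z)*p_Z(z)*dz for every g, so comparing the two expressions suggests p_Z(z) = integral wrt x of p_X(x)*p_Y(z-x)*dx, which is the convolution.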

2. Jul 6, 2016

### chiro

Hey rabbed.

The convolution theorem gives a multiplicative identity in the frequency (Fourier) domain, not in the time domain.

I think you should look at the convolution theorem from a probability perspective in the discrete domain and then take limits to get the continuous identity (which is how many summation-to-integral identities arise).

When you add two random variables, you are matching up every combination of their values that produces the same total.

I'll use a simple example.

X ~ U[0,2] with a state space {0,1,2} [Discrete Uniform]
Y ~ U[0,2] as well

You have combinations of values with the frequencies (across all possible events):

0 corresponds to [0+0]
1 corresponds to [0+1,1+0]
2 corresponds to [0+2,1+1,2+0]
3 corresponds to [1+2,2+1]
4 corresponds to [2+2]

I worked through an explicit example to help you understand the grouping process. The formula you derive for the distribution will represent this frequency, normalized (known as a probability). It's best to write it as a summation and then group the probabilities that map to the same event.

If you do this, you can derive the convolution theorem's connection to finding the distribution (the probability density function, or normalized frequency) of a sum of two random variables.
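
The grouping above can be sketched in a few lines of Python. This is only an illustration under the assumption that X and Y are independent; the names `x_pmf`, `y_pmf` and `sum_pmf` are made up for this example:

```python
from collections import defaultdict
from fractions import Fraction

# PMFs of X and Y: discrete uniform on {0, 1, 2}
x_pmf = {k: Fraction(1, 3) for k in range(3)}
y_pmf = {k: Fraction(1, 3) for k in range(3)}

def sum_pmf(p, q):
    """PMF of X + Y for independent X ~ p and Y ~ q (discrete convolution)."""
    out = defaultdict(Fraction)
    for x, px in p.items():
        for y, qy in q.items():
            out[x + y] += px * qy  # group every pair (x, y) by its sum
    return dict(out)

z_pmf = sum_pmf(x_pmf, y_pmf)
for z in sorted(z_pmf):
    print(z, z_pmf[z])
```

The counts 1, 2, 3, 2, 1 from the list above show up as probabilities 1/9, 2/9, 1/3, 2/9, 1/9.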

3. Jul 10, 2016

### rabbed

Thanks

I was able to derive the density formulas for the sum and the difference. Now I'm looking at how to derive the product formula:
So let's say I throw two dice, which generate outcomes according to independent random variables X and Y.
X_PMF(x) = Y_PMF(y) = 1/6 (for 1 <= x <= 6 and 1 <= y <= 6)
I first want to derive the probability mass formula of a product of random variables by calculating Z_PMF(z) for the random
variable Z = X * Y.
I first list all possible outcomes of Z and group all combinations of two dice by the products of their numbers of dots:

01 - [1,1]
02 - [1,2], [2,1]
03 - [1,3], [3,1]
04 - [1,4], [2,2], [4,1]
05 - [1,5], [5,1]
06 - [1,6], [2,3], [3,2], [6,1]
07 -
08 - [2,4], [4,2]
09 - [3,3]
10 - [2,5], [5,2]
11 -
12 - [2,6], [3,4], [4,3], [6,2]
13 -
14 -
15 - [3,5], [5,3]
16 - [4,4]
17 -
18 - [3,6], [6,3]
19 -
20 - [4,5], [5,4]
21 -
22 -
23 -
24 - [4,6], [6,4]
25 - [5,5]
26 -
27 -
28 -
29 -
30 - [5,6], [6,5]
31 -
32 -
33 -
34 -
35 -
36 - [6,6]

For each outcome of Z, normalizing the number of dice combinations by the total number of combinations (36),
I then express the probabilities of the outcomes of Z in terms of probability mass for X and Y:

01 - 1/6*1/6
02 - 1/6*1/6 + 1/6*1/6
03 - 1/6*1/6 + 1/6*1/6
04 - 1/6*1/6 + 1/6*1/6 + 1/6*1/6
05 - 1/6*1/6 + 1/6*1/6
06 - 1/6*1/6 + 1/6*1/6 + 1/6*1/6 + 1/6*1/6
07 -
08 - 1/6*1/6 + 1/6*1/6
09 - 1/6*1/6
10 - 1/6*1/6 + 1/6*1/6
11 -
12 - 1/6*1/6 + 1/6*1/6 + 1/6*1/6 + 1/6*1/6
13 -
14 -
15 - 1/6*1/6 + 1/6*1/6
16 - 1/6*1/6
17 -
18 - 1/6*1/6 + 1/6*1/6
19 -
20 - 1/6*1/6 + 1/6*1/6
21 -
22 -
23 -
24 - 1/6*1/6 + 1/6*1/6
25 - 1/6*1/6
26 -
27 -
28 -
29 -
30 - 1/6*1/6 + 1/6*1/6
31 -
32 -
33 -
34 -
35 -
36 - 1/6*1/6
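
The two tables above can be reproduced by brute-force enumeration. A minimal sketch, assuming two fair six-sided dice; the variable names are chosen for this example only:

```python
from collections import Counter
from fractions import Fraction
from itertools import product

faces = range(1, 7)

# Count how many ordered pairs (x, y) give each product z = x * y
counts = Counter(x * y for x, y in product(faces, repeat=2))

# Normalize by the 36 equally likely pairs to get Z_PMF
dice_product_pmf = {z: Fraction(n, 36) for z, n in sorted(counts.items())}

for z, p in dice_product_pmf.items():
    print(f"{z:02d} - {counts[z]} pair(s), probability {p}")
```

Products such as 7 or 11 never appear as keys, which corresponds to the empty rows in the tables.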

Then I try to find an algebraic formula (by iterating over x, setting y = z/x so that Y_PMF(y) = Y_PMF(z/x), and using AND/OR logic)
that satisfies all outcomes of Z (the case Z=6 is shown below):

Z_PMF(06) =
X_PMF(1)*Y_PMF(06/1) +
X_PMF(2)*Y_PMF(06/2) +
X_PMF(3)*Y_PMF(06/3) +
X_PMF(4)*Y_PMF(06/4) +
X_PMF(5)*Y_PMF(06/5) +
X_PMF(6)*Y_PMF(06/6)

Z_PMF(06) =
X_PMF(1)*Y_PMF(6) +
X_PMF(2)*Y_PMF(3) +
X_PMF(3)*Y_PMF(2) +
X_PMF(4)*Y_PMF(1.5) +
X_PMF(5)*Y_PMF(1.2) +
X_PMF(6)*Y_PMF(1)

Z_PMF(06) =
1/6*1/6 +
1/6*1/6 +
1/6*1/6 +
1/6*0 +
1/6*0 +
1/6*1/6

Z_PMF(06) = 1/6*1/6 + 1/6*1/6 + 1/6*1/6 + 1/6*1/6

Is it so that Y_PMF(1.5) = Y_PMF(1.2) = 0 because a PMF is only valid for integers?
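
A small sketch of this sum, under the assumption that the PMF of a fair die is defined as zero at every value that is not an integer from 1 to 6 (which is what makes the terms for x = 4 and x = 5 vanish). The function names are invented for this example:

```python
from fractions import Fraction

def die_pmf(v):
    """PMF of a fair die; zero for anything that is not an integer from 1 to 6."""
    v = Fraction(v)
    return Fraction(1, 6) if v.denominator == 1 and 1 <= v <= 6 else Fraction(0)

def product_pmf(z, x_pmf=die_pmf, y_pmf=die_pmf, support=range(1, 7)):
    """Z_PMF(z) = sum over x of X_PMF(x) * Y_PMF(z / x), for Z = X * Y."""
    return sum(x_pmf(x) * y_pmf(Fraction(z, x)) for x in support)

print(product_pmf(6))   # 1/9, i.e. 4 * (1/6 * 1/6)
print(product_pmf(7))   # 0
```

Summing `product_pmf(z)` over z = 1..36 gives 1, so the formula without any extra factor is already a valid PMF in this discrete case.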

I get Z_PMF(z) = sum(for x from -inf to inf): X_PMF(x) * Y_PMF(z/x)
I think the actual product formula is Z_PMF(z) = sum(for x from -inf to inf): X_PMF(x) * Y_PMF(z/x) / |x|
How does the |x| come in?

4. Jul 11, 2016

### chiro

What you are doing involves a substitution theorem for probability.

Have you looked at change of random variable density function results?

What you are looking at is the multi-variable version of that, applied to two variables to go from a 2D random variable to a 1D one.
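
A sketch of that 2D-to-1D reduction for the product, assuming X and Y are independent continuous random variables with densities p_X and p_Y. The auxiliary variable W and the notation are chosen here for illustration. Take (Z, W) = (XY, X), whose inverse is x = w, y = z/w:

```latex
\begin{align*}
\left|\det \frac{\partial(x, y)}{\partial(z, w)}\right|
  &= \left|\det \begin{pmatrix} 0 & 1 \\ 1/w & -z/w^{2} \end{pmatrix}\right| = \frac{1}{|w|}, \\
p_{Z,W}(z, w) &= p_X(w)\, p_Y\!\left(\frac{z}{w}\right) \frac{1}{|w|}, \\
p_Z(z) &= \int_{-\infty}^{\infty} p_X(w)\, p_Y\!\left(\frac{z}{w}\right) \frac{1}{|w|}\, dw .
\end{align*}
```

The 1/|w| factor belongs to the continuous density. In the discrete dice case the PMF assigns probability to points rather than to lengths, so no such factor appears in the sum.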

5. Jul 11, 2016

### rabbed

I think so, I started looking at transformations using the CDF and PDF method, but noticed there were gaps I needed to fill in.

The general formula for finding the PDF of a function of random variables (using the PDF method) is:
Y_PDF(y) = X_PDF(f^-1(y)) / |f'(f^-1(y))| for some Y = f(X) of a random variable X with known density X_PDF(x)
right?
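
A quick numerical check of this formula, as a sketch only. The transformation Y = 1 - 2X and the choice of a standard normal X are assumptions made for this example (a decreasing f is used so that the absolute value matters); the helper names are hypothetical:

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

# Illustrative transformation (an assumption, not from the thread): Y = f(X) = 1 - 2X
def f_inv(y):
    return (1.0 - y) / 2.0

def f_prime(x):
    return -2.0

def y_pdf(y):
    """PDF method: Y_PDF(y) = X_PDF(f^-1(y)) / |f'(f^-1(y))|."""
    x = f_inv(y)
    return normal_pdf(x) / abs(f_prime(x))

# For standard normal X, Y = 1 - 2X is normal with mean 1 and standard deviation 2
for y in (-3.0, 0.0, 1.0, 4.5):
    print(y, y_pdf(y), normal_pdf(y, mu=1.0, sigma=2.0))
```

The last two columns agree. Dropping the `abs` would make the density negative everywhere for this decreasing f.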

Generalizing this, for transformations where the number of source RVs/dimensions equals the number of destination RVs/dimensions,
|f'(f^-1(y))| turns into the absolute value of the Jacobian, right?

But for cases where the dimensions are not equal, do I need these convolution (and similar) formulas? I'm hoping to tie this together with the PDF method formula in some way.

Is "|x|" the absolute value of the transformation function derivative (sometimes the Jacobian)?

6. Jul 11, 2016

### chiro

I'd have to look at the formula, but the derivative of the inverse function should become the Jacobian in a multi-dimensional substitution integral (it is the derivative term that is involved).

7. Jul 11, 2016

### rabbed

8. Jul 12, 2016

### chiro

You should probably take a look at something like the link below and verify it in depth for yourself:

https://www.cl.cam.ac.uk/teaching/0708/Probabilty/prob11.pdf

They have a section explaining why the absolute value exists.

That would have to be generalized to your Jacobian for the result to make sense, but essentially the Jacobian represents the expansion/contraction with respect to the gradient vector, and the "idea" itself is the same.

I recall seeing some proof in a standard undergraduate statistics book when I did this stuff ages ago so you could use those resources as well.

9. Aug 27, 2016

### rabbed

Hi again, a bit late :)

I failed to see how any of the constraints on page 11.3 in the linked document above have the consequence of requiring an absolute value.

But it should boil down to the chain rule. Could you say something like this?

Normally, when differentiating f(x) wrt y where x = g(y), we use:
df/dx * dx/dy = (df/dx) / (dy/dx)
because we allow dy/dx to be negative, since each value of the derivative should contain
(the change in f wrt x) / (the change in y wrt x).
This way, we create something that can decrease the sum when integrating it.

But when calculating the PDF of Y (where f(x) is the CDF of X and g(y) is a one-to-one inverse
transformation giving x), we use:
df/dx * |dx/dy| = (df/dx) / |dy/dx|
because we do not want to allow dy/dx to be negative, since each value of a PDF should contain
(the probability (not change in probability) of y) / (_outcome length_ of y).
This way, we create something that can never decrease the sum when integrating it.

Does it make sense? Any comments?

10. Aug 28, 2016

### rabbed

I think my point is: a PDF is not defined as the derivative of probability as a function of outcome, but rather as probability per outcome, and its integral (the CDF) becomes the sum of probabilities.
So differentiation and integration of a simple variable will still work as usual, but differentiating the CDF of a variable that is a function of some other variable is different, and that's where the absolute value comes from?
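
One way to see where the absolute value can come from is the CDF method, sketched here assuming Y = f(X) with f differentiable and strictly monotonic. F_X, F_Y denote the CDFs and p_X, p_Y the PDFs, which is notation chosen here:

```latex
\begin{align*}
\text{$f$ increasing:}\quad F_Y(y) &= P(X \le f^{-1}(y)) = F_X(f^{-1}(y)),
  & p_Y(y) &= p_X(f^{-1}(y))\, \frac{d f^{-1}(y)}{dy}, \\
\text{$f$ decreasing:}\quad F_Y(y) &= P(X \ge f^{-1}(y)) = 1 - F_X(f^{-1}(y)),
  & p_Y(y) &= -p_X(f^{-1}(y))\, \frac{d f^{-1}(y)}{dy}.
\end{align*}
```

In the decreasing case d f^-1(y)/dy is negative, so the minus sign makes the result positive. Both cases combine into p_Y(y) = p_X(f^-1(y)) * |d f^-1(y)/dy| = p_X(f^-1(y)) / |f'(f^-1(y))|, so the ordinary chain rule is still used; the sign flip comes from the inequality reversing inside the probability.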