Convolution Questions: Expectation Value & PDF Method

  • Context: Graduate 
  • Thread starter: rabbed
  • Tags: Convolution

Discussion Overview

The discussion revolves around the concepts of convolution in probability theory, specifically regarding expectation values, probability density functions (PDFs), and probability mass functions (PMFs). Participants explore the implications of convolution for sums and products of random variables, as well as the relationships between different transformations and their respective density functions.

Discussion Character

  • Exploratory
  • Technical explanation
  • Mathematical reasoning
  • Debate/contested

Main Points Raised

  • Some participants discuss the definition of expectation value and its potential application in deriving why convolution gives the density of a random variable sum.
  • Others suggest examining the convolution theorem from a probability perspective in the discrete domain before applying it to continuous cases.
  • A participant presents an example using discrete uniform distributions to illustrate how to derive the convolution theorem's connection to the probability density function of a sum of two random variables.
  • One participant describes their process of deriving the probability mass formula for the product of two independent random variables represented by dice rolls, detailing the combinations and outcomes.
  • There is a discussion about the use of the Jacobian in the PDF method and whether the absolute value of the transformation function derivative is necessary in certain cases.
  • Some participants question the need for absolute values in the context of transformations and derivatives, discussing the implications for probability density functions.
  • One participant references external resources to support their claims about the product formula and the role of the Jacobian in multiple dimensions.
  • There is a debate about the interpretation of derivatives in the context of PDFs and how they relate to the integration of probabilities.

Areas of Agreement / Disagreement

Participants express differing views on the necessity and interpretation of absolute values in the context of the Jacobian and transformations. While some agree on the general principles of convolution and its applications, the discussion remains unresolved regarding specific mathematical formulations and interpretations.

Contextual Notes

Participants note gaps in understanding regarding transformations and the application of the PDF method, indicating that further exploration of the underlying mathematical principles is needed. There are also references to specific pages in linked documents that may contain relevant constraints or explanations.

Who May Find This Useful

This discussion may be useful for students and practitioners in probability theory, statistics, and related fields who are interested in the mathematical foundations of convolution, expectation values, and transformations of random variables.

rabbed
Hi

Two questions:

1)
I saw this definition of expectation value:
E[g(X)] = integral wrt x from -inf to inf of g(x)*f(x)*dx
for some function g(x) of a random variable X and its density function f(x).

Can this be used to derive why convolution gives the density of a random
variable sum?

2)
In cases where the determinant cannot be calculated, does convolution give
any hint about the Jacobian in the PDF method formula,
Y_PDF(y) = X_PDF(f^-1(y)) / |f'(f^-1(y))| for some Y = f(X) of a random
variable X with known density X_PDF(x)?
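
For question 1, one common route can be sketched as follows. It assumes X and Y are independent with densities f_X and f_Y (an assumption not stated in the question), and it applies the expectation formula above to an arbitrary test function g of Z = X + Y:

```latex
\begin{aligned}
E[g(X+Y)]
  &= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} g(x+y)\, f_X(x)\, f_Y(y)\, dy\, dx \\
  &= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} g(z)\, f_X(x)\, f_Y(z-x)\, dz\, dx
     \qquad (z = x + y,\ dz = dy \text{ at fixed } x) \\
  &= \int_{-\infty}^{\infty} g(z) \left[ \int_{-\infty}^{\infty} f_X(x)\, f_Y(z-x)\, dx \right] dz .
\end{aligned}
```

Since E[g(Z)] is also the integral of g(z) f_Z(z) dz for every such g, comparing the two expressions suggests f_Z(z) = integral wrt x of f_X(x)*f_Y(z-x)*dx, which is the convolution.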
 
Hey rabbed.

The convolution theorem has a multiplicative identity in the frequency (Fourier) domain, not in the time domain.

I think you should look at the convolution theorem from a probability perspective in the discrete domain and then apply limits to get the continuous identity (which is often how a lot of summation to integral identities actually happen).

When you add two random variables, you collect, for each value of the sum, all the combinations of outcomes that produce it.

I'll use a simple example.

X ~ U[0,2] with a state space {0,1,2} [Discrete Uniform]
Y ~ U[0,2] as well

You have combinations of values with the following frequencies (across all possible events):

0 corresponds to [0+0]
1 corresponds to [0+1,1+0]
2 corresponds to [0+2,1+1,2+0]
3 corresponds to [1+2,2+1]
4 corresponds to [2+2]

I worked through an explicit example to show the grouping process. The formula you derive for the distribution represents these frequencies after normalization (that is, as probabilities). It is best to write it as a summation and then group the probabilities that map to the same event.

If you do this, you can derive the convolution theorem's connection to finding the probability distribution (normalized frequency) of a sum of two random variables.
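
The grouping above can be checked with a minimal sketch. It assumes X and Y are independent, each uniform on {0, 1, 2}; the names `convolve_pmf`, `sum_by_enumeration` and `sum_by_convolution` are made up for this illustration:

```python
from fractions import Fraction
from itertools import product

# Assumed setup: X and Y independent, each uniform on the state space {0, 1, 2}.
states = [0, 1, 2]
x_pmf = {k: Fraction(1, 3) for k in states}
y_pmf = {k: Fraction(1, 3) for k in states}

# Route 1: list every pair (x, y) and group the pairs by their sum.
sum_by_enumeration = {}
for x, y in product(states, states):
    z = x + y
    sum_by_enumeration[z] = sum_by_enumeration.get(z, Fraction(0)) + x_pmf[x] * y_pmf[y]

# Route 2: the discrete convolution sum, P(Z = z) = sum over x of P(X = x) * P(Y = z - x).
def convolve_pmf(p, q, z):
    return sum(p[x] * q.get(z - x, Fraction(0)) for x in p)

sum_by_convolution = {z: convolve_pmf(x_pmf, y_pmf, z) for z in range(0, 5)}

for z in sorted(sum_by_convolution):
    print(z, sum_by_enumeration[z], sum_by_convolution[z])
```

Both routes count the same pairs, so the two columns printed for each value of the sum should agree.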
 
Thanks

I was able to derive the sum and difference formulas for the density. Now I am looking at how to derive the product formula:
So let's say I throw two dice, which generate outcomes according to independent random variables X and Y.
X_PMF(x) = Y_PMF(y) = 1/6 (for 1 <= x <= 6 and 1 <= y <= 6)
I first want to derive the probability mass formula of a random variable product by calculating Z_PMF(z) for the random
variable Z = X * Y.
I first list all possible outcomes of Z and group all combinations of two dice by the products of their number of dots:

01 - [1,1]
02 - [1,2], [2,1]
03 - [1,3], [3,1]
04 - [1,4], [2,2], [4,1]
05 - [1,5], [5,1]
06 - [1,6], [2,3], [3,2], [6,1]
07 -
08 - [2,4], [4,2]
09 - [3,3]
10 - [2,5], [5,2]
11 -
12 - [2,6], [3,4], [4,3], [6,2]
13 -
14 -
15 - [3,5], [5,3]
16 - [4,4]
17 -
18 - [3,6], [6,3]
19 -
20 - [4,5], [5,4]
21 -
22 -
23 -
24 - [4,6], [6,4]
25 - [5,5]
26 -
27 -
28 -
29 -
30 - [5,6], [6,5]
31 -
32 -
33 -
34 -
35 -
36 - [6,6]

For each outcome of Z, normalizing the number of dice combinations by the total number of combinations (36),
I then try to express the probabilities of the outcomes of Z in terms of probability mass for X and Y:

01 - 1/6*1/6
02 - 1/6*1/6 + 1/6*1/6
03 - 1/6*1/6 + 1/6*1/6
04 - 1/6*1/6 + 1/6*1/6 + 1/6*1/6
05 - 1/6*1/6 + 1/6*1/6
06 - 1/6*1/6 + 1/6*1/6 + 1/6*1/6 + 1/6*1/6
07 -
08 - 1/6*1/6 + 1/6*1/6
09 - 1/6*1/6
10 - 1/6*1/6 + 1/6*1/6
11 -
12 - 1/6*1/6 + 1/6*1/6 + 1/6*1/6 + 1/6*1/6
13 -
14 -
15 - 1/6*1/6 + 1/6*1/6
16 - 1/6*1/6
17 -
18 - 1/6*1/6 + 1/6*1/6
19 -
20 - 1/6*1/6 + 1/6*1/6
21 -
22 -
23 -
24 - 1/6*1/6 + 1/6*1/6
25 - 1/6*1/6
26 -
27 -
28 -
29 -
30 - 1/6*1/6 + 1/6*1/6
31 -
32 -
33 -
34 -
35 -
36 - 1/6*1/6

Then I try to find an algebraic formula (by iterating over x, setting Y_PMF(y) = Y_PMF(z/x), and using AND/OR logic)
which satisfies all outcomes of Z (showing the case for Z=6 below):

Z_PMF(06) =
X_PMF(1)*Y_PMF(06/1) +
X_PMF(2)*Y_PMF(06/2) +
X_PMF(3)*Y_PMF(06/3) +
X_PMF(4)*Y_PMF(06/4) +
X_PMF(5)*Y_PMF(06/5) +
X_PMF(6)*Y_PMF(06/6)

Z_PMF(06) =
X_PMF(1)*Y_PMF(6) +
X_PMF(2)*Y_PMF(3) +
X_PMF(3)*Y_PMF(2) +
X_PMF(4)*Y_PMF(1.5) +
X_PMF(5)*Y_PMF(1.2) +
X_PMF(6)*Y_PMF(1)

Z_PMF(06) =
1/6*1/6 +
1/6*1/6 +
1/6*1/6 +
1/6*0 +
1/6*0 +
1/6*1/6

Z_PMF(06) = 1/6*1/6 + 1/6*1/6 + 1/6*1/6 + 1/6*1/6

Is it so that Y_PMF(1.5) = Y_PMF(1.2) = 0 because a PMF is only valid for integers?

I get Z_PMF(z) = sum(for x from -inf to inf): X_PMF(x) * Y_PMF(z/x)
I think the actual product formula is Z_PMF(z) = sum(for x from -inf to inf): X_PMF(x) * Y_PMF(z/x) / |x|
How does the |x| come in?
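
As a check on the discrete case, here is a minimal sketch comparing the formula without the |x| factor against direct enumeration of the 36 dice pairs. The helper names `pmf_at` and `z_pmf_formula` are made up for this illustration, and Y_PMF is taken to be zero at non-integer arguments:

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
x_pmf = {k: Fraction(1, 6) for k in faces}
y_pmf = {k: Fraction(1, 6) for k in faces}

def pmf_at(pmf, value):
    """Mass at an exact rational value; zero unless it is an integer in the support."""
    if value.denominator != 1:
        return Fraction(0)
    return pmf.get(value.numerator, Fraction(0))

def z_pmf_formula(z):
    """Z_PMF(z) = sum over x of X_PMF(x) * Y_PMF(z/x), with no |x| factor."""
    return sum(x_pmf[x] * pmf_at(y_pmf, Fraction(z, x)) for x in faces)

# Direct enumeration: group all 36 pairs by their product.
z_pmf_enumerated = {}
for x, y in product(faces, faces):
    z_pmf_enumerated[x * y] = z_pmf_enumerated.get(x * y, Fraction(0)) + x_pmf[x] * y_pmf[y]

z_pmf_from_formula = {z: z_pmf_formula(z) for z in range(1, 37)}

for z in range(1, 37):
    print(z, z_pmf_from_formula[z], z_pmf_enumerated.get(z, Fraction(0)))
```

In this sketch the two columns agree for every z, which suggests that the discrete (PMF) formula needs no |x|; the 1/|x| factor belongs to the continuous (PDF) version of the product formula, where lengths get rescaled.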
 
What you are doing involves a substitution theorem for probability.

Have you looked at change of random variable density function results?

What you are looking at is the multi-variable version of that, applied to two variables to go from a 2D random variable to a 1D one.
 
I think so, I started looking at transformations using the CDF and PDF method, but noticed there were gaps I needed to fill in.

The general formula for finding the PDF of a function of random variables (using the PDF method) is:
Y_PDF(y) = X_PDF(f^-1(y)) / |f'(f^-1(y))| for some Y = f(X) of a random variable X with known density X_PDF(x)
right?

Generalizing this, for transformations where the number of source RVs/dimensions equals the number of destination RVs/dimensions,
|f'(f^-1(y))| turns into the absolute value of the Jacobian determinant, right?

But for cases where the dimensions are not equal I need these convolution (and similar) formulas? I'm hoping to tie this together with the PDF method formula in some way.

Is "|x|" the absolute value of the transformation function derivative (sometimes the Jacobian)?
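
One way to connect the two, sketched here under the assumption that X and Y are independent continuous variables with densities f_X and f_Y, is to pad the 2D-to-1D map with an auxiliary variable U = X so that the dimensions match, apply the Jacobian formula, and then integrate U out. The symbols U, u and f_{U,Z} are introduced only for this illustration:

```latex
\begin{aligned}
&(X, Y) \mapsto (U, Z) = (X,\, XY), \qquad \text{inverse: } x = u,\quad y = z/u \quad (u \neq 0), \\[4pt]
&\frac{\partial(x, y)}{\partial(u, z)}
  = \det \begin{pmatrix} 1 & 0 \\ -z/u^{2} & 1/u \end{pmatrix}
  = \frac{1}{u}, \\[4pt]
&f_{U,Z}(u, z) = f_X(u)\, f_Y\!\left(\frac{z}{u}\right) \frac{1}{|u|}, \\[4pt]
&f_Z(z) = \int_{-\infty}^{\infty} f_X(u)\, f_Y\!\left(\frac{z}{u}\right) \frac{1}{|u|}\, du .
\end{aligned}
```

Read this way, the |x| in the product formula is the absolute value of the Jacobian determinant of the padded transformation, and the convolution-type integral is the marginalization over the auxiliary variable.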
 
I'd have to look at the formula, but the derivative of the inverse function should become the Jacobian in a multi-dimensional substitution integral (that is the derivative term involved).
 
You should probably take a look at something like below and verify it in depth for yourself:

https://www.cl.cam.ac.uk/teaching/0708/probability/prob11.pdf

They have a section explaining why the absolute value exists.

That would have to be generalized for your Jacobian so that the result can be made sense of but essentially the Jacobian represents the expansion/contraction with respect to the gradient vector and the "idea" itself is the same.

I recall seeing some proof in a standard undergraduate statistics book when I did this stuff ages ago so you could use those resources as well.
 
Hi again, a bit late :)

I failed to see how any of the constraints on page 11.3 in the linked document above has the consequence of requiring an absolute value.

But, it should boil down to the chain rule.. Could you say something like this?

Normally, when differentiating f(x) with respect to y, where x = g(y), we use:
df/dx * dx/dy = (df/dx) / (dy/dx)
Here we allow dy/dx to be negative, since each value of the derivative should contain
(the change in f wrt x) / (the change in y wrt x).
This way, we create something that can decrease the sum when integrating it.

But when calculating the PDF of Y (where f(x) is the CDF of X and g(y) is the one-to-one inverse
transformation giving x), we use:
df/dx * |dx/dy| = (df/dx) / |dy/dx|
Here we do not allow dy/dx to be negative, since each value of a PDF should contain
(the probability (not change in probability) of y) / (_outcome length_ of y).
This way, we create something that can never decrease the sum when integrating it.

Does it make sense? Any comments?
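
For comparison, the usual CDF argument for a strictly decreasing transformation can be sketched as below. The notation is an assumption made for this sketch and differs from the post above: Y = h(X), F_X and F_Y are the CDFs, and p_X and p_Y are the densities.

```latex
\begin{aligned}
F_Y(y) &= P\big(h(X) \le y\big) = P\big(X \ge h^{-1}(y)\big) = 1 - F_X\big(h^{-1}(y)\big), \\[4pt]
p_Y(y) &= \frac{d}{dy} F_Y(y)
        = -\,p_X\big(h^{-1}(y)\big)\, \frac{d}{dy} h^{-1}(y)
        = \frac{p_X\big(h^{-1}(y)\big)}{-\,h'\big(h^{-1}(y)\big)}
        = \frac{p_X\big(h^{-1}(y)\big)}{\big|h'\big(h^{-1}(y)\big)\big|} .
\end{aligned}
```

In this version the PDF is still the ordinary derivative of the CDF of Y. The sign flip comes from the inequality reversing when h is decreasing, and the absolute value covers the increasing and decreasing cases with one formula.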
 
I think my point is that a PDF is not defined as the derivative of probability as a function of outcome, but rather as probability per outcome, and its integral (the CDF) becomes the sum of probabilities.
So differentiation and integration of a simple variable will still work as usual, but differentiating the CDF of a variable that is a function of some other variable is different, and that's where the absolute value comes from?
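
A small numerical sketch of what the absolute value does, under assumptions chosen only for illustration: X is exponential with rate 1 and Y = f(X) = -2X, a decreasing transformation. The helper names `y_pdf` and `integrate` are made up here:

```python
import math

# Assumed example: X ~ Exponential(1), Y = f(X) = -2 * X, so f is decreasing.
def x_pdf(x):
    return math.exp(-x) if x >= 0 else 0.0

def f_inv(y):
    return -y / 2.0

F_PRIME = -2.0  # f'(x) is constant for this linear f

def y_pdf(y, use_abs=True):
    """PDF method formula X_PDF(f^-1(y)) / |f'(f^-1(y))|, optionally without the absolute value."""
    denom = abs(F_PRIME) if use_abs else F_PRIME
    return x_pdf(f_inv(y)) / denom

def integrate(func, a, b, n=200_000):
    """Composite trapezoid rule on [a, b] with n subintervals."""
    h = (b - a) / n
    total = 0.5 * (func(a) + func(b))
    for i in range(1, n):
        total += func(a + i * h)
    return total * h

# Y takes values in (-inf, 0]; the tail below -60 is negligible for this example.
area_abs = integrate(lambda y: y_pdf(y, True), -60.0, 0.0)
area_signed = integrate(lambda y: y_pdf(y, False), -60.0, 0.0)
print(area_abs, area_signed)
```

With the absolute value the total area comes out close to 1; without it the same expression integrates to about -1, so it could not be a density.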
 
