# I Deriving convolution

1. Sep 24, 2016

### rabbed

Hi

Can I derive the expression for Z_PDF(z) where:
Z = t(X,Y) = X + Y
By starting with:
Z_PDF(z)*|dz| = X_PDF(x)*|dx| * Y_PDF(y)*|dy|
Z_PDF(z) = X_PDF(x) * Y_PDF(y) * |dx|*|dy|/|dz|

and then substitute the deltas with derivatives and x and y with expressions of z?

Last edited: Sep 24, 2016
2. Sep 24, 2016

### Stephen Tashi

Are you asking whether you can "derive" in the sense of giving a proof? Or are you just asking whether you can get a result by doing some formal algebraic manipulations ?

I see no reason why that would be true - even using "infinitesimal reasoning" upon "dx,dy,dz".

Inutuitively, the value of Z_PDF(z) dz is the probability that Z is in the interval (z - dz, z + dz). For that to happen, the probabilities of X and Y can't be chosen arbitrarily. You have incorporate some relationship between x and y and the value of z.

I think the basic idea is that to approximate Z_PDF(z) dz you must integrate the joint PDF of x and y over the region where x + y is in (z - dz, z + dz). (If you are assuming X and Y are independent, you need to say so.)

3. Sep 25, 2016

### rabbed

The latter.

Ok, I have to do the substitution of x and y before I state the equality of probabilities (otherwise it's not true), and since X and Y are just any numbers picked at random (either does not affect the probability of the other), they are independent giving P(X=z-y AND Y=z-x) = P(X=z-y) * P(Y=z-x | X=z-y) = P(X=z-y) * P(Y=z-x), so:
Z_PDF(z)*|dz| = X_PDF(z-y)*|dx| * Y_PDF(z-x)*|dy|

Is this first step OK?

4. Sep 25, 2016

### Stephen Tashi

No. For example, suppose x = 100 and y = 1. You're claiming the value of Z_PDF(101) is approximately the probability that X is near 100 and Y is near 1. But Z_PDF(101) must also account for other possibilities. For example X might be near 50 and Y might be near 51.

5. Sep 25, 2016

### rabbed

I was just about to change my answer to:

Z_PDF(z)*|dz| = X_PDF(x)*|dx| * Y_PDF(z-x)*|dy|

Better? :)

6. Sep 25, 2016

### Stephen Tashi

Suppose z = 101 and x = 1. You're saying the probability that Z is near 101 is approximately the probability that X is near 1 and Y is near 100. But the probability that Z is near 101 must also include other possibilities - for example that X is near 50 and Y is near 51.

7. Sep 25, 2016

### rabbed

But if Z=101,
I can set x = 1 and get y = 101-1 = 100
Z_PDF(101)*|dz| = X_PDF(1)*|dx| * Y_PDF(101-1)*|dy|

Or I can set x = 50 and get y = 101-50 = 51
Z_PDF(101)*|dz| = X_PDF(50)*|dx| * Y_PDF(101-50)*|dy|

8. Sep 25, 2016

### Stephen Tashi

You can only do that if you assume the equation you are using is correct. You might get two different answers for Z_PDF(101) if you do those two different computations.

9. Sep 25, 2016

### rabbed

But am I not "setting" the unknown Z_PDF(101) to be the same no matter if x=1 and y=100 or x=50 and y=51?
They will become the same because of the different deltas?

10. Sep 25, 2016

### rabbed

Maybe I'm lost..
But it would be good to have a 'derivation' starting from the probability approximations using the PDF's. And it should be possible because that's what's actually being used to get the destination RV's PDF? Any advice?

11. Sep 25, 2016

### Stephen Tashi

You are going to have to do something involving a summation. The value of Z_PDF( z) doesn't depend only on one particular pair of values for X and Y, so Z_PDF(z) is not going to be a function of X_PDF(x) and Y_PDF(y) for one particular pair of values x,y. The value of Z_PDF(z) depends on the values of X_PDF and Y_PDF at all possible combinations of x,y that sum to z. You have to sum over all those possible combinations.

Most dx-dy-dz arguments explicitly or implicitly express a differential equation. I don't know whether there is a dx-dy-dz way to derive the convolution formula. My suggestion is to start with the answer and try to express it as a differential or partial differential equation. Try taking the formula for the cumulative distribution of the convolution of X and Y and differentiate both sides of it. (You might have to use Leibnitz's formula for differentiating an integral where variables appear in the limits of integration. See the "Formal statement" section of https://en.wikipedia.org/wiki/Leibniz_integral_rule )

12. Sep 26, 2016

### rabbed

So since the transformation function is describing a plane, The PDF formula of the destination RV needs to account for the probability of each point on that plane (or rather the infinite line of constant z)?
Like for example a n-to-1 transformation function, where the PDF formula of the destination RV needs to account for the probability of n source RV values for each destination RV value?
(The first case having two (dimensional) source RV's and the second case having one source RV)

Last edited: Sep 26, 2016
13. Sep 26, 2016

### rabbed

Z_PDF(z)*|dz| = integral wrt x from -inf to inf of X_PDF(x)*|dx| * Y_PDF(z-x)*|dy|

Looks better?

14. Sep 26, 2016

### Stephen Tashi

Yes.

Yes, it's a similar situation. The convolution is a special case of a many-to-one transformation. The transformation Z = X + Y maps many points (x,y) to one point z.

The convolution of independent random variables is a even more special case where the joint PDF of X and Y is the product of their individual PDFs.

15. Sep 26, 2016

### Stephen Tashi

It looks better, but your integration isn't defined. You say "wrt x" but you also have a "dy" in the integrand.

In general, if you were asked to integrate a function f(x,) of two variables over the line x + y = 2, how would you write this integration ?

16. Sep 26, 2016

### rabbed

I was thinking I could divide both sides by |dz| and change |dy/dz| into it's derivative.. If y=z-x, then dy/dz = 1?

It was a long time since I did much integrating, but maybe you mean a double integral?
Or setting the integration limits according to that relation..

Last edited: Sep 26, 2016
17. Sep 26, 2016

### Stephen Tashi

You can divide both sides of an equation by something once you have an equation. But what you wrote isn't an actual equation because the integration on the right hand side isn't completely defined.

My use of the terminology "integrating over" was not good. If we integrate g(x,y) over an area, then we use a double integral and, yes, one of the limits would involve a variable - since given a value of x, we can vary y and still have (x,y) be inside an area. However, we try to integrate g(x,y) "over a line" by using a double integral, we get zero since a line has no area. If we set a value of x, we have no choice about the value of y.

What I mean to say is that we want the integral $h(z) = \int_{-\infty}^{\infty} g(x,z-x) dx$, which doesn't involve any "dy".

18. Sep 27, 2016

### rabbed

In programming terms I see the integral sign as a summing for-loop:
for (v=a, result=0; v<b; v+=dv) result += f(parameters)
where you can choose the limits a and b and which of the parameters that should be substituted by the value of v in some expression f.
(Same with sigma, where the only difference being the increment 1 instead of the infinitesimal dv)
Although, we can only do integration if the expression contains dx (in the case where the parameter x was chosen).

Shouldn't the right hand side also mathematically just be a sum in
Z_PDF(z)*|dz| = integral wrt x from -inf to inf of X_PDF(x)*|dx| * Y_PDF(z-x)*|dy|
and it shouldn't matter if some factor is infinitesimal.

How does this extend to n source RV's? Would I let n-1 variables not be substituted in terms of the other variables and be integrated, while one of the source variables is expressed in terms of the other?

For example:
Q = X + Y + Z + W
Q_PDF(q)*|dq| = integral wrt x from -inf to inf of integral wrt y from -inf to inf of integral wrt z from -inf to inf of X_PDF(x)*|dx| * Y_PDF(y)*|dy| * Z_PDF(z)*|dz| * W_PDF(q-x-y-z)*|dw|

Last edited: Sep 27, 2016
19. Sep 27, 2016

### Stephen Tashi

If we are approximating the integral of a function "f", the code would be:
result += f(v)*dv

The sum we are approximating doesn't involve any factor of dy. The function being integrated is: f(x) = X_PDF(x)*Y_PDF(z-x) where z is a constant since we do this approximation when we are given a specific value of z in order to find Z_PDF(z).

If that were true then for X,Y independent and Z = X + Y we would have the convolution formula
$Z\_PDF(z) = \int_{-\infty}^{\infty} \ ( \int_{-\infty}^{\infty} X\_PFF(x) Y\_PDF(z-y) dy)\ dx$
$= \int_{-\infty}^{\infty} \ ( X\_PDF(x) \int_{-\infty}^{\infty} Y\_PDF(z-y) dy)\ dx$
$= \int_{-\infty}^{\infty} X\_PDF(x) (1) dx$
$= 1$

20. Sep 27, 2016

### rabbed

But if I keep |dz| on the left side, the products on the right hand side will have a factor |dy| in their sum? And afterwards I divide both sides with |dz| to get rid of the |dy|.

If I have n independent source RV's and n=2 (Z=X+Y), I assign an integral sign for n-1 of them, for example x, and the remaining one, y, I substitute with the expression z-x.
So I will always have n-1 integral signs and 1 change of variable. Will this not work for n>2?