Derive Convolution Expression for Z_PDF(z)

AI Thread Summary
The discussion centers on deriving the expression for the probability density function (PDF) of the sum of two independent random variables, Z = X + Y. Initial attempts to express Z_PDF(z) using the joint PDFs of X and Y are critiqued for lacking proper justification, particularly regarding the independence of X and Y and the need to account for all combinations of values that sum to z. The conversation emphasizes that the correct approach involves integrating over all possible pairs (x, y) such that x + y = z, leading to the convolution formula. Participants suggest that a rigorous derivation should start from the cumulative distribution function and differentiate it, rather than relying on informal algebraic manipulations. The importance of defining integration limits and relationships between variables is highlighted as crucial for accurate derivation.
rabbed
Hi

Can I derive the expression for Z_PDF(z) where:
Z = t(X,Y) = X + Y
By starting with:
Z_PDF(z)*|dz| = X_PDF(x)*|dx| * Y_PDF(y)*|dy|
Z_PDF(z) = X_PDF(x) * Y_PDF(y) * |dx|*|dy|/|dz|

and then substitute the deltas with derivatives and x and y with expressions of z?
 
rabbed said:
Hi

Can I derive the expression for Z_PDF(z) where:
Z = t(X,Y) = X + Y

Are you asking whether you can "derive" in the sense of giving a proof? Or are you just asking whether you can get a result by doing some formal algebraic manipulations ?

By starting with:
Z_PDF(z)*|dz| = X_PDF(x)*|dx| * Y_PDF(y)*|dy|

I see no reason why that would be true - even using "infinitesimal reasoning" upon "dx,dy,dz".

Intuitively, the value of Z_PDF(z) dz is the probability that Z is in the interval (z - dz, z + dz). For that to happen, the probabilities of X and Y can't be chosen arbitrarily. You have to incorporate some relationship between x and y and the value of z.

I think the basic idea is that to approximate Z_PDF(z) dz you must integrate the joint PDF of x and y over the region where x + y is in (z - dz, z + dz). (If you are assuming X and Y are independent, you need to say so.)
 
Stephen Tashi said:
Are you asking whether you can "derive" in the sense of giving a proof? Or are you just asking whether you can get a result by doing some formal algebraic manipulations ?
The latter.

Ok, I have to do the substitution of x and y before I state the equality of probabilities (otherwise it's not true), and since X and Y are just any numbers picked at random (either does not affect the probability of the other), they are independent giving P(X=z-y AND Y=z-x) = P(X=z-y) * P(Y=z-x | X=z-y) = P(X=z-y) * P(Y=z-x), so:
Z_PDF(z)*|dz| = X_PDF(z-y)*|dx| * Y_PDF(z-x)*|dy|

Is this first step OK?
 
rabbed said:
so:
Z_PDF(z)*|dz| = X_PDF(z-y)*|dx| * Y_PDF(z-x)*|dy|

Is this first step OK?

No. For example, suppose x = 100 and y = 1. You're claiming the value of Z_PDF(101) is approximately the probability that X is near 100 and Y is near 1. But Z_PDF(101) must also account for other possibilities. For example X might be near 50 and Y might be near 51.
 
I was just about to change my answer to:

Z_PDF(z)*|dz| = X_PDF(x)*|dx| * Y_PDF(z-x)*|dy|

Better? :)
 
rabbed said:
Z_PDF(z)*|dz| = X_PDF(x)*|dx| * Y_PDF(z-x)*|dy|

Better? :)

Suppose z = 101 and x = 1. You're saying the probability that Z is near 101 is approximately the probability that X is near 1 and Y is near 100. But the probability that Z is near 101 must also include other possibilities - for example that X is near 50 and Y is near 51.
 
But if Z=101,
I can set x = 1 and get y = 101-1 = 100
Z_PDF(101)*|dz| = X_PDF(1)*|dx| * Y_PDF(101-1)*|dy|

Or I can set x = 50 and get y = 101-50 = 51
Z_PDF(101)*|dz| = X_PDF(50)*|dx| * Y_PDF(101-50)*|dy|
 
rabbed said:
But if Z=101,
I can set x = 1 and get y = 101-1 = 100
Z_PDF(101)*|dz| = X_PDF(1)*|dx| * Y_PDF(101-1)*|dy|

Or I can set x = 50 and get y = 101-50 = 51
Z_PDF(101)*|dz| = X_PDF(50)*|dx| * Y_PDF(101-50)*|dy|

You can only do that if you assume the equation you are using is correct. You might get two different answers for Z_PDF(101) if you do those two different computations.
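
A quick numerical check makes this objection concrete. The sketch below assumes, purely for illustration, that X and Y are independent normal variables with mean 50 and standard deviation 20 (the thread does not specify any distribution); `normal_pdf` is a helper introduced here.

```python
import math

def normal_pdf(v, mean, sd):
    """Density of a normal distribution (an assumed example distribution)."""
    return math.exp(-0.5 * ((v - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

MEAN, SD = 50.0, 20.0  # hypothetical parameters, not taken from the thread
z = 101.0

# Two different (x, y) pairs that both satisfy x + y = z.
pair_a = normal_pdf(1.0, MEAN, SD) * normal_pdf(z - 1.0, MEAN, SD)
pair_b = normal_pdf(50.0, MEAN, SD) * normal_pdf(z - 50.0, MEAN, SD)

# The two products differ by orders of magnitude, so neither one alone
# can be the value of Z_PDF(101).
print(pair_a, pair_b)
```

Under these assumptions the two products are far apart, so a single pair (x, y) cannot determine Z_PDF(z) by itself.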
 
But am I not "setting" the unknown Z_PDF(101) to be the same no matter if x=1 and y=100 or x=50 and y=51?
They will become the same because of the different deltas?
 
  • #10
Maybe I'm lost..
But it would be good to have a 'derivation' starting from the probability approximations using the PDF's. And it should be possible because that's what's actually being used to get the destination RV's PDF? Any advice?
 
  • #11
rabbed said:
Maybe I'm lost..
But it would be good to have a 'derivation' starting from the probability approximations using the PDF's. And it should be possible because that's what's actually being used to get the destination RV's PDF? Any advice?

You are going to have to do something involving a summation. The value of Z_PDF( z) doesn't depend only on one particular pair of values for X and Y, so Z_PDF(z) is not going to be a function of X_PDF(x) and Y_PDF(y) for one particular pair of values x,y. The value of Z_PDF(z) depends on the values of X_PDF and Y_PDF at all possible combinations of x,y that sum to z. You have to sum over all those possible combinations.

Most dx-dy-dz arguments explicitly or implicitly express a differential equation. I don't know whether there is a dx-dy-dz way to derive the convolution formula. My suggestion is to start with the answer and try to express it as a differential or partial differential equation. Try taking the formula for the cumulative distribution of the convolution of X and Y and differentiate both sides of it. (You might have to use Leibniz's formula for differentiating an integral where variables appear in the limits of integration. See the "Formal statement" section of https://en.wikipedia.org/wiki/Leibniz_integral_rule )
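
A sketch of how that CDF route might go, assuming X and Y are independent with densities f_X and f_Y, writing F_Y for the CDF of Y (notation introduced here, not used elsewhere in the thread), and assuming differentiation under the integral sign is permitted:

```latex
\begin{align*}
F_Z(z) &= P(X + Y \le z)
        = \int_{-\infty}^{\infty} \int_{-\infty}^{z-x} f_X(x)\, f_Y(y)\, dy\, dx
        = \int_{-\infty}^{\infty} f_X(x)\, F_Y(z - x)\, dx, \\
f_Z(z) &= \frac{d}{dz} F_Z(z)
        = \int_{-\infty}^{\infty} f_X(x)\, \frac{\partial}{\partial z} F_Y(z - x)\, dx
        = \int_{-\infty}^{\infty} f_X(x)\, f_Y(z - x)\, dx .
\end{align*}
```

The variable z appears only in the upper limit of the inner integral, which is where the Leibniz rule does its work.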
 
  • #12
So since the transformation function is describing a plane, the PDF formula of the destination RV needs to account for the probability of each point on that plane (or rather the infinite line of constant z)?
Like for example an n-to-1 transformation function, where the PDF formula of the destination RV needs to account for the probability of n source RV values for each destination RV value?
(The first case having two (dimensional) source RV's and the second case having one source RV)
 
  • #13
Z_PDF(z)*|dz| = integral wrt x from -inf to inf of X_PDF(x)*|dx| * Y_PDF(z-x)*|dy|

Looks better?
 
  • #14
rabbed said:
So since the transformation function is describing a plane, the PDF formula of the destination RV needs to account for the probability of each point on that plane (or rather the infinite line of constant z)?
Yes.

Like for example an n-to-1 transformation function, where the PDF formula of the destination RV needs to account for the probability of n source RV values for each destination RV value?
Yes, it's a similar situation. The convolution is a special case of a many-to-one transformation. The transformation Z = X + Y maps many points (x,y) to one point z.

The convolution of independent random variables is an even more special case where the joint PDF of X and Y is the product of their individual PDFs.
 
  • #15
rabbed said:
Z_PDF(z)*|dz| = integral wrt x from -inf to inf of X_PDF(x)*|dx| * Y_PDF(z-x)*|dy|

Looks better?

It looks better, but your integration isn't defined. You say "wrt x" but you also have a "dy" in the integrand.

In general, if you were asked to integrate a function f(x,y) of two variables over the line x + y = 2, how would you write this integration?
 
  • #16
I was thinking I could divide both sides by |dz| and change |dy/dz| into its derivative.. If y=z-x, then dy/dz = 1?

It was a long time since I did much integrating, but maybe you mean a double integral?
Or setting the integration limits according to that relation..
 
  • #17
rabbed said:
I was thinking I could divide both sides by |dz| and change |dy/dz| into its derivative.. If y=z-x, then dy/dz = 1?

You can divide both sides of an equation by something once you have an equation. But what you wrote isn't an actual equation because the integration on the right hand side isn't completely defined.
It was a long time since I did much integrating, but maybe you mean a double integral?
Or setting the integration limits according to that relation..

My use of the terminology "integrating over" was not good. If we integrate g(x,y) over an area, then we use a double integral and, yes, one of the limits would involve a variable, since given a value of x, we can vary y and still have (x,y) be inside an area. However, if we try to integrate g(x,y) "over a line" by using a double integral, we get zero since a line has no area. If we set a value of x, we have no choice about the value of y.

What I mean to say is that we want the integral ## h(z) = \int_{-\infty}^{\infty} g(x,z-x) dx ##, which doesn't involve any "dy".
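
As an illustration of that single integral, here is a minimal numerical sketch. It assumes X and Y are independent standard normal variables, so that g(x,y) is the product of two standard normal densities and the sum is known to be N(0, 2); the function names and the integration range are choices made for this example only.

```python
import math

def phi(v):
    """Standard normal density (assumed example marginal)."""
    return math.exp(-0.5 * v * v) / math.sqrt(2 * math.pi)

def g(x, y):
    """Joint density of two independent standard normals (an assumption)."""
    return phi(x) * phi(y)

def h(z, lo=-10.0, hi=10.0, n=20000):
    """Midpoint-rule approximation of the integral of g(x, z - x) over x."""
    dx = (hi - lo) / n
    total = 0.0
    for k in range(n):
        x = lo + (k + 0.5) * dx
        total += g(x, z - x)  # only x varies; y is forced to be z - x
    return total * dx

def exact(z):
    """Density of N(0, 2), the known distribution of the sum in this example."""
    return math.exp(-z * z / 4) / math.sqrt(4 * math.pi)

print(h(1.0), exact(1.0))
```

Note that the loop has a single step size dx and no dy anywhere, which mirrors the formula.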
 
  • #18
In programming terms I see the integral sign as a summing for-loop:
for (v=a, result=0; v<b; v+=dv) result += f(parameters)
where you can choose the limits a and b and which of the parameters that should be substituted by the value of v in some expression f.
(Same with sigma, where the only difference is the increment 1 instead of the infinitesimal dv)
Although, we can only do integration if the expression contains dx (in the case where the parameter x was chosen).

Shouldn't the right hand side also mathematically just be a sum in
Z_PDF(z)*|dz| = integral wrt x from -inf to inf of X_PDF(x)*|dx| * Y_PDF(z-x)*|dy|
and it shouldn't matter if some factor is infinitesimal.

How does this extend to n source RV's? Would I let n-1 variables not be substituted in terms of the other variables and be integrated, while one of the source variables is expressed in terms of the other?

For example:
Q = X + Y + Z + W
Q_PDF(q)*|dq| = integral wrt x from -inf to inf of integral wrt y from -inf to inf of integral wrt z from -inf to inf of X_PDF(x)*|dx| * Y_PDF(y)*|dy| * Z_PDF(z)*|dz| * W_PDF(q-x-y-z)*|dw|
 
  • #19
rabbed said:
In programming terms I see the integral sign as a summing for-loop:
for (v=a, result=0; v<b; v+=dv) result += f(parameters)
If we are approximating the integral of a function "f", the code would be:
result += f(v)*dv
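
Written out as a runnable function, the corrected loop might look like the sketch below. The name `riemann_sum` and the test integrand are illustrative choices, and the step counter replaces `v += dv` only to avoid accumulating floating-point drift.

```python
def riemann_sum(f, a, b, dv):
    """Left Riemann sum: add f(v) * dv for v = a, a + dv, ... below b."""
    n = int(round((b - a) / dv))  # number of steps
    result = 0.0
    for k in range(n):
        v = a + k * dv
        result += f(v) * dv       # each term carries exactly one factor dv
    return result

# Example: the integral of v**2 from 0 to 1 is 1/3.
approx = riemann_sum(lambda v: v * v, 0.0, 1.0, 1e-4)
print(approx)
```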

Shouldn't the right hand side also mathematically just be a sum in
Z_PDF(z)*|dz| = integral wrt x from -inf to inf of X_PDF(x)*|dx| * Y_PDF(z-x)*|dy|
and it shouldn't matter if some factor is infinitesimal.

The sum we are approximating doesn't involve any factor of dy. The function being integrated is: f(x) = X_PDF(x)*Y_PDF(z-x) where z is a constant since we do this approximation when we are given a specific value of z in order to find Z_PDF(z).
For example:
Q = X + Y + Z + W
Q_PDF(q)*|dq| = integral wrt x from -inf to inf of integral wrt y from -inf to inf of integral wrt z from -inf to inf of X_PDF(x)*|dx| * Y_PDF(y)*|dy| * Z_PDF(z)*|dz| * W_PDF(q-x-y-z)*|dw|

If that were true then for X,Y independent and Z = X + Y we would have the convolution formula
## Z\_PDF(z) = \int_{-\infty}^{\infty} \ ( \int_{-\infty}^{\infty} X\_PDF(x) Y\_PDF(z-y) dy)\ dx ##
##= \int_{-\infty}^{\infty} \ ( X\_PDF(x) \int_{-\infty}^{\infty} Y\_PDF(z-y) dy)\ dx ##
## = \int_{-\infty}^{\infty} X\_PDF(x) (1) dx ##
## = 1##
 
  • #20
But if I keep |dz| on the left side, the products on the right hand side will have a factor |dy| in their sum? And afterwards I divide both sides with |dz| to get rid of the |dy|.

If I have n independent source RV's and n=2 (Z=X+Y), I assign an integral sign for n-1 of them, for example x, and the remaining one, y, I substitute with the expression z-x.
So I will always have n-1 integral signs and 1 change of variable. Will this not work for n>2?
 
  • #21
rabbed said:
But if I keep |dz| on the left side, the products on the right hand side will have a factor |dy| in their sum? And afterwards I divide both sides with |dz| to get rid of the |dy|.

If you are going to begin the derivation with the assertion ##Z_{pdf}(z) dz = \int_{-\infty}^{\infty} X_{pdf}(x) dx Y_{pdf}(y) dy ## then you need to justify that equation before proceeding. For example, is that equation correct even in a simple case such as when X and Y are each uniformly distributed on [0,1] ?
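
For anyone trying that test case, the sketch below computes the reference answer: the ordinary convolution integral for two independent uniform [0,1] variables, compared against the known triangular density of their sum. The function names and the grid size are assumptions made for this example.

```python
def uniform_pdf(v):
    """Density of a uniform distribution on [0, 1]."""
    return 1.0 if 0.0 <= v <= 1.0 else 0.0

def conv_pdf(z, n=100000):
    """Midpoint-rule approximation of the integral of X_PDF(x) * Y_PDF(z - x) dx."""
    dx = 1.0 / n  # X_PDF is zero outside [0, 1], so only that range is scanned
    total = 0.0
    for k in range(n):
        x = (k + 0.5) * dx
        total += uniform_pdf(x) * uniform_pdf(z - x)
    return total * dx

def triangle_pdf(z):
    """Known density of the sum of two independent uniform [0, 1] variables."""
    if 0.0 <= z <= 1.0:
        return z
    if 1.0 < z <= 2.0:
        return 2.0 - z
    return 0.0

for z in (0.25, 1.0, 1.5):
    print(z, conv_pdf(z), triangle_pdf(z))
```

Any proposed starting equation can be checked against these values before it is used as the first line of a derivation.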

A derivation beginning with the assertion ##Z_{pdf}(z) dz = \int_{-\infty}^{\infty} X_{pdf}(x) dx Y_{pdf}(y) dy ## is apparently claiming that an appropriate dy can be chosen to make the two sides of the equation equal, because there is no other stipulation on what dy represents.

However, by a similar technique, we can begin by assuming the equation ##1 dz = 2 dy##. Dividing both sides of that equation by dz doesn't prove 1 = 2. An equation like ##1 dz = 2 dy## only holds when there is particular relationship between dz and dy.

So if you attempt to derive the general convolution formula by assuming a special relationship between Y and Z, you must show that this relationship is not actually "special". It must always hold between Y and Z regardless of how X is distributed.
 
  • #22
I'm just trying (for my own sake) to explain the change of variables/PDF method and have done this by:
- first explain the PDF formula for one-to-one transformation functions
- then explain how to use that in the PDF formula for many-to-one transformation functions (using OR-logic with the probability approximations:
Y_PDF(y)*|dy| = X_PDF(t1^-1(y)) * dx + X_PDF(t2^-1(y)) * dx + ..
where t1^-1(y), t2^-1(y) .. are the different solutions of the inverse transformation function)
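
The many-to-one rule in the list above can be sketched for one concrete case. The example assumes Y = X**2 with X standard normal (a choice made here, not in the thread), so the two inverse branches are +sqrt(y) and -sqrt(y), each with |dx/dy| = 1/(2*sqrt(y)); the result is compared with the known chi-square density with one degree of freedom.

```python
import math

def phi(v):
    """Standard normal density (assumed distribution of X)."""
    return math.exp(-0.5 * v * v) / math.sqrt(2 * math.pi)

def y_pdf(y):
    """PDF of Y = X**2, summing the contribution of both inverse branches."""
    if y <= 0:
        return 0.0
    root = math.sqrt(y)
    jac = 1.0 / (2.0 * root)  # |dx/dy| is the same on both branches
    return phi(root) * jac + phi(-root) * jac

def chi2_1_pdf(y):
    """Known density of a chi-square variable with one degree of freedom."""
    return math.exp(-y / 2) / math.sqrt(2 * math.pi * y)

print(y_pdf(0.7), chi2_1_pdf(0.7))
```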

Now I want to learn how to treat the probabilities in the case of multidimensional transformation functions, starting with convolutions.
So as long as the reasoning makes sense probability-wise (using AND/OR logic and some change of variables), I'm happy :)

Before starting the change-of-variables explanation, I already explained how the AND/OR 'formulas' hold for probabilities, and that the probability of a specific outcome value for a continuous random variable can be approximated by multiplying its PDF by the infinitesimal outcome size. So if I can explain how to treat the probabilities for the separate transformation cases above in a way that makes sense intuitively and mathematically, it feels like proof enough for me.
 
  • #23
rabbed said:
I'm just trying (for my own sake) to explain the change of variables/PDF method and have done this by:
- first explain the PDF formula for one-to-one transformation functions
- then explain how to use that in the PDF formula for many-to-one transformation functions (using OR-logic with the probability approximations:
Y_PDF(y)*|dy| = X_PDF(t1^-1(y)) * dx + X_PDF(t2^-1(y)) * dx + ..
where t1^-1(y), t2^-1(y) .. are the different solutions of the inverse transformation function)

Now I want to learn how to treat the probabilities in the case of multidimensional transformation functions, starting with convolutions.
That's an interesting approach. I don't recall ever seeing convolution presented as special case of a more general theorem - a theorem for a general "function of several random variables".

So as long as the reasoning makes sense probability-wise (using AND/OR logic and some change of variables), I'm happy :)

Ok, but what you've presented so far doesn't make sense. If we are going to do a dx,dy,dz type of argument it has to make sense to a dx,dy,dz type of thinker.

If we accept Z_PDF(z) dz as an approximation for the probability of z - dz < Z < z + dz then there is some area in the X,Y plane that corresponds to this event. How do we integrate the joint density g(x,y) of X and Y over that area? The area isn't simply a rectangle of dimensions dx by dy with sides parallel to the respective coordinate axes.

For ##g(X,Y) = X + Y##, I think the area looks like two "infinite triangles" with a common vertex at (0,0). Every line that satisfies the equation x + y = k for z-dz < k < z+dz is within the area.

So I think the integration we need is ##\int_{-\infty}^{\infty} \int_{ymin}^{ymax} g(x,y) dy dx ## where
##ymin = MIN( z - dz - x, z + dz - x) ## and ##ymax = MAX(z - dz - x, z + dz - x)##.
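
A numerical sketch of that double integral, assuming X and Y are independent standard normals so that the inner integral over y can be done exactly with the normal CDF. Because the event z - dz < Z < z + dz has total width 2*dz in z, the sketch divides the strip probability by 2*dz before comparing it with the known N(0, 2) density; the helper names are illustrative.

```python
import math

def phi(v):
    """Standard normal density (assumed marginal for X)."""
    return math.exp(-0.5 * v * v) / math.sqrt(2 * math.pi)

def Phi(v):
    """Standard normal CDF (assumed marginal for Y)."""
    return 0.5 * (1.0 + math.erf(v / math.sqrt(2)))

def strip_probability(z, dz, lo=-10.0, hi=10.0, n=20000):
    """Integrate the joint density over the strip z - dz < x + y < z + dz."""
    dx = (hi - lo) / n
    total = 0.0
    for k in range(n):
        x = lo + (k + 0.5) * dx
        ymin = min(z - dz - x, z + dz - x)
        ymax = max(z - dz - x, z + dz - x)
        total += phi(x) * (Phi(ymax) - Phi(ymin)) * dx  # inner y-integral is exact
    return total

z, dz = 1.0, 1e-3
estimate = strip_probability(z, dz) / (2 * dz)
exact = math.exp(-z * z / 4) / math.sqrt(4 * math.pi)  # N(0, 2) density at z
print(estimate, exact)
```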
 
  • #24
Stephen Tashi said:
Ok, but what you've presented so far doesn't make sense. If we are going to do a dx,dy,dz type of argument it has to make sense to a dx,dy,dz type of thinker.

I think the case of Z = X + Y does make sense:
Z_PDF(z)*|dz| = integral wrt x from -inf to inf of X_PDF(x)*|dx| * Y_PDF(z-x)*|dy|
This would produce the right hand side sum:

X_PDF(-inf)*|dx| * Y_PDF(z-(-inf))*|dy| +
X_PDF(-inf+dx)*|dx| * Y_PDF(z-(-inf+dx))*|dy| +
X_PDF(-inf+2*dx)*|dx| * Y_PDF(z-(-inf+2*dx))*|dy| +
...
X_PDF(inf)*|dx| * Y_PDF(z-(inf))*|dy|

We get the sum of the probabilities of each infinitesimal point on that infinite line where z is some constant c and the result is Z_PDF(c)*|dz| (the probability that any point OR the others on that line will be the outcome).

Wouldn't the same reasoning work for another dimension of source RV, giving us a plane of points we need to sum the probabilities for? Maybe we don't need to look at the area/volume etc. of the shape, just the probabilities of the points? So we only use the integral signs to produce the coordinates of those points and make sure that the change of the last variable (y = z-x in the above 2D case) keeps the coordinates on that line/plane etc.

2D: z = x + y => (x, z-x) are the points on the 1D line where z is constant
3D: w = x + y + z => (x, y, w-x-y) are points on the 2D plane where w is constant
4D: q = x + y + z + w => (x, y, z, q-x-y-z) are points in the 3D volume where q is constant

Hm, is that correct?

Then maybe the area/volume element comes in automatically when dividing the right hand side substituted variable's delta by the left hand side delta. But yes, it feels too speculative, so maybe we need a more analytic solution like the one you propose.
 
  • #25
rabbed said:
I think the case of Z = X + Y does make sense:
Z_PDF(z)*|dz| = integral wrt x from -inf to inf of X_PDF(x)*|dx| * Y_PDF(z-x)*|dy|

This would produce the right hand side sum:

X_PDF(-inf)*|dx| * Y_PDF(z-(-inf))*|dy| +
X_PDF(-inf+dx)*|dx| * Y_PDF(z-(-inf+dx))*|dy| +
X_PDF(-inf+2*dx)*|dx| * Y_PDF(z-(-inf+2*dx))*|dy| +
...
X_PDF(inf)*|dx| * Y_PDF(z-(inf))*|dy|

If you think that makes sense as an approximation, try coding it for a specific example and see if you can approximate Z_PDF(z) that way for various values of z.
We get the sum of the probabilities of each infinitesimal point on that infinite line where z is some constant c and the result is Z_PDF(c)*|dz| (the probability that any point OR the others on that line will be the outcome).

The problem I see is that you are integrating the joint density over a line of uniform thickness, but the bundle of lines that define the probability that z - dz < Z < z + dz is not of uniform thickness.
 
  • #26
Stephen Tashi said:
If we accept Z_PDF(z) dz as an approximation for the probability of z - dz < Z < z + dz then there is some area in the X,Y plane that corresponds to this event. How do we integrate the joint density g(x,y) of X and Y over that area? The area isn't simply a rectangle of dimensions dx by dy with sides parallel to the respective coordinate axes.

Isn't Z_PDF(z)*|dz| the approximation of the probability of Z being in the interval between z and z + |dz|?

Stephen Tashi said:
For ##g(X,Y) = X + Y##, I think the area looks like two "infinite triangles" with a common vertex at (0,0). Every line that satisfies the equation x + y = k for z-dz < k < z+dz is within the area.

So I think the integration we need is ##\int_{-\infty}^{\infty} \int_{ymin}^{ymax} g(x,y) dy dx ## where
##ymin = MIN( z - dz - x, z + dz - x) ## and ##ymax = MAX(z - dz - x, z + dz - x)##.

Do you have a reference or an illustrative picture? I have difficulty seeing why the area of the line of points would be triangular.

Also, it's hard to find any references showing the convolution formulas for 2+ dimensions without the convolution operator. I want the integral signs :)
Do you know?
 
  • #27
rabbed said:
Isn't Z_PDF(z)*|dz| the approximation of the probability of Z being in the interval between z and z + |dz|?

Yes. But the best approximation of that probability, ##Pr( z < Z < z + |dz|)## is 0.0. The value we are trying to approximate is ##Z_{PDF}(z)##, which is not equal to ##\lim_{|dz| \rightarrow 0} Pr(z < Z < z + |dz|) = 0##.

We need to write an equation that approximates ##Z_{PDF}(z)##. If we try to do that by dividing both sides of your equation by ##|dz|##, we get something like ##\sum_{k=-\infty}^\infty X_{PDF}( k \ dx) Y_{PDF}(z - k\ dx) \frac{dx dy}{dz}##.

If someone asks you to code up this approximation, how are you going to pick the "small" values of dx,dy,dz? Are you going to make them small and equal? Or are you going to set some constant ratio between them? I don't think the limit exists because its value depends on how you let ##dx\rightarrow 0, dy \rightarrow 0, dz\rightarrow 0##.
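
Coding it up shows the problem directly. The sketch below assumes X and Y are independent standard normals (so the true Z_PDF is the N(0, 2) density) and evaluates the sum with two different choices of dy for the same dx and dz; the function name `candidate` and the truncation range are illustrative.

```python
import math

def phi(v):
    """Standard normal density (assumed marginal for both X and Y)."""
    return math.exp(-0.5 * v * v) / math.sqrt(2 * math.pi)

def candidate(z, dx, dy, dz, span=10.0):
    """Sum over k of X_PDF(k dx) * Y_PDF(z - k dx) * dx * dy / dz."""
    n = int(span / dx)
    total = sum(phi(k * dx) * phi(z - k * dx) for k in range(-n, n + 1))
    return total * dx * dy / dz

z = 1.0
exact = math.exp(-z * z / 4) / math.sqrt(4 * math.pi)  # N(0, 2) density at z

a = candidate(z, 1e-3, 1e-3, 1e-3)  # dy equal to dz
b = candidate(z, 1e-3, 2e-3, 1e-3)  # dy twice dz, everything else unchanged
print(a, b, exact)
```

With these assumptions the result scales with the ratio dy/dz, so the sum only reproduces the density when dy is tied to dz in a specific way.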

Do you have a reference or an illustrative picture? I have difficulty seeing why the area of the line of points would be triangular.

That's because I'm wrong about the triangular shape !

I'm trying to write the expression for integrating a function over the area of an (infinitely long) thickened line. That area is bounded by two parallel lines.

The event ##(z < Z < z + |dz|)## in the XY-plane is the area bounded by the two lines ##x + y = z## and ##x+y = z + |dz|##. The value of Z_PDF(z) is approximated by the probability of that event, divided by ##dz##.
Also, it's hard to find any references showing the convolution formulas for 2+ dimensions without the convolution operator. I want the integral signs :)
Do you know?
I don't know any references. ##W = X + Y + Z ## with ##X,Y,Z## independent. ## W_{PDF}(w) =\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} X_{PDF}(x) Y_{PDF}(y) Z_{PDF}( w - x - y) dy dx ##
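
The three-variable formula can be checked numerically in a simple case. The sketch assumes X, Y and Z are independent uniform [0,1] variables, for which the density of the sum is the known piecewise-quadratic Irwin-Hall density with n = 3; the function names and grid size are choices made for this example.

```python
def uniform_pdf(v):
    """Density of a uniform distribution on [0, 1]."""
    return 1.0 if 0.0 <= v <= 1.0 else 0.0

def w_pdf(w, n=400):
    """Midpoint-rule double sum for the integral of X_PDF(x) Y_PDF(y) Z_PDF(w - x - y) dy dx."""
    d = 1.0 / n  # X_PDF and Y_PDF vanish outside [0, 1]
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * d
        for j in range(n):
            y = (j + 0.5) * d
            total += uniform_pdf(x) * uniform_pdf(y) * uniform_pdf(w - x - y)
    return total * d * d

def irwin_hall_3(w):
    """Known density of the sum of three independent uniform [0, 1] variables."""
    if 0.0 <= w <= 1.0:
        return w * w / 2
    if 1.0 < w <= 2.0:
        return (-2 * w * w + 6 * w - 3) / 2
    if 2.0 < w <= 3.0:
        return (3 - w) ** 2 / 2
    return 0.0

print(w_pdf(1.5), irwin_hall_3(1.5))
```

Two integral signs and one substituted variable are enough here, which matches the pattern of n-1 integrals for n summands.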
 
  • #28
Yes, now it feels like we're getting somewhere..
I'll have to check some more after work.
Just an idea that popped up, maybe we need to transform into a different space so that the line is integrated along an axis of that space?
 
  • #29
I'm looking at something called a "line integral". Just need to understand it first :)
Do you think it looks like the right way to go?
 
  • #30
A line integral is just evaluating the sum of projections of one function relative to another.

Inner products look at relative measure between two vectors and the integral sums up these components instead of say summing up lengths or areas.

In terms of convolution, you should probably look at proving the discrete convolution formula first and then generalize it to the continuous domain by taking limits.

The proof of the discrete domain will involve some clever grouping.
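
The discrete case is easy to sketch in code: every pair (x, y) contributes its probability to the bucket x + y, which is the grouping step. The example below assumes two independent fair six-sided dice and uses exact fractions; `discrete_convolution` is a name introduced here.

```python
from collections import defaultdict
from fractions import Fraction

def discrete_convolution(pmf_x, pmf_y):
    """PMF of X + Y for independent X and Y, grouping pairs by their sum."""
    pmf_z = defaultdict(Fraction)
    for x, px in pmf_x.items():
        for y, py in pmf_y.items():
            pmf_z[x + y] += px * py  # P(X = x AND Y = y) = P(X = x) * P(Y = y)
    return dict(pmf_z)

die = {face: Fraction(1, 6) for face in range(1, 7)}
two_dice = discrete_convolution(die, die)
print(two_dice[7])
```

Regrouping the double loop by the value of the sum gives P(Z = z) as the sum over x of P(X = x) * P(Y = z - x), the discrete counterpart of the integral formula.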
 
  • #31
rabbed said:
I'm looking at something called a "line integral". Just need to understand it first :)
Do you think it looks like the right way to go?

The problem with that idea is that the joint density f(x,y) is a density per unit area and when you do integral of a scalar function f(x,y) over a curve, the interpretation of f(x,y) needs to be density per unit length.

That leads to the question: Given a density f(x,y) per unit area, can we use it to define a density per unit length on some lines?
 
  • #32
Stephen Tashi said:
The problem with that idea is that the joint density f(x,y) is a density per unit area and when you do integral of a scalar function f(x,y) over a curve, the interpretation of f(x,y) needs to be density per unit length.

That leads to the question: Given a density f(x,y) per unit area, can we use it to define a density per unit length on some lines?

What if the densities are not joined, even though they can be?
 
  • #33
rabbed said:
What if the densities are not joined, even though they can be?

I don't know what you mean by that.
 
  • #34
X_PDF(x) and Y_PDF(y) both give density per unit length, right?
So we have the information..

But let me get it clear what we're doing.
Let's say we want the line on the plane where z = c and we call it L = (x(t), y(t)) = (t, c-t)

x'(t) = 1
y'(t) = -1

We want to integrate X_PDF(x) * Y_PDF(y) along the line L..

Z_PDF(c)*|dz| = integral wrt t from -inf to inf of X_PDF(x(t)) * Y_PDF(y(t)) * sqrt( x'(t)^2 + y'(t)^2 ) * dt

Ok?

We're integrating along the hypotenuse and we need the catheters?
 
  • #35
rabbed said:
X_PDF(x) and Y_PDF(y) both give density per unit length, right?

Yes
So we have the information..

But let me get it clear what we're doing.
Let's say we want the line on the plane where z = c and we call it L = (x(t), y(t)) = (t, c-t)

x'(t) = 1
y'(t) = -1

We want to integrate X_PDF(x) * Y_PDF(y) along the line L..

Z_PDF(c)*|dz| = integral wrt t from -inf to inf of X_PDF(x(t)) * Y_PDF(y(t)) * sqrt( x'(t)^2 + y'(t)^2 ) * dt

Ok?

No. Why do you have a "dz" on the left hand side of the equation?

On the right hand side, we get the intriguing result ##\int_{-\infty}^{\infty} X_{PDF}(t) Y_{PDF}(c-t) \sqrt{2} dt ##

We're integrating along the hypotenuse and we need the catheters?

"catheters"? Do you mean "sides" ?
 
  • #36
Stephen Tashi said:
No. Why do you have a "dz" on the left hand side of the equation?
I thought we were approximating the probability, so that we can then divide by |dz| to get Z_PDF(c)

Stephen Tashi said:
"catheters"? Do you mean "sides" ?
yep.. https://en.wikipedia.org/wiki/Cathetus
"catheters" in plural form from swedish according to google translate :)
 
  • #37
rabbed said:
I thought we were approximating the probability, so that we can then divide by |dz| to get Z_PDF(c)

Yes, but there is nothing on the right hand side of the equation that depends upon "dz". If you divide the right hand side of the equation by "dz" and take the limit of the right hand side as ##dz \rightarrow 0##, you get an infinite result.

Since c is some constant, the integral on the right hand side of the equation (if the integral exists) is some constant.

yep.. https://en.wikipedia.org/wiki/Cathetus
"catheters" in plural form from swedish according to google translate :)
Is your idea that we can fix the factor of ##\sqrt{2}## by some argument about "dz" being the hypotenuse of an infinitesimal triangle whose sides are "dx" and "dx" ? That's an interesting idea, but I don't see how to formulate it as plausible reasoning.
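
One possible way to make this plausible, offered as a sketch rather than something established in the thread: the distance between the parallel lines x + y = z and x + y = z + dz, measured perpendicular to them, is dz/√2. If the probability in that thin strip is approximated by the arc-length integral of the joint density along the line times the perpendicular width, the factor √2 cancels (X and Y are assumed independent):

```latex
\begin{align*}
P(z < Z < z + dz)
  &\approx \left( \int_{-\infty}^{\infty} X_{PDF}(t)\, Y_{PDF}(z - t)\, \sqrt{2}\, dt \right) \frac{dz}{\sqrt{2}} \\
  &= \left( \int_{-\infty}^{\infty} X_{PDF}(t)\, Y_{PDF}(z - t)\, dt \right) dz .
\end{align*}
```

Dividing by dz then leaves the usual convolution integral for Z_PDF(z).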
 
  • #38
Stephen Tashi said:
Yes, but there is nothing on the right hand side of the equation that depends upon "dz". If you divide the right hand side of the equation by "dz" and take the limit of the right hand side as ##dz \rightarrow 0##, you get an infinite result.

Since c is some constant, the integral on the right hand side of the equation (if the integral exists) is some constant.

Best would be to use z instead of c all the way, but I wanted to clarify.

Stephen Tashi said:
Is your idea that we can fix the factor of ##\sqrt{2}## by some argument about "dz" being the hypotenuse of an infinitesimal triangle whose sides are "dx" and "dx" ? That's an interesting idea, but I don't see how to formulate it as plausible reasoning.

Something like that. Can we use the "aim-vector" of L for that? so that the rate of growth of t has dx and dy in it
 
  • #39
L = (x(t), y(t)) = (t*|dy|/sqrt(2), z-t*|dy|/sqrt(2))

x'(t) = |dy|/sqrt(2)
y'(t) = -|dy|/sqrt(2)

Z_PDF(z)*|dz| = integral wrt t from -inf to inf of X_PDF(x(t)) * Y_PDF(y(t)) * sqrt( x'(t)^2 + y'(t)^2 ) * dt
= integral wrt t from -inf to inf of X_PDF(t*|dy|/sqrt(2)) * Y_PDF(z-t*|dy|/sqrt(2)) * |dy| * dt

Make any sense?
 
  • #40
Z_PDF(z) dz is approximately the integral of the joint density over a thin area bounded by the lines x + y = z + dz and x + y = z - dz. Approximate this integral by dividing the area into parallelograms with vertices (k dx, z+dz), (k dx, z-dz), ((k+1)dx, z-dz), ((k+1)dx, z+dz). The area of a parallelogram is (dx)(dz). The probability mass in a parallelogram is approximately X_PDF(k dx) Y_PDF(z - k dx) dx dz. This gives an approximation with "dz" on both sides of the equation.
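
A small numerical check of the parallelogram idea. The sketch assumes X and Y are independent exponential variables with rate 1, for which the CDF of the sum is known in closed form, and it takes the strip between x + y = z and x + y = z + dz so that each parallelogram has area dx*dz; both choices are assumptions made for this example.

```python
import math

def exp_pdf(v):
    """Density of an exponential distribution with rate 1 (assumed example)."""
    return math.exp(-v) if v >= 0 else 0.0

def gamma2_cdf(z):
    """CDF of X + Y for independent Exp(1) variables: 1 - (1 + z) * exp(-z)."""
    return 1.0 - (1.0 + z) * math.exp(-z) if z >= 0 else 0.0

def parallelogram_sum(z, dx, dz):
    """Sum of X_PDF(k dx) * Y_PDF(z - k dx) * dx * dz over the strip."""
    n = int(z / dx) + 1  # for x beyond z the factor Y_PDF(z - x) is zero
    total = sum(exp_pdf(k * dx) * exp_pdf(z - k * dx) for k in range(n + 1))
    return total * dx * dz

z, dx, dz = 2.0, 1e-3, 1e-3
approx = parallelogram_sum(z, dx, dz)
exact = gamma2_cdf(z + dz) - gamma2_cdf(z)  # true probability of the strip
print(approx, exact)
```

Both sides carry one factor of dz, so dividing by dz gives a finite estimate of Z_PDF(z).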
 
  • #41
Stephen Tashi said:
Z_PDF(z) dz is approximately the integral of the joint density over a thin area bounded by the lines x + y = z + dz and x + y = z - dz. Approximate this integral by dividing the area into parallelograms with vertices (k dx, z+dz), (k dx, z-dz), ((k+1)dx, z-dz), ((k+1)dx, z+dz). The area of a parallelogram is (dx)(dz). The probability mass in a parallelogram is approximately X_PDF(k dx) Y_PDF(z - k dx) dx dz. This gives an approximation with "dz" on both sides of the equation.

Okay, sounds good.. I'll take a closer look when I get some more time. Thanks for now!
 
  • #42
Hm, a density value of an outcome equals the probability of that outcome per the length/area/volume of that outcome.

I would like to see how the outcome area of P(x<X<x+dx AND z-x<Y<z-x+dy) can be modified into the outcome area of P(z < Z < z+dz), since these should be equal.

Then it's desirable to take the quotient of these areas (even though I think this will be 1?).
When the number of source RV's equal the number of destination RV's, this will become the absolute value of the Jacobian determinant, I think?

I'll try to formulate that, but feel free to help out! :)
 
  • #43
rabbed said:
I would like to see how the outcome area of P(x<X<x+dx AND z-x<Y<z-x+dy) can be transformed into the outcome area of P(z < Z < z+dz), since these should be equal.

Why should they be equal ? For example, suppose z = 10, dz = 0.01, x = 5, dx = 0.5, dy = 0.25. Then you aren't accounting for cases like x = 9 and y = 1. And cases like x = 5.45, y = 5.20 aren't cases where x + y is in (z, z + dz).
 
  • #44
rabbed said:
Hm, a density value of an outcome equals the probability of that outcome per the length/area/volume of that outcome.

That would do for a definition of average density. For a probability density function (defined on points in a length, area, or volume) you need a definition for "density at a point", which means you must define it in terms of a limit of average densities.
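
The distinction can be seen numerically. The sketch assumes an exponential distribution with rate 1 (chosen only because its CDF and PDF are simple) and shows the average density over the interval [z, z + h] approaching the density at the point z as h shrinks.

```python
import math

def cdf(v):
    """CDF of an exponential distribution with rate 1 (assumed example)."""
    return 1.0 - math.exp(-v) if v > 0 else 0.0

def average_density(z, h):
    """Probability of the interval [z, z + h] divided by its length h."""
    return (cdf(z + h) - cdf(z)) / h

z = 1.0
exact = math.exp(-z)  # the density at the point z for this distribution
for h in (1.0, 0.1, 0.01, 0.001):
    print(h, average_density(z, h), exact)
```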
 
  • #45
Stephen Tashi said:
Why should they be equal ? For example, suppose z = 10, dz = 0.01, x = 5, dx = 0.5, dy = 0.25. Then you aren't accounting for cases like x = 9 and y = 1. And cases like x = 5.45, y = 5.20 aren't cases where x + y is in (z, z + dz).

Right, already forgot :) Is it possible to express that as probabilities?
Something like P(OR wrt x from -inf to inf of x<X<x+dx AND z-x<Y<z-x+dy)? :)
Or maybe integral wrt x from -inf to inf of P(x<X<x+dx AND z-x<Y<z-x+dy)
Or integral wrt x from -inf to inf of P(X=x AND Y=z-x)

Also, should dy be expressed wrt x?
 
  • #46
rabbed said:
Right, already forgot :) Is it possible to express that as probabilities?

Express what ? Are you asking for ways to describe the event ##Z :\{ z < Z < z + dz\}## ? If you describe it in terms of variables like x,y,dx,dy, then those variables must have some relation to z and dz.

In the XY-plane the event ##Z :\{ z < Z < z + dz\}## is ## (x,y): \{ z < x + y < z + dz \} ## If you want to write a description that includes "dx" and "dy", you have to specify how they are related to "z" and "dz".
 
  • #47
Basically, there should be a directional derivative or gradient (dz/dx, dz/dy) when you have many source RV's and one destination RV.

I want Z_PDF(z) to be expressed in terms of the probability of the points on that infinite line, divided by the absolute value of the derivative (length?) of the point where Z=z, like when there is a Jacobian..

Something like:
Z_PDF(z)/|dz| = integral wrt x from -inf to inf of X_PDF(x) / |dz/dx| * Y_PDF(z-x) / |dz/dy| * |dz|

I realize |dz/dx| and |dz/dy| is 1.. so maybe it would be better to try something like Z = 2*X + 3*Y
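
For the suggested case Z = 2*X + 3*Y, solving for y gives y = (z - 2*x)/3 with |dy/dz| = 1/3, so one candidate formula is Z_PDF(z) = integral over x of X_PDF(x) * Y_PDF((z - 2*x)/3) * (1/3) dx. The sketch below checks this numerically under the assumption that X and Y are independent standard normals, in which case Z is known to be N(0, 13); the function names are illustrative.

```python
import math

def phi(v):
    """Standard normal density (assumed marginal for both X and Y)."""
    return math.exp(-0.5 * v * v) / math.sqrt(2 * math.pi)

def z_pdf(z, lo=-10.0, hi=10.0, n=20000):
    """Midpoint rule for the integral of X_PDF(x) * Y_PDF((z - 2x)/3) / 3 over x."""
    dx = (hi - lo) / n
    total = 0.0
    for k in range(n):
        x = lo + (k + 0.5) * dx
        total += phi(x) * phi((z - 2 * x) / 3) / 3  # the 1/3 is |dy/dz|
    return total * dx

def exact(z):
    """Density of N(0, 13), since Var(2X + 3Y) = 4 + 9 = 13."""
    return math.exp(-z * z / 26) / math.sqrt(26 * math.pi)

print(z_pdf(1.0), exact(1.0))
```

Only the coefficient of the substituted variable shows up as a scale factor; the coefficient of x is absorbed by the substitution.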

Maybe it will give a definition of what a determinant looks like for a non-square matrix, because that has no definition now, right? :)
 
  • #48
rabbed said:
I want Z_PDF(z) to be expressed in terms of the probability of the points on that infinite line, divided by the absolute value of the derivative (length?) of the point where Z=z, like when there is a Jacobian..

For continuous random variables, the probability of each point on the infinite line is zero and the probability that (X,Y) will be some point on the infinite line is also zero. We have to think about probability densities, not probabilities.

The line x + y = z is a level curve of the surface f(x,y) = x + y. The gradient of f(x,y) defines a vector field that is perpendicular to that level curve. If we imagine Z varying from z to z + dz, this corresponds to the level curve x+y = Z sweeping out an area approximated by moving each point (x,y) on the level surface in the direction specified by the gradient.

I don't think you can approximate the area swept out by the level surface only by considering the gradient, because the area swept out depends on the shape of the curve. If I imagine the curve approximated by a series of small straight line segments, then the area swept out is approximated by a sum of areas of parallelograms. One side of a parallelogram is a line segment. The adjacent side is a vector defined by the direction of the gradient at one end of the line segment.

Perhaps this topic has been worked out by people who study "level set" methods. https://en.wikipedia.org/wiki/Level_set_method

Or page 449 eq 39 b) of https://books.google.com/books?id=Q...ge&q=area swept out by a moving curve&f=false
 
  • #49
Crystal clear explanation! Thank you :)
It doesn't seem impossible to do, but I guess if it can/has been solved they would already be teaching it as part of probability theory.

So people get by with only having the n-to-1 and n-to-n dimensional cases or are n-to-m calculations done graphically instead of with a formula?
 
  • #50
rabbed said:
So people get by with only having the n-to-1 and n-to-n dimensional cases or are n-to-m calculations done graphically instead of with a formula?

Even the 1-dim to 1-dim case isn't that simple in practice. Things are simple if Y = F(X) is a monotone function of the random variable X, but if it has peaks and valleys, you must consider various cases.

A complication with an n-dimensional function of a random variable (or variables) is that the components of an n-dimensional random vector might be dependent, even if the variables in the domain of the functions are independent. For example if X1,X2,X3 are independent random variables and Y1 = X1 + X3, Y2 = X2 + X3 then the joint density of (Y1,Y2) isn't necessarily given by the product Y1_PDF(y1) Y2_PDF(y2).
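
A discrete analogue makes the dependence easy to verify exactly. The sketch assumes X1, X2, X3 are independent fair coin flips taking values 0 or 1 (a simplification of the continuous setting above) and compares a joint probability for (Y1, Y2) with the product of the corresponding marginals.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Joint PMF of (Y1, Y2) = (X1 + X3, X2 + X3) by enumerating all outcomes.
joint = defaultdict(Fraction)
for x1, x2, x3 in product((0, 1), repeat=3):
    joint[(x1 + x3, x2 + x3)] += Fraction(1, 8)

def marginal(index, value):
    """Marginal probability that component `index` of (Y1, Y2) equals `value`."""
    return sum(p for ys, p in joint.items() if ys[index] == value)

p_joint = joint[(2, 2)]                      # requires x1 = x2 = x3 = 1
p_product = marginal(0, 2) * marginal(1, 2)  # what independence would predict
print(p_joint, p_product)
```

The joint probability and the product of marginals differ, so Y1 and Y2 are dependent even though the X's are not.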
 
