Calculating a CDF Identity: Derivation and Explanation

  • Thread starter: pluviosilla
  • Tags: CDF identity
I ran across this identity in some actuarial literature:

Pr( (x_1 \le X \le x_2) \ \cap \ (y_1 \le Y \le y_2) ) = F(x_2, y_2) - F(x_1, y_2) - F(x_2, y_1) + F(x_1, y_1)

First of all, I am not certain this is correct. I think the expression on the LHS is equal to the following double integral, which is by no means obviously equal to the CDF expression on the RHS:

Pr( (x_1 \le X \le x_2) \cap (y_1 \le Y \le y_2) ) = \int_{x_1 }^{x_2}\int_{y_1}^{y_2}f(x,y)dydx

I suspect that maybe the author intended to use the OR condition in the expression on the left. Did he mean to say this?

Pr( (x_1 \le X \le x_2) \ \cup \ (y_1 \le Y \le y_2) ) = F(x_2, y_2) - F(x_1, y_2) - F(x_2, y_1) + F(x_1, y_1)

Either way, I would like to see the derivation. Any help would be much appreciated.

Thanks!
 
That's (almost) right. If I(S) is the indicator function of a set S then

I(x_1 \le X < x_2,\ y_1 \le Y < y_2) = \left(I(X<x_2)-I(X<x_1)\right)\left(I(Y<y_2)-I(Y<y_1)\right)

= I(X<x_2,Y<y_2)+I(X<x_1,Y<y_1)-I(X<x_1,Y<y_2)-I(X<x_2,Y<y_1)
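
One way to convince yourself of this identity, and of the fact that it needs no independence assumption, is to check it on simulated data. Below is a minimal Python sketch using only the standard library. The helper names (make_correlated_samples, count_below, count_in_rectangle) and the correlated Gaussian sampler are illustrative assumptions, not anything from the thread. Because the indicator identity holds for every individual sample point, the two counts agree exactly rather than approximately.

```python
import random

def make_correlated_samples(n, seed=0):
    """Draw n pairs (X, Y) that are deliberately NOT independent."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        # Y shares the component z with X, so the two are correlated.
        pairs.append((z, 0.8 * z + 0.6 * rng.gauss(0.0, 1.0)))
    return pairs

def count_below(pairs, x, y):
    """Number of samples with X < x and Y < y (n times the empirical F(x, y))."""
    return sum(1 for a, b in pairs if a < x and b < y)

def count_in_rectangle(pairs, x1, x2, y1, y2):
    """Number of samples with x1 <= X < x2 and y1 <= Y < y2."""
    return sum(1 for a, b in pairs if x1 <= a < x2 and y1 <= b < y2)

pairs = make_correlated_samples(10_000)
x1, x2, y1, y2 = -0.5, 1.0, -1.0, 0.3

lhs = count_in_rectangle(pairs, x1, x2, y1, y2)
rhs = (count_below(pairs, x2, y2) - count_below(pairs, x1, y2)
       - count_below(pairs, x2, y1) + count_below(pairs, x1, y1))
print(lhs, rhs)  # the two counts are equal
```

Dividing both counts by the sample size gives the empirical version of the CDF identity, with the half-open intervals and strict inequalities used in the post.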
 
Fascinating! I'll read up on the indicator function, because it looks like something I could use to find shortcuts! :-)

A couple of questions (if you have time):
(1.) Does the equation you posted require X & Y to be independent RVs?
(2.) What do you get when the two parentheses are ORed instead of ANDed?

Pr( (x_1 \le X \le x_2) \ \cup \ (y_1 \le Y \le y_2) )
 
pluviosilla said:
(1.) Does the equation you posted require X & Y to be independent RVs?
(2.) What do you get when the two parentheses are ORed instead of ANDed?

(1) no
(2) It's a little bit messier, but you can use I(A \cup B) = I(A) + I(B) - I(A)I(B).

The "Basic properties" section of the Wikipedia article on the indicator function (http://en.wikipedia.org/wiki/Indicator_function) should also help answer your questions.
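
To make answer (2) concrete, here is a sketch of where that identity leads. Take A = {x_1 \le X < x_2} and B = {y_1 \le Y < y_2}, and take expected values of both sides of I(A \cup B) = I(A) + I(B) - I(A)I(B). The marginal CDFs F_X and F_Y are notation introduced here for illustration (they do not appear elsewhere in the thread), and all CDFs use the strict-inequality convention F(x, y) = P(X < x, Y < y).

\begin{align*}
\Pr(A \cup B) &= \Pr(A) + \Pr(B) - \Pr(A \cap B) \\
&= \left(F_X(x_2) - F_X(x_1)\right) + \left(F_Y(y_2) - F_Y(y_1)\right) \\
&\quad - \left(F(x_2,y_2) - F(x_1,y_2) - F(x_2,y_1) + F(x_1,y_1)\right)
\end{align*}

So the OR version involves the marginal CDFs as well, and is not equal to the four-term expression on its own.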
 
I read the Wikipedia article which provides the identities you used above. In particular,

I_{A \cap B} = I_A \cdot I_B

Are you saying that we can extrapolate this relationship to all functions of an intersection? If so, how would you prove that?

It is true that a CDF is, in a sense, a function of an intersection:

F(x, y) = P(X < x AND Y < y)

But it is not generally true that F(x, y) = F_X(x) * F_Y(y), where F_X and F_Y are the marginal CDFs. This factorization only works when X & Y are independent.

No doubt, people familiar with the indicator function will quickly see how it applies to a multivariate CDF, but I am having trouble filling in the gaps.

Do you know where I can find some good proofs that use the indicator function to explore the properties of CDFs? I skimmed through a basic Probability text (Sheldon Ross) and found interesting applications (notably, a proof that E[I_A] = P(A)). But I found nothing of relevance to this discussion.
 
Try this.
\begin{align*}
\Pr( (x_1 \le X \le x_2) \cap (y_1 \le Y \le y_2) ) & = \int_{x_1}^{x_2} \int_{y_1}^{y_2} f(x,y)\,dy\,dx\\
& = \int_{x_1}^{x_2} \left(\int_{-\infty}^{y_2} - \int_{-\infty}^{y_1}\right) f(x,y)\,dy\,dx\\
& = \int_{x_1}^{x_2} \int_{-\infty}^{y_2} f(x,y) \, dy\, dx - \int_{x_1}^{x_2} \int_{-\infty}^{y_1} f(x,y) \, dy\, dx \\
& = \int_{-\infty}^{x_2} \left(\int_{-\infty}^{y_2} f(x,y) \, dy - \int_{-\infty}^{y_1} f(x,y)\,dy\right) dx\\
& \quad - \int_{-\infty}^{x_1} \left(\int_{-\infty}^{y_2} f(x,y) \, dy - \int_{-\infty}^{y_1} f(x,y)\,dy\right) dx\\
& = F(x_2,y_2) - F(x_1,y_2) - F(x_2,y_1) + F(x_1,y_1)
\end{align*}
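
The derivation can also be checked numerically. The Python sketch below uses a hypothetical joint density f(x, y) = x + y on the unit square, chosen only because X and Y are dependent under it and its CDF has a simple closed form; neither the density nor the function names come from the thread. It compares a midpoint-rule approximation of the double integral with the four-term CDF expression.

```python
def density(x, y):
    """Hypothetical joint PDF f(x, y) = x + y on the unit square (X, Y dependent)."""
    return x + y if 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0 else 0.0

def joint_cdf(x, y):
    """Closed-form F(x, y) for the density above, valid for 0 <= x, y <= 1."""
    return 0.5 * (x * x * y + x * y * y)

def rectangle_integral(x1, x2, y1, y2, steps=200):
    """Midpoint-rule approximation of the double integral of the density."""
    hx = (x2 - x1) / steps
    hy = (y2 - y1) / steps
    total = 0.0
    for i in range(steps):
        x = x1 + (i + 0.5) * hx
        for j in range(steps):
            y = y1 + (j + 0.5) * hy
            total += density(x, y)
    return total * hx * hy

x1, x2, y1, y2 = 0.2, 0.7, 0.1, 0.9
by_integral = rectangle_integral(x1, x2, y1, y2)
by_cdf = (joint_cdf(x2, y2) - joint_cdf(x1, y2)
          - joint_cdf(x2, y1) + joint_cdf(x1, y1))
print(by_integral, by_cdf)  # both approximately 0.38
```

The two numbers agree up to floating-point rounding here, because the midpoint rule is exact for an integrand that is linear in each variable.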
 
Yes, of course. This approach should have been obvious, but I didn't think of it. Thanks!

Do you understand how to prove the identity using the indicator function?
 
Sorry it took some time to get all the integrals correct - the LaTeX here was giving me fits (not updating, not parsing all the code, giving "Latex image not valid" messages) - it just isn't my day, I guess. The work with indicators is similar:

\begin{align*}
I(x_1 \le X \le x_2, y_1 \le Y \le y_2) & = I(x_1 \le X \le x_2) \cdot I(y_1 \le Y \le y_2)\\
& = I(x_1 \le X \le x_2) \cdot \left(I(Y \le y_2) - I(Y \le y_1)\right) \\
& = \left(I(X \le x_2) - I(X \le x_1)\right) \cdot \left(I(Y \le y_2) - I(Y \le y_1)\right)
\end{align*}

Multiply these out and then integrate the entire shebang w.r.t. dF(x,y) = f(x,y) \,dxdy
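
Spelled out, those last two steps might look like the following. This sketch assumes a continuous joint distribution, so that boundary points carry zero probability and the difference between \le and < does not matter.

\begin{align*}
& \left(I(X \le x_2) - I(X \le x_1)\right) \cdot \left(I(Y \le y_2) - I(Y \le y_1)\right) \\
& \quad = I(X \le x_2, Y \le y_2) - I(X \le x_1, Y \le y_2) - I(X \le x_2, Y \le y_1) + I(X \le x_1, Y \le y_1)
\end{align*}

Each term integrates to a value of the joint CDF, since \int I(X \le a, Y \le b) \, dF(x,y) = F(a,b). Integrating term by term therefore gives

\Pr( (x_1 \le X \le x_2) \cap (y_1 \le Y \le y_2) ) = F(x_2,y_2) - F(x_1,y_2) - F(x_2,y_1) + F(x_1,y_1)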
 
I see! You might say the indicator function takes the place of the integration limits. That's powerful!

It was your last statement (Integrate PDF * Expression with Indicator Function) that made this click for me. I'm a self-taught statistician, so I've got these annoying gaps in my training. This thread is the first I'd ever heard of the indicator function, but it clearly has some very useful properties.

Thanks very much to both you and gel.
 
pluviosilla said:
I see! You might say the indicator function takes the place of the integration limits. That's powerful!

Indeed. It works because integration is a linear function of the integrand.
Rather than writing out the integral in full, it's often useful to use the standard notation E(Z) for the expected value of a random variable Z. Then, for any event S, the indicator function I(S) is a random variable taking the values 0 and 1, and
E(I(S)) = P(S).
Writing probabilities in terms of expected values in this way is often handy for rearranging expressions such as the one you were asking about.
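
As a rough illustration of E(I(S)) = P(S), the Python sketch below estimates a rectangle probability as the sample mean of an indicator. The setup is an assumption made for the example, not something from the thread: X and Y are independent and uniform on [0, 1], so the exact answer is (x_2 - x_1)(y_2 - y_1) = 0.4 for the values chosen.

```python
import random

def indicator(event_holds):
    """I(S): 1 if the event occurred, 0 otherwise."""
    return 1 if event_holds else 0

x1, x2, y1, y2 = 0.2, 0.7, 0.1, 0.9
rng = random.Random(1)
n = 100_000

total = 0
for _ in range(n):
    x, y = rng.random(), rng.random()
    total += indicator(x1 <= x <= x2 and y1 <= y <= y2)

estimate = total / n  # sample mean of I(S), an estimate of E(I(S)) = P(S)
print(estimate)       # close to the exact value 0.4, up to sampling error
```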
 
Where are you taught that an integral is actually just a linear function of the integrand? Analysis class, perhaps?

I somehow managed to get a bachelor's degree in physics without a single course in analysis, chemistry or - alas - statistics. In the interview I had with the department chairman before graduating he said, "I admit that you have fulfilled all the requirements even though you are missing these courses, but I have to ask: how on Earth did you do it?" At the time, I thought I was clever to avoid these courses. Now, I just feel like a moron.
 
"Where are you taught that an integral is actually just a linear function of the integrand? "

If you have seen that

\int_a^b \left(c \cdot f(x) + d \cdot g(x) \right) \, dx = c \int_a^b f(x) \, dx + d \int_a^b g(x) \, dx

then you've seen the property you reference. Proving this property holds requires a class in which the properties of Riemann integration are developed; that may be an advanced calculus class or a first analysis class (more generalities in the latter). I saw it in advanced calculus as a junior and in a mathematical statistics class as a senior.
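
If it helps to see the property in action rather than proved, here is a small numerical check in Python. The functions f = sin and g = exp, the constants c and d, and the midpoint-rule helper are all arbitrary choices made for this sketch.

```python
import math

def midpoint_integral(func, a, b, steps=10_000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))

c, d = 2.0, -3.0
a, b = 0.0, 1.0
f = math.sin
g = math.exp

# Integral of the linear combination versus the linear combination of integrals.
lhs = midpoint_integral(lambda x: c * f(x) + d * g(x), a, b)
rhs = c * midpoint_integral(f, a, b) + d * midpoint_integral(g, a, b)
print(abs(lhs - rhs))  # tiny: the two sides differ only by floating-point rounding
```

The approximating sum is itself linear in the integrand, which is essentially why the property survives when the limit defining the Riemann integral is taken.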
 