Proof of the formula for the probability in a region

  • #1
Post-its
I'd like to know how to prove (or show that it is reasonable) that the probability that a random vector [itex](X, Y)[/itex] assumes a value in the region [itex]B\subseteq \mathbb{R}^2[/itex] is

[itex](1)[/itex] [itex]Pr((X, Y) \in B)=\iint\limits_B \, f_{X,Y}(x, y) \mathrm{d}x\,\mathrm{d}y[/itex].

My textbook doesn't provide much of an explanation for the above formula except that it is "the volume under the surface defined by the density and lying above the region [itex]B[/itex]." However, the univariate case is explained in the text by appealing to the fundamental theorem of calculus:

[itex](2)[/itex] [itex]Pr(X \in (a, b])=Pr(X \leq b)-Pr(X \leq a) = F_{X}(b)-F_{X}(a) = \int_a^b \! f_{X}(x) \, \mathrm{d}x[/itex].

Since my text practically proves the formula for [itex](2)[/itex] by appealing to the fundamental theorem of calculus, I've been assuming that [itex](1)[/itex] is not a definition and can be similarly proved by some theorem or set of theorems. Is this true, and if so, how can I prove [itex](1)[/itex]?

Also, I should mention that I'm currently taking an introductory course in probability and have no knowledge of measure theory.

Thanks,

Bijan
 
  • #2
Hi Bijan.

You may generalize expression (2) easily for the case of a random vector (X,Y).
Just suppose B to be the rectangular region defined as

[itex]B=\{(x,y): x_{1} < x \le x_{2},\ y_{1} < y \le y_{2}\}[/itex].

From this, how do you calculate [itex]P\{(X,Y)\in B\}[/itex]?
 
  • #3
Welcome to PF, Post-its! :smile:

Your equation (1) is actually the very definition of [itex]f_{X,Y}(x,y)[/itex].

Or rather, usually it is defined something like:
[tex]f_{X,Y}(x, y)dxdy = Pr((X, Y) \in dS)[/tex]
where dS is the infinitesimal surface element at (x, y) with size (dx,dy).
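
For readers who would rather avoid infinitesimals, one way to read this is as a limit: the probability of a small square at (x, y), divided by its area, approaches the density as the square shrinks. The sketch below illustrates that reading. The density [itex]f(x,y)=x+y[/itex] on the unit square and its CDF [itex]F(x,y)=(x^2y+xy^2)/2[/itex] are chosen purely for illustration, and the function names are hypothetical.

```python
def joint_cdf(x, y):
    """Joint CDF of the illustrative density f(x, y) = x + y on [0, 1]^2."""
    return (x * x * y + x * y * y) / 2.0


def density(x, y):
    """The illustrative density itself (valid only inside the unit square)."""
    return x + y


def cell_probability(x, y, h):
    """Probability that (X, Y) lands in the small square (x, x+h] x (y, y+h].

    Computed from the CDF by inclusion-exclusion over the four corners.
    """
    return (joint_cdf(x + h, y + h) - joint_cdf(x, y + h)
            - joint_cdf(x + h, y) + joint_cdf(x, y))


x0, y0 = 0.3, 0.4
for h in (0.1, 0.01, 0.001):
    # Probability per unit area of a shrinking square anchored at (x0, y0).
    ratio = cell_probability(x0, y0, h) / (h * h)
    print(f"h={h}: probability/area = {ratio:.6f}, density = {density(x0, y0):.6f}")
```

As h shrinks, the printed ratio should settle toward the density at (x0, y0), which is the non-infinitesimal content of the statement above.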


Your equation (1) is the summation (i.e. integral) of all the infinitesimal elements that belong to B.
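
To make the "summation of small elements" concrete, here is a minimal sketch that adds up f(x, y) dx dy over a grid of small cells covering a rectangular B and compares the sum with a value known in closed form. The density (two independent Exp(1) variables), the rectangle, and the function names are assumptions made for illustration only.

```python
import math


def density(x, y):
    """Illustrative joint density: two independent Exp(1) variables."""
    return math.exp(-x - y) if x >= 0 and y >= 0 else 0.0


def riemann_probability(x1, x2, y1, y2, n):
    """Midpoint Riemann sum of the density over (x1, x2] x (y1, y2] with n*n cells."""
    dx = (x2 - x1) / n
    dy = (y2 - y1) / n
    total = 0.0
    for i in range(n):
        x = x1 + (i + 0.5) * dx
        for j in range(n):
            y = y1 + (j + 0.5) * dy
            total += density(x, y) * dx * dy  # f(x, y) dx dy for one small cell
    return total


# Exact value for this density, from the product of two exponential CDFs.
exact = (1 - math.exp(-1.0)) * (1 - math.exp(-2.0))
for n in (4, 16, 64):
    approx = riemann_probability(0.0, 1.0, 0.0, 2.0, n)
    print(f"n={n:3d}: sum = {approx:.6f}, exact = {exact:.6f}")
```

The sums should approach the exact value as the grid is refined, which is what the double integral in (1) expresses in the limit.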
 
  • #4
I like Serena:

I haven't learned a formal way of dealing with infinitesimals, so I don't really understand that definition. I became very confused when I tried to think of the rationale for it.

Karax:

So if [itex]B[/itex] is the region [itex]B=\{(x,y):x_{1} < x \le x_{2}, y_{1} < y \le y_{2}\}[/itex], then the probability that [itex](X, Y)[/itex] assumes a value in [itex]B[/itex] is

[itex]Pr((X, Y) \in B)=Pr(x_{1} < X \le x_{2}, y_{1} < Y \le y_{2})[/itex].

From http://www.londoninternational.ac.u...3_advanced_stat_distribution_theory_ch_4.pdf, I've learned that I can use a "divide and conquer approach" to prove that

[itex]Pr(x_{1} < X \le x_{2}, y_{1} < Y \le y_{2})=F_{X,Y}(x_{2},y_{2})-F_{X,Y}(x_{1},y_{2})-F_{X,Y}(x_{2},y_{1})+F_{X,Y}(x_{1},y_{1})[/itex].

Now that I know this, I don't know how to get from that last equation to

[itex](3)[/itex] [itex]Pr(x_{1} < X \le x_{2}, y_{1} < Y \le y_{2})=\iint\limits_B f_{X,Y}(x,y) \, \mathrm{d}x\,\mathrm{d}y[/itex],

which is basically [itex](1)[/itex]. I'm uncomfortable with the notion that [itex](3)[/itex] or [itex](1)[/itex] could be a definition since my textbook felt the need to prove [itex](2)[/itex] by appealing to the fundamental theorem of calculus. If we should use [itex](3)[/itex] as some general definition, then I'd like to know "why" in order to feel more comfortable.

Thank you! :)

Bijan
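
The four-corner formula in this post does not depend on the density at all, which can be seen by checking it on raw counts. The sketch below, with a hypothetical correlated sample and made-up rectangle bounds, counts the sample points that fall in the rectangle directly and again through the four "left and below" counts (which are n times the empirical joint CDF). It is an illustration, not a proof.

```python
import random


def count_below(points, a, b):
    """Number of sample points with x <= a and y <= b (n times the empirical CDF)."""
    return sum(1 for (x, y) in points if x <= a and y <= b)


def count_in_rectangle(points, x1, x2, y1, y2):
    """Number of sample points with x1 < x <= x2 and y1 < y <= y2."""
    return sum(1 for (x, y) in points if x1 < x <= x2 and y1 < y <= y2)


random.seed(0)
points = []
for _ in range(10_000):
    x = random.gauss(0.0, 1.0)
    y = 0.6 * x + 0.8 * random.gauss(0.0, 1.0)  # Y depends on X on purpose
    points.append((x, y))

x1, x2, y1, y2 = -0.5, 1.0, -1.0, 0.3
direct = count_in_rectangle(points, x1, x2, y1, y2)
via_cdf = (count_below(points, x2, y2) - count_below(points, x1, y2)
           - count_below(points, x2, y1) + count_below(points, x1, y1))
print(direct, via_cdf)  # the two counts agree for any sample
```

The two counts agree exactly because each point contributes the same amount to both sides, whatever the distribution that generated it.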
 
  • #5
Hi Post-its! :smile:

You are correct, (1) and (3) are not definitions. The only definition that's made is

[tex]F_{X,Y}(x_1,y_1)=\iint_{B_{x_1,y_1}}{f_{X,Y}(x,y)dxdy}[/tex]

with [itex]B_{x_1,y_1}=\{(x,y)~\vert~x\leq x_1,y\leq y_1\}[/itex].

Now, since [itex]B_{x_1,y_2}\cap B_{x_2,y_1}=B_{x_1,y_1}[/itex], the term [itex]F_{X,Y}(x_1,y_1)[/itex] is added back because that corner region is subtracted twice, and we have that

[tex]F_{X,Y}(x_2,y_2)-F_{X,Y}(x_1,y_2)-F_{X,Y}(x_2,y_1)+F_{X,Y}(x_1,y_1)=\iint_{B_{x_2,y_2}\setminus (B_{x_1,y_2}\cup B_{x_2,y_1})}{f_{X,Y}(x,y)dxdy}[/tex]

So you still need to show that

[tex]B_{x_2,y_2}\setminus (B_{x_1,y_2}\cup B_{x_2,y_1})=B[/tex]
 
  • #6
Hmm, actually that is not sufficient, since there is no claim that B is rectangular.

But to "prove" stuff like this is really delving into calculus.
I would tend to prove it by saying something like: "it is immediately obvious that ..." or "from the definition of an integral it is clear that ...".
I believe that is basically what the textbook did. ;)

I guess formally you would have to make a "partition" of B into rectangles.
Then each rectangle would have to be mapped onto the proof micromass just started.
And next you would need to take the limit of this partition to infinitely small elements.

Then again, it's quite some time ago that I did stuff like that, so if that is what you want to learn, I'll leave that to micromass!
Did you know that micromass's nick name is proof-guy?
(Just made that up! :wink:)
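
The partition idea in this post can be sketched numerically for one non-rectangular region. The example below assumes (purely for illustration) that (X, Y) is uniform on the unit square and that B is the quarter disc [itex]x^2+y^2\le 1[/itex]; it brackets Pr((X, Y) in B) between the total probability of the small squares lying inside B and that of the small squares overlapping B. The function name is hypothetical.

```python
import math


def bracket_quarter_disc(n):
    """Lower and upper bounds on Pr((X, Y) in B) from an n-by-n grid of squares.

    B is the quarter disc x^2 + y^2 <= 1 and (X, Y) is uniform on [0, 1]^2.
    """
    inside = 0    # squares lying entirely inside B
    touching = 0  # squares whose near corner is strictly inside B, so they overlap B
    for i in range(n):
        for j in range(n):
            # Integer arithmetic: the corner (i/n, j/n) is in B iff i^2 + j^2 <= n^2.
            if (i + 1) ** 2 + (j + 1) ** 2 <= n * n:
                inside += 1      # far corner is in B, so the whole square is
            if i * i + j * j < n * n:
                touching += 1
    cell_area = 1.0 / (n * n)    # the density is 1, so probability equals area
    return inside * cell_area, touching * cell_area


for n in (10, 100, 400):
    lower, upper = bracket_quarter_disc(n)
    print(f"n={n}: {lower:.5f} <= Pr <= {upper:.5f}   (pi/4 = {math.pi / 4:.5f})")
```

Only the squares along the curved boundary separate the two bounds, and their total area shrinks as the grid is refined, so both bounds close in on the same value. That is the limit step mentioned above, shown for this one region.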
 
  • #7
Ah, B is not supposed to be rectangular?? Hmm, that makes things annoying. You'll have to know something about sigma-algebras and lambda-systems then... :frown: Let's just say that, if your book doesn't mention a proof and doesn't mention sigma-algebra, then I guess you can take the formula on faith...
 
  • #8
No! I am totally fine with assuming that B is rectangular. I'm just trying to decipher micro's post right now.
 
  • #9
@MM: Did you see that the textbook used a very strict definition of B saying: "For any reasonably well-behaved region [itex]B \subseteq \mathbb{R}^2[/itex]"? ;)
 
  • #10
I like Serena said:
@MM: Did you see that the textbook used a very strict definition of B saying: "For any reasonably well-behaved region [itex]B \subseteq \mathbb{R}^2[/itex]"? ;)

Hahaha :smile: I guess it's better than my first probability course that started with hard measure theory...

For the OP: maybe my post is more difficult than it should be. I'll post a new version...
 
  • #11
You are correct, (1) and (3) are not definitions. The only definition that's made is

[tex]F_{X,Y}(x_1,y_1)=\iint_{B_{x_1,y_1}}{f_{X,Y}(x,y)dxdy}[/tex]

with [itex]B_{x_1,y_1}=\{(x,y)~\vert~x\leq x_1,y\leq y_1\}[/itex].

But if we use the indicator (characteristic) functions:

[itex]I_A(x)=\left\{\begin{array}{c} 0~\text{if}~x\notin A\\ 1~\text{if}~x\in A\\ \end{array}\right.[/itex]

Then we can write this as


[tex]F_{X,Y}(x_1,y_1)=\iint{I_{B_{x_1,y_1}}f_{X,Y}(x,y)dxdy}[/tex]

Now, we have that

[tex]F_{X,Y}(x_2,y_2)-F_{X,Y}(x_1,y_2)-F_{X,Y}(x_2,y_1)+F_{X,Y}(x_1,y_1)=\iint{(I_{B_{x_2,y_2}}-I_{B_{x_1,y_2}}-I_{B_{x_2,y_1}}+I_{B_{x_1,y_1}})f_{X,Y}(x,y)dxdy}[/tex]

So you still need to show that

[tex]I_{B_{x_2,y_2}}-I_{B_{x_1,y_2}}-I_{B_{x_2,y_1}}+I_{B_{x_1,y_1}}=I_B[/tex]

This should be a lot easier than what I originally wrote...
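
One possible route to the last identity, sketched here under the thread's definitions of [itex]B[/itex] and [itex]B_{a,b}[/itex] and assuming [itex]x_1 < x_2[/itex] and [itex]y_1 < y_2[/itex], is to notice that the indicator of each quadrant factors into a product of one-dimensional indicators. The letters a and b are just placeholder names.

```latex
\begin{align*}
I_{B_{a,b}}(x,y) &= I_{(-\infty,a]}(x)\, I_{(-\infty,b]}(y), \\
I_{B_{x_2,y_2}} - I_{B_{x_1,y_2}} - I_{B_{x_2,y_1}} + I_{B_{x_1,y_1}}
  &= \bigl(I_{(-\infty,x_2]}(x) - I_{(-\infty,x_1]}(x)\bigr)
     \bigl(I_{(-\infty,y_2]}(y) - I_{(-\infty,y_1]}(y)\bigr) \\
  &= I_{(x_1,x_2]}(x)\, I_{(y_1,y_2]}(y) \\
  &= I_B(x,y).
\end{align*}
```

Multiplying both sides by [itex]f_{X,Y}(x,y)[/itex] and integrating over the plane then gives (3) for the rectangle.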
 
  • #12
Hi Bijan.
In your post #4, you already have it!
The "divide and conquer approach" of your book is the fundamental theorem of calculus, a little hidden.

It is useful to draw the rectangular area B, with its four corners [itex](x_{1},y_{1})[/itex], [itex](x_{1},y_{2})[/itex], [itex](x_{2},y_{1})[/itex] and [itex](x_{2},y_{2})[/itex], and try to visualize graphically where the result of the "divide and conquer approach" comes from.
Now we'll try to identify each term in your expression. It is useful to go on and shade the region corresponding to each term:

The first term in the expression is [itex]F_{X,Y}(x_{2},y_{2})[/itex]: this is the volume under the density [itex]f_{X,Y}[/itex] over the region [itex]\{x\leq x_{2}, y\leq y_{2}\}[/itex]. This is all of the XY-plane to the left of and below the top-right corner [itex](x_{2},y_{2})[/itex]. Clearly this region includes region B, but is larger. The next terms will correct the excess.

The second term is [itex]-F_{X,Y}(x_{1},y_{2})[/itex]: this accounts for the region to the left of and below the top-left corner [itex](x_{1},y_{2})[/itex]. This region does not belong to B. That's why we subtract it, in order to correct the excess involved in the first term.

The same goes for the third and fourth terms: the third subtracts the region to the left of and below the bottom-right corner [itex](x_{2},y_{1})[/itex], and the fourth adds back the region to the left of and below [itex](x_{1},y_{1})[/itex], which the second and third terms have both subtracted.

Well, after that, and with this geometric image in mind, let us go with the expression (1) of your first post:

[itex]P\{(X,Y)\in B\} = \iint\limits_B f_{X,Y}(x,y)\,dx\,dy[/itex].

We have already obtained the term on the left side.
Let us go with the integral. With the geometric image in mind, it is easy to write down the integral term as (schematically, my Latex ability is rather bad...):

[tex]\iint\limits_B f_{X,Y}\,dx\,dy = \int_{x_1}^{x_2}\!\int_{y_1}^{y_2} f_{X,Y}\,dy\,dx[/tex]
[tex]= \int_{x_1}^{x_2}\!\int_{-\infty}^{y_2} f_{X,Y}\,dy\,dx - \int_{x_1}^{x_2}\!\int_{-\infty}^{y_1} f_{X,Y}\,dy\,dx[/tex]
[tex]= \int_{-\infty}^{x_2}\!\int_{-\infty}^{y_2} f_{X,Y}\,dy\,dx - \int_{-\infty}^{x_1}\!\int_{-\infty}^{y_2} f_{X,Y}\,dy\,dx - \int_{-\infty}^{x_2}\!\int_{-\infty}^{y_1} f_{X,Y}\,dy\,dx + \int_{-\infty}^{x_1}\!\int_{-\infty}^{y_1} f_{X,Y}\,dy\,dx.[/tex]

These four terms are exactly the four terms we obtained earlier. And we have applied just the fundamental theorem of calculus.

By the way, there is no loss of generality in taking B to be a rectangular region.
As you know from the theory of integration, any reasonably well-behaved region in [itex]\mathbb{R}^{n}[/itex] can be treated as a limit of unions of rectangles. It is not necessary to delve into measure theory subtleties...
 
  • #13
If the density is _defined_ as a Radon-Nikodym derivative, then (1) is basically the statement of the Radon-Nikodym theorem, in which case it is true by definition. Then (2) becomes a proof that the density is equal to the derivative of the distribution function.
 
