CDF of minimum of N random variables.

AI Thread Summary
The discussion centers on finding the cumulative distribution function (CDF) of the minimum of N independent random variables. The initial incorrect reasoning proposed that P(Z<z) could be derived from the probabilities of individual random variables being less than or greater than z. However, this approach mistakenly assumes that Z being less than z requires exactly one X_i to be less than z while others are greater, which is not accurate. The correct method involves calculating P(Z > z) and leads to the result 1 - (1 - F_X(z))^N. Additionally, caution is advised regarding the definitions of CDF and the implications of strict versus non-strict inequalities in probability calculations.
ashwinnarayan
There's this problem that I've been trying to solve. I know the solution for it now but my initial attempt at a solution was wrong and I can't seem to figure out the mistake with my reasoning. I'd appreciate some help with figuring this one out.

1. Homework Statement
I have a set of random variables drawn independently from the same distribution, and a new random variable defined as their minimum:

##Z = \min\{X_1, X_2, \ldots, X_N\}##.

Each ##X_i## has the PDF ##f_X(x)## and CDF ##F_X(x)##.

What I want to do is to find the CDF (and then the PDF) of Z.

The Attempt at a Solution


So here's what I tried first.

##P(Z < z) = P\left((\exists\, i \text{ s.t. } X_i < z) \cap (X_j > z\ \forall j \neq i)\right)##
##P(Z < z) = \left(\sum_{i=1}^{N} P(X_i < z)\right) \left(\sum_{j=1,\, j \neq i}^{N} P(X_j > z)\right)##
##P(Z < z) = N(N-1)F_X(z)(1 - F_X(z))##

But I know this is wrong: I did some research and found that the correct (and easier) way to do it is to find ##P(Z > z)##. The actual answer is ##1 - (1 - F_X(z))^N##.

Can someone help me find the flaw in my reasoning?
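
For reference, the complement argument mentioned above can be written out as the short derivation below. It is a sketch that assumes the ##X_i## are independent and identically distributed with CDF ##F_X## and, for the last line, that the density ##f_X## exists:

$$
\begin{aligned}
P(Z \leq z) &= 1 - P(Z > z) \\
&= 1 - P(X_1 > z,\ X_2 > z,\ \ldots,\ X_N > z) \\
&= 1 - \prod_{i=1}^{N} P(X_i > z) \\
&= 1 - \left(1 - F_X(z)\right)^N, \\
f_Z(z) &= \frac{d}{dz} P(Z \leq z) = N \left(1 - F_X(z)\right)^{N-1} f_X(z).
\end{aligned}
$$

The key step is the second line: the minimum exceeds ##z## exactly when every ##X_i## exceeds ##z##, and independence turns that joint probability into a product.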
 
You are claiming that ##Z < z## if and only if exactly one of the ##X_i## is ##< z## while all of the others are ##> z##. This claim is false: ##\min\{3,4,5\} < 10##, but none of 3, 4, or 5 is ##> 10##. Also, ##\min\{3,4,5\} < 4.5##, but two of the entries are below 4.5 and only one exceeds it. The event ##Z < z## only requires that at least one of the ##X_i## is ##< z##.
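
A quick numerical check can make the difference between the two events concrete. The sketch below is only an illustration: it assumes Exponential(1) variables, ##N = 5## and ##z = 0.2##, none of which come from the original problem, and the function names are made up for this example. It estimates the probability that at least one ##X_i## falls at or below ##z## and the probability that exactly one does, and compares both with closed forms.

```python
import math
import random


def exp_cdf(z, rate=1.0):
    """CDF of an Exponential(rate) variable (assumed distribution for this demo)."""
    return 1.0 - math.exp(-rate * z) if z > 0 else 0.0


def min_cdf(F, n):
    """P(min <= z) for n i.i.d. variables with F = F_X(z)."""
    return 1.0 - (1.0 - F) ** n


def exactly_one_below(F, n):
    """P(exactly one X_i <= z and all others > z), a binomial probability."""
    return n * F * (1.0 - F) ** (n - 1)


def attempted_formula(F, n):
    """The expression from the first attempt: N(N-1) F (1-F)."""
    return n * (n - 1) * F * (1.0 - F)


def simulate(n, z, trials=100_000, seed=0):
    """Estimate P(at least one X_i <= z) and P(exactly one X_i <= z)."""
    rng = random.Random(seed)
    at_least_one = 0
    exactly_one = 0
    for _ in range(trials):
        below = sum(rng.expovariate(1.0) <= z for _ in range(n))
        at_least_one += below >= 1
        exactly_one += below == 1
    return at_least_one / trials, exactly_one / trials


n, z = 5, 0.2
F = exp_cdf(z)
emp_min, emp_one = simulate(n, z)

print(f"P(Z <= z): simulated {emp_min:.4f}, formula {min_cdf(F, n):.4f}")
print(f"P(exactly one below): simulated {emp_one:.4f}, formula {exactly_one_below(F, n):.4f}")
print(f"attempted expression at F = 0.5: {attempted_formula(0.5, n):.2f}")
```

The last line also shows that ##N(N-1)F_X(z)(1 - F_X(z))## cannot be a probability in general: with ##N = 5## and ##F_X(z) = 0.5## it evaluates to 5.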

Also: be careful of inequalities. The usual definition of CDF is ##P(Z \leq z)##, with a non-strict inequality. Some authors (very few) write the CDF as ##P(Z < z)##, but in that case the complementary probability is NOT ##P(Z > z)##, but rather, ##P(Z \geq z)##. Of course, it makes no difference when you are dealing with continuous random variables having densities (as you seem to be), but if you want to deal with discrete, or mixed continuous-discrete random variables, then you must be very careful. The easiest way to be careful is to learn some rigid rules right from the start of your studies.
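
The point about strict versus non-strict inequalities can be seen in a small discrete case. The example below is a sketch under assumptions that are not part of the original problem: two independent fair six-sided dice and the threshold ##z = 3##. It enumerates all outcomes exactly and checks that ##P(Z \leq z)## pairs with ##P(X > z)##, while ##P(Z < z)## pairs with ##P(X \geq z)##.

```python
from fractions import Fraction
from itertools import product

FACES = range(1, 7)  # fair six-sided die (assumed for illustration)
N = 2                # number of independent dice


def p_single(pred):
    """Exact probability that one die satisfies pred."""
    return Fraction(sum(1 for x in FACES if pred(x)), len(FACES))


def p_min(pred):
    """Exact probability that the minimum of N dice satisfies pred, by enumeration."""
    outcomes = list(product(FACES, repeat=N))
    return Fraction(sum(1 for o in outcomes if pred(min(o))), len(outcomes))


z = 3

# Non-strict CDF: the complement of {Z <= z} is {Z > z}.
le_enum = p_min(lambda m: m <= z)
le_formula = 1 - p_single(lambda x: x > z) ** N

# Strict version: the complement of {Z < z} is {Z >= z}.
lt_enum = p_min(lambda m: m < z)
lt_formula = 1 - p_single(lambda x: x >= z) ** N

print("P(Z <= 3):", le_enum, le_formula)
print("P(Z <  3):", lt_enum, lt_formula)
```

Here the two probabilities differ because ##P(Z = 3) > 0##; for continuous variables with densities that term vanishes and the distinction disappears.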
 