# Placing random variables in order

1. Oct 10, 2012

### infk

Hello
Let's say we have some continuous i.i.d. random variables $X_1, \ldots, X_n$ from a known distribution with some parameter $\theta$.
We then place them in ascending order $X_{(1)}, \ldots, X_{(n)}$ such that $X_{(i)} < X_{(i+1)}$.

We call this operation $T(\mathbf{X})$, where $\mathbf{X}$ is our vector $(X_1, \ldots, X_n)$.

Now let's say we are interested in finding out whether $P(\mathbf{X} = X| T(\mathbf{X}) = t)$, where $t$ and $X$ are both vectors (outcomes), depends on $\theta$.

By definition of conditional probability, we have:
$P(\mathbf{X} = X| T(\mathbf{X}) = t) = \frac{P(\mathbf{X} = X, T(\mathbf{X}) = t)}{P(T(\mathbf{X}) = t)}$

Trying to find these 2 probabilities:
$P(T(\mathbf{X}) = t)$: this is the probability that my ascending ordering $X_{(1)}, \ldots, X_{(n)}$ is a certain vector. This probability should simply be
$\prod^{n}_{i=1}f(x_i)$, since we already know that they are ordered.

$P(\mathbf{X} = X, T(\mathbf{X}) = t)$: this is the probability that my random vector $\mathbf{X}$ attains a certain (vector) value while the ordering attains a certain (vector) value. But this should also be equal to $\prod^{n}_{i=1}f(x_i)$.

So
$P(\mathbf{X} = X| T(\mathbf{X}) = t) = \frac{P(\mathbf{X} = X, T(\mathbf{X}) = t)}{P(T(\mathbf{X}) = t)} = 1$. If this is correct, what does it mean that the probability is 1? Is it wrong? Why?

2. Oct 10, 2012

### Staff: Mentor

I agree with $P(\mathbf{X} = X, T(\mathbf{X}) = t) = P(\mathbf{X} = X) = \prod^{n}_{i=1}f(x_i)$

However, the probability for a specific ordering, $P(T(\mathbf{X}) = t)$, is different, as you have multiple ways (multiple $X$) to get the same $P(X)$. Assuming the probability that two $X_i$ are the same is 0, you have exactly $n!$ ways.
Therefore, $P(T(\mathbf{X}) = t)=\frac{1}{n!}P(\mathbf{X} = X)$.

This is easy to see with an example:

$P(T(\mathbf{X}) = (1,2)) = P(\mathbf{X} = (1,2)) + P(\mathbf{X} = (2,1))$

$P(\mathbf{X} = X| T(\mathbf{X}) = t)$ is equivalent to "one specific permutation out of n! was chosen" - assuming $T(X)=t$, of course, otherwise it is 0.
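
The "one specific permutation out of $n!$" statement can be illustrated with a quick simulation. The sketch below is only an illustration under assumptions not made in the thread: it picks an exponential distribution with rate `theta` as the known distribution, uses $n = 3$, and the function name `rank_pattern_frequencies` is made up for this example. It records which arrangement of the indices sorts each sample and reports how often each arrangement occurs.

```python
import random
from collections import Counter
from itertools import permutations


def rank_pattern_frequencies(theta, n=3, trials=60_000, seed=0):
    """Estimate how often each arrangement of n i.i.d. Exp(theta) draws occurs.

    The exponential distribution is an assumption made for illustration;
    any continuous distribution would serve.
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(trials):
        xs = [rng.expovariate(theta) for _ in range(n)]
        # Record the arrangement as the tuple of indices that sorts the sample.
        pattern = tuple(sorted(range(n), key=xs.__getitem__))
        counts[pattern] += 1
    return {p: counts[p] / trials for p in permutations(range(n))}


# Two very different parameter values (hypothetical choices).
freqs_a = rank_pattern_frequencies(theta=0.5)
freqs_b = rank_pattern_frequencies(theta=5.0, seed=1)
for p in freqs_a:
    print(p, round(freqs_a[p], 3), round(freqs_b[p], 3))
```

If the reasoning holds, all $3! = 6$ arrangements should appear with frequency close to $1/6$ for both values of `theta`, which is what one would expect if the conditional probability does not depend on $\theta$.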

3. Oct 10, 2012

### Mute

Are the random variables discrete or continuous?

http://planetmath.org/encyclopedia/OrderStatistics.html claims that the pdf should be $n! \prod_i f_X(x_i)$ for continuous variables $x$, which seems to agree with what mfb has said.

In case the OP is not already aware, such a problem is one of order statistics.

Last edited by a moderator: May 6, 2017
4. Oct 10, 2012

### infk

I see what you mean.
But since $P(T(\mathbf{X}) = (1,2)) = P(\mathbf{X} = (1,2)) + P(\mathbf{X} = (2,1)) = 2P(\mathbf{X} = (1,2))$, shouldn't it rather be:
$P(T(\mathbf{X}) = t)= n!P(\mathbf{X} = X)$?

This way, the probability $P(\mathbf{X} = X|T(\mathbf{X}) = t)$ is equal to $\frac{1}{n!}$, which seems very intuitive: given that the order $T$ has taken a certain value $t = (X_{(1)},\ldots, X_{(n)})$, this corresponds (as pointed out by mfb) to exactly $n!$ outcomes of $\mathbf{X} = (X_1, \ldots, X_n)$, so the probability of $\mathbf{X}$ attaining any particular one of them is exactly $\frac{1}{n!}$.
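
The $(1,2)$ example can be checked exactly in a discrete toy version, where the probabilities are genuinely nonzero. The sketch below is an assumption-laden illustration, not part of the original argument: it uses a geometric distribution with parameter `p` in place of $\theta$, it only treats outcomes with distinct values (ties have positive probability for discrete variables and are ignored here), and the helper names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations


def pmf(k, p):
    """Geometric pmf on k = 1, 2, ...: P(X = k) = (1 - p)^(k - 1) * p."""
    return (1 - p) ** (k - 1) * p


def prob_vector(xs, p):
    """P(X = xs) for i.i.d. components: the product of the marginal pmfs."""
    out = Fraction(1)
    for k in xs:
        out *= pmf(k, p)
    return out


def conditional_given_order(xs, p):
    """P(X = xs | T(X) = sorted(xs)), assuming all values in xs are distinct."""
    # P(T(X) = t) is the sum over every rearrangement of the same values.
    p_order = sum(prob_vector(perm, p) for perm in permutations(xs))
    return prob_vector(xs, p) / p_order


for p in (Fraction(1, 10), Fraction(1, 2), Fraction(9, 10)):
    print(p, conditional_given_order((2, 1), p), conditional_given_order((3, 1, 2), p))
```

Because the product of the pmfs is the same for every rearrangement, the ratio reduces to $1/n!$ (here $1/2$ and $1/6$) whatever value `p` takes.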

Last edited: Oct 10, 2012
5. Oct 10, 2012

### Staff: Mentor

Oh, wrong side of the equation. Of course, otherwise your conditional probability would be n! (>1...) instead of 1/n! (correct).

6. Oct 12, 2012

### infk

Same question, but we pick another $T$: $T = (\min(\mathbf{X}),\max(\mathbf{X}))$. In this case we should get the same probability $P(T(\mathbf{X})=t)$ as before, namely $P(T(\mathbf{X}) = t)= n!P(\mathbf{X} = X)$. For consider the case $P(T = 1,X_2,3)$: this is the sum of the probabilities of $\mathbf{X}$ attaining all possible permutations of $(1,X_2,3)$, which equals $3!P(\mathbf{X} = (1,X_2,3) )$.

For $P(\mathbf{X} = X, T(\mathbf{X}) = t)$, I believe the same reasoning applies, since $T$ just takes the largest and smallest value of $\mathbf{X}$. Hence $P(\mathbf{X} = X, T(\mathbf{X}) = t) = P(\mathbf{X} = X) = \prod^{n}_{i=1}f(x_i)$.

We therefore get the same result as before. I don't know about the intuition behind this (see my interpretation of the previous result). Shouldn't the values of $\mathbf{X}$ that are between the smallest and largest have more "liberty" in attaining their outcomes, since in this case we only impose the restriction that they are between $\min(\mathbf{X})$ and $\max(\mathbf{X})$?

Last edited: Oct 12, 2012
7. Oct 12, 2012

### Staff: Mentor

I agree.

Why?

A constructive approach: Pick the index with the lowest value ($n$ choices) and the index with the highest value ($n-1$ choices), and require that all other variables are between those values:
$P(T(\mathbf{X}) = t)= n(n-1)f(t_1)f(t_2)\left(\int_{t_1}^{t_2}f(x)\,dx\right)^{n-2}$
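
One way to sanity-check this min/max density is to integrate it numerically over $a < t_1 < t_2 < b$ and compare with $P(a < X_i < b \text{ for all } i) = (F(b) - F(a))^n$, which must be the same event. The sketch below assumes an exponential distribution with rate `theta` (an illustrative choice, not from the thread) and writes the inner integral $\int_{t_1}^{t_2} f(x)\,dx$ as $F(t_2) - F(t_1)$; the function names and the values of `n`, `theta`, `a`, `b` are hypothetical.

```python
from math import exp


def f(x, theta):
    """Density of Exp(theta), assumed here for illustration."""
    return theta * exp(-theta * x)


def F(x, theta):
    """CDF of Exp(theta)."""
    return 1.0 - exp(-theta * x)


def minmax_density(t1, t2, n, theta):
    """n(n-1) f(t1) f(t2) (F(t2) - F(t1))^(n-2), valid for t1 < t2."""
    between = F(t2, theta) - F(t1, theta)
    return n * (n - 1) * f(t1, theta) * f(t2, theta) * between ** (n - 2)


def integrate_density(a, b, n, theta, m=400):
    """Nested midpoint-rule integral of the density over a < t1 < t2 < b."""
    total = 0.0
    h1 = (b - a) / m
    for i in range(m):
        t1 = a + (i + 0.5) * h1
        h2 = (b - t1) / m  # the inner variable t2 runs from t1 to b
        inner = sum(minmax_density(t1, t1 + (j + 0.5) * h2, n, theta) for j in range(m))
        total += inner * h2 * h1
    return total


n, theta, a, b = 4, 1.5, 0.2, 2.0
numeric = integrate_density(a, b, n, theta)
exact = (F(b, theta) - F(a, theta)) ** n  # P(all X_i lie in (a, b))
print(numeric, exact)
```

The two printed numbers should agree up to the discretisation error of the midpoint rule, which supports the counting argument with $n(n-1)$ index choices.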