Exponential Order Statistics and Independence

  • #1
showzen

Homework Statement


Consider the exponential probability density function with location parameter ##\theta## :
$$f(x|\theta) = e^{-(x-\theta)}I_{(\theta,\infty)}(x)$$
Let ##X_{(1)}, X_{(2)},...,X_{(n)}## denote the order statistics.
Let ##Y_i = X_{(n)} - X_{(i)}##.
Show that ##X_{(1)}## is independent of ##(Y_1,Y_2,...,Y_{n-1})##

Homework Equations

The Attempt at a Solution


First, I consider the joint density of all the order statistics for this distribution, which is supported on ##\{\theta < x_1 < x_2 < \cdots < x_n\}##:
$$
\begin{align*}
f(x_1, x_2,..., x_n) &= n!e^{-\sum_{i=1}^{n}(x_i-\theta)}\\
&= n!\exp\left(n\theta - \sum_{i=1}^{n}x_i\right)
\end{align*}
$$
Next, I coerce the desired joint probability density function to resemble the known joint pdf of the order statistics; note the change of variables ##(x_1, x_2, \dots, x_n) \mapsto (x_1, y_1, \dots, y_{n-1})## is linear with Jacobian of absolute value 1.
$$
\begin{align*}
f(x_1, y_1,...,y_{n-1}) &= f(x_1, x_n-x_1,...,x_n-x_{n-1})\\
&= n!\exp(n\theta -x_1-x_2-...-x_n)\\
&= n!\exp(n\theta-x_1+(y_2-x_n)+...+(y_{n-1}-x_n)-(y_1+x_1))\\
&= n!\exp(n\theta-x_1+(y_2-y_1-x_1)+...+(y_{n-1}-y_1-x_1)-(y_1+x_1))\\
&= n!\exp(n\theta-nx_1-(n-1)y_1+y_2+...+y_{n-1})\\
&= n!\exp(n\theta-nx_1)\exp(-(n-1)y_1)\prod_{i=2}^{n-1}\exp(y_i)\\
&= g(x_1)h(y_1,...,y_{n-1})
\end{align*}
$$
The factorization into ##g## and ##h## proves independence. (For this to be airtight, I should also note that the support translates to ##\{x_1 > \theta\} \times \{y_1 > y_2 > \cdots > y_{n-1} > 0\}##, a product set, so the factorization criterion applies.)

Okay, so I am really not sure whether this type of argument holds, or whether I need to pursue a more rigorous approach.
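As a quick numerical sanity check on the factorization, here is a minimal Monte Carlo sketch (the values of ##n##, ##\theta##, and the thresholds ##x, t## are arbitrary choices of mine, not from the problem):

```python
import numpy as np

# Monte Carlo check: does P(X_(1) <= x, Y_1 <= t) match P(X_(1) <= x) * P(Y_1 <= t)?
rng = np.random.default_rng(0)
n, theta, trials = 5, 2.0, 200_000          # arbitrary test parameters

# each row is one sample of n iid draws from f(x|theta) = exp(-(x - theta)) on (theta, inf)
samples = theta + rng.exponential(scale=1.0, size=(trials, n))
order = np.sort(samples, axis=1)            # order statistics X_(1) <= ... <= X_(n)
x_min = order[:, 0]                         # X_(1)
y1 = order[:, -1] - order[:, 0]             # Y_1 = X_(n) - X_(1)

x, t = 2.3, 1.5                             # arbitrary thresholds
joint = np.mean((x_min <= x) & (y1 <= t))   # empirical P(X_(1) <= x, Y_1 <= t)
product = np.mean(x_min <= x) * np.mean(y1 <= t)
print(f"P(A and B) ~ {joint:.4f}   P(A)P(B) ~ {product:.4f}")
```

The two estimates agree to within Monte Carlo error for every ##(x, t)## I try, which at least supports the factorization (this only checks ##Y_1## against ##X_{(1)}##; the full claim involves the whole vector ##(Y_1,\dots,Y_{n-1})##).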
 
  • #2
showzen said:

Homework Statement


Consider the exponential probability density function with location parameter ##\theta## :
$$f(x|\theta) = e^{-(x-\theta)}I_{(\theta,\infty)}(x)$$
Let ##X_{(1)}, X_{(2)},...,X_{(n)}## denote the order statistics.
Let ##Y_i = X_{(n)} - X_{(i)}##.
Show that ##X_{(1)}## is independent of ##(Y_1,Y_2,...,Y_{n-1})##

Homework Equations

$$f(x|\theta) = e^{-(x-\theta)}I_{(\theta,\infty)}(x)$$
is not the pdf of an exponential distribution. I think you are doing something else other than what you've literally stated here. The lack of Relevant Equations further adds to the ambiguity.
 
  • #3
StoneTemplePython said:
$$f(x|\theta) = e^{-(x-\theta)}I_{(\theta,\infty)}(x)$$
is not the pdf of an exponential distribution. I think you are doing something else other than what you've literally stated here. The lack of Relevant Equations further adds to the ambiguity.

[Attached image: the text's definition of the exponential pdf with location parameter ##\mu## and scale parameter ##\beta##: ##f(x|\mu,\beta) = \frac{1}{\beta}e^{-(x-\mu)/\beta}## for ##x \geq \mu##.]

I am using this definition, with ##\beta=1## and ##\mu=\theta##. Also, I am using the indicator function to specify the domain.
 

  • #4
Thanks, I see what you are doing now. "Location parameter" isn't a common term for an exponential distribution, and I inferred ##\theta## was the rate (or scale) parameter since ##\beta## was not mentioned.

It's good to follow your text, but you should know that this is a very nonstandard formulation. I looked through several texts I have, and they all define the exponential distribution as done here: http://www.randomservices.org/random/poisson/Exponential.html
- - - -
So for this problem the rate (or scale) parameter is 1, and you have ##X = Z + \theta## where ##X## is the stated random variable and ##Z## is the (standard) exponential distribution. In any case ##\theta## is a constant that is acting as an affine shift.

My suggestion is twofold (for (1.), see the short display after these suggestions):
(1.) Consider setting ##\theta :=0##. This should radically streamline the result. (Then later show why this assumption works WLOG -- the clue is that you are using ##Y_i = X_{(n)} - X_{(i)}##, so if you increment each ##X_{(k)}## by ##\theta## it does not change ##Y_i##, and an affine shift to the first arrival ##X_{(1)}## doesn't change anything either.) In any case the ##\theta := 0## case is simpler because it allows you to use (2):

(2.) Exponentials have an extremely nice attribute (one that is unique amongst continuous distributions) called memorylessness. Are you familiar with this? Similarly, there are a few ways to prove independence -- one is working directly with CDFs / densities, another involves using MGFs, and in this case understanding the Poisson process as an (ideal) counting process may be used... Without things like this stated in the Relevant Equations, we don't really know what sort of tools are at your disposal. Typically there's more than one valid approach, but some are a lot easier than others...
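A short display to make both suggestions concrete. For (1.), with ##X = Z + \theta## as above, the shift cancels in every ##Y_i## and merely translates ##X_{(1)}##:
$$Y_i = X_{(n)} - X_{(i)} = (Z_{(n)} + \theta) - (Z_{(i)} + \theta) = Z_{(n)} - Z_{(i)}, \qquad X_{(1)} = Z_{(1)} + \theta,$$
so ##X_{(1)}## is independent of ##(Y_1,\dots,Y_{n-1})## if and only if ##Z_{(1)}## is. For (2.), memorylessness of the standard exponential is the statement
$$P(Z > s + t \mid Z > s) = \frac{e^{-(s+t)}}{e^{-s}} = e^{-t} = P(Z > t), \qquad s, t \geq 0.$$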
 
  • #5
showzen said:

Homework Statement


Consider the exponential probability density function with location parameter ##\theta## :
$$f(x|\theta) = e^{-(x-\theta)}I_{(\theta,\infty)}(x)$$
Let ##X_{(1)}, X_{(2)},...,X_{(n)}## denote the order statistics.
Let ##Y_i = X_{(n)} - X_{(i)}##.
Show that ##X_{(1)}## is independent of ##(Y_1,Y_2,...,Y_{n-1})##

Homework Equations

The Attempt at a Solution


First, I consider the joint density of all the order statistics for this distribution, which is supported on ##\{\theta < x_1 < x_2 < \cdots < x_n\}##:
$$
\begin{align*}
f(x_1, x_2,..., x_n) &= n!e^{-\sum_{i=1}^{n}(x_i-\theta)}\\
&= n!\exp\left(n\theta - \sum_{i=1}^{n}x_i\right)
\end{align*}
$$
Next, I coerce the desired joint probability density function to resemble the known joint pdf of the order statistics; note the change of variables ##(x_1, x_2, \dots, x_n) \mapsto (x_1, y_1, \dots, y_{n-1})## is linear with Jacobian of absolute value 1.
$$
\begin{align*}
f(x_1, y_1,...,y_{n-1}) &= f(x_1, x_n-x_1,...,x_n-x_{n-1})\\
&= n!\exp(n\theta -x_1-x_2-...-x_n)\\
&= n!\exp(n\theta-x_1+(y_2-x_n)+...+(y_{n-1}-x_n)-(y_1+x_1))\\
&= n!\exp(n\theta-x_1+(y_2-y_1-x_1)+...+(y_{n-1}-y_1-x_1)-(y_1+x_1))\\
&= n!\exp(n\theta-nx_1-(n-1)y_1+y_2+...+y_{n-1})\\
&= n!\exp(n\theta-nx_1)\exp(-(n-1)y_1)\prod_{i=2}^{n-1}\exp(y_i)\\
&= g(x_1)h(y_1,...,y_{n-1})
\end{align*}
$$
The factorization into ##g## and ##h## proves independence. (For this to be airtight, I should also note that the support translates to ##\{x_1 > \theta\} \times \{y_1 > y_2 > \cdots > y_{n-1} > 0\}##, a product set, so the factorization criterion applies.)

Okay, so I am really not sure whether this type of argument holds, or whether I need to pursue a more rigorous approach.

As pointed out in #4, we might as well take ##\theta = 0##. Let's look at the case of ##n=4##. For notational simplicity let ##Z_i = X_{(i)}## for ##i = 1,2,3,4.## The density function of ##(Z_1,Z_2,Z_3, Z_4)## is
$$ f(z_1,z_2,z_3,z_4) = 4! e^{-z_1} e^{-z_2} e^{-z_3} e^{-z_4} \cdot 1\{0 \leq z_1 < z_2 < z_3 < z_4 \},$$
where ##1\{ A \}## is the indicator function of a set ##A##. The multivariate moment-generating function of ##(Z_1, Z_4-Z_1, Z_4-Z_2, Z_4-Z_3)## is
$$M(t,s_1,s_2,s_3) = E\exp(t Z_1 + s_1 (Z_4-Z_1) + s_2 (Z_4 - Z_2) + s_3 (Z_4 - Z_3)) \\
\hspace{1ex} = \int \! \int \! \int \! \int f(z_1,z_2,z_3,z_4) \exp(t z_1 + s_1(z_4-z_1)+s_2(z_4-z_2) + s_3(z_4 - z_3)) \,dz_1\, dz_2\, dz_3\, dz_4 \\
\hspace{1ex} = \int_0^\infty dz_1 \int_{z_1}^\infty dz_2 \int_{z_2}^\infty dz_3 \int_{z_3}^\infty g(z_1,z_2,z_3,z_4) \, dz_4,$$
where
$$g(z_1,z_2,z_3,z_4) = 4!e^{t z_1 + s_1(z_4-z_1)+s_2(z_4-z_2) + s_3(z_4 - z_3) -z_1-z_2-z_3-z_4}.$$
The integrals are all more-or-less elementary, so ##M## can be determined.

You can decide independence from the form of ##M(t,s_1,s_2,s_3).##
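For reference, a sketch of where the computation should land, using the standard spacings fact (not stated elsewhere in this thread) that the ##D_k = Z_k - Z_{k-1}## (with ##Z_0 := 0##) are independent with ##D_k \sim \text{Exp}(n-k+1)## in the rate parametrization. For ##n=4## the exponent rearranges to
$$t Z_1 + s_1(Z_4 - Z_1) + s_2(Z_4 - Z_2) + s_3(Z_4 - Z_3) = t D_1 + s_1 D_2 + (s_1+s_2) D_3 + (s_1+s_2+s_3) D_4,$$
so the iterated integrals should give
$$M(t,s_1,s_2,s_3) = \frac{4}{4-t}\cdot\frac{3}{3-s_1}\cdot\frac{2}{2-s_1-s_2}\cdot\frac{1}{1-s_1-s_2-s_3}$$
(for ##t < 4##, ##s_1 < 3##, ##s_1+s_2 < 2##, ##s_1+s_2+s_3 < 1##). The ##t##-dependence sits entirely in the first factor, which is the MGF of ##Z_1 \sim \text{Exp}(4)##.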
 
  • #6
For a clean analytical approach, it seems to me that Ray's method using MGF is probably the most efficient path. The lingering problem is that it doesn't tell me why this result holds. Typically various order statistics have a high degree of dependence -- but in this case we have memorylessness. By convention I use the rate parameter ##\lambda##, where ##\lambda = \frac{1}{\beta}##.

I'm not sure that OP knows about Poisson processes yet... but the below approach is probabilistically rather satisfying.
- - - - -
Here's a sketch of a different approach (again assuming ##\theta =0##):

Letting ##A_x## be the event that ##X_{(1)}(\omega) \leq x## and ##B_t## be the event that ##Y_i(\omega) \leq t## (for a fixed ##i##), the goal here is to prove independence between ##X_{(1)}## and ##Y_i## by showing that

##P\big(X_{(1)} \leq x, Y_i \leq t\big) = P\big(X_{(1)} \leq x\big)P\big(Y_i \leq t\big)##
aka:
##P\big(A_x, B_t\big) = P\big(A_x\big) P\big( B_t\big)##
for any ##x, t \gt 0##

So we may try to prove:
##P\big(A_x\big) P\big(B_t\big \vert A_x\big) = P\big(A_x\big) P\big( B_t\big)##

Observations:
(i) ##P\big( A_x\big)##
is given precisely as ##1 - \text{merged Poisson process probability of 0 arrivals} = 1 - p_{n \lambda}(k=0, x)##
(i.e. the merged Poisson process has a rate of ##n \cdot \lambda## -- why? See the display after these observations.)

(ii) All quantities are positive, so we could just seek to prove
##P\big(B_t\big \vert A_x\big) = P\big( B_t\big)##
Unfortunately both of these are hard to evaluate, as ##P\big( B_t\big)## involves a particular kind of inhomogeneous Poisson process: it starts with rate ##(n-i)\cdot\lambda##, which decreases to ##(n-i-1)\cdot\lambda## after an arrival, then to ##(n-i-2)\cdot\lambda## after another arrival, and so on. This may be doable, but inhomogeneity is challenging. Homogeneous processes are a lot easier.
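To make observation (i) explicit: ##X_{(1)}## is the first arrival of the merged rate-##n\lambda## process (equivalently, the minimum of ##n## i.i.d. ##\text{Exp}(\lambda)## variables), so
$$P\big(A_x\big) = 1 - P\big(X_{(1)} > x\big) = 1 - \big(e^{-\lambda x}\big)^n = 1 - e^{-n\lambda x}.$$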

- - - -
A better approach:
Try to prove
##P\big(A_x \big \vert B_t \big) P\big(B_t\big) = P\big(A_x\big) P\big( B_t\big)##
Hint: this follows from a key attribute of the Poisson process. (Various ways to finish could include things like reversibility, or Poisson splitting/merging, maybe an induction hypothesis, etc., though they aren't needed per se.)
 