Probability & Statistics: Order Statistics

  1. Feb 17, 2009 #1
    1. The problem statement, all variables and given/known data
    Q1) About "order statistics", sometimes it's denoted x(1) and sometimes it's denoted X(1). What is the difference between the two?
    Also, for X(1)=min{X1,X2,...,Xn}, it's a random variable. What does it mean to be the minimum of a bunch of random variables? If they are SPECIFIC observed values, then we can order them (e.g. if we have 6,3,8,7, then ordering them gives 3,6,7,8)...that I understand. But if they are random variables, HOW can we order them?

    Q2) (more about order statistics)
    Here we have n random variables X1,X2,...,Xn and we see FX(x) here. Why can we label it based on just one single variable "x" instead of x1,x2,...,xn? Don't we have to treat them separately as x1,x2,...,xn instead of just one "x"? Well, you may say it is because they're identically distributed, so we can just use a single "x" to represent each of x1,x2,...,xn. But consider the following case:
    Let X1,X2,...,Xn be iid random variables with density f(xi)=xi, 0<xi<sqrt(2). In this case the joint density must be f(x1,x2,...,xn)=x1x2...xn, and is definitely NOT (x1)^n.
    So we've seen two different situations. In the first case, we can say x=x1=x2=...=xn, but not so in the second case. What is going on? Can someone please explain? I am always confused between these two cases. I am confused whenever they say X1,...,Xn are iid with COMMON density fX(x). If this is the case, then the JOINT density [fX(x)]^n would be a function of only a single variable "x", which doesn't make any sense to me (the joint density should be a function of n variables x1,x2,...,xn).

    2. Relevant equations
    Order Statistics

    3. The attempt at a solution
    As shown above.

    Thank you for clearing my doubts! I appreciate your great help!
  2. jcsd
  3. May 18, 2009 #2
    x usually denotes an observed value, whereas X denotes a random variable (i.e., a distribution). You cannot order random variables the way you order observed numbers, but using them you can derive the distribution that governs the value of the order statistic. As for Q2, you are confusing particular values with a distribution. If it makes it easier, you can try thinking of them not as probability distributions but as frequency distributions (e.g., body height in a given population of N individuals).
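
    To make the Q2 distinction concrete, here is a minimal sketch using the density from the original question, f(x) = x on 0 < x < sqrt(2). The function names common_density and joint_density and the sample point are made up for illustration. The common density is one function of one variable; the joint density of an iid sample is that same function evaluated at each coordinate and multiplied together.

```python
import math

def common_density(x):
    """Common density f(x) = x on (0, sqrt(2)), zero elsewhere."""
    return x if 0.0 < x < math.sqrt(2) else 0.0

def joint_density(xs):
    """Joint density of an iid sample: product of f over the coordinates."""
    result = 1.0
    for x in xs:
        result *= common_density(x)
    return result

point = (0.5, 1.0, 1.2)            # hypothetical sample point (x1, x2, x3)
print(joint_density(point))        # f(0.5) * f(1.0) * f(1.2)
print(common_density(1.0) ** 3)    # [f(x)]^3 evaluated at the single value x = 1.0
```

    The two printed numbers differ because [f(x)]^3 only agrees with the joint density at points where x1 = x2 = x3 = x.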
  4. May 18, 2009 #3
    For Q1, maybe it would help you if you would recall that a random variable X is a function from a sample space S to the set of real numbers.

    If s is an outcome in S, then X(s) is a real number.

    To say [tex]X_{(1)}=\min\{X_1,X_2,\dots,X_n\}[/tex], this really means that for each outcome s in S, we have [tex]X_{(1)}(s)=\min\{X_1(s),X_2(s),\dots,X_n(s)\}[/tex].
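
    The pointwise definition can be sketched on a small finite sample space. The three-dice setup below is an assumed example, not part of the original problem, and the names S, X and X_min are made up for illustration. Each X_i is literally a function on S, and so is the minimum.

```python
import itertools

# Sample space S: all outcomes s of rolling three dice (assumed example).
S = list(itertools.product(range(1, 7), repeat=3))

def X(i):
    """Return the random variable X_i as a function from S to the reals."""
    return lambda s: s[i]

X1, X2, X3 = X(0), X(1), X(2)

def X_min(s):
    """Order statistic X_(1), defined outcome by outcome."""
    return min(X1(s), X2(s), X3(s))

s = (6, 3, 5)        # one particular outcome in S
print(X_min(s))      # the smallest of X1(s), X2(s), X3(s)
```

    Nothing is being sorted "as random variables" here: for every outcome s, the values X1(s), X2(s), X3(s) are ordinary numbers, and X_min simply reports the smallest one.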

    For Q2, you are right that you cannot simply say that the joint f is the product [tex][f_X(x)]^n[/tex]. There is a different reason for this in this example.

    First, for emphasis and clarity, use the letter k instead of x, where k is a constant.

    Now [tex]\{X_{(n)} \le k\}[/tex] is shorthand for [tex]\{ s\in S:X_{(n)}(s) \le k\}[/tex]. But [tex]X_{(n)}(s)=\max\{X_1(s),X_2(s),\dots,X_n(s)\}[/tex], so [tex]X_{(n)}(s) \le k[/tex] if and only if [tex]X_i(s)\le k[/tex] for all i from 1 to n.

    That is, [tex]\{ s\in S:X_{(n)}(s) \le k\}=\{s\in S:X_1(s)\le k\ \land\ X_2(s)\le k\ \land\ \dots\ \land\ X_n(s)\le k\}=\{s\in S:X_1(s)\le k\}\cap \dots\cap\{s\in S: X_n(s)\le k\}[/tex].

    By independence, [tex]P(\{s\in S:X_1(s)\le k\}\cap \dots\cap\{s\in S: X_n(s)\le k\})=P(X_1\le k)P(X_2\le k)\dots P(X_n\le k)[/tex].
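
    Since the X_i are also identically distributed, each factor on the right equals F_X(k), so the product is [F_X(k)]^n, a function of the single constant k. As a rough check, the sketch below enumerates an assumed sample space of n = 3 independent fair dice (not part of the original problem) and compares both sides exactly using fractions. The helper names prob and check are made up for illustration.

```python
from fractions import Fraction
from itertools import product

n = 3
faces = range(1, 7)
S = list(product(faces, repeat=n))   # equally likely outcomes (assumed fair dice)

def prob(event):
    """Exact probability of an event, given as a predicate on outcomes s."""
    return Fraction(sum(1 for s in S if event(s)), len(S))

def check(k):
    """Return (P(X_(n) <= k), product of P(X_i <= k) over i)."""
    lhs = prob(lambda s: max(s) <= k)
    rhs = Fraction(1)
    for i in range(n):
        rhs *= prob(lambda s, i=i: s[i] <= k)
    return lhs, rhs

for k in faces:
    lhs, rhs = check(k)
    print(k, lhs, rhs, lhs == rhs)
```

    In this example F_X(k) = k/6, so both columns should equal (k/6)^3 for every k.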