Order Statistics: CDF Calculation for i.i.d. Random Variables

  • Context: Undergrad
  • Thread starter: EngWiPy
  • Tags: CDF, Statistics

Discussion Overview

The discussion revolves around the calculation of cumulative distribution functions (CDFs) for the sums of ratios of order statistics derived from independent and identically distributed (i.i.d.) random variables. Participants explore the implications of different assumptions regarding the independence and distribution of these random variables, particularly in the context of exponential distributions.

Discussion Character

  • Technical explanation
  • Mathematical reasoning
  • Debate/contested

Main Points Raised

  • One participant presents an integral formulation for the probability of the sum of ratios of random variables, defined as ##Z_k = X_k/Y_k##, and seeks to extend this to order statistics.
  • Another participant questions the clarity of the independence and distribution assumptions for the random variables ##X_k## and ##Y_k##, suggesting multiple interpretations of their relationships.
  • A participant clarifies that all random variables are i.i.d. exponential variables, emphasizing that the support is the positive reals, and reiterates the need for proper definitions in the context of their calculations.
  • Concerns are raised about the dependencies introduced by the order statistics, leading to a discussion of how to adjust the integrals to account for these dependencies.
  • Participants discuss the complexity of the resulting integrals and the potential for numerical evaluation with software tools, while also considering the implications of changing the number of variables summed from 3 to a general ##M##.
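For concreteness, the quantity under discussion can be estimated by Monte Carlo under one possible reading of the setup. Note that this pairing is an assumed interpretation, not the thread's settled definition (the thread itself debates exactly which variables are ordered and how they are paired): the ##M## smallest of ##K## i.i.d. Exp(1) variables are each divided by an independent Exp(1) variable, and the sum is compared to a threshold ##z##.

```python
import random

def mc_prob(K=3, M=2, z=1.0, n_trials=100_000, seed=1):
    """Monte Carlo estimate of P(sum_{k=1}^M X_(k)/Y_k <= z), where
    X_(1) <= ... <= X_(K) are the order statistics of K i.i.d. Exp(1)
    variables and the Y_k are independent Exp(1) variables.
    This pairing is an assumed interpretation for illustration only."""
    rng = random.Random(seed)
    count = 0
    for _ in range(n_trials):
        xs = sorted(rng.expovariate(1.0) for _ in range(K))
        s = sum(xs[k]/rng.expovariate(1.0) for k in range(M))
        if s <= z:
            count += 1
    return count/n_trials

p = mc_prob()
print(p)
```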

Areas of Agreement / Disagreement

Participants express differing views on the assumptions regarding independence and distribution of the random variables, leading to an unresolved discussion with multiple competing interpretations and approaches to the problem.

Contextual Notes

Limitations include the dependence on specific assumptions about the distributions and independence of the random variables, as well as the complexity introduced by the order statistics, which affects the evaluation of the integrals.

  • #31
Solving for ##dy## yields the following formula:

$$dy = y\left[\frac{1}{1-y\,\delta}-1\right]$$

To have a positive increment ##dy## we need ##0<y\,\delta<1##. How to incorporate this in the calculation of ##dy##? In other words, what ##dy## should be when the condition is not met? Can we say

$$dy=\max\left[\rho,\; y\left(\frac{1}{1-y\,\delta}-1\right)\right]$$

where ##\rho = 0.001##, for example? Also, I think we need to put some condition on ##dy## like ##0<dy<1##, right?

The code becomes (before applying the conditions on ##dy##)

Python:
import math
K = 3
M = 2
dx = 0.01
delta = 0.01

y_Upper_lim = 7

#[1-FX(x)]^{(K-M)}
def F(x):
    return math.exp(-x*(K-M))

#fX(x) or fY(y)
def f(u):
    return math.exp(-u)

def G(z, m, x_prev_lim, y_current_lower_lim):
    res = 0
    #The lower limit of the integrations over x
    ux = x_prev_lim
    uy = y_current_lower_lim
    dy = uy*((1/(1-uy*delta)) - 1)
    while uy < y_Upper_lim:
        while ux <= uy*z:
            if m == M:
                res += F(ux)*f(ux)*f(uy)*dx*dy
                ux += dx
            else:
                res += G(z - (ux/uy), m+1, ux, z/ux)*f(ux)*f(uy)*dx*dy
                ux += dx
        uy += dy
    return res

(math.factorial(K)/math.factorial(K-M))*G(1, 1, 0.0001, 0.0001)

but ##dy## sometimes becomes negative, and at other times very large and positive (e.g., 1000)!
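The erratic behavior can be seen directly from the formula: ##dy## stays small and positive while ##0<y\,\delta<1##, blows up as ##y\,\delta\to 1##, and turns negative once ##y\,\delta>1##. A quick sketch (the values of ##y## here are purely illustrative, not from the thread's integration range):

```python
# Sketch of how dy = y*(1/(1 - y*delta) - 1) behaves as y grows.
delta = 0.01
for y in [0.5, 7.0, 99.0, 150.0]:
    dy = y*((1/(1 - y*delta)) - 1)
    # At y = 99 (y*delta = 0.99) dy = 9801; at y = 150 (y*delta = 1.5) dy = -450.
    print(f"y = {y:6.1f}  ->  dy = {dy:.6g}")
```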
 
  • #32
The calculation of dy needs to be inside the 'while uy...' loop, otherwise the increment size will be the same throughout that integration.

In the recursive call of G, the last argument needs to be ux/z rather than z/ux.

EngWiPy said:
To have a positive increment dy we need 0<yδ<1. How to incorporate this in the calculation of dy?
That condition will automatically be met in the range being considered. With ##y\le L=7## and ##\delta=0.01##, we'll have ##y\delta \le 0.07##.

EngWiPy said:
Also, I think we need to put some condition on dy like 0<dy<1
Again, with ##L=7## and ##\delta=0.01## that will be unnecessary. I think the largest value of dy we get will be ##49/93##, which is less than 1.

Lastly, to speed things up, you may wish to put a minimum of ##\varepsilon=0.0001## on each dy that is calculated, otherwise the first dy in the outer loop will be around ##10^{-6}##, which may make it quite slow. Since that minimum has already been applied implicitly in the first subinterval, accuracy will not be materially degraded by applying it to intervals with larger values of y.
 
  • #33
Oops, you are right. I updated the code as follows, yet it is not working correctly:

Python:
import math
K = 3
M = 2
dx = 0.001
dy = 0.001
delta = 0.001

y_Upper_lim = 7

#[1-FX(x)]^{(K-M)}
def F(x):
    return math.exp(-x*(K-M))

#fX(x) or fY(y)
def f(u):
    return math.exp(-u)

def G(z, m, x_prev_lim, y_current_lower_lim):
    res = 0
    #The lower limit of the integrations over x
    ux = x_prev_lim
    uy = y_current_lower_lim
    while uy < y_Upper_lim:
        while ux < uy*z:
            print(f'ux = {ux}, uy*z = {uy*z}, uy = {uy}, and dy = {dy}')
            if m == M:
                res += F(ux)*f(ux)*f(uy)*dx*dy
                ux += dx
            else:
                res += G(z - (ux/uy), m+1, ux, ux/z)*f(ux)*f(uy)*dx*dy
                ux += dx
        uy += dy
    return res

(math.factorial(K)/math.factorial(K-M))*G(1, 1, 0, 0.0001)

I put a print statement inside the while loop over ux, and it seems that ux is always close to uy*z. Is this what is supposed to happen? Also, from that print statement, controlling dy is tricky: when it is very small (1e-6) we need to set it to, say, 0.0001, but it grows larger than 0.0001 at some point, in which case we probably want to make it smaller. So I fixed it for now. The numerical result for the above parameters is 3.3406241938703932e-09, while the MC simulation for the same parameters gives 0.0489, which makes much more sense!
 
  • #34
Rather than moving the calculation of dy to inside the 'while uy...' loop, it has been moved to the top, where it sets dy to be constant at 0.001. It needs to be inside the aforesaid loop, and to use the formula that relates it to delta and uy.

Given epsilon = 0.0001, the statement is:

dy = max(epsilon, uy*((1/(1-uy*delta)) - 1))

EngWiPy said:
it seems that ux is always close uy*z
For uy close to 0 that should usually be the case. But I would not expect it for uy>1.
EngWiPy said:
but it grows larger than 0.0001 at some point, in which case we probably want to make it smaller
No, that would just slow the program down with no material improvement in accuracy. See second-last para of post #32.
 
  • #35
I did exactly this: dy = max(epsilon, uy*((1/(1-uy*delta)) - 1)), but I noticed that dy grows large, which I thought wouldn't give accurate results. Still, this doesn't give the expected result; there must be some logical error. Now I have this code:

Python:
import math
K = 3
M = 2
dx = 0.001
#dy = 0.001
delta = 0.001

y_Upper_lim = 7

#[1-FX(x)]^{(K-M)}
def F(x):
    return math.exp(-x*(K-M))

#fX(x) or fY(y)
def f(u):
    return math.exp(-u)

def G(z, m, x_prev_lim, y_current_lower_lim):
    res = 0
    #The lower limit of the integrations over x
    ux = x_prev_lim
    uy = y_current_lower_lim
    while uy < y_Upper_lim:
        dy = max(0.0001, uy*((1/(1-uy*delta)) - 1))
        while ux < uy*z:
            print(f'ux = {ux}\n z = {z}\n uy*z = {uy*z}\n uy = {uy}\n ux/uy = {ux/uy}\n dy = {dy}\n')
            if m == M:
                res += F(ux)*f(ux)*f(uy)*dx*dy
                ux += dx
            else:
                res += G(z - (ux/uy), m+1, ux, ux/z)*f(ux)*f(uy)*dx*dy
                ux += dx
        uy += dy
    return res

(math.factorial(K)/math.factorial(K-M))*G(1, 1, 0, 0.0001)

and the first and last four print statements in the while loop over ux are as follows:

ux = 0
z = 1
uy*z = 0.0001
uy = 0.0001
ux/uy = 0.0
dy = 0.0001

ux = 0
z = 1.0
uy*z = 0.0001
uy = 0.0001
ux/uy = 0.0
dy = 0.0001

ux = 0.001
z = 1.0
uy*z = 0.0010000000000000002
uy = 0.0010000000000000002
ux/uy = 0.9999999999999998
dy = 0.0001

ux = 0.002
z = 1.0
uy*z = 0.0020000000000000005
uy = 0.0020000000000000005
ux/uy = 0.9999999999999998
dy = 0.0001

...

ux = 6.96200000000066
z = 1
uy*z = 6.965954503188108
uy = 6.965954503188108
ux/uy = 0.9994323099317346
dy = 0.04886491290055385

ux = 6.96300000000066
z = 1
uy*z = 6.965954503188108
uy = 6.965954503188108
ux/uy = 0.9995758652764534
dy = 0.04886491290055385

ux = 6.96400000000066
z = 1
uy*z = 6.965954503188108
uy = 6.965954503188108
ux/uy = 0.9997194206211721
dy = 0.04886491290055385

ux = 6.965000000000661
z = 1
uy*z = 6.965954503188108
uy = 6.965954503188108
ux/uy = 0.9998629759658909
dy = 0.04886491290055385

which shows that ux, uy, and uy*z are almost equal from ##ux\neq 0## onward (and thus ##ux/uy \approx 1##), and that ##z## remains 1 (as it was passed) throughout the execution of the code, across all the recursive calls! This shouldn't happen: z should change with each recursive call.
 
  • #36
The problem was that ux was not reset to x_prev_lim for each uy value, and there was also a mistake in the MC simulation. Now the code works perfectly and agrees with the MC simulation. Thanks for your responses @andrewkirk.
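For readers following along, the fixes accumulated over posts #32–#36 (dy recomputed inside the loop with the ##\varepsilon## floor, the recursive call's last argument ux/z rather than z/ux, and ux reset for each uy) can be collected into one runnable sketch. The step sizes below are deliberately coarser than the thread's (dx = delta = 0.001, y upper limit 7) so that it finishes quickly; post #36 reports agreement with the MC value of about 0.0489 at the finer settings.

```python
import math

K = 3            # total number of X variables
M = 2            # number of ratios in the sum
dx = 0.05        # x step; the thread used 0.001 (coarsened here for speed)
delta = 0.05     # relative y step; the thread used 0.001
epsilon = 0.01   # floor on dy (post #32); the thread used 0.0001
y_Upper_lim = 5  # truncation of the y range; the thread used 7

# [1 - FX(x)]^(K-M) for unit-rate exponentials
def F(x):
    return math.exp(-x*(K - M))

# fX(x) = fY(y) = e^{-u}
def f(u):
    return math.exp(-u)

def G(z, m, x_prev_lim, y_current_lower_lim):
    res = 0.0
    uy = y_current_lower_lim
    while uy < y_Upper_lim:
        # dy recomputed for each uy, with the epsilon floor (posts #32, #34)
        dy = max(epsilon, uy*((1/(1 - uy*delta)) - 1))
        # ux reset for every uy value (the fix found in post #36)
        ux = x_prev_lim
        while ux < uy*z:
            if m == M:
                res += F(ux)*f(ux)*f(uy)*dx*dy
            else:
                # last argument is ux/z, not z/ux (post #32)
                res += G(z - ux/uy, m + 1, ux, ux/z)*f(ux)*f(uy)*dx*dy
            ux += dx
        uy += dy
    return res

prob = (math.factorial(K)/math.factorial(K - M))*G(1, 1, 0.0, 0.0001)
print(prob)
```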

For the integration over the Xs, dx is constant, so applying Simpson's rule as an optimization is straightforward: we simply compute dx = (uy*z - x_prev_lim)/L, where L is an even number (say 600) that represents the number of subintervals in the numerical integration.
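The constant-step case described above can be sketched with a generic composite Simpson routine (the integrand and interval here are placeholders with a known answer, not the thread's actual integrand):

```python
import math

def simpson(g, a, b, L=600):
    # Composite Simpson's rule; L must be even, so bump it if needed.
    if L % 2:
        L += 1
    h = (b - a)/L
    s = g(a) + g(b)
    for i in range(1, L):
        s += (4 if i % 2 else 2)*g(a + i*h)
    return s*h/3

# Example with a known answer: integral of e^{-2x} over [0, 1]
approx = simpson(lambda x: math.exp(-2*x), 0.0, 1.0)
exact = (1 - math.exp(-2))/2
```

As for the variable y step: since the thread's step rule gives ##dy \approx y^2\delta## for small ##\delta##, the substitution ##u = -1/y## (so ##du = dy/y^2##) makes the grid approximately uniform in ##u##, after which Simpson's rule applies directly. This is a suggested workaround, not something established in the thread.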

However, for the Ys the step size changes within the same integral. How can we apply Simpson's rule there and make sure that the number of subintervals is even?
 
