Order Statistics: CDF Calculation for i.i.d. Random Variables

  • Thread starter: EngWiPy
  • Tags: Cdf, Statistics
Summary:
The discussion focuses on calculating the cumulative distribution function (CDF) for the sum of ratios of order statistics derived from independent and identically distributed (i.i.d.) random variables. It begins with the expression for the probability of the sum of random variables Z_k and transitions to the ordered variables G_k, emphasizing the need to adjust the integration limits due to their dependence on order statistics. The complexity of the integrals is highlighted, particularly when considering the constraints imposed by the ordering of the variables. The conversation also touches on the feasibility of numerical evaluation of these integrals using software tools, with suggestions for implementing the calculations in Python or MATLAB. Overall, the thread illustrates the intricate nature of statistical calculations involving order statistics and their implications for probability distributions.
  • #31
Solving for ##dy## yields the following formula:

$$dy = y\left[\frac{1}{1-y\,\delta}-1\right]$$
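(As a check on this step size, and not necessarily the original derivation: this is exactly the ##dy## for which ##\frac{1}{y}-\frac{1}{y+dy}=\delta##, since solving that relation for ##dy## gives ##dy=\frac{y^{2}\delta}{1-y\,\delta}=y\left[\frac{1}{1-y\,\delta}-1\right]##, which is positive precisely when ##0<y\,\delta<1##.)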

To have a positive increment ##dy## we need ##0<y\,\delta<1##. How do we incorporate this into the calculation of ##dy##? In other words, what should ##dy## be when the condition is not met? Can we say

$$dy=\max\left[\rho,\,y\left(\frac{1}{1-y\,\delta}-1\right)\right]$$

where ##\rho = 0.001##, for example? Also, I think we need to put some condition on ##dy##, such as ##0<dy<1##, right?

The code becomes (before applying the conditions on ##dy##)

Python:
import math
K = 3
M = 2
dx = 0.01
delta = 0.01

y_Upper_lim = 7

#[1-FX(x)]^{(K-M)}
def F(x):
    return math.exp(-x*(K-M))

#fX(x) or fY(y)
def f(u):
    return math.exp(-u)

def G(z, m, x_prev_lim, y_current_lower_lim):
    res = 0
    #The lower limit of the integrations over x
    ux = x_prev_lim
    uy = y_current_lower_lim
    dy = uy*((1/(1-uy*delta)) - 1)
    while uy < y_Upper_lim:
        while ux <= uy*z:
            if m == M:
                res += F(ux)*f(ux)*f(uy)*dx*dy
                ux += dx
            else:
                res += G(z - (ux/uy), m+1, ux, z/ux)*f(ux)*f(uy)*dx*dy
                ux += dx
        uy += dy
    return res

(math.factorial(K)/math.factorial(K-M))*G(1, 1, 0.0001, 0.0001)

but ##dy## sometimes becomes negative, and at other times very large and positive (e.g., 1000)!
 
  • #32
The calculation of dy needs to be inside the 'while uy...' loop, otherwise the increment size will be the same throughout that integration.

In the recursive call of G, the last argument needs to be ux/z rather than z/ux.

EngWiPy said:
To have a positive increment dy we need 0<yδ<1. How to incorporate this in the calculation of dy?
That condition will automatically be met in the range being considered. With ##y\le L=7## and ##\delta=0.01##, we'll have ##y\delta \le 0.07##.

EngWiPy said:
Also, I think we need to put some condition on dy like 0<dy<1
Again, with ##L=7## and ##\delta=0.01## that will be unnecessary. I think the largest value of dy we get will be 49/93 which is <1.
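(For reference, at ##y = L = 7## the formula gives ##dy = 7\left[\frac{1}{1-0.07}-1\right] = \frac{0.49}{0.93} = \frac{49}{93}\approx 0.53##.)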

Lastly, to speed things up, you may wish to put a minimum of ##\varepsilon=0.0001## on each dy that is calculated; otherwise the first dy in the outer loop will be around ##10^{-6}##, which may make it quite slow. Since that minimum has already been applied implicitly in the first subinterval, accuracy will not be materially degraded by applying it for intervals with larger values of y.
 
  • #33
Oops, you are right. I updated the code as follows, yet it is not working correctly:

Python:
import math
K = 3
M = 2
dx = 0.001
dy = 0.001
delta = 0.001

y_Upper_lim = 7

#[1-FX(x)]^{(K-M)}
def F(x):
    return math.exp(-x*(K-M))

#fX(x) or fY(y)
def f(u):
    return math.exp(-u)

def G(z, m, x_prev_lim, y_current_lower_lim):
    res = 0
    #The lower limit of the integrations over x
    ux = x_prev_lim
    uy = y_current_lower_lim
    while uy < y_Upper_lim:
        while ux < uy*z:
            print(f'ux = {ux}, uy*z = {uy*z}, uy = {uy}, and dy = {dy}')
            if m == M:
                res += F(ux)*f(ux)*f(uy)*dx*dy
                ux += dx
            else:
                res += G(z - (ux/uy), m+1, ux, ux/z)*f(ux)*f(uy)*dx*dy
                ux += dx
        uy += dy
    return res

(math.factorial(K)/math.factorial(K-M))*G(1, 1, 0, 0.0001)

I put a print statement at the top of the while loop over ux, and it seems that ux is always close to uy*z. Is this what is supposed to happen? Also, from that print statement, controlling dy is tricky: when it is very small (about 1e-6), we need to set it to, say, 0.0001, but it grows larger than 0.0001 at some point, in which case we probably want to make it smaller. So I kept it fixed at a constant value for now. The numerical result for the above parameters is 3.3406241938703932e-09, while the MC simulation for the same parameters gives 0.0489, which makes much more sense!
 
  • #34
Rather than moving the calculation of dy to inside the 'while uy...' loop, it has been moved to the top of the script, where it sets dy to a constant 0.001. It needs to be inside the aforesaid loop, and to use the formula that relates it to delta and uy.

Given epsilon = 0.0001, the statement is:

dy = max(epsilon, uy*((1/(1-uy*delta)) - 1))

EngWiPy said:
it seems that ux is always close uy*z
For uy close to 0 that should usually be the case. But I would not expect it for uy>1.
EngWiPy said:
but it grows larger than 0.0001 at some point, in which case we probably want to make it smaller
No, that would just slow the program down with no material improvement in accuracy. See second-last para of post #32.
 
  • #35
I did exactly this, dy = max(epsilon, uy*((1/(1-uy*delta)) - 1)), but I noticed that dy grows larger, which I thought wouldn't give accurate results. Still, this doesn't give the expected result; there is some logical error. Now I have this code:

Python:
import math
K = 3
M = 2
dx = 0.001
#dy = 0.001
delta = 0.001

y_Upper_lim = 7

#[1-FX(x)]^{(K-M)}
def F(x):
    return math.exp(-x*(K-M))

#fX(x) or fY(y)
def f(u):
    return math.exp(-u)

def G(z, m, x_prev_lim, y_current_lower_lim):
    res = 0
    #The lower limit of the integrations over x
    ux = x_prev_lim
    uy = y_current_lower_lim
    while uy < y_Upper_lim:
        dy = max(0.0001, uy*((1/(1-uy*delta)) - 1))
        while ux < uy*z:
            print(f'ux = {ux}\n z = {z}\n uy*z = {uy*z}\n uy = {uy}\n ux/uy = {ux/uy}\n dy = {dy}\n')
            if m == M:
                res += F(ux)*f(ux)*f(uy)*dx*dy
                ux += dx
            else:
                res += G(z - (ux/uy), m+1, ux, ux/z)*f(ux)*f(uy)*dx*dy
                ux += dx
        uy += dy
    return res

(math.factorial(K)/math.factorial(K-M))*G(1, 1, 0, 0.0001)

and the first and last four print statements from the while loop over ux are as follows:

ux = 0
z = 1
uy*z = 0.0001
uy = 0.0001
ux/uy = 0.0
dy = 0.0001

ux = 0
z = 1.0
uy*z = 0.0001
uy = 0.0001
ux/uy = 0.0
dy = 0.0001

ux = 0.001
z = 1.0
uy*z = 0.0010000000000000002
uy = 0.0010000000000000002
ux/uy = 0.9999999999999998
dy = 0.0001

ux = 0.002
z = 1.0
uy*z = 0.0020000000000000005
uy = 0.0020000000000000005
ux/uy = 0.9999999999999998
dy = 0.0001

...

ux = 6.96200000000066
z = 1
uy*z = 6.965954503188108
uy = 6.965954503188108
ux/uy = 0.9994323099317346
dy = 0.04886491290055385

ux = 6.96300000000066
z = 1
uy*z = 6.965954503188108
uy = 6.965954503188108
ux/uy = 0.9995758652764534
dy = 0.04886491290055385

ux = 6.96400000000066
z = 1
uy*z = 6.965954503188108
uy = 6.965954503188108
ux/uy = 0.9997194206211721
dy = 0.04886491290055385

ux = 6.965000000000661
z = 1
uy*z = 6.965954503188108
uy = 6.965954503188108
ux/uy = 0.9998629759658909
dy = 0.04886491290055385

which show that ux, uy, and uy*z are almost equal from ##ux\neq 0## onward (and thus ux/uy ≈ 1), and that ##z## remains 1 (as it was passed) throughout the execution of the code, for all the recursive calls! This shouldn't happen; z should change with each recursive call.
 
  • #36
The problem was that ux was not reset to x_prev_lim for each uy value, and there was a mistake in the MC simulation. Now it works perfectly, and agrees with the MC simulation. Thanks for your responses @andrewkirk.
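
For reference, here is a minimal sketch of the routine with that fix folded into the code from post #35 (dy recomputed inside the y-loop with the ##\varepsilon## floor, the recursion's last argument ux/z, and ux reset to x_prev_lim for every uy value); the corrected MC simulation is not shown, and the parameter values are the same as before:

Python:
import math
K = 3
M = 2
dx = 0.001
delta = 0.001
epsilon = 0.0001

y_Upper_lim = 7

#[1-FX(x)]^{(K-M)}
def F(x):
    return math.exp(-x*(K-M))

#fX(x) or fY(y)
def f(u):
    return math.exp(-u)

def G(z, m, x_prev_lim, y_current_lower_lim):
    res = 0
    uy = y_current_lower_lim
    while uy < y_Upper_lim:
        #Adaptive y-step, recomputed for the current uy, with a floor of epsilon
        dy = max(epsilon, uy*((1/(1 - uy*delta)) - 1))
        #Reset the lower limit of the x-integration for every uy value
        ux = x_prev_lim
        while ux < uy*z:
            if m == M:
                res += F(ux)*f(ux)*f(uy)*dx*dy
            else:
                res += G(z - (ux/uy), m+1, ux, ux/z)*f(ux)*f(uy)*dx*dy
            ux += dx
        uy += dy
    return res

print((math.factorial(K)/math.factorial(K-M))*G(1, 1, 0, 0.0001))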

For the integration over the Xs, dx is constant, and thus applying Simpson's rule for some optimization is straightforward. We simply compute dx = (uy*z - x_prev_lim)/L, where L is an even number (say 600) of subintervals in the numerical integration.
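
As an illustration of that idea (a sketch with assumed names, not the thread's code): composite Simpson's rule over the inner x-range with an even number of subintervals L. The function simpson_x, the integrand inner, and the sample limits are illustrative assumptions.

Python:
import math

def simpson_x(g, x_lo, x_hi, L=600):
    #Composite Simpson's rule over [x_lo, x_hi] with L (even) subintervals
    if L % 2:
        L += 1              #Simpson's rule needs an even number of subintervals
    dx = (x_hi - x_lo)/L
    total = g(x_lo) + g(x_hi)
    for i in range(1, L):
        total += (4 if i % 2 else 2)*g(x_lo + i*dx)
    return total*dx/3

#Illustrative use with the innermost integrand (the m == M case): F(x)*f(x)
K, M = 3, 2
inner = lambda x: math.exp(-x*(K-M))*math.exp(-x)
print(simpson_x(inner, 0.0, 0.7))   #e.g. x from x_prev_lim = 0 up to uy*z = 0.7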

However, for the Ys the step size changes within the same integral. How can we deal with Simpson's rule here, to make sure that the number of subintervals is even?
 
