# Median geometric distribution

1. Mar 11, 2012

### Max.Planck

1. The problem statement, all variables and given/known data
How do you find the median of the geometric distribution?

2. Relevant equations
M is a median if P(X >= M) >= 1/2 and P(X <= M) >= 1/2.

3. The attempt at a solution
I have found this inequality using the geometric series:
(m-1)*log(1-p) >= log(1/2)

2. Mar 11, 2012

### Max.Planck

anyone??

3. Mar 11, 2012

### Ray Vickson

RGV

4. Mar 11, 2012

### Max.Planck

$$P(X \ge M) = \sum_{k=M}^{\infty}p(1-p)^{k-1} = p(1-p)^{M-1}\sum_{k=0}^{\infty}(1-p)^k = (1-p)^{M-1}$$

Now I set:

$$(1-p)^{M-1} \ge 1/2 \implies (M-1)\log(1-p) \ge \log(1/2) \implies M \ge \log(1/2)/\log(1-p)+1$$

Now for the other part:
$$P(X \le M) = 1-P(X > M) = 1-\sum_{k=M+1}^{\infty}p(1-p)^{k-1} = 1-(1-p)^M$$

Now I set:

$$1-(1-p)^{M} \ge 1/2 \implies M\log(1-p) \le \log(1/2) \implies M \le \log(1/2)/\log(1-p)$$

Is this correct and how do I combine these solutions?

5. Mar 12, 2012

### Ray Vickson

You have the >= and <= backwards: you need P{X > M} <= 1/2, etc. The way I would do it is to find a continuous solution of (1-p)^m = 1/2, then take M = ceiling(m), where ceiling(w) = smallest integer >= w.
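A quick numerical check of the round-up idea, as a sketch rather than a definitive implementation. It assumes X takes values 1, 2, ... with P(X = k) = p(1-p)^(k-1), as in post 4, and solves 1 - (1-p)^m = 1/2, i.e. P(X <= m) = 1/2, for real m before rounding up. Since log(1-p) < 0, dividing by it reverses an inequality, so 1 - (1-p)^M >= 1/2 is the same as M >= log(1/2)/log(1-p). The function names are made up for illustration.

```python
import math


def geometric_median(p: float) -> int:
    """Round-up candidate for the median of X ~ Geometric(p) on {1, 2, ...}."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    # Solve (1 - p)**m = 1/2 for real m, then round up to the next integer.
    m = math.log(0.5) / math.log(1.0 - p)
    return math.ceil(m)


def is_median(M: int, p: float) -> bool:
    """Check P(X >= M) >= 1/2 and P(X <= M) >= 1/2 with the closed forms from post 4."""
    upper_tail = (1.0 - p) ** (M - 1)  # P(X >= M)
    cdf = 1.0 - (1.0 - p) ** M         # P(X <= M)
    return upper_tail >= 0.5 and cdf >= 0.5


if __name__ == "__main__":
    for p in (0.1, 0.3, 0.9):
        M = geometric_median(p)
        print(f"p={p}: M={M}, satisfies both inequalities: {is_median(M, p)}")
```

Values of p for which log(1/2)/log(1-p) is exactly an integer (for example p = 1/2) are the boundary cases where the median is not unique, and floating-point rounding can matter there.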

RGV

6. Mar 12, 2012

### Max.Planck

Actually it is P(X>=M) >= 1/2 and P(X<=M) >= 1/2. See http://en.wikipedia.org/wiki/Median.

7. Mar 12, 2012

### Ray Vickson

If that is what the article says, it is wrong. I suggest you look in a real book.

RGV

8. Mar 12, 2012

### Max.Planck

My book says the same; however, the way you described it seems to lead to the right answer...

9. Mar 12, 2012

### Ray Vickson

Actually, my statement may have been a bit harsh: saying "not useful" rather than "wrong" may have been better.

The point is that there is a bit of an issue defining the "median" for some discrete cases, and some sources regard entire intervals [a,b] as the median when F(a) = 1/2 (and b > a is the next point in the distribution). Other sources define the median as the solution of the optimization problem min_m E|X-m|; that would lead to a median interval in some cases. I think it is better to deal with strict inequalities: the interval [Mmin,Mmax] (Mmin <= Mmax) is a median interval if P{X < Mmin} < 1/2 and P{X > Mmax} < 1/2. Of course, if Mmin = Mmax = M we have P{X > M} < 1/2 and P{X < M} < 1/2. Some books and papers would say that we must pick a particular point, so would reject the median-interval notion, but would choose a point M in [Mmin,Mmax] in some way. If you do that, it would be false to say P{X < M} < 1/2 and P{X > M} < 1/2.
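To see a median interval appear, here is a small sketch that lists every integer M satisfying the two weak inequalities P(X >= M) >= 1/2 and P(X <= M) >= 1/2 from post 1. It assumes the geometric distribution on {1, 2, ...} and uses exact fractions so that the boundary case F(a) = 1/2 is not blurred by rounding. The function name and the `search_limit` cutoff are illustrative choices.

```python
from fractions import Fraction


def median_points(p: Fraction, search_limit: int = 1000) -> list[int]:
    """Integers M in 1..search_limit with P(X >= M) >= 1/2 and P(X <= M) >= 1/2."""
    half = Fraction(1, 2)
    q = 1 - p
    points = []
    for M in range(1, search_limit + 1):
        upper_tail = q ** (M - 1)  # P(X >= M), computed exactly
        if upper_tail < half:
            break  # the tail only shrinks as M grows, so no later M can qualify
        if 1 - q ** M >= half:     # P(X <= M), computed exactly
            points.append(M)
    return points


if __name__ == "__main__":
    print(median_points(Fraction(1, 2)))   # two points: F(1) is exactly 1/2
    print(median_points(Fraction(3, 10)))  # a single point
```

With p = 1/2 both 1 and 2 qualify, which is the F(a) = 1/2 situation described above; with p = 3/10 only M = 2 does.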

Anyway, the method I gave is OK as long as the median interval shrinks to a single point; if you look at the graph of the cdf you can even see why it works: at the median M we have F(M-1) < 1/2 and F(M) > 1/2. If you imagine drawing the graph y = F(x), but with vertical line segments inserted at the jump points, then the solution of the equation 1/2 = F(x) is at the point M where the vertical segment from F(M-1) to F(M) cuts the value 1/2; that is, it is at the point M where F(M-1) < 1/2 and F(M) > 1/2. You would get this same value by joining up the points (j,F(j)) by a smooth curve y = H(x) (with H(j) = F(j) for all j) and then rounding up the solution x of 1/2 = H(x).
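As a worked instance of that last sentence, take the particular smooth curve H(x) = 1 - (1-p)^x. This is only one convenient choice (an assumption, not the only possible interpolant); it agrees with the cdf from post 4 at every positive integer. Then

$$H(x) = 1-(1-p)^x = \tfrac{1}{2} \iff x = \frac{\log(1/2)}{\log(1-p)}, \qquad \lceil x \rceil = \left\lceil \frac{\log(1/2)}{\log(1-p)} \right\rceil$$

For p = 0.3 this gives x ≈ 1.943, which rounds up to 2, and indeed F(1) = 0.3 < 1/2 < F(2) = 0.51.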

RGV

Last edited: Mar 12, 2012