Dismiss Notice
Join Physics Forums Today!
The friendliest, high quality science and math community on the planet! Everyone who loves science is here!

Puzzling Random Walk Question

  1. Aug 14, 2011 #1
    I've been trying to find the solution to the following problem but it's evaded me thus far.

    Take the classic one dimension random walk scenario. I start at point 0 and can either step forward +1 step or step backwards -1 step (equal probability). I can countinue like this for N steps.

    If I walk N steps, what is the probability that at some point during my walk I will be n steps or more from my starting point (where n<=N)

    For example:

    I walk 100 steps, what is the probability that at some point during my walk I will be +20 steps or more from my starting point
  2. jcsd
  3. Aug 14, 2011 #2


    User Avatar
    Gold Member

    I don't have an answer but a numerical reformulation of the problem that might help get at one. (OR ... it might not).

    Given a random sequence of 0's and 1's, call A the # of 1's. What is the probability that A >= (N+n)/2 ?
  4. Aug 14, 2011 #3
    I would like to change the question slightly to:

    I walk 100 steps, what is the probability that during my walk I am never more than +20 steps from my original starting point.
  5. Aug 14, 2011 #4

    I like Serena

    User Avatar
    Homework Helper

    Welcome to PF, russellhq! :smile:

    What kind of probability distribution do you think this problem has?
  6. Aug 14, 2011 #5
    Hi Serena, from what I've looked at, I would think it would be a normal distribution with some sort of relationship to Pascal's Triangle. Beyond that I'm stumped!
  7. Aug 14, 2011 #6

    I like Serena

    User Avatar
    Homework Helper

    It's called a binomial distribution.

    The bigger your N, the closer it resembles a normal distribution, so you might use the normal distribution to approximate your problem.
    Pascal's Triangle consists of the so called binomial numbers, which are actually used in the binomial distribution.

    In short the binomial distribution (take a http://en.wikipedia.org/wiki/Binomial_distribution" [Broken]) gives the chance on k successes if a coin is flipped N times.
    This chance is [itex]P(k) = \binom N k (\frac 1 2)^N[/itex].
    (Is this notation familiar to you?)

    Perhaps you can try to "map" it onto your steps?
    Last edited by a moderator: May 5, 2017
  8. Aug 14, 2011 #7
    Hi Serena, I don't think that's what I'm looking for.

    An example using the coin flip analogy.

    Say I flip a coin 100 times and run a count of the number of heads and tails. Every time I flip a head, I add 1 to the count and every time I flip a tails I subtract 1 from the count.

    What is the probability that during counting, I make it to +20?

    For this to happen, I need the number of heads to equal the number of tails plus 20 at some point during the count (doesn't have to be only after 100 flips)
  9. Aug 14, 2011 #8

    I like Serena

    User Avatar
    Homework Helper

    Yep, you need the number of heads to exceed the number of tails by 20.
    With 100 tosses that is 60 times head and 40 times tail.
    More specifically you need k=(100+20)/2, or more generally k=(N+steps)/2.

    Did I already convince you that this is what you need? :wink:
  10. Aug 14, 2011 #9
    Thanks Serena, but I don't think I'm making myself clear.

    For example, if at the start of the 100 coin flips, I flip 20 heads in a row, then my condition has been met and I don't need to continue to the end.

    I hope that makes things clearer.
  11. Aug 14, 2011 #10

    I like Serena

    User Avatar
    Homework Helper

    Ah, I see. Sorry. I guess I didn't read your OP properly.

    Still, the random walk is typically a binomial distribution process.
    But if you want to calculate what you ask, the math seems to become rather messy I'm afraid.
    I'd have to puzzle a bit more on the possible sequences that are or are not a match.

    Now I understand phinds's suggestion better.
    Basically his suggestion involves enumerating all possibilities for N steps, and count how many possibilities match your request.
    For small N this can be done, but for large N this becomes too computationally expensive.
    But still, an approximation can be made by rolling a limited number of random numbers and counting the matches (this is called the Monte Carlo method).
  12. Aug 14, 2011 #11
    Monte Carlo is where I am going but I was hoping there would be a nice formula that covered this problem.
  13. Aug 14, 2011 #12

    I like Serena

    User Avatar
    Homework Helper

    Here's an analysis of +2 steps.

    I can build up a Pascal-like triangle:

    Code (Text):

    0:             1 |
    1:           1  1|
    2:         1  2  |1
    3:       1   3  2|
    4:     1   4  5  |2
    5:   1  5   9   5|
    6: 1  6  14  14  |5
    After 2 steps there is 1 path in 4 leading to +2 steps:

    After 4 steps there are 2 new paths from a total of 16 paths leading to +2 steps:
    P=1/4 + 2/16

    After 6 steps there are 5 new paths from a total of 32 paths leading to +2 steps:
    P=1/4 + 2/16 + 5/32
  14. Aug 14, 2011 #13
    In this case, I think that the best thing to do would be to draw some intuition from the reflection principle for Brownian motion. Suppose that X_t is a Brownian motion, and let M_t = max {X_s : s≤t}. Then for any fixed y, P(M_t > y) = 2 P(X_t > y). The argument is as follows: we can condition on the event that M_t > y to obtain that P(X_t > y) = P(X_t > y | M_t > y)P(M_t > y) + P(X_t > y | M_t ≤ y)P(M_t ≤ y). But if M_t ≤ y, then since X_t ≤ M_t, X_t ≤ y and thus P(X_t > y | M_t ≤ y). Now, suppose that M_t ≥ y. Let T = min{s: X_s = y}. Then t ≤ T. Define Y_s = X_(s+T), then Y_s is a Brownian motion starting at y, so P(X_t > y | M_t > y) = P(Y_s > y) = 1/2. Hence P(X_t > y) = 1/2 P(M_t > y) and multiplying both sides by 2 yields the result.

    The exact same reasoning can almost be applied to the discrete case, except that if n and N have the same parity, then there is a nonzero probability that X_N = n, so P(X_N ≥ n | M_N ≥ n) is slightly greater than 1/2. Still for large N, this is negligible, so we may approximate P(M_N ≥ n) closely as 2 P(X_N ≥ n) (and this is exact when N and n have different parity), and that can in turn be approximated using the normal approximation to the binomial distribution.
  15. Aug 15, 2011 #14
    Just an idea, its been years since I've done Probability. Break up the problem into N-n separate events. You could calculate the following:

    Let E be the event that at some point in N moves you are at least |n| from 0. Let Ek be the event that after k moves, you are >= |n| from 0. Then

    [tex]P(E) = \sum_{k= n}^N P(E_k) - I[/tex]

    Where [itex]I[/itex] is that annoying sum/difference of intersections formula if all [itex]E_i[/itex] which I cannot write neatly. If it works, it at least simplifies some of it to the binomial idea given above.
  16. Aug 16, 2011 #15
    I don't think the parity check is necessary, you have P[reach +20] = P[M_N>19] = 2 P[X_N>19] = 2 P[2B-N > 19] exactly, where B is binomial(N,1/2).
  17. Aug 16, 2011 #16
    Thanks everyone for your contributions, but I'm afraid the discussion has passed my level of understanding!

    Is is possible to break this down into something that can be put in a spreadsheet?
  18. Aug 17, 2011 #17


    User Avatar
    Science Advisor

    You could use convolution to do this.

    Typically the convolution problem helps us find the cumulative distribution of the sum of multiple random variables. The convolution gives you the cumulative distribution function and then you can differentiate that to get the pdf

    For example when you get P(X +Y < s) you get a triangle distribution, where X and Y are two uniform distributions i.i.d.

    With this you could apply the convolution theorem to get a CDF in terms of your original i.i.d random walk delta R.V's and then use that to find the probability of your value after n steps to be in some range.

    Also the suggestion of Monte-Carlo is also a good method, especially when the pdf is either unsolvable in an analytic sense, or is just too complicated.
Share this great discussion with others via Reddit, Google+, Twitter, or Facebook