Probability problem that is driving me crazy.

  • Thread starter Thread starter Karlx
  • Start date Start date
  • Tags Tags
    Probability
AI Thread Summary
The discussion centers around a challenging probability problem from "An Introduction to Probability and Statistics" by V.K. Rohatgi and A.K. Ehsanes. Participants are trying to clarify the problem's details, particularly the upper limit of the summation and the nature of the solution provided in the book, which some believe to be incorrect. Key insights involve visualizing the problem as a random walk and using combinatorial methods to calculate probabilities related to paths on a grid. The final consensus suggests that the problem can be approached through combinatorial counting rather than relying solely on random walk theory. Overall, the conversation emphasizes the complexity of the problem and the collaborative effort to derive a correct solution.
Karlx
Messages
75
Reaction score
0
Hi everybody.
I am reading the book "An introduction to Probability and Statistics", by V.K.Rohatgi and A.K. Ehsanes. I recommend it very much.

I am trying to solve on of the problems but I have all tried, and I am not able to find the proper solution.
The book gives the solution, but even with this help, I am not able to find the adequate reasoning to achieve it.
If somebody could help me I would appreciate very much.
Thanks.

Here is the problem:
 

Attachments

  • Problem.jpg
    Problem.jpg
    40.9 KB · Views: 551
Physics news on Phys.org
Hi, Karlx

A few questions:

1) Is the upper limit of the summation n-k or n+k? It doesn't show up in the picture for me.
2) Have you obtained a solution that is different from the one given in the book?
3) If you haven't solved the problem yet, have you recognized it as a problem about random walks?

I asked the second question because I think the answer given is wrong. You should be able to multiply the solution by 22n and get a whole number. That doesn't appear to be the case with the solution given.

EDIT:

The answer given in the attachment gives an OK approximation to the correct answer for some values of n and k. I have no idea why. I've been trying to come up with possible interpretations for that answer, but none of it makes much sense. Here are some values comparing the attached answer and what I think is the correct answer:

k=3, n=10
Answer in attachment: .939293 , Correct answer: .884682

k=4, n=11
Answer in attachment: .972406 , Correct answer: .947521

k=1, n=7
Answer in attachment: .575758 , Correct answer: .576050
This next one is less good...

k=1, n=10
Answer in attachment: .605882 , Correct answer: .496555

For that last one, the attached answer value moved in the wrong direction. It should be less probable that nobody has to wait for change when there are 20 kids in line rather than 14, given the ticket seller started out with only 2 quarters. I assumed the sum goes to n-k, because n+k would lead to negative probabilities.
 
Last edited:
To solve the problem, it helps to visualize it graphically. Let Y be the number of quarters the seller has at a particular moment and X be the number of tickets sold. The problem is asking for the number of paths on the X-Y grid from (0, 2k) to (2n, 2j) that do not go below the X- axis (though they may touch it). Every time you move to the right one (another ticket sold), you either move up or down one (gain or lose a quarter for change). I was trying to determine if you knew all this already, so I would know where to start.

An essential trick for finding the number of paths between points A and B that do touch or cross some horizontal line below them is this: It is the same as the total number of paths between A' and B, where A' is the reflection of A through the horizontal line.
 
Last edited:
Hi, techmologist.

First of all, thank you very much for your remarks and hints.
And sorry for my delay in answering to you.

As I have problems editing directly from this website (I don't know why), I attach a document with my answers.
 

Attachments

You are welcome. Unfortunately, I cannot upen the .doc file you attached. I'm not sure if it's a problem on my end or yours. My old computer has WordPad, but not Microsoft Word. I also have trouble editing posts on here. I think it is a combination of slow internet connection and an obsolete browser. Can you compose your post offline and then copy and paste it here? That's what I do.

Also, you can use the LaTeX typesetting to write formulas. For example, the answer attached in your first post would look like this:

1-\frac{\sum_{i=1}^{n-k}\binom{2i}{i}}{\binom{2n}{n-k}}

Hold the cursor over the formula to see the code that generates it. If you have an old internet browser like I do, you may not be able to see the formula. It appears as a black box with some white specks in it.
 
Last edited:
I tried to typing the expressions in LaTEx, but my browser gives me problems with this, too, although I can visualize the expressions without problem.

I try to attach the document in PDF format.
Can you see it?
 

Attachments

Hi, techmologist.

First of all, thank you very much for your remarks and hints.
And sorry for my delay in answering to you.

Now, let me go through your questions and explain to you my unsuccessful attempts to obtain an expression for the asked probability.

Your questions:

1) Is the upper limit of the summation n-k or n+k? It doesn't show up in the picture for me.

The upper limit of the summation is n-k.
Like you, I’ve tried to give some interpretations to the expression given in the book, but all of them seem to be meaningless.
On the other hand, it is not the first time that a solution given in this book seems to be wrong, but after reconsidering the question more carefully, it becomes completely true.
But in this case, and after some “empirical” examples worked out in an Excel worksheet, I think this one is wrong.

2) Have you obtained a solution that is different from the one given in the book?

I tried it.
Let me explain how my reasoning has been:

Let Aj be the event “the j-th child has to wait for change” and let Bj be the event “the j-th child no needs to wait for change”.

Then the probability that we are asking for is

p(\bigcap_{j=1}^{2n}B_{j})=1-p(\bigcup_{j=1}^{2n}A_{j}) (1)

Since Aj is the complementary set of Bj, for all j=1,...,2n.

Working out (1), I’ve obtained the following expression:

1-\sum_{m=0}^{n-k-1}\binom{2k+2m}{m}\frac{2^{2(n-m-k)-1}}{2^{2n}} (2)

But the probabilities calculated from this expression are not exact, but only an approximation.
The mistake is that I have assumed that

p(\bigcup_{j=1}^{2n}A_{j})=\sum_{j=1}^{2n}pA_{j} (3)

But this is not true, since events Aj are not independent.
So I must use the Principle of Inclusion-Exclusion.
But then the calculation becomes much more complicated, and I am not sure this way could be worthy.

At this point, I’ve tried to calculate the probability from the first term in expression (1), but with no success at all.

Right now, I don’t know any more ways to solve the question.
So, I would appreciate very much if you could suggest me any solutions, or could indicate me where my reasoning has been erroneous, or if I must keep on trying to obtain an expression from the principle of inclusion-exclusion, or something else.

3) If you haven't solved the problem yet, have you recognized it as a problem about random walks?

No, I didn’t recognize it as a problem about random walks.
In fact, I didn’t know anything about them.

The graphic visualization that you suggest is really very helpful, in order to “see” the problem graphically.
If I turn the graphic 90º clockwise, I see the Pascal’s triangle, with a vertical axis (Y=0 in your graphic), cutting a left portion of it (the paths below the Y=0 axis in your graphic).

I have drawn the graphic for n=4 and k=2, and it has been very easy to calculate “empirically” the number of paths above the Y=0 axis. My result is 238. So the probability in this case would be 238/256 = 0.9297. Do you agree with it?

Up to now, I have not worked this graphical way enough to obtain a general expression.

Once more, thank very much for your help.
 
Last edited:
Yes, I can see that one (pdf). I'm reading through it now.

EDIT: Oh, good you are posting it directly on here. That will make things easier.
 
Thanks again, techmologist.
Now I can LaTEx typing directly on the website.
Much more easy.
 
  • #10
The Principle of Inclusion-Exclusion should work in principle, but as you found out, it is too complicated for this problem. There is an easier way...counting paths on the grid. Yes, I also get 238 paths for k=2, n=4. So you have the idea of it.

The first thing to realize is that from the origin (0,0) to the point (r, s), there are

\binom{r}{\frac{r-s}{2}}

paths. This is that Pascal's triangle pattern you were seeing. Where r-s is an odd number, this is taken to be zero, because in that case there are no paths of length r with an excess of s "successes" (upward movements). You are looking for the number of paths between (0,2k) and (2n, 2j), for j=0,1,...n+k, that do not go below Y=0. That is, they do not touch Y=-1.

So for each j, you find all possible paths from (0, 2k) to (2n, 2j), but then subtract out those paths that touch Y=-1. To find the paths that touch Y=-1, reflect (0,2k) through the line Y=-1 to (0, -2k-2). Count the number of paths from (0, -2k-2) to (2n, 2j).

EDIT:

I only recently started learning about random walks myself. This problem is a slight variation on the Gambler's Ruin problem: find the probability a gambler betting on a fair game loses his starting bankroll in 2n steps or less. In your problem, though, you aren't necessarily "ruined" when you get to zero, because the next customer might pay with a quarter. So the "ruin" level occurs at Y=-1.

For 1-dimensional random walks, you can't do better than to read chapter 3 of William Feller's An Introduction to Probability Theory and its Applications, Volume I. 2nd or 3rd edition. It has some really neat stuff.

Feel free to ask for more hints if you get stuck :).
 
Last edited:
  • #11
Hi techmologist.
Finally your hints have been very helpful.

Following them, the probability we were looking for is

<br /> p=\frac{1}{2^{2n}}\left(A-B\right)<br />

where

A=Number of paths that lead to a point (2n,2j), for j=0,1,...,n+k
B=Number of A paths that touch the Y=-1 axis

With the graphical visualitzation, it is not difficult to obtain

<br /> A=\sum_{s=0}^{n+k}\binom{2n}{n-k+s}<br />

and, using the symmetry of the figure through the Y=-1 axis,

<br /> B=\sum_{s=1}^{n-k}\binom{2n}{n-k-s}<br />

Working out the expression for p, finally we arrive at

<br /> p=1 - \frac{1}{2^{2n-1}} \sum_{m=0}^{n-k-1}\binom{2n}{m}<br />

What surprises me a little is that this problem appears in Rohatgi's book in the first chapter, after explaining the Probability axioms, the Principle of Inclusion-Exclusion and Combinatorics.
I mean that it seems that this problem could be solved only with these primary concepts, and with no need of random walks.

It is for that reason that I tried, with no success, to solve it from a more "elementary" point of view.
Do you know some way to solve it in that way, not using random walks nor graphical representations ?

Thank you very much for your help.
I am really tempted to get Feller's book. I have some good recommendations about it.
 
Last edited:
  • #12
That is correct! Good job :smile:. I'm glad I could help.

I don't know of a more elementary way to do this problem. If you look at your solution, it was all combinatorics, i.e. counting. It wasn't really necessary to know anything about random walks to do this. However, I brought up random walks in case you had already studied them, since this problem is about a random walk in disguise. If you had already studied the gambler's ruin problem, I was just hinting that this is basically the same thing.

The only way I can see to use PIE is the way you were trying to do it: find the probability that at least one child has to wait. I haven't tried to work it out that way, but it looks messy. That doesn't mean it is impossible. You might try it, now that you have a correct answer to shoot for.

One thing about something you said in an earlier post--that you had to use PIE because the events aren't independent. I think you meant to say because they aren't mutually exclusive, which is different. Mutually exclusive events are maximally statistically dependent. If one occurs, you know for certain the other(s) did not.

I imagine there are several other ways to do this problem. One less elementary way would be to use Z-transforms (generating functions). I'd be surprised if this directly resulted in the same nice expression you derived above. It would probably take some manipulation to get them to match.
 
Back
Top