Probability of getting a seat in the train car

AI Thread Summary
The discussion revolves around calculating the probability that 150 passengers can be seated in five train cars, each with N seats. The main focus is on determining the probability that no car is chosen by more than N passengers, which involves understanding the multinomial distribution rather than a hypergeometric one. Participants explore various methods, including approximations using normal distributions and geometric approaches, to derive the probability and establish conditions for N that yield a probability greater than 0.9. The complexity of the problem is acknowledged, with suggestions for numerical methods and approximations to simplify calculations. The conversation highlights the challenges in achieving a closed-form solution and the necessity of considering dependencies between the choices of passengers.
bznm
Messages
181
Reaction score
0

Homework Statement


A train has got five train cars, each one with N seats. There are 150 passengers who randomly choose one of the cars. What is the probability that everyone will get a seat?

I think that what is asking me is "what is the probability that each wagon is chosen by no more than N passengers"?

The probability of having n<N people in *a* wagon is given by
p_N=\sum\limits_{n=0}^N \binom{150}{n}0.2^n 0.8^{150-n}.

I thought of the answer being p_N^5.. but I think that the events

- wagon a is chosen by no more than N passengers
- wagon b is chosen by no more than N passengers
- wagon c is chosen by no more than N passengers
- wagon d is chosen by no more than N passengers
- wagon e is chosen by no more than N passengers

are far from being independent... So what could I do?
Thanks a lot
 
Physics news on Phys.org
bznm said:

Homework Statement


A train has got five train cars, each one with N seats. There are 150 passengers who randomly choose one of the cars. What is the probability that everyone will get a seat?

I think that what is asking me is "what is the probability that each wagon is chosen by no more than N passengers"?

The probability of having n<N people in *a* wagon is given by
p_N=\sum\limits_{n=0}^N \binom{150}{n}0.2^n 0.8^{150-n}.

I thought of the answer being p_N^5.. but I think that the events

- wagon a is chosen by no more than N passengers
- wagon b is chosen by no more than N passengers
- wagon c is chosen by no more than N passengers
- wagon d is chosen by no more than N passengers
- wagon e is chosen by no more than N passengers

are far from being independent... So what could I do?
Thanks a lot
I suggest that you are not expected to get an answer in closed form. It should suffice to write it as a sum. If n1 choose the first wagon, how many are left for the next?
 
Thanks a lot for the reply!
I thought of writing the number of interesting combinations as

\binom{150}{n_1} \binom{150-n_1}{n_2} \binom{150-n_1-n_2}{n_3} \binom{150-n_1-n_2-n_3}{n_4} \binom{150-n_1-n_2-n_3 -n_4}{n_5}
where n_1, n_2, n_3, n_4, n_5 < N and n_1 + n_2 +n_3 + n_4 + n_5=150
However I believe that the exercise does want me to find a closed form. In fact, it also asks which N implies probability > 0.9...
 
bznm said:
However I believe that the exercise does want me to find a closed form
Consider a simple example. Two wagons of capacity 100. We have ##\Sigma_{50}^{100}## ##^{150}C_r2^{-150}##, i.e. a truncated sum of the binomial expansion. Isn't that known not to have closed form?
For the numeric part of the question, maybe you can use an approximation.
 
haruspex said:
Consider a simple example. Two wagons of capacity 100. We have ##\Sigma_{50}^{100}## ##^{150}C_r2^{-150}##, i.e. a truncated sum of the binomial expansion. Isn't that known not to have closed form?
For the numeric part of the question, maybe you can use an approximation.
Mhh how could I approximate? I mean, I know that binomial can be approximated with normals.. but here I have a sort of hypergeometric multivariate distribution, isn't it?
I think this exercise exceed my course level.. I don't know how to proceed :(
 
bznm said:

Homework Statement


A train has got five train cars, each one with N seats. There are 150 passengers who randomly choose one of the cars. What is the probability that everyone will get a seat?

I think that what is asking me is "what is the probability that each wagon is chosen by no more than N passengers"?

The probability of having n&lt;N people in *a* wagon is given by
p_N=\sum\limits_{n=0}^N \binom{150}{n}0.2^n 0.8^{150-n}.

I thought of the answer being p_N^5.. but I think that the events

- wagon a is chosen by no more than N passengers
- wagon b is chosen by no more than N passengers
- wagon c is chosen by no more than N passengers
- wagon d is chosen by no more than N passengers
- wagon e is chosen by no more than N passengers

are far from being independent... So what could I do?
Thanks a lot

Be careful to distinguish between ##n < N## and ##n \leq N##. Which one do you want in this problem?

And no, you do not have a multivariate hypergeometric distribution, because you are not "sampling without replacement". In this problem each of the 150 passengers randomly chooses a car, independently of what any of the other passengers do. So, you have a multivariate distribution alright, but it is a so-called "multinomial distribution" (which is an extension of the binomial to more than two categories). So, in your problem the probability of ##k_i## passengers choosing cars ##i = 1,2,3## (and thus ##k_4 \equiv 150-k_1-k_2-k_3## choosing car 4) is
$$P(k_1,k_2,k_3) = \frac{150!}{k_1! k_2! k_3! (150-k_1-k_2-k_3)!} \left(\frac{1}{4}\right)^{150}.$$
Assuming that ##N < 150##, we want ##\sum_{(k_1,k_2,k_3) \in S} P(k_1,k_2,k_3)##, where ##S## is the set
$$S = \{ (k_1,k_2,k_3): 0 \leq k_i \leq N, \; i = 1,2,3, \;
\text{and} \;150-k_1-k_2-k_3 \leq N \}.$$

One can approximate a multinomial distribution by a multinormal distribution with the same mean vector and variance-covariance matrix.
 
Ray Vickson said:
Be careful to distinguish between ##n < N## and ##n \leq N##. Which one do you want in this problem?

And no, you do not have a multivariate hypergeometric distribution, because you are not "sampling without replacement". In this problem each of the 150 passengers randomly chooses a car, independently of what any of the other passengers do. So, you have a multivariate distribution alright, but it is a so-called "multinomial distribution" (which is an extension of the binomial to more than two categories). So, in your problem the probability of ##k_i## passengers choosing cars ##i = 1,2,3## (and thus ##k_4 \equiv 150-k_1-k_2-k_3## choosing car 4) is
$$P(k_1,k_2,k_3) = \frac{150!}{k_1! k_2! k_3! (150-k_1-k_2-k_3)!} \left(\frac{1}{4}\right)^{150}.$$
Assuming that ##N < 150##, we want ##\sum_{(k_1,k_2,k_3) \in S} P(k_1,k_2,k_3)##, where ##S## is the set
$$S = \{ (k_1,k_2,k_3): 0 \leq k_i \leq N, \; i = 1,2,3, \;
\text{and} \;150-k_1-k_2-k_3 \leq N \}.$$

One can approximate a multinomial distribution by a multinormal distribution with the same mean vector and variance-covariance matrix.
Thanks for the correction! The fact that if passenger chooses the first wagon, then it doesn't choose the others made me think about "sampling without replacement"...
 
bznm said:
Thanks for the correction! The fact that if passenger chooses the first wagon, then it doesn't choose the others made me think about "sampling without replacement"...
For what it's worth, I slogged through an approximation using Xi as the actual number in the ith car minus the average number (i.e. 30). I got the probability that all are seated as ##K\Sigma _{\max\left[x_i\right]<N-30}e^{-\Sigma x_i^2/60}##, where K is the probability of exactly 30 in each. Cross-checked with a spreadsheet and it seems fairly accurate.

This suggests a geometric approach involving n-dimensional spherical shells.
 
bznm said:
Thanks for the correction! The fact that if passenger chooses the first wagon, then it doesn't choose the others made me think about "sampling without replacement"...

Sorry: my response took the number of cars to be 4, not 5. For 5 cars the expression is similar.
 
  • #10
haruspex said:
For what it's worth, I slogged through an approximation using Xi as the actual number in the ith car minus the average number (i.e. 30). I got the probability that all are seated as ##K\Sigma _{\max\left[x_i\right]<N-30}e^{-\Sigma x_i^2/60}##, where K is the probability of exactly 30 in each. Cross-checked with a spreadsheet and it seems fairly accurate.

This suggests a geometric approach involving n-dimensional spherical shells.
I looked further into the geometric approach. It works in principle, but gets tangled up with integrals over spherical end caps (and even intersections of such) in five dimensions.

If we pretend naïvely that the occupancies are independent, we can get a lower bound. It's a lower bound because having chosen those distributions in which everyone in the first carriage gets a seat, the expected occupancy of that carriage is below 30. That reduces the probability that other carriages will seat all their customers. Anyway, I think you'll find this bound is about 40.
To remove the bias, we can put a lower limit on the occupancy of the first carriage, i.e. require its occupancy to be from A to R. If that keeps its average occupancy at or above 30, we can claim not to have put upward pressure on the remaining carriages. (I have a sort of argument that says this should produce an upper bound on R.)
So what we need to find is a pair of numbers, A and R, such that
P(occupancy of first carriage is in [A, R]) > 0.90.2, and
E[occupancy of first carriage given that occupancy range] > 30, and
R is minimised.

With the aid of an online calculator for cumulative probabilities of a binomial distribution, that is not too hard. Or we could use a Gaussian approximation.
 
Last edited:
Back
Top