Statistics - Finding marginal distribution through integration (I think?)

In summary, the conditional distribution of X given Y = y is binomial and the marginal distribution of X is Poisson with rate pλ.
  • #1
Gullik

Homework Statement



Problem 2
Assume that the number Y of customers entering a store is a Poisson random variable with rate λ.
Let X denote the number of these customers who are women. The probability that a customer
is a woman is denoted by p. Also, assume that all customers enter the store independently.

a) Explain why the conditional distribution of X given Y = y is binomial.

b) Show that the marginal distribution of X is Poisson with rate pλ



Homework Equations



[itex]\Gamma(\alpha)=∫^\infty_0 t^{\alpha-1}e^{-t} dt[/itex]


The Attempt at a Solution


I'm having trouble with b).
I think I should use the relation [itex]P(X=x|Y=y)=\frac{P(X=x,Y=y)}{P(Y=y)}[/itex] and [itex]f(x)=\int^\infty_0 f(x,y)dy[/itex] to get [itex]f(x)=\int^\infty_0 f(x|y)f(y)dy[/itex].

When I then put in the probability distributions I get [itex]f(x)=\int^\infty_0 C_x^y p^x(1-p)^{y-x}\frac{\lambda^y e^{-\lambda}}{y!}dy=\frac{p^x}{x!(1-p)^x}\int^\infty_0 \frac{((1-p)\lambda)^y e^{-\lambda}}{(y-x)!}dy[/itex]

Is that a solvable integral or is there an easier method? I'm thinking of trying to make a substitution so I get a gamma function out of it, but I can't get it right.
 
  • #2
Poisson random variables are discrete: they take only values 0,1,2,3,... so they do not have probability density functions and you cannot integrate. Instead, they have probability mass functions and you have to sum.

Instead of X and Y I will use C = number of customers and W = number of women, within some specified time interval of length t. So, C is Poisson with mean m = r*t (where r = entry rate: your post did not actually give a value here).

Now suppose C = 4 (4 customers enter); can you give the distribution of W? (That would be P{W = k|C = 4} for k = 0,1,2,3,4.) Suppose, instead, that 100 customers enter; can you now give the distribution of W? (That would be P{W = k|C = 100} for k = 0,1,2,...,100.) So, what is P{W = k|C = n} for k = 0,1,2,...,n?

RGV
 
  • #3
Ray Vickson said:
Poisson random variables are discrete: they take only values 0,1,2,3,... so they do not have probability density functions and you cannot integrate. Instead, they have probability mass functions and you have to sum.
Feels stupid

Instead of X and Y I will use C = number of customers and W = number of women, within some specified time interval of length t. So, C is Poisson with mean m = r*t (where r = entry rate: your post did not actually give a value here).
The rate was λ, the copypaste ate it.

Now suppose C = 4 (4 customers enter); can you give the distribution of W? (That would be P{W = k|C = 4} for k = 0,1,2,3,4.) Suppose, instead, that 100 customers enter; can you now give the distribution of W? (That would be P{W = k|C = 100} for k = 0,1,2,...,100.) So, what is P{W = k|C = n} for k = 0,1,2,...,n?

With C=4, will it be
[itex]P(W=k|C=4)=C^4_kp^k(1-p)^{4-k}[/itex]?

So
[itex]P(W=k|C=n)=C^n_kp^k(1-p)^{n-k}[/itex]?

And
[itex]P(W=k)=\sum^\infty_{i=k}P(W=k|C=i)P(C=i)[/itex]?

Or can I say that [itex]\mathrm{bin}(n,p)\approx \mathrm{poiss}(np)[/itex] for large n?


RGV
Thanks
 
  • #4
I've solved it btw, so no one wastes any time here.
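
The thread ends without the derivation being written out, so here is a sketch of the sum for part b), in the original X, Y, p, λ notation. It assumes the binomial conditional distribution from part a); only terms with y ≥ x contribute, because X cannot exceed Y.

[itex]P(X=x)=\sum_{y=x}^{\infty}P(X=x|Y=y)P(Y=y)=\sum_{y=x}^{\infty}C^y_x p^x(1-p)^{y-x}\frac{\lambda^y e^{-\lambda}}{y!}[/itex]

[itex]=\frac{(p\lambda)^x e^{-\lambda}}{x!}\sum_{y=x}^{\infty}\frac{((1-p)\lambda)^{y-x}}{(y-x)!}=\frac{(p\lambda)^x e^{-\lambda}}{x!}e^{(1-p)\lambda}=\frac{(p\lambda)^x e^{-p\lambda}}{x!}[/itex]

The second line uses [itex]C^y_x/y!=1/(x!(y-x)!)[/itex] and the exponential series [itex]\sum_{j=0}^{\infty}a^j/j!=e^a[/itex] with j = y - x. The result is the Poisson probability mass function with rate pλ.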
 

1. What is marginal distribution in statistics?

The marginal distribution of a variable is its probability distribution on its own, obtained from the joint distribution of several variables by summing or integrating out the others. It gives the probability of each value of that variable regardless of the values the other variables take.

2. How is marginal distribution calculated?

The marginal distribution is calculated by summing (for discrete variables) or integrating (for continuous variables) the joint distribution over all possible values of the other variables. For discrete X and Y, for example, P(X = x) is the sum of P(X = x, Y = y) over all y.
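
As a hedged illustration of the summation, the sketch below uses the store example from this thread: it sums the joint probability P(X = x, Y = y) = P(X = x | Y = y) P(Y = y) over y and compares the result with the Poisson(pλ) probability. The function names and the values p = 0.3 and λ = 5 are assumptions chosen for illustration, and the infinite sum is truncated at `y_max`.

```python
import math

def binom_pmf(k, n, p):
    """P(X = k | Y = n) when X is Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, mu):
    """P(Y = k) for a Poisson variable with mean mu > 0 (log-space to avoid overflow)."""
    return math.exp(-mu + k * math.log(mu) - math.lgamma(k + 1))

def marginal_pmf(x, p, lam, y_max=100):
    """Sum the joint pmf over y = x, ..., y_max (a truncation of the infinite sum)."""
    return sum(binom_pmf(x, y, p) * poisson_pmf(y, lam) for y in range(x, y_max + 1))

p, lam = 0.3, 5.0  # illustrative values, not taken from the thread
for x in range(5):
    # The two printed columns should agree up to truncation and rounding error.
    print(x, marginal_pmf(x, p, lam), poisson_pmf(x, p * lam))
```

The truncation is harmless here because the Poisson tail beyond y = 100 is negligible for λ = 5; a larger λ would need a larger `y_max`.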

3. Can marginal distribution be estimated from a dataset?

Yes. For a discrete variable, count how many times each value occurs in the dataset and divide by the total number of observations. The resulting relative frequencies estimate the marginal probability of each value.
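
A minimal sketch of this counting estimate, again on the store example from this thread: simulate the number of customers, count how many are women, and tabulate relative frequencies. Everything here is an assumption for illustration, including the function names, the parameter values, and the choice of Knuth's multiplication method for drawing Poisson values with only the standard library.

```python
import math
import random
from collections import Counter

def sample_poisson(lam, rng):
    """Draw one Poisson(lam) value with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k

def simulate_women_counts(p, lam, n_trials, seed=0):
    """Return the relative frequency of each observed value of X."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_trials):
        y = sample_poisson(lam, rng)                  # customers in this trial
        x = sum(rng.random() < p for _ in range(y))   # how many of them are women
        counts[x] += 1
    return {x: c / n_trials for x, c in sorted(counts.items())}

freqs = simulate_women_counts(p=0.3, lam=5.0, n_trials=20000)
for x, f in freqs.items():
    # Empirical frequency next to the Poisson(p * lam) probability.
    exact = math.exp(-1.5) * 1.5**x / math.factorial(x)
    print(x, round(f, 4), round(exact, 4))
```

With enough trials the empirical column should approach the Poisson(pλ) column, which is a numerical check rather than a proof of part b).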

4. How does marginal distribution differ from conditional distribution?

The marginal distribution gives the probabilities of one variable without regard to the values of the others, while a conditional distribution gives its probabilities when the other variables are known to take specific values. In this thread, the conditional distribution of X given Y = y is binomial, while the marginal distribution of X is Poisson with rate pλ.

5. What are the applications of marginal distribution in statistics?

Marginal distributions are used in data analysis and modeling to describe one variable at a time, for example when the other variables are unobserved or not of interest. They also appear in Bayesian statistics, where the marginal likelihood is obtained by integrating out the parameters, and in machine learning models with latent variables that must be summed or integrated out.
