Bayesian probability question about Dirichlet prior distributions

In summary, you can use a multinomial model with a Dirichlet prior to estimate the probability of an event in a given bin. You can then use the estimated probabilities to calculate conditional probabilities.
  • #1
bradyj7
Hi there,

I have a question about Bayesian probability.

I have a list of the starting times of journeys. I binned the data into 15-minute bins, so I have 96 bins in total (4*24 = 96). For example, a journey start time of 08:05 am would be in bin number 29.

As an example, here is the data for bin numbers 28-50 (8 am until 12.30 pm).

https://dl.dropbox.com/u/54057365/All/bin.JPG

I've calculated the frequency density of the bins in the last column.

Would anybody be able to tell me how I would do the following:

Taking a Dirichlet prior distribution over the density of each bin for a multinomial model, you estimate the parameters. This way you get a non-zero probability for each bin. Each parameter is basically some prior parameter plus the frequency of the data in that bin.

Would anybody know if this can be done with an excel addin?

Appreciate your comments.

Regards

John
 
  • #2
Hello,

Just wondering if anybody has had time to consider my question?

Regards
 
  • #3
bradyj7 said:
Hello,

Just wondering if anybody has had time to consider my question?

Regards

To many mathematicians the stumbling block in your question is the phrase "in Excel". Your question is rather like asking "How do I tie my shoes - in a phone booth, standing on my head?" Lots of people know the answer to the first part. Not many know the answer with the added restrictions.

I suggest you look in the computer sections of the forum if you want to know about Excel. I'm not very familiar with those sections, so I can't suggest which ones. Do a search on "excel" and find places on the forum where people who know about sophisticated excel plugins make posts.
 
  • #4
Hi Steven,

Thanks for your comment. It doesn't necessarily have to be in Excel. How would you suggest doing it?

Thanks

John
 
  • #5
bradyj7 said:
Taking Dirichlet prior distribution over the density of each bin for a multinomial model, you estimate the parameters. This way you get a non-zero probability for each bin. Each parameter is basically some prior parameter plus the frequency of the data in that bin.

(The term "frequency" is not a good choice of words since it might also mean the fraction of trials that fell in a given bin. That is not the intended meaning here.)

A plausible "non-informative" prior is the Dirichlet distribution with all parameters set equal to 1.
The posterior is a Dirichlet distribution with parameters [itex] \alpha_i = 1 + b_i [/itex] where [itex] b_i [/itex] is the number of observations in the data that were in bin [itex] i [/itex].
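The update rule above can be sketched in a few lines of Python (the bin counts here are invented for illustration, and only 4 bins are used instead of 96):

```python
import numpy as np

# Hypothetical counts of journey start times per 15-minute bin
bin_counts = np.array([3, 0, 7, 5])

# Non-informative Dirichlet prior: all parameters set to 1
prior_alpha = np.ones_like(bin_counts)

# Posterior parameters: alpha_i = 1 + b_i
posterior_alpha = prior_alpha + bin_counts

# Posterior mean gives a non-zero probability estimate for every bin,
# even the bin that had zero observations
posterior_mean = posterior_alpha / posterior_alpha.sum()
```

Note that the empty bin still gets probability 1/19 here rather than zero, which is exactly the point of using the prior.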

What is it that you want to do with this posterior distribution?
 
  • #6
You can code routines in Excel if they don't exist using formulas or VBA code.
 
  • #7
Hi,

I'm confused about the exact steps for determining the posterior distribution. I want to sample from the posterior distribution. I've been told that these are the steps to follow, but I can't seem to grasp the method. I'd be grateful if you could confirm whether I'm doing it correctly. I want to estimate the posterior distribution of journey start times. I believe you do the following.

1) You have the column of all starting times.

2) Choose a bin size, giving let's say k bins.

3) Taking a Dirichlet prior distribution over the density of each bin for a multinomial model, you estimate the parameters. This way you get a non-zero probability for each bin. Each parameter is basically some prior parameter plus the frequency of the data in that bin. The data is binned when you want to model your data using a discrete probability distribution, or to create a frequency table and do classical analysis. In the Bayesian approach you get a discrete probability model for the probability in each bin.

4) Thus you have estimated f(journey starting time)

So I binned the data into 15-minute bins, so I have 96 bins in total (4*24 = 96). For example, a journey start time of 08:05 am would be in bin number 29.

As an example, here is the data for bin numbers 28-50 (8 am until 12.30 pm).

https://dl.dropbox.com/u/54057365/All/bin.JPG

I'm using this Excel software.

Is this problem an example of a conjugate prior distribution? http://www.vosesoftware.com/ModelRiskHelp/index.htm#Analysing_and_using_data/Bayesian/Bayesian_inference.htm

So I have the frequency in each bin. Is the next step to find the prior distribution of each bin, and then the likelihood function for the data in each bin?

How do these come together to form one complete posterior distribution that you can sample from? Am I understanding this correctly or have I got it all wrong?

I have been looking at this page http://www.vosesoftware.com/ModelRiskHelp/index.htm#Analysing_and_using_data/Bayesian/Bayesian_inference.htm

Could you tell me if this page is doing what I am trying to do?

Apologies for the long winded post. I've been trying to do this for a few days now with no success.

I'm going to be doing conditional probabilities next so I need to grasp the basics.

Thanks for all your help
 
  • #8
If you are using your data to update the "parameter" of your distribution, then typically your last posterior becomes your new prior, and you repeat this cycle each time you use new data to update the parameters.

If you are doing this on a computer with bins, then if your likelihood for each bin is P(X|theta) and your prior is P(theta), multiply the two together cell by cell, and once you've done this, normalize the whole distribution (i.e. sum all the newly created cells of the posterior distribution and divide each cell by this value).
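A minimal numerical sketch of this multiply-and-normalize step, using a single binomial success probability theta on a grid (the grid and the data, 7 successes in 10 trials, are invented for illustration):

```python
import numpy as np

# Grid of candidate values for a single success probability theta
theta = np.linspace(0.01, 0.99, 99)

# Prior over the grid (uniform here), one cell per candidate theta
prior = np.ones_like(theta)
prior /= prior.sum()

# Likelihood of observing k = 7 successes in n = 10 trials at each theta
k, n = 7, 10
likelihood = theta**k * (1 - theta)**(n - k)

# Multiply likelihood by prior cell by cell, then normalize the whole grid
unnormalized = likelihood * prior
posterior = unnormalized / unnormalized.sum()

# Point estimate: posterior expectation of theta
point_estimate = (theta * posterior).sum()
```

With a uniform prior this grid posterior approximates a Beta(8, 4) distribution, so the point estimate lands near 8/12.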

You can then take this distribution and get, for example, a point estimate for the parameter (by taking the expectation of each element in the theta vector), and you can even get credible intervals by taking the appropriate quantiles.

If for example you got the point estimates from the posterior, these can be used in your new prior and the process goes on.
 
  • #9
Thank you for explaining that.

I have 2 questions:

1. Is the frequency density (last column) the prior distribution for that bin?

https://dl.dropbox.com/u/54057365/All/screenshot.JPG

2. How do you estimate the likelihood for a bin? I can't seem to find a similar example online.

John
 
  • #10
If you are using a multinomial model, then you will have quite a complex distribution that sums over many different combinations of outcomes. For example, if you have a multinomial with five free parameters over N trials, corresponding to rolling a die N times, then there are 6^N possible outcome sequences, which gets very large very quickly.

Your prior for P choices should have P-1 unique probabilities (the last can be calculated as 1 minus the sum of the rest), so you will have 5 different values for your actual prior (like a vector).

Your likelihood will contain an entry for every possible combination and you can think about this in either a multi-dimensional way or in a uni-dimensional way where the uni-dimensional way is basically like taking every combination and laying it out in a big massive line (one way to think about this is instead of having a square, you take the square and slowly un-pack all the cells in one huge line instead of seeing it as two dimensional).

The likelihood function for the multinomial is just the multinomial like any likelihood function and it will be represented as P(X|theta) where theta is your set of parameters (remember for P choices you will have P-1 parameters) and X will be any particular possible outcome.

What you will need to do is relate your excel spreadsheet data with the actual choices with regards to the probabilities.

So, as an example with three throws of a die, we can get anything from {1,1,1} to {6,6,6}, so if you lay this out in one-dimensional form, you could have the first cell corresponding to {1,1,1} and the last cell corresponding to {6,6,6}.

So the likelihood function will be one field and the prior for that combination will be some fixed vector.

Now your posterior probabilities will basically have distributions for every single parameter (i.e. P-1 different ones) and your posterior will simply be a vector where you calculate the likelihood for every combination (with regard to those parameters) and then you multiply that by the given prior vector to get a posterior vector.

Remember that your posterior of P(theta|X) will be a vector where theta corresponds to <theta1,theta2,theta3,...>^T where this vector is the vector of all parameters in the multinomial.

This new posterior will become your prior and will represent the "updated" version of the parameters under the given sample you got before given that the likelihood model is a multinomial distribution.
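As a concrete sketch of the multinomial likelihood P(X|theta) described above, here is the probability of one particular outcome for a fair six-sided die (the counts are invented: each face appearing twice in 12 rolls):

```python
from math import factorial

# Hypothetical parameter vector for a fair six-sided die (must sum to 1)
theta = [1 / 6] * 6

# Observed counts over N = 12 rolls, one entry per face
counts = [2, 2, 2, 2, 2, 2]

def multinomial_likelihood(counts, theta):
    """P(X | theta) = N! / (x_1! ... x_k!) * prod_i theta_i ** x_i."""
    n = sum(counts)
    coef = factorial(n)
    for x in counts:
        coef //= factorial(x)  # multinomial coefficient, exact integer arithmetic
    p = 1.0
    for x, t in zip(counts, theta):
        p *= t ** x
    return coef * p

lik = multinomial_likelihood(counts, theta)
```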
 
  • #11
bradyj7 said:
How do these come together to form one complete posterior distribution that you can sample from? Am I understanding this correctly or have I got it all wrong?

If you read my previous post, I describe how you get a specific posterior distribution. To draw a sample using that distribution, do two steps.

1) Draw a sample from the posterior distribution. This sample gives you a vector of probabilities.
2) Use the vector of probabilities as the probabilities that a start time falls in a given bin. Make a random draw to determine in which bin the start time falls.

Repeat both steps each time you want to generate a random sample.
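The two-step procedure can be sketched as follows (the bin counts are invented, and only 4 bins are used instead of 96):

```python
import numpy as np

rng = np.random.default_rng(0)

# Posterior Dirichlet parameters: alpha_i = 1 + count in bin i
bin_counts = np.array([12, 30, 45, 9])
alpha = 1 + bin_counts

# Step 1: draw a vector of bin probabilities from the Dirichlet posterior
p = rng.dirichlet(alpha)

# Step 2: use that vector to draw the bin the start time falls in
bin_index = rng.choice(len(p), p=p)
```

Repeating both steps per sample (rather than reusing one drawn `p`) means the samples also reflect the uncertainty in the bin probabilities themselves.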


bradyj7 said:
I have been looking at this page http://www.vosesoftware.com/ModelRiskHelp/index.htm#Analysing_and_using_data/Bayesian/Bayesian_inference.htm

Could you tell me if this page is doing what I am trying to do?

That's not a specific question because that page covers a wide variety of topics.

If you are using a Dirichlet prior, you don't have to calculate the posterior. People have already done this and the formula is known.

http://en.wikipedia.org/wiki/Dirichlet_distribution, see the section "Conjugate to categorical/multinomial".

The only complicated thing I see about your problem is how to take a sample from a Dirichlet distribution. We can discuss that, or some other forum member may know an easy way. The general version of this question is "How do I draw a random sample from a joint probability distribution of several random variables?".
 
  • #12
Regarding Stephen's final point:
That wikipedia page also describes sampling a Dirichlet distribution, relying on the fact that each element in the vector drawn from a Dirichlet distribution is gamma distributed, with the overall vector normalized by its sum. If you have MATLAB this is really easy to do, but in Excel I'm not so sure. I don't think Excel can generate gamma distributed numbers. I did a quick search for you, and something like this package might help: it has a function called ntrandgamma that may be what you're looking for. It seems to be free, but I have not used it and cannot vouch for it.

The general way of sampling from a joint distribution is also mentioned on that wikipedia page, where you sample from the marginal distribution of one element in a multivariate distribution, and then repeatedly use conditional distributions to sample each additional element (see chain rule). Intuitively it is easy to imagine for a 2D distribution: you pick an x point using the marginal distribution of the x coordinate, then the slice through the distribution at that x value is the conditional distribution of y on x, which you sample from to get the y point, and then you have your desired point.
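The gamma-based construction mentioned above is a few lines in Python (the alpha values are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical Dirichlet parameters, one per bin
alpha = np.array([4.0, 1.0, 8.0, 6.0])

# Draw one Gamma(alpha_i, 1) variate per component, then normalize
# by the sum: the resulting vector is one draw from Dirichlet(alpha)
g = rng.gamma(shape=alpha, scale=1.0)
sample = g / g.sum()
```

This is exactly what library routines like numpy's `dirichlet` do under the hood, which makes it a reasonable recipe to port to any environment that can generate gamma variates.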

Hope that is useful!
 

1. What is a Dirichlet prior distribution?

A Dirichlet prior distribution is a type of probability distribution used in Bayesian statistics to represent our prior beliefs about the possible values of a set of parameters. It is a multivariate generalization of the beta distribution and is often used in situations where the parameters are proportions or probabilities.

2. How is a Dirichlet prior distribution different from other prior distributions?

A Dirichlet prior distribution differs from other prior distributions in that it is a multivariate distribution, meaning it can be used to represent prior beliefs about multiple parameters at once. It also has the advantage of being a conjugate prior, meaning that the posterior distribution can be easily calculated using the prior and likelihood functions.

3. How is a Dirichlet prior distribution used in Bayesian inference?

In Bayesian inference, a Dirichlet prior distribution is used to represent our initial beliefs about the values of the parameters in a given model. This prior distribution is combined with the likelihood function, which represents the observed data, to calculate the posterior distribution, which represents our updated beliefs about the parameters after taking the data into account.

4. Can a Dirichlet prior distribution be updated with new data?

Yes, a Dirichlet prior distribution can be updated with new data using Bayes' theorem. The updated posterior distribution will reflect our revised beliefs about the parameters based on the new data, while still taking into account our initial beliefs represented by the prior distribution.
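For the conjugate Dirichlet-multinomial case, this updating is just adding counts to the parameters, and updating sequentially gives the same result as updating with all the data at once (the counts below are invented):

```python
import numpy as np

prior = np.array([1, 1, 1])    # initial Dirichlet parameters
batch1 = np.array([5, 2, 0])   # counts from a first set of observations
batch2 = np.array([1, 3, 4])   # counts from later observations

# Sequential: the posterior after batch1 becomes the prior for batch2
posterior_seq = (prior + batch1) + batch2

# Batch: update once with all the counts combined
posterior_all = prior + (batch1 + batch2)
```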

5. What are the advantages of using a Dirichlet prior distribution in Bayesian statistics?

One of the main advantages of using a Dirichlet prior distribution is its conjugate property, which makes it easier to calculate the posterior distribution. Additionally, it allows for the representation of multiple prior beliefs about different parameters, and can also be updated with new data, making it a flexible and useful tool in Bayesian inference.
