
Bayesian probability question

  1. Dec 30, 2012 #1
    Hello,

    I am building a model that simulates the travel patterns of electric cars using a series of iterative conditional distributions. I have a dataset to build the pdfs.

    In one part of the model I generate a parking time from a conditional distribution.

    I create a parking time distribution conditioned on, for example, the time of day and the location.

    I am using a Bayesian approach because, for certain conditions, the dataset sometimes returns no observations (none were recorded). In that case no distribution can be created and the simulation stops.

    So first of all I assume a uniform Prior distribution.

    https://dl.dropbox.com/u/54057365/All/prior.JPG [Broken]

    Secondly I return the data in the database given the conditions and create the likelihood function.

    https://dl.dropbox.com/u/54057365/All/likelihood.JPG [Broken]

    Then I combine the prior distribution and the likelihood to form the posterior distribution, and I generate a value from it.

    https://dl.dropbox.com/u/54057365/All/posterior.JPG [Broken]

    My question is as follows: whenever no observations are returned, every bin count in the likelihood is 0, so the posterior distribution is flat, just like the prior distribution.

    Instead of using a uniform prior I want to use an informed prior.

    I have set all the hyperparameters of the bins to 1 in the prior distribution, but can I assign the hyperparameters according to some distribution instead?

    How would I do this?

    Thanks
     
    Last edited by a moderator: May 6, 2017
  3. Dec 30, 2012 #2

    Stephen Tashi

    Science Advisor

    Distribution of what? In a Bayesian approach, you begin with a prior distribution for the parameters of another distribution. If you assume a uniform distribution of parking times, then I don't see how this is a distribution for the parameters of another distribution.

    What you need is a model where parking times are distributed according to some family of distributions (the family of exponential distributions, for example) and each distribution in the family is defined by a particular value of some parameters. (It's simplest to use as few parameters as possible. For example, an exponential distribution is defined by one parameter, lambda.) Then you assume a prior distribution for the parameters.
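    To make that concrete, here is a minimal sketch for the one-parameter case, assuming parking times are exponential with rate lambda and that lambda gets a Gamma prior (the standard conjugate choice, not something taken from the original post). The function names and the prior values (shape 2, rate 60, i.e. a prior mean parking time of 60 minutes) are made up for illustration.

```python
import random

def gamma_posterior(prior_shape, prior_rate, observations):
    """Conjugate update: exponential likelihood, Gamma(shape, rate) prior on lambda."""
    n = len(observations)
    # With no observations the posterior is simply the prior.
    return prior_shape + n, prior_rate + sum(observations)

def sample_parking_time(shape, rate, rng):
    """Draw lambda from Gamma(shape, rate), then a parking time from Exp(lambda)."""
    # random.gammavariate takes (shape, scale), and scale = 1 / rate.
    lam = rng.gammavariate(shape, 1.0 / rate)
    return rng.expovariate(lam)

rng = random.Random(0)
# Hypothetical prior: shape 2, rate 60 (minutes).
shape, rate = gamma_posterior(2.0, 60.0, [30.0, 45.0])
parking_time = sample_parking_time(shape, rate, rng)
```

    Note that when the query returns no observations the update leaves the prior untouched, so a value can still be generated and the simulation does not stop.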

    The only way I can make sense of your work is that you assume parking times come from a family of distributions and that each distribution in the family is defined by about 300 parameters lambda1, lambda2, ..., lambda300, where lambdaN gives the probability of parking for exactly N minutes.

    You can't assume these parameters are jointly and independently uniformly distributed over the interval [0,1]: the parameters are probabilities, so they must add to 1. You can assume they are jointly uniformly distributed subject to the condition that they add to 1.
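    A uniform distribution over probabilities that add to 1 is a Dirichlet distribution with every hyperparameter equal to 1, and with binned counts as data the posterior is again a Dirichlet with the counts added to the hyperparameters. Assuming that is the model in use here (it matches "hyperparameters of the bins set to 1", but that is my reading), one way to get an informed prior is to shape the hyperparameters like a pooled histogram. The sketch below is only that: the function names, the pooled counts and the "strength" (total pseudo-count) are hypothetical.

```python
import random

def informed_alpha(pooled_counts, strength):
    """Turn a pooled histogram into Dirichlet hyperparameters that sum to `strength`."""
    total = sum(pooled_counts)
    k = len(pooled_counts)
    # Adding 1 to each bin keeps every hyperparameter strictly positive.
    return [strength * (c + 1) / (total + k) for c in pooled_counts]

def posterior_alpha(alpha, counts):
    """Dirichlet-multinomial update: add the observed bin counts to the prior."""
    return [a + c for a, c in zip(alpha, counts)]

def sample_bin_probs(alpha, rng):
    """Draw bin probabilities from Dirichlet(alpha) via normalized Gamma variates."""
    g = [rng.gammavariate(a, 1.0) for a in alpha]
    s = sum(g)
    return [x / s for x in g]

def sample_bin(alpha, rng):
    """Draw bin probabilities, then pick one bin index with those probabilities."""
    probs = sample_bin_probs(alpha, rng)
    return rng.choices(range(len(alpha)), weights=probs, k=1)[0]

rng = random.Random(0)
# Hypothetical pooled histogram over 4 bins, worth 4 pseudo-observations in total.
alpha = informed_alpha([0, 3, 5, 0], strength=4.0)
post = posterior_alpha(alpha, [0, 0, 0, 0])  # no data: posterior equals the prior
chosen_bin = sample_bin(post, rng)
```

    The strength controls how quickly real observations override the prior: a small total pseudo-count is swamped after a handful of records, a large one is not.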

    There are various methods for fitting a smooth distribution to discrete data. There is no single best or correct way that works for all situations. (Likewise, to use an informed prior, you actually need to have some information or be willing to assume some.) Accounting for imprecision in measurements is a natural way to produce smoother distributions. For example, if a parking time is recorded as 30 minutes, the method of measurement might produce that value from, say, any time between 29.5 and 30.5 minutes. One smoothing technique is to replace the observation of 30 with a set of observations uniformly distributed between 29.5 and 30.5. (The general technique is called using a smoothing "kernel".) You could even argue that an observation of 30 is evidence for a wider range of possibilities. For example, if someone went shopping and returned after 30 minutes, random factors such as delays in checkout lines might lengthen or shorten that time if the same shopping trip were repeated.
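    On binned data, the kernel idea can be sketched as spreading each bin's count over its neighbours. The version below assumes one-minute bins and a three-bin kernel with weights 0.25, 0.5, 0.25; both the weights and the function name are illustrative choices, not a recommendation.

```python
def smooth_counts(counts, kernel=(0.25, 0.5, 0.25)):
    """Spread each bin's count over neighbouring bins, preserving the total count."""
    half = len(kernel) // 2
    n = len(counts)
    out = [0.0] * n
    for i, c in enumerate(counts):
        if c == 0:
            continue
        # Bins the kernel reaches from bin i, dropping any that fall off either end.
        targets = [(i + j - half, w) for j, w in enumerate(kernel)
                   if 0 <= i + j - half < n]
        # Renormalize so that counts near the edges are not lost.
        norm = sum(w for _, w in targets)
        for t, w in targets:
            out[t] += c * w / norm
    return out

smoothed = smooth_counts([0, 0, 4, 0, 0])
```

    A wider kernel expresses the view that an observation of 30 minutes is evidence for a wider range of nearby times.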

    It is common to see distributions fitted to data by fitting them to the cumulative histogram instead of the frequency histogram. The cumulative histogram (psychologically) often looks less erratic than the frequency histogram and people are more confident about guessing a family of distributions for it.
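    As a rough illustration of fitting to the cumulative histogram, the sketch below assumes an exponential family purely for the sake of example. For an exponential distribution, -ln(1 - F(t)) = lambda * t, so lambda can be estimated by a least-squares line through the origin on the transformed empirical CDF. The function name and the (i - 0.5)/n plotting position are my choices.

```python
import math

def fit_exponential_to_ecdf(times):
    """Least-squares estimate of lambda from the empirical cumulative distribution."""
    xs = sorted(times)
    n = len(xs)
    num = 0.0
    den = 0.0
    for i, t in enumerate(xs, start=1):
        f = (i - 0.5) / n        # empirical CDF value, kept strictly below 1
        y = -math.log(1.0 - f)   # equals lambda * t if the data are exponential
        num += t * y
        den += t * t
    return num / den             # slope of the best line through the origin

# Hypothetical parking times in minutes.
lam_hat = fit_exponential_to_ecdf([5.0, 12.0, 20.0, 33.0, 60.0, 95.0])
```

    The same pattern works for other families if you can write the cumulative distribution in a form that is easy to fit.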
     