
Test For Bias

  1. Feb 9, 2012 #1
    How is Bayesian inference actually applied?
    Say I have a series of 100 random numbers, each between 1 and 10.
    How do I test the hypothesis that "there is a bias toward the numbers 5 and 7"?
  2. jcsd
  3. Feb 10, 2012 #2

    Stephen Tashi

    Science Advisor

    The strength of Bayesian analysis is that it encourages experts to use their knowledge instead of leaving out details of the problem in order to fit it into some textbook type of exercise. If this is a real world problem, you have to consider what you know about the causes of the bias or examples of other series of numbers where you understand the bias.

    If you are making the problem up merely to work an example of Bayesian analysis, then we can consider how to define a probability distribution on "bias". You could use a "probability distribution of probability distributions". For example, let [itex] p_i [/itex] be the probability of the number [itex] i [/itex] and assume that any vector of probabilities [itex] p_1, p_2, \ldots, p_n [/itex] is equally likely, subject to the condition that the probabilities add to 1 and are each between 0 and 1. If you are trying to make a "yes or no" judgment on "bias", you have to define what that means. For example, does a "bias" in favor of 5 mean that [itex] p_5 [/itex] was at least 0.15?
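As a sketch of what that prior implies, assuming the draws are independent and writing [itex] n_i [/itex] for the number of times [itex] i [/itex] appears among [itex] N [/itex] draws (this notation is an assumption, not from the thread): "every probability vector equally likely" is the Dirichlet distribution with all parameters equal to 1, and it updates in closed form.

```latex
% Uniform prior over probability vectors on the simplex (n = 10 outcomes)
(p_1, \ldots, p_{10}) \sim \mathrm{Dirichlet}(1, \ldots, 1)

% Multinomial likelihood of the observed counts n_1, ..., n_10 with N = \sum_i n_i
P(\text{data} \mid p) \propto \prod_{i=1}^{10} p_i^{\,n_i}

% Posterior is again Dirichlet, with each parameter increased by its count
(p_1, \ldots, p_{10}) \mid \text{data} \sim \mathrm{Dirichlet}(1 + n_1, \ldots, 1 + n_{10})

% The marginal of a single component is a Beta distribution, so for the number 5
p_5 \mid \text{data} \sim \mathrm{Beta}(1 + n_5,\; 9 + N - n_5)

% "Bias in favor of 5" defined as p_5 > 0.15 then has posterior probability
P(p_5 > 0.15 \mid \text{data}) = \int_{0.15}^{1} \mathrm{Beta}(p;\, 1 + n_5,\, 9 + N - n_5)\, dp
```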

    We can discuss this further if you can refine your goals. It's often most convenient to do Bayesian analysis by Monte-Carlo simulations.
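A minimal Monte-Carlo sketch of this idea, assuming independent draws and the "all probability vectors equally likely" prior, which is a Dirichlet(1, ..., 1) distribution whose posterior after observing counts is Dirichlet(1 + count_i). The function names and the counts below are hypothetical, made up only for illustration.

```python
import random

def sample_dirichlet(alphas, rng):
    """Draw one probability vector from Dirichlet(alphas) via normalized Gamma draws."""
    gammas = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(gammas)
    return [g / total for g in gammas]

def posterior_prob_exceeds(counts, index, threshold, n_samples=20000, seed=0):
    """Estimate P(p[index] > threshold | counts) under a uniform Dirichlet prior."""
    rng = random.Random(seed)
    # Uniform prior on the simplex is Dirichlet(1, ..., 1); after multinomial
    # counts the posterior is Dirichlet(1 + count_i).
    alphas = [1 + c for c in counts]
    hits = 0
    for _ in range(n_samples):
        p = sample_dirichlet(alphas, rng)
        if p[index] > threshold:
            hits += 1
    return hits / n_samples

# Hypothetical counts of the numbers 1..10 in 100 draws (invented for this sketch).
counts = [8, 9, 10, 9, 16, 8, 15, 9, 8, 8]

if __name__ == "__main__":
    # Index 4 is the number 5, index 6 is the number 7.
    for number in (5, 7):
        prob = posterior_prob_exceeds(counts, number - 1, 0.15)
        print(f"Estimated P(p_{number} > 0.15 | data) = {prob:.3f}")
```

The output is a probability for each statement "p_i > 0.15", not a yes-or-no verdict; any cutoff applied to that probability is a separate choice.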
  4. Feb 10, 2012 #3
    This is a basic example (numbers 1 to 10), just to try out Bayesian analysis.
    Let bias be expressed as Pi > Pb, where Pb is user-defined, e.g. 0.15.
    The goal is simply to find out whether certain numbers showed a user-defined level of bias in historical samples of different sizes (e.g. the last 30 draws, the last 70 draws, the last 100 draws, etc.), beyond merely looking at the corresponding frequencies for each number in each timeframe.
    I.e., in which timeframe did the most numbers show bias, even though in the long run Pi converges to 0.1 for every number?
    Last edited: Feb 10, 2012
  5. Feb 10, 2012 #4

    Stephen Tashi

    Science Advisor

    Neither Bayesian nor non-Bayesian statistics gives you a definite yes-or-no answer to most problems. In Bayesian statistics, the answer to the question "Is [itex] p_5 > 0.15 [/itex]?" will have a certain probability. (It's a "probability of a probability" in this case, which might be a confusing thought.)
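To make the "probability of a probability" concrete, here is a hedged sketch that computes it exactly rather than by simulation. It assumes independent draws and a uniform prior over probability vectors, in which case the posterior for a single number's probability is Beta(1 + count, 9 + N - count) for 10 possible numbers. The helper names and the example count of 16 are hypothetical.

```python
from math import comb

def beta_tail_integer(a, b, x):
    """P(Y > x) for Y ~ Beta(a, b) with positive integer a and b.

    Uses the identity I_x(a, b) = P(Binomial(a + b - 1, x) >= a),
    so P(Y > x) = P(Binomial(a + b - 1, x) <= a - 1).
    """
    n = a + b - 1
    return sum(comb(n, k) * x**k * (1 - x) ** (n - k) for k in range(a))

def prob_exceeds(count, total, n_categories, threshold):
    """Posterior P(p_i > threshold) under a uniform Dirichlet prior."""
    a = 1 + count                               # prior 1 plus observed hits
    b = (n_categories - 1) + (total - count)    # remaining prior mass plus misses
    return beta_tail_integer(a, b, threshold)

if __name__ == "__main__":
    # Hypothetical data: the number 5 appeared 16 times in 100 draws.
    print(prob_exceeds(count=16, total=100, n_categories=10, threshold=0.15))
```

The returned value is the posterior probability that the underlying chance of drawing 5 exceeds 0.15; it says nothing by itself about where to draw the line for calling that "bias".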

    In non-Bayesian statistics, you would assume a definite distribution for the numbers, compute the probability of observing your data, and set some arbitrary limit on how improbable the data must be in order to "reject" or "accept" your original assumption.
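A sketch of that non-Bayesian recipe for a single number, under assumptions not stated in the thread: the assumed distribution is "each number has probability 0.1", the test statistic is how many times 5 appears, and the limit is the conventional 0.05. The count of 16 is hypothetical.

```python
from math import comb

def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))

# Assumed "no bias" distribution: every number has probability 0.1.
null_probability = 0.1
# Hypothetical observation: the number 5 appeared 16 times in 100 draws.
observed_count = 16
n_draws = 100

# Probability of seeing at least this many 5s if there is no bias.
p_value = binomial_upper_tail(observed_count, n_draws, null_probability)

# The limit is arbitrary; 0.05 is only a common convention.
alpha = 0.05
reject_no_bias = p_value < alpha

if __name__ == "__main__":
    print(f"p-value = {p_value:.4f}, reject 'no bias' at {alpha}: {reject_no_bias}")
```

Running the same test on all ten numbers, or on several overlapping windows of draws, raises the chance that at least one looks "significant" by luck alone, so the limit would need adjusting in that case.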

    When you start mentioning varying sample sizes and "trends", you are getting into complications that you need to be clear about. People who look for "trends" in data can often fool themselves. Are you assuming the "bias" varies over time?

    If you are looking for an example to use to understand Bayesian analysis then you must define a specific example and do so precisely. Bayesian analysis doesn't do a translation from the ambiguous language of everyday speech into mathematics. The user of Bayesian analysis must do that.