
Homework Help: Proving an unbiased estimator

  1. Sep 13, 2010 #1
    I have a terrible teacher and have to teach myself out of the book and don't understand this.

    1. The problem statement, all variables and given/known data

    Male verbal IQs
    117 103 121 112 120 132 113 117 132
    149 125 131 136 107 108 113 136 114

    Female Verbal IQs
    114 102 113 131 124 117 120 90
    114 109 102 114 127 127 103

    Denote the male values by X1, X2, ..., Xm and the female values by Y1, Y2, ..., Yn. Suppose that the Xi's constitute a random sample from a distribution with mean mu_1 and standard deviation sigma_1, and that the Yi's form a random sample (independent of the Xi's) from another distribution with mean mu_2 and standard deviation sigma_2.

    a.) Use rules of expected value to show that Xbar - Ybar is an unbiased estimator of mu_1 - mu_2.

    3. The attempt at a solution

    I know that bias is the difference between the Expected value of the estimator and the value of the parameter. However, I am completely lost on how I can figure this out if I don't know the true means of the IQs.

    E(Xbar - Ybar) = E(Xbar) - E(Ybar) = (1/m)(X1+X2+...Xm) - (1/n)(Y1+Y2+...Yn)

    I have no idea what this means or where to go.
  3. Sep 13, 2010 #2


    Staff: Mentor

    You are given the means and standard deviations of the two sets of IQ data.
  4. Sep 13, 2010 #3
    I know that E(Xbar) = E(X) = mu_1 and E(Ybar) = E(Y) = mu_2. E(Xbar-Ybar) = mu_1 - mu_2.

    I don't know what it all means.
  5. Sep 14, 2010 #4


    Staff: Mentor

    You pretty much have it, but are having trouble putting the pieces together.

    [tex]E(\bar{X} - \bar{Y}) = E(\bar{X}) - E(\bar{Y}) = \mu_1 - \mu_2[/tex]

    So, on average, the statistic [tex]\bar{X} - \bar{Y}[/tex] can be expected to be equal to [itex]\mu_1 - \mu_2[/itex].

    The definition (slightly paraphrased) for "unbiased estimator" in one of my books is this:
    Let Y1, Y2, ..., Yn be a random sample from a distribution. An estimator W = h(Y1, Y2, ..., Yn) is said to be unbiased (for [itex]\theta[/itex]) if E(W) = [itex]\theta[/itex], for all [itex]\theta[/itex].
    Last edited: Sep 14, 2010
  6. Sep 14, 2010 #5
    So I'm looking for the difference in the expectation of the estimator from what the estimator actually measures...

    If Xbar - Ybar is mu1 - mu2, and E(Xbar - Ybar) = mu1 - mu2, then what exactly is this saying? I don't understand what's going on regarding the samples and distributions. I can't seem to find any good resources with pictures or graphs.
  7. Sep 14, 2010 #6


    Staff: Mentor

    No, you're calculating the expectation of the statistic [tex]\bar{X} - \bar{Y}[/tex].
    No, in general [tex]\bar{X} - \bar{Y} \neq \mu_1 - \mu_2[/tex] for any one pair of samples,

    but if you took a large number of samples from the two populations, the differences of the sample averages --

    [tex]\bar{X} - \bar{Y} [/tex]

    -- would cluster around [tex]\mu_1 - \mu_2[/tex].
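
    Since pictures were asked for, a small simulation may make the "cluster around" idea more concrete. This is only a sketch: the population values MU_1, SIGMA_1, MU_2, SIGMA_2 below are made up for illustration (the problem never gives the true values), and the populations are assumed to be normal, which the problem does not require.

```python
import random
import statistics

# Hypothetical population parameters, chosen only for illustration.
# The problem does NOT give these; in practice they are unknown.
MU_1, SIGMA_1 = 120.0, 12.0
MU_2, SIGMA_2 = 114.0, 11.0
M, N = 18, 15  # sample sizes matching the male and female data sets


def diff_of_means(rng):
    """Draw one sample from each population and return Xbar - Ybar."""
    xs = [rng.gauss(MU_1, SIGMA_1) for _ in range(M)]
    ys = [rng.gauss(MU_2, SIGMA_2) for _ in range(N)]
    return statistics.fmean(xs) - statistics.fmean(ys)


rng = random.Random(0)  # fixed seed so the run is repeatable
diffs = [diff_of_means(rng) for _ in range(20000)]
avg_diff = statistics.fmean(diffs)

print("true mu_1 - mu_2:          ", MU_1 - MU_2)
print("smallest single Xbar - Ybar:", round(min(diffs), 2))
print("largest single Xbar - Ybar: ", round(max(diffs), 2))
print("average of all Xbar - Ybar: ", round(avg_diff, 2))
```

    Any single difference can land well away from mu_1 - mu_2, but the average over many repeated samples should sit close to it. That long-run average is what E(Xbar - Ybar) describes.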

    One of my two references, Intro to Mathematical Statistics, 4th Ed., by Hogg & Craig, doesn't have a single picture or graph.
  8. Sep 14, 2010 #7
    You've almost got it.

    There's a small error though.

    Xbar = (1/m)(X1 + ... + Xm) so E(Xbar) = (1/m)[E(X1) + ... + E(Xm)]. This is because of linearity: E(aX + bY) = aE(X) +bE(Y).

    Then the line: "Suppose that the Xi's constitute a random sample from a distribution with mean mu_1" says the sample is identically distributed.

    Therefore E(X1) = E(X2) = ... = E(Xm). So E(Xbar) = (1/m)[m * E(X1)], and E(X1) = mu_1 (in fact the expected value of any of the X's is mu_1 because they have the same distribution).

    Use the same argument for the Y's, and you get the result.
    That's probably the most hardcore (but best) "introductory" stats textbook in print.
  9. Sep 14, 2010 #8
    To be completely honest, I don't even know what the 'result' is. I'm lost on what we're actually trying to achieve. Proving that I'm using an unbiased estimator is given in the text I have and above as E(W) = 0, but I don't really understand what that means other than the results from the sample will have the same mean as the population itself.
  10. Sep 14, 2010 #9


    Staff: Mentor

    E(W) = 0 would be true only if the population means (mu1 and mu2) of the two populations were equal, and this is not given in the problem.

    I would advise you to spend more time on learning the definitions. If there is something that you don't understand about the definition, ask your instructor. If you do that, go in with specific questions about what you are having problems with, not vague statements such as "I don't get it" or "I'm lost" etc.
  11. Sep 14, 2010 #10
    The instructor is Chinese, reads lectures that her former partner wrote (who was Greek), and does examples straight out of the book with no variation. There's absolutely nothing she can offer in English that would help...which is why I asked for pictures or graphs or something other than math lingo. I'm not a math major and can't think in abstracts. I seriously don't understand how the bias of an estimator relates to the estimates being done. Thanks for shoving me aside though. It really helps.
  12. Sep 14, 2010 #11


    Staff: Mentor

    OK, so she's a lousy teacher.
    How do you know that? Have you gone to her office during her office hour (I assume she has regular office hours) to ask for clarification on some questions you have?
    Are you assuming that only math types can think in abstracts? Regarding pictures and graphs, the higher you go in mathematics, the less likely you are to see pictures and graphs. This is why I said to focus on definitions, since they are crucial in mathematics.
    I certainly appreciate your gratitude for the time I spent posting four responses to your question.
  13. Sep 14, 2010 #12
    Definition: The function [tex]g[/tex] is an unbiased estimator of [tex]\theta[/tex] if [tex]E(g)=\theta[/tex].

    What this means intuitively is that the estimator is on average equal to the true value of what it's trying to estimate. Being unbiased is just a property (amongst many others) that good estimators should have. But this intuitive meaning has no place in a mathematical proof, like this question, although it's probably something that's good to know so you have a feeling for what's going on.

    An estimator is *any* function of the observed values (that's the definition). So we could say that [tex]4+\frac{1}{m}\sum_{i=1}^mX_i[/tex] is an estimator for the mean. But this would be a terrible estimator, because it is biased in the sense that it will on average overestimate the mean by 4. So we would like an unbiased estimator instead.
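
    To see that bias numerically, here is a rough simulation sketch. The values MU, SIGMA and the normal population are assumptions made up for the illustration, and the helper names sample_mean and shifted_mean are hypothetical.

```python
import random
import statistics

# Hypothetical population, for illustration only.
MU, SIGMA, M = 120.0, 12.0, 18


def sample_mean(xs):
    """The usual estimator of the mean: (1/m) * sum of the X_i."""
    return sum(xs) / len(xs)


def shifted_mean(xs):
    """The deliberately bad estimator: 4 + (1/m) * sum of the X_i."""
    return 4 + sample_mean(xs)


rng = random.Random(1)  # fixed seed so the run is repeatable
plain, shifted = [], []
for _ in range(20000):
    xs = [rng.gauss(MU, SIGMA) for _ in range(M)]
    plain.append(sample_mean(xs))
    shifted.append(shifted_mean(xs))

# Estimated bias = (long-run average of the estimator) - (true value).
bias_plain = statistics.fmean(plain) - MU
bias_shifted = statistics.fmean(shifted) - MU

print("estimated bias of the sample mean:    ", round(bias_plain, 2))
print("estimated bias of 4 + the sample mean:", round(bias_shifted, 2))
```

    The first number should come out near 0 and the second near 4, which matches E(4 + Xbar) = 4 + mu by linearity.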

    You want to prove that the function [tex]\bar{X}-\bar{Y}=\frac{1}{m}\sum_{i=1}^mX_i - \frac{1}{n}\sum_{i=1}^nY_i [/tex] is an unbiased estimator for [tex]\mu_1-\mu_2[/tex].

    Want to show: [tex]E(\bar{X}-\bar{Y}) = \mu_1-\mu_2[/tex]

    [tex]E(\bar{X}-\bar{Y}) = E(\bar{X})-E(\bar{Y})[/tex] because E(aX+bY) = aE(X) + bE(Y)

    [tex]=E(\frac{1}{m}\sum_{i=1}^mX_i)-E(\frac{1}{n}\sum_{i=1}^nY_i)[/tex] (substituting Xbar and Ybar in)

    [tex]=\frac{1}{m}\sum_{i=1}^mE(X_i)-\frac{1}{n}\sum_{i=1}^nE(Y_i)[/tex] using E(aX+bY) = aE(X) + bE(Y) again

    [tex]=\frac{1}{m}\sum_{i=1}^m\mu_1-\frac{1}{n}\sum_{i=1}^n\mu_2[/tex] because all the X's are identically distributed (as explained in my last post), so they all have the same expected value: namely mu_1, same with the Y's.

    [tex]=\frac{1}{m}(m\mu_1)-\frac{1}{n}(n\mu_2)[/tex] (now you're just summing up constants)

    [tex]= \mu_1 - \mu_2[/tex] which is what we wanted to show.
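
    For what it's worth, here is how the estimator would be applied to the data in the first post. This is just a sketch of the arithmetic; the variable names are my own. The proof above says nothing about how close this one number is to mu_1 - mu_2, only that the procedure is right on average.

```python
# Verbal IQ data copied from the problem statement.
male = [117, 103, 121, 112, 120, 132, 113, 117, 132,
        149, 125, 131, 136, 107, 108, 113, 136, 114]
female = [114, 102, 113, 131, 124, 117, 120, 90,
          114, 109, 102, 114, 127, 127, 103]


def mean(values):
    """Sample mean: (1/k) * (v_1 + ... + v_k)."""
    return sum(values) / len(values)


x_bar = mean(male)        # estimate of mu_1
y_bar = mean(female)      # estimate of mu_2
estimate = x_bar - y_bar  # estimate of mu_1 - mu_2

print(f"m = {len(male)}, Xbar = {x_bar:.2f}")
print(f"n = {len(female)}, Ybar = {y_bar:.2f}")
print(f"Xbar - Ybar = {estimate:.2f}")
```

    With these samples the estimate of mu_1 - mu_2 comes out at roughly 7.6 points. A different pair of samples would give a different number, which is why the unbiasedness statement is about the expected value rather than about any single estimate.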
    Last edited: Sep 15, 2010