What is the second equality in the bias of histogram estimate derived from?

logarithmic · Nov 9, 2010

Suppose we wish to estimate a probability density given the points {x_1, ..., x_n} using a histogram [tex]\hat{f}(x)[/tex].

I have a book that says [tex]Bias(\hat{f}(x))=E_f(\hat{f}(x))-f(x)=\frac{1}{2}f'(x)(h-2(x-b_j))+O(h^2)[/tex] for [tex]x\in(b_j,b_{j+1}][/tex].

Can someone explain where the second equality comes from? I'm pretty sure it's a Taylor expansion, but I'm not sure how to Taylor expand the expected value.

The notation is as follows:

[tex]h[/tex] is the width of the histogram bins.
[tex]b_j[/tex] and [tex]b_{j+1}[/tex] are the boundaries of the j-th bin.
[tex]\hat{f}(x)=n_j/(nh)[/tex] for [tex]x\in(b_j,b_{j+1}][/tex], where [tex]n_j[/tex] is the number of x points in the j-th bin and [tex]n[/tex] is the total number of x points.
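
To make the notation concrete, here is a minimal Python sketch of the estimator [tex]\hat{f}(x)=n_j/(nh)[/tex]. The function name `histogram_estimate`, the `origin` parameter (the left edge of bin 0), and the sample points are assumptions made for illustration; they are not from the book.

```python
import math

def histogram_estimate(points, x, origin, h):
    """Return n_j / (n * h) for the bin (b_j, b_{j+1}] that contains x.

    Bins are assumed to be (origin + j*h, origin + (j+1)*h], i.e. open on
    the left and closed on the right, as in the notation above.
    """
    # Index of the right-closed bin containing x.
    j = math.ceil((x - origin) / h) - 1
    b_lo = origin + j * h
    b_hi = b_lo + h
    # n_j: number of sample points that fall in the same bin as x.
    n_j = sum(1 for p in points if b_lo < p <= b_hi)
    return n_j / (len(points) * h)

# Hypothetical sample, bin width h = 0.5, bins starting at 0.
points = [0.1, 0.2, 0.25, 0.7, 0.9, 1.4]
estimate = histogram_estimate(points, x=0.3, origin=0.0, h=0.5)
print(estimate)
```

Here x = 0.3 falls in the bin (0, 0.5], which holds 3 of the 6 points, so the estimate is 3/(6 · 0.5).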

Any help is appreciated.

lippyka · Nov 9, 2010

It is a Taylor expansion, but of the density [tex]f[/tex] under an integral, not of the expected value directly.

First write the expectation in terms of [tex]f[/tex]. For [tex]x\in(b_j,b_{j+1}][/tex] we have [tex]\hat{f}(x)=n_j/(nh)[/tex]. Assuming the points are i.i.d. draws from [tex]f[/tex], the count [tex]n_j[/tex] is binomial with parameters [tex]n[/tex] and [tex]p_j=\int_{b_j}^{b_{j+1}}f(t)\,dt[/tex], so [tex]E_f(n_j)=np_j[/tex] and

[tex]E_f(\hat{f}(x))=\frac{p_j}{h}=\frac{1}{h}\int_{b_j}^{b_{j+1}}f(t)\,dt.[/tex]

Now expand [tex]f(t)[/tex] around the point [tex]x[/tex] (not around [tex]b_j[/tex]):

[tex]f(t)=f(x)+f'(x)(t-x)+O((t-x)^2).[/tex]

Since [tex]t[/tex] and [tex]x[/tex] lie in the same bin, [tex]|t-x|\le h[/tex], so the remainder is [tex]O(h^2)[/tex] as long as [tex]f''[/tex] is bounded.

Integrate term by term, using [tex]b_{j+1}=b_j+h[/tex]:

[tex]\int_{b_j}^{b_{j+1}}f(t)\,dt=hf(x)+\frac{1}{2}f'(x)\left[(b_j+h-x)^2-(b_j-x)^2\right]+O(h^3)=hf(x)+\frac{1}{2}f'(x)\,h\,(h-2(x-b_j))+O(h^3).[/tex]

Dividing by [tex]h[/tex] and subtracting [tex]f(x)[/tex] gives

[tex]Bias(\hat{f}(x))=E_f(\hat{f}(x))-f(x)=\frac{1}{2}f'(x)(h-2(x-b_j))+O(h^2)[/tex] for [tex]x\in(b_j,b_{j+1}][/tex].
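
The bias formula from the question can also be checked numerically. The sketch below assumes, purely for illustration, that [tex]f[/tex] is the standard normal density, so that [tex]E_f(\hat{f}(x))=p_j/h[/tex] can be computed exactly from the normal CDF. The helper names, the bin edge `b_j = 0.5`, and the choice of x a quarter of the way into the bin are all hypothetical. If the formula is right, the gap between the exact bias and the leading term should shrink roughly like [tex]h^2[/tex].

```python
import math

def normal_pdf(t):
    # Standard normal density, used here as an assumed example of f.
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

def normal_cdf(t):
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def exact_bias(x, b_j, h):
    # E[n_j] = n * p_j, so E[f_hat(x)] = p_j / h with p_j the bin probability.
    p_j = normal_cdf(b_j + h) - normal_cdf(b_j)
    return p_j / h - normal_pdf(x)

def leading_term(x, b_j, h):
    # (1/2) f'(x) (h - 2 (x - b_j)), with f'(x) = -x f(x) for the normal density.
    f_prime = -x * normal_pdf(x)
    return 0.5 * f_prime * (h - 2.0 * (x - b_j))

b_j = 0.5  # hypothetical left bin edge
results = []
for h in (0.2, 0.1, 0.05):
    x = b_j + 0.25 * h  # a point a quarter of the way into the bin
    gap = exact_bias(x, b_j, h) - leading_term(x, b_j, h)
    results.append((h, gap, gap / h**2))

for h, gap, ratio in results:
    print(f"h={h:<5} gap={gap: .3e} gap/h^2={ratio: .4f}")
```

The last column staying roughly constant as [tex]h[/tex] is halved is what an [tex]O(h^2)[/tex] remainder looks like in practice.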


