Bernoulli and Bayesian probabilities


Homework Help Overview

The discussion revolves around a homework problem related to Bernoulli and Bayesian probabilities, specifically focusing on estimating parameters using maximum likelihood estimation (MLE) and Bayesian approaches. The original poster, a graduate mechanical engineering student, is uncertain about the accuracy of different estimation methods and seeks clarification on the topic.

Discussion Character

  • Exploratory, Conceptual clarification, Assumption checking

Approaches and Questions Raised

  • Participants discuss the original poster's calculations for MLE and Bayesian estimates, questioning the accuracy of the methods used. There is a focus on understanding why one estimation might be considered more accurate than another, especially in the absence of a known true population parameter.

Discussion Status

Some participants have provided insights into the calculations and raised questions about the assumptions underlying the accuracy of the estimates. There is an ongoing exploration of the implications of not knowing the true parameter value and how that affects the assessment of accuracy among the different methods.

Contextual Notes

Participants note that the problem statement lacks information about the true population parameter, which is crucial for evaluating the accuracy of the estimates. This missing information is a point of contention in the discussion.

hdp12
Summary:: Hello there, I'm a mechanical engineer pursuing my graduate degree and I'm taking a class on machine learning. Coding is a skill of mine, but statistics is not... anyway, I have a homework problem on Bernoulli and Bayesian probabilities. I believe I've done the first few parts correctly, but the final question asks me to explain why one is more accurate than another, and the inverse as well. I am not sure, so I figured I'd reach out here and ask. The work and appropriate equations are below:

1. (10 pts) Consider 20 values randomly sampled from the Bernoulli distribution with parameter u:

Matlab:
x = [1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1];
N = length(x);

(a) Estimate the parameter using the maximum likelihood approach and the 20 data values.
Matlab:
u = sum(x==1)/N;                 % MLE: fraction of 1s in the sample, u = 0.75
bern = (u.^x).*(1-u).^(1-x);     % per-sample probabilities evaluated at u

% log-likelihood of the whole sample evaluated at u = 0.75
p = 0;
for n = 1:N
    pTemp = x(n)*log(u) + (1-x(n))*log(1-u);
    p = p + pTemp;
end

% ln(a) = b <--> a = e^b
p = exp(p);                      % likelihood of the full sample, p = 1.3050e-05
(b) Estimate the parameter using the Bayesian approach. Use the beta distribution Beta(a=8, b=4).
Matlab:
% Posterior is Beta(a + m, b + N - m) with m = sum(x) = 15
% Posterior mode (MAP): (8 + 15 - 1) / (12 + 20 - 2) = 22/30
u = 22/30; % u = 0.7333

(c) Estimate the parameter using the Bayesian approach. Use the beta distribution Beta(a=4, b=8).
Matlab:
% Posterior mode (MAP): (4 + 15 - 1) / (12 + 20 - 2) = 18/30
u = 18/30; % u = 0.6
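
As a side check (not part of the assignment): with a Bernoulli likelihood and a Beta(a, b) prior, the posterior is Beta(a + m, b + N - m), where m is the number of 1s. The fractions above are the posterior modes (MAP estimates). A minimal sketch, reusing x and N from above, that reproduces both numbers and also prints the posterior means for comparison:

Matlab:
m = sum(x);                                   % m = 15 ones out of N = 20
ab = [8 4; 4 8];                              % rows: [a b] for parts (b) and (c)
for k = 1:size(ab,1)
    a = ab(k,1); b = ab(k,2);
    postMode = (a + m - 1) / (a + b + N - 2); % MAP estimate: 22/30, then 18/30
    postMean = (a + m) / (a + b + N);         % posterior mean: 23/32, then 19/32
    fprintf('Beta(%d,%d): mode = %.4f, mean = %.4f\n', a, b, postMode, postMean);
end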
(d) Discuss why the estimation from (b) is more accurate than that from (a) and why the estimation from (c) is worse than that from (a).
Matlab:
uA = 0.75;
uB = 0.7333;
uC = 0.6;



Thanks in advance for any help!
 
I think something goes wrong in the first step (a).
The max likelihood estimate of the Bernoulli parameter is simply the number of 1s divided by the sample size, which gives 15/20 = 0.75.
I don't understand your reason for doing the calcs you show above, which appear to give an answer of approx 10^-5.
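
(For what it's worth, the ≈1.3e-05 figure is the joint probability of the 20 observations evaluated at u = 0.75, i.e. the value of the likelihood itself, not a second estimate of u. A one-line check, reusing x and N from the original post:)

Matlab:
u = 0.75;
L = u^sum(x) * (1 - u)^(N - sum(x));   % 0.75^15 * 0.25^5 = 1.3050e-05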

It is too hard to work out what you were trying to do based only on computer code. Better to write out mathematical reasoning and explain the steps you took.

Also, we can't assess accuracy without knowing what the population parameter is. It looks like you meant to include that in the problem statement, but it is missing.
 
I did a little too much in part A, you're right.

Nothing else was provided in the problem statement, though. Is there any reason why one method of estimation would be more accurate than another?
 
Do you know the true value of your parameter u (e.g. the value of the parameter your x was generated with)? Otherwise I don't understand the question about accuracy ... how would you know whether the MLE or the Bayesian posterior is more accurate if you don't know the actual value you are trying to estimate?

However, if I assume the true value is somewhere near 0.7, then the comparison between (c) and (a) is rather straightforward if you plot your Beta prior for case (c)...
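
A minimal sketch of that comparison, assuming the Statistics Toolbox's betapdf is available:

Matlab:
% Plot the two priors next to the MLE of 0.75. The Beta(4,8) prior from (c)
% puts most of its mass well below 0.75, which is what drags that estimate
% down to 0.6, while the Beta(8,4) prior from (b) sits close to the data.
ug = linspace(0, 1, 500);
plot(ug, betapdf(ug, 8, 4), 'b', ug, betapdf(ug, 4, 8), 'r');
hold on;
plot([0.75 0.75], ylim, 'k--');            % MLE from part (a)
legend('Beta(8,4) prior', 'Beta(4,8) prior', 'MLE = 0.75');
xlabel('u'); ylabel('prior density');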
 
hdp12 said:
Is there any reason why one method of estimation would be more accurate than another?
Yes. "Accuracy" measures the difference between the estimate and the true population parameter. Your data sample gives a MLE estimate of 0.75. You have done two different Bayesian calcs, call them B1 and B2, producing estimates of 0.7333 and 0.6. If you line these up on a number line, you can see that :
  • MLE is most accurate if the population parameter is greater than (0.7333 + 0.75) / 2, approx 0.742
  • B1 is most accurate if the population parameter is between (0.6 + 0.7333) / 2 and (0.7333 + 0.75) / 2 (approx between 0.67 and 0.742)
  • B2 is most accurate if the population parameter is less than (0.6 + 0.7333) / 2 (approx 0.67).
Any of those three could be true given the information provided in the OP! (A quick numeric check of these break-even points is sketched below.)

As @ChrisVer points out, if the question doesn't state the population parameter's value, it is impossible to answer.
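
A quick numeric check of those break-even points (a sketch only; the candidate "true" values below are hypothetical):

Matlab:
% For a few hypothetical true parameter values, report which of the three
% estimates is closest. The crossovers sit at the midpoints ~0.667 and ~0.742.
est = [0.75, 0.7333, 0.6];                 % MLE, B1, B2
names = {'MLE', 'B1', 'B2'};
for uTrue = [0.60, 0.67, 0.70, 0.742, 0.80]
    [~, k] = min(abs(est - uTrue));
    fprintf('true u = %.3f -> closest: %s\n', uTrue, names{k});
end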
 
