Verifying Potato Weight: Hypothesis Testing and Beta Error Analysis

  • MHB
  • Thread starter mathmari
  • Tags
    Beta Error
In summary, the thread discusses a vegetable trader who sells potatoes in 5 kg bags and whether the average weight is truly 5 kg. A sample of size n = 25 gives an average of 5.2 kg with an empirical standard deviation of 0.5 kg, which raises the question of whether the average differs from 5 kg. The thread also covers the Beta error and the additional information needed to compute it. It concludes that the null hypothesis is that the population mean equals 5 kg, that the alternative hypothesis is that it differs from 5 kg, and that the Beta error cannot be given without assuming a true population mean.
  • #1
mathmari
Hey! :eek:

A vegetable trader sells potatoes in 5 kg bags. Since potatoes differ in size, it is difficult to ensure that each bag contains exactly 5 kg. The dealer claims that the average weight of the potatoes in a bag is 5 kilograms. This has to be checked with a sample.
The sample (n = 25) gives an average value of 5.2 kg and an empirical standard deviation of 0.5 kg. Based on that sample, can we say that the average weight of the potatoes in a bag differs from 5 kg? The significance level is 0.05.

We have the following:

Null hypothesis ($H_0$): The average weight of the potatoes is 5 kg.

Alternative hypothesis ($H_1$): The average weight of the potatoes is not 5 kg, but 5.2kg.

Is this correct? (Wondering)
The Beta error is the probability that the null hypothesis is retained although it is false.

The following is then asked:

How big is the Beta error?
a) 0,25
b) 0,05
c) 0,95
d) 0,1
e) cannot be given

I have done the following:

We have that the significance level is 0.05.

So, $P(Z < z) = 0.05$ and we get a z-value of $-1.645$.

We convert this to an X-value: $5-1.645\frac{0.5}{\sqrt{25}}=5-1.645\cdot 0.1=4.8355$

Then we have the following: $P(X > 4.8355) = P\left[Z > \frac{4.8355-5.2}{0.1}\right] = P(Z > -3.645) = 1-P(Z\leq -3.645)$. From a normal table we get $\beta = 1-P(Z\leq -3.645) = 1-0.00014=0.99986$.

I must have done something wrong, since this answer is not one of the choices. But what? (Wondering)
 
  • #2
mathmari said:
Hey! :eek:

A vegetable trader sells potatoes in 5 kg bags. Since potatoes differ in size, it is difficult to ensure that each bag contains exactly 5 kg. The dealer claims that the average weight of the potatoes in a bag is 5 kilograms. This has to be checked with a sample.
The sample (n = 25) gives an average value of 5.2 kg and an empirical standard deviation of 0.5 kg. Based on that sample, can we say that the average weight of the potatoes in a bag differs from 5 kg? The significance level is 0.05.

We have the following:

Null hypothesis ($H_0$): The average weight of the potatoes is 5 kg.

Alternative hypothesis ($H_1$): The average weight of the potatoes is not 5 kg, but 5.2kg.

Is this correct? (Wondering)

Hey mathmari! ;)

A hypothesis is not supposed to include a sample measurement - only a statement about the population that we want to test.

mathmari said:
The Beta error is the probability that the null hypothesis is retained although it is false.

The following is then asked:

How big is the Beta error?
a) 0,25
b) 0,05
c) 0,95
d) 0,1
e) cannot be given

I have done the following:

We have that the significance level is 0.05.

So, $P(Z < z) = 0.05$ and we get a z-value of $-1.645$.

Since the alternative hypothesis is an inequality, shouldn't that be P(Z<z)<0.05 OR P(Z>z)<0.05? (Wondering)

Where or how did you get that z-value?

mathmari said:
We convert this to an X-value: $5-1.645\frac{0.5}{\sqrt{25}}=5-1.645\cdot 0.1=4.8355$

Then we have the following: $P(X > 4.8355) = P\left[Z > \frac{4.8355-5.2}{0.1}\right] = P(Z > -3.645) = 1-P(Z\leq -3.645)$. From a normal table we get $\beta = 1-P(Z\leq -3.645) = 1-0.00014=0.99986$.

I must have done something wrong, since this answer is not one of the choices. But what? (Wondering)

To say anything about beta, we basically need to know what the real distribution of the population is, so that we can estimate whether the null hypothesis holds. That distribution is not given is it? (Wondering)
 
  • #3
I like Serena said:
A hypothesis is not supposed to include a sample measurement - only a statement about the population that we want to test.

So, the hypothesis doesn't say anything about the average weight of the potatoes, does it? Is the null hypothesis then:
"The weight of one single bag is equal to the average weight. So, the weight of one single bag is 5kg." ? (Wondering)

I like Serena said:
Where or how did you get that z-value?

From the table we get the result $0.05$ for $z=-1.645$. Is this wrong? (Wondering)
I like Serena said:
To say anything about beta, we basically need to know what the real distribution of the population is, so that we can estimate whether the null hypothesis holds. That distribution is not given is it? (Wondering)

So, we cannot say anything about beta, can we? (Wondering)

What additional information about the distribution do we have to know to compute it? For example, whether the distribution is normal? (Wondering)
 
  • #4
mathmari said:
So, the hypothesis doesn't say anything about the average weight of the potatoes, does it? Is the null hypothesis then:
"The weight of one single bag is equal to the average weight. So, the weight of one single bag is 5kg." ? (Wondering)

The null hypothesis is that the population mean $\mu$ is equal to a certain number.
The alternative hypothesis is that the population mean $\mu$ somehow differs from that certain number.

So we should have:
$$H_0: \mu = 5\text{ kg} \\ H_1: \mu \ne 5 \text{ kg}$$
This form of $H_1$ is called 2-sided, since the real population mean $\mu$ could be either higher or lower (from the word 'differs') than $5\text{ kg}$. (Nerd)
mathmari said:
From the table we get the result $0.05$ for $z=-1.645$. Is this wrong? (Wondering)

Ah. I see what you mean. This is the so called critical z-value for $\alpha=0.05$ of a 1-sided alternative hypothesis (like $H_1: \mu > 5 \text{ kg}$).
However, since we have a 2-sided alternative hypothesis, we should look up the z-value for $\frac\alpha 2=0.025$, which is $z^*=1.96$ ($z^*$ to denote the critical z-value).
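
A minimal sketch of where those two table values come from, assuming a standard normal reference distribution and using only Python's standard library; the variable names are illustrative and not part of the original exercise:

```python
from statistics import NormalDist

alpha = 0.05
std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

# One-sided alternative: all of alpha sits in a single tail.
z_one_sided = std_normal.inv_cdf(1 - alpha)

# Two-sided alternative: alpha is split evenly, alpha/2 in each tail.
z_two_sided = std_normal.inv_cdf(1 - alpha / 2)

print(f"one-sided z* = {z_one_sided:.3f}")
print(f"two-sided z* = {z_two_sided:.3f}")
```

The one-sided value matches the $1.645$ read from the table (up to sign), and the two-sided value is the $z^* = 1.96$ used from here on.
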
mathmari said:
So, we cannot say anything about beta, can we? (Wondering)

What additional information about the distribution do we have to know to compute it? For example, whether the distribution is normal? (Wondering)

To calculate $\beta$ we typically need the real $\mu$ and $\sigma$ of the population, where $\mu$ is such that the alternative hypothesis is satisfied.
With those we can calculate the probability $\beta$ that we keep $H_0$ even though the real population distribution matches the $H_1$ hypothesis. (Thinking)
 
  • #5
I like Serena said:
The null hypothesis is that the population mean $\mu$ is equal to a certain number.
The alternative hypothesis is that the population mean $\mu$ somehow differs from that certain number.

So we should have:
$$H_0: \mu = 5\text{ kg} \\ H_1: \mu \ne 5 \text{ kg}$$
This form of $H_1$ is called 2-sided, since the real population mean $\mu$ could be either higher or lower (from the word 'differs') than $5\text{ kg}$. (Nerd)

Ah ok. I see! (Smile)
I like Serena said:
Ah. I see what you mean. This is the so called critical z-value for $\alpha=0.05$ of a 1-sided alternative hypothesis (like $H_1: \mu > 5 \text{ kg}$).
However, since we have a 2-sided alternative hypothesis, we should look up the z-value for $\frac\alpha 2=0.025$, which is $z^*=1.96$ ($z^*$ to denote the critical z-value).

Ah ok. (Thinking)
I like Serena said:
To calculate $\beta$ we typically need the real $\mu$ and $\sigma$ of the population, where $\mu$ is such that the alternative hypothesis is satisfied.
With those we can calculate the probability $\beta$ that we keep $H_0$ even though the real population distribution matches the $H_1$ hypothesis. (Thinking)

We have that "The sample (n = 25) gives an average value of 5.2 kg and an empirical standard deviation of 0.5 kg.".
Do we not get from that that $\mu=5.2$ and $\sigma=0.5$ ? (Wondering)
 
  • #6
mathmari said:
We have that "The sample (n = 25) gives an average value of 5.2 kg and an empirical standard deviation of 0.5 kg.".
Do we not get from that that $\mu=5.2$ and $\sigma=0.5$ ? (Wondering)

mathmari said:
The sample (n = 25) gives an average value of 5.2 kg and an empirical standard deviation of 0.5 kg. By that sample can we say that that the average weight of the potatoes in the bag differs from 5 kg? The significance level is 0.05.

The problem statement asks whether the alternative hypothesis is true.
The way to do that is to calculate the z-score given by:
$$SE = \frac{\sigma}{\sqrt n} \\ z = \frac{\bar x - 5\text{ kg}}{SE}$$
where $\sigma$ is the empirical standard deviation (and not the standard deviation of the sample), and $SE$ is the so called standard error.

We didn't do that yet did we? (Wondering)
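
A minimal sketch of that calculation in Python, assuming the empirical standard deviation stands in for $\sigma$ and that a normal approximation is acceptable for $n = 25$; the variable names are illustrative only:

```python
from math import sqrt
from statistics import NormalDist

mu_0 = 5.0    # mean claimed under H0, in kg
x_bar = 5.2   # sample mean, in kg
s = 0.5       # empirical standard deviation, in kg (used in place of sigma)
n = 25        # sample size
alpha = 0.05  # significance level

se = s / sqrt(n)          # standard error of the mean
z = (x_bar - mu_0) / se   # observed z-score

# Two-sided critical value: alpha/2 in each tail.
z_star = NormalDist().inv_cdf(1 - alpha / 2)

# Reject H0 when the observed z-score lies beyond the critical value.
reject_h0 = abs(z) > z_star

print(f"SE = {se:.2f}, z = {z:.2f}, z* = {z_star:.2f}, reject H0: {reject_h0}")
```

Under these assumptions $z = 2.0$ only just exceeds $z^* \approx 1.96$, so $H_0$ would be rejected at the 5% level. With a t-distribution and 24 degrees of freedom the critical value is about 2.06, so the result is borderline.
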

mathmari said:
The Beta error is the possibility that the null hypothesis is hold although it is false.

Then it s asked the following:

How big is the Beta error?

Well... the actual population mean is not given, so I don't think we can calculate $\beta$.
Alternatively, perhaps we're supposed to assume that the sample mean is somehow representative of the real distribution, which is a bit of a stretch... (Thinking)
If so we should assume that $\mu = 5.2\text{ kg}$ and $\sigma=0.5\text{ kg}$, after which we should calculate:
$$\beta = P\big((X < 5 + z^* \cdot SE) \land (X > 5 - z^* \cdot SE)\big)$$
or:
$$\beta = P\left(\left(Z < \frac{5 - 5.2}{SE} + z^*\right) \land \left(Z > \frac{5 - 5.2}{SE} - z^*\right)\right)$$
which is the probability that we keep $H_0$ even though the population distribution is assumed to be given by the sample mean and the empirical standard deviation.
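
The formula can be evaluated numerically. The following is only a sketch under the assumption stated above, namely that the true population has $\mu = 5.2\text{ kg}$ and $\sigma = 0.5\text{ kg}$, which the problem itself does not give; the names `mu_assumed` and `sigma_assumed` are hypothetical:

```python
from math import sqrt
from statistics import NormalDist

mu_0 = 5.0            # mean under H0, in kg
mu_assumed = 5.2      # ASSUMPTION: true mean taken from the sample mean
sigma_assumed = 0.5   # ASSUMPTION: true sd taken from the empirical sd
n = 25
alpha = 0.05

se = sigma_assumed / sqrt(n)
z_star = NormalDist().inv_cdf(1 - alpha / 2)

# Acceptance region for the sample mean under H0.
lower = mu_0 - z_star * se
upper = mu_0 + z_star * se

# Sampling distribution of the mean if the assumed alternative were true.
sampling = NormalDist(mu=mu_assumed, sigma=se)

# beta = probability that the sample mean still lands in the acceptance region.
beta = sampling.cdf(upper) - sampling.cdf(lower)

print(f"acceptance region: [{lower:.3f}, {upper:.3f}] kg, beta = {beta:.3f}")
```

The lower limit contributes almost nothing here, because it lies about four standard errors below the assumed mean.
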
 
  • #7
I like Serena said:
Alternatively, perhaps we're supposed to assume that the sample mean is somehow representative of the real distribution, which is a bit of a stretch... (Thinking)
If so we should assume that $\mu = 5.2\text{ kg}$ and $\sigma=0.5\text{ kg}$, after which we should calculate:
$$\beta = P\big((X < 5 + z^* \cdot SE) \land (X > 5 - z^* \cdot SE)\big)$$
or:
$$\beta = P\left(\left(Z < \frac{5 - 5.2}{SE} + z^*\right) \land \left(Z < \frac{5 - 5.2}{SE} - z^*\right)\right)$$
which is the probability that we keep $H_0$ even though the population distribution is assumed to be given by the sample mean and the empirical standard deviation.

What is $z^*$ ? The one that you defined above, $z = \frac{\bar x - 5\text{ kg}}{SE}$ ? (Wondering)
 
  • #8
mathmari said:
What is $z^*$ ? The one that you defined above, $z = \frac{\bar x - 5\text{ kg}}{SE}$ ? (Wondering)

No. $z^*$ is the so called critical z-value.

I like Serena said:
mathmari said:
From the table we get the result $0.05$ for $z=-1.645$. Is this wrong? (Wondering)

Ah. I see what you mean. This is the so called critical z-value for $\alpha=0.05$ of a 1-sided alternative hypothesis (like $H_1: \mu > 5 \text{ kg}$).
However, since we have a 2-sided alternative hypothesis, we should look up the z-value for $\frac\alpha 2=0.025$, which is $z^*=1.96$ ($z^*$ to denote the critical z-value).

In this problem we have $z^* = 1.96$, which is in your table, and which is the most common critical z-value. (Nerd)
 
  • #9
I like Serena said:
No. $z^*$ is the so called critical z-value.
In this problem we have $z^* = 1.96$, which is in your table, and which is the most common critical z-value. (Nerd)
Ah ok. So, do we have the following? (Wondering)

\begin{align*}\beta &= P\left(\left(Z < \frac{5 - 5.2}{SE} + z^*\right) \land \left(Z < \frac{5 - 5.2}{SE} - z^*\right)\right) \\ & =P\left(\left(Z < \frac{0.2}{0.1} + 1.96\right) \land \left(Z < \frac{0.2}{0.1} - 1.96\right)\right) \\ & =P\left(\left(Z < 3.96\right) \land \left(Z < 0.04\right)\right) \\ & =P\left(Z < 0.04\right) \\ & =0.51595\end{align*}
 
  • #10
mathmari said:
Ah ok. So, do we have the following? (Wondering)

\begin{align*}\beta &= P\left(\left(Z < \frac{5 - 5.2}{SE} + z^*\right) \land \left(Z < \frac{5 - 5.2}{SE} - z^*\right)\right) \\ & =P\left(\left(Z < \frac{0.2}{0.1} + 1.96\right) \land \left(Z < \frac{0.2}{0.1} - 1.96\right)\right) \\ & =P\left(\left(Z < 3.96\right) \land \left(Z < 0.04\right)\right) \\ & =P\left(Z < 0.04\right) \\ & =0.51595\end{align*}

Erm... (Blush)

That should be:
\begin{align*}
\beta &= P\left(\left(Z < \frac{5 - 5.2}{SE} + z^*\right) \land \left(Z > \frac{5 - 5.2}{SE} - z^*\right)\right) \\
& =P\left(\left(Z < \frac{-0.2}{0.1} + 1.96\right) \land \left(Z > \frac{-0.2}{0.1} - 1.96\right)\right) \\
& =P\left(\left(Z < -0.04\right) \land \left(Z > -3.96\right)\right) \\
& \approx P\left(Z < -0.04\right)
\end{align*}
(Thinking)
 
  • #11
I like Serena said:
Erm... (Blush)

That should be:
\begin{align*}
\beta &= P\left(\left(Z < \frac{5 - 5.2}{SE} + z^*\right) \land \left(Z > \frac{5 - 5.2}{SE} - z^*\right)\right) \\
& =P\left(\left(Z < \frac{-0.2}{0.1} + 1.96\right) \land \left(Z > \frac{-0.2}{0.1} - 1.96\right)\right) \\
& =P\left(\left(Z < -0.04\right) \land \left(Z > -3.96\right)\right) \\
& \approx P\left(Z < -0.04\right)
\end{align*}
(Thinking)

Oh yes... (Blush)

So, the result is then $0.48405$, or not? (Wondering)
 
  • #12
mathmari said:
Oh yes... (Blush)

So, the result is then $0.48405$, or not? (Wondering)

That looks about right. (Nod)

And since it is not in the list of given answers, perhaps the answer should be "cannot be given" after all? (Wondering)
 
  • #13
I like Serena said:
That looks about right. (Nod)

And since it is not in the list of given answers, perhaps the answer should be "cannot be given" after all? (Wondering)

It must be so.

Thank you very much! (Happy)
 
  • #14
For the record, this is what the distributions look like when trying to determine $\beta$.
\begin{tikzpicture}
%preamble \usepackage{pgfplots}
\pgfmathdeclarefunction{gauss}{3}{%
\pgfmathparse{1/(#3*sqrt(2*pi))*exp(-((#1-#2)^2)/(2*#3^2))}%
}
\begin{axis}[
no markers, domain=4.5:5.7, samples=100,
axis lines*=left, xlabel=$x$, ylabel=$p$,
every axis y label/.style={at=(current axis.above origin),anchor=south},
every axis x label/.style={at=(current axis.right of origin),anchor=west},
height=5cm, width=12cm,
xtick={5,5.2}, ytick=\empty,
enlargelimits=false, clip=false, axis on top,
grid = major
]
\addplot [fill=cyan!30, draw=none, domain=4.5:5.19] {gauss(x,5.2,0.1)} \closedcycle;
\addplot [fill=red!30, draw=none, domain=5.19:5.7] {gauss(x,5,0.1)} \closedcycle;
\addplot [fill=red!30, draw=none, domain=4.5:4.81] {gauss(x,5,0.1)} \closedcycle;
\addplot [very thick,cyan!50!black] {gauss(x,5,0.1)};
\addplot [very thick,cyan!50!black] {gauss(x,5.2,0.1)};

\draw [yshift=-0.6cm, latex-latex](axis cs:5,0) -- node [fill=white] {$1.96\,SE$} (axis cs:5.19,0);
\node at (axis cs:5.12, 1.1) {$\beta$};
\node at (axis cs:4.78, 0.16) {$\alpha/2$};
\node at (axis cs:5.22, 0.16) {$\alpha/2$};
\node at (axis cs:5, 4.3) {$N(5,SE)$};
\node at (axis cs:5.2, 4.3) {$N(\mu,SE)$};
\end{axis}
\end{tikzpicture}

Each light-red tail has area $\alpha/2=0.025$; together they form the region where we reject the null hypothesis.
The cyan part is $\beta \approx 0.48$, where we keep the null hypothesis even though it should have been rejected based on the actual population distribution (which is usually not known). (Thinking)
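
To illustrate why $\beta$ cannot be stated without the actual population mean, here is a small sketch that evaluates it for several hypothetical true means. The function `beta_two_sided` and the list of means are illustrative assumptions, with $\sigma = 0.5\text{ kg}$ and $n = 25$ taken from the thread:

```python
from math import sqrt
from statistics import NormalDist

def beta_two_sided(mu_true, mu_0=5.0, sigma=0.5, n=25, alpha=0.05):
    """Probability of keeping H0: mu = mu_0 when the true mean is mu_true."""
    se = sigma / sqrt(n)
    z_star = NormalDist().inv_cdf(1 - alpha / 2)
    sampling = NormalDist(mu=mu_true, sigma=se)
    # Chance that the sample mean falls inside the acceptance region.
    return sampling.cdf(mu_0 + z_star * se) - sampling.cdf(mu_0 - z_star * se)

# Hypothetical true means; 5.0 is included only as the limiting case,
# where H0 is actually true and the value equals 1 - alpha.
for mu_true in (5.0, 5.1, 5.2, 5.3, 5.4):
    print(f"mu = {mu_true:.1f} kg -> P(keep H0) = {beta_two_sided(mu_true):.3f}")
```

The value falls steadily as the assumed true mean moves away from 5 kg, so each assumption gives a different $\beta$.
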
 

What is a beta error in a hypothesis?

A beta error, also known as a type II error, occurs when a null hypothesis is retained even though it is actually false. In other words, the test fails to detect a difference or relationship between two variables when there actually is one.

How does a beta error differ from an alpha error?

A beta error is the counterpart of an alpha error (type I error). A beta error occurs when a null hypothesis is retained although it should be rejected, whereas an alpha error occurs when a null hypothesis is rejected although it is true. Both types of errors can lead to incorrect conclusions in a hypothesis test.

What factors can contribute to a beta error?

Several factors can contribute to a beta error, including a small sample size, a small effect size, high variability in the data, and a strict significance level; each of these lowers the statistical power $1-\beta$. Additionally, using an inappropriate statistical test or failing to control for confounding variables can increase the likelihood of a beta error.

How can a researcher reduce the risk of a beta error?

In order to reduce the risk of a beta error, a researcher can increase the sample size, use a more sensitive statistical test, and ensure that all relevant variables are controlled for in the analysis. It is also important for researchers to carefully plan their study design and conduct a power analysis to determine the appropriate sample size for their research.
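
As a rough illustration of such a power analysis, the sketch below uses the common normal-approximation formula $n \approx \left(\frac{(z_{\alpha/2} + z_{\beta})\,\sigma}{\delta}\right)^2$ for a two-sided test. The function name `required_sample_size`, the 80% target power, and the effect size $\delta = 0.2\text{ kg}$ are assumptions chosen for illustration, and the formula neglects the small contribution of the far rejection tail:

```python
from math import ceil
from statistics import NormalDist

def required_sample_size(delta, sigma, alpha=0.05, power=0.80):
    """Approximate n for a two-sided z-test to detect a mean shift of delta."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_power = std_normal.inv_cdf(power)          # quantile for power = 1 - beta
    return ceil(((z_alpha + z_power) * sigma / delta) ** 2)

# Hypothetical inputs: detect a 0.2 kg shift when sigma is about 0.5 kg.
n_needed = required_sample_size(delta=0.2, sigma=0.5)
print(f"sample size for 80% power: {n_needed}")
```

A higher target power or a smaller effect size increases the required sample size.
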

What are the consequences of a beta error in a hypothesis?

A beta error can have serious consequences in research, as it can lead to incorrect conclusions and potentially impact the validity of a study. It can also result in missed opportunities for further investigation and potential advancements in a field. Additionally, a beta error can lead to wasted time and resources, as well as potential harm to individuals if the incorrect conclusion is applied in real-world settings.
