Hypothesis Test for Public Transport On-Time Claim with Example

  • Context: MHB
  • Thread starter: mathmari
  • Tags: Delay, Test
SUMMARY

The discussion centers on testing the public transport company's claim that its buses are at least 95% on time, using a sample of 1,000 buses where 66 were delayed. The null hypothesis is defined as H0: p ≥ 95%, while the alternative hypothesis is H1: p < 95%. Using R, the p-value was calculated as approximately 0.0149, leading to the rejection of the null hypothesis at a significance level of α = 0.10. The conversation also explores Type I and Type II errors, emphasizing the need to understand the actual population distribution to assess the Type II error accurately.

mathmari
Hey! :o

A public transport company claims that at least $95\%$ of its buses are on time. (A bus still counts as on time here if it is at most $3$ minutes behind the timetable.) A sample of size $n = 1000$ taken at various stops yields $66$ delays. The probability that a randomly selected bus arrives on time is denoted by $p$.
(a) As a passenger, you doubt the company's claim. Test the claim at a significance level of $\alpha = 0.10$.
(b) Use an example to explain the type II error (error of the second kind).

I have done the following:

(a) The null hypothesis is $H_0: p\geq 95\%$. The alternative hypothesis is therefore $H_1: p<95\%$. With $X$ denoting the number of on-time buses in the sample, the rejection area is $\overline{A}=\{0, \ldots , k\}$ and the acceptance area is $\{k+1, \ldots , 1000\}$.

From the significance level we have that $P(X\leq k)\leq 0.10\Rightarrow F(1000, 0.95, k)\leq 0.10$, right? I haven't really understood how we can read the value of $k$ from a table. Could you explain it to me?

(Wondering)
 
Hey mathmari! (Wave)

a) In R, which can also be run online, we can execute:
Code:
# Probability of getting 1000-66 or fewer on-time events from 1000 trials with success probability 0.95
message("p=", pbinom(1000-66,1000,0.95))

# The smallest number of on-time events whose cumulative probability reaches 0.10 after 1000 trials
message("k+1=", qbinom(0.1,1000,0.95))
The result is:
Code:
p=0.0149304062062877
k+1=941

It means that we can reject $H_0$ since the p-value $0.0149$ is less than $\alpha=0.10$.
Or alternatively reject $H_0$ since our rejection area is up to $k=940$, while we have $1000-66=934 \le 940$ on-time events. (Thinking)

b) As for the type II error, how about we consider the example where the real population has a delay rate of $0.066$, while our null hypothesis assumes it's $0.050$ or less. And we pick $\alpha=0.10$.
Then we get the graph:
\begin{tikzpicture}
%preamble \usepackage{pgfplots}
\pgfmathdeclarefunction{gauss}{3}{%
\pgfmathparse{1/(#3*sqrt(2*pi))*exp(-((#1-#2)^2)/(2*#3^2))}%
}
\begin{axis}[
no markers, domain=25:90, samples=100,
axis lines*=left, xlabel=$k$, ylabel=$p$,
every axis y label/.style={at=(current axis.above origin),anchor=south},
every axis x label/.style={at=(current axis.right of origin),anchor=west},
height=5cm, width=12cm,
xtick={50,60,66}, ytick=\empty,
enlargelimits=false, clip=false, axis on top,
grid = major
]
\addplot [fill=cyan!30, draw=none, domain=40:60] {gauss(x,66,7)} \closedcycle;
\addplot [fill=red!30, draw=none, domain=60:70] {gauss(x,50,7)} \closedcycle;
\addplot [very thick,cyan!50!black] {gauss(x,50,7)};
\addplot [very thick,cyan!50!black] {gauss(x,66,7)};
\node at (axis cs:57, 0.005) {$\beta$};
\node at (axis cs:63, 0.005) {$\alpha$};
\node at (axis cs:38, 0.05) {$B(1000,0.050)$};
\node at (axis cs:78, 0.05) {$B(1000,0.066)$};
\end{axis}
\end{tikzpicture}
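The horizontal axis counts delayed buses, so we reject $H_0$ from $60$ delays on, which corresponds to at most $940$ on-time buses. The probability of a type I error is $\alpha$ and the probability of a type II error is $\beta$.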
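
The two shaded areas in the graph can also be computed from the exact binomial distributions instead of the bell-curve sketch. The following is a minimal sketch in base R, assuming, as in the example, a true on-time probability of $0.934$; the variable names are illustrative and not part of the original posts.

```r
n      <- 1000
alpha  <- 0.10
p0     <- 0.95    # on-time probability at the boundary of H0
p_true <- 0.934   # assumed true on-time probability (delay rate 0.066)

# Largest k with P(X <= k | p0) <= alpha; counts run 0..n, hence the "- 1"
k <- max(which(pbinom(0:n, n, p0) <= alpha)) - 1

# Probability of a type I error actually attained at p0 (at most alpha)
alpha_attained <- pbinom(k, n, p0)

# Probability of a type II error: X > k although the true p is below p0
beta <- pbinom(k, n, p_true, lower.tail = FALSE)

message("reject H0 if on-time count <= ", k, " (delays >= ", n - k, ")")
message("attained alpha = ", signif(alpha_attained, 3))
message("beta           = ", signif(beta, 3))
```

The critical value found this way should match the $k=940$ from part a), which is the boundary at $60$ delays in the graph.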
 
I like Serena said:
a) [...]
It means that we can reject $H_0$ since $p<0.10$.
Or alternatively reject $H_0$ since our rejection area is up to $k=940$, while we have $1000-66=934 \le 940$ on-time events. (Thinking)

Let $X$ be the number of on-time events. Under $H_0$ (at its boundary), $X$ is a random variable with a binomial distribution with $n=1000$ and $p=0.95$.

We want to calculate the p-value.

The p-value is defined as the probability, under the null hypothesis, of obtaining a result equal to or more extreme than what was actually observed. In this case, it was observed that there were $1000-66=934$ on-time events. More extreme than that would be if the number of on-time events were less than $934$.
So the p-value is $$p\text{-value}=P(X\leq 934)=\sum_{i=0}^{934}\binom{1000}{i}0.95^i\cdot 0.05^{1000-i}\approx 0.0149$$ If the p-value is less than the significance level, we reject the null hypothesis.
Since $p\text{-value}=0.0149<0.10=\alpha$ we reject $H_0: p =95\%$.
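
As a cross-check, the p-value can be evaluated in R in three equivalent ways: through the cumulative distribution function, through the explicit sum in the formula, and through the exact binomial test. This is a minimal sketch using only base R; the variable names are illustrative.

```r
n  <- 1000        # sample size
x  <- 1000 - 66   # observed on-time buses (934)
p0 <- 0.95        # on-time probability at the boundary of H0

# 1) Cumulative distribution function, as used earlier in the thread
p_cdf <- pbinom(x, n, p0)

# 2) The explicit sum: sum_{i=0}^{934} C(1000, i) 0.95^i 0.05^(1000-i)
p_sum <- sum(dbinom(0:x, n, p0))

# 3) Base R's exact binomial test with a one-sided ("less") alternative
p_test <- binom.test(x, n, p = p0, alternative = "less")$p.value

message("pbinom:     ", signif(p_cdf, 6))
message("sum dbinom: ", signif(p_sum, 6))
message("binom.test: ", signif(p_test, 6))
```

All three values should agree with the $0.0149$ reported in the thread.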

Have I understood that correctly? (Wondering)

I like Serena said:
b) As for the type II error, how about we consider the example where the real population has a delay rate of $0.066$, while our null hypothesis assumes it's $0.050$ or less. And we pick $\alpha=0.10$.
[...]

Could you explain this part further to me? (Wondering)
 
mathmari said:
[...]
Since $p\text{-value}=0.0149<0.10=\alpha$ we reject $H_0: p =95\%$.

All correct. (Nod)
Although we actually reject $H_0: p \ge 95\%$. (Nerd)

mathmari said:
Could you explain this part further to me? (Wondering)

To say anything about the Type II error, we need to know what the actual population distribution is.
In my proposed example we're assuming that it is a binomial distribution of 1000 trials with a delay rate of $0.066$, i.e. $66$ expected delays.

The Type I error ($\alpha$) is the probability that we reject the null-hypothesis, even though the null-hypothesis is actually true. We choose this $\alpha$ ourselves, and in the example we have set it to $\alpha=0.10$.

The Type II error ($\beta$) is the probability that we do not reject the null-hypothesis, even though the null-hypothesis is false. That is, the population is different from the null-hypothesis as we assumed it to be.
Typically $\beta$ is much larger than $\alpha$.
That is, we only reject the null-hypothesis if we have sufficient evidence.
If we don't have sufficient evidence, it may just mean that we need a larger sample, and even then we can never be quite sure that the null-hypothesis is true. The Type II error ($\beta$) indicates how likely it is that we keep a null-hypothesis that is actually false.

We can calculate $\beta$ if and only if we know what the actual population is, which is presumed to be distinct from the population of the null-hypothesis. (Thinking)
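
To illustrate how $\beta$ depends on the assumed actual population and on the sample size, here is a minimal sketch in base R. The helper `type2_error` and the chosen alternative values are hypothetical and only serve as an illustration.

```r
# Hypothetical helper: probability of a type II error for the one-sided test
# of H0: p >= p0 when the true on-time probability is p_true
type2_error <- function(n, p_true, p0 = 0.95, alpha = 0.10) {
  # Largest k with P(X <= k | p0) <= alpha (reject H0 when X <= k)
  k <- max(which(pbinom(0:n, n, p0) <= alpha)) - 1
  # Not rejecting means X > k; evaluate that under the true distribution
  pbinom(k, n, p_true, lower.tail = FALSE)
}

p_alt <- c(0.94, 0.934, 0.92, 0.90)   # assumed alternatives, all below 0.95
beta_by_p <- sapply(p_alt, function(p) type2_error(1000, p))
beta_by_n <- sapply(c(1000, 4000), function(n) type2_error(n, 0.934))

print(data.frame(p_true = p_alt, beta = signif(beta_by_p, 3)))
print(data.frame(n = c(1000, 4000), beta = signif(beta_by_n, 3)))
```

The further the true on-time probability lies below $0.95$, and the larger the sample, the smaller $\beta$ becomes.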
 
I like Serena said:
All correct. (Nod)
Although we actually reject $H_0: p \ge 95\%$. (Nerd)

Oh yes (Blush)
I like Serena said:
[...]
The Type I error ($\alpha$) is the probability that we reject the null-hypothesis, even though the null-hypothesis is actually true. We choose this $\alpha$ ourselves, and in the example we have set it to $\alpha=0.10$.

The Type I error is always the significance level that we choose, right? Since it is like that, we don't need to know the actual population distribution for $\alpha$.

I like Serena said:
The Type II error ($\beta$) is the probability that we do not reject the null-hypothesis, even though the null-hypothesis is false. That is, the population is different from the null-hypothesis as we assumed it to be.
[...]

What do you mean by "the population is different from the null-hypothesis as we assumed it to be" ? (Wondering)
 
mathmari said:
The Type I error is always the significance level that we choose, right? Since it is like that, we don't need to know the actual population distribution for $\alpha$.

Yep. (Nod)
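
One nuance, offered as a side note: because $X$ is discrete and $H_0: p \ge 95\%$ covers many values of $p$, the chosen $\alpha$ acts as an upper bound on the probability of a type I error. The sketch below, in base R with illustrative values of $p$ inside $H_0$, shows the probability of landing in the rejection area $\{0, \ldots, 940\}$.

```r
n <- 1000
k <- 940   # upper end of the rejection region found earlier in the thread

# Probability of rejecting H0 (X <= k) for several on-time probabilities
# that all satisfy H0: p >= 0.95
p_null <- c(0.95, 0.96, 0.97, 0.98)
reject_prob <- pbinom(k, n, p_null)

print(data.frame(p = p_null, P_reject = signif(reject_prob, 3)))
```

The rejection probability is largest at the boundary $p=0.95$ and stays at or below $\alpha=0.10$ there, so the level is respected for every $p$ in $H_0$.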

mathmari said:
What do you mean by "the population is different from the null-hypothesis as we assumed it to be" ? (Wondering)

We have the distribution of the null hypothesis, and we have the presumed distribution of the alternative hypothesis. The type II error is the probability that we keep the null hypothesis even though the alternative hypothesis is true. (Thinking)
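
Both error probabilities can also be approximated by simulation, which may make the definitions more tangible. This is a minimal sketch in base R, assuming the rejection area $\{0, \ldots, 940\}$ from part a) and the example alternative with on-time probability $0.934$; the seed and the number of simulated samples are arbitrary choices.

```r
set.seed(1)        # arbitrary seed, only for reproducibility
n     <- 1000
k     <- 940       # reject H0 when the on-time count is <= k
n_sim <- 100000    # number of simulated samples per scenario

# Scenario 1: H0 is true at its boundary (p = 0.95)
x_null <- rbinom(n_sim, n, 0.95)
sim_alpha <- mean(x_null <= k)     # share of wrong rejections

# Scenario 2: the assumed alternative is true (p = 0.934)
x_alt <- rbinom(n_sim, n, 0.934)
sim_beta <- mean(x_alt > k)        # share of wrong non-rejections

message("simulated type I error rate:  ", sim_alpha)
message("simulated type II error rate: ", sim_beta)
```

The simulated rates should be close to the exact values `pbinom(940, 1000, 0.95)` and `pbinom(940, 1000, 0.934, lower.tail = FALSE)`.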
 
I like Serena said:
We have the distribution of the null hypothesis, and we have the presumed distribution of the alternative hypothesis. The type II error is the probability that we keep the null hypothesis even though the alternative hypothesis is true. (Thinking)

Under the null hypothesis we suppose that at least $95\%$ of the buses are on time, so the population at the null hypothesis has $95\%$ of $1000$ on time, and so $5\%$ of $1000$ are delayed, i.e. $50$ of $1000$.
It has been observed that $66$ of $1000$ buses were delayed, which falls under the alternative hypothesis.

So the population at the null hypothesis is $50$ and at the alternative hypothesis it is $66$.

Have I understood that correctly? Or did you mean something else? (Wondering)
 
mathmari said:
[...]

So the population at the null hypothesis is $50$ and at the alternative hypothesis it is $66$.

Have I understood that correctly? Or did you mean something else?

That is correct. (Nod)

Do note that we're assuming that the population of the alternative hypothesis has the distribution that we've observed.
And in general a hypothesis cannot depend on a sample.
But that's the example that we've chosen. (Nerd)
 
Ah ok! Thank you! (Smile)
 
