- #1
fahraynk
- 186
- 6
I tried to derive an equation for one sample mean to converge to another sample mean within a 95% confidence interval, but I know I am wrong. Can someone tell me what I did wrong, and what is the correct formula?
Suppose:
##\hat{x_1},\hat{\sigma_1},N## are a sample mean, standard deviation calculated with ##N## samples,
##\hat{x_2},\hat{\sigma_2},n## are a sample mean, standard deviation calculated with ##n## samples ##n\leq N##
##\mu##,##\delta## are the true mean, true standard deviation for the population.
If ##d(\hat{x_1},\hat{x_2})## is a euclidean distance function on the sample means, then:
$$
d(\hat{x_1},\hat{x_2})\leq d(\hat{x_1},\mu)+ d(\hat{x_2},\mu)\leq 4\frac{\sigma_1}{\sqrt{N}}+4\frac{\sigma_2}{\sqrt{n}}
$$
With ##95##% confidence because : ##\mu\in [\hat{x_1}-\frac{2\sigma_1}{\sqrt{N}},\hat{x_1}+\frac{2\sigma_1}{\sqrt{N}}]## with 95% confidenceMy first question is, What is the relationship between sample standard deviation and population standard deviation?
When I take many samples, the standard deviation of the samples changes very little, so I assume the relationship ##\sigma_1=\sigma_2=\delta## :
$$
d(\hat{x_1},\hat{x_2})\leq 4(\frac{\sigma_1}{\sqrt{N}}+\frac{\sigma_2}{\sqrt{n}})=4\delta\frac{(\sqrt{N}+\sqrt{n})}{\sqrt{N}\sqrt{n}}=4\delta\frac{\frac{\sqrt{N}}{\sqrt{n}}+1}{\sqrt{N}}=>\\
=>\sqrt{N}d(\hat{x_1},\hat{x_2})-4\delta\leq\frac{4\delta\sqrt{N}}{\sqrt{n}}=>\\=>\sqrt{n}\leq\frac{4\delta\sqrt{N}}{\sqrt{N}d(\hat{x_1},\hat{x_2})-4\delta}
$$
But this can't be true, because if I choose ##d(\hat{x_1},\hat{x_2})=0## then ##\sqrt{n}\leq-\sqrt{N}##, but ##n## and ##N## must be positive.
What is wrong here?
Also, I am sure there must be a simple way to do this. What I really want to know is how to get ##n## as a function of ##d(\hat{x_1},\hat{x_2})## and ##\phi##, where ##\phi## is a confidence level, like 95% confidence.
Suppose:
##\hat{x_1},\hat{\sigma_1},N## are a sample mean, standard deviation calculated with ##N## samples,
##\hat{x_2},\hat{\sigma_2},n## are a sample mean, standard deviation calculated with ##n## samples ##n\leq N##
##\mu##,##\delta## are the true mean, true standard deviation for the population.
If ##d(\hat{x_1},\hat{x_2})## is a euclidean distance function on the sample means, then:
$$
d(\hat{x_1},\hat{x_2})\leq d(\hat{x_1},\mu)+ d(\hat{x_2},\mu)\leq 4\frac{\sigma_1}{\sqrt{N}}+4\frac{\sigma_2}{\sqrt{n}}
$$
With ##95##% confidence because : ##\mu\in [\hat{x_1}-\frac{2\sigma_1}{\sqrt{N}},\hat{x_1}+\frac{2\sigma_1}{\sqrt{N}}]## with 95% confidenceMy first question is, What is the relationship between sample standard deviation and population standard deviation?
When I take many samples, the standard deviation of the samples changes very little, so I assume the relationship ##\sigma_1=\sigma_2=\delta## :
$$
d(\hat{x_1},\hat{x_2})\leq 4(\frac{\sigma_1}{\sqrt{N}}+\frac{\sigma_2}{\sqrt{n}})=4\delta\frac{(\sqrt{N}+\sqrt{n})}{\sqrt{N}\sqrt{n}}=4\delta\frac{\frac{\sqrt{N}}{\sqrt{n}}+1}{\sqrt{N}}=>\\
=>\sqrt{N}d(\hat{x_1},\hat{x_2})-4\delta\leq\frac{4\delta\sqrt{N}}{\sqrt{n}}=>\\=>\sqrt{n}\leq\frac{4\delta\sqrt{N}}{\sqrt{N}d(\hat{x_1},\hat{x_2})-4\delta}
$$
But this can't be true, because if I choose ##d(\hat{x_1},\hat{x_2})=0## then ##\sqrt{n}\leq-\sqrt{N}##, but ##n## and ##N## must be positive.
What is wrong here?
Also, I am sure there must be a simple way to do this. What I really want to know is how to get ##n## as a function of ##d(\hat{x_1},\hat{x_2})## and ##\phi##, where ##\phi## is a confidence level, like 95% confidence.