Convergence of 2 sample means with 95% confidence

Discussion Overview

The discussion revolves around the convergence of two sample means within a 95% confidence interval, exploring the mathematical relationships and assumptions involved in deriving an equation for this convergence. Participants examine the implications of sample sizes, standard deviations, and the conditions under which the Central Limit Theorem (CLT) applies.

Discussion Character

  • Exploratory
  • Technical explanation
  • Debate/contested
  • Mathematical reasoning

Main Points Raised

  • One participant attempts to derive an equation relating two sample means and expresses uncertainty about the correctness of their approach, particularly regarding the relationship between sample and population standard deviations.
  • Another participant points out a potential error in the manipulation of inequalities, emphasizing the importance of direction when dividing by negative quantities.
  • Concerns are raised about the assumptions underlying the use of sample standard deviations and the normal distribution of mean estimates, with references to the Central Limit Theorem and the need for sufficient sample sizes.
  • One participant suggests exploring the use of correlation as an inner product to define a distance metric, indicating an alternative approach to the problem.
  • Another participant highlights the ambiguity in the original question, suggesting various interpretations of the scenario involving independent random samples and the relationship between sample means.
  • There is a mention of using concentration inequalities as a potential method for estimating convergence without relying on variance information.
  • Discussion includes the idea that if both sample sizes are large enough, the CLT could imply that the sample means are close to each other under certain conditions.
  • A later reply introduces the concept of weak convergence and its implications for the behavior of sample means as sample sizes increase.

Areas of Agreement / Disagreement

Participants express differing views on the validity of the original derivation and the assumptions made. There is no consensus on the correct approach or resolution of the issues raised, indicating that multiple competing perspectives remain in the discussion.

Contextual Notes

Participants note limitations related to the assumptions about sample sizes, the distribution of sample means, and the appropriateness of using sample standard deviations in the context of the problem. The discussion reflects a range of mathematical considerations that remain unresolved.

fahraynk
I tried to derive an equation for one sample mean to converge to another sample mean within a 95% confidence interval, but I know I am wrong. Can someone tell me what I did wrong, and what is the correct formula?

Suppose:

##\hat{x_1},\hat{\sigma_1}## are the sample mean and sample standard deviation computed from ##N## samples,

##\hat{x_2},\hat{\sigma_2}## are the sample mean and sample standard deviation computed from ##n## samples, where ##n\leq N##,

##\mu,\delta## are the true mean and true standard deviation of the population.

If ##d(\hat{x_1},\hat{x_2})## is a euclidean distance function on the sample means, then:

$$
d(\hat{x_1},\hat{x_2})\leq d(\hat{x_1},\mu)+ d(\hat{x_2},\mu)\leq 4\frac{\sigma_1}{\sqrt{N}}+4\frac{\sigma_2}{\sqrt{n}}
$$

With ##95##% confidence, because ##\mu\in [\hat{x_1}-\frac{2\sigma_1}{\sqrt{N}},\hat{x_1}+\frac{2\sigma_1}{\sqrt{N}}]## with 95% confidence.

My first question is: what is the relationship between the sample standard deviation and the population standard deviation?

When I take many samples, the standard deviation of the samples changes very little, so I assume the relationship ##\sigma_1=\sigma_2=\delta## :

$$
d(\hat{x_1},\hat{x_2})\leq 4\left(\frac{\sigma_1}{\sqrt{N}}+\frac{\sigma_2}{\sqrt{n}}\right)=4\delta\frac{\sqrt{N}+\sqrt{n}}{\sqrt{N}\sqrt{n}}=4\delta\frac{\frac{\sqrt{N}}{\sqrt{n}}+1}{\sqrt{N}}\implies\\
\sqrt{N}\,d(\hat{x_1},\hat{x_2})-4\delta\leq\frac{4\delta\sqrt{N}}{\sqrt{n}}\implies\\
\sqrt{n}\leq\frac{4\delta\sqrt{N}}{\sqrt{N}\,d(\hat{x_1},\hat{x_2})-4\delta}
$$

But this can't be true, because if I choose ##d(\hat{x_1},\hat{x_2})=0## then ##\sqrt{n}\leq-\sqrt{N}##, but ##n## and ##N## must be positive.

What is wrong here?

Also, I am sure there must be a simple way to do this. What I really want to know is how to get ##n## as a function of ##d(\hat{x_1},\hat{x_2})## and ##\phi##, where ##\phi## is a confidence level, like 95% confidence.
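The triangle-inequality bound above can be sanity-checked numerically. This is a minimal Monte Carlo sketch, assuming normally distributed data; the sample sizes, mean, and standard deviation below are hypothetical values chosen for illustration, not from the post:

```python
import numpy as np

rng = np.random.default_rng(0)
N, n = 400, 100          # hypothetical sample sizes
mu, delta = 10.0, 2.0    # hypothetical true mean and true standard deviation

trials = 5000
s1 = rng.normal(mu, delta, size=(trials, N))
s2 = rng.normal(mu, delta, size=(trials, n))
d = np.abs(s1.mean(axis=1) - s2.mean(axis=1))
# the post's bound, with sample standard deviations plugged in for sigma_1, sigma_2
bound = (4 * s1.std(axis=1, ddof=1) / np.sqrt(N)
         + 4 * s2.std(axis=1, ddof=1) / np.sqrt(n))
coverage = (d <= bound).mean()
print(coverage)
```

In runs like this the empirical coverage comes out far above 95%, which suggests the factor of 4 (the full width of the ±2σ interval, rather than the half-width) makes the bound very conservative.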
 
Physics news on Phys.org
I don't like the look of this. As I understand your scenario, n is not a variable, and d is a random variable that you cannot "choose".
Mathematically, your problem comes in the last step. If we set d=0, the penultimate line is
-4δ ≤ 4δ√N/√n ⇒
√n ≤ 4δ√N/(-4δ)
You have to be careful with inequalities. It's not as simple as "swapping terms" in an equation. When you multiply or divide both sides of an inequality by a negative quantity (-4δ), the direction of the inequality is reversed. So it should be
√n ≥ 4δ√N/(-4δ)
And more generally, if √Nd - 4δ is negative, moving it to the bottom of the RHS reverses the direction of the inequality.
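The sign-flip rule can be checked with a minimal numeric example (all numbers below are hypothetical, chosen only to mirror the shape of the step in question):

```python
# If a <= b and c < 0, then a / c >= b / c: the inequality reverses.
a, b, c = -4.0, 8.0, -2.0
assert a <= b
assert a / c >= b / c   # 2.0 >= -4.0

# Mirroring the post's step: from -4*delta <= 4*delta*rootN/rootn with delta > 0,
# dividing both sides by the negative quantity -4*delta flips the direction.
delta, rootN, rootn = 2.0, 20.0, 10.0
lhs, rhs = -4 * delta, 4 * delta * rootN / rootn
assert lhs <= rhs
assert lhs / (-4 * delta) >= rhs / (-4 * delta)
```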
 
fahraynk said:
<Snip>
If ##d(\hat{x_1},\hat{x_2})## is a euclidean distance function on the sample means, then:

$$
d(\hat{x_1},\hat{x_2})\leq d(\hat{x_1},\mu)+ d(\hat{x_2},\mu)\leq 4\frac{\sigma_1}{\sqrt{N}}+4\frac{\sigma_2}{\sqrt{n}}
$$

With ##95##% confidence because: ##\mu\in [\hat{x_1}-\frac{2\sigma_1}{\sqrt{N}},\hat{x_1}+\frac{2\sigma_1}{\sqrt{N}}]## with 95% confidence <Snip>
What is wrong here?

Also, I am sure there must be a simple way to do this. What I really want to know is how to get ##n## as a function of ##d(\hat{x_1},\hat{x_2})## and ##\phi##, where ##\phi## is a confidence level, like 95% confidence.

Not sure, but maybe because ##\mu## is also in a similar interval about ##\hat{x_2} ##?
 
As an idea (which I have put off developing), maybe you can use correlation as an inner product (with some adjustments), then find the norm generated by the inner product and define a distance based on that norm. I will do it too... some day.
 
fahraynk said:
My first question is, What is the relationship between sample standard deviation and population standard deviation?

Some red flags here: I see you're dividing by ##n##, when in fact for the sample variance you'd divide by ##n-1##. A similar issue is that while you can get unbiased estimates of the variance, your standard deviation estimate will be biased (downward), since the square root is concave (Jensen's inequality).

Other issues, in addition to what was raised above: I don't see why your mean estimates have a normal distribution; this isn't stated anywhere. Sure, the CLT tells you that the normal approximation works for large enough ##n##, but I don't see the sufficiency of the size of ##n## addressed anywhere.

There's also a ruler problem, in that you're using estimates of the standard deviation to measure estimates of the mean; but how do you know your standard deviation (or variance) estimates are any good? There are a lot of issues lurking in here... I think this is why books on statistics are long.

- - - -
If I were trying to develop some kind of estimate from scratch, I'd probably start with some kind of bounded random variable and apply Chernoff bounds or concentration inequalities. That way you don't need variance information, only the mean. Once I had that down, if feeling adventurous, it could be applied to more general random variables (that still have their first two moments) with the help of the method of truncation.
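One concrete instance of this idea is Hoeffding's inequality for i.i.d. samples bounded in ##[a,b]##: ##P(|\bar X_n - \mu| \geq t) \leq 2e^{-2nt^2/(b-a)^2}##. Solving for ##n## gives a sample size guaranteeing a given accuracy and confidence with no variance information at all. A sketch (the function name and example numbers are my own, not from the thread):

```python
import math

def hoeffding_n(t, delta, a=0.0, b=1.0):
    """Smallest n such that P(|sample_mean - mu| >= t) <= delta for i.i.d.
    samples bounded in [a, b], via Hoeffding's inequality:
        P(|X_bar - mu| >= t) <= 2 * exp(-2 * n * t**2 / (b - a)**2).
    Setting the right-hand side equal to delta and solving for n gives
        n >= (b - a)**2 * ln(2 / delta) / (2 * t**2)."""
    return math.ceil((b - a) ** 2 * math.log(2.0 / delta) / (2.0 * t ** 2))

# e.g. 95% confidence that the sample mean is within 0.05 of the true mean,
# for a random variable bounded in [0, 1]
n = hoeffding_n(t=0.05, delta=0.05)
print(n)  # 738
```

Note how ##n## depends only on the range ##b-a##, the tolerance ##t##, and the confidence; no estimate of the variance enters.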
 
fahraynk said:
I tried to derive an equation for one sample mean to converge to another sample mean within a 95% confidence interval,

Your description doesn't define a specific mathematical problem.

You might intend asking about a scenario where independent random samples are taken of a random variable. After taking ##n## samples, the sample mean is the random variable ##\mu_n##. After taking ##k## more samples, the sample mean is the random variable ##\mu_{n+k}## where the first ##n## of those samples are the same as those used to compute ##\mu_n##.

Or you might intend to ask about the situation where ##\mu_n## and ##\mu_{n+k}## are computed from two groups of samples that need not have any common samples.

You might be taking ##n## and ##k## as given and asking for an interval length ##L## such that there is a ##0.95## probability that ##| \mu_n - \mu_{k+n}| < L/2##

Or you might be taking ##L## and ##n## as given and asking for value of ##k## such that there is a 0.95 probability that ##|\mu_n - \mu_{n+k}| < L/2##

Or you might have in mind some question involving the relationship of ##\mu_n## and ##\mu_{n+k}## with the mean ##\mu## of the random variable being sampled.
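Under the disjoint-groups interpretation above, with the population standard deviation ##\sigma## known and the normal approximation assumed, the interval length ##L## follows directly from the variance of a difference of independent sample means. A sketch (function name and example numbers are hypothetical):

```python
import math

def interval_length(n, m, sigma, z=1.96):
    """For two independent groups of sizes n and m drawn from a population
    with true standard deviation sigma, the difference of sample means is
    approximately N(0, sigma**2 * (1/n + 1/m)) by the CLT.  Returns L such
    that P(|mean_n - mean_m| < L/2) is approximately 0.95 for z = 1.96."""
    return 2.0 * z * sigma * math.sqrt(1.0 / n + 1.0 / m)

L = interval_length(n=100, m=400, sigma=2.0)
print(L)
```

Solving the same relation for ##n## given ##L## would answer the fourth question; the overlapping-samples interpretation needs a different variance, since the two means share the first ##n## draws.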
 
It seems that if ##N##, ##n## were both large enough, we could use the CLT to somehow argue they must be close to each other, under certain assumptions on sampling, as @Stephen Tashi described in his post.
 
Doesn't weak convergence allow us to say that ##\hat x_n## is Cauchy, so that for ##k, j > N##, ##|\hat x_k - \hat x_j| < \epsilon##?
 
