Hoeffding inequality for the difference of two sample means?

Discussion Overview

The discussion revolves around the Hoeffding inequality and its application to the difference of two sample means. Participants explore the derivation of a corollary from the original inequality, focusing on the mathematical implications and conditions involved in the bounding of the difference between sample means.

Discussion Character

  • Technical explanation
  • Mathematical reasoning
  • Debate/contested

Main Points Raised

  • Jan presents the original Hoeffding inequality and its corollary for the difference of two sample means, seeking clarification on how the corollary follows from the original inequality.
  • Chiro suggests using a new variable Z = X + Y to approach the problem, though the specifics of this approach are not fully detailed.
  • Jan questions the introduction of the term (m^{-1} + n^{-1}) in the bound, expressing confusion about its relevance given that z = \bar{x} - \bar{y} is bounded between [0,1].
  • Jan further speculates that the boundedness refers to almost sure boundedness, proposing that z is bounded a.s. within a specific range involving the means and the variances.

Areas of Agreement / Disagreement

Participants express differing views on the implications of the term (m^{-1} + n^{-1}) and its role in the bounding of the difference of sample means. The discussion remains unresolved regarding the exact derivation and interpretation of the corollary.

Contextual Notes

Participants note the importance of variances in understanding the bounds, but the discussion does not resolve the mathematical steps or assumptions necessary to clarify the relationship between the original inequality and its corollary.

JanO
Messages
3
Reaction score
0
In W. Hoeffding's 1963 paper* he gives the well known inequality:

[itex]P(\bar{x}-\mathrm{E}[x_i] \geq t) \leq \exp(-2t^2n) \ \ \ \ \ \ (1)[/itex],

where [itex]\bar{x} = \frac{1}{n}\sum_{i=1}^nx_i[/itex], [itex]x_i\in[0,1][/itex]. [itex]x_i[/itex]'s are independent.

Following this theorem he gives a corollary for the difference of two sample means as:

[itex]P(\bar{x}-\bar{y}-(\mathrm{E}[x_i] - \mathrm{E}[y_k]) \geq t) \leq \exp(\frac{-2t^2}{m^{-1}+n^{-1}}) \ \ \ \ \ \ (2)[/itex],

where [itex]\bar{x} = \frac{1}{n}\sum_{i=1}^nx_i[/itex], [itex]\bar{y} = \frac{1}{m}\sum_{k=1}^my_k[/itex], [itex]x_i,y_k\in[0,1][/itex]. [itex]x_i[/itex]'s and [itex]y_k[/itex]'s are independent.


My question is: How does (2) follow from (1)?

-Jan

*http://www.csee.umbc.edu/~lomonaco/f08/643/hwk643/Hoeffding.pdf (equations (2.6) and (2.7))
 
Hey JanO and welcome to the forums.

One idea I have is to let Z = X + Y and use Z instead of X in the definition.
 
Thanks Chiro for your response.

However, I still do not understand how the term [itex](m^{-1} + n^{-1})[/itex] enters the bound. Isn't [itex]z=\bar{x}-\bar{y}[/itex] still bounded in [0,1]?

-Jan
 
JanO said:
Thanks Chiro for your response.

However, I still do not understand how the term [itex](m^{-1} + n^{-1})[/itex] enters the bound. Isn't [itex]z=\bar{x}-\bar{y}[/itex] still bounded in [0,1]?

-Jan

Think about what happens to the variances.
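
One way to unpack this hint (a sketch, assuming Hoeffding's general theorem for sums of independent bounded variables, Theorem 2 in the linked paper: if [itex]S=\sum_i X_i[/itex] with independent [itex]a_i \leq X_i \leq b_i[/itex], then [itex]P(S-\mathrm{E}[S] \geq t) \leq \exp\left(\frac{-2t^2}{\sum_i (b_i-a_i)^2}\right)[/itex]): write the difference of sample means as a single sum of [itex]n+m[/itex] independent terms,

[itex]\bar{x}-\bar{y} = \sum_{i=1}^n \frac{x_i}{n} + \sum_{k=1}^m \frac{-y_k}{m}[/itex],

where each [itex]x_i/n[/itex] has range of width [itex]1/n[/itex] and each [itex]-y_k/m[/itex] has range of width [itex]1/m[/itex], so

[itex]\sum_i (b_i-a_i)^2 = n \cdot n^{-2} + m \cdot m^{-2} = n^{-1}+m^{-1}[/itex],

which is exactly the denominator in (2). The point is that the relevant quantity is not the range of [itex]\bar{x}-\bar{y}[/itex] itself but the sum of squared ranges of the individual terms, which shrinks as the sample sizes grow.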
 
It seems that "bounded" here means almost surely bounded; at least that is how the Hoeffding inequality seems to be stated elsewhere. I guess it then means that [itex]z=\bar{x}-\bar{y}[/itex] is bounded a.s. within [itex][\mu_x-\mu_y-\frac{1}{2}\sqrt{m^{-1}+n^{-1}}, \ \mu_x-\mu_y+\frac{1}{2}\sqrt{m^{-1}+n^{-1}}][/itex]?

Thanks again for your help!
 
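As an editorial sanity check (not part of the original thread), inequality (2) can be probed numerically. The sketch below uses Bernoulli(1/2) samples, so that [itex]\mathrm{E}[x_i]-\mathrm{E}[y_k]=0[/itex], and compares a Monte Carlo estimate of the tail probability against the bound; the function names are illustrative, not from the thread or the paper.

```python
import math
import random

def hoeffding_diff_bound(t, n, m):
    """Right-hand side of inequality (2): exp(-2 t^2 / (1/n + 1/m))."""
    return math.exp(-2 * t**2 / (1 / n + 1 / m))

def empirical_tail(t, n, m, trials=20000, seed=0):
    """Monte Carlo estimate of P(xbar - ybar >= t) for Bernoulli(1/2)
    samples, where E[x] - E[y] = 0 so the centering term vanishes."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        xbar = sum(rng.random() < 0.5 for _ in range(n)) / n
        ybar = sum(rng.random() < 0.5 for _ in range(m)) / m
        if xbar - ybar >= t:
            hits += 1
    return hits / trials

n, m, t = 50, 80, 0.15
# The empirical tail should sit below the Hoeffding bound (the bound
# is loose for Bernoulli data, so the gap can be large).
print("empirical:", empirical_tail(t, n, m))
print("bound:    ", hoeffding_diff_bound(t, n, m))
```

Since Hoeffding's bound holds for any distributions on [0,1], the printed empirical frequency should always come out below the bound, typically by a wide margin.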
