What is the relation between probability spaces and the binomial distribution?

Discussion Overview

The discussion explores the relationship between probability spaces and the binomial distribution, focusing on the definitions and properties of probability measures, random variables, and the distinctions between discrete and continuous cases. Participants examine theoretical aspects, mathematical definitions, and implications for different types of distributions.

Discussion Character

  • Technical explanation
  • Conceptual clarification
  • Debate/contested

Main Points Raised

  • One participant presents a structured approach to defining probability spaces related to the binomial distribution, outlining components such as sample spaces and probability measures.
  • Another participant confirms the correctness of the initial definitions and elaborates on the concept of measurability in random variables, noting that it is typically satisfied in practice.
  • A participant seeks clarification on the formal definition of discrete random variables, questioning whether it relates to countable sets or continuity, and asks about the implications for measures and distributions.
  • Another participant asserts that a random variable is discrete if its range is finite or countable, and discusses the relationship between discrete random variables and discrete distributions.
  • There is a discussion about the definitions of continuity in the context of probability, with references to cumulative distribution functions (cdf) and absolute continuity.
  • Participants explore whether the notions of continuity in probability coincide with traditional definitions when considering topological spaces, raising questions about the implications of these definitions.

Areas of Agreement / Disagreement

Participants express varying viewpoints on the definitions of discrete and continuous random variables, as well as the implications of these definitions for probability measures and distributions. The discussion remains unresolved regarding the nuances of continuity in different contexts.

Contextual Notes

There are limitations in the definitions and assumptions regarding continuity and discreteness, particularly in relation to the topology of the spaces involved. The discussion highlights the need for conventions in defining these terms in probability theory.

Rasalhague
Here, to further test my understanding, is an attempt to apply the measure-theoretic definition of a probability space to the binomial distribution. All comments welcome!


Let (R,D,O) be a probability space:

R = \left \{ 0,1 \right \}

D = 2^R

O:D\rightarrow[0,1] \; | \; O(\left \{ 1 \right \})=p


Let (S,E,P) be another probability space:

S = \left \{ 0,1 \right \}^n

E = 2^S

P:E\rightarrow[0,1] \; | \; P(\left \{ s \right \})=p^{\sum_{i=1}^{n}s_i}(1-p)^{n-\sum_{i=1}^{n}s_i}


Let (T,F,Q) be a third probability space:

T=\left \{ 0,1,...,n \right \}

F = 2^T

Q:F\rightarrow[0,1] \; | \; Q(\left \{ t \right \})= \binom{n}{t}p^t(1-p)^{n-t}


Let X be a random variable:

X:S\rightarrow T \; | \; X(s)=\sum_{i=1}^{n}s_i


Then the probability measure Q belongs to a class of (probability) distributions called binomial distributions. The sample space is S. The events are E. The probability measure is P. The observation space is T. The "observed events" are F. We can interpret p as the probability of success on one trial, n as the number of trials, and Q(\left \{ t \right \}) as the probability of exactly t successes in n trials.
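
As a sanity check on the construction above, here is a minimal Python sketch that builds S by brute force and pushes P forward through X. It assumes each outcome s with k ones is given the weight p^k (1-p)^(n-k), i.e. independent trials; the function names `pushforward_of_X` and `binomial_pmf` are made up for illustration.

```python
from itertools import product
from math import comb, isclose


def pushforward_of_X(n, p):
    """Return the measure induced on T = {0, ..., n} by X(s) = sum(s).

    Assumption: P({s}) = p**k * (1 - p)**(n - k), where k is the number
    of ones in s (independent trials with success probability p).
    """
    induced = {t: 0.0 for t in range(n + 1)}
    for s in product((0, 1), repeat=n):   # enumerate the sample space S
        k = sum(s)                        # X(s)
        induced[k] += p**k * (1 - p)**(n - k)
    return induced


def binomial_pmf(n, p, t):
    """Q({t}) as defined in the post."""
    return comb(n, t) * p**t * (1 - p)**(n - t)


n, p = 4, 0.3
Q = pushforward_of_X(n, p)
for t in range(n + 1):
    # The two columns should agree up to floating-point rounding.
    print(t, Q[t], binomial_pmf(n, p, t))
```

Under that assumption the induced measure agrees with Q on every singleton, which is the sense in which Q is "the distribution of X".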

The components of (R,D,O) have no special name, but if we define another random variable, W, such that W is the identity function on R, then O becomes the Bernoulli distribution. Equivalently, the Bernoulli distribution is a binomial distribution with n = 1.

Footnote: I think there's a more subtle variant of this idea, which I hope to get to eventually, where the observation space is taken to be the real numbers, and F the Borel algebra (smallest sigma algebra generated by the open sets), allowing one to use general formulas for defining expectation, and so forth, that apply both to continuous and discrete cases.
 
Yes, all you said is correct! :smile:

Maybe I should comment a bit on the general theory. In general, we'll indeed have a probability space (S,E,P). The random variables are functions X:S\rightarrow \mathbb{R}. This function can be discrete or continuous or it can be anything. We only want the function to be measurable, that is: X^{-1}(B)\in E for every Borel set B. Measurability is a technical condition that is almost always satisfied in practice.

Because of measurability, we can define a distribution by

P_X(B)=P(X^{-1}(B))

Now, the mean is defined in general as

\int{XdP}=\int{xdP_X}

So we are actually integrating with respect to a probability measure! If the measure is discrete, then the integral is a sum, if the measure is "continuous", then the integral is the integral like we all know it! :smile:
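
For the binomial case both sides of that identity are finite sums, so they can be compared directly. The following is only a sketch: it assumes the independent-trials weight p^k (1-p)^(n-k) for an outcome with k ones, and the variable names are illustrative.

```python
from itertools import product
from math import comb, isclose

n, p = 5, 0.4


def P_point(s):
    """Assumed weight of a single outcome s in S = {0, 1}^n."""
    k = sum(s)
    return p**k * (1 - p)**(n - k)


# Left-hand side: integrate X over the sample space S with respect to P.
mean_over_S = sum(sum(s) * P_point(s) for s in product((0, 1), repeat=n))

# Right-hand side: integrate x over T with respect to the distribution P_X.
mean_over_T = sum(t * comb(n, t) * p**t * (1 - p)**(n - t)
                  for t in range(n + 1))

# Both should equal n * p up to rounding.
print(mean_over_S, mean_over_T, n * p)
```

The first sum has 2^n terms and the second only n + 1, which is one practical reason to work with the distribution rather than the underlying space.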
 
Phew, that's a relief! Thanks for all the help and encouragement, micromass.

By the way, what exactly is the formal definition of discrete? Of a random variable to R, does it perhaps mean one whose range contains a (finite or infinitely) countable set of nonzero values, or is it defined in terms of continuity? Is there a definition of equal generality to the definition of a random variable itself, i.e. a definition which works for any random variable, no matter what its domain and codomain are? And how is discrete defined for a measure? Does a discrete random variable necessarily induce a discrete distribution, and is a discrete distribution necessarily induced by a discrete random variable?

I guess those scare quotes around "continuous" are because there isn't, in general, a topology defined on either of the sigma algebras, E or F. Is there a more technically correct way of expressing this difference between models that involve things like binomial distributions, and models that involve things like normal distributions?
 
Rasalhague said:
Phew, that's a relief! Thanks for all the help and encouragement, micromass.

By the way, what exactly is the formal definition of discrete? Of a random variable to R, does it perhaps mean one whose range contains a (finite or infinitely) countable set of nonzero values, or is it defined in terms of continuity? Is there a definition of equal generality to the definition of a random variable itself, i.e. a definition which works for any random variable, no matter what its domain and codomain are?

A random variable is discrete if its range is finite or countable. It's as easy as that.

And how is discrete defined for a measure?

It isn't. I should not have written that. I should have said "if X is discrete, then the integral is a sum. If X is "continuous", then it is an ordinary integral".

Does a discrete random variable necessarily induce a discrete distribution, and is a discrete distribution necessarily induced by a discrete random variable?

Yes. And I dare say that it is so by definition. But this depends on the definition.

I guess those scare quotes around "continuous" are because there isn't, in general, a topology defined on either of the sigma algebras, E or F. Is there a more technically correct way of expressing this difference between models that involve things like binomial distributions, and models that involve things like normal distributions?

There are different answers to this. One convention says that X is continuous if the cdf

F(x)=P\{X\leq x\}

is continuous.

A (stronger) convention says that X is continuous if the distribution P_X is an "absolutely continuous measure". All that means is that there exists an integrable function f (called the probability density function or pdf) such that

P_X(A)=\int_A{f(x)dx}

where the integral above is the ordinary integral that you're used to (in more technical terms: the integral with respect to Lebesgue measure).
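
To see the first convention at work, one can compare the size of the jump of the cdf at a point for a binomial variable and for a variable with a density. This is a rough numerical sketch: the helper names, the choice of the uniform distribution on [0, 1] as the comparison case, and the step `eps` are all assumptions made for illustration.

```python
from math import comb, isclose


def binomial_cdf(n, p, x):
    """F(x) = P{X <= x} for a Binomial(n, p) variable, x any real number."""
    return sum(comb(n, t) * p**t * (1 - p)**(n - t)
               for t in range(n + 1) if t <= x)


def uniform_cdf(x):
    """cdf of the uniform distribution on [0, 1], which has pdf f = 1 there."""
    return min(1.0, max(0.0, x))


def jump(F, x, eps=1e-9):
    """Approximate F(x) - F(x-), the jump of F at x, using a small step."""
    return F(x) - F(x - eps)


n, p = 4, 0.3
binom_jump = jump(lambda x: binomial_cdf(n, p, x), 2)
unif_jump = jump(uniform_cdf, 0.5)

# The binomial cdf jumps by P{X = 2} at x = 2; the uniform cdf does not jump.
print(binom_jump, comb(n, 2) * p**2 * (1 - p)**2)
print(unif_jump)
```

The jump of the binomial cdf at an integer t is exactly the point mass P{X = t}, so that cdf is not continuous, whereas a cdf given by integrating a pdf has no jumps.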
 
micromass said:
One convention says [...]

When S and T are topological spaces, do these notions of "continuity" coincide with the usual one? Is the need for a convention only because these are extensions of the usage of the word to cases where S and T are not both topological spaces, or is there ever a case where X can be continuous in the usual sense but not in the probability sense, or vice-versa?
 
Rasalhague said:
When S and T are topological spaces, do these notions of "continuity" coincide with the usual one? Is the need for a convention only because these are extensions of the usage of the word to cases where S and T are not both topological spaces, or is there ever a case where X can be continuous in the usual sense but not in the probability sense, or vice-versa?

No, saying that a random variable is continuous has nothing to do with the topology on S or T. I can find a lot of topologies on S, but that doesn't mean that X is necessarily continuous in the usual sense. The terminology of saying that X is continuous was historically introduced to mean that the cdf is continuous, and that's still the meaning of it today.
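
A simple case that separates the two senses, offered here as an illustration rather than something taken from the posts above, is a constant random variable X(s) = c. As a map it is continuous for any topologies on S and T, yet its cdf is a step function with a jump at c. The helper name `constant_rv_cdf` is hypothetical.

```python
def constant_rv_cdf(c):
    """Return the cdf F(x) = P{X <= x} of the constant random variable X = c."""
    return lambda x: 1.0 if x >= c else 0.0


F = constant_rv_cdf(2.0)
left = F(2.0 - 1e-12)   # value just to the left of c
at = F(2.0)             # value at c

# The cdf steps from 0 to 1 at c, so it is not continuous there.
print(left, at)  # prints "0.0 1.0"
```

So a map can be continuous in the topological sense while failing to be a "continuous random variable" in the cdf sense.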
 
