Probability function for discrete functions

Discussion Overview

The discussion revolves around the probability function for discrete random variables and the derivation of the distribution function from the probability function. Participants explore the mathematical formulation of these concepts, particularly focusing on how to express the probability of a set in terms of the probabilities of individual outcomes.

Discussion Character

  • Technical explanation
  • Mathematical reasoning
  • Debate/contested

Main Points Raised

  • Some participants discuss the relationship between the probability function and the distribution function, noting that for discrete random variables, knowing the probability function suffices to determine the distribution function.
  • There is a query regarding the derivation of the formula for the probability of a set, specifically how to show that the probability of elements not in the countable set of positive probabilities is zero.
  • One participant suggests focusing on countable sets to simplify the problem and mentions the importance of the axioms of probability, particularly the relationship between a set and its complement.
  • Another participant emphasizes that the positive probability elements are countable and that the probability of a set can be computed by summing the probabilities of its elements that have positive probabilities.
  • Clarifications are sought regarding the notation used, particularly the meaning of the image of the function X and its implications for countability.
  • Some participants express uncertainty about the definitions and notation, leading to requests for elaboration on specific terms and concepts.
  • There is a discussion about the implications of countability on the summation of probabilities, with some participants asserting that if the number of points were not countable, it would complicate the summation process.

Areas of Agreement / Disagreement

Participants generally agree on the foundational aspects of probability functions for discrete variables, but there are multiple competing views and some confusion regarding notation and definitions. The discussion remains unresolved in terms of fully clarifying the derivation and implications of the probability function.

Contextual Notes

Some participants express uncertainty about the definitions and notation used in the discussion, particularly regarding the distribution function and the image of the function X. There is also a recognition that the summation of probabilities is only meaningful if the number of terms is countable.

member 587159
My textbook says that if ##X: \Omega \to \mathbb{R}## is a discrete random variable (i.e., it attains only countably many values), then it suffices to know the probability function ##p(x) = \mathbb{P}\{X =x\}## in order to know the distribution function ##\mathbb{P}_X: \mathcal{R} \to \mathbb{R}: A \mapsto \mathbb{P}\{X \in A\} = \mathbb{P}(X^{-1}A)##. Indeed, if ##S:= \{x : p(x) > 0\}##, then for ##A \in \mathcal{R}##, it follows that $$\mathbb{P}\{X \in A\} = \sum_{x \in S \cap A}p(x)$$

But how do they get this formula?

I tried the following:

$$\mathbb{P}\{X \in A\} = \mathbb{P}\left(X^{-1}\left(\bigcup_{a \in S\cap A}\{a\} \cup\bigcup_{a \in A\setminus S}\{a\}\right)\right) $$

$$= \mathbb{P}\left(X^{-1}\left(\bigcup_{a \in S \cap A}\{a\}\right)\right) + \mathbb{P}\left(X^{-1}\left(\bigcup_{a \in A\setminus S}\{a\}\right)\right) $$
$$=\sum_{a \in S \cap A}p(a) + \mathbb{P}\left(X^{-1}\left(\bigcup_{a \in A\setminus S}\{a\}\right)\right)$$

But how do I show that the probability on the right is zero? I can't use ## \sigma##-additivity on uncountable disjoint unions.

EDIT: ##\mathcal{R}## is the smallest ##\sigma##-algebra that contains the usual topology on the real numbers, i.e., the Borel ##\sigma##-algebra on the reals.
 
Math_QED said:
$$=\sum_{a \in S \cap A}p(a) + \mathbb{P}\left(X^{-1}\left(\bigcup_{a \in A\setminus S}\{a\}\right)\right)$$

But how do I show that the probability on the right is zero? I can't use ## \sigma##-additivity on uncountable disjoint unions.

I'll take a shot in the dark here-- I don't totally follow your notation or even what a "distribution function" is (generally means CDF but...). The problem should be easy if you can find a way to focus your attention on countable sets.

In any case, your problem reminded me of a quote from Kolmogorov that I like: "Behind every theorem lies an inequality."

- - - -

You have some special structure in that probabilities are always real non-negative and sum to one. The axioms immediately give that

##\Pr\{B\} + \Pr\{B^c\} = 1##

supposing you are allowed to use that, perhaps you can try re-running your argument over something complementary?

The goal would be to combine the two via linearity, show that your probabilities sum to one, and you end up with

##0\leq \mathbb{P}\left(X^{-1}\left(\bigcup_{a \in A\setminus S}\{a\}\right)\right) \leq 0##

(or equivalently: take advantage of positive definiteness after subtracting 1 from each side of your equation -- but the idea is to show that the thing in question is bounded above and below by zero, and hence is zero.)

- - - -
Hopefully this helps or at least provides some inspiration toward a way to get the result you want via an inequality.
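
One way to make the complement idea above concrete is sketched here. It assumes, as in the opening post, that the image ##X(\Omega)## is countable (and hence a Borel set); the shorthand ##N := \mathbb{R} \setminus S## is introduced only for this illustration.

$$1 = \mathbb{P}\{X \in X(\Omega)\} = \sum_{x \in X(\Omega)} p(x) = \sum_{x \in S} p(x) = \mathbb{P}\{X \in S\}$$

$$\mathbb{P}\{X \in N\} = 1 - \mathbb{P}\{X \in S\} = 0$$

$$0 \leq \mathbb{P}\{X \in A \setminus S\} \leq \mathbb{P}\{X \in N\} = 0$$

The first line uses ##\sigma##-additivity over the countable set ##X(\Omega)## and drops the terms with ##p(x) = 0##. The second line is the complement rule, and the third is monotonicity, since ##A \setminus S \subseteq N##.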
 
StoneTemplePython said:
I'll take a shot in the dark here-- I don't totally follow your notation or even what a "distribution function" is (generally means CDF but...). The problem should be easy if you can find a way to focus your attention on countable sets.

In any case, your problem reminded me of a quote from Kolmogorov that I like: "Behind every theorem lies an inequality."

- - - -

You have some special structure in that probabilities are always real non-negative and sum to one. The axioms immediately give that

##\Pr\{B\} + \Pr\{B^c\} = 1##

supposing you are allowed to use that, perhaps you can try re-running your argument over something complementary?

The goal would be to combine the two via linearity, show that your probabilities sum to one, and you end up with

##0\leq \mathbb{P}\left(X^{-1}\left(\bigcup_{a \in A\setminus S}\{a\}\right)\right) \leq 0##

(or equivalently: take advantage of positive definiteness after subtracting 1 from each side of your equation -- but the idea is to show that the thing in question is bounded above and below by zero, and hence is zero.)

- - - -
Hopefully this helps or at least provides some inspiration toward a way to get the result you want via an inequality.

Thanks. I defined the distribution function at the beginning of my post. I will take some time to digest your answer.
 
The statement looks obvious. Since the positive probability elements are countable (elements of S), to get P(A), add up all the probabilities of those elements in A which have positive probabilities.
 
mathman said:
The statement looks obvious. Since the positive probability elements are countable (elements of S), to get P(A), add up all the probabilities of those elements in A which have positive probabilities.

I don't quite understand what you mean. Can you elaborate?
 
Math_QED said:
I don't quite understand what you mean. Can you elaborate?
I am not sure what I need to say. S is a (countable) subset of R containing all the points with positive probability. A is a subset of R. The probability of A is the sum of the probabilities of all points of A which have positive probability. Since S consists of all points of positive probability, the intersection of S and A contains all the points of A needed to define the probability of A.
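
The description in words above can be checked on a small finite example. The following is a minimal sketch, not anything taken from the textbook: the dictionary `pmf`, the helper `prob_in`, and the choice of values are all hypothetical, and a set `A` is represented by a predicate on numbers. The value 7 is included as a point that X attains with probability 0, so it lies outside S and contributes nothing.

```python
from fractions import Fraction

# Hypothetical probability function p(x) = P{X = x} of a discrete X.
# The value 7 is attained with probability 0, so it is not in S.
pmf = {
    1: Fraction(1, 2),
    2: Fraction(1, 4),
    3: Fraction(1, 4),
    7: Fraction(0),
}

# S = {x : p(x) > 0}, the points of positive probability.
S = {x for x, p in pmf.items() if p > 0}


def prob_in(A):
    """Return P{X in A} as the sum of p(x) over x in S ∩ A.

    A is a predicate on real numbers standing in for a Borel set.
    """
    return sum((pmf[x] for x in S if A(x)), Fraction(0))


p_odd = prob_in(lambda x: x % 2 == 1)          # A = odd integers
p_interval = prob_in(lambda x: 1.5 < x <= 7)   # A = (1.5, 7]
p_all = prob_in(lambda x: True)                # A = R
```

Exact fractions are used instead of floats so that the sums can be compared without rounding error.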
 
mathman said:
I am not sure what I need to say. S is a (countable) subset of R containing all the points with positive probability. A is a subset of R. The probability of A is the sum of the probabilities of all points of A which have positive probability. Since S consists of all points of positive probability, the intersection of S and A contains all the points of A needed to define the probability of A.

I think I understand what you mean. Let me write it out:

$$P(X \in A) = P(X\in A\cap X(\Omega))$$
$$= \sum_{x \in A \cap X(\Omega)} P(X=x)$$
$$=\sum_{x\in A \cap S} P(X=x) + \sum_{x \in A \cap X(\Omega) \setminus S} P(X=x)$$

And the last sum is 0. Does this make sense? In the second equality I used that the image of X is at most countable and for the last equality that ##S \subseteq X(\Omega)##
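
For reference, the steps above can be justified roughly as follows. This is only a sketch, and it relies on the assumption that ##X(\Omega)## is countable. Since ##X## only takes values in ##X(\Omega)##, the preimage of ##A## splits into countably many disjoint pieces:

$$X^{-1}(A) = X^{-1}(A \cap X(\Omega)) = \bigcup_{x \in A \cap X(\Omega)} X^{-1}(\{x\})$$

Because the union is countable and disjoint, ##\sigma##-additivity applies:

$$\mathbb{P}(X^{-1}(A)) = \sum_{x \in A \cap X(\Omega)} \mathbb{P}(X^{-1}(\{x\})) = \sum_{x \in A \cap X(\Omega)} p(x)$$

Every ##x \in (A \cap X(\Omega)) \setminus S## has ##p(x) = 0## by the definition of ##S##, so only the terms with ##x \in A \cap S## remain.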
 
Your notation throws me. That's why I prefer words. What is ##X(\Omega)##? X does not have to be countable - only the subset of X consisting of points of positive probability.
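
A short sketch of why the set of points of positive probability is always countable, for any real random variable; the sets ##S_n## are notation introduced here only for illustration:

$$S = \bigcup_{n=1}^{\infty} S_n, \qquad S_n := \left\{x : p(x) > \tfrac{1}{n}\right\}$$

Each ##S_n## has fewer than ##n## elements: if it contained ##n## distinct points ##x_1, \dots, x_n##, then

$$\mathbb{P}\{X \in \{x_1, \dots, x_n\}\} = \sum_{i=1}^{n} p(x_i) > n \cdot \tfrac{1}{n} = 1,$$

which is impossible. So ##S## is a countable union of finite sets.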
 
mathman said:
Your notation throws me. That's why I prefer words. What is ##X(\Omega)##? X does not have to be countable - only the subset of X consisting of points of positive probability.

##X(\Omega) = Im(X)##, the image of the function ##X##, i.e., the set of all values that ##X## attains. This set is countable by assumption (the variable is discrete).
 
Math_QED said:
I think I understand what you mean. Let me write it out:

$$P(X \in A) = P(X\in A\cap X(\Omega))$$
$$= \sum_{x \in A \cap X(\Omega)} P(X=x)$$
$$=\sum_{x\in A \cap S} P(X=x) + \sum_{x \in A \cap X(\Omega) \setminus S} P(X=x)$$

And the last sum is 0. Does this make sense? In the second equality I used that the image of X is at most countable and for the last equality that ##S \subseteq X(\Omega)##
In your original definition of X, it appears that X takes only countably many values, so there are only countably many points (if any) in the second sum, each with P(X=x)=0.
 
mathman said:
In your original definition of X, it appears that X takes only countably many values, so there are only countably many points (if any) in the second sum, each with P(X=x)=0.

Every term in the last sum is zero, because the sum runs over elements not in S (so, by definition, points with probability 0).
 
Math_QED said:
Every term in the last sum is zero, because the sum runs over elements not in S (so, by definition, points with probability 0).
The point I was making is that if X took uncountably many values, the last term would be problematic. Summations have meaning only if the number of terms is countable.
 
