Conceptual Problems with Random Variables and Sample Theory

siddharth5129
Hi
I'm having a few conceptual difficulties with random variables and I was hoping someone could clear up a few things for me:

1) Firstly, what exactly do we mean when we say that two random variables X and Y are equal? I understand what identically distributed means, but my difficulty is with equality.
My professor says that equality of X and Y means that for every outcome ω in the sample space, X(ω) = Y(ω). Now, if these variables are continuously distributed, isn't it also true that P(X=Y) = 0 and that P(X and Y ∈ (a,b)) < 1 for (a,b) ⊂ ℝ? I don't see any inconsistency here, but it seems off. Is this really the definition of equality?

2) Also, I'm not entirely sure what it means to add two random variables. Can I go with the above and say that Z = X + Y if for every outcome ω of the sample space, Z(ω) = X(ω) + Y(ω)?

3) My final conceptual difficulty is with large sample theory. Why do we look at N observations in a population as N random variables in their own right and not as N instances of a single random variable (which is what they intuitively seem to be)? Is this just a convenient starting point or is there a solid rationale behind it? Surely the random variable is random and variable across the population studied, not for every individual. Does it make sense, for example, to talk about disease frequency being a random variable in the context of a single person?

I'd appreciate any sort of clarification. Thanks :)
 
siddharth5129 said:
Hi
I'm having a few conceptual difficulties with random variables and I was hoping someone could clear up a few things for me:

1) Firstly, what exactly do we mean when we say that two random variables X and Y are equal? I understand what identically distributed means, but my difficulty is with equality.
My professor says that equality of X and Y means that for every outcome ω in the sample space, X(ω) = Y(ω). Now, if these variables are continuously distributed, isn't it also true that P(X=Y) = 0 and that P(X and Y ∈ (a,b)) < 1 for (a,b) ⊂ ℝ? I don't see any inconsistency here, but it seems off. Is this really the definition of equality?
If the variables X and Y describe the same thing, they are equal. For example, X(w) might be a binary variable like "is male". X(w) is 1 if w is a male and 0 otherwise. Y equals X by the definition above if Y(w) also equals 1 if and only if w is a male, even if the description of Y might be different.

P(X=Y) would equal zero if, for example, the two are independent and continuously distributed. If X and Y are equal as random variables, then P(X=Y) = 1.
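To make the difference between "equal" and "identically distributed" concrete, here is a small sketch on a finite sample space. The die, the names OMEGA, X, Y, W and the helper functions are all assumptions made up for illustration; they are not from the original question.

```python
from fractions import Fraction

# Hypothetical sample space: one roll of a fair die, each outcome has probability 1/6.
OMEGA = [1, 2, 3, 4, 5, 6]
P_OUTCOME = Fraction(1, 6)

def X(w):
    """1 if the roll is even, 0 otherwise."""
    return 1 if w % 2 == 0 else 0

def Y(w):
    """Described differently from X, but the same function on OMEGA."""
    return (w + 1) % 2

def W(w):
    """1 if the roll is at most 3; same distribution as X, different function."""
    return 1 if w <= 3 else 0

def prob(event):
    """Probability of the set of outcomes w for which event(w) is true."""
    return sum(P_OUTCOME for w in OMEGA if event(w))

def distribution(rv):
    """Map each value of rv to its probability."""
    values = sorted({rv(w) for w in OMEGA})
    return {v: prob(lambda w, v=v: rv(w) == v) for v in values}

# Equality as random variables: the same value at every outcome.
equal_everywhere = all(X(w) == Y(w) for w in OMEGA)

# P(X=Y) read as the probability of the event {w : X(w) = Y(w)}.
p_x_eq_y = prob(lambda w: X(w) == Y(w))
p_x_eq_w = prob(lambda w: X(w) == W(w))

# X and W are identically distributed even though they are not equal.
same_distribution = distribution(X) == distribution(W)
```

In this sketch X and Y agree at every outcome, so P(X=Y) = 1, while X and W have the same distribution but agree only on the outcomes 2 and 5.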

2) Also, I'm not entirely sure what it means to add two random variables. Can I go with the above and say that Z = X + Y if for every outcome ω of the sample space, Z(ω) = X(ω) + Y(ω)?
That looks right.
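A minimal sketch of that pointwise definition, assuming a made-up sample space of two fair coin flips (the names OMEGA, X, Y, add and distribution are hypothetical). It also shows that the distribution of a sum depends on how the two variables behave jointly, not only on their individual distributions.

```python
from fractions import Fraction
from itertools import product

# Hypothetical sample space: two fair coin flips, each outcome is a pair (first, second).
OMEGA = list(product([0, 1], repeat=2))
P_OUTCOME = Fraction(1, len(OMEGA))

def X(w):
    """Result of the first flip."""
    return w[0]

def Y(w):
    """Result of the second flip."""
    return w[1]

def add(U, V):
    """Pointwise sum: (U + V)(w) = U(w) + V(w) for every outcome w."""
    return lambda w: U(w) + V(w)

def distribution(rv):
    """Map each value of rv to its probability."""
    dist = {}
    for w in OMEGA:
        dist[rv(w)] = dist.get(rv(w), 0) + P_OUTCOME
    return dist

Z = add(X, Y)  # number of heads in the two flips
D = add(X, X)  # twice the first flip

dist_z = distribution(Z)
dist_d = distribution(D)
```

X and Y are identically distributed here, yet X + Y takes the values 0, 1, 2 while X + X only takes 0 and 2.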
3) My final conceptual difficulty is with large sample theory. Why do we look at N observations in a population as N random variables in their own right and not as N instances of a single random variable (which is what they intuitively seem to be)? Is this just a convenient starting point or is there a solid rationale behind it? Surely the random variable is random and variable across the population studied, not for every individual. Does it make sense, for example, to talk about disease frequency being a random variable in the context of a single person?

I'd appreciate any sort of clarification. Thanks :)
You probably wouldn't discuss frequency of disease in a single person, but the probability of a random person having a disease seems fair. Then, if you select 10 random people, you have selected 10 random (binary) variables, each one being p% likely to have the disease. The frequency is the observed proportion, which you would use to make inferences about what the true probability is within the population at large.
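As a rough simulation of that picture, assuming independent draws and a made-up disease probability of 0.3 (TRUE_P, sample_population and observed_frequency are hypothetical names):

```python
import random

def sample_population(n, p, rng):
    """Return n realizations, one per independently selected person (1 = has the disease)."""
    return [1 if rng.random() < p else 0 for _ in range(n)]

def observed_frequency(sample):
    """Observed proportion of people in the sample who have the disease."""
    return sum(sample) / len(sample)

TRUE_P = 0.3             # assumed true probability in the population
rng = random.Random(0)   # fixed seed so the run is repeatable

small = sample_population(10, TRUE_P, rng)       # 10 people, 10 binary variables
large = sample_population(100_000, TRUE_P, rng)  # a much larger sample

print(observed_frequency(small))
print(observed_frequency(large))
```

Each entry of a sample is a realization of its own binary variable. The observed frequency is a function of all of them together, and with the larger sample it should land close to the assumed TRUE_P.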
 
siddharth5129 said:
isn't it also true that P(X=Y) = 0 and that P( X and Y ∈ (a,b)) < 1 for (a,b) ⊂ ℝ .

The notation P(X=Y) is ambiguous. It would be unusual to interpret it to mean "The probability that X(w) = Y(w) for each w in the sample space S". For that to make sense, you'd need to consider a different sample space than S. You'd be considering a sample space where the event (X=Y) is defined by "we pick two random variables at random and find they are equal as random variables". Using that interpretation, you can't say if P(X=Y) = 0 without more information.

You might be thinking of a situation where we are given that X=Y ( as random variables) and we sample two possibly different outcomes w1 and w2 from the space S and define the event "X=Y" to mean X(w1) = Y(w2).

In that case, even given that X and Y are continuous random variables, we can't conclude P(X=Y)=0 unless we have more information, for example information about the joint distribution of w1 and w2. Perhaps you are thinking of a special situation, such as letting w1 and w2 be two independent random samples (i.e. single numbers) taken from a normal distribution.
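A quick simulation of that special situation, as a sketch only. It assumes the sample space is the real line with a standard normal distribution and takes X and Y to be the same function, X(w) = Y(w) = w; these choices and the names used are assumptions for illustration.

```python
import random

def X(w):
    """Identity map on the sample space."""
    return w

def Y(w):
    """The same function as X, so X = Y as random variables."""
    return w

rng = random.Random(1)  # fixed seed so the run is repeatable
N = 10_000

same_outcome_matches = 0   # counts X(w1) == Y(w1): both evaluated at one outcome
independent_matches = 0    # counts X(w1) == Y(w2): two independent outcomes

for _ in range(N):
    w1 = rng.normalvariate(0.0, 1.0)
    w2 = rng.normalvariate(0.0, 1.0)
    same_outcome_matches += X(w1) == Y(w1)
    independent_matches += X(w1) == Y(w2)

print(same_outcome_matches / N, independent_matches / N)
```

Evaluated at the same outcome the two always agree. Evaluated at two independent draws, an exact match is an event of probability zero in the continuous model, and in floating point it is so unlikely that it is not expected to show up in a run of this size.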

To be clear, notation has to distinguish between "a random variable" and "a realization of a random variable", but it's common to be careless about notation and leave it to the reader to figure things out. For example, if X is a random variable then the notation "X=2" would literally say "X is the constant function X(w) = 2 for each outcome w". But what most people mean by "X=2" when used inside "P(X=2)" is "The set of all outcomes w such that X(w) = 2".
 