In summary, the conversation discusses a continuous random variable with probability density function f(x) = 2x. The domain is restricted to an interval [a, b] chosen so that the integral of f(x) from a to b equals 1. Setting a = 0 gives b = 1, but then f(1) = 2, which exceeds 1. The confusion is cleared up by realizing that f(x) is a density, not a probability: for a continuous variable, P(X = x) is always 0, and probabilities come only from integrating f over an interval. The integral of the function is exactly 1, even though the function itself is larger than 1 for some x values. An integrand can be large over a small range and still integrate to 1.
  • #1
NatFex
I have a Stats exam on Wednesday and while I thought I was quite well-versed, I've gone back over to the very basics only to find myself confused at what should be introductory.

Suppose I have a continuous random variable modeled by a probability density function: $$f(x)=2x$$ Obviously the domain (x) needs to be restricted between parameters ##a## and ##b## such that ##\int_a^b f(x) \, dx = 1##

Setting ##a=0##, working out ##b## should prove straightforward: $$\int_0^b 2x \, dx = 1$$ $$ \big[x^2\big] _0^b = 1$$ $$b^2 - 0 = 1$$ $$b=1$$ So, the function is now: $$f(x) =\left\{ \begin{array}{l}
2x , & 0 \leq x \leq 1\\ 0 , & \textrm{otherwise} \end{array} \right. $$

Except that the area under this function obviously exceeds 1. Even taking ##f(x)## at the boundary, ##f(1)## returns a value of 2. What am I doing wrong? Why is something so simple not adding up?

Thanks
 
  • #2
Where it is nonzero, the function bounds the triangle with vertices ##(0,0)##, ##(1,0)##, ##(1,2)##. Its area is ##\frac{1}{2} \cdot 1 \cdot 2 = 1##. Apart from being a funny distribution, everything looks OK.

Edit: If you want ##f(x) \leq 1##, you have to rescale it, e.g. ##f(x) = \frac{1}{2} x## with ##0 \leq x \leq 2##.
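
A quick numerical check may help here. The sketch below uses a plain midpoint Riemann sum (the helper name `integrate` and the step count are illustrative choices, not anything from the thread) to compare the area under each density with its largest value.

```python
def integrate(f, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def f_original(x):
    # Density from the first post: 2x on [0, 1]
    return 2.0 * x


def f_rescaled(x):
    # Rescaled density from the edit: x/2 on [0, 2]
    return 0.5 * x


area_original = integrate(f_original, 0.0, 1.0)
area_rescaled = integrate(f_rescaled, 0.0, 2.0)

# Both areas come out at 1 (up to rounding), while the peak values differ.
print("2x  on [0, 1]: area =", area_original, " peak =", f_original(1.0))
print("x/2 on [0, 2]: area =", area_rescaled, " peak =", f_rescaled(2.0))
```

Both functions are valid densities. Rescaling only changes how tall the curve is, not the total area.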
 
  • #3
fresh_42 said:
Where it is nonzero, the function bounds the triangle with vertices ##(0,0)##, ##(1,0)##, ##(1,2)##. Its area is ##\frac{1}{2} \cdot 1 \cdot 2 = 1##. Apart from being a funny distribution, everything looks OK.
But surely the probability that X takes a value of 1, ##f(1)## cannot equal 2? And yet I got the value of x=1 by integrating to ensure that the sum of probabilities never exceeds 1. That's where my confusion comes from.
 
  • #4
NatFex said:
But surely the probability that X takes a value of 1, ##f(1)## cannot equal 2? And yet I got the value of x=1 by integrating to ensure that the sum of probabilities never exceeds 1. That's where my confusion comes from.
Sorry, edited a moment too late.
 
  • #5
fresh_42 said:
Sorry, edited a moment too late.
I saw your edit; now the confusion arises when taking ##f(2)##, which equals 1. I suspect my understanding of the significance of the values that continuous distribution functions return is what's at stake here.

EDIT: Of course, the fallacy is that with a continuous variable, ##P(X=x)## in exact terms, e.g. ##P(X=2.0000...)## as opposed to ##P(1.95 \leq X \leq 2.05)##, always equals 0. Correct? Can't believe that flew by me when making the thread.
 
  • #6
NatFex said:
I saw your edit; now the confusion arises when taking ##f(x)=2##. I suspect my understanding of the significance of the values that continuous distribution functions return is what's at stake here.
How do you get ##f(x) = 2## with the renormed function? Your original function values cannot be interpreted as probabilities, maybe as an outcome of an experiment. Therefore you have to modify it.
 
  • #7
fresh_42 said:
How do you get ##f(x) = 2## with the renormed function? Your original function values cannot be interpreted as probabilities, maybe as an outcome of an experiment. Therefore you have to modify it.
Likewise, fixed my post a moment too late. Perfect, you confirmed what I thought was going on.
 
  • #8
NatFex said:
Likewise, fixed my post a moment too late. Perfect, you confirmed what I thought was going on.
Try not to make yourself nervous. Looking back over things you have already learned shortly before a test can sometimes do more harm than good. I remember an examination in a field I wasn't really solid in. I read a book and learned as much as I could a few weeks ahead. On the day of the examination my professor was about 2 or 3 hours late because of an unexpected funeral he had to attend. So I took the time and went through my condensed notes on the topic. The result was that in the end I only had these in my head, and the material around them was not as present to me as it should have been.
 
  • #9
NatFex said:
I saw your edit; now the confusion arises when taking ##f(2)##, which equals 1. I suspect my understanding of the significance of the values that continuous distribution functions return is what's at stake here.

EDIT: Of course, the fallacy is that with a continuous variable, ##P(X=x)## in exact terms, e.g. ##P(X=2.0000...)## as opposed to ##P(1.95 \leq X \leq 2.05)##, always equals 0. Correct? Can't believe that flew by me when making the thread.
Just to add a comment: the key point is that your f(x) is not a probability! It is a probability density. So it is simply meaningless to talk about f(x) as if it were a probability. The only meaningful quantity is the integral of f(x) between two values, and this is never larger than 1 even though f(x) itself may be larger than 1 for some x.
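
To make this concrete, here is a small sketch for the density from the first post, f(x) = 2x on [0, 1]. It relies on the antiderivative F(x) = x^2 worked out there; the function names `cdf` and `prob_between` and the choice of the point 0.9 are purely illustrative.

```python
def cdf(x):
    """CDF of the density f(x) = 2x on [0, 1]: F(x) = x**2, clamped outside [0, 1]."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    return x * x


def prob_between(lo, hi):
    """P(lo <= X <= hi), i.e. the integral of the density from lo to hi."""
    return cdf(hi) - cdf(lo)


# Shrinking intervals centered on x = 0.9, where the density is f(0.9) = 1.8 > 1.
center = 0.9
widths = [0.1, 0.01, 0.001, 0.0001]
probs = [prob_between(center - w / 2, center + w / 2) for w in widths]

for w, p in zip(widths, probs):
    # The probability shrinks toward 0 with the width; the ratio p / w stays near f(0.9).
    print(f"width={w:<7} P={p:.6f}  P/width={p / w:.4f}")
```

Every interval probability stays between 0 and 1 and goes to 0 as the interval shrinks to a point, while the ratio of probability to width stays close to the density value 1.8. That ratio is what f(x) measures: probability per unit length.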
 
  • #10
The integral of your function is exactly 1. That is the difference between the "sum" that you state in the title and an integral. An integrand, f(x), can be huge over a small range of x and still have an integral of 1. The integral of 1000 from 0 to 1/1000 is 1.
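
Written out, and generalized slightly (the height ##h## is a symbol introduced here for illustration, not something from the thread), a constant integrand of any height ##h > 0## on an interval of width ##1/h## has integral 1:

$$\int_0^{1/1000} 1000 \, dx = 1000 \cdot \frac{1}{1000} = 1, \qquad \int_0^{1/h} h \, dx = h \cdot \frac{1}{h} = 1$$

So the height of the integrand can be made as large as you like, as long as the width shrinks to compensate.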
 
  • #11
In a continuous distribution the density can take large values at a single point and it's still all good.

It's not like a discrete distribution - you have to look at the integral over a region and not the value at a particular real number.

If you integrate the density over any region you will always get a valid probability - even if the point value is a lot greater than 1.
 
  • #12
NatFex said:
Setting ##a=0##, working out ##b## should prove straightforward: $$\int_0^b 2x \, dx = 1$$ $$ \big[x^2\big] _0^b = 1$$ $$b^2 - 0 = 1$$

Yes, but you don't have to set ##a = 0## in order to define a probability density function of the form ##f(x) = 2x## on an interval. Let's set ##a = 1##. Then ##b^2 - 1 = 1##, so you get ##b = \sqrt{2}## and ##f(b) = 2 \sqrt{2}##, so ##f(b) > 1##, which illustrates that the value of a probability density function is not itself a probability.
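
The same calculation can be sketched for any starting point ##a \geq 0##, assuming the condition ##b^2 - a^2 = 1## from the integral above. The function name `upper_limit` and the sample values of `a` are illustrative only.

```python
import math


def upper_limit(a):
    """Solve b**2 - a**2 = 1 for b, so that 2x integrates to 1 on [a, b]."""
    if a < 0:
        # 2x is negative for x < 0, so it cannot serve as a density there.
        raise ValueError("a must be non-negative")
    return math.sqrt(1.0 + a * a)


for a in (0.0, 1.0, 3.0):
    b = upper_limit(a)
    total = b * b - a * a  # integral of 2x from a to b
    print(f"a={a}, b={b:.4f}, f(b)={2 * b:.4f}, integral={total:.4f}")
```

The further right the interval starts, the narrower it gets and the larger the density values become, yet the integral stays at 1.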

Consider a physical analogy:
A 1 meter rod of variable density whose total mass is 1 kg could have a point where the mass density is 10 kg/meter. A high density at one point in the rod does not contradict the statement that the total mass of the rod is just 1 kg.
 

1. What is the definition of a probability density function (PDF)?

A probability density function is a mathematical function that describes how probability is distributed over the values of a continuous random variable. Its value at a point is not itself a probability. It is typically represented graphically as a curve, with the area under the curve over a range representing the probability of the variable falling within that range.

2. Can the values of a probability density function be greater than 1?

Yes. A density value is a probability per unit length, not a probability, so it can exceed 1. What cannot exceed 1 is the integral of the density over any range, and the integral over the whole domain must equal exactly 1, because the total probability of all possible outcomes is 1.

3. What does it mean if the integral of a probability density function is greater than 1?

If the integral of a supposed probability density function over its whole domain is greater than 1, there is an error in the calculations or assumptions made, or the function is not a valid probability density, since the total probability must equal 1.

4. How can a function be adjusted so that its integral equals 1?

If a non-negative function has a finite, positive integral over its domain that differs from 1, it can be rescaled by dividing it by that integral. The rescaled function then integrates to 1 and is a valid probability density.

5. Are there any examples where a probability density function is greater than 1?

Yes. Any continuous quantity confined to a range shorter than 1 unit must have a density above 1 somewhere. For example, a uniform distribution on the interval [0, 0.5] has density 2 everywhere on that interval. The integral of the density over the whole interval is still exactly 1.
