Undergrad Proving the product rule using probability

The discussion centers on a proof of the product rule using cumulative distribution functions (CDFs) and probability density functions (PDFs) of independent random variables. The PDF of the maximum of two independent variables is computed in two ways: directly, as ##f(x)G(x) + F(x)g(x)##, and as the derivative of the CDF ##F(x)G(x)##; equating the two gives the product rule. Participants ask whether the proof applies only to probabilistic functions and how the PDF can be derived without first using the CDF. The answer is to consider a small interval around the maximum value, which connects the proof to the standard intuitive argument for the product rule.
Office_Shredder
I thought this was kind of a cool proof of the product rule.

Let ##F(x)## and ##G(x)## be the cumulative distribution functions of independent random variables ##A## and ##B## respectively, with probability density functions ##f(x)=F'(x)## and ##g(x)=G'(x)##. Consider the random variable ##C=\max(A,B)##. Let ##H(x)## be the cumulative distribution function of ##C##, with pdf ##h(x)=H'(x)##. Then ##h(x) = f(x)G(x) + F(x)g(x)##: if ##C=x##, it's because either ##A=x## and ##B\leq x##, or ##B=x## and ##A\leq x##. On the other hand, ##C \leq x## exactly when ##A## and ##B## are both at most ##x##, and since ##A## and ##B## are independent, the probabilities just multiply. So the cdf is simply ##H(x) = F(x)G(x)##. Therefore ##\frac{d}{dx} \left(F(x)G(x)\right) = H'(x) = h(x) = F'(x)G(x) + F(x)G'(x)##.

That's pretty much it! I thought it was kind of neat and wanted to share it.
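
For anyone who wants to see the formula for ##h(x)## numerically, here is a minimal simulation sketch. The distributions are illustrative assumptions and not part of the proof: ##A## is exponential with rate 1 and ##B## is uniform on ##[0, 2]##, so ##F(x) = 1 - e^{-x}## and ##G(x) = x/2## for ##0 \le x \le 2##. The function names, window width, and sample size are likewise arbitrary choices.

```python
import math
import random


def pdf_of_max_formula(x):
    """h(x) = f(x) G(x) + F(x) g(x) for A ~ Exp(1), B ~ Uniform(0, 2).

    Only valid for 0 <= x <= 2, where G(x) = x / 2 and g(x) = 1 / 2.
    """
    F, f = 1.0 - math.exp(-x), math.exp(-x)
    G, g = x / 2.0, 0.5
    return f * G + F * g


def pdf_of_max_empirical(x, width=0.1, n=200_000, seed=0):
    """Estimate the density of C = max(A, B) at x from simulated samples.

    Counts the fraction of samples of C that land in a window of the given
    width centred on x, then divides by the width.
    """
    rng = random.Random(seed)
    lo, hi = x - width / 2, x + width / 2
    hits = 0
    for _ in range(n):
        c = max(rng.expovariate(1.0), rng.uniform(0.0, 2.0))
        if lo <= c < hi:
            hits += 1
    return hits / (n * width)


if __name__ == "__main__":
    x = 1.0
    print("formula  :", pdf_of_max_formula(x))
    print("simulated:", pdf_of_max_empirical(x))
```

With these particular choices the formula gives exactly ##h(1) = \tfrac12##, and the simulated value should agree up to sampling noise.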
 
Having done the proof via probability, does that limit its scope to only probabilistic functions?
 
The direct proof is simple enough: ##H(x)=P(C\le x)=P(A\le x, B\le x)=P(A\le x)P(B\le x)=F(x)G(x)##. The last step uses independence of ##A## and ##B##.
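
As a concrete instance (the choice of distribution is an illustrative assumption, not something from the thread), take ##A## and ##B## independent and uniform on ##[0,1]##, so that ##F(x) = G(x) = x## and ##f(x) = g(x) = 1## for ##0 \le x \le 1##:
$$H(x) = P(A \le x)\,P(B \le x) = x \cdot x = x^2, \qquad 0 \le x \le 1$$
The density of the maximum is then ##h(x) = 2x##, which agrees with ##f(x)G(x) + F(x)g(x) = 1 \cdot x + x \cdot 1##.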
 
I wondered how you deduced ##h(x)## without going via the CDF first? Usually, to derive the PDF of the ##\max## function, I would have thought you'd say, assuming independence, ##H(x) = P(C \leq x) = P(A \leq x) P(B \leq x) = F(x)G(x)## and then make use of the product rule to find ##h(x)##.

But for this proof we need to do it backwards, i.e. we need to already know what ##h(x)## is. So is there another way of obtaining ##h(x)##? Thanks, and sorry if I missed the point!
 
jedishrfu said:
Having done the proof via probability, does that limit its scope to only probabilistic functions?

Technically, here ##F## and ##G## are increasing functions taking values between 0 and 1. But the derivative scales under multiplication by a constant, is invariant under constant shifts, and is a purely local property of the function. So I think it's easy to extend the proof: given any ##F## and ##G##, you can shift and rescale them (maybe multiplying by -1), and then redefine them everywhere except near the point where you want to calculate the derivative, so that the functions match the requirements.
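
To make the shift-and-rescale step concrete, here is a sketch of the algebra. It assumes only that differentiation is linear. The constants ##a, b, c, d## and the functions ##\tilde F, \tilde G## are notation introduced here for illustration: suppose ##F = a\tilde F + b## and ##G = c\tilde G + d## near the point of interest, where ##\tilde F## and ##\tilde G## are CDFs for which the product rule has already been shown. Then
$$\begin{align*}
(FG)' &= \left(ac\,\tilde F \tilde G + ad\,\tilde F + bc\,\tilde G + bd\right)' \\
&= ac\left(\tilde F' \tilde G + \tilde F \tilde G'\right) + ad\,\tilde F' + bc\,\tilde G' \\
&= a\tilde F'\left(c\tilde G + d\right) + \left(a\tilde F + b\right)c\,\tilde G' \\
&= F'G + FG'
\end{align*}$$
so the product rule carries over from ##\tilde F, \tilde G## to ##F, G##.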

mathman said:
The direct proof is simple enough ##H(x)=P(C\le x)=P(A\le x, B\le x)=P(A\le x)P(B\le x)=F(x)G(x)## The last step uses independence of ##A## and ##B##.

Where do you prove the product rule for calculating derivatives here? I think I'm using the fact you posted here in my proof.

etotheipi said:
I wondered how you deduced ##h(x)## without going via the CDF first? Usually, to derive the PDF of the ##\max## function, I would have thought you'd say, assuming independence, ##H(x) = P(C \leq x) = P(A \leq x) P(B \leq x) = F(x)G(x)## and then make use of the product rule to find ##h(x)##.

I think the point here is basically that I calculate ##h## by being clever. The pdf can be interpreted as follows: in a small region of width ##\Delta x## around ##x##, the probability that ##A## is in that region is ##f(x)\Delta x##. Similarly for ##B## and ##g(x)##. Then if ##C## is in that ##\Delta x## region, the probability that it's because ##A## is in that region and ##B## is smaller is ##G(x)f(x) \Delta x##, with a similar formula for the other way around. Adding them up gives the probability that ##C## is in the region. There's a small issue where I have double counted some of the times where ##A## and ##B## are both in the region, but the chance of that is proportional to ##(\Delta x)^2##, so it goes to zero fast enough that it doesn't contribute to the probability density function.
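
Written out, the bookkeeping looks roughly like this. The interval ##I = (x, x+\Delta x]## is notation introduced here, and "##B## is smaller" is read as ##B \le x + \Delta x##, which is one way to make the double counting explicit. By inclusion-exclusion and independence,
$$\begin{align*}
P(C \in I) &= P(A \in I)\,G(x+\Delta x) + P(B \in I)\,F(x+\Delta x) - P(A \in I)\,P(B \in I) \\
&= \left(f(x)G(x) + F(x)g(x)\right)\Delta x + o(\Delta x)
\end{align*}$$
The subtracted term is the double-counted event that ##A## and ##B## are both in ##I##, and it is of order ##(\Delta x)^2##. Dividing by ##\Delta x## and letting ##\Delta x \to 0## leaves ##h(x) = f(x)G(x) + F(x)g(x)##.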
 
Office_Shredder said:
I think the point here is basically that I calculate ##h## by being clever. The pdf can be interpreted as follows: in a small region of width ##\Delta x## around ##x##, the probability that ##A## is in that region is ##f(x)\Delta x##. Similarly for ##B## and ##g(x)##. Then if ##C## is in that ##\Delta x## region, the probability that it's because ##A## is in that region and ##B## is smaller is ##G(x)f(x) \Delta x##, with a similar formula for the other way around. Adding them up gives the probability that ##C## is in the region. There's a small issue where I have double counted some of the times where ##A## and ##B## are both in the region, but the chance of that is proportional to ##(\Delta x)^2##, so it goes to zero fast enough that it doesn't contribute to the probability density function.

I think this is pretty similar to a standard intuitive argument for the product rule (not using probabilistic language). Consider a rectangle with side lengths ##f(x)## and ##g(x)##, so its area is ##f(x)g(x)##. Change ##x## by a small amount ##\Delta x## and apply the same reasoning.
 
Yep, totally agree with that.
 
Office_Shredder said:
I think the point here is basically that I calculate ##h## by being clever. The pdf can be interpreted as follows: in a small region of width ##\Delta x## around ##x##, the probability that ##A## is in that region is ##f(x)\Delta x##. Similarly for ##B## and ##g(x)##. Then if ##C## is in that ##\Delta x## region, the probability that it's because ##A## is in that region and ##B## is smaller is ##G(x)f(x) \Delta x##, with a similar formula for the other way around. Adding them up gives the probability that ##C## is in the region. There's a small issue where I have double counted some of the times where ##A## and ##B## are both in the region, but the chance of that is proportional to ##(\Delta x)^2##, so it goes to zero fast enough that it doesn't contribute to the probability density function.

Cool, thanks. That makes sense.

Infrared said:
I think this is pretty similar to a standard intuitive argument for the product rule (not using probabilistic language). Consider a rectangle with side lengths ##f(x)## and ##g(x)##, so its area is ##f(x)g(x)##. Change ##x## by a small amount ##\Delta x## and apply the same reasoning.

Do you mean like
$$\begin{align*}
\Delta(f(x)g(x)) &= f(x+\Delta x)g(x + \Delta x) - f(x)g(x) \\
&\approx (f(x) + \Delta x f'(x))(g(x) + \Delta x g'(x)) - f(x)g(x) \\
&\approx \Delta x \left(f'(x) g(x) + f(x) g'(x) \right)
\end{align*}$$
where we dropped the cross term in ##(\Delta x)^2##, so
$$\frac{d(f(x)g(x))}{dx} = \lim_{\Delta x \rightarrow 0}\frac{\Delta(f(x)g(x))}{\Delta x} = f'(x) g(x) + f(x) g'(x)$$
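
As a numerical sanity check on dropping the cross term, here is a minimal sketch that compares the difference quotient of ##f(x)g(x)## with ##f'(x)g(x) + f(x)g'(x)## for shrinking ##\Delta x##. The test functions (##\sin## and ##\exp##), the point ##x = 1##, and the helper name `product_rule_error` are arbitrary illustrative choices.

```python
import math


def product_rule_error(f, df, g, dg, x, dx):
    """Gap between the forward difference quotient of f*g and f'g + fg'."""
    quotient = (f(x + dx) * g(x + dx) - f(x) * g(x)) / dx
    return abs(quotient - (df(x) * g(x) + f(x) * dg(x)))


# f = sin with f' = cos, g = exp with g' = exp, evaluated at x = 1
# for step sizes 1e-1, 1e-2, ..., 1e-5.
errors = [
    product_rule_error(math.sin, math.cos, math.exp, math.exp, 1.0, 10.0 ** -k)
    for k in range(1, 6)
]

if __name__ == "__main__":
    for k, err in enumerate(errors, start=1):
        print(f"dx = 1e-{k}: error = {err:.3e}")
```

The gap should shrink roughly in proportion to ##\Delta x##, which is what you expect if the neglected terms in ##\Delta(f(x)g(x))## are of order ##(\Delta x)^2##.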
 