Use the definition of E[g(Y)] to derive E[Y^2]

In summary: using the definition of the expectation of a function of a random variable, E[Y^2] for Y ~ Bi(m,q) works out to m(m-1)q^2 + mq = mq(1 + (m-1)q). The derivation writes Y^2 as Y(Y-1) + Y, so that the factors of k cancel against the binomial coefficient and the remaining sum is a binomial expansion that adds up to 1.
  • #1
Let Y ~ Bi(m,q).

Use the definition of E[g(Y)] to derive E[Y^2].

Hint: Write Y^2 as Y(Y-1) + Y. You do not have to re-derive E[Y].

Not sure where to start with this; my initial reaction was to use the moment generating function?
  • #2
Please post such questions in the homework forum. I moved it for you.
  • #3
mathmajor23 said:
Let Y ~ Bi(m,q).

Use the definition of E[g(Y)] to derive E[Y^2].

Hint: Write Y^2 as Y(Y-1) + Y. You do not have to re-derive E[Y].

Not sure where to start with this; my initial reaction was to use the moment generating function?

Try it and see!

  • #4
mathmajor23 said:
my initial reaction was to use the moment generating function?

But the directions say to use the definition of the expectation of a function of a random variable.
  • #5
Stephen Tashi said:
But the directions say to use the definition of the expectation of a function of a random variable.

A moment-generating function IS the expectation of a function of the RV; it just happens to be a particularly useful function.

  • #6
Ray Vickson said:
A moment-generating function IS the expectation of a function of the RV; it just happens to be a particularly useful function.


Yes, it "is", in the same sense that the derivative "is" a limit. The definition of a moment generating function is an example of an expectation of a function of a random variable; it isn't logically equivalent to the definition of the expectation of a function of a random variable.

The meaning of the direction to "use the definition of E(g(y))" would be clearer if mathmajor23 explained how "E(g(y))" has been used in his course materials.
  • #7
By definition, E[g(Y)] = the summation over all values of y of [g(y)p(y)]
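As a minimal sketch of that definition for the binomial case (the helper names `binom_pmf` and `expect` are made up here for illustration and are not from the course materials), the sum can be evaluated exactly with rational arithmetic:

```python
from fractions import Fraction
from math import comb


def binom_pmf(k, m, q):
    """p(k) for Y ~ Bi(m, q): C(m, k) * q^k * (1 - q)^(m - k)."""
    return comb(m, k) * q**k * (1 - q) ** (m - k)


def expect(g, m, q):
    """E[g(Y)] by definition: the sum of g(k) * p(k) over k = 0..m."""
    return sum(g(k) * binom_pmf(k, m, q) for k in range(m + 1))


# Illustrative parameter values, chosen arbitrarily.
m, q = 5, Fraction(1, 3)
e_y = expect(lambda k: k, m, q)        # E[Y]
e_y2 = expect(lambda k: k * k, m, q)   # E[Y^2], the quantity asked for
print(e_y, e_y2)
```

This only evaluates the sum for particular m and q; it does not replace the algebraic derivation the problem asks for, but it gives numbers to check a candidate formula against.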
  • #8
I speculate that the problem wants you to show your virtuosity with sums and binomial coefficients. (That's my attempt to mind-read what its author wants. If this problem is from the chapter on moment generating functions, it might want you to use them, or it might want you to do things "the hard way" so you'll appreciate moment generating functions!)

If we begin from the definition:

[itex] E(Y^2) = \sum_{k = 0}^m \ k^2 \ \binom{m}{k} q^k (1-q)^{m-k} [/itex]

and use the hint:

[itex] = \sum_{k=0}^m \ (k(k-1) + k) \ \binom{m}{k} q^k (1-q)^{m-k} [/itex]

[itex] = \sum_{k=0}^m \ k(k-1) \binom{m}{k} q^k (1-q)^{m-k} + \sum_{k=0}^m \ k \binom{m}{k} q^k (1-q)^{m-k} [/itex]

[itex] = \sum_{k=0}^m \ k(k-1) \ \frac{m!}{(m-k)! k!} q^k (1-q)^{m-k} + mq [/itex]

Does canceling the factors of [itex] k [/itex] let us rewrite the factors [itex] k(k-1) \binom{m}{k} q^k (1-q)^{m-k} [/itex] as [itex] q (1-q) m(k-1) \binom{m-1}{k-1} q^{k-1} (1-q)^{m-1-k}[/itex] ?

If so, the summation begins to look like a linear expression in the expectation of a binomial with parameters [itex] m-1, q [/itex]. (Edit: This morning, I think that we should factor out only the [itex] q [/itex], not the [itex](1-q) [/itex], to get a Bi[itex](m-1,q) [/itex].) It probably won't be that simple, since you can't "cancel" out [itex] k [/itex] when [itex] k = 0 [/itex], but since that term is 0 you can leave it out of the summation. We'll have to pay attention to what the indexes of summation are. We might get an expression that involves the mean of Bi[itex](m-1,q) [/itex] minus a few terms.

I began looking at this problem before it got moved to the homework section. I suppose now there's an excuse to leave it to you to see if that idea works.
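
For reference, here is one way the cancellation can be carried through. This is only a sketch: it cancels both factors [itex] k [/itex] and [itex] k-1 [/itex] at once rather than going through the mean of Bi[itex](m-1,q) [/itex], and the shifted index [itex] j = k-2 [/itex] is introduced here for illustration. The terms with [itex] k = 0 [/itex] and [itex] k = 1 [/itex] are zero, and for [itex] k \ge 2 [/itex]

[itex] k(k-1) \binom{m}{k} = \frac{m!}{(m-k)! \, (k-2)!} = m(m-1) \binom{m-2}{k-2} [/itex]

so that

[itex] \sum_{k=0}^m \ k(k-1) \binom{m}{k} q^k (1-q)^{m-k} = m(m-1) q^2 \sum_{j=0}^{m-2} \binom{m-2}{j} q^j (1-q)^{m-2-j} = m(m-1) q^2 [/itex]

because the last sum adds up all the probabilities of a Bi[itex](m-2,q) [/itex]. Adding back the [itex] mq [/itex] gives

[itex] E(Y^2) = m(m-1) q^2 + mq [/itex]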
  • #9
You're right! My professor "loves" MGFs and says she will make us appreciate them, and this is like the third proof we have had to do with these nasty summations. I follow you all the way up to the point where you try canceling the factors of k; that's where I am getting confused. I will try rewriting the factors as you recommend and see where I go from there.
  • #10
What methods are you permitted to use? If I were not allowed to use an MGF I would, instead, assume a form like E[N^2] = a*n^2 + b*n + c (writing N ~ Bi(n,p), so n and p play the roles of m and q), then find a, b, c by looking at explicit values of E[N^2] for small values of n, such as n = 1, 2, 3. Then I would try to _prove_ correctness of the result, by induction, for example. As regards induction: if we let b[n,k] = C(n,k)*p^k*(1-p)^(n-k), then we have k*b[n,k] = n*p*b[n-1,k-1], etc.
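
A rough sketch of that guess-then-prove route, assuming exact rational arithmetic and one fixed illustrative value of p (the function names are hypothetical): compute E[N^2] from the definition for n = 1, 2, 3, solve for a, b, c by finite differences, and check the fitted form against larger n. A check like this only suggests the formula; the induction proof is still needed.

```python
from fractions import Fraction
from math import comb


def second_moment(n, p):
    """E[N^2] for N ~ Bi(n, p), summed directly from the definition."""
    return sum(k * k * comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1))


def fit_quadratic(p):
    """Solve a*n^2 + b*n + c = E[N^2] using the values at n = 1, 2, 3."""
    e1, e2, e3 = (second_moment(n, p) for n in (1, 2, 3))
    a = (e3 - 2 * e2 + e1) / 2   # the second finite difference equals 2a
    b = (e2 - e1) - 3 * a        # the first difference from n=1 to n=2 equals 3a + b
    c = e1 - a - b               # the value at n=1 equals a + b + c
    return a, b, c


p = Fraction(1, 4)               # arbitrary illustrative value
a, b, c = fit_quadratic(p)
print(a, b, c)
# The fitted form should keep matching beyond the three fitting points.
print(all(a * n * n + b * n + c == second_moment(n, p) for n in range(4, 10)))
```

The fit comes out as a = p^2, b = p(1-p), c = 0, which is the same as n(n-1)p^2 + np.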

  • #11
My professor wants us to use just definitions and summations.

As to the last step, [itex] \sum_{k=0}^m \ k(k-1) \ \frac{m!}{(m-k)! k!} q^k (1-q)^{m-k} + mq [/itex], the next thing to do is to cancel out the k's.

Edit: I can't seem to get the summation symbol to agree with me on here :p
  • #12
mathmajor23 said:
My professor wants us to use just definitions and summations.

As to the last step, [itex] \sum_{k=0}^m \ k(k-1) \ \frac{m!}{(m-k)! k!} q^k (1-q)^{m-k} + mq [/itex], the next thing to do is to cancel out the k's.

Edit: I can't seem to get the summation symbol to agree with me on here :p

What does it mean to use "just summations"? Are you not allowed to use modern summation tools? Are you not allowed to recognize that k*b[n,k] is the same as n*p*b[n-1,k-1]? If you are not allowed to use *anything*, what on Earth are you left with?

  • #13
You can derive E(Y^2) easily:

Let f(x) = C(m,x)*q^x*(1-q)^(m-x) be the pmf, so
E(X^2) = Σ_{x=0..m} C(m,x)*x^2*q^x*(1-q)^(m-x)

The x = 0 term is zero, and for x >= 1 we have C(m,x) = (m/x)*C(m-1,x-1). So this is equivalent to
Σ_{x=1..m} C(m-1,x-1)*(m/x)*q^x*(1-q)^(m-x)*x^2

Let y = x-1, so x = y+1. This equals
mq*Σ_{y=0..m-1} C(m-1,y)*q^y*(1-q)^(m-1-y)*(y+1)

Expand the factor (y+1):

mq*Σ_{y=0..m-1} C(m-1,y)*q^y*(1-q)^(m-1-y)*y + mq*Σ_{y=0..m-1} C(m-1,y)*q^y*(1-q)^(m-1-y)*1

The term on the right is simply mq, because the sum itself is the total probability of a Bi(m-1,q), which equals 1.

The term on the left can be solved easily: its sum is the mean of a Bi(m-1,q), which is (m-1)q. (If you want to derive that from scratch, repeat the same trick with z = y-1.) So

mq*Σ_{y=0..m-1} C(m-1,y)*q^y*(1-q)^(m-1-y)*y = (m-1)*m*q^2

So we have

E(X^2) = Σ_{x=0..m} C(m,x)*x^2*q^x*(1-q)^(m-x) = (m-1)*m*q^2 + mq
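
As a sanity check on the steps in this post (a sketch only; `pmf` and `moment` are illustrative names and the parameter values are arbitrary), the reduction E(X^2) = mq*E(Y+1) with Y ~ Bi(m-1,q) and the final closed form can both be compared against the direct sum:

```python
from fractions import Fraction
from math import comb


def pmf(x, m, q):
    """Binomial probability C(m, x) * q^x * (1 - q)^(m - x)."""
    return comb(m, x) * q**x * (1 - q) ** (m - x)


def moment(g, m, q):
    """Sum of g(x) * pmf(x) over x = 0..m."""
    return sum(g(x) * pmf(x, m, q) for x in range(m + 1))


m, q = 7, Fraction(2, 5)
direct = moment(lambda x: x * x, m, q)                # E(X^2) from the definition
shifted = m * q * moment(lambda y: y + 1, m - 1, q)   # after substituting y = x - 1
closed = (m - 1) * m * q**2 + m * q                   # final result of the post
print(direct == shifted == closed)
```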
  • #14
Applejacks01 said:
You can derive E(Y^2) easily: [...]

That is more-or-less what I suggested to him a few postings back, but he seemed to think he was not allowed to use such results.


FAQ: Use the definition of E[g(Y)] to derive E[Y^2]

1. What is the definition of E[g(Y)]?

E[g(Y)] is the expected value of the function g(Y), where Y is a random variable. For a discrete Y it is the sum, over every possible value y of Y, of g(y) multiplied by the probability p(y) that Y = y.

2. How do you calculate E[g(Y)]?

To calculate E[g(Y)], first determine all possible values y of Y and their probabilities p(y). Then apply g to each value, multiply g(y) by p(y), and sum the results. Note that it is g(y), not y itself, that gets multiplied by the probability.

3. What is the importance of using the definition of E[g(Y)]?

The definition of E[g(Y)] lets us calculate the expected value of a function of a random variable directly from the distribution of Y, without first finding the distribution of g(Y). Moments such as E[Y^2] are obtained this way, and from them quantities like the variance, Var(Y) = E[Y^2] - (E[Y])^2.

4. How is E[Y^2] derived from the definition of E[g(Y)]?

E[Y^2] is derived from the definition of E[g(Y)] by taking g(y) = y^2, so E[Y^2] is the sum of y^2 * p(y) over all possible values y. For Y ~ Bi(m,q), writing y^2 as y(y-1) + y and simplifying the sums gives E[Y^2] = m(m-1)q^2 + mq.

5. Can the definition of E[g(Y)] be applied to any function of a random variable?

Yes, the definition of E[g(Y)] can be applied to any function of a random variable, provided the function is well-defined on the values of Y and the sum of |g(y)| * p(y) converges. For a binomial Y there are only finitely many values, so the expectation always exists. In general, Y having a finite expected value is not enough to guarantee that E[g(Y)] is finite.
