# g(E[X],E[Y]) or E[g(X,Y)]

1. Aug 28, 2012

### Apteronotus

Suppose X and Y are r.v.
Suppose also that we get N samples of a r.v. Z which depends on X and Y. That is, Z=g(X,Y).

Which is a better estimate of the true value of Z?

$Z=E[g(X,Y)]$
or
$Z=g(E[X],E[Y])$

2. Aug 28, 2012

### chiro

Hey Apteronotus.

You can't actually estimate Z since Z is a random variable and not a parameter: you need to be careful about using estimation in this context.

Z is a random variable, so if you wanted to get the population mean of Z then you calculate E[Z] = E[g(X,Y)].

Remember that estimation concerns something that is essentially fixed, like mu, sigma, lambda, and so on. In statistical theory we then derive, either exactly or approximately, the distribution of the estimator for that particular parameter.

3. Aug 30, 2012

### Apteronotus

Hi Chiro,

The situation is that my z is in fact fixed. Its value depends on two other variables x and y.
I have a model/function which calculates the true value of z. That is
z=g(x,y)

Now, the problem is that I have added noise to my x & y variables:
X=x+noise
Y=y+noise

Using these noisy inputs, I get a noisy output Z=g(X,Y)
Since the noise has zero mean
E[X]=x, and
E[Y]=y

I was wondering whether g(E[X],E[Y]) or E[g(X,Y)] would bring me closer to the actual value z?

4. Aug 30, 2012

### Stephen Tashi

I think that if you manage to state your question precisely, the answer will be E( g(X,Y)), but you haven't defined the meaning of "best" in your original post and to say a random result is "closer" to something has no specific meaning. A random variable has no deterministic "close-ness" to anything unless the "close-ness" is defined in statistical terms and there are different ways of doing that.

Perhaps you want to minimize the expected value of the square of the difference between an estimator of the mean value of g(x,y) and the actual mean value of g(x,y).

In estimation theory, there are "least squares" estimators, "maximum likelihood" estimators, "minimum variance" estimators, etc. Each is "best" according to a different criterion.

5. Aug 30, 2012

### Apteronotus

Stephen. Thank you for taking the time.

I'm hoping that the meaning of "best" becomes apparent from my second post.

I guess I would define it as follows:
Which quantity is smaller

$\left\{E[g(X,Y)]-g(x,y)\right\}^2$
or

$\left\{g(E[X],E[Y])-g(x,y)\right\}^2$

where
$X=x+noise \qquad \mbox{and} \qquad Y=y+noise$

6. Aug 30, 2012

### Apteronotus

Clearly the second quantity $\left\{g(E[X],E[Y])-g(x,y)\right\}^2=0$, as

$E[X]=E[x+noise]=x \qquad \mbox{and}\qquad E[Y]=E[y+noise]=y$

Since the first quantity satisfies $\left\{E[g(X,Y)]-g(x,y)\right\}^2\ge0$, I guess that answers my question: $g(E[X],E[Y])$ must be "better".
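
A second-order Taylor expansion of g about (x,y) gives a rough idea of how large the first quantity is. This is only a sketch: it assumes g is twice differentiable and that the noise has finite variance, and the symbols $\sigma_X^2$, $\sigma_Y^2$ (noise variances) and $\sigma_{XY}$ (noise covariance) are notation introduced here, not in the posts above. The first-order terms drop out because the noise has zero mean:

$E[g(X,Y)] \approx g(x,y) + \frac{1}{2}\left\{ g_{xx}(x,y)\,\sigma_X^2 + 2\,g_{xy}(x,y)\,\sigma_{XY} + g_{yy}(x,y)\,\sigma_Y^2 \right\}$

Under these assumptions the gap $E[g(X,Y)]-g(x,y)$ is approximately the curvature term in braces, so it is zero when g is linear in x and y and grows with the noise variance otherwise.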

7. Aug 30, 2012

### chiro

Have you tried considering estimation schemes that find the value of Z where the probability of Z is maximal (like they do to find point estimates for parameters using Maximum Likelihood Estimation)?

Another technique is to find the highest probability density (HPD) region for a given probability value: the smallest region of Z whose density integrates to that probability. For example, if p = 0.1, the HPD region is the smallest region containing probability 0.1, which is also where the density is highest.

So the above show two approaches: one is a point-estimate approach and the other is an interval/region approach.
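
The interval approach can be sketched from samples of Z alone. The function below is a minimal illustration, not a definitive implementation: the name `hpd_interval` is hypothetical, and it assumes the HPD region is a single interval (roughly, that Z has one mode), so it returns the shortest interval covering at least a fraction p of the sorted samples.

```python
import math


def hpd_interval(samples, p):
    """Shortest interval containing at least a fraction p of the samples.

    Assumes the HPD region is one contiguous interval (unimodal Z).
    """
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    data = sorted(samples)
    if not data:
        raise ValueError("samples must not be empty")
    n = len(data)
    k = max(1, math.ceil(p * n))  # number of points the interval must cover
    # Slide a window of k consecutive sorted points and keep the narrowest one.
    best = min(range(n - k + 1), key=lambda i: data[i + k - 1] - data[i])
    return data[best], data[best + k - 1]


if __name__ == "__main__":
    z_samples = [0, 1, 1.1, 1.2, 1.3, 5, 9, 10]  # made-up values for illustration
    print(hpd_interval(z_samples, 0.5))
```

For the p = 0.1 example, the call would be `hpd_interval(z_samples, 0.1)` on whatever samples of Z are available.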

8. Aug 30, 2012

### Stephen Tashi

You can answer those questions only if you know E(X), E(Y) and E(g(X,Y)). If you already know those quantities, what statistical problem are you trying to solve?

If lower case "x" and "y" denote random variables in those expressions, the expressions themselves take on random values, so you can't claim anything about which one of them is smaller.

In your next post, you seem to say E[X] = x. That would imply x is a constant. So what is "x"? Is it a constant or is it a random variable?

A typical scenario in statistics would be that we are trying to estimate E[ g(x,y)] from a sample. We define some function W of the sample data. This function is an "estimator". I think you want to ask which is the "best" estimator for E(g(x,y)). Is it the function defined by W1 = the mean value of g(xi, yi) taken over all data points (xi,yi)? Or is it the function defined by W2 = g( x_bar, y_bar) where x_bar and y_bar are the means of the samples x1,x2,... and y1,y2,... respectively?

One way to define a "best" estimator W(x1,x2,...,y1,y2,...) is to say that it minimizes the expected square error.

I.e. it minimizes E[ ( W(x1,x2,...,y1,y2,...) - E(g(x,y)) )^2 ]

Note that you have to have two expectations in this expression. If you leave off the one on the left, the expression is a random quantity which varies with the data.
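
The comparison between W1 and W2 can be tried numerically. The simulation below is a sketch under stated assumptions, none of which come from the thread: the model `g(x, y) = x**2 + y**2`, independent Gaussian noise, and the values of `x`, `y`, `sigma`, `n` and `trials` are all hypothetical choices for illustration. It estimates the expected square error of each estimator twice, once about E[g(X,Y)] and once about the noise-free value g(x,y).

```python
import random
import statistics


def g(x, y):
    # Hypothetical nonlinear model, chosen only for illustration.
    return x * x + y * y


def compare_estimators(x=1.0, y=2.0, sigma=0.5, n=50, trials=2000, seed=0):
    """Monte Carlo mean squared errors of W1 and W2 about two targets."""
    rng = random.Random(seed)
    true_value = g(x, y)
    # For this g and independent zero-mean Gaussian noise with standard
    # deviation sigma, E[g(X,Y)] = x^2 + y^2 + 2*sigma^2.
    mean_value = true_value + 2 * sigma ** 2
    errors = {"W1_vs_mean": [], "W2_vs_mean": [],
              "W1_vs_true": [], "W2_vs_true": []}
    for _ in range(trials):
        xs = [x + rng.gauss(0.0, sigma) for _ in range(n)]
        ys = [y + rng.gauss(0.0, sigma) for _ in range(n)]
        # W1: mean of g over the data points (xi, yi).
        w1 = statistics.fmean([g(a, b) for a, b in zip(xs, ys)])
        # W2: g evaluated at the sample means.
        w2 = g(statistics.fmean(xs), statistics.fmean(ys))
        errors["W1_vs_mean"].append((w1 - mean_value) ** 2)
        errors["W2_vs_mean"].append((w2 - mean_value) ** 2)
        errors["W1_vs_true"].append((w1 - true_value) ** 2)
        errors["W2_vs_true"].append((w2 - true_value) ** 2)
    return {name: statistics.fmean(vals) for name, vals in errors.items()}


if __name__ == "__main__":
    for name, mse in compare_estimators().items():
        print(f"{name}: {mse:.4f}")
```

For this particular g, W1 is unbiased for E[g(X,Y)], while the bias of W2 relative to g(x,y) is 2*sigma^2/n and shrinks as n grows. So W1 should come out ahead when the target is E[g(X,Y)] and W2 when the target is g(x,y), which is why the target has to be stated before asking which estimator is "best".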

I think the language you are using in your thoughts is failing to distinguish among the following different concepts.

1) The mean of a distribution
2) The mean of a sample that is drawn from that distribution
3) An estimator for the mean of the distribution

Similar distinctions hold for other statistical quantities, such as the standard deviation, the variance, the mode etc.