# Help on Covariance: Approximating $\Theta_1$ and $\Theta_2$

neznam
I need to find an approximation of the covariance of a function of a random variable.

$\Theta_1 = \log[p_1/(1-p_1)]$ where $p_1$ is binomial
$\Theta_2 = \log[p_2/(1-p_2)]$ where $p_2$ is binomial

I need to find the covariance of $\Theta_1$ and $\Theta_2$.

Please, any help will be greatly appreciated.

neznam said:
I need to find an approximation of the covariance of a function of a random variable.

$\Theta_1 = \log[p_1/(1-p_1)]$ where $p_1$ is binomial
$\Theta_2 = \log[p_2/(1-p_2)]$ where $p_2$ is binomial

I need to find the covariance of $\Theta_1$ and $\Theta_2$.

Please, any help will be greatly appreciated.

Without data, the covariance of two random variables can only be bounded, via the Cauchy-Schwarz inequality:

$$|Cov(X,Y)|\leq \sqrt{Var(X)Var(Y)}$$

Now what is the variance of X and Y if they have a binomial distribution?
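(For a Binomial(n, p) count, the answer is np(1-p).) As a numerical sanity check, here is a small sketch that draws two correlated binomial samples and confirms the Cauchy-Schwarz bound; the shared-component construction is purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two correlated Binomial(20, 0.3) samples, built from a shared component
# purely for illustration -- the bound holds for any pair of random variables.
n, p, size = 20, 0.3, 100_000
shared = rng.binomial(n // 2, p, size)
x = shared + rng.binomial(n // 2, p, size)
y = shared + rng.binomial(n // 2, p, size)

cov = np.cov(x, y)[0, 1]
bound = np.sqrt(x.var(ddof=1) * y.var(ddof=1))
print(abs(cov) <= bound)               # True: |Cov| <= sqrt(Var(X) Var(Y))
print(x.var(ddof=1), n * p * (1 - p))  # sample variance vs. the formula np(1-p)
```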


If the p's are independent, the covariance is 0, since the Θ's are independent.

What I am really looking for is an expression for the covariance when I have a function of a random variable instead of just the random variable itself. The function is the log function in the original question. I know there is such an expression for the variance of a function of a random variable, and the result involves the first derivative of that function, but I am confused about how that works for covariance.

Thanks

neznam said:
$\Theta_1 = \log[p_1/(1-p_1)]$ where $p_1$ is binomial
$\Theta_2 = \log[p_2/(1-p_2)]$ where $p_2$ is binomial

How are those formulas going to make sense if the p's are binomial random variables? They would take values in the set {0, 1, 2, ..., N}. Suppose p1 = 3. Wouldn't you be trying to take the log of a negative number?

What I am really looking for is an expression for the covariance when I have a function of a random variable instead of just the random variable itself.

My guess at what you're trying to say is this:

Let X and Y be random variables. Let f and g be functions of a single variable and let the random variables F and G be defined as F= f(X) and G = g(Y). What is a method for expressing COV(F,G) in terms of COV(X,Y) ?

An even more general question is:

Let X and Y be random variables. Let f and g be functions of two variables and let the random variables F and G be defined as F= f(X,Y) and G = g(X,Y). What is a method for expressing COV(F,G) in terms of COV(X,Y) ?

I don't claim to know the answer to that question. One thought is to use power series expansions.

Let $M = E( F^k G^j)$ be a moment of the joint distribution of $(F,G)$. Expand the function $F^k G^j$ as a power series in X and Y. Then the expectation $E( F^k G^j)$ becomes a sum of expectations of the form $E( C X^r Y^s)$, where $C$ is a constant that involves evaluations of partial derivatives of $f$ and $g$.

The particular case of COV(F,G) involves the particular moments E(F), E(G) and E(FG).
We can work out what the power series method says in that case. (Or if we are lucky, some other keen forum member will do it for us!)

Yes.
If X and Y are random variables and H(X) and G(Y) are functions of those random variables, then what is an expression for Cov(H(X), G(Y)) in terms of X and Y?

A similar expression for the variance is $Var[H(X)] \approx [H'(\mu_X)]^2 \, Var(X)$, where the derivative is evaluated at the mean.

Appreciate any help
Thank you :-)
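That variance approximation is the first-order delta method, and it can be checked by simulation. A minimal sketch, assuming H is the logit and X is the sample proportion from a Binomial(n, p) count; the clipping of exact 0s and 1s is an illustrative shortcut:

```python
import numpy as np

rng = np.random.default_rng(1)

# Check Var[H(X)] ~= [H'(mu)]^2 Var(X) with H = logit, X = sample proportion.
n, p = 200, 0.3
phat = rng.binomial(n, p, 500_000) / n              # simulated proportions
phat = np.clip(phat, 1 / (2 * n), 1 - 1 / (2 * n))  # keep the logit finite

theta = np.log(phat / (1 - phat))                   # H(X) = log odds
var_sim = theta.var()

# H'(p) = 1/(p(1-p)) and Var(phat) = p(1-p)/n, so the approximation
# collapses to 1 / (n p (1-p)).
var_delta = 1 / (n * p * (1 - p))
print(var_sim, var_delta)                           # close for moderate n
```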

Sorry, one more clarification: the p's are between 0 and 1, so the argument of the log will not be negative.

What you've said about the $\Theta_i$ still isn't consistent. Are the $p_i$ random variables, or are they parameters? If they are between 0 and 1, they can't be binomial random variables.

-------------------

Let's try to do the abstract problem : For random variables x and y and random variables defined by F = f(x) and G = g(y), give an approximation for Cov(F,G) in terms of statistics involving x and y.

( What's the world coming to when a person has to try to work out his own suggestions? We need someone skilled with a computer algebra program.)

See if this looks correct:

$$H(x,y) = F(x)G(y)$$

$$H(x,y) \approx H + \frac{\partial H}{\partial x}(x-\mu_x) + \frac{\partial H}{\partial y} (y - \mu_y) + \frac{\partial^2 H}{\partial x \partial y} (x-\mu_x)(y-\mu_y) + \frac{1}{2} \frac{\partial^2 H}{\partial x^2}(x-\mu_x)^2 + \frac{1}{2} \frac{\partial^2 H}{\partial y^2} (y-\mu_y)^2$$

Where $H$ and its derivatives are evaluated at the point $(\mu_x,\mu_y)$.

Take expectations with respect to the joint density of X and Y.

$$E(H) = H + 0 + 0 + Cov(x,y) \frac{\partial^2 H}{\partial x \partial y}+ \frac{\sigma^2_x}{2} \frac{\partial^2 H}{\partial x^2} + \frac{\sigma^2_y }{2}\frac{\partial^2H}{\partial y^2}$$

Also use the approximations:

$$F(x) \approx F + (F')(x - \mu_x) + \frac{ (F'')(x - \mu_x)^2} {2}$$
$$G(y) \approx G + (G')(y - \mu_y) + \frac{ (G'')(y - \mu_y)^2} {2}$$

where F and its derivatives are evaluated at $\mu_x$ and G and its derivatives are evaluated at $\mu_y$.

So

$$E(F(x)) \approx F + 0 + \frac{\sigma^2_x}{2}(F'')$$
$$E(G(y)) \approx G + 0 + \frac{\sigma^2_y}{2}(G'')$$

$$Cov(F,G) = E(FG) - E(F)E(G)$$
$$\approx H + Cov(x,y) \frac{\partial^2 H}{\partial x \partial y}+ \frac{\sigma^2_x}{2} \frac{\partial^2 H}{\partial x^2} + \frac{\sigma^2_y }{2}\frac{\partial^2H}{\partial y^2} - (F + \frac{\sigma^2_x}{2}(F''))(G + \frac{\sigma^2_y}{2}(G''))$$

If that's correct, the next step would be to write terms like $\frac{\partial^2 H}{\partial x \partial y}$ in terms of derivatives of $F$ and $G$.
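Carrying out that step: since $H(x,y) = F(x)G(y)$, we have $\frac{\partial^2 H}{\partial x \partial y} = F'G'$, $\frac{\partial^2 H}{\partial x^2} = F''G$, and $\frac{\partial^2 H}{\partial y^2} = FG''$, all evaluated at the means. Substituting in, the second-order terms cancel against the expansion of $E(F)E(G)$ up to a term of order $\sigma_x^2 \sigma_y^2$, leaving the leading approximation $Cov(F,G) \approx F'(\mu_x)\,G'(\mu_y)\,Cov(x,y)$. A Monte Carlo sketch of this result (the correlated-proportions setup is hypothetical):

```python
import numpy as np

rng = np.random.default_rng(2)

# Monte Carlo check of Cov(F,G) ~= F'(mu_x) G'(mu_y) Cov(x,y), with
# f = g = logit and proportions made correlated through a shared count.
n, size = 500, 400_000
shared = rng.binomial(n // 2, 0.4, size)
x = (shared + rng.binomial(n // 2, 0.4, size)) / n  # first sample proportion
y = (shared + rng.binomial(n // 2, 0.4, size)) / n  # second, correlated via `shared`

logit = lambda t: np.log(t / (1 - t))
dlogit = lambda t: 1 / (t * (1 - t))                # derivative of the logit

cov_sim = np.cov(logit(x), logit(y))[0, 1]
cov_delta = dlogit(x.mean()) * dlogit(y.mean()) * np.cov(x, y)[0, 1]
print(cov_sim, cov_delta)                           # should agree closely
```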

neznam said:
Sorry, one more clarification: the p's are between 0 and 1, so the argument of the log will not be negative.

First, note we are dealing with odds, not probabilities. The log odds ranges from negative to positive infinity. When p = 0.5, the odds = 1 and the log odds = 0.

$$\frac{p_1/(1-p_1)}{p_2/(1-p_2)}= \frac{p_1/q_1}{p_2/q_2}=\frac{p_1 q_2}{p_2 q_1}$$

The last term is the odds ratio expressed as a cross product. In log form it is $$(\ln(p_1)+\ln(q_2))-(\ln(p_2)+\ln(q_1))$$
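A minimal sketch of that cross-product form in code (the function name is mine):

```python
import math

def log_odds_ratio(p1, p2):
    """ln OR via the cross-product form (ln p1 + ln q2) - (ln p2 + ln q1)."""
    q1, q2 = 1 - p1, 1 - p2
    return (math.log(p1) + math.log(q2)) - (math.log(p2) + math.log(q1))

print(log_odds_ratio(0.5, 0.5))   # equal odds -> 0.0
print(log_odds_ratio(0.8, 0.2))   # odds 4 vs 1/4 -> ln(16), about 2.7726
```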

When the odds ratio (OR) is unity (ln OR = 0), the two odds functions (logits) are independent and the covariance is therefore zero.

SW VandeCarr said:
When the odds ratio (OR)is unity (lnOR=0), the two odds functions (logits) are independent and the covariance is therefore zero.

This link will give more detail (see section 5 re covariance). Your question regarding covariance requires some understanding of the use of odds ratios as measures of association to answer fully. You can download the full PDF from the linked page.

http://arxiv.org/abs/1105.0852

## 1. What is covariance and why is it important in statistical analysis?

Covariance is a measure of the relationship between two random variables. It shows how changes in one variable are associated with changes in another variable. In statistical analysis, covariance is important because it helps us understand the strength and direction of the relationship between two variables, which is crucial in making predictions and drawing conclusions from data.

## 2. How do you calculate covariance?

Covariance can be calculated using the formula: Cov(X, Y) = E[ (X - E[X]) * (Y - E[Y]) ]. In simpler terms, it is the expectation of the product of the differences between each variable and its mean. This calculation can be done using statistical software or by hand.
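A minimal sketch of that calculation in NumPy (dividing by N, the population form; the sample numbers are made up):

```python
import numpy as np

x = np.array([2.0, 4.0, 6.0, 8.0])
y = np.array([1.0, 3.0, 2.0, 6.0])

# Cov(X, Y) = E[(X - E[X]) * (Y - E[Y])], written out directly
cov_by_hand = ((x - x.mean()) * (y - y.mean())).mean()
print(cov_by_hand)                    # 3.5
print(np.cov(x, y, ddof=0)[0, 1])     # 3.5 -- same value from NumPy
```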

## 3. What is the difference between covariance and correlation?

Covariance and correlation both measure the linear relationship between two variables, but they differ in terms of scale and interpretation. Covariance has no upper or lower limit and its value can be positive or negative, whereas correlation is standardized and always falls between -1 and 1. In fact, correlation is simply the covariance divided by the product of the two standard deviations, which makes it unit-free and comparable across datasets.
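A minimal NumPy sketch of that standardization, computing correlation as the covariance divided by the product of the standard deviations (made-up data):

```python
import numpy as np

x = np.array([2.0, 4.0, 6.0, 8.0])
y = np.array([1.0, 3.0, 2.0, 6.0])

cov = np.cov(x, y, ddof=0)[0, 1]
corr = cov / (x.std() * y.std())        # standardize the covariance
print(corr)                             # always in [-1, 1]
print(np.corrcoef(x, y)[0, 1])          # matches NumPy's built-in
```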

## 4. How can covariance be used in data analysis?

Covariance can be used to identify patterns and relationships between variables in a dataset. It can also help in feature selection, where one of a pair of strongly covarying variables can be removed to reduce redundancy. In addition, it is used in various statistical tests and models such as linear regression and ANOVA to assess the strength of the relationship between variables.

## 5. How do you interpret the value of covariance?

The value of covariance is not easily interpretable on its own, as it depends on the scale of the variables being measured. A positive covariance indicates a positive relationship between variables, meaning they tend to increase or decrease together. A negative covariance indicates an inverse relationship, where one variable goes up while the other goes down. However, the magnitude of the covariance does not tell us the strength of the relationship, so it is important to also consider correlation or other measures of association.
