Entropy (Information Theory Question)

In summary, the conversation discusses finding the entropy of two independent integer-valued random variables, X and Y, where X is uniformly distributed over {1, 2, ..., 8} and Y has probability mass function Pr{Y = k} = 2^-k, k = 1, 2, 3, ... The answers to (a) and (b) are 3 bits and 2 bits, respectively, but there is confusion about part (c): the given answer is 5 bits, while an attempt using a log-det linear-transformation formula yields 6 bits. It is then determined that, because (X+Y, X-Y) and (X, Y) determine each other and X and Y are independent, H(X+Y, X-Y) = H(X, Y) = H(X) + H(Y) = 5 bits.
  • #1
Jskota

Homework Statement


Let ##X## and ##Y## be two independent integer-valued random variables. Let ##X## be uniformly distributed over ##\left\{1,2,...,8\right\}##, and let ##\text{Pr}\left\{Y=k\right\} =2^{-k},~k=1,2,3,...##
(a) Find ##H(X)##.
(b) Find ##H(Y)##.
(c) Find ##H(X+Y,X-Y)##.

Homework Equations


I am confused about part (c). I have found the answers to (a) and (b): they are 3 bits and 2 bits, respectively. However, the solution I get for (c) does not match the given answer, which is apparently 5 bits.
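A quick numerical sanity check of (a) and (b), as a minimal Python sketch (truncating ##Y##'s infinite support at k = 60, where the tail is negligible):

```python
import math

# (a) X uniform over {1, ..., 8}: H(X) = log2(8) = 3 bits
H_X = -sum((1 / 8) * math.log2(1 / 8) for _ in range(8))

# (b) Pr{Y = k} = 2^-k: H(Y) = sum_k k * 2^-k = 2 bits
# (truncating the infinite sum at k = 60; the remaining tail is negligible)
H_Y = -sum(2 ** -k * math.log2(2 ** -k) for k in range(1, 61))

print(H_X, H_Y)  # -> 3.0, ~2.0
```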

The Attempt at a Solution


I define ##Z=X+Y## and ##W=X-Y##. I then form the vectors ##\mathbf{u} = [Z,W]^T## and ##\mathbf{v}=[X,Y]^T## and relate them by the linear transformation

##\mathbf{u}=\begin{bmatrix}1&1 \\ 1&-1 \end{bmatrix}\mathbf{v}=\mathbf{M}\mathbf{v}##.

Therefore, ##H(\mathbf{u})=H(X+Y,X-Y)=H(\mathbf{v})+\log_2\lvert\text{det}\left(\mathbf{M}\right)\rvert##. I then have

##\log_2\lvert\text{det}\left(\mathbf{M}\right)\rvert=1## bit
##H(\mathbf{v})=H(X)+H(Y|X)=H(X)+H(Y)=5## bits (since ##Y## is independent of ##X##).

This leaves me with the answer for (c) to be 6 bits.

Edit: Unless the formula I am using with log-det is only for continuous and not discrete distributions.
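For reference, a brute-force enumeration of the joint pmf of ##(X+Y, X-Y)## (a Python sketch, with ##Y##'s support truncated) does come out at about 5 bits, matching the stated answer:

```python
import math
from collections import defaultdict

# Brute-force the joint pmf of (X+Y, X-Y), truncating Y's support at k = 60
# (the neglected tail probability is about 2^-60).
joint = defaultdict(float)
for x in range(1, 9):           # X uniform over {1, ..., 8}
    for y in range(1, 61):      # Pr{Y = y} = 2^-y
        joint[(x + y, x - y)] += (1 / 8) * 2 ** -y

H = -sum(p * math.log2(p) for p in joint.values())
print(H)  # ~5.0 bits = H(X) + H(Y)
```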
 
  • #2
I'm not familiar with this log2|det(M)| formula. Can you post a link?
It feels wrong. If Y = 2X, would H(Y) be different from H(X)?
It's sort of obvious that since X and Y are independent H(X+Y,X-Y) = H(X,Y) = H(X)+H(Y).
 
  • #3
haruspex said:
I'm not familiar with this log2|det(M)| formula. Can you post a link?
It feels wrong. If Y = 2X, would H(Y) be different from H(X)?
It's sort of obvious that since X and Y are independent H(X+Y,X-Y) = H(X,Y) = H(X)+H(Y).

I think it was incorrect usage; it doesn't apply here since these are pmfs, not pdfs. I got it from a wiki page on differential entropy.

Anyway, I agree that if Y is a scaled version of X, the uncertainty in the RV is the same: the probabilities are unchanged regardless of the values they take on the sample space.

I guess I didn't see it as obvious here because H(X+Y, X-Y) looks more complicated than it is. The way I understand it now: given Z = X+Y and W = X-Y, we can always recover X = (Z+W)/2 and Y = (Z-W)/2, so knowing (Z, W) is the same as knowing (X, Y). And since X and Y are independent, the entropy is just the sum.

Do you know whether, in general, entropy is preserved when there is an affine relationship between RVs? It makes sense conceptually, but I couldn't find a theorem for it in my book (Cover & Thomas).
 
  • #4
Jskota said:
Do you know whether, in general, entropy is preserved when there is an affine relationship between RVs?
For discrete RVs, I would say they'd be the same given any bidirectional deterministic relationship. If Y = f(X) is a bijection, P(Y=f(x)) = P(X=x).
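For instance, a minimal sketch with an arbitrary affine bijection f(x) = 3x - 7 (chosen purely for illustration): the pmf of Y = f(X) is the same multiset of probabilities as that of X, so the entropies agree.

```python
import math

def entropy(pmf):
    """Shannon entropy in bits of a dict mapping outcomes to probabilities."""
    return -sum(p * math.log2(p) for p in pmf.values() if p > 0)

# X uniform over {1, ..., 8}; Y = f(X) with f(x) = 3x - 7, a bijection on the support.
p_X = {x: 1 / 8 for x in range(1, 9)}
p_Y = {3 * x - 7: p for x, p in p_X.items()}

print(entropy(p_X), entropy(p_Y))  # both 3.0 bits
```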
 
  • #5
haruspex said:
For discrete RVs, I would say they'd be the same given any bidirectional deterministic relationship. If Y = f(X) is a bijection, P(Y=f(x)) = P(X=x).
Okay, that is actually what I was getting at last night. Shortly after posting this, I managed to convince myself that 5 bits makes sense. Thank you for the help!
 

1. What is entropy in information theory?

Entropy in information theory is a measure of the uncertainty or randomness in a system or set of data. It is often used to quantify the amount of information contained in a message or to describe the disorder or unpredictability of a system.

2. How is entropy calculated?

The formula for calculating entropy is ##H = -\sum_i p_i \log_2 p_i##, where ##p_i## represents the probability of each possible outcome. This formula takes into account both the number of possible outcomes and their respective probabilities to determine the overall level of entropy in a system or dataset.
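For example, a minimal sketch of this formula in Python:

```python
import math

def entropy(probs):
    """Shannon entropy H = -sum(p * log2(p)) in bits; outcomes with p = 0 contribute 0."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(entropy([0.5, 0.5]))                  # 1.0 bit (fair coin)
print(entropy([1 / 8] * 8))                 # 3.0 bits (uniform over 8 outcomes)
print(entropy([0.5, 0.25, 0.125, 0.125]))   # 1.75 bits
```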

3. What is the relationship between entropy and information gain?

Information gain is the reduction in entropy: the entropy before an observation or split minus the (weighted) entropy that remains afterwards. If a split leaves the data almost as disordered as before, it provides little information gain; if it produces groups that are nearly pure and predictable, it provides a large gain. In that sense, the more residual entropy (uncertainty) a step leaves behind, the less information it has gained.
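As a rough illustration with a made-up set of class labels, the information gain of a split is the parent entropy minus the size-weighted average entropy of the children:

```python
import math

def entropy(labels):
    """Entropy in bits of a list of class labels."""
    n = len(labels)
    return -sum((labels.count(c) / n) * math.log2(labels.count(c) / n)
                for c in set(labels))

# Hypothetical parent node with a 50/50 class mix, split into two much purer children.
parent = ['a'] * 8 + ['b'] * 8
left   = ['a'] * 7 + ['b'] * 1
right  = ['a'] * 1 + ['b'] * 7

weighted_children = (len(left) * entropy(left) + len(right) * entropy(right)) / len(parent)
gain = entropy(parent) - weighted_children
print(gain)  # ~0.46 bits of uncertainty removed by the split
```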

4. How is entropy used in data compression?

Entropy is used in data compression to quantify how compactly a given set of data can be stored. By Shannon's source coding theorem, the entropy of a source is a lower bound on the average number of bits per symbol needed for lossless encoding. Compression algorithms exploit patterns and redundancies in the data to approach this bound, so the lower the entropy of a dataset, the more effectively it can be compressed without losing any information.
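A rough illustration using Python's standard zlib compressor (chosen only as an example of an entropy-exploiting codec): low-entropy data shrinks to a tiny fraction of its size, while high-entropy (random) data barely compresses at all.

```python
import os
import zlib

# 100 kB of a single repeated byte (near-zero entropy) vs 100 kB of random bytes
# (close to 8 bits of entropy per byte).
low_entropy = b'a' * 100_000
high_entropy = os.urandom(100_000)

print(len(zlib.compress(low_entropy)))   # a few hundred bytes at most
print(len(zlib.compress(high_entropy)))  # roughly 100,000 bytes -- essentially incompressible
```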

5. What is the role of entropy in machine learning?

In machine learning, entropy is often used as a metric to evaluate the quality of a split in a decision tree algorithm. The goal is to minimize the weighted entropy of the child nodes (and thus maximize information gain) at each step, producing an effective and accurate decision tree for a given dataset. Entropy also underlies feature selection: features that yield a larger reduction in the entropy of the target variable (higher information gain, or mutual information with the target) are considered more informative for prediction.
