Shannon Information Theory: Transducer Entropy doesn't increase

In summary: Shannon also notes that if the transducer is non-singular, meaning an inverse transducer exists, then the output entropy equals the input entropy. The reason is a cascade argument: feeding the output into the inverse transducer recovers the original sequence, so the entropy rate after the inverse equals that of the source, and since each stage can only lower or preserve the entropy rate, the rate after the first transducer must already be equal to it.
  • #1
VVS
Hi,

I am reading Shannon's paper on the Theory of Communication and I am having trouble with a concept.
Shannon writes:

The output of a finite state transducer driven by a finite state statistical source is a finite state statistical source, with entropy (per unit time) less than or equal to that of the input. If the transducer is non-singular they are equal.
Let α represent the state of the source, which produces a sequence of symbols xi; and let β be the state of the transducer, which produces, in its output, blocks of symbols yj. The combined system can be represented by the "product state space" of pairs (α,β). Two points in the space (α1,β1) and (α2,β2), are connected by a line if α1 can produce an x which changes β1 to β2, and this line is given the probability of that x in this case. The line is labelled with the block of yj symbols produced by the transducer. The entropy of the output can be calculated as the weighted sum over the states. If we sum first on β each resulting term is less than or equal to the corresponding term for α, hence the entropy is not increased. If the transducer is non-singular, let its output be connected to the inverse transducer. If H′1, H′2 and H′3 are the output entropies of the source, the first and second transducers respectively, then H′1≥H′2≥H′3=H′1 and therefore H′1=H′2.


I am not able to show the decrease or equality in entropy mathematically. This is what I have got:

[itex] H(y|\beta)=-\sum_{i,j}P(\beta_i)P(y_j|\beta_i)\log\left[P(y_j|\beta_i)\right][/itex]
[itex] H(y|\beta)=-P(\beta_1)\sum_{j}P(y_j|\beta_1)\log\left[P(y_j|\beta_1)\right]-P(\beta_2)\sum_{j}P(y_j|\beta_2)\log\left[P(y_j|\beta_2)\right]-\ldots=\sum_i P(\beta_i)H(y|\beta_i)[/itex]
Now assume that there exist states [itex]\alpha_k[/itex] with output [itex]x_{l}^{k}[/itex] which cause the transition from [itex]\beta_1 \rightarrow \beta_2[/itex]
Then the entropy of those states which cause this particular transition with a particular input is:
[itex]H(\beta_1\rightarrow\beta_2|\alpha_k)=H(x_l^k|\alpha_k)=-\sum_{k,l}P(\alpha_k)P(x_{l}^{k}|\alpha_k)log\left[P(x_{l}^{k}|\alpha_k)\right] [/itex]

My guess is that there exists a relation between [itex]P(\beta_i)P(y_j|\beta_i)[/itex] and [itex]P(\alpha_k)P(x_{l}^{k}|\alpha_k)[/itex] but I just can't see it.

I forgot to add that the output of the transducer and the next state of the transducer are determined by these functions:
[itex] y_n=f(x_n,\beta_n) [/itex]
[itex] \beta_{n+1}=g(x_n,\beta_n) [/itex]
But Shannon doesn't mention whether these mappings are bijective or not.
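
A quick numerical check of the claim (the two-state source and the tables for [itex]f[/itex] and [itex]g[/itex] below are made-up toy choices, not taken from the paper): the script builds the product chain over [itex](\alpha,\beta)[/itex], finds its stationary distribution, and compares Shannon's weighted-sum entropies of the input and the output.

[code]
# Toy check that a finite-state transducer does not raise the entropy rate.
# The source and transducer tables are made-up examples, not from the paper.
from collections import defaultdict
from math import log2

# Source: in state a it emits symbol x with probability p and moves to next_a.
SOURCE = {
    0: [('A', 0.5, 0), ('B', 0.5, 1)],
    1: [('A', 0.9, 0), ('B', 0.1, 1)],
}

def f(x, b):
    # Output block y_n = f(x_n, beta_n); singular for beta = 0 (both inputs -> 'Z').
    return 'Z' if b == 0 else x

def g(x, b):
    # Next transducer state beta_{n+1} = g(x_n, beta_n); toggles on 'B'.
    return 1 - b if x == 'B' else b

STATES = [(a, b) for a in SOURCE for b in (0, 1)]

# Stationary distribution of the product chain (alpha, beta) by iterating the chain.
pi = {s: 1.0 / len(STATES) for s in STATES}
for _ in range(2000):
    nxt = defaultdict(float)
    for (a, b), w in pi.items():
        for x, p, a_next in SOURCE[a]:
            nxt[(a_next, g(x, b))] += w * p
    pi = dict(nxt)

def entropy(dist):
    return -sum(p * log2(p) for p in dist.values() if p > 0)

h_in = h_out = 0.0
for (a, b), w in pi.items():
    p_x = defaultdict(float)   # distribution of x given (alpha, beta); depends on alpha only
    p_y = defaultdict(float)   # distribution of y = f(x, beta) given (alpha, beta)
    for x, p, _ in SOURCE[a]:
        p_x[x] += p
        p_y[f(x, b)] += p
    h_in += w * entropy(p_x)
    h_out += w * entropy(p_y)

print("weighted-sum entropy of input :", round(h_in, 4), "bits/symbol")
print("weighted-sum entropy of output:", round(h_out, 4), "bits/symbol")
[/code]

With this singular [itex]f[/itex] the output number comes out strictly smaller; replacing [itex]f[/itex] by a map that is one-to-one for every [itex]\beta[/itex] makes the two numbers equal.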
 
  • #2
I think the key idea is this: for a fixed transducer state [itex]\beta[/itex], the output block is a deterministic function of the input symbol, [itex]y=f(x,\beta)[/itex]. A deterministic mapping can merge several input symbols into the same output block, but it can never split one input into several outputs, so for each product state [itex](\alpha,\beta)[/itex] the conditional entropy of the output block is at most the conditional entropy of the input symbol. Shannon's weighted sum over the product states then gives an output entropy rate that is at most the input rate; equality for a non-singular transducer follows from cascading it with its inverse transducer.
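
Concretely, here is a sketch of the missing step (my own derivation using the [itex]f[/itex] and [itex]g[/itex] from post #1, so treat it as a sketch rather than Shannon's exact wording). Since the emitted symbol depends only on the source state, [itex]P(x_l|\alpha_k,\beta_i)=P(x_l|\alpha_k)[/itex], and the relation guessed at in post #1 is

[itex] P(\beta_i)P(y_j|\beta_i)=\sum_{k}\sum_{l:\,f(x_l,\beta_i)=y_j}P(\alpha_k,\beta_i)P(x_l|\alpha_k) [/itex]

For a fixed product state the output block is a function of the input symbol, so

[itex] H(y|\alpha_k,\beta_i)\leq H(x|\alpha_k,\beta_i)=H(x|\alpha_k) [/itex]

with equality exactly when [itex]x\mapsto f(x,\beta_i)[/itex] is one-to-one on the symbols that state [itex]\alpha_k[/itex] can emit. Weighting by [itex]P(\alpha_k,\beta_i)[/itex] and summing over the product states gives [itex]H'_2\leq H'_1[/itex]; for a non-singular transducer, connecting the inverse transducer and applying the same bound twice gives [itex]H'_1\geq H'_2\geq H'_3=H'_1[/itex], hence [itex]H'_1=H'_2[/itex].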
 

1. What is Shannon Information Theory?

Shannon Information Theory is a mathematical theory that was developed by Claude Shannon in 1948. It deals with the concept of information and how it can be quantified, transmitted, and processed. It has applications in fields such as communication systems, cryptography, and data compression.

2. How does Shannon Information Theory relate to transducer entropy?

Transducer entropy is a concept within Shannon Information Theory that refers to the entropy rate (per unit time) of a transducer's output, i.e. the average uncertainty or randomness per output symbol. The more unpredictable the output blocks are, the higher this rate; Shannon's result is that it can never exceed the entropy rate of the source driving the transducer.
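
Quantitatively, Shannon measures this as the entropy of the output treated as a finite-state statistical source: if [itex]P_i[/itex] is the probability of being in state [itex]i[/itex] and [itex]p_i(j)[/itex] the probability of producing symbol [itex]j[/itex] from that state, the entropy per symbol is

[itex] H=\sum_i P_i H_i=-\sum_{i,j}P_i\,p_i(j)\log p_i(j) [/itex]

which is the "weighted sum over the states" referred to in the thread above.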

3. Why doesn't transducer entropy increase in Shannon Information Theory?

In Shannon's setup, the transducer is deterministic: given its current state and the input symbol, the output block and the next state are fixed, so the transducer contributes no randomness of its own. All the uncertainty in the output comes from the input, and since a deterministic mapping can merge different inputs into one output but never split a single input into several outputs, the output entropy rate is at most the input entropy rate. Making the input more random can raise the output entropy, but never above the entropy of the input itself.
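
A minimal numerical illustration (the distribution and the many-to-one map below are arbitrary toy choices, not anything from Shannon's paper): a deterministic map can merge probabilities but never split them, so the output entropy is never larger than the input entropy.

[code]
from math import log2

def entropy(dist):
    """Shannon entropy in bits of a dict {outcome: probability}."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)

# Arbitrary input distribution over four symbols (toy example).
p_x = {'a': 0.4, 'b': 0.3, 'c': 0.2, 'd': 0.1}

# A deterministic, many-to-one map, like a transducer frozen in one state.
f = {'a': 0, 'b': 0, 'c': 1, 'd': 1}

# Distribution of the output f(X): probabilities of merged symbols add up.
p_y = {}
for x, p in p_x.items():
    p_y[f[x]] = p_y.get(f[x], 0.0) + p

print(entropy(p_x))  # about 1.85 bits
print(entropy(p_y))  # about 0.88 bits, never more than the input entropy
[/code]

If f were one-to-one the two numbers would coincide, which is the non-singular case discussed above.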

4. How does Shannon Information Theory impact communication systems?

Shannon Information Theory is highly relevant in the field of communication systems. It provides a framework for understanding how information can be transmitted efficiently and reliably over a noisy channel. It also helps in designing communication systems that can handle various types of data and noise levels.
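
For example, the channel-capacity part of the theory gives the highest rate at which data can be sent over a band-limited channel with additive white Gaussian noise and still be recovered with arbitrarily small error probability, the well-known Shannon-Hartley formula

[itex] C=B\log_2\left(1+\frac{S}{N}\right) [/itex]

bits per second, where [itex]B[/itex] is the bandwidth in hertz and [itex]S/N[/itex] is the signal-to-noise ratio.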

5. What are some real-world applications of Shannon Information Theory?

Shannon Information Theory has numerous applications in various fields, including communication systems, data compression, cryptography, and speech recognition. It has also been used in fields such as biology, finance, and psychology to study information processing and decision-making.
