Question about Shannon's mathematics

  • Context: Graduate
  • Thread starter: ScarTissue
  • Tags: Mathematics

Discussion Overview

The discussion revolves around a specific aspect of Shannon's paper "A Mathematical Theory of Communication," focusing on the equation for calculating the number of sequences of symbols of a given duration in the context of information theory. Participants are examining the implications of the summation in the equation and whether it accurately represents the counting of sequences.

Discussion Character

  • Technical explanation
  • Debate/contested

Main Points Raised

  • One participant questions the validity of the summation in Shannon's equation, suggesting that sequences may be counted multiple times, particularly when considering sequences of different lengths.
  • Another participant attempts to clarify that the equation is meant to count only t-length sequences, implying that shorter sequences are not included in the count.
  • A later reply emphasizes the importance of understanding what is being counted and suggests that the potential for double counting does not affect the overall validity of the equation.
  • Some participants express confidence in Shannon's work, suggesting that any apparent discrepancies may stem from misunderstandings rather than errors in the original paper.

Areas of Agreement / Disagreement

Participants do not reach a consensus on whether the summation in Shannon's equation is correct or if it leads to double counting. There are competing views regarding the interpretation of the equation and its implications for counting sequences.

Contextual Notes

The discussion highlights potential ambiguities in the definitions of sequences and their lengths, as well as the assumptions underlying the summation in Shannon's equation. These factors contribute to the uncertainty in the interpretation of the mathematical formulation.

ScarTissue
I'm trying to go through Shannon's paper "A Mathematical Theory of Communication" to improve my understanding of information theory.

In Part I (Discrete Noiseless Systems) Shannon states:

Suppose all sequences of the symbols S1, ..., Sn are allowed and these symbols have durations t1, ..., tn. What is the channel capacity?
If N(t) represents the number of sequences of duration t we have

N(t) = N(t - t1) + N(t - t2) + ... + N(t - tn).

The total number is equal to the sum of the numbers of sequences ending in S1, S2, ..., Sn and these are N(t - t1), N(t - t2), ..., N(t - tn), respectively.


I can't see how this sum actually works. For example, if t1 = 2 s and t2 = 4 s, then the first term in the sum is the number of all sequences ending in S1, as expected. However, the second term seems to be the number of all sequences ending in either S1 S1 or S2. That would mean some of the sequences ending in S1 are counted twice by this sum.

Am I missing something here? Or am I correct and the right hand side of the equation is going to be larger than the left?
 
To get a sequence of duration t, we append some Si to a sequence of duration t - ti. N(t) is just the number of sequences of duration t, and is the sum of those with each Si to be appended.

-- sorry, I see I answered the wrong question. The question was, is the sum correct?
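
One way to check whether the sum is correct is to count the sequences two independent ways and compare. The sketch below is only an illustration, not something taken from Shannon's paper: the function names are made up, durations are assumed to be positive integers, and the boundary conditions N(0) = 1 (the empty sequence) and N(t) = 0 for t < 0 are assumptions added so the recurrence can be evaluated.

```python
from functools import lru_cache
from itertools import product


def count_by_recurrence(t, durations):
    """N(t) from N(t) = N(t - t1) + ... + N(t - tn).

    Assumed boundary conditions: N(0) = 1 and N(t) = 0 for t < 0.
    """
    @lru_cache(maxsize=None)
    def n(remaining):
        if remaining < 0:
            return 0
        if remaining == 0:
            return 1  # only the empty sequence has duration 0
        return sum(n(remaining - d) for d in durations)

    return n(t)


def count_by_enumeration(t, durations):
    """Brute force: try every sequence of symbols and keep those lasting exactly t."""
    max_len = t // min(durations)  # no sequence of duration t can be longer than this
    count = 0
    for length in range(max_len + 1):
        for seq in product(range(len(durations)), repeat=length):
            if sum(durations[i] for i in seq) == t:
                count += 1
    return count


if __name__ == "__main__":
    durations = (2, 4)  # t1 = 2, t2 = 4, as in the question
    for t in range(0, 13):
        print(t, count_by_recurrence(t, durations), count_by_enumeration(t, durations))
```

If the right-hand side were over-counting, the first number would come out larger than the second for some t. With t1 = 2 and t2 = 4 the two counts agree for every t in this range.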
 
Yes, I understand what the terms mean (I think) but I don't see how the two sides of the equation are equal.
 
We know Claude Shannon as one of the forefathers of the digital age. Someone with this much foresight would not easily make a mistake. Whatever he wrote there we must assume was intentional.

Therefore, look again. And focus too on what is being counted. We are counting only t-length sequences, not any shorter sequences.

In reference to: "So this means that some of the sequences ending in S1 have been counted twice by this sum."

PS. Sorry for biting your head off.
 
"We are counting only t-length sequences, not any shorter sequences."

Right. So if we have t1=2 and t2=4, any sequence ending in two S1's will have the same length as any sequence ending in one S2. In such a case you count the S1S1 sequences twice.

I don't believe Shannon could have made a mistake in this paper, and I don't believe it could have gone unquestioned if he had. So really I'm just trying to understand why you don't count the sequences above twice, or if you do, why it doesn't matter.
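
On the double-counting worry, it may help to look at what each term contributes. The term N(t - t2) counts sequences of duration t - t2, and each of those then has S2 appended, so every sequence it accounts for ends in S2. A sequence ending in S1 S1 ends in S1, so it is counted only by the N(t - t1) term, via its prefix of duration t - t1. The sketch below lists the sequences for t1 = 2, t2 = 4 and groups them by last symbol; the helper names are hypothetical, and integer durations are an assumption made for the illustration.

```python
from itertools import product


def sequences_of_duration(t, durations):
    """Every sequence (tuple of symbol indices) whose durations add up to exactly t."""
    max_len = t // min(durations)
    found = []
    for length in range(max_len + 1):
        for seq in product(range(len(durations)), repeat=length):
            if sum(durations[i] for i in seq) == t:
                found.append(seq)
    return found


def group_by_last_symbol(t, durations):
    """Split the duration-t sequences into one group per final symbol."""
    groups = {i: [] for i in range(len(durations))}
    for seq in sequences_of_duration(t, durations):
        if seq:  # the empty sequence has no last symbol
            groups[seq[-1]].append(seq)
    return groups


durations = (2, 4)  # t1 = 2, t2 = 4
t = 8
groups = group_by_last_symbol(t, durations)
for i, seqs in groups.items():
    names = [" ".join(f"S{j + 1}" for j in seq) for seq in seqs]
    prefixes = len(sequences_of_duration(t - durations[i], durations))
    print(f"ends in S{i + 1}: {len(seqs)} sequences, N(t - t{i + 1}) = {prefixes}")
    for name in names:
        print("   ", name)
```

Each sequence has exactly one last symbol, so the groups cannot overlap, and the size of each group matches the corresponding N(t - ti). A sequence ending in S1 S1 and one ending in S2 can have the same duration, but they are different sequences and each is counted once.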
 
