I'm working through Shannon's paper "A Mathematical Theory of Communication" to improve my understanding of information theory. In Part I (Discrete Noiseless Systems), Shannon states:

"Suppose all sequences of the symbols S1, ..., Sn are allowed and these symbols have durations t1, ..., tn. What is the channel capacity? If N(t) represents the number of sequences of duration t we have

N(t) = N(t - t1) + N(t - t2) + ... + N(t - tn).

The total number is equal to the sum of the numbers of sequences ending in S1, S2, ..., Sn and these are N(t - t1), N(t - t2), ..., N(t - tn), respectively."

I can't understand how this sum actually works. For example, if t1 = 2s and t2 = 4s, then the first term in the sum, N(t - t1), is the number of all sequences ending in S1, as expected. However, the second term, N(t - t2), is then going to be the number of all sequences ending in either S1,S1 or S2. So it seems that some of the sequences ending in S1 have been counted twice by this sum.

Am I missing something here? Or am I correct, and the right-hand side of the equation is going to be larger than the left?
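To make my confusion concrete, here is a small sketch I put together for my example durations t1 = 2 and t2 = 4 (the convention N(0) = 1 for the empty sequence is my own assumption, not something stated in the paper). It counts sequences two ways: by explicitly enumerating every sequence of total duration t, and by applying Shannon's recurrence directly.

```python
# My hypothetical example from the question: two symbols S1, S2
# with durations t1 = 2 and t2 = 4.
DURATIONS = {"S1": 2, "S2": 4}

def enumerate_seqs(t):
    """Explicitly list all symbol sequences whose durations sum to exactly t."""
    if t == 0:
        return [[]]  # one empty sequence (my assumed base case)
    seqs = []
    for sym, d in DURATIONS.items():
        if t - d >= 0:
            # every sequence of duration t ending in `sym` is a
            # sequence of duration t - d with `sym` appended
            for prefix in enumerate_seqs(t - d):
                seqs.append(prefix + [sym])
    return seqs

def N(t):
    """Shannon's recurrence: N(t) = N(t - t1) + ... + N(t - tn)."""
    if t < 0:
        return 0
    if t == 0:
        return 1  # assumed base case, matching the empty sequence above
    return sum(N(t - d) for d in DURATIONS.values())

for t in range(0, 9, 2):
    seqs = enumerate_seqs(t)
    # check whether the enumeration contains any duplicates
    distinct = len(set(map(tuple, seqs)))
    print(t, len(seqs), distinct, N(t))
```

For small t this lets me compare the brute-force count, the number of distinct sequences, and the value the recurrence gives, to see whether any sequence really is counted twice.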