Character strings as random variables?

SW VandeCarr
Consider a character string randomly generated from an alphabet {T,H} of length L, where T and H each have a probability of 0.5. For an arbitrary finite L the probability of a given string is p=(0.5)^L.

A probability is the sole determinant of Shannon entropy (S). Therefore I'm claiming that such character strings have Shannon entropy which, given a uniform PDF, would be S = -log_2(p) = -log_2((0.5)^L) = L bits.

This is my reasoning for claiming that such character strings have entropy. I've been challenged on this based on the argument that each element of the string is a random variable, but the entire string is a "constant". In fact, there is no specification that the string need be generated sequentially. A string, as defined above, where L=10 has 1024 possible outcomes or states. Is this not an example of entropy?

EDIT: In addition, I'm claiming that if L were an RV and the character probabilities P(T) and P(H) were fixed, then S would be a random variable with a known PDF.
(see also LuculentCabal:logarithm of discrete RV Jul 12)
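
To make the claim concrete, here is a minimal sketch (Python; the helper names and the example distribution for L are only illustrative). It computes p = (0.5)^L, the corresponding entropy S = -log_2(p) = L bits for one realized string, and shows that if L itself is an RV with the character probabilities fixed, then S inherits a known PDF.

Code:
import math
import random

P_CHAR = 0.5  # probability of each character in the alphabet {T, H}

def string_probability(L, p_char=P_CHAR):
    """Probability of any particular length-L string: (0.5)^L."""
    return p_char ** L

def entropy_bits(L, p_char=P_CHAR):
    """Shannon entropy of the uniform ensemble of length-L strings: -log2(p) = L bits."""
    return -math.log2(string_probability(L, p_char))

random.seed(0)
s = "".join(random.choice("TH") for _ in range(10))   # one realization (a particular string)
print(s, string_probability(10), entropy_bits(10))    # p = 1/1024, S = 10.0 bits

# If L is itself an RV (say, uniform on {1, ..., 6}) with P(T) and P(H) fixed,
# then S = L is a random variable with the same known PDF.
pdf_of_S = {entropy_bits(L): 1 / 6 for L in range(1, 7)}
print(pdf_of_S)                                        # {1.0: 0.1667, 2.0: 0.1667, ..., 6.0: 0.1667}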
 
SW VandeCarr said:
Consider a character string randomly generated from an alphabet {T,H} of length L, where T and H each have a probability of 0.5. For an arbitrary finite L the probability of a given string is p=(0.5)^L.

A probability is the sole determinant of Shannon entropy (S). Therefore I'm claiming that such character strings have Shannon entropy which, given a uniform PDF, would be S = -log_2(p) = -log_2((0.5)^L) = L bits.

This is my reasoning for claiming that such character strings have entropy. I've been challenged on this based on the argument that each element of the string is a random variable, but the entire string is a "constant". In fact, there is no specification that the string need be generated sequentially. A string, as defined above, where L=10 has 1024 possible outcomes or states. Is this not an example of entropy?

EDIT: In addition, I'm claiming that if L were an RV and the character probabilities P(T) and P(H) were fixed, then S would be a random variable with a known PDF.
(see also LuculentCabal:logarithm of discrete RV Jul 12)

I think the confusion may be that a particular string is not a random variable but a realization/event of a random variable. Maybe in your past debate you were just having a communication problem.
 
John Creighto said:
I think the confusion may be that a particular string is not a random variable but a realization/event of a random variable. Maybe in your past debate you were just having a communication problem.

Well, that is the root of the problem apparently. But if you take the view that an outcome, once observed, has no information, then information/entropy doesn't exist as an observable. If we have a system which has 1024 equally probable states, then the entropy of that system is 10 bits in the Shannon measure, is it not? The string that is observed is one randomly realized state of the system. What is the proper context for the concept of information/entropy?
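
Spelling out the arithmetic behind that number (just the definition applied to a uniform distribution over the 1024 states):

S = -\log_{2}\frac{1}{1024} = \log_{2}2^{10} = 10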
 
SW VandeCarr said:
Well, that is the root of the problem apparently. But if you take the view that an outcome, once observed, has no information, then information/entropy doesn't exist as an observable. If we have a system which has 1024 equally probable states, then the entropy of that system is 10 bits in the Shannon measure, is it not? The string that is observed is one randomly realized state of the system. What is the proper context for the concept of information/entropy?

That all makes sense to me. Keep in mind though I haven't studied Shannon entropy but did try reading his paper once (a long long time ago).
 
John Creighto said:
That all makes sense to me. Keep in mind though I haven't studied Shannon entropy but did try reading his paper once (a long long time ago).

The essential thing you need to know is that entropy is defined as:

S = -k \sum_{i} p(x_{i}) \log_{2} p(x_{i})

Therefore any value that can be calculated from the appropriate input parameters by means of this equation is entropy. If entropy can be calculated for a character string, then the string has entropy. In the thermodynamic version, k is the Boltzmann constant, and the equation applies to a system whose microstates are defined in terms of the kinetic energies (KE) of the individual particles and whose macrostate is defined in terms of temperature (T), giving S = KE/T in SI units for systems in thermal equilibrium.

In the statistical application the same equation applies, usually with k = 1. The input values in the case at hand are the alphabet {T,H}, the length (L) of the string, and the probability of a given string, (0.5)^L.

So character strings can have entropy. The question is how they have entropy. If you assume that a string is generated sequentially, then the character output is known as the process proceeds. Here you can argue that each unit of output is the mapping of a random variable (RV), but the string as a whole is not the mapping of a RV. However, if the string is considered as a unit entity (one state of a system), then the entire string is the mapping of a random variable onto the event space.
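
As a quick numerical check of the definition (a sketch in Python; the function name is only illustrative), this enumerates all 2^L strings over {T,H} and evaluates the sum directly with k = 1:

Code:
import math
from itertools import product

def ensemble_entropy_bits(L, alphabet="TH"):
    """Evaluate S = -sum_i p(x_i) * log2 p(x_i) over all length-L strings (k = 1)."""
    outcomes = ["".join(chars) for chars in product(alphabet, repeat=L)]
    p = 1.0 / len(outcomes)                 # uniform: every string has probability (0.5)^L
    return -sum(p * math.log2(p) for _ in outcomes)

print(ensemble_entropy_bits(10))            # 10.0 -- agrees with -log2((0.5)^10)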
 
When I first read this I thought "So?". So I decided to look at your other thread to try and understand why you are making what seems to be an obvious point. A string has no entropy, as it is a particular instance of a random variable, but each string is one state or mode. The entropy of the entire system is based upon the number of modes and the probability of each mode. Now that we are in agreement so far, let's get back to your other thread.
 