I'm trying to measure how much non-redundant (i.e. actual) information my file contains. Some call this the amount of entropy.

Of course there is the standard Shannon formula, H = -Σ p(x) log2 p(x), but I think that Shannon was only considering it from the point of view of transmitting through a channel. Hence the formula requires a block size (say in bits, 8 typically). For a large file this calculation is fairly useless, since it ignores short- and long-range correlations between symbols.
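For concreteness, here is a minimal sketch of that per-byte Shannon estimate (block size 8 bits): count byte frequencies and apply H = -Σ p(x) log2 p(x). The toy data at the end is just for illustration; for a real file you would read its bytes as shown in the comment.

```python
import math
from collections import Counter

def shannon_entropy_bits_per_byte(data: bytes) -> float:
    """Shannon entropy of the byte-frequency distribution, in bits per byte."""
    counts = Counter(data)
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# For a real file: data = open("hiss.wav", "rb").read()
data = bytes(range(256)) * 4  # toy data: uniform over all 256 byte values
h = shannon_entropy_bits_per_byte(data)
print(f"{h:.3f} bits/byte")   # uniform bytes give the maximum, 8.0 bits/byte
```

Multiplying h by the file length in bytes (and dividing by 8) gives the "entropy via the Shannon formula" figure in MB, but note this treats bytes as independent, which is exactly the weakness described above.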

There are binary-tree and Lempel-Ziv methods, but these seem highly academic in nature.

Compressibility is also regarded as a measure of entropy, but there seems to be no clear lower limit on how far a file can be compressed. For my file hiss.wav:

- original hiss.wav = 5.2 MB

- entropy via the Shannon formula = 4.6 MB

- hiss.zip = 4.6 MB

- hiss.7z = 4.2 MB

- hiss.wav.fp8 = 3.3 MB
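Any compressor's output size is only an upper bound on the information content (the true lower limit, the Kolmogorov complexity, is uncomputable), which is why each stronger compressor in the list above lands lower. A quick way to get such bounds from Python's standard library, sketched here on redundant toy data:

```python
import lzma
import zlib

def compressed_size_bounds(data: bytes) -> dict:
    """Compressed sizes: each is an upper bound on the information content."""
    return {
        "raw": len(data),
        "zlib": len(zlib.compress(data, 9)),        # DEFLATE, like .zip
        "lzma": len(lzma.compress(data, preset=9)), # LZMA, like .7z
    }

# Highly redundant input: both bounds fall far below the raw size.
sizes = compressed_size_bounds(b"hiss " * 10000)
print(sizes)
```

For a real recording you would pass `open("hiss.wav", "rb").read()`; a domain-specific codec (like the .fp8 result above) can always undercut these general-purpose bounds.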

Is there some reasonably practicable method of measuring how much entropy exists within hiss.wav?

**Physics Forums | Science Articles, Homework Help, Discussion**


# How to practically measure entropy of a file?
