    Information Theory - Shannon's Self-Information units

    No, you can use whatever logarithmic base you like. With base e the unit of information is the "nat" (natural unit), and with base 2 it is the "bit" (binary unit).
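As a rough illustration of the point about bases, here is a minimal Python sketch. The function name `self_information` and the example probability are made up for illustration. It computes the self-information I(p) = -log(p) of the same event in bits and in nats, and shows that the two values differ only by the constant factor ln 2.

```python
import math

def self_information(p, base=2.0):
    """Self-information -log_base(p) of an event with probability p (hypothetical helper)."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in the interval (0, 1]")
    # Change of base: log_base(p) = ln(p) / ln(base)
    return -math.log(p) / math.log(base)

p = 0.25                                  # example probability, chosen arbitrarily
bits = self_information(p, base=2)        # base 2 -> unit "bit"
nats = self_information(p, base=math.e)   # base e -> unit "nat"

# The choice of base only rescales the result: nats = bits * ln(2)
print(bits, nats, bits * math.log(2))
```

The base changes the unit, not the information content, so converting between units is a single multiplication.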
    Wavelet transform

    Maybe this would help you? And yes, you can use coefficients or sums of coefficients as input data. You could also use a wavelet network, where the neurons' activation functions are wavelet functions. See...
    You can also use the R language with a wavelet package. A good starter book about wavelets with R...
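To make the idea of using coefficients, or sums of coefficients, as input data more concrete, here is a small Python sketch. It is only an illustration under stated assumptions: it uses a hand-written orthonormal Haar transform rather than any particular wavelet library, the function names `haar_step` and `haar_features` are hypothetical, and summing the squared detail coefficients per level is just one possible way to condense them into features.

```python
import math

def haar_step(signal):
    """One level of the orthonormal Haar transform: returns (approximation, detail)."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

def haar_features(signal, levels):
    """Sum of squared detail coefficients per level, plus the final approximation energy."""
    features = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        features.append(sum(d * d for d in detail))
    features.append(sum(a * a for a in approx))
    return features

# Example signal, invented for illustration; its length must be divisible by 2**levels.
signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
features = haar_features(signal, levels=3)
print(features)  # a short feature vector that could be fed to a classifier or network
```

Because this Haar transform is orthonormal, the features add up to the energy of the original signal, which gives a quick sanity check on the decomposition.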