Jan 26
entropy
Discuss the "entropy" concept further,
and the formulas that go along with it.
Work through the variations of the entropy
formula for the conditional probabilities
calculated in the first homework.
How would this work for longer chunks
of bits grouped together as single symbols
(e.g. 8-bit bytes)?
What are the units of H, the information entropy?
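One way to make these questions concrete is a small
sketch of the standard formula H = -sum_i p_i log2(p_i),
with the p_i estimated from symbol counts. The function
name entropy_per_symbol and the choice to hold the stream
as a string of '0' and '1' characters are assumptions
made for illustration, not something taken from the homework.

```python
from collections import Counter
from math import log2


def entropy_per_symbol(bits: str, k: int = 1) -> float:
    """Estimate H in bits per symbol, treating each k-bit chunk as one symbol.

    `bits` is assumed to be a string of '0' and '1' characters.
    """
    # Split into non-overlapping k-bit chunks; drop an incomplete trailing chunk.
    usable = len(bits) - len(bits) % k
    chunks = [bits[i:i + k] for i in range(0, usable, k)]
    if not chunks:
        return 0.0
    counts = Counter(chunks)
    total = len(chunks)
    # p * log2(1/p) is the same as -p * log2(p), summed over observed symbols.
    return sum((n / total) * log2(total / n) for n in counts.values())
```

Because the logarithm is base 2, the result is in bits
per symbol; dividing by k gives bits per original bit.
With k=8 the symbols are 8-bit bytes and H can range
from 0 to 8.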
Discuss the relationship between compressibility
and information entropy. Work through some specific
examples of files with a limited character set,
say only 01 and 10 pairs.
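As a worked sketch of the 01/10 example, assume
(this is not stated in the notes) that the two pairs
occur equally often and independently of each other.
Then the entropy depends on what we call a symbol:

```latex
% Single bits as symbols: P(0) = P(1) = 1/2
H_1 = -\tfrac12 \log_2 \tfrac12 - \tfrac12 \log_2 \tfrac12
    = 1 \ \text{bit per bit}

% Pairs as symbols: P(01) = P(10) = 1/2, \; P(00) = P(11) = 0
H_2 = -\tfrac12 \log_2 \tfrac12 - \tfrac12 \log_2 \tfrac12
    = 1 \ \text{bit per pair}
    = \tfrac12 \ \text{bit per original bit}
```

Under that assumption, recoding 01 as 0 and 10 as 1
halves the file, which matches the 1/2 bit per original
bit found by treating pairs as symbols.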
homework
Go over the homework probability computations.
Next homework (in part): calculate the entropies
of stream1.txt and stream2.txt.
real bits and/or bytes ?
Can we write a tool to return a calculated
entropy for an arbitrary file? What would
be the limitations of such a tool?
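A minimal sketch of such a tool, assuming each byte of
the file is one symbol and the probabilities are simply
the observed byte frequencies. The name file_entropy is
hypothetical.

```python
from collections import Counter
from math import log2


def file_entropy(path: str) -> float:
    """Estimate the entropy of a file in bits per byte from byte frequencies."""
    with open(path, "rb") as f:
        data = f.read()
    if not data:
        return 0.0
    counts = Counter(data)  # iterating over bytes yields integers 0..255
    total = len(data)
    # p * log2(1/p) summed over the byte values that actually occur.
    return sum((n / total) * log2(total / n) for n in counts.values())
```

One limitation shows up right away: this only looks at
single-byte frequencies, so it ignores any correlation
between symbols. A file that repeats the bytes 0..255 in
order scores the maximum 8 bits per byte even though it
is highly compressible. Also, if a file such as stream1.txt
stores its bits as the text characters '0' and '1', this
measures the entropy of those characters, at most 1 bit
per byte.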
conditional entropy & mutual information
... coming later.
We'll return to this topic in a few chapters,
when we take up the notions of noise and
"channel capacity". The situation there is
a stream of bits (the signal) being sent
through some pipe (the channel), with some
other stream of bits (the noise)
being added in.
The question will be: how much information (entropy)
can we extract from the (signal + noise) that we get?
And what does this have to do with the entropy
of the signal and the entropy of the noise?
As a teaser ...
H = 1 (max entropy) signal being sent, P(0) = P(1) = 0.5
H = 1 (max entropy) noise added in, P(0) = P(1) = 0.5
Can the person receiving this mix tell anything about
the original signal?
P(receive 0 | sent 0) = ?
How about noise with P(0)=0.99, P(1)=0.01 ?
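A hedged sketch of one way to explore the teaser. It
assumes (the notes do not say this) that "added in" means
bitwise addition mod 2 (XOR), that the noise is independent
of the signal, and that the signal has P(0) = P(1) = 0.5.
Under those assumptions a sent 0 arrives as 0 exactly when
the noise bit is 0, and the information about the signal
that survives per received bit is 1 - H(noise). The function
names are made up for illustration.

```python
from math import log2


def binary_entropy(p: float) -> float:
    """Entropy in bits of a two-outcome source with probabilities p and 1 - p."""
    if p in (0.0, 1.0):
        return 0.0
    return p * log2(1 / p) + (1 - p) * log2(1 / (1 - p))


def p_receive0_given_sent0(p_noise0: float) -> float:
    """Assuming received = sent XOR noise: a sent 0 stays 0 iff the noise bit is 0."""
    return p_noise0


def recoverable_bits(p_noise0: float) -> float:
    """Bits of signal information per received bit, assuming a uniform signal
    and independent XOR noise: 1 - H(noise)."""
    return 1.0 - binary_entropy(p_noise0)


for p in (0.5, 0.99):
    print(p, p_receive0_given_sent0(p), round(recoverable_bits(p), 4))
```

With P(0) = 0.5 noise the recoverable amount is 0 bits:
the received stream says nothing about the signal. With
P(0) = 0.99 noise it is roughly 0.92 bits per received bit.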
Huffman coding
Next up: our first compression technique.