Feb 7
My feedback on your work from last week is up.
The assignment for Thursday is posted; we'll go over what it is.
This is our week to finish lossless encoding, and whatever
is still needed to understand the notion of information entropy.
So any of this is fair game for discussion, with links from the previous notes:
- probability models (joint, conditional)
- information entropy (and sequences of entropies under different probability models)
- huffman encoding (fixed input bits => variable output bits using a probability per symbol; binary tree)
- arithmetic encoding (similar to huffman, but generalizes binary to any base and treats the whole message as a single number)
- LZW encoding (variable input symbols => fixed output bits, generate and reconstruct a table of patterns)
- Burrows-Wheeler transform (mysterious and tricky quasi-sort that turns repeated patterns into repeated symbols, making other compression techniques work better) ... see Dylan's BWT notes.
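To make the entropy and huffman items above concrete, here is a minimal sketch that computes the entropy of a short string from its observed symbol frequencies and then builds a huffman code for it. The function names (`entropy`, `huffman_codes`) and the sample string are my own choices for illustration, and the probability model is assumed to be the simplest one: independent symbols with probabilities taken from the counts.

```python
import heapq
from collections import Counter
from math import log2

def entropy(text):
    """Entropy in bits per symbol, using each symbol's observed frequency."""
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * log2(n / total) for n in counts.values())

def huffman_codes(text):
    """Return a dict symbol -> bit string, built by merging the two lightest subtrees."""
    counts = Counter(text)
    # Heap entries are (weight, unique id, {symbol: code so far}).
    # The unique id breaks ties so the dicts are never compared.
    heap = [(n, i, {sym: ""}) for i, (sym, n) in enumerate(counts.items())]
    heapq.heapify(heap)
    if len(heap) == 1:
        # Only one distinct symbol: give it a 1-bit code.
        return {sym: "0" for sym in heap[0][2]}
    next_id = len(heap)
    while len(heap) > 1:
        w0, _, left = heapq.heappop(heap)
        w1, _, right = heapq.heappop(heap)
        # Going down the left branch prepends 0, the right branch prepends 1.
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w0 + w1, next_id, merged))
        next_id += 1
    return heap[0][2]

text = "abracadabra"
codes = huffman_codes(text)
bits = sum(len(codes[ch]) for ch in text)
print(codes)
print(f"entropy : {entropy(text):.3f} bits/symbol")
print(f"huffman : {bits / len(text):.3f} bits/symbol")
```

The huffman average should come out at or slightly above the entropy, since each symbol has to get a whole number of bits.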
implementations using these (and other) techniques:
- gzip (lz77, which is in the same family as lzw; plus huffman)
- bzip2 (burrows-wheeler, huffman, others)
- gif, tiff (lzw)
Open discussion of any of these.
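If you want to poke at two of these yourself, Python's standard library ships `zlib` (the lz77 + huffman scheme that gzip uses) and `bz2` (bzip2). This is just a quick sketch for comparing sizes; the sample data is made up and deliberately repetitive, so the ratios will look better than on real text.

```python
import bz2
import zlib

# Made-up, highly repetitive sample data (an assumption for illustration).
data = b"the quick brown fox jumps over the lazy dog. " * 200

deflated = zlib.compress(data, 9)   # lz77 + huffman, as in gzip
bzipped = bz2.compress(data, 9)     # burrows-wheeler + huffman and others

# Lossless means the round trip gives back exactly the original bytes.
assert zlib.decompress(deflated) == data
assert bz2.decompress(bzipped) == data

print("original:", len(data), "bytes")
print("zlib    :", len(deflated), "bytes")
print("bz2     :", len(bzipped), "bytes")
```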
next: lossy compression
I'd like to take a side step out of the text next and discuss lossy compression, as long as we're doing compression.
That stuff is pretty linear algebra heavy ... so let's see where we are with that sort of math:
- vector
- basis vector
- change of basis
- dot product
- matrix
- matrix multiplication
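As a quick self-check on the list above, here is a small sketch in plain Python that touches each item: a vector, a pair of basis vectors, a change of basis done with dot products, and the same thing done as a matrix multiplication. The particular basis (two unit vectors at 45 degrees) and the helper names `dot` and `matmul` are my own choices for illustration, not something from the text.

```python
from math import sqrt

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def matmul(A, B):
    """Matrix product: entry (i, j) is the dot product of row i of A with column j of B."""
    cols = list(zip(*B))
    return [[dot(row, col) for col in cols] for row in A]

s = 1 / sqrt(2)
# Rows are the new basis vectors; they are unit length and perpendicular.
basis = [[s, s],
         [s, -s]]

v = [3.0, 5.0]  # a vector written in the usual (1,0), (0,1) basis

# Change of basis: one dot product per new basis vector.
coeffs = [dot(b, v) for b in basis]

# The same change of basis as a matrix times a column vector.
coeffs_mat = [row[0] for row in matmul(basis, [[x] for x in v])]

# For an orthonormal basis, the transpose undoes the change of basis.
basis_T = [list(col) for col in zip(*basis)]
back = [dot(row, coeffs) for row in basis_T]

print("coefficients in new basis:", coeffs)
print("via matrix multiplication:", coeffs_mat)
print("converted back           :", back)
```

If any step here looks unfamiliar, that's the piece to bring up in class.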