Mar 7
Discuss the material in chapter 6, including
- Hamming distance
- error detection and correction
- decision rules : most probable vs most likely vs minimum distance
- packing bound
Can you construct an example where the three decision rules give different answers?
(For the case we're typically doing, n bits with symmetric errors p(1_bit_flip) = e, all three agree, provided the codewords are equally likely and e < 1/2.)
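One possible answer to the question above, as a minimal sketch. Everything in it is a made-up toy setup rather than something from the textbook: three 4-bit codewords, unequal prior probabilities, and an asymmetric channel in which a 1 is much more likely to flip than a 0. The function names are mine.

```python
from math import prod

# Hypothetical toy setup (assumptions, not from the text).
CODEWORDS = ["0000", "1110", "1111"]
PRIOR = {"0000": 0.1, "1110": 0.2, "1111": 0.7}  # how often each word is sent
P_0_TO_1 = 0.01  # probability a sent 0 arrives as 1
P_1_TO_0 = 0.40  # probability a sent 1 arrives as 0

def bit_prob(sent, received):
    """P(received bit | sent bit) for the asymmetric channel."""
    if sent == "0":
        return P_0_TO_1 if received == "1" else 1 - P_0_TO_1
    return P_1_TO_0 if received == "0" else 1 - P_1_TO_0

def likelihood(codeword, received):
    """P(received word | codeword sent), bits treated independently."""
    return prod(bit_prob(s, r) for s, r in zip(codeword, received))

def hamming_distance(a, b):
    return sum(x != y for x, y in zip(a, b))

def most_probable(received):
    # maximize P(codeword | received), proportional to prior * likelihood
    return max(CODEWORDS, key=lambda c: PRIOR[c] * likelihood(c, received))

def most_likely(received):
    # maximize P(received | codeword), ignoring the priors
    return max(CODEWORDS, key=lambda c: likelihood(c, received))

def minimum_distance(received):
    # pick the codeword with the fewest differing bits
    return min(CODEWORDS, key=lambda c: hamming_distance(c, received))

received = "1000"
print(most_probable(received), most_likely(received), minimum_distance(received))
# prints: 1111 1110 0000
```

The asymmetry is what separates most likely from minimum distance, and the unequal priors are what separate most probable from most likely.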
Work through the packing bound algebra explicitly.
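As a numeric companion to the algebra, here is a small sketch of the packing bound for binary codes. The idea: if a code of length n corrects t errors, the balls of radius t around the codewords cannot overlap, and each ball holds 1 + C(n,1) + ... + C(n,t) words, so (number of codewords) * (ball size) <= 2^n. The symbol t and the function names are my own choices, not necessarily the textbook's notation.

```python
from math import comb

def ball_size(n, t):
    """Number of n-bit words within Hamming distance t of a given word."""
    return sum(comb(n, i) for i in range(t + 1))

def packing_bound(n, t):
    """Upper limit on the number of codewords in a binary t-error-correcting code of length n."""
    return 2**n // ball_size(n, t)

print(packing_bound(7, 1))   # 128 // 8 = 16
print(packing_bound(5, 1))   # 32 // 6 = 5
print(packing_bound(23, 3))  # 2**23 // 2048 = 4096
```

When the division comes out exact, as for n=7, t=1, the balls could in principle fill the whole space with nothing left over; the bound alone does not say whether such a code exists.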
Depending on time, continue into chapter 7, which
culminates in Shannon's Theorem, stated in terms
of the n-bit symmetric codes that are our
go-to model. We want to understand the ideas
in the recipe presented on page 119 :
- Start with a channel where the error probability per bit is known (say 0.03). From that, calculate the maximum information rate I (which Biggs calls gamma, γ).
- Choose an allowable mistake rate (say 1e-6).
- Pick a practical information rate rho (Biggs' ρ) such that ρ < γ, namely something less than the channel capacity.
- Use a binary n-bit code Cn in which only k bits are information, and the remaining (n-k) are error checking bits.
- Be clever with the code so that all legit words are far enough apart that the probability of a mistake is less than 1e-6.
How to actually implement all this, including how to choose values for n and k, is the subject of the next chapter in the textbook, chapter 8.
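The first and third steps of the recipe can be checked numerically. For the binary symmetric channel the capacity is 1 - h(e), where h is the binary entropy function. The sketch below uses the 0.03 from the recipe; the function names and the sample value of rho are my own, picked only for illustration.

```python
from math import log2

def binary_entropy(e):
    """h(e) = -e*log2(e) - (1-e)*log2(1-e), with h(0) = h(1) = 0."""
    if e in (0.0, 1.0):
        return 0.0
    return -e * log2(e) - (1 - e) * log2(1 - e)

def bsc_capacity(e):
    """Capacity of a binary symmetric channel with bit error probability e."""
    return 1 - binary_entropy(e)

gamma = bsc_capacity(0.03)
print(round(gamma, 3))  # 0.806

rho = 0.75              # hypothetical practical rate, chosen so that rho < gamma
print(rho < gamma)      # True
```

So with a 3% bit error rate, any rate below roughly 0.8 bits of information per bit sent is on the table.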
summary
The big picture :
- If our data is in chunks of k bits, we will pad those out to longer n bit words.
- The bigger n is (for a fixed k), the further apart (in Hamming distance) the legal words can be.
- The further apart they are, the lower the probability of a mistake. However, the information rate k/n goes down too.
- Given a specific error probability per bit, we can add enough check bits to drive the probability of a mistake per word as low as we want ... but only at the cost of lowering the information rate.
- The best information rate we can reach while still keeping mistakes as rare as we like is given by the channel capacity ... which is Shannon's Theorem.
The question is then how to devise these codes which have a lot of space between the legal words, and how many check bits in these codes are needed given the error per bit and allowed mistake rate.
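To see the trade-off in numbers, here is a sketch using the simplest possible code: repeat one bit n times and decode by majority vote (so k=1). I picked the repetition code only because its mistake probability is easy to write down; it is not a good code, and its rate 1/n is far below what the channel capacity allows. The bit error probability 0.03 is the example value from the recipe; the function name is mine.

```python
from math import comb

def repetition_mistake_probability(n, e):
    """P(majority vote is wrong): more than half of the n copies flipped (n odd)."""
    return sum(comb(n, i) * e**i * (1 - e)**(n - i)
               for i in range(n // 2 + 1, n + 1))

e = 0.03
for n in (1, 3, 5, 7, 9):
    rate = 1 / n
    mistake = repetition_mistake_probability(n, e)
    print(f"n={n}  rate={rate:.3f}  mistake={mistake:.2e}")
```

Each added pair of check bits pushes the mistake probability down, while the rate falls as 1/n.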
errata : channel capacity units
I was incorrect in something I said last week in response
to Numen's "what is 'information rate' that I measures"?
The rate should be "bits of information per bit received".
That means, for example, if we're sending the 128 original
ASCII symbols, which take 7 bits each, and adding a parity bit,
or 1 extra bit, the information rate is 7/8 .
(I incorrectly said it was (legal_symbol_count)/(total_symbol_count)
which for the parity bit above would give a rate of 1/2 ...
which even then felt wrong but seemed to be what was
in MacKay's text.)
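A quick check of the two numbers above, as a sketch. It assumes the usual definition for a code whose legal words are all equally likely: rate = log2(number of legal words) / n. The function name is mine.

```python
from math import log2

def information_rate(legal_word_count, n):
    """Bits of information per bit received, for equally likely legal words of length n."""
    return log2(legal_word_count) / n

# 7-bit ASCII plus one parity bit: 128 legal words among the 256 possible 8-bit words.
print(information_rate(128, 8))  # 0.875, i.e. 7/8
print(128 / 2**8)                # 0.5, the ratio I mistakenly quoted
```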