assignments
due Tue Jan 28
information entropy
- Expect to continue the ideas here for at least another week, with coding assignments to implement some of them coming soon.
- Take a first pass at reading Biggs textbook chapters 1 through 3. I want to talk about entropy before huffman encoding, so focus on that.
- Define and give an example of each of these terms.
- alphabet, string, message, word
- code
- information entropy
- prefix-free, optimal code, huffman code (briefly)
- Other sources
- For another take on the notion of "information entropy", you could look at reading part 1 of Shannon's Mathematical Theory of Communication, which is the first paper to propose this concept.
- Related material is in chapter 2 (and others) of MacKay's book
- Coding
- Write a program to read the files stream1.txt and stream2.txt . What can you say about the probabilities and entropy of the 0's and 1's?
due Tue Feb 4
huffman coding
- Finish reading through chapter 3. Start in on chapter 4. (We'll do LZW and arithmetic next week.)
- On page 33, do exercises 3.6, 3.7, and 3.8.
- Do exercise 3.13 on page 40.
- Here is the text of Moby Dick. Using any choice of symbols you like, construct a Huffman code for this, and use it to compress the file.
- Calculate the entropy of that file, using what seems to be the most appropriate probability model and/or symbol length.
- Compress the file using gzip or any other command line tool.
- Compare the fractional reduction in size with (a) unix compression and (b) your huffman code with (c) the entropy. Discuss.
due Tue Feb 11
lossless compression
- Do exercise 4.3 on pg 49, a small Huffman Code example.
- Do exercise 4.16 on pg 65, an example of an arithmetic code. (You'll need to study example 4.18 and the definitions and procedure that precedes it.)
- Implement and test an LZW encoder and decoder, and test it on a short block of text. (Such as those in examples of the algorithm.) You may use code from online sources (e.g. http://rosettacode.org/wiki/LZW_compression ), as long as you (a) quote your sources, and (b) show that you understand what its doing and why.
- Next week we're going to look at lossy compression, including JPEG and the discrete cosine transform. If you want to get a head start, peruse the wikipedia articles on JPEG, YCbCr, and the Discrete Cosine Transform (e.g. http://www.whydomath.org/node/wavlets/dct.html .) (I'll have a reading list next week, and some background that will hopefully help.)
due Tue Feb 18
lossy compression
- Do some readings on image, sound, and/or video lossy compression codecs and containers, to get a feel for the landscape of what's out there. Report on your exploration.
- Try as much of the following as makes sense given your math chops.
- Study my DCT notes and use those ideas and/or code to explore the DCT .
- Do 4 x 4 2D DCT on this data: [[ 24, 147, 212, 216], [47, 33, 179, 221], [20, 73, 201, 235], [140, 181, 215, 217]] . (This is the top left of the "tiny" image in my DCT notes.)
- Show that the inverse DCT ("undo" in my notes) restores those numbers.
- Set a few of the high frequency values to zero in the transformed values, then undo back to the original image, and show that what you have is almost the same.
- Explain what is meant by "quantization" when using fewer bits to represent a number. To be specific, let's say that the number in question is 216. With eight bits, you can represent that exactly. (Do so.) If you're using a code that represents numbers from 0 to 255, how close to 216 can you come with four bits? With two bits?
- With a random image (or a small piece of one), implement (with code or by hand, your choice) the RGB - YCbCr transform (see the wikipedia article for a definition). If one RGB pixel is 24 bits (8 bits each for of R,G,B), and you want to use fewer for the Y,Cb,Cr numbers, how might you set up the quantization?
- (Note that you can do all these by hand ... or you can write some code to automate the calculation. In either case, make sure you understand what's going on.)
due Tue Feb 25
noise
- Read up on noise, as given in the references in my Tue notes.
- Calculate all the interesting quantities including mutual information and channel capacity for at least two models:
- the symmetric binary channel, with X=(0,1), Y=(0,1), P(1|1)=P(0|0)=1-f, P(1|0)=P(0|1)=f where f is a fractional error rate like 0.01 or 0.1. (This is one of the sample problems in MacCay).
- the 3 symbol X=(1,2,3) Y=(a,b,c) example we started in class.
- Do problem 5.17 pg 87 of Biggs: A simplified keypad has four keys arranged in two rows of two). If the intention is to press key x, there is probability α of pressing the other key in the same row and probability α of pressing the other key in the same column (and consequently probability 1 − 2α of pressing x). Write down the channel matrix for this situation and find its capacity (i.e. the maximum entropy, or information per symbol, that can be sent with this keyboard).
- (You can do these calculations by hand, or write some code to do the math. Either way, make sure you understand what's going on.)
due Tue Mar 4
error correcting ideas
- Read chapters 6 and 7 in Biggs
- Do 6.6, i.e. prove that that hamming distance d(x,y) satisfies d(x,x)=0, d(x,y)=d(y,x), and d(x,z) <= d(x,y) + d(y,z).
- Do 6.15, 6.16, 6.17
- In class we started 6.12, 6.13, 6.14 on pg 102, on an 1-error correcting code with 5 elements from 6-bit words. Finish them.
- Do 7.8, 7.11
due Tue Mar 11
linear correcting codes 1
- Review linear algebra if you need to. Make sure you can
- multiply matrices by column and row vectors
- know what a "basis" is
- understand how any vector can be built from a basis
- understand what the "column space" of a matrix is
- Read chapter 8.
- Do exercises 8.1, 8.2, 8.3
- Do 8.7, 8.8, 8.9 (If we did them in class, then make sure you can, too.)
- Write down any questions you have on this stuff, and come to class Tuesday ready to discuss.
due Tue Apr 8
hamming codes
- Give examples that use the [8,4] extended Hamming code to
- encode a short message
- decode the coded message
- detect an 1-bit error in that message
- correct that 1-bit error
- Describe a Hamming code with a larger block size than the [7,4] code described in http://en.wikipedia.org/wiki/Hamming_code , and construct its check and generation matrices.
- How many errors does it correct?
- What is its rate? (bits of info per bit of data)
- Is there an extended version of this like the [8,4] code?
- Start work on the summary project that I described in the review & summary project notes. (More details coming.)
due Tue Apr 15
crypto
- Play around with one of the historical ciphers described in the text, at least writing a program to encode/decode text, and preferably doing some cryptanalysis of a simple cipher. Possibilities include anything from chapter 11 in the text :
- Do a frequency analysis of a simple substitution cipher.
- Write a brute force search decoder for the correct text that recognizes when the sentence is in English.
- Explore Hill's system and plaintext attacks.
- Read chapter 12 in the text and do exercises 12.10 and 12.11.
- Get a command-line openssl on your computer, if it isn't already. Use it to encode the text "This is the plain text that you should encode to make Jim happy." with the key "quick_brown_foxes_are_foxy" in aes-128-cbc and base64.
- So ... what is aes-128-cbc? Be specific.
- Did you use a salt? Is your code the same each time, or different?
- Is all of the key required? Should it be longer? Will any other key give the same result?
due Tue Apr 22
coding project
- Implement the compress_encode and decode_uncompress command line tools as explained in my CRC codes notes.
- Include a discussion of why you chose the algorithms and parameters you did given the noise and source characteristics, and what compression and correction your method can deliver.
- (This one is for a grade.)
due Tue Apr 29
rsa
- Explain the RSA public/private algorithm including :
- tow the "exponentiation by squaring" allows it to be implemented,
- the basic ideas behind the number theory prime number encoding itself, and
- the protocol used in practice that includes "signing" a message.
- Send me a message using this protocol :
- The gnupg.org software is one reasonable tool.
- Find my public key at e.g. pgp.mit.edu (a key server); lookup mahoney@marlboro and use my 2000 key
- Create your own RSA public/private key, tell me in your submission your public key, and post an encrypted message.
due Tue May 6
crypto project presentations
- optional!
- present and explain in class your crypto code ... see the description below
crypto project writeup
- This is an optional project. (The term grade will either be 2 grades (homework, coding project) or 3; your choice.)
- Choose any of the modern symmetric or asymmetric crypto algorithms we've discussed in the last few weeks.
- Implement the algorithm and test the encode/decode methods.
- You may use online sources; as always, clearly cite what you use, and explain what is and isn't your work. Just using someone else's implementation isn't going to show that you understand what's going on - adopting ideas from a pseudo-code description and explaining them is more what I have in mind.)
term grade
- a place for Jim to record the semester grade