To get The Hound of the Baskervilles, I looked up the author, Doyle, at Project Gutenberg. Then

(1) I downloaded the file with
      wget http://www.ibiblio.org/gutenberg/etext01/bskrv10.zip
(2) I uncompressed it with
      unzip bskrv10.zip
(3) I converted the .txt file from PC to Unix line endings with
      dos2unix bskrv10.txt
(4) I used an editor to remove the Project Gutenberg legalese.

---------------

Assignment:

Write a C program which reads in this file and creates a summary of all the distinct words in the file and how many times each appears. Define a "word" any way you like.

Do this with a hash table whose keys are the words and whose values are the number of times each word has occurred. Use any type of hash or hashing function you wish - that's the tricky part.

Questions:

Why use a hash to do this rather than a simple array of the words?
What features do you want in the hashing function?
Give an example of a bad hash function.
How many entries would a perfect hash table require for this problem? Is this practical?
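
One possible shape for such a program is sketched below. It is a minimal illustration under stated assumptions, not a reference solution. A "word" is taken to be a maximal run of letters, folded to lower case (so "don't" splits into "don" and "t"). The hash is the well-known djb2 string hash, chosen here only as one common option, and collisions are handled by chaining. The names NBUCKETS, MAXWORD, hash_word, add_word, lookup, count_words and print_counts are all made up for this sketch.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 16384  /* assumed table size, not tuned for this text */
#define MAXWORD  64     /* assumed cap; longer words are truncated */

struct entry {
    char *word;           /* key: the word itself */
    unsigned long count;  /* value: occurrences seen so far */
    struct entry *next;   /* next word that hashed to the same bucket */
};

static struct entry *table[NBUCKETS];

/* djb2 string hash: h = h * 33 + c, starting from 5381. */
unsigned long hash_word(const char *s)
{
    unsigned long h = 5381;
    while (*s != '\0')
        h = h * 33 + (unsigned char)*s++;
    return h;
}

/* Record one occurrence of word. Returns the updated count,
   or 0 if memory for a new entry could not be allocated. */
unsigned long add_word(const char *word)
{
    unsigned long i;
    struct entry *e;

    assert(word != NULL);
    i = hash_word(word) % NBUCKETS;
    for (e = table[i]; e != NULL; e = e->next)
        if (strcmp(e->word, word) == 0)
            return ++e->count;

    e = malloc(sizeof *e);
    if (e == NULL)
        return 0;
    e->word = malloc(strlen(word) + 1);
    if (e->word == NULL) {
        free(e);
        return 0;
    }
    strcpy(e->word, word);
    e->count = 1;
    e->next = table[i];   /* push onto the front of the bucket's chain */
    table[i] = e;
    return 1;
}

/* Return how many times word has been seen (0 if never). */
unsigned long lookup(const char *word)
{
    struct entry *e;

    assert(word != NULL);
    for (e = table[hash_word(word) % NBUCKETS]; e != NULL; e = e->next)
        if (strcmp(e->word, word) == 0)
            return e->count;
    return 0;
}

/* Split the stream into lower-cased runs of letters and count each one. */
void count_words(FILE *fp)
{
    char buf[MAXWORD];
    size_t n = 0;
    int c;

    while ((c = getc(fp)) != EOF) {
        if (isalpha(c)) {
            if (n < MAXWORD - 1)
                buf[n++] = (char)tolower(c);
        } else if (n > 0) {
            buf[n] = '\0';
            add_word(buf);
            n = 0;
        }
    }
    if (n > 0) {          /* file ended in the middle of a word */
        buf[n] = '\0';
        add_word(buf);
    }
}

/* Print "count word" for every entry, in bucket order (unsorted). */
void print_counts(FILE *out)
{
    size_t i;
    struct entry *e;

    for (i = 0; i < NBUCKETS; i++)
        for (e = table[i]; e != NULL; e = e->next)
            fprintf(out, "%lu %s\n", e->count, e->word);
}
```

To finish the program, a main function would open bskrv10.txt with fopen, pass the stream to count_words, and then call print_counts(stdout). The output comes out in bucket order, so piping it through sort -n is an easy way to see the most frequent words.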