Computer Science @ Marlboro  

Statistics


Info
 
When Tues/Thur 8:30-9:50
Where Sci 217
Faculty Jim Mahoney (mahoney@marlboro.edu)
Text Understanding Statistics, 4th Edition by Arnold Naiman, Robert Rosenfeld, and Gene Zirkel
Credits 4 (i.e. 9 hours/week outside class)
Level Introductory
Prereq high school algebra
Website http://cs.marlboro.edu/term/fall03/Statistics.html

An introduction to statistics, including probability, sampling, hypothesis testing, and all that. Data conversion, graphing, simple programming and other related computer skills will also be covered. Recommended for students in the sciences. Prerequisite: high school algebra.

Grading will be based on
  • Weekly assignments (one grade for how many done)
  • Two exams early in the semester
  • A term project due at the Thanksgiving break
    (collect some data, test a hypothesis, write it all up)
  • A take-home final
  • The lowest of these five grades will be dropped.
Computers will be an invaluable tool throughout this course as you manage your data, perform various numerical tests, and create various graphics to visualize your results. The two applications we'll be using most are Excel and Mathematica, both of which are installed on all the lab computers.

This webpage will continue to change throughout the semester as assignments and resources are added.


Assignments
  1. for Tues Sep 9
    1. Send me (mahoney@marlboro.edu) an email telling me you're registering for this class.
      Or just email me the whole assignment.
    2. Read chap 1, start reading chap 2.
      (For those who don't have a copy of the textbook yet, there will be a copy on reserve in the library by Friday the 5th.)
    3. Do the math diagnostic at statistics/diagnostic/diagnostic.html.
    4. Do exercises 1-7, 1-11, 2-7, 2-11.

  2. for Tues Sep 16
    1. Finish reading chapter 2.
    2. Check out the MathematicaStatsPrimer and DiceStatistics notes I wrote up; there are links at the bottom of this page. (As with most of what I put online for this course, they're in the http://cs.marlboro.edu/term/fall03/statistics/ directory.)
    3. Construct a table like that on page 23 from 5 to 10 students at Marlboro. (We'll decide in class what data to gather.) Put the data into an Excel spreadsheet. Save as a CSV file, copy/paste that into an email, and send it to me.
    4. Do exercises 2-17, 2-29, 2-39, 2-57, 2-63
    5. Estimate the mean and standard deviation of the sum of three dice by taking a sample of 10 randomly chosen members. You're welcome to do the calculations by hand or with computer/calculator assist, just be clear when you write it up how you did it. More specifically,
      1. Get three dice. (I'll bring some in if you can't find any.)
      2. Roll them, add the numbers together, and write that down.
      3. Do that ten times.
      4. Find the mean and standard deviation (which formula are you using?) of those numbers.
      5. Compare what you get with the true values from the population, which are mean=mu=10.5, sigma=2.958.
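
The population values quoted in step 5 can be checked by brute force. The following is a minimal sketch in Python, which is not one of the course tools and is used here only for illustration. It lists every equally likely outcome of three fair dice and applies the population formulas (dividing by N rather than N-1).

```python
from itertools import product
from math import sqrt

# Enumerate all 6**3 = 216 equally likely outcomes of three fair dice.
sums = [sum(roll) for roll in product(range(1, 7), repeat=3)]

mu = sum(sums) / len(sums)
# Population variance: divide by N, not N - 1, because this is the whole population.
variance = sum((s - mu) ** 2 for s in sums) / len(sums)
sigma = sqrt(variance)

print(f"mu = {mu:.3f}, sigma = {sigma:.3f}")  # mu = 10.500, sigma = 2.958
```

Your own sample of ten rolls will generally differ from these values; that difference is the point of the comparison.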

  3. for Tues Sep 23
    1. Read chap 3, chap 4, and appendix C
    2. Do 3-9, 3-13, 4-5, 4-11, 4-25, 4-27
    3. Lottery: look up a state lottery of your choice. For *one* of the prizes, calculate the odds of winning. What is the expected value? (i.e. on average how much can you expect to win from that prize alone?)
    4. For the survey that we did, with the data I'll post by Thurs, using any tool of your choice (by hand, Excel, Mathematica, ...):
      • Find the mean, median, range, and standard deviation of the heights of the smokers.
      • Find the mean, median, range, and standard deviation of the heights of the non-smokers.
      • Find the z-score and percentile rank for your height, in whichever group you fit.
    5. Again with the survey data, again by any method of your choice,
      • Make 3 histograms of the male heights, with intervals of 1 inch, 2 inches, and 3 inches. Start all graphs at the same height.
      • Do the same for the female heights.
      • Compare and discuss your results. Which format do you like best?
    6. From the survey data, what is the probability that a student picked at random is at least 6 feet tall?
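
The summary statistics asked for in item 4 can be sketched as follows, using Python's standard `statistics` module rather than Excel or Mathematica. The heights are made up for illustration and are not the class survey data. The helper names `summarize`, `z_score`, and `percentile_rank` are hypothetical, and the percentile rank uses one common definition (percent below plus half of those equal), which may differ from the textbook's.

```python
from statistics import mean, median, stdev

# Hypothetical heights in inches; NOT the class survey data.
heights = [62, 64, 65, 66, 67, 68, 70, 71, 72, 75]

def summarize(data):
    """Mean, median, range, and sample standard deviation (n - 1 formula)."""
    return {
        "mean": mean(data),
        "median": median(data),
        "range": max(data) - min(data),
        "stdev": stdev(data),
    }

def z_score(x, data):
    """How many sample standard deviations x lies from the sample mean."""
    return (x - mean(data)) / stdev(data)

def percentile_rank(x, data):
    """Percent of values below x, counting half of the values equal to x."""
    below = sum(1 for v in data if v < x)
    equal = sum(1 for v in data if v == x)
    return 100 * (below + 0.5 * equal) / len(data)

print(summarize(heights))
print(z_score(68, heights), percentile_rank(68, heights))
```

If you use the population formula for the standard deviation instead, swap `stdev` for `pstdev` and say so in your write-up.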

    --- Test 1 --- Thursday/Friday September 25/26 on material in chapters 1-4.

  4. for Tues Sep 30
    1. Read chapter 5 on the Binomial and chapter 6 on the Normal distribution
    2. Finish histogram from last time
    3. Write out the terms of C(N,m) for N=10, m=0,1,2,...,9,10. (This is the 10th line of Pascal's triangle.) Divide by 2^10 = 1024 to make these into probabilities for a binomial with p=1/2, N=10. Plot graphs of this probability distribution using *both* Excel and Mathematica. Cut and paste the picture into a Word (or other) document.
    4. Problems 5-15, 5-20, 5-26
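
As a check on the arithmetic in item 3, here is a minimal Python sketch (again, not one of the course tools) that computes the coefficients C(10,m) and the corresponding probabilities for p=1/2. It assumes Python 3.8 or later for `math.comb`. The plotting itself is left to Excel and Mathematica.

```python
from math import comb

N = 10
# The 10th line of Pascal's triangle: C(10, 0), C(10, 1), ..., C(10, 10).
coefficients = [comb(N, m) for m in range(N + 1)]

# With p = 1/2 every sequence of 10 outcomes has probability 1 / 2**10,
# so dividing each count by 2**10 = 1024 gives the binomial probabilities.
probabilities = [c / 2**N for c in coefficients]

for m, (c, p) in enumerate(zip(coefficients, probabilities)):
    print(f"m = {m:2d}   C(10,m) = {c:3d}   P = {p:.4f}")
```

The coefficients sum to 1024, which is why the probabilities sum to 1.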

  5. for Tues Oct 7
    1. Finish reading 5, 6. Start reading chapter 7.
    2. Do 5-31, 5-32, 6-7, 6-17, 6-19, 6-21
    3. Use Excel to plot a normal distribution. The function is called NORMDIST(x,mean,sigma,FALSE); you'll have to choose some x values that make it look fairly smooth. (Or use any other tool you like. In Mathematica the function is PDF[NormalDistribution[0,1],x] after you load the <<Statistics` package.)
    4. Even if a variable is not normally distributed, its average over many trials will be approximately normal. Show an example of this by doing class experiment 2 on pg 171. In Excel, the uniform random function is RAND(). What is its probability distribution? Can you make a plot of it?
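
The two ideas in items 3 and 4 can be sketched together in Python, offered as an illustration rather than as the method of the class experiment in the text. The function `normal_pdf` writes out the density that NORMDIST(x,mean,sigma,FALSE) evaluates. The helper `sample_means`, the seed, and the sizes 30 and 2000 are arbitrary choices made for this sketch. Each average is taken over draws from `random.random()`, the Python counterpart of RAND().

```python
import random
from math import exp, pi, sqrt
from statistics import mean, pstdev

def normal_pdf(x, mu, sigma):
    """Normal probability density at x for the given mean and standard deviation."""
    return exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * sqrt(2 * pi))

def sample_means(n_per_mean, n_means, rng):
    """Average n_per_mean uniform(0, 1) draws, repeated n_means times."""
    return [sum(rng.random() for _ in range(n_per_mean)) / n_per_mean
            for _ in range(n_means)]

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 30
means = sample_means(n, 2000, rng)

# A uniform(0, 1) variable has mean 1/2 and sigma sqrt(1/12);
# the averages should have roughly the same mean and sigma / sqrt(n).
predicted_sd = sqrt(1 / 12) / sqrt(n)
print("mean of averages:", round(mean(means), 3))
print("sd of averages:  ", round(pstdev(means), 4), "predicted:", round(predicted_sd, 4))
print("density at the center:", round(normal_pdf(0.5, 0.5, predicted_sd), 3))
```

A histogram of `means` should look roughly bell-shaped even though each individual draw is uniform.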

  6. for Tues Oct 14
    1. Finish reading through chapter 7.
    2. Do 6-29, 7-5, 7-11, 7-23, class survey question 1 on pg 191, and 17 on pg 193.

    --- Test 2 --- Oct 16 on material in chapters 5-7.

  7. Tues Oct 21 - Hendrick's days

  8. for Tues Oct 28
    1. Finish reading chapter 8, start chapter 9
    2. Do exercises 8-13, 8-35, 8-57, 8-63

  9. for Tues Nov 4
    1. Finish reading through chapter 10
    2. Do exercises 9-11, 9-19, 10-9, 10-25
    3. In our class survey, is the percentage of women with brown hair the same as the percentage of men with brown hair?
    4. Invent another hypothesis to test from our class survey data. Do so.
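
One way to approach item 3 is a two-proportion z-test, sketched here in Python. This is an assumption about the method: the text may set the test up differently, and with a class-sized sample the normal approximation can be rough. The function name `two_proportion_z` and the counts in the example are hypothetical, not the class survey results. `statistics.NormalDist` requires Python 3.8 or later.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z(x1, n1, x2, n2):
    """z statistic and two-sided p-value for H0: the two proportions are equal."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # common proportion assumed under H0
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical counts: 12 of 20 women and 6 of 20 men with brown hair.
z, p_value = two_proportion_z(12, 20, 6, 20)
print(f"z = {z:.3f}, two-sided p = {p_value:.3f}")
```

Compare the p-value with whatever significance level you chose before looking at the data.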

  10. for Tues Nov 11
    1. Proposal for your statistics project, to be turned in Wed before Thanksgiving. Be as detailed as you can - what question(s) are you trying to answer with what data collected how? I'd like to do some class presentations - extra brownie points, eh?
    2. Read chapter 11 - Student's t-test.
    3. Do exercises 11-9, 11-24, 11-35 from the text.

  11. for Tues Nov 18
    1. Read chapter 15 (ANOVA - Analysis of Variance). Skip 15.1; we haven't done much with Chi-Squared stuff yet.
    2. Do exercises 15-11, 15-17, 15-19, 15-21
    3. Continue working on your projects.

  12. for Tues Nov 25
    1. Read chapter 14 on Correlation.
    2. Read the Regression notes online here.
    3. Do exercise 14-32 or 14-33. (14-33 is in statistics/regression/)
    4. Are women's heights from our survey correlated with their mothers' heights?
    5. Finish your projects - please have something ready to present in class on Tuesday! (You can hand in the final copy Wednesday if you like.)
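
For the correlation question in item 4, the Pearson correlation coefficient can be computed directly from its definition. This is a minimal Python sketch with made-up heights, not the class survey data. The helper name `pearson_r` is hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of paired values xs[i], ys[i]."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical (daughter, mother) heights in inches; NOT the class survey data.
daughters = [62, 64, 65, 66, 68, 70]
mothers = [61, 65, 63, 66, 67, 68]
print(f"r = {pearson_r(daughters, mothers):.3f}")
```

A value of r near +1 or -1 indicates a strong linear relationship. Whether a given r is significantly different from zero depends on the sample size.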

  13. for Dec 9
    1. Read chapter 12, on confidence intervals.
    2. No written assignment due; review and catch up on assignments.
    3. Practice final is at statistics/practice-final.txt.

  14. ** FINAL EXAM ** due Mon Dec 15
    1. online copy - but table formatting is messed up.
    2. open book - use any sources or software you like.
    3. don't use other people; ask me if you have questions.
    4. Will be handed out Tues the 9th on the last day of class.


Lecture Notes

Resources

Syllabus

Expect this to change as we go along.


chap 1-4 mean, std dev, probability ideas       Sep  9
   and spreadsheets, graphing,                      16
   Mathematica, arithmetic review                   23

   -- test one -- Sep 26

chap 5-7 binomial, Gaussian                     Sep 30
   with a bit of sigma/sqrt(n)                  Oct  7
   and combining distributions 

   -- test two -- (midterm grades due)          Oct 10
   -- project proposal --                           16

chap 8-10 hypothesis testing                    Oct 14
                                     (hendricks)    21
                                                    28

chap 11,13,15,16 various tests / examples       Nov  4
   t-test, Chi-Square, Anova                        11
   with case studies                                18

  -- project data and first draft --            Nov 16
  -- project final draft--                          24

chap 12, 14 data
   confidence intervals, correlation fitting    Nov 25
                                      (thanksgiving)
                                                Dec  2
                                      (last class)   9

  -- final exam out on Tues Dec 9th, due Mon 15th--



Jim Mahoney (mahoney@marlboro.edu)