Statistics

Spring 2016

Feb 4

asides

Iowa caucuses and coin flips
Is truncating the Y-axis misleading?

context

Where we are:
Today we'll look at the second half of chapter 1, more math-y and specific.
Questions about anything so far?

mean and standard deviation

Discuss and define. (Math on the whiteboard.)
These are crucial ideas for the statistical tests that come later in the course. You should (a) have an intuition for what they are, (b) be able to compute them by hand, and (c) be able to use software like R to compute them for you.
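As a rough sketch of points (b) and (c), here is the "by hand" arithmetic written out step by step in R, next to the built-in functions. The variable names are made up for illustration, and the scores are just a small example set. Note that R's sd() divides by n - 1 rather than n.

```r
# Example test scores (a small made-up set for illustration)
scores <- c(95, 95, 90, 90, 90)
n <- length(scores)

# Mean "by hand": add them up and divide by how many
mean_by_hand <- sum(scores) / n

# Standard deviation "by hand"
deviations <- scores - mean_by_hand     # distance of each score from the mean
sum_sq <- sum(deviations^2)             # sum of the squared deviations
sd_by_hand <- sqrt(sum_sq / (n - 1))    # divide by n - 1, then take the square root

# The same quantities using R's built-in functions
mean_builtin <- mean(scores)
sd_builtin <- sd(scores)

print(c(mean_by_hand, mean_builtin))
print(c(sd_by_hand, sd_builtin))
```

The two numbers on each printed line should agree, which is a handy way to check a hand calculation.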
In addition to the basic formulas, I will also mention the "frequency" approach:
scores = c(95, 95, 90, 90, 90)
mean = (95 + 95 + 90 + 90 + 90)/5 = sum/how_many
But we can also write this as
mean = (2*95 + 3*90)/(2 + 3)
where the 2 and the 3 count how many times we got 95 and 90.
In fact, if we list all the possible scores from (say) 90 to 95 and let f(t) be the "frequency", i.e. the count of how many test scores were equal to t, we would have
t    f(t)
---------
90   3
91   0
92   0
93   0
94   0
95   2
and we could write
mean = (3*90 + 0*91 + 0*92 + 0*93 + 0*94 + 2*95)/(3+0+0+0+0+2)
     = sum(t * f(t)) / sum(f(t)) for all values of t
Well, you might say that this is much more complicated, so why bother doing it this way?
The answer is that for many situations it's more convenient and intuitive. One example is the average position (i.e. center of mass) of a physical object that has a density that varies from place to place.
But for this class, the simpler "add-em-up-and-divide-by-how-many" should work fine.
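For anyone who wants to check that the two approaches agree, here is a minimal R sketch using the five scores from above. The names values and counts are just illustrative stand-ins for t and f(t).

```r
# Test scores from the example above
scores <- c(95, 95, 90, 90, 90)

# Simple approach: add them up and divide by how many
mean_simple <- sum(scores) / length(scores)

# Frequency approach: list every possible value from 90 to 95
# and count how many scores are equal to each one
values <- 90:95
counts <- sapply(values, function(v) sum(scores == v))

# Weighted sum of the values divided by the total count
mean_freq <- sum(values * counts) / sum(counts)

print(counts)
print(c(mean_simple, mean_freq, mean(scores)))
```

All three means printed on the last line should be the same number.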
I will also discuss the two different formulas for standard deviation, which differ by a factor of sqrt(n/(n-1)), and why there are two versions: the "sample standard deviation" \( s \), which divides by n-1, and the "population standard deviation" \( \sigma \), which divides by n.
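For reference, here are the two formulas side by side, as a sketch in one common notation (the whiteboard version may use different symbols). Here \( x_1, \ldots, x_n \) are the data values and \( \bar{x} \) is their mean. Both are written with the same \( \bar{x} \) so that the factor is easy to see. Strictly speaking, the population version uses the true mean of the whole population.

```latex
\sigma = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2}
\qquad
s = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2}
  = \sigma \sqrt{\frac{n}{n-1}}
```

Roughly, dividing by n-1 makes up for the fact that deviations measured from the sample's own mean tend to come out a bit smaller than deviations from the true population mean. For the five scores 95, 95, 90, 90, 90 the sum of squared deviations is 30, so \( \sigma = \sqrt{30/5} \approx 2.45 \) and \( s = \sqrt{30/4} \approx 2.74 \).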

R and making graphs

Work through some of the "Intro to Data lab"
The graphs can be done in either of two ways: using R's "standard graphics" (which is what the lab describes) or using ggplot (which is nice looking and perhaps simpler).
There are lots of examples at
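To give a flavor of the two styles, here is a small sketch that is not taken from the lab. It uses mtcars, a data set that ships with R, as a stand-in for the lab's data. The ggplot part assumes the ggplot2 package is installed and is skipped otherwise.

```r
# Standard graphics: a histogram and a scatterplot
h <- hist(mtcars$mpg,
          main = "Miles per gallon",
          xlab = "mpg")

plot(mtcars$wt, mtcars$mpg,
     main = "Weight vs. mpg",
     xlab = "weight (1000 lbs)",
     ylab = "mpg")

# ggplot version of the same scatterplot (only if ggplot2 is available)
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
    geom_point() +
    labs(title = "Weight vs. mpg",
         x = "weight (1000 lbs)",
         y = "mpg")
  print(p)
}
```

In standard graphics each function call draws a plot right away, while ggplot builds up a plot object out of pieces joined with + and draws it when printed.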
which is listed on the "outside resources" page.
Also see
Coming up next: some homework and practice.
http://cs.marlboro.edu/courses/spring2016/statistics/notes/Feb_4
last modified Thursday February 4 2016 12:32 am EST