March 8
quiz 1 results
After seeing how it went, I scaled the raw scores up by 5 points
to get letter grades with this distribution:
grade:  A   A-  B+  B   B-
count:  1   3   1   1   4
I have posted my answers, and will go over them briefly.
Homework 2 has been posted. Also a reading assignment for Thursday.
And your group presentations of a pig-like game will also happen Thursday.
(Do we want to use some of today's time to prepare for that?)
where we are
So we now know something about getting data,
finding some summary statistics (mean, standard deviation)
of some data and plotting it (chap 1), and
we know some probability theory (chap 2).
Now we're going to see the most typical probability
distribution: the normal distribution (chap 3).
(Also called the "bell curve" or "Gaussian".)
The math behind this gets hairy - I will be stating
some facts without deriving them, mostly for culture
or for the folks who've been exposed to some of this before.
chapter 3 - normal distribution, Z-score, R functions, binomial (a bit)
The normal is
- a continuous, smooth distribution
- If the mean=0 and sigma=1, then the x value is called the "Z-score".
- We can get to a z-score from any other normal by scaling with z = (x - mean) / sigma.
(I will use "sigma" and \(\sigma\) and "standard deviation" interchangeably.)
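Here is a small sketch of that scaling in R. The numbers in "scores" are made up for illustration; they are not the quiz scores or data from the text. After scaling, the z values should have mean 0 and standard deviation 1.

```r
# Hypothetical data, invented for this example only.
scores = c(72, 85, 90, 64, 78)

# z-score: how many standard deviations each value is from the mean.
z = (scores - mean(scores)) / sd(scores)

print(z)
print(mean(z))   # essentially 0, up to rounding error
print(sd(z))     # 1
```

Note that sd() in R is the sample standard deviation (it divides by n-1), so this assumes that is the sigma we want.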
In the old days people used tables to get the various versions
of the normal distribution numbers. But these days those tables
are baked into software like R.
Normal distribution functions (badly named, IMHO) in R:
- dnorm(x) - normal distribution itself, dnorm(x) = \( \frac{1}{\sqrt{2\pi}} e^{-x^2 / 2} \)
- pnorm(x) - total probability up to x, pnorm(x) = \( \int_{-\infty}^x \frac{1}{\sqrt{2\pi}} e^{-z^2 / 2} \, dz \)
- qnorm(p) - quantiles, qnorm(p) = pnorm_inverse(p)
- rnorm(n) - generate n normally distributed random numbers
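If you want to convince yourself that these functions match the formulas, here is a rough check. The test points in "x" are arbitrary choices of mine, and integrate() is R's general numerical integrator, so the area comparison is only approximate.

```r
x = c(-2, -0.5, 0, 1, 2.5)   # arbitrary test points, for illustration only

# dnorm(x) compared with the formula written out by hand
by_hand = exp(-x^2 / 2) / sqrt(2 * pi)
density_ok = isTRUE(all.equal(dnorm(x), by_hand))

# pnorm(1) compared with a numerical integral of dnorm from -infinity to 1
area = integrate(dnorm, -Inf, 1)$value
area_ok = abs(area - pnorm(1)) < 1e-4

# qnorm undoes pnorm
inverse_ok = isTRUE(all.equal(qnorm(pnorm(x)), x))

print(c(density_ok, area_ok, inverse_ok))
```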
x = seq(-3, 3, 0.01) # Define numbers for an x axis, from -3 to 3.
plot(x, dnorm(x)) # "density" - normal distribution itself (mean=0, sigma=1)
plot(x, pnorm(x)) # "probability" - total probability up to x
prob = c(0.05, 0.25, 0.5, 0.75, 0.95) # Define some probabilities.
qnorm(prob) # Invert pnorm to get the corresponding x values.
numbers = rnorm(100) # generate 100 normally distributed values
hist(numbers) # count how many of each go into different bins
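The same functions also take mean and sd arguments, so you can either convert to a z-score first or hand R the mean and sigma directly. A sketch, with made-up values for "mu", "sigma", and "x" (these are my assumptions, not an example from the text):

```r
mu = 100      # hypothetical mean
sigma = 15    # hypothetical standard deviation
x = 120       # hypothetical value we are asking about

# Way 1: convert to a z-score, then use the standard normal.
z = (x - mu) / sigma
p_from_z = pnorm(z)

# Way 2: give pnorm the mean and standard deviation directly.
p_direct = pnorm(x, mean = mu, sd = sigma)

print(c(p_from_z, p_direct))   # the two should agree

# qnorm with the same mean and sd takes us back to x.
x_back = qnorm(p_direct, mean = mu, sd = sigma)
print(x_back)

# Fraction of a normal within one sigma of the mean, about 0.68.
within_one_sigma = pnorm(1) - pnorm(-1)
print(within_one_sigma)
```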
A related distribution is the "binomial", which we will look
at a bit but not use as much.
- The binomial is discrete, not smooth, and depends on a number of trials N.
- As N gets big, the binomial turns into the normal, with mean=Np and sigma=sqrt(Np(1-p)), where p is the chance of success on each trial.
- Pascal's Triangle gives the binomial coefficients which show up in the binomial distribution.
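To see these three points in R, here is a sketch using dbinom() and choose(). The values N=50 and p=0.5 are my choices for illustration, and the matching normal uses mean N*p and sigma sqrt(N*p*(1-p)).

```r
N = 50      # hypothetical number of trials
p = 0.5     # hypothetical chance of success per trial
k = 0:N     # possible numbers of successes

# Binomial probabilities, and the normal curve with the same mean and sigma.
binom = dbinom(k, size = N, prob = p)
normal = dnorm(k, mean = N * p, sd = sqrt(N * p * (1 - p)))

plot(k, binom)       # discrete points
lines(k, normal)     # smooth curve on top of them

max_diff = max(abs(binom - normal))
print(max_diff)      # small compared with the peak height

# One row of Pascal's Triangle: the binomial coefficients for N=4.
pascal_row = choose(4, 0:4)
print(pascal_row)    # 1 4 6 4 1

# With p=0.5, the binomial probabilities are those coefficients over 2^4.
coef_ok = isTRUE(all.equal(dbinom(0:4, size = 4, prob = 0.5), pascal_row / 2^4))
print(coef_ok)
```

Try a small N like 5 in place of 50 to see how rough the match is before N gets big.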
I'll go over how all this fits together, and work two practical examples from the text: 3.3 & 3.5.