March 31
confidence intervals
Continue discussion of confidence intervals and standard error.
I've created a few R scripts to illustrate the ideas
in the first few sections.
Start with confidence.R : illustration of confidence interval by sampling a population.
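Here is a rough Python sketch of the same idea. It is not the confidence.R script itself: the population (normal, mean 50, standard deviation 10), the sample size, and the number of trials are all made up for illustration. Each trial draws a sample, builds an approximate 95% interval as mean +/- 1.96 * standard error, and checks whether the interval contains the true population mean.

```python
import random
import statistics

random.seed(31)  # fixed seed so the run is repeatable

# Hypothetical population: 100,000 values, roughly normal(50, 10).
population = [random.gauss(50, 10) for _ in range(100_000)]
true_mean = statistics.fmean(population)

N = 50          # size of each sample
TRIALS = 2000   # number of repeated samples
Z = 1.96        # normal critical value for a 95% interval

covered = 0
for _ in range(TRIALS):
    sample = random.sample(population, N)      # draw without replacement
    m = statistics.fmean(sample)               # sample mean
    se = statistics.stdev(sample) / N ** 0.5   # standard error = s / sqrt(N)
    if m - Z * se <= true_mean <= m + Z * se:
        covered += 1

coverage = covered / TRIALS
print(f"population mean = {true_mean:.3f}")
print(f"fraction of intervals containing it = {coverage:.3f}")
```

The printed fraction should land near 0.95. That is what "95% confidence" means here: about 95% of intervals built this way contain the population mean.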
aside : why 1/sqrt(N) ?
This is tricky, but I would like to try to motivate where this comes from.
One way to think about the mean of a sample is to treat each randomly chosen
entity as a random variable drawn from the original distribution. The mean is what
we get by adding together all these random variables and dividing by N.
X : random variable with parent population probability distribution
X1, X2, X3, ... , XN : samples
Y = (X1 + X2 + X3 + ... + XN)/N = new random variable
We want to know the mean and standard deviation of Y.
It turns out that
mean(Y) = mean(X) # the sample mean is centered on the population mean
variance(Y) = variance(X)/N # variance = sigma**2 gets smaller by a factor of 1/N
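I simulate something like this situation in the variance.R script, in
which I demonstrate that variance(A + B) = variance(A) + variance(B)
when A and B are independent.
Together with the fact that if c is constant, variance(c A) = c**2 variance(A),
that's enough to explain the 1/sqrt(N) factor:
variance(Y) = variance(X1 + ... + XN)/N**2 = N variance(X)/N**2 = variance(X)/N
so the standard deviation of Y is sigma/sqrt(N).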
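The two variance rules and the resulting 1/N factor can be checked numerically. This is a minimal Python sketch, not the variance.R script: the distributions, the constant c = 3, and the sizes M and N are arbitrary choices for illustration.

```python
import random
import statistics

random.seed(331)  # fixed seed so the run is repeatable

M = 20_000   # number of simulated draws
N = 25       # sample size used for the mean Y

# Two independent random variables with different distributions.
A = [random.gauss(0, 2) for _ in range(M)]     # theoretical variance 2**2 = 4
B = [random.uniform(0, 6) for _ in range(M)]   # theoretical variance 6**2/12 = 3

var_A = statistics.pvariance(A)
var_B = statistics.pvariance(B)
var_sum = statistics.pvariance([a + b for a, b in zip(A, B)])

# Scaling by a constant c multiplies the variance by c**2.
c = 3
var_cA = statistics.pvariance([c * a for a in A])

# Y = mean of N independent draws of X, with X ~ gauss(0, 2).
Y = [statistics.fmean([random.gauss(0, 2) for _ in range(N)]) for _ in range(M)]
var_Y = statistics.pvariance(Y)

print(f"variance(A) + variance(B) = {var_A + var_B:.3f}")
print(f"variance(A + B)           = {var_sum:.3f}")
print(f"c**2 * variance(A)        = {c**2 * var_A:.3f}")
print(f"variance(c A)             = {var_cA:.3f}")
print(f"variance(X)/N (theory)    = {4 / N:.4f}")
print(f"variance(Y) (simulated)   = {var_Y:.4f}")
```

The first pair of numbers agrees only approximately, since the simulated A and B have a small chance covariance. The scaling rule holds exactly, up to floating-point rounding.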
back to our regular program
Do some examples : maybe some exercises from the text?
If time allows, start discussion of hypothesis testing,
from textbook slideshow or my Tuesday notes.