Statistics Final - fall 2003

Your Name ________________________________________________________

Instructions:  
This is an open-book, take-home final.  You may use your text, any notes, or any library or online resources,
including Excel, Mathematica, or other software or a calculator.  You may not ask other people for help.
Be clear about any sources you use, and explain how you got your results and what they mean.  As always,
your goal should be to convince me you know the material, not just to get the right answer.  If you have any
questions, ask me.

1

A survey of 200 voters finds that 90 would vote for candidate A, 50 would vote for candidate B, and 60 are undecided.
With 95% confidence, what range (i.e. from ___% to ____%) of voters would vote for candidate A?
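One common approach to a question like this is the normal-approximation interval for a proportion, p ± z·sqrt(p(1-p)/n).  The sketch below is only an illustration in Python; the language and the helper name `proportion_ci` are assumptions, not part of the exam.  It counts the 60 undecided voters as not voting for A, which is a choice you would still need to justify.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes, n, confidence=0.95):
    """Normal-approximation confidence interval for a proportion (hypothetical helper)."""
    p_hat = successes / n
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Assumption: all 200 respondents form the sample, undecided voters included.
low, high = proportion_ci(90, 200)
print(f"from {100 * low:.1f}% to {100 * high:.1f}%")
```

The number alone is not the answer; explain what the interval does and does not say about the voters who were not surveyed.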

2

An experimenter measures the crop yield from a number of equal-sized plots of land.  The results are
{66,48,45,45,37,79,63,67,81,61}.  Thinking of this as a sample from the population of all yields
from this type of crop, describe the distribution of the population and this sample of it.  
Include as much as you can about the mean of the sample and the underlying population,
the standard deviation of the sample and the underlying population.  Draw some pictures.
With 95% confidence, what is the yield from a typical plot of land like this?
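For a sample this small, the usual interval for the population mean uses Student's t rather than the normal distribution.  The sketch below is one possible way to get the numbers, assuming Python and a critical value of 2.262 (two-sided 95%, 9 degrees of freedom) read from a standard t table.  It gives an interval for the mean yield; whether that is the right reading of "a typical plot" is something you should argue.

```python
from math import sqrt
from statistics import mean, stdev

yields = [66, 48, 45, 45, 37, 79, 63, 67, 81, 61]

n = len(yields)
sample_mean = mean(yields)
sample_sd = stdev(yields)              # sample standard deviation (n - 1 denominator)
standard_error = sample_sd / sqrt(n)   # estimated standard deviation of the sample mean

# Assumption: two-sided 95% critical value of Student's t with n - 1 = 9
# degrees of freedom, taken from a printed table rather than computed here.
T_CRIT_DF9 = 2.262

margin = T_CRIT_DF9 * standard_error
ci_low, ci_high = sample_mean - margin, sample_mean + margin

print(f"mean = {sample_mean:.2f}, sd = {sample_sd:.2f}")
print(f"95% interval for the mean: {ci_low:.1f} to {ci_high:.1f}")
```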

3

The same scientist now tries a genetic variation of the same crop, and wants to know if it has a higher
yield than the one described in problem 2.  The new data are {60,63,56,73,49,61,66,59}.

a) Again find the 95% confidence interval for the mean of the parent population of this variation.
Compare this with your result from problem 2 to find an initial guess as to whether or not the yield
is higher.

b) Choose an appropriate null hypothesis and perform a Student's t-test to decide the question,
again at the 95% confidence level (a significance level of 0.05).

c) Test the same hypothesis with an ANOVA test, again at the 95% confidence level.  Compare your results
with a) and b), and discuss.  Which of the three do you like best in this case?
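A two-sample comparison of this kind is often done with the pooled (equal-variance) t statistic.  The sketch below assumes Python and a hypothetical helper `pooled_t`; it computes only the statistic, which you would then compare with a table value for 16 degrees of freedom.  For two groups, a one-way ANOVA F statistic equals the square of this pooled t, which is worth keeping in mind for part c).

```python
from math import sqrt
from statistics import mean, variance

original = [66, 48, 45, 45, 37, 79, 63, 67, 81, 61]   # problem 2
variant = [60, 63, 56, 73, 49, 61, 66, 59]            # problem 3

def pooled_t(a, b):
    """Pooled two-sample t statistic for mean(b) - mean(a) (hypothetical helper).

    Assumes both samples come from populations with the same variance.
    """
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    standard_error = sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(b) - mean(a)) / standard_error

t_stat = pooled_t(original, variant)
degrees_of_freedom = len(original) + len(variant) - 2
print(f"t = {t_stat:.3f} with {degrees_of_freedom} degrees of freedom")
```

The equal-variance assumption is a design choice here; if the two sample variances look very different, say so and consider an unequal-variance test instead.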

4

a) In your answer to problem 3, could you have made a type I error?  If you had made such an error,
what would be true instead?  Do your best to estimate the probability of such an error.

b) In your answer to problem 3, could you have made a type II error?  If you had made such an error,
what would be true instead?  Do your best to estimate the probability of such an error.

5

A psychologist believes that the number of reported UFO sightings is related to the phase of the moon.
From the following data, discuss whether you agree or not.  As usual, describe which test you choose,
and explain your methods.

    number of sightings    moon phase (percent full)
    5                      10
    8                      20
    4                      30
    5                      40
    10                     50
    7                      70
    8                      80
    6                      90
    11                     100
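One test you might choose here is the Pearson correlation coefficient r, with t = r·sqrt(n-2)/sqrt(1-r²) used to judge whether r differs from zero.  The sketch below assumes Python and is only one option; a regression or a rank-based test could be defended just as well.

```python
from math import sqrt

moon_phase = [10, 20, 30, 40, 50, 70, 80, 90, 100]   # percent full
sightings = [5, 8, 4, 5, 10, 7, 8, 6, 11]

def pearson_r(x, y):
    """Pearson correlation coefficient computed from sums of squares."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / sqrt(s_xx * s_yy)

r = pearson_r(moon_phase, sightings)
n = len(moon_phase)
# Test statistic for the null hypothesis of zero correlation, n - 2 degrees of freedom.
t_stat = r * sqrt(n - 2) / sqrt(1 - r ** 2)
print(f"r = {r:.3f}, t = {t_stat:.3f} with {n - 2} degrees of freedom")
```

Compare the t value with a table entry for 7 degrees of freedom, and say in words what the result means for the psychologist's claim.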

6

We can think of rolling a six-sided die as a binomial experiment if all we want to know is whether
or not a "1" is rolled.  With 95% confidence, how many 1's do you expect to find in
60 rolls of the die?
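A binomial count has mean np and standard deviation sqrt(np(1-p)), and a rough 95% range follows from the normal approximation.  The sketch below assumes Python and a fair die; the helper `binom_prob_between` is hypothetical and lets you check how much exact binomial probability a whole-number range actually covers.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 60, 1 / 6                        # 60 rolls, assumed fair die

expected = n * p                        # binomial mean
sd = sqrt(n * p * (1 - p))              # binomial standard deviation

z = NormalDist().inv_cdf(0.975)         # about 1.96
approx_low, approx_high = expected - z * sd, expected + z * sd

def binom_prob_between(lo, hi):
    """Exact probability that the number of 1's is between lo and hi inclusive."""
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(lo, hi + 1))

print(f"expected = {expected:.1f}, sd = {sd:.2f}")
print(f"normal approximation: {approx_low:.2f} to {approx_high:.2f}")
print(f"exact probability of 5 to 15 ones: {binom_prob_between(5, 15):.3f}")
```

Because the count must be a whole number, the approximate endpoints have to be rounded, and the exact sum shows what confidence the rounded range really carries.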

7

Three groups of sick people are given three different drug treatments, and
their ratio of "bad" to "good" cholesterol is measured.  For comparison,
the ratio without any treatment is given as "none".

The results are

    group    ratio

    none     0.3
    A        0.4
    B        0.2
    C        0.2
    none     0.1
    A        0.3
    B        0.2
    C        0.4
    none     0.2
    A        0.6
    B        0.4
    C        0.3
    none     0.3
    A        0.4
    B        0.4
    C        0.2

An ANOVA test is done to see if any of these groups are significantly different.
What does it show?  
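A one-way ANOVA compares the variation between group means with the variation within groups, F = (SS_between/df_between)/(SS_within/df_within).  The sketch below assumes Python and treats the four groups as independent samples; the helper `one_way_anova` is hypothetical.  A standard table gives 3.49 as the 5% critical value of F with 3 and 12 degrees of freedom.

```python
from statistics import mean

groups = {
    "none": [0.3, 0.1, 0.2, 0.3],
    "A":    [0.4, 0.3, 0.6, 0.4],
    "B":    [0.2, 0.2, 0.4, 0.4],
    "C":    [0.2, 0.4, 0.3, 0.2],
}

def one_way_anova(samples):
    """Return (F, df_between, df_within) for a list of independent samples."""
    all_values = [x for s in samples for x in s]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in samples)
    # Variation of each observation around its own group mean.
    ss_within = sum((x - mean(s)) ** 2 for s in samples for x in s)
    df_between = len(samples) - 1
    df_within = len(all_values) - len(samples)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

f_stat, df_between, df_within = one_way_anova(list(groups.values()))
print(f"F = {f_stat:.3f} with ({df_between}, {df_within}) degrees of freedom")
```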

8

Another researcher, looking at the same data, also knows that the same four people
were used for all four groups.  He's only interested in treatment A, and so extracts
the following table.

    patient    none    A
    Fred       0.3     0.4
    Chuck      0.1     0.3
    Al         0.2     0.6
    George     0.3     0.4

He uses a paired t-test to see if these results are statistically significant.  
What does he find?  Explain how this result is consistent with that from problem 7.
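A paired t-test works on the per-patient differences, t = mean(d)/(sd(d)/sqrt(n)), with n-1 degrees of freedom.  The sketch below assumes Python and computes only the statistic.  For 3 degrees of freedom, a standard table gives 3.182 as the two-sided and 2.353 as the one-sided 5% critical value, so the conclusion depends on which alternative hypothesis you state.

```python
from math import sqrt
from statistics import mean, stdev

# (none, A) cholesterol ratios for each patient, from the table above.
patients = {
    "Fred":   (0.3, 0.4),
    "Chuck":  (0.1, 0.3),
    "Al":     (0.2, 0.6),
    "George": (0.3, 0.4),
}

# Difference A - none for each patient; pairing removes patient-to-patient variation.
differences = [treated - untreated for untreated, treated in patients.values()]

n = len(differences)
mean_diff = mean(differences)
standard_error = stdev(differences) / sqrt(n)
t_stat = mean_diff / standard_error

print(f"mean difference = {mean_diff:.2f}, t = {t_stat:.3f} with {n - 1} degrees of freedom")
```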


Created by Mathematica  (December 9, 2003)