Statistics

Spring 2016

Feb 23 - conditional probability

context

We're talking about probability, working through the material in chapter 2, putting together the theory behind the math models used to build the statistical tests that we're heading towards later in the term.
Today we'll discuss an idea called "conditional probability": what is the probability of some thing A given that you already know some other thing B? The notation is P(A|B), which you would say as "the probability of A given B".
You will *not* be tested on this material. It's cool stuff, but a bit sideways to what you absolutely need to know in order to do the statistical tests that we're heading towards. Interesting and useful in certain situations, but optional for the core material of intro stats.
We already talked about "independent" things a bit. If two things A and B are independent, then P(A|B) = P(A). In other words, knowing about B doesn't change A's probability.
I think the best way to see this material is with examples. All the ones here have two variables, each with two possible values - this is the simplest setup that shows how this stuff works.

example 1 - medical test

Suppose that you are worried that you might have a rare disease. You decide to get tested, and suppose that the testing methods for this disease are correct 99 percent of the time (in other words, if you have the disease, it shows that you do with 99 percent probability, and if you don't have the disease, it shows that you do not with 99 percent probability). Suppose this disease is actually quite rare, occurring randomly in the general population in only one of every 10,000 people. (From https://www.math.hmc.edu/funfacts/ffiles/30002.6.shtml )
Discuss & work through these concepts :
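Here is one way to work through the numbers, as a rough sketch rather than the official solution. It asks: if your test comes back positive, what is the probability that you really have the disease? The variable names are mine; the 99 percent and 1-in-10,000 figures are from the problem statement above.

```python
# Numbers from the example: the test is right 99% of the time,
# and 1 person in 10,000 has the disease.
p_disease = 1 / 10_000
p_pos_given_disease = 0.99   # test says "yes" when you do have it
p_pos_given_healthy = 0.01   # test says "yes" when you don't (false positive)

# P(positive) = P(positive & disease) + P(positive & healthy)
p_pos = (p_pos_given_disease * p_disease
         + p_pos_given_healthy * (1 - p_disease))

# P(disease | positive) = P(positive & disease) / P(positive)
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos

print(f"P(positive)           = {p_pos:.6f}")
print(f"P(disease | positive) = {p_disease_given_pos:.4f}")
```

The answer comes out a bit under 1 percent. The disease is so rare that the false positives from the many healthy people swamp the true positives from the few sick ones.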

example 2 - parents & teens college or not

A look at teens who did or didn't go to college (variable A) from families with a parent who did or didn't go to college (variable B). (From our textbook, pg 88.)
The numbers (from which we can work out probabilities) of people are :
count(yes parent & yes teen) = 231
count(yes parent & no teen) = 49
count(no parent & yes teen) = 214
count(no parent & no teen) = 298
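A small sketch of how the conditional probabilities fall out of these counts. The dictionary layout and function name are just my choices for illustration; the four counts are the ones listed above.

```python
# Counts keyed by (parent went to college, teen went to college)
counts = {
    ("yes", "yes"): 231,
    ("yes", "no"): 49,
    ("no", "yes"): 214,
    ("no", "no"): 298,
}
total = sum(counts.values())

def p_teen_yes_given_parent(parent):
    """P(teen = yes | parent = given value), straight from the counts."""
    row_total = counts[(parent, "yes")] + counts[(parent, "no")]
    return counts[(parent, "yes")] / row_total

# Overall P(teen = yes), ignoring the parent
p_teen_yes = (counts[("yes", "yes")] + counts[("no", "yes")]) / total

print(f"P(teen yes)              = {p_teen_yes:.3f}")
print(f"P(teen yes | parent yes) = {p_teen_yes_given_parent('yes'):.3f}")
print(f"P(teen yes | parent no)  = {p_teen_yes_given_parent('no'):.3f}")
```

If the two variables were independent, all three numbers would be about the same. Here they are not: knowing about the parent changes the teen's probability quite a bit.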

theory aside

Bayes' Theorem ... lets you flip a conditional probability around.
Since
  P(A & B) = P(A) * P(B|A) = P(B) * P(A|B)
then
  P(A|B) = P(A) * P(B|A) / P(B)
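As a sanity check, here is a sketch that tries the formula on the parent and teen counts from example 2, with A = "teen went to college" and B = "parent went to college". The variable names are my own. Using exact fractions avoids any rounding, so the flipped answer can be compared directly against the one read off the counts.

```python
from fractions import Fraction

# Counts from example 2
yes_yes, yes_no, no_yes, no_no = 231, 49, 214, 298   # (parent, teen)
total = yes_yes + yes_no + no_yes + no_no

p_A = Fraction(yes_yes + no_yes, total)            # P(teen yes)
p_B = Fraction(yes_yes + yes_no, total)            # P(parent yes)
p_B_given_A = Fraction(yes_yes, yes_yes + no_yes)  # P(parent yes | teen yes)

# Bayes: P(A|B) = P(A) * P(B|A) / P(B)
p_A_given_B = p_A * p_B_given_A / p_B

# Direct from the counts: of the "yes parent" families, how many teens went?
direct = Fraction(yes_yes, yes_yes + yes_no)

print(p_A_given_B, float(p_A_given_B))   # 33/40 0.825
print(p_A_given_B == direct)             # True
```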
Spam filter example :
Measure P(word[i] | spam) by counting each word[i] in a collection of emails labeled as spam or not spam. Then, for a new mail message, see which words are in it and use Bayes' theorem to find P(spam | word[i]).
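A toy sketch of that idea for a single word. The seven "emails" below are completely made up for illustration, and the function name is hypothetical. A real spam filter combines many words and handles words it has never seen; this one only shows the Bayes flip.

```python
from fractions import Fraction

# Made-up training data: (is_spam, set of words in the message)
emails = [
    (True,  {"win", "money", "now"}),
    (True,  {"free", "money"}),
    (True,  {"win", "free", "prize"}),
    (False, {"meeting", "now"}),
    (False, {"lunch", "money"}),
    (False, {"statistics", "homework"}),
    (False, {"meeting", "homework"}),
]

def p_spam_given_word(word):
    """P(spam | word) via Bayes, from counts in the labeled emails."""
    n = len(emails)
    n_spam = sum(1 for is_spam, _ in emails if is_spam)
    n_word = sum(1 for _, words in emails if word in words)
    n_word_in_spam = sum(1 for is_spam, words in emails
                         if is_spam and word in words)
    if n_word == 0:
        return None  # word never seen, so the counts say nothing about it

    p_spam = Fraction(n_spam, n)                        # P(spam)
    p_word = Fraction(n_word, n)                        # P(word)
    p_word_given_spam = Fraction(n_word_in_spam, n_spam)  # P(word | spam)

    # Bayes: P(spam | word) = P(word | spam) * P(spam) / P(word)
    return p_word_given_spam * p_spam / p_word

for w in ["money", "win", "meeting"]:
    print(w, p_spam_given_word(w))
```

With this made-up data, "money" gives 2/3, "win" gives 1, and "meeting" gives 0, which matches what you get by just counting the messages that contain each word.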

example 3 - taxicab witness

A cab was involved in a hit and run accident at night. Two cab companies, the Green and the Blue, operate in the city. You are given the following data:
What is the probability that the cab involved in the accident was Blue rather than Green?
(From http://www.cut-the-knot.org/Probability/RedBlueTaxicabs.shtml )
http://cs.marlboro.edu/ courses/ spring2016/statistics/ notes/ Feb_23
last modified Monday February 22 2016 9:11 pm EST