oct 29
homework
overview of upcoming topics
- discuss upcoming assignments and reading
- conditional probability
- learning - general principles
- training set
- noise
- pattern recognition
- typically a "low level" technique, as opposed to "high level" logic
- usually ends up with a "machine" characterized by numbers whose specific values have no simple interpretation
- naive Bayes model
- neural nets
- many variations
- example: character recognition
probability
- http://en.wikipedia.org/wiki/Bayesian_probability
- random variables
  - boolean : verdict = (true, false)
  - discrete : weather = (sun, clouds, rain, snow)
  - continuous : -10 < x < 10
- probability distribution as either
  - frequency (typical in hard sciences)
  - agent knowledge (typical in AI)
- Bayesian probability is more about the "belief" version of probability: every new fact alters other probabilities.
Example: What is Travis doing right now ... expressed as a probability. If I now tell you that he's in the physics lab, what are the "probabilities" now?
Terminology:
P(A) is "probability of A".
P(A|B) is "probability of A given that we know B".
P(a,b,c) is the "joint distribution" of a, b, and c.
Example from text, pg 475 :
           toothache          !toothache
           catch    !catch    catch    !catch
cavity     0.108    0.012     0.072    0.008
!cavity    0.016    0.064     0.144    0.576
What is
P(cavity) ?
P(cavity or toothache) ?
P(cavity | catch) ?
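One way to check answers to these three questions is to sum entries of the joint table directly. The sketch below is only an illustration: the dictionary layout and the helper name prob are assumptions made for this example, not anything from the text.

```python
# Joint distribution from the table, keyed by (cavity, toothache, catch).
joint = {
    (True, True, True): 0.108,
    (True, True, False): 0.012,
    (True, False, True): 0.072,
    (True, False, False): 0.008,
    (False, True, True): 0.016,
    (False, True, False): 0.064,
    (False, False, True): 0.144,
    (False, False, False): 0.576,
}


def prob(event):
    """Sum the joint entries for which event(cavity, toothache, catch) holds."""
    return sum(p for world, p in joint.items() if event(*world))


# P(cavity): add up the whole "cavity" row.
p_cavity = prob(lambda cavity, toothache, catch: cavity)

# P(cavity or toothache): every entry where at least one of the two is true.
p_cavity_or_toothache = prob(lambda cavity, toothache, catch: cavity or toothache)

# P(cavity | catch) = P(cavity and catch) / P(catch).
p_cavity_given_catch = (
    prob(lambda cavity, toothache, catch: cavity and catch)
    / prob(lambda cavity, toothache, catch: catch)
)

print(round(p_cavity, 3))
print(round(p_cavity_or_toothache, 3))
print(round(p_cavity_given_catch, 3))
```

With the table values this gives 0.2, 0.28, and about 0.529 (that is, 0.18 / 0.34).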
Rules :
P(x|y) * P(y) = P(x and y)
or
P(x|y) = P(x and y) / P(y)
if we sum over all possibilities for y,
P(x) = sum_y P(x|y) * P(y)
if A and B are independent, then
P(A|B) = P(A)
P(B|A) = P(B)
P(A and B) = P(A) * P(B)
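The rules above can be tried out numerically on the dentist table from the text. This is a rough sketch: the helper names prob and given are made up for the illustration, and the choice of cavity and toothache as the two variables is arbitrary.

```python
# Joint distribution from the dentist table, keyed by (cavity, toothache, catch).
joint = {
    (True, True, True): 0.108,
    (True, True, False): 0.012,
    (True, False, True): 0.072,
    (True, False, False): 0.008,
    (False, True, True): 0.016,
    (False, True, False): 0.064,
    (False, False, True): 0.144,
    (False, False, False): 0.576,
}


def prob(event):
    """P(event): sum of the joint entries where the event holds."""
    return sum(p for world, p in joint.items() if event(*world))


def given(event, condition):
    """P(event | condition) = P(event and condition) / P(condition)."""
    both = prob(lambda *world: event(*world) and condition(*world))
    return both / prob(condition)


def cavity(cav, ache, catch):
    return cav


def toothache(cav, ache, catch):
    return ache


def no_toothache(cav, ache, catch):
    return not ache


# Sum rule: P(x) = sum_y P(x|y) * P(y), with y ranging over toothache / !toothache.
total = (given(cavity, toothache) * prob(toothache)
         + given(cavity, no_toothache) * prob(no_toothache))

# Independence would require P(cavity | toothache) == P(cavity).
independent = abs(given(cavity, toothache) - prob(cavity)) < 1e-9

print(round(total, 3), round(prob(cavity), 3))
print(round(given(cavity, toothache), 3), independent)
```

Here the sum rule recovers P(cavity) = 0.2, while P(cavity | toothache) = 0.6, so cavity and toothache are not independent in this table.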
Bayes rule :
P(M and N) = P(M|N) * P(N) = P(N|M) * P(M)
therefore
P(M|N) = P(N|M) * P(M) / P(N)
"This simple equation underlies all modern AI systems for probabilistic inference."
cancer example
1% of women at age forty who participate in routine screening have breast cancer. 80% of women with breast cancer will get positive mammographies. 9.6% of women without breast cancer will also get positive mammographies. A woman in this age group had a positive mammography in a routine screening. What is the probability that she actually has breast cancer?
"What do you think the answer is? If you haven't encountered this kind of problem before, please take a moment to come up with your own answer before continuing."
Here's the actual Bayes formula. (Note that the denominator is P(X).)
P(A|X) = P(X|A)*P(A) / { P(X|A)*P(A) + P(X|~A)*P(~A) }
Given some phenomenon A that we want to investigate, and an observation X that is evidence about A - for example, in the previous example, A is breast cancer and X is a positive mammography - Bayes' Theorem tells us how we should update our probability of A, given the new evidence X.
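To see the formula at work on the mammography numbers, here is a minimal sketch. The function name posterior and its parameter names are invented for this illustration; the three input numbers are the ones stated in the problem.

```python
def posterior(prior, p_x_given_a, p_x_given_not_a):
    """Bayes rule: P(A|X) from P(A), P(X|A), and P(X|~A)."""
    # Denominator is P(X), expanded as P(X|A)*P(A) + P(X|~A)*P(~A).
    p_x = p_x_given_a * prior + p_x_given_not_a * (1 - prior)
    return p_x_given_a * prior / p_x


# A = breast cancer, X = positive mammography.
p_cancer_given_positive = posterior(
    prior=0.01,             # 1% of women screened have breast cancer
    p_x_given_a=0.80,       # 80% of those with cancer test positive
    p_x_given_not_a=0.096,  # 9.6% of those without cancer test positive
)

print(f"{p_cancer_given_positive:.3f}")
```

The result is 0.008 / 0.10304, or roughly 7.8%, which is much lower than most first guesses because the 1% prior is so small.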
Once we get to learning, we'll apply this rule to training spam filters...