nov 1
homework
AIMA pg 317:
9.9
a) Horses, cows, pigs are mammals.
b) An offspring of a horse is a horse.
c) Bluebeard is a horse.
d) Bluebeard is Charlie's parent.
e) Offspring and parent are inverse relations.
f) Every mammal has a parent.
9.10
a) Draw proof tree for (exists h (horse h)) using backward-chaining.
b) Comments?
c) Number of solutions?
d) Approach to find them all?
See (and make sure you understand the differences between)
Walk through text solution.
cancer / bayes
Discuss attached text file.
conditional independence
We've done Bayes for 2 variables. How about N variables? How does this scale up?
P(spam|wow ^ great ^ penis) = ?
Bayes tells us we can compute this if we know
P(wow ^ great ^ penis | spam)
but the problem is that if there are N keywords,
and each is true|false, we have 2**N different conditional probabilities
to estimate from our collection of spam. This doesn't scale well.
If, however, the frequencies of these words are independent of one another within each of the two categories (spam vs !spam), then we say they are "conditionally independent". In that case (which may or may not be true here),
P(wow^great^penis|spam) = P(wow|spam) * P(great|spam) * P(penis|spam)
With N variables, this means we only need to find N probabilities per category, which is much more doable.
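To get a feel for the difference in scale, here is a small sketch that counts how many numbers would have to be estimated per category in each case. The function names are made up for this illustration.

```python
def joint_table_size(n_words):
    """Count the true/false combinations of n_words keywords.

    Without any independence assumption, each combination needs its own
    conditional probability (per category).
    """
    return 2 ** n_words


def naive_bayes_size(n_words):
    """Count the per-word probabilities P(word|category) needed (per category)
    when the words are assumed conditionally independent."""
    return n_words


# Compare the two counts for a few keyword-list sizes.
for n in (3, 10, 20, 30):
    print(n, joint_table_size(n), naive_bayes_size(n))
```

Even at 30 keywords the full table has over a billion entries per category, while the conditionally independent version needs only 30.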
When we assume that one cause (e.g. spam | !spam) has a number of otherwise independent effects (e.g. whether wow, great, penis each appear),
we have a mathematical model called "naive Bayes".
It turns out that this often works well even when the assumption
is false, essentially as a first-order approximation technique.
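A minimal sketch of how this plays out for the spam example, assuming every listed word is present in the message. The function name and all of the probabilities below are invented for illustration; real values would be estimated from collections of spam and non-spam mail.

```python
def naive_bayes_posterior(words, p_word_given_spam, p_word_given_ham, p_spam):
    """Return P(spam | all given words present).

    Assumes the words are conditionally independent given the category,
    so each likelihood is a product of per-word probabilities.
    """
    score_spam = p_spam          # P(spam) * product of P(word|spam)
    score_ham = 1.0 - p_spam     # P(!spam) * product of P(word|!spam)
    for w in words:
        score_spam *= p_word_given_spam[w]
        score_ham *= p_word_given_ham[w]
    # Bayes' rule: normalize over the two categories.
    return score_spam / (score_spam + score_ham)


# Made-up numbers, for illustration only.
p_word_given_spam = {"wow": 0.2, "great": 0.3, "penis": 0.1}
p_word_given_ham = {"wow": 0.05, "great": 0.1, "penis": 0.001}
p_spam = 0.5

posterior = naive_bayes_posterior(
    ["wow", "great", "penis"], p_word_given_spam, p_word_given_ham, p_spam
)
print(posterior)
```

Note that the denominator is just the sum of the two unnormalized scores, so P(wow ^ great ^ penis) never has to be estimated directly.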
sample probability questions
Look at the following questions from chap 13:
- 13.11 - bag of coins
- 13.15 - blue and green taxis
- 13.16 - prisoners
- 13.18 - text categorization (e.g. our spam example)