Artificial Intelligence

Fall 2015

Oct 13

Machine learning, statistical approaches, and all that.
resources

homework & textbook discussion

Topics

probability

Example: What is Matt Ollis doing right now ... expressed as a probability. If I now tell you that he's downtown, what are the "probabilities" now?
Terminology:
P(A) is the "probability of A".
P(A|B) is the "probability of A given that we know B".
P(a,b,c) is the "joint distribution".
Example from text, pg 475 :
               toothache           !toothache
            catch    !catch      catch    !catch
 cavity     0.108    0.012       0.072    0.008
!cavity     0.016    0.064       0.144    0.576
What is
P(cavity) ?
P(cavity or toothache) ?
P(cavity | catch) ?
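One way to check your answers to these three questions is to sum entries of the joint table directly. The following is a minimal sketch, not the textbook's code: the dictionary layout and the helper name `prob` are assumptions made for illustration, and the numbers are copied from the dentist table above.

```python
# Full joint distribution from the dentist table,
# keyed by (cavity, toothache, catch).
joint = {
    (True, True, True): 0.108,
    (True, True, False): 0.012,
    (True, False, True): 0.072,
    (True, False, False): 0.008,
    (False, True, True): 0.016,
    (False, True, False): 0.064,
    (False, False, True): 0.144,
    (False, False, False): 0.576,
}


def prob(event):
    """Sum the joint entries for which event(cavity, toothache, catch) holds."""
    return sum(p for world, p in joint.items() if event(*world))


# Marginal: add up every entry where cavity is true.
p_cavity = prob(lambda cavity, toothache, catch: cavity)

# Disjunction: add up every entry where either one is true.
p_cavity_or_toothache = prob(lambda cavity, toothache, catch: cavity or toothache)

# Conditional: P(cavity and catch) / P(catch).
p_cavity_given_catch = (
    prob(lambda cavity, toothache, catch: cavity and catch)
    / prob(lambda cavity, toothache, catch: catch)
)

print(round(p_cavity, 3))               # 0.2
print(round(p_cavity_or_toothache, 3))  # 0.28
print(round(p_cavity_given_catch, 3))   # 0.529
```

Every query here is just a sum over rows of the table, which is why the full joint distribution answers any question about these three variables.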
Rules :
P(x|y) * P(y) = P(x and y)
or
P(x|y) = P(x and y) / P(y)
if we sum over all possibilities for y,
P(x) = sum_y P(x|y) * P(y)
if A and B are independent, then
P(A|B) = P(A)
P(B|A) = P(B)
P(A and B) = P(A) * P(B)
Bayes rule :
P(M and N) = P(M|N) * P(N) = P(N|M) * P(M)
therefore
P(M|N) = P(N|M) * P(M) / P(N)
"This simple equation underlies most modern AI systems for probabilistic inference."
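As a sanity check, Bayes rule can be verified numerically on the dentist table from the text, taking M = cavity and N = toothache. This is only an illustrative sketch: the list layout and the helper name `prob` are assumptions, and the numbers are the joint probabilities from the table.

```python
# Joint distribution from the dentist table as
# (cavity, toothache, catch, probability) rows.
rows = [
    (True, True, True, 0.108), (True, True, False, 0.012),
    (True, False, True, 0.072), (True, False, False, 0.008),
    (False, True, True, 0.016), (False, True, False, 0.064),
    (False, False, True, 0.144), (False, False, False, 0.576),
]


def prob(event):
    """Sum the probability of every row where event(cavity, toothache) holds."""
    return sum(p for cavity, toothache, _catch, p in rows if event(cavity, toothache))


p_m = prob(lambda cavity, toothache: cavity)                   # P(M)
p_n = prob(lambda cavity, toothache: toothache)                # P(N)
p_m_and_n = prob(lambda cavity, toothache: cavity and toothache)

# Direct definition of the conditional: P(M|N) = P(M and N) / P(N).
direct = p_m_and_n / p_n

# Bayes rule: P(M|N) = P(N|M) * P(M) / P(N).
p_n_given_m = p_m_and_n / p_m
via_bayes = p_n_given_m * p_m / p_n

print(round(direct, 3), round(via_bayes, 3))  # 0.6 0.6
```

Both routes give the same number because Bayes rule is just the product rule written two ways and rearranged.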

cancer example

(from http://yudkowsky.net/rational/bayes)
1% of women at age forty who participate in routine screening have breast cancer. 80% of women with breast cancer will get positive mammographies. 9.6% of women without breast cancer will also get positive mammographies. A woman in this age group had a positive mammography in a routine screening. What is the probability that she actually has breast cancer?
"What do you think the answer is? If you haven't encountered this kind of problem before, please take a moment to come up with your own answer before continuing."
Here's the actual Bayes formula. (Note that the denominator is p(X), written as a sum over the two cases A and ~A, where ~A means "not A".)
p(A|X) = p(X|A)*p(A) / { p(X|A)*p(A) + p(X|~A)*p(~A) }
Given some phenomenon A that we want to investigate, and an observation X that is evidence about A - for example, in the previous example, A is breast cancer and X is a positive mammography - Bayes' Theorem tells us how we should update our probability of A, given the new evidence X.
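Plugging the mammography numbers into the formula is a short calculation. Here is a minimal sketch of it; the function name `posterior` and its parameter names are made up for illustration, while the three input numbers come straight from the problem statement.

```python
def posterior(prior, p_x_given_a, p_x_given_not_a):
    """Return p(A|X) from p(A), p(X|A), and p(X|~A) using Bayes' theorem."""
    numerator = p_x_given_a * prior
    # Denominator is p(X), summed over the two cases A and ~A.
    evidence = numerator + p_x_given_not_a * (1 - prior)
    return numerator / evidence


# A = breast cancer, X = positive mammography.
p_cancer_given_positive = posterior(
    prior=0.01,             # 1% of women screened have breast cancer
    p_x_given_a=0.80,       # 80% of those test positive
    p_x_given_not_a=0.096,  # 9.6% of women without cancer also test positive
)

print(f"{p_cancer_given_positive:.3f}")  # 0.078
```

The result is about 7.8%, much lower than most people guess, because the 9.6% false-positive rate applies to the far larger group of women without cancer.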
Once we get to learning, we'll apply this rule to training spam filters...

naive bayes spam filtering

AIMA 13.22 Text categorization is the task of assigning a given document to one of a fixed set of categories on the basis of the text it contains. Naive Bayes models are often used for this task. In these models, the query variable is the document category, and the “effect” variables are the presence or absence of each word in the language; the assumption is that words occur independently in documents, with frequencies determined by the document category.
a. Explain precisely how such a model can be constructed, given as “training data” a set of documents that have been assigned to categories.
b. Explain precisely how to categorize a new document.
c. Is the conditional independence assumption reasonable? Discuss.
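To make parts (a) and (b) concrete, here is one possible sketch of a naive Bayes text classifier using word presence or absence as the effect variables. It is not the textbook's solution. Several things are assumptions of this sketch: the function names `train` and `classify`, the add-one (Laplace) smoothing, splitting on whitespace, ignoring words never seen in training, and the tiny made-up training set.

```python
import math
from collections import Counter, defaultdict


def train(documents):
    """Estimate P(category) and P(word present | category) from labeled texts.

    documents is a list of (category, text) pairs.
    """
    doc_counts = Counter(category for category, _ in documents)
    docs_with_word = defaultdict(Counter)  # category -> word -> number of docs
    vocabulary = set()
    for category, text in documents:
        words = set(text.lower().split())
        vocabulary |= words
        docs_with_word[category].update(words)

    priors = {c: n / len(documents) for c, n in doc_counts.items()}
    # Add-one smoothing (an assumption of this sketch) keeps every
    # probability strictly between 0 and 1, so the logs below are defined.
    word_probs = {
        c: {w: (docs_with_word[c][w] + 1) / (doc_counts[c] + 2) for w in vocabulary}
        for c in doc_counts
    }
    return priors, word_probs, vocabulary


def classify(text, priors, word_probs, vocabulary):
    """Return the category with the highest posterior (compared in log space)."""
    present = set(text.lower().split())
    scores = {}
    for category, prior in priors.items():
        score = math.log(prior)
        for word in vocabulary:  # words outside the vocabulary are ignored
            p = word_probs[category][word]
            score += math.log(p if word in present else 1 - p)
        scores[category] = score
    return max(scores, key=scores.get)


# Hypothetical toy training data, for illustration only.
training = [
    ("spam", "win money now"),
    ("spam", "free money offer"),
    ("ham", "meeting notes for class"),
    ("ham", "homework for class tomorrow"),
]
model = train(training)
print(classify("free money", *model))      # spam
print(classify("class homework", *model))  # ham
```

The classifier never computes P(document), since that denominator is the same for every category and does not change which one wins. Multiplying the per-word probabilities together is exactly where the conditional independence assumption from part (c) enters.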
http://cs.marlboro.edu/courses/fall2015/ai/notes/Oct_13
last modified Tuesday October 13 2015 1:54 am EDT