Nov 17

aside

More Stanford online classes:

 nlp-class.com (natural language processing)
 pgm-class.com (probabilistic graphical models)
 & others

decision trees

I assigned 18.6 as a way to get you to look at the basic idea, without getting too far into the math details.

The basic idea is that one mechanism for drawing a conclusion from data is to make a series of sequential choices. This is particularly good when all the variables are discrete, e.g.

 in1  in2  in3    output
 1    0    1      1
 2    0    3      1
 ...

In a machine learning context, each row is one training example, and the decision tree is the machine we're going to build from the examples.

First point: any order of testing the inputs gives you a possible tree which can reproduce those outputs.

Second point: depending on how much regularity there is in the relation between inputs and outputs, some trees will be simpler (i.e. better) than others, and a simple tree gives some confidence that it is a good "model" for that data. Complicated trees are overtrained: they fit that specific set of data without representing general trends.

Third point: since we want a simple tree, we want to find an order for the choices in which each choice separates the outputs as cleanly as possible.
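
To make these three points concrete, here is a minimal Python 3 sketch that builds a tree for every fixed ordering of the inputs and counts its leaves. It uses the five rows of the assigned 18.6 table. The names ROWS and count_leaves are made up for illustration, and the rule "stop splitting once a clump has only one output value" is an assumption of this sketch.

```python
from itertools import permutations

# The five rows of the 18.6 table: ((A1, A2, A3), output)
ROWS = [((1, 0, 0), 0), ((1, 0, 1), 0), ((0, 1, 0), 0),
        ((1, 1, 1), 1), ((1, 1, 0), 1)]

def count_leaves(rows, order):
    """Number of leaves in the tree that tests the inputs in a fixed order."""
    outputs = {out for _, out in rows}
    if len(outputs) <= 1 or not order:
        return 1  # pure clump (or nothing left to test): a single leaf
    first, rest = order[0], order[1:]
    values = sorted({inputs[first] for inputs, _ in rows})
    return sum(count_leaves([r for r in rows if r[0][first] == v], rest)
               for v in values)

# Every ordering gives a valid tree, but not all trees are the same size.
leaf_counts = {order: count_leaves(ROWS, order)
               for order in permutations(range(3))}
for order, leaves in sorted(leaf_counts.items(), key=lambda kv: kv[1]):
    names = ", ".join("A%d" % (i + 1) for i in order)
    print("%s -> %d leaves" % (names, leaves))
```

Every ordering reproduces the outputs, but the orderings that put A3 first need more leaves than the ones that start with A1 and A2.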

With that in mind, discuss the assigned problem AIMA 18.6 :

 A1  A2  A3  Output
 ------------------
  1   0   0     0
  1   0   1     0
  0   1   0     0
  1   1   1     1
  1   1   0     1

Look at what happens intuitively for various choices of using A1, A2, A3 first to divide things up, and what the good choices are after that.
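
One way to do that look by hand is to list which outputs land in each clump when a given input is used first. The following Python 3 sketch does just that; the names ROWS, clumps and first_splits are illustrative, not from the book.

```python
# The five rows of the 18.6 table: ((A1, A2, A3), output)
ROWS = [((1, 0, 0), 0), ((1, 0, 1), 0), ((0, 1, 0), 0),
        ((1, 1, 1), 1), ((1, 1, 0), 1)]

def clumps(rows, attr):
    """Group the outputs by the value of input number attr (0-based)."""
    groups = {}
    for inputs, out in rows:
        groups.setdefault(inputs[attr], []).append(out)
    return groups

# For each candidate first split, show the outputs in each clump.
first_splits = {attr: clumps(ROWS, attr) for attr in range(3)}
for attr, groups in first_splits.items():
    for value in sorted(groups):
        print("A%d = %d : outputs %s" % (attr + 1, value, groups[value]))
```

Splitting on A2 first leaves one clump that is already all 0, and A1 then finishes the job in the other clump. Splitting on A3 first leaves both clumps mixed.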

The math details of the best way to do this head into information theory, which I was glossing over.

But Sam asked how the "importance" function works, which is at the heart of it, so here's how it works:

 Discuss (briefly) the idea of information entropy,
 bits per symbol. If p1, p2, p3, ... are the 
 probabilities of each symbol, then
    
   H(p1, p2, p3, ...) = - sum( p[i] * log2(p[i]) )
 
 For a boolean with only two probabilities (q, 1-q)
 and following the book's notation, this is 
 
   B(q) = - q log2(q) - (1-q) log2(1-q)
 
 which is how many bits of info there are.
 (Discuss briefly; draw the upside-down parabola sketch.)
 
 Still following the book notation, 
 in one "clump" of things
   p = number of positive  
   n = number of negative
   B(p/(n+p)) = bits of info
 
 When we use one variable to split the data
 into a partition of clumps, the best split
 causes the biggest information gain (bits per symbol).
 So the technique is to look at B 
 before and after the split :
 
 importance = B(p/(n+p)) - weighted_sum B(pk/(nk+pk))
              before split              after split
 
 where pk = number of positive in k'th partition
       nk = number of negative in k'th partition
       weighting is (nk+pk)/(n+p), the number in that partition compared to total
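
The formulas above can be written out directly. This is a minimal Python 3 sketch, not code from the book. The function names H, B and importance follow the notation above, but representing a split as a list of (pk, nk) pairs is an assumption made for illustration.

```python
from math import log2

def H(probabilities):
    """Entropy in bits of a discrete distribution; 0*log2(0) is taken as 0."""
    return sum(-p * log2(p) for p in probabilities if p > 0)

def B(q):
    """Entropy in bits of a boolean that is true with probability q."""
    return H([q, 1 - q])

def importance(clumps):
    """Information gain of a split.

    clumps is a list of (pk, nk) pairs: the positive and negative
    counts in each clump that the split produces.
    """
    p = sum(pk for pk, nk in clumps)
    n = sum(nk for pk, nk in clumps)
    total = p + n
    # Weighted sum of B over the clumps, i.e. the "after split" term.
    after = sum((pk + nk) / total * B(pk / (pk + nk))
                for pk, nk in clumps if pk + nk > 0)
    return B(p / total) - after

print(B(0.5))                        # a fair coin
print(B(1.0))                        # a certain outcome
print(importance([(2, 0), (0, 2)]))  # a split that separates 2 positive from 2 negative
print(importance([(1, 1), (1, 1)]))  # a split that tells us nothing
```

A fair coin carries one full bit, a certain outcome carries none, and a split is worth more the closer its clumps come to being all positive or all negative.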

Apply these numbers to 18.6, and compare with intuition. I put the solution in this folder.
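
As a cross-check on the hand calculation, here is a small Python 3 sketch that applies the importance formula to the five rows of 18.6. It is an illustration under my own naming (ROWS, importance, gains), not necessarily the same as the solution in the folder.

```python
from math import log2

# The five rows of the 18.6 table: ((A1, A2, A3), output)
ROWS = [((1, 0, 0), 0), ((1, 0, 1), 0), ((0, 1, 0), 0),
        ((1, 1, 1), 1), ((1, 1, 0), 1)]

def B(q):
    """Entropy in bits of a boolean that is true with probability q."""
    return sum(-x * log2(x) for x in (q, 1 - q) if x > 0)

def bits(rows):
    """B(p/(n+p)) for one clump of rows."""
    positives = sum(out for _, out in rows)
    return B(positives / len(rows))

def importance(rows, attr):
    """Information gain from splitting rows on input number attr (0-based)."""
    after = 0.0
    for value in {inputs[attr] for inputs, _ in rows}:
        clump = [r for r in rows if r[0][attr] == value]
        after += len(clump) / len(rows) * bits(clump)  # weighted sum
    return bits(rows) - after

gains = {"A%d" % (a + 1): importance(ROWS, a) for a in range(3)}
for name, gain in gains.items():
    print("%s: %.3f bits" % (name, gain))

# Second level: look only at the rows where A2 = 1.
a2_is_1 = [r for r in ROWS if r[0][1] == 1]
print("then A1: %.3f bits" % importance(a2_is_1, 0))
print("then A3: %.3f bits" % importance(a2_is_1, 2))
```

With these rows the gains come out to roughly 0.17, 0.42 and 0.02 bits for A1, A2 and A3, so A2 is the first split. Within the A2 = 1 clump, A1 is the better second split.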

more computer vision

openCV and processing.org

Examples I tried crashed on my Mac.

I did get some opencv + python working:

 $ sudo port install opencv +python26
 $ sudo port select --set python python26  # as opposed to python26-apple
 $ python
 >>> import cv
 >>>           # works!

Then on to

This worked :

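# Uses the legacy "cv" bindings from OpenCV 2.x, with Python 2 syntax.
import cv
# Load the image as a single-channel grayscale matrix.
img = cv.LoadImageM("dime_building.jpeg", cv.CV_LOAD_IMAGE_GRAYSCALE)
# Scratch matrices that GoodFeaturesToTrack needs, the same size as the image.
eig_image = cv.CreateMat(img.rows, img.cols, cv.CV_32FC1)
temp_image = cv.CreateMat(img.rows, img.cols, cv.CV_32FC1)
# Up to 10 corners, quality level 0.04, minimum distance 1.0, Harris detector.
for (x,y) in cv.GoodFeaturesToTrack(img, eig_image, temp_image, 10, 0.04, 1.0, useHarris = True):
    print "good feature at", x,y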

And there are more python examples in the OpenCV-2.2 source

including facedetect.py : see the attached screenshot for an example.

The heart of the python code is a call to HaarDetectObjects(), which uses an xml description of a trained "frontal face detection" specification: a "cascade" of "haar-like features".
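
To give a feel for what one "haar-like feature" is: it is the difference between the pixel sums of adjacent rectangles, which can be computed quickly from an integral image (summed-area table). A cascade applies many such tests in stages and rejects most windows early. The Python 3 sketch below shows only a single two-rectangle feature on a made-up 4x4 patch. It is not the code inside HaarDetectObjects(), and all the names in it are illustrative.

```python
def integral_image(img):
    """Summed-area table with an extra row and column of zeros.

    ii[y][x] is the sum of img[r][c] over all r < y and c < x.
    """
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii

def rect_sum(ii, x, y, w, h):
    """Sum of the pixels in the w-by-h rectangle with top-left corner (x, y)."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]

def two_rect_feature(ii, x, y, w, h):
    """Left half minus right half of a w-by-h window (w should be even)."""
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)

# Hypothetical grayscale patch: bright on the left, dark on the right.
patch = [[9, 9, 1, 1],
         [9, 9, 1, 1],
         [9, 9, 1, 1],
         [9, 9, 1, 1]]
ii = integral_image(patch)
edge_response = two_rect_feature(ii, 0, 0, 4, 4)
print(edge_response)  # large when there is a vertical edge in the window
```

Once the integral image is built, each rectangle sum takes four lookups no matter how big the rectangle is, which is what makes scanning many windows affordable.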

http://cs.marlboro.edu/ courses/ fall2011/ai/ notes/ Nov_17
last modified Thursday November 17 2011 8:13 am EST

