Apr 6
old business
I'm behind in my grading, but will get to your midterm projects "real soon".
assignment
I've posted an assignment due next Tuesday,
on the representation of graphs and on breadth-first
and depth-first search.
new business
We'll continue our discussion of the material in the graphs chapter
... which you should start reading if you haven't yet.
The specifics of the "word ladder" problem (and how to build that graph efficiently) and the "knight's tour" problem are worth looking at, but they are not the key points.
For the word ladder problem, building the graph means starting from a words.txt file with (say) one m-letter word per line, then turning each word into a vertex of a graph, with edges between words that differ by exactly one letter. As the book discusses, comparing every pair of words is an O(n**2) approach, where n is the number of words. A faster way is to make a hashtable whose keys are one-wildcard patterns (for example 'cat' would have '_at', 'c_t', 'ca_') and whose values are collections of the matching words - building that is an O(m*n) operation. Any two words in the same collection differ in exactly one letter and therefore have an edge between them.
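Here is a minimal sketch of that wildcard-bucket idea, not the book's code. The function name build_word_ladder_graph and the small word list are made up for illustration, and it assumes every word has the same length and appears only once.

```python
from collections import defaultdict
from itertools import combinations


def build_word_ladder_graph(words):
    """Return {word: set of neighbors} where neighbors differ by one letter.

    Assumes all words have the same length and there are no duplicates.
    """
    # Step 1: file each word under its one-wildcard keys,
    # e.g. 'cat' goes into the buckets '_at', 'c_t' and 'ca_'.
    buckets = defaultdict(list)
    for word in words:
        for i in range(len(word)):
            key = word[:i] + "_" + word[i + 1:]
            buckets[key].append(word)

    # Step 2: every pair of words sharing a bucket gets an edge.
    graph = {word: set() for word in words}
    for bucket in buckets.values():
        for a, b in combinations(bucket, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


# Hypothetical stand-in for the contents of words.txt.
words = ["cat", "cot", "cog", "dog", "bat"]
ladder = build_word_ladder_graph(words)
```

With this word list, ladder["cat"] contains 'cot' and 'bat', and one ladder from 'cat' to 'dog' runs cat, cot, cog, dog.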
What you should focus on (using other sources like
wikipedia as needed) are
- What is the technical definition of a graph, and what are the variations?
- What operations do we want a graph data structure to perform? (That's the book's "abstract graph" API, which is not the only choice.)
- How can we implement the storage of a graph in code?
- How do we do a breadth-first or depth-first search of a graph? (And what do those things mean?)
- What are some of the classic graph problems?
It turns out that both breadth-first and depth-first search can be done with the same algorithm - the only difference is whether a queue or a stack is used for "the fringe", the collection that holds the vertices scheduled to be visited. Each vertex is in one of three states: unvisited, scheduled, or visited. All start 'unvisited', are marked 'scheduled' when they are put into the fringe, and are marked 'visited' once they are removed from the fringe and processed. The algorithm looks like this:
- Mark all vertices 'unvisited'.
- Create an empty fringe.
- Choose one vertex as the starting point. Push it onto the fringe, and mark it 'scheduled'.
- Then loop until the fringe is empty:
  - Pull a vertex out of the fringe for processing and mark it 'visited'.
  - Push each of its 'unvisited' children onto the fringe, and mark them 'scheduled'.
  - Do whatever other processing of that vertex is needed. (Print it, look to see if it matches some criteria, etc.)
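The fringe's push and pop behave differently depending on whether it is a queue or a stack, and that gives the breadth-first or depth-first behavior: a queue hands back the vertex that has waited longest, while a stack hands back the one pushed most recently.
Notice that this is *not* a recursive approach: there is an explicit loop.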
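The steps above can be sketched in Python roughly as follows. This is one possible version under some assumptions, not the book's code: the graph is a dictionary mapping each vertex to a list of its children, every child also appears as a key, and the names search and breadth_first are invented for this example. A collections.deque serves as both queue and stack, so only the "pull" line changes.

```python
from collections import deque


def search(graph, start, breadth_first=True):
    """Return the vertices reachable from start, in the order visited.

    graph is assumed to be {vertex: [children]}, with every child
    also present as a key.
    """
    # Mark all vertices 'unvisited' and create an empty fringe.
    state = {vertex: "unvisited" for vertex in graph}
    fringe = deque()

    # Push the starting vertex and mark it 'scheduled'.
    fringe.append(start)
    state[start] = "scheduled"

    order = []
    while fringe:
        # Queue: take the oldest entry. Stack: take the newest.
        vertex = fringe.popleft() if breadth_first else fringe.pop()
        state[vertex] = "visited"

        # Schedule the children that have not been seen yet.
        for child in graph[vertex]:
            if state[child] == "unvisited":
                fringe.append(child)
                state[child] = "scheduled"

        # "Other processing": here we just record the visit order.
        order.append(vertex)
    return order


# A small made-up directed graph for trying it out.
example = {
    "a": ["b", "c"],
    "b": ["d"],
    "c": ["d"],
    "d": [],
}
bfs_order = search(example, "a")                       # ['a', 'b', 'c', 'd']
dfs_order = search(example, "a", breadth_first=False)  # ['a', 'c', 'd', 'b']
```

Both calls run the same loop; the queue version finishes everything one step from 'a' before going deeper, while the stack version follows 'c' down to 'd' before coming back to 'b'.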
Remember that any of these can be implemented either as classes with methods for the API, or as Python data structures (like lists-of-lists) with functions that act on them.
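To make those two styles concrete, here is a small sketch of the same undirected graph storage written both ways. The names (add_edge, neighbors, Graph, adjacent) are illustrative choices, not the book's API, and a dictionary of sets stands in for whichever structure you prefer.

```python
# Style 1: a plain dictionary of sets, with functions that act on it.
def add_edge(graph, a, b):
    """Add an undirected edge between a and b to the dictionary graph."""
    graph.setdefault(a, set()).add(b)
    graph.setdefault(b, set()).add(a)


def neighbors(graph, vertex):
    """Return the set of vertices adjacent to vertex (empty if unknown)."""
    return graph.get(vertex, set())


# Style 2: the same storage wrapped in a class with methods.
class Graph:
    def __init__(self):
        self.adjacent = {}  # maps each vertex to the set of its neighbors

    def add_edge(self, a, b):
        self.adjacent.setdefault(a, set()).add(b)
        self.adjacent.setdefault(b, set()).add(a)

    def neighbors(self, vertex):
        return self.adjacent.get(vertex, set())


plain = {}
add_edge(plain, "a", "b")
add_edge(plain, "a", "c")

obj = Graph()
obj.add_edge("a", "b")
obj.add_edge("a", "c")
```

Either way, asking for the neighbors of 'a' gives back 'b' and 'c'; the choice is mostly about how you like to organize the code.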
And remember too that if you would like to use the book's code or my code as a starting point, quote your sources and be clear about what is your own work.
Other sources :
We'll start discussing this material in class and see how far we get.
Aside: