Prev: Elementary Data Structures: Part 2 Next: Binary Heaps |
What Every CMPT 225 Student Should Know About Mathematical Induction
- by Brad Bart, Simon Fraser University | |||||||||||||
Part 1: Overview of Induction
Part 2: How Induction Relates to Computer Science
Part 3: Induction in CMPT 225
Decision Tree of a Dynamic Set:
|
Part 3: Mathematical Induction in CMPT 225
More Complex Linked Data Structures
By using a sorted array to implement a dynamic set, there was a trade-off: an excellent O(log n) running time for search, but with an expensive O(n) running time for insert. Ideally, you should seek an implementation that has an excellent running time for all operations. |
In this context, O(1) is optimal, and
O(log n) is excellent. | ||||||||||||
|
Yes, these sound like the properties of a sorted array, but keep an open mind: a sorted array might not be the only data structure that works. | |||||||||||||
A depiction of the search strategy, or decision tree, is shown in the figure [left]. The decision? If the target key is less than x, then recursively search the subset on the left, but if the target key is greater than x, try the subset on the right. Of course, if the target key equals x, then it's a successful search! Doesn't the figure remind you of a binary tree? If you've never seen a binary tree before, it is like a linked list, but every node has two links to other nodes instead of just one. These links are said to point to the left child and right child or left subtree and right subtree, and the links are usually drawn pointed at angles facing downward. Furthermore, the links may never form a cycle. So, define a binary search tree to be a binary tree that stores keys that are arranged as described above. Stated more formally: For every node in the tree which contains the key x, Sample Binary Search Tree with Integer Keys:
Since every subtree of a binary search tree is itself a binary search tree [It's a recursive definition!] then the correctness and the running time of the implementations for both search and insert will be easy to prove using strong induction. Implementation
To search for a | ||||||||||||||
Successful Search for
|
Insert
Running Time Analysis The running time for both operations is O(log n), as long as each comparison eliminates roughly half of the remaining possibilities. Such a tree is said to be balanced. |
As stated before, the split doesn't
have to be dead even for the running time to be O(log n).
Maintaining the balance of the tree is important, otherwise
the running time for both operations may grow to
O(n). One example is if you inserted n
keys in increasing order: you would get an anemic tree, and
the bottommost keys would take
a linear number of steps to search.
Two Examples of Unbalanced Binary Search Trees:
There are algorithms that can keep the tree balanced after every
operation, and you will probably see two of them in CMPT 225:
the AVL tree and the 2-3 tree2.
Your instructor might also show you the
red/black tree, but
they might defer this one until CMPT 307 if
they are feeling humane.
2Also known as a 2-3-4 tree or a B-tree.
Summary
By stating the desired properties of the dynamic set,
a recursive definition, you derived the
binary search tree. You not only used recursion
to develop the data structure, but you used
induction to justify the excellent running times
of both operations: altogether a pretty amazing analysis.
In the next section, you're going to examine implementations
of a priority queue. An excellent implementation will also
result from a recursive definition: the binary heap.
|