previous next
Prev: Elementary Data Structures: Part 1
Next: Binary Search Trees
What Every CMPT 225 Student Should Know About Mathematical Induction

- by Brad Bart, Simon Fraser University

 

Part 1: Overview of Induction Part 2: How Induction Relates to Computer Science Part 3: Induction in CMPT 225

 

Unsorted Array: Inserting the nth key is O(1)

Part 3: Mathematical Induction in CMPT 225

Case Study: Dynamic Set Implementation Using an Array

It is time to examine two potential implementations of the dynamic set operations insert and search, using an array as the underlying data structure.

Remember, in each implementation, the array A[0..n-1] will hold the n keys currently in the dynamic set, and the variable len will maintain the current number of keys.

Implementation 1: Keys are in first n slots, in any order

In order to insert the nth key, the simplest move is to place
it in slot A[n-1], appending to the contiguous block of the n − 1 previously inserted keys.

This is an O(1) operation like so:

void insert(int key) {
  A[len++] = key;
}
However, to search for a key will be more expensive, requiring a linear scan, which is an O(n) operation:
// linear search - returns -1 if not present
int search(int key) {
  for (int i = 0; i < len; i++)
    if (key == A[i]) return i;
  return -1;
}

Implementation 2: Keys are in sorted order

But instead if the current keys were stored as a sorted array then the search operation could run in O(log n) time. This is another instance of Divide & Conquer:

  • Divide: Probe the middle element of the array and compare it with the search key. Cost: O(1).

  • Conquer: Search the half of the list that is still valid. Cost: T(⎡n/2⎤).

  • Combine: Return the result of the subsearch. Cost: O(1).

This is, of course, the familiar binary search algorithm. One possible implementation:
// binary search - returns -1 if not present
int search(int key) {
  int first = 0, last = len-1;
  while (first < last) {
    int mid = (first+last)/2;
    if (A[mid] < key) // search upper half
      first = mid + 1;
    else // search lower half
      last = mid;
  }
  if (A[first] == key) return first;
  else return -1;
}
The resulting recursive relation, T(n) = T(⎡n/2⎤) + O(1), yields a running time of T(n) = O(log n), which is magically delicious: a mere 30 loops when n = 109.

 

While we're on the subject, any recurrence of the form:

      T(n) = T(n/b) + O(1), b > 1

yields T(n) = O(log n).

Thus, if your algorithm discards any constant fraction of elements on each loop, the running time is still a sweet O(log n).

Sorted Array: Inserting the nth key is O(n)

But here's the rub − the array must remain sorted after every insert. To insert into a sorted list in such a way as to keep it sorted requires the same number of steps as the Insertion Sort algorithm's final iteration: the potential exists for the shifting of O(n) elements, and so insert runs in worst-case linear time.

void insert(int key) {
  // shift all A[] > key
  int j = len-1;
  while (j >= 0 && A[j] > key) {
    A[j+1] = A[j];
    j--;
  }
  A[j+1] = key;
  len++;
}

 


Because of the linear layout of the data structure, at least one of the two operations had a running time of O(n). It seems hopeless that there is something better: a data structure that is sublinear in both insert and search.

But from the ashes of Implementation 2 will rise a strategy that succeeds at doing just that, all by using an inductive argument. This is the Binary Search Tree.


next next        Next: Binary Search Trees