17. Some Basic Algorithms

In the following notes we’ll look at a number of basic algorithms that can be applied to vectors of numbers. We’ll use vector<int> throughout, but they can be used with other kinds of numbers and objects (e.g. strings).

Many of these functions already exist in the C++ standard template library (the STL). Here we are interested in seeing how to implement these functions, and so we won’t use the STL.

17.1. Summing a vector

Here’s a basic function for summing a vector of numbers:

int sum(const vector<int>& v) {
  int result = 0;
  for(int i = 0; i < v.size(); ++i) {
    result += v[i];
  }
  return result;
}

This sums the entire vector, which may not always be what we want. A more general version is this:

// Pre:
// Post: returns v[begin] + v[begin + 1] + ... + v[end - 1]
int sum(const vector<int>& v, int begin, int end) {
  int result = 0;
  for(int i = begin; i < end; ++i) {
    result += v[i];
  }
  return result;
}

Here, begin is the index of the first element of v we want in the sum, and end is the index of one past the last element we want included.

For example:

// v = {2, 1, 4, 5}
sum(v, 0, 4) == 12
sum(v, 0, 3) == 7
sum(v, 1, 4) == 10
sum(v, 1, 3) == 5

We can now easily implement our original sum function using the more general-purpose one:

int sum(const vector<int>& v) {
   return sum(0, v.size());
}

In terms of performance, summing a vector of n elements with these functions always requires doing about n additions (n - 1, to be exact).

17.2. Finding the min/max of a vector

Here’s the main idea for finding the min (analogously, the max) of a vector:

// Pre: end - begin >= 1 (i.e. range is non-empty)
// Post: returns i such that v[i] <= v[j] for begin <= j < end
int min_index(const vector<int>& v, int begin, int end) {
  int min_so_far = begin;
  for(int i = begin + 1; i < end; ++i) {
    if (v[i] < v[min_so_far]) {
      min_so_far = i;
    }
  }
  return min_so_far;
}

An important detail is that there must be at least one element in the range of the vector being checked. Otherwise the first assignment to min_so_far will fail.

min_index forms the basis of some other useful min functions:

int min_index(const vector<int>& v) {
  return min_index(v, 0, v.size());
}

int min_value(const vector<int>& v, int begin, int end) {
  return v[min_index(v, begin, end)];
}

int min_value(const vector<int>& v) {
  return min_value(v, 0, v.size());
}

Here is some test code:

void min_test() {
  cout << "min_test called ...\n";
  vector<int> v;
  v.push_back(3);
  assert(min_value(v, 0, 1) == 3);

  v.push_back(4);
  assert(min_value(v, 0, 1) == 3);
  assert(min_value(v, 0, 2) == 3);
  assert(min_value(v, 1, 2) == 4);

  v.push_back(1);
  assert(min_value(v, 0, 1) == 3);
  assert(min_value(v, 0, 2) == 3);
  assert(min_value(v, 1, 2) == 4);
  assert(min_value(v, 1, 3) == 1);
  assert(min_value(v, 2, 3) == 1);

  cout << "... all min_test tests passed\n";
}

In terms of performance, min algorithms are usually measured by how many equality comparisons (calls to ==) they do. On an n-element vector, it’s necessary to do n - 1 comparisons in all cases: any fewer would mean some elements of the vector are not being checked.

17.3. Reversing a vector

Reversing the elements of a vector has a number of uses. For instance, sorting a vector in reverse numerical order is often done by first sorting it in ascending numerical order, and then reversing it. This is often easier and faster than writing a special-purpose reverse-sorting function.

Here is an implementation of reverse:

// Pre: a has value x, b has value y
// Post: a has value y, b has value x
void swap(int& a, int& b) {
  int temp = a;
  a = b;
  b = temp;
}

// Pre:
// Post: reverses the order of v[begin], ... v[end - 1] in place (i.e. v
// is modified)
void reverse(vector<int>& v, int begin, int end) {
  while (end - begin > 1) {
    swap(v[begin], v[end - 1]);   // note v[end - 1], not v[end]
    ++begin;
    --end;
  }
}

void reverse(vector<int>& v) {
  reverse(v, 0, v.size());
}

Swapping elements of a vector is common in many simple algorithms and so we’ve introduced the swap helper function here. Note that it’s essential that the values passed to swap are pass-by-reference (since they will be changed).

Also, since we use pass-by-reference when calling reverse, the elements of v are actually modified. Thus, this is sometimes referred to as an in- place algorithm, i.e. in-place reverse.

In terms of performance, reversing a vector needs to do about n / 2 swaps for an n-element vector.

17.4. Randomly Shuffling the Elements of a vector

Randomly shuffling the elements of a vector is a useful operation in many programs that need randomness. The standard algorithm for shuffling a vector is know as the Fisher-Yates algorithm:

// Pre:
// Post: v is randomly shuffled such that all permutations
//       are equally likely.
void shuffle(vector<int>& v) {  // v passed by reference
  for(int i = v.size() - 1; i >= 1; --i) {
    int r = randint(0, i + 1);  // 0 <= r <= i
    swap(v[i], v[r]);
  }
}

The basic idea is as follows: the index variable i moves from the right to the left through the vector. A random index, r is chosen between 0 and i inclusive (i.e. r might be set to i), and then the locations of v[i] and v[r] are swapped.

The trickiest part of a shuffling algorithm is ensuring that the shuffle is truly random in the sense that all possible permutations of v are equally likely. We won’t go into the details of why, but the Fisher-Yates algorithm does indeed give a true random shuffle — if implemented correctly.

An error that you can imagine is easy to make is to write randint(0, i) instead of randint(0, i + 1). When you call randint(0, i), the possible return values are from 0 up to, and including i - 1 (not i); and when you call randint(0, i + 1), the possible values range from 0, up to and including, i.

This one small difference has a huge impact on the overall correctness of the algorithm: using randint(0, i) causes it to no longer returns truly random shuffles.

Another tricky issue is being sure that the randint function is truly and fairly random? That turns out to be a very difficult question to answer since generating random numbers with a computer is surprisingly difficult. Our randint function is just a wrapper for the standard C++ rand() function, and we trust that rand() is a reliable random number generator.

But how does rand() work? Here’s one common implementation:

static unsigned long next = 1;

/* RAND_MAX assumed to be 32767 */
int rand() {
    next = next * 1103515245 + 12345;
    return((unsigned)(next/65536) % 32768);
}

void srand(unsigned seed) {
    next = seed;
}

This rand() function implements what is known as a linear congruential pseudo-random number generator. It is relatively efficient, and produces numbers that are random enough for most applications. While we won’t go into of the details of it here, we note that the literal numbers rand() uses must be chosen with some care.

The srand function is used to initialize the random number generator: the sequence of numbers returned by multiple calls to rand() depend entirely upon the seed given to srand. Thus, if you run the same program multiple times using the same seed, the random numbers will be the same for each run. This can be helpful for debugging, but a serious problem for “finished” programs. To get a random seed we often call srand like this:

srand(time(NULL));

This statement sets the seed based on the current time, which is different for each run of a program.

17.5. Questions

  1. Write the following functions:

    // Pre:
    // Post: return the average of v[begin], v[begin + 1],
    //       ... v[end - 1]
    double average(const vector<double>& v, int begin, int end)
    
    // Pre:
    // Post: return the average of v
    double average(const vector<double>& v)
    
  2. Write the following functions:

    // Pre:
    // Post: returns the sum of the positive numbers in v
    int sum_positive(const vector<int>& v)
    
    // Pre:
    // Post: returns the sum of the negative numbers in v
    int sum_negative(const vector<int>& v)
    
  3. Write the following function:

    // Pre:
    // Post: multiplies each element of v by x
    void scale(vector<double>& v, double x)
    
  4. In the notes, four functions were written for finding the min or its index. Write the same four functions for finding the max or its index.

  5. Write the following function:

    // Pre:
    // Post:
    void reverse(string& s)
    
  6. A palindrome is a string that is the same forwards and backwards, e.g. “I”, “pop”, “racecar”, and “nomelon nolemon” are all palindromes. Write the following function:

    // Pre:
    // Post: return true if s is a palindrome, and false otherwise
    bool is_palindrome(const string& s)
    
  7. Write the following functions:

    // Pre:
    // Post: returns the number of times c appears in the substring
    //       s[begin], s[begin + 1], ..., s[end - 1]
    int count(char c, const string& s, int begin, int end)
    
    // Pre:
    // Post: returns the number of times c appears in s
    int count(char c, const string& s)