Some Basic Algorithms ===================== In the following notes we'll look at a number of basic algorithms that can be applied to vectors of numbers. We'll use ``vector`` throughout, but they can be used with other kinds of numbers and objects (e.g. strings). Many of these functions already exist in the C++ standard template library (the STL). Here we are interested in seeing how to implement these functions, and so we won't use the STL. Summing a vector ---------------- Here's a basic function for summing a vector of numbers:: int sum(const vector& v) { int result = 0; for(int i = 0; i < v.size(); ++i) { result += v[i]; } return result; } This sums the *entire* vector, which may not always be what we want. A more general version is this:: // Pre: // Post: returns v[begin] + v[begin + 1] + ... + v[end - 1] int sum(const vector& v, int begin, int end) { int result = 0; for(int i = begin; i < end; ++i) { result += v[i]; } return result; } Here, ``begin`` is the index of the first element of ``v`` we want in the sum, and ``end`` is the index of **one past** the last element we want included. For example:: // v = {2, 1, 4, 5} sum(v, 0, 4) == 12 sum(v, 0, 3) == 7 sum(v, 1, 4) == 10 sum(v, 1, 3) == 5 We can now easily implement our original ``sum`` function using the more general-purpose one:: int sum(const vector& v) { return sum(0, v.size()); } In terms of performance, summing a vector of n elements with these functions always requires doing about n additions (n - 1, to be exact). Finding the min/max of a vector ------------------------------- Here's the main idea for finding the min (analogously, the max) of a vector:: // Pre: end - begin >= 1 (i.e. range is non-empty) // Post: returns i such that v[i] <= v[j] for begin <= j < end int min_index(const vector& v, int begin, int end) { int min_so_far = begin; for(int i = begin + 1; i < end; ++i) { if (v[i] < v[min_so_far]) { min_so_far = i; } } return min_so_far; } An important detail is that there must be at least one element in the range of the vector being checked. Otherwise the first assignment to ``min_so_far`` will fail. ``min_index`` forms the basis of some other useful min functions:: int min_index(const vector& v) { return min_index(v, 0, v.size()); } int min_value(const vector& v, int begin, int end) { return v[min_index(v, begin, end)]; } int min_value(const vector& v) { return min_value(v, 0, v.size()); } Here is some test code:: void min_test() { cout << "min_test called ...\n"; vector v; v.push_back(3); assert(min_value(v, 0, 1) == 3); v.push_back(4); assert(min_value(v, 0, 1) == 3); assert(min_value(v, 0, 2) == 3); assert(min_value(v, 1, 2) == 4); v.push_back(1); assert(min_value(v, 0, 1) == 3); assert(min_value(v, 0, 2) == 3); assert(min_value(v, 1, 2) == 4); assert(min_value(v, 1, 3) == 1); assert(min_value(v, 2, 3) == 1); cout << "... all min_test tests passed\n"; } In terms of performance, min algorithms are usually measured by how many equality comparisons (calls to ``==``) they do. On an n-element vector, it's necessary to do n - 1 comparisons in all cases: any fewer would mean some elements of the vector are not being checked. Reversing a vector ------------------ Reversing the elements of a vector has a number of uses. For instance, sorting a vector in reverse numerical order is often done by first sorting it in ascending numerical order, and then reversing it. This is often easier and faster than writing a special-purpose reverse-sorting function. Here is an implementation of reverse:: // Pre: a has value x, b has value y // Post: a has value y, b has value x void swap(int& a, int& b) { int temp = a; a = b; b = temp; } // Pre: // Post: reverses the order of v[begin], ... v[end - 1] in place (i.e. v // is modified) void reverse(vector& v, int begin, int end) { while (end - begin > 1) { swap(v[begin], v[end - 1]); // note v[end - 1], not v[end] ++begin; --end; } } void reverse(vector& v) { reverse(v, 0, v.size()); } Swapping elements of a vector is common in many simple algorithms and so we've introduced the ``swap`` helper function here. Note that it's essential that the values passed to ``swap`` are pass-by-reference (since they will be changed). Also, since we use pass-by-reference when calling ``reverse``, the elements of ``v`` are actually modified. Thus, this is sometimes referred to as an in- place algorithm, i.e. in-place reverse. In terms of performance, reversing a vector needs to do about n / 2 swaps for an n-element vector. Randomly Shuffling the Elements of a vector ------------------------------------------- Randomly shuffling the elements of a vector is a useful operation in many programs that need randomness. The standard algorithm for shuffling a vector is know as the Fisher-Yates algorithm:: // Pre: // Post: v is randomly shuffled such that all permutations // are equally likely. void shuffle(vector& v) { // v passed by reference for(int i = v.size() - 1; i >= 1; --i) { int r = randint(0, i + 1); // 0 <= r <= i swap(v[i], v[r]); } } The basic idea is as follows: the index variable ``i`` moves from the right to the left through the vector. A random index, ``r`` is chosen between 0 and ``i`` inclusive (i.e. ``r`` might be set to ``i``), and then the locations of ``v[i]`` and ``v[r]`` are swapped. The trickiest part of a shuffling algorithm is ensuring that the shuffle is truly random in the sense that all possible permutations of ``v`` are equally likely. We won't go into the details of why, but the Fisher-Yates algorithm does indeed give a true random shuffle --- if implemented correctly. An error that you can imagine is easy to make is to write ``randint(0, i)`` instead of ``randint(0, i + 1)``. When you call ``randint(0, i)``, the possible return values are from 0 up to, and including ``i - 1`` (not ``i``); and when you call ``randint(0, i + 1)``, the possible values range from 0, up to and including, ``i``. This one small difference has a huge impact on the overall correctness of the algorithm: using ``randint(0, i)`` causes it to no longer returns truly random shuffles. Another tricky issue is being sure that the ``randint`` function is truly and fairly random? That turns out to be a very difficult question to answer since generating random numbers with a computer is surprisingly difficult. Our ``randint`` function is just a wrapper for the standard C++ ``rand()`` function, and we trust that ``rand()`` is a reliable random number generator. But how does ``rand()`` work? Here's one common implementation:: static unsigned long next = 1; /* RAND_MAX assumed to be 32767 */ int rand() { next = next * 1103515245 + 12345; return((unsigned)(next/65536) % 32768); } void srand(unsigned seed) { next = seed; } This ``rand()`` function implements what is known as a *linear congruential pseudo-random number generator*. It is relatively efficient, and produces numbers that are random enough for most applications. While we won't go into of the details of it here, we note that the literal numbers ``rand()`` uses must be chosen with some care. .. note:: One important application where ``rand()`` might not be good enough is cryptography: `the random numbers used in computer security usually need to be more carefully generated in order to prevent hackers from predicting them `_. The ``srand`` function is used to initialize the random number generator: the sequence of numbers returned by multiple calls to ``rand()`` depend entirely upon the seed given to ``srand``. Thus, if you run the same program multiple times using the same seed, the random numbers will be the *same* for each run. This can be helpful for debugging, but a serious problem for "finished" programs. To get a random seed we often call ``srand`` like this:: srand(time(NULL)); This statement sets the seed based on the current time, which is different for each run of a program. Questions --------- #. Write the following functions:: // Pre: // Post: return the average of v[begin], v[begin + 1], // ... v[end - 1] double average(const vector& v, int begin, int end) // Pre: // Post: return the average of v double average(const vector& v) #. Write the following functions:: // Pre: // Post: returns the sum of the positive numbers in v int sum_positive(const vector& v) // Pre: // Post: returns the sum of the negative numbers in v int sum_negative(const vector& v) #. Write the following function:: // Pre: // Post: multiplies each element of v by x void scale(vector& v, double x) #. In the notes, four functions were written for finding the min or its index. Write the same four functions for finding the max or its index. #. Write the following function:: // Pre: // Post: void reverse(string& s) #. A *palindrome* is a string that is the same forwards and backwards, e.g. "I", "pop", "racecar", and "nomelon nolemon" are all palindromes. Write the following function:: // Pre: // Post: return true if s is a palindrome, and false otherwise bool is_palindrome(const string& s) #. Write the following functions:: // Pre: // Post: returns the number of times c appears in the substring // s[begin], s[begin + 1], ..., s[end - 1] int count(char c, const string& s, int begin, int end) // Pre: // Post: returns the number of times c appears in s int count(char c, const string& s)