Analyzing Algorithms¶
In these notes, we’ll introduce O-notation, also known as asymptotic notation, as a mathematical tool for discussing algorithm performance. It is commonly used both in the theoretical analysis of algorithms and in practical programming for estimating an algorithm’s running time.
While we will look exclusively at algorithm performance, there are many other questions about algorithms that computer scientists investigate, such as:
- Memory efficiency: How much RAM does it use? Does it work well with the caches on a CPU?
- Simplicity: How easy is it to understand, modify, or extend the algorithm?
- Generality: Does the algorithm apply to a wide range of inputs, or is it restricted to special cases? For example, linear search is more general than binary search, since linear search only needs to compare elements using ==, while binary search also requires its input to be sorted.
- Fault-tolerance: How well does the algorithm handle errors?
- Parallelizability: Can the algorithm be made to run efficiently on multiple CPUs?
- Security: Could the algorithm cause security problems? Could it be hacked, or somehow used in a malicious way?
Algorithms and Input¶
To analyze algorithms mathematically, we usually make some simplifying assumptions. We will assume that an algorithm \(A\) takes an input of size \(n\). Typically, this input is a sequence (e.g. an array, a string, a sequence of bits) with \(n\) elements. We also assume that \(A\) eventually stops, i.e. it runs for only a finite amount of time.
For example, linear search takes a list of \(n\) elements and a target value \(x\) as input, and returns the index location of an element that equals \(x\); if \(x\) is not in the list, -1 is returned:
1. linear_search([\(a_0, a_1, \ldots, a_{n-1}\)], \(x\))
2. \(i \leftarrow 0\)
3. if \(a_i = x\) then return \(i\)
4. if \(i = n - 1\) then return \(-1\)
5. \(i \leftarrow i + 1\)
6. go to step 3
We’ve written this algorithm in pseudocode instead of a real programming language in order to make it more readable. The precise syntactic details of a language rarely matter in the kind of analysis we are doing, so we leave them out.
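For concreteness, here is one way the pseudocode might be translated into Python (a sketch; the function name and details are our own):

```python
def linear_search(a, x):
    """Return the index of an element of a equal to x, or -1 if x is absent."""
    for i in range(len(a)):
        if a[i] == x:  # the item comparison from step 3 of the pseudocode
            return i
    return -1

print(linear_search([4, 2, 7, 2], 7))  # prints 2
print(linear_search([4, 2, 7, 2], 9))  # prints -1
```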
Measuring Running Time¶
Since we are interested in how quickly algorithms run, you might think that we would use real time — such as seconds or microseconds — to measure performance. We certainly judge real-world software as being fast or slow by how much time it takes.
But real time is difficult to measure for programs because different computers run at different speeds. To fairly compare two programs, we have to make sure they are run on exactly the same computer, with exactly the same software. We also have to be careful about what other software is running, e.g. if you check your email or watch a video while a program is running, you might get inaccurate timings.
Key Operations¶
So, to avoid these problems, computer scientists often estimate an algorithm’s running time by counting how many times it executes a carefully chosen key operation. A key operation is typically the most frequently executed instruction in an algorithm. The main advantage of this approach is that it is independent of hardware and software details, and it can even be applied to algorithms that exist only as pseudocode on paper.
It’s very important to choose a sensible key operation that reflects the work the algorithm does. For example, in our linear search algorithm, there are a few reasonable choices for the key operation:
- Item comparisons using \(=\).
- Item accesses, i.e. a use of \(a_i\), which corresponds to accessing an element of an array or vector.
- Variable assignments using \(\leftarrow\).
Traditionally, item comparisons, i.e. calls to \(==\) or \(\leq\), are the key operation for sorting and searching algorithms.
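To make counting concrete, here is a sketch of linear search instrumented to tally its comparisons; the counter is bookkeeping we’ve added for illustration, not part of the algorithm itself:

```python
def linear_search_counted(a, x):
    """Linear search that also returns how many item comparisons it made."""
    comparisons = 0
    for i in range(len(a)):
        comparisons += 1  # one execution of the key operation (==)
        if a[i] == x:
            return i, comparisons
    return -1, comparisons

# Worst case: x is absent, so all n elements are compared.
print(linear_search_counted([5, 1, 9, 3], 7))  # (-1, 4)

# Best case: x is the first element, so only 1 comparison is made.
print(linear_search_counted([5, 1, 9, 3], 5))  # (0, 1)
```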
Algorithm Running Time¶
We will refer to the key instruction counts as the running time of the algorithm. Keep in mind that even though we call this time, it is not real time — it is just a count of how many times the key instruction is executed.
Care must be taken when comparing the running times of algorithms with different key operations. For example, comparing two strings may take longer than comparing two integers, and so 1000 string comparisons may take much longer than 1000 integer comparisons. It’s just like when you compare money: if you have $1000 Canadian and $1000 American, it doesn’t make sense to say that you have $2000 in total.
We should also be clear if we are talking about the best case, average case, or worst case running time. In practice, the average case and worst case running times are usually the most useful.
O-Notation¶
Getting precise instruction counts of even simple algorithms is often quite difficult. Plus, simple key operations, such as comparisons or additions, can be done so quickly by most computers that there is often little practical difference between an algorithm that does \(n\) comparisons and one that does \(n + 1000\) comparisons.
For these reasons we often don’t describe running times of algorithms using precise instruction counts. Instead, we use O-notation (also known as big O-notation, order notation, or asymptotic notation) as a mathematically respectable way of ignoring details we deem unimportant.
Roughly, O-notation says that the order of a mathematical expression is its biggest term (ignoring constants). For example, we say that the expression \(2n\) is in \(O(n)\), and that \(25n^7 + 20\) is in \(O(n^7)\).
Intuitively, \(O(n)\) is the set of all mathematical expressions of a single variable \(n\) whose highest-order term is \(n\), or lower. Similarly, \(O(n^3)\) is the set of all expressions whose highest order term is \(n^3\) or lower; thus \(O(n)\) is a subset of \(O(n^3)\).
The expression \(45n - 80\) is in both \(O(n)\) and \(O(n^3)\). We say that \(O(n)\) is a tighter bound than \(O(n^3)\). In practice, we usually want the tightest O-notation description for an expression, since saying something like “\(2n\) is \(O(n^{500})\)” is true but unhelpful.
It’s easy to determine the high-order term for a polynomial. For example, all of these expressions are in \(O(n^2)\): \(n^2\), \(n^2+100\), \(n^2-100\), \(2n^2+n\), \(100n^2 + 500n + 700\). If you say that a particular algorithm runs in \(O(n^2)\) time, then what you mean is that the key instruction count can be described by some expression from \(O(n^2)\) (but we don’t know which one).
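To see why only the high-order term matters, try \(n = 1000\) in \(100n^2 + 500n + 700\): the total is \(100{,}000{,}000 + 500{,}000 + 700 = 100{,}500{,}700\), so the \(100n^2\) term accounts for more than 99% of the sum, and its share only grows as \(n\) gets larger.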
Mathematical Definition¶
Here is a precise definition of O-notation:
Definition of O-notation. Algorithm \(A\) is order \(f(n)\), written \(O(f(n))\), if there exist fixed constants \(k\) and \(n_0\) such that \(A\) requires no more than \(kf(n)\) key operations to process an input of size \(n \geq n_0\).
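For example, we can check from this definition that an algorithm doing exactly \(45n - 80\) key operations is \(O(n)\): since \(45n - 80 \leq 45n\) for all \(n \geq 1\), the constants \(k = 45\) and \(n_0 = 1\) satisfy the definition.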
Using this definition, we can rigorously prove all the basic facts about O-notation that we mentioned above. However, we will not go into the details of such proofs in this course.
O-notation Applied to Algorithms¶
O-expression | Name | Example |
---|---|---|
\(O(1)\) | constant | \(1, 10, 34, 3538, \ldots\) |
\(O(\log\, n)\) | logarithmic | \(8\,\log_2 n + 5\) |
\(O(n)\) | linear | \(29n + 5\) |
\(O(n\, \log\, n)\) | linearithmic | \(n\,\log_3 n + 23n - 4\) |
\(O(n^2)\) | quadratic | \(7n^2 - 4n + 22\) |
\(O(n^3)\) | cubic | \(n^3 + 2n^2 + 22n + 1\) |
\(O(2^n)\) | exponential | \(2^n + n^4 - 2n^2 + 80\) |
We use the O-notation categories in the above table to describe the running times of algorithms. For example:
Linear search does \(O(n)\) comparisons in the average/worst case, and so we say it is a linear-time algorithm. In the best case, linear search does only 1 comparison, and so in that case, it is \(O(1)\), i.e. constant time.
Linear insertion sort does \(O(n^2)\) comparisons in the worst and average cases, and so it is a quadratic-time algorithm. In the best case (e.g. when the data is already sorted), insertion sort does only \(O(n)\) comparisons, and so it is linear in that case.
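Here is a sketch of linear insertion sort in Python, where the inner loop’s \(>\) test is the key comparison:

```python
def insertion_sort(a):
    """Sort the list a in place using linear insertion sort."""
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        # Shift larger elements one position right; each a[j] > key
        # test is an item comparison.
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key
```

On sorted input the while-test fails immediately for every \(i\), giving about \(n\) comparisons in total; on reverse-sorted input every element is shifted all the way to the front, giving about \(n^2/2\) comparisons.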
Quicksort does \(O(n \log n)\) comparisons in the average case, but degrades to \(O(n^2)\) in the worst case. The worst case occurs very rarely, and so many programmers, perhaps unwisely, treat it as if it did \(O(n\log n)\) comparisons in all cases.
Another important sorting algorithm is mergesort, which does \(O(n \log n)\) comparisons in all cases, even the worst case. However, compared to quicksort, mergesort uses about twice as much memory, and is usually a little slower on average.
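A minimal mergesort sketch in Python; note the extra memory used by the temporary lists created during merging:

```python
def merge_sort(a):
    """Return a new sorted list containing the elements of a."""
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    left = merge_sort(a[:mid])
    right = merge_sort(a[mid:])
    # Merge the two sorted halves; each left[i] <= right[j]
    # test is an item comparison.
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged
```

There are about \(\log_2 n\) levels of recursion, and the merging at each level does \(O(n)\) comparisons, which is where the \(O(n \log n)\) total comes from.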
Binary search does \(O(\log\,n)\) comparisons in the worst and average cases (in the best case it does only 1 comparison), and so it is a logarithmic algorithm. This is not quite as good as \(O(1)\), but in practice logarithmic algorithms are often extremely fast.
Note that the base of a logarithm doesn’t matter when we describe it with O-notation: \(\log_a x\) and \(\log_b x\) differ only by a constant factor, and so they are both \(O(\log n)\).
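A sketch of iterative binary search in Python; it assumes the list a is sorted in ascending order:

```python
def binary_search(a, x):
    """Return the index of x in sorted list a, or -1 if x is absent."""
    lo, hi = 0, len(a) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if a[mid] == x:    # item comparison (the key operation)
            return mid
        elif a[mid] < x:   # another item comparison
            lo = mid + 1
        else:
            hi = mid - 1
    return -1
```

Each iteration halves the search range, so at most about \(\log_2 n\) iterations, and thus \(O(\log n)\) comparisons, are needed.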
The min of an unsorted array can be found in \(O(n)\) time in all cases, since every element must be examined at least once.
The min of a sorted array can be found in \(O(1)\), or constant time, no matter how long the array is. That’s because the first element of a sorted array is always the smallest, and so we can simply return it without looking at any other elements.
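In Python, the two cases might look like this (a sketch; both assume the list a is non-empty):

```python
def min_unsorted(a):
    """O(n): every element of an unsorted list must be examined."""
    smallest = a[0]
    for x in a[1:]:
        if x < smallest:  # item comparison
            smallest = x
    return smallest

def min_sorted(a):
    """O(1): an ascending sorted list always has its minimum first."""
    return a[0]
```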
Questions¶
- Besides run-time performance, what are three other kinds of questions that computer scientists may be interested in asking about an algorithm?
- When analyzing algorithms, why do we usually write them in pseudocode instead of a programming language?
- What are three reasons why it can be difficult to measure the real-time performance of an algorithm?
- What is a key operation?
- What is the usual key operation for searching and sorting algorithms?
- Give an example of an algorithm whose key operation is addition.
- What are the three main running-time cases in algorithm analysis? Which is usually the least important?
- Why is it often difficult to compare the running-times of algorithms that have different key operations?
- Write the simplest and tightest O-notation expression for each of the following:
  - \(25\)
  - \(n - 100\)
  - \(500n + 3000\)
  - \(4n^2 - 8000n - 5000\)
  - \(n^2 + 1\)
  - \(n^3 + n^2 + n + 1\)
  - \(n^4 - 5000n^3 + 20n^2 + \log_4\,n - 143\)
  - \(2^{n} + n^{10} - 25n^2\)
- For each of the following, name, or briefly describe, an algorithm that has the given performance characteristics; give a different algorithm for each question:
  - \(O(1)\) in its worst case;
  - \(O(1)\) in its best case, but \(O(n)\) in its worst case;
  - \(O(n \log\,n)\) in the average case, but \(O(n^2)\) in its worst case;
  - \(O(n^2)\) in all cases (best, average, and worst);
  - \(O(n \log n)\) in its best case;
  - \(O(2^n)\) in its worst case.
- Name two different sorting algorithms that do \(O(n \log\,n)\) comparisons in the average case.