Easy and Hard Problems

O-notation is a useful way to classify algorithms according to their running time. For example, here are 7 common classifications of algorithms:

O-expression Name Example
\(O(1)\) constant \(1, 10, 34, 3538, \ldots\)
\(O(\log\, n)\) logarithmic \(8\,\log_2 n + 5\)
\(O(n)\) linear \(29n + 5\)
\(O(n\, \log\, n)\) linearithmic \(n\,\log_3 n + 23n - 4\)
\(O(n^2)\) quadratic \(7n^2 - 4n + 22\)
\(O(n^3)\) cubic \(n^3 + 2n^2 + 22n + 1\)
\(O(2^n)\) exponential \(2^n + n^4 - 2n^2 + 80\)
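To get a feel for how different these classes really are, here is a small Python sketch (the values of \(n\) are chosen just for illustration) that evaluates a representative function from each of a few rows of the table:

```python
# Compare how some common running-time classes grow as n increases.
# Constants are ignored, as they would be in O-notation.
from math import log2

for n in [10, 100, 1000]:
    print(f"n={n:>5}: log n = {log2(n):6.1f}   "
          f"n log n = {n * log2(n):10.1f}   "
          f"n^2 = {n**2:>8}   "
          f"2^n has {len(str(2**n))} digits")
```

Notice that by \(n = 1000\), \(2^n\) is a number with hundreds of digits, while \(n \log n\) is still only in the thousands.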

There is another, even simpler classification that is of interest in theoretical computer science: exponential running time, or less than exponential running time.

For example, problems like sorting, linear search, binary search, and finding the min of an array can all be solved in time \(O(p(n))\) where \(p(n)\) is a polynomial in \(n\).

But some problems require exponential time, or worse. For example:

  • generating all bit-strings of length \(n\) takes \(O(2^n)\), because there are \(2^n\) bit strings of length \(n\)
  • generating all permutations of \(n\) different numbers takes \(O(n!)\), because \(n\) different numbers can be arranged in exactly \(n!=1 \cdot 2 \cdot 3 \cdot \ldots \cdot n\) ways.
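Both of these generation problems can be tried directly in Python using the standard itertools module, which makes the \(2^n\) and \(n!\) counts easy to verify for small \(n\):

```python
from itertools import product, permutations

n = 3

# All 2^n bit-strings of length n.
bit_strings = ["".join(bits) for bits in product("01", repeat=n)]
print(len(bit_strings), bit_strings)   # 2^3 = 8 strings for n = 3

# All n! orderings of n different numbers.
perms = list(permutations([1, 2, 3]))
print(len(perms), perms)               # 3! = 6 orderings
```

Increasing `n` even modestly (say, to 25 for bit-strings or 15 for permutations) makes these loops impractically slow, which is exactly what the \(O(2^n)\) and \(O(n!)\) classifications predict.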

The Traveling Salesman Problem

The Traveling Salesman Problem (TSP) is this: Given a list of cities on a map, what is the shortest way to visit all the cities, starting and ending at the same city?

For example, suppose the cities are named \(c_1, c_2, \ldots, c_n\). If you start at \(c_1\), then you might visit \(c_4\) next, and then maybe \(c_2\), and so on. Eventually you come all the way back to \(c_1\).

Each arrangement of \(c_1, c_2, \ldots, c_n\) corresponds to a different tour, and the TSP asks which of these tours is the shortest.

The “obvious” brute-force solution to the TSP is to generate all possible tours and then choose the shortest. But there are \(O(n!)\) tours, so a brute-force algorithm that did this would have to look at (more than!) an exponential number of tours.
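The brute-force approach can be sketched in a few lines of Python. The distance matrix below is a made-up example for 4 cities; fixing the starting city means there are \((n-1)!\) tours to check, which still grows faster than any polynomial:

```python
from itertools import permutations

# Hypothetical symmetric distance matrix for 4 cities c1..c4
# (dist[i][j] is the distance between city i and city j).
dist = [
    [0, 2, 9, 10],
    [2, 0, 6, 4],
    [9, 6, 0, 8],
    [10, 4, 8, 0],
]

def tour_length(tour):
    # Total distance of a tour that starts and ends at city 0.
    path = (0,) + tour + (0,)
    return sum(dist[a][b] for a, b in zip(path, path[1:]))

# Fix city 0 as the start, and try every ordering of the rest.
n = len(dist)
best = min(permutations(range(1, n)), key=tour_length)
print(best, tour_length(best))   # the shortest tour and its length
```

For 4 cities this checks only \(3! = 6\) tours, but for 20 cities it would check \(19! \approx 1.2 \times 10^{17}\) of them.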

The TSP is a well-studied problem with many practical applications. There are many algorithms that run more efficiently than brute-force. But, so far, no one has discovered an algorithm that can solve every TSP instance in less than exponential time.

It has not been proven that a better-than-exponential algorithm for the TSP is impossible, so there is the possibility that one day someone (maybe you!) could find a polynomial-time algorithm for solving it. But beware: this is an extremely well-studied problem, and most computer scientists and mathematicians believe that no such algorithm exists.

Note: The TSP has lots of applications, and a rich history. The TSP Website discusses the TSP in depth, and includes many interesting solved, and unsolved, instances.

NP-Completeness

The TSP is a member of a class of algorithmic problems known as NP-complete problems. Here are a couple of other NP-complete problems:

  • Logical satisfiability (SAT): For example, given a logical expression such as “(not p) or (not q) and r”, find true/false values for p, q, and r that make the expression true (i.e. that satisfy it). In practice, the expression could consist of thousands of variables, and the obvious brute-force solution of trying all true/false values for the variables would take \(O(2^n)\) time for \(n\) variables.

    There are extremely efficient practical algorithms, called SAT solvers, that can quickly solve many large and useful logical expressions. They turn out to be quite useful in applications such as scheduling.

    But there is no known SAT-solving algorithm that runs in polynomial time for all logical expressions. All known SAT algorithms run in \(O(2^n)\) in the worst case.
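The brute-force approach for the example expression above can be written directly in Python: encode the expression as a function and try all \(2^n\) true/false assignments (here \(n = 3\), so there are only 8 to check):

```python
from itertools import product

# The example expression "(not p) or (not q) and r".
# Note: "and" binds tighter than "or", so this parses as
# (not p) or ((not q) and r).
def expr(p, q, r):
    return (not p) or ((not q) and r)

# Brute force: try all 2^3 = 8 truth assignments.
solutions = [vals for vals in product([False, True], repeat=3)
             if expr(*vals)]
print(len(solutions), solutions)
```

For 8 assignments this is instant, but with thousands of variables the \(2^n\) assignments could never all be checked; SAT solvers succeed by cleverly avoiding most of them.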

  • The Number Partition Problem: Given a set of \(n\) different integers, can you partition them into two groups such that the sum of the numbers in each group is the same (or as close as possible to the same)?

    As an example of this problem, suppose you and a friend are carrying in a whole bunch of grocery bags after shopping. How can you divide the bags between yourselves so that you each carry the same weight (or as close as possible to the same weight)?

    As with SAT, efficient algorithms are known for solving some instances of this problem, but in general there is no known polynomial-time algorithm for solving all instances.
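A brute-force solution for the grocery-bag version is easy to write: each bag goes to one of two people, so there are \(2^n\) possible splits to check. The bag weights below are made up for illustration:

```python
from itertools import product

weights = [8, 7, 6, 5, 4]   # hypothetical grocery-bag weights

def best_split(nums):
    # Try all 2^n ways of assigning each number to group 0 or 1,
    # keeping the split with the smallest difference in sums.
    best_diff, best_groups = None, None
    for choice in product([0, 1], repeat=len(nums)):
        groups = ([x for x, c in zip(nums, choice) if c == 0],
                  [x for x, c in zip(nums, choice) if c == 1])
        diff = abs(sum(groups[0]) - sum(groups[1]))
        if best_diff is None or diff < best_diff:
            best_diff, best_groups = diff, groups
    return best_diff, best_groups

diff, (a, b) = best_split(weights)
print(diff, a, b)   # a perfect split exists: 8 + 7 = 6 + 5 + 4 = 15
```

With 5 bags there are only \(2^5 = 32\) splits, but with 60 bags there would be more than \(10^{18}\).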

There are hundreds more NP-complete problems: take a look at the list given here.

What all NP-complete problems have in common is this:

  • The best known algorithm for solving any one of them (in general) is exponential (or worse).
  • If a worst-case polynomial-time algorithm for any one NP-complete problem is discovered, then that automatically means that all NP-complete problems must also have a worst-case polynomial-time algorithm.

This second fact is rather amazing: you only need to come up with a polynomial-time algorithm for one NP-complete problem — any one! — and you will have automatically found a polynomial-time algorithm for all of them.

It’s because of this that many mathematicians and computer scientists believe that there is no polynomial-time algorithm for any NP-complete problem. But no one has been able to prove that this is the case, and so this is still an open problem. Indeed, many computer scientists consider this to be the most important unsolved problem in all of theoretical computer science.

Undecidable Problems

NP-complete problems are examples of problems where no known efficient algorithm exists. But there are also problems where we know there are simply no algorithms that solve them! Such problems are called undecidable problems.

For example, suppose you wanted to write an infinite-loop checker. This is a program that takes another program as its input, and returns “yes” if the input program goes into an infinite loop for some input, and “no” otherwise. Such a program might actually be useful as a tool for debugging programs.

However, it turns out that it is impossible to write such a program. It’s not just that it would be hard to write, or that such a program would run very inefficiently — it can be proven that such a program does not exist.

This particular example is known as the halting problem, and it was the mathematician Alan Turing who first proved (in 1936) that an infinite loop checker doesn’t exist. As part of his proof, he also developed the notion of a Turing Machine, which has since become one of the standard theoretical models of computation in computer science.
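The heart of Turing’s argument can be sketched in Python. Suppose, for contradiction, that someone handed us a working `halts()` function (the function below is hypothetical — its body is deliberately left unwritten, since no correct body can exist):

```python
def halts(program, argument):
    """Hypothetical infinite-loop checker: returns True if
    program(argument) eventually stops, and False if it runs
    forever. Turing proved no such function can exist, so the
    body is left unimplemented."""
    raise NotImplementedError

def paradox(program):
    # Do the opposite of whatever halts() predicts about
    # running the program on itself.
    if halts(program, program):
        while True:   # halts() said we stop, so loop forever
            pass
    # halts() said we loop forever, so stop immediately

# Now consider calling paradox(paradox):
#  - if halts(paradox, paradox) returns True, paradox loops forever;
#  - if it returns False, paradox stops immediately.
# Either way, halts() gives the wrong answer about paradox(paradox),
# so a correct halts() cannot exist.
```

This self-referential trick — feeding a program to itself and doing the opposite of what the checker predicts — is the core of the 1936 proof.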

For a list of other undecidable problems, check out this list.