| Oded Goldreich | CS Unpluuged | School Timetables |
Complexity theory in computer science is concerned with classifying computational problems in terms of their inherent difficulty - how hard they are to solve. A computational problem is understood to be one that can be solved on a computer: some problems can never be solved and these are called non-computable e.g. the tiling problem. Complexity is measured in terms of the resources needed and the time taken to solve the problem. Time complexity is expressed in 'Big O' notation, which classifies functions according to their growth rates.
These are some of the basic classes of problem in complexity theory.
One possible way to solve a problem is to try all possible solutions (brute force). The time taken to solve a problem (in effect the number of solutions to consider) may be a linear, a polynomial or an exponential function of the input. Linear time is where the number of solutions increases at the same rate as the inputs e.g. a linear search - O(N). Polynomial time is where the number of solutions increases in a quadratic way e.g. bubble sort requires n2-n comparisons - O(n2). Exponential time is where the number of solutions increases in an exponential way - O(2n) e.g. the number of solutions in the TSP problem grows at (n-1)!
In the discussion that follows 'deterministic' means that, in a given model of computation each step is determined by the last. Thus a deterministic Turing machine is a model of computation in which each step in the algorithm follows from the previous one and there are no ambiguities or alternative paths through it.
Non-deterministic means that there may be ambiguities in the model, for example there may be more than one path from the same state-input combination so more than one path must be followed to see how it will be resolved. In the greatest common divisor algorithm and in a bubble sort there is just one route that the algorithm can take (successive division and pair-wise comparison respectively) but in the TSP problem there are multiple paths that must all be followed before a solution can be verified.
A decision problem is one that has an answer of yes or no, for example, in the travelling salesman problem the question might be "Is there a route that visits all nodes, returns to the start and has a total distance less than 500?" The answer to this question is either yes or no. Another decision problem might be "Is 329 a prime number?". Another might be in language M1 {0*1*} is there a sequence of zeroes followed by an equal number of ones?'
NP is a fundamental complexity class, which includes problems of types P and NP-complete. NP stands for non-deterministic polynomial time and is:
"the set of all decision problems for which the 'yes'-answers have simple proofs of the fact that the answer is indeed 'yes'. More precisely, these proofs have to be verifiable in polynomial time by a deterministic Turing machine."
"A problem is assigned to the NP-problem (nondeterministic polynomial-time) class if it permits a nondeterministic solution and the number of steps to verify the solution is bounded by some power of the problem's size. " (Wolfram Mathworld)
A problem of type NP can be verified in polynomial time but there is no way to solve a problem of this type in polynomial time.
P is also one of the fundamental classes of complexity in computer science:
"A problem is assigned to the P-problem (polynomial-time) class if the number of steps needed to solve it is bounded by some power of the problem's size. " (Wolfram Mathworld)
"P contains all decision problems which can be solved by a deterministic Turing machine using a polynomial amount of computation time, or polynomial time." (Wikipedia)
"Cobham's thesis holds that P is the class of computational problems which are "efficiently solvable" or "tractable"; in practice, some problems not known to be in P have practical solutions, and some that are in P do not, but this is a useful rule of thumb." (ibid.)
Examples of problems in P include the greatest common divisor.
P is a subset of NP.
The hardest problems in NP are the NP-complete problems, for which no polynomial time algorithms are known. NP-Complete problems form a subset of NP.
One example of this is the subset sum problem: how many subsets are there in a set of N numbers? It is easy to verify whether a given set of numbers (known as a certificate) sums to zero (known as a witness to the answer 'yes'). A problem is in NP only if the verifier runs in polynomial time, which it does for the subset sum problem.
"If a solution is known to an NP-problem, it can be reduced to a single polynomial-time verification. A problem is NP-complete if it is NP and an algorithm for solving it can be translated into one for solving any other NP-problem. Examples of NP-complete problems include the Hamiltonian cycle and traveling salesman problems. " (Wolfram Mathworld)
"At present, when faced with a hard problem in NP, we can only hope to prove that it is not in P assuming that NP is different from P. This is where the theory of NP-completeness, which is based on the notion of a reduction, comes into the picture.
In general, one computational problem is reducible to another problem if it is possible to efficiently solve the former when provided with an (efficient) algorithm for solving the latter. A problem (in NP) is NP-complete if any problem in NP is reducible to it.
Amazingly enough, NP-complete problems exist, and furthermore hundreds of natural computational problems arising in many different areas of mathematics and science are NP-complete. " (Goldreich)
NP-complete problems are the toughest problems in NP as they are the ones least likely to be in P.
"NP-complete problems are a set of problems that any other NP-problem can be reduced to in polynomial time, but retain the ability to have their solution verified in polynomial time." (ibid.)
"For instance, the decision problem version of the travelling salesman problem is NP-complete, so any instance of any problem in NP can be transformed mechanically into an instance of the traveling salesman problem, in polynomial time. The traveling salesman problem is one of many such NP-complete problems."
Tractable problems are called P-type problems (P=polynomial). This is a subset of P. Many problems in polynomial time are tractable, depending on the size of the input. P contains all decision problems which can be solved by a deterministic Turing machine using a polynomial amount of computation time, or polynomial time (Wikipedia). For example: finding the greatest common divisor; testing primality of a number.
Intractable problems are ones that are unsolvable within polynomial time. Problems that involve exponential time would take too long to solve for large (or even small) input values.
A problem may be intractable because it would take too long to find a solution e.g. TSP for N > 20 (this involves a factorial; problems involving factorials grow very quickly, even faster than exponential time (2n)).
For some intractable problems any potential solution may be checked in polynomial time and either accepted or rejected. The issue here is whether a P-type solution can be found for a NP-type problem.
Many intractable problems are concerned with scheduling or matching e.g. timetabling. Timetabling is typically a problem that requires exponential time to produce a solution but this would put the solution outside the practical limits of what a school requires. To produce a timetable in a reasonable amount of time a sub-optimal solution is adopted. Any given timetable could probably be refined and improved but not within the time available.
The most important question in complexity theory is the P=NP problem: do algorithms exist for NP-complete problems?
Informally this question asks whether "every problem whose solution can be efficiently checked by a computer can also be efficiently solved by a computer." (Wikipedia) Or:
"Suppose that solutions to a problem can be verified quickly. Then, can the solutions themselves also be computed quickly?"
It was introduced by Stephen Cook in 1971 and has been the subject of extensive research since, though no answer has yet been found - there is a prize of $1m for the person who solves it. "Quickly" means "in polynomial time.
"Our daily experience is that it is harder to solve a problem than it is to check the correctness of a solution... Could you imagine a world in which solving any problem is not significantly harder than checking a solution to it? Would the term 'solving a problem' not lose its meaning in such a hypothetical (and impossible in our opinion) world? The denial of the plausibility of such a hypothetical world (in which 'solving' is not harder than 'checking') is what 'P different from NP' actually means, where P represents tasks that are efficiently solvable and NP represents tasks for which solutions can be efficiently checked." (Goldreich)
"If any NP-complete problem is in P, then it would follow that P = NP. Unfortunately, many important problems have been shown to be NP-complete, and as of 2010 not a single fast algorithm for any of them is known."
"The P-vs-NP Question can be phrased as asking whether or not finding solutions is harder than checking the correctness of solutions. An alternative formulation asks whether or not discovering proofs is harder than verifying their correctness; that is, is proving harder than verifying. " (Goldreich)
Is it possible to write a program that can take another program as input and decide whether that program will halt with any given inputs. e.g. will the program go into an infinite loop.
It is not possible to solve the halting problem. It is only possible to find out if a program will halt by running that program, not by subjecting it to analysis in another program. One program that is used to demonstrate this is:
input x e.g. 5
while x <> 1 do
if x is even divide by 2
if x is odd set x to 3x + 1
It is not possible to determine whether this program will halt by analysing it in another program.