Solution to Complexity Evaluation — MIT Mystery Hunt 2021

Back to Puzzle

Answer: NONCONFORMIST

by Josh Alman

The puzzle consists of three components: a list of 5 complexity classes, a list of 7 algorithmic decision problems, and three groups of 7 blanks next to a check, question mark, and X.

Looking at the algorithmic problems, they all take in as input two components in common: a bipartite graph G (some even specifically refer to the ‘top’ and ‘bottom’ bipartitions of that graph) and a variable k. Some of the problems take in other inputs, but these always have default values so that they are not necessary to evaluate the problem. Moreover, for any graph G, there is always at most one k that should make the problem accept. This suggests that we will need to evaluate these problems on certain bipartite graphs, and figure out what values of k make them accept.

What bipartite graphs should we evaluate them on? The bullet points next to the complexity classes and problems, and the three lines of blanks at the bottom which have seven blanks each (one for each problem) suggest that they should be defined in terms of which problems are in which complexity classes. More precisely, we will form three bipartite graphs, which each have five nodes in the top bipartition (corresponding to the five classes) and seven nodes in the bottom bipartition (corresponding to the problems). In the first graph, we will put an edge between complexity class C and problem P if P is known to be in C. In the third graph, we will put an edge if P is known to not be in C. In the second graph, we will put an edge if it is an open problem whether P is in C.

Doing this (see below for all the details) produces the following three graphs:

We next evaluate the problems on these three graphs to find the value of k which makes the problem accept. Python code for doing so can be found here. The results are as follows:

Problem	k (and any explanation)	Corresponding Letter
1	(0, 0, 0, 0, 1)	1=A
2	14	N
3	19, since c = 110860	S
4	23	W
5	5, since s = 10000100100101111100110010001001010	E
6	18, since s = 10111011000010000111010001000101100	R
7	9	I

Problem	k (and any explanation)	Corresponding Letter
1	(1, 0, 0, 1, 1)	19=S
2	14	N
3	15, since c = 104017	O
4	14	N
5	3, since s = 00000001011010000010001100010110101	C
6	15, since s = 01000000110101101000001010100000011	O
7	14	N

Problem	k (and any explanation)	Corresponding Letter
1	(0, 0, 1, 1, 0)	6=F
2	15	O
3	18, since c = 65058	R
4	13	M
5	9, since s = 01111010000000000001000001100000000	I
6	19, since s = 00000100001000010000100100011010000	S
7	20	T

The values of k each correspond to an integer between 1 and 26; they spell out ANSWER IS NONCONFORMIST.

Complexity Classification Details

We now give proofs of all of the classifications of problems into complexity classes. Before we begin, we note that:

Solvers can likely guess or deduce many of these without needing to come up with complex proofs. For instance, below we give reductions to show that various problems are not in AC⁰, but one can often guess that they are not in AC⁰ without a complete proof in mind.
Some of the classifications are intentionally difficult to determine, and may be easier to figure out using puzzle solving skills rather than computer science skills. For two examples: (1) From looking at the possible outputs of problem 4, we can determine that the graphs must have 15, 12, and 8 edges (in some order). (2) Once we see that the first graph is spelling out ANSWER, but possibly with errors, we can figure out how the graph should be modified to cleanly output ANSWER.

Complexity Class Overview

To begin, let’s recall the definitions of the classes and some of their useful properties. These all appear on the Wikipedia pages for the relevant classes.

AC⁰ is the circuit class of problems which can be solved by constant-depth, polynomial-size circuits with AND, OR, and NOT gates. The size of a circuit is measured in terms of its number of wires. (Some definitions only allow NOT gates at the bottom layer of the circuit. Using De Morgan’s laws to ‘push’ NOT gates up and down the circuit, this can be assumed without loss of generality.) It is known that AC⁰ cannot compute PARITY (the problem of determining whether the number of 1s in an input binary string is even), or more generally MODm for any fixed integer m > 1. AC⁰ can compute the addition and subtraction of two integers given in binary, but not their multiplication. As a circuit class (i.e. a nonuniform class), AC⁰ can compute any unary language.

A problem A is NP-hard if, for every problem B in NP, there is a polynomial-time reduction from B to A. Recall in particular that a problem is NP-complete if it is NP-hard and in NP. If P = NP, then every problem A is NP-hard: we can simply solve B in polynomial time, without needing to use A at all. Since it is open whether P = NP, we won’t have any edges to NP-hard in our third graph.

PSPACE is the class of problems that can be solved using polynomial space (even if the algorithm takes exponential time). Brute force algorithms can often be used to show that a problem is in PSPACE, as long as the number of possibilities to check is at most exponential in the input size (so we can keep track of which we’re dealing with in polynomial space), and each can be checked using polynomial space.

P is the class of problems that can be solved in polynomial time. A nice relationship to recall is that any problem in P is also in PSPACE, since it takes one time step to write to a cell of memory, so in polynomial time, one can only write to a polynomial amount of space.

P/poly can be thought of in two equivalent ways. First, it is the circuit class of problems which can be solved by polynomial-size circuits (of any depth) with AND, OR, and NOT gates. Second, it is the class of problems which can be solved in polynomial time when given a polynomial-size advice string which depends only on the input length. For our purposes, the most important things to note about the definition are that any problem in P is also in P/poly (since one can just not use the advice), and any unary language is in P/poly (since it is a non-uniform circuit class; alternatively one can think of the advice as telling you the answer for that input length). It is open whether each of NP, PSPACE, and EXPTIME is contained in P/poly.

Problem 1

Let’s begin by showing this problem is in AC⁰ by constructing an AC⁰ circuit for it. Suppose the graph G has n nodes in the top bipartition, and m nodes in the bottom bipartition.

First, for any fixed integer d, and any fixed nodes a and b in the top bipartition, we will construct a gadget to tell whether a and b have at least d nodes in the bottom bipartition which are not adjacent to either of them. To do this, we take the OR over all choices of d nodes in the bottom bipartition, and for each, we check whether a and b are both not adjacent to all d of those nodes (e.g. by taking an AND over the d nodes x of whether a is not adjacent to x AND b is not adjacent to x). This gadget has size O(m^d), which is polynomial-size for any fixed d. It has depth 3.

Next, for any fixed nodes a and b in the top bipartition, we can compute whether they are friends. To do this, we take the above gadget for both d = 2 and d = 3, and take the XOR of their results. Note that XOR(a, b) = AND(OR(a, b), NOT(AND(a, b))). Thus, this gadget has size O(m³) and depth 6. We can make such a gadget for all pairs of nodes in the top bipartition, which will have total size O(n²m³) and depth 6.

Now, for each node a in the top bipartition, we can compute f(a) by taking NOT of the OR over the above gadget for all nodes b. This adds 1 depth, and a total additional size of O(n²), since we are using an O(n)-fan-in OR gate for each of the n nodes in the top bipartition. (Although it doesn't matter for our purposes, this is an additive O(n²) size, rather than multiplicative, since we are adding O(n²) wires which just connect to the gadgets from the previous layer.) Then, again using an XOR, we can determine whether this equals k_a. This adds 3 depth and O(n) size. Finally, we take a NOT OR over all these comparisons to return our final answer. This adds 1 depth and O(n) size. In total, our circuit has size O(m³n²) and depth 11. With some optimization, it’s likely possible to reduce the depth.

This circuit shows that the problem is in AC⁰. It is simple to convert the above procedure into a polynomial-time algorithm as well, showing that this problem is in P, and hence also PSPACE and P/poly. Finally, since the problem is in P, it must be open whether it’s NP-hard, since resolving this would resolve the P vs NP problem.

Problem 2

Computing C is known as the ‘maximum edge biclique problem’. It is NP-complete. One can find references online proving this, but it is also not hard to prove directly by reducing from the usual ‘maximum clique’ problem: given a (not necessarily bipartite) graph H on n nodes in which we want to compute the maximum size of a clique, we construct a bipartite graph G with n nodes in each bipartition, where there is an edge from node i on the top to node j on the bottom in G if and only if there is an edge between nodes i and j in H. One can verify that the size of a maximum edge biclique in G will be the square of the size of a maximum clique in H. Thus, problem 2, which is a minor modification of the maximum edge biclique problem, is also NP-complete.

It immediately follows that problem 2 is NP-hard, and it is open whether it’s in P. We can see that it is in PSPACE since an algorithm can brute-force over all subsets S and T of the bipartitions of G in polynomial space (we just have to remember which we are currently looking at). In fact it is known that NP is contained in PSPACE.

If problem 2 were not in P/poly, then it would also not be in P, settling P vs NP. If problem 2 were in P/poly, then it would follow that all of NP is in P/poly. Indeed, if problem 2 were in P/poly, then for any problem L in NP, one could do the polynomial-time reduction to problem 2, and solve that using the P/poly algorithm. Since it is open whether NP is in P/poly, it follows that it is open whether problem 2 is in P/poly.

Finally, let’s prove that problem 2 is not in AC⁰. (It is likely intuitive to solvers that this is the case, but we prove it here for completeness.) We will prove this by using an AC⁰ circuit for problem 2 to construct one for PARITY (which is known to not be in AC⁰). Recall that the goal in PARITY is to determine if the number of 1s in an input binary string s of length n is even. It is sufficient to test, for a given even integer 0 ≤ ℓ ≤ n, whether the number of 1s in s is ℓ, since we can then take the OR over the O(n) different choices of even ℓ to get our final result.

To do this, we simply use the circuit for problem 2, picking k = 20 − ℓ and setting G to be the graph whose adjacency matrix is the string s. This is a graph with 1 node in one bipartition, and n nodes in the other, and its number of edges equals the number of 1s in s. Hence, the maximum size of a biclique in G is the number of 1s in s, as desired. (A note regarding picking k = 20 − ℓ: we can compute this in an AC⁰ circuit since subtraction is possible with such circuits, but in fact this is not even necessary, since we make a different circuit for each input size, so we can hard-code in all the relevant values of ℓ.)

Problem 3

First, let’s show that problem 3 is not in AC⁰. We can show this by a reduction from MOD3, the problem of deciding whether the number of 1s in an input binary string is a multiple of 3. Given a binary string s of length n, we can interpret it as the adjacency matrix of a bipartite graph with 1 node in the bottom bipartition and n nodes in the top bipartition. Hence, the integer c it corresponds to is the number of 1s in s. Let’s instead concatenate s with itself, so that c corresponds to twice the number of 1s in s. In particular, c is even, and it is a multiple of 3 if and only if we should return true for the MOD3 problem. Our goal is hence to determine whether 12 ⋅ f(c) + 5 is 12 ⋅ 3 + 5 = 41. We know that f(c) is at most n. Let’s pick u = 12 ⋅ n + 6, so that MOD(12 ⋅ f(c) + 5, u) is just equal to 12 ⋅ f(c) + 5. Thus, MOD(12 ⋅ f(c) + 5, u) will be 41 if and only if f(c) = 3. We can thus use an AC⁰ circuit for problem 3, with G being the graph above, and k = 41, to test for MOD3 in AC⁰ as well.

We now address the other complexity classes. Let’s begin by showing that problem 3 is polynomial-time equivalent (i.e. there are polynomial time reductions in both directions) to integer factorization. First, to solve problem 3, we compute c in polynomial time, then compute its factorization, take the second-smallest prime in that factorization to get f(c), then finally compute and output MOD(12 ⋅ f(c) + 5, u) in polynomial time. Second, to factor a given integer N, we first construct a bipartite graph G which corresponds to c = N (e.g. if there is only one node in the top bipartition, then G corresponds to a binary representation of c), pick u = 12 ⋅ N + 6, which is greater than 12 ⋅ f(c) + 5, so that MOD(12 ⋅ f(c) + 5, u) = 12 ⋅ f(c) + 5, then using problem 3, we can compute 12 ⋅ f(c) + 5 and hence f(c). We know that f(c) is a prime factor of N, so we divide by that factor and recurse until N = 1. (The number of times we need to recurse is the number of prime factors of N, which is at most log₂(N), which is a polynomial in the input length.)

From this, we can answer our remaining questions about problem 3 by finding the corresponding answers for integer factorization, e.g. from Wikipedia. Integer factorization is believed to be an NP-intermediate problem. It is hence open whether it is in P or NP-hard (but it is not believed to be in either). It is also open whether it is in P/poly (for instance, a common assumption of the RSA cryptosystem is that it is not in P/poly). It is in PSPACE, again by a brute-force algorithm, where we iterate over all integers 1 < ℓ < N and see if they divide N.

Problem 4

Solving this problem simply requires counting the number of edges in the input graph G and doing some basic arithmetic. It follows quickly that it is in P, PSPACE, and P/poly, and it is open whether it is NP-hard.

Finally, we show it is not in AC⁰ using a reduction from MOD11. As in the reductions in the previous two problems, given an input string s that we want to solve MOD11 for, we can construct a graph G such that e(G) is the number of 1s in s, then using an AC⁰ circuit for problem 4 (341 copies, one for each possible value of k), we can get MOD(110 ⋅ e(G)² + 102 ⋅ e(G), 341). Since 11 divides 341, and 341 is a constant, we can use a constant-sized circuit to extract from this 110 ⋅ e(G)² + 102 ⋅ e(G) (mod 11), which is 3 ⋅ e(G) (mod 11), and hence we can extract e(G) (mod 11) as desired.

Problem 5

This problem looks like the famous Halting problem. In fact, testing whether t(T, s) = − 1 is exactly the Halting problem. However, in problem 5, the integer k must be positive, so there is no way for us to use it to check whether t(T, s) = − 1. A succinct way to describe problem 5 is: given a positive integer k, Turing machine T, and input s, determine whether T halts on s in exactly k steps. This is perhaps the most classic example of a problem which is EXPTIME-complete.

Most of the answers to our questions for problem 5 can now be found on the Wikipedia page for EXPTIME. It is known by the time hierarchy theorem that P is not equal to EXPTIME, and so problem 5 is not in P (otherwise every problem in EXPTIME would be in P by first reducing to problem 5). It is open whether EXPTIME is contained in P/poly, and so it is open whether problem 5 is in P/poly. It is similarly open whether problem 5 is in PSPACE. We know problem 5 is NP-hard since it is EXPTIME-hard and EXPTIME contains NP.

Finally, we can prove this problem is not in AC⁰ in many ways. For one example, we can construct a Turing Machine T which, on an input of length n, halts in 1000 ⋅ n steps if PARITY should return true on its input, and never halts otherwise. Then, problem 5 solves PARITY by fixing T to be that Turing machine, and k = 1000 ⋅ n.

Problem 6

This problem is very similar to problem 5, except for two key differences. First, we are now checking whether k = 10 + t(T, s), so in the case when k = 9, this problem really is the Halting problem. Second, the input is given to us in unary! Thus, this is a unary, undecidable problem.

As discussed earlier, since it is unary, we know it is in AC⁰ and P/poly. On the other hand, since it is undecidable, we know it is not in P or PSPACE, as the problems in those classes are decidable.

Interestingly, it is open whether this problem is NP-hard. This is intuitively surprising: it feels like an undecidable problem should be much ‘harder’ than NP. However, we have to keep in mind the definition of NP-hard: for this problem to be NP-hard, any problem L in NP would need to have a polynomial-time reduction to it. Most ideas for reductions one can come up with fail only because the input must be unary; there are plenty of polynomial-time reductions to the Halting problem when the input is in binary, but simply converting that input to unary can take exponential time!

The fact that it is open whether this problem is NP-hard actually follows from a more general result: if any unary language is NP-hard, then P = NP. See https://math.stackexchange.com/a/1416150/83330 for a proof. (Meanwhile, if P = NP, then every language is NP-hard.)

Problem 7

The classic Ford-Fulkerson can compute the size of a maximum matching in a bipartite graph in polynomial time. We can thus compute a, and hence P(a), in polynomial time. Thus, this problem is polynomial-time equivalent to computing b, the number of maximum matchings in a bipartite graph.

Counting the number of matchings in a bipartite graph is a classic example of a complete problem for the class #P. #P is the class of problems that can be defined as counting the number of accepting paths of a nondeterministic Turing machine (by comparison, NP just asks whether the number of paths is greater than 0, so #P is intuitively a harder version of NP). In particular, #P-complete problems like problem 7 are known to be NP-hard as well, and it is open whether they are in P (if so, it would show P = NP). A brute force algorithm over all possible matchings (or the fact that it is in #P) shows that it is in PSPACE. Similar to the argument for problem 2, it is open whether problem 7 is in P/poly.

Finally, we will prove that problem 7 is not in AC⁰. The proof is very similar to the proof for problem 2, by showing that if problem 7 has an AC⁰ circuit, then we can use it to design an AC⁰ circuit for PARITY. Given a string s of length n, we can convert it into a bipartite graph G with n nodes in one bipartition and one node in the other, such that the number of edges in G is the number of 1s in s. The size of a maximum matching in G is 1 (assuming G has at least one edge; we can deal with the case where it is 0 separately). The number of maximum matchings is equal to the number of 1s in s. Hence, for any integer 0 ≤ ℓ ≤ n, we can check whether the number of 1s in s is ℓ by evaluating problem 7 with k = ℓ + P(1) = ℓ + 108. We can take the OR over all O(n) even choices of ℓ to compute PARITY as desired.

Author’s Notes

If you liked this puzzle, you should take all the excellent complexity classes taught at MIT!