50/50 (Solution)
by Derek Kisman; server by Ben Buchwald
The puzzle is all about encoding information in an unlikely format: a probability distribution. As the self-evident interface suggests, there is a server simulating "flips" of a digital coin, and you'll notice that these flips really act random. Different people will see different sequences of flips, and the sequence doesn't repeat. The only way to get anything out of the puzzle is to roll up your sleeves and do some statistical analysis.
Note: Even though different sessions of flipping encode the same info, you really want to analyze data from just one session to ensure some of the long-term trends are consistent. This should be clear after some progress is made. In terms of scale: 50,000-100,000 flips are enough to note the key patterns of interest; 300,000-400,000 flips are easily enough to solve the puzzle with a good signal-to-noise ratio; 1,000,000 flips make the noise practically disappear. Teams shouldn't need more flips than this, and the interface intentionally makes it difficult to get beyond this order of magnitude of data collection.
So, how do you start analyzing a sequence of "random" numbers? There are a number of common tools people use (for instance, see http://www.random.org/analysis/). The most straightforward test, of course, is to simply count the frequency of Heads and Tails. Well, unfortunately, that doesn't help; they each come up 50% of the time. However, you can generalize this a bit by looking at strings of several flips in a row. Each string of n flips should occur about 1/2n of the time. If you analyze the flips this way, you'll see the expected behavior up to n=6 ... but suddenly at n=7 things begin to look a little wonky. Unlike a real coin, the flipping is history-dependent!
Let Heads be 0 and Tails be 1. (This convention holds throughout the puzzle; this first step is enough to confirm it.) Take all sequences of length 7 and sort them by their relative frequency. Here's an example from one representative run of 1,000,000 flips:
0.432768: 1100110 S 0.443008: 0100111 S 0.491520: 1010110 K 0.493696: 0010111 K 0.513536: 1000110 C 0.526464: 0000111 C 0.554112: 0011111 O 0.556544: 1011110 O 0.588800: 0011001 L 0.596352: 1011000 L 0.625408: 1000100 B 0.630784: 0000101 B 0.663424: 0011011 M 0.666368: 1011010 M 0.719104: 1101010 U 0.722816: 0101011 U 0.730752: 1011100 N 0.741248: 0011101 N 0.799232: 0110011 Y 0.800256: 1110010 Y 0.818816: 0001001 D 0.837632: 1001000 D 0.867712: 1100100 R 0.871552: 0100101 R 0.899200: 1000010 A 0.906496: 0000011 A 0.936576: 0010001 H 0.953216: 1010000 H 0.972928: 1111111 0.974720: 0110111 0.977024: 0111001 0.982528: 1111101 0.982656: 1101110 W 0.982912: 1001101 F 0.984576: 1001100 F 0.986112: 1111000 0.986240: 0111110 0.986624: 0010011 I 0.987520: 0100011 Q 0.988800: 1001110 G 0.990336: 1101101 V 0.990848: 1101000 T 0.990848: 1111010 0.991104: 1010011 I 0.991232: 1010100 J 0.991744: 0101111 W 0.992128: 0110101 Z 0.993664: 0110000 X 0.993792: 1110100 Z 0.994432: 1101111 W 0.994688: 0001111 G 0.994688: 0110100 Z 0.994944: 0010101 J 0.995328: 1110001 X 0.995328: 1110110 0.995712: 0100010 Q 0.996480: 0111011 0.996736: 1100000 P 0.997120: 1100001 P 0.997504: 0111111 0.997504: 1111110 0.997632: 1101001 T 0.997760: 0001010 E 0.998016: 0101101 V 0.998144: 1001011 E 0.998528: 1110101 Z 1.00019: 1110000 X 1.00083: 1000001 1.00096: 0001101 F 1.00122: 1111011 1.00122: 1111100 1.00147: 0111010 1.00160: 1100010 Q 1.00173: 0111100 1.00237: 1100011 Q 1.00237: 1110111 1.00275: 0101001 T 1.00288: 0101100 V 1.00403: 0000000 1.00442: 1010101 J 1.00480: 0101110 W 1.00493: 0001011 E 1.00813: 0001110 G 1.00864: 0110001 X 1.00902: 0010010 I 1.00902: 0101000 T 1.00928: 1010010 I 1.00941: 0111000 1.00954: 0111101 1.01018: 0001100 F 1.01069: 0110110 1.01414: 1001111 G 1.01568: 1101100 V 1.01670: 1111001 1.01722: 1000000 1.01734: 0000001 1.01773: 1001010 E 1.01965: 0100001 P 1.02054: 0010100 J 1.02131: 0100000 P 1.04666: 1010001 H 1.08774: 0010000 H 1.11168: 0000010 A 1.11757: 1000011 A 1.14432: 1100101 R 1.14675: 0100100 R 1.17683: 1001001 D 1.18669: 0001000 D 1.19347: 1110011 Y 1.21178: 0110010 Y 1.25568: 0011100 N 1.25670: 1011101 N 1.27155: 1101011 U 1.27654: 0101010 U 1.32045: 0011010 M 1.32198: 1011011 M 1.37190: 1000101 B 1.38010: 0000100 B 1.40595: 0011000 L 1.42221: 1011001 L 1.42963: 1011111 O 1.45472: 0011110 O 1.47635: 1000111 C 1.49760: 0000110 C 1.50285: 1010111 K 1.50938: 0010110 K 1.53472: 0100110 S 1.55994: 1100111 S
Some of the sequences have the expected frequency (accounting for variation) and are clustered around the middle, but the outliers have a clear pattern. The middle 5 bits are duplicated, surrounded by 00/11 or 01/10 (and the 00/11 and 01/10 pairs are opposite each other in the list). If you convert the middle 5 bits to letters, you get the message "HARDYNUMBLOCKS", with each successive letter about 4% more likely in the relative frequency table.
(Aside: this 00/11 and 01/10 pairing is mathematically required in order to vary the frequencies of 7-flip strings without affecting the frequencies of 6-flip strings.)
Ok, that's step 1, with a hint towards the next thing to look at. What does HARDYNUMBLOCKS mean? Well, a mathematician (or Google) will tell you about a famous Taxicab Number, 1729. This seems to imply you'll want to divide the data up into blocks of 1729 flips. What can you do with this? Instead of analyzing frequencies between adjacent flips, you can analyze them between adjacent blocks (ie, look at flips that are 1729 apart). You'll find that most of these flips are independent... however, there are a few positions in the block of 1729 which consistently "resist" flipping. They change from H to T or T to H 25% of the time and stay the same 75% of the time.
These positions are all fairly close together; there are 60 of them spanning about 600 positions out of 1729. This tells you which positions are first and last (since you don't actually know the "frame" of the 1729-blocks). The first few positions are at 0, 16, 28, 33, 34, ... Taking the differences between positions and converting to letters, you get the message "PLEASEHELPTRAPPEDINCOINFLIPPINGFACTORYJUSTKIDDINGTHUUUTUXUUUHUHTUUHUH".
(Aside: if you have enough data, it's possible that spectral analysis - ie, a Discrete Fourier Transform - would find these 1729-blocks without the need of the first message. However, I believe this would take far more data than you can reasonably gather.)
Almost there! The latter part of this message looks like some sort of flip sequence, because it has several Ts and Hs in there, but what of the Us and Xs? Well, U just stands for "unknown", i.e., we don't care what goes there. And there's only one X, so it seems significant!
The final step is to look for every occurrence of this pattern in the sequence. The flips that go where the X is are the final channel of information. You'll find that they repeat in an unvarying pattern (no noise!) with period 299=13*23. There's only one way (well, two similar ways) to arrange this pattern into a good-looking rectangular image, and it gives the following image:
.XXXXX.........XX.XX... .XXXXX....XX..XXXXXXXX. .XXXX....X..X...XX.XX.. .XXX....X....X......... .......X......X........ ......XXX.XX...X....... .....XXXXXXXXXXXX...... ....XXXXXXXXXXXXXX..... ...XXXXXXXXXXXXXXXX.... .XXXXXXXXXXXXXXXXXXXXX. ....................... ....X.X.X.X.X.X.X.X.... .......................
This is an image of a MOUNTAIN, the final answer.
(As an aside: in an earlier version of this puzzle, step 3 required you to track down pieces of binary pi, hidden in the stream. They're still there...)