Solution to Extensive


by Lewis Chen, Sam Kim, Colin Lu, Jon Schneider, Rahul Sridhar

Each line is a list of trigrams, which should be assembled in order to form a clue. They are as follows:

Trigrams | Clue | Answer
ISA ACW HOL IST EDT HRE ELA WSO FRO BOT ICS | Isaac who listed three laws of robotics | ASIMOV
PRE MEA LAL COH OLI CBE VER AGE | Pre-meal alcoholic beverage | APERITIF
BEF RUI TFU LAN DBL ANK | Be fruitful and blank | MULTIPLY
GOA LAF TER ATH ESI SDE FEN SEB RIE FLY | Goal after a thesis defense, briefly | POSTDOC
COF FEE FLA VOR EDL IQU EUR INA WHI TER USS IAN | Coffee-flavored liqueur in a White Russian | KAHLUA
COL LEG EOF FIC IAL REC ORD HOL DER | College official record holder | REGISTRAR
SHA PEO FLA STR EMA INI NGA NCI ENT WON DER | Shape of last remaining ancient wonder | PYRAMID

The title, along with the trigrams, suggests what to do with these words. Namely, the last three letters of each word form a file extension (for example, APERITIF ends in TIF). This suggests that we should look for files with these names. Indeed, in the same directory as the puzzle.txt file containing the trigrams, files such as aperi.tif and regist.rar can be found.

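However, if you try to open them in the format that the extension indicates, you’ll find that each file is actually of a different filetype. Specifically, these filetypes form a chain which lets us reorder the clues: each file’s displayed extension is the actual type of the next file (aperi.tif, for example, is followed by regist.rar, which is actually a .tif).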
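One way to discover the real type of each file is to look at its first few bytes rather than its extension. The sketch below is only an illustration, not the method the authors used: the function name sniff_type is hypothetical, and the signature table covers just the formats that appear in this puzzle. Lua source is plain text and has no signature, and not every QuickTime file carries an ftyp box at offset 4, so a result of None means "inspect by hand".

```python
from typing import Optional

# (offset, signature bytes, conventional extension)
SIGNATURES = [
    (0, b"Rar!\x1a\x07", "rar"),                          # RAR archive
    (0, b"II*\x00", "tif"),                               # TIFF, little-endian
    (0, b"MM\x00*", "tif"),                               # TIFF, big-endian
    (0, b"MThd", "mid"),                                  # Standard MIDI file
    (0, b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "doc"),      # OLE2 container (pre-2007 Word)
    (0, b"ply", "ply"),                                   # Polygon File Format header
    (4, b"ftyp", "mov"),                                  # ISO/QuickTime container (common case)
]


def sniff_type(header: bytes) -> Optional[str]:
    """Guess a file type from its leading bytes, or return None if unknown."""
    for offset, magic, ext in SIGNATURES:
        if header[offset:offset + len(magic)] == magic:
            return ext
    return None


def sniff_file(path: str) -> Optional[str]:
    """Read the first 16 bytes of a file and guess its type."""
    with open(path, "rb") as handle:
        return sniff_type(handle.read(16))
```

Calling sniff_file on each of the puzzle's files should report a type that differs from the file's extension; a text editor settles the remaining case.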

Once we figure out how to open the files, we find that each is a minipuzzle whose answers are trigrams. The solutions to the minipuzzles are as follows:

aperi.tif (actually a .mov)

This is a 20-minute-long video in which trigrams flash by in lexicographically increasing order. In fact, every trigram from AAA to ZZZ appears in the video except for one, and the missing trigram is the answer. Each trigram appears for exactly one frame, and each frame lasts exactly the same amount of time (1/15th of a second).

There are a few possible strategies here: we can split the movie into frames and then binary search for the gap. Depending on the video player, sampling the video at regular intervals can also help (for example, a button that skips ahead exactly 10 seconds will always move forward 150 trigrams, except over the omitted trigram, where it moves forward 151).
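The binary search can be sketched as follows. This is a minimal illustration that assumes the frames have already been extracted and read into text; the callable frame_at is a hypothetical stand-in for "look at frame i and read the trigram". Every frame before the gap shows the trigram whose rank equals the frame number, and every frame after it is shifted by one.

```python
from itertools import product
from string import ascii_uppercase

# All 26**3 trigrams in lexicographic order: AAA, AAB, ..., ZZZ.
ALL_TRIGRAMS = ["".join(p) for p in product(ascii_uppercase, repeat=3)]


def find_missing(frame_at, n_frames):
    """Return the one trigram that never appears.

    frame_at(i) gives the trigram shown on frame i (0-indexed).
    """
    lo, hi = 0, n_frames
    while lo < hi:
        mid = (lo + hi) // 2
        if frame_at(mid) == ALL_TRIGRAMS[mid]:
            lo = mid + 1   # still before the gap
        else:
            hi = mid       # at or after the gap
    return ALL_TRIGRAMS[lo]


# Simulated video with one trigram removed, standing in for the real frames.
frames = [t for t in ALL_TRIGRAMS if t != "DAT"]
missing = find_missing(frames.__getitem__, len(frames))
```

With 17,575 frames, the search needs to inspect only about 15 of them.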

Regardless of the method used, we find that the missing trigram is DAT, which is our answer.

regist.rar (actually a .tif)

This is a screenshot of Pokemon Red (or Blue). In fact, it’s an image of Pallet Town, which suggests looking at the palette of the image. This is doable in GIMP (where it is known as a color map) and some other image editing tools, or we can read the palette directly with a short program. Doing so gives us the message (when the palette is laid out as a 16×16 grid): ≡ 0 mod 7, where the ≡ sign represents modular congruence.

The palette of the image

The colors in a palette are indexed from 0 through 255, so perhaps we should look at the pixels whose color index is divisible by 7. A tool that lets us alter the palette (such as GIMP) suffices for this task. Changing these palette entries to, say, red allows us to read the answer: APR.
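For those reading the image programmatically instead, the same filter is a one-liner once the per-pixel palette indices are available. The sketch below assumes the indices have already been extracted into a list of rows (how to do that depends on the image library used); the function name and the tiny example grid are made up for illustration.

```python
def highlight_multiples(index_grid, modulus=7):
    """Render a grid of palette indices as text, marking indices that are 0 mod `modulus`."""
    return [
        "".join("#" if index % modulus == 0 else "." for index in row)
        for row in index_grid
    ]


# Made-up 2x3 grid of palette indices, not data from the puzzle image.
example = [[0, 1, 7], [14, 3, 21]]
rendered = highlight_multiples(example)
```

Printing the rendered rows for the real image should make the hidden letters stand out in the same way that recoloring the palette does.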

Note that we don’t need to recolor every palette index that is 0 mod 7 to read the image; recoloring the first 10 or so such indices should suffice.

As a side note, this step still works if we 1-index the colors (shown in green). The message is just shifted downwards a bit.

kah.lua (actually a .rar)

This is a rar file containing three folders, and a comment “signed” by Fano, Shannon, and Huffman. Taking into account the context of a compressed archive format and these three names, we can reason that the puzzle is about codes used for data compression, particularly Huffman coding, Shannon coding, and Fano coding (sometimes called Shannon-Fano coding).

The three subfolders each encode a graph. The postscript in the comment tells us that the labelings A and B should be ignored, so we should only care about the actual graph represented. In each case, one vertex is labeled with an X.

To proceed, we take the string of letters given to us in the RAR comment and construct the corresponding code trees under Huffman, Shannon, and Fano coding. The data is designed so that there are no ambiguities (which can arise in, e.g., Huffman coding when two “symbols” have the same weight). The codes under each of the schemes are:

Letter  Shannon  Fano    Huffman
S       00       00      00
O       010      01      010
A       0110     100     100
L       0111     101     101
E       1000     1100    110
N       1001     1101    0110
T       10100    1110    0111
C       10101    11110   1111
D       101100   111110  11100
R       101101   111111  11101

Doing so will create graphs in which each leaf node is labeled with one of the letters in the data.

Note that there are slightly different conventions for Shannon coding in particular: the bit strings can be assigned from the binary representations of the cumulative probabilities, or by simply taking the lexicographically earliest string of the required length. This puzzle uses the latter, which appears to be more common.
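The "lexicographically earliest string" convention can be sketched in a few lines. As an assumption, the code lengths below are read off the Shannon column of the table rather than computed from letter frequencies, since the letter string from the RAR comment is not reproduced here. The function name is illustrative.

```python
def earliest_codes(lengths):
    """Assign each symbol the lexicographically earliest prefix-free code.

    `lengths` must be in non-decreasing order (most frequent symbol first).
    """
    codes = []
    value = 0
    previous = lengths[0]
    for position, length in enumerate(lengths):
        if position > 0:
            # Step past the previous code, then pad with zeros to the new length.
            value = (value + 1) << (length - previous)
        if value >= 1 << length:
            raise ValueError("lengths violate the Kraft inequality")
        codes.append(format(value, "0{}b".format(length)))
        previous = length
    return codes


letters = "SOALENTCDR"
shannon_lengths = [2, 3, 4, 4, 4, 4, 5, 5, 6, 6]  # taken from the table above
shannon = dict(zip(letters, earliest_codes(shannon_lengths)))
```

The resulting dictionary reproduces the Shannon column, which is a quick way to confirm which convention the puzzle follows.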




The graphs can be matched with the graphs given by the folders - in order, they correspond to Shannon, Fano, and Huffman coding respectively, and the vertex labeled X is always uniquely identifiable as a particular letter in the coding graph.

Taking these letters in order of the folder names gives OCE.

post.doc (actually a .lua)

This is a Lua program that runs an exhaustive recursive search to solve a particular Kakurasu logic puzzle. Because the grid is so large, the program will not terminate in a reasonable amount of time, but it’s quite straightforward to solve the Kakurasu by hand.


Doing so draws out the letters SSI.
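For readers who would rather fix the search than solve by hand, pruning makes a large difference. The sketch below is not a port of the puzzle's Lua program. It is a small Python illustration, assuming the standard Kakurasu rules (each row clue is the sum of the 1-indexed column positions of its shaded cells, and each column clue is the sum of the 1-indexed row positions). The example grid is made up.

```python
def row_options(width, target):
    """All bitmasks of a row whose shaded 1-indexed positions sum to `target`."""
    return [
        mask for mask in range(1 << width)
        if sum(c + 1 for c in range(width) if mask >> c & 1) == target
    ]


def solve_kakurasu(row_clues, col_clues):
    """Return a grid of booleans (True = shaded), or None if no solution exists."""
    n_rows, n_cols = len(row_clues), len(col_clues)
    options = [row_options(n_cols, clue) for clue in row_clues]
    col_sums = [0] * n_cols
    chosen = []

    def place(r):
        if r == n_rows:
            return col_sums == list(col_clues)
        for mask in options[r]:
            cols = [c for c in range(n_cols) if mask >> c & 1]
            # Prune: shading row r+1 must not push any column past its clue.
            if any(col_sums[c] + r + 1 > col_clues[c] for c in cols):
                continue
            for c in cols:
                col_sums[c] += r + 1
            chosen.append(mask)
            if place(r + 1):
                return True
            chosen.pop()
            for c in cols:
                col_sums[c] -= r + 1
        return False

    if not place(0):
        return None
    return [[bool(mask >> c & 1) for c in range(n_cols)] for mask in chosen]


# Made-up 3x3 example, not the puzzle's grid.
solution = solve_kakurasu([4, 2, 6], [4, 5, 4])
```

Enumerating only the row patterns that already satisfy the row clue, and abandoning a branch as soon as a column overshoots, avoids most of the work of a cell-by-cell exhaustive search.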

pyra.mid (actually a .doc)

This is a 154-page Microsoft Word document (notably, in the pre-Word 2007 format; Word 2007 is when the default file extension changed from .doc to .docx). At first glance, the file appears to consist of the following quote repeated multiple times:

“The beauty of word processing, God bless my word processor, is that it keeps the plotting very fluid. The prose becomes like a liquid that you can manipulate at will. In the old days, when I typed, every piece of typing paper was like cast in concrete.”

The three letters of the trigram are hidden throughout the file in separate ways:

Letter 1: At the bottom of the file is a line in very tiny print (size 2 font). Zooming in / enlarging it reveals the message “The first letter you seek is N”.

Letter 2: Twenty-five copies of the quote have been modified by inserting a letter (written in a white font) into the quote. These white letters in order spell “The second letter you seek is G”. One good way of finding these letters is to replace all instances of the original quote with the empty string; this will leave only the modified lines.

Letter 3: In one page of the document, there is a tiny, seemingly all-white image. One good way to find this image is to convert the document to a format which makes the contents more readily visible (e.g. HTML); alternatively, find-and-replacing all text out of the doc should leave you with just this image. Opening this image in an image editor reveals (e.g., by using the paint bucket tool with some other color) that there are some off-white pixels. These off-white pixels spell out the message “The third letter you seek is M”.

These letters give us the answer NGM.
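Once the modified copies of the quote have been isolated for the second letter, each hidden character can be recovered by comparing a modified copy against the original. The sketch below assumes the document text has already been exported as plain strings and that each modified copy has exactly one inserted character; the function name and the short sample strings are illustrative only.

```python
def inserted_char(original, modified):
    """Return the single character inserted into `original` to produce `modified`."""
    if len(modified) != len(original) + 1:
        raise ValueError("expected exactly one inserted character")
    for i, ch in enumerate(original):
        if modified[i] != ch:
            # Everything after the insertion must match the rest of the original.
            if modified[i + 1:] != original[i:]:
                raise ValueError("strings differ by more than one insertion")
            return modified[i]
    return modified[-1]  # insertion at the very end


# Short stand-ins for the quote and its modified copies.
quote = "every piece of typing paper"
copies = ["every pTiece of typing paper", "every piece of typhing paper", "every piece of typing papeer"]
hidden = "".join(inserted_char(quote, copy) for copy in copies)
```

Applied to the 25 modified copies in document order, this kind of comparison spells out the second message directly.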

multi.ply (actually a .mid)

This is a MIDI file that plays beautiful music.

Letter 1: Opening up the file in a MIDI editor shows that the notes spell out the message “The first amazing letter is A” on the piano roll.

Letter 2: The file contains lyric meta events that spell out “The second amazing letter is C”. These can be found by opening the file in an editor or a player that supports lyrics.

Letter 3: Note that the file contains 27 tracks (0–26), and each note is played on a track from 1–26 using the same numbered instrument from 1–26. (While the instruments are internally represented as 0–25, most MIDI devices and Wikipedia show the instruments as 1-indexed.) Taking either the instrument or the track used for each note and converting it to a letter gives us the repeating message “THETHIRDAMAZINGLETTERISH”. (Each row of notes was shifted slightly to give a unique ordering to the notes.)

These letters give us the answer ACH.
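The number-to-letter conversion for the third letter is easy to get off by one, because of the 0-indexed internal representation mentioned above. A minimal sketch, assuming the instrument (or track) numbers have already been read out of the file in note order; the function names are illustrative.

```python
def number_to_letter(number, zero_indexed=False):
    """Map an instrument or track number to a letter (1 -> A, ..., 26 -> Z)."""
    if zero_indexed:
        number += 1  # raw program numbers in the file run 0-25
    if not 1 <= number <= 26:
        raise ValueError("number out of range: {}".format(number))
    return chr(ord("A") + number - 1)


def decode(numbers, zero_indexed=False):
    """Decode a sequence of instrument or track numbers into text."""
    return "".join(number_to_letter(n, zero_indexed) for n in numbers)


# Example: the 1-indexed numbers 20, 8, 5 read T, H, E.
prefix = decode([20, 8, 5])
```

Passing zero_indexed=True handles program numbers taken straight from the file's raw bytes.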

Images are of MidiEditor. You can read the script used to generate the file here.

The ASIMOV file (actually a .ply)

This is a “Polygon File Format” file, a filetype used to store three-dimensional data. We can open this file with any program that can view 3D files, e.g. various CAD software, Blender, or Microsoft 3D Viewer (which is installed by default on Windows 10).

Upon opening the file, we see a black box:

The answer to this puzzle lies inside this box, and there are several ways to see what’s inside. One option is to use a suitable 3D editor, which allows one to remove the outer box in a variety of ways: in the image below, we have used the “Split” tool in Microsoft 3D Builder to intersect the puzzle with a halfspace. Alternatively, you can work with the .ply file directly, deleting the polygons belonging to the exterior box (these appear in a list at the end of the .ply file; some care is needed to make sure you end up with a valid .ply file).

Once we can see inside the box, we can see the letters INE.

Taking the trigrams in order, we get DATA PROCESSING MACHINE, which is a COMPUTER, the answer.

Trigram  File           Actual type
DAT      aperi.tif      .mov
APR      regist.rar     .tif
OCE      kah.lua        .rar
SSI      post.doc       .lua
NGM      pyra.mid       .doc
ACH      multi.ply      .mid
INE      (ASIMOV file)  .ply
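The ordering can be checked mechanically by following the chain of filetypes. The sketch below keys each file by its clue answer, so that the displayed extension is simply the last three letters of the word. Since the chain is a cycle, the starting point (APERITIF, matching the order listed above) is an assumption chosen because it yields a readable phrase. The function names are illustrative.

```python
# clue answer -> (actual type found on opening the file, trigram from its minipuzzle)
FILES = {
    "APERITIF": ("mov", "DAT"),
    "REGISTRAR": ("tif", "APR"),
    "KAHLUA": ("rar", "OCE"),
    "POSTDOC": ("lua", "SSI"),
    "PYRAMID": ("doc", "NGM"),
    "MULTIPLY": ("mid", "ACH"),
    "ASIMOV": ("ply", "INE"),
}


def follow_chain(files, start):
    """Order the files so that each one's displayed extension is the next one's actual type."""
    by_actual = {actual: word for word, (actual, _) in files.items()}
    order, word = [], start
    while word not in order:
        order.append(word)
        displayed = word[-3:].lower()   # e.g. APERITIF -> "tif"
        word = by_actual[displayed]
    return order


def read_message(files, start):
    """Concatenate the minipuzzle trigrams in chain order."""
    return "".join(files[word][1] for word in follow_chain(files, start))


message = read_message(FILES, "APERITIF")
```

Starting from any other file gives a rotation of the same string, so trying each of the seven starting points is enough to find the readable one.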