
Solution to BLAST from the Past

Authors: Susan Byrne and Thomas Mack

The solver is presented with a series of 14 DNA sequences. Each sequence begins with a clue written in the standard one-letter amino acid alphabet, followed by a stop codon, and then a sequence from Anolis carolinensis (the North American green anole lizard), the first lizard genome sequenced (Alföldi J., et al. Nature 2011).

The answer to each clue is the root word of a dinosaur name, which combines with "lizard" (-saurus) to give the dinosaur. For example, RECEIVEDAURALLYAFTERLIGHTNING clues "thunder"; "thunder-lizard" is Brontosaurus, so the first letter, "B", is used. When arranged in the proper order, the letters spell AMBER, OCHRE, OPAL, which clue the final answer STOP CODON.

1A = duck (Anatosaurus) = TEAMSKATERNHLANAHEIM
2M = vicious (Masiakasaurus) = LATEGUITARISTSID
3B = thunder (Brontosaurus) = RECEIVEDAURALLYAFTERLIGHTNING
4E = light-weight (Elaphrosaurus) = ITISTHECLASSHEAVIERTHANFEATHERWEIGHT
5R = Sussex (Regnosaurus) = RIGHTNEARHAMPSHIREANDKENTWITHCHICHESTERANDLEWES

6O = weapon (Oplosaurus) = INFINALFANTASYSEVENEMERALDEG
7C = helmet (Corythosaurus) = RIDINGSAFETYHEADSHIELD
8H = western (Hesperosaurus) = LAWLESSFILMGENRESPAGHETTI
9R = thornbush (Rubeosaurus) = DETERRENTPRICKLYHEDGEPLANT
10E = riddle (Enigmosaurus) = (B) NRAVEHARRYSMAGICALNEMESISREALLASTNAME

11O = eye (Ophthalmosaurus) = (Z) QEKEKATRINAANDIKEHADTHISATTHEIRCENTERS
12P = parrot (Psittacosaurus) = AVIANTHATCANMIMICTALKING
13A = silver (Argyrosaurus) = ERIKVENDTRECEIVEDTHISMEDALATATHENS
14L = mud (Limusaurus) = EARTHYWETDIRT
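The letter extraction in the list above can be sketched in a few lines of Python. This is only an illustration: the names `DINOSAURS`, `GROUP_SIZES`, and `spell_words` are made up for this sketch, and the 5/5/4 grouping is taken from the three target words.

```python
# Dinosaur names in clue order (1 through 14), copied from the list above.
DINOSAURS = [
    "Anatosaurus", "Masiakasaurus", "Brontosaurus", "Elaphrosaurus", "Regnosaurus",
    "Oplosaurus", "Corythosaurus", "Hesperosaurus", "Rubeosaurus", "Enigmosaurus",
    "Ophthalmosaurus", "Psittacosaurus", "Argyrosaurus", "Limusaurus",
]

# Clues 1-5, 6-10, and 11-14 each form one word (assumed grouping for this sketch).
GROUP_SIZES = [5, 5, 4]


def spell_words(names, sizes):
    """Take the first letter of each name and split the letters into words."""
    letters = "".join(name[0].upper() for name in names)
    words, start = [], 0
    for size in sizes:
        words.append(letters[start:start + size])
        start += size
    return words


print(" ".join(spell_words(DINOSAURS, GROUP_SIZES)))  # AMBER OCHRE OPAL
```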

In the amino acid alphabet, U stands for selenocysteine, which is encoded by TGA under certain circumstances. B stands for either asparagine (N) or aspartic acid (D). Z stands for either glutamic acid (E) or glutamine (Q).
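As a rough sketch of how a clue could be read off a sequence, the Python below translates codons with the standard genetic code and reports TGA as U whenever it is not the terminating stop codon. The function name `read_clue` and the toy input are assumptions made for illustration; the toy sequence is not taken from the puzzle. The caller passes the stop codon explicitly, because TGA serves both as the opal stop and as the selenocysteine codon here.

```python
# Standard genetic code, laid out in the conventional T, C, A, G base order.
BASES = "TCAG"
AMINO = (
    "FFLLSSSSYY**CC*W"  # first base T
    "LLLLPPPPHHQQRRRR"  # first base C
    "IIIMTTTTNNKKSSRR"  # first base A
    "VVVVAAAADDEEGGGG"  # first base G
)
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def read_clue(dna, stop):
    """Translate codons until `stop`; return (clue letters, remaining DNA).

    TGA is rendered as U (selenocysteine) unless it is the chosen stop codon.
    """
    letters = []
    for pos in range(0, len(dna) - 2, 3):
        codon = dna[pos:pos + 3].upper()
        if codon == stop:
            return "".join(letters), dna[pos + 3:]
        letters.append("U" if codon == "TGA" else CODON_TABLE[codon])
    raise ValueError("stop codon not found in frame")


# Toy example (hypothetical sequence): M, U, D, then an amber stop, then "CCC".
clue, remainder = read_clue("ATGTGAGATTAGCCC", stop="TAG")
print(clue, remainder)  # MUD CCC
```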

The 14 sequences are presented to the solver in alphabetical dinosaur order: 1A, 13A, 3B, 7C, 4E, 10E, 8H, 14L, 2M, 6O, 11O, 12P, 5R, 9R.

In this order, the first letters of the clue phrases spell TERRINLELIQARD. Reading the N as B and the Q as Z, as marked on clues 10E and 11O, gives "terrible lizard", cluing dinosaur.
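The acrostic can be checked with a short Python sketch. The dictionary and variable names are invented for this illustration; the clue strings and the presentation order are copied from the text above, and the B and Z readings follow the "(B)" and "(Z)" annotations on clues 10E and 11O.

```python
# Clue phrases keyed by clue label, copied from the list above.
CLUES = {
    "1A": "TEAMSKATERNHLANAHEIM",
    "2M": "LATEGUITARISTSID",
    "3B": "RECEIVEDAURALLYAFTERLIGHTNING",
    "4E": "ITISTHECLASSHEAVIERTHANFEATHERWEIGHT",
    "5R": "RIGHTNEARHAMPSHIREANDKENTWITHCHICHESTERANDLEWES",
    "6O": "INFINALFANTASYSEVENEMERALDEG",
    "7C": "RIDINGSAFETYHEADSHIELD",
    "8H": "LAWLESSFILMGENRESPAGHETTI",
    "9R": "DETERRENTPRICKLYHEDGEPLANT",
    "10E": "NRAVEHARRYSMAGICALNEMESISREALLASTNAME",
    "11O": "QEKEKATRINAANDIKEHADTHISATTHEIRCENTERS",
    "12P": "AVIANTHATCANMIMICTALKING",
    "13A": "ERIKVENDTRECEIVEDTHISMEDALATATHENS",
    "14L": "EARTHYWETDIRT",
}

# Order in which the sequences are presented to the solver.
ORDER = ["1A", "13A", "3B", "7C", "4E", "10E", "8H",
         "14L", "2M", "6O", "11O", "12P", "5R", "9R"]

# Ambiguity codes: B covers N or D, Z covers E or Q.
AMBIGUOUS = {"B": "ND", "Z": "EQ"}
# Intended first letters for the two annotated clues.
INTENDED = {"10E": "B", "11O": "Z"}

raw = "".join(CLUES[key][0] for key in ORDER)
resolved = "".join(INTENDED.get(key, CLUES[key][0]) for key in ORDER)

# Each substitution must be allowed by the ambiguity codes.
for key, letter in INTENDED.items():
    assert CLUES[key][0] in AMBIGUOUS[letter]

print(raw)       # TERRINLELIQARD
print(resolved)  # TERRIBLELIZARD
```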

Appended to each clue phrase sequence is another DNA sequence, which a BLAST search should identify as Anolis carolinensis (anole lizard). This lizard DNA clues that these are dinosaur root words ("thunder-lizard").

Three genes from the lizard are used, which specify the grouping for the words (amber, ochre, and opal, respectively). In addition, the stop codon between the clue phrase and the lizard DNA matches that of the word: TAG (amber) for 1A, 2M, 3B, 4E, and 5R; TAA (ochre) for 6O, 7C, 8H, 9R, and 10E; TGA (opal) for 11O, 12P, 13A, and 14L.
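The stop codon grouping can be sketched as a simple lookup. The names `STOP_NAMES`, `GROUPS`, and `stop_group` are hypothetical, and the sequences in the usage lines are toy inputs rather than puzzle data. The sketch assumes the clue portion of a sequence ends with its stop codon.

```python
# Traditional names of the three stop codons.
STOP_NAMES = {"TAG": "amber", "TAA": "ochre", "TGA": "opal"}

# Which clues belong to which word, as described above.
GROUPS = {
    "amber": ["1A", "2M", "3B", "4E", "5R"],
    "ochre": ["6O", "7C", "8H", "9R", "10E"],
    "opal": ["11O", "12P", "13A", "14L"],
}


def stop_group(sequence, clue_len):
    """Name the stop codon that ends the clue portion of `sequence`.

    `clue_len` is the clue length in nucleotides, stop codon included
    (an assumption of this sketch).
    """
    codon = sequence[clue_len - 3:clue_len].upper()
    return STOP_NAMES[codon]


# Toy sequences: one sense codon, a stop codon, then three more bases.
print(stop_group("ATGTAGCCC", 6))  # amber
print(stop_group("ATGTAACCC", 6))  # ochre
print(stop_group("ATGTGACCC", 6))  # opal
```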

The ordering of the letters within each word can be determined by any of the following: the position of the sequence; the length of the appended lizard sequence (shortest to longest); or the total length of the sequence.

Clue | Clue Len. | mRNA Seq. Location | Lizard Gene Len. | Total Seq. Len. | Genome DNA Location         | Gene Name
1A   | 63        | 43-133             | 91               | 154             | GL343369.1: 679050 - 679140 | keratin-associated beta protein 37
2M   | 51        | 134-373            | 240              | 291             | GL343369.1: 678810 - 679049 | LI-AC-37
3B   | 90        | 374-648            | 275              | 365             | GL343369.1: 677658 - 677932 | ENSACAG00000027550
4E   | 111       | 649-948            | 300              | 411             | GL343369.1: 677358 - 677657 |
5R   | 144       | 949-1260           | 312              | 456             | GL343369.1: 677046 - 677357 |
6O   | 87        | 166-255            | 90               | 177             | GL343221.1: 826076 - 826165 | zinc finger protein 804A
7C   | 69        | 256-386            | 131              | 200             | GL343221.1: 774136 - 774266 | ZNF804A
8H   | 78        | 387-952            | 566              | 644             | GL343221.1: 764609 - 765174 | ENSACAG00000003720
9R   | 81        | 953-2041           | 1089             | 1170            | GL343221.1: 763520 - 764608 |
10E  | 114       | 2042-3651          | 1610             | 1724            | GL343221.1: 761910 - 763519 |
11O  | 117       | 19-105             | 84               | 201             | 6: 35096247 - 35096330      | vitelline membrane outer layer protein 1
12P  | 75        | 106-234            | 129              | 204             | 6: 35096331 - 35096459      | VMO1
13A  | 105       | 235-366            | 132              | 237             | 6: 35098151 - 35098282      | ENSACAG00000007746
14L  | 42        | 367-713            | 347              | 389             | 6: 35101248 - 35101594      |
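The lengths in the table above can be cross-checked with a short Python sketch. The names `ROWS`, `GROUPS`, and `letters_by` are invented for this illustration, and the numbers are copied from the table. It checks that clue length plus lizard gene length equals the total length, that sorting each group by either length recovers its word, and, for two clues, that the clue length equals three nucleotides per letter plus three for the stop codon.

```python
# (clue, clue length, lizard gene length, total length), copied from the table.
ROWS = [
    ("1A", 63, 91, 154), ("2M", 51, 240, 291), ("3B", 90, 275, 365),
    ("4E", 111, 300, 411), ("5R", 144, 312, 456),
    ("6O", 87, 90, 177), ("7C", 69, 131, 200), ("8H", 78, 566, 644),
    ("9R", 81, 1089, 1170), ("10E", 114, 1610, 1724),
    ("11O", 117, 84, 201), ("12P", 75, 129, 204), ("13A", 105, 132, 237),
    ("14L", 42, 347, 389),
]
GROUPS = {"amber": ROWS[0:5], "ochre": ROWS[5:10], "opal": ROWS[10:14]}

LIZARD_LEN, TOTAL_LEN = 2, 3  # column indices within each row


def letters_by(rows, column):
    """Sort rows by the given column and join the letter part of each label."""
    ordered = sorted(rows, key=lambda row: row[column])
    return "".join(row[0].lstrip("0123456789") for row in ordered)


# Clue length plus lizard length should equal the total length in every row.
assert all(clue + lizard == total for _, clue, lizard, total in ROWS)

for word, rows in GROUPS.items():
    print(word, letters_by(rows, LIZARD_LEN), letters_by(rows, TOTAL_LEN))

# Spot check: clue length = 3 nt per letter, plus 3 nt for the stop codon.
assert 3 * (len("TEAMSKATERNHLANAHEIM") + 1) == 63  # clue 1A
assert 3 * (len("EARTHYWETDIRT") + 1) == 42         # clue 14L
```

Either length column yields AMBER, OCHRE, and OPAL for the three groups, so a solver only needs one of the orderings.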