Earlier this year, the New York Times introduced a new word game called Letter Boxed. The game consists of a square with 12 letters evenly spaced around the edges, e.g.:

A B C - * * * - L * * D K * * E J * * F - * * * - I H G

The goal is to make valid words of three letters or more by connecting the letters together, and using all letters in the box at least once. After you make each word, the last letter becomes the first letter of the next word, and so on. Letters can be reused but each consecutive letter must come from a different side. Using the above example, that means that “BEG” is a valid word but “BAG” is not (since “B” and “A” are on the same side). A valid sequence of moves could be BEAD-DICE-ELF.

An average player should be able to complete the puzzle in about five words, depending on the letters. However, the best solutions always have exactly two words. For instance, this was the March 28, 2019 puzzle:

R M E - * * * - I * * W P * * C A * * L - * * * - K G T

The puzzle author’s solution was PRAGMATIC-CAKEWALK, which meets all the criteria — every letter is used at least once, no letters from the same side appear consecutively, and the last letter of the first word is the first letter of the last word. Ingenious!

This puzzle got me thinking — what would a programmatic solution to Letter Boxed look like, and how fast could it be? Let’s start by considering the possible moves.

The shortest valid move consists of three letters, with no two consecutive letters coming from the same side. This gives us 12 possibilities for the first letter, and 9 possibilities for each letter thereafter, since we can’t use any of the 3 letters from the side we just came from. Appending one more letter gives us 9 times more possibilities, and the pattern continues like that until we reach the maximum length we’re willing to consider. Assuming we stop at 12 letter words, we have a total of 12 x 9^{2} + 12 x 9^{3} + . . . + 12 x 9^{11} = 423,644,304,600 moves!

It is clear that brute forcing a solution from the raw permutations of moves would be a very inefficient strategy. There has to be a better way…

As it turns out, while there are many geometrically feasible moves in this game, only a tiny fraction of these moves will produce a valid English word. The Official Scrabble Players Dictionary (Fourth Edition) has some 178,000 words (inflected forms included) of two to 15 letters long — a far cry from the 423 billion total Letter Box sequences.

The key to efficiency really lies in determining when we are heading down the wrong path with a move. Considering the ABCDEFGHIJKL puzzle, let’s say we begin with the letter J. There are definitely English words that start with J so we are good so far. If we were to pick A as the next letter, we would still be fine as there are words like “JAIL” that start with “JA.” Now let’s say we go to F — this is where we run into a roadblock. There is no (Scrabble approved) word that has the prefix “JAF.” If we could discard a move sequence when we hit an invalid prefix, then we could save many billions of moves.

We are in luck! There is a data structure expressly for this purpose known as a trie, a contraction of “retrieval tree” and often pronounced “try.” In its simplest form, it is effectively a hashtable of hashtables keyed by English letters (characters) which facilitates fast lookups of words (strings). A sample use case might look like this:

StringTrie trie = new StringTrie(); // each Add operation implicitly creates intervening // child nodes as necessary trie.Add("JAB"); trie.Add("JAIL"); trie.Add("JIB"); trie.Add("JOB"); trie.Add("JABS"); // nodes can be looked up by char, starting at the root var nodeJ = trie['J']; var nodeJA = nodeJ['A']; var nodeJAB = nodeJA['B']; bool b = nodeJAB.IsTerminal; // b == true, this is a terminal node, i.e. it has a valid value string jab = nodeJAB.Value; // jab == "JAB", the node's stored value var nodeJABZ = nodeJAB['Z']; // nodeJABZ == null, there is no such child

The structure of this trie would look something like this:

J / | \ / | \ A I O / \ | | B$ I B$ B$ | | S$ L$

(The `$`

here represents a terminal node, one which stores a valid value.)

My games repository on GitHub has a very simple implementation of the `StringTrie`

shown above. I’ll be using it as a building block of my eventual Letter Boxed solver.

Okay, so we have a promising data structure, but we need an algorithm to make use of it. We will explore this more in the next post.

Pingback: Letter Boxed: an algorithmic solution – WriteAsync .NET