Previously, I introduced the Letter Boxed puzzle and a `StringTrie`

data structure to help determine valid moves. Today, I will go over one possible algorithm to solve it.

We first consider the Letter Box as a graph with numbered vertices like so:

0 1 2 - * * * - 11 * * 3 10 * * 4 9 * * 5 - * * * - 8 7 6

There are 12 x 9 = 108 edges in this graph, one for every valid two step movement. For example, the entire set of edges with 0 as the starting vertex would be `{ (0, 3), (0, 4), (0, 5), (0, 6), (0, 7), (0, 8), (0, 9), (0, 10), (0, 11) }`

— in other words, all possible combinations excluding vertices on the starting side.

Now let’s relabel the Letter Box with English letters:

A B C - * * * - L * * D K * * E J * * F - * * * - I H G

Using our `StringTrie`

, we start from a given vertex and select the trie node corresponding to it. Let’s say we start with vertex 0 which is letter A. We do a lookup `node = trie['A']`

to get the first node. Now from this node, we select a valid adjacent vertex, let’s say 6 / ‘G’ and do a lookup `node = node['G']`

to go one level deeper. We continue this until we reach a nonexistent node; e.g. if we pick 10 / ‘K’, we would end up with a null node since there are no English words with a prefix “AGK.” If at any point the node has a valid value (e.g. “AGE” with vertices 0-6-4), we report this as a found word but otherwise continue the algorithm in the hopes of finding longer matches.

What I’m describing is classic depth-first search which naturally lends itself to a recursive implementation. In pseudocode, it might look like this:

procedure findWords(trie, box, found): for all vertices v1 in box do: c1 = box[v1] visited1 = { v1 } nextSearch(trie[c1], box, v1, visited1, found) procedure nextSearch(node, box, v1, visited1, found): if node is not null then if node has a valid value then call found(node.value, visited1) for all vertices v2 in box.adjacent(v1) do: c2 = box[v2] visited2 = visited1 union with { v2 } nextSearch(node[c2], box, v2, visited2, found)

We can be assured the algorithm will terminate eventually because even though the box graph has cycles (“ADADADAD…”), the trie does not. There is, after all, a finite limit on the number of English words up to a certain length. Note that the algorithm uses a `visited`

vertex set and reports this to the `found`

callback along with the word value; this will become important in the second part of the solution.

So now we’ve found all the words hidden in the box. Next we must find pairs of these words `(w1, w2)`

such that the last letter of `w1`

is the first letter of `w2`

*and* which together contain all letters in the box at least once (aka a set cover over the Letter Box). In pseudocode:

procedure findSolutions(allWords, found): for all words w1 in allWords do: for all words w2 in allWords.startingWith(w1.last) do: visited = w1.visited union with w2.visited if visited is complete set then call found(w1, w2)

Now that we have a basic sketch of the solution, we can start implementing the pieces necessary to build it. First, the `LetterBox`

, which is just a thin wrapper around a string. It also needs to provide a `Vertices`

set, implemented here as a bit field; the user can query for valid next Vertices given a starting vertex, and also combine two sets. Our implementation so far: LetterBox.cs

Next, we need the word search algorithm. A new `LetterBoxSearch`

class will do nicely, with inputs for the trie and letter box. A user-specified `Action<string, LetterBox.Vertices>`

will represent the found callback. Here is where we are now: LetterBoxSearch.cs

Finally, we need a structure to collect words and implement the two-word solution search logic. We will call it `LetterBoxWords`

. Internally it will have a dictionary which maps initial letters to hash sets of matching words with their vertex sets. After those changes (plus a new property `IsComplete`

for `LetterBox.Vertices`

) we have: LetterBoxWords.cs

All that’s left is to write a solver application. It will just need a letter box string and a word list file as inputs. The OSPD4 word list should suffice; I found a copy here which I processed by stripping the definitions and sorting the words: words.txt

The app sets up the letter box, reads the word list, loads each entry into the trie, and constructs the `LetterBoxSearch`

and `LetterBoxWords`

instances. The search callback adds words to the collection and the find solution callback prints out each two-word solution. Et voilĂ : Solver.cs

Now we just need to compile it in Release mode and see it in action. Let’s solve the March 28, 2019 Letter Boxed puzzle:

> LetterBoxedSolver.exe RMEWCLTGKAPI words.txt [000.000] Loading trie... [000.211] Loaded 162309 words. [000.212] Finding valid words... [000.219] Found 754 valid words. [000.219] Finding solutions... MARKETPLACE-EARWIG WIGWAM-MARKETPLACE PRAGMATIC-CAKEWALK [000.226] Done.

Amazing! In less than 1/4 of a second, we have a list of valid solutions. The vast majority of the time is spent loading the trie — the search algorithms themselves only take about 7 ms each.

As fast as this is, we can probably do better. We’ll explore performance optimization next time.

Pingback: Letter Boxed: optimizing the trie – WriteAsync .NET