# Letter Boxed: an algorithmic solution

Previously, I introduced the Letter Boxed puzzle and a `StringTrie` data structure to help determine valid moves. Today, I will go over one possible algorithm to solve it.

We first consider the Letter Box as a graph with numbered vertices like so:

```     0 1 2
- * * * -
11 *       * 3
10 *       * 4
9 *       * 5
- * * * -
8 7 6
```

There are 12 x 9 = 108 edges in this graph, one for every valid two step movement. For example, the entire set of edges with 0 as the starting vertex would be `{ (0, 3), (0, 4), (0, 5), (0, 6), (0, 7), (0, 8), (0, 9), (0, 10), (0, 11) }` — in other words, all possible combinations excluding vertices on the starting side.

Now let’s relabel the Letter Box with English letters:

```     A B C
- * * * -
L *       * D
K *       * E
J *       * F
- * * * -
I H G
```

Using our `StringTrie`, we start from a given vertex and select the trie node corresponding to it. Let’s say we start with vertex 0 which is letter A. We do a lookup `node = trie['A']` to get the first node. Now from this node, we select a valid adjacent vertex, let’s say 6 / ‘G’ and do a lookup `node = node['G']` to go one level deeper. We continue this until we reach a nonexistent node; e.g. if we pick 10 / ‘K’, we would end up with a null node since there are no English words with a prefix “AGK.” If at any point the node has a valid value (e.g. “AGE” with vertices 0-6-4), we report this as a found word but otherwise continue the algorithm in the hopes of finding longer matches.

What I’m describing is classic depth-first search which naturally lends itself to a recursive implementation. In pseudocode, it might look like this:

```procedure findWords(trie, box, found):
for all vertices v1 in box do:
c1 = box[v1]
visited1 = { v1 }
nextSearch(trie[c1], box, v1, visited1, found)

procedure nextSearch(node, box, v1, visited1, found):
if node is not null then
if node has a valid value then call found(node.value, visited1)
for all vertices v2 in box.adjacent(v1) do:
c2 = box[v2]
visited2 = visited1 union with { v2 }
nextSearch(node[c2], box, v2, visited2, found)
```

We can be assured the algorithm will terminate eventually because even though the box graph has cycles (“ADADADAD…”), the trie does not. There is, after all, a finite limit on the number of English words up to a certain length. Note that the algorithm uses a `visited` vertex set and reports this to the `found` callback along with the word value; this will become important in the second part of the solution.

So now we’ve found all the words hidden in the box. Next we must find pairs of these words `(w1, w2)` such that the last letter of `w1` is the first letter of `w2` and which together contain all letters in the box at least once (aka a set cover over the Letter Box). In pseudocode:

```procedure findSolutions(allWords, found):
for all words w1 in allWords do:
for all words w2 in allWords.startingWith(w1.last) do:
visited = w1.visited union with w2.visited
if visited is complete set then call found(w1, w2)
```

Now that we have a basic sketch of the solution, we can start implementing the pieces necessary to build it. First, the `LetterBox`, which is just a thin wrapper around a string. It also needs to provide a `Vertices` set, implemented here as a bit field; the user can query for valid next Vertices given a starting vertex, and also combine two sets. Our implementation so far: LetterBox.cs

Next, we need the word search algorithm. A new `LetterBoxSearch` class will do nicely, with inputs for the trie and letter box. A user-specified `Action<string, LetterBox.Vertices>` will represent the found callback. Here is where we are now: LetterBoxSearch.cs

Finally, we need a structure to collect words and implement the two-word solution search logic. We will call it `LetterBoxWords`. Internally it will have a dictionary which maps initial letters to hash sets of matching words with their vertex sets. After those changes (plus a new property `IsComplete` for `LetterBox.Vertices`) we have: LetterBoxWords.cs

All that’s left is to write a solver application. It will just need a letter box string and a word list file as inputs. The OSPD4 word list should suffice; I found a copy here which I processed by stripping the definitions and sorting the words: words.txt

The app sets up the letter box, reads the word list, loads each entry into the trie, and constructs the `LetterBoxSearch` and `LetterBoxWords` instances. The search callback adds words to the collection and the find solution callback prints out each two-word solution. Et voilĂ : Solver.cs

Now we just need to compile it in Release mode and see it in action. Let’s solve the March 28, 2019 Letter Boxed puzzle:

```> LetterBoxedSolver.exe RMEWCLTGKAPI words.txt
[000.212] Finding valid words...
[000.219] Found 754 valid words.
[000.219] Finding solutions...
MARKETPLACE-EARWIG
WIGWAM-MARKETPLACE
PRAGMATIC-CAKEWALK
[000.226] Done.
```

Amazing! In less than 1/4 of a second, we have a list of valid solutions. The vast majority of the time is spent loading the trie — the search algorithms themselves only take about 7 ms each.

As fast as this is, we can probably do better. We’ll explore performance optimization next time.