Previously, I introduced the Letter Boxed puzzle and a StringTrie
data structure to help determine valid moves. Today, I will go over one possible algorithm to solve it.
We first consider the Letter Box as a graph with numbered vertices like so:
0 1 2 - * * * - 11 * * 3 10 * * 4 9 * * 5 - * * * - 8 7 6
There are 12 x 9 = 108 edges in this graph, one for every valid two step movement. For example, the entire set of edges with 0 as the starting vertex would be { (0, 3), (0, 4), (0, 5), (0, 6), (0, 7), (0, 8), (0, 9), (0, 10), (0, 11) }
— in other words, all possible combinations excluding vertices on the starting side.
Now let’s relabel the Letter Box with English letters:
A B C - * * * - L * * D K * * E J * * F - * * * - I H G
Using our StringTrie
, we start from a given vertex and select the trie node corresponding to it. Let’s say we start with vertex 0 which is letter A. We do a lookup node = trie['A']
to get the first node. Now from this node, we select a valid adjacent vertex, let’s say 6 / ‘G’ and do a lookup node = node['G']
to go one level deeper. We continue this until we reach a nonexistent node; e.g. if we pick 10 / ‘K’, we would end up with a null node since there are no English words with a prefix “AGK.” If at any point the node has a valid value (e.g. “AGE” with vertices 0-6-4), we report this as a found word but otherwise continue the algorithm in the hopes of finding longer matches.
What I’m describing is classic depth-first search which naturally lends itself to a recursive implementation. In pseudocode, it might look like this:
procedure findWords(trie, box, found): for all vertices v1 in box do: c1 = box[v1] visited1 = { v1 } nextSearch(trie[c1], box, v1, visited1, found) procedure nextSearch(node, box, v1, visited1, found): if node is not null then if node has a valid value then call found(node.value, visited1) for all vertices v2 in box.adjacent(v1) do: c2 = box[v2] visited2 = visited1 union with { v2 } nextSearch(node[c2], box, v2, visited2, found)
We can be assured the algorithm will terminate eventually because even though the box graph has cycles (“ADADADAD…”), the trie does not. There is, after all, a finite limit on the number of English words up to a certain length. Note that the algorithm uses a visited
vertex set and reports this to the found
callback along with the word value; this will become important in the second part of the solution.
So now we’ve found all the words hidden in the box. Next we must find pairs of these words (w1, w2)
such that the last letter of w1
is the first letter of w2
and which together contain all letters in the box at least once (aka a set cover over the Letter Box). In pseudocode:
procedure findSolutions(allWords, found): for all words w1 in allWords do: for all words w2 in allWords.startingWith(w1.last) do: visited = w1.visited union with w2.visited if visited is complete set then call found(w1, w2)
Now that we have a basic sketch of the solution, we can start implementing the pieces necessary to build it. First, the LetterBox
, which is just a thin wrapper around a string. It also needs to provide a Vertices
set, implemented here as a bit field; the user can query for valid next Vertices given a starting vertex, and also combine two sets. Our implementation so far: LetterBox.cs
Next, we need the word search algorithm. A new LetterBoxSearch
class will do nicely, with inputs for the trie and letter box. A user-specified Action<string, LetterBox.Vertices>
will represent the found callback. Here is where we are now: LetterBoxSearch.cs
Finally, we need a structure to collect words and implement the two-word solution search logic. We will call it LetterBoxWords
. Internally it will have a dictionary which maps initial letters to hash sets of matching words with their vertex sets. After those changes (plus a new property IsComplete
for LetterBox.Vertices
) we have: LetterBoxWords.cs
All that’s left is to write a solver application. It will just need a letter box string and a word list file as inputs. The OSPD4 word list should suffice; I found a copy here which I processed by stripping the definitions and sorting the words: words.txt
The app sets up the letter box, reads the word list, loads each entry into the trie, and constructs the LetterBoxSearch
and LetterBoxWords
instances. The search callback adds words to the collection and the find solution callback prints out each two-word solution. Et voilĂ : Solver.cs
Now we just need to compile it in Release mode and see it in action. Let’s solve the March 28, 2019 Letter Boxed puzzle:
> LetterBoxedSolver.exe RMEWCLTGKAPI words.txt [000.000] Loading trie... [000.211] Loaded 162309 words. [000.212] Finding valid words... [000.219] Found 754 valid words. [000.219] Finding solutions... MARKETPLACE-EARWIG WIGWAM-MARKETPLACE PRAGMATIC-CAKEWALK [000.226] Done.
Amazing! In less than 1/4 of a second, we have a list of valid solutions. The vast majority of the time is spent loading the trie — the search algorithms themselves only take about 7 ms each.
As fast as this is, we can probably do better. We’ll explore performance optimization next time.
Pingback: Letter Boxed: optimizing the trie – WriteAsync .NET