Exploring the NP-Hard computational complexity of Wordle

Jasu Mandakh
8 min readApr 21, 2023

--

Photo by Nils Huenerfuerst on Unsplash

Wordle is an online word-guessing game that has gained huge popularity after being published in October 2021. The creator of the game Josh Wardle has suggested that the success of the game is largely due to its artificial scarcity, where it can be played only once a day. This article will explore another factor of Wordle’s success: its intrinsic complexity, as explored by Daniel Lokshtanov and Bernardo Subercaseaux in their “Wordle is NP-Hard” research paper.

They have studied the game from the complexity perspective, proving the NP-hardness of its natural formalization: to decide given dictionary D and an integer l if the player can guarantee to discover the secret word w within l guesses. They further prove that NP-hardness holds even when the word has length k=5 and in this case, approximating the minimum number of guesses to guarantee discovering the secret word is NP-hard too.

The game can be played here.

Wordle game

Illustration of three different instances of Wordle. In all of them 𝑘 = 5 and ℓ = 6. The first two games are won by the guesser, while the third one is lost.

Players have to guess a secret word w with length k within l attempts. For each guess p, the game provides feedback on the guessed word w. Iterating over indices i, from 1 to k, the feedback consists of three types of indicators:

  1. If a guessed letter is in the correct position, where w[i] = p[i], it is marked as green.
  2. If a guessed letter is present in the secret word w, but it is in the wrong position, where w[i] p[i] and the set 𝑆𝑖 = {𝑗 : 𝑝 [𝑗] = 𝑤[𝑖], 𝑝 [𝑗] is unmarked}, it is marked as yellow.
  3. If a guessed letter is not present in the secret word, where p[j] is not marked after the iteration, it is marked as gray.

Given the information at each attempt, the player deduces the secret word within the allotted attempts l.

Formally, Wordle Problem is defined as:

INPUT: (𝐷, ℓ) with 𝐷 ⊆ Σ^𝑘 (the dictionary), and an integer ℓ ≥ 1 (the number of guesses allowed)

OUTPUT: Yes, if there is a winning strategy for the guesser, and No otherwise

Optimal algorithm to solve Wordle

Lokshtanov and Subercaseaux presented an algorithm that can be used to solve the Wordle game efficiently:

def WordleGuesser(D,l):
if l=0 then:
guesser loses

if |D|= 1 then:
guess the only word in D and win.

for p ∈ D do:
potentialGuess <- True

for w ∈ D do:
m <- marking for guess p if w is the secret word
D' <- {p' ∈ D | p' is feasible after m}
if WordleGuesser(D', l-1) is a loss for the guesser then:
potentialGuess <- False
break

if potentialGuess then:
Guesser can win using p as its next guess

If this point is reached, then the guesser loses
  1. If the number of remaining guesses (ℓ) is 0, the guesser loses.
  2. If there is only one word left in the dictionary (𝐷), the guesser wins by guessing that word.
  3. For each word 𝑝 in 𝐷, the algorithm checks if it is a potential winning guess: a. For each word 𝑤 in 𝐷, calculate the feedback (marking) if 𝑝 were the guessed word and 𝑤 the secret word. b. Create a new dictionary 𝐷₀ containing words from 𝐷 that are feasible after receiving the marking for guess 𝑝. c. Recursively call the WordleGuesser algorithm with the new dictionary 𝐷₀ and one fewer guess (ℓ — 1). d. If the guesser loses for all words in 𝐷₀, mark 𝑝 as a non-winning guess and proceed to the next word.
  4. If no potential winning guess is found, the guesser loses.

Complexity Results

The authors prove the following theorems regarding the complexity of Wordle:

  1. Theorem 1: Wordle is NP-hard and is W[2]-hard when parameterized by ℓ (the number of guesses allowed).
  2. Theorem 2: Wordle cannot be solved in polynomial time unless P = NP, even when restricted to instances where 𝑘 = 5.
  3. Theorem 3: Wordle can be solved in polynomial time if the alphabet size 𝜎 = |Σ| is constant.

These theorems establish that Wordle is computationally complex, making it an NP-hard problem. We will explore each theorem and its proof.

Theorem 1

Wordle is NP-hard and is W[2]-hard when parameterized by ℓ (the number of guesses allowed).

We prove this theorem by reducing the Almost Set Cover problem, which is known to be W[2]-hard.

Almost Set Cover is a decision problem where the goal is to cover all elements of a universe, except for at most one, using at most 𝑐 sets from a family F. We first show that Almost Set Cover is W[2]-hard by a reduction from the standard Set Cover problem.

Next, given an instance of Almost Set Cover (F, 𝑐), we build a dictionary 𝐷F and a set of symbols ΣF, containing element-words and set words. We claim that (F, 𝑐) is a solution to Almost Set Cover if and only if (𝐷F, 𝑐 + 1) is a solution to Wordle.

For the forward direction, if there’s a sub-family F’ of sets that covers all but at most one element of the universe, we show that the guesser can win in at most 𝑐 + 1 turns by using the set-words 𝑤𝑆 for every 𝑆 ∈ F’.

For the backward direction, we assume that every sub-family F’ leaves at least two elements uncovered. We claim that no strategy for the guesser can guarantee a win within 𝑐 + 1 attempt. By analyzing the cases for a sequence of 𝑐 guesses, we show that there are at least two feasible words remaining, and thus the guesser cannot ensure a win in their last turn.

This completes the proof of Theorem 1, establishing that Wordle is W[2]-hard when parameterized by the number of guesses allowed.

Theorem 2

Wordle cannot be solved in polynomial time unless P = NP, even when restricted to instances where 𝑘 = 5.

Theorem 2 emphasizes that in Wordle, the difficulty does not stem from the length of the words. The proof for this theorem takes inspiration from the proof of coNP-hardness for Evil Hangman and leverages the NP-hard approximation of the smallest dominating set in a 4-regular graph, 𝛾(𝐺). The key aspect of this theorem is that the computational hardness of Wordle cannot be solved efficiently unless P = NP.

The proof begins with a 4-regular graph, 𝐺, and constructs a dictionary, 𝐷𝐺, based on the vertex set of 𝐺. The dictionary is constructed in a way that for each vertex 𝑣 in 𝐺, two words 𝑤𝑣 and 𝑤’𝑣 are created. The word 𝑤𝑣 has 𝑣 as its first symbol, and its remaining symbols are the neighbors of 𝑣, while 𝑤’𝑣 consists of the symbol 𝑣 repeated five times. The minimum number of guesses to win, 𝑊(𝐷𝐺), is then related to the smallest dominating set, 𝛾(𝐺), by proving that 𝛾(𝐺) lies within the range [𝑊(𝐷𝐺) — 4, 𝑊(𝐷𝐺)]. This relationship establishes the computational hardness of Wordle, as a polynomial-time algorithm would imply that 𝑊(𝐷𝐺) can be computed in polynomial time, which is a contradiction to its 1 + 1/390 hardness of approximation.

Corollary 1 follows directly from Theorem 2, stating that it is NP-hard to approximate 𝑊(𝐷) within 1 + 1/390. This corollary emphasizes the hardness of approximation for the Wordle problem.

The proof of Theorem 1 and Theorem 2 both rely on variable-length alphabets. This concept is further explored through Theorem 3 and its associated lemmas. Lemma 2 demonstrates that Wordle can be solved in polynomial time when restricted to instances where the number of guesses, ℓ, is less than or equal to a fixed constant, 𝑐. This is achieved by analyzing the branching factor and depth of the associated game tree, which shows that the optimal guess can be computed in polynomial time.

Lemma 3, a key lemma relating 𝜎 = |Σ| and ℓ, states that in a Wordle game with an alphabet Σ of size 𝜎, the guesser can always win within at most 𝜎 guesses. The proof presents a simple algorithm for the guesser, which chooses any feasible word as a guess at each step. After 𝜎 — 1 feasible guess, either the secret word is found, or there is only one remaining feasible word, which guarantees a win for the guesser within 𝜎 guesses in total.

Theorem 3

Wordle can be solved in polynomial time if the alphabet size 𝜎 = |Σ| is constant.

Theorem 3 addresses the computational complexity of Wordle when the alphabet size |Σ| is constant. The proof of this theorem leverages the results from Lemma 3 and Lemma 2 to demonstrate that Wordle can be solved in polynomial time when |Σ| is constant.

The proof starts with an instance (𝐷, ℓ) of Wordle where |Σ| is a constant, i.e., |Σ| ∈ 𝑂(1). If |Σ| ≤ ℓ, we can simply answer Yes, because Lemma 3 states that the guesser can always win within at most |Σ| guesses. In this case, since |Σ| ≤ ℓ, the guesser will win within the given number of guesses, ℓ.

On the other hand, if ℓ ≤ |Σ|, it implies that ℓ is also a constant because |Σ| is a constant (ℓ ≤ |Σ| = 𝑂(1)). In this situation, we can refer to Lemma 2, which states that Wordle can be solved in polynomial time when restricted to instances where the number of guesses, ℓ, is less than or equal to a fixed constant.

Therefore, by combining the results of Lemma 3 and Lemma 2, we conclude that when the alphabet size |Σ| is constant, Wordle can be solved in polynomial time. This completes the proof of Theorem 3.

Conclusion

We have delved into the computational complexity of the popular word-guessing game, Wordle, by examining its underlying structure and providing an in-depth analysis of its hardness through Lokshtanov and Subercaseaux’s research paper.

Theorem 1 established that Wordle is NP-complete. Through a reduction from the Exact Cover by 3-Sets problem, we demonstrated the complexity of Wordle, highlighting the challenges faced by players attempting to find optimal strategies for the game.

Theorem 2 showed that Wordle’s difficulty is not solely dependent on long words. By leveraging the hardness of approximation results, we proved that Wordle cannot be solved efficiently unless P = NP. This finding underscores the inherent challenge of the game and its intriguing nature.

Lastly, Theorem 3 revealed that when the alphabet size is constant, Wordle can be solved in polynomial time. We utilized two lemmas to support this result, highlighting the importance of considering the interplay between alphabet size and the number of guesses in the game.

Now what?

The analysis leaves the door open for other interesting problems:

  1. Is Wordle in NP, or is it PSPACE-complete? It is not hard to see that Wordle is in PSPACE, by simply computing its associated game tree, which has polynomial depth by Lemma 3.
  • Latest Update: Rosenbaum has published a proof of membership in NP, and it extends as well to cover the third problem on this list.
  1. Is Wordle in FPT when parameterized by 𝜎 = |Σ|? The proof of Theorem 3 provides an algorithm in time 𝐷 𝑂 (ℓ), and it seems challenging to obtain a poly(𝐷) · 𝑓 (ℓ) algorithm for some computable function ℓ.
  2. How is the complexity of Wordle affected by having a dictionary that is not presented as a list of words, but rather in some more compact representation as a nite automaton? We can envision 𝐷 to be provided as a nite automaton, for which all accepted words of length 𝑘 are considered part of the dictionary. This affects for example the proof of Theorem 3, as now a branching factor of |𝐷| can be exponential in the input size.

--

--

Jasu Mandakh

Digital nomad, Product Manager, Lifestyle enthusiast. Currently in San Francisco, CA.