Welcome to the Wordle Solver. In this project I'll explain how to use the solver to wrangle the
New York Times Wordle. Following the widget I will explain my reasoning for making the solver, my
design process and the source code.
The solver may have some bugs in it, but it should be user-proof for the most part. If you are able to break
it I would love to know (see bottom of page). I have personally broken it many times and watched my wife break
it once, which broke my heart.
Prior to starting the solver I recommend you open up a Wordle tab or bring it up on your phone to play along.
If you have already solved the wordle for the day you can do it in a private browser window. This widget can
also help you if you are stuck in a current game.
Wordle is a game where you attempt to find the hidden word in six guesses or fewer. After each guess a color
will be assigned to each letter of the guess. Green means it is the correct letter and in the correct space,
yellow means it is the correct letter in the wrong space, and black means the letter is not in the word
(repeated letters complicate this, as the Black Test further down shows).
Click the "> Run" button at the top of the widget to start.
Wordle first entered my life in the middle of Jan 2022 when a couple of friends came over for dinner. All was
going well until Anne started to cry. Melle and Anne left to go figure some things out so I was left alone with
the guests. One thing led to another and they were explaining the rules of the game to me. Their first guess was
Rates, which still influences my guesses to this day. I found it interesting how random the first guess was.
Before the first guess you have no information about what the hidden word could be. Since every game starts from
that same blank slate, there was most likely an optimal opening word...
Flash forward to the end of Jan 2022 when my parents came out for a visit. I had played wordle once or twice since that
fateful night but hadn't obsessed over it. On a quiet morning I was eating breakfast and my mother was on her
phone. What was she playing on her phone?
Wordle
After she informed me of a group chat with other wordlers I was hooked. We were all doing the wordle each morning,
trying to get the best score. After a couple of games I felt as though I understood the rules pretty well and
thought that a wordle solver wouldn't be too hard.
Just like they taught me in Compt 101, I began with the pseudo-code. This type of code isn't actually
code but more of a quick way to be able to explain what you want the more complex code to do. Typically I
only include the more complex parts and leave out any of the basic steps.
1. Get list of all possible 5 letter words
2. Get optimal guess
3. Get the colours of the guess
4. Apply results of 2. to 1.
5. Repeat steps 2 through 4 until hidden word is found
The above list is very similar to what I do when I am playing wordle without help. The computer and the
human have different advantages, which I have outlined in the table:

| David | Computer |
| --- | --- |
| Think of a lot of 5 letter words: dynamic thinking; hard to think of obscure words | Import a list of all 5 letter words: long and complete list; static and strictly 5 letters |
| Think of a good guess: can use "gut feeling"; impossible to consider all guesses | Use algorithm to get best guess: will always get best guess; tricky implementation |
| Look at results from guess: quick; revisit info many times | User inputs results: no obvious downsides; only needed once per guess |
| Think of new guess: dynamic; hard to make best use of previous information | Apply results to word list: perfect use of results; rules must be well defined |
I assume that you have a good idea of how a human would go about completing each of the above steps,
so I will mainly talk about how the computer code will go about it.
The first thing I did was head to the wordle website to inspect the source code for clues. This was back
when the game was hosted by the creator on www.powerlanguage.co.uk/wordle/
Since this was a small game at the time none of the code was too hard to get to and soon enough I found myself
looking at the brains of the operation. It was hard to read except for the massive list of five letter words.
There were two lists in the code:
1. Possible guesses
2. Possible answers
2. was a subset of 1. because the creator wanted to make the hidden word a "typical" word. This word segregation
is best explained through some role-play. Let's listen in on Rem and Jordyn playing a game of Scrabble where Rem
just played a 5 letter word Jordyn has never heard of before...
(note: all characters are fictional, any similarity to actual persons is purely coincidental)
"Rem: I will play Yrivd"
"Jordyn: That is not a word"
"R: Sure it is, should we look at the dictionary?"
"J: Okay, but when it's not in there you lose a turn"
They turn to the back of the book and find that sure enough "Yrivd" is in there and is the past participle of rive.
"R: 24 points because I am on a double word tile"
If a word would elicit the same situation as we just went through, it most likely is not in list 2. Here I ran into
the first major decision of this journey: do I use just list 2. and have a great but specific solver, or do I
use 1. and have a good but more general solver? Spoiler: I chose the latter.
Disclaimer: I know algorithm is a scary word nowadays but mine will only discriminate against bad words.
This problem kept me up many nights (well Anne also kept me up but who's counting) trying to think of what makes
a "good" guess. My original plan was to find the frequency of each letter in each spot of all the words and sum them up
to get a score. This meant that the score was dependent on not only the letter but also what spot the letter was
in. The reasoning behind this was simple: I had green-goggles. I thought that getting a green letter was the best
type of information so I highly valued the position of the letter within the word. After running the algorithm for
the first time, it suggested the word: Esses.
This is a terrible starting word, or any word for that matter. What even is an esses?
"plural noun, a thing shaped like the letter S" -google. Great. The real reason this is such a bad word is the
repeating letters. I can't blame the code because I wrote it, but it sure is a stupid suggestion. I like to think of
double (and triple, for that matter) letters as a new letter in the alphabet with respect to wordle. So for the same
reason that words like zygal are not good starting words, double letter words are not good either: there just aren't
that many words with double or triple letters to make them worth a guess in
the early rounds.
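To make the position-based scoring concrete, here is a minimal sketch of the idea. It is an illustration, not the code from the repo: the function name `positional_scores` and the five-word list are made up for the example.

```python
from collections import Counter

def positional_scores(words):
    """Score each word by how common its letters are in their exact positions."""
    length = len(words[0])
    # One Counter per position: how often each letter shows up in that slot.
    position_counts = [Counter(word[i] for word in words) for i in range(length)]
    return {
        word: sum(position_counts[i][letter] for i, letter in enumerate(word))
        for word in words
    }

# Hypothetical mini word list, only for illustration.
words = ["cares", "bares", "cores", "esses", "zygal"]
scores = positional_scores(words)
best = max(scores, key=scores.get)
```

Note that nothing here stops a repeated letter from being counted again in every slot it occupies, which is how a word full of common letters in common positions can float to the top of a full-size list.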
After adding in a penalty for multiples of letters I ran the code again and got Cares. Okay. This seems to be a good
starting word from my perspective. It has a couple of vowels in it and an s, which I would imagine is quite common. This is
good but I should take off my green-goggles and try to get some yellows as well. This was a quick change because I now just
cared about the frequency of the letter rather than the position. The final picking algorithm looks like this:
The only bit that might need explaining is the divisor in the for loop. This is a constant that makes sure the scores are
a decimal number which is helpful during debugging and can technically be removed. The last part of this algorithm finds
the top 3 words and prints them as suggestions: AROSE, ARISE, STALE. Those all seem like pretty good guesses to me.
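As a rough sketch of position-free scoring with a repeat penalty, something like the following would work. The details are assumptions made for illustration and not taken from the repo: the name `top_guesses`, the choice to count each distinct letter only once as the penalty, and the choice of the total letter count as the divisor.

```python
from collections import Counter

def top_guesses(words, top_n=3):
    """Rank words by overall letter frequency, ignoring position."""
    letter_counts = Counter(letter for word in words for letter in word)
    # Assumed divisor: a constant that only rescales the scores into decimals.
    divisor = sum(letter_counts.values())
    scores = {}
    for word in words:
        # set(word) counts each distinct letter once, so repeats earn nothing
        # (this is the assumed form of the penalty).
        scores[word] = sum(letter_counts[letter] for letter in set(word)) / divisor
    return sorted(words, key=scores.get, reverse=True)[:top_n]

# Hypothetical mini word list, only for illustration.
words = ["arose", "esses", "zygal", "stale", "cares"]
suggestions = top_guesses(words)
```

On this tiny list the three words with five distinct, common letters tie for the top, while esses and zygal fall to the bottom. Dropping the divisor would change the numbers but not the ranking.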
This part of the project was by far the trickiest. I explained the rules earlier and it took only a sentence or two, but
that was not the case for the code. There are many pitfalls and traps that I walked into in this section. My plan was
to start with the list of possible words and remove any word that does not conform to the results of the previous guess.
If I keep modifying the most recent list, I only ever have to take into consideration the most recent result. I go
through every word in the list and see if it adheres to the rules from the result. I first assume that the word
DOES adhere to the rules and then put it through a series of tests. If the word doesn't fail, it gets to stay in the list. If it
fails, it gets removed from the list.
Let's start with the basics: each position a letter is in must be assigned one colour and one colour only: Green, Yellow,
or Black. Green is the simplest.
If the word under scrutiny does not have all the matching green letters, remove it. greenList is a list of all of the green
letters in the most recent result along with their corresponding position. The word must contain all of the green letters in
greenList, each in its listed position, to pass.
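A minimal sketch of the green test might look like this. It assumes greenList is stored as (position, letter) pairs with positions counted from zero; that layout and the name `passes_green` are guesses made for illustration.

```python
def passes_green(word, green_list):
    """Keep the word only if every green letter sits in its recorded position."""
    return all(word[position] == letter for position, letter in green_list)

# Hypothetical result: the first letter of the guess came back green as "c".
green_list = [(0, "c")]
candidates = ["caulk", "later", "crane"]
survivors = [word for word in candidates if passes_green(word, green_list)]
```

An empty green list lets every word through, which is the behaviour you want after a guess with no greens.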
Just because a word passes the green test does not mean it is home free. The next test is the yellow test. The yellow letters
are more complex because they have two rules. The first rule is the opposite of the green rule. If a word has the same letter
in the same position as a yellow letter, it is removed. This is because yellow means correct letter but wrong space. The
second yellow rule is that the word under scrutiny must contain all the yellow letters. If both of these conditions are
met, the word passes to its final test.
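The two yellow rules can be sketched as follows. This assumes the yellow letters are kept as (position, letter) pairs, zero-indexed; the name `passes_yellow` and that layout are illustrative assumptions, not the code from the repo.

```python
def passes_yellow(word, yellow_list):
    """Apply both yellow rules: not in the guessed spot, but somewhere in the word."""
    for position, letter in yellow_list:
        if word[position] == letter:
            return False  # rule 1: same letter in the same spot would have been green
        if letter not in word:
            return False  # rule 2: the letter has to appear somewhere
    return True

# Hypothetical result: an "r" in the first position came back yellow.
yellow_list = [(0, "r")]
candidates = ["later", "rates", "stale"]
survivors = [word for word in candidates if passes_yellow(word, yellow_list)]
```

Here rates fails the first rule, stale fails the second, and later survives.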
The Black Test.
Black was surprisingly the most complex rule. This is best shown in an example:
In the above guess the hidden word is Caulk. The first C is green but the last is black. If I were to simply remove all
words that contained a black letter then Caulk would be removed and the solver would fail. The same is true when the repeated
letter is yellow rather than green. This means I first need to check that the black letter is not also a green or yellow letter
in a different position; only if it is not can every word containing it be removed. Finally, just
like with the yellow letters, all words with a black letter in the same position can be removed.
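Here is a minimal sketch of the black test as described. The (position, letter) layout, the name `passes_black`, and the guess "cynic" are assumptions chosen for illustration. The sketch follows the rules in the text only, so it does not count how many copies of a letter the hidden word holds.

```python
def passes_black(word, black_list, green_list, yellow_list):
    """Apply the black rules while sparing letters that are green or yellow elsewhere."""
    known_letters = {letter for _, letter in green_list}
    known_letters |= {letter for _, letter in yellow_list}
    for position, letter in black_list:
        if word[position] == letter:
            return False  # a black letter can never sit in the spot it was guessed
        if letter not in known_letters and letter in word:
            return False  # a purely black letter cannot appear anywhere
    return True

# Hypothetical guess "cynic" against the hidden word "caulk":
# the first c is green, everything else (including the last c) is black.
green_list = [(0, "c")]
black_list = [(1, "y"), (2, "n"), (3, "i"), (4, "c")]
keeps_answer = passes_black("caulk", black_list, green_list, [])
drops_comic = not passes_black("comic", black_list, green_list, [])
```

Without the `known_letters` check, the black c would throw out every word containing a c, the answer included.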
Force Computer to Play Many Wordle Games
The wordle solver is now complete. It is playable and seems to get good results. However, I did not know how good the
results were. The final step was to create a script to play wordle automatically. The only interesting step in this part
was that I had to generate the result of each guess as well. It is just a different way of applying the rules I
laid out previously. Assuming that any word could be the hidden word, LATER got a score of 4.60. This means that on average
it took 4.6 guesses to get the correct word with LATER as the starting word.
Slightly more narrowly, if I use only the answer list from earlier, LATER gets a score of 4.2 while ARISE gets a score of 4.12 and
CRANE a score of 4.15.
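For a feel of how such a self-playing script could work, here is a minimal sketch. It is not the code from the repo, and the names `feedback`, `play` and `average_score` are made up. It also swaps in two shortcuts: the word list is narrowed by keeping every word that would have produced the same colours, and the next guess is simply the first remaining candidate rather than the best-scoring one. It assumes the hidden word is in the word list.

```python
from collections import Counter

def feedback(guess, hidden):
    """Return one colour per letter: g (green), y (yellow) or b (black)."""
    result = ["b"] * len(guess)
    unmatched = Counter()
    for i, (g, h) in enumerate(zip(guess, hidden)):
        if g == h:
            result[i] = "g"
        else:
            unmatched[h] += 1  # hidden letters still available for yellows
    for i, g in enumerate(guess):
        if result[i] != "g" and unmatched[g] > 0:
            result[i] = "y"
            unmatched[g] -= 1
    return "".join(result)

def play(hidden, words, first_guess):
    """Count the guesses needed to reach hidden (assumed to be in words)."""
    candidates = list(words)
    guess, turns = first_guess, 1
    while guess != hidden:
        pattern = feedback(guess, hidden)
        # Keep only words that would have produced the same colours.
        candidates = [w for w in candidates
                      if w != guess and feedback(guess, w) == pattern]
        guess = candidates[0]  # stand-in for picking the top-scoring suggestion
        turns += 1
    return turns

def average_score(words, first_guess):
    """Average number of guesses when every word takes a turn as the hidden word."""
    return sum(play(hidden, words, first_guess) for hidden in words) / len(words)

# Hypothetical mini word list, only for illustration.
words = ["later", "rates", "arise", "crane", "caulk"]
score = average_score(words, "later")
```

The `unmatched` counter is what keeps repeated letters honest: a second copy of a letter only turns yellow if the hidden word still has an unclaimed copy of it.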
If you want to have the best shot at getting a low score, this wordle solver can help you do it. The best strategy when playing
the official wordle is to use the top suggestion while more than ~40 possible answers remain and then switch to picking words you
recognize later in the game. For all of my code check the Github repo and for some more wordle deep dives check here.