Neuroevolution and Wordle - Project Introduction
If you’ve ever solved a Wordle manually, you’ve probably wondered if it’s possible to use a combination of Genetic Algorithms and Reinforcement Learning to produce a neural net that can solve them for you. I certainly have. Now that I have an Nvidia 5070 Ti GPU at home, it’s finally time to find out.
Historical Background
In the before times, when we used to type code into a keyboard with our fingers, I wrote a Go programme that plays Wordle using dictionary files. It keeps track of a shortlist of remaining possible solutions, and favours the guess with the best worst case: the one whose least helpful coloured-tile feedback still shrinks the shortlist the most.
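That worst-case shortlist-reduction rule can be sketched in Go. To be clear, this is my own reconstruction, not the original programme's code; the function names and the tiny five-word dictionary are made up for illustration:

```go
package main

import "fmt"

// feedback returns the coloured-tile pattern for a guess against a target,
// as a string of 'g' (green), 'y' (yellow), 'b' (grey).
// Assumes lowercase a-z five-letter words.
func feedback(guess, target string) string {
	pattern := []byte("bbbbb")
	var counts [26]int
	// First pass: mark greens, and count target letters not matched exactly.
	for i := 0; i < 5; i++ {
		if guess[i] == target[i] {
			pattern[i] = 'g'
		} else {
			counts[target[i]-'a']++
		}
	}
	// Second pass: mark yellows, consuming the leftover letter counts.
	for i := 0; i < 5; i++ {
		if pattern[i] != 'g' && counts[guess[i]-'a'] > 0 {
			pattern[i] = 'y'
			counts[guess[i]-'a']--
		}
	}
	return string(pattern)
}

// worstCase returns the size of the largest feedback bucket a guess can
// leave behind, i.e. how big the shortlist could still be afterwards.
func worstCase(guess string, shortlist []string) int {
	buckets := map[string]int{}
	worst := 0
	for _, solution := range shortlist {
		p := feedback(guess, solution)
		buckets[p]++
		if buckets[p] > worst {
			worst = buckets[p]
		}
	}
	return worst
}

// bestGuess favours the candidate whose worst case shrinks the shortlist most.
func bestGuess(candidates, shortlist []string) string {
	best, bestScore := "", len(shortlist)+1
	for _, g := range candidates {
		if s := worstCase(g, shortlist); s < bestScore {
			best, bestScore = g, s
		}
	}
	return best
}

func main() {
	shortlist := []string{"crane", "crate", "trace", "brace", "grace"}
	fmt.Println(bestGuess(shortlist, shortlist)) // crate
}
```

Here "crane" is a poor choice because three of the five possible solutions produce the same feedback pattern, leaving a worst-case shortlist of three, whereas "crate" leaves at most two.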
Design Overview for Neuroevolutionary Wordle
In my head, there is a picture of a model.
(Actually, in my head there are no pictures, but that’s a topic for another time.)
This model has several components:
- Neural net with weights
- Output embedding
- ‘Hard-coded’ embedding
- ‘Free’ embedding
Keep an eye out for future posts in my Neuroevolutionary Wordle series for more information about what these mean, but the important parts are these:
- there is a neural net with trainable weights;
- at least part of the output embedding needs to be changeable during training.
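To make that split concrete, here is a rough Go sketch of how the components might hang together. The field names and sizes are placeholders of mine, not anything from the actual model:

```go
package main

import "fmt"

// Model sketches the components listed above. The output embedding has one
// row per word in the guess dictionary, split into a hard-coded part that is
// fixed before training begins and a free part that changes during training
// alongside the network weights.
type Model struct {
	Weights   []float32   // neural net weights — trainable
	HardCoded [][]float32 // 'hard-coded' embedding rows — frozen
	Free      [][]float32 // 'free' embedding rows — trainable
}

// trainableParams counts the values training is allowed to change: the
// net's weights plus the free portion of the output embedding.
func (m *Model) trainableParams() int {
	n := len(m.Weights)
	for _, row := range m.Free {
		n += len(row)
	}
	return n
}

func main() {
	m := &Model{
		Weights:   make([]float32, 1000),
		HardCoded: [][]float32{{0.1, 0.2}, {0.3, 0.4}},
		Free:      [][]float32{{0, 0}},
	}
	fmt.Println(m.trainableParams()) // 1000 weights + 2 free embedding values
}
```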
This brings us to the topic of how to do the training.
How to Train the Neural Net
The “Sensible” Way - I’m not doing this
The sensible thing to do here is to use my existing Wordle solver as a source of synthetic data. Training would happen in two phases:
- The network is trained using back-propagation until its behaviour resembles that of the existing Wordle playing programme
- Reinforcement Learning lets the model play loads of Wordles, and figure out how to get better
This is an extremely sensible approach, but leaves little room for messing around with CUDA and playing with my new Nvidia-made toy, so I will not be doing it.
The “Silly” Way - I am doing this
The alternative approach to training is Neuroevolution, and it goes like this:
- Genetic Algorithm changes the model weights and the trainable portion of the output embedding
- Reinforcement Learning lets the model play loads of Wordles, and figure out how to get better
Both approaches address the same obvious problem: unguided learning will struggle to get going on its own. The action space is large, and the nature of the game is that almost all actions result in failure. Try playing a Wordle by picking six random words out of the dictionary, and you’ll see what I mean.
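To put a rough number on that, assume a guess dictionary of about 13,000 five-letter words (that figure is my assumption, based on the commonly quoted size of the Wordle guess list):

```go
package main

import "fmt"

func main() {
	// Assumed size of the allowed-guess dictionary.
	const dictSize = 13000.0
	const guesses = 6

	// Probability that six uniformly random guesses include the answer:
	// 1 - (1 - 1/N)^6, which for large N is very nearly 6/N.
	p := 1.0
	for i := 0; i < guesses; i++ {
		p *= 1 - 1/dictSize
	}
	fmt.Printf("chance of a random win: %.4f%%\n", 100*(1-p))
}
```

That works out to well under a tenth of a percent, so a randomly initialised network sees essentially no reward signal until something nudges it towards sensible play.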
Of the two approaches, using the high-quality synthetic data source is the more likely to succeed. However, I haven’t written a GA since the Spice Girls were putting out records, and I really want to find out if I can make a Neuroevolution approach work here. In particular, the GA will be written in CUDA, which I look forward to.