How I created a chess AI from scratch
After watching The Queen’s Gambit on Netflix, I developed a sudden interest in chess. Like many others, I went to the internet to learn and try it out. And, just like many others, I realised that I am terrible at it.
Sure, I could practice and get better, but that sounds like a lot of effort. Instead, I had a completely different idea: what about creating a bot to play chess for me? I already know how to code, so this should be relatively easy, right? Well…
Implementing the game
The final game has an interface that I put together using Xamarin Forms, Prism, and UWP, and it allows you to play using the mouse. You can find all the code here.
Needless to say, before I can explain the juicy details about how I implemented all of this and tackled the various problems I encountered, we need to dive into the basics of how the algorithms that I used for this project work.
The minimax algorithm
First things first, I needed an AI. But I didn’t want to use a neural network, I wanted to do it old-style, so I went for an artificial-intelligence algorithm that I remember studying in uni: minimax. I like this algorithm because it is easy to understand, easy to implement, and very intuitive — and because it uses recursion, which I am a big fan of. By the way, just as a quick disclaimer, I must say that the version of the algorithm that I will explain here is actually called negamax.
In a nutshell, this algorithm tries to find the best move for the next turn by analysing the output of all the possible sequences of moves that both players could play. This is, in order to find the best move, it uses brute force to study all the possible developments of the game. This sounds simple enough, but when both players are trying to compete for the win, there are a few caveats.
For example, let’s consider we are using it to play tic-tac-toe.
Taking a certain initial state, it is easy to imagine the search tree that the algorithm will have to go through in order to try all the potential moves available:
As the figure above shows, the first possible move (a) leads to a defeat (a.a) or to a tie (a.b.a), the second one (b) leads to a victory (b.a.a) or to a tie (b.b.a), and the third one (c) leads to a certain victory. This search tree includes all the possible moves for both players, but not all of them would make sense if both players are playing well. Therefore, the algorithm will not only generate this tree, but it will also analyse it.
Without thinking about code yet, let’s focus on the subtree generated by the first possible move, a. If we assume that the opponent (red) is trying to get the best outcome for themselves, then we have to disregard the possibility of a tie (a.b.a), as the red player will definitely play the move a.a in order to secure their victory in their turn, ignoring the move a.b and causing blue’s defeat immediately. Ultimately, this means that the algorithm will have to label the initial move a as an inevitable path to blue’s defeat, given that the only other possibility derived from it (the tie produced by the move a.b.a) cannot happen if red plays well in their turn by making the move a.a.
Similarly, the algorithm will also understand that the second move (b) will only lead to a tie (b.b.a) and never to a victory (b.a.a), as red will always choose the move that stops blue from winning (b.b) over the one that doesn’t (b.a) — from red’s perspective, a tie is a better output than a defeat.
I could keep going, but I hope you get the idea. As one would expect, the algorithm will end up selecting the move c as the best one, as it gives blue an immediate victory. Obvious.
Show me the code!
As we have seen, the logical ideas behind this algorithm are very intuitive: think of all the possibilities, try to guess what your opponent would do in each case under the assumption that they will play well, and then choose the move that would lead to the best possible outcome for you, all things considered. However, how does this all really work?
To proceed further, we will have to distance ourselves from the specifics of tic-tac-toe. Instead, we should think of any possible zero-sum game, whether it is tic-tac-toe, chess, or any other — the minimax algorithm can also be used for non-zero-sum games, but we will ignore them to keep things simple.
As a side note, and just in case you haven’t heard the term before, “zero-sum games” is simply a fancy way of referring to games in which what is good for one player is proportionally bad for the other(s). In most board games where there isn’t an explicit score, such as tic-tac-toe or chess, the most obvious example of it is that what one player sees as a victory, the other one sees as a defeat.
Now that we have understood the general concepts explored with the tic-tac-toe example, we will have to get our hands dirty and write the code that will be able to go through all the possible sequence of moves of any given game. As I mentioned previously, the algorithm will not only need to go through all the possible sequences of moves, but it will also need to analyse and compare each one of them in order to find the best move.
Below we have an example of how the code might look:
The algorithm is pretty simple, and you can guess what is supposed to be doing by reading the lengthy comments I have added to it.
Basically, it will calculate the best possible move at each level of the tree by trying all the possible ones in a loop and then comparing their scores. This is, for each move in the loop:
- The move will be made.
- If the move happens to end the game (i.e., if there are no more moves available after this one has been made), the algorithm will just calculate the score of the move by getting the final score of the player.
- If the move doesn’t cause the game to finish, then the algorithm will calculate the best possible next move for the opponent by calling itself again, which will bring us down one level in the tree. While necessary, this will keep happening recursively. Then, once the call returns and we are back at the level that made the original recursive call, the score of the returned move will be inverted to account for the fact that it was calculated from the opponent’s perspective — we must remember that a victory for the opponent is a defeat for us, and vice versa. This inverted score will be considered as the score of the move, as it is the score that the move would eventually lead to.
- Once the move has been given a score, regardless of whether such score was calculated directly or via recursion, the algorithm will compare it to the score of the best move found so far. If this new score happens to be higher than the score of the current best move, then the move will become the new best one.
- Finally, the move will be undone so the next iteration can start fresh — the point of doing this is to avoid altering the state of the game after executing this method.
This will happen repeatedly until the loop is over. After that, the move with the highest score will be returned as a result.
I understand this all might still sound confusing, and probably pretty abstract too. However, this algorithm is very easy to understand once we apply it to a concrete example.
The algorithm in action
If we go back to the tic-tac-toe tree, we can finally see step by step how the algorithm would work in a real scenario.
As the figure below shows, the move a will be the first one to be analysed because it comes first in the loop of the top level, the one that iterates over a, b, and c. Since this move doesn’t cause the game to finish, its score will be calculated recursively in the levels below. This way, at the second level, the algorithm will iterate over the two possible moves for red after a has been played by blue, which are a.a and a.b. Then, the scores of these moves will be calculated, and the best of the two moves, which will be a.a with a score of +1, will be returned. Please note that there was no need to mention the third level here because the loop at that depth only has one move available (a.b.a), which will necessarily be the one returned up as a result, causing a.b to have a score of 0.
Since no other moves have been studied yet in the top level of the tree, the move a, with its score now determined as -1, will temporarily become the best move.
In the second iteration of the loop on the top level of the tree, the score of b will also be calculated recursively. In this branch, once the algorithm gets down to the second level of the tree, the moves to compare will be b.a and b.b. In this case, the move that will be returned as the best one will be b.b, as its score of 0 makes it better than b.a, which has a score of -1. This will leave the value of the move b as 0 — once again, there is no need to focus on the third level of the tree here because the loops at that depth only have one possible move each, so they will always return their one and only move.
After this iteration, the move b, with its score now determined as 0, will replace a as the best move. As we can see, the algorithm here has managed to discard an unrealistic possibility of victory for blue (b.a.a) by assuming that red would use the move b.b in their turn in order to minimise blue’s score, stopping them from winning.
Finally, in the third and last iteration on the top level of the tree, the algorithm will find that the move c has a score of +1. This will result in c being considered as the best move for blue, as it has the biggest score found so far at this level.
Since c was the last move available to explore, the algorithm will now finish, returning c as the best move — if we remember the original tree for our tic-tac-toe example, c was the move that would make blue win instantly, so it makes sense for the machine to choose this move.
And that’s it, that’s how the most basic version of the negamax algorithm manages to create the illusion of intelligence, by trying out every last combination of moves with recursion in order to determine the best move for each occasion.
This algorithm is very elegant, and it certainly works like a charm in games like tic-tac-toe, where it’s easy to explore all the possibilities. However, what happens when we try to use it in games like chess, where the number of possible move combinations is so vast that no computer could ever have the time to explore them all?
Adding heuristics
To try out this algorithm, I initially implemented a tic-tac-toe game, and it all worked without problems; the machine would never lose. However, when I later implemented the game Connect 4, my computer was unable to process the immense amount of possible developments of that game (according to Google, there are exactly 45,319,852,190,92).
For complex games, we need to limit how deep our algorithm is allowed to go in the search tree in order to reduce the processing needed to find the next best move. Regrettably, this means that some branches will have to stop going down the tree before having found a final solution. And in those cases, when we cannot try more moves but the game isn’t finished and doesn’t have a clear winner yet, we will need to determine the score of the move with a heuristic function.
To make things work, this function will try to accurately calculate how likely a player is to win or to lose depending on the state of the game.
For example, a good heuristic function to know how well a player is doing in Connect 4 would be one that counts how many lines there are where the player has three of their chips and an empty space for another one — the more of these lines where a victory could potentially be achieved, the higher the player’s score. This simple heuristics is actually the one I implemented in my own version of Connect4, and it works surprisingly well, making the machine play quite smartly.
With a limitation on the depth of the tree and a heuristic function, this is how the algorithm will now look like:
In most games, the algorithm will rarely be able to reach nodes where the game is finished, and this means that a good heuristic function will be key to have an AI that plays well and intelligently.
Unfortunately, as much as limiting the depth of the tree helps to reduce the processing needed by the algorithm, this code is still not performant enough to be used for a game as complex as chess. For that, we will need to learn about further optimisations that can be applied to this algorithm.
Alpha-beta pruning to the rescue
It is very unusual to implement the minimax algorithm the way I have explained it so far, as it is way too slow for most games. Instead, an optimised variation of it called alpha-beta pruning is usually what most developers would use.
In this variation, the algorithm won’t always evaluate all the possible moves of every level. Instead, before studying each move, the algorithm will apply some simple logic to determine whether or not it would be worth to continue iterating over the moves of the current level.
Although this is achieved with a very minor change in the code, the increase in performance is potentially and practically enormous.
To understand how this technique works, we can take a look at the following diagram:
Imagine that the algorithm, after iterating over the first five moves that blue has available at the first level of the tree, is now trying to get the score of the sixth move, which is located at the top right side of the figure. To calculate it, it will have to go down one level in the tree using recursion. Then, down there at the second level, the algorithm will have to iterate over a loop that has the two moves that are available to red at this point. So far, so good.
We can see that the first move of the two has a score of +1, which means that it will be returned as the best move unless the other move in the loop happens to have a higher score (in all our examples, no move could have had a score higher than +1, but let’s assume that it could happen in this case). Normally, the algorithm would just keep going through the loop and calculate the score of the next move, but here we can be smarter than that.
At this point, while iterating over these two moves at the second level of the tree, we know that the current best move found at the level above has a score of 0.3. And we also know that, from the second level, the score of the move that will be returned will be either +1, which is the highest score found so far, or an even higher one if the second move happens to have a higher score. However, we also know that the score to be returned from this second level of the tree, whatever its value, will be inverted in the level above, which will turn it into -1, or into an even smaller number depending on what gets returned. Therefore, we can already conclude that the score that will be returned from this second level, regardless of whether it is +1 or an even higher value, will be ignored at the first level, as 0.3 will always be bigger than the inverse of any number equal to or bigger than +1. Thus, we don’t need to evaluate any more moves at the second level because we now know that the level above will definitely ignore whatever result the second level ends up returning.
This is already pretty nice, but there is even more that we can do. For example, if we go back to assuming that the maximum possible score for any move is +1, another improvement we could apply is to stop the search on a level as soon as a move with this score is found. After all, if we know that it will be literally impossible to find a move with a better score than +1, why bother studying any more moves?
And it is just like that, by applying very simple logical rules, that the algorithm will sometimes decide to stop the loop and return immediately, drastically reducing the number of moves that it would normally need to analyse. Amazing stuff!
Show me the code… Again!
After implementing these changes, this is how our code will look:
As you can see, the changes here are pretty minor, but the improvement in performance is huge and usually very noticeable.
The chess game
After going through all these basics, and once I had all of this implemented in a generic way and working efficiently for my versions of tic-tac-toe and Connect 4, I was finally ready to tackle the chess problem. Or so I thought…
Creating a simple UI
Although the aim of the project wasn’t to create a fully-fledged app, it was necessary to create some sort of interface that would allow me to play chess against the machine. When I implemented tic-tac-toe and Connect 4, a small console app was more than enough. But with chess, that option wouldn’t suffice.
Since I needed as much computing power as possible, having the app running on my computer was a must. However, as a mobile developer, I am not used to developing apps for desktop, which is why I chose Xamarin Forms, as it is a framework that I am familiar with that can generate Windows apps via UWP.
As I have expressed in the past, I am not a huge fan of Xamarin Forms. In theory, it is ideal for prototyping, but in practice, you usually end up wasting tons of time battling with its bugs, of which there is no shortage. Nevertheless, I gave it a shot for this project because I assumed that things would have gotten better by now, as it had been a while since the last time I used it.
Shamefully, things don’t seem to have changed much. For example, the CollectionView, which was supposed to be the new and shiny replacement for the outdated, bug-riddled ListView, has all sorts of issues on UWP, ranging from erroneous spacing to certain events not firing. And so do the buttons and many other elements.
Given the above, it was more of a struggle than it should have been, but I eventually managed to get a UI that would work to help me test things out.
Implementing the heuristics
Much more important than the UI, though, is the heuristics of the game. Getting it right is especially important in chess for two reasons:
- Most chess games need a lot of moves to finish (38 on average, to be precise). This means that, more often than not, the algorithm will have to rely on the score provided by the heuristic function.
- Chess is a complex game with a lot of rules, such as the different types of moves each piece can make. This will cause each iteration of the alpha-beta algorithm to run slower than with other games, forcing us to put a tight limit on the maximum depth of the tree in order to speed things up — we don’t want a bot that takes five or ten minutes to decide each move. In turn, this will make it even harder for the algorithm to reach nodes where the game is actually finished and no heuristic function is needed.
Bear in mind, heuristics in chess is a tricky business. It’s not only that the game is complex and very different at the beginning than at the end, or that chess has a lot of rules to implement. The main issue is that the more complex the heuristics, the worse the algorithm will perform, as it will have to execute the heuristic function thousands of times. Hence, unless you are using a supercomputer like Deep Blue, there needs to be a balance between having a complex, refined heuristics and having a simple, highly-performant one.
Initially, just to start things up, I implemented the most simple heuristic I could think of, one where the score would be the total value of all the pieces that the player has left on the board. For this to work properly, I gave each piece a different individual value:
- Pawn: 100
- Knight: 320
- Bishop: 333
- Rook: 510
- Queen: 880
- King: 1000
Translated into code, what I just explained looks very simple:
Obviously, this heuristics does the trick for a start, but it is lacking.
To create a good heuristic function for a chess game, there are all kinds of things that you could do. And in some cases, you can even bypass this function altogether — for example, by using an opening book or an endgame tablebase, which are databases with thousands of possible scenarios that contain the best move for each one of them, something that many chess applications use to avoid having to calculate most of the moves.
But I wanted to keep things simple, and I saw no fun in downloading a gigantic database with chess moves from the internet — if I wanted a perfect chess engine, I would download Stockfish or some other one. What I wanted was to create a chess AI that would play guided only by the power of my code, so I refined my heuristics until I got to something performant that I was happy with:
Here there are four different weighted criteria being used to calculate the player score, so I will break them down and explain them individually — please keep in mind that this isn’t a perfect heuristics for chess by any means, but just the arbitrary measurements I settled on after some basic trial and error.
1. Score of the pieces (65% of the weight)
As I explained above, this is calculated by adding together the values of the remaining pieces of the player.
Also, in addition to giving each piece a value depending on its importance, I included modifiers that would increase or decrease the value of each piece depending on its position. This is because in chess each piece is stronger or weaker depending on where it is located on the board.
For the values of these modifiers, I took the numbers given in this article:
2. Potential score of the moves (20% of the weight)
This is calculated by getting all the available moves for the player, giving each one of them a score depending on how promising it seems, and then adding all of these scores together.
Specifically, this is the very simple formula I implemented to calculate the potential score of each move:
Potential score of the move = value of the piece after moving — value of the piece before moving + value of the piece to capture (if any)
3. Number of available moves (10% of the weight)
This is calculated by counting how many moves the player can choose from in their turn.
4. Number of pieces (5% of the weight)
This is calculated by counting how many pieces the player has left.
Testing the AI
Creating a UI, defining a heuristic function and implementing the rules of chess is definitely a lot of fun, but it would all be for nothing without being able to test whether the implementation works and makes sense. And getting to test it isn’t always easy.
Initially, I would just try to play against the machine and see how well it would behave. This was very slow and unproductive because:
- It is hard to really measure how well the machine is playing just by looking at its moves. Wait, was that weird last move absolutely genius, or is the machine simply making a mistake?
- Playing chess against the machine was very time-consuming, especially in the first iterations of the project, as the code wasn’t too refined and the machine would take a long time to choose its next move.
- I am terrible at playing chess. The machine would beat me every time, sure, but so would a monkey moving pieces at random.
Eventually, I had a better idea. I went to the internet, looked for chess puzzles, and baked a few into the app. I would let the machine choose what move should be made next on each puzzle, and then I would compare the machine’s choice to the move that actually solves the puzzle. If the machine managed to choose the right move, then I would consider that it passed the test.
To my dismay, I realised that the machine would fail to solve most of the puzzles. However, it wasn’t because it was too dumb due to having bad heuristics, but because I had limited the depth of the search tree to a maximum of 4. Interestingly, the machine would actually solve all the puzzles when the maximum depth was set at 5.
On the one hand, this was great news because it meant that there was some “intelligent” thinking going on, as the machine showed capable of solving puzzles. On the other hand, having a maximum depth of 5 meant that each move would take several minutes to calculate, which would make the app unplayable.
At this point, I realised that something had to be done to improve the performance of the algorithm. I needed to be able to run it with a maximum depth of 5, as that seemed to be the minimum required to solve simple puzzles. But I needed to be able to do it without having to wait minutes for the machine to move — this is, I needed to reduce the waiting times by an order of magnitude to go from minutes to seconds… But how?
Tests screen
I had a few ideas on how to improve the speed of the app, but before doing anything, I wanted to be able to measure the times reliably, so I created a tests screen where I would be able to click a button to run all the tests:
From this point on, I would always use these tests and their times as a reference to check whether my changes were improving the performance or making it worse, and also as a tool to verify that the AI was still working and not broken.
Alpha-beta pruning running in parallel
When thinking about performance, the first thing that came to my mind was parallelisation, so I refactored the algorithm to run each branch in parallel. This was surprisingly easy to do, but the performance didn’t improve that much.
Since I was trying to get parallel code in C#, I used awaitable tasks in combination with a call to Task.WhenAny from inside a foreach loop (see this article for reference). But after some researching, I found out about the Task Parallel library.
This library seemed to be designed specifically for this kind of job, so I decided to refactor the code again to start using its methods. After the changes, the performance of the app did seem to be better than using tasks.
Eventually, after some more trial and error, I settled for a hybrid approach, as it was the one that seemed to work the best in my machine. In this approach, the parallelisation, which was implemented with the Task Parallel library, would be applied only to the top level of the tree, meaning that the algorithm would run without any parallelisation in all the other levels.
After going through all these changes without getting good-enough results, I started to get a little bit desperate. I had managed to improve the performance of the app significantly thanks to the parallelisation, but not nearly as much as I needed to, given that some moves would still take minutes for the algorithm to calculate. And I wasn’t even sure I would ever manage to improve things any further.
At this point, I needed a breakthrough.
The potential of sorting moves by potential
Luckily, I found a very simple trick that made the app much, much faster: before iterating over the moves, the algorithm would sort them from the most promising one to the least promising one.
As I explained earlier, each move has a “potential score” that is used by the algorithm to calculate a fraction of the total heuristic score of the player. And, as I also explained earlier, the algorithm has the ability to discard entire branches whenever it detects that they will never return anything better than the best score found so far. We just have to piece all of this together.
We know that the higher the current best score found by the algorithm when iterating over the moves of a level of the tree, the higher the chances of having entire branches discarded. And we also know that the algorithm will be more likely to find high scores in the moves that show promise. Therefore, if we make sure that the algorithm always studies the promising moves first, we increase the chances of finding high scores early in the search, which in turn increases the chances of getting branches entirely discarded. Pretty neat!
Just so you can see how powerful this technique is, take a look at how much it improves the times that it takes for the app to run each test:
Finally, the app was fast enough for me to use it with a maximum depth of 5 without having to wait for ages!
At this point, I felt like I was done with this project, as I was finally satisfied with the result.
Starting from scratch, I had managed to create an AI capable of playing chess — or, at the very least, capable of solving chess puzzles — , and I had also managed to refine it until it was fast enough to become playable. I was pretty proud of the result, and even more so after a friend of mine who plays chess often lost a game against my AI, but there is no denying that this is as amateur as it gets.
Room for improvement
There are many things I know could be improved, and many more I don’t even know about, given my lack of expertise on matters of artificial intelligence and code optimisation. But I will quickly list the few that come to mind right now:
- I should have probably written this game in a more performant language like C, C++ or Rust.
- I could have improved my heuristics further. For instance, I could try to make the function distinguish between the start-game, the middle-game and the end-game.
- Using pre-filled databases with the best moves for certain situations would have probably been a good idea, both to improve the performance of the game and to make the machine play more “humanly” and strategically, especially at the beginning of each match — although I have already explained why I chose not to do that.
- I could have added an empty database to store the moves used in each situation. That way, if the same scenario occurs more than once, there would be no need to calculate the move again because it could be read from the database.
- I could have tried to apply further, more complex optimisations, such as null-move heuristics.
- My engine isn’t designed for chess only, but for any two-player, zero-sum game. In fact, I have implemented three games on it. However, I could have focused only on chess, which would have probably given me more room for finer tuning and low-level optimisations.
- Although this is pretty minor, I could improve the UI, use proper notation to describe the moves, etc.
Conclusions
This project has been a lot of fun, and it shows clearly how powerful the alpha-beta algorithm can be. But it also shows how much it lacks when it comes to complex games like chess.
For example, the machine won’t be able to navigate deep enough in the tree to generate long-term strategies. It will know how to avoid silly mistakes, and it will even be able to make seemingly-smart moves that require planning a few moves ahead. But when it comes to the long-term strategy that chess requires for high-level playing, the machine will simply be incapable.
Another issue of this algorithm is that the machine will almost always play the same way — without parallelisation, the “almost” can actually be removed from the sentence. This is, for the same game state, it will always choose the same next move. This will be especially noticeable at the beginning of each match.
Something else worth noting is that the algorithm will never try to trick you into making a mistake. As humans, we sometimes develop our strategy by hoping that the opponent will make a mistake, but the machine assumes that no mistakes will ever be made. For instance, in tic-tac-toe, it is usually a good strategy to start in a corner, as this will allow us to win if the opponent makes the mistake of choosing any position but the centre — the centre would lead to a draw. But for the machine, all the initial positions have exactly the same score (playing without errors, they all lead to a draw), since it works its magic under the assumption that the opponent will never make mistakes. Hence, the machine will show no preference for starting in the corner, even though a human player would.
On the other hand, the fact that the search tree has to stop at a certain depth can be problematic too. For example, one move found at the maximum depth of the tree might look very promising for the player, but it could actually be detrimental for them in the next turn, which the algorithm will be partially oblivious to. This problem is known as the horizon effect.
To all of the above, we also have to add that this algorithm requires a lot of processing power. I have been running it on a powerful computer, but I can imagine waiting forever if I try to run this same code on a mobile device.
In any case, despite all its problems, this algorithm can be a good starting point. Implemented with as many optimisations as possible, it can be used in combination with other techniques and algorithms in order to make for a great chess game. After all, this is no other than the algorithm that the supercomputer Deep Blue used to defeat Kasparov in 1997.
Check out the actual code
That was all I had for this one. If you have gotten this far and are still interested, please feel free to check out the code of this project.