Monday, February 14, 2022

Making Santa-Tac-Toe part 2 of 4

 Part 2: Solving Using a Tree

With the basic game class that was created in the last part, we are ready to create a better player to play against. This is where the computer can have an advantage over us humans as they can examine many more moves per second than we can. This allows more exploration of the possible moves, and with a simple game like Tic-Tac-Toe, can even explore all the possible moves when deciding what move they wish to make. How is this possible? When you think about a game, it can be represented as a tree structure as shown in the image below.


The first move has nine options. The second has 8 options for each of the 9 boards. The third move has 7 options for each of the 72 boards and so on. The total number of board configurations is 9!, which is 9 factorial or 9x8x7x6x5x4x3x2x1 for a grand total of 362,880. As each move leads to a set of other moves, the game can be represented as a tree. The AI then just hast to start on the node of the tree representing the current state of the board then find the move most likely to win. There is one problem, namely that the opponent also wants to win.

We need to be able to pick the best move to make for the computer while also assuming that the opponent will always pick the best move that they can. This may sound like a problem, as what happens if the opponent doesn’t pick the best move? The answer is that the AI will still be in a good position, possibly even able to win the game faster by taking advantage of the mistake. Still, if you are using some of the pruning techniques we will be discussing, this could be a problem as the AI may not have stored information about that move and will have to search from scratch.

The basic approach then is to recursively search the tree taking the best move from whoever the current player is.  As the opponent winning is bad for us, we will invert the score from the opponent so will weigh the moves that the opponent wins as an undesirable position for us. This is implemented as a recursive pair of methods.  ScoreTiles simply takes a board and scores each possible move on the board by calling scoreTile. ScoreTile checks to see if a particular tile is a valid move and if so if that move will result in a win. If the move is not an immediate win or tie, we need to look at the next moves in the game taking the highest score from that and inverting it. This recursion will repeat until a winning or tie move is reached.

scoreTiles(board, curPlayer) {
   
let scores = [];

    for
(let i = 0; i < 9; ++i) {
       
if (board.getTileByIndex(i) === TTTConsts.EMPTY) {
            scores.
push(this.scoreTile(board, i, curPlayer));
       
} else
           
scores.push(TTTConsts.INVALID_MOVE);
   
}

   
return scores;
}

scoreTile(board, index, curPlayer) {
   
if (board.getTileByIndex(index) !== TTTConsts.EMPTY) return TTTConsts.INVALID_MOVE;
    this
.testBoard.clone(board);
    this
.testBoard.setTileByIndex(index, curPlayer);
    let
winState = (curPlayer === TTTConsts.X) ? TTTConsts.X_WON : TTTConsts.O_WON;
    let
state = this.testBoard.getGameState();
    if
(state.state === TTTConsts.PLAYING) {
       
// recurse
       
let rboard = new TTTBoard();
       
rboard.clone(this.testBoard);
        let
opponent = (curPlayer === TTTConsts.X) ? TTTConsts.O : TTTConsts.X;
        let
scores= this.scoreTiles(rboard, opponent);
        let
highIndex = 0;
        for
(let i = 1; i < 9; ++i)
       
if (scores[highIndex] < scores[i])
            highIndex = i
;
        return
-scores[highIndex];
   
}
   
return (state.state === winState) ? 1 : 0;
}


The getNextMove API call then simply calls the ScoreTiles for the existing move and picks the move with the highest score.

getNextMove(board, playerToMove = TTTConsts.EMPTY) {
   
let scores = this.scoreTiles(board, playerToMove);
   
console.log(scores)
   
// return best move
   
let highIndex = 0;
    for
(let i = 1; i < 9; ++i)
       
if (scores[highIndex] < scores[i])
            highIndex = i
;
    return
highIndex;
}


The search tree for Tic-Tac-Toe is a small enough tree that we can fully explore it, but what if it is not? More complicated games may not allow for us to search the entire tree in a reasonable timespan. For these games we need to come up with a different way of scoring the state of the board. This will be game specific, but one approach I would consider if there is not a good way of finding a weight for the board would be to recurse to the desired end depth and then simply do a series of random games and return a score based on how many wins occurred. Limiting the depth that you search also is a way to reduce the difficulty of the game.

To speed up searching, there are ways of pruning the search tree. If you assume that your opponent will always take the winning move if it is available, you can stop searching for an opponent move once there is a winning move available. This may not sound like much but can reduce the search by a surprising amount. 

One of the new and highly effective ways of picking moves is the use of the Monty Carlo Tree Search algorithm. We will not be implementing the full algorithm but will do the starting point and explain the algorithm in the next part.