07 December 2015

TCEC Season 8 - Evaluation Anomalies

After the TCEC Season 8 Superfinal ended (see Week 4 for a wrapup), I stepped through all 100 games looking for examples of misleading evaluations. These are games where at least one of the engines seems to have misevaluated the position. Thanks to the TCEC Archive Mode page, it's easy to review the games in quick succession.

I found 32 games (there's that number again!) where one engine or the other judged its winning chances to be strong, yet the game eventually ended in a draw. I whittled that number down to six and made the following composite chart, which shows the TCEC evaluation graph for each of the six games.

For example, the first graph (game 8, Stockfish - Komodo) shows that around move 20, White evaluated the position to be +1.20 in its favor, while Black evaluated the position at +0.40 for White. The game eventually petered out to a draw.

The third graph shows the infamous game 22 (Stockfish had White in all even-numbered games) discussed in my post on Week 2, where White apparently blundered a certain win. The sixth graph shows the same game I used in my post on Week 3, where I noted,

White starting with an advantage of ~0.60 Pawns in the opening, eventually dropping to 0.00 in the endgame.

Many games followed that same pattern with different evaluations in the opening, some starting at only the ~0.20 advantage that theory predicts for the traditional start position.

A previous post in this series, Chess Engines - Advanced Evaluation, discussed the components of the evaluation function. We also know that A Pawn Equals 200 Rating Points (February 2013), which lets us use the calculated evaluation to estimate the probability of a win. In game 8, the +1.20 advantage corresponds to 240 rating points, which equates to roughly an 80% chance of a win, but the game was nevertheless drawn. In another post I'll take a closer look at one or two of these games to determine why the engine(s) failed.
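The arithmetic behind that 80% figure can be sketched in a few lines of Python. This is only an illustration under two assumptions: the 200-points-per-pawn rule of thumb quoted above, and the standard Elo expected-score formula as the conversion from rating difference to winning chances. The function names `pawns_to_elo` and `expected_score` are my own, not anything used by TCEC or the engines.

```python
def pawns_to_elo(pawns, elo_per_pawn=200.0):
    """Convert an evaluation in pawns to a rating difference.

    Assumes the rule of thumb that one pawn is worth 200 rating points.
    """
    return pawns * elo_per_pawn


def expected_score(elo_diff):
    """Standard Elo expected score for the side that is elo_diff points ahead."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


# Evaluations mentioned in this post, in pawns.
for pawns in (0.20, 0.40, 0.60, 1.20):
    elo = pawns_to_elo(pawns)
    print(f"+{pawns:.2f} pawns = {elo:.0f} Elo, expected score {expected_score(elo):.2f}")
```

For +1.20 this gives 240 rating points and an expected score of about 0.80. Strictly speaking, the Elo formula returns an expected score, where a draw counts as half a point, so reading it as a pure win probability overstates the winning chances when draws are as common as they were in the Superfinal.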

