28 December 2015

Evaluation Anomalies - Engines Behaving Badly

Following the previous post, 'Evaluation Anomaly - Mass Exchange', a number of games from 'TCEC Season 8 - Evaluation Anomalies' remain to be examined. In this post, I'll take a quick look at two more. (As in the previous posts, see TCEC Archive Mode to play through the complete games using TCEC's helpful game viewer.)

The diagram below shows positions from two games, both with Stockfish playing White and Komodo playing Black. The top row shows two positions from game 44 and the bottom row shows two positions from game 52.

In game 44, after 61.Bf2, reaching the first diagrammed position, Stockfish evaluated the position at wv=0.33 (a third of a Pawn in its favor). Komodo played 61...Rd5, with a value of wv=0.00 (dead equal). Stockfish played 62.g6+ (wv=0.33), to which Komodo replied 62...Ke7 (wv=-0.41), reaching the second position. Note that Komodo's evaluation of the position has dropped suddenly to a negative value (i.e. in Black's favor).

The game continued for another 30 moves with White giving itself an advantage of wv=0.33 and Black giving itself an advantage of around wv=-0.40. At move 90, Black's evaluation quickly dropped to zero (wv=0.00), and ten moves later White's evaluation did the same.

What happened here? It's easy for a human to see that after 62.g6+, the position is completely blocked. Neither player can break through without incurring a serious disadvantage. The game continued for another 50 moves before being declared a draw, with both engines recognizing the inevitable draw 10-20 moves before that 50th move was reached.
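The kind of disagreement seen in game 44, where each engine believes it stands better, can be spotted mechanically. The following is only a minimal sketch: the tuple layout, the function name `opposite_sign_pairs`, and the 0.25 threshold are assumptions made for illustration, and the data is limited to the four evaluations quoted above.

```python
# Evaluations (wv, in Pawns from White's point of view) quoted in the post
# for game 44. The (move, engine, wv) layout is an assumption of this sketch.
GAME_44 = [
    ("61.Bf2", "Stockfish", 0.33),
    ("61...Rd5", "Komodo", 0.00),
    ("62.g6+", "Stockfish", 0.33),
    ("62...Ke7", "Komodo", -0.41),
]


def opposite_sign_pairs(evals, threshold=0.25):
    """Return consecutive move pairs where both engines claim an edge.

    A pair is flagged when one evaluation is at least +threshold and the
    other is at most -threshold. The third element is the gap between them.
    """
    flagged = []
    for (move_a, _, wv_a), (move_b, _, wv_b) in zip(evals, evals[1:]):
        if max(wv_a, wv_b) >= threshold and min(wv_a, wv_b) <= -threshold:
            flagged.append((move_a, move_b, round(wv_a - wv_b, 2)))
    return flagged


if __name__ == "__main__":
    for first, second, gap in opposite_sign_pairs(GAME_44):
        print(f"{first} / {second}: engines disagree by {gap:.2f}")
```

With the quoted values, only the pair 62.g6+ / 62...Ke7 is flagged; the threshold is arbitrary and would need tuning against complete game logs.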

Stockfish - Komodo, TCEC Season 8 Superfinal, game 44

Stockfish - Komodo, TCEC Season 8 Superfinal, game 52

In game 52, Stockfish played 43.Raa1 reaching the first position in the bottom row and giving itself an advantage of wv=0.65. Komodo played 43...Qd2, with a similar evaluation of wv=0.62. Both engines consider the position to be nearly two-thirds of a Pawn in White's favor, which should give good winning chances.

The next few moves were 44.Qxd2 (wv=0.81) 44...exd2 (wv=0.03) 45.Red1 (wv=0.81) 45...Re2 (wv=0.00) 46.h3 (wv=0.00) 46...Ra8 (wv=0.00), reaching the second position in the bottom row. The advantage of two-thirds of a Pawn has evaporated and both engines consider the position to be completely equal, although it took White a few additional moves to realize it.

What happened here? In the first diagram, both engines saw that White would win a Pawn -- losing the a-Pawn in exchange for Black's d/e-Pawns. That leaves White a healthy Pawn ahead, right? No. Unfortunately for White, the 'healthy' extra Pawn amounts to f/g/h-Pawns for White vs. g/h-Pawns for Black, i.e. three Pawns against two on the same side of the board. With both Kings placed behind their Pawns, an experienced human player knows that the Rook and Pawn endgame is a draw. It took the engines a few more moves to see that.
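The evaporating advantage in game 52 shows up as a large jump between one engine's consecutive evaluations. Here is a rough sketch of how such jumps might be picked out of a game log. The function name `sudden_swings`, the data layout, and the half-Pawn cutoff are all assumptions for illustration; the values are the ones quoted above.

```python
# Evaluations (wv, in Pawns from White's point of view) quoted in the post
# for game 52. The (move, engine, wv) layout is an assumption of this sketch.
GAME_52 = [
    ("43.Raa1", "Stockfish", 0.65),
    ("43...Qd2", "Komodo", 0.62),
    ("44.Qxd2", "Stockfish", 0.81),
    ("44...exd2", "Komodo", 0.03),
    ("45.Red1", "Stockfish", 0.81),
    ("45...Re2", "Komodo", 0.00),
    ("46.h3", "Stockfish", 0.00),
    ("46...Ra8", "Komodo", 0.00),
]


def sudden_swings(evals, min_swing=0.5):
    """Return moves where an engine's evaluation changed by min_swing or more
    compared with that same engine's previous evaluation."""
    last_seen = {}  # engine name -> its most recent wv
    swings = []
    for move, engine, wv in evals:
        if engine in last_seen:
            change = wv - last_seen[engine]
            if abs(change) >= min_swing:
                swings.append((move, engine, round(change, 2)))
        last_seen[engine] = wv
    return swings


if __name__ == "__main__":
    for move, engine, change in sudden_swings(GAME_52):
        print(f"{move}: {engine} changed its evaluation by {change:+.2f}")
```

On this data the sketch flags Komodo at 44...exd2 and Stockfish two moves later at 46.h3, matching the observation that White needed a few additional moves to reach the same verdict.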

In both games discussed here, the engines continued to move their pieces hither and thither for many moves before the draw was declared. Good human players would have agreed a draw as soon as boredom set in.
