I ended the post 'TCEC Season 8 - Evaluation Anomalies' with the desire to 'take a closer look at one or two of these games to determine why the engine(s) failed'. The first game of the six I flagged was played in round eight, so I started there. To play through the game and to examine all of the engine evaluations, see TCEC - Top Chess Engine Championship - Archive Mode, Season 8 - 2015.08.21, Superfinal - 2015.11.06, game 8.
Annotating a game between good engines -- the two best chess engines in the world -- is almost hopeless. The calculations are (nearly) error free, the variations are razor sharp, and the plans are incredibly deep, making the whole game incomprehensible to the human analyst. Having said that, I'll give it a try anyway, hoping to learn something from the exercise.
The evaluation anomaly started at move 16, the first position shown in the following composite diagram. White has sacrificed a Pawn, getting the open g-file against the Black King as compensation. White (Stockfish) played 16.Be2, threatening a discovered attack on the Black Queen. The TCEC statistics show that White expected 16...Re8 in response and gave the position a value of wv=0.29. Black (Komodo) instead played 16...Ne7, with a value of wv=0.40.
Here White saw a future combination and prepared it with 17.Qd2, while its evaluation shot up to wv=1.05. Now something went wrong with the TCEC stats. Black played the expected 17...Ng6, protecting the g-file, but the stats show wv=0.00, indicating equality. That can't be right, so I'll just ignore it. The game continued 18.h3 (wv=0.94), protecting the h-Pawn with the Rook and finally threatening the discovered attack on the Queen, 18...Qa5 (wv=0.37), reaching the position in the second diagram.
Here White unleashed the planned combination, sacrificing the Knight with 19.Nh4. The game continued 19...Nxh4 20.Rxg7+ Kxg7 21.Rg3+ (discovering another attack on the Black Queen) 21...Qg5 22.Rxg5+ hxg5 23.Qxg5+ Ng6, reaching the third diagram.
Stockfish - Komodo, TCEC Season 8 Superfinal, game 8
In the third diagram, White has a Queen and a Pawn against two Rooks and a Knight. Normally this would be better for Black, but the combination isn't finished yet. White played 24.h4 (wv=0.94), threatening to attack the pinned Knight. Play continued 24...f6 (wv=0.43; Black's evaluation is consistent with its opinion from before the combination started) 25.Qg3 Kh7 26.h5 Ne7 27.Qxd6 Rf7 28.Bd3 Ng8 29.e5+ f5 30.d5 Re7 31.Qd8 exd5 32.f4 b6, reaching the last diagram.
After 32...b6 (wv=0.11), White has two ways to repair the material deficit. It chose 33.Qxd5 (wv=0.81), followed by 33...Rb8 34.Qd8 Kh8 35.Bxf5 Bxf5 36.Qxb8. After the last capture, the material is a Queen and three Pawns against Rook, Bishop, and Knight (Q+3P:R+B+N). White gave the position wv=0.72, while Black, after 36...Kh7, gave it wv=0.12.
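To put rough numbers on the two imbalances mentioned so far, here is a minimal sketch using the conventional textbook piece values (Pawn 1, Knight 3, Bishop 3, Rook 5, Queen 9). Those values and the helper names `material` and `imbalance` are my own assumptions for illustration; neither engine evaluates material this crudely, and only the unbalanced pieces are counted, not the whole board.

```python
# Conventional textbook piece values. This is an assumption for illustration;
# Stockfish and Komodo use far more refined evaluation terms.
PIECE_VALUES = {"P": 1, "N": 3, "B": 3, "R": 5, "Q": 9}


def material(pieces: str) -> int:
    """Sum the conventional values of a string of piece letters, e.g. 'QPPP'."""
    return sum(PIECE_VALUES[p] for p in pieces)


def imbalance(white: str, black: str) -> int:
    """White's material minus Black's, counting only the unbalanced pieces."""
    return material(white) - material(black)


# Third diagram (after 23...Ng6): Queen and Pawn against two Rooks and Knight.
after_move_23 = imbalance("QP", "RRN")

# After 36.Qxb8: Queen and three Pawns against Rook, Bishop, and Knight.
after_move_36 = imbalance("QPPP", "RBN")

print(after_move_23, after_move_36)  # prints: -3 1
```

On this crude count White goes from three points down in the third diagram to one point up after 36.Qxb8. That says nothing about whether the Queen can make progress against the three pieces, which is the part the engines disagreed on.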
The game continued for another 40+ moves. White was unable to break Black's defense and the game ended 'Draw by adjudication: TCEC draw rule', both sides evaluating the position at wv=0.00.
What happened to the advantage of wv=1.05 that White calculated for 17.Qd2? Of course, I can't say for sure, but the combination initiated with 19.Nh4 wasn't completed until 36.Qxb8. That's 18 moves, or 35 ply, deep. The evaluation of the resulting material imbalance (Q+3P:R+B+N) is itself a complex task. Maybe it's simply an engine example of 'long analysis, wrong analysis'.