21 September 2020

Stockfish NNUE Dev

Six weeks ago, when I posted Stockfish NNUE = +90 Elo (August 2020), I wrote,

I hope some kind, understanding soul will eventually produce something like 'NNUE for Dummies'. There is a lot to keep track of here.

For this post I took some time for another search on a good introduction. After all, with so many people looking into NNUE, many of them must need basic knowledge. YouTube is often a good start for tutorial material, but the NNUE videos it suggested were about experts analyzing a game where one of the engines used NNUE.

More promising was Stockfish NNUE - The Complete Guide (yaneu.com). Excluding the resources in Japanese, it still offers many good leads. One obvious choice that I had managed to overlook was the 'Discord for Stockfish' page, specifically #general-nnue-dev. Who could explain NNUE better than the people who had been its early adopters?

Discord isn't for everyone. If you're not into the technical side of a subject, the discussion there is probably over your head, which has been my general experience with it. Having only dabbled with it in the past, I was faced with a steep learning curve. First, here's a Q&A extracted from the NNUE FAQ. The last question was the most informative to me:-

Q: Is NNUE an opening book? • A: No. It is purely an algorithm for the evaluation function. Classical handcrafted evaluation functions are terrible in the opening.

Why the emphasis on openings? Perhaps because it's the phase of the game where evaluation functions are weakest. I've looked at engine evaluation several times in the past.

Since I use engines for practical play, I'm well aware of their strengths (lots of these) and weaknesses (few of these, but they exist). Back to the NNUE FAQ:-

Q: What makes NNUE run much better with traditional AB/minimax engines on CPU? • A: The neural net used for evaluation is very sparse compared to UCT/MCTS engines on GPU, and the code is optimised for modern CPU instruction sets with vector intrinsics, both of which allow for much faster calculations of the evaluation function necessary for AB/minimax search on the CPU.
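If I understand that answer correctly, the sparsity is what makes the CPU approach pay off: a chess move flips only a handful of the net's binary input features on or off, so the first layer can be updated incrementally instead of being recomputed from scratch. Here's my own rough sketch of the idea in C++ (names and dimensions are my assumptions, not Stockfish's actual code):-

```cpp
// Illustrative sketch of NNUE's incremental first-layer update.
// Only a few binary input features change per move, so the engine keeps a
// running "accumulator" of the first layer and adjusts just the affected
// weight rows. Dimensions below are assumptions, not Stockfish's code.
#include <array>
#include <cstdint>
#include <vector>

constexpr int kInputs = 41024;  // HalfKP-sized input layer (assumed)
constexpr int kHidden = 256;    // first hidden layer width (assumed)

// First-layer weights: one row of kHidden values per binary input feature.
std::array<std::array<int16_t, kHidden>, kInputs> W;

struct Accumulator {
    std::array<int16_t, kHidden> v;  // pre-activation first-layer output
};

// After a move, subtract the weight rows of features that turned off and
// add the rows of features that turned on.
void update(Accumulator& acc,
            const std::vector<int>& removed,  // feature indices turned off
            const std::vector<int>& added) {  // feature indices turned on
    for (int f : removed)
        for (int i = 0; i < kHidden; ++i)
            acc.v[i] -= W[f][i];
    for (int f : added)
        for (int i = 0; i < kHidden; ++i)
            acc.v[i] += W[f][i];
}
```

Those tight inner loops are exactly the sort of code that modern CPU vector instruction sets (SSE, AVX) chew through, which is presumably what the FAQ means by 'vector intrinsics'.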

Q: Is NNUE a library like Fathom for Syzygy tablebases? • A: No, each engine developer will have to manually port NNUE to their code base, or write the algorithm from scratch.

Q: Is NNUE related to Leela Chess Zero? • A: No. NNUE is descended from linear neural networks used by shogi engine developers since the early 2010s, while Leela Chess Zero is descended from the convolutional neural network based go engine Leela Zero, which itself is based upon Deepmind's AlphaGo Zero from 2017.

Q: Is it possible to train an NNUE net from zero like the T10 Leela nets? • A: Yes. Although the neural nets are typically trained with the help of the evaluation of the engine used for training, the lambda parameter in the training indicates how much that evaluation is used to help train the net, and setting lambda equal to zero means no evaluation is used at all.
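If I follow that, 'lambda' simply interpolates between two training signals: the teacher engine's evaluation and the actual game result. A minimal sketch of the blend (my own illustration, not the trainer's actual code):-

```cpp
// Illustrative blend of the two training signals described in the FAQ.
// Both inputs are assumed to be expressed as expected scores in [0,1].
double training_target(double engine_eval,  // teacher engine's evaluation
                       double game_result,  // 0 = loss, 0.5 = draw, 1 = win
                       double lambda) {     // blend factor in [0,1]
    // lambda = 1: learn purely from the teacher's evaluation.
    // lambda = 0: learn purely from game results ("from zero").
    return lambda * engine_eval + (1.0 - lambda) * game_result;
}
```

With lambda at zero the net learns only from game outcomes, which is why that setting counts as training 'from zero'.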

Q: Does this make NNUE zero? • A: No, because there is chess knowledge handcrafted into the various neural net architectures used in NNUE themselves.

Q: What is the most commonly used architecture for NNUE? • A: HalfKP. 'K' stands for 'king', 'P' stands for 'piece', and 'half' refers to the player's own king: all put together, 'HalfKP' means that the position of the player's own king, and the relationship between that king and every other piece on the board, gets encoded into the first layer of the neural net.
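To make that concrete for myself: each binary input feature is apparently a triple of (own king square, another piece's square, that piece's type and color). A simplified sketch of the indexing (my own reconstruction; as far as I can tell the released nets have 41024 inputs, slightly more than this formula gives, because of an extra feature per king square inherited from the shogi encoding):-

```cpp
// Simplified HalfKP feature indexing (my reconstruction, not Stockfish's
// actual code). One binary feature per combination of the side-to-move's
// king square, another piece's square, and that piece's type-and-color.
constexpr int kSquares    = 64;
constexpr int kPieceKinds = 10;  // P,N,B,R,Q for each color; kings excluded

int halfkp_feature(int king_sq,      // 0..63: the player's own king
                   int piece_sq,     // 0..63: square of some other piece
                   int piece_kind) { // 0..9: type+color of that piece
    return (king_sq * kPieceKinds + piece_kind) * kSquares + piece_sq;
}
// 64 * 10 * 64 = 40960 features per perspective, close to the 41024
// inputs of the released nets.
```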

Long story short: NNUE gets to the essence of handcrafted evaluation functions. Those FAQ explanations lead to as many questions as they answer, often a sign of being headed in the right direction. I'll come back to the subject after I've had some time to orient myself. I hope the time will be well spent.
