05 November 2018

So Many Ideas, So Little Time

In last week's post, Happy 10th Anniversary, Stockfish!, I noted,

The name of choice for the commemorative release was 'Stockfish X', but with only a few days to go, it doesn't look like anything special is going to happen.

And nothing did happen. Let's continue with the Stockfish forum, FishCooking -- first seen on this blog three years ago in Chess Engines : FishCooking (October 2015) -- and look at some of the recent posts that attracted discussion from the Stockfish community *and* that provide some insight into the group's working methods. First up are a couple of forum posts about selecting a version for the TCEC final match, which is currently underway and nearing its conclusion.

  • 2018-10-20: Executable for TCEC superfinal? • 'Apparently we haven't submitted an update to play in the TCEC Superfinal yet. I suspect it is too late now, but perhaps the senior devs could prepare one or comment on whether the e.g. abrok one should be used and contact TCEC?'

  • 2018-10-21: Request for binary for TCEC superfinal • Marco Costalba: 'We have few hours left before TCEC deadline is met to provide a binary for the superfinal. We would need a fast binary of the current master within this evening (European time) and I will send to TCEC people. If not, we will fallback on current TCEC binary. If someone has a better idea, please share it.'

The open source nature of Stockfish development presents many challenges.

  • 2018-10-17: Discussion for optimizing methodology • 'SF has been able to progress with the same 7(?) year old recipe and without doubt still is, but in my opinion with some adjustments in accordance to its current situation and needs it can be helped tremendously'

A technical issue related to evaluation has been under investigation for months.

  • 2018-10-03: First results of contempt tests • 'Actually on STC it seems that optimal contempt value is somewhere between 30 and 70. I will try to build the same diagram for LTC and if it shows the same stuff then we can probably conclude that for TCEC divP optimal value of contempt will be much higher than default 21 - closer to 50 because field there is much weaker than SF9, in fact, every single engine there is weaker than SF9.'

As those excerpts show, the forum members use so many acronyms and so much jargon that casual chess engine fans might easily be put off trying to follow the discussions. Here are a few terms worth knowing. I covered the first two acronyms in a recent post titled Catching Up with Engine Competitions (October 2018):-

CCCC = Chess.com Computer Chess Championship
TCEC = Top [formerly 'Thoresen'] Chess Engines Competition

The next two acronyms are used throughout the forum, and many discussions are hard to follow without them. The terms are key elements of a testing strategy that starts with STC games and, if no problems are discovered, continues with LTC games.

STC = Short time control
LTC = Long time control
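As a rough illustration of that two-stage strategy, here is a minimal Python sketch. The function names and the idea of a `run_stage` callback are my own assumptions for illustration; they are not taken from the Stockfish testing framework.

```python
from typing import Callable

# Stages in the order described above: short time control first, then long.
STAGES = ("STC", "LTC")


def two_stage_test(patch: str, run_stage: Callable[[str, str], bool]) -> str:
    """Run a patch through STC, and through LTC only if STC passes.

    `run_stage(patch, stage)` is a hypothetical callback that plays the
    test games for one stage and returns True if the patch passed.
    """
    for stage in STAGES:
        if not run_stage(patch, stage):
            # Stop at the first failure; a patch that fails STC never
            # uses up the more expensive LTC games.
            return f"rejected at {stage}"
    return "passed STC and LTC"
```

The point of the ordering is cost: short games are cheap, so they act as a filter before the slower long games are played.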

This next term is not specific to chess.

SPRT = Sequential Probability Ratio Test

It is a statistical hypothesis test that evaluates results as they come in and stops as soon as the evidence is strong enough either way; it is explained in Wikipedia's Sequential probability ratio test. The next acronym stems from a web domain that is used for administration, e.g. Stockfish Development Versions.

ABROK = abrok.eu
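Returning to SPRT for a moment, the idea is easier to see in code. The following Python sketch implements Wald's basic test on a stream of decisive games (1 = win, 0 = loss). It is a simplification under my own assumptions: draws are ignored, and the hypotheses `p0` and `p1` and the error rates `alpha` and `beta` are illustrative values, not the settings used by the Stockfish testers.

```python
import math


def sprt(results, p0=0.50, p1=0.55, alpha=0.05, beta=0.05):
    """Wald's SPRT on decisive game results (1 = win, 0 = loss).

    H0: the win probability is p0; H1: the win probability is p1.
    Returns (decision, games_used, log_likelihood_ratio).
    """
    lower = math.log(beta / (1 - alpha))   # cross this bound: accept H0
    upper = math.log((1 - beta) / alpha)   # cross this bound: accept H1
    llr = 0.0
    n = 0
    for n, result in enumerate(results, start=1):
        # Add the log-likelihood ratio contributed by this one game.
        if result:
            llr += math.log(p1 / p0)
        else:
            llr += math.log((1 - p1) / (1 - p0))
        if llr >= upper:
            return "accept H1", n, llr
        if llr <= lower:
            return "accept H0", n, llr
    # Ran out of games before either bound was crossed.
    return "continue", n, llr
```

With these illustrative settings, an unbroken run of wins is accepted after 31 games and an unbroken run of losses is rejected after 28. The number of games is not fixed in advance, which is what makes the test 'sequential'.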

The word 'contempt' in the last thread listed above is a current topic of experimentation. In Contempt Factor (chessprogramming.org), the wiki defines it as:-

The Contempt Factor reflects the estimated superiority/inferiority of the program over its opponent. The Contempt factor is assigned as draw score to avoid (early) draws against apparently weaker opponents, or to prefer draws versus stronger opponents otherwise.

In other words, even if the evaluation shows a small advantage for the opponent, treat it as equality (a 0.00 evaluation). The typical debate is about what numerical value should be used. If it's too large, you risk losing drawn games; if too small, you draw games that you might win.
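A small Python sketch may help make the trade-off concrete. It follows the wiki's description of contempt as the score assigned to a draw. The function names are hypothetical, and I am assuming that scores are integers in centipawns seen from the engine's side; none of this is taken from the Stockfish source.

```python
def draw_score(contempt: int) -> int:
    """Score of a drawn outcome from the engine's point of view.

    Positive contempt (the engine rates itself as stronger) makes a draw
    look slightly bad; negative contempt makes a draw look slightly good.
    """
    return -contempt


def prefers_draw(best_non_draw_eval: int, contempt: int) -> bool:
    """True if the engine would take a drawing line over playing on.

    `best_non_draw_eval` is the evaluation of the best continuation that
    keeps the game going (hypothetical input, in centipawns).
    """
    return draw_score(contempt) > best_non_draw_eval


# The engine stands slightly worse (-10) and a drawing line is available.
print(prefers_draw(-10, contempt=0))    # True: 0 beats -10, so take the draw
print(prefers_draw(-10, contempt=21))   # False: -21 is worse than -10, so play on
print(prefers_draw(-30, contempt=21))   # True: -21 beats -30, so take the draw
```

Raising the contempt value widens the range of slightly worse positions in which the engine keeps playing, which is where the risk of losing a drawn game comes from.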

Another thread that has been running for nearly two years points to a useful tool for understanding Stockfish evaluations.

  • 2017-01-22: "Stockfish Evaluation Guide" tool • 'I developed tool where you can investigate each part of Stockfish static evaluation function. It is standalone single HTML page with javascript. Every evaluation term is rewritten in single small javascript function. You can setup any position with FEN or by moving pieces on chessboard and see how evaluation is computed and what is result and if possible to attach score to individal squares it is visualized on chessboard.'

The thread eventually points to Main evaluation (hxim.github.io). If I had the time, I would definitely take a closer look at it. I could say the same for many of the threads in the FishCooking forum.
