27 January 2009

Chess960 Opening Theory

In Undefended Pawns in Chess960 Start Positions, Tom Chivers commented,

I've started questioning whether comparing % scores of any of the 959 positions with chess makes any sense at all. After all, the 959 positions are basically played out with zero theory: so it would make more sense to compare them with chess results around the time of Greco, not chess now. This leads to a further thought. Do the %'s change with rudimentary theory applied? For instance, in the position where 1.Ne3 c5 2.d4! is extremely good for white, does 1.Ne3 e6 score much more moderately? Perhaps one way to test this would be using 'Monte Carlo' analysis?

The point of my previous posts wasn't necessarily to compare W-L-D statistics of the 959 positions with traditional chess (SP518). It was rather to compare all 960 positions with each other. A key question for the acceptance of chess960 is whether all chess960 positions are equally fair, and, if not, whether they are at least reasonably fair.

If the numbers in Advantage in Chess960 Start Positions Revisited can be taken at face value, then I have to conclude, 'No, they aren't equally fair.' A 76% success for White in one position is manifestly unfair. If the statistic applied to a variation in SP518, the line would be dropped quickly by players of the inferior side at master level.

The kicker in that last paragraph is 'if the numbers can be taken at face value'. Using SP518 as a reference, its 39% success, combined with the knowledge that White has a small advantage in traditional chess, means that the numbers can't be taken at face value. What's wrong with them? My guess is that the number of sample games for each start position is too small to be meaningful.
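To give a rough feel for how little a small sample pins down, here is a minimal sketch in Python. It assumes the 39% figure rests on about 32 games (the size of the SP518 sample mentioned later in this post) and treats each game as a simple win/loss trial. That is a simplification: draws would narrow the range somewhat. The function name `score_interval` is my own, not anything from CCRL.

```python
import math

def score_interval(score, games, z=1.96):
    """Approximate 95% range for a score fraction.

    Assumption: each game is an independent win/loss trial (draws ignored),
    so this is a normal approximation to a binomial proportion.
    """
    se = math.sqrt(score * (1.0 - score) / games)
    return max(0.0, score - z * se), min(1.0, score + z * se)

# Assumed inputs: 39% success for White over a 32-game sample.
low, high = score_interval(0.39, 32)
print(f"{low:.2f} to {high:.2f}")
```

Under those assumptions the range runs from roughly 22% to 56%, which comfortably includes an even score and even a small White advantage.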

As for the second question -- 'Do the percentages change with rudimentary theory applied?' -- I would suppose, 'Yes, they do'. The source of the data, CCRL 404FRC : Downloads and Statistics (www.computerchess.org.uk), explains where the percentages come from.

CCRL means "Computer Chess Rating Lists". We are a club of people inspired by watching computers play chess. We want to compare the strength of different chess programs, and we want to share our findings with others. We are achieving this by running thousands of games between chess programs, collecting the games into a single database, and then computing and publishing a rating list. • CCRL 40/4 FRC: This is the first FRC [Fischer Random Chess] testing project, done in 40/4 time control.

Additional CCRL explanation relevant to this discussion is:

  • Time Control: Equivalent to 40 moves in N minutes
  • Book learning: Off for all engines

This method would appear to be compatible with Monte Carlo analysis. Curious about how closely the CCRL tests compare to known opening theory, I downloaded the test games for SP518 and SP534 (the same as SP518 with the King and Queen switched).

The 32 games in the SP518 sample started: 17 x 1.e4, 7 x 1.Nf3, 5 x 1.Nc3, and 3 x 1.d4. The 22 games in the SP534 sample started: 13 x 1.d4, 7 x 1.Nf3, and 2 x 1.Nc3, roughly mirroring the results for SP518. The line that occurred most frequently among the 13 x 1.d4 games is shown in the following diagram. It is a mirror of 1.e4 e5 2.Nf3, also known as the Open Game, in traditional chess.



Start Position 534
1.d4 d5 2.Nc3
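The first-move counts above can be turned into percentage shares with a few lines of Python. This is only a sketch using the tallies quoted in this post; the helper name `shares` is my own.

```python
from collections import Counter

# First-move tallies as quoted in the post.
sp518 = Counter({"1.e4": 17, "1.Nf3": 7, "1.Nc3": 5, "1.d4": 3})
sp534 = Counter({"1.d4": 13, "1.Nf3": 7, "1.Nc3": 2})

def shares(counts):
    """Return each move's share of the sample as a percentage."""
    total = sum(counts.values())
    return {move: round(100 * n / total, 1) for move, n in counts.items()}

print(shares(sp518))
print(shares(sp534))
```

For SP518, 1.Nc3 accounts for 5 of 32 games, or about 16%.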

The chess playing machines in the CCRL experiment are not learning from their results in the opening ('Book learning: Off'), but suppose they were. Furthermore, suppose that they played millions of traditional games (SP518) with each other, enough to establish a theory of the opening comparable in volume to what we have created since the time of Greco. How would that machine theory compare to what we humans have developed over the centuries? Would 16% of the games continue to start 1.Nc3, or would these shift to another first move, like 1.d4?

As for those of us who play chess and who are not machines, considering that we will play the same chess960 position only a few times during our entire lives, is it even worth bothering with questions of theory? Didn't Fischer promote his variant to free us from the burden of theory?

1 comment:

Tom Chivers said...

I meant to comment earlier that I really enjoyed this post. Of course you're right, that comparing all 960 positions to each other makes most sense. I still think that in deciding which are fair (or fair enough) for human play, then chess circa Greco might be a fair enough benchmark, but I suppose this really is a minor side issue.

The question of theory is interesting. I think 959 meaningful new theories is unlikely and indeed rather takes the point of the game away. However, I'm not convinced that new opening principles are implausible. For instance: in positions where the queen starts in the corner, is it usually best to develop her via a fianchetto, by moving the a or h pawn, early or late, via the back rank, etc? I.e., do the new features of the 959 start positions also lead to new opening principles? Who knows!