25 January 2021

Engine Scaling

Two weeks ago, in the post CCC Hardware Upgrade (January 2021), I referenced a thread on the Talkchess forum, CCC has serious hardware update! (talkchess.com). That thread went beyond a discussion of the CCC's hardware. It addressed the key question of what performance gains can be expected from a hardware upgrade. As one commenter put it,

Both [CCC & TCEC] have massive hardware. Will it change any results? If you notice even on much smaller hardware, the [relative] ranking of the engines does not really change. And this is also what I am seeing in my testing.

It's a good point, but it wasn't the only point of the thread. We all know that if a single processor executes faster, an engine will perform better because it analyzes more positions and variations in the same amount of time. But what happens when you add a second processor of the same speed? Or a third and fourth?

The concept is called 'scaling'. How does the performance of an engine scale as it runs on more and more cores? During the past ten years we have learned that the traditional engines -- Stockfish, Houdini, Komodo, et al -- scale well. With more cores they analyze significantly faster, with huge gains in playing strength.

After Stockfish 12 (aka Stockfish NNUE) was released last year, an earlier Talkchess thread, CCRL flawed testing : SF12 above SF12 8CPU (talkchess.com), observed that Stockfish 12 showed no improvement running on eight cores instead of a single core. As the title of that thread suggests, the first post assumed that the phenomenon was caused by inadequate CCRL test procedures. Other commenters suggested that it might instead be a characteristic of NNUE engines: they don't scale well, perhaps not at all.

As another commenter in the 'Serious Hardware Update!' thread noted,

I don't see the point in this huge hardware. The results are probably the same on 32 cores, and I'd guess none of the authors have tested their engines on this level of hardware. Better quality chess however you define that, maybe, but you also probably increase the draw rate as well.

And more cores give you less and less real speed because of diminishing returns, as you can only effectively subdivide the search of the engine so many times. What is a bigger benefit in the new hardware is the improved IPC [Instructions Per Cycle? Interprocess Communications?] and frequency of each core.
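The diminishing returns the commenter describes can be illustrated with Amdahl's law, a textbook model of parallel speedup. The sketch below is only an illustration, not a description of how any chess engine actually distributes its search. The function name `amdahl_speedup` and the 90% parallel fraction are assumptions chosen for the example, not measurements of Stockfish or any other engine.

```python
def amdahl_speedup(cores: int, parallel_fraction: float) -> float:
    """Theoretical speedup on `cores` cores under Amdahl's law.

    `parallel_fraction` is the share of the work that can be split
    across cores; the rest is assumed to run serially.
    """
    if cores < 1:
        raise ValueError("cores must be at least 1")
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Hypothetical value: assume 90% of the search parallelizes.
    assumed_parallel_fraction = 0.9
    previous = None
    for cores in (1, 2, 4, 8, 16, 32, 64, 128):
        speedup = amdahl_speedup(cores, assumed_parallel_fraction)
        # Gain from the latest doubling of cores, relative to the previous row.
        gain = "" if previous is None else f"  (x{speedup / previous:.2f} vs previous)"
        print(f"{cores:4d} cores: speedup {speedup:6.2f}{gain}")
        previous = speedup
```

Under this model each doubling of cores buys less than the one before, and the total speedup can never exceed 1 / (1 - parallel_fraction), no matter how many cores are added. Real engines differ in the details, but the shape of the curve is the point the commenter is making.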

I doubt that's the last word on 'massive hardware'. As with almost everything in technology, it's two steps forward and one step back.
