14 December 2018

DeepMind Videos

Last week's post, AlphaZero Is Back!, ended with a request for more time to understand what had just happened.

This is too much new material to digest in the time available for a simple blog post, so I'll come back to the subject as soon as I can.

This video from Google's DeepMind is partly a restatement of what we learned from their first announcement a year ago, partly a statement of what they have been doing since then, and partly a declaration about where they want to go with the technology.

AlphaZero: Shedding new light on the grand games of chess, shogi and Go (4:38) • 'Published on Dec 6, 2018'

The description explains,

DeepMind's AlphaZero is the successor of AlphaGo, the first computer program to beat a world champion at the ancient game of Go. It taught itself from scratch how to master the games of chess, shogi and Go, beating a world-champion program in each case and discovering new and creative playing strategies that hint at the potential of these systems to tackle other complex problems.

A DeepMind blog post, AlphaZero: Shedding new light on the grand games of chess, shogi and Go (deepmind.com/blog), bearing the same title and publication date as the video, goes into more depth. One paragraph explains the essence of the technology.

An untrained neural network plays millions of games against itself via a process of trial and error called reinforcement learning. At first, it plays completely randomly, but over time the system learns from wins, losses, and draws to adjust the parameters of the neural network, making it more likely to choose advantageous moves in the future.
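The loop described in that paragraph can be sketched in a few lines of Python. This is only a toy under stated assumptions, not DeepMind's method: the game is a small take-away game (remove 1 to 3 stones, taking the last stone wins) rather than chess, and a plain table of numbers stands in for the neural network. The names `train` and `best_move`, the learning rate, and the exploration schedule are all hypothetical choices made for illustration. What it does share with the description is the shape of the process: play against yourself, start out random, and nudge the stored predictions toward the actual results.

```python
import random


def best_move(values, stones):
    """Greedy move: leave the opponent in the position rated worst for them."""
    moves = range(1, min(3, stones) + 1)
    return max(moves, key=lambda take: -values[stones - take])


def train(n_games=20000, max_stones=12, lr=0.05, seed=0):
    """Self-play on a toy take-away game; returns the learned value table.

    Assumption: a lookup table replaces the neural network, so "adjusting
    the parameters" here means nudging one table entry per position.
    """
    rng = random.Random(seed)
    # values[s]: predicted result (+1 win, -1 loss) for the player to move
    # when s stones remain.
    values = [0.0] * (max_stones + 1)
    values[0] = -1.0  # no stones left: the opponent just took the last one

    for game in range(n_games):
        # Exploration rate: fully random at first, then falling linearly
        # to a floor of 10% random moves.
        epsilon = max(0.1, 1.0 - 2.0 * game / n_games)
        stones = rng.randint(1, max_stones)
        visited = []  # positions in playing order; the two sides alternate

        while stones > 0:
            visited.append(stones)
            if rng.random() < epsilon:
                take = rng.randint(1, min(3, stones))
            else:
                take = best_move(values, stones)
            stones -= take

        # Whoever moved last took the final stone and won. Walk the game
        # backwards, flipping the result for each side in turn, and move
        # each prediction a little toward what actually happened.
        result = 1.0
        for s in reversed(visited):
            values[s] += lr * (result - values[s])
            result = -result

    return values


if __name__ == "__main__":
    table = train()
    for s in range(1, len(table)):
        print(s, round(table[s], 2), "take", best_move(table, s))
```

In this game the side to move is lost whenever the pile is a multiple of four, and after training the table's predictions for those piles should come out negative, with no chess knowledge (or in this case, stone-counting knowledge) supplied beyond the rules.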

In other words, a neural network plays a few million games, compares its predictions about the outcome of its moves against the results of those games, adjusts its internal parameters to reduce the discrepancies between predictions and results, then starts the process over with the new parameters. Eventually it reaches a level where the predictions and the results almost coincide.

DeepMind has also put together a couple of video courses on the underlying technology.

I now know what I'll be doing during the year-end holidays.
