01 June 2018

It's Much More than Chess

Thanks to a post I wrote at the end of last year, The Constellation of AlphaZero (December 2017), I became curious to know more about the technology used in AlphaZero:-

The excerpt I copied from the paper also talks about parameters. It ends, 'The updated parameters are used in subsequent games of self-play.' I wonder if I can find out more about those parameters. I'd also like to know how all of this was used in the match that crushed Stockfish.

This led me to a series of courses on Coursera called Deep Learning (coursera.org):-

In five courses, you will learn the foundations of Deep Learning, understand how to build neural networks, and learn how to lead successful machine learning projects. You will learn about Convolutional networks, RNNs, LSTM, Adam, Dropout, BatchNorm, Xavier/He initialization, and more. You will work on case studies from health care, autonomous driving, sign language reading, music generation, and natural language processing. You will master not only the theory, but also see how it is applied in industry. You will practice all these ideas in Python and in TensorFlow, which we will teach.

I had already taken a bitcoin/blockchain course on the same site and knew that their material was reliable. It took me about five months to work through the five courses, which were titled:-

Neural Networks and Deep Learning
Improving Deep Neural Networks: Hyperparameter tuning...
Structuring Machine Learning Projects
Convolutional Neural Networks
Sequence Models

One of the big advantages of Coursera is that the courses can be audited for free. Although all material for the courses -- videos, quizzes, programming assignments, and forums -- is available at no cost, the quizzes aren't graded without payment and there is no certificate issued for completing a course. Not having done any programming in more than 20 years, I was concerned that my skills were out of date, but there was plenty of support offered for both Python and TensorFlow (Keras is also used) and I had no trouble completing the assignments. This was partly due to the cookie-cutter structure of the exercises and to forum discussions by previous students that shed sufficient light on the knottiest problems. It also helped to know some math, particularly linear algebra and calculus.
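To give a flavor of the programming assignments, here is a minimal sketch in NumPy of the kind of model the first course has students build: a two-layer network doing a vectorized forward pass. This is not an actual course exercise; the layer sizes and data are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# He-style initialization (one of the topics listed in the course blurb)
n_x, n_h, n_y = 4, 3, 1  # input, hidden, and output layer sizes
W1 = rng.standard_normal((n_h, n_x)) * np.sqrt(2.0 / n_x)
b1 = np.zeros((n_h, 1))
W2 = rng.standard_normal((n_y, n_h)) * np.sqrt(2.0 / n_h)
b2 = np.zeros((n_y, 1))

def forward(X):
    # X has shape (n_x, m): one column per training example
    A1 = np.maximum(0, W1 @ X + b1)   # ReLU hidden layer
    A2 = sigmoid(W2 @ A1 + b2)        # sigmoid output for binary labels
    return A2

X = rng.standard_normal((n_x, 5))     # five fake examples
predictions = forward(X)              # shape (1, 5): one prediction each
```

The assignments follow this same pattern repeatedly -- initialize parameters, compute a forward pass, compute a cost, back-propagate -- which is what makes them cookie-cutter, in a good way.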

The courses are the work of Andrew Ng (wikipedia.org) and his company, Deeplearning.ai. After noting a few weeks ago that AI and all it entails is A Transformational Technology (May 2018), I'm glad I understand the subject much better than I did six months ago.

A key question I had going into this endeavor was 'Why is AI taking off now?' People have been talking about it for decades, but it was always just around the corner. What changed to make it a reality?

An important driver has been 'big data'. We now have so much digital data, huge portions of it correctly cataloged and labeled, that we have the means to explore relationships that were inaccessible in the past.

We also have algorithms to process big data that, not so long ago, were unknown or undeveloped. Most modern AI is based on the same set of related algorithms. Computing hardware has progressed in parallel with the development of those algorithms, allowing for their practical implementation. The necessary support for vectorization, in both processors and software development tools, was previously missing.
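The idea behind vectorization can be seen in a few lines of NumPy (the numbers here are illustrative, not a benchmark): the same dot product written as an explicit loop and as a single vectorized call. The vectorized form is typically orders of magnitude faster because it runs in optimized native code that can use the processor's SIMD instructions.

```python
import numpy as np

a = np.arange(100_000, dtype=np.float64)
b = np.arange(100_000, dtype=np.float64)

def dot_loop(a, b):
    # Element-by-element loop: what the same computation looks like
    # without vectorization support
    total = 0.0
    for i in range(len(a)):
        total += a[i] * b[i]
    return total

slow = dot_loop(a, b)     # pure-Python loop
fast = float(np.dot(a, b))  # vectorized: one call into optimized code
```

Training a neural network is, at bottom, an enormous number of operations like this one, which is why vectorized hardware and libraries made such a difference.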

Finally, the impetus for capturing big data and developing big tools is big money. AI first revolutionized the commercial endeavor of selling ads and is now gradually creeping into every commercial activity on the planet.
