Monday, May 21, 2018

FizzBuzz Zero

AlphaGoZero and FizzBuzz Zero


You might know that AlphaGo played Go against Lee Sedol, one of the best Go players in the world, and won all but the fourth game. This proved that an artificial intelligence trained by reinforcement learning can surpass even the best human players. AlphaGo played many new moves and techniques that were still unknown to human players at the time, so many people were impressed by how strong and creative it was.

Although this surprised the world, the subsequent model AlphaGoZero surprised the world even more by playing 100 games of Go against AlphaGo and winning all of them. More surprisingly, while AlphaGo had learned from records of human games, AlphaGoZero did not: it learned Go entirely by itself, from scratch. Artificial intelligence no longer needs humans in order to learn Go, and it is now much stronger than any human player.

In this post, I will introduce a program that works like AlphaGoZero. It is called FizzBuzz Zero. It makes two AI players play against each other so that they learn how to play FizzBuzz, just as AlphaGoZero learned how to play Go. It is a much simpler project, so the same idea can be applied to other fields. Strength aside, the structure of FizzBuzz Zero is simple.
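
To give a feel for the idea, here is a minimal tabular sketch of self-play FizzBuzz learning that I wrote under my own assumptions. It is not the actual FizzBuzzZero code, but the principle is the same: two copies of one policy take turns answering numbers, the first mistake ends the game, and the outcome is fed back into the policy.

import random

ACTIONS = ["number", "Fizz", "Buzz", "FizzBuzz"]

def correct_action(n):
    # The FizzBuzz rule acts as the referee, just as the rules of Go do.
    if n % 15 == 0:
        return "FizzBuzz"
    if n % 3 == 0:
        return "Fizz"
    if n % 5 == 0:
        return "Buzz"
    return "number"

# Tabular policy: one row of action scores per residue mod 15.
policy = {r: {a: 0.0 for a in ACTIONS} for r in range(15)}

def choose(n, epsilon=0.1):
    # Pick the best-scoring action, with a little random exploration.
    row = policy[n % 15]
    if random.random() < epsilon:
        return random.choice(ACTIONS)
    return max(row, key=row.get)

def play_game(max_n=100):
    # Two copies of the same policy take turns; the first mistake ends the game.
    for n in range(1, max_n + 1):
        action = choose(n)
        ok = (action == correct_action(n))
        # Reinforce: reward correct answers, penalize mistakes.
        policy[n % 15][action] += 1.0 if ok else -1.0
        if not ok:
            return n - 1
    return max_n

for _ in range(100):
    play_game()

# Evaluate greedily (no exploration) after 100 self-play games.
correct = sum(choose(n, epsilon=0.0) == correct_action(n) for n in range(1, 101))
print("accuracy: %d%%" % correct)

After around 100 games this toy learner usually answers every number from 1 to 100 correctly as well.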

How to use FizzBuzz Zero


First, download the project.
$ git clone https://github.com/ymgaq/FizzBuzzZero

Then let it learn FizzBuzz.
$ cd FizzBuzzZero
$ python fizzbuzzzero.py --learn 

At first, this AI did not even know how to count; all it knew were the rules of the game. But after 100 games, it answered perfectly (100% accuracy).


AlphaGoZero-like project


suragnair/alpha-zero-general claims to be a clean implementation of AlphaZero-style self-play learning for any game and any framework. It can be interesting too.

Download it this way:
$ git clone https://github.com/suragnair/alpha-zero-general.git

Start training a model for Othello this way:
$ cd alpha-zero-general
$ python main.py

Training on Othello.
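
By default, main.py trains an Othello player. To try a different game, the project expects you to supply your own game class. The toy sketch below (a take-1-or-2 Nim game) is only my guess at the shape of that interface; the method names are assumptions modeled loosely on the project's Othello example, so check Game.py in the repository for the authoritative signatures.

import numpy as np

class NimGame:
    # Toy game: a pile of stones, each player removes 1 or 2, and taking
    # the last stone wins. NOTE: this interface is an assumption, not the
    # official alpha-zero-general API.
    def __init__(self, stones=10):
        self.stones = stones

    def getInitBoard(self):
        # The "board" is just the number of stones left.
        return np.array([self.stones])

    def getBoardSize(self):
        return (1,)

    def getActionSize(self):
        return 2  # action 0 = take one stone, action 1 = take two

    def getNextState(self, board, player, action):
        # Apply the move and hand the turn to the other player.
        return board - (action + 1), -player

    def getValidMoves(self, board, player):
        return np.array([1 if board[0] >= k else 0 for k in (1, 2)])

    def getGameEnded(self, board, player):
        # 0 while the game is running; -1 if the player to move has lost
        # (the opponent took the last stone).
        return -1 if board[0] <= 0 else 0

    def getCanonicalForm(self, board, player):
        # The position looks the same from both sides in this toy game.
        return board

    def getSymmetries(self, board, pi):
        return [(board, pi)]

    def stringRepresentation(self, board):
        return str(board[0])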


Chess & Shogi


AlphaZero is the successor to AlphaGoZero and can learn Shogi and Chess in addition to Go. According to DeepMind's research paper, it became astonishingly strong at Shogi and Chess within days of training.

It might be interesting to try similar experiments by combining these AlphaZero-like projects with python-shogi and python-chess.
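
For example, python-chess already handles the rules and move generation, so the environment side of such an experiment is quick to set up. Here is a small self-play sketch of my own (not taken from any of the projects above) that plays one game with random moves; an AlphaZero-style agent would replace the random choice with a policy/value network guided by tree search.

import random
import chess

# Play one self-play game of chess with uniformly random moves.
board = chess.Board()
while not board.is_game_over():
    move = random.choice(list(board.legal_moves))
    board.push(move)

print(board.result())                  # e.g. "1-0", "0-1" or "1/2-1/2"
print(len(board.move_stack), "moves played")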