AlphaZero: The dawn of narrow artificial intelligence

scisteem (65) in #science • 8 years ago

It has been twenty years since people started to fear that computers might be smarter than humans. It was the day Garry Kasparov lost his chess match against Deep Blue. Deep Blue doesn’t have to scare us anymore. It would not stand a chance against AlphaZero.

AI artist’s rendition - Source: Wikimedia Commons

Chess programs usually work with predetermined rules, a man-made evaluation function, and a deep search of each move. The rules are fairly easy to understand; they are the same rules by which humans play the game. The evaluation function calculates a move’s score based on material (for example my rook +500, the enemy’s bishop -300), position (protected king, covered squares) and something called domain knowledge (in chess, things like a database of openings, endgames and similar). A chess AI is capable of evaluating millions of positions every single second, then evaluating each possible response by the opponent, and so on, with the number of positions growing exponentially with depth. If we took only brute force into account, evaluating six moves ahead would take an AI on a supercomputer a few seconds.
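To make the material part of the evaluation concrete, here is a minimal sketch in Python. Only the rook (500) and bishop (300) values come from the text above; the other piece values, the `PIECE_VALUES` table and the `material_score` function are assumptions made up for illustration, and a real engine would add positional terms on top.

```python
# Hypothetical piece values. Only R=500 and B=300 come from the article;
# the remaining numbers are assumed for illustration.
PIECE_VALUES = {"P": 100, "N": 300, "B": 300, "R": 500, "Q": 900, "K": 0}

def material_score(pieces):
    """Sum piece values: uppercase letters are my pieces (+),
    lowercase letters are the opponent's pieces (-)."""
    score = 0
    for piece in pieces:
        value = PIECE_VALUES[piece.upper()]
        score += value if piece.isupper() else -value
    return score

# My rook and king against the opponent's bishop and king.
print(material_score("RKbk"))  # 200
```

A positive score means the position is materially better for me, a negative one that it favors the opponent.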

But most possible moves are dumb, and those get removed from the computation very quickly, giving the AI time to think more deeply about the promising ones. Even this isn’t ideal: a move that at first looks like a blunder may turn out to be brilliant a few moves later. Still, it is more than enough to beat humans. And so that we don’t feel inferior, we call chess machines AI even though they usually aren’t much more than what we described above. If we wanted to adapt such an AI to a different game with different rules (for example a first-person shooter video game), we would need to give it a completely different evaluation function and domain knowledge, otherwise the AI would be incredibly dumb. These dumb AIs are also known as narrow artificial intelligence.

DeepMind’s deep Q-learning agent playing Breakout - Two Minute Papers YouTube Channel

This way of doing things, known as alpha-beta pruning, works very well for chess. Each position has on average 35 possible moves, and only about 3 of them are good. This means powerful machines are capable of thinking at least 10 moves ahead. But alpha-beta pruning isn’t nearly as good for the game of Go, which has far more possible moves per position; neural networks turned out to be a much better fit there. That is why good human players could long beat even very strong Go programs, until the neural-network-based AlphaGo beat Lee Sedol, one of the world’s best Go players, in 2016.
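Here is a small sketch of alpha-beta pruning on a toy game tree, to show how bad lines get cut off early. The nested-list tree, the leaf scores and the `visited` bookkeeping are all assumptions for illustration; a real chess engine would generate moves from a board and call an evaluation function at the leaves.

```python
import math

def alphabeta(node, maximizing, alpha=-math.inf, beta=math.inf, visited=None):
    """Minimax with alpha-beta pruning over a nested-list game tree.

    A leaf is a number (its evaluation score); an inner node is a list of
    children. `visited` optionally collects the leaves actually evaluated.
    """
    if not isinstance(node, list):
        if visited is not None:
            visited.append(node)
        return node
    if maximizing:
        best = -math.inf
        for child in node:
            best = max(best, alphabeta(child, False, alpha, beta, visited))
            alpha = max(alpha, best)
            if alpha >= beta:
                break  # the opponent would never allow this line
        return best
    best = math.inf
    for child in node:
        best = min(best, alphabeta(child, True, alpha, beta, visited))
        beta = min(beta, best)
        if alpha >= beta:
            break  # we already have a better option elsewhere
    return best

# Toy tree: three moves for me, each with two replies for the opponent.
tree = [[3, 5], [2, 9], [0, 7]]
seen = []
print(alphabeta(tree, True, visited=seen))  # 3
print(seen)  # [3, 5, 2, 0]
```

Only four of the six leaves are evaluated: once the second and third moves are shown to be worse than the first, their remaining replies (9 and 7) are never looked at.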

But AlphaZero is different from AlphaGo. The Zero in the AI’s name means we don’t give the AI any domain knowledge, only the rules of the game. The AI creates its evaluation function by itself, through reinforcement learning from games played against itself. That evaluation function is a non-linear one, a deep neural network (DNN), which guides a general-purpose Monte Carlo tree search (MCTS). This general ability to learn has been tested in three games so far: chess, shogi and Go. In all of these AlphaZero was able to beat the previous best AIs, even though it was limited to evaluating only tens of thousands of positions per second (about a thousand times fewer than its opponents). DeepMind’s earlier deep Q-learning agent, a related system rather than AlphaZero itself, plays old Atari games, where it gets only the screen image as input. Not only does it soon become a god-like player, it also creates a strategy for itself. For example, while playing Breakout it figured out that the best way to play is to dig a tunnel through the blocks (as seen in the video).
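To give a feel for how a neural network can steer a tree search, here is a rough sketch of a move-selection rule of the PUCT form usually described for this family of MCTS. This is not DeepMind’s code: the `c_puct` constant, the function names and all the numbers in the example are assumptions for illustration. Here `prior` stands for the network’s guess of how promising a move is, and `q` for the average result of the simulations that tried it so far.

```python
import math

def puct_score(q, prior, parent_visits, child_visits, c_puct=1.5):
    """Average result so far (q) plus an exploration bonus that is large
    for moves the network likes (prior) but the search has rarely tried."""
    return q + c_puct * prior * math.sqrt(parent_visits) / (1 + child_visits)

def select_move(children, c_puct=1.5):
    """Pick the move with the highest PUCT score.

    `children` maps a move name to a dict with keys "q", "prior", "visits".
    """
    parent_visits = sum(c["visits"] for c in children.values())
    return max(
        children,
        key=lambda m: puct_score(
            children[m]["q"], children[m]["prior"],
            parent_visits, children[m]["visits"], c_puct,
        ),
    )

# Made-up statistics for three candidate moves.
children = {
    "e4": {"q": 0.1, "prior": 0.6, "visits": 10},
    "d4": {"q": 0.2, "prior": 0.3, "visits": 5},
    "a3": {"q": 0.0, "prior": 0.1, "visits": 0},
}
print(select_move(children))  # a3
```

In this example the never-tried move gets picked next even though the network rates it low, because its exploration bonus has not been divided down by any visits yet. As visits accumulate, the bonus shrinks and the search concentrates on the moves with the best average results.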

Does this mean we are standing at the dawn of general artificial intelligence?

#technology #blog #news #ai