AlphaGo: overcoming new limits through machine learning

Leturia Azkarate, Igor

Computer scientist and researcher

Elhuyar Hizkuntza eta Teknologia

It was recently announced that, for the first time, a machine has beaten a professional player of Go, a board game well known in the Far East. It is the first time that, in a game of these characteristics, human beings have been beaten using the methods known as machine learning. But it will not be the last.

Go is a widespread game in Japan and China. It has at least 10^100 times more combinations than chess. Ed. DOLLARPHOTOCLUB/LENZENDORFMARCUS.

On the road to machine intelligence, that is, to artificial intelligence, it has long been customary to try to build machines, programs or technologies that beat human beings at tasks or games considered complex and suited only to the human brain. In this way, limits that seemed insurmountable beforehand have fallen one by one, and computers have overtaken humans in several games: in three-in-a-row (noughts and crosses), the OXO machine was already able to play perfect games in 1952 and guarantee at least a draw; in checkers, the Chinook program beat the best player of all time in 1994 (and after the game was completely solved in 2007, the machine can now always play a perfect game); and in chess, the Deep Blue machine defeated world champion Garry Kasparov in 1997.

The game of Go was the next challenge for machines to overcome. Although little known in our country, it is widespread in China and Japan and has more than 40 million players worldwide. It is played on a 19x19 board on which two players place black and white stones, each trying to capture the opponent's stones and to control more of the board by the end of the game. A game can last several days, although professional matches are limited to 16 hours. It is estimated that Go allows about 2x10^170 possible combinations, more than the square of the estimated number of atoms in the universe, and at least 10^100 times more combinations than chess! Because of this complexity, researchers had not managed to build a program able to beat professional Go players. Until October of last year, when Google's AlphaGo system beat a professional Go player for the first time.

 

Machine learning

In the history of computer science, various approaches have been used to solve problems. In the simplest one, humans directly code into a program the decisions or moves to be made in each situation, depending on the conditions. This only works for the simplest problems, but in the early days of computing nothing else was possible. Even so, it was enough to master three-in-a-row and other simple games.
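
As an illustration of this first approach, here is a hand-coded player for three-in-a-row (a purely illustrative sketch, with a hypothetical board encoding): every decision is an explicit rule written by a programmer, with no learning involved.

```python
# A hand-coded three-in-a-row (tic-tac-toe) player: every decision is an
# explicit rule written by the programmer, with no learning involved.
# The board is a list of 9 cells containing "X", "O" or " ".

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals

def rule_based_move(board, me="X", opponent="O"):
    """Return the index of the cell to play, using fixed hand-written rules."""
    # Rule 1: if we can complete a line, do it.
    for a, b, c in LINES:
        cells = [board[a], board[b], board[c]]
        if cells.count(me) == 2 and cells.count(" ") == 1:
            return (a, b, c)[cells.index(" ")]
    # Rule 2: if the opponent is about to complete a line, block it.
    for a, b, c in LINES:
        cells = [board[a], board[b], board[c]]
        if cells.count(opponent) == 2 and cells.count(" ") == 1:
            return (a, b, c)[cells.index(" ")]
    # Rule 3: otherwise prefer the centre, then corners, then edges.
    for i in [4, 0, 2, 6, 8, 1, 3, 5, 7]:
        if board[i] == " ":
            return i
    return None  # no empty cell left

print(rule_based_move(["X", "O", "X",
                       " ", "O", " ",
                       " ", " ", " "]))  # blocks the O column: prints 7
```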

Another methodology is the so-called “brute-force search”, also known as combinatorial or deep search. With it, at every point in the game, all the possible moves and everything that can follow from them are analysed. However, depending on the complexity of the game, it may be impossible to analyse every line to the end, or even to reach the minimum depth required. For example, with around 30 possible moves per position (roughly the case in chess), about a million positions must be examined to look 4 moves ahead, on the order of 10^12 for 8 moves, and on the order of 10^18 for 12… Even so, systems of this kind have been used a great deal, and successfully: Deep Blue, in fact, used this method. For Go, however, this approach reaches its limits, since at any point in a game there are around 250 possible moves.
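
To make the brute-force idea concrete, the following illustrative sketch exhaustively searches every possible continuation of a three-in-a-row position. On a 3x3 board this is feasible; with around 250 moves per position, as in Go, the same tree explodes after only a few moves.

```python
# Exhaustive ("brute-force") game-tree search for three-in-a-row:
# every possible continuation is explored to the end of the game.

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def winner(board):
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None

def minimax(board, player):
    """Return (best score for 'X', best move) by searching the full tree."""
    w = winner(board)
    if w == "X":
        return 1, None
    if w == "O":
        return -1, None
    if " " not in board:
        return 0, None          # draw
    results = []
    for m in [i for i, cell in enumerate(board) if cell == " "]:
        board[m] = player
        score, _ = minimax(board, "O" if player == "X" else "X")
        board[m] = " "
        results.append((score, m))
    # "X" maximises the score, "O" minimises it.
    return max(results) if player == "X" else min(results)

score, move = minimax(list(" " * 9), "X")
print(score)   # 0: with perfect play, three-in-a-row is a draw
```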

Therefore, the approach most used in recent times to solve this kind of game, as well as many other complex problems, such as many of those related to language and speech technologies, is machine learning. In machine learning, different methodologies share the same basic mode of operation: there are data structures that, given certain inputs, produce outputs; these structures are shown many examples of inputs together with the outputs they should produce, and learning algorithms adapt their internal structure so that they give the desired outputs. For example, in chess the inputs can be board positions and the outputs the best moves for those positions. If the data structures can be adapted to give those outputs for those inputs, in most cases they will also give a reasonable response in new situations they have never been shown before.
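
Here is a minimal sketch of this input/output idea (with invented data, and scikit-learn's MLPClassifier standing in for whatever data structure might be used): board positions are encoded as vectors of numbers, the desired move is the label, and the training algorithm adjusts the network's internal parameters so that it reproduces, and hopefully generalises, the examples.

```python
# Minimal supervised-learning sketch: inputs are encoded board positions,
# outputs are the moves we would like the system to choose.
# The "data structure" here is a small neural network from scikit-learn.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

# Invented training data: 500 random 3x3 positions encoded as 9 numbers
# (1 = our stone, -1 = opponent, 0 = empty) ...
X = rng.choice([-1, 0, 1], size=(500, 9))
# ... and, as the "expert move" to imitate, simply the first empty cell.
y = np.argmax(X == 0, axis=1)

model = MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000, random_state=0)
model.fit(X, y)                       # adjust the network to the examples

new_position = np.array([[1, -1, 1, 0, -1, 0, 0, 0, 0]])
print(model.predict(new_position))    # predicted move for an unseen position
```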

In machine learning, the data structures initially used were so-called neural networks, designed to mimic the behaviour of the neurons in the human brain. In the 1970s, artificial intelligence research moved away from these systems, but in the late 2000s interest in neural networks resurfaced, and today they are widely used. The AlphaGo system discussed here is based on neural networks. In view of the evolution of recent years and of the results obtained in Go, it seems that in the coming years research in artificial intelligence will continue to move towards machine learning.

 

...and beyond!

Interest in machine learning and neural networks has resurfaced because today's powerful computers make it possible to use more complex data structures. In the case of neural networks, the artificial neurons can now be arranged in many layers, far more data can be used for training, and more complex problems can be solved. These are called deep neural networks, and AlphaGo uses one of them.
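
As a rough illustration (a sketch in PyTorch, not Google's actual architecture, and with arbitrary layer sizes), a deep network for a 19x19 board simply stacks several layers, each feeding its output to the next, and ends with a probability for every point on the board.

```python
# Rough sketch of a deep network for a 19x19 board (illustrative only,
# not AlphaGo's real architecture): several convolutional layers stacked
# one after another, ending with a probability for each of the 361 points.
import torch
import torch.nn as nn

class TinyPolicyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1),  nn.ReLU(),
            nn.Conv2d(32, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, kernel_size=1),             # one score per point
            nn.Flatten(),                                # 19*19 = 361 scores
        )

    def forward(self, board):
        # board: batch of 19x19 planes (1 = black, -1 = white, 0 = empty)
        return torch.softmax(self.layers(board), dim=1)  # move probabilities

net = TinyPolicyNet()
empty_board = torch.zeros(1, 1, 19, 19)
probs = net(empty_board)
print(probs.shape)        # torch.Size([1, 361]): a probability per move
```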

However, a neural network of this kind, even after being trained on 30 million moves from human games, did not give good enough results for Go. So another technique was also used: reinforcement learning. With this method, the resulting system was made to play a great many games against itself, and the moves from those games were then used to train and improve the system further through machine learning. Once that was done, they succeeded in building AlphaGo.
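
The idea of improving through self-play can be shown with a toy sketch (a deliberately simplified stand-in, not DeepMind's method): a three-in-a-row program plays against itself and nudges the estimated value of every position it visited towards the final result of each game.

```python
# Toy self-play reinforcement learning for three-in-a-row: the program plays
# against itself and moves the estimated value of each visited position
# towards the final outcome of the game.
import random

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]
values = {}          # position (as a string) -> estimated value for "X"

def winner(b):
    for i, j, k in LINES:
        if b[i] != " " and b[i] == b[j] == b[k]:
            return b[i]
    return None

def choose_move(board, player, epsilon=0.2):
    """Mostly pick the move leading to the best-valued position; sometimes explore."""
    moves = [i for i, c in enumerate(board) if c == " "]
    if random.random() < epsilon:
        return random.choice(moves)
    def value_after(m):
        after = board[:m] + player + board[m + 1:]
        v = values.get(after, 0.0)
        return v if player == "X" else -v
    return max(moves, key=value_after)

def self_play_game():
    board, player, history = " " * 9, "X", []
    while winner(board) is None and " " in board:
        m = choose_move(board, player)
        board = board[:m] + player + board[m + 1:]
        history.append(board)
        player = "O" if player == "X" else "X"
    result = {"X": 1.0, "O": -1.0, None: 0.0}[winner(board)]
    for pos in history:                       # learn from the outcome
        v = values.get(pos, 0.0)
        values[pos] = v + 0.1 * (result - v)  # nudge estimate towards result

for _ in range(20000):
    self_play_game()
print(len(values), "positions evaluated through self-play")
```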

This means that, to build an artificial player, a machine learning system was used, but one that learned not only from humans but also from itself! The initial training was still done from human games. But would it work without them? That is, if we started from a completely naive or random machine and let it play games against itself, learning and improving all the time, would the result be the same? If so, the consequences would be enormous.

In any case, Fan Hui, the player beaten by AlphaGo in October, is a 2 dan professional, despite being European champion. In mid-March, AlphaGo will face Lee Se-dol, one of the best players in history and holder of the 9 dan rank (the highest there is), in a public match to be held in South Korea. The match will certainly be worth following.
