Mind boggling machine learning results from AlphaZero

PAllen · Dec 11, 2017

PAllen said:

Getting back to this interesting point, it occurs to me that since AlphaZero is learning from scratch, a more appropriate comparison would be with the total number of games of chess played by all serious chess players through history. Every chess player learns from the play of the current generation of strong players, who learned from the play of those before, etc. Thus, the comparable human neural net is not one person but the collection of all serious players from the advent of chess, and even reasonably close predecessor games.

My guess is that this would still not total 44 million, but I have no data on this. It would certainly be less disparate than looking at the games of just one human player.

Ok, here is a data point:

https://shop.chessbase.com/en/products/mega_database_2017

Almost 7 million games we have a record of. Thus, to order of magnitude, a claim could be made that AlphaZero was as effective at mastering chess as the collective net of human chess players.
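As a rough check on the "to order of magnitude" claim, the two figures quoted in this thread can be compared directly. This is only back-of-the-envelope arithmetic: the 44 million and 7 million values are the ones mentioned above, and the variable names are made up for illustration.

```python
import math

# Figures quoted in this thread (not independently verified here).
alphazero_games = 44_000_000       # self-play games attributed to AlphaZero
recorded_human_games = 7_000_000   # "almost 7 million" recorded games

ratio = alphazero_games / recorded_human_games
orders_of_magnitude = math.log10(ratio)

# A log10 ratio below 1 means the two counts differ by less than a factor of 10.
print(f"ratio = {ratio:.1f}, log10(ratio) = {orders_of_magnitude:.2f}")
```

The recorded-games figure is only a lower bound on the games humans have actually played, so the true gap is presumably smaller still.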

Andy Resnick · Dec 12, 2017

PAllen said:

Getting back to this interesting point, it occurs to me that since AlphaZero is learning from scratch, a more appropriate comparison would be with the total number of games of chess played by all serious chess players through history. Every chess player learns from the play of the current generation of strong players, who learned from the play of those before, etc. Thus, the comparable human neural net is not one person but the collection of all serious players from the advent of chess, and even reasonably close predecessor games.

My guess is that this would still not total 44 million, but I have no data on this. It would certainly be less disparate than looking at the games of just one human player.

I finally got a chance to read the arXiv report, which is fascinating. My question is: is there some way to 'peek under the hood' to see the process by which AlphaZero optimized the move probabilities based on the Monte-Carlo tree search? And if, during the process of selecting and optimizing the parameters and value estimates, it arrived at an overall strategic process that is measurably distinct from 'human' approaches to play, could AlphaZero pass a 'chess version' of the Turing test?
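For readers wondering what "move probabilities based on the Monte-Carlo tree search" might look like concretely, here is a minimal sketch. It assumes the approach described in the AlphaGo Zero line of work, where the probability of a move is proportional to its root visit count raised to 1/temperature. The function name, the move strings, and the visit counts are made up for illustration and are not taken from the paper.

```python
def search_probabilities(visit_counts, temperature=1.0):
    """Turn root visit counts into move probabilities, pi(a) ~ N(a)^(1/T).

    visit_counts: dict mapping a move label to how often the search visited it.
    temperature:  values below 1 sharpen the distribution toward the top move.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    weights = {move: count ** (1.0 / temperature)
               for move, count in visit_counts.items()}
    total = sum(weights.values())
    return {move: weight / total for move, weight in weights.items()}


# Hypothetical visit counts after a search from the starting position.
counts = {"e2e4": 600, "d2d4": 300, "g1f3": 100}
pi = search_probabilities(counts)
sharper = search_probabilities(counts, temperature=0.5)
```

In this picture the network's policy output is trained toward a distribution like `pi`, which is one place where 'peeking under the hood' could start: the visit counts are at least directly inspectable, even if the network weights are not.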

PAllen · Dec 12, 2017

Andy Resnick said:

I finally got a chance to read the arXiv report, which is fascinating. My question is: is there some way to 'peek under the hood' to see the process by which AlphaZero optimized the move probabilities based on the Monte-Carlo tree search? And if, during the process of selecting and optimizing the parameters and value estimates, it arrived at an overall strategic process that is measurably distinct from 'human' approaches to play, could AlphaZero pass a 'chess version' of the Turing test?

Yes, it is very worth reading the whole paper.

I can’t think of any way to formulate a chess Turing test that would clearly distinguish the self trained AlphaZero from any of the top current engines. For example, both would handle novel chess problems well, using their very different techniques.

sysprog · Dec 12, 2017

GM Amanov provides insightful commentary.

GTOM · Dec 12, 2017

sysprog said:

GM Amanov provides insightful commentary.

It looks like what I said about simple brute force plus error avoidance describes Stockfish (which is enough to achieve a draw in the majority of cases) rather than the new program. It would be interesting to know what the differences are: does it drop useless combinations faster, or is it more likely to start in the right direction?

phyzguy · Dec 12, 2017

There is a big push for "Explainable AI", driven by the uses in medicine, law and other places. Here is a New York Times article on the issues. So people are working on being able to ask how these deep neural networks make their decisions. If they make progress, it would be interesting to apply to AlphaZero, to see if we can gain insights on how it chooses the correct move.

sysprog · Dec 12, 2017

GTOM said:

It looks like what I said about simple brute force plus error avoidance describes Stockfish (which is enough to achieve a draw in the majority of cases) rather than the new program. It would be interesting to know what the differences are: does it drop useless combinations faster, or is it more likely to start in the right direction?

Amanov mentions in the video that the majority of the games were not yet released -- that renders at best tenuous any suggested plenary interpretation of the results -- preliminary study of emergents from the program set implementation presumably should avail of unfettered access to every game.

gleem · Dec 12, 2017

phyzguy said:

There is a big push for "Explainable AI", driven by the uses in medicine, law and other places. Here is a New York Times article on the issues. So people are working on being able to ask how these deep neural networks make their decisions. If they make progress, it would be interesting to apply to AlphaZero, to see if we can gain insights on how it chooses the correct move.

This is a huge issue. A bit off topic, but there are a number of uses of AI where we would definitely like to know the reason for a behavior, a result or a recommendation from the AI. See https://www.technologyreview.com/s/604087/the-dark-secret-at-the-heart-of-ai/

AaronK · Dec 12, 2017

phyzguy said:

There is a big push for "Explainable AI", driven by the uses in medicine, law and other places. Here is a New York Times article on the issues. So people are working on being able to ask how these deep neural networks make their decisions. If they make progress, it would be interesting to apply to AlphaZero, to see if we can gain insights on how it chooses the correct move.

I mentioned this earlier in the thread, but the discussion on 'Explainable AI' has been a topic of mainstream debate recently because of this talk at NIPS:

Eminent ML expert and director of AI research at Facebook, Prof. Yann LeCun, disagreed (though not explicitly with the point that there needs to be more rigor in ML), and many others have contributed their opinions.

anorlunda · Dec 13, 2017

This thread has been extensively cleaned up to remove off-topic posts commenting on the misconception that AlphaZero used traditional gaming strategies rather than neural networks.

Greg Bernhardt · Dec 19, 2017

Nice video on the subject

Sue Rich · Jan 1, 2018

I love chess. But I don't see the appeal in playing a game against a machine, knowing you're going to lose. Chess is an exercise for the mind. There is no benefit knowing the inevitable outcome is failure. I do, however, see a huge significance for science. Imagine a machine that can tell you exactly how to cure an incurable disease. Or one that can compute the closest planet that will sustain life. The possibilities are mind boggling.

Fooality · Jan 1, 2018

One thing I wonder about is the capacity to abstract generalized intelligence from the physical world. One thing that defines AI is the vast training sets, which humans don't really use. But we do have vast training sets in terms of a continuous stream of experiences from birth, and somehow we are able to use that experience to rapidly learn new abstract things. Scary as it sounds, the bridge to real AI, or a demonstration that Google has it, may have to come from agents processing such streams of world experience: droids!

PAllen · Jan 1, 2018

Fooality said:

One thing I wonder about is the capacity to abstract generalized intelligence from the physical world. One thing that defines AI is the vast training sets, which humans don't really use. But we do have vast training sets in terms of a continuous stream of experiences from birth, and somehow we are able to use that experience to rapidly learn new abstract things. Scary as it sounds, the bridge to real AI, or a demonstration that Google has it, may have to come from agents processing such streams of world experience: droids!

Actually, the distinguishing feature of AlphaZero is that it had no training data at all, nor any input of human expertise. However, in current form it is not at all general intelligence. Instead, it is a general capability to self learn extreme expertise within a closed system all on its own. I agree that much of what people consider general intelligence is tied to interacting with the world and other people, especially via language. At some point this will have to be tackled, to achieve any form of true AI.

Fooality · Jan 1, 2018

PAllen said:

Actually, the distinguishing feature of AlphaZero is that it had no training data at all, nor any input of human expertise. However, in current form it is not at all general intelligence. Instead, it is a general capability to self learn extreme expertise within a closed system all on its own. I agree that much of what people consider general intelligence is tied to interacting with the world and other people, especially via language. At some point this will have to be tackled, to achieve any form of true AI.

It's really impressive. But in a sense I think it does have training data, in terms of the games (as I understand it) it plays against itself. What seems unique is our ability to abstract general information from basic experience, as if one of these were able to see how programming was like chess, and use generalized chess skills to program.

Do you notice how we do that with language? So much in computer science is metaphors for the physical world: trees, folders, firewalls, viruses, sockets... All these terms relate physical experience to abstract entities. The elements of experience we generalize have application beyond the physical world we have experienced thus far and relate also to new realms we have not experienced, including information realms.

mfb · Jan 1, 2018

Fooality said:

But in a sense I think it does have training data, in terms of the games (as I understand it) it plays against itself.

That still means it had to develop everything itself. It didn't have even the most basic knowledge ("keeping the queen is good"). I wonder how the first games were played. Completely randomly until one side happened to be able to checkmate the other side within a few moves? A few games until it discovers that it is advisable to capture pieces of the opponent?

Fooality · Jan 2, 2018

mfb said:

That still means it had to develop everything itself. It didn't have even the most basic knowledge ("keeping the queen is good"). I wonder how the first games were played. Completely randomly until one side happened to be able to checkmate the other side within a few moves? A few games until it discovers that it is advisable to capture pieces of the opponent?

I agree, it's really compelling. That's why it's got me thinking about how far it is from being general AI. The game trees of these games the Alpha machines are playing are huge, so it must be abstracting lessons or rules already, classifying situations, generalizing in a sense. How far do you think it is from generalizing rules that help in other games, or in general world situations?

mfb · Jan 2, 2018

Imperfect knowledge is a big issue. See poker: All relevant probabilities for the hands are trivial even for a 1980 computer, but it took until 2017 for computers to accurately evaluate how to deal with cards others see but the program does not. StarCraft is still an open problem because the AIs cannot scout as well as humans, have trouble including the scouting results in their strategy, and so on. Well, in August Blizzard released tools that made it easier to train AIs. And of course the DeepMind team is involved. Certainly something to watch in 2018.

PeroK · Jan 2, 2018

mfb said:

That still means it had to develop everything itself. It didn't have even the most basic knowledge ("keeping the queen is good"). I wonder how the first games were played. Completely randomly until one side happened to be able to checkmate the other side within a few moves? A few games until it discovers that it is advisable to capture pieces of the opponent?

This is what I can't quite understand. Even if a human being wasn't told the queen is powerful, they would soon work that out. It does seem clumsy to have to play randomly.

I wonder, however, whether the knowledge of checkmate led initially not to random moves but to direct attacks. Was the position without the queen immediately assessed as less favourable because there were fewer checkmating patterns left? Perhaps, generally, that's how it very quickly learned the value of extra material?

In the end, of course, it appears to have developed an indirect, super-strategic style - the opposite of direct attack. The game above is a good example, where checkmate or even an attack on the king never entered into the game. It was a pure strategic out-manoeuvring of Stockfish with no sense that checkmate had anything to do with the game at all.

Although, perhaps that's only its style against a highly powerful computer that makes no tactical errors. I wonder how it would play against a human?

mfb · Jan 2, 2018

I would be surprised if it can understand the opponent - I would expect it to play purely based on the current board (and the RNG output).

It plays unconventionally - I can imagine that human opponents get lost quickly. The AI will take an opportunity to checkmate if it is there, but simply improving the material and tactical advantage more and more is a very reliable strategy as well. A checkmate can be done quickly, that is a good point - probably not too many random moves then.

PeroK · Jan 2, 2018

mfb said:

I would be surprised if it can understand the opponent - I would expect it to play purely based on the current board (and the RNG output).

It plays unconventionally - I can imagine that human opponents get lost quickly. The AI will take an opportunity to checkmate if it is there, but simply improving the material and tactical advantage more and more is a very reliable strategy as well. A checkmate can be done quickly, that is a good point - probably not too many random moves then.

It's not that it understands its opponent, but that Stockfish never gave it the opportunity to go in for some tactics. I'm not sure how Stockfish works, but one strategy for Stockfish would be to be programmed to favour ultra-sharp positions, where its calculating power might be better than Alpha's.

A human could try that strategy, although against a computer it would almost certainly backfire. But, a human could (easier said than done) - especially as White - force Alpha out of its comfort zone. I doubt that Alpha would get caught out by trying to retreat into a strategic game. I would expect Alpha to be able to scrap it out.

I also meant that human mistakes would lead Alpha to a more aggressive style that Stockfish didn't allow. In any case, against a human opponent (especially a weaker player), we might see another side to Alpha's game.

Devils · Jan 11, 2018

Andy Resnick said:

I finally got a chance to read the arXiv report, which is fascinating. My question is: is there some way to 'peek under the hood' to see the process by which AlphaZero optimized the move probabilities based on the Monte-Carlo tree search? And if, during the process of selecting and optimizing the parameters and value estimates, it arrived at an overall strategic process that is measurably distinct from 'human' approaches to play, could AlphaZero pass a 'chess version' of the Turing test?

I don't think so. You are right in what the paper is about. There is no alpha-beta, no "clever heuristics", no evaluation function. Instead there is a deep neural network and Monte Carlo Tree Search.
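To make the "deep neural network and Monte Carlo Tree Search" combination a little more concrete, here is a minimal sketch of the move-selection step inside the search. It assumes the PUCT-style rule described in the AlphaGo Zero line of work, where each candidate move is scored by its average value plus an exploration bonus scaled by the network's prior. The `Edge` class, the `select_move` function, the `c_puct` value and the example numbers are all hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Edge:
    """Search statistics for one candidate move from a position."""
    prior: float            # P(s, a): move probability from the network
    visits: int = 0         # N(s, a): how often the search tried this move
    value_sum: float = 0.0  # W(s, a): sum of evaluations backed up through it

    @property
    def mean_value(self):
        # Q(s, a): average evaluation, 0 for a move not yet tried.
        return self.value_sum / self.visits if self.visits else 0.0


def select_move(edges, c_puct=1.5):
    """Pick the move maximising Q + U, with U favouring high prior, low visits."""
    total_visits = sum(edge.visits for edge in edges.values())

    def score(edge):
        exploration = (c_puct * edge.prior
                       * math.sqrt(total_visits) / (1 + edge.visits))
        return edge.mean_value + exploration

    return max(edges, key=lambda move: score(edges[move]))


# Hypothetical position: move "a" looks good but has been searched a lot,
# so the rule steers the next simulation toward the untried move "b".
edges = {
    "a": Edge(prior=0.6, visits=10, value_sum=5.0),
    "b": Edge(prior=0.3),
    "c": Edge(prior=0.1),
}
chosen = select_move(edges)
```

The point of the sketch is that the network's prior does the job a hand-written move-ordering heuristic does in a traditional engine: it decides which branches get searched at all.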

What they seem to have done is transform the problem from one domain to another. Evaluation functions are subject to human whims and best guesses. On the other hand, numerical optimisation problems have been studied for well over 40 years.

The authors make statements like "AlphaGo Zero tuned the hyper-parameter of its search by Bayesian optimisation", "AlphaZero instead estimates and optimises the expected outcome", and "AlphaZero evaluates positions using non-linear function approximation." We know an awful lot about efficient numerical optimisation algorithms, and not so much about hand-coded evaluation functions.

They also say "while alpha-beta programs based on neural networks have previously been unable to compete with faster, handcrafted evaluation functions." In other words, people have tried similar approaches before and failed, and the authors believe their new approach is right.

The "breakthrough" seems to be the transformation process, or creating a "dual" problem. So have they given the algorithm? No, just a hint. All they have said is there is a "secret sauce" without providing the recipe - yet.

mfb · Jan 12, 2018

Devils said:

and the authors believe their new approach is right.

With the success in Go and Chess: The approach can't be too bad...

BWV · Jan 12, 2018

Listened to a podcast w Garry Kasparov from 2

BWV · Jan 12, 2018

Is the key issue that the neural networks can do a better job than humans of trimming the tree of potential moves?

The way I understand traditional chess engines is that a human expert provides a framework which keeps the combinations that the computer crunches with raw force at a manageable level. This is combined with an opening book and an endgame book of solved positions (which I think is now solved up to 6-7 pieces in total on the board, at about the limit that any computer will ever do given how rapidly the number of combinations explodes). It's impressive if DeepMind's program trained itself not only on the middle game, where traditional engines rely on human-programmed heuristics to keep the calculations manageable, but also trained itself to a level equivalent with the current 100TB databases of solved endgames.
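As an illustration of what "trimming the tree" means in a traditional engine, here is a minimal sketch of alpha-beta pruning on a toy game tree. It is not Stockfish's actual search, which layers many hand-tuned heuristics on top of this core idea. The nested-list tree, the leaf scores and the `counter` argument are made up for the example.

```python
def alphabeta(node, alpha=float("-inf"), beta=float("inf"),
              maximizing=True, counter=None):
    """Minimax value of a toy tree, skipping branches that cannot matter.

    node:    either a number (leaf evaluation) or a list of child nodes.
    counter: optional one-element list used to count evaluated leaves.
    """
    if not isinstance(node, list):      # leaf: return its static evaluation
        if counter is not None:
            counter[0] += 1
        return node

    if maximizing:
        best = float("-inf")
        for child in node:
            best = max(best, alphabeta(child, alpha, beta, False, counter))
            alpha = max(alpha, best)
            if alpha >= beta:           # opponent would never allow this line
                break
        return best

    best = float("inf")
    for child in node:
        best = min(best, alphabeta(child, alpha, beta, True, counter))
        beta = min(beta, best)
        if alpha >= beta:               # we already have a better option
            break
    return best


# Toy position: three candidate moves, each with two replies (leaf scores).
tree = [[3, 5], [2, 9], [0, 1]]
leaves_evaluated = [0]
value = alphabeta(tree, counter=leaves_evaluated)
print(value, leaves_evaluated[0])  # 3 4
```

Only 4 of the 6 leaves are evaluated here, because once a reply refutes a move there is no need to look at its remaining replies. The leaf scores stand in for the hand-written evaluation function, which is the part AlphaZero replaces with a learned network.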

mfb · Jan 12, 2018

BWV said:

Is the key issue that the neural networks can do a better job than humans of trimming the tree of potential moves?

It would be interesting to see how well AlphaZero performs if the number of states it can go through is limited to human-like levels. I'm not aware of such a competition.

PAllen · Jan 13, 2018

mfb said:

It would be interesting to see how well AlphaZero performs if the number of states it can go through is limited to human-like levels. I'm not aware of such a competition.

Problem is, nobody knows how many positions humans consider, because humans cannot accurately report on both conscious and unconscious thought - a milder version of a major problem with neural networks. If you believe what is reported, Capablanca (world chess champion with reasonable claim to being the greatest natural chess prodigy) answered the question “how many moves do you consider?” with “I only consider one move - the best one”. Of course tongue in cheek, but really nobody including the grandmaster knows all that goes into choosing a move.

mfb · Jan 13, 2018

You can't get the number right, but you can get a rough estimate. "Did you consider this state?" Currently we know every chess engine considers many more board positions than humans do.

Hendrik Boom · Jan 13, 2018

Fooality said:

One thing I wonder about is the capacity to abstract generalized intelligence from the physical world. One thing that defines AI is the vast training sets, which humans don't really use. But we do have vast training sets in terms of a continuous stream of experiences from birth, and somehow we are able to use that experience to rapidly learn new abstract things. Scary as it sounds, the bridge to real AI, or a demonstration that Google has it, may have to come from agents processing such streams of world experience: droids!

I keep wondering whether this technology would be applicable to solving mathematics problems, perhaps defining the game rules by some formal system. What I find hard to imagine is how to formulate the state of a partial proof in a form that can be the input of an artificial neural net.

BWV · Jan 13, 2018

I was not clear in my earlier post. My point is that chess engines like Stockfish, to my understanding, rely on human-programmed heuristics to trim the decision tree in the middle game. I am guessing that this is where the NN outperforms them.
