Mind boggling machine learning results from AlphaZero

PAllen · Dec 6, 2017

I have always been on the skeptical, but not dismissive, end of judgments about achievements and rate of progress in this field. However, the following (please read through carefully) just completely blows my mind:

https://en.chessbase.com/post/the-future-is-here-alphazero-learns-chess

A self learning algorithm with no knowledge or base of games, starting only from the rules of chess, within 24 hours is much better than any other existing chess program despite running on hardware 900 times slower!

256bits · Dec 7, 2017

Well, there is the link "Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm", PDF, under heading A new paradigm ( third way down ).
If I read Table S3 correctly, it took 44 million training games to learn chess, and 21 million to learn Go and Shogi, to become the best at winning.
Humans hardly play that many games with their slow processing grey matter.

PAllen · Dec 7, 2017

256bits said:

Well, there is the link "Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm", PDF, under heading A new paradigm ( third way down ).
If I read Table S3 correctly, it took 44 million training games to learn chess, and 21 million to learn Go and Shogi, to become the best at winning.
Humans hardly play that many games with their slow processing grey matter.

True, but what shocked me was the leap beyond any prior results in machine learning that I’m aware of.

256bits · Dec 8, 2017

PAllen said:

True, but what shocked me was the leap beyond any prior results in machine learning that I’m aware of.

It would be interesting if some kind of Moore's law could be applied to the evolution of smarter machines, not just that due to faster processing power, but from the use of better algorithms. The people and teams that work on these systems seem to be some kind of brainiacs themselves, putting it all together and making it work.

PAllen · Dec 8, 2017

It is perhaps worth clarifying that this is an amazing result in machine learning in a closed domain (fixed rules, fixed definition of value). It does not address an open domain, or AI per se, at all.

QuantumQuest · Dec 8, 2017

I have been playing chess for a long time so I can see the value for us humans regarding the game itself in the long run but clearly it is something beyond our reach to follow.
From an AI perspective it is very impressive but of more value is (quoting from the article);

This completely open-ended AI able to learn from the least amount of information and take this to levels hitherto never imagined is not a threat to ‘beat’ us at any number of activities, it is a promise to analyze problems such as disease, famine, and other problems in ways that might conceivably lead to genuine solutions.

And once again, the basic principle taught in intro courses in CS that advancements in hardware cannot outperform a very efficient algorithm (a fortiori combined with some sort of neural network and its respective processes), is popping up.

gleem · Dec 8, 2017

If you have been following any of the other threads on AI with regards to its impact on society you may have noticed a distinct group who believe that AI's impact, and in particular its chance of rivaling human intelligence, is nil even in the next century. In particular, they hold that our understanding of cognitive processes and the state of the art of computer systems do not support the idea of a super general intelligence that might overcome human intelligence.

The recent successes of machine learning in Go, Dota 2 (a video game) and now chess should give those doubters some pause to reconsider their opinions, in particular because of the rate of improvement in performance. The article in the OP noted the prediction that it would take a decade for AI to compete in Go against humans. It took three years. In another year it defeated the world champion 3-0. Now AlphaGo Zero has beaten that system 100-0, with 1/12 the processors of the original system.

Both in Go and in chess the AlphaGo Zero and AlphaZero systems have "found" new strategies/moves not heretofore identified, surprising their developers. In Go, for example, the new strategy was adopted by the world champ and he went on a 22-game winning streak against other players.

Is there a Moore's law for AI? Of course we will have to come up with a metric to determine it. But there seems to be significant progress in AI in recent years. Note that AlphaZero is a factor of 900 slower than other chess systems but still outperforms them. The AlphaZero developers seem to be keeping in mind the motto "work smarter, not harder" for their system.

MathematicalPhysicist · Dec 8, 2017

https://xkcd.com/1875/

:-)

PeroK · Dec 8, 2017

gleem said:

If you have been following any of the other threads on AI with regards to its impact on society you may have noticed a distinct group who believe that AI's impact, and in particular its chance of rivaling human intelligence, is nil even in the next century. In particular, they hold that our understanding of cognitive processes and the state of the art of computer systems do not support the idea of a super general intelligence that might overcome human intelligence.

The recent successes of machine learning in Go, Dota 2 (a video game) and now chess should give those doubters some pause to reconsider their opinions, in particular because of the rate of improvement in performance. The article in the OP noted the prediction that it would take a decade for AI to compete in Go against humans. It took three years. In another year it defeated the world champion 3-0. Now AlphaGo Zero has beaten that system 100-0, with 1/12 the processors of the original system.

Both in Go and in chess the AlphaGo Zero and AlphaZero systems have "found" new strategies/moves not heretofore identified, surprising their developers. In Go, for example, the new strategy was adopted by the world champ and he went on a 22-game winning streak against other players.

Is there a Moore's law for AI? Of course we will have to come up with a metric to determine it. But there seems to be significant progress in AI in recent years. Note that AlphaZero is a factor of 900 slower than other chess systems but still outperforms them. The AlphaZero developers seem to be keeping in mind the motto "work smarter, not harder" for their system.

I was an average chess player at about 1800-1900 Elo. In terms of chess playing, computers were better than me almost from the outset. The most remarkable thing to me is the standard to which some humans can play chess. The idea of having any chance against a modern computer is absurd.

If chess playing or go playing is a measure of intelligence, then computers have been more intelligent than me for decades.

And, am I really that much less intelligent than Magnus Carlsen? It seems to me that he is more like a machine than a human in terms of his chess playing ability.

That said, AlphaZero's approach to learning chess is remarkable. What it did to Stockfish in some of those games was beautiful.

MathematicalPhysicist · Dec 8, 2017

PeroK said:

I was an average chess player at about 1800-1900 Elo. In terms of chess playing, computers were better than me almost from the outset. The most remarkable thing to me is the standard to which some humans can play chess. The idea of having any chance against a modern computer is absurd.

If chess playing or go playing is a measure of intelligence, then computers have been more intelligent than me for decades.

And, am I really that much less intelligent than Magnus Carlsen? It seems to me that he is more like a machine than a human in terms of his chess playing ability.

That said, AlphaZero's approach to learning chess is remarkable. What it did to Stockfish in some of those games was beautiful.

Well, I never really enjoyed playing Chess so much, it's just a matter of rote memorizing all the correct combinations.

Playing football or basketball is a lot more fun, more spontaneous.
Either way, you need to practice a lot to be a master in something, like the fact that the machines played so many games.

MathematicalPhysicist · Dec 8, 2017

BTW, when I first heard Edward Witten talk, I thought he was a robot.

He has that voice pattern... :-D

Reminds me of Blade Runner.

AaronK · Dec 8, 2017

Certainly fantastic results from the collaborators at DeepMind. The next step is going to be a major hurdle though--we'll have to extend it to model-free control. The Test of Time award winner at NIPS, Ali Rahimi, gave a great talk about the ever more essential need for greater scientific rigor and the development of useful theoretical foundations in Machine Learning such that we might gain some insights to approach problems in the field that currently seem almost intractable (e.g. high dimensionality). For example, one thing he mentioned was how we use batch normalization to accelerate gradient descent, and how this is "explained" by saying batch norm "reduces internal covariate shift." The problem is, we really haven't a clue as to *why* reducing internal covariate shift speeds up gradient descent. Essentially, he wants us to move away from the sort of unguided engineering approach, and move towards a culture of and approach to research more similar to Biology (as opposed to just training our models conventionally and trying to reduce error rate as much as possible, and ultimately just a more empirical 'be creative and see what sticks' kind of approach).

Interestingly, this has spawned a debate in the community. Prof. Yann LeCun came out and officially disagreed with Rahimi, stating that the state of the art so far has been developed due to the clever engineering of researchers (he took offense to what Rahimi called it: "Alchemy"), and that practical technology almost always precedes the theory developed to fully explain the technology (e.g. Watt's steam engine came before the Carnot cycle--you can't criticize Watt for not being Carnot, basically). I think the essence of the debate is whether the current culture of research will be beneficial or detrimental for the field going forward--Rahimi seems to think it is becoming detrimental, since it's still a challenge to pedagogically deliver ML to students without any anchoring theoretical foundations (you actually have to "get good" at machine learning by developing a lot of experience and intuition while training models, without being able to anchor back to any real first principles), and because of the need to have proper explanations for the workings of our ML systems in the context of life or death situations involving humans (e.g. autonomous driving, cancer pathology). Prof. LeCun thinks that this way of doing things is just fine, and that Rahimi shouldn't needlessly criticize the work done so far but should instead go and do the theory work, to which Rahimi replied that the talk he gave was basically a public plea for help.

I see both sides' arguments, and it's been very interesting to follow the discussion so far. Either way, I'm excited to see where we'll be in 5 or 10 years, when we'll have seriously improved and expanded datasets for model training combined with superior hardware like MIT's programmable nanophotonic processor or utilizing the phase-change memory work of IBM researchers to allow for massively parallel computing systems, which would be great for ML. Maybe by then we'll have progressed a good bit on developing theoretical foundations in ML.

Greg Bernhardt · Dec 8, 2017

PeroK said:

And, am I really that much less intelligent than Magnus Carlsen?

PAllen · Dec 8, 2017

Greg Bernhardt said:

To contrast with this video, Magnus is a big fan of Monty Python, an interest shared with Viswanathan Anand, and there are videos of them doing skits together. He is also a fan of Donald Duck - this is apparently a Norway thing - Donald Duck is quite popular there.

PAllen · Dec 8, 2017

As for blindfold play, this is a specialty with the current record being 48 simultaneous blindfold games:

https://www.chess.com/news/view/timur-gareyev-plays-blindfold-on-48-boards-5729

(In the above video, Magnus plays 10 simultaneous blindfold games, the most he has ever done. The record that stood for decades was by Miguel Najdorf during WW II of 45 simultaneous blindfold games. Of interest about this is that one player in Timur’s exhibition had also played in Najdorf’s. Even more remarkable is that Najdorf’s exhibition was motivated in part to try to let his family in Nazi Germany know he was ok and alive. He had fled to Argentina and there was no normal communication method. He figured correctly that his feat would be covered even in Germany and his family would see it. Postwar, it was verified that his idea had worked.)

Delta2 · Dec 8, 2017

Well to be honest I have my doubts, though it seems astonishing to beat the top engines that are based solely on search and evaluation algorithms and rely on NPS computational power (Nodes(positions) per Second processed(searched and evaluated)),

still reading the article it says that the engine as white prefers to play the English opening (1.c4 ...) or the Queen's Gambit openings (1.d4 d5 2.c4 ...).

According to Bobby Fischer, one of the top players in all the history of chess, best first move for white is 1.e4 .
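To make "search and evaluation" and "nodes per second" concrete, here is a minimal sketch of the kind of game-tree search conventional engines are built around. It is not taken from any real engine: the toy game (a pile of stones, each side takes 1 to 3, whoever takes the last stone wins), the function name `search` and the `stats` node counter are all assumptions chosen for illustration. It shows negamax with alpha-beta pruning, and counts positions searched with and without pruning.

```python
from typing import Dict, Optional


def search(stones: int, alpha: int = -1, beta: int = 1,
           stats: Optional[Dict[str, int]] = None, prune: bool = True) -> int:
    """Negamax over a toy take-1-to-3 stones game (taking the last stone wins).

    Returns +1 if the side to move wins with best play, -1 otherwise.
    """
    if stats is not None:
        stats["nodes"] += 1          # one more position ("node") searched
    if stones == 0:
        return -1                    # the opponent just took the last stone
    best = -1
    for take in (1, 2, 3):
        if take > stones:
            break
        # The child's score is from the opponent's point of view, so negate it.
        score = -search(stones - take, -beta, -alpha, stats, prune)
        best = max(best, score)
        alpha = max(alpha, score)
        if prune and alpha >= beta:
            break                    # alpha-beta cutoff: a win is already found
    return best


if __name__ == "__main__":
    for n in (4, 5, 12):
        full, pruned = {"nodes": 0}, {"nodes": 0}
        value = search(n, stats=full, prune=False)
        assert value == search(n, stats=pruned)
        print(n, value, full["nodes"], pruned["nodes"])
```

A real chess engine cannot search to the end of the game, so it stops at some depth and applies a hand-tuned evaluation function there; the toy game is small enough to search exhaustively, which is why no evaluation function appears in the sketch.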

mfb · Dec 8, 2017

For some reason I trust the preference of an algorithm that beats all algorithms that consistently beat all humans more than the preference of a human. e4 might be a good move against humans, but apparently not against much stronger opponents.

PAllen · Dec 8, 2017

Delta² said:

Well to be honest I have my doubts, though it seems astonishing to beat the top engines that are based solely on search and evaluation algorithms and rely on NPS computational power (Nodes(positions) per Second processed(searched and evaluated)),

still reading the article it says that the engine as white prefers to play the English opening (1.c4 ...) or the Queen's Gambit openings (1.d4 d5 2.c4 ...).

According to Bobby Fischer, one of the top players in all the history of chess, best first move for white is 1.e4 .

Fischer said that, but in his critical match against Spassky in 1972, he played English and QGD frequently as white. In fact his first win as white in this match was an English that transposed to a QGD.

Delta2 · Dec 8, 2017

mfb said:

For some reason I trust the preference of an algorithm that beats all algorithms that consistently beat all humans more than the preference of a human. e4 might be a good move against humans, but apparently not against much stronger opponents.

I don't think it's good to trust a computer program more than a human top GM (grandmaster).

The problem with humans against engines is that humans (even strong GMs) lag behind in terms of tactical processing of a given position on the board. However, humans are better at positional processing of the position.

Since time is a factor (both players start with a limited time) and humans spend a lot of time calculating the tactical complexities of a position (a human, even a GM, might spend 5 min on something that an engine can process in 0.005 min, for example to see an elegant queen sacrifice that leads to a forced mate in 5 moves), that's the main reason humans are getting beaten by engines.

If we allow for a hybrid of a human GM + a typical engine that analyses positions, so that the GM sees the engine analysis for the various possible moves he has in mind, then I believe this hybrid can beat any engine (stockfish or rybka, or even alphazero or whatever). Or if we allow for very long (slow) time controls, so that the human has a lot of time to think about the tactical complexities of a position, then I believe a human GM has the advantage over any kind of engine.

PAllen · Dec 9, 2017

Delta² said:

I don't think it's good to trust a computer program more than a human top GM (grandmaster).

The problem with humans against engines is that humans (even strong GMs) lag behind in terms of tactical processing of a given position on the board. However, humans are better at positional processing of the position.

Since time is a factor (both players start with a limited time) and humans spend a lot of time calculating the tactical complexities of a position (a human, even a GM, might spend 5 min on something that an engine can process in 0.005 min, for example to see an elegant queen sacrifice that leads to a forced mate in 5 moves), that's the main reason humans are getting beaten by engines.

If we allow for a hybrid of a human GM + a typical engine that analyses positions, so that the GM sees the engine analysis for the various possible moves he has in mind, then I believe this hybrid can beat any engine (stockfish or rybka, or even alphazero or whatever). Or if we allow for very long (slow) time controls, so that the human has a lot of time to think about the tactical complexities of a position, then I believe a human GM has the advantage over any kind of engine.

It is true that centaurs (human + computer) beat computer alone for current commercial engines. Human with long time control no longer does. It would certainly be interesting to try human plus stockfish or Houdini against alphazero. I am not so sure the centaur would win. Some top grandmasters have likened alphazero’s play to perfected Karpov, i.e. immense positional understanding.

Delta2 · Dec 9, 2017

PAllen said:

It is true that centaurs (human + computer) beat computer alone for current commercial engines. Human with long time control no longer does. It would certainly be interesting to try human plus stockfish or Houdini against alphazero. I am not so sure the centaur would win. Some top grandmasters have likened alphazero’s play to perfected Karpov, i.e. immense positional understanding.

Has alphazero played any games against strong GMs or against centaurs?

PAllen · Dec 9, 2017

Delta² said:

Has alphazero played any games against strong GMs or against centaurs?

No. And I suspect it won’t happen. The purpose of this exercise, for the Deep Mind team, was to validate a general learning from scratch network using 3 radically different complex rule sets (chess, shogi, and go). My guess is that we won’t hear from them in a while as they try to make inroads into open domains without fixed rule sets.

Delta2 · Dec 9, 2017

That would be interesting to see. As you say, the transitive property doesn't generally hold in sports (chess can be seen as a mental sport); in other words, it is not certain that because alphazero beats conventional engines and engines beat human GMs, alphazero will beat a strong human GM.

mfb · Dec 9, 2017

AlphaZero plus human might have an advantage over AlphaZero alone (I'm not even sure about that - the communication time might be worse than just letting AlphaZero move), but I don't see how humans plus a different existing program would come close to AlphaZero. Whatever the other program would give to a human to evaluate, AlphaZero would evaluate on its own. And we know AlphaZero can evaluate situations much better than humans, given the same time - that's how it decides what to do, and it can do that extraordinarily well. That takes a factor 2 in processing time - so what? It is so far above other programs that a factor 2 in processing time doesn't make a difference.

We are not talking about a program that is a bit better than other programs. We are talking about a program that did not lose a single game out of 100, against a computer program so strong that you need several iterations of "x has no chance against y" until you reach the level of humans.
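The "several iterations of x has no chance against y" idea can be put into numbers with the standard Elo expected-score formula, E = 1 / (1 + 10^(-d/400)), where d is the rating difference. The sketch below is only an illustration: the function names are made up here, and the rating gaps in the loop are hypothetical values, not measured ratings of any engine or player.

```python
import math


def expected_score(rating_diff: float) -> float:
    """Expected score (win = 1, draw = 0.5) for the higher-rated side under the Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-rating_diff / 400.0))


def rating_gap(score: float) -> float:
    """Inverse of expected_score: the rating gap implied by an average score in (0, 1)."""
    return 400.0 * math.log10(score / (1.0 - score))


if __name__ == "__main__":
    # Hypothetical gaps, chosen only to show how quickly the odds become lopsided.
    for gap in (0, 200, 400, 800):
        print(gap, round(expected_score(gap), 3))
```

Under this model a 400-point gap already gives the stronger side about 91% of the points and an 800-point gap about 99%, so stacking two or three such gaps between a club player and a top engine leaves essentially nothing for the weaker side.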

Delta2 · Dec 9, 2017

mfb said:

AlphaZero plus human might have an advantage over AlphaZero alone (I'm not even sure about that - the communication time might be worse than just letting AlphaZero move), but I don't see how humans plus a different existing program would come close to AlphaZero. Whatever the other program would give to a human to evaluate, AlphaZero would evaluate on its own. And we know AlphaZero can evaluate situations much better than humans, given the same time - that's how it decides what to do, and it can do that extraordinarily well. That takes a factor 2 in processing time - so what? It is so far above other programs that a factor 2 in processing time doesn't make a difference.

We are not talking about a program that is a bit better than other programs. We are talking about a program that did not lose a single game out of 100, against a computer program so strong that you need several iterations of "x has no chance against y" until you reach the level of humans.

I have my doubts on that, but you might be right; we'll just have to wait for some games of AlphaZero against human GMs and centaurs (with no AlphaZero program in their aid).

AaronK · Dec 9, 2017

PAllen said:

It is true that centaurs (human + computer) beat computer alone for current commercial engines. Human with long time control no longer does. It would certainly be interesting to try human plus stockfish or Houdini against alphazero. I am not so sure the centaur would win. Some top grandmasters have likened alphazero’s play to perfected Karpov, i.e. immense positional understanding.

What would be of particular interest would be to see the system that beat Go Champion Ke Jie 3/nil collaborate together with him against the current Alphazero over a series of matches. I would pay money to see that live.

EDIT: Though yes, Alphazero may in fact completely crush that older AlphaGo system/Ke Jie team against it. I would just like to confirm it, since many have been claiming that humans and machines working together almost in a team-like manner is how our relationship with our advancing AI will continue to be, seeing how machine/human teams were beating the sole AlphaGo agent, if I'm remembering correctly--they suspected the human brought something integral to the match that somehow benefited upon collaboration with the agent.

And yet, I think we'll perhaps soon begin to see that this sort of thinking is a kind of illusion. There is real uncertainty here, at least with respect to the ultimate horizon of possible play (especially in Go). It may be that machine learning agents begin to play at a level beyond the comprehension of humans, as in they'll make moves that can't really be understood, at least not in real time, not even probably by the world's best players.

anorlunda · Dec 9, 2017

256bits said:

It would be interesting if some kind of Moore's law could be applied to the evolution of smarter machines, not just that due to faster processing power, but from the use of better algorithms.

That would be welcome. But it needs to be very simple and understandable, so that every person and every organization that tries to apply it has the same understanding of what it means before comparison. It also needs to remain constant in time and resist shifting definitions.

Got any suggestions?

QuantumQuest · Dec 9, 2017

Delta² said:

According to Bobby Fischer, one of the top players in all the history of chess, best first move for white is 1.e4 .

Yes, Bobby Fischer said that, and I don't think there is a single chess player out there who can doubt his great value / expertise. The important thing is the rationale behind it. By playing 1. e2-e4 as white you open into one of the most crucial central squares of the chessboard, you give immediate mobility to your queen and bishop and one more square to the knight (from the outset), and you also don't make clear to your opponent which of the available systems (for this position) you'll choose, i.e. flexibility. This is great, but as chess playing over centuries has shown, there are many more strong openings. Bobby Fischer himself made such choices as well, taking into account, of course, the changes / adaptations you need to make according to the opponent's choices. Personally, although far from GM level, from some point on I have been a fan of 1. d2-d4 and in particular of Botvinnik's chess school/system, but there are a lot of other great systems too, including the English opening (speaking of white). So I personally find it reasonable that (quoting from the article)

So what openings did AlphaZero actually like or choose by the end of its learning process? The English Opening and the Queen's Gambit!

Also, I don't think that any possible combination of human player (even at the GM level) with some conventional chess engine can stand any chance against AlphaZero for the reasons that mfb points out in #24. I am not absolutely sure of course and no one can be in advance. It has to be proven first.

Buffu · Dec 9, 2017

PAllen said:

It is true that centaurs (human + computer) beat computer alone for current commercial engines.

Is there a recent match where this has happened ?

PAllen said:

It would certainly be interesting to try human plus stockfish or Houdini against alphazero. I am not so sure the centaur would win.

That's a bold claim.

PAllen · Dec 9, 2017

Buffu said:

Is there a recent match where this has happened ?

Every postal chess match is a battle between centaurs, with a few who just pick machine moves. Skilled centaurs always win. Note, postal chess admitted it was pointless to expect people wouldn’t use machine aids, so they just allowed it, redefining the nature of the competition.

Buffu said:

That's a bold claim.

How so? It is a non-claim: I don’t know what would happen.

Delta2 · Dec 9, 2017

There must be some "backstage" info. I mean, the developer team of Alphazero would probably have hired some human GMs/IMs/FMs to cooperate with them, and the program has probably played some unofficial games against GMs. Does anyone know anything about this?

phyzguy · Dec 9, 2017

Delta² said:

There must be some "backstage" info. I mean, the developer team of Alphazero would probably have hired some human GMs/IMs/FMs to cooperate with them, and the program has probably played some unofficial games against GMs. Does anyone know anything about this?

Why would you think this? Are you saying the developers lied about how they trained the neural network? The computer played over 500,000 games against itself. This would have given it a huge number of possible games to learn from. Would a few games against a human grand master really make much difference? How many chess games does a human grand master play in the course of learning the game?

Delta2 · Dec 9, 2017

phyzguy said:

Why would you think this? Are you saying the developers lied about how they trained the neural network? The computer played over 500,000 games against itself. This would have given it a huge number of possible games to learn from. Would a few games against a human grand master really make much difference? How many chess games does a human grand master play in the course of learning the game?

No, I am not saying that they lied about how they trained the program, but I don't know if they got some aid from GMs regarding the development of the source code of the program. Conventional chess program developers often cooperate with GMs, and that is reflected in the source code (mainly the source code of its evaluation function) of the conventional chess program. I thought that the Alphazero developers might also have done the same thing. (In simple words, a GM can tell a programmer how he/she thinks when playing chess and the programmer can somehow incorporate this info into the source code of the program.)

AaronK · Dec 9, 2017

Delta² said:

No, I am not saying that they lied about how they trained the program, but I don't know if they got some aid from GMs regarding the development of the source code of the program. Conventional chess program developers often cooperate with GMs, and that is reflected in the source code (mainly the source code of its evaluation function) of the conventional chess program. I thought that the Alphazero developers might also have done the same thing. (In simple words, a GM can tell a programmer how he/she thinks when playing chess and the programmer can somehow incorporate this info into the source code of the program.)

The earlier version of AlphaGo (at least for Go) was trained by looking at human gameplay before then playing numerous games against itself, and that probably biased that older version to assign greater value to certain moves for some low-complexity shapes on the board (or else it independently discovered the best moves to play in those situations that humans happen to have also discovered). Alphazero, however (as stated previously), was only given the rules of the games. Some of the incredibly unorthodox play we've seen from DeepMind's flagship AI for a while now is probably the main reason for that, even aside from its ability to explore so much more of the gamespace.
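As a rough illustration of "only given the rules, then learning from self-play", here is a minimal sketch on a toy game. It is emphatically not DeepMind's algorithm: AlphaZero combines a deep neural network with tree search, whereas this swaps in a plain lookup table with Monte Carlo updates. The game (take 1 to 3 stones, last stone wins) and every name and parameter below (`train`, `greedy_move`, `epsilon`, `lr`) are assumptions made for the example.

```python
import random
from collections import defaultdict


def legal_moves(stones):
    """The only game knowledge supplied: you may take 1, 2 or 3 stones."""
    return [take for take in (1, 2, 3) if take <= stones]


def train(episodes=50000, max_start=10, epsilon=0.2, lr=0.1, seed=0):
    """Learn q[(stones, take)], the estimated outcome for the side to move, by self-play."""
    rng = random.Random(seed)
    q = defaultdict(float)
    for _ in range(episodes):
        stones = rng.randint(1, max_start)
        history = []                          # (stones, take) for every move played
        while stones > 0:
            moves = legal_moves(stones)
            if rng.random() < epsilon:        # explore: try a random move
                take = rng.choice(moves)
            else:                             # exploit: best move found so far
                take = max(moves, key=lambda m: q[(stones, m)])
            history.append((stones, take))
            stones -= take
        outcome = 1.0                         # whoever moved last took the last stone
        for state_action in reversed(history):
            q[state_action] += lr * (outcome - q[state_action])
            outcome = -outcome                # the previous move was the other side's
    return q


def greedy_move(q, stones):
    """Best move according to the learned table."""
    return max(legal_moves(stones), key=lambda take: q[(stones, take)])


if __name__ == "__main__":
    table = train()
    print({n: greedy_move(table, n) for n in range(1, 11)})
```

Nobody tells the program the known trick for this game (leave your opponent a multiple of four stones); the table is expected to pick it up purely from the outcomes of its own games, which is the same idea in miniature.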

mfb · Dec 10, 2017

Which other games do you expect next?
Chess and Go have no randomness and no hidden information - all players always know the full state of the game. Many card games have hidden information (cards not shown to everyone) and randomness, for example. There is a Poker-AI beating humans but that is a different topic.

There are multiple games with no randomness or hidden information, but most of them are much easier for computers than Chess and Go. You don’t need a Checkers-AI or Connect 4-AI because there is an exact manual how to not lose.
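For a feel of what "an exact manual how to not lose" means, here is a minimal sketch that solves a much smaller game of the same type by exhaustive search. Tic-tac-toe is used as a stand-in because it fits in a few lines; Checkers and Connect 4 were solved with far heavier machinery. The board encoding (a 9-character string, "." for empty) and the function names are assumptions for this example.

```python
from functools import lru_cache

# All eight winning lines on a 3x3 board, as index triples.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return "X" or "O" if that side has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != "." and board[a] == board[b] == board[c]:
            return board[a]
    return None


@lru_cache(maxsize=None)
def solve(board, player):
    """Game value for X under perfect play: +1 X wins, 0 draw, -1 O wins."""
    won = winner(board)
    if won == "X":
        return 1
    if won == "O":
        return -1
    if "." not in board:
        return 0
    other = "O" if player == "X" else "X"
    values = [solve(board[:i] + player + board[i + 1:], other)
              for i, cell in enumerate(board) if cell == "."]
    return max(values) if player == "X" else min(values)


if __name__ == "__main__":
    print(solve("." * 9, "X"))
```

The memoized table built by `solve` is the "manual": from any position it tells you which moves keep the best achievable result, so there is nothing left for a learning system to discover.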

It will be interesting to see if the AI can be adapted to work with a broader range of problems.

Laurie K · Dec 10, 2017

mfb said:

Which other games do you expect next?
...
It will be interesting to see if the AI can be adapted to work with a broader range of problems.

Would they be anything like the fun and games you get on a stock market or currency exchange?

Could the AI ever win a game if the game was rigged in the interests of others not being the AI and the AI was not allowed to tally the score and operate off its own internal calculations because that would be illegal?

Ryan_m_b · Dec 10, 2017

mfb said:

Which other games do you expect next?
Chess and Go have no randomness and no hidden information - all players always know the full state of the game. Many card games have hidden information (cards not shown to everyone) and randomness, for example. There is a Poker-AI beating humans but that is a different topic.

Earlier on in the year an AI beat champion players in a limited DOTA game:
https://www.google.co.uk/amp/s/arst...t-takes-on-the-pros-at-dota-2-and-wins/?amp=1

The game doesn’t have the perfect information that chess/go have (you can only see other players in your vicinity). It will be interesting to see if this AI can graduate to play proper games. At the moment it’s limited to playing as one specific character, against one specific character in solo matches (rather than the usual mixed 5v5).

Haelfix · Dec 10, 2017

I gather this was done with some form of an RNN.

What's funny about this business is that the general algorithms and structures have been known for well over 30 years. Some tweaking is needed for the particular game of course, but why has it taken this long for results like this to all of a sudden show up? It just seems like a completely obvious thing to try with chess, so I doubt this team were the first to try this approach.

Like everyone else, I started dabbling with ML about 5-6 years ago for a Kaggle competition, but it seems like almost all the big results have been occurring recently.

So is it a question of computation power and storage capacity? That seems partially true, but also pretty odd. Certainly different games have vastly different state counting so you would expect results to be more spread out in time.

On the other hand, could it be that there are inflection points within the search strategies, where say past a certain number of (layers, iterations, etc) convergence properties are substantially altered?

girts · Dec 10, 2017

Very interesting topic. I want to share my two cents; please don't take me as ignorant, as I would really like to know much more about this than I probably do at the moment. Correct me as necessary, but I fail to see the intellect part in this AI. It is definitely artificial, and it has some strong capabilities in some areas, that's for sure, but is it intelligence?
The way I see it, intelligence is not maximum capability in specific logic strategy tasks (that's essentially a computer). I see intellect as the capability to learn a fixed rule set, then see the problem within that rule set and come up with a solution that is totally different, not within the rule set and not even within all the possible outcomes of the rules. Because if we are talking about finding a solution to a specific problem within a fixed set of rules, then isn't that just a matter of time? For example, chess has fixed rules and a fixed number of possible moves and outcomes, and I assume the reason why a human can't beat a computer, and lately AlphaZero, is that the computer is X times faster in its capability to process all possible strategies from every move made by either itself or its opponent.
So, other than this, the other fact that seems so novel about this news is that AlphaZero learned how to play only from the rules of the games (Go, chess), so isn't this also a purely deterministic solution? I'm thinking that, knowing the possible moves and the rules that govern them, it is only a matter of time and trial and error to come up with all the possible outcomes, both winning and losing ones. I can imagine how such an approach and device could help solve mathematical and scientific problems, which is great, and useful if one already knows the necessary inputs or at least some of them.

An example of intellect comes to mind: the situation Albert Einstein was in when he conceived the theory of relativity. He had no physical examples of the theory and no way of proving it with experiments back in the 1900s, but it proved to be correct.
Now could an AI come up with a correct explanation or an unknown physical law that would explain some of the mysteries in science, like the inside of a black hole, dark matter, etc., if it was given only a partial set of rules, as we arguably don't know all of the laws and rules of this universe as of this moment?
To me it seems chess, and even learning chess, is different in this regard, as you already know the full picture, so figuring out the winning strategy becomes a matter of time, processing approach and power. But how does one figure out something that is not known and cannot be explained or arrived at with the existing laws/rules?

Pretty much all of physics history was learning the unknown while simply experimenting based on what we knew so far, so trial and error or educated guesses. So if we were to build a real AI, based on the definition of it right now and with our current level of knowledge and understanding of the universe, could such an AI find the answers to the very things we don't know so far, and if so, from what inputs or in what ways would it do that? I apologize if this is a bit off topic, I'm just curious.

phyzguy · Dec 10, 2017

Delta² said:

No, I am not saying that they lie on how they trained the program, but I don't know if they got some aid from GMs regarding the development of the source code of the program. Conventional chess program developers often cooperate with GMs, and that reflects on the source code (mainly the source code regarding its evaluation function) of the conventional chess program. I thought that AlphaZero developers may also have done the same thing. (In simple words, a GM can tell a programmer how he/she thinks when playing chess and the programmer can somehow incorporate this info into the source code of the program.)

I don't think you understand how these "deep learning" machines work. They are very different from conventional chess playing machines. There is no "evaluation function" programmed into the machine. It builds up its own evaluation of the best move in the course of training. The only value judgement programmed in is the value of winning or losing, which is stated in the paper: -1 for a loss, 0 for a draw, +1 for a win. In the course of playing several hundred thousand games, the synaptic weights of the neural net are adjusted by the machine itself to increase the probability of winning. The only chess specific information programmed in is the size of the board, how each piece can move, and what constitutes a win/loss/draw.
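
To make the "only win/loss/draw is programmed in" point concrete, here is a minimal sketch of how a finished self-play game could be turned into training targets. The function names (`value_targets`, `nudge`) and the learning rate are hypothetical, chosen for illustration; a real system adjusts millions of network weights rather than a single number, so this only shows the shape of the idea.

```python
from typing import List

# The only value judgement supplied by hand: the score of a finished game.
WIN, DRAW, LOSS = 1.0, 0.0, -1.0


def value_targets(num_plies: int, result_for_first_player: float) -> List[float]:
    """Label every position of one finished game with the final result,
    seen from the side to move at that position."""
    targets = []
    for ply in range(num_plies):
        first_player_to_move = (ply % 2 == 0)
        if first_player_to_move:
            targets.append(result_for_first_player)
        else:
            # The second player sees the same result with the sign flipped.
            targets.append(-result_for_first_player)
    return targets


def nudge(prediction: float, target: float, learning_rate: float = 0.1) -> float:
    """One gradient step on the squared error (prediction - target)**2 / 2."""
    return prediction - learning_rate * (prediction - target)


# A four-ply game won by the first player.
print(value_targets(4, WIN))
```

Nothing in this sketch says why a position is good or bad; the only signal is the final result, propagated back to every position in the game.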

phyzguy · Dec 10, 2017

Haelfix said:

I gather this was done with some form of an RNN.
What's funny about this business is that the general algorithms and structures have been known for well over 30 years. Some tweaking is needed for the particular game of course, but why has it taken this long for results like this to all of a sudden show up? It just seems like a completely obvious thing to try with chess, so I doubt this team was the first to try this approach.

My understanding is that the recent explosion in successful applications of neural networks is due to improvements in methods for adjusting the synaptic weights. The use of "deep neural nets", which have many hidden layers of neurons between the input and the output, was prohibitive in the past because there was no known method to adjust the weights in a reasonable length of time. New techniques, in particular the use of Restricted Boltzmann Machines, included improved algorithms for training the network.

Haelfix · Dec 10, 2017

I think that's definitely partially true; there have been some algorithmic changes. However, even very simple convolutional neural networks (and other feedforward NNs) with simple backpropagation using gradient descent are now being utilized extremely successfully in applications like facial recognition. My laptop PC with a GPU card is able to achieve accuracy that was unheard of even 10 years ago. So it just seems surprising that everything seems to be happening at once.
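
For readers who have not seen it, the "simple gradient descent" mentioned here can be sketched in a few lines. This is a toy, not a convolutional network: a single sigmoid neuron learning logical AND, which is the one-neuron special case of backpropagation. The data set, step count and learning rate are assumptions picked for illustration.

```python
import math

# Truth table for logical AND: (inputs, target).
DATA = [((0.0, 0.0), 0.0), ((0.0, 1.0), 0.0), ((1.0, 0.0), 0.0), ((1.0, 1.0), 1.0)]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, inputs) -> float:
    """Output of one sigmoid neuron."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)


def train(steps: int = 2000, learning_rate: float = 1.0):
    """Full-batch gradient descent on the mean cross-entropy loss."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(steps):
        grad_w, grad_b = [0.0, 0.0], 0.0
        for inputs, target in DATA:
            # For sigmoid + cross-entropy, d(loss)/d(logit) = prediction - target.
            error = predict(weights, bias, inputs) - target
            for i, x in enumerate(inputs):
                grad_w[i] += error * x / len(DATA)
            grad_b += error / len(DATA)
        weights = [w - learning_rate * g for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b
    return weights, bias


weights, bias = train()
for inputs, target in DATA:
    print(inputs, target, round(predict(weights, bias, inputs), 3))
```

The same update rule, applied layer by layer via the chain rule, is what trains the much larger networks discussed in this thread.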

PAllen · Dec 10, 2017

I think some distinctions in the field called AI are worth making:

1) There is a long track record of success in neural network training - where people provide training data and guide the training (to varying extents). AlphaGo Master that beat Lee Sedol and (with further refinement and training) Ke Jie (who is generally considered the strongest living Go player), was a result in this category. It was remarkable, but only in the sense that people had tried this with Go without any success comparable to this, and the team itself expected this achievement to take much longer (perhaps 10 years, according to some team members). These techniques have been used for both closed, complete information problems, as well as a number of incomplete information problems or partly open problems.

2) Machine (self) learning is what is explored in this new project, which has minimal precedent (that I am familiar with). This is having a neural network train itself with no human provided data or guidance. The technology developed by the AlphaZero team is at present fundamentally limited to finite actions possible, with finite rules for generating them, and a score that can be represented as a real number whose expectation can be maximized. (Note, for chess and Shogi, the score values were -1, 0, 1, for Go they were 0, 1, but the framework was explicitly designed to allow scores like 2.567, if there were a problem with such characteristics). It also seems required that the sequence of actions before which a scoring can occur can't be too long (for practical reasons of computation limits, even given the large computational power available during self training). There are no other limitations or specializations. This necessitated an artificial rule added to chess, for the self training phase (beyond the 50 move rule and 3 fold repetition that in principle terminate any game in finite time). It is still possible (especially with machines) to have 1000 move games without terminating per any of the official rules (49 moves, capture or pawn move, 49 moves, capture or pawn move, etc). The group was concerned with these rat holes eating up too much processing time, so they added a rule that games over some threshold length were scored as draws (the paper does not specify what value they chose for this cutoff). This strikes me as a risky but presumably necessary step due to system limitations. They specifically did NOT want to address this by adding adjudicating rules, because these would have to involve chess knowledge. Particularly intriguing to me, looking at black vs. white results, is that AlphaZero seems to have evolved a meta rule on its own to play it safe with black, and take more risks with white. This is the practice of the majority of top grandmasters.
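
The length cutoff described in point 2 can be sketched as a wrapper around the game loop. This is only an illustration: `play_capped`, the `step` callback and the value 512 are all hypothetical, since the actual threshold is not given.

```python
from typing import Callable, Optional


def play_capped(step: Callable[[int], Optional[float]], max_plies: int) -> float:
    """Run one game. `step(ply)` plays one ply and returns the final score
    if the game just ended, or None if play continues. Games that reach
    `max_plies` without a result are scored as a draw (0.0)."""
    for ply in range(max_plies):
        result = step(ply)
        if result is not None:
            return result
    return 0.0  # cutoff reached: no chess knowledge needed, just call it a draw


# A game that never terminates on its own is scored as a draw at the cap.
print(play_capped(lambda ply: None, max_plies=512))

# A game decided at ply 10 keeps its real result.
print(play_capped(lambda ply: 1.0 if ply == 10 else None, max_plies=512))
```

The design point is that the cap needs no evaluation of the position, which is why it does not smuggle in chess knowledge the way adjudication rules would.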

3) It seems that except possibly for the core neural network itself, huge changes and breakthroughs would be needed to apply their (self learning, with no training data) system to incomplete information, open type problem areas. Further, there is no sense in which it is an AI. This is not pejorative. The whole AI field is named after a hypothetical future goal which no existing project is really directly working on (because no one knows how, effectively). It is silly to judge AlphaZero against this goal, because that is not remotely what it was trying to achieve.

Delta2 · Dec 10, 2017

I found some additional info: Stockfish 8 was allowed to use only up to 1 GB of RAM for its hash table, and that, together with the 1-minute-per-move time control imposed, ruins to some extent the effective use of 64 cores by Stockfish 8.

How much RAM and HDD storage was AlphaZero using? I couldn't find info on that; it could have been hundreds of GB to store all those neural network synaptic weights...

jerromyjon · Dec 10, 2017

While I find this quite interesting, it also seems quite "unstructured". What I mean is that I assume there are no constraints on repetition, and I wonder how replaying the same series of identical moves affects the weighting of "good play" when there are random repetitions. I would think having some type of iteration scheme to allow it to play through all possible games would give it the power to determine the best possible move out of all options but I don't know if that is out of the range of possible in a finite time.

PAllen said:

It seems that except possibly for the core neural network itself, huge changes and breakthroughs would be needed to apply their (self learning, with no training data) system to incomplete information, open type problem areas

I often wonder if this type of system could be applied to mathematics, giving it basic rules of math and scoring according to deriving known mathematical complexities. It seems pretty simple to a layman like me, but perhaps I'm missing something obvious, as I'm not much of a mathematician.

Ryan_m_b · Dec 10, 2017

jerromyjon said:

While I find this quite interesting, it also seems quite "unstructured". What I mean is that I assume there are no constraints on repetition, and I wonder how replaying the same series of identical moves affects the weighting of "good play" when there are random repetitions. I would think having some type of iteration scheme to allow it to play through all possible games would give it the power to determine the best possible move out of all options but I don't know if that is out of the range of possible in a finite time.

For chess it certainly isn’t possible as the number of legitimate games is often compared to the number of atoms in the universe.

PAllen · Dec 10, 2017

Ryan_m_b said:

For chess it certainly isn’t possible as the number of legitimate games is often compared to the number of atoms in the universe.

Many many times greater than the number of atoms in the observable universe.
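
For a rough sense of scale, here is a back-of-envelope version of the comparison. The inputs are assumptions brought in for illustration: Shannon's classic estimate of about 10^3 possibilities per pair of moves over about 40 move pairs, and the commonly quoted order-of-magnitude figure of 10^80 atoms in the observable universe.

```python
# Shannon-style estimate: ~10^3 continuations per move pair (one White move
# plus one Black reply), ~40 move pairs in a typical game.
options_per_move_pair = 10**3
move_pairs_per_game = 40
estimated_games = options_per_move_pair ** move_pairs_per_game

# Commonly quoted order of magnitude for atoms in the observable universe.
atoms_in_observable_universe = 10**80

# Number of decimal digits minus one gives the power of ten.
games_exponent = len(str(estimated_games)) - 1
ratio_exponent = len(str(estimated_games // atoms_in_observable_universe)) - 1

print(f"estimated games ~ 10^{games_exponent}")
print(f"games per atom  ~ 10^{ratio_exponent}")
```

Even on these crude numbers there are something like 10^40 games for every atom, so iterating through all possible games is not an option.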

jerromyjon · Dec 10, 2017

PAllen said:

Many many times greater than the number of atoms in the observable universe.

Okay, now I agree with the title of the post, "mind boggling"!

phyzguy · Dec 10, 2017

jerromyjon said:

While I find this quite interesting, it also seems quite "unstructured". What I mean is that I assume there are no constraints on repetition, and I wonder how replaying the same series of identical moves affects the weighting of "good play" when there are random repetitions. I would think having some type of iteration scheme to allow it to play through all possible games would give it the power to determine the best possible move out of all options but I don't know if that is out of the range of possible in a finite time.

In addition to the fact that there are a huge number of possible games, so there is no way to iterate through all of the possibilities, note that it wasn't playing "random repetitions". It was playing against itself, so as it learned and got better it was playing against a stronger and stronger opponent. So the games it was learning from were far from randomly selected.

jerromyjon · Dec 10, 2017

phyzguy said:

So the games it was learning from were far from randomly selected.

Very good point, I didn't think about it that way. Thanks.
