Mind-boggling machine learning results from AlphaZero

In summary, the conversation discusses a groundbreaking achievement in the field of machine learning: a self-learning algorithm that mastered the game of chess within 24 hours, surpassing every existing chess program despite running on slower hardware, and that showed equally impressive results when learning other games such as Go and Shogi. The conversation also touches on the potential impact of this advance in AI and on the debate surrounding the possibility of AI surpassing human intelligence; the recent successes of machine learning in games like Go and chess challenge the belief that AI's impact on society will be minimal. The conversation ends with a discussion of whether there is a "Moore's law" for AI and the need to come up with a metric by which to measure such progress.
  • #1
PAllen
Science Advisor
I have always been on the skeptical, but not dismissive, end of judgments about achievements and the rate of progress in this field. However, the following (please read through carefully) just completely blows my mind:

https://en.chessbase.com/post/the-future-is-here-alphazero-learns-chess

A self-learning algorithm with no prior knowledge or database of games, starting only from the rules of chess, became within 24 hours much better than any other existing chess program, despite running on hardware 900 times slower!
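To make "learning only from the rules" concrete, here is a minimal sketch of the idea in Python - emphatically not AlphaZero itself (which uses a deep network plus Monte Carlo tree search), just tabular self-play value learning on tic-tac-toe. The agent starts knowing only the move generator and the win condition, and improves purely by playing itself:

```python
import random

# Self-play value learning on tic-tac-toe: a toy stand-in for the
# "rules only" idea. The agent knows only the legal moves and the
# win condition, and learns state values by playing against itself.
LINES = [(0,1,2),(3,4,5),(6,7,8),(0,3,6),(1,4,7),(2,5,8),(0,4,8),(2,4,6)]

def winner(b):
    for i, j, k in LINES:
        if b[i] != "." and b[i] == b[j] == b[k]:
            return b[i]
    return "draw" if "." not in b else None

V = {}                 # board string -> estimated value from X's perspective
ALPHA, EPS = 0.3, 0.1  # learning rate, exploration rate

def val(b):
    w = winner(b)
    if w == "X":    return 1.0
    if w == "O":    return 0.0
    if w == "draw": return 0.5
    return V.get("".join(b), 0.5)

def choose(b, player):
    moves = [i for i in range(9) if b[i] == "."]
    if random.random() < EPS:                    # explore occasionally
        return random.choice(moves)
    best = max if player == "X" else min         # X maximizes, O minimizes
    return best(moves, key=lambda m: val(b[:m] + [player] + b[m+1:]))

def self_play_game():
    b, player, states = ["."] * 9, "X", []
    while winner(b) is None:
        b[choose(b, player)] = player
        states.append(b[:])
        player = "O" if player == "X" else "X"
    for s, s_next in zip(states, states[1:]):    # TD(0) backup along the game
        V["".join(s)] = val(s) + ALPHA * (val(s_next) - val(s))

for _ in range(20_000):
    self_play_game()
print(f"learned values for {len(V)} states")
```

After a few thousand self-play games, greedy play from this table is already hard to beat. AlphaZero replaces the lookup table with a deep network and the one-ply lookahead with Monte Carlo tree search, but the self-play principle is the same.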
 
  • Like
Likes ISamson, mfb, Andy Resnick and 1 other person
  • #2
256bits
Well, there is the link "Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm", PDF, under the heading A new paradigm (a third of the way down).
If I read Table S3 correctly, it took 44 million training games to learn chess, and 21 million to learn Go and Shogi, to become the best at winning.
Humans hardly play that many games with their slow processing grey matter.
  • Like
Likes Delta2
  • #3
256bits said:
Well, there is the link "Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm", PDF, under the heading A new paradigm (a third of the way down).
If I read Table S3 correctly, it took 44 million training games to learn chess, and 21 million to learn Go and Shogi, to become the best at winning.
Humans hardly play that many games with their slow processing grey matter.
True, but what shocked me was the leap beyond any prior results in machine learning that I’m aware of.
 
  • #4
PAllen said:
True, but what shocked me was the leap beyond any prior results in machine learning that I’m aware of.
It would be interesting if some kind of Moore's law could be applied to the evolution of smarter machines, not just from faster processing power, but from the use of better algorithms. The people and teams that work on these systems seem to be brainiacs themselves, putting it all together and making it work.
 
  • #5
It is perhaps worth clarifying that this is an amazing result in machine learning in a closed domain (fixed rules, fixed definition of value). It does not address an open domain, or AI per se, at all.
 
Last edited:
  • Like
Likes scottdave, Mentz114, GTOM and 2 others
  • #6
I have been playing chess for a long time, so I can see the long-run value for us humans regarding the game itself, but clearly this level of play is beyond our reach to follow.
From an AI perspective it is very impressive, but of more value is (quoting from the article):

This completely open-ended AI able to learn from the least amount of information and take this to levels hitherto never imagined is not a threat to ‘beat’ us at any number of activities, it is a promise to analyze problems such as disease, famine, and other problems in ways that might conceivably lead to genuine solutions.

And once again, the basic principle taught in intro CS courses is popping up: advances in hardware cannot outperform a very efficient algorithm (a fortiori one combined with some sort of neural network and its respective processes).
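As a quick toy illustration of that principle (my own sketch, not from the article): binary search, O(log n), versus a linear scan, O(n), over the same sorted list. No plausible constant-factor hardware speedup closes a gap that grows with n:

```python
import bisect
import time

data = list(range(1_000_000))             # sorted input
targets = range(0, 1_000_000, 10_000)     # 100 membership queries

start = time.perf_counter()
for t in targets:
    _ = t in data                          # linear scan: O(n) per query
linear = time.perf_counter() - start

start = time.perf_counter()
for t in targets:
    _ = bisect.bisect_left(data, t)        # binary search: O(log n) per query
binary = time.perf_counter() - start

print(f"linear: {linear:.3f}s  binary: {binary:.6f}s")
```

On a typical machine the binary search wins by several orders of magnitude, and the ratio keeps growing with the size of the input - exactly the sense in which hardware cannot rescue a worse algorithm.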
 
Last edited:
  • #7
If you have been following any of the other threads on AI with regard to its impact on society, you may have noticed a distinct group who believe that AI's impact, and in particular its chances of rivaling human intelligence, will be nil even in the next century. In particular, they argue that our understanding of cognitive processes and the state of the art of computer systems do not support the idea of a super general intelligence that might overcome human intelligence.

The success of machine learning in Go, Dota 2 (a video game), and now in chess, all of which have occurred recently, should give those doubters some pause to reconsider their opinions. In particular, consider the rate of improvement in performance. The article in the OP noted predictions that it would take a decade for AI to compete against humans in Go. It took three years. In another year it defeated the world champion 3-0. Now AlphaGo Zero has beaten that system 100-0, with 1/12 the processors of the original system.

Both in Go and in chess, the AlphaGo Zero system has "found" new strategies/moves not heretofore identified, surprising the developers. In Go, for example, a new strategy was adopted by the world champ, and he went on a 22-game winning streak against other players.

Is there a Moore's law for AI? Of course we will have to come up with a metric to determine it. But there seems to be significant progress in AI in recent years. Note that AlphaZero is a factor of 900 slower than other chess systems but still outperforms them. The developers seem to keep in mind the motto "work smarter, not harder".
 
  • Like
Likes PAllen
  • #9
gleem said:
If you have been following any of the other threads on AI with regard to its impact on society, you may have noticed a distinct group who believe that AI's impact, and in particular its chances of rivaling human intelligence, will be nil even in the next century. In particular, they argue that our understanding of cognitive processes and the state of the art of computer systems do not support the idea of a super general intelligence that might overcome human intelligence.

The success of machine learning in Go, Dota 2 (a video game), and now in chess, all of which have occurred recently, should give those doubters some pause to reconsider their opinions. In particular, consider the rate of improvement in performance. The article in the OP noted predictions that it would take a decade for AI to compete against humans in Go. It took three years. In another year it defeated the world champion 3-0. Now AlphaGo Zero has beaten that system 100-0, with 1/12 the processors of the original system.

Both in Go and in chess, the AlphaGo Zero system has "found" new strategies/moves not heretofore identified, surprising the developers. In Go, for example, a new strategy was adopted by the world champ, and he went on a 22-game winning streak against other players.

Is there a Moore's law for AI? Of course we will have to come up with a metric to determine it. But there seems to be significant progress in AI in recent years. Note that AlphaZero is a factor of 900 slower than other chess systems but still outperforms them. The developers seem to keep in mind the motto "work smarter, not harder".

I was an average chess player at about 1800-1900 Elo. In terms of chess playing, computers were better than me almost from the outset. The most remarkable thing to me is the standard to which some humans can play chess. The idea of having any chance against a modern computer is absurd.

If chess playing or go playing is a measure of intelligence, then computers have been more intelligent than me for decades.

And, am I really that much less intelligent than Magnus Carlsen? It seems to me that he is more like a machine than a human in terms of his chess playing ability.

That said, AlphaZero's approach to learning chess is remarkable. What it did to Stockfish in some of those games was beautiful.
 
  • #10
PeroK said:
I was an average chess player at about 1800-1900 Elo. In terms of chess playing, computers were better than me almost from the outset. The most remarkable thing to me is the standard to which some humans can play chess. The idea of having any chance against a modern computer is absurd.

If chess playing or go playing is a measure of intelligence, then computers have been more intelligent than me for decades.

And, am I really that much less intelligent than Magnus Carlsen? It seems to me that he is more like a machine than a human in terms of his chess playing ability.

That said, AlphaZero's approach to learning chess is remarkable. What it did to Stockfish in some of those games was beautiful.
Well, I never really enjoyed playing chess that much; it's just a matter of rote memorizing all the correct combinations.

Playing football or basketball is a lot more fun, more spontaneous.
Either way, you need to practice a lot to master something, much as the machines had to play so many games.
 
  • #11
BTW, when I first heard Edward Witten talk, I thought he was a robot.

He has that voice pattern... :-D

Reminds me of Blade Runner.
 
  • #12
Certainly fantastic results from the collaborators at DeepMind. The next step is going to be a major hurdle though--we'll have to extend it to model-free control. The Test of Time award winner at NIPS, Ali Rahimi, gave a great talk about the ever more essential need for greater scientific rigor and the development of useful theoretical foundations in machine learning, such that we might gain insights into problems in the field that currently seem almost intractable (e.g. high dimensionality). For example, one thing he mentioned was how we use batch normalization to accelerate gradient descent, and how this is "explained" by saying batch norm "reduces internal covariate shift." The problem is, we really haven't a clue as to *why* reducing internal covariate shift speeds up gradient descent. Essentially, he wants us to move away from the current unguided engineering approach (just training our models conventionally, trying to reduce error rate as much as possible, and ultimately taking an empirical 'be creative and see what sticks' attitude) and towards a culture of, and approach to, research more similar to biology's.
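For readers who haven't met it, here is a minimal sketch of the batch normalization step itself (forward pass only, in plain NumPy). What it computes is uncontested; Rahimi's point is that *why* it speeds up training is not actually understood:

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """x: (batch, features). Normalize each feature over the mini-batch,
    then rescale/shift with the learnable parameters gamma and beta."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)   # zero mean, unit variance
    return gamma * x_hat + beta

rng = np.random.default_rng(0)
x = rng.normal(5.0, 3.0, size=(32, 4))        # a shifted, stretched mini-batch
y = batch_norm(x, gamma=np.ones(4), beta=np.zeros(4))
print(y.mean(axis=0).round(6), y.std(axis=0).round(3))  # ~0 and ~1 per feature
```

In a real network this sits between layers, with gamma and beta trained by backpropagation; the "reduces internal covariate shift" story is the after-the-fact explanation Rahimi was questioning.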

Interestingly, this has spawned a debate in the community. Prof. Yann LeCun came out and officially disagreed with Rahimi, stating that the state of the art so far has been developed through the clever engineering of researchers (he took offense at what Rahimi called it: "alchemy"), and that practical technology almost always precedes the theory developed to fully explain it (e.g. Watt's steam engine came before the Carnot cycle--you can't criticize Watt for not being Carnot, basically). I think the essence of the debate is whether the current culture of research will be beneficial or detrimental for the field going forward. Rahimi seems to think it is becoming detrimental, both because it is still a challenge to deliver ML pedagogically to students without any anchoring theoretical foundations (you have to "get good" at machine learning by developing a lot of experience and intuition while training models, without being able to anchor back to any real first principles), and because of the need for proper explanations of the workings of our ML systems in life-or-death situations involving humans (e.g. autonomous driving, cancer pathology). Prof. LeCun thinks that this way of doing things is just fine, and that Rahimi shouldn't needlessly criticize the work done so far but instead go and do the theory work, to which Rahimi replied that the talk he gave was basically a public plea for help.

I see both sides' arguments, and it's been very interesting to follow the discussion so far. Either way, I'm excited to see where we'll be in 5 or 10 years, when we'll have seriously improved and expanded datasets for model training, combined with superior hardware like MIT's programmable nanophotonic processor, or with the phase-change memory work of IBM researchers enabling massively parallel computing systems, which would be great for ML. Maybe by then we'll have progressed a good bit on developing theoretical foundations in ML.
 
Last edited:
  • #13
PeroK said:
And, am I really that much less intelligent than Magnus Carlsen?
 
  • Like
Likes AlexCaledin and jerromyjon
  • #14
Greg Bernhardt said:

To contrast with this video, Magnus is a big fan of Monty Python, an interest shared with Viswanathan Anand, and there are videos of them doing skits together. He is also a fan of Donald Duck - apparently a Norway thing, as Donald Duck is quite popular there.
 
Last edited:
  • #15
As for blindfold play, this is a specialty with the current record being 48 simultaneous blindfold games:

https://www.chess.com/news/view/timur-gareyev-plays-blindfold-on-48-boards-5729

(In the above video, Magnus plays 10 simultaneous blindfold games, the most he has ever done. The record that stood for decades was Miguel Najdorf's 45 simultaneous blindfold games. Of interest is that one player in Timur's exhibition had also played in Najdorf's. Even more remarkable is that Najdorf's exhibition was motivated in part by trying to let his family in Nazi-occupied Poland know that he was alive and OK. He had fled to Argentina and there was no normal means of communication. He figured, correctly, that his feat would be covered even there and his family would see it. Postwar, it was verified that his idea had worked.)
 
Last edited:
  • Like
Likes QuantumQuest
  • #16
Well, to be honest I have my doubts, though it seems astonishing to beat the top engines, which are based solely on search and evaluation algorithms and rely on raw NPS computational power (nodes, i.e. positions, searched and evaluated per second).

Still, reading the article, it says that the engine as white prefers to play the English opening (1.c4 ...) or the Queen's Gambit openings (1.d4 d5 2.c4 ...).

According to Bobby Fischer, one of the top players in all the history of chess, the best first move for white is 1.e4.
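For anyone unfamiliar with the NPS figure mentioned above: a conventional engine's speed is just how many positions ("nodes") its search visits and evaluates per second. Here is a minimal sketch of such node-counting search, on a hand-built toy tree rather than real chess:

```python
# Negamax search that counts every node it visits; an engine's NPS is this
# counter divided by elapsed time. A leaf is its evaluation for the side to
# move; an inner node is the list of positions reachable in one move.
nodes = 0

def negamax(tree):
    global nodes
    nodes += 1
    if isinstance(tree, (int, float)):    # leaf: static evaluation
        return tree
    # our best score is the opponent's worst score, sign-flipped
    return max(-negamax(child) for child in tree)

toy_tree = [[3, -2, [1, 4]], [-1, [0, -3]], [2, 5]]
print(f"score {negamax(toy_tree)}, searched {nodes} nodes")
```

Real engines add alpha-beta pruning and a hand-tuned evaluation at the leaves; AlphaZero reportedly searches orders of magnitude fewer positions per second than Stockfish and compensates with a far better, learned evaluation.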
 
  • #17
For some reason I trust the preference of an algorithm that beats all algorithms that consistently beat all humans more than the preference of a human. e4 might be a good move against humans, but apparently not against much stronger opponents.
 
  • #18
Delta² said:
Well, to be honest I have my doubts, though it seems astonishing to beat the top engines, which are based solely on search and evaluation algorithms and rely on raw NPS computational power (nodes, i.e. positions, searched and evaluated per second).

Still, reading the article, it says that the engine as white prefers to play the English opening (1.c4 ...) or the Queen's Gambit openings (1.d4 d5 2.c4 ...).

According to Bobby Fischer, one of the top players in all the history of chess, the best first move for white is 1.e4.
Fischer said that, but in his critical match against Spassky in 1972, he played the English and the QGD frequently as white. In fact, his first win as white in that match was an English that transposed to a QGD.
 
  • #19
mfb said:
For some reason I trust the preference of an algorithm that beats all algorithms that consistently beat all humans more than the preference of a human. e4 might be a good move against humans, but apparently not against much stronger opponents.

I don't think it's good to trust a computer program more than a top human GM (grandmaster).

The problem for humans against engines is that humans (even strong GMs) lag behind in the tactical processing of a given board position. However, humans are better at the positional processing of a position.

Since time is a factor (both players start with limited time) and humans spend a lot of it calculating the tactical complexities of a position (even a GM might spend 5 minutes on something an engine can process in 0.005 minutes, for example spotting an elegant queen sacrifice that leads to a forced mate in 5), that's the main reason humans are getting beaten by engines.

If we allow a hybrid of a human GM plus a typical engine that analyses positions, so that the GM sees the engine's analysis for the various possible moves he has in mind, then I believe this hybrid can beat any engine (Stockfish or Rybka, or even AlphaZero, or whatever). Or, if we allow very long (slow) time controls, so that the human has a lot of time to think through the tactical complexities of a position, then I believe a human GM has the advantage over any kind of engine.
 
  • Like
Likes Tosh5457, QuantumQuest and GTOM
  • #20
Delta² said:
I don't think it's good to trust a computer program more than a top human GM (grandmaster).

The problem for humans against engines is that humans (even strong GMs) lag behind in the tactical processing of a given board position. However, humans are better at the positional processing of a position.

Since time is a factor (both players start with limited time) and humans spend a lot of it calculating the tactical complexities of a position (even a GM might spend 5 minutes on something an engine can process in 0.005 minutes, for example spotting an elegant queen sacrifice that leads to a forced mate in 5), that's the main reason humans are getting beaten by engines.

If we allow a hybrid of a human GM plus a typical engine that analyses positions, so that the GM sees the engine's analysis for the various possible moves he has in mind, then I believe this hybrid can beat any engine (Stockfish or Rybka, or even AlphaZero, or whatever). Or, if we allow very long (slow) time controls, so that the human has a lot of time to think through the tactical complexities of a position, then I believe a human GM has the advantage over any kind of engine.
It is true that centaurs (human + computer) beat a computer alone, for current commercial engines. A human alone with long time control no longer does. It would certainly be interesting to try human plus Stockfish or Houdini against AlphaZero. I am not so sure the centaur would win. Some top grandmasters have likened AlphaZero's play to a perfected Karpov, i.e. immense positional understanding.
 
  • Like
Likes Delta2
  • #21
PAllen said:
It is true that centaurs (human + computer) beat a computer alone, for current commercial engines. A human alone with long time control no longer does. It would certainly be interesting to try human plus Stockfish or Houdini against AlphaZero. I am not so sure the centaur would win. Some top grandmasters have likened AlphaZero's play to a perfected Karpov, i.e. immense positional understanding.
Has AlphaZero played any games against strong GMs or against centaurs?
 
  • #22
Delta² said:
Has AlphaZero played any games against strong GMs or against centaurs?
No. And I suspect it won't happen. The purpose of this exercise, for the DeepMind team, was to validate a general learn-from-scratch network on three radically different complex rule sets (chess, shogi, and go). My guess is that we won't hear from them for a while as they try to make inroads into open domains without fixed rule sets.
 
  • #23
That would be interesting to see. As you say, the transitive property doesn't generally hold in sports (and chess can be seen as a mental sport); in other words, it is not certain that, just because AlphaZero beats conventional engines and engines beat human GMs, AlphaZero would beat a strong human GM.
 
  • #24
AlphaZero plus human might have an advantage over AlphaZero alone (I'm not even sure about that - the communication time might be worse than just letting AlphaZero move), but I don't see how humans plus a different existing program would come close to AlphaZero. Whatever the other program would give to a human to evaluate, AlphaZero would evaluate on its own. And we know AlphaZero can evaluate situations much better than humans, given the same time - that's how it decides what to do, and it can do that extraordinarily well. That takes a factor 2 in processing time - so what? It is so far above other programs that a factor 2 in processing time doesn't make a difference.

We are not talking about a program that is a bit better than other programs. We are talking about a program that did not lose a single game out of 100, against a computer program so strong that you need several iterations of "x has no chance against y" until you reach the level of humans.
 
  • Like
Likes QuantumQuest and PeroK
  • #25
mfb said:
AlphaZero plus human might have an advantage over AlphaZero alone (I'm not even sure about that - the communication time might be worse than just letting AlphaZero move), but I don't see how humans plus a different existing program would come close to AlphaZero. Whatever the other program would give to a human to evaluate, AlphaZero would evaluate on its own. And we know AlphaZero can evaluate situations much better than humans, given the same time - that's how it decides what to do, and it can do that extraordinarily well. That takes a factor 2 in processing time - so what? It is so far above other programs that a factor 2 in processing time doesn't make a difference.

We are not talking about a program that is a bit better than other programs. We are talking about a program that did not lose a single game out of 100, against a computer program so strong that you need several iterations of "x has no chance against y" until you reach the level of humans.

I have my doubts about that, but you might be right. We'll just have to wait for some games of AlphaZero against human GMs and centaurs (with no AlphaZero program in their aid).
 
  • Like
Likes GTOM
  • #26
PAllen said:
It is true that centaurs (human + computer) beat a computer alone, for current commercial engines. A human alone with long time control no longer does. It would certainly be interesting to try human plus Stockfish or Houdini against AlphaZero. I am not so sure the centaur would win. Some top grandmasters have likened AlphaZero's play to a perfected Karpov, i.e. immense positional understanding.

What would be of particular interest would be to see the system that beat Go champion Ke Jie 3-0 collaborate with him against the current AlphaZero over a series of matches. I would pay money to see that live.

EDIT: Though yes, AlphaZero may in fact completely crush that older AlphaGo system/Ke Jie team. I would just like to confirm it, since many have been claiming that humans and machines working together, almost in a team-like manner, is how our relationship with advancing AI will continue to be. If I'm remembering correctly, human/machine teams were beating the sole AlphaGo agent, and it was suspected the humans brought something integral to the match that the collaboration benefited from.

And yet, I think we'll perhaps soon begin to see that this sort of thinking is a kind of illusion. There is real uncertainty here, at least with respect to the ultimate horizon of possible play (especially in Go). It may be that machine learning agents begin to play at a level beyond the comprehension of humans, in the sense that they'll make moves that can't really be understood, at least not in real time, probably not even by the world's best players.
 
Last edited:
  • #27
256bits said:
It would be interesting if some kind of Moore's law could be applied to the evolution of smarter machines, not just from faster processing power, but from the use of better algorithms.

That would be welcome. But it needs to be very simple and understandable, so that every person and every organization that tries to apply it has the same understanding of what it means before making comparisons. It also needs to remain constant in time and resist shifting definitions.

Got any suggestions?
 
  • #28
Delta² said:
According to Bobby Fischer, one of the top players in all the history of chess, the best first move for white is 1.e4.

Yes, Bobby Fischer said that, and I don't think there is a single chess player out there who can doubt his great value/expertise. The important thing is the rationale behind it. By playing 1.e2-e4 as white you open toward one of the most crucial central squares of the chessboard, you give immediate mobility to your queen and bishop and one more square to the knight from the outset, and you also don't make clear to your opponent which of the available systems (for this position) you'll choose, i.e. flexibility. This is great, but as centuries of chess playing have shown, there are many more strong openings - Bobby Fischer himself made such choices as well - taking into account, of course, the changes/adaptations you'll need to make according to the opponent's choices. Personally, although far from GM level, from some point on I've been a fan of 1.d2-d4, and in particular of Botvinnik's chess school/system, but there are a lot of other great systems too, including the English opening (speaking for white). So I personally find it reasonable that (quoting from the article)

So what openings did AlphaZero actually like or choose by the end of its learning process? The English Opening and the Queen's Gambit!

Also, I don't think that any possible combination of a human player (even at GM level) with some conventional chess engine stands any chance against AlphaZero, for the reasons mfb points out in #24. Of course I am not absolutely sure, and no one can be in advance. It has to be proven first.
 
Last edited:
  • Like
Likes Delta2
  • #29
PAllen said:
It is true that centaurs (human + computer) beat a computer alone, for current commercial engines.

Is there a recent match where this has happened?

PAllen said:
It would certainly be interesting to try human plus Stockfish or Houdini against AlphaZero. I am not so sure the centaur would win.

That's a bold claim.
 
  • #30
Buffu said:
Is there a recent match where this has happened?
Every postal chess match is a battle between centaurs, with a few players who just pick machine moves. Skilled centaurs always win. Note that postal chess admitted it was pointless to expect people not to use machine aids, so they just allowed it, redefining the nature of the competition.

Buffu said:
That's a bold claim.
How so? It is a non-claim: I said that I don't know what would happen.
 
  • #31
There must be some "backstage" info. I mean, the AlphaZero developer team would probably have hired some human GMs/IMs/FMs to cooperate with them, and the program has probably played some unofficial games against GMs. Does anyone know anything about this?
 
  • #32
Delta² said:
There must be some "backstage" info. I mean, the AlphaZero developer team would probably have hired some human GMs/IMs/FMs to cooperate with them, and the program has probably played some unofficial games against GMs. Does anyone know anything about this?

Why would you think this? Are you saying the developers lied about how they trained the neural network? The computer played over 500,000 games against itself. This would have given it a huge number of possible games to learn from. Would a few games against a human grandmaster really make much difference? How many chess games does a human grandmaster play in the course of learning the game?
 
  • #33
phyzguy said:
Why would you think this? Are you saying the developers lied about how they trained the neural network? The computer played over 500,000 games against itself. This would have given it a huge number of possible games to learn from. Would a few games against a human grandmaster really make much difference? How many chess games does a human grandmaster play in the course of learning the game?

No, I am not saying that they lied about how they trained the program, but I don't know if they got some aid from GMs in developing its source code. Conventional chess program developers often cooperate with GMs, and that is reflected in the source code (mainly the evaluation function) of a conventional chess program. I thought the AlphaZero developers may have done the same thing. (In simple words, a GM can tell a programmer how he/she thinks when playing chess, and the programmer can somehow incorporate this info into the program's source code.)
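For what it's worth, here is a minimal sketch (a toy of my own, not from any real engine) of the kind of hand-crafted evaluation function meant above. The piece values and the mobility weight are exactly the sort of numbers a cooperating GM would help a programmer choose and tune:

```python
# Toy hand-crafted evaluation of the kind conventional engines use.
# Scores are in centipawns from White's point of view; positive favours White.
PIECE_VALUES = {"P": 100, "N": 320, "B": 330, "R": 500, "Q": 900}

def evaluate(white_pieces, black_pieces, white_mobility, black_mobility):
    """white_pieces/black_pieces: piece letters, e.g. ["Q", "R"] + ["P"] * 5.
    mobility: how many legal moves each side has in the position."""
    material = (sum(PIECE_VALUES[p] for p in white_pieces)
                - sum(PIECE_VALUES[p] for p in black_pieces))
    # "Mobility matters" is GM knowledge; 10 centipawns per extra legal move
    # is exactly the kind of weight tuned with GM feedback.
    mobility = 10 * (white_mobility - black_mobility)
    return material + mobility

# White is a pawn down but far more active: the evaluation still favours White.
print(evaluate(["Q", "R"] + ["P"] * 5, ["Q", "R"] + ["P"] * 6,
               white_mobility=35, black_mobility=20))   # -> 50
```

AlphaZero has no hand-written function like this at all; its network learns its own evaluation from self-play, which is presumably why this kind of GM consultation wasn't needed.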
 
  • #34
Delta² said:
No, I am not saying that they lied about how they trained the program, but I don't know if they got some aid from GMs in developing its source code. Conventional chess program developers often cooperate with GMs, and that is reflected in the source code (mainly the evaluation function) of a conventional chess program. I thought the AlphaZero developers may have done the same thing. (In simple words, a GM can tell a programmer how he/she thinks when playing chess, and the programmer can somehow incorporate this info into the program's source code.)

The earlier version of AlphaGo (at least for Go) was trained by looking at human gameplay before then playing numerous games against itself, and that probably biased that older version to assign greater value to certain moves for some low-complexity shapes on the board (or else it independently discovered the best moves to play in those situations, moves that humans happen to have also discovered). AlphaZero, however (as stated previously), was only given the rules of the games. That is probably the main reason for some of the incredibly unorthodox play we've seen from DeepMind's flagship AI for a while now, even aside from its ability to explore so much more of the game space.
 
  • #35
Which other games do you expect next?
Chess and Go have no randomness and no hidden information - all players always know the full state of the game. Many card games have hidden information (cards not shown to everyone) and randomness, for example. There is a Poker-AI beating humans but that is a different topic.

There are multiple games with no randomness or hidden information, but most of them are much easier for computers than chess and Go. You don't need a checkers AI or a Connect 4 AI because there is an exact manual for how not to lose.

It will be interesting to see if the AI can be adapted to work with a broader range of problems.
 
  • Like
Likes ISamson
