ELO chess ranking system applied incorrectly in video games

DragonPetter · Feb 10, 2012

Are there any chess players, statisticians, or smart people who are familiar with the ELO ranking system? This concerns a game with over 100,000 players, so it's not an obscure concern.

Basically, I play an online video game that tries to apply an ELO ranking system to rank individual player skill; however, it's a multiplayer team-oriented game. I think the application of ELO in this case is severely flawed, and it feels so obvious to me logically; however, a way to prove it is not immediately obvious to me and I would like to get input before I venture into calculations, simulations, and models.

I'll try to describe the method and why I think it's invalid, and then the supporters' reasons for why it's valid.

The games consist of 2 teams, each with 5 players. The gameplay is heavily dependent on team performance, where 1 player can influence the outcome at times, but in general, it takes a team effort to outperform another team. Imagine it being basketball if you want. So, if you are on a team with bad players, you are severely disadvantaged. At the end of the game, all players of the winning team receive a boost to their ELO rating, while players of the losing team receive a deduction from their rating, independent of each player's individual performance.

Now, one of the key flaws I see is that the system applies the team performance to an individual's ranking. If the teams were always constant, as in the players being rated were always on the same team together, this would actually work, because then the ELO rating is applied to the team, rather than the individual, and the team has a record of statistical datapoints that are valid. Say some lucky bad player plays with 4 professional players on his team every game; then the rating he has should not represent his personal skill, but rather his team's skill.

In the game I'm playing, each game, or trial if you can call it that, the teams are completely randomized with players that could be very experienced or players that have never played the game before. The system finds 5 players on a team, and it picks 2 teams of random players that have the same average rating (in an attempt to make the matchup fair). So each game a player plays, his team has changed completely, but the end of game results (win or lose) are applied to the individual player and so that random team's performance is applied as a datapoint for someone's individual, non-random, statistical performance. Does this start to sound flawed? The dataset being compiled is basically random events, since the players are picked out of a random pool with an average ranking value.

The argument in support of this method is that A.) the player pool to make teams is random and B.) the constant in all trials is that the individual player was an influence in every game (in other words, the common element to all the random games is that the player being rated was a participant in all of them). And with these premises, the individual's influence should begin to average out above the randomness of his team makeup in each event. So with a sufficient number of games played, the accumulated performance of the random teams that he participated in should start to reflect his own performance.

Now, my argument is this: you cannot base an individual's ranking on how the randomized teams he plays on perform. The +/- received from a win or loss should only be applied to the team's performance, rather than an individual's performance. I would also argue that, given the distribution of player skill in the randomized teams and the fact that the player is only 1/10th of the participants in a match, the "averaging" effect that they think should surface is completely drowned out by "noise". I think of it as a signal-to-noise ratio analogy: the 9 other players are the noise floor, and the individual player is below this noise threshold, so his influence cannot be measured.

Sorry if this doesn't belong here or doesn't make much sense, but if you have any experience or thoughts on this I'd appreciate comments.

Jimmy Snyder · Feb 10, 2012

The result will be a progressively smaller number of people whose scores are increasingly skewed.

For instance, suppose the following variation on the traditional ELO system where you are scored by your own games only. However, after each game a coin is flipped and the loser of the coin toss gives one ELO point to the winner.

At the end of the first round, half of the players are 1 point ahead, half are 1 point behind. At the end of the second round, half will have correct ELO scores, a quarter will be 2 points ahead, a quarter 2 points behind. Etc.
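The coin-flip variation above can be enumerated exactly for a small number of rounds. This is a minimal sketch that tracks only the random coin-flip offset, not the underlying ELO score; `offset_distribution` is a name invented for the illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def offset_distribution(rounds):
    """Exact distribution of a player's net coin-flip points after `rounds` games.

    Each round adds +1 (won the toss) or -1 (lost the toss) with equal
    probability, so every sequence of flips is equally likely.
    """
    counts = Counter(sum(flips) for flips in product((1, -1), repeat=rounds))
    total = 2 ** rounds
    return {offset: Fraction(n, total) for offset, n in sorted(counts.items())}


if __name__ == "__main__":
    for rounds in (1, 2, 4):
        print(rounds, offset_distribution(rounds))
```

For two rounds this gives a quarter of players at -2, half at 0 and a quarter at +2, matching the description. Since this is a simple random walk, the typical size of the offset grows like the square root of the number of rounds.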

Containment · Feb 10, 2012

I think one thing to remember is that the elo in a team game would not reflect how good of a player you are but more so how well you work with a team. It would be possible to add some factors that would allow it to value how good of a player you are. It's possible that would lead to people just trying to work the system to get scores that reflect that they are good.

With just about any elo system, the score will become more accurate the more games you play. Have you taken this into account?

Jimmy Snyder · Feb 10, 2012

If I understand the OP, you don't play as a team, you play as a group of 5 individuals. It is only the rating points that are distributed on the basis of the team's result. When you win a game in the traditional ELO system, you get some of your opponent's points. In this system you only get your opponent's points if your team wins. Say you defeat your opponent and the other 4 members of your team lose their games. In the traditional system you would win points. In this system you lose points.

Matterwave · Feb 10, 2012

It seems to me that Jimmy's conclusions are reasonable. With a sufficiently large player pool, the majority will see their scores average out to reasonable levels. A small minority may have been exceptionally lucky or unlucky, consistently paired with players above or below their skill level, and will have a skewed rating.

DragonPetter · Feb 10, 2012

Jimmy Snyder said:

If I understand the OP, you don't play as a team, you play as a group of 5 individuals. It is only the rating points that are distributed on the basis of the team's result. When you win a game in the traditional ELO system, you get some of your opponent's points. In this system you only get your opponent's points if your team wins. Say you defeat your opponent and the other 4 members of your team lose their games. In the traditional system you would win points. In this system you lose points.

Hi all, thanks for the replies.

You do play as a team. It's a strategic game where 5 players play together to beat the other team. My problem is that you are then rated as an individual based on your team's performance, and your team changes every match and is randomly chosen.

fluidistic · Feb 10, 2012

I also believe the elo ranking system is flawed in such games (like football-soccer too).
For a few months I played a video game in which teams remained fixed even after a game ended. We would create "clans" and were teamed up only with people of our clan vs other clans. The number of players wasn't fixed; a game could start 2 vs 5 if there were 5 people from one clan and 2 people from the other clan.
I was stronger than the average guy in my team; when the season of clans ended my elo rating went up by almost 200 points and I entered the top 50 or so. However, my skills remained the same or almost, so my elo didn't mean anything reliable.

In one of my first games (elo 1500), I played a 3 vs 3 against the strongest guy of the video game (elo around 2200). I managed to beat him almost exclusively in 1 vs 1 but my team mates failed to win the game. Result of the game: I lost elo points, the strongest guy of the video game gained elo... though I personally did better than him in that particular game. Such a ranking system cannot be right. In my opinion elo isn't suited for team games.
P.S.: In that game, if you die entirely but your team manages to win the game, you "win" for the ranking system.

mege · Feb 10, 2012

One thing to remember: rating systems are meant to be a predictor, not a sure thing (and from the sounds of it, OP is talking about League of Legends?). Even with team games, over a long enough period of time, a k-weighted ELO system should be relatively accurate as long as ratings are used in their appropriate pool: multiplayer ratings used for randomly assigned teams, etc. Also, something to note for League of Legends: they use a personal rating (hidden from you) to make match making even more granular.

Magic: The Gathering (and several other 1v1 tabletop games) have all recently dropped the ELO rating system for their competitions. Over a long enough period of time, the ratings became a bit unwieldy (because they weren't using them for match making, just large scale comparisons). People would sit on their ratings for large events, forgoing small events (i.e., a sufficiently high rated player would need to go undefeated to not lose significant points at a smaller event). There is also the thought that ELO ratings don't accurately represent games with a chance element.

Finally, especially in a team game something that is being discounted is your skill as a team player. Sure, 1v1 you may beat someone - but 2v2 that other individual may be far better at utilizing his partner. I think that you're discounting "plays well with others" as a measured skill in this case. To your soccer analogy: your striker may have the best kick in the game, but if he's a ball-hog things may not bode well. But again - over a long period of time, in a randomly-selected team environment, the 'better' individual will have a higher rating.

Jimmy Snyder · Feb 11, 2012

DragonPetter said:

Hi all, thanks for the replies.

You do play as a team. It's a strategic game where 5 players play together to beat the other team. My problem is that you are then rated as an individual based on your team's performance, and your team changes every match and is randomly chosen.

How do you play as a team? Is there a single game and the 5 of you vote on which move to play? Or do you take turns making moves? Can you clarify for me what it means to play a chess game as a team?

Jack21222 · Feb 11, 2012

Ah, League of Legends. My little brother just got me into that game.

I guess the main counterargument to the OP is that the high ELO players are legitimately the best players, and the low ELO players are generally crap. I disagree that your performance in the game will get drowned out by the others in the long run. We've all had games where one guy on either team basically solos the entire game, and we've all had games where some horrible player on one of the teams just feeds the opposing carries.

But, my main point is that players with high ELOs exist, and that is the biggest counterexample to your argument.

Jack21222 · Feb 11, 2012

Jimmy Snyder said:

How do you play as a team? Is there a single game and the 5 of you vote on which move to play? Or do you take turns making moves? Can you clarify for me what it means to play a chess game as a team?

It's a real-time game, not turn based. Instead of trying to describe it, let me post a video here. This is a video showing some high ELO players with commentary.

https://www.youtube.com/watch?v=rdgXYwanyDk

Jimmy Snyder · Feb 11, 2012

Aha! I thought we were talking about chess.

encorp · Feb 12, 2012

Is this where I can jump in and flame LoL over HON/Dota?

Ha.

I kid.

This is interesting, I'd like to see what people think about the system that exists in all the above games.

ThomasT · Feb 12, 2012

fluidistic said:

In my opinion elo isn't suited for team games.

I agree. It was designed for chess, afaik. For chess, and other 1 on 1 games, it's a good predictor. For team competitions that keep pretty much the same personnel from game to game it should be a pretty good predictor of team performance. For rating individuals in team competition, where the team membership changes randomly from contest to contest and individual performance stats aren't taken into account ... definitely no.

Jack21222 · Feb 12, 2012

ThomasT said:

I agree. It was designed for chess, afaik. For chess, and other 1 on 1 games, it's a good predictor. For team competitions that keep pretty much the same personnel from game to game it should be a pretty good predictor of team performance. For rating individuals in team competition, where the team membership changes randomly from contest to contest and individual performance stats aren't taken into account ... definitely no.

Disagree, because your individual performance will impact whether or not your team wins. So, better players will win more often regardless of the rest of their team.

Again, the biggest counterargument is that the strongest players have the highest ELOs, even in solo queue. This fact cannot be explained if ELO isn't related to individual performance.

fluidistic · Feb 12, 2012

Jack21222 said:

Disagree, because your individual performance will impact whether or not your team wins. So, better players will win more often regardless of the rest of their team.

In the game where I was teamed up with my clan mates rather than randomly, my elo was in the 1400-1500's. When the "season" was over, the teams would be randomly created. My elo suddenly went up to the high 1700's; my skills, however, remained the same. I'm not the only one to whom this happened. Many people criticized the elo ranking system for that particular game because of this, and because of totally unbalanced games where one could guess the outcome from the start regardless of what the elo had to say. The same would apply even when the teams were randomly balanced. In that particular game the economy (think of a StarCraft-like one) is shared. If you have a noob in your team and he's wasting all the economy on useless stuff, even the best player can't do much to win the game.

Again, the biggest counterargument is that the strongest players have the highest ELOs, even in solo queue. This fact cannot be explained if ELO isn't related to individual performance.

This did not happen in my game (the game's name is Zero-K and the season of clan teams is called planet wars).

Jack21222 · Feb 13, 2012

fluidistic said:

In the game where I was teamed up with my clan mates rather than randomly, my elo was in the 1400-1500's. When the "season" was over, the teams would be randomly created. My elo suddenly went up to the high 1700's; my skills, however, remained the same. I'm not the only one to whom this happened. Many people criticized the elo ranking system for that particular game because of this, and because of totally unbalanced games where one could guess the outcome from the start regardless of what the elo had to say. The same would apply even when the teams were randomly balanced. In that particular game the economy (think of a StarCraft-like one) is shared. If you have a noob in your team and he's wasting all the economy on useless stuff, even the best player can't do much to win the game.

This did not happen in my game (the game's name is Zero-K and the season of clan teams is called planet wars).

You're talking about a different game than the OP is, so I have no comment on that.

In League of Legends, there are team queues and solo queues. The teams generally have lower ELOs than individuals, so I don't know how meaningful it is to compare the ELO of a team vs the ELO of an individual, as you seem to be doing. However, the point stands that you're talking about a completely different game.

fluidistic · Feb 13, 2012

Jack21222 said:

You're talking about a different game than the OP is, so I have no comment on that.

In League of Legends, there are team queues and solo queues. The teams generally have lower ELOs than individuals, so I don't know how meaningful it is to compare the ELO of a team vs the ELO of an individual, as you seem to be doing. However, the point stands that you're talking about a completely different game.

My fault, in post #15 I thought you were referring to any team game rather than League of Legends in particular.

DragonPetter · Feb 14, 2012

Glad people started using game names, I didn't want to appear as a nerd too badly. The game I'm referring to is HoN.

I'm also glad to see people agreeing with me. But does anyone know how to do a mathematical analysis to prove it's invalid?
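Short of a proof, one way to start on this question is a small Monte Carlo sketch. The code below is a toy model under loudly stated assumptions, not HoN's or anyone's actual system: every player has a hidden "true skill" on an Elo-like scale, teams of 5 are drawn completely at random (no rating-balanced matchmaking), the chance of winning depends only on the difference in average true skill, and every player on a team receives the same standard Elo update computed from average team ratings. All names and parameters (`simulate`, `num_players`, `k`, and so on) are made up for the illustration.

```python
import random


def expected_score(rating_a, rating_b):
    """Standard Elo expected score of side A against side B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def mean(values):
    return sum(values) / len(values)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(num_players=300, num_matches=30000, k=16, seed=0):
    """Return the correlation between hidden skill and final team-based rating."""
    rng = random.Random(seed)
    # ASSUMPTION: hidden true skill, normally distributed on an Elo-like scale.
    skill = [rng.gauss(1500, 200) for _ in range(num_players)]
    rating = [1500.0] * num_players  # everyone starts at the same rating

    for _ in range(num_matches):
        # ASSUMPTION: 10 players drawn at random and split 5 vs 5.
        lobby = rng.sample(range(num_players), 10)
        team_a, team_b = lobby[:5], lobby[5:]

        # ASSUMPTION: the outcome depends only on average true skill.
        p_a_wins = expected_score(mean([skill[i] for i in team_a]),
                                  mean([skill[i] for i in team_b]))
        a_won = rng.random() < p_a_wins

        # The rating system only sees ratings, never the hidden skill.
        predicted = expected_score(mean([rating[i] for i in team_a]),
                                   mean([rating[i] for i in team_b]))
        delta = k * ((1.0 if a_won else 0.0) - predicted)

        # Every player on a team gets the same adjustment, win or lose.
        for i in team_a:
            rating[i] += delta
        for i in team_b:
            rating[i] -= delta

    return pearson(skill, rating)


if __name__ == "__main__":
    print("correlation between true skill and rating:", round(simulate(), 3))
```

In this toy model the ratings do end up tracking hidden skill, but that result is built into the assumption that each player contributes equally and independently to the team's strength. It says nothing about games where one weak player can sink a team (a shared economy, feeding), and changing the win model or the matchmaking rule is exactly where the two sides of this thread would disagree.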

DragonPetter · Feb 14, 2012

Jack21222 said:

Ah, League of Legends. My little brother just got me into that game.

I guess the main counterargument to the OP is that the high ELO players are legitimately the best players, and the low ELO players are generally crap. I disagree that your performance in the game will get drowned out by the others in the long run. We've all had games where one guy on either team basically solos the entire game, and we've all had games where some horrible player on one of the teams just feeds the opposing carries.

But, my main point is that players with high ELOs exist, and that is the biggest counterexample to your argument.

Thanks for the counterexample, and this brings up some subtle points I forgot to make originally.

First, most of the high rank players also tend to play with other high rank players, and play in organized teams rather than randomized teams. If not organized teams, they usually at least have a buddy they play with consistently. From my first post, I mention that the ranking system becomes more accurate as the makeup of the team remains unchanged rather than randomized. I highly doubt the "strongest" players can do much when they are thrown back down to below the average rank and have their hands tied by 4 beginners.

Secondly, as the "randomness" tilts you in one direction or another, you start to notice a landslide effect. If I go on a bad streak, my rank takes a dive. If I get on a winning streak, I tend to stay up at that position until I get bad luck (horrible teammates) again.

It's because if you start to win a couple, the system starts to pair you with other people who have won recently, and then these winners help to pull you further away from the average. It has very little to do with your own actual skill ranking.

Jack21222 · Feb 14, 2012

DragonPetter said:

Thanks for the counterexample, and this brings up some subtle points I forgot to make originally.

First, most of the high rank players also tend to play with other high rank players, and play in organized teams rather than randomized teams. If not organized teams, they usually at least have a buddy they play with consistently. From my first post, I mention that the ranking system becomes more accurate as the makeup of the team remains unchanged rather than randomized. I highly doubt the "strongest" players can do much when they are thrown back down to below the average rank and have their hands tied by 4 beginners.

Secondly, as the "randomness" tilts you in one direction or another, you start to notice a landslide effect. If I go on a bad streak, my rank takes a dive. If I get on a winning streak, I tend to stay up at that position until I get bad luck (horrible teammates) again.

It's because if you start to win a couple, the system starts to pair you with other people who have won recently, and then these winners help to pull you further away from the average. It has very little to do with your own actual skill ranking.

First: In League of Legends, organized teams are ranked separately from individuals in random teams, so this issue does not happen. It may be different in your game.

Second, the "landslide effect" happens in chess too. The higher ranked the players you beat, the more points you get. The more points you get, the higher ranked the players you play.

Third, one higher-level player can carry an entire game even with 4 weak teammates. Sometimes, higher-level players will create new "summoner" profiles and play in the lower-level games. When this happens, they usually dominate the game. It's abundantly clear who has level 30 alts and who is a legitimate level 10. In the low-level queues, if you get a high-level player on your team, you're virtually guaranteed a win.

I don't know if you play League of Legends, but watch some of the "top plays of the week" videos on YouTube. As a fairly new player, I can look at those plays and tell that those players are very good. It is no accident that they have high ELOs.

DragonPetter · Feb 14, 2012

Jack21222 said:

First: In League of Legends, organized teams are ranked separately from individuals in random teams, so this issue does not happen. It may be different in your game.

Second, the "landslide effect" happens in chess too. The higher ranked the players you beat, the more points you get. The more points you get, the higher ranked the players you play.

Third, one higher-level player can carry an entire game even with 4 weak teammates. Sometimes, higher-level players will create new "summoner" profiles and play in the lower-level games. When this happens, they usually dominate the game. It's abundantly clear who has level 30 alts and who is a legitimate level 10. In the low-level queues, if you get a high-level player on your team, you're virtually guaranteed a win.

I don't know if you play League of Legends, but watch some of the "top plays of the week" videos on YouTube. As a fairly new player, I can look at those plays and tell that those players are very good. It is no accident that they have high ELOs.

I must just say the common phrase "correlation does not imply causation". Just because good players also have high ELOs does not mean being a good player gives high ELO.

Jack21222 · Feb 14, 2012

DragonPetter said:

I must just say the common phrase "correlation does not imply causation". Just because good players also have high ELOs does not mean being a good player gives high ELO.

It does when there's a specific mechanism that mathematically gives good players a high ELO.

DragonPetter · Feb 14, 2012

Jack21222 said:

It does when there's a specific mechanism that mathematically gives good players a high ELO.

That is a possibility . . or you could be doing the mathematical equivalent of running in circles.

A good first step would be to have the formula they use in front of us. Then we could evaluate how closely tied the two are mathematically.

Jack21222 · Feb 14, 2012

DragonPetter said:

That is a possibility . . or you could be doing the mathematical equivalent of running in circles.

A good first step would be to have the formula they use in front of us. Then we could evaluate how closely tied the two are mathematically.

They don't give the exact formula, but it's a modified version of the chess one.

http://na.leagueoflegends.com/learn/gameplay/matchmaking gives some details.

If you want the math of the chess formula, which they modified, you can find that here:

http://en.wikipedia.org/wiki/Elo_rating_system#Mathematical_details
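For reference, the standard chess formulas described on that page fit in a few lines. This is only the textbook version: the modified formula used by the game is not given, and the K-factor of 32 below is just a common illustrative choice, not a value taken from any of the games discussed here. The function names are made up for this sketch.

```python
def expected_score(rating_a, rating_b):
    """Expected score of player A against player B (between 0 and 1)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def updated_rating(rating, opponent_rating, score, k=32):
    """New rating after one game; score is 1 for a win, 0.5 draw, 0 loss.

    ASSUMPTION: k=32 is an illustrative K-factor only.
    """
    return rating + k * (score - expected_score(rating, opponent_rating))


if __name__ == "__main__":
    # A 1500 player beating an equal opponent vs. beating a 1700 opponent.
    print(updated_rating(1500, 1500, 1.0) - 1500)
    print(updated_rating(1500, 1700, 1.0) - 1500)
```

With K = 32, beating an equally rated opponent is worth 16 points, and beating a higher rated opponent is worth more, which is the mechanism behind the "the higher ranked players you beat, the more points you get" remark earlier in the thread.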

JaWiB · Feb 14, 2012

fluidistic said:

In the game where I was teamed up with my clan mates rather than randomly, my elo was in the 1400-1500's. When the "season" was over, the teams would be randomly created. My elo suddenly went up to the high 1700's; my skills, however, remained the same. I'm not the only one to whom this happened. Many people criticized the elo ranking system for that particular game because of this, and because of totally unbalanced games where one could guess the outcome from the start regardless of what the elo had to say. The same would apply even when the teams were randomly balanced. In that particular game the economy (think of a StarCraft-like one) is shared. If you have a noob in your team and he's wasting all the economy on useless stuff, even the best player can't do much to win the game.

I disagree with this. I would say your elo deserved to be in the 1400s-1500s because your team was presumably not as good as the teams you were randomly matched with.

Starcraft does this correctly IMO because you have a ranking for every team you play with, as well as a "Random Team" ranking. Sure, if you play by yourself you'll get crappy players but your ranking is accurate on average.

Halo Reach broke off from this type of ranking system from the previous games in the series, and it was part of the reason that I stopped playing. They had some sort of voodoo to figure out how well you as an individual did in a team game. Winning was no longer the objective in team games, because it was no guarantee that your rank would increase. To rank up, you basically needed a lot of kills, which took a lot of the strategy out of the game since everyone went into run and gun mode most of the time.

fluidistic · Feb 14, 2012

JaWiB said:

I disagree with this. I would say your elo deserved to be in the 1400s-1500s because your team was presumably not as good as the teams you were randomly matched with.

Point taken. I honestly do not know what would have been my approximate elo. But there's a reason: for that particular game, the elo system was so bad that intuition would work much better. I used to bet on the outcome from start, I once predicted the right outcome 11 games in a row, while the !predict command based on elo failed totally. That command's output was something like "team 1 has 65% chance to win vs team 2". While intuition could be "team 1 has 0% chance to win vs team 2".
I knew many players because I'd spectate games (you can put all your attention into a particular player and therefore learn from him). I knew for instance a guy rated 1500 that could kill about 4 people before dying but his clan/team was so bad that he'd still lose elo points compared to an "average player" in an opposite team despite being a really strong player. I could not beat that particular guy in 1 vs 1 even though my elo was in the 1700's, even if I had played 10 games in a row. I know this for sure. In 1 vs 1 we usually choose small maps and in that game this means that the commander (special unit which is customizable) has a very important role. That guy's commander was a beast and he knew very well how to manage it.

Another example that elo wasn't well applied for that game is that one would win/lose more elo points when there were fewer players per team. I've seen a 1300 elo teamed up with the strongest player -elo 2200- vs an average player of 1500. 2 vs 1. The noob (1300 elo) would give his commander to the pro player who would win at any time he wanted. The noob who almost didn't play got a boost in elo, the pro player too, while the average guy, who could be a good player all in all, got a huge drop in his elo points. After a few games like this you end up with a good player rated 1300, a bad player rated 1400 and a pro player rated 2300. That's just terrible.

Fredrik · Feb 15, 2012

[nitpick]
It's actually Elo, not ELO. It's a person's last name. The guy who invented the system was named Arpad Elo.
[/nitpick]

DragonPetter · Feb 15, 2012

Fredrik said:

[nitpick]
It's actually Elo, not ELO. It's a person's last name. The guy who invented the system was named Arpad Elo.
[/nitpick]

Oh sorry. I'm always thinking of the band when I read it.
