November 25, 2021

Mixed strategies


In this post I explore mixed strategy Nash equilibria for a few strategies: all-in, standard, and defensive. Games often feature these archetypal strategies with a rock-paper-scissors relationship between them. I also make it more interesting by introducing other variables such as player skill and added randomness.

Disclaimer: This is a fun project. I'm no expert on game theory. I was interested in what this would look like and how I would implement and visualize it, and I wanted to learn something along the way.

Rock-paper-scissors

Rock-paper-scissors is a simple game that serves as a good example. There are three strategies, each countering exactly one other with a 100% winrate: rock < paper < scissors < rock.

We can represent that by a payoff matrix. We don't have to use winrates as payoffs, but it will be good for later. On the diagonal (rock-rock, paper-paper, scissors-scissors), I have put 50% for a draw. Also, I don't have to write payoffs for the second player as this is a zero-sum game. Those will be 100% minus payoffs for the first player (winrates adding up to 100%).

[P1↓ P2→]   Rock   Paper   Scissors
Rock        50%    0%      100%
Paper       100%   50%     0%
Scissors    0%     100%    50%

Committing to one pure strategy, for example always playing rock, isn't a good idea as the opponent can easily exploit it. If we did the game-theoretic calculation, we would find that when both players play optimally, they choose each sign with equal probability.

That's a mixed strategy Nash equilibrium. It's "mixed" because it mixes different strategies, like sometimes playing rock and sometimes paper. It's a Nash equilibrium because neither player has an incentive to deviate from it given the opponent's strategy.
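The "no incentive to deviate" property can be checked directly on the payoff matrix above. The following is a minimal sketch (the helper name `payoff_vs_mix` is made up for illustration): against a uniform opponent every pure strategy earns the same 50%, so no deviation helps, while a predictable opponent is easy to exploit.

```python
# Row player's winrates from the rock-paper-scissors table (rows: P1, columns: P2).
RPS = [
    [0.5, 0.0, 1.0],  # rock
    [1.0, 0.5, 0.0],  # paper
    [0.0, 1.0, 0.5],  # scissors
]


def payoff_vs_mix(matrix, opponent_mix):
    """Expected winrate of each pure row strategy against a column mix."""
    return [sum(a * q for a, q in zip(row, opponent_mix)) for row in matrix]


uniform = [1 / 3, 1 / 3, 1 / 3]
always_rock = [1.0, 0.0, 0.0]

# Against the uniform mix every pure strategy wins 50% (up to float rounding),
# so the row player gains nothing by switching strategies.
print(payoff_vs_mix(RPS, uniform))

# Against an opponent who always plays rock, paper wins every game.
print(payoff_vs_mix(RPS, always_rock))  # [0.5, 1.0, 0.0]
```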

Model

This model is similar to rock-paper-scissors as we still have two players and three strategies. However, to make it more interesting, (1) no strategy will have a 100% winrate against another, (2) we will take player skill into account, and (3) one strategy will be more affected by randomness.

How does that work?

  • Three strategies (all-in, defensive, and standard)
  • Player skill is represented by ELO rating
  • Strategy advantage over another is represented as a bonus to ELO rating
  • The all-in strategy represents a strategy where randomness plays a significant role. This is done by reducing the ELO rating difference by 50% when one player goes all-in, and by 75% when both do.

I have chosen ELO because of its simplicity. It's an easy way to represent player skill, translate it into winrates, and represent strategy matchups as giving a certain ELO rating bonus. Alternatively, I could have used TrueSkill with mu as skill, and all-ins increasing sigma or beta for given matchups.

The randomness and the reduction of skill difference for all-ins can come from several sources: coin-flip builds, failing to scout a hidden building, or focusing on a single early rush with minimal player interaction. In such games the better player might not get the chance to outplay the opponent, and the randomness of those few interactions won't get averaged out.

To represent the relationship between strategies: all-in < defensive < standard < all-in, I have chosen these payoffs (winrates) for equally skilled opponents:

[P1↓ P2→]   All-in   Standard   Defensive
All-in      50%      55%        35%
Standard    45%      50%        60%
Defensive   65%      40%        50%

And this is what it looks like when player one is rated 200 ELO higher than player two:

[P1↓ P2→]   All-in   Standard   Defensive
All-in      57%      68%        49%
Standard    59%      76%        83%
Defensive   77%      68%        76%
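Both tables can be reproduced with a few lines. This is a sketch under my own assumptions, not necessarily how the original code does it: the standard Elo expected-score formula with a 400-point scale, matchup bonuses derived from the equal-skill winrates, and the all-in reduction applied to the skill difference only, with the matchup bonus added afterwards at full size. That ordering is a guess, but it matches the rounded values in the 200 ELO table. The function names are made up for illustration.

```python
import math

STRATEGIES = ["All-in", "Standard", "Defensive"]

# Winrates for equally skilled players, copied from the first table.
BASE = [
    [0.50, 0.55, 0.35],
    [0.45, 0.50, 0.60],
    [0.65, 0.40, 0.50],
]


def elo_to_winrate(diff):
    """Standard Elo expected score for a rating advantage of `diff`."""
    return 1 / (1 + 10 ** (-diff / 400))


def winrate_to_elo(winrate):
    """Inverse of elo_to_winrate: the rating bonus that yields `winrate`."""
    return 400 * math.log10(winrate / (1 - winrate))


def skill_factor(i, j):
    """Share of the skill difference that survives: 50% with one all-in, 25% with two."""
    all_ins = (i == 0) + (j == 0)
    return {0: 1.0, 1: 0.5, 2: 0.25}[all_ins]


def payoff_matrix(elo_diff):
    """Player one's winrates when player one is `elo_diff` points stronger."""
    return [
        [
            # Assumption: scale the skill difference first, then add the bonus.
            elo_to_winrate(elo_diff * skill_factor(i, j) + winrate_to_elo(BASE[i][j]))
            for j in range(3)
        ]
        for i in range(3)
    ]


for name, row in zip(STRATEGIES, payoff_matrix(200)):
    print(f"{name:<10}", [f"{100 * v:.0f}%" for v in row])
```

With `payoff_matrix(0)` the function returns the equal-skill table unchanged, since the skill term vanishes and only the matchup bonuses remain.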

As expected, it now favors player one. Values on the diagonal aren't the same anymore. Since the all-in strategy is more affected by randomness, two players all-ining each other makes the game more random and hence closer to 50% than a standard-standard game.

★ ★ ★

Now what if we tried to find mixed strategy Nash equilibria for different ELO differences between players? It would show how players should mix their strategies based on how good or bad their opponent is. This assumes both players are rational and know all of this.
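One way to find such an equilibrium for a small zero-sum game is support enumeration: try every pair of equal-sized strategy subsets, solve the indifference equations on them, and keep the first candidate that has valid probabilities and leaves no profitable deviation. The sketch below is my own illustration (all function names are made up) and is applied to the equal-skill table; it is not meant to mirror the code in the repository.

```python
from itertools import combinations


def solve_linear(a, b):
    """Solve a*x = b by Gauss-Jordan elimination; return None if singular."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            return None
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def indifference(sub, tol=1e-9):
    """Find a mix p over the columns of `sub` giving every row the same value v."""
    k = len(sub)
    # Unknowns: p_1..p_k and v. Equations: sub*p - v = 0 per row, sum(p) = 1.
    a = [list(row) + [-1.0] for row in sub] + [[1.0] * k + [0.0]]
    sol = solve_linear(a, [0.0] * k + [1.0])
    if sol is None or min(sol[:k]) < -tol:
        return None
    return sol[:k], sol[k]


def solve_zero_sum(A, tol=1e-9):
    """Return (row_mix, col_mix, value) for the row player's payoff matrix A."""
    n_rows, n_cols = len(A), len(A[0])
    for k in range(1, min(n_rows, n_cols) + 1):
        for rows in combinations(range(n_rows), k):
            for cols in combinations(range(n_cols), k):
                # Column mix that makes the row player indifferent, and vice versa.
                col_sol = indifference([[A[i][j] for j in cols] for i in rows])
                row_sol = indifference([[A[i][j] for i in rows] for j in cols])
                if col_sol is None or row_sol is None:
                    continue
                x, y = [0.0] * n_rows, [0.0] * n_cols
                for i, p in zip(rows, row_sol[0]):
                    x[i] = p
                for j, p in zip(cols, col_sol[0]):
                    y[j] = p
                v = col_sol[1]
                # Reject the candidate if any pure deviation beats the value.
                best_row = max(sum(A[i][j] * y[j] for j in range(n_cols)) for i in range(n_rows))
                best_col = min(sum(x[i] * A[i][j] for i in range(n_rows)) for j in range(n_cols))
                if best_row <= v + tol and best_col >= v - tol:
                    return x, y, v
    return None


# Equal-skill winrates (order: all-in, standard, defensive).
EQUAL_SKILL = [[0.50, 0.55, 0.35], [0.45, 0.50, 0.60], [0.65, 0.40, 0.50]]
x, y, v = solve_zero_sum(EQUAL_SKILL)
print([round(p, 3) for p in x], round(v, 3))  # roughly [0.333, 0.5, 0.167] 0.5
```

For equally skilled players this gives a mix of about one third all-in, one half standard, and one sixth defensive. Enumerating supports grows exponentially with the number of strategies, so for larger games a linear-programming solver would be the more sensible choice.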

The optimal mix of strategies based on game theory.
The second chart belongs to the opponent – letting you compare which mixes of strategies face each other.

A significantly worse player (more than 355 ELO below the opponent) should always all-in, as that effectively reduces the difference between the players' skills. At the other end of the spectrum, a significantly better player is better off always being defensive, as they can outplay the opponent later in the game, and it's all about surviving the all-in.

In this model there are 5 phases where different strategies are viable:

ELO difference    Viable strategies
? – -355          All-in
-355 – -69        All-in | Standard
-69 – 70          All-in | Standard | Defensive
70 – 356          Standard | Defensive
356 – ?           Defensive
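The outermost boundary can be sanity-checked by hand, assuming the standard Elo formula and assuming the all-in reduction applies to the skill difference only, with the matchup bonus added at full size. When the stronger player always plays defensive, the weaker player stays pure all-in for as long as all-in beats standard against defensive. Setting the two effective rating differences equal gives the switching point. The helper name `winrate_to_elo` is made up for illustration.

```python
import math


def winrate_to_elo(winrate):
    """Rating bonus that corresponds to a winrate between equally skilled players."""
    return 400 * math.log10(winrate / (1 - winrate))


# Matchup bonuses from the equal-skill table, both against a defensive opponent.
bonus_allin_vs_def = winrate_to_elo(0.35)  # negative: defensive counters all-in
bonus_std_vs_def = winrate_to_elo(0.60)    # positive: standard counters defensive

# Effective rating difference for player one at skill difference d:
#   all-in vs defensive:    0.5 * d + bonus_allin_vs_def   (one all-in halves d)
#   standard vs defensive:  1.0 * d + bonus_std_vs_def
# The two are equal where 0.5 * d = bonus_allin_vs_def - bonus_std_vs_def.
boundary = 2 * (bonus_allin_vs_def - bonus_std_vs_def)
print(round(boundary))  # about -356
```

Below this value the halved skill gap outweighs the bad matchup, so all-in is the weaker player's best reply to defensive. The result is close to the first boundary in the table, and the mirrored value on the other side is close to the last one.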

This shows the importance of good matchmaking, as the most strategies are viable in even matches. The same will be the case for close matches in tournaments.

Surprisingly, in phase 2 we see the number of all-ins increase as the player faces worse opponents. It's caused by the rise of the standard strategy for the opposing player. That shows that the dynamics, even with just three strategies, might not be intuitive.

Another counter-intuitive thing is that a strategy being buffed against another can lead to the buffed strategy being used less. It will cause its counter-strategy to be used more and subsequently stifle the buffed strategy. Here is an interesting example where all-in is significantly improved but only against the standard strategy. This made all-in better for some ELO differences, but significantly worse at others where the defensive strategy became more frequent.

Changing all-in to be significantly better only against standard has big repercussions

For comparison, I will also include how it looks when all strategies are affected by randomness exactly the same way. In this case, all strategies stay viable at any ELO difference. However, I don't think that's a realistic assumption for most RTS games.

All strategies stay viable if all strategies are affected by randomness exactly the same way

The impact of randomness

Let's plot how winrate scales for certain strategies that are affected differently by randomness.

It's not surprising that the more a strategy is affected by randomness, the closer its winrate is to the 50% line. Flipping a coin would be fully on the 50% line regardless of the player's skill (-100% skill reduction).
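The flattening effect is easy to tabulate. This is a small sketch assuming the standard Elo expected-score formula, where `skill_reduction` (a name made up here) is the fraction of the rating difference removed by randomness:

```python
def winrate(elo_diff, skill_reduction):
    """Winrate when randomness removes a fraction of the rating difference."""
    effective = elo_diff * (1 - skill_reduction)
    return 1 / (1 + 10 ** (-effective / 400))


# 0% is a fully skill-driven game, 50% and 75% match one and two all-ins,
# and 100% is a coin flip that ignores skill entirely.
for reduction in (0.0, 0.5, 0.75, 1.0):
    row = [f"{100 * winrate(d, reduction):.0f}%" for d in (-400, -200, 0, 200, 400)]
    print(f"{reduction:>4.0%} reduction:", row)
```

Each row is the same S-shaped curve stretched horizontally, which is why a higher reduction keeps the winrate closer to 50% at every rating difference.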

If the effect of randomness on all-ins is set to zero, we get almost the same scaling.

This is an interesting result as well: mixed strategies scale pretty much the same as standard vs standard. You would think that by choosing strategies at random, you would introduce some randomness and move closer to the 50% line. But for this effect to become visible, I had to significantly increase the winrates of strategies that counter each other (from ~55% to 99% and 99.999999% winrates in the following two charts).

When strategies counter each other with a 99% winrate, the difference becomes visible and the game more random.
Other pure strategies are stacked under the all-in curve as there is no difference here.
This effect increases further with a 99.999999% winrate. Asymptotically, mixed strategies will go to the 50% line.

Final words

Thank you for reading. My main goal for this post was to see how I would implement and visualize this, and learn things along the way. And there were things I didn't expect – the effect of added randomness to a certain strategy, or sometimes non-obvious behavior of mixed strategies.

Here is the repository with my code. This includes naive solutions to 2x2 and 3x3 payoff matrices. If I wanted the project to scale to more strategies, I would be smarter about that or use a library. Overall it was a fun project and the charts look nice. This cannot be directly applied to a game like StarCraft II where players aren't fully rational agents, builds are more on a continuous scale, and there are other variables like balance, maps, and more.
