Thematic Solitaires for the Spare Time Challenged

A blog about solitaire games and how to design them. I'm your host, Morten, co-designer of solo modes for Scythe, Gaia Project, Wingspan, Glen More II, and others.

How to make AI difficulty levels feel satisfying

Morten Monrad Pedersen
I have published an updated version of this post [blogpost=132575]here[/blogpost]

I try to make my Automas (artificial opponents) so that playing against them feels like playing against a human player, but as mentioned in a recent blogpost there are exceptions to this. One such exception is related to simulating the skill level of human opponents.

Before we get to that, I should note that while I talk about ranges in VP for reasons of clarity, some of it also applies to games like Chess that don’t have scoring.

Human players vary a lot in skill level for all games, which means that in any game that rewards skill to a significant degree there’ll often be plays where one player wipes out the other.

This can lead to an unsatisfying experience for both players – in particular the loser. Still, though, it can feel “right” because the difference can be ascribed to a difference in skill and so the result will feel fair to many players.

Don’t replicate that in your solo AIs

If we put on our AI design hat, it might seem that since we want our AI to replicate the feel of playing against a human (well, that’s what I want) and since, as mentioned, human players vary wildly in skill level, we should also let the AI vary wildly in skill level. I.e. in games about scoring victory points (VP) the AI’s scores should vary as much as those of human players (for simplicity I’ll talk about VP, but what I write applies equally well to a VP-less game like Chess). That thought can lead us into problems, though.

Doing this can, in my opinion, easily lead to bad play experiences. The reason is that a cardboard AI doesn’t have skill in the same way as a human does, and therefore the relative skill level of the players normally won’t be as strong a factor in determining the result of a human vs. AI game as in a human vs. human game.

Instead of skill level, randomness will often be a major factor in determining the score of an AI. So, if an AI has random swings in score as large as the full range of human players, then randomness will make a huge difference to the outcome, because the specific player playing against the AI will have a much narrower range of scores.

Let’s take an example: In game X, I tend to score 41 to 50 VP, but human players in total score in the range of 1 to 100 VP and so the game’s AI also scores in that range. For simplicity let’s furthermore say that each of the Automa’s possible scores is equally likely and that my influence on the AI’s score is negligible.

So, there’s a 40% chance that the AI will lose no matter where in my interval I score and a 50% chance that I’ll lose no matter what. In total that means that in only 10% of the plays my performance has any impact on the outcome of the game, and in 90% it doesn’t matter what I do; instead it comes down to randomness. To me that would be very frustrating, and I’d either stop playing the game or switch to considering it a beat-your-own-high-score game with the AI being an obstacle, not an opponent to compete against.
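The percentages above can be checked with a few lines of Python. This is only a sketch of the made-up game X example: the function name outcome_split is my own, and it assumes, as in the example, that the AI's score is uniform over whole numbers and unaffected by my play.

```python
from fractions import Fraction


def outcome_split(ai_low, ai_high, my_low, my_high):
    """Split the AI's possible scores into three groups.

    Assumes the AI's score is uniformly distributed over the integers
    ai_low..ai_high and independent of the human player's score.
    Returns (AI always loses, AI always wins, human's play matters).
    """
    scores = range(ai_low, ai_high + 1)
    total = len(scores)
    always_loses = sum(1 for s in scores if s < my_low)
    always_wins = sum(1 for s in scores if s > my_high)
    contested = total - always_loses - always_wins
    return (
        Fraction(always_loses, total),
        Fraction(always_wins, total),
        Fraction(contested, total),
    )


# Game X: the AI scores 1-100 VP, I score 41-50 VP.
split = outcome_split(1, 100, 41, 50)
print(split)  # fractions 2/5, 1/2 and 1/10, i.e. 40%, 50% and 10%
```

Narrowing the AI's range in the call (for example to 36-55) shows how quickly the share of plays where my performance matters grows.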

If the AI’s score is instead mainly decided by how well the human player plays, then it’s a completely different situation, but then we’re no longer simulating the range of human skill, but instead one specific opponent skill level whose score is highly affected by the skill of the human player.

Don’t replicate the scoring range of a specific human skill level

Even if we don’t try to simulate the full range of all humans in one go and instead base each difficulty level on players of a specific skill level, we can still get into trouble.

It might seem fair and realistic to have the AI score within such a range, but unless the impact of the human player on the AI’s score is strong, the winner of the game will to a large extent be decided by the random score, not the human’s play. To most players there’ll be a big psychological difference between the result of the game being heavily influenced by the skill-dependent score of a human opponent and it being heavily influenced by a mainly random AI score in the same range.

It can be argued that psychological factors should be trumped by the actual mechanical reality, but the player’s experience and fun come from the psychological factors, not the actual reality, and so psychology must be taken very seriously.

Board Game: Patchwork: Automa

The robot army of the Patchwork Automa. Image credit: Qualith.

The fix: Low variation difficulty levels

Because of this, I always try to create difficulty levels that vary less in score than a human player of a specific skill level.

Luckily, as AI designers we have control of the scoring of the AI, both because we can control its behavior in a way that a designer of human-only competitive games can’t and because we can partially decouple the scoring of the AI from the way that players normally score, allowing us to shape the scoring in a balanced manner.

Thus, what we can do is to create a series of difficulty levels each of which mainly scores in a narrow interval. This means that by choosing a suitable difficulty level, the player can get a play experience that’s tense much more often than the norm in competitive human-only games.

I think that most players prefer games that are close, because of the tension that provides, but those who prefer to win/lose most of the time can also get that from a tight difficulty level system by picking a lower or higher level. This means that the player can tailor the game experience to their preference, and we can make sure that the human player’s skill is the major factor in deciding the winner (not accounting for the randomness of the game itself).

As an example, the “veteran” difficulty level of the Patchwork Automa had a scoring range of 38 VP, while 2 experienced playtesters I just picked at random had ranges of 49 and 53 VP. They of course each played far fewer plays than the Automa, and so would be expected to have gotten a larger range if they had played as many times as the Automa. So, the scoring spread of the Automa is probably close to half that of a human player.

A consequence of AIs with low variance is that if you don’t make multiple difficulty levels the AI will only be an interesting opponent for players of a specific skill level.

An example where this in my opinion goes wrong is Imperial Settlers, where the single difficulty level (IIRC) of the AI is so low that I think I lost to it once (maybe zero times) out of close to 30 plays. As with the hypothetical high variance AI mentioned above, I treat the Imperial Settlers AI as an obstacle, not an opponent, and the game as a beat-your-own-high-score game, not a competitive game.

In fairness, I’ll add that even though I think that this is an issue with the game, it’s still among my favorite solo games and I like it so much that I preordered the roll and write follow-up sight unseen.

Board Game: Imperial Settlers

The Imperial Settlers AI army. Image credit: Mirosław Gucwa.

How to achieve low variance difficulty levels

Achieving such narrow scoring intervals can be approached in multiple ways:

1) Assign the AI a fixed or slightly variable score at the beginning of the game. We did this in Viticulture where in the base game the Automa always scores 20 VP at the normal difficulty level. This makes it feel less like an actual player in the game and there’s less of a feeling of competing while playing. For these reasons we might change this if we were to make a new Automa for Viticulture.

On the other hand, that Automa was aimed at being extremely fast to run for the human player, so I’m not sure that we’d change it.

An additional downside of determining the AI’s score from the beginning is that it can reduce the tension towards the end of the game, because you might know by mid game whether you’ll beat that score or not.

2) To address this, you can generate the AI’s score at the end of the game. This still leaves out the feeling of competing while playing and it can feel like the winner of the game is determined by one random event at the end of the game, which can be deflating.

Both 1 and 2 could get narrow scoring ranges per difficulty level by having a base score for each level and then adding a random number from a fixed range. This feels a bit ham-fisted to me, though, and I prefer scores that are produced in a more organic manner, which the third option allows for.
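As a rough illustration of the "base score plus fixed range random number" idea from options 1 and 2, here is a minimal Python sketch. The level names, the base scores other than the 20 VP mentioned for Viticulture's normal level, and the spread of 2 VP are all made-up values for illustration, not numbers from any published Automa.

```python
import random

# Hypothetical base scores per difficulty level. Only the 20 VP for
# "normal" comes from the Viticulture example; the rest are invented.
BASE_SCORE = {"easy": 15, "normal": 20, "hard": 25}

# Hypothetical fixed range: the final score lands within +/- SPREAD VP
# of the base score for the chosen level.
SPREAD = 2


def automa_score(level, rng):
    """Return the base score for the level plus a bounded random offset."""
    return BASE_SCORE[level] + rng.randint(-SPREAD, SPREAD)


rng = random.Random()
print(automa_score("normal", rng))  # always somewhere in 18..22
```

The same function works whether it's called at the start of the game (option 1) or at the end (option 2); only the moment the player learns the number differs.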

3) The final route is to let the AI score during the game, which is a topic that deserves its own section.

Its own section

The most obvious way to let an AI score VP during the game is to have it do it in the exact same way as a human player. The problem with this is that because an AI typically behaves randomly, the scores can vary wildly, which is exactly what we want to avoid.

Let’s take Charterstone as an example. In that game players will have turns that net them maybe 2-8 VP and many that give 0. This happens because you spend some turns gathering resources, cards, etc. and then you take actions where what you just gathered is used to score VP, after which the cycle is repeated.


Made up example of the VP progress of 2 human players in a game of Charterstone.

Our Charterstone Automas each draw a card every turn determining what action the Automa takes that turn. If the scoring was modelled after that of a human player, we’d have cards that make the Automa take an action that gives it 2-8 VP and cards with actions that give 0. This would mean that in a game with 2 Automas one might get most of the VP action cards, the other might get most of the 0 VP action cards. The result would be a game with widely varying Automa scores, which would be frustrating for the player:


Made up example of the VP progress of 2 human players and 2 unbalanced Automas in a game of Charterstone.

The easiest way to fix this is to decouple the AI’s scoring from what’s going on in the game and let it score a small number of VP for each card independently of the action specified by the card. This is what we did for Charterstone.


Snippets showing VP gains (top icon) from 3 Charterstone Automa cards.

The one at the bottom of the stack indicates that the number of VP gained is based on the difficulty level. The result is that the Automas scored more smoothly than a human player and this meant that we could control their scores to be in the desired narrow ranges.


Made up example of the VP progress of 2 human players and 1 Automa with smoothened scoring in a game of Charterstone.

This balances the scoring allowing for a balanced and tense game, but the downside is that the Automas feel less like a human player than I’d prefer.
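The difference between scoring like a human and the smoothed, decoupled scoring can be illustrated with a small simulation. This is a sketch with made-up numbers, not the actual Charterstone card mix: I assume a 20 turn game, a 30% chance that a "human-style" card is a 2-8 VP action, and decoupled cards that each give 1 or 2 VP. Both variants average about 30 VP, so only the spread differs.

```python
import random
import statistics

TURNS = 20  # hypothetical game length


def coupled_score(rng):
    """Human-style scoring: most cards give 0 VP, some give 2-8 VP."""
    total = 0
    for _ in range(TURNS):
        if rng.random() < 0.3:  # hypothetical share of VP action cards
            total += rng.randint(2, 8)
    return total


def decoupled_score(rng):
    """Smoothed scoring: every card gives a small number of VP."""
    return sum(rng.choice((1, 2)) for _ in range(TURNS))


rng = random.Random(1)
coupled = [coupled_score(rng) for _ in range(2000)]
decoupled = [decoupled_score(rng) for _ in range(2000)]

coupled_sd = statistics.pstdev(coupled)
decoupled_sd = statistics.pstdev(decoupled)
print(f"coupled:   range {min(coupled)}-{max(coupled)}, sd {coupled_sd:.1f}")
print(f"decoupled: range {min(decoupled)}-{max(decoupled)}, sd {decoupled_sd:.1f}")
```

Under these assumptions the decoupled scores can never leave the 20-40 VP interval, while the coupled ones can in principle land anywhere from 0 to 160 VP.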

Using handicaps

An alternative to the above approaches is to keep the AI’s scoring independent of the difficulty level and instead give the human player a lower or higher number of rounds to gain VP. We did this in Viticulture by giving the human either one more or one less round to play, for a range of 6-8 rounds.

This is actually realistic, because the game ends after any player passes a specific number of VP, which means that good players consistently end the game in fewer rounds than bad players (with the 6 to 8 round range being fairly accurate).

Conversely, we varied the Automa’s number of turns in Gaia Project, while keeping the human player’s number of turns entirely dependent on their skill.

Instead of altering the number of turns, you can give the player or the AI a boost or handicap. We’re using this approach in Stonemaier’s upcoming civilization game, but care needs to be taken not to take the game experience too far away from what its designer intended.

Play length

In games where the end is triggered by player actions instead of after a fixed number of turns, the same issues apply as for scoring. In Scythe for example, the game ends after any player has placed 6 stars, each of which is triggered by 1 action, but in most cases it takes multiple turns to prepare for that star triggering turn, which is the same pattern as the gather resources for X turns then convert them into VP like in Charterstone.

If we had put a “gain 1 star” action on a number of Automa cards and none on the majority, then we’d risk an Automa ending the game on turn 6 if it drew a star card each turn, compared to the roughly 20 turns it takes for a human player. On the flipside, it might take the Automa 30 turns, which is beyond what it takes for a non-beginner human player.

Because we’re only talking about 6 stars, not 30 VP as in Charterstone (which as the campaign progresses can easily go to beyond 100 VP per play), we couldn’t give the Automa a small but controlled number of stars per turn, because then the game would be over by turn 3.

The solution was to let the Automa gain fractions of a star each turn. We didn’t want to make the player add 2/7ths of a star to the 3 and 3/4ths the Automa already had, though, so instead we had about 80% of the Automa cards advance a token one space on a track. At specific points along that track the Automa would place a star, which gave us tight control of the Automa’s star placing pace, and difficulty levels could be varied by having separate tracks.
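The track mechanism can be sketched in a few lines of Python. The track layout below (stars on spaces 4, 8, 11, 14, 16 and 18) and the card sequence are invented for illustration and are not the real Scythe Automa track; the point is only that a star can never come earlier than its space number, no matter how the cards fall.

```python
# Hypothetical track: the spaces on which the Automa places a star.
STAR_SPACES = {4, 8, 11, 14, 16, 18}


def star_turns(star_spaces, advance_flags):
    """Return the turn numbers on which the Automa places a star.

    advance_flags holds one boolean per turn: True if the card drawn
    that turn advances the token one space on the track.
    """
    position = 0
    turns = []
    for turn, advances in enumerate(advance_flags, start=1):
        if advances:
            position += 1
            if position in star_spaces:
                turns.append(turn)
    return turns


# Made-up draw order where 4 out of every 5 cards (80%) advance the token.
flags = [True, True, True, True, False] * 6
print(star_turns(STAR_SPACES, flags))  # → [4, 9, 13, 17, 19, 22]
```

Even in the luckiest case where every card advances the token, the sixth star on this made-up track can't arrive before turn 18, compared with turn 6 for the "gain 1 star" card approach.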

Board Game: Scythe

A Scythe Automa star track in the bottom right corner. Image credit: Guilherme Lima.

Euphoria is another game we worked on where the game ends after any player has placed a specific number of stars - in this case 10. There we did make star placing cards, but we avoided the extreme swings by having a small deck of Automa cards and multiple cycles through it. This meant that we could control the number of cards needed before 10 star-cards have been drawn to a range of our choice by altering the star card/non-star card ratio in the deck and size of the deck.
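Here is a small Python sketch of why cycling through a small deck bounds the swings. The deck composition (8 cards, 3 of them star cards) is a made-up example and not the real Euphoria Automa deck, and I assume the deck is reshuffled after each full pass.

```python
import random


def draws_until_stars(deck_size, star_cards, stars_needed, rng):
    """Count cards drawn until stars_needed star cards have come up.

    The deck is reshuffled after every full pass (an assumption of
    this sketch), so each pass yields exactly star_cards stars.
    """
    deck = [True] * star_cards + [False] * (deck_size - star_cards)
    draws = 0
    stars = 0
    while True:
        rng.shuffle(deck)
        for is_star in deck:
            draws += 1
            if is_star:
                stars += 1
                if stars == stars_needed:
                    return draws


rng = random.Random()
results = [draws_until_stars(8, 3, 10, rng) for _ in range(1000)]
# Three full passes always give 9 stars in 24 cards, and the 10th star
# is among the first 6 cards of the fourth pass: always 25 to 30 draws.
print(min(results), max(results))
```

Changing the deck size or the number of star cards in the call moves that window, which is the lever described above.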

For both Scythe and Euphoria the star placing pace of a human player ramps up during the game. To mimic that, the star track of Scythe had the stars placed with fewer and fewer spaces between them, while in Euphoria we added cards to the Automa deck after each pass through it, so that the fraction of star cards increased over time.

Such systems would work in exactly the same way for VP scoring, and in Charterstone we could have increased the value of the variable VP icon to ramp up the Automa’s scoring pace if that had been needed. Conversely, such a variable icon system could have been used in Scythe to increase the number of spaces advanced on the star track each time instead of placing the star spaces closer and closer to each other.


In the end we can boil the above 2500 words down to this to avoid frustratingly random outcomes of games against AIs:

1) Make multiple difficulty levels,
2) that vary less in scores than a human player of a specific skill level.
3) A good way to achieve this is to decouple the AI’s scoring from its actions
4) and make it consistently score close to the average for a human player of specific skill levels.
5) Another approach is to use handicaps and boosts for either the human, the AI or both, but take care not to alter the game too much.
Fri Jul 12, 2019 1:23 pm