Subject: Are adventures balanced? Ask statistics!

Niels Taatgen
Netherlands
Groningen
Groningen
Using the reported wins and losses on the Days of Wonder site, it is possible to calculate whether scenarios are likely to be balanced. What I did was use the binomial function to calculate the 95% confidence interval. That is, for each adventure I will give a range of percentages, and there is a 95% probability that the true win percentage is in that range.

For example:

1. Agincourt: 66 wins out of 107 for the standard banners means that the 95% confidence interval is 52%-71%. This means the true probability of winning for the French is somewhere between 52% and 71%, with a 5% probability that it is outside this range. Agincourt is therefore probably not fair, because 50% is not within the confidence interval.

Here are the others. The confidence interval always refers to the probability that the standard bearers win, and the numbers are based on the information on the DOW site on 30 December. I will post an update when the numbers have increased enough to improve the reliability (assuming people are interested).

2. First Chevauchee 32 out of 42, 46%-71%
3. Burgos 23 out of 29, 62%-90%
4. Deeper in Castille 15 out of 19, 57%-91%
5. Wizards and Lore 11 out of 24, 28%-65%
6. A complex web 17 out of 24, 51%-85%
7. Crisis in Avignon 4 out of 11, 15%-65%
8. A Burgundian Chevauchee 3 out of 15, 7%-45%
9. Free companies on war footing 7 out of 16, 23%-67%
10. Assaulting the Tourelles 8 out of 13, 36%-82%

As you can see, quite a few adventures do not look balanced given the current reports. It is of course unlikely that the probability of winning is exactly 50%, so hopefully the intervals can inform you how reasonable the odds are (and they will become more precise with more data).
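For readers who want to reproduce ranges like these, here is a minimal sketch. The thread does not say which formula was used; the Wilson score interval is assumed here because it appears to reproduce several of the listed rows (for example 3 out of 15 giving 7%-45%), though other rows may differ by a point. The function name `wilson_interval` is made up for illustration.

```python
import math

def wilson_interval(wins, games, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 for ~95%).

    Assumption: this is one common formula, not necessarily the one used
    to produce the numbers in the post.
    """
    if games <= 0:
        raise ValueError("games must be positive")
    p_hat = wins / games
    denom = 1 + z * z / games
    centre = (p_hat + z * z / (2 * games)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / games
                         + z * z / (4 * games * games)) / denom
    return centre - half, centre + half

low, high = wilson_interval(3, 15)   # A Burgundian Chevauchee
print(f"{low:.0%}-{high:.0%}")       # prints 7%-45%
```

An adventure would then be flagged as possibly unbalanced when 50% falls outside the returned range.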
 
Kevin Duke
United States
Wynne
Arkansas
Some kind of "understandable communications" class would be useful as well.
Paul DeStefano
United States
Long Island
New York
The samples are tiny. The margin of error in 15 games will be huge.
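To put a rough number on that, here is a small sketch using the usual normal approximation, assuming a win rate near 50% (the worst case for the margin). The function name is hypothetical and the approximation is crude for very small samples.

```python
import math

def worst_case_margin(games, z=1.96):
    """Approximate 95% margin of error for a win rate near 50%.

    Uses z * sqrt(p * (1 - p) / n) with p = 0.5, i.e. the widest case.
    """
    return z * math.sqrt(0.25 / games)

for n in (15, 50, 150, 500):
    print(f"{n:>4} games: +/- {worst_case_margin(n):.1%}")
```

With 15 games the margin comes out at roughly 25 percentage points either way, and it shrinks only with the square root of the number of games.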
 
Brian

One thing to consider is that some of these missions were not intended to be balanced but rather to show off a new rule or feature, as they are mostly tutorial in nature. For example, the whole point of the Complex Web mission seems to be to show the value of a creature (if they wanted it more balanced they should have given it to the other side).

In fact, only the last couple seemed to be "real missions", in the sense of them not being tutorial in nature. Not surprisingly, they seem to be the most balanced of the lot.
 
Bob Gallagher
United Kingdom
Halesowen
kduke wrote:
Some kind of "understandable communications" class would be useful as well.


Thanks Kevin!

My keyboard is now wearing the coffee I was drinking...
Niels Taatgen
Netherlands
Groningen
Groningen
Warpstorm wrote:
One thing to consider is that some of these missions were not intended to be balanced but rather to show off a new rule or feature, as they are mostly tutorial in nature. For example, the whole point of the Complex Web mission seems to be to show the value of a creature (if they wanted it more balanced they should have given it to the other side).

In fact, only the last couple seemed to be "real missions", in the sense of them not being tutorial in nature. Not surprisingly, they seem to be the most balanced of the lot.


I agree, but it is nevertheless useful to know the odds, so you can give the side with the highest probability of winning to the novice player.
 
Niels Taatgen
Netherlands
Groningen
Groningen
OK, I made a graphic to better illustrate the issue. Below are all the adventures in BattleLore. The dot in each row represents the current estimate of the probability that the standard bearers will win. But of course, this can be a bad estimate, because it is based on only a limited number of games, and random chance may distort it. So the bars to the left and right indicate the range that the real probability is in. As we are dealing with statistics, nothing is certain, but there is a 95% probability that it is within this range.



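[Image: chart showing, for each adventure, the estimated win probability for the standard bearers as a dot, with bars for the 95% range]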
 
gary rembo
United Kingdom
brighton
Alaska
Having played a lot of Memoir 44, I think that, as in Memoir, the trick is to alternate sides: whenever you play, you play two games in a session, changing sides. This keeps things nice and fair and prevents the "your army was better than mine" syndrome. Myself, I love the challenge of trying to overcome a superior foe. The victory is that much sweeter.
 
Niels Taatgen
Netherlands
Groningen
Groningen
Update for January 3, 2007


With some more reports coming in, we can narrow the margins:

1. Agincourt 91 out of 150, 53%-68%
2. First Chevauchee 44 out of 79, 45%-66%
3. Burgos 27 out of 40, 52%-80%
4. Deeper in Castille 20 out of 27, 55%-87%
5. Wizards and Lore 17 out of 37, 31%-62%
6. A complex web 26 out of 38, 53%-81%
7. Crisis in Avignon 8 out of 16, 28%-72%
8. A Burgundian Chevauchee 5 out of 17, 13%-53%
9. Free companies on war footing 9 out of 20, 26%-66%
10. Assaulting the Tourelles 11 out of 16, 44%-86%

As you can see, the intervals for the early adventures are starting to narrow. Agincourt is definitely unbalanced, but not dramatically: something like 60/40 in favor of the French. First Chevauchee seems nicely balanced, but the two goblin adventures (Burgos and Deeper in Castille) favor the non-goblin player, although further data should show to what extent. Wizards and Lore appears to be pretty well balanced, but the first Spider adventure, A complex web, appears to favor the Spider player (or is it the non-goblin player?). The final four adventures need more data: all of them could still be balanced.
 
Stephan Rasmussen
Denmark
Odense C
Having played the first 3 scenarios many times, I would agree with the statements, except that in my games the English tend to win a little more often than the French in Agincourt. Scenario 2, "The First Chevauchee", is nicely balanced because each side has the same troops, and in scenario 3 I have yet to find a winning strategy for the goblin side, so good work on that. The next step for your statistical analysis would be to somehow make a strength record for each different unit: for example, how many green goblins would it take to equal one red infantry? If you can do that with your statistics knowledge it would be awesome.

Once that task is done, all you would have to do is type in what units each side has and you would get how balanced the scenario is.

(Then all there is left to consider is victory conditions, strategy, terrain, number of command cards and so on.)
 
Niels Taatgen
Netherlands
Groningen
Groningen
Stradk wrote:
The next step for your statistical analysis would be to somehow make a strength record for each different unit: for example, how many green goblins would it take to equal one red infantry? If you can do that with your statistics knowledge it would be awesome.

Once that task is done, all you would have to do is type in what units each side has and you would get how balanced the scenario is.


Well, my statistics knowledge is enough to know that such an analysis is as good as impossible. What you can do is pit one unit against the other, and figure out the odds that one unit will win. But I would rather use a simulation to calculate it. In a real game various factors interact, and the probability of winning will be a combination of setup and strategy. Simulation might do the trick here again, but you need some AI to play each side in a way that is similar to human players.

This is of course all good news, because a game that can be analyzed too easily is probably not a very good game.
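As a rough illustration of the unit-versus-unit simulation idea, here is a toy Monte Carlo sketch. Everything in it is a placeholder assumption: the figure counts, dice per attack and hit chances are not taken from the BattleLore rules, and the two units simply trade attacks until one is eliminated, ignoring terrain, retreats, cards and everything else that matters in a real game.

```python
import random

def duel_win_probability(figures_a, figures_b, dice_a, dice_b,
                         hit_a, hit_b, trials=10_000, seed=0):
    """Estimate how often unit A eliminates unit B when A attacks first.

    All inputs are hypothetical: dice_x is the number of dice rolled per
    attack, hit_x the chance that a single die removes one enemy figure.
    """
    if hit_a <= 0 and hit_b <= 0:
        raise ValueError("at least one unit must be able to score hits")
    rng = random.Random(seed)
    wins_a = 0
    for _ in range(trials):
        a, b = figures_a, figures_b
        attacker_is_a = True
        while a > 0 and b > 0:
            if attacker_is_a:
                b -= sum(rng.random() < hit_a for _ in range(dice_a))
            else:
                a -= sum(rng.random() < hit_b for _ in range(dice_b))
            attacker_is_a = not attacker_is_a
        wins_a += b <= 0   # the loop ends with exactly one side eliminated
    return wins_a / trials

# Two identical made-up units: the only asymmetry is who strikes first.
print(duel_win_probability(4, 4, 3, 3, 1 / 3, 1 / 3))
```

Even this toy version shows why a per-unit strength number is hard to pin down: the answer already depends on who attacks first, before any real strategy enters the picture.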
 
David desJardins
United States
Burlingame
California
niels wrote:
Using the reported wins and losses on the Days of Wonder site, it is possible to calculate whether scenarios are likely to be balanced. What I did was use the binomial function to calculate the 95% confidence interval. That is, for each adventure I will give a range of percentages, and there is a 95% probability that the true win percentage is in that range.


With all due respect, that is NOT what a confidence interval means. This is, unfortunately, the most misunderstood subject in all of statistics.

You can't compute the probability that the win percentage is in any particular interval, from the data you have. You would need to have the prior probability distribution of the win percentage, which you don't.
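To make the role of the prior concrete, here is a minimal sketch of what the calculation could look like if one is willing to assume a prior. A uniform prior on the win rate is used purely as an illustration, in which case the posterior is Beta(wins + 1, games - wins + 1). The function name and the grid approximation are choices made for this sketch so that only the standard library is needed.

```python
import math

def credible_interval(wins, games, mass=0.95, grid=20_000):
    """Equal-tailed posterior interval for the win rate, uniform prior assumed.

    The Beta(wins + 1, games - wins + 1) posterior is approximated on a
    grid of midpoints rather than with an exact quantile function.
    """
    points = [(i + 0.5) / grid for i in range(grid)]
    log_w = [wins * math.log(p) + (games - wins) * math.log(1 - p)
             for p in points]
    peak = max(log_w)                      # rescale to avoid underflow
    weights = [math.exp(v - peak) for v in log_w]
    total = sum(weights)
    tail = (1 - mass) / 2
    low = high = None
    running = 0.0
    for p, w in zip(points, weights):
        running += w
        if low is None and running >= tail * total:
            low = p
        if high is None and running >= (1 - tail) * total:
            high = p
            break
    return low, high

low, high = credible_interval(66, 107)     # Agincourt, first report
print(f"{low:.1%} to {high:.1%}")
```

Only under such an assumed prior does a statement of the form "95% probability that the win rate lies in this range" follow from the data; a different prior would give a different range.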
 
Matthew M
United States
New Haven
Connecticut
DaviddesJ wrote:
niels wrote:
Using the reported wins and losses on the Days of Wonder site, it is possible to calculate whether scenarios are likely to be balanced. What I did was use the binomial function to calculate the 95% confidence interval. That is, for each adventure I will give a range of percentages, and there is a 95% probability that the true win percentage is in that range.


With all due respect, that is NOT what a confidence interval means. This is, unfortunately, the most misunderstood subject in all of statistics.

You can't compute the probability that the win percentage is in any particular interval, from the data you have. You would need to have the prior probability distribution of the win percentage, which you don't.


I imagine niels is taking the average result and creating an interval based upon taking two standard deviations above and below that mean. So it's really a credible interval, rather than confidence. But only stat geeks will care about the distinction.

-MMM
 
Niels Taatgen
Netherlands
Groningen
Groningen
DaviddesJ wrote:
niels wrote:
Using the reported wins and losses on the Days of Wonder site, it is possible to calculate whether scenarios are likely to be balanced. What I did was use the binomial function to calculate the 95% confidence interval. That is, for each adventure I will give a range of percentages, and there is a 95% probability that the true win percentage is in that range.


With all due respect, that is NOT what a confidence interval means. This is, unfortunately, the most misunderstood subject in all of statistics.

You can't compute the probability that the win percentage is in any particular interval, from the data you have. You would need to have the prior probability distribution of the win percentage, which you don't.


I guess you are right. What I should have said is that if we had taken multiple samples of the same size, 95% of the confidence intervals would have contained the true probability. If I have time left over I will calculate the right distributions assuming a uniformly distributed prior. But will anyone still understand me?
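The repeated-sampling reading can be checked with a small simulation. This is only a sketch under stated assumptions: a true win rate is fixed in advance (0.6 here, an arbitrary choice), many samples of the same size are drawn, and the simple normal-approximation interval (estimate plus or minus 1.96 standard errors) is computed for each. Any other 95% method could be swapped in; the function names are made up.

```python
import math
import random

def normal_interval(wins, games, z=1.96):
    """Normal-approximation interval: estimate +/- z standard errors."""
    p_hat = wins / games
    half = z * math.sqrt(p_hat * (1 - p_hat) / games)
    return p_hat - half, p_hat + half

def coverage(true_p, games, trials=2_000, seed=1):
    """Fraction of simulated samples whose interval contains true_p."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        wins = sum(rng.random() < true_p for _ in range(games))
        low, high = normal_interval(wins, games)
        hits += low <= true_p <= high
    return hits / trials

# Assumed true win rate of 60% and 107 games per sample.
print(coverage(0.6, 107))
```

The printed fraction should land close to 0.95. That is a statement about the procedure over many samples, not about any single computed interval.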
 
John Harley
Canada
Toronto
Ontario
this thread rocks.

i subscribed, hopefully there will be more updates as the DOW numbers pile in.
 
Niels Taatgen
Netherlands
Groningen
Groningen
Update for January 9, 2007

1. Agincourt 119 out of 197, 53%-67%
2. First Chevauchee 63 out of 112, 47%-65%
3. Burgos 42 out of 61, 56%-79%
4. Deeper in Castille 29 out of 42, 54%-81%
5. Wizards and Lore 26 out of 52, 37%-63%
6. A complex web 32 out of 49, 51%-77%
7. Crisis in Avignon 11 out of 21, 32%-72%
8. A Burgundian Chevauchee 7 out of 19, 19%-59%
9. Free companies on war footing 12 out of 23, 33%-71%
10. Assaulting the Tourelles 15 out of 26, 39%-74%

Conclusion: no major changes since the previous update, just some narrowing of the margins.
 
Joe Grundy
Australia
Sydney
NSW
One quick point... this technique is making assumptions about the games that are logged. While I confess I've never even seen BattleLore, it seems likely that at least some of the plays are logged by one pair of people playing rematches. It seems likely that in at least some cases there are two people playing the same scenario repeatedly with the same players taking the same sides.

Which means the relative skill of the players logging the results is likely a significant factor producing a win/loss skew in at least some cases.


btw the other way the original poster could be correct is if you change the word "calculate" to the word "formulate".

He would be correct to assume a binomial distribution of win/loss.

And hence I believe he would also be correct (terminology quibbles aside) to assert that his intervals contain the "actual" win percentages with 95% probability. Except for (at least) the sampling bias I just noted.
 
Niels Taatgen
Netherlands
Groningen
Groningen
jgrundy wrote:
One quick point... this technique is making assumptions about the games that are logged. While I confess I've never even seen BattleLore, it seems likely that at least some of the plays are logged by one pair of people playing rematches. It seems likely that in at least some cases there are two people playing the same scenario repeatedly with the same players taking the same sides.

Which means the relative skill of the players logging the results is likely a significant factor producing a win/loss skew in at least some cases.


The rules encourage alternating sides when replaying a scenario, so I don't think that there is a danger that the percentages are biased too much by that. The bias that I am worried about is the fact that more experienced players may tend to pick the side with the lower odds, giving the beginners the better odds. If that happens systematically, balance problems will be underestimated.
 
Ryan Langton
United States
niels wrote:
Update for January 3, 2007

Agincourt is definitely unbalanced, but not dramatically, something like 60/40 in favor of the French.


You do realize many people play Agincourt with the full rules (battle-back, take ground, pursuit), even though the text says not to use them? Without these rules I'd say Agincourt is balanced.
 
tom-le-termite
United States
ft lauderdale
Florida
Hey all,

Is there a way to have updated stats with the new official scenarii?

(me? picky? nooooooo ^^ )

 
Niels Taatgen
Netherlands
Groningen
Groningen
Update for May 1, 2007

1. Agincourt 297 out of 497, 55%-64%
2. First Chevauchee 139 out of 275, 45%-56%
3. Burgos 110 out of 181, 54%-68%
4. Deeper in Castille 123 out of 164, 68%-81%
5. Wizards and Lore 119 out of 231, 45%-58%
6. A complex web 98 out of 161, 53%-68%
7. Crisis in Avignon 43 out of 108, 31%-49%
8. A Burgundian Chevauchee 38 out of 81, 36%-58%
9. Free companies on war footing 48 out of 88, 44%-65%
10. Assaulting the Tourelles 44 out of 83, 42%-63%

With many more datapoints for all of the adventures, the earlier conclusions have held with one exception: Crisis in Avignon now seems to favor the pennant bearers. Otherwise, five of the ten adventures are clearly balanced or close to balanced.
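A quick way to check the "five of the ten" tally from the win counts alone is to test each adventure against an even 50/50 split. The sketch below uses the score-test z statistic, which is an assumption of this example rather than the method used for the intervals above; the win and game counts are the ones listed in this update.

```python
import math

# Standard-banner wins and games from the May 1, 2007 update.
MAY_2007 = {
    "Agincourt": (297, 497),
    "First Chevauchee": (139, 275),
    "Burgos": (110, 181),
    "Deeper in Castille": (123, 164),
    "Wizards and Lore": (119, 231),
    "A complex web": (98, 161),
    "Crisis in Avignon": (43, 108),
    "A Burgundian Chevauchee": (38, 81),
    "Free companies on war footing": (48, 88),
    "Assaulting the Tourelles": (44, 83),
}

def z_against_even(wins, games):
    """Score-test z statistic for the hypothesis that the win rate is 50%."""
    return (wins - games / 2) / math.sqrt(games / 4)

for name, (wins, games) in MAY_2007.items():
    z = z_against_even(wins, games)
    verdict = "consistent with 50/50" if abs(z) < 1.96 else "looks lopsided"
    print(f"{name:<30} {wins / games:6.1%}  z = {z:+.2f}  {verdict}")
```

Adventures with |z| below 1.96 are the ones whose data are still compatible with an even game at the 95% level.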
 
Niels Taatgen
Netherlands
Groningen
Groningen
Here are some statistics for the new adventures. I haven't included the Epic adventures yet, because there are too few datapoints for them.

Two bridges 24 out of 41, 43%-72%
Crossing the Rhone 18 out of 29, 44%-77%
Hill Camp 33 out of 62, 41%-65%
West of the Rhone 14 out of 27, 34%-69%
Brignals 6 out of 19, 15%-54%
The battle of Brignals 5 out of 7, 36%-92%
The battle of Lewes 11 out of 19, 36%-77%

Conclusions: All of them are still potentially balanced, but more data will have to tell.
 
tom-le-termite
United States
ft lauderdale
Florida
thanks for the update
 
Lee LaFond
United States
Thanks for the stats Niels! Very interesting info to know. I agree that aside from some sampling/response bias that these intervals are good to go by. Just a basic one proportion z confidence interval on a binomial variable, and I don't see anything unusual here.

I'll have to run these by my AP Stats kids I teach at my high school. This would be a good real life example of stats in action.
Jacques Marcotte
United States
Chicago
IL
Do you have any further update? Or do you have a spreadsheet / program used to calculate these? I'd love to be able to run it myself. I'm a bit of a stats geek myself and love the numbers. I'd also love to do these on my own and learn a bit more about the process... in the process.
 