
Baseball Highlights: 2045» Forums » General

Subject: Gems Of Wisdom From CPU Stengel

Peter Kossits
Canada
Montreal
Quebec
Hi Guys,

The artificial intelligence from the app has been churning out numbers all weekend as he begins to understand this game more and more.

The problem I wanted to solve was deciding just how powerful Immediate Actions like Cancel 1 Hit, Cancel 2 Hits and Change Hits To Walks really are, so I asked the computer to run an analysis.

First, I decided to find out what the average Threatened Hits were for all cards that had any sort of offense. I set up an empty diamond with a 0-0 score and then ran the threatened hits for every card in the base game one after the other and scored the end positions.

I'm scoring a run as 1 point. For baserunners I'm doing:
First Base: 0.12
Second Base: 0.37
Third Base: 0.50

If Wifflebot came to bat and got his 3 hits, it would be scored 0.99.
And if a player who has a single Home Run came to bat and got his hit, it would be scored 1.00.
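As a rough sketch of this scoring scheme in Python: the function name `score_threatened_hits` is made up for illustration, and the base-running rule is an assumption (every runner advances exactly as many bases as the hit, ignoring runner speed), so this is not the app's actual code.

```python
# Weights from the post: a run is 1 point, a stranded runner is worth less.
BASE_VALUE = {1: 0.12, 2: 0.37, 3: 0.50}

def score_threatened_hits(hits):
    """Score a card's hits, played in order onto an empty diamond.

    `hits` is a list of base counts (1 = single, 2 = double, 3 = triple,
    4 = home run). Assumption: every runner advances exactly as many
    bases as the hit; runner speed is ignored.
    """
    runners = []  # bases currently occupied
    runs = 0
    for bases in hits:
        # Existing runners move up, and the batter ends on base `bases`.
        advanced = [b + bases for b in runners] + [bases]
        runs += sum(1 for b in advanced if b >= 4)
        runners = [b for b in advanced if b < 4]
    return round(float(runs) + sum(BASE_VALUE[b] for b in runners), 2)

print(score_threatened_hits([1, 1, 1]))  # three singles, bases loaded: 0.99
print(score_threatened_hits([4]))        # solo home run: 1.0
```

Under those assumptions the two examples above come out to 0.99 and 1.00, matching the numbers in the post.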

For all of the standard cards that have at least one threatened hit, the average offensive value is 0.634 or roughly a runner on first and third if that average batter came to bat with the bases empty.

If I toss in the cards for the Robots, Cyborgs and Naturals expansions, the average offensive value goes up slightly to 0.677.

Now I decided to run the simulation again but assuming that a "Cancel 1 Hit" card had been played on the player. If I do that, the average offensive values are affected as follows:

Without Expansions: 0.634 -> 0.094
With Expansions: 0.677 -> 0.117

If your opponent can play a Cancel 1 Hit card on each of his plays, the average offensive card goes down from runners at 1st and 3rd to more or less a runner at first! If the expansions are used, the effect of Cancel 1 Hit is slightly less powerful, but on the whole the Immediate Action reduces offense by a factor of 6-7!

And the overall numeric value of a Cancel 1 Hit action would be
0.634 - 0.094 = 0.54 or roughly the same as a Triple on the offensive side of things!

It's actually a bit more powerful than that. This analysis has Cancel 1 Hit being played against random cards and average cards, but in a real game you will be able to try and target it against the above average cards your opponent is playing....hopefully.

How about Cancel 2 Hits? The famous Magna Gloves. I ran the sim again, but canceling the 2 most powerful hits on each player. The average offensive values are affected as follows:

Without Expansions: 0.634 -> 0.005
With Expansions: 0.677 -> 0.015

The action shuts down the average offense almost completely: an average batter can expect at most about a tenth of a single against a Magna Glove. Note that the surviving offense is 3 times higher with the expansions than without them. Most of the players who can resist a Magna Glove are in the expansions.

Lastly, how about playing Change All Hits To Walks against a random batter? The average offensive values are affected as follows:

Without Expansions: 0.634 -> 0.296
With Expansions: 0.677 -> 0.304

With this action, the average of first and third comes down to roughly a double. It's interesting that, on average, Cancel 1 Hit leaves only about a third as much offense standing as Change All Hits To Walks (0.094 vs. 0.296). The value of this Immediate Action, 0.634 - 0.296 = 0.338, is roughly the same as a Double when mapped to equivalents on the offensive side.
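To put the three Immediate Actions side by side, here is a small sketch that only redoes the subtraction on the base-game numbers quoted in this post. The names `RESULTS` and `action_value` are made up for illustration.

```python
# (before, after) average offensive values from the post, base game only.
RESULTS = {
    "Cancel 1 Hit": (0.634, 0.094),
    "Cancel 2 Hits": (0.634, 0.005),
    "Change All Hits To Walks": (0.634, 0.296),
}

def action_value(before, after):
    """Offense removed by the action, in the same run-equivalent units."""
    return round(before - after, 3)

for name, (before, after) in RESULTS.items():
    removed = action_value(before, after)
    print(f"{name}: removes {removed}, leaves {after / before:.0%} of the offense")
```

Measured this way, Cancel 1 Hit removes 0.54 and Change All Hits To Walks removes 0.338, so the gap in removed offense is smaller than the gap in what is left standing.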
Martin G
United Kingdom
Bristol
Quote:
I'm scoring a run as 1 point. For baserunners I'm doing:
First Base: 0.12
Second Base: 0.37
Third Base: 0.50


How did you arrive at these values? Personally, I think a runner on first is worth a lot more than an eighth of a run.
 
Peter Kossits
Canada
Montreal
Quebec
I tweaked a little to be more easily able to tell if threatened hits were sufficient to score a run or not.

I started out with the linear:
1st base = 0.25
2nd base = 0.5
3rd base = 0.75

Then I realized that bases loaded would sum up to 1.5 and I didn't like that - I wanted bases loaded to come in at just under 1. So I chopped 33% off of each of the above numbers.

First Base: 0.17
Second Base: 0.33
Third Base: 0.5

And then I decided to shift a couple of points over from first base to second base because it's far easier to score from 2nd.
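The steps above can be retraced in a few lines. This is only a sketch: it treats "chopped 33%" as multiplying by 2/3, and the final values are the ones quoted in the first post rather than anything computed.

```python
linear = {"1st": 0.25, "2nd": 0.50, "3rd": 0.75}

# Step 1: take a third off each value so bases loaded no longer exceeds a run.
scaled = {base: round(v * 2 / 3, 2) for base, v in linear.items()}

# Step 2: the values used in the first post, after moving weight
# from first base over to second base.
final = {"1st": 0.12, "2nd": 0.37, "3rd": 0.50}

for name, weights in [("linear", linear), ("scaled", scaled), ("final", final)]:
    print(name, weights, "bases loaded =", round(sum(weights.values()), 2))
```

Bases loaded comes out at 1.5 for the linear weights, 1.0 after scaling, and 0.99 for the final values, which is the "just under 1" target.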
 
Byron S
United States
Ventura
California
I'd say that bases loaded should count as more than one. As long as the runner on 2nd isn't slow, any hit will score at least 2 runs.
Peter Kossits
Canada
Montreal
Quebec
Unless you just played your last card. Then bases loaded doesn't matter any more and is definitely worth less than a run.

Anyways, I'm using these ratios because as I mentioned, the AI likes to know quickly whether a run will score or not. The scoring formula is mainly being used to quickly compare the effects of different cards being played and it's been doing great for that purpose so far.

I have a feeling if you refine, you will end up with slightly different values but the same conclusions for the relative effects of the Immediate Actions.



 
Peter Kossits
Canada
Montreal
Quebec
Next, CPU Stengel was curious about the value of the Clutch Immediate Action. Since that action can only be used when there is a runner in scoring position, he would need to know just how often in the game a runner will be in scoring position.

When it's a player's turn to play a card, there are 8 possible combinations of runners on the bases:

Bases Empty
Runner On First
Runner on 2nd
Runners on 1st and 2nd
Runner on 3rd
Runners on 1st and 3rd
Runners on 2nd and 3rd
Bases Loaded


The clutch situations are all but the first two: any position with a runner on 2nd or 3rd.

If you assume a uniform distribution among those 8 states, they would each occur 12.5% of the time, and 75% of the time CPU Stengel would be able to make use of a Clutch ability.

But is the distribution uniform? Probably not. So CPU Stengel started recording the board position every time it was a team's turn to play a card and this is what he came up with:

Bases Empty: 26.2%
Runner On First: 21.9%
Runner On 2nd: 8.5%
Runners On 1st and 2nd: 18.1%
Runner On 3rd: 3.7%
Runners On 1st and 3rd: 9.5%
Runners On 2nd and 3rd: 9.1%
Bases Loaded: 4.3%


Over 48% of the time that it is a team's turn to play a card the bases will either be empty or there will only be a runner on first base. So, the Clutch ability on average can only be used 52% of the time.
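Here is a quick sketch of that calculation. The percentages as listed add up to 101.3 rather than 100 (presumably rounding or a typo somewhere), so the hypothetical helper `clutch_share` normalizes by the total instead of assuming 100.

```python
# Observed share of turns per base state, as listed in the post.
OBSERVED = {
    "empty": 26.2, "1st": 21.9, "2nd": 8.5, "1st+2nd": 18.1,
    "3rd": 3.7, "1st+3rd": 9.5, "2nd+3rd": 9.1, "loaded": 4.3,
}
NOT_CLUTCH = {"empty", "1st"}  # no runner in scoring position

def clutch_share(dist):
    """Fraction of turns with a runner on 2nd or 3rd, normalized by the total."""
    total = sum(dist.values())
    clutch = sum(v for state, v in dist.items() if state not in NOT_CLUTCH)
    return clutch / total

uniform = {state: 12.5 for state in OBSERVED}
print(f"uniform:  {clutch_share(uniform):.1%}")
print(f"observed: {clutch_share(OBSERVED):.1%}")
```

Either way of counting lands at about 52%, well below the 75% a uniform distribution would suggest.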

Next, he's going to calculate just how much more powerful a Clutch Double is than a Clutch Single.


Peter Kossits
Canada
Montreal
Quebec
I'm not going to show all the math, but having the probabilities of all of the base-runner combinations has allowed CPU Stengel to calculate the value of a Clutch Double Immediate Action as being 1.18...roughly a Home Run followed by a single.

I'd have to double-check, but it looks like this is the highest valued offensive Immediate Action in all of the cards supported by the app so far. You'd think that BIG MO with his Quick Eye Home Run might come close but that's only rated at 0.233. The odds of the Cyborg needed to trigger it pull it down. It's far more likely to have a Clutch situation on your own board than it is for your opponent to play a Cyborg on his previous card.

By comparison a Clutch Single comes in at a still very respectable 0.892 - just under a home run.

A really fun bit of analysis was done for the Teamwork Cards.

Kevin Pedroia hits an Immediate single for every Natural played before him. CPU Stengel rates the power of that card as being 0.73 which is roughly between 2 and 3 average speed base-runner singles.

By contrast, the more expensive Fred Fisk who hits a home run if 3 Naturals had been played before him only has his immediate action rated as 0.294. There is only a ~30% chance of getting 3/4/5 Naturals in your hand not including Fisk himself. You can of course, and should, skew that through your purchases on the Free Agent market if you do go for Fisk. Fisk also has a consolation threatened hit single which adds a bit to his overall value.
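For anyone who wants to play with the Fisk odds, here is a hedged sketch using a plain hypergeometric draw. Everything about the setup is an assumption on my part rather than something taken from the app: 14 other cards in the deck, 5 other cards in hand, and the order the cards are played in is ignored. The function name `prob_at_least` is made up.

```python
from math import comb

def prob_at_least(naturals_in_deck, needed=3, other_cards=14, hand=5):
    """P(at least `needed` Naturals among `hand` cards drawn from `other_cards`).

    Assumptions: Fisk is already in hand, the other `hand` cards are drawn
    at random from the `other_cards` remaining cards, and play order is ignored.
    """
    total = comb(other_cards, hand)
    favorable = sum(
        comb(naturals_in_deck, k) * comb(other_cards - naturals_in_deck, hand - k)
        for k in range(needed, hand + 1)
    )
    return favorable / total

for n in range(4, 8):
    print(n, "Naturals in deck ->", round(prob_at_least(n), 3))
```

Under these assumptions the ~30% figure sits between having 5 and 6 Naturals among the other 14 cards. Multiplying a probability like that by the 1.00 value of a home run would give a rating close to the 0.294 quoted above, though that reading of the rating is a guess.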

Next up. What's more valuable - Pickoff or Double Play? The results may be a little surprising.

We'll also take a look at the Curve, Fastball and Spitball actions. These all cancel all hits on a particular type of batter and it's tempting to think that they're all pretty much equivalent when choosing a new player for your team and with no knowledge of the opponent's cards. They're not!

Peter Kossits
Canada
Montreal
Quebec
Not sure if there's much interest for these Stengelmetrics, but he/it has just compiled a list of the 20 best players in the base game, taking into account the individual values of threatened hits, offensive and defensive immediate actions and revenue/purchasing power.

There may still be some bugs in there, but it's looking pretty good.

#20 - Brooks Nettles - Score: 1.57
#19 - Troy Jeter - Score: 1.61
#18 - Babe Bench - Score: 1.62
#17 - Speedbot - Score: 1.63
#16 - Sprint 36 - Score: 1.74
#15 - Speedo 42 - Score: 1.74
#14 - Juan Spahn - Score: 1.82
#13 - Hideo Tanaka - Score: 1.83
#12 - Moose Giambi - Score: 1.93
#11 - See Ya - Score: 2.00
#10 - Z Bat - Score: 2.00
#9 - Max Verlander - Score: 2.07
#8 - Kong 35 - Score: 2.07
#7 - Barry Sosa - Score: 2.18
#6 - Catfish Carlton - Score: 2.22
#5 - Ty Terry - Score: 2.26
#4 - Willie McGwire - Score: 2.30
#3 - Model T - Score: 2.38
#2 - Mickey Maris - Score: 2.56


.....and

#1 - 5 Tool Model - Score: 2.70
Mark Delano
United States
Stamford
Connecticut
peterk1 wrote:
I tweaked a little to be more easily able to tell if threatened hits were sufficient to score a run or not.

I started out with the linear:
1st base = 0.25
2nd base = 0.5
3rd base = 0.75

Then I realized that bases loaded would sum up to 1.5 and I didn't like that - I wanted bases loaded to come in at just under 1. So I chopped 33% off of each of the above numbers.

First Base: 0.17
Second Base: 0.33
Third Base: 0.5

And then I decided to shift a couple of points over from first base to second base because it's far easier to score from 2nd.


I ran some simulations based on the following:

Each deck 40% Free Agents, 60% Starters
Each player with 3 cards left to play.
Play through with all possible runners currently on base (from none to full, with slow to fast).
Random card play, but 10000 trials in each situation.

I think this is roughly the midpoint of a 7 game series with 3 buy rounds. Looking at the numbers, here is how the runners on each base translate into runs:

1st Base
Slow: .33
Ave: .4
Fast: .54

2nd Base
Slow: .56
Ave: .66
Fast: .72

3rd Base
Slow: .66
Ave: .7
Fast: .75

Bases Loaded
All Slow: 1.7
All Ave: 1.98
All Fast: 2.17


It looks like you are undervaluing runners on base. Also bases loaded is worth more than the sum of single runners on 1st, 2nd and 3rd by ~.15. I assume that is the combined influence of making walks score runs and reducing the opportunity for Pick Off or Double Play to remove all the runners on base.

Given this it's obviously sensitive to the current situation. If I run it with 1 card left to play:

1st Base
Slow: .1
Ave: .14
Fast: .2

2nd Base
Slow: .25
Ave: .33
Fast: .35

3rd Base
Slow: .33
Ave: .37
Fast: .38

Bases Loaded
All Slow: 0.75
All Ave: 0.93
All Fast: 1.03


And again with 5 cards left to play:

1st Base
Slow: .52
Ave: .59
Fast: .75

2nd Base
Slow: .74
Ave: .8
Fast: .86

3rd Base
Slow: .82
Ave: .87
Fast: .87

Bases Loaded
All Slow: 2.3
All Ave: 2.4
All Fast: 2.6


The interesting element is what stays consistent. The bases loaded bonus is pretty constant and the benefit for 2nd vs. 3rd base isn't that large compared to the jump from 1st to 2nd.
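To check the bases loaded bonus across all three runs, here is a small sketch that just subtracts the single-runner values from the bases loaded value, using the numbers from the tables above. The names `SIM` and `loaded_bonus` are made up for illustration.

```python
# cards left -> speed -> (1st, 2nd, 3rd, bases loaded), from the tables above.
SIM = {
    1: {"slow": (0.10, 0.25, 0.33, 0.75),
        "ave":  (0.14, 0.33, 0.37, 0.93),
        "fast": (0.20, 0.35, 0.38, 1.03)},
    3: {"slow": (0.33, 0.56, 0.66, 1.70),
        "ave":  (0.40, 0.66, 0.70, 1.98),
        "fast": (0.54, 0.72, 0.75, 2.17)},
    5: {"slow": (0.52, 0.74, 0.82, 2.30),
        "ave":  (0.59, 0.80, 0.87, 2.40),
        "fast": (0.75, 0.86, 0.87, 2.60)},
}

def loaded_bonus(first, second, third, loaded):
    """Bases loaded value minus the sum of the three single-runner values."""
    return round(loaded - (first + second + third), 2)

for cards_left, by_speed in SIM.items():
    for speed, values in by_speed.items():
        print(cards_left, "cards left,", speed, "->", loaded_bonus(*values))
```

With these figures the bonus ranges from about 0.07 (1 card left, slow runners) to 0.22, so it is always positive and of a similar size, though not identical from case to case.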

A big part of the reason that runners on base are valuable is how difficult it is to get rid of them. There's Pick Off (quite rare) and Double Play (more common but not plentiful). Defense generally revolves around stopping the hits in the first place.
Peter Kossits
Canada
Montreal
Quebec
frunkee wrote:

It looks like you are undervaluing runners on base. Also bases loaded is worth more than the sum of single runners on 1st, 2nd and 3rd by ~.15.


Again, like I mentioned above - there's a practical programming shortcut reason for my doing that. It's saving me from writing a ton of hairy additional code in a bunch of places. As long as it's done uniformly everywhere, and I still get the behavior I want, I'm OK with it. The effect of having a run always be worth more than the score of a bases loaded position makes CPU Stengel more runs-hungry and makes him want to score quickly.

 
Ed Ray
First, I think that the AI plays really well and is very well executed.
Just a thought ---- which may be silly.
I think that the danger with any set of AI algorithms is that the player can get a real feel for exactly how the AI plays. Kind of like always playing against the same person. Given that there are in fact multiple approaches to deciding how AI should work, perhaps you could have several sets of values (especially considering that some of them are subjective or at least somewhat subjectively determined), and the AI would randomly choose which set of values to use for any given game (or even any given move). That way you can both split the difference for any of the debates about the exact relative value of various conditions, while at the same time making the AI's approach a bit less predictable.
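Something like the following is what I have in mind, as a minimal sketch. The two weight sets are taken from numbers already posted in this thread, but the names `WEIGHT_SETS` and `pick_weights` are invented and this says nothing about how the app is actually structured.

```python
import random

# Two candidate sets of baserunner weights taken from this thread.
WEIGHT_SETS = {
    "run_hungry": {1: 0.12, 2: 0.37, 3: 0.50},  # Peter's values
    "simulated": {1: 0.40, 2: 0.66, 3: 0.70},   # Mark's average-speed values, 3 cards left
}

def pick_weights(rng):
    """Choose one named weight set for the coming game (or move)."""
    name = rng.choice(sorted(WEIGHT_SETS))
    return name, WEIGHT_SETS[name]

rng = random.Random(2045)  # seeded only so the example is repeatable
name, weights = pick_weights(rng)
print(name, weights)
```

Calling `pick_weights` once per game keeps a single game internally consistent, while calling it once per move would make the AI harder to read at the cost of less coherent play.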
Just my 2 cents, keep up the good work!
 
Mark Delano
United States
Stamford
Connecticut
peterk1 wrote:
Again, like I mentioned above - there's a practical programming shortcut reason for my doing that. It's saving me from writing a ton of hairy additional code in a bunch of places. As long as it's done uniformly everywhere, and I still get the behavior I want, I'm OK with it. The effect of having a run always be worth more than the score of a bases loaded position makes CPU Stengel more runs-hungry and makes him want to score quickly.



I guess it depends on what use you are putting this evaluation to. A hit is not a runner, at least not yet, so a runner at first isn't the same as playing a card that threatens a single. Moving the runner around the bases only increases their value, so it shouldn't ever cause it to be considered a loss to try to play hits to advance them.

I'd also be wary of considering quick scoring that valuable. You can always try to score first thing by playing a HR, but usually you'd rather get runners on the bases first and use the HR to push them home.
Peter Kossits
Canada
Montreal
Quebec
26halo26 wrote:
First, I think that the AI plays really well and is very well executed.


Well, what you have right now is CPU Stengel Easy.

I'm working on the Normal Level right now - which is why I'm getting all of these crazy numbers and sharing them. With all of the data I have right now I am able to make CPU Stengel play an all out offense game or an all out defense game or a financial game. CPU Stengel can also now predict the outcome of the game right at the beginning, so he may start trash-talking a little. He's accurate about 60% of the time at the moment.

Right after I generated the Top-20 list that I posted above, I did a top-20 Moneyball list: the best players for the cheapest amount of money.

After almost a week of crunching numbers and doing prep work, I finally played my first games against the new normal AI this afternoon.

The first group of series went 4-1, 4-0 and 4-0 for me and I really wasn't happy. The Easy level can beat me about 20% of the time but even when it loses it can usually push the series deep.

I thought that the AI was undervaluing the defensive immediate actions so I pumped those up quite a bit and played another series. I got pretty well thumped by the AI 4-2. It looks good but I'll know for sure tomorrow.


 
Brian Bankler
United States
San Antonio
Texas
frunkee wrote:

I'd also be wary of scoring quickly to be considered that valuable. You can always try to score first thing by playing a HR, but usually you'd rather get runners on the bases first and use the HR to push them home.

Agreed, and that's not even considering the influence of the Rally Cap expansion. With the right deck of closers (hold), an early HR can be brutal, but it's pretty rare to get multiples.
 
Peter Kossits
Canada
Montreal
Quebec
frunkee wrote:
A hit is not a runner, at least not yet, so a runner at first isn't the same as playing a card that threatens a single.


Very true. The work needed to do a true statistical evaluation of the value of threatened hits is really, really mind-boggling. You need to sim them against all of the possible combinations of base positions (8 of them), take the baserunner speed into account (which causes the combinations to explode quite a bit), and then take into account the probability of cards that cancel or mutate the hits in some way and do all of it over again for each of those. I did part of that work for triples and home-runs because you can ignore baserunner speed for those for the most part, and it was incredibly time consuming. I ended up sticking with the "simple" way of doing it - assuming the threatened hits happen and while the bases are empty.
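To give a feel for how fast the combinations grow, here is a small counting sketch. It assumes three speed classes (slow, average, fast), as in Mark's simulation above, and it only counts board states; it does not model any card effects.

```python
from itertools import product

BASES = 3

# Ignoring speed: each base is either empty or occupied.
plain_states = list(product([False, True], repeat=BASES))

# With speed: each base is empty or holds a slow, average or fast runner.
speed_states = list(product(["empty", "slow", "average", "fast"], repeat=BASES))

print(len(plain_states))  # 8
print(len(speed_states))  # 64
```

That is already an 8-fold increase before any cancel or mutate cards are layered on top, and each of those multiplies the count again.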

That lets me compare threatened hits of player A and player B, but makes it hard/impossible to really compare the value of player A's threatened hits to Player B's defensive immediate action.
 
Renato Borges
Brazil
Rio de Janeiro
Rio de Janeiro
peterk1 wrote:
Not sure if there's much interest for these Stengelmetrics, but he/it has just compiled a list of the 20 best players in the base game, taking into account the individual values of threatened hits, offensive and defensive immediate actions and revenue/purchasing power.

There may still be some bugs in there, but it's looking pretty good.

#20 - Brooks Nettles - Score: 1.57
#19 - Troy Jeter - Score: 1.61
#18 - Babe Bench - Score: 1.62
#17 - Speedbot - Score: 1.63
#16 - Sprint 36 - Score: 1.74
#15 - Speedo 42 - Score: 1.74
#14 - Juan Spahn - Score: 1.82
#13 - Hideo Tanaka - Score: 1.83
#12 - Moose Giambi - Score: 1.93
#11 - See Ya - Score: 2.00
#10 - Z Bat - Score: 2.00
#9 - Max Verlander - Score: 2.07
#8 - Kong 35 - Score: 2.07
#7 - Barry Sosa - Score: 2.18
#6 - Catfish Carlton - Score: 2.22
#5 - Ty Terry - Score: 2.26
#4 - Willie McGwire - Score: 2.30
#3 - Model T - Score: 2.38
#2 - Mickey Maris - Score: 2.56


.....and

#1 - 5 Tool Model - Score: 2.70


I like this list, but it should change based on the situation. A great deck is one that nullifies your opponent or has a great combo. If my opponent bought Barry Sosa, I would rather pick a Fastball player than Babe Bench, even though the latter is a better overall player.

My only concern about the AI's play is the lack of combos and some bad timing choices. Sometimes the AI didn't play a great defensive card, saving it for the end of the minigame, but then it ended up never playing it!
 