Oliver Kiley (Mezmorki), United States
So I have a bit of a love affair with Excel. I’m sorry, but it’s true. I’m not a super mathy-type but I do enjoy working with statistics and thinking about data. Probably has something to do with my background in science eh?
Anyway, since I’ve been on BGG I’ve had a lot of interest in running some basic analysis on the game data, focused on the ranked games. While statistics can certainly be manipulated to say just about anything, I still find it intriguing to examine the numbers, make graphs, and pontificate on imagined importance.
So what have I done? I’ve assembled a massive Excel file containing game data for all 8000+ ranked games in the BGG database. I’ve started whipping up some graphs below (which we will get into), but I’m also curious about what other people are interested in seeing. Any correlations between factors? Trends? Summary statistics?
But first, I need to give credit where it is due. First, I need to thank Andrea Nand (n_and) for his awesome tool BGG2nanDECK. Among other things, this tool allows you to load a list of game IDs from a collection, a geeklist, or manually entered data and then collect a vast array of information about those games. This data can be exported to Excel file formats.
I also need to thank Luis Olcese (lolcese) for writing a script to collect the game IDs for all the ranked games. While that is awesome and got me started, n_and has since added a feature to BGG2nanDECK that lets you download data for games ranked from X to Y. Wow!
So the discussion below is based on analyzing 8082 ranked games from data collected on 2012-07-13. Here we go!
First up for this post, I wanted to start looking at the data behind the sub-domains. The first chart (below) shows the distribution of sub-domains across the 8000+ ranked games.
Bear in mind that many games can be listed in multiple sub-domains. But it is worth noting that there are a lot more wargames and a lot fewer thematic games than I would have guessed.
So the majority of games are falling into one sub-domain (above), but there is a fair amount of games with two sub-domains listed as well.
As an additional point of consideration, I looked at the sub-domains as a breakdown of ownership (below). In cases where a game was part of more than one sub-domain, it split its ownership evenly, counting 50% towards one sub-domain and 50% towards the other (or 33% each for games in three sub-domains).
So while strategy + family games (the "euro-games"?) constitute about 22% of the total number of games, in terms of ownership here on BGG they account for 50% of all games owned. Compare this against wargames: nearly 1/4 of the ranked games are wargames, yet they make up only 15% of the owned games.
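For anyone who wants to redo the ownership split outside of Excel, here is a minimal Python sketch of the even-split rule described above. The function name, the input layout (an owned count plus a list of sub-domains per game), and the sample numbers are all made up for illustration; they are not taken from the actual spreadsheet.

```python
from collections import defaultdict

def ownership_by_subdomain(games):
    """Split each game's owned count evenly across its sub-domains.

    games: iterable of (owned_count, [subdomain, ...]) tuples
           (a hypothetical layout, not the real file's columns).
    Returns each sub-domain's share of all owned copies (0..1).
    """
    totals = defaultdict(float)
    for owned, subdomains in games:
        if not subdomains:
            continue  # a game with no sub-domain contributes nothing
        share = owned / len(subdomains)  # 50% each for two, 33% each for three
        for sd in subdomains:
            totals[sd] += share
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {sd: value / grand_total for sd, value in totals.items()}

# Made-up sample rows, purely illustrative.
sample = [
    (1000, ["Strategy", "Family"]),
    (300, ["Wargames"]),
    (600, ["Thematic", "Strategy", "Family"]),
]
shares = ownership_by_subdomain(sample)
for name, share in sorted(shares.items()):
    print(f"{name}: {share:.1%}")
```

Because every owned copy is counted exactly once, the shares always add up to 100%, which is what makes the ownership chart comparable to the simple count of games per sub-domain.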
Next, I was curious to explore which subdomains tended to be associated with which other subdomains. The table below provides the raw number of associations as well as the percentage of the total number of games within each subdomain that are associated with another one.
I've highlighted a few interesting cells. Namely that around 21-22% of strategy games are cross-listed with family games. 25% of party games are cross-listed with family games (not surprising). And that in total nearly half (43%) of Thematic games are associated with either Strategy, wargames, or family games. This perhaps explains the total confusion in attempting to label thematic games on BGG; there is a lot of cross-over.
Overall though, wargames have the purest sub-domain association (i.e. associated the most with themselves only), followed by customizable, abstract, and children’s games. Thematic and family games have the most cross-over with other domains, as nearly 50% of those games are cross-listed. Party and strategy games are around 62%.
For what it's worth, Startup Fever is the lone game that is listed in both the thematic and abstract categories. Huh-what!?
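The cross-listing table discussed above can be tallied along these lines. This is only a sketch: the function name, the input layout (one list of sub-domains per game), and the sample rows are assumptions for illustration, and the percentages are computed relative to the first sub-domain of each pair, which is my reading of how the table was built.

```python
from collections import Counter
from itertools import combinations

def cross_listing(games):
    """Count how often sub-domains appear together on the same game.

    games: iterable of sub-domain lists, one list per game
           (a hypothetical layout, not the real file's columns).
    Returns (pairs, percent):
      pairs[(a, b)]   = number of games listed under both a and b
      percent[(a, b)] = that count as a fraction of all games in a
    """
    totals = Counter()  # number of games per sub-domain
    pairs = Counter()   # number of games per ordered pair (a, b)
    for subdomains in games:
        unique = sorted(set(subdomains))
        totals.update(unique)
        for a, b in combinations(unique, 2):
            pairs[(a, b)] += 1
            pairs[(b, a)] += 1
    percent = {(a, b): n / totals[a] for (a, b), n in pairs.items()}
    return pairs, percent

# Made-up sample rows, purely illustrative.
sample = [
    ["Strategy", "Family"],
    ["Strategy"],
    ["Party", "Family"],
    ["Family"],
    ["Strategy", "Family"],
]
pairs, percent = cross_listing(sample)
print(pairs[("Strategy", "Family")], f"{percent[('Strategy', 'Family')]:.0%}")
```

Note that the percentage is not symmetric: the share of strategy games that are also family games is generally different from the share of family games that are also strategy games, since the two sub-domains differ in size.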
I was particularly interested at the start of this endeavor in how the mechanics were breaking down across the ranked games. So the first chart is just a basic histogram of the distribution of mechanics. I found it interesting (although I guess not terribly surprising) how much die rolling shows up as a mechanic. Many of the more popular euro-style mechanics (e.g. worker placement) are naturally much lower on the list, reflecting the smaller number of games released to date using those mechanics compared to others in use for decades (or centuries).
Next up, and for my own understanding, I was curious to see how many games listed multiple mechanics. So here's a little chart (below), depicting that.
Lastly, I performed a rather heavy operation to look at the distribution of ratings for each mechanic, including the mean rating, +/- 1 standard deviation, and the 1st and 99th percentile extremes. So the crazy chart below shows all of this.
All time lowest mechanic? Roll + Move, followed closely by singing. I guess that explains why Cranium is such a hit with this group right? It likewise isn't surprising to see Worker Placement + Deck/Pool building take the #1 and #2 spots for the highest average. These are the latest two "trends" mechanically speaking, so I suppose the newness of them is pushing up the ratings quite a bit.
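If you want to reproduce the per-mechanic summary yourself, a rough Python sketch might look like the following. The input layout (a rating plus a list of mechanics per game), the function name, and the sample ratings are hypothetical. I am also assuming the sample standard deviation and interpolated percentiles here; the spreadsheet may have used slightly different formulas.

```python
from collections import defaultdict
from statistics import mean, quantiles, stdev

def rating_summary(games):
    """Summarize the rating distribution for each mechanic.

    games: iterable of (rating, [mechanic, ...]) tuples
           (a hypothetical layout, not the real file's columns).
    A game with several mechanics counts once under each of them.
    """
    by_mechanic = defaultdict(list)
    for rating, mechanics in games:
        for mechanic in mechanics:
            by_mechanic[mechanic].append(rating)

    summary = {}
    for mechanic, ratings in by_mechanic.items():
        if len(ratings) < 2:
            continue  # stdev and quantiles need at least two data points
        # 99 cut points; the first and last are the 1st and 99th percentiles.
        cuts = quantiles(ratings, n=100, method="inclusive")
        summary[mechanic] = {
            "mean": mean(ratings),
            "sd": stdev(ratings),
            "p01": cuts[0],
            "p99": cuts[-1],
        }
    return summary

# Made-up sample rows, purely illustrative.
sample = [
    (5.0, ["Dice Rolling", "Roll and Move"]),
    (6.0, ["Dice Rolling"]),
    (7.0, ["Dice Rolling", "Worker Placement"]),
    (4.5, ["Roll and Move"]),
]
stats = rating_summary(sample)
for mechanic, row in sorted(stats.items()):
    print(mechanic, {key: round(value, 2) for key, value in row.items()})
```

Mechanics with only a single game are skipped, since a spread cannot be computed from one rating.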
One thing that caught my eye in the distributions above is that the low tail tended to be much more compressed than the high end. What is going on with the ratings overall that might explain this observation? To help answer, I took a look at a histogram of the rating distribution, and here is what I found:
Basically, you have a huge volume of ratings in the middle (5.5 to 6). The ratings on the high end drop off relatively gradually towards the upper end of the range. The ratings on the bottom however drop off very quickly. Overall there are just not that many games getting low scores below 5.25 or so, a few dozen here and there only.
For what it's worth, BGG has a weight rating, some combination of rules complexity and strategic depth that remains forever ambiguous. That said, the data is interesting. First up, a brief histogram of the weight distribution.
That's nice isn't it?
Now, one question I see pop up quite a bit is whether the weights of games have gone up or down over time. Of course there are a lot of factors that go into this next piece of data, most importantly "who" is assigning the weight values. My guess is that wargamers are going to give a heavy euro a lower weight rating than a family gamer would, because people's perception of weight is relative to their own experience. That said, here is the chart:
Essentially, the weight ratings are all over the place until things start to stabilize in the 1970's. Perhaps this coincides with the growing hobby game market? Overall, I think the trend is interesting, showing a gradual climb higher and higher all the way up to the present.
Again, I'm not sure what explains this. It is hard to look at this data and say objectively whether games are in fact harder, because the weight ratings are so dependent on the users rating the games. But I suppose it is valid to say, from the perspective of this sample population, the weight of games is increasing on average over time.
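The by-year averaging behind a chart like this is simple to sketch in Python. The function name, the (year, value) input layout, and the sample numbers are invented for illustration. The minimum-games cutoff is my own addition, not something I applied in the chart; it is there to show one way of taming the early years, where a handful of games per year can swing the average wildly.

```python
from collections import defaultdict
from statistics import mean

def mean_by_year(games, min_games=1):
    """Average a per-game value (e.g. weight) for each publication year.

    games: iterable of (year_published, value) tuples
           (a hypothetical layout, not the real file's columns).
    min_games: skip years with fewer games than this (an assumed cutoff).
    """
    by_year = defaultdict(list)
    for year, value in games:
        by_year[year].append(value)
    return {
        year: mean(values)
        for year, values in sorted(by_year.items())
        if len(values) >= min_games
    }

# Made-up sample rows, purely illustrative.
sample = [
    (1965, 1.5), (1965, 3.5),
    (1995, 2.0),
    (2010, 2.5), (2010, 3.0), (2010, 3.5),
]
print(mean_by_year(sample))               # every year
print(mean_by_year(sample, min_games=2))  # only years with 2+ games
```

The same function works for any per-game number, so it could just as well be pointed at the ratings instead of the weights.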
Last, it's probably worth examining the relationship between weight and rating. Are heavier games rated higher? Here are the results:
This terrible looking chart isn't too convincing. But I ran a regression analysis to check for statistical significance and there IS a statistically significant correlation between weight and rating; that is, as weight goes up one can expect the BGG rating to go up as well. Of course this isn't a hard and fast rule, but it's interesting to think about.
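For the curious, here is a small Python sketch of one way to run that kind of check: an ordinary least-squares slope, the Pearson correlation, and the t-statistic used to test whether the correlation differs from zero. This is an assumed, textbook version of the test, not necessarily the exact analysis Excel ran, and the sample weights and ratings are made up.

```python
import math

def weight_rating_fit(weights, ratings):
    """Least-squares slope, Pearson r, and t-statistic for r != 0.

    weights, ratings: equal-length sequences of numbers
                      (hypothetical inputs, not the real data).
    The t-statistic has n - 2 degrees of freedom.
    """
    n = len(weights)
    if n != len(ratings) or n < 3:
        raise ValueError("need two equal-length sequences of at least 3 values")
    mean_w = sum(weights) / n
    mean_r = sum(ratings) / n
    sxx = sum((w - mean_w) ** 2 for w in weights)
    syy = sum((r - mean_r) ** 2 for r in ratings)
    sxy = sum((w - mean_w) * (r - mean_r) for w, r in zip(weights, ratings))
    if sxx == 0 or syy == 0:
        raise ValueError("both variables must vary")
    slope = sxy / sxx                # rating points gained per point of weight
    r = sxy / math.sqrt(sxx * syy)   # Pearson correlation coefficient
    remainder = 1 - r * r
    if remainder <= 0:
        t = math.copysign(math.inf, r)  # perfect correlation
    else:
        t = r * math.sqrt((n - 2) / remainder)
    return slope, r, t

# Made-up sample values, purely illustrative.
weights = [1.0, 2.0, 3.0, 4.0, 5.0]
ratings = [5.5, 6.0, 5.8, 6.6, 7.0]
slope, r, t = weight_rating_fit(weights, ratings)
print(f"slope={slope:.2f}, r={r:.2f}, t={t:.2f}")
```

With thousands of games, an absolute t-value above roughly 2 corresponds to significance at the usual 5% level. That also explains how a messy-looking scatter can still be statistically significant: with a sample this large, even a weak correlation clears the bar.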
For our last wonderful chart, let's look at the average rating of games by year.
Ratings really drop off prior to 1970. No love for Monopoly it seems; at least not here at BGG. Things are fairly flat (but slowly rising) up until the 1990's, at which point the ratings start climbing more swiftly. The last few years mark a definite up-tick in the ratings as well. Are games really getting better? Or are people just overly infatuated with the newest hotness? Of course this data doesn't tell us that.
For those wanting to download and play around with the data yourself, here is the link to the file:
BGG Ranked Game Data (2012-07-13)
That's all for now. If you have things you want me to look at, let me know. I have a little laundry list of things I want to spend more time playing with, so I'll post things below as they emerge.
Musings on games, design, and the theory of everything. www.big-game-theory.com