Subject: BGG's Thumb on the Scale?

Emmit Svenson (Illinois, United States)
I've always assumed that KD:M wasn't ranked higher in BGG's "Geek Rating" rankings because fewer people have rated it than have rated other games with lower average ratings.

However, I've just noticed that KD:M is ranked below "A Feast for Odin", which has both fewer ratings and a lower average rating than KD:M.

Does that mean there is something beyond our user ratings factored into these rankings?
Richard Sampson (Bothell, Washington, United States)
Yes, there is more to the ranking than simply average rating and number of ratings, though I don't think it is publicly known exactly what the algorithm is or what all of the factors are. If you search around, you can find discussions about this, such as this one.
L S (Germany)
https://en.wikipedia.org/wiki/Bayesian_average
 
Klutz (Quebec, Canada)
I'm pretty sure that recent ratings carry more weight than older ratings.

KD:M probably got a ton of votes when the first KS shipped, and then a trickle of new votes as people purchase it at retail / eBay. I imagine it'll get another huge influx of ratings when the current KS ships wave 1, and it'll shoot up the BGG rankings.
 
Emmit Svenson (Illinois, United States)
Pretty sure it can't be Bayesian, with a smaller sample getting heavier weighting.
 
L S (Germany)
emmit svenson wrote:
Pretty sure it can't be Bayesian, with a smaller sample getting heavier weighting.

According to the FAQ (https://boardgamegeek.com/wiki/page/BoardGameGeek_FAQ), the BGG ranking system employs some sort of "Bayesian averaging" to prevent games with few votes from topping the overall ratings. The constant used for the Bayesian average depends on the variation in the dataset, which indicates the degree to which users are in agreement or disagreement about a game's quality, i.e. the objectivity/reliability of the user scores. Since user ratings for A Feast for Odin have a standard deviation of 1.37 compared to KDM's 1.73, the former is adjusted towards the theoretical average of 5.5 less strongly than the latter.

(Copied from last week's thread about KDM's ratings at https://boardgamegeek.com/article/24214118#24214118)
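
For anyone curious what that looks like mechanically, here is a minimal sketch of a "dummy vote" Bayesian average. The prior mean of 5.5 matches the FAQ's description, but the dummy-vote count and the example ratings are made-up illustration values, not BGG's actual constants:

```python
def bayesian_average(ratings, prior_mean=5.5, dummy_votes=100):
    """Mix `dummy_votes` imaginary ratings of `prior_mean` into the real
    ratings; games with few real votes get pulled towards the prior."""
    return (sum(ratings) + dummy_votes * prior_mean) / (len(ratings) + dummy_votes)

# Two hypothetical games with the same plain average of 8.0:
few_votes = [8.0] * 200      # 200 ratings
many_votes = [8.0] * 20000   # 20,000 ratings

print(round(bayesian_average(few_votes), 2))   # ~7.17, pulled well below 8.0
print(round(bayesian_average(many_votes), 2))  # ~7.99, barely adjusted
```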
Richard Sampson (Bothell, Washington, United States)
Randombias wrote:
emmit svenson wrote:
Pretty sure it can't be Bayesian, with a smaller sample getting heavier weighting.

According to the FAQ (https://boardgamegeek.com/wiki/page/BoardGameGeek_FAQ), the BGG ranking system employs some sort of "Bayesian averaging" to prevent games with few votes from topping the overall ratings. The constant used for the Bayesian average depends on the variation in the dataset, which indicates the degree to which users are in agreement or disagreement about a game's quality, i.e. the objectivity/reliability of the user scores. Since user ratings for A Feast for Odin have a standard deviation of 1.37 compared to KDM's 1.73, the former is adjusted towards the theoretical average of 5.5 less strongly than the latter.

(Copied from last week's thread about KDM's ratings at https://boardgamegeek.com/article/24214118#24214118)
I am fairly certain that is not the complete algorithm, though. I believe there is something else to offset the 1s/10s wars that people get into, especially for popular games, and something that weights the timing of the rating (as previously mentioned).

Edit: BTW, if you take a look at the GeRBIL rating thing I linked to above, which I am pretty sure uses Bayesian averaging, you will see KD:M is actually one of the games most off on BGG from where it should be, with that guy ranking it 56 back when he did it.
 
L S (Germany)
ras2124 wrote:
I am fairly certain that is not the complete algorithm, though. I believe there is something else to offset the 1s/10s wars that people get into, especially for popular games, and something that weights the timing of the rating (as previously mentioned).

Edit: BTW, if you take a look at the GeRBIL rating thing I linked to above, which I am pretty sure uses Bayesian averaging, you will see KD:M is actually one of the games most off on BGG from where it should be, with that guy ranking it 56 back when he did it.

There's certainly more to BGG's algorithm; the FAQ says as much. Yet the process of Bayesian averaging by itself thoroughly explains the dissonance perceived by the OP. While truncation wouldn't be completely out of the question, Bayesian averaging usually does a pretty good job of addressing the same problem in a dataset, so even if "1/10-wars" are truncated, the effect would be redundant. And I don't see how anybody would deem time an important factor, outside of perhaps assigning a lower weight to ratings that are assigned prior to the game being published.

As for the GeRBIL guy, I don't see anything in his table that takes variance/deviation/reliability into account, so I don't know where a Bayesian constant could possibly come from. However, I have to admit that I lost interest in his calculations as soon as he divided relative rank differences by absolute ranks.
 
Richard Sampson (Bothell, Washington, United States)
Randombias wrote:
And I don't see how anybody would deem time an important factor, outside of perhaps assigning a lower weight to ratings that are assigned prior to the game being published.
Time is important because very few people update their ratings. Therefore a lot of early high scores can be just hype that never gets corrected. Similarly, games that are seemingly innovative and stand-out titles when released may not stand the test of time, again leading to overly high reviews from long ago that are less relevant now.
 
L S (Germany)
ras2124 wrote:
Randombias wrote:
And I don't see how anybody would deem time an important factor, outside of perhaps assigning a lower weight to ratings that are assigned prior to the game being published.
Time is important because very few people update their ratings. Therefore a lot of early high scores can be just hype that never gets corrected. Similarly, games that are seemingly innovative and stand-out titles when released may not stand the test of time, again leading to overly high reviews from long ago that are less relevant now.

So, in order to "fix" hype for new games, you suggest assigning a lower weight to older ratings? Are you really sure you've thought this through?
 
Richard Sampson (Bothell, Washington, United States)
Randombias wrote:
So, in order to "fix" hype for new games, you suggest assigning a lower weight to older ratings? Are you really sure you've thought this through?
Do I think a rating from 10 years ago that has never been updated should be weighted less than a rating from within the last year? Yes, I do.

Also, every rating you assign has a date below it showing when you assigned it, which means BGG clearly keeps track of how old ratings are. I also believe they factor this into their ranking metrics.
 
Josh (Pennsylvania, United States)
ras2124 wrote:
Do I think a rating from 10 years ago that has never been updated should be weighted less than a rating from within the last year? Yes, I do.

Also, every rating you assign has a date below it showing when you assigned it, which means BGG clearly keeps track of how old ratings are. I also believe they factor this into their ranking metrics.


That's silly, why would I update my ratings if they don't change?
 
L S (Germany)
ras2124 wrote:
Do I think a rating from 10 years ago that has never been updated should be weighted less than a rating from within the last year? Yes, I do.

Also, every rating you assign has a date below it showing when you assigned it, which means BGG clearly keeps track of how old ratings are. I also believe they factor this into their ranking metrics.

Well, what you're proposing would have the exact opposite of the desired effect you stated earlier. Since that doesn't bother you, I don't know what else to tell you.

Every rating is also color-coded. Do you "believe" that's factored into BGG rankings as well?
 
Richard Sampson (Bothell, Washington, United States)
Randombias wrote:
Well, what you're proposing would have the exact opposite of the desired effect you stated earlier. Since that doesn't bother you, I don't know what else to tell you.

Every rating is also color-coded. Do you "believe" that's factored into BGG rankings as well?
How would it have the opposite effect? Games that have old ratings don't typically get a bunch of hype scores several years later.

Also, of course the colors affect the rating. A red color will bring it down and a dark green one will bring it up. In fact, for all I know BGG could use the colors and not the numbers at all since they are literally the same thing.
 
L S (Germany)
ras2124 wrote:
Randombias wrote:
Well, what you're proposing would have the exact opposite of the desired effect you stated earlier. Since that doesn't bother you, I don't know what else to tell you.

Every rating is also color-coded. Do you "believe" that's factored into BGG rankings as well?
How would it have the opposite effect? Games that have old ratings don't typically get a bunch of hype scores several years later.

Also, of course the colors affect the rating. A red color will bring it down and a dark green one will bring it up. In fact, for all I know BGG could use the colors and not the numbers at all since they are literally the same thing.

Games receive hype ratings ad hoc, not some years down the road. That the same ratings may still be there at a later point in time doesn't change their reliability: if they are hype ratings, they are hype ratings right from the start. If you chose to devalue older ratings, you would simultaneously heighten the relative weight of newer ratings, which means that hype would be more pronounced in the overall ratings than it is without that silly procedure.

Also, the colors are categorical data, while the numerical ratings are numerical data. Try changing one of your ratings from 7.6 to 7.7 and tell me what happens to the color.
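
Here's a toy calculation of that effect, with entirely made-up numbers and a made-up exponential decay (nothing BGG is known to use): an older game with a settled consensus picks up a burst of hype 10s, and down-weighting the old votes hands the hype a larger share of the total weight:

```python
def weighted_mean(ratings_with_age, half_life_years=None):
    """ratings_with_age: list of (rating, age_in_years) pairs.
    Without a half-life every vote counts equally; with one, older
    votes are exponentially down-weighted."""
    if half_life_years is None:
        weights = [1.0] * len(ratings_with_age)
    else:
        weights = [0.5 ** (age / half_life_years) for _, age in ratings_with_age]
    total = sum(w * rating for (rating, _), w in zip(ratings_with_age, weights))
    return total / sum(weights)

# 1,000 settled ratings of 7.0 from five years ago plus 200 fresh hype 10s.
votes = [(7.0, 5.0)] * 1000 + [(10.0, 0.0)] * 200

print(round(weighted_mean(votes), 2))                     # ~7.5  (all votes equal)
print(round(weighted_mean(votes, half_life_years=2), 2))  # ~8.59 (old votes devalued)
```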
 
Richard Sampson (Bothell, Washington, United States)
Randombias wrote:
Games receive hype ratings ad hoc, not some years down the road. That the same ratings may still be there at a later point in time doesn't change their reliability: if they are hype ratings, they are hype ratings right from the start. If you chose to devalue older ratings, you would simultaneously heighten the relative weight of newer ratings, which means that hype would be more pronounced in the overall ratings than it is without that silly procedure.
If your argument is that such a system would, as a whole, favor newer games in the overall rankings, then I agree, and I urge you to take a look at the current top 10 games. Half are from 2015 and 2016.

In fact, the top 100 in general typically shows new games sweeping upward, hitting a peak within their first year or two, and then slowly falling. Truly great games with wide appeal will fall more slowly than others, but even Agricola, once number 1, is not even in the top 10.

I am not saying using time as a metric is a perfect system. I am saying I believe BGG uses it with the intention of dealing with people having stale ratings.
 
Josh (Pennsylvania, United States)
I wonder if BGG prompting people to do a 'yearly review' of their game ratings would actually downplay some of the hype, forcing people to consider their games vis-à-vis each other.
 
L S (Germany)
ras2124 wrote:
Do I think a rating from 10 years ago that has never been updated should be weighted less than a rating from within the last year? Yes, I do.


ras2124 wrote:
I am not saying using time as a metric is a perfect system. I am saying I believe BGG uses it with the intention of dealing with people having stale ratings.


Frankly, by now I'm unable to decipher what you actually intend to say. Moreover, since you're only going with your gut feeling, there's not much to discuss here anyway. Yes, maybe BGG skews the ratings intentionally to generate hype for new games. But without any backup, that's a conspiracy theory. You might as well argue that KDM is ranked lower because the BGG admins want to save ink when printing images of the top 100, and since the game box is all black, giving KDM a fair rating would cost them extra money from their ink fund. Now that I think about it, that would actually make more sense than explaining KDM's strong pull towards the dummy votes by an overvaluation of new votes. After all, the game is from 2015.
 
Emmit Svenson (Illinois, United States)
Randombias wrote:
Since user ratings for A Feast for Odin have a standard deviation of 1.37 compared to KDM's 1.73, the former is adjusted towards the theoretical average of 5.5 less strongly than the latter.


Ah, thanks, that does explain the weighting rather well.

Randombias wrote:
The constant used for the Bayesian average depends on the variation in the dataset, which indicates the degree to which users are in agreement or disagreement about a game's quality, i.e. the objectivity/reliability of the user scores.


Weird that someone would interpret less variation in these datasets as being more objective, though. That would be a reasonable interpretation of people observing an objectively measurable phenomenon, but the pleasure one derives from a game is purely subjective.
 
Richard Sampson (Bothell, Washington, United States)
I don't understand why you present those quotes as opposing.

Before the first one, you asked me DIRECTLY if I thought time should be used in the metric, so I answered with my own opinion. I do think it should be used, and I personally think that, if tuned, it would fix more problems than it creates, though I agree it can create problems.

However, the entire conversation (including the OP's question) has been about what BGG is doing, and my answer to that is that I (and others) believe that BGG is using time in its metrics.

Furthermore, I don't really put much weight on the rankings here. As you can tell from my own ratings, I haven't rated anything in years, and I haven't done a rating for most of my games. Mostly I am too lazy. That being said, I do hope that BGG gives my ratings less weight BECAUSE I am too lazy to fix them, and because I believe there are many people who are also lazy.

Also, I haven't made any effort to say that KD:M is ranked unfairly, other than to say that the other user's metric showed KD:M as having a significantly different rank from BGG's metric. Instead, I have merely been trying to say that 1) BGG has a hidden metric that uses unknown variables, and 2) I think the timing of ratings is likely one of those variables.
 
L S (Germany)
emmit svenson wrote:
Randombias wrote:
Since user ratings for A Feast for Odin have a standard deviation of 1.37 compared to KDM's 1.73, the former is adjusted towards the theoretical average of 5.5 less strongly than the latter.


Ah, thanks, that does explain the weighting rather well.

Randombias wrote:
The constant used for the Bayesian average depends on the variation in the dataset, which indicates the degree to which users are in agreement or disagreement about a game's quality, i.e. the objectivity/reliability of the user scores.


Weird that someone would interpret less variation in these datasets as being more objective, though. That would be a reasonable interpretation of people observing an objectively measurable phenomenon, but the pleasure one derives from a game is purely subjective.

I think it's important to consider this process of Bayesian averaging strictly within the context of its intended purpose (as stated in the BGG FAQ). It's supposed to adjust the ranks of games with relatively few votes, nothing more, nothing less. Within that context, I find it actually quite reasonable to assume that the more users tend to confirm each other, the less a game's ratings should be adjusted. After all, a low standard deviation also indicates that new ratings aren't expected to lead to drastic changes (and if they do, they also change the standard deviation).
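
To put a number on that intuition, here is one purely speculative way the constant could scale with the standard deviation. The average, the vote count, and the scaling factor below are placeholders; only the two standard deviations (1.37 and 1.73) come from earlier in the thread:

```python
def bayesian_average(avg, n_votes, stdev, prior_mean=5.5, dummies_per_sd=2000):
    """Speculative sketch: scale the number of dummy votes with the standard
    deviation, so rating sets with more disagreement get pulled towards the
    prior mean more strongly."""
    dummies = dummies_per_sd * stdev
    return (avg * n_votes + prior_mean * dummies) / (n_votes + dummies)

# Same placeholder average (8.3) and vote count (12,000) for both games;
# only the standard deviations differ.
print(round(bayesian_average(8.3, 12000, 1.37), 2))  # ~7.78 (pulled down ~0.52)
print(round(bayesian_average(8.3, 12000, 1.73), 2))  # ~7.67 (pulled down ~0.63)
```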
 