This is just something I'm curious about. I like the fact that this site uses Bayesian statistics to rank games, and I think I get the idea of how it works: you take the votes you have for a game and add in a certain number of dummy votes, which pulls games that have not been rated by many people closer to the middle. My problem is that I have tried to duplicate the calculations for some of the games, and I don't get the same numbers. There could be several reasons for this. First, the number of dummy votes is said to be 100 or so, but I have also heard that it changes depending on the number of games in the database, so I don't know it exactly. Also, I have heard lots of different rumors about what the "average" dummy vote is: some say it is 5.5, others say it is the average over all the games in the database, so I'm not sure what to use for that. Theoretically, one could just take all the data from any two games and solve the resulting system of linear equations to get the two constants, but I tried that several times and never got it quite right. I am wondering whether this has anything to do with the "secret" method of throwing out shill and shill-buster votes that Aldie mentioned once. I don't suppose Aldie is going to give away the secret, but does anyone have any insight on this?
Before you reply, try it out yourself and see if your answers match. When I use the constants 100 and 5.5, I can usually get it correct to the tenths place, but beyond that it breaks down.
I would prefer that this particular thread not get into an argument about whether the current system is good or bad; the question here is just how the numbers are calculated.



The calculation uses 100 dummy ratings, each equal to the average of all ratings on BGG (that value used to be visible on the "stats" page, but it seems to be gone now...).
That value probably has a lot of decimal places, which is why you can't reproduce the result exactly with a truncated value (like 5.5).
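
To make the truncation point concrete, here is a minimal Python sketch of the calculation as described in this thread. The function name, the example game, and the "full precision" average of 5.5371 are all invented for illustration; the real site value is not known here.

```python
def bayesian_average(ratings_sum, ratings_count, dummy_count=100, dummy_value=5.5):
    """Average after padding the real ratings with dummy_count ratings of dummy_value."""
    return (dummy_count * dummy_value + ratings_sum) / (dummy_count + ratings_count)


# Hypothetical game: 50 ratings averaging 7.5, so the ratings sum to 375.
truncated = bayesian_average(375.0, 50, dummy_value=5.5)

# 5.5371 is an invented stand-in for an untruncated site-wide average.
fuller = bayesian_average(375.0, 50, dummy_value=5.5371)

# The two results agree when rounded to tenths but differ in the hundredths.
print(round(truncated, 4), round(fuller, 4))
```

With these made-up numbers, the two results match to the tenths place and part ways after that, which is the same pattern the original post describes.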

The X factor here is outlier ratings. Aldie mentioned some time ago that "suspect" ratings are ignored. So, if a game has an average of 7.5 with 50 votes, some of those votes may not be used in the Bayesian calculation (like a vote of 1 from a user who has rated only two games: this game a 1 and their own design a 10). I think those votes are removed, and since you don't know which votes they are or what their values were, you can only approximate the "official" Bayesian average.
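
As a rough illustration of how much one discarded vote could matter, here is a small Python sketch. It assumes the 100 and 5.5 constants mentioned earlier in the thread, and the ratings list is invented; how the site actually picks "suspect" ratings is not known, so the filter below is only a placeholder.

```python
def bayesian_average(ratings, dummy_count=100, dummy_value=5.5):
    """Pad the real ratings with dummy_count ratings of dummy_value, then average."""
    total = dummy_count * dummy_value + sum(ratings)
    return total / (dummy_count + len(ratings))


# Hypothetical game: 49 ratings of 7.5 plus one "suspect" rating of 1.
ratings = [7.5] * 49 + [1.0]

with_suspect = bayesian_average(ratings)

# Placeholder filter: drop the single rating of 1. The real rule is unknown.
without_suspect = bayesian_average([r for r in ratings if r != 1.0])

print(round(with_suspect, 4), round(without_suspect, 4))
```

In this invented case, dropping the one vote moves the result in the second decimal place, so an outsider who cannot see which votes were dropped would not be able to match the official number beyond roughly the tenths.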



Randy Cox wrote: The X factor here is outlier ratings. Aldie mentioned some time ago that "suspect" ratings are ignored. So, if a game has an average of 7.5 with 50 votes, some of those votes may not be used in the Bayesian calculation (like a vote of 1 from a user who has rated only two games: this game a 1 and their own design a 10). I think those votes are removed, and since you don't know which votes they are or what their values were, you can only approximate the "official" Bayesian average.
I don't think so. There are surely plenty of games (probably of little fame) with only a bit more than 30 ratings, none of which are "suspect". The trick is the truncation (in fact, Leezer wrote that he can match the number up to the first decimal places).

Expecting your calculations to match the floating point calculations of whatever tool Aldie is using is a bit optimistic; floating point calculations can be remarkably flawed unless they are carefully managed. I've run into more than a few software packages that completely mangle anything beyond the second digit to the right of the decimal point.
So your calculations may actually be better than the board's!
If you want to solve for the average with two equations, I would recommend using two games with about 100 ratings each as this will apply equal weight to the two unknowns and likely give you better values.
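
Here is a minimal Python sketch of that two-game approach. It assumes the published number is B = (C*m + S) / (C + n), where C is the dummy count, m the dummy value, n the number of ratings and S their sum. Writing K = C*m turns this into the linear system B*C - K = S - B*n. The function name and the two test games are made up, and the published values are generated from C = 100 and m = 5.8 just to check that the algebra recovers them.

```python
def solve_constants(game_a, game_b):
    """Each game is (published_bayes_avg, ratings_count, ratings_sum).

    Solves B*C - K = S - B*n for C and K = C*m using two games,
    then returns (C, m).
    """
    b1, n1, s1 = game_a
    b2, n2, s2 = game_b
    # Subtracting the two equations eliminates K.
    dummy_count = ((s1 - b1 * n1) - (s2 - b2 * n2)) / (b1 - b2)
    k = b1 * (dummy_count + n1) - s1
    return dummy_count, k / dummy_count


# Synthetic games built from C = 100, m = 5.8 (values chosen for illustration).
exact_a = ((580.0 + 700.0) / 200.0, 100, 700.0)   # 100 ratings, sum 700
exact_b = ((580.0 + 720.0) / 220.0, 120, 720.0)   # 120 ratings, sum 720
c_exact, m_exact = solve_constants(exact_a, exact_b)

# Same games, but with the published values rounded to two decimals.
rounded_a = (round(exact_a[0], 2), 100, 700.0)
rounded_b = (round(exact_b[0], 2), 120, 720.0)
c_rounded, m_rounded = solve_constants(rounded_a, rounded_b)

print(c_exact, m_exact)
print(c_rounded, m_rounded)
```

With exact inputs the constants come back cleanly, but rounding the published values to two decimals already shifts the recovered dummy count noticeably, because the division by b1 - b2 amplifies small errors. That may be part of why this approach "never got it quite right".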
Given the tight spread in the rankings between games seven, eight, nine, ten, and eleven, and given their vastly different numbers of votes, a few more good (or bad) ratings for OTHER games could shift the site-wide average enough to make quite a difference in THEIR ratings and cause a big change in their ranking positions. So if you want Commands & Colors: Ancients in the top ten, go give a sky-high vote to Hungry Hungry Hippos.
Back when they posted the average rating, it was around 5.8. You may want to see if the calculations match better with an average a little higher than 5.5.
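
One quick way to check this is to fix the dummy count at 100 and back out the dummy value implied by a single game, by rearranging B = (100*m + S) / (100 + n) into m = (B*(100 + n) - S) / 100. A small Python sketch follows; the function name and the example game are invented, and the 100 is the figure quoted earlier in the thread, not a confirmed value.

```python
def implied_dummy_value(published, ratings_count, ratings_sum, dummy_count=100):
    """Dummy rating value that would reproduce the published Bayesian average."""
    return (published * (dummy_count + ratings_count) - ratings_sum) / dummy_count


# Hypothetical game: published Bayesian average 6.37, 50 ratings averaging 7.5.
m = implied_dummy_value(6.37, 50, 375.0)
print(round(m, 3))
```

If several games with no obviously suspect ratings all imply roughly the same value, that value is a better guess for the dummy rating than 5.5.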
