Corey Butler
United States Marshall Minnesota

Average ratings provide a convenient measure of the popularity of a game, and thus the probability that we would like it if we buy it. But should we use the simple average or the Bayesian average? I have read spirited arguments on this site about the merits of each system. Alternatively, some members have suggested that we should take standard deviation into account, because a game that is both highly and consistently rated is more likely to be acceptable to any given individual. As far as I could determine, no one has studied this empirically, so as both a social science geek and a gaming geek, I thought it would be an interesting thing to do. I decided to correlate various measures of popularity with my personal ratings to see which one best predicts my gaming preferences.
I'm not sure I entirely trust Bayesian statistics, which modify the actual average by bringing scores closer to the mean. My hypothesis is that the simple average will be most closely related to my gaming preferences.
Method
Of the 93 games I’ve rated, 10 were excluded from the study because they had fewer than 30 ratings, leaving a final sample of 83. For each game, I obtained my rating, the average rating, the Bayesian average, my GeekBuddy analysis, and a composite rating based on simple average minus standard deviation. All scores were entered into SPSS for statistical analysis.
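As a rough illustration of the method described above, the sketch below computes the composite score (simple average minus standard deviation) and the Pearson correlation between a set of personal ratings and each measure. The analysis in the post was done in SPSS; this Python version is only an assumed equivalent, and the game figures in it are invented placeholders, not the data from the study.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / math.sqrt(var_x * var_y)

# Invented placeholder rows: (my rating, site average, standard deviation).
games = [
    (9.0, 7.8, 1.3),
    (7.0, 7.1, 1.5),
    (5.0, 6.0, 1.6),
    (8.0, 7.4, 1.2),
    (6.0, 6.6, 1.8),
]

mine = [g[0] for g in games]
average = [g[1] for g in games]
# Composite as described in the post: simple average minus standard deviation.
composite = [g[1] - g[2] for g in games]

r_average = pearson(mine, average)
r_composite = pearson(mine, composite)
print(f"average r = {r_average:.2f}, composite r = {r_composite:.2f}")
```

The same `pearson` helper could be applied to the Bayesian average or GeekBuddy columns once those values are collected for each game.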
Results and Discussion
The Pearson correlation coefficients between my ratings and the other measures were as follows:
Average: .75
Bayesian Average: .74
GeekBuddy: .37
Composite: .76
There is essentially no difference between the average, Bayesian average, and the composite score in terms of predicting my game preferences. Indeed, the correlations between these three measures were all between r=.97 and r=.99. The GeekBuddy correlation was statistically significant at p<.05, but considerably weaker than the other measures. I have only 10 geek buddies and they didn’t rate all the games I rated, so until I get a larger pool of like-minded individuals on my list, their average rating will probably not be a very good guide to what games I would like. For nonparametric statisticians, I also did the analysis using Spearman’s coefficient, finding essentially the same pattern of results.
Conclusions
It doesn’t matter! Use either the Bayesian or the simple average, as they are essentially the same thing. Taking the standard deviation into account causes a minuscule increase in the ability to predict your preferences and is probably not worth the trouble. On the other hand, be careful using the recommendation of friends or GeekBuddy analysis. The larger number of ratings in the overall database appears to produce a much more reliable estimate of our personal ratings.

Stephen Smith
United States Jackson MS

An argument for FORM?
"It doesn’t matter! Use either the Bayesian or the simple average, as they are essentially the same thing."
This sounds suspiciously like some of the rationale Tim used when promoting the abolition of the Bayesian average and the use of FORM.

Ido Magal
United States Seattle Washington

now most of your description was over my head. but i did notice that you say:
"It doesn’t matter! Use either the Bayesian or the simple average, as they are essentially the same thing."
and then you say:
"The larger number of ratings in the overall database appears to produce a much more reliable estimate of our personal ratings."
Sounds like you just argued for bayesian after all, whose purpose is to increase reliability with increased sample size. no?

Lyman Hurd
United States Cupertino California

my statistic
The statistic I would really like to see is the number of people rating a game, say, 8 or higher. That tells me that it has given a lot of pleasure to a lot of people. Then I can dig further into reviews, seeing if any of my friends are on the list, etc., to determine whether I am one of the people who would enjoy the game.
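A minimal sketch of that statistic, assuming the individual ratings for a game are available as a plain list of numbers. The function name `count_at_least` and the sample ratings are made up for illustration.

```python
def count_at_least(ratings, threshold=8):
    """Number of ratings at or above the threshold."""
    return sum(1 for r in ratings if r >= threshold)

# Invented placeholder ratings for a single game.
ratings = [10, 8, 7.5, 9, 6, 8, 4]
fans = count_at_least(ratings)
print(fans)
```

Unlike an average, this count grows with the size of the audience, so it rewards games that many people have rated highly rather than games that a few people love.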

Chuck Uherske
United States Rockville Maryland

I take your point about limited data samples, but how are you selecting your Geekbuddies? I select mine largely on the basis of rating commonality, so it would be almost a circularity, I suspect, for me to examine the correlation between my ratings and theirs.
Are you perhaps selecting your Geekbuddies by criteria other than common game preferences?

Corey Butler
United States Marshall Minnesota

I believe that the Bayesian estimate is designed to provide a more reliable measure when there are few ratings. Maybe I need to examine a larger pool of games, especially games with few ratings, but I didn't find any evidence that the Bayesian average was superior for my sample. My argument was and is that they are equal. The reliability problem appeared when I used my GeekBuddy analysis, which is based on a much smaller sample. I'd like to get my GeekBuddy list over 30 and then see how it compares to the overall averages.
As for FORM, I don't know if it is any better than the measures I used. Maybe Tim can demonstrate its validity with his own set of correlations. My hunch is that it won't be much better than simple average, but then I'm a simple kind of guy. I am inclined to agree with his criticism of the Bayesian measure. Very few people even know what it is, and it doesn't seem to provide any additional information, at least not for my game ratings.

Corey Butler
United States Marshall Minnesota

GeekBuddies
Chuck, considering that you're on my list, my GeekBuddy selection criteria could probably use some improvement.
If you have a large number of GeekBuddies (again, perhaps over 30) and they have been selected on some kind of rational ground, such as rating commonality, then I think the correlation may rise over .75.
Would it be circular? Only if you were using your buddies for information on games that you've already tried and rated. It seems to me that a good pool of GeekBuddies would be very useful for answering the question of whether or not you should try a new game.

Chuck Uherske
United States Rockville Maryland

Silly me! And I knew that, too. Sorry I'm letting you down with my game recommendations!

Ido Magal
United States Seattle Washington

bayesian
i don't see how bayesian can be more reliable, if it explicitly deviates from the average. the average is always more likely to predict your score.
the value of bayesian to me, is that it artificially depresses ratings that aren't very robust.
how i evaluate ratings:
if a game scores a bayesian 8, then i know that
a. those that have played it think it's an 8.
b. enough people think it deserves an 8 or higher to justify a reliable rating of 8.
i wish search results were sorted by bayesian rather than average, to make the most of it.

Tim Benjamin
United States Los Alamos New Mexico

The Bayesian Rabbit
Fanboy et al.
If you put a rabbit in a hat then you can pull a rabbit out of a hat. The Bayesian Average used here merely adds 30 votes of about 5.9. It centers the average votes, but to what purpose since it takes a minimum of 30 real votes to establish a ratings ranking; is mere centering a virtue? If so, when do the real votes become adequate in themselves (not just swamping out the Bayesian 30)? Is centering even an effective means to produce a ratings system that is most representative of the BGG population?
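The "30 votes of about 5.9" description can be sketched as follows. The defaults come only from the figures quoted in this post, not from any published site formula, and the function name `bayesian_average` is hypothetical.

```python
def bayesian_average(votes, prior_votes=30, prior_mean=5.9):
    """Average of the real votes plus `prior_votes` dummy votes of `prior_mean`.

    The defaults (30 votes of 5.9) are the approximate figures quoted in
    the post; they are assumptions, not confirmed site parameters.
    """
    total = prior_votes * prior_mean + sum(votes)
    return total / (prior_votes + len(votes))

# Invented example: every real vote is an 8.0.
few = [8.0] * 30
many = [8.0] * 3000
print(round(bayesian_average(few), 2))
print(round(bayesian_average(many), 2))
```

Under these assumptions, 30 real votes of 8.0 give about 6.95, while 3,000 such votes give about 7.98, which shows how the dummy votes are gradually swamped as real votes accumulate.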
I think FORM does this better and I've put together a FORM vs average 'test' for my own amusement. Looks like I'll have to accept the challenge and receive a Geekcritique of my little test concept. (Test to follow when I have more time.)
PS I take great stock in my Geekbuddies' ratings and comments.

Kevin Nesbitt
Canada Ajax Ontario

Yep, Geekbuddies are my main source of information, and probably the best at predicting if I'll like a game or not. Only one question remains: What will I get them for Christmas?!

Tim Benjamin
United States Los Alamos New Mexico

Xmas
Kevin,
I still don't own Ra (or another 240 on my list...).

yegods
United States Campbell CA

hmm, i guess i use my geekbuddy list in an incompatible way to this discussion. i usually put those folks that i know, or that i have had some successful deal with. successful deal means... trading. so, i end up with a bunch of buddies who didn't like the games i want!
hmm.

Corey Butler
United States Marshall Minnesota

with a little help from my GeekBuddies...
I'm expanding my list of GeekBuddies to see if this allows me to better predict game ratings. I got twenty new buddies by going to my three top rated games and adding people who a) gave the game a 10, b) have been active in October, c) rated at least 50 games, d) approximated a normal curve in their ratings distributions, and e) showed some amount of agreement with games I have rated. This last one was pretty subjective. I want to add a few more buddies, then I'll see how well they do in predicting my ratings. Naturally, I'll have to exclude games I used to "recruit" GeekBuddies from the analysis to avoid inflating the correlation. Tautologies don't prove anything.


