Rebekah B
United States Colorado

I'd like to analyze a certain game situation, but it's stretching the limits of my already rusty probability skills.
There are 2 each of 10 different items. These 20 items are randomly assigned to 4 different groups of 5.
I'd like to know the probability of: Exactly 1 group having at least one matching pair. Exactly 2 groups having at least one matching pair. Exactly 3 groups having at least one matching pair. All 4 groups having at least one matching pair.
EDIT: Oops, I should have also included none of the groups having any matching pairs.
Thanks for any help you can offer.

Scott Humpert
United States Dillsburg Pennsylvania

Not sure if there is a clean formula that will get you what you are looking for directly, but in the absence of that, it seems to me that your problem would be a good candidate for a Monte Carlo simulation.
Fire up your favorite spreadsheet application, randomly assign your 20 items into 4 groups of 5, and count how many groups contain a matching pair. Rinse and repeat for 1000 samples.
With enough of a sample size and a good randomizer, your results will approach the true probability.
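For anyone who would rather script it than use a spreadsheet, the procedure described above could look roughly like this in Python. This is only a sketch: the function names, the trial count, and the seed are illustrative choices, not anything from the original suggestion.

```python
import random
from collections import Counter

def groups_with_pairs(rng):
    """Deal 2 copies each of 10 items into 4 groups of 5 and
    return how many groups contain at least one matching pair."""
    items = list(range(10)) * 2          # two copies of each of 10 items
    rng.shuffle(items)
    groups = [items[i:i + 5] for i in range(0, 20, 5)]
    # A group has a matching pair when it holds fewer than 5 distinct items.
    return sum(1 for g in groups if len(set(g)) < 5)

def simulate(trials=100_000, seed=0):
    """Estimate P(exactly k groups have a pair) for k = 0..4."""
    rng = random.Random(seed)
    tally = Counter(groups_with_pairs(rng) for _ in range(trials))
    return {k: tally[k] / trials for k in range(5)}

if __name__ == "__main__":
    for k, p in simulate().items():
        print(f"{k} groups: {p:.3f}")
```

More trials tighten the estimates; 1000 samples give only a rough picture.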

Rebekah B
United States Colorado

Thank you. I guess I can use an online list randomizer to simulate it if there isn't a more direct approach.
If I'm overlooking another way to calculate the probability, though, I'd love to hear it.

Omar Germino
United States Schaumburg Illinois

I am certain there's a way to find what you're looking for using combinatorics/statistics. Unfortunately, it's been a long time since I've taken any classes on the subject, and I'm afraid I've gotten just as rusty.
You may want to try posting your question on math forums, particularly those dealing with combinatorics and probability, if no one else on the Geek can come up with an answer.
If I somehow come up with the solution later on, I'll be sure to let you know.

Guy Srinivasan
United States Kirkland Washington

Getting the exact answer on something like this is hard (or at least some trickiness is involved). Getting approximate answers is easy. Make the first group: the probability it has no pairs is (20*18*16*14*12)/(20*19*18*17*16) = 52%. Try assuming everything's independent even though we know it's not, and you get:
0 groups: 0.52^4 = 7.3%
1 group: 4c1 * 0.52^3 * 0.48^1 = 27%
2 groups: 4c2 * 0.52^2 * 0.48^2 = 37.4%
3 groups: 23%
4 groups: 5.3%
I'd guess this is quite close to the actual values. The difference between this and the real values comes from the fact that if (say) your first group has no pairs, then subsequent groups are more likely to have no pairs, whereas if your first group has pairs, then subsequent groups are more likely to have pairs. So throw a few percentage points away from the mean (48%*4 = 1.92 groups with a pair) and put 'em at the outliers (season "a few" to taste):
0: 9%
1: 27%
2: 34%
3: 23%
4: 7%
Anyone want to do it for real and find out how close that is?
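The independence approximation above can be reproduced with a few lines of Python. This is a sketch of that back-of-envelope estimate only (it deliberately ignores the dependence between groups); the names `p_no_pair` and `binomial_estimate` are made up for illustration.

```python
from math import comb, prod

# P(a single group of 5 has no matching pair):
# (20*18*16*14*12) / (20*19*18*17*16)
p_no_pair = prod(range(20, 10, -2)) / prod(range(20, 15, -1))

def binomial_estimate(p_pair, n=4):
    """Treat the n groups as independent (they are not) and return
    the binomial probabilities of exactly k groups having a pair."""
    return [comb(n, k) * p_pair**k * (1 - p_pair)**(n - k)
            for k in range(n + 1)]

if __name__ == "__main__":
    print(f"P(no pair in one group) = {p_no_pair:.4f}")
    for k, p in enumerate(binomial_estimate(1 - p_no_pair)):
        print(f"{k} groups: {p:.1%}")
```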

Christopher Dearlove
United Kingdom Chelmsford Essex
SoRCon 11 23-25 Feb 2018 Basildon UK http://www.sorcon.co.uk

I happen to have this program (still in development, one day I might release it) that allows me to run a Monte Carlo simulation on an expression in a slightly obscure but quite powerful language. A quick example I threw together gave me results over a million runs of
0 > 101877 ~ 0.101877 [0.101284, 0.10247]
1 > 271659 ~ 0.271659 [0.270787, 0.272531]
2 > 323092 ~ 0.323092 [0.322175, 0.324009]
3 > 212686 ~ 0.212686 [0.211884, 0.213488]
4 > 90686 ~ 0.090686 [0.0901232, 0.0912488]
Mean = 1.91864
Standard deviation of mean = 0.00111713
95% confidence interval = [1.91646, 1.92083]
The first five lines are what you asked for, number of sets with at least one pair, estimated probability, and (in []) a 95% confidence interval for that probability. Or in other words the answers are about 10%, 27%, 32%, 21%, 9% (and yes, that only adds up to 99% due to rounding).
FWIW here's the quick and dirty expression I used in my program (mainly so if I ever revisit this thread it's here). I dare say I could improve it. It should be on one line.
f0[find(0,delta(sort(q0)))>=0];v0:=shuffle(sequence10#sequence10+true20);f0(head5(v0))+f0(head5(tail15(v0)))+f0(head5(tail10(v0)))+f0(tail5(v0))
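The bracketed 95% intervals quoted above are numerically consistent with the usual normal-approximation (Wald) interval for a proportion, p ± 1.96*sqrt(p(1-p)/n). That the program uses exactly this formula is an assumption, since its source isn't shown. A minimal Python sketch, with `wald_interval` as an illustrative name:

```python
from math import sqrt

def wald_interval(successes, trials, z=1.96):
    """Normal-approximation confidence interval for a proportion.
    z = 1.96 corresponds to roughly 95% coverage."""
    p = successes / trials
    half_width = z * sqrt(p * (1 - p) / trials)
    return p - half_width, p + half_width

if __name__ == "__main__":
    # Counts taken from the million-run results quoted above.
    counts = [101877, 271659, 323092, 212686, 90686]
    for k, c in enumerate(counts):
        lo, hi = wald_interval(c, 1_000_000)
        print(f"{k}: [{lo:.6f}, {hi:.6f}]")
```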

Rebekah B
United States Colorado

Thank you! These estimates are just what I needed. Sure beats 1000+ play tests just to figure out whether or not one rule works with the current numbers.

David Molnar
United States Ridgewood New Jersey

Dearlove wrote: f0[find(0,delta(sort(q0)))>=0];v0:=shuffle(sequence10#sequence10+true20);f0(head5(v0))+f0(head5(tail15(v0)))+f0(head5(tail10(v0)))+f0(tail5(v0))
That's fairly sweet. Plus now I don't have to do it.
Agreed that this is probably doable by hand, but that it is a hard problem.

Chris Okasaki
United States White Plains New York

This is a bit late, but here are the exact probabilities, calculated by a computer program that enumerated all possible configurations:
0: 1200697344/11732745024 = 0.10233729119178035
1: 3173990400/11732745024 = 0.27052410953339745
2: 3794273280/11732745024 = 0.3233917785001376
3: 2496614400/11732745024 = 0.21279030566956264
4: 1067169600/11732745024 = 0.09095651510512191
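The enumerating program itself isn't shown, so here is one possible way to get exact counts without visiting all 11,732,745,024 deals one by one. This is not necessarily the method used above; it is a dynamic-programming sketch that places the two copies of each item in turn and tracks group sizes plus which groups already hold a pair. All names are illustrative.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

GROUPS, GROUP_SIZE, KINDS = 4, 5, 10

def exact_counts():
    """Return counts[k] = number of deals (20 distinct cards into 4
    labeled groups of 5) in which exactly k groups hold a matching pair."""
    # State: (group sizes so far, bitmask of groups containing a pair).
    states = {((0,) * GROUPS, 0): 1}
    for _ in range(KINDS):
        nxt = defaultdict(int)
        for (sizes, mask), ways in states.items():
            # Put the two distinguishable copies of this kind in groups a, b.
            for a, b in product(range(GROUPS), repeat=2):
                new = list(sizes)
                new[a] += 1
                new[b] += 1
                if max(new) > GROUP_SIZE:
                    continue  # a group would overflow
                new_mask = (mask | (1 << a)) if a == b else mask
                nxt[(tuple(new), new_mask)] += ways
        states = nxt
    counts = [0] * (GROUPS + 1)
    for (_, mask), ways in states.items():
        counts[bin(mask).count("1")] += ways
    return counts

if __name__ == "__main__":
    counts = exact_counts()
    total = sum(counts)
    for k, c in enumerate(counts):
        print(f"{k}: {c}/{total} = {float(Fraction(c, total))}")
```

The total number of deals it counts is 20!/(5!)^4, the same denominator as in the figures above.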

Christopher Dearlove
United Kingdom Chelmsford Essex

cokasaki wrote: This is a bit late, but here are the exact probabilities, calculated by a computer program that enumerated all possible configurations:
0: 1200697344/11732745024 = 0.10233729119178035
1: 3173990400/11732745024 = 0.27052410953339745
2: 3794273280/11732745024 = 0.3233917785001376
3: 2496614400/11732745024 = 0.21279030566956264
4: 1067169600/11732745024 = 0.09095651510512191
Compared to my 95% confidence intervals:
Quote:
0 > 101877 ~ 0.101877 [0.101284, 0.10247]
1 > 271659 ~ 0.271659 [0.270787, 0.272531]
2 > 323092 ~ 0.323092 [0.322175, 0.324009]
3 > 212686 ~ 0.212686 [0.211884, 0.213488]
4 > 90686 ~ 0.090686 [0.0901232, 0.0912488]
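One actual result (the value for exactly 1 group, 0.2705, against an interval starting at 0.270787) is just outside the 95% confidence interval, which is not that surprising with five intervals. Useful test (from my perspective).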
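The comparison is easy to check mechanically. A small Python sketch, using only the exact values and the intervals quoted in this thread (the variable names are illustrative):

```python
# Exact probabilities from the enumeration, indexed by number of groups.
exact = [
    0.10233729119178035,
    0.27052410953339745,
    0.3233917785001376,
    0.21279030566956264,
    0.09095651510512191,
]

# 95% confidence intervals from the million-run simulation.
intervals = [
    (0.101284, 0.10247),
    (0.270787, 0.272531),
    (0.322175, 0.324009),
    (0.211884, 0.213488),
    (0.0901232, 0.0912488),
]

# Indices whose exact value falls outside the simulated interval.
outside = [k for k, (p, (lo, hi)) in enumerate(zip(exact, intervals))
           if not lo <= p <= hi]

if __name__ == "__main__":
    print(outside)  # prints [1]
```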


