On Sample Sizes
Recently, there was a study ‘proving’ that men are better at parking than women. It was done in Germany. People were ‘randomly’ selected.
How random the selection was is largely irrelevant; the selection bias is enormous:
- A study like this recruits people. Only people who feel they have the time will show up. Busy mothers are out, for one thing.
- This data will only apply to people in Germany.
- This data will only apply to the people in Germany close enough to the test site to agree to it.
Germany has a population of 82 million people. A quick Google search reveals that the voting population is about 62 million people. Conservatively, let’s assume that the driving population is 31 million people. Using this handy sample size calculator, let’s plug some numbers in.
Test 1:
Margin of Error: 5%
Confidence Level: 90% (this is actually pretty sloppy; it makes the required sample size smaller)
Response Distribution: 10% (this assumes that we already know a lot about the expected values and that we expect a lopsided split; 50% is the most conservative setting)
Calculated Sample Size: 98
Test 2:
Margin of Error: 5%
Confidence Level: 95% (more common; this is the “19 times out of 20” you hear quoted alongside polling numbers)
Response Distribution: 50%
Calculated Sample Size: 385
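For anyone who would rather check the arithmetic than trust a web form, here is a minimal sketch of the usual sample size formula for a proportion. It assumes the calculator uses the normal approximation with a finite population correction, which I have not confirmed; the function name sample_size and its parameters are mine, not the calculator's.

```python
import math
from statistics import NormalDist


def sample_size(margin, confidence, proportion, population=None):
    """Required sample size for estimating a proportion.

    Assumes the normal approximation. All arguments are fractions,
    e.g. margin=0.05 for 5%. `population` is optional; when given,
    a finite population correction is applied.
    """
    # Two-sided z-score for the confidence level (about 1.96 for 95%).
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    x = z ** 2 * proportion * (1 - proportion)
    if population is None:
        n = x / margin ** 2
    else:
        n = population * x / ((population - 1) * margin ** 2 + x)
    # Round up: you cannot recruit a fraction of a driver.
    return math.ceil(n)


drivers = 31_000_000  # the assumed driving population from above

print(sample_size(0.05, 0.90, 0.10, drivers))  # Test 1 settings
print(sample_size(0.05, 0.95, 0.50, drivers))  # Test 2 settings: 385
```

With a population this large the correction barely matters; leaving population out gives the same answer for Test 2.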
The sample size used in the study was 65. What do you get with a number like that? Well, if you leave everything else the same as in Test 1, your margin of error is 6.12%. If you leave everything else as in Test 2, your margin of error is 12.16%. In this study, the men were 5% better at placing the car than the women. That is within the margin of error under either set of assumptions, so for the population at large it means nothing at all. There wasn’t even a control group (i.e., people who don’t drive attempting to park cars).
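The two margins of error quoted above can be reproduced by running the same kind of formula in reverse. This is a rough sketch under the same assumption (normal approximation for a proportion); the population of 31 million is large enough that I ignore the finite population correction, and the name margin_of_error is just for illustration.

```python
import math
from statistics import NormalDist


def margin_of_error(n, confidence, proportion):
    """Margin of error for a proportion estimated from n people.

    Assumes the normal approximation and a population much larger
    than n. `confidence` and `proportion` are fractions.
    """
    # Two-sided z-score for the confidence level.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * math.sqrt(proportion * (1 - proportion) / n)


n = 65  # the study's sample size

# Test 1 settings: 90% confidence, 10% response distribution
print(round(100 * margin_of_error(n, 0.90, 0.10), 2))  # 6.12

# Test 2 settings: 95% confidence, 50% response distribution
print(round(100 * margin_of_error(n, 0.95, 0.50), 2))  # 12.16
```

Either way, a 5% difference between the groups sits inside the noise.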
If you’re going to do studies, at least set them up properly.
