So I was having a NaNoWriMo-related discussion involving characters and perceptions of sexual orientation, and someone said ‘Unless you’ve got about fifteen characters, it’s statistically unlikely that any of them will be gay anyway’.
Statistics are funny about what’s likely and what’s not. People have an intuitive tendency to think that if there’s a 10% chance of something happening, then it will happen exactly once out of every ten iterations. (I see a lot of this in World of Warcraft; if a particular giant fire snake has a 10% chance to be carrying a particular fantastical treasure, some players tend to assume that either A) if they don’t have one after killing ten giant fire snakes then the percentage is obviously wrong, or B) if they get one from the first giant fire snake the probability is obviously much higher than everyone says it is.)
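For the fire snake case, the number worth looking at is the chance of getting at least one drop across several kills. Here is a minimal sketch, assuming each kill is an independent roll at the quoted 10% (the function name is my own invention, not anything from the game):

```python
def chance_of_at_least_one(p: float, tries: int) -> float:
    """Chance that an event with per-try probability p happens at least once
    in `tries` independent tries (independence is an assumption here)."""
    # The only way to miss entirely is to miss on every single try.
    return 1 - (1 - p) ** tries


print(f"{chance_of_at_least_one(0.1, 1):.1%}")   # first snake: 10.0%
print(f"{chance_of_at_least_one(0.1, 10):.1%}")  # ten snakes: 65.1%
```

So under those assumptions, roughly a third of players will still be empty-handed after ten snakes, and one in ten will get the treasure on the very first kill, without the 10% figure being wrong in either case.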
Statistics aren’t like that, of course – they laugh at our feeble attempts to predict the future (or the past, or the present). I am not a masterful statistician, but I’m confident with my basics, so I drew a few Hermetic circles on the table, performed the somatic invocation according to the Vancian tomes of arcane lore and the much more user-friendly Jack Vance’s Big Book of Interplanar Summoning, and called upon mathematics to assist me.
Assume a 10% chance (P = 0.1) for any random person to be lesbian or gay. (I have no idea if this is accurate, but it seems to be the popular statistic. Because this is rhetorical math, I’m not calculating numbers for trans or bi or asexual people at the moment.) That means that, ceteris paribus, there’s a 10% chance that the first character in a story will be L/G. Once we add a second character with their own 10% chance, we have a (0.1)*(0.1) = 1% chance that both characters are L/G, and a (0.1)+(0.1)-(0.1*0.1) = (0.2 – 0.01) = 19% chance that at least one of them is L/G.
Ignoring multiples for the moment, I’m interested in the probability that at least one character will be L/G, which means I can instead calculate the decreasing probability that every character will be straight; whatever is left over is the chance that at least one of them isn’t. That’s (1 – P) = (0.9), or 90% for any single character, so (0.9)^N where N is the number of characters involved.
0.9^1 = 90%
0.9^2 = 81%
0.9^3 = 72.9%
0.9^4 = 65.61%
0.9^5 = 59.049%
0.9^6 = 53.1441%
0.9^7 = 47.82969%
If we interpret ‘statistically likely’ as meaning ‘the thing that is more likely to happen than the other thing’*, then we have just passed the 50-50 threshold – with seven characters, there is only around a 48% chance that they will all be straight, and it is statistically more likely that at least one of them will be lesbian or gay.
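The same search for the tipping point can be sketched in a few lines. This assumes every character is an independent 10% draw, and `smallest_cast` is a name I made up for illustration:

```python
def smallest_cast(p: float, threshold: float = 0.5) -> int:
    """Smallest number of characters N for which the chance that all of them
    are straight, (1 - p) ** N, drops below `threshold`.
    Assumes each character is an independent draw with probability p."""
    if not 0 < p <= 1:
        raise ValueError("p must be greater than 0 and at most 1")
    if not 0 < threshold <= 1:
        raise ValueError("threshold must be greater than 0 and at most 1")
    n = 1
    while (1 - p) ** n >= threshold:
        n += 1
    return n


print(smallest_cast(0.1))         # 7
print(f"{1 - 0.9 ** 15:.1%}")     # 79.4%
```

The second line checks the fifteen-character figure from the original discussion: with a cast that size, the same assumptions give about a 79% chance that at least one character is lesbian or gay, so fifteen is well past the point where it becomes the likelier outcome.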
This doesn’t control for cultures and subcultures, social institutions, shared interests, the diverse ranges of human sexualities – basically, this doesn’t control for reality, but I found it an interesting couple of minutes of calculation. It suggests that (again, not accounting for any complexities whatsoever) we might expect at least half of all stories with at least seven characters to feature at least one QUILTBAG person. Like the Bechdel-Wallace test, this is no way to judge how good a story is, but it might be illuminating to think about how frequently the books and the math match, and why they diverge when they don’t.
*I probably shouldn’t try to write anything for Simple Wikipedia, because I would go entirely overboard. ‘Null hypothesis’ = the thing that means the thing we were looking to see happen didn’t happen. ‘Confidence interval’ = the area where it is most likely that things actually happen most of the time. ‘Error term’ = the thing that happened where we didn’t expect something to happen and we don’t know why.