ALL BLOG POSTS AND COMMENTS COPYRIGHT (C) 2003-2020 VOX DAY. ALL RIGHTS RESERVED. REPRODUCTION WITHOUT WRITTEN PERMISSION IS EXPRESSLY PROHIBITED.

Tuesday, August 06, 2019

Math is hard, Barbie

Uncephalized demonstrates why it is tremendously foolish to attempt to "correct" one's intellectual superiors in taking exception to my observation about it being astronomically unlikely that any individual present at one mass shooting in the United States will be present at another one:
I'm responding to Vox's OP: The odds against one person in a country of 320 million being in the vicinity of two such events are astronomical.

Which is flat-out wrong--unless I am making some boneheaded error, which is always possible, and why I showed my work--and leads me to think Vox didn't bother doing the math at all before jumping to a conspiracy as his explanation. I may be in error here--I'm sure someone will quickly point it out, if so--but by my math these coincidences are far from astronomically unlikely.

Las Vegas 2017 attendance: 20,000
Gilroy 2019 attendance: 80,000

I don't know how many attendees were actually physically present at each event at the time of the shootings, but I'll assume two thirds, so 14,520 and 52,800.

Proportion of US population present at LV shooting: 14,520 / 350,000,000 = .000041 or .0041%

Proportion of the population NOT at LV is the inverse or 99.9959%

Likelihood of one person being at both events is then: 1 - (.999959^52,800). Which is 88.8%. The number of times this apparently happened is 3, so it's 0.888^3, or 70%.

In other words, through purely random chance it is more likely than not that 3 people who were at the LV 2017 shooting would also be present at the Gilroy shooting.

Making similar estimates about the LV-Parkland connection: 39% likely assuming an average family size of 4, 3000 attendees of Parkland and we are looking for a direct family member involved in both events.

The Borderline-LV coincidence has the lowest odds as I run the numbers, actually quite low, at 0.046% or about 1 chance in 2200, partly because the guy was actually killed, a much lower number than "was also there".

I don't know enough about the San Bernardino or Toronto events to start even making assumptions.

In this model the probability of the LV-Gilroy and LV-Parkland coincidences both happening is 0.70*0.39 = 0.27 or 27%. Just better than 1 in 4.

The very large numbers present at these festivals make for counterintuitive probabilities. The Borderline connection is the only one that even gives me pause, but even 1 in 2200 is not what I'd call "astronomically low".
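
The Las Vegas-Gilroy step of the quoted calculation can be reproduced with a short Python sketch. It assumes, as the comment does, a population of 350,000,000, with 14,520 people present in Las Vegas and 52,800 in Gilroy, and it treats every attendee as an independent, uniformly random draw. The variable names are illustrative only.

```python
# Sketch of the quoted "at least one overlap" estimate.
# All figures are the commenter's assumptions, not measured data.
POPULATION = 350_000_000
LV_PRESENT = 14_520
GILROY_PRESENT = 52_800

# Chance that one randomly chosen person was at Las Vegas.
p_lv = LV_PRESENT / POPULATION

# Chance that none of the Gilroy attendees was at Las Vegas,
# treating each attendee as an independent draw.
p_no_overlap = (1 - p_lv) ** GILROY_PRESENT

p_at_least_one = 1 - p_no_overlap
print(f"P(at least one person at both) = {p_at_least_one:.3f}")

# The comment then cubes this figure for the three reported overlaps.
p_cubed = p_at_least_one ** 3
print(f"{p_at_least_one:.3f} cubed = {p_cubed:.2f}")
```

Cubing the at-least-one figure reproduces the 70% in the comment, but it is not the same quantity as the probability of three or more overlaps.
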
Uncephalized is correct about one thing. I didn't bother actually doing the math beforehand because I didn't need to do it. And I didn't need to do it in order to have a general idea about the size of the probabilities involved because a) I am highly intelligent and b) I have an instinctual grasp of mathematical relationships. I knew the probabilities were astronomical, in much the same way you know a tall man's height is over six feet without needing to actually measure him. The boneheaded error Uncephalized has made here is that he simply doesn't know how to calculate the probability of independent events. But it's really not a difficult concept. For example, if the odds of rolling a six on a six-sided die are 1 in 6, then the odds of rolling two sixes on two different six-sided dice are (1/6) * (1/6) = 1/36.

But before we calculate the probability of these two specific independent events, let's get the base numbers right. The Gilroy Garlic Festival is a three-day event, so that 80,000 is reduced to 26,667 before being reduced another one-third as per Uncephalized's assumption to account for the timing of the event. This brings us to an estimated 17,787 people present at the time of the shootings. Note that reducing the estimated 20,000 Las Vegas attendance by the same one-third gives us 13,340, not 14,520.

It's never a good sign when they can't even get the simple division right. Now for the relevant probabilities.
  • Gilroy probability: Dividing 17,787 by 350,000,000 results in a probability of 0.00005082, or one in 19,677.
  • Las Vegas probability: Dividing 13,340 by 350,000,000 results in a probability of 0.00003811428, or one in 26,237.
  • Gilroy AND Las Vegas probability: Multiplying 0.00005082 by 0.00003811428 results in a probability of 0.0000000019369677096, or one in 516,270,868.
You will observe that 88.8 percent, or 1/1.13, is very, very far away from 1/516,270,868, and 0.0000000019369677096 cubed - to account for all three dual survivors reported - is even further off from 70 percent. As I originally stated, the odds against anyone having been at both events are astronomical, even if we leave out relevant factors such as the fact that 11 percent of all US citizens have never left their birth state throughout the course of their entire lives and that Las Vegas averages 24,381 visitors from California every day.
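
The arithmetic in the bullets above can be checked with a few lines of Python. This is only a sketch: it uses the post's estimates of 17,787 and 13,340 people present, a population of 350,000,000, and the assumption that attendance at the two events is independent. The variable names are illustrative.

```python
# Sketch of the per-person arithmetic in the bullets above.
# Figures are the post's estimates; independence is assumed.
POPULATION = 350_000_000
GILROY_PRESENT = 17_787
LV_PRESENT = 13_340

p_gilroy = GILROY_PRESENT / POPULATION
p_lv = LV_PRESENT / POPULATION

# Probability that one specific, pre-selected person was at both events.
p_both = p_gilroy * p_lv

print(f"Gilroy:    1 in {1 / p_gilroy:,.0f}")
print(f"Las Vegas: 1 in {1 / p_lv:,.0f}")
print(f"Both:      1 in {1 / p_both:,.0f}")
print(f"Cubed:     1 in {1 / p_both ** 3:.3e}")
```

The unrounded product differs from the post's 516,270,868 only in the last few digits, because the post multiplies rounded decimals.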

I suggest that "astronomical" is a perfectly reasonable way to describe a probability of one in 516 million cubed, or if you prefer, one in 137,604,570,000,000,016,192,784,160. I also suggest that you refrain from attempting to correct me if your IQ is sub-Mensa level. And finally, I suggest that it is not "jumping to a conspiracy" to observe obvious and glaring statistical improbabilities.


427 Comments:

Blogger SirHamster August 07, 2019 12:06 PM  

@Markku,Dole

Never mind, needed to sleep on it. The binomial formula works for coin flips, where each trial is completely independent of the previous one.

This is not true for drawing without replacement, as each draw affects the pool and probability.

But for scenarios n >> k, the binomial method can provide a useful approximation of the real probability, or an upper bound.


Dole wrote: Yes, I didn't doubt that; what is surprising is that a phd approved the method (not the answer). Dumbing down indeed when a phd can't even handle basic probability.
You're pretty fixated on the phd title. The method provides a valid approximation.

The errors introduced in the population estimates outweigh the errors introduced through this approximation. In the end it doesn't matter, because the point of the exercise is getting a rough idea of the probability. Which was accomplished. People who want to criticize the math have to demonstrate it matters.

@Markku
Feels like that there's a sockposting Gamma trying to boost his own calls for Muh Apology.

Blogger Dole August 07, 2019 1:30 PM  

@SirHamster,

Drawing without replacement still assumes the events are independent; that's only more confusion regarding different concepts. If the events were dependent, you could not say that event A goers and other Americans would have an equal probability of being chosen for event B. If the events are dependent, you need different math. The only difference is whether there is replacement or not, and in this case there clearly isn't: having the same American at the event twice makes zero sense, so if an American is at the event, he can't be drawn again.

When a phd verifies math I would expect it to use correct formulas. People have been fired for much less.

Blogger J. B. August 07, 2019 1:56 PM  

FWIW, I picture the two mathematical approaches described in the OP in the following way:

Imagine a 350,000,000-sided die.

Method A (Uncephalized): Mark 13,340 of the die's sides with a red dot. Roll the die 17,787 times. What is the probability that a red-dot-marked side will be rolled during those 17,787 rolls?
A: I get 67.79%

Method B (VD): Mark 1 of the die's sides with a red dot. Roll the die 13,340 times (series V). Roll the die again 17,787 times (series G). What is the probability that the red-dot-marked side was rolled in both series G and V?
A: I get 0.0000001936968%
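
The two die-rolling pictures above can be written out as a short Python sketch. The function name and constants are illustrative, and rolls are assumed to be independent and uniform. If Method A is read as "at least one red-dot roll", this sketch gives about 49%. The 67.79% figure matches the expected number of red-dot rolls (about 0.678), which is a count and not a probability. Method B agrees with the figure quoted for it.

```python
import math

SIDES = 350_000_000      # sides on the imagined die
SERIES_V = 13_340        # Las Vegas figure from the post
SERIES_G = 17_787        # Gilroy figure from the post

def p_hit_at_least_once(marked: int, rolls: int, sides: int) -> float:
    """Chance that at least one of `rolls` lands on one of `marked` sides."""
    # Equals 1 - (1 - marked/sides) ** rolls, computed stably for tiny ratios.
    return -math.expm1(rolls * math.log1p(-marked / sides))

# Method A: 13,340 marked sides, 17,787 rolls, any marked side counts.
method_a = p_hit_at_least_once(SERIES_V, SERIES_G, SIDES)

# Method B: one marked side must turn up in series V and in series G.
method_b = (p_hit_at_least_once(1, SERIES_V, SIDES)
            * p_hit_at_least_once(1, SERIES_G, SIDES))

# Expected number of marked-side rolls in Method A (a count, not a probability).
expected_hits = SERIES_G * SERIES_V / SIDES

print(f"Method A, at least one hit: {method_a:.2%}")
print(f"Method B, same side in both series: {method_b:.3e}")
print(f"Method A, expected hits: {expected_hits:.4f}")
```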

Blogger Matrick August 07, 2019 2:28 PM  

I think an apology from VD to Uncephalized is in order here, especially given all the insults and smugness about "intellectual superiority".

And from the others who piled on. Frankly, this is a humiliation of the highest order. I've no head for maths, but, like judging the proverbial tall man, I knew that there could be a significant overlap between the people who attend both events without needing to crunch the numbers. I'm shocked that anyone could miss it.

Blogger Ten41 August 07, 2019 6:26 PM  

@157 Markku

Markku, if you ever visit this thread again, could you post the site that has your calculations? Would like to understand more.

Blogger Thomas Howard August 07, 2019 6:30 PM  

This entire thread is quite a sight to behold. Perhaps the most illuminating one in years. Could it be a welcome opportunity for introspection and displaying self awareness? Where things go from here will be interesting.

Blogger Markku August 07, 2019 6:56 PM  

It was not about reviewing math, it was whether the answer 0.000000002 or the answer 0.888 is the correct answer. As I already showed, using the approximation was perfectly legitimate in this case because even in the absolute worst case, it didn't change a single digit. And even if it did, those two numbers are so far apart that the question would still have been conclusively answered.

Blogger Markku August 07, 2019 7:03 PM  

https://infogalactic.com/info/Hypergeometric_distribution This also says that you are allowed to use the approximation when k is much smaller than n:

Quote:
Let Y have a binomial distribution with parameters n and p; this models the number of successes in the analogous sampling problem with replacement. If N and K are large compared to n, and p is not close to 0 or 1, then X and Y have similar distributions, i.e., P(X \le k) \approx P(Y \le k).

Blogger Markku August 07, 2019 7:14 PM  

As for expecting apologies, you guys crack me up.

Blogger Markku August 07, 2019 7:22 PM  

I don't really care if these are sockpuppets or not. I'm not going to delete them on a mere suspicion. If we were in a place where I thought this inept manipulation attempt would elicit more than a grin from people, then I might worry about it. But as it is, I'm perfectly willing to let it all stand there.

For posterity.

Blogger Peter August 07, 2019 9:39 PM  

Engineers 1 business majors 0 :)

Blogger chrimony August 08, 2019 12:18 AM  

This post by Vox Day, and his subsequent comments, are a gold mine. Anybody who took Probability 101 and understands the balls-in-an-urn category of problems knows where Vox went wrong. So much for his "superior intellect".

And being the scoundrel that he is, he even KNOWS he was proven wrong, but tries to act like he was correct because he asked the WRONG question to the problem at hand. Using Vox's reasoning, we should be surprised that anybody went to the Vegas concert at all, since the random chance of any one person in the United States attending it is ridiculously small. Only Vox's low-intelligence beta orbiters suffering from Dunning-Kruger effect are fooled by this, though they seem to make up the bulk of his fanbase.

Blogger Uncephalized August 08, 2019 2:37 AM  

@401 SirHamster "Feels like that there's a sockposting Gamma trying to boost his own calls for Muh Apology."

I don't know if that was aimed at me, but I don't particularly care if Vox apologizes or not. If he feels the need, I'll adjust my opinion of him accordingly.

@409 Markku "As for expecting apologies, you guys crack me up."

I haven't been around here very long, but that sounds about right.

Blogger Markku August 08, 2019 2:56 AM  

You got one from Stickwick, but she's a woman. They do all kinds of crazy shit. The way I see it, your ultimate conclusion was wrong. As SirHamster calculated, it is in fact more likely that there are NOT 3+ people shared between the two events, than that there are. Chances of that are 37%. Also, you did calculate two thirds by multiplying by the number 0.66. If you were to claim rounding, it would at least have to be 0.67 to be in the ballpark of correctness, but then you gave too many significant figures in the result for multiplying by a number with two significant figures. The 88.8% you got right. That's one out of three.

So, way I see it, what's fair is to acknowledge that the 88.8% is indeed the probability of finding 1+ overlap, assuming independent probability and no geographical bias in what events people attend. But apology - no.
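
The 37% and 88.8% figures mentioned above can be reproduced under a binomial model. The sketch below is an assumption-laden illustration, not the calculation SirHamster actually posted: it uses Uncephalized's figures (14,520 and 52,800 present, population 350,000,000) and treats each Gilroy attendee as an independent draw. The helper name `binom_pmf` is illustrative.

```python
from math import comb

# Uncephalized's figures; independence and uniform attendance are assumed.
POPULATION = 350_000_000
LV_PRESENT = 14_520
GILROY_PRESENT = 52_800

p = LV_PRESENT / POPULATION  # chance a given Gilroy attendee was at Las Vegas

def binom_pmf(k: int, n: int, prob: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * prob**k * (1 - prob) ** (n - k)

# P(0, 1 or 2 shared attendees), then the complement.
p_fewer_than_three = sum(binom_pmf(k, GILROY_PRESENT, p) for k in range(3))
p_three_or_more = 1 - p_fewer_than_three
p_one_or_more = 1 - binom_pmf(0, GILROY_PRESENT, p)

print(f"P(1+ shared attendees) = {p_one_or_more:.3f}")
print(f"P(3+ shared attendees) = {p_three_or_more:.3f}")
```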

Blogger Uncephalized August 08, 2019 10:36 AM  

@414 Markku -- yes, I acknowledged that error already and thanked SirHamster for pointing it out. People don't need to apologize over math. I'm much more annoyed by his extremely uncivil personal behavior towards me. But I don't expect it, it would be out of character, and I'm not that bothered.

Blogger SirHamster August 08, 2019 12:13 PM  

Uncephalized wrote: I don't know if that was aimed at me ...

Of course not. You aren't demanding an apology. How do you look at me talking about people demanding an apology, and thinking it applies to you when you aren't?

Uncephalized wrote: I'm much more annoyed by his extremely uncivil personal behavior towards me.

That's what you get for attempting to correct something that wasn't wrong, missing the context, and getting your own math wrong. It's a social smackdown.

Vox:
The odds against one person in a country of 320 million being in the vicinity of two such events are astronomical. The odds of this happening at least six times are astronomical squared. What the explanation is, I don't know, but the probability math indicates extreme skepticism is required.

"Which is flat-out wrong--unless I am making some boneheaded error, which is always possible, and why I showed my work--and leads me to think Vox didn't bother doing the math at all before jumping to a conspiracy as his explanation."

He made accurate observations and didn't make any explanations, but here you say he's wrong and jumped to an unsupported explanation.

As my math showed, your understanding of the scaling was off. If you treat 6 coincidences in mass shootings as .88^6, you'd think the observation was about 50/50, where the real probability with the givens is much much lower.

I don't recall seeing an apology to Vox on your part, but you have tried to pass off your math as being "less wrong". It is wrong and Vox's isn't.

Blogger SirHamster August 08, 2019 12:29 PM  

Markku wrote: So, way I see it, what's fair is to acknowledge that the 88.8% is indeed the probability of finding 1+ overlap, assuming independent probability and no geographical bias in what events people attend. But apology - no.

1+ overlap given the Vegas shooting.

But if I live in a small town with population of 14,520, I would see the same 88.8% chance of finding 1+ people in my small town at the Gilroy shooting.

The population of the small town and the set of Vegas shooting victims is different.

Namely, there are tons of small towns in America, but only one set of Vegas shooting victims. 88.8% chance is a relative probability that hides some of the improbability of the events we observed.

If we look for the expected value of Gilroy+Vegas double shooting victims, we use Vox's probabilities to calculate an absolute probability. That gives us an EV of 0.68.

3 double shooting victims is 440% of that expected number. Would need to calculate variance and some other long forgotten stats stuff to analyze that further, but it looks surprising at a glance.
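
The expected value above, and one possible follow-up, can be sketched in Python. The Poisson model used for the tail is an assumption added here for illustration (its variance equals its mean), not something stated in the comment. The figures are the post's 17,787 and 13,340 with a population of 350,000,000.

```python
import math

POPULATION = 350_000_000
P_BOTH = (17_787 / POPULATION) * (13_340 / POPULATION)  # per-person probability

# Expected number of people present at both events.
expected_overlap = POPULATION * P_BOTH

def poisson_tail(k_min: int, mean: float) -> float:
    """P(X >= k_min) for a Poisson variable with the given mean."""
    below = sum(math.exp(-mean) * mean**k / math.factorial(k)
                for k in range(k_min))
    return 1 - below

# Assumption: the overlap count is roughly Poisson with this mean.
p_three_or_more = poisson_tail(3, expected_overlap)
ratio = 3 / expected_overlap

print(f"Expected overlap: {expected_overlap:.2f}")
print(f"3 observed is {ratio:.0%} of the expected number")
print(f"P(3+ overlaps), Poisson approximation: {p_three_or_more:.3f}")
```

Under these assumptions, three or more overlaps comes out at roughly 3%.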

Blogger Uncephalized August 08, 2019 4:12 PM  

@416 "Of course not. You aren't demanding an apology. How do you look at me talking about people demanding an apology, and thinking it applies to you when you aren't?"

You mentioned sockpuppeting and I thought you might mean that I was doing it. Apparently that's not what you meant, so fine. Thanks for the clarification, no need to be so touchy about it.

"That's what you get for attempting to correct something that wasn't wrong, missing the context, and getting your own math wrong. It's a social smackdown."

That's an... interesting take on how the interaction went. By my lights it was Vox missing the context, using an irrelevant calculation to arrive at a number that was something like 8 orders of magnitude out of the reality, despite being narrowly correct insofar as he didn't make any errors of arithmetic; whereas mine was at most off by a single order, and contained a subsequent error that wasn't central to the main conclusion--hence my later statement that it was "less wrong". He then immediately descended into personal insults and smug condescension. I expected that reaction and mostly ignored it, preferring to focus on the argument instead.

If you see that as me having given the greater offense, we're not going to see eye to eye.

"As my math showed, your understanding of the scaling was off. If you treat 6 coincidences in mass shootings as .88^6, you'd think the observation was about 50/50, where the real probability with the givens is much much lower."

I've acknowledged you on this several times now, and thanked you for the correction the first time you pointed it out. Do you have a further point? As I recall this brought the likelihood of the coincidence down to something like 20%, which is still much closer in magnitude to my original 70% than it is to 0.000000002.

"He made accurate observations and didn't make any explanations, but here you say he's wrong and jumped to an unsupported explanation."

Don't pretend you didn't know what he meant; it's unbecoming and dishonest. Is there anything in your previous experience with Vox's worldview that would possibly lead to the conclusion he wasn't implying some kind of conspiracy?

He even admitted that the calculation I stepped through in my first post was the one he should have made: "my observation about it being astronomically unlikely that any individual present at one mass shooting in the United States will be present at another one"

His words, not mine. The problem he is describing in words is not the one he solved in math. Therefore, he was wrong. I don't know how to put it any more clearly.

Blogger SirHamster August 08, 2019 5:46 PM  

Uncephalized wrote: You mentioned sockpuppeting and I thought you might mean that I was doing it. Apparently that's not what you meant, so fine. Thanks for the clarification, no need to be so touchy about it.
Unless you're engaged in sockpuppeting, that comment wouldn't apply to you either. I'm not being touchy, but your ego is showing.

By my lights it was Vox missing the context, using an irrelevant calculation to arrive at a number that was something like 8 orders of magnitude out of the reality ...
Is the expected value of double victim shootings a relevant calculation for the purposes of statistical analysis?

Don't pretend you didn't know what he meant; it's unbecoming and dishonest.
I'm not pretending. I agree with Vox's perspective and reject your Gamma reframing of events.

Is there anything in your previous experience with Vox's worldview that would possibly lead to the conclusion he wasn't implying some kind of conspiracy?
Better make up your mind right now: Did Vox jump to conclusions with an explanation of conspiracy, or did Vox imply some kind of conspiracy?

Implying a conspiracy is not an explanation of conspiracy. Which are you accusing Vox of, given you have said both things?

As to your specific question, Vox couldn't be clearer:
"What the explanation is, I don't know, but the probability math indicates extreme skepticism is required."

No implication of conspiracy, though that's a reasonable inference.

"my observation about it being astronomically unlikely that any individual present at one mass shooting in the United States will be present at another one"

His words, not mine. The problem he is describing in words is not the one he solved in math.

Your mistake is that the words he used can describe 2 math problems:

1. How likely is it for us as 3rd parties to see any overlap between the victim groups of the two events by random chance. (88.8%)

2. How likely is it for an individual to be present at both events by random chance. (1 out of 516 million)

Vox's OP shows that he meant the latter. The existence of the first problem does not invalidate the second problem.

I give you credit for bringing up the calculation. It's a relevant metric. But you are reading more into Vox's words than what is actually there, your "corrections" needed their own corrections, and you insist there is an error when there isn't.

Uncephalized wrote: Therefore, he was wrong. I don't know how to put it any more clearly.
You are wrong, he wasn't wrong. Reread, rethink, and stop digging.

Blogger Matamoros August 09, 2019 10:11 AM  

This guy supposedly was at three terror attacks and survived - https://nypost.com/2016/03/25/the-american-who-survived-brussels-paris-and-boston-terror-attacks-says-hes-lucky/ .
The likelihood has to be astronomical - Paris, Brussels and Boston

Blogger BaddoSpirito August 09, 2019 10:23 AM  

Your calculation is wildly wrong here. His calculation is also wrong but much closer to the actual probability than yours. It would be easy to see why you are wrong if you looked at a much simpler example. Suppose there are only 4 people in the population and 2 people attend event A and event B. If your calculations were correct, then the probability that at least one same person will attend both events should be (1/2) * (1/2) = 1/4. If you write down all possibilities however, this is obviously wrong.

If we number the people 1,2,3,4, the following possibilities exist for who attends event A: 12, 13, 14, 23, 24, 34. Similarly, for each of these possibilities, there are also 6 possibilities for who attends event B: 12, 13, 14, 23, 24, 34. For a total of 36 possibilities. Assuming all possibilities are equally likely and events are independent, you can see that the probability of at least one same person being in both events is 5/6 because, for example, if 12 attended the first event, then five out of six possibilities for the second event, i.e. 12,13,14,23,24, satisfy the condition that at least one same person attends both events. And the same is true for all other possibilities for event A.

The correct way to calculate this is as 1 - Prob(there is no same person who attended both events) = 1 - (2/4)*(1/3) = 5/6.

In the real world problem, the probability of at least one same person attending both events is:

prob = 1
for i = 0 to size_of_group_A - 1
    prob = prob * ( (totalPopulation - size_of_group_B - i) / (totalPopulation - i) )
end
prob = 1 - prob

For 350 million population and group sizes 17,787 and 13,340, the probability is about 49%. Calculating the probability of at least 3 same people being in both groups is more complicated and not exactly equal to (0.49)^3 as Uncephalized suggests but it is certainly not astronomical and would not be too far off from (0.49)^3 = 12%.
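
The loop above translates directly into Python. This is a sketch under the comment's own assumptions (every group of the given size is equally likely, and the two events are independent). The function name `p_any_overlap` is illustrative. The with-replacement approximation is included for comparison only.

```python
import math

def p_any_overlap(total_population: int, size_a: int, size_b: int) -> float:
    """1 - P(no member of group A is also in group B), without replacement."""
    p_none = 1.0
    for i in range(size_a):
        # The i-th member of group A must avoid all of group B.
        p_none *= (total_population - size_b - i) / (total_population - i)
    return 1.0 - p_none

# Toy case from the comment: 4 people, 2 at each event.
toy = p_any_overlap(4, 2, 2)

# Figures from the post: 17,787 at Gilroy, 13,340 at Las Vegas.
real = p_any_overlap(350_000_000, 17_787, 13_340)

# With-replacement (binomial) approximation of the same quantity.
approx = 1 - (1 - 13_340 / 350_000_000) ** 17_787

print(f"Toy example: {toy:.4f}")
print(f"Exact, post's figures: {real:.4f}")
print(f"With-replacement approximation: {approx:.4f}")
```

For groups this small relative to the population, the exact and approximate values agree to about four decimal places.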

