ALL BLOG POSTS AND COMMENTS COPYRIGHT (C) 2003-2020 VOX DAY. ALL RIGHTS RESERVED. REPRODUCTION WITHOUT WRITTEN PERMISSION IS EXPRESSLY PROHIBITED.

Tuesday, August 06, 2019

Math is hard, Barbie

Uncephalized demonstrates why it is tremendously foolish to attempt to "correct" one's intellectual superiors in taking exception to my observation about it being astronomically unlikely that any individual present at one mass shooting in the United States will be present at another one:
I'm responding to Vox's OP: The odds against one person in a country of 320 million being in the vicinity of two such events are astronomical.

Which is flat-out wrong--unless I am making some boneheaded error, which is always possible, and why I showed my work--and leads me to think Vox didn't bother doing the math at all before jumping to a conspiracy as his explanation. I may be in error here--I'm sure someone will quickly point it out, if so--but by my math these coincidences are far from astronomically unlikely.

Las Vegas 2017 attendance: 20,000
Gilroy 2019 attendance: 80,000

I don't know how many attendees were actually physically present at each event at the time of the shootings, but I'll assume two thirds, so 14,520 and 52,800.

Proportion of US population present at LV shooting: 14,520 / 350,000,000 = .000041 or .0041%

Proportion of the population NOT at LV is the inverse or 99.9959%

Likelihood of one person being at both events is then: 1 - (.999959^52,800). Which is 88.8%. The number of times this apparently happened is 3, so it's 0.888^3, or 70%.

In other words, through purely random chance it is more likely than not that 3 people who were at the LV 2017 shooting would also be present at the Gilroy shooting.

Making similar estimates about the LV-Parkland connection: 39% likely assuming an average family size of 4, 3000 attendees of Parkland and we are looking for a direct family member involved in both events.

The Borderline-LV coincidence has the lowest odds as I run the numbers, actually quite low, at 0.046% or about 1 chance in 2200, partly because the guy was actually killed, a much lower number than "was also there".

I don't know enough about the San Bernardino or Toronto events to start even making assumptions.

In this model the probability of the LV-Gilroy and LV-Parkland coincidences both happening is 0.70*0.39 = 0.27 or 27%. Just better than 1 in 4.

The very large numbers present at these festivals make for counterintuitive probabilities. The Borderline connection is the only one that even gives me pause, but even 1 in 2200 is not what I'd call "astronomically low".
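The arithmetic in the quoted comment can be reproduced with a short sketch. The variable names are illustrative, and the attendance and population figures are the commenter's own estimates, not measured values. The sketch uses the unrounded proportion, which is what yields the quoted 88.8%.

```python
# Figures taken from the quoted comment; all are the commenter's estimates.
US_POPULATION = 350_000_000
LV_PRESENT = 14_520        # commenter's estimate of people present at Las Vegas
GILROY_PRESENT = 52_800    # commenter's estimate of people present at Gilroy

# Proportion of the population present at Las Vegas.
p_at_lv = LV_PRESENT / US_POPULATION

# Chance that at least one of the Gilroy attendees was also at Las Vegas,
# treating every attendee as an independent random draw (the comment's assumption).
p_at_least_one = 1 - (1 - p_at_lv) ** GILROY_PRESENT

# The comment then cubes this figure and multiplies by the Parkland estimate.
p_cubed = p_at_least_one ** 3
p_joint = 0.70 * 0.39

print(f"proportion at LV:       {p_at_lv:.6f}")
print(f"at least one at both:   {p_at_least_one:.3f}")
print(f"cubed:                  {p_cubed:.2f}")
print(f"LV-Gilroy and Parkland: {p_joint:.2f}")
```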
Uncephalized is correct about one thing. I didn't bother actually doing the math beforehand because I didn't need to do it. And I didn't need to do it in order to have a general idea about the size of the probabilities involved because a) I am highly intelligent and b) I have an instinctual grasp of mathematical relationships. I knew the probabilities were astronomical, in much the same way you know a tall man's height is over six feet without needing to actually measure him. The boneheaded error Uncephalized has made here is that he simply doesn't know how to calculate the probability of independent events. But it's really not a difficult concept. For example, if the odds of rolling a six on a six-sided die are 1 in 6, then the odds of rolling two sixes on two different six-sided dice are (1/6) * (1/6) = 1/36.

But before we calculate the probability of these two specific independent events, let's get the base numbers right. The Gilroy Garlic Festival is a three-day event, so that 80,000 is reduced to 26,667 before being reduced another one-third as per Uncephalized's assumption to account for the timing of the event. This brings us to an estimated 17,787 people present at the time of the shootings. Note that reducing the estimated 20,000 Las Vegas attendance by the same one-third gives us 13,340, not 14,520.

It's never a good sign when they can't even get the simple division right. Now for the relevant probabilities.
  • Gilroy probability: Dividing 17,787 by 350,000,000 results in a probability of 0.00005082, or one in 19,677.
  • Las Vegas probability: Dividing 13,340 by 350,000,000 results in a probability of 0.00003811428, or one in 26,237
  • Gilroy AND Las Vegas probability: Multiplying 0.00005082 by 0.00003811428 results in a probability of 0.0000000019369677096, or one in 516,270,868.
You will observe that 88.8 percent, or 1/1.13, is very, very far away from 1/516,270,868, and 0.0000000019369677096 cubed - to account for all three dual survivors reported - is even further off from 70 percent. As I originally stated, the odds against anyone having been at both events are astronomical, even if we leave out relevant factors such as the fact that 11 percent of all US citizens have never left their birth state throughout the course of their entire lives and that Las Vegas averages 24,381 visitors from California every day.

I suggest that "astronomical" is a perfectly reasonable way to describe a probability of one in 516 million cubed, or if you prefer, one in 137,604,570,000,000,016,192,784,160. I also suggest that you refrain from attempting to correct me if your IQ is sub-Mensa level. And finally, I suggest that it is not "jumping to a conspiracy" to observe obvious and glaring statistical improbabilities.
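The multiplication above can be checked with a few lines of code. This is only a sketch of the arithmetic as stated in the post: the variable names are illustrative, and the inputs are the post's estimates. Because it works from the unrounded fractions, the last digits may differ slightly from the figures quoted.

```python
# Estimates used in the post.
US_POPULATION = 350_000_000
GILROY_PRESENT = 17_787
VEGAS_PRESENT = 13_340

# Probability that one given person was present at each event.
p_gilroy = GILROY_PRESENT / US_POPULATION
p_vegas = VEGAS_PRESENT / US_POPULATION

# Probability that the same given person was present at both,
# treating the two events as independent (the post's assumption).
p_both = p_gilroy * p_vegas

# Express each probability as "one in N".
odds_gilroy = 1 / p_gilroy
odds_vegas = 1 / p_vegas
odds_both = 1 / p_both
odds_cubed = odds_both ** 3

print(f"Gilroy: one in {odds_gilroy:,.0f}")
print(f"Vegas:  one in {odds_vegas:,.0f}")
print(f"Both:   one in {odds_both:,.0f}")
print(f"Cubed:  one in {odds_cubed:.4e}")
```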


427 Comments:

Blogger Azimus August 06, 2019 6:07 PM  

Bless you, bless you all you fellow nerds! I love this blog...

Blogger JH August 06, 2019 6:11 PM  

So if the first event is guaranteed, then each person of the first event would have 50/50 chance on attending the second event, right? The population shouldn't impact the probability.

Blogger Stilicho August 06, 2019 6:13 PM  

Markku, I haven't had a chance to run through your work at the other place. However, I view your question about ignoring the event which already occurred as a category error.

If gilroy already occurred, then the probability of 17787 (assuming that is the correct number present) attending is 1. MC and uncephalized are then calculating the probability that one of a random group of 17787 Americans did not attend Vegas (I know the temporal order is backwards but that's the order we've been discussing). This calculation does not account for the exclusivity of the Gilroy group.

If you don't, you are calculating from a midpoint in a temporal sense, and the calculation only shows you the probability that no one from a random group of Americans numbering 17787 was present at another random gathering of 13340 Americans. The Gilroy group ceased being random before Vegas in this scenario but the calculation does not account for it. The calculation is exactly the same as if you simply calculated the probability that anyone from a random group of 17787 Americans was at Vegas.

There may be a better way of accounting for the distinct makeup of the Gilroy group, but you cannot ignore it. If you do, you might as well assume that 3 gilroy attendees were also at Vegas, because it has already happened as reported. What we are trying to do here is determine the odds that no specific member of the gilroy group was also at vegas.

Blogger Dole August 06, 2019 6:15 PM  

@Markku 0! is 0? What math is this?

Blogger Markku August 06, 2019 6:16 PM  

You are thinking of coin tosses, JH. The probability of the second event is the (real) attendance divided by US population. This is the same for a random American, and a survivor of the first event, because we are talking about conditional probability. The probability of B given A. I posted the link earlier for calculating conditional probabilities.

But with coin tosses, if we ask the question "100 people toss a coin twice. Of those who got heads on the first toss, how likely is it that they get heads on the second?". The answer is 50%. By the selection of the population, you have eliminated the probability of the first heads. Similarly, by putting out the call at the second event, you have eliminated the probability of the second event, and you only consider the first event. And since you don't just randomly walk to a person once, and ask if he was at the first event, but you are putting a call for such people to identify themselves, it becomes an entirely different calculation.

Blogger dcn August 06, 2019 6:17 PM  


"How do you square these two statements?"

There's no conflict. I never accepted his "88%" as a correct conclusion to the sum of all problems; I said he was correct in his selection of the problem and his methodology. He did not account for additional factors like likelihood to travel. I assume he did not account for them because he could not. What is the exact probability that someone who would attend a country music concert in LV would also attend a garlic festival in California? If you have that number then share it, but for now all we can do is calculate the number as though it's perfectly random and then make logical assumptions about whether it's too high or too low. I assume it's way too high, but without more information beyond the quantity of attendees, how much more accurate can you get? It's not a matter of him "ignoring a bunch of stuff", it's a matter of what information we have access to.

Sperging over some theoretical "ideal textbook problem" is fun and all, but not really useful.

Pure math is useful because it gives you a baseline. Multiplying irrelevant numbers together to get some mind blowingly small number to surprise your audience is what's not really useful.

How come we don't see such similar "coincidences" when we pair up literally any other two similar events? If the math works out this simply and the probs are that high, we should be seeing multiple double attendance at all kinds of real world "events," and we simply do NOT see that.

I don't know, and I'm not denying that this goes beyond coincidence. But we don't see everything that happens. For us to know someone was at both, the person in question has to say so and someone with a platform has to tell it to us. My point was that Vox's calculation had nothing to do with the problem he highlighted, and Uncephalized had a better methodology and it's therefore completely inappropriate for Vox to be shitting on this guy, because he was more right than Vox was.

Blogger Markku August 06, 2019 6:18 PM  

Technically 0! is NaN. Which makes the binomial distribution NaN, and my argument remains. You don't need it.

Blogger Dole August 06, 2019 6:18 PM  

@JH That is correct (if the events are independent). The attendance on the first event has no bearing on the second.

Blogger Dole August 06, 2019 6:20 PM  

@Markku What? 0! is ONE!!!

Blogger Markku August 06, 2019 6:20 PM  

"How come we don't see such similar "coincidences" when we pair up literally any other two similar events?"

Because there is not enough interest to put out a call, or answer a call. Maybe such a person told that story in a bar once. How would we ever know?

But here, we have the interest because of the narrative, and now we actually get the textbook probability.

Blogger Azure Amaranthine August 06, 2019 6:22 PM  

I didn't even try to think about the math in the other thread. Now, in order to avoid confusion from following stupid people doing stupid calculations, I haven't read any calculations in this thread other than those in Vox and Uncephalized's replies.

Correct, Uncephalized did not read before saying it was wrong.

However, it's odd to calculate the probability of one person out of ~350 million being at both events. In order to then get the probability that there was at least one attendee of both events out of the given population size, you'd have to multiply that number by the population you divided by in the first place. ~1/512 million multiplied by ~350 million is ~68%.

There are many potential extenuating factors, the majority of which would be expected to lower the probable result. Among them are such things as how little most people travel during their lives, and that there was more than a single incident of a person being at multiple shootings, rather, quite a few.

On the other side of the factor fence is the possibility of foreign travelers being present at both events, which is rightly excluded by the opening statement. Also that there are a limited number of arbitrarily defined "locations" in which to be present in the USA.

All this being said, there is a large enough sample size of Uncephalized's chosen inputs to calculate the probability of so many inaccuracies favorable to one conclusion over the other, and it's low enough that I would logically suspect bias rather than coincidence.

Blogger The Cooler August 06, 2019 6:23 PM  

That's it. Noogies for everyone.

Blogger Markku August 06, 2019 6:23 PM  

Hmm, yes, so it is. I have never actually used 0! because it seems like a totally nonsense concept. Looks like it is DEFINED into 1: https://zero-factorial.com/whatis.html. There is probably some scenario where this solves a problem.

Blogger P Glenrothes August 06, 2019 6:23 PM  

@176 How about the odds of surviving a first shooting and then getting killed at a second shooting? That is different from merely being present at both events. And infinitely more improbable. Can the quants advise?

The Borderline is a country music venue, as was the site of the Las Vegas shooting, just 300 miles away. The Vegas event was heavily advertised among Borderline patrons. It's a small, dedicated community. In short, there is nothing improbable about a victim being at both venues. This important detail is overlooked; instead, skepticism centers around a victim's unusual name. Not much of a case here.

Blogger Catallaxy August 06, 2019 6:27 PM  

@203 you're making it more complicated than it needs to be. Probability can get confusing and can be counter-intuitive which is why I like to write simple simulations when I need to convince myself of something. That way it just comes down to counting and computers are good at counting.

I can write a program to select a set of 13,340 random Vegas attendees from a population of 350,000,000 individuals and to independently select a set of 17,787 random Gilroy attendees from the same population of 350,000,000 individuals. I can then have the computer count how many individuals are in both sets. I can then ask it to repeat that trial as many times as I want and summarize the results.

Here's the results from one test run doing that 100 times (see the code running here: http://ideone.com/69fIfB )

0: 53
1: 31
2: 10
3: 6

Almost half the time there are one or more people at both events. No fancy probability calculations or math. Just counting.
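The program behind that link is not reproduced here, but the procedure described above can be sketched as follows. This is a minimal illustration, not the commenter's actual code: the function names, the seed, and the use of Python's `random.sample` are assumptions. Counts will vary from run to run unless the seed is fixed.

```python
import random
from collections import Counter


def count_overlap(rng, population, n_first, n_second):
    """Draw two independent random groups and count people in both."""
    first = set(rng.sample(range(population), n_first))
    second = rng.sample(range(population), n_second)
    return sum(1 for person in second if person in first)


def run_trials(trials, population=350_000_000,
               n_vegas=13_340, n_gilroy=17_787, seed=0):
    """Repeat the experiment and tally how often each overlap size occurs."""
    rng = random.Random(seed)  # fixed seed so a run can be repeated
    return Counter(
        count_overlap(rng, population, n_vegas, n_gilroy)
        for _ in range(trials)
    )


if __name__ == "__main__":
    results = run_trials(100)
    for overlap in sorted(results):
        print(f"{overlap}: {results[overlap]}")
```

Each output line reads "number of people in both groups: number of trials with that result", matching the layout of the tally above.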

Blogger Dole August 06, 2019 6:27 PM  

@Markku It's not nonsense, especially once you consider the continuous extension of the factorial function.

https://en.wikipedia.org/wiki/Gamma_function

I remember seeing some videos on the good reasons on youtube related to this also. Might want to check it.

Blogger Salt August 06, 2019 6:31 PM  

I don't see how the population at 300M+ is relevant. It's not like everyone would be interested in going to a ~Baby Metal concert even if it were in their town.

Blogger Markku August 06, 2019 6:32 PM  

The traditional definition of factorial is multiplying all integers, starting from the given integer, down to one. So, multiplying zero down to one seemed like nonsense. I assumed it would be arbitrarily defined into zero. But yes, there is probably an alternative way to define the factorial, and in that definition, the answer of one makes sense.

Blogger Azure Amaranthine August 06, 2019 6:32 PM  

Effectively, you're calculating the probability that there is a limited number of chances for a particular person out of 350m being present at a location with a limited number of people. Doing that twice and multiplying the results gives you the chance that if you chose a random person out of 350m that they would then be present at both events, not the chance that someone out of 350m would be present at both events.

Blogger Azure Amaranthine August 06, 2019 6:35 PM  

"I don't see how the population at 300M+ is relevant. It's not like everyone would be interested in going to a ~Baby Metal concert even if it were in their town."

That would be an extenuating factor decreasing the ultimate probability. The reason we don't calculate these all out is because there are a ridiculous number of them, and ones like this involve applying mathematics to assumed psychology.... yeah no, not jumping into that shitpit.

Blogger Hammerli 280 August 06, 2019 6:39 PM  

The other issue that I have not seen discussed much is that we may be discussing a fairly small group. If I were to take photographs of the attendees at one day of the National Muzzle-Loading Rifle Association's national matches, then take photos of the attendees at the North-South Skirmish Association's Fall Nationals, the odds are extremely high of having some of the same people in both sets of photos. Because the membership of both groups has overlap.

If we're going to argue over the math, we need to nail down all the variables.

Having said that, my gut is churning over this. Because I, too, have noticed how these events "just happen" to take place at precisely the right moment for the Left.

Blogger Dole August 06, 2019 6:39 PM  

There are also factors that may increase the probability. In any case, I would say that the statistical arguments against people attending both events fails.

Blogger Ingot9455 August 06, 2019 6:40 PM  

@200 Exactly. Thus my comment: enemy action.

Blogger Markku August 06, 2019 6:46 PM  

Revised argument for why the distribution disappears when x is zero: Since 0!=1, n-choose-k is 1 whenever k is 0, whatever n is. Binomial probability formula is:

n-choose-x * p^x * (1-p)^(n-x).

Note that x is 0, so anything^0 is always 1. n-choose-x is also 1. That leaves us only the last element, and since x was zero, it only leaves us the power of n. Which is exactly what you find in the original calculation. You get the 1-p, and then you get the power of the total population.
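Written out as a worked equation, the reduction described above looks like this. It is a sketch of the standard binomial formula with x = 0, where n is the number of trials and p the per-trial probability, as in the comment.

```latex
\begin{align*}
P(X = 0) &= \binom{n}{0}\, p^{0}\, (1-p)^{n-0} \\
         &= 1 \cdot 1 \cdot (1-p)^{n} \\
         &= (1-p)^{n}, \\
P(X \ge 1) &= 1 - P(X = 0) = 1 - (1-p)^{n}.
\end{align*}
```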

Blogger Azure Amaranthine August 06, 2019 6:47 PM  

"The other issue that I have not seen discussed much is that we may be discussing a fairly small group."

Outside of "people who like large public entertainment gatherings", there aren't a lot of short correlative leaps apparent.

What I find more interesting in that regard, is that is the organizers of this sort of thing are not merely human, they appear to have a vanguard of humans whose travel they can control.

Blogger SirHamster August 06, 2019 6:48 PM  

Markku wrote:Hmm, yes, so it is. I have never actually used 0! because it seems like a totally nonsense concept.

It easily shows up when you use the binomial equation:

binom(n, k) = n! / (k! * (n-k)!)


binom(10, 0) counts how many combinations of 0 heads you can get in 10 coin flips. Correct answer is 1.

binom(10, 1) counts how many combinations of 1 head you can get in 10 coin flips, and the correct answer is 10.
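A small sketch of the same formula shows why 0! has to be 1 for it to work. The function name `binom` follows the comment; the implementation via `math.factorial` is just one straightforward way to write it.

```python
from math import factorial


def binom(n, k):
    """Number of ways to choose k items out of n: n! / (k! * (n-k)!)."""
    return factorial(n) // (factorial(k) * factorial(n - k))


# 0! is defined as 1, so the k = 0 case does not divide by zero.
print(factorial(0))   # 1
print(binom(10, 0))   # 1
print(binom(10, 1))   # 10
```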

Blogger Azure Amaranthine August 06, 2019 6:48 PM  

*if* the organizers.

Blogger Azure Amaranthine August 06, 2019 6:55 PM  

"This is not hard: tv stations have a news tip number. Just call it and say, " My friend X was at two shootings! Call me at $&@€£>!! Then when the reporter calls, produce crisis actor."

That would stuff the "witness" box, but the event itself would have to be organized separately.

Unless you just produced, say, 30+ "witnesses" "present" at a location at a moment in time when none of the normal population of that location would be present to contradict them. Such as during a fire drill evacuation?

Blogger Markku August 06, 2019 6:58 PM  

SirHamster wrote:Markku wrote:Hmm, yes, so it is. I have never actually used 0! because it seems like a totally nonsense concept.

It easily shows up when you use the binomial equation:

binom(n, k) = n! / (k! * (n-k)!)

binom(10, 0) counts how many combinations of 0 heads you can get in 10 coin flips. Correct answer is 1.

binom(10, 1) counts how many combinations of 1 head you can get in 10 coin flips, and the correct answer is 10.


Yeah, now that you reminded me, that was indeed the argument I heard at school for defining 0! that way.

Blogger New Atlantis Lost August 06, 2019 6:58 PM  

I read the original bad math comment and didn’t have to do any math to know that thinking you have an 80% chance of getting shot at twice in different places is retarded.

Blogger Lightdescent August 06, 2019 6:58 PM  

@133 I totally agree. That is dead on. And that is why I think Vox is correct in his choice of math.

Blogger Markku August 06, 2019 7:02 PM  

New Atlantis Lost wrote:I read the original bad math comment and didn’t have to do any math to know that thinking you have an 80% chance of getting shot at twice in different places is retarded.

No, "you", which really means "any given individual at the first concert during the time of shooting", has 0.015% probability. But that is not the question. The question is that when we put a call to all of them, we can find at least one such individual. That probability is 88.8%.

Blogger Markku August 06, 2019 7:03 PM  

Let's say we toss a coin ten times. What is the probability that the first toss is heads? What is the probability that we can find heads among the tosses? Are those probabilities the same? That is the difference between the questions.
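The two coin-toss questions can be put side by side in a few lines. This is only an illustration of the distinction drawn above, assuming a fair coin; the function names are made up for the example.

```python
def p_first_toss_is_heads():
    """Probability that one specific toss (the first) comes up heads."""
    return 0.5


def p_heads_somewhere(tosses):
    """Probability that at least one of the tosses comes up heads."""
    return 1 - 0.5 ** tosses


print(p_first_toss_is_heads())   # 0.5
print(p_heads_somewhere(10))     # 0.9990234375
```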

Blogger Azure Amaranthine August 06, 2019 7:05 PM  

"to know that thinking you have an 80% chance of getting shot at twice in different places is retarded."

You? Sure, but you've already self-selected to then run those chances, so yes your personal chance will be something like 1/512,000,000.

Blogger Markku August 06, 2019 7:08 PM  

Also, we are defining "survivor" in a strange way here. Typically you'd understand it as someone who was shot at. But the bullet either missed, or he recovered from his wounds. But we are actually defining it just as being present at the event, at least in the media. If we defined it in the sane way, then the probabilities would get tiny.

Blogger Markku August 06, 2019 7:10 PM  

We could as easily calculate those probabilities, if someone suggested a way to estimate the percentage of people in the event who would have been in this situation.

Blogger Sharrukin August 06, 2019 7:17 PM  

32. Markku, August 06, 2019 7:02 PM

No, "you", which really means "any given individual at the first concert during the time of shooting", has 0.015% probability. But that is not the question. The question is that when we put a call to all of them, we can find at least one such individual. That probability is 88.8%.

This assumes a completely random probability which is incorrect. If there was a concert on a military base with the same number of attendees you would arrive at the same 88.8% chance and be 100% wrong.

Those who go to these events do not do so randomly nor are the Las Vegas concert goers randomly roaming the nation 365 days of the year.

The real life probability of this happening isn't going to be anywhere near 88.8%.

Blogger Markku August 06, 2019 7:19 PM  

It doesn't matter because Vox already set the standard, by using the assumption of independent probabilities. Only, he calculated the wrong thing with that assumption. He calculated the probability that when we stop a random man on the street, he turns out to have been at both shootings.

Blogger nyb August 06, 2019 7:19 PM  

Markku wrote:
That probability is 88.8%.


*Conditional* on all of the (pretty bad) assumptions made. What useful prediction can be made with that probability model?

The calling up of people has already been done (more evidence added to the question) and looks like 3 folks were at both! So the probability of at least one person in Vegas being at Gilroy = 1, given this additional evidence.

All probability is conditional.

Is this information useful to help estimate the probability of someone being at another *future* mass shooting event, given that they were already at one? It depends on the *cause* of them being at the one, and the guessed cause for them being at the other. Probability models are silent on cause. So we make bad guesses by making lots of bad assumptions. The math is right, but the logic isn't.

Blogger Outside The Box August 06, 2019 7:21 PM  

I am unimpressed with Vox Day's "huge intellect" based on this thread. You are flat out wrong on the math here, and your continued condescension towards the many who have patiently and correctly explained that to you is embarrassing. [I have a PhD in Applied Mathematics fwiw] Prove your intelligence by acknowledging when you have made a mistake. Wisdom is realizing that you can know only a tiny sliver of all that there is to know.

Blogger Dirk Manly August 06, 2019 7:26 PM  

@211

" In order to then get the probability that there was at least one attendee of both events out of the given population size, you'd have to multiply that number by the population you divided by in the first place. ~1/512 million multiplied by ~350 million is ~68%."

Not Even Wrong.

Blogger Markku August 06, 2019 7:29 PM  

Everyone loves credentialism, so I'll whip out mine too. (The rest of my grades you don't get to know).

https://i.imgur.com/GWrPBZi.jpg

I have obviously forgotten a few details during all those years, but I still remember the procedure of doing these calculations.

Blogger Dirk Manly August 06, 2019 7:29 PM  

@214

" It's a small, dedicated community. In short, there is nothing improbable about a victim being at both venues. This important detail is overlooked; instead, skepticism centers around a victim's unusual name. Not much of a case here."

Actually, the "unusual name" of a non-existent person is the MOST important piece of data in the whole thing. WHY is there a person at this event reporting that his name is that of an identity which DOESN'T EVEN EXIST?

That little item there has "spook shit" written all over it.

Blogger Dole August 06, 2019 7:32 PM  

@Outsidethemath

His math was wrong though.

Blogger MC August 06, 2019 7:34 PM  

Stilicho wrote:If gilroy already occurred, then the probability of 17787 (assuming that is the correct number present) attending is 1. MC and uncephalized are then calculating the probability that one of a random group of 17787 Americans did not attend Vegas (I know the temporal order is backwards but that's the order we've been discussing). This calculation does not account for the exclusivity of the Gilroy group.



Hi Stilicho

(BTW you're quite right - i introduced the backwards order of the events- i got Vegas mixed up with Texas, sorry, I'm in England! It doesn't affect the math)

I've been thinking it over on my way home from work

I'm starting to think that you and Vox etc are right in some way.

The "cold math" part - ie chance of any random 17,787 Americans being at Vegas is indeed ~50%

That part I'm fairly confident on - it seems counter-intuitive, like the Birthday Paradox, but it's still true.

However i'm rethinking my "confidence" in disregarding the low probability of them being in Gilroy ("event 1") in the first place - when looking for signs of something suspicious, this earlier low probability might also be a factor.


Quick Thought Experiment:

Imagine we introduce National Lottery in England. (same 1 million participants every week, 1 ticket each)

Imagine that Joe Bloggs wins the first 2 lotteries in a row.

We'd all suspect with near-certainty that there's a fix, or a flawed machine etc.

Now, that suspicion, is not just based on looking at the low probability of SECOND result.

Because ANYONE who won that second one was a 1-in-a-million chance, so what's suspicious about Joe Bloggs winning it as opposed to anyone else?

What makes it suspicious is the fact that the FIRST result was also so low-probability!

Hence the 2 "one-in-a-million" results combine to create the suspicion!

Uncephalized's (and my) mathematical way of looking at the situation do not appear to take this into account at all.

I must log off or will be in trouble with wife but will try to think on and post more tomorrow


Good night all


MC

Blogger SirHamster August 06, 2019 7:35 PM  

Conceptually, 0! describes the number of ways you can describe an empty set, {}, which is 1. It's identical to the number of ways to describe a set of 1, {A}. It's only when you go to a set of 2+ that you start getting combinations. {AB, BA}.


Did some number crunching on the probability function, using Uncephalized's raw numbers.

Probability at Vegas (Pv) = 0.000041
Number at Gilroy (G) = 52,800

Pr(n) is probability of n people being at Gilroy shooting given they were at Las Vegas shooting.


Pr(n) = Pv^(n) * (1-Pv)^(G-n) * binom(G, n)

Pr(0) = 0.1148
Pr(1) = 0.2485
Pr(2) = 0.2689
Pr(3) = 0.1940
...

All those probabilities should sum up to 1. Note that the highest probability is at 2 people. By adding all probabilities above a certain threshold, this allows us to see how surprising a certain number of people being at an event is. This is shown below.

Pr(1+) = 0.89
Pr(2+) = 0.64
Pr(3+) = 0.37
Pr(4+) = 0.17

From this, we see that while it might be unsurprising for 1 person to be at both events, it is not likely for 3 people to be at both events, which matches our intuition. Note also how the probability of additional people dies very quickly. 4+ people is pretty unlikely.

Uncephalized's math that 3 people are pretty likely (70%) is very wrong.
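The figures above can be reproduced with a short sketch of the same formula. The constants are the ones given in the comment; the function names are illustrative, and `math.comb` is used for binom(G, n).

```python
from math import comb

PV = 0.000041   # probability of having been at Vegas (from the comment)
G = 52_800      # number of people at Gilroy (from the comment)


def pr_exactly(n, p=PV, trials=G):
    """Binomial probability that exactly n of the trials are successes."""
    return comb(trials, n) * p ** n * (1 - p) ** (trials - n)


def pr_at_least(n, p=PV, trials=G):
    """Probability of n or more successes: 1 minus the terms below n."""
    return 1 - sum(pr_exactly(k, p, trials) for k in range(n))


for n in range(4):
    print(f"Pr({n})  = {pr_exactly(n):.4f}")
for n in range(1, 5):
    print(f"Pr({n}+) = {pr_at_least(n):.2f}")
```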

Blogger Dirk Manly August 06, 2019 7:35 PM  

@221

"Having said that, my gut is churning over this. Because I, too, have noticed how these events "just happen" to take place at precisely the right moment for the Left."

It's called "street theater."

It's a scripted drama, played out, not on a stage in a theater, but within the social environment. The casualties might be (and most likely are) real, but the purpose is to get the photographs and the quotes from "survivors" and "victim's surviving relatives" into the newspapers and on the radio/tv/website news reports.

Blogger Uncephalized August 06, 2019 7:38 PM  

@SirHamster I agree, it was incorrect to simply raise 0.888 to the third power to get the probability of three such events. Thanks for the correction. Fortunately this is not a killshot to my overall point.

@238 Markku "It doesn't matter because Vox already set the standard, by using the assumption of independent probabilities. Only, he calculated the wrong thing with that assumption. He calculated the probability that when we stop a random man on the street, he turns out to have been at both shootings."

Exactly. I never once claimed my numbers were perfectly accurate, the issue was that Vox wasn't even approaching the problem from a sensible direction.

Blogger Dave Dave August 06, 2019 7:40 PM  

@240. Nobody cares about your PhD if you have no idea what you're talking about. The probability that someone from event a was at event b is not close to 88.8%. There were zero people from Columbine at Parkland. There were zero people from San Bernardino at Gilroy. If you come to the conclusion that 88.8% is a reasonable and believable number, you are either missing the forest for the trees or are a liar.

Blogger Uncephalized August 06, 2019 7:42 PM  

@SirHamster (again) "Uncephalized's math that 3 people are pretty likely (70%) is very wrong."

To be fair, 70% is a lot less wrong than 1 in 500,000,000...

Anonymous Anonymous August 06, 2019 7:43 PM  

Azure Amaranthine said: That would stuff the "witness" box, but the event itself would have to be organized separately.
Unless you just produced, say, 30+ "witnesses" "present" at a location at a moment in time when none of the normal population of that location would be present to contradict them. Such as during a fire drill evacuation?

There isn't a "normal population" of a festival to notice. As long as nobody saw the supposed lucky dude during the weekend in question, it would be difficult to prove he didn't go to a large music festival, especially if he happened to have a bracelet pass or stamp to show around. I'm assuming if crisis actors were involved, their handlers would take care of 1) motel room far out of town for weekend in question 2) plastic festival paraphernalia. Of course, calling a tv station hotline would work if it really happened, too. No need for reporters tapping random people to interview.

Blogger Markku August 06, 2019 7:44 PM  

Stickwick is also a PhD, is signed up to have a channel at Unauthorized.tv, and agrees with the calculation.

Blogger Uncephalized August 06, 2019 7:46 PM  

@Markku I am curious about this 'other place' where all you numerate folk seem to be hanging out and waiting to ride to the defense of beleaguered nerds, but I understand if you don't want to say for some reason.

Blogger Dave Dave August 06, 2019 7:49 PM  

@250. 1/5,000,000 is a correct calculation for a random population. Your calculation is very, very wrong for what you're trying to do. Conditional probability is not appropriate to use because the logic behind it is incorrect. Conditional probability assumes that Las Vegas was a given, but it's not. If you change Las Vegas to another event, you'll see why the logic doesn't hold. Probability of a random person being at both is a more appropriate calculation.

Blogger Markku August 06, 2019 7:51 PM  

I'm sorry, it's for Vox Popoli regulars who have been commenting on the forum for roughly 10 years or more. You may have the wrong idea. We are all Vox's fans, and I'm actually his business partner. It just so happens that he stepped on it this time. We can admit to it, when that happens.

Blogger dcn August 06, 2019 7:52 PM  

The probability that someone from event a was at event b is not close to 88.8%. There were zero people from Columbine at Parkland.

Retard. The 88% number came from comparing two highly attended events. The 88% number only applies to those events or similarly attended events. Now you're comparing two SCHOOL shootings, one over 10 years ago involving adults, and one a couple years ago involving children. No shit there's no overlap.

Blogger SirHamster August 06, 2019 7:56 PM  

Uncephalized wrote:To be fair, 70% is a lot less wrong than 1 in 500,000,000...

1 in 500 million is describing something different. That doesn't make it wrong.

Blogger nyb August 06, 2019 7:56 PM  

Dave Dave wrote:@250.Conditional probability is not appropriate to use

There is No Such Thing as unconditional probability.

Pr(P|Q) quantifies our uncertainty of P *given* evidence Q. There is no "random" person - you are substituting your own conditions, e.g. 'a pseudo-random generator is used to select 1 person from 350 million in the USA'.

You are also specifying a different P - "What is our uncertainty that a specific person that we've pre-selected was in attendance at both events."

You cannot just write Pr(P). It's done *everywhere* though, which is one of the reasons that statistics is a disaster.

Blogger Dave Dave August 06, 2019 7:57 PM  

@256. Then you understand why using conditional probability is inappropriate. That's the point. Las Vegas and Gilroy do not have the type of relation where conditional probability would be sensible to use. Mathematicians can be really stupid because they get lost in the numbers without understanding what the reality is.

Blogger billo August 06, 2019 7:57 PM  

BadThinker wrote:

"Probability models, just like their causal and deterministic brethren, can and must be verified. That means only one thing: making predictions of observables never before seen. I mean never as in never... If the model makes useful predictions - where usefulness is related to the decisions one makes with a model - the model is good, else it is bad."



The statement is correct, but misapplied here. This is a statement about the *verification* of a model, not the use of one, and it is just a restatement of the scientific method. One validates a model by making predictions based on it. Once validated, it can be applied; in particular, it can be applied to past events. I can make, for instance, a predictive model to diagnose a disease and test it by making predictions about future cases. Once that model is verified, I can look at historical records and make inferences based on that model about whether or not, say, an Egyptian pharaoh had rickets, or Abraham Lincoln had Marfan's syndrome.

You are trivially correct in that observed events no longer can be predicted once known, and thus the "probability" of the observed event in the past is 0 or 1. However, one can make inferences about the past based on probabilistic models. Let us say that I am investigating the death of a person dead by the side of the road. There are injuries suggestive of, but not completely diagnostic of, a glancing hit and run pedestrian-motor vehicle collision. Given the hypothesis of a hit and run injury, I can make the "prediction" that there exists an occult fracture of the tibia on the side of the glancing hit. I prosect the lower extremity, and there is a small non-displaced fracture of the styloid process of the tibia. This is considered significant confirmation of the inference of a hit and run, compared to, for instance, a simple fall down an embankment. The model that suggests the fracture is validated by prospective studies, but the application to casework is not prospective.

Fundamentally, this is what VD and the OP are arguing about. Your formulation of the question as:

'This model starts with a a P of "A person at one mass shooting is also at another mass shooting" and a Q of "???"'

...is incomplete. Instead it is: "Given x = 'at second event', y = 'at first event', and the priors Q, does P(x|y) have a high probability, or need one look for some additional evidential component, e.g. y = 'at first event & has a plan for world domination'?"

VD's claim is that given the known priors, P(x|y) is very low, and thus y should have another evidentiary component. In other words, he claims that there should be a styloid process fracture that we are not looking for.

The OP's claim is that given the known priors P(x|y) is fairly high, and thus no need for an additional evidentiary component. He claims there is no need to look for that styloid fracture because the inferential strength of the model is sufficient.

My claim is that it is impossible to know the priors, and if you want to make an inference about the completeness of the evidence, then you have to move to a non-Pascalian model. Thus, while this model can be formulated in Bayesian terms in order to explain what they are trying to do, it won't work because you don't know most of the components of the priors -- and assuming away your ignorance is not a solution, which is what both VD and the OP are doing.

I'm getting the idea that your claim is that you can't make inferences about past events using probability models, and that's just not correct, I don't think. The legal system, at least, would disagree with you :-)

Blogger Uncephalized August 06, 2019 8:03 PM  

@Markku I see. No worries, just curious.

Anonymous Anonymous August 06, 2019 8:11 PM  

This is probably a reflection of my non-mathematical background when I say that all the statistics are probably wrong because it's not factoring in any of the other elements that, I believe, Vox makes mention of in his "relevant factors" statement. The more and more unrelated certain events are (like an obscure knitting festival in the Southwest vice a death metal concert in the Northeast), the more and more the unlikelihood. However, I would say the odds could potentially go up if the events are something tied together in subcultures, in locations that are also tied together (like the likelihood of some elderly crowds in NYC having ties to retirees in Florida). Southwestern folks often use Las Vegas as their own little retiree area. So odds go up.

However, it all seems irrelevant to me. It is one of those things that are in fact impossible to calculate due to the unknown potentially huge number of soft factors that would also greatly impact the odds. Odds based purely on population and basic probability math alone are all nonsense and not going to be truly accurate as they don't factor in the soft elements; however, knowing what the soft factors actually are is also impossible to be 100% confident in the accuracy of, the relevance of, or the feasibility of even being able to calculate.

It goes without saying, that the odds are astronomical, unless there was some unknown element of sub-cultures tying the places together, making the association of the two areas' populations all the more common. Otherwise, odds are tiny.

Also, to Vox's credit, he appears to be one of the few that can both express doubt while also not marrying himself unquestioningly to that doubt. In his latest videos, he concedes it's possible that the shooters are who it is reported. He's merely expressing doubt over the trustworthiness of the official narrative. That is hardly a conspiratorial viewpoint. I would actually say it's the opinion of a moderate. I feel the more accurate statement would be that some occasional viewers, commenters, audience members appear to be conspiratorial but as these are all just statements, often of a speculative nature, made by people on the internet, it is impossible to gauge the extent to which they're simply thinking out loud. And even if conspiratorial, what does it matter? A healthy dose of doubt toward what is reported is beneficial. It is the business of news outlets to engineer themes in their overarching reporting stream, themes which have been preselected by the editorial divisions of the media as well as the board members. This means that even when they are reporting truth, they are still lying as they will never be reporting it in a vacuum, but always a blip in a larger narrative of perception they are trying to distort to meet their agenda. That's not conspiracy, that is their openly admitted intent.

Blogger Dave Dave August 06, 2019 8:20 PM  

@262. I agree. Trying to mathematically determine the chances of these obvious false flags is a waste of time. We already know that the chances of it all being an accident are effectively zero. Using math tricks is fundamentally idiotic because you can always define the parameters to make something probable, or to make it improbable. There are always more factors you can add in to lower the probability, and there are some you can add to increase the probability. Coincidence is fake news.

Blogger Dole August 06, 2019 8:21 PM  

@Markku, where is your calculation? What formula did you use?

Blogger Azure Amaranthine August 06, 2019 8:26 PM  

"Not Even Wrong."

That's nice Dirk. Now show your work.

Blogger SirHamster August 06, 2019 8:26 PM  

Dave Dave wrote:Las Vegas and Gilroy do not have the type of relation where conditional probability would be sensible to use.

Conditional probability is a way to incorporate existing information into a statistical calculation.

Given information about the number of people at the Vegas shooting, we can determine the conditional probability that some of them will be at the Gilroy shooting.

A lack of relation makes them more independent events, which makes the presence of people at both more surprising, and indicative of a relation we don't know about yet.

Say, a secret organization creating havoc to drive the news.

Blogger Markku August 06, 2019 8:29 PM  

-We select population as the attendees of first event. Probability for each member of attending the event: 1.0 (that's how selection bias works)
-Probability of any given American being at the second event: 52,800 / 350,000,000 = 0.00015
-Since Vox already established independent probability by his own calculation of a different thing, that means that for everyone in the population, individually, it's the same 0.00015 probability
-Conversely, there is 0.99984 probability of such an individual NOT being at the event. This now becomes our simple "and" test, that we apply to everyone in the population. It must return TRUE every single time. That is calculated by 0.99984^14520, which is 0.11185. The inverse of "nobody attended the second event" is "at least one attended it", which is the number we are calculating. That is 1 - 0.11185, which then gives our final result.
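
The steps above can be sketched in a few lines of Python. This is only an illustration under the thread's own assumptions: the population and attendance figures are the estimates quoted in the comments, each person's attendance is treated as independent, and the function and variable names are made up for the example. The unrounded per-person probability is used, which is where a "nobody overlaps" value near 0.11185 comes from.

```python
# Sketch of the "at least one overlap" calculation described above.
# All inputs are the thread's assumed figures, not verified attendance data.
US_POPULATION = 350_000_000
AT_FIRST_EVENT = 14_520    # assumed number present at the first event
AT_SECOND_EVENT = 52_800   # assumed number present at the second event


def prob_at_least_one_overlap(population, first, second):
    """Chance that at least one of `first` people is also at the second event,
    treating each person's attendance as an independent draw."""
    p_attend = second / population          # per-person chance of being at event two
    p_nobody = (1.0 - p_attend) ** first    # the "and" test: every one of them is absent
    return 1.0 - p_nobody                   # complement: at least one is present


result = prob_at_least_one_overlap(US_POPULATION, AT_FIRST_EVENT, AT_SECOND_EVENT)
print(f"{result:.4f}")  # roughly 0.888 with these inputs
```

The answer moves a great deal with the assumed attendance numbers, so the output is only as good as those guesses.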

Blogger Azure Amaranthine August 06, 2019 8:33 PM  

Obviously that's not the chance of there being exactly one, if that's your quibble. It's the average chance of there being one, meaning that the chances of there being two, three, four, etc are also part of this solution.

Blogger nyb August 06, 2019 8:33 PM  

@260 I don't think we're in disagreement, for the most part.

I would agree that in the example you gave, you can make an inference and come up with a number if you've got a model that's been well validated (we've seen this fracture on nearly all hit and run victims, we do not see it on folks that trip over a guardrail). There is a big decision riding on that number - e.g. does the court award damages?

Even then, I'd argue that using a specific number is far too certain. Better to simply say 'very likely, because of all of this evidence here', or, if you must, use a range of values when quantifying the uncertainty.

The problem I see is that what is being asked is not the two-part question: "What is the likelihood that Bill was hit by a car, given that he's all banged up and lots of folks that get hit by cars look like Bill?" followed by the use of Bayes to 'update' the probability after examining Bill's tibia.

Instead, the question posed is sort of "Bill was hit by a car. What is the likelihood of that happening?"

The basic point I want to make is that those kinds of questions aren't useful. So yes, you are right - you can make inferences about *uncertain* past events.

And you're completely right that essentially, eliminative induction is a better approach if we are trying to look at future events like these ones.

Blogger Markku August 06, 2019 8:34 PM  

But as it turned out, you can simply put n-choose-k with k being 0 to the binomial probability formula, and get the same answer. Which is what SirHamster already did.

Blogger Dirk Manly August 06, 2019 8:44 PM  

Azure... the entire point of "not even wrong" is that the proposed answer or explanation is so badly flawed that going into each and every mistake would fill a book, and is better off being just dismissed, rather than dissected and autopsied.

So, no, I'm not going to WASTE MY TIME on all of the several problems with the math which was presented.

It's so f'ed up...that it's literally NOT EVEN WRONG.

Now, if you want to learn why it is wrong, then go take a stats class, and study well enough to get an A in the course.

I'm not your unpaid professor.

Blogger Dave Dave August 06, 2019 8:46 PM  

The calculation by the mathfags must be amended for the fact that it was the same combination of 3 people at both events. It's not 3 random people, it's a collection of 3 people exactly.

Blogger Markku August 06, 2019 8:51 PM  

No, it is three OR MORE, because the deciding factor is whether the number is surprising. If 3 is surprising, then any number above that is also surprising, right up to the total attendance of the lower attendance event. As luck would have it, SirHamster already calculated that using the binomial probability formula. It's 37%.
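
For readers who want to check the "three or more" figure, here is a minimal sketch using the binomial formula with the same assumed inputs as the rest of the thread (14,520 people from the first event, each with an independent 52,800 / 350,000,000 chance of being at the second). The helper names are hypothetical and this is not necessarily how SirHamster set it up.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)


def prob_at_least(k_min, n, p):
    """Probability of k_min or more successes: one minus the lower tail."""
    return 1.0 - sum(binom_pmf(k, n, p) for k in range(k_min))


n_first = 14_520                      # assumed number present at the first event
p_second = 52_800 / 350_000_000       # assumed per-person chance of being at the second

p_three_or_more = prob_at_least(3, n_first, p_second)
print(f"{p_three_or_more:.3f}")  # about 0.37 with these inputs
```

Setting `k_min` to 1 reproduces the "at least one" figure of roughly 88.8%.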

Blogger Frank Lee August 06, 2019 8:53 PM  

I believe that all males (particularly the tech oriented kind) fall into one of two categories. Fans of Captain Kirk or fans of Spock. Spock fans don't have a problem with him saying, "The odds are astronomically against…" and Kirk fans admire that Kirk doesn't make decisions based on obscure math odds.

Disappointed to see Vox is more of a Spock guy.

As Vox can tell if someone is over 6 feet without measuring them, I can tell that he's wrong on the basic math concepts here. I don't know if the odds are good, or bad that two people would be at the events, but it's not astronomically against. Doesn't mean that there isn't a conspiracy, but math doesn't prove there is.

Blogger Dirk Manly August 06, 2019 8:57 PM  

Dave Dave

If there is nothing special about those 3 people, then ANY combination of 3 people who were at both shootings is acceptable.

Conversely, if there IS something special about those three people, then the calculation must be done on the odds of that particular set of 3 people being at both events.

My gut feeling is, at least one of those people was part of both false-flag ops, and decided to take the opportunity to attention-whore to a reporter, because he or she was too stupid to realize that the last thing such a person should be doing is drawing attention to himself as having been at both events.... EVEN IF that person's role in the op is to sow disinformation. For that purpose, he could have just as easily said, "a personal friend of mine was at the Las Vegas shooting and remarked X about Y.... and I noticed a similar thing here."

But remember, the current crop of deep staters and their minions just aren't that bright. Or as Q keeps saying, "These people are STUPID!"

Blogger Dole August 06, 2019 8:59 PM  

@Markku it's not correct. Imagine a model with 3 people and two events of two people. In this case there is always someone attending the first and second event. Yet your model says the probability for no one attending the second is (1-2/3)^2. Anyway it's not that important since it's in the ballpark, but just wanted to check since a phd verified it.

Blogger Markku August 06, 2019 9:01 PM  

Note that the probability can simultaneously be credible, AND these people be stooges. Because it's not a trivial thing to find those three people. They won't necessarily contact the media after the call. You don't know if they will, and you don't know how long it will take. But the narrative is in a hurry. So, maybe you just do a quick calculation, determine that 37% isn't too bad, and then use stooges. If any of the actual people to whom this happened pipe up, stonewall them.

Blogger Dave Dave August 06, 2019 9:08 PM  

@275. Incorrect. These 3 people were in a group together at both events. It has to be calculated not for 3 random people, but for a group of 3 people. It could be for any group of 3 people, but it is necessarily for a group of 3 people.
I discourage any attempts at calculating this, however, because it's purposeless. We know this was a false flag. It's just like David Hogg who happened to be interviewed by the media several times before Parkland. What are the odds? Low enough to not care.

Blogger Markku August 06, 2019 9:11 PM  

Dole wrote:@Markku it's not correct. Imagine a model with 3 people and two events of two people. In this case there is always someone attending the first and second event. Yet your model says the probability for no one attending the second is (1-2/3)^2. Anyway it's not that important since it's in the ballpark, but just wanted to check since a phd verified it.

In your example, the problem is that of estimating a statistical probability. Because in our original example the numbers are so huge, we can give the p in the formula an actual number. That is, we treat the people as if the concept of "2.34 people" for example, is a sensical one. The same way that an average number of kids is not a number that anyone actually has.

With such small numbers, you can no longer ignore the fact that you are dealing with integers, and you cannot assign a p to the formula. This particular calculation is done by combinations. You get all the different ways to combine the three people, and then calculate the number that satisfies the condition, with the total number.
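
The counting approach for Dole's toy case (3 people, two events of 2 attendees each) can be sketched by brute force. This is an illustration only; the variable names are invented, and it assumes every pair of attendees is equally likely at each event.

```python
from fractions import Fraction
from itertools import combinations

people = range(3)     # the toy population of three
event_size = 2        # each event has two attendees

# Every possible attendee pair for one event.
lineups = list(combinations(people, event_size))

total = 0
no_overlap = 0
for first in lineups:
    for second in lineups:
        total += 1
        if not set(first) & set(second):   # nobody attended both events
            no_overlap += 1

exact_p_no_overlap = Fraction(no_overlap, total)

# The independent-probability shortcut applied to the same case.
approx_p_no_overlap = (1 - Fraction(event_size, len(people))) ** event_size

print(exact_p_no_overlap, approx_p_no_overlap)  # 0 versus 1/9
```

With only three people, two pairs must share someone, so the exact count gives zero while the shortcut gives 1/9. That gap is the small-number failure being discussed.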

Blogger Dave Dave August 06, 2019 9:13 PM  

@273. You misunderstand me. It's about the group of people. What are the odds that a group of three (or more, if that satisfies you) attends both events. This is distinct from 3 random people for the same reason that conditional probability is distinct from Vox's initial calculation.

Blogger Markku August 06, 2019 9:16 PM  

I haven't spent even a second watching any news coverage of the shooting. What I take it you are saying is that these people told the media that they were together at the first event, and were also together at the second event. That would introduce so many variables to the situation that I can't even estimate it. But I also can't say that it's definitely very low. If I absolutely had to (or was paid to) research this, then the first thing I would look into, would be to analyze how people group together in these kind of events.

Blogger Dole August 06, 2019 9:24 PM  

@Markku I am fine with approximations, but the method is not entirely correct, which shows up as huge discrepancies when using smaller numbers. One problem is also that if you reverse the order of events you get a different result in general. Not in this case because the numbers are so big, but in general. That also shows that the model can not be correct.

Phd should give back his degree.

Blogger Dave Dave August 06, 2019 9:24 PM  

@281. That's exactly why doing any sort of mathfaggotry on these things is a waste of time. There are always more variables you can introduce to fudge the numbers however you see fit. Sure, the maths might be consistent, but that's irrelevant when the maths doesn't represent reality. I like SirHamster's math, but it's distracting because it presupposes the statistical possibility that something like this could happen. It's also based upon the bad assumptions from Uncephalized's original post.

Blogger Markku August 06, 2019 9:25 PM  

No, it turned out that reversing the events gives the same answer, due to the assumption of independent probability, established by Vox's calculation of a different thing. Look at the original post. Look at what's in his exponential, and what is in mine. We did calculate in reverse order, and got the same number.

Blogger Dirk Manly August 06, 2019 9:27 PM  

This comment has been removed by the author.

Blogger SirHamster August 06, 2019 9:27 PM  

Dole wrote:Imagine a model with 3 people and two events of two people. In this case there is always someone attending the first and second event. Yet your model says the probability for no one attending the second is (1-2/3)^2.

The model says that the probability that a person did not attend either event is 1/3^2 = 1/9, which is correct.

It's not counting attendance at either of the events, it's looking at the probable overlap of the attendance of said events, if those events are independent.

Now obviously if the events are related, then the calculation doesn't apply.

Ex: Dragoncon 2017 and 2019, you would expect there to be a lot of overlap in attendance, because those aren't independent events. People who are fans/creators attending one year are likely to be attending the next one.

But if we see "random" mass shootings with improbable overlap in their victims, that undermines the assumption that they are random. Maybe someone hates garlic loving country music fans. Who knows?

Blogger Stilicho August 06, 2019 9:28 PM  

"-We select population as the attendees of first event. Probability for each member of attending the event: 1.0 (that's how selection bias works)
-Probability of any given American being at the second event: 52,800 / 350,000,000 = 0.00015
-Since Vox already established independent probability by his own calculation of a different thing, that means that for everyone in the population, individually, it's the same 0.00015 probability
-Conversely, there is 0.99984 probability of such an individual NOT being at the event. This now becomes our simple "and" test, that we apply to everyone in the population. It must return TRUE every single time. That is calculated by 0.99984^14520 which is 0.11185 . The inverse of "nobody attended the second event" is "at least one attended it, which is the number we are calculating. that is 1 - 0.11185 which then gives our final result."

Markku, I get that. What I struggle with is the fact that it is a calculation of the probability of at least 1 member of any random group of 14520 Americans out of 350 million to be at the second event. The first group (14520) was not random though at the time of the second event, it was a single combination of 14520 out of a set of 350 million. How does that calculation account for the probability of belonging to the first group of 14520?

Blogger Dirk Manly August 06, 2019 9:32 PM  

Re: David Hogg. He wasn't a student at (((commie bitch Stoneman))) High... he's the spitting image of the David Hogg who was arrested several years earlier (at the age of 19) in South Carolina on charges related to hard drugs.

Hogg made a deal with someone (Feebs? Clowns?) in exchange for being let out of custody and allowed to live like a normal person. Considering that his father is a feeb, that's probably who the deal was with.

Blogger Dole August 06, 2019 9:32 PM  

This comment has been removed by the author.

Blogger Dirk Manly August 06, 2019 9:35 PM  


@282

". One problem is also that if you reverse the order of events you get a different result in general. Not in this case because the numbers are so big, but in general. That also shows that the model can not be correct.

Phd should give back his degree."

Sorry, but that is NOT an indicator of correctness or non-correctness, as they are different initial conditions.

You need to go back to stats class.

Blogger Dave B August 06, 2019 9:35 PM  

Original phrasing: "The odds [of] one person in a country of 320 million being in the vicinity of two such events."

Interpretation 1 (Uncephalized's calculation [note 1]): Probability that at least 1 person in the country will attend both events.
Interpretation 2 (VD's calculation): Probability that one specific individual, (e.g. Lindsay Smith of Toledo, OH) will attend both events.

Both interpretations are semantically valid, and both calculations were performed with sufficient accuracy. But only Uncephalized's interpretation answers the actual question here, which is, how surprised should we be to see some of the same people at these events? (Answer: not very surprised).

The quantity that VD calculates doesn't tell us anything interesting, and it doesn't prove his point about coincidences. Basically, he is wrong. He can only claim to be correct on a technicality about the phrasing, as he does in @102.

I think an apology from VD to Uncephalized is in order here, especially given all the insults and smugness about "intellectual superiority".




[1] His formula is inexact, but 88.8% is correct in every decimal place. The exact formula is called the Hypergeometric distribution. https://en.wikipedia.org/wiki/Hypergeometric_distribution

With N=350,000,000, K=52,800, n=14,520
1-Pr(k=0) is 0.888157

The bit about raising .888 to the 3rd power is not exactly correct either, but again it doesn't really change the conclusion.
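
The hypergeometric figure in the note can be checked without any statistics library. The sketch below assumes the same N, K and n as the note and uses the fact that Pr(k=0) is the product of (1 - K/(N-i)) for i from 0 to n-1; the function name is made up for the example.

```python
from math import exp, fsum, log1p


def hypergeom_prob_zero(N, K, n):
    """Pr(k=0): none of n people drawn without replacement from N
    are among the K marked ones. Summed in log space for accuracy."""
    return exp(fsum(log1p(-K / (N - i)) for i in range(n)))


N = 350_000_000   # assumed population
K = 52_800        # assumed number present at the second event
n = 14_520        # assumed number present at the first event

p_at_least_one = 1.0 - hypergeom_prob_zero(N, K, n)
print(f"{p_at_least_one:.6f}")  # about 0.888157

# The simpler independent-draws formula, for comparison.
p_binomial = 1.0 - (1.0 - K / N) ** n
```

With numbers this large the two formulas differ only around the fifth decimal place.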

Blogger Markku August 06, 2019 9:35 PM  

SirHamster wrote:Dole wrote:Imagine a model with 3 people and two events of two people. In this case there is always someone attending the first and second event. Yet your model says the probability for no one attending the second is (1-2/3)^2.

The model says that the probability that a person did not attend either event is 1/3^2 = 1/9, which is correct.


No, think about it. You can't calculate it in the same WAY as I did mine. You can't take the attendees of the first event, and then calculate the probability that none of them were at the second event. Because that's impossible. There is only one slot for being left out. If the one of these two people was in that slot, then the other wasn't. This is the failure of statistical probability with very small numbers.

Blogger Dave Dave August 06, 2019 9:36 PM  

@287. The calculation is that at least one random person who has a probability 1 of being at event A was at event B. It is a given that they were at event A, so it's essentially a restricted test for event B. The probability of a random person being at event A is not considered.

Blogger Dirk Manly August 06, 2019 9:37 PM  

Hmmm... if they were in a group at both events, then they should really be counted as ONE entity for both events, rather than as 3 entities.

Blogger Dole August 06, 2019 9:37 PM  

@Markku Did you miss the "in general"? Of course, these numbers would be different as well if not for limitations of floating point arithmetic.

(1-a/c)^b != (1-b/c)^a, almost always.

Blogger Markku August 06, 2019 9:38 PM  

Stilicho wrote:How does that calculation account for the probability of belonging to the first group of 14520?

I have explained several times that the first probability vanishes due to this being a "there exists" question. Answer this: 100 people toss a coin twice. Of those who got heads on the first toss, how likely is it that they get heads on the second?

Blogger Markku August 06, 2019 9:43 PM  

Also, what is the probability of "there exists a person who was at the first event"? Obviously 100%. This is why "there exists" questions are fundamentally different than "a given person" questions. The second question already starts at a very low probability at the first step. The first question starts at 100%.

Blogger Dave Dave August 06, 2019 9:44 PM  

@291. The wildly incorrect 70% figure absolutely changes the conclusion, and the bad numbers to begin with alter the probability of at least one person from 88.8% to something considerably lower. When making a calculation and arriving at a number that's obviously not true, such as 70%, competent mathematicians would try again to find the right formula. Using SirHamster's method, we can see that, even when using the highly generous numbers, the probability is nowhere near 70%.
The first red flag was 88.8%, because that's extremely high. It's an indication that the numbers used can't be right. Given that, 70% is still downright stupid.

Blogger Dirk Manly August 06, 2019 9:45 PM  

(1 - 3/4)^2 != (1 - 2/4)^3

(1/4)^2 != (1/2)^3
1/16 != 1/8

That was the first set of values I picked for a, b, and c.

My hunch is that the left and right quantities are rarely (if ever) equal, for all cases when a != b
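
A short sketch makes both sides of this exchange concrete: the two forms are unequal for small values, and they are also unequal, though only slightly, for the thread's large assumed figures. The function name is hypothetical; exact fractions are used for the small case and ordinary floats for the large one.

```python
from fractions import Fraction


def approx_p_no_overlap(a, b, c):
    """The shortcut under discussion: (1 - a/c)^b."""
    return (1 - a / c) ** b


# Small case from the comment above, done with exact fractions.
small_ab = approx_p_no_overlap(Fraction(3), 2, 4)   # (1/4)^2
small_ba = approx_p_no_overlap(Fraction(2), 3, 4)   # (1/2)^3
print(small_ab, small_ba)                           # 1/16 versus 1/8

# Large case with the thread's assumed attendance and population figures.
N = 350_000_000
large_ab = approx_p_no_overlap(52_800, 14_520, N)
large_ba = approx_p_no_overlap(14_520, 52_800, N)
print(f"{large_ab:.6f} {large_ba:.6f}")
```

In the large case the two orderings agree to about four decimal places, which is why swapping the events looked like it gave the same answer. The exact hypergeometric count is symmetric in the two attendance numbers, so neither ordering of the shortcut is exact.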

Blogger Markku August 06, 2019 9:53 PM  

However, this also demonstrates that it DOES work with large numbers. Because note how SirHamster put the three people scenario into the binomial probability formula, and got the correct result. If you draw out all the combinations of those three people on paper and count, you'll get the same number. However, using the original numbers, he got the same result as me. So, this demonstrates that the error vanishes quickly as the numbers grow.

Blogger Azure Amaranthine August 06, 2019 9:59 PM  

"Azure... the entire point of "not even wrong" is that the proposed answer or explanation is so badly flawed that going into each and every mistake would fill a book, and is better off being just dismissed, rather than dissected and autopsied."

I know what it means. It means you don't understand probability nearly as well as you think you do.

It's not wrong. It's correct from the context given for it. You can't name even one reason why it's wrong because you don't know one and are blustering.

Well, you just got called on it.

Blogger Dole August 06, 2019 10:00 PM  

@Markku it's not statistical principles that are at fault, it's that they are not being applied correctly. The way of calculating the probability makes similar conceptual errors as seen elsewhere in the discussion.

Blogger Dirk Manly August 06, 2019 10:01 PM  

Your equation's FORM was not even correct, and you pulled 1/512,000,000 out of thin air (even if using 1/2^29, it's still not correct).

Now, shut up and fuck off.

Blogger Azure Amaranthine August 06, 2019 10:02 PM  

Context: Given that the odds are 1/512,000,000 that a particular person out of 350,000,000 was at both of two events, the chance that, on average, one person out of 350,000,000 was at both events is 350,000,000x higher than that.

The key word here is average Dirk. Now man up and state your issue or shut up.
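
The "on average" reading can be written out as an expected count. This sketch takes both inputs as given in the thread (the per-person figure of one in 516,270,868 from the OP and a population of 350,000,000) without endorsing either, and the names are invented for the example. Note that the result is an average number of people, not a probability.

```python
# Expected number of people present at both events, by linearity of expectation:
# population times the per-person probability. Inputs are the thread's figures.
POPULATION = 350_000_000
P_SPECIFIC_PERSON = 1 / 516_270_868   # per-person figure quoted from the OP

expected_overlap = POPULATION * P_SPECIFIC_PERSON
print(f"{expected_overlap:.3f}")  # about 0.678 people on average
```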

Blogger Azure Amaranthine August 06, 2019 10:03 PM  

"Your equation's FORM was not even correct, and you pulled 1/512,000,000 out of thin air"

I pulled it out of Vox's calc in the OP. Look for the big bold letters you blithering idiot. I'm not saying it's correct or not. I'm saying that IF GIVEN that it is correct, the rest follows.

Blogger Azure Amaranthine August 06, 2019 10:06 PM  

~512. Granted I misread, it says "one in 516,270,868". Close enough, or is that your issue?

Blogger Dirk Manly August 06, 2019 10:09 PM  

"
Gilroy AND Las Vegas probability: Multiplying 0.00005082 by 0.00003811428 results in a probability of 0.0000000019369677096, or one in 516,270,868.
You will observe that 88.8 percent, or 1/1.13, is very, very far away from 1/516,270,868, and 0.0000000019369677096 cubed -"

None of those numbers are 1/512,000,000.

Blogger Azure Amaranthine August 06, 2019 10:10 PM  

I misread 516, then rounded to the nearest million, hence the tilde.

You do know what a tilde means, right Dirk?

Blogger Azure Amaranthine August 06, 2019 10:11 PM  

Now where's the issue with my equation form. I'll bet you take issue because I'm not using an equation that would tell me the odds of it being precisely one, rather than an average.

Blogger Dirk Manly August 06, 2019 10:14 PM  

Again, I'm not your unpaid stats professor, nor your "I throw idiotic shit out, and this guy does 10x more work refuting my idiocy" monkey.

Blogger Azure Amaranthine August 06, 2019 10:15 PM  

You don't have an answer. Now shut the fuck up.

Blogger Dirk Manly August 06, 2019 10:16 PM  

Yes, Tilde means somewhere in the range of "within the precision of the last significant digit"

If you want to use 3 significant digits, but bungle the 3rd digit, don't blame the reader for not comprehending that you can't copy a 3 digit number correctly, and especially since 512*10^6 is a decent approximation of 2^29.

YOU failed to explain where you got the 512,000,000 number from. That's on you, not the reader.

Blogger Azure Amaranthine August 06, 2019 10:17 PM  

I didn't say where I got it from initially, nor is that the issue. I said that given that, then this.

Try to focus Dirk. Try really hard.

Blogger Azure Amaranthine August 06, 2019 10:18 PM  

Direct question: What is the error with the form of my equation? Not the pussy, quibbling number, but the equation which form you stated was in error.

Blogger Dirk Manly August 06, 2019 10:19 PM  

I'm not wasting my time trying to educate some moron who makes a 3rd week stats mistake, because I don't care to spew out 3 hours of lecture to get you up to speed, just because YOU haven't taken the relevant classwork.

I am not a fucking slave. And your attempt at shaming isn't going to make me budge on the matter.

Blogger Azure Amaranthine August 06, 2019 10:20 PM  

You state I'm wrong, but you can't say how, yet you insist that I am.

Put up or shut up. You have one more chance to answer the question Dirk.

Blogger Azure Amaranthine August 06, 2019 10:22 PM  

I've had almost this exact argument over random distribution calculations before, that's why I predict that you're going to try to give me a calculation for the probability of exactly one, rather than average one.

Blogger Azure Amaranthine August 06, 2019 10:23 PM  

Or is that giving you too much credit?

Blogger Dirk Manly August 06, 2019 10:24 PM  

Note also that NOBODY has chimed in to say that your form is correct.

Simply put, it's too simplistic. No probability calculation of this sort is as simple as the equation you threw out there. It's observably wrong on the basis that statistics is NOT that simple. I have neither the time nor inclination to give you 3 hours of probability and statistics lectures to get you up to speed.

Notice that NONE of the other proposed solutions is anywhere as close to as simple (4th grade arithmetic) as yours. That's because the problem under discussion is one that's beyond the concepts of 4th grade arithmetic.

Now, do you want me to continue with showing the rest of the group what an utter ass you are? Or are you going to shut up and drop it.

Blogger Azure Amaranthine August 06, 2019 10:27 PM  

"Note also that NOBODY has chimed in to say that your form is correct."

And no one has said it is wrong. Bandwagon fallacy already Dirk? But really it's not even a strong bandwagon.

"Simply put, it's too simplistic."

It is a shortcut that for some reason a lot of people don't use. It is correct.

"It's observably wrong on the basis that statistics is NOT that simple."

Woe is me, I know enough about probability to take shortcuts that Dirk can't understand.

" I have neither the time nor inclination to give you 3 hours of probability and statistics lectures to get you up to speed."

You haven't the capacity, and if you tried you'd find that I'm correct.

Blogger Azure Amaranthine August 06, 2019 10:28 PM  

I've gone through the entire math to prove the shortcut before, repeatedly, hence why I immediately suspected you were going to pop a "specifically one" equation.

Blogger Dirk Manly August 06, 2019 10:31 PM  

IF your work is correct as you say, then show it.

Personally, I'm not certain of the entire form needed. But it's MUCH more complex than what you presented. One, because we are talking about people, not marbles drawn blindly out of a bag.

Blogger Dirk Manly August 06, 2019 10:32 PM  

And if you've proved it in the past, then a simple cut and paste should reduce the effort needed down to about 30 seconds.

So, put up or shut up.

Blogger Azure Amaranthine August 06, 2019 10:32 PM  

The form for finding (Odds of exactly one incidence/Exactly two incidences/etc) is indeed much more complicated, I agree on that Dirk. However if you try to find the number for an average of one incidence, you will find that the shortcut I took is valid.

To me it's an intuitive shortcut, but from experience, apparently it isn't to most people.

Blogger Azure Amaranthine August 06, 2019 10:33 PM  

Give me some time to cool down and set up an example, I will provide.

Blogger Azure Amaranthine August 06, 2019 10:44 PM  

Okay, let's go with coin toss probability. It's simple enough to make a quick example.

I can take a two-sided coin and toss it up ten times. On average over multiple trials (tending to infinity) it will land five times on heads. This obviously doesn't mean that after five tosses of tails the rest are guaranteed heads.

The experimental outcomes range from zero-of-ten to ten-of-ten heads. This is similar to saying that it's possible that no people from one event attended the other, ranging up to ten (arbitrary example limit) people attending both.

If you were to calculate the odds of there being precisely five heads and five tails of the coin, you'd have to go more complicated, possibly by accounting for all the different combinations of coin flips that total to five each, then calculating the specific probability of each combination and adding them all together to get the total probability of exactly five heads and five tails, and this would not be a 50% probability. It would be lower than 50%.

The increasingly less likely (because of less numbers of possible combinations for each successive total) totals tending toward zero-of-ten and ten-of-ten eat into that, reducing it below 50% for specifically five heads and five tails.

However, you can simply shortcut calculating the probability for each specific combination, and instead say "on average there will be five heads and five tails out of ten coin flips with a fair coin".
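The coin example above can be checked numerically. The sketch below assumes a fair coin and independent flips; the function name prob_exact_heads is made up for illustration. It shows that the average number of heads in ten flips is five, while the chance of getting exactly five heads is well under 50%.

```python
from math import comb

def prob_exact_heads(n: int, k: int, p: float = 0.5) -> float:
    """Binomial probability of exactly k heads in n independent flips."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n_flips = 10

# The "on average" shortcut: number of flips times the chance of heads.
expected_heads = n_flips * 0.5

# The longer route: count the 252 ways to place 5 heads among 10 flips,
# each of which has probability (1/2)^10.
p_exactly_five = prob_exact_heads(n_flips, 5)

print(expected_heads)   # 5.0
print(p_exactly_five)   # 0.24609375, i.e. 252/1024
```

The two numbers answer different questions: the first is a long-run average count, the second is the probability of one specific count.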

Blogger Ska_Boss August 06, 2019 10:45 PM  

Some (many) people can't see the forest for the trees.

Blogger Azure Amaranthine August 06, 2019 10:46 PM  

Let me think if I've left any holes or questions here, or point out anything you have an issue with so far....

Blogger SirHamster August 06, 2019 10:48 PM  

Markku wrote:No, think about it. You can't calculate it in the same WAY as I did mine. You can't take the attendees of the first event, and then calculate the probability that none of them were at the second event. Because that's impossible. There is only one slot for being left out. If the one of these two people was in that slot, then the other wasn't. This is the failure of statistical probability with very small numbers.
Rethinking this gave me a headache, but I see a nuance in the problem setup.

There's a difference between saying each person has a independent 2/3 chance to go to each event, and the scenario being described.

For the former, possible outcomes are

A-ABC (1 person attends first event, 3 people attend second)
AB-X (2 people attend first event, no one attends second)


For the latter, all outcomes are in the form of

AB-BC (2 people attend first event, 2 people attend second)

For the latter scenario, the correct answer is that there is a 1/9 chance for a specific individual to not attend either event, and a 1/3 chance for any individual to not attend either event.


In short, I shouldn't have accepted the claim that the numbers were correctly plugged into the model. The model was designed for a different scenario and tweaking the rules changes the correct equation to use.

Choosing the right equation should give the correct probability. I can't think of the right equation for the second scenario at the moment, and I think the original model is only good for the first one.

Blogger Azure Amaranthine August 06, 2019 10:51 PM  

This is an example of getting an average chance by multiplying the singular chance by the number of chances.

Blogger No August 06, 2019 10:55 PM  

Imagine two random events (theoretically randomly)
Winning the Presidency in 2016 and landing tails on a coin flip.

Winning the Presidency is a 1 in 300 million event.
Getting tails on a coin flip is a 1 in 2 event.

So:
What are the odds that someone is President AND gets tails on a coin flip?
What are the odds that someone does both?

The first probability is 1/300million times 50%
But the second probability is 1 times 50%

Blogger Azure Amaranthine August 06, 2019 10:57 PM  

"Winning the Presidency is a 1 in 300 million event."

And "coincidentally" if you multiply that by all 300 million people who could potentially compete for it, you'll average one president elected, unless we have open civil war.

Blogger Azure Amaranthine August 06, 2019 10:59 PM  

It really is that simple as long as you don't forget the "on average" part. That's the part that will bite your ass off if you try to use it in other ways.

Blogger justaguy August 06, 2019 11:00 PM  

Guys, this is simple probability. Depending on how the problem is stated, there is a good chance of someone attending both shootings. Just take a look at a probs and stat book. Assume the # X at the first and then look at probability that 1, 2, or 3 of those X go to second that has #Y. Pretty simple with few assumptions (random selection). If one put more conditions, then likely becomes more probable. Why does this thread take 330 comments for college level stats?

Yes VD made a common mistake, so what, he isn't a numbers guy. There is a reason many many scientists with PHds screw up statistics... it takes specific training. This is why we have statisticians and mathematicians. Few of us are sufficient in multiple disciplines, and it takes very specific training to even approach engineering and hard sciences (2 semesters diff Qs plus lots more)

Blogger Markku August 06, 2019 11:00 PM  

You are right about this, Azure, but you are just calling it with a nonstandard name. You should be calling it "expected value". The formula is EV = p * n

Blogger Azure Amaranthine August 06, 2019 11:04 PM  

@Markku I will bow to your nomenclature, I autodidacted this part of this discipline when I was around sixteen, so I have little knowledge of the technical terms.

Blogger Azure Amaranthine August 06, 2019 11:16 PM  

"There is a reason many many scientists with PHds screw up statistics... it takes specific training."

It's also extremely easy to fudge in some areas, because, well, most people don't understand them....

"Lies, damned lies, and statistics."

Blogger No August 06, 2019 11:28 PM  

Yes, VD made a common mistake. Statistics really can be difficult and confusing. But he also belittled the intelligence of the person who didn't make that mistake seemingly *because* they didn't make that mistake.
The decent thing to do would be to admit his error and apologize.
In order to further knowledge of statistics for his readers he should create a new post explaining the statistical issue. His original post might be used mistakenly by others.

Blogger Dole August 06, 2019 11:28 PM  

This comment has been removed by the author.

Blogger Dirk Manly August 06, 2019 11:30 PM  

@330

"This is an example of getting an average chance by multiplying the singular chance by the number of chances."

That is almost always an oversimplification, and gives misleading results.

For example, rolling a die. 6 trials. Obtaining a "3" is a positive result, all others are failures.

By your method, (1/6) x 6 = 1 occurrance.



In actuality, 1 - (5/6)^6 ~= 0.6651, which means 1/3 of the time, you don't even get ONE positive result.

Markku is right. The Binomial distribution is necessary when looking at low population situations.

In probability-related work, oversimplification is the cause of many substantial errors. You're completely ignoring many Combinations, and instead, making a calculation that each possible outcome (a side of the die coming up on top) will occur exactly once.

This is more a situation of drawing a limited number of the population out of a pool, and permanently removing that member from the pool after each draw. So the odds for each draw change. The formulas for that (using the P(n,m) and C(n,m) functions) are well-established, but that is also an overly simplistic model for this case.

Vox's math is merely a starting point -- a first order calculation which will get the magnitude of the probability to within -50%/+100% margin of error.
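The two figures in the die example above (1 occurrence versus 0.6651) measure different things, and a short check may help keep them apart. This is only a sketch: the variable names are illustrative, and it assumes a fair six-sided die with six independent rolls.

```python
from math import comb

p = 1 / 6   # chance of rolling a "3" on one roll
n = 6       # number of rolls

# Expected number of 3s in six rolls (n * p).
expected_threes = n * p

# Probability of at least one 3 in six rolls.
p_at_least_one = 1 - (1 - p) ** n

# Full binomial distribution over 0..6 threes, used to confirm that the
# probability-weighted average count equals n * p.
distribution = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
mean_from_distribution = sum(k * pk for k, pk in enumerate(distribution))

print(round(expected_threes, 4))         # 1.0
print(round(p_at_least_one, 4))          # 0.6651
print(round(mean_from_distribution, 4))  # 1.0
```

Under these assumptions both numbers hold at once: the average count is one, and roughly a third of six-roll trials contain no 3 at all, because other trials contain two or more.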

Blogger Azure Amaranthine August 06, 2019 11:31 PM  

Trying to goad someone into apologizing is not profitable.

That being said, he's under no obligation to do anything for his readers. They can get the knowledge they might want out of this thread with a little patience. And if they haven't the patience for themselves, why should he have it for them?

Blogger Azure Amaranthine August 06, 2019 11:34 PM  

"That is almost always an oversimplification, and gives misleading results."

I chose my words carefully. The result may be misleading to some people, but it is still accurate for what it is stated for.

"By your method, (1/6) x 6 = 1 occurrance."

False. You did not choose your words carefully. Average one occurrence in six rolls of a specific number, tending toward precision over infinite experiments.

Blogger Azure Amaranthine August 06, 2019 11:35 PM  

" You're completely ignoring many Combinations, and instead, making a calculation that each possible outcome (a side of the die coming up on top) will occur exactly once."

False. As I've already shown. Please re-read.

Blogger No August 06, 2019 11:40 PM  

Certainly VD is an independent human being and no contracts are involved. He has no "obligation". Just like all of us have no obligation for lots and lots of things. "should" and "ought" are different concepts. Actually I would expect an apology from him because that is the manly, tough thing to do in this case. In reality what I would like is for him to stop insulting people ... but most likely, if he did stop ... I wouldn't read his blog. That probably doesn't reflect well on me.

Blogger SirHamster August 06, 2019 11:41 PM  

Now that I've had a little warmup on my statistics, I can see how dumb all the midwit ankle-biters are being.

Hey geniuses, how would you calculate the expected value of double victims of Vegas and Gilroy?


Expected value of Vegas shooting: Pv * PopUS = Vegas attendees

Expected value of Gilroy shooting: Pg * PopUS = Gilroy attendees

Expected value of Vegas + Gilroy shooting: Pg * Pv * PopUS

That Pg*Pv term should look familiar, midwits.

Expected value of double victims: 350,000,000 / 516,270,868 = 0.678
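The arithmetic behind that figure can be reproduced in a few lines. The sketch below uses the two per-person probabilities quoted from the original post and the 350,000,000 population figure used in this thread; the variable names are made up for illustration, and independence of the two events is assumed as it is in the thread.

```python
# Per-person probabilities as quoted from the original post.
p_gilroy = 0.00005082
p_vegas = 0.00003811428
population = 350_000_000  # population figure used in the thread

# Chance that one specific person was at both events.
p_both = p_gilroy * p_vegas
one_in = 1 / p_both

# Expected number of people in the population who were at both events.
expected_double = population * p_both

print(f"{p_both:.13e}")           # about 1.937e-09
print(round(one_in))              # about 516 million
print(round(expected_double, 3))  # 0.678
```

The same Pg * Pv product therefore gives both the one-in-516-million figure for a named individual and, after multiplying by the population, the expected count of about 0.68.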


That lines up with the binomial math. One double victim is not surprising (89%). 3 double victims? Somewhat surprising (37%), especially since this is just rough math.


Any of you idiots "correcting" Vox going to correct yourselves on the value of his math? Or how about you show your chops by correctly calculating the Standard Deviation of the double victim distribution?

Blogger Azure Amaranthine August 06, 2019 11:43 PM  

"Actually I would expect an apology from him because that is the manly, tough thing to do in this case."

How many people besides you do you think care what you expect?

"what I would like is for him to stop insulting people ... but most likely, if he did stop ... I wouldn't read his blog."

So you wouldn't like it, you only think that you would, or think that you'd like other people to think that you would.

I don't think there's a need to continue either of these lines of discussion.

Blogger No August 06, 2019 11:44 PM  

The probability of getting two 1's on a six sided die is 1 in 36 (1/6 X 1/6)
But, if you do roll a 1 the probability of rolling another 1 is just 1 in 6.
The probability of both numbers being the same (rolling doubles) is also 1 in 6. That's the important point in the original discussion.
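These three dice figures can be confirmed by listing all 36 outcomes. The following is a minimal sketch assuming two fair six-sided dice; the variable names are illustrative only.

```python
from fractions import Fraction
from itertools import product

# Every ordered outcome of rolling two dice.
rolls = list(product(range(1, 7), repeat=2))

# A specific pair named in advance: both dice show 1.
p_two_ones = Fraction(sum(1 for a, b in rolls if a == b == 1), len(rolls))

# Given the first die already shows 1, chance the second also shows 1.
first_is_one = [r for r in rolls if r[0] == 1]
p_second_one_given_first = Fraction(
    sum(1 for a, b in first_is_one if b == 1), len(first_is_one)
)

# Any matching pair at all (doubles).
p_doubles = Fraction(sum(1 for a, b in rolls if a == b), len(rolls))

print(p_two_ones)                # 1/36
print(p_second_one_given_first)  # 1/6
print(p_doubles)                 # 1/6
```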

There is a useful chart here: https://www.thoughtco.com/probabilities-of-rolling-two-dice-3126559

Blogger Azure Amaranthine August 06, 2019 11:45 PM  

This comment has been removed by the author.

Blogger Azure Amaranthine August 06, 2019 11:47 PM  

"The probability of both numbers being the same (rolling doubles) is also 1 in 6. That's the important point in the original discussion."

Well stated.

Blogger Dirk Manly August 06, 2019 11:49 PM  

@342

""By your method, (1/6) x 6 = 1 occurrance."

False. You did not choose your words carefully. Average one occurrence in six rolls of a specific number, tending toward precision over infinite experiments."

But we are not running 1,000,000 trials of two mass-shootings. You're using the wrong technique.

Blogger Azure Amaranthine August 06, 2019 11:49 PM  

Obviously there are still far too many "coincidences" for all of this to likely be natural.

Blogger No August 06, 2019 11:51 PM  

"expect" as in "what I predict" not "what I demand". I didn't consider the other interpretation of what I wrote but as you pointed it out that seems quite reasonable. I predict he will correct it.

And certainly, I don't have a blog with millions of readers. No one cares what I write.

Blogger Azure Amaranthine August 06, 2019 11:53 PM  

"But we are not running 1,000,000 trials of two mass-shootings. You're using the wrong technique."

If we take the chance of a random person out of X number being at one of those, and then the chance of a random person out of that same number being at the other, and then multiply those chances, we have the chance of selecting a random person out of X and them being at both events.

If we want the likelihood that any person out of X was at both events, we take that final number and multiply by X. It is possible that there were no people at both. It is possible that there were several. The Expected Value of persons out of X being at those events is this, which is an average probability of there being A person out of X at both events.

Blogger Azure Amaranthine August 06, 2019 11:58 PM  

I'm using the right technique to get the number I stated and the number I care about for my very first step, which happens to be much, much, much larger than any collatives we'd come up with by adding more factors in.

All of these factors combined, easily astronomically unlikely.

Blogger Dirk Manly August 06, 2019 11:58 PM  

@332

"And "coincidentally" if you multiply that by all 300 million people who could potentially compete for it, you'll average one president elected"

Because the probability for each person is derived from the fact that you get exactly one president. It's basically an odds for a lottery, where the size of the pool of drawn numbers exactly matches the number of tickets sold.

This is NOT such a case. We're talking the intersection of two sub-sets, X (survivors of Las Vegas Country Concert), and Y (survivors of Gilroy Garlic Fest), out of the total national population, set N.

How big should we EXPECT the intersection of X and Y to be, given their sizes, and that they are drawn from the total US Population (ignoring, for the moment, foreigners at both events).

It's not so simple. You're trying to solve it the way a 5th grader would. But if probability was this easy, it would be taught in 5th grade. There's a reason that Probability and Statistics is taught as a junior-year college course, and no sooner. Even when the entirety of an equation can be solved using no more than just the 4 basic arithmetic operators, there are many subtleties at work that make "the obvious solution" to be nothing more than oversimplified garbage.

Blogger Azure Amaranthine August 06, 2019 11:59 PM  

"you get exactly one president"

It's a joke. Don't explain the joke dammit.

Blogger Dirk Manly August 07, 2019 12:01 AM  

@343

>> " You're completely ignoring many Combinations, and instead, making a calculation that each possible outcome (a side of the die coming up on top) will occur exactly once."

> False. As I've already shown. Please re-read.

Maybe that's not what you mean, but that's what your math is doing.

Blogger Dole August 07, 2019 12:02 AM  

This comment has been removed by the author.

Blogger Azure Amaranthine August 07, 2019 12:03 AM  

Doesn't matter how big the intersection is, it's still the singular chance out of population X. You still can multiply it by popX to get the Expected Value for someone out of popX being at both.

I don't hate you Dirk, but I can see that there's no point in me continuing on this topic with you.

Blogger Azure Amaranthine August 07, 2019 12:04 AM  

"Maybe that's not what you mean, but that's what your math is doing."

I said exactly what my math was doing, and I actually meant what I said, and I was correct.

Blogger Azure Amaranthine August 07, 2019 12:06 AM  

Let's just agree that you don't like the way I use math, shall we?

Blogger Dirk Manly August 07, 2019 12:17 AM  

AA, you need to take a college level stats course.

It's not nearly as simple as you believe it to be.


That's part of why you used the term "probability" instead of "expected value." It's like using the term "average" (the sum of all values in the data set divided by the number of members in the data set) when what you're actually talking about is the mode (the most common value in the data set). And no, people using "average" when they mean "mode" is a very very common mistake made by people who have no statistics background. "The average car holds 5 passengers" means "the most common seat configuration of cars on the road is seating for 5" (2 bucket seats up front, plus a back seat bench with 3 sets of safety belts and room for 2 normal-sized people and one tiny person in the middle getting squished), i.e. the mode. The average is another story.

Using non-standard terminology in a loose way in the midst of a technical discussion where precisely defined terms are being used is always going to bite you in the ass.

That's why I said I have absolutely no interest in producing 3 hours of P & S lecture for you to figure out why your statement was so incredibly fucked up, and why I said that your equation wasn't even in the proper form. Because NO probability equation is in that form. EVER.

Blogger Azure Amaranthine August 07, 2019 12:19 AM  

Dirk, you're just confirming that you don't understand what I'm talking about.

Blogger Azure Amaranthine August 07, 2019 12:20 AM  

I actually meant average. That you can even think that I meant mode exemplifies your inability to understand.

Blogger Azure Amaranthine August 07, 2019 12:21 AM  

I didn't use it in a loose way. I used it exactly the way I intended to, and I was still correct. Nonstandard, sure.

Blogger Azure Amaranthine August 07, 2019 12:22 AM  

No amount of faith in your course-taking credentials will absolve you of your mathematical sin.

Blogger Azure Amaranthine August 07, 2019 12:22 AM  

You're the one taking what I said and failing to paraphrase.

Blogger Dirk Manly August 07, 2019 12:23 AM  

If you want to learn P & S on your own, then I suggest looking to the Schaum's Outlines series of publications. They are far cheaper than the usual textbooks used in college level courses, and often superior as a learning tool.

There's a joke that shines a light on an awful truth:

"There is no such thing as a good physics textbook, because a good physics textbook would be written to make it as easy as possible for students to learn the subject of physics, whereas actual physics textbooks are written for the purpose of impressing other physics professors."

The Schaum's Outlines series of paperback books on college-level course material demonstrates the truth within the above statement. They're written for the student to learn, not for impressing other professors.

Blogger Azure Amaranthine August 07, 2019 12:24 AM  

Dirk, I'm tired of this circling. You don't understand and I don't have the ability to correct your inability. I'm done. Say what you like, it won't make it true.

Blogger Dave Dave August 07, 2019 12:27 AM  

Azure, please stop commenting.

Blogger Azure Amaranthine August 07, 2019 12:29 AM  

@Dave Dave I won't be responding to Dirk any more, you on the other hand, better watch out.

Blogger Dirk Manly August 07, 2019 12:43 AM  

"I actually meant average. That you can even think that I meant mode exemplifies your inability to understand."

I was using average/mode as an illustration of the common misuse of statistical terms by people who have never had a stats class.

However, if you want people to understand what you are trying to communicate, then it is incumbent upon YOU to use the terminology using the standard, agreed-upon definitions used by everybody else in the discussion. Using your own private, non-standard definitions is even more disruptive to communications than if you just used some random word borrowed from a foreign language, or one that you just made up, because at least the reader would NOT have a pre-conceived idea of what you are saying, based on the definition of that word that he already knows but which is in conflict with the definition of the word that you are using.

This is one reason why so much philosophy, especially leftist-oriented philosophy, is such a shambles. The philosopher either invents a new word out of thin air (instead of starting with Latin or Greek prefixes, suffixes and roots), and making some nonsense word that is jarring to the thought process... or, among the less creative types, they just take a word that's already in use, and proceed to redefine the word. Then, they bounce back and forth between the common, every-day definition of the word, and their own, new, invented definition for the word, and thus waste their own effort writing, and the reader's time and effort reading, hundreds of pages of idiocy due to equivocation (basing an argument on two different definitions for the same word). Trivial example: say a philosopher makes a new definition of "large" to ALSO mean a color similar to violet...and then after demonstrating that elephants are large animals, then proceeds to switch to the "violet-like color" definition, and therefore attempts to use that to prove that all whales, skyscrapers, and multi-hundred-seat passenger aircraft are all colored at the extreme of the blue end of the color spectrum. Now obviously, something like this immediately grabs your attention, and it wouldn't pass the laugh test, because the two contrasting definitions are so far apart... but when they are closer together, they pass casual inspection, and it's only if someone very carefully reads and dissects a philosophical treatise which employs such redefinitions of words that one can really see the lie that's being sold.
This is one of the reasons why the sexual-pervert faction of the political left is ALWAYS redefining words (such as, for example, "tolerance" (now means, exuberant support and cheering), "marriage," and more recently, "consent").

I don't mean that you meant to lie. But when nonstandard definitions are allowed to creep into a conversation, then the end result is shit. It's ALWAYS shit, even if the person misusing the word is sincere, the conclusion is based on the non-standard definition, and is then propagated forwards, and instead of revealing truth, instead, truth is hidden and obscured.

Blogger Dirk Manly August 07, 2019 12:44 AM  

"You're the one taking what I said and failing to paraphrase."

If you were using the commonly-accepted terminology, there wouldn't have been any need to paraphrase.

Blogger Dirk Manly August 07, 2019 12:45 AM  

And no, in a technical discussion, especially math based, expecting the reader to play the role of mind-reader, is really beyond the pale.

Blogger Markku August 07, 2019 12:46 AM  

Yes, expected value is not the same as mode. Let's say we are dealing with a bistable system that gives a number between 1 and 10. Here is the number of times each number occurred in our test: 9, 1, 1, 1, 1, 1, 1, 1, 1, 10

The mode is 10. The expected value is roughly 5.
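The bistable example can be run directly. The sketch below reads the list as occurrence counts for the values 1 through 10, which is an interpretation of the comment rather than something it spells out, and uses the sample mean as a stand-in for the expected value. Under that reading the mode is 10 and the mean comes out near 5.7, the middle-of-the-range figure described as roughly 5.

```python
from statistics import mean, mode

# Occurrence counts for the values 1..10, as listed in the comment:
# the value 1 occurred 9 times, 2 through 9 once each, and 10 ten times.
counts = [9, 1, 1, 1, 1, 1, 1, 1, 1, 10]

# Expand the counts into the individual observations.
samples = [
    value
    for value, count in enumerate(counts, start=1)
    for _ in range(count)
]

sample_mode = mode(samples)  # most frequent value
sample_mean = mean(samples)  # 153 / 27

print(sample_mode)            # 10
print(round(sample_mean, 2))  # 5.67
```

Almost no individual observation lands near the mean, which is the point of the bistable illustration.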

Blogger Dirk Manly August 07, 2019 12:48 AM  

This comment has been removed by the author.

Blogger Markku August 07, 2019 12:49 AM  

The way expected value relates to "average" is that if you do the test an infinite number of times, the outcome will average exactly to expected value.

Blogger Markku August 07, 2019 12:50 AM  

Dirk, you clearly read the last number in the list wrong. It was 10.

Blogger Dole August 07, 2019 12:53 AM  

@SirHamster

It's the difference between sampling with and without replacement. In this case we sample without replacement. The population of America is over 300 million and 300 million - 17K is still roughly the same number, so the end result is roughly the same. The formula that Markku uses is still wrong, and on each draw the probability that the person is drawn from the event A should increase as there are less Americans to draw from since they are already occupying the seats of the event B.

Thus the proper formula is 314.99 mil /315mil * 314.99 mil -1/315 mil etc. Which is given by hypergeometric distribution IIRC.
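To see how much the with/without replacement distinction matters at this scale, here is a rough comparison. It is a sketch only: it plugs in the attendance and population figures quoted in the original post (14,520, 52,800 and 350,000,000), treats attendance as uniformly random, and the variable names are made up. It is not a reconstruction of any commenter's exact formula.

```python
from math import exp, log1p

N = 350_000_000  # population figure from the original post
a = 14_520       # assumed number present at event A
b = 52_800       # assumed number present at event B

# With replacement: each of the b draws independently misses
# the a attendees of event A with probability 1 - a/N.
log_p_none_with = b * log1p(-a / N)

# Without replacement: after i non-A people have been drawn, the chance the
# next draw is also not from A is (N - a - i) / (N - i) = 1 - a / (N - i).
log_p_none_without = sum(log1p(-a / (N - i)) for i in range(b))

p_overlap_with = 1 - exp(log_p_none_with)
p_overlap_without = 1 - exp(log_p_none_without)

print(round(p_overlap_with, 4))     # about 0.888
print(round(p_overlap_without, 4))  # about 0.888
print(abs(p_overlap_without - p_overlap_with) < 1e-4)  # True
```

With these inputs the two models agree to about four decimal places, consistent with the remark that the end result is roughly the same when the sample is tiny relative to the population.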

Blogger Markku August 07, 2019 12:57 AM  

Dole, that would only make sense if the events happened simultaneously. Since they happen years apart, the entire population of the first event is returned.

Blogger Markku August 07, 2019 1:00 AM  

Take it further. What if the events were twenty years apart, and two thirds of all Americans participated. Would you calculate it with the above formula, and what would it give as the result?

Blogger Markku August 07, 2019 1:09 AM  

And yes, I know bagging and boosting. They don't apply due to the temporal distance.

Blogger Azure Amaranthine August 07, 2019 1:17 AM  

"expecting the reader to play the role of mind-reader, is really beyond the pale."

Okay now you're just being dishonest. If you understood what I was talking about in the first place you would have understood that my words were correct, even though they might not be those usually used to discuss the topic. No amount of mind reading would save you in this case. Shut the fuck up Dirk, now I do have antipathy for you.

Blogger Azure Amaranthine August 07, 2019 1:18 AM  

I give you rope and you try to dishonestly hang me with it. Screw you.

Blogger Markku August 07, 2019 1:25 AM  

Well ok, technically not the ENTIRE population or we wouldn't be talking about this. A statistically insignificant number is taken out.

Blogger Dole August 07, 2019 1:41 AM  

@Markku Temporal difference has nothing to do with it. You are sampling the entire population into the event B and want to know the probability of the sample containing no Americans who attended event A. That means no replacement.

Probability that the first guy didn't participate in A is very high.
The second guy, slightly lower, since the first guy is taken out of the population.
The third guy, even lower, since again, there are fewer of those who did not go to A left to sample from.

The formula that you used is simply put irredeemably wrong, accept it. The fact that the phd didn't catch the mistake is shocking.

In this way it also doesn't matter whether we sample into event A or event B, as it should be. Any calculation that gives two different values for what should be the same probability is, of course, ludicrous.
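The symmetry point can be checked exactly on a toy population. This is only a sketch with invented numbers (10 people, 3 at event A, 4 at event B) and hypothetical function names; it uses exact fractions so no rounding is involved.

```python
from fractions import Fraction
from math import comb

def miss_exact(population, marked, drawn):
    """P(no marked person among `drawn` people), sampling without replacement."""
    return Fraction(comb(population - marked, drawn), comb(population, drawn))

def miss_constant_p(population, marked, drawn):
    """Same quantity under the constant-p (with replacement) shortcut."""
    return (1 - Fraction(marked, population)) ** drawn

# Toy numbers chosen only to keep the fractions readable.
N, A, B = 10, 3, 4

exact_ab = miss_exact(N, A, B)          # mark A's attendees, draw B's
exact_ba = miss_exact(N, B, A)          # mark B's attendees, draw A's
shortcut_ab = miss_constant_p(N, A, B)
shortcut_ba = miss_constant_p(N, B, A)

print(exact_ab, exact_ba)        # 1/6 1/6
print(shortcut_ab, shortcut_ba)  # 2401/10000 27/125
```

The without-replacement result is the same whichever event is treated as the sample, while the constant-p shortcut gives two different answers. With hundreds of millions of people the gap shrinks to almost nothing, but it does not vanish.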

Blogger SirHamster August 07, 2019 1:44 AM  

Dole wrote:The formula that Markku uses is still wrong, and on each draw the probability that the person is drawn from the event A should increase as there are less Americans to draw from since they are already occupying the seats of the event B.

If you're talking about the binomial formula I brought up, it uses factorial, which accounts for drawing without replacement.

n! = 1*2*3*4 ... *(n-1)*n

The neat thing is that for small values of k, n!/(k!(n-k)!) can be simplified or approximated easily.
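One common small-k approximation is n^k / k!. Whether that is the simplification meant here is an assumption; the sketch below just compares it with the exact coefficient for a population-sized n, using a hypothetical helper name.

```python
from math import comb, factorial

def comb_small_k(n, k):
    """Approximate n!/(k!(n-k)!) by n**k / k!, reasonable when k is tiny next to n."""
    return n ** k / factorial(k)

n, k = 350_000_000, 3     # illustrative values, not taken from the thread
exact = comb(n, k)        # exact integer result
approx = comb_small_k(n, k)
rel_err = abs(approx - exact) / exact

print(f"exact  = {exact}")
print(f"approx = {approx:.6e}")
print(f"relative error = {rel_err:.2e}")
```

The relative error is on the order of k(k-1)/(2n), so for k = 3 and n in the hundreds of millions it is far below anything that would show up in three significant figures.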

Blogger Dole August 07, 2019 2:02 AM  

I am talking about the method that was outlined by Markku, which phd supposedly verified.

Blogger Stephen St. Onge August 07, 2019 4:52 AM  

60.  VD

        MC: “So almost 50% chance that at least 1 person was at both.  Over many mass shootings, you would expect to occasionally get some double attendees.”

        VD: “Now apply this logic to four mass shootings and the probabilities that anyone from the first three shootings was at the fourth one.

        “In other words, why weren’t there Parkland and Columbine and Virginia Tech survivors at Gilroy? And why weren’t there Las Vegas and Gilroy survivors in El Paso, given the odds you give?”

        To figure the probabilities, we have to know how many were at each event.

        Searching Infogalactic, I get figures of about 101 survivors of the Parkland shooting.  Probability of a Parkland survivor being at Gilroy is therefore about 0.5%, assuming equal probability for anyone in the U.S. being at Gilroy, which is probably too high because Florida is three thousand miles from California.

        Around 488 students at Columbine.  Couldn’t find staff figures, so let’s say 50, one for every ten students.  Gives about a 2.7% chance of any of them being at Gilroy, using the same assumption.

        Virginia Tech is much harder, because it’s harder to define “survivor.” Cho went into a room in a residence hall, shot Emily Hilscher, shot Ryan Clark who tried to aid her, then left.  How many survivors? I chose 895, the approximate number of people who roomed there.  May be too high, since some must have been gone, but let’s not quibble.  Cho then went to Norris Hall and killed more people.  How many Norris Hall survivors? I can’t find any figures, so I used 895 again.  The result is about an 8.7% probability that one Virginia Tech survivor was at Gilroy.

At El Paso, I get about 1100 on the scene.  Therefore, the probability of a Gilroy survivor being at the scene is about 5.5%, the probability of a Vegas survivor being at the scene is about 4%.

        Overall, assuming equal probabilities that someone at one event was at the others (again, likely untrue, because they were in widely separated parts of the country), the probability that some one person at Columbine, OR Parkland, OR Virginia Tech was at Gilroy is about 12%.  And the probability that anyone from the three school shootings OR at Gilroy OR at Las Vegas was at El Paso is about 20%.

        So, eight to one against a school shooting survivor being at Gilroy, four to one against anyone at the school shootings, Vegas, or Gilroy being in El Paso.

        Stop digging, VD.
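For reference, the "OR" combination for Gilroy can be reproduced from the three per-school percentages quoted above. This sketch assumes the three events are independent and takes the rounded figures (0.5%, 2.7%, 8.7%) at face value; the function names are illustrative.

```python
def p_any(probabilities):
    """P(at least one event happens), assuming the events are independent."""
    p_none = 1.0
    for p in probabilities:
        p_none *= 1.0 - p
    return 1.0 - p_none

def odds_against(p):
    """Odds against an event of probability p, expressed as 'x to one'."""
    return (1.0 - p) / p

# Rounded per-school figures from the comment: Parkland, Columbine, Virginia Tech.
school_at_gilroy = [0.005, 0.027, 0.087]

p_gilroy = p_any(school_at_gilroy)
print(f"P(any school survivor at Gilroy) = {p_gilroy:.3f}")
print(f"odds against = {odds_against(p_gilroy):.1f} to one")
```

With these inputs the combined probability comes out just under 12%, or a little under eight to one against.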

Blogger Stephen St. Onge August 07, 2019 4:57 AM  

        And btw, the probability of a LV survivor being at Gilroy, using the equal probability assumption, is about 49%.  Probability of three being there is therefore about 12%.

        Using the more realistic assumption that people at LV were more likely to be at Gilroy than 1 in 350 million (because the events are relatively close to each other), the probability of attendance at both climbs.

Blogger Markku August 07, 2019 6:23 AM  

Ok, I see what you mean, though I don't "accept" "irredeemably wrong". The midpoint error (once we have sampled half the population) is 350,000,000 vs 349,973,600 and the question as it was posed at the other place was, is 88.8% right, as opposed to the minuscule number presented by Vox. That kind of difference wasn't going to affect a calculation to three significant figures. But yes, the p does change for each individual. Going with p taken from the midpoint would have been a better approximation. Theoretically it could have affected which way you round the last digit.

Blogger Markku August 07, 2019 6:28 AM  

Which formula is that, SirHamster? If you mean the original binomial probability calculation, no, it doesn't at this particular location. Because you still have the p there, and that p is uniform. In reality, the p also changes.

Blogger Markku August 07, 2019 6:29 AM  

You can confirm this by using it for n=0 and finding that it ends up being the exact formula used by OP and myself in that special case, due to so many terms disappearing. If mine was wrong, then the binomial was also wrong.

Blogger Markku August 07, 2019 6:30 AM  

I meant k=0
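The k=0 special case is easy to check numerically. The sketch below assumes the post's figures (p = 14,520 / 350,000,000 and n = 52,800 draws) and writes out the textbook binomial probability; `binom_pmf` is a hypothetical helper, not something from the thread.

```python
from math import comb

def binom_pmf(k, n, p):
    """Binomial probability of exactly k hits in n draws with constant p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

# Figures from the post; constant p is the assumption under discussion.
p = 14_520 / 350_000_000
n = 52_800

none_present = binom_pmf(0, n, p)  # comb(n, 0) = 1 and p**0 = 1 drop out
shortcut = (1 - p) ** n            # the formula used in the post

print(f"binomial, k=0: {none_present:.6f}")
print(f"(1 - p)^n:     {shortcut:.6f}")
print(f"at least one:  {1 - none_present:.3f}")
```

Both expressions collapse to the same number, so the binomial formula and the (1 - p)^n shortcut stand or fall together, constant p included.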

Blogger Markku August 07, 2019 6:50 AM  

I freely admit that I have never actually calculated hypergeometric probability. If k has not been very small compared to n, I have given up instead and hoped someone else will deal with it. Which has universally been the case.

Blogger Markku August 07, 2019 7:23 AM  

However, let's be even MORE generous and count the entire 52,800 for EVERYONE in the population when calculating the p. Result: 88.8%. The true answer, then, lies between 88.8% and 88.8%.
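The bracketing argument can be sketched in a few lines. It assumes the post's figures and evaluates the constant-p formula three times: with p taken before anyone is drawn, at the midpoint, and with all 52,800 already removed from the pool. Variable names are illustrative.

```python
def p_at_least_one(p, draws):
    """Chance of at least one hit in `draws` draws with constant per-draw p."""
    return 1.0 - (1.0 - p) ** draws

# Figures from the post: population, LV attendees, Gilroy attendees.
N, A, B = 350_000_000, 14_520, 52_800

p_first = A / N             # nobody removed yet: smallest p
p_mid = A / (N - B // 2)    # half of B removed: 349,973,600 left
p_last = A / (N - B)        # all of B removed: largest p

low = p_at_least_one(p_first, B)
mid = p_at_least_one(p_mid, B)
high = p_at_least_one(p_last, B)

print(f"low  = {low:.5f}")
print(f"mid  = {mid:.5f}")
print(f"high = {high:.5f}")
```

Since the per-draw p can only move between `p_first` and `p_last`, the without-replacement answer has to sit between `low` and `high`, and both round to 88.8%.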

Blogger Dole August 07, 2019 8:22 AM  

Yes, I didn't doubt that. What is surprising is that a phd approved the method (not the answer). Dumbing down indeed when a phd can't even handle basic probability.

And yes the binomial formula is also wrong, it samples with replacement.

So in any case we have been done here for a while: there is no statistical argument that people can't be at two events. Many outlined why the probability should be lower. Well, here are reasons why it should be higher:
- The events are very age selective.
- The events attract certain type of people.
- We are only considering two events. The probability of people from ANY of the possible shooting events attending is higher.
- Leftists classify anyone in the same town... county as attending anyway.

Blogger billo August 07, 2019 8:50 AM  

BadThinker wrote:

The basic point I want to make is that those kinds of questions aren't useful. So yes, you are right - you can make inferences about *uncertain* past events.



Yeah, I think that if we were sitting in a hotel lounge drinking vodka martinis, we wouldn't be having this issue -- a comment section is not a great medium for these kinds of discussion.

I guess the bottom line for me is that the discussion between VD and the OP (whose handle I forget) is all about which version of the "prosecutor's fallacy" to use, and ignores that it *is* the prosecutor's fallacy.

Our discussion is a little in the weeds beyond that. I disagree and think those kinds of questions *are* useful -- though that brings in the whole frequentist/bayesian thing -- and are part and parcel of what a lot of us do. But I think you are right. Our disagreement may be primarily in how we are describing things, and not so much at the core.

Blogger billo August 07, 2019 8:58 AM  

BadThinker wrote:Even then, I'd argue that using a specific number is far too certain. Better to simply say 'very likely, because of all of this evidence here', or, if you must, use a range of values when quantifying the uncertainty.


No argument there. In fact, the National Institute of Standards and Technology Forensic Science Board has come up with a position statement that, in general, such statements should not be used. In particular, they have come out against the use of the terms "reasonable medical certainty" and "reasonable scientific certainty" in court because they really don't have quantitative meaning.

When I am on the stand, I generally refuse to provide quantitative probability estimates in my cases. Instead, I provide the "You're kidding me" scale, which consists of:

1) Well, I'm not surprised, I guess
2) Hmmm, that's surprising, but OK
3) Wow, I really wasn't expecting that. I would have bet a week's salary on it.
4) You're kidding. I would have bet a year's salary that I was right.
5) That just can't be the case. I don't believe it. This is alien abduction territory.

Juries and judges seem to be OK with that kind of Likert scale response, and it's never come back to bite me on appeal.

Blogger Happy Birthday August 07, 2019 9:18 AM  

MC is right and Vox is wrong. It's the Birthday Paradox applied to concert attendance.

With a more reasonable model factoring in geography and demographics (different ages and demos have different probabilities of attending, concerts draw from a population within a limited radius of the venue, some small portion of the population attends concerts within radius n at a very high probability, etc.), the expected number of double attendees would almost definitely exceed 1.

The mistake in the OP is understandable, but it's still a mistake.

Math is hard, barbies.
