ALL BLOG POSTS AND COMMENTS COPYRIGHT (C) 2003-2016 VOX DAY. ALL RIGHTS RESERVED. REPRODUCTION WITHOUT WRITTEN PERMISSION IS EXPRESSLY PROHIBITED.

Tuesday, December 16, 2014

The randomness of scientistry

Science is finally turning scientody on scientistry... and the results are not as self-flattering to professional science as most scientists expected.
The NIPS consistency experiment was an amazing, courageous move by the organizers this year to quantify the randomness in the review process. They split the program committee down the middle, effectively forming two independent program committees. Most submitted papers were assigned to a single side, but 10% of submissions (166) were reviewed by both halves of the committee. This let them observe how consistent the two committees were on which papers to accept.  (For fairness, they ultimately accepted any paper that was accepted by either committee.)

The results were revealed this week: of the 166 papers, the two committees disagreed on the fates of 25.3% of them: 42. But this “25%” number is misleading, and most people I’ve talked to have misunderstood it: it actually means that the two committees disagreed more than they agreed on which papers to accept. Let me explain.

The two committees were each tasked with a 22.5% acceptance rate. This would mean choosing about 37 of the 166 papers to accept. Since they disagreed on 42 papers total, this means each committee accepted 21 papers that the other committee rejected and vice versa, for 21 + 21 = 42 total papers with different outcomes. Since they each accepted 37 papers, this means they disagreed on 21/37 ≈ 56% of the list of accepted papers.
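The arithmetic in the paragraph above can be checked with a short sketch. The function name and dictionary keys are illustrative, not from the original experiment, and the code assumes, as the paragraph does, that the 42 disagreements split evenly between the two committees.

```python
def disagreement_summary(n_papers, accept_rate, n_disagreements):
    """Reproduce the back-of-envelope numbers from the paragraph above.

    Assumption: the disagreements split evenly, so each committee
    accepted n_disagreements / 2 papers that the other one rejected.
    """
    accepted_per_committee = round(n_papers * accept_rate)
    one_sided = n_disagreements // 2
    accepted_by_both = accepted_per_committee - one_sided
    return {
        "accepted_per_committee": accepted_per_committee,
        "one_sided_disagreements": one_sided,
        "accepted_by_both": accepted_by_both,
        # Share of one committee's accept list that the other rejected.
        "share_of_accept_list_disputed": one_sided / accepted_per_committee,
        # Share of all doubly-reviewed papers with different outcomes.
        "overall_disagreement": n_disagreements / n_papers,
    }


if __name__ == "__main__":
    for key, value in disagreement_summary(166, 0.225, 42).items():
        print(key, value)
```

The two ratios at the end are the point of the passage: the same 42 papers are about a quarter of all 166 submissions but more than half of either committee's accept list.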

In particular, 56% of the papers accepted by the first committee were rejected by the second one and vice versa. In other words, most papers at NIPS would be rejected if one reran the conference review process (with a 95% confidence interval of 40-75%).
What rightly concerns the writer is the fact that a purely random process would have resulted in a 77.5 percent disagreement, which is closer to the 56 percent observed than the 30 percent expected. And, of course, the 0 percent that the science fetishists would have us believe is always the case.
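The 77.5 percent baseline can be sketched as follows. This is a minimal illustration, not the organizers' analysis: it assumes each committee picks its accepted papers uniformly at random, and the function names are hypothetical. Under that assumption, a paper accepted by one committee is rejected by the other with probability one minus the acceptance rate.

```python
import random


def expected_random_disagreement(accept_rate):
    """Chance that a randomly choosing second committee rejects a paper
    the first committee accepted, given a fixed acceptance rate."""
    return 1.0 - accept_rate


def simulate_random_disagreement(n_papers, n_accept, trials, seed=0):
    """Monte Carlo check: two committees each accept n_accept papers
    uniformly at random; return the average share of the first
    committee's accept list that the second committee rejected."""
    rng = random.Random(seed)
    papers = range(n_papers)
    total = 0.0
    for _ in range(trials):
        first = set(rng.sample(papers, n_accept))
        second = set(rng.sample(papers, n_accept))
        total += len(first - second) / n_accept
    return total / trials


if __name__ == "__main__":
    print(expected_random_disagreement(0.225))
    print(simulate_random_disagreement(166, 37, trials=2000))
```

With exactly 37 of 166 papers accepted, the simulated figure lands slightly above 77.5 percent because 37/166 is a little under 22.5 percent.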

This is a very important experiment, because it highlights the huge gap between science the process (scientody) and science the profession (scientistry). Some may roll their eyes at my insistence on using different words for the different aspects of science, but the observable fact, the scientodically informed fact, is that using the same word to refer to the two very differently reliable aspects of science is incredibly misleading.

36 Comments:

Blogger Chris Scena December 16, 2014 12:03 PM  

I don't know which is a sadder statement; that the events in this article happened or that no one is surprised to hear it.

Anonymous dantealiegri December 16, 2014 12:07 PM  

RIP, science by committee, let it stay dead this time.

Anonymous Michael Maier December 16, 2014 12:09 PM  

I am confused. Why would a subjective process resemble random results? And since it is a subjective process so why would the science fetishists ever expect 0% variance?

Anonymous VD December 16, 2014 12:14 PM  

Why would a subjective process resemble random results?

Irrelevant. It's mentioned for the purpose of comparing the value of the process.

And since it is a subjective process so why would the science fetishists ever expect 0% variance?

Because they believe fervently in the reliability of peer-reviewed paper science. If peer-reviewed paper science were as reliable as they claim it to be, both groups should have selected the same objectively superior 37 papers.

Anonymous Huckleberry -- est. 1977 December 16, 2014 12:24 PM  

Some may roll their eyes at my insistence on using different words for the different aspects of science

I certainly don't.
Your coinage for this immensely important distinction is quite helpful and much appreciated.

Anonymous Sensei December 16, 2014 12:28 PM  

Can someone who speaks statistics explain what "a 95% confidence interval of 40-75%" means?

Anonymous Jack Amok December 16, 2014 12:30 PM  

Why would a subjective process resemble random results?

The point is that it shouldn't, but it did.

And since it is a subjective process so why would the science fetishists ever expect 0% variance?

Science is supposed to be objective, not subjective.

Anonymous RS December 16, 2014 12:33 PM  

Sensei -

Assuming the underlying statistical distributions do not change, if you were to repeat this process for a grand total of 20 times, you would expect that 19 out of 20 of the trials would have disagreement rates between 40% and 75% (the single trial observed 56%).

Anonymous Porky December 16, 2014 12:42 PM  

Ha! This is a helpful benchmark. Now any committee with less than 56% disagreement can reliably be considered biased.

Anonymous A Paradigm Is More Than Twenty Cents December 16, 2014 12:54 PM  

Count me among those who found the terms "scientody" and "scientistry" to be redundant, somewhat pretentious and therefore of no use.

However, this little study is quite clarifying; it verifies some of my own observations about the peer review process. Not in the sense of confirmation bias, but in the sense of "yeah, I suspected as much".

Therefore I reluctantly accept the terms, because science the process is obviously not the same as science the profession and thus the words clarify thought.

Anonymous Michael Maier December 16, 2014 12:55 PM  

So between this and the other link you posted a while back about non-replicability, how much more evidence do we need to show science doesn't work, bitches?

Oh wait... SJW and other liberal idiots ... yeah, forget I asked.

Anonymous Earl December 16, 2014 1:02 PM  

I am glad you use different words for different aspects of science. One aspect buys itself legitimacy at the other aspect's expense. Just like the macro-evolutionists credit themselves with micro-evolution.

Anonymous Sensei December 16, 2014 1:29 PM  

19 out of 20 of the trials would have disagreement rates between 40% and 75%

Thanks RS, makes sense.

Anonymous maniacprovost December 16, 2014 1:32 PM  

And yet my defining "ethics" to be based on observation of the natural world, while "morality" is based on divine revelation, is a ludicrous redefinition of terms that has no place in dialectical debate.

Well, I'm a nobody, so I can't make up words at will. I am pretty sure that I was responsible for the widespread use of the word "fail" in MMOs 5 to 10 years ago, so I have that.

Blogger Vox December 16, 2014 1:50 PM  

And yet my defining "ethics" to be based on observation of the natural world, while "morality" is based on divine revelation, is a ludicrous redefinition of terms that has no place in dialectical debate.

It's not that you're a nobody, it's that you are a nobody offering a competing definition for perfectly well-defined words with which most people are already familiar. Your definitions don't clarify, they confuse.

My neologisms are for concepts insufficiently articulated. It's as if there was only one word, teal, to cover all blues and greens, and I'm saying THIS is blue and THAT is green. You're looking at red and yellow and deciding to call them hwirth and murgu.

Anonymous Stickwick December 16, 2014 1:51 PM  

Michael Maier: So between this and the other link you posted a while back about non-replicability, how much more evidence do we need to show science doesn't work, bitches?

A little perspective is in order. First, note that almost all of the articles discussing the problems in science have to do with the medical field, biology, climatology, or the social sciences. The hard sciences are not without their problems, but in general they are in far better shape than the softer, and certainly the social, sciences.

Second, science is only as good as the culture from which it derives its values and ethics. When science is influenced by big money and politics and becomes ever-more divorced from its Christian roots, we can expect to see more corruption, sloppiness, and an overall decline in standards.

Science based on the biblical notion that all things must be tested, and that we are ultimately seeking His Truth, has led to magnificent work. Science based on the humanist notion that all things must serve a subjective worldview, and that there is ultimately no objective truth, has led to, well, what we're seeing now in many fields.

Anonymous Athor Pel December 16, 2014 1:57 PM  

" maniacprovostDecember 16, 2014 1:32 PM
...
I am pretty sure that I was responsible for the widespread use of the word "fail" in MMOs 5 to 10 years ago, so I have that."

If you played a tank I understand your pain and forgive you.
If you played dps then you stink and your momma dresses you funny.
If you played a healer then by definition you do no wrong, all tanks love you and beautiful music plays as you walk by.

Anonymous jack December 16, 2014 2:15 PM  

Bless you Stickwick!
keep keeping it real.

Anonymous Stephen J. December 16, 2014 2:18 PM  

I agree that a distinction in terms needs to be made, but I'm not a fan of "scientody" and "scientistry", for several reasons: 1) the words just sound rather goofy, 2) they can be easily misremembered for each other with a lapse in attention, and 3) they appear vulnerable to the rhetorical accusation of hairsplitting nitpickiness for someone who hasn't grasped the sharp distinction of their meanings.

I therefore propose that "science" retain its classic meaning as the technique of empirical testing of hypotheses about physical reality, and that everything related to the employment, publication, profession and political aspects of the vocation -- anything, in other words, where anything other than such empirical testing determines results -- be called academarchy (for "rule by the school"). Anything "scientific" is distinguished by explicit use of the full scientific method, e.g. full replication of an experiment; anything "academarchic" involves a decision by an invested academic authority for reasons other than that full empiricism. Academarchic peer-review, thus, proves only that a majority of validated judges consider a paper valid as is, and can be taken with all the grains of salt necessary by those who know the politics of the academarchy.

Blogger Cee December 16, 2014 2:19 PM  

First, note that almost all of the articles discussing the problems in science have to do with the medical field, biology, climatology, or the social sciences.
B-but then how would we have found that most people get their Vitamin E from desserts, and smokers derive most of their Vitamin C from pizza??

Science works!!!

This past week has been a great "what the hell are biologists even doing over there" experience for me.

Blogger Doom December 16, 2014 2:25 PM  

No, I like the notion of separating practices through terms, because they are different. Much as what needs to happen in church terminology, which I seem to recall you messing with as well. The church as a building, as a people, as a belief system, as a bride of Christ, and possibly others. I'm just a little uncertain, always, exactly which you are discussing if the context is less than revealing, alluding to some past debate, or such. You honestly need a side page of Vox terms. It would help the terminology go mainstream. If well done, as I began using the terms, I might even include a link for others to go find the terms and meanings as I begin using them more often.

So? Get teh dang Voxitionary up, already. I've only got a couple of decades of life left to me. Can't wait all year.

Anonymous Discard December 16, 2014 2:42 PM  

It seems to me that this is what you would expect from having 4,500 degree-granting institutions in this country, over half of them four-year schools. They've all got science professors and they all have to write stuff, at least once. How many are simply mediocre? How many would have been first-rate bakers and woodcarvers but are busily boring their students instead?

This might just be a necessary consequence of requiring teachers to publish. We need physics and chemistry professors to teach engineers, and we need biology professors to teach doctors. But we probably ought to look into separating teaching professors from researchers. Let the real scientists do research and teach if they want to, and let the PhD'd schoolteachers do a little research if they want to.

Anonymous patrick kelly December 16, 2014 3:18 PM  

Wow, awesome thread.

Still not sure I can explain without just parroting others' words what scientody and scientistry are, but this barely mid-wit now has a better understanding of why the separation from science is necessary and profitable.

Blogger SirHamster December 16, 2014 3:27 PM  

So between this and the other link you posted a while back about non-replicability, how much more evidence do we need to show science doesn't work, bitches?

None. Science! runs on blind faith.

That's why they continuously mischaracterize people with an evidence-based faith as holding a blind faith - they project.

Blogger JCclimber December 16, 2014 5:30 PM  

Still sadly shaking my head over the results of something that happened at Amgen just a couple years ago.

Billion-dollar international company. Lots of research, but paid for with the expectation that the research will lead to marketable results. Their scientists were starting to wonder why experiments were not giving expected results.

So... they decided, on the company dime, to try to replicate 53 important published research papers in the Biotech field.

They were only able to replicate 6 of the 53 studies. Experienced scientists who WANTED to replicate the original papers, not disprove those papers. I know from my years spent at Amgen's competitor and from my interactions with Amgen myself that their scientists are no slouches.

But then, I've been a skeptic of published science since I went to a trade conference in the 90's and saw my own name on a science study being used to sell a particular product to attendees. Almost lost my job after complaining about not being notified about that in advance. The reason for the complaint? Since I ran the experiments, I knew that the product was barely better than water, and knew how many times we had to run the trials just to get the statistical power to conclude that "yes, it is statistically superior to plain water".

Anonymous Giuseppe December 16, 2014 6:12 PM  

Vox,
I tend to agree with Stephen J. I find the new terms a bit unwieldy. Personally I think science and things scientific are well defined, and by trying to invent new terms we in a sense give the SJWs involuntary ammunition by allowing them to pervert the language and redefine words. While Stephen's Academarchy word is just as ... unpalatable(?) it at least has the distinction of instantly being clear. Personally I simply tend to explain that most scientists do science in the same way that most chimps do science, i.e. not often or well, because that is the nature of trained circus apes.
It's unwieldy and academarchy probably serves better (although I understand your attachment to your words probably because of TIA) although I personally prefer my own version i.e:
Circus Monkeys (or even apes) != scientists

Anonymous Giuseppe December 16, 2014 6:16 PM  

JCclimber,
I recall long ago in I think New Scientist, a study that basically scientifically proved the point that 95% of everything is bullshit. I think that was probably real science at work there.

Anonymous kh123 December 16, 2014 6:30 PM  

"They were only able to replicate 6 of the 53 studies. Experienced scientists who WANTED to replicate the original papers, not disprove those papers."

Indefatigable Science Fetishist: "Well, that just proves that the method and peer review works. And if nothing else, it proves that flat-earthers and nay-sayers don't understand how Science works. Because amazingly accurate results. And every rape allegation from the four corners of academia is undoubtedly accurate."

Anonymous kh123 December 16, 2014 6:32 PM  

"Circus Monkeys (or even apes) != scientists"

I see what you did there.

Blogger Anthony December 16, 2014 7:24 PM  

One further complication, not discussed at the original post or by VD:

Rejection doesn't necessarily mean the results are bad; acceptance doesn't necessarily mean the results are good. Rejection can be for poor writing or poor organization or because the reviewer thinks the results aren't important. Any of those reasons for rejection could mean that the *results* are perfectly valid, but that the reviewer doesn't think the *paper* should be published. A paper might get accepted because the reviewer thinks everything was done well enough, the paper is written well enough, and the result agrees with the reviewer's prejudices; however, even if the methodology is sound, that 95% confidence interval means that 1 in 20 results is just plain wrong.

The experiment was done with a conference proceeding, where there's a time deadline; if a paper is submitted to a journal, poor writing or poor organization can result in a "revise and resubmit", meaning the reviewer thinks the *results* are worth something, but that the *paper* isn't yet publishable. On the other hand, conference proceedings are typically seen as less important than journals, so reviewers might be more lax about accepting good results in a bad paper than they would for a journal.

Anonymous Stephen J. December 16, 2014 8:31 PM  

"Stephen's Academarchy word is just as ... unpalatable"

You only say that because I passed on my first thought, which was the horrible term Scholastico-Bureaucracy.

I wanted something that would summarize the procedural political calcification being criticized in an immediately obvious way.

Blogger MendoScot December 16, 2014 8:56 PM  

Every scientist I know has horror stories of the peer review process. And I could add my experience, including the world's top journals. The process is corrupt, incompetent and often extortionate. The editors are more worried about maintaining their stable of reviewers than actually considering whether the reviews are valid.

And the number of new journals offering to publish your research for cost? Almost all based in India and China?

Well, ciao to the Western Christian model of science.

Publish or perish.

Anonymous sth_txs December 16, 2014 9:03 PM  

Great study! Something to use on LinkedIn when arguing with dopes who throw the peer review comment around as if it is some sacred infallible process. Yeah, people with degrees in engineering and science really put faith in this.

Anonymous Titus Didius Tacitus December 17, 2014 4:56 AM  

A problem with academarchy (for "rule by the school") is that ultimately the school doesn't rule.

Bruce Charlton has good points on the not-even-trying new scientific elite, and one of them is that the mass media rules. From time to time the mass media pillories an academic (like Watson) and enforces the taboos of the tribal-moral community. The opposite doesn't happen. Academics don't put a mass media figure in the pillory from time to time, certainly not in the way Watson was made an un-person. So there's a pecking order in the determination of scientific and academic acceptability, and scientists and academics are not at the top of it.

Blogger William Ogle December 17, 2014 9:19 AM  

The discrepancy is simple: by arbitrarily fixing the number of acceptances (37), they are forcing the issue. Sixteen publications were accepted by both committees; these pass the review criteria. The rest (21) are forced: they don't meet the acceptance criteria but have to be selected to reach 37 acceptances. That forces a choice that turns out to be random. Perutz said 90% of all science is crap; this supports that observation.

Anonymous Stephen J. December 17, 2014 9:50 AM  

"A problem with academarchy... is that ultimately the school doesn't rule. ...(T)he mass media rules. From time to time the mass media pillories an academic (like Watson) and enforces the taboos of the tribal-moral community. The opposite doesn't happen."

True, but I have to say that strikes me as like saying the KGB ruled the Soviet Union because they were the ones who shot the dissidents behind the woodsheds; they did do that and their agents had power because of that fact, but for the most part they took orders from the Politburo rather than giving them. Note that most times an expert of any stripe comes in for a public pasting, the media almost always gets other experts to provide the key testimony; it's only when the offense is so politically incorrect that they don't need such counter-expertise (like Watson's ostensible racism, where "everybody knows" how wrong it is) that they don't bother.

And the thing about the mass-media pilloryings is that they are the rarest and most extreme form of sanction. Most of the academarchy's control actions happen well out of public view, inside the administrative offices, where offensive papers never make it as far as the media before they're rejected or censored; the "Climategate" emails are a perfect example of this process.

Scientists, i.e. real empirical experimenters or theoretical developers, may not be at the top of the academic pecking order, but the media are more the lions in the arena than the emperor; it's the deans and department heads who are the real rein-holders, I think.
