18 April 2019

The tyranny of p-values

An article in Science News this week will be of interest to anyone with experience in scientific research.
In science, the success of an experiment is often determined by a measure called “statistical significance.” A result is considered to be “significant” if the difference observed in the experiment between groups (of people, plants, animals and so on) would be very unlikely if no difference actually exists. The common cutoff for “very unlikely” is that you’d see a difference as big or bigger only 5 percent of the time if it wasn’t really there — a cutoff that might seem, at first blush, very strict...
More than 800 statisticians and scientists are calling for an end to judging studies by statistical significance in a March 20 comment published in Nature. An accompanying March 20 special issue of the American Statistician makes the manifesto crystal clear in its introduction: “‘statistically significant’ — don’t say it and don’t use it.”

There is good reason to want to scrap statistical significance. But with so much research now built around the concept, it’s unclear how — or with what other measures — the scientific community could replace it. The American Statistician offers a full 43 articles exploring what scientific life might look like without this measure in the mix.

More at the link, and the subject matter is important.
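
To make that quoted definition concrete, here's a minimal simulation - my own sketch in Python, not anything from the article - of what the 5 percent cutoff means. Run ten thousand "experiments" comparing two groups drawn from the same distribution, so that no real difference exists, and roughly one in twenty will come up "significant" anyway:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

n_experiments = 10_000
n_per_group = 30
false_positives = 0

for _ in range(n_experiments):
    # Both groups come from the same distribution: no true difference.
    a = rng.normal(loc=0.0, scale=1.0, size=n_per_group)
    b = rng.normal(loc=0.0, scale=1.0, size=n_per_group)
    _, p = stats.ttest_ind(a, b)
    if p < 0.05:
        false_positives += 1

# By construction, about 5% of these null "experiments" clear the
# conventional significance bar anyway.
print(f"{false_positives / n_experiments:.1%} significant at p < 0.05")
```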

In an earlier phase of my life I spent uncounted hours in an empty lab after all the staff had gone home, crunching numbers with an HP calculator, and sometimes coming up with p-values that didn't meet the 0.05 cutoff that would determine acceptance for publication - knowing that the results were "true" and "important" but wouldn't be accepted.  Then looking at the notebooks and seeing some outliers, resisting the urge to ignore (lose) a data point or two, then having to decide whether the dataset could be analysed with a Mann-Whitney U nonparametric test instead (a quick sketch of that fallback appears after the quote below).  And the counterpoint was serving as a reviewer for several journals, reading manuscripts and thinking "yes, I see the p-value, but the study is still bullshit" (but having to write something more circumspect in the review).  I totally agree with this sentiment:

This isn’t the first call for an end to statistical significance, and it probably won’t be the last. “This is not easy,” says Nicole Lazar, a statistician at the University of Georgia in Athens and a guest editor of the American Statistician special issue. “If it were easy, we’d be there already.”
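
For the curious, here's roughly what that nonparametric fallback looks like - a minimal sketch with made-up numbers (mine, not from any real lab notebook), showing why a rank-based test like the Mann-Whitney U is less rattled by a single outlier than a t-test:

```python
from scipy import stats

# Hypothetical data: control vs. treatment, with one outlier in the
# treatment group (all numbers here are invented for illustration).
control   = [4.1, 3.8, 4.5, 4.0, 3.9, 4.2, 4.4]
treatment = [4.8, 5.1, 4.9, 5.3, 4.7, 5.0, 9.6]  # note the outlier

# The t-test assumes roughly normal data, so one wild point drags the
# means and variances around; the Mann-Whitney U test works on ranks,
# so a single outlier can't dominate the result.
t_stat, t_p = stats.ttest_ind(control, treatment)
u_stat, u_p = stats.mannwhitneyu(control, treatment, alternative="two-sided")

print(f"t-test:         p = {t_p:.4f}")
print(f"Mann-Whitney U: p = {u_p:.4f}")
```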

2 comments:

  1. I'm a judge at my regional and state high school science project fairs. I've been amazed at how, over the course of a few years, the kids have been programmed to rattle off the p-values of their projects.

    IMHO p-values are only really relevant when you do big data and you're looking for tiny effects - when you're looking for a signal in noise. Otherwise, these p-values are as easily manipulated as the rest of the data.

  2. I've heard a lot about p-hacking lately, maybe something will finally come of it! Here's the obligatory XKCD link :)
    https://www.xkcd.com/882/
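
    To see the comic's point in numbers, here's a quick simulation (my own sketch, with made-up "jelly bean" data) of running 20 independent tests against pure noise - at alpha = 0.05 the chance of at least one fluke "hit" is 1 - 0.95^20, about 64%:

    ```python
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(882)  # seed is a nod to the comic number

    n_colors = 20      # 20 jelly bean colors = 20 independent hypotheses
    n_per_group = 50
    p_values = []

    for _ in range(n_colors):
        # Jelly beans do nothing: both groups are pure noise.
        acne_with_beans = rng.normal(size=n_per_group)
        acne_without = rng.normal(size=n_per_group)
        p_values.append(stats.ttest_ind(acne_with_beans, acne_without).pvalue)

    hits = [p for p in p_values if p < 0.05]
    print(f"{len(hits)} of {n_colors} tests 'significant' by luck alone")
    ```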
