*sensitivity* or *true positive rate* is the probability that a test result will be positive when the disease is actually present. The *specificity* or *true negative rate* is the probability that a test result will be negative when the disease is not actually present. Different choices of diagnostic criteria correspond to different combinations of sensitivity and specificity. A more sensitive diagnostic test could reduce false negatives, but might increase the false positive rate. Receiver operating characteristic (ROC) curves are a way to visually present this tradeoff by plotting the true positive rate (sensitivity) on the y-axis and the false positive rate (100% - specificity) on the x-axis.
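
To make the tradeoff concrete, here is a minimal Python sketch. The scores, the disease labels, and the function name `confusion_rates` are all made up for illustration; the only thing taken from the definitions above is how sensitivity and specificity are computed. It treats any score at or above a threshold as a positive test, and rates are expressed as fractions between 0 and 1 rather than percentages.

```python
def confusion_rates(scores, has_disease, threshold):
    """Return (sensitivity, specificity), counting score >= threshold as a positive test."""
    tp = fn = tn = fp = 0
    for score, sick in zip(scores, has_disease):
        positive = score >= threshold
        if sick and positive:
            tp += 1      # disease present, test positive
        elif sick:
            fn += 1      # disease present, test negative
        elif positive:
            fp += 1      # disease absent, test positive
        else:
            tn += 1      # disease absent, test negative
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    return sensitivity, specificity


# Hypothetical test scores (higher = more suggestive of disease) and true status.
scores = [0.1, 0.3, 0.4, 0.6, 0.7, 0.9]
has_disease = [False, False, True, False, True, True]

# Loosening the diagnostic criterion (lower threshold) raises sensitivity
# but also raises the false positive rate, which is 1 - specificity.
for threshold in (0.8, 0.5, 0.2):
    sens, spec = confusion_rates(scores, has_disease, threshold)
    print(f"threshold={threshold}: sensitivity={sens:.2f}, "
          f"false positive rate={1 - spec:.2f}")
```

With this toy data, each lower threshold catches more of the true cases and also flags more healthy people, which is the tradeoff an ROC curve traces out.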

Figure source: https://www.medcalc.org/manual/roc-curves.php

As the figure shows, ROC curves are upward sloping: diagnosing more true positives typically means also increasing the rate of false positives. The curve goes through (0,0) and (100,100), because it is possible either to diagnose nobody as having the disease and get a 0% true positive rate and a 0% false positive rate, or to diagnose everyone as having the disease and get a 100% true positive rate and a 100% false positive rate. The further an ROC curve lies above the 45-degree line, the better the diagnostic test is, because for any level of false positives, you get a higher level of true positives.
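
The two endpoints and the comparison with the 45-degree line can be checked numerically. The sketch below uses made-up scores and hypothetical function names (`roc_points`, `area_under_curve`), with rates as fractions from 0 to 1 instead of percentages. It sweeps the threshold from strictest to most lenient and then summarizes how far the curve sits above the diagonal using the area under the curve. That summary is a common convention rather than something the figure itself reports; the 45-degree line has an area of 0.5.

```python
def roc_points(scores, has_disease):
    """Return (false positive rate, true positive rate) pairs,
    ordered from the strictest threshold to the most lenient one."""
    n_pos = sum(has_disease)
    n_neg = len(has_disease) - n_pos
    points = [(0.0, 0.0)]  # threshold above every score: nobody is diagnosed
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, d in zip(scores, has_disease) if d and s >= threshold)
        fp = sum(1 for s, d in zip(scores, has_disease) if not d and s >= threshold)
        points.append((fp / n_neg, tp / n_pos))
    return points


def area_under_curve(points):
    """Trapezoid-rule area under the piecewise-linear ROC curve."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical test scores (higher = more suggestive of disease) and true status.
scores = [0.1, 0.3, 0.4, 0.6, 0.7, 0.9]
has_disease = [False, False, True, False, True, True]

points = roc_points(scores, has_disease)
auc = area_under_curve(points)
print(points[0], points[-1])   # first and last points of the curve
print(round(auc, 3))           # compare with 0.5 for the 45-degree line
```

The first point corresponds to diagnosing nobody and the last to diagnosing everybody, so the curve necessarily runs from one corner to the other; what distinguishes a good test is how much it bows above the diagonal in between.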

Rafa Irizarry at the Simply Statistics blog makes a really interesting analogy between diagnosing disease and making scientific discoveries. Scientific findings can be true or false, and if we imagine that increasing the rate of important true discoveries also increases the rate of false positive discoveries, we can plot ROC curves for scientific disciplines. Irizarry imagines the ROC curves for biomedical science and physics (see the figure below). Fields of research differ both in the position and shape of the ROC curve (which you can think of as the production possibilities frontier for knowledge in that discipline) and in where they sit on that curve.

In Irizarry's opinion, physicists make fewer important discoveries per decade and also fewer false positives per decade than biomedical scientists. Given the slopes of the curves he has drawn, biomedical scientists could make fewer false positives, but at a cost of far fewer important discoveries.

Figure source: Rafa Irizarry

A field could move *along* its ROC curve by changing the field's standards regarding peer review and replication, changing norms regarding significance testing, etc. More critical review standards for publication would be represented by a shift down and to the left along the ROC curve, reducing the number of false findings that would be published, but also potentially reducing the number of true discoveries being published. A field could *shift* its ROC curve outward (good) or inward (bad) by changing the "discovery production technology" of the field.

The importance of discoveries is subjective, and we don't really know the number of "false positives" in any field of science. Some are never detected. But lately, evidence of fraudulent or otherwise irreplicable findings in political science and psychology points to potentially high false positive rates in the social sciences. A few days ago, Science published an article on "Estimating the Reproducibility of Psychological Science." From the abstract:

"We conducted replications of 100 experimental and correlational studies published in three psychology journals using high-powered designs and original materials when available. Replication effects were half the magnitude of original effects, representing a substantial decline. Ninety-seven percent of original studies had statistically significant results. Thirty-six percent of replications had statistically significant results; 47% of original effect sizes were in the 95% confidence interval of the replication effect size; 39% of effects were subjectively rated to have replicated the original result; and if no bias in original results is assumed, combining original and replication results left 68% with statistically significant effects."

As studies of this type hint that the social sciences may be far to the right along an ROC curve, it is interesting to try to visualize the shape of the curve. The physics ROC curve that Irizarry drew is very steep near the origin, so an attempt to reduce false positives further would, in his view, sharply reduce the number of important discoveries. Contrast that with his curve for biomedical science. He indicates that biomedical scientists are on a relatively flat portion of the curve, so reducing the false positive rate would not reduce the number of important discoveries by very much.

What does the shape of the economics ROC curve look like in comparison to those of other sciences, and where along the curve are we? What about macroeconomics in particular? Hypothetically, if we have one study that discovers that the fiscal multiplier is smaller than one, and another study that discovers that the fiscal multiplier is greater than one, then one study is an "important discovery" and one is a false positive. If these were our only two macroeconomic studies, we would be exactly on the 45-degree line at the point (100,100), with perfect sensitivity but zero specificity: the one true finding is published, but so is the one false finding.