Sunday, August 30, 2015

False Discoveries and the ROC Curves of Social Science

Diagnostic tests for diseases can suffer from two types of errors. A type I error is a false positive, and a type II error is a false negative. The sensitivity, or true positive rate, is the probability that a test result will be positive when the disease is actually present. The specificity, or true negative rate, is the probability that a test result will be negative when the disease is not actually present. Different choices of diagnostic criteria correspond to different combinations of sensitivity and specificity. A more sensitive diagnostic test could reduce false negatives, but might increase the false positive rate. Receiver operating characteristic (ROC) curves are a way to visually present this tradeoff by plotting the true positive rate (sensitivity) on the y-axis and the false positive rate (100% minus specificity) on the x-axis.
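
To make the definitions concrete, here is a minimal Python sketch that computes sensitivity, specificity, and the false positive rate from the four possible test outcomes. The function name and the patient counts are hypothetical, chosen only for illustration.

```python
def sensitivity_specificity(true_pos, false_neg, true_neg, false_pos):
    """Return (sensitivity, specificity) as fractions between 0 and 1."""
    sensitivity = true_pos / (true_pos + false_neg)  # true positive rate
    specificity = true_neg / (true_neg + false_pos)  # true negative rate
    return sensitivity, specificity


# Hypothetical counts: 100 people with the disease, 900 without.
sens, spec = sensitivity_specificity(
    true_pos=90, false_neg=10, true_neg=810, false_pos=90
)
false_positive_rate = 1 - spec  # the x-axis of an ROC curve

print(f"sensitivity = {sens:.0%}, specificity = {spec:.0%}, "
      f"false positive rate = {false_positive_rate:.0%}")
```

Each choice of diagnostic criteria yields one such pair of rates, which is one point on the ROC curve.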


As the figure shows, ROC curves are upward sloping: diagnosing more true positives typically means also increasing the rate of false positives. The curve goes through (0,0) and (100,100), because it is possible to either diagnose nobody as having the disease and get a 0% true positive rate and 0% false positive rate, or to diagnose everyone as having the disease and get a 100% true positive rate and 100% false positive rate. The further an ROC curve is above the 45 degree line, the better the diagnostic test is, because for any level of false positives, you get a higher level of true positives.
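
One way to see where the curve comes from is to sweep the diagnostic threshold from very strict to very lenient and record the two rates at each step. The sketch below does this in plain Python. The function `roc_points`, the test scores, and the disease labels are all made up for illustration.

```python
def roc_points(scores, has_disease):
    """Return (false positive rate, true positive rate) pairs, one per threshold."""
    n_pos = sum(has_disease)
    n_neg = len(has_disease) - n_pos
    # Start above every score (diagnose nobody), then lower the threshold
    # step by step until everyone is diagnosed.
    thresholds = [float("inf")] + sorted(set(scores), reverse=True)
    points = []
    for t in thresholds:
        tp = sum(1 for s, d in zip(scores, has_disease) if d and s >= t)
        fp = sum(1 for s, d in zip(scores, has_disease) if not d and s >= t)
        points.append((fp / n_neg, tp / n_pos))
    return points


# Hypothetical test scores; True marks patients who actually have the disease.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
has_disease = [True, True, False, True, False, True, False, False]

curve = roc_points(scores, has_disease)
for fpr, tpr in curve:
    print(f"false positive rate {fpr:.0%}, true positive rate {tpr:.0%}")
```

The first point printed is (0%, 0%) and the last is (100%, 100%), matching the two endpoints described above. Every point in between lies on or above the 45 degree line for this made-up data.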

Rafa Irizarry at the Simply Statistics blog makes a really interesting analogy between diagnosing disease and making scientific discoveries. Scientific findings can be true or false, and if we imagine that increasing the rate of important true discoveries also increases the rate of false positive discoveries, we can plot ROC curves for scientific disciplines. Irizarry imagines the ROC curves for biomedical science and physics (see the figure below). Fields of research differ both in the position and shape of the ROC curve (which you can think of as the production possibilities frontier for knowledge in that discipline) and in where they sit on the curve.

In Irizarry's opinion, physicists make fewer important discoveries per decade and also fewer false positives per decade than biomedical scientists. Given the slopes of the curves he has drawn, biomedical scientists could make fewer false positives, but at a cost of far fewer important discoveries.

Source: Rafa Irizarry

A particular scientific field could move along its ROC curve by changing the field's standards regarding peer review and replication, changing norms regarding significance testing, etc. More critical review standards for publication would be represented by a shift down and to the left along the ROC curve, reducing the number of false findings that would be published, but also potentially reducing the number of true discoveries being published. A field could shift its ROC curve outward (good) or inward (bad) by changing the "discovery production technology" of the field.

The importance of discoveries is subjective, and we don't really know the number of "false positives" in any field of science. Some are never detected. But lately, evidence of fraudulent or otherwise irreplicable findings in political science and psychology points to potentially high false positive rates in the social sciences. A few days ago, Science published an article on "Estimating the Reproducibility of Psychological Science." From the abstract:

We conducted replications of 100 experimental and correlational studies published in three psychology journals using high-powered designs and original materials when available. Replication effects were half the magnitude of original effects, representing a substantial decline. Ninety-seven percent of original studies had statistically significant results. Thirty-six percent of replications had statistically significant results; 47% of original effect sizes were in the 95% confidence interval of the replication effect size; 39% of effects were subjectively rated to have replicated the original result; and if no bias in original results is assumed, combining original and replication results left 68% with statistically significant effects.

As studies of this type hint that the social sciences may be far to the right along an ROC curve, it is interesting to try to visualize the shape of the curve. The physics ROC curve that Irizarry drew is very steep near the origin, so an attempt to reduce false positives further would, in his view, sharply reduce the number of important discoveries. Contrast that with his curve for biomedical science. He indicates that biomedical scientists are on a relatively flat portion of the curve, so reducing the false positive rate would not reduce the number of important discoveries by very much.

What does the shape of the economics ROC curve look like in comparison to those of other sciences, and where along the curve are we? What about macroeconomics in particular? Hypothetically, if we have one study that discovers that the fiscal multiplier is smaller than one, and another study that discovers that the fiscal multiplier is greater than one, then one study is an "important discovery" and one is a false positive. If these were our only two macroeconomic studies, we would be exactly on the 45 degree line with perfect sensitivity but zero specificity.
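
The arithmetic behind that hypothetical can be spelled out in a few lines of Python. This is only bookkeeping for the thought experiment: the labels "study A" and "study B" are placeholders, and which of the two is marked as the true finding is an arbitrary assumption that does not affect the result.

```python
# Each entry: (label, claim_is_true, published_as_discovery).
# Exactly one of the two contradictory claims can be true; marking
# "study A" as the true one is arbitrary.
studies = [
    ("study A", True, True),
    ("study B", False, True),
]

true_claims = [s for s in studies if s[1]]
false_claims = [s for s in studies if not s[1]]

# Share of true claims that were "discovered", and share of false claims
# that were also reported as discoveries.
tpr = sum(s[2] for s in true_claims) / len(true_claims)
fpr = sum(s[2] for s in false_claims) / len(false_claims)
specificity = 1 - fpr

print(f"sensitivity = {tpr:.0%}, false positive rate = {fpr:.0%}, "
      f"specificity = {specificity:.0%}")
```

Both rates come out at 100%, which is the upper-right corner of the ROC plot, on the 45 degree line.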

Thursday, August 6, 2015

Macroeconomics Research at Liberal Arts Colleges

I spent the last two days at the 11th annual Workshop on Macroeconomics Research at Liberal Arts Colleges at Union College. The workshop reflects the growing emphasis that liberal arts colleges place on faculty research. There were four two-hour sessions of research presentations--international, banking, information and expectations, and theory--in addition to breakout sessions on pedagogy. I presented my research in the information and expectations session.

I definitely recommend this workshop to other liberal arts macro professors. The end of summer timing was great. I got to think about how to prioritize my research goals before the semester starts and to hear advice on teaching and course planning from a lot of really passionate teachers. It was very encouraging to witness how many liberal arts college professors at all stages of their careers have maintained very active research agendas while also continually improving in their roles as teachers and advisors.

After dinner on the first day of the workshop, there was a panel discussion about publishing with undergraduates. I also attended a pedagogy session on advising undergraduate research. Many of the liberal arts colleges represented at the workshop have some form of a senior thesis requirement. A big part of the discussion was how to balance the emphasis on "product vs. process" for undergraduate research. In other words, how active a role should a faculty member take in trying to ensure a high-quality final product of a senior thesis project versus ensuring that different learning goals are met? What should those learning goals be? Some possibilities include helping students decide if they want to go to grad school and teaching independence, writing skills, econometric techniques, and the ability to form an economic argument. And relatedly, how should grades or honors designations reflect the final product and the learning goals that are emphasized?

We also discussed the relative merits of helping students publish their research, either in an undergraduate journal or a professional journal. There was considerable uncertainty about how very low-ranked publications with undergraduate coauthors affect an assistant professor's tenure case, and a general desire for more explicit guidelines about whether such publications are considered a valuable contribution.

These discussions of research by or with undergraduates left me really curious to hear about others' experiences doing or supervising undergraduate research. I'd be very happy to feature some examples of research with or by undergraduates as guest posts. Send me an email if you're interested.

At least two other conference participants have blogs, and they are definitely worth checking out. Joseph Joyce of Wellesley blogs about international finance at "Capital Ebbs and Flows." Bill Craighead of Wesleyan blogs at "Twenty-Cent Paradigms." Both have recent thoughtful commentary on Greece.