This was a joint guest lecture by Andrew Mao and Amit Sharma, giving an overview of causal inference and randomized experiments.

Most of what we’ve discussed in this class has focused on observational data—data obtained without direct intervention from or manipulation by those studying it. We can learn a lot from observational data and use it to find interesting relationships, build predictive models, or even generate hypotheses, but it has its limits. This is often summarized by catchy phrases such as “correlation is not causation” or “no causation without manipulation”.

Amit opened the discussion by comparing two scenarios: (a) forecasting what will happen in a static world, and (b) predicting what happens when you change something in the world. For the former you might do well by simply recognizing correlations (e.g., seeing my neighbor with an umbrella might predict rain), but the latter requires a more robust model of the world (e.g., handing my neighbor an umbrella is unlikely to cause rain). We discussed the idea of estimating the “effects of causes”, touching on both the potential outcomes and causal graphical model frameworks.

Using the effect of hospitalization on health as an example, we talked about confounding factors that complicate causal inference. For instance, my health today might affect both whether I go to the hospital and my health tomorrow, making it difficult to isolate the effect of hospitalization from everything else. We saw this made precise in what Varian calls the “basic identity of causal inference”: observational estimates conflate the average treatment effect with selection bias, where selection bias measures the baseline difference between those who opted into treatment and those who didn’t. Amit also discussed Simpson’s paradox, where selection bias is so large that it leads to a directionally incorrect estimate of a causal effect: what looks like a positive relationship before adjusting for possible confounds can become a negative one once all available information is accounted for.
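In symbols, using standard potential-outcomes notation (our notation here, not necessarily Varian’s exact presentation): let $Y^1$ and $Y^0$ denote a person’s health with and without hospitalization, and let $T$ indicate whether they actually went. The observed difference in average outcomes then decomposes as

$$
\underbrace{E[Y^1 \mid T=1] - E[Y^0 \mid T=0]}_{\text{observed difference}} = \underbrace{E[Y^1 - Y^0 \mid T=1]}_{\text{effect of treatment on the treated}} + \underbrace{E[Y^0 \mid T=1] - E[Y^0 \mid T=0]}_{\text{selection bias}}.
$$

Randomizing $T$ makes treatment independent of $Y^0$ and drives the selection-bias term to zero, which is exactly the point of the experiments discussed below.

To make the Simpson’s paradox reversal concrete, here is a short Python sketch using the classic kidney-stone numbers (Charig et al., 1986); this example wasn’t part of the lecture, but it shows the same phenomenon:

```python
import pandas as pd

# Classic kidney-stone data (Charig et al., 1986), a textbook instance
# of Simpson's paradox: counts of recoveries under two treatments,
# stratified by stone severity.
df = pd.DataFrame(
    [
        ("small", "A",  81,  87),
        ("small", "B", 234, 270),
        ("large", "A", 192, 263),
        ("large", "B",  55,  80),
    ],
    columns=["stone_size", "treatment", "recovered", "total"],
)

# Within each stratum, treatment A has the higher recovery rate...
df["rate"] = df["recovered"] / df["total"]
print(df)  # A beats B for both small (93% vs 87%) and large (73% vs 69%) stones

# ...but aggregated over strata the ranking flips, because the harder
# large-stone cases were disproportionately assigned to treatment A.
overall = df.groupby("treatment")[["recovered", "total"]].sum()
print(overall["recovered"] / overall["total"])  # A: 78%, B: 83%
```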

Andrew then introduced counterfactuals and randomized experiments. The question you’d really like to answer is this: if you cloned each person and sent one copy of that person to the hospital, but not the other, what would the resulting difference in health be? Short of being able to do this, we could ask a slightly different question: if we had two groups of people who were nearly identical in every way and we sent one group to the hospital, but not the other, how would the health of the two groups differ? This is precisely the idea behind randomized experiments, such as clinical trials in medicine and A/B testing for online platforms. Randomization is key here, as it provides a way of creating two groups that are as similar as possible prior to the treatment (e.g., hospitalization) being administered: if people are randomly assigned to groups, then there shouldn’t be any systematic difference between the two groups, eliminating selection bias. Since the only difference between the groups is that one gets treated and the other doesn’t, we can ascribe differences in the outcome to the treatment.
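To see how randomization eliminates selection bias, here is a minimal simulation sketch of the hospitalization example (all numbers invented for illustration): sicker people select into the hospital, the hospital truly improves health by +1, and we compare the naive observational contrast with a randomized one.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
true_effect = 1.0                      # true benefit of hospitalization
baseline = rng.normal(size=n)          # health today (the confounder)

# Observational world: sicker people are more likely to go to the hospital.
treated_obs = (baseline + rng.normal(size=n)) < 0
health_obs = baseline + true_effect * treated_obs + rng.normal(size=n)
naive = health_obs[treated_obs].mean() - health_obs[~treated_obs].mean()

# Randomized world: a coin flip decides who goes, breaking the link
# between baseline health and treatment.
treated_rct = rng.random(n) < 0.5
health_rct = baseline + true_effect * treated_rct + rng.normal(size=n)
rct = health_rct[treated_rct].mean() - health_rct[~treated_rct].mean()

print(f"naive observational estimate: {naive:+.2f}")  # biased, even negative
print(f"randomized estimate:          {rct:+.2f}")    # close to the true +1.00
```

With confounded assignment the naive estimate actually flips sign (hospitals appear to make people sicker), echoing the Simpson’s paradox reversal above; under random assignment the estimate recovers the true effect.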

While randomized experiments are the “gold standard” for causal inference, Andrew discussed some caveats and limitations in traditional approaches to experimentation in the social sciences, covering issues of both “internal” and “external” validity. The first asks whether the experiment was properly designed to isolate the intended effect, whereas the second asks whether we should expect its results to generalize to other scenarios. He proposed large-scale online experiments as a new paradigm that addresses some of these issues, and demonstrated the power of this approach with an in-class replication of his recent experiment showing how people learn to cooperate in the long run even when it’s not in their interest to do so in the short term.

Amit closed the lecture by introducing natural experiments, which exploit naturally occurring variation to tease out causal effects from observational data. More on this next lecture.
