The PhD Student Meeting Protocol*


Over time, I have learned that the following agenda works well with PhD students. I warn them to expect these five questions, always in the same order. If they can’t answer the first question, there is no point in thinking about the second. If they can’t answer the second, there is no point in thinking about the third. And so on...

1. What is your research question?

This sounds straightforward, but there is a trap. Many students arrive and state, “My research question is, why do we observe X?” Now, this may be a genuine puzzle, and perhaps a good topic for a theory paper. But it is a question about the causes of an effect. I encourage students (at least initially) to focus on the effects of causes, which requires a question of the form, “What is the effect of X on Y?”

2. Why should I care?

Not every question is important. I do not care about a topic just because it is novel – sometimes that gap in the literature exists for good reason! I also don’t care about a question just because there exists a great instrument – note that Question 2 comes before Question 4.

3. What is the simplest possible research design, and why is it wrong?

Given a research question, we can start to think about practicalities. Where do the data come from? What is the unit of analysis? Are the outcome and explanatory variables good measures of the relevant theoretical constructs? At this stage, the key question is: If we ran a simple OLS regression, what would be the single biggest threat (omitted variables, selection, or simultaneity) to a causal interpretation of the result?
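
To make the "why is it wrong?" part concrete, here is a minimal simulation sketch of one such threat, omitted variable bias. Everything in it is an assumption made for illustration: the data are synthetic, the confounder `z`, the coefficients 0.8 and 2.0, and the helper names `ols_slope` and `simulate` are hypothetical and not taken from any real study.

```python
import numpy as np


def ols_slope(x, y):
    """Slope coefficient from an OLS regression of y on x with an intercept."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1]


def simulate(n=50_000, true_effect=1.0, seed=0):
    """Return (naive, controlled) estimates of the effect of x on y."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)            # confounder (assumed unobserved in practice)
    x = 0.8 * z + rng.normal(size=n)  # x is partly driven by the confounder
    y = true_effect * x + 2.0 * z + rng.normal(size=n)

    # The "simplest possible design": regress y on x and ignore z.
    naive = ols_slope(x, y)

    # Benchmark that is only possible here because the simulation observes z.
    X_full = np.column_stack([np.ones(n), x, z])
    beta_full, *_ = np.linalg.lstsq(X_full, y, rcond=None)
    return naive, beta_full[1]


if __name__ == "__main__":
    naive, controlled = simulate()
    print(f"naive OLS slope:      {naive:.3f}")
    print(f"controlling for z:    {controlled:.3f}")
```

Under these assumed parameters the naive slope converges to roughly 1 + 2.0 × 0.8 / 1.64, nearly double the true effect of 1, while the regression that includes `z` recovers it. In a real project `z` is not observed, which is why the conversation moves on to identification.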

4. What is your identification strategy?

Volumes have been written on this topic, so I can’t add much here. At this stage, we try to do some creative thinking, and I also encourage students to pay attention to empirical etiquette.

5. Do you have a Table 2?

It is easy to become so focused on estimating “the effect of X on Y” that we lose sight of the broader agenda. That is where Table 2 comes in. If Table 1 illustrates a paper’s main result, Table 2 provides a deeper understanding of the phenomenon. It might illustrate heterogeneous impacts across outcomes or sub-samples to highlight a particular mechanism, or use a placebo to show that the main result doesn’t appear where it shouldn’t. There is no formula for creating a good Table 2 – but asking the question at this stage provides a backdoor through which we can return to the “why?” questions that I banned in Step 1.
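
As a rough illustration of what the rows of a Table 2 might compute, the sketch below uses synthetic data with a randomly assigned treatment. The sub-samples `group_a` and `group_b`, the effect size of 2.0, the placebo outcome, and the function names are all hypothetical choices made for the example, not a recipe.

```python
import numpy as np


def diff_in_means(outcome, treated):
    """Treated-minus-control difference in mean outcomes."""
    return outcome[treated].mean() - outcome[~treated].mean()


def table2_sketch(n=40_000, seed=1):
    """Main effect, sub-sample effects, and a placebo check on simulated data."""
    rng = np.random.default_rng(seed)
    treated = rng.random(n) < 0.5   # randomly assigned treatment (assumption)
    group_a = rng.random(n) < 0.5   # sub-sample where the mechanism operates

    # Assumed data-generating process: the effect exists only in group A.
    effect = np.where(group_a, 2.0, 0.0)
    y = effect * treated + rng.normal(size=n)

    # Placebo outcome: by construction, unrelated to the treatment.
    placebo = rng.normal(size=n)

    return {
        "main": diff_in_means(y, treated),
        "group_a": diff_in_means(y[group_a], treated[group_a]),
        "group_b": diff_in_means(y[~group_a], treated[~group_a]),
        "placebo": diff_in_means(placebo, treated),
    }


if __name__ == "__main__":
    for row, estimate in table2_sketch().items():
        print(f"{row:>8}: {estimate:6.3f}")
```

In this simulated setting, the pooled estimate averages a large effect in one sub-sample with no effect in the other, and the placebo row stays near zero. That pattern is the kind of evidence that lets a Table 2 speak to the mechanism rather than only the headline number.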

* Many of these ideas were developed in conversations with Ajay Agrawal and Avi Goldfarb. Ajay reminded me that this agenda worked, which inspired me to write it up.