I am an econometrician with research interests in causal inference and weak identification. I primarily focus on developing new methods for causal inference in the more realistic setting of treatment effect heterogeneity. I have also contributed research on weak identification with many instruments. Currently I am a Lecturer in the Department of Economics at University College London and an Untenured Associate Professor (on leave) at CEMFI. Previously I was a Postdoctoral Research Fellow at UC Berkeley. I received my Ph.D. in Economics and Statistics from MIT in 2021, and my B.A. in Economics and Mathematics from Wellesley College in 2014.
I am also on Google Scholar. For the NBER Summer Institute Methods Lectures 2023 “Linear Panel Event Studies” (with Jesse M. Shapiro), the notes and recordings are available here.
When there are multiple outcome series of interest, Synthetic Control analyses typically proceed by estimating separate weights for each outcome. In this paper, we instead propose estimating a common set of weights across outcomes, by balancing either a vector of all outcomes or an index or average of them. Under a low-rank factor model, we show that these approaches lead to lower bias bounds than separate weights, and that averaging leads to further gains when the number of outcomes grows. We illustrate this via simulation and in a re-analysis of the impact of the Flint water crisis on educational outcomes.
Empirical research typically involves a robustness-efficiency tradeoff. A researcher seeking to estimate a scalar parameter can invoke strong assumptions to motivate a restricted estimator that is precise but may be heavily biased, or they can relax some of these assumptions to motivate a more robust, but variable, unrestricted estimator. When a bound on the bias of the restricted estimator is available, it is optimal to shrink the unrestricted estimator towards the restricted estimator. For settings where a bound on the bias of the restricted estimator is unknown, we propose adaptive estimators that minimize the percentage increase in worst-case risk relative to an oracle that knows the bound. We show that adaptive estimators solve a weighted convex minimax problem and provide lookup tables facilitating their rapid computation. Revisiting some well-known empirical studies where questions of model specification arise, we examine the advantages of adapting to, rather than testing for, misspecification.
Empirical Welfare Maximization (EWM) is a framework that can be used to select welfare program eligibility policies based on data. This paper extends EWM by allowing for uncertainty in estimating the budget needed to implement the selected policy, in addition to its welfare. I show that, because of this additional estimation error, no rule achieves the highest welfare possible while satisfying a budget constraint uniformly over a wide range of data-generating processes (DGPs). This differs from the setting without a budget constraint, where uniformity is achievable. I propose an alternative trade-off rule and illustrate it with Medicaid expansion, a setting with imperfect take-up and varying program costs.
The synthetic control method (SCM) is a popular approach for estimating the impact of a treatment on a single unit with panel data. Two challenges arise with higher-frequency data (e.g., monthly versus yearly): (i) achieving excellent pretreatment fit is typically more challenging, and (ii) overfitting to noise is more likely. Aggregating data over time can mitigate these problems but can also destroy important signal. In this paper, we bound the bias for SCM with disaggregated and aggregated outcomes and give conditions under which aggregating tightens the bounds. We then propose finding weights that balance both disaggregated and aggregated series.
Linear instrumental variable regressions are widely used to estimate causal effects. Many instruments arise from the use of “technical” instruments and, more recently, from the empirical strategy of “judge design”. This paper surveys and summarizes ideas from recent literature on estimation and statistical inference with many instruments. We discuss how to assess the strength of the instruments and how to conduct weak-identification-robust inference under heteroscedasticity. We establish new results for a jack-knifed version of the Lagrange Multiplier (LM) test statistic. In practice, many exogenous regressors are often included to ensure the validity of the instruments. We extend the weak-identification-robust tests to settings with both many exogenous regressors and many instruments. We propose a test that properly partials out many exogenous regressors while preserving the re-centering property of the jack-knife. The proposed tests have uniformly correct size and good power properties.
We propose a semi-parametric test to evaluate (a) whether different instruments induce subpopulations of compliers with the same observable characteristics, on average; and (b) whether compliers have observable characteristics that are the same as the full population, treated subpopulation, or untreated subpopulation, on average. The test is a flexible robustness check for the external validity of instruments. To justify the test, we characterise the doubly robust moment for Abadie’s class of complier parameters, and we analyse a machine learning update to weighting that we call the automatic kappa weight. We use the test to reinterpret Angrist and Evans' different local average treatment effect estimates obtained using different instrumental variables.
eventstudyinteract is a Stata module that implements the interaction-weighted estimator for an event study. Sun and Abraham (2021) prove that this estimator is consistent for the average dynamic effect at a given relative time even under heterogeneous treatment effects. eventstudyweights is a Stata module that estimates the weights underlying two-way fixed effects regressions, based on Sun and Abraham (2021).
twostepweakiv is a Stata module that implements the two-step weak-instrument-robust confidence sets based on Andrews (2018) and the refined projection method for subvector inference based on Chaudhuri and Zivot (2011) for linear instrumental-variable (IV) models. Development versions and replication code for the article are available on GitHub.