Heckman Treatment Correction

The Heckman treatment correction addresses endogeneity arising from treatment selection — situations where a binary predictor (treatment vs. no treatment) is not randomly assigned but driven by unobserved factors that also affect the outcome.

When to Use It

Use the Heckman treatment correction when:

  • The focal predictor is binary (e.g., exposed vs. not exposed, adopted vs. not adopted)
  • Assignment to treatment is non-random and likely driven by unobserved factors
  • Those same unobserved factors plausibly affect the outcome

This makes it the instrument-based approach of choice for treatment selection, the specific form of endogeneity in which unobserved characteristics determine who receives the treatment.

How It Works

Step 1: Model the Treatment Decision

Estimate a probit regression that predicts the probability of receiving the treatment (scoring “1” on the binary predictor). This first-stage model should include:

  • All exogenous controls and fixed effects from the outcome equation
  • At least one strong and valid instrument — a variable that predicts treatment assignment but has no direct effect on the outcome
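The first stage can be sketched in plain Python as follows. This is a minimal illustration, not a reference implementation: it fits the probit by Fisher scoring, and the names `fit_probit`, `solve`, `Z`, and `d` are assumptions made for the example rather than part of any package. Each row of `Z` is assumed to hold a constant, the exogenous controls, and the instrument.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def solve(A, b):
    """Solve the linear system A x = b by Gauss-Jordan elimination."""
    k = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix [A | b]
    for col in range(k):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, k), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(k):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][k] / M[i][i] for i in range(k)]


def fit_probit(Z, d, n_iter=25, tol=1e-8):
    """Fit a probit of the treatment indicator d on the rows of Z.

    Z -- first-stage rows, assumed to be [1.0, controls..., instrument]
    d -- treatment indicators (0 or 1)
    Returns the estimated coefficient vector.
    """
    k = len(Z[0])
    gamma = [0.0] * k
    for _ in range(n_iter):
        score = [0.0] * k
        info = [[0.0] * k for _ in range(k)]
        for z, di in zip(Z, d):
            index = sum(g * zj for g, zj in zip(gamma, z))
            pdf = STD_NORMAL.pdf(index)
            # Clamp the probability away from 0 and 1 to avoid division by zero.
            cdf = min(max(STD_NORMAL.cdf(index), 1e-12), 1 - 1e-12)
            resid = (di - cdf) * pdf / (cdf * (1 - cdf))  # generalized residual
            weight = pdf * pdf / (cdf * (1 - cdf))        # expected information
            for i in range(k):
                score[i] += resid * z[i]
                for j in range(k):
                    info[i][j] += weight * z[i] * z[j]
        step = solve(info, score)
        gamma = [g + s for g, s in zip(gamma, step)]
        if max(abs(s) for s in step) < tol:
            break
    return gamma
```

The fitted linear index for each observation, `sum(g * zj for g, zj in zip(gamma, z))`, is the input needed for the correction term. For real analyses an established statistics package is the safer choice; the sketch only makes the mechanics visible.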

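From this probit model, compute the Inverse Mills Ratio (IMR), a correction term that captures the bias introduced by non-random treatment assignment. In the treatment setting the term is computed for every observation: it is positive for treated units and takes a mirrored, negative form for untreated units.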
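As a rough sketch, the correction term can be computed from the fitted probit index with Python's standard library. The function name `correction_term` and its arguments are illustrative assumptions; the index is taken as given from the first stage.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def correction_term(index: float, treated: int) -> float:
    """Selection-correction term for a single observation.

    index   -- fitted probit linear predictor for this observation
    treated -- 1 if the unit received the treatment, 0 otherwise
    """
    pdf = _STD_NORMAL.pdf(index)
    cdf = _STD_NORMAL.cdf(index)
    if treated == 1:
        return pdf / cdf          # inverse Mills ratio for treated units
    return -pdf / (1.0 - cdf)     # mirrored (negative) term for untreated units
```

For example, `correction_term(0.0, 1)` is about 0.798 and `correction_term(0.0, 0)` is about -0.798. For observations with extreme index values the denominator approaches zero, so a production implementation would need to guard against that.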

Step 2: Estimate the Outcome Equation

Include the Inverse Mills Ratio from Step 1 as an additional control variable in the outcome regression. The instrument is excluded from this stage. The IMR absorbs the correlation between the treatment decision and the error term, correcting for selection bias.

If the coefficient on the IMR is statistically significant, this indicates that treatment selection is indeed endogenous and that the correction is doing meaningful work.
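Written out, the Step 2 regression looks roughly as follows. The notation is an assumption introduced here for illustration: y_i is the outcome, x_i the controls, D_i the treatment indicator, and \hat{\lambda}_i the correction term from Step 1. The interpretation of the IMR coefficient relies on the standard assumption that the errors of the treatment and outcome equations are jointly normal with correlation \rho.

```latex
% Step 2 outcome equation; \hat{\lambda}_i is the correction term from Step 1
y_i = x_i'\beta + \delta D_i + \beta_\lambda \hat{\lambda}_i + \nu_i,
\qquad \beta_\lambda = \rho\,\sigma_\varepsilon
```

Here \delta is the corrected treatment effect and \sigma_\varepsilon is the standard deviation of the outcome error. Because \sigma_\varepsilon is positive, a test of \beta_\lambda = 0 amounts to a test of \rho = 0, that is, of no correlation between the unobserved drivers of treatment and the unobserved drivers of the outcome.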

Standard errors must be corrected, for example by bootstrapping both steps together, because the second stage uses a generated regressor from the first stage.
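One way to obtain such standard errors is a pairs bootstrap, sketched below under stated assumptions. The function `bootstrap_se` and the `estimator` argument are hypothetical names: `estimator` stands for any callable that re-runs both the probit and the outcome regression on a resampled data set and returns the treatment coefficient.

```python
import random
from statistics import stdev


def bootstrap_se(rows, estimator, n_boot=500, seed=0):
    """Bootstrap standard error of a statistic.

    rows      -- list of observations (one entry per unit)
    estimator -- callable taking a list of rows and returning a float;
                 for the Heckman correction it should refit BOTH stages
    n_boot    -- number of bootstrap replicates
    seed      -- seed for reproducible resampling
    """
    rng = random.Random(seed)
    n = len(rows)
    draws = []
    for _ in range(n_boot):
        # Resample whole observations with replacement.
        sample = [rows[rng.randrange(n)] for _ in range(n)]
        draws.append(estimator(sample))
    # The spread of the replicate estimates approximates the standard error.
    return stdev(draws)
```

The important design choice is that the first stage is re-estimated inside every replicate. Resampling only the second stage would treat the IMR as if it were observed data and would miss the extra uncertainty it carries.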

Requirements

  • A binary endogenous predictor (treatment indicator)
  • At least one instrumental variable that is both strong (significantly predicts treatment assignment) and valid (affects the outcome only through the treatment)
  • The first stage must include the same controls and fixed effects as the outcome equation
  • Bootstrapped standard errors to account for the generated Inverse Mills Ratio

Implementing the Heckman Treatment Correction

In Stata, the etregress command implements the Heckman treatment correction (https://www.stata.com/manuals/causaletregress.pdf).

In R, the sampleSelection package implements the Heckman treatment correction (https://cran.r-project.org/web/packages/sampleSelection/index.html).

References

  • Heckman, James J. (1979), “Sample Selection Bias as a Specification Error,” Econometrica: Journal of the Econometric Society, 47, 153-161.