Implementing the Heckman Selection Correction Approach
The Heckman Selection Correction addresses sample selection under the assumption that a strong and valid instrument, the exclusion restriction, removes the endogenous part of the predictor (Heckman 1979; Wolfolds and Siegel 2019). Additional observations of the predictor outside the selected sample are needed to model the sample selection. A first-stage probit regression models the probability of being in the sample, includes at least one strong and valid instrument, and is used to compute the inverse Mills ratio (IMR). In the second-stage regression, the IMR from the first stage is added as a regressor and the instrument is excluded from the model. Because the IMR is an estimated regressor, the standard errors need to be bootstrapped.
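The two stages described above can be written out as follows. This is a sketch in the standard textbook notation, which is an assumption and not taken from the cited papers: Z_i collects the instrument and the controls, X_i collects the predictor and the controls, and the two error terms are assumed to be bivariate normal with correlation \rho.

```latex
% Stage 1 (selection equation, probit on the full sample)
S_i^{*} = Z_i \gamma + u_i, \qquad S_i = \mathbf{1}[S_i^{*} > 0], \qquad u_i \sim N(0, 1)

% Outcome equation, observed only when S_i = 1
Y_i = X_i \beta + \varepsilon_i, \qquad \operatorname{Var}(\varepsilon_i) = \sigma^{2}, \qquad \operatorname{Corr}(u_i, \varepsilon_i) = \rho

% Inverse Mills ratio computed from the first-stage estimates
\hat{\lambda}_i = \frac{\phi(Z_i \hat{\gamma})}{\Phi(Z_i \hat{\gamma})}

% Stage 2 (OLS on the selected sample, with the IMR as an added regressor)
E[Y_i \mid X_i, Z_i, S_i = 1] = X_i \beta + \rho \sigma \, \lambda(Z_i \gamma)
```

Here \phi and \Phi are the standard normal density and distribution function. The coefficient on the IMR estimates \rho\sigma, so a coefficient near zero suggests little selection on unobservables under these assumptions.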
The heckman command can be used to implement the Heckman Selection Correction with the following code in Stata (https://www.stata.com/manuals/rheckman.pdf).
//estimate the outcome equation jointly with the sample-selection equation given in select()
heckman Outcome Predictor Controls, select(Selection = Instrument Controls)
The sampleSelection package can be used to implement the Heckman Selection Correction with the following code in R (https://cran.r-project.org/web/packages/sampleSelection/index.html).
#load the sampleSelection package
library(sampleSelection)
#selection and outcome equations, estimated with the two-step method
model_HSC <- selection(
  selection = Selection ~ Instrument + Controls,
  outcome = Outcome ~ Predictor + Controls,
  data = Dataset,
  method = "2step"
)
#obtain the estimates
summary(model_HSC)
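The bootstrapped standard errors mentioned above could be obtained along the following lines. This is a minimal sketch, not a definitive implementation: the simulated Dataset, the coefficient values, the seed, and the number of replications B are all illustrative assumptions, and the variable names simply mirror the placeholders used above.

```r
#load the sampleSelection package
library(sampleSelection)

#simulate an illustrative dataset (assumption: stands in for the real Dataset)
set.seed(42)
n <- 1000
Instrument <- rnorm(n)
Controls <- rnorm(n)
Predictor <- rnorm(n)
u <- rnorm(n)                       #selection-equation error
e <- 0.5 * u + rnorm(n)             #outcome error, correlated with u
Selection <- as.integer(0.5 + Instrument + 0.5 * Controls + u > 0)
Outcome <- ifelse(Selection == 1, 1 + 2 * Predictor + 0.5 * Controls + e, 0)
Dataset <- data.frame(Selection, Instrument, Controls, Predictor, Outcome)

#number of bootstrap replications (illustrative)
B <- 200

#re-estimate both stages on each resample so the first-stage uncertainty is carried through
boot_coefs <- replicate(B, {
  idx <- sample(nrow(Dataset), replace = TRUE)   #resample rows with replacement
  fit <- selection(
    selection = Selection ~ Instrument + Controls,
    outcome = Outcome ~ Predictor + Controls,
    data = Dataset[idx, ],
    method = "2step"
  )
  coef(fit)                                      #all estimated parameters
})

#bootstrap standard errors: standard deviation of each parameter across replications
boot_se <- apply(boot_coefs, 1, sd)
boot_se
```

Rows are resampled from the full dataset, selected and non-selected observations alike, so that each replication re-estimates the first stage as well as the second.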
References
Heckman, James J. (1979), “Sample Selection Bias as a Specification Error,” Econometrica: Journal of the Econometric Society, 47(1), 153-161.
Wolfolds, Sarah E., and Jordan Siegel (2019), “Misaccounting for Endogeneity: The Peril of Relying on the Heckman Two‐Step Method without a Valid Instrument,” Strategic Management Journal, 40(3), 432-462.