Implementing the Control Function Approach
The control function approach addresses endogeneity arising from omitted variables, simultaneity, and measurement error. It rests on the assumption that a strong and valid instrument can be used to isolate the endogenous part of the predictor (Papies et al. 2017; Petrin and Train 2010). In the first stage, the predictor is regressed on the exogenous variables, including at least one strong and valid instrument, and the residuals are saved. In the second stage, these first-stage residuals are added to the outcome model as an additional regressor (the control function), and the instrument is excluded. Because the residuals are estimated rather than observed, the conventional second-stage standard errors are incorrect and need to be corrected, for example by bootstrapping (Karaca-Mandic and Train 2003; Petrin and Train 2003).
The cfregress command can be used to implement the control function approach with the following code in Stata (https://www.stata.com/manuals/rcfregress.pdf).
//estimate control function regression; Predictor is endogenous and instrumented by Instrument
cfregress Outcome Controls (Predictor = Instrument)
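The two stages described above can also be estimated by hand, which makes the role of the first-stage residuals explicit. The following is a minimal sketch rather than a definitive implementation: it reuses the placeholder names Outcome, Predictor, Instrument, and Controls from the cfregress call, cf_manual is a hypothetical program name, and the number of replications and the seed are arbitrary choices. Both stages are wrapped in one program so that bootstrap re-estimates the first stage in every resample.

```stata
// Sketch only: manual control function estimation with bootstrapped standard errors.
// cf_manual is a hypothetical program name; variable names are placeholders.
capture program drop cf_manual
program define cf_manual, rclass
    // temporary variable for the first-stage residuals (dropped when the program ends)
    tempvar vhat
    // first stage: regress the endogenous predictor on the instrument and the controls
    quietly regress Predictor Instrument Controls
    quietly predict double `vhat', residuals
    // second stage: add the first-stage residuals and exclude the instrument
    quietly regress Outcome Predictor Controls `vhat'
    // return the coefficients of interest so that bootstrap can collect them
    return scalar b_predictor = _b[Predictor]
    return scalar b_vhat = _b[`vhat']
end

// resample the data and rerun both stages in every replication
bootstrap b_predictor = r(b_predictor) b_vhat = r(b_vhat), reps(500) seed(12345): cf_manual
```

The bootstrapped standard error of b_predictor accounts for the estimation error in the first-stage residuals. The coefficient b_vhat on the residuals is commonly read as an indication of whether the predictor is endogenous.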
The functionality of the cfregress command is not directly available in R.
References
Karaca-Mandic, Pinar, and Kenneth Train (2003), “Standard Error Correction in Two-Stage Estimation with Nested Samples,” The Econometrics Journal, 6(2), 401–407.
Papies, Dominik, Peter Ebbes, and Harald van Heerde (2017), “Addressing Endogeneity in Marketing Models,” Advanced Methods for Modeling Markets, Cham: Springer, 581–627.
Petrin, Amil, and Kenneth Train (2003), “Omitted Product Attributes in Discrete Choice Models,” National Bureau of Economic Research Working Paper No. W9452, Cambridge, MA.
Petrin, Amil, and Kenneth Train (2010), “A Control Function Approach to Endogeneity in Consumer Choice Models,” Journal of Marketing Research, 47(1), 3–13.