{"id":155,"date":"2024-08-05T08:41:40","date_gmt":"2024-08-05T08:41:40","guid":{"rendered":"https:\/\/www.endogeneity.net\/?page_id=155"},"modified":"2026-04-04T14:36:45","modified_gmt":"2026-04-04T14:36:45","slug":"two-stage-least-squares-approach","status":"publish","type":"page","link":"https:\/\/www.endogeneity.net\/?page_id=155","title":{"rendered":"Two-Stage Least Squares Approach"},"content":{"rendered":"<div class=\"fusion-fullwidth fullwidth-box fusion-builder-row-1 fusion-flex-container has-pattern-background has-mask-background nonhundred-percent-fullwidth non-hundred-percent-height-scrolling\" style=\"--link_hover_color: var(--awb-color5);--link_color: var(--awb-color5);--awb-background-blend-mode:multiply;--awb-border-color:var(--awb-color1);--awb-border-radius-top-left:0px;--awb-border-radius-top-right:0px;--awb-border-radius-bottom-right:0px;--awb-border-radius-bottom-left:0px;--awb-padding-top:50.156000000000006px;--awb-padding-bottom:0px;--awb-padding-top-small:70px;--awb-padding-right-small:40px;--awb-padding-bottom-small:0px;--awb-padding-left-small:40px;--awb-margin-bottom-medium:0px;--awb-margin-bottom-small:60px;--awb-background-color:#ffffff;--awb-flex-wrap:wrap;\" ><div class=\"fusion-builder-row fusion-row fusion-flex-align-items-center fusion-flex-content-wrap\" style=\"max-width:1248px;margin-left: calc(-4% \/ 2 );margin-right: calc(-4% \/ 2 );\"><div class=\"fusion-layout-column fusion_builder_column fusion-builder-column-0 fusion_builder_column_1_1 1_1 fusion-flex-column\" style=\"--awb-padding-bottom-medium:0px;--awb-bg-size:cover;--awb-width-large:100%;--awb-margin-top-large:0px;--awb-spacing-right-large:1.92%;--awb-margin-bottom-large:85px;--awb-spacing-left-large:1.92%;--awb-width-medium:100%;--awb-order-medium:0;--awb-spacing-right-medium:1.92%;--awb-spacing-left-medium:1.92%;--awb-width-small:100%;--awb-order-small:0;--awb-spacing-right-small:1.92%;--awb-margin-bottom-small:44px;--awb-spacing-left-small:1.92%;\"><div 
class=\"fusion-column-wrapper fusion-column-has-shadow fusion-flex-justify-content-flex-start fusion-content-layout-column\"><div class=\"fusion-text fusion-text-1 fusion-text-no-margin\" style=\"--awb-content-alignment:center;--awb-text-color:var(--awb-color1);--awb-margin-right:15%;--awb-margin-bottom:0px;--awb-margin-left:15%;\"><p style=\"text-align: left;\"><span style=\"color: #000000;\"><strong>Two-Stage Least Squares<\/strong><\/span><\/p>\n<p style=\"text-align: left; color: #000000;\">Two-stage least squares is the most widely used instrumental variable estimator. It addresses endogeneity from omitted variables, simultaneity, and measurement error by replacing the endogenous predictor with its exogenously predicted component. It is best suited for models with a continuous predictor.<\/p>\n<p style=\"text-align: left;\"><strong style=\"color: #000000;\">How It Works<\/strong><\/p>\n<p style=\"text-align: left;\"><em><span style=\"color: #000000;\">Step 1: First-Stage Regression<\/span><\/em><\/p>\n<p style=\"text-align: left; color: #000000;\">Regress the endogenous predictor on the instrumental variable(s) and all exogenous controls and fixed effects from the outcome equation. Save the predicted values \u2014 these represent the variation in the predictor that is driven solely by the instrument and the exogenous controls, stripped of the endogenous component.<\/p>\n<p style=\"text-align: left;\"><em><span style=\"color: #000000;\">Step 2: Second-Stage Regression<\/span><\/em><\/p>\n<p style=\"text-align: left; color: #000000;\">Estimate the outcome equation, but replace the original endogenous predictor with the predicted values from the first stage. 
The instrument is excluded from this stage.<\/p>\n<p style=\"text-align: left; color: #000000;\">Because the predicted values contain only exogenous variation, the second-stage coefficient estimates the causal effect free of endogeneity bias \u2014 provided the instrument is both strong and valid.<\/p>\n<p style=\"text-align: left; color: #000000;\">Unlike the control function approach, 2SLS is implemented as a single estimation procedure: the software estimates both stages together and reports standard errors that account for the first-stage estimation. Running the two stages manually as separate OLS regressions yields the same coefficients but incorrect second-stage standard errors.<\/p>\n<p style=\"text-align: left;\"><strong style=\"color: #000000;\">Requirements<\/strong><\/p>\n<ul style=\"text-align: left;\">\n<li><span style=\"color: #000000;\">At least one strong instrumental variable (it strongly predicts the endogenous predictor after controlling for the exogenous variables)<\/span><\/li>\n<li><span style=\"color: #000000;\">At least one valid instrumental variable (it affects the outcome only through the endogenous predictor)<\/span><\/li>\n<li><span style=\"color: #000000;\">The first stage must include the same controls and fixed effects as the outcome equation<\/span><\/li>\n<li><span style=\"color: #000000;\">Works best with a continuous endogenous predictor<\/span><\/li>\n<\/ul>\n<p style=\"text-align: left;\"><strong style=\"color: #000000;\">Limitations<\/strong><\/p>\n<p style=\"text-align: left; color: #000000;\">2SLS becomes less practical in certain settings:<\/p>\n<ul style=\"text-align: left;\">\n<li><span style=\"color: #000000;\">Binary or discrete predictors \u2014 the linear first stage may produce predicted values outside the logical range<\/span><\/li>\n<li><span style=\"color: #000000;\">Interaction terms \u2014 when the endogenous predictor appears in interactions, instrumenting becomes cumbersome<\/span><\/li>\n<li><span style=\"color: #000000;\">Nonlinear models \u2014 logit, probit, Poisson, and similar models do not lend themselves naturally to the 2SLS framework<\/span><\/li>\n<\/ul>\n<p style=\"text-align: left; color: #000000;\">In these cases, the control 
function approach is often a better fit. In linear models with continuous variables, both approaches produce identical coefficient estimates. They diverge when interactions or nonlinearities are present \u2014 2SLS is more robust to misspecification, while the control function is more efficient when correctly specified.<\/p>\n<p style=\"text-align: left;\"><strong style=\"color: #000000;\">Implementing the Two-Stage Least Squares Approach<\/strong><\/p>\n<p style=\"text-align: left;\"><span style=\"color: #000000;\">The <\/span><em><strong style=\"color: #000000;\">ivregress<\/strong><\/em><span style=\"color: #000000;\"> command in <\/span><strong style=\"color: #000000;\">Stata<\/strong><span style=\"color: #000000;\"> (<\/span><a style=\"color: #000000;\" href=\"https:\/\/www.stata.com\/manuals\/rivregress.pdf\">https:\/\/www.stata.com\/manuals\/rivregress.pdf<\/a><span style=\"color: #000000;\">) can be used to implement two-stage least-squares regression with the following code.<\/span><\/p>\n<\/div><style type=\"text\/css\" scopped=\"scopped\">.fusion-syntax-highlighter-1 > .CodeMirror, .fusion-syntax-highlighter-1 > .CodeMirror .CodeMirror-gutters {background-color:var(--awb-color1);}.fusion-syntax-highlighter-1 > .CodeMirror .CodeMirror-gutters { background-color: var(--awb-color2); }.fusion-syntax-highlighter-1 > .CodeMirror .CodeMirror-linenumber { color: var(--awb-color8); }<\/style><div class=\"fusion-syntax-highlighter-container fusion-syntax-highlighter-1 fusion-syntax-highlighter-theme-light\" style=\"opacity:0;margin-top:0px;margin-right:15%;margin-bottom:0px;margin-left:15%;font-size:14px;border-width:1px;border-style:solid;border-color:var(--awb-color3);\"><div class=\"syntax-highlighter-copy-code\"><span class=\"syntax-highlighter-copy-code-title\" data-id=\"fusion_syntax_highlighter_1\" style=\"font-size:14px;\">Copy to Clipboard<\/span><\/div><label for=\"fusion_syntax_highlighter_1\" class=\"screen-reader-text\">Syntax Highlighter<\/label><textarea 
class=\"fusion-syntax-highlighter-textarea\" id=\"fusion_syntax_highlighter_1\" data-readOnly=\"nocursor\" data-lineNumbers=\"1\" data-lineWrapping=\"\" data-theme=\"default\">\/\/estimate two-stage least-squares regression including first stage estimates\n\nivregress 2sls Outcome Controls (Predictor = Instrument), first<\/textarea><\/div><div class=\"fusion-text fusion-text-2 fusion-text-no-margin\" style=\"--awb-content-alignment:center;--awb-text-color:var(--awb-color1);--awb-margin-right:15%;--awb-margin-bottom:0px;--awb-margin-left:15%;\"><p style=\"text-align: left; color: #000000;\"><span style=\"background-color: rgba(0, 0, 0, 0);\"><br \/>\nThe <\/span><b style=\"background-color: rgba(0, 0, 0, 0);\"><i>ivreg <\/i><\/b><span style=\"background-color: rgba(0, 0, 0, 0);\">package in <\/span><b style=\"background-color: rgba(0, 0, 0, 0);\">R <\/b><span style=\"background-color: rgba(0, 0, 0, 0);\">(https:\/\/cran.r-project.org\/web\/packages\/ivreg\/index.html) can be used to implement two-stage least-squares regression with the following code. 
To obtain the first-stage estimates, a separate regression model needs to be run.<\/span><\/p>\n<\/div><style type=\"text\/css\" scopped=\"scopped\">.fusion-syntax-highlighter-2 > .CodeMirror, .fusion-syntax-highlighter-2 > .CodeMirror .CodeMirror-gutters {background-color:var(--awb-color1);}.fusion-syntax-highlighter-2 > .CodeMirror .CodeMirror-gutters { background-color: var(--awb-color2); }.fusion-syntax-highlighter-2 > .CodeMirror .CodeMirror-linenumber { color: var(--awb-color8); }<\/style><div class=\"fusion-syntax-highlighter-container fusion-syntax-highlighter-2 fusion-syntax-highlighter-theme-light\" style=\"opacity:0;margin-top:0px;margin-right:15%;margin-bottom:0px;margin-left:15%;font-size:14px;border-width:1px;border-style:solid;border-color:var(--awb-color3);\"><div class=\"syntax-highlighter-copy-code\"><span class=\"syntax-highlighter-copy-code-title\" data-id=\"fusion_syntax_highlighter_2\" style=\"font-size:14px;\">Copy to Clipboard<\/span><\/div><label for=\"fusion_syntax_highlighter_2\" class=\"screen-reader-text\">Syntax Highlighter<\/label><textarea class=\"fusion-syntax-highlighter-textarea\" id=\"fusion_syntax_highlighter_2\" data-readOnly=\"nocursor\" data-lineNumbers=\"1\" data-lineWrapping=\"\" data-theme=\"default\">#load the ivreg package\n\nlibrary(ivreg)\n\n#estimate two-stage least-squares regression\n\nmodel_2SLS <- ivreg(Outcome ~ Predictor + Controls | Instrument + Controls, data = Dataset)\n\nsummary(model_2SLS)\n\n#obtain the first-stage estimates\n\nmodel_first_stage <- lm(Predictor ~ Instrument + Controls, data = Dataset)\n\nsummary(model_first_stage)<\/textarea><\/div><div class=\"fusion-text fusion-text-3 fusion-text-no-margin\" style=\"--awb-content-alignment:center;--awb-text-color:var(--awb-color1);--awb-margin-right:15%;--awb-margin-bottom:0px;--awb-margin-left:15%;\"><p style=\"text-align: left;\"><strong style=\"background-color: rgba(0, 0, 0, 0); color: #000000;\"><br \/>\nReferences<\/strong><\/p>\n<ul>\n<li 
style=\"text-align: left;\"><span style=\"color: #000000;\">Papies, Dominik, Peter Ebbes, and Harald van Heerde (2017), \u201cAddressing Endogeneity in Marketing Models,\u201d Advanced Methods for Modeling Markets, Cham: Springer, 581\u2013627.<\/span><\/li>\n<li style=\"text-align: left;\"><span style=\"color: #000000;\">Wooldridge, Jeffrey M. (2010). Econometric analysis of cross section and panel data. Cambridge: MIT Press.<\/span><\/li>\n<\/ul>\n<\/div><\/div><\/div><\/div><\/div>\n","protected":false},"excerpt":{"rendered":"","protected":false},"author":1,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"100-width.php","meta":{"footnotes":""},"class_list":["post-155","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/www.endogeneity.net\/index.php?rest_route=\/wp\/v2\/pages\/155","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.endogeneity.net\/index.php?rest_route=\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/www.endogeneity.net\/index.php?rest_route=\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/www.endogeneity.net\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.endogeneity.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=155"}],"version-history":[{"count":11,"href":"https:\/\/www.endogeneity.net\/index.php?rest_route=\/wp\/v2\/pages\/155\/revisions"}],"predecessor-version":[{"id":502,"href":"https:\/\/www.endogeneity.net\/index.php?rest_route=\/wp\/v2\/pages\/155\/revisions\/502"}],"wp:attachment":[{"href":"https:\/\/www.endogeneity.net\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=155"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}