**** * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * ;
**** ;
**** The macro requires the input dataset to contain all vars (i.e. outcome, endog var, and all exog vars) ;
**** as well as a ~UNIQUE~ observation identifier, called `N` in the macro. ;
**** ;
**** * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * ;

%macro ClusteredTSLS( INPUTDATA , OUTCOME , ENDOGVAR , INSTRUMENTS , OTHEREXOGVARS , CLUSTERVAR );

    **** 1.1 First estimate the first stage using PROC SURVEYREG to cluster the first stage coefs;
    PROC SURVEYREG data=&INPUTDATA ;
        title "1) First Stage Results with SEs Clustered by &CLUSTERVAR";
        CLUSTER &CLUSTERVAR ;
        model &ENDOGVAR = &INSTRUMENTS &OTHEREXOGVARS / solution ;
    run;

    **** 1.2 Reestimate the first stage using PROC REG to save the predicted values (since SURVEYREG can`t do it);
    PROC REG data=&INPUTDATA NOPRINT;
        model &ENDOGVAR = &INSTRUMENTS &OTHEREXOGVARS ;
        output out = Stage1Preds(keep=N X_hat) PREDICTED = X_hat;
    run;

    **** 2.1 Run SYSLIN to get the structural estimates and save the TSLS residuals ;
    PROC SYSLIN data=&INPUTDATA 2SLS out=Stage2Resids ;
        title "2) Second Stage Results";
        ENDOGENOUS &ENDOGVAR ;
        INSTRUMENTS &INSTRUMENTS &OTHEREXOGVARS ;
        model &OUTCOME = &ENDOGVAR &OTHEREXOGVARS ;
        output RESIDUAL = IV_resid ;
    run;

    **** 3.1 Sort both datasets by N (MERGE with a BY statement requires sorted input), then merge the TSLS resids with the first stage preds;
    PROC SORT data=Stage2Resids; by N; run;
    PROC SORT data=Stage1Preds;  by N; run;

    data Stage2Resids;
        merge Stage2Resids(in=in1) Stage1Preds(in=in2);
        by N;
        if in1 and in2;
    run;

    **** 4.1 To get clustered SEs, run the TSLS resids on X_hat and OTHEREXOGVARS via PROC SURVEYREG, clustering by CLUSTERVAR;
    **** The SE for the coef on the endog var in the structural model is the one reported for `X_hat`;
    PROC SURVEYREG data=Stage2Resids ;
        title "3) Clustered SEs for Second Stage";
        CLUSTER &CLUSTERVAR ;
        model IV_resid = X_hat &OTHEREXOGVARS / solution ;
    run;

%mend;
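
/* Example usage: a minimal sketch, not part of the original macro.
   All dataset and variable names below (work.raw, work.mydata, y, x, z1, z2,
   w1, w2, firm_id) are hypothetical placeholders. The data step creates the
   unique identifier N that the macro expects, assuming the input dataset does
   not already carry such a column. Arguments that hold several variables
   (instruments, other exogenous vars) are space-separated lists.

   data work.mydata;
       set work.raw;
       N = _N_;   * unique row number, used by the macro as the merge key ;
   run;

   %ClusteredTSLS( work.mydata , y , x , z1 z2 , w1 w2 , firm_id );
*/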