If you are like me, you love Stata's reghdfe command for linear regression. It offers a wide range of functionality desired in (financial) economics research, like multi-dimensional fixed effects, instrumental variables and standard error clustering. Yet, for some modern data pipelines (in particular on distributed systems), it is not trivial to integrate Stata. R, on the other hand, has a lot of APIs that are useful in such a context (for example the sparklyr package to write Spark applications).

In this post I am showing how you can use R's linear fixed effects package lfe and its felm command to replicate regression results from Stata's reghdfe.

If you are reading this post, chances are you are familiar with Stata's reghdfe, so I won't spend much time explaining what the Stata code below does. In a nutshell, we have seven regression models here that we are trying to replicate in R next:

    cls
    reg ln_w grade age ttl_exp tenure not_smsa south
    reghdfe ln_w grade age ttl_exp tenure not_smsa south, abs(idcode)
    reghdfe ln_w grade age ttl_exp tenure not_smsa south, abs(idcode) cluster(idcode)
    reghdfe ln_w grade age ttl_exp tenure not_smsa south, abs(year) cluster(idcode)
    xtreg ln_w grade age ttl_exp tenure not_smsa south i.year, re cluster(idcode)
    reghdfe ln_w age ttl_exp tenure not_smsa south, abs(idcode) cluster(idcode)
    reghdfe ln_w age ttl_exp tenure not_smsa south, abs(idcode year) cluster(idcode year)
    reghdfe ln_w grade age ttl_exp tenure not_smsa south, abs(idcode year) cluster(idcode wks_work)

I only print the results of the last model 7:

    HDFE Linear regression                            Number of obs   =     26,834
    Absorbing 2 HDFE groups                           F(   5,    104) =     107.26
    Statistics robust to heteroskedasticity           Prob > F        =     0.0000
    Number of clusters (idcode)   =      4,107        Within R-sq.
    Number of clusters (wks_work) =        105        Root MSE        =     0.2922

    (Std. Err. adjusted for 105 clusters in idcode wks_work)

    * = FE nested within cluster; treated as redundant for DoF computation

In this model we have two-dimensional fixed effects (idcode and year) and two-way standard error clustering (idcode and wks_work).

First, prepare the workspace in R:

    # clear work space
    rm(list=ls()) # delete global environment
    library(lfe)  # provides felm

We can recreate the regression results from reghdfe using the call below, which assumes the data are already loaded into a data frame named df:

    model7 = felm(
      # regressors | fixed effects | instruments | standard errors (cluster variables)
      ln_wage ~ age + ttl_exp + tenure + not_smsa + south |
        idcode + year | 0 | idcode + wks_work,
      data = df,
      # whether degree of freedom should be computed exactly
      exactDOF = TRUE,
      cmethod = "cgm2"
    )

Note that you need the latest version of lfe in order to be able to use the cgm2 method. Otherwise standard errors will be different from Stata.

    Call:
       felm(formula = ln_wage ~ age + ttl_exp + tenure + not_smsa + south | idcode + year | 0 | idcode + wks_work, data = df, exactDOF = TRUE, cmethod = "cgm2")

    Residual standard error: 0.2921 on 22708 degrees of freedom
      (1133 observations deleted due to missingness)
    Multiple R-squared(full model): 0.689   Adjusted R-squared: 0.6248

The results are practically identical (see the github issue page on remaining differences). In the repo I stored the code for models 1-6.

    regress lntobinsq lnassets FXDerivatives10 IRDerivatives10 bookleverage_w1 roa_w1 cratio_w1 rnd_rev_w1 cash_to_totalassets_w1 div_yield_w1 year2016 if inlist(year,2015,2016)

And this is the same model with industry dummies:

    regress lntobinsq lnassets FXDerivatives10 IRDerivatives10 bookleverage_w1 roa_w1 cratio_w1 rnd_rev_w1 cash_to_totalassets_w1 div_yield_w1 year2016 ind2* if inlist(year,2015,2016)
    Note: ind23 omitted because of collinearity
    Note: ind240 omitted because of collinearity
    Note: ind249 omitted because of collinearity

Sorry for asking all these questions, but I'm new to Stata/econometrics in general and I was wondering: if I wanted to use robust standard errors with each model, would it be correct to just use the robust option after each of these commands, i.e.

    regress lntobinsq lnassets FXDerivatives10 IRDerivatives10 bookleverage_w1 roa_w1 cratio_w1 rnd_rev_w1 cash_to_totalassets_w1 div_yield_w1 year2016 if inlist(year,2015,2016), robust
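To experiment with the four-part felm formula used for model 7 without the original data, here is a minimal sketch on simulated data. Everything in it is an assumption made for illustration: the data frame df_sim, the columns id, year, x and y, and the true coefficient of 2 are hypothetical and have nothing to do with the wage data above. The call is meant as a rough R analogue of a Stata call of the form `reghdfe y x, abs(id year) cluster(id)`; it does not claim to reproduce Stata's standard errors exactly.

```r
library(lfe)

set.seed(42)

# Simulated panel: 200 hypothetical units observed over 10 years
n_id   <- 200
n_year <- 10
df_sim <- data.frame(
  id   = rep(seq_len(n_id), each = n_year),
  year = rep(seq_len(n_year), times = n_id)
)

# Unit and year effects, looked up for every row of the panel
id_effect   <- rnorm(n_id)[df_sim$id]
year_effect <- rnorm(n_year)[df_sim$year]

# The regressor is correlated with the unit effect, so pooled OLS would be biased
df_sim$x <- rnorm(nrow(df_sim)) + 0.5 * id_effect
df_sim$y <- 2 * df_sim$x + id_effect + year_effect + rnorm(nrow(df_sim))

# outcome ~ regressors | fixed effects | instruments (0 = none) | cluster variables
m_sim <- felm(y ~ x | id + year | 0 | id, data = df_sim)

summary(m_sim)
```

The estimate on x should land close to the simulated value of 2, because the id and year effects are absorbed rather than left in the error term. Adding a second cluster variable to the last formula part, as in model 7, is the step where the cmethod argument starts to matter.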