[R] Panel data - replicating Stata's xtpcse in R

Achim Zeileis Achim.Zeileis at uibk.ac.at
Thu Apr 7 21:56:26 CEST 2011

On Thu, 7 Apr 2011, Florian Markowetz wrote:

> Dear list,
> I am trying to replicate an econometrics study that was originally done in Stata. (Blanton and Blanton. 2009. A Sectoral Analysis of Human Rights and FDI: Does Industry Type Matter? International Studies Quarterly 53 (2): 469-493.) The model I am trying to replicate is given in Stata as
> xtpcse total_FDI lag_total ciri human_cap worker_rts polity_4 market 
> income econ_growth log_trade fix_dollar fixed_xr xr_fluct lab_growth 
> english, pairwise corr(ar1)
> According to the paper, this is an OLS regression with panel corrected 
> standard errors including a lagged dependent variable (lag_total is 
> total_FDI t-1) and controlling first order correlations within each 
> panel (corr(ar1)).

I'm not sure about the Stata command (because I haven't got Stata 
installed myself) or how it translates to R. Other people might know more.

From the verbal description "OLS plus panel-corrected standard errors" I 
would have expected that the coefficients could be estimated by lm() but 
that does not seem to be the case. Not sure why...it doesn't seem to be 
_O_LS then. Did you check that the Stata command produces the same output 
as indicated in the paper? (Maybe some data preprocessing is needed?)
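
One possible reason for the mismatch, as far as I understand the Stata documentation: with corr(ar1), xtpcse estimates the coefficients by Prais-Winsten regression with a common AR(1) coefficient rather than by plain OLS. Below is a minimal base-R sketch of that two-step idea on simulated data. The data, the variable names (id, year, x, y), the helper pw(), and the residual-regression estimate of rho are all assumptions for illustration, not a verified reimplementation of Stata's routine.

```r
## Simulated balanced panel (hypothetical data, sorted by id, then year)
set.seed(1)
n_id <- 20; n_time <- 30
d <- expand.grid(year = seq_len(n_time), id = seq_len(n_id))
d$x <- rnorm(nrow(d))
## AR(1) errors within each panel, true rho = 0.6
d$u <- unlist(lapply(seq_len(n_id), function(i)
  as.numeric(arima.sim(list(ar = 0.6), n = n_time))))
d$y <- 1 + 2 * d$x + d$u

## Step 1: pooled OLS, then a common rho from regressing residuals on their lag
ols <- lm(y ~ x, data = d)
e <- resid(ols)
e_lag <- ave(e, d$id, FUN = function(v) c(NA, head(v, -1)))  # lag within panel
rho <- coef(lm(e ~ 0 + e_lag))[[1]]

## Step 2: Prais-Winsten transformation (hypothetical helper)
## first observation of each panel: sqrt(1 - rho^2) * v; later ones: v_t - rho * v_{t-1}
pw <- function(v, id, rho) {
  v_lag <- ave(v, id, FUN = function(z) c(NA, head(z, -1)))
  ifelse(is.na(v_lag), sqrt(1 - rho^2) * v, v - rho * v_lag)
}
X  <- model.matrix(ols)                        # includes the intercept column
Xs <- apply(X, 2, pw, id = d$id, rho = rho)    # transform every regressor
ys <- pw(d$y, d$id, rho)

## Step 3: OLS on the transformed data (intercept is already a column of Xs)
fit_pw <- lm(ys ~ 0 + Xs)
rho
cbind(ols = coef(ols), prais_winsten = coef(fit_pw))
```

If the Stata results come from such a transformed regression, lm() on the untransformed data would not be expected to reproduce them exactly.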

In any case, I've had success with replicating such results with the "plm" 
package (see also http://www.jstatsoft.org/v27/i02/). Typically using the 
model = "pooling" (i.e., OLS) and then computing the standard errors via 
vcovBK(). The latter stands for "Beck & Katz" which is what the "pcse" 
package also implements.
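
A minimal sketch of this workflow, using the Grunfeld data shipped with "plm" as a stand-in for the FDI data (so the variables inv, value, capital, firm, and year are placeholders, not the ones from the paper). The choice cluster = "time" is my reading of the plm documentation for the Beck & Katz setting and should be checked against ?vcovBK.

```r
library(plm)     # panel models and vcovBK()
library(lmtest)  # coeftest() for a coefficient table with a custom covariance

data("Grunfeld", package = "plm")

## Pooled OLS with a lagged dependent variable; lag() is panel-aware here
pool <- plm(inv ~ lag(inv, 1) + value + capital,
            data = Grunfeld, index = c("firm", "year"),
            model = "pooling")

## Beck & Katz type covariance matrix and the resulting coefficient table
V_bk <- vcovBK(pool, cluster = "time")
coeftest(pool, vcov. = V_bk)
```

The coefficients are those of pooled OLS; only the standard errors change when the covariance matrix is swapped.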

In a few other cases, I replicated the so-called panel-corrected standard 
errors via geeglm() from "geepack" (http://www.jstatsoft.org/v15/i02/).
Using the default corstr = "independence" (i.e., again corresponding to OLS).
Other corstr could be employed.
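
A small sketch of the geeglm() route on simulated data (the data frame d and its columns id, year, x, y are made up for illustration). With corstr = "independence" the point estimates should coincide with lm(); the standard errors reported by default are the robust (sandwich) ones clustered on id.

```r
library(geepack)

## Simulated panel, sorted by id as geeglm() expects (hypothetical data)
set.seed(42)
n_id <- 15; n_time <- 10
d <- expand.grid(year = seq_len(n_time), id = seq_len(n_id))
d$x <- rnorm(nrow(d))
## unit-specific effect plus idiosyncratic noise
d$y <- 1 + 0.5 * d$x + rep(rnorm(n_id), each = n_time) + rnorm(nrow(d))

gee <- geeglm(y ~ x, data = d, id = id,
              family = gaussian, corstr = "independence")
ols <- lm(y ~ x, data = d)

## Same coefficients, different standard errors
cbind(gee = coef(gee), ols = coef(ols))
summary(gee)$coefficients
```

Replacing "independence" by, e.g., corstr = "ar1" changes the working correlation and then the coefficients as well.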

Just as additional information: Many econometricians don't know much about 
the type of models that "nlme" estimates. Usually, least squares technology 
is preferred in econometrics rather than likelihood-based ideas. Also, 
other multi-level models are rarely used. If specified in the same way, 
both approaches often yield similar results. There is a paragraph in the 
above-mentioned JSS paper on "plm" that discusses (dis)similarities with "nlme".
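
To illustrate the "similar results" point, here is a rough comparison of a random-effects model fitted by plm() (least squares based) and by lme() (likelihood based), again on the Grunfeld data as a stand-in. This is only a sketch of the kind of comparison meant here, not a statement about the FDI data.

```r
library(plm)
library(nlme)

data("Grunfeld", package = "plm")

## Random effects via feasible GLS (plm default: Swamy-Arora)
re_plm <- plm(inv ~ value + capital, data = Grunfeld,
              index = c("firm", "year"), model = "random")

## Random intercept per firm via (restricted) maximum likelihood
re_lme <- lme(inv ~ value + capital, data = Grunfeld,
              random = ~ 1 | firm)

## Side-by-side coefficients
cbind(plm = coef(re_plm), lme = fixef(re_lme))
```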

Finally, a JSS paper on the "pcse" package is also waiting for publication 
in a special volume...hopefully online next month.
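
For completeness, a short sketch of how pcse() is typically applied to an "lm" fit, on simulated data with a common yearly shock (all names and data are hypothetical). On unbalanced data, pairwise = TRUE would be the analogue of the "pairwise" option in the Stata call, as far as I can tell.

```r
library(pcse)

## Simulated balanced panel with cross-sectional correlation (hypothetical data)
set.seed(7)
n_id <- 12; n_time <- 15
d <- expand.grid(year = seq_len(n_time), id = seq_len(n_id))
d$x <- rnorm(nrow(d))
shock <- rnorm(n_time)                        # shock shared by all units in a year
d$y <- 1 + 0.5 * d$x + shock[d$year] + rnorm(nrow(d))

## OLS fit, then panel-corrected standard errors
ols <- lm(y ~ x, data = d)
bk <- pcse(ols, groupN = d$id, groupT = d$year)
summary(bk)
```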

Good luck with the replication!

> The BIG QUESTION is how to replicate this line in R.
> Econometrics is a new field to me, but a bit of searching showed that packages like plm, nlme, and pcse should be able to handle this kind of problem. In particular, gls() can model an autocorrelation structure and pcse() corrects the standard errors of the fitted model. Below is some code to show what I have done, and some problems I ran into.
> ## setup and load data from web
> library(foreign)
> library(nlme)
> library(pcse)
> D <- read.dta("http://umdrive.memphis.edu/rblanton/public/ISQ_data/blanton_isq08_data.dta")
> D[544,"year"] <- 2005 ## fixing an unexpected NA in the year column
> ## Model formula
> form <- total_FDI ~ lag_total + ciri + human_cap + worker_rts + polity_4 + market_size + income + econ_growth + log_trade + fixed_xr + fix_dollar + xr_fluct + english + lab_growth
> ## Model 1: no auto-correlation
> res1 <- gls(model = form, data = D, correlation = NULL, na.action = na.omit)
> coefficients(res1)
> ## Model 2: with auto-correlation
> corr <- corAR1(value = 0.1, form = ~ 1 | c_name)
> res2 <- gls(model = form, data = D, correlation = corr, na.action = na.omit)
> coefficients(res2)
> Now, I know from the paper what the Stata coefficients looked like.  For 
> example, for lag_total it should be .852 and for market_size .21 (these 
> were the two significant ones). The result of Model1 is closer to this 
> than the result of Model 2, but there is still quite a gap.
> The goal is to do OLS on panel data with AR(1) and PCSE - am I on the 
> right track here? More specifically:
> Question 1: Auto-correlation - how to specify the parameter 'value' in 
> corAR1 (the .1 above is completely arbitrary) - Any other ideas how to 
> translate Stata's corr(AR1) into R? (I'm not even completely sure what 
> Stata does there and didn't find any details in the online manuals)
> Question 2: PCSE - the pcse function seems to work on objects of class 
> 'lm' only. Any way to use it for gls-objects?
> Any help is greatly appreciated!
> Florian
> --
> Florian Markowetz
> Cancer Research UK
> Cambridge Research Institute
> Li Ka Shing Centre
> Robinson Way, Cambridge, CB2 0RE, UK
> phone: +44 (0) 1223 40 4315
> email: florian.markowetz at cancer.org.uk
> web  : http://www.markowetzlab.org
> skype: florian.markowetz
