[R] HAC standard errors
Nick Pretnar
npretnar at gmail.com
Mon Jun 2 21:56:20 CEST 2014
Hello,
I am having great difficulty running a simple linear regression with entity and time fixed effects and HAC standard errors. My data set has 3 million observations and 30 variables, structured as follows:
NAME STATE YEAR Y X1 X2
1 1 2012 1 1 1
2 1 2012 1 2 7
3 1 2012 1 1 2
4 2 2012 2 4 5
etc. ... For every state in every year there are about 10,000 rows corresponding to individual observations. This is not a longitudinal dataset: an individual surveyed in year 2000 in state 1 is never surveyed again. Nonetheless, I still wish to control for geographic and time fixed effects. To do so, I run the following:
> load("data.frame.rda")
> library(sandwich)
> library(pcse)
> model <- lm(Y ~ X1 + X2 + factor(STATE) + factor(YEAR), data = data.frame)
> vcovHAC(model, prewhite = FALSE, adjust = FALSE, sandwich = TRUE, ar.method = "ols")
R never returns a result, though it behaves as if it is still computing; this goes on for four hours or more.
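One likely cause: vcovHAC() builds a kernel weighting over all pairs of residuals, which is infeasible at n = 3 million. A sketch of a cheaper alternative I have been considering, cluster-robust standard errors via sandwich::vcovCL() (available in recent versions of sandwich), shown on simulated data with column names mirroring the schematic above:

```r
## Sketch, not my actual data: cluster-robust SEs by state instead of
## HAC. vcovCL() makes a single pass over the n rows rather than
## weighting all pairs of residuals, so it scales to millions of rows.
library(sandwich)
library(lmtest)

set.seed(1)
df <- data.frame(
  STATE = rep(1:5, each = 200),         # 5 simulated states
  YEAR  = rep(2010:2013, times = 250),  # 4 simulated years
  X1    = rnorm(1000),
  X2    = rnorm(1000)
)
df$Y <- 1 + 0.5 * df$X1 - 0.2 * df$X2 + rnorm(1000)

## same specification as above, on the simulated frame
model <- lm(Y ~ X1 + X2 + factor(STATE) + factor(YEAR), data = df)

## coefficient table with SEs clustered by state
coeftest(model, vcov = vcovCL(model, cluster = df$STATE))
```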
I wanted to run the following:
> library(pcse)
> model <- lm(Y ~ X1 + X2 + factor(STATE) + factor(YEAR), data = data.frame)
> model.pcse <- pcse(model, groupN = data.frame$STATE, groupT = data.frame$YEAR)
But I get the error:
Error in pcse(model, groupN = BRFSS_OBESEBALANCED$X_STATE, groupT = BRFSS_OBESEBALANCED$YEAR) :
  There cannot be more than nCS*nTS rows in the using data!
If there are any workarounds for this problem, I would greatly appreciate learning about them.
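In case it helps diagnose the pcse() failure: the check it trips is simple arithmetic. pcse() expects a panel with at most one row per (group, time) cell, so the number of rows may not exceed nCS * nTS. The counts below are assumptions for illustration (50 states, 13 survey years), not taken from my data:

```r
## Illustrative arithmetic behind the pcse() error. The group and
## period counts are assumed, not from the actual data set.
nCS  <- 50        # assumed number of cross-sectional groups (states)
nTS  <- 13        # assumed number of time periods (years)
nobs <- 3e6       # total rows, as reported above
nCS * nTS         # the most rows pcse() will accept here: 650
nobs > nCS * nTS  # TRUE, so pcse() stops with the nCS*nTS error
```

With ~10,000 individual responses per state-year, that bound is violated immediately, which suggests pcse() is only usable after collapsing the data to one row per state-year.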
Thanks,
Nicholas Pretnar
University of Missouri, Economics
npretnar at gmail.com