[R] Errors-In-Variables in R

John Fox jfox at mcmaster.ca
Sun Mar 3 16:54:26 CET 2013


Dear Cedric,

If I understand correctly what you want to do, and if you're willing to
assume that the variables are normally distributed, then you should be able
to use any of the latent-variable structural-equation-modeling packages in
R, such as sem, OpenMX, or lavaan. 

Here's an artificial example using the sem package:

------------ snip ----------

> set.seed(12345)
> zeta <- rnorm(1000)
> y <- 1 + 2*zeta + rnorm(1000, 0, 1)
> x <- zeta + rnorm(1000)
> plot(x, y)
> Data <- data.frame(x, y)
> summary(lm(y ~ x)) # biased

Call:
lm(formula = y ~ x)

Residuals:
    Min      1Q  Median      3Q     Max 
-6.6339 -1.1406  0.0299  1.1573  6.5652 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  1.04007    0.05514   18.86   <2e-16 ***
x            1.06089    0.04012   26.44   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.743 on 998 degrees of freedom
Multiple R-squared:  0.4119,	Adjusted R-squared:  0.4113 
F-statistic: 699.1 on 1 and 998 DF,  p-value: < 2.2e-16

> plot(x, y) # not shown
> 
> library(sem)
> 
> eqns <- specifyEquations()
1: y = alpha*Intercept + beta*zeta
2: x = 1*zeta
3: V(y) = sigma
4: V(x) = 1
5: V(zeta) = phi
6: 
Read 5 items

> model <- sem(eqns, data=Data, raw=TRUE, fixed.x="Intercept")
> summary(model)

Model fit to raw moment matrix.

 Model Chisquare =  0.2264654   Df =  1 Pr(>Chisq) = 0.6341572
 AIC =  8.226465
 BIC =  -6.68129

 Normalized Residuals
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
 0.0000  0.1635  0.1711  0.2189  0.2564  0.4759 

 Parameter Estimates
      Estimate  Std Error  z value   Pr(>|z|)                     
alpha 1.0400668 0.05507397 18.884905 1.518098e-79 y <--- Intercept
beta  2.2553406 0.14197058 15.885971 7.926103e-57 y <--- zeta     
sigma 0.6404697 0.25612060  2.500657 1.239632e-02 y <--> y        
phi   0.8881856 0.08444223 10.518263 7.117323e-26 zeta <--> zeta  

 Iterations =  15 

> library(car)
> linearHypothesis(model, c("alpha = 1", "beta = 2", "sigma = 1", "phi =
1")) # true parameter values
Linear hypothesis test

Hypothesis:
alpha = 1
beta = 2
sigma = 1
phi = 1

Model 1: restricted model
Model 2: model

  Res.Df Df  Chisq Pr(>Chisq)
1      5                     
2      1  4 3.8285     0.4297

------------ snip ----------

For other distributional assumptions, you'd have to write your own objective
function but you may still be able to use sem or one of the other SEM
packages.

I hope this helps,
 John

-----------------------------------------------
John Fox
Senator McMaster Professor of Social Statistics
Department of Sociology
McMaster University
Hamilton, Ontario, Canada



> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-
> project.org] On Behalf Of Cedric Sodhi
> Sent: Saturday, March 02, 2013 4:56 PM
> To: Rui Barradas
> Cc: r-help at r-project.org
> Subject: Re: [R] Errors-In-Variables in R
> 
> Perhaps it would have been clearer that this is no homework if I
> hadn't forgotten to say what [1] is. Sorry for that.
> 
> [1] https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=15225
> 
> (This is no homework but genuinely adresses the problem that R to my
> knowledge does not have models for error in variables)
> 
> 
> On Sat, Mar 02, 2013 at 09:34:21PM +0000, Rui Barradas wrote:
> > There's a no homework policy in R-help.
> >
> > Rui Barradas
> >
> > Em 02-03-2013 18:28, Cedric Sodhi escreveu:
> > > In reference to [1], how would you solve the following regression
> > > problem:
> > >
> > > Given observations (X_i,Y_i) with known respective error
> distributions
> > > (e_X_i,e_Y_i) (say, 0-mean Gaussian with known STD), find the
> parameters
> > > a and b which maximize the Likelihood of
> > >
> > > Y = a*X + b
> > >
> > > Taking the example further, how many of the very simplified
> assumptions
> > > from the above example can be lifted or eased and R still has a
> method
> > > for finding an errors-in-variables fit?
> > >
> > > ______________________________________________
> > > R-help at r-project.org mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> > >
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.



More information about the R-help mailing list