[Rd] Subsetting issue in model.frame with na.omit
Trevor John Hastie
hastie at stanford.edu
Mon Sep 19 22:13:29 CEST 2016
Running R version 3.3.1 (2016-06-21) "Bug in Your Hair".
I have discovered an issue with model.frame() with regard to its
implementation of the na.action argument. This impacts the gam package.
We expect that the last thing model.frame() does is run
na.action on the frame it has produced. In the example
below, we use "na.action=na.omit", which calls for dropping
the rows of the frame that contain NAs. However, when it does this,
it does not see that there is a [.smooth method for the two columns,
which are of S3 class "smooth". So it does do the subsetting,
but does not use the subset method. In my example, this is
evidenced by the $NAs element of the attributes of each of
the components still being present.
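To make the expected dispatch concrete, here is a minimal sketch in base R. The class "tagged" and its constructor are made up for illustration and merely stand in for gam's "smooth" class; the sketch only shows that na.omit() on a data frame subsets each column through that column's own [ method, and it does not reproduce the model.frame() behaviour described above.

```r
# Hypothetical toy class standing in for gam's "smooth"; not part of any package.
# The "NAs" attribute records which positions were missing.
tagged <- function(x) structure(x, NAs = which(is.na(x)), class = "tagged")

# Subset method for the toy class: default subsetting of the unclassed
# vector drops the "NAs" attribute, and we restore only the class.
`[.tagged` <- function(x, i, ...) {
  y <- unclass(x)[i]
  class(y) <- "tagged"
  y
}

d <- data.frame(id = 1:5)
d$a <- tagged(c(0.1, NA, 0.3, NA, 0.5))

names(attributes(d$a))    # includes "NAs" before filtering

# na.omit() row-subsets the data frame, which calls `[.tagged` on column a
d2 <- na.omit(d)
names(attributes(d2$a))   # "NAs" should be gone if the method was used
```

If the [ method had been bypassed and the original attributes kept, "NAs" would still be listed in the second call.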
When I instead use "na.action=na.pass" in the call to model.frame()
and then filter the resulting frame through na.omit(), it does the right thing:
the $NAs element has disappeared, which is what should have
happened in the first case as well.
set.seed(101)
n=30
x=matrix(runif(n*2),n,2)
x[sample(1:20,6,replace=FALSE)]=NA
dx=data.frame(x)
library(gam)
###Compare
m=model.frame(~s(X1,df=4)+s(X2,df=4),data=dx,na.action=na.omit)
attributes(m[[1]])
###with
m=model.frame(~s(X1,df=4)+s(X2,df=4),data=dx,na.action=na.pass)
m=na.omit(m)
attributes(m[[1]])
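The second variant above can be wrapped in a small helper for the time being. This is only a sketch of the workaround: the name model_frame_omit is made up here, and the toy data below uses plain numeric columns rather than gam smooths, so it checks the row filtering and nothing else.

```r
# Hypothetical helper (the name is an assumption, not from any package):
# build the model frame without NA handling, then drop incomplete rows
# with na.omit(), which row-subsets the finished frame.
model_frame_omit <- function(formula, data, ...) {
  mf <- stats::model.frame(formula, data = data,
                           na.action = stats::na.pass, ...)
  stats::na.omit(mf)
}

# Toy data: rows 2 and 3 each contain one NA
dx2 <- data.frame(X1 = c(0.2, NA, 0.7, 0.4),
                  X2 = c(1.5, 2.5, NA, 3.5))
m2 <- model_frame_omit(~ X1 + X2, data = dx2)
nrow(m2)                  # number of complete rows kept
attr(m2, "na.action")     # records which rows were dropped
```

With the gam example, the call would be model_frame_omit(~s(X1,df=4)+s(X2,df=4), data=dx) after library(gam).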
------------------------------------------------------------------------------
Trevor Hastie hastie at stanford.edu
Professor, Department of Statistics, Stanford University
Phone: (650) 725-2231 Fax: (650) 725-8977
URL: http://www.stanford.edu/~hastie
address: room 104, Department of Statistics, Sequoia Hall
390 Serra Mall, Stanford University, CA 94305-4065
------------------------------------------------------------------------------