[R] Potential problem with subset !!!
Douglas Bates
bates at stat.wisc.edu
Fri Feb 12 16:35:03 CET 2010
On Fri, Feb 12, 2010 at 9:22 AM, Arnaud Mosnier <a.mosnier at gmail.com> wrote:
> Dear useRs,
>
> Just a short post to share the answer to a problem that took me
> some time to resolve!
> I hope that reading this will help others avoid the same error.
>
> When using the subset function, writing
>
> subset (data, data$columnname == X) or subset (data, columnname == X)
>
> does the same thing.
>
> Thus, the function treats a name given after the comma
> (like "columnname") as the name of a column of the data frame
> whenever such a column exists.
> A problem occurs when another name in the expression, such as X, is both
> a column of the data frame and an object in the workspace!
>
> Here is an example:
>
> df <- data.frame(ID = c("a","b","c","b","e"), Other = 1:5)
> ID <- unique (df$ID)
> ID
>
> ## Now the potential problem !!
>
> subset (df, df$ID == ID[4])
>
> ## BE CAREFUL: the subset function uses the column ID of the data frame
> ## and NOT the object ID containing the unique values, so ID[4] here
> ## is "b" (the fourth row of the column) and not "e"!
>
> Sorry if this seems obvious to some of you, but I hope that others find
> it useful!
Myself, I think it would be obvious to anyone who had read the
documentation, the third paragraph of which is:
For data frames, the ‘subset’ argument works on the rows. Note
that ‘subset’ will be evaluated in the data frame, so columns can
be referred to (by name) as variables in the expression (see the
examples).
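
To make the consequence concrete, here is a minimal sketch based on the
example from the original post. The names masked, wanted, by_name and
by_index are mine (hypothetical, not from the post), and
stringsAsFactors = FALSE is an assumption added so that ID is a character
vector whatever the R version.

```r
# Data frame from the original post; stringsAsFactors = FALSE is an added
# assumption so that ID is stored as character rather than as a factor.
df <- data.frame(ID = c("a", "b", "c", "b", "e"), Other = 1:5,
                 stringsAsFactors = FALSE)
ID <- unique(df$ID)                    # workspace object: "a" "b" "c" "e"

# Inside subset(), the name ID is looked up in the data frame first, so
# ID[4] is the fourth element of the column ("b"), not of the workspace
# object ("e").
masked <- subset(df, df$ID == ID[4])   # selects rows 2 and 4

# Workaround 1: store the value under a name that is not a column name.
wanted <- ID[4]                        # evaluated outside subset(): "e"
by_name <- subset(df, ID == wanted)    # selects row 5

# Workaround 2: use ordinary indexing, where ID refers to the workspace
# object because nothing is evaluated inside the data frame.
by_index <- df[df$ID == ID[4], ]       # selects row 5
```

Either workaround avoids the clash; giving the workspace object a name
that is not also a column name is probably the simplest habit.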
>
> Arnaud
>