[R] Data frame vs matrix quirk: Hinky error message?

Bert Gunter gunter.berton at gene.com
Tue May 1 22:16:30 CEST 2012


Many thanks to all. I appreciate your kindness and patience.

The point is, of course, that matrix subscripting by logicals requires
different semantics than by numeric indices, as it must. I'd still say
this is a case of option 4, dumb Bert: I should have figured this out.
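
To make the difference concrete, here is a minimal sketch. It is not
from the thread itself, and the names m, num_ix, and lgl_ix are only
illustrative. A two-column numeric matrix picks out individual cells,
while a logical matrix that is smaller than the target is simply
recycled as a logical vector:

```r
m <- matrix(1:12, nrow = 4)

# Each row of a two-column numeric matrix is one (row, column) pair,
# so this addresses cells (1,2) and (2,3) only.
num_ix <- cbind(1:2, 2:3)
m[num_ix]
# [1]  5 10

# A logical matrix is used as a plain logical vector and recycled over
# all 12 elements in column-major order: F T T T F T T T F T T T.
lgl_ix <- cbind(c(FALSE, TRUE), c(TRUE, TRUE))
m[lgl_ix]
# [1]  2  3  4  6  7  8 10 11 12
```

The second result matches the zmat[ix] output quoted further down, which
is why the logical version never selects "individual" cells unless the
logical matrix has the same dimensions as the object being indexed.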

Duncan's proposed changes to both behavior and documentation would
certainly address all my points of confusion. However, I agree that
numeric replacement indices for data frames may be a can of worms:
presumably silent type conversion would be required when the replaced
cells fall in columns of different types. Keeping the warnings in -- and
maybe issuing some more when the type conversion occurs -- is certainly
a good idea.
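
A minimal sketch of the type-conversion concern, assuming a small
made-up data frame (df, mask, df_na, and df_chr are illustrative names,
not from the thread). It uses a full-size logical mask, the one form of
matrix replacement that the quoted help page describes as supported for
data frames:

```r
# Illustrative mixed-type data frame: one integer and one character column.
df <- data.frame(n = 1:3, s = c("a", "b", "c"), stringsAsFactors = FALSE)

# Full-size logical mask marking cells (1,1) and (2,2).
mask <- matrix(FALSE, nrow(df), ncol(df))
mask[cbind(1:2, 1:2)] <- TRUE

# Replacing with NA keeps each column's type.
df_na <- df
df_na[mask] <- NA
sapply(df_na, class)
#           n           s
#   "integer" "character"

# Replacing with a string converts the integer column to character.
df_chr <- df
df_chr[mask] <- "x"
sapply(df_chr, class)
#           n           s
# "character" "character"
```

As far as I can tell, the second replacement changes the type of column
n without any warning, which is the kind of silent conversion that extra
warnings would flag.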

Best,
Bert

On Tue, May 1, 2012 at 12:57 PM, Nordlund, Dan (DSHS/RDA)
<NordlDJ at dshs.wa.gov> wrote:
> Bert,
>
> I think this is what is needed for the data frame
>
> ix <- cbind(1:2,2:3)
> ixm <- matrix(FALSE,4,3)
> ixm[ix] <- TRUE
> zdf[ixm] <- NA
>
> Hope this is helpful,
>
> Dan
>
> Daniel J. Nordlund
> Washington State Department of Social and Health Services
> Planning, Performance, and Accountability
> Research and Data Analysis Division
> Olympia, WA 98504-5204
>
>
>> -----Original Message-----
>> From: r-help-bounces at r-project.org
>> [mailto:r-help-bounces at r-project.org] On Behalf Of Bert Gunter
>> Sent: Tuesday, May 01, 2012 11:46 AM
>> To: Duncan Murdoch
>> Cc: r-help at r-project.org
>> Subject: Re: [R] Data frame vs matrix quirk: Hinky error message?
>>
>> Duncan:
>>
>> Maybe there **is** a bug, then.
>>
>> > zmat <- matrix(1:12,nr=4)
>> > zdf <- data.frame(zmat)
>> > ix <- cbind(c(FALSE,TRUE),c(TRUE,TRUE))
>> > zmat[ix]
>> [1] 2 3 4 6 7 8 10 11 12
>> > zdf[ix]
>> [1] 2 3 4 6 7 8 10 11 12
>> > zmat[ix] <- NA
>> > zmat
>>      [,1] [,2] [,3]
>> [1,]    1    5    9
>> [2,]   NA   NA   NA
>> [3,]   NA   NA   NA
>> [4,]   NA   NA   NA
>>
>> ## ??
>>
>> > zdf[ix] <- NA
>> Error in `[<-.data.frame`(`*tmp*`, ix, value = NA) :
>>   only logical matrix subscripts are allowed in replacement
>>
>> That matrix replacement should not work with (in general mixed type)
>> data frames seems reasonable, actually. Trying to "fix things" may not
>> be. But I leave this to you and your fellow expeRts,
>>
>> Cheers,
>> Bert
>>
>>
>> On Tue, May 1, 2012 at 11:30 AM, Duncan Murdoch
>> <murdoch.duncan at gmail.com> wrote:
>> > On 01/05/2012 2:12 PM, Bert Gunter wrote:
>> >>
>> >> Many thanks, Ista:
>> >>
>> >> I only looked in "].default" so the answer is: Alternative 4: dumb
>> >> Bert. Rap knuckles with ruler.
>> >>
>> >> Actually, indexing by a logical matrix doesn't make much sense to me
>> >> in either case, as it does not have the effect of selecting individual
>> >> elements, which is what numeric matrix indices do. But that's a matter
>> >> of usage, neither bug nor feature.
>> >>
>> >> If I had gotten something like the error message: "Matrix indices not
>> >> allowed for replacement in data frames," I would not have been
>> >> surprised. But as you said, the behavior **IS** documented.
>> >
>> >
>> > Your version is not correct:  matrix indices *are* allowed for replacement,
>> > but only logical matrix indices, not two column numerical ones.   The
>> > message might be clearer if instead of saying "only logical matrix
>> > subscripts are allowed in replacement"
>> > it said "matrix subscripts must be logical matrices in replacement", but I
>> > think the basic problem is the limitation.  I'll fix that.
>> >
>> > Duncan Murdoch
>> >
>> >>
>> >> Best,
>> >> Bert
>> >>
>> >>
>> >>
>> >> On Tue, May 1, 2012 at 10:49 AM, Ista Zahn<istazahn at gmail.com> wrote:
>> >> >  Hi Bert,
>> >> >
>> >> >  The failure itself is the documented behavior: ?'[.data.frame' says
>> >> >
>> >> >  "Matrix indexing ('x[i]' with a logical or a 2-column integer
>> >> >       matrix 'i') using '[' is not recommended, and barely supported.
>> >> >       For extraction, 'x' is first coerced to a matrix.  For
>> >> >       replacement, a logical matrix (only) can be used to select the
>> >> >       elements to be replaced in the same way as for a matrix."
>> >> >
>> >> >  The error message may be a bit hinky, as obviously data.frames can be
>> >> >  indexed by things other than logical matrices. Or is there another
>> >> >  reason this strikes you as odd?
>> >> >
>> >> >  Best,
>> >> >  Ista
>> >> >
>> >> >  On Tue, May 1, 2012 at 1:33 PM, Bert Gunter<gunter.berton at gene.com>
>> >> >  wrote:
>> >> >>  AdvisoRs:
>> >> >>
>> >> >>  Is the following a bug, feature, hinky error message, or dumb Bert?
>> >> >>
>> >> >>>  mtest<- matrix(1:12,nr=4)
>> >> >>>  dftest<- data.frame(mtest)
>> >> >>>  ix<- cbind(1:2,2:3)
>> >> >>>  mtest[ix]<- NA
>> >> >>>  mtest
>> >> >>       [,1] [,2] [,3]
>> >> >>  [1,]    1   NA    9
>> >> >>  [2,]    2    6   NA
>> >> >>  [3,]    3    7   11
>> >> >>  [4,]    4    8   12
>> >> >>
>> >> >>  ## But ...
>> >> >>>  dftest[ix]<- NA
>> >> >>  Error in `[<-.data.frame`(`*tmp*`, ix, value = NA) :
>> >> >>    only logical matrix subscripts are allowed in replacement
>> >> >>
>> >> >>  Obviously, I was expecting matrix indexing for replacement to work
>> >> >>  similarly in both cases; however, I can see why it would be
>> >> >>  problematic for data frames (mixed types), but was a bit nonplussed by
>> >> >>  the error message, which seems hinky to me.
>> >> >>
>> >> >>  Cheers,
>> >> >>  Bert
>> >> >>
>> >> >>  --
>> >> >>
>> >> >>  Bert Gunter
>> >> >>  Genentech Nonclinical Biostatistics
>> >> >>
>> >> >>  Internal Contact Info:
>> >> >>  Phone: 467-7374
>> >> >>  Website:
>> >> >>
>> >> >>  http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
>> >> >>
>> >> >>  ______________________________________________
>> >> >>  R-help at r-project.org mailing list
>> >> >>  https://stat.ethz.ch/mailman/listinfo/r-help
>> >> >>  PLEASE do read the posting guide
>> >> >> http://www.R-project.org/posting-guide.html
>> >> >>  and provide commented, minimal, self-contained, reproducible code.
>> >>
>> >>
>> >>
>> >



-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm


