[R] Problem Subsetting Rows that Have NA's
David Winsemius
dwinsemius at comcast.net
Wed Oct 25 20:17:15 CEST 2017
> On Oct 25, 2017, at 6:57 AM, BooBoo <booboo at gforcecable.com> wrote:
>
> On 10/25/2017 4:38 AM, Ista Zahn wrote:
>> On Tue, Oct 24, 2017 at 3:05 PM, BooBoo <booboo at gforcecable.com> wrote:
>>> This has every appearance of being a bug. If it is not a bug, can someone
>>> tell me what I am asking for when I ask for "x[x[,2]==0,]". Thanks.
>> You are asking for elements of x where the second column is equal to zero.
>>
>> help("==")
>>
>> and
>>
>> help("[")
>>
>> explain what happens when missing values are involved. I agree that
>> the behavior is surprising, but your first instinct when you discover
>> something surprising should be to read the documentation, not to post
>> to this list. After having read the documentation you may post back
>> here if anything remains unclear.
>>
>> Best,
>> Ista
>>
>>>> #here is the toy dataset
>>>> x <- rbind(c(1,1),c(2,2),c(3,3),c(4,0),c(5,0),c(6,NA),
>>> + c(7,NA),c(8,NA),c(9,NA),c(10,NA)
>>> + )
>>>> x
>>>       [,1] [,2]
>>>  [1,]    1    1
>>>  [2,]    2    2
>>>  [3,]    3    3
>>>  [4,]    4    0
>>>  [5,]    5    0
>>>  [6,]    6   NA
>>>  [7,]    7   NA
>>>  [8,]    8   NA
>>>  [9,]    9   NA
>>> [10,]   10   NA
>>>> #it contains rows that have NA's
>>>> x[is.na(x[,2]),]
>>>      [,1] [,2]
>>> [1,]    6   NA
>>> [2,]    7   NA
>>> [3,]    8   NA
>>> [4,]    9   NA
>>> [5,]   10   NA
>>>> #seems like an unreasonable answer to a reasonable question
>>>> x[x[,2]==0,]
>>>      [,1] [,2]
>>> [1,]    4    0
>>> [2,]    5    0
>>> [3,]   NA   NA
>>> [4,]   NA   NA
>>> [5,]   NA   NA
>>> [6,]   NA   NA
>>> [7,]   NA   NA
>>>> #this is more what I was expecting
>>>> x[which(x[,2]==0),]
>>>      [,1] [,2]
>>> [1,]    4    0
>>> [2,]    5    0
>>> ______________________________________________
>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>
> I wanted to know if this was a bug so that I could report it if so. You say it is not, so you answered my question. As far as me not reading the documentation, I challenge anyone to read the cited help pages and predict the observed behavior based on the information given in those pages.
Some of us do share (or at least remember feeling) your pain. The ?Extract page is long and complex, and there are several features that I find non-intuitive, but they are deemed desirable by others. I think I needed to read that page about ten times (each time with a different problem that needed explication) before it started to sink in. You are apparently on the same side as I am of the split opinion about returning rows for logical NA's. I've learned to use `which`, and I push back when the cognoscenti say it's not needed.
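To make the NA rows in the original output less mysterious, here is a small sketch using the toy matrix from the original post. The names `idx`, `by_logical`, `by_which` and `by_guard` are mine, chosen only for illustration; everything else is base R.

```r
# Toy matrix from the original post
x <- rbind(c(1, 1), c(2, 2), c(3, 3), c(4, 0), c(5, 0),
           c(6, NA), c(7, NA), c(8, NA), c(9, NA), c(10, NA))

# The comparison yields NA, not FALSE, wherever column 2 is missing
idx <- x[, 2] == 0

# Each NA in a logical row index produces a row filled with NA
by_logical <- x[idx, ]

# which() keeps only the positions that are TRUE, dropping the NA's
by_which <- x[which(idx), ]

# An explicit guard turns NA into FALSE and selects the same rows
by_guard <- x[!is.na(idx) & idx, ]
```

One caution if you adopt `which`: the negated form x[-which(idx), ] returns zero rows when nothing matches, because which() then returns integer(0). The guard form does not have that problem.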
After you read it a few more times you may come to a different opinion. Many people come to R with preconceived notions of what words like "equals" or "list" or "vector" mean and then complain about the documentation. You would be better advised to spend more time studying the language. The help pages are precise but terse, and you need to spend time with the examples and with other tutorial material to recognize the gotchas.
Here are a few possibly helpful rules regarding "[[" and "[" and logical indexing:
Nothing _equals_ NA: any comparison with NA, even NA == NA, yields NA, so test with is.na().
An NA in a logical index returns NA for that position; for a matrix that means a whole row of NA's. (Justified as a warning feature, as I understand it.)
Applied to a list, "[" always returns a list.
"[[" returns only one thing, but even that thing could be a list.
Generally you want "[[" when pulling a vector out of a list in order to test it for equality.
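These rules can be checked at the console. A short sketch follows; the list `lst` and its element names are made up for the illustration.

```r
# Comparing anything with NA gives NA, so test with is.na() instead
NA == NA           # NA, not TRUE
is.na(NA)          # TRUE

# A hypothetical list, for illustration only
lst <- list(a = c(0, 1, NA), b = "text")

sub <- lst["a"]    # "[" on a list: a list of length 1
el  <- lst[["a"]]  # "[[": the element itself, a numeric vector

is.list(sub)       # TRUE
is.numeric(el)     # TRUE

# Equality tests want the vector, so extract with "[["
el == 0            # TRUE FALSE NA
which(el == 0)     # 1
```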
The "R Inferno" by Burns is an effort to detail many more of the unexpected or irregular aspects of R (mostly inherited from S).
--
Best of luck in your studies.
David Winsemius
Alameda, CA, USA
'Any technology distinguishable from magic is insufficiently advanced.' -Gehm's Corollary to Clarke's Third Law