[R] NAs and row/column calculations
David Winsemius
dwinsemius at comcast.net
Fri Mar 12 05:44:36 CET 2010
On Mar 11, 2010, at 11:28 PM, David Winsemius wrote:
>
> On Mar 11, 2010, at 6:20 PM, Jim Bouldin wrote:
>
>>
>>>
>>> On 12/03/2010, at 11:25 AM, Jim Bouldin wrote:
>>>
>>>>
>>>> I continue to have great frustrations with NA values--in particular
>>>> making summary calculations on rows or cols of a matrix containing
>>>> them. For example, why does:
>>>>
>>>>> a = matrix(1:30,nrow=5)
>>>>> is.na(a[c(1:2),c(3:4)]) = TRUE;a
>>>>      [,1] [,2] [,3] [,4] [,5] [,6]
>>>> [1,]    1    6   NA   NA   21   26
>>>> [2,]    2    7   NA   NA   22   27
>>>> [3,]    3    8   13   18   23   28
>>>> [4,]    4    9   14   19   24   29
>>>> [5,]    5   10   15   20   25   30
>>>>> apply(a[!is.na(a)],2,sum)
>>>>
>>>> give me this:
>>>>
>>>> "Error in apply(a[!is.na(a)], 2, sum) : dim(X) must have a positive
>>>> length"
>>>>
>>>> when
>>>>
>>>>> dim(a)
>>>> [1] 5 6
>>>>
>>>> What is the trick to calculating summary values from rows or columns
>>>> containing NAs? Drives me nuts. More nuts that is.
>>>
>>> When you do a[!is.na(a)] you get a ***vector*** --- not a matrix.
>>> ``Obviously''!!!
>>
>> Well, obvious to you maybe, or someone who's done it before, but
>> not to me.
>>
>>> The non-missing values of a cannot be arranged in
>>> a 5 x 6 matrix; there are only 26 of them. So (as my late Uncle
>>> Stanley would have said) ``What the hell do you expect?''.
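To see this at the console, here is a minimal sketch that rebuilds the 5 x 6 matrix from the original post (the name `kept` is mine, purely for illustration). Indexing a matrix with a logical matrix of the same shape returns a plain vector, so there is no dim attribute left for apply to work with:

```r
a <- matrix(1:30, nrow = 5)
a[1:2, 3:4] <- NA          # same four NAs as in the original example

kept <- a[!is.na(a)]       # logical-matrix indexing drops the dim attribute
is.null(dim(kept))         # TRUE: kept is a plain vector
length(kept)               # 26 = 30 cells minus the 4 NAs
dim(a)                     # 5 6: the original matrix is untouched
```

That is why dim(a) still reports 5 6 while apply complains: the object handed to apply is no longer `a`.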
>>
>> Silly me, I expected, based on (1) previous experience doing summary
>> calcs on subsets of a matrix using exactly that style of command, and
>> (2) the fact that dim(a) returns: [1] 5 6, and (3) the fact that a help
>> search under the "apply" function gives NO INDICATION of any possible
>> use of the na.rm command,
>
> Not really true. You may be at a stage where you are not paying
> attention to what the "..." arguments to functions are doing, so
> you may have passed over the fact that it is described as "optional
> arguments to FUN." Now, in fairness to the apply help page authors, it
> would be impossible to list all of the possible optional arguments,
> because the range of possible functions is, while countable, still
> extremely large. I think it would be useful for that help page to say
> a bit more about what restrictions may exist here and to include an
> example that uses that facility, but I am not part of R Core.
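A small sketch of what "optional arguments to FUN" means in practice, using the matrix from the original post. The names `via_dots` and `via_wrapper` are only for illustration. apply has no na.rm argument of its own; anything it does not recognise is handed on to FUN, so the two calls below should be equivalent:

```r
a <- matrix(1:30, nrow = 5)
a[1:2, 3:4] <- NA

# na.rm = TRUE is not an argument of apply(); it travels through '...' to sum()
via_dots <- apply(a, 2, sum, na.rm = TRUE)

# The same computation with the forwarding spelled out in a wrapper function
via_wrapper <- apply(a, 2, function(col) sum(col, na.rm = TRUE))

identical(via_dots, via_wrapper)   # TRUE
```

The wrapper form is longer, but it makes clear which function actually receives na.rm.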
>
>
>> AND (4) a help search on "na.action" does not even mention na.rm, that:
>>
>>> apply(a[!is.na(a)],2,sum)
>>
>> would sum the non-NA elements of matrix a, by columns. Terribly faulty
>> reasoning on my part, obviously.
>
> What, may I inquire, happens when you look at the help page for
> "sum"? While you are at it, you may want to acquaint yourself with
> the "na.rm=" parameter in other functions, because it is also
> essential for productive use of several other useful functions, like
> median and density.
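For the record, a short sketch of na.rm at work in the functions named above, on a made-up vector `x` (the data are illustrative only, not from the original post):

```r
x <- c(2, 4, NA, 8, 10)

sum(x)                    # NA: a single missing value propagates to the total
sum(x, na.rm = TRUE)      # 24
median(x, na.rm = TRUE)   # 6, the median of 2, 4, 8, 10

# density() stops with an error on missing values unless na.rm = TRUE is given
d <- density(x, na.rm = TRUE)
class(d)                  # "density"
```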
As a further exercise you may want to follow this path. (I learned new
bits.) After getting annoyed that neither ?"na.rm" nor ??"na.rm"
provided any 'help', I tried the sos package:
> ??"na.rm"
No help files found matching ‘na.rm’ using regexp matching
> library(sos)
Loading required package: brew
Attaching package: 'sos'
The following object(s) are masked from package:utils :
?
> ???"na.rm"
found 476 matches; retrieving 20 pages, 400 matches.
2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
>
>>
>>
>>>
>>> The ``trick'' is to remove the NAs at the summing stage:
>>>
>>> apply(a,2,sum,na.rm=TRUE)
>>>
>>> Not all that tricky.
>>>
>>> cheers,
>>>
>>> Rolf Turner
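A variation that was not part of Rolf's answer: for plain sums, base R also has colSums and rowSums, which take na.rm directly. A minimal sketch on the same matrix (the names `by_apply` and `by_colSums` are illustrative):

```r
a <- matrix(1:30, nrow = 5)
a[1:2, 3:4] <- NA

by_apply   <- apply(a, 2, sum, na.rm = TRUE)   # Rolf's version
by_colSums <- colSums(a, na.rm = TRUE)         # returns doubles rather than integers

all.equal(by_apply, by_colSums)   # TRUE
by_colSums                        # 15 40 42 57 115 140
rowSums(a, na.rm = TRUE)          # the row-wise counterpart
```

The apply form remains the more general one, since it works with any summary function that accepts na.rm.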
>
> David Winsemius, MD
> West Hartford, CT
David Winsemius, MD
West Hartford, CT