[R] Vectorized forms of isTRUE, identical and all.equal?

Robin Evans rje42 at stat.washington.edu
Thu Apr 8 02:37:14 CEST 2010


On 7 April 2010 16:27, Steve Lianoglou <mailinglist.honeypot at gmail.com> wrote:
> On Wed, Apr 7, 2010 at 7:16 PM, Robin Evans <rje42 at stat.washington.edu> wrote:
>> On 7 April 2010 16:12, Steve Lianoglou <mailinglist.honeypot at gmail.com> wrote:
>>> Hi,
>>>
>>> On Wed, Apr 7, 2010 at 5:44 PM, Robin Evans <rje42 at stat.washington.edu> wrote:
>>>> Dear all,
>>>>
>>>> I'm wondering if there exist vectorized forms of 'isTRUE()',
>>>> 'identical()' and 'all.equal()'.  My problem is that I wish to test if
>>>> each element of a vector is equal to a particular value (or
>>>> numerically close), whilst dealing carefully with NAs and so on.
>>>> However, using sapply() with identical() is very slow because it makes
>>>> so many separate function calls:
>>>>
>>>> x = rbinom(1e4, 1, 0.5)
>>>>
>>>> system.time(sapply(x, function(x) isTRUE(all.equal(x, 0))))
>>>>
>>>> system.time(abs(x)  < .Machine$double.eps^0.5)
>>>>
>>>> The latter version is fast, but potentially dangerous.  Any suggestions?
>>>
>>> Why is it dangerous? Because some values in x can be NA?
>>>
>> Precisely - I would like all the answers to be TRUE or FALSE.
>
> What should happen when a value in x is NA, though?
>
> Here's a function for you:
>
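> almost.equal <- function(x, y, tolerance = .Machine$double.eps^0.5,
>                          na.value = TRUE)
> {
>   # Start from the default answer, used wherever x is NA
>   answer <- rep(na.value, length(x))
>   test <- !is.na(x)
>   # Recycle y to the length of x so both can be subset together
>   y <- rep(y, length.out = length(x))
>   answer[test] <- abs(x[test] - y[test]) < tolerance
>   answer
> }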
>
> Now set `na.value` to whatever you want the answer to default to in
> the locations where x is NA.
>
> Be careful about the types of things you pass in as x and y:
>
> R> head(x)
> [1]    NA 1e+00 1e-16 0e+00 1e+00 1e+00
>
> R> head(almost.equal(x, 0))
> [1]  TRUE FALSE  TRUE  TRUE FALSE FALSE
>
> R> head(almost.equal(x, 0, na.value=FALSE))
> [1] FALSE FALSE  TRUE  TRUE FALSE FALSE
>
I wanted anything other than a number close to 0 to return FALSE.

Thanks - this all sounds reasonable; I just figured that there might
be a function already in the libraries.
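
For reference, a minimal base-R sketch of the behaviour described above, where anything other than a number close to the target returns FALSE. The name `is_near` and its arguments are hypothetical, not taken from any package. It compares absolute differences, so it only approximates what all.equal() does for targets far from 0.

```r
# Hypothetical helper: the name and arguments are illustrative only.
is_near <- function(x, target = 0, tolerance = sqrt(.Machine$double.eps)) {
  # Vectorized comparison; NA and NaN inputs give NA at this point.
  out <- abs(x - target) < tolerance
  # Turn every non-comparable element into FALSE so the result
  # contains only TRUE or FALSE.
  out[is.na(out)] <- FALSE
  out
}

x <- c(NA, 1, 1e-16, 0, NaN, Inf, -1e-9)
is_near(x)
# [1] FALSE FALSE  TRUE  TRUE FALSE FALSE  TRUE
```

Because the comparison is done once on the whole vector, this avoids the per-element function calls that make the sapply() version slow.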

Rob
> --
> Steve Lianoglou
> Graduate Student: Computational Systems Biology
>  | Memorial Sloan-Kettering Cancer Center
>  | Weill Medical College of Cornell University
> Contact Info: http://cbio.mskcc.org/~lianos/contact
>
>



-- 
Robin Evans
Statistics Department
University of Washington
www.stat.washington.edu/~rje42


