[R] if else statement for rain data to define zero for dry and one to wet
William Dunlap
wdunlap at tibco.com
Sat Jun 6 23:48:13 CEST 2015
Your f1() has an unneeded for loop in it.
f1a <- function(mat) ifelse(mat > 0.1, 1, 0)
would do the same thing in a bit less time.
However, I think that a simple
mat > 0.1
would be preferable. The resulting TRUEs and FALSEs
are easier to interpret than the 1s and 0s that f1a()
produces, and arithmetic functions treat TRUE
as 1 and FALSE as 0 internally. E.g., mean(mat > 0.1)
gives the proportion of wet(tish) days.
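To make that concrete, here is a small sketch. The matrix `rain` below is made up for illustration (it is not the poster's data); 0.1 is the threshold from the thread.

```r
# Toy rainfall matrix: 4 days (rows) by 3 years (columns); values are invented.
rain <- matrix(c(0.0,  0.0,  25.3, 12.7,
                 0.0,  0.05,  6.7,  0.0,
                 14.3, 21.9,  0.1,  0.0),
               ncol = 3,
               dimnames = list(NULL, c("X1950", "X1951", "X1952")))

wet <- rain > 0.1           # logical matrix, same dim and dimnames as rain

prop_wet   <- mean(wet)     # overall proportion of wet days
wet_by_col <- colSums(wet)  # number of wet days in each column
wet_num    <- wet + 0L      # integer 0/1 matrix, if numbers are really needed
```

The logical matrix can also be used directly as an index, e.g. rain[wet] picks out the wet-day amounts.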
Bill Dunlap
TIBCO Software
wdunlap tibco.com
On Sat, Jun 6, 2015 at 1:55 PM, Dennis Murphy <djmuser at gmail.com> wrote:
> I'm sorry, but I have to take issue with this particular use case of
> ifelse(). When the goal is to generate a logical vector, ifelse() is
> very inefficient. It's better to apply a logical condition directly to
> the object in question and multiply the result by 1 to make it
> numeric/integer rather than logical.
>
> To illustrate this, consider the following toy example. The function
> f1 replicates the suggestion to apply ifelse() columnwise (with the
> additional overhead of preallocating storage for the result), whereas
> the function f2 applies the logical condition on the matrix itself
> using vectorization, with the recognition that a matrix is an atomic
> vector with a dim attribute.
>
> set.seed(5290)
>
> # 1000 x 1000 matrix
> m <- matrix(sample(c(0, 0.05, 0.2), 1e6, replace = TRUE), ncol = 1000)
>
> f1 <- function(mat)
> {
> newmat <- matrix(NA, ncol = ncol(mat), nrow = nrow(mat))
> for(i in seq_len(ncol(mat)))
> newmat[, i] <- ifelse(mat[, i] > 0.1, 1, 0)
> newmat
> }
>
> f2 <- function(mat) 1 * (mat > 0.1)
>
>
> On my system, I got
>
> > system.time(m1 <- f1(m))
> user system elapsed
> 0.14 0.00 0.14
>
> > system.time(m2 <- f2(m))
> user system elapsed
> 0.01 0.00 0.01
>
> > identical(m1, m2)
> [1] TRUE
>
> The all too common practice of using ifelse(condition, 1, 0) on an
> atomic vector is easily replaced by 1 * (condition), where the result
> of condition is a logical atomic object coerced to numeric.
>
> To reduce memory use, it would be better to define f2 as
>
> f2 <- function(mat) 1L * (mat > 0.1)
>
> but doing so in this example no longer creates identical objects since
>
> > typeof(m1)
> [1] "double"
>
> Thus, f1 is not only inefficient in terms of execution time, it's also
> inefficient in terms of storage.
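One way to check the storage point is to compare types and sizes directly. This sketch applies the ifelse() call from f1 and the integer version of f2 to a smaller made-up matrix; `m_small`, `d` and `i` are names chosen here, and exact byte counts vary by R version and platform, so only the comparison is shown.

```r
set.seed(1)  # arbitrary seed for this illustration
m_small <- matrix(sample(c(0, 0.05, 0.2), 1e4, replace = TRUE), ncol = 100)

d <- ifelse(m_small > 0.1, 1, 0)  # double result, as in f1
i <- 1L * (m_small > 0.1)         # integer result, as in the second f2

typeof(d)    # "double"
typeof(i)    # "integer"
all(d == i)  # same values, different storage type

# Integers take roughly half the space of doubles.
as.numeric(object.size(i)) < as.numeric(object.size(d))
```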
>
> Given several recent warnings in this forum about the inefficiency of
> ifelse() and the dozens of times I've seen the idiom implemented in f1
> as a solution over the last several years (to which I have likely
> contributed in my distant past as an R-helper), I felt compelled to
> say something about this practice, which BTW extends not just to 0/1
> return values but to 0/x return values, where x is a nonzero real number.
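For the 0/x case, the same multiplication works with x in place of 1. In this sketch the vector `rain` and the value of `x` are made up for illustration.

```r
rain <- c(0, 0.05, 0.2, 3.4, 0)  # invented rainfall amounts
x <- 2.5                         # invented nonzero return value

via_ifelse   <- ifelse(rain > 0.1, x, 0)
via_multiply <- x * (rain > 0.1)

identical(via_ifelse, via_multiply)  # TRUE
```

One caveat: the shortcut assumes x is finite, since Inf * FALSE is NaN rather than 0.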
>
> Dennis
>
>
> On Sat, Jun 6, 2015 at 12:50 AM, Jim Lemon <drjimlemon at gmail.com> wrote:
> > Hi rosalinazairimah,
> > I think the problem is that you are using "if" instead of "ifelse". Try this:
> >
> > wet_dry<-function(x,thresh=0.1) {
> > for(column in 1:dim(x)[2]) x[,column]<-ifelse(x[,column]>=thresh,1,0)
> > return(x)
> > }
> > wet_dry(dt)
> >
> > and see what you get.
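For the side-by-side layout asked for in the original post, something along these lines may help. It uses the X1950 values from the post; the name `dt` comes from the post, while `out` and the column name `state` are just labels chosen here. It follows the stated rule (greater than 0.1), whereas wet_dry() above uses >=.

```r
# X1950 column as given in the original post
dt <- data.frame(X1950 = c(0.0, 0.0, 25.3, 12.7, 0.0))

# Rain amounts next to a 0/1 wet-day indicator
out <- data.frame(X1950 = dt$X1950,
                  state = as.integer(dt$X1950 > 0.1))
out
```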
> >
> > Also, why can I read your message perfectly while everybody else can't?
> >
> > Jim
> >
> >>> -----Original Message-----
> >>> From: roslinaump at gmail.com
> >>> Sent: Fri, 5 Jun 2015 16:49:08 +0800
> >>> To: r-help at r-project.org
> >>> Subject: [R] if else statement for rain data to define zero for dry and
> >>> one to wet
> >>>
> >>> Dear r-users,
> >>>
> >>> I have a set of rain data:
> >>>
> >>>   X1950 X1951 X1952 X1953 X1954 X1955 X1956 X1957 X1958 X1959 X1960 X1961 X1962
> >>> 1   0.0   0.0  14.3   0.0  13.5  13.2   4.0     0   3.3     0     0   0.0
> >>> 2   0.0   0.0  21.9   0.0  10.9   6.6   2.1     0   0.0     0     0   0.0
> >>> 3  25.3   6.7  18.6   0.8   2.3   0.0   8.0     0   0.0     0     0  11.0
> >>> 4  12.7   3.4  37.2   0.9   8.4   0.0   5.8     0   0.0     0     0   5.5
> >>> 5   0.0   0.0  58.3   3.6  21.1   4.2   3.0     0   0.0     0     0  15.9
> >>>
> >>> I would like to go through each column and set each cell with a value
> >>> greater than 0.1 mm to 1, and to zero otherwise. I would then like to
> >>> attach the rain data and the category side by side:
> >>>
> >>>
> >>> 1950 state
> >>>
> >>> 1 0.0 0
> >>>
> >>> 2 0.0 0
> >>>
> >>> 3 25.3 1
> >>>
> >>> 4 12.7 1
> >>>
> >>> 5 0.0 0
> >>>
> >>>
> >>> ...
> >>>
> >>>
> >>> This is my code:
> >>>
> >>>
> >>> wet_dry <- function(dt)
> >>>
> >>> { cl <- length(dt)
> >>>
> >>> tresh <- 0.1
> >>>
> >>>
> >>> for (i in 1:cl)
> >>>
> >>> { xi <- dt[,i]
> >>>
> >>> if (xi < tresh ) 0 else 1
> >>>
> >>> }
> >>>
> >>> dd <- cbind(dt,xi)
> >>>
> >>> dd
> >>>
> >>> }
> >>>
> >>>
> >>> wet_dry(dt)
> >>>
> >>>
> >>> Results:
> >>>
> >>>> wet_dry(dt)
> >>>
> >>> X1950 X1951 X1952 X1953 X1954 X1955 X1956 X1957 X1958 X1959 X1960
> >>> X1961
> >>> X1962 X1963 X1964 X1965 X1966 X1967 X1968 X1969 X1970 X1971 X1972 X1973
> >>> X1974 X1975 X1976 X1977
> >>>
> >>> 1 0.0 0.0 14.3 0.0 13.5 13.2 4.0 0.0 3.3 0.0 0.0
> >>> 0.0
> >>> 4.2 0.0 2.2 0.0 4.4 5.1 0 7.2 0.0 0.0 0.0 5.1
> >>> 0 0.0 0 0.3
> >>>
> >>> 2 0.0 0.0 21.9 0.0 10.9 6.6 2.1 0.0 0.0 0.0 0.0
> >>> 0.0
> >>> 8.4 0.0 4.0 0.0 4.9 0.7 0 0.0 0.0 0.0 0.0 5.4
> >>> 0 3.3 0 0.3
> >>>
> >>> 3 25.3 6.7 18.6 0.8 2.3 0.0 8.0 0.0 0.0 0.0 0.0
> >>> 11.0
> >>> 4.2 0.0 2.0 0.0 14.2 17.1 0 0.0 0.0 0.0 0.0 2.1
> >>> 0 1.7 0 4.4
> >>>
> >>> 4 12.7 3.4 37.2 0.9 8.4 0.0 5.8 0.0 0.0 0.0 0.0
> >>> 5.5
> >>> 0.0 0.0 5.4 0.0 6.4 14.9 0 10.1 2.9 143.4 0.0 6.1
> >>> 0 0.0 0 33.5
> >>>
> >>>
> >>> It does not work and gives me the original data. Why is that?
> >>>
> >>>
> >>> Thank you so much for your help.
> >>>
> >>> [[alternative HTML version deleted]]
> >>>
> >>> ______________________________________________
> >>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >>> https://stat.ethz.ch/mailman/listinfo/r-help
> >>> PLEASE do read the posting guide
> >>> http://www.R-project.org/posting-guide.html
> >>> and provide commented, minimal, self-contained, reproducible code.