[R] conditional Dataframe filling

Wed Mar 27 21:41:10 CET 2013

Dear Camilo,

You can do this:
dat1 <- structure(list(
w = c(TRUE,TRUE,TRUE,TRUE,TRUE,FALSE,FALSE,FALSE,FALSE,TRUE,TRUE,TRUE,TRUE),
x = c(NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA),
y = c(FALSE,FALSE,FALSE,FALSE,FALSE,TRUE,TRUE,FALSE,FALSE,TRUE,TRUE,TRUE,FALSE),
z = c(TRUE,TRUE,TRUE,TRUE,FALSE,TRUE,TRUE,TRUE,FALSE,TRUE,TRUE,TRUE,FALSE)),
row.names = c(NA, -13L),
class = "data.frame")

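dat1 <- t(dat1)   # transpose: w, x, y, z become rows
colnames(dat1) <- c("a","b","c","d","e","f","g","h","i","j","k","l","m")
dat1 <- as.data.frame(dat1)
dat2 <- dat1
# Only rows without any NA are processed. Within each run of equal values,
# cumsum() of the negated row counts consecutive FALSEs and stays 0 for TRUEs.
complete <- rowSums(is.na(dat1)) == 0
dat2[complete, ] <- t(apply(!dat1[complete, ], 1, function(x)
  unlist(lapply(split(x, cumsum(c(0, abs(diff(x))))), cumsum))))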

dat2
#   a  b  c  d  e  f  g  h  i  j  k  l  m
#w  0  0  0  0  0  1  2  3  4  0  0  0  0
#x NA NA NA NA NA NA NA NA NA NA NA NA NA
#y  1  2  3  4  5  0  0  1  2  0  0  0  1
#z  0  0  0  0  1  0  0  0  1  0  0  0  1
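
To see why the one-liner works, it can help to take it apart on a single
row. The following is only an illustrative sketch using row y from the data
above; the names x, grp and res are made up here and are not part of the
code in this thread.

```r
# Row "y" of the example, negated so that FALSE becomes TRUE (counted as 1)
x <- !c(FALSE,FALSE,FALSE,FALSE,FALSE,TRUE,TRUE,FALSE,FALSE,TRUE,TRUE,TRUE,FALSE)

# diff() is non-zero wherever the value changes; the running total of those
# change points gives every run of equal values its own group id
grp <- cumsum(c(0, abs(diff(x))))
grp
# 0 0 0 0 0 1 1 2 2 3 3 3 4

# cumsum() within each run: counts 1, 2, 3, ... in runs of TRUE (original
# FALSE) and stays at 0 in runs of FALSE (original TRUE)
res <- unlist(lapply(split(x, grp), cumsum), use.names = FALSE)
res
# 1 2 3 4 5 0 0 1 2 0 0 0 1
```

The last vector is the same as row y of dat2.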

If some rows contain NAs but are not entirely NA, then (if I understand correctly) you want the whole row to be NA in the output, right? For example:

datNew<- structure(list(a = c(TRUE, NA, FALSE, TRUE, TRUE), b = c(TRUE, 
NA, FALSE, TRUE, TRUE), c = c(TRUE, NA, FALSE, TRUE, FALSE), 
    d = c(TRUE, NA, FALSE, TRUE, FALSE), e = c(TRUE, NA, FALSE, 
    FALSE, NA), f = c(FALSE, NA, TRUE, TRUE, NA), g = c(FALSE, 
    NA, TRUE, TRUE, TRUE), h = c(FALSE, NA, FALSE, TRUE, FALSE
    ), i = c(FALSE, NA, FALSE, FALSE, NA), j = c(TRUE, NA, TRUE, 
    TRUE, TRUE), k = c(TRUE, NA, TRUE, TRUE, FALSE), l = c(TRUE, 
    NA, TRUE, TRUE, FALSE), m = c(TRUE, NA, FALSE, FALSE, TRUE
    )), .Names = c("a", "b", "c", "d", "e", "f", "g", "h", "i", 
"j", "k", "l", "m"), row.names = c("w", "x", "y", "z", "u"), class = "data.frame")

datNew
#      a     b     c     d     e     f     g     h     i    j     k     l     m
#w  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE TRUE  TRUE  TRUE  TRUE
#x    NA    NA    NA    NA    NA    NA    NA    NA    NA   NA    NA    NA    NA
#y FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE FALSE FALSE TRUE  TRUE  TRUE FALSE
#z  TRUE  TRUE  TRUE  TRUE FALSE  TRUE  TRUE  TRUE FALSE TRUE  TRUE  TRUE FALSE
#u  TRUE  TRUE FALSE FALSE    NA    NA  TRUE FALSE    NA TRUE FALSE FALSE  TRUE

dat2New <- datNew
complete <- rowSums(is.na(datNew)) == 0   # rows without any NA
dat2New[complete, ] <- t(apply(!datNew[complete, ], 1, function(x)
  unlist(lapply(split(x, cumsum(c(0, abs(diff(x))))), cumsum))))
# Rows that are only partly NA are set to NA entirely
nNA <- rowSums(is.na(dat2New))
dat2New[nNA != 0 & nNA != ncol(dat2New), ] <- NA
dat2New
#   a  b  c  d  e  f  g  h  i  j  k  l  m
#w  0  0  0  0  0  1  2  3  4  0  0  0  0
#x NA NA NA NA NA NA NA NA NA NA NA NA NA
#y  1  2  3  4  5  0  0  1  2  0  0  0  1
#z  0  0  0  0  1  0  0  0  1  0  0  0  1
#u NA NA NA NA NA NA NA NA NA NA NA NA NA
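
The same result can probably also be obtained with rle() and sequence(),
which avoids the nested split/lapply and treats fully and partly NA rows
alike. This is only a sketch: count_false_runs is a hypothetical helper
name, the small data frame d is made up for illustration, and the code
assumes that all columns are logical.

```r
# Hypothetical helper (not from this thread): for each row, count consecutive
# FALSE values and reset to 0 at every TRUE. Any row containing an NA is
# returned as all NA.
count_false_runs <- function(df) {
  m <- as.matrix(df)
  out <- matrix(NA_integer_, nrow(m), ncol(m), dimnames = dimnames(m))
  ok <- rowSums(is.na(m)) == 0          # rows without any NA
  for (i in which(ok)) {
    r <- rle(!m[i, ])                   # runs of the negated row
    # sequence() counts 1, 2, ... within each run; multiplying by the run
    # value zeroes the runs that were TRUE in the original data
    out[i, ] <- sequence(r$lengths) * rep(r$values, r$lengths)
  }
  as.data.frame(out)
}

# Made-up example: row x is all NA, row u is only partly NA
d <- data.frame(a = c(TRUE, NA, FALSE, TRUE),
                b = c(TRUE, NA, FALSE, NA),
                c = c(FALSE, NA, TRUE, FALSE),
                d = c(FALSE, NA, FALSE, FALSE),
                row.names = c("w", "x", "y", "u"))
res <- count_false_runs(d)
res
#   a  b  c  d
#w  0  0  1  2
#x NA NA NA NA
#y  1  2  0  1
#u NA NA NA NA
```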
A.K.

----- Original Message -----
From: Camilo Mora <cmora at dal.ca>
To: arun <smartpink111 at yahoo.com>
Cc: R help <r-help at r-project.org>
Sent: Wednesday, March 27, 2013 4:10 PM
Subject: Re: [R] conditional Dataframe filling

Thanks Arun,

Well, that is interesting. My intention was to get a data frame with
the same number of rows as the original data, with NA returned for the
rows that contain NAs (when there are NAs, often the entire row is NA).
What is interesting is that in your code with NAs, the row that has
NAs gets NAs in the output, which is what I am looking for.

I guess one solution is to subset the complete rows and then run your
line of code on them. Or is there an alternative, such as telling
cumsum to leave NAs as NAs?

Thanks again,

Camilo

Camilo Mora, Ph.D.
Department of Geography, University of Hawaii
Currently available in Colombia
Phone:   Country code: 57
          Provider code: 313
          Phone 776 2282
          From the USA or Canada you have to dial 011 57 313 776 2282
http://www.soc.hawaii.edu/mora/

Quoting arun <smartpink111 at yahoo.com>:

> Dear Camilo,
>
> How do you want to deal with the NAs?
>
> If I remove the NAs:
> dat1 <- structure(list(
> w = c(TRUE,TRUE,TRUE,TRUE,TRUE,FALSE,FALSE,FALSE,FALSE,TRUE,TRUE,TRUE,TRUE),
> x = c(NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA),
> y =  
> c(FALSE,FALSE,FALSE,FALSE,FALSE,TRUE,TRUE,FALSE,FALSE,TRUE,TRUE,TRUE,FALSE),
> z = c(TRUE,TRUE,TRUE,TRUE,FALSE,TRUE,TRUE,TRUE,FALSE,TRUE,TRUE,TRUE,FALSE)),
> row.names = c(NA, -13L),
> class = "data.frame")
>
> dat1<-t(dat1)
> colnames(dat1)<-c("a","b","c","d","e","f","g","h","i","j","k", "l","m")
> dat1<- as.data.frame(na.omit(dat1))
> dat2<-dat1
> dat2[]<-t(apply(!dat1,1,function(x)  
> unlist(lapply(split(x,cumsum(c(0,abs(diff(x))))),cumsum))))
>  dat2
> #  a b c d e f g h i j k l m
> #w 0 0 0 0 0 1 2 3 4 0 0 0 0
> #y 1 2 3 4 5 0 0 1 2 0 0 0 1
> #z 0 0 0 0 1 0 0 0 1 0 0 0 1
>
>
>  dat1
> #      a     b     c     d     e     f     g     h     i    j    k    l     m
> #w  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE TRUE TRUE TRUE  TRUE
> #y FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE FALSE FALSE TRUE TRUE TRUE FALSE
> #z  TRUE  TRUE  TRUE  TRUE FALSE  TRUE  TRUE  TRUE FALSE TRUE TRUE TRUE FALSE
>
>
> A.K.
>
>
>
>
>
> ----- Original Message -----
> From: Camilo Mora <cmora at dal.ca>
> To: arun <smartpink111 at yahoo.com>
> Cc: R help <r-help at r-project.org>
> Sent: Wednesday, March 27, 2013 3:27 PM
> Subject: Re: [R] conditional Dataframe filling
>
> Dear Arun,
>
> Thank you very much for your help with this. I did not know where to
> start looking to solve that problem, so I truly appreciate your input.
>
> The line of code you sent seems to work but it duplicates the  
> results. Do you know why that may happen?
> Below is a larger database, to which I apply your line of code.
>
> Thank you very much again,
> Camilo
>
>
> dat1 <- structure(list(
> w = c(TRUE,TRUE,TRUE,TRUE,TRUE,FALSE,FALSE,FALSE,FALSE,TRUE,TRUE,TRUE,TRUE),
> x = c(NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA),
> y =  
> c(FALSE,FALSE,FALSE,FALSE,FALSE,TRUE,TRUE,FALSE,FALSE,TRUE,TRUE,TRUE,FALSE),
> z = c(TRUE,TRUE,TRUE,TRUE,FALSE,TRUE,TRUE,TRUE,FALSE,TRUE,TRUE,TRUE,FALSE)),
> row.names = c(NA, -13L),
> class = "data.frame")
>
> dat1<-t(dat1)
> colnames(dat1)<-c("a","b","c","d","e","f","g","h","i","j","k", "l","m")
>
> dat2<-dat1
>
> dat2[]<-t(apply(!dat1,1,function(x)  
> unlist(lapply(split(x,cumsum(c(0,abs(diff(x))))),cumsum))))
>
> Quoting arun <smartpink111 at yahoo.com>:
>
>> Hi,
>>
>> Just a correction:
>>
>> dat2[]<-t(apply(!dat1,1,function(x)  
>> unlist(lapply(split(x,cumsum(c(0,abs(diff(x))))),cumsum))))   
>> #should also work
>> A.K.
>>
>>
>>
>> ----- Original Message -----
>> From: arun <smartpink111 at yahoo.com>
>> To: Camilo Mora <cmora at dal.ca>
>> Cc: R help <r-help at r-project.org>
>> Sent: Wednesday, March 27, 2013 9:09 AM
>> Subject: Re: [R] conditional Dataframe filling
>>
>>
>>
>> Hi,
>> You could try:
>> dat1<- read.table(text="
>> a    b    c    d
>> TRUE  TRUE  TRUE  TRUE
>> FALSE FALSE FALSE TRUE
>> FALSE  TRUE  FALSE  FALSE
>> ",sep="",header=TRUE)
>> dat2<-dat1
>>  dat2[]<-t(apply(1*!dat1,1,function(x)  
>> unlist(lapply(split(x,cumsum(c(0,abs(diff(x))))),cumsum))))
>>  dat2
>> #  a b c d
>> #1 0 0 0 0
>> #2 1 2 3 0
>> #3 1 0 1 2
>> A.K.
>>
>>
>> ----- Original Message -----
>> From: Camilo Mora <cmora at dal.ca>
>> To: r-help at r-project.org
>> Cc:
>> Sent: Wednesday, March 27, 2013 4:31 AM
>> Subject: [R] conditional Dataframe filling
>>
>> Hi everyone:
>>
>> This may be trivial but I just have not been able to figure it out.
>>
>> Imagine the following dataframe:
>> a     b     c     d
>> TRUE  TRUE  TRUE  TRUE
>> FALSE FALSE FALSE TRUE
>> FALSE  TRUE  FALSE  FALSE
>>
>> I would like to create a new dataframe in which each TRUE gets 0 and
>> each FALSE gets the value of the cell to its left plus 1 (a FALSE in
>> the first column gets 1). So the result for the example above should
>> be something like:
>>
>> a     b     c     d
>> 0     0     0     0
>> 1     2     3     0
>> 1     0     1     2
>>
>> I wonder if you know how to do this.
>>
>> Thanks,
>>
>> Camilo
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>>
>
>