[R] Selecting cases from matrices stored in lists

mdvaan mathijsdevaan at gmail.com
Mon Aug 22 20:50:00 CEST 2011


Thanks Jean, changing c[[t]] to c[[year]] solved the issue.

Math


Jean V Adams wrote:
> 
>> Re: [R] Selecting cases from matrices stored in lists
>> mdvaan 
>> to:
>> r-help
>> 08/22/2011 09:46 AM
>> 
>> Jean V Adams wrote:
>> > 
>> >> [R] Selecting cases from matrices stored in lists
>> >> mdvaan 
>> >> to:
>> >> r-help
>> >> 08/22/2011 07:24 AM
>> >> 
>> >> Hi,
>> >> 
>> >> I have two lists (c and h - see below) containing matrices with 
> similar
>> >> cases but different values. I want to split these matrices into 
> multiple
>> >> matrices based on the values in h. So, I did the following:
>> >> 
>> >> years<-c(1997:1999) 
>> >> for (t in 1:length(years)) 
>> >>         { 
>> >>         year=as.character(years[t]) 
>> >>         h[[year]]<-sapply(colnames(h[[year]]), function(var)
>> >> h[[year]][h[[year]][,var]>0, h[[year]][var,]>0]) 
>> >>         } 
>> >> 
>> >> Now that I have created list h (with split matrices), I would like to 
> 
>> > use
>> >> these selections to make similar selections in list c. List c needs 
> to 
>> > get
>> >> the exact same shape as h, so that `8026`in 1997 (c$`1997`$`8026`) 
> looks
>> >> like this: 
>> >> 
>> >> $`1997`$`8026` 
>> >>       B 
>> >> B      8025 8026 8029 
>> >>   8025   1.0000000 0.7739527 0.9656091 
>> >>   8026   0.7739527 1.0000000 0.7202771 
>> >>   8029   0.9656091 0.7202771 1.0000000 
>> >> 
>> >> Can anyone help me doing this? I have no idea how I can get it to 
> work.
>> >> Thank you very much for your help! 
>> >> 
>> > 
>> > Try this:
>> > 
>> > c2 <- h
>> > years <- names(h)
>> > for (t in seq(years))
>> >         { 
>> >         year <- years[t]
>> >         c2[[year]] <- sapply(colnames(h[[year]]), function(var) 
>> >                 c[[t]][h[[year]][ ,var] > 0, h[[year]][var, ] > 0]) 
>> >         }
>> > 
>> > By the way, it's great that you included code in your question.
>> > However, I encountered a couple of errors when running you code (see 
>> > below).
>> > 
>> > Also, it would be better to use a different name for your list "c", 
>> > because c() is a function in R.
>> > 
>> > Jean
>> > 
>> >> 
>> >> library(zoo) 
>> >> DF1 = data.frame(read.table(textConnection("    B  C  D  E  F  G 
>> >> 8025  1995  0  4  1  2 
>> >> 8025  1997  1  1  3  4 
>> >> 8026  1995  0  7  0  0 
>> >> 8026  1996  1  2  3  0 
>> >> 8026  1997  1  2  3  1 
>> >> 8026  1998  6  0  0  4 
>> >> 8026  1999  3  7  0  3 
>> >> 8027  1997  1  2  3  9 
>> >> 8027  1998  1  2  3  1 
>> >> 8027  1999  6  0  0  2 
>> >> 8028  1999  3  7  0  0 
>> >> 8029  1995  0  2  3  3 
>> >> 8029  1998  1  2  3  2 
>> >> 8029  1999  6  0  0  1"),head=TRUE,stringsAsFactors=FALSE)) 
>> >> 
>> >> a <- read.zoo(DF1, split = 1, index = 2, FUN = identity) 
>> >> sum.na <- function(x) if (any(!is.na(x))) sum(x, na.rm = TRUE) else 
> NA 
>> >> b <- rollapply(a, 3,  sum.na, align = "right", partial = TRUE) 
>> > 
>> > Error in FUN(cdata[st, i], ...) : unused argument(s) (partial = TRUE)
>> > 
>> > rollapply() has no argument partial.
>> > 
>> >> newDF <- lapply(1:nrow(b), function(i) 
>> >>               prop.table(na.omit(matrix(b[i,], nc = 4, byrow = TRUE, 
>> >>                 dimnames = list(unique(DF1$B), names(DF1)[-1:-2]))), 
> 1)) 
>> > 
>> >> names(newDF) <- time(a) 
>> > 
>> > Error in names(newDF) <- time(a) : 
>> >   'names' attribute [5] must be the same length as the vector [3]
>> > 
>> > newDF has only 3 names, but time(a) is of length 5.
>> > 
>> >> c<-lapply(newDF, function(mat) tcrossprod(mat / 
> sqrt(rowSums(mat^2)))) 
>> >> 
>> >> DF2 = data.frame(read.table(textConnection("  A  B  C 
>> >> 80  8025  1995 
>> >> 80  8026  1995 
>> >> 80  8029  1995 
>> >> 81  8026  1996 
>> >> 82  8025  1997 
>> >> 82  8026  1997 
>> >> 83  8025  1997 
>> >> 83  8027  1997 
>> >> 90  8026  1998 
>> >> 90  8027  1998 
>> >> 90  8029  1998 
>> >> 84  8026  1999 
>> >> 84  8027  1999 
>> >> 85  8028  1999 
>> >> 85  8029  1999"),head=TRUE,stringsAsFactors=FALSE)) 
>> >> 
>> >> e <- function(y) crossprod(table(DF2[DF2$C %in% y, 1:2])) 
>> >> years <- sort(unique(DF2$C)) 
>> >> f <- as.data.frame(embed(years, 3)) 
>> >> g<-lapply(split(f, f[, 1]), e) 
>> >> h<-lapply(g, function (x) ifelse(x>0,1,0)) 
>> >    [[alternative HTML version deleted]]
>> > 
>> > ______________________________________________
>> > R-help at r-project.org mailing list
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide
>> > http://www.R-project.org/posting-guide.html
>> > and provide commented, minimal, self-contained, reproducible code.
>> > 
>> 
>> Sorry, I am using the devel version of zoo which allows you to use the
>> "partial" argument. The correct code is given below. 
> 
> My error.  I didn't have the latest version installed.
> 
>> 
>> I didn't get your suggestion to work. If I understand what you are 
> trying to
>> do (multiplying c and h), this is likely to give the wrong results 
> because h
>> contains values of 0. Since I am ultimately interested in the values of 
> the
>> split matrices in c (based on the original matrices in c), this will
>> probable not work. Or am I just not understanding you? 
> 
> I'm not doing any multiplication.  I just applied your extraction
>         [h[[year]][ ,var] > 0, h[[year]][var, ] > 0]
> to the c list rather than the h list.
> 
> You say you didn't get it to work.  Did you get an error message?  Or did 
> it run, but not give you the values you wanted?  Or ... ?
> 
> Jean
> 
>> 
>> Thanks! 
>> 
>> # devel version of zoo
>> install.packages("zoo", repos = "http://r-forge.r-project.org")
>> library(zoo)
>> DF1 = data.frame(read.table(textConnection("    B  C  D  E  F  G 
>> 8025  1995  0  4  1  2 
>> 8025  1997  1  1  3  4 
>> 8026  1995  0  7  0  0 
>> 8026  1996  1  2  3  0 
>> 8026  1997  1  2  3  1 
>> 8026  1998  6  0  0  4 
>> 8026  1999  3  7  0  3 
>> 8027  1997  1  2  3  9 
>> 8027  1998  1  2  3  1 
>> 8027  1999  6  0  0  2 
>> 8028  1999  3  7  0  0 
>> 8029  1995  0  2  3  3 
>> 8029  1998  1  2  3  2 
>> 8029  1999  6  0  0  1"),head=TRUE,stringsAsFactors=FALSE)) 
>> 
>> a <- read.zoo(DF1, split = 1, index = 2, FUN = identity) 
>> sum.na <- function(x) if (any(!is.na(x))) sum(x, na.rm = TRUE) else NA 
>> b <- rollapply(a, 3,  sum.na, align = "right", partial = TRUE) 
>> newDF <- lapply(1:nrow(b), function(i) 
>>               prop.table(na.omit(matrix(b[i,], nc = 4, byrow = TRUE, 
>>                 dimnames = list(unique(DF1$B), names(DF1)[-1:-2]))), 1)) 
> 
>> names(newDF) <- time(a) 
>> c<-lapply(newDF, function(mat) tcrossprod(mat / sqrt(rowSums(mat^2)))) 
>> 
>> DF2 = data.frame(read.table(textConnection("  A  B  C 
>> 80  8025  1995 
>> 80  8026  1995 
>> 80  8029  1995 
>> 81  8026  1996 
>> 82  8025  1997 
>> 82  8026  1997 
>> 83  8025  1997 
>> 83  8027  1997 
>> 90  8026  1998 
>> 90  8027  1998 
>> 90  8029  1998 
>> 84  8026  1999 
>> 84  8027  1999 
>> 85  8028  1999 
>> 85  8029  1999"),head=TRUE,stringsAsFactors=FALSE)) 
>> 
>> e <- function(y) crossprod(table(DF2[DF2$C %in% y, 1:2])) 
>> years <- sort(unique(DF2$C)) 
>> f <- as.data.frame(embed(years, 3)) 
>> g<-lapply(split(f, f[, 1]), e) 
>> h<-lapply(g, function (x) ifelse(x>0,1,0))
>> 
>> years<-c(1997:1999) 
>> for (t in 1:length(years)) 
>>         { 
>>         year=as.character(years[t]) 
>>         h[[year]]<-sapply(colnames(h[[year]]), function(var)
>> h[[year]][h[[year]][,var]>0, h[[year]][var,]>0]) 
>>         } 
> 
> 	[[alternative HTML version deleted]]
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 


--
View this message in context: http://r.789695.n4.nabble.com/Selecting-cases-from-matrices-stored-in-lists-tp3759597p3760909.html
Sent from the R help mailing list archive at Nabble.com.



More information about the R-help mailing list