[R] using 'apply' to apply princomp to an array of datasets

Thu Dec 13 01:18:19 CET 2012

Thank you, Rui!   This is incredibly helpful -- anonymous functions
are new to me, and I appreciate being shown how useful they are.

Best regards,
David

On Wed, Dec 12, 2012 at 10:12 AM, Rui Barradas <ruipbarradas at sapo.pt> wrote:
> Hello,
>
> As for the first question try
>
> scoreset <- lapply(pcl, function(x) x$scores[, 1])
> do.call(cbind, scoreset)
>
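For anyone reading this in the archive, here is a minimal end-to-end sketch of the first answer. It reuses the 4 x 3 x 2 example from the original question below; the seed and the names `fit` and `score1matrix2` are arbitrary choices for illustration, not part of the thread.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible
datasets <- array(sample(1:20, 24, replace = TRUE), dim = c(4, 3, 2))

# One princomp fit per dataset: with MARGIN = 2, apply() passes each
# 4 x 2 slice datasets[, j, ] to princomp and returns a list of fits
pcl <- apply(datasets, 2, princomp)

# Extract the first-component scores from each fit ...
scoreset <- lapply(pcl, function(fit) fit$scores[, 1])
# ... and bind the three length-4 vectors into a 4 x 3 matrix
score1matrix <- do.call(cbind, scoreset)

# sapply() should give the same matrix in one step, because every
# anonymous-function call returns a vector of the same length
score1matrix2 <- sapply(pcl, function(fit) fit$scores[, 1])

dim(score1matrix)
```

The lapply/do.call form keeps the intermediate list around, which is handy if the score vectors are needed separately later.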
>
> As for the second question, you want to know which columns in 'datasets'
> have NA's?
>
> colidx <- apply(datasets, 2, function(x) any(is.na(x)))
> datasets[, colidx, , drop = FALSE]  # These have NA's
>
>
> For the column numbers you can do
>
> colnums <- which(colidx)
>
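A small sketch of the second answer, assuming the same example array with one missing value planted by hand. The names `with_na` and `complete` are made up for illustration. Because `datasets` is three-dimensional, the subset needs an index slot for each dimension, and `drop = FALSE` keeps the result a 3D array even when only one dataset is selected.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible
datasets <- array(sample(1:20, 24, replace = TRUE), dim = c(4, 3, 2))
datasets[1, 2, 1] <- NA  # plant a missing value in dataset 2 (illustration only)

# TRUE for every dataset (dimension 2) that contains at least one NA
colidx <- apply(datasets, 2, function(x) any(is.na(x)))
colnums <- which(colidx)  # positions of those datasets

# Datasets with NA's, and the remaining complete ones
with_na  <- datasets[, colidx, , drop = FALSE]
complete <- datasets[, !colidx, , drop = FALSE]

# princomp can then be run on the complete datasets only
pcl <- apply(complete, 2, princomp)
```

Negating the logical index, as in `!colidx`, is one way to skip the datasets that would otherwise need their NA's handled before fitting.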
> Hope this helps,
>
> Rui Barradas
>
> On 12-12-2012 17:14, David Romano wrote:
>>
>> Hi everyone,
>>
>> Suppose I have a 3D array of datasets, where say dimension 1 corresponds
>> to cases, dimension 2 to datasets, and dimension 3 to observations within
>> a dataset.  As an example, suppose I do the following:
>>
>>> x <- sample(1:20, 24, replace=TRUE)
>>> datasets <- array(x, dim=c(4,3,2))
>>
>> Here, for each j=1,2,3, I'd like to think of datasets[,j,] as a single
>> data matrix with four cases and two observations.  Now, I'd like to be
>> able to do the following: apply PCA to each dataset, and create a matrix
>> of the first principal component scores.
>>
>> In this example, I could do:
>>
>>> pcl<-apply(datasets,2,princomp)
>>
>> which yields a list of princomp output, one for each dataset, so that the
>> vector of first principal component scores for dataset 1 is obtained by
>>
>>> score1set1 <- pcl[[1]]$scores[,1]
>>
>> and I could then obtain the desired matrix by
>>
>>> score1matrix <- cbind( score1set1, score1set2, score1set3)
>>
>>
>> So my first question is: 1) how could I use *apply to do this?  I'm having
>> trouble because pcl is a list of lists, so I can't use, say,
>> do.call(cbind, ...) without first having a list of the first component
>> score vectors, which I'm not sure how to produce.
>>
>> My second question is: 2) Having answered question 1), now suppose there
>> may be datasets containing NA values -- how could I select the subset of
>> values from dimension 2 corresponding to the datasets for which this is
>> true (again using *apply?)?
>>
>> Thanks in advance for any light you might be able to shed on these
>> questions!
>>
>> David Romano
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
>