[R] question about getting things out of an lapply

Sun Jul 10 19:08:21 CEST 2011

Dear Josh,

thank you so much for working this out for me - it works really well now!

My only remaining problem is that I can't seem to pass the gradients to
the edge.color argument of plot.phylo() for my tree, but I've asked the
relevant list (R-sig-phylo) for some troubleshooting with that.

Many thanks :)!
Annemarie

Joshua Wiley wrote:
> Dear Annemarie,
>
> Look at what you are passing to your function:
>
> lapply(tree$edge, print)
>
> I am guessing you want to be passing:
>
> lapply(1:nrow(tree$edge), print)
>
> so that you are using each row of tree$edge.  Also, take a look at the
> code below for some examples of ways you can simplify (and vastly
> speed up) your task if what I said above is true.  The final version
> does what your function does and saves the results in just three lines,
> without even creating a special function.
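
To see why the original call fails: lapply() over a matrix visits every single element, not every row. The function therefore receives node numbers as its index, and any node number larger than nrow(tree$edge) triggers "subscript out of bounds". The following is a minimal base-R sketch of that behaviour. The small edge matrix is hand-made (a stand-in for tree$edge), and get_parent() is a hypothetical helper that does not appear in the thread.

```r
# Hand-made edge matrix for a 4-tip tree (stand-in for tree$edge):
# column 1 = parent node, column 2 = child node, nodes numbered 1..7
edge <- matrix(c(5, 6, 6, 5, 7, 7,
                 6, 1, 2, 7, 3, 4), ncol = 2)

# Hypothetical helper: look up the parent node stored in row i
get_parent <- function(i) edge[i, 1]

length(edge)  # 12 elements, which is what lapply(edge, ...) iterates over
nrow(edge)    # 6 rows, which is what the function expects as index

# Iterating over the elements: node number 7 is used as a row index
# and fails, because the matrix only has 6 rows
by_element <- tryCatch(lapply(edge, get_parent), error = function(e) e)

# Iterating over row numbers works as intended
by_row <- lapply(seq_len(nrow(edge)), get_parent)
```

The same reasoning would apply to a tree from rtree(15): tree$edge has 28 rows but contains node numbers up to 29, so the call fails as soon as 29 is used as a row index.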
>
> Hope this helps,
>
> Josh
>
> ################################
> library(ape)
> library(plotrix)
>
> ## Shortened data creation
> tree <- rtree(15)
> data <- rnorm(15, mean = 0.5, sd = 0.15)
> data <- c(data, ace(data, tree)$ace)
> names(data) <- NULL
>
> ## original function
> create.gradient <- function(i){
> colorgrad01<-color.scale(seq(0,1,by=0.01), extremes=c("red","blue"))
>
> tree$edge[i,1] -> x
> tree$edge[i,2] -> y
> print(x)
> print(y)
> data[x] -> z
> data[y] -> z2
>
> round(z, digits = 2) -> z
> round(z2, digits = 2) -> z2
> z*100 -> z
> z2*100 -> z2
> print(z)
> print(z2)
> colorgrad<-colorgrad01[z:z2]
> colorgrad
> }
>
> ## Store results
> ## note that rather than using tree$edge directly
> ## I am using 1:nrow(tree$edge), I think that is what you want
> out <- lapply(1:nrow(tree$edge), create.gradient)
>
>
> ## simplified version of above function
> create.gradient2 <- function(i) {
>   colorgrad01 <- color.scale(seq(0, 1, by = 0.01), extremes = c("red", "blue"))
>   z <- round(data[tree$edge[i, ]], 2) * 100
>   print(z)
>   colorgrad01[z[1]:z[2]]
> }
>
> ## Store results
> out2 <- lapply(1:nrow(tree$edge), create.gradient2)
>
>
> ## even more simplified
> colours <- color.scale(seq(0, 1, by = 0.01), extremes = c("red", "blue"))
> index <- matrix((round(data, 2) * 100)[tree$edge], ncol = 2)
> ## store results
> out3 <- lapply(1:nrow(tree$edge), function(x) colours[index[x, 1]:index[x, 2]])
>
> ## test whether results of all three ways are identical
> all(identical(out, out2), identical(out, out3))
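
One caveat about the indexing used in all three versions, offered as a hedged aside rather than a correction of the results above: round(data, 2) * 100 yields values from 0 to 100, whereas the 101 colours are indexed 1 to 101, and data outside [0, 1] would fall off the scale entirely. Floating-point products can also land just below a whole number, which indexing then truncates. The sketch below shows one possible guard. colour_index() and palette_size are hypothetical names, and colorRampPalette() from base R stands in for color.scale().

```r
# Stand-in palette built with base R (assumption: any 101-colour vector works)
palette_size <- 101L
colours <- colorRampPalette(c("red", "blue"))(palette_size)

# Map values in [0, 1] to integer indices 1..n, clamping anything outside
colour_index <- function(x, n = palette_size) {
  idx <- as.integer(round(x * (n - 1L))) + 1L  # round after scaling, then shift to 1-based
  pmin(pmax(idx, 1L), n)
}

idx <- colour_index(c(0, 0.5, 0.57, 1, 1.2, -0.1))

# Partial gradient between two values, e.g. the two ends of one edge
segment <- colours[colour_index(0.25):colour_index(0.30)]
```

Note that this shifts every index by one relative to the code in the thread, so its output would not be identical to out, out2 or out3.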
>
> ## So what's the value of the different versions?
> ## Besides simplicity of code, here is how long 1000 replications
> ## of each version took on my (rather slow) laptop
>
> ## Version 1 (original)
>    user  system elapsed
>   61.76    0.27   62.42
> ## Version 2 (simplified quantity of code)
>    user  system elapsed
>   54.82    0.25   59.26
> ## Version 3 (almost completely vectorized)
>    user  system elapsed
>    0.42    0.01    0.45
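
The thread does not show how these timings were produced. One plausible way, sketched here under that assumption, is to wrap repeated calls in system.time(). The example uses base R only: colorRampPalette() stands in for color.scale(), the index matrix is made up, and slow_version() and fast_version() are hypothetical names. It also illustrates a likely reason for the gap: the slow variant rebuilds the full colour scale on every call, while the fast one builds it once.

```r
set.seed(42)
# Made-up start/end indices for 28 edges (stand-in for the real index matrix)
index <- matrix(sample(1:101, 56, replace = TRUE), ncol = 2)

# Rebuilds the colour scale for every edge, like versions 1 and 2
slow_version <- function(i) {
  full_scale <- colorRampPalette(c("red", "blue"))(101)
  full_scale[index[i, 1]:index[i, 2]]
}

# Builds the colour scale once and only subsets it per edge, like version 3
colours <- colorRampPalette(c("red", "blue"))(101)
fast_version <- function(i) colours[index[i, 1]:index[i, 2]]

rows <- seq_len(nrow(index))
reps <- 100  # the thread used 1000 replications

time_slow <- system.time(
  for (r in seq_len(reps)) out_slow <- lapply(rows, slow_version)
)
time_fast <- system.time(
  for (r in seq_len(reps)) out_fast <- lapply(rows, fast_version)
)

# Both variants should return the same gradients; only the cost differs
identical(out_slow, out_fast)
print(time_slow)
print(time_fast)
```

Absolute numbers depend on the machine, so only the ratio between the two timings is meaningful.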
>
> On Thu, Jul 7, 2011 at 4:25 AM, Annemarie Verkerk
> <annemarie.verkerk at mpi.nl> wrote:
>   
>> Dear Josh,
>>
>> thanks for pointing this out - the idea behind writing this function is
>> plotting gradients on branches of phylogenetic trees - 'tree' refers to a
>> phylogenetic tree. It's easy to create a random phylogenetic tree in R:
>>
>> library(ape)
>> library(plotrix)
>>
>> rtree(15) -> tree
>>
>> This gives you a tree with 15 taxa. You can plot it with plot() if you want
>> to take a look.
>>
>> then the data - you can create a fake data set:
>>
>> rnorm(15, mean = 0.5, sd = 0.15) -> data
>>
>> for the data which the function needs, you also need:
>>
>> ace(data, tree) -> results
>>
>> data <- append(data,results$ace)
>>
>> names(data) <- NULL
>>
>> I also tried the following updated code, but I still got the same error
>> message:
>>
>> create.gradient <- function(i){
>> colorgrad01<-color.scale(seq(0,1,by=0.01), extremes=c("red","blue"))
>> tree$edge[i,1] -> x
>> tree$edge[i,2] -> y
>> print(x)
>> print(y)
>> data[x] -> z
>> data[y] -> z2
>> round(z, digits = 2) -> z
>> round(z2, digits = 2) -> z2
>> z*100 -> z
>> z2*100 -> z2
>> print(z)
>> print(z2)
>> colorgrad<-colorgrad01[z:z2]
>> colorgrad
>> }
>>
>> lapply(tree$edge, create.gradient)
>>
>> - Error in FUN(X[[26L]], ...) : subscript out of bounds
>>
>> I hope this helps and that you can replicate the problem too.
>>
>> Thanks!
>> Annemarie
>>
>> Joshua Wiley wrote:
>>     
>>> Dear Annemarie,
>>>
>>> Can you replicate the problem using a made-up dataset or one of the
>>> ones built into R?  It strikes me as odd to pass tree1$edge directly
>>> to lapply, when it is also hardcoded into the function, but I do not
>>> have a sense exactly for what you are doing and without data it is
>>> hard to play around.
>>>
>>> Cheers,
>>>
>>> Josh
>>>
>>> On Wed, Jul 6, 2011 at 12:31 PM, Annemarie Verkerk
>>> <annemarie.verkerk at mpi.nl> wrote:
>>>
>>>       
>>>> Dear R-help subscribers,
>>>>
>>>> I have a quite stupid question about using lapply. I have the following
>>>> function:
>>>>
>>>> create.gradient <- function(i){
>>>> colorgrad01<-color.scale(seq(0,1,by=0.01), extremes=c("red","blue"))
>>>> tree1$edge[i,1] -> x
>>>>
>>>>         
>>> this works, but it would typically be written:
>>>
>>> x <- tree1$edge[i, 1]
>>>
>>> flipping back and forth can be a smidge (about 5 pinches under an
>>> iota) confusing.
>>>
>>>
>>>       
>>>> tree1$edge[i,2] -> y
>>>> print(x)
>>>> print(y)
>>>> all2[x] -> z
>>>> all2[y] -> z2
>>>> round(z, digits = 2) -> z
>>>> round(z2, digits = 2) -> z2
>>>> z*100 -> z
>>>> z2*100 -> z2
>>>> print(z)
>>>> print(z2)
>>>> colorgrad<-colorgrad01[z:z2]
>>>> colorgrad
>>>> }
>>>>
>>>> Basically, I want to pick a partial gradient out of a bigger gradient
>>>> (colorgrad01) for values that are on row i of a matrix called tree1$edge.
>>>>
>>>> when I use lapply:
>>>>
>>>> lapply(tree1$edge, create.gradient)
>>>>
>>>> I get the following error message:
>>>>
>>>> Error in FUN(X[[27L]], ...) : subscript out of bounds
>>>>
>>>> I'm not sure what's wrong: it could be either the fact that 'colorgrad' is
>>>> a character string, i.e. consisting of multiple characters and not just
>>>> one, or the fact that 'i' doesn't come back in the object 'colorgrad' that
>>>> it has to return. Or it could be something else entirely...
>>>>
>>>> In any case, what I prefer as output is a vector with all the different
>>>> 'colorgrad's it generates with each run.
>>>>
>>>> Thanks a lot for any help you might be able to offer!
>>>> Annemarie
>>>>
>>>> --
>>>> Annemarie Verkerk, MA
>>>> Evolutionary Processes in Language and Culture (PhD student)
>>>> Max Planck Institute for Psycholinguistics
>>>> P.O. Box 310, 6500AH Nijmegen, The Netherlands
>>>> +31 (0)24 3521 185
>>>> http://www.mpi.nl/research/research-projects/evolutionary-processes
>>>>

-- 
Annemarie Verkerk, MA
Evolutionary Processes in Language and Culture (PhD student)
Max Planck Institute for Psycholinguistics
P.O. Box 310, 6500AH Nijmegen, The Netherlands
+31 (0)24 3521 185
http://www.mpi.nl/research/research-projects/evolutionary-processes