[R] how to rewrite this without a loop ?
Stijn Lievens
stijn.lievens at ugent.be
Fri Nov 19 10:37:10 CET 2004
Thomas Lumley wrote:
> On Thu, 18 Nov 2004, Stijn Lievens wrote:
>
>>
>> <code>
>> add.fun <- function(perf.data) {
>>   ss <- 0
>>   for (i in 0:29) {
>>     ss <- ss + cor(subset(perf.data, dataset == i)[3],
>>                    subset(perf.data, dataset == i)[7], method = "kendall")
>>   }
>>   ss
>> }
>> </code>
>>
>> As one can see this function uses a for-loop. Now chapter 9 of 'An
>> introduction to R' tells us that we should avoid for-loops as much as
>> possible.
>
>
>
> You don't say whether `dataset' is the name of a column in `perf.data'.
> Assuming it is, and assuming that 0:29 are all the values of `dataset'
>
> sum(by(perf.data, list(perf.data$dataset),
>        function(d) cor(d[,3],d[,7], method="kendall")))
>
> would work.
Indeed, this works. The 'by' command is exactly what I was looking for.
As far as I can tell, this useful command isn't mentioned in 'An
introduction to R'.
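For readers of the archive, here is a minimal sketch checking that the loop
and the by() one-liner give the same sum. The data frame is made up (three
datasets instead of thirty, random values, default column names); only the
two computations come from the thread, and the loop is adapted to 0:2 and to
a single subset() call per iteration.

```r
set.seed(1)
# Made-up data: 3 datasets of 10 rows each, 7 columns in total
perf.data <- data.frame(dataset = rep(0:2, each = 10),
                        matrix(rnorm(30 * 6), ncol = 6))

# Loop version from the original post, adapted to 0:2
ss <- 0
for (i in 0:2) {
  d <- subset(perf.data, dataset == i)
  ss <- ss + cor(d[, 3], d[, 7], method = "kendall")
}

# by() version from Thomas's reply
ss.by <- sum(by(perf.data, list(perf.data$dataset),
                function(d) cor(d[, 3], d[, 7], method = "kendall")))

# Compare the two sums
all.equal(ss, ss.by)
```

Both versions visit the groups in the same order, so the two sums should
agree up to floating-point tolerance.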
> If this is faster it will be because you don't call
> subset() twice per iteration, rather than because you are avoiding a
> loop. However it has other benefits: it doesn't have the variable `i',
> it doesn't have to change the value of `ss', and it doesn't have the
> range of `dataset' hard-coded into it. These are all clarity
> optimisations.
>
In fact I don't care too much about speed at the moment, but a one-line
statement is more convenient to type (and recall) in the command line
interface than a multi-line statement.
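On the point about the hard-coded range, a small sketch, again with made-up
data, in which the dataset labels are not 0:29. by() picks up whatever groups
are present. The helper name kendall37 is hypothetical, and the
sapply(split(...)) line is shown only for comparison; it was not suggested in
the thread.

```r
set.seed(2)
# Made-up data: dataset labels are arbitrary, not 0:29
perf.data <- data.frame(dataset = rep(c(3, 7, 11), each = 8),
                        matrix(rnorm(24 * 6), ncol = 6))

# Hypothetical helper: Kendall correlation of columns 3 and 7
kendall37 <- function(d) cor(d[, 3], d[, 7], method = "kendall")

# by() finds the groups itself; no range is written into the call
total.by <- sum(by(perf.data, list(perf.data$dataset), kendall37))

# Same grouping via split(); each group is extracted only once
total.split <- sum(sapply(split(perf.data, perf.data$dataset), kendall37))

# Compare the two sums
all.equal(total.by, total.split)
```

Neither call mentions the group labels, so the same line keeps working if
datasets are added or renumbered.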
Your solution really does the trick for me. Thanks,
Stijn.
> -thomas