[R] Sum efficiently from large matrix according to re-occuring levels of factor?
Ralph S.
ruffel1 at hotmail.com
Mon Jul 21 03:41:44 CEST 2008
Yes, thank you very much! I am slowly getting to the full power of R...
----------------------------------------
> Date: Sun, 20 Jul 2008 21:21:35 -0400
> From: jholtman at gmail.com
> To: ruffel1 at hotmail.com
> Subject: Re: [R] Sum efficiently from large matrix according to re-occuring levels of factor?
> CC: h.wickham at gmail.com; r-help at r-project.org
>
> Does this do what you want:
>
> > # following up on another idea that was presented
> > # where are the breaks
> > dataBreaks <- cumsum(c(0, (diff(x[, 2] + x[, 1] * max(x[, 2])) != 0)))
> > # sum up column 3 and output the first two columns with the indices
> > result <- lapply(split(seq(nrow(x)), dataBreaks), function(.sect){
> +     c(x[.sect[1], 1:2], sum(x[.sect, 3]))
> + })
> > do.call(rbind, result)
>   [,1] [,2] [,3]
> 0    1    7    3
> 1    2    4    2
> 2    3    2    3
> 3    1    7   10
>
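
The run-based sum above can also be sketched without lapply() and split(). This is only a minimal variant, not the method posted in the thread: it assumes a new run starts whenever column 1 or column 2 changes from one row to the next, and it swaps in base R's rowsum() for the per-section sum(). The names starts, run_id, run_sums and result_runs are made up for illustration.

```r
# Example matrix from the original question
x <- matrix(c(1,1,1,2,2,3,3,1,1,7,7,7,4,4,2,2,7,7,1,1,1,1,1,1,2,5,5), 9, 3)

# TRUE where a row starts a new run: the first row, or any row where
# column 1 or column 2 differs from the previous row
starts <- c(TRUE, diff(x[, 1]) != 0 | diff(x[, 2]) != 0)

# Consecutive run number for every row (1, 1, 1, 2, 2, ...)
run_id <- cumsum(starts)

# Sum column 3 within each run; run ids are increasing, so the
# sorted output of rowsum() is already in run order
run_sums <- rowsum(x[, 3], run_id)

# First two columns of each run's first row, plus the run sum
result_runs <- cbind(x[starts, 1:2, drop = FALSE], run_sums)
result_runs
```

Comparing the two columns directly avoids building a combined index, so the break detection does not depend on the range of values in column 2.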
>
> On Sun, Jul 20, 2008 at 7:57 PM, Ralph S. wrote:
>>
>> The first and second columns are actually indices into another matrix (my example may not make this sufficiently clear). I want to compare each sum with the corresponding entry of that matrix and then record the result of the comparison.
>>
>> Any idea?
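
One way to do the lookup described here is to index the other matrix with a two-column matrix of (row, column) pairs, which returns one element per pair. The sketch below is hedged: the matrix target and its contents are invented for illustration, the comparison (sum greater than entry) is an assumption since the thread does not say which comparison is wanted, and res simply restates the expected output from the original question.

```r
# Run-level result as in the question's expected output:
# row index, column index, sum
res <- matrix(c(1,2,3,1, 7,4,2,7, 3,2,3,10), 4, 3)

# Hypothetical 3 x 7 matrix that columns 1 and 2 index into
# (filled with 5 only so that the example is self-contained)
target <- matrix(5, nrow = 3, ncol = 7)

# A two-column matrix used as an index picks target[i, j] for each row (i, j)
entries <- target[res[, 1:2, drop = FALSE]]

# Assumed comparison: does the sum exceed the corresponding entry?
exceeds <- res[, 3] > entries

# Record everything side by side (the logical column becomes 0/1)
cbind(res, entries, exceeds)
```

Because the lookup is vectorised, it needs no loop over the rows of res.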
>>
>> Best,
>>
>> Ralph
>>
>>
>>
>> ----------------------------------------
>>> Date: Sun, 20 Jul 2008 16:50:41 -0700
>>> From: h.wickham at gmail.com
>>> To: ruffel1 at hotmail.com
>>> Subject: Re: [R] Sum efficiently from large matrix according to re-occuring levels of factor?
>>> CC: r-help at r-project.org
>>>
>>> On Sun, Jul 20, 2008 at 4:47 PM, hadley wickham wrote:
>>>> On Sun, Jul 20, 2008 at 4:16 PM, Ralph S. wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> I am trying to calculate the sum for each occurrence of the level of a factor in a very large matrix. In addition, I want to save that sum together with the information of the level of the factor and the level of a second factor.
>>>>>
>>>>> My matrix looks like this:
>>>>>
>>>>> x<-matrix(c(1,1,1,2,2,3,3,1,1,7,7,7,4,4,2,2,7,7,1,1,1,1,1,1,2,5,5),9,3)
>>>>>
>>>>> I want to sum according to the levels in the first column and save the sum with the information of the level in the first and the second column in a new matrix.
>>>>>
>>>>> That is, I want output in the matrix of form:
>>>>>
>>>>> 1 7 3
>>>>> 2 4 2
>>>>> 3 2 3
>>>>> 1 7 10
>>>>>
>>>>
>>>> Why that and not:
>>>>
>>>> 1 7 13
>>>> 2 4 2
>>>> 3 2 3
>>>>
>>>> ?
>>>
>>> Here's a solution for that case:
>>>
>>> index <- x[, 2] + x[, 1] * max(x[, 2])
>>> cbind(x[!duplicated(index), 1:2], tapply(x[, 3], index, sum))
>>>
>>> It takes about half a second for a million row matrix.
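
One caveat about the cbind() above: tapply() returns its sums ordered by the sorted index values, while !duplicated(index) keeps rows in order of first appearance. The two orders agree for the example matrix but need not agree in general. The following is a minimal sketch of an order-safe variant, not the code posted in the thread; the function name sum_by_pair and the pasted string key are assumptions made for illustration, and it uses rowsum() in place of tapply().

```r
# Sum column 3 for each distinct (column 1, column 2) pair and return the
# pairs in order of first appearance together with their sums
sum_by_pair <- function(m) {
  # One string key per row, e.g. "1_7"
  key <- paste(m[, 1], m[, 2], sep = "_")
  # TRUE for the first row of each distinct pair
  first <- !duplicated(key)
  # One-column matrix of group sums, with the keys as row names
  sums <- rowsum(m[, 3], key)
  # Look the sums up by key so they line up with the rows kept by 'first'
  unname(cbind(m[first, 1:2, drop = FALSE], sums[key[first], 1]))
}

# Example matrix from the original question
x <- matrix(c(1,1,1,2,2,3,3,1,1,7,7,7,4,4,2,2,7,7,1,1,1,1,1,1,2,5,5), 9, 3)
sum_by_pair(x)
```

Looking the sums up by name makes the alignment explicit, at the cost of building a character key, which is likely slower on very large matrices than the numeric index.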
>>>
>>> Hadley
>>>
>>>
>>>
>>> --
>>> http://had.co.nz/
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>
>
> --
> Jim Holtman
> Cincinnati, OH
> +1 513 646 9390
>
> What is the problem you are trying to solve?