[R] Choosing the larger of row pairs

James Rome jamesrome at gmail.com
Thu Mar 25 16:29:20 CET 2010


On 3/24/2010 10:58 PM, Peter Alspach wrote:
> jim
>   
  key value
1   1     1
2   2     0
3   2     2
4   3     0
5   4     0
6   5     1
7   6     3
8   6     2
9   7     0

> > tt <- rle(jim$key)$lengths
> > ttJim <- jim[cumsum(tt)-tt+tapply(jim$value, jim$key, which.max),]
> > ttJim
>   
  key value
1   1     1
3   2     2
4   3     0
5   4     0
6   5     1
7   6     3
9   7     0

HTH ....

Peter Alspach

-------------------------
Close, but not quite. My real key is a POSIXct date. With it, the code
seems to drop the entry after a matching pair instead of one member of
the pair:
> z[,c(9,13)]
              key value
1  1/2/2010 10:39     1
3  1/2/2010 10:38     1
5  1/2/2010 10:32     6
7  1/2/2010 10:28     0
8  1/2/2010 10:27     6
10 1/2/2010 10:27     1
11 1/2/2010 10:25     3
12 1/2/2010 10:23     1
13 1/2/2010 10:23     0
14 1/2/2010 10:22     3
16 1/2/2010 10:16     4
18 1/2/2010 10:11     4
20 1/2/2010 10:06     4
22 1/2/2010 10:00     5
24  1/2/2010 9:54     5
26  1/2/2010 9:48     5
28  1/2/2010 9:43     6
30  1/2/2010 9:37     6
32  1/2/2010 9:32     7
34  1/2/2010 9:26     8
36  1/2/2010 8:57     8
38  1/2/2010 8:44     9
39  1/2/2010 8:42    10
41  1/2/2010 8:38    10
43  1/2/2010 8:15     0
44  1/2/2010 7:36    15
46  1/2/2010 7:32    15
>

> tapply(z$value, z$key, which.min)
1/2/2010 10:00 1/2/2010 10:06 1/2/2010 10:11 1/2/2010 10:16 1/2/2010 10:22
             1              1              1              1              1
1/2/2010 10:23 1/2/2010 10:25 1/2/2010 10:27 1/2/2010 10:28 1/2/2010 10:32
             2              1              2              1              1
1/2/2010 10:38 1/2/2010 10:39  1/2/2010 7:32  1/2/2010 7:36  1/2/2010 8:15
             1              1              1              1              1
 1/2/2010 8:38  1/2/2010 8:42  1/2/2010 8:44  1/2/2010 8:57  1/2/2010 9:26
             1              1              1              1              1
 1/2/2010 9:32  1/2/2010 9:37  1/2/2010 9:43  1/2/2010 9:48  1/2/2010 9:54
             1              1              1              1              1
> cumsum(tt) - tt
 [1]  0  1  2  3  4  6  7  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
> cumsum(tt) + tt
 [1]  2  3  4  5  8  8 11 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28
> cumsum(tt) - tt + tapply(z$value, z$key, which.min)
1/2/2010 10:00 1/2/2010 10:06 1/2/2010 10:11 1/2/2010 10:16 1/2/2010 10:22
             1              2              3              4              5
1/2/2010 10:23 1/2/2010 10:25 1/2/2010 10:27 1/2/2010 10:28 1/2/2010 10:32
             8              8             11             11             12
1/2/2010 10:38 1/2/2010 10:39  1/2/2010 7:32  1/2/2010 7:36  1/2/2010 8:15
            13             14             15             16             17
 1/2/2010 8:38  1/2/2010 8:42  1/2/2010 8:44  1/2/2010 8:57  1/2/2010 9:26
            18             19             20             21             22
 1/2/2010 9:32  1/2/2010 9:37  1/2/2010 9:43  1/2/2010 9:48  1/2/2010 9:54
            23             24             25             26             27
> z$value
 [1]  1  1  6  0  6  1  3  1  0  3  4  4  4  5  5  5  6  6  7  8  8  9 10 10  0
[26] 15 15
> zg = z[cumsum(tt)-tt+tapply(z$key, z$value, which.min),]
> zg[,c(9, 13)]
                key value
1    1/2/2010 10:39     1
3    1/2/2010 10:38     1
5    1/2/2010 10:32     6
7    1/2/2010 10:28     0
8    1/2/2010 10:27     6   ## 10:25 missing
12   1/2/2010 10:23     1
12.1 1/2/2010 10:23     1
16   1/2/2010 10:16     4   ## 10:22 missing
16.1 1/2/2010 10:16     4
18   1/2/2010 10:11     4
20   1/2/2010 10:06     4
22   1/2/2010 10:00     5
24    1/2/2010 9:54     5
26    1/2/2010 9:48     5
28    1/2/2010 9:43     6
30    1/2/2010 9:37     6
32    1/2/2010 9:32     7
34    1/2/2010 9:26     8
36    1/2/2010 8:57     8
38    1/2/2010 8:44     9
39    1/2/2010 8:42    10
41    1/2/2010 8:38    10
43    1/2/2010 8:15     0
44    1/2/2010 7:36    15
46    1/2/2010 7:32    15
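
A possible explanation for the misalignment: tapply() returns its results in the sorted order of the key's levels (here alphabetical: 10:00, 10:06, ...), whereas rle() reports runs in the order the rows appear (10:39, 10:38, ...), so adding the two vectors pairs offsets with the wrong groups. The sketch below sidesteps the alignment by using ave(), which hands back per-row results in the original row order. It is only a sketch: the helper name keep_group_max and both toy data frames are made up for illustration (the second imitates a few rows of z, with the key stored as character), and it assumes no group consists solely of NA values.

```r
# Hypothetical helper (not from the thread): keep, for each key, the first row
# holding that key's largest value, leaving the original row order untouched.
keep_group_max <- function(df, key = "key", value = "value") {
  # ave() applies FUN within each key group and returns the results in the
  # original row order; the logical flags come back coerced to 0/1.
  is_first_max <- ave(df[[value]], df[[key]],
                      FUN = function(v) seq_along(v) == which.max(v))
  df[is_first_max == 1, , drop = FALSE]
}

# Toy data copied from the first example in the thread.
jim <- data.frame(key   = c(1, 2, 2, 3, 4, 5, 6, 6, 7),
                  value = c(1, 0, 2, 0, 0, 1, 3, 2, 0))
keep_group_max(jim)    # keeps rows 1, 3, 4, 5, 6, 7, 9

# Toy data imitating z: character keys that are NOT in sorted order.
z_toy <- data.frame(
  key   = c("1/2/2010 10:27", "1/2/2010 10:27", "1/2/2010 10:25",
            "1/2/2010 10:23", "1/2/2010 10:23", "1/2/2010 9:54"),
  value = c(6, 1, 3, 1, 0, 5),
  stringsAsFactors = FALSE
)
keep_group_max(z_toy)  # keeps rows 1, 3, 4, 6
```

Swapping which.max for which.min inside the helper would keep the smaller member of each pair instead, as in the tapply() calls above.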

Thanks,
Jim



