[R] Trouble retrieving the second largest value from each row of a data.frame
David Winsemius
dwinsemius at comcast.net
Sat Jul 24 14:40:05 CEST 2010
On Jul 23, 2010, at 9:20 PM, <mpward at illinois.edu> wrote:
> I have a data frame with a couple million lines and want to retrieve
> the largest and second largest values in each row, along with the
> label of the column these values are in. For example
>
> row 1
> strongest=-11072
> secondstrongest=-11707
> strongestantenna=value120
> secondstrongantenna=value60
>
> Below is the code I am using and a truncated data.frame. Retrieving
> the largest value was easy, but I have been getting errors every way
> I have tried to retrieve the second largest value. I have not even
> tried to retrieve the labels for the value yet.
>
> Any help would be appreciated
> Mike
Using Holtman's extract of your data (named x here), sort each row in
decreasing order and keep the first three values. The order function
will then generate an index into the names of the data frame:
> t(apply(x, 1, sort, decreasing=TRUE)[1:3, ])
     [,1]   [,2]   [,3]
1  -11072 -11707 -12471
2  -11176 -11799 -12838
3  -11113 -11778 -12439
4  -11071 -11561 -11653
5  -11067 -11638 -12834
6  -11068 -11698 -12430
7  -11092 -11607 -11709
8  -11061 -11426 -11665
9  -11137 -11736 -12570
10 -11146 -11779 -12537
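
A side note on the error in your original code: apply() hands each row
to the function as a plain vector, not as a data frame, so indexing it
with data[1, ] fails with "incorrect number of dimensions". Below is a
minimal sketch of a per-row function that avoids this. It assumes x
holds the ten rows you posted; the name second_largest is my own and
only for illustration.

```r
# Rebuild the ten-row extract from the original post, row by row
# (assumed to match the data frame called x in this reply)
x <- as.data.frame(matrix(c(
  -13007, -11707, -11072, -12471, -12838, -13357,
  -12838, -13210, -11176, -11799, -13210, -13845,
  -12880, -11778, -11113, -12439, -13089, -13880,
  -12805, -11653, -11071, -12385, -11561, -13317,
  -12834, -13527, -11067, -11638, -13527, -13873,
  -11068, -11698, -12430, -12430, -12430, -12814,
  -12807, -14068, -11092, -11709, -11607, -13025,
  -12770, -11665, -11061, -12373, -11426, -12805,
  -12988, -11736, -11137, -12570, -13467, -13739,
  -11779, -12873, -12973, -12537, -12973, -11146),
  ncol = 6, byrow = TRUE,
  dimnames = list(NULL, paste0("value", seq(0, 300, by = 60)))))

# Each row arrives as a numeric vector, so index it with [2], not [1, ]
second_largest <- function(row) sort(row, decreasing = TRUE)[2]

strongest       <- apply(x, 1, max)
secondstrongest <- apply(x, 1, second_largest)
```

Sorting the whole row just to get one value is wasteful with only six
columns it hardly matters, but it keeps the function easy to read.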
Putting it all together:
matrix( paste( t(apply(x, 1, sort, decreasing=TRUE)[1:3, ]),
               names(x)[ t(apply(x, 1, order, decreasing=TRUE)[1:3, ]) ]),
        ncol=3)
      [,1]              [,2]              [,3]
 [1,] "-11072 value120" "-11707 value60"  "-12471 value180"
 [2,] "-11176 value120" "-11799 value180" "-12838 value0"
 [3,] "-11113 value120" "-11778 value60"  "-12439 value180"
 [4,] "-11071 value120" "-11561 value240" "-11653 value60"
 [5,] "-11067 value120" "-11638 value180" "-12834 value0"
 [6,] "-11068 value0"   "-11698 value60"  "-12430 value120"
 [7,] "-11092 value120" "-11607 value240" "-11709 value180"
 [8,] "-11061 value120" "-11426 value240" "-11665 value60"
 [9,] "-11137 value120" "-11736 value60"  "-12570 value180"
[10,] "-11146 value300" "-11779 value0"   "-12537 value180"
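
If you would rather have the values and labels in separate columns, as
in the layout you described (strongest, secondstrongest,
strongestantenna, secondstrongantenna), something along these lines
might do. This is only a sketch: it assumes x is the ten-row extract,
and the helper names ord, m, n and result are mine.

```r
# Rebuild the ten-row extract from the original post, row by row
# (assumed to match the data frame called x in this reply)
x <- as.data.frame(matrix(c(
  -13007, -11707, -11072, -12471, -12838, -13357,
  -12838, -13210, -11176, -11799, -13210, -13845,
  -12880, -11778, -11113, -12439, -13089, -13880,
  -12805, -11653, -11071, -12385, -11561, -13317,
  -12834, -13527, -11067, -11638, -13527, -13873,
  -11068, -11698, -12430, -12430, -12430, -12814,
  -12807, -14068, -11092, -11709, -11607, -13025,
  -12770, -11665, -11061, -12373, -11426, -12805,
  -12988, -11736, -11137, -12570, -13467, -13739,
  -11779, -12873, -12973, -12537, -12973, -11146),
  ncol = 6, byrow = TRUE,
  dimnames = list(NULL, paste0("value", seq(0, 300, by = 60)))))

# Column positions for each row, largest value first.
# apply() returns one column per row of x, hence ord[1, ] and ord[2, ].
ord <- apply(x, 1, order, decreasing = TRUE)
m   <- as.matrix(x)
n   <- nrow(m)

result <- data.frame(
  # matrix indexing with (row, column) pairs picks one value per row
  strongest           = m[cbind(seq_len(n), ord[1, ])],
  secondstrongest     = m[cbind(seq_len(n), ord[2, ])],
  strongestantenna    = names(x)[ord[1, ]],
  secondstrongantenna = names(x)[ord[2, ]],
  stringsAsFactors    = FALSE
)
```

Be aware of ties: row 6 has three columns equal to -12430, and only one
of the tied labels is reported for that position.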
--
David.
>
>
>> data<-data.frame(value0,value60,value120,value180,value240,value300)
>> data
> value0 value60 value120 value180 value240 value300
> 1 -13007 -11707 -11072 -12471 -12838 -13357
> 2 -12838 -13210 -11176 -11799 -13210 -13845
> 3 -12880 -11778 -11113 -12439 -13089 -13880
> 4 -12805 -11653 -11071 -12385 -11561 -13317
> 5 -12834 -13527 -11067 -11638 -13527 -13873
> 6 -11068 -11698 -12430 -12430 -12430 -12814
> 7 -12807 -14068 -11092 -11709 -11607 -13025
> 8 -12770 -11665 -11061 -12373 -11426 -12805
> 9 -12988 -11736 -11137 -12570 -13467 -13739
> 10 -11779 -12873 -12973 -12537 -12973 -11146
>> #largest value in the row
>> strongest<-apply(data,1,max)
>>
>>
>> #second largest value in the row
>> n<-function(data)(1/(min(1/(data[1,]-max(data[1,]))))+
>> (max(data[1,])))
>> secondstrongest<-apply(data,1,n)
> Error in data[1, ] : incorrect number of dimensions
>>
>