[R] Simple indexing conundrum

Jim Brennan jfbrennan at rogers.com
Fri Jul 1 14:45:04 CEST 2005


Here is a different approach. I am sending it only because the result is
slightly different: two rows are returned for Month 9, where the maximum
is tied, and the original row numbers are retained.

> max2 <- function(x) max(x, na.rm = TRUE)
> MonthMax <- ave(airquality$Solar.R, airquality$Month, FUN = max2)
> new <- subset(airquality, Solar.R == MonthMax)
> new
    Ozone Solar.R Wind Temp Month Day
16     14     334 11.5   64     5  16
45     NA     332 13.8   80     6  14
67     40     314 10.9   83     7   6
105    28     273 11.5   82     8  13
133    24     259  9.7   73     9  10
135    21     259 15.5   76     9  12
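
The tie in Month 9 is why the two answers in this thread differ. A minimal
sketch with a made-up vector (the values are hypothetical, not taken from
airquality) shows that which.max() reports only the first maximum, while
an equality test keeps every tied position:

```r
# Toy vector with a tied maximum and one missing value (hypothetical data)
x <- c(3, 7, NA, 7, 1)

# which.max() skips NAs and returns the index of the first maximum only
first_max <- which.max(x)

# Comparing against the maximum keeps every tied index; which() drops the NA
all_max <- which(x == max(x, na.rm = TRUE))

print(first_max)
print(all_max)
```

The same distinction applies per month: an equality test against the
group maximum keeps both tied September rows.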

-----Original Message-----
From: r-help-bounces at stat.math.ethz.ch
[mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Liaw, Andy
Sent: July 1, 2005 8:31 AM
To: 'Martin Henry H. Stevens'; R-Help
Subject: Re: [R] Simple indexing conundrum

Is this close to what you want?

> air.sub <- do.call("rbind", lapply(split(airquality, airquality$Month), 
+                                    function(d) d[which.max(d$Solar.R),]))
> air.sub
  Ozone Solar.R Wind Temp Month Day
5    14     334 11.5   64     5  16
6    NA     332 13.8   80     6  14
7    40     314 10.9   83     7   6
8    28     273 11.5   82     8  13
9    24     259  9.7   73     9  10

Andy
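
If tied rows should be kept with the split/lapply approach as well, one
possible variation is sketched below. It is an assumption about what is
wanted, not part of the original reply; the names pieces and air.ties are
made up for illustration.

```r
# Sketch: keep every row that ties for the monthly maximum of Solar.R
data(airquality)

pieces <- lapply(split(airquality, airquality$Month), function(d) {
  # which() drops the NA comparisons, so rows with missing Solar.R are excluded
  d[which(d$Solar.R == max(d$Solar.R, na.rm = TRUE)), ]
})

# unname() drops the month labels so rbind does not prefix row names with them
air.ties <- do.call("rbind", unname(pieces))
print(air.ties)
```

This trades which.max() for an equality test, so a month with a tied
maximum contributes more than one row.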

> From: Martin Henry H. Stevens
> 
> My apologies in advance for my thickness but I can't seem to solve the
> following, seemingly simple, data manipulation problem:
> 
> I have a data frame that contains multiple factors and multiple 
> continuous response variables, but duplicates of some factor 
> combinations. The duplicates contain bad data, so I would like to 
> eliminate the duplicates. I would like to retain the entire rows 
> identified by the maximum value of one particular continuous response 
> variable.
> 
> For instance,
> 
>  >data(airquality)
> 
>  > str(airquality)
> `data.frame':	153 obs. of  6 variables:
>   $ Ozone  : int  41 36 12 18 NA 28 23 19 8 NA ...
>   $ Solar.R: int  190 118 149 313 NA NA 299 99 19 194 ...
>   $ Wind   : num  7.4 8 12.6 11.5 14.3 14.9 8.6 13.8 20.1 8.6 ...
>   $ Temp   : int  67 72 74 62 56 66 65 59 61 69 ...
>   $ Month  : int  5 5 5 5 5 5 5 5 5 5 ...
>   $ Day    : int  1 2 3 4 5 6 7 8 9 10 ...
> 
> I would like to subset airquality, retaining only the rows containing
> the maximum Solar.R for each month.
> 
> Any solution would be greatly appreciated.
> 
> Regards,
> Hank
> 
> 
> 
> Dr. Martin Henry H. Stevens, Assistant Professor
> 338 Pearson Hall
> Botany Department
> Miami University
> Oxford, OH 45056
> 
> Office: (513) 529-4206
> Lab: (513) 529-4262
> FAX: (513) 529-4243
> http://www.cas.muohio.edu/botany/bot/henry.html
> http://www.muohio.edu/ecology/
> http://www.muohio.edu/botany/
> "E Pluribus Unum"
> 
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! 
> http://www.R-project.org/posting-guide.html
> 
> 
>
