[R] simple subset question
arun
smartpink111 at yahoo.com
Sun Dec 2 22:18:04 CET 2012
Hi,
From the ddply() output, you could get the whole row by:
fish1 <- structure(list(Year = 2002:2012, maxTotal = c(1464311L, 1071051L,
714837L, 2115018L, 850491L, 207537L, 321195L, 935599L, 194429L,
157260L, 303259L)), .Names = c("Year", "maxTotal"), row.names = c(NA,
-11L), class = "data.frame")
fish[fish[,2]%in%fish1[,2][fish1[,1]==2012],] #fish (or winter) is your original dataset
# IDWeek Total Fry Smolt FryEq Year
#21 47 303259 34008 269248 491733 2012
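A caveat on the %in% line above: it matches on Total alone, so a row from another year that happened to share the same Total would also be returned. A minimal sketch of matching on Year and Total together is below. It assumes a four-row stand-in (`fish_toy`, a hypothetical name, with rows and columns copied from the data posted in this thread) and uses base aggregate() in place of ddply() so that it runs without plyr.

```r
# Four rows taken from the posted data (IDWeek, Total, Year only).
fish_toy <- data.frame(
  IDWeek = c(46L, 47L, 38L, 39L),
  Total  = c(75365L, 303259L, 84949L, 157260L),
  Year   = c(2012L, 2012L, 2011L, 2011L)
)

# Per-year maximum, shaped like the ddply() summary (columns Year, maxTotal).
yearly_max <- aggregate(Total ~ Year, data = fish_toy, FUN = max)
names(yearly_max)[2] <- "maxTotal"

# Join back on both Year and Total to recover the full rows.
full_rows <- merge(fish_toy, yearly_max,
                   by.x = c("Year", "Total"),
                   by.y = c("Year", "maxTotal"))

# Keep only the 2012 row.
res <- full_rows[full_rows$Year == 2012, ]
res
```

Ties within a year would still return more than one row, which may or may not be what is wanted.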
A.K.
________________________________
From: Felipe Carrillo <mazatlanmexico at yahoo.com>
To: William Dunlap <wdunlap at tibco.com>; arun <smartpink111 at yahoo.com>
Cc: R help <r-help at r-project.org>
Sent: Sunday, December 2, 2012 2:34 PM
Subject: Re: [R] simple subset question
Using my whole dataset I get:
library(plyr)
ddply(winter,"Year",summarise,maxTotal=max(Total))
fish <- structure(list(Year = 2002:2012, maxTotal = c(1464311L, 1071051L,
714837L, 2115018L, 850491L, 207537L, 321195L, 935599L, 194429L,
157260L, 303259L)), .Names = c("Year", "maxTotal"), row.names = c(NA,
-11L), class = "data.frame")
I only want to extract the max Total for 2012 and want the whole row like this:
IDWeek Total Fry Smolt FryEq Year
21 47 303259 34008 269248 491733 2012
My whole dataset is too big to post, so thanks for your help. I will try
to figure out why subset returns an empty row.
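A likely reason for the empty result, following Bill Dunlap's note quoted below: inside subset(), max(Total) is taken over all years, and in the ddply() summary above the overall maximum (2115018) falls in 2005, not 2012. The sketch below illustrates this with a hypothetical stand-in called `winter_toy`; only the two yearly maxima come from this thread, the other rows are made up.

```r
# Stand-in for the full 'winter' data. The values 2115018 (2005) and
# 303259 (2012) are from the thread; the remaining rows are invented.
winter_toy <- data.frame(
  Year  = c(2005L, 2005L, 2012L, 2012L),
  Total = c(2115018L, 1000L, 303259L, 19691L)
)

# max(Total) is computed over every row, so no 2012 row can equal it.
# The result is an empty data frame, not an error.
empty <- subset(winter_toy, Year == 2012 & Total == max(Total))
nrow(empty)

# Restricting the maximum to the 2012 rows returns the intended row.
hit <- subset(winter_toy, Year == 2012 & Total == max(Total[Year == 2012]))
hit
```

This would also explain why the two-year example works: there the overall maximum happens to be in 2012.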
Felipe D. Carrillo
Supervisory Fishery Biologist
Department of the Interior
US Fish & Wildlife Service
California, USA
http://www.fws.gov/redbluff/rbdd_jsmp.aspx
From: William Dunlap <wdunlap at tibco.com>
>To: Felipe Carrillo <mazatlanmexico at yahoo.com>; arun <smartpink111 at yahoo.com>
>Cc: R help <r-help at r-project.org>
>Sent: Sunday, December 2, 2012 11:00 AM
>Subject: RE: [R] simple subset question
>
>> I am still getting an error message
>> >with :
>> > x <- subset(fish,Year==2012 & Total==max(Total));x
>> >I get:
>> >[1] IDWeek Total Fry Smolt FryEq Year
>> ><0 rows> (or 0-length row.names)
>
>The above is not an error message. It says that there
>are no rows satisfying your criteria. Note that Total==max(Total)
>returns a TRUE for each row in which the Total value
>equals the maximum Total value over all the years in
>the data. Are you looking for the maximum value of Total
>in each year?
>
>> tmp <- transform(fish, YearlyMaxTotal = ave(Total, Year, FUN=max))
>> subset(tmp, Total==YearlyMaxTotal)
> IDWeek Total Fry Smolt FryEq Year YearlyMaxTotal
>21 47 303259 34008 269248 491733 2012 303259
>39 39 157260 156909 351 157506 2011 157260
>> subset(tmp, Total==YearlyMaxTotal & Year==2012)
> IDWeek Total Fry Smolt FryEq Year YearlyMaxTotal
>21 47 303259 34008 269248 491733 2012 303259
>
>Bill Dunlap
>Spotfire, TIBCO Software
>wdunlap tibco.com
>
>
>> -----Original Message-----
>> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf
>> Of Felipe Carrillo
>> Sent: Sunday, December 02, 2012 10:47 AM
>> To: arun
>> Cc: R help
>> Subject: Re: [R] simple subset question
>>
>> Works with the small dataset (2 years) but I get the error message with the whole
>> dataset (12 years of data). I am going to have to check what's wrong with it...Thanks
>>
>> Felipe D. Carrillo
>> Supervisory Fishery Biologist
>> Department of the Interior
>> US Fish & Wildlife Service
>> California, USA
>> http://www.fws.gov/redbluff/rbdd_jsmp.aspx
>>
>>
>> From: arun <smartpink111 at yahoo.com>
>> >To: Felipe Carrillo <mazatlanmexico at yahoo.com>
>> >Cc: R help <r-help at r-project.org>; R. Michael Weylandt
>> <michael.weylandt at gmail.com>
>> >Sent: Sunday, December 2, 2012 10:29 AM
>> >Subject: Re: [R] simple subset question
>> >
>> >Hi,
>> >I am getting this:
>> >x<-subset(fish,Year==2012 & Total==max(Total))
>> > x
>> ># IDWeek Total Fry Smolt FryEq Year
>> >#21 47 303259 34008 269248 491733 2012
>> >A.K.
>> >
>> >
>> >
>> >
>> >----- Original Message -----
>> >From: Felipe Carrillo <mazatlanmexico at yahoo.com>
>> >To: R. Michael Weylandt <michael.weylandt at gmail.com>
>> >Cc: "r-help at r-project.org" <r-help at r-project.org>
>> >Sent: Sunday, December 2, 2012 1:25 PM
>> >Subject: Re: [R] simple subset question
>> >
>> >Sorry, I was trying to subset from a bigger dataset called 'winter' and forgot to
>> >change the variable names when I asked the question. David W's suggestion works,
>> >but the strange part is that I am still getting an error message
>> >with :
>> > x <- subset(fish,Year==2012 & Total==max(Total));x
>> >I get:
>> >[1] IDWeek Total Fry Smolt FryEq Year
>> ><0 rows> (or 0-length row.names)
>> >
>> >I will start a fresh session to see if that helps...Thank you all
>> >
>> >Felipe D. Carrillo
>> >Supervisory Fishery Biologist
>> >Department of the Interior
>> >US Fish & Wildlife Service
>> >California, USA
>> >http://www.fws.gov/redbluff/rbdd_jsmp.aspx
>> >
>> >
>> >From: R. Michael Weylandt <michael.weylandt at gmail.com>
>> >>To: Felipe Carrillo <mazatlanmexico at yahoo.com>
>> >>Cc: "r-help at r-project.org" <r-help at r-project.org>
>> >>Sent: Sunday, December 2, 2012 9:42 AM
>> >>Subject: Re: [R] simple subset question
>> >>
>> >>On Sun, Dec 2, 2012 at 5:21 PM, Felipe Carrillo
>> >><mazatlanmexico at yahoo.com> wrote:
>> >>> Hi,
>> >>> Consider the small dataset below. I want to subset by two variables in
>> >>> one line but it won't work...it works though if I subset separately. I have
>> >>> to be missing something obvious that I did not realize before while using subset..
>> >>>
>> >>> fish <- structure(list(IDWeek = c(27L, 28L, 29L, 30L, 31L, 32L, 33L,
>> >>> 34L, 35L, 36L, 37L, 38L, 39L, 40L, 41L, 42L, 43L, 44L, 45L, 46L,
>> >>> 47L, 48L, 49L, 50L, 51L, 52L, 27L, 28L, 29L, 30L, 31L, 32L, 33L,
>> >>> 34L, 35L, 36L, 37L, 38L, 39L, 40L, 41L, 42L, 43L, 44L, 45L, 46L,
>> >>> 47L, 48L, 49L, 50L, 51L, 52L), Total = c(0L, 0L, 326L, 1735L,
>> >>> 1807L, 2208L, 3883L, 8820L, 6060L, 19326L, 63158L, 100718L, 53015L,
>> >>> 91689L, 152629L, 122708L, 61293L, 15574L, 86538L, 75365L, 303259L,
>> >>> 19691L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 161L, 321L, 1000L, 4425L,
>> >>> 13202L, 19726L, 30518L, 84949L, 157260L, 145691L, 85801L, 62044L,
>> >>> 44439L, 23272L, 22391L, 20159L, 14854L, 35379L, 31142L, 7736L,
>> >>> 13221L, 4894L), Fry = c(0L, 0L, 326L, 1735L, 1807L, 2208L, 3883L,
>> >>> 8759L, 6060L, 19326L, 63119L, 100524L, 52582L, 88170L, 145564L,
>> >>> 111416L, 38233L, 5248L, 17826L, 11038L, 34008L, 215L, 0L, 0L,
>> >>> 0L, 0L, 0L, 0L, 0L, 0L, 161L, 321L, 1000L, 4425L, 13055L, 19488L,
>> >>> 30518L, 84818L, 156909L, 144786L, 84207L, 57720L, 31049L, 6858L,
>> >>> 1616L, 719L, 364L, 49L, 0L, 0L, 0L, 0L), Smolt = c(0L, 0L, 0L,
>> >>> 0L, 0L, 0L, 0L, 62L, 0L, 0L, 38L, 195L, 433L, 3518L, 7067L, 11290L,
>> >>> 23058L, 10327L, 68712L, 64328L, 269248L, 19479L, 0L, 0L, 0L,
>> >>> 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 147L, 238L, 0L, 131L, 351L,
>> >>> 905L, 1592L, 4324L, 13391L, 16414L, 20774L, 19444L, 14491L, 35330L,
>> >>> 31142L, 7736L, 13221L, 4894L), FryEq = c(0L, 0L, 326L, 1735L,
>> >>> 1807L, 2208L, 3883L, 8864L, 6060L, 19326L, 63185L, 100854L, 53318L,
>> >>> 94151L, 157576L, 130610L, 77432L, 22805L, 134639L, 120393L, 491733L,
>> >>> 33327L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 161L, 321L, 1000L, 4425L,
>> >>> 13306L, 19894L, 30518L, 85042L, 157506L, 146328L, 86914L, 65073L,
>> >>> 53812L, 34763L, 36931L, 33769L, 24998L, 60110L, 52938L, 13149L,
>> >>> 22476L, 8319L), Year = c(2012L, 2012L, 2012L, 2012L, 2012L, 2012L,
>> >>> 2012L, 2012L, 2012L, 2012L, 2012L, 2012L, 2012L, 2012L, 2012L,
>> >>> 2012L, 2012L, 2012L, 2012L, 2012L, 2012L, 2012L, 2012L, 2012L,
>> >>> 2012L, 2012L, 2011L, 2011L, 2011L, 2011L, 2011L, 2011L, 2011L,
>> >>> 2011L, 2011L, 2011L, 2011L, 2011L, 2011L, 2011L, 2011L, 2011L,
>> >>> 2011L, 2011L, 2011L, 2011L, 2011L, 2011L, 2011L, 2011L, 2011L,
>> >>> 2011L)), .Names = c("IDWeek", "Total", "Fry", "Smolt", "FryEq",
>> >>> "Year"), row.names = c(NA, 52L), class = "data.frame")
>> >>> fish
>> >>> # Subset to get the max Total for 2012
>> >>> x <- subset(winter,Year==2012 & Total==max(Total));b # How come one line doesn't work?
>> >>
>> >>Works fine for me if I change "winter" to fish here.
>> >>
>> >>subset(fish,Year==2012 & Total==max(Total))
>> >> IDWeek Total Fry Smolt FryEq Year
>> >>21 47 303259 34008 269248 491733 2012
>> >>
>> >>>
>> >>> # It works if I subset the year first and then get the Total max from it
>> >>> xx <- subset(winter,Year==2012)
>> >>> xxx <- subset(xx,Total==max(Total));xxx
>> >>> xxx
>> >>>
>> >>> Felipe D. Carrillo
>> >>> Supervisory Fishery Biologist
>> >>> Department of the Interior
>> >>> US Fish & Wildlife Service
>> >>> California, USA
>> >>> http://www.fws.gov/redbluff/rbdd_jsmp.aspx
>> >>>
>> >>> [[alternative HTML version deleted]]
>> >>>
>> >>>
>> >>> ______________________________________________
>> >>> R-help at r-project.org mailing list
>> >>> https://stat.ethz.ch/mailman/listinfo/r-help
>> >>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> >>> and provide commented, minimal, self-contained, reproducible code.
>> >>>
>> >>
>> >>
>> >>