[R] Getting minimum value of a column according a factor column of a dataframe
Rui Barradas
ru|pb@rr@d@@ @end|ng |rom @@po@pt
Fri Aug 26 06:14:02 CEST 2022
Hello,
To return 2 rows for each Code, one for the min and another for the max,
try the following.
I'm borrowing Bert's by() idea, it makes everything simpler.
There is a hack to have the original Code order kept, since the final
result res should have two rows for each Code, see what is order()'ed below.
# the output has 2 consecutive rows with
# the same Code, so repeat the unique Codes
i <- order(rep(unique(df1$Code), each = 2))
res <- by(df1, df1$Code, \(x) x[c(which.min(x$Q), which.max(x$Q)), ])
res <- do.call(rbind, res)[order(i), ]
# remake the row names, they're ugly after rbind
row.names(res) <- NULL
res
Hope this helps,
Rui Barradas
Às 19:02 de 25/08/2022, javad bayat escreveu:
> ;Dear all
> First of all I appreciate you for the answers you have sent. I did the
> codes that Rui provided and I got what I wanted.
> "
> res <- lapply(split(df1, df1$Code), \(x) x[which.min(x$Q),])
> res <- do.call(rbind, res)
> i <- order(unique(df1$Code))
> res[order(i), ]
> "
> I think I should explain more about my request. I had a large data frame
> (11059 rows and 16 columns). The Code column represented the stations code,
> totally the number of stations were 128. At each station I had many
> measured variables, like Q and N and O, and these variables were measured
> in different years and days. The days that data were measured were
> different for each station, and due to this reason I had different rows for
> stations. For example, station number one (41009) had 158 rows and station
> number 2 (41011) had 113 rows. Note that the station's codes are not in
> order format (e.g smallest to largest).
> Back to my request, I wanted to extract the minimum value of the Q for each
> station (based on the Code column). The problem was that I wanted to have
> other column values which were measured for this minimum of the Q.
> I hope my explanation was clear enough. As I said before, I used the Rui's
> codes and I got what I wanted. Although, other solutions provided by others
> were all correct.
>
> Regarding my request, unfortunately I faced another problem. I had to
> extract the maximum of the Q and put it exactly under the minimum of the Q.
> Something like the below one:
> "
>
> Code
>
> Y
>
> M
>
> D
>
> Q
>
> N
>
> O
>
> 41003
>
> 81
>
> 1
>
> 19
>
> 0.16
>
> 7.17
>
> 2.5
>
> 41003
>
> 79
>
> 8
>
> 17
>
> 10.21
>
> 5.5
>
> 7.2
>
> 41009
>
> 79
>
> 2
>
> 21
>
> 0.218
>
> 5.56
>
> 4.04
> 41009 79 10 20 12.24 5.3 7.1
> .
> .
> .
> .
> "
> I extract both min and max according to the codes, and I have 2 dataframes,
> one for the minimum values and another for the max values. Both dataframe
> have a Code column which is exactly similar.
> Can I extract both min and max simultaneously or I have to combine two
> dataframes?
> I used the rbind and merge function but they did not give the desired
> results.
>> df3 = merge (df1, df2, by = "Code")
> The result of this code adds a second dataframe as columns to the first
> one. I want the first row of the second dataframe put below the first row
> of the first dataframe and so on. I used a function to do this but it seems
> it does not work correctly.
>
>> fun2 = function(x,y){
> i = 1
> for(i in x) {
> if (x[i,1] == y[i,1]){
> rbind(x[i,],y[i,])
> i = i+1
> }
> }
> }
>> fun2(df1, df2)
>
> Sincerely
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
> On Thu, Aug 25, 2022 at 9:08 PM <avi.e.gross using gmail.com> wrote:
>
>> Yes, Timothy, the request was not seen by all of us as the same.
>>
>> Indeed if the request was to show a subset of the original data consisting
>> of only the rows that were the minimum for each Code and also showed ties,
>> then the solution is a tad more complex. I would then do something along
>> the
>> lines of what others showed such as generating another column showing the
>> minimum for each row and then showing only rows that matched their value in
>> two columns or whatever was needed.
>>
>> As noted, keeping the output in a specific order was not initially
>> requested.
>>
>> Keeping the data in some order is a common enough request but in this
>> situation, I suspect the order many might want would be the one showing the
>> minimums in order, not the codes in the original order.
>>
>> -----Original Message-----
>> From: Ebert,Timothy Aaron <tebert using ufl.edu>
>> Sent: Thursday, August 25, 2022 11:59 AM
>> To: avi.e.gross using gmail.com
>> Cc: R-help using r-project.org
>> Subject: RE: [R] Getting minimum value of a column according a factor
>> column
>> of a dataframe
>>
>> My assumption (maybe wrong) was that we needed to keep the other variables.
>> I want to find the values of Y, M, D, N, and O for the minimum value of Q
>> within each unique value of Code, keeping the data in the original order.
>> All one need to do is filter Q in the original dataframe by your answer for
>> minQ.
>>
>> Keeping the data in the original order seems unnecessary, but that is what
>> was asked in a later post.
>>
>> ______________________________________________
>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>
More information about the R-help
mailing list