[R] aggregating data with quality control
Ivan Krylov
|kry|ov @end|ng |rom d|@root@org
Sat Aug 31 13:25:35 CEST 2024
On Sat, 31 Aug 2024 11:15:10 +0000,
Stefano Sofia <stefano.sofia using regione.marche.it> wrote:
> Evaluating the daily mean independently from the status is very easy:
>
> aggregate(mydf$hs, by=list(format(mydf$data_POSIX, "%Y"),
> format(mydf$data_POSIX, "%m"), format(mydf$data_POSIX, "%d")),
> my.mean)
>
>
> Things become more complicated when I need to export also the status:
> this should be "C" when all 48 data have status equal to "C", and
> status "D" when at least one value has status ="D".
>
>
> I have no clue on how to do that in an efficient way.
You can make the status into an ordered factor:
# come up with some statuses
status <- sample(c('C', 'D'), 42, TRUE, c(.9, .1))
# convert them into factors, specifying that D is "more than" C
status <- ordered(status, c('C', 'D'))
Since the factor is ordered and can be subject to comparison like
status[1] < status[2], you can now use max() on your groups. If the
sample contains any 'D's, max() will return a 'D', because it's larger
than any 'C's. If the sample contains only 'C's, that's the maximal
value by default.
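
For illustration, here is a minimal sketch of how this could be combined
with the daily grouping. The data frame below is made up (only the column
names data_POSIX, hs and status follow the original post), plain mean()
stands in for my.mean, and tapply() is used instead of aggregate() so that
the example does not depend on how aggregate() recombines factor results:

```r
# hypothetical data: two days of half-hourly observations (48 per day)
set.seed(1)
mydf <- data.frame(
  data_POSIX = seq(as.POSIXct("2024-01-01 00:00", tz = "UTC"),
                   by = "30 min", length.out = 96),
  hs = runif(96, 0, 100),
  status = rep("C", 96),
  stringsAsFactors = FALSE
)
mydf$status[60] <- "D"   # a single 'D' on the second day

# make the status an ordered factor with C < D
mydf$status <- ordered(mydf$status, levels = c("C", "D"))

# group by calendar day
day <- format(mydf$data_POSIX, "%Y-%m-%d")

daily <- data.frame(
  day = sort(unique(day)),
  # plain mean() here; substitute your own my.mean
  hs = as.vector(tapply(mydf$hs, day, mean)),
  # max() of an ordered factor is 'D' if any 'D' is present in the group;
  # as.character() keeps the label rather than the integer code
  status = as.vector(tapply(mydf$status, day,
                            function(s) as.character(max(s)))),
  stringsAsFactors = FALSE
)
print(daily)
```

The first day should come out with status 'C' and the second with 'D',
since one 'D' among its 48 values is enough to raise the maximum.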
--
Best regards,
Ivan