[R] Re: reducing data.frame
Petr PIKAL
petr.pikal at precheza.cz
Thu Feb 25 08:04:51 CET 2010
Hi,
you can use aggregate or tapply. You did not say which function should do
the "reduction", so I assume mean.
aggregate(multi[, num.cols], multi[, c("id", "r")], mean, na.rm = TRUE)
Here num.cols stands for a character vector naming the numeric columns you
want to average. This does not handle the character columns, though. For
those you could try ?ave, or the split/sapply approach.
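
As a rough sketch of the two steps above, the following uses a small made-up stand-in for `multi` (the values are illustrative only). The helper `first_non_na` is a hypothetical name, and keeping the first non-missing value per group is an assumption about what "reducing" a character column should mean. It reuses aggregate with a custom function instead of ave or split/sapply, and it assumes the text columns are character rather than factor.

```r
# Made-up stand-in for the poster's data; not the real `multi`
multi <- data.frame(
  id      = c("100", "100", "100", "101", "101"),
  r       = c(0.28, 0.28, 0.28, 0, 0.54),
  wi.tau  = c(21.7, 21.8, 22.4, 19.7, 17.7),
  k       = c(210, 182, 206, 182, 98),
  a.rater = c(NA, NA, "Client", NA, NA),
  eml     = c(NA, "Early", NA, "Early", NA),
  stringsAsFactors = FALSE
)

# Numeric columns: one mean per (id, r) combination
num <- aggregate(multi[, c("wi.tau", "k")], multi[, c("id", "r")],
                 mean, na.rm = TRUE)

# Character columns: keep the first non-missing value in each group,
# or NA when the whole group is missing (hypothetical helper)
first_non_na <- function(x) {
  x <- x[!is.na(x)]
  if (length(x) > 0) x[1] else NA_character_
}
chr <- aggregate(multi[, c("a.rater", "eml")], multi[, c("id", "r")],
                 first_non_na)

# Combine both pieces into one row per (id, r)
reduced <- merge(num, chr, by = c("id", "r"))
print(reduced)
```

If a group holds several different non-missing values in one character column, this keeps only the first; something like `paste(unique(x), collapse = "/")` would keep them all.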
There could be another issue with the r values. They appear to be fractional
numerics, and depending on how they were created, values that print
identically may not be exactly equal, so they would end up in different groups.
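
A minimal illustration of that pitfall, with made-up values rather than the poster's r column. Rounding the grouping key is one possible workaround; the choice of 7 digits is an assumption based on how the values are printed.

```r
# Two values that print alike but differ in their binary representation
a <- 0.1 + 0.2
b <- 0.3
exact_equal <- a == b                    # exact comparison
near_equal  <- isTRUE(all.equal(a, b))   # comparison within a tolerance

# Rounding the key before grouping merges near-identical values
r <- c(a, b, 0.5)
groups_raw     <- length(unique(r))
groups_rounded <- length(unique(round(r, 7)))

print(c(exact_equal = exact_equal, near_equal = near_equal))
print(c(groups_raw = groups_raw, groups_rounded = groups_rounded))
```

In practice this would mean grouping by `round(multi$r, 7)` instead of `multi$r` in the aggregate call.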
Regards
Petr
r-help-bounces at r-project.org wrote on 25.02.2010 06:44:03:
> Hi All,
>
> Is there an easy way to reduce a data.frame to 1 'id' per row while keeping
> information from the other rows of that same variable, if applicable? e.g.:
>
> # data
>
> multi[1:15,]
>      id         r  n wi   wi.tau         z   k alliance a.rater   eml   treatment outcome  o.rater german
>  1  100 0.2800000 44 41 21.72514 0.2876821 210     <NA>    <NA>  <NA>        <NA>    <NA>   Client   <NA>
>  2  100 0.2800000 44 41 21.80953 0.2876821 182     <NA>    <NA> Early        <NA>    <NA>     <NA>   <NA>
>  3  100 0.2800000 44 41 22.36641 0.2876821 206     <NA>  Client  <NA>        <NA>    <NA>     <NA>   <NA>
>  4  100 0.2800000 44 41 23.59224 0.2876821 188     <NA>    <NA>  <NA>        <NA>    <NA>     <NA>  Other
>  5  100 0.2800000 44 41 23.83157 0.2876821 147      WAI    <NA>  <NA>        <NA>    <NA>     <NA>   <NA>
>  6  101 0.0000000 37 34 19.65678 0.0000000 182     <NA>    <NA> Early        <NA>    <NA>     <NA>   <NA>
>  7  101 0.5423790 37 34 17.65078 0.6075200  98     <NA>    <NA>  <NA> Psychodymic    <NA>     <NA>   <NA>
>  8  101 0.5423790 37 34 19.58820 0.6075200 210     <NA>    <NA>  <NA>        <NA>    <NA> Observer   <NA>
>  9  101 0.5423790 37 34 21.09334 0.6075200 188     <NA>    <NA>  <NA>        <NA>    <NA>     <NA>  Other
> 10  101 0.9075737 37 34 19.65678 1.5135878 182     <NA>    <NA>  Late        <NA>    <NA>     <NA>   <NA>
> 11 103a 0.4950000 18 15 10.36364 0.5426615  90     <NA>    <NA>  <NA>        <NA>     SCL     <NA>   <NA>
> 12 103a 0.6171548 18 15 11.32425 0.7203964 210     <NA>    <NA>  <NA>        <NA>    <NA> Observer   <NA>
> 13 103a 0.6171548 18 15 11.34714 0.7203964 182     <NA>    <NA> Early        <NA>    <NA>     <NA>   <NA>
> 14 103a 0.6171548 18 15 11.49606 0.7203964 206     <NA>  Client  <NA>        <NA>    <NA>     <NA>   <NA>
> 15 103a 0.6171548 18 15 11.81150 0.7203964 188     <NA>    <NA>  <NA>        <NA>    <NA>     <NA>  Other
>
> # with the goal of having a reduced df (1 id per row) like this:
>
>      id         r  n wi   wi.tau         z   k alliance a.rater   eml   treatment outcome  o.rater german
>  1  100 0.2800000 44 41 21.72514 0.2876821 210      wai  client early        <NA>    <NA>   Client  other
> 101 etc...
>
> Ideally, I would like to reduce by id and r, if the values are the same and
> keep any discrepant values as a separate row (if possible), e.g.:
>
>  6  101 0.0000000 37 34 19.65678 0.0000000 182     <NA>    <NA> Early        <NA>    <NA>     <NA>   <NA>
>  7  101 0.5423790 37 34 17.65078 0.6075200  98     <NA>    <NA>  Late Psychodymic    <NA> Observer  Other
>
> I appreciate any assistance,
>
> AC