[R] calculating average for multiple subclasses in a data set

Dan Kortschak dan.kortschak at adelaide.edu.au
Sat Mar 28 03:15:46 CET 2009


Hello R users,

I have a data set which is a set of lengths and types of objects. I want
to calculate the mean length for each type of object as opposed to the
mean of all the objects in the set.

The aim is to compare the mean length of each type of object with the
number of objects of that type.

> x
       Chromosome Begin   End      Type   Class     Norm Length
458327          Y     1   318 L2_Plat1b LINE/L2 5.758902    317
458330          Y   439   673 L2_Plat1i LINE/L2 5.455321    234
458331          Y     2   309 L2_Plat1i LINE/L2 5.726848    307
458332          Y  1746  2232 L2_Plat1d LINE/L2 6.186209    486
458333          Y   948  1132 L2_Plat1e LINE/L2 5.214936    184
458335          Y  1511  2043 L2_Plat1f LINE/L2 6.276643    532
458336          Y     1   908 L2_Plat1f LINE/L2 6.810142    907
458337          Y   907  1037 L2_Plat1g LINE/L2 4.867534    130

So a toy set for the relevant parts of the data would be e.g.:

type<-sample(c("L2_Plat1a","L2_Plat1b","L2_Plat1c"),1000,replace=TRUE)
len<-rnorm(1000)
# data.frame() keeps len numeric; cbind() would coerce it to character
dummy<-data.frame(V1=factor(type),len=len)

so looking for

as.data.frame(summary(dummy$V1)) ~ /*average of each type's length*/

as my final goal.
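
For illustration, one way this kind of per-type summary could be produced
in base R is sketched below. It is only a sketch under stated assumptions:
the names toy, n_by_type, mean_by_type and result are made up for the
example, and the toy data is rebuilt with data.frame() so that the
lengths stay numeric. tapply() and table() both order their output by
factor level, which is what lets the counts and means be placed side by
side.

```r
# Self-contained sketch; `toy`, `n_by_type`, `mean_by_type` and `result`
# are illustrative names, not part of the original data.
set.seed(1)  # only so the example is reproducible
type <- sample(c("L2_Plat1a", "L2_Plat1b", "L2_Plat1c"), 1000, replace = TRUE)
len <- rnorm(1000)

# data.frame() keeps len numeric and type as a factor.
toy <- data.frame(type = factor(type), len = len)

# Number of objects per type, and mean length per type.
n_by_type <- table(toy$type)
mean_by_type <- tapply(toy$len, toy$type, mean)

# Combine both into one table, one row per type.
result <- data.frame(type = names(mean_by_type),
                     n = as.vector(n_by_type),
                     mean_len = as.vector(mean_by_type))
print(result)

# The formula interface of aggregate() gives the per-type means directly.
print(aggregate(len ~ type, data = toy, FUN = mean))
```

On the real data shown above, the same idea would presumably be applied
to the Length and Type columns, e.g. tapply(x$Length, x$Type, mean).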

I apologise for the syntax I use: I've only recently come from a Perl
background, so my code is somewhat messy and not very idiomatic. I'm
still having a difficult time figuring out how the various data types
are used and manipulated in R, though I think I'm slowly getting the
hang of it. Any suggestions of a good reference for that (other than the
R Introduction, which didn't help all that much) would be greatly
appreciated.

Thanks for any help,
Dan



