[R] Tapply for Group Specific Means and Proportions
Bret Collier
bacollier at ag.tamu.edu
Mon Mar 3 23:27:55 CET 2008
UseRs,
I am working on a dataset (see the small example below) in which individuals
were followed on a specific date-time combination and multiple repeated
measurements were taken (e.g., height in meters, behavior class as a
2-letter code). The number of observations varied among individuals (ranging
from 1 observation per date-time combo to >50).
I am trying to summarize the data into 1 row per individual-date-time
combination. I used tapply to pull the mean height (TreeHt) out for each
date-time combo. However, all my attempts to get the proportion of
times a specific behavior category occurs within the same date-time
combo have failed so far. I have tried tapply, aggregate, table
(because Behavior is a factor), etc. Most likely I did not search
the help archives with the right combination of words.
If anyone can point me in the right direction toward streamlining my
code to output the summaries along these general lines (column headers
being the Behavior categories, 0.xx being the proportion per date-time)
I would appreciate it:
Date-Time MeanHt PE OS SI ...
28Mar96.0752 6.000000 0.xx 0.xx 0.xx ...
28Mar96.1014 7.000000 0.xx 0.xx 0.xx ...
TIA,
Bret (R 2.6.1 on I386-pc-mingw32)
Texas A&M
> Final
Sequence testdate testtime Behavior Substrate TreeHt
1 1 28Mar96 0752 PE TW 6
2 2 28Mar96 0752 OS <NA> 6
3 3 28Mar96 0752 PE TW 6
4 4 28Mar96 0752 PE TW 6
5 1 28Mar96 0924 PE TW 8
6 2 28Mar96 0924 PE BR 8
7 3 28Mar96 0924 PE TW 7
8 4 28Mar96 0924 SI TW 7
9 5 28Mar96 0924 PE TW 7
10 6 28Mar96 0924 PE TW 7
11 1 28Mar96 0954 HO BR 10
12 2 28Mar96 0954 PE BR 10
13 1 28Mar96 1014 PE TW 7
14 2 28Mar96 1014 HO TW 7
15 1 29Mar96 0835 PE TW 4
16 2 29Mar96 0835 EA BR 4
17 3 29Mar96 0835 MA BR 4
18 4 29Mar96 0835 PE TW 5
19 5 29Mar96 0835 PE TW 5
20 6 29Mar96 0835 PE TW 13
21 7 29Mar96 0835 PE TW 13
22 8 29Mar96 0835 PE TW 13
23 9 29Mar96 0835 PE BR 13
24 10 29Mar96 0835 PE TW 13
25 11 29Mar96 0835 HO TW 12
26 12 29Mar96 0835 HO TW 12
27 13 29Mar96 0835 HO TW 12
28 14 29Mar96 0835 HO TW 12
29 15 29Mar96 0835 PE TW 13
30 16 29Mar96 0835 PE TR 13
31 17 29Mar96 0835 FL <NA> NA
32 18 29Mar96 0835 PE BR 12
33 19 29Mar96 0835 FL <NA> NA
34 20 29Mar96 0835 PE TW 13
35 21 29Mar96 0835 PE TW 13
36 22 29Mar96 0835 FL <NA> NA
37 23 29Mar96 0835 HO TW 4
38 24 29Mar96 0835 PE BR 5
39 25 29Mar96 0835 PE BR 5
40 26 29Mar96 0835 PE BR 5
41 27 29Mar96 0835 PE TW 4
42 28 29Mar96 0835 PE TW 5
43 29 29Mar96 0835 PE TW 5
44 30 29Mar96 0835 PE TW 13
45 31 29Mar96 0835 PE TW 5
> str(Final)
'data.frame': 45 obs. of 6 variables:
$ Sequence : num 1 2 3 4 1 2 3 4 5 6 ...
$ testdate : Factor w/ 2 levels "28Mar96","29Mar96": 1 1 1 1 1 1 1 1 1 1 ...
$ testtime : Factor w/ 5 levels "0752","0835",..: 1 1 1 1 3 3 3 3 3 3 ...
$ Behavior : Factor w/ 7 levels "EA","FL","HO",..: 6 5 6 6 6 6 6 7 6 6 ...
$ Substrate: Factor w/ 3 levels "BR","TR","TW": 3 NA 3 3 3 1 3 3 3 3 ...
$ TreeHt : num 6 6 6 6 8 8 7 7 7 7 ...
> test <- sort(tapply(Final$TreeHt, INDEX=interaction(Final$testdate,
+     Final$testtime), FUN=mean, na.rm=TRUE))
> data.frame(test)
test
28Mar96.0752 6.000000
28Mar96.1014 7.000000
28Mar96.0924 7.333333
29Mar96.0835 8.928571
28Mar96.0954 10.000000
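
For reference, here is a minimal sketch of one way the summary described above (mean height plus one proportion column per Behavior category) might be produced. It is an illustration under stated assumptions, not a definitive solution: it rebuilds only rows 1-14 of Final (the 28Mar96 observations) so that it runs on its own, and the names grp, MeanHt, props and result are hypothetical. The idea is to cross-tabulate the date-time grouping against Behavior with table() and turn the counts into row proportions with prop.table().

```r
# Rows 1-14 of 'Final' as printed above (28Mar96 only), rebuilt so the
# sketch is self-contained; Sequence and Substrate are omitted.
Final <- data.frame(
  testdate = factor(rep("28Mar96", 14)),
  testtime = factor(c(rep("0752", 4), rep("0924", 6),
                      rep("0954", 2), rep("1014", 2))),
  Behavior = factor(c("PE", "OS", "PE", "PE",
                      "PE", "PE", "PE", "SI", "PE", "PE",
                      "HO", "PE",
                      "PE", "HO")),
  TreeHt   = c(6, 6, 6, 6, 8, 8, 7, 7, 7, 7, 10, 10, 7, 7)
)

# One grouping factor per date-time combination; drop = TRUE discards
# date-time combinations that never occur in the data.
grp <- interaction(Final$testdate, Final$testtime, drop = TRUE)

# Mean height per date-time combination (NA heights ignored).
MeanHt <- tapply(Final$TreeHt, grp, mean, na.rm = TRUE)

# Counts of each Behavior within each group, converted to row proportions
# (margin = 1 makes every row sum to 1).
props <- prop.table(table(grp, Final$Behavior), margin = 1)

# Combine into one row per date-time combination, one column per Behavior.
result <- data.frame(
  MeanHt = as.vector(MeanHt[rownames(props)]),
  as.data.frame.matrix(props),
  row.names = rownames(props)
)

print(round(result, 2))
```

Note that table() by default leaves out observations whose Behavior is NA, so the proportions are taken over the non-missing behavior records in each group; whether that is the desired denominator is a judgment call for the full dataset.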