[R] means over factors in mlm terms
John Fox
jfox at mcmaster.ca
Tue Nov 21 17:46:41 CET 2006
Dear Mike,
I don't have time this morning to work this out in detail, but perhaps the
following will help:
model.response(model.frame(soils.mod)) will give you the response-variable
matrix for the model. variable.names(model.frame(soils.mod)) will give you
the names of the variables in the model, with the lhs first. If you need
more information about the structure of the model, you'll find that in
terms(soils.mod) (but I think that you've already discovered that).
I hope this helps,
John
--------------------------------
John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox
--------------------------------
> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch
> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of
> Michael Friendly
> Sent: Tuesday, November 21, 2006 11:21 AM
> To: R-Help
> Subject: [R] means over factors in mlm terms
>
> I'm trying to write a function to find the means over factors
> of the responses in a mlm (something I would do easily in SAS
> with PROC SUMMARY).
>
> The not-working stub of a function to do what I want is
> below, and my problem is that I don't know how to call
> aggregate (or some other function) in the context of terms in
> a linear model extracted from a lm/mlm object.
>
> means.mlm <- function(mod, terms, data) {
> responses <-dimnames(mod$fitted.values)[[2]]
> if (missing(terms)) terms <- mod$terms
> n.terms <- length(terms)
>
> for (term in 1:n.terms){
> label <- attr(terms[term], "term.labels")
> cat(label,":\n")
> means <- aggregate(data, list(label), FUN=mean)
> print(means)
> }
> }
>
> Here is a sample context-- a randomized block design with two
> crossed factors (Contour, Depth) and 9 responses (pH, N, ...
> Conduc). [An R-readable version of the data is appended at
> the bottom.]
>
> > soils <- read.delim("soils.dat")
> > str(soils)
> 'data.frame': 48 obs. of 14 variables:
> $ Group : int 1 1 1 1 2 2 2 2 3 3 ...
> $ Contour: Factor w/ 3 levels "Depression","Slope",..: 3 3
> 3 3 3 3 3 3
> 3 3 ...
> $ Depth : Factor w/ 4 levels "0-10","10-30",..: 1 1 1 1 2
> 2 2 2 3 3 ...
> $ Gp : Factor w/ 12 levels "D0","D1","D3",..: 9 9 9 9
> 10 10 10 10
> 11 11 ...
> $ Block : int 1 2 3 4 1 2 3 4 1 2 ...
> $ pH : num 5.4 5.65 5.14 5.14 5.14 5.1 4.7 4.46 4.37 4.39 ...
> $ N : num 0.188 0.165 0.26 0.169 0.164 0.094 0.1 0.112 0.112
> 0.058 ...
> $ Dens : num 0.92 1.04 0.95 1.1 1.12 1.22 1.52 1.47 1.07 1.54 ...
> $ P : int 215 208 300 248 174 129 117 170 121 115 ...
> $ Ca : num 16.4 12.3 13.0 11.9 14.2 ...
> $ Mg : num 7.65 5.15 5.68 7.88 8.12 ...
> $ K : num 0.72 0.71 0.68 1.09 0.7 0.81 0.39 0.7 0.74 0.77 ...
> $ Na : num 1.14 0.94 0.6 1.01 2.17 2.67 3.32 3.76 5.74 5.85 ...
> $ Conduc : num 1.09 1.35 1.41 1.64 1.85 3.18 4.16 5.14
> 5.73 6.45 ...
> >
> I fit a mlm model,
>
> > attach(soils)
> > soils.mod <- lm(cbind(pH,N,Dens,P,Ca,Mg,K,Na,Conduc) ~ Block +
> Contour*Depth)
>
> and then I want to get tables of the means over the factors
> in the model terms. From the console, I can get the means
> for a factor or model term (but with annoying warning messages
>
> > (m1<-aggregate(soils, list(Contour), mean))
> Group.1 Group Contour Depth Gp Block pH N
> Dens
> P Ca Mg
> 1 Depression 10.5 NA NA NA 2.5 4.691875 0.0873125 1.343125
> 188.1875 7.101250 8.98625
> 2 Slope 6.5 NA NA NA 2.5 4.746250 0.1059375 1.332500
> 159.4375 8.109375 8.32000
> 3 Top 2.5 NA NA NA 2.5 4.570000 0.1125625 1.271875
> 150.8750 8.877500 8.08750
> K Na Conduc
> 1 0.378125 5.823125 6.945625
> 2 0.415625 6.103125 6.963750
> 3 0.605000 4.872500 5.856250
> Warning messages:
> 1: argument is not numeric or logical: returning NA in:
> mean.default(X[[1]], ...)
> 2: argument is not numeric or logical: returning NA in:
> mean.default(X[[2]], ...)
> ... (suppressed) ...
> 9: argument is not numeric or logical: returning NA in:
> mean.default(X[[3]], ...)
> >
>
> When I call my function, I get:
>
> > means.mlm(soils.mod, data=soils)
> Block :
> Error in FUN(X[[1]], ...) : arguments must have same length
> > traceback()
> 6: stop("arguments must have same length")
> 5: FUN(X[[1]], ...)
> 4: lapply(x, tapply, by, FUN, ..., simplify = FALSE)
> 3: aggregate.data.frame(data, list(label), FUN = mean)
> 2: aggregate(data, list(label), FUN = mean)
> 1: means.mlm(soils.mod, data = soils)
> >
>
> The soils data:
>
> soils <-
> structure(list(Group = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3,
> 4, 4, 4, 4, 5, 5, 5, 5, 6, 6, 6, 6, 7, 7, 7, 7, 8, 8, 8, 8,
> 9, 9, 9, 9, 10, 10, 10, 10, 11, 11, 11, 11, 12, 12, 12, 12),
> Contour = structure(c(3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
> 3, 3, 3, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1,
> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1), .Label =
> c("Depression", "Slope", "Top"), class = "factor"),
> Depth = structure(c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4,
> 4, 4, 4, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4,
> 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4), .Label
> = c("0-10",
> "10-30", "30-60", "60-90"), class = "factor"), Gp =
> structure(c(9,
> 9, 9, 9, 10, 10, 10, 10, 11, 11, 11, 11, 12, 12, 12, 12,
> 5, 5, 5, 5, 6, 6, 6, 6, 7, 7, 7, 7, 8, 8, 8, 8, 1, 1, 1,
> 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4), .Label = c("D0",
> "D1", "D3", "D6", "S0", "S1", "S3", "S6", "T0", "T1", "T3",
> "T6"), class = "factor"), Block = c(1, 2, 3, 4, 1, 2, 3,
> 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2,
> 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1,
> 2, 3, 4), pH = c(5.4, 5.65, 5.14, 5.14, 5.14, 5.1, 4.7, 4.46,
> 4.37, 4.39, 4.17, 3.89, 3.88, 4.07, 3.88, 3.74, 5.11, 5.46,
> 5.61, 5.85, 4.57, 5.11, 4.78, 6.67, 3.96, 4, 4.12, 4.99,
> 3.8, 3.96, 3.93, 4.02, 5.24, 5.2, 5.3, 5.67, 4.46, 4.91,
> 4.79, 5.36, 3.94, 4.52, 4.35, 4.64, 3.82, 4.24, 4.22, 4.41
> ), N = c(0.188, 0.165, 0.26, 0.169, 0.164, 0.094, 0.1, 0.112,
> 0.112, 0.058, 0.078, 0.07, 0.077, 0.046, 0.055, 0.053, 0.247,
> 0.298, 0.145, 0.186, 0.102, 0.097, 0.122, 0.083, 0.059, 0.05,
> 0.086, 0.048, 0.049, 0.036, 0.048, 0.039, 0.194, 0.256, 0.136,
> 0.127, 0.087, 0.092, 0.047, 0.095, 0.054, 0.051, 0.032, 0.065,
> 0.038, 0.035, 0.03, 0.058), Dens = c(0.92, 1.04, 0.95, 1.1,
> 1.12, 1.22, 1.52, 1.47, 1.07, 1.54, 1.26, 1.42, 1.25, 1.54,
> 1.53, 1.4, 0.94, 0.96, 1.1, 1.2, 1.37, 1.3, 1.3, 1.42, 1.53,
> 1.5, 1.55, 1.46, 1.48, 1.28, 1.42, 1.51, 1, 0.78, 1, 1.13,
> 1.24, 1.47, 1.46, 1.26, 1.6, 1.53, 1.55, 1.46, 1.4, 1.47,
> 1.56, 1.58), P = c(215, 208, 300, 248, 174, 129, 117, 170,
> 121, 115, 112, 117, 127, 91, 91, 79, 261, 300, 242, 229,
> 156, 139, 214, 132, 98, 115, 148, 97, 108, 103, 109, 100,
> 445, 380, 259, 248, 276, 158, 121, 195, 148, 115, 82, 152,
> 105, 100, 97, 130), Ca = c(16.35, 12.25, 13.02, 11.92, 14.17,
> 8.55, 8.74, 9.49, 8.85, 4.73, 6.29, 6.61, 6.41, 3.82, 4.98,
> 5.86, 13.25, 12.3, 9.66, 13.78, 8.58, 8.58, 8.22, 12.68,
> 4.8, 5.06, 6.16, 7.49, 3.82, 4.78, 4.93, 5.66, 12.27, 11.39,
> 9.96, 9.12, 7.24, 7.37, 6.99, 8.59, 4.85, 6.34, 5.99, 4.43,
> 4.65, 4.56, 5.29, 4.58), Mg = c(7.65, 5.15, 5.68, 7.88, 8.12,
> 6.92, 8.16, 9.16, 10.35, 6.91, 7.95, 9.76, 10.96, 6.61, 8,
> 10.14, 7.55, 7.5, 6.76, 7.12, 9.92, 8.69, 7.75, 9.56, 10,
> 8.91, 7.58, 9.38, 8.8, 7.29, 7.47, 8.84, 6.27, 7.55, 8.08,
> 7.04, 9.4, 10.57, 9.91, 8.66, 9.62, 9.78, 9.73, 10.54, 9.85,
> 8.95, 8.37, 9.46), K = c(0.72, 0.71, 0.68, 1.09, 0.7, 0.81,
> 0.39, 0.7, 0.74, 0.77, 0.26, 0.41, 0.56, 0.5, 0.23, 0.41,
> 0.61, 0.68, 0.63, 0.62, 0.63, 0.42, 0.32, 0.55, 0.36, 0.28,
> 0.16, 0.4, 0.24, 0.24, 0.14, 0.37, 0.72, 0.78, 0.45, 0.55,
> 0.43, 0.59, 0.3, 0.48, 0.18, 0.34, 0.22, 0.22, 0.18, 0.33,
> 0.14, 0.14), Na = c(1.14, 0.94, 0.6, 1.01, 2.17, 2.67, 3.32,
> 3.76, 5.74, 5.85, 5.3, 8.3, 9.67, 7.67, 8.78, 11.04, 1.86,
> 2, 1.01, 3.09, 3.67, 4.7, 3.07, 8.3, 6.52, 7.91, 6.39, 9.7,
> 9.57, 9.67, 9.65, 10.54, 1.02, 1.63, 1.97, 1.43, 4.17, 5.07,
> 5.15, 4.17, 7.2, 8.52, 7.02, 7.61, 10.15, 10.51, 8.27, 9.28
> ), Conduc = c(1.09, 1.35, 1.41, 1.64, 1.85, 3.18, 4.16, 5.14,
> 5.73, 6.45, 8.37, 9.21, 10.64, 10.07, 11.26, 12.15, 2.61,
> 1.98, 0.76, 2.85, 3.24, 4.63, 3.67, 8.1, 7.72, 9.78, 9.07,
> 9.13, 11.57, 11.42, 13.32, 11.57, 0.75, 2.2, 2.27, 0.67,
> 5.08, 6.37, 6.82, 3.65, 10.14, 9.74, 8.6, 9.09, 12.26, 11.29,
> 9.51, 12.69)), .Names = c("Group", "Contour", "Depth",
> "Gp", "Block", "pH", "N", "Dens", "P", "Ca", "Mg", "K", "Na", "Conduc"
> ), class = "data.frame", row.names = c("1", "2", "3", "4",
> "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15",
> "16", "17", "18", "19", "20", "21", "22", "23", "24", "25",
> "26", "27", "28", "29", "30", "31", "32", "33", "34", "35",
> "36", "37", "38", "39", "40", "41", "42", "43", "44", "45",
> "46", "47", "48"))
>
> --
> Michael Friendly Email: friendly AT yorku DOT ca
> Professor, Psychology Dept.
> York University Voice: 416 736-5115 x66249 Fax: 416 736-5814
> 4700 Keele Street http://www.math.yorku.ca/SCS/friendly.html
> Toronto, ONT M3J 1P3 CANADA
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
More information about the R-help
mailing list