Vickie S
isvik at live.com
Thu Feb 9 09:38:51 CET 2012
Thanks for the nice explanation.
Unfortunately, the matrix in my question is exactly like the one I posted earlier:
mat <- matrix(rnorm(700), ncol=5, dimnames=list( paste("f", c(1:140),sep="_"), c("A", "B", "C", "D", "E")))
The question here is which of the 140 characteristics (i.e. f_1...f_140) distinguish most strongly between the five plant
species.
Is it true that this matrix can't be regressed with factor responses (species)? If so, what alternatives can be used?
- Vickie
>
> Dear Vickie,
>
> I'm afraid that the test problem that you've constructed makes no sense, and
> doesn't correspond to the problem that you initially described, in which a
> matrix of presumably 5 responses for presumably 140 observations is
> regressed on 6 predictors. You regressed your randomly generated matrix of 5
> responses and 140 observations on a factor constructed from the distinct 140
> observation names. That factor has 140 levels, and so the model uses 140 df,
> all the df in the data. It's therefore not surprising that the error SSP
> matrix has 0 df, which is exactly what Anova.mlm (actually,
> linearHypothesis.mlm, which it calls) tells you.
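
A minimal base-R sketch of the point above, assuming the same kind of random 140 x 5 matrix (the seed and the object names dat, fit, resid_df and sspe are illustrative, not from the thread). With one factor level per observation the fit is saturated, so no degrees of freedom are left for error:

```r
# Illustrative only: the seed and object names are assumptions, not from the thread.
set.seed(1)
mat <- matrix(rnorm(700), ncol = 5,
              dimnames = list(paste("f", 1:140, sep = "_"),
                              c("A", "B", "C", "D", "E")))
dat <- as.data.frame(mat)
dat$id <- factor(rownames(mat))   # 140 levels, one per observation

fit <- lm(cbind(A, B, C, D, E) ~ id, data = dat)

n_obs    <- nrow(dat)             # number of observations
n_coef   <- fit$rank              # coefficients per response (intercept + 139 contrasts)
resid_df <- df.residual(fit)      # n_obs - n_coef

# Residual (error) SSP matrix: every residual of a saturated fit is
# numerically zero, so this 5 x 5 matrix is numerically zero as well.
sspe <- crossprod(residuals(fit))

print(c(n_obs = n_obs, n_coef = n_coef, resid_df = resid_df))
```

Printing sspe should show entries that are zero up to rounding error, which is consistent with the "deficient rank" message quoted further down.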
>
> The remark about univariate tests that you apparently found on-line
> concerns repeated-measures designs and is not relevant to your data.
> And you can't do a univariate ANOVA when there's 0 df for error in any
> event.
>
> Here's a proper simulation of the kind of data that I think you have:
>
> > library(car)  # provides Anova() for multivariate linear models
> > set.seed(12345)
> > E <- matrix(rnorm(140*5), ncol=5)
> > X <- matrix(rnorm(140*6), ncol=6)
> > Beta <- matrix(runif(6*5), ncol=5)
> > Y <- X %*% Beta + E
> > colnames(Y) <- c("A", "B", "C", "D", "E")
> > colnames(X) <- c("syct", "mmin", "mmax", "cach", "chmin", "chmax")
> > Data <- as.data.frame(cbind(Y, X))
> > mod <- lm(cbind(A, B, C, D, E) ~ syct + mmin + mmax + cach + chmin +
> +           chmax, data=Data)
> > Anova(mod)
>
> Type II MANOVA Tests: Pillai test statistic
>       Df test stat approx F num Df den Df    Pr(>F)
> syct   1   0.41622   18.395      5    129  9.31e-14 ***
> mmin   1   0.48288   24.091      5    129 < 2.2e-16 ***
> mmax   1   0.62100   42.273      5    129 < 2.2e-16 ***
> cach   1   0.61711   41.583      5    129 < 2.2e-16 ***
> chmin  1   0.72547   68.180      5    129 < 2.2e-16 ***
> chmax  1   0.54825   31.311      5    129 < 2.2e-16 ***
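
For readers looking for a dropterm-style workflow with an mlm fit, one possible sketch (not taken from the thread) uses only base R: refit the multivariate model without one predictor and compare the two fits with anova(), which has a method for mlm objects. The data are re-simulated as in the session above; the helper name drop_one_mlm is hypothetical. In a main-effects-only model such as this one, these drop-one comparisons should correspond to the Type II tests in the table, though that is worth checking against Anova() output rather than taking on trust.

```r
# Re-simulate the data from the session above (base R only).
set.seed(12345)
E <- matrix(rnorm(140 * 5), ncol = 5)
X <- matrix(rnorm(140 * 6), ncol = 6)
Beta <- matrix(runif(6 * 5), ncol = 5)
Y <- X %*% Beta + E
colnames(Y) <- c("A", "B", "C", "D", "E")
colnames(X) <- c("syct", "mmin", "mmax", "cach", "chmin", "chmax")
Data <- as.data.frame(cbind(Y, X))

full <- lm(cbind(A, B, C, D, E) ~ syct + mmin + mmax + cach + chmin + chmax,
           data = Data)

# Hypothetical helper: drop each term in turn and compare the reduced fit
# with the full model via stats::anova (anova.mlm), using Pillai's trace.
drop_one_mlm <- function(model, test = "Pillai") {
  terms_to_drop <- attr(terms(model), "term.labels")
  out <- lapply(terms_to_drop, function(tm) {
    reduced <- update(model, as.formula(paste(". ~ . -", tm)))
    anova(model, reduced, test = test)
  })
  names(out) <- terms_to_drop
  out
}

drop_tests <- drop_one_mlm(full)
drop_tests$syct   # comparison of the full model with the fit omitting syct
```

Each list element is an analysis-of-variance table for one dropped predictor, so the terms can be ranked by their test statistics much as with dropterm() for a univariate lm.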
>
> >
> > Dear Prof Fox,
> > I tried anova but got the following error message:
> >
> > mat <- matrix(rnorm(700), ncol=5, dimnames=list( paste("f", c(1:140),
> >     sep="_"), c("A", "B", "C", "D", "E")))
> > summary(Anova(lm(cbind(A, B, C, D, E) ~ factor(rownames(mat)),
> >     data=as.data.frame(mat))))
> >
> > Error in summary(Anova(lm(cbind(A, B, C, D, E) ~
> > factor(rownames(mat)), :
> > error in evaluating the argument 'object' in selecting a method for
> > function 'summary': Error in linearHypothesis.mlm(mod, hyp.matrix.2,
> > SSPE = SSPE, V = V, ...) :
> > The error SSP matrix is apparently of deficient rank = 0 < 5
> >
> > I looked through previous forum posts, and it seems my only option
> > is to perform the univariate test here.
> >
> > Therefore I used the following, but it still results in an error
> > message:
> > Anova(lm(cbind(A, B, C, D, E) ~ factor(rownames(mat)),
> >     data=as.data.frame(mat)), univariate=TRUE, multivariate=FALSE)
> > Error in linearHypothesis.mlm(mod, hyp.matrix.2, SSPE = SSPE, V = V, ...) :
> > The error SSP matrix is apparently of deficient rank = 0 < 5
> >
> > Any suggestions ?
> >
> > Thanks
> > Vickie
> >
> > I think I am still missing some important clues here. Is it because
> > the feww
> >
> > > > Dear R fans,
> > > > I have got a difficult sounding problem.
> > > >
> > > > For fitting a linear model using continuous response and then for
> > > > re- fitting the model after excluding every single variable, the
> > > > following functions can be used.
> > > > library(MASS)
> > > > > model = lm(perf ~ syct + mmin + mmax + cach + chmin + chmax,
> > > > >            data = cpus)
> > > > > dropterm(model, test = "F")
> > > >
> > > > > But I am not sure whether any similar function is available in R
> > > > > for multivariate data with a categorical response.
> > > > My data looks like the following:
> > > > mat <- matrix(rnorm(700), ncol=5, dimnames=list( paste("f",
> > > > c(1:140), sep="_"), c("A", "B", "C", "D", "E")))
> > > >
> > > > > There are 140 features describing 5 different plant species. I want
> > > > > to retain only those features that show good performance in model
> > > > > (by using a function similar to dropterm, which can not be used for
> > > > > mlm objects).
> > > >
> > > > > I would appreciate some hints and suggestions.
> > > >
> > > > Thx
> > > > - Vickie
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
