[R] predict() works with the design matrix but throws error with some rows of that matrix

peter dalgaard pdalgd at gmail.com
Fri Nov 20 10:07:09 CET 2015

> On 20 Nov 2015, at 04:53 , Damir Cosic <damir.cosic at gmail.com> wrote:
> Hello,
> I am having problems with predict() after a multinomial logit regression by
> multinom(). I generate a design matrix with model.matrix() and use it to
> estimate the model. Then, if I pass the entire design matrix to predict(),
> it returns the same output as fitted(), which is expected. But if I pass
> only a few rows of the design matrix, it throws this error:
> Error in model.frame.default(Terms, newdata, na.action = na.omit, xlev = object$xlevels) :
>   variable lengths differ (found for 'z')
> In addition: Warning message:
> 'newdata' had 6 rows but variables found have 15 rows

Offhand (sorry, no time for testing things this morning) I suspect that you are mixing paradigms. You can _either_ multiply coefficients with a design matrix _or_ look up variables in a data frame, and I think you are trying to look up variables in a matrix. In particular, I don't expect mm to have a column called "z" (the factor gets expanded into a dummy column, presumably "ztall").

Accordingly, neither of your examples actually works: in both cases, z (and x?) are found in the global environment, and it is only in the latter example that the inconsistency is discovered. I think you want either to use model.frame (i.e., pass a data frame as newdata) or an explicit mm %*% coef(model) (or thereabouts).
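A rough sketch of the two routes, based on the example quoted below and not tested against your real data. The set.seed() call, trace = FALSE, and the names p_df, p_mm, eta, num and new_obs are my additions for illustration. The matrix route assumes coef() on a three-category multinom fit returns one row of coefficients per non-reference category ("bad", "ugly"), with "good" as the baseline.

```r
library(nnet)

set.seed(1)  # assumption: added only so the sketch is reproducible
y <- factor(rep(c(1, 2, 3), 5), levels = 1:3, labels = c("good", "bad", "ugly"))
x <- rnorm(15) + .2 * rep(1:3, 5)
z <- factor(rep(c(1, 2, 2), 5), levels = 1:2, labels = c("short", "tall"))
df <- data.frame(y = y, x = x, z = z)

m <- multinom(y ~ x + z, data = df, trace = FALSE)

# Route 1: give predict() a data frame, so x and z are looked up by name.
p_df <- predict(m, newdata = head(df), type = "probs")

# Route 2: multiply a design matrix (intercept column kept) by the coefficients.
mm <- model.matrix(~ x + z, data = df)
eta <- head(mm) %*% t(coef(m))     # linear predictors for "bad" and "ugly"
num <- cbind(good = 1, exp(eta))   # baseline category has linear predictor 0
p_mm <- num / rowSums(num)

# The two routes should agree up to numerical tolerance.
all.equal(unname(p_df), unname(p_mm), tolerance = 1e-6)

# Out-of-sample prediction: a new data frame with the same variable names.
new_obs <- data.frame(x = c(0, 1),
                      z = factor(c("short", "tall"), levels = levels(df$z)))
p_new <- predict(m, newdata = new_obs, type = "probs")
```

The data frame route is probably the one to prefer for out-of-sample work, since predict() then rebuilds the dummy coding itself from the stored factor levels.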

> This is a minimal example:
> require(nnet)
> y<-factor(rep(c(1,2,3),5), levels=1:3, labels=c("good","bad","ugly"))
> x<-rnorm(15)+.2*rep(1:3,5)
> z<-factor(rep(c(1,2,2),5), levels=1:2, labels=c("short","tall"))
> df<-data.frame(y=y, x=x, z=z)
> mm<-model.matrix(~x+z, data=df)[,2:3]
> m<-multinom(y ~ x+z, data=df)
> p1<-predict(m,mm,"probs")
> p2<-predict(m,head(mm),"probs")
> My actual goal is out-of-sample prediction, but I could not make it work
> and, while debugging it, I reduced it to this problem.
> Best,
> Damir

Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
