[R] Caution on the use of model.matrix.

Rolf Turner rolf at math.unb.ca
Thu Jun 2 21:14:34 CEST 2005


Brian Ripley wrote:

> <snip> But the real problem is more likely that Rolf has not passed 
> model.matrix a model frame, so it calls model.frame() internally.  The 
> help page is a bit confused in that it says
> 
>      data: a data frame created with 'model.frame'.
> 
> which the default for the argument is not.  So a better solution would 
> then be to call model.frame and pass a model frame to model.matrix.
> 
> delete.response() might also be useful.
> 
> The suggested warning only applies if `data' is not supplied.

	I don't grok this.  I ***did*** supply data (in the form
	of a data frame, not a model frame).  My call was of the form

		X <- model.matrix(fmla,XXX)

	where (originally) ``fmla'' was a formula with the structure
	``y ~ x + w + z'', and XXX was a data frame with columns
	``y'', ``x'', ``w'', and ``z''.  (The response variable ``y''
	had NAs in it, which caused the problem.) The data frame XXX was
	``input data''; it was not created with model.frame, but it
	was data nonetheless.

	I replaced the foregoing call with

		X <- model.matrix(fmla[-2],XXX)

	(the ``-2'' causing the ``y'' part of the formula to
	be discarded) and got the results I wanted.
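
	A minimal sketch of the behaviour described above. The data
	frame contents are made up for illustration (only the names
	``fmla'', ``XXX'', ``y'', ``x'', ``w'' and ``z'' come from the
	post), and the sketch assumes that options("na.action") is at
	its usual default of na.omit.

```r
# Hypothetical stand-in for XXX: only the response y contains NAs.
XXX <- data.frame(
  y = c(1.2, NA, 3.1, NA, 5.0),
  x = c(1, 2, 3, 4, 5),
  w = c(0.5, 0.1, 0.9, 0.3, 0.7),
  z = c(2, 1, 4, 3, 6)
)
fmla <- y ~ x + w + z

# XXX is a plain data frame, so model.matrix calls model.frame()
# internally; with na.action = na.omit the rows where y is NA are dropped.
X_full <- model.matrix(fmla, XXX)

# Dropping the response gives the one-sided formula ~ x + w + z, so y
# never enters the internal model frame and every row is kept.
X_rhs <- model.matrix(fmla[-2], XXX)

nrow(X_full)  # fewer rows than XXX
nrow(X_rhs)   # same number of rows as XXX
```

	Comparing nrow(X_full) with nrow(XXX) is a quick way to check
	whether rows have been dropped silently.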

	There may be a better way of achieving my goal, but
	I'm happy with my method --- unless someone points out
	lurking hazards that have so far not been apparent to me.
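
	One possible alternative, sketched here along the lines of
	the delete.response() and model.frame() suggestions quoted
	above, is to drop the response from the terms object and pass
	an explicit model frame to model.matrix. The data are again
	hypothetical, and this is only one way of arranging the calls.

```r
# Hypothetical stand-in for XXX: only the response y contains NAs.
XXX <- data.frame(
  y = c(1.2, NA, 3.1, NA, 5.0),
  x = c(1, 2, 3, 4, 5),
  w = c(0.5, 0.1, 0.9, 0.3, 0.7),
  z = c(2, 1, 4, 3, 6)
)
fmla <- y ~ x + w + z

# Remove the response from the terms object instead of indexing the formula.
tt <- delete.response(terms(fmla))

# Build the model frame explicitly from the response-free terms,
# then hand that model frame to model.matrix.
mf <- model.frame(tt, XXX)
X <- model.matrix(tt, mf)

names(mf)  # y is not part of the model frame
nrow(X)    # one row per row of XXX
```

	A possible reason to prefer this form: if the formula were
	already one-sided, fmla[-2] would remove the right-hand side
	instead of a response, whereas delete.response() leaves such
	a terms object unchanged.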

	I merely wanted to point out to others the somewhat
	unintuitive behaviour of model.matrix.

				cheers,

					Rolf Turner
					rolf at math.unb.ca



