[R] Problems with data structure when using plsr() from package pls
David Winsemius
dwinsemius at comcast.net
Thu Jan 14 18:36:26 CET 2016
> On Jan 14, 2016, at 2:33 AM, CG Pettersson <cg.pettersson at lantmannen.com> wrote:
>
> Dear Jeff,
> thanks for the effort, but the use of I() when preparing the dataset is suggested by the authors (Mevik & Wehrens, section 3.2):
>
> +If Z is a matrix, it has to be protected by the ‘protect function’ I() in calls
> +to data.frame: mydata <- data.frame(..., Z = I(Z)). Otherwise, it will be split into
> +separate variables for each column, and there will be no variable called Z in the data frame,
> +so we cannot use Z in the formula. One can also add the matrix to an existing data frame:
> +R> mydata <- data.frame(...)
> +R> mydata$Z <- Z
>
> In the dataset "gasoline" that is supplied with the pls package, there are two variables, octane and NIR, where NIR is a single variable with 401 columns that can be used like this:
> plsr(octane ~ NIR, data = gasoline)
> I thought "gasoline" was made like the example above, but I must be missing something else.
>
> Whatever I do ends with "invalid type (list) for variable 'n96'"
Was `n96` a list before you put a copy of it into the `frame1` object? Maybe it wasn't a simple matrix. At the very least, you need to post the output of str(n96). Also, never use attach().
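
One guess, which str(n96) would confirm or rule out: if n96 was read in with something like read.table(), it is a data frame rather than a matrix, and a data frame is a list internally. The sketch below uses made-up data (n96_df and the random values are hypothetical stand-ins, not your data) and calls model.frame() directly, since that is where the error is raised. Converting with as.matrix() before wrapping in I() is one way that should avoid it.

```r
# Hypothetical stand-ins for the poster's data: 15 samples, 96 sensor columns.
set.seed(1)
gushVM <- rnorm(15)
n96_df <- as.data.frame(matrix(rnorm(15 * 96), nrow = 15))  # assumption: n96 was a data frame

# Check the structure first; a data frame reports typeof "list".
str(n96_df, list.len = 3)
typeof(n96_df)

# Storing the data frame as a single column keeps it a list, and
# model.frame() (which plsr() calls on the formula) is expected to reject it.
bad <- data.frame(gushVM = gushVM)
bad$n96 <- n96_df
err <- tryCatch(
  model.frame(gushVM ~ n96, data = bad),
  error = function(e) conditionMessage(e)
)
print(err)

# Converting to a numeric matrix first gives one matrix-valued variable,
# which is the layout described in section 3.2 of the tutorial.
good <- data.frame(gushVM = gushVM, n96 = I(as.matrix(n96_df)))
mf <- model.frame(gushVM ~ n96, data = good)
dim(mf$n96)
```

If str(n96) shows something other than a data frame, this guess does not apply and the str() output is still what we need to see.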
--
David.
>
> So I am still stuck
> /CG
>
> From: Jeff Newmiller [mailto:jdnewmil at dcn.davis.ca.us]
> Sent: 14 January 2016 05:16
> To: CG Pettersson; r-help at r-project.org
> Subject: Re: [R] Problems with data structure when using plsr() from package pls
>
> Using I() in the data.frame seems ill-advised to me. You complain about 96 variables but from reading your explanation that seems to be what your data are. I have no idea whether it makes sense to NOT have 96 variables if that is what your data are. Note that a reproducible example supplied by you might help us guess better, but it might just be that your expectations are wrong.
> --
> Sent from my phone. Please excuse my brevity.
> On January 13, 2016 11:02:25 AM PST, CG Pettersson <cg.pettersson at lantmannen.com> wrote:
> R version 3.2.3, W7 64bit.
>
> Dear all!
>
> I am trying to do PLS regression using plsr() from package pls, with Mevik & Wehrens (2007) as a tutorial and the datasets from the package.
> Everything works nicely as long as I use the supplied datasets, but I don't understand how to prepare my own data.
> This is what I have done:
> frame1 <- data.frame(gushVM, I(n96))
>
> Here gushVM is a vector with fifteen reference analysis values of a quality problem in grain, and n96 is a matrix with fifteen rows and 96 columns from an electronic nose. I am trying to copy the method in section 3.2 of Mevik & Wehrens, and want to keep n96 as one variable to avoid addressing 96 different variables in the plsr call. If I don't use I() in the call I get 96 variables instead.
> Looking at the data frame with summary(frame1) gives output much like summary(gasoline) from the package (not shown here).
> But when I try to use plsr() with my own data it doesn't work, due to an error in the data structure:
> pls1 <- plsr(gushVM ~ n96, data = frame1)
> Error in model.frame.default(formula = gushVM ~ n96, data = frame1) :
> invalid type (list) for variable 'n96'
>
> So n96 has turned into a list, and that is the problem. Whether gushVM is a vector (one variable) or a matrix (five variables) does not seem to change anything; handling n96 is the problem.
> I have tried all alternative ways of creating a proper data frame suggested in the article with exactly the same result.
> I have tried the documentation for data.frame() but I probably don't understand what it says.
>
> What should I do to change "n96" into something better than "list"?
>
> Thanks
> /CG
>
> Med vänlig hälsning/Best regards
> CG Pettersson
> Scientific Project Manager, PhD
> ______________________
> Lantmännen Corporate R&D
> Phone: +46 10 556 19 85
> Mobile: + 46 70 330 66 85
> Email: cg.pettersson at lantmannen.com
> Visiting Address: S:t Göransgatan 160 A
> Address: Box 30192, SE-104 25 Stockholm
> Web: http://www.lantmannen.com
> Registered Office: Stockholm
> Before printing, think about the environment
>
>
> [[alternative HTML version deleted]]
>
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
David Winsemius
Alameda, CA, USA