[R] lda in R vs S
Marc R. Feldesman
feldesmanm at pdx.edu
Thu May 6 23:12:22 CEST 1999
At 09:24 PM 5/6/1999 +0100, Prof Brian D Ripley wrote:
>> I'm running a discriminant analysis in R (0.64.1) to compare it with SPlus
>
>That's not released until tomorrow! I guess you have the pre-release,
>prerw0641, which is actually of 0.64.0.
Yes. Actually, it is the pre-release of 0.64.1.
>> 4.5R2. The following command line works fine in SPlus but gives an error
>> in R. I've only used R for a little while so I'm not certain here what R
>> (or lda) is complaining about. The dependent variable (sarich.na[,3]) is
>> an alpha categorical variable, if that makes a difference. I'm using
>
>What's that? The response ought to be a factor, according to the docs:
SAS & SPSS speak. Alpha categorical variable = factor.
> formula: A formula of the form `groups ~ x1 + x2 + ...{}'
> That is, the response is the grouping factor and
> the right hand side specifies the (non-factor)
> discriminators.
>
>> version VR5.3 (file name VR5.3pl037.zip).
>>
>> lda.out<-lda(sarich.na[,3]~., data=sarich.na[,4:32])
>> Error in model.frame(formula, rownames, variables, varnames, extras,
>> extranames, : invalid variable type
>>
>> Is this an lda issue or an R issue?
>
>It is an R issue. Only logical, integer and real variables are allowed
>in R model frames, for as the code says
I haven't delved deeply into R internals yet. I just started experimenting
with it as I was learning SPlus in parallel. So are you saying that, at
present, R won't allow this even though sarich.na[,3] *is* a factor, albeit
one with alpha levels?
>
> /* Sanity checks to ensure that the the answer can become */
> /* a data frame. Be deeply suspicious here! */
>
Deeply suspicious of what?
>But that is not the `right' way to do this in either. Use either
Either? Are you saying that the formulation above isn't correct in
*either* R or SPlus? It works fine in SPlus (and sarich.na[,3] is coded as
a factor with levels "AINU", "BUSHMAN", etc...). But, SPlus also allows
sarich.na[,3] to be on the left side even if it isn't an explicit factor.
Even if it is coded only as a character variable, SPlus allows it, lda
calculates the results, and gives the correct answers. Presumably if this
isn't the "correct" approach, SPlus or lda is coercing the character
variable to a factor. This also works in aov and other functions that take
a formula.
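A quick way to see what is actually stored in the grouping column, and to coerce it explicitly, is sketched here. The data frame is a hypothetical stand-in for sarich.na (the measurements x1 and x2 and their values are invented for illustration; only the level names come from the real data), and the calls assume a current version of R rather than 0.64.1.

```r
# Hypothetical stand-in for sarich.na: a grouping column stored as character
# plus two numeric measurements (values invented for illustration only).
df <- data.frame(
  populati = c("AINU", "AINU", "BUSHMAN", "BUSHMAN"),
  x1 = c(1.2, 1.4, 2.1, 2.3),
  x2 = c(3.1, 2.9, 4.0, 4.2),
  stringsAsFactors = FALSE
)

# Inspect how the grouping column is stored before handing it to lda().
is.factor(df[, 1])   # FALSE: the column is a character vector
class(df[, 1])       # "character"

# Coerce explicitly instead of relying on the modelling function to do it.
df$populati <- factor(df$populati)
is.factor(df$populati)   # TRUE
levels(df$populati)      # "AINU" "BUSHMAN"
```

If is.factor() already returns TRUE for the real column, the coercion is a no-op and the problem lies elsewhere.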
>lda.out<-lda(sarich.na[,4:32], sarich.na[,3])
>
This works fine in SPlus, but not in R, at least not with this data set.
>or
>
>lda.out<-lda(somename ~ ., data=sarich.na[,3:32])
>
>where somename is the name of column 3, and that had better be a factor.
>
This also works in SPlus, but not in R.
However, *this* works:
attach(sarich.na)
lda.out<-lda(as.factor(populati)~., data=sarich.na[,4:32])
This puzzles me. The variable "populati" *is* already a factor, so why would
I have to coerce a factor to a factor to get this to run? Following that
logic, the next variant ought to work as well, but it doesn't:
lda.out<-lda(sarich.na[,4:32], as.factor(sarich.na[,3]))
This emits an error message telling me I can't have negative length
subscripts, an error message that leaves me without a clue at the moment.
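
For comparison, here is a minimal sketch of the two calling conventions discussed in this thread, written for a current R with the MASS package installed. The built-in iris data stands in for sarich.na, which is an assumption made purely for illustration: Species plays the role of the grouping factor and the four measurement columns play the role of columns 4:32.

```r
library(MASS)  # provides lda()

# iris stands in for sarich.na: column 5 (Species) is the grouping factor,
# columns 1:4 are the numeric discriminators.
dat <- iris
dat$Species <- factor(dat$Species)  # harmless if it is already a factor

# Formula interface: name the grouping column and keep it in the data frame,
# so that "." expands to all of the remaining columns.
fit.formula <- lda(Species ~ ., data = dat)

# Data-frame/matrix interface: discriminators first, grouping factor second.
fit.matrix <- lda(dat[, 1:4], grouping = dat[, 5])

# Both calls should describe the same discriminant functions.
all.equal(fit.formula$scaling, fit.matrix$scaling)
```

Naming the response column in the formula, rather than writing something like sarich.na[,3] on the left-hand side, keeps the response and the discriminators in one data frame.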
Dr. Marc R. Feldesman
email: feldesmanm at pdx.edu
email: feldesman at ibm.net
fax: 503-725-3905
"Math is hard. Let's go to the mall" Barbie
Powered by: Monstrochoerus - the 300 MHz Pentium II