[R] Mimicking SPSS weighted least squares
JRG
loesljrg at verizon.net
Tue Mar 11 03:26:46 CET 2008
On 11 Mar 2008 at 14:09, Rolf Turner wrote:
>
> It would appear that the SPSS procedure would then give exactly the same
> point estimates of the parameters, and change the inference structure by
> changing the ``denominator degrees of freedom'' from n-p to sum(w) - p.
>
Well, if that IS what SPSS does, then it sounds like what Stata calls frequency weights, the
general idea being that each "observation" in fact represents some non-negative integer number (w)
of actual observations that have identical values. Not much more than a glorified version of a
frequency distribution table.
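To make the frequency-weight reading concrete, here is a minimal R sketch. The data frame `d` and the weights `w` are invented for illustration and come from nobody's real data. It fits lm() once with weights= and once on a data frame in which each row is repeated w times, along the lines of Peter Dalgaard's suggestion quoted further down. Whether SPSS's global weight switch does exactly this is still an assumption.

```r
# Toy data, made up purely for illustration
set.seed(1)
d <- data.frame(x = 1:8)
d$y <- 2 + 0.5 * d$x + rnorm(8)
w <- c(1, 3, 2, 1, 4, 2, 1, 3)   # integer "frequency" weights, sum(w) = 17

# Fit 1: weights passed to lm(), which minimizes sum(w * e^2)
fit_w <- lm(y ~ x, data = d, weights = w)

# Fit 2: each row physically repeated w times (the frequency-weight reading)
d_rep <- d[rep(seq_len(nrow(d)), w), ]
fit_rep <- lm(y ~ x, data = d_rep)

# The point estimates agree ...
coef_equal <- isTRUE(all.equal(coef(fit_w), coef(fit_rep)))

# ... but the residual degrees of freedom do not
df_w   <- df.residual(fit_w)    # n - p
df_rep <- df.residual(fit_rep)  # sum(w) - p
```

With these toy numbers the weighted fit has n - p = 6 residual degrees of freedom and the replicated fit has sum(w) - p = 15, so the standard errors and p-values differ even though the coefficients do not.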
I don't see anything fundamentally wrong with frequency weights, given an appropriate situation.
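If replicating rows is too wasteful, the same inference can be had from the ordinary weighted fit by rescaling, since both fits share the same (X'WX)^{-1} and differ only in the divisor of the residual variance. What follows is a sketch under that assumption, again with made-up data. The names `df_freq`, `se_freq`, `t_freq` and `p_freq` are mine and are not part of any R or SPSS interface.

```r
# Toy data, made up purely for illustration
set.seed(2)
d <- data.frame(x = 1:10)
d$y <- 1 + 0.3 * d$x + rnorm(10)
w <- rep(c(2, 5), 5)             # frequency weights, sum(w) = 35

fit_w <- lm(y ~ x, data = d, weights = w)
p <- length(coef(fit_w))

# Degrees of freedom under the frequency-weight reading
df_freq <- sum(w) - p

# Rescale the standard errors: the variance estimate divides the same
# weighted residual sum of squares by sum(w) - p instead of n - p
se_w    <- sqrt(diag(vcov(fit_w)))
se_freq <- se_w * sqrt(df.residual(fit_w) / df_freq)
t_freq  <- coef(fit_w) / se_freq
p_freq  <- 2 * pt(abs(t_freq), df = df_freq, lower.tail = FALSE)

# Brute-force check against the replicated-row fit
fit_rep <- lm(y ~ x, data = d[rep(seq_len(nrow(d)), w), ])
se_rep  <- sqrt(diag(vcov(fit_rep)))
```

Unlike the rep() approach, the rescaling also runs with non-integer weights, though whether the result means anything in that case is another matter.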
---JRG
John R. Gleason
> This seems to me to make little sense ... But then, it ***is***
> SPSS. :-)
>
> cheers,
>
> Rolf
>
> On 11/03/2008, at 11:35 AM, Peter Dalgaard wrote:
>
> > Rolf Turner wrote:
> >> On 11/03/2008, at 4:04 AM, Ben Domingue wrote:
> >>
> >>
> >>> Howdy,
> >>> In SPSS, there are 2 ways to weight a least squares regression:
> >>> 1. You can do it from the regression menu.
> >>> 2. You can set a global weight switch from the data menu.
> >>> These two options have not, in my experience, been equivalent.
> >>> Now, when I run lm in R with the weights= switch set accordingly, I
> >>> get the same set of results you would see with option #1 in SPSS.
> >>> Does anybody know how to duplicate option #2 from SPSS in R?
> >>>
> >>
> >> I think it's up to you to find out what ``option #2 from SPSS''
> >> actually *does*. If you know that, then you can (with a modicum of effort)
> >> duplicate that option in R. The help file for lm() tells you that
> >> R uses the weights by minimizing sum(w*e^2) where w = weights and
> >> e = ``errors'' or residuals.
> >>
> >>
> >>
> > I believe case weighting in SPSS effectively replicates the
> > relevant row (not sure if anything sensible comes out if weights
> > are non-integer). So
> >
> > lm(...., data=mydata[rep(1:nrow(mydata),w),])
> >
> > or thereabouts should do it. Might not be too efficient though.
> >
> > --
> > O__ ---- Peter Dalgaard Øster Farimagsgade 5, Entr.B
> > c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
> > (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
> > ~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907
> >
> >
>