[R] Looking for transformation to overcome heterogeneity of variances

Peter Dalgaard p.dalgaard at biostat.ku.dk
Thu Aug 3 20:43:58 CEST 2006


[Resending -- recipient list length issue]

"John Sorkin" <jsorkin at grecc.umaryland.edu> writes:

> Peter

Erm, that was Paul's question, not mine! If you want to help, please
look at the pattern of residuals which he put up on the web at my
request....

> Your question is difficult to answer without more information about the
> distribution of your residuals. Different residual patterns call for
> different transformations to stabilize the variance. One very common
> form of heteroscedasticity is increasing variance with increasing values
> of an independent predictor, i.e. the variance of the residuals from
> regressing y on x increases as x increases. In this case a log
> transformation of some, or all, of the independent variables helps.
> Please also note the comment by Bert Gunter (included below) in which
> some important points are raised, particularly about extreme values.
> 
> If you want more help, please describe the pattern of your residuals. 
> 
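
To make the "variance increases with x" pattern concrete, here is a small
Python sketch on simulated data (not Paul's). It assumes multiplicative
error, so the spread of y grows in proportion to its mean, and it logs the
response as well as the predictor, which goes a step beyond the suggestion
quoted above. The helper names ols_residuals and spread_ratio are made up
for this illustration.

```python
import math
import random
import statistics

def ols_residuals(x, y):
    """Residuals from a simple least-squares line y ~ a + b*x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]

def spread_ratio(x, resid):
    """SD of residuals in the upper half of x divided by SD in the lower half."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    half = len(order) // 2
    low = [resid[i] for i in order[:half]]
    high = [resid[i] for i in order[half:]]
    return statistics.stdev(high) / statistics.stdev(low)

rng = random.Random(1)  # fixed seed so the simulation is repeatable
x = [rng.uniform(1, 10) for _ in range(400)]
# Assumed data-generating model: multiplicative (log-normal) error.
y = [2.0 * xi * math.exp(rng.gauss(0, 0.3)) for xi in x]

# Raw scale: residual spread is clearly larger for large x.
raw_ratio = spread_ratio(x, ols_residuals(x, y))

# Log scale: log(y) = log(2) + log(x) + error, with constant error SD.
log_x = [math.log(xi) for xi in x]
log_y = [math.log(yi) for yi in y]
log_ratio = spread_ratio(log_x, ols_residuals(log_x, log_y))

print(f"spread ratio, raw scale: {raw_ratio:.2f}")
print(f"spread ratio, log scale: {log_ratio:.2f}")
```

A ratio well above 1 on the raw scale and close to 1 on the log scale is
what one would hope to see if a log transformation is appropriate; other
residual patterns would call for something else.
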
> 
> John Sorkin M.D., Ph.D.
> Chief, Biostatistics and Informatics
> Baltimore VA Medical Center GRECC,
> University of Maryland School of Medicine Claude D. Pepper OAIC,
> University of Maryland Clinical Nutrition Research Unit, and
> Baltimore VA Center Stroke of Excellence
> 
> University of Maryland School of Medicine
> Division of Gerontology
> Baltimore VA Medical Center
> 10 North Greene Street
> GRECC (BT/18/GR)
> Baltimore, MD 21201-1524
> 
> (Phone) 410-605-7119
> (Fax) 410-605-7913 (Please call phone number above prior to faxing)
> jsorkin at grecc.umaryland.edu
> 
> >>> Berton Gunter <gunter.berton at gene.com> 8/3/2006 11:56:28 AM >>>
> I know I'm coming late to this, but ...
> 
> > > Is someone able to suggest to me a transformation to overcome the
> > > problem of heteroscedasticity?
> 
> It is not usually useful to worry about this. In my experience, the
> gain in efficiency from using an essentially ideal weighted analysis
> vs. an approximate unweighted one is usually small and unimportant
> (transformation to simplify a model is another issue ...). Of far
> greater importance usually is the loss in efficiency due to the
> presence of a few "unusual" extreme values; have you carefully checked
> to make sure that none of the large sample variances you have are due
> merely to the presence of a small number of highly discrepant values?
> 
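
One rough way to carry out the check suggested above is to compare each
group's ordinary standard deviation with a robust scale estimate; a large
gap hints that a few discrepant values are driving the variance. The Python
sketch below uses made-up numbers, a hypothetical helper mad_scale, and an
arbitrary cut-off of 3, so treat it as an illustration rather than a rule.

```python
import statistics

def mad_scale(values):
    """Median absolute deviation, scaled to match the SD under normality."""
    med = statistics.median(values)
    abs_dev = [abs(v - med) for v in values]
    return 1.4826 * statistics.median(abs_dev)

# Hypothetical groups: A is tight, B has one wild value, C is genuinely noisy.
groups = {
    "A": [10.1, 9.8, 10.3, 9.9, 10.0, 10.2],
    "B": [10.0, 9.7, 10.4, 9.9, 10.1, 25.0],
    "C": [10.0, 7.0, 13.0, 8.5, 11.5, 10.0],
}

THRESHOLD = 3.0  # arbitrary cut-off chosen for this illustration

flagged = []
for name, values in groups.items():
    sd = statistics.stdev(values)
    robust = mad_scale(values)
    ratio = sd / robust
    print(f"{name}: sd={sd:.2f}  mad={robust:.2f}  sd/mad={ratio:.2f}")
    # An SD far above the robust scale suggests a few extreme values.
    if ratio > THRESHOLD:
        flagged.append(name)

print("groups worth a closer look:", flagged)
```

Here only the group with the single wild value stands out, while the
genuinely noisy group does not; plotting the data remains the better first
step.
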
> 
> -- Bert Gunter
> Genentech Non-Clinical Statistics
> South San Francisco, CA
>  
> "The business of the statistician is to catalyze the scientific
> learning process."  - George E. P. Box
> 
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help 
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html 
> and provide commented, minimal, self-contained, reproducible code.
> 

-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)                  FAX: (+45) 35327907


