[R] Looking for transformation to overcome heterogeneity of variances

John Sorkin jsorkin at grecc.umaryland.edu
Thu Aug 3 19:51:18 CEST 2006


Peter,
Your question is difficult to answer without more information about the
distribution of your residuals. Different residual patterns call for
different transformations to stabilize the variance. One very common
form of heteroscedasticity is increasing variance with increasing values
of an independent predictor, i.e. the variance of the residuals from
regressing y on x increases as x increases. In this case a log
transformation of some, or all, of the independent variables often
helps. Please also note the comment by Bert Gunter (included below), in
which some important points are raised, particularly about extreme
values.
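
A rough way to check for the pattern described above is to fit a straight
line, split the residuals at the median of x, and compare their spread in
the two halves. The sketch below does this on simulated data and then
repeats the check after taking logs. Everything in it is an assumption
made for illustration: the data are hypothetical, the helper names
fit_line and spread_ratio are made up, and the log is applied to the
response as well as the predictor (a log-log fit), which goes slightly
beyond what is said above.

```python
import math
import random
import statistics

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def spread_ratio(xs, ys):
    """SD of residuals in the upper half of x divided by SD in the lower half."""
    a, b = fit_line(xs, ys)
    pairs = sorted(zip(xs, ys))  # order observations by x
    resid = [y - (a + b * x) for x, y in pairs]
    half = len(resid) // 2
    return statistics.stdev(resid[half:]) / statistics.stdev(resid[:half])

random.seed(1)
xs = [random.uniform(1, 100) for _ in range(400)]
# Hypothetical data: multiplicative error, so the scatter of y grows with x.
ys = [2.0 * x * math.exp(random.gauss(0, 0.3)) for x in xs]

raw_ratio = spread_ratio(xs, ys)
log_ratio = spread_ratio([math.log(x) for x in xs], [math.log(y) for y in ys])
print(f"raw scale: {raw_ratio:.2f}, log scale: {log_ratio:.2f}")
```

A ratio well above 1 on the raw scale and close to 1 on the log scale
would suggest that the log transformation has roughly stabilized the
variance for data of this kind. A plot of residuals against fitted values
remains the more informative check.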

If you want more help, please describe the pattern of your residuals. 


John Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
Baltimore VA Medical Center GRECC,
University of Maryland School of Medicine Claude D. Pepper OAIC,
University of Maryland Clinical Nutrition Research Unit, and
Baltimore VA Center Stroke of Excellence

University of Maryland School of Medicine
Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524

(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)
jsorkin at grecc.umaryland.edu

>>> Berton Gunter <gunter.berton at gene.com> 8/3/2006 11:56:28 AM >>>
I know I'm coming late to this, but ...

> > Is someone able to suggest to me a transformation to overcome the
> > problem of heteroscedasticity?

It is not usually useful to worry about this. In my experience, the
gain in efficiency from using an essentially ideal weighted analysis
vs. an approximate unweighted one is usually small and unimportant
(transformation to simplify a model is another issue ...). Of far
greater importance usually is the loss in efficiency due to the
presence of a few "unusual" extreme values; have you carefully checked
to make sure that none of the large sample variances you have are due
merely to the presence of a small number of highly discrepant values?
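
One simple way to run the check suggested above is to recompute each
group's sample variance after dropping its single most extreme value and
see how much it falls. The sketch below is only an illustration: the
three samples are hypothetical, the helper names are made up, and the
cutoff of 4 is an arbitrary assumption rather than a recommended
threshold.

```python
import statistics

def variance_without_extreme(values):
    """Sample variance after dropping the single value farthest from the median."""
    med = statistics.median(values)
    farthest = max(values, key=lambda v: abs(v - med))
    trimmed = list(values)
    trimmed.remove(farthest)
    return statistics.variance(trimmed)

def flag_outlier_driven(groups, drop_factor=4.0):
    """Return names of groups whose variance falls by more than drop_factor
    when one extreme value is removed (drop_factor is an arbitrary choice)."""
    flagged = []
    for name, values in groups.items():
        full = statistics.variance(values)
        reduced = variance_without_extreme(values)
        if reduced > 0 and full / reduced > drop_factor:
            flagged.append(name)
    return flagged

# Hypothetical samples: group "B" has one highly discrepant value,
# group "C" is genuinely more variable throughout.
groups = {
    "A": [9.8, 10.1, 10.4, 9.9, 10.2, 10.0],
    "B": [9.7, 10.3, 10.0, 9.9, 10.2, 25.0],
    "C": [4.0, 16.0, 7.5, 13.0, 10.0, 1.5, 18.5],
}
flagged = flag_outlier_driven(groups)
print(flagged)
```

Here only group "B" is flagged: its large variance comes from one value,
whereas group "C" stays highly variable after its most extreme value is
dropped. A flagged value should be investigated, not discarded
automatically.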


-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
"The business of the statistician is to catalyze the scientific
learning process."  - George E. P. Box



