[R] Constrain coefs. in linear model to sum to 0
Gregor Gorjanc
gregor.gorjanc at bfro.uni-lj.si
Thu Aug 10 00:27:05 CEST 2006
Hello,
Bill.Venables at csiro.au wrote:
>
> Gorjanc Gregor asks:
>
>> Hello!
>>
>> I would like to use a constraint to make the coefficients of a factor
>> sum to 0 instead of the classical corner constraint, i.e. I would like
>> to fit a model like
>>
>> lm(y ~ 1 + effectA + effectB)
>>
>> and say get parameters
>>
>> intercept
>> effectA_1
>> effectA_2
>> effectB_1
>> effectB_2
>> effectB_3
>>
>> where effectA_1 represents deviation of level A_1 from intercept and
>> sum(effectA_1, effectA_2) = 0 and the same for factor B.
>>
>> Is this possible to do?
>
> Yes, but not quite as simply as you would like. If you set
>
> options(contrasts = c("contr.sum", "contr.poly"))
>
> for example, then factor models are parametrised as you wish above,
> BUT you don't get all the effects directly
>
> In your case above, for example, if fm is the fitted model object, then
>
> coef(fm)
>
> will give you the intercept, effectA_2, effectB_2, effectB_3. The
> remaining effect*_1 terms you will need to calculate as the negative
> of the sum of all the others.
>
> This gets a bit more complicated when you have crossed terms, a*b, but
> the same principle applies.
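A minimal sketch of the recipe described above (the factor name and the
toy data here are made up, not from the thread): with "contr.sum" in
force, the first level of each factor is dropped from the printout, and
its effect is minus the sum of the printed ones.

```r
## Sketch: recover the omitted first-level effect under contr.sum.
set.seed(1)
op <- options(contrasts = c("contr.sum", "contr.poly"))
f  <- factor(rep(c("a1", "a2", "a3"), each = 10))
yy <- c(1, 2, 3)[as.integer(f)] + rnorm(30, sd = 0.1)
fm <- lm(yy ~ f)
cf <- coef(fm)            # (Intercept), f1, f2 -- f3 is not printed
f3 <- -sum(cf[-1])        # the omitted effect is minus the sum
c(cf[-1], f3 = f3)        # all three effects; they sum to zero
options(op)               # restore the previous contrasts
```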
Thank you for the response. Peter Dalgaard already mentioned that I can
get the missing coefs by taking the negative of the sum of the displayed
coefs. However, as you noted, things get complicated with crossed terms.
I was able to handle nested regression but did not have any luck with
interactions. For example:
### --- Picture of "the model" ---
if(FALSE) {
        |  a1  |  a2  |
   -----------------------------
   b1   |   8  |  14  |  11
   b2   |  13  |  17  |  15
   b3   |   2  |   8  |   5
   b4   |   4  |  10  |   7
   -----------------------------
        |  6.8 | 12.3 |
}
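For reference, the marginal means in the picture can be recomputed from
the cell means (the 6.8 and 12.3 in the bottom row are rounded):

```r
## The cell means from the picture, rows = facB levels, cols = facA levels.
cellMeans <- matrix(c(8, 13, 2, 4, 14, 17, 8, 10), nrow = 4,
                    dimnames = list(paste0("b", 1:4), c("a1", "a2")))
rowMeans(cellMeans)   # marginal means of facB: 11 15 5 7
colMeans(cellMeans)   # marginal means of facA: 6.75 12.25
```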
### --- Setup ---
N <- 400
ab <- c( 8, 13,  2,  4,
        14, 17,  8, 10)
sigmaE <- 0.01
levA <- c("a1", "a2")
facA <- factor(sample(rep(levA, times=N / length(levA))))
levB <- c("b1", "b2", "b3", "b4")
facB <- factor(sample(rep(levB, times=N / length(levB))))
table(facA, facB)
facAB <- factor(paste(facA, facB, sep="-"))
yAB <- ab[as.integer(facAB)]  ## true cell mean per observation (not used below)
## Create design matrix and simulate y
tmp <- model.matrix(~ facAB - 1)
y <- tmp %*% as.matrix(ab) + rnorm(n=N, mean=0, sd=sigmaE)
contrasts(facA) <- contr.sum
contrasts(facB) <- contr.sum
### --- Fit the model ---
fit <- lm(y ~ facA * facB)  ## facA, facB and y are in the workspace (no tmpData object)
coefs <- coef(fit)
(int <- coefs[1])
(yMean <- mean(y))
a <- coefs[2]
(a <- c(a, -a))
tapply(y, list(facA), mean)
tapply(y, list(facA), mean) - yMean
## Hmm, why is there such a big difference between mean values and
## coefs from the fit? I am doing something wrong here.
b <- coefs[3:5]
(b <- c(b, -sum(b)))
tapply(y, list(facB), mean)
tapply(y, list(facB), mean) - yMean
## Even more strange
(abInt <- coefs[6:8])  ## renamed: "ab" above still holds the true cell means
## ...#@^+??
tapply(y, list(facA, facB), mean)
tapply(y, list(facA, facB), mean) - yMean
## How can I proceed here?
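One way to apply the "same principle" to the interaction block, sketched
on a separate small balanced example (all names here are illustrative,
not taken from the fit above): under contr.sum the interaction effects
must sum to zero over facB within each level of facA, and over facA
within each level of facB, so the three printed coefficients determine
the whole 2 x 4 table.

```r
## Sketch: complete a 2 x 4 interaction table from its 3 printed coefs.
op <- options(contrasts = c("contr.sum", "contr.poly"))
d  <- expand.grid(A = factor(c("a1", "a2")),
                  B = factor(c("b1", "b2", "b3", "b4")))
d  <- d[rep(seq_len(nrow(d)), 5), ]    # 5 replicates per cell (balanced)
set.seed(2)
d$y <- rnorm(nrow(d))
fm  <- lm(y ~ A * B, data = d)
cf  <- coef(fm)                        # last three are A1:B1, A1:B2, A1:B3
abI <- cf[6:8]
rowA1  <- c(abI, -sum(abI))            # B effects sum to 0 within a1
intTab <- rbind(a1 = rowA1, a2 = -rowA1)  # A effects sum to 0 within each b
colnames(intTab) <- levels(d$B)
intTab                                 # all row and column sums are ~0
options(op)
```

Alternatively, dummy.coef(fm) tabulates every coefficient, the omitted
levels included, in one call.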
## Using functions proposed by Martin Maechler:
fitAov <- aov(y ~ facA * facB)
model.tables(fitAov, "means", se=TRUE)
## wow, this is great
model.tables(fitAov, "effects", se=TRUE)
## also nice and this fits with my raw mean test
What am I doing wrong with the coefficients from the model above, then?
--
Lep pozdrav / With regards,
Gregor Gorjanc
----------------------------------------------------------------------
University of Ljubljana     PhD student
Biotechnical Faculty
Zootechnical Department     URI:  http://www.bfro.uni-lj.si/MR/ggorjan
Groblje 3                   mail: gregor.gorjanc <at> bfro.uni-lj.si
SI-1230 Domzale             tel:  +386 (0)1 72 17 861
Slovenia, Europe            fax:  +386 (0)1 72 17 888
----------------------------------------------------------------------
"One must learn by doing the thing; for though you think you know it,
you have no certainty until you try." Sophocles ~ 450 B.C.