[R] gls yields much smaller std. errors with different base for contrasts

Thu Jul 21 05:29:37 CEST 2011

Dear List,

After running a compound symmetric model using gls, I realized that
the default contrasts were not the ones that made the most sense given
the biological relationships among the factor levels.  When I either
changed the factor levels to re-arrange the order they occur in the
gls model (not shown below) OR specifically change the contrasts I get
the exact same estimates for the intercepts but now with TINY standard
errors for these estimates.  Below is some example code and output for
what I am attempting.

1> rep<-as.factor(c(1,1,1,1,1,1,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2))
1> tank<-as.factor(c(2,2,2,2,2,2,7,7,7,7,7,7,1,1,1,3,3,3,6,6,6,8,8,8,4,4,4,4,4,4,5,5,5,5,5,5))
1> sex_ratio<-as.factor(c("6m:6f","6m:6f","6m:6f","6m:6f","6m:6f","6m:6f","6m:6f","6m:6f","6m:6f",
1+ "6m:6f","6m:6f","6m:6f","3m:9f","3m:9f","3m:9f","3m:9f","3m:9f","3m:9f","3m:9f","3m:9f",
1+ "3m:9f","3m:9f","3m:9f","3m:9f","6m:0f","6m:0f","6m:0f","6m:0f","6m:0f","6m:0f","6m:0f",
1+ "6m:0f","6m:0f","6m:0f","6m:0f","6m:0f"))
1> response<-c(62.62,72.25,68.88,81.31,54.82,81.60,100.54,66.17,120.74,109.64,79.23,65.84,
1+ 81.49,99.81,93.39,85.42,157.41,32.92,97.8,49.17,53.42,76.30,74.72,24.84,131.83,113.64,
1+ 103,152.05,118.94,65.71,133.98,106.40, 108.57,156.37,110.66,126.76)

1> tmp.data<-data.frame(rep, tank, sex_ratio, response)
1> str(tmp.data)
'data.frame':	36 obs. of  4 variables:
 $ rep      : Factor w/ 2 levels "1","2": 1 1 1 1 1 1 2 2 2 2 ...
 $ tank     : Factor w/ 8 levels "1","2","3","4",..: 2 2 2 2 2 2 7 7 7 7 ...
 $ sex_ratio: Factor w/ 3 levels "3m:9f","6m:0f",..: 3 3 3 3 3 3 3 3 3 3 ...
 $ response : num  62.6 72.2 68.9 81.3 54.8 ...

1> ###Now the first model
1> library(nlme)
1> cs1<-gls(response~rep * sex_ratio, data=tmp.data,
1+   corr=corCompSymm(, form=~ 1 | tank), method="ML")

1> summary(cs1)
Generalized least squares fit by maximum likelihood
  Model: response ~ rep * sex_ratio
  Data: tmp.data
       AIC      BIC    logLik
  223.8389 236.5071 -103.9195

Correlation Structure: Compound symmetry
 Formula: ~1 | tank
 Parameter estimate(s):
 Rho
-0.2

Coefficients:
                        Value Std.Error   t-value p-value
(Intercept)          91.74000  7.589296 12.088077  0.0000
rep2                -29.03167 10.732886 -2.704926  0.0112
sex_ratio6m:0f       22.45500  7.589296  2.958772  0.0060
sex_ratio6m:6f      -21.49333  7.589296 -2.832059  0.0082
rep2:sex_ratio6m:0f  38.62667 10.732886  3.598908  0.0011
rep2:sex_ratio6m:6f  49.14500 10.732886  4.578918  0.0001

 Correlation:
                    (Intr) rep2   sx_6:0 sx_6:6 r2:_6:0
rep2                -0.707
sex_ratio6m:0f      -1.000  0.707
sex_ratio6m:6f      -1.000  0.707  1.000
rep2:sex_ratio6m:0f  0.707 -1.000 -0.707 -0.707
rep2:sex_ratio6m:6f  0.707 -1.000 -0.707 -0.707  1.000

Standardized residuals:
       Min         Q1        Med         Q3        Max
-2.6848135 -0.6039727  0.0249904  0.5257303  2.9974788

Residual standard error: 21.90841
Degrees of freedom: 36 total; 30 residual

1> levels(tmp.data$sex_ratio)
[1] "3m:9f" "6m:0f" "6m:6f"

1> #Now change the contrasts so the base level is the all male sex
ratio (i.e., 6m:0f)
1> tmp.data$sex_ratiob<-tmp.data$sex_ratio
1> contrasts(tmp.data$sex_ratio) #original contrasts
      6m:0f 6m:6f
3m:9f     0     0
6m:0f     1     0
6m:6f     0     1

1> #Set new contrasts for the copied sex_ratio column (i.e., "sex_ratiob")
1> contrasts(tmp.data$sex_ratiob)<-contr.treatment(n=levels(tmp.data$sex_ratio),
base=2, contrasts=TRUE)
1> contrasts(tmp.data$sex_ratiob) #new contrasts
      3m:9f 6m:6f
3m:9f     1     0
6m:0f     0     0
6m:6f     0     1

1> ##Now try the model again
1> cs2<-gls(response~rep * sex_ratiob, data=tmp.data,
1+   corr=corCompSymm(, form=~ 1 | tank), method="ML")
1> summary(cs2)
Generalized least squares fit by maximum likelihood
  Model: response ~ rep * sex_ratiob
  Data: tmp.data
       AIC      BIC    logLik
  223.8389 236.5071 -103.9195

Correlation Structure: Compound symmetry
 Formula: ~1 | tank
 Parameter estimate(s):
 Rho
-0.2

Coefficients:
                         Value Std.Error  t-value p-value
(Intercept)          114.19500  0.000003 36429283  0.0000
rep2                   9.59500  0.000004  2164380  0.0000
sex_ratiob3m:9f      -22.45500  7.589296       -3  0.0060
sex_ratiob6m:6f      -43.94833  0.000004 -9913590  0.0000
rep2:sex_ratiob3m:9f -38.62667 10.732886       -4  0.0011
rep2:sex_ratiob6m:6f  10.51833  0.000006  1677724  0.0000

 Correlation:
                     (Intr) rep2   sx_3:9 sx_6:6 r2:_3:
rep2                 -0.707
sex_ratiob3m:9f       0.000  0.000
sex_ratiob6m:6f      -0.707  0.500  0.000
rep2:sex_ratiob3m:9f  0.000  0.000 -0.707  0.000
rep2:sex_ratiob6m:6f  0.500 -0.707  0.000 -0.707  0.000

Standardized residuals:
       Min         Q1        Med         Q3        Max
-2.6848135 -0.6039727  0.0249904  0.5257303  2.9974788

Residual standard error: 21.90841
Degrees of freedom: 36 total; 30 residual

1> #The intercepts of the sex_ratios are the same
1> coefficients(summary(cs1))[1] #3m:9f intercept in cs1
(Intercept)
      91.74
1> coefficients(summary(cs2))[1]+coefficients(summary(cs2))[3] #3m:9f
intercept in cs2
(Intercept)
      91.74

1> coefficients(summary(cs1))[1]+coefficients(summary(cs1))[3] #6m:0f
intercept in cs1
(Intercept)
    114.195
1> coefficients(summary(cs2))[1] #6m:0f intercept in cs2
(Intercept)
    114.195

1> coefficients(summary(cs1))[1]+coefficients(summary(cs1))[4] #6m:6f
intercetp in cs1
(Intercept)
   70.24667
1> coefficients(summary(cs2))[1]+coefficients(summary(cs2))[4] #6m:6f
intercetp in cs2
(Intercept)
   70.24667

#Curiously, removing the interaction [gls(response~rep +
sex_ratiob...)] gives reasonable numbers for the standard errors

Any suggestions would be greatly appreciated!  Thank you.

Sincerely,
Matthew

--

Matthew Wolak
PhD Candidate
Evolution, Ecology, and Organismal Biology Graduate Program
University of California Riverside
http://student.ucr.edu/~mwola001/