[R] About stepwise regression problem

pigpigmeow glorykwok at hotmail.com
Tue Oct 4 10:00:11 CEST 2011


First of all, I fitted a GAM:
noxd <- gam(newNOX ~ pressure + maxtemp + s(avetemp, bs="cr") + s(mintemp, bs="cr") +
              s(RH, bs="cr") + s(solar, bs="cr") + s(windspeed, bs="cr") +
              s(transport, bs="cr"),
            family=gaussian(link=log), groupD, methods=REML)

Then I type "summary(noxd)", which shows:

Family: gaussian 
Link function: log 

Formula:
newNO2 ~ pressure + s(maxtemp, bs = "cr") + s(avetemp, bs = "cr") + 
    s(mintemp, bs = "cr") + RH + s(solar, bs = "cr") + s(windspeed, 
    bs = "cr") + s(transport, bs = "cr")

Parametric coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 2.721513   0.049108  55.419   <2e-16 ***
pressure    0.028988   0.019434   1.492    0.140    
RH          0.005228   0.009763   0.535    0.594    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Approximate significance of smooth terms:
               edf Ref.df     F p-value   
s(maxtemp)   6.346  7.276 1.223 0.29991   
s(avetemp)   1.000  1.000 0.226 0.63562   
s(mintemp)   1.908  2.396 1.066 0.35871   
s(solar)     3.797  4.490 2.164 0.07359 . 
s(windspeed) 5.305  6.341 2.346 0.03648 * 
s(transport) 7.234  7.984 2.807 0.00884 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

R-sq.(adj) =  0.307   Deviance explained = 49.1%
GCV score = 61.136  Scale est. = 44.49     n = 105

I eliminate the term with the largest p-value, which is the s(avetemp) term,
then type "summary(no2d)", which shows:

Family: gaussian 
Link function: log 

Formula:
newNO2 ~ pressure + s(maxtemp, bs = "cr") + s(mintemp, bs = "cr") + 
    RH + s(solar, bs = "cr") + s(windspeed, bs = "cr") + s(transport, 
    bs = "cr")

Parametric coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 2.720973   0.048834  55.719   <2e-16 ***
pressure    0.031346   0.019040   1.646    0.104    
RH          0.006165   0.009583   0.643    0.522    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Approximate significance of smooth terms:
               edf Ref.df     F p-value   
s(maxtemp)   6.499  7.425 1.450  0.1942   
s(mintemp)   1.975  2.487 1.788  0.1655   
s(solar)     3.925  4.628 2.118  0.0770 . 
s(windspeed) 5.373  6.417 2.967  0.0101 * 
s(transport) 7.043  7.822 2.785  0.0097 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

R-sq.(adj) =  0.316   Deviance explained = 49.2%
GCV score = 59.746  Scale est. = 43.919    n = 105
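
A single elimination step like the one above could be sketched with update(), which refits the model with one term removed from the formula. This is only a minimal sketch: the data frame d and the variables x1, x2 and y are simulated stand-ins, not the groupD data used in this post.

```r
library(mgcv)

set.seed(1)
# Simulated stand-in data (hypothetical; not the groupD data from this post)
d <- data.frame(x1 = runif(200), x2 = runif(200))
d$y <- sin(2 * pi * d$x1) + rnorm(200, sd = 0.3)

# Full model with two smooth terms
full <- gam(y ~ s(x1, bs = "cr") + s(x2, bs = "cr"), data = d)

# Drop one smooth term; write it with the same arguments as in the fitted formula
reduced <- update(full, . ~ . - s(x2, bs = "cr"))

# Inspect the refitted model
summary(reduced)
```
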


I eliminate the term with the largest p-value, which is now the RH term,
then type "summary(no2d)", which shows:

Family: gaussian 
Link function: log 

Formula:
newNO2 ~ pressure + s(maxtemp, bs = "cr") + s(mintemp, bs = "cr") + 
    s(solar, bs = "cr") + s(windspeed, bs = "cr") + s(transport, 
    bs = "cr")

Parametric coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  2.72001    0.04859  55.974   <2e-16 ***
pressure     0.02978    0.01878   1.586    0.117    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Approximate significance of smooth terms:
               edf Ref.df     F p-value   
s(maxtemp)   6.544  7.468 1.654 0.12830   
s(mintemp)   1.952  2.460 1.697 0.18301   
s(solar)     3.977  4.686 2.869 0.02211 * 
s(windspeed) 5.381  6.425 2.641 0.01953 * 
s(transport) 7.052  7.830 3.348 0.00257 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

R-sq.(adj) =  0.321   Deviance explained =   49%
GCV score =  58.61  Scale est. = 43.591    n = 105

Next I remove the s(mintemp) term, and I keep going in the same way until I get:

Family: gaussian 
Link function: log 

Formula:
newNO2 ~ s(windspeed, bs = "cr")

Parametric coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  2.78159    0.04701   59.16   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Approximate significance of smooth terms:
               edf Ref.df    F p-value  
s(windspeed) 1.775  2.251 4.54  0.0101 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

R-sq.(adj) =    0.1   Deviance explained = 11.5%
GCV score = 59.348  Scale est. = 57.78     n = 105

In the end only the s(windspeed) term remains. My significance level is 0.05.
I have three questions.

First, did I perform the backward elimination correctly?

Second, is it possible to run this process (backward elimination) automatically?

Third, I noticed that the linear part is listed under "Pr(>|t|)" and the smooth
part under "p-value". Do these two terms have the same meaning?
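
The manual procedure described above (refit, drop the term with the largest p-value, repeat until every remaining p-value is at or below 0.05) could be sketched as a loop. This is only a sketch under stated assumptions, not a recommended selection method: backward_gam is a hypothetical helper, not an mgcv function, and it assumes that every linear term is a numeric covariate and that no variable appears both as a linear term and as a smooth term.

```r
library(mgcv)

# Hypothetical helper (not part of mgcv). Assumes each linear term is a numeric
# covariate, so it has exactly one row in the parametric coefficient table.
backward_gam <- function(response, linear, smooth, data, alpha = 0.05, ...) {
  repeat {
    # Build the right-hand side: linear terms plus cubic regression spline smooths
    rhs <- c(linear, sprintf('s(%s, bs = "cr")', smooth))
    if (length(rhs) == 0) rhs <- "1"  # nothing left: intercept-only model
    form <- as.formula(paste(response, "~", paste(rhs, collapse = " + ")))
    fit <- gam(form, data = data, ...)
    sm <- summary(fit)

    # Collect p-values: column 4 of the parametric table, then the smooth terms
    p <- c(setNames(sm$p.table[linear, 4], linear),
           setNames(as.numeric(sm$s.pv), smooth))

    # Stop when no terms are left or all p-values are at or below alpha
    if (length(p) == 0 || max(p) <= alpha) return(fit)

    # Otherwise drop the term with the largest p-value and refit
    worst <- names(p)[which.max(p)]
    linear <- setdiff(linear, worst)
    smooth <- setdiff(smooth, worst)
  }
}

# Possible call with the variable names from this post (not run here):
# fit <- backward_gam("newNO2", linear = c("pressure", "RH"),
#                     smooth = c("maxtemp", "avetemp", "mintemp",
#                                "solar", "windspeed", "transport"),
#                     data = groupD, family = gaussian(link = log))
```

As the summary output itself says, the significance of the smooth terms is approximate, so a loop like this inherits that approximation at every step.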

 




