[R] Ridge regression [Repost]

Frank E Harrell Jr f.harrell at vanderbilt.edu
Wed Aug 19 20:52:43 CEST 2009


Sabyasachi Patra wrote:
> Dear all,
> 
> For an ordinary ridge regression problem, I followed three different
> approaches:
> 1. estimate beta without any standardization
> 2. estimate standardized beta (standardizing X and y) and then again convert
> back  
> 3. estimate beta using lm.ridge() function
> 
> 
> X<-matrix(c(1,2,9,3,2,4,7,2,3,5,9,1),4,3)
> y<-as.matrix(c(2,3,4,5))
> 
> n<-nrow(X)
> p<-ncol(X)
> 
> #Without standardization
> intercept <- rep(1,n)
> Xn <- cbind(intercept, X)
> K<-diag(c(0,rep(1.5,p)))
> beta1 <- solve(t(Xn)%*%Xn+K)%*%t(Xn)%*%y
> beta1
> 
> #with standardization
> ys<-scale(y)
> Xs<-scale(X)
> K<-diag(1.5,p)
> bs <- solve(t(Xs)%*%Xs+K)%*%t(Xs)%*%ys
> b<- sd(y)*(bs/sd(X))
> intercept <- mean(y)-sum(as.matrix(colMeans(X))*b)
> beta2<-rbind(intercept,b)
> beta2
> 
> #Using lm.ridge function of MASS package
> beta3<-lm.ridge(y~X,lambda=1.5)
> 
> I'm getting three different results using above described approaches:
> 
>> beta1
>                 [,1]
> intercept  3.4007944
>            0.3977462
>            0.2082025
>           -0.4829115
>> beta2
>                 [,1]
> intercept  3.3399855
>            0.1639469
>            0.0262021
>           -0.1228987
> 
>> beta3
>                      X1          X2          X3
>  3.35158977  0.19460958  0.03152778 -0.15546775
> 
> It will be very helpful to me if anybody can help me regarding why the
> outputs are coming different.
> 
> I am extremely sorry for my previous anonymous post.
> 
> regards.
> 
> -----
> Sabyasachi Patra
> PhD Scholar
> Indian institute of Technology Kanpur
> India.

Thanks for re-posting.  If you look at the ols function in the Design 
package and my chapter on maximum likelihood estimation in the book 
Regrsession Modeling Strategies you'll see an alternative approach that 
doesn't change anything but that may be easier to visualize.  The 
standard deviations are put into the penalty directly so no pre-scaling 
is required.

Frank


-- 
Frank E Harrell Jr   Professor and Chair           School of Medicine
                      Department of Biostatistics   Vanderbilt University




More information about the R-help mailing list