[R] Weighted Ridge Regression with GCV Optimization
Preetam Pal
lordpreetam at gmail.com
Tue Sep 22 22:25:39 CEST 2015
Hi R-users,
I am having problems while implementing the following model:
1. I have numerical regressors (GDP, HPA and FX observed quarterly) and
need to predict the numerical variable Y.
2. I have to run *weighted Ridge Regression* where the weights of the
squared residuals decrease by 5% with every quarter into the past.
3. Before estimating beta, I need to select the *optimal Ridge parameter*
(lambda) wrt the GCV criterion:
a> For any lambda, divide the data into, say, five blocks B1, B2, B3, B4
and B5, each of size k = 20% of the data. For each i, remove B_i,
estimate the beta vector over the remaining data and compute the
unweighted SSE (or any other deviation metric) using this beta vector on
the block B_i. Iterate over all five B_i's (i = 1, 2, 3, 4, 5) and take
the average of the 5 SSE values.
b> Allow lambda to vary from 0 to 1 in steps of size 0.01 and choose the
lambda that minimizes the average SSE computed in step a>.
4. With this choice of lambda, my final beta estimate would be [X'W'WX +
lambda * Identity Matrix]^(-1) * X'W'WY.
5. Here W'W is a diagonal matrix whose diagonal entries decrease from the
last entry upwards at a 5% decay rate, with trace(W'W) = 1 (i.e. the sum
of the weights = 1).
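A minimal sketch of steps 4 and 5, in case it makes the setup concrete. The function names `decay_weights` and `weighted_ridge` are hypothetical, the data are simulated rather than the attached file, and the sketch assumes no intercept column and that every coefficient is penalised, which is how the formula in step 4 reads literally.

```r
# Decay weights: the most recent quarter (last row) gets the largest weight,
# each earlier quarter gets 95% of the following one; normalised to sum to 1.
decay_weights <- function(n, decay = 0.05) {
  w <- (1 - decay)^((n - 1):0)
  w / sum(w)
}

# Closed-form weighted ridge estimate (X' D X + lambda I)^(-1) X' D y,
# with D = W'W = diag(w). Assumption: no intercept, all coefficients penalised.
weighted_ridge <- function(X, y, w, lambda) {
  X <- as.matrix(X)
  XtD <- t(X * w)                      # X' D: row i of X is scaled by w[i]
  A <- XtD %*% X + lambda * diag(ncol(X))
  drop(solve(A, XtD %*% y))
}

# Usage on simulated data (not the attached data)
set.seed(1)
n <- 25
X <- matrix(rnorm(n * 3), n, 3, dimnames = list(NULL, c("GDP", "HPA", "FX")))
y <- drop(X %*% c(1, -2, 0.5)) + rnorm(n, sd = 0.1)
w <- decay_weights(n)
beta_hat <- weighted_ridge(X, y, w, lambda = 0.1)
```

Using `solve(A, b)` rather than forming the inverse explicitly is only a numerical convenience; the estimate is the same as in the formula of step 4.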
I know lm.ridge() can do Ridge Regression, but I don't know how to write the
code with these weights, GCV criterion etc.
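The lambda search in step 3 could be sketched roughly as follows. All function names here are hypothetical and the data are simulated. Two assumptions go beyond what is stated above: the blocks are taken as contiguous runs of rows, and the weights of the rows left in the training set are renormalised to sum to 1 after a block is removed.

```r
decay_weights <- function(n, decay = 0.05) {
  w <- (1 - decay)^((n - 1):0)         # last row has the largest weight
  w / sum(w)
}

# (X' D X + lambda I)^(-1) X' D y with D = diag(w); no intercept (assumption)
weighted_ridge <- function(X, y, w, lambda) {
  XtD <- t(X * w)
  drop(solve(XtD %*% X + lambda * diag(ncol(X)), XtD %*% y))
}

# Average unweighted SSE over k held-out contiguous blocks for one lambda
cv_sse <- function(X, y, w, lambda, k = 5) {
  n <- nrow(X)
  fold <- ceiling(seq_len(n) * k / n)  # block label 1..k for each row
  sse <- numeric(k)
  for (i in seq_len(k)) {
    test <- fold == i
    w_train <- w[!test] / sum(w[!test])          # renormalise (assumption)
    beta <- weighted_ridge(X[!test, , drop = FALSE], y[!test], w_train, lambda)
    resid <- y[test] - drop(X[test, , drop = FALSE] %*% beta)
    sse[i] <- sum(resid^2)
  }
  mean(sse)
}

# Grid search over lambda = 0, 0.01, ..., 1
select_lambda <- function(X, y, w, lambdas = seq(0, 1, by = 0.01), k = 5) {
  scores <- vapply(lambdas, function(l) cv_sse(X, y, w, l, k), numeric(1))
  list(lambda = lambdas[which.min(scores)], scores = scores)
}

# Usage on simulated data (not the attached data)
set.seed(42)
n <- 40
X <- matrix(rnorm(n * 3), n, 3)
y <- drop(X %*% c(1, -2, 0.5)) + rnorm(n)
w <- decay_weights(n)
fit <- select_lambda(X, y, w)
beta_final <- weighted_ridge(X, y, w, fit$lambda)
```

The final line refits on the full data with the selected lambda, which corresponds to step 4.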
Can you please help me with this? I have attached the exact data in .txt
format (should be readable with read.table()). Please let me know in case I
need to provide any more clarifications.
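A minimal sketch of loading the attachment, assuming it is a whitespace-separated file whose five columns are T, GDP, HPA, FX and Y in that order. The temporary file below is only a stand-in holding the first three rows; skipping the header line and naming the columns explicitly avoids depending on how the header is split into fields.

```r
# Stand-in for the attached file: header line plus the first three data rows
path <- tempfile(fileext = ".txt")
writeLines(c(
  "T GDP Rate HPA FX Y",
  "1 0.806660537 2.177803167 1.14980573 2.733594304",
  "2 0.997724655 1.585686087 0.814496976 3.193948056",
  "3 0.99032353 0.569843997 0.464488882 3.065751781"
), path)

# Skip the header and supply column names (the names are an assumption)
dat <- read.table(path, header = FALSE, skip = 1,
                  col.names = c("T", "GDP", "HPA", "FX", "Y"))

X <- as.matrix(dat[, c("GDP", "HPA", "FX")])  # regressors
y <- dat$Y                                    # response
```

For the real data, `path` would simply be the location of the saved attachment.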
Thanks,
Preetam
-------------- next part --------------
T GDP_Rate HPA FX Y
1 0.806660537 2.177803167 1.14980573 2.733594304
2 0.997724655 1.585686087 0.814496976 3.193948056
3 0.99032353 0.569843997 0.464488882 3.065751781
4 0.606121306 3.037648988 0.565322084 4.537399052
5 0.858131141 4.816423605 1.924534222 7.871730873
6 0.052909178 2.048591352 1.470221953 2.580646078
7 0.081400487 1.152495559 1.128828557 7.200336313
8 0.840972911 3.848225962 1.004272646 1.211124673
9 0.965868218 1.039679934 0.231408747 7.566968
10 0.952626722 4.455565591 0.483541015 9.412639513
11 0.067691757 0.038417569 0.69744243 8.055369029
12 0.985658841 1.143481763 1.65850909 6.962599601
13 0.177186946 3.762691635 0.44379572 9.904367023
14 0.490066697 0.655629739 1.281478696 1.796422139
15 0.223740666 1.393201062 1.235291827 5.237943945
16 0.782873809 1.485727273 0.224511215 6.399036418
17 0.947492758 0.318485005 1.158911495 8.183470692
18 0.49692711 2.169601457 1.777618832 8.830805294
19 0.956704273 1.546827505 0.241838792 7.554654431
20 0.404624372 3.041530693 1.66039172 6.709330773
21 0.98557461 2.45656369 1.695179666 8.638707974
22 0.494102398 4.527230971 0.993352283 7.958872374
23 0.893182943 3.429112971 0.675541115 5.665249801
24 0.669680459 0.459919029 1.011872328 8.883120607
25 0.017296599 2.184045646 1.575891106 2.585709635