[R] Great difference for piecewise linear function between R and SAS

Mon Mar 24 13:08:39 CET 2008

On 24/03/2008 7:06 AM, zhijie zhang wrote:
> Dear Murdoch,
>    "Compare the predictions, not the coefficients.", this is the key point.
>    I have checked their predicted probability and their results are the
> same.
>   What do you mean by  "You're using different bases"? Could you please give
> me a little more info on it, so that i can go to find some materials?

There are many ways to represent a piecewise linear function.  R and 
your SAS code have both chosen linear combinations of basis functions, 
but you have used different basis functions.  R uses the B-spline basis. 
You have what is sometimes called the truncated power basis, but maybe 
not commonly in the linear context.  I don't know if it has a name here.

Duncan Murdoch

>   Thanks a lot.
> 
> 
> On 3/24/08, Duncan Murdoch <murdoch at stats.uwo.ca> wrote:
>> On 24/03/2008 5:23 AM, zhijie zhang wrote:
>>> Dear Rusers,
>>>   I am now using  R and SAS to fit the piecewise linear functions, and
>> what
>>> surprised me is that they have a great differrent result. See below.
>> You're using different bases.  Compare the predictions, not the
>> coefficients.
>>
>> To see the bases that R uses, do this:
>>
>> matplot(distance, bs(distance,degree=1,knots=c(16.13,24)))
>> matplot(y, bs(y,degree=1,knots=c(-0.4357,-0.3202)))
>>
>> Duncan Murdoch
>>
>>> #R code--Knots for distance are 16.13 and 24, respectively, and Knots
>> for y
>>> are -0.4357 and -0.3202
>>> m.glm<-glm(mark~x+poly(elevation,2)+bs(distance,degree=1,knots=c(16.13
>> ,24))
>>>                 +bs(y,degree=1,knots=c(-0.4357,-0.3202
>>> )),family=binomial(logit),data=point)
>>> summary(m.glm)
>>>
>>> Coefficients:
>>>                                                       Estimate Std.
>> Error z
>>> value Pr(>|z|)
>>> (Intercept)                                             12.104
>> 3.138
>>> 3.857 0.000115 ***
>>> x                                                       5.815      1.987
>>> 2.926 0.003433 **
>>> poly(elevation, 2)1                                     6.654      4.457
>>> 1.493 0.135444
>>> poly(elevation, 2)2                                     -6.755
>> 3.441  -
>>> 1.963 0.049645 *
>>> bs(distance, degree = 1, knots = c(16.13, 24))1         -1.291
>> 1.139  -
>>> 1.133 0.257220
>>> bs(distance, degree = 1, knots = c(16.13, 24))2        -10.348
>> 2.025  -
>>> 5.110 3.22e-07 ***
>>> bs(distance, degree = 1, knots = c(16.13, 24))3        -3.530      3.752
>>   -
>>> 0.941 0.346850
>>> bs(y, degree = 1, knots = c(-0.4357, -0.3202))1        -6.828      1.836
>>   -
>>> 3.719 0.000200 ***
>>> bs(y, degree = 1, knots = c(-0.4357, -0.3202))2        -4.426      1.614
>>   -
>>> 2.742 0.006105 **
>>> bs(y, degree = 1, knots = c(-0.4357, -0.3202))3        -11.216
>> 2.861  -
>>> 3.920 8.86e-05 ***
>>>
>>> #SAS codes
>>> data b;
>>>  set a;
>>>  if distance > 16.13 then d1=1; else d1=0;
>>>  distance2=d1*(distance - 16.13);
>>>  if distance > 24 then d2=1; else d2=0;
>>>  distance3=d2*(distance - 24);
>>>  if y>-0.4357 then dd1=1; else dd1=0;
>>>  y2=dd1*(y+0.4357);
>>>  if y>-0.3202 then dd2=1; else dd2=0;
>>>  y3=dd2*(y+0.3202);
>>> run;
>>>
>>> proc logistic descending data=b;
>>>  model mark =x elevation elevation*elevation distance distance2
>> distance3 y
>>> y2 y3;
>>> run;
>>>
>>>
>>>               The LOGISTIC Procedure  Analysis of Maximum Likelihood
>>> Estimates
>>>
>>>                                                      Standard          Wald
>>>            Parameter               DF    Estimate       Error
>>> Chi-Square    Pr > ChiSq
>>>
>>>            Intercept                1     -2.6148      2.1445
>>> 1.4867
>>> 0.2227
>>>            x                        1      5.8146      1.9872
>>> 8.5615
>>> 0.0034
>>>            elevation                1      0.4457      0.1506
>>> 8.7545
>>> 0.0031
>>>            elevation*elevation      1     -0.0279      0.0142
>>> 3.8533
>>> 0.0496
>>>            distance                 1     -0.1091      0.0963
>>> 1.2836
>>> 0.2572
>>>            distance2                1     -1.0418      0.2668
>>> 15.2424
>>> <.0001
>>>            distance3                1      2.8633      0.7555
>>> 14.3625
>>> 0.0002
>>>            y                        1    -16.2032      4.3568
>>> 13.8314
>>> 0.0002
>>>            y2                       1     36.9974     10.3219
>>> 12.8476
>>> 0.0003
>>>            y3                       1    -58.4938     14.0279
>>> 17.3875
>>> <.0001
>>> Q: What is the problem? which one is correct for the piecewise linear
>>> function?
>>>   Thanks very much.
>>>
>>>
>>
> 
>