[R] why results from regression tree (rpart) are totally inconsistent with ordinary regression

Bert Gunter gunter.berton at gene.com
Tue Feb 24 00:14:26 CET 2009


You did not read the tree graph correctly.  Mortality is **not** "positively
related" to incidence. You're reading the tree backwards.  Read the output
of summary() on your rpart fit object for clarity.
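
One way to check the direction of the effect is to fit both models and read the node means directly. The following is only a sketch: it uses the first 20 rows of the posted data to stay short, and the object names (dat, lm_fit, tree_fit, by_half) are made up for illustration. In the output of print(), each line gives the split condition, the number of cases, the deviance, and yval, which for a regression tree is the mean Incidence in that node. As far as I know, the label that text() draws at a split is the condition for the left branch, which is easy to read the wrong way round.

```r
library(rpart)

# First 20 rows of the data posted in the original message (subset only,
# chosen to keep the example short; results on the full data will differ).
dat <- read.table(header = TRUE, text = "
Deterrence Mortality Incidence
0.695 0.51  66
0.255 0.501 48
0.612 0.483 55
0.209 0.158 47
0.499 0.589 53
0.755 0.285 73
0.764 0.351 77
0.749 0.211 64
0.101 0.336 45
0.556 0.066 72
0.576 0.403 45
0.232 0.667 35
0.424 0.891 34
0.432 0.458 54
0.197 0.269 59
0.188 0.523 40
0.291 0.864 32
0.504 0.791 36
0.387 0.138 66
0.71  0.676 56
")

# Ordinary regression: look at the sign of the Mortality coefficient.
lm_fit <- lm(Incidence ~ Mortality + Deterrence, data = dat)
print(coef(lm_fit))

# Regression tree on the same formula.
tree_fit <- rpart(Incidence ~ Mortality + Deterrence, data = dat)
print(tree_fit)    # one line per node: split, n, deviance, yval (node mean)
summary(tree_fit)  # fuller description of every split

# Model-free check: mean Incidence below and above the median Mortality.
by_half <- tapply(dat$Incidence, dat$Mortality >= median(dat$Mortality), mean)
print(by_half)
```

Comparing the yval of the two children of a Mortality split with the split condition printed on each line shows which side has the higher mean Incidence, without relying on the plot layout.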

 -- Bert Gunter
Genentech

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
Behalf Of Weidong Gu
Sent: Monday, February 23, 2009 2:39 PM
To: r-help at r-project.org
Subject: [R] why results from regression tree (rpart) are
totally inconsistent with ordinary regression

Hi,

In my analysis of impacts of insecticide-treated bednets on malaria, I
look at the relationship between malaria incidence and mosquito
behaviors. The condensed data set is copied here. Ordinary regression
(lm) shows that Incidence was negatively related to Mortality. This
makes sense because the latter reflected the strength of killing
mosquitoes by insecticide-treated nets. Since the original data set has
a complex structure with more parameters and scenarios, I thought a tree
model would help explore the structure of the data. However, a regression
tree (rpart(Incidence~Mortality+Deterrence)) indicates that Mortality
was positively related to Incidence. 

How can this unintuitive result arise? Any advice is appreciated.

Weidong Gu, 
Department of Medicine
University of Alabama, Birmingham

Deterrence	Mortality	Incidence
0.695	0.51	66
0.255	0.501	48
0.612	0.483	55
0.209	0.158	47
0.499	0.589	53
0.755	0.285	73
0.764	0.351	77
0.749	0.211	64
0.101	0.336	45
0.556	0.066	72
0.576	0.403	45
0.232	0.667	35
0.424	0.891	34
0.432	0.458	54
0.197	0.269	59
0.188	0.523	40
0.291	0.864	32
0.504	0.791	36
0.387	0.138	66
0.71	0.676	56
0.235	0.183	59
0.358	0.579	41
0.718	0.57	49
0.775	0.254	46
0.269	0.633	42
0.443	0.741	40
0.28	0.438	49
0.385	0.778	37
0.539	0.653	37
0.73	0.094	84
0.489	0.611	40
0.595	0.431	39
0.305	0.003	69
0.511	0.595	37
0.394	0.798	37
0.369	0.541	47
0.414	0.552	51
0.468	0.858	34
0.311	0.201	59
0.142	0.36	43
0.514	0.195	46
0.365	0.325	48
0.608	0.224	67
0.177	0.04	62
0.475	0.146	65
0.526	0.702	46
0.735	0.372	43
0.172	0.66	36
0.622	0.531	53
0.651	0.055	76
0.223	0.296	54
0.783	0.566	52
0.439	0.698	34
0.527	0.493	41
0.766	0.89	49
0.634	0.749	42
0.24	0.732	35
0.792	0.764	36
0.268	0.823	34
0.418	0.407	53
0.251	0.241	54
0.705	0.843	40
0.546	0.474	55
0.685	0.384	62
0.582	0.086	72
0.63	0.618	57
0.131	0.028	56
0.555	0.803	41
0.463	0.299	57
0.154	0.164	55
0.406	0.074	66
0.168	0.118	58
0.597	0.323	47
0.672	0.816	42
0.698	0.623	48
0.676	0.177	43
0.743	0.109	81
0.121	0.244	49
0.799	0.014	85
0.45	0.645	36
0.484	0.448	52
0.585	0.307	68
0.348	0.417	43
0.345	0.459	44
0.374	0.835	30
0.657	0.134	65
0.331	0.022	67
0.141	0.045	66
0.568	0.1	67
0.11	0.876	30
0.212	0.39	46
0.298	0.519	40
0.322	0.721	44
0.201	0.77	35
0.641	0.855	39
0.156	0.277	48
0.327	0.714	40
0.663	0.231	44
0.119	0.688	37
0.287	0.354	46

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



