[R] using ddply with segmented regression
arun
smartpink111 at yahoo.com
Tue Oct 15 00:50:02 CEST 2013
Hi Paul,
No problem.
Try:
par(mfrow=c(1,2))
ldply(SP.seg,plot)
#or
lapply(SP.seg,plot)
A.K.
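A note on the two calls above: plot is called only for its side effect, so a plain loop over names(SP.seg) does the same job and also lets each panel be titled with its Lot.Run. A minimal sketch follows, assuming simulated data in place of the real lots. The names sim, fit_one and fits are made up for illustration, and a single starting breakpoint (psi = 10) is assumed instead of the automatic selection used in the thread.

```r
library(segmented)

set.seed(1)
# Hypothetical stand-in for the real data: two lots, true breakpoint at Cycle 12
sim <- do.call(rbind, lapply(c("A-1", "A-2"), function(id) {
  Cycle <- 1:25
  data.frame(Lot.Run = id, Cycle = Cycle,
             deltaWgt = 40 - 0.1 * Cycle - 0.5 * pmax(Cycle - 12, 0) +
               rnorm(25, sd = 0.3))
}))

# Fit the linear model on one group's rows, then segment it
fit_one <- function(d) {
  out.lm <- lm(deltaWgt ~ Cycle, data = d)
  segmented(out.lm, seg.Z = ~Cycle, psi = 10)
}
fits <- lapply(split(sim, sim$Lot.Run), fit_one)

# One panel per group, titled with the group name
op <- par(mfrow = c(1, 2))
for (id in names(fits)) {
  plot(fits[[id]])
  title(main = id)
}
par(op)
```

With 22 groups the mfrow layout would need more cells, or the plots could be sent to a multi-page pdf() device.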
On Monday, October 14, 2013 5:57 PM, "Prew, Paul" <Paul.Prew at ecolab.com> wrote:
Hello, the code provided by arun did the trick. Thank you very much, arun.
However, I'm now unsure how to process the results further. Looking at the plyr vignette (aka "split-apply-combine"), it appears that I could now create a data frame from the list of results, and then run the results through the function plot.segmented to view the piecewise regressions by the grouping variable Lot.Run. However, the list is not in the structure expected by ldply:
> SP.seg <- dlply(df, .(Lot.Run), segmentf_df)
> SP.out <- ldply(SP.seg)
[9] ERROR:
Results must be all atomic, or all data frames
>class(SP.seg)[[1]]
[1] "list"
>head(SP.seg)
$`J062431-1`
Call: segmented.lm(obj = out.lm, seg.Z = ~Cycle, psi = (Cycle = NA),
control = seg.control(stop.if.error = FALSE, n.boot = 0,
gap = FALSE, jt = FALSE, nonParam = TRUE))
Meaningful coefficients of the linear terms:
(Intercept) Cycle U1.Cycle U2.Cycle U3.Cycle U4.Cycle U5.Cycle U6.Cycle
40.11786 -0.06664 -0.68539 0.49316 0.14955 0.03612 0.22257 -0.41166
U7.Cycle U8.Cycle U9.Cycle U10.Cycle
-0.48365 0.37949 0.24945 0.06712
Estimated Break-Point(s) psi1.Cycle psi2.Cycle psi3.Cycle psi4.Cycle psi5.Cycle psi6.Cycle psi7.Cycle psi8.Cycle psi9.Cycle psi10.Cycle : 19.67 34.31 51.02 72.10 97.94 117.20 130.10 147.10 155.70 160.40
$`J062431-2`
Call: segmented.lm(obj = out.lm, seg.Z = ~Cycle, psi = (Cycle = NA),
control = seg.control(stop.if.error = FALSE, n.boot = 0,
gap = FALSE, jt = FALSE, nonParam = TRUE))
Meaningful coefficients of the linear terms:
(Intercept) Cycle U1.Cycle U2.Cycle U3.Cycle U4.Cycle U5.Cycle U6.Cycle
40.11786 -0.06664 -0.68539 0.49316 0.14955 0.03612 0.22257 -0.41166
U7.Cycle U8.Cycle U9.Cycle U10.Cycle
-0.48365 0.37949 0.24945 0.06712
Estimated Break-Point(s) psi1.Cycle psi2.Cycle psi3.Cycle psi4.Cycle psi5.Cycle psi6.Cycle psi7.Cycle psi8.Cycle psi9.Cycle psi10.Cycle : 19.67 34.31 51.02 72.10 97.94 117.20 130.10 147.10 155.70 160.40
My hope was to eventually increase my understanding enough to create lattice plots using 'plot.segmented' via ldply. Will that even work with the output object from this segmented package?
Thanks, Paul
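On the ldply error quoted above: ldply can only stack results that are data frames or atomic vectors, and a fitted model object is neither. One possible way round it is to hand ldply a function that pulls out the pieces wanted from each fit. The sketch below is only an illustration under assumptions: it uses simulated data, the names sim, fit_one, fits, breaks and fitted_df are hypothetical, a single starting breakpoint (psi = 10) is assumed, and it relies on the psi component of a segmented fit being a matrix with "Est." and "St.Err" columns. Because plot.segmented draws base graphics, the lattice display is built from the stacked fitted values instead.

```r
library(plyr)
library(segmented)
library(lattice)

set.seed(1)
# Hypothetical stand-in for the real data: two lots, true breakpoint at Cycle 12
sim <- do.call(rbind, lapply(c("A-1", "A-2"), function(id) {
  Cycle <- 1:25
  data.frame(Lot.Run = id, Cycle = Cycle,
             deltaWgt = 40 - 0.1 * Cycle - 0.5 * pmax(Cycle - 12, 0) +
               rnorm(25, sd = 0.3))
}))

fit_one <- function(d) {
  out.lm <- lm(deltaWgt ~ Cycle, data = d)
  segmented(out.lm, seg.Z = ~Cycle, psi = 10)
}
fits <- dlply(sim, .(Lot.Run), fit_one)

# One row per estimated breakpoint; the extractor returns a data frame,
# which is what ldply needs in order to stack the results
breaks <- ldply(fits, function(m) {
  data.frame(term = rownames(m$psi),
             estimate = as.numeric(m$psi[, "Est."]),
             se = as.numeric(m$psi[, "St.Err"]),
             row.names = NULL)
})
# The group label is the first column; name it explicitly
names(breaks)[1] <- "Lot.Run"

# Data frame in, data frame out: attach the fitted values to each group's rows
fitted_df <- ddply(sim, .(Lot.Run), function(d) {
  d$fit <- as.numeric(fitted(fit_one(d)))
  d
})

# Observed points and fitted piecewise line, one panel per Lot.Run
print(xyplot(deltaWgt + fit ~ Cycle | Lot.Run, data = fitted_df,
             type = c("p", "l"), distribute.type = TRUE))
```

The ddply step refits each group rather than reusing fits, which keeps the example short at the cost of fitting twice.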
Paul Prew | Statistician
651-795-5942 | fax 651-204-7504
Ecolab Research Center | Mail Stop ESC-F4412-A
655 Lone Oak Drive | Eagan, MN 55121-1560
-----Original Message-----
From: arun [mailto:smartpink111 at yahoo.com]
Sent: Saturday, October 12, 2013 1:42 AM
To: R help
Cc: Prew, Paul
Subject: Re: [R] using ddply with segmented regression
Hi,
Try:
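library(plyr)
library(segmented)
# Fit the linear model on this group's rows only, then segment it
segmentf_df <- function(df) {
  out.lm <- lm(deltaWgt ~ Cycle, data = df)
  segmented(out.lm, seg.Z = ~Cycle, psi = (Cycle = NA),
            control = seg.control(stop.if.error = FALSE, n.boot = 0))
}
# Apply the function to each Lot.Run group; the result is a named list of fits
dlply(df, .(Lot.Run), segmentf_df)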
$`J062431-1`
Call: segmented.lm(obj = out.lm, seg.Z = ~Cycle, psi = (Cycle = NA),
control = seg.control(stop.if.error = FALSE, n.boot = 0))
Meaningful coefficients of the linear terms:
(Intercept) Cycle U1.Cycle U2.Cycle
38.480 1.130 -2.760 1.497
Estimated Break-Point(s) psi1.Cycle psi2.Cycle : 3.732 5.056
$`J062431-2`
Call: segmented.lm(obj = out.lm, seg.Z = ~Cycle, psi = (Cycle = NA),
control = seg.control(stop.if.error = FALSE, n.boot = 0))
Meaningful coefficients of the linear terms:
(Intercept) Cycle U1.Cycle U2.Cycle
48.4300 -3.2500 3.0905 -0.6555
Estimated Break-Point(s) psi1.Cycle psi2.Cycle : 2.12 22.15
attr(,"split_type")
[1] "data.frame"
attr(,"split_labels")
Lot.Run
1 J062431-1
2 J062431-2
#or
dlply(df,.(Lot.Run),function(x) segmentf_df(x))
#or
lapply(split(df,df$Lot.Run,drop=TRUE),function(x) segmentf_df(x))
A.K.
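For what it is worth, the two error messages in the original post below look like one problem seen twice: the function body has an unmatched closing parenthesis, so R cannot parse the definition, nothing is assigned, and the later dlply call then reports that segmentf_df is not found. A small base-R sketch of that chain, with the original definition held as text and evaluated in a scratch environment (the names bad_src, env, parse_result, is_parse_error and function_defined are made up for illustration):

```r
# The definition from the original post, kept as text so it can be parsed safely
bad_src <- "segmentf_df <- function(df) {
  segmented(out.lm, seg.Z = ~Cycle, psi = (Cycle = NA),
            control = seg.control(stop.if.error = FALSE, n.boot = 0)), data = df)
}"

# Scratch environment so the check does not touch the global workspace
env <- new.env()

# parse() stops at the unbalanced ')' so the assignment is never evaluated
parse_result <- tryCatch(eval(parse(text = bad_src), env),
                         error = function(e) conditionMessage(e))
is_parse_error <- is.character(parse_result)

# Nothing was assigned, so the function does not exist afterwards
function_defined <- exists("segmentf_df", envir = env, inherits = FALSE)
```

The version above also moves the lm fit inside the function, so each group gets its own out.lm rather than the one fitted to the whole of Test50.df.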
On Friday, October 11, 2013 11:16 PM, "Prew, Paul" <Paul.Prew at ecolab.com> wrote:
Hello,
I’m unsuccessfully trying to apply piecewise linear regression over each of 22 groups. The data structure of the reproducible toy dataset is below. I’m using the ‘segmented’ package, which worked fine with a data set containing only one group (“Lot.Run”).
$ Cycle : int 1 2 3 4 5 6 7 8 9 10 ...
$ Lot.Run : Factor w/ 22 levels "J062431-1","J062431-2",..: 1 1 1 1 1 1 1 1 1 1 ...
$ deltaWgt: num 38.7 42.6 41 42.3 40.6 ...
I am new to ‘segmented’, and also new to ‘plyr’, which is how I’m trying to apply this segmented regression to the 22 Lot.Run groups. Within a Lot.Run, the piecewise linear regressions are deltaWgt vs. Cycle.
##### define the linear regression #####
out.lm<-lm(deltaWgt~Cycle, data=Test50.df)
##### define the function called by dlply #####
##### find cutpoints via bootstrapping, fit the piecewise regressions #####
segmentf_df <- function(df) {
segmented(out.lm,seg.Z=~Cycle, psi=(Cycle=NA),control=seg.control(stop.if.error=FALSE,n.boot=0)), data = df)
}
at this point, there’s an error message
[23] ERROR: <text>
##### repeat for each Lot.Run group #####
dlply(Test50.df, .(Lot.Run), segmentf_df)
at this point, there’s an error message
[28] ERROR:
object 'segmentf_df' not found
Any suggestions?
Thanks, Paul
> dput(Test50.df)
structure(list(Cycle = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L,
10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 20L, 21L, 22L,
23L, 24L, 25L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L,
12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 20L, 21L, 22L, 23L, 24L,
25L), Lot.Run = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("J062431-1",
"J062431-2", "J062431-3", "J062432-1", "J062432-2", "J062433-1",
"J062433-2", "J062433-3", "Lot 1-1", "Lot 1-2", "Lot 2-1", "Lot 2-2",
"Lot 2-3", "Lot 3-1", "Lot 3-2", "Lot 3-3", "P041231-1", "P041231-2",
"P041531-1", "P041531-2", "P041531-3", "P041531-4"), class = "factor"),
deltaWgt = c(38.69, 42.58, 40.95, 42.26, 40.63, 41.61, 36.73,
41.28, 39.98, 40.63, 39.66, 39.98, 40.95, 38.36, 39.01, 39,
38.03, 39.66, 37.7, 39.66, 40.63, 38.03, 37.71, 36.73, 37.7,
45.18, 41.93, 42.59, 39.98, 40.95, 42.91, 38.03, 40.96, 39,
41.61, 39.33, 43.88, 39.98, 38.68, 38.68, 36.08, 39.99, 38.35,
40.31, 40.63, 38.68, 37.05, 38.36, 35.43, 36.73)), .Names = c("Cycle",
"Lot.Run", "deltaWgt"), row.names = c(1L, 2L, 3L, 4L, 5L, 6L,
7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L,
20L, 21L, 22L, 23L, 24L, 25L, 207L, 208L, 209L, 210L, 211L, 212L,
213L, 214L, 215L, 216L, 217L, 218L, 219L, 220L, 221L, 222L, 223L,
224L, 225L, 226L, 227L, 228L, 229L, 230L, 231L), class = "data.frame")
Paul Prew ▪ Statistician
651-795-5942 ▪ fax 651-204-7504
Ecolab Research Center ▪ Mail Stop ESC-F4412-A
655 Lone Oak Drive ▪ Eagan, MN 55121-1560
CONFIDENTIALITY NOTICE: This e-mail communication and any attachments may contain proprietary and privileged information for the use of the designated recipients named above. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply e-mail and destroy all copies of the original message.