[R] Odp: Data frame modification
Petr PIKAL
petr.pikal at precheza.cz
Wed Jul 28 15:45:37 CEST 2010
Hi
Why do you insist on loops? R is not C. If you want to use loops, use C or
a similar programming language. In R it is almost always better to take a
whole-object approach. Kind and clever people have already programmed it
(sometimes in C).
x<-rnorm(20)
x[c(10,12,13,17)]<-NA
x
 [1] -1.12423790  0.80641765 -1.02686262  0.71894420 -0.76157153 -0.09612362
 [7]  0.36681907  0.11164870 -1.06308689          NA -1.32903523          NA
[13]          NA  0.43308928 -0.16599726 -1.85594816          NA  0.02117957
[19] -0.58170838  1.45417843
library(zoo)
na.locf(x)
 [1] -1.12423790  0.80641765 -1.02686262  0.71894420 -0.76157153 -0.09612362
 [7]  0.36681907  0.11164870 -1.06308689 -1.06308689 -1.32903523 -1.32903523
[13] -1.32903523  0.43308928 -0.16599726 -1.85594816 -1.85594816  0.02117957
[19] -0.58170838  1.45417843
This would always be quicker than a for loop that checks a condition at
each step.
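Applied to the data frame from the original question, the same idea might
look like the sketch below. The data frame D and its values are made up for
illustration. na.rm = FALSE is passed so that a leading zero, which has no
earlier value to carry forward, stays NA instead of being dropped (dropping
it would shorten the column).

```r
library(zoo)

# Hypothetical data standing in for the poster's data frame D
D <- data.frame(x = c(0, 3, 0, 0, 5, 0, 2),
                y = 1:7)

# Whole-object approach: mark zeros as missing, then carry the
# last non-missing value forward in a single call
x <- D$x
x[x == 0] <- NA
D$x <- na.locf(x, na.rm = FALSE)

D$x
```

The first element stays NA here because nothing precedes it; whether that is
acceptable depends on the data.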
There was an article in R News, and P. Burns's "The R Inferno" is also worth
a look if you are interested in loop performance.
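To get a feeling for the difference yourself, something like the following
rough comparison could be used. It is only a sketch: the data are random, the
function names fill_loop and fill_vec are made up, and the first value is
assumed to be non-zero. The whole-object version here uses base R (cummax on
the positions of the non-zero values) rather than zoo. No timings are shown
because they depend on the machine and the R version.

```r
set.seed(1)
n <- 43000                       # roughly the size mentioned in the thread
D <- data.frame(x = sample(0:3, n, replace = TRUE))
D$x[1] <- 1L                     # assumption: the first value is non-zero

# Element-by-element loop on the data frame, as in the question
fill_loop <- function(D) {
  for (i in 2:nrow(D)) {
    if (D$x[i] == 0) D$x[i] <- D$x[i - 1]
  }
  D
}

# Whole-object version: for each position, find the index of the most
# recent non-zero element, then pick all values in one subsetting step
fill_vec <- function(x) {
  idx <- cummax(seq_along(x) * (x != 0))
  x[idx]
}

t_loop <- system.time(a <- fill_loop(D))
t_vec  <- system.time(b <- fill_vec(D$x))

# Elapsed seconds for each approach
c(loop = t_loop[["elapsed"]], vectorized = t_vec[["elapsed"]])

# Both approaches should give the same result
identical(a$x, b)
```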
If you want to see where the time is spent, use Rprof.
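A minimal way to use it might look like this. The function slow_fill and the
test data are hypothetical stand-ins for the loop in the question; the output
is not shown because it depends on the machine and the R version.

```r
# Hypothetical function to profile: the data-frame loop from the question
slow_fill <- function(D) {
  for (i in 2:nrow(D)) {
    if (D$x[i] == 0) D$x[i] <- D$x[i - 1]
  }
  D
}

set.seed(1)
D <- data.frame(x = sample(0:1, 43000, replace = TRUE))

prof_file <- tempfile()
Rprof(prof_file)      # start sampling the call stack
res <- slow_fill(D)
Rprof(NULL)           # stop profiling

# Functions ranked by the time spent in them
head(summaryRprof(prof_file)$by.self)
```

If the profiled code finishes too quickly, no samples are recorded and
summaryRprof has nothing to report, so the example needs to run for a
noticeable fraction of a second.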
Regards
Petr
siddharth.garg85 at gmail.com wrote on 28.07.2010 15:20:11:
> Thanks for the reply Petr. I have solved this problem using sapply, but
> what I am trying to understand here is why this code is slow.
>
> One of the possible reasons could be that when I use the assignment
> operator, i.e.
> D$x[i]=D$x[i-1]
> it actually makes a new copy of D$x with the modified value.
>
> Another reason could be indexed lookups might not be very fast in R.
>
> Regards
> Siddharth
>
>
>
> ------Original Message------
> From: Petr PIKAL
> To: siddharth.garg85 at gmail.com
> Cc: r-help at r-project.org
> Subject: Odp: [R] Data frame modification
> Sent: Jul 28, 2010 6:15 PM
>
> Hi
>
> r-help-bounces at r-project.org wrote on 28.07.2010 11:30:48:
>
> > Hi
> >
> > I am trying to modify a data frame D with lists x and y in such a way
> > that if a value in x==0 then it should replace that value with the last
> > non-zero value in x. I.e.
> >
> > for (i in 2:nrow(D)) {
> >   if (D$x[i] == 0)
> >     D$x[i] = D$x[i-1]
> > }
> >
> > The data frame is quite large in size, ~43000 rows. This operation is
> > taking a large amount of time. Can someone please suggest what the
> > reason might be?
>
> Bad programming practice? I would suggest using the zoo package and its
> na.locf function after changing all zero values to NA.
>
> Regards
> Petr
>
> >
> > Thanks
> > Regards
> > Siddharth
> > Sent on my BlackBerry® from Vodafone
> > ______________________________________________
> > R-help at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.