[R] remove rows with infinite/nan values from a zoo dataset
arun
smartpink111 at yahoo.com
Tue Sep 3 08:47:48 CEST 2013
Hi,
Please dput() the example dataset. When I read from the one shown below, it looks a bit altered.
library(zoo)
dat1<- read.zoo(text="2009-07-15,#N/A N/A,#N/A N/A,18.96858
2009-07-16,20.30685,20.40664,#N/A N/A
2009-07-17,20.78813,20.03991,20.40664
2009-07-20,21.41278,21.41278,20.03991
2009-07-21,22.9963,22.98397,21.41278
2009-07-22,23.06443,23.01112,22.98397
2009-07-23,23.45905,24.72232,23.01112
2009-07-24,24.89291,25.56603,24.72232
2009-07-27,25.38929,24.80535,25.56603
2009-07-28,25.26712,25.65566,24.80535
2009-07-29,25.83884,24.98163,25.65566
2009-07-30,#N/A N/A,#N/A N/A,24.98163
2009-08-03,25.25553,25.93297,#N/A N/A
2009-08-04,26.02464,25.49159,25.93297
",sep=",",header=FALSE,FUN=as.Date,format="%Y-%m-%d",fill=TRUE)
dput(dat1) ###
structure(c(NA, 20.30685, 20.78813, 21.41278, 22.9963, 23.06443,
23.45905, 24.89291, 25.38929, 25.26712, 25.83884, NA, 25.25553,
26.02464, NA, 20.40664, 20.03991, 21.41278, 22.98397, 23.01112,
24.72232, 25.56603, 24.80535, 25.65566, 24.98163, NA, 25.93297,
25.49159, NA, NA, 20.40664, 20.03991, 21.41278, 22.98397, 23.01112,
24.72232, 25.56603, 24.80535, 25.65566, NA, NA, 25.93297), .Dim = c(14L,
3L), .Dimnames = list(NULL, c("V2", "V3", "V4")), index = structure(c(14440,
14441, 14442, 14445, 14446, 14447, 14448, 14449, 14452, 14453,
14454, 14455, 14459, 14460), class = "Date"), class = "zoo")
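A likely reason the values above look altered: read.table(), which read.zoo() calls by default, treats "#" as a comment character. Everything from the first "#N/A N/A" to the end of the line is dropped, and fill=TRUE pads the rest with NA, so real values such as 18.96858 are lost. A minimal sketch of one way around this, assuming the data are comma-separated as above and that "#N/A N/A" is the only missing-value marker (the three-row excerpt and the name dat_fixed are only for illustration):

```r
library(zoo)

# Illustrative three-row excerpt of the data, comma-separated as in the example above
txt <- "2009-07-15,#N/A N/A,#N/A N/A,18.96858
2009-07-16,20.30685,20.40664,#N/A N/A
2009-07-17,20.78813,20.03991,20.40664"

# comment.char = "" stops '#' from starting a comment;
# na.strings maps the spreadsheet marker to NA.
# Both arguments are passed through read.zoo() to read.table().
dat_fixed <- read.zoo(text = txt, sep = ",", header = FALSE,
                      FUN = as.Date, format = "%Y-%m-%d",
                      na.strings = "#N/A N/A", comment.char = "")
dat_fixed
```

With these two arguments the value in the last column of the first row should survive the import, and the filtering steps that follow apply unchanged.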
# Keep only the rows in which no column is NA
dat2<- dat1[!rowSums(is.na(dat1)),]
dat2
#                 V2       V3       V4
#2009-07-17 20.78813 20.03991 20.40664
#2009-07-20 21.41278 21.41278 20.03991
#2009-07-21 22.99630 22.98397 21.41278
#2009-07-22 23.06443 23.01112 22.98397
#2009-07-23 23.45905 24.72232 23.01112
#2009-07-24 24.89291 25.56603 24.72232
#2009-07-27 25.38929 24.80535 25.56603
#2009-07-28 25.26712 25.65566 24.80535
#2009-07-29 25.83884 24.98163 25.65566
#2009-08-04 26.02464 25.49159 25.93297
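# Introduce infinite values to demonstrate the second filter
dat2[1,2]<- Inf
dat2[5,3]<- -Inf
# Keep only the rows in which every value is finite
dat2[rowSums(is.finite(dat2))==ncol(dat2),]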
#                 V2       V3       V4
#2009-07-20 21.41278 21.41278 20.03991
#2009-07-21 22.99630 22.98397 21.41278
#2009-07-22 23.06443 23.01112 22.98397
#2009-07-24 24.89291 25.56603 24.72232
#2009-07-27 25.38929 24.80535 25.56603
#2009-07-28 25.26712 25.65566 24.80535
#2009-07-29 25.83884 24.98163 25.65566
#2009-08-04 26.02464 25.49159 25.93297
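Because is.finite() is FALSE for NA, NaN, Inf and -Inf alike, the two filtering steps can also be combined into one. A minimal sketch on a small made-up series (the object z and its values are hypothetical, not the poster's data); coredata() is used so that the test runs on the plain numeric matrix:

```r
library(zoo)

# Small hypothetical zoo series containing one NA, one Inf and one NaN
z <- zoo(cbind(a = c(1, NA, 3, 4, 5), b = c(1, 2, Inf, 4, NaN)),
         as.Date("2009-07-15") + 0:4)

# Drop every row that contains any non-finite value (NA, NaN, Inf, -Inf)
keep <- rowSums(!is.finite(coredata(z))) == 0
z_clean <- z[keep, ]

# If only NA/NaN rows should go, na.omit() is an alternative
# to the rowSums(is.na()) idiom and leaves Inf rows in place
z_no_na <- na.omit(z)
```

Here z_clean keeps only the rows that are finite in every column, while z_no_na still contains the row with Inf.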
A.K.
Hi There,
I have a dataset with many rows and a few columns, as follows:
2009-07-15 #N/A N/A #N/A N/A 18.96858
2009-07-16 20.30685 20.40664 #N/A N/A
2009-07-17 20.78813 20.03991 20.40664
2009-07-20 21.41278 21.41278 20.03991
2009-07-21 22.9963 22.98397 21.41278
2009-07-22 23.06443 23.01112 22.98397
2009-07-23 23.45905 24.72232 23.01112
2009-07-24 24.89291 25.56603 24.72232
2009-07-27 25.38929 24.80535 25.56603
2009-07-28 25.26712 25.65566 24.80535
2009-07-29 25.83884 24.98163 25.65566
2009-07-30 #N/A N/A #N/A N/A 24.98163
2009-08-03 25.25553 25.93297 #N/A N/A
2009-08-04 26.02464 25.49159 25.93297
The class of the dataset is "zoo". My question might be stupid,
but could anyone suggest a way to remove the rows with #N/A values?
I tried the "rapply" command, but it didn't work because of the data class.
By the way, how would I do the same for "Inf" values?
Thank you in advance!