[R] Loop over factor returns NA
arun
smartpink111 at yahoo.com
Sat Oct 12 19:46:16 CEST 2013
Hi,
I am not sure whether you have any restrictions on using lapply() (see ?lapply).
AB <- read.table(text="
time x y z gene part
1 03:27:58 1 2 3 grom 1
2 03:27:58 2 3 4 grom 1
3 03:27:58 3 4 5 grom 1
4 04:44:23 12 13 14 grom 2
5 04:44:23 13 14 15 grom 2
6 04:44:23 14 15 16 grom 2
7 04:44:23 15 16 17 grom 2
8 06:23:45 101 102 103 vir 3
9 06:23:45 102 103 104 vir 3
10 06:23:45 103 104 105 vir 3",sep="",header=TRUE,stringsAsFactors=FALSE)
str(AB)
#'data.frame': 10 obs. of 6 variables:
# $ time: chr "03:27:58" "03:27:58" "03:27:58" "04:44:23" ...
# $ x : int 1 2 3 12 13 14 15 101 102 103
# $ y : int 2 3 4 13 14 15 16 102 103 104
# $ z : int 3 4 5 14 15 16 17 103 104 105
# $ gene: chr "grom" "grom" "grom" "grom" ...
# $ part: int 1 1 1 2 2 2 2 3 3 3
#It is not clear from the example whether you have multiple 'gene' values within a 'part' or 'time'.
res1 <- do.call(rbind,lapply(split(AB,AB$part),function(u) {
  # one row per group: the three standard deviations plus the group's first gene label
  data.frame(sdx=sd(u$x),sdy=sd(u$y),sdz=sd(u$z),gene=u$gene[1],stringsAsFactors=FALSE)
}))
#Similarly for time
res2 <- do.call(rbind,lapply(split(AB,AB$time),function(u) {
  # one row per group: the three standard deviations plus the group's first gene label
  data.frame(sdx=sd(u$x),sdy=sd(u$y),sdz=sd(u$z),gene=u$gene[1],stringsAsFactors=FALSE)
}))
str(res1)
#'data.frame': 3 obs. of 4 variables:
# $ sdx : num 1 1.29 1
# $ sdy : num 1 1.29 1
# $ sdz : num 1 1.29 1
# $ gene: chr "grom" "grom" "vir"
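If you would rather keep a for loop (for example for the FFT step you mention), here is a minimal sketch of one way to do it. As far as I can tell, the NA comes from comparing AB$time with the integer i: no time value equals 1, 2 or 3, so the subset has zero rows and sd() of an empty vector is NA. Looping over the unique time values avoids that. The names `times` and `res_loop` are just my choices for illustration, and the data frame is rebuilt so that the snippet runs on its own. If your time column is a factor, wrapping it in as.character() first should behave the same way.

```r
# Same example data as above, rebuilt so the snippet is self-contained
AB <- data.frame(
  time = rep(c("03:27:58", "04:44:23", "06:23:45"), times = c(3, 4, 3)),
  x = c(1:3, 12:15, 101:103),
  y = c(2:4, 13:16, 102:104),
  z = c(3:5, 14:17, 103:105),
  gene = rep(c("grom", "vir"), times = c(7, 3)),
  part = rep(1:3, times = c(3, 4, 3)),
  stringsAsFactors = FALSE
)

# Why the original loop fails: no time value equals the integer 1,
# so this subset has zero rows
nrow(AB[AB$time == 1, ])

times <- unique(AB$time)
res_loop <- vector("list", length(times))
for (k in seq_along(times)) {
  # compare against the k-th time value, not against the index k itself
  Intervall <- AB[AB$time == times[k], ]
  res_loop[[k]] <- data.frame(
    time = times[k],
    sdx = sd(Intervall$x),
    sdy = sd(Intervall$y),
    sdz = sd(Intervall$z),
    gene = Intervall$gene[1],
    stringsAsFactors = FALSE
  )
}
# bind the per-time rows into one data frame
res_loop <- do.call(rbind, res_loop)
res_loop
```

Using Intervall$x instead of attach()/detach() is a design choice on my part: it keeps the loop from depending on what is currently attached.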
#To write the result to a file, see
?write.table()
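A small sketch of what that could look like, assuming you want comma-separated rows without a header, as in your VariableTable.txt. The data frame here is a stand-in with the same shape as res1 so that the snippet runs by itself, and `out_file` is a temporary path I made up for the example; substitute your own file name.

```r
# Stand-in for res1 (same columns as the result above), so the snippet is self-contained
res1 <- data.frame(
  sdx = c(1, sd(12:15), 1),
  sdy = c(1, sd(13:16), 1),
  sdz = c(1, sd(14:17), 1),
  gene = c("grom", "grom", "vir"),
  stringsAsFactors = FALSE
)

# Hypothetical output path; replace with e.g. "VariableTable.txt"
out_file <- tempfile(fileext = ".txt")

# One comma-separated line per group, no header, no row names, no quotes
write.table(res1, file = out_file, sep = ",",
            row.names = FALSE, col.names = FALSE, quote = FALSE)

# Read the file back to check what was written
written <- readLines(out_file)
written
```

Writing the whole data frame once also avoids append=TRUE, which otherwise keeps adding lines to the same file each time the script is rerun.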
A.K.
On Saturday, October 12, 2013 11:12 AM, anna berg <anna.berg1986 at hotmail.com> wrote:
Dear R users,
I am pretty new to programming in R. So I guess there is some obvious mistake I am making. I hope you can help me.
I have a data frame that looks like this:
> AB
time x y z gene part
1 03:27:58 1 2 3 grom 1
2 03:27:58 2 3 4 grom 1
3 03:27:58 3 4 5 grom 1
4 04:44:23 12 13 14 grom 2
5 04:44:23 13 14 15 grom 2
6 04:44:23 14 15 16 grom 2
7 04:44:23 15 16 17 grom 2
8 06:23:45 101 102 103 vir 3
9 06:23:45 102 103 104 vir 3
10 06:23:45 103 104 105 vir 3
Now I want to apply a loop. Here is a simplified version; I know that I could do this easily with tapply, but for the other things that I want to do in the loop (e.g. a weighted mean of a time series after a fast Fourier transform) I would rather use a loop.
Note that "time" and "part" carry the same information; one is a factor and the other is a number.
Here is the loop that works fine and returns the result as I want (the important part here is: Intervall <- AB[AB$part==i,]):
for(i in 1:length(unique(AB$time)))
{
Intervall <- AB[AB$part==i,]
attach(Intervall)
# Standard deviation
sdx <-sd(x)
sdy <-sd(y)
sdz <-sd(z)
# Add Behavior
gene <- as.character(Intervall[1,5])
# Construct a table
tab <-c(sdx, sdy, sdz, gene)
write(tab, file=paste("VariableTable.txt", sep=""),
ncolumns=4,sep=",", append=TRUE)
detach(Intervall)
} # end of for loop
The result looks like this and is fine:
1,1,1,grom
1.3,1.3,1.3,grom
1,1,1,vir
My problem is that I used the "part" column only to run the loop, but I actually want to use the "time" column instead. But when I replace
Intervall <- AB[AB$part==i,]
with
Intervall <- AB[AB$time==i,]
then the resulting table only contains NA.
I also tried Intervall <- AB[x==i,] with each of the following:
x <- as.factor(AB$part) --> which works fine as well
x <- as.factor(AB$time) --> which returns only NA
x <- unique(AB$time) --> which returns only NA
x <- levels(unique(AB$time)) --> which returns only NA
x <- seq(unique(AB$time)) --> which returns the standard deviation of the entire column (not of the single parts)
What am I doing wrong, and how can I fix it?
Thank you so much in advance.
Kind regards,
Anna