[R] indexing??
Petr Savicky
savicky at cs.cas.cz
Tue Feb 28 16:33:44 CET 2012
On Tue, Feb 28, 2012 at 05:59:24AM -0800, helin_susam wrote:
> Hello All,
>
> My algorithm as follows;
> y <- c(1,1,1,0,0,1,0,1,0,0)
> x <- c(1,0,0,1,1,0,0,1,1,0)
>
> n <- length(x)
>
> t <- matrix(cbind(y,x), ncol=2)
>
> z = x+y
>
> for(j in 1:length(x)) {
> out <- vector("list", )
>
> for(i in 1:10) {
>
> t.s <- t[sample(n,n,replace=T),]
>
> y.s <- t.s[,1]
> x.s <- t.s[,2]
>
> z.s <- y.s+x.s
>
> out[[i]] <- list(ff <- (z.s), finding=any (y.s==y[j]))
> kk <- sapply(out, function(x) {x$finding})
> ff <- out[! kk]
> }
>
> I tried to compute the sum of the two vectors as a statistic by using the
> bootstrap. Finally, I want to get the resamples which do not contain a given
> element of y. In the algorithm this is referred to as "ff". But I always get
> the same result:
> > ff
> list()
> > kk
> [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
> I think this is because my "y" vector contains only 2 distinct values, so
> probably every bootstrap resample includes "1" and every resample includes "0".
> So I cannot find any resample without a match. Can anyone help me with this?
Hi.

First of all, there are some unclear points in your code.
In particular, I would expect a closing "}" between the line
    out[[i]] <- list(...
and
    kk <- sapply(...
Moreover, I do not see why the loop over j contains the
loop over i. I would expect these loops to be separate,
since the loop over i collects all the samples into a list.

The following code is a modification, which I suggest
as an alternative.
y <- c(1:5, 1:5)
x <- c(1,0,0,1,1,0,0,1,1,0)
n <- length(x)
t <- matrix(cbind(y,x), ncol=2)
z = x+y
# generate 10 bootstrap samples and keep z.s, y.s
out <- vector("list", 10)
for(i in 1:10) {
    t.s <- t[sample(n,n,replace=TRUE),]
    y.s <- t.s[,1]
    x.s <- t.s[,2]
    z.s <- y.s+x.s
    out[[i]] <- list(zz = z.s, yy = y.s)
}
# check, which replications do not contain y[j] in their y.s,
# and take the OR of these conditions over j
ff <- rep(FALSE, times=length(out))
for(j in 1:length(y)) {
    kk <- sapply(out, function(x) {any(x$yy == y[j])})
    ff <- ff | (! kk)
}
out[ff]
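
The loop over j above can probably also be written as a single sapply()
call. The following is only a sketch under some assumptions: the samples
are drawn by indexing y and x directly instead of going through the
matrix t, the seed value is arbitrary, and the names "idx" and
"missing.any" are hypothetical and not part of the code above.

```r
# Sketch only: rebuild 'out' with a fixed (arbitrary) seed
set.seed(1)
y <- c(1:5, 1:5)
x <- c(1,0,0,1,1,0,0,1,1,0)
n <- length(x)
out <- vector("list", 10)
for (i in 1:10) {
    idx <- sample(n, n, replace=TRUE)     # resampled row indices
    out[[i]] <- list(zz = y[idx] + x[idx], yy = y[idx])
}

# loop version: OR over j of "y[j] is absent from yy"
ff <- rep(FALSE, times=length(out))
for (j in 1:length(y)) {
    kk <- sapply(out, function(s) any(s$yy == y[j]))
    ff <- ff | (! kk)
}

# one-call version: TRUE if at least one value of y is absent from yy
missing.any <- sapply(out, function(s) !all(y %in% s$yy))

# both versions should select the same replications
identical(ff, missing.any)
```

Both versions mark a replication as soon as at least one value of y
does not occur in its yy, so out[missing.any] should select the same
elements as out[ff].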

With the original y <- c(1,1,1,0,0,1,0,1,0,0), the probability
that a bootstrap sample contains only 1's or only 0's is
2 * (1/2)^10 = 1/512, so almost every sample contains both values.
I therefore replaced the vector y with another one, in which
a missing value is more frequent. I obtained, for example:

[[1]]
[[1]]$zz
[1] 2 2 5 2 3 2 3 2 2 6
[[1]]$yy
[1] 1 1 5 1 3 2 3 2 1 5 # 4 is missing

[[2]]
[[2]]$zz
[1] 5 5 5 5 3 5 2 5 6 4
[[2]]$yy
[1] 4 4 5 4 3 5 2 5 5 3 # 1 is missing

[[3]]
[[3]]$zz
[1] 5 2 5 1 5 1 2 5 5 5
[[3]]$yy
[1] 4 2 5 1 5 1 1 4 5 4 # 3 is missing
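
The two probabilities discussed above can be checked numerically. The
following is a minimal sketch, assuming that each of the 10 draws picks
one of the 10 rows uniformly at random, so that each of the values 1..5
in c(1:5, 1:5) is drawn with probability 1/5. The names "p.binary" and
"p.five" are hypothetical.

```r
# P(a resample of size 10 from c(1,1,1,0,0,1,0,1,0,0)
#   contains only 1's or only 0's)
p.binary <- 2 * (1/2)^10

# P(at least one of the values 1..5 is missing in a resample of size 10
#   from c(1:5, 1:5)), by inclusion-exclusion over the number k of
#   values that are required to be missing
k <- 1:5
p.five <- sum((-1)^(k+1) * choose(5, k) * ((5 - k)/5)^10)

c(p.binary, p.five)
```

Under these assumptions, the first probability is about 0.002 and the
second is about 0.48, which is consistent with the original y almost
never producing a sample with a missing value.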
Hope this helps.
Petr Savicky.