[R] "unsparse" a vector
Petr Savicky
savicky at cs.cas.cz
Thu Feb 9 12:35:36 CET 2012
On Wed, Feb 08, 2012 at 05:01:01PM -0500, Sam Steingold wrote:
> loop is too slow.
> it appears that sparseMatrix does what I want:
>
> ll <- lapply(l,length)
> i <- rep(1:4, ll)
> vv <- unlist(l)
> j1 <- as.factor(substring(vv,1,1))
> t <- table(j1)
> j <- position of elements of j1 in names(t)
> sparseMatrix(i,j,x=as.numeric(substring(vv,2,2)), dimnames = names(t))
>
> so, the question is, how do I produce a vector of positions?
>
> i.e., from vectors
> [1] "A" "B" "A" "C" "A" "B"
> and
> [1] "A" "B" "C"
> I need to produce a vector
> [1] 1 2 1 3 1 2
> of positions of the elements of the first vector in the second vector.
This particular step can be done with match(), as follows:
match(c("A", "B", "A", "C", "A", "B"), c("A", "B", "C"))
[1] 1 2 1 3 1 2
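Applied to the quoted code, the same call gives the column index j directly. The following is only a sketch: the original post does not show the list l, so the input below is made up, assuming each string has the column label as its first character and the value as its second. It uses base R only and fills a dense matrix by (row, column) index pairs.

```r
# Hypothetical input: the original post does not show the list l.
l <- list(c("A1", "B2"), c("A3"), c("C4", "A5"), c("B6"))

ll  <- lengths(l)                  # number of elements in each list entry
i   <- rep(seq_along(l), ll)       # row index of every element
vv  <- unlist(l)
key <- substring(vv, 1, 1)         # column label of every element
lev <- sort(unique(key))           # distinct labels, sorted
j   <- match(key, lev)             # column index of every element
x   <- as.numeric(substring(vv, 2, 2))

# Dense matrix filled through a two-column (row, column) index matrix.
m <- matrix(0, nrow = length(l), ncol = length(lev),
            dimnames = list(NULL, lev))
m[cbind(i, j)] <- x
```

Here key is "A" "B" "A" "C" "A" "B", so j comes out as in the match() call above. The same i, j and x should also be usable with sparseMatrix() from the Matrix package, where dimnames is expected to be a two-element list such as list(NULL, lev).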
> PS. Of course, I would much prefer a dataframe to a matrix...
Do you need a data frame as the final result, or also as an
intermediate result? Changing individual rows in a data frame
is much slower than in a matrix.
Compare:
n <- 10000
mat <- matrix(1:(2*n), nrow=n)
df <- as.data.frame(mat)
system.time( for (i in 1:n) { mat[i, 1] <- 0 } )
   user  system elapsed
  0.021   0.000   0.021
system.time( for (i in 1:n) { df[i, 1] <- 0 } )
   user  system elapsed
  4.997   0.069   5.084
This slowdown is specific to modifying a data frame one row
at a time. Working with whole columns is a different matter:
extract the column as a vector, modify the vector, and assign
it back once.
system.time( {
    col1 <- df[[1]]
    for (i in 1:n) { col1[i] <- 0 }
    df[[1]] <- col1
} )
   user  system elapsed
  0.019   0.000   0.019
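If the data frame is needed only as the final result, one option is to do the row-wise updates on a matrix and convert once at the end. A minimal sketch under that assumption, reusing the toy data from the timings above (the update itself is just a placeholder):

```r
n <- 10000
mat <- matrix(1:(2*n), nrow = n)

# Row-wise updates are done on the matrix, where they are cheap.
for (i in 1:n) {
    mat[i, 1] <- 0L
}

# Convert to a data frame once, after all updates are finished.
df <- as.data.frame(mat)
```

This only works while all columns have the same type, since a matrix cannot mix, say, numbers and strings.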
Hope this helps.
Petr Savicky.