[BioC] rhdf5 write/read inconsistency
Brad Friedman [guest]
guest at bioconductor.org
Wed Nov 6 15:55:06 CET 2013
I have an example of a matrix that I write with rhdf5, but when I read it back in I get values that differ, apparently at random, from what I wrote.
The example below demonstrates the effect, which seems to be related to using small chunks. I write a matrix, then read it back in 10 times, printing its sum each time. The sum usually changes from one read to the next, and it is never correct.
library(rhdf5)

go <- function(numRow = blocksize,
               chunksize = 4,
               numCol = 3,
               dims = c(numRow, numCol),
               start = 1,
               blocksize = 7) {
  ## show the parameters used for this run
  str(list(numRow = numRow, numCol = numCol,
           start = start,
           chunksize = chunksize,
           blocksize = blocksize))

  mtx <- matrix(1:(blocksize * numCol), ncol = numCol)
  cat("sum(matrix)=", sum(mtx), "\n")

  ## start from a fresh file each time
  if (file.exists("x.hdf5")) unlink("x.hdf5")
  h5createFile("x.hdf5")
  h5createDataset(file = "x.hdf5",
                  dataset = "x",
                  dims = dims,
                  H5type = "H5T_NATIVE_UINT32",
                  level = 0,
                  chunk = c(chunksize, numCol))

  ## write the whole matrix as a single block
  h5write(mtx, "x.hdf5", name = "x",
          start = c(start, 1),
          stride = c(1, 1),
          block = c(blocksize, numCol),
          count = c(1, 1))

  ## read the same block back 10 times and print its sum
  for (i in 1:10) {
    print(sum(h5read("x.hdf5", "/x",
                     start = c(start, 1),
                     stride = c(1, 1),
                     block = c(blocksize, numCol),
                     count = c(1, 1))))
  }
}
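
For reference, here is a small base-R sketch of how the default arguments of go() map onto the chunk grid. It is only arithmetic on the values above and does not call rhdf5. The variable names (chunks_per_dim, allocated, padding) are made up for illustration. It shows that 7 rows with a chunk height of 4 need two chunks along the rows, the second of which is only partly covered by the dataset. Whether that partial chunk is the cause of the problem is an assumption, not something established here.

```r
# Default arguments of go(): a 7 x 3 dataset stored in 4 x 3 chunks.
dims  <- c(7, 3)   # numRow, numCol
chunk <- c(4, 3)   # chunksize, numCol

# Number of chunks needed along each dimension.
chunks_per_dim <- ceiling(dims / chunk)

# Extent covered by whole chunks, and how far that overshoots the dataset.
allocated <- chunks_per_dim * chunk
padding   <- allocated - dims

print(chunks_per_dim)  # 2 1
print(allocated)       # 8 3
print(padding)         # 1 0
```

Setting chunk <- c(7, 3) in the sketch gives zero padding, which would be one way to check whether the inconsistency only shows up when the last chunk is partial.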
##### and the transcript:
> go()
List of 5
$ numRow : num 7
$ numCol : num 3
$ start : num 1
$ chunksize: num 4
$ blocksize: num 7
sum(matrix)= 231
[1] 209
[1] 47358985
[1] 234
[1] 42963065
[1] 46236113
[1] 48574193
[1] 11738297
[1] 11738297
[1] 11738297
[1] 193
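
To put numbers on "usually different, and never correct", the sums printed above can be compared with the expected value in plain R. This is just a sketch that copies the ten values from the transcript by hand; the names expected and observed are illustrative only.

```r
# Expected sum of the matrix written by go(): the integers 1..21.
expected <- sum(1:(7 * 3))

# The ten sums printed in the transcript above, copied by hand.
observed <- c(209, 47358985, 234, 42963065, 46236113,
              48574193, 11738297, 11738297, 11738297, 193)

print(expected)                  # 231
print(any(observed == expected)) # FALSE: no read returned the right sum
print(length(unique(observed)))  # 8 distinct sums in 10 reads
print(observed - expected)       # deviation of each read from 231
```

Three of the reads (209, 234 and 193) are within 38 of the expected 231, while the others are off by tens of millions.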
-- output of sessionInfo():
R version 3.0.1 (2013-05-16)
Platform: x86_64-apple-darwin10.8.0 (64-bit)
locale:
[1] en_CA.UTF-8/en_CA.UTF-8/en_CA.UTF-8/C/en_CA.UTF-8/en_CA.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] rhdf5_2.5.7
loaded via a namespace (and not attached):
[1] zlibbioc_1.6.0
--
Sent via the guest posting facility at bioconductor.org.