[R] dealing with NA in readBin() and writeBin()
Duncan Murdoch
murdoch.duncan at gmail.com
Sun Jan 4 23:27:45 CET 2015
On 04/01/2015 5:13 PM, Mike Miller wrote:
> The help doc for readBin/writeBin tells me this:
>
> Handling R's missing and special (Inf, -Inf and NaN) values is discussed
> in the ‘R Data Import/Export’ manual.
>
> So I go here:
>
> http://cran.r-project.org/doc/manuals/r-release/R-data.html#Special-values
>
> Unfortunately, I don't really understand that. Suppose I am using
> single-byte integers and I want 255 (binary 11111111) to be translated to
> NA. Is it possible to do that? Of course I could always do something
> like this:
>
> X[ X==255 ] <- NA
>
> The problem with that is that I want to process the data on the fly,
> dividing the integer to produce a double in the range from 0 to 2:
>
> X <- readBin( file, what="integer", n=N, size=1, signed=FALSE)/127
Why? Why not do it in three steps, i.e.
X <- readBin( file, what="integer", n=N, size=1, signed=FALSE)
X[ X==255 ] <- NA
X <- X/127
If you are worried about the extra typing, then write a function to
handle all three steps.
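
A minimal sketch of such a function follows. The name read_scaled and its arguments na_code and divisor are made up for illustration, and the temporary file only stands in for the real data file.

```r
# Hypothetical helper: read unsigned single-byte integers, map a sentinel
# code to NA, then scale. Names and defaults are illustrative only.
read_scaled <- function(con, n, na_code = 255L, divisor = 127) {
  x <- readBin(con, what = "integer", n = n, size = 1, signed = FALSE)
  x[x == na_code] <- NA   # integer comparison, so == is exact here
  x / divisor             # NA stays NA after the division
}

# Round trip through a temporary file holding the bytes 0, 127, 254, 255
tmp <- tempfile()
writeBin(as.raw(c(0, 127, 254, 255)), tmp)
X <- read_scaled(tmp, n = 4)
unlink(tmp)
print(X)
```

Doing the replacement before the division keeps the test on integers, where equality is safe.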
>
> It looks like this still works:
>
> X[ X==255/127 ] <- NA
I suspect that would work on all current platforms, but I wouldn't trust
it. Don't use == on floating point values unless you know they are
exactly representable, e.g. fractions with 2^n in the denominator.
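
A small sketch of why == is risky on doubles, and of flagging the missing values while they are still integers. The variable names are made up for illustration.

```r
# Decimal fractions such as 0.1 are not exactly representable in binary,
# so sums of them need not compare equal with ==.
exact_decimal <- (0.1 + 0.2 == 0.3)                   # FALSE on IEEE 754 doubles
close_enough  <- isTRUE(all.equal(0.1 + 0.2, 0.3))    # TRUE: comparison with tolerance

# Fractions with a power of 2 in the denominator are stored exactly.
exact_dyadic <- (0.25 + 0.5 == 0.75)                  # TRUE

# Safer pattern: record which values are missing before dividing.
raw_vals   <- c(0L, 127L, 255L)
is_missing <- raw_vals == 255L     # exact integer comparison
scaled     <- raw_vals / 127
scaled[is_missing] <- NA
```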
> It would be neater if there were some kind of translation option for the
> input stream, like the way GNU tr (Linux/UNIX) works. I'm looking around
> and not finding such a thing. I can use gsub() to translate on the fly
> and then coerce back to integer format:
It's really trivial to write a wrapper for readBin to do what you want:
myReadBin <- function(...) {
  X <- readBin(...)
  X[ X==255 ] <- NA
  X
}
Duncan Murdoch
>
> X <- as.integer(gsub("255", NA, readBin( file, what="integer", n=N, size=1, signed=FALSE)))/127
>
> What is your opinion of that tactic? Is there a better way? I don't know
> if that has any advantage on the postprocessing tactic above. Maybe what
> I need is something like gsub() that can operate on numeric values...
>
> X <- numsub(255, NA, readBin( file, what="integer", n=N, size=1, signed=FALSE))/127
>
> ...but if that isn't better in terms of speed or memory usage than
> postprocessing like this...
>
> X[ X==255/127 ] <- NA
>
> ...then I really don't need it (for this, but it would be good to know
> about).
>
>
> The na.strings = "NA" functionality of scan() is neat, but I guess that
> doesn't work with the binary read system. I don't think I can scan the
> readBin input because it isn't a file or stdin.
>
> Mike