[R] "na.strings" and the like; suspending interpretation of "NA"

Tue Aug 4 09:05:05 CEST 2009

Jan Theodore Galkowski wrote:
> Can someone point me to the proper place in the documentation or on the
> Wiki where I can learn how to get R to stop interpreting the string "NA"
> as something special?  I have a table in a database which contains
> (among other things) country codes and continent codes.  The standard
> set of two-letter codes includes "NA" to denote "North America". I
> learned of the "na.strings" parameter for RODBC's "sqlQuery", being able
> to shut down this interpretation when data is read in.
> 
> However, in the program which uses this data, I (must) have some other
> instance where the "NA" gets spontaneously interpreted as "not
> available", shows up in vectors and lists as "<NA>", and breaks
> function. I temporarily solved the problem by defining all instances of
> "NA" in the database as "NAC".  It still would be good to know a
> general solution.  I've seen something mentioned in conjunction with
> "options", but I'm not sure what that is about.

The general paradigm is that this shouldn't happen... Back in the old 
days, R had no such thing as character NA, and users had to sort out the 
North America, noradrenaline, Neil Armstrong, etc., issues for 
themselves. Nowadays we do have calculus that preserves "NA" as distinct 
from <NA>; so if one is converted to the other, it could signify a bug.
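
To illustrate the distinction, here is a minimal sketch in base R. The
vector "codes" is made up for the example and is not taken from the
original poster's data:

```r
# Hypothetical continent codes: the string "NA" (North America),
# a genuinely missing value, and "EU".
codes <- c("NA", NA, "EU")

# Only the second element is missing; the string "NA" is ordinary text.
missing <- is.na(codes)

# Comparing against the string "NA" matches the first element only;
# the comparison with the missing value is itself missing.
is_north_america <- codes == "NA"

# which() drops the missing comparison, leaving position 1.
north_america_idx <- which(is_north_america)

# Converting to a factor keeps "NA" as a real level, distinct from <NA>.
f <- factor(codes)
level_names <- levels(f)
```

If a check along these lines shows is.na() returning TRUE where the
string "NA" was expected, the conversion has already happened at some
earlier step.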

It could also be due to particularly silly code on your part, but in 
either case we need to see the effect narrowed down to a reproducible 
stretch of code.
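
As a sketch of what such a reproducible stretch might look like, the
snippet below shows one place where the conversion is known to happen:
read.csv() treats the field NA as missing by default (na.strings = "NA").
Whether this is what bites in the original program is only a guess; the
inline data and the column names are invented for the example.

```r
# Hypothetical two-row table with "NA" used as a continent code.
csv <- "country,continent\nUS,NA\nFR,EU\n"

# Default settings: the field NA is read as a missing value.
default <- read.csv(text = csv, stringsAsFactors = FALSE)

# Overriding na.strings so that only empty fields count as missing;
# the string "NA" then survives as ordinary text.
kept <- read.csv(text = csv, na.strings = "", stringsAsFactors = FALSE)

default_missing <- is.na(default$continent)
kept_values <- kept$continent
```

Here default_missing is TRUE for the first row, while kept_values still
holds the string "NA". Any intermediate step that writes the data out and
reads it back in with default settings would be a candidate to check.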

-- 
    O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
   c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
  (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)              FAX: (+45) 35327907