[R] A slight trap in read.table/read.csv.
Peter Ehlers
ehlers at ucalgary.ca
Mon Mar 1 15:03:51 CET 2010
On 2010-02-28 14:55, Rolf Turner wrote:
>
> I had occasion recently to read in a one-line *.csv file that
> looked like:
>
> "CandidateName","NSN","Ethnicity","dob","gender"
> "Smith, Mary Jane",111222333,"E","2/25/1989","F"
>
> That "F" (for female) in the last field got transformed to
> FALSE. Apparently read.csv (and hence read.table) infers that
> a field is logical if all of its entries are F's and T's.
>
> If I change the file to
>
> "CandidateName","NSN","Ethnicity","dob","gender"
> "Smith, Mary Jane",111222333,"E","2/25/1989","F"
> "Mingdinkler, Melvin Queue",999888777,"01/04/1942","M"
>
> then the read functions correctly interpret the last field
> as being character.
>
> The translation of "F" into FALSE resulted in some mysterious
> contretemps in further analysis, which it took me a while to
> track down.
>
> I solved the problem by putting in a colClasses argument in my
> call to read.csv(). But I really think that the read functions
> are being too clever by half here. If field entries are surrounded
> by quotes, shouldn't they be left as character? Even if they are
> all F's and T's?
>
> Furthermore, using T's and F's to represent TRUE's and FALSE's is
> bad practice anyway. Since FALSE and TRUE are reserved words it
> would make sense for the read function to assume that a field is
> logical if it consists entirely of these words. But T's and F's
> .... I don't think so.
>
> I would argue that this behaviour should be changed. I can see no
> downside to such a change.
>
I agree with Rolf. Indeed, I'm not fond of the use of T/F for
TRUE/FALSE at all.
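
For anyone who wants to reproduce this, here is a minimal sketch of the
behaviour and of the colClasses workaround Rolf mentions. It assumes an R
version in which read.csv() accepts the text= argument (with older versions,
write the lines to a file or use textConnection() instead). The object names
one_row, d1 and d2 are purely illustrative.

```r
# The one-record file from the original post, held in a string.
one_row <- '"CandidateName","NSN","Ethnicity","dob","gender"
"Smith, Mary Jane",111222333,"E","2/25/1989","F"'

# Default behaviour: every entry in "gender" is an F, so the column
# is converted to logical even though the field was quoted.
d1 <- read.csv(text = one_row)
class(d1$gender)

# Workaround: state the class of that column explicitly.
# A named colClasses vector leaves the other columns to be inferred.
d2 <- read.csv(text = one_row, colClasses = c(gender = "character"))
class(d2$gender)
```

With the default call the gender column should come back as logical (FALSE);
with colClasses it stays as the string "F".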
> cheers,
>
> Rolf Turner
>
--
Peter Ehlers
University of Calgary